[vox-tech] A complicated SQL question

Harold Lee harold at hotelling.net
Thu Nov 9 14:32:58 PST 2006


On Thu, Nov 09, 2006 at 12:49:25PM -0800, Richard S. Crawford wrote:
> I have a number of rows in my database which look something like this:
> 
> 
> id | username | canonicalname | password
> ---+----------+---------------+-----------------------
> 01 | asmith   | ART SMITH     | md5 encrypted password
> 02 | asmith1  | ART SMITH     | <NULL>
> 03 | bjones   | BILL JONES    | md5 encrypted password
> 04 | bjones12 | BILL JONES    | <NULL>
> ---+----------+---------------+-----------------------
> 
> 
> There are something like 275 entries like these.  I'm having a hell of a time 
> finding a way to select rows like 02 and 04, where canonicalname is a 
> duplicate of the same value in another row and the password is blank, and 
> then deleting those rows.  After executing my query, the table should look 
> like this:
> 
> 
> id | username | canonicalname | password
> ---+----------+---------------+-----------------------
> 01 | asmith   | ART SMITH     | md5 encrypted password
> 03 | bjones   | BILL JONES    | md5 encrypted password
> ---+----------+---------------+-----------------------
> 
> 
> Is there a simple way to execute a single select query which would pull up all 
> the rows I need?
> 

Try a query like this (I've named the table Q) to select the records
(best to make sure before deleting them:

select id 
  from q 
 where password is null 
   and canonicalname in 
       (select canonicalname 
          from q 
         where password is not null)

If you also want to make sure that the username in the record with a
password is a substring of the username in the record without a
password, you can add that predicate too:

select id 
  from q a 
 where password is null
   and canonicalname in 
       (select canonicalname 
          from q b 
         where password is not null 
           and substr(a.username,1,length(b.username)) = b.username)

Then, if you want to delete these records (assuming the ID column is a
primary key), you can say

delete from q where id in (... whichever select you choose ...)


More information about the vox-tech mailing list