Wednesday, August 27, 2008

Ternary logic in SQL

A fellow colleague ran into a strange problem recently with a "not-in" SQL query. There were two tables, say A and B and we weretrying to get all values of a column from table A that did not exist in Table B.

Here's a result of the queries we ran.

select count(distinct column1) from A
-----17567 records

select count(distinct column1) from B
-----10234 records

So to get the values of column1 that are in A but not in B, we tried,

select distinct column1 from A where column1 not in (select distinct column1 from B)
-----0 records


Funnily, the query returned no records. Something was wrong.

After breaking our heads for several hours on this problem, the culprit turned out to be some null values in table B.

We changed the above query as follows to make it work.

select distinct column1 from A where column1 not in (select distinct column1 from B where column1 is not null)
-----~8000 records


SQL implements what is known as ternary logic for handling NULLs. I found a lucid explanation of this problem here.

No comments: