[sql] SQL multiple columns in IN clause

If we need to query a table based on some set of values for a given column, we can simply use the IN clause.

But if query need to be performed based on multiple columns, we could not use IN clause(grepped in SO threads.)

From other SO threads, we can circumvent this problem using joins or exists clause etc. But they all work if both main table and search data are in the database.

E.g
User table:
firstName, lastName, City

Given a list of (firstname, lastName) tuples, I need to get the cities.

I can think of following solutions.

1

Construct a select query like,

SELECT city from user where (firstName=x and lastName=y) or (firstName=a and lastName=b) or .....

2

Upload all firstName, lastName values into a staging table and perform a join between 'user' table and the new staging table.

Are there any options for solving this problem and what is the preferred of solving this problem in general?

This question is related to sql oracle

The answer is


You could do like this:

SELECT city FROM user WHERE (firstName, lastName) IN (('a', 'b'), ('c', 'd'));

The sqlfiddle.


In Oracle you can do this:

SELECT * FROM table1 WHERE (col_a,col_b) IN (SELECT col_x,col_y FROM table2)

It often ends up being easier to load your data into the database, even if it is only to run a quick query. Hard-coded data seems quick to enter, but it quickly becomes a pain if you start having to make changes.

However, if you want to code the names directly into your query, here is a cleaner way to do it:

with names (fname,lname) as (
    values
        ('John','Smith'),
        ('Mary','Jones')
)
select city from user
    inner join names on
        fname=firstName and
        lname=lastName;

The advantage of this is that it separates your data out of the query somewhat.

(This is DB2 syntax; it may need a bit of tweaking on your system).


Ensure you have an index on your firstname and lastname columns and go with 1. This really won't have much of a performance impact at all.

EDIT: After @Dems comment regarding spamming the plan cache ,a better solution might be to create a computed column on the existing table (or a separate view) which contained a concatenated Firstname + Lastname value, thus allowing you to execute a query such as

SELECT City 
FROM User 
WHERE Fullname in (@fullnames)

where @fullnames looks a bit like "'JonDoe', 'JaneDoe'" etc


Determine whether the list of names is different with each query or reused. If it is reused, it belongs to the database.

Even if it is unique with each query, it may be useful to load it to a temporary table (#table syntax) for performance reasons - in that case you will be able to avoid recompilation of a complex query.

If the maximum number of names is fixed, you should use a parametrized query.

However, if none of the above cases applies, I would go with inlining the names in the query as in your approach #1.


In general you can easily write the Where-Condition like this:

select * from tab1
where (col1, col2) in (select col1, col2 from tab2)

Note
Oracle ignores rows where one or more of the selected columns is NULL. In these cases you probably want to make use of the NVL-Funktion to map NULL to a special value (that should not be in the values):

select * from tab1
where (col1, NVL(col2, '---') in (select col1, NVL(col2, '---') from tab2)