WHERE col1,col2 IN (...) [SQL subquery using composite primary key] - sql

Given a table foo with a composite primary key (a,b), is there a legal syntax for writing a query such as:
SELECT ... FROM foo WHERE a,b IN (SELECT ...many tuples of a/b values...);
UPDATE foo SET ... WHERE a,b IN (SELECT ...many tuples of a/b values...);
If this is not possible, and you could not modify the schema, how could you perform the equivalent of the above?
I'm also going to put the terms "compound primary key", "subselect", "sub-select", and "sub-query" here for search hits on these aliases.
Edit: I'm interested in answers for standard SQL as well as those that would work with PostgreSQL and SQLite 3.

sqlite> create table foo (a,b,c);
sqlite> create table bar (x,y);
sqlite> select * from foo where exists (select 1 from bar where foo.a = bar.x and foo.b = bar.y);
Replace the select 1 from bar with your select ... many tuples of a/b values ....
Or create a temporary table of your select ... many tuples of a/b values ... and use it in place of bar..

Your syntax is very close to Standard SQL!
The following is valid FULL SQL-92 (as confirmed by the Mimer SQL-92 Validator)
SELECT *
FROM foo
WHERE (a, b) IN (
SELECT a, b
FROM bar
);
Of course, not every SQL product supports full SQL-92 (shame!) If anyone would like to see this syntax supported in Microsoft SQL Server, they can vote for it here.
A further SQL-92 construct that is more widely supported (e.g. by Microsoft SQL Server and Oracle) is INTERSECT e.g.
SELECT a, b
FROM Foo
INTERSECT
SELECT a, b
FROM Bar;
Note that these constructs properly handle the NULL value, unlike some of the other suggestions here e.g. those using EXISTS (<equality predicates>), concatenated values, etc.

You've done one very little mistake.
You have to put a,b in parentheses.
SELECT ... FROM foo WHERE (a,b) IN (SELECT f,d FROM ...);
That works!

The IN syntax you suggested is not valid SQL. A solution using EXISTS should work across all reasonably compliant SQL RDBMSes:
UPDATE foo SET x = y WHERE EXISTS
(SELECT * FROM bar WHERE bar.c1 = foo.c1 AND bar.c2 = foo.c2)
Be aware that this is often not especially performant.

SELECT ...
FROM foo
INNER JOIN (SELECT ...many tuples of a/b values...) AS results
ON results.a = foo.a
AND results.b = foo.b
That what you are looking for?

With concatenation, this works with PostgreSQL:
SELECT a,b FROM foo WHERE a||b IN (SELECT a||b FROM bar WHERE condition);
UPDATE foo SET x=y WHERE a||b IN (SELECT a||b FROM bar WHERE condition);

If you need a solution that doesn't require the tuples of values already existing in a table, you can concatenate the relevant table values and items in your list and then use the 'IN' command.
In postgres this would look like this:
SELECT * FROM foo WHERE a || '_' || b in ('Hi_there', 'Me_here', 'Test_test');
While in SQL I'd imagine it might look something like this:
SELECT * FROM foo WHERE CONCAT(a, "_", b) in ('Hi_there', 'Me_here', 'Test_test');

JOINS and INTERSECTS work fine as a substitute for IN, but they're not so obvious as a substitute for NOT IN, e.g.: inserting rows from TableA into TableB where they don't already exist in TableB where the PK on both tables is a composite.
I am currently using the concatenation method above in SQL Server, but it's not a very elegant solution.

Firebird uses this concatenation formula:
SELECT a,b FROM foo WHERE a||b IN (SELECT a||b FROM bar WHERE
condition);

Related

Ensuring two columns only contain valid results from same subquery

I have the following table:
id symbol_01 symbol_02
1 abc xyz
2 kjh okd
3 que qid
I need a query that ensures symbol_01 and symbol_02 are both contained in a list of valid symbols. In other words I would needs something like this:
select *
from mytable
where symbol_01 in (
select valid_symbols
from somewhere)
and symbol_02 in (
select valid_symbols
from somewhere)
The above example would work correctly, but the subquery used to determine the list of valid symbols is identical both times and is quite large. It would be very innefficient to run it twice like in the example.
Is there a way to do this without duplicating two identical sub queries?
Another approach:
select *
from mytable t1
where 2 = (select count(distinct symbol)
from valid_symbols vs
where vs.symbol in (t1.symbol_01, t1.symbol_02));
This assumes that the valid symbols are stored in a table valid_symbols that has a column named symbol. The query would also benefit from an index on valid_symbols.symbol
You could try use a CTE like;
WITH ValidSymbols AS (
SELECT DISTINCT valid_symbol
FROM somewhere
)
SELECT mt.*
FROM MyTable mt
INNER JOIN ValidSymbols v1
ON mt.symbol_01 = v1.valid_symbol
INNER JOIN ValidSymbols v2
ON mt.symbol_02 = v2.valid_symbol
From a performance perspective, your query is the right way to do this. I would write it as:
select *
from mytable t
where exists (select 1
from valid_symbols vs
where t.symbol_01 = vs.valid_symbol
) and
exists (select 1
from valid_symbols vs
where t.symbol_02 = vs.valid_symbol
) ;
The important component is that you need an index on valid_symbols(valid_symbol). With this index, the lookup should be pretty fast. Appropriate indexes can even work if valid_symbols is a view, although the effect depends on the complexity of the view.
You seem to have a situation where you have two foreign key relationships. If you explicitly declare these relationships, then the database will enforce that the columns in your table match the valid symbols.

'In' clause in SQL server with multiple columns

I have a component that retrieves data from database based on the keys provided.
However I want my java application to get all the data for all keys in a single database hit to fasten up things.
I can use 'in' clause when I have only one key.
While working on more than one key I can use below query in oracle
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to writing
SELECT * FROM <table_name>
where value_type = 1 and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 1 and CODE1 = 'CORE'
together
However, this concept of using 'in' clause as above is giving below error in 'SQL server'
ERROR:An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let know if their is any way to achieve the same in SQL server.
This syntax doesn't exist in SQL Server. Use a combination of And and Or.
SELECT *
FROM <table_name>
WHERE
(value_type = 1 and CODE1 = 'COMM')
OR (value_type = 1 and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in oracle with multiple fields.)
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.type_code = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table values constructor, and perform a join against it. You can only specify up to 1000 rows in a table values constructor, hence the 1000 tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique, otherwise you'll get duplicate values. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this to use EXIST instead of JOIN. That may have different performance characteristics and it will avoid the problem of producing duplicate results if your values aren't unique. It may be worth trying both EXIST and JOIN on your use case to see what's a better fit. Here's how EXIST would look,
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table values constructor may be your best choice. Use EXIST or JOIN, depending on which gives you better performance on your database.
Normally you can not do it, but can use the following technique.
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put then in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT. *
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.ValueType AND T.Code = DT.Code
Thus, you avoid storing data in your code (persistent table version) and allow to easily modify the filters (without changing the code).
I think you can try this, combine and and or at the same time.
SELECT
*
FROM
<table_name>
WHERE
value_type = 1
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar problem in MS SQL, but a little different. Maybe it will help somebody in futere, in my case i found this solution (not full code, just example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of cource on Coupon and Campaign in table i have index for fast search.
Compute it in MS Sql
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');

Using LIKE in an Oracle IN clause

I know I can write a query that will return all rows that contain any number of values in a given column, like so:
Select * from tbl where my_col in (val1, val2, val3,... valn)
but if val1, for example, can appear anywhere in my_col, which has datatype varchar(300), I might instead write:
select * from tbl where my_col LIKE '%val1%'
Is there a way of combing these two techniques. I need to search for some 30 possible values that may appear anywhere in the free-form text of the column.
Combining these two statements in the following ways does not seem to work:
select * from tbl where my_col LIKE ('%val1%', '%val2%', 'val3%',....)
select * from tbl where my_col in ('%val1%', '%val2%', 'val3%',....)
What would be useful here would be a LIKE ANY predicate as is available in PostgreSQL
SELECT *
FROM tbl
WHERE my_col LIKE ANY (ARRAY['%val1%', '%val2%', '%val3%', ...])
Unfortunately, that syntax is not available in Oracle. You can expand the quantified comparison predicate using OR, however:
SELECT *
FROM tbl
WHERE my_col LIKE '%val1%' OR my_col LIKE '%val2%' OR my_col LIKE '%val3%', ...
Or alternatively, create a semi join using an EXISTS predicate and an auxiliary array data structure (see this question for details):
SELECT *
FROM tbl t
WHERE EXISTS (
SELECT 1
-- Alternatively, store those values in a temp table:
FROM TABLE (sys.ora_mining_varchar2_nt('%val1%', '%val2%', '%val3%'/*, ...*/))
WHERE t.my_col LIKE column_value
)
For true full-text search, you might want to look at Oracle Text: http://www.oracle.com/technetwork/database/enterprise-edition/index-098492.html
A REGEXP_LIKE will do a case-insensitive regexp search.
select * from Users where Regexp_Like (User_Name, 'karl|anders|leif','i')
This will be executed as a full table scan - just as the LIKE or solution, so the performance will be really bad if the table is not small. If it's not used often at all, it might be ok.
If you need some kind of performance, you will need Oracle Text (or some external indexer).
To get substring indexing with Oracle Text you will need a CONTEXT index. It's a bit involved as it's made for indexing large documents and text using a lot of smarts. If you have particular needs, such as substring searches in numbers and all words (including "the" "an" "a", spaces, etc) , you need to create custom lexers to remove some of the smart stuff...
If you insert a lot of data, Oracle Text will not make things faster, especially if you need the index to be updated within the transactions and not periodically.
No, you cannot do this. The values in the IN clause must be exact matches. You could modify the select thusly:
SELECT *
FROM tbl
WHERE my_col LIKE %val1%
OR my_col LIKE %val2%
OR my_col LIKE %val3%
...
If the val1, val2, val3... are similar enough, you might be able to use regular expressions in the REGEXP_LIKE operator.
Yes, you can use this query (Instead of 'Specialist' and 'Developer', type any strings you want separated by comma and change employees table with your table)
SELECT * FROM employees em
WHERE EXISTS (select 1 from table(sys.dbms_debug_vc2coll('Specialist', 'Developer')) mt
where em.job like ('%' || mt.column_value || '%'));
Why my query is better than the accepted answer: You don't need a CREATE TABLE permission to run it. This can be executed with just SELECT permissions.
In Oracle you can use regexp_like as follows:
select *
from table_name
where regexp_like (name, '^(value-1|value-2|value-3....)');
The caret (^) operator to indicate a beginning-of-line character &
The pipe (|) operator to indicate OR operation.
This one is pretty fast :
select * from listofvalue l
inner join tbl on tbl.mycol like '%' || l.value || '%'
Just to add on #Lukas Eder answer.
An improvement to avoid creating tables and inserting values
(we could use select from dual and unpivot to achieve the same result "on the fly"):
with all_likes as
(select * from
(select '%val1%' like_1, '%val2%' like_2, '%val3%' like_3, '%val4%' as like_4, '%val5%' as like_5 from dual)
unpivot (
united_columns for subquery_column in ("LIKE_1", "LIKE_2", "LIKE_3", "LIKE_4", "LIKE_5"))
)
select * from tbl
where exists (select 1 from all_likes where tbl.my_col like all_likes.united_columns)
I prefer this
WHERE CASE WHEN my_col LIKE '%val1%' THEN 1
WHEN my_col LIKE '%val2%' THEN 1
WHEN my_col LIKE '%val3%' THEN 1
ELSE 0
END = 1
I'm not saying it's optimal but it works and it's easily understood. Most of my queries are adhoc used once so performance is generally not an issue for me.
select * from tbl
where exists (select 1 from all_likes where all_likes.value = substr(tbl.my_col,0, length(tbl.my_col)))
You can put your values in ODCIVARCHAR2LIST and then join it as a regular table.
select tabl1.* FROM tabl1 LEFT JOIN
(select column_value txt from table(sys.ODCIVARCHAR2LIST
('%val1%','%val2%','%val3%')
)) Vals ON tabl1.column LIKE Vals.txt WHERE Vals.txt IS NOT NULL
You don't need a collection type as mentioned in https://stackoverflow.com/a/6074261/802058. Just use an subquery:
SELECT *
FROM tbl t
WHERE EXISTS (
SELECT 1
FROM (
SELECT 'val1%' AS val FROM dual
UNION ALL
SELECT 'val2%' AS val FROM dual
-- ...
-- or simply use an subquery here
)
WHERE t.my_col LIKE val
)

Matching two columns in MySQL

I'm quite new to SQL and have a question about matching names from two columns located within a table:
Let's say I want to use the soundex() function to match two columsn. If I use this query:
SELECT * FROM tablename WHERE SOUNDEX(column1)=SOUNDEX(column2);
a row is returned if the two names within that row match. Now I'd also like to get those name matches between column1 and column2 that aren't in the same row. Is there a way to automate a procedure whereby every name from column1 is compared to every name from column2?
Thanks :)
p.s.: If anyone could point me in the direction of a n-gram/bi-gram matching algorithm that is easy for a noob to implement into mysql that would be good as well.
If your table has a key, say id, you can try:
select A.column1, B.column2
from tablename as A, tablename as B
where (A.id != B.id) and (SOUNDEX(A.column1) = SOUNDEX(B.column2))
You can join the table to itself on that relationship as such:
SELECT * FROM tablename t1 JOIN tablename t2
ON SOUNDEX(t1.column1) = SOUNDEX(t2.column2);

Using tuples in SQL in clause

Given a database like this:
BEGIN TRANSACTION;
CREATE TABLE aTable (
a STRING,
b STRING
);
INSERT INTO aTable VALUES('one','two');
INSERT INTO aTable VALUES('one','three');
CREATE TABLE anotherTable (
a STRING,
b STRING
);
INSERT INTO anotherTable VALUES('one','three');
INSERT INTO anotherTable VALUES('two','three');
COMMIT;
I would like to do something along the lines of
SELECT a,b FROM aTable
WHERE (aTable.a,aTable.b) IN
(SELECT anotherTable.a,anotherTable.b FROM anotherTable);
To get the answer 'one','three', but I'm getting "near ",": syntax error"
Is this possible in any flavour of SQL? (I'm using SQLite)
Am I making a gross conceptual error? Or what?
your code works if you do it in PostgreSQL or Oracle. on MS SQL, it is not supported
use this:
SELECT a,b FROM aTable
WHERE
-- (aTable.a,aTable.b) IN -- leave this commented, it makes the intent more clear
EXISTS
(
SELECT anotherTable.a,anotherTable.b -- do not remove this too, perfectly fine for self-documenting code, i.e.. tuple presence testing
FROM anotherTable
WHERE anotherTable.a = aTable.a AND anotherTable.b = aTable.b
);
[EDIT]
sans the stating of intent:
SELECT a,b FROM aTable
WHERE
EXISTS
(
SELECT *
FROM anotherTable
WHERE anotherTable.a = aTable.a AND anotherTable.b = aTable.b
);
it's somewhat lame, for more than a decade, MS SQL still don't have first-class support for tuples. IN tuple construct is way more readable than its analogous EXISTS construct. btw, JOIN also works (tster's code), but if you need something more flexible and future-proof, use EXISTS.
[EDIT]
speaking of SQLite, i'm dabbling with it recently. yeah, IN tuples doesn't work
you can use a join:
SELECT aTable.a, aTable.b FROM aTable
JOIN anotherTable ON aTable.a = anotherTable.a AND aTable.b = anotherTable.b
Another alternative is to use concatenation to make your 2-tuple into a single field :
SELECT a,b FROM aTable
WHERE (aTable.a||'-'||aTable.b) IN
(SELECT (anotherTable.a || '-' || anotherTable.b FROM anotherTable);
...just be aware that bad things can happen if a or b contain the delimiter '-'