Filtering several numbers - sql

I want to get some data froma Oracle DB to Power QUery (Excel).
I am managing this with a sql-statement.
There are 10 columns (50 total) and millions of rows. I need to filter some data / columns. The criterie are just some numbers like:
100258
100256
100333
100055
This are just SAP-Cost center
For now I have just a Where-statment, which includes 22 different numbers.
WHERE column1 = 100256,
column1 = 100258, ....
Is there maybe a more elegant way?
Maybe something like an array?
best regards
Joshua

You may use WHERE IN e.g.
WHERE column1 IN(100256, 100258, ...)
If you expect to have more values than can be supported by IN (1000, I think) then consider creating a table to store the values, say table1 with a column val. Then you could use:
WHERE column1 IN (SELECT val FROM table1)
You might also consider joining to table1, depending on what your actual query is.

Use with clause:
With numbers as (
Select 100256,100258 As number From dual
)
From table1,numbers
Where column1 = numbers.number

Related

Find rows with all columns duplicated and no unique field in PostgreSQL

Say I have a table like this, where no column or combination of columns is guaranteed to be unique:
GAME_EVENT
USERNAME
ITEM
QUANTITY
sell
poringLUVR
sword
1
sell
poringLUVR
sword
1
kill
daenerys
civilians
200000
kill
daenerys
civilians
200000
invoke
sylvanas
undead
1000000
And I want to retrieve the list of all rows that exist more than once (where the combination of ALL their columns appears more than once).
(In this case I would expect to get a list with the "sell/poringLUVR" and "kill/daenerys" rows)
What would be a good way of approaching this? Would a combined index be of any help? Suggestions for non-Postgres approaches are also welcome.
Assuming all columns NOT NULL, this will do:
SELECT *
FROM tbl t1
WHERE EXISTS (
SELECT FROM tbl t2
WHERE (t1.*) = (t2.*)
AND t1.ctid <> t2.ctid
);
ctid is a system column, the "tuple identifier" / "item pointer" that can serve as poor-man's PK in the absence of an actual PK (which you obviously don't have), and only within the scope of a single query. Related:
Delete duplicate rows from small table
How do I decompose ctid into page and row numbers?
If columns can be NULL, (more expensively) operate with IS NOT DISTINCT FROM instead of =. See:
How do I (or can I) SELECT DISTINCT on multiple columns?
(t1.*) = (t2.*) is comparing ROW values. This shorter syntax is equivalent: t1 = t2 unless a column of the same name exists in the underlying tables, in which case the second form fails while the first won't. See:
SQL syntax term for 'WHERE (col1, col2) < (val1, val2)'
Index?
If any of the columns has a particularly high cardinality (many distinct values, few duplicates), let's call it hi_cardi_column for the purpose of this answer, a plain btree index on just that column can be efficient for your task. A combination of a few, small columns with a multicolumn index can work, too. The point is to have a small, fast index or the overhead won't pay.
SELECT *
FROM tbl t1
WHERE EXISTS (
SELECT FROM tbl t2
WHERE t1.hi_cardi_column = t2.hi_cardi_column -- logically redundant
AND (t1.*) = (t2.*)
AND t1.ctid <> t2.ctid
);
The added condition t1.hi_cardi_column = t2.hi_cardi_column is logically redundant, but helps to utilize said index.
Other than that I don't see much potential for index support as all rows of the table have to be visited anyway, and all columns have to be checked.

multiple sets of conditions

I'd like to run a query on a table that matches any of the following sets of conditions:
SELECT
id,
time
FROM
TABLE
WHERE
<condition1 is True> OR,
<condition2 is True> OR,
<condition3 is True> OR,
...
Each condition might look like:
id = 'id1' AND t > 20 AND t < 40
The values from each WHERE condition (id, 20, 40 above) are rows in a pandas dataframe - that is 20k rows long. I see two options that would technically work:
make 20k queries to the database, one for each condition, and concat the result
generate a (very long) query as above and submit
My question: what would be an idiomatic/performant way to accomplish this?
I suspect neither of the above are appropriate approaches and this problem is somewhat difficult to google.
I think it would be better to create a temporary table with columns id, t1, and t2 and put your 20k rows in there. Then just join to this temporary table:
SELECT DISTINCT TABLE.id, time
FROM TABLE
JOIN TEMP_TABLE T2 ON
TABLE.ID = T2.ID AND TABLE.T > T1 AND TABLE.T < T2;

'In' clause in SQL server with multiple columns

I have a component that retrieves data from database based on the keys provided.
However I want my java application to get all the data for all keys in a single database hit to fasten up things.
I can use 'in' clause when I have only one key.
While working on more than one key I can use below query in oracle
SELECT * FROM <table_name>
where (value_type,CODE1) IN (('I','COMM'),('I','CORE'));
which is similar to writing
SELECT * FROM <table_name>
where value_type = 1 and CODE1 = 'COMM'
and
SELECT * FROM <table_name>
where value_type = 1 and CODE1 = 'CORE'
together
However, this concept of using 'in' clause as above is giving below error in 'SQL server'
ERROR:An expression of non-boolean type specified in a context where a condition is expected, near ','.
Please let know if their is any way to achieve the same in SQL server.
This syntax doesn't exist in SQL Server. Use a combination of And and Or.
SELECT *
FROM <table_name>
WHERE
(value_type = 1 and CODE1 = 'COMM')
OR (value_type = 1 and CODE1 = 'CORE')
(In this case, you could make it shorter, because value_type is compared to the same value in both combinations. I just wanted to show the pattern that works like IN in oracle with multiple fields.)
When using IN with a subquery, you need to rephrase it like this:
Oracle:
SELECT *
FROM foo
WHERE
(value_type, CODE1) IN (
SELECT type, code
FROM bar
WHERE <some conditions>)
SQL Server:
SELECT *
FROM foo
WHERE
EXISTS (
SELECT *
FROM bar
WHERE <some conditions>
AND foo.type_code = bar.type
AND foo.CODE1 = bar.code)
There are other ways to do it, depending on the case, like inner joins and the like.
If you have under 1000 tuples you want to check against and you're using SQL Server 2008+, you can use a table values constructor, and perform a join against it. You can only specify up to 1000 rows in a table values constructor, hence the 1000 tuple limitation. Here's how it would look in your situation:
SELECT <table_name>.* FROM <table_name>
JOIN ( VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b) ON a = value_type AND b = CODE1;
This is only a good idea if your list of values is going to be unique, otherwise you'll get duplicate values. I'm not sure how the performance of this compares to using many ANDs and ORs, but the SQL query is at least much cleaner to look at, in my opinion.
You can also write this to use EXIST instead of JOIN. That may have different performance characteristics and it will avoid the problem of producing duplicate results if your values aren't unique. It may be worth trying both EXIST and JOIN on your use case to see what's a better fit. Here's how EXIST would look,
SELECT * FROM <table_name>
WHERE EXISTS (
SELECT 1
FROM (
VALUES
('I', 'COMM'),
('I', 'CORE')
) AS MyTable(a, b)
WHERE a = value_type AND b = CODE1
);
In conclusion, I think the best choice is to create a temporary table and query against that. But sometimes that's not possible, e.g. your user lacks the permission to create temporary tables, and then using a table values constructor may be your best choice. Use EXIST or JOIN, depending on which gives you better performance on your database.
Normally you can not do it, but can use the following technique.
SELECT * FROM <table_name>
where (value_type+'/'+CODE1) IN (('I'+'/'+'COMM'),('I'+'/'+'CORE'));
A better solution is to avoid hardcoding your values and put then in a temporary or persistent table:
CREATE TABLE #t (ValueType VARCHAR(16), Code VARCHAR(16))
INSERT INTO #t VALUES ('I','COMM'),('I','CORE')
SELECT DT. *
FROM <table_name> DT
JOIN #t T ON T.ValueType = DT.ValueType AND T.Code = DT.Code
Thus, you avoid storing data in your code (persistent table version) and allow to easily modify the filters (without changing the code).
I think you can try this, combine and and or at the same time.
SELECT
*
FROM
<table_name>
WHERE
value_type = 1
AND (CODE1 = 'COMM' OR CODE1 = 'CORE')
What you can do is 'join' the columns as a string, and pass your values also combined as strings.
where (cast(column1 as text) ||','|| cast(column2 as text)) in (?1)
The other way is to do multiple ands and ors.
I had a similar problem in MS SQL, but a little different. Maybe it will help somebody in futere, in my case i found this solution (not full code, just example):
SELECT Table1.Campaign
,Table1.Coupon
FROM [CRM].[dbo].[Coupons] AS Table1
INNER JOIN [CRM].[dbo].[Coupons] AS Table2 ON Table1.Campaign = Table2.Campaign AND Table1.Coupon = Table2.Coupon
WHERE Table1.Coupon IN ('0000000001', '0000000002') AND Table2.Campaign IN ('XXX000000001', 'XYX000000001')
Of cource on Coupon and Campaign in table i have index for fast search.
Compute it in MS Sql
SELECT * FROM <table_name>
where value_type + '|' + CODE1 IN ('I|COMM', 'I|CORE');

Is there a single SQL (or its variations) function to check not equals for multiple columns at once?

Just as I can check if a column does not equal one of the strings given in a set.
SELECT * FROM table1 WHERE column1 NOT IN ('string1','string2','string3');
Is there a single function that I can make sure that multiple columns does not equal a single string? Maybe like this.
SELECT * FROM table1 WHERE EACH(column1,column2,column3) <> 'string1';
Such that it gives the same effect as:
SELECT * FROM table1 WHERE column1 <> 'string1'
AND column2 <> 'string1'
AND column3 <> 'string1';
If not, what's the most concise way to do so?
I believe you can just reverse the columns and constants in your first example:
SELECT * FROM table1 WHERE 'string1' NOT IN (column1, column2, column3);
This assumes you are using SQL Server.
UPDATE:
A few people have pointed out potential null comparison problems (even though your desired query would have the same potential problem). This could be worked around by using COALESCE in the following way:
SELECT * FROM table1 WHERE 'string1' NOT IN (
COALESCE(column1,'NA'),
COALESCE(column2,'NA'),
COALESCE(column3,'NA')
);
You should replace 'NA' with a value that will not match whatever 'string1' is. If you do not allow nulls for columns 1,2 and 3 this is not even an issue.
No, there is no standard SQL way to do this. Barring any special constraints on what the string fields contain there's no more concise way to do it than you've already hit upon (col1 <> 'String1' AND col2 <> 'String2').
Additionally, this kind of requirement is often an indication that you have a flaw in your database design and that you're storing the same information in several different columns. If that is true in your case then consider refactoring if possible into a separate table where each column becomes its own row.
The most concise way to do this is
SELECT * FROM table1 WHERE column1 <> 'string1'
AND column2 <> 'string1'
AND column3 <> 'string1';
Yes, I cut & pasted that from your original question. :-)
I'm more concerned why you're wanting to compare against all three columns. It sounds like you might have a table that needs normalization. What are the actual columns of column1, column2 and column3. Are they something like phone1, phone2, and phone3? Perhaps those three columns should actually be in a subtable.

Is there any way to combine IN with LIKE in an SQL statement?

I am trying to find a way, if possible, to use IN and LIKE together. What I want to accomplish is putting a subquery that pulls up a list of data into an IN statement. The problem is the list of data contains wildcards. Is there any way to do this?
Just something I was curious on.
Example of data in the 2 tables
Parent table
ID Office_Code Employee_Name
1 GG234 Tom
2 GG654 Bill
3 PQ123 Chris
Second table
ID Code_Wildcard
1 GG%
2 PQ%
Clarifying note (via third-party)
Since I'm seeing several responses which don't seems to address what Ziltoid asks, I thought I try clarifying what I think he means.
In SQL, "WHERE col IN (1,2,3)" is roughly the equivalent of "WHERE col = 1 OR col = 2 OR col = 3".
He's looking for something which I'll pseudo-code as
WHERE col IN_LIKE ('A%', 'TH%E', '%C')
which would be roughly the equivalent of
WHERE col LIKE 'A%' OR col LIKE 'TH%E' OR col LIKE '%C'
The Regex answers seem to come closest; the rest seem way off the mark.
I'm not sure which database you're using, but with Oracle you could accomplish something equivalent by aliasing your subquery in the FROM clause rather than using it in an IN clause. Using your example:
select p.*
from
(select code_wildcard
from second
where id = 1) s
join parent p
on p.office_code like s.code_wildcard
In MySQL, use REGEXP:
WHERE field1 REGEXP('(value1)|(value2)|(value3)')
Same in Oracle:
WHERE REGEXP_LIKE(field1, '(value1)|(value2)|(value3)')
Do you mean somethign like:
select * FROM table where column IN (
SELECT column from table where column like '%%'
)
Really this should be written like:
SELECT * FROM table where column like '%%'
Using a sub select query is really beneficial when you have to pull records based on a set of logic that you won't want in the main query.
something like:
SELECT * FROM TableA WHERE TableA_IdColumn IN
(
SELECT TableA_IdColumn FROM TableB WHERE TableA_IDColumn like '%%'
)
update to question:
You can't combine an IN statement with a like statement:
You'll have to do three different like statements to search on the various wildcards.
You could use a LIKE statement to obtain a list of IDs and then use that in the IN statement.
But you can't directly combine IN and LIKE.
Perhaps something like this?
SELECT DISTINCT
my_column
FROM
My_Table T
INNER JOIN My_List_Of_Value V ON
T.my_column LIKE '%' + V.search_value + '%'
In this example I've used a table with the values for simplicity, but you could easily change that to a subquery. If you have a large list (like tens of thousands) then performance might be rough.
select *
from parent
where exists( select *
from second
where office_code like trim( code_wildcard ) );
Trim code_wildcard just in case it has trailing blanks.
You could do the Like part in a subquery perhaps?
Select * From TableA Where X in (Select A from TableB where B Like '%123%')
tsql has the contains statement for a full-text-search enabled table.
CONTAINS(Description, '"sea*" OR "bread*"')
If I'm reading the question correctly, we want all Parent rows that have an Office_code that matches any Code_Wildcard in the "Second" table.
In Oracle, at least, this query achieves that:
SELECT *
FROM parent, second
WHERE office_code LIKE code_wildcard;
Am I missing something?