SQL Where Condition having 15+ different strings for one column - sql

I would like to filter one column against multiple specific values, like:
select * from table1
where column1 = a or column1 = b or column1 = c ...
Can it be done in a better way? (The SQL statement in use is already over 10 lines long; with the OR conditions it will grow by another 10 lines, and I suspect it will make the code much slower.)

You can use in:
select t.*
from t
where column in ( . . . );
The in list is essentially equivalent to a series of or conditions, with some nuances. For instance, all the values in the in list are converted to the same type. If one is a number and the rest are strings, most databases will convert the strings to numbers, which can generate an error when a string is not numeric.
For performance, you want an index on t(column).
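As a minimal sketch of the in approach, the following uses Python's built-in sqlite3 module with an in-memory database. The table name t, the column name col, the index name, and the sample values are all assumptions made for illustration. The placeholder list is generated from the values, so the statement stays short however many values are passed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, col TEXT)")
conn.execute("CREATE INDEX idx_t_col ON t (col)")  # index suggested in the answer
conn.executemany("INSERT INTO t (col) VALUES (?)", [(v,) for v in "abcdef"])

wanted = ["a", "c", "e"]  # hypothetical search values

# One "?" per value keeps every value parameterized.
placeholders = ", ".join("?" for _ in wanted)
in_rows = conn.execute(
    f"SELECT col FROM t WHERE col IN ({placeholders}) ORDER BY col", wanted
).fetchall()

# The equivalent chain of OR conditions, for comparison.
or_clause = " OR ".join("col = ?" for _ in wanted)
or_rows = conn.execute(
    f"SELECT col FROM t WHERE {or_clause} ORDER BY col", wanted
).fetchall()
```

Both queries return the same rows; the in form is simply easier to read and to generate.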

Related

Add column with substring of other column in SQL (Snowflake)

I feel like this should be simple but I'm relatively unskilled in SQL and I can't seem to figure it out. I'm used to wrangling data in python (pandas) or Spark (usually pyspark) and this would be a one-liner in either of those. Specifically, I'm using Snowflake SQL, but I think this is probably relevant to a lot of flavors of SQL.
Essentially I just want to trim the first character off of a specific column. More generally, what I'm trying to do is replace a column with a substring of the same column. I would even settle for creating a new column that's a substring of an existing column. I can't figure out how to do any of these things.
One obvious solution would be to create a temporary table with something like
CREATE TEMPORARY TABLE tmp_sub AS
SELECT id_col, substr(id_col, 2, 10) AS id_col_sub FROM table1
and then join it back and write a new table
CREATE TABLE table2 AS
SELECT
b.id_col_sub as id_col,
a.some_col1, a.some_col2, ...
FROM table1 a
JOIN tmp_sub b
ON a.id_col = b.id_col
My tables have roughly a billion rows though, and this feels extremely inefficient. Maybe I'm wrong? Maybe this is just the right way to do it? I guess I could replace the CREATE TABLE table2 AS ... with INSERT OVERWRITE INTO table1 ... and at least that wouldn't store an extra copy of the whole thing.
Any thoughts and ideas are most welcome. I come at this humbly from the perspective of someone who is baffled by a language that so many people seem to have mastery over.
I'm not sure of the exact syntax or functions in Snowflake, but generally speaking there are a few different ways of achieving this.
The general approach that should work almost universally is the SUBSTRING function, which most databases provide under that or a similar name.
Assuming you have a table called Table1 with the following data:
+-------+-----------------------------------------+
| Code  | Desc                                    |
+-------+-----------------------------------------+
| 0001  | 1First Character Will be Removed        |
| 0002  | xCharacter to be Removed                |
+-------+-----------------------------------------+
The SQL code to remove the first character would be:
select SUBSTRING(Desc,2,len(desc)) from Table1
Please note that the name of the "SUBSTRING" function varies between databases. In Oracle, for example, the function is "SUBSTR". You just need to find the Snowflake equivalent.
Another approach that works at least in SQL Server (and in MySQL if you replace len with CHAR_LENGTH) is the "RIGHT" function:
select RIGHT(Desc,len(Desc) - 1) from Table1
Based on your question I assume you actually want to update the actual data within the table. In that case you can use the same function above in an update statement.
update Table1 set Desc = SUBSTRING(Desc,2,len(desc))
You didn't try this?
UPDATE tableX
SET columnY = substr(columnY, 2, 10 ) ;
-Paul-
There is no need to specify the length, as is evidenced from the following simple test harness:
SELECT $1
,SUBSTR($1, 2)
,RIGHT($1, -2)
FROM VALUES
('abcde')
,('bcd')
,('cdef')
,('defghi')
,('e')
,('fg')
,('')
;
Both expressions here - SUBSTR(<col>, 2) and RIGHT(<col>, -2) - effectively remove the first character of the <col> column value.
As for the strategy of using UPDATE versus INSERT OVERWRITE, I do not believe that there will be any difference in performance or outcome, so I might opt for the UPDATE since it is simpler. So, in conclusion, I would use:
UPDATE tableX
SET columnY = SUBSTR(columnY, 2)
;
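To illustrate the in-place UPDATE with the two-argument SUBSTR, here is a small sketch using Python's built-in sqlite3 module. It runs against SQLite rather than Snowflake, so it only demonstrates the shape of the statement; the table and column names are taken from the answer above and the sample values are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableX (columnY TEXT)")
conn.executemany(
    "INSERT INTO tableX VALUES (?)", [("abcde",), ("e",), ("",)]
)

# Without a length argument, substr returns everything from position 2 onward.
conn.execute("UPDATE tableX SET columnY = substr(columnY, 2)")

result = [
    row[0]
    for row in conn.execute("SELECT columnY FROM tableX ORDER BY rowid")
]
```

Single-character and empty values simply become empty strings, so no special handling is needed for short inputs.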

Search for multiple values in IN clause located in where clause

I'm extending search functionality and I need to add a search for multiple values in an IN clause. I have a textbox and type a number into it. This is dynamic functionality that adds query syntax to the WHERE clause of the SQL, and it should be added only when you search.
Let's say we put 1056 in our textbox to search for it.
SELECT
*
FROM
QueryTable
WHERE
(Select ID from SearchTable Where Number='1056') IN
(SELECT value FROM OPENJSON(JsonField,'$.Data.ArrayOfIds'))
This works properly because searching for ID in SearchTable returns one row, but I want to make it work when it returns multiple rows, something like this:
(Select ID from SearchTable Where Number like '%105%')
This obviously gives an error, because it returns multiple records. What is the approach for searching multiple values (or an array of values) in an IN clause?
Use an EXISTS. I think this is what you need (no sample data to test with to confirm):
SELECT *
FROM QueryTable QT
WHERE EXISTS (SELECT 1
FROM SearchTable ST
JOIN OPENJSON(QT.JsonField,'$.Data.ArrayOfIds') J ON ST.ID = J.[value]
WHERE ST.Number LIKE '%105%')
Note: I've added aliases to everything, but I've guessed the alias for JsonField. Qualifying your columns is really important to avoid unexpected behaviour and to make your code easier to read for both yourself and others.
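The same EXISTS pattern can be sketched outside SQL Server. The example below uses Python's built-in sqlite3 module and SQLite's json_each table-valued function as a stand-in for OPENJSON, which assumes a SQLite build with the JSON functions available. The Name column and all sample rows are hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SearchTable (ID INTEGER PRIMARY KEY, Number TEXT)")
conn.execute("CREATE TABLE QueryTable (Name TEXT, JsonField TEXT)")
conn.executemany(
    "INSERT INTO SearchTable VALUES (?, ?)",
    [(1, "1056"), (2, "2105"), (3, "9999")],
)
conn.executemany(
    "INSERT INTO QueryTable VALUES (?, ?)",
    [
        ("row_a", json.dumps({"Data": {"ArrayOfIds": [1, 7]}})),
        ("row_b", json.dumps({"Data": {"ArrayOfIds": [3]}})),
        ("row_c", json.dumps({"Data": {"ArrayOfIds": [2, 3]}})),
    ],
)

# A row qualifies if at least one ID in its JSON array belongs to a
# SearchTable row whose Number matches the pattern.
matches = [
    row[0]
    for row in conn.execute(
        """
        SELECT QT.Name
        FROM QueryTable AS QT
        WHERE EXISTS (SELECT 1
                      FROM json_each(QT.JsonField, '$.Data.ArrayOfIds') AS J
                      JOIN SearchTable AS ST ON ST.ID = J.value
                      WHERE ST.Number LIKE ?)
        ORDER BY QT.Name
        """,
        ("%105%",),
    )
]
```

Because EXISTS only asks whether any matching row exists, it does not matter how many SearchTable rows the LIKE pattern returns.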

One select for multiple records by composite key

Such a query as in the title would look like this I guess:
select * from table t where (t.k1='apple' and t.k2='pie') or (t.k1='strawberry' and t.k2='shortcake')
... --10000 more key pairs here
This looks quite verbose to me. Any better alternatives? (Currently using SQLite, might use MySQL/Oracle.)
You can use, for example, the following on Oracle. I assume that if you use a regular CONCAT() function instead of Oracle's || on another database, it would work too (it is simply a string comparison against the IN list). Note that such a query might have a suboptimal execution plan.
SELECT *
FROM
TABLE t
WHERE
t.k1||','||t.k2 IN ('apple,pie',
'strawberry,shortcake' );
But if you have your value list stored in other table, Oracle supports also the format below.
SELECT *
FROM
TABLE t
WHERE (t.k1,t.k2) IN ( SELECT x.k1, x.k2 FROM x );
Don't be afraid of verbose syntax. Concatenation tricks can easily mess up the selectivity estimates or even prevent the database from using indexes.
Here is another syntax that may or may not work in your database.
select *
from table t
where (k1, k2) in(
('apple', 'pie')
,('strawberry', 'shortcake')
,('banana', 'split')
,('raspberry', 'vodka')
,('melon', 'shot')
);
A final comment: if you find yourself wanting to submit 1000 values as filters, you should most likely look for a different approach altogether :)
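One such different approach, sketched here under stated assumptions, is to load the key pairs into a temporary table and filter against it with a row-value IN subquery, as in the Oracle example above. The sketch uses Python's built-in sqlite3 module (row values need SQLite 3.15 or later); the table t, the price column, the temporary table name wanted, and the sample data are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (k1 TEXT, k2 TEXT, price REAL, PRIMARY KEY (k1, k2))"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("apple", "pie", 3.5),
        ("apple", "crumble", 4.0),
        ("strawberry", "shortcake", 5.0),
        ("banana", "split", 4.5),
    ],
)

# The pairs to look up; in practice this list could hold thousands of entries.
wanted_pairs = [("apple", "pie"), ("strawberry", "shortcake")]

conn.execute("CREATE TEMP TABLE wanted (k1 TEXT, k2 TEXT)")
conn.executemany("INSERT INTO wanted VALUES (?, ?)", wanted_pairs)

# Compare both key columns at once; no string concatenation involved.
rows = conn.execute(
    """
    SELECT t.k1, t.k2, t.price
    FROM t
    WHERE (t.k1, t.k2) IN (SELECT k1, k2 FROM wanted)
    ORDER BY t.k1, t.k2
    """
).fetchall()
```

The statement text stays the same size regardless of how many pairs are loaded, and the key columns are compared directly rather than through a derived string.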
select * from table t
where (t.k1+':'+t.k2)
in ('strawberry:shortcake','apple:pie','banana:split','etc:etc')
This will work in most cases, as it concatenates the two keys and looks them up as a single column.
Of course, you need to choose a separator that will never appear in the values of k1 and k2.
For example, if k1 and k2 are of type int, any non-digit character can serve as the separator.
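The separator caveat is easy to demonstrate. The sketch below uses Python's built-in sqlite3 module and the || concatenation operator (instead of +); the two sample rows are contrived assumptions chosen so that the separator appears inside a key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (k1 TEXT, k2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)", [("a:b", "c"), ("a", "b:c")]
)

# Looking for the single pair ('a:b', 'c') through concatenation:
# both rows concatenate to 'a:b:c', so both are returned.
concat_hits = conn.execute(
    "SELECT k1, k2 FROM t WHERE k1 || ':' || k2 IN ('a:b:c') ORDER BY k1"
).fetchall()

# Comparing the columns separately returns only the intended row.
exact_hits = conn.execute(
    "SELECT k1, k2 FROM t WHERE k1 = 'a:b' AND k2 = 'c'"
).fetchall()
```

Here concat_hits contains two rows while exact_hits contains one, which is why the separator must be a character that cannot occur in either key.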
SELECT * FROM tableName t
WHERE t.k1=( CASE WHEN t.k2=VALUE THEN someValue
WHEN t.k2=otherVALUE THEN someotherValue END)

Oracle SQL - Joining list of values to a field with those values concatenated

The title is a bit confusing, so I'll explain with an example what I'm trying to do.
I have a field called "modifier". This is a field with concatenated values for each individual. For example, the value in one row could be:
*26,50,4 *
and the value in the next row
*4 *
And the table (Table A) would look something like this:
Key   Modifier
1     *26,50,4 *
2     *4 *
3     *1,2,3,4 *
The asterisks are always going to be in the same position (here, 1 and 26) with an uncertain number of numbers in between, separated by commas.
What I'd like to do is "join" this "modifier" field to another table (Table B) with a list of possible values for that modifier. e.g., that table could look like this:
ID   MOD
1    26
2    3
3    50
4    78
If a value in A.modifier appears in B.mod, I want to keep that row in Table A. Otherwise, leave it out. (I use the term "join" loosely because I'm not sure that's what I need here.)
Is this possible? How would I do it?
Thanks in advance!
edit 1: I realize I can use regular expressions and do a bunch of or statements that search for the comma-separated values in the MOD list, but is there a better way?
One way to do it is using TRIM, string concatenations and LIKE.
SELECT *
FROM tableA a
WHERE EXISTS(
SELECT 1 FROM tableB b
WHERE
','|| trim( trim( BOTH '*' FROM a.Modifier )) ||','
LIKE '%,'|| b.mod || ',%'
);
Demo --> http://www.sqlfiddle.com/#!4/1caa8/10
This query might still be slow for huge tables (it always performs full scans of tables or indexes); however, it should be faster than using regular expressions or parsing the comma-separated lists into individual values.
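The delimiter-wrapping idea can be tried on the sample data from the question. This is a sketch using Python's built-in sqlite3 module, so it relies on SQLite's two-argument trim(X, Y) rather than Oracle's TRIM(BOTH ... FROM ...); the column name row_key is an assumption standing in for Key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tableA (row_key INTEGER, modifier TEXT)")
conn.execute("CREATE TABLE tableB (id INTEGER, mod INTEGER)")
conn.executemany(
    "INSERT INTO tableA VALUES (?, ?)",
    [(1, "*26,50,4 *"), (2, "*4 *"), (3, "*1,2,3,4 *")],
)
conn.executemany(
    "INSERT INTO tableB VALUES (?, ?)", [(1, 26), (2, 3), (3, 50), (4, 78)]
)

# Strip the asterisks and padding, wrap the list in commas, and look for
# ',<mod>,' so that 3 cannot match inside 13 or 30.
kept = [
    row[0]
    for row in conn.execute(
        """
        SELECT a.row_key
        FROM tableA AS a
        WHERE EXISTS (SELECT 1
                      FROM tableB AS b
                      WHERE ',' || trim(a.modifier, '* ') || ','
                            LIKE '%,' || b.mod || ',%')
        ORDER BY a.row_key
        """
    )
]
```

Row 1 is kept because of 26 and 50, row 3 because of 3, and row 2 is dropped because 4 is not in Table B.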

How to make result set from ('1','2','3')?

I have a question: how can I build a result set from just a list of values? For example, I have these values: ('1','2','3')
And I want to write a SQL statement that returns a table like this:
1
2
3
Thanks.
[Edit]
Sorry for the wrong question.
Actually, the list does not contain integers; it contains strings.
What I currently need is something like ('aa','bb','cc').
[/Edit]
If you want to write a SQL statement that takes a comma-separated list and generates an arbitrary number of actual rows, the only real way is to use a table function, which calls a PL/SQL function that splits the input string and returns the elements as separate rows.
Check out this link for an intro to table-functions.
Alternatively, if you can construct the SQL statement programmatically in your client you can do:
SELECT 'aa' FROM DUAL
UNION
SELECT 'bb' FROM DUAL
UNION
SELECT 'cc' FROM DUAL
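Constructing that statement programmatically in the client could look like the following sketch. It uses Python's built-in sqlite3 module, where no FROM DUAL is needed (on Oracle each SELECT would need it), and it uses UNION ALL, which keeps duplicates, whereas UNION would remove them. The list items and the column alias item are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
items = ["aa", "bb", "cc"]  # hypothetical input list

# One "SELECT ?" per item, so the values are bound rather than spliced in.
sql = " UNION ALL ".join("SELECT ? AS item" for _ in items)
rows = [row[0] for row in conn.execute(sql, items)]
```

Binding the values as parameters avoids quoting problems when an item itself contains a quote character.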
The best way I've found is using XML.
SELECT items.extract('/l/text()').getStringVal() item
FROM TABLE(xmlSequence(
EXTRACT(XMLType('<all><l>'||
REPLACE('aa,bb,cc',',','</l><l>')||'</l></all>')
,'/all/l'))) items;
Wish I could take credit but alas : http://pbarut.blogspot.com/2006/10/binding-list-variable.html.
Basically, it converts the list to an XML document and then parses it back out.
The easiest way is to abuse a table that is guaranteed to have enough rows.
-- for Oracle
select rownum from tab where rownum < 4;
If that is not possible, check out Oracle Row Generator Techniques.
I like this one (requires 10g):
select integer_value
from dual
where 1=2
model
dimension by ( 0 as key )
measures ( 0 as integer_value )
rules upsert ( integer_value[ for key from 1 to 10 increment 1 ] = cv(key) )
;
One trick I've used in various database systems (not just SQL databases) is to have a table that just contains the first 100 or 1000 integers. Such a table is very easy to create programmatically, and your query then becomes:
SELECT value FROM numbers WHERE value < 4 ORDER BY value
You can use the table for lots of similar purposes.
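Creating such a table programmatically takes only a few lines. This sketch uses Python's built-in sqlite3 module; the table name numbers and column name value come from the query above, and the upper bound of 1000 is one of the sizes mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE numbers (value INTEGER PRIMARY KEY)")

# Fill the helper table with the integers 1 through 1000.
conn.executemany(
    "INSERT INTO numbers VALUES (?)", [(n,) for n in range(1, 1001)]
)

small = [
    row[0]
    for row in conn.execute(
        "SELECT value FROM numbers WHERE value < 4 ORDER BY value"
    )
]
```

The table only has to be populated once and can then be reused by any query that needs a run of consecutive integers.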