Oracle - Compare column using REGEXP_LIKE - sql

I have a table containing two columns: FIRST_PART and SECOND_PART. I need to run a query against another table using FIRST_PART and SECOND_PART in a LIKE condition.
So, something like: SELECT {fields} FROM {table} WHERE {column} LIKE {first_part}%{second_part}
I thought about constructing a string and using EXECUTE IMMEDIATE, but there must be another way.

You can use:
SELECT field1,
field2,
field3
FROM table_name t
WHERE EXISTS(
SELECT 1
FROM other_table o
WHERE t.column_name LIKE o.first_part || '%' || o.second_part
);
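A minimal, runnable sketch of the same EXISTS pattern, using Python's built-in sqlite3 module as a stand-in for Oracle. The || concatenation and the % wildcard behave the same way here, although SQLite's LIKE is case-insensitive for ASCII by default, which Oracle's is not. The table and column names follow the answer above; the rows are made up for illustration.

```python
import sqlite3

# In-memory SQLite database used only as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE other_table (first_part TEXT, second_part TEXT);
    CREATE TABLE table_name (field1 INTEGER, column_name TEXT);
    -- Hypothetical sample rows
    INSERT INTO other_table VALUES ('AB', 'YZ'), ('foo', 'bar');
    INSERT INTO table_name VALUES
        (1, 'AB-123-YZ'),
        (2, 'foo_bar'),
        (3, 'YZ-123-AB'),
        (4, 'unrelated');
""")

# The LIKE pattern is built per row of other_table: first_part%second_part
rows = conn.execute("""
    SELECT t.field1, t.column_name
    FROM table_name t
    WHERE EXISTS (
        SELECT 1
        FROM other_table o
        WHERE t.column_name LIKE o.first_part || '%' || o.second_part
    )
    ORDER BY t.field1
""").fetchall()

matched_ids = [row[0] for row in rows]
print(matched_ids)
```

Row 3 is not returned because the pattern is anchored: the value has to start with first_part and end with second_part.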

Related

How do I select columns based on a string pattern in BigQuery

I have a table in BigQuery with hundreds of columns, and it just happens that I want to select all of them except for those that begin with an underscore. I know how to do a query to select the columns beginning with an underscore using the INFORMATION_SCHEMA.COLUMNS table, but I can't figure out how I would use this query to select the columns I want. I know BigQuery has EXCEPT but I want to avoid writing out each column that begins with an underscore, and I can't seem to pass to it a subquery or even something like a._*.
Consider below approach
execute immediate (select '''
select * except(''' || string_agg(col) || ''') from your_table
'''
from (
select col
from (select * from your_table limit 1) t,
unnest([struct(translate(to_json_string(t), '{}"', '') as kvs)]),
unnest(split(kvs)) kv,
unnest([struct(split(kv, ':')[offset(0)] as col)])
where starts_with(col, '_')
));
When applied to a table whose underscore-prefixed columns are _c and _e, it generates the statement below
select * except(_c,_e) from your_table
which returns every column except those beginning with an underscore.
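The string-building step can be looked at in isolation with a small Python sketch. It assumes the column names have already been fetched (for instance from INFORMATION_SCHEMA.COLUMNS); the function name build_select_except and the sample column list are hypothetical and only illustrate how the EXCEPT list is assembled.

```python
def build_select_except(table, columns, prefix="_"):
    """Return a SELECT * EXCEPT(...) statement that skips prefixed columns."""
    excluded = [c for c in columns if c.startswith(prefix)]
    if not excluded:
        # EXCEPT() with an empty list is not valid, so fall back to SELECT *
        return f"select * from {table}"
    return f"select * except({','.join(excluded)}) from {table}"


# Hypothetical column list; only _c and _e start with an underscore.
statement = build_select_except("your_table", ["a", "b", "_c", "d", "_e"])
print(statement)
```

The fallback branch matters in practice: if no column matches the prefix, the generated statement would otherwise contain an empty EXCEPT list.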

How to extract the table name from a CREATE/UPDATE/INSERT statement in an SQL query?

I am trying to parse the table being created, inserted into or updated from the following sql queries stored in a table column.
Let's call the table column query. Following is some sample data to demonstrate variations in how the data could look like.
with sample_data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'CREATE OR REPLACE TABLE tbl1 ...' as query union all
select 3 as id, 'DROP TABLE IF EXISTS tbl1; CREATE TABLE tbl1 ...' as query union all
select 4 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 5 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 6 as id, 'UPDATE tbl3 SET col1 = ...' as query union all
select 7 as id, '/*some garbage comments*/ UPDATE tbl3 SET col1 = ...' as query union all
select 8 as id, 'DELETE tbl4 ...' as query
),
Following are the formats of the queries (we are trying to extract table_name):
#1
some optional statements like drop table
CREATE some comments or optional statement like OR REPLACE TABLE table_name
everything else
#2
some optional statements like drop table
INSERT some comments INTO some comments table_name
#3
some optional statements like drop table
UPDATE some comments table_name
everything else
Regular Expression
To construct a suitable regex, let's start with the following relatively simple/readable version:
((CREATE( OR REPLACE)?|DROP) TABLE( IF EXISTS)?|UPDATE|DELETE|INSERT INTO) ([^\s\/*]+)
All the spaces above could be replaced with "at least one whitespace character", i.e. \s+. But we also need to allow comments. For a comment that looks like /*anything*/ the regex looks like \/\*.*\*\/ (where the comment characters are escaped with \ and "anything" is the .* in the middle). Given there could be multiple such comments, optionally separated by whitespace, we end up with (\s*\/\*.*\*\/\s*?)*\s+. Plugging this in everywhere there was a space gives:
((CREATE((\s*\/\*.*\*\/\s*?)*\s+OR(\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(\s*\/\*.*\*\/\s*?)*\s+TABLE((\s*\/\*.*\*\/\s*?)*\s+IF(\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(\s*\/\*.*\*\/\s*?)*\s+INTO)(\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)
One further refinement needs to be made: Bracketed expressions have been used for choices, e.g. (CHOICE1|CHOICE2). But this syntax includes them as capturing groups. Actually we only require one capturing group for the table name so we can exclude all the other capturing groups via ?:, e.g. (?:CHOICE1|CHOICE2). This gives:
(?:(?:CREATE(?:(?:\s*\/\*.*\*\/\s*?)*\s+OR(?:\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(?:\s*\/\*.*\*\/\s*?)*\s+TABLE(?:(?:\s*\/\*.*\*\/\s*?)*\s+IF(?:\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(?:\s*\/\*.*\*\/\s*?)*\s+INTO)(?:\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)
Online Regex Demo
Here's a demo of it working with your examples: Regex101 demo
SQL
The Google BigQuery documentation for REGEXP_EXTRACT says it will return the substring matched by the capturing group. So I'd expect something like this to work:
with sample_data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'CREATE OR REPLACE TABLE tbl1 ...' as query union all
select 3 as id, 'DROP TABLE IF EXISTS tbl1; CREATE TABLE tbl1 ...' as query union all
select 4 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 5 as id, 'INSERT /*some comment*/ INTO tbl2 ...' as query union all
select 6 as id, 'UPDATE tbl3 SET col1 = ...' as query union all
select 7 as id, '/*some garbage comments*/ UPDATE tbl3 SET col1 = ...' as query union all
select 8 as id, 'DELETE tbl4 ...' as query
)
SELECT
*, REGEXP_EXTRACT(query, r"(?:(?:CREATE(?:(?:\s*\/\*.*\*\/\s*?)*\s+OR(?:\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)(?:\s*\/\*.*\*\/\s*?)*\s+TABLE(?:(?:\s*\/\*.*\*\/\s*?)*\s+IF(?:\s*\/\*.*\*\/\s*?)*\s+EXISTS)?|UPDATE|DELETE|INSERT(?:\s*\/\*.*\*\/\s*?)*\s+INTO)(?:\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)") AS table_name
FROM sample_data;
(The above is untested so please let me know in the comments if there are any issues.)
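As a quick local check, the non-capturing pattern can be run against the sample queries with Python's re module. This is only a sketch: BigQuery uses RE2, not Python's engine, so it shows what the pattern itself matches and not that REGEXP_EXTRACT behaves identically. The pattern is split over several string literals purely for readability.

```python
import re

# Same pattern as above, concatenated from adjacent raw-string pieces.
TABLE_NAME = re.compile(
    r"(?:(?:CREATE(?:(?:\s*\/\*.*\*\/\s*?)*\s+OR(?:\s*\/\*.*\*\/\s*?)*\s+REPLACE)?|DROP)"
    r"(?:\s*\/\*.*\*\/\s*?)*\s+TABLE"
    r"(?:(?:\s*\/\*.*\*\/\s*?)*\s+IF(?:\s*\/\*.*\*\/\s*?)*\s+EXISTS)?"
    r"|UPDATE|DELETE|INSERT(?:\s*\/\*.*\*\/\s*?)*\s+INTO)"
    r"(?:\s*\/\*.*\*\/\s*?)*\s+([^\s\/*]+)"
)

queries = [
    "CREATE TABLE tbl1 ...",
    "CREATE OR REPLACE TABLE tbl1 ...",
    "DROP TABLE IF EXISTS tbl1; CREATE TABLE tbl1 ...",
    "INSERT /*some comment*/ INTO tbl2 ...",
    "UPDATE tbl3 SET col1 = ...",
    "/*some garbage comments*/ UPDATE tbl3 SET col1 = ...",
    "DELETE tbl4 ...",
]

# re.search returns the first (leftmost) match, like a single extract call.
extracted = []
for query in queries:
    match = TABLE_NAME.search(query)
    extracted.append(match.group(1) if match else None)

for query, name in zip(queries, extracted):
    print(f"{name!r:10} <- {query}")
```

One thing this check shows: for the DROP TABLE IF EXISTS tbl1; CREATE ... sample, the first match is the DROP statement and the captured name is 'tbl1;' with the trailing semicolon, because ; is not excluded by [^\s\/*]. Adding ; to that character class would be one way to avoid it.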
I think it really depends on your data, but you might find some success using an approach like this:
with data as (
select 1 as id, 'CREATE TABLE tbl1 ...' as query union all
select 2 as id, 'INSERT INTO tbl2 ...' as query union all
select 3 as id, 'UPDATE tbl3 ...' as query union all
select 4 as id, 'DELETE tbl4 ...' as query
),
splitted as (
select id, split(query, ' ') as query_parts from data
)
select
id,
case
when query_parts[safe_offset(0)] in('CREATE', 'INSERT') then query_parts[safe_offset(2)]
when query_parts[safe_offset(0)] in('UPDATE', 'DELETE') then query_parts[safe_offset(1)]
else 'Error'
end as table_name
from splitted
Of course this depends on the cleanliness and syntax of the statements in your query column. Also, if your table_name is qualified as project.dataset.table you would need to do further splitting.
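To make the "further splitting" concrete, here is the same token-position idea sketched in Python, with one extra step that keeps only the last dot-separated part of a qualified name. The function name table_from_query and the qualified name in the usage line are hypothetical, and the sketch shares the limitation of the SQL version: it assumes single spaces and no comments between keywords.

```python
def table_from_query(query):
    """Pick the table token by position, then drop any project/dataset prefix."""
    parts = query.split(" ")
    keyword = parts[0]
    if keyword in ("CREATE", "INSERT"):
        index = 2          # CREATE TABLE <name>, INSERT INTO <name>
    elif keyword in ("UPDATE", "DELETE"):
        index = 1          # UPDATE <name>, DELETE <name>
    else:
        return "Error"
    if index >= len(parts):
        return None        # mirrors safe_offset returning NULL
    # Strip optional backticks, then keep the part after the last dot.
    return parts[index].strip("`").split(".")[-1]


print(table_from_query("CREATE TABLE tbl1 ..."))
print(table_from_query("INSERT INTO my_project.my_dataset.tbl2 ..."))
```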

ORACLE SQL CSV Column comparison

I have a table with CSV values in a column. I want to use that column in a WHERE clause to check whether a subset of values is present in the CSV. For example, the table has values like
1| 'A,B,C,D,E'
Query:
select id from tab where csv_column contains 'A,C';
This query should return 1.
How to achieve this in SQL?
You can handle this using LIKE, making sure to search for the three types of pattern for each letter/substring which you intend to match:
SELECT id
FROM yourTable
WHERE (csv_column LIKE 'A,%' OR csv_column LIKE '%,A,%' OR csv_column LIKE '%,A')
AND
(csv_column LIKE 'C,%' OR csv_column LIKE '%,C,%' OR csv_column LIKE '%,C')
Note that a match for the substring A means that the column starts with 'A,', contains ',A,', or ends with ',A'.
We could also write a structurally similar query using INSTR() in place of LIKE, which might even give a performance boost over using wildcards.
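A sketch of what an INSTR() version could look like, run through Python's sqlite3 module as a stand-in for Oracle (both provide INSTR and || for concatenation). It uses one variation that is not in the answer above: the column is wrapped in commas first, so a single search for ',A,' covers the start, middle and end positions. The sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id INTEGER, csv_column TEXT);
    -- Hypothetical sample rows
    INSERT INTO yourTable VALUES
        (1, 'A,B,C,D,E'),
        (2, 'B,C,D'),
        (3, 'AA,C'),
        (4, 'C,A');
""")

# Wrapping the column in commas turns every value into ',value,' so one
# INSTR call per search term is enough.
rows = conn.execute("""
    SELECT id
    FROM yourTable
    WHERE INSTR(',' || csv_column || ',', ',A,') > 0
      AND INSTR(',' || csv_column || ',', ',C,') > 0
    ORDER BY id
""").fetchall()

instr_ids = [row[0] for row in rows]
print(instr_ids)
```

Row 3 is skipped because AA is not the value A, and row 4 is returned even though C comes before A.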
There's probably something funky you can do with regular expressions, but in simple terms, if A and C will always be in that order:
csv_column LIKE '%A%C%'
otherwise
(csv_column LIKE '%A%' AND csv_column LIKE '%C%' )
If you don't want to edit your search string, this could be a way:
select *
from yourTable
where csv like '%' || replace('A,C', ',', '%') || '%'
For example:
with yourTable(id, csv) as (
select 1, 'A,B,C,D,E' from dual union all
select 2, 'A,C,D,E' from dual union all
select 3, 'B,C,D,E' from dual
)
select *
from yourTable
where csv like '%' || replace('A,C', ',', '%') || '%'
gives:
ID CSV
---------- ---------
1 A,B,C,D,E
2 A,C,D,E
Note that this only works if the values in the search string appear in the same order as in the CSV column; for example:
with yourTable(id, csv) as (
select 1, 'C,A,B' from dual
)
select *
from yourTable
where csv like '%' || replace('A,C', ',', '%') || '%'
will give no results
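If the order of the values should not matter, one option is to split the search string and require each value separately. The sketch below does that from Python with sqlite3 as a stand-in database; the helper name ids_matching_all is hypothetical, and the comma-wrapping of the column is an added assumption so that whole values are compared.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE yourTable (id INTEGER, csv TEXT);
    INSERT INTO yourTable VALUES
        (1, 'A,B,C,D,E'),
        (2, 'A,C,D,E'),
        (3, 'B,C,D,E'),
        (4, 'C,A,B');
""")


def ids_matching_all(conn, search):
    """Return ids whose csv column contains every value in `search`."""
    tokens = [t for t in search.split(",") if t]
    if not tokens:
        return []
    # One LIKE condition per value, combined with AND; values are bound
    # as parameters rather than concatenated into the SQL text.
    where = " AND ".join("',' || csv || ',' LIKE ?" for _ in tokens)
    params = ["%," + t + ",%" for t in tokens]
    sql = "SELECT id FROM yourTable WHERE " + where + " ORDER BY id"
    return [row[0] for row in conn.execute(sql, params)]


print(ids_matching_all(conn, "A,C"))
```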
Why not store the values as separate columns, and then use simple predicate filtering?

SQL Search rows that contain strings from 2nd table list

I have a master table that contains a list of strings to search for. The check returns TRUE/FALSE depending on whether the string in the cell contains any text from the master lookup table. Currently I use Excel's
=SUMPRODUCT(--ISNUMBER(SEARCH(masterTable,[#searchString])))>0
Is there a way to do something like this in SQL? LEFT JOIN or OUTER APPLY would be simple solutions if the strings were equal, but this needs to be a contains match.
SELECT *
FROM t
WHERE col1 contains(lookupString,lookupColumn)
--that 2nd table could be maintained and referenced from multiple queries
hop
bell
PRS
2017
My desired results would be a column that shows TRUE/FALSE if the row contains any string from the lookup table
SEARCH_STRING Contained_in_lookup_column
hopping TRUE
root FALSE
Job2017 TRUE
PRS_tool TRUE
hand FALSE
Sorry, I don't have access to the DB right now to confirm the syntax, but it should be something like this:
SELECT t.name,
case when (select count(1) from data_table where data_col like '%' || t.name || '%') > 0 then 'TRUE' else 'FALSE' end
FROM t;
or
SELECT t.name,
case when exists(select null from data_table where data_col like '%' || t.name || '%') then 'TRUE' else 'FALSE' end
FROM t;
Sérgio
You can use a combination of % wildcards with LIKE and EXISTS.
Example (using Oracle syntax) - we have a v_data table containing the data and a v_queries table containing the query terms:
with v_data (pk, value) as (
select 1, 'The quick brown fox jumps over the lazy dog' from dual union all
select 2, 'Yabba dabba doo' from dual union all
select 3, 'forty-two' from dual
),
v_queries (text) as (
select 'quick' from dual union all
select 'forty' from dual
)
select * from v_data d
where exists (
select null
from v_queries q
where d.value like '%' || q.text || '%');
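Applied to the data from the question, the EXISTS approach can be tried out with Python's sqlite3 module as a stand-in database. The table and column names (t, master_lookup, search_string, lookup_string) are assumptions made for this sketch. Keep in mind that SQLite's LIKE ignores ASCII case by default, and that a lookup string containing % or _ would be treated as a wildcard.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (search_string TEXT);
    CREATE TABLE master_lookup (lookup_string TEXT);
    INSERT INTO t VALUES
        ('hopping'), ('root'), ('Job2017'), ('PRS_tool'), ('hand');
    INSERT INTO master_lookup VALUES
        ('hop'), ('bell'), ('PRS'), ('2017');
""")

# One output row per search string, flagged TRUE if any lookup string
# occurs somewhere inside it.
rows = conn.execute("""
    SELECT t.search_string,
           CASE WHEN EXISTS (
               SELECT 1
               FROM master_lookup m
               WHERE t.search_string LIKE '%' || m.lookup_string || '%'
           ) THEN 'TRUE' ELSE 'FALSE' END AS contained_in_lookup_column
    FROM t
    ORDER BY t.rowid
""").fetchall()

flags = dict(rows)
for search_string, flag in rows:
    print(search_string, flag)
```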

Like Operator for checking multiple words

I'm struggling to write a LIKE condition that works for the example below.
Words could be
MS004 -- GTER
MS006 -- ATLT
MS009 -- STRR
MS014 -- GTEE
MS015 -- ATLT
What would be the LIKE operator in SQL Server for pulling data that contains words like ms004 and ATLT, or any other combination like the above?
I tried using multiple like for example
where column like '%ms004 | atl%'
but it didn't work.
EDIT
Result should be combination of both words only.
Seems you are looking for this.
`where column like '%ms004%' or column like '%atl%'`
or this
`where column like '%ms004%atl%'`
;WITH LikeCond1 as (
SELECT 'MS004' as L1 UNION
SELECT 'MS006' UNION
SELECT 'MS009' UNION
SELECT 'MS014' UNION
SELECT 'MS015')
, LikeCond2 as (
SELECT 'GTER' as L2 UNION
SELECT 'ATLT' UNION
SELECT 'STRR' UNION
SELECT 'GTEE' UNION
SELECT 'ATLT'
)
SELECT TableName.*
FROM LikeCond1
CROSS JOIN LikeCond2
INNER JOIN TableName ON TableName.Column like '%' + LikeCond1.L1 + '%'
AND TableName.Column like '%' + LikeCond2.L2 + '%'
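The CROSS JOIN above accepts any first word together with any second word. If the pairs listed in the question are meant to be matched only as pairs (an assumption, since the question does not say so explicitly), a single lookup table of pairs can be joined instead. The sketch below runs that idea through Python's sqlite3 module as a stand-in; it uses || for concatenation where SQL Server would use +, and the pairs table, the col column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE pairs (code TEXT, word TEXT);
    INSERT INTO pairs VALUES
        ('MS004', 'GTER'), ('MS006', 'ATLT'), ('MS009', 'STRR'),
        ('MS014', 'GTEE'), ('MS015', 'ATLT');
    CREATE TABLE TableName (id INTEGER, col TEXT);
    -- Hypothetical sample rows
    INSERT INTO TableName VALUES
        (1, 'MS004 GTER'),
        (2, 'MS004 ATLT'),
        (3, 'ref MS015 / ATLT'),
        (4, 'MS009 only');
""")

# A row qualifies only if it contains both halves of the same pair.
rows = conn.execute("""
    SELECT DISTINCT tn.id
    FROM TableName tn
    JOIN pairs p
      ON tn.col LIKE '%' || p.code || '%'
     AND tn.col LIKE '%' || p.word || '%'
    ORDER BY tn.id
""").fetchall()

pair_ids = [row[0] for row in rows]
print(pair_ids)
```

Row 2 contains a valid code and a valid word, but they belong to different pairs, so it is not returned here, whereas the CROSS JOIN version would return it.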
Try like this
select .....from table where columnname like '%ms004%' or columnname like '%atl%'