I know this question has been asked, but I have a slightly different flavour of it. I have a use case where the only thing I have control over is the WHERE clause of the query, and I have 2 tables.
Using a simple example:
Table1 contains 1 column named "FULLNAME" with hundreds of values
Table2 contains 1 column named "PATTERN" with some matching text
So what I need to do is select all values from Table1 that match the values in Table2.
Here's a simple example:
Table1 (FULLNAME)
ANTARCTICA
ANGOLA
AUSTRALIA
AFRICA
INDIA
INDONESIA
Table2 (PATTERN)
AN
IN
Effectively what I need is the entries in Table1 which contain the values from Table2 (result would be ANTARCTICA, ANGOLA, INDIA, INDONESIA)
In other words, what I need is something like:
Select * from Table1 where FULLNAME IN LIKE (Select '%' || Pattern || '%' from Table2)
The tricky thing here is that I only have control over the WHERE clause. I can't control the SELECT clause at all or add joins, since I'm using a product which only allows control over the WHERE clause. I can't use stored procedures either.
Is this possible?
I'm using Oracle as the backend DB
Thanks
One possible approach is to use EXISTS in combination with LIKE in the subquery:
select * from table1 t1
where exists (select null
from table2 t2
where t1.fullname like '%' || t2.pattern || '%');
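To check the EXISTS idea against the sample data from the question, here is a minimal sketch in Python. It uses an in-memory SQLite database as a stand-in for Oracle (an assumption, chosen only so the example runs anywhere), and it refers to the outer table by name rather than by alias, on the assumption that the product does not let you add an alias to the FROM clause.

```python
import sqlite3

# In-memory SQLite database standing in for Oracle (assumption for this sketch).
# Note: SQLite's LIKE is case-insensitive for ASCII, Oracle's is case-sensitive;
# the sample data is all upper case, so the result is the same here.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (fullname TEXT)")
conn.execute("CREATE TABLE table2 (pattern TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("ANTARCTICA",), ("ANGOLA",), ("AUSTRALIA",),
     ("AFRICA",), ("INDIA",), ("INDONESIA",)],
)
conn.executemany("INSERT INTO table2 VALUES (?)", [("AN",), ("IN",)])

# Everything after WHERE is the only part being "controlled".
# The correlated reference uses the table name, so no alias is needed in FROM.
query = """
    SELECT fullname FROM table1
    WHERE EXISTS (SELECT NULL
                  FROM table2 t2
                  WHERE table1.fullname LIKE '%' || t2.pattern || '%')
    ORDER BY fullname
"""
matches = [row[0] for row in conn.execute(query)]
print(matches)
```

Because EXISTS stops at the first matching pattern, each name appears at most once even if it matches several patterns.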
I believe that you can do this with a simple JOIN:
SELECT DISTINCT
fullname
FROM
Table1 T1
INNER JOIN Table2 T2 ON T1.fullname LIKE '%' || T2.pattern || '%'
The DISTINCT is there for those cases where you might have a match to multiple rows in Table2.
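A small sketch of why the DISTINCT matters, again using SQLite from Python as a stand-in for Oracle. The row FINLAND is hypothetical (it is not in the question's data); it was picked because it contains both "IN" and "AN".

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (fullname TEXT)")
conn.execute("CREATE TABLE table2 (pattern TEXT)")
# FINLAND is a hypothetical extra row that matches both patterns.
conn.executemany(
    "INSERT INTO table1 VALUES (?)", [("INDIA",), ("FINLAND",), ("AFRICA",)]
)
conn.executemany("INSERT INTO table2 VALUES (?)", [("AN",), ("IN",)])

join_sql = """
    SELECT {distinct} t1.fullname
    FROM table1 t1
    INNER JOIN table2 t2 ON t1.fullname LIKE '%' || t2.pattern || '%'
    ORDER BY t1.fullname
"""
# Without DISTINCT the join emits one row per (name, matching pattern) pair.
without_distinct = [r[0] for r in conn.execute(join_sql.format(distinct=""))]
with_distinct = [r[0] for r in conn.execute(join_sql.format(distinct="DISTINCT"))]
print(without_distinct)
print(with_distinct)
```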
If the patterns are always two characters and only have to match the start of the full name, like the examples you showed, you could do:
Select * from Table1 where substr(FULLNAME, 1, 2) IN (Select Pattern from Table2)
That prevents any ordinary index on FULLNAME from being used, though, and your real case may need to be more flexible.
Or probably even less efficiently, similar to TomH's approach, but with the join inside a subquery:
Select * from Table1 where FULLNAME IN (
Select t1.FULLNAME from Table1 t1
Join Table2 t2 on t1.FULLNAME like '%'||t2.Pattern||'%')
Right, this involved a bit of trickery. Conceptually, what I've done is turn the PATTERN column into a single cell and use that with REGEXP_LIKE.
So the values "AN" and "IN" become one single value '(AN|IN)', and I just feed this to regexp_like:
SELECT FULLNAME from table1 where
regexp_like(FULLNAME,(SELECT '(' || SUBSTR (SYS_CONNECT_BY_PATH (PATTERN , '|'), 2) || ')' Table2
FROM (SELECT PATTERN , ROW_NUMBER () OVER (ORDER BY PATTERN) rn,
COUNT (*) OVER () cnt
FROM Table2)
WHERE rn = cnt START WITH rn = 1 CONNECT BY rn = PRIOR rn + 1))
The subquery in the regexp_like turns the column into a single cell containing the regular expression string.
I do realise this is probably a performance killer though, but thankfully I'm not that fussed about performance at this point
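The same "collapse the column into one alternation" idea can be sketched outside Oracle. The version below is only an illustration under stated assumptions: it uses SQLite from Python, registers a REGEXP function by hand (SQLite ships without one), and uses group_concat in place of the SYS_CONNECT_BY_PATH trick.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite has no built-in REGEXP; "value REGEXP pattern" calls regexp(pattern, value).
conn.create_function(
    "regexp", 2, lambda pattern, value: 1 if re.search(pattern, value) else 0
)
conn.execute("CREATE TABLE table1 (fullname TEXT)")
conn.execute("CREATE TABLE table2 (pattern TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?)",
    [("ANTARCTICA",), ("ANGOLA",), ("AUSTRALIA",),
     ("AFRICA",), ("INDIA",), ("INDONESIA",)],
)
conn.executemany("INSERT INTO table2 VALUES (?)", [("AN",), ("IN",)])

# Collapse the PATTERN column into a single '(A|B|...)' string.
alternation_sql = "SELECT '(' || group_concat(pattern, '|') || ')' FROM table2"
alternation = conn.execute(alternation_sql).fetchone()[0]

# Use that single cell as the regular expression in the WHERE clause.
query = f"""
    SELECT fullname FROM table1
    WHERE fullname REGEXP ({alternation_sql})
    ORDER BY fullname
"""
matches = [row[0] for row in conn.execute(query)]
print(alternation, matches)
```

One caveat with this approach in any database: if a pattern contains regex metacharacters such as `.` or `(`, it would need escaping before being joined into the alternation.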
I am new to SQL and realize this may not be rocket science, but I have two tables: the first has a column title_name from which I want to extract the theme (I cannot use regex as the position varies), and the second is a lookup table with the correct correspondence.
Table_1
title_name (example: uk-book-heroicfantasy-language)
Table_2
theme_old (example: heroicfantasy)
theme_new (example: fantasy)
This is what I have come up with so far, but this query only keeps the rows with a match.
How can I say I want a '(not set)' value for theme_new when there is no match?
select
theme_new,
from Table_1, Table_2
where title_name like concat('%',theme_old ,'%')
I would very much appreciate any help, none of the approaches I have tried have worked so far.
I think you want to use a CASE.
Something like this:
select
CASE
WHEN title_name like concat('%',theme_old ,'%') THEN theme_new
ELSE '(not set)'
END
from Table_1, Table_2
You want a left join:
select theme_new
from Table_1 t1 left join
Table_2 t2
on t1.title_name like concat('%', t2.theme_old , '%')
That said, I'm not sure if BigQuery supports this syntax.
You may need to use:
select t1.title_name, array_agg(theme_new ignore nulls)
from (select t1.title_name, theme_new
from Table_1 t1 cross join
Table_2 t2
where t1.title_name like concat('%', t2.theme_old , '%')
union all
select t1.title_name, null
from table_1 t1
) t1
group by t1.title_name;
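For the '(not set)' part of the question, one option is to wrap the looked-up column in COALESCE on top of a left join. The sketch below is an assumption-laden illustration, not BigQuery code: it runs on SQLite from Python, uses `||` instead of concat(), and the second title is a hypothetical row with no matching theme.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (title_name TEXT)")
conn.execute("CREATE TABLE table_2 (theme_old TEXT, theme_new TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?)",
    [("uk-book-heroicfantasy-language",),
     ("uk-book-cooking-language",)],  # hypothetical title with no lookup entry
)
conn.execute("INSERT INTO table_2 VALUES ('heroicfantasy', 'fantasy')")

# LEFT JOIN keeps unmatched titles; COALESCE replaces their NULL theme.
query = """
    SELECT t1.title_name, COALESCE(t2.theme_new, '(not set)') AS theme_new
    FROM table_1 t1
    LEFT JOIN table_2 t2 ON t1.title_name LIKE '%' || t2.theme_old || '%'
    ORDER BY t1.title_name
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```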
In BigQuery, I have table1 which has a (not nullable) column id which is always a 5-digit integer. I want to join it with table2 which also has a column id which is (nullable) strings of these same IDs.
The trouble is that id in table2 can also be a list of ' / ' separated IDs.
Here is an example of the column:
82795
82795
NULL
84660
84120 / 82795
73844 / 73845
73844 / 73845
NULL
83793 / 84758
73844 / 73845 / 84122 / 84136
73844 / 73845 / 84136
84845
How can I achieve something with similar logic to:
SELECT * FROM table1
LEFT JOIN table2
ON table1.id IN SPLIT(table2.id, ' / ')
I agree with what Tim says about normalising your table, but in the interim you should be able to use IN with UNNEST to search the results of SPLIT:
SELECT * FROM table1
LEFT JOIN table2
ON table1.id IN UNNEST(SPLIT(table2.id, ' / '))
You should consider normalizing your second table such that each id value appears on a separate record. As a workaround to your current situation, you may try the following:
SELECT *
FROM table1 t1
LEFT JOIN table2 t2
ON CONCAT(' ', t2.id, ' ') LIKE CONCAT('% ', CAST(t1.id AS STRING), ' %');
The above ON clause is a trick which searches for a table1 id somewhere in the table2 id. It works by padding the latter with spaces, such that we only need to search for a table1 id surrounded by spaces.
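Here is a small sketch of the padding trick, run on SQLite from Python rather than BigQuery (an assumption made so it is runnable anywhere; CAST ... AS TEXT stands in for CAST ... AS STRING). The id 8279 is hypothetical and only there to show that a prefix of a real ID does not match once both sides are padded with spaces.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER NOT NULL)")
conn.execute("CREATE TABLE table2 (id TEXT)")
# 8279 is a hypothetical value: a prefix of 82795 that must NOT match.
conn.executemany("INSERT INTO table1 VALUES (?)", [(82795,), (8279,), (84122,)])
conn.executemany(
    "INSERT INTO table2 VALUES (?)",
    [("84120 / 82795",), ("73844 / 73845 / 84122 / 84136",), (None,)],
)

# Pad the list with spaces so every ID in it is surrounded by spaces,
# then look for the table1 id with a space on each side.
query = """
    SELECT t1.id, t2.id
    FROM table1 t1
    LEFT JOIN table2 t2
      ON (' ' || t2.id || ' ') LIKE ('% ' || CAST(t1.id AS TEXT) || ' %')
    ORDER BY t1.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The NULL row in table2 never matches, because concatenating with NULL yields NULL and a NULL comparison is not true.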
I agree with Nick, but I think this version works better in BigQuery:
with table1 as (
select 82795 as id union all
SELECT 1234 UNION ALL
SELECT 84122
),
table2 as (
SELECT '84120 / 82795' as id UNION ALL
SELECT '73844 / 73845 / 84122 / 84136'
)
SELECT t1, t2
FROM table1 t1 LEFT JOIN
(table2 t2 CROSS JOIN
UNNEST(SPLIT(t2.id, ' / ')) t2id
)
ON t1.id = safe_cast(t2id as int64);
(The above works.)
Notes:
BigQuery complains when I don't use = for LEFT JOIN.
I assume that table1.id is a number. Strings need to be converted for the comparison.
Duplicate column names are not allowed in a SELECT, so SELECT * doesn't work. The easy workaround is to select the values as records rather than as columns.
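To make the shape of that query easier to follow, here is the same logic written as plain Python: split each list into one row per element, cast safely to an integer, then left join on equality. This only models what the query does; `safe_int` is a hypothetical helper imitating SAFE_CAST, and the NULL entry is added to show that such rows simply contribute nothing.

```python
table1 = [82795, 1234, 84122]
table2 = ["84120 / 82795", "73844 / 73845 / 84122 / 84136", None]


def safe_int(text):
    """Imitate SAFE_CAST(... AS INT64): return None instead of raising."""
    try:
        return int(text)
    except (TypeError, ValueError):
        return None


# Counterpart of "table2 CROSS JOIN UNNEST(SPLIT(id, ' / '))":
# one (original string, single id) pair per list element; NULLs yield no rows.
exploded = [
    (raw, safe_int(part))
    for raw in table2
    if raw is not None
    for part in raw.split(" / ")
]

# LEFT JOIN on equality: every table1 id is kept, with None when nothing matches.
joined = []
for t1_id in table1:
    hits = [raw for raw, t2_id in exploded if t2_id == t1_id]
    joined.extend((t1_id, raw) for raw in (hits or [None]))

for row in joined:
    print(row)
```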
I have two tables in a database.
Both tables have a business name column, but the values are not always going to be the same.
For example, Tbl1 has a business name of 'Aone Dental Practices Limited TA Jaws Dental' and Tbl2 has a business name of 'Jaws Dental'. I want to be able to join these together, as 'Jaws Dental' is visible in both.
I can't seem to get the LIKE clause working for this.
I tried:
Tbl1_BusinesName Like '%' + Tbl2_BusinesName + '%'
This query should work:
SELECT *
FROM Table1 T1
LEFT JOIN Table2 T2 ON T1.BusinesName LIKE '%'+T2.BusinesName+'%'
Using EXISTS you can get the expected result:
SELECT *
FROM dbo.TableName1 AS Tbl1
WHERE EXISTS (SELECT 1
FROM dbo.TableName2 AS Tbl2
WHERE Tbl1.BusinesName LIKE '%' + Tbl2.BusinesName + '%');
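The LEFT JOIN and EXISTS versions do not return the same rows, which may matter when choosing between them. A quick sketch follows, using SQLite from Python as a stand-in for SQL Server (so `||` replaces `+` for concatenation); the table names tbl1 and tbl2 and the 'Unrelated Bakery Ltd' row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl1 (BusinesName TEXT)")
conn.execute("CREATE TABLE tbl2 (BusinesName TEXT)")
conn.executemany(
    "INSERT INTO tbl1 VALUES (?)",
    [("Aone Dental Practices Limited TA Jaws Dental",),
     ("Unrelated Bakery Ltd",)],  # hypothetical row with no counterpart
)
conn.execute("INSERT INTO tbl2 VALUES ('Jaws Dental')")

# LEFT JOIN: every tbl1 row is returned, unmatched ones with NULL from tbl2.
left_join_rows = conn.execute("""
    SELECT t1.BusinesName, t2.BusinesName
    FROM tbl1 t1
    LEFT JOIN tbl2 t2 ON t1.BusinesName LIKE '%' || t2.BusinesName || '%'
    ORDER BY t1.BusinesName
""").fetchall()

# EXISTS: only tbl1 rows with at least one match, and no tbl2 columns.
exists_rows = conn.execute("""
    SELECT t1.BusinesName
    FROM tbl1 t1
    WHERE EXISTS (SELECT 1 FROM tbl2 t2
                  WHERE t1.BusinesName LIKE '%' || t2.BusinesName || '%')
""").fetchall()

print(left_join_rows)
print(exists_rows)
```

Note the direction of the comparison in both: the longer name goes on the left of LIKE and the shorter name is wrapped in wildcards.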
I am trying to join two tables in BigQuery.
Table1 contains an ID column, and Table2 contains a column which has the same ID or multiple ID's in the form of a long string separated by commas, like "id123,id456,id678"
I can join the tables together if Table1.ID = Table2.ID but this ignores all the rows where Table1.ID is one of the multiple IDs in Table2.ID.
I have looked at similar posts that tell me to use wildcards like
on concat('%',Table1.ID,'%') = Table2.ID
but this does not work, because it seems to create a string that contains the '%' character and doesn't actually use it as a wildcard.
I'm using standard sql in BigQuery, any help would be appreciated
Below example is for BigQuery Standard SQL
#standardSQL
WITH `project.dataset.table1` AS (
SELECT 123 id, 'a' test UNION ALL
SELECT 456, 'b' UNION ALL
SELECT 678, 'c'
), `project.dataset.table2` AS (
SELECT 'id123,id456' id UNION ALL
SELECT 'id678'
)
SELECT t2.id, test
FROM `project.dataset.table2` t2, UNNEST(SPLIT(id)) id2
JOIN `project.dataset.table1` t1
ON CONCAT('id', CAST(t1.id AS STRING)) = id2
result is as below
Row  id           test
1    id123,id456  a
2    id123,id456  b
3    id678        c
It is doubtful that you have values in the table that start and end with percentage signs. = does not recognize wildcards; like does:
on Table2.ID like concat('%', Table1.ID, '%')
A warning, though: such a construct is usually a performance killer. You would be better off trying to have columns in Table1 and Table2 that match exactly.
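The difference between = and LIKE is easy to see with literal values. The sketch below evaluates both on SQLite from Python (a stand-in for BigQuery, with `||` in place of concat()). The last check is an extra caveat worth knowing: a plain substring match also accepts "id12", a hypothetical ID that is not actually in the list.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def scalar(sql, params):
    """Run a one-row, one-column query and return its single value."""
    return conn.execute(sql, params).fetchone()[0]


list_value = "id123,id456,id678"

# '=' compares the literal string '%id123%' with the list: no wildcard handling.
equals_result = scalar("SELECT ('%' || ? || '%') = ?", ("id123", list_value))

# LIKE treats '%' as a wildcard, so the substring is found.
like_result = scalar("SELECT ? LIKE ('%' || ? || '%')", (list_value, "id123"))

# Caveat: a substring match also accepts a prefix of a real ID.
false_positive = scalar("SELECT ? LIKE ('%' || ? || '%')", (list_value, "id12"))

print(equals_result, like_result, false_positive)
```

Splitting the list into separate values and comparing with equality avoids that kind of false positive.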
I'm working in MS SQL 2005 and want to use a select statement within a wildcard where clause, like so:
SELECT text
FROM table_1
WHERE ID LIKE '%SELECT ID FROM table_2%'
I'm looking for product ids within a large body of text that is held in a DB. The SELECT statement in the wildcard clause will return 50+ rows. The statement above is obviously not the way to go. Any suggestions?
You can do a join and construct the like string based on table_2.
SELECT * FROM table_1 t1
INNER JOIN table_2 t2 ON t1.ID LIKE '%' + CONVERT(VARCHAR, t2.ID) + '%'