How to check in SQL whether a multi-column set is in a table (without string concatenation)

Let's assume I have 3 columns in a table with values like this:
table_1:
A | B | C
-----------------------
'xx' | '' | 'y'
'x' | 'y' | 'x'
'x' | 'x' | 'y'
'x' | 'yy' | ''
'x' | '' | 'yy'
'x' | 'y' | 'y'
I have a result set (the result of an SQL SELECT statement), and I want to check whether its rows exist in the table above:
[
('x', 'x', 'y')
('x', 'y', 'y')
]
If I compared the results of simple string concatenation, this result set would match 5 of the 6 rows in the table above instead of the expected 2, e.g. if I simply compared against the results of this: SELECT concat(A, B, C) FROM table_1
I could solve this problem by comparing the results of a more complex string concatenation like this: SELECT concat('A=', A, '_', 'B=', B, '_', 'C=', C)
BUT:
- I don't want to use any hardcoded special separator in a string concatenation, like _ or =
  - because any character might be in the data, e.g. somewhere in column B there might be this value: xx_C=yy
  - it's not a clean solution
- I don't want to use string concatenation at all, because it's an ugly solution
  - it makes the "distance" between the attributes disappear
  - it's not general enough
  - maybe I have columns with different data types that I don't want to convert to a STRING-based column
Question:
Is it possible to solve this problem somehow without using string concatenation?
Is there a simple solution for this multi-column value checking problem?
I want to solve this in BigQuery, but I'm interested in a general solution for every relational database/data warehouse.
Thank you!
CREATE TABLE test.table_1 (
A STRING,
B STRING,
C STRING
) AS
SELECT * FROM (
SELECT 'xx', '', 'y'
UNION ALL
SELECT 'x', 'y', 'x'
UNION ALL
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'yy', ''
UNION ALL
SELECT 'x', '', 'yy'
UNION ALL
SELECT 'x', 'y', 'y'
);
SELECT A, B, C
FROM test.table_1
WHERE (A, B, C) IN ( -- I need this functionality
SELECT 'x', 'x', 'y'
UNION ALL
SELECT 'x', 'y', 'y'
);

Below is the most generic way I can think of (BigQuery Standard SQL):
#standardSQL
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.test.table1` AS (
SELECT 'xx' a, '' b, 'y' c UNION ALL
SELECT 'x', 'y', 'x' UNION ALL
SELECT 'x', 'x', 'y' UNION ALL
SELECT 'x', 'yy', '' UNION ALL
SELECT 'x', '', 'yy' UNION ALL
SELECT 'x', 'y', 'y'
), `project.test.table2` AS (
SELECT 'x' a, 'x' b, 'y' c UNION ALL
SELECT 'x', 'y', 'y'
)
SELECT *
FROM `project.test.table1` t
WHERE t IN (
SELECT t
FROM `project.test.table2` t
)
with output
Row a b c
1 x x y
2 x y y

Use join:
SELECT t1.*
FROM test.table_1 t1 JOIN
(SELECT 'x' as a, 'x' as b, 'y' as c
UNION ALL
SELECT 'x', 'y', 'y'
) t2
USING (a, b, c);
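For the "general solution" part of the question, some engines accept the row-value form `(A, B, C) IN (subquery)` directly. Below is a minimal sketch that runs it against SQLite through Python's standard `sqlite3` module; it assumes the bundled SQLite is 3.15 or newer (where row values were introduced), and the table name `table_2` holding the lookup set is a hypothetical stand-in for the asker's SELECT result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_1 (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO table_1 VALUES (?, ?, ?)",
    [("xx", "", "y"), ("x", "y", "x"), ("x", "x", "y"),
     ("x", "yy", ""), ("x", "", "yy"), ("x", "y", "y")],
)

# table_2 is a hypothetical name for the result set to look up.
conn.execute("CREATE TABLE table_2 (a TEXT, b TEXT, c TEXT)")
conn.executemany(
    "INSERT INTO table_2 VALUES (?, ?, ?)",
    [("x", "x", "y"), ("x", "y", "y")],
)

# The three columns are compared as one tuple, column by column,
# so no concatenation and no separator character is involved.
rows = conn.execute(
    """
    SELECT a, b, c
    FROM table_1
    WHERE (a, b, c) IN (SELECT a, b, c FROM table_2)
    ORDER BY a, b, c
    """
).fetchall()

print(rows)  # [('x', 'x', 'y'), ('x', 'y', 'y')]
```

Only the 2 expected rows come back, not the 5 that plain concatenation would match.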

Related

SQL - create new col based on value of other column group by third column

I have this table
Id | item   | type
------------------
A  | itemA1 | X
A  | itemA2 | X
B  | itemA1 | X
B  | itemA2 | X
B  | itemA3 | Y
I would like to create a new indicator that tells whether an Id contains only items of type X, only items of type Y, or both, like this:
Id | Indicator
--------------
A  | Only X
B  | Both
EDIT: It's possible to have more than 2 kinds of types.
Thanks in advance for your help
Consider below generic approach
select id,
if(count(distinct type)=1,'Only ','') || string_agg(distinct type) indicator
from your_table
group by id
This also covers the case where all items of an Id are of type 'Y':
with table_with_sample_data as (
select 'A' as Id ,'itemA1' as item, 'X' as type union all
select 'A', 'itemA2', 'X' union all
select 'B', 'itemA1', 'X' union all
select 'B', 'itemA2', 'X' union all
select 'B', 'itemA3', 'Y' union all
select 'C', 'itemA1', 'Y' union all
select 'C', 'itemA2', 'Y'
)
select id,
if(count(distinct type)=1,'Only '|| max(type), 'Both') indicator from table_with_sample_data
group by id
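The same grouping idea can be tried outside BigQuery. Here is a small sketch using SQLite via Python's `sqlite3` module, with `CASE` in place of BigQuery's `if()`; the extra `n_types` column is my own addition (not in the answers above), included because the question's EDIT says there may be more than two types, in which case 'Both' alone would be ambiguous.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, item TEXT, type TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("A", "itemA1", "X"), ("A", "itemA2", "X"),
     ("B", "itemA1", "X"), ("B", "itemA2", "X"), ("B", "itemA3", "Y"),
     ("C", "itemA1", "Y"), ("C", "itemA2", "Y")],
)

rows = conn.execute(
    """
    SELECT id,
           COUNT(DISTINCT type) AS n_types,  -- assumption: extra helper column
           CASE WHEN COUNT(DISTINCT type) = 1
                THEN 'Only ' || MAX(type)    -- the single type present
                ELSE 'Both'
           END AS indicator
    FROM your_table
    GROUP BY id
    ORDER BY id
    """
).fetchall()

for row in rows:
    print(row)
# ('A', 1, 'Only X')
# ('B', 2, 'Both')
# ('C', 1, 'Only Y')
```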

Showing NULL on purpose when a NULL joined value is present in SQL

I have a table with some input values and a table with lookup values like below:
select input.value, coalesce(mapping.value, input.value) result from (
select 'a' union all select 'c'
) input (value) left join (
select 'a', 'z' union all select 'b', 'y'
) mapping (lookupkey, value) on input.value = mapping.lookupkey
which gives:
value | result
--------------
a | z
c | c
i.e. I want to show the original values as well as the mapped value but if there is none then show the original value as the result.
The above works well so far with coalesce to determine if there is a mapped value or not. But now if I allow NULL as a valid mapped value, I want to see NULL as the result and not the original value, since it does find the mapped value, only that the mapped value is NULL. The same code above failed to achieve this:
select input.value, coalesce(mapping.value, input.value) result from (
select 'a' union all select 'c'
) input (value) left join (
select 'a', 'z' union all select 'b', 'y' union all select 'c', null
) mapping (lookupkey, value) on input.value = mapping.lookupkey
which gives the same output as above, but what I want is:
value | result
--------------
a | z
c | NULL
Is there an alternative to coalesce that can achieve what I want?
I think you just want a case expression e.g.
select input.[value]
, coalesce(mapping.[value], input.[value]) result
, case when mapping.lookupkey is not null then mapping.[value] else input.[value] end new_result
from (
select 'a'
union all
select 'c'
) input ([value])
left join (
select 'a', 'z'
union all
select 'b', 'y'
union all
select 'c', null
) mapping (lookupkey, [value]) on input.[value] = mapping.lookupkey
Returns:
value result new_result
a z z
c c NULL
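To see why testing the join key works where `coalesce` does not, here is a runnable sketch of the same `CASE` expression in SQLite via Python's `sqlite3` module. The CTE syntax replaces the SQL Server derived-table aliases, and the extra input `'d'` is a hypothetical value I added to show the "no mapping found" branch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

rows = conn.execute(
    """
    WITH input(value) AS (
        VALUES ('a'), ('c'), ('d')          -- 'd' is a hypothetical extra input
    ),
    mapping(lookupkey, value) AS (
        VALUES ('a', 'z'), ('b', 'y'), ('c', NULL)
    )
    SELECT input.value,
           -- A matched row always has a non-NULL lookupkey, even when the
           -- mapped value itself is NULL, so test the key, not the value.
           CASE WHEN mapping.lookupkey IS NOT NULL
                THEN mapping.value
                ELSE input.value
           END AS result
    FROM input
    LEFT JOIN mapping ON input.value = mapping.lookupkey
    ORDER BY input.value
    """
).fetchall()

print(rows)  # [('a', 'z'), ('c', None), ('d', 'd')]
```

`'c'` maps to NULL because a mapping row exists, while `'d'` falls back to itself because no mapping row was joined.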

Comparing 2 lists in Oracle

I have 2 lists which I need to compare. I need to find if at least one element from List A is found in List B. I know IN doesn't work with 2 lists. What are my other options?
Basically something like this :
SELECT
CASE WHEN ('A','B','C') IN ('A','Z','H') THEN 1 ELSE 0 END "FOUND"
FROM DUAL
Would appreciate any help!
You are probably looking for something like this. The WITH clause is there just to simulate your "lists" (whatever you mean by that); they are not really part of the solution. The query you need is just the last three lines (plus the semicolon at the end).
with
first_list (str) as (
select 'A' from dual union all
select 'B' from dual union all
select 'C' from dual
),
second_list(str) as (
select 'A' from dual union all
select 'Z' from dual union all
select 'H' from dual
)
select case when exists (select * from first_list f join second_list s
on f.str = s.str) then 1 else 0 end as found
from dual
;
FOUND
----------
1
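The `EXISTS` plus join pattern is not Oracle-specific. As a rough illustration, the sketch below runs the same check in SQLite through Python's `sqlite3` module; the helper name `lists_overlap` is hypothetical, and the two tables simply stand in for the "lists".

```python
import sqlite3


def lists_overlap(first, second):
    """Return True if at least one element of `first` also occurs in `second`."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE first_list (str TEXT)")
    conn.execute("CREATE TABLE second_list (str TEXT)")
    conn.executemany("INSERT INTO first_list VALUES (?)", [(s,) for s in first])
    conn.executemany("INSERT INTO second_list VALUES (?)", [(s,) for s in second])
    # EXISTS yields 1 as soon as the join produces any row, otherwise 0.
    (found,) = conn.execute(
        """
        SELECT EXISTS (
            SELECT 1
            FROM first_list f
            JOIN second_list s ON f.str = s.str
        )
        """
    ).fetchone()
    conn.close()
    return bool(found)


print(lists_overlap(["A", "B", "C"], ["A", "Z", "H"]))  # True
print(lists_overlap(["B", "C"], ["A", "Z", "H"]))       # False
```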
In Oracle you can do:
select
count(*) as total_matches
from table(sys.ODCIVarchar2List('A', 'B', 'C')) x,
table(sys.ODCIVarchar2List('A', 'Z', 'H')) y
where x.column_value = y.column_value;
You need to repeat the conditions:
SELECT (CASE WHEN 'A' IN ('A', 'Z', 'H') OR
'B' IN ('A', 'Z', 'H') OR
'C' IN ('A', 'Z', 'H')
THEN 1 ELSE 0
END) as "FOUND"
FROM DUAL
If you are working with a collection of strings, you can try multiset operators.
create type coll_of_varchar2 is table of varchar2(4000);
and:
-- check if any common element exists
select * from dual where cardinality (coll_of_varchar2('A','B','C') multiset intersect coll_of_varchar2('A','Z','H')) > 0;
-- list of matching elements
select * from table(coll_of_varchar2('A','B','C') multiset intersect coll_of_varchar2('A','Z','H'));
Additionally:
-- union of elements
select * from table(coll_of_varchar2('A','B','C') multiset union distinct coll_of_varchar2('A','Z','H'));
select * from table(coll_of_varchar2('A','B','C') multiset union all coll_of_varchar2('A','Z','H'));
-- elements from col1 not in col2
select * from table(coll_of_varchar2('A','A','B','C') multiset except all coll_of_varchar2('A','Z','H'));
select * from table(coll_of_varchar2('A','A','B','C') multiset except distinct coll_of_varchar2('A','Z','H'));
-- check if col1 is a subset of col2
select * from dual where coll_of_varchar2('B','A') submultiset coll_of_varchar2('A','Z','H','B');
I am trying to do something very similar but the first list is another field on the same query created with listagg and containing integer numbers like:
LISTAGG(my_first_list,', ') WITHIN GROUP(
ORDER BY
my_id
) my_first_list
and return this with all the other fields that I am already returning
SELECT
CASE WHEN my_first_list IN ('1,2,3') THEN 1 ELSE 0 END "FOUND"
FROM DUAL

Any CONCAT() variation that tolerates NULL values?

CONCAT() returns NULL when any value is NULL. I have to use IFNULL() to
wrap all fields passed to CONCAT(). Is there a CONCAT() variation that just
ignores NULL?
For example:
#standardSQL
WITH data AS (
SELECT 'a' a, 'b' b, CAST(null AS STRING) nu
)
SELECT CONCAT(a, b, nu) concatenated, ARRAY_TO_STRING([a,b,nu], ',') w_array_to_string
FROM `data`
--->
null
A quick jam session on the interesting theme in this question.
There is a potentially unlimited number of real-life use cases here.
Below are a few variations:
#standardSQL
WITH data AS (
SELECT 'a' a, 'b' b, 'c' c UNION ALL
SELECT 'y', 'x', NULL UNION ALL
SELECT 'v', NULL, 'w'
)
SELECT
*,
CONCAT(a, b, c) by_concat,
ARRAY_TO_STRING([a,b,c], '') by_array_to_string,
ARRAY_TO_STRING([a,b,c], '', '?') with_null_placeholder,
ARRAY_TO_STRING(
(SELECT ARRAY_AGG(col ORDER BY col DESC)
FROM UNNEST([a,b,c]) AS col
WHERE NOT col IS NULL)
, '') with_order
FROM `data`
The output is:
a  b     c     by_concat  by_array_to_string  with_null_placeholder  with_order
-  ----  ----  ---------  ------------------  ---------------------  ----------
y  x     null  null       yx                  yx?                    yx
a  b     c     abc        abc                 abc                    cba
v  null  w     null       vw                  v?w                    wv
Use ARRAY_TO_STRING([col1, col2, ...], '') instead:
#standardSQL
WITH data AS (
SELECT 'a' a, 'b' b, CAST(null AS STRING) nu
)
SELECT ARRAY_TO_STRING([a,b,nu], '') w_array_to_string
FROM `data`
--->
ab
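To make the NULL handling explicit, here is a plain Python sketch that mimics the behaviour shown in the outputs above: NULLs are skipped unless a placeholder is given. This is not BigQuery code; the function `array_to_string` and its `null_text` parameter are hypothetical names modelled on the three-argument call used in the earlier answer.

```python
def array_to_string(values, delimiter, null_text=None):
    """Join `values` with `delimiter`, treating None like SQL NULL.

    None entries are dropped, or replaced by `null_text` when it is given.
    """
    parts = []
    for value in values:
        if value is None:
            if null_text is None:
                continue  # skip NULLs entirely
            value = null_text  # substitute the placeholder
        parts.append(value)
    return delimiter.join(parts)


rows = [("a", "b", "c"), ("y", "x", None), ("v", None, "w")]
for row in rows:
    print(array_to_string(row, ""), array_to_string(row, "", "?"))
# abc abc
# yx yx?
# vw v?w
```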

Find duplicate records in database against unique attributes

I want to find duplicate IRN #s entered into a table in the database. Here are the (logically) unique attributes of an IRN:
ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo
An IRN can have multiple WeldNos, meaning the above attributes may repeat for one IRN # (though of course at least one of the 5 attribute values must differ).
Now I am trying to find out whether any duplicate IRNs have been entered into the system. How can I find that through a SQL query?
P.S.: Due to the bad design of the database, there is no primary key in the table.
Here is what I have tried so far but this does not give the correct results.
select * from WeldInfo a, WeldInfo b
where a.ProjectNo = b.ProjectNo and
a.DrawingNo = b.DrawingNo and
a.DrawingRev = b.DrawingRev and
a.SpoolNo = b.SpoolNo and
a.WeldNo = b.WeldNo and
a.IrnNo <> b.IrnNo;
I'm not sure I have understood your question, but try this:
select * from (
select count(*) over ( partition by ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo) rr,t.* from WeldInfo t)
where rr > 1;
Explanation.
with tab as (
select 1 as id, 'a' as a , 'b' as b , 'c' as c from dual
union all
select 2 , 'a', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'b', 'c' from dual
union all
select 3 , 'x', 'd', 'c' from dual
)
select t.*
, count(*) over (partition by a,b,c) cnt1
, count(distinct id) over (partition by a,b,c) cnt2
from tab t;
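If the goal is specifically "the same five attributes entered under more than one IRN", a `GROUP BY ... HAVING COUNT(DISTINCT ...)` query is one alternative to the window functions above. The sketch below swaps in that technique and runs it on SQLite through Python's `sqlite3` module (SQLite does not accept DISTINCT inside window aggregates). The table and column names come from the question; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE WeldInfo (
        IrnNo TEXT, ProjectNo TEXT, DrawingNo TEXT,
        DrawingRev TEXT, SpoolNo TEXT, WeldNo TEXT
    )
    """
)
# Hypothetical sample data: IRN-1 and IRN-2 both claim weld W1 of the same spool.
conn.executemany(
    "INSERT INTO WeldInfo VALUES (?, ?, ?, ?, ?, ?)",
    [("IRN-1", "P1", "D1", "R0", "S1", "W1"),
     ("IRN-1", "P1", "D1", "R0", "S1", "W2"),
     ("IRN-2", "P1", "D1", "R0", "S1", "W1"),
     ("IRN-3", "P1", "D2", "R0", "S1", "W1")],
)

# Group on the logically unique attributes and keep only the groups
# that appear under more than one distinct IRN.
dupes = conn.execute(
    """
    SELECT ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo,
           COUNT(DISTINCT IrnNo) AS irn_count
    FROM WeldInfo
    GROUP BY ProjectNo, DrawingNo, DrawingRev, SpoolNo, WeldNo
    HAVING COUNT(DISTINCT IrnNo) > 1
    """
).fetchall()

print(dupes)  # [('P1', 'D1', 'R0', 'S1', 'W1', 2)]
```

Unlike `count(*)`, counting distinct `IrnNo` values ignores rows that were entered twice under the same IRN.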