Match delimited values from two columns - sql

col1  col2          col3
21    12;13;45;67;  65;43;66;34;33
23    45;67;23;13;  45;78;89;09;32
I have something like the above, which I got from many joins. One of my WHERE conditions checks whether col1 is among the values of col2, which I did with:
... col1 IN (SELECT value FROM STRING_SPLIT(col2, ';'))
I also have an OR condition to check against col3. How can I do that in SQL?
The goal is that a value of col2 should match either col1 or a value in col3.

A query like this works:
SELECT value
FROM yourdata cross apply string_split(col2, ';')
where value in (col1)
and value in (SELECT value
FROM yourdata cross apply string_split(col3, ';'))
Data:
Col1  Col2          Col3
-----------------------------------------
21    12;13;45;67;  65;43;66;34;23   -- notice 23 in col3
23    45;67;23;13;  45;78;89;09;32
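To see what the query above actually computes, here is a small Python sketch of the same logic (plain Python, not SQL Server; the split helper mimics STRING_SPLIT, and the data is the answer's sample data):

```python
# Sketch of the query's logic: split col2 on ';', keep values that equal
# the row's col1 AND appear among the values of any row's col3.
rows = [
    ("21", "12;13;45;67;", "65;43;66;34;23"),
    ("23", "45;67;23;13;", "45;78;89;09;32"),
]

def split_vals(s):
    # string_split-style: drop the empty piece left by the trailing ';'
    return [p for p in s.split(";") if p]

all_col3 = {v for _, _, c3 in rows for v in split_vals(c3)}
matches = [v for c1, c2, _ in rows for v in split_vals(c2)
           if v == c1 and v in all_col3]
print(matches)   # ['23']
```

Only 23 survives: it equals col1 of the second row and also appears in the first row's col3.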

lateral view explode in bigquery

I want to do something like this using BigQuery.
Input Table
Col1  Col2   Col3  Col4
1     A,B,C  123   789
Output Table
ID  COL   VALUE
1   COL1  1
1   COL2  A,B,C
1   COL3  123
1   COL4  789
I got this in Hive with LATERAL VIEW explode(MAP), but I can't get the same result in BigQuery.
Consider below approach
select id, col, value
from (select *, row_number() over() as id from your_table)
unpivot (value for col in (Col1, Col2, Col3, Col4))
If we apply it to the sample data in your question:
with your_table as (
select '1' Col1, 'A,B,C' Col2, '123' Col3, '789' Col4
)
the output is:
id  col   value
1   Col1  1
1   Col2  A,B,C
1   Col3  123
1   Col4  789
Note: this particular approach requires all columns (Col1 - Col4) to be of the same type. If that is not the case, you will first need to apply a cast to some of them to make them strings.
If it's a discrete number of columns, you can use UNIONs for this...
select id, 'Col1' as Column, col1 as Value
from table
union all
select id, 'Col2' as Column, col2 as Value
from table
union all
select id, 'Col3' as Column, col3 as Value
from table
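The UNION ALL approach is portable across engines. As a sanity-check sketch, here it is run against an in-memory SQLite database from Python (table name your_table assumed, one row of the question's sample data):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (id, Col1, Col2, Col3, Col4)")
conn.execute("insert into your_table values (1, '1', 'A,B,C', '123', '789')")

# One SELECT per column, stitched together with UNION ALL
rows = conn.execute("""
    select id, 'Col1' as col, Col1 as value from your_table
    union all
    select id, 'Col2', Col2 from your_table
    union all
    select id, 'Col3', Col3 from your_table
    union all
    select id, 'Col4', Col4 from your_table
    order by col
""").fetchall()
print(rows)   # [(1, 'Col1', '1'), (1, 'Col2', 'A,B,C'), (1, 'Col3', '123'), (1, 'Col4', '789')]
```

Note that all four values had to be stored as strings, which mirrors the same-type requirement of UNPIVOT mentioned above.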

SQL union grouped by rows

Assume I have a table like this:
col1       col2  col3  col4
commonrow  one   two   null
commonrow  null  null  three
How do I produce a result that looks like this:
col1       col2  col3  col4
commonrow  one   two   three
Thanks
Like this: you can group by col1 and take the maximum in each group:
select col1 , max(col2) col2 , max(col3) col3 , max(col4) col4
from table
group by col1
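This works because aggregate MAX ignores NULLs, so each group keeps its only non-NULL value. A quick sketch against SQLite from Python (table name t assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (col1, col2, col3, col4)")
conn.executemany("insert into t values (?, ?, ?, ?)", [
    ("commonrow", "one", "two", None),
    ("commonrow", None, None, "three"),
])

# MAX skips NULLs, so the two partial rows collapse into one complete row
row = conn.execute("""
    select col1, max(col2), max(col3), max(col4)
    from t
    group by col1
""").fetchone()
print(row)   # ('commonrow', 'one', 'two', 'three')
```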

How to use a variable as a select statement and then sum multiple values in a variable?

I have multiple rows with the following columns:
col1, col2, col3, col4
I want to find the rows whose col3 equals the col2 of a row where col1 equals 'a111', sum col4 over those matching rows, and name the sum column "Total".
Example table with the four columns and four rows:
col1  col2  col3  col4
----  ----  ----  ----
      a222  a333  4444
a111  a333
      a555  a444  1111
a111  a444
I've tried the following but it does not work.
Declare
var1 = Select col2 from table1 where col1='a111';
var2 = Select col3 from table1 where col3=var1;
var3 = Select col4 from table1 where col3=var1;
Begin
If var2=var1
Then Select SUM(var3) As "Total";
End
Expected result is:
Total
5555
I do not have the strongest knowledge of programming in general or of Oracle. Please ask any questions and I will do my best to answer.
Your logic is convoluted and hard to follow without an example of the data you have and the data you want, but translating your pseudocode into SQL gives:
Declare
var1 = Select col2 from table1 where col1='[table2.col2 value]';
Called "find" in my query
var2 = Select col3 from table1 where col3=var1;
var3 = Select col4 from table1 where col3=var1;
Achieved by joining the table back to the "find"
Begin
If var2=var1
Then Select SUM(var3) As "Total";
End
Achieved with a sum of var3 on only rows where var1=var2, in "ifpart"
SELECT SUM(var3) FROM
(
SELECT alsot1.col3 as var2, alsot1.col4 as var3
FROM
table1 alsot1
INNER JOIN
(
SELECT t1.col2 as var1
FROM table1 t1 INNER JOIN table2 t2 ON t1.col1 = t2.col2
) find
ON find.var1 = alsot1.col3
) ifpart
WHERE
var1 = var2
This could be simplified, but I present it like this because it matches your understanding of the problem. The query optimizer will rewrite it anyway when it comes time to run it, so it only pays to start reworking it if performance is poor.
By the way, you said that the two tables join via a commonly named col2, but then in your pseudocode the tables join on col1 = col2. I followed your pseudocode.
This sounds like something that hierarchical queries could handle. E.g. something like:
WITH your_table AS (SELECT NULL col1, 'a222' col2, 'a333' col3, 4444 col4 FROM dual UNION ALL
SELECT 'a111' col1, 'a333' col2, NULL col3, NULL col4 FROM dual UNION ALL
SELECT NULL col1, 'a555' col2, 'a444' col3, 1111 col4 FROM dual UNION ALL
SELECT 'a111' col1, 'a444' col2, NULL col3, NULL col4 FROM dual UNION ALL
SELECT 'a666' col1, 'a888' col2, NULL col3, NULL col4 FROM dual UNION ALL
SELECT NULL col1, 'a777' col2, 'a888' col3, 7777 col4 FROM dual)
SELECT col1,
SUM(col4) col4_total
FROM (SELECT connect_by_root(col1) col1,
col4
FROM your_table
CONNECT BY col3 = PRIOR col2
START WITH col1 IS NOT NULL)  -- or: START WITH col1 = 'a111'
GROUP BY col1;
COL1 COL4_TOTAL
---- ----------
a666 7777
a111 5555
Never mind, I believe I've determined the answer myself; I was overcomplicating what I wanted. Thank you anyway.
Answer:
Select Sum(col4) as "Total" from table1 where col3 in (Select col2 from table1 where col1='a111')
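The self-answer can be sanity-checked against the sample data. A sketch with SQLite from Python (NULLs stand in for the blank cells in the table above):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table1 (col1, col2, col3, col4)")
conn.executemany("insert into table1 values (?, ?, ?, ?)", [
    (None,   "a222", "a333", 4444),
    ("a111", "a333", None,   None),
    (None,   "a555", "a444", 1111),
    ("a111", "a444", None,   None),
])

# col2 of the 'a111' rows is {a333, a444}; sum col4 of rows whose col3 matches
total = conn.execute("""
    select sum(col4) as "Total"
    from table1
    where col3 in (select col2 from table1 where col1 = 'a111')
""").fetchone()[0]
print(total)   # 5555
```

The matching rows contribute 4444 + 1111 = 5555, the expected result.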

How to get min value from multiple columns for a row in SQL

I need to get the first (minimum) date from a set of 4 (or more) columns.
I tried
select min (col1, col2, col3) from tbl
which is obviously wrong.
let's say I have these 4 columns
col1   | col2   | col3 | col4
1/1/17 | 2/2/17 |      | 3/3/17
... in this case what I want to get is the value in col1 (1/1/17). And yes, these columns can include NULLs.
I am running this in dashDB. The columns are of the DATE data type, there is no ID or primary key column in this table, and I need to do this for ALL rows in my query. The columns are NOT ordered: col1 does not have to come before col2, nor col2 before col3, and so on, and any of them can be NULL.
If your DB supports the LEAST function, that is the best approach:
select
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
from tbl
Edit: if all the col(s) are NULL, you can hard-code the output as NULL. The query below should work; I couldn't test it, but it should.
select
case when
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
= TO_DATE('2901-01-01','YYYY-MM-DD')
then null
else
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
end
as min_date
from tbl
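The LEAST/NVL pattern above is Oracle/DB2 syntax, but the idea can be checked in SQLite from Python, where scalar min() plays the role of LEAST and coalesce() the role of NVL (the '9999-12-31' sentinel is an arbitrary far-future date, and ISO-8601 date strings compare correctly as text):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table tbl (col1, col2, col3, col4)")
conn.executemany("insert into tbl values (?, ?, ?, ?)", [
    ("2017-01-01", "2017-02-02", None, "2017-03-03"),
    (None, None, None, None),             # all-NULL row should yield NULL
])

# coalesce() replaces NULLs with the sentinel so min() never sees a NULL;
# if the minimum IS the sentinel, every column was NULL, so return NULL.
rows = conn.execute("""
    select case when min(coalesce(col1, '9999-12-31'),
                         coalesce(col2, '9999-12-31'),
                         coalesce(col3, '9999-12-31'),
                         coalesce(col4, '9999-12-31')) = '9999-12-31'
                then null
                else min(coalesce(col1, '9999-12-31'),
                         coalesce(col2, '9999-12-31'),
                         coalesce(col3, '9999-12-31'),
                         coalesce(col4, '9999-12-31'))
           end as min_date
    from tbl
""").fetchall()
print(rows)   # [('2017-01-01',), (None,)]
```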
If there is an id column in your table, then this query works:
select t.id, min(t.col) as min_col_value from(
select id, col1 as col from your_table
union all
select id, col2 as col from your_table
union all
select id, col3 as col from your_table
union all
select id, col4 as col from your_table
)t
group by t.id;
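This unpivot-then-aggregate approach needs no sentinel, because aggregate MIN skips NULLs. A sketch with SQLite from Python (an id column and ISO date strings assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (id, col1, col2, col3, col4)")
conn.execute("""insert into your_table
                values (1, '2017-01-01', '2017-02-02', null, '2017-03-03')""")

# Unpivot the four columns into one with UNION ALL, then take MIN per id.
# Aggregate MIN ignores the NULL contributed by col3.
row = conn.execute("""
    select t.id, min(t.col) as min_col_value
    from (select id, col1 as col from your_table
          union all
          select id, col2 from your_table
          union all
          select id, col3 from your_table
          union all
          select id, col4 from your_table) t
    group by t.id
""").fetchone()
print(row)   # (1, '2017-01-01')
```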
If you want the first date, then use coalesce():
select coalesce(col1, col2, col3, col4)
from t;
This returns the first non-NULL value (which is one way that I interpret the question). This will be the minimum date, if the dates are in order.
Select Id,
       Case When (Col1 <= Col2 Or Col2 Is Null) And (Col1 <= Col3 Or Col3 Is Null) Then Col1
            When (Col2 <= Col1 Or Col1 Is Null) And (Col2 <= Col3 Or Col3 Is Null) Then Col2
            Else Col3
       End As Min
From YourTable
This is for 3 columns; you can write it the same way for 4 or more.

SQL Downselect query results based on field duplication

I'm looking for a little help with a SQL query. (I am using Oracle.)
I have a query that is a union of 2 different select statements. The resulting data looks like the following:
Col1 Col2 Col3
XXX ValA Val1
XXX ValB Val2
YYY ValA Val1
YYY ValA Val2
In this setup, Col1 = XXX rows are default values and Col1 = YYY rows are real values. Real values (YYY) should take precedence over default values. The actual values are defined by columns 2 and 3.
I'm looking to downselect those results into the following:
Col1 Col2 Col3
XXX ValB Val2
YYY ValA Val1
YYY ValA Val2
Notice that the first row was removed; that's because a real value (YYY in row 3) takes precedence over the default value (XXX).
Any thoughts on how to approach this?
You want to filter out all the rows where col2 and col3 appear with XXX and with another value.
You can implement this filter by doing appropriate counts in a subquery using the analytic functions:
select col1, col2, col3
from (select t.*,
             count(*) over (partition by col2, col3) as numcombos,
             sum(case when col1 = 'XXX' then 1 else 0 end) over (partition by col2, col3) as numxs
      from t
     ) t
where numcombos = numxs or col1 <> 'XXX'
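A sketch of this analytic-count filter run against SQLite from Python (SQLite has supported window functions since 3.25; table name t assumed):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table t (col1, col2, col3)")
conn.executemany("insert into t values (?, ?, ?)", [
    ("XXX", "ValA", "Val1"),
    ("XXX", "ValB", "Val2"),
    ("YYY", "ValA", "Val1"),
    ("YYY", "ValA", "Val2"),
])

# Keep a row if its (col2, col3) combo is all-default, or the row is real
rows = conn.execute("""
    select col1, col2, col3
    from (select t.*,
                 count(*) over (partition by col2, col3) as numcombos,
                 sum(case when col1 = 'XXX' then 1 else 0 end)
                     over (partition by col2, col3) as numxs
          from t) t
    where numcombos = numxs or col1 <> 'XXX'
    order by col1, col2, col3
""").fetchall()
print(rows)   # [('XXX', 'ValB', 'Val2'), ('YYY', 'ValA', 'Val1'), ('YYY', 'ValA', 'Val2')]
```

The (XXX, ValA, Val1) row is dropped because its combo also occurs with a real YYY row.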
My instinct is to use an analytic function:
select distinct
first_value(col1)
over (partition by col2, col3
order by case col1
when 'XXX' then 1
else 0 end asc) as col1,
col2,
col3
from table1
However, if the table is large and indexed, it might be better to solve this with a full outer join (which is possible because there are only two possible values):
select coalesce(rl.col1, dflt.col1) as col1,
coalesce(rl.col2, dflt.col2) as col2,
coalesce(rl.col3, dflt.col3) as col3
from (select * from table1 where col1 = 'XXX') dflt
full outer join (select * from table1 where col1 <> 'XXX') rl
on dflt.col2 = rl.col2 and dflt.col3 = rl.col3;
[Solution in SQLFiddle]
I think you could use a trick like this:
select
case when
max(case when col1<>'XXX' then col1 end) is null then 'XXX' else
max(case when col1<>'XXX' then col1 end) end as col1,
col2,
col3
from
your_table
group by col2, col3
I transform the default value into NULL, then group by col2 and col3. The maximum of NULL and a value is the value you are looking for. This works on your example data, but it might not be exactly what you need; that depends on your real data.
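A sketch of this group-by trick against SQLite from Python; the nested CASE from the answer is condensed here with COALESCE, which is equivalent (NULL falls through to the 'XXX' fallback):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table your_table (col1, col2, col3)")
conn.executemany("insert into your_table values (?, ?, ?)", [
    ("XXX", "ValA", "Val1"),
    ("XXX", "ValB", "Val2"),
    ("YYY", "ValA", "Val1"),
    ("YYY", "ValA", "Val2"),
])

# Map 'XXX' to NULL inside the group; MAX then prefers any real value,
# and COALESCE restores 'XXX' when the group held only defaults.
rows = conn.execute("""
    select coalesce(max(case when col1 <> 'XXX' then col1 end), 'XXX') as col1,
           col2, col3
    from your_table
    group by col2, col3
    order by col1, col2, col3
""").fetchall()
print(rows)   # [('XXX', 'ValB', 'Val2'), ('YYY', 'ValA', 'Val1'), ('YYY', 'ValA', 'Val2')]
```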