How to get min value from multiple columns for a row in SQL - sql

I need to get the first (min) date from a set of 4 (or more) columns.
I tried
select min (col1, col2, col3) from tbl
which is obviously wrong.
let's say I have these 4 columns
col1 | col2 | col3 | col4
1/1/17 | 2/2/17 | | 3/3/17
... in this case what I want to get is the value in col1 (1/1/17). And yes, these columns can include NULLs.
I am running this in dashDB
the columns are Date data type,
there is no ID nor Primary key column in this table,
and I need to do this for ALL rows in my query,
the columns are NOT in order: col1 does not have to be earlier than col2, col2 does not have to be earlier than col3, and so on, and any of them can be NULL.

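If your DB supports the least function, that is the best approach. In several databases (Oracle, for example) least returns NULL when any argument is NULL, so replace NULLs with a far-future date first: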
select
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
from tbl
Edit: If all the columns are NULL, you can return NULL explicitly. I couldn't test the query below, but it should work.
select
case when
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
= TO_DATE('2901-01-01','YYYY-MM-DD')
then null
else
least
(
nvl(col1,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col2,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col3,TO_DATE('2901-01-01','YYYY-MM-DD')),
nvl(col4,TO_DATE('2901-01-01','YYYY-MM-DD'))
)
end
as min_date
from tbl
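The sentinel-then-NULL idea above can be tried without a database server. This is a minimal sketch, assuming SQLite (through Python's sqlite3 module) as a stand-in for dashDB: SQLite's multi-argument min() plays the role of least and also returns NULL if any argument is NULL, coalesce replaces nvl, and nullif replaces the repeated CASE. The sentinel value, the function name and the ISO-8601 text dates are choices made for this example only.

```python
import sqlite3

SENTINEL = "9999-12-31"  # illustrative far-future date, not from the original answer


def min_dates(rows):
    """Return the row-wise minimum of four date columns, or None if all are NULL."""
    con = sqlite3.connect(":memory:")
    con.execute("create table tbl (col1 text, col2 text, col3 text, col4 text)")
    con.executemany("insert into tbl values (?, ?, ?, ?)", rows)
    sql = """
        select nullif(
                 min(coalesce(col1, :s), coalesce(col2, :s),
                     coalesce(col3, :s), coalesce(col4, :s)),
                 :s) as min_date
        from tbl
        order by rowid
    """
    # ISO-8601 text dates sort correctly as plain strings.
    result = [r[0] for r in con.execute(sql, {"s": SENTINEL})]
    con.close()
    return result
```

Calling min_dates with the sample row (dates rewritten as ISO text) and an all-NULL row returns the earliest date for the first and None for the second.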

If there is an id column in your table, you can unpivot the columns with union all and aggregate per id:
select t.id, min(t.col) as min_col_value from(
select id, col1 as col from your_table
union all
select id, col2 as col from your_table
union all
select id, col3 as col from your_table
union all
select id, col4 as col from your_table
)t
group by t.id;
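The question says the table has no id column, so the query above needs some row identifier. A minimal sketch, assuming SQLite through Python's sqlite3 module, where the built-in rowid is used as a stand-in id; other databases would need their own row identifier, and the function name is hypothetical.

```python
import sqlite3


def min_per_row(rows):
    """Unpivot four columns with UNION ALL, then take min() per row id."""
    con = sqlite3.connect(":memory:")
    con.execute("create table your_table (col1 text, col2 text, col3 text, col4 text)")
    con.executemany("insert into your_table values (?, ?, ?, ?)", rows)
    sql = """
        select t.id, min(t.col) as min_col_value
        from (
            select rowid as id, col1 as col from your_table
            union all
            select rowid as id, col2 as col from your_table
            union all
            select rowid as id, col3 as col from your_table
            union all
            select rowid as id, col4 as col from your_table
        ) t
        group by t.id
        order by t.id
    """
    # The aggregate min() skips NULLs, so no sentinel value is needed here.
    result = list(con.execute(sql))
    con.close()
    return result
```

Because the aggregate min() ignores NULLs, a row whose columns are all NULL simply yields NULL.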

If you want the first date, then use coalesce():
select coalesce(col1, col2, col3, col4)
from t;
This returns the first non-NULL value (which is one way that I interpret the question). This will be the minimum date, if the dates are in order.
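The caveat about ordering matters for this question, since the columns are stated not to be in order. A small sketch, assuming SQLite through Python's sqlite3 module and ISO-8601 text dates (both assumptions for illustration), shows coalesce returning the first non-NULL value rather than the smallest one.

```python
import sqlite3

row = (None, "2017-03-03", "2017-01-01", None)  # col2 is later than col3

con = sqlite3.connect(":memory:")
con.execute("create table t (col1 text, col2 text, col3 text, col4 text)")
con.execute("insert into t values (?, ?, ?, ?)", row)

# coalesce picks the first non-NULL column, scanning left to right.
first_non_null = con.execute(
    "select coalesce(col1, col2, col3, col4) from t"
).fetchone()[0]
con.close()

# The real minimum, computed directly in Python for comparison.
true_min = min(v for v in row if v is not None)
```

Here first_non_null is the value from col2, while true_min is the earlier date from col3.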

Select Id,
Case
When (Col1 <= Col2 OR Col2 is null) And (Col1 <= Col3 OR Col3 is null) Then Col1
When (Col2 <= Col1 OR Col1 is null) And (Col2 <= Col3 OR Col3 is null) Then Col2
Else Col3
End As Min
From YourTable
This handles 3 columns; you can extend it the same way for 4 or more columns.

Related

lateral view explode in bigquery

I want to do something like this using BigQuery.
Input Table
Col1 | Col2 | Col3 | Col4
1 | A,B,C | 123 | 789
Output Table
ID | COL | VALUE
1 | COL1 | 1
1 | COL2 | A,B,C
1 | COL3 | 123
1 | COL4 | 789
I got this in hive with LATERAL VIEW explode(MAP), but I can't get the same in bigquery.
Consider below approach
select id, col, value
from (select *, row_number() over() as id from your_table)
unpivot (value for col in (Col1, Col2, Col3, Col4))
If applied to the sample data in your question
with your_table as (
select '1' Col1, 'A,B,C' Col2, '123' Col3, '789' Col4
)
the output is one row per source column, with the columns id, col and value
Note - this particular approach requires all columns (Col1 - Col4) to be of the same type. If that is not the case, you will first need to cast some of them to string.
If it's a discrete number of columns, you can use UNIONs for this...
select id, 'Col1' as Column, col1 as Value
from table
union all
select id, 'Col2' as Column, col2 as Value
from table
union all
select id, 'Col3' as Column, col3 as Value
from table
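The UNION approach can be checked locally on the sample row. This is a sketch, assuming SQLite 3.25 or later through Python's sqlite3 module rather than BigQuery; because the sample table has no id column, one is generated with row_number(), and a fourth branch is added for Col4. The names your_table, numbered and unpivot are hypothetical.

```python
import sqlite3


def unpivot(rows):
    """Turn each (Col1..Col4) row into four (id, col, value) rows."""
    con = sqlite3.connect(":memory:")
    con.execute("create table your_table (Col1 text, Col2 text, Col3 text, Col4 text)")
    con.executemany("insert into your_table values (?, ?, ?, ?)", rows)
    sql = """
        with numbered as (
            select row_number() over () as id, * from your_table
        )
        select id, 'Col1' as col, Col1 as value from numbered
        union all
        select id, 'Col2' as col, Col2 as value from numbered
        union all
        select id, 'Col3' as col, Col3 as value from numbered
        union all
        select id, 'Col4' as col, Col4 as value from numbered
        order by id, col
    """
    result = list(con.execute(sql))
    con.close()
    return result
```

One branch per column keeps the query simple, at the cost of scanning the table once per column.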

Dynamic Group By in a Query

Is there a way to apply a group by conditionally in a query? For example, I have this:
Col1 Col2 Col3
A 10 X
A 10 NULL
B 12 NULL
B 12 NULL
I have to group by Col1 and Col2 only when I have a value in Col3, if Col3 is null, I don't need to group it. The result should be:
Col1 Col2
A 20
B 12
B 12
Maybe it is not an elegant example, but this is the idea.
Thank you.
Here's a SQL Fiddle that does what you want:
http://sqlfiddle.com/#!3/b7f07/2
Here's the SQL itself:
SELECT col1, sum(col2) as col2 FROM dataTable WHERE
col1 in (SELECT col1 from dataTable WHERE col3 IS NOT NULL)
GROUP BY col1
UNION ALL
SELECT col1, col2 FROM dataTable WHERE
(col1 not in
(SELECT col1 from dataTable WHERE col3 IS NOT NULL and col1 is not null))
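For readers who cannot open the fiddle, the same query can be run on the question's sample rows. A minimal sketch, assuming SQLite through Python's sqlite3 module instead of the engine used in the fiddle; the function name is hypothetical and the column types are guesses from the sample data.

```python
import sqlite3


def conditional_group_union(rows):
    """Sum col2 per col1 where any col3 is set; keep other rows ungrouped."""
    con = sqlite3.connect(":memory:")
    con.execute("create table dataTable (col1 text, col2 integer, col3 text)")
    con.executemany("insert into dataTable values (?, ?, ?)", rows)
    sql = """
        SELECT col1, sum(col2) as col2 FROM dataTable WHERE
          col1 in (SELECT col1 from dataTable WHERE col3 IS NOT NULL)
        GROUP BY col1
        UNION ALL
        SELECT col1, col2 FROM dataTable WHERE
          col1 not in
            (SELECT col1 from dataTable WHERE col3 IS NOT NULL and col1 is not null)
    """
    # UNION ALL gives no ordering guarantee, so sort in Python.
    result = sorted(con.execute(sql))
    con.close()
    return result
```

On the sample data this yields one summed row for A and the two B rows unchanged.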
It sounds like you want all unique values of col1 when col3 is not null. Otherwise, you want all values of col1.
Assuming you have a SQL engine that supports window functions, you can do this as:
select col1, sum(col2)
from (select t.*,
count(col3) over (partition by col1) as NumCol3Values,
row_number() over (partition by col1 order by col1) as seqnum
from t
) t
group by col1,
(case when NumCol3Values > 0 then NULL else seqnum end)
The logic is pretty much as you state it. If there is any non-NULL value, then the second clause of the group by always evaluates to NULL -- everything goes in the same group. If things are all NULL, then the clause evaluates to a sequence number, which puts each value on a separate row.
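The window-function idea can be exercised on the sample data. This is a sketch, assuming SQLite 3.25 or later through Python's sqlite3 module; the table name t comes from the answer, while the function name and the use of count > 0 as the test for "any non-NULL col3" are choices made for this example.

```python
import sqlite3


def conditional_group_window(rows):
    """Group rows per col1 only when that col1 has a non-NULL col3."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (col1 text, col2 integer, col3 text)")
    con.executemany("insert into t values (?, ?, ?)", rows)
    sql = """
        select col1, sum(col2)
        from (select t.*,
                     count(col3) over (partition by col1) as NumCol3Values,
                     row_number() over (partition by col1 order by col1) as seqnum
              from t
             ) t
        group by col1,
                 (case when NumCol3Values > 0 then NULL else seqnum end)
    """
    # NULL keys fall into one group; distinct seqnum values keep rows apart.
    result = sorted(con.execute(sql))
    con.close()
    return result
```

The second grouping key is what switches grouping on or off per col1 value.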
This is a bit more difficult without window functions. If I assume that the minimum value of column 3 (when not NULL) is unique, then the following would work:
select t.col1,
(case when minCol3 is null then t.col2 else tsum.col2 end) as col2
from t left outer join
(select col1, sum(col2) as col2,
min(col3) as minCol3
from t
group by col1
) tsum
on t.col1 = tsum.col1
where minCol3 is NULL or t.col3 = minCol3
re: Is there a way to apply or not a group by into a query?
Not directly, but you can break it down by groupings and then UNION the results together.
Does this work?
Select col1, sum(col2)
from table
group by col1, col2
having max(col3) is not null
union all
select t.col1, t.col2
from table t left outer join
(Select col1, col2
from table
group by col1, col2
having max(col3) is not null) g
on t.col1 = g.col1 and t.col2 = g.col2
where g.col1 is null

SQL Downselect query results based on field duplication

I'm looking for a little help with a SQL query. (I am using Oracle.)
I have a query that is a union of 2 different select statements. The resulting data looks like the following:
Col1 Col2 Col3
XXX ValA Val1
XXX ValB Val2
YYY ValA Val1
YYY ValA Val2
In this setup the Col1 = XXX rows are default values and the Col1 = YYY rows are real values. Real values (YYY) should take precedence over default values. The actual values are defined via columns 2 and 3.
I'm looking to downselect those results into the following:
Col1 Col2 Col3
XXX ValB Val2
YYY ValA Val1
YYY ValA Val2
Notice that the first row was removed ... that's because a real value (YYY in row 3) took precedence over the default value (XXX).
Any thoughts on how to approach this?
You want to filter out all the rows where col2 and col3 appear with XXX and with another value.
You can implement this filter by doing appropriate counts in a subquery using the analytic functions:
select col1, col2, col3
from (select t.*,
count(*) over (partition by col2, col3) as numcombos,
sum(case when col1 = 'XXX' then 1 else 0 end) over (partition by col2, col3) as numxs
from t
) t
where numcombos = numxs or (col1 <> 'XXX')
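The counting filter can be verified against the question's four rows. A minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module in place of Oracle; the comparison uses the upper-case literal 'XXX' because string comparison is case-sensitive in both engines, and the function name is hypothetical.

```python
import sqlite3


def drop_overridden_defaults(rows):
    """Keep real rows, and default ('XXX') rows with no real counterpart."""
    con = sqlite3.connect(":memory:")
    con.execute("create table t (col1 text, col2 text, col3 text)")
    con.executemany("insert into t values (?, ?, ?)", rows)
    sql = """
        select col1, col2, col3
        from (select t.*,
                     count(*) over (partition by col2, col3) as numcombos,
                     sum(case when col1 = 'XXX' then 1 else 0 end)
                         over (partition by col2, col3) as numxs
              from t
             ) t
        where numcombos = numxs or col1 <> 'XXX'
        order by col1, col2, col3
    """
    result = list(con.execute(sql))
    con.close()
    return result
```

A default row survives only when every row sharing its (col2, col3) pair is also a default.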
My instinct is to use an analytic function:
select distinct
first_value(col1)
over (partition by col2, col3
order by case col1
when 'XXX' then 1
else 0 end asc) as col1,
col2,
col3
from table1
However, if the table is large and indexed, it might be better to solve this with a full outer join (which is possible because there are only two possible values):
select coalesce(rl.col1, dflt.col1) as col1,
coalesce(rl.col2, dflt.col2) as col2,
coalesce(rl.col3, dflt.col3) as col3
from (select * from table1 where col1 = 'XXX') dflt
full outer join (select * from table1 where col1 <> 'XXX') rl
on dflt.col2 = rl.col2 and dflt.col3 = rl.col3;
[Solution in SQLFiddle]
I think you could use a trick like this:
select
case when
max(case when col1<>'XXX' then col1 end) is null then 'XXX' else
max(case when col1<>'XXX' then col1 end) end as col1,
col2,
col3
from
your_table
group by col2, col3
I transform the default value to null, then I group by col2 and col3. The maximum between null and a real value is the real value, which is the one you are looking for. This works on your example data, but it might not be exactly what you need, depending on what your real data looks like.
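The grouping trick can be tried on the sample rows. This is a sketch, assuming SQLite through Python's sqlite3 module in place of Oracle; coalesce is used here as a shorter equivalent of the CASE ... IS NULL test, and the alias resolved_col1 and the function name are hypothetical.

```python
import sqlite3


def resolve_defaults(rows):
    """Per (col2, col3), prefer a real col1 value over the 'XXX' default."""
    con = sqlite3.connect(":memory:")
    con.execute("create table your_table (col1 text, col2 text, col3 text)")
    con.executemany("insert into your_table values (?, ?, ?)", rows)
    sql = """
        select coalesce(max(case when col1 <> 'XXX' then col1 end), 'XXX')
                   as resolved_col1,
               col2,
               col3
        from your_table
        group by col2, col3
        order by col2, col3
    """
    # max() ignores the NULLs produced for default rows.
    result = list(con.execute(sql))
    con.close()
    return result
```

Note that this collapses each (col2, col3) pair to a single row, so several distinct real values for the same pair would be reduced to the largest one.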

select all columns with one column has different value

In my table, some records have the same values in all columns except one. I need to write a query to get those records. What's the best way to do it? The table is like this:
colA colB colC
a b c
a b d
a b e
What's the best way to get all records with all the columns? Thanks for everyone's help.
Assuming you know that column3 will always be different, to get the rows that have more than one value:
SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
If you need all the values in the three columns, then you can join this back to the original table:
SELECT t.*
FROM table t join
(SELECT Col1, Col2
FROM Table t
GROUP BY Col1, Col2
HAVING COUNT(distinct col3) > 1
) cols
on t.col1 = cols.col1 and t.col2 = cols.col2
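The join-back query can be checked on the question's rows plus one unrelated row. A minimal sketch, assuming SQLite through Python's sqlite3 module; the table name tbl, the extra row and the function name are made up for this example.

```python
import sqlite3


def rows_differing_only_in_col3(rows):
    """Return full rows whose (col1, col2) pair has more than one col3 value."""
    con = sqlite3.connect(":memory:")
    con.execute("create table tbl (col1 text, col2 text, col3 text)")
    con.executemany("insert into tbl values (?, ?, ?)", rows)
    sql = """
        select t.*
        from tbl t join
             (select col1, col2
              from tbl
              group by col1, col2
              having count(distinct col3) > 1
             ) cols
             on t.col1 = cols.col1 and t.col2 = cols.col2
        order by t.col1, t.col2, t.col3
    """
    result = list(con.execute(sql))
    con.close()
    return result
```

The row whose (col1, col2) pair appears only once is filtered out by the HAVING clause.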
Just select those rows that have the different values:
SELECT col1, col2
FROM myTable
WHERE colWanted != knownValue
If this is not what you are looking for, please post examples of the data in the table and the wanted output.
How about something like
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) = 1
This will give you Col1, Col2 that have unique data.
Assuming col3 has the differences:
SELECT Col1, Col2
FROM Table
GROUP BY Col1, Col2
HAVING COUNT(*) > 1
Or, to show all 3 columns:
SELECT Col1, Col2, Col3
FROM Table1
GROUP BY Col1, Col2, Col3
HAVING COUNT(Col3) > 1

Multiple Subqueries and Conditions

If I have a subquery which does the following:
Select
Min(S_Date)
, Col1
, Col2
From
(
Select Dateadd(whatever) as S_Date, Userid
from tbl1 as t
where S_Date >'today'
)
How can I add another clause so that the value from Col1 is only selected if another condition is met, i.e. col3 = 'doit'? I guess I am having trouble understanding how to use two where clauses in different places in a subquery.
You need to use a CASE expression:
SELECT
s_date
,CASE Col3 WHEN 'doit' THEN Col1 ELSE Col2 END AS selection
FROM (
SELECT
s_date
, Col1
, Col2
, Col3
FROM foo
WHERE s_Date > GETDATE()
) AS sub
To use aggregate functions like MIN() you need to group by the columns you're not aggregating...
Select
Min(S_Date)
, Col1
, Col2
From
(
Select Dateadd(whatever) as S_Date, Userid
from tbl1 as t
where S_Date >'today'
)
GROUP BY
Col1,
Col2
If you then want the non-aggregated columns to be conditional, you again group by those conditional values...
Select
Min(S_Date)
, CASE WHEN col3 = 'doit' THEN Col1 ELSE Col2 END AS conditional_field
From
(
Select Dateadd(whatever) as S_Date, Userid
from tbl1 as t
where S_Date >'today'
)
GROUP BY
CASE WHEN col3 = 'doit' THEN Col1 ELSE Col2 END
I'm not 100% sure what you actually want to achieve though. Do you have a sample set of data, with the results you want, and an explanation of how the results relate to the source?
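Since the question gives no sample data, here is a small made-up data set to show what grouping by the conditional value does. This is a sketch, assuming SQLite through Python's sqlite3 module; the table foo, its rows, the cutoff parameter and the function name are all hypothetical, and dates are stored as ISO-8601 text.

```python
import sqlite3


def min_date_by_conditional_field(rows, cutoff):
    """min(s_date) grouped by col1 when col3 = 'doit', otherwise by col2."""
    con = sqlite3.connect(":memory:")
    con.execute("create table foo (s_date text, col1 text, col2 text, col3 text)")
    con.executemany("insert into foo values (?, ?, ?, ?)", rows)
    sql = """
        select min(s_date),
               case when col3 = 'doit' then col1 else col2 end as conditional_field
        from foo
        where s_date > ?
        group by case when col3 = 'doit' then col1 else col2 end
        order by conditional_field
    """
    result = list(con.execute(sql, (cutoff,)))
    con.close()
    return result
```

The where clause filters rows before grouping, while the CASE expression decides which column each surviving row is grouped on.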