Summarize data with GROUP BY in SQL?

I'm not sure what I should write in the SQL query to produce the following result:
Data:
Color is a unique column...
Result:

select color as [name/color], value
from your_table
union all
select name, sum(value)
from your_table
group by name
If you need a specific order, you can add an ordering column in a subquery:
select [name/color], value
from
(
select color as [name/color], value, name as order_column
from your_table
union all
select name, sum(value), name
from your_table
group by name
) x
order by order_column
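As a minimal sketch of the UNION ALL approach above, the following uses Python's built-in sqlite3 module. The sample rows are invented (the question's data was not preserved), the bracketed alias is replaced by a plain `label`, and the extra `is_total` column is an assumption added here so that each subtotal sorts after its detail rows.

```python
import sqlite3

# Hypothetical sample data; the original table contents are unknown.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (name TEXT, color TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [("apple", "red", 3), ("apple", "green", 5), ("plum", "purple", 2)],
)

# Detail rows and per-name totals are stacked with UNION ALL.
# order_column keeps each group together; is_total (an addition for
# this sketch) places the total after the detail rows of its group.
query = """
SELECT label, value
FROM (
    SELECT color AS label, value, name AS order_column, 0 AS is_total
    FROM your_table
    UNION ALL
    SELECT name, SUM(value), name, 1
    FROM your_table
    GROUP BY name
) AS x
ORDER BY order_column, is_total, label
"""
summary_rows = conn.execute(query).fetchall()
for row in summary_rows:
    print(row)
```

Without a tie-breaker such as `is_total`, the position of the total row relative to its detail rows is left to the database.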

Related

I want to retrieve the max value from two columns - SQL table

Please find attached image for table structure
You could use UNION ALL to combine both columns into a single column, then select the max from that:
select max(col)
from (
select col1 col
from trans_punch
union all
select col2
from trans_punch) t
You can use a common table expression to union the columns, then select the max.
;with cteUnionPunch(Emp_id, both_punch) AS
(
SELECT Emp_id, In_Punch FROM trans_punch
UNION ALL
SELECT Emp_id, Out_Punch FROM trans_punch
)
SELECT Emp_id, max(both_punch) FROM cteUnionPunch GROUP BY Emp_id
You can use CROSS APPLY with a VALUES list (SQL Server):
select tp.Emp_id, max(tpp.Punchs)
from trans_punch as tp cross apply
( values (In_Punch), (Out_Punch) ) tpp(Punchs)
group by tp.Emp_id;
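The CTE variant above can be tried out with Python's built-in sqlite3 module. This is only a sketch: the punch times are invented, and storing them as ISO-style text is an assumption made so that MAX compares them chronologically.

```python
import sqlite3

# Hypothetical punch data; the original table image was not preserved.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trans_punch (Emp_id INTEGER, In_Punch TEXT, Out_Punch TEXT)"
)
conn.executemany(
    "INSERT INTO trans_punch VALUES (?, ?, ?)",
    [
        (1, "2020-01-01 09:00", "2020-01-01 17:00"),
        (1, "2020-01-02 09:05", "2020-01-02 18:10"),
        (2, "2020-01-01 08:30", "2020-01-01 16:45"),
    ],
)

# Stack both punch columns into one column, then take the max per employee.
query = """
WITH both_punches (Emp_id, punch) AS (
    SELECT Emp_id, In_Punch FROM trans_punch
    UNION ALL
    SELECT Emp_id, Out_Punch FROM trans_punch
)
SELECT Emp_id, MAX(punch) AS latest_punch
FROM both_punches
GROUP BY Emp_id
ORDER BY Emp_id
"""
latest = conn.execute(query).fetchall()
print(latest)
```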

Select a third column based on two distinct columns within the same table

I want to select a third column based on two distinct columns within the same table.
I could only think of this:
select t1.thirdcolumn
from table1 t1
WHERE
EXISTS
(
Select distinct t1.firstcolumn , t1.secondcolumn
From t1
)
This:
select distinct t1.thirdcolumn
from table1 t1
won't work, as I don't want the distinct third column. I want the third column to be based on the first two columns being distinct.
I guess it's some kind of nested SQL statement with a select top 1, but I'm not sure.
CATEGORY   NAME                          Query
---------------------------------------------------
STUDENTS   NUMBER_OF_CHAPTERS            QueryA
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
STUDENTS   NUMBER_OF_STUDENT_MEMBERS     QueryB
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
MEMBERS    NUMBER_OF_MEMBERS_WORLDWIDE   QueryC
Your question is rather hard to follow, but I think you might simply want group by:
select t1.firstcolumn , t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn , t1.secondcolumn;
If you want rows where the pair of values only appears once, then add having count(*) = 1:
select t1.firstcolumn , t1.secondcolumn, max(t1.thirdcolumn)
from table1 t1
group by t1.firstcolumn , t1.secondcolumn
having count(*) = 1;
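To check the `having count(*) = 1` variant against the sample rows from the question, here is a small sketch using Python's built-in sqlite3 module. The column names `category`, `name` and `query_name` are stand-ins chosen for this example, not names from the original table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in column names for firstcolumn, secondcolumn and thirdcolumn.
conn.execute("CREATE TABLE table1 (category TEXT, name TEXT, query_name TEXT)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?)",
    [
        ("STUDENTS", "NUMBER_OF_CHAPTERS", "QueryA"),
        ("STUDENTS", "NUMBER_OF_STUDENT_MEMBERS", "QueryB"),
        ("STUDENTS", "NUMBER_OF_STUDENT_MEMBERS", "QueryB"),
        ("MEMBERS", "NUMBER_OF_MEMBERS_WORLDWIDE", "QueryC"),
        ("MEMBERS", "NUMBER_OF_MEMBERS_WORLDWIDE", "QueryC"),
    ],
)

# Keep only (category, name) pairs that occur exactly once.
query = """
SELECT category, name, MAX(query_name) AS query_name
FROM table1
GROUP BY category, name
HAVING COUNT(*) = 1
"""
unique_pairs = conn.execute(query).fetchall()
print(unique_pairs)
```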
Query -
SELECT
CATEGORY,NAME,QUERY
FROM
(
WITH TAB AS (
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_CHAPTERS' AS NAME,
'QUERYA' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'STUDENTS' AS CATEGORY,
'NUMBER_OF_STUDENT_MEMBERS' AS NAME,
'QUERYB' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
UNION ALL
SELECT
'MEMBERS' AS CATEGORY,
'NUMBER_OF_MEMBERS_WORLDWIDE' AS NAME,
'QUERYC' AS QUERY
FROM
DUAL
) SELECT
CATEGORY,
NAME,
QUERY,
COUNT(*) OVER(PARTITION BY
CATEGORY,
NAME
ORDER BY
CATEGORY,
NAME,
QUERY
) AS RNK
FROM
TAB
)
WHERE
RNK = 1;
Output -
"CATEGORY","NAME","QUERY"
"STUDENTS","NUMBER_OF_CHAPTERS","QUERYA"

BigQuery - Concatenate multiple rows into a single row

I have a BigQuery table with 2 columns:
id|name
1|John
1|Tom
1|Bob
2|Jack
2|Tim
Expected output: Concatenate names grouped by id
id|Text
1|John,Tom,Bob
2|Jack,Tim
For BigQuery Standard SQL:
#standardSQL
--WITH yourTable AS (
-- SELECT 1 AS id, 'John' AS name UNION ALL
-- SELECT 1, 'Tom' UNION ALL
-- SELECT 1, 'Bob' UNION ALL
-- SELECT 2, 'Jack' UNION ALL
-- SELECT 2, 'Tim'
--)
SELECT
id,
STRING_AGG(name ORDER BY name) AS Text
FROM yourTable
GROUP BY id
The optional ORDER BY name inside STRING_AGG returns the names as a sorted list, as shown below:
id Text
1 Bob,John,Tom
2 Jack,Tim
For Legacy SQL
#legacySQL
SELECT
id,
GROUP_CONCAT(name) AS Text
FROM yourTable
GROUP BY id
If you need a sorted list here, you can use the query below. Formally, BigQuery Legacy SQL does not guarantee a sorted list this way, but it worked in most practical cases I had.
#legacySQL
SELECT
id,
GROUP_CONCAT(name) AS Text
FROM (
SELECT id, name
FROM yourTable
ORDER BY name
)
GROUP BY id
You can use GROUP_CONCAT
SELECT id, GROUP_CONCAT(name) AS Text FROM <dataset>.<table> GROUP BY id
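The grouping-and-concatenation idea can be tried locally with Python's built-in sqlite3 module. Note the swap: this sketch uses SQLite's own GROUP_CONCAT, not BigQuery, and `yourTable` stands in for the `<dataset>.<table>` placeholder. SQLite does not promise an order inside the concatenated string, so the names are sorted in Python afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [(1, "John"), (1, "Tom"), (1, "Bob"), (2, "Jack"), (2, "Tim")],
)

# One row per id, with all names joined by commas (SQLite's default separator).
query = """
SELECT id, GROUP_CONCAT(name) AS Text
FROM yourTable
GROUP BY id
ORDER BY id
"""
concatenated = conn.execute(query).fetchall()

# The order within each string is unspecified, so sort the pieces here.
names_by_id = {row_id: sorted(text.split(",")) for row_id, text in concatenated}
print(names_by_id)
```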

How to get the first non-null value from a column of values in BigQuery?

I am trying to extract the first non-null value from a column of values, ordered by timestamp. Can somebody share their thoughts on this? Thank you.
What I have tried so far:
FIRST_VALUE( column ) OVER ( PARTITION BY id ORDER BY timestamp)
Input :-
id,column,timestamp
1,NULL,10:30 am
1,NULL,10:31 am
1,'xyz',10:32 am
1,'def',10:33 am
2,NULL,11:30 am
2,'abc',11:31 am
Output(expected) :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am
You can modify your SQL like this, sorting non-null values ahead of nulls within each partition, to get the data you want:
FIRST_VALUE( column )
OVER (
PARTITION BY id
ORDER BY
CASE WHEN column IS NULL then 0 ELSE 1 END DESC,
timestamp
)
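The ordering trick above (non-null rows first, then by timestamp) can be checked with Python's built-in sqlite3 module, assuming the bundled SQLite is version 3.25 or newer so that window functions are available. The table name `readings` and the column names `val` and `ts` are stand-ins for this sketch, since `column` is a reserved word.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Stand-in names: readings, val, ts.
conn.execute("CREATE TABLE readings (id INTEGER, val TEXT, ts TEXT)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?)",
    [
        (1, None, "10:30"),
        (1, None, "10:31"),
        (1, "xyz", "10:32"),
        (1, "def", "10:33"),
        (2, None, "11:30"),
        (2, "abc", "11:31"),
    ],
)

# Within each id, non-null values sort first (flag 1 before 0 because of
# DESC), then by time, so FIRST_VALUE picks the earliest non-null value.
query = """
SELECT id,
       FIRST_VALUE(val) OVER (
           PARTITION BY id
           ORDER BY CASE WHEN val IS NULL THEN 0 ELSE 1 END DESC, ts
       ) AS first_val,
       ts
FROM readings
ORDER BY id, ts
"""
filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```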
Try this old trick of string manipulation:
Select
ID,
Column,
ttimestamp,
LTRIM(Right(CColumn,20)) as CColumn,
FROM
(SELECT
ID,
Column,
ttimestamp,
MIN(Concat(RPAD(IF(Column is null, '9999999999999999',STRING(ttimestamp)),20,'0'),LPAD(Column,20,' '))) OVER (Partition by ID) CColumn
FROM (
SELECT
*
FROM (Select 1 as ID, STRING(NULL) as Column, 0.4375 as ttimestamp),
(Select 1 as ID, STRING(NULL) as Column, 0.438194444444444 as ttimestamp),
(Select 1 as ID, 'xyz' as Column, 0.438888888888889 as ttimestamp),
(Select 1 as ID, 'def' as Column, 0.439583333333333 as ttimestamp),
(Select 2 as ID, STRING(NULL) as Column, 0.479166666666667 as ttimestamp),
(Select 2 as ID, 'abc' as Column, 0.479861111111111 as ttimestamp)
))
As far as I know, BigQuery has no options like 'IGNORE NULLS' or 'NULLS LAST'. Given that, this is the simplest solution I could come up with. I would like to see even simpler solutions.
Assuming the input data is in table "original_data",
select w2.id, w1.column, w2.timestamp
from
(select id,column,timestamp
from
(select id,column,timestamp, row_number()
over (partition BY id ORDER BY timestamp) position
FROM original_data
where column is not null
)
where position=1
) w1
right outer join
original_data as w2
on w1.id = w2.id
SELECT id,
(SELECT top(1) column FROM test1 where id=1 and column is not null order by autoID desc) as name
,timestamp
FROM yourTable
Output :-
1,'xyz',10:30 am
1,'xyz',10:31 am
1,'xyz',10:32 am
1,'xyz',10:33 am
2,'abc',11:30 am
2,'abc',11:31 am

Join based on min

I have two tables.
Table1:
id, date
Table2:
id,date
Both tables contain information keyed by id. Each table can have some extra rows that are not present in the other table.
Example:
Table1:
1,15-Jun
2,16-Jun
4,17-Jun
Table2
1,14-Jun
2,17-Jun
3,18-Jun
I need a summarized result that gives the minimum date for each id.
Expected result:
1,14-Jun
2,16-Jun
3,18-Jun
4,17-Jun
select id, min(date_) from (
select id, date_ from table1
union all
select id, date_ from table2
) group by id;
SELECT id, MIN(date)
FROM (SELECT id, date
FROM Table1
UNION
SELECT id, date
FROM Table2)
GROUP BY id
with a as(select t.i_id,t.dt_date from t
union
select b.i_id,b.dt_date from b)
select a.i_id,min(a.dt_date) from a group by a.i_id order by a.i_id;
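As a quick check of the union-then-group approach against the example data, here is a sketch using Python's built-in sqlite3 module. The dates are stored as ISO text so that MIN compares them chronologically; the year 2023 is an assumption, since the question gives only day and month.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (id INTEGER, date_ TEXT)")
conn.execute("CREATE TABLE table2 (id INTEGER, date_ TEXT)")
# The year is assumed; the question shows only dates such as 15-Jun.
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [(1, "2023-06-15"), (2, "2023-06-16"), (4, "2023-06-17")],
)
conn.executemany(
    "INSERT INTO table2 VALUES (?, ?)",
    [(1, "2023-06-14"), (2, "2023-06-17"), (3, "2023-06-18")],
)

# Stack both tables, then take the earliest date per id. Ids that exist
# in only one table still appear because no join is involved.
query = """
SELECT id, MIN(date_) AS min_date
FROM (
    SELECT id, date_ FROM table1
    UNION ALL
    SELECT id, date_ FROM table2
) AS combined
GROUP BY id
ORDER BY id
"""
min_dates = conn.execute(query).fetchall()
print(min_dates)
```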