I have two tables in BigQuery:
TABLE A
ID   Name  Date_A      field_x  field_y  field_z
xxx  tata  10/11/2021  a        0        1
xxx  tata  11/11/2021  a        1        1
zzz  tutu  01/11/2021  b        0        1
zzz  tutu  05/11/2021  b        1        1
yyy  titi  02/11/2021  c        0        1
uuu  tata  08/11/2021  d        0        0
TABLE B
ID   Name  Date_B      field_A  field_B
xxx  tata  13/11/2021  AA       BB
zzz  tutu  01/11/2021  CC       DD
yyy  titi  11/11/2021  AA       BB
uuu  tata  05/11/2021  DD       DD
I would like to left join table B (on ID and Name) only to the row of table A that has the max Date_A for each ID and Name, to get:
ID   Name  Date_A      field_x  field_y  field_z  field_A  field_B
xxx  tata  10/11/2021  a        0        1        NULL     NULL
xxx  tata  11/11/2021  a        1        1        AA       BB
zzz  tutu  01/11/2021  b        0        1        NULL     NULL
zzz  tutu  05/11/2021  b        1        1        CC       DD
yyy  titi  02/11/2021  c        0        1        AA       BB
uuu  tata  08/11/2021  d        0        0        DD       DD
How can I do that in SQL (BigQuery), please? Thanks.
Consider the approach below:
select a.*,
(if(row_number() over win = 1, b, null)).* except(id, name, date_b)
from table_a a
left join table_b b
using(id, name)
window win as (partition by a.id, a.name order by date_a desc)
If applied to the sample data in your question, the output is: [output image not included]
I haven't tested it, but I think you should left join table b to a version of table a in which the max-date row is marked. A join condition that refers only to the left table is somewhat unusual, but from the definition of a left join I expect it to work.
select a_ranked.ID, a_ranked.Name, a_ranked.Date_A
, a_ranked.field_x, a_ranked.field_y, a_ranked.field_z
, b.field_A, b.field_B
from (
select a.*, rank() over (partition by ID, Name order by Date_A desc) as r
from a
) a_ranked
left join b on a_ranked.ID = b.ID and a_ranked.Name = b.Name and a_ranked.r = 1
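To check the idea, here is a minimal sketch that runs the ranked left join against the question's sample data. It uses Python's built-in sqlite3 module as a stand-in for BigQuery (an assumption made only so the example can run locally), and the dates are rewritten as ISO strings so that ORDER BY sorts them chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (ID TEXT, Name TEXT, Date_A TEXT, field_x TEXT, field_y INT, field_z INT);
CREATE TABLE b (ID TEXT, Name TEXT, Date_B TEXT, field_A TEXT, field_B TEXT);
INSERT INTO a VALUES
  ('xxx','tata','2021-11-10','a',0,1),
  ('xxx','tata','2021-11-11','a',1,1),
  ('zzz','tutu','2021-11-01','b',0,1),
  ('zzz','tutu','2021-11-05','b',1,1),
  ('yyy','titi','2021-11-02','c',0,1),
  ('uuu','tata','2021-11-08','d',0,0);
INSERT INTO b VALUES
  ('xxx','tata','2021-11-13','AA','BB'),
  ('zzz','tutu','2021-11-01','CC','DD'),
  ('yyy','titi','2021-11-11','AA','BB'),
  ('uuu','tata','2021-11-05','DD','DD');
""")

# Rank the rows of a per (ID, Name), newest first, then let b match only rank 1.
rows = conn.execute("""
SELECT a_ranked.ID, a_ranked.Name, a_ranked.Date_A, b.field_A, b.field_B
FROM (
  SELECT a.*, RANK() OVER (PARTITION BY ID, Name ORDER BY Date_A DESC) AS r
  FROM a
) AS a_ranked
LEFT JOIN b
  ON a_ranked.ID = b.ID AND a_ranked.Name = b.Name AND a_ranked.r = 1
ORDER BY a_ranked.ID, a_ranked.Date_A
""").fetchall()

for row in rows:
    print(row)
```

Only the latest Date_A row per (ID, Name) picks up field_A and field_B; the earlier rows keep NULL (None in Python).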
Related
I want to join a few tables:
table1:
A B_key B_version C D
123 abc 1 ccc 11
123 abc 2 ddd 11
456 dfg 1 rrr 22
789 vvv 1 55
table2:
A E F
123 s 5
456 r
111 t 2
table3:
B_key B_version G
abc 1 aa
abc 1 bb
abc 2 aa
abc 2 cc
dfg 1 aa
so the result would look like this:
A B_key B_version C D E F G
123 abc 1 ccc 11 s 5 aa
123 abc 1 ccc 11 s 5 bb
123 abc 2 ddd 11 s 5 aa
123 abc 2 ddd 11 s 5 cc
456 dfg 1 rrr 22 r aa
789 vvv 1 55
Version can go as high as 8.
If I don't have A, B_key or B_version, the line is useless. Otherwise I need to keep all the information I do have.
In reality I have many more columns.
I've tried:
SELECT table1.A, table1.B_key, table1.B_version, table1.C, table1.D,
table2.E, table2.F,
table3.G
FROM table1
LEFT JOIN table2
ON table1.A = table2.A
LEFT JOIN table3
ON table1.B_key = table3.B_key AND
table1.B_version = table3.B_version
and the same with FULL JOIN.
It ends up the same: for every B_key only the highest B_version is kept, while the others disappear.
How can I avoid losing information?
You can use left joins among the tables, as below:
select t1.A, t1.B_key, t1.B_version, t1.C, t1.D, t2.E, t2.F, t3.G
from table1 t1
left join table2 t2 on t2.A = t1.A
left join table3 t3 on t3.B_key = t1.B_key and t3.B_version = t1.B_version
This also brings in the rows that have no match under the join conditions.
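As a quick check that plain left joins keep every B_version, here is a small sketch that loads the question's three sample tables and runs the query above. It uses Python's built-in sqlite3 module purely as a convenient local engine (an assumption; the question does not name a database), and the blank cells in the sample are stored as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (A INT, B_key TEXT, B_version INT, C TEXT, D INT);
CREATE TABLE table2 (A INT, E TEXT, F INT);
CREATE TABLE table3 (B_key TEXT, B_version INT, G TEXT);
INSERT INTO table1 VALUES
  (123,'abc',1,'ccc',11), (123,'abc',2,'ddd',11),
  (456,'dfg',1,'rrr',22), (789,'vvv',1,NULL,55);
INSERT INTO table2 VALUES (123,'s',5), (456,'r',NULL), (111,'t',2);
INSERT INTO table3 VALUES
  ('abc',1,'aa'), ('abc',1,'bb'), ('abc',2,'aa'), ('abc',2,'cc'), ('dfg',1,'aa');
""")

# table1 drives the result; table2 and table3 only add columns where they match.
rows = conn.execute("""
SELECT t1.A, t1.B_key, t1.B_version, t1.C, t1.D, t2.E, t2.F, t3.G
FROM table1 t1
LEFT JOIN table2 t2 ON t2.A = t1.A
LEFT JOIN table3 t3 ON t3.B_key = t1.B_key AND t3.B_version = t1.B_version
ORDER BY t1.A, t1.B_version, t3.G
""").fetchall()

for row in rows:
    print(row)
```

On this data both versions of B_key abc survive, and the 789 row is kept with NULLs for the columns that have no match. If versions still disappear on the real data, the cause is probably something outside the joins shown here.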
If I understand correctly, you want all the b_keys and b_versions from table1 and table3, and then you want to bring in the other data. That suggests left joins:
select . . .
from ((select B_key, B_version
from table1
) union -- on purpose to remove duplicates
( select B_key, B_version
from table3
)
) bb left join
table1 t1
on t1.b_key = bb.b_key and
t1.b_version = bb.b_version left join
table2 t2
on t2.a = t1.a left join
table3 t3
on t3.b_key = bb.b_key and
t3.b_version = bb.b_version;
I have a requirement.
Sample data:
Table A :
ID name
1 cat
2 Dog
3 Bird
Table B :
ID name
1 aaa
1 bbb
2 ccc
2 ddd
Table C :
ID name
1 xxx
1 yyy
1 zzz
2 www
Required Output :
ID name name name
1 cat aaa xxx
1 cat bbb yyy
1 cat null zzz
2 Dog ccc www
2 Dog ddd www
3 Bird NULL NULL
I have tried different joins:
Select a.ID,a.name,b.name,c.name from #A a
full join #b b
on a.ID = b.ID
full join #c c
on b.ID = c.ID
Can anyone suggest the best way to proceed?
You can use the window function row_number to assign a sequence number within each id, in increasing order of name, for tables b and c, and then do a full join between them on id and that sequence number. Finally, left join the result to table a:
with b1 as (
select b.*, row_number() over (partition by id order by name) as rn
from b
),
c1 as (
select c.*, row_number() over (partition by id order by name) as rn
from c
)
select a.*, t.b_name, t.c_name
from a
left join (
select coalesce(b1.id, c1.id) as id,
b1.name as b_name,
c1.name as c_name
from b1
full join c1 on b1.id = c1.id
and b1.rn = c1.rn
) t on a.id = t.id;
This assumes that you need to join tables b and c based on id and position (in the order of the name column).
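The positional pairing can be tried on the sample data with the sketch below. It runs in Python's built-in sqlite3 module (an assumption; the #A-style names in the question suggest SQL Server), and because older SQLite builds lack FULL JOIN it swaps in a different technique: a hypothetical helper CTE called slots lists every (id, position) found in either table, and b1 and c1 are left joined to it. The pairing rule itself is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE a (id INT, name TEXT);
CREATE TABLE b (id INT, name TEXT);
CREATE TABLE c (id INT, name TEXT);
INSERT INTO a VALUES (1,'cat'), (2,'Dog'), (3,'Bird');
INSERT INTO b VALUES (1,'aaa'), (1,'bbb'), (2,'ccc'), (2,'ddd');
INSERT INTO c VALUES (1,'xxx'), (1,'yyy'), (1,'zzz'), (2,'www');
""")

rows = conn.execute("""
WITH b1 AS (
  SELECT id, name, ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS rn FROM b
), c1 AS (
  SELECT id, name, ROW_NUMBER() OVER (PARTITION BY id ORDER BY name) AS rn FROM c
), slots AS (  -- hypothetical helper: every (id, position) present in b1 or c1
  SELECT id, rn FROM b1 UNION SELECT id, rn FROM c1
)
SELECT a.id, a.name, b1.name, c1.name
FROM a
LEFT JOIN slots ON slots.id = a.id
LEFT JOIN b1 ON b1.id = slots.id AND b1.rn = slots.rn
LEFT JOIN c1 ON c1.id = slots.id AND c1.rn = slots.rn
ORDER BY a.id, slots.rn
""").fetchall()

for row in rows:
    print(row)
```

On this data the second Dog row pairs ddd with NULL, because table c has no second row for id 2. That follows directly from pairing by position.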
I have a query where I need to extract data in a regular table and put two rows of data into a single row.
I have rows that consist of
StudentID AUDIT_ACTN Audit_Date .....
aaa A 01/01/2010
aaa A 03/04/2011
aaa A 02/02/2013
aaa D 09/10/2010
aaa D 05/06/2011
aaa D 06/07/2013
aaa A 11/12/2014
bbb A 01/01/2010
bbb A 03/04/2011
bbb A 02/02/2013
bbb D 09/10/2010
bbb D 05/06/2011
bbb D 06/07/2013
bbb A 11/12/2014
I want output like this
StudentID AUDIT_ACTN Audit_Date StudentID AUDIT_ACTN Audit_Date
aaa A 01/01/2010 aaa D 09/10/2010
aaa A 03/04/2011 aaa D 05/06/2011
aaa A 02/02/2013 aaa D 06/07/2013
aaa A 11/12/2014 NULL NULL NULL
bbb A 01/01/2010 bbb D 09/10/2010
bbb A 03/04/2011 bbb D 05/06/2011
bbb A 02/02/2013 bbb D 06/07/2013
bbb A 11/12/2014 NULL NULL NULL
The A and D rows are related: A = add something to the record and D = delete something from the record (the something is an indicator). The sequence is logical in that you must add something before you can delete it, and you cannot add it twice without deleting it first.
My current script is probably going down the wrong track, but here goes:
select a.StudentId,a.Audit_Date,a.AUDIT_ACTN,d.StudentId,Audit_Date,d.AUDIT_ACTN,
from table a
join
(select *
from
(Select StudentId, Audit_Date,AUDIT_ACTN
from table b
Where b.AUDIT_ACTN='D'
order by Audit_Date
)
where rownum=1
) d on a.StudentId = d.StudentId
and a.AUDIT_ACTN='A'
and Select * from (Select Audit_Date
Order by a.StudentId, a.Audit_Date
I know this is wrong, but where do I go from here? If anyone can help and point me in the right direction, it would be appreciated.
My current attempts return zero rows; when I take out the rownum condition, I get a cross join returning 12 rows in this case.
thanks
Roger
I think you need a window function (ROW_NUMBER), like below:
select
A.StudentId,A.Audit_Date,A.AUDIT_ACTN,
D.StudentId,D.Audit_Date,D.AUDIT_ACTN
from
(Select StudentId, Audit_Date,AUDIT_ACTN,ROW_NUMBER()
OVER (PARTITION BY StudentId order by audit_date) rn
from table b
Where b.AUDIT_ACTN='D'
) D
FULL OUTER JOIN
(Select StudentId, Audit_Date,AUDIT_ACTN,ROW_NUMBER()
OVER (PARTITION BY StudentId order by audit_date) rn
from table b
Where b.AUDIT_ACTN='A'
) A
on A.rn = D.rn and A.StudentId = D.StudentId
I hope this will work
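Here is a small sketch of the same row-number pairing on the question's data, run through Python's built-in sqlite3 module (an assumption; the rownum in the question suggests Oracle). The table name audit is hypothetical, the dates are rewritten as ISO strings (read as DD/MM/YYYY, which is a guess) so they sort correctly, and a LEFT JOIN from the A rows replaces the FULL OUTER JOIN, relying on the question's rule that every delete is preceded by an add.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE audit (StudentID TEXT, AUDIT_ACTN TEXT, Audit_Date TEXT)")
adds = ["2010-01-01", "2011-04-03", "2013-02-02", "2014-12-11"]
dels = ["2010-10-09", "2011-06-05", "2013-07-06"]
for student in ("aaa", "bbb"):
    conn.executemany("INSERT INTO audit VALUES (?, 'A', ?)", [(student, d) for d in adds])
    conn.executemany("INSERT INTO audit VALUES (?, 'D', ?)", [(student, d) for d in dels])

# Number the A rows and the D rows separately per student, then pair equal numbers.
rows = conn.execute("""
WITH a AS (
  SELECT StudentID, Audit_Date,
         ROW_NUMBER() OVER (PARTITION BY StudentID ORDER BY Audit_Date) AS rn
  FROM audit WHERE AUDIT_ACTN = 'A'
), d AS (
  SELECT StudentID, Audit_Date,
         ROW_NUMBER() OVER (PARTITION BY StudentID ORDER BY Audit_Date) AS rn
  FROM audit WHERE AUDIT_ACTN = 'D'
)
SELECT a.StudentID, a.Audit_Date, d.Audit_Date
FROM a
LEFT JOIN d ON d.StudentID = a.StudentID AND d.rn = a.rn
ORDER BY a.StudentID, a.rn
""").fetchall()

for row in rows:
    print(row)
```

The fourth add for each student has no matching delete yet, so its D column comes back as NULL (None in Python).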
I want to display Order_Number, data1, data2, data3, the most recent (by date_changed) OtherData1, OtherData2, OtherData3, and date_changed. The problem is that I want just one line per order number, but I am getting multiple lines for each order number.
What I would love to get is
1, a, f,q, cc,ccc,abc, 12/2/2014, bob
3, c, b,h, aa,aaa,abc, 12/2/2014, bob
Thanks!
I was working with
SELECT
t.Order_Number,
cr.data1, cr.data2, cr.data3,
t.OtherData1, t.OtherData2, t.OtherData3,
x.date_changed, cr.name
FROM
(SELECT
Order_Number,
Max(date_changed) as date_changed
FROM
table2
GROUP BY
Order_Number) x
JOIN
table2 t ON x.date_changed = t.date_changed
LEFT JOIN
table1 cr ON x.Order_Number = cr.Order_Number
WHERE cr.name = 'bob'
Here are example tables.
Table1:
Order_Number data1 data2 data3 name
1 a f q bob
2 b g g john
3 c b h bob
4 d s j john
Table2:
Order_Number date_changed OtherData1 OtherData2 OtherData3
1 11/30/2014 aa aaa abc
1 12/1/2014 bb bbb def
1 12/2/2014 cc ccc abc
3 12/1/2014 dd aaa def
2 11/30/2014 dd bbb abc
2 12/1/2014 ss ccc def
3 12/2/2014 aa aaa abc
4 11/26/2014 fc wer wsd
Your Join to config_log (Table2) needs to include the entire composite key if you want to retrieve unique rows.
JOIN
conf_log t ON x.date_changed = t.date_changed
And x.Order_Number = t.Order_number
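The composite-key point can be checked on the question's sample tables. The sketch below is an illustration only: it uses Python's built-in sqlite3 module (an assumption, since the question does not name the database), keeps the question's table1 and table2 names, and stores the dates as ISO strings so that MAX picks the latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Order_Number INT, data1 TEXT, data2 TEXT, data3 TEXT, name TEXT);
CREATE TABLE table2 (Order_Number INT, date_changed TEXT,
                     OtherData1 TEXT, OtherData2 TEXT, OtherData3 TEXT);
INSERT INTO table1 VALUES
  (1,'a','f','q','bob'), (2,'b','g','g','john'),
  (3,'c','b','h','bob'), (4,'d','s','j','john');
INSERT INTO table2 VALUES
  (1,'2014-11-30','aa','aaa','abc'), (1,'2014-12-01','bb','bbb','def'),
  (1,'2014-12-02','cc','ccc','abc'), (3,'2014-12-01','dd','aaa','def'),
  (2,'2014-11-30','dd','bbb','abc'), (2,'2014-12-01','ss','ccc','def'),
  (3,'2014-12-02','aa','aaa','abc'), (4,'2014-11-26','fc','wer','wsd');
""")

# {extra} is filled with either nothing or the second half of the composite key.
QUERY = """
SELECT t.Order_Number, cr.data1, cr.data2, cr.data3,
       t.OtherData1, t.OtherData2, t.OtherData3, x.date_changed, cr.name
FROM (SELECT Order_Number, MAX(date_changed) AS date_changed
      FROM table2 GROUP BY Order_Number) x
JOIN table2 t ON x.date_changed = t.date_changed {extra}
LEFT JOIN table1 cr ON x.Order_Number = cr.Order_Number
WHERE cr.name = 'bob'
ORDER BY t.Order_Number
"""

date_only = conn.execute(QUERY.format(extra="")).fetchall()
composite = conn.execute(
    QUERY.format(extra="AND x.Order_Number = t.Order_Number")).fetchall()

print(len(date_only), "rows when joining on date_changed alone")
for row in composite:
    print(row)
```

Joining on date_changed alone lets order 1 match order 3's row from the same day (and vice versa), which is where the extra lines come from; with both columns in the join there is one line per order.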
I think you need two subqueries:
SELECT
Data.Order_Number,
Data.data1, Data.data2, Data.data3,
Data.OtherData1, Data.OtherData2, Data.OtherData3,
Data.date_changed, Data.name
FROM
(SELECT
Order_Number,
Max(date_changed) as date_changed
FROM
table2
GROUP BY
Order_Number) x
JOIN (SELECT
t.Order_Number,
cr.data1, cr.data2, cr.data3,
t.OtherData1, t.OtherData2, t.OtherData3,
t.date_changed, cr.name
FROM table2 t
JOIN table1 cr
ON t.Order_Number = cr.Order_Number) AS Data
ON x.date_changed = data.date_changed
AND x.Order_Number = data.Order_number
WHERE Data.name = 'bob'
The fact that you had cr.name in the WHERE clause means the LEFT JOIN had the same effect as a plain JOIN.
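A short sketch of that last point, again using Python's built-in sqlite3 module as an assumed local engine. The orders table is a hypothetical stand-in for the x subquery, and table1 is cut down to the two columns that matter here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (Order_Number INT);  -- hypothetical stand-in for subquery x
CREATE TABLE table1 (Order_Number INT, name TEXT);
INSERT INTO orders VALUES (1), (2), (3), (4);
INSERT INTO table1 VALUES (1,'bob'), (2,'john'), (3,'bob'), (4,'john');
""")

# Filter in WHERE: unmatched rows have cr.name = NULL and are thrown away.
filtered_in_where = conn.execute("""
SELECT x.Order_Number, cr.name
FROM orders x
LEFT JOIN table1 cr ON x.Order_Number = cr.Order_Number
WHERE cr.name = 'bob'
ORDER BY x.Order_Number
""").fetchall()

# Filter in ON: every order is kept, with NULL where no 'bob' row matches.
filtered_in_on = conn.execute("""
SELECT x.Order_Number, cr.name
FROM orders x
LEFT JOIN table1 cr ON x.Order_Number = cr.Order_Number AND cr.name = 'bob'
ORDER BY x.Order_Number
""").fetchall()

print(filtered_in_where)
print(filtered_in_on)
```

Which form is right depends on whether orders without a 'bob' row should appear at all; for the output requested in the question, dropping them is what is wanted.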
This one is super hard for me. I can join to the first matching result only, but if a second match exists I want the second row to take the 2nd result.
THIS IS MY TABLE A
ID NAME VALUE
1 A 123
2 B 456
3 C 789
4 A 456
TABLE B
BID BNAME BVALUE
1 A ABC
2 A CDE
3 B 845
4 C 1234
MY SELECT SQL:
SELECT * FROM A
CROSS APPLY (
SELECT TOP 1 *
FROM B
WHERE A.Name = B.BName
) BB
It returns
1 A 123 1 A ABC
2 B 456 3 B 845
3 C 789 4 C 1234
4 A 456 1 A ABC
Please help, I want this result:
1 A 123 1 A ABC
2 B 456 3 B 845
3 C 789 4 C 1234
4 A 456 2 A CDE
I will accept a temp table or any kind of query :(
Following the clarification in the comments that both tables will always have matching rows, you can number the rows per name in each table and join on that number:
WITH A
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ID) AS RN
FROM TableA),
B
AS (SELECT *,
ROW_NUMBER() OVER (PARTITION BY BNAME ORDER BY BID) AS RN
FROM TableB)
SELECT A.ID,
A.NAME,
A.VALUE,
B.BID,
B.BNAME,
B.BVALUE
FROM A
JOIN B
ON A.NAME = B.BNAME
AND A.RN = B.RN
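To see the pairing at work, here is a minimal sketch that runs the same query on the question's sample rows. It uses Python's built-in sqlite3 module instead of SQL Server (an assumption made so the example is self-contained); the TableA and TableB names come from the answer, and the values are stored as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ID INT, NAME TEXT, VALUE TEXT);
CREATE TABLE TableB (BID INT, BNAME TEXT, BVALUE TEXT);
INSERT INTO TableA VALUES (1,'A','123'), (2,'B','456'), (3,'C','789'), (4,'A','456');
INSERT INTO TableB VALUES (1,'A','ABC'), (2,'A','CDE'), (3,'B','845'), (4,'C','1234');
""")

# The n-th row for a name in TableA is matched to the n-th row for that name in TableB.
rows = conn.execute("""
WITH A AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY NAME ORDER BY ID) AS RN FROM TableA
), B AS (
  SELECT *, ROW_NUMBER() OVER (PARTITION BY BNAME ORDER BY BID) AS RN FROM TableB
)
SELECT A.ID, A.NAME, A.VALUE, B.BID, B.BNAME, B.BVALUE
FROM A
JOIN B ON A.NAME = B.BNAME AND A.RN = B.RN
ORDER BY A.ID
""").fetchall()

for row in rows:
    print(row)
```

The second A row (ID 4) now pairs with BID 2 instead of repeating BID 1. Because this is an inner join, a TableA row with no same-position partner in TableB would be dropped, which is why the clarification from the comments matters.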