SQL JOIN same table - sql

I have one table with "ID", "Sequence", "Status":
ID | Seq | Status
======================
10 | 001 | 010
10 | 002 | test
10 | 003 | 005
11 | 001 | 010
11 | 002 | 338
The result from my query should give me the complete table plus an extra column with the status for the highest sequence for the respective ID:
ID | Seq | Status | LStatus
======================
10 | 001 | 010 | 005
10 | 002 | test | 005
10 | 003 | 005 | 005
11 | 001 | 010 | 338
11 | 002 | 338 | 338
I have no clue how to do it. I started with something like this:
SELECT a.*, b.status as lstatus
FROM table a
left join (select top 1 b.status from table b order by b.seq DESC)
on a.id = b.id
Hope you can help me :)
Thanks in advance!!!

You should use GROUP BY with MAX in a subquery and join it to the base table:
SELECT a.*, t.lstatus
FROM my_table a
INNER JOIN (SELECT b.id,
MAX(b.status) AS lstatus
FROM my_table b
GROUP BY b.id) t ON t.id = a.id
To consider numeric values only:
SELECT a.*, t.lstatus
FROM my_table a
INNER JOIN (SELECT b.id,
MAX(b.status) AS lstatus
FROM my_table b
WHERE IsNumeric(b.status) = True
GROUP BY b.id) t ON t.id = a.id
Note that MAX(b.status) returns the largest status per id, which is not necessarily the status of the row with the highest seq.

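Try the below. The CTE keeps, for each id, the status of the row with the highest seq:
WITH last_status AS (
SELECT t.id, t.status FROM table t
WHERE t.seq = (SELECT MAX(t2.seq) FROM table t2 WHERE t2.id = t.id))
SELECT a.*, s.status AS LStatus FROM table a JOIN last_status s ON a.id = s.id;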

You can use ROW_NUMBER() to number the rows within each id, ordered by seq descending. Join that back to your table where the id matches and the row number rn = 1.
SELECT
a.*
, b.status AS LStatus
FROM table a
JOIN (SELECT *, ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq DESC) AS rn
FROM dbo.Table) b ON a.id = b.id AND b.rn = 1
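The ROW_NUMBER() join above can be tried end to end with a small harness. This is only a sketch: it assumes SQLite (3.25 or later, for window functions) through Python's sqlite3 module rather than the asker's actual database, and the table name status_log is hypothetical. The rows are the ones from the question.

```python
import sqlite3

# Hypothetical table name `status_log`; the rows are copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (id INTEGER, seq TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [(10, "001", "010"), (10, "002", "test"), (10, "003", "005"),
     (11, "001", "010"), (11, "002", "338")],
)

# Number the rows of each id by seq descending, then join the rn = 1 row
# (the one with the highest seq) back onto every row of the same id.
query = """
SELECT a.id, a.seq, a.status, b.status AS lstatus
FROM status_log a
JOIN (SELECT id, status,
             ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq DESC) AS rn
      FROM status_log) b
  ON a.id = b.id AND b.rn = 1
ORDER BY a.id, a.seq
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row of an id carries the status of that id's highest seq in the lstatus column.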

You can use a correlated subquery here, ordered by seq so that it picks the status of the highest sequence:
select *,
(select top 1 Status from table where id = t.id order by seq desc) as Lstatus
from table t;
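A correlated subquery of this kind can also be checked against the sample data. The sketch below assumes SQLite via Python's sqlite3, so it uses LIMIT 1 where SQL Server would use TOP 1, and it orders by seq because the question asks for the status of the highest sequence. The table name status_log is hypothetical.

```python
import sqlite3

# Hypothetical table name `status_log`; the rows are copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE status_log (id INTEGER, seq TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO status_log VALUES (?, ?, ?)",
    [(10, "001", "010"), (10, "002", "test"), (10, "003", "005"),
     (11, "001", "010"), (11, "002", "338")],
)

# For each outer row, look up the status of the highest seq with the same id.
query = """
SELECT t.id, t.seq, t.status,
       (SELECT s.status
        FROM status_log s
        WHERE s.id = t.id
        ORDER BY s.seq DESC
        LIMIT 1) AS lstatus
FROM status_log t
ORDER BY t.id, t.seq
"""
rows = conn.execute(query).fetchall()
lstatus_by_id = {row[0]: row[3] for row in rows}
print(lstatus_by_id)
```

Ordering the subquery by status instead of seq would return the largest status string, which for id 10 is not the status of the last sequence.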

Related

How to select only 1 row from ordered table for each ID?

This is my SQL code:
SELECT a.ID
, a.Date
, a.Value
, b.Alias
FROM NAV a
LEFT JOIN Portfolio b ON a.ID = b.ID
ORDER BY a.ID, a.Date DESC, b.Alias, a.Value
It gives me a table that looks something like this:
| ID | Date | Value | Alias |
|----|------|-------|-------|
| 1 | 2021 | 300 | A |
| 1 | 2020 | 200 | A |
| 1 | 2019 | 400 | A |
| 2 | 2021 | 800 | B |
| 2 | 2020 | 700 | B |
| 3 | 2021 | 600 | C |
| 3 | 2019 | 300 | C |
| 3 | 2018 | 500 | C |
I want to choose only the first row for each ID (the one with the most recent date). How would I go about doing that? Apologies for the basic question, I am new to SQL.
You can use row_number():
SELECT n.ID, n.Date, n.Value, p.Alias
FROM (SELECT n.*,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY Date DESC) as seqnum
FROM NAV n
) n LEFT JOIN
Portfolio p
ON p.ID = n.ID
WHERE seqnum = 1
ORDER BY n.ID;
Note:
Use meaningful table aliases instead of arbitrary letters.
I doubt a LEFT JOIN is needed. Are there really values of ID in NAV that are not in Portfolio?
ANSI standard SQL (assuming this returns the correct results):
Query the NAV table twice. First, get the latest date for each ID (derived table aliased as c). Then join c back to NAV on both ID and Date to get the matching a.Value.
SELECT c.ID, c.Date, a.Value, b.Alias
FROM (select Id, max(date) as Date from NAV group by Id) c
inner join Nav a on a.ID = c.Id and a.date = c.date
left join Portfolio b on b.id = a.id
ORDER BY c.ID, c.Date DESC, b.Alias, a.Value;
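To see the join-back on MAX(Date) at work, here is a minimal sketch that loads the sample rows into an in-memory SQLite database through Python's sqlite3. SQLite is an assumption made only so the example runs anywhere; the column types are guesses, since the question does not give them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the data is copied from the question's result table.
conn.execute("CREATE TABLE NAV (ID INTEGER, Date INTEGER, Value INTEGER)")
conn.execute("CREATE TABLE Portfolio (ID INTEGER, Alias TEXT)")
conn.executemany(
    "INSERT INTO NAV VALUES (?, ?, ?)",
    [(1, 2021, 300), (1, 2020, 200), (1, 2019, 400),
     (2, 2021, 800), (2, 2020, 700),
     (3, 2021, 600), (3, 2019, 300), (3, 2018, 500)],
)
conn.executemany("INSERT INTO Portfolio VALUES (?, ?)",
                 [(1, "A"), (2, "B"), (3, "C")])

# c holds the latest Date per ID; joining it back to NAV recovers the Value
# that belongs to that (ID, Date) pair.
query = """
SELECT c.ID, c.Date, a.Value, b.Alias
FROM (SELECT ID, MAX(Date) AS Date FROM NAV GROUP BY ID) c
JOIN NAV a ON a.ID = c.ID AND a.Date = c.Date
LEFT JOIN Portfolio b ON b.ID = a.ID
ORDER BY c.ID
"""
latest_rows = conn.execute(query).fetchall()
print(latest_rows)
```

If two NAV rows shared the same ID and latest Date, this form would return both of them, whereas the ROW_NUMBER() form returns exactly one.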

LEFT JOIN ON most recent date in Google BigQuery

I've got two tables, both with timestamps and some more data:
Table A
| name | timestamp | a_data |
| ---- | ------------------- | ------ |
| 1 | 2018-01-01 11:10:00 | a |
| 2 | 2018-01-01 12:20:00 | b |
| 3 | 2018-01-01 13:30:00 | c |
Table B
| name | timestamp | b_data |
| ---- | ------------------- | ------ |
| 1 | 2018-01-01 11:00:00 | w |
| 2 | 2018-01-01 12:00:00 | x |
| 3 | 2018-01-01 13:00:00 | y |
| 3 | 2018-01-01 13:10:00 | y |
| 3 | 2018-01-01 13:10:00 | z |
What I want to do is
For each row in Table A LEFT JOIN the most recent record in Table B that predates it.
When there is more than one possibility take the last one
Target Result
| name | timestamp | a_data | b_data |
| ---- | ------------------- | ------ | ------ |
| 1 | 2018-01-01 11:10:00 | a | w |
| 2 | 2018-01-01 12:20:00 | b | x |
| 3 | 2018-01-01 13:30:00 | c | z | <-- note z, not y
I think this involves a subquery, but I cannot get this to work in BigQuery. What I have so far:
SELECT a.a_data, b.b_data
FROM `table_a` AS a
LEFT JOIN `table_b` AS b
ON a.name = b.name
WHERE a.timestamp = (
SELECT max(timestamp) from `table_b` as sub
WHERE sub.name = b.name
AND sub.timestamp < a.timestamp
)
On my actual dataset, which is a very small test set (under 2Mb) the query runs but never completes. Any pointers much appreciated 👍🏻
You can try to use a select subquery.
SELECT a.*,(
SELECT MAX(b.b_data)
FROM `table_b` AS b
WHERE
a.name = b.name
and
b.timestamp < a.timestamp
) b_data
FROM `table_a` AS a
EDIT
Or you can use the ROW_NUMBER window function in a subquery.
SELECT name,timestamp,a_data , b_data
FROM (
SELECT a.*,b.b_data,ROW_NUMBER() OVER(PARTITION BY a.name ORDER BY b.timestamp desc,b.b_data desc) rn
FROM `table_a` AS a
LEFT JOIN `table_b` AS b ON a.name = b.name AND b.timestamp < a.timestamp
) t1
WHERE rn = 1
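The ROW_NUMBER approach can be reproduced outside BigQuery. The sketch below is an assumption-laden stand-in: it uses SQLite via Python's sqlite3, stores timestamps as ISO text (which sorts chronologically), partitions by both name and the A-side timestamp in case a name appears more than once in table_a, and breaks timestamp ties on b_data as a guess at what "take the last one" means. The row with name 4 is hypothetical and only shows that the LEFT JOIN keeps unmatched rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (name INTEGER, timestamp TEXT, a_data TEXT)")
conn.execute("CREATE TABLE table_b (name INTEGER, timestamp TEXT, b_data TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [
    (1, "2018-01-01 11:10:00", "a"),
    (2, "2018-01-01 12:20:00", "b"),
    (3, "2018-01-01 13:30:00", "c"),
    (4, "2018-01-01 09:00:00", "d"),  # hypothetical row with no match in table_b
])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?)", [
    (1, "2018-01-01 11:00:00", "w"),
    (2, "2018-01-01 12:00:00", "x"),
    (3, "2018-01-01 13:00:00", "y"),
    (3, "2018-01-01 13:10:00", "y"),
    (3, "2018-01-01 13:10:00", "z"),
])

# Pair each A row with every earlier B row of the same name, rank the pairs
# newest first, and keep only the top-ranked pair per A row.
query = """
SELECT name, timestamp, a_data, b_data
FROM (
    SELECT a.name, a.timestamp, a.a_data, b.b_data,
           ROW_NUMBER() OVER (
               PARTITION BY a.name, a.timestamp
               ORDER BY b.timestamp DESC, b.b_data DESC
           ) AS rn
    FROM table_a a
    LEFT JOIN table_b b
      ON a.name = b.name AND b.timestamp < a.timestamp
) t
WHERE rn = 1
ORDER BY name
"""
joined = conn.execute(query).fetchall()
for row in joined:
    print(row)
```

Keeping the timestamp condition in the ON clause rather than in WHERE is what preserves A rows that have no earlier B row.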
The query below is for BigQuery Standard SQL. It does not require listing every column from both tables, only name and timestamp, so it works for any number of columns on either side (assuming the tables share no column names other than those two).
#standardSQL
SELECT a.*, b.* EXCEPT (name, timestamp)
FROM (
SELECT
ANY_VALUE(a) a,
ARRAY_AGG(b ORDER BY b.timestamp DESC LIMIT 1)[SAFE_OFFSET(0)] b
FROM `project.dataset.table_a` a
LEFT JOIN `project.dataset.table_b` b
USING (name)
WHERE a.timestamp > b.timestamp
GROUP BY TO_JSON_STRING(a)
)
In BigQuery, arrays are often an efficient way to solve such problems:
SELECT a.a_data, b.b_data
FROM `table_a` a LEFT JOIN
(SELECT b.name,
ARRAY_AGG(b.b_data ORDER BY b.timestamp DESC LIMIT 1)[OFFSET(0)] as b_data
FROM `table_b` b
GROUP BY b.name
) b
ON a.name = b.name;
This is a common case where you can't just GROUP BY and take the minimum. I suggest the following:
SELECT *
FROM table_a as a inner join (SELECT name, min(timestamp) as timestamp
FROM table_b group by 1) as b
on (a.timestamp = b.timestamp and a.name = b.name)
This way you limit it only to the minimum present in Table b, as you specified.
You can also achieve that in a more readable way using a WITH clause:
WITH min_b as (
SELECT name,
min(timestamp) as timestamp
FROM table_b group by 1
)
SELECT *
FROM table_a as a inner join min_b
on (a.timestamp = min_b.timestamp and a.name = min_b.name)
Let me know if it worked!

If one condition is met by group by, filter it

I am trying to remove from the following table all ids that have at least one row with status = C. If an id has a C status, all of its rows should be filtered out of the result.
This is an example of my whole dataset (to illustrate the problem):
id | status
-------------
4567 | B
4567 | A
27 | A
27 | A
27 | C
9 | C
9 | B
Expected result
id | status
-------------
4567 | B
4567 | A
Try this
SELECT id,status FROM TABLE T
WHERE id NOT IN (SELECT id FROM TABLE T1 WHERE status ='C' )
Use not exists:
select t.*
from t
where not exists (select 1 from t t2 where t2.id = t.id and t2.status = 'C');
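Both filters can be compared on the sample data. This is a sketch under assumptions: SQLite via Python's sqlite3 stands in for the real database, and the table name items is hypothetical since the question does not name the table.

```python
import sqlite3

# Hypothetical table name `items`; the rows are copied from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [
    (4567, "B"), (4567, "A"),
    (27, "A"), (27, "A"), (27, "C"),
    (9, "C"), (9, "B"),
])

# Keep a row only if no row with the same id has status 'C'.
not_exists = conn.execute("""
    SELECT t.id, t.status
    FROM items t
    WHERE NOT EXISTS (SELECT 1 FROM items t2
                      WHERE t2.id = t.id AND t2.status = 'C')
    ORDER BY t.status DESC
""").fetchall()

# Same filter written with NOT IN.
not_in = conn.execute("""
    SELECT id, status
    FROM items
    WHERE id NOT IN (SELECT id FROM items WHERE status = 'C')
    ORDER BY status DESC
""").fetchall()

print(not_exists)
print(not_in)
```

On this data the two forms agree. They can differ when id is nullable, because NOT IN yields no rows at all if the subquery returns a NULL.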

Using CASE for a specific situation - How TO

I'm trying to find the proper SQL for the following situation:
Supposed we have two tables:
TABLE A
ID int,
TEXT varchar(200)
TABLE B
ID int,
A_NO int,
B_NO int
The fields named "ID" in both tables can be used to join the tables.
The following SQL:
SELECT
A.ID,
B.A_NO,
B.B_NO
FROM
A
LEFT JOIN
B
ON A.ID = B.ID
ORDER BY A.ID, B.A_NO, B.B_NO
gives the following results:
Now, the problem.
What is asked for is to have in the column B_NO a value = 1 for the MIN value of column A_NO and a value = 0 for all the other rows with the same B_NO value.
The results below are expected:
Please note that, in this example, we can find two rows for each B_NO value but it is possible to have more than 2 rows.
I have tried to reproduce these results by using a CASE but with no success.
Thanks for your help in advance,
Bouzouki.
Try this using a CTE and ROW_NUMBER():
Please note: for demo purposes, myT stands in for your joined query of tables A and B. Replace myT with your own A LEFT JOIN B ON A.ID = B.ID.
;with cte as (
select id, a_no, b_no,
row_number() over(partition by id,b_no order by a_no) rn
from myT
)
select id,a_no, case when rn=1 then b_no else 0 end b_no
from cte
order by a_no
--RESULTS FROM DEMO TABLE
| ID | A_NO | B_NO |
-------------------------
| 1031014 | 1 | 1 |
| 1031014 | 2 | 0 |
| 1031014 | 3 | 2 |
| 1031014 | 4 | 0 |
| 1031014 | 5 | 3 |
| 1031014 | 6 | 0 |
| 1031014 | 7 | 4 |
| 1031014 | 8 | 0 |
| 1031014 | 9 | 5 |
| 1031014 | 10 | 0 |
Something like:
select a.ID, b.a_no, b.b_no,
case when b.a_no = m.min_a_no then b.b_no else 0 end as new_b_no
from
a left join b on a.id = b.id left join
(Select a.ID, b.B_no, min(b.a_no) as min_a_no
from a left join b on a.id = b.id
group by a.id, b.b_no) m on a.id = m.id and b.b_no = m.b_no
ORDER BY a.ID, b.A_NO
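A third variant, not taken from either answer, compares each row's a_no with a windowed MIN over the same (id, b_no) group, which avoids both the CTE and the extra join. The sketch assumes SQLite 3.25 or later via Python's sqlite3. The input rows are an assumption, reconstructed from the first six rows of the demo results shown above, because the question's original data is not available; myT is the stand-in table name used in the first answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE myT (id INTEGER, a_no INTEGER, b_no INTEGER)")
# Assumed input: a_no 1..6, with two consecutive a_no values per b_no.
conn.executemany(
    "INSERT INTO myT VALUES (?, ?, ?)",
    [(1031014, a_no, (a_no + 1) // 2) for a_no in range(1, 7)],
)

# Keep b_no on the row holding the smallest a_no of its (id, b_no) group,
# and output 0 on every other row of that group.
query = """
SELECT id, a_no,
       CASE WHEN a_no = MIN(a_no) OVER (PARTITION BY id, b_no)
            THEN b_no ELSE 0 END AS b_no
FROM myT
ORDER BY a_no
"""
flagged = conn.execute(query).fetchall()
for row in flagged:
    print(row)
```

Unlike ROW_NUMBER(), this marks every row tied for the minimum a_no, which only matters if a_no can repeat within a group.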

Getting the most recent record of a group

I'm trying to find the most recent record of a group after doing an inner join.
Say I have the following two tables:
dateCreated | id
2011-12-27 | 1
2011-12-15 | 2
2011-12-17 | 6
2011-12-26 | 15
2011-12-15 | 18
2011-12-07 | 22
2011-12-09 | 23
2011-12-27 | 24
code | id
EFG | 1
ABC | 2
BCD | 6
BCD | 15
ABC | 18
BCD | 22
EFG | 23
EFG | 24
I want to display only the most recent of the groupings:
So the result would be:
dateCreated | code
2011-12-27 | EFG
2011-12-15 | ABC
2011-12-26 | BCD
I know this can be achieved using the max and group by functions, but I can't seem to get the desired result.
I think this should get you there:
select max(a.dateCreated) as dateCreated
, b.code
from table1 a
join table2 b on a.id = b.id
group by b.code
Assuming your tables are called a and b, try this:
select max(a.dateCreated) as dateCreated, b.code
from a join b on a.id = b.id
group by b.code
You can use analytic functions for this. This way, you still get only one row for every code, even if there are two rows with the same latest dateCreated (this may or may not be what you actually want as a result).
SELECT Code, dateCreated
FROM ( SELECT T2.Code, T1.dateCreated, ROW_NUMBER() OVER(PARTITION BY T2.Code ORDER BY T1.dateCreated DESC) Corr
FROM Table1 T1
INNER JOIN Table2 T2
ON T1.id = T2.id) A
WHERE Corr = 1
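The remark about ties can be made concrete by running the same query once with ROW_NUMBER() and once with RANK(). This is a sketch assuming SQLite 3.25 or later via Python's sqlite3. The id column is added to the output only to make the tied rows visible, and the names pos and ranked are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (dateCreated TEXT, id INTEGER);
CREATE TABLE table2 (code TEXT, id INTEGER);
INSERT INTO table1 VALUES
  ('2011-12-27', 1), ('2011-12-15', 2), ('2011-12-17', 6), ('2011-12-26', 15),
  ('2011-12-15', 18), ('2011-12-07', 22), ('2011-12-09', 23), ('2011-12-27', 24);
INSERT INTO table2 VALUES
  ('EFG', 1), ('ABC', 2), ('BCD', 6), ('BCD', 15),
  ('ABC', 18), ('BCD', 22), ('EFG', 23), ('EFG', 24);
""")

# Same query twice; only the ranking function differs.
template = """
SELECT code, dateCreated, id
FROM (SELECT t2.code, t1.dateCreated, t1.id,
             {fn}() OVER (PARTITION BY t2.code
                          ORDER BY t1.dateCreated DESC) AS pos
      FROM table1 t1
      JOIN table2 t2 ON t1.id = t2.id) ranked
WHERE pos = 1
ORDER BY code, id
"""
# ROW_NUMBER keeps exactly one row per code, picking arbitrarily among ties.
row_number_rows = conn.execute(template.format(fn="ROW_NUMBER")).fetchall()
# RANK gives tied rows the same position, so every tied row is returned.
rank_rows = conn.execute(template.format(fn="RANK")).fetchall()

print(row_number_rows)
print(rank_rows)
```

In the sample data ABC and EFG each have two ids sharing the latest date, so RANK() returns five rows where ROW_NUMBER() returns three.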