Last record of the same group - sql

I think there are similar questions but none of them really matches my case. I tried left outer join to get the latest status but then I can't get the latest record based on the parent id grouping.
I have the following tables
Document table
id    version name  parnet_id1  parent_id2  timestamp
UUI1  A             1           100         timestamp1
UUI2  B             1           100         timestamp2
UUI3  C             2           100         timestamp3
UUI4  D             2           100         timestamp4
UUI5  E             2           100         timestamp5
Document history table
document_id  status    timestamp
UUI1         Active    timestamp1
UUI1         Inactive  timestamp2
UUI2         InActive  timestamp3
UUI2         Active    timestamp4
UUI3         InActive  timestamp3
UUI3         Active    timestamp4
UUI4         InActive  timestamp3
UUI4         Active    timestamp4
UUI5         Active    timestamp3
UUI5         Inactive  timestamp4
What query gives me the following table (grouped by parnet_id1 and parent_id2)?
Documents that have the same parnet_id1 and parent_id2 are different versions of the same document, so we are only interested in the latest version based on timestamp. We also need each document's latest status from the history table, again based on timestamp (one-to-many).
id    version name  parnet_id1  parent_id2  timestamp   status
UUI2  B             1           100         timestamp2  Active
UUI5  E             2           100         timestamp5  Inactive

Using inner join and group by, we can do it as follows:
select h3.*, h2.status
from history h2
inner join (
    select t.id, t.version_name, t.parnet_id1, t.parent_id2, max(h.timestamp) as timestamp
    from history h
    inner join (
        select d.id, d.version_name, d.parnet_id1, d.parent_id2
        from document d
        inner join (
            select parnet_id1, parent_id2, max(timestamp) as timestamp
            from document
            group by parnet_id1, parent_id2
        ) as s on s.parnet_id1 = d.parnet_id1 and s.parent_id2 = d.parent_id2 and s.timestamp = d.timestamp
    ) t on t.id = h.document_id
    group by t.id, t.version_name, t.parnet_id1, t.parent_id2
) as h3 on h2.document_id = h3.id and h3.timestamp = h2.timestamp
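For comparison, here is a minimal sketch of the same result using a window function instead of nested group-by joins. It runs against an in-memory SQLite database (window functions need SQLite 3.25 or newer) and is only an illustration, not the answer's method. The integer timestamps stand in for timestamp1..timestamp5, and the function name latest_documents is hypothetical. Because it uses an inner join, a document with no history rows would be skipped.

```python
import sqlite3

# Sample data from the question; integer timestamps are placeholders (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE document (id TEXT, version_name TEXT, parnet_id1 INT, parent_id2 INT, timestamp INT);
CREATE TABLE history (document_id TEXT, status TEXT, timestamp INT);
INSERT INTO document VALUES
  ('UUI1','A',1,100,1), ('UUI2','B',1,100,2),
  ('UUI3','C',2,100,3), ('UUI4','D',2,100,4), ('UUI5','E',2,100,5);
INSERT INTO history VALUES
  ('UUI1','Active',1), ('UUI1','Inactive',2),
  ('UUI2','InActive',3), ('UUI2','Active',4),
  ('UUI3','InActive',3), ('UUI3','Active',4),
  ('UUI4','InActive',3), ('UUI4','Active',4),
  ('UUI5','Active',3), ('UUI5','Inactive',4);
""")


def latest_documents(conn):
    """Return the newest document per parent pair with its newest status."""
    query = """
    SELECT id, version_name, parnet_id1, parent_id2, timestamp, status
    FROM (
        SELECT d.*, h.status,
               ROW_NUMBER() OVER (
                   PARTITION BY d.parnet_id1, d.parent_id2
                   -- newest document first, then its newest history row
                   ORDER BY d.timestamp DESC, h.timestamp DESC
               ) AS rn
        FROM document d
        JOIN history h ON h.document_id = d.id
    )
    WHERE rn = 1
    ORDER BY parnet_id1, parent_id2
    """
    return conn.execute(query).fetchall()


for row in latest_documents(conn):
    print(row)
```

Ordering by the document timestamp first and the history timestamp second lets a single ROW_NUMBER() pick both the latest version and its latest status.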

Related

How to retrieve historical data based on condition on one row?

I have a table historical_data
ID  Date        column_a  column_b
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
2   2011-01-12  q         q1
2   2011-02-01  d         d1
3   2011-11-01  s         s1
I need to retrieve the whole history of an id based on the date condition on any 1 row related to that ID.
date>='2011-11-01' should get me
ID  Date        column_a  column_b
1   2011-10-01  a         a1
1   2011-11-01  w         w1
1   2011-09-01  a         a1
3   2011-11-01  s         s1
I am aware you can get this by using a CTE or a subquery like
with selected_id as (
select id from historical_data where date>='2011-11-01'
)
select hd.* from historical_data hd
inner join selected_id si on hd.id = si.id
or
select * from historical_data
where id in (select id from historical_data where date>='2011-11-01')
In both of these methods I have to query/scan the table historical_data twice.
I have indexes on both id and date so it's not a problem right now, but as the table grows this may cause issues.
The table above is a sample table, the table I have is about to touch 1TB in size with upwards of 600M rows.
Is there any way to achieve this by only querying the table once? (I am using Snowflake)
Using QUALIFY:
SELECT *
FROM historical_data
QUALIFY MAX(date) OVER(PARTITION BY id) >= '2011-11-01'::DATE;
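QUALIFY is not available in every engine. As a rough sketch of the same idea elsewhere, the windowed MAX can be computed in a derived table and filtered in the outer query. The example below uses an in-memory SQLite database purely for illustration (assumptions: SQLite 3.25+, ISO date strings stored as text, and a hypothetical function name history_for_recent_ids); it says nothing about how Snowflake plans the query.

```python
import sqlite3

# Sample rows from the question, dates stored as ISO text (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE historical_data (id INT, date TEXT, column_a TEXT, column_b TEXT);
INSERT INTO historical_data VALUES
  (1,'2011-10-01','a','a1'), (1,'2011-11-01','w','w1'), (1,'2011-09-01','a','a1'),
  (2,'2011-01-12','q','q1'), (2,'2011-02-01','d','d1'),
  (3,'2011-11-01','s','s1');
""")


def history_for_recent_ids(conn, cutoff):
    """Return every row of each id whose newest date is on or after cutoff."""
    query = """
    SELECT id, date, column_a, column_b
    FROM (
        -- attach the per-id maximum date to every row of that id
        SELECT hd.*, MAX(date) OVER (PARTITION BY id) AS max_date
        FROM historical_data hd
    )
    WHERE max_date >= ?
    ORDER BY id, date
    """
    return conn.execute(query, (cutoff,)).fetchall()


for row in history_for_recent_ids(conn, "2011-11-01"):
    print(row)
```

The table is referenced only once; the window function carries the per-id maximum onto each row so no self-join or IN subquery is needed.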

Selecting values in columns based on other columns

I have two tables, info and transactions.
info looks like this:
customer ID Postcode
1 ABC 123
2 DEF 456
and transactions looks like this:
customer ID day frequency
1 1/1/12 3
1 3/5/12 4
2 4/6/12 2
3 9/9/12 1
I want to know which day has the highest frequency for each postcode.
I know how to reference two different tables, but I'm not sure how to select multiple columns based on the values of other columns.
The output should be something like this:
customer ID postcode day frequency
1 ABC 123 3/5/12 4
2 DEF 456 4/6/12 2
3 GHI 789 9/9/12 1
and so on.
You can filter with a correlated subquery:
select
i.*,
t.day,
t.frequency
from info i
inner join transactions t on t.customerID = i.customerID
where t.frequency = (
select max(t1.frequency)
from info i1
inner join transactions t1 on t1.customerID = i1.customerID
where i1.postcode = i.postcode
)
Or, if your RDBMS supports window functions, you can use rank():
select *
from (
select
i.*,
t.day,
t.frequency,
rank() over(partition by i.postcode order by t.frequency desc) as rn
from info i
inner join transactions t on t.customerID = i.customerID
) t
where rn = 1
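A small runnable sketch of the rank() approach, using an in-memory SQLite database (3.25+) as a stand-in for whatever RDBMS is in use. The third info row ('GHI 789') is an assumption taken from the expected output, since the sample info table only lists two customers; the function name busiest_day_per_postcode is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE info (customerID INT, postcode TEXT);
CREATE TABLE transactions (customerID INT, day TEXT, frequency INT);
-- customer 3 is assumed from the expected output in the question
INSERT INTO info VALUES (1,'ABC 123'), (2,'DEF 456'), (3,'GHI 789');
INSERT INTO transactions VALUES
  (1,'1/1/12',3), (1,'3/5/12',4), (2,'4/6/12',2), (3,'9/9/12',1);
""")


def busiest_day_per_postcode(conn):
    """Return the highest-frequency day(s) for each postcode."""
    query = """
    SELECT customerID, postcode, day, frequency
    FROM (
        SELECT i.customerID, i.postcode, t.day, t.frequency,
               RANK() OVER (PARTITION BY i.postcode
                            ORDER BY t.frequency DESC) AS rn
        FROM info i
        JOIN transactions t ON t.customerID = i.customerID
    )
    WHERE rn = 1
    ORDER BY customerID
    """
    return conn.execute(query).fetchall()


for row in busiest_day_per_postcode(conn):
    print(row)
```

Note that rank() keeps ties: if two days share the top frequency for a postcode, both rows come back. Use row_number() instead if exactly one row per postcode is required.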

how to get value using latest date from one table and joining to another table

I have one table, inventory_movement. Here is the data in the table:
product_id | staff_name | status | sum | reference_number
--------------------------------------------------
1 zes cp 1 000122
2 shan cp 4 000133
I have another table, inventory_orderproduct, which holds the cost and order date:
orderdate product_id cost
--------------------------------
01/11/2018 1 3200
01/11/2018 2 100
02/11/2018 1 4000
02/11/2018 1 500
03/11/2018 2 2000
I want this result:
product_id| staff_name | status | sum | reference_number | cost
--------------------------------------------------------------
1 zes cp 1 000122 4000
2 shan cp 4 000133 2000
Here is my query:
select ipm.product_id,
case when ipm.order_by_id is not null then
(select au.first_name from users_staffuser us inner join auth_user au on us.user_id= au.id
where us.id = ipm.order_by_id) else '0' end as "Staff_name"
,ipm.status,
Sum(ipm.quantity), ip.reference_number
from inventory_productmovement ipm
inner join inventory_product ip on ipm.product_id = ip.id
inner join users_staffuser us on ip.branch_id = us.branch_id
inner join auth_user au on us.user_id = au.id
AND ipm.status = 'CP'
group by ipm.product_id, au.first_name, ipm.status,
ip.reference_number, ip.product_name
order by 1
Here is a solution; it returns the highest cost per product:
SELECT i.product_id,i.staff_name,i.status,i.sum,i.reference_number,s.Cost
FROM (SELECT product_id,MAX(cost) AS Cost
FROM inventory_orderproduct
GROUP BY product_id ) s
JOIN inventory_movement i ON i.product_id =s.product_id
In the given situation, this should work fine:
Select table2.product_id, table2.staff_name, table2.status, table2.reference_number,
MAX(table1.cost)
FROM table2
LEFT JOIN table1 ON table1.product_id = table2.product_id
GROUP BY table2.product_id, table2.staff_name, table2.status, table2.reference_number
You can use the query below to get the MAX cost for each product:
SELECT i.product_id,i.staff_name,i.status,i.sum,i.reference_number,s.MAXCost
FROM (SELECT product_id,MAX(cost) AS MAXCost
FROM inventory_orderproduct
GROUP BY product_id ) s
JOIN inventory_movement i ON i.product_id =s.product_id
To retrieve the cost for the latest date, use the query below:
WITH cte as (
SELECT product_id,cost
,ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY orderdate DESC) AS Rno
FROM inventory_orderproduct )
SELECT i.product_id,i.staff_name,i.status,i.sum,i.reference_number,s.Cost
FROM cte s
JOIN inventory_movement i ON i.product_id =s.product_id
WHERE s.Rno=1
You can use the query below; it picks the cost for the latest date:
WITH result as (
SELECT product_id,cost
,ROW_NUMBER() OVER (PARTITION BY product_id ORDER BY orderdate DESC) AS rno
FROM inventory_orderproduct )
SELECT i.product_id,i.staff_name,i.status,i.sum,i.reference_number,s.cost
FROM result s
JOIN inventory_movement i ON i.product_id =s.product_id
WHERE s.rno = 1
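To see the latest-date approach end to end, here is a hedged sketch on an in-memory SQLite database (3.25+). Two things are assumptions, not part of the question: the dates are stored as ISO text so that they sort chronologically, and ties on the same date (product 1 has two costs on 02/11/2018) are broken by taking the higher cost, which happens to match the expected output. The function name latest_cost_per_product is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE inventory_movement (
    product_id INT, staff_name TEXT, status TEXT, "sum" INT, reference_number TEXT);
CREATE TABLE inventory_orderproduct (orderdate TEXT, product_id INT, cost INT);
INSERT INTO inventory_movement VALUES
  (1,'zes','cp',1,'000122'), (2,'shan','cp',4,'000133');
-- dd/mm/yyyy dates from the question rewritten as ISO text (assumption)
INSERT INTO inventory_orderproduct VALUES
  ('2018-11-01',1,3200), ('2018-11-01',2,100),
  ('2018-11-02',1,4000), ('2018-11-02',1,500),
  ('2018-11-03',2,2000);
""")


def latest_cost_per_product(conn):
    """Join each movement row to the cost from its product's newest order."""
    query = """
    WITH ranked AS (
        SELECT product_id, cost,
               ROW_NUMBER() OVER (
                   PARTITION BY product_id
                   -- tie-breaker on cost is an assumption, see lead-in
                   ORDER BY orderdate DESC, cost DESC
               ) AS rno
        FROM inventory_orderproduct
    )
    SELECT i.product_id, i.staff_name, i.status, i."sum",
           i.reference_number, r.cost
    FROM inventory_movement i
    JOIN ranked r ON r.product_id = i.product_id
    WHERE r.rno = 1
    ORDER BY i.product_id
    """
    return conn.execute(query).fetchall()


for row in latest_cost_per_product(conn):
    print(row)
```

Without an explicit tie-breaker, ROW_NUMBER() may pick either row when two orders share the latest date, so it is worth deciding which one you want.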

MaxMin Function within Select Statement SQL 2012

I am having a few issues making a MAX function work within the select statement. See the example data below:
Table 1
Visit_ID  Car_ID
A         1
B         2
C         1
D         3
Table 2
Move_ID  Visit_ID  MoveStartDate  MoveEndDate
1        A         25/07/2016     27/07/2016
2        A         28/07/2016     28/07/2016
3        B         19/07/2016     22/07/2016
4        D         28/06/2016     30/06/2016
I would like my select statement to pick the min start date and the max end date for each Visit_ID, so I would expect:
Result
Visit_ID Car_ID StartDate EndDate
A 1 25/07/2016 28/07/2016
B 2 19/07/2016 22/07/2016
So far I have tried the following (I already have inner joins in my select statement):
,(MAX (EndDate) WHERE Visit.Visit_ID = Move.Visit_ID) AS End Date
I have looked at some other queries with a second select statement within the select so you end up with something like:
Select Visit_ID, Car_ID ,(Select MAX(EndDate) FULL OUTER JOIN Table 2 ON Table 1.Visit_ID = Table 2.Visit_ID Group By Table 1.Visit_ID) AS End Date
I hope I have provided enough info; I am currently stumped.
If you also want Car_ID = 3 in the result:
select t1.Visit_ID, t1.Car_ID, MIN(MoveStartDate), MAX(MoveEndDate)
from table1 t1
join table2 t2 on t1.Visit_ID = t2.Visit_ID
group by t1.Visit_ID, t1.Car_ID
Returns:
SQL>select t1.Visit_ID, t1.Car_ID, MIN(MoveStartDate), MAX(MoveEndDate)
SQL&from table1 t1
SQL& join table2 t2 on t1.Visit_ID = t2.Visit_ID
SQL&group by t1.Visit_ID, t1.Car_ID;
visit_id car_id
======== =========== ==================== ====================
A 1 25/07/2016 28/07/2016
B 2 19/07/2016 22/07/2016
D 3 28/06/2016 30/06/2016
3 rows found
I have not tested it, but you can try this (grouping by Visit_ID only, since grouping by Move_ID as well would return one row per move):
WITH cte AS (
    select Visit_ID, min(MoveStartDate) AS mMS, MAX(MoveEndDate) AS mME
    FROM Table_2
    GROUP BY Visit_ID
)
SELECT c.Visit_ID, T1.Car_ID, c.mMS, c.mME
FROM Table_1 as T1
JOIN cte as c ON c.Visit_ID = T1.Visit_ID
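A quick way to check the aggregation is to run it on the sample data. The sketch below does that with an in-memory SQLite database rather than SQL Server, so treat it as an illustration only. The dates are rewritten as ISO text so that MIN and MAX compare chronologically (an assumption; real date columns would not need this), and the function name visit_date_ranges is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (Visit_ID TEXT, Car_ID INT);
CREATE TABLE table2 (Move_ID INT, Visit_ID TEXT, MoveStartDate TEXT, MoveEndDate TEXT);
INSERT INTO table1 VALUES ('A',1), ('B',2), ('C',1), ('D',3);
-- dd/mm/yyyy dates from the question rewritten as ISO text (assumption)
INSERT INTO table2 VALUES
  (1,'A','2016-07-25','2016-07-27'),
  (2,'A','2016-07-28','2016-07-28'),
  (3,'B','2016-07-19','2016-07-22'),
  (4,'D','2016-06-28','2016-06-30');
""")


def visit_date_ranges(conn):
    """Return earliest start and latest end date for each visit that has moves."""
    query = """
    SELECT t1.Visit_ID, t1.Car_ID, m.StartDate, m.EndDate
    FROM table1 t1
    JOIN (
        -- one row per visit: aggregate the moves before joining
        SELECT Visit_ID,
               MIN(MoveStartDate) AS StartDate,
               MAX(MoveEndDate)   AS EndDate
        FROM table2
        GROUP BY Visit_ID
    ) m ON m.Visit_ID = t1.Visit_ID
    ORDER BY t1.Visit_ID
    """
    return conn.execute(query).fetchall()


for row in visit_date_ranges(conn):
    print(row)
```

Visit C has no moves, so the inner join leaves it out; a left join would keep it with NULL dates.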

How can I avoid a Cartesian product in SQL when joining multiple tables

Here is my sqlfiddle http://sqlfiddle.com/#!3/671c8/1.
Here are my tables:
Person
PID LNAME FNAME
1 Bob Joe
2 Smith John
3 Johnson Jake
4 Doe Jane
Table1
PID VALUE
1 3
1 5
1 35
2 10
2 15
3 8
Table2
PID VALUE
1 X1
1 X2
1 X3
2 Z1
3 X3
I am trying to join several tables on a person's ID. These tables contain events with dates, but the dates may or may not match across tables. What I really want is to join the tables, regardless of date, so that the table with the most rows determines the number of rows in my result and all other tables "fit" within it. For example:
Instead of this, which is a Cartesian product:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 3 X2
1 Bob Joe 3 X3
1 Bob Joe 5 X1
1 Bob Joe 5 X2
1 Bob Joe 5 X3
1 Bob Joe 35 X1
1 Bob Joe 35 X2
1 Bob Joe 35 X3
I would like something like this:
PID LNAME FNAME THINGONE THINGTWO
1 Bob Joe 3 X1
1 Bob Joe 5 X2
1 Bob Joe 35 X3
My sql statement:
SELECT
p.*,
t1.value as thingone,
t2.value as thingtwo
FROM
person p
left outer join table1 t1 on p.pid=t1.pid
left outer join table2 t2 on p.pid=t2.pid
;
I can't fathom why you want to do this, but...
You need to create an artificial join between table1 and table2, and then link that to the master table. One way of doing that is by ranking the rows in order. eg:
SELECT
    p.pid, p.lname, p.fname, thingone, thingtwo
FROM
    person p
    left outer join
    (
        select ISNULL(t1.pid, t2.pid) as pid, t1.value as thingone, t2.value as thingtwo
        from
            (select *, ROW_NUMBER() over (partition by pid order by value) rn
             from table1) t1
            full outer join
            (select *, ROW_NUMBER() over (partition by pid order by value) rn
             from table2) t2
            on t1.pid=t2.pid and t1.rn=t2.rn
    ) v
    on p.pid = v.pid
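To make the intent of the row-number pairing concrete, here is a small plain-Python sketch of what the query above computes: for each person, sort both value lists and pair them by position, padding the shorter list with None (the counterpart of NULL from the full outer join). This only illustrates the logic and is not a replacement for the SQL; the function name pair_by_position is hypothetical.

```python
from itertools import zip_longest

# Sample data from the question.
persons = [(1, "Bob", "Joe"), (2, "Smith", "John"),
           (3, "Johnson", "Jake"), (4, "Doe", "Jane")]
table1 = [(1, 3), (1, 5), (1, 35), (2, 10), (2, 15), (3, 8)]
table2 = [(1, "X1"), (1, "X2"), (1, "X3"), (2, "Z1"), (3, "X3")]


def pair_by_position(persons, table1, table2):
    """Pair each person's table1 and table2 values by their rank within the person."""
    rows = []
    for pid, lname, fname in persons:
        # sorted() plays the role of ROW_NUMBER() ... ORDER BY value
        ones = sorted(v for p, v in table1 if p == pid)
        twos = sorted(v for p, v in table2 if p == pid)
        # zip_longest mimics the full outer join on (pid, rn);
        # a person with no rows at all still yields one all-None pair,
        # like the left outer join from person
        pairs = list(zip_longest(ones, twos)) or [(None, None)]
        for one, two in pairs:
            rows.append((pid, lname, fname, one, two))
    return rows


for row in pair_by_position(persons, table1, table2):
    print(row)
```

The number of rows per person equals the length of the longer list, which is the behaviour the question asks for.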
This is a trickier problem than I thought. The challenge is being sure that all the records appear, regardless of the lengths of the two lists. The following works by enumerating each of the lists and using that for the join conditions:
SELECT p.*,
       t1.value as thingone,
       t2.value as thingtwo
FROM person p left outer join
     (select t1.*,
             row_number() over (partition by pid order by pid) as seqnum,
             count(*) over (partition by pid) as cnt
      from table1 t1
     ) t1
     on p.pid = t1.pid left outer join
     (select t2.*, row_number() over (partition by pid order by pid) as seqnum,
             count(*) over (partition by pid) as cnt
      from table2 t2
     ) t2
     on p.pid = t2.pid
WHERE t1.seqnum = t2.seqnum or
      (t2.seqnum > t1.cnt) or
      (t1.seqnum > t2.cnt) or
      t1.seqnum is null or
      t2.seqnum is null;
Here is a slight modification to your SQL Fiddle that has better test data.
EDIT:
The logic in the where clause handles these cases (in the order of the clauses):
Where both lists have sequence numbers, these must match.
Where list2 is longer and list1 has at least one element.
Where list1 is longer and list2 has at least one element.
Where list1 is empty.
Where list2 is empty.
These were arrived at by trial and error, because the original condition did not work:
on p.pid = t2.pid and t1.seqnum = t2.seqnum
This returns NULL values for p.pid for the extra elements on the list. Podliuska's approach may also work; I had just started down this path and the where conditions do the trick.