select the latest date from joins - sql

I am having a problem with a join: I am getting duplicate records. Please help me resolve it.
This is my query:
select [RIM].requestinstanceid from
requestidt [RIM]
inner join requestcdt [RCDT] on [RIM].requestinstanceid = [RCDT].requestinstanceid
left join requestcmt [RCMT] on [RCDT].requestcommentid = [RCMT].requestcommentid
inner join requestddt [RDDT] on [RDDT].requestinstanceid = [RIM].requestinstanceid
left join requestdmt [RDMT] on [RDMT].requestdocumentid = [RDDT].requestdocumentid
I am getting result like this:
requestinstanceid
184
184
386
389
389
397
I should not get duplicate records, and I want to get the latest date for each record.

This code joins to the 'top 1' of a derived table. You should be able to work out how it could be applied to join to the Top 1 of sub-query sorted by date DESC.
SELECT T2.TempEmailID, T1.EmailID
FROM tbl1 T1
LEFT JOIN (SELECT *, RANK() Over (Partition By EmailID Order By TempEmailID DESC) as TopOne FROM tbl2) T2 ON T1.EmailID = T2.EmailID AND TopOne = 1
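As a rough sketch of how that could look when sorting by a date, here is a small runnable example using Python's built-in sqlite3 module. The tables request and request_comment and the column commentdate are made up for illustration (the question does not show its date column), and ROW_NUMBER() is swapped in for RANK() so that ties on the date cannot produce two "top" rows.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE request (requestinstanceid INTEGER PRIMARY KEY);
    CREATE TABLE request_comment (
        requestinstanceid INTEGER,
        commentdate TEXT  -- hypothetical column; ISO dates sort correctly as text
    );
    INSERT INTO request VALUES (184), (386), (389);
    INSERT INTO request_comment VALUES
        (184, '2018-01-05'), (184, '2018-03-01'),
        (386, '2018-02-10'),
        (389, '2018-01-20'), (389, '2018-04-15');
""")

# Number the comments per request, newest first, and join only to row 1.
latest_rows = conn.execute("""
    SELECT r.requestinstanceid, c.commentdate
    FROM request r
    LEFT JOIN (
        SELECT requestinstanceid, commentdate,
               ROW_NUMBER() OVER (PARTITION BY requestinstanceid
                                  ORDER BY commentdate DESC) AS rn
        FROM request_comment
    ) c ON c.requestinstanceid = r.requestinstanceid AND c.rn = 1
    ORDER BY r.requestinstanceid
""").fetchall()

print(latest_rows)
```

Each request id comes back once, paired with its most recent date, instead of once per matching comment.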

Related

Join same column from multiple tables

Below is my current code. I'm not sure what the best way is to amend this to give me the results I need.
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT,
T4.MARKER,
T5.E_DTE,
T5.E_TME,
T5.E_PST_DTE,
T5.E_AMT,
T5.E_NAR_O,
T5.E_NAR_T
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
LEFT JOIN E_Base.APF T4
ON T3.M_ID = T4.M_ID
AND MARKER = 54
LEFT JOIN U_DB.TEH_201804 T5
ON T2.M_ID = T5.M_ID
AND T1.DOFS_DATE = T5.E_PST_DTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY T2.M_ID ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1
The above code works. However, it is the final left join on T5 where I need help.
In T1 each M_ID has been assigned its own DOFS_DATE, which could be any date within the year, and I want the data from T5 (U_DB.TEH_201804) for the matching date. However, U_DB.TEH_201804 relates only to April 2018. There are 12 tables within the same database (201804, 201805, 201806 etc.) that all have exactly the same columns but relate to a different month within the year.
Ideally, I want to left join the columns from T5 once but search all 12 tables within the database to bring back the data where the dates correspond.
I was thinking UNION but am unsure how to work this in.
Any help would be greatly appreciated!
Thanks
You could change the part of your code that relates to table T5 into a left join on a subquery that selects the UNION ALL of all the tables you need (I have named the subquery TT):
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT,
T4.MARKER,
TT.E_DTE,
TT.E_TME,
TT.E_PST_DTE,
TT.E_AMT,
TT.E_NAR_O,
TT.E_NAR_T
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
LEFT JOIN E_Base.APF T4
ON T3.M_ID = T4.M_ID
AND MARKER = 54
LEFT JOIN (
select *
FROM U_DB.TEH_201804
UNION ALL
select *
FROM U_DB.TEH_201805
UNION ALL
select *
FROM U_DB.TEH_201806
UNION ALL
select *
FROM U_DB.TEH_201807
UNION ALL
.....
) TT ON T2.M_ID = TT.M_ID
AND T1.DOFS_DATE = TT.E_PST_DTE
QUALIFY ROW_NUMBER() OVER (PARTITION BY T2.M_ID ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1
It's hard to tell without additional details like explain and QueryLog step data.
Based on @scaisEdge's answer:
You can try to move the first two joins into a Derived Table to apply the ROW_NUMBER early (possible because you do Outer Joins only):
SELECT
dt.*,
T4.MARKER,
TT.E_DTE,
TT.E_TME,
TT.E_PST_DTE,
TT.E_AMT,
TT.E_NAR_O,
TT.E_NAR_T
FROM
(
SELECT
T1.SC,
T1.AN,
T1.DOFS_DATE,
T2.M_ID,
T3.OPDT
FROM E_Base.AR_MyTable T1
LEFT JOIN E_Base.Translation T2
ON T1.SC = T2.SC
AND T1.AN = T2.AN
LEFT JOIN E_Base.BA T3
ON T2.M_ID = T3.M_ID
QUALIFY Row_Number()
Over (PARTITION BY T2.M_ID
ORDER BY T2.ID_END_DATE DESC, T3.E_END_DATE DESC) = 1
) AS dt
LEFT JOIN E_Base.APF T4
ON dt.M_ID = T4.M_ID
AND MARKER = 54
LEFT JOIN
(
SELECT *
FROM U_DB.TEH_201804
UNION ALL
SELECT *
FROM U_DB.TEH_201805
UNION ALL
SELECT *
FROM U_DB.TEH_201806
UNION ALL
SELECT *
FROM U_DB.TEH_201807
UNION ALL
.....
) TT
ON dt.M_ID = TT.M_ID
AND dt.DOFS_DATE = TT.E_PST_DTE
It might also help the optimizer to provide additional info about the data ranges. Those tables should have CHECK constraints to tell the optimizer that they contain only data from a single month. If the constraints don't exist, try adding a WHERE condition to each SELECT, e.g. WHERE E_PST_DTE BETWEEN DATE '2018-04-01' AND DATE '2018-04-30'.
Of course, always check Explain if the plan actually changes...
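To show the shape of the UNION ALL derived table with a per-month WHERE condition, here is a minimal sketch using Python's built-in sqlite3 module. It runs on SQLite rather than Teradata, so it says nothing about how the Teradata optimizer will behave. The table base and the sample rows are made up, and only two of the monthly tables are included.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical stand-ins for the joined base data and two monthly tables.
    CREATE TABLE base (m_id INTEGER, dofs_date TEXT);
    CREATE TABLE teh_201804 (m_id INTEGER, e_pst_dte TEXT, e_amt REAL);
    CREATE TABLE teh_201805 (m_id INTEGER, e_pst_dte TEXT, e_amt REAL);
    INSERT INTO base VALUES (1, '2018-04-10'), (2, '2018-05-03'), (3, '2018-06-01');
    INSERT INTO teh_201804 VALUES (1, '2018-04-10', 25.0), (2, '2018-04-11', 5.0);
    INSERT INTO teh_201805 VALUES (2, '2018-05-03', 40.0);
""")

# Each branch of the UNION ALL states the month it covers, then the
# combined set is left joined once on id and date.
monthly_rows = conn.execute("""
    SELECT b.m_id, b.dofs_date, tt.e_amt
    FROM base b
    LEFT JOIN (
        SELECT m_id, e_pst_dte, e_amt FROM teh_201804
        WHERE e_pst_dte BETWEEN '2018-04-01' AND '2018-04-30'
        UNION ALL
        SELECT m_id, e_pst_dte, e_amt FROM teh_201805
        WHERE e_pst_dte BETWEEN '2018-05-01' AND '2018-05-31'
    ) tt ON tt.m_id = b.m_id AND tt.e_pst_dte = b.dofs_date
    ORDER BY b.m_id
""").fetchall()

print(monthly_rows)
```

Rows whose date falls in a month that is not part of the union (m_id 3 here) still appear, with NULL for the joined columns, because of the LEFT JOIN.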

Inner join in postgreSQL getting duplicate rows

I have 2 SQL queries.
query 1
select file_number_fk,sent_date as submitted_date from fl_file_movement
where sent_by_post_fk='735'
and file_number_fk='98223'
query 2
select file_number_fk,received_date as received_date from fl_file_movement
where recipient_post_fk='735'
and file_number_fk='98223'
Each query returns a table with 7 rows.
When I try to join them I get 49 rows:
select distinct a.file_number_fk,
a.received_date,
b.submitted_date from(
select file_number_fk,received_date as received_date from fl_file_movement
where recipient_post_fk='735'
and file_number_fk='98223')a LEFT JOIN (
select file_number_fk,sent_date as submitted_date from fl_file_movement
where sent_by_post_fk='735'
and file_number_fk='98223')b ON a.file_number_fk=b.file_number_fk
I want a joined table with 7 rows. How can I do this?
I think your JOIN condition is not specific enough: each of your queries filters on two fields, but your JOIN matches on only one of them, which multiplies the result:
select distinct a.file_number_fk, a.received_date, b.submitted_date
from (select file_number_fk, recipient_post_fk, received_date
from fl_file_movement
where recipient_post_fk='735' and file_number_fk='98223') a
LEFT JOIN (
select file_number_fk, sent_by_post_fk, sent_date as submitted_date
from fl_file_movement
where sent_by_post_fk='735'
and file_number_fk='98223') b
ON a.file_number_fk=b.file_number_fk AND a.recipient_post_fk = b.sent_by_post_fk
The above query is basically what you have provided + extra JOIN condition + improved readability. I think that DISTINCT can be removed in this case.
You can also solve this with a self join. Something like this:
SELECT src.file_number_fk, dest.received_date, src.sent_date AS submitted_date
FROM fl_file_movement src
JOIN fl_file_movement dest ON dest.recipient_post_fk = src.sent_by_post_fk and src.file_number_fk = dest.file_number_fk
WHERE dest.recipient_post_fk = '735' AND src.file_number_fk = '98223'
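To make the multiplication visible, here is a small sketch with Python's built-in sqlite3 module and made-up data (3 rows per side instead of 7). The second query pairs the n-th receipt with the n-th dispatch using ROW_NUMBER(). That pairing rule is purely an assumption on my part, since the question does not say how received and sent rows belong together.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE fl_file_movement (
        file_number_fk TEXT, sent_by_post_fk TEXT, recipient_post_fk TEXT,
        sent_date TEXT, received_date TEXT
    );
    -- Made-up data: post 735 receives the file three times and sends it on three times.
    INSERT INTO fl_file_movement VALUES
        ('98223', '100', '735', '2020-01-01', '2020-01-02'),
        ('98223', '735', '100', '2020-01-05', '2020-01-06'),
        ('98223', '100', '735', '2020-02-01', '2020-02-02'),
        ('98223', '735', '100', '2020-02-05', '2020-02-06'),
        ('98223', '100', '735', '2020-03-01', '2020-03-02'),
        ('98223', '735', '100', '2020-03-05', '2020-03-06');
""")

# Joining only on the file number matches every received row with every sent row.
naive_count = conn.execute("""
    SELECT COUNT(*)
    FROM (SELECT file_number_fk, received_date FROM fl_file_movement
          WHERE recipient_post_fk = '735' AND file_number_fk = '98223') a
    LEFT JOIN (SELECT file_number_fk, sent_date AS submitted_date FROM fl_file_movement
          WHERE sent_by_post_fk = '735' AND file_number_fk = '98223') b
      ON a.file_number_fk = b.file_number_fk
""").fetchone()[0]

# Assumed pairing rule: n-th receipt goes with n-th dispatch, in date order.
paired_rows = conn.execute("""
    WITH received AS (
        SELECT file_number_fk, received_date,
               ROW_NUMBER() OVER (PARTITION BY file_number_fk
                                  ORDER BY received_date) AS rn
        FROM fl_file_movement
        WHERE recipient_post_fk = '735' AND file_number_fk = '98223'
    ),
    sent AS (
        SELECT file_number_fk, sent_date AS submitted_date,
               ROW_NUMBER() OVER (PARTITION BY file_number_fk
                                  ORDER BY sent_date) AS rn
        FROM fl_file_movement
        WHERE sent_by_post_fk = '735' AND file_number_fk = '98223'
    )
    SELECT r.file_number_fk, r.received_date, s.submitted_date
    FROM received r
    LEFT JOIN sent s ON s.file_number_fk = r.file_number_fk AND s.rn = r.rn
    ORDER BY r.rn
""").fetchall()

print(naive_count, paired_rows)
```

The first query returns 3 x 3 = 9 rows, the same effect as the 7 x 7 = 49 in the question. The second returns one row per receipt.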

Sql Query combine using Top and Ascending order

I have created the SQL query shown below:
select * from (select DISTINCT * from (select po.tGroup_id,pp.tGroup_id as GroupID from tPhos_Line_Operator PO
LEFT join tPhos_Line_Parameter PP
on PO.tGroup_id = PP.tGroup_id) A) Ac
left JOIN
(SELECT top 1 tGroup_id FROM tGROUP_LOG order by id desc) B
on Ac.tGroup_id = B.tGroup_id
I was expecting to see records like in the image below:
But I keep getting these records:
I tried hardcoding tGroup_id = 29 in the left join, and it worked: I got exactly the records I want. Refer to the first image.
select * from (select DISTINCT * from (select po.tGroup_id,pp.tGroup_id as GroupID from tPhos_Line_Operator PO
LEFT join tPhos_Line_Parameter PP
on PO.tGroup_id = PP.tGroup_id) A) Ac
left JOIN
(SELECT top 1 tGroup_id FROM tGROUP_LOG where tGroup_id = 29 order by id desc
) B
on Ac.tGroup_id = B.tGroup_id
I do not want to hardcode it.
Can someone tell me what I missed or did wrong?
Thanks in advance.
I have found a way, instead of using top 1, I can use max.
SELECT max(id) as TESTID,tGroup_id FROM tGROUP_LOG
group by tGroup_id
I consider this has fixed my issue. Thanks
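For anyone reading later, here is a minimal sketch of joining to that MAX(id) subquery, using Python's built-in sqlite3 module. The table names line_operator and group_log and the sample rows are simplified stand-ins for the real tables, not taken from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified, made-up stand-ins for tPhos_Line_Operator and tGROUP_LOG.
    CREATE TABLE line_operator (tGroup_id INTEGER);
    CREATE TABLE group_log (id INTEGER PRIMARY KEY, tGroup_id INTEGER);
    INSERT INTO line_operator VALUES (28), (29), (30);
    INSERT INTO group_log VALUES (1, 28), (2, 29), (3, 28), (4, 29);
""")

# MAX(id) with GROUP BY yields one row per group, so each group joins to
# its own latest log entry instead of one global "top 1" row.
latest_log_rows = conn.execute("""
    SELECT o.tGroup_id, b.latest_log_id
    FROM line_operator o
    LEFT JOIN (
        SELECT tGroup_id, MAX(id) AS latest_log_id
        FROM group_log
        GROUP BY tGroup_id
    ) b ON b.tGroup_id = o.tGroup_id
    ORDER BY o.tGroup_id
""").fetchall()

print(latest_log_rows)
```

Group 30 has no log entries in this sample, so it comes back with NULL for the joined column.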

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
This will select the latest row per "EsnId" from "EsnsSalesOrderItems", ordered by the "createdAt" column from your example. You can use any column that allows you to define an order on the rows that suits you.
But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
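Here is a runnable sketch of that idea using Python's built-in sqlite3 module. The tables, the created_at column and the sample rows are made up. I have added an ORDER BY inside the subquery, because LIMIT 1 without it picks an arbitrary row rather than the latest one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical tables matching the abstract example above.
    CREATE TABLE table1 (id INTEGER PRIMARY KEY);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, created_at TEXT);
    INSERT INTO table1 VALUES (1), (2);
    INSERT INTO table2 VALUES
        (10, 1, '2012-06-19'),
        (11, 1, '2012-07-19'),
        (12, 2, '2012-05-01');
""")

# The correlated subquery picks exactly one table2 id per table1 row:
# the one with the newest created_at.
one_per_parent = conn.execute("""
    SELECT t1.id, t2.id, t2.created_at
    FROM table1 t1
    JOIN table2 t2 ON t2.id = (
        SELECT id FROM table2
        WHERE table2.table1_id = t1.id
        ORDER BY created_at DESC
        LIMIT 1
    )
    ORDER BY t1.id
""").fetchall()

print(one_per_parent)
```

Note that with an inner JOIN, table1 rows without any table2 row drop out. Use a LEFT JOIN if you need to keep them.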

SQL JOIN Statement

Let's say I have a table, e.g.
Request No.   Type   Status
---------------------------
1             New    Renewed
and then another table
Action ID   Request No   LastUpdated
------------------------------------
1           1            06-10-2010
2           1            07-14-2010
3           1            09-30-2010
How can I join the second table with the first table but only get the latest record from the second table (e.g. ordered by LastUpdated DESC)?
SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)
Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using JOIN & NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = re.request_no
AND re2.LastUpdated > re.LastUpdated)
SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
A sub-query is used to get the last updated record's date and matches against itself to prevent the other records being joined.
Granted, depending on how precise the LastUpdated field is, this can return multiple rows when two records were updated at the same time. Any other implementation runs into the same problem, so either the precision has to be increased or some other distinguishing column has to act as a tie-breaker to prevent multiple rows being returned.
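To illustrate the tie problem, here is a small sketch with Python's built-in sqlite3 module. The table names requests and actions and the sample rows are made up, and the dates are stored as ISO text so they sort correctly. The second query uses ROW_NUMBER() with ActionID as a tie-breaker, which is just one possible choice of distinguishing column.

```python
import sqlite3

# In-memory database; window functions need SQLite 3.25 or newer.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE requests (request_no INTEGER PRIMARY KEY, type TEXT, status TEXT);
    CREATE TABLE actions (action_id INTEGER PRIMARY KEY, request_no INTEGER,
                          last_updated TEXT);
    INSERT INTO requests VALUES (1, 'New', 'Renewed');
    -- Actions 2 and 3 share the same date on purpose.
    INSERT INTO actions VALUES
        (1, 1, '2010-06-10'),
        (2, 1, '2010-09-30'),
        (3, 1, '2010-09-30');
""")

# Matching on MAX(last_updated) returns both tied rows.
max_rows = conn.execute("""
    SELECT r.request_no, a.action_id
    FROM requests r
    JOIN actions a ON a.request_no = r.request_no
    WHERE a.last_updated = (SELECT MAX(a2.last_updated)
                            FROM actions a2
                            WHERE a2.request_no = r.request_no)
    ORDER BY a.action_id
""").fetchall()

# ROW_NUMBER with a second sort key keeps exactly one row per request.
tie_broken_rows = conn.execute("""
    SELECT request_no, action_id
    FROM (
        SELECT r.request_no, a.action_id,
               ROW_NUMBER() OVER (PARTITION BY r.request_no
                                  ORDER BY a.last_updated DESC,
                                           a.action_id DESC) AS rn
        FROM requests r
        JOIN actions a ON a.request_no = r.request_no
    ) ranked
    WHERE rn = 1
""").fetchall()

print(max_rows, tie_broken_rows)
```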
SELECT r.RequestNo, r.Type, r.Status, MAX(a.LastUpdated) AS LastUpdated
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status
We can use TOP 1 with an ORDER BY clause. For instance, if your tables are RequestTable(ID,Type,Status) and ActionTable(ActionID,RequestID,LastUpdated), the query will look like this:
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC