Limit join to one row - sql

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.

SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply by the count per group?
I retrieve the single row per "EsnId" from "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.

Something like:
join (
select "EsnId",
"SalesOrderItemId", -- needed for the later join to "SalesOrderItems"
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and t.rn = 1
This will select the latest row per "EsnId" from "EsnsSalesOrderItems" based on the column "createdAt" (substitute whatever ordering column your table actually has). You can use any column that allows you to define an order on the rows that suits you.
But remember that the concept of the "last row" is only valid if you specify an order for the rows. A table as such is not ordered, nor is the result of a query unless you specify an ORDER BY.
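To make the "last row per group" idea concrete, here is a minimal sketch run against an in-memory SQLite database (window functions need SQLite 3.25 or later). The table esns_sales_order_items and its snake_case columns are hypothetical stand-ins for "EsnsSalesOrderItems", and the data mirrors the two "EsnId" = 6 entries from the question.

```python
import sqlite3

# Hypothetical, simplified stand-in for "EsnsSalesOrderItems".
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE esns_sales_order_items (
    esn_id INTEGER,
    sales_order_item_id INTEGER,
    created_at TEXT
);
INSERT INTO esns_sales_order_items VALUES
    (6, 100, '2012-06-19'),
    (6, 101, '2012-07-19'),
    (7, 102, '2012-05-01');
""")

# Number the rows of each esn_id from newest to oldest, then keep rn = 1.
latest_rows = conn.execute("""
SELECT esn_id, sales_order_item_id, created_at
FROM (
    SELECT esn_id, sales_order_item_id, created_at,
           row_number() OVER (PARTITION BY esn_id
                              ORDER BY created_at DESC) AS rn
    FROM esns_sales_order_items
) t
WHERE rn = 1
ORDER BY esn_id
""").fetchall()

print(latest_rows)
```

Only the 2012-07-19 entry survives for esn_id 6, because the ordering inside the window defines what "last" means.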

Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL Server, you would use CROSS APPLY (or rather OUTER APPLY, since we need a left join), which evaluates the subquery or table-valued function once for each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(@in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateTo)
AND
(@in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE

Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...
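A minimal sketch of that pattern, run against an in-memory SQLite database. The columns name and created_at and the sample rows are assumptions added for illustration, and the ORDER BY inside the subquery is my addition so that LIMIT 1 picks a well-defined row (here, the newest one).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE table1 (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE table2 (id INTEGER PRIMARY KEY, table1_id INTEGER, created_at TEXT);
INSERT INTO table1 VALUES (1, 'a'), (2, 'b');
INSERT INTO table2 VALUES
    (10, 1, '2012-06-19'),
    (11, 1, '2012-07-19'),
    (12, 2, '2012-01-01');
""")

# The correlated subquery in the ON clause returns exactly one table2.id
# per table1 row, so the join can never multiply rows.
joined_rows = conn.execute("""
SELECT table1.id, table2.id, table2.created_at
FROM table1
JOIN table2 ON table2.id = (
    SELECT t2.id
    FROM table2 AS t2
    WHERE t2.table1_id = table1.id
    ORDER BY t2.created_at DESC
    LIMIT 1
)
ORDER BY table1.id
""").fetchall()

print(joined_rows)
```

Without the ORDER BY, the database is free to return any matching row, so the result could change between runs or plans.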

Related

Postgres, how to limit number of rows returned from joined tables

I have the following query that returns the data I want. However, for the joined tables, I want to limit the number of rows returned, and preferably be able to specify the limit for each joined table.
I tried using LIMIT with the select itself, but that doesn't seem to be supported.
Is this possible? I am using Postgres 11.
select array_to_json(array_agg(t)) from (
select
tbl_327.field_43,tbl_327.field_1,tbl_327.field_2,
jsonb_agg(distinct jsonb_build_object('id',tbl_332.id,'data',tbl_332.fullname)) as field_7,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_33
from schema_1.tbl_327 tbl_327
left join schema_1.tbl_327_to_tbl_332_field_7 field_7 on field_7.tbl_327_id=tbl_327.id
left join schema_1.tbl_332_customid tbl_332 on tbl_332.id = field_7.tbl_332_id
left join schema_1.tbl_327_to_tbl_312_field_33 field_33 on field_33.tbl_327_id=tbl_327.id
left join schema_1.tbl_312_customid tbl_312 on tbl_312.id = field_33.tbl_312_id
group by tbl_327.field_43,tbl_327.field_1,tbl_327.field_2
) t
UPDATED
Here is my new query. I simplified it, but the issue is that it no longer returns correct data. For field_4, it's returning rows/data that aren't associated with the record. Do I have something wrong?
select array_to_json(array_agg(t)) from (
select
tbl_342.field_1,tbl_342.field_2,tbl_342.id,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_4
from schema_1.tbl_342 tbl_342
left join lateral (
select distinct field_4.*
from schema_1.tbl_342_to_tbl_312_field_4 field_4
where field_4.tbl_342_id=tbl_342.id
limit 50) field_4 on true
left join lateral (
select distinct tbl_312.*
from schema_1.tbl_312_customid tbl_312
where tbl_312_id = field_4.tbl_312_id
limit 5
) tbl_312 on true
group by tbl_342.field_1,tbl_342.field_2,tbl_342.id
) t
One approach is to turn each left join into a lateral join; you can then set the limit within each subquery:
select array_to_json(array_agg(t)) from (
select
tbl_327.field_43,tbl_327.field_1,tbl_327.field_2,
jsonb_agg(distinct jsonb_build_object('id',tbl_332.id,'data',tbl_332.fullname)) as field_7,
jsonb_agg(distinct jsonb_build_object('id',tbl_312.id,'data',tbl_312.fullname)) as field_33
from schema_1.tbl_327 tbl_327
left join lateral (
select field_7.*
from schema_1.tbl_327_to_tbl_332_field_7 field_7
where field_7.tbl_327_id=tbl_327.id
order by ...
limit 5
) field_7 on true
left join lateral (
select tbl_332.*
from schema_1.tbl_332_customid tbl_332
where tbl_332.id = field_7.tbl_332_id
order by ??
limit 5
) tbl_332 on true
left join lateral ...
group by tbl_327.field_43,tbl_327.field_1,tbl_327.field_2
) t
Note that you need an ORDER BY to go along with LIMIT in order to get stable results. Replace the placeholders in the query (... and ??) with the relevant column or set of columns.
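For engines without LATERAL, a per-parent limit can also be approximated with row_number(). This is a different technique from the lateral join, sketched here against an in-memory SQLite database (3.25 or later); the parent and child tables and their data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE parent (id INTEGER PRIMARY KEY);
CREATE TABLE child (id INTEGER PRIMARY KEY, parent_id INTEGER, fullname TEXT);
INSERT INTO parent VALUES (1), (2);
INSERT INTO child VALUES
    (1, 1, 'a'), (2, 1, 'b'), (3, 1, 'c'), (4, 2, 'd');
""")

# Keep at most 2 children per parent; ORDER BY id inside the window makes
# the choice of which 2 deterministic.
limited_rows = conn.execute("""
SELECT p.id, c.id, c.fullname
FROM parent AS p
LEFT JOIN (
    SELECT id, parent_id, fullname,
           row_number() OVER (PARTITION BY parent_id ORDER BY id) AS rn
    FROM child
) AS c ON c.parent_id = p.id AND c.rn <= 2
ORDER BY p.id, c.id
""").fetchall()

print(limited_rows)
```

The third child of parent 1 is dropped, which is the same effect as limit 2 inside a lateral subquery ordered by id.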

SQL > Server > How to query for additional fields

I have the following query and it works perfectly and gives me 200 rows. However, I wanted to retrieve additional fields from ExecutionLogStorage table. When I added ExecutionLogStorage.TimeStart, ExecutionLogStorage.TimeDataRetrieval with group by the result is 8,000+ rows. How can I retrieve the latest date (Max of the date) and still keep 200 rows of data.
Select * from (
SELECT ExecutionLogStorage.ReportID, COUNT(*) AS HitCount, Catalog.Name, ExecutionLogStorage.UserName
FROM [SP_RPT_SVC].[dbo].ExecutionLogStorage INNER JOIN
Catalog ON [SP_RPT_SVC].[dbo].ExecutionLogStorage.ReportID = Catalog.ItemID
where Catalog.[Type] = 2
GROUP BY ExecutionLogStorage.ReportID, Catalog.Name, ExecutionLogStorage.UserName) X
LEFT Join
(SELECT [Id]
,[DirName]
,[LeafName]
FROM [SP_BI].[dbo].[AllDocs]) Y
on
Y.ID = X.ReportID
LEFT Join
(SELECT [NTName],[PreferredName]
FROM [SP_ProfileDB].[dbo].[UserProfile_Full]) Z
ON
X.UserName = Z.NTName
How can I retrieve the latest date (Max of the date) and still keep 200 rows of data.
No need to modify the GROUP BY clause: just add more aggregate functions to your inner query:
SELECT *
FROM (
SELECT
ExecutionLogStorage.ReportID,
COUNT(*) AS HitCount,
Catalog.Name,
ExecutionLogStorage.UserName,
MAX(ExecutionLogStorage.TimeStart) AS MaxTimeStart, -- added
MAX(ExecutionLogStorage.TimeDataRetrieval) AS MaxTimeDataRetrieval -- added
FROM [SP_RPT_SVC].[dbo].ExecutionLogStorage
INNER JOIN Catalog
ON [SP_RPT_SVC].[dbo].ExecutionLogStorage.ReportID = Catalog.ItemID
WHERE Catalog.[Type] = 2
GROUP BY
ExecutionLogStorage.ReportID,
Catalog.Name,
ExecutionLogStorage.UserName
) X
LEFT JOIN ...
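The difference between aggregating the extra column and grouping by it can be seen in a small sketch on an in-memory SQLite database. The execution_log table and its rows are hypothetical stand-ins for ExecutionLogStorage.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE execution_log (report_id INTEGER, user_name TEXT, time_start TEXT);
INSERT INTO execution_log VALUES
    (1, 'ann', '2020-01-01 08:00'),
    (1, 'ann', '2020-01-02 09:00'),
    (1, 'bob', '2020-01-03 10:00'),
    (2, 'ann', '2020-01-04 11:00');
""")

# Aggregating time_start keeps one row per (report_id, user_name).
with_max = conn.execute("""
SELECT report_id, user_name, COUNT(*) AS hit_count,
       MAX(time_start) AS max_time_start
FROM execution_log
GROUP BY report_id, user_name
ORDER BY report_id, user_name
""").fetchall()

# Adding time_start to GROUP BY splits every group by timestamp instead.
with_extra_group = conn.execute("""
SELECT report_id, user_name, time_start, COUNT(*) AS hit_count
FROM execution_log
GROUP BY report_id, user_name, time_start
""").fetchall()

print(len(with_max), len(with_extra_group))
```

The first query returns three rows, one per group, while the second returns one row per distinct timestamp, which is how 200 rows can balloon into thousands.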

How can I join on multiple columns within the same table that contain the same type of info?

I am currently joining two tables based on Claim_Number and Customer_Number.
SELECT
A.*,
B.*
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This returns exactly the data I'm looking for. The problem is that I now need to join to another table to get information based on a Product_ID. This would be easy if there was only one Product_ID in the Compound_Info table for each record. However, there are 10. So basically I need to SELECT 10 additional columns for Product_Name, one for each of the Product_IDs that are already being selected. How can I do that? This is what I had in mind, but it is not working right.
SELECT
A.*,
B.*,
PD_Info_1.Product_Name,
PD_Info_2.Product_Name,
....etc {Up to 10 Product Names}
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
LEFT JOIN Company.dbo.Product_Info AS PD_Info_1 ON B.Product_ID_1 = PD_Info_1.Product_ID
LEFT JOIN Company.dbo.Product_Info AS PD_Info_2 ON B.Product_ID_2 = PD_Info_2.Product_ID
.... {Up to 10 LEFT JOIN's}
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This query not only doesn't return the correct results, it also takes forever to run. My actual SQL is a lot longer and I've changed table names, etc but I hope that you can get the idea. If it matters, I will be creating a view based on this query.
Please advise on how to select multiple columns from the same table correctly and efficiently. Thanks!
I would put the extra lookup into a CTE and add ROW_NUMBER to ensure that I get only the one row I care about. It would look something like this; I only did it for the first two product lookups.
WITH PD_Info
AS ( SELECT Product_ID
,Product_Name
,Effective_Date
,ROW_NUMBER() OVER ( PARTITION BY Product_ID, Product_Name ORDER BY Effective_Date DESC ) AS RowNum
FROM Company.dbo.Product_Info)
SELECT A.*
,B.*
,PD_Info_1.Product_Name
,PD_Info_2.Product_Name
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B
ON A.Claim_Number = B.Claim_Number
AND A.Customer_Number = B.Customer_Number
LEFT JOIN PD_Info AS PD_Info_1
ON B.Product_ID_1 = PD_Info_1.Product_ID
AND B.Fill_Date >= PD_Info_1.Effective_Date
AND PD_Info_1.RowNum = 1
LEFT JOIN PD_Info AS PD_Info_2
ON B.Product_ID_2 = PD_Info_2.Product_ID
AND B.Fill_Date >= PD_Info_2.Effective_Date
AND PD_Info_2.RowNum = 1

SQL JOIN Statement

Let's say I have a table, e.g.
Request No.  Type  Status
-------------------------
1            New   Renewed
and then another table
Action ID  Request No  LastUpdated
----------------------------------
1          1           06-10-2010
2          1           07-14-2010
3          1           09-30-2010
How can I join the second table with the first table but only get the latest record from the second table (e.g. ordered by LastUpdated DESC)?
SELECT T1.RequestNo ,
T1.Type ,
T1.Status,
T2.ActionId ,
T2.LastUpdated
FROM TABLE1 T1
JOIN TABLE2 T2
ON T1.RequestNo = T2.RequestNo
WHERE NOT EXISTS
(SELECT *
FROM TABLE2 T2B
WHERE T2B.RequestNo = T2.RequestNo
AND T2B.LastUpdated > T2.LastUpdated
)
Using aggregates:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
JOIN (SELECT t.request_no,
MAX(t.lastupdated) AS latest
FROM REQUEST_EVENTS t
GROUP BY t.request_no) x ON x.request_no = re.request_no
AND x.latest = re.lastupdated
Using NOT EXISTS:
SELECT r.*, re.*
FROM REQUESTS r
JOIN REQUEST_EVENTS re ON re.request_no = r.request_no
WHERE NOT EXISTS(SELECT NULL
FROM REQUEST_EVENTS re2
WHERE re2.request_no = re.request_no
AND re2.LastUpdated > re.LastUpdated)
SELECT *
FROM REQUEST, ACTION
WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO --Joining here
AND ACTION.LastUpdated = (SELECT MAX(LastUpdated) FROM ACTION WHERE REQUEST.REQUESTNO = ACTION.REQUESTNO);
A subquery is used to get the latest LastUpdated value for the request, and the join is matched against it to prevent the other records from being joined.
Granted, depending on how precise the LastUpdated field is, this can return multiple rows when two records are updated at the same time. Any other implementation has the same problem, so either the precision would have to be increased, or some other logic or distinguishing characteristic would have to be used to prevent multiple rows from being returned.
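The tie problem and one possible tie-breaker can be sketched on an in-memory SQLite database. The snake_case request and action tables are hypothetical versions of the tables in the question, and using the highest action_id as the tie-breaker is an assumption, not something the question specifies.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE request (request_no INTEGER PRIMARY KEY, status TEXT);
CREATE TABLE action (action_id INTEGER PRIMARY KEY, request_no INTEGER,
                     last_updated TEXT);
INSERT INTO request VALUES (1, 'Renewed');
INSERT INTO action VALUES
    (1, 1, '2010-06-10'),
    (2, 1, '2010-09-30'),
    (3, 1, '2010-09-30');
""")

# Matching on MAX(last_updated) returns both tied rows.
tied_rows = conn.execute("""
SELECT r.request_no, a.action_id
FROM request AS r
JOIN action AS a ON a.request_no = r.request_no
WHERE a.last_updated = (SELECT MAX(a2.last_updated)
                        FROM action AS a2
                        WHERE a2.request_no = r.request_no)
ORDER BY a.action_id
""").fetchall()

# Breaking ties on action_id leaves exactly one row per request.
single_row = conn.execute("""
SELECT r.request_no, a.action_id
FROM request AS r
JOIN action AS a ON a.request_no = r.request_no
WHERE NOT EXISTS (
    SELECT 1
    FROM action AS a2
    WHERE a2.request_no = a.request_no
      AND (a2.last_updated > a.last_updated
           OR (a2.last_updated = a.last_updated
               AND a2.action_id > a.action_id))
)
""").fetchall()

print(tied_rows, single_row)
```

Any unique column works as the tie-breaker; the point is only that the combined ordering must be total.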
SELECT r.RequestNo, r.Type, r.Status, MAX(a.LastUpdated) AS LastUpdated
FROM Request r
INNER JOIN Action a ON r.RequestNo = a.RequestNo
GROUP BY r.RequestNo, r.Type, r.Status
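We can use TOP 1 with an ORDER BY clause. For instance, if your tables are RequestTable(ID, Type, Status) and ActionTable(ActionID, RequestID, LastUpdated), the query looks like this (note that TOP 1 returns a single row overall, so add a WHERE filter on rq.ID to get the latest action of one specific request):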
Select Top 1 rq.ID, rq.Status, at.ActionID
From RequestTable as rq
JOIN ActionTable as at ON rq.ID = at.RequestID
Order by at.LastUpdated DESC

Multiple MAX values select using inner join

I have a query that works for me only when the values in StakeValue don't repeat.
Basically, I need to select the maximum values from the SI_STAKES table, grouped by internal type, together with their related rows from two other tables.
SELECT a.StakeValue, b.[StakeName], c.[ProviderName]
FROM SI_STAKES AS a
INNER JOIN SI_STAKESTYPES AS b ON a.[StakeTypeID] = b.[ID]
INNER JOIN SI_PROVIDERS AS c ON a.[ProviderID] = c.[ID] WHERE a.[EventID]=6
AND a.[StakeGroupTypeID]=1
AND a.StakeValue IN
(SELECT MAX(d.StakeValue) FROM SI_STAKES AS d
WHERE d.[EventID]=a.[EventID] AND d.[StakeGroupTypeID]=a.[StakeGroupTypeID]
GROUP BY d.[StakeTypeID])
ORDER BY b.[StakeName], a.[StakeValue] DESC
For example, the results should be:
[ID]  [MaxValue]  [StakeTypeID]  [ProviderName]
1     1,5         6              provider1
2     3,75        7              provider2
3     7,6         8              provider3
Thank you for your help
There are two problems to solve here.
1) Finding the max values per type. This will get the Max value per StakeType and make sure that we do the exercise only for the wanted events and group type.
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES AS Stake
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
2) Then we need to get only one row back for that value, since it may be present more than once.
Using the max value, we must find a unique row for each stake type. I usually do this by taking the max ID, which has the added advantage of giving me the most recent entry.
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES AS Stake
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
GROUP BY SMaxID.StakeGroupTypeID, SMaxID.EventID, SMaxID.StakeTypeID -- one ID per stake type
3) Now that we have the ID's of the rows that we want, we can just get that information.
SELECT Stake.ID, Stake.StakeValue, SType.StakeName, SProv.ProviderName
FROM SI_STAKES AS Stake
INNER JOIN SI_STAKESTYPES AS SType ON Stake.[StakeTypeID] = SType.[ID]
INNER JOIN SI_PROVIDERS AS SProv ON Stake.[ProviderID] = SProv.[ID]
WHERE Stake.ID IN (
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES AS Stake
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
GROUP BY SMaxID.StakeGroupTypeID, SMaxID.EventID, SMaxID.StakeTypeID -- one ID per stake type
)
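The max-value-then-max-ID approach can be checked with a reduced sketch on an in-memory SQLite database. The stakes table is a hypothetical cut-down version of SI_STAKES (no event or group-type columns), with a deliberately duplicated maximum for stake type 6.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stakes (id INTEGER PRIMARY KEY, stake_type_id INTEGER,
                     stake_value REAL);
INSERT INTO stakes VALUES
    (1, 6, 1.5),
    (2, 6, 1.5),
    (3, 6, 1.2),
    (4, 7, 3.75);
""")

# Inner query: max value per type. Middle query: max id among the rows
# holding that max value, grouped per type. Outer query: fetch those rows.
picked_rows = conn.execute("""
SELECT s.id, s.stake_type_id, s.stake_value
FROM stakes AS s
WHERE s.id IN (
    SELECT MAX(s_id.id)
    FROM stakes AS s_id
    JOIN (
        SELECT stake_type_id, MAX(stake_value) AS max_stake_value
        FROM stakes
        GROUP BY stake_type_id
    ) AS s_max ON s_id.stake_type_id = s_max.stake_type_id
              AND s_id.stake_value = s_max.max_stake_value
    GROUP BY s_id.stake_type_id
)
ORDER BY s.stake_type_id
""").fetchall()

print(picked_rows)
```

Even though two rows share the maximum 1.5 for type 6, only the one with the higher id is returned.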
You can use the over clause since you're using T-SQL (hopefully 2005+):
select distinct
a.stakevalue,
max(a.stakevalue) over (partition by a.staketypeid) as maxvalue,
b.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
Essentially, this will find the max a.stakevalue for each a.staketypeid, and distinct collapses identical result rows. If you want exactly one row per a.staketypeid, including its a.id, you can use row_number to accomplish this:
select
s.id,
s.maxvalue,
s.staketypeid,
s.providername
from (
select
row_number() over (partition by a.staketypeid
order by a.stakevalue desc) as rownum,
a.id,
a.stakevalue as maxvalue,
b.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
) s
where
s.rownum = 1