How to group a sum query without using group by - sql

I have a large query/table with many columns, and the primary key in the query/table is the id_shift. Multiple orders belong to one shift, and for each shift I want to display the value of the order with the largest ldm value (length of shipment).
I do not want to use group by because then I would need to specify all the columns in the query (which are about 50-100 columns), and it is important that the query is fast.
I have created this query (and I want to add it to the large query):
SELECT
(MAX(ldm.uvalue) OVER ()) AS [max ldm],
plannedshift.id_shift
FROM plannedshift
LEFT JOIN action ac ON plannedshift.id_shift=ac.id_shift AND ac.name = 'pickup'
JOIN [order] ord ON ac.id_order = ord.id_order AND ac.name = 'pickup'
LEFT JOIN orderamount ldm ON ord.id_order = ldm.id_order AND ldm.id_unit = 5
But this gives me multiple rows for the same id_shift, because a row is created for each order. For example:
id_shift
max ldm
62822
12.80
62822
12.80
62822
12.80
Is there something I can do to get only one row for each id_shift, with the max ldm value from all the orders that belong to that shift?

You can use row_number():
SELECT ps.*
FROM (SELECT . . ., -- whatever columns you want
ROW_NUMBER() OVER (PARTITION BY ps.id_shift ORDER BY ldm.uvalue DESC) as seqnum
FROM plannedshift ps LEFT JOIN
action ac
ON ps.id_shift = ac.id_shift AND
ac.name = 'pickup' LEFT JOIN
[order] ord
ON ac.id_order = ord.id_order AND
ac.name = 'pickup' LEFT JOIN
orderamount ldm
ON ord.id_order = ldm.id_order AND
ldm.id_unit = 5
) ps
WHERE seqnum = 1;
Note that I changed all the JOINs to LEFT JOINs to ensure that all shifts from the first table are in the result set.

You only need to include the columns you are actually displaying in a group by.
If you're query above is the complete query then this is not difficult at all:
group by ps.id_shift
Otherwise you could try a distinct after the select:
SELECT distinct
(MAX(ldm.uvalue) OVER ()) AS [max ldm],
plannedshift.id_shift
FROM plannedshift...

You can use a TOP (1) query that you add with OUTER APPLY to your main query in order to get the top row per shift:
select ...
from ...
outer apply
(
SELECT TOP(1) *
FROM action ac
JOIN orderamount ldm ON ldm.id_order = ac.id_order AND ldm.id_unit = 5
WHERE ac.id_shift = plannedshift.id_shift AND ac.name = 'pickup'
ORDER BY ldm.uvalue DESC
) top_order

Related

Display Y/N column if record found in detail table

I'm trying to create a query so that I can have a column show Y/N if a particular item was ordered for a group of orders. The item I'm looking for would be OLI.id = '538'.
So my results would be:
Order#, Customer#, FreightPaid
12345, 00112233, Y
12346, 00112233, N
I cannot figure out if I need to use a subquery or the where exists function ?
Here's my current query:
SELECT distinct
OrderID,
Accountuid as Customerno
FROM [SMILEWEB_live].[dbo].[OrderLog] OL
inner join Orderlog_item OLI on OLI.orderlogkey = OL.[key]
inner join Account A on A.uid = OL.Accountuid
where A.GroupId = 'X9955'
and OL.CreateDate >= GETDATE() - 60
I would suggest an exists clause instead of a join:
select ol.OrderID, ol.Accountuid as Customerno,
(case when exists (select 1
from Orderlog_item OLI join
Account A
on A.uid = OL.Accountuid
where OLI.orderlogkey = OL.[key] and A.GroupId = 'X9955'
)
then 1 else 0
end) as flag
from [SMILEWEB_live].[dbo].[OrderLog] OL
where OL.CreateDate >= GETDATE() - 60;
This prevents a couple of problems. First, duplicate rows which are caused when there are multiple matching rows (and select distinct add unnecessary overhead). Second, missing rows, which happen when you use inner join instead of an outer join.

Limit join to one row

I have the following query:
SELECT sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount, 'rma' as
"creditType", "Clients"."company" as "client", "Clients".id as "ClientId", "Rmas".*
FROM "Rmas" JOIN "EsnsRmas" on("EsnsRmas"."RmaId" = "Rmas"."id")
JOIN "Esns" on ("Esns".id = "EsnsRmas"."EsnId")
JOIN "EsnsSalesOrderItems" on("EsnsSalesOrderItems"."EsnId" = "Esns"."id" )
JOIN "SalesOrderItems" on("SalesOrderItems"."id" = "EsnsSalesOrderItems"."SalesOrderItemId")
JOIN "Clients" on("Clients"."id" = "Rmas"."ClientId" )
WHERE "Rmas"."credited"=false AND "Rmas"."verifyStatus" IS NOT null
GROUP BY "Clients".id, "Rmas".id;
The problem is that the table "EsnsSalesOrderItems" can have the same EsnId in different entries. I want to restrict the query to only pull the last entry in "EsnsSalesOrderItems" that has the same "EsnId".
By "last" entry I mean the following:
The one that appears last in the table "EsnsSalesOrderItems". So for example if "EsnsSalesOrderItems" has two entries with "EsnId" = 6 and "createdAt" = '2012-06-19' and '2012-07-19' respectively it should only give me the entry from '2012-07-19'.
SELECT (count(*) * sum(s."price")) AS amount
, 'rma' AS "creditType"
, c."company" AS "client"
, c.id AS "ClientId"
, r.*
FROM "Rmas" r
JOIN "EsnsRmas" er ON er."RmaId" = r."id"
JOIN "Esns" e ON e.id = er."EsnId"
JOIN (
SELECT DISTINCT ON ("EsnId") *
FROM "EsnsSalesOrderItems"
ORDER BY "EsnId", "createdAt" DESC
) es ON es."EsnId" = e."id"
JOIN "SalesOrderItems" s ON s."id" = es."SalesOrderItemId"
JOIN "Clients" c ON c."id" = r."ClientId"
WHERE r."credited" = FALSE
AND r."verifyStatus" IS NOT NULL
GROUP BY c.id, r.id;
Your query in the question has an illegal aggregate over another aggregate:
sum((select count(*) as itemCount) * "SalesOrderItems"."price") as amount
Simplified and converted to legal syntax:
(count(*) * sum(s."price")) AS amount
But do you really want to multiply with the count per group?
I retrieve the the single row per group in "EsnsSalesOrderItems" with DISTINCT ON. Detailed explanation:
Select first row in each GROUP BY group?
I also added table aliases and formatting to make the query easier to parse for human eyes. If you could avoid camel case you could get rid of all the double quotes clouding the view.
Something like:
join (
select "EsnId",
row_number() over (partition by "EsnId" order by "createdAt" desc) as rn
from "EsnsSalesOrderItems"
) t ON t."EsnId" = "Esns"."id" and rn = 1
this will select the latest "EsnId" from "EsnsSalesOrderItems" based on the column creation_date. As you didn't post the structure of your tables, I had to "invent" a column name. You can use any column that allows you to define an order on the rows that suits you.
But remember the concept of the "last row" is only valid if you specifiy an order or the rows. A table as such is not ordered, nor is the result of a query unless you specify an order by
Necromancing because the answers are outdated.
Take advantage of the LATERAL keyword introduced in PG 9.3
left | right | inner JOIN LATERAL
I'll explain with an example:
Assuming you have a table "Contacts".
Now contacts have organisational units.
They can have one OU at a point in time, but N OUs at N points in time.
Now, if you have to query contacts and OU in a time period (not a reporting date, but a date range), you could N-fold increase the record count if you just did a left join.
So, to display the OU, you need to just join the first OU for each contact (where what shall be first is an arbitrary criterion - when taking the last value, for example, that is just another way of saying the first value when sorted by descending date order).
In SQL-server, you would use cross-apply (or rather OUTER APPLY since we need a left join), which will invoke a table-valued function on each row it has to join.
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
-- CROSS APPLY -- = INNER JOIN
OUTER APPLY -- = LEFT JOIN
(
SELECT TOP 1
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(#in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(#in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
) AS FirstOE
In PostgreSQL, starting from version 9.3, you can do that, too - just use the LATERAL keyword to achieve the same:
SELECT * FROM T_Contacts
--LEFT JOIN T_MAP_Contacts_Ref_OrganisationalUnit ON MAP_CTCOU_CT_UID = T_Contacts.CT_UID AND MAP_CTCOU_SoftDeleteStatus = 1
--WHERE T_MAP_Contacts_Ref_OrganisationalUnit.MAP_CTCOU_UID IS NULL -- 989
LEFT JOIN LATERAL
(
SELECT
--MAP_CTCOU_UID
MAP_CTCOU_CT_UID
,MAP_CTCOU_COU_UID
,MAP_CTCOU_DateFrom
,MAP_CTCOU_DateTo
FROM T_MAP_Contacts_Ref_OrganisationalUnit
WHERE MAP_CTCOU_SoftDeleteStatus = 1
AND MAP_CTCOU_CT_UID = T_Contacts.CT_UID
/*
AND
(
(__in_DateFrom <= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateTo)
AND
(__in_DateTo >= T_MAP_Contacts_Ref_OrganisationalUnit.MAP_KTKOE_DateFrom)
)
*/
ORDER BY MAP_CTCOU_DateFrom
LIMIT 1
) AS FirstOE
Try using a subquery in your ON clause. An abstract example:
SELECT
*
FROM table1
JOIN table2 ON table2.id = (
SELECT id FROM table2 WHERE table2.table1_id = table1.id LIMIT 1
)
WHERE
...

JOIN on varchar column with subquery

i know that this is not the recommended way to join tables. But it's only relevant for one rarely used report for one person and i don't want to change my datamodel for it.
I have two tables Model and SparePart that are not directly linked with each other via foreign keys.
Model SparePart
idModel idSparePart
ModelName SparePartDescription
Price
In special cases a model is also a sparepart(exchange unit). Then i need the price for this model from the SparePart table via its SparePartDescription column.
For example:
ModelName = C510
SparePartDescription = C510/Exchange Unit/Exch unit/Red
So i try to join both tables to get the price with following SQL:
SELECT m.idModel, m.ModelName, sp.Price, sp.SparePartDescription
FROM modModel AS m INNER JOIN
tabSparePart AS sp ON m.ModelName =
(SELECT TOP 1 LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)order by price desc)
WHERE (CHARINDEX('/', sp.SparePartDescription) > 0)
AND (sp.fiSparePartCategory = 6)
ORDER BY m.ModelName, sp.SparePartDescription
But i get multiple records for one model:
idModel ModelName Price SparePartDescription
569 C510 70,75 C510/Exchange Unit/Exch unit/Red
569 C510 70,75 C510/Exchange Unit/Latin/Generic/Black
569 C510 70,75 C510/Exchange Unit/Latin/Generic/Silver
433 C702 80,72 C702/Exchange Unit/Latin/Generic/Black
433 C702 NULL C702/Exchange Unit/Latin/Generic/Cyan
433 C702 80,72 C702/Exchange Unit/Orange Global/Black
I only want to select one record if there are multiple spareparts with matching SparePartDescription.
Sql Server 2005 and better introduced the 'APPLY' operator which allows you to join against a subquery... Try this.
SELECT m.idModel, m.ModelName, sp.Price, sp.SparePartDescription
FROM modModel AS m
CROSS APPLY
(
SELECT TOP 1 * FROM tabSparePart
WHERE m.ModelName =
LEFT(SparePartDescription, LEN(ModelName))
ORDER BY Price DESC
) sp
WHERE (sp.fiSparePartCategory = 6)
ORDER BY m.ModelName, sp.SparePartDescription
It inner joins the 'modModel' table with the subquery 'only the top one matching tabSparePart'.
You can also use OUTER APPLY which will emulate a LEFT JOIN on the subquery. Documentation is here.
Try the ROW_NUMBER function. It guarantees that you'll only get one of each item as defined in the PARTITION BY clause.
SELECT a.idModel, a.ModelName, Price, SparePartDescription
FROM modModel a
LEFT JOIN
(
SELECT m.idModel, m.ModelName, sp.Price, sp.SparePartDescription
, ROW_NUMBER() OVER (PARTITION BY m.idModel, m.ModelName ORDER BY sp.price DESC) AS r
FROM modModel AS m
INNER JOIN tabSparePart AS sp
ON m.ModelName = LEFT(sp.SparePartDescription, CHARINDEX('/', sp.SparePartDescription) - 1)
WHERE (CHARINDEX('/', sp.SparePartDescription) > 0)
AND (sp.fiSparePartCategory = 6)
) b
ON a.idModel = b.idModel
AND b.r = 1
ORDER BY ModelName, SparePartDescription
First, your join condition can be simplified some, and then you can use ROW_NUMBER() to specify some kind of order to your results, allowing the first result (per model) to be selected. I also changed it to a LEFT JOIN in case there was no match. If that is not required, it's simple to change back to an INNER JOIN :)
WITH
ranked_results AS
(
SELECT
m.idModel, m.ModelName, sp.Price, sp.SparePartDescription,
ROW_NUMBER() OVER (PARTITION BY m.idModel ORDER BY sp.Price DESC) AS rank
FROM
modModel AS m
LEFT JOIN
tabSparePart AS sp
ON LEFT(sp.SparePartDescription, LEN(m.ModelName)) = m.ModelName
AND (CHARINDEX('/', sp.SparePartDescription) > 0)
AND (sp.fiSparePartCategory = 6)
)
SELECT
*
FROM
ranked_results
WHERE
rank = 1
ORDER BY
ModelName,
SparePartDescription
#MattMurrell's answer just appeared while I was typing this. One difference here is that the selection criteria is being applied to the whole set, rather than separately in the CROSS APPLY. This may have a performance benefit, you'd have to try and see. CROSS APPLY with inline functions is normally more performant that correlated sub-queries, so I can't predict which is faster.

Multiple MAX values select using inner join

I have query that work for me only when values in the StakeValue don't repeat.
Basically, I need to select maximum values from SI_STAKES table with their relations from two other tables grouped by internal type.
SELECT a.StakeValue, b.[StakeName], c.[ProviderName]
FROM SI_STAKES AS a
INNER JOIN SI_STAKESTYPES AS b ON a.[StakeTypeID] = b.[ID]
INNER JOIN SI_PROVIDERS AS c ON a.[ProviderID] = c.[ID] WHERE a.[EventID]=6
AND a.[StakeGroupTypeID]=1
AND a.StakeValue IN
(SELECT MAX(d.StakeValue) FROM SI_STAKES AS d
WHERE d.[EventID]=a.[EventID] AND d.[StakeGroupTypeID]=a.[StakeGroupTypeID]
GROUP BY d.[StakeTypeID])
ORDER BY b.[StakeName], a.[StakeValue] DESC
Results for example must be:
[ID] [MaxValue] [StakeTypeID] [ProviderName]
1 1,5 6 provider1
2 3,75 7 provider2
3 7,6 8 provider3
Thank you for your help
There are two problems to solve here.
1) Finding the max values per type. This will get the Max value per StakeType and make sure that we do the exercise only for the wanted events and group type.
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
2) Then we need to get only one return back for that value since it may be present more then once.
Using the Max Value, we must find a unique row for each I usually do this by getting the Max ID is has the added advantage of getting me the most recent entry.
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
3) Now that we have the ID's of the rows that we want, we can just get that information.
SELECT Stakes.ID, Stakes.StakeValue, SType.StakeName, SProv.ProviderName
FROM SI_STAKES AS Stakes
INNER JOIN SI_STAKESTYPES AS SType ON Stake.[StakeTypeID] = SType.[ID]
INNER JOIN SI_PROVIDERS AS SProv ON Stake.[ProviderID] = SProv.[ID]
WHERE Stake.ID IN (
SELECT MAX(SMaxID.ID) AS ID
FROM SI_STAKES AS SMaxID
INNER JOIN (
SELECT StakeGroupTypeID, EventID, StakeTypeID, MAX(StakeValue) AS MaxStakeValue
FROM SI_STAKES
WHERE Stake.[EventID]=6
AND Stake.[StakeGroupTypeID]=1
GROUP BY StakeGroupTypeID, EventID, StakeTypeID
) AS SMaxVal ON SMaxID.StakeTypeID = SMaxVal.StakeTypeID
AND SMaxID.StakeValue = SMaxVal.MaxStakeValue
AND SMaxID.EventID = SMaxVal.EventID
AND SMaxID.StakeGroupTypeID = SMaxVal.StakeGroupTypeID
)
You can use the over clause since you're using T-SQL (hopefully 2005+):
select distinct
a.stakevalue,
max(a.stakevalue) over (partition by a.staketypeid) as maxvalue,
b.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
Essentially, this will find the max a.stakevalue for each a.staketypeid. Using a distinct will return one and only one row. Now, if you wanted to include the min a.id along with it, you could use row_number to accomplish this:
select
s.id,
s.maxvalue,
s.staketypeid,
s.providername
from (
select
row_number() over (order by a.stakevalue desc
partition by a.staketypeid) as rownum,
a.id,
a.stakevalue as maxvalue,
b.staketypeid,
c.providername
from
si_stakes a
inner join si_stakestypes b on
a.staketypeid = b.id
inner join si_providers c on
a.providerid = c.id
where
a.eventid = 6
and a.stakegrouptypeid = 1
) s
where
s.rownum = 1

MSSQL Paging is returning random rows when not supposed too

I'm trying to do some basic paging in MSSQL. The problem I'm having is that I'm sorting the paging on a row that (potentially) has similar values, and the ORDER BY clause is returning "random" results, which doesn't work well.
So for example.
If I have three rows, and I'm sorting them by a "rating", and all of the ratings are = '5' - the rows will seemingly "randomly" order themselves. How do I make it so the rows are showing up in the same order everytime?
I tried ordering it by a datetime that the field was last edited, but the "rating" is sorted in reverse, and again, does not work how i expect it to work.
Here is the SQL I'm using thus far. I know it's sort of confusing without the data so.. any help would be greatful.
SELECT * FROM
(
SELECT
CAST(grg.defaultthumbid AS VARCHAR) + '_' +
CAST(grg.garageid AS VARCHAR) AS imagename,
(
SELECT COUNT(imageid)
FROM dbo.images im (nolock)
WHERE im.garageid = grg.garageid
) AS piccount,
(
SELECT COUNT(commentid)
FROM dbo.comments cmt (nolock)
WHERE cmt.garageid = grg.garageid
) AS commentcount,
grg.GarageID, mk.make, mdl.model, grg.year,
typ.type, usr.username, grg.content,
grg.rating, grg.DateEdit as DateEdit,
ROW_NUMBER() OVER (ORDER BY Rating DESC) As RowIndex
FROM
dbo.garage grg (nolock)
LEFT JOIN dbo.users (nolock) AS usr ON (grg.userid = usr.userid)
LEFT JOIN dbo.make (nolock) AS mk ON (grg.makeid = mk.makeid)
LEFT JOIN dbo.type (nolock) AS typ ON (typ.typeid = mk.typeid)
LEFT JOIN dbo.model (nolock) AS mdl ON (grg.modelid = mdl.modelid)
WHERE
typ.type = 'Automobile' AND
grg.defaultthumbid != 0 AND
usr.username IS NOT NULL
) As QueryResults
WHERE
RowIndex BETWEEN (2 - 1) * 25 + 2 AND 2 * 25
ORDER BY
DateEdit DESC
Try ordering by both, e.g.:
ORDER BY Rating DESC, DateEdit ASC
The query first numbers the rows by [Rating], and then re-sorts the results by [DateEdit]. Possibly not what you intended. Ordering by [RowIndex] ASC should sort it out.
ROW_NUMBER() OVER (ORDER BY [Rating] DESC) As [RowIndex]
...
ORDER BY [RowIndex]