Remove duplicate rows in t-sql query results - sql

I have some T-SQL below, and I am facing a problem where each row is being returned twice with the same values. How can I ensure each row is returned only once, or force a SELECT DISTINCT?
;WITH CTE_ReportDetails
AS (
SELECT
CASE(GROUPING(M.Acronym))
WHEN 0 THEN [Acronym]
WHEN 1 THEN 'GRAND TOTAL'
END AS [Company],
CASE(GROUPING(DC.Name))
WHEN 0 THEN DC.[Name]
WHEN 1 THEN 'N/A'
END AS [CatName],
CASE(GROUPING(D.[Name]))
WHEN 0 THEN D.[Name]
WHEN 1 THEN 'Total '+ '('+DC.[Name]+')'
END AS [Name],
SUM(ISNULL(B.One, 0)) AS One,
SUM(ISNULL(B.Two, 0)) AS Two,
SUM(ISNULL(B.Three, 0)) AS Three,
ISNULL(B.Description, '') AS Description,
GROUPING(M.Acronym) AS CompanyGrouping,
GROUPING(DC.[Name]) AS DCatGroup,
GROUPING(D.[Name]) AS DGroup
FROM Dee D
INNER JOIN DeeCategory DC ON D.DeeCategoryId = DC.DeeCategoryId
INNER JOIN BD B ON B.DeeId = D.DeeId
INNER JOIN Report R ON R.RptId = B.RptId
INNER JOIN Company M ON R.CompanyId = M.CompanyId
WHERE (R.ReportDate >= @StartDate AND R.ReportDate <= @EndDate) AND (R.CompanyId IN (SELECT DATA FROM SPLIT(@CompanyIds,',')))
GROUP BY M.Acronym, DC.Name, D.Name,B.Description
WITH ROLLUP
)
SELECT Company, Name As [Dee],
One,
Three,
Two,
(One + Three + Two) AS Total,
Description
FROM CTE_ReportDetails

Your report should have unique combinations of the values in the GROUP BY. So, each combination of these four columns should be unique:
GROUP BY M.Acronym, DC.Name, D.Name,B.Description
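If I had to guess, you are getting duplicates because of B.Description in the GROUP BY: it adds an extra grouping level, and WITH ROLLUP then emits a subtotal row at that level whose Description is NULL. After ISNULL(B.Description, '') that subtotal row looks identical to a detail row with an empty description. I would suggest removing it from the GROUP BY and changing this line: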
ISNULL(B.Description, '') AS Description,
to
ISNULL(max(B.Description), '') AS Description,
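
To illustrate the suggestion, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The table `bd`, its columns, and the sample rows are made up for the demo, and SQLite has no ROLLUP, so it only shows how a column in the GROUP BY splits groups and how aggregating it with MAX collapses them again.

```python
import sqlite3

# SQLite stands in for SQL Server; table, columns and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE bd (dee_name TEXT, description TEXT, one INTEGER);
INSERT INTO bd VALUES
    ('Alpha', 'first note', 1),
    ('Alpha', 'second note', 2),
    ('Beta', NULL, 5);
""")

# Description in the GROUP BY: 'Alpha' splits into one group per description.
split_rows = conn.execute("""
    SELECT dee_name, SUM(one)
    FROM bd
    GROUP BY dee_name, description
    ORDER BY dee_name, description
""").fetchall()

# Description aggregated with MAX: one row per name.
# COALESCE plays the role of T-SQL's ISNULL here.
merged_rows = conn.execute("""
    SELECT dee_name, SUM(one), COALESCE(MAX(description), '')
    FROM bd
    GROUP BY dee_name
    ORDER BY dee_name
""").fetchall()

print(split_rows)   # two rows for 'Alpha'
print(merged_rows)  # one row for 'Alpha'
```

Note that MAX picks one description arbitrarily (the alphabetically last one), which is only acceptable if you do not need every description in the report.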

Related

Not does not exclude query info

I have a really long query, and I'm finding that my NOT is not excluding what's in the parentheses after it.
I saw Exclude and where not exists, but I'd have to re-select for that. There are too many tables joined in complicated ways in what I have already selected, and one table is very big and slow to query, so re-selecting would make the query take too long. How do I get this exclusion to work?
INSERT INTO #UNeedingC(id, CASEID, firstname, lastname, userid, AGEOFNOTIFICATION, DATETIMEDECISIONMADE, DELEGATESYSTEM, Person_id, request_type_id, service_place_id, status_summary, externalUserId, subject, onDate, externalPersonId, externalSystemId)
select distinct
c.id
,uc.case_id
,t_case.FIRSTNAME as first
,t_case.LASTNAME as last
,t_case.user_id as userid
,CONVERT(VARCHAR, DATEDIFF(dd, SC.status_change_date, GETDATE())) + ' Day(s) ' + CONVERT(VARCHAR, DATEDIFF(hh, SC.status_change_date, GETDATE()) % 24) + ' Hour(s) ' as [AGE OF NOTIFICATION]
,SC.status_change_date AS [DATE TIME DECISION MADE]
,[ckoltp_sys].DBO.ckfn_GetStringLocaleValue(152,9,uc.delegate_system,50,0) AS [DELEGATESYSTEM]
,c.person_id
,uc.request_type_id ------
,uc.service_place_id
,uc.status_summary
,eou.external_id
,c.tzix_id+' '+[ckoltp_sys].dbo.ckfn_GetStringLocaleValue(148,9,uc.status_summary,5,0)+' type' AS subject
,dateadd( hour,41,dateadd(day,0,datediff(d,0,sc.status_change_date)) ) AS onDate
,emd.externalId externalPersonId
,eou.system_id as externalSystemId
--,u.disable
from
#tempC t_case with (NOLOCK)
inner join dbo.org_case c with (nolock) ON t_case.Person_id=c.Person_id
INNER JOIN dbo.org_2_case uc with (NOLOCK) ON uc.case_id=c.id
inner JOIN dbo.ORG_LOS S WITH (NOLOCK) ON S.case_id = UC.case_id
inner JOIN dbo.ORG_EXTENSION SC WITH (NOLOCK) ON SC.los_id= S.id
inner join dbo.org_user u with (NOLOCK) on u.id=t_case.user_id
inner join dbo.org_person op with (NOLOCK) on op.id=c.Person_id
inner JOIN dbo.u_person_concept_value MC ON MC.CID = op.cid --this is the slow table
inner join dbo.EXTERNAL_ORG_USER_DATA eou with (NOLOCK) ON eou.org_user_id = t_case.user_id
inner join dbo.EXTERNAL_person_DATA emd with (NOLOCK) ON emd.CID = op.cid --op.id --?
WHERE
DATEDIFF(day, SC.status_change_date , GETDATE()) <= 2
AND
u.disable <> 1
AND
( --(denied/approved)
dbo.ckfn_GetStringLocaleValue(148,9,uc.status_summary,5,0) = 'Denied'
OR
(dbo.ckfn_GetStringLocaleValue(148,9,uc.status_summary,5,0) in( 'Fully Approved', 'Partially Approved'))
)
AND
(
(
ISNULL(uc.request_type_id,'') in( 12)
AND DATEDIFF(month, SC.status_change_date , GETDATE()) <= 2
)
OR
(
ISNULL(uc.request_type_id,'') in( 10,11)
)
--OR
--(
-- --exclude this
-- (
-- MC.concept_id = '501620' --general val1 (1000/1001)
-- AND
-- (C.ID in (select case_id from #CASES where str_value in ('1000','1001'))
-- AND (uc.service_place_id = 31 OR uc.service_place_id = 32))
-- ) --not
--) --or
)--AND
AND
(t_case.firstname not like '%external%' and t_case.lastname not like '%case manager%')
AND
(
C.ID in (select case_id from #CASES where concept_id='501620')--MC.concept_id = '501620'
)
--overall around AND (denied/approved)--
and DBO.ckfn_GetStringLocaleValue(152,9,uc.delegate_system,50,0) in ('AP','CA')
AND NOT --this not is not working...this appears in query results
(
--exclude these
(
MC.concept_id = '501620'
AND
(C.ID in (select case_id from #CASES where str_value in ('1000','1001'))
AND (uc.service_place_id = 31 OR uc.service_place_id = 32))
) --not
) --
select * from #UNeedingC
The results show a row that should have been excluded:
id caseid firstname lastname userid ageofNotification Datetimedecisionmade DelegateSys Person_id request_type_id service_place_id status_summary externalUserId subject onDate externalPersonId externalSystemId
000256200 256200 Sree Par 1234 0 Apr 5 CA 4270000 11 31 3 sparee 000256200 Fully Approved tested Ad 2021-04-06 17:00 363000 2
My question: do you know why the NOT is not working and how I can get this to exclude without another select? See "this not is not working" comment. I searched online but only found exclude and where not exists, which require another select, which I don't want.
I think I figured it out: "NOT acts on one condition. To negate two or more conditions, repeat the NOT for each condition" (from "not on two things").
This seems to work:
...
AND
--exclude these
(
MC.concept_id = '501620' --general val1 (1000/1001)
AND
(C.ID not in (select case_id from #CASES where str_value in ('1000','1001'))
AND (uc.service_place_id not in ('31','32')))
) --not
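
As a hedged aside on the quoted rule: in standard SQL, NOT in front of a parenthesised group negates the whole group, and by De Morgan's law that is the same as negating each condition and joining them with OR. Negating each condition and keeping AND is a different filter. The sketch below uses Python's built-in sqlite3 as a stand-in for SQL Server; the `cases` table and its rows are invented for the demo, and it assumes none of the columns contain NULL.

```python
import sqlite3

# SQLite stands in for SQL Server; the table and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cases (id INTEGER, concept_id TEXT, str_value TEXT,
                    service_place_id INTEGER);
INSERT INTO cases VALUES
    (1, '501620', '1000', 31),  -- matches all three conditions
    (2, '501620', '1000', 40),
    (3, '501620', '2000', 31),
    (4, '999999', '1000', 31),
    (5, '501620', '2000', 40);
""")

def ids(where_clause):
    """Return the ids that satisfy the given WHERE clause, in id order."""
    sql = "SELECT id FROM cases WHERE " + where_clause + " ORDER BY id"
    return [row[0] for row in conn.execute(sql)]

# NOT applied to the whole group: only row 1 is excluded.
negated_group = ids(
    "NOT (concept_id = '501620' AND str_value IN ('1000', '1001') "
    "AND service_place_id IN (31, 32))"
)

# De Morgan equivalent: negate each condition and join with OR.
de_morgan = ids(
    "concept_id <> '501620' OR str_value NOT IN ('1000', '1001') "
    "OR service_place_id NOT IN (31, 32)"
)

# Negating the inner conditions but keeping AND: a much narrower filter.
repeated_not_with_and = ids(
    "concept_id = '501620' AND str_value NOT IN ('1000', '1001') "
    "AND service_place_id NOT IN (31, 32)"
)

print(negated_group, de_morgan, repeated_not_with_and)
```

If the grouped NOT still lets a row through in the real query, one possible cause worth checking is the join to the large table: a case with several joined rows only needs one of them to pass the filter, and SELECT DISTINCT then hides which one it was.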

SQL Avoid multiplication on inner joins with several returns

OK, not the best title but could not explain it better.
I have a SQL query with a line like this.
count(PRStatusChangesLog.EffectiveMinutes) as timeInHandoverExternal
it works so far but I also want to add something like this
COUNT (distinct a.ActionId) as 'Number Of Actions',
which requires this
INNER JOIN PRAction a on a.PrId = PRHeader.prid
Now the problem which I am sure some of you have already seen. The previous count is now multiplied by the number of actions.
I can see why this happens but I am not sure how best to do this so I can get both the number of actions and the right count without the multiplier.
Simplified full query
SELECT
PRHeader.PrId,
COUNT (distinct a.ActionId) AS 'Number Of Actions',
COUNT (PRStatusChangesLog.EffectiveMinutes) AS timeInHandoverExternal
FROM
PRHeader
LEFT JOIN
PRStatusChangesLog ON PRStatusChangesLog.PrId = PRHeader.PrId
AND PRStatusChangesLog.StatusId = 4100
INNER JOIN
PRAction a ON a.PrId = PRHeader.prid
WHERE
DATEDIFF(mm, prheader.ClosedDate, getdate()) = 1
AND (PRHeader.siteId = 74)
AND prheader.PRTypeId IN (17, 19)
AND PRHeader.tmpStatusId <> 6010
GROUP BY
PRHeader.PrId
You can count a unique column with DISTINCT, for example COUNT(DISTINCT PRStatusChangesLog.id).
If that is not possible, use a correlated subquery to count the actions. In the SELECT clause, write something like: (SELECT COUNT(DISTINCT a.ActionId) FROM PRAction a WHERE a.PrId = PRHeader.PrId) AS action_count
Use derived tables in the joins to compute each count separately, then combine them in the final outer SELECT.
SELECT
PRHeader.PrId, Count1 'Number Of Actions', Count2 timeInHandoverExternal
FROM
PRHeader
JOIN
(SELECT PrId, COUNT (ActionId) Count1 -- PrId must be selected so the join below can use it
FROM PRAction
GROUP BY PrId) A ON A.PrId = PRHeader.prid
LEFT JOIN
(SELECT
COUNT(PRStatusChangesLog.EffectiveMinutes) Count2, PrId, StatusId
FROM
PRStatusChangesLog
WHERE
StatusId = 4100
GROUP BY
PrId, StatusId) B ON B.PrId = PRHeader.PrId
WHERE
DATEDIFF(mm, prheader.ClosedDate, getdate()) = 1
AND (PRHeader.siteId = 74 )
AND prheader.PRTypeId IN (17,19)
AND PRHeader.tmpStatusId <> 6010
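
The multiplication effect and the pre-aggregation fix can be reproduced on a tiny data set. This is only a sketch: it uses Python's built-in sqlite3 as a stand-in for SQL Server, and the simplified tables `header`, `status_log` and `pr_action` are hypothetical stand-ins for PRHeader, PRStatusChangesLog and PRAction.

```python
import sqlite3

# SQLite stands in for SQL Server; schema and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE header (pr_id INTEGER PRIMARY KEY);
CREATE TABLE status_log (pr_id INTEGER, effective_minutes INTEGER);
CREATE TABLE pr_action (action_id INTEGER PRIMARY KEY, pr_id INTEGER);
INSERT INTO header VALUES (1);
INSERT INTO status_log VALUES (1, 10), (1, 20);
INSERT INTO pr_action VALUES (100, 1), (101, 1), (102, 1);
""")

# Joining both child tables directly: 2 log rows x 3 actions = 6 joined rows,
# so the plain COUNT over the log column is inflated.
naive = conn.execute("""
    SELECT h.pr_id,
           COUNT(DISTINCT a.action_id),
           COUNT(s.effective_minutes)
    FROM header h
    LEFT JOIN status_log s ON s.pr_id = h.pr_id
    JOIN pr_action a ON a.pr_id = h.pr_id
    GROUP BY h.pr_id
""").fetchall()

# Aggregating each child table first: every derived table has at most one
# row per pr_id, so the joins can no longer multiply rows.
pre_aggregated = conn.execute("""
    SELECT h.pr_id, a.action_count, COALESCE(s.log_count, 0)
    FROM header h
    JOIN (SELECT pr_id, COUNT(*) AS action_count
          FROM pr_action GROUP BY pr_id) a ON a.pr_id = h.pr_id
    LEFT JOIN (SELECT pr_id, COUNT(effective_minutes) AS log_count
               FROM status_log GROUP BY pr_id) s ON s.pr_id = h.pr_id
""").fetchall()

print(naive)           # log count is 6 instead of 2
print(pre_aggregated)  # log count is 2
```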

SQL left join returns nothing when no matches

I am writing a stored procedure that adds the counts to two fields. I have the following code:
SELECT Distinct DateTime1,SUM(TICKETREQ1)SUMREQ, SUM(TicketPU1)SUMPU1, (count(*))AS GRADCOUNT
FROM TABLEA
WHERE YEAR = '2015'
AND TicketReq1 > 0
group by DateTime1
Select DISTINCT(DateTime2),SUM(TicketReq2) SUMREQ,SUM(TicketPU2)SUMPU2, (count(*))AS GRADCOUNT
from TABLEA
where TicketReq2 > 0
and YEAR = '2015'
Group by DateTime2;
SELECT Distinct c.DateTime1,SUM(c.TICKETREQ1 + b.TicketReq2)SUMREQ, SUM(c.TicketPU1 + b.TicketPU2)SUMPU1, (count(b.id) + count(c.id))AS GRADCOUNT
FROM TABLEA c
LEFT JOIN TABLEA b
ON (b.DateTime2 = c.DateTime1
AND b.TicketReq2 > 0
AND b.YEAR = '2015')
WHERE c.YEAR = '2015'
AND c.TicketReq1 > 0
group by c.DateTime1
This returns:
For some ceremonies the second query does bring in results and adds them correctly. But if there are no records then it fails.
How can I join the two sets of counts (Query 1 and Query 2) so that Query 3 displays both, even when there is no match?
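The problem is the SUM expressions in query #3. When the LEFT JOIN finds no match, b.TicketReq2 is NULL, so c.TICKETREQ1 + b.TicketReq2 evaluates to NULL. This does not raise an error; SUM simply skips the NULL, so the total for that group comes back NULL. Try wrapping the nullable columns, e.g. ISNULL(b.TicketReq2, 0), inside your SUM function calls.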
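
A small sketch of that NULL behaviour, using Python's built-in sqlite3 as a stand-in for SQL Server. The two tables and their rows are invented for the demo, and COALESCE is used where T-SQL would typically use ISNULL.

```python
import sqlite3

# SQLite stands in for SQL Server; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE first_choice (ceremony TEXT, ticket_req INTEGER);
CREATE TABLE second_choice (ceremony TEXT, ticket_req INTEGER);
INSERT INTO first_choice VALUES ('AM', 3), ('PM', 4);
INSERT INTO second_choice VALUES ('AM', 1);  -- nothing for 'PM'
""")

# Unguarded addition: for 'PM' the joined column is NULL, the sum of
# 4 + NULL is NULL, and SUM over only NULLs returns NULL.
unguarded = conn.execute("""
    SELECT c.ceremony, SUM(c.ticket_req + b.ticket_req)
    FROM first_choice c
    LEFT JOIN second_choice b ON b.ceremony = c.ceremony
    GROUP BY c.ceremony
    ORDER BY c.ceremony
""").fetchall()

# Guarded addition: a missing match counts as zero.
guarded = conn.execute("""
    SELECT c.ceremony, SUM(c.ticket_req + COALESCE(b.ticket_req, 0))
    FROM first_choice c
    LEFT JOIN second_choice b ON b.ceremony = c.ceremony
    GROUP BY c.ceremony
    ORDER BY c.ceremony
""").fetchall()

print(unguarded)  # 'PM' total is None (NULL)
print(guarded)    # 'PM' total is 4
```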
Try a FULL OUTER JOIN instead of the LEFT JOIN:
SELECT Distinct c.CeremonyDateTime1,SUM(c.TICKETREQ1 + b.TicketReq2)SUMREQ, SUM(c.TicketPU1 + b.TicketPU2)SUMPU1, (count(b.gid) + count(c.gid))AS GRADCOUNT
FROM ComTicket c
FULL OUTER JOIN ComTicket b
ON (b.CeremonyDateTime2 = c.CeremonyDateTime1
AND b.TicketReq2 > 0
AND b.Gradterm = '201540')
WHERE c.gradterm = '201540'
AND c.TicketReq1 > 0
group by c.CeremonyDateTime1
This might be helpful to you.

Return First 4 Rows, then Repeat for Grouping

I am trying to return data that will ultimately populate a label.
Each label is going onto a box, and the box can only have 4 items in it.
If a delivery has more than 4 items, then I need one label per 4.
Each row of data returned will populate one label, so if the delivery contains 9 items, then I need 3 rows of data returned.
Below is my current query, which is returning all items into a comma separated value using Stuff.
I want it so the first 4 rows for the delivery return in the first row, then the next 4 in the second and so on.
My Field LineOrd returns correctly if there are more than 4 lines on the dispatch.
select Distinct
delivery_header.dh_datetime,
delivery_header.dh_number,
order_header.oh_order_number as 'Order No',
order_header_detail.ohd_delivery_name,
order_header_detail.ohd_delivery_address1,
order_header_detail.ohd_delivery_address2,
order_header_detail.ohd_delivery_address3,
order_header_detail.ohd_delivery_town,
order_header_detail.ohd_delivery_county,
order_header_detail.ohd_delivery_postcode,
order_header_detail.ohd_delivery_country,
STUFF((Select ', '+convert(varchar(50),convert(decimal(8,0),DL.dli_qty))+'x '+OLI.oli_description
from delivery_header DH join delivery_line_item DL on DL.dli_dh_id = DH.dh_id join order_line_item OLI on OLI.oli_id = DL.dli_oli_id
Outer APPLY
(select
case when DelCurLine.CurLine <= 4
then '1'
Else
Case when DelCurLine.CurLine <= 8
then '2'
Else '3'
End
End +'-'+order_header.oh_order_number as LineOrd) as StuffLineOrder
Where DH.dh_id = delivery_header.dh_id And StuffLineOrder.LineOrd = LineOrder.LineOrd
FOR XML PATH('')),1,1,'') as Items,
LineOrder.LineOrd
from delivery_header
join delivery_line_item on delivery_line_item.dli_dh_id = delivery_header.dh_id
join order_line_item on order_line_item.oli_id = delivery_line_item.dli_oli_id
join order_header on order_header.oh_id = order_line_item.oli_oh_id
join order_header_detail on order_header_detail.ohd_oh_id = order_header.oh_id
join variant_detail on variant_detail.vad_id = order_line_item.oli_vad_id
join stock_location on stock_location.sl_id = order_line_item.oli_sl_id
Outer APPLY
(select count(DLI.dli_id) CurLine from delivery_line_item DLI where DLI.dli_dh_id = delivery_header.dh_id and DLI.dli_id <= delivery_line_item.dli_id)
as DelCurLine
Outer APPLY
(select
case when DelCurLine.CurLine <= 4
then '1'
Else
Case when DelCurLine.CurLine <= 8
then '2'
Else '3'
End
End +'-'+order_header.oh_order_number as LineOrd) as LineOrder
Outer APPLY
(select convert(varchar(50),convert(decimal(8,0),delivery_line_item.dli_qty))+'x '+order_line_item.oli_description as LineName) as LineName
where
delivery_header.dh_datetime between @DateFrom and @DateTo
and stock_location.sl_id = @StockLoc
and (order_header.oh_order_number = @OrderNo or @AllOrder = 1)
order by
delivery_header.dh_datetime,
delivery_header.dh_number,
order_header.oh_order_number,
order_header_detail.ohd_delivery_name,
order_header_detail.ohd_delivery_address1,
order_header_detail.ohd_delivery_address2,
order_header_detail.ohd_delivery_address3,
order_header_detail.ohd_delivery_town,
order_header_detail.ohd_delivery_county,
order_header_detail.ohd_delivery_postcode,
order_header_detail.ohd_delivery_country
You can use ROW_NUMBER() and divide (row number - 1) by 4. Because both operands are integers, the division truncates the fractional part, which gives you a group number with a maximum of four rows in each group. You can then adjust your query to use this group number in a GROUP BY clause to collapse each group of rows into a single row.
Example:
SELECT RawData.BoxGroup,
MIN(dh_datetime),
MIN(dh_number),
MIN(oh_order_number) as 'Order No' -- columns come from RawData, so the original table prefix is dropped
--And so on
FROM
(SELECT BoxGroup = (ROW_NUMBER() OVER(ORDER BY (SELECT 1)) - 1) / 4,
*
FROM [TableNameOrQuery]) AS RawData
GROUP BY RawData.BoxGroup
Hope this helps.
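
Here is a runnable sketch of the same idea, using Python's built-in sqlite3 as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The `delivery_line` table and its nine rows are made up for the demo. Unlike the example above, it numbers the rows per delivery with PARTITION BY and orders them by a line id, which is an assumption about how the boxes should be filled.

```python
import sqlite3

# SQLite stands in for SQL Server; the table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE delivery_line (delivery_no INTEGER, line_id INTEGER, item TEXT)"
)
# One delivery with nine lines, so three labels are expected (4 + 4 + 1).
conn.executemany(
    "INSERT INTO delivery_line VALUES (?, ?, ?)",
    [(1, line_id, "item" + str(line_id)) for line_id in range(1, 10)],
)

labels = conn.execute("""
    SELECT delivery_no, box_group, COUNT(*) AS items_in_box
    FROM (
        SELECT delivery_no,
               item,
               -- rows 1-4 -> group 0, rows 5-8 -> group 1, row 9 -> group 2
               (ROW_NUMBER() OVER (PARTITION BY delivery_no
                                   ORDER BY line_id) - 1) / 4 AS box_group
        FROM delivery_line
    ) AS numbered
    GROUP BY delivery_no, box_group
    ORDER BY delivery_no, box_group
""").fetchall()

print(labels)  # one row per label
```

In the real query, the comma-separated item list would be built per (delivery, box_group) instead of the COUNT used here.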

TOP Returning null

I have the following view below. The second nested select is always returning null when I use the TOP(1) clause, but when I remove this clause it returns the data as expected, just more rows than is needed. Does anyone see anything that would explain this?
SELECT TOP (100) PERCENT
a.ITEMID AS Model
,id.CONFIGID
,id.INVENTSITEID AS SiteId
,id.INVENTSERIALID AS Serial
,it.ITEMNAME AS Description
,CASE WHEN it.DIMGROUPID LIKE '%LR-Y' THEN 'Y'
ELSE 'N'
END AS SerialNumberReqd
,ISNULL(it.PRIMARYVENDORID, N'') AS Vendor
,ISNULL(vt.NAME, N'') AS VendorName
,id.INVENTLOCATIONID AS Warehouse
,id.WMSLOCATIONID AS Bin
,ISNULL(CONVERT(varchar(12), CASE WHEN C.DatePhysical < '1901-01-01'
THEN NULL
ELSE C.DatePhysical
END, 101), N' ') AS DeliveryDate
,CASE WHEN (a.RESERVPHYSICAL > 0
OR C.StatusIssue = 1)
AND c.TransType = 0 THEN C.PONumber
ELSE ''
END AS SoNumber
,'' AS SoDetail
,ISNULL(C.PONumber, N'') AS RefNumber
,ISNULL(CONVERT(varchar(12), CASE WHEN ins.ProdDate < '1901-01-01'
THEN NULL
ELSE ins.PRODDATE
END, 101), N' ') AS DateReceived
,it.STKSTORISGROUPID AS ProdGroup
,ISNULL(CONVERT(varchar(12), CASE WHEN ins.ProdDate < '1901-01-01'
THEN NULL
ELSE ins.PRODDATE
END, 101), N' ') AS ProductionDate
,it.ITEMGROUPID
,it.STKSTORISGROUPID AS MerchandisingGroup
,CASE WHEN a.postedValue = 0
THEN (CASE WHEN D.CostAmtPosted = 0 THEN D.CostAmtPhysical
ELSE D.CostAmtPosted
END)
ELSE a.POSTEDVALUE
END AS Cost
,CASE WHEN a.PHYSICALINVENT = 0 THEN a.Picked
ELSE a.PhysicalInvent
END AS PhysicalOnHand
,ins.STKRUGSQFT AS RugSqFt
,ins.STKRUGVENDSERIAL AS RugVendSerial
,ins.STKRUGVENDDESIGN AS RugVendDesign
,ins.STKRUGEXACTSIZE AS RugExactSize
,ins.STKRUGCOUNTRYOFORIGIN AS RugCountryOfOrigin
,ins.STKRUGQUALITYID AS RugQualityId
,ins.STKRUGCOLORID AS RugColorId
,ins.STKRUGDESIGNID AS RugDesignId
,ins.STKRUGSHAPEID AS RugShapeId
,CASE WHEN (a.AVAILPHYSICAL > 0) THEN 'Available'
WHEN (id.WMSLOCATIONID = 'NIL') THEN 'Nil'
WHEN (a.RESERVPHYSICAL > 0)
AND (c.TransType = 0) THEN 'Committed'
WHEN (a.RESERVPHYSICAL > 0) THEN 'Reserved'
WHEN (id.WMSLOCATIONID LIKE '%-Q') THEN 'Damaged'
WHEN (a.Picked > 0) THEN 'Picked'
ELSE 'UNKNOWN'
END AS Status
,'' AS ReasonCode
,'' AS BaseModel
,ISNULL(CAST(ins.STKSTORISCONFIGINFO AS nvarchar(1000)), N'') AS StorisConfigInfo
,ISNULL(C.ConfigSummary, N'') AS ConfigSummary
FROM
dbo.INVENTSUM AS a WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS id WITH (NOLOCK)
ON id.DATAAREAID = a.DATAAREAID
AND id.INVENTDIMID = a.INVENTDIMID
LEFT OUTER JOIN dbo.INVENTTABLE AS it WITH (NOLOCK)
ON it.DATAAREAID = a.DATAAREAID
AND it.ITEMID = a.ITEMID
LEFT OUTER JOIN dbo.VENDTABLE AS vt WITH (NOLOCK)
ON vt.DATAAREAID = it.DATAAREAID
AND vt.ACCOUNTNUM = it.PRIMARYVENDORID
LEFT OUTER JOIN dbo.INVENTSERIAL AS ins WITH (NOLOCK)
ON ins.DATAAREAID = id.DATAAREAID
AND ins.INVENTSERIALID = id.INVENTSERIALID
LEFT OUTER JOIN (SELECT TOP (1)
itt.ITEMID
,invt.INVENTSERIALID
,itt.DATEPHYSICAL AS DatePhysical
,itt.TRANSREFID AS PONumber
,itt.TRANSTYPE AS TransType
,itt.STATUSISSUE AS StatusIssue
,dbo.stkRowsToColumn(itt.INVENTTRANSID, 'STI') AS ConfigSummary
,itt.RECID
FROM
dbo.INVENTTRANS AS itt WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS invt WITH (NOLOCK)
ON invt.DATAAREAID = itt.DATAAREAID
AND invt.INVENTDIMID = itt.INVENTDIMID
WHERE
(itt.DATAAREAID = 'STI')
AND (itt.TRANSTYPE IN (0, 2, 3, 8))
AND (invt.INVENTSERIALID <> '')
ORDER BY
itt.RECID DESC) AS C
ON C.ITEMID = a.ITEMID
AND C.INVENTSERIALID = id.INVENTSERIALID
LEFT OUTER JOIN (SELECT TOP (1)
itt2.ITEMID
,invt2.INVENTSERIALID
,itt2.COSTAMOUNTPOSTED AS CostAmtPosted
,itt2.COSTAMOUNTPHYSICAL + itt2.COSTAMOUNTADJUSTMENT AS CostAmtPhysical
,itt2.RECID
FROM
dbo.INVENTTRANS AS itt2 WITH (NOLOCK)
INNER JOIN dbo.INVENTDIM AS invt2 WITH (NOLOCK)
ON invt2.DATAAREAID = itt2.DATAAREAID
AND invt2.INVENTDIMID = itt2.INVENTDIMID
WHERE
(itt2.DATAAREAID = 'STI')
AND (itt2.TRANSTYPE IN (0, 2, 3, 4, 6, 8))
AND (invt2.INVENTSERIALID <> '')
ORDER BY
itt2.RECID DESC) AS D
ON D.ITEMID = a.ITEMID
AND D.INVENTSERIALID = id.INVENTSERIALID
WHERE
(a.DATAAREAID = 'STI')
AND (a.CLOSED = 0)
AND (a.PHYSICALINVENT > 0)
AND (it.ITEMGROUPID LIKE 'FG-%'
OR it.ITEMGROUPID = 'MULTISHIP')
ORDER BY
SiteId
,Warehouse
Presumably, the top value in the subquery doesn't meet the subsequent join conditions. That is, this condition is not met:
D.ITEMID = a.ITEMID AND D.INVENTSERIALID = id.INVENTSERIALID
You are using a left outer join, so NULL values are filled in.
EDIT:
To reiterate: when you run it with top 1, there are no matching values (for at least some combinations of the two variables). So, NULL will be filled in for these values. After all, top 1 (with or without the parentheses) returns only one row.
When you run it returning multiple rows, presumably there are matches. For the rows that match, the corresponding values are put in. This is the way that left outer join works.
Gordon's answer is correct as to why I was getting a few rows when removing top and none when I had it. The subquery in question was returning all the rows in the InventTrans table (5 million+) so when I used top, it was just getting the first row which didn't have anything. I realized this was the case when I was trying random high values (e.g 50000) in the TOP clause.
The ultimate fix was to change the left outer joins on the C and D subqueries to Cross Apply, and then change the where clauses to better filter the table (e.g itt.itemid = a.itemid and invt1.inventserialid = id.inventserialid). Using that, I was able to use TOP 1 as expected.
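
The difference between "one top row for the whole table" and "one top row per outer row" can be sketched as follows. This uses Python's built-in sqlite3 as a stand-in for SQL Server, with invented `item` and `trans` tables. SQLite has no TOP or CROSS APPLY, so LIMIT 1 and a correlated subquery play those roles here.

```python
import sqlite3

# SQLite stands in for SQL Server; tables and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE item (item_id TEXT PRIMARY KEY);
CREATE TABLE trans (rec_id INTEGER PRIMARY KEY, item_id TEXT, po_number TEXT);
INSERT INTO item VALUES ('A'), ('B');
INSERT INTO trans VALUES (1, 'A', 'PO-1'), (2, 'A', 'PO-2'), (3, 'B', 'PO-3');
""")

# LIMIT 1 inside a derived table picks a single row for the whole table
# (the newest transaction, which belongs to 'B'), so 'A' finds no match
# and the LEFT JOIN fills in NULL.
global_top = conn.execute("""
    SELECT i.item_id, t.po_number
    FROM item i
    LEFT JOIN (SELECT item_id, po_number
               FROM trans
               ORDER BY rec_id DESC
               LIMIT 1) t ON t.item_id = i.item_id
    ORDER BY i.item_id
""").fetchall()

# Correlating the lookup with the outer row (what CROSS APPLY / OUTER APPLY
# with TOP 1 does in T-SQL) picks the newest transaction per item.
per_item_top = conn.execute("""
    SELECT i.item_id,
           (SELECT t.po_number
            FROM trans t
            WHERE t.item_id = i.item_id
            ORDER BY t.rec_id DESC
            LIMIT 1)
    FROM item i
    ORDER BY i.item_id
""").fetchall()

print(global_top)    # 'A' gets None
print(per_item_top)  # every item gets its own latest PO
```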