How to use group by only for some columns in sql Query? - sql

The following query returns 550 records, which I am then grouping by some columns in the controller via linq. However, how can I achieve the "group by" logic in the SQL query itself? Additionally, post-grouping, I need to show only 150 results to the user.
Current SQL query:
SELECT DISTINCT
l.Id AS LoadId
, l.LoadTrackingNumber AS LoadDisplayId
, planningType.Text AS PlanningType
, loadStatus.Id AS StatusId
, loadWorkRequest.Id AS LoadRequestId
, loadStatus.Text AS Status
, routeIds.RouteIdentifier AS RouteName
, planRequest.Id AS PlanId
, originPartyRole.Id AS OriginId
, originParty.Id AS OriginPartyId
, originParty.LegalName AS Origin
, destinationPartyRole.Id AS DestinationId
, destinationParty.Id AS DestinationPartyId
, destinationParty.LegalName AS Destination
, COALESCE(firstSegmentLocation.Window_Start, originLocation.Window_Start) AS StartDate
, COALESCE(firstSegmentLocation.Window_Start, originLocation.Window_Start) AS BeginDate
, destLocation.Window_Finish AS EndDate
AS Number
FROM Domain.Loads (NOLOCK) AS l
INNER JOIN dbo.Lists (NOLOCK) AS loadStatus ON l.LoadStatusId = loadStatus.Id
INNER JOIN Domain.Routes (NOLOCK) AS routeIds ON routeIds.Id = l.RouteId
INNER JOIN Domain.BaseRequests (NOLOCK) AS loadWorkRequest ON loadWorkRequest.LoadId = l.Id
INNER JOIN Domain.BaseRequests (NOLOCK) AS planRequest ON planRequest.Id = loadWorkRequest.ParentWorkRequestId
INNER JOIN Domain.Schedules AS planSchedule ON planSchedule.Id = planRequest.ScheduleId
INNER JOIN Domain.Segments (NOLOCK) os on os.RouteId = routeIds.Id AND os.[Order] = 0
INNER JOIN Domain.LocationDetails (NOLOCK) AS originLocation ON originLocation.Id = os.DestinationId
INNER JOIN dbo.EntityRoles (NOLOCK) AS originPartyRole ON originPartyRole.Id = originLocation.DockRoleId
INNER JOIN dbo.Entities (NOLOCK) AS originParty ON originParty.Id = originPartyRole.PartyId
INNER JOIN Domain.LocationDetails (NOLOCK) AS destLocation ON destLocation.Id = routeIds.DestinationFacilityLocationId
INNER JOIN dbo.EntityRoles (NOLOCK) AS destinationPartyRole ON destinationPartyRole.Id = destLocation.DockRoleId
INNER JOIN dbo.Entities (NOLOCK) AS destinationParty ON destinationParty.Id = destinationPartyRole.PartyId
INNER JOIN dbo.TransportationModes (NOLOCK) lictm on lictm.Id = l.LoadInstanceCarrierModeId
INNER JOIN dbo.EntityRoles (NOLOCK) AS carrierPartyRole ON lictm.CarrierId = carrierPartyRole.Id
INNER JOIN dbo.Entities (NOLOCK) AS carrier ON carrierPartyRole.PartyId = carrier.Id
INNER JOIN dbo.EntityRoles (NOLOCK) AS respPartyRole ON l.ResponsiblePartyId = respPartyRole.Id
INNER JOIN dbo.Entities (NOLOCK) AS respParty ON respPartyRole.PartyId = respParty.Id
INNER JOIN Domain.LoadOrders (NOLOCK) lo ON lo.LoadInstanceId = l.Id
INNER JOIN Domain.Orders (NOLOCK) AS o ON lo.OrderInstanceId = o.Id
INNER JOIN Domain.BaseRequests (NOLOCK) AS loadRequest ON loadRequest.LoadId = l.Id
--Load Start Date
LEFT JOIN Domain.Segments (NOLOCK) AS segment ON segment.RouteId = l.RouteId AND segment.[Order] = 0
LEFT JOIN Domain.LocationDetails (NOLOCK) AS firstSegmentLocation ON firstSegmentLocation.Id = segment.DestinationId
LEFT JOIN dbo.Lists (NOLOCK) AS planningType ON l.PlanningTypeId = planningType.Id
LEFT JOIN dbo.EntityRoles (NOLOCK) AS billToRole ON o.BillToId = billToRole.Id
LEFT JOIN dbo.Entities (NOLOCK) AS billTo ON billToRole.PartyId = billTo.Id
WHERE o.CustomerId in (34236) AND originLocation.Window_Start >= '07/19/2015 00:00:00' AND originLocation.Window_Start < '07/25/2015 23:59:59' AND l.IsHistoricalLoad = 0
AND loadStatus.Id in (285, 286,289,611,290)
AND loadWorkRequest.ParentWorkRequestId IS NOT NULL
AND routeIds.RouteIdentifier IS NOT NULL
AND (planSchedule.EndDate IS NULL OR (planSchedule.EndDate is not null and CAST(CONVERT(varchar(10), planSchedule.EndDate,101) as datetime) > CAST(CONVERT(varchar(10),GETDATE(),101) as datetime))) ORDER BY l.Id DESC
linq:
//Get custom grouped data
var loadRequest = (from lq in returnList
let loadDisplayId = lq.LoadDisplayId
let origin = lq.OriginId //get this origin for route
let destination = lq.DestinationId // get this destination for route
group lq by new
{
RouteId = lq.RouteName,
PlanId = lq.PlanId,
Origin = lq.OriginId,
Destination = lq.DestinationId
}
into grp
select new
{
RouteId = grp.Key.RouteId,
PlanId = grp.Key.PlanId,
Origin = grp.Key.Origin,
Destination = grp.Key.Destination,
Loads = (from l in grp select l)
}).OrderBy(x => x.Origin).ToList();

I'm guessing you want to Group By column 1 but include columns 2 and 3 in your Select. Using a Group By you cannot do this. However, you can do this using a T-SQL Windowing function using the OVER() operator. Since you don't say how you want to aggregate, I cannot provide an example. But look at T-SQL Windowing functions. This article might help you get started.

One important thing you need to understand about GROUP BY is that you must assume that there are multiple values in every column outside of the GROUP BY list. In your case, you must assume that for each value of Column1 there would be multiple values of Column2 and Column3, all considered as a single group.
If you want your query to process any of these columns, you must specify what to do about these multiple values.
Here are some choices you have:
Pick the smallest or the largest value for a column in a group - use MIN(...) or MAX(...) aggregator for that
Count non-NULL items in a group - use COUNT(...)
Produce an average of non-NULL values in a group - use AVG(...)
For example, if you would like to find the smallest Column2 and an average of Column3 for each value of Column1, your query would look like this:
select
Column1, MIN(Column2), AVG(Column3)
from
TableName
group by
Column1

Related

Getting extra Row containing Null Value for specific Names

In my Query if CompanyName has Coamount = NULL then Final should be as of Contract amount. If CompanyName has Coamount then Final would be ContractAmount + Coamount.
I am getting all these answers but the query is also returning me rows which has NULL value in Coamount for those company name in which coamount is already present.
for eg: Row1, why is that row coming when I am getting row 2 as my answer according to my query.
Here is the code:
SELECT DISTINCT
HBRegions.RegionName,PropertyGroups.Name AS Owner,Properties.EntityID, Properties.Address,Vendors.CompanyName,
SubContractorDrawDetails.InvoiceUploadDate,SubContractorDrawDetails.InvoiceAmount,SubContractorDrawDetails.PaymentDate,
LienWaivers.FinalLienWaiverRcvdDate,SubContractorDrawDetails.PaymentAmount, subcontractors.ContractAmount,cc.Coamount ,
(subcontractors.ContractAmount + cc.Coamount ) AS ActualContractAmount,
case when cc.Coamount is null then SubContractors.ContractAmount
when CC.Coamount is not null then (subcontractors.ContractAmount + cc.Coamount ) end as Final
FROM Users inner JOIN
Vendors ON Users.UserId = Vendors.UserID inner JOIN
LienWaivers ON Users.UserId = LienWaivers.UserID inner JOIN
Constructions ON LienWaivers.ConstructionID = Constructions.ConstructionId inner JOIN
Properties ON Constructions.HBId = Properties.HBId inner JOIN
ChangeOrderDetails ON Constructions.ConstructionId = ChangeOrderDetails.ConstructionId left JOIN
ChangeOrderLineItems ON Users.UserId = ChangeOrderLineItems.SubContractorId AND
ChangeOrderDetails.ChangeOrderId = ChangeOrderLineItems.ChangeOrderId
left join
(select sum(ChangeOrderDetails.ChangeOrderApprovedAmountContractor) as Coamount, ChangeOrderLineItems.SubContractorId,Constructions.ConstructionId
from ChangeOrderDetails inner join ChangeOrderLineItems
on ChangeOrderDetails.ChangeOrderId = ChangeOrderLineItems.ChangeOrderId
inner join Users on ChangeOrderLineItems.SubContractorId = Users.UserId
left join Constructions on Constructions.ConstructionId = ChangeOrderDetails.ConstructionId
-- left join Properties on Properties.HBId = Constructions.HBId
where ChangeOrderDetails.ChangeOrderApprovedAmountContractor is not null
group by ChangeOrderLineItems.SubContractorId,Constructions.ConstructionId
) AS cc on cc.SubContractorId = ChangeOrderLineItems.SubContractorId and cc.ConstructionId = Constructions.ConstructionId
inner join HBRegions ON Properties.RegionId = HBRegions.RegionID inner JOIN
PropertyGroups ON Properties.PropertyGroup = PropertyGroups.PropertyGroupId
inner join SubContractors ON Users.UserId = SubContractors.UserID AND Properties.HBId = SubContractors.HBId
inner join SubContractorDrawDetails ON SubContractors.SubContractorId = SubContractorDrawDetails.SubContractorId
WHERE (Properties.Address = '470 ROYCROFT BLVD BUFFALO NY 14225')[enter image description here][1]

SQL - SUM TotalValue returning a null with LEFT JOIN Clause

I'd to like SUM a TotalValue column based on Vendor and logged in user. I completely returning a right value of other info in logged user and the connected vendor, the problem is the SUM of TotalValue column is returning a null value. am I missing something?
This is what I've already tried:
SELECT ,v.VendorName ,
u.Product ,
v.[Description] ,
v.Status ,
SUM(cpm.TotalValue) AS TotalValue
FROM Vendor v
LEFT JOIN [ProductContract] c ON v.VendorId = c.VendorId
AND c.[Status] = 4
AND c.ProductContractId IN
(SELECT con.ProductContractId
FROM [ProductContract] con
INNER JOIN [ProductContractPermission] cp ON cp.ProductContractId = con.ProductContractId
WHERE cp.UserInfoId = #UserInfoId)
LEFT JOIN ProductContractPaymentMenu cpm ON c.ProductContractId = cpm.ProductContractId
AND c.[Status] = 4
AND c.VendorId = #VendorId
LEFT JOIN VendorContact vc ON v.VendorId = vc.VendorId
AND vc.[Type] = 1
LEFT JOIN UserInfo u ON vc.UserInfoId = u.UserInfoId
WHERE v.VendorId IN
(SELECT VendorId
FROM ClientVendor
WHERE ClientId = #VendorId)
GROUP BY v.VendorName,
u.Product,
v.[Description],
v.Status,
cpm.TotalValue
ORDER BY v.[Status],
v.CreatedOn
Seems that you want to apply an aggregate filter, this is the famous HAVING clause:
...
GROUP BY v.VendorName,
u.Product,
v.[Description],
v.Status,
cpm.TotalValue
HAVING
SUM(cpm.TotalValue) > 0
ORDER BY v.[Status],
v.CreatedOn
Hi i think you have to add ISNULL(value,defaultvalue) because cpm table was on left join and it can be null.
SELECT v.VendorName ,
u.Product ,
v.[Description] ,
v.Status ,
SUM(ISNULL(cpm.TotalValue,0)) AS TotalValue
FROM Vendor v
LEFT JOIN [ProductContract] c ON v.VendorId = c.VendorId
AND c.[Status] = 4
AND c.ProductContractId IN
(SELECT con.ProductContractId
FROM [ProductContract] con
INNER JOIN [ProductContractPermission] cp ON cp.ProductContractId = con.ProductContractId
WHERE cp.UserInfoId = #UserInfoId)
LEFT JOIN ProductContractPaymentMenu cpm ON c.ProductContractId = cpm.ProductContractId
AND c.[Status] = 4
AND c.VendorId = #VendorId
LEFT JOIN VendorContact vc ON v.VendorId = vc.VendorId
AND vc.[Type] = 1
LEFT JOIN UserInfo u ON vc.UserInfoId = u.UserInfoId
WHERE v.VendorId IN
(SELECT VendorId
FROM ClientVendor
WHERE ClientId = #VendorId)
GROUP BY v.VendorName,
u.Product,
v.[Description],
v.Status,
cpm.TotalValue
ORDER BY v.[Status],
v.CreatedOn
https://learn.microsoft.com/en-us/sql/t-sql/functions/isnull-transact-sql?view=sql-server-2017
Edit: other point you have miss ',' in your query after the SELECT.
Your left join on ProductContractPaymentMenu may not always get an item, so cpm.TotalValue can be NULL sometimes. When you use SUM and when one value is NULL then the result will be NULL. You might rewrite that part as:
SUM(ISNULL(cpm.TotalValue, 0)) AS TotalValue
In that case, it will treat non-existing records as value 0.

SQL: Linking a Count to a specific value through multiple tables

I'm trying to link a COUNT to a specific value across several tables in a SQL Server Database. In this case the tables only share values through correlation. I am returning the values I want but the COUNT is counting everything in a given project not just the ones linked to their work items.
SELECT
[d].[Id]
,COUNT([t].[ItemId]) AS ItemCount
,[d].[ItemName]
FROM
[dbo].[Project_Map] [rm] WITH (NOLOCK)
INNER JOIN
[dbo].[WorkProjects] [r] WITH (NOLOCK)
ON [r].[DomainId] = [rm].[DomainId]
AND [r].[ProjectId] = [rm].[ProjectId]
AND [r].[ReleaseId] = [rm].[ReleaseId]
INNER JOIN
[dbo].[Items] [d] WITH (NOLOCK)
ON [d].[DomainId] = [r].[DomainId]
AND [d].[ProjectId] = [r].[ProjectId]
AND [d].[ReleaseId] = [r].[ReleaseId]
INNER JOIN [dbo].[Projects] [p] with (NOLOCK)
ON r.DomainId = p.DomainId
AND r.ProjectId = p.ProjectId
INNER JOIN [dbo].[Tests] [t] with (NOLOCK)
ON p.DomainId = t.DomainId
AND p.ProjectId = t.ProjectId
INNER JOIN
(
SELECT [Id], MAX([LastModifiedDate]) AS MostRecent
FROM Items
Group By [Id]
) AS updatedItem
ON updatedItem.Id = d.Id
INNER JOIN
[dbo].[WorkItemStates] [ds] WITH (NOLOCK)
ON [ds].[ItemStateName] = [d].[ItemStatus]
WHERE
d.Id = 111111
AND d.UserCategory Like 'SOMESTRING'
GROUP BY d.Id, d.ItemName
RETURNS: In this case the count should be 1 but it returns the count for the entire project.
ID COUNT ITEMNAME
86 5169 SOME NAME
173 5169 SOME NAME
170 5169 SOME NAME
Am I missing a join somewhere?
Currently, your counts are counting all JOIN instances and not just distinct Item level records. Consider turning your Item unit level join into an aggregate query join and include the count field in outer grouping:
Specifically, change:
INNER JOIN [dbo].[Tests] [t] with (NOLOCK)
ON p.DomainId = t.DomainId
AND p.ProjectId = t.ProjectId
Into:
INNER JOIN
(SELECT t.DomaindId, t.ProjectId, Count(*) As ItemCount
FROM [dbo].[Tests] t
GROUP BY t.DomaindId, t.ProjectId) agg
ON p.DomainId = agg.DomainId
AND p.ProjectId = agg.ProjectId
And then the outer query structure becomes:
SELECT
[d].[Id]
,agg.ItemCount
,[d].[ItemName]
FROM
...
GROUP BY
[d].[Id]
,agg.ItemCount
,[d].[ItemName]
Interestingly, you already do such an aggregate query join but never use that derived table updateItem or the field MostRecent.

why selecting particular columns from same table slows down query performance significantly?

I have SELECT statement that querying columns from tblQuotes. Why if I am selecting columns a.ProducerCompositeCommission and a.CompanyCompositeCommission, then query spinning forever.
Execution plans with and without those columns are IDENTICAL!
If I commented them out - then it brings result for 1 second.
SELECT
a.stateid risk_state1,
--those columns slows down performance
a.ProducerCompositeCommission,
a.CompanyCompositeCommission,
GETDATE() runDate
FROM
tblQuotes a
INNER JOIN
lstlines l ON a.LineGUID = l.LineGUID
INNER JOIN
tblSubmissionGroup tsg ON tsg.SubmissionGroupGUID = a.SubmissionGroupGuid
INNER JOIN
tblUsers u ON u.UserGuid = tsg.UnderwriterUserGuid
INNER JOIN
tblUsers u2 ON u2.UserGuid = a.UnderwriterUserGuid
LEFT OUTER JOIN
tblFin_Invoices tfi ON tfi.QuoteID = a.QuoteID AND tfi.failed <> 1
INNER JOIN
lstPolicyTypes lpt ON lpt.policytypeid = a.policytypeid
INNER JOIN
tblproducercontacts prodC ON prodC.producercontactguid = a.producercontactguid
INNER JOIN
tblProducerLocations pl ON pl.producerlocationguid = prodc.producerlocationguid
INNER JOIN
tblproducers prod ON prod.ProducerGUID = pl.ProducerGUID
LEFT OUTER JOIN
Catalytic_tbl_Model_Analysis aia ON aia.ImsControl = a.controlno
AND aia.analysisid = (SELECT TOP 1 tma2.analysisid
FROM Catalytic_tbl_Model_Analysis tma2
WHERE tma2.imscontrol = a.controlno)
LEFT OUTER JOIN
Catalytic_tbl_RDR_Analysis rdr ON rdr.ImsControl = a.controlno
AND rdr.analysisid = (SELECT TOP 1 tma2.analysisid
FROM Catalytic_tbl_RDR_Analysis tma2
WHERE tma2.imscontrol = a.controlno)
LEFT OUTER JOIN
tblProducerContacts mnged ON mnged.producercontactguid = ProdC.ManagedBy
LEFT OUTER JOIN
lstQuoteStatusReasons r1 ON r1.id = a.QuoteStatusReasonID
WHERE
l.LineName = 'EARTHQUAKE'
AND CAST(a.EffectiveDate AS DATE) >= CAST('2017-01-01' AS DATE)
AND CAST(a.EffectiveDate AS DATE) <= CAST('2017-12-31' AS DATE)
ORDER BY
a.effectiveDate
The execution plan can be found here:
https://www.brentozar.com/pastetheplan/?id=rJawDkTx-
I ran sp_help and this is what I see:
What exactly wrong with those columns?
I dont use them in a JOIN or anything. Why such bahaviour?
Table Size:
Indexes on table tblQuotes

using a subquery as a column that's using another column in the 'where' clause

I probably messed that title up! So I have a column called "programID" and I want another column to tell me the most recent date an order was placed using the programID. I think I'd need a subcolumn but the problem is how do I use the row's programID so that it matches the programID for the same row?
DECLARE #ClientIDs varchar(4) = 6653;
DECLARE #ProgramStatus char(1) = 'Y'; -- Y, N, or R
SELECT
rcp.RCPID AS ProgramID
, rcp.RCPName AS Program
, rcp.RCPActive AS ProgramStatus
, aa.AACustomCardInternalReview AS VCCP
, aa.AAJobNo AS AAJobNo
, aas.AASName AS AAStatus
, rcp.RCPOpsApproved AS OpsApproved
, clt.CltID AS ClientID
, rcp.RCPSignatureRequired AS SignatureRequired
, st.STEnumValue AS DefaultShipType
, rcp.RCPShipMethodOverrideType AS ShipTypeOverride
,aa.AANetworkProgramID
,(Select max(cdconfirmationdate) from carddet where ) --can't figure this part out
FROM
RetailerCardProgram rcp WITH (NOLOCK)
INNER JOIN ClientRetailerMap crm WITH (NOLOCK)
ON crm.CRMRetailerID = rcp.RCPRetailerID
INNER JOIN Client clt WITH(NOLOCK)
ON clt.CltID = crm.CRMCltID
LEFT JOIN AssociationApproval aa WITH (NOLOCK)
ON aa.AARetailerID = rcp.RCPRetailerID
AND aa.AABin = rcp.RCPBin6
AND aa.AAFrontOfPlasticTemplateID = rcp.RCPFOCTemplateID
AND aa.AABackOfPlasticTemplateID = rcp.RCPBOCTemplateID
AND ISNULL(aa.AACardID, 0) = ISNULL(rcp.RCPDefaultPlasticCardID, 0)
-- AND LOWER(rcp.RCPName) NOT LIKE '%do not use%' -- Needed for AA Job Number 1594
LEFT JOIN AssociationApprovalStatus aas WITH (NOLOCK)
ON aas.AASID = aa.AAAssociationApprovalStatusID
LEFT JOIN OpenLoopAssociation ola WITH (NOLOCK)
ON ola.OLAID=rcp.RCPOLAID
LEFT JOIN ClientCardProgramMap ccpm WITH (NOLOCK)
ON ccpm.CCPMCardProgramID = rcp.RCPID
AND ccpm.CCPMClientID = clt.CltID
LEFT JOIN TippingModule tm WITH (NOLOCK)
ON tm.TMid = rcp.RCPTippingModuleID
LEFT JOIN GiftCardTemplate fgt WITH (NOLOCK)
ON fgt.gtid = rcp.RCPFOCTemplateID
AND fgt.GTPage='P'
LEFT JOIN GiftCardTemplate bgt WITH (NOLOCK)
ON bgt.gtid = rcp.RCPBOCTemplateID
AND bgt.GTPage='PB'
LEFT JOIN Card c WITH (NOLOCK)
ON c.CardID = rcp.RCPDefaultCarrierID
LEFT JOIN CardType ct WITH (NOLOCK)
ON ct.CTID = c.CardTypeID
LEFT JOIN RetailerCardProgramTCSheetMap rtm1 WITH (NOLOCK)
ON rtm1.RTMRCPID = rcp.RCPID
AND rtm1.RTMInsertOrder = 1
LEFT JOIN RetailerCardProgramTCSheetMap rtm2 WITH (NOLOCK)
ON rtm2.RTMRCPID = rcp.RCPID
AND rtm2.RTMInsertOrder = 2
LEFT JOIN RetailerCardProgramTCSheetMap rtm3 WITH (NOLOCK)
ON rtm3.RTMRCPID = rcp.RCPID
AND rtm3.RTMInsertOrder = 3
LEFT JOIN RetailerCardProgramTCSheetMap rtm4 WITH (NOLOCK)
ON rtm4.RTMRCPID = rcp.RCPID
AND rtm4.RTMInsertOrder = 4
LEFT JOIN TCSheet i1 WITH (NOLOCK)
ON i1.TCSID = rtm1.RTMTCSID
LEFT JOIN TCSheet i2 WITH (NOLOCK)
ON i2.TCSID = rtm2.RTMTCSID
LEFT JOIN TCSheet i3 WITH (NOLOCK)
ON i3.TCSID = rtm3.RTMTCSID
LEFT JOIN TCSheet i4 WITH (NOLOCK)
ON i4.TCSID = rtm4.RTMTCSID
LEFT JOIN ShipType st WITH (NOLOCK)
ON st.STId = rcp.RCPDefaultShipTypeID
WHERE
clt.CltID IN (#ClientIDs) -- 6653 and 6657.
AND rcp.RCPActive IN (#ProgramStatus)
ORDER BY
AAJobNo
, Program
You want to join with a nested select on the table carddet. I'm inferring that RCPID is the relationship between carddet and your main table RetainerCardProgram...
SELECT rcp.RCPID AS ProgramID,
date.MAXDATE AS MaxDate,
rest of your columns...
FROM RetailerCardProgram rcp WITH (NOLOCK)
INNER JOIN (
SELECT RCPID, MAX(cdconfirmationdate) as 'MAXDATE'
FROM carddet
GROUP BY RCPID
) date on date.RCPID = rcp.RCPID
rest of query...
You may want a left join if not all IDs have a date in carddet.
Obligatory addition: Also your use of NOLOCK is a bit terrifying. See http://blogs.sqlsentry.com/aaronbertrand/bad-habits-nolock-everywhere/