SQL Server query returning duplicate rows

I have a SQL query that joins multiple tables. It is returning duplicate rows, and after hours of going through it I can't find out where it's going wrong.
SELECT
StkItem.iUOMStockingUnitID,
_etblUnits1.cUnitCode as 'parkSize',
_etblUnits2.cUnitCode as 'quantitySize',
InvNum.fInvTotExclForeign,
[_btblInvoiceLines].*,
[_rtblCountry].cCountryName,
[CurrencyHist].fBuyRate,
Vendor.Name,
InvNum.OrderDate,
InvNum.InvNumber
FROM
[dbo].[_btblInvoiceLines]
LEFT JOIN
StkItem ON StkItem.StockLink = [_btblInvoiceLines].iStockCodeID
LEFT JOIN
_etblUnits as _etblUnits1 ON _etblUnits1.idunits = StkItem.iUOMDefSellUnitID
LEFT JOIN
_etblUnits as _etblUnits2 ON _etblUnits2.idunits = StkItem.iUOMStockingUnitID
LEFT JOIN
InvNum ON iInvoiceID = AutoIndex
LEFT JOIN
Vendor ON Vendor.DCLink = InvNum.AccountID
LEFT JOIN
[_rtblCountry] ON [_rtblCountry].idCountry = Vendor.iCountryID
LEFT JOIN
[CurrencyHist] ON InvNum.ForeignCurrencyID = [CurrencyHist].iCurrencyID
WHERE
OrderNum = ''
AND [CurrencyHist].iCurrencyID = (SELECT TOP 1 iCurrencyID
FROM [CurrencyHist]
WHERE iCurrencyID = InvNum.ForeignCurrencyID
ORDER BY idCurrencyHist DESC)
That is the query. Any help will be highly appreciated. Thanks in advance.

From your previous comments, the problem appears when you join [CurrencyHist]. Judging by the name, it is a history table, so it probably holds multiple rows per currency. To eliminate the duplicate rows, join only to the most recently updated record for each currency. Your query could look like this:
SELECT StkItem.iUOMStockingUnitID,
_etblUnits1.cUnitCode as 'parkSize',
_etblUnits2.cUnitCode as 'quantitySize',
InvNum.fInvTotExclForeign,
[_btblInvoiceLines].*,
[_rtblCountry].cCountryName,
[CurrencyHist].fBuyRate,
Vendor.Name,
InvNum.OrderDate,
InvNum.InvNumber
FROM [dbo].[_btblInvoiceLines]
LEFT JOIN StkItem ON StkItem.StockLink = [_btblInvoiceLines].iStockCodeID
LEFT JOIN _etblUnits as _etblUnits1 ON _etblUnits1.idunits = StkItem.iUOMDefSellUnitID
LEFT JOIN _etblUnits as _etblUnits2 ON _etblUnits2.idunits = StkItem.iUOMStockingUnitID
LEFT JOIN InvNum ON iInvoiceID = AutoIndex
LEFT JOIN Vendor ON Vendor.DCLink = InvNum.AccountID
LEFT JOIN [_rtblCountry] ON [_rtblCountry].idCountry = Vendor.iCountryID
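-- ROW_NUMBER() yields exactly one rn = 1 row per currency, even if LastUpdated values tie
LEFT JOIN (SELECT ROW_NUMBER() OVER (PARTITION BY [CurrencyHist].iCurrencyID ORDER BY [CurrencyHist].LastUpdated DESC) AS rn,
[CurrencyHist].iCurrencyID AS 'iCurrencyID',
[CurrencyHist].fBuyRate AS 'fBuyRate' -- needed because the outer SELECT reads fBuyRate
FROM [CurrencyHist] AS [CurrencyHist]
) [CurrencyHist] ON InvNum.ForeignCurrencyID = [CurrencyHist].iCurrencyID
AND [CurrencyHist].rn = 1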
WHERE OrderNum = '' AND
[CurrencyHist].iCurrencyID = (SELECT TOP 1 iCurrencyID
FROM [CurrencyHist]
WHERE iCurrencyID = InvNum.ForeignCurrencyID
ORDER BY idCurrencyHist DESC)
Note: I have assumed that the CurrencyHist table has a LastUpdated column of type DateTime.
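To see why joining a history table multiplies rows, and how ranking the history rows removes the duplicates, here is a minimal sketch. It is not the original schema: the tables invoice and currency_hist and their columns are hypothetical, and it runs on SQLite through Python's standard sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical toy schema; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE invoice (inv_id INTEGER, currency_id INTEGER);
CREATE TABLE currency_hist (hist_id INTEGER, currency_id INTEGER, buy_rate REAL);
INSERT INTO invoice VALUES (1, 10), (2, 20);
INSERT INTO currency_hist VALUES (1, 10, 1.10), (2, 10, 1.15), (3, 20, 0.80);
""")

# Plain join: invoice 1 matches two history rows, so it appears twice.
naive_rows = conn.execute("""
    SELECT i.inv_id, h.buy_rate
    FROM invoice AS i
    LEFT JOIN currency_hist AS h ON h.currency_id = i.currency_id
    ORDER BY i.inv_id, h.hist_id
""").fetchall()

# Rank history rows per currency, newest first, and join only rank 1.
latest_rows = conn.execute("""
    SELECT i.inv_id, h.buy_rate
    FROM invoice AS i
    LEFT JOIN (
        SELECT currency_id, buy_rate,
               ROW_NUMBER() OVER (PARTITION BY currency_id
                                  ORDER BY hist_id DESC) AS rn
        FROM currency_hist
    ) AS h ON h.currency_id = i.currency_id AND h.rn = 1
    ORDER BY i.inv_id
""").fetchall()

print(naive_rows)   # three rows: invoice 1 is duplicated
print(latest_rows)  # two rows: one per invoice
```

The rank filter sits in the ON clause rather than in WHERE, so an invoice whose currency has no history rows would still be returned, with a NULL rate.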

Related

Why does selecting particular columns from the same table slow down query performance significantly?

I have a SELECT statement that queries columns from tblQuotes. Why, when I select the columns a.ProducerCompositeCommission and a.CompanyCompositeCommission, does the query spin forever?
The execution plans with and without those columns are IDENTICAL!
If I comment them out, the result comes back in 1 second.
SELECT
a.stateid risk_state1,
--these columns slow down performance
a.ProducerCompositeCommission,
a.CompanyCompositeCommission,
GETDATE() runDate
FROM
tblQuotes a
INNER JOIN
lstlines l ON a.LineGUID = l.LineGUID
INNER JOIN
tblSubmissionGroup tsg ON tsg.SubmissionGroupGUID = a.SubmissionGroupGuid
INNER JOIN
tblUsers u ON u.UserGuid = tsg.UnderwriterUserGuid
INNER JOIN
tblUsers u2 ON u2.UserGuid = a.UnderwriterUserGuid
LEFT OUTER JOIN
tblFin_Invoices tfi ON tfi.QuoteID = a.QuoteID AND tfi.failed <> 1
INNER JOIN
lstPolicyTypes lpt ON lpt.policytypeid = a.policytypeid
INNER JOIN
tblproducercontacts prodC ON prodC.producercontactguid = a.producercontactguid
INNER JOIN
tblProducerLocations pl ON pl.producerlocationguid = prodc.producerlocationguid
INNER JOIN
tblproducers prod ON prod.ProducerGUID = pl.ProducerGUID
LEFT OUTER JOIN
Catalytic_tbl_Model_Analysis aia ON aia.ImsControl = a.controlno
AND aia.analysisid = (SELECT TOP 1 tma2.analysisid
FROM Catalytic_tbl_Model_Analysis tma2
WHERE tma2.imscontrol = a.controlno)
LEFT OUTER JOIN
Catalytic_tbl_RDR_Analysis rdr ON rdr.ImsControl = a.controlno
AND rdr.analysisid = (SELECT TOP 1 tma2.analysisid
FROM Catalytic_tbl_RDR_Analysis tma2
WHERE tma2.imscontrol = a.controlno)
LEFT OUTER JOIN
tblProducerContacts mnged ON mnged.producercontactguid = ProdC.ManagedBy
LEFT OUTER JOIN
lstQuoteStatusReasons r1 ON r1.id = a.QuoteStatusReasonID
WHERE
l.LineName = 'EARTHQUAKE'
AND CAST(a.EffectiveDate AS DATE) >= CAST('2017-01-01' AS DATE)
AND CAST(a.EffectiveDate AS DATE) <= CAST('2017-12-31' AS DATE)
ORDER BY
a.effectiveDate
The execution plan can be found here:
https://www.brentozar.com/pastetheplan/?id=rJawDkTx-
I ran sp_help and this is what I see:
What exactly is wrong with those columns?
I don't use them in a JOIN or anything. Why such behaviour?
Table Size:
Indexes on table tblQuotes

How to use group by only for some columns in sql Query?

The following query returns 550 records, which I am then grouping by some columns in the controller via linq. However, how can I achieve the "group by" logic in the SQL query itself? Additionally, post-grouping, I need to show only 150 results to the user.
Current SQL query:
SELECT DISTINCT
l.Id AS LoadId
, l.LoadTrackingNumber AS LoadDisplayId
, planningType.Text AS PlanningType
, loadStatus.Id AS StatusId
, loadWorkRequest.Id AS LoadRequestId
, loadStatus.Text AS Status
, routeIds.RouteIdentifier AS RouteName
, planRequest.Id AS PlanId
, originPartyRole.Id AS OriginId
, originParty.Id AS OriginPartyId
, originParty.LegalName AS Origin
, destinationPartyRole.Id AS DestinationId
, destinationParty.Id AS DestinationPartyId
, destinationParty.LegalName AS Destination
, COALESCE(firstSegmentLocation.Window_Start, originLocation.Window_Start) AS StartDate
, COALESCE(firstSegmentLocation.Window_Start, originLocation.Window_Start) AS BeginDate
, destLocation.Window_Finish AS EndDate
AS Number
FROM Domain.Loads (NOLOCK) AS l
INNER JOIN dbo.Lists (NOLOCK) AS loadStatus ON l.LoadStatusId = loadStatus.Id
INNER JOIN Domain.Routes (NOLOCK) AS routeIds ON routeIds.Id = l.RouteId
INNER JOIN Domain.BaseRequests (NOLOCK) AS loadWorkRequest ON loadWorkRequest.LoadId = l.Id
INNER JOIN Domain.BaseRequests (NOLOCK) AS planRequest ON planRequest.Id = loadWorkRequest.ParentWorkRequestId
INNER JOIN Domain.Schedules AS planSchedule ON planSchedule.Id = planRequest.ScheduleId
INNER JOIN Domain.Segments (NOLOCK) os on os.RouteId = routeIds.Id AND os.[Order] = 0
INNER JOIN Domain.LocationDetails (NOLOCK) AS originLocation ON originLocation.Id = os.DestinationId
INNER JOIN dbo.EntityRoles (NOLOCK) AS originPartyRole ON originPartyRole.Id = originLocation.DockRoleId
INNER JOIN dbo.Entities (NOLOCK) AS originParty ON originParty.Id = originPartyRole.PartyId
INNER JOIN Domain.LocationDetails (NOLOCK) AS destLocation ON destLocation.Id = routeIds.DestinationFacilityLocationId
INNER JOIN dbo.EntityRoles (NOLOCK) AS destinationPartyRole ON destinationPartyRole.Id = destLocation.DockRoleId
INNER JOIN dbo.Entities (NOLOCK) AS destinationParty ON destinationParty.Id = destinationPartyRole.PartyId
INNER JOIN dbo.TransportationModes (NOLOCK) lictm on lictm.Id = l.LoadInstanceCarrierModeId
INNER JOIN dbo.EntityRoles (NOLOCK) AS carrierPartyRole ON lictm.CarrierId = carrierPartyRole.Id
INNER JOIN dbo.Entities (NOLOCK) AS carrier ON carrierPartyRole.PartyId = carrier.Id
INNER JOIN dbo.EntityRoles (NOLOCK) AS respPartyRole ON l.ResponsiblePartyId = respPartyRole.Id
INNER JOIN dbo.Entities (NOLOCK) AS respParty ON respPartyRole.PartyId = respParty.Id
INNER JOIN Domain.LoadOrders (NOLOCK) lo ON lo.LoadInstanceId = l.Id
INNER JOIN Domain.Orders (NOLOCK) AS o ON lo.OrderInstanceId = o.Id
INNER JOIN Domain.BaseRequests (NOLOCK) AS loadRequest ON loadRequest.LoadId = l.Id
--Load Start Date
LEFT JOIN Domain.Segments (NOLOCK) AS segment ON segment.RouteId = l.RouteId AND segment.[Order] = 0
LEFT JOIN Domain.LocationDetails (NOLOCK) AS firstSegmentLocation ON firstSegmentLocation.Id = segment.DestinationId
LEFT JOIN dbo.Lists (NOLOCK) AS planningType ON l.PlanningTypeId = planningType.Id
LEFT JOIN dbo.EntityRoles (NOLOCK) AS billToRole ON o.BillToId = billToRole.Id
LEFT JOIN dbo.Entities (NOLOCK) AS billTo ON billToRole.PartyId = billTo.Id
WHERE o.CustomerId in (34236) AND originLocation.Window_Start >= '07/19/2015 00:00:00' AND originLocation.Window_Start < '07/25/2015 23:59:59' AND l.IsHistoricalLoad = 0
AND loadStatus.Id in (285, 286,289,611,290)
AND loadWorkRequest.ParentWorkRequestId IS NOT NULL
AND routeIds.RouteIdentifier IS NOT NULL
AND (planSchedule.EndDate IS NULL OR (planSchedule.EndDate is not null and CAST(CONVERT(varchar(10), planSchedule.EndDate,101) as datetime) > CAST(CONVERT(varchar(10),GETDATE(),101) as datetime))) ORDER BY l.Id DESC
linq:
//Get custom grouped data
var loadRequest = (from lq in returnList
let loadDisplayId = lq.LoadDisplayId
let origin = lq.OriginId //get this origin for route
let destination = lq.DestinationId // get this destination for route
group lq by new
{
RouteId = lq.RouteName,
PlanId = lq.PlanId,
Origin = lq.OriginId,
Destination = lq.DestinationId
}
into grp
select new
{
RouteId = grp.Key.RouteId,
PlanId = grp.Key.PlanId,
Origin = grp.Key.Origin,
Destination = grp.Key.Destination,
Loads = (from l in grp select l)
}).OrderBy(x => x.Origin).ToList();
I'm guessing you want to group by column 1 but include columns 2 and 3 in your SELECT. You cannot do this with GROUP BY alone. However, you can do it with a T-SQL window function, using the OVER clause. Since you don't say how you want to aggregate, I cannot provide an example. But look at T-SQL window functions. This article might help you get started.
One important thing you need to understand about GROUP BY is that you must assume that there are multiple values in every column outside of the GROUP BY list. In your case, you must assume that for each value of Column1 there would be multiple values of Column2 and Column3, all considered as a single group.
If you want your query to process any of these columns, you must specify what to do about these multiple values.
Here are some choices you have:
Pick the smallest or the largest value for a column in a group - use MIN(...) or MAX(...) aggregator for that
Count non-NULL items in a group - use COUNT(...)
Produce an average of non-NULL values in a group - use AVG(...)
For example, if you would like to find the smallest Column2 and an average of Column3 for each value of Column1, your query would look like this:
select
Column1, MIN(Column2), AVG(Column3)
from
TableName
group by
Column1
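The difference between the two approaches above can be sketched on a toy table. This is an illustration only: the table t and its data are made up, and it runs on SQLite via Python's sqlite3 module rather than on SQL Server (window functions need SQLite 3.25 or newer). GROUP BY collapses each group to one row, while an aggregate with OVER keeps every row and attaches the group value to it.

```python
import sqlite3

# Hypothetical table mirroring the Column1/Column2/Column3 example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (column1 TEXT, column2 INTEGER, column3 REAL);
INSERT INTO t VALUES ('a', 5, 10.0), ('a', 3, 20.0), ('b', 7, 4.0);
""")

# GROUP BY: one row per column1, other columns must be aggregated.
grouped = conn.execute("""
    SELECT column1, MIN(column2), AVG(column3)
    FROM t
    GROUP BY column1
    ORDER BY column1
""").fetchall()

# Window function: every row survives, the group average rides along.
windowed = conn.execute("""
    SELECT column1, column2, column3,
           AVG(column3) OVER (PARTITION BY column1) AS avg3
    FROM t
    ORDER BY column1, column2
""").fetchall()

print(grouped)
print(windowed)
```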

Join question - Select must return only one record

I have 4 tables in my SQL Server 2008 database :
CONTACT
CONTACT_DETAILS
PLANS
PLANS_DETAILS
Every record is recorded in CONTACT and CONTACT_DETAILS, but a CONTACT can have 0, 1, 2 or more records in PLANS, Active or Cancelled.
So I did this:
SELECT *
from CONTACT as c
left join PLANS as pp on pp.PKEY = c.PKEY
left join PLANS_DETAILS as pd on pd.PDKEY = pp.PDKEY
inner join CONTACT_DETAILS as cd on cd.DKEY = c.DKEY
WHERE c.KEY = '267110' and PP.STATUS = 'Active'
"267110" has 1 active plan, so it shows me 1 line with everything I need.
But if I put
WHERE c.KEY = '100003' and PP.STATUS = 'Active'
"100003" has 2 cancelled plans, so the result is empty. If I remove PP.STATUS = 'Active', it returns 2 identical rows, but I need just one.
In summary: I need a select that returns exactly 1 row. If there is an active plan, return its columns; if not, return those columns as NULL. If someone has 1 cancelled and 1 active plan, return only the active plan's columns.
The answer to your question is to move the condition on pp to the on clause.
SELECT *
from CONTACT c inner join
CONTACT_DETAILS cd
on cd.DKEY = c.DKEY left join
PLANS pp
on pp.PKEY = c.PKEY AND PP.STATUS = 'Active' left join
PLANS_DETAILS pd
on pd.PDKEY = pp.PDKEY
WHERE c.KEY = '267110' ;
In addition, when you have a series of inner and left joins, I recommend putting all the inner joins first, followed by the outer joins. That makes it clear which joins are used for keeping records and which for filtering.
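The effect of moving the status condition from WHERE into ON can be reproduced with a small sketch. The tables and the ckey column below are simplified, hypothetical stand-ins for CONTACT and PLANS, and the code runs on SQLite via Python's sqlite3 module instead of SQL Server; the LEFT JOIN semantics shown are the same.

```python
import sqlite3

# Hypothetical, simplified stand-ins for CONTACT and PLANS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE contact (ckey TEXT);
CREATE TABLE plans (ckey TEXT, status TEXT);
INSERT INTO contact VALUES ('267110'), ('100003');
INSERT INTO plans VALUES ('267110', 'Active'),
                         ('100003', 'Cancelled'), ('100003', 'Cancelled');
""")

# Condition in WHERE: rows without an active plan are filtered out entirely.
where_rows = conn.execute("""
    SELECT c.ckey, p.status
    FROM contact AS c
    LEFT JOIN plans AS p ON p.ckey = c.ckey
    WHERE c.ckey = '100003' AND p.status = 'Active'
""").fetchall()

# Condition in ON: the contact row is kept, plan columns come back NULL.
on_rows = conn.execute("""
    SELECT c.ckey, p.status
    FROM contact AS c
    LEFT JOIN plans AS p ON p.ckey = c.ckey AND p.status = 'Active'
    WHERE c.ckey = '100003'
""").fetchall()

print(where_rows)  # empty list
print(on_rows)     # one row with a NULL (None) status
```

Note that this still returns several rows for a contact with more than one active plan; the sketch only covers the cases described in the question.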
Just add an ORDER BY PP.STATUS ASC and a TOP 1 clause, and delete the and PP.STATUS = 'Active', like this. Ascending order sorts 'Active' before 'Cancelled', so an active plan is picked first when one exists.
SELECT TOP 1 * from CONTACT as c
left join PLANS as pp on pp.PKEY = c.PKEY
left join PLANS_DETAILS as pd on pd.PDKEY = pp.PDKEY
inner join CONTACT_DETAILS as cd on cd.DKEY = c.DKEY
WHERE c.KEY = '100003'
ORDER BY PP.STATUS ASC
SELECT *
from CONTACT c
left join
(
select * from
(select
[whatever you need from this table]
,row_number() over(partition by [keys in this table] order by status asc) rnk
from PLANS) ranked
where rnk = 1
) pp
on pp.PKEY = c.PKEY
left join PLANS_DETAILS pd
on pd.PDKEY = pp.PDKEY
inner join CONTACT_DETAILS cd
on cd.DKEY = c.DKEY
WHERE c.KEY = '100003'

SQL Server, insert value to variable and sort

I need to sort the results of a query after assigning a value to a variable.
I am trying to sort by 'RowId', but it's not valid in my case.
Below is my query, how can I make it work?
Thanks.
SELECT TOP 1 @NumOfProducts = ROW_NUMBER() OVER(ORDER BY Products.Id) AS RowId
FROM Cities INNER JOIN
CitiesInLanguages ON Cities.Id = CitiesInLanguages.CityId INNER JOIN
ShopsInCities ON Cities.Id = ShopsInCities.CityId INNER JOIN
Categories INNER JOIN
ProductstInCategories ON Categories.Id = ProductstInCategories.CategoryId INNER JOIN
Products ON ProductstInCategories.ProductId = Products.Id INNER JOIN
ProductsInProdutGroup ON Products.Id = ProductsInProdutGroup.ProductId INNER JOIN
ProductsGroups ON ProductsInProdutGroup.ProductGroupId = ProductsGroups.Id INNER JOIN
ShopsInProductsGroup ON ProductsGroups.Id = ShopsInProductsGroup.ProductGroupId INNER JOIN
aspnet_Users ON ShopsInProductsGroup.ShopId = aspnet_Users.UserId ON ShopsInCities.ShopId = aspnet_Users.UserId INNER JOIN
ProductsNamesInLanguages ON Products.Id = ProductsNamesInLanguages.ProductId INNER JOIN
UsersInfo ON aspnet_Users.UserId = UsersInfo.UserId INNER JOIN
ProductOptions ON Products.Id = ProductOptions.ProductId INNER JOIN
ProductOptionsInLanguages ON ProductOptions.Id = ProductOptionsInLanguages.ProductOptionId INNER JOIN
ProductFiles ON Products.Id = ProductFiles.ProductId INNER JOIN
ProductsInOccasions ON Products.Id = ProductsInOccasions.ProductId INNER JOIN
Occasions ON ProductsInOccasions.OccasionId = Occasions.Id INNER JOIN
OccasionsInLanguages ON Occasions.Id = OccasionsInLanguages.OccasionId
WHERE (Products.IsAddition = 0) AND (Categories.IsEnable = 1) AND (Products.IsEnable = 1) AND (ProductsGroups.IsEnable = 1) AND (Cities.IsEnable = 1) AND
(ShopsInProductsGroup.IsEnable = 1) AND (CitiesInLanguages.CityName = @CityName) AND (ProductsNamesInLanguages.LanguageId = @languageId) AND
(Categories.Id = @CategoryId) AND (ProductOptions.IsEnable = 1) AND (ProductFiles.IsEnable = 1)
group by Products.Id, ProductsNamesInLanguages.ProductName, UsersInfo.Name
Order By RowId
With your edit, try this:
SELECT TOP 1 @NumOfProducts = ROW_NUMBER() OVER(ORDER BY Products.Id),
ROW_NUMBER() OVER(ORDER BY Products.Id) AS RowId
or try
ORDER BY ROW_NUMBER() OVER(ORDER BY Products.Id)
I'd have to test, but I think both will work.
The problem is that rowid is not in any of the group by items.
You could order by Products.id. If rowid is going to be the same for each one you could order by max(rowid) or min(rowid) or add rowid to the group by statement.
Are you trying to find the ID of the most recently inserted row? You want
SELECT Scope_Identity()
Edit
*I am trying to get the max row id of ROW_NUMBER()*
Wrap your query in
SELECT @NumOfProducts = Max(RowID) FROM
( [your query here] ) v
Alternatively, a SELECT COUNT... query may provide the answer.
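A small sketch of the two suggestions above, wrapping the numbered query and taking MAX, versus simply counting. The products table is a hypothetical stand-in for the original joins, and the code runs on SQLite via Python's sqlite3 module instead of SQL Server (window functions need SQLite 3.25 or newer). In the original query the count would have to be taken over the grouped result, not over the raw table.

```python
import sqlite3

# Hypothetical stand-in for the joined product query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (id INTEGER, name TEXT);
INSERT INTO products VALUES (11, 'x'), (12, 'y'), (15, 'z');
""")

# Wrap the numbered query in a derived table and take the largest number.
(max_row_id,) = conn.execute("""
    SELECT MAX(row_id) FROM (
        SELECT ROW_NUMBER() OVER (ORDER BY id) AS row_id
        FROM products
    ) AS v
""").fetchone()

# ROW_NUMBER() runs 1..N without gaps, so its maximum equals the row count.
(row_count,) = conn.execute("SELECT COUNT(*) FROM products").fetchone()

print(max_row_id, row_count)
```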

Left Join, Order by, MySQL Optimization

I have a query like this:
SELECT m...., a...., r....
FROM 0_member AS m
LEFT JOIN 0_area AS a ON a.user_id = (SELECT user_id
FROM `0_area`
WHERE user_id = m.id
ORDER BY sec_id ASC LIMIT 1)
LEFT JOIN 0_rank as r ON a.rank_id = r.id
WHERE m.login_userid = '$username'
The idea is to get the first row from 0_area table and hence the attempted inner join. However, it is not working as expected.
Between 0_area and 0_member, 0_member.id = 0_area.user_id. However, there are multiple rows of 0_area.user_id and I want the row having the lowest value of sec_id.
Any help please?
SELECT m...., a...., r....
FROM 0_member AS m
LEFT JOIN (SELECT user_id, min(sec_id) minsec
FROM `0_area`
GROUP BY user_id) g1 on g1.user_id=m.id
LEFT JOIN 0_area AS a ON a.user_id = g1.user_id and a.sec_id=minsec
LEFT JOIN 0_rank as r ON a.rank_id = r.id
WHERE m.login_userid = '$username'
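The query above first finds the lowest sec_id per user in a grouped subquery, then joins back to the area table on both user_id and that minimum. A minimal sketch of the same pattern follows; the member and area tables and their data are hypothetical, and it runs on SQLite via Python's sqlite3 module instead of MySQL.

```python
import sqlite3

# Hypothetical stand-ins for 0_member and 0_area.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE member (id INTEGER, login TEXT);
CREATE TABLE area (user_id INTEGER, sec_id INTEGER, rank_id INTEGER);
INSERT INTO member VALUES (1, 'alice'), (2, 'bob');
INSERT INTO area VALUES (1, 30, 7), (1, 20, 8), (1, 40, 9);
""")

# g1 holds the lowest sec_id per user; joining back on it picks that one row.
rows = conn.execute("""
    SELECT m.login, a.sec_id, a.rank_id
    FROM member AS m
    LEFT JOIN (SELECT user_id, MIN(sec_id) AS minsec
               FROM area
               GROUP BY user_id) AS g1 ON g1.user_id = m.id
    LEFT JOIN area AS a ON a.user_id = g1.user_id AND a.sec_id = g1.minsec
    ORDER BY m.id
""").fetchall()

print(rows)
```

Because both joins are LEFT JOINs, a member with no area rows (bob here) is still returned, with NULL area columns.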