Why left join is not giving distinct result? - sql

I have following sql query and my left join is not giving me distinct result please help me to trace out.
SELECT DISTINCT
Position.Date,
Position.SecurityId,
Position.PurchaseLotId,
Position.InPosition,
ISNULL(ClosingPrice.Bid, Position.Mark) AS Mark
FROM
Fireball_Reporting.dbo.Reporting_DailyNAV_Pricing POSITION WITH (NOLOCK, READUNCOMMITTED)
LEFT JOIN Fireball.dbo.AdditionalSecurityPrice ClosingPrice WITH (NOLOCK, READUNCOMMITTED) ON
ClosingPrice.SecurityID = Position.PricingSecurityID AND
ClosingPrice.Date = Position.Date AND
ClosingPrice.SecurityPriceSourceID = #SourceID AND
ClosingPrice.PortfolioID IN (5,6)
WHERE
DatePurchased > #NewPositionDate AND
Position.Date = #CurrentPositionDate AND
InPosition = 1 AND
Position.PortfolioId IN (
SELECT
PARAM
FROM
Fireball_Reporting.dbo.ParseMultiValuedParameter(#PortfolioId, ',')
) AND
(
Position > 1 OR
Position < - 1
)
Now here in above my when I use LEFT JOIN ISNULL(ClosingPrice.Bid, Position.Mark) AS Mark and LEFT JOIN it is giving me more no of records with mutiple portfolio ids
for e.g . (5,6)
If i put portfolioID =5 giving result as 120 records
If i put portfolioID =6 giving result as 20 records
When I put portfolioID = (5,6) it should give me 140 records
but it is giving result as 350 records which is wrong . :(
It is happening because when I use LEFT JOIN there is no condition of PurchaseLotID in that as table Fireball.dbo.AdditionalSecurityPrice ClosingPrice not having column PurchaseLotID so it is giving me other records also whoes having same purchaseLotID's with diferent prices .
But I dont want that records
How can I eliminate those records ?

You get one Entry per DailyLoanAndCashPosition.PurchaseLotId = NAVImpact.PurchaseLotId
which would mean you must have more entrys in with the same PurchaseLotId

The most likely cause is that the left join produces duplicated PurchaseLotIds. The best way to know if if you perform a select distinct(PurchaseLotId) on your left side of the inner join.

Related

SQL Inner Join and nearest row to date

I dont't get it. I changed some of the code. In the WPLEVENT Table are a lot of Events per person. In the Persab-Table are the Persons with their History. Now I need the from the Persab Table just that row wich matches the persab.gltab Date nearest to the WPLEVENT.vdat Date. So all rows from the WPLEVENT, but just the one matching row from the PERSAB-Table.
SELECT
persab.name,
persab.vorname,
vdat,
eventstart,
persab.rc1,
persab.rc2
FROM wplevent
INNER JOIN
persab ON WPLEVENT.PersID = persab.PRIMKEY
INNER JOIN
(SELECT TOP 1 persab.rc1
FROM PERSAB
WHERE persab.gltab <= getdate() --/ Should be wplevent.vdat instead of getdate()
) NewTable ON wplevent.persid = persab.primkey
WHERE
persid ='100458'
ORDER BY vdat DESC
Need to use the MAX() function with the proper syntax by supplying an expression like MAX(persab.rc1). Also need to use GROUP BY for the second column rc2 in the subquery (although it looks like you do not need it). Finally you are missing the ON clause for the final INNER JOIN. I can update the answer to fix the query if you provide that information.
SELECT
Z1PERS.NAME
, Z1PERS.VORNAME
, WPLEVENT.VDat
, WPLEVENT.EventStart
, WPLEVENT.EventStop
, WPLEVENT.PEPGROUP
, Z1SGRP.TXXT
, PERSAB.GLTAB
, Z1PERS.PRIMKEY AS Expr1
, PERSAB.PRIMKEY
FROM
Z1PERS
INNER JOIN
WPLEVENT ON Z1PERS.PRIMKEY = WPLEVENT.PersID
INNER JOIN
Z1SGRP ON WPLEVENT.PEPGROUP = Z1SGRP.GRUPPE
INNER JOIN
(
SELECT MAX(Persab.rc1) --Fixed MAX expression
, persab.rc2
FROM
persab
GROUP BY
persab.rc2 --Need to group on rc2 if you want that column in the query otherwise remove this AND the rc2 column from select list
WHERE
WPLEVENT.PersID = PERSAB.PRIMKEY
AND WPLEVENT.VDat <= PERSAB.GLTAB
) --Missing ON clause for the INNER JOIN here
WHERE z1pers.vorname = 'henning'

Tricky (MS)SQL query with aggregated functions

I have these three tables:
table_things: [id]
table_location: [id]
[location]
[quantity]
table_reservation: [id]
[quantity]
[location]
[list_id]
Example data:
table_things:
id
1
2
3
table_location
id location quantity
1 100 10
1 101 4
2 100 1
table_reservation
id quantity location list_id
1 2 100 500
1 1 100 0
2 1 100 0
They are connected by [id] being the same in all three tables and [location] being the same in table_loation and table_reservation.
[quantity] in table_location shows how many ([quantity]) things ([id]) are in a certain place ([location]).
[quantity] in table_reservation shows how many ([quantity]) things ([id]) are reserved in a certain place ([location]).
There can be 0 or many rows in table_reservation that correspond to table_location.id = table_reservation_id, so I probably need to use an outer join for that.
I want to create a query that answers the question: How many things ([id]) are in this specific place (WHERE table_location=123), how many of of those things are reserved (table_reservation.[quantity]) and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
I can't get the aggregate functions right to where the answer contains only the number of lines that are in table_location with the given WHERE clause and at the same time I get the correct number of table_reservation.quantity.
If I do this I get the correct number of lines in the answer:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity]
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_things.location = table_reservation.location
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location[quantity]
But the problem with that query is that I (of course) get an incorrect value for SUM(table_reservation.[quantity]) since it sums up all the corresponding rows in table_reservation and posts the same value on each of the rows in the result.
The second part is trying to get the correct value for the number of table_reservation.[quantity] whose list_id > 0. I tried something like this for that, in the SELECT list:
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN table_reservation.[quantity] ELSE 0 END)) AS test
But that doesn't even parse... I'm just showing it to show my thinking.
Probably an easy SQL problem, but it's been too long since I was doing these kinds of complicated queries.
For your first two questions:
How many things ([id]) are in this specific place (WHERE table_location=123), how many of of those things are reserved (table_reservation.[quantity])
I think you simply need a LEFT OUTER JOIN instead of RIGHT, and an additional join predicate for table_reservation
SELECT l.id,
l.quantity,
Reserved = SUM(ISNULL(r.quantity, 0))
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
N.B I have added ISNULL so that when nothing is reserved you get a result of 0 rather than NULL. You also don't actually need to reference table_things at all, but I am guessing this is a simplified example and you may need other fields from there so have left it in. I have also used aliases to make the query (in my opinion) easier to read.
For your 3rd question:
and how many of those that are reserved are on a table_reservation.list_id where table_reservation.list_id > 0.
Then you can use a conditional aggregate (CASE expression inside your SUM):
SELECT l.id,
l.quantity,
Reserved = SUM(r.quantity),
ReservedWithListOver0 = SUM(CASE WHEN r.list_id > 0 THEN r.[quantity] ELSE 0 END)
FROM table_location AS l
INNER JOIN table_things AS t
ON t.id = l.ID
LEFT JOIN table_reservation r
ON r.id = t.id
AND r.location = l.location
WHERE l.location = 100
GROUP BY l.id, l.quantity;
As a couple of side notes, unless you are doing it for the right reasons (so that different tables are queried depending on who is executing the query), then it is a good idea to always use the schema prefix, i.e. dbo.table_reservation rather than just table_reservation. It is also quite antiquated to prefix your object names with the object type (i.e. dbo.table_things rather than just dbo.things). It is somewhat subject, but this page gives a good example of why it might not be the best idea.
You can use a query like the following:
SELECT tt.[id],
tl.[quantity],
tr.[total_quantity],
tr.[partial_quantity]
FROM table_location AS tl
INNER JOIN table_things AS tt ON tl.[id] = tt.[id]
LEFT JOIN (
SELECT id, location,
SUM(quantity) AS total_quantity,
SUM(CASE WHEN list_id > 0 THEN quantity ELSE 0 END) AS partial_quantity
FROM table_reservation
GROUP BY id, location
) AS tr ON tl.id = tr.id AND tl.location = tr.location
WHERE tl.location = 100
The trick here is to do a LEFT JOIN to an already aggregated version of table table_reservation, so that you get one row per id, location. The derived table uses conditional aggregation to calculate field partial_quantity that contains the quantity where list_id > 0.
Output:
id quantity total_quantity partial_quantity
-----------------------------------------------
1 10 3 2
2 1 1 0
This was a classic case of sitting with a problem for a few hours and getting nowhere and then when you post to stackoverflow, you suddenly come up with the answer. Here's the query that gets me what I want:
SELECT table_things.[id],
table_location.[quantity],
SUM(table_reservation.[quantity],
(SELECT SUM(CASE WHEN table_reservation.list_id > 0 THEN ISNULL(table_reservation.[quantity], 0) ELSE 0 END)) AS test
FROM table_location
INNER JOIN table_things ON table_location.[id] = table_things.[id]
RIGHT OUTER JOIN table_reservation ON table_things.location = table_reservation.location AND table_things.[id] = table_reservation.[id]
WHERE table_location.location = 100
GROUP BY table_things.[id], table_location[quantity]
Edit: After having read GarethD's reply below, I did the changes he suggested (to my real code, not to the query above) which makes the (real) query correct.

Display Y/N column if record found in detail table

I'm trying to create a query so that I can have a column show Y/N if a particular item was ordered for a group of orders. The item I'm looking for would be OLI.id = '538'.
So my results would be:
Order#, Customer#, FreightPaid
12345, 00112233, Y
12346, 00112233, N
I cannot figure out if I need to use a subquery or the where exists function ?
Here's my current query:
SELECT distinct
OrderID,
Accountuid as Customerno
FROM [SMILEWEB_live].[dbo].[OrderLog] OL
inner join Orderlog_item OLI on OLI.orderlogkey = OL.[key]
inner join Account A on A.uid = OL.Accountuid
where A.GroupId = 'X9955'
and OL.CreateDate >= GETDATE() - 60
I would suggest an exists clause instead of a join:
select ol.OrderID, ol.Accountuid as Customerno,
(case when exists (select 1
from Orderlog_item OLI join
Account A
on A.uid = OL.Accountuid
where OLI.orderlogkey = OL.[key] and A.GroupId = 'X9955'
)
then 1 else 0
end) as flag
from [SMILEWEB_live].[dbo].[OrderLog] OL
where OL.CreateDate >= GETDATE() - 60;
This prevents a couple of problems. First, duplicate rows which are caused when there are multiple matching rows (and select distinct add unnecessary overhead). Second, missing rows, which happen when you use inner join instead of an outer join.

How can I join on multiple columns within the same table that contain the same type of info?

I am currently joining two tables based on Claim_Number and Customer_Number.
SELECT
A.*,
B.*,
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbp.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This returns exactly the data I'm looking for. The problem is that I now need to join to another table to get information based on a Product_ID. This would be easy if there was only one Product_ID in the Compound_Info table for each record. However, there are 10. So basically I need to SELECT 10 additional columns for Product_Name based on each of those Product_ID's that are being selected already. How can do that? This is what I was thinking in my head, but is not working right.
SELECT
A.*,
B.*,
PD_Info_1.Product_Name,
PD_Info_2.Product_Name,
....etc {Up to 10 Product Names}
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B ON A.Claim_Number = B.Claim_Number AND A.Customer_Number = B.Customer_Number
LEFT JOIN Company.dbo.Product_Info AS PD_Info_1 ON B.Product_ID_1 = PD_Info_1.Product_ID
LEFT JOIN Company.dbo.Product_Info AS PD_Info_2 ON B.Product_ID_2 = PD_Info_2.Product_ID
.... {Up to 10 LEFT JOIN's}
WHERE A.Filled_YearMonth = '201312' AND A.Compound_Ind = 'Y'
This query not only doesn't return the correct results, it also takes forever to run. My actual SQL is a lot longer and I've changed table names, etc but I hope that you can get the idea. If it matters, I will be creating a view based on this query.
Please advise on how to select multiple columns from the same table correctly and efficiently. Thanks!
I found put my extra stuff into CTE and add ROW_NUMBER to insure that I get only 1 row that I care about. it would look something like this. I only did for first 2 product info.
WITH PD_Info
AS ( SELECT Product_ID
,Product_Name
,Effective_Date
,ROW_NUMBER() OVER ( PARTITION BY Product_ID, Product_Name ORDER BY Effective_Date DESC ) AS RowNum
FROM Company.dbo.Product_Info)
SELECT A.*
,B.*
,PD_Info_1.Product_Name
,PD_Info_2.Product_Name
FROM Company.dbo.Company_Master AS A
LEFT JOIN Company.dbo.Compound_Info AS B
ON A.Claim_Number = B.Claim_Number
AND A.Customer_Number = B.Customer_Number
LEFT JOIN PD_Info AS PD_Info_1
ON B.Product_ID_1 = PD_Info_1.Product_ID
AND B.Fill_Date >= PD_Info_1.Effective_Date
AND PD_Info_2.RowNum = 1
LEFT JOIN PD_Info AS PD_Info_2
ON B.Product_ID_2 = PD_Info_2.Product_ID
AND B.Fill_Date >= PD_Info_2.Effective_Date
AND PD_Info_2.RowNum = 1

Join Table on one record but calculate field based on other rows in the join

I am trying to write a query to Identify my subscribers who have abandoned a shopping cart in the last day but also I need a calculated field that represents weather or not they have received and incentive in the last 7 days.
I have the following tables
AbandonCart_Subscribers
Sendlog
The first part of the query is easy, get abandoners in the last day
select a.* from AbandonCart_Subscribers
where DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Here is my attempt to calculate the incentive but I am fairly certain it is not correct as IncentiveRecieved is always 0 even when I know it should not be...
select a.*,
CASE
WHEN DATEDIFF(D,s.SENDDATE,GETDATE()) >= 7
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Here is a SQL fiddle with the objects and some data. I would really appreciate some help.
Thanks
http://sqlfiddle.com/#!3/f481f/1
Kishore is right in saying the main problem is that it should be <=7, not >=7. However, there is another problem.
As it stands, you could get multiple results. You don't want to do a left join on SendLog in case the same email address is in there more than once. Instead you should be getting a unique result from that table. There's a couple of ways of doing that; here is one such way which uses a derived table. The table I have called s will give you a unique list of emails that have been sent an incentive in the last week.
select a.*,
CASE
WHEN s.EmailAddress is not null
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join (select distinct EmailAddress
from SendLog s
where s.CampaignID IS NULL
and DATEDIFF(D,s.SENDDATE,GETDATE()) <= 7
) s on a.EmailAddress = s.EmailAddress
where DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
You can set a variable using a condition:
select a.*,(DATEDIFF(D,s.SENDDATE,GETDATE()) >= 7) as `IncentiveRecieved `
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Is this what you're looking for?
should it not be less than 7 instead of greater than 7?
select a.*,
CASE
WHEN DATEDIFF(D,s.SENDDATE,GETDATE()) <= 7 AND CampaignID is not null
THEN 1
ELSE 0
END As IncentiveRecieved
from AbandonCart_Subscribers a
left join SendLog s on a.EmailAddress = s.EmailAddress --and s.CampaignID IS NULL
where
DATEDIFF(day,a.DateAbandoned,GETDATE()) <= 1
Hope this satisfies your need.