Getting Duplicate Values When Joining the Same Table Twice - sql

Multiple joins to the same table using different criteria.
I'm trying to get a value from a table but have different criteria. There is a column that has the value I'm trying to retrieve. There are two sets of criteria.
The issue is with the Else line. The logic is, if the conditions in the first join are true, then get the c_wRVUAmt value. If there is no match, then use the conditions from the second join. If there is no match there, then use 0.
I'm getting duplicate records, which I understand. I just don't understand how to write the joins or the query to eliminate duplicate rows.
select a.[Revenue Id]
,a.CPT
,a.[Procedure Mod]
,dd.MemberId
,ee.MemberId
,dd.c_HCPCS
,ee.c_HCPCS
,dd.c_MOD
,ee.c_MOD
,CASE When a.[GL Company Unit] IN ('6500','6600','6700') and a.[Rev Code] = '0320'
then 0
When RTRIM(a.[BE Name]) <> 'Hospital'
Then 0
Else ISNULL(dd.c_wRVUAmt * a.[Total Qty],0)+ISNULL(ee.c_wRVUAmt * a.[Total Qty],0)
end as WorkRVUAmt
from GP_CUSTOMS..Revenue_Staging a
Left Outer Join d_Dim22 dd on a.[CPT] = dd.c_HCPCS
and a.[Procedure Mod] = dd.c_MOD
and a.[Procedure Mod] in ('26','53')
Left Outer Join d_Dim22 ee on a.[CPT] = ee.c_HCPCS
and a.[Procedure Mod] NOT IN ('26','53')
I'm looking for one row.

There are many ways to remove duplicates.
One of these is to select data with a subselect.
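Try wrapping your select in an outer select with DISTINCT:
SELECT DISTINCT
[Revenue Id]
,CPT
,[Procedure Mod]
,MemberId1
,MemberId2
,c_HCPCS1
,c_HCPCS2
,c_MOD1
,c_MOD2
,WorkRVUAmt
FROM (<YOUR SELECT HERE>) AS q
Remember to give unique aliases in the inner select to the columns that appear twice (e.g. MemberId1 and MemberId2, c_HCPCS1 and c_HCPCS2, c_MOD1 and c_MOD2). A derived table cannot expose two columns with the same name, and SQL Server also requires the derived table itself to have an alias (q here).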

With the left joins written as they are in your query, there is no way to stop the extra rows from coming out. You can either write a correlated subquery that returns a single row from the table on the right side of each join, or collapse the result with DISTINCT in the select, which I suspect is not what you want.
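The correlated-subquery idea could look roughly like the sketch below. It is untested against the real data and rests on two assumptions: that the extra rows come from the second join matching every d_Dim22 row for a CPT, and that MemberId is an acceptable tie-breaker when more than one row qualifies (any column that picks the row you want will do). Each OUTER APPLY returns at most one row, so the amount is taken from the first criteria if they match, otherwise from the second, otherwise 0.

```sql
-- Sketch only: assumes SQL Server, and that MemberId is a usable tie-breaker.
SELECT a.[Revenue Id]
      ,a.CPT
      ,a.[Procedure Mod]
      ,CASE WHEN a.[GL Company Unit] IN ('6500','6600','6700') AND a.[Rev Code] = '0320'
            THEN 0
            WHEN RTRIM(a.[BE Name]) <> 'Hospital'
            THEN 0
            -- exact-modifier amount first, then the fallback amount, then 0
            ELSE COALESCE(a.[Total Qty] * COALESCE(m.c_wRVUAmt, f.c_wRVUAmt), 0)
       END AS WorkRVUAmt
FROM GP_CUSTOMS..Revenue_Staging a
OUTER APPLY (
    -- first criteria: CPT and modifier both match, modifier is 26 or 53
    SELECT TOP 1 d.c_wRVUAmt
    FROM d_Dim22 d
    WHERE d.c_HCPCS = a.CPT
      AND d.c_MOD = a.[Procedure Mod]
      AND a.[Procedure Mod] IN ('26','53')
    ORDER BY d.MemberId
) m
OUTER APPLY (
    -- second criteria: CPT matches, modifier is not 26 or 53
    SELECT TOP 1 d.c_wRVUAmt
    FROM d_Dim22 d
    WHERE d.c_HCPCS = a.CPT
      AND a.[Procedure Mod] NOT IN ('26','53')
    ORDER BY d.MemberId
) f
```

Because neither APPLY can return more than one row, the result has exactly one row per Revenue_Staging row.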

Related

Query with Left outer join and group by returning duplicates

To begin with, I have a table in my db that is fed with SalesForce info. When I run this example query it returns 2 rows:
select * from SalesForce_INT_Account__c where ID_SAP_BAYER__c = '3783513'
When I run this next query on the same table I obtain one of the rows, which is what I need:
SELECT MAX(ID_SAP_BAYER__c) FROM SalesForce_INT_Account__c where ID_SAP_BAYER__c = '3783513' GROUP BY ID_SAP_BAYER__c
Now, I have another table (PedidosEspecialesZarateCabeceras) which has a field (NroClienteDireccionEntrega) that I can match with the field I've been using in the SalesForce table (ID_SAP_BAYER__c). This table has a key that consists of just 1 field (NroPedido).
What I need to do is join these 2 tables to obtain a row from PedidosEspecialesZarateCabeceras with additional fields coming from the SalesForce table, and in case those additional fields are not available, they should come as NULL values, so for that I'm using a LEFT OUTER JOIN.
The problem is, since I have to match NroClienteDireccionEntrega and ID_SAP_BAYER__c and there are 2 rows in the SalesForce table with the same ID_SAP_BAYER__c, my query returns 2 duplicate rows from PedidosEspecialesZarateCabeceras (they both have the same NroPedido).
This is an example query that returns duplicates:
SELECT
cab.CUIT AS CUIT,
convert(nvarchar(4000), cab.NroPedido) AS NroPedido,
sales.BillingCity__c as Localidad,
sales.BillingState__c as IdProvincia,
sales.BillingState__c_Desc as Provincia,
sales.BillingStreet__c as Calle,
sales.Billing_Department__c as Distrito,
sales.Name as RazonSocial,
cab.NroCliente as ClienteId
FROM PedidosEspecialesZarateCabeceras AS cab WITH (NOLOCK)
LEFT OUTER JOIN
SalesForce_INT_Account__c AS sales WITH (NOLOCK) ON
cab.NroClienteDireccionEntrega = sales.ID_SAP_BAYER__c
and sales.ID_SAP_BAYER__c in
( SELECT MAX(ID_SAP_BAYER__c)
FROM SalesForce_INT_Account__c
GROUP BY ID_SAP_BAYER__c
)
WHERE cab.NroPedido ='5320'
Even though the join has MAX and Group By, this returns 2 duplicate rows with different SalesForce information (Because of the 2 salesforce rows with the same ID_SAP_BAYER__c), which should not be possible.
What I need is for the left outer join in my query to pick only ONE of the SalesForce rows to prevent duplication like it's happening right now. For some reason the select max with the group by is not working.
Maybe I should try to join these tables in a different way. Can anyone give me some other ideas on how to join the two tables to return just 1 row? It doesn't matter if the SalesForce row that gets picked out of the 2 isn't the correct one, I just need it to pick one of them.
Your IN clause is not actually doing anything, since...
SELECT MAX(ID_SAP_BAYER__c)
FROM SalesForce_INT_Account__c
GROUP BY ID_SAP_BAYER__c
... returns all possible ID_SAP_BAYER__c values. (The GROUP BY says you want to return one row per unique ID_SAP_BAYER__c and then, since your MAX is operating on exactly one unique value per group, you simply return that value.)
You will want to change your query to operate on a value that is actually different between the two rows you are trying to differentiate (probably the MAX(ID) for the relevant ID_SAP_BAYER__c). Plus, you will want to link that inner query to your outer query.
You could probably do something like:
...
LEFT OUTER JOIN
SalesForce_INT_Account__c sales
ON cab.NroClienteDireccionEntrega = sales.ID_SAP_BAYER__c
and sales.ID in
(
SELECT MAX(ID)
FROM SalesForce_INT_Account__c sales2
WHERE sales2.ID_SAP_BAYER__c = cab.NroClienteDireccionEntrega
)
WHERE cab.NroPedido ='5320'
By using sales.ID in ... SELECT MAX(ID) ... instead of sales.ID_SAP_BAYER__c in ... SELECT MAX(ID_SAP_BAYER__c) ... this ensures you only match one of the two rows for that ID_SAP_BAYER__c. The WHERE sales2.ID_SAP_BAYER__c = cab.NroClienteDireccionEntrega condition links the inner query to the outer query.
There are multiple ways of doing the above, especially if you don't care which of the relevant rows you match on. You can use the above as a starting point and make it match your preferred style.
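One of those other ways, sketched here under the same assumption the query above makes (that SalesForce_INT_Account__c has an ID column that differs between the two rows), is to number the rows per ID_SAP_BAYER__c and join only to row 1. The column list is shortened for illustration.

```sql
-- Sketch only: assumes SQL Server and a distinguishing ID column.
SELECT cab.CUIT
      ,cab.NroPedido
      ,sales.Name AS RazonSocial
FROM PedidosEspecialesZarateCabeceras AS cab
LEFT OUTER JOIN (
    -- rn = 1 marks exactly one row per ID_SAP_BAYER__c (the one with the highest ID)
    SELECT s.*
          ,ROW_NUMBER() OVER (PARTITION BY s.ID_SAP_BAYER__c ORDER BY s.ID DESC) AS rn
    FROM SalesForce_INT_Account__c AS s
) AS sales
    ON sales.ID_SAP_BAYER__c = cab.NroClienteDireccionEntrega
   AND sales.rn = 1
WHERE cab.NroPedido = '5320'
```

Keeping the rn = 1 test in the ON clause rather than the WHERE clause preserves the left join: an order with no SalesForce match still comes back, with NULLs.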
An alternative might be to use OUTER APPLY with TOP 1. Something like:
SELECT
...
FROM PedidosEspecialesZarateCabeceras AS cab
OUTER APPLY(
SELECT TOP 1 *
FROM SalesForce_INT_Account__c s1
WHERE cab.NroClienteDireccionEntrega = s1.ID_SAP_BAYER__c
) sales
WHERE cab.NroPedido ='5320'
Without an ORDER BY the match that TOP 1 chooses will be arbitrary, but I think that's what you want anyway. (If not, you could add an ORDER BY).
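If a repeatable choice matters, the ORDER BY goes inside the applied subquery. A minimal sketch, assuming an ID column exists on the SalesForce table (it is not shown in the question) and with the column list shortened:

```sql
-- Sketch only: ID is an assumed column; order by whatever defines "the row you want".
SELECT cab.CUIT
      ,cab.NroPedido
      ,sales.Name AS RazonSocial
FROM PedidosEspecialesZarateCabeceras AS cab
OUTER APPLY (
    SELECT TOP 1 s1.*
    FROM SalesForce_INT_Account__c AS s1
    WHERE s1.ID_SAP_BAYER__c = cab.NroClienteDireccionEntrega
    ORDER BY s1.ID DESC   -- makes the TOP 1 pick deterministic
) AS sales
WHERE cab.NroPedido = '5320'
```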

Multiple rows from Left Join in SQL where rows are uniquely matched

I have two views that I am trying to join. I am joining on three elements, date, case number and surgeon id number. Each should only have one match for the previous case out value, but I am getting multiple rows after my left join.
Here is my code:
CREATE VIEW [dbo].[OR]
AS
SELECT DISTINCT
[ID].*,
[BYSURG].[PREV_PAT_OUT] AS PrevPtOut
FROM
[dbo].[OR_LOG_INDEXED] [ID]
LEFT JOIN
[DBO].[OR_CASE_NUM] BYSURG ON [ID].[SURG_DT] = [BYSURG].[SURG_DT]
AND [ID].[SURGEON_ID] = [BYSURG].[SURGEON_ID]
AND [ID].[CASE_NUM_BY_ROOM] = [BYSURG].[CASE_NUM_BY_ROOM_ADJ]
Any insights are much appreciated.
Thanks!
M
Replace your select block with one that retrieves all columns:
SELECT
*
FROM
[dbo].[OR_LOG_INDEXED] [ID]
LEFT JOIN
[DBO].[OR_CASE_NUM] BYSURG ON [ID].[SURG_DT] = [BYSURG].[SURG_DT]
AND [ID].[SURGEON_ID] = [BYSURG].[SURGEON_ID]
AND [ID].[CASE_NUM_BY_ROOM] = [BYSURG].[CASE_NUM_BY_ROOM_ADJ]
Run it and look at your "duplicate" rows. Something about them will no longer be a duplicate; perhaps you've forgotten to include some other criterion in your join conditions or where clause.
Putting DISTINCT in the select block is not the answer. Find out which data element differs between the "duplicate" rows and then filter out the rows you don't want.
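A quick way to find the offending rows is to count how many rows the right-hand view has per join key. This sketch uses only the view and column names from the question; any key it returns matches more than once and will multiply rows in the left join.

```sql
-- Lists every join key that occurs more than once in OR_CASE_NUM.
SELECT [SURG_DT]
      ,[SURGEON_ID]
      ,[CASE_NUM_BY_ROOM_ADJ]
      ,COUNT(*) AS MatchCount
FROM [dbo].[OR_CASE_NUM]
GROUP BY [SURG_DT]
        ,[SURGEON_ID]
        ,[CASE_NUM_BY_ROOM_ADJ]
HAVING COUNT(*) > 1
```

If this returns nothing, the right side is unique on the key and the repeats must come from OR_LOG_INDEXED itself.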

Compare value of one field in one table to the total sum of one column in another table

I'm having trouble with executing a query to compare where the value of a column in one table is not equal to the sum of another column in a different table. Below is the query I have been trying to execute:
select id.invoice_no,sum(id.bank_charges),
from db2apps.invoice_d id
inner join db2apps.invoice_h ih on (id.invoice_no = ih.invoice_no)
group by id.invoice_no
having coalesce(sum(id.bank_charges), 0) != ih.tax_value
with ur;
I tried with joining on the tables, the group by having format, etc and have had no luck. I really want to select id.invoice_no, ih.tax_value, and sum(id.bank_charges) in the result set, and also grab the data where the sum(id.bank_charges) is not equal to the value of ih.tax_value. Any help would be appreciated.
Perhaps this solves your problem:
select ih.invoice_no, ih.tax_value, sum(id.bank_charges)
from db2apps.invoice_h ih left join
db2apps.invoice_d id
on id.invoice_no = ih.invoice_no
group by ih.invoice_no, ih.tax_value
having coalesce(sum(id.bank_charges), 0) <> ih.tax_value;
The most logical way is probably to SUM the invoice detail first.
SELECT IH.INVOICE_NO
, IH.TAX_VALUE
FROM
DB2APPS.INVOICE_H IH
JOIN
( SELECT INVOICE_NO
, COALESCE(SUM(BANK_CHARGES),0) AS BANK_CHARGES
FROM
DB2APPS.INVOICE_D
GROUP BY
INVOICE_NO
) ID
ON
ID.INVOICE_NO = IH.INVOICE_NO
WHERE
ID.BANK_CHARGES <> IH.TAX_VALUE
Generally, you never need to use HAVING in SQL, and often your code will be clearer and easier to follow if you avoid it (even if it is sometimes a bit longer).
P.S. you can remove the COALESCE if BANK_CHARGES is NOT NULL.
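A variation on the pre-aggregated join, offered as a sketch: with an inner join, an invoice header that has no detail rows at all never reaches the comparison. If those headers should be treated as having 0 bank charges (an assumption about the requirement), a LEFT JOIN with the COALESCE moved outside the subquery covers them and also returns the summed value the question asked for.

```sql
-- Sketch only: treats headers without detail rows as 0 bank charges.
select ih.invoice_no
     , ih.tax_value
     , coalesce(d.bank_charges, 0) as bank_charges
from db2apps.invoice_h ih
left join
     ( -- one row per invoice with its summed charges
       select invoice_no
            , sum(bank_charges) as bank_charges
       from db2apps.invoice_d
       group by invoice_no
     ) d
  on d.invoice_no = ih.invoice_no
where coalesce(d.bank_charges, 0) <> ih.tax_value
with ur;
```

Rows where tax_value is NULL are still filtered out, since a comparison with NULL is never true.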

Execute a select from two tables with same columns

I have two tables with the same columns and I need to select from both of them. What is the best way to do this? My test select is:
SELECT
ISNULL(LoteDet.IdLoteDet, LoteDetPg.IdLoteDet) AS Expr1,
ISNULL(LoteDet.IDSac, LoteDetPg.IDSac) AS Expr2,
ISNULL(LoteDet.Comprom, LoteDetPg.Comprom) AS Expr3,
ISNULL(LoteDet.NossoNum, LoteDetPg.NossoNum) AS Expr4,
ISNULL(LoteDet.NossoNumDig, LoteDetPg.NossoNumDig) AS Expr5
FROM
LoteDet
CROSS JOIN
LoteDetPg
WHERE
Expr1 = 500
Is this possible?
What is the best way to execute this kind of select? If the value is not found in one table, it will be in the other table.
------ EDIT
Perhaps creating a view is a good alternative for this type of select?
Use COALESCE:
SELECT
COALESCE(LoteDet.IdLoteDet, LoteDetPg.IdLoteDet) AS Expr1,
COALESCE(LoteDet.IDSac, LoteDetPg.IDSac) AS Expr2,
COALESCE(LoteDet.Comprom, LoteDetPg.Comprom) AS Expr3,
COALESCE(LoteDet.NossoNum, LoteDetPg.NossoNum) AS Expr4,
COALESCE(LoteDet.NossoNumDig, LoteDetPg.NossoNumDig) AS Expr5
FROM
LoteDet
CROSS JOIN
LoteDetPg
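WHERE
COALESCE(LoteDet.IdLoteDet, LoteDetPg.IdLoteDet) = 500 -- repeat the expression; the alias Expr1 is not visible in WHERE
Take a look at this documentation: https://msdn.microsoft.com/pt-br/library/ms190349.aspx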
I believe this is going to return what's called a Cartesian product, which is the result of a join with no join condition, like the one you have above. That query is going to return a huge number of rows because you're not specifying how to join the two tables, so every row of LoteDet is paired with every row of LoteDetPg. At the very least, add an ON condition to the JOIN so that you match on IDs/keys. I think what you want is an INNER JOIN with an ON; this will return all of the matching rows, based on the ID/key.
SELECT
CASE WHEN tbl1.Comprom IS NULL THEN tbl2.Comprom ELSE tbl1.Comprom END AS Expr1,
CASE WHEN tbl1.Nossonum IS NULL THEN tbl2.Nossonum ELSE tbl1.Nossonum END AS Expr2
FROM
tbl1 --LoteDet
INNER JOIN tbl2 --LoteDetPg
ON (tbl1.ID = tbl2.ID)
WHERE
COALESCE(tbl1.Comprom, tbl2.Comprom) = 500 --Expr1 spelled out, since WHERE cannot see the alias. I know I swapped the expression values, use whichever expression you need here
Now, only rows that have a matching ID will return you values and it will use the value from tbl1, unless it is null, then it will use the value from tbl2.
Edit: I know CROSS JOIN turns into an INNER JOIN if a WHERE is specified, but does the WHERE need to include both tables? I feel that the Expr1 = 500 will still produce a Cartesian Product; can someone correct me?
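A different technique from the joins above, sketched on the assumption that a given IdLoteDet lives in one table or the other and that the matching columns have compatible types: stack the two tables with UNION ALL and filter the combined set. No join is involved, so no Cartesian product can arise.

```sql
-- Sketch only: treats LoteDet and LoteDetPg as two halves of one logical table.
SELECT IdLoteDet, IDSac, Comprom, NossoNum, NossoNumDig
FROM (
    SELECT IdLoteDet, IDSac, Comprom, NossoNum, NossoNumDig FROM LoteDet
    UNION ALL
    SELECT IdLoteDet, IDSac, Comprom, NossoNum, NossoNumDig FROM LoteDetPg
) AS lotes
WHERE IdLoteDet = 500
```

The inner UNION ALL is also a natural body for the view mentioned in the question's edit; the filter would then be applied when selecting from the view.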

Conditional JOIN

I'm wondering if it's possible to accomplish this in MS Access 2007:
A client gave me several tables, and they asked me for some queries. One of them has to get a field value from a table, depending on the value of a field of each record. This means, depending on the region, it has to look at one table, a second, or a third one.
So, I was wondering if I could do something like this:
SELECT
table2.some_value
FROM
table1
INNER JOIN table2
ON CASE table1.SOME_VALUE THEN table3.id = table2.some_id ELSE
CASE table1.SOME_VALUE THEN table4.id = table2.some_id ELSE
table5.id = table2.some_id END END
Is it clear? IF not, just ask and I'll answer your doubts.
EDIT:
I think I was not clear enough. I have several joins in my query, but the ON condition of the last one needs to be different depending on the data. For example:
I have a record in a table that has a State field, with three possibilities: CA, TX, FL.
If the value is CA, the ON statement of that JOIN should be CA_Standard_table.field = myTable.field.
If it's TX, the ON statement of that JOIN should be TX_Standard_table.field = myTable.field
And the same logic goes for FL.
How can I accomplish that?
EDIT 2:
Here is the query code, the last JOIN is the one that matters for this. The three possibilities of tables to join with in the ON statement are:
EU_Accepted_Standards
CA_Accepted_Standards
NZ_Accepted_Standards
It will pick one of them, depending on which of the following fields is checked:
CAStandard: it should take CA_Accepted_Standards.
EUSelStandard: it should take EU_Accepted_Standards.
NZSelStandard: it should take NZ_Accepted_Standards.
Query
SELECT
Projects.COMPAS_ID,
Projects.[Opportunity Name],
IIf([VolCap]=True,1) AS [Volume Cap],
IIf([DelGuarantee]=True,1) AS [Delivery Guarantee],
Projects.Tech_Level_Name,
Counterparty.CPExpertise,
Counterparty.CPFinStrength,
Geographic_Location.Country_RiskLevel,
Project_Stage_Risk.ProStaRiskLevel,
Counterparty.CPExperience,
Projects.Country_Name,
IIf([EU ETS]=True,1) AS EU,
IIf([CA ETS]=True,1) AS CA,
IIf([NZ ETS]=True,1) AS NZ,
IIf([Australia ETS]=True,1) AS Australia,
IIf([CAProjectType] is not null, CA_Accepted_Projects.CAPTRiskLevel,
IIf([EUSelProjType] is not null, EU_ETS_Standards.EUPTRiskLevel,
IIf([NZSelProjType] is not null, NZ_Accepted_Projects.NZPTRiskLevel))) as [Risk Level],
IIf([CAStandard] is not null, CA_Accepted_Standards.CAStanRiskLevel,
IIf([EUSelStandard] is not null, EU_Accepted_Standards.EUStanRiskLevel,
IIf([NZSelStandard] is not null, NZ_Accepted_Standards.NZStanRiskLevel))) as [Standard Risk]
FROM
Project_Stage_Risk
INNER JOIN (((((((((Counterparty
INNER JOIN Projects
ON Counterparty.CPID = Projects.[Counter Party])
INNER JOIN Geographic_Location
ON Projects.Country_Name = Geographic_Location.Country_Name)
left JOIN CA_Accepted_Projects
ON Projects.CAProjectType = CA_Accepted_Projects.CA_ProjectTypes)
left JOIN NZ_Accepted_Projects
ON Projects.NZSelProjType = NZ_Accepted_Projects.NZ_StandardID)
left JOIN EU_ETS_Standards
ON Projects.EUSelProjType = EU_ETS_Standards.EU_StandardID)
left JOIN CA_Accepted_Standards
ON Projects.CAStandard = CA_Accepted_Standards.ID)
left JOIN NZ_Accepted_Standards
ON Projects.NZSelStandard = NZ_Accepted_Standards.ID)
left JOIN EU_Accepted_Standards
ON Projects.EUSelStandard = EU_Accepted_Standards.ID)
left join Emissions_Trading_Systems
ON Emissions_Trading_Systems.ETS = EU_Accepted_Standards.ETS)
ON Project_Stage_Risk.ProStaID = Projects.[Project Stage];
Cross join the two sets in a view and put the condition in the select. Make 2 views of this view, then join the 2 views together.
You could create a UNION query that unions together the three tables you want to conditionally join to, including a "Some_Value" column that will contain the item on which you want to join. Essentially, for each table you include in the UNION, set the value of the "Some_Value" column to a value you can use in a where clause to differentiate things. Then create an overall query that joins (in your example, table2) to the union query and use a WHERE clause to limit the records to the ones you need. I have done similar things myself on projects in the past with great success.
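Applied to the standards tables from the question, the UNION idea might look like the sketch below. The saved-query name AllStandards and the Scheme and StanRiskLevel column names are made up for illustration, and the ID columns are assumed to have compatible types. In Access these would be two separate saved queries, and the comment lines must be removed because Access SQL view does not accept comments.

```sql
-- Query 1, saved as AllStandards (hypothetical name):
-- one row per standard, tagged with the scheme it belongs to.
SELECT 'CA' AS Scheme, ID, CAStanRiskLevel AS StanRiskLevel FROM CA_Accepted_Standards
UNION ALL
SELECT 'EU' AS Scheme, ID, EUStanRiskLevel AS StanRiskLevel FROM EU_Accepted_Standards
UNION ALL
SELECT 'NZ' AS Scheme, ID, NZStanRiskLevel AS StanRiskLevel FROM NZ_Accepted_Standards;

-- Query 2: pair each project with the standard its fields point to.
-- The WHERE clause does the "conditional join".
SELECT p.COMPAS_ID, s.Scheme, s.StanRiskLevel AS [Standard Risk]
FROM Projects AS p, AllStandards AS s
WHERE (s.Scheme = 'CA' AND p.CAStandard = s.ID)
   OR (s.Scheme = 'EU' AND p.EUSelStandard = s.ID)
   OR (s.Scheme = 'NZ' AND p.NZSelStandard = s.ID);
```

Unlike the left joins in the original query, this drops projects with no standard set and returns one row per standard for projects with more than one, so it only fits if each project has exactly one.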
Thanks for the answers. I know it was not well explained though, but in the end, I could solve this problem by writing a subquery.
Join all five tables together, and use that CASE expression inside the SELECT clause to choose the appropriate field from all tables.
SELECT
CASE table1.some_value
WHEN 'a' THEN table2.some_value
WHEN 'b' THEN table3.some_value
WHEN 'c' THEN table4.some_value
WHEN 'd' THEN table5.some_value
END
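Access SQL has no CASE expression, so in Access 2007 the same idea has to be written with IIf (as the question's query already does) or with the built-in Switch function. A sketch for the Standard Risk column, using only the tables and fields from the question and leaving the other joins out for brevity; the comment lines need to be removed before pasting into Access, which does not accept comments in SQL view:

```sql
-- Switch returns the value paired with the first condition that is true,
-- or Null when none is.
SELECT
    Projects.COMPAS_ID,
    Switch(
        Projects.CAStandard Is Not Null, CA_Accepted_Standards.CAStanRiskLevel,
        Projects.EUSelStandard Is Not Null, EU_Accepted_Standards.EUStanRiskLevel,
        Projects.NZSelStandard Is Not Null, NZ_Accepted_Standards.NZStanRiskLevel
    ) AS [Standard Risk]
FROM ((Projects
    LEFT JOIN CA_Accepted_Standards
        ON Projects.CAStandard = CA_Accepted_Standards.ID)
    LEFT JOIN EU_Accepted_Standards
        ON Projects.EUSelStandard = EU_Accepted_Standards.ID)
    LEFT JOIN NZ_Accepted_Standards
        ON Projects.NZSelStandard = NZ_Accepted_Standards.ID;
```

All three tables are left-joined unconditionally and the choice between them happens in the SELECT list, which is why no conditional ON clause is needed.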