Joining 2 columns where 1 column has unique records and the other has duplicates

Joining 2 columns where 1 column has unique records and the other has duplicates - sql

I have 2 column Risk_Geo_ID in one table which has duplicates and Geography_Identifier in other table which has unique records. Doing an inner join is giving me duplicate records and cannot process this query in cube as Geography_Identifier is a PK and its unique. Need a query to get unique records.
Here is the Code:
SELECT
s.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
INNER JOIN [Common].[Dim_Geography] AS s
ON b.Risk_Geo_ID = s.Geography_Identifier

Try using the distinct
SELECT
distinct s.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
INNER JOIN [Common].[Dim_Geography] AS s
ON b.Risk_Geo_ID = s.Geography_Identifier

Try to use CROSS APPLY with TOP 1 if you want to get just one row with such ID:
SELECT
q.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
CROSS APPLY
(
SELECT
TOP 1
*
FROM [Common].[Dim_Geography] AS s
WHERE s.Geography_Identifier = b.Risk_Geo_ID
)q

Related

SQL : combine 2 rows into 1

This is my database
I'm trying to display "hello" and "wo" in the same column
My SQL statement:
SELECT
d.CampaignId, d.ClientID,
citn.ImagePath AS Thumbnail, cidi.ImagePath AS DetailImage
FROM
MasterData.CampaignImage AS d
INNER JOIN
MasterData.CampaignImage AS citn ON d.CampaignId = citn.CampaignId
AND d.ClientID = citn.ClientID
AND citn.ImageTypeId = 1
INNER JOIN
MasterData.CampaignImage AS cidi ON d.CampaignId = cidi.CampaignId
AND d.ClientID = cidi.ClientID
AND cidi.ImageTypeId = 2
The output:
But now I have 2 rows in my output, how can I combine these into just one row?

Simply do SELECT DISTINCT to skip duplicate rows.

having an and condition on multiple rows - sql server

i have the following query where i'm trying to select one value based on a condition
select distinct tblItemTypePropValue.ItemType_Fkey, tblItemPrice.price, dbo.fnGetDiscountItemType(tblItemTypePropValue.ItemType_Fkey) as finalPrice
from tblItemTypePropValue
inner join tblItemPrice
on tblItemTypePropValue.ItemType_Fkey = tblItemPrice.itemType_fkey
where Value_Fkey in (2097,2131)
and tblitemtypepropvalue.ItemType_Fkey in (select id from tblItemType where Item_Fkey = 12241)
the problem is i n the where in value_fkey in (2097,2131)
is there a way to write an and condition on multiple rows where value_fkey = 2097 and = 2131.
and also, can it be done without a join since my parameters inside the in can vary in count.
sample result
itemtype_fkey price finalPrice
6191 100 93
so the query will only return the ItemType_Fkey that has both Value_Fkeys specified in the query
Value
thanks,

You can do two INNER JOINs and check for the existence of the 2097 and 2131 keys on them.
Being INNER JOINS, only the Prices having both of those keys will be returned.
select distinct P.ItemType_Fkey, P.price,
dbo.fnGetDiscountItemType(P.ItemType_Fkey) as finalPrice
from tblItemType T
inner join tblItemPrice P on P.ItemType_Fkey = T.id
inner join tblItemTypePropValue V1 on V1.ItemType_Fkey = P.ItemType_Fkey and V1.Value_Fkey = 2097
inner join tblItemTypePropValue V2 on V2.ItemType_Fkey = P.ItemType_Fkey and V2.Value_Fkey = 2104
where T.Item_Fkey = 12241

select distinct tblItemTypePropValue.Value_Fkey,tblItemTypePropValue.ItemType_Fkey, tblItemPrice.price, dbo.fnGetDiscountItemType(tblItemTypePropValue.ItemType_Fkey) as finalPrice
from tblItemTypePropValue
inner join tblItemPrice
on tblItemTypePropValue.ItemType_Fkey = tblItemPrice.itemType_fkey
where ((Value_Fkey = 2097) or (Value_Fkey = 2131))
and tblitemtypepropvalue.ItemType_Fkey in (select id from tblItemType where Item_Fkey = 12241)

You probably want those items where rows for both Value_Fkey exist like this
SELECT
v.ItemType_Fkey,
max(p.price), -- max/min/avg?
dbo.fnGetDiscountItemType(v.ItemType_Fkey) AS finalPrice
FROM tblItemTypePropValue AS v
INNER JOIN tblItemPrice AS p
ON v.ItemType_Fkey = p.itemType_fkey
WHERE Value_Fkey IN (2097,2131) --- list of values
AND v.ItemType_Fkey
IN ( SELECT id
FROM tblItemType
WHERE Item_Fkey = 12241
)
GROUP BY v.ItemType_Fkey
HAVING COUNT(DISTINCT Value_Fkey) = 2 -- number of values in list

SQL: If there are two rows that contain same record, want it to display one

based on my question above, below is the SQL
SELECT ets_tools.tools_id, ets_borrower.fullname, ets_team.team_name, ets_borrow.time_from,
ets_borrow.time_to, ets_borrow.borrow_id FROM ets_tools
INNER JOIN ets_tools_borrow ON ets_tools.tools_id = ets_tools_borrow.tools_id
INNER JOIN ets_borrow ON ets_borrow.borrow_id = ets_tools_borrow.borrow_id
INNER JOIN ets_borrower ON ets_borrower.badgeid = ets_borrow.badgeid
INNER JOIN ets_team ON ets_team.team_id = ets_borrower.team_id
WHERE ets_tools.borrow_id IS NOT NULL AND ets_borrow.status_id = 1 AND ets_borrow.time_to IS NULL
and the result display like this:
From the image above, we can see that the borrow_id with value 1 display two rows. Now, how to display only one borrow_id for value 1 since its duplicate the same things.
Anyone can help?

Assuming you want to retain the record having the smallest tools_id, you could aggregate by the other columns and take the MIN of tools_id:
SELECT
MIN(ets_tools.tools_id) AS tools_id,
ets_borrower.fullname,
ets_team.team_name,
ets_borrow.time_from,
ets_borrow.time_to,
ets_borrow.borrow_id
FROM ets_tools
INNER JOIN ets_tools_borrow ON ets_tools.tools_id = ets_tools_borrow.tools_id
INNER JOIN ets_borrow ON ets_borrow.borrow_id = ets_tools_borrow.borrow_id
INNER JOIN ets_borrower ON ets_borrower.badgeid = ets_borrow.badgeid
INNER JOIN ets_team ON ets_team.team_id = ets_borrower.team_id
WHERE
ets_tools.borrow_id IS NOT NULL AND
ets_borrow.status_id = 1 AND
ets_borrow.time_to IS NULL
GROUP BY
ets_borrower.fullname,
ets_team.team_name,
ets_borrow.time_from,
ets_borrow.time_to,
ets_borrow.borrow_id;

Try this:
Change the SELECT to SELECT TOP 1 WITH TIES
And at the end add ORDER BY ROW_NUMBER() OVER(PARTITION BY ets_borrow.borrow_id ORDER BY ets_tools.tools_id)

T-SQL Full Outer Join (sample provided)

I have some issues getting a full outer join to work in T-SQL. The left outer join part seems to be working okay, but the right outer join doesn't work as expected. Here's some sample data to test this on:
I have a table A with the following columns and data. The row marked with red is the row that cannot be found in table B.
And a second table B with the following columns and data. The rows marked with yellow are the rows that cannot be found in table A.
I am trying to join the tables using the following sql code:
select tableA.klientnr, tableA.uttakstype, tableA.uttaksnr, tableA.vareanr TableAItem,tableB.vareanr tableBitem, tableA.kvantum tableAquantity, tableB.totkvant tableBquantity
from tableA as tableA
full outer join tableB as tableB on tableA.klientnr=tableB.klientnr and tableA.uttakstype=tableB.uttakstype and tableA.uttaksnr=tableB.uttaksnr and tableA.vareanr=tableB.vareanr and tableB.IsDeleted=0
where tableA.UttaksNr=639779 and tableA.IsDeleted=0
The result of the sql is the following image. The row marked in red is the extra row from tableA that does show up, but I can't get the rows from table B to show up
Expected to have 2 extra rows
550 SA 639779 NULL 100059 NULL 0
550 SA 639779 NULL 103040 NULL 14
Later edit:
Would this be correct way to handle the full outer join where there's the header/line type of structure? Or can the query be optimized?
SELECT ISNULL(q1.accountid, q2.accountid) AccountId
,ISNULL(q1.klientnr, q2.klientnr) KlientNr
,ISNULL(q1.tilgangstype, q2.tilgangstype) 'Reception Type'
,ISNULL(q1.tilgangsnr, q2.tilgangsnr) 'Reception No'
,ISNULL(q1.dato, q2.dato) dato
,ISNULL(q1.LevNr, q2.LevNr) LevNr
,ISNULL(q1.Pakkemerke, q2.Pakkemerke) Pakkemerke
,ISNULL(q1.VareANr, q2.VareANr) VareANr
,ISNULL(q1.Ankomstdato,q2.Ankomstdato) 'Arrival Date'
,q1.Antall1
,q1.totkvant1
,q1.Antall2
,q1.totkvant2
,q2.Antall
,q2.totkvant
,q2.AntallTilFrys
,q2.TotKvantTilFrys
,ISNULL(q1.EksternKommentar1,q2.EksternKommentar1) EksternKommentar1
,q2.[Last Upsert]
FROM (
SELECT w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w789.VareANr
,sum(IIF(w789.prognosetype = 1, w789.Antall, NULL)) AS Antall1
,sum(IIF(w789.prognosetype = 1, w789.totkvant, NULL)) AS totkvant1
,sum(IIF(w789.prognosetype = 2, w789.Antall, NULL)) AS Antall2
,sum(IIF(w789.prognosetype = 2, w789.totkvant, NULL)) AS totkvant2
,w700.EksternKommentar1
FROM trading.W789Prognosekjopstat AS w789
INNER JOIN trading.W700Tilgangshode AS w700 ON w700.AccountId = w789.AccountId
AND w700.KlientNr = w789.Klientnr
AND w700.Tilgangsnr = w789.Tilgangsnr
AND w700.Tilgangstype = w789.Tilgangstype
AND w700.IsDeleted = 0
WHERE w789.IsDeleted = 0
GROUP BY w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w789.VareANr
,w700.EksternKommentar1
) q1
FULL OUTER JOIN (
SELECT w700.accountid
,w700.klientnr
,w700.tilgangstype
,w700.tilgangsnr
,w700.dato
,w700.Ankomstdato
,w700.LevNr
,w700.pakkemerke
,w702.VareANr
,w702.Antall
,w702.TotKvant
,w702.ValPris
,w702.AntallTilFrys
,w702.TotKvantTilFrys
,w700.EksternKommentar1
,(SELECT MAX(LastUpdateDate) FROM (VALUES (w702.createdAt),(w702.updatedAt)) AS UpdateDate(LastUpdateDate)) AS 'Last Upsert'
FROM trading.w702PrognoseKjop w702
INNER JOIN trading.W700Tilgangshode AS w700 ON w700.AccountId = w702.AccountId
AND w700.KlientNr = w702.Klientnr
AND w700.Tilgangsnr = w702.Tilgangsnr
AND w700.Tilgangstype = w702.Tilgangstype
AND w700.IsDeleted = 0
WHERE w702.IsDeleted = 0
) q2 ON q1.accountid = q2.accountid
AND q1.klientnr = q2.klientnr
AND q1.tilgangstype = q2.tilgangstype
AND q1.tilgangsnr = q2.tilgangsnr
AND q1.vareanr = q2.vareanr
WHERE totkvant1 IS NOT NULL
OR totkvant2 IS NOT NULL
OR totkvant IS NOT NULL

Filtering with full join is really tricky. Your where criteria are actually turning the full join into a left join.
You can do what you want by filtering before the join:
select a.klientnr, a.uttakstype, a.uttaksnr, a.vareanr a.tableAitem,
b.vareanr b.tableBitem, a.kvantum a.tableAquantity, b.totkvant b.tableBquantity
from (select a.*
from tableA a
where a.UttaksNr = 639779 and a.IsDeleted = 0
) a full join
(select b.*
from tableB b
where b.IsDeleted = 0
) b
on a.klientnr = b.klientnr and
a.uttakstype= b.uttakstype and
a.uttaksnr = b.uttaksnr and
a.vareanr = b.vareanr;

returning only the most recent record in dataset

I have a query returning data for several dates.
I want to return only record with the most recent date in the field SAPOD (the date is in fact CYYMMDD where C=0 before 2000 and 1 after 2000 and I can show this as YYYYMMDD using SAPOD=Case when LEFT(SAPOD,1)=1 then '20' else '19' end + SUBSTRING(cast(sapod as nvarchar(7)),2,7))
here is my query:
SELECT GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,CUS.GFCUN, BGCFN1,
BGCFN2, BGCFN3, SV.SVCSA, SV.SVNA1, SV.SVNA2, SV.SVNA3,
SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
FROM SCPF ACC
INNER JOIN GFPF CUS ON GFCPNC = SCAN
LEFT OUTER JOIN SXPF SEQ ON SXCUS = GFCUS AND SXPRIM = ''
LEFT OUTER JOIN SVPFClean SV ON SVSEQ = SXSEQ
LEFT OUTER JOIN BGPF ON BGCUS = GFCUS AND BGCLC = GFCLC
LEFT OUTER JOIN NEPF NE ON SCAB=NE.NEAB and SCAN=ne.NEAN and SCAS=ne.NEAS
LEFT OUTER JOIN SAPF SA ON SCAB=SAAB and SCAN=SAAN and SCAS=SAAS
WHERE
(SATCD>500 and
scsac='IV' and
scbal = 0 and
scai30<>'Y' and
scai14<>'Y' and
not exists(select * from v5pf where v5and=scan and v5bal<>0))
GROUP BY GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,
CUS.GFCUN, BGCFN1, BGCFN2, BGCFN3, SV.SVCSA,
SV.SVNA1, SV.SVNA2, SV.SVNA3, SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
ORDER BY MAX(SCAN) ASC, SAPOD DESC
I am getting results like the below where there are several transactions by a customer, and we only want to show the data of the most recent transaction:
So how can I show just the most recent transaction? Is this a case where I should use an OUTER APPLY or CROSS APPY?
EDIT:
Sorry I should clarify that I need the most recent date for each of the unique records in the field NEEAN which is the Account number

You can use ROW_NUMBER() as follows:
SELECT
ROW_NUMBER() OVER (PARTITION BY Ne.NEEAN ORDER BY SAPOD DESC) AS [Row],
GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,CUS.GFCUN, BGCFN1,
BGCFN2, BGCFN3, SV.SVCSA, SV.SVNA1, SV.SVNA2, SV.SVNA3,
SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
FROM SCPF ACC
INNER JOIN GFPF CUS ON GFCPNC = SCAN
LEFT OUTER JOIN SXPF SEQ ON SXCUS = GFCUS AND SXPRIM = ''
LEFT OUTER JOIN SVPFClean SV ON SVSEQ = SXSEQ
LEFT OUTER JOIN BGPF ON BGCUS = GFCUS AND BGCLC = GFCLC
LEFT OUTER JOIN NEPF NE ON SCAB=NE.NEAB and SCAN=ne.NEAN and SCAS=ne.NEAS
LEFT OUTER JOIN SAPF SA ON SCAB=SAAB and SCAN=SAAN and SCAS=SAAS
WHERE
(SATCD>500 and
scsac='IV' and
scbal = 0 and
scai30<>'Y' and
scai14<>'Y' and
not exists(select * from v5pf where v5and=scan and v5bal<>0)) and
[Row] = 1
GROUP BY GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,
CUS.GFCUN, BGCFN1, BGCFN2, BGCFN3, SV.SVCSA,
SV.SVNA1, SV.SVNA2, SV.SVNA3, SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
ORDER BY MAX(SCAN) ASC
You could encapsulate this within a subquery if you don't want to return the [Row] column.

you can user row_number to get top 1 row per customer
In the where clause need to return values with pos value as 1
sample query
row_number() over ( partition by GFCUS order by SAPOD desc) as pos

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Joining 2 columns where 1 column has unique records and the other has duplicates - sql

Related

SQL : combine 2 rows into 1

having an and condition on multiple rows - sql server

SQL: If there are two rows that contain same record, want it to display one

T-SQL Full Outer Join (sample provided)

returning only the most recent record in dataset

Categories

Resources