SQL: If there are two rows that contain same record, want it to display one

SQL: If there are two rows that contain same record, want it to display one - sql

based on my question above, below is the SQL
SELECT ets_tools.tools_id, ets_borrower.fullname, ets_team.team_name, ets_borrow.time_from,
ets_borrow.time_to, ets_borrow.borrow_id FROM ets_tools
INNER JOIN ets_tools_borrow ON ets_tools.tools_id = ets_tools_borrow.tools_id
INNER JOIN ets_borrow ON ets_borrow.borrow_id = ets_tools_borrow.borrow_id
INNER JOIN ets_borrower ON ets_borrower.badgeid = ets_borrow.badgeid
INNER JOIN ets_team ON ets_team.team_id = ets_borrower.team_id
WHERE ets_tools.borrow_id IS NOT NULL AND ets_borrow.status_id = 1 AND ets_borrow.time_to IS NULL
and the result display like this:
From the image above, we can see that the borrow_id with value 1 display two rows. Now, how to display only one borrow_id for value 1 since its duplicate the same things.
Anyone can help?

Assuming you want to retain the record having the smallest tools_id, you could aggregate by the other columns and take the MIN of tools_id:
SELECT
MIN(ets_tools.tools_id) AS tools_id,
ets_borrower.fullname,
ets_team.team_name,
ets_borrow.time_from,
ets_borrow.time_to,
ets_borrow.borrow_id
FROM ets_tools
INNER JOIN ets_tools_borrow ON ets_tools.tools_id = ets_tools_borrow.tools_id
INNER JOIN ets_borrow ON ets_borrow.borrow_id = ets_tools_borrow.borrow_id
INNER JOIN ets_borrower ON ets_borrower.badgeid = ets_borrow.badgeid
INNER JOIN ets_team ON ets_team.team_id = ets_borrower.team_id
WHERE
ets_tools.borrow_id IS NOT NULL AND
ets_borrow.status_id = 1 AND
ets_borrow.time_to IS NULL
GROUP BY
ets_borrower.fullname,
ets_team.team_name,
ets_borrow.time_from,
ets_borrow.time_to,
ets_borrow.borrow_id;

Try this:
Change the SELECT to SELECT TOP 1 WITH TIES
And at the end add ORDER BY ROW_NUMBER() OVER(PARTITION BY ets_borrow.borrow_id ORDER BY ets_tools.tools_id)

Related

Joining 2 columns where 1 column has unique records and the other has duplicates

I have 2 column Risk_Geo_ID in one table which has duplicates and Geography_Identifier in other table which has unique records. Doing an inner join is giving me duplicate records and cannot process this query in cube as Geography_Identifier is a PK and its unique. Need a query to get unique records.
Here is the Code:
SELECT
s.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
INNER JOIN [Common].[Dim_Geography] AS s
ON b.Risk_Geo_ID = s.Geography_Identifier

Try using the distinct
SELECT
distinct s.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
INNER JOIN [Common].[Dim_Geography] AS s
ON b.Risk_Geo_ID = s.Geography_Identifier

Try to use CROSS APPLY with TOP 1 if you want to get just one row with such ID:
SELECT
q.Geography_Identifier
,s.State_Code
,s.State_Name
,s.County_Name
,s.City_Name
,s.ZIP_Code
,a.Risk_ID
,a.Risk_Address
,a.Latitude
,a.Longitude
,a.Distance_to_Coast
,a.Insurance_Score
FROM [Policy].[Dim_Risk] AS a
INNER JOIN [Policy].[Fact_Monthly_Policy_Snap] AS b
ON b.Risk_ID = a.Risk_ID
AND b.Insurance_score = a.Insurance_Score
CROSS APPLY
(
SELECT
TOP 1
*
FROM [Common].[Dim_Geography] AS s
WHERE s.Geography_Identifier = b.Risk_Geo_ID
)q

where statement execute before inner join

I'm trying to grab the first instance of each result with a sysAddress of less than 4. However my statement currently grabs the min(actionTime) result first before applying the where sysAddress < 4. I'm trying to have the input for the inner join as the where sysAddress < 4 however i cant seem to figure out how to do it.
Should i be nesting it all differently? I didnt want to create an additional layer of table joins. Is this possible? I'm a bit lost at all the answers ive found.
SELECT
tblHistoryObject.info,
tblHistory.actionTime,
tblHistoryUser.userID,
tblHistoryUser.firstName,
tblHistoryUser.surname,
tblHistory.eventID,
tblHistoryObject.objectID,
tblHistorySystem.sysAddress
FROM tblHistoryObject
JOIN tblHistory
ON (tblHistory.historyObjectID = tblHistoryObject.historyObjectID)
JOIN tblHistorySystem
ON (tblHistory.historySystemID = tblHistorySystem.historySystemID)
JOIN tblHistoryUser
ON (tblHistory.historyUserID = tblHistoryUser.historyUserID)
INNER JOIN (SELECT
MIN(actionTime) AS recent_date,
historyObjectID
FROM tblHistory
GROUP BY historyObjectID) AS t2
ON t2.historyObjectID = tblHistoryObject.historyObjectID
AND tblHistory.actionTime = t2.recent_date
WHERE sysAddress < 4
ORDER BY actionTime ASC

WITH
all_action_times AS
(
SELECT
tblHistoryObject.info,
tblHistory.actionTime,
tblHistoryUser.userID,
tblHistoryUser.firstName,
tblHistoryUser.surname,
tblHistory.eventID,
tblHistoryObject.objectID,
tblHistorySystem.sysAddress,
ROW_NUMBER() OVER (PARTITION BY tblHistoryObject.historyObjectID
ORDER BY tblHistory.actionTime
)
AS historyObjectID_SeqByActionTime
FROM
tblHistoryObject
INNER JOIN
tblHistory
ON tblHistory.historyObjectID = tblHistoryObject.historyObjectID
INNER JOIN
tblHistorySystem
ON tblHistory.historySystemID = tblHistorySystem.historySystemID
INNER JOIN
tblHistoryUser
ON tblHistory.historyUserID = tblHistoryUser.historyUserID
WHERE
tblHistorySystem.sysAddress < 4
)
SELECT
*
FROM
all_action_times
WHERE
historyObjectID_SeqByActionTime = 1
ORDER BY
actionTime ASC
This does exactly what your original query did, without trying to filter by action_time.
Then it appends a new column, using ROW_NUMBER() to generate sequences from 1 for each individual tblHistoryObject.historyObjectID. Then it takes only the rows where this sequence value is 1 (the first row per historyObjectID, when sorted in action_time order).

update with subquery

The code :
UPDATE tt_t_documents
SET t_Doc_header_ID = (SELECT
MIN(dh.Doc_header_ID)
FROM tt_t_documents td WITH (NOLOCK)
JOIN Doc_header dh WITH (NOLOCK)
ON dh.DH_doc_number = td.t_dh_doc_number
AND dh.DH_sub = 1
JOIN Pred_entry pe WITH (NOLOCK)
ON pe.Pred_entry_ID = dh.DH_pred_entry
JOIN Doc_type dt WITH (NOLOCK)
ON dty.Doc_type_ID = pe.PD_doc_type
AND dt.DT_mode = 5
HAVING COUNT(dh.Doc_header_ID) = 1);
I want to update my columns, but before that I also want to check if there is only one ID found.
The problem in this select is that I get more than one ID.
How can I write a query that updates each row and checks in the same query that there is only one id found?

I am guessing that you intend something like this:
update td
set t_Doc_header_ID = min_Doc_header_ID
from tt_t_documents td join
(select DH_doc_number, min(dh.Doc_header_ID) as min_Doc_header_ID
from Doc_header dh join
Pred_entry pe
on pe.Pred_entry_ID = dh.DH_pred_entry join
Doc_type dt
on dty.Doc_type_ID = pe.PD_doc_type and dt.DT_mode = 5
where dh.DH_doc_number = td.t_dh_doc_number and dh.DH_sub = 1
group by DH_doc_number
having count(dh.Doc_header_ID) = 1
) dh
on dh.DH_doc_number = td.t_dh_doc_number;
Using a join also means that you do not update the values where the condition does not match. If you use a left join, then the values will be updated to NULL (if that is your intention).

I'm not sure I believe you are getting more than one id back with that select given that you are doing a 'min' on it and have no group by. It should be only returning the lowest value for Doc_header_id.
To do what you are asking, first, you should have some way of joining to the tt_t_documents table in the update statement (eg. where td.id == tt_t_documents.id).
Second, you could re-write it to use the sub-query in the from. Something like:
update
tt_t_documents
set
t_Doc_header_ID = x.Doc_Header_id
from tt_t_documents join (
select td.id,
min(dh.Doc_header_ID)
from
tt_t_documents td
join Doc_header dh
on dh.DH_doc_number = td.t_dh_doc_number
and dh.DH_sub = 1
join Pred_entry pe
on pe.Pred_entry_ID = dh.DH_pred_entry
join Doc_type dt
on dty.Doc_type_ID = pe.PD_doc_type
and dt.DT_mode = 5
group by td.id
having
count(dh.Doc_header_ID) = 1
) x on tt_t_documents.id= x.id;
The syntax may not be perfect and I'm not sure how you want to find the doc_header_id but it would be something like this. The sub query would only return values with 1 doc_header_id). Not knowing the schema of your tables, this is as close as I can get.

Only return value that matches the ID on table 1

I have tried all possible joins and sub-queries but I cant get the data to only return one value from table 2 that exactly matches the vendor ID. If I dont have the address included in the query, I get one hit for the vendor ID. How can I make it so that when I add the address, I only want the one vendor that I get prior to adding the address.
The vendor from table one must be VEN-CLASS IS NOT NULL.
This was my last attempt using subquery:
SELECT DISTINCT APVENMAST.VENDOR_GROUP,
APVENMAST.VENDOR,
APVENMAST.VENDOR_VNAME,
APVENMAST.VENDOR_CONTCT,
APVENMAST.TAX_ID,
Subquery.ADDR1
FROM (TEST.dbo.APVENMAST APVENMAST
INNER JOIN
(SELECT APVENADDR.ADDR1,
APVENADDR.VENDOR_GROUP,
APVENADDR.VENDOR,
APVENMAST.VEN_CLASS
FROM TEST.dbo.APVENADDR APVENADDR
INNER JOIN TEST.dbo.APVENMAST APVENMAST
ON (APVENADDR.VENDOR_GROUP = APVENMAST.VENDOR_GROUP)
AND (APVENADDR.VENDOR = APVENMAST.VENDOR)
WHERE (APVENMAST.VEN_CLASS IS NOT NULL)) Subquery
ON (APVENMAST.VENDOR_GROUP = Subquery.VENDOR_GROUP)
AND (APVENMAST.VENDOR = Subquery.VENDOR))
INNER JOIN TEST.dbo.APVENLOC APVENLOC
ON (APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP)
AND (APVENMAST.VENDOR = APVENLOC.VENDOR)
WHERE (APVENMAST.VEN_CLASS IS NOT NULL)

Try this:
SELECT APVENMAST.VENDOR_GROUP
, APVENMAST.VENDOR
, APVENMAST.VENDOR_VNAME
, APVENMAST.VENDOR_CONTCT
, APVENMAST.TAX_ID
, APVENADDR.ADDR1
FROM TEST.dbo.APVENMAST APVENMAST
INNER JOIN (
select VENDOR_GROUP, VENDOR, ADDR1
, row_number() over (partition by VENDOR_GROUP, VENDOR order by ADDR1) r
from TEST.dbo.APVENADDR
) APVENADDR
ON APVENADDR.VENDOR_GROUP = APVENMAST.VENDOR_GROUP
AND APVENADDR.VENDOR = APVENMAST.VENDOR
AND APVENADDR.r = 1
--do you need this table; you're not using it...
--INNER JOIN TEST.dbo.APVENLOC APVENLOC
--ON APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP
--AND APVENMAST.VENDOR = APVENLOC.VENDOR
WHERE APVENMAST.VEN_CLASS IS NOT NULL
--if the above inner join was to filter results, you can do this instead:
and exists (
select top 1 1
from TEST.dbo.APVENLOC APVENLOC
ON APVENMAST.VENDOR_GROUP = APVENLOC.VENDOR_GROUP
AND APVENMAST.VENDOR = APVENLOC.VENDOR
)

I found another column in the APVENLOC table that I can filter on to get the unique vendor. Turns out if the vendor address is for the main office, the vendor location is set blank.
Easier than I thought it would be!
SELECT DISTINCT APVENMAST.VENDOR_GROUP,
APVENMAST.VENDOR,
APVENMAST.VENDOR_VNAME,
APVENADDR.ADDR1,
APVENMAST.VENDOR_SNAME,
APVENADDR.LOCATION_CODE,
APVENMAST.VEN_CLASS
FROM TEST.dbo.APVENMAST APVENMAST
INNER JOIN TEST.dbo.APVENADDR APVENADDR
ON (APVENMAST.VENDOR_GROUP = APVENADDR.VENDOR_GROUP)
AND (APVENMAST.VENDOR = APVENADDR.VENDOR)
WHERE (APVENADDR.LOCATION_CODE = ' ')
Shaji

returning only the most recent record in dataset

I have a query returning data for several dates.
I want to return only record with the most recent date in the field SAPOD (the date is in fact CYYMMDD where C=0 before 2000 and 1 after 2000 and I can show this as YYYYMMDD using SAPOD=Case when LEFT(SAPOD,1)=1 then '20' else '19' end + SUBSTRING(cast(sapod as nvarchar(7)),2,7))
here is my query:
SELECT GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,CUS.GFCUN, BGCFN1,
BGCFN2, BGCFN3, SV.SVCSA, SV.SVNA1, SV.SVNA2, SV.SVNA3,
SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
FROM SCPF ACC
INNER JOIN GFPF CUS ON GFCPNC = SCAN
LEFT OUTER JOIN SXPF SEQ ON SXCUS = GFCUS AND SXPRIM = ''
LEFT OUTER JOIN SVPFClean SV ON SVSEQ = SXSEQ
LEFT OUTER JOIN BGPF ON BGCUS = GFCUS AND BGCLC = GFCLC
LEFT OUTER JOIN NEPF NE ON SCAB=NE.NEAB and SCAN=ne.NEAN and SCAS=ne.NEAS
LEFT OUTER JOIN SAPF SA ON SCAB=SAAB and SCAN=SAAN and SCAS=SAAS
WHERE
(SATCD>500 and
scsac='IV' and
scbal = 0 and
scai30<>'Y' and
scai14<>'Y' and
not exists(select * from v5pf where v5and=scan and v5bal<>0))
GROUP BY GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,
CUS.GFCUN, BGCFN1, BGCFN2, BGCFN3, SV.SVCSA,
SV.SVNA1, SV.SVNA2, SV.SVNA3, SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
ORDER BY MAX(SCAN) ASC, SAPOD DESC
I am getting results like the below where there are several transactions by a customer, and we only want to show the data of the most recent transaction:
So how can I show just the most recent transaction? Is this a case where I should use an OUTER APPLY or CROSS APPY?
EDIT:
Sorry I should clarify that I need the most recent date for each of the unique records in the field NEEAN which is the Account number

You can use ROW_NUMBER() as follows:
SELECT
ROW_NUMBER() OVER (PARTITION BY Ne.NEEAN ORDER BY SAPOD DESC) AS [Row],
GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,CUS.GFCUN, BGCFN1,
BGCFN2, BGCFN3, SV.SVCSA, SV.SVNA1, SV.SVNA2, SV.SVNA3,
SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
FROM SCPF ACC
INNER JOIN GFPF CUS ON GFCPNC = SCAN
LEFT OUTER JOIN SXPF SEQ ON SXCUS = GFCUS AND SXPRIM = ''
LEFT OUTER JOIN SVPFClean SV ON SVSEQ = SXSEQ
LEFT OUTER JOIN BGPF ON BGCUS = GFCUS AND BGCLC = GFCLC
LEFT OUTER JOIN NEPF NE ON SCAB=NE.NEAB and SCAN=ne.NEAN and SCAS=ne.NEAS
LEFT OUTER JOIN SAPF SA ON SCAB=SAAB and SCAN=SAAN and SCAS=SAAS
WHERE
(SATCD>500 and
scsac='IV' and
scbal = 0 and
scai30<>'Y' and
scai14<>'Y' and
not exists(select * from v5pf where v5and=scan and v5bal<>0)) and
[Row] = 1
GROUP BY GFCUS, Ne.NEEAN, SCDLE, SAPOD, SATCD,
CUS.GFCUN, BGCFN1, BGCFN2, BGCFN3, SV.SVCSA,
SV.SVNA1, SV.SVNA2, SV.SVNA3, SV.SVNA4, SV.SVNA5, SV.SVPZIP, SV.NONUK
ORDER BY MAX(SCAN) ASC
You could encapsulate this within a subquery if you don't want to return the [Row] column.

you can user row_number to get top 1 row per customer
In the where clause need to return values with pos value as 1
sample query
row_number() over ( partition by GFCUS order by SAPOD desc) as pos

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL: If there are two rows that contain same record, want it to display one - sql

Try this: Change the SELECT to SELECT TOP 1 WITH TIES And at the end add ORDER BY ROW_NUMBER() OVER(PARTITION BY ets_borrow.borrow_id ORDER BY ets_tools.tools_id)

Related

Joining 2 columns where 1 column has unique records and the other has duplicates

where statement execute before inner join

update with subquery

Only return value that matches the ID on table 1

returning only the most recent record in dataset

Categories

Resources