CTE query not returning duplicate rows

CTE query not returning duplicate rows - sql

For auditing purposes, I use triggers to insert changes into an audit table. I then use a CTE to recursively get all audit rows of a particular item.
This SQLFiddle adds a few audit rows and the CTE that I'm currently using.
Here is the CTE:
;WITH cteAudit AS
(
SELECT id, [user_name], date_time, item_parent_type, item_parent_id,
item_type, item_id, item_action, row_guid, 1 AS audit_level
FROM audit
WHERE
(item_type = 22 AND item_id = 925)
UNION ALL
SELECT a.id, a.[user_name], a.date_time, a.item_parent_type, a.item_parent_id,
a.item_type, a.item_id, a.item_action, a.row_guid, cteAudit.audit_level + 1
FROM audit a
INNER JOIN cteAudit
ON a.item_parent_id = cteAudit.item_id
AND a.item_parent_type = cteAudit.item_type
WHERE
a.item_parent_type <> a.item_type AND
a.item_parent_id <> a.item_id
)
SELECT
cteAudit.id,
cteAudit.[user_name],
cteAudit.date_time,
cteAudit.item_parent_id,
#itemtype AS item_type,
cteAudit.item_id,
item_action,
cteAudit.audit_level,
CONVERT(nvarchar(36), cteAudit.row_guid) AS row_guid
ORDER BY date_time DESC, audit_level desc
However, with this CTE I have 2 problems:
The query does not return all 13 rows. It doesn't show rows 11-13. This is because the main table (customer) wasn't audited because the user just changed the address of a contact person.
This happens because I modified the 3rd level (contact person) and this level has a link to the 2nd level (contact list. However the 2nd level, which has the link to the 1st level (customer), was not modified and therefore is not in the audit table and therefore there's no link to the 1st level.
The query returns duplicate records.
How can I return all rows for the main table? And why is the query return duplicate rows?

Your problem comes from either incorrect query or incorrect data.
You ask your recursive query for these constraint:
WHERE
a.item_parent_type <> a.item_type AND
a.item_parent_id <> a.item_id
but for row 11 - 13, item_parent_id equals item_id
You also ask for
AND a.item_parent_type = cteAudit.item_type
but for row 11 - 13 your types are flipped and therefore incorrect.

Related

SQL query to join smae table multiple times

I have a scenario to join the same table multiple times to get the desired output. For ex I have two tables TABLE A and TABLE B.
Step 1: I want to take the all the parties from TABLE A which have
lowest Idate. Lowest idate will be fetched based partyid and idate
column.
Step 2: Then based on CID which is fetched from TABLE A in step 1,
we need to fetch the corresponding MID from TABLE B which have
MIDTYPE=130300.
Step 3: Then based on the MID fetched in step 2 we need to traverse
the same table and find out the latest record for the same MID based
on idate in TABLE B and fetch the corresponding CID for the MID.
Step 4: Now for that CID we need to fetch MID value for MIDTYPE
130307 in the same table(TABLEB). And my final output should be combination of MID
which we fetched for step 3 and MID fetched for 130307 in step 4.
I write a query like this ..but its taking lot of time for the query to run as we are going through the same table(TABLEB) multiple times and TABLEB have millions of rows. Is there anyway we can rewrite this query in different way. Could some one can help with this me.
SELECT
ident.mid mid1,
b.mid mid2
FROM
(
SELECT
*
FROM
tableb
WHERE
midtype = 130307
) ident
INNER JOIN (
SELECT
s.cid,
s.mid,
s.midtype
FROM
(
SELECT
cid,
partyid,
admin_sys_tp_cd,
mid,
ilast
FROM
(
SELECT
cq.cid,
RANK() OVER(
PARTITION BY cq.partyid
ORDER BY
cq.idate ASC
) rnk,
cq.idate,
cq.partyid,
i.mid,
i.idate AS ilast
FROM
tablea cq
INNER JOIN tableb i ON cq.cid = i.cid
INNER JOIN tablec ON i.cid = c.cid
WHERE
i.midtype = 130300
)
WHERE
rnk = 1
) a
INNER JOIN (
SELECT
*
FROM
(
SELECT
cid,
mid,
midtype,
RANK() OVER(
PARTITION BY mid
ORDER BY
idate DESC
) rnk_mpid
FROM
tableb
)
WHERE
rnk_mpid = 1
) s ON a.mid = s.mid
AND s.midtype = 130300
) b ON ident.cid = b.cid
AND ident.midtype = 130307

not what you asked, but before others and I, spent time trying to get different approaches for you, let's make sure the basics are covered.
No matter how different you can write an SQL query, they will never perform fast, in a MILLION base table if you don't have the proper indexes for it. Specially in your case, as you have to access it 3 times at least.
Just by looking at your detailed steps. I would say that you should have at least 3 different indexes created to support this query.
TableA_Index1 ( PARTYID, LDATE, INCLUDES CID)
TableB_Index1 (CID, MIDTYPE, INCLUDES MID )
TableB_Index2 (MID, LDATE, INCLUDES CID )
Do you have them ?
Have you ever tried to run this query on db2-advisor (db2advis) to get recommended indexes for it ?

SQL Server: Each GROUP BY expression must contain at least one column that is not an outer reference

scenario 1:
I have two tables INFUSION_APP_APPOINTMENT,INFUSION_APP_NURSE_NOTES where
INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID=INFUSION_APP_APPOINTMENT.ID and i want to find out the INFUSION_APP_NURSE_NOTES.ID's where INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID is same.
for eg. if the INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID = 1 and INFUSION_APP_NURSE_NOTES.ID is 12,15,78, then i want to display all the
INFUSION_APP_NURSE_NOTES.ID's where INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID =1.
i use below script
SELECT INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID,INFUSION_APP_NURSE_NOTES.ID
FROM INFUSION_APP_NURSE_NOTES
GROUP BY INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID,INFUSION_APP_NURSE_NOTES.ID
HAVING COUNT(INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID)>1
but it does not gives me any records.
scenario 2:
I am running below script with the intention to get the duplicate records with different INFUSION_APP_NURSE_NOTES.ID's but same INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID.
SELECT INFUSION_APP_NURSE_NOTES.ID,INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID,INFUSION_APP_NURSE_NOTES.TYPE
FROM INFUSION_APP_NURSE_NOTES
WHERE
EXISTS (
SELECT 1 FROM INFUSION_APP_APPOINTMENT
WHERE
INFUSION_APP_NURSE_NOTES.ENABLE=1
AND INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID=INFUSION_APP_APPOINTMENT.ID
GROUP BY INFUSION_APP_NURSE_NOTES.ID
HAVING COUNT(INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID)>1
)
ORDER BY INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID;
but getting below error
SQL Error(164): Each GROUP BY expression must contain at least one
column that is not an outer reference
how to solve it?
i want the only row which has common APPOINTMENT_ID but different n

The question is unclear. Finding duplicates is typically performed using ranking functions like ROW_NUMBER(). This query :
SELECT *,ROW_NUMBER(PARTITION BY APPOINTMENT_ID ORDER BYID) as RN
FROM INFUSION_APP_NURSE_NOTES
WHERE
ENABLE=1
Will rank notes for the same appointment by ID and return 1, 2, 3 etc starting from the earliest note. ORDER BY ID DESC would return 1 for the latest note.
This can be used in a subquery or CTE to find the first, last or or duplicate records, eg :
with notes as (
SELECT *,ROW_NUMBER(PARTITION BY APPOINTMENT_ID ORDER BYID) as RN
FROM INFUSION_APP_NURSE_NOTES
WHERE
ENABLE=1
)
select *
from notes
where RN=1
Will return the first note per appointment while :
where RN>1
Will return only duplicates.
The question doesn't say what should be done with the duplicates though.
If the question is how to return all notes from appointments with multiple notes, a subquery can be used to return the APPOINTMENT_IDs that have more than one note. There's no need to include the INFUSION_APP_APPOINTMENT table though :
SELECT *
FROM INFUSION_APP_NURSE_NOTES
where
ENABLE=1 AND
APPOINTMENT_ID IN ( SELECT APPOINTMENT_ID
FROM INFUSION_APP_NURSE_NOTES
WHERE
ENABLE=1
group by APPOINTMENT_ID
having count(*)>1)

Try this
SELECT INFUSION_APP_NURSE_NOTES.ID,INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID,INFUSION_APP_NURSE_NOTES.TYPE
FROM INFUSION_APP_NURSE_NOTES
WHERE
EXISTS (
SELECT COUNT(B.APPOINTMENT_ID), B.ID
FROM INFUSION_APP_APPOINTMENT A
INNER JOIN INFUSION_APP_NURSE_NOTES B ON B.APPOINTMENT_ID = A.ID
WHERE
B.ENABLE=1
GROUP BY B.ID
HAVING COUNT(B.APPOINTMENT_ID)>1
)
ORDER BY INFUSION_APP_NURSE_NOTES.APPOINTMENT_ID;

Modify my SQL Server query -- returns too many rows sometimes

I need to update the following query so that it only returns one child record (remittance) per parent (claim).
Table Remit_To_Activate contains exactly one date/timestamp per claim, which is what I wanted.
But when I join the full Remittance table to it, since some claims have multiple remittances with the same date/timestamps, the outermost query returns more than 1 row per claim for those claim IDs.
SELECT * FROM REMITTANCE
WHERE BILLED_AMOUNT>0 AND ACTIVE=0
AND REMITTANCE_UUID IN (
SELECT REMITTANCE_UUID FROM Claims_Group2 G2
INNER JOIN Remit_To_Activate t ON (
(t.ClaimID = G2.CLAIM_ID) AND
(t.DATE_OF_LATEST_REGULAR_REMIT = G2.CREATE_DATETIME)
)
where ACTIVE=0 and BILLED_AMOUNT>0
)
I believe the problem would be resolved if I included REMITTANCE_UUID as a column in Remit_To_Activate. That's the REAL issue. This is how I created the Remit_To_Activate table (trying to get the most recent remittance for a claim):
SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
MAX(claim_id) AS ClaimID,
INTO Latest_Remit_To_Activate
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID
Claims_Group2 contains these fields:
REMITTANCE_UUID,
CLAIM_ID,
BILLED_AMOUNT,
CREATE_DATETIME
Here are the 2 rows that are currently giving me the problem--they're both remitts for the SAME CLAIM, with the SAME TIMESTAMP. I only want one of them in the Remits_To_Activate table, so only ONE remittance will be "activated" per Claim:
enter image description here

You can change your query like this:
SELECT
p.*, latest_remit.DATE_OF_LATEST_REMIT
FROM
Remittance AS p inner join
(SELECT MAX(create_datetime) as DATE_OF_LATEST_REMIT,
claim_id,
FROM Claims_Group2
WHERE BILLED_AMOUNT>0
GROUP BY Claim_ID
ORDER BY Claim_ID) as latest_remit
on latest_remit.claim_id = p.claim_id;
This will give you only one row. Untested (so please run and make changes).

Without having more information on the structure of your database -- especially the structure of Claims_Group2 and REMITTANCE, and the relationship between them, it's not really possible to advise you on how to introduce a remittance UUID into DATE_OF_LATEST_REMIT.
Since you are using SQL Server, however, it is possible to use a window function to introduce a synthetic means to choose among remittances having the same timestamp. For example, it looks like you could approach the problem something like this:
select *
from (
select
r.*,
row_number() over (partition by cg2.claim_id order by cg2.create_datetime desc) as rn
from
remittance r
join claims_group2 cg2
on r.remittance_uuid = cg2.remittance_uuid
where
r.active = 0
and r.billed_amount > 0
and cg2.active = 0
and cg2.billed_amount > 0
) t
where t.rn = 1
Note that that that does not depend on your DATE_OF_LATEST_REMIT table at all, it having been subsumed into the inline view. Note also that this will introduce one extra column into your results, though you could avoid that by enumerating the columns of table remittance in the outer select clause.
It also seems odd to be filtering on two sets of active and billed_amount columns, but that appears to follow from what you were doing in your original queries. In that vein, I urge you to check the results carefully, as lifting the filter conditions on cg2 columns up to the level of the join to remittance yields a result that may return rows that the original query did not (but never more than one per claim_id).

A co-worker offered me this elegant demonstration of a solution. I'd never used "over" or "partition" before. Works great! Thank you John and Gaurasvsa for your input.
if OBJECT_ID('tempdb..#t') is not null
drop table #t
select *, ROW_NUMBER() over (partition by CLAIM_ID order by CLAIM_ID) as ROW_NUM
into #t
from
(
select '2018-08-15 13:07:50.933' as CREATE_DATE, 1 as CLAIM_ID, NEWID() as
REMIT_UUID
union select '2018-08-15 13:07:50.933', 1, NEWID()
union select '2017-12-31 10:00:00.000', 2, NEWID()
) x
select *
from #t
order by CLAIM_ID, ROW_NUM
select CREATE_DATE, MAX(CLAIM_ID), MAX(REMIT_UUID)
from #t
where ROW_NUM = 1
group by CREATE_DATE

Order by data as per supplied Id in sql

Query:
SELECT *
FROM [MemberBackup].[dbo].[OriginalBackup]
where ration_card_id in
(
1247881,174772,
808454,2326154
)
Right now the data is ordered by the auto id or whatever clause I'm passing in order by.
But I want the data to come in sequential format as per id's I have passed
Expected Output:
All Data for 1247881
All Data for 174772
All Data for 808454
All Data for 2326154
Note:
Number of Id's to be passed will 300 000

One option would be to create a CTE containing the ration_card_id values and the orders which you are imposing, and the join to this table:
WITH cte AS (
SELECT 1247881 AS ration_card_id, 1 AS position
UNION ALL
SELECT 174772, 2
UNION ALL
SELECT 808454, 3
UNION ALL
SELECT 2326154, 4
)
SELECT t1.*
FROM [MemberBackup].[dbo].[OriginalBackup] t1
INNER JOIN cte t2
ON t1.ration_card_id = t2.ration_card_id
ORDER BY t2.position DESC
Edit:
If you have many IDs, then neither the answer above nor the answer given using a CASE expression will suffice. In this case, your best bet would be to load the list of IDs into a table, containing an auto increment ID column. Then, each number would be labelled with a position as its record is being loaded into your database. After this, you can join as I have done above.

If the desired order does not reflect a sequential ordering of some preexisting data, you will have to specify the ordering yourself. One way to do this is with a case statement:
SELECT *
FROM [MemberBackup].[dbo].[OriginalBackup]
where ration_card_id in
(
1247881,174772,
808454,2326154
)
ORDER BY CASE ration_card_id
WHEN 1247881 THEN 0
WHEN 174772 THEN 1
WHEN 808454 THEN 2
WHEN 2326154 THEN 3
END
Stating the obvious but note that this ordering most likely is not represented by any indexes, and will therefore not be indexed.

Insert your ration_card_id's in #temp table with one identity column.
Re-write your sql query as:
SELECT a.*
FROM [MemberBackup].[dbo].[OriginalBackup] a
JOIN #temps b
on a.ration_card_id = b.ration_card_id
order by b.id

Access 2010 SQL - Show Duplicate Records in Order for eventual Deletion (pure SQL Solution pls)

I have the following table named 'flt'
You can see the duplicates are identifed by 3 columns only (flight, fltno, stad)... I don't care about what is in col1 and col2.. But I should be able to show it in the query.
So.. you can see ids 8, 3 and 10 are duplicates.
I want to write a pure SQL query... that can do the following:
1) the duplicate count column.. which basically counts how many records are there that matches the flight, fltno, stad of the currently selected row.
2) the "duplicate rank" column which orders the duplicates.. 1 means first record, 2 means this is the 2nd record and 3 means this is the 3rd record. You can see ba 104 has 2 records in total... and it is ranked 1 and 2.
3) from the resulting (possibly editable) query.. I should be able to filter out (using where) all the duplicate ranks that are > 1... then able to delete those records.
So.. id 8, 3 and 10 are > 1.. and I should be able to delete them with in this query... by clicking on the row and delete key.
If the condition 3 is not entirely achievable.. please give me the best way possible. Thanks.

This SQL will get you the results as per your question, however it won't work as part of a DELETE query, I suggest SELECTING from this query into a temporary table and then running a DELETE query from that : )
SELECT A.id, A.flight, A.fltno, A.stad, A.col1, A.col2, B.concount AS [duplicate count], (SELECT Count(C.id) FROM tblfit As C WHERE C.flight&C.fltno&C.stad=A.concat AND C.id <= A.id) AS [duplicate rank]
FROM (SELECT tblfit.*, [flight] & [fltno] & [stad] AS concat
FROM tblfit) AS A,
(SELECT [flight] & [fltno] & [stad] AS concat, Count([concat]) AS concount
FROM tblfit
GROUP BY [flight] & [fltno] & [stad]) AS B
WHERE A.concat = B.concat;

Added a column in the table where the value is always 1 called
countValue
Then the first query for duplicate count is
SELECT tableA.flight, tableA.fltno, tableA.stad, Sum(tableA.countValue) AS duplicateCount
FROM tableA
GROUP BY tableA.flight, tableA.fltno, tableA.stad;
Then the second query for duplicate rank (ranked by id number) is
SELECT (SELECT Count(*)+1 FROM tableA WHERE id < temp.id AND stad = temp.stad AND flight = temp.flight AND fltno = temp.fltno) AS flightRank, temp.id, temp.flight, temp.fltno, temp.stad
FROM tableA AS temp;
Then you can join them
SELECT tableA.id, tableA.flight, tableA.fltno, tableA.stad, tableA.col1, tableA.col2, queryCounts.duplicateCount, queryRanking.flightRank
FROM (tableA INNER JOIN queryRanking ON tableA.id = queryRanking.id) INNER JOIN queryCounts ON (tableA.stad = queryCounts.stad) AND (tableA.fltno = queryCounts.fltno) AND (tableA.flight = queryCounts.flight);
Then regarding the delete query read this thread since you need to delete using joins
How to delete in MS Access when using JOIN's?

This will solve question 1. I do not believe all 3 questions can be answered in the same query.
Select Flight, FltNo, Stad, Sum(1) as DupCnt
from FLT
Group By Flight, FltNo, Stad
order by Sum(1) DESC
How do you know you want to delete 8 and 3, and not 4 and 10?

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

CTE query not returning duplicate rows - sql

Related

SQL query to join smae table multiple times

SQL Server: Each GROUP BY expression must contain at least one column that is not an outer reference

Modify my SQL Server query -- returns too many rows sometimes

Order by data as per supplied Id in sql

Access 2010 SQL - Show Duplicate Records in Order for eventual Deletion (pure SQL Solution pls)

Categories

Resources