Find duplicates within a given date range

I'm trying to write a query that returns all entries in the table that have a duplicate signum (identifier) and fall inside a given date range.
I tried something like this:
select * from dmt_teamresource where dmt_teamresource.signum IN (
select dmt_teamresource.signum from dmt_teamresource group by dmt_teamresource.signum having count((dmt_teamresource.signum)) > 1)
which returns all entries from dmt_teamresource that are duplicates. Now I need to add the date range somehow. Was thinking of something like:
where dmt_teamresource.startdate between '2021-01-31' and '2021-12-31'
But don't know how to combine the queries.

Use window functions:
select tr.*
from (select tr.*, count(*) over (partition by tr.signum) as cnt
from dmt_teamresource tr
where . . . -- whatever conditions you want here
) tr
where cnt >= 2;
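To make the pattern above concrete, here is a minimal sketch with the date range from the question placed in the inner where. The id column, the column types and the sample rows are assumptions made up for illustration; the multi-row insert syntax is assumed to be supported by your database.

```sql
-- Hypothetical table layout and sample data (not from the question).
CREATE TABLE dmt_teamresource (
    id        integer PRIMARY KEY,
    signum    varchar(20),
    startdate date
);

INSERT INTO dmt_teamresource (id, signum, startdate) VALUES
    (1, 'abc', '2021-02-01'),
    (2, 'abc', '2021-06-15'),
    (3, 'abc', '2022-03-01'),   -- outside the date range
    (4, 'xyz', '2021-03-10'),
    (5, 'xyz', '2020-12-31');   -- outside the date range

-- The date filter runs before the window count, so cnt only counts
-- rows that fall inside the range.
SELECT tr.*
FROM (SELECT t.*, COUNT(*) OVER (PARTITION BY t.signum) AS cnt
      FROM dmt_teamresource t
      WHERE t.startdate BETWEEN '2021-01-31' AND '2021-12-31'
     ) tr
WHERE tr.cnt >= 2;
```

On this sample data only rows 1 and 2 should come back: 'xyz' has two rows overall but only one inside the range. If you instead want signums that are duplicated anywhere in the table and then only show their in-range rows, the date condition would have to move to the outer where.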

Related

The right way to use CTE

I'm new to Common Table Expressions and I think I need to use one in order to achieve what I require.
If I run the following script -
select MainRentAccountReference,EffectiveFromDate,CollectionDay,NumberOfCollections,DirectDebitTotalOverrideAmount
from DirectDebitApportionment
where id = 1
It would give me the below results -
So for each row that my CTE would return (one per unique MainRentAccountReference), I want to create rows based on the following criteria:
3 rows, as NumberOfCollections is set to 3.
The following dates on each row: 01/05/18, 01/06/18, 01/07/18, so basically plus one month each time.
However, if CollectionDay were set to, say, 10, then I would want the 3 dates to be 10/05/18, 10/06/18, 10/07/18.
Finally, each row should have a value of DirectDebitTotalOverrideAmount divided by NumberOfCollections.
I've been playing about with this and can get nowhere near the results I'm trying to achieve. Any help would be greatly appreciated. Thanks

You can do this with a recursive CTE
with t as (
select *
from DirectDebitApportionment
where id = 1
),
cte as (
select . . ., 1 as collection, DirectDebitTotalOverrideAmount / NumberOfCollections as collection_amount
from t
union all
select . . ., collection + 1, DirectDebitTotalOverrideAmount / NumberOfCollections as collection_amount
from cte
where collection < NumberOfCollections
)
select . . .
from cte;
In some dialects of SQL, you need the recursive keyword: PostgreSQL and MySQL require with recursive, whereas SQL Server does not accept it.
This can also be accomplished using a numbers table, which can be more efficient than the recursive CTE (although recursive CTEs often perform surprisingly well).
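For the numbers-table alternative, a rough sketch could look like the following. It assumes SQL Server (DATEFROMPARTS and DATEADD are T-SQL functions), an inline list of numbers 1 to 12 standing in for a real numbers table, and made-up column types and sample data; none of that comes from the question.

```sql
-- Hypothetical table definition and sample row (types are assumptions).
CREATE TABLE DirectDebitApportionment (
    id                             int,
    MainRentAccountReference       varchar(20),
    EffectiveFromDate              date,
    CollectionDay                  int,
    NumberOfCollections            int,
    DirectDebitTotalOverrideAmount decimal(18,2)
);

INSERT INTO DirectDebitApportionment VALUES
    (1, 'ACC001', '2018-05-01', 10, 3, 300.00);

-- Inline numbers 1..12; a permanent numbers table would be joined the same way.
WITH numbers AS (
    SELECT n
    FROM (VALUES (1),(2),(3),(4),(5),(6),(7),(8),(9),(10),(11),(12)) AS v(n)
)
SELECT d.MainRentAccountReference,
       n.n AS collection,
       -- First collection falls on CollectionDay of the starting month,
       -- each later one is shifted by one more month.
       DATEADD(month, n.n - 1,
               DATEFROMPARTS(YEAR(d.EffectiveFromDate),
                             MONTH(d.EffectiveFromDate),
                             d.CollectionDay)) AS collection_date,
       CONVERT(decimal(18,2),
               d.DirectDebitTotalOverrideAmount / d.NumberOfCollections) AS collection_amount
FROM DirectDebitApportionment d
JOIN numbers n
  ON n.n <= d.NumberOfCollections
WHERE d.id = 1
ORDER BY d.MainRentAccountReference, n.n;
```

With the sample row this should give three rows dated the 10th of May, June and July 2018. The join condition n.n <= NumberOfCollections plays the role of the recursion's stop condition, so the inline list has to be at least as long as the largest NumberOfCollections you expect.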

This seems to do the trick based on the pointers that Gordon gave me -
with t as (
select MainRentAccountReference,EffectiveFromDate,CollectionDay,NumberOfCollections,DirectDebitTotalOverrideAmount
from DirectDebitApportionment
where id = 1
),
cte as (
select 1 as collection
,t.MainRentAccountReference
,convert(decimal(18,2),DirectDebitTotalOverrideAmount / NumberOfCollections) as collection_amount
,NumberOfCollections
,convert(datetime,DATEFROMPARTS ( DATEPART(YEAR,EffectiveFromDate), DATEPART(MONTH,EffectiveFromDate), CollectionDay )) AS EffectiveFromDate
,CollectionDay
from t
union all
select collection + 1,MainRentAccountReference,collection_amount,NumberOfCollections,DATEADD(M,1,EffectiveFromDate),CollectionDay
from cte
where collection < cte.NumberOfCollections
)
select *
from cte
Order by MainRentAccountReference,collection
;
Gives me the following results -

Count of id per day using window function

I'm trying to count the track_uri values associated with a given playlist_uri on each day in a one-month window, and have composed the following SQL:
SELECT
playlist_uri, playlist_date, track_uri, count(track_uri)
over (partition by playlist_uri, playlist_date) as count_tracks
FROM
tbl1
WHERE
_PARTITIONTIME BETWEEN '2017-09-09' AND '2017-10-09'
AND playlist_uri in (
SELECT playlist_uri from tbl2 WHERE playlist_owner = "spotify"
)
However I am getting the following output:
I instead would like it to show me the count of track_uri for each playlist_uri on each day.
Would really appreciate some help with this.

Not sure if I understand your question correctly, but you might not need a window function for that:
SELECT
playlist_uri, playlist_date, COUNT(DISTINCT track_uri)
FROM
tbl1
WHERE
_PARTITIONTIME BETWEEN '2017-09-09' AND '2017-10-09'
AND playlist_uri in (
SELECT playlist_uri from tbl2 WHERE playlist_owner = "spotify"
)
GROUP BY 1, 2;
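To see why the window version and the GROUP BY version return different shapes, here is a small sketch on made-up data. The column types and rows are assumptions, and the _PARTITIONTIME filter and the tbl2 lookup are left out so the script stays self-contained.

```sql
-- Hypothetical sample data (not from the question).
CREATE TABLE tbl1 (
    playlist_uri  varchar(50),
    playlist_date date,
    track_uri     varchar(50)
);

INSERT INTO tbl1 VALUES
    ('p1', '2017-09-10', 't1'),
    ('p1', '2017-09-10', 't2'),
    ('p1', '2017-09-11', 't1'),
    ('p2', '2017-09-10', 't3');

-- Window version: one output row per input row, with the count repeated
-- on every row of the same (playlist_uri, playlist_date) partition.
SELECT playlist_uri, playlist_date, track_uri,
       COUNT(track_uri) OVER (PARTITION BY playlist_uri, playlist_date) AS count_tracks
FROM tbl1;

-- Aggregate version: one output row per (playlist_uri, playlist_date).
SELECT playlist_uri, playlist_date,
       COUNT(DISTINCT track_uri) AS count_tracks
FROM tbl1
GROUP BY playlist_uri, playlist_date;
```

The first query should return four rows (one per track), the second three rows (one per playlist and day), which is the shape asked for in the question.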

How to get datetime duplicate rows in SQL Server?

I'm trying to find duplicate DATETIME rows in a table.
My column has datetime values such as 2015-01-11 11:24:10.000.
I need to find duplicates at minute precision, such as 2015-01-11 11:24; the rest of the value is not important. I get the right value when I use SELECT with 'convert(nvarchar(16),column,121)', but when I put this in my code, I have to use a 'group by' statement.
My code is:
SELECT ID,
RECEIPT_BARCODE,
convert(nvarchar(16),TRANS_DATE,121),
PTYPE
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP BY ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121),PTYPE
HAVING COUNT(convert(nvarchar(16),TRANS_DATE,121)) > 1
Since SQL forces me to use 'convert(nvarchar(16),TRANS_DATE,121)' in GROUP BY statement, I can't get the duplicate values.
Any idea for this?
Thanks in advance.

If you want the actual rows that are duplicated, then use window functions instead:
SELECT th.*, convert(nvarchar(16),TRANS_DATE,121)
FROM (SELECT th.*, COUNT(*) OVER (PARTITION BY convert(nvarchar(16),TRANS_DATE,121)) as cnt
FROM TRANSACTION_HEADER th
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
) th
WHERE cnt > 1;
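As a self-contained sketch of the window-function approach on SQL Server, with made-up rows: the column types, the sample data and the half-open date range (covering 11 and 12 January 2015 with language-independent literals) are assumptions, not something stated in the question.

```sql
-- Hypothetical table definition and sample rows.
CREATE TABLE TRANSACTION_HEADER (
    ID              int PRIMARY KEY,
    RECEIPT_BARCODE varchar(20),
    TRANS_DATE      datetime,
    PTYPE           int
);

INSERT INTO TRANSACTION_HEADER VALUES
    (1, 'A1', '2015-01-11T11:24:10', 1),
    (2, 'A2', '2015-01-11T11:24:55', 1),   -- same minute as ID 1
    (3, 'A3', '2015-01-11T11:25:02', 1);   -- different minute

-- Style 121 renders 'yyyy-mm-dd hh:mi:ss.mmm'; the first 16 characters
-- keep the value down to the minute.
SELECT th.ID, th.RECEIPT_BARCODE, th.trans_minute, th.PTYPE
FROM (SELECT t.*,
             CONVERT(nvarchar(16), t.TRANS_DATE, 121) AS trans_minute,
             COUNT(*) OVER (PARTITION BY CONVERT(nvarchar(16), t.TRANS_DATE, 121)) AS cnt
      FROM TRANSACTION_HEADER t
      WHERE t.TRANS_DATE >= '20150111' AND t.TRANS_DATE < '20150113'
     ) th
WHERE th.cnt > 1;
```

On this data rows 1 and 2 should be returned, since both fall in 2015-01-11 11:24. Because ID is not part of the partition, rows with different IDs can still count as duplicates of each other.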

SELECT ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE ,COUNT(*)
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP BY ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE
HAVING COUNT(*)>1;
I think you can use count(*) directly here. Try the query above.

Got a error message when I try to find out which patient account have duplicated record.

When I run the script below, I get the error message "Cannot perform an aggregate function on an expression containing an aggregate or a subquery". Please provide some advice. Thanks
SELECT
CONVERT(DECIMAL(18,5),SUM(CASE WHEN PATIENT_ACCOUNT_NO IN (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING ( COUNT(PATIENT_ACCOUNT_NO) > 1)) THEN 0 ELSE 1 END)) dupPatNo
FROM [DBO].[STND_ENCOUNTER]

I think the error message is pretty clear. You have a sum() function with a subquery in it (albeit within a case, but that doesn't matter).
It seems that you want to choose patients that have more than one encounter, then add 0 if the patient is in the list and 1 if the patient is not. Hmmm. . . sounds like you want to count the number of patients with only one encounter.
Try using this logic instead:
select count(*)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters = 1;
As a note, the column alias you are assigning is called dupPatNo. This sounds like the number of patients that have duplicates. In that case, the query is:
select count(distinct PATIENT_ACCOUNT_NO)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters > 1;
(Or use count(*) if you want the number of encounters on duplicate patients.)
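If it helps to see the three counts side by side, here is a sketch on made-up data. It swaps in a different technique from the queries above: group by patient first, then use conditional aggregation on the per-patient counts. The encounter_id column and the sample rows are assumptions.

```sql
-- Hypothetical sample data: P1 has 2 encounters, P2 has 1, P3 has 3.
CREATE TABLE stnd_encounter (
    encounter_id       integer PRIMARY KEY,
    patient_account_no varchar(20)
);

INSERT INTO stnd_encounter VALUES
    (1, 'P1'), (2, 'P1'),
    (3, 'P2'),
    (4, 'P3'), (5, 'P3'), (6, 'P3');

-- The inner query yields one row per patient, so the CASE expressions
-- in the outer SUMs no longer contain a subquery or another aggregate.
SELECT SUM(CASE WHEN NumEncounters = 1 THEN 1 ELSE 0 END) AS patients_with_one_encounter,
       SUM(CASE WHEN NumEncounters > 1 THEN 1 ELSE 0 END) AS patients_with_duplicates,
       SUM(CASE WHEN NumEncounters > 1 THEN NumEncounters ELSE 0 END) AS encounters_on_duplicate_patients
FROM (SELECT patient_account_no, COUNT(*) AS NumEncounters
      FROM stnd_encounter
      GROUP BY patient_account_no
     ) per_patient;
```

For the sample rows this should give 1, 2 and 5 respectively.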

If you want to find the number of PATIENT_ACCOUNT_NO values that do not have any duplicates, use the following:
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) = 1
) dupPatNo
If you want to find the number of PATIENT_ACCOUNT_NO values that have at least one duplicate, use the following:
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) > 1
) dupPatNo
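Using DISTINCT keeps the query from counting the same item more than once (after the GROUP BY each PATIENT_ACCOUNT_NO already appears only once, so here it is a safeguard rather than a requirement).
Your query appears to be after the first result, but it's not clear what you want, hence a query for each.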

Sum in subquery

My Query is
select count(*) as cnt,
EXTRACT(day FROM current_date - min(txdate))::int as days,
sum (Select opening from acledgerbal l
where acname='Arv'
union all
Select sum(v2.debit-v2.credit) as opening from acvoucher2 v2 where
txdate<='05/03/2014') as opening
from acduebills acb,acledger l
where (acb.opening+acb.debit-acb.credit) > 0
and acb.unitname='Sales'
and l.acname='Arv'
and l.acno=acb.acno
Here it shows the error "more than one row returned by a subquery used as an expression".
How do I use sum over the subquery?
I'm using PostgreSQL 9.1.
EDIT:
I want to get the count of rows in the acduebills table for which (acb.opening+acb.debit-acb.credit) > 0 and acb.unitname='Sales'. After that I want the number of days since the minimum date under the same condition. After that I want the opening balance, which comes from two tables: acledgerbal and acvoucher2. acvoucher2 is the table filtered by the txdate condition.
How do I get those details in a single query? And how do I get the same details across multiple schemas?

Something like this:
SELECT count(*) AS cnt
, current_date - min(txdate)::date AS days -- subtract dates directly
, (SELECT round(sum(opening)::numeric, 2)
FROM (
SELECT opening
FROM acledgerbal
WHERE acname = 'Arv'
UNION ALL
SELECT debit - credit
FROM acvoucher2
WHERE txdate <= '2014-05-03'
) sub
) AS opening
FROM acduebills b
JOIN acledger l USING (acno)
WHERE ((b.opening + b.debit) - b.credit) > 0
AND b.unitname ='Sales'
AND l.acname = 'Arv';
round() to decimal places only works with type numeric, so I cast the sum.
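The key move above is wrapping the UNION ALL in a derived table and summing over it, so the scalar subquery returns exactly one row. A stripped-down sketch of just that part, with made-up column types and sample rows (assumptions, not the asker's schema):

```sql
-- Hypothetical tables and data.
CREATE TABLE acledgerbal (acname varchar(20), opening numeric(12,2));
CREATE TABLE acvoucher2  (txdate date, debit numeric(12,2), credit numeric(12,2));

INSERT INTO acledgerbal VALUES
    ('Arv',   100.00),
    ('Other', 999.00);          -- filtered out by acname

INSERT INTO acvoucher2 VALUES
    ('2014-04-01',  50.00, 20.00),
    ('2014-05-02',  10.00,  0.00),
    ('2014-06-01', 500.00,  0.00);   -- after the cutoff date

-- The UNION ALL may produce many rows; sum() over the derived table
-- collapses them to a single value, which is what a scalar subquery needs.
SELECT (SELECT sum(opening)
        FROM (SELECT opening
              FROM acledgerbal
              WHERE acname = 'Arv'
              UNION ALL
              SELECT debit - credit
              FROM acvoucher2
              WHERE txdate <= '2014-05-03'
             ) sub
       ) AS opening;
```

With these rows the result should be 140.00 (100.00 + 30.00 + 10.00).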

The problem is in the following statement:
sum ( Select opening from acledgerbal l
where acname='Arv'
union all
Select sum(v2.debit-v2.credit) as opening from acvoucher2 v2 where
txdate<='05/03/2014' )
You use UNION ALL, so this subquery can return two or more rows: one or more from acledgerbal plus one from acvoucher2. A subquery used as an expression must return a single row, hence the error "more than one row returned by a subquery used as an expression".
Try changing it to:
(Select SUM(opening) from acledgerbal l WHERE acname='Arv')
+
(Select SUM(v2.debit-v2.credit) as opening from acvoucher2 v2
WHERE txdate<='05/03/2014')
