How to get datetime duplicate rows in SQL Server? - sql

Im trying to find duplicate DATETIME rows in a table,
My column has datetime values such as 2015-01-11 11:24:10.000.
I must get the duplicates in 2015-01-11 11:24 type. Rest of it, not important. I can get the right value when I use SELECT with 'convert(nvarchar(16),column,121)', but when I put this in my code, I have to use 'group by' statement, so
My code is:
SELECT ID,
RECEIPT_BARCODE,
convert(nvarchar(16),TRANS_DATE,121),
PTYPE
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP BY ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121),PTYPE
HAVING COUNT(convert(nvarchar(16),TRANS_DATE,121)) > 1
Since SQL forces me to use 'convert(nvarchar(16),TRANS_DATE,121)' in GROUP BY statement, I can't get the duplicate values.
Any idea for this?
Thanks in advance.

If you want the actual rows that are duplicated, then use window functions instead:
SELECT th.*, convert(nvarchar(16),TRANS_DATE,121)
FROM (SELECT th.*, COUNT(*) OVER (PARTITION BY convert(nvarchar(16),TRANS_DATE,121)) as cnt
FROM TRANSACTION_HEADER th
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
) th
WHERE cnt > 1;

SELECT ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE ,COUNT(*)
FROM TRANSACTION_HEADER
WHERE TRANS_DATE BETWEEN '11.01.2015' AND '12.01.2015'
GROUP ID,RECEIPT_BARCODE,convert(nvarchar(16),TRANS_DATE,121), PTYPE
HAVING COUNT(*)>1;
I think you can use count(*) directly here.try the above one.

Related

SQL - When result is duplicated on 2 fields remove all

When i run this query
SELECT
DT.CONTRACT_NUMBER,
DT.ROLE,
DT.TAX_ID,
DT.EFFECTIVE_DATE
FROM DATA_TABLE DT
I get this result.
Id like to remove results where the TAX ID appears more than once for each contract.
i.e This result would be gone. If they had 3 results they would be gone.
I think window functions might be the way to go:
SELECT DT.CONTRACT_NUMBER, DT.ROLE, DT.TAX_ID, DT.EFFECTIVE_DATE
FROM (SELECT DT.CONTRACT_NUMBER, DT.ROLE, DT.TAX_ID, DT.EFFECTIVE_DATE,
COUNT(*) OVER (PARTITION BY TAX_ID) as cnt
FROM DATA_TABLE DT
WHERE DT.CONTRACT_NUMBER = '551000280'
) DT
WHERE CNT = 1;
If you actually want to keep one row per tax id, then use row_number() instead of count(*).

Group by function in sql

It is throwing a error for the group by function. how do i fix it
SELECT TO_CHAR(TRANSBDATE,'MON-YYYY')PERIOD,B.CURCODE,COUNT(DISTINCT CUSTOMERID) UNIQUECUSTOMERS,COUNT(CURCODE)TRANS
FROM FX_TRANSHEADER A,FX_TRANSDETAIL B
WHERE A.TRANSNO = B.TRANSNO AND A.TRANSBDATE BETWEEN '01-JAN-2018' AND '30-JUN-2019' AND C_IDTYPE NOT IN ('CR')
ORDER BY 1 , 4
GROUP BY TO_CHAR(TRANSBDATE,'MON-YYYY')PERIOD,COUNT(DISTINCT CUSTOMERID),COUNT(CURCODE)
The GROUP BY only needs the fields on which you want to aggregate:
GROUP BY TO_CHAR(TRANSBDATE,'MON-YYYY')PERIOD, B.CURCODE
So not on COUNT(B.CURCODE) but on CURCODE
SELECT TO_CHAR(TRANSBDATE,'MON-YYYY')PERIOD,B.CURCODE,COUNT(DISTINCT CUSTOMERID) UNIQUECUSTOMERS,COUNT(CURCODE)TRANS
FROM FX_TRANSHEADER A,FX_TRANSDETAIL B
WHERE A.TRANSNO = B.TRANSNO AND A.TRANSBDATE BETWEEN '01-JAN-2018' AND '30-JUN-2019' AND C_IDTYPE NOT IN ('CR')
GROUP BY TO_CHAR(TRANSBDATE,'MON-YYYY'),B.CURCODE
ORDER BY 1 , 4
You cannot include aggregation functions in the GROUP BY. You only want the unaggregated columns from the SELECT.
I would recommend writing the query as:
SELECT TO_CHAR(H.TRANSBDATE,'MON-YYYY') as PERIOD,
D.CURCODE,
COUNT(DISTINCT ?.CUSTOMERID) as UNIQUECUSTOMERS,
COUNT(*) as TRANS
FROM FX_TRANSHEADER H JOIN
FX_TRANSDETAIL D
ON H.TRANSNO = D.TRANSNO
WHERE H.TRANSBDATE >= DATE '2018-01-01' AND
H.TRANSBDATE < DATE '2018-07-01' AND
?.C_IDTYPE NOT IN ('CR')
ORDER BY 1 , 4
GROUP BY TO_CHAR(H.TRANSBDATE, 'MON-YYYY');
The ? is for the column aliases for the tables that have the specified columns.
Notes:
Never use commas in the FROM clause. Always use proper, explicit, standard JOIN syntax.
Use meaningful table aliases (table abbreviations) rather than arbitrary letters.
Use DATE to represent date literals, so your code is not tied to particular internationalization settings on your server.
Replaced BETWEEN with two direct comparisons. This is particularly important in databases such as Oracle, where the DATE data type has a time component.

sql oracle group by subquery

I get the same ecommerce number for each date. I am trying to get ecommerce value count depending on the date, which is different for each date as the total number is only 105 for all October, not 391958.
Any idea how to group by the output of a subquery?
Thank you!
SELECT to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,
(
SELECT count(*)
FROM ft_t_wcs1 wcs1,ft_t_stup stup
WHERE stup.modl_id='ECOMMERC'
AND stup.CROSS_REF_ID=wcs1.acct_id
AND stup.end_tms IS NULL
) AS ecommerce
FROM ft_t_wcs1 wcs1, ft_t_stup stup
WHERE wcs1.scenario='CREATE'
AND wcs1.acct_id IS NOT NULL
AND wcs1.start_tms BETWEEN add_months(TRUNC(SYSDATE,'mm'),-1) AND LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
GROUP BY to_char(wcs1.start_tms,'DD/MM/YYYY')
ORDER BY to_char(wcs1.start_tms,'DD/MM/YYYY');
OUTPUT
Try below modified queries
select to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,count(*) AS
ecommerce
from ft_t_wcs1 wcs1, ft_t_stup stup
where stup.modl_id='ECOMMERC' and stup.CROSS_REF_ID=wcs1.acct_id and stup.end_tms is null wcs1.scenario='CREATE' and wcs1.acct_id is not null and
wcs1.start_tms between add_months(TRUNC(SYSDATE,'mm'),-1) and
LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
group by to_char(wcs1.start_tms,'DD/MM/YYYY')
order by to_char(wcs1.start_tms,'DD/MM/YYYY');
-- Another way using JOIN clause
select to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,count(*) AS
ecommerce
from ft_t_wcs1 wcs1
join ft_t_stup stup
ON stup.CROSS_REF_ID=wcs1.acct_id
where stup.modl_id='ECOMMERC' and stup.end_tms is null wcs1.scenario='CREATE' and wcs1.acct_id is not null and
wcs1.start_tms between add_months(TRUNC(SYSDATE,'mm'),-1) and
LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
group by to_char(wcs1.start_tms,'DD/MM/YYYY')
order by to_char(wcs1.start_tms,'DD/MM/YYYY');
It's hard to suggest an answer without understanding your table relationship, but I can tell that your problem is there is no relationship between your subquery and your main query. Your subquery simply returns a count where modl_id='ECOMMERC', so that value will always be the same - in your case, 105. You need to add a JOIN criteria to the subquery that ties the unique value to your main query. You'll also want to alias the table names differently to ensure you're joining correctly.
You are doing unnecessary joins when you just want a correlated subquery:
SELECT to_char(wcs1.start_tms,'DD/MM/YYYY') as dates,
(SELECT count(*)
FROM ft_t_stup stup
WHERE stup.modl_id= 'ECOMMERC' AND
stup.CROSS_REF_ID = wcs1.acct_id
stup.end_tms IS NULL
) AS ecommerce
FROM ft_t_wcs1 wcs1
WHERE wcs1.scenario = 'CREATE' AND
wcs1.acct_id IS NOT NULL AND
wcs1.start_tms BETWEEN add_months(TRUNC(SYSDATE,'mm'),-1) AND LAST_DAY(add_months(TRUNC(SYSDATE,'mm'),-1))
GROUP BY to_char(wcs1.start_tms, 'DD/MM/YYYY')
ORDER BY to_char(wcs1.start_tms, 'DD/MM/YYYY');

Got a error message when I try to find out which patient account have duplicated record.

When I run the script below, I got a error message "Cannot perform an aggregate function on an expression containing an aggregate or a subquery" Please provide some advice. Thanks
SELECT
CONVERT(DECIMAL(18,5),SUM(CASE WHEN PATIENT_ACCOUNT_NO IN (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING ( COUNT(PATIENT_ACCOUNT_NO) > 1)) THEN 0 ELSE 1 END)) dupPatNo
FROM [DBO].[STND_ENCOUNTER]
I think the error message is pretty clear. You have a sum() function with a subquery in it (albeit within a case, but that doesn't matter).
It seems that you want to choose patients that have more than one encounter, then add 0 if the patients is in the list and 1 if the patient is not. Hmmm. . . sounds like you want to count the number of patients with only one encounter.
Try using this logic instead:
select count(*)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters = 1;
As a note, the variable you are assigning is called DupPatientNo. This sounds like the number of patients that have duplicates. In that case, the query is:
select count(distinct PATIENT_ACCOUNT_NO)
from (select se.*, count(*) over (partition by PATIENT_ACCOUNT_NO) as NumEncounters
from dbo.stnd_encounter se
) se
where NumEncounters > 1;
(Or use count(*) if you want the number of encounters on duplicate patients.)
If you want to find number of PATIENT_ACCOUNT_NO that does not have any duplicates then use the following
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) = 1
) dupPatNo
If you want to find number of PATIENT_ACCOUNT_NO that have atleast one duplicate then use the following
SELECT COUNT(DISTINCT dupPatNo.PATIENT_ACCOUNT_NO)
FROM (
SELECT PATIENT_ACCOUNT_NO
FROM STND_ENCOUNTER
GROUP BY PATIENT_ACCOUNT_NO
HAVING COUNT(PATIENT_ACCOUNT_NO) > 1
) dupPatNo
Use of DISTINCT will make the query not count same item again and again
Though your query looks for first result, its not clear what you want. Hence giving query for both

Skip rows for specific time in SQL

Need a help.
I have two timestamp columns, so basically I want to get the max and min value with a thirD column showing as timedifference. I am skipping any 12.am time so used the syntax below. ANy help how to achieve the third column, timedifference.. It is in DB2.
SELECT EMPID,MIN(STARTDATETIME),MAX(ENDDATETIME)
FROM TABLE
WHERE DATE(STARTDATETIME)= '2012-05-15' AND HOUR(STARTDATETIME)<>0 AND HOUR(ENDDATETIME)<>0
GROUP BY EMPID
You can use the results from that in an inner select, and use those values to define the TimeDifference column. My knowledge of DB2 is very limited, so I'm making some assumptions, but this should give you an idea. I'll update the answer if something is drastically incorrect.
Select EmpId,
MinStartDate,
MaxEndDate,
MaxEndDate - MinStartDate As TimeDifference
From
(
Select EMPID,
MIN(STARTDATETIME) As MinStartDate,
MAX(ENDDATETIME) As MaxEndDate
From Table
Where DATE(STARTDATETIME) = '2012-05-15'
And HOUR(STARTDATETIME) <> 0
And HOUR(ENDDATETIME) <> 0
Group By EMPID
) A