Finding Duplicate Dates by a certain ID in a single table

Here is what my data looks like
ID Date Other Field
123 1/2/2017 a
123 1/3/2017 b
123 1/3/2017 c
123 1/5/2017 d
123 1/6/2017 e
123 1/6/2017 f
456 2/4/2017 g
456 2/4/2017 h
456 2/5/2017 i
456 2/6/2017 j
I want to identify dates that are duplicated within an ID and return the entire record for each occurrence. The results would look like:
ID Date Other Field
123 1/3/2017 b
123 1/3/2017 c
123 1/6/2017 e
123 1/6/2017 f
456 2/4/2017 g
456 2/4/2017 h
I am not sure how to identify duplicate dates within a single table and then return only those records, one row per record.
Any help is greatly appreciated!

select t1.*
from tbl t1 -- tbl stands in for your table name ("table" itself is a reserved word)
join (
    select id, date, count(*)
    from tbl
    group by id, date
    having count(*) >= 2
) t2
  on t1.id = t2.id and t1.date = t2.date
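
The join against the grouped subquery can be tried out quickly with Python's built-in sqlite3 module. This is only a sketch: the table name tbl, the column name dt (used instead of date) and the ISO date strings are assumptions made for the demo, not part of the original question.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, dt TEXT, other_field TEXT)")
rows = [
    (123, "2017-01-02", "a"), (123, "2017-01-03", "b"), (123, "2017-01-03", "c"),
    (123, "2017-01-05", "d"), (123, "2017-01-06", "e"), (123, "2017-01-06", "f"),
    (456, "2017-02-04", "g"), (456, "2017-02-04", "h"), (456, "2017-02-05", "i"),
    (456, "2017-02-06", "j"),
]
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

# Find (id, dt) pairs that occur at least twice, then join back to get full rows.
duplicates = conn.execute("""
    SELECT t1.id, t1.dt, t1.other_field
    FROM tbl t1
    JOIN (SELECT id, dt FROM tbl GROUP BY id, dt HAVING COUNT(*) >= 2) t2
      ON t1.id = t2.id AND t1.dt = t2.dt
    ORDER BY t1.id, t1.dt, t1.other_field
""").fetchall()

for row in duplicates:
    print(row)
```

With the sample data this returns the six rows b, c, e, f, g and h, which matches the result the question asks for.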

What database are you using? If it supports window functions (e.g. Oracle or PostgreSQL), then:
select
    id, "date", other_field
from
(
    select
        tbl.*,
        count(*) over (partition by id, "date") date_cnt
    from tbl
) t -- many databases, including PostgreSQL before version 16, require an alias on a derived table
where date_cnt >= 2
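
The window-function variant can be checked the same way. The sketch below assumes the SQLite bundled with your Python is version 3.25 or newer (the first release with window functions); the table name tbl, the column name dt and the trimmed sample rows are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, dt TEXT, other_field TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (123, "2017-01-02", "a"),
        (123, "2017-01-03", "b"),
        (123, "2017-01-03", "c"),
        (456, "2017-02-04", "g"),
        (456, "2017-02-04", "h"),
        (456, "2017-02-05", "i"),
    ],
)

# COUNT(*) OVER (...) attaches the size of each (id, dt) group to every row,
# so the outer query only has to filter on that count.
windowed = conn.execute("""
    SELECT id, dt, other_field
    FROM (
        SELECT tbl.*, COUNT(*) OVER (PARTITION BY id, dt) AS date_cnt
        FROM tbl
    ) AS t
    WHERE date_cnt >= 2
    ORDER BY id, dt, other_field
""").fetchall()

for row in windowed:
    print(row)
```

Compared with the join version, this reads the table once and needs no self-join, which is usually the main reason to prefer it where window functions are available.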

Related

Netezza SQL Join dataset A to dataset B but pull fields from B when b_date > a_date

I have two datasets from two different sources, but many of the members appear in both. My select statement is:
select a.member_id, a.start_date, a.customer_id, a.region_id, b.b_start_date, b.customer_id, b.region_id
from dataset1 a
left join dataset2 b
on a.member_id = b.member_id
I want all records in A, plus the records in B where a.member_id = b.member_id. The fields should come from A when a.start_date >= b.b_start_date, and from B when b.b_start_date > a.start_date.
Here's a small example:
Dataset A:
member_id start_date customer_id region_id
1111 1/30/2021 123 555
2222 1/30/2021 222 555
3333 1/1/2021 345 678
Dataset B:
member_id b_start_date customer_id region_id
1111 1/1/2022 567 444
2222 1/30/2021 222 555
Result:
member_id customer_id region_id
1111 567 444
2222 222 555
3333 345 678

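/* try this: the left join keeps members that exist only in dataset1 */
select a.* from dataset1 a left join dataset2 b using (member_id)
where b.b_start_date is null or a.start_date >= b.b_start_date
union all
select b.* from dataset1 a inner join dataset2 b using (member_id)
where b.b_start_date > a.start_date;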
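
As an alternative to the union, the same "take the newer side" rule can be written as a single left join with CASE expressions. This is a sketch run through Python's sqlite3 module, not the Netezza syntax itself; the ISO date strings are an assumption so that plain string comparison orders the dates correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dataset1 (member_id INTEGER, start_date TEXT,
                           customer_id INTEGER, region_id INTEGER);
    CREATE TABLE dataset2 (member_id INTEGER, b_start_date TEXT,
                           customer_id INTEGER, region_id INTEGER);
    INSERT INTO dataset1 VALUES (1111, '2021-01-30', 123, 555),
                                (2222, '2021-01-30', 222, 555),
                                (3333, '2021-01-01', 345, 678);
    INSERT INTO dataset2 VALUES (1111, '2022-01-01', 567, 444),
                                (2222, '2021-01-30', 222, 555);
""")

# When B has no matching row, b.b_start_date is NULL, the comparison is not
# true, and the ELSE branch falls back to the values from A.
merged = conn.execute("""
    SELECT a.member_id,
           CASE WHEN b.b_start_date > a.start_date
                THEN b.customer_id ELSE a.customer_id END AS customer_id,
           CASE WHEN b.b_start_date > a.start_date
                THEN b.region_id ELSE a.region_id END AS region_id
    FROM dataset1 a
    LEFT JOIN dataset2 b ON a.member_id = b.member_id
    ORDER BY a.member_id
""").fetchall()

for row in merged:
    print(row)
```

On the example data this yields one row per member of A, including member 3333, which has no counterpart in B.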

calculate Count and Sum from two different table with group by without using inner query

I have two tables. The first, A, has the columns id, phone_number and refer_amount; the second, B, has the columns phone_number and transaction_amount.
I want the sum() of refer_amount and transaction_amount and the count() of phone_number from both tables, grouped by phone_number, without using an inner query.
Table A
phone_number refer_amount
123 50
456 80
789 90
123 90
123 80
123 20
456 20
456 79
456 49
123 49
Table B
phone_number transaction_amount
123 50
123 51
123 79
456 22
456 11
456 78
456 66
456 88
456 88
456 66
789 66
789 23
789 78
789 46
I have tried the following query, but it gives me the wrong output:
SELECT a.phone_number,
       COUNT(a.phone_number) AS refer_count,
       SUM(a.refer_amount) AS refer_amount,
       b.phone_number,
       COUNT(b.phone_number) AS total_count,
       SUM(b.transaction_amount) AS transaction_amount
FROM dbo.A AS a, dbo.B AS b
WHERE a.phone_number = b.phone_number
GROUP BY a.phone_number, b.phone_number
output (wrong):
phone_number refer_count refer_amount phone_number transaction_count transaction_amount
123 15 867 123 15 900
456 28 1596 456 28 1676
789 5 450 789 5 291
output (That I want):
phone_number refer_count refer_amount phone_number transaction_count transaction_amount
123 5 289 123 3 180
456 4 228 456 7 419
789 1 90 789 5 291

I would do the aggregations on the B table in a separate subquery, and then join to it:
SELECT
    a.phone_number,
    COUNT(a.phone_number) AS a_cnt,
    SUM(a.refer_amount) AS a_sum,
    COALESCE(b.b_cnt, 0) AS b_cnt,
    COALESCE(b.b_sum, 0) AS b_sum
FROM A a
LEFT JOIN
(
    SELECT
        phone_number,
        COUNT(*) AS b_cnt,
        SUM(transaction_amount) AS b_sum
    FROM B
    GROUP BY phone_number
) b
    ON a.phone_number = b.phone_number
-- the outer aggregates need their own GROUP BY
GROUP BY a.phone_number, b.b_cnt, b.b_sum;
One major potential issue with your current approach is that the join could result in duplicate counting, as a given phone_number record in the A table gets replicated due to the join.
Speaking of joins, note that above I use an explicit join, rather than the implicit one you were using. In general, you should not put commas into the FROM clause.
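
To see the effect of aggregating B before the join, here is a small sketch using Python's sqlite3 module with the rows listed in the question. The id column of A is left out for brevity, and the aliases ra and tb are arbitrary names chosen for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (phone_number INTEGER, refer_amount INTEGER)")
conn.execute("CREATE TABLE B (phone_number INTEGER, transaction_amount INTEGER)")
conn.executemany("INSERT INTO A VALUES (?, ?)", [
    (123, 50), (456, 80), (789, 90), (123, 90), (123, 80),
    (123, 20), (456, 20), (456, 79), (456, 49), (123, 49),
])
conn.executemany("INSERT INTO B VALUES (?, ?)", [
    (123, 50), (123, 51), (123, 79),
    (456, 22), (456, 11), (456, 78), (456, 66), (456, 88), (456, 88), (456, 66),
    (789, 66), (789, 23), (789, 78), (789, 46),
])

# B is collapsed to one row per phone_number first, so joining it to A
# cannot multiply the rows of A.
totals = conn.execute("""
    SELECT ra.phone_number,
           COUNT(ra.phone_number)  AS a_cnt,
           SUM(ra.refer_amount)    AS a_sum,
           COALESCE(tb.b_cnt, 0)   AS b_cnt,
           COALESCE(tb.b_sum, 0)   AS b_sum
    FROM A AS ra
    LEFT JOIN (
        SELECT phone_number, COUNT(*) AS b_cnt, SUM(transaction_amount) AS b_sum
        FROM B
        GROUP BY phone_number
    ) AS tb ON ra.phone_number = tb.phone_number
    GROUP BY ra.phone_number, tb.b_cnt, tb.b_sum
    ORDER BY ra.phone_number
""").fetchall()

for row in totals:
    print(row)
```

For 123 and 456 the counts and sums match the desired output in the question. For 789 the listed sample rows of B contain four transactions, so the sketch reports 4 and 213 for that number.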

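This can help. You don't need a separate aggregate on b.phone_number when you are already filtering on a.phone_number = b.phone_number. DISTINCT is needed when counting the phone number because the join repeats each row.
In a GROUP BY query, every selected column that is not inside an aggregate function must appear in the GROUP BY clause. Note that the sums below are still computed over the joined rows, so they are inflated whenever a phone number has more than one row in the other table.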
select a.phone_number, count(distinct a.phone_number), sum(a.refer_amount),
sum (b.transaction_amount)
from A as a, B as b
where a.phone_number=b.phone_number
group by a.phone_number

exclude complete record if related table has a certain value

I have CLAIMMASTER table like this
CLAIMNO
123
456
789
and another related table PROCSTATUS like this
CLAIMNO PROCCODE
123 111
123 222
456 111
456 222
789 222
I want to exclude every claim from CLAIMMASTER that has a PROCCODE of 111 in PROCSTATUS.
The result should be
CLAIMNO
789
I have tried almost everything I can think of, but the closest result I get is this:
CLAIMNO PROCCODE
123 222
456 222
789 222
I know this should be easy, but I can't figure out the query to do it.
Please help.
Here is the query:
select a.CLAIMNO,b.PROCCODE from dbo.CLAIMMASTER a left join
dbo.PROCSTATUS b on a.CLAIMNO = b.CLAIMNO
where a.CLAIMNO not in (select b.CLAIMNO where b.PROCCODE in ('111'))

If you only need to select CLAIMNO, there is no need for a join.
select a.CLAIMNO from dbo.CLAIMMASTER a
where a.CLAIMNO not in
(select distinct b.CLAIMNO from dbo.PROCSTATUS b where b.PROCCODE = '111')
Also, if CLAIMNO is repeated in the CLAIMMASTER table, you need to use DISTINCT in the select clause.
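
A closely related way to write the exclusion is NOT EXISTS, which behaves the same here and also stays correct if the subquery column can contain NULLs (NOT IN returns no rows at all in that case). The sketch below runs it through Python's sqlite3 module; the dbo. schema prefix is dropped because SQLite has no such schema, and PROCCODE is stored as an integer for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CLAIMMASTER (CLAIMNO INTEGER);
    CREATE TABLE PROCSTATUS (CLAIMNO INTEGER, PROCCODE INTEGER);
    INSERT INTO CLAIMMASTER VALUES (123), (456), (789);
    INSERT INTO PROCSTATUS VALUES (123, 111), (123, 222),
                                  (456, 111), (456, 222),
                                  (789, 222);
""")

# Keep a claim only if no related PROCSTATUS row carries PROCCODE 111.
kept = conn.execute("""
    SELECT a.CLAIMNO
    FROM CLAIMMASTER a
    WHERE NOT EXISTS (
        SELECT 1
        FROM PROCSTATUS b
        WHERE b.CLAIMNO = a.CLAIMNO AND b.PROCCODE = 111
    )
    ORDER BY a.CLAIMNO
""").fetchall()

print(kept)
```

Only claim 789 survives, as in the expected result of the question.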

select a.CLAIMNO
from dbo.CLAIMMASTER a
inner join dbo.PROCSTATUS b
on a.CLAIMNO = b.CLAIMNO
where a.CLAIMNO not in
(select CLAIMNO FROM dbo.PROCSTATUS where PROCCODE in ('111'))

Recursive sql with Left outer join in Teradata

I have a query that joins Table1 and Table2 with a left outer join on 'usage'. Now I have to join that query with Table3 (I guess with recursive SQL) to generate the 'Resulting table'.
I have seen a lot of examples of recursive SQL, but didn't find any that use a left outer join.
My existing query looks like this:
select PIN, startDt, StartTm, usage, Min
from Table1 t1 left Outer join Table2 t2 on t1.usage= t2.usage;
How can I bring Table3 into this query so that ratGrp comes out as a comma-separated list? Please help!
Table1
PIN startDt StartTm usage Min
-----------------------------------------
123 08/03/2014 12:12:00 500 4567
234 08/04/2014 12:12:00 200 4568
.....
Table2
1stCol 2ndCol usage
------------------------
abc 234 500
Table3
PIN ratGrp
-----------------
123 3300
123 100
123 103
234 3300
234 550
Resulting table
PIN startDt StartTm usage Min ratGrp
-----------------------------------------------
123 08/03/2014 12:12:00 500 4567 3300,100,103
234 08/04/2014 12:12:00 200 4568 3300,550

SQL Server : take 1 to many record set and make 1 record per id

I need some help. I need to take the data from these 3 tables and create an output that looks like below. The plan_name_x and pending_tallyx columns are derived to make one line per claim id. Each claim id can be associated to up to 3 plans and I want to show each plan and tally amounts in one record. What is the best way to do this?
Thanks for any ideas. :)
Output result set needed:
claim_id ac_name plan_name_1 pending_tally1 plan_name_2 Pending_tally2 plan_name_3 pending_tally3
-------- ------- ----------- -------------- ----------- -------------- ----------- --------------
1234 abc cooks delux_prime 22 prime_express 23 standard_prime 2
2341 zzz bakers delux_prime 22 standard_prime 2 NULL NULL
3412 azb pasta's prime_express 23 NULL NULL NULL NULL
SQL Server 2005 table to use for the above result set:
company_claims
claim_id ac_name
1234 abc cooks
2341 zzz bakers
3412 azb pasta's
claim_plans
claim_id plan_id plan_name
1234 101 delux_prime
1234 102 Prime_express
1234 103 standard_prime
2341 101 delux_prime
2341 103 standard_prime
3412 102 Prime_express
Pending_amounts
claim_id plan_id Pending_tally
1234 101 22
1234 102 23
1234 103 2
2341 101 22
2341 103 2
3412 102 23

If you know that 3 is always the maximum number of plans, then some left joins will work fine:
select c.claim_id, c.ac_name,
cp1.plan_name as plan_name_1, pa1.pending_tally as pending_tally1,
cp2.plan_name as plan_name_2, pa2.pending_tally as pending_tally2,
cp3.plan_name as plan_name_3, pa3.pending_tally as pending_tally3
from company_claims c
left join claim_plans cp1 on c.claim_id = cp1.claim_id and cp1.plan_id = 101
left join claim_plans cp2 on c.claim_id = cp2.claim_id and cp2.plan_id = 102
left join claim_plans cp3 on c.claim_id = cp3.claim_id and cp3.plan_id = 103
left join pending_amounts pa1 on cp1.claim_id = pa1.claim_id and cp1.plan_id = pa1.plan_id
left join pending_amounts pa2 on cp2.claim_id = pa2.claim_id and cp2.plan_id = pa2.plan_id
left join pending_amounts pa3 on cp3.claim_id = pa3.claim_id and cp3.plan_id = pa3.plan_id
Note that this ties each output column to a fixed plan_id, so a claim with only plans 101 and 103 gets NULL in the second slot rather than in the third.

I would first join all your data so that you get the relevant columns: claim_id, ac_name, plan_name and pending_tally.
Then I would transform this to get the plan name and the pending tally on different rows, with a label tying them together.
After that it should be easy to pivot.
I would tie these steps together with common table expressions.
Here's the query:
with X as (
select cc.*, cp.plan_name, pa.pending_tally,
rank() over (partition by cc.claim_id order by plan_name) as r
from company_claims cc
join claim_plans cp on cp.claim_id = cc.claim_id
join pending_amounts pa on pa.claim_id = cp.claim_id
and pa.plan_id = cp.plan_id
), P as (
select
X.claim_id,
x.ac_name,
x.plan_name as value,
'plan_name_' + cast(r as varchar(max)) as label
from x
union all
select
X.claim_id,
x.ac_name,
cast(x.pending_tally as varchar(max)) as value,
'pending_tally' + cast(r as varchar(max)) as label
from x
)
select claim_id, ac_name, [plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3]
from (select * from P) p
pivot (
max(value)
for label in ([plan_name_1], [pending_tally1],[plan_name_2], [pending_tally2],[plan_name_3], [pending_tally3])
) as pvt
order by pvt.claim_id, ac_name
Here's a fiddle showing it in action: http://sqlfiddle.com/#!3/68f62/10
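
Where PIVOT is not available, the same "number the plans, then spread them into columns" idea can be sketched with ROW_NUMBER and conditional aggregation. This is a swapped-in technique, not the query from the answer above, and it is run through Python's sqlite3 module (SQLite 3.25 or newer is assumed for window functions). Ordering the plans by plan_id is also an assumption; the answer above orders by plan_name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company_claims (claim_id INTEGER, ac_name TEXT)")
conn.execute("CREATE TABLE claim_plans (claim_id INTEGER, plan_id INTEGER, plan_name TEXT)")
conn.execute("CREATE TABLE pending_amounts (claim_id INTEGER, plan_id INTEGER, pending_tally INTEGER)")
conn.executemany("INSERT INTO company_claims VALUES (?, ?)",
                 [(1234, "abc cooks"), (2341, "zzz bakers"), (3412, "azb pasta's")])
conn.executemany("INSERT INTO claim_plans VALUES (?, ?, ?)", [
    (1234, 101, "delux_prime"), (1234, 102, "Prime_express"), (1234, 103, "standard_prime"),
    (2341, 101, "delux_prime"), (2341, 103, "standard_prime"),
    (3412, 102, "Prime_express"),
])
conn.executemany("INSERT INTO pending_amounts VALUES (?, ?, ?)", [
    (1234, 101, 22), (1234, 102, 23), (1234, 103, 2),
    (2341, 101, 22), (2341, 103, 2),
    (3412, 102, 23),
])

# r numbers each claim's plans 1..3; MAX(CASE ...) then picks the value for
# slot r, leaving NULL where a claim has fewer plans.
pivoted = conn.execute("""
    WITH ranked AS (
        SELECT cc.claim_id, cc.ac_name, cp.plan_name, pa.pending_tally,
               ROW_NUMBER() OVER (PARTITION BY cc.claim_id ORDER BY cp.plan_id) AS r
        FROM company_claims cc
        JOIN claim_plans cp ON cp.claim_id = cc.claim_id
        JOIN pending_amounts pa ON pa.claim_id = cp.claim_id
                               AND pa.plan_id = cp.plan_id
    )
    SELECT claim_id, ac_name,
           MAX(CASE WHEN r = 1 THEN plan_name END)     AS plan_name_1,
           MAX(CASE WHEN r = 1 THEN pending_tally END) AS pending_tally1,
           MAX(CASE WHEN r = 2 THEN plan_name END)     AS plan_name_2,
           MAX(CASE WHEN r = 2 THEN pending_tally END) AS pending_tally2,
           MAX(CASE WHEN r = 3 THEN plan_name END)     AS plan_name_3,
           MAX(CASE WHEN r = 3 THEN pending_tally END) AS pending_tally3
    FROM ranked
    GROUP BY claim_id, ac_name
    ORDER BY claim_id
""").fetchall()

for row in pivoted:
    print(row)
```

Because the slots are filled by row number rather than by a fixed plan_id, a claim with two plans gets them in the first two column pairs and NULLs in the third, as in the requested output.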
