Removing Duplicates Based date

Removing Duplicates Based date - sql

I have the following Query to select (will use for an update statement) remove duplicates based on the min service date and keeping the most recent svc date.
select st.SubID, st.RecordNo, st.Fname, st.Lname, st.MemberID, st.ServiceDate, IsDeduped, DedupCriteria
from stagingtable st
join (select MemberID
from stagingtable
where SubID = 99999
and waveseqid = 1
group by MemberID
having count(*) > 1) st2
on st.MemberID = st2.MemberID
and st.ServiceDate = (Select min(ServiceDate) from stagingtable s where s.subid = 99999 and s.waveseqid = 1 and st.MemberID = s.MemberID)
where SubID = 99999
and waveseqid = 1
order by RecordNo
This seems to pull in sometime only pull in multiples with the same date for the memberid:
SurveyID RecordNo Fname Lname MemberID Option9 IsDeduped DedupCriteria
99999 1 John Doe 123 10/1/2015 0 NULL x These show on the query
99999 2 John Doe 123 10/1/2015 0 NULL x These show on the query
99999 3 John Doe 123 10/8/2015 0 NULL But expected these as well
99999 4 John Doe 123 10/12/2015 0 NULL But expected these as well
99999 4 John Doe 123 10/14/2015 0 NULL But expected these as well
99999 6 John Doe 123 10/29/2015 0 NULL But expected these as well
99999 7 John Doe 123 12/14/2015 0 NULL But expected these as well

Your "AND" statement restricts the results to only rows with the minimum service date.
and st.ServiceDate = (Select min(ServiceDate) from stagingtable s where s.subid = 99999 and s.waveseqid = 1 and st.MemberID = s.MemberID)
That's why you get two rows and not all of them.

Related

sql finding cid with most expired cards [closed]

Closed. This question needs debugging details. It is not currently accepting answers.
Edit the question to include desired behavior, a specific problem or error, and the shortest code necessary to reproduce the problem. This will help others answer the question.
Closed 4 months ago.
Improve this question
I have a table Cards(card_id,status,cid)
With the columns:
cid - customer id
status - exp/vld
card_id - card id's
How to find the cid with the most expired cards?

From Oracle 12, you can use:
SELECT cid,
COUNT(*) AS num_exp
FROM cards
WHERE status = 'exp'
GROUP BY cid
ORDER BY num_exp DESC
FETCH FIRST ROW WITH TIES;

You can get count of expired cards for individual customers and then choose customer with MAX count. The below query should give results.
WITH t AS(
SELECT cid, count(1) customer_exp_cards_count
FROM Cards where status = 'exp'
group by cid)
SELECT cid FROM t t1
WHERE t1.customer_exp_cards_count IN (SELECT MAX(t2.customer_exp_cards_count)
FROM t t2)
Sample data and its result:
cardid status cid
3 exp 5
1 exp 1
2 exp 1
3 vld 1
5 vld 1
1 exp 2
2 exp 2
3 exp 2
4 vld 2
5 vld 2
6 exp 3
7 vld 4
4 vld 5
Result:
2

Suppose you have these two tables (just a sample data)
CUSTOMERS
CUST_ID
CUST_NAME
CUST_STATUS
101
John
ACTIVE
102
Annie
ACTIVE
103
Jane
ACTIVE
104
Bob
INACTIVE
CARDS
CARD_ID
CARD_STATUS
CUST_ID
1001001
VALID
101
1001002
VALID
101
1001003
EXPIRED
101
1001004
EXPIRED
101
1001005
VALID
101
1002010
VALID
102
1002020
EXPIRED
102
1002030
EXPIRED
102
1002040
EXPIRED
102
1003100
VALID
103
1003200
VALID
103
If you want just a CUST_ID with the number of most expired cards you can do it without table CUSTOMERS:
Select CUST_ID, EXPIRED_CARDS
From (Select CUST_ID, Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID)
Where EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From (Select Count(CARD_ID) "EXPIRED_CARDS" From cards Where CARD_STATUS = 'EXPIRED' Group By CUST_ID) )
--
-- R e s u l t
-- CUST_ID EXPIRED_CARDS
-- ---------- -------------
-- 102 3
Maybe you could consider creating a CTE with the data from both tables which will give you dataset that you could use later for different questions not just for this one. Something like this:
WITH
customers_cards AS
(
Select
cst.CUST_ID,
cst.CUST_NAME,
cst.CUST_STATUS,
crd.CARD_ID,
crd.CARD_STATUS,
Sum(CASE WHEN crd.CUST_ID Is Null Then 0 Else 1 End) OVER(Partition By crd.CUST_ID) "TOTAL_NUM_OF_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'VALID' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "VALID_CARDS",
Sum(CASE WHEN crd.CARD_ID Is Null Then Null WHEN crd.CARD_STATUS = 'EXPIRED' And crd.CARD_ID Is Not Null Then 1 Else 0 End) OVER(Partition By crd.CUST_ID) "EXPIRED_CARDS"
From
customers cst
Left Join
cards crd on(crd.CUST_ID = cst.CUST_ID)
)
/* R e s u l t :
CUST_ID CUST_NAME CUST_STATUS CARD_ID CARD_STATUS TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
---------- --------- ----------- ------- ----------- ------------------ ----------- -------------
101 John ACTIVE 1001001 VALID 5 3 2
101 John ACTIVE 1001002 VALID 5 3 2
101 John ACTIVE 1001003 EXPIRED 5 3 2
101 John ACTIVE 1001004 EXPIRED 5 3 2
101 John ACTIVE 1001005 VALID 5 3 2
102 Annie ACTIVE 1002010 VALID 4 1 3
102 Annie ACTIVE 1002040 EXPIRED 4 1 3
102 Annie ACTIVE 1002030 EXPIRED 4 1 3
102 Annie ACTIVE 1002020 EXPIRED 4 1 3
103 Jane ACTIVE 1003100 VALID 2 2 0
103 Jane ACTIVE 1003200 VALID 2 2 0
104 Bob INACTIVE 0
*/
This can be used to answer many more potential questions. Here is the list of customers sorted by number of expired cards (descending):
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Order By
EXPIRED_CARDS Desc Nulls Last, CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
-- 101 John 5 3 2
-- 103 Jane 2 2 0
-- 104 Bob 0
OR to answer your question:
Select Distinct
CUST_ID, CUST_NAME, TOTAL_NUM_OF_CARDS, VALID_CARDS, EXPIRED_CARDS
From
customers_cards
Where
EXPIRED_CARDS = (Select Max(EXPIRED_CARDS) From customers_cards)
Order By
CUST_ID
--
-- R e s u l t :
-- CUST_ID CUST_NAME TOTAL_NUM_OF_CARDS VALID_CARDS EXPIRED_CARDS
-- ---------- --------- ------------------ ----------- -------------
-- 102 Annie 4 1 3
Regards...

simple sql over (partition by) not working as expected

Feels like it should be simple but my mind has gone blank so would appreciate any help!
Let's say I have this dataset
Date sale_id salesperson Missed_payment_this_month
01/01/2016 1001 John 1
01/01/2016 1002 Bob 0
01/01/2016 1003 Bob 0
01/01/2016 1004 John N/A
01/02/2016 1001 John 1
01/02/2016 1002 Bob 1
01/02/2016 1003 Bob 0
01/02/2016 1004 John 1
01/03/2016 1001 John 1
01/03/2016 1002 Bob 0
01/03/2016 1003 Bob 0
01/03/2016 1004 John 1
And want to add these two columns to the end. They look at the number of missed payments previously, by sales_id and salesperson.
Previous_missed_payment_by_sale_id Previous_missed_payment_by_sales person
0 0
0 0
0 0
0 0
1 1
0 0
0 0
0 1
2 3
1 1
0 1
1 3
sales_id is ok but getting it over sales persons is giving me an error (group by) or adding in extra columns. I need to keep the rows constant.
My best guess that returns extra columns:
select t1.Date, t1.sale_id, t1.salesperson
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) previous_missed_sales_id
,sum(case when t2.Missed_payment_this_month = '1' then 1 else 0 end) OVER (PARTITION by t1.salesperson) previous_missed_salesperson
from [dbo].[simple_join_table2] t1
inner join [dbo].[simple_join_table2] t2 on
(t2.[Date] < t1.[Date] AND t1.[sale_id] = t2.[sale_id])
group by t1.Date, t1.sale_id, t1.salesperson
,case when t2.Missed_payment_this_month = '1' then 1 else 0 end
this is the output:
Date sale_id salesperson previous_missed_sales_id previous_missed_salesperson
01/02/2016 1002 Bob 0 1
01/02/2016 1003 Bob 0 1
01/03/2016 1002 Bob 0 1
01/03/2016 1002 Bob 1 1
01/03/2016 1003 Bob 0 1
01/02/2016 1001 John 1 3
01/02/2016 1004 John 0 3
01/03/2016 1001 John 2 3
01/03/2016 1004 John 0 3
01/03/2016 1004 John 1 3
Is this possible without another sub query? I guess another way to put it is i'm trying to mimic the sumx and earlier functions of Powerpivot.

If you are on 2012+ use windowing aggregates. Previous = sum all_previous_including_curret - sum current. Ms sql default window is exactly ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
with [simple_join_table2] as(
-- sample data
select cast(valuesDate as Date) valuesDate, sale_id, salesperson, Missed_payment_this_month
from (
values
('20160101',1001,'John', 1)
,('20160101',1002,'Bob ', 0)
,('20160101',1003,'Bob ', 0)
,('20160101',1004,'John',null)
,('20160201',1001,'John', 1)
,('20160201',1002,'Bob ', 1)
,('20160201',1003,'Bob ', 0)
,('20160201',1004,'John', 1)
,('20160301',1001,'John', 1)
,('20160301',1002,'Bob ', 0)
,('20160301',1003,'Bob ', 0)
,('20160301',1004,'John', 1)
) t(valuesDate, sale_id, salesperson, Missed_payment_this_month)
)
select valuesDate,sale_id, salesperson, Missed_payment_this_month,
byidprevmonth = sum(Missed_payment_this_month ) over(partition by sale_id order by valuesDate)
- sum(Missed_payment_this_month) over(partition by valuesDate, sale_id),
bypersonprevmonth = sum(Missed_payment_this_month) over(partition by salesperson order by valuesDate)
- sum(Missed_payment_this_month) over(partition by valuesDate, salesperson)
from [simple_join_table2]
order by salesperson, valuesDate

get extra rows for each group where date doesn't exist

I've been playing with this for days, and can't seem to come up with something. I have this query:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals
Which gives me these results:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
etc....
What I need to do is use a JOIN or something, so that I get a NULL value for each month (1-12), for each name:
Name m Amount
Smith 1 123.50
Smith 2 40.21
Smith 3 444.21
Smith 4 23.21
Smith 5 NULL
Smith 6 NULL
Smith ... NULL
Smith 12 NULL
Jones 1 121.00
Jones 2 499.00
Jones 3 23.23
Jones 4 41.82
Jones 5 NULL
Jones ... NULL
Jones 12 NULL
etc....
I have a "Numbers" table, and have tried doing:
select
v.emp_name as Name
,MONTH(v.YearMonth) as m
,v.SalesTotal as Amount
from SalesTotals
FULL JOIN Number n on n.Number = MONTH(v.YearMonth) and n in(1,2,3,4,5,6,7,8,9,10,11,12)
But that only gives me 6 additional NULL rows, where what I want is actually 6 NULL rows for each group of names. I've tried using Group By, but not sure how to use it in a JOIN statement like that, and not even sure if that's the correct route to take.
Any advice or direction is much appreciated!

Here's one way to do it:
select
s.emp_name as Name
,s.Number as m
,st.salestotal as Amount
from (
select distinct emp_name, number
from salestotals, numbers
where number between 1 and 12) s left join salestotals st on
s.emp_name = st.emp_name and s.number = month(st.yearmonth)
Condensed SQL Fiddle

You could do:
SELECT EN.emp_name Name,
N.Number M,
ST.SalesTotal Amount
FROM ( SELECT Number
FROM NumberTable
WHERE Number BETWEEN 1 AND 12) N
CROSS JOIN (SELECT DISTINCT emp_name
FROM SalesTotals) EN
LEFT JOIN SalesTotals ST
ON N.Number = MONTH(ST.YearMonth)
AND EN.emp_name = ST.emp_name

How to bring together multiple delta tables?

I have a table with IDs and primary information. I also have two delta tables keyed on ID and date of change. I need to build a view that merges these three tables together indicating all changes over time.
Main Table:
ID Name
-- ------------------
1 Bob Jones
2 Dave Smith
First Attribute Table:
ID Date Attr1
-- ---------- -----
1 01/01/2013 25
1 02/15/2013 33
1 02/17/2013 47
1 03/02/2013 58
2 02/01/2013 1
...
Second Attribute Table
ID Date Attr2
-- ---------- -----
1 01/01/2013 ABC
1 01/05/2013 DEF
1 01/15/2013 RST
1 02/10/2013 XYZ
1 02/15/2013 Foo
1 03/05/2013 Blah
2 02/01/2013 Two
...
Based on that data, for Bob Jones, I need the view to return the following:
ID Name Date Attr1 Attr2
-- ----------- ---------- ----- -----
1 Bob Jones 01/01/2013 25 ABC
1 Bob Jones 01/05/2013 25 DEF
1 Bob Jones 01/15/2013 25 RST
1 Bob Jones 02/10/2013 25 XYZ
1 Bob Jones 02/15/2013 33 Foo
1 Bob Jones 02/17/2013 47 Foo
1 Bob Jones 03/02/2013 58 Foo
1 Bob Jones 03/05/2013 58 Blah
I tried outer joining the attribute tables to get all change values ordered by date and then used an outer join on the entire query with itself to get "prior" records:
with qry as (
select
rownum = ROW_NUMBER() OVER (ORDER BY m.ID, a.DATE),
m.ID,
m.Name,
a.DATE,
a.Attr1,
a.Attr2
from Main m
inner join (
select
COALESCE(a1.ID, a2.ID) as ID,
COALESCE(a1.LOAD_DATE, a2.LOAD_DATE) as LOAD_DATE,
a1.Attr1,
a2.Attr2
from Attributes1 a1
full outer join Attributes2 a2
on (a1.ID = a2.ID and a1.DATE = a2.DATE)
) a on (a.ID = m.ID)
)
select
COALESCE(qry.ID, prev.ID) as ID,
COALESCE(qry.Name, prev.Name) as Name,
COALESCE(qry.DATE, prev.DATE) as DATE,
COALESCE(qry.Attr1, prev.Attr1) as Attr1,
COALESCE(qry.Attr2, prev.Attr2) as Attr2,
from qry
left join qry prev
on (prev.rownum = qry.rownum - 1)
order by ID, DATE
However, that doesn't work when one attribute table changes quicker than the other because the attributes that didn't change are null in the results of the attribute table join and if two nulls show up back-to-back, the coalesce will return a null when I need the last non-null value that was in that column.
Can this even be done in a view in SQL Server 2012?

Getting the sum of columns based on row values

I have a table that looks like the following.
EMPNUM EMPNAME LOCATION CATEGORY COUNT
123 JOHN DOE BLDG A 1 5
123 JOHN DOE BLDG A 1 6
123 JOHN DOE BLDG A 2 4
123 JOHN DOE BLDG A 3 7
123 JOHN DOE BLDG B 1 1
123 JOHN DOE BLDG B 2 3
234 EMILY DOE BLDG A 1 1
234 EMILY DOE BLDG A 2 2
234 EMILY DOE BLDG A 3 4
234 EMILY DOE BLDG B 2 3
234 EMILY DOE BLDG B 2 9
234 EMILY DOE BLDG B 3 3
I would like to transport it into columns that will yield to an output similar to below. I need to get the sum of COUNT based on the values of LOCATION and CATEGORY
EMPNUM EMPNAME SUM_A1 SUM_A2 SUM_A3 SUM_B1 SUM_B2 SUM_B3
123 JOHN DOE 11 4 7 1 3 0
234 EMILY DOE 1 2 4 0 12 3
Is there any way to do this as an SQL query? or in Crystal reports (though I prefer output using SQL)

If you are using 11g or later try
select * from table1
PIVOT (SUM("COUNT")
FOR ("LOCATION","CATEGORY") IN
(('BLDG A',1) AS sum_a1,
('BLDG A',2) AS sum_a2,
('BLDG A',3) AS sum_a3,
('BLDG B',1) AS sum_b1,
('BLDG B',2) AS sum_b2,
('BLDG B',3) AS sum_b3));
Here is a fiddle
Otherwise use APC's solution

This will work providing the values in LOCATION and CATEGORY are constant:
select empnum
, empname
, sum(case when location='BLDG A' and category = 1 then count else 0 end) sum_a1
, sum(case when location='BLDG A' and category = 2 then count else 0 end) sum_a2
, sum(case when location='BLDG A' and category = 3 then count else 0 end) sum_a3
, sum(case when location='BLDG B' and category = 1 then count else 0 end) sum_b1
, sum(case when location='BLDG B' and category = 2 then count else 0 end) sum_b2
, sum(case when location='BLDG B' and category = 3 then count else 0 end) sum_b3
from your_table
group by empnum
, empname
If the values are not known or not stable when you run the query you will need to use dynamic SQL.
Note that if you are on 11g you should employ A B Cade's PIVOT solution, which is more elegant.

The other answers will work great if you have a known number of values to transform into columns. But if you have an unknown number, then you can use dynamic sql to generate the results.
You would create the following procedure:
CREATE OR REPLACE procedure test_dynamic_pivot(p_cursor in out sys_refcursor)
as
sql_query varchar2(1000) := 'select empnum, empname';
begin
for x in (select distinct location, category from yourtable order by 1)
loop
sql_query := sql_query ||
' , sum(case when location = '''||x.location||''' and category='||x.category||' then cnt else 0 end) as sum_'||substr(x.location, -1, 1)||x.category;
dbms_output.put_line(sql_query);
end loop;
sql_query := sql_query || ' from yourtable group by empnum, empname';
open p_cursor for sql_query;
end;
/
And then to execute it:
variable x refcursor
exec test_dynamic_pivot(:x)
print x
The result is the same as the hard-coded version:
| EMPNUM | EMPNAME | SUM_A1 | SUM_A2 | SUM_A3 | SUM_B1 | SUM_B2 | SUM_B3 |
----------------------------------------------------------------------------
| 234 | EMILY DOE | 1 | 2 | 4 | 0 | 12 | 3 |
| 123 | JOHN DOE | 11 | 4 | 7 | 1 | 3 | 0 |

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Removing Duplicates Based date - sql

Your "AND" statement restricts the results to only rows with the minimum service date. and st.ServiceDate = (Select min(ServiceDate) from stagingtable s where s.subid = 99999 and s.waveseqid = 1 and st.MemberID = s.MemberID) That's why you get two rows and not all of them.

Related

sql finding cid with most expired cards [closed]

simple sql over (partition by) not working as expected

get extra rows for each group where date doesn't exist

How to bring together multiple delta tables?

Getting the sum of columns based on row values

Categories

Resources