How to group, count consecutive dates and use it as filter in Netezza - sql

I'm trying to group consecutive dates, count the dates in each group, and use that count as a filter.
I have a table that currently looks like:
pat_id admin_dates admin_grp daily_admin
-------------------------------------------------
1 08/20/2018 1 2 doses
1 08/21/2018 1 3 doses
1 08/22/2018 1 1 doses
1 10/05/2018 2 3 doses
1 12/10/2018 3 4 doses
2 01/05/2019 1 1 doses
2 02/10/2019 2 2 doses
2 02/11/2019 2 2 doses
where admin_grp is grouping consecutive dates per pat_id.
I want to exclude all rows that belong to a run of fewer than 3 consecutive dates for the same pat_id. In this example, only pat_id = 1 with admin_grp = 1 has 3 consecutive dates, so those are the rows I would like to see in the result. My desired output would be:
pat_id admin_dates admin_grp daily_admin
-------------------------------------------------
1 08/20/2018 1 2 doses
1 08/21/2018 1 3 doses
1 08/22/2018 1 1 doses
I honestly have no idea how to perform this.. my attempt failed to count how many admin_grp has same value within same pat_id, let alone using that count as filter. If anyone could help out / suggest ideas how to tackle this, it will be greatly appreciated.

Assuming that each admin_grp only contains consecutive days, you just need to count the records per (pat_id, admin_grp) and keep the groups that have 3 or more records.
E.g.:
select x.*
from (select t.*
            ,count(*) over(partition by pat_id, admin_grp) as cnt
      from your_table t
     ) x
where x.cnt >= 3
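A minimal, runnable sketch of the same window-count filter, using Python's built-in sqlite3 as a stand-in for Netezza (an assumption made only so the example can run anywhere; window functions need SQLite 3.25 or later). The table name `admin` is hypothetical, and the rows are the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admin (pat_id INTEGER, admin_dates TEXT, "
    "admin_grp INTEGER, daily_admin TEXT)"
)
conn.executemany(
    "INSERT INTO admin VALUES (?, ?, ?, ?)",
    [
        (1, "08/20/2018", 1, "2 doses"),
        (1, "08/21/2018", 1, "3 doses"),
        (1, "08/22/2018", 1, "1 doses"),
        (1, "10/05/2018", 2, "3 doses"),
        (1, "12/10/2018", 3, "4 doses"),
        (2, "01/05/2019", 1, "1 doses"),
        (2, "02/10/2019", 2, "2 doses"),
        (2, "02/11/2019", 2, "2 doses"),
    ],
)

# Attach the size of each (pat_id, admin_grp) group to every row,
# then keep only rows whose group has at least 3 members.
query = """
SELECT pat_id, admin_dates, admin_grp, daily_admin
FROM (SELECT t.*,
             COUNT(*) OVER (PARTITION BY pat_id, admin_grp) AS cnt
      FROM admin t) x
WHERE x.cnt >= 3
ORDER BY pat_id, admin_dates
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

Only the three rows of pat_id = 1, admin_grp = 1 survive the filter, which matches the desired output in the question.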

Short answer: join the table with itself on 'pat_id' and filter appropriately:
Select a.* from TABLE a
join (Select * from TABLE where daily_admin='3 doses') b
using (pat_id)
Where a.daily_admin in ('1 doses', '2 doses', '3 doses')
Btw: too bad the 'daily_admin' column is not an integer... a better data model would have made the Where clause slightly simpler :)

Related

sql snowflake, aggregate over window or sth

I have a table below
days        balance  user_id  wanted column
-------------------------------------------
2022/08/01  10       1        1
2022/08/02  11       1        1
2022/08/03  10       1        1
2022/08/03  0        2        1
2022/08/05  3        2        2
2022/08/06  3        2        2
2022/08/07  3        3        3
2022/08/08  0        2        3
Since I'm new to SQL, I couldn't get the aggregation over a window right.
What I want is the number of unique users that have balance > 0, per day.
Thanks.
update:
exact output wanted:
days        unique users
------------------------
2022/08/01  1
2022/08/02  1
2022/08/03  1
2022/08/05  2
2022/08/06  2
2022/08/07  3
2022/08/08  3
update: what if I want to accumulate the number of unique users over time, counting only new users (users who didn't exist before) with balance > 0?
Everyone's help is deeply appreciated :)
SELECT
*,
COUNT(DISTINCT CASE WHEN balance > 0 THEN USER_ID END) OVER (ORDER BY days)
FROM
your_table
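Not every engine accepts DISTINCT inside an ordered window. SQLite, used here through Python's sqlite3 purely as a runnable stand-in, rejects it, so the sketch below takes a different route to the same running total: find the first day each user has balance > 0, then for each day count the users whose first day is on or before it. The table name `balances` is hypothetical, and the dates are stored as YYYY/MM/DD text, which sorts correctly as a string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balances (days TEXT, balance INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO balances VALUES (?, ?, ?)",
    [
        ("2022/08/01", 10, 1),
        ("2022/08/02", 11, 1),
        ("2022/08/03", 10, 1),
        ("2022/08/03", 0, 2),
        ("2022/08/05", 3, 2),
        ("2022/08/06", 3, 2),
        ("2022/08/07", 3, 3),
        ("2022/08/08", 0, 2),
    ],
)

# first_seen: the first day on which each user has a positive balance.
# For every day in the table, count users first seen on or before that day.
query = """
WITH first_seen AS (
    SELECT user_id, MIN(days) AS first_day
    FROM balances
    WHERE balance > 0
    GROUP BY user_id
),
all_days AS (
    SELECT DISTINCT days FROM balances
)
SELECT d.days,
       (SELECT COUNT(*) FROM first_seen f WHERE f.first_day <= d.days) AS unique_users
FROM all_days d
ORDER BY d.days
"""
cumulative = conn.execute(query).fetchall()
for row in cumulative:
    print(row)
```

On the sample data this reproduces the "exact output wanted" table from the question: 1 user through 2022/08/03, 2 users from 2022/08/05, and 3 users from 2022/08/07.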

Find count of occurrence of a row

I have a table with duplicate rows and I need to extract these duplicate rows alone. Below is an example of the table I have:
my_table:
ID Offer
1 10
2 10
1 12
1 10
2 20
2 10
What I want next is to count the occurrence of the offer for each ID. i.e, my final result should be:
ID Offer Count
1 10 1
2 10 1
1 12 1
1 10 2
2 20 1
2 10 2
As you can see, the count should increase based on the number of times the offer shows up per ID.
I tried something like:
select id,offer,count(offer) over (partition by id);
But this just gives the total count of that particular offer for that ID and is not the result I am looking for.
Any help is much appreciated!
You could use ROW_NUMBER:
select id,offer,ROW_NUMBER() over (partition by id, offer order by rownum)
from tab
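A small sketch of the ROW_NUMBER approach on the sample data, run through Python's sqlite3 (window functions need SQLite 3.25 or later). `rownum` in the answer is an Oracle pseudo-column; the sketch assumes SQLite's implicit `rowid` as a stand-in for insertion order, which is an assumption about this demo table rather than something the question states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, offer INTEGER)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?)",
    [(1, 10), (2, 10), (1, 12), (1, 10), (2, 20), (2, 10)],
)

# Number each (id, offer) pair in insertion order, so the second
# occurrence of the same pair gets 2, the third would get 3, and so on.
query = """
SELECT id, offer,
       ROW_NUMBER() OVER (PARTITION BY id, offer ORDER BY rowid) AS occurrence
FROM my_table
ORDER BY rowid
"""
occurrences = conn.execute(query).fetchall()
for row in occurrences:
    print(row)
```

Without a column that defines row order, "first" and "second" occurrence are not well defined in SQL, so a real table would need a timestamp or sequence column in the ORDER BY.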

Distinct count mismatch

This is my initial table structure.
MEMBER_ID ITEM_ID ACCOUNT
1 3 A
1 4 A
2 1 B
3 4 B
4 4 B
5 4 A
6 2 A
When I want the distinct number of members I do
Select COUNT(DISTINCT MEMBER_ID) FROM TABLE A
I get 6, the expected answer
When I do
SELECT COUNT(DISTINCT MEMBER_ID),ACCOUNT FROM TABLE A GROUP BY 2
I get something like A=4 and B=3. What do you think is the disconnect here?
Thanks
I find the results highly unlikely. You would, however, get 4 and 3 if the data were slightly different:
MEMBER_ID ITEM_ID ACCOUNT
1 3 A
1 4 B
2 1 B
3 4 B
4 4 A
5 4 A
6 2 A
With the group by, MEMBER_ID = 1 would be counted twice -- once for A and once for B. My guess is that something like this is happening in your real problem. COUNT(DISTINCT) is not additive. So, when you break it apart using a group by, the sum of the per-group values is not (necessarily) the value for all the data. This differs from MIN(), MAX(), COUNT(*), and SUM(). However, AVG() is also not additive (although it is easily recalculated).
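The non-additivity can be checked directly on the modified data from the answer. This is a minimal sketch using Python's sqlite3; the table name `members` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE members (member_id INTEGER, item_id INTEGER, account TEXT)")
conn.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [
        (1, 3, "A"),
        (1, 4, "B"),  # member 1 appears under both accounts
        (2, 1, "B"),
        (3, 4, "B"),
        (4, 4, "A"),
        (5, 4, "A"),
        (6, 2, "A"),
    ],
)

# Distinct members over the whole table.
total = conn.execute("SELECT COUNT(DISTINCT member_id) FROM members").fetchone()[0]

# Distinct members per account.
per_account = dict(
    conn.execute(
        "SELECT account, COUNT(DISTINCT member_id) FROM members GROUP BY account"
    ).fetchall()
)

print(total, per_account, sum(per_account.values()))
```

The per-account counts add up to 7 while the overall distinct count is 6, because member 1 is counted once under each account.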

How can I find consecutive active weeks in SQL?

What I would like to do is find the number of consecutive weeks that someone is active on Sundays and assign them a value. They have to participate in at least 2 races a day to be counted as active for the week.
If they are active for 2 consecutive weeks I would like to assign a value of 100, 3 consecutive weeks a value of 200, 4 consecutive weeks a value of 300, and continuing up to 9 consecutive weeks.
My difficulty is not determining consecutive weeks, but breaks in between consecutive dates. Suppose the following dataset:
CustomerID RaceDate Races
1 2/2/2014 2
1 2/9/2014 5
1 2/16/2014 3
1 2/23/2014 3
1 3/2/2014 4
1 3/9/2014 3
1 3/16/2014 3
2 2/2/2014 2
2 2/9/2014 3
2 3/2/2014 2
2 3/9/2014 4
2 3/16/2014 3
CustomerID 1 would have 7 consecutive weeks for a value of 600.
The hard part for me is CustomerID 2. They would have 2 consecutive weeks AND 3 consecutive weeks. So their total value would be 100 + 200 = 300.
I would like to be able to do this with any different combination of consecutive weeks.
Any help please?
EDIT: I am using SQL Server 2008 R2.
When looking for sequential values, there is a simple observation that helps: if you subtract an increasing sequence (here, a row number counted in weeks) from the dates, the result is constant within each run of consecutive weeks. You can use this as a grouping mechanism:
select CustomerId, min(RaceDate) as seqStart, max(RaceDate) as seqEnd,
       count(*) as NumDaysRaced
from (select t.*,
             -- the row number must be ordered by date within each customer
             dateadd(week, - row_number() over (partition by CustomerId order by RaceDate),
                     RaceDate) as grp
      from your_table t
      where Races >= 2
     ) t
group by CustomerId, grp;
You can then use this to get your final "points":
select CustomerId,
       sum(case when NumDaysRaced > 1 then (NumDaysRaced - 1) * 100 else 0 end) as Points
from (select CustomerId, min(RaceDate) as seqStart, max(RaceDate) as seqEnd,
             count(*) as NumDaysRaced
      from (select t.*,
                   -- the row number must be ordered by date within each customer
                   dateadd(week, - row_number() over (partition by CustomerId order by RaceDate),
                           RaceDate) as grp
            from your_table t
            where Races >= 2
           ) t
      group by CustomerId, grp
     ) c
group by CustomerId;
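The grouping trick can be tried end to end on the sample data. The sketch below runs it through Python's sqlite3 rather than SQL Server, so the date arithmetic is an adaptation: `julianday` minus 7 times the row number plays the role of `dateadd(week, -row_number, RaceDate)`. The table and column names are hypothetical, dates are rewritten as ISO text, and every race date is assumed to fall exactly 7 days after the previous one within a run. The cap at 9 consecutive weeks mentioned in the question is not applied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE races (customer_id INTEGER, race_date TEXT, races INTEGER)")
conn.executemany(
    "INSERT INTO races VALUES (?, ?, ?)",
    [
        (1, "2014-02-02", 2), (1, "2014-02-09", 5), (1, "2014-02-16", 3),
        (1, "2014-02-23", 3), (1, "2014-03-02", 4), (1, "2014-03-09", 3),
        (1, "2014-03-16", 3),
        (2, "2014-02-02", 2), (2, "2014-02-09", 3),
        (2, "2014-03-02", 2), (2, "2014-03-09", 4), (2, "2014-03-16", 3),
    ],
)

# grp is constant within a run of consecutive weeks: the date advances by
# 7 days per row while 7 * row_number also advances by 7 per row.
# Each run of n weeks (n > 1) is worth (n - 1) * 100 points.
query = """
WITH runs AS (
    SELECT customer_id, COUNT(*) AS weeks
    FROM (SELECT customer_id,
                 CAST(julianday(race_date) AS INTEGER)
                   - 7 * ROW_NUMBER() OVER (PARTITION BY customer_id
                                            ORDER BY race_date) AS grp
          FROM races
          WHERE races >= 2) t
    GROUP BY customer_id, grp
)
SELECT customer_id,
       SUM(CASE WHEN weeks > 1 THEN (weeks - 1) * 100 ELSE 0 END) AS points
FROM runs
GROUP BY customer_id
ORDER BY customer_id
"""
points = conn.execute(query).fetchall()
for row in points:
    print(row)
```

Customer 1 has a single run of 7 weeks (600 points); customer 2 has runs of 2 and 3 weeks (100 + 200 = 300 points), as described in the question.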

Need help designing a proper sql query

I have this table:
DebitDate | DebitTypeID | DebitPrice | DebitQuantity
----------------------------------------------------
40577 1 50 3
40577 1 100 1
40577 2 75 2
40578 1 50 2
40578 2 150 2
I would like to get with a single query (if that's possible), these details:
date, debit_id, total_sum_of_same_debit, how_many_debits_per_day
so from the example above i would get:
40577, 1, (50*3)+(100*1), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40577, 2, (75*2), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40578, 1, (50*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
40578, 2, (150*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
So i have this sql query:
SELECT DebitDate, DebitTypeID, SUM(DebitPrice*DebitQuantity) AS TotalSum
FROM DebitsList
GROUP BY DebitDate, DebitTypeID, DebitPrice, DebitQuantity
And now i'm having trouble and i'm not sure where to put the count for the last info i need.
You would need a correlated subquery to get this new column. You also need to drop DebitPrice and DebitQuantity from the GROUP BY clause for it to work.
SELECT DebitDate,
DebitTypeID,
SUM(DebitPrice*DebitQuantity) AS TotalSum,
( select Count(distinct E.DebitTypeID)
from DebitsList E
where E.DebitDate=D.DebitDate) as CountDebits
FROM DebitsList D
GROUP BY DebitDate, DebitTypeID
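As a quick check, the correlated-subquery version can be run on the sample rows with Python's sqlite3 (a stand-in chosen only so the sketch is runnable; the original question does not name its database).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DebitsList (DebitDate INTEGER, DebitTypeID INTEGER, "
    "DebitPrice INTEGER, DebitQuantity INTEGER)"
)
conn.executemany(
    "INSERT INTO DebitsList VALUES (?, ?, ?, ?)",
    [
        (40577, 1, 50, 3),
        (40577, 1, 100, 1),
        (40577, 2, 75, 2),
        (40578, 1, 50, 2),
        (40578, 2, 150, 2),
    ],
)

# One row per (date, debit type): the total for that type on that date,
# plus the number of distinct debit types seen on the same date.
query = """
SELECT D.DebitDate,
       D.DebitTypeID,
       SUM(D.DebitPrice * D.DebitQuantity) AS TotalSum,
       (SELECT COUNT(DISTINCT E.DebitTypeID)
        FROM DebitsList E
        WHERE E.DebitDate = D.DebitDate) AS CountDebits
FROM DebitsList D
GROUP BY D.DebitDate, D.DebitTypeID
ORDER BY D.DebitDate, D.DebitTypeID
"""
debits = conn.execute(query).fetchall()
for row in debits:
    print(row)
```

The totals match the hand-computed values in the question, for example (50*3)+(100*1) = 250 for debit type 1 on 40577, with 2 debit types on each day.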
I think this can help you.
SELECT DebitDate, SUM(DebitPrice*DebitQuantity) AS TotalSum, Count(DebitDate) as DebitDateCount
FROM DebitsList where DebitTypeID = 1
GROUP BY DebitDate