sql snowflake, aggregate over window or sth


I have the table below:
days        balance  user_id  wanted column
-------------------------------------------
2022/08/01  10       1        1
2022/08/02  11       1        1
2022/08/03  10       1        1
2022/08/03  0        2        1
2022/08/05  3        2        2
2022/08/06  3        2        2
2022/08/07  3        3        3
2022/08/08  0        2        3
I'm new to SQL and couldn't get an aggregate over a window clause to work correctly.
What I want is the number of unique users that have balance > 0 per day.
Thanks.
Update: the exact output wanted:
days        unique users
------------------------
2022/08/01  1
2022/08/02  1
2022/08/03  1
2022/08/05  2
2022/08/06  2
2022/08/07  3
2022/08/08  3
Update: what if I want to accumulate the number of unique users over time, counting only new users (users who did not appear on any earlier day) with balance > 0?
Everyone's help is deeply appreciated :)

SELECT
*,
COUNT(DISTINCT CASE WHEN balance > 0 THEN USER_ID END) OVER (ORDER BY days)
FROM
your_table
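
Some engines reject DISTINCT inside a window aggregate that has an ORDER BY (SQLite, used below as a stand-in, is one of them), so the running distinct count may need to be rewritten. A minimal sketch, assuming a hypothetical table named balances with the columns from the question: find each user's first day with balance > 0, count those first appearances per day, and take a running sum. Python's sqlite3 module is used only so the result can be checked against the wanted output; the SQL is the part that matters.

```python
import sqlite3

# Sample rows from the question: (days, balance, user_id).
# The dates are zero-padded strings, so they sort correctly as text.
ROWS = [
    ("2022/08/01", 10, 1),
    ("2022/08/02", 11, 1),
    ("2022/08/03", 10, 1),
    ("2022/08/03", 0, 2),
    ("2022/08/05", 3, 2),
    ("2022/08/06", 3, 2),
    ("2022/08/07", 3, 3),
    ("2022/08/08", 0, 2),
]

QUERY = """
WITH first_seen AS (
    -- first day on which each user has a positive balance
    SELECT user_id, MIN(days) AS first_day
    FROM balances
    WHERE balance > 0
    GROUP BY user_id
),
per_day AS (
    -- number of users whose first positive day is this day (0 if none)
    SELECT b.days, COUNT(DISTINCT f.user_id) AS new_users
    FROM balances AS b
    LEFT JOIN first_seen AS f ON f.first_day = b.days
    GROUP BY b.days
)
SELECT days, SUM(new_users) OVER (ORDER BY days) AS unique_users
FROM per_day
ORDER BY days
"""


def cumulative_unique_users(rows):
    """Load the rows into an in-memory table and run the query."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE balances (days TEXT, balance INTEGER, user_id INTEGER)"
    )
    conn.executemany("INSERT INTO balances VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


result = cumulative_unique_users(ROWS)
for day, users in result:
    print(day, users)
```

On the sample data this yields one row per day, with the count rising only on 2022/08/05 and 2022/08/07, when users 2 and 3 first have a positive balance.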

Related

Resetting a Count in SQL

I have data that looks like this:
ID num_of_days
1 0
2 0
2 8
2 9
2 10
2 15
3 10
3 20
I want to add another column that increments in value only if the num_of_days column is divisible by 5 or the ID number increases so my end result would look like this:
ID num_of_days row_num
1 0 1
2 0 2
2 8 2
2 9 2
2 10 3
2 15 4
3 10 5
3 20 6
Any suggestions?
Edit #1:
num_of_days represents the number of days since the customer last saw a doctor between 1 visit and the next.
A customer can see a doctor 1 time or they can see a doctor multiple times.
If it's the first time visiting, the num_of_days = 0.

SQL tables represent unordered sets. Based on your question, I'll assume that the combination of id/num_of_days provides the ordering.
You can use a cumulative sum with lag():
select t.*,
       sum(case when prev_id = id and num_of_days % 5 <> 0
                then 0 else 1
           end) over (order by id, num_of_days) as row_num
from (select t.*,
             lag(id) over (order by id, num_of_days) as prev_id
      from t
     ) t;
If you have a different ordering column, then just use that in the order by clauses.
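
To see the lag-plus-cumulative-sum idea at work, here is a small sketch that runs the query against the sample rows from the question. It assumes the table is called t, as in the answer, and uses SQLite through Python's sqlite3 module as a stand-in engine; the inner alias s is introduced here only for readability.

```python
import sqlite3

# Sample rows from the question: (id, num_of_days).
ROWS = [(1, 0), (2, 0), (2, 8), (2, 9), (2, 10), (2, 15), (3, 10), (3, 20)]

QUERY = """
SELECT id,
       num_of_days,
       -- add 1 whenever the id changes or num_of_days is divisible by 5
       SUM(CASE WHEN prev_id = id AND num_of_days % 5 <> 0 THEN 0 ELSE 1 END)
           OVER (ORDER BY id, num_of_days) AS row_num
FROM (
    SELECT id,
           num_of_days,
           LAG(id) OVER (ORDER BY id, num_of_days) AS prev_id
    FROM t
) AS s
ORDER BY id, num_of_days
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, num_of_days INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

row_nums = [row_num for _id, _days, row_num in result]
print(row_nums)
```

The first row has no previous id, so the comparison with prev_id is not true and the CASE falls through to 1, which starts the counter.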

How to group, count consecutive dates and use it as filter in Netezza

I'm trying to group consecutive dates, count the consecutive dates, and use that count as filter.
I have a table that currently looks like:
pat_id admin_dates admin_grp daily_admin
-------------------------------------------------
1 08/20/2018 1 2 doses
1 08/21/2018 1 3 doses
1 08/22/2018 1 1 doses
1 10/05/2018 2 3 doses
1 12/10/2018 3 4 doses
2 01/05/2019 1 1 doses
2 02/10/2019 2 2 doses
2 02/11/2019 2 2 doses
where admin_grp is grouping consecutive dates per pat_id.
I want to exclude all rows that have less than 3 consecutive dates for same pat_id. In this example, only pat_id = 1 and admin_grp = 1 condition has 3 consecutive dates, which I would like to see in result. My desired output would be:
pat_id admin_dates admin_grp daily_admin
-------------------------------------------------
1 08/20/2018 1 2 doses
1 08/21/2018 1 3 doses
1 08/22/2018 1 1 doses
I honestly have no idea how to perform this.. my attempt failed to count how many admin_grp has same value within same pat_id, let alone using that count as filter. If anyone could help out / suggest ideas how to tackle this, it will be greatly appreciated.

Assuming that each admin_grp contains only consecutive days, you just need to count the rows per (pat_id, admin_grp) and keep the groups that have 3 or more rows.
E.g.:
select x.*
from (select t.*,
             count(*) over (partition by pat_id, admin_grp) as cnt
      from your_table t
     ) x
where x.cnt >= 3
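
A quick way to check the partitioned count is to run it on the sample rows from the question. The sketch below assumes a hypothetical table name admin_log and uses SQLite via Python's sqlite3 module instead of Netezza; the window count itself is standard SQL.

```python
import sqlite3

# Sample rows from the question:
# (pat_id, admin_dates, admin_grp, daily_admin).
ROWS = [
    (1, "08/20/2018", 1, "2 doses"),
    (1, "08/21/2018", 1, "3 doses"),
    (1, "08/22/2018", 1, "1 doses"),
    (1, "10/05/2018", 2, "3 doses"),
    (1, "12/10/2018", 3, "4 doses"),
    (2, "01/05/2019", 1, "1 doses"),
    (2, "02/10/2019", 2, "2 doses"),
    (2, "02/11/2019", 2, "2 doses"),
]

QUERY = """
SELECT x.pat_id, x.admin_dates, x.admin_grp, x.daily_admin
FROM (
    SELECT a.*,
           -- size of the (pat_id, admin_grp) group this row belongs to
           COUNT(*) OVER (PARTITION BY pat_id, admin_grp) AS cnt
    FROM admin_log AS a
) AS x
WHERE x.cnt >= 3
ORDER BY x.pat_id, x.admin_dates
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE admin_log "
    "(pat_id INTEGER, admin_dates TEXT, admin_grp INTEGER, daily_admin TEXT)"
)
conn.executemany("INSERT INTO admin_log VALUES (?, ?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```

Only the three rows of pat_id = 1, admin_grp = 1 survive the filter, which matches the desired output in the question.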

Short answer: join the table with itself on 'pat_id' and filter appropriately:
select a.*
from TABLE a
join (select * from TABLE where daily_admin = '3 doses') b
using (pat_id)
where a.daily_admin in ('1 doses', '2 doses', '3 doses')
Btw: too bad the ‘daily_admin’ column is not an integer... better data model would have made the Where statement slightly simpler :)

SQL Server : how can I get difference between counts of total rows and those with only data

I have a table with data as shown below (the table is built every day with current date, but I left off that field for ease of reading).
This table keeps track of people and the doors they enter on a daily basis.
Table entrance_t:
id entrance entered
------------------------
1 a 0
1 b 0
1 c 0
1 d 0
2 a 1
2 b 0
2 c 0
2 d 0
3 a 0
3 b 1
3 c 1
3 d 1
My goal is to report on people and count the entrances they did not use (grouping by person), but ONLY for people who entered at least once (entered = 1).
So using the above table, I would like the results of query to be...
id count
----------
2 3
3 1
(id=2 did not use 3 of the entrances and id=3 did not use 1)
I tried queries(some with inner joins on two instances of same table) and I can get the entrances not used, but it's always for everybody. Like this...
id count
----------
1 4
2 3
3 1
How do I not display results id=1 since they did not enter at all?
Thank you,

You could use conditional aggregation:
SELECT id, count(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
HAVING count(CASE WHEN entered = 1 THEN 1 END) > 0;
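
The conditional aggregation above can be tried on the sample table as follows. This is a sketch using SQLite through Python's sqlite3 module rather than SQL Server; the table name entrance_t and the data come from the question, and the ORDER BY is added here only to make the output order predictable.

```python
import sqlite3

# Sample rows from the question: (id, entrance, entered).
ROWS = [
    (1, "a", 0), (1, "b", 0), (1, "c", 0), (1, "d", 0),
    (2, "a", 1), (2, "b", 0), (2, "c", 0), (2, "d", 0),
    (3, "a", 0), (3, "b", 1), (3, "c", 1), (3, "d", 1),
]

QUERY = """
SELECT id,
       -- entrances this person did not use
       COUNT(CASE WHEN entered = 0 THEN 1 END) AS cnt
FROM entrance_t
GROUP BY id
-- keep only people who used at least one entrance
HAVING COUNT(CASE WHEN entered = 1 THEN 1 END) > 0
ORDER BY id
"""

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entrance_t (id INTEGER, entrance TEXT, entered INTEGER)")
conn.executemany("INSERT INTO entrance_t VALUES (?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

print(result)
```

COUNT ignores NULLs, and a CASE without an ELSE branch returns NULL, so each COUNT only sees the rows that match its condition. That is why id = 1 drops out: its HAVING count is 0.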

Inserting a new indicator column to tell if a given row maximizes another column in SQL

I currently have a table in SQL that looks like this
PRODUCT_ID_1 PRODUCT_ID_2 SCORE
1 2 10
1 3 100
1 10 3000
2 10 10
3 35 100
3 2 1001
That is, PRODUCT_ID_1,PRODUCT_ID_2 is a primary key for this table.
What I would like to do is add a column to this table that tells whether or not the current row is the one that maximizes SCORE for its value of PRODUCT_ID_1.
In other words, what I would like to get is the following table:
PRODUCT_ID_1 PRODUCT_ID_2 SCORE IS_MAX_SCORE_FOR_ID_1
1 2 10 0
1 3 100 0
1 10 3000 1
2 10 10 1
3 35 100 0
3 2 1001 1
I am wondering how I can compute the IS_MAX_SCORE_FOR_ID_1 column and insert it into the table without having to create a new table.

You can try like this...
Select PRODUCT_ID_1, PRODUCT_ID_2, SCORE,
       (Case when b.Score =
                  (Select Max(a.Score) from TableName a where a.PRODUCT_ID_1 = b.PRODUCT_ID_1)
             then 1 else 0 End) as IS_MAX_SCORE_FOR_ID_1
from TableName b

You can use a window function for this:
select product_id_1,
       product_id_2,
       score,
       case
         when score = max(score) over (partition by product_id_1) then 1
         else 0
       end as is_max_score_for_id_1
from the_table
order by product_id_1;
(The above is ANSI SQL and should run on any modern DBMS)
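
As a check of the window-function version, the sketch below runs it on the sample rows with SQLite through Python's sqlite3 module. The table name the_table is taken from the answer; the extra product_id_2 in the ORDER BY is an addition here so that the row order is fully determined.

```python
import sqlite3

# Sample rows from the question: (product_id_1, product_id_2, score).
ROWS = [
    (1, 2, 10),
    (1, 3, 100),
    (1, 10, 3000),
    (2, 10, 10),
    (3, 35, 100),
    (3, 2, 1001),
]

QUERY = """
SELECT product_id_1,
       product_id_2,
       score,
       -- compare each row's score with the maximum of its product_id_1 group
       CASE
           WHEN score = MAX(score) OVER (PARTITION BY product_id_1) THEN 1
           ELSE 0
       END AS is_max_score_for_id_1
FROM the_table
ORDER BY product_id_1, product_id_2
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table "
    "(product_id_1 INTEGER, product_id_2 INTEGER, score INTEGER, "
    "PRIMARY KEY (product_id_1, product_id_2))"
)
conn.executemany("INSERT INTO the_table VALUES (?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```

Note that if two rows tie for the maximum score within one product_id_1, both are flagged with 1.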

Need help designing a proper sql query

I have this table:
DebitDate | DebitTypeID | DebitPrice | DebitQuantity
----------------------------------------------------
40577 1 50 3
40577 1 100 1
40577 2 75 2
40578 1 50 2
40578 2 150 2
I would like to get with a single query (if that's possible), these details:
date, debit_id, total_sum_of_same_debit, how_many_debits_per_day
so from the example above i would get:
40577, 1, (50*3)+(100*1), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40577, 2, (75*2), 2 (because 40577 has 1 and 2 so total of 2 debits per this day)
40578, 1, (50*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
40578, 2, (150*2), 2 (because 40578 has 1 and 2 so total of 2 debits per this day)
So i have this sql query:
SELECT DebitDate, DebitTypeID, SUM(DebitPrice*DebitQuantity) AS TotalSum
FROM DebitsList
GROUP BY DebitDate, DebitTypeID, DebitPrice, DebitQuantity
And now i'm having trouble and i'm not sure where to put the count for the last info i need.

You would need a correlated subquery to get this new column. You also need to drop DebitPrice and DebitQuantity from the GROUP BY clause for it to work.
SELECT DebitDate,
       DebitTypeID,
       SUM(DebitPrice*DebitQuantity) AS TotalSum,
       ( select Count(distinct E.DebitTypeID)
         from DebitsList E
         where E.DebitDate=D.DebitDate) as CountDebits
FROM DebitsList D
GROUP BY DebitDate, DebitTypeID
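
The correlated-subquery approach can be verified against the sample data with a short sketch. It uses SQLite via Python's sqlite3 module as an assumed stand-in for the asker's database, and adds an ORDER BY that is not in the answer so the rows come back in a fixed order.

```python
import sqlite3

# Sample rows from the question:
# (DebitDate, DebitTypeID, DebitPrice, DebitQuantity).
ROWS = [
    (40577, 1, 50, 3),
    (40577, 1, 100, 1),
    (40577, 2, 75, 2),
    (40578, 1, 50, 2),
    (40578, 2, 150, 2),
]

QUERY = """
SELECT D.DebitDate,
       D.DebitTypeID,
       SUM(D.DebitPrice * D.DebitQuantity) AS TotalSum,
       -- distinct debit types recorded on the same date as the outer row
       (SELECT COUNT(DISTINCT E.DebitTypeID)
        FROM DebitsList AS E
        WHERE E.DebitDate = D.DebitDate) AS CountDebits
FROM DebitsList AS D
GROUP BY D.DebitDate, D.DebitTypeID
ORDER BY D.DebitDate, D.DebitTypeID
"""

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE DebitsList "
    "(DebitDate INTEGER, DebitTypeID INTEGER, "
    "DebitPrice INTEGER, DebitQuantity INTEGER)"
)
conn.executemany("INSERT INTO DebitsList VALUES (?, ?, ?, ?)", ROWS)
result = conn.execute(QUERY).fetchall()
conn.close()

for row in result:
    print(row)
```

Grouping by DebitDate and DebitTypeID only is what lets the two type-1 rows of 40577 collapse into a single total of (50*3)+(100*1) = 250.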

I think this can help you.
SELECT DebitDate, SUM(DebitPrice*DebitQuantity) AS TotalSum, Count(DebitDate) as DebitDateCount
FROM DebitsList where DebitTypeID = 1
GROUP BY DebitDate
