Need to count rows until a certain condition is given - sql

I have a table with ID and a flag
ID      flag  month
user_1  YES   2022-10-01
user_1  YES   2022-09-01
user_1  NO    2022-07-01
user_1  YES   2022-06-01
user_1  YES   2022-05-01
user_1  YES   2022-04-01
user_2  YES   2022-10-01
user_2  YES   2022-09-01
user_2  YES   2022-08-01
user_2  NO    2022-06-01
user_2  YES   2022-05-01
user_2  YES   2022-04-01
I want to count all the "YES" values, starting from the most recent month, but only until the first "NO" is found.
In this case I want to get something like:
ID      count
user_1  2
user_2  3

Consider the approach below (BigQuery syntax):
select id, count(*) `count` from (
select * from your_table
qualify countif(flag='NO') over(partition by id order by month desc) = 0
)
group by id
If applied to the sample data in your question, the output matches the expected result (user_1: 2, user_2: 3).
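QUALIFY and COUNTIF are not available in every database, so here is a minimal sketch of the same running-count idea using only a standard window function. It assumes SQLite 3.25 or newer (driven from Python's sqlite3 module) and uses the hypothetical alias names no_seen and yes_count; it is an illustration, not the answerer's exact query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (id TEXT, flag TEXT, month TEXT)")
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?, ?)",
    [
        ("user_1", "YES", "2022-10-01"), ("user_1", "YES", "2022-09-01"),
        ("user_1", "NO", "2022-07-01"), ("user_1", "YES", "2022-06-01"),
        ("user_1", "YES", "2022-05-01"), ("user_1", "YES", "2022-04-01"),
        ("user_2", "YES", "2022-10-01"), ("user_2", "YES", "2022-09-01"),
        ("user_2", "YES", "2022-08-01"), ("user_2", "NO", "2022-06-01"),
        ("user_2", "YES", "2022-05-01"), ("user_2", "YES", "2022-04-01"),
    ],
)

query = """
SELECT id, COUNT(*) AS yes_count
FROM (
    SELECT id,
           -- running number of 'NO' rows seen so far, newest month first
           SUM(CASE WHEN flag = 'NO' THEN 1 ELSE 0 END)
               OVER (PARTITION BY id ORDER BY month DESC) AS no_seen
    FROM your_table
)
WHERE no_seen = 0          -- keep only rows that come before the first 'NO'
GROUP BY id
ORDER BY id
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('user_1', 2), ('user_2', 3)]
```

The filter on no_seen plays the role of the QUALIFY clause: it is applied after the window function has been computed in the subquery.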

You can first find the latest date on which a NO occurs for each ID, and then count only the rows whose date is later than that date.
WITH CTE AS (SELECT `ID`, MAX(`month`) AS max_date FROM Table1 WHERE `flag` = 'NO' GROUP BY `ID`)
SELECT t1.`ID`, COUNT(*) FROM Table1 t1 LEFT JOIN CTE c ON t1.`ID` = c.`ID`
-- the date filter belongs in WHERE: inside the ON of a LEFT JOIN it would not remove any row of Table1
WHERE c.max_date IS NULL OR t1.`month` > c.max_date
GROUP BY t1.`ID`
ID      COUNT(*)
user_1  2
user_2  3
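A small runnable sketch of the latest-NO-date idea, with the date comparison placed in the WHERE clause so that IDs without any NO row are still counted. It assumes SQLite via Python's sqlite3 module; user_3 is a hypothetical extra user (not in the question) who never has a NO, and last_no and yes_count are illustrative names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (ID TEXT, flag TEXT, month TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        ("user_1", "YES", "2022-10-01"), ("user_1", "YES", "2022-09-01"),
        ("user_1", "NO", "2022-07-01"), ("user_1", "YES", "2022-06-01"),
        ("user_3", "YES", "2022-10-01"),  # hypothetical user with no NO row
    ],
)

query = """
WITH last_no AS (
    SELECT ID, MAX(month) AS max_date
    FROM Table1
    WHERE flag = 'NO'
    GROUP BY ID
)
SELECT t1.ID, COUNT(*) AS yes_count
FROM Table1 t1
LEFT JOIN last_no c ON t1.ID = c.ID
-- max_date is NULL for IDs that never had a 'NO', so all their rows count
WHERE c.max_date IS NULL OR t1.month > c.max_date
GROUP BY t1.ID
ORDER BY t1.ID
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('user_1', 2), ('user_3', 1)]
```

Note that an ID whose most recent row is a NO has no qualifying rows and therefore does not appear in the result at all.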

Related

How to write SQLite code to show a specified code along with its associated codes from group_concat function?

I am trying to create a table that includes the code N09 for every student who was assigned a set of codes containing N09 and whose "Status Complete" is yes. I wanted to use group_concat to see whether each set contains N09. I saw a similar question, but its answer did not produce my goal for Table 2. The problem is that the count is always 1 instead of 2 or 3, and group_concat shows only N09 instead of N09 together with the other codes in the set. Is there a way to achieve my goal for Table 2 in SQLite? If my question is not clear, feel free to comment, as I am new here.
Goal for Table 2:
Student ID  Status Complete  Status Date  Status Time  Code  Count  Group_Concat(Code)
1           yes              03/03/2021   00:00:00     N09   1      N09
2           yes              03/04/2021   10:03:10     N09   2      N09, M33
3           yes              03/04/2021   01:00:10     N09   3      N09, Y03, B55
Problem:
Student ID  Status Complete  Status Date  Status Time  Code  Count  Group_Concat(Code)
1           yes              03/03/2021   00:00:00     N09   1      N09
2           yes              03/04/2021   10:03:10     N09   1      N09
3           yes              03/04/2021   01:00:10     N09   1      N09
Sample Data:
Student ID  Status Complete  Status Date  Status Time  Code
1           yes              03/03/2021   00:00:00     N09
2           yes              03/04/2021   10:03:10     N09
2           yes              03/04/2021   10:03:10     M33
3           yes              03/04/2021   01:00:10     N09
3           yes              03/04/2021   01:00:10     Y03
3           yes              03/04/2021   01:00:10     B55
Code:
CREATE TABLE table2 AS
select Student_ID
,Status_Complete
,Status_Date
,Status_TIME
,Code
,count(Code) /*over (partition by Student_ID,Code)*/ as 'Count'
,GROUP_CONCAT(Code)
from table1
where Code in ('N09') AND Status_Complete = 'yes'
group by Student_ID, Status_Date, Status_TIME, 'Count'
HAVING 'Count'> 0
ORDER BY Student_ID;
You should group by Student_ID only since you want only 1 row for each student.
The columns Status_Date and Status_TIME of the results that you want seem to be the min values of each student (I assume that the dates have the proper format of YYYY-mm-dd which is the only valid date format for SQLite).
Also, the condition Code = 'N09' should be checked in the HAVING clause:
CREATE TABLE table2 AS
SELECT Student_ID, Status_Complete,
MIN(Status_Date) Status_Date,
TIME(MIN(Status_Date || ' ' || Status_TIME)) Status_TIME,
COUNT(*) count,
GROUP_CONCAT(Code) Codes
FROM table1
WHERE Status_Complete = 'yes'
GROUP BY Student_ID
HAVING SUM(Code = 'N09') > 0
ORDER BY Student_ID;
See the demo.
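For readers without access to the demo, here is a minimal sketch of the HAVING-based filter, run through Python's sqlite3 module. It assumes the dates are stored as YYYY-mm-dd (as the answer presumes), leaves out the date and time columns from the SELECT for brevity, and adds a hypothetical student 4 who has no N09 code to show that such a student is filtered out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (Student_ID INTEGER, Status_Complete TEXT, "
    "Status_Date TEXT, Status_TIME TEXT, Code TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?)",
    [
        (1, "yes", "2021-03-03", "00:00:00", "N09"),
        (2, "yes", "2021-03-04", "10:03:10", "N09"),
        (2, "yes", "2021-03-04", "10:03:10", "M33"),
        (3, "yes", "2021-03-04", "01:00:10", "N09"),
        (3, "yes", "2021-03-04", "01:00:10", "Y03"),
        (3, "yes", "2021-03-04", "01:00:10", "B55"),
        (4, "yes", "2021-03-05", "09:00:00", "M33"),  # hypothetical: no N09
    ],
)

query = """
SELECT Student_ID,
       COUNT(*) AS code_count,
       GROUP_CONCAT(Code) AS codes
FROM table1
WHERE Status_Complete = 'yes'
GROUP BY Student_ID
-- (Code = 'N09') evaluates to 1 or 0 in SQLite, so the sum counts N09 rows
HAVING SUM(Code = 'N09') > 0
ORDER BY Student_ID
"""
rows = conn.execute(query).fetchall()

# GROUP_CONCAT does not guarantee an order, so sort the codes before comparing.
result = {sid: (n, sorted(codes.split(","))) for sid, n, codes in rows}
print(result)
# {1: (1, ['N09']), 2: (2, ['M33', 'N09']), 3: (3, ['B55', 'N09', 'Y03'])}
```

Because the N09 test sits in HAVING rather than WHERE, the other codes of each student survive the filter and show up in both the count and the concatenated list.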
Never use single quotes for column names.
'Count' is a string literal when used in code. It never refers to a column alias.
The WHERE clause you have excludes every row whose code is not N09, so the other codes never reach the count or GROUP_CONCAT. Switch to an EXISTS clause instead.
As Lennart points out, the HAVING is redundant here, as every row will now have a count of at least 1.
CREATE TABLE table2 AS
select Student_ID
,Status_Complete
,Status_Date
,Status_TIME
,Code
,count(Code) /*over (partition by Student_ID,Code)*/ as "Count"
,GROUP_CONCAT(Code)
from table1 t1
where EXISTS (SELECT 1 FROM table1 WHERE Code in ('N09') AND Status_Complete = 'yes' AND Student_ID = t1.Student_ID)
group by Student_ID, Status_Date, Status_TIME
ORDER BY Student_ID;

Get last non null value columnwise where column is sorted by date

sqlfiddle
select *
from example;
edate userid status
2022-05-01 abc123 true
2022-05-02 abc123 (null)
2022-05-03 abc123 (null)
2022-05-04 abc123 (null)
2022-05-05 abc123 false
2022-05-06 abc123 (null)
2022-05-07 abc123 (null)
2022-05-08 abc123 (null)
2022-05-09 abc123 true
2022-05-10 abc123 (null)
I want to write a new field, 'status_backfilled', based on the most recent data point for a userid.
In the example data, the user's status is true on May 1st, then null until May 5th. So I would like the new field to be true from May 1st through May 4th. Then the status switches to false. This value is unchanged until May 9th, so I want false from May 5th through May 8th, then true again.
Desired output:
select *
from example_desired;
edate userid status_backfilled
2022-05-01 abc123 true
2022-05-02 abc123 true
2022-05-03 abc123 true
2022-05-04 abc123 true
2022-05-05 abc123 false
2022-05-06 abc123 false
2022-05-07 abc123 false
2022-05-08 abc123 false
2022-05-09 abc123 true
2022-05-10 abc123 true
How can I columnwise coalesce to get the most recent non null status for a user where data are sorted, in this case by date?
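Actually, even better:
select e1.edate, e1.userId, coalesce(e1.status, t.status) as status
from example e1
-- left join lateral keeps rows that have no earlier non-null status (e.g. the first date);
-- a cross join lateral would drop them
left join lateral (
select status from example e2
where e1.userid = e2.userid
and e1.edate > e2.edate
and e2.status is not null
order by e2.edate desc limit 1
) t on true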
fiddle
Here is another way:
with cte as (
select e.* ,e_s.edate s_edate, e_s.status s_status , row_number() over (partition by e.userid,e.edate order by e_s.edate desc) rn
from example e
left join (
select *
from example
where status is not null
) e_s on e.userid = e_s.userid
and e_s.edate < e.edate
)
select edate, userId, coalesce(status, s_status) as status
from cte where rn = 1
You can achieve your desired result by using a few window functions. Each non-null status starts a new group, and the groups are computed per user:
WITH grp AS (SELECT edate, userid, status,
                    CASE WHEN status IS NULL THEN 0
                         ELSE ROW_NUMBER() OVER(PARTITION BY userid ORDER BY edate)
                    END RN
             FROM example
),
grp_sum AS (SELECT edate, userid, status,
                   SUM(RN) OVER(PARTITION BY userid ORDER BY edate) grp_sum
            FROM grp
)
SELECT edate, userid,
       FIRST_VALUE(status) OVER(PARTITION BY userid, grp_sum ORDER BY status NULLS LAST) status_backfilled
FROM grp_sum;
Demo.
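A shorter variant of the same grouping idea, sketched here as an alternative rather than as any answerer's method: COUNT(status) ignores NULLs, so a running COUNT only increases on rows that carry a value, and every row then shares a group number with the last non-null row before it. The sketch assumes SQLite 3.25 or newer via Python's sqlite3 module and stores status as 1/0/NULL because SQLite has no boolean type; grp is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (edate TEXT, userid TEXT, status INTEGER)")

# Same pattern as the question: true, 3 nulls, false, 3 nulls, true, null.
statuses = [1, None, None, None, 0, None, None, None, 1, None]
conn.executemany(
    "INSERT INTO example VALUES (?, ?, ?)",
    [(f"2022-05-{day:02d}", "abc123", s) for day, s in enumerate(statuses, start=1)],
)

query = """
WITH grouped AS (
    SELECT edate, userid, status,
           -- running count of non-null statuses: acts as a group id
           COUNT(status) OVER (PARTITION BY userid ORDER BY edate) AS grp
    FROM example
)
SELECT edate, userid,
       -- each group holds at most one non-null status; MAX skips the NULLs
       MAX(status) OVER (PARTITION BY userid, grp) AS status_backfilled
FROM grouped
ORDER BY userid, edate
"""
rows = conn.execute(query).fetchall()
print([r[2] for r in rows])  # [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
```

In PostgreSQL, where status is a real boolean, MAX is not defined for that type, so the outer aggregate would need to be swapped for something that accepts booleans (for example the FIRST_VALUE form used in the answer above).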

Calculate account balance history in PostgreSQL

I am trying to get a balance history on the account using SQL. My table in PostgreSQL looks like this:
id sender_id recipient_id amount_money
--- ----------- ---------------------- -----------------
1 1 2 60.00
2 1 2 15.00
3 2 1 35.00
So the user with id number 2 currently has 40 dollars in their account.
I would like to get this result using SQL:
[60, 75, 40]
Is it possible to do something like this using SQL in Postgres?
To get a rolling balance, you can SUM the amounts (up to and including the current row) based on whether the id was the recipient or sender:
SELECT id, sender_id, recipient_id, amount_money,
SUM(CASE WHEN recipient_id = 2 THEN amount_money
WHEN sender_id = 2 THEN -amount_money
END) OVER (ORDER BY id) AS balance
FROM transactions
Output:
id sender_id recipient_id amount_money balance
1 1 2 60.00 60.00
2 1 2 15.00 75.00
3 2 1 35.00 40.00
If you want an array, you can use array_agg with the above query as a derived table:
SELECT array_agg(balance)
FROM (
SELECT SUM(CASE WHEN recipient_id = 2 THEN amount_money
WHEN sender_id = 2 THEN -amount_money
END) OVER (ORDER BY id) AS balance
FROM transactions
) t
Output:
[60,75,40]
Demo on dbfiddle
If you want to be more sophisticated and support balances for multiple accounts, you need to split the initial data into account ids, adding when the id is the recipient and subtracting when the sender. You can use CTEs to generate the appropriate data:
WITH trans AS (
SELECT id, sender_id AS account_id, -amount_money AS amount
FROM transactions
UNION ALL
SELECT id, recipient_id AS account_id, amount_money AS amount
FROM transactions
),
balances AS (
SELECT id, account_id, ABS(amount),
SUM(amount) OVER (PARTITION BY account_id ORDER BY id) AS balance
FROM trans
)
SELECT account_id, ARRAY_AGG(balance) AS bal_array
FROM balances
GROUP BY account_id
Output:
account_id bal_array
1 [-60,-75,-40]
2 [60,75,40]
Demo on dbfiddle
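If the fiddle is not reachable, the multi-account idea can be tried locally with the sketch below. It assumes SQLite 3.25 or newer through Python's sqlite3 module rather than PostgreSQL; SQLite has no array_agg, so the per-account lists are collected in Python instead, and history is an illustrative name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions "
    "(id INTEGER, sender_id INTEGER, recipient_id INTEGER, amount_money REAL)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [(1, 1, 2, 60.0), (2, 1, 2, 15.0), (3, 2, 1, 35.0)],
)

query = """
WITH trans AS (
    -- one row per account per transaction: negative for the sender ...
    SELECT id, sender_id AS account_id, -amount_money AS amount
    FROM transactions
    UNION ALL
    -- ... and positive for the recipient
    SELECT id, recipient_id AS account_id, amount_money AS amount
    FROM transactions
)
SELECT account_id, id,
       SUM(amount) OVER (PARTITION BY account_id ORDER BY id) AS balance
FROM trans
ORDER BY account_id, id
"""

# Collect the running balances per account (stand-in for ARRAY_AGG).
history = {}
for account_id, _tx_id, balance in conn.execute(query):
    history.setdefault(account_id, []).append(balance)

print(history)  # {1: [-60.0, -75.0, -40.0], 2: [60.0, 75.0, 40.0]}
```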

UNION to one Line with Only 'Yes' Values

I have a table that in an ideal world should only return 1 row per 'policy' for items that were sold as part of an up-sell.
I wish to roll this up into one line per 'PolRef#'; a 'Yes' should supersede a 'No' wherever it exists in a column.
B# PolRef# Uk Eu Date Ep500 Ep700 Ep3000 Keycare Wind Ep350 Ep250 Legal Totaladdon Finance_yn
2 ROGX17PC01 Yes No 2017-07-31 00:00:00.000 No No No No No NULL NULL NULL 62.00 Yes
2 ROGX17PC01 No No 2017-07-31 00:00:00.000 No No No No No NULL NULL Yes 32.00 Yes
This is an example. I know I could GROUP BY PolRef# and then SUM the Totaladdon, but how can I make it so that if 'Yes' exists in a column for that 'PolRef#', the result shows 'Yes'?
Essentially, the result for the rows above should look like this:
B# PolRef# Agent Uk Eu Date Ep500 Ep700 Ep3000 Keycare Wind Ep350 Ep250 Legal Totaladdon Finance_yn
2 ROGX17PC01 NULL Yes No 2017-07-31 00:00:00.000 No No No No No NULL NULL Yes 94.00 Yes
This is a prioritization query. One method uses row_number():
select t.*
from (select t.*,
row_number() over (partition by PolRef#
order by uk desc -- 'Yes' comes before 'No'
) as seqnum
from t
) t
where seqnum = 1;
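Note that the row_number() query keeps one whole row per PolRef#, so a 'Yes' that only appears in another row (Legal in the sample) and the summed Totaladdon are not carried over. A possible alternative, sketched here under assumptions, is plain aggregation: because the string 'Yes' sorts after 'No', MAX() per column returns 'Yes' whenever one exists. The sketch runs on SQLite via Python's sqlite3 module and uses a hypothetical table named policies with only a few of the question's columns, renamed to lowercase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE policies "
    "(polref TEXT, uk TEXT, eu TEXT, legal TEXT, totaladdon REAL)"
)
conn.executemany(
    "INSERT INTO policies VALUES (?, ?, ?, ?, ?)",
    [
        ("ROGX17PC01", "Yes", "No", None, 62.0),
        ("ROGX17PC01", "No", "No", "Yes", 32.0),
    ],
)

query = """
SELECT polref,
       MAX(uk)         AS uk,      -- 'Yes' > 'No', so MAX keeps a 'Yes' if present
       MAX(eu)         AS eu,
       MAX(legal)      AS legal,   -- MAX ignores NULLs
       SUM(totaladdon) AS totaladdon
FROM policies
GROUP BY polref
"""
rows = conn.execute(query).fetchall()
print(rows)  # [('ROGX17PC01', 'Yes', 'No', 'Yes', 94.0)]
```

This relies on the columns holding exactly 'Yes', 'No' or NULL; other spellings or a case-insensitive collation with additional values would need an explicit CASE expression instead.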

count and group data from 2 column on single table

I have a problem when counting and grouping data from 2 columns of a single table.
My table structure:
id, price, user_1, user_2
Data sample:
001 500 bergkamp cech
002 100 cech ljungberg
003 200 viera henry
004 300 bergkamp pires
005 200 lauren bergkamp
My query:
SELECT
user_1,user_2,
count(user_1) as total1,
count(user_2) as total2
FROM
sales
group by user_1 and user_2
The result is not what I want.
I want the output to look like this:
bergkamp 3
henry 1
cech 2
ljungberg 1
lauren 1
pires 1
viera 1
Any help would be much appreciated, thanks.
Put both user columns into one with a UNION. Then group by that temp table result and count the names
select user_name, count(*)
from
(
SELECT user_1 as user_name FROM sales
union all
SELECT user_2 FROM sales
) tmp
group by user_name
You can use a UNION:
SELECT t.user, COUNT(*) AS total
FROM
(
SELECT user_1 AS user
FROM sales
UNION ALL
SELECT user_2
FROM sales
) t
GROUP BY t.user
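A quick way to check the UNION ALL approach against the sample data is the sketch below, which assumes SQLite through Python's sqlite3 module (the question does not name its database). The ORDER BY is an addition for a stable result and is not part of either answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id TEXT, price INTEGER, user_1 TEXT, user_2 TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("001", 500, "bergkamp", "cech"),
        ("002", 100, "cech", "ljungberg"),
        ("003", 200, "viera", "henry"),
        ("004", 300, "bergkamp", "pires"),
        ("005", 200, "lauren", "bergkamp"),
    ],
)

query = """
SELECT user_name, COUNT(*) AS total
FROM (
    -- stack both user columns into one; UNION ALL keeps duplicates,
    -- which is what makes the counts correct
    SELECT user_1 AS user_name FROM sales
    UNION ALL
    SELECT user_2 FROM sales
) AS tmp
GROUP BY user_name
ORDER BY total DESC, user_name
"""
counts = dict(conn.execute(query).fetchall())
print(counts)
# {'bergkamp': 3, 'cech': 2, 'henry': 1, 'lauren': 1, 'ljungberg': 1, 'pires': 1, 'viera': 1}
```

Using plain UNION instead of UNION ALL would remove duplicate names before counting, and every total would come out as 1.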