I am working on a problem where I have to find the length of the longest streak per customer, but to get the exact result I also have to count the row where the streak breaks. My table looks like this:
+-----------------+--------+-------+
| customer_number | Months | Flags |
+-----------------+--------+-------+
| 1 | 12 | 1 |
| 1 | 1 | 1 |
| 1 | 2 | 1 |
| 1 | 3 | 1 |
| 1 | 4 | 1 |
| 1 | 5 | 1 |
| 1 | 8 | 1 |
| 1 | 9 | 1 |
| 1 | 10 | 1 |
| 1 | 11 | 1 |
| 6 | 12 | 1 |
| 6 | 1 | 1 |
| 6 | 2 | 1 |
| 6 | 3 | 1 |
| 6 | 4 | 1 |
| 6 | 5 | 4 |
| 6 | 9 | 1 |
| 6 | 10 | 1 |
| 6 | 11 | 1 |
| 7 | 5 | 1 |
| 8 | 9 | 1 |
| 8 | 10 | 1 |
| 8 | 11 | 1 |
| 9 | 9 | 1 |
| 9 | 10 | 1 |
| 9 | 11 | 1 |
| 10 | 11 | 1 |
+-----------------+--------+-------+
and my desired output is
+----------+--------------------+
| Customer | Consecutive streak |
+----------+--------------------+
| 1 | 10 |
| 6 | 6 |
| 7 | 1 |
| 8 | 3 |
| 9 | 3 |
| 10 | 1 |
+----------+--------------------+
the code I have
SELECT customer_number, max(streak) max_consecutive_streak FROM (
    SELECT customer_number, COUNT(*) as streak
    FROM
        (select *,
            (row_number() over (order by customer_number) -
             row_number() over (order by customer_number)
            ) as counts
         from table1
        ) cc
    group by customer_number, counts
)
GROUP BY 1;
It works well, except that for customer_number 6 it returns 5 and I want it to return 6. In other words, the row with flag 4 should also be counted in the longest streak, because that is the point where the streak breaks. Any idea how I can achieve that?
You can use a cte with row_number:
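with cte(r, id, flag) as (
   -- columns listed explicitly (names taken from the question's table) so they match cte(r, id, flag);
   -- c.* would also return Months and give four columns for three names
   select row_number() over (order by c.customer_number), c.customer_number, c.flags from customers c
),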
freq(id, t, f) as (
select c2.id, c2.f, count(*) from
(select c.id, (select sum(c1.flag!=c.flag) from cte c1 where c1.id=c.id and c1.r <= c.r) f from cte c)
c2 group by c2.id, c2.f
)
select id, max(f) from freq group by id;
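For reference, the rule described in the question (count the run of flag 1 rows plus the row that ends it) can be sketched outside the database. This is only a minimal Python illustration, not a replacement for the SQL: it assumes the rows are processed in the order the question's table lists them, that a streak is a run of flag = 1, and that a breaking row counts only when it ends a non-empty run. The function name `longest_streak` is made up for this sketch.

```python
from itertools import groupby

# (customer_number, months, flags) rows copied from the question, in table order.
rows = [
    (1, 12, 1), (1, 1, 1), (1, 2, 1), (1, 3, 1), (1, 4, 1),
    (1, 5, 1), (1, 8, 1), (1, 9, 1), (1, 10, 1), (1, 11, 1),
    (6, 12, 1), (6, 1, 1), (6, 2, 1), (6, 3, 1), (6, 4, 1),
    (6, 5, 4), (6, 9, 1), (6, 10, 1), (6, 11, 1),
    (7, 5, 1),
    (8, 9, 1), (8, 10, 1), (8, 11, 1),
    (9, 9, 1), (9, 10, 1), (9, 11, 1),
    (10, 11, 1),
]


def longest_streak(flags, streak_flag=1):
    """Longest run of streak_flag, counting the row that breaks the run."""
    best = current = 0
    for flag in flags:
        if flag == streak_flag:
            current += 1
            best = max(best, current)
        else:
            # Assumption: the breaking row is added only if it ends a non-empty run.
            if current:
                best = max(best, current + 1)
            current = 0
    return best


# groupby relies on the rows of each customer being adjacent, as they are above.
streaks = {
    customer: longest_streak(flag for _, _, flag in group)
    for customer, group in groupby(rows, key=lambda row: row[0])
}
print(streaks)  # {1: 10, 6: 6, 7: 1, 8: 3, 9: 3, 10: 1}
```

The result matches the desired output table in the question, including 6 for customer 6.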
I'm working with pharmacy data and I'm trying to rank the use of three specific medications (A, B, C) amongst a large group of patients. In short, I want to figure out the top 12 combinations of these meds that people are using. So for instance, patient 1 might take meds A + B, patient 2 takes A + C, patient 3 takes B + C, patient 4 takes A + B, and so forth. I did some digging and there are 25 possible combinations to rank. I want my output to look something like this:
The tables I'm working with look like this:
Currently I'm breaking the drugs up into different combination groups by doing something like this:
select distinct concat(substance_name, dosage, unit) as Drug_Dose_Combo,
count(distinct user_id) as Patients
from pharmacy_data a join drug_reference_table b
on a.drug_code=b.drug_code
group by 1
order by 2 desc
However, this seems very inefficient so I'm looking for a better way of building this out. I don't necessarily need to use a rank() here, I just want the output to look similar to what I've outlined above.
Maybe something like (Untested):
WITH meds_taken AS
  -- max() turns each column into a 0/1 presence flag per user, so repeat fills
  -- of the same drug do not split one combination into several groups
  (SELECT max(CASE WHEN d.drug_name = :namea THEN 1 ELSE 0 END) AS drug_a
        , max(CASE WHEN d.drug_name = :nameb THEN 1 ELSE 0 END) AS drug_b
        , max(CASE WHEN d.drug_name = :namec THEN 1 ELSE 0 END) AS drug_c
   FROM pharmacy_data AS p
   JOIN drug_reference AS d ON p.drug_code = d.drug_code
   GROUP BY p.user_id)
, med_counts AS
  (SELECT drug_a, drug_b, drug_c, count(*) AS "user total"
   FROM meds_taken
   GROUP BY drug_a, drug_b, drug_c)
SELECT rank() OVER (ORDER BY "user total" DESC) AS rank
     , drug_a, drug_b, drug_c, "user total"
FROM med_counts
ORDER BY "user total" DESC;
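To make the idea behind this query concrete, here is a small Python sketch of the same three steps: reduce each user to the set of drugs they take, count users per set, then rank the sets by user count. The `fills` data is hypothetical; it follows the patients described in the question, with one repeated fill added for patient 4 to show why the per-user step has to deduplicate.

```python
from collections import Counter

# Hypothetical (user_id, drug_name) fills; patient 4 fills drug A twice.
fills = [
    (1, "A"), (1, "B"),
    (2, "A"), (2, "C"),
    (3, "B"), (3, "C"),
    (4, "A"), (4, "B"), (4, "A"),
]

# Step 1: one set of drugs per user (the per-user GROUP BY).
meds_by_user = {}
for user_id, drug in fills:
    meds_by_user.setdefault(user_id, set()).add(drug)

# Step 2: number of users per combination.
combo_counts = Counter(tuple(sorted(meds)) for meds in meds_by_user.values())

# Step 3: rank() semantics, i.e. 1 + number of combinations with strictly more users.
ranked = []
for combo, users in sorted(combo_counts.items(), key=lambda kv: (-kv[1], kv[0])):
    rank = 1 + sum(1 for n in combo_counts.values() if n > users)
    ranked.append((rank, combo, users))

for row in ranked:
    print(row)
# (1, ('A', 'B'), 2)
# (2, ('A', 'C'), 1)
# (2, ('B', 'C'), 1)
```

Ties share a rank here, as they would with `rank()`; if you need exactly 12 rows you would have to pick a tie-breaking rule.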
Alright, it's not entirely clear what you are looking for, but you did indicate that you want to perform some sort of frequency analysis based on combinations of up to three pharmaceutical products.
The first step in an analysis like this is to take the pharmacy data and, for each user_id, determine the sets of 1, 2, and 3 drug_dose combinations that they participate in. However, since you may want to do the same analysis on the substance_name, drug_name, and/or drug_code, I'm going to throw the kitchen sink at it and do all four. Not knowing what sort of DB you have on the back end, I'm going to use SQL Server 2017 for this example; the concepts apply to DBs such as Oracle, MySQL, PostgreSQL and others, though the syntax may differ.
To create the drug_code and other combinations I'll first join the pharmacy_data table to the drug_reference table and then use a recursive query on the composite data:
with usage_info as (
select pd.user_id
, dr.drug_code
, dr.drug_name
, dr.substance_name
, concat(dr.substance_name,dr.dosage,dr.unit) drug_dose
from pharmacy_data pd
join drug_reference dr
on dr.drug_code = pd.drug_code
), recur(user_id, combo_id, dc_combo, dc_combo_size, dn_combo, sn_combo, dd_combo, last_dc) as (
-- Anchor part
select user_id
, cast(cast(drug_code as binary(4)) as varbinary(max))
, cast(drug_code as varchar(max))
, 1
, cast(drug_name as varchar(max))
, cast(substance_name as varchar(max))
, cast(drug_dose as varchar(max))
, drug_code
from usage_info
union all
-- Recursive Part
select prev.user_id
, prev.combo_id+cast(curr.drug_code as binary(4))
, prev.dc_combo+','+cast(curr.drug_code as varchar(max))
, prev.dc_combo_size+1
, prev.dn_combo+','+curr.drug_name
, prev.sn_combo+','+curr.substance_name
, prev.dd_combo+','+curr.drug_dose
, curr.drug_code
from recur prev
join usage_info curr
on prev.user_id = curr.user_id
and prev.last_dc < curr.drug_code
and prev.dc_combo_size < 3 -- Maximum combination size
)
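If the recursion is hard to follow, the same enumeration can be sketched in Python. The join condition `prev.last_dc < curr.drug_code` means each combination is built in ascending drug_code order, so every subset appears exactly once, and `dc_combo_size < 3` caps the size at three. The per-user drug codes below are reconstructed from the result table further down; the helper name `drug_code_combos` is made up for this sketch.

```python
from itertools import combinations

# drug_codes per user_id, reconstructed from the result table in this answer.
drug_codes_by_user = {
    1: [25, 50, 100],
    3: [200],
    175: [50, 300],
    201: [350, 450],
    378: [100, 400],
    999: [50, 100, 200, 350],
}

MAX_COMBO_SIZE = 3  # mirrors "prev.dc_combo_size < 3"


def drug_code_combos(codes, max_size=MAX_COMBO_SIZE):
    """Yield every ascending combination of 1..max_size distinct drug codes."""
    ordered = sorted(set(codes))
    for size in range(1, max_size + 1):
        yield from combinations(ordered, size)


combos = [
    (user_id, combo)
    for user_id, codes in drug_codes_by_user.items()
    for combo in drug_code_combos(codes)
]

print(len(combos))  # 31
print([c for u, c in combos if u == 1])
# [(25,), (50,), (100,), (25, 50), (25, 100), (50, 100), (25, 50, 100)]
```

The 31 combinations correspond to the 31 rows of the result table.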
Selecting from the above common table expressions for the data provided in your question:
select * from recur;
shows some irregularities in the groupings for the dn_combo, sn_combo, and possibly the dd_combo columns. For example, there are dn_combos for both 'CAZERTA,BEXERA' and 'BEXERA,CAZERTA', which really should be equivalent.
To rectify this I'll normalize the combinations by splitting them up and recombining them in sorted order. In the process I'll also deduplicate any instance where a user_id has two or more equivalent but not identical products, e.g. two different doses of the same medication:
, combos as (
select user_id
, combo_id
, dc_combo
, dc_combo_size
, -- Normalize and deduplicate Drug_Name combos
(select string_agg(value,',') within group (order by value)
from (select distinct value from string_split(dn_combo,',')) dn
) dn_combo
, (select count(distinct value) from string_split(dn_combo,',')) dn_combo_size
, -- Normalize and deduplicate Substance_Name combos
(select string_agg(value,',') within group (order by value)
from (select distinct value from string_split(sn_combo,',')) sn
) sn_combo
, (select count(distinct value) from string_split(sn_combo,',')) sn_combo_size
, -- Normalize and deduplicate Drug_Dose combos
(select string_agg(value,',') within group (order by value)
from (select distinct value from string_split(dd_combo,',')) ddc
) dd_combo
, (select count(distinct value) from string_split(dd_combo,',')) dd_combo_size
from recur
)
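The split, deduplicate, sort, and re-join step is the same for all three name columns. As a rough Python equivalent (the function name `normalize_combo` is made up for this sketch, and Python sorts by code point, which may differ from your SQL Server collation):

```python
def normalize_combo(combo):
    """Return (normalized combo string, number of distinct members)."""
    parts = sorted(set(combo.split(",")))
    return ",".join(parts), len(parts)


print(normalize_combo("CAZERTA,BEXERA"))         # ('BEXERA,CAZERTA', 2)
print(normalize_combo("BEXERA,CAZERTA"))         # ('BEXERA,CAZERTA', 2)
print(normalize_combo("BEXERA,BEXERA"))          # ('BEXERA', 1)
print(normalize_combo("BEXERA,CAZERTA,BEXERA"))  # ('BEXERA,CAZERTA', 2)
```

The third call is the two-doses-of-one-drug case: drug_codes 50 and 350 are both BEXERA, so the name-level combination collapses to size 1 even though dc_combo_size is 2.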
Now, while you could just select the count(user_id) over (partition by <grouping_column>) to get the occurrence frequency of each drug combination, those numbers could be inflated. Take for example an additional user_id of 999 with drug_codes 50, 100, 200, and 350 (that's two different doses of BEXERA along with AXIOM and CAZERTA): user_id 999 would then show up multiple times for every combination that includes BEXERA. Depending on your database flavor you could just select the count(DISTINCT user_id) over (partition by <grouping_column>), but SQL Server 2017 doesn't allow the distinct operator in analytic functions. </shrug> We can still do it; it just takes another step to identify the unique values per group. Enter the common table expression combo2, where we compute row numbers across various partitions:
, combo2 as (
select user_id
, combo_id
, dc_combo
, dc_combo_size
, row_number() over (partition by dc_combo, user_id order by dc_combo) dc_uid_rn
, dn_combo
, dn_combo_size
, row_number() over (partition by dn_combo, user_id order by dc_combo) dn_uid_rn
, row_number() over (partition by dn_combo, dc_combo order by user_id) dn_combo_rn
, sn_combo
, sn_combo_size
, row_number() over (partition by sn_combo, user_id order by dc_combo) sn_uid_rn
, row_number() over (partition by sn_combo, dc_combo order by user_id) sn_combo_rn
, dd_combo
, dd_combo_size
, row_number() over (partition by dd_combo, user_id order by dc_combo) dd_uid_rn
, row_number() over (partition by dd_combo, dc_combo order by user_id) dd_combo_rn
from combos
)
And then finally calculate our counts, of which we have two types. The uid_cnt columns are counts of distinct user_ids for each combination, and the combo_cnt columns indicate the number of distinct drug_code combinations that make up the less granular groupings:
select user_id
, combo_id
, dc_combo
, dc_combo_size
, count(case dc_uid_rn when 1 then 1 end) over (partition by dc_combo) dc_uid_cnt
, dn_combo
, dn_combo_size
, count(case dn_uid_rn when 1 then 1 end) over (partition by dn_combo) dn_uid_cnt
, count(case dn_combo_rn when 1 then 1 end) over (partition by dn_combo) dn_combo_cnt
, sn_combo
, sn_combo_size
, count(case sn_uid_rn when 1 then 1 end) over (partition by sn_combo) sn_uid_cnt
, count(case sn_combo_rn when 1 then 1 end) over (partition by sn_combo) sn_combo_cnt
, dd_combo
, dd_combo_size
, count(case dd_uid_rn when 1 then 1 end) over (partition by dd_combo) dd_uid_cnt
, count(case dd_combo_rn when 1 then 1 end) over (partition by dd_combo) dd_combo_cnt
from combo2
order by dn_combo, dd_combo
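The "count only the rows where row_number() = 1" trick is just a distinct count in disguise. Here is a minimal Python sketch using the seven BEXERA-only rows from the result table below; the helper name `count_first_rows` is made up for this illustration.

```python
from collections import Counter

# (user_id, dc_combo, dn_combo) for the rows whose dn_combo is just BEXERA.
rows = [
    (201, "350", "BEXERA"),
    (999, "350", "BEXERA"),
    (999, "50,350", "BEXERA"),
    (378, "400", "BEXERA"),
    (1, "50", "BEXERA"),
    (175, "50", "BEXERA"),
    (999, "50", "BEXERA"),
]


def count_first_rows(rows, partition):
    """Per dn_combo, count the rows that would get row_number() = 1 in their partition."""
    seen = set()
    counts = Counter()
    for row in rows:
        key = partition(row)
        if key not in seen:  # first row of this partition, i.e. rn = 1
            seen.add(key)
            counts[row[2]] += 1
    return counts


naive_cnt = Counter(dn for _, _, dn in rows)                 # count(user_id) over (...)
uid_cnt = count_first_rows(rows, lambda r: (r[2], r[0]))     # partition by dn_combo, user_id
combo_cnt = count_first_rows(rows, lambda r: (r[2], r[1]))   # partition by dn_combo, dc_combo

print(naive_cnt["BEXERA"], uid_cnt["BEXERA"], combo_cnt["BEXERA"])  # 7 5 4
```

The plain count gives 7 because user_id 999 appears three times, while the distinct counts give 5 users and 4 drug_code combinations, matching dn_uid_cnt and dn_combo_cnt for BEXERA in the table.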
All together with my additional sample data the above code results in the following table. To see it in action please see the SQL Fiddle:
| user_id | dc_combo | dc_combo_size | dc_uid_cnt | dn_combo | dn_combo_size | dn_uid_cnt | dn_combo_cnt | sn_combo | sn_combo_size | sn_uid_cnt | sn_combo_cnt | dd_combo | dd_combo_size | dd_uid_cnt | dd_combo_cnt |
|---------|-------------|---------------|------------|----------------------|---------------|------------|--------------|---------------------------------|---------------|------------|--------------|-------------------------------------------------|---------------|------------|--------------|
| 3 | 200 | 1 | 2 | AXIOM | 1 | 4 | 3 | nsaid | 1 | 4 | 3 | nsaid10mg | 1 | 2 | 1 |
| 999 | 200 | 1 | 2 | AXIOM | 1 | 4 | 3 | nsaid | 1 | 4 | 3 | nsaid10mg | 1 | 2 | 1 |
| 175 | 300 | 1 | 1 | AXIOM | 1 | 4 | 3 | nsaid | 1 | 4 | 3 | nsaid25mg | 1 | 1 | 1 |
| 1 | 25 | 1 | 1 | AXIOM | 1 | 4 | 3 | nsaid | 1 | 4 | 3 | nsaid5mg | 1 | 1 | 1 |
| 999 | 200,350 | 2 | 1 | AXIOM,BEXERA | 2 | 3 | 5 | nsaid,potassium | 2 | 3 | 5 | nsaid10mg,potassium12mg | 2 | 1 | 1 |
| 999 | 50,200,350 | 3 | 1 | AXIOM,BEXERA | 2 | 3 | 5 | nsaid,potassium | 2 | 3 | 5 | nsaid10mg,potassium12mg,potassium20mg | 3 | 1 | 1 |
| 999 | 50,200 | 2 | 1 | AXIOM,BEXERA | 2 | 3 | 5 | nsaid,potassium | 2 | 3 | 5 | nsaid10mg,potassium20mg | 2 | 1 | 1 |
| 175 | 50,300 | 2 | 1 | AXIOM,BEXERA | 2 | 3 | 5 | nsaid,potassium | 2 | 3 | 5 | nsaid25mg,potassium20mg | 2 | 1 | 1 |
| 1 | 25,50 | 2 | 1 | AXIOM,BEXERA | 2 | 3 | 5 | nsaid,potassium | 2 | 3 | 5 | nsaid5mg,potassium20mg | 2 | 1 | 1 |
| 999 | 100,200,350 | 3 | 1 | AXIOM,BEXERA,CAZERTA | 3 | 2 | 3 | nsaid,potassium,sodium chloride | 3 | 2 | 3 | nsaid10mg,potassium12mg,sodium chloride10mg | 3 | 1 | 1 |
| 999 | 50,100,200 | 3 | 1 | AXIOM,BEXERA,CAZERTA | 3 | 2 | 3 | nsaid,potassium,sodium chloride | 3 | 2 | 3 | nsaid10mg,potassium20mg,sodium chloride10mg | 3 | 1 | 1 |
| 1 | 25,50,100 | 3 | 1 | AXIOM,BEXERA,CAZERTA | 3 | 2 | 3 | nsaid,potassium,sodium chloride | 3 | 2 | 3 | nsaid5mg,potassium20mg,sodium chloride10mg | 3 | 1 | 1 |
| 999 | 100,200 | 2 | 1 | AXIOM,CAZERTA | 2 | 2 | 2 | nsaid,sodium chloride | 2 | 2 | 2 | nsaid10mg,sodium chloride10mg | 2 | 1 | 1 |
| 1 | 25,100 | 2 | 1 | AXIOM,CAZERTA | 2 | 2 | 2 | nsaid,sodium chloride | 2 | 2 | 2 | nsaid5mg,sodium chloride10mg | 2 | 1 | 1 |
| 201 | 350 | 1 | 2 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium12mg | 1 | 2 | 1 |
| 999 | 350 | 1 | 2 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium12mg | 1 | 2 | 1 |
| 999 | 50,350 | 2 | 1 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium12mg,potassium20mg | 2 | 1 | 1 |
| 378 | 400 | 1 | 1 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium15mg | 1 | 1 | 1 |
| 1 | 50 | 1 | 3 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium20mg | 1 | 3 | 1 |
| 175 | 50 | 1 | 3 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium20mg | 1 | 3 | 1 |
| 999 | 50 | 1 | 3 | BEXERA | 1 | 5 | 4 | potassium | 1 | 5 | 4 | potassium20mg | 1 | 3 | 1 |
| 999 | 50,100,350 | 3 | 1 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium12mg,potassium20mg,sodium chloride10mg | 3 | 1 | 1 |
| 999 | 100,350 | 2 | 1 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium12mg,sodium chloride10mg | 2 | 1 | 1 |
| 201 | 350,450 | 2 | 1 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium12mg,sodium chloride30mg | 2 | 1 | 1 |
| 378 | 100,400 | 2 | 1 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium15mg,sodium chloride10mg | 2 | 1 | 1 |
| 1 | 50,100 | 2 | 2 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium20mg,sodium chloride10mg | 2 | 2 | 1 |
| 999 | 50,100 | 2 | 2 | BEXERA,CAZERTA | 2 | 4 | 5 | potassium,sodium chloride | 2 | 4 | 5 | potassium20mg,sodium chloride10mg | 2 | 2 | 1 |
| 1 | 100 | 1 | 3 | CAZERTA | 1 | 4 | 2 | sodium chloride | 1 | 4 | 2 | sodium chloride10mg | 1 | 3 | 1 |
| 378 | 100 | 1 | 3 | CAZERTA | 1 | 4 | 2 | sodium chloride | 1 | 4 | 2 | sodium chloride10mg | 1 | 3 | 1 |
| 999 | 100 | 1 | 3 | CAZERTA | 1 | 4 | 2 | sodium chloride | 1 | 4 | 2 | sodium chloride10mg | 1 | 3 | 1 |
| 201 | 450 | 1 | 1 | CAZERTA | 1 | 4 | 2 | sodium chloride | 1 | 4 | 2 | sodium chloride30mg | 1 | 1 | 1 |