I have a PostgreSQL table containing: person_identifier, period_identifier, status
person | period | status
-------+--------+--------
Bob | Jan | new
Bob | Feb | retained
Bob | Mar | retained
Bob | Apr | dormant
Bob | May | dormant
Bob | Jun | resurected
Bob | Jul | retained
Bob | Agu | dormant
Jim | Jan | new
Jim | Feb | dormant
Jim | Mar | dormant
Jim | Apr | dormant
Jim | May | dormant
Jim | Jun | resurected
Jim | Jul | dormant
Jim | Agu | resurected
What I need is a counter per person and status, with the restriction that the counter resets to 1 whenever the status changes.
I tried the following query, but it doesn't reset the counter to 1 when the status changes:
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY person, status ORDER BY period) AS wrong_counter
FROM
my_table
Here's the difference between my query's output and what I actually need; * marks a wrong value:
person | period | status | wrong_counter | needed_counter
-------+--------+-------------+---------------+---------------
Bob | Jan | new | 1 | 1
Bob | Feb | retained | 1 | 1
Bob | Mar | retained | 2 | 2
Bob | Apr | dormant | 1 | 1
Bob | May | dormant | 2 | 2
Bob | Jun | resurected | 1 | 1
Bob | Jul | retained | 3* | 1
Bob | Agu | dormant | 3* | 1
Jim | Jan | new | 1 | 1
Jim | Feb | dormant | 1 | 1
Jim | Mar | dormant | 2 | 2
Jim | Apr | dormant | 3 | 3
Jim | May | dormant | 4 | 4
Jim | Jun | resurected | 1 | 1
Jim | Jul | dormant | 5* | 1
Jim | Agu | resurected | 2* | 1
Can anyone help me with this?
I did some normalisation:
person -> person_fk,
period -> period_int (Jan = 1, ...),
status -> status_fk
and used public.tbl_test as the base table.
Then I do some calculations: first, get the status of the row before the current row and flag whether the current row is a status change (status_change) or not.
Next, build a helper partition (help_partition) as a running sum of those flags, and finally take the row_number over help_partition.
with temp_base_data as
(
select
*,
lag(status_fk,1,-1) over(ORDER BY person_fk, period_int) as status_before,
case
when lag(status_fk,1,-1) over(ORDER BY person_fk, period_int) = status_fk and lag(person_fk ,1,-1) over(ORDER BY person_fk, period_int) = person_fk
then 0
else 1
end as status_change
from public.tbl_test
order by person_fk, period_int
),
temp_partition AS
(
select
*,
sum(status_change) over ( order by person_fk, period_int RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as help_partition
from temp_base_data
order by person_fk, period_int
)
select
* ,
row_number() over (PARTITION by help_partition order by person_fk, period_int) as counter
from temp_partition
order by id
This results in (the last column is the counter you need):
person period status id period_int person_fk status_fk status_before status_change help_partition counter
Bob Jan new 1 1 1 1 -1 1 1 1
Bob Feb retained 2 2 1 2 1 1 2 1
Bob Mar retained 3 3 1 2 2 0 2 2
Bob Apr dormant 4 4 1 3 2 1 3 1
Bob May dormant 5 5 1 3 3 0 3 2
Bob Jun resurected 6 6 1 4 3 1 4 1
Bob Jul retained 7 7 1 2 4 1 5 1
Bob Agu dormant 8 8 1 3 2 1 6 1
Jim Jan new 9 1 2 1 3 1 7 1
Jim Feb dormant 10 2 2 3 1 1 8 1
Jim Mar dormant 11 3 2 3 3 0 8 2
Jim Apr dormant 12 4 2 3 3 0 8 3
Jim May dormant 13 5 2 3 3 0 8 4
Jim Jun resurected 14 6 2 4 3 1 9 1
Jim Jul dormant 15 7 2 3 4 1 10 1
Jim Agu resurected 16 8 2 4 3 1 11 1
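A shorter variant of the same idea is possible if the window functions are partitioned by person, so the person comparison inside the CASE is no longer needed. This is only a sketch, assuming the normalised columns from above (person_fk, period_int, status_fk in public.tbl_test) and that status_fk is never NULL; run_no is a hypothetical name that plays the role of help_partition, numbered per person.

```sql
-- Sketch for PostgreSQL: restart the counter whenever the status changes.
-- Assumes public.tbl_test(person_fk, period_int, status_fk) as normalised above.
with flagged as (
    select
        t.*,
        -- 0 when the status equals the person's previous period, else 1.
        -- For a person's first row lag() is NULL, so the comparison is not
        -- true and the row is flagged as a change.
        case
            when lag(status_fk) over (partition by person_fk order by period_int) = status_fk
            then 0
            else 1
        end as status_change
    from public.tbl_test t
),
grouped as (
    select
        f.*,
        -- running total of changes = number of the current run within the person
        sum(status_change) over (partition by person_fk order by period_int) as run_no
    from flagged f
)
select
    g.*,
    row_number() over (partition by person_fk, run_no order by period_int) as counter
from grouped g
order by person_fk, period_int;
```

If status_fk can be NULL, the equality test would need to become `is not distinct from`, otherwise consecutive NULL statuses would each start a new run.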
tbl_vacations
vac_id | vac_name
1 | American vacation
2 | European vacation
tbl_vacation_stops
stop_id | vac_id | stop_sequence | stop_name | stop_strt_day | stop_end_day
1 | 1 | 1 | New York | may 1 2018 | may 3 2018
2 | 1 | 2 | Boston | may 4 2018 | may 6 2018
3 | 1 | 3 | Chicago | may 7 2018 | may 9 2018
4 | 2 | 1 | Paris | jun 10 2018 | jun 15 2018
5 | 2 | 2 | Berlin | jun 16 2018 | jun 19 2018
select
v.vac_id as vac_id,
v.vac_name as vac_name,
vs.stop_strt_day as vac_strt_day
from tbl_vacations v
join tbl_vacation_stops vs
on v.vac_id = vs.vac_id
where vs.stop_sequence = '1'
vac_id | vac_name | vac_strt_day | vac_end_day
1 | American vacation | may 1 2018 | may 9 2018
2 | European vacation | jun 10 2018 | jun 19 2018
If each vacation has a different number of stops, how do I work out vac_end_day from the stop with the highest stop_sequence?
This would do the trick:
select
v.vac_id as vac_id,
v.vac_name as vac_name,
(select stop_strt_day from tbl_vacation_stops where vac_id = v.vac_id
and stop_sequence = (select min(stop_sequence) from tbl_vacation_stops where vac_id = v.vac_id)
) as vac_strt_day,
(select stop_end_day from tbl_vacation_stops where vac_id = v.vac_id
and stop_sequence = (select max(stop_sequence) from tbl_vacation_stops where vac_id = v.vac_id)
) as vac_end_day
from tbl_vacations v
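The correlated subqueries above can also be written as joins. The following is a sketch under the assumption that stop_sequence is unique within each vacation; the aliases seq, first_stop and last_stop (and the column names first_seq, last_seq) are made up for illustration.

```sql
-- Sketch: find each vacation's lowest and highest stop_sequence once,
-- then join back to pick up the matching start and end days.
select
    v.vac_id,
    v.vac_name,
    first_stop.stop_strt_day as vac_strt_day,
    last_stop.stop_end_day   as vac_end_day
from tbl_vacations v
join (
    select vac_id,
           min(stop_sequence) as first_seq,
           max(stop_sequence) as last_seq
    from tbl_vacation_stops
    group by vac_id
) seq
  on seq.vac_id = v.vac_id
join tbl_vacation_stops first_stop
  on first_stop.vac_id = v.vac_id
 and first_stop.stop_sequence = seq.first_seq
join tbl_vacation_stops last_stop
  on last_stop.vac_id = v.vac_id
 and last_stop.stop_sequence = seq.last_seq;
```

One difference in behaviour: because these are inner joins, a vacation without any stops disappears from the result, whereas the subquery version returns it with NULL dates. Switching to left joins would keep such rows.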
I would like to group by the first day and then the rest of the month; my data spans several years.
I have data like below:
--------------------------------------
DAY MONTH YEAR VISITOR_COUNT
--------------------------------------
1 | 12 | 2014 | 16260
2 | 12 | 2014 | 15119
3 | 12 | 2014 | 14464
4 | 12 | 2014 | 13746
5 | 12 | 2014 | 13286
6 | 12 | 2014 | 14352
7 | 12 | 2014 | 19293
8 | 12 | 2014 | 13338
9 | 12 | 2014 | 13961
10 | 12 | 2014 | 9519
11 | 12 | 2014 | 10204
12 | 12 | 2014 | 9380
13 | 12 | 2014 | 11611
14 | 12 | 2014 | 14839
15 | 12 | 2014 | 10051
16 | 12 | 2014 | 8983
17 | 12 | 2014 | 7348
18 | 12 | 2014 | 7258
19 | 12 | 2014 | 7205
20 | 12 | 2014 | 6113
21 | 12 | 2014 | 5316
22 | 12 | 2014 | 6914
23 | 12 | 2014 | 6880
24 | 12 | 2014 | 6289
25 | 12 | 2014 | 6000
26 | 12 | 2014 | 13328
27 | 12 | 2014 | 10367
28 | 12 | 2014 | 7946
29 | 12 | 2014 | 9042
30 | 12 | 2014 | 9408
31 | 12 | 2014 | 8411
1 | 1 | 2015 | 9965
2 | 1 | 2015 | 10560
3 | 1 | 2015 | 9662
4 | 1 | 2015 | 8735
5 | 1 | 2015 | 12817
6 | 1 | 2015 | 13516
7 | 1 | 2015 | 9800
8 | 1 | 2015 | 10629
9 | 1 | 2015 | 12325
10 | 1 | 2015 | 11899
11 | 1 | 2015 | 11049
12 | 1 | 2015 | 13934
13 | 1 | 2015 | 16833
14 | 1 | 2015 | 13434
15 | 1 | 2015 | 13128
16 | 1 | 2015 | 14660
17 | 1 | 2015 | 11951
18 | 1 | 2015 | 10916
19 | 1 | 2015 | 14126
20 | 1 | 2015 | 16909
21 | 1 | 2015 | 16555
22 | 1 | 2015 | 14726
23 | 1 | 2015 | 14642
24 | 1 | 2015 | 13067
25 | 1 | 2015 | 11738
26 | 1 | 2015 | 15353
27 | 1 | 2015 | 17935
28 | 1 | 2015 | 14448
29 | 1 | 2015 | 15372
30 | 1 | 2015 | 16694
31 | 1 | 2015 | 16763
I would like to be able to group it like below:
--------------------------------------
DAY MONTH YEAR VISITOR_COUNT
--------------------------------------
1 | 12 | 2014 | 16260
2-31| 12 | 2014 | 309971
1 | 1 | 2015 | 9965
2-31| 1 | 2015 | 404176
Microsoft SQL Server 2016. Compatibility level: SQL Server 2005 (90)
Just use case:
select (case when min(day) = 1 then '1'
else concat(min(day), '-', max(day))
end) as day, month, year,
sum(visitor_count)
from t
group by year, month,
(case when day = 1 then 1 else 2 end);
Okay, this is a little tricky. The case in the group by and the case in the select are different. The group by just puts the days into two categories, 1 and others. The select chooses the minimum and maximum days in the month, to construct the range string.
EDIT:
Oy, SQL Server 2005 ???
Of course, you can do the same thing with + and type conversion, or using replace():
select (case when min(day) = 1 then '1'
else replace(replace('#min-#max', '#min', min(day)), '#max', max(day))
end) as day, month, year,
sum(visitor_count)
from t
group by year, month,
(case when day = 1 then 1 else 2 end);
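For completeness, the "+ and type conversion" variant mentioned above could look like the sketch below. The varchar(2) length assumes day values never exceed two digits, and the visitor_count alias and the final order by are additions for readability rather than part of the original answer.

```sql
-- Sketch: build the range label with + and explicit casts instead of concat().
select (case when min(day) = 1 then '1'
             else cast(min(day) as varchar(2)) + '-' + cast(max(day) as varchar(2))
        end) as day,
       month,
       year,
       sum(visitor_count) as visitor_count
from t
group by year, month,
         (case when day = 1 then 1 else 2 end)
order by year, month, min(day);
```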
Thank you for your help in advance. Can anyone please help me with a SQL query for the following?
I have a daily table like:
Date        | Sales_Rep_ID | Product ID | Zone | Sales
31 Jan 2015 | 001          | P01        | EMEA | 10
31 Jan 2015 | 002          | P02        | EMEA | 10
31 Jan 2015 | 003          | P02        | EMEA | 10
30 Jan 2015 | 001          | P01        | EMEA | 8
30 Jan 2015 | 002          | P02        | EMEA | 7
30 Jan 2015 | 003          | P02        | EMEA | 2
and I want the average of the last n days in a final column, computed per date, rep ID and product ID:
Date | Sales_Rep_ID| Product ID | Zone | Sales | AVG_3_DAYS
31 Jan 2015 | 001 | P01 | EMEA | 10 | 9
31 Jan 2015 | 002 | P02 | EMEA | 10 | 8.5
31 Jan 2015 | 003 | P02 | EMEA | 10 | 6
30 Jan 2015 | 001 | P01 | EMEA | 8 | .
30 Jan 2015 | 002 | P02 | EMEA | 7 | .
30 Jan 2015 | 003 | P02 | EMEA | 2 | .
For example:
for row 1 the date is 31 Jan, so we need the average over 31, 30 and 29 Jan for sales rep 001 and product ID P01;
for row 4 the date is 30 Jan, so we need the average over 30, 29 and 28 Jan for sales rep 001 and product ID P01.
In SQL Server, you can use apply for this purpose:
select t.*, tt.avgsales
from t outer apply
(select avg(sales) as avgsales
from t t2
where t2.rep_id = t.rep_id and
t2.product_id = t.product_id and
t2.date <= t.date and
t2.date > dateadd(day, -3, t.date)
) tt;
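One thing to watch: in SQL Server, avg() over an integer column returns an integer, so an expected value such as 8.5 would come back as 8. A sketch that casts before averaging is shown below; it assumes the column names used in the query above (rep_id, product_id, date, sales) and that decimal(10, 2) is wide enough for the sales values.

```sql
-- Sketch: same outer apply, but averaging a decimal so fractions survive.
select t.*, tt.avgsales
from t outer apply
     (select avg(cast(t2.sales as decimal(10, 2))) as avgsales
      from t t2
      where t2.rep_id = t.rep_id and
            t2.product_id = t.product_id and
            t2.date <= t.date and
            t2.date > dateadd(day, -3, t.date)  -- the current day and the two days before it
     ) tt;
```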
I have the following table.
Data_table
R_id I_id Metric CType Timespan Quantity Date
1 1 S C Week 100 4/5/2015
1 1 Q C Week 200 4/5/2015
1 1 I D Week 80 4/5/2015
1 2 S C Week 150 4/5/2015
1 2 Q C Week 100 4/5/2015
1 2 I D Week 50 4/5/2015
Metric can have a limited set of values (S, Q, I..)
CType will be C, D or nil.
Timespan can be Weekly/Daily.
Date will be a Sunday (start of week) for Weekly and that day's date for Daily.
My goal is to convert this to a daily view, which involves:
1. If Timespan is Daily, copying the Quantity for the above metrics as it is.
2. Converting a Weekly quantity to 7 Daily quantities:
   - if the CType is D, copy the quantity as it is;
   - if the CType is C, use a constant percentage breakdown to distribute the weekly quantity over the 7 days, e.g. [30%, 10%, 10%, 5%, 10%, 15%, 20%] = 100%.
3. Creating the following VIEW.
R_id I_id Date S Q I ... (other metrics whose CType is not nil)
1 1 4/5/2015 30 60 80 ... (the quantity of the other metrics)
1 1 4/6/2015 10 20 80
1 1 4/7/2015 10 20 80
1 1 4/8/2015 5 10 80
1 1 4/9/2015 10 20 80
1 1 4/10/2015 15 30 80
1 1 4/11/2015 20 40 80
1 2 4/5/2015 45 30 50
1 2 4/6/2015 15 10 50
1 2 4/7/2015 15 10 50
1 2 4/8/2015 7.5 5 50
1 2 4/9/2015 15 10 50
1 2 4/10/2015 22.5 15 50
1 2 4/11/2015 30 20 50
I could write a bunch of Java methods that pull the data out of the above table and compute the metric values as needed, but for a large dataset the performance would not be very good; databases are meant for this type of computation. Once this view is created, I can quickly (and simply) query it to get what I want. I can write simple SQL queries, but I have no clue how to even begin approaching this problem! I can see a PIVOT here (logically; I don't know how a query would or even could achieve it). But how do I compute the 7 daily quantities from a weekly quantity and put them in the VIEW?
Suggestions and guidance will be much appreciated.
You can use hierarchical queries to generate daily data.
Query:
select
r_id,
i_id,
metric,
ctype,
timespan,
quantity,
tdate + level - 1 as m_tdate,
level as m_level,
(case ctype
when 'C' then
(case level
when 1 then 0.3
when 2 then 0.1
when 3 then 0.1
when 4 then 0.05
when 5 then 0.1
when 6 then 0.15
when 7 then 0.2
end)
else 1
end) * quantity as m_quantity
from myt
where timespan = 'Week'
connect by level <= 7
and r_id = prior r_id
and i_id = prior i_id
and metric = prior metric
and ctype = prior ctype
and timespan = prior timespan
and prior sys_guid() is not null
This generates seven daily rows for each weekly record.
Results:
| R_ID | I_ID | METRIC | CTYPE | TIMESPAN | QUANTITY | M_TDATE | M_LEVEL | M_QUANTITY |
|------|------|--------|-------|----------|----------|-----------------------|---------|------------|
| 1 | 1 | I | D | Week | 80 | May, 04 2015 00:00:00 | 1 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 05 2015 00:00:00 | 2 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 06 2015 00:00:00 | 3 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 07 2015 00:00:00 | 4 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 08 2015 00:00:00 | 5 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 09 2015 00:00:00 | 6 | 80 |
| 1 | 1 | I | D | Week | 80 | May, 10 2015 00:00:00 | 7 | 80 |
| 1 | 1 | Q | C | Week | 200 | May, 04 2015 00:00:00 | 1 | 60 |
| 1 | 1 | Q | C | Week | 200 | May, 05 2015 00:00:00 | 2 | 20 |
| 1 | 1 | Q | C | Week | 200 | May, 06 2015 00:00:00 | 3 | 20 |
| 1 | 1 | Q | C | Week | 200 | May, 07 2015 00:00:00 | 4 | 10 |
| 1 | 1 | Q | C | Week | 200 | May, 08 2015 00:00:00 | 5 | 20 |
| 1 | 1 | Q | C | Week | 200 | May, 09 2015 00:00:00 | 6 | 30 |
| 1 | 1 | Q | C | Week | 200 | May, 10 2015 00:00:00 | 7 | 40 |
| 1 | 1 | S | C | Week | 100 | May, 04 2015 00:00:00 | 1 | 30 |
| 1 | 1 | S | C | Week | 100 | May, 05 2015 00:00:00 | 2 | 10 |
| 1 | 1 | S | C | Week | 100 | May, 06 2015 00:00:00 | 3 | 10 |
| 1 | 1 | S | C | Week | 100 | May, 07 2015 00:00:00 | 4 | 5 |
| 1 | 1 | S | C | Week | 100 | May, 08 2015 00:00:00 | 5 | 10 |
| 1 | 1 | S | C | Week | 100 | May, 09 2015 00:00:00 | 6 | 15 |
| 1 | 1 | S | C | Week | 100 | May, 10 2015 00:00:00 | 7 | 20 |
| 1 | 2 | I | D | Week | 50 | May, 04 2015 00:00:00 | 1 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 05 2015 00:00:00 | 2 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 06 2015 00:00:00 | 3 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 07 2015 00:00:00 | 4 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 08 2015 00:00:00 | 5 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 09 2015 00:00:00 | 6 | 50 |
| 1 | 2 | I | D | Week | 50 | May, 10 2015 00:00:00 | 7 | 50 |
| 1 | 2 | Q | C | Week | 100 | May, 04 2015 00:00:00 | 1 | 30 |
| 1 | 2 | Q | C | Week | 100 | May, 05 2015 00:00:00 | 2 | 10 |
| 1 | 2 | Q | C | Week | 100 | May, 06 2015 00:00:00 | 3 | 10 |
| 1 | 2 | Q | C | Week | 100 | May, 07 2015 00:00:00 | 4 | 5 |
| 1 | 2 | Q | C | Week | 100 | May, 08 2015 00:00:00 | 5 | 10 |
| 1 | 2 | Q | C | Week | 100 | May, 09 2015 00:00:00 | 6 | 15 |
| 1 | 2 | Q | C | Week | 100 | May, 10 2015 00:00:00 | 7 | 20 |
| 1 | 2 | S | C | Week | 150 | May, 04 2015 00:00:00 | 1 | 45 |
| 1 | 2 | S | C | Week | 150 | May, 05 2015 00:00:00 | 2 | 15 |
| 1 | 2 | S | C | Week | 150 | May, 06 2015 00:00:00 | 3 | 15 |
| 1 | 2 | S | C | Week | 150 | May, 07 2015 00:00:00 | 4 | 7.5 |
| 1 | 2 | S | C | Week | 150 | May, 08 2015 00:00:00 | 5 | 15 |
| 1 | 2 | S | C | Week | 150 | May, 09 2015 00:00:00 | 6 | 22.5 |
| 1 | 2 | S | C | Week | 150 | May, 10 2015 00:00:00 | 7 | 30 |
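The same seven-rows-per-record expansion can also be sketched as a cross join against a small generated table that carries the percentage for each day, which avoids the prior sys_guid() trick. This is an alternative technique, not the query above; breakdown, day_no and pct are hypothetical names, and myt and tdate are taken from the query above.

```sql
-- Sketch for Oracle: generate the 7 day offsets once, with their percentages,
-- and cross join them to every weekly row.
with breakdown as (
    select level as day_no,
           case level
               when 1 then 0.3
               when 2 then 0.1
               when 3 then 0.1
               when 4 then 0.05
               when 5 then 0.1
               when 6 then 0.15
               when 7 then 0.2
           end as pct
    from dual
    connect by level <= 7
)
select
    m.r_id,
    m.i_id,
    m.metric,
    m.tdate + b.day_no - 1 as m_tdate,
    -- CType C is split by percentage; anything else is copied unchanged
    (case m.ctype when 'C' then b.pct else 1 end) * m.quantity as m_quantity
from myt m
cross join breakdown b
where m.timespan = 'Week';
```

Keeping the percentages in one place also means a change to the breakdown only has to be made once.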
Once you have this, you need to pivot the result, which can be done with a simple GROUP BY and conditional aggregation (one sum(case ...) per metric).
Query:
with x as (
select
r_id,
i_id,
metric,
ctype,
timespan,
quantity,
tdate + level - 1 as m_tdate,
level as m_level,
(case ctype
when 'C' then
(case level
when 1 then 0.3
when 2 then 0.1
when 3 then 0.1
when 4 then 0.05
when 5 then 0.1
when 6 then 0.15
when 7 then 0.2
end)
else 1
end) * quantity as m_quantity
from myt
where timespan = 'Week'
connect by level <= 7
and r_id = prior r_id
and i_id = prior i_id
and metric = prior metric
and ctype = prior ctype
and timespan = prior timespan
and prior sys_guid() is not null
UNION ALL
select
r_id,
i_id,
metric,
ctype,
timespan,
quantity,
tdate as m_tdate,
1 as m_level,
quantity as m_quantity
from myt
where timespan = 'Day'
)
select
r_id,
i_id,
m_tdate,
sum(case when metric = 'S' then m_quantity end) S,
sum(case when metric = 'Q' then m_quantity end) Q,
sum(case when metric = 'I' then m_quantity end) I
from x
group by
r_id,
i_id,
m_tdate
order by
r_id,
i_id,
m_tdate
Results:
| R_ID | I_ID | M_TDATE | S | Q | I |
|------|------|-------------------------|--------|--------|-----|
| 1 | 1 | May, 04 2015 00:00:00 | 30 | 60 | 80 |
| 1 | 1 | May, 05 2015 00:00:00 | 10 | 20 | 80 |
| 1 | 1 | May, 06 2015 00:00:00 | 10 | 20 | 80 |
| 1 | 1 | May, 07 2015 00:00:00 | 5 | 10 | 80 |
| 1 | 1 | May, 08 2015 00:00:00 | 10 | 20 | 80 |
| 1 | 1 | May, 09 2015 00:00:00 | 15 | 30 | 80 |
| 1 | 1 | May, 10 2015 00:00:00 | 20 | 40 | 80 |
| 1 | 2 | April, 03 2015 00:00:00 | (null) | (null) | 120 |
| 1 | 2 | May, 04 2015 00:00:00 | 45 | 30 | 50 |
| 1 | 2 | May, 05 2015 00:00:00 | 15 | 10 | 50 |
| 1 | 2 | May, 06 2015 00:00:00 | 15 | 10 | 50 |
| 1 | 2 | May, 07 2015 00:00:00 | 7.5 | 5 | 50 |
| 1 | 2 | May, 08 2015 00:00:00 | 15 | 10 | 50 |
| 1 | 2 | May, 09 2015 00:00:00 | 22.5 | 15 | 50 |
| 1 | 2 | May, 10 2015 00:00:00 | 30 | 20 | 50 |
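Since the question mentions PIVOT: on Oracle 11g or later, the final GROUP BY step could instead be written with the PIVOT clause. The sketch below uses a three-row inline stand-in for the x CTE (values copied from the first result row above) so that it runs on its own; in practice x would be the full CTE from the query above.

```sql
-- Sketch for Oracle 11g+: pivot metric rows into S, Q and I columns.
with x as (
    -- stand-in sample rows; replace with the real x CTE
    select 1 as r_id, 1 as i_id, date '2015-05-04' as m_tdate,
           'S' as metric, 30 as m_quantity from dual
    union all
    select 1, 1, date '2015-05-04', 'Q', 60 from dual
    union all
    select 1, 1, date '2015-05-04', 'I', 80 from dual
)
select *
from (select r_id, i_id, m_tdate, metric, m_quantity from x)
pivot (sum(m_quantity) for metric in ('S' as s, 'Q' as q, 'I' as i))
order by r_id, i_id, m_tdate;
```

Every column of the inner select that is not named in the pivot clause (here r_id, i_id and m_tdate) acts as an implicit grouping column, so the inner select should list only the columns that are needed.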