I need to build a query that aggregates the number of units of blood that have been transfused, so it can be compared to the number of units that have been cross-matched. Blood (a precious resource) that is cross-matched but not transfused is wasted.
Providers are supposed to check the system of record (Epic) for 'active' cross-match orders before creating new ones. Providers who don't do this are 'penalized' (provider 20 below). No penalty seems to apply to providers who don't transfuse all of the blood they've cross-matched (provider 10 below).
Cross-match orders:
|ENC_ID|PROV_ID|ORDER_ID|ORDER_TIME |UNITS|
| 1| 10| 100|26-JUL-12 13:00| 4|
| 1| 20| 231|26-JUL-12 15:00| 2|
Transfusion orders:
|ENC_ID|PROV_ID|ORDER_ID|ORDER_TIME |UNITS|
| 1| 10| 500|26-JUL-12 13:05| 1|
| 1| 10| 501|26-JUL-12 13:25| 1|
| 1| 20| 501|26-JUL-12 15:00| 1|
| 1| 20| 501|26-JUL-12 15:21| 2|
Rules:
compare transfusions to cross-matches for the same encounter (transfusion.ENC_ID = cross-match.ENC_ID)
transfusion orders are applied to cross-match orders in a FIFO manner
transfusion.ORDER_TIME >= cross-match.ORDER_TIME
a provider may transfuse more than their own cross-match order, as long as the 'active' cross-match orders still have available units (provider 20's second transfusion order)
Desired result:
|ENC_ID|PROV_ID|ORDER_ID|ORDER_TIME |CROSS-MATCHED|TRANSFUSED|
| 1| 10| 100|26-JUL-12 13:00| 4| 4|
| 1| 20| 231|26-JUL-12 15:00| 2| 1|
Provider 10 is 'credited' with provider 20's transfusions.
Can this logic be implemented without resorting to a procedure?
You could do it in a single SQL query. Here's an example (tested on 11gR2, should work on 10g):
SETUP:
CREATE TABLE cross_match as (
SELECT 1 ENC_ID, 10 PROV_ID, 100 ORDER_ID,
to_date('2012-07-26 13', 'yyyy-mm-dd hh24') ORDER_TIME, 4 UNITS
FROM DUAL
UNION ALL SELECT 1, 20, 231, to_date('2012-07-26 15', 'yyyy-mm-dd hh24'), 2 FROM DUAL
);
CREATE TABLE transfusion as (
SELECT 1 ENC_ID, 10 PROV_ID, 500 ORDER_ID,
to_date('2012-07-26 13:05', 'yyyy-mm-dd hh24:mi') ORDER_TIME, 1 UNITS
FROM DUAL
UNION ALL SELECT 1, 10, 501, to_date('2012-07-26 13:25', 'yyyy-mm-dd hh24:mi'), 1 FROM DUAL
UNION ALL SELECT 1, 20, 501, to_date('2012-07-26 15:00', 'yyyy-mm-dd hh24:mi'), 1 FROM DUAL
UNION ALL SELECT 1, 20, 501, to_date('2012-07-26 15:21', 'yyyy-mm-dd hh24:mi'), 2 FROM DUAL
);
The following query will build a list of blood units numerically and join each unit from the cross_match table to the corresponding one (if it exists) in the transfusion table:
WITH cross_order as (
SELECT rownum rn FROM DUAL
CONNECT BY level <= (SELECT MAX(units) FROM cross_match)
),
transfusion_order as (
SELECT rownum rn FROM DUAL
CONNECT BY level <= (SELECT MAX(units) FROM transfusion)
)
SELECT c.enc_id, c.prov_id, c.order_id, c.order_time,
count(*) cross_matched,
count(t.enc_id) transfused
FROM (SELECT cm.*,
row_number() over (partition by cm.enc_id
order by cm.order_time) cross_no
FROM cross_match cm
JOIN cross_order co ON cm.units >= co.rn) c
LEFT JOIN (SELECT t.*,
row_number() over (partition by t.enc_id
order by t.order_time) trans_no
FROM transfusion t
JOIN transfusion_order tor ON t.units >= tor.rn) t
ON c.enc_id = t.enc_id
AND c.cross_no = t.trans_no
GROUP BY c.enc_id, c.prov_id, c.order_id, c.order_time;
ENC_ID PROV_ID ORDER_ID ORDER_TIME CROSS_MATCHED TRANSFUSED
------ ------- -------- ---------- ------------- ----------
     1      20      231 07/26/2012             2          1
     1      10      100 07/26/2012             4          4
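To see what that unit expansion produces, you can run the cross-match side of the join on its own. This is a diagnostic sketch of mine, not part of the answer's query; it shows each order repeated once per unit and numbered FIFO within the encounter (cross_no 1-4 for order 100, 5-6 for order 231):
-- diagnostic only: expand each cross-match order into one row per unit
SELECT cm.enc_id, cm.prov_id, cm.order_id,
       row_number() over (partition by cm.enc_id
                          order by cm.order_time) cross_no
  FROM cross_match cm
  JOIN (SELECT rownum rn FROM DUAL
        CONNECT BY level <= (SELECT MAX(units) FROM cross_match)) co
    ON cm.units >= co.rn
 ORDER BY cross_no;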
This is workable as long as the maximum number of units stays small; otherwise the one-row-per-unit expansion becomes cumbersome.
This can be improved by using a running total of units on both sides instead of the basic one-row-per-unit join. The join condition then behaves like an interval intersection between begin unit and end unit:
SELECT c.enc_id, c.prov_id, c.order_id, c.order_time,
sum(c.unit_end - nvl(c.unit_start,0))/count(*) cross_matched,
sum(least(c.unit_end, t.unit_end)
-greatest(nvl(c.unit_start, 0), nvl(t.unit_start, 0))) transfused
FROM (SELECT cm.*,
sum(cm.units) over (partition by cm.enc_id
order by cm.order_time
rows between unbounded preceding
and 1 preceding) unit_start,
sum(cm.units) over (partition by cm.enc_id
order by cm.order_time) unit_end
FROM cross_match cm) c
LEFT JOIN (SELECT t.*,
sum(t.units) over (partition by t.enc_id
order by t.order_time
rows between unbounded preceding
and 1 preceding) unit_start,
sum(t.units) over (partition by t.enc_id
order by t.order_time) unit_end
FROM transfusion t) t
ON c.enc_id = t.enc_id
AND c.unit_end > nvl(t.unit_start, 0)
AND t.unit_end > nvl(c.unit_start, 0)
GROUP BY c.enc_id, c.prov_id, c.order_id, c.order_time;
ENC_ID PROV_ID ORDER_ID ORDER_TIME CROSS_MATCHED TRANSFUSED
------ ------- -------- ---------- ------------- ----------
     1      20      231 07/26/2012             2          1
     1      10      100 07/26/2012             4          4
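To inspect the intervals feeding that join, you can run the running totals on their own (again a diagnostic sketch of mine): the cross-matches occupy units (0,4] and (4,6], the transfusions (0,1], (1,2], (2,3] and (3,5], and the main query sums the pairwise overlaps.
-- diagnostic only: show the (unit_start, unit_end] interval of each order
SELECT cm.order_id, cm.units,
       nvl(sum(cm.units) over (partition by cm.enc_id
                               order by cm.order_time
                               rows between unbounded preceding
                               and 1 preceding), 0) unit_start,
       sum(cm.units) over (partition by cm.enc_id
                           order by cm.order_time) unit_end
  FROM cross_match cm;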
I can calculate the number of ids per month and then sum those counts over 12 months; I also get the average, using this code.
select id, to_char(event_month, 'yyyy') event_year, sum(cnt) overall_count, avg(cnt) average_count
from (
select id, trunc(event_date, 'month') event_month, count(*) cnt
from daily
where event_date >= date '2019-01-01' and event_date < date '2020-01-01'
group by id, trunc(event_date, 'month')
) t
group by id, to_char(event_month, 'yyyy')
The results look something like this:
ID| YEAR | OVER_ALL_COUNT| AVG
1| 2019 | 712 | 59.33
2| 2019 | 20936849 | 161185684.6
3| 2019 | 14255773 | 2177532.2
However, I want to modify this to get the overall id counts per month instead, and the average of the daily id counts within each month. Desired result is:
ID| MONTH | OVER_ALL_COUNT| AVG
1| Jan | 152 | 10.3
2| Jan | 15000 | 1611
3| Jan | 14255 | 2177
1| Feb | 4300 | 113
2| Feb | 9700 | 782
3| Feb | 1900 | 97
where January has 152 id counts overall for id=1, and the average id count per day is 10.3. For id=2, the January count is 15000 and the average daily count for January is 1611.
You just need to change the truncation in your subquery to truncate by day instead of by month, then group the outer query by month instead of by year.
select id, to_char(event_day, 'Mon') event_month, sum(cnt) overall_count, avg(cnt) average_count
from (
select id, trunc(event_date) event_day, count(*) cnt
from daily
where event_date >= date '2019-01-01' and event_date < date '2020-01-01'
group by id, trunc(event_date)
) t
group by id, to_char(event_day, 'Mon')
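One caveat I would add (not part of the original answer): 'Mon' labels sort alphabetically, so Feb would come before Jan in the output. If calendar order matters, sort by an aggregate of the underlying date, for example:
select id, to_char(event_day, 'Mon') event_month, sum(cnt) overall_count, avg(cnt) average_count
from (
select id, trunc(event_date) event_day, count(*) cnt
from daily
where event_date >= date '2019-01-01' and event_date < date '2020-01-01'
group by id, trunc(event_date)
) t
group by id, to_char(event_day, 'Mon')
order by id, min(event_day)  -- aggregates are allowed in ORDER BY of a grouped query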
This answers the original version of the question.
You can use last_day():
select id, to_char(event_month, 'yyyy') event_year, sum(cnt) as overall_count,
avg(cnt) as average_count,
       extract(day from last_day(min(event_month))) as days_in_month,
       sum(cnt) / extract(day from last_day(min(event_month))) as avg_days_in_month
from (select id, trunc(event_date, 'month') as event_month, count(*) as cnt
from daily
where event_date >= date '2019-01-01' and
event_date < date '2020-01-01'
group by id, trunc(event_date, 'month')
) t
group by id, to_char(event_month, 'yyyy')
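For reference, last_day() returns the last date of its argument's month, so extract(day from last_day(...)) yields the number of days in that month. A quick sanity check:
SELECT extract(day from last_day(date '2019-02-10')) days_in_month
FROM dual;
-- returns 28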
I am trying to use the Oracle PIVOT function to display the data in the format below. I have tried examples I found on Stack Overflow, but I am unable to achieve what I am looking for.
With t as
(
select 1335 as emp_id, 'ADD Insurance New' as suuid, sysdate- 10 as startdate, null as enddate from dual
union all
select 1335 as emp_id, 'HS' as suuid, sysdate- 30 as startdate, null as enddate from dual
union all
select 1335 as emp_id, 'ADD Ins' as suuid, sysdate- 30 as startdate, Sysdate - 10 as enddate from dual
)
select * from t
Desired output:
+--------+-------------------+-------------------+---------+-------------------+
| EMP_ID | SUUID_1 | SUUID_1_STARTDATE | SUUID_2 | SUUID_2_STARTDATE |
+--------+-------------------+-------------------+---------+-------------------+
| 1335 | ADD Insurance New | 10/5/2020 15:52 | HS | 9/15/2020 15:52 |
+--------+-------------------+-------------------+---------+-------------------+
Can anyone suggest how to use SQL PIVOT to get this format?
You can use conditional aggregation. There is more than one way to understand your question, but one approach that would work for your sample data is:
select emp_id,
max(case when rn = 1 then suuid end) suuid_1,
max(case when rn = 1 then startdate end) suuid_1_startdate,
max(case when rn = 2 then suuid end) suuid_2,
max(case when rn = 2 then startdate end) suuid_2_startdate
from (
select t.*, row_number() over(partition by emp_id order by startdate desc) rn
from t
where enddate is null
) t
group by emp_id
Demo on DB Fiddle:
EMP_ID | SUUID_1           | SUUID_1_STARTDATE | SUUID_2 | SUUID_2_STARTDATE
-----: | :---------------- | :---------------- | :------ | :----------------
  1335 | ADD Insurance New | 05-OCT-20         | HS      | 15-SEP-20
You can do it with PIVOT. Note that PIVOT implicitly groups by every column that is neither aggregated nor named in the FOR clause; here that leaves only emp_id, which is why MAX(enddate) is kept in the aggregate list even though it is never selected:
With t ( emp_id, suuid, startdate, enddate ) as
(
select 1335, 'ADD Insurance New', sysdate- 10, null from dual union all
select 1335, 'HS', sysdate- 30, null from dual union all
select 1335, 'ADD Ins', sysdate- 30, Sysdate - 10 from dual
)
SELECT emp_id,
"1_SUUID" AS suuid1,
"1_STARTDATE" AS suuid_startdate1,
"2_SUUID" AS suuid2,
"2_STARTDATE" AS suuid_startdate2
FROM (
SELECT t.*,
ROW_NUMBER() OVER ( ORDER BY startdate DESC, enddate DESC NULLS FIRST )
AS rn
FROM t
)
PIVOT (
MAX( suuid ) AS suuid,
MAX( startdate ) AS startdate,
MAX( enddate ) AS enddate
FOR rn IN ( 1, 2 )
)
Outputs:
EMP_ID | SUUID1            | SUUID_STARTDATE1 | SUUID2 | SUUID_STARTDATE2
-----: | :---------------- | :--------------- | :----- | :---------------
  1335 | ADD Insurance New | 05-OCT-20        | HS     | 15-SEP-20
db<>fiddle here
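If an employee could have more than two open plans, the slot list simply grows and unmatched slots come back as NULL. A sketch of mine building on the same query (keeping MAX(enddate) in the aggregate list so enddate doesn't leak into the implicit GROUP BY, and adding PARTITION BY emp_id in case there are several employees):
SELECT emp_id,
       "1_SUUID" AS suuid1, "1_STARTDATE" AS suuid_startdate1,
       "2_SUUID" AS suuid2, "2_STARTDATE" AS suuid_startdate2,
       "3_SUUID" AS suuid3, "3_STARTDATE" AS suuid_startdate3
FROM (
  SELECT t.*,
         -- number each employee's open plans, newest first
         ROW_NUMBER() OVER ( PARTITION BY emp_id
                             ORDER BY startdate DESC, enddate DESC NULLS FIRST ) AS rn
  FROM t
)
PIVOT (
  MAX( suuid ) AS suuid,
  MAX( startdate ) AS startdate,
  MAX( enddate ) AS enddate   -- kept so enddate is not an implicit group key
  FOR rn IN ( 1, 2, 3 )
)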
I am playing around with BigQuery and hit an interesting use case. I have a collection of customers and account balances. The account balances collection records every account balance change.
Customers:
+---------+--------+
| ID | Name |
+---------+--------+
| 1 | Alice |
| 2 | Bob |
+---------+--------+
Accounts balances:
+---------+---------------+---------+------------+
| ID | customer_id | value | timestamp |
+---------+---------------+---------+------------+
| 1 | 1 | -500 | 2019-02-12 |
| 2 | 1 | -200 | 2019-02-10 |
| 3 | 2 | 200 | 2019-02-10 |
| 4 | 1 | 0 | 2019-02-09 |
+---------+---------------+---------+------------+
The goal is to find out, for how long a customer has a negative account balance. The resulting collection would look like this:
+---------+--------+---------------------------------+
| ID | Name | Negative account balance since |
+---------+--------+---------------------------------+
| 1 | Alice | 2 days |
+---------+--------+---------------------------------+
Bob is not in the collection, because his last account record shows a positive value.
I think the following steps are involved:
get last account balance per customer, see if it is negative
go through the account balance values until you hit a positive (or no more) value
compute datediff
Is something like this even possible in SQL? Do you have any ideas on how to create such a query? To get customers that currently have a negative account balance, I use this query:
SELECT customer_id FROM (
  SELECT t.customer_id, t.value, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY timestamp DESC) AS seqnum FROM `account_balances` t
) t
WHERE seqnum = 1 AND value < 0
Below is for BigQuery Standard SQL
#standardSQL
SELECT customer_id, name,
SUM(IF(negative_positive < 0, days, 0)) negative_days,
SUM(IF(negative_positive = 0, days, 0)) zero_days,
SUM(IF(negative_positive > 0, days, 0)) positive_days
FROM (
SELECT customer_id, negative_positive, grp,
1 + DATE_DIFF(MAX(ts), MIN(ts), DAY) days
FROM (
SELECT customer_id, ts, SIGN(value) negative_positive,
COUNTIF(flag) OVER(PARTITION BY customer_id ORDER BY ts) grp
FROM (
SELECT *, SIGN(value) = IFNULL(LEAD(SIGN(value)) OVER(PARTITION BY customer_id ORDER BY ts), 0) flag
FROM `project.dataset.balances`
)
)
GROUP BY customer_id, negative_positive, grp
)
LEFT JOIN `project.dataset.customers`
ON id = customer_id
GROUP BY customer_id, name
You can test and play with the above using the sample data from your question, as in the example below:
#standardSQL
WITH `project.dataset.balances` AS (
SELECT 1 customer_id, -500 value, DATE '2019-02-12' ts UNION ALL
SELECT 1, -200, '2019-02-10' UNION ALL
SELECT 2, 200, '2019-02-10' UNION ALL
SELECT 1, 0, '2019-02-09'
), `project.dataset.customers` AS (
SELECT 1 id, 'Alice' name UNION ALL
SELECT 2, 'Bob'
)
SELECT customer_id, name,
SUM(IF(negative_positive < 0, days, 0)) negative_days,
SUM(IF(negative_positive = 0, days, 0)) zero_days,
SUM(IF(negative_positive > 0, days, 0)) positive_days
FROM (
SELECT customer_id, negative_positive, grp,
1 + DATE_DIFF(MAX(ts), MIN(ts), DAY) days
FROM (
SELECT customer_id, ts, SIGN(value) negative_positive,
COUNTIF(flag) OVER(PARTITION BY customer_id ORDER BY ts) grp
FROM (
SELECT *, SIGN(value) = IFNULL(LEAD(SIGN(value)) OVER(PARTITION BY customer_id ORDER BY ts), 0) flag
FROM `project.dataset.balances`
)
)
GROUP BY customer_id, negative_positive, grp
)
LEFT JOIN `project.dataset.customers`
ON id = customer_id
GROUP BY customer_id, name
-- ORDER BY customer_id
with this result:
Row customer_id name  negative_days zero_days positive_days
1   1           Alice 3             1         0
2   2           Bob   0             0         1
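To see how the islands are built, you can run just the inner layers on the same sample data (a diagnostic sketch of mine): flag is true whenever a row's sign matches the next row's sign, and the running COUNTIF of flag yields the grp value that, together with the sign itself, delimits each consecutive run.
#standardSQL
WITH `project.dataset.balances` AS (
SELECT 1 customer_id, -500 value, DATE '2019-02-12' ts UNION ALL
SELECT 1, -200, '2019-02-10' UNION ALL
SELECT 2, 200, '2019-02-10' UNION ALL
SELECT 1, 0, '2019-02-09'
)
SELECT customer_id, ts, SIGN(value) negative_positive,
-- running count of "same sign as next row" marks; grp changes at sign changes
COUNTIF(flag) OVER(PARTITION BY customer_id ORDER BY ts) grp
FROM (
SELECT *, SIGN(value) = IFNULL(LEAD(SIGN(value)) OVER(PARTITION BY customer_id ORDER BY ts), 0) flag
FROM `project.dataset.balances`
)
ORDER BY customer_id, ts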
I have a problem with a query in Oracle.
My table contains all of the loan applications from last year. Some of the customers have more than one application. I want to aggregate those applications as follows:
For each customer, I want to find his first application (let's call it A) in the last year and then I want to find out what was the last application in 30 days interval, counting from the first application (say B is the last one). Next, I need to find the application following B and again find for it the last one in 30 days interval, as in the previous step. What I want as the result is the table with the latest and earliest applications on each customer's interval. It is also possible that the first one is the same as the last one.
How could I do this in Oracle without PL/SQL? Is this even possible? Should I use cumulative sums of time intervals for it? (But then the starting point for each sum depends on the sum already counted...)
Let's say the table has a following form:
application_id (unique) | customer_id (not unique) | create_date
1 1 2017-01-02 <- first
2 1 2017-01-10 <- middle
3 1 2017-01-30 <- last
4 1 2017-05-02 <- first and last
5 1 2017-06-02 <- first
6 1 2017-06-30 <- middle
7 1 2017-06-30 <- middle
8 1 2017-07-01 <- last
What I expect is:
application_id (unique) | customer_id (not unique) | create_date
1 1 2017-01-02 <- first
3 1 2017-01-30 <- last
4 1 2017-05-02 <- first and last
5 1 2017-06-02 <- first
8 1 2017-07-01 <- last
Thanks in advance for help.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( application_id, customer_id, create_date ) AS
SELECT 1, 1, DATE '2017-01-02' FROM DUAL UNION ALL -- <- first
SELECT 2, 1, DATE '2017-01-10' FROM DUAL UNION ALL -- <- middle
SELECT 3, 1, DATE '2017-01-30' FROM DUAL UNION ALL -- <- last
SELECT 4, 1, DATE '2017-05-02' FROM DUAL UNION ALL -- <- first and last
SELECT 5, 1, DATE '2017-06-02' FROM DUAL UNION ALL -- <- first
SELECT 6, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 7, 1, DATE '2017-06-30' FROM DUAL UNION ALL -- <- middle
SELECT 8, 1, DATE '2017-07-01' FROM DUAL -- <- last
Query 1:
WITH data ( application_id, customer_id, create_date, first_date, grp ) AS (
SELECT t.application_id,
t.customer_id,
t.create_date,
t.create_date,
1
FROM table_name t
WHERE application_id = 1
UNION ALL
SELECT t.application_id,
t.customer_id,
t.create_date,
CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
THEN d.first_date
ELSE t.create_date
END,
CASE WHEN t.create_date <= d.first_date + INTERVAL '30' DAY
THEN grp
ELSE grp + 1
END
FROM data d
INNER JOIN table_name t
ON ( d.customer_id = t.customer_id
AND d.application_id + 1 = t.application_id )
)
SELECT application_id,
customer_id,
create_date,
grp
FROM (
SELECT d.*,
ROW_NUMBER() OVER ( PARTITION BY customer_id, grp ORDER BY create_date ASC ) AS rn_a,
ROW_NUMBER() OVER ( PARTITION BY customer_id, grp ORDER BY create_date DESC ) AS rn_d
FROM data d
)
WHERE rn_a = 1
OR rn_d = 1
Results:
| APPLICATION_ID | CUSTOMER_ID | CREATE_DATE | GRP |
|----------------|-------------|----------------------|-----|
| 1 | 1 | 2017-01-02T00:00:00Z | 1 |
| 3 | 1 | 2017-01-30T00:00:00Z | 1 |
| 4 | 1 | 2017-05-02T00:00:00Z | 2 |
| 5 | 1 | 2017-06-02T00:00:00Z | 3 |
| 8 | 1 | 2017-07-01T00:00:00Z | 3 |
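One caveat: the recursive step joins on d.application_id + 1 = t.application_id and anchors on application_id = 1, so it assumes IDs are consecutive per customer. A sketch of mine that removes that assumption by numbering each customer's rows first (same grouping logic otherwise):
WITH ordered ( application_id, customer_id, create_date, rn ) AS (
  -- number each customer's applications in date order, regardless of ID gaps
  SELECT application_id, customer_id, create_date,
         ROW_NUMBER() OVER ( PARTITION BY customer_id
                             ORDER BY create_date, application_id )
  FROM table_name
),
data ( application_id, customer_id, create_date, first_date, grp, rn ) AS (
  SELECT application_id, customer_id, create_date, create_date, 1, rn
  FROM ordered
  WHERE rn = 1
  UNION ALL
  SELECT o.application_id, o.customer_id, o.create_date,
         CASE WHEN o.create_date <= d.first_date + INTERVAL '30' DAY
              THEN d.first_date ELSE o.create_date END,
         CASE WHEN o.create_date <= d.first_date + INTERVAL '30' DAY
              THEN d.grp ELSE d.grp + 1 END,
         o.rn
  FROM data d
  INNER JOIN ordered o
     ON o.customer_id = d.customer_id
    AND o.rn = d.rn + 1
)
SELECT application_id, customer_id, create_date, grp
FROM (
  SELECT d.*,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp ORDER BY create_date ASC ) AS rn_a,
         ROW_NUMBER() OVER ( PARTITION BY customer_id, grp ORDER BY create_date DESC ) AS rn_d
  FROM data d
)
WHERE rn_a = 1
   OR rn_d = 1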
I have a table with the following info
|date | user_id | week_beg | month_beg|
SQL to create table with test values:
CREATE TABLE uniques
(
date DATE,
user_id INT,
week_beg DATE,
month_beg DATE
)
INSERT INTO uniques VALUES ('2013-01-01', 1, '2012-12-30', '2013-01-01')
INSERT INTO uniques VALUES ('2013-01-03', 3, '2012-12-30', '2013-01-01')
INSERT INTO uniques VALUES ('2013-01-06', 4, '2013-01-06', '2013-01-01')
INSERT INTO uniques VALUES ('2013-01-07', 4, '2013-01-06', '2013-01-01')
INPUT TABLE:
| date | user_id | week_beg | month_beg |
| 2013-01-01 | 1 | 2012-12-30 | 2013-01-01 |
| 2013-01-03 | 3 | 2012-12-30 | 2013-01-01 |
| 2013-01-06 | 4 | 2013-01-06 | 2013-01-01 |
| 2013-01-07 | 4 | 2013-01-06 | 2013-01-01 |
OUTPUT TABLE:
| date | time_series | cnt |
| 2013-01-01 | D | 1 |
| 2013-01-01 | W | 1 |
| 2013-01-01 | M | 1 |
| 2013-01-03 | D | 1 |
| 2013-01-03 | W | 2 |
| 2013-01-03 | M | 2 |
| 2013-01-06 | D | 1 |
| 2013-01-06 | W | 1 |
| 2013-01-06 | M | 3 |
| 2013-01-07 | D | 1 |
| 2013-01-07 | W | 1 |
| 2013-01-07 | M | 3 |
I want to calculate the number of distinct user_id's for a date:
For that date
For that week up to that date (Week to date)
For the month up to that date (Month to date)
1 is easy to calculate.
For 2 and 3 I am trying to use queries like these:
SELECT
date,
'W' AS "time_series",
COUNT(DISTINCT user_id) OVER (PARTITION BY week_beg) AS "cnt"
FROM user_subtitles
SELECT
date,
'M' AS "time_series",
COUNT(DISTINCT user_id) OVER (PARTITION BY month_beg) AS "cnt"
FROM user_subtitles
Postgres does not allow DISTINCT inside window functions, so this approach does not work.
I have also tried a GROUP BY approach, but it does not work, as it gives me numbers for whole weeks/months.
What's the best way to approach this problem?
Count all rows
SELECT date, '1_D' AS time_series, count(DISTINCT user_id) AS cnt
FROM uniques
GROUP BY 1
UNION ALL
SELECT DISTINCT ON (1)
date, '2_W', count(*) OVER (PARTITION BY week_beg ORDER BY date)
FROM uniques
UNION ALL
SELECT DISTINCT ON (1)
date, '3_M', count(*) OVER (PARTITION BY month_beg ORDER BY date)
FROM uniques
ORDER BY 1, time_series
Your columns week_beg and month_beg are 100% redundant and can easily be replaced by date_trunc('week', date + 1) - 1 and date_trunc('month', date) respectively.
Your week seems to start on Sunday (off by one), therefore the + 1 .. - 1.
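As a quick check of that shift (my sketch, not part of the original answer), Sunday 2013-01-06 maps to itself as week start, matching the sample's week_beg:
SELECT date_trunc('week', date '2013-01-06' + 1)::date - 1 AS week_beg
      ,date_trunc('month', date '2013-01-06')::date        AS month_beg;
-- week_beg = 2013-01-06, month_beg = 2013-01-01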
The default frame of a window function with ORDER BY in the OVER clause is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW. That's exactly what you need here.
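With the frame spelled out explicitly, the '2_W' branch is equivalent to this standalone sketch:
SELECT DISTINCT ON (1)
       date, '2_W'
      ,count(*) OVER (PARTITION BY week_beg ORDER BY date
                      RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
FROM   uniques;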
Use UNION ALL, not UNION.
Your unfortunate choice for time_series (D, W, M) does not sort well; I renamed the values to make the final ORDER BY easier.
This query can deal with multiple rows per day. Counts include all peers for a day.
More about DISTINCT ON:
Select first row in each GROUP BY group?
DISTINCT users per day
To count every user only once per day, use a CTE with DISTINCT ON:
WITH x AS (SELECT DISTINCT ON (1,2) date, user_id FROM uniques)
SELECT date, '1_D' AS time_series, count(user_id) AS cnt
FROM x
GROUP BY 1
UNION ALL
SELECT DISTINCT ON (1)
date, '2_W'
,count(*) OVER (PARTITION BY (date_trunc('week', date + 1)::date - 1)
ORDER BY date)
FROM x
UNION ALL
SELECT DISTINCT ON (1)
date, '3_M'
,count(*) OVER (PARTITION BY date_trunc('month', date) ORDER BY date)
FROM x
ORDER BY 1, 2
DISTINCT users over dynamic period of time
You can always resort to correlated subqueries. They tend to be slow with big tables!
Building on the previous queries:
WITH du AS (SELECT date, user_id FROM uniques GROUP BY 1,2)
,d AS (
SELECT date
,(date_trunc('week', date + 1)::date - 1) AS week_beg
,date_trunc('month', date)::date AS month_beg
FROM uniques
GROUP BY 1
)
SELECT date, '1_D' AS time_series, count(user_id) AS cnt
FROM du
GROUP BY 1
UNION ALL
SELECT date, '2_W', (SELECT count(DISTINCT user_id) FROM du
WHERE du.date BETWEEN d.week_beg AND d.date )
FROM d
GROUP BY date, week_beg
UNION ALL
SELECT date, '3_M', (SELECT count(DISTINCT user_id) FROM du
WHERE du.date BETWEEN d.month_beg AND d.date)
FROM d
GROUP BY date, month_beg
ORDER BY 1,2;
SQL Fiddle for all three solutions.
Faster with dense_rank()
@Clodoaldo came up with a major improvement: use the window function dense_rank(). Here is another idea for an optimized version. It should be even faster to exclude daily duplicates right away. The performance gain grows with the number of rows per day.
Building on a simplified and sanitized data model:
- without the redundant columns
- day as column name instead of date
(date is a reserved word in standard SQL and a basic type name in PostgreSQL, and shouldn't be used as an identifier.)
CREATE TABLE uniques(
day date -- instead of "date"
,user_id int
);
Improved query:
WITH du AS (
SELECT DISTINCT ON (1, 2)
day, user_id
,date_trunc('week', day + 1)::date - 1 AS week_beg
,date_trunc('month', day)::date AS month_beg
FROM uniques
)
SELECT day, count(user_id) AS d, max(w) AS w, max(m) AS m
FROM (
SELECT user_id, day
,dense_rank() OVER(PARTITION BY week_beg ORDER BY user_id) AS w
,dense_rank() OVER(PARTITION BY month_beg ORDER BY user_id) AS m
FROM du
) s
GROUP BY day
ORDER BY day;
SQL Fiddle demonstrating the performance of 4 faster variants. It depends on your data distribution which is fastest for you.
All of them are about 10x as fast as the correlated subqueries version (which isn't bad for correlated subqueries).
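Why dense_rank() can replace the distinct count (my illustration): within a partition, the maximum dense_rank() ordered by user_id equals count(DISTINCT user_id). A standalone sketch on the sanitized model:
SELECT week_beg, max(dr) AS distinct_users
FROM (
   SELECT date_trunc('week', day + 1)::date - 1 AS week_beg
         -- each distinct user_id gets the next rank; the max is the distinct count
         ,dense_rank() OVER (PARTITION BY date_trunc('week', day + 1)
                             ORDER BY user_id) AS dr
   FROM   uniques
   ) s
GROUP  BY week_beg;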
Without correlated subqueries. SQL Fiddle
with u as (
select
"date", user_id,
date_trunc('week', "date" + 1)::date - 1 week_beg,
date_trunc('month', "date")::date month_beg
from uniques
)
select
"date", count(distinct user_id) D,
max(week_dr) W, max(month_dr) M
from (
select
user_id, "date",
dense_rank() over(partition by week_beg order by user_id) week_dr,
dense_rank() over(partition by month_beg order by user_id) month_dr
from u
) s
group by "date"
order by "date"
Try
SELECT
*
FROM
(
SELECT dates, count(user_id), 'D' as time_series FROM users_data GROUP BY dates
UNION
SELECT max(dates), count(user_id), 'W' FROM users_data GROUP BY date_part('year',dates)+date_part('week',dates)
UNION
SELECT max(dates), count(user_id), 'M' FROM users_data GROUP BY date_part('year',dates)+date_part('month',dates)
) temp order by dates, time_series
SQLFIDDLE
Try queries like this
SELECT count(distinct user_id), to_char(date, 'YYYY-MM-DD') as date_period
FROM uniques
GROUP BY date_period