How to Group by 6 days in Postgresql - sql

I want to convert this type of data to 6Days GROUP BY format.
+-----+--------------+------------+
| gid | cnt | date |
+-----+--------------+------------+
| 1 | 1 | 2012-02-05 |
| 2 | 2 | 2012-02-06 |
| 3 | 1 | 2012-02-07 |
| 4 | 1 | 2012-02-08 |
| 5 | 1 | 2012-02-09 |
| 6 | 2 | 2012-02-10 |
| 7 | 3 | 2012-02-11 |
| 8 | 1 | 2012-02-12 |
| 9 | 1 | 2012-02-13 |
| 10 | 2 | 2012-02-14 |
| 11 | 3 | 2012-02-15 |
| 12 | 4 | 2012-02-16 |
| 13 | 1 | 2012-02-17 |
| 14 | 1 | 2012-02-18 |
| 15 | 1 | 2012-02-19 |
| 16 | NULL | 2012-02-20 |
| 17 | 6 | 2012-02-21 |
| 18 | NULL | 2012-02-22 |
+-----+--------------+------------+
↓↓↓↓↓↓↓↓↓↓↓↓↓↓
The date is a continuous format.

If I understand correctly you need something like this:
WITH x AS (SELECT date::date, (random() * 3)::int AS cnt FROM generate_series('2012-02-05'::date, '2012-02-22'::date, '1 day'::interval) AS date
)
SELECT start::date,
(start + '5 day'::interval)::date AS end,
sum(cnt)
FROM generate_series(
(SELECT min(date) FROM x),
(SELECT max(date) FROM x),
'5 day'::interval
) AS start
LEFT JOIN x ON (x.date >= start AND x.date <= start + '5 day'::interval)
GROUP BY 1, 2
ORDER BY 1
In x I emulate your table.

Related

SQL window excluding current group?

I'm trying to provide rolled up summaries of the following data including only the group in question as well as excluding the group. I think this can be done with a window function, but I'm having problems with getting the syntax down (in my case Hive SQL).
I want the following data to be aggregated
+------------+---------+--------+
| date | product | rating |
+------------+---------+--------+
| 2018-01-01 | A | 1 |
| 2018-01-02 | A | 3 |
| 2018-01-20 | A | 4 |
| 2018-01-27 | A | 5 |
| 2018-01-29 | A | 4 |
| 2018-02-01 | A | 5 |
| 2017-01-09 | B | NULL |
| 2017-01-12 | B | 3 |
| 2017-01-15 | B | 4 |
| 2017-01-28 | B | 4 |
| 2017-07-21 | B | 2 |
| 2017-09-21 | B | 5 |
| 2017-09-13 | C | 3 |
| 2017-09-14 | C | 4 |
| 2017-09-15 | C | 5 |
| 2017-09-16 | C | 5 |
| 2018-04-01 | C | 2 |
| 2018-01-13 | D | 1 |
| 2018-01-14 | D | 2 |
| 2018-01-24 | D | 3 |
| 2018-01-31 | D | 4 |
+------------+---------+--------+
Aggregated results:
+------+-------+---------+----+------------+------------------+----------+
| year | month | product | ct | avg_rating | avg_rating_other | other_ct |
+------+-------+---------+----+------------+------------------+----------+
| 2018 | 1 | A | 5 | 3.4 | 2.5 | 4 |
| 2018 | 2 | A | 1 | 5 | NULL | 0 |
| 2017 | 1 | B | 4 | 3.6666667 | NULL | 0 |
| 2017 | 7 | B | 1 | 2 | NULL | 0 |
| 2017 | 9 | B | 1 | 5 | 4.25 | 4 |
| 2017 | 9 | C | 4 | 4.25 | 5 | 1 |
| 2018 | 4 | C | 1 | 2 | NULL | 0 |
| 2018 | 1 | D | 4 | 2.5 | 3.4 | 5 |
+------+-------+---------+----+------------+------------------+----------+
I've also considered producing two aggregates, one with the product in question and one without, but having trouble with creating the appropriate joining key.
You can do:
select year(date), month(date), product,
count(*) as ct, avg(rating) as avg_rating,
sum(count(*)) over (partition by year(date), month(date)) - count(*) as ct_other,
((sum(sum(rating)) over (partition by year(date), month(date)) - sum(rating)) /
(sum(count(*)) over (partition by year(date), month(date)) - count(*))
) as avg_other
from t
group by year(date), month(date), product;
The rating for the "other" is a bit tricky. You need to add everything up and subtract out the current row -- and calculate the average by doing the sum divided by the count.

Get last value with delta from previous row

I have data
| account | type | position | created_date |
|---------|------|----------|------|
| 1 | 1 | 1 | 2016-08-01 00:00:00 |
| 2 | 1 | 2 | 2016-08-01 00:00:00 |
| 1 | 2 | 2 | 2016-08-01 00:00:00 |
| 2 | 2 | 1 | 2016-08-01 00:00:00 |
| 1 | 1 | 2 | 2016-08-02 00:00:00 |
| 2 | 1 | 1 | 2016-08-02 00:00:00 |
| 1 | 2 | 1 | 2016-08-03 00:00:00 |
| 2 | 2 | 2 | 2016-08-03 00:00:00 |
| 1 | 1 | 2 | 2016-08-04 00:00:00 |
| 2 | 1 | 1 | 2016-08-04 00:00:00 |
| 1 | 2 | 2 | 2016-08-07 00:00:00 |
| 2 | 2 | 1 | 2016-08-07 00:00:00 |
I need to get last positions (account, type, position) and delta from previous position. I'm trying to use Window functions but only get all rows and can't grouping them/get last.
SELECT
account,
type,
FIRST_VALUE(position) OVER w AS position,
FIRST_VALUE(position) OVER w - LEAD(position, 1, 0) OVER w AS delta,
created_date
FROM table
WINDOW w AS (PARTITION BY account ORDER BY created_date DESC)
I have result
| account | type | position | delta | created_date |
|---------|------|----------|-------|--------------|
| 1 | 1 | 1 | 1 | 2016-08-01 00:00:00 |
| 1 | 1 | 2 | 1 | 2016-08-02 00:00:00 |
| 1 | 1 | 2 | 0 | 2016-08-04 00:00:00 |
| 1 | 2 | 2 | 2 | 2016-08-01 00:00:00 |
| 1 | 2 | 1 | -1 | 2016-08-03 00:00:00 |
| 1 | 2 | 2 | 1 | 2016-08-07 00:00:00 |
| 2 | 1 | 2 | 2 | 2016-08-01 00:00:00 |
| 2 | 2 | 1 | 1 | 2016-08-01 00:00:00 |
| and so on |
but i need only last record for each account/type pair
| account | type | position | delta | created_date |
|---------|------|----------|-------|--------------|
| 1 | 1 | 2 | 0 | 2016-08-04 00:00:00 |
| 1 | 2 | 2 | 1 | 2016-08-07 00:00:00 |
| 2 | 1 | 1 | 0 | 2016-08-04 00:00:00 |
| and so on |
Sorry for my bad language and Thanks for any help.
My "best" try..
WITH cte_delta AS (
SELECT
account,
type,
FIRST_VALUE(position) OVER w AS position,
FIRST_VALUE(position) OVER w - LEAD(position, 1, 0) OVER w AS delta,
created_date
FROM table
WINDOW w AS (PARTITION BY account ORDER BY created_date DESC)
),
cte_date AS (
SELECT
account,
type,
MAX(created_date) AS created_date
FROM cte_delta
GROUP BY account, type
)
SELECT cd.*
FROM
cte_delta cd,
cte_date ct
WHERE
cd.account = ct.account
AND cd.type = ct.type
AND cd.created_date = ct.created_date

writing SQL query to show result in specific order

I have this table
+----+--------+------------+-----------+
| Id | day_id | subject_id | period_Id |
+----+--------+------------+-----------+
| 1 | 1 | 1 | 1 |
| 2 | 1 | 2 | 2 |
| 8 | 2 | 6 | 1 |
| 9 | 2 | 7 | 2 |
| 15 | 3 | 3 | 1 |
| 16 | 3 | 4 | 2 |
| 22 | 4 | 5 | 1 |
| 23 | 4 | 5 | 2 |
| 24 | 4 | 6 | 3 |
| 29 | 5 | 8 | 1 |
| 30 | 5 | 1 | 2 |
to something like this
| Id | day_id | subject_id | period_Id |
| 1 | 1 | 1 | 1 |
| 8 | 2 | 6 | 1 |
| 15 | 3 | 3 | 1 |
| 22 | 4 | 5 | 1 |
| 29 | 5 | 8 | 1 |
| 2 | 1 | 2 | 2 |
| 2 | 1 | 2 | 2 |
| 16 | 3 | 4 | 2 |
| 23 | 4 | 5 | 2 |
| 30 | 5 | 1 | 2 |
+----+--------+------------+-----------+
SO, I want to choose one period with a different subject each day and doing this for number of weeks. so first subject dose not come until all subject have been chosen.
You can ORDER BY period_id first and then by day_id:
SELECT *
FROM your_table
ORDER BY period_Id, day_Id
LiveDemo

Grouping / Ordering confusion

Hopefully what I have here is a simple question and explained to you in the correct manner.
I have the following Query:
--DECLARE DATES
DECLARE #Date datetime
DECLARE #DaysInMonth INT
DECLARE #i INT
--GIVE VALUES
SET #Date = Getdate()
SELECT #DaysInMonth = datepart(dd,dateadd(dd,-1,dateadd(mm,1,cast(cast(year(#Date) as varchar)+'-'+cast(month(#Date) as varchar)+'-01' as datetime))))
SET #i = 1
--MAKE TEMP TABLE
CREATE TABLE #TempDays
(
[days] VARCHAR(50)
)
WHILE #i <= #DaysInMonth
BEGIN
INSERT INTO #TempDays
VALUES(#i)
SET #i = #i + 1
END
SELECT #TempDays.days, DATEPART(dd, a.ActualDate) ActualDate, a.ActualAmount, (SELECT SUM(b.ActualAmount)
FROM UnpaidManagement..Actual b
WHERE b.ID <= a.ID) RunningTotal
FROM UnpaidManagement..Actual a
RIGHT JOIN #TempDays on a.ID = #TempDays.days
DROP TABLE #TempDays
Which produces the following output:
+------+------------+--------------+--------------+
| days | ActualDate | ActualAmount | RunningTotal |
+------+------------+--------------+--------------+
| 1 | 1 | 438706 | R 438 706 |
| 2 | 2 | 16239 | R 454 945 |
| 3 | 3 | 1611264 | R 2 066 209 |
| 4 | 4 | 1157777 | R 3 223 986 |
| 5 | 5 | 470662 | R 3 694 648 |
| 6 | 6 | 288628 | 3983276 |
| 7 | 7 | 245897 | 4229173 |
| 8 | 8 | 5235 | 4234408 |
| 9 | 10 | 375630 | 4610038 |
| 10 | 11 | 95610 | 4705648 |
| 11 | 12 | 87285 | 4792933 |
| 12 | 13 | 73399 | 4866332 |
| 13 | 14 | 59516 | 4925848 |
| 14 | 15 | 918915 | 5844763 |
| 15 | 17 | 1957285 | 7802048 |
| 16 | 18 | 489964 | 8292012 |
| 17 | 19 | 272304 | 8564316 |
| 18 | 20 | 378601 | 8942917 |
| 19 | 22 | 92374 | 9035291 |
| 20 | 23 | 198 | 9035489 |
| 21 | 24 | 1500820 | 10536309 |
| 22 | 25 | 2631057 | 13167366 |
| 23 | 26 | 6466505 | 19633871 |
| 24 | 27 | 3757350 | 23391221 |
| 25 | 28 | 3487466 | 26878687 |
| 26 | 29 | 160197 | 27038884 |
| 27 | 30 | 14000 | 27052884 |
| 28 | NULL | NULL | NULL |
| 29 | NULL | NULL | NULL |
| 30 | NULL | NULL | NULL |
| 31 | NULL | NULL | NULL |
+------+------------+--------------+--------------+
If you look closely at the table above, the "ActualDate" column is missing a few values, EG: 9, 16, etc.
And because of this, the rows are being pushed up instead of being grouped with their correct number? How would I accomplish a group by / anything to keep them in their correct row?
DESIRED OUTPUT:
+------+------------+--------------+--------------+
| days | ActualDate | ActualAmount | RunningTotal |
+------+------------+--------------+--------------+
| 1 | 1 | 438706 | R 438 706 |
| 2 | 2 | 16239 | R 454 945 |
| 3 | 3 | 1611264 | R 2 066 209 |
| 4 | 4 | 1157777 | R 3 223 986 |
| 5 | 5 | 470662 | R 3 694 648 |
| 6 | 6 | 288628 | 3983276 |
| 7 | 7 | 245897 | 4229173 |
| 8 | 8 | 5235 | 4234408 |
| 9 | NULL | NULL | NULL |
| 10 | 10 | 375630 | 4610038 |
| 11 | 11 | 95610 | 4705648 |
| 12 | 12 | 87285 | 4792933 |
| 13 | 13 | 73399 | 4866332 |
| 14 | 14 | 59516 | 4925848 |
| 15 | 15 | 918915 | 5844763 |
| 16 | NULL | NULL | NULL |
| 17 | 17 | 1957285 | 7802048 |
| 18 | 18 | 489964 | 8292012 |
| 19 | 19 | 272304 | 8564316 |
| 20 | 20 | 378601 | 8942917 |
| 21 | NULL | NULL | NULL |
| 22 | 22 | 92374 | 9035291 |
| 23 | 23 | 198 | 9035489 |
| 24 | 24 | 1500820 | 10536309 |
| 25 | 25 | 2631057 | 13167366 |
| 26 | 26 | 6466505 | 19633871 |
| 27 | 27 | 3757350 | 23391221 |
| 28 | 28 | 3487466 | 26878687 |
| 29 | 29 | 160197 | 27038884 |
| 30 | 30 | 14000 | 27052884 |
| 31 | NULL | NULL | NULL |
+------+------------+--------------+--------------+
I know this is a long one to read, but please let me know if I have explained this clearly enough. I have been trying to group by this whole morning, but I keep getting errors.
SELECT #TempDays.days, DATEPART(dd, a.ActualDate) ActualDate, a.ActualAmount, (SELECT SUM(b.ActualAmount)
FROM UnpaidManagement..Actual b
WHERE b.ID <= a.ID) RunningTotal
FROM UnpaidManagement..Actual a
RIGHT JOIN #TempDays on DATEPART(dd, a.ActualDate) = #TempDays.days
If you select the temp table as first table in the select and join to UnpaidManagement..Actual you have the days in correct row and order:
SELECT t.days
,DATEPART(dd, a.ActualDate) ActualDate
,a.ActualAmount
,(
SELECT SUM(b.ActualAmount)
FROM UnpaidManagement..Actual b
WHERE b.ID <= a.ID
) RunningTotal
FROM #TempDays AS t
INNER JOIN UnpaidManagement..Actual AS a ON a.IDENTITYCOL = t.days
ORDER BY t.days
After doing that, cou can add CASE WHEN to generate content for the NULL cells.

How to calculate running total (month to date) in SQL Server 2008

I'm trying to calculate a month-to-date total using SQL Server 2008.
I'm trying to generate a month-to-date count at the level of activities and representatives. Here are the results I want to generate:
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
From the following tables:
ACTIVITIES_FACT table
+-------------------+------+-----------+
| Representative_ID | Date | Activity |
+-------------------+------+-----------+
| 41 | 8/03 | Call |
| 41 | 8/04 | Call |
| 41 | 8/05 | Call |
+-------------------+------+-----------+
LU_TIME table
+-------+-----------------+--------+
| Month | Date | Week |
+-------+-----------------+--------+
| 8 | 8/01 | 8/08 |
| 8 | 8/02 | 8/08 |
| 8 | 8/03 | 8/08 |
| 8 | 8/04 | 8/08 |
| 8 | 8/05 | 8/08 |
+-------+-----------------+--------+
I'm not sure how to do this: I keep running into problems with multiple-counting or aggregations not being allowed in subqueries.
A running total is the summation of a sequence of numbers which is
updated each time a new number is added to the sequence, simply by
adding the value of the new number to the running total.
I THINK He wants a running total for Month by each Representative_Id, so a simple group by week isn't enough. He probably wants his Month_To_Date_Activities_Count to be updated at the end of every week.
This query gives a running total (month to end-of-week date) ordered by Representative_Id, Week
SELECT a.Representative_ID, l.month, l.Week, Count(*) AS Total_Week_Activity_Count
,(SELECT count(*)
FROM ACTIVITIES_FACT a2
INNER JOIN LU_TIME l2 ON a2.Date = l2.Date
AND a.Representative_ID = a2.Representative_ID
WHERE l2.week <= l.week
AND l2.month = l.month) Month_To_Date_Activities_Count
FROM ACTIVITIES_FACT a
INNER JOIN LU_TIME l ON a.Date = l.Date
GROUP BY a.Representative_ID, l.Week, l.month
ORDER BY a.Representative_ID, l.Week
| REPRESENTATIVE_ID | MONTH | WEEK | TOTAL_WEEK_ACTIVITY_COUNT | MONTH_TO_DATE_ACTIVITIES_COUNT |
|-------------------|-------|------|---------------------------|--------------------------------|
| 40 | 7 | 7/08 | 1 | 1 |
| 40 | 8 | 8/09 | 1 | 1 |
| 40 | 8 | 8/10 | 1 | 2 |
| 41 | 7 | 7/08 | 2 | 2 |
| 41 | 8 | 8/08 | 4 | 4 |
| 41 | 8 | 8/09 | 3 | 7 |
| 41 | 8 | 8/10 | 1 | 8 |
SQL Fiddle Sample
As I understand your question:
SELECT af.Representative_ID
, lt.Week
, COUNT(af.Activity) AS Qnt
FROM ACTIVITIES_FACT af
INNER JOIN LU_TIME lt ON lt.Date = af.date
GROUP BY af.Representative_ID, lt.Week
SqlFiddle
Representative_ID Week Month_To_Date_Activities_Count
41 2013-08-01 00:00:00.000 1
41 2013-08-08 00:00:00.000 3
USE tempdb;
GO
IF OBJECT_ID('#ACTIVITIES_FACT','U') IS NOT NULL DROP TABLE #ACTIVITIES_FACT;
CREATE TABLE #ACTIVITIES_FACT
(
Representative_ID INT NOT NULL
,Date DATETIME NULL
, Activity VARCHAR(500) NULL
)
IF OBJECT_ID('#LU_TIME','U') IS NOT NULL DROP TABLE #LU_TIME;
CREATE TABLE #LU_TIME
(
Month INT
,Date DATETIME
,Week DATETIME
)
INSERT INTO #ACTIVITIES_FACT(Representative_ID,Date,Activity)
VALUES
(41,'7/31/2013','Chat')
,(41,'8/03/2013','Call')
,(41,'8/04/2013','Call')
,(41,'8/05/2013','Call')
INSERT INTO #LU_TIME(Month,Date,Week)
VALUES
(8,'7/31/2013','8/01/2013')
,(8,'8/01/2013','8/08/2013')
,(8,'8/02/2013','8/08/2013')
,(8,'8/03/2013','8/08/2013')
,(8,'8/04/2013','8/08/2013')
,(8,'8/05/2013','8/08/2013')
--Begin Query
SELECT AF.Representative_ID
,LU.Week
,COUNT(*) AS Month_To_Date_Activities_Count
FROM #ACTIVITIES_FACT AS AF
INNER JOIN #LU_TIME AS LU
ON AF.Date = LU.Date
Group By AF.Representative_ID
,LU.Week