Postgres count number of rows and group them by timestamp - sql

Let's assume I have one table in Postgres with just 2 columns:
ID, which is the PK for the table (bigint)
time, which is of type timestamp
Is there any way to get the IDs grouped by time BY YEAR? When the time is 18 February 2005 it would fit in the 2005 group, so the result would be:
year number of rows
1998 2
2005 5
AND if the number of result rows is smaller than some number (for example 3), the SQL should return the result by month instead.
Something like
month number of rows
(February 2018) 5
(March 2018) 2
Is that possible in some nice way in Postgres SQL?

You can do it using window functions (as always).
I use this table:
TABLE times;
id | t
----+-------------------------------
1 | 2018-03-14 20:04:39.81298+01
2 | 2018-03-14 20:04:42.92462+01
3 | 2018-03-14 20:04:45.774615+01
4 | 2018-03-14 20:04:48.877038+01
5 | 2017-03-14 20:05:08.94096+01
6 | 2017-03-14 20:05:16.123736+01
7 | 2017-03-14 20:05:19.91982+01
8 | 2017-01-14 20:05:32.249175+01
9 | 2017-01-14 20:05:35.793645+01
10 | 2017-01-14 20:05:39.991486+01
11 | 2016-11-14 20:05:47.951472+01
12 | 2016-11-14 20:05:52.941504+01
13 | 2016-10-14 21:05:52.941504+02
(13 rows)
First, group by month (subquery per_month).
Then add the sum per year with a window function (subquery with_year).
Finally, use CASE to decide which one you will output and remove duplicates with DISTINCT.
SELECT DISTINCT
CASE WHEN yc > 5
THEN mc
ELSE yc
END AS count,
CASE WHEN yc > 5
THEN to_char(t, 'YYYY-MM')
ELSE to_char(t, 'YYYY')
END AS period
FROM (SELECT
mc,
sum(mc) OVER (PARTITION BY date_trunc('year', t)) AS yc,
t
FROM (SELECT
count(*) AS mc,
date_trunc('month', t) AS t
FROM times
GROUP BY date_trunc('month', t)
) per_month
) with_year
ORDER BY 2;
count | period
-------+---------
3 | 2016
3 | 2017-01
3 | 2017-03
4 | 2018
(4 rows)
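To try the same idea without a PostgreSQL server, here is a minimal sketch in Python with the standard-library sqlite3 module. It assumes SQLite 3.25 or newer (needed for window functions), drops the time-zone offsets from the sample rows, and replaces date_trunc/to_char with strftime and substr. The names counts_by_period, ym, n and the threshold parameter are illustrative only.

```python
import sqlite3

# Sample rows mirror the table above; time-zone offsets are dropped (assumption).
ROWS = [
    "2018-03-14 20:04:39", "2018-03-14 20:04:42", "2018-03-14 20:04:45",
    "2018-03-14 20:04:48", "2017-03-14 20:05:08", "2017-03-14 20:05:16",
    "2017-03-14 20:05:19", "2017-01-14 20:05:32", "2017-01-14 20:05:35",
    "2017-01-14 20:05:39", "2016-11-14 20:05:47", "2016-11-14 20:05:52",
    "2016-10-14 21:05:52",
]

QUERY = """
SELECT DISTINCT
       CASE WHEN yc > :threshold THEN mc ELSE yc END AS n,
       CASE WHEN yc > :threshold THEN ym ELSE substr(ym, 1, 4) END AS period
FROM (SELECT mc,
             ym,
             -- yearly total attached to every monthly row
             sum(mc) OVER (PARTITION BY substr(ym, 1, 4)) AS yc
      FROM (SELECT count(*) AS mc,
                   strftime('%Y-%m', t) AS ym
            FROM times
            GROUP BY strftime('%Y-%m', t)) AS per_month) AS with_year
ORDER BY period
"""


def counts_by_period(threshold=5):
    """Return (count, period) rows: monthly for years above the threshold, yearly otherwise."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE times (id INTEGER PRIMARY KEY, t TEXT)")
    conn.executemany("INSERT INTO times (t) VALUES (?)", [(t,) for t in ROWS])
    result = conn.execute(QUERY, {"threshold": threshold}).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for n, period in counts_by_period():
        print(n, period)
```

Raising the threshold above the largest yearly total makes every year collapse into a single yearly row.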

Just count years. If it's at least 3, then you group by years, else by months:
select
case when (select count(distinct extract(year from time)) from mytable) >= 3 then
to_char(time, 'yyyy')
else
to_char(time, 'yyyy-mm')
end as season,
count(*)
from mytable
group by season
order by season;
(Unlike many other DBMS, PostgreSQL allows alias names in the GROUP BY clause.)
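A small runnable sketch of this "count the years first" approach, again using Python's sqlite3 module rather than PostgreSQL (SQLite also accepts the alias in GROUP BY). The function name counts_by_season and the min_years parameter are made up for illustration, and extract/to_char are replaced by strftime.

```python
import sqlite3


def counts_by_season(timestamps, min_years=3):
    """Group by year if the data spans at least min_years distinct years, else by month."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mytable (time TEXT)")
    conn.executemany("INSERT INTO mytable VALUES (?)", [(t,) for t in timestamps])
    rows = conn.execute(
        """
        SELECT CASE WHEN (SELECT count(DISTINCT strftime('%Y', time))
                          FROM mytable) >= :min_years
                    THEN strftime('%Y', time)
                    ELSE strftime('%Y-%m', time)
               END AS season,
               count(*) AS n
        FROM mytable
        GROUP BY season
        ORDER BY season
        """,
        {"min_years": min_years},
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(counts_by_season(["2016-10-14", "2017-01-14", "2017-03-14", "2018-03-14"]))
    print(counts_by_season(["2017-01-14", "2017-03-14", "2018-03-14"]))
```

Note that the decision here is global (one granularity for the whole result), whereas the window-function query decides per year.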

Related

How to calculate total worktime per week [SQL]

I have a table of EMPLOYEES that contains information about the DATE and WORKTIME per that day. Fx:
ID | DATE | WORKTIME |
----------------------------------------
1 | 1-Sep-2014 | 4 |
2 | 2-Sep-2014 | 6 |
1 | 3-Sep-2014 | 5.5 |
1 | 4-Sep-2014 | 7 |
2 | 4-Sep-2014 | 4 |
1 | 9-Sep-2014 | 8 |
and so on.
Question: How can I create a query that would allow me to calculate the amount of time worked per week (HOURS_PERWEEK)? I understand that I need a summation of WORKTIME together with grouping by both ID and week, but so far my trials as well as googling didn't yield any results. Any ideas on this? Thank you in advance!
edit:
Got a solution of
select id, sum (worktime), trunc(date, 'IW') week
from employees
group by id, TRUNC(date, 'IW');
But will need somehow to connect that particular output with DATE table by updating a newly created column such as WEEKLY_TIME. Any hints on that?
You can find the start of the ISO week, which will always be a Monday, using TRUNC("DATE", 'IW').
So if, in the query, you GROUP BY the id and the start of the week TRUNC("DATE", 'IW'), then you can SELECT the id and aggregate to find the SUM of the WORKTIME column for each id.
Since this appears to be a homework question and you haven't attempted a query, I'll leave it at this to point you in the correct direction and you can complete the query.
Update
Now I need to create another column (lets call it WEEKLY_TIME) and populate it with values from the current output, so that Sep 1,3,4 (for ID=1) would all contain value 16.5, specifying that on that day (that is within the certain week) that person worked 16.5 in total. And for ID=2 it would then be a value of 10 for both Sep 2 and 4.
For this, if I understand correctly, you appear to not want to use aggregation functions and want to use the analytic version of the function:
select id,
"DATE",
trunc("DATE", 'IW') week,
worktime,
sum (worktime) OVER (PARTITION BY id, trunc("DATE", 'IW'))
AS weekly_time
from employees;
Which, for the sample data:
CREATE TABLE employees (ID, "DATE", WORKTIME) AS
SELECT 1, DATE '2014-09-01', 4 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-02', 6 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-03', 5.5 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-04', 7 FROM DUAL UNION ALL
SELECT 2, DATE '2014-09-04', 4 FROM DUAL UNION ALL
SELECT 1, DATE '2014-09-09', 8 FROM DUAL;
Outputs:
ID | DATE                | WEEK                | WORKTIME | WEEKLY_TIME
---|---------------------|---------------------|----------|------------
1  | 2014-09-01 00:00:00 | 2014-09-01 00:00:00 | 4        | 16.5
1  | 2014-09-03 00:00:00 | 2014-09-01 00:00:00 | 5.5      | 16.5
1  | 2014-09-04 00:00:00 | 2014-09-01 00:00:00 | 7        | 16.5
1  | 2014-09-09 00:00:00 | 2014-09-08 00:00:00 | 8        | 8
2  | 2014-09-04 00:00:00 | 2014-09-01 00:00:00 | 4        | 10
2  | 2014-09-02 00:00:00 | 2014-09-01 00:00:00 | 6        | 10
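The analytic SUM can also be tried outside Oracle. The following is a rough sketch with Python's sqlite3 module, assuming SQLite 3.25 or newer. SQLite has no TRUNC(..., 'IW'), so the Monday of each week is computed with the date modifiers 'weekday 0' and '-6 days'. The column name work_date and the function name weekly_times are placeholders chosen for this example.

```python
import sqlite3

# Sample data from the question, with ISO-formatted dates.
ROWS = [
    (1, "2014-09-01", 4), (2, "2014-09-02", 6), (1, "2014-09-03", 5.5),
    (1, "2014-09-04", 7), (2, "2014-09-04", 4), (1, "2014-09-09", 8),
]

QUERY = """
SELECT id, work_date, week_start, worktime,
       sum(worktime) OVER (PARTITION BY id, week_start) AS weekly_time
FROM (SELECT id, work_date, worktime,
             -- move forward to Sunday, then back six days: the Monday of that week
             date(work_date, 'weekday 0', '-6 days') AS week_start
      FROM employees) AS with_week
ORDER BY id, work_date
"""


def weekly_times():
    """Return every row together with the total worked by that id in that week."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE employees (id INTEGER, work_date TEXT, worktime REAL)")
    conn.executemany("INSERT INTO employees VALUES (?, ?, ?)", ROWS)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in weekly_times():
        print(row)
```

Because the sum is a window function rather than a GROUP BY aggregate, every original row is kept and simply gains the weekly total.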
edit: answer submitted without noticing "Oracle" tag. Otherwise, question answered here: Oracle SQL - Sum and group data by week
Select employee_Id,
DATEPART(week, workday) as [Week],
sum (worktime) as [Weekly Hours]
from WORK
group by employee_id, DATEPART(week, workday)
https://dbfiddle.uk/?rdbms=sqlserver_2019&fiddle=238b229156a383fa3c466b6c3c2dee1e

SQL sum and previous row [duplicate]

I have the following table:
________________________
date | amount
________________________
01-01-2019 | 10
01-01-2019 | 10
01-01-2019 | 10
01-01-2019 | 10
02-01-2019 | 5
02-01-2019 | 5
02-01-2019 | 5
02-01-2019 | 5
03-01-2019 | 20
03-01-2019 | 20
These are mutation values by date. I would like my query to return the cumulative summed amount by date. So for 02-01-2019 I need 40 (4 times 10) + 20 (4 times 5) = 60. For 03-01-2019 I would need 40 (4 times 10) + 20 (4 times 5) + 40 (2 times 20) = 100, and so on. Is this possible in one query? How do I achieve this?
My current query to get the individual mutations:
Select s.date,
Sum(s.amount) As Sum_amount
From dbo.Financieel As s
Group By s.date
You can try the query below:
select dateval,
SUM(amt) OVER(ORDER BY dateval ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) as amt
from
(
SELECT
dateval,
SUM(amount) amt
FROM t2 group by dateval
)A
OUTPUT:
dateval amt
01/01/2019 00:00:00 40
01/02/2019 00:00:00 60
01/03/2019 00:00:00 100
Try the script below to get your desired output:
SELECT A.date,
(SELECT SUM(amount) FROM <your_table> WHERE Date <= A.Date) C_Total
FROM <your_table> A
GROUP BY date
ORDER BY date
Output is-
date C_Total
01-01-2019 40
02-01-2019 60
03-01-2019 100
I suggest using a window function, like this:
select distinct date, sum(amount) over (order by date) as running_total
from <your_table>
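The running-total pattern from these answers can be checked quickly with Python's sqlite3 module. This is only a sketch: it assumes SQLite 3.25 or newer, uses ISO dates so that text ordering matches date ordering, and the table name mutations and function name running_totals are invented for the example.

```python
import sqlite3

# Sample data from the question: 4 x 10, 4 x 5, 2 x 20 on three consecutive dates.
ROWS = (
    [("2019-01-01", 10)] * 4
    + [("2019-01-02", 5)] * 4
    + [("2019-01-03", 20)] * 2
)


def running_totals(rows):
    """Sum per date first, then accumulate those daily sums in date order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mutations (date TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO mutations VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT date,
               sum(amt) OVER (ORDER BY date
                              ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS running_total
        FROM (SELECT date, sum(amount) AS amt
              FROM mutations
              GROUP BY date) AS per_day
        ORDER BY date
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in running_totals(ROWS):
        print(row)
```

Aggregating per date in the inner query keeps the result at one row per date, which is what the question asks for.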

Those who listened to more than 10 mins each month in the last 6 months

I'm trying to figure out the count of users who listened to more than 10 mins each month in the last 6 months
We have this event: Song_stopped_listen and one attribute is session_progress_ms
Now I'm trying to see the monthly evolution of the count of this cohort over the last 6 months.
I'm using BigQuery and this is the query I tried, but I feel that something is off semantically and I can't put my finger on it:
SELECT
CONCAT(CAST(EXTRACT(YEAR FROM DATE (timestamp)) AS STRING),"-",CAST(EXTRACT(MONTH FROM DATE (timestamp)) AS STRING)) AS date
,SUM(absl.session_progress_ms/(1000*60*10)) as total_10_ms, COUNT(DISTINCT u.id) as total_10_listeners
FROM ios.song_stopped_listen as absl
LEFT JOIN ios.users u on absl.user_id = u.id
WHERE absl.timestamp > '2018-05-01'
Group by 1
HAVING(total_10_ms > 1)
Please help figure out what I'm doing wrong here.
Thank you.
data Sample:
user_id | session_progress_ms | timestamp
1 | 10000 | 2017-10-10 14:34:25.656 UTC
What I want to have:
Month-year | Count of users who listened to more than 10 mins
2018-5 | 500
2018-6 | 600
2018-7 | 300
2018-8 | 5100
2018-9 | 4500
2018-10 | 1500
2018-11 | 1500
2018-12 | 2500
Use multiple levels of aggregation:
select user_id
from (select ssl.user_id, timestamp_trunc(ssl.timestamp, month) as mon,
sum(ssl.session_progress_ms/(1000*60)) as total_minutes
from ios.song_stopped_listen as ssl
where date(ssl.timestamp) < date_trunc(current_date, month) and
date(ssl.timestamp) >= date_sub(date_trunc(current_date, month), interval 6 month)
group by 1, 2
) u
where total_minutes >= 10
group by user_id
having count(*) = 6;
To get the count, just use this as a subquery with count(*).
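To see the two aggregation levels at work on a small data set, here is a hedged sketch in Python with sqlite3 instead of BigQuery. The month range is passed in explicitly rather than derived from the current date, so the result is reproducible. The function name steady_listeners, the column name ts, and the parameters are assumptions made for this example.

```python
import sqlite3

QUERY = """
SELECT user_id
FROM (SELECT user_id,
             strftime('%Y-%m', ts) AS mon,
             sum(session_progress_ms) / 60000.0 AS total_minutes
      FROM song_stopped_listen
      WHERE strftime('%Y-%m', ts) BETWEEN :first_month AND :last_month
      GROUP BY user_id, mon) AS per_user_month   -- level 1: minutes per user per month
WHERE total_minutes >= :min_minutes
GROUP BY user_id                                 -- level 2: qualifying months per user
HAVING count(*) = :n_months
ORDER BY user_id
"""


def steady_listeners(events, first_month, last_month, min_minutes=10):
    """Users reaching min_minutes in every month of the inclusive 'YYYY-MM' range."""
    y1, m1 = map(int, first_month.split("-"))
    y2, m2 = map(int, last_month.split("-"))
    n_months = (y2 - y1) * 12 + (m2 - m1) + 1

    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE song_stopped_listen "
        "(user_id INTEGER, session_progress_ms INTEGER, ts TEXT)"
    )
    conn.executemany("INSERT INTO song_stopped_listen VALUES (?, ?, ?)", events)
    rows = conn.execute(
        QUERY,
        {
            "first_month": first_month,
            "last_month": last_month,
            "min_minutes": min_minutes,
            "n_months": n_months,
        },
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


if __name__ == "__main__":
    sample = [
        (1, 660000, "2018-11-03 10:00:00"),   # 11 minutes in November
        (1, 720000, "2018-12-01 09:00:00"),   # 12 minutes in December
        (2, 1200000, "2018-11-10 12:00:00"),  # 20 minutes in November
        (2, 300000, "2018-12-11 12:00:00"),   # only 5 minutes in December
    ]
    print(steady_listeners(sample, "2018-11", "2018-12"))
```

The count of the cohort is then just the length of the returned list, which corresponds to wrapping the query in a count(*) subquery.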

Access SQL count number of people group by week number

I need to count how many people are working given a week number.
Here's my people table (date is US format) :
name | surname| date_of_entry | date_of_exit
-----|--------|---------------|-------------
foo | bar | 1/1/2006 | 1/8/2006
foo1 | bar1 | 1/5/2010 |
foo2 | bar2 | 2/3/2015 | 3/4/2015
and I'd like, for a given year, to get every week number with the number of people working during that week. I hope you understand me, because English is not my native language, sorry.
I've done some research and from what I understand I need to create a table with all weeks starting from 1/1/2006 and ending "now" (because people with no exit date are still working), according to the example above, so that I can test for each person whether he was working during a given week and count him in my query.
I'm still a student programmer, but this seems to be a pretty complex SQL query to me.
Expected output for a query for year 2006 (with the new year starting a monday) :
week_number | count
------------|------
1 | 1
2 | 1
3 | 0
etc.. | 0
until 1/5/2010 where all weeks have 1 in the field "count"
and then query for year 2015 :
week_number | count
------------|------
1 | 1
2 | 1
... | 1
9 | 2
... | 2
14 | 1
... | 1
If anybody can help me to resolve this it would be awesome, thanks!
You are lucky: I needed a 15-minute break to clear my head.
Step 1 : Create Calendar table
Create a new table named Calendar with the following fields
id : autonumber
Cal_Year : number
Cal_Week : number
In a module, add the following code and execute it (F5) :
Private Sub Create_Calendar_table()
Dim Y As Integer
Dim W As Integer
For Y = 2006 To 2016
For W = 1 To 52
DoCmd.RunSQL "INSERT INTO Calendar (cal_year, cal_week) VALUES (" & Y & "," & W & ")"
Next W
Next Y
End Sub
Calendar table is now ready to use :
ID Cal_year Cal_week
1 2006 1
2 2006 2
3 2006 3
4 2006 4
5 2006 5
and so on...
Step 2 : Create the query
Note that I am in Europe so my dates are DD/MM. This won't affect your results.
I decompose so you understand the process.
First we need to create a date from the year/week in the calendar table, this can be achieved like this
SELECT Cal_year, Cal_week, DateAdd("ww",Cal_week,DateSerial(Cal_year,1,1)) AS thedate
FROM Calendar
Cal_year Cal_week thedate
2006 1 8/01/2006
2006 2 15/01/2006
2006 3 22/01/2006
2006 4 29/01/2006
2006 5 5/02/2006
and so on...
Next, since we will work on ranges of dates, it is important to substitute the current date when a person's date_of_exit is NULL, like this:
nz(date_of_exit, Now)
The field is prepared.
Then, the trick.
We need to JOIN our calendar table with the people table in a manner that will return a record for every week in which a person is present.
The key to achieve this is the ON...BETWEEN...AND
SELECT C.Cal_year, C.Cal_Week, P.pname, P.psurname, P.date_of_entry, nz(P.date_of_exit, Now) AS exit_date
FROM [Calendar] C
INNER JOIN ( SELECT [Name] AS pname, [surname] as psurname, date_of_entry, date_of_exit
FROM people
) P
ON (DateAdd("ww",C.Cal_week,DateSerial(C.Cal_year,1,1)) BETWEEN P.date_of_entry AND nz(P.date_of_exit, Now))
ORDER BY C.Cal_year, C.Cal_Week
Cal_year Cal_Week pname psurname date_of_entry exit_date
2006 1 foo bar 1/01/2006 8/01/2006
2010 1 foo1 bar1 5/01/2010 22/04/2016 13:04:39
2010 2 foo1 bar1 5/01/2010 22/04/2016 13:04:39
2010 3 foo1 bar1 5/01/2010 22/04/2016 13:04:39
2010 4 foo1 bar1 5/01/2010 22/04/2016 13:04:39
Note that if you need ALL WEEKS since 2006, even those for which nobody is present, just change the INNER JOIN to a LEFT JOIN.
And finally, we exploit the previous query to count the presences by doing a GROUP BY on year and week of the calendar table, and we specify the year 2015 in the WHERE clause otherwise it will count everything since 2006. Which implies that it is very easy to count the presences for any year.
SELECT yyyy, ww , count(*) AS cnt
FROM
(
SELECT C.Cal_year AS yyyy, C.Cal_Week AS ww
FROM [Calendar] C
INNER JOIN ( SELECT [Name] AS pname, [surname] as psurname,
date_of_entry,
date_of_exit
FROM people
) P ON (DateAdd("ww",C.Cal_week,DateSerial(C.Cal_year,1,1)) BETWEEN P.date_of_entry AND nz(P.date_of_exit, Now))
)
WHERE yyyy=2015
GROUP BY yyyy, ww
ORDER BY yyyy, ww
yyyy ww cnt
2015 1 1
2015 2 1
2015 3 1
2015 4 1
2015 5 1
2015 6 1
2015 7 1
2015 8 1
2015 9 2
2015 10 2
2015 11 2
2015 12 2
2015 13 2
2015 14 1
2015 15 1
2015 16 1
Well, it took me 40 minutes finally...
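The calendar-join idea is not specific to Access. Below is a minimal sketch in Python with sqlite3 that follows the same steps: build one date per week (1 January plus N weeks, as DateAdd("ww", ...) does above), join it to people with BETWEEN, and count. It differs in a few labeled ways: the calendar comes from a recursive CTE instead of a stored table, a LEFT JOIN is used so empty weeks show 0, dates are ISO strings read as DD/MM like in the outputs above, and "today" is a parameter fixed to 2016-04-22. The function name headcount_per_week is illustrative.

```python
import sqlite3

# People from the question; dates interpreted as DD/MM (assumption), written in ISO form.
PEOPLE = [
    ("foo", "bar", "2006-01-01", "2006-01-08"),
    ("foo1", "bar1", "2010-01-05", None),          # no exit date: still working
    ("foo2", "bar2", "2015-03-02", "2015-04-03"),
]

QUERY = """
WITH RECURSIVE calendar(cal_week) AS (
    SELECT 1
    UNION ALL
    SELECT cal_week + 1 FROM calendar WHERE cal_week < 52
),
week_dates AS (
    -- one representative date per week: 1 January plus cal_week weeks
    SELECT cal_week,
           date(:year || '-01-01', '+' || (cal_week * 7) || ' days') AS thedate
    FROM calendar
)
SELECT w.cal_week, count(p.name) AS cnt
FROM week_dates AS w
LEFT JOIN people AS p
       ON w.thedate BETWEEN p.date_of_entry AND coalesce(p.date_of_exit, :today)
GROUP BY w.cal_week
ORDER BY w.cal_week
"""


def headcount_per_week(year, today="2016-04-22"):
    """Return {week_number: people present} for the given year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE people (name TEXT, surname TEXT, date_of_entry TEXT, date_of_exit TEXT)"
    )
    conn.executemany("INSERT INTO people VALUES (?, ?, ?, ?)", PEOPLE)
    rows = conn.execute(QUERY, {"year": str(year), "today": today}).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    for week, cnt in headcount_per_week(2015).items():
        print(week, cnt)
```

count(p.name) rather than count(*) is what makes weeks without any matching person come out as 0 under the LEFT JOIN.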
You can create a query to find the weeks working:
Select
[name],
surname,
year,
week
From
PeopleTable,
WeekTable
Where
(date_of_entry <= week_start And DateDiff("d", week_start, date_of_exit) >= 3)
Or
(date_of_entry >= week_start And date_of_exit <= week_end
And
DateDiff("d", date_of_entry, date_of_exit) >= 3)
Or
(DateDiff("d", date_of_entry, week_end) >= 3 And date_of_exit >= week_end)
Group By
[name],
surname,
year,
week
Now, save this and create a new query where you use WeekTable as source (to list all weeks) with an outer join to the query above (to list the worked weeks). In this, Group By the year and week and add a count to get the count of working employees for each week.

How to get the count of distinct values until a time period Impala/SQL?

I have a raw table recording customer ids coming to a store over a particular time period. Using Impala, I would like to calculate the number of distinct customer IDs coming to the store until each day. (e.g., on day 3, 5 distinct customers visited so far)
Here is a simple example of the raw table I have:
Day ID
1 1234
1 5631
1 1234
2 1234
2 4456
2 5631
3 3482
3 3452
3 1234
3 5631
3 1234
Here is what I would like to get:
Day Count(distinct ID) until that day
1 2
2 3
3 5
Is there way to easily do this in a single query?
I'm not 100% sure this will work on Impala, but it should if you have a days table, or a way to create a derived table on the fly in Impala.
CREATE TABLE days ("DayC" int);
INSERT INTO days
("DayC")
VALUES (1), (2), (3);
OR
CREATE TABLE days AS
SELECT DISTINCT "Day" AS "DayC"
FROM sales
You can use this query (demonstrated in PostgreSQL):
SELECT "DayC", COUNT(DISTINCT "ID")
FROM sales
cross JOIN days
WHERE "Day" <= "DayC"
GROUP BY "DayC"
OUTPUT
| DayC | count |
|------|-------|
| 1 | 2 |
| 2 | 3 |
| 3 | 5 |
UPDATED VERSION
SELECT T."DayC", COUNT(DISTINCT "ID")
FROM sales
cross JOIN (SELECT DISTINCT "Day" as "DayC" FROM sales) T
WHERE "Day" <= T."DayC"
GROUP BY T."DayC"
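The self-join above can be verified on the sample data with a short Python sqlite3 sketch. It has not been tested on Impala; it only illustrates the logic. The table name sales matches the answer, while the lowercase column names and the function name cumulative_distinct are chosen for this example.

```python
import sqlite3

# Sample visits from the question: (day, customer id).
VISITS = [
    (1, 1234), (1, 5631), (1, 1234),
    (2, 1234), (2, 4456), (2, 5631),
    (3, 3482), (3, 3452), (3, 1234), (3, 5631), (3, 1234),
]


def cumulative_distinct(visits):
    """For each day, count distinct ids seen on that day or any earlier day."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (day INTEGER, id INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", visits)
    result = conn.execute(
        """
        SELECT d.day, count(DISTINCT s.id) AS customers_so_far
        FROM (SELECT DISTINCT day FROM sales) AS d
        JOIN sales AS s ON s.day <= d.day   -- every visit up to and including d.day
        GROUP BY d.day
        ORDER BY d.day
        """
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in cumulative_distinct(VISITS):
        print(row)
```

The join pairs each day with all earlier visits, so its cost grows roughly with days times rows; that is fine for a sketch but worth keeping in mind for large tables.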
try this one:
select day, count(distinct(id)) from yourtable group by day