Group data by 5-minute buckets - SQL

I have a table like this, with epoch time stored in one column. I want to use the epoch time column to group the data into 5-minute buckets.
-------------------
Name | epoch_time |
A | 1585977780 |
B | 1585977780 |
C | 1585978080 |
-------------------

You have Unix time, so you can use arithmetic:
select floor(epoch_time / (60 * 5)) * 60 * 5 as minutes_5, count(*)
from t
group by floor(epoch_time / (60 * 5)) * 60 * 5
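As a sketch, the same bucketing arithmetic can be run against an in-memory SQLite table built from the sample rows (table and column names follow the question; SQLite's integer division makes the floor() implicit):

```python
import sqlite3

# In-memory table mirroring the question's sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (name TEXT, epoch_time INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("A", 1585977780), ("B", 1585977780), ("C", 1585978080)],
)

# Integer-divide the epoch by 300 seconds and multiply back to get
# the start of each 5-minute bucket, then count rows per bucket.
rows = conn.execute(
    """
    SELECT (epoch_time / 300) * 300 AS minutes_5, COUNT(*)
    FROM t
    GROUP BY minutes_5
    ORDER BY minutes_5
    """
).fetchall()
print(rows)  # [(1585977600, 2), (1585977900, 1)]
```

A and B land in the same 5-minute bucket; C, 300 seconds later, starts a new one.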

Related

Pgsql: how to filter a report by the last N days?

Let's say I have a table Transaction with the following data:
Transaction
| id | user_id | amount | created_at |
|---:|--------:|-------:|:----------:|
|  1 |       1 |    100 | 2021-09-11 |
|  2 |       1 |   1000 | 2021-09-12 |
|  3 |       1 |   -100 | 2021-09-12 |
|  4 |       2 |    200 | 2021-10-13 |
|  5 |       2 |   3000 | 2021-10-20 |
|  6 |       3 |   -200 | 2021-10-21 |
I want to filter this data by the last 4 days, 15 days, or 28 days.
Note: if the user clicks the 4-days option, this filters to the last 4 days.
I want this data:
total commission (sum of all transaction amounts * 5%)
total top-up (the positive amounts)
total debut: the debit total (the negative amounts)
Please help me out, and sorry for the basic question!
Expected result, if the user filters by the last 4 days and the current date is 2021-09-16:
- TotalCommission: (1000 - 100) * 5% = 45
- TotalTopUp: 1000
- TotalDebut: -100
I suspect you want:
SELECT SUM(amount) * 0.05 AS TotalCommission,
       SUM(amount) FILTER (WHERE amount > 0) AS TotalUp,
       SUM(amount) FILTER (WHERE amount < 0) AS TotalDown
FROM t
WHERE created_at >= CURRENT_DATE - 4 * INTERVAL '1 DAY';
This assumes that there are no future created_at (which seems like a reasonable assumption). You can replace the 4 with whatever value you want.
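As a rough check, the same aggregation can be sketched in SQLite via Python, with FILTER replaced by the more portable CASE WHEN and the current date pinned to the question's 2021-09-16 (the table name txn is made up to sidestep the reserved word "transaction"):

```python
import sqlite3

# Minimal sketch: user 1's three transactions, aggregated over a
# 4-day window ending at a pinned "current" date of 2021-09-16.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE txn (id INT, user_id INT, amount REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO txn VALUES (?, ?, ?, ?)",
    [
        (1, 1, 100, "2021-09-11"),
        (2, 1, 1000, "2021-09-12"),
        (3, 1, -100, "2021-09-12"),
    ],
)

total_commission, total_up, total_down = conn.execute(
    """
    SELECT SUM(amount) * 0.05,
           SUM(CASE WHEN amount > 0 THEN amount END),
           SUM(CASE WHEN amount < 0 THEN amount END)
    FROM txn
    WHERE created_at >= date('2021-09-16', '-4 day')
    """
).fetchone()
print(total_commission, total_up, total_down)
```

The 2021-09-11 row falls outside the window, so the sums match the expected result: commission 45, top-up 1000, debit -100.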
Take a look at the aggregate functions sum, max and min. The last four days should look like this:
SELECT
    sum(amount) * .05 AS TotalCommission,
    max(amount) AS TotalUp,
    min(amount) AS TotalDebut
FROM t
WHERE created_at BETWEEN CURRENT_DATE - 4 AND CURRENT_DATE;
Demo: db<>fiddle
Your description indicates specifying the number of days to process, and your expected results suggest you are looking for results by user_id (perhaps not, as only user 1 falls into the range). Perhaps the best option would be to wrap the query in a SQL function. Then, as all your data is well into the future, you would need to parameterize the end date as well. So the result becomes:
create or replace
function Commissions( user_id_in integer default null
, days_before_in integer default 0
, end_date_in date default current_date
)
returns table( user_id integer
, totalcommission numeric
, totalup numeric
, totaldown numeric
)
language sql
as $$
select user_id
, sum(amount) * 0.05
, sum(amount) filter (where amount > 0)
, sum(amount) filter (where amount < 0)
from transaction
where (user_id = user_id_in or user_id_in is null)
and created_at <@ daterange( (end_date_in - days_before_in * interval '1 day')::date
, end_date_in
, '[]'::text -- indicates inclusive of both dates
)
group by user_id;
$$;
See demo here. You may just want to play around with the parameters and see the results.

Postgres: Calculate number of days as float considering hours/minutes

I would like to compute the number of days since 1899/12/30, taking into consideration the number of hours and minutes.
So, for example data in my table:
+--------------------+-------------+
| dob | n_days |
+--------------------+-------------+
|1980-08-09 13:34:10 | 29442.5654 |
|2005-12-15 23:10:00 | 38701.6528 |
|2020-02-26 15:56:00 | 43887.6639 |
+--------------------+-------------+
Query:
SELECT DATE_PART('day', dob -'1899-12-30 00:00:00'::TIMESTAMPTZ) AS n_days
FROM my_date;
Returns only whole day count:
n_days
---------
29441
38701
43887
Consider:
extract(epoch from dob - '1899-12-30 00:00:00'::timestamptz) / 60 / 60 / 24
Rationale:
the timestamp subtraction gives you an interval that represents the difference between the timestamps
extract(epoch from ...) turns this to a number of seconds
all that is left to do is divide by the number of seconds that there is in a day
Demo on DB Fiddle:
with t as (
select '1980-08-09 13:34:10'::timestamptz dob
union all select '2005-12-15 23:10:00'::timestamptz
union all select '2020-02-26 15:56:00'::timestamptz
)
select
dob,
extract(epoch from dob - '1899-12-30 00:00:00'::timestamptz) / 60 / 60 / 24 as n_days
from t
dob | n_days
:--------------------- | :-----------------
1980-08-09 13:34:10+01 | 29442.52372685185
2005-12-15 23:10:00+00 | 38701.965277777774
2020-02-26 15:56:00+00 | 43887.66388888889
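The same arithmetic can be sketched in plain Python with naive timestamps (no time zone), which reproduces the question's expected value for the first row:

```python
from datetime import datetime

# Subtract the 1899-12-30 epoch and divide the elapsed seconds by
# 86400 (seconds per day) to get a fractional day count.
EPOCH = datetime(1899, 12, 30)

def days_since_epoch(ts: datetime) -> float:
    return (ts - EPOCH).total_seconds() / 86400

n_days = days_since_epoch(datetime(1980, 8, 9, 13, 34, 10))
print(round(n_days, 4))  # 29442.5654
```

Note the fiddle output above differs slightly because the timestamps there carry time-zone offsets; with naive timestamps the fraction is exactly hours-and-minutes past midnight.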

How to dynamically perform a weighted random row selection in PostgreSQL?

I have following table for an app where a student is assigned task to play educational game.
Student{id, last_played_datetime, total_play_duration, total_points_earned}
The app selects a student at random and assigns the task. The student earns a point just for playing the game. The app records the date and time when the game was played and for how long. I want to randomly select a student and assign the task; at a time only one student can be assigned the task. To give all students an equal opportunity, I dynamically calculate a weight for each student using the date and time the student last played, the total play duration, and the total points earned. A student is then randomly chosen, influenced by that weight.
How do I, in PostgreSQL, randomly select a row from a table depending on the dynamically calculated weight of the row?
The weight for each student is calculated as follows: (minutes(current_datetime - last_played_datetime) * 0.75 + total_play_duration * 0.5 + total_points_earned * 0.25) / 1.5
Sample data:
+====+======================+=====================+=====================+
| Id | last_played_datetime | total_play_duration | total_points_earned |
+====+======================+=====================+=====================+
| 1 | 01/02/2011 | 300 mins | 7 |
+----+----------------------+---------------------+---------------------+
| 2 | 06/02/2011 | 400 mins | 6 |
+----+----------------------+---------------------+---------------------+
| 3 | 01/03/2011 | 350 mins | 8 |
+----+----------------------+---------------------+---------------------+
| 4 | 22/03/2011 | 550 mins | 9 |
+----+----------------------+---------------------+---------------------+
| 5 | 01/03/2011 | 350 mins | 8 |
+----+----------------------+---------------------+---------------------+
| 6 | 10/01/2011 | 130 mins | 2 |
+----+----------------------+---------------------+---------------------+
| 7 | 03/01/2011 | 30 mins | 1 |
+----+----------------------+---------------------+---------------------+
| 8 | 07/10/2011 | 0 mins | 0 |
+----+----------------------+---------------------+---------------------+
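As a sanity check, the weight formula can be sketched in Python using row 1's sample data, with the "current" datetime pinned so the result is reproducible (the pinned date 2011-04-01 is illustrative, not from the question):

```python
from datetime import datetime

# Weight = (minutes since last played * 0.75
#           + total play duration * 0.5
#           + total points earned * 0.25) / 1.5
def weight(now, last_played, play_minutes, points):
    minutes_since = (now - last_played).total_seconds() / 60
    return (minutes_since * 0.75 + play_minutes * 0.5 + points * 0.25) / 1.5

# Student 1: last played 01/02/2011, 300 mins, 7 points.
w = weight(datetime(2011, 4, 1), datetime(2011, 2, 1), 300, 7)
print(w)
```

The elapsed-minutes term dominates, so students who have not played recently get picked far more often, which matches the stated goal.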
Here is a solution that works as follows:
first compute the weight of each student
sum the weights of all students and multiply the total by a random number
then pick the first student whose weight is at or above that random target
Query:
with
student_with_weight as (
select
id,
(
extract(epoch from (now() - last_played_datetime)) / 60 * 0.75
+ total_play_duration * 0.5
+ total_points_earned * 0.25
) / 1.5 weight
from student
),
random_weight as (
select random() * (select sum(weight) weight from student_with_weight ) weight
)
select id
from
student_with_weight s
inner join random_weight r on s.weight >= r.weight
order by id
limit 1;
You can use a cumulative sum on the weights and compare it to random(). It looks like this:
with s as (
select s.*,
<your expression> as weight
from s
)
select s.*
from (select s.*,
sum(weight) over (order by weight) as running_weight,
sum(weight) over () as total_weight
from s
) s cross join
(values (random())) r(rand)
where r.rand * total_weight >= running_weight - weight and
r.rand * total_weight < running_weight;
The values() clause ensures that the random value is calculated only once for the query. Funky things can happen if you put random() in the where clause, because it will be recalculated for each comparison.
Basically, you can think of the cumulative sum as dividing the total weight into discrete regions; the random value then just chooses one of them.
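The cumulative-sum idea can be sketched outside the database in plain Python (the ids and weights below are made up; a zero-weight row demonstrates that it can never be chosen):

```python
import bisect
import random
from itertools import accumulate

# Build running (cumulative) weights, draw one uniform number in
# [0, total), and pick the first row whose running weight exceeds it.
def weighted_pick(rows, rng):
    ids, weights = zip(*rows)
    running = list(accumulate(weights))   # e.g. [5.0, 5.0, 20.0]
    target = rng.random() * running[-1]   # one draw, reused for the whole pick
    return ids[bisect.bisect_right(running, target)]

rows = [(1, 5.0), (2, 0.0), (3, 15.0)]
rng = random.Random(42)
picks = {weighted_pick(rows, rng) for _ in range(1000)}
print(picks)  # {1, 3} -- the zero-weight id 2 is never selected
```

Drawing the random number once per pick mirrors the answer's point about not calling random() per comparison.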

Split rows into m parts and build the average of each part [SQLite]

I've got the following problem.
Given a table consisting basically of two columns, a timestamp and a value, I need to reduce the n rows between two timestamps down to m rows by averaging both the values and the timestamps.
Let's say I want all data between times 15 and 85 in at most 3 rows.
time | value
10 | 7
20 | 6
30 | 2
40 | 9
50 | 4
60 | 3
70 | 2
80 | 9
90 | 2
Remove unneeded rows and split them into 3 parts
20 | 6
30 | 2
40 | 9
50 | 4
60 | 3
70 | 2
80 | 9
Average them
25 | 4
50 | 8
75 | 5.5
I know how to remove the unwanted rows by including a WHERE, how to average a given set of rows but can't think of a way on how to split the wanted dataset into m parts.
Any help and ideas appreciated!
I use SQLite which doesn't make this any easier and can't switch to any other dialect sadly.
I tried to group the rows based on row number and count of rows without success. The only other solution that came to my mind was getting the count of affected rows and UNION m SELECTs having a limit and offset.
I've got it working. My problem was that I did integer division by accident.
The formula I use to group the items is:
group = floor ( row_number / (row_count / limit) )
The full SQL-query looks something like this:
SELECT
avg(measurement_timestamp) AS measurement_timestamp,
avg(measurement_value) AS average
FROM (
SELECT
(SELECT COUNT(*)
FROM measurements_table
WHERE measurement_timestamp > start_time AND measurement_timestamp < end_time)
AS row_count,
(SELECT COUNT(0)
FROM measurements_table t1
WHERE t1.measurement_timestamp < t2.measurement_timestamp
AND t1.measurement_timestamp > start_time AND t1.measurement_timestamp < end_time
ORDER BY measurement_timestamp ASC )
AS row_number,
*
FROM measurements_table t2
WHERE t2.measurement_timestamp > start_time AND t2.measurement_timestamp < end_time
ORDER BY measurement_timestamp ASC)
GROUP BY CAST((row_number / (row_count / ?)) AS INT)
(The limit parameter has to be bound as a float; otherwise row_count / limit performs integer division.)
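The grouping formula itself can be sketched in plain Python over the question's seven in-range rows; where the bucket boundaries fall depends only on the floor formula, so the exact averages may differ from the hand-worked example above:

```python
import math

# Bucket rows by group = floor(row_number / (row_count / limit)),
# keeping row_count / limit as a float (integer division here was
# the poster's original bug).
rows = [(20, 6), (30, 2), (40, 9), (50, 4), (60, 3), (70, 2), (80, 9)]
limit = 3

groups = {}
for row_number, (t, v) in enumerate(rows):
    g = math.floor(row_number / (len(rows) / limit))
    groups.setdefault(g, []).append((t, v))

averages = [
    (sum(t for t, _ in part) / len(part), sum(v for _, v in part) / len(part))
    for part in groups.values()
]
print(averages)  # three (avg time, avg value) pairs; last is (75.0, 5.5)
```

Seven rows split 3-2-2 with this formula, giving exactly the requested maximum of 3 output rows.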

SQL: select the two rows nearest to a timestamp

How do I select the two rows whose timestamp is nearest to a specific timestamp?
SELECT *
FROM 'wp_weather'
WHERE ( timestamp most nearly to 1385435000) AND city = 'Махачкала'
The table:
id | timestamp
---------------
0 | 1385410000
1 | 1385420000
2 | 1385430000
3 | 1385440000
4 | 1385450000
SELECT *
FROM wp_weather
WHERE city = 'Махачкала'
order by abs(timestamp - 1385435000)
limit 2
You may try it like this:
SELECT * FROM 'wp_weather'
WHERE city = 'Махачкала'
order by abs(timestamp - 1385435000)
limit 2
Also check the ABS function
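The ORDER BY abs(...) LIMIT 2 approach from the answers can be verified against the question's sample ids and timestamps using an in-memory SQLite table:

```python
import sqlite3

# Sample data from the question; the target timestamp 1385435000 sits
# exactly between ids 2 and 3 (each 5000 away).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wp_weather (id INT, timestamp INT)")
conn.executemany(
    "INSERT INTO wp_weather VALUES (?, ?)",
    [(0, 1385410000), (1, 1385420000), (2, 1385430000),
     (3, 1385440000), (4, 1385450000)],
)

# Order by absolute distance to the target and keep the closest two.
nearest = conn.execute(
    "SELECT id FROM wp_weather ORDER BY abs(timestamp - 1385435000) LIMIT 2"
).fetchall()
print(sorted(r[0] for r in nearest))  # [2, 3]
```

Since ids 2 and 3 tie at distance 5000, their relative order in the result is unspecified, but LIMIT 2 returns exactly that pair.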