How to limit a SUM using postgres

How to limit a SUM using postgres - sql

I want to limit the SUM to 2 hours in postgres sql. I dont want to limit the result of the sum, but the value that it is going to sum.
For example, the following table:
In this case, if I SUM('02:20', '01:50', '00:30', '03:00') the result would be 07:30.
|CODE | HOUR |
| --- | ---- |
| 1 | 02:20 |
| 2 | 01:50 |
| 3 | 00:30 |
| 4 | 03:00 |
But what i want, is to limit the column HOUR to 02:00. So if the value is > 02:00, it will be replaced with 02:00, only in the SUM.
So the SUM should look like this ('02:00', '01:50', '00:30', '02:00'), and the result would be 06:20

Use case as an argument of the function:
select sum(case when hour < '2:00' then hour else '2:00' end)
from my_table
Test it in Db<>fiddle.

It's still not perfect, but the idea is this:
select sum(x.anything::time)
from (select id,
time,
case when time <= '02:00' then time::text
else '02:' || (EXTRACT(MINUTES FROM time::time)::text)
end as anything
from time_table) x

Related

Pgsql- How to filter report days with pgsql?

Let's say I have a table Transaction which has data as following:
Transaction
| id | user_id | amount | created_at |
|:-----------|------------:|:-----------:| :-----------:|
| 1 | 1 | 100 | 2021-09-11 |
| 2 | 1 | 1000 | 2021-09-12 |
| 3 | 1 | -100 | 2021-09-12 |
| 4 | 2 | 200 | 2021-10-13 |
| 5 | 2 | 3000 | 2021-10-20 |
| 6 | 3 | -200 | 2021-10-21 |
I want to filter this data by this: last 4days, 15days, 28days:
Note: If user click on select option 4days this will filter last 4 days.
I want this data
total commission (sum of all transaction amount * 5%)
Total Top up
Total Debut: which amount (-)
Please help me out and sorry for basic question!
Expect result:
** If user filter last 4days:
Let's say current date is: 2021-09-16
So result:
- TotalCommission (1000 - 100) * 5
- TotalTopUp: 1000
- TotalDebut: -100

I suspect you want:
SELECT SUM(amount) * 0.05 AS TotalCmomission,
SUM(amount) FILTER (WHERE amount > 0) AS TotalUp,
SUM(amount) FILTER (WHERE amount < 0) AS TotalDown
FROM t
WHERE created_at >= CURRENT_DATE - 4 * INTERVAL '1 DAY';
This assumes that there are no future created_at (which seems like a reasonable assumption). You can replace the 4 with whatever value you want.

Take a look at the aggregate functions sum, max and min. Last four days should look like this:
SELECT
sum(amount)*.05 AS TotalComission,
max(amount) AS TotalUp,
min(amount) AS TotalDebut
FROM t
WHERE created_at BETWEEN CURRENT_DATE-4 AND CURRENT_DATE;
Demo: db<>fiddle

Your description indicates specifying the number of days to process and from your expected results indicate you are looking for results by user_id (perhaps not as user 1 falls into the range). Perhaps the the best option would be to wrap the query into a SQL function. Then as all your data is well into the future you would need to parameterize that as well. So the result becomes:
create or replace
function Commissions( user_id_in integer default null
, days_before_in integer default 0
, end_date_in date default current_date
)
returns table( user_id integer
, totalcommission numeric
, totalup numeric
, totaldown numeric
)
language sql
as $$
select user_id
, sum(amount) * 0.05
, sum(amount) filter (where amount > 0)
, sum(amount) filter (where amount < 0)
from transaction
where (user_id = user_id_in or user_id_in is null)
and created_at <# daterange( (end_date_in - days_before_in * interval '1 day')::date
, end_date_in
, '[]'::text -- indicates inclusive of both dates
)
group by user_id;
$$;
See demo here. You may just want to play around with the parameters and see the results.

Calculate time span over a number of records

I have a table that has the following schema:
ID | FirstName | Surname | TransmissionID | CaptureDateTime
1 | Billy | Goat | ABCDEF | 2018-09-20 13:45:01.098
2 | Jonny | Cash | ABCDEF | 2018-09-20 13:45.01.108
3 | Sally | Sue | ABCDEF | 2018-09-20 13:45:01.298
4 | Jermaine | Cole | PQRSTU | 2018-09-20 13:45:01.398
5 | Mike | Smith | PQRSTU | 2018-09-20 13:45:01.498
There are well over 70,000 records and they store logs of transmissions to a web-service. What I'd like to know is how would I go about writing a script that would select the distinct TransmissionID values and also show the timespan between the earliest CaptureDateTime record and the latest record? Essentially I'd like to see what the rate of records the web-service is reading & writing.
Is it even possible to do so in a single SELECT statement or should I just create a stored procedure or report in code? I don't know where to start aside from SELECT DISTINCT TransmissionID for this sort of query.
Here's what I have so far (I'm stuck on the time calculation)
SELECT DISTINCT [TransmissionID],
COUNT(*) as 'Number of records'
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1
Not sure how to get the difference between the first and last record with the same TransmissionID I would like to get a result set like:
TransmissionID | TimeToCompletion | Number of records |
ABCDEF | 2.001 | 5000 |

Simply GROUP BY and use MIN / MAX function to find min/max date in each group and subtract them:
SELECT
TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime))
FROM yourdata
GROUP BY TransmissionID
HAVING COUNT(*) > 1

Use min and max to calculate timespan
SELECT [TransmissionID],
COUNT(*) as 'Number of records',datediff(s,min(CaptureDateTime),max(CaptureDateTime)) as timespan
FROM [log_table]
GROUP BY [TransmissionID]
HAVING COUNT(*) > 1

A method that returns the average time for all transmissionids, even those with only 1 record:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second, MIN(CaptureDateTime), MAX(CaptureDateTime)) * 1.0 / NULLIF(COUNT(*) - 1, 0)
FROM yourdata
GROUP BY TransmissionID;
Note that you may not actually want the maximum of the capture date for a given transmissionId. You might want the overall maximum in the table -- so you can consider the final period after the most recent record.
If so, this looks like:
SELECT TransmissionID,
COUNT(*),
DATEDIFF(second,
MIN(CaptureDateTime),
MAX(MAX(CaptureDateTime)) OVER ()
) * 1.0 / COUNT(*)
FROM yourdata
GROUP BY TransmissionID;

How to write SQL query to calculate instances where a row containing a distinct id occurs 7 days after the fist occurrence if the unique id?

I am looking to return a date, the count of unique_ids first occurrences on that date, the number unique_ids that occurred 7 days after their first occurrence and the percentage of occurrences after 7 days / number of first occurrences.
example data_import table
+---------------------+------------------+
| time | distinct_id |
+---------------------+------------------+
| 2018/10/01 | 1 | first instance of `1`
+---------------------+------------------+
| 2018/10/01 | 2 | also first instance, but does not occur 7 days later
+---------------------+------------------+
| 2018/10/02 | 1 | should be disregarded (not first instance of 1)
+---------------------+------------------+
| 2018/10/02 | 3 | first instance of `3`
+---------------------+------------------+
| 2018/10/08 | 1 | First instance 7 days after first instance of `1`
+---------------------+------------------+
| 2018/10/08 | 1 | Don't count as this is the 2nd instance of `1` on this day
+---------------------+------------------+
| 2018/10/09 | 3 | 7 days after first instance of `3`
+---------------------+------------------+
| 2018/10/09 | 1 | 7 days after non-first instance of `1`
+---------------------+------------------+
And the expected return.
+---------------------+----------------------+------------------------+---------------------------+
| time | num_of_1st_instance | num_occur_7_days_after | percent_used_7_days_after |
+---------------------+----------------------+------------------------+---------------------------+
| 2018/10/01 | 2 | 1 | .50 |
+---------------------+----------------------+------------------------+---------------------------+
| 2018/10/02 | 1 | 1 | 1.0 |
+---------------------+----------------------+------------------------+---------------------------+
| 2018/10/03 | 0 | 0 | 0 |
+---------------------+----------------------+------------------------+---------------------------+
The query I have written is close, but counts occurrences other that the first for a distinct_id.
In my example, this query would include the occurrence of distinct_id 1 on 2018/10/02 and it's occurrence seven days after 2018/10/02 on 2018/10/09. Not wanted as the 2018/10/02 occurrence of distinct_id 1 is not it's first.
SELECT
data_import.time AS date,
count(distinct data_import.distinct_id) AS num_installs_on_install_date,
count(distinct future_activity.distinct_id) AS num_occur_7_days_after,
count(distinct future_activity.distinct_id) / count(distinct data_import.distinct_id)::float AS percent_used_7_days_after
FROM data_import
LEFT JOIN data_import AS future_activity ON
data_import.distinct_id = future_activity.distinct_id
AND
DATE(data_import.time) = DATE(future_activity.time) - INTERVAL '7 days'
AND
data_import.time = ( SELECT
time
FROM
data_import
WHERE
distinct_id = future_activity.distinct_id
ORDER BY
time
limit
1 )
GROUP BY DATE(data_import.time)
I hope that I explained this clearly. Please let me know how I can change my current query or a different approach to the solution.

Hmmm. Does this do what you want?
select di.time, sum( (seqnum = 1)::int) as first_instance,
sum( flag_7day ) as num_after_7_day,
sum( (seqnum = 1)::int) * 1.0 / sum( flag_7day ) as ratio
from (select di.*,
row_number() over (partition by distinct_id order by time) as seqnum,
(case when exists (select 1 from data_import di2 where di2.distinct_id = di.distinct_id and di2.time > di.time + interval '7 day')
then 1 else 0
end) as flag_7day
from data_import di
) di
group by di.time;
This doesn't return days with no first instances. Those days seem a bit awkward with respect to the ratio, so I'm not 100% sure that you really need them. If you do, it is easy enough to include a generate_series() to generate all dates in the range that you want.

SQLlite strftime function to get grouped data by months

i have table with following structure and data:
I would like to get grouped data by months in given date range for example (from 2014-01-01 to 2014-12-31). Data for some months cannot be available but i still need to have in result information that in given month is result 0.
Result should have following format:
MONTH | DIALS_CNT | APPT_CNT | CONVERS_CNT | CANNOT_REACH_CNT |
2014-01 | 100 | 50 | 20 | 30 |
2014-02 | 100 | 40 | 30 | 30 |
2014-03 | 0 | 0 | 0 | 0 |
etc..
WHERE
APPT_CNT = WHERE call.result = APPT
CONVERS_CNT = WHERE call.result = CONV_NO_APPT
CANNOT_REACH_CNT = WHERE call.result = CANNOT_REACH
How can i do it please with usage function strftime ?
Many thanks for any help or example.

SELECT Month,
(SELECT COUNT(*)
FROM MyTable
WHERE date LIKE Month || '%'
) AS Dials_Cnt,
(SELECT SUM(Call_Result = 'APPT')
FROM MyTable
WHERE date LIKE Month || '%'
) AS Appt_Cnt,
...
FROM (SELECT '2014-01' AS Month UNION ALL
SELECT '2014-02' UNION ALL
SELECT '2014-03' UNION ALL
...
SELECT '2014-12')

SQL Query Compare values in per 15 minutes and display the result per hour

I have a table with 2 columns. UTCTime and Values.
The UTCTime is in 15 mins increment. I want a query that would compare the value to the previous value in one hour span and display a value between 0 and 4 depends on if the values are constant. In other words there is an entry for every 15 minute increment and the value can be constant so I just need to check each value to the previous one per hour.
For example
+---------|-------+
| UTCTime | Value |
------------------|
| 12:00 | 18.2 |
| 12:15 | 87.3 |
| 12:30 | 55.91 |
| 12:45 | 55.91 |
| 1:00 | 37.3 |
| 1:15 | 47.3 |
| 1:30 | 47.3 |
| 1:45 | 47.3 |
| 2:00 | 37.3 |
+---------|-------+
In this case, I just want a Query that would compare the 12:45 value to the 12:30 and 12:30 to 12:15 and so on. Since we are comparing in only one hour span then the constant values must be between 0 and 4 (O there is no constant values, 1 there is one like in the example above)
The query should display:
+----------+----------------+
| UTCTime | ConstantValues |
----------------------------|
| 12:00 | 1 |
| 1:00 | 2 |
+----------|----------------+
I just wanted to mention that I am new to SQL programming.
Thank you.
See SQL fiddle here

Below is the query you need and a working solution Note: I changed the timeframe to 24 hrs
;with SourceData(HourTime, Value, RowNum)
as
(
select
datepart(hh, UTCTime) HourTime,
Value,
row_number() over (partition by datepart(hh, UTCTime) order by UTCTime) RowNum
from foo
union
select
datepart(hh, UTCTime) - 1 HourTime,
Value,
5
from foo
where datepart(mi, UTCTime) = 0
)
select cast(A.HourTime as varchar) + ':00' UTCTime, sum(case when A.Value = B.Value then 1 else 0 end) ConstantValues
from SourceData A
inner join SourceData B on A.HourTime = B.HourTime and
(B.RowNum = (A.RowNum - 1))
group by cast(A.HourTime as varchar) + ':00'

select SUBSTRING_INDEX(UTCTime,':',1) as time,value, count(*)-1 as total
from foo group by value,time having total >= 1;
fiddle

Mine isn't much different from Vasanth's, same idea different approach.
The idea is that you need recursion to carry it out simply. You could also use the LEAD() function to look at rows ahead of your current row, but in this case that would require a big case statement to cover every outcome.
;WITH T
AS (
SELECT a.UTCTime,b.VALUE,ROW_NUMBER() OVER(PARTITION BY a.UTCTime ORDER BY b.UTCTime DESC)'RowRank'
FROM (SELECT *
FROM #Table1
WHERE DATEPART(MINUTE,UTCTime) = 0
)a
JOIN #Table1 b
ON b.UTCTIME BETWEEN a.UTCTIME AND DATEADD(hour,1,a.UTCTIME)
)
SELECT T.UTCTime, SUM(CASE WHEN T.Value = T2.Value THEN 1 ELSE 0 END)
FROM T
JOIN T T2
ON T.UTCTime = T2.UTCTime
AND T.RowRank = T2.RowRank -1
GROUP BY T.UTCTime
If you run the portion inside the ;WITH T AS ( ) you'll see that gets us the hour we're looking at and the values in order by time. That is used in the recursive portion below by joining to itself and evaluating each row compared to the next row (hence the RowRank - 1) on the JOIN.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

How to limit a SUM using postgres - sql

Use case as an argument of the function: select sum(case when hour < '2:00' then hour else '2:00' end) from my_table Test it in Db<>fiddle.

It's still not perfect, but the idea is this: select sum(x.anything::time) from (select id, time, case when time <= '02:00' then time::text else '02:' || (EXTRACT(MINUTES FROM time::time)::text) end as anything from time_table) x

Related

Pgsql- How to filter report days with pgsql?

Calculate time span over a number of records

How to write SQL query to calculate instances where a row containing a distinct id occurs 7 days after the fist occurrence if the unique id?

SQLlite strftime function to get grouped data by months

SQL Query Compare values in per 15 minutes and display the result per hour

Categories

Resources