Get earliest value from a column with other aggregated columns in postgresql - sql

I have a very simple stock ledger dataset.
date_and_time        store_id  product_id  batch  opening_qty  closing_qty  inward_qty  outward_qty
01-10-2021 14:20:00  56        a           1      5            1            0           4
01-10-2021 04:20:00  56        a           1      8            5            0           3
02-10-2021 15:30:00  56        a           1      9            2            1           8
03-10-2021 08:40:00  56        a           2      2            6            4           0
04-10-2021 06:50:00  56        a           2      8            4            0           4
Output I want:
select date, store_id, product_id, batch, first(opening_qty), last(closing_qty), sum(inward_qty), sum(outward_qty)
e.g.
date        store_id  product_id  batch  opening_qty  closing_qty  inward_qty  outward_qty
01-10-2021  56        a           1      8            1            0           7
I am writing a query using the FIRST_VALUE window function, and have tried several others, but I am not able to get the output I want.
select
    date, store_id, product_id, batch,
    FIRST_VALUE(opening_total_qty) OVER (
        partition by date, store_id, product_id, batch
        ORDER BY created_at
    ) as opening__qty,
    sum(inward_qty) as inward_qty, sum(outward_qty) as outward_qty
from table
group by 1, 2, 3, 4, opening_total_qty
Help please.

As your expected result is one row per group of rows sharing the same date, you need aggregate functions rather than window functions, which return as many rows as the WHERE clause lets through. You can try this:
SELECT date_trunc('day', date_and_time) AS date, store_id, product_id, batch
     , (array_agg(opening_qty ORDER BY date_and_time ASC))[1]  AS opening_qty
     , (array_agg(closing_qty ORDER BY date_and_time DESC))[1] AS closing_qty
     , sum(inward_qty)  AS inward_qty
     , sum(outward_qty) AS outward_qty
FROM table
GROUP BY 1, 2, 3, 4
See the test result in dbfiddle.
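If you would rather stay close to your FIRST_VALUE attempt, here is a minimal sketch of the same result using window functions plus DISTINCT (it keeps the question's placeholder table name; the explicit frame is needed so last_value can see the whole day):

SELECT DISTINCT
       date_trunc('day', date_and_time) AS date,
       store_id, product_id, batch,
       first_value(opening_qty) OVER w AS opening_qty,   -- earliest row of the day
       last_value(closing_qty)  OVER w AS closing_qty,   -- latest row of the day
       sum(inward_qty)          OVER w AS inward_qty,
       sum(outward_qty)         OVER w AS outward_qty
FROM table
WINDOW w AS (
    PARTITION BY date_trunc('day', date_and_time), store_id, product_id, batch
    ORDER BY date_and_time
    ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
)
ORDER BY 1;

Every row of a day/store/product/batch group produces the same output tuple, so DISTINCT collapses each group to a single row.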

Related

Snowflake SQL: trying to calculate time difference between subsets of subsequent rows

I have some data like the following in a Snowflake database:
DEVICE_SERIAL  REASON_CODE  VERSION  MESSAGE_CREATED_AT   NEXT_REASON_CODE
BA1254862158   1            4        2022-06-23 02:06:03  4
BA1254862158   4            4        2022-06-23 02:07:07  1
BA1110001111   1            5        2022-06-16 16:19:04  4
BA1110001111   4            5        2022-06-16 17:43:04  1
BA1110001111   5            5        2022-06-20 14:37:45  4
BA1110001111   4            5        2022-06-20 17:31:12  1
That's the result of a previous query. I'm trying to get the difference between message_created_at timestamps for subsequent rows with the same device_serial, where the first row of the pair has reason_code 1 or 5 and the second row of the pair has reason_code 4.
For this example, my desired output would be
DEVICE_SERIAL  VERSION  DELTA_SECONDS
BA1254862158   4        64
BA1110001111   5        5040
BA1110001111   5        10407
It's easy to calculate the time difference between every pair of rows (just lead or lag + datediff). But I'm not sure how to structure a query to select only the desired rows so that I can get a datediff between them, without calculating spurious datediffs.
My ultimate goal is to see how these datediffs change between versions. I am but a lowly C programmer, my SQL-fu is weak.
with data as (
    select *,
           -- running count of "start" rows (reason_code 1 or 5): each start row
           -- opens a new group, and the rows that follow it inherit its group number
           count(case when reason_code in (1, 5) then 1 end)
               over (partition by device_serial order by message_created_at) as grp
           /* or alternately bracket by the end code */
           -- count(case when reason_code = 4 then 1 end)
           --     over (partition by device_serial order by message_created_at desc) as grp
    from T
)
select device_serial, min(version) as version,
       -- duration from the group's first (start) row to its last (end) row
       datediff(second, min(message_created_at), max(message_created_at)) as delta_seconds
from data
group by device_serial, grp
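Since the stated goal is to compare these deltas across versions, a possible follow-up (a sketch built on the query above, not part of the original answer) wraps it and aggregates per version:

with data as (
    select *,
           count(case when reason_code in (1, 5) then 1 end)
               over (partition by device_serial order by message_created_at) as grp
    from T
), deltas as (
    select device_serial, min(version) as version,
           datediff(second, min(message_created_at), max(message_created_at)) as delta_seconds
    from data
    group by device_serial, grp
)
select version,
       avg(delta_seconds) as avg_delta_seconds,  -- average start-to-end duration per version
       count(*)           as pairs               -- how many start/end pairs contributed
from deltas
group by version
order by version;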

calculate avg(value) for last 10 records postgresql

I have a tricky task.
Let's assume we have a table "Racings" with the columns TRACK, CAR, CIRCLE_TIME.
Here is an example of how the data could look:
id  track  car  circle_time
10  1      10   15
9   1      10   14
8   1      10   16
7   1      10   15
6   1      10   13
5   2      10   7
4   2      10   4
3   2      10   5
2   3      10   8
1   3      10   10
What I need is to add one more column, avg3_circle_time, which shows the average of the last 3 circle_time values for each track, for example:
id  track  car  circle_time  avg3_circle_time
10  1      10   15           15
9   1      10   14           15
8   1      10   16           14.6
7   1      10   15           null
6   1      10   13           null
5   2      10   7            5.3
4   2      10   4            null
3   2      10   5            null
2   3      10   8            null
1   3      10   10           null
I know how this could work in Oracle (you could use something like ROWID), but I don't know how to do it in PostgreSQL. I have a draft like .....avg(circle_time) OVER(PARTITION BY track,car.....) as avg3_circle_time..... Please help me solve this task.
You can use window functions to calculate moving averages:
SELECT track, id, car, circle_time,
       AVG(circle_time) OVER (
           PARTITION BY track
           ORDER BY id
           ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
       )
FROM t
ORDER BY track, id
Depending on your definition of previous three, the window could be ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING.
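For instance, a sketch of that variant (same placeholder table t as above), which averages only the three rows before the current one and returns NULL when no earlier row exists:

SELECT track, id, car, circle_time,
       AVG(circle_time) OVER (
           PARTITION BY track
           ORDER BY id
           ROWS BETWEEN 3 PRECEDING AND 1 PRECEDING  -- previous three, excluding the current row
       ) AS avg3_prev
FROM t
ORDER BY track, id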
If you want values only when at least 3 circles are available:
select *,
       case when lag(id, 2) over (partition by TRACK, CAR order by id) is not null
            then avg(CIRCLE_TIME) over (partition by TRACK, CAR order by id
                                        rows between 2 preceding and current row)
       end a
from Racing
order by id desc;
db<>fiddle
Output
id  track  car  circle_time  a
10  1      10   15           15.0000000000000000
9   1      10   14           15.0000000000000000
8   1      10   16           14.6666666666666667
7   1      10   15           null
6   1      10   13           null
5   2      10   7            5.3333333333333333
4   2      10   4            null
3   2      10   5            null
2   3      10   8            null
1   3      10   10           null
Use LEAD(), then check whether either of the next 2 rows is NULL. Then sum the three values to calculate the average.
-- PostgreSQL
SELECT *
, CASE WHEN next_circle_time IS NULL OR next_next_circle_time IS NULL
THEN NULL
ELSE ((t.circle_time + COALESCE(next_circle_time, 0) + COALESCE(next_next_circle_time, 0)) / 3 :: DECIMAL) :: DECIMAL(10, 1)
END avg_circle_time
FROM (SELECT *
, LEAD(circle_time, 1) OVER (PARTITION BY track ORDER BY id DESC) next_circle_time
, LEAD(circle_time, 2) OVER (PARTITION BY track ORDER BY id DESC) next_next_circle_time
FROM Racings) t
Another way: use AVG().
SELECT *
, CASE WHEN LEAD(circle_time, 2) OVER (PARTITION BY track ORDER BY id DESC) IS NULL
OR LEAD(circle_time, 1) OVER (PARTITION BY track ORDER BY id DESC) IS NULL
THEN NULL
ELSE AVG(circle_time) OVER (PARTITION BY track ORDER BY id DESC ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING)
END :: DECIMAL(10, 2) avg_circle_time
FROM Racings
Both queries can be checked at https://dbfiddle.uk/?rdbms=postgres_11&fiddle=f0cd868623725a1b92bf988cfb2deba3
Several of the posted answers end up repeating the window definition. You can avoid this with the window clause:
select *,
case when row_number() over(track_window) > 2
then trunc(avg(CIRCLE_TIME) over(track_window rows 2 preceding), 1)
end a
from Racing
window track_window as (partition by track order by id)
order by id desc
Note how, in this sample, track_window is defined once, then reused for both row_number and avg. In the latter case, the named window is extended with a frame as well (rows 2 preceding).

Delete rows, which are duplicated and follow each other consequently

It's hard to formulate, so I'll just show an example; you are welcome to edit my question and title.
Suppose I have a table:
   flag  id  value  datetime
0  b     1   343    13
1  a     1   23     12
2  b     1   21     11
3  b     1   32     10
4  c     2   43     11
5  d     2   43     10
6  d     2   32     9
7  c     2   1      8
For each id I want to squeeze the table by the flag column so that all duplicate flag values that follow each other collapse into one row, with sum aggregation on value. Desired result:
   flag  id  value
0  b     1   343
1  a     1   23
2  b     1   53
3  c     2   75
4  d     2   32
5  c     2   1
P.S.: I found functions like CONDITIONAL_CHANGE_EVENT, which seem to be able to do that, but the examples in the docs don't work for me.
Use the difference-of-row-numbers approach to assign a group to each run of consecutive rows with the same flag, then sum over each group. Note that flag has to be part of the final sum's partition as well; otherwise runs of different flags that happen to get the same group number would be merged.
select distinct id, flag, sum(value) over (partition by id, flag, grp) as finalvalue
from (
    select t.*,
           row_number() over (partition by id order by datetime)
             - row_number() over (partition by id, flag order by datetime) as grp
    from tbl t
) t
Here's an approach which uses CONDITIONAL_CHANGE_EVENT:
select
    flag,
    id,
    sum(value) as value
from (
    select
        conditional_change_event(flag) over (order by datetime desc) as part,
        flag,
        id,
        value
    from so
) t
group by part, flag, id
order by part;
The result is different from your desired result stated in the question because of order by datetime. Adding a separate column for the row number and sorting on that gives the correct result.
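A sketch of that adjustment (an illustration on the same table so, not the original answer's code): carry a row number through the subquery and sort the grouped result by the smallest row number in each group.

select
    flag,
    id,
    sum(value) as value
from (
    select
        conditional_change_event(flag) over (order by datetime desc) as part,
        row_number() over (order by datetime desc) as rn,  -- remembers the original scan order
        flag,
        id,
        value
    from so
) t
group by part, flag, id
order by min(rn);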

How to perform sorting with circular series in PostgreSQL

I have a table with 3 columns (id, date, number)
number is a value that starts at 1 and goes up to 9999. Once it hits 9999, the whole series switches to negative numbers (-1 to -9999) and then starts over from 1.
This is my query:
select id,date,number
from table
order by date DESC, abs(number) DESC
This gives the following result:
26  3.1.17  5
25  3.1.17  4
21  2.1.17  -9999
3   2.1.17  -9998
4   2.1.17  -9997
51  2.1.17  3
6   2.1.17  2
7   2.1.17  1
10  1.1.17  -9996
Basically, sort the data by date and then by the number column.
Since the sort is by date, it works most of the time; however, on dates where the number wraps around from -9999 back to 1, the order is messed up.
This should be the result
id  date    number
26  3.1.17  5
25  3.1.17  4
51  2.1.17  3
6   2.1.17  2
7   2.1.17  1
21  2.1.17  -9999
3   2.1.17  -9998
4   2.1.17  -9997
10  1.1.17  -9996
How can I do that?
Try:
select id, date, number
from a_table
order by date desc, number < 0, abs(number) desc  -- false (positive numbers) sorts before true (negatives)
You need to make sure the positive numbers are sorted first.
This can be achieved using an expression in the order by:
select id, date, number
from the_table
order by date desc,
         case
             when number > 0 then 1
             else 2
         end,             --<< this makes the positive numbers come first
         abs(number) desc
select id, date, number
from the_table
order by date DESC,
         CASE WHEN number > 0   -- SORT THE POSITIVES
              THEN number
              ELSE NULL         -- don't need it, NULL is the default if you omit ELSE
         END DESC NULLS LAST,   -- DESC puts NULLs first by default, so push the negatives last here
         number ASC             -- SORT THE NEGATIVES

Single SQL query to display aggregate data while grouping by 3 fields

I have a table that contains basic info:
CREATE TABLE testing.testtable
(
    recordId serial NOT NULL,
    nameId integer,
    teamId integer,
    countryId integer,
    goals integer,
    outs integer,
    assists integer,
    win integer,
    sys_time timestamp with time zone NOT NULL DEFAULT now(),
    CONSTRAINT testtable_pkey PRIMARY KEY (recordid)
)
I want one single SQL query (with one record per person-team-country) to display the following data. Note that I want it grouped by nameId, teamId, and countryId:
- Name, Team, and Country
- Goal/out ratio (G/O)
- Goal + assist / out ratio (GA/O)
- Win percentage (Win%)
- The difference between the current goal/out ratio and what it was one month ago (rDif)
- The difference between the current goal+assist/out ratio and what it was one month ago (fDif)
- The difference between the current win % and what it was one month ago (winDif)
Example Table with all records:
Id  nameId  teamId  countryId  goals  outs  assists  win  sys_time
1   1       3       5          2      4     11       1    2013-01-01
2   1       3       5          9      4     19       1    2013-01-01
3   1       3       4          10     2     1        0    2013-01-01
4   1       3       4          11     50    14       1    2013-01-01
5   2       2       2          10     5     4        1    2013-01-01
6   2       3       5          4      7     15       0    2013-01-01
7   1       3       5          4      8     22       0    2014-07-01
8   1       3       4          11     3     5        1    2014-07-01
9   3       1       4          44     1     4        1    2014-07-01
Example desired output record (1-3-5):
nameId  teamId  countryId  G/O    GA/O  Win%  rDif  fDif  winDif
1       3       5          0.938  4.19  66    0.44  0.94  -0.34
The ratios are easy enough to retrieve. For the differences, I've done the following:
select tt.nameid,
       avg(tt.goals) - avg(case when tt.sys_time < date_trunc('day', NOW() - interval '1 month') then tt.goals end) as change
from testing.testtable tt
group by tt.nameid
order by change desc
This works if I want the differences for only the nameIds. But I want it to pull one record for each combination of name-team-country. I can't seem to get that working.
You can group by multiple fields:
select tt.nameid, tt.teamID, tt.countryID,
avg(tt.goals) - avg(case when tt.sys_time < date_trunc('day', NOW() - interval '1 month') then tt.goals end) as change
from testing.testtable tt
group by tt.nameid, tt.teamID, tt.countryID
order by change desc
Just off the top of my head, I think it would work for you to use:
group by tt.nameid, tt.teamId, tt.countryId
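For the remaining columns, here is a minimal sketch, assuming G/O means sum(goals)/sum(outs), GA/O means (sum(goals) + sum(assists))/sum(outs), and Win% is the share of rows with win = 1 (these assumed definitions reproduce the example 1-3-5 row); the three *Dif columns would repeat the same conditional-aggregate pattern shown in the answers above, once over all rows and once restricted to rows older than one month:

SELECT tt.nameid, tt.teamid, tt.countryid,
       round(sum(tt.goals)::numeric / nullif(sum(tt.outs), 0), 3)                     AS g_o,     -- 0.938 for 1-3-5
       round((sum(tt.goals) + sum(tt.assists))::numeric / nullif(sum(tt.outs), 0), 2) AS ga_o,    -- 4.19 for 1-3-5
       trunc(100.0 * avg(tt.win), 0)                                                  AS win_pct  -- 66 for 1-3-5
FROM testing.testtable tt
GROUP BY tt.nameid, tt.teamid, tt.countryid
ORDER BY tt.nameid, tt.teamid, tt.countryid;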