I have the following table:
A        Sum(Tickets)
01-2022  5
02-2022  2
03-2022  8
04-2022  1
05-2022  3
06-2022  3
07-2022  4
08-2022  1
09-2022  5
10-2022  5
11-2022  3
I would like to add the following extra column, 'TotalSum(Tickets)', which holds the running total, but I am stuck.
Can anyone help out?
A        Sum(Tickets)  TotalSum(Tickets)
01-2022  5             5
02-2022  2             7
03-2022  8             15
04-2022  1             16
05-2022  3             19
06-2022  3             22
07-2022  4             26
08-2022  1             27
09-2022  5             32
10-2022  5             37
11-2022  3             40
You may use SUM() as a window function here:
SELECT A, SumTickets, SUM(SumTickets) OVER (ORDER BY A) AS TotalSumTickets
FROM yourTable
ORDER BY A;
But this assumes that you actually have a bona-fide column SumTickets which contains the sums. Assuming you really showed us the intermediate result of some aggregation query, you should use:
SELECT A, SUM(Tickets) AS SumTickets,
SUM(SUM(Tickets)) OVER (ORDER BY A) AS TotalSumTickets
FROM yourTable
GROUP BY A
ORDER BY A;
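As a rough illustration of the first query, here is a minimal sketch that runs the same window function against an in-memory SQLite database from Python. SQLite (3.25 or later) is only a stand-in for whatever database the question uses, and the table name `tickets_by_month` and its column names are hypothetical.

```python
import sqlite3

# In-memory SQLite database, used purely as a stand-in for the asker's DBMS.
conn = sqlite3.connect(":memory:")
# Hypothetical table holding the already-aggregated sums from the question.
conn.execute("CREATE TABLE tickets_by_month (a TEXT, sum_tickets INTEGER)")
conn.executemany(
    "INSERT INTO tickets_by_month VALUES (?, ?)",
    [("01-2022", 5), ("02-2022", 2), ("03-2022", 8), ("04-2022", 1)],
)

# SUM(...) OVER (ORDER BY a) accumulates the sums in the order of column a.
running = conn.execute(
    """
    SELECT a, sum_tickets,
           SUM(sum_tickets) OVER (ORDER BY a) AS total_sum_tickets
    FROM tickets_by_month
    ORDER BY a
    """
).fetchall()

for row in running:
    print(row)
```

Note that a text column in MM-YYYY form only sorts chronologically within a single year; if the data spans several years, ordering by a real date (or a YYYY-MM string) is the safer choice.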
Alternatively, left join the table to itself on all rows whose date is not later than the current row's date, then sum those rows for every date:
select
table1.date,
sum(t.tickets)
from
table1
left join table1 t
on t.date<= table1.date
group by
table1.date;
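The self-join variant can be tried out the same way. This is a small sketch against in-memory SQLite, assuming a hypothetical `table1` with one row per month and a `date` column stored as a YYYY-MM string so that string comparison matches chronological order.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table layout is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (date TEXT, tickets INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("2022-01", 5), ("2022-02", 2), ("2022-03", 8), ("2022-04", 1)],
)

# Each row is joined to every row with an earlier or equal date,
# and the joined tickets are summed per date.
totals = conn.execute(
    """
    SELECT table1.date, SUM(t.tickets) AS total_tickets
    FROM table1
    LEFT JOIN table1 t ON t.date <= table1.date
    GROUP BY table1.date
    ORDER BY table1.date
    """
).fetchall()

for row in totals:
    print(row)
```

The join produces a number of intermediate rows that grows quadratically with the table size, so on large tables the window function is usually the cheaper option.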
Outer Join 'fill-in-the blanks'
I have a pair of master-detail tables in a PostgreSQL database where master table 'samples' has some samples with a timestamp in each.
The detail table 'sample_values' has some values for some parameters at any given sample timestamp.
My Query
SELECT s.sample_id, s.sample_time, v.parameter_id, v.sample_value
FROM samples s LEFT OUTER JOIN sample_values v ON v.sample_id=s.sample_id
ORDER BY s.sample_id, v.parameter_id;
returns (as expected):
sample_id  sample_time               parameter_id  sample_value
1          2023-01-13T01:00:00.000Z  1             1.23
1          2023-01-13T01:00:00.000Z  2             4.98
2          2023-01-13T01:01:00.000Z
3          2023-01-13T01:02:00.000Z
4          2023-01-13T01:03:00.000Z
5          2023-01-13T01:04:00.000Z  2             6.08
6          2023-01-13T01:05:00.000Z
7          2023-01-13T01:06:00.000Z  1             1.89
8          2023-01-13T01:07:00.000Z
9          2023-01-13T01:08:00.000Z
10         2023-01-13T01:09:00.000Z
11         2023-01-13T01:10:00.000Z
12         2023-01-13T01:11:00.000Z
13         2023-01-13T01:12:00.000Z
14         2023-01-13T01:13:00.000Z
15         2023-01-13T01:14:00.000Z  1             2.11
16         2023-01-13T01:15:00.000Z
17         2023-01-13T01:16:00.000Z
18         2023-01-13T01:17:00.000Z
19         2023-01-13T01:18:00.000Z  2             3.57
20         2023-01-13T01:19:00.000Z
21         2023-01-13T01:20:00.000Z
22         2023-01-13T01:21:00.000Z
23         2023-01-13T01:22:00.000Z  1             3.21
23         2023-01-13T01:22:00.000Z  2             5.31
How do I write a query that returns one row per timestamp per parameter, where sample_value is the 'latest known' sample_value for that parameter like this:
sample_id  sample_time               parameter_id  sample_value
1          2023-01-13T01:00:00.000Z  1             1.23
1          2023-01-13T01:00:00.000Z  2             4.98
2          2023-01-13T01:01:00.000Z  1             1.23
2          2023-01-13T01:01:00.000Z  2             4.98
3          2023-01-13T01:02:00.000Z  1             1.23
3          2023-01-13T01:02:00.000Z  2             4.98
4          2023-01-13T01:03:00.000Z  1             1.23
4          2023-01-13T01:03:00.000Z  2             4.98
5          2023-01-13T01:04:00.000Z  1             1.23
5          2023-01-13T01:04:00.000Z  2             6.08
6          2023-01-13T01:05:00.000Z  1             1.23
6          2023-01-13T01:05:00.000Z  2             6.08
7          2023-01-13T01:06:00.000Z  1             1.89
7          2023-01-13T01:06:00.000Z  2             6.08
8          2023-01-13T01:07:00.000Z  1             1.89
8          2023-01-13T01:07:00.000Z  2             6.08
I cannot get my head around the LAST_VALUE function (if that is even the right tool for this?):
LAST_VALUE ( expression )
OVER (
[PARTITION BY partition_expression, ... ]
ORDER BY sort_expression [ASC | DESC], ...
)
First of all, you need one row per parameter for each of your sample ids (two rows in your data). You can get this by cross joining the samples with the distinct parameter ids, and by also matching on the parameter id in the left join.
...
FROM samples s
CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
LEFT JOIN sample_values v
ON v.sample_id = s.sample_id AND v.parameter_id = p.parameter_id
...
In addition to this, your intuition about using the LAST_VALUE window function was correct. The problem is that PostgreSQL, up to its current version, cannot ignore null values in window functions (there is no IGNORE NULLS option). One workaround is to build partitions from a running count of the non-null sample_value per parameter_id: each partition then contains one non-null value followed by the null rows that come after it, so taking the maximum of each partition returns that value.
WITH cte AS (
SELECT s.sample_id, s.sample_time, p.parameter_id, v.sample_value,
COUNT(v.sample_value) OVER(
PARTITION BY p.parameter_id
ORDER BY s.sample_id
) AS partitions
FROM samples s
CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
LEFT JOIN sample_values v
ON v.sample_id = s.sample_id AND v.parameter_id = p.parameter_id
)
SELECT sample_id, sample_time, parameter_id,
COALESCE(sample_value,
MAX(sample_value) OVER (PARTITION BY parameter_id, partitions)
) AS sample_value
FROM cte
ORDER BY sample_id, parameter_id
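To make the running-count trick easier to follow, here is a reduced sketch of the same query run against in-memory SQLite (3.25 or later) from Python. SQLite is only a stand-in for PostgreSQL here, the data is a shortened, made-up subset in the spirit of the question, and the sample_time column is left out for brevity.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE samples (sample_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE sample_values "
    "(sample_id INTEGER, parameter_id INTEGER, sample_value REAL)"
)
conn.executemany("INSERT INTO samples VALUES (?)", [(1,), (2,), (3,), (4,)])
conn.executemany(
    "INSERT INTO sample_values VALUES (?, ?, ?)",
    [(1, 1, 1.23), (1, 2, 4.98), (3, 2, 6.08), (4, 1, 1.89)],
)

# COUNT(v.sample_value) skips nulls, so the running count only increases
# on rows that carry a value; rows sharing a count form one partition.
filled = conn.execute(
    """
    WITH cte AS (
        SELECT s.sample_id, p.parameter_id, v.sample_value,
               COUNT(v.sample_value) OVER (
                   PARTITION BY p.parameter_id
                   ORDER BY s.sample_id
               ) AS partitions
        FROM samples s
        CROSS JOIN (SELECT DISTINCT parameter_id FROM sample_values) p
        LEFT JOIN sample_values v
               ON v.sample_id = s.sample_id
              AND v.parameter_id = p.parameter_id
    )
    SELECT sample_id, parameter_id,
           COALESCE(sample_value,
                    MAX(sample_value) OVER (PARTITION BY parameter_id, partitions)
           ) AS sample_value
    FROM cte
    ORDER BY sample_id, parameter_id
    """
).fetchall()

for row in filled:
    print(row)
```

In this reduced data set, sample 2 has no values of its own and picks up 1.23 and 4.98 from sample 1, while sample 4 keeps 6.08 for parameter 2 from sample 3.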
I have a table like (SQL Server 2016):
ID Month Sales
1 Jan 2019 40
2 Feb 2019 80
3 Mar 2019 400
...
I would like to get the sales redistributed by week (assuming each month has 4 weeks), like:
ID Month Sales
1 012019 10
1 022019 10
1 032019 10
1 042019 10
2 052019 20
2 062019 20
2 072019 20
2 082019 20
3 092019 100
3 102019 100
3 112019 100
3 122019 100
...
How can I achieve something like that?
You could join the query with a hard-coded query that generates four rows:
SELECT id, month, sales / 4
FROM mytable
CROSS JOIN (SELECT 1 AS col
UNION ALL
SELECT 2
UNION ALL
SELECT 3
UNION ALL
SELECT 4) t
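A quick way to see what the cross join produces is to run it on the three sample rows. The sketch below uses in-memory SQLite from Python as a stand-in for SQL Server. The `week_no` expression is a hypothetical addition, not part of the answer above: it assumes that `id` numbers the months consecutively starting at 1.

```python
import sqlite3

# In-memory SQLite database as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, month TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, "Jan 2019", 40), (2, "Feb 2019", 80), (3, "Mar 2019", 400)],
)

# The four-row derived table multiplies every month into four rows.
# week_no is a hypothetical label that relies on id being consecutive.
weekly = conn.execute(
    """
    SELECT id, month,
           (id - 1) * 4 + t.col AS week_no,
           sales / 4 AS weekly_sales
    FROM mytable
    CROSS JOIN (SELECT 1 AS col
                UNION ALL SELECT 2
                UNION ALL SELECT 3
                UNION ALL SELECT 4) t
    ORDER BY id, t.col
    """
).fetchall()

for row in weekly:
    print(row)
```

Because sales is an integer column, sales / 4 is an integer division in both SQLite and SQL Server; divide by 4.0 if fractional weekly values should be kept.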
It's hard to formulate, so I'll just show an example; you are welcome to edit my question and title.
Suppose I have a table
flag id value datetime
0 b 1 343 13
1 a 1 23 12
2 b 1 21 11
3 b 1 32 10
4 c 2 43 11
5 d 2 43 10
6 d 2 32 9
7 c 2 1 8
For each id, I want to squeeze the table by the flag column such that consecutive rows with the same flag value collapse into one row, with the values summed. Desired result:
flag id value
0 b 1 343
1 a 1 23
2 b 1 53
3 c 2 75
4 d 2 32
5 c 2 1
P.S.: I found functions like CONDITIONAL_CHANGE_EVENT, which seem to be able to do that, but the examples in the docs don't work for me.
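Use the difference-of-row-numbers approach to assign a group to each run of consecutive rows with the same flag, then sum the values within each group. The flag has to be part of the final partition, because the row-number difference alone can coincide for runs of different flags. Note that select distinct merges two runs of the same flag that happen to have the same sum; grouping by id, flag, grp avoids that.
select distinct id,flag,sum(value) over(partition by id,flag,grp) as finalvalue
from (
select t.*,
row_number() over(partition by id order by datetime)
- row_number() over(partition by id,flag order by datetime) as grp
from tbl t
) t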
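The row-number difference is easier to trust once it has been run on the sample rows. The following sketch does that with in-memory SQLite (3.25 or later) from Python, which is only a stand-in for the asker's database. It is a variant rather than a copy of the query above: it groups by id, flag and the computed group instead of using select distinct, and the datetime column is renamed to the hypothetical `dt`.

```python
import sqlite3

# In-memory SQLite database as a stand-in; dt replaces the datetime column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (flag TEXT, id INTEGER, value INTEGER, dt INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        ("b", 1, 343, 13), ("a", 1, 23, 12), ("b", 1, 21, 11), ("b", 1, 32, 10),
        ("c", 2, 43, 11), ("d", 2, 43, 10), ("d", 2, 32, 9), ("c", 2, 1, 8),
    ],
)

# Within one id, the two row numbers advance together while the flag stays
# the same, so their difference is constant for each run of equal flags.
squeezed = conn.execute(
    """
    SELECT id, flag, SUM(value) AS finalvalue
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY dt)
             - ROW_NUMBER() OVER (PARTITION BY id, flag ORDER BY dt) AS grp
        FROM tbl t
    ) g
    GROUP BY id, flag, grp
    ORDER BY id, MAX(dt) DESC
    """
).fetchall()

for row in squeezed:
    print(row)
```

With the rows from the question, the two consecutive 'b' rows of id 1 collapse to 53 and the two consecutive 'd' rows of id 2 collapse to 75, while the isolated rows keep their own value.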
Here's an approach which uses CONDITIONAL_CHANGE_EVENT:
select
flag,
id,
sum(value) value
from (
select
conditional_change_event(flag) over (order by datetime desc) part,
flag,
id,
value
from so
) t
group by part, flag, id
order by part;
The result is different from your desired result stated in the question because of order by datetime. Adding a separate column for the row number and sorting on that gives the correct result.
I have a table with data like this
picks
20
20
20
18
17
12
12
9
9
That is the table, but I need a result like this:
Picks Count
20 3
19 0
18 1
17 1
16 0
...up to
1 12
How can we write query to get zero totals for data which doesn't exist in the table?
Arun
Use a subquery to generate all the numbers and then outer join it to your table. Count a column of the outer-joined table rather than count(*), so that numbers without a match give 0 instead of 1.
with nos as ( select level as pick_id
from dual
connect by level <= 20 )
select nos.pick_id
, count(picks.id)
from nos
left outer join picks
on nos.pick_id = picks.id
group by nos.pick_id
order by nos.pick_id desc ;
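For readers without an Oracle instance, the same idea can be sketched in SQLite from Python. SQLite has no CONNECT BY, so a recursive CTE is swapped in to generate the numbers 1 to 20; the table name `picks_table` is hypothetical, and its single column is called `picks` as in the question.

```python
import sqlite3

# In-memory SQLite database as a stand-in for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE picks_table (picks INTEGER)")
conn.executemany(
    "INSERT INTO picks_table VALUES (?)",
    [(20,), (20,), (20,), (18,), (17,), (12,), (12,), (9,), (9,)],
)

# The recursive CTE plays the role of CONNECT BY LEVEL <= 20.
# COUNT(p.picks) ignores the nulls produced by the outer join,
# so numbers that never occur are reported with a count of 0.
counts = conn.execute(
    """
    WITH RECURSIVE nos(pick_id) AS (
        SELECT 1
        UNION ALL
        SELECT pick_id + 1 FROM nos WHERE pick_id < 20
    )
    SELECT nos.pick_id, COUNT(p.picks) AS pick_count
    FROM nos
    LEFT JOIN picks_table p ON p.picks = nos.pick_id
    GROUP BY nos.pick_id
    ORDER BY nos.pick_id DESC
    """
).fetchall()

for row in counts:
    print(row)
```

Replacing COUNT(p.picks) with COUNT(*) in this sketch would report 1 for every missing number, because the outer join still produces one row for it.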
------------------------------------------
ID Name C D
------------------------------------------
1 AK-47 10 5
2 RPG 10 20
3 Mp5 20 15
4 Sniper 20 18
5 Tank 90 80
6 Space12 90 20
7 Rifle 90 110
8 Knife 90 85
Consider IDs 1,2; 3,4; and 5,6,7,8 as separate groups.
For each group, I need the row whose D value is the nearest lower number to its C value.
So the expected result is:
------------------------------------------
ID Name C D
------------------------------------------
1 AK-47 10 5
4 Sniper 20 18
8 Knife 90 85
How can I achieve this?
select t1.*
from your_table t1
join
(
-- smallest gap between c and d, looking only at rows where d is below c
select c, min(c - d) as near
from your_table
where d < c
group by c
) t2 on t1.c = t2.c and t1.c - t1.d = t2.near
Here is another way of doing this. It uses a CTE and only hits the base table once. The where clause keeps only rows whose D is below C, since the question asks for the nearest lower number.
with MySortedData as
(
select ID, Name, C, D, ROW_NUMBER() over(PARTITION BY C order by C - D) as RowNum
from Something
where D < C
)
select *
from MySortedData
where RowNum = 1
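The ROW_NUMBER approach can be checked against the sample rows with a small sketch in SQLite (3.25 or later) from Python, used here only as a stand-in for the asker's database. The filter D < C is an assumption drawn from the phrase "nearest lower number" in the question, and the table name `Something` is taken from the answer.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the asker's DBMS.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Something (ID INTEGER, Name TEXT, C INTEGER, D INTEGER)")
conn.executemany(
    "INSERT INTO Something VALUES (?, ?, ?, ?)",
    [
        (1, "AK-47", 10, 5), (2, "RPG", 10, 20),
        (3, "Mp5", 20, 15), (4, "Sniper", 20, 18),
        (5, "Tank", 90, 80), (6, "Space12", 90, 20),
        (7, "Rifle", 90, 110), (8, "Knife", 90, 85),
    ],
)

# Rank the rows of each C group by how far D lies below C,
# then keep only the closest row of every group.
nearest = conn.execute(
    """
    WITH MySortedData AS (
        SELECT ID, Name, C, D,
               ROW_NUMBER() OVER (PARTITION BY C ORDER BY C - D) AS RowNum
        FROM Something
        WHERE D < C
    )
    SELECT ID, Name, C, D
    FROM MySortedData
    WHERE RowNum = 1
    ORDER BY ID
    """
).fetchall()

for row in nearest:
    print(row)
```

ROW_NUMBER picks one row arbitrarily when two rows of a group are equally close; RANK would return all of the tied rows instead.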