How can I obtain the sum of each row in PostgreSQL? - sql

I have rows of results like the ones below. If the sum of every row is the same, I can assume the results are correct. How can I write a SQL clause to calculate the sum of each row? Thanks.
27 | 29 | 27 | 36 | 33 | 29 | 16 | 17 | 35 | 28 | 34 | 15
27 | 29 | 27 | 29 | 33 | 29 | 16 | 17 | 35 | 28 | 34 | 15
27 | 29 | 27 | 14 | 33 | 29 | 16 | 17 | 35 | 28 | 34 | 15
27 | 29 | 16 | 37 | 33 | 29 | 16 | 17 | 35 | 28 | 34 | 15
27 | 29 | 16 | 36 | 33 | 29 | 16 | 17 | 35 | 28 | 34 | 15
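SQL has no built-in aggregate across the columns of a single row, so each column has to be added explicitly in the SELECT list. A minimal sketch, using SQLite via Python's sqlite3 purely for illustration (the real table's column names must be substituted; `c1`..`c4` here are placeholders):

```python
import sqlite3

# Hypothetical table with four placeholder columns c1..c4; a real table
# would list all of its columns in the addition below.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE results (c1 INT, c2 INT, c3 INT, c4 INT)")
con.executemany("INSERT INTO results VALUES (?, ?, ?, ?)",
                [(27, 29, 27, 36), (27, 29, 27, 29)])

# Row sum = the columns added together, one term per column.
rows = con.execute("SELECT c1 + c2 + c3 + c4 AS row_sum FROM results").fetchall()
print(rows)  # [(119,), (112,)]
```

If the sums differ between rows, the check described in the question fails for those rows.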

Related

Pandas: keep first row of duplicated indices of second level of multi index

I found lots of drop_duplicates examples for when both levels of a MultiIndex are the same, but I would like to keep the first row of a MultiIndex when the second level has duplicates. So here:
| (date, ID) | col_0 | col_1 | col_2 | col_3 | col_4 |
|:-------------------------------|--------:|--------:|--------:|--------:|--------:|
| ('2022-01-01', 'identifier_0') | 26 | 46 | 44 | 21 | 10 |
| ('2022-01-01', 'identifier_1') | 25 | 45 | 83 | 23 | 45 |
| ('2022-01-01', 'identifier_2') | 42 | 79 | 55 | 5 | 78 |
| ('2022-01-01', 'identifier_3') | 32 | 4 | 57 | 19 | 61 |
| ('2022-01-01', 'identifier_4') | 30 | 25 | 5 | 93 | 72 |
| ('2022-01-02', 'identifier_0') | 42 | 14 | 56 | 43 | 42 |
| ('2022-01-02', 'identifier_1') | 90 | 27 | 46 | 58 | 5 |
| ('2022-01-02', 'identifier_2') | 33 | 39 | 53 | 94 | 86 |
| ('2022-01-02', 'identifier_3') | 32 | 65 | 98 | 81 | 64 |
| ('2022-01-02', 'identifier_4') | 48 | 31 | 25 | 58 | 15 |
| ('2022-01-03', 'identifier_0') | 5 | 80 | 33 | 96 | 80 |
| ('2022-01-03', 'identifier_1') | 15 | 86 | 45 | 39 | 62 |
| ('2022-01-03', 'identifier_2') | 98 | 3 | 42 | 50 | 83 |
I'd like to keep the first row for each unique ID.
If your index is a MultiIndex:
>>> df.loc[~df.index.get_level_values('ID').duplicated()]
col_0 col_1 col_2 col_3 col_4
date ID
2022-01-01 identifier_0 26 46 44 21 10
identifier_1 25 45 83 23 45
identifier_2 42 79 55 5 78
identifier_3 32 4 57 19 61
identifier_4 30 25 5 93 72
# Or
>>> df.groupby(level='ID').first()
col_0 col_1 col_2 col_3 col_4
ID
identifier_0 26 46 44 21 10
identifier_1 25 45 83 23 45
identifier_2 42 79 55 5 78
identifier_3 32 4 57 19 61
identifier_4 30 25 5 93 72
If your index is a flat Index of tuples:
>>> df.loc[~df.index.str[1].duplicated()]
col_0 col_1 col_2 col_3 col_4
(2022-01-01, identifier_0) 26 46 44 21 10
(2022-01-01, identifier_1) 25 45 83 23 45
(2022-01-01, identifier_2) 42 79 55 5 78
(2022-01-01, identifier_3) 32 4 57 19 61
(2022-01-01, identifier_4) 30 25 5 93 72
>>> df.groupby(df.index.str[1]).first()
col_0 col_1 col_2 col_3 col_4
identifier_0 26 46 44 21 10
identifier_1 25 45 83 23 45
identifier_2 42 79 55 5 78
identifier_3 32 4 57 19 61
identifier_4 30 25 5 93 72

Create a calculated column based on two columns in SQL

I have the table below and need to create a calculated column (RA) based on the category and month columns.
Oa Sa Ai month MDY
5 10 2 Jan J302022
16 32 38 Jan J302022
15 14 4 Feb J302022
46 32 81 Jan J302022
3 90 0 Mar J302022
51 10 21 Jan J302021
19 32 3 Jan J302021
45 16 41 Feb J302021
46 7 81 Jan J302022
30 67 14 Mar J302021
45 16 41 Apr J302021
46 7 81 Apr J302021
30 67 0 Jan J302021
56 17 0 Mar J302022
First, it should consider a category, for example J302022; then it needs to calculate the "RA" column based on the month within that category. For example, for J302022, Jan: ((5+16+46+46)+(10+32+32+7)) / (2+38+81+81) = 0.96.
The expected output looks like this:
Oa Sa Ai month category RA
5 10 2 Jan J302022 0.96
16 32 38 Jan J302022 0.96
15 14 4 Feb J302022 7.25
46 32 81 Jan J302022 0.96
3 90 0 Mar J302022 0
51 10 21 Jan J302021 8.70
19 32 3 Jan J302021 8.70
45 16 41 Feb J302021 1.48
46 7 81 Jan J302022 0.96
30 67 14 Mar J302021 6.92
45 16 41 Apr J302021 1.48
46 7 81 Apr J302022 0.65
30 67 0 Jan J302021 8.70
56 17 0 Mar J302022 0
Is it possible to do it in SQL?
Thanks in advance!
select Oa, Sa, Ai, month, category,
       coalesce((ra1 + ra2) / ra3, 0) as RA
from (
    select Oa, Sa, Ai, month, mdy as category,
           sum(oa) over (partition by month, mdy) as ra1,
           sum(sa) over (partition by month, mdy) as ra2,
           sum(ai) over (partition by month, mdy) as ra3
    from WhateverYourTableNameIs
) as t;
Output on MySQL 8.0.29:
+------+------+------+-------+----------+--------+
| Oa | Sa | Ai | month | category | RA |
+------+------+------+-------+----------+--------+
| 45 | 16 | 41 | Apr | J302021 | 0.9344 |
| 46 | 7 | 81 | Apr | J302021 | 0.9344 |
| 45 | 16 | 41 | Feb | J302021 | 1.4878 |
| 15 | 14 | 4 | Feb | J302022 | 7.2500 |
| 51 | 10 | 21 | Jan | J302021 | 8.7083 |
| 19 | 32 | 3 | Jan | J302021 | 8.7083 |
| 30 | 67 | 0 | Jan | J302021 | 8.7083 |
| 5 | 10 | 2 | Jan | J302022 | 0.9604 |
| 16 | 32 | 38 | Jan | J302022 | 0.9604 |
| 46 | 32 | 81 | Jan | J302022 | 0.9604 |
| 46 | 7 | 81 | Jan | J302022 | 0.9604 |
| 30 | 67 | 14 | Mar | J302021 | 6.9286 |
| 3 | 90 | 0 | Mar | J302022 | 0.0000 |
| 56 | 17 | 0 | Mar | J302022 | 0.0000 |
+------+------+------+-------+----------+--------+
for SQL Server
select Oa, Sa, Ai, [Month], Category,
       case when (sum(Ai) over (partition by [Month], Category) * 1.0) = 0 then 0
            else (sum(Oa) over (partition by [Month], Category) +
                  sum(Sa) over (partition by [Month], Category)) /
                 (sum(Ai) over (partition by [Month], Category) * 1.0)
       end Ra
from #temp
order by Category desc
The denominator is multiplied by 1.0 to force a float result and avoid integer division.
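The window-function approach above can be checked end to end; a minimal sketch using SQLite (>= 3.25, which added window functions) via Python's sqlite3, loading only the Jan/J302022 rows, with `* 1.0` added because SQLite, like SQL Server but unlike MySQL, would otherwise perform integer division:

```python
import sqlite3

con = sqlite3.connect(":memory:")  # requires SQLite >= 3.25 for window functions
con.execute("CREATE TABLE sales (Oa INT, Sa INT, Ai INT, month TEXT, MDY TEXT)")
con.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?, ?)",
    [(5, 10, 2, 'Jan', 'J302022'), (16, 32, 38, 'Jan', 'J302022'),
     (46, 32, 81, 'Jan', 'J302022'), (46, 7, 81, 'Jan', 'J302022')])

# Same shape as the query above: window sums per (month, MDY) group,
# then (Oa + Sa) / Ai, with 0 substituted when the division yields NULL.
sql = """
SELECT Oa, Sa, Ai, month, category,
       COALESCE(ROUND((ra1 + ra2) * 1.0 / ra3, 4), 0) AS RA
FROM (
    SELECT Oa, Sa, Ai, month, MDY AS category,
           SUM(Oa) OVER (PARTITION BY month, MDY) AS ra1,
           SUM(Sa) OVER (PARTITION BY month, MDY) AS ra2,
           SUM(Ai) OVER (PARTITION BY month, MDY) AS ra3
    FROM sales
) t
"""
rows = con.execute(sql).fetchall()
for row in rows:
    print(row)  # every Jan/J302022 row gets RA = 0.9604
```

The window sums are (5+16+46+46) = 113, (10+32+32+7) = 81 and (2+38+81+81) = 202, so each of the four rows gets (113+81)/202 = 0.9604, matching the MySQL output above.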

Average daily peak hours in a week with SQL select

I'm trying to list the weekly average of customers in different restaurants in their daily peak hours, for example:
Week | Day | Hour | Rest | Custom
20 | Mon | 08-12 | KFC | 15
20 | Mon | 12-16 | KFC | 10
20 | Mon | 16-20 | KFC | 8
20 | Tue | 08-12 | KFC | 20
20 | Tue | 12-16 | KFC | 11
20 | Tue | 16-20 | KFC | 9
20 | Mon | 08-12 | MCD | 13
20 | Mon | 12-16 | MCD | 14
20 | Mon | 16-20 | MCD | 19
20 | Tue | 08-12 | MCD | 31
20 | Tue | 12-16 | MCD | 20
20 | Tue | 16-20 | MCD | 22
20 | Mon | 08-12 | PHT | 15
20 | Mon | 12-16 | PHT | 12
20 | Mon | 16-20 | PHT | 11
20 | Tue | 08-12 | PHT | 08
20 | Tue | 12-16 | PHT | 07
20 | Tue | 16-20 | PHT | 14
The desired result should be:
WeeK | Rest | Custom
20 | KFC | 17.5
20 | MCD | 25
20 | PHT | 14.5
Is it possible to do it in one line of SQL?
This is really two steps: get the maximum number of customers per day per restaurant, then average those daily maximums per week:
select week, rest, avg(maxc)
from (
    select Week, Day, Rest, max(Custom) as maxc
    from t
    group by Week, Day, Rest
) wdr
group by week, rest;
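The two-step max-then-average query can be verified on a small sample; a sketch using SQLite via Python's sqlite3, loading only the KFC rows (table and column names mirror the question):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE visits (Week INT, Day TEXT, Hour TEXT, Rest TEXT, Custom INT)")
con.executemany("INSERT INTO visits VALUES (?, ?, ?, ?, ?)", [
    (20, 'Mon', '08-12', 'KFC', 15), (20, 'Mon', '12-16', 'KFC', 10),
    (20, 'Mon', '16-20', 'KFC', 8),  (20, 'Tue', '08-12', 'KFC', 20),
    (20, 'Tue', '12-16', 'KFC', 11), (20, 'Tue', '16-20', 'KFC', 9),
])

# Inner query: daily peak (max) per restaurant; outer: weekly average of those peaks.
sql = """
SELECT Week, Rest, AVG(maxc)
FROM (
    SELECT Week, Day, Rest, MAX(Custom) AS maxc
    FROM visits
    GROUP BY Week, Day, Rest
) wdr
GROUP BY Week, Rest
"""
result = con.execute(sql).fetchall()
print(result)  # [(20, 'KFC', 17.5)]
```

The Monday peak is 15 and the Tuesday peak is 20, so the weekly average is 17.5, matching the desired output.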

SQL Paging - Search the OFFSET value to get a specific page

I have a problem with pagination, using MySQL, MariaDB and PostgreSQL. I am looking for a solution without vendor-specific functions like ROW_NUMBER().
I have a (simplified) table as shown. I want to retrieve a page of 10 rows containing a given id value.
SELECT id, costcentre_id, costcentreuser_id, createdate FROM devices
WHERE id < 62 ORDER BY createdate DESC;
+----+---------------+-------------------+---------------------+
| id | costcentre_id | costcentreuser_id | createdate |
+----+---------------+-------------------+---------------------+
| 61 | 18 | 31 | 2015-07-13 13:54:06 |+++++++
| 55 | 13 | 28 | 2015-07-13 13:54:05 |
| 53 | 16 | 27 | 2015-07-13 13:54:05 |
| 54 | 16 | 27 | 2015-07-13 13:54:05 |
| 56 | 13 | 28 | 2015-07-13 13:54:05 | Page 1
| 57 | 5 | 29 | 2015-07-13 13:54:05 |
| 58 | 5 | 29 | 2015-07-13 13:54:05 |
| 59 | 17 | 30 | 2015-07-13 13:54:05 |
| 60 | 17 | 30 | 2015-07-13 13:54:05 |
| 46 | 5 | 23 | 2015-07-13 13:54:04 |
| 45 | 5 | 23 | 2015-07-13 13:54:04 |+++++++
| 47 | 13 | 24 | 2015-07-13 13:54:04 |
| 48 | 13 | 24 | 2015-07-13 13:54:04 |
| 49 | 14 | 25 | 2015-07-13 13:54:04 |
| 50 | 14 | 25 | 2015-07-13 13:54:04 |
| 51 | 15 | 26 | 2015-07-13 13:54:04 | Page 2
| 52 | 15 | 26 | 2015-07-13 13:54:04 |
| 37 | 5 | 19 | 2015-07-13 13:54:03 |
| 38 | 5 | 19 | 2015-07-13 13:54:03 |
| 39 | 12 | 20 | 2015-07-13 13:54:03 |
| 40 | 12 | 20 | 2015-07-13 13:54:03 |+++++++
| 41 | 5 | 21 | 2015-07-13 13:54:03 |
| 42 | 5 | 21 | 2015-07-13 13:54:03 |
| 43 | 11 | 22 | 2015-07-13 13:54:03 |
| 44 | 11 | 22 | 2015-07-13 13:54:03 |
| 36 | 11 | 18 | 2015-07-13 13:54:02 | Page 3
| 35 |** 11 | 18 | 2015-07-13 13:54:02 |
| 34 | 6 | 17 | 2015-07-13 13:54:02 |
| 33 | 6 | 17 | 2015-07-13 13:54:02 |
| 32 | 5 | 16 | 2015-07-13 13:54:02 |
| 31 | 5 | 16 | 2015-07-13 13:54:02 |+++++++
| 30 | 5 | 15 | 2015-07-13 13:54:02 |
| 29 | 5 | 15 | 2015-07-13 13:54:02 |
| 21 | 5 | 11 | 2015-07-13 13:54:01 |
| 22 | 5 | 11 | 2015-07-13 13:54:01 |
| 23 | 5 | 12 | 2015-07-13 13:54:01 | Page 4
| 24 | 5 | 12 | 2015-07-13 13:54:01 |
| 25 | 5 | 13 | 2015-07-13 13:54:01 |
| 26 | 5 | 13 | 2015-07-13 13:54:01 |
| 27 | 10 | 14 | 2015-07-13 13:54:01 |
| 28 | 10 | 14 | 2015-07-13 13:54:01 |+++++++
| 11 | 6 | 6 | 2015-07-13 13:54:00 |
| 12 | 6 | 6 | 2015-07-13 13:54:00 |
| 13 | 7 | 7 | 2015-07-13 13:54:00 |
| 14 | 7 | 7 | 2015-07-13 13:54:00 |
| 15 | 5 | 8 | 2015-07-13 13:54:00 |
| 16 | 5 | 8 | 2015-07-13 13:54:00 |
| 17 | 8 | 9 | 2015-07-13 13:54:00 |
| 18 | 8 | 9 | 2015-07-13 13:54:00 |
| 19 | 9 | 10 | 2015-07-13 13:54:00 |
| 20 | 9 | 10 | 2015-07-13 13:54:00 |
| 2 | 1 | 1 | 2015-07-13 13:53:59 |
| 3 | 2 | 2 | 2015-07-13 13:53:59 |
| 4 | 2 | 2 | 2015-07-13 13:53:59 |
| 5 | 3 | 3 | 2015-07-13 13:53:59 |
| 6 | 3 | 3 | 2015-07-13 13:53:59 |
| 7 | 4 | 4 | 2015-07-13 13:53:59 |
| 8 | 4 | 4 | 2015-07-13 13:53:59 |
| 9 | 5 | 5 | 2015-07-13 13:53:59 |
| 10 | 5 | 5 | 2015-07-13 13:53:59 |
| 1 | 1 | 1 | 2015-07-13 13:53:59 |
+----+---------------+-------------------+---------------------+
I want to get the page containing id 35 (here, page 3):
SELECT id, costcentre_id, costcentreuser_id, createdate FROM devices
WHERE id < 62 ORDER BY createdate DESC LIMIT 10 OFFSET 20;
+----+---------------+-------------------+---------------------+
| id | costcentre_id | costcentreuser_id | createdate |
+----+---------------+-------------------+---------------------+
| 37 | 5 | 19 | 2015-07-13 13:54:03 |
| 40 | 12 | 20 | 2015-07-13 13:54:03 |
| 41 | 5 | 21 | 2015-07-13 13:54:03 |
| 38 | 5 | 19 | 2015-07-13 13:54:03 |
| 42 | 5 | 21 | 2015-07-13 13:54:03 |
| 35 |** 11 | 18 | 2015-07-13 13:54:02 |
| 36 | 11 | 18 | 2015-07-13 13:54:02 |
| 33 | 6 | 17 | 2015-07-13 13:54:02 |
| 29 | 5 | 15 | 2015-07-13 13:54:02 |
| 30 | 5 | 15 | 2015-07-13 13:54:02 |
+----+---------------+-------------------+---------------------+
But how to calculate the OFFSET value automatically?
Thank you for any idea!
You can use TOP to your advantage here (TOP is SQL Server syntax; LIMIT is the equivalent on MySQL, MariaDB, and PostgreSQL). Some DBMSs support a variable or dynamic TOP condition; if not, the value would need to be generated in your client language.
This is also not the most efficient approach, and it requires a deterministic sort key.
-- offset 10, page 2
SELECT *
FROM (
    SELECT TOP 10 *      -- top: page size
    FROM (
        SELECT TOP 20 *  -- top: page size * page number
        FROM MyTable
        ORDER BY id      -- sort ascending
    ) T1
    ORDER BY id DESC     -- sort by the same key, descending
) T2
ORDER BY id              -- restore the original order, unless you reorder in the client app
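An alternative that avoids TOP entirely (and so also works on MySQL, MariaDB, and PostgreSQL) is to compute the OFFSET directly: count how many rows sort strictly before the target row under the same ORDER BY used for paging, then round that position down to a page boundary. A sketch using SQLite via Python's sqlite3, with a simplified devices table and synthetic timestamps:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE devices (id INT, createdate TEXT)")
# Synthetic data: 30 rows, createdate increasing with id.
con.executemany("INSERT INTO devices VALUES (?, ?)",
                [(i, f"2015-07-13 13:54:{i:02d}") for i in range(1, 31)])

page_size = 10
target_id = 7

# Count rows that sort strictly before the target under the paging order
# (createdate DESC, with id DESC as a deterministic tie-breaker).
(position,) = con.execute(
    """
    SELECT COUNT(*) FROM devices
    WHERE (createdate, id) > (SELECT createdate, id FROM devices WHERE id = ?)
    """, (target_id,)).fetchone()

# Round the position down to a page boundary to get the OFFSET.
offset = (position // page_size) * page_size
print(offset)  # 20

page = con.execute(
    "SELECT id FROM devices ORDER BY createdate DESC, id DESC LIMIT ? OFFSET ?",
    (page_size, offset)).fetchall()
```

This needs two queries (one COUNT, one page fetch), but both are plain standard SQL; the row-value comparison `(createdate, id) > (...)` keeps the count consistent with the ORDER BY even when createdate values repeat.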

SQL Server partitioning when null

I have a SQL Server table like this:
Value RowID Diff
153 48 1
68 49 1
50 57 NULL
75 58 1
65 59 1
70 63 NULL
66 64 1
79 66 NULL
73 67 1
82 68 1
85 69 1
66 70 1
118 88 NULL
69 89 1
67 90 1
178 91 1
How can I make it like this (note the new partition after each NULL in the third column):
Value RowID Diff
153 48 1
68 49 1
50 57 NULL
75 58 2
65 59 2
70 63 NULL
66 64 3
79 66 NULL
73 67 4
82 68 4
85 69 4
66 70 4
118 88 NULL
69 89 5
67 90 5
178 91 5
It looks like you are partitioning over sequential values of RowID. There is a trick to do this directly by grouping on RowID - Row_Number():
select
    value,
    rowID,
    Diff,
    RowID - row_number() over (order by RowID) as Diff2
from
    Table1
Notice how this gets you similar groupings, except with distinct Diff values (in Diff2):
| VALUE | ROWID | DIFF | DIFF2 |
|-------|-------|--------|-------|
| 153 | 48 | 1 | 47 |
| 68 | 49 | 1 | 47 |
| 50 | 57 | (null) | 54 |
| 75 | 58 | 1 | 54 |
| 65 | 59 | 1 | 54 |
| 70 | 63 | (null) | 57 |
| 66 | 64 | 1 | 57 |
| 79 | 66 | (null) | 58 |
| 73 | 67 | 1 | 58 |
| 82 | 68 | 1 | 58 |
| 85 | 69 | 1 | 58 |
| 66 | 70 | 1 | 58 |
| 118 | 88 | (null) | 75 |
| 69 | 89 | 1 | 75 |
| 67 | 90 | 1 | 75 |
| 178 | 91 | 1 | 75 |
Then to get ordered values for Diff, you can use Dense_Rank() to produce a numbering over each separate partition - except when a value is Null:
select
    value,
    rowID,
    case when Diff = 1
         then dense_rank() over (order by Diff2)
         else Diff
    end as Diff
from (
    select
        value,
        rowID,
        Diff,
        RowID - row_number() over (order by RowID) as Diff2
    from
        Table1
) T
This produces the expected result, except it is keyed off RowID directly rather than off the existing Diff column.
| VALUE | ROWID | DIFF |
|-------|-------|--------|
| 153 | 48 | 1 |
| 68 | 49 | 1 |
| 50 | 57 | (null) |
| 75 | 58 | 2 |
| 65 | 59 | 2 |
| 70 | 63 | (null) |
| 66 | 64 | 3 |
| 79 | 66 | (null) |
| 73 | 67 | 4 |
| 82 | 68 | 4 |
| 85 | 69 | 4 |
| 66 | 70 | 4 |
| 118 | 88 | (null) |
| 69 | 89 | 5 |
| 67 | 90 | 5 |
| 178 | 91 | 5 |
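The whole RowID - ROW_NUMBER() trick can be verified on a small sample; a sketch using SQLite (>= 3.25) via Python's sqlite3 with the first few rows of the table (using `Diff IS NULL` rather than `Diff = 1` for the NULL check, which is equivalent here):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t1 (Value INT, RowID INT, Diff INT)")
con.executemany("INSERT INTO t1 VALUES (?, ?, ?)", [
    (153, 48, 1), (68, 49, 1), (50, 57, None), (75, 58, 1), (65, 59, 1),
    (70, 63, None), (66, 64, 1),
])

# RowID - ROW_NUMBER() is constant within each run of consecutive RowIDs;
# DENSE_RANK over that constant then numbers the runs 1, 2, 3, ...
sql = """
SELECT Value, RowID,
       CASE WHEN Diff IS NULL THEN NULL
            ELSE DENSE_RANK() OVER (ORDER BY Diff2) END AS Diff
FROM (
    SELECT Value, RowID, Diff,
           RowID - ROW_NUMBER() OVER (ORDER BY RowID) AS Diff2
    FROM t1
)
ORDER BY RowID
"""
rows = con.execute(sql).fetchall()
print(rows)
```

RowIDs 48-49 give Diff2 = 47, 57-59 give 54, and 63-64 give 57, so DENSE_RANK assigns partitions 1, 2 and 3 while the NULL rows stay NULL, matching the expected output above.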