SQL postgresql: maximum value and sum of values

I need to query the following table to return the row with the maximum date for each code, and also calculate deb - cred for that row only.
How would I do this?
code | date | deb | cred
-----------------------------------
4 | 2018-01-01 | 100,00 | 200,00
4 | 2017-12-28 | 100,00 | 500,00
6 | 2018-01-23 | 350,00 | 400,00
6 | 2018-04-28 | 140,00 | 678,00
8 | 2018-01-12 | 156,00 | 256,00
8 | 2016-02-28 | 134,00 | 598,00
The result must be
4 | 2018-01-01 | -200,00
6 | 2018-04-28 | -50,00
8 | 2018-01-12 | -464,00

PostgreSQL's DISTINCT ON in combination with ORDER BY returns the first row per group; ordering by date DESC within each code makes that the row with the latest date:
SELECT DISTINCT ON (code)
code, date, deb - cred AS balance
FROM your_table
ORDER BY code, date DESC;
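To make the "first row per group" behaviour concrete, here is a minimal Python sketch that mimics what DISTINCT ON (code) with ORDER BY code, date DESC does, using the sample rows from the question. The function name latest_per_code and the tuple layout are assumptions made for illustration; this is not how PostgreSQL implements it.

```python
from datetime import date
from decimal import Decimal

# Sample rows from the question: (code, date, deb, cred)
rows = [
    (4, date(2018, 1, 1), Decimal("100.00"), Decimal("200.00")),
    (4, date(2017, 12, 28), Decimal("100.00"), Decimal("500.00")),
    (6, date(2018, 1, 23), Decimal("350.00"), Decimal("400.00")),
    (6, date(2018, 4, 28), Decimal("140.00"), Decimal("678.00")),
    (8, date(2018, 1, 12), Decimal("156.00"), Decimal("256.00")),
    (8, date(2016, 2, 28), Decimal("134.00"), Decimal("598.00")),
]


def latest_per_code(rows):
    """Keep the latest-dated row per code (hypothetical helper)."""
    # ORDER BY code, date DESC, done as two stable sorts (secondary key first)
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    ordered.sort(key=lambda r: r[0])
    result = {}
    for code, day, deb, cred in ordered:
        # DISTINCT ON keeps only the first row seen for each code
        if code not in result:
            result[code] = (day, deb - cred)
    return result


latest = latest_per_code(rows)
for code, (day, balance) in latest.items():
    print(code, day, balance)
```

The dictionary plays the role of DISTINCT ON: once a code has been seen, later (older) rows for it are ignored.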

Related

Get average time difference grouped by another column in postgresql

I have a posts table where I'm interested in calculating the average time difference between each author's posts. Here is a minimal example:
+---------------+---------------------+
| post_author | post_date |
|---------------+---------------------|
| 0 | 2019-03-05 19:12:24 |
| 1 | 2017-11-06 18:28:43 |
| 1 | 2017-11-06 18:28:43 |
| 1 | 2017-11-06 18:28:43 |
| 1 | 2017-11-06 18:28:43 |
| 1 | 2018-02-19 18:36:36 |
| 1 | 2018-02-19 18:36:36 |
| 1 | 2018-02-19 18:36:36 |
| 1 | 2018-02-19 18:36:36 |
| 1 | 2018-02-19 18:40:09 |
+---------------+---------------------+
So for each author, I want to get the deltas of their time series, then find the average (grouped by author). The end result would look something like:
+---------------+---------------------+
| post_author | post_date_delta(hrs)|
|---------------+---------------------|
| 0 | 0 |
| 1 | 327 |
| 2 | 95 |
| ... | ... |
+---------------+---------------------+
I can think of how to do it in Python, but I'm struggling to write a (postgres) SQL query to accomplish this. Any help is appreciated!
You can use aggregation and arithmetic:
select post_author,
(max(post_date) - min(post_date)) / nullif(count(*) - 1, 0)
from t
group by post_author;
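The average gap between consecutive posts is the difference between the latest and earliest post dates divided by one less than the number of posts, because the consecutive differences add up to max - min. The nullif returns NULL instead of dividing by zero for authors with a single post.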
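The shortcut can be checked against the explicit calculation. The sketch below is a Python illustration on the sample rows from the question; the names average_gap and average_gap_explicit are hypothetical. It computes the average both ways and compares them.

```python
from datetime import datetime, timedelta

# Sample post dates per author, taken from the question
posts = {
    0: [datetime(2019, 3, 5, 19, 12, 24)],
    1: [datetime(2017, 11, 6, 18, 28, 43)] * 4
    + [datetime(2018, 2, 19, 18, 36, 36)] * 4
    + [datetime(2018, 2, 19, 18, 40, 9)],
}


def average_gap(dates):
    """(max - min) / (count - 1); None for a single post, like nullif."""
    if len(dates) < 2:
        return None
    return (max(dates) - min(dates)) / (len(dates) - 1)


def average_gap_explicit(dates):
    """Average of the differences between consecutive posts."""
    ordered = sorted(dates)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if not gaps:
        return None
    return sum(gaps, timedelta()) / len(gaps)


for author, dates in posts.items():
    print(author, average_gap(dates), average_gap_explicit(dates))
```

Both functions agree because the consecutive gaps telescope: every intermediate date cancels out, leaving only the latest minus the earliest.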

How to set a SQL statement into other statement to create a view

I'm trying to write a SQL statement that includes another statement, and create a view from the result. I have one data table (see the main table below). What I'm trying to do is create a view that selects each distinct date once and, for each date, sums the price of all rows with that date.
For example: the Main table
+----+--------------+---------------+------------+
| id | article_name | article_price | date |
+----+--------------+---------------+------------+
| 1 | T-Shirt | 10 | 2020-11-16 |
| 2 | Shoes | 25 | 2020-11-16 |
| 3 | Pullover | 35 | 2020-11-17 |
| 4 | Pants | 10 | 2020-11-18 |
+----+--------------+---------------+------------+
What I'm expecting is 3 rows (because the first 2 rows have the same date):
+------------+-----+
| date | sum |
+------------+-----+
| 2020-11-16 | 35 |
| 2020-11-17 | 35 |
| 2020-11-18 | 10 |
+------------+-----+
I'm having a hard time coming up with an "algorithm" to solve this.
Any ideas?
Use group by!
select date, sum(article_price) as sum_article_price
from mytable
group by date
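As for the "algorithm" asked about in the question, GROUP BY with SUM amounts to keeping one running total per distinct date. A minimal Python sketch of that idea on the sample table (the function name sum_by_date is an assumption for illustration):

```python
from collections import defaultdict

# Rows of the main table: (id, article_name, article_price, date)
rows = [
    (1, "T-Shirt", 10, "2020-11-16"),
    (2, "Shoes", 25, "2020-11-16"),
    (3, "Pullover", 35, "2020-11-17"),
    (4, "Pants", 10, "2020-11-18"),
]


def sum_by_date(rows):
    """Accumulate one total per distinct date, like GROUP BY date."""
    totals = defaultdict(int)
    for _id, _name, price, day in rows:
        totals[day] += price
    return dict(totals)


print(sum_by_date(rows))
```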

Selecting latest consecutive records that match a condition with PostgreSQL

I am looking for a PostgreSQL query to find the latest consecutive records that match a condition. Let me explain it better with an example:
| ID | HEATING STATE | DATE |
| ---- | --------------- | ---------- |
| 1 | ON | 2018-02-19 |
| 2 | ON | 2018-02-20 |
| 3 | OFF | 2018-02-20 |
| 4 | OFF | 2018-02-21 |
| 5 | ON | 2018-02-21 |
| 6 | OFF | 2018-02-21 |
| 7 | ON | 2018-02-22 |
| 8 | ON | 2018-02-22 |
| 9 | ON | 2018-02-22 |
| 10 | ON | 2018-02-23 |
I need to find all the recent consecutive records with date >= 2018-02-20 and heating_state ON, i.e. the ones with ID 7, 8, 9, 10. My main issue is with the fact that they must be consecutive.
For further clarification, if needed:
ID 1 is excluded because older than 2018-02-20
ID 2 is excluded because followed by ID 3 which has heating state OFF
ID 3 is excluded because it has heating state OFF
ID 4 is excluded because it has heating state OFF
ID 5 is excluded because it is followed by ID 6, which has heating state OFF
ID 6 is excluded because it has heating state OFF
I think this is best solved using window functions and a filtered aggregate.
For each row, count the number of later rows that have state = 'OFF', then use only the rows where that count is 0.
You need a subquery because you cannot use a window function result in the WHERE condition (WHERE is evaluated before window functions).
SELECT id, state, date
FROM (SELECT id, state, date,
-- id DESC breaks ties between rows that share a date
count(*) FILTER (WHERE state = 'OFF')
OVER (ORDER BY date DESC, id DESC) AS later_off_count
FROM tab) q
WHERE later_off_count = 0
AND date >= DATE '2018-02-20';
id | state | date
----+-------+------------
10 | ON | 2018-02-23
9 | ON | 2018-02-22
8 | ON | 2018-02-22
7 | ON | 2018-02-22
(4 rows)
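The same idea can be sketched procedurally: walk the rows from the latest to the earliest and stop at the first row that is OFF or older than the cutoff date. The Python below is only an illustration on the sample data; it assumes that id increases with time, and the function name latest_on_run is hypothetical.

```python
from datetime import date

# Sample rows from the question: (id, heating_state, date)
readings = [
    (1, "ON", date(2018, 2, 19)),
    (2, "ON", date(2018, 2, 20)),
    (3, "OFF", date(2018, 2, 20)),
    (4, "OFF", date(2018, 2, 21)),
    (5, "ON", date(2018, 2, 21)),
    (6, "OFF", date(2018, 2, 21)),
    (7, "ON", date(2018, 2, 22)),
    (8, "ON", date(2018, 2, 22)),
    (9, "ON", date(2018, 2, 22)),
    (10, "ON", date(2018, 2, 23)),
]


def latest_on_run(readings, cutoff):
    """Ids of the most recent unbroken run of ON rows dated on or after cutoff."""
    run = []
    # Assumption: a higher id means a later reading
    for row_id, state, day in sorted(readings, key=lambda r: r[0], reverse=True):
        if state != "ON" or day < cutoff:
            break  # the run ends at the first OFF row or too-old row
        run.append(row_id)
    return sorted(run)


print(latest_on_run(readings, date(2018, 2, 20)))
```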
Use the LEAD function with a CASE expression.
Query 1:
SELECT id,
heating_state,
dt
FROM (SELECT t.*,
CASE
WHEN dt >= timestamp '2018-02-20'
AND heating_state = 'ON'
AND LEAD(heating_state, 1, heating_state)
OVER (
ORDER BY dt ) = 'ON' THEN 1
ELSE 0
END on_state
FROM t) s
WHERE on_state = 1
Results:
| id | heating_state | dt |
|----|---------------|----------------------|
| 7 | ON | 2018-02-22T00:00:00Z |
| 8 | ON | 2018-02-22T00:00:00Z |
| 9 | ON | 2018-02-22T00:00:00Z |
| 10 | ON | 2018-02-23T00:00:00Z |
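To see what the LEAD condition actually tests, here is a small Python sketch of the same per-row check. It assumes rows are ordered by id (the query orders by dt, which has ties in the sample), and the function name on_followed_by_on is hypothetical.

```python
from datetime import date

# Sample rows from the question: (id, heating_state, dt)
readings = [
    (1, "ON", date(2018, 2, 19)),
    (2, "ON", date(2018, 2, 20)),
    (3, "OFF", date(2018, 2, 20)),
    (4, "OFF", date(2018, 2, 21)),
    (5, "ON", date(2018, 2, 21)),
    (6, "OFF", date(2018, 2, 21)),
    (7, "ON", date(2018, 2, 22)),
    (8, "ON", date(2018, 2, 22)),
    (9, "ON", date(2018, 2, 22)),
    (10, "ON", date(2018, 2, 23)),
]


def on_followed_by_on(readings, cutoff):
    """Ids of ON rows on or after cutoff whose next row is also ON."""
    ordered = sorted(readings, key=lambda r: r[0])  # assumption: order by id
    selected = []
    for i, (row_id, state, day) in enumerate(ordered):
        # LEAD(heating_state, 1, heating_state): the last row falls back to itself
        next_state = ordered[i + 1][1] if i + 1 < len(ordered) else state
        if day >= cutoff and state == "ON" and next_state == "ON":
            selected.append(row_id)
    return selected


print(on_followed_by_on(readings, date(2018, 2, 20)))

# A hypothetical sequence ON, ON, OFF, ON, all after the cutoff
counter_example = [
    (1, "ON", date(2018, 3, 1)),
    (2, "ON", date(2018, 3, 2)),
    (3, "OFF", date(2018, 3, 3)),
    (4, "ON", date(2018, 3, 4)),
]
print(on_followed_by_on(counter_example, date(2018, 2, 20)))
```

On the sample data this selects ids 7 to 10. The check only looks one row ahead, though, so in the hypothetical second sequence id 1 is selected as well as id 4, even though an OFF row lies between them.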

SQL Duplicating tables with groupBy

I'm trying to compare income/outgoings using a simple query, but for some reason, I'm getting duplicated data. This is the query I'm running:
SELECT
Event.Name as "Event",
Concat("£", round(sum(Ticket.Price),2)) as "Ticket Sales",
sum(Invoice.NetTotal) as "Invoice Costs",
Concat("£", round(sum(Ticket.Price),2) - round(sum(Invoice.NetTotal),2)) as "Total Loss"
FROM Ticket
JOIN Event ON Ticket.EventID = Event.EventID
JOIN Invoice ON Event.EventID = Invoice.EventID
GROUP BY Event.EventID;
This is the result I'm getting
+--------------------------+--------------+---------------+------------+
| Event | Ticket Sales | Invoice Costs | Total Loss |
+--------------------------+--------------+---------------+------------+
| Victorious Festival 2018 | £47.94 | 1800 | £-1752.06 |
+--------------------------+--------------+---------------+------------+
Despite there being only 2 items in the Invoice table, totaling £600,
and 3 relevant items in the Ticket table, totaling £23.97:
+-----------+--------+---------+---------------+-------------+----------+------+
| InvoiceNo | ItemID | EventID | HireStartDate | HireEndDate | NetTotal | VAT |
+-----------+--------+---------+---------------+-------------+----------+------+
| 1 | 1 | 1 | 2018-05-05 | 2018-05-06 | 500 | 20 |
| 2 | 2 | 1 | 2018-05-05 | 2018-05-06 | 100 | 20 |
+-----------+--------+---------+---------------+-------------+----------+------+
+----------+---------+-------+------------+------------+----------+
| TicketNo | EventID | Price | ValidFrom | ValidTo | Class |
+----------+---------+-------+------------+------------+----------+
| 1 | 1 | 7.99 | 2018-05-05 | 2018-05-22 | Standard |
| 2 | 1 | 7.99 | 2018-05-05 | 2018-05-22 | Standard |
| 3 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 4 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 5 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 6 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 7 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 8 | 2 | 10 | 2018-04-28 | 2018-04-28 | Standard |
| 9 | 1 | 7.99 | 2018-05-05 | 2018-05-22 | Standard |
+----------+---------+-------+------------+------------+----------+
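You are joining along two independent dimensions: every ticket row for an event is paired with every invoice row for that event, so the 3 tickets and 2 invoices of event 1 produce 6 joined rows. That counts each ticket twice (£47.94) and each invoice three times (1800). The best solution is to aggregate before joining: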
SELECT e.Name as "Event",
Concat("£", round(sum(t.Price), 2)) as "Ticket Sales",
sum(i.NetTotal) as "Invoice Costs",
Concat("£", round(sum(t.Price), 2) - round(sum(i.NetTotal), 2)) as "Total Loss"
FROM Event e JOIN
(SELECT t.EventId, SUM(Price) as Price
FROM Ticket t
GROUP BY t.EventId
) t
ON t.EventID = e.EventID JOIN
(SELECT i.EventId, SUM(i.NetTotal) as NetTotal
FROM Invoice i
GROUP BY i.EventId
) i
ON e.EventID = i.EventID
GROUP BY e.EventID;
Two comments. First, I don't really like grouping by EventID, because it is not in the SELECT list (I would prefer grouping by the name). However, assuming EventID is the primary key of Event, this structure is fine: the id uniquely identifies each row in Event, so the name is well-defined.
Second, you might want to make the joins LEFT JOINs, so that all events are included, even those that have no tickets or invoices.
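The effect of joining before aggregating can be reproduced outside the database. The following Python sketch uses the Ticket and Invoice rows from the question (only the columns that matter) and compares both orders of operation; the function names are assumptions for illustration.

```python
from decimal import Decimal

# (TicketNo, EventID, Price) from the Ticket table
tickets = [
    (1, 1, Decimal("7.99")), (2, 1, Decimal("7.99")), (9, 1, Decimal("7.99")),
    (3, 2, Decimal("10")), (4, 2, Decimal("10")), (5, 2, Decimal("10")),
    (6, 2, Decimal("10")), (7, 2, Decimal("10")), (8, 2, Decimal("10")),
]
# (InvoiceNo, EventID, NetTotal) from the Invoice table
invoices = [(1, 1, Decimal("500")), (2, 1, Decimal("100"))]


def join_then_sum(event_id):
    """Join tickets to invoices first, then sum: every pair is counted."""
    pairs = [
        (t, i)
        for t in tickets if t[1] == event_id
        for i in invoices if i[1] == event_id
    ]
    return sum(t[2] for t, _ in pairs), sum(i[2] for _, i in pairs)


def sum_then_join(event_id):
    """Aggregate each table on its own, then combine the two totals."""
    ticket_total = sum(t[2] for t in tickets if t[1] == event_id)
    invoice_total = sum(i[2] for i in invoices if i[1] == event_id)
    return ticket_total, invoice_total


print(join_then_sum(1))   # the inflated totals from the question
print(sum_then_join(1))   # the intended totals
```

For event 1 the first function reproduces the 47.94 and 1800 from the question, while the second gives 23.97 and 600.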

Group by dynamic time period

For my problem I'll try to create a simplified example.
Table "order":
ID | date | client | product | product_limit_period (in months)
1 | 2015-01-01 | Bob | table | 1
2 | 2015-01-31 | Bob | table | 1
3 | 2015-02-01 | Bob | table | 1
4 | 2015-01-01 | Mary | lamb | 12
5 | 2015-06-01 | Mary | lamb | 12
6 | 2016-01-01 | Mary | lamb | 12
7 | 2016-12-31 | Mary | lamb | 12
This is the result, I'd like to get:
client | product | group | count
Bob | table | 1 | 2 #ID 1, 2
Bob | table | 2 | 1 #ID 3
Mary | lamb | 3 | 2 #ID 4, 5
Mary | lamb | 4 | 2 #ID 6, 7
Every product has a limit and a limit period (in months). I need to be able to see if there are any clients that have ordered a product more than its limit allows in a certain period. The period in months might be 1 month or several years in months. It is possible that the period is 1 month, 12 months, 24 months, ... until 108 months (9 years).
I feel like I need to use some combination of window functions and group by. But I haven't figured out how.
I'm using postgres 9.1. Please let me know if there is any more information I should provide.
Any help is appreciated, even just pointing me to the right direction!
Edit:
To clarify how the grouping works: The limit period starts with the first order. Bob's first order is 2015-01-01 and so this period ends with 2015-01-31. 2015-02-01 starts a second period. A period always starts with the first day of a month and ends with the last day of a month.
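There is no need to combine window functions and GROUP BY; just add a CASE expression to the GROUP BY, like here. Note that this only distinguishes a 1-month period from a yearly one and relies on calendar boundaries, so longer periods or a first order outside January would need a different grouping key: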
t=# select
client
, product
, count(1)
, string_agg(id::text,',')
from so44
group by
client
, product
, date_trunc(case when product_limit_period = 1 then 'month' else 'year' end,date);
client | product | count | string_agg
----------+-----------+-------+------------
Bob | table | 2 | 1,2
Bob | table | 1 | 3
Mary | lamb | 2 | 4,5
Mary | lamb | 2 | 6,7
(4 rows)
sample:
t=# create table so44 (id int,"date" date,client text,product text,product_limit_period int);
CREATE TABLE
t=# copy so44 from stdin delimiter '|';
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself.
>> 1 | 2015-01-01 | Bob | table | 1
>> 2 | 2015-01-31 | Bob | table | 1
>> 3 | 2015-02-01 | Bob | table | 1
>> 4 | 2015-01-01 | Mary | lamb | 12
>> 5 | 2015-06-01 | Mary | lamb | 12
>> 6 | 2016-01-01 | Mary | lamb | 12
>> 7 | 2016-12-31 | Mary | lamb | 12
>> \.
COPY 7
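For limit periods other than 1 or 12 months, one possible generalization is to number the periods from the month of each client's first order, as described in the question's edit. The Python sketch below illustrates that grouping key on the sample data; it is not part of the answer above, and the names month_index and count_per_period are hypothetical.

```python
from collections import Counter
from datetime import date

# Sample rows: (id, date, client, product, product_limit_period in months)
orders = [
    (1, date(2015, 1, 1), "Bob", "table", 1),
    (2, date(2015, 1, 31), "Bob", "table", 1),
    (3, date(2015, 2, 1), "Bob", "table", 1),
    (4, date(2015, 1, 1), "Mary", "lamb", 12),
    (5, date(2015, 6, 1), "Mary", "lamb", 12),
    (6, date(2016, 1, 1), "Mary", "lamb", 12),
    (7, date(2016, 12, 31), "Mary", "lamb", 12),
]


def month_index(day):
    """Number of the calendar month, so that differences count whole months."""
    return day.year * 12 + day.month


def count_per_period(orders):
    """Count orders per (client, product, period number)."""
    # The first period starts in the month of the first order
    first_month = {}
    for _id, day, client, product, _period in orders:
        key = (client, product)
        first_month[key] = min(first_month.get(key, month_index(day)), month_index(day))
    counts = Counter()
    for _id, day, client, product, period in orders:
        months_since_start = month_index(day) - first_month[(client, product)]
        counts[(client, product, months_since_start // period)] += 1
    return dict(counts)


for key, count in sorted(count_per_period(orders).items()):
    print(key, count)
```

In SQL the same key could be built from the year and month of the order date and the minimum order date per client and product, but that translation is left as an assumption here.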