Calculate max value by a dimension over a number of days in SQL (Presto) - sql

I am looking to get the max value by a specific dimension over a number of days in SQL, as per example below:
I have this initial dataset:
And I am looking to calculate the maximum of nr of items and sales by product type across the number of days, as in the example below:
Expected output:
Any advise on best way to get this? I tried Max function and Max_by to get the max by product id but it didnt work.
Thank you in advance.

Use window functions:
select t.*,
max(items) over (partition by product_type),
max(sales) over (partition by product_type)
from t;

Related

Finding the initial sampled time window after using SAMPLE BY again

I can't seem to find a perhaps easy solution to what I'm trying to accomplish here, using SQL and, more importantly, QuestDB. I also find it hard to put my exact question into words so bear with me.
Input
My real input is different of course but a similar dataset or case is the gas_prices table on the demo page of QuestDB. On https://demo.questdb.io, you can directly write and run queries against some sample database, so it should be easy enough to follow.
The main task I want to accomplish is to find out which month was responsible for the year's highest galon price.
Output
Using the following query, I can get the average galon price per month just fine.
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
timestamp
avg_per_month
2000-06-05T00:00:00.000000Z
1.6724
2000-07-05T00:00:00.000000Z
1.69275
2000-08-05T00:00:00.000000Z
1.635
...
...
Then, I get all these monthly averages, group them by year and return the maximum galon price per year by wrapping the above query in a subquery, like so:
SELECT timestamp, max(avg_per_month) as max_per_year FROM (
SELECT timestamp, avg(galon_price) as avg_per_month FROM 'gas_prices' SAMPLE BY 1M
) SAMPLE BY 12M
timestamp
max_per_year
2000-01-05T00:00:00.000000Z
1.69275
2001-01-05T00:00:00.000000Z
1.767399999999
2002-01-05T00:00:00.000000Z
1.52075
...
...
Wanted output
I want to know which month was responsible for the maximum price of a year.
Looking at the output of the above query, we see that the maximum galon price for the year 2000 was 1.69275. Which month of the year 2000 had this amount as average price? I'd like to display this month in an additional column.
For the first row, July 2000 is shown in the additional column for year 2000 because it is responsible for the highest average price in 2000. For the second row, it was May 2001 as that month had the highest average price of 2001.
timestamp
max_per_year
which_month_is_responsible
2000-01-05T00:00:00.000000Z
1.69275
2000-07-05T00:00:00.000000Z
2001-01-05T00:00:00.000000Z
1.767399999999
2001-05-05T00:00:00.000000Z
...
...
What did I try?
I tried by adding a subquery to the SELECT to have a "duplicate" of some sort for the timestamp column but that's apparently never valid in QuestDB (?), so probably the solution is by adding even more subqueries in the FROM? Or a UNION?
Who can help me out with this? The data is there in the database and it can be calculated. It's just a matter of getting it out.
I think 'wanted output' can be achieved with window functions.
Please have a look at:
CREATE TABLE electricity (ts TIMESTAMP, consumption DOUBLE) TIMESTAMP(ts);
INSERT INTO electricity
SELECT (x*1000000)::timestamp, rnd_double()
FROM long_sequence(10000000);
SELECT day, ts, max_per_day
FROM
(
SELECT timestamp_floor('d', ts) as day,
ts,
avg_in_15_min as max_per_day,
row_number() OVER (PARTITION BY timestamp_floor('d', ts) ORDER BY avg_in_15_min desc) as rn_per_day
FROM
(
SELECT ts, avg(consumption) as avg_in_15_min
FROM electricity
SAMPLE BY 15m
)
) WHERE rn_per_day = 1

SQL calculating running total as you go down the rows but also taking other fields into account

I'm hoping you guys can help with this problem.
I have a set of data which I have displayed via excel.
I'm trying to work out the rolling new cap allowance but need to deduct from previous weeks bookings. I don't want to use a cursor so can anyone help.
I'm going to group by the product id so it will need to start afresh for every product.
In the image, Columns A to D are fixed and I am trying to calculate the data in column E ('New Cap'). The 'New Cap' is the expected results.
Column F gives a detailed formula of what im trying to do.
Not sure what I've done for the post to be marked down.
Thanks
Update:
The formula looks like this.
You want the sum of the cap through this row minus the sum of booked through the previous row. This is easy to do with window functions:
select t.*,
(sum(cap + booked) over (partition by productid order by weekbeg) - booked
) as new_cap
from t;
You can get the new running total using lag and sum over window functions - calculate the cap-booked first, then use sum over() for the running total:
select weekbeg, ProductId, Cap, Booked,
Sum(n) over(partition by productid order by weekbeg) New_Cap
from (
select *, cap - Lag(booked,1,0) over(partition by productid order by weekbeg)n
from t
)t

Calculation of weighted average counts in SQL

I have a query that I am currently using to find counts
select Name, Count(Distinct(ID)), Status, Team, Date from list
In addition to the counts, I need to calculate a goal based on weighted average of counts per status and team, for each day.
For example, if Name 1 counts are divided into 50% Status1-Team1(X) and 50% Status2-Team2(Y) yesterday, then today's goal for Name1 needs to be (X+Y)/2.
The table would look like this, with the 'Goal' field needed as the output:
What is the best way to do this in the same query?
I'm almost guessing here since you did not provide more details but maybe you want to do this:
SELECT name,status,team,data,(select sum(data)/(select count(*) from list where name = q.name)) FROM (SELECT Name, Count(Distinct(ID)) as data, Status, Team, Date FROM list) as q

(MSSQL) SUM COLUMN BY MODIFIED DATE GIVEN

I want to get the sum on a specific date given e.g from 18/12/04 to 18/12/15 then calculate the total amount
The key is in the GROUP BY statement.
If you want to get the sum per tenant and date, group by them both.
If you want to get the sum per date, group by date only (and remove the tenant colums [or select MAX] from the SELECT).
Paste your code as text and I can edit it for you if you want.
If you want to get the total as additional column, you can use WINDOW functions:
SELECT tenant_id
,SUM(total_amount) OVER (PARTITION BY tenant_id)
,SUM(total_amount) OVER ()
FROM tenant_reeipits
WHERE ...
ORDER BY ...

Calculate contribution

It is not legal to do max(count()), so how do I accomplish calculating the contribution as shown here (and also get the other columns)
SELECT id,
Avg(time) AS avgSec,
Stdev(time) AS stdevSec,
Count(time) AS cnt,
Avg(time)*Count(time)/max(Count(time)) AS contribution
FROM ...very long and complex query...
Use MAX()OVER() window aggregate function to get maximum count out of all records
Here is the correct way
Avg(time)*Count(time)/max(Count(time)) over()