SQL complicated query - sql

I have the following problem with my SQL query. I managed to execute about 15 of them successfully, but this one is driving me crazy. It's hard to translate, but you will probably understand.
Show the year and month numbers and the sum of 'borrowed-time' costs for those months where the monthly sum of costs was less than the largest monthly sum among February and March 2006.
This is what I have so far (one version of the query, because I tried many of them):
SELECT EXTRACT(MONTH FROM DATA_WYP), SUM(KOSZT)
FROM WYPOZYCZENIA
WHERE KOSZT<(SELECT SUM(KOSZT)
FROM WYPOZYCZENIA
WHERE SUM(KOSZT)<(SELECT MAX(SUM(KOSZT))
FROM WYPOZYCZENIA
WHERE EXTRACT(MONTH FROM DATA_WYP)='2' OR
EXTRACT(MONTH FROM DATA_WYP)='3'
GROUP BY EXTRACT(MONTH FROM DATA_WYP)))
GROUP BY EXTRACT(MONTH FROM DATA_WYP);
The problem is that I cannot compare against SUM(KOSZT); I tried to name the sums using AS, but that doesn't work either.
Please help me because it already ruined my day.
Thanks in advance.

To filter by the results of an aggregate function in a SQL query, place the comparisons in a HAVING clause. WHERE is evaluated before grouping, so it cannot reference SUM() or other aggregates.
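As a rough illustration of the HAVING approach, here is a minimal sketch that runs the idea against an in-memory SQLite database from Python. The sample rows are made up, and because SQLite has no EXTRACT, strftime('%Y', ...) and strftime('%m', ...) stand in for it; on Oracle or MySQL you would keep EXTRACT or YEAR()/MONTH().

```python
import sqlite3

# Hypothetical sample data: (rental date, cost). Not taken from the question.
rows = [
    ("2006-01-10", 50.0),
    ("2006-02-05", 100.0),
    ("2006-02-20", 40.0),   # February 2006 total: 140
    ("2006-03-15", 90.0),   # March 2006 total: 90
    ("2006-04-01", 200.0),  # April 2006 total: 200
    ("2007-02-11", 30.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE wypozyczenia (data_wyp TEXT, koszt REAL)")
conn.executemany("INSERT INTO wypozyczenia VALUES (?, ?)", rows)

query = """
SELECT strftime('%Y', data_wyp) AS year,
       strftime('%m', data_wyp) AS month,
       SUM(koszt)               AS total
FROM wypozyczenia
GROUP BY year, month
-- HAVING is evaluated after grouping, so it may compare aggregates
HAVING SUM(koszt) < (
    SELECT MAX(monthly_total)
    FROM (
        -- one total per month, restricted to February and March 2006
        SELECT SUM(koszt) AS monthly_total
        FROM wypozyczenia
        WHERE strftime('%Y', data_wyp) = '2006'
          AND strftime('%m', data_wyp) IN ('02', '03')
        GROUP BY strftime('%m', data_wyp)
    )
)
ORDER BY year, month
"""

result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

With this data the threshold is the February 2006 total (140), so only the months whose total is strictly below it are returned.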

Are you looking for something like this?
SELECT YEAR(data_wyp) year,
MONTH(data_wyp) month,
SUM(koszt) koszt
FROM wypozyczenia
GROUP BY year, month
HAVING koszt < (SELECT SUM(koszt) koszt
FROM wypozyczenia
WHERE MONTH(data_wyp) IN (2, 3)
AND YEAR(data_wyp) = 2006
GROUP BY MONTH(data_wyp)
ORDER BY koszt DESC LIMIT 1)
SQLFiddle (MySql)

I'm assuming Oracle or DB2 because of the keywords in your sample query.
Using a CTE to find the monthly sums for the two months and HAVING for the condition simplifies the expression somewhat. This should (more or less) do what you need:
WITH cte AS (
SELECT SUM(KOSZT) mx FROM WYPOZYCZENIA
WHERE EXTRACT(MONTH FROM DATA_WYP) IN ('2','3')
AND EXTRACT(YEAR FROM DATA_WYP) = '2006'
GROUP BY EXTRACT(MONTH FROM DATA_WYP)
)
SELECT EXTRACT(YEAR FROM DATA_WYP) Year, EXTRACT(MONTH FROM DATA_WYP) Month,
SUM(KOSZT) KOSZT
FROM WYPOZYCZENIA
GROUP BY EXTRACT(Year FROM DATA_WYP), EXTRACT(Month FROM DATA_WYP)
HAVING SUM(KOSZT) < (SELECT MAX(mx) FROM cte)
An SQLfiddle to test with.

Try this out:
SELECT EXTRACT(YEAR FROM DATA_WYP),
EXTRACT(MONTH FROM DATA_WYP),
SUM(KOSZT)
FROM WYPOZYCZENIA
GROUP BY EXTRACT(YEAR FROM DATA_WYP),
EXTRACT(MONTH FROM DATA_WYP)
HAVING SUM(KOSZT)<(SELECT MAX(SUM(KOSZT))
FROM WYPOZYCZENIA
WHERE EXTRACT(YEAR FROM DATA_WYP)=2006
AND EXTRACT(MONTH FROM DATA_WYP) IN (2,3)
GROUP BY EXTRACT(MONTH FROM DATA_WYP))
I hope this solves your problem.

Related

Using Where and group by clause

Can anyone explain how to retrieve data using both WHERE and GROUP BY clauses on different fields in SQL?
For instance, I need to find the number of days in a month on which the temperature exceeded 35 degrees Celsius:
SELECT temp, count(*)
FROM weather_data
WHERE day between '01-jun-2022' to '30-jun-2022'
GROUP BY temp > '35';
My requirement is to get aggregate details such as the total count, so I tried using a GROUP BY clause.
In addition, I need a few conditions to filter the rows further, so I put those conditions in a WHERE clause before the GROUP BY clause.
This is the query I believe is correct:
SELECT temp, count(*) FROM weather_data
WHERE temp > '35' AND day between '01-jun-2022' and '30-jun-2022' GROUP BY temp
You want to aggregate your data, so as to get one result row per month. In SQL this is GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day). Your DBMS may have additional functions to extract a month (year + month to be precise) from a date, such as TO_CHAR(day, 'YYYY-MM'), but this is vendor specific.
Now you only want to count days with a temperature above 35 degrees. The first idea to solve this is a WHERE clause that limits the rows you aggregate to the ones in question:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(*)
FROM mytable
WHERE temp > 35
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
The problem with this: If a month has no day above that temperature, you won't select that month, because your WHERE clause removed those rows. That may be okay with you, but if you want to show the months with a zero count, then move the condition into the aggregation function. Thus you select all months but only count days with high temperatures:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(CASE WHEN temp > 35 THEN 1 END)
FROM mytable
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
How does this work? COUNT(<expression>) counts non-null occurrences. CASE WHEN temp > 35 THEN 1 END is short for CASE WHEN temp > 35 THEN 1 ELSE NULL END. And instead of 1 you could use any value that is not null, e.g. 'count me'. Or you could use SUM instead, if you like that better: SUM(CASE WHEN temp > 35 THEN 1 ELSE 0 END).
Finally, you want to limit the date range. Date literals in SQL look like this: DATE 'YYYY-MM-DD'. And as we sometimes deal with dates and other times with datetimes or timestamps, it has become common to use >= and < instead of BETWEEN, so that the range works for all those data types:
SELECT
EXTRACT(YEAR FROM day) AS year,
EXTRACT(MONTH FROM day) AS month,
COUNT(CASE WHEN temp > 35 THEN 1 END)
FROM mytable
WHERE day >= DATE '2022-06-01'
AND day < DATE '2022-07-01'
GROUP BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day)
ORDER BY EXTRACT(YEAR FROM day), EXTRACT(MONTH FROM day);
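To see the difference between filtering in WHERE and counting conditionally, here is a small sketch using Python's sqlite3 module. The readings are invented, the table is called weather_data as in the question, and strftime('%Y-%m', day) stands in for the EXTRACT pair because SQLite has no EXTRACT. Dates are stored as ISO 'YYYY-MM-DD' text, which compares correctly as strings.

```python
import sqlite3

# Hypothetical readings: (day, temperature in degrees Celsius).
readings = [
    ("2022-05-30", 36), ("2022-05-31", 31),
    ("2022-06-01", 36), ("2022-06-15", 34), ("2022-06-30", 38),
    ("2022-07-01", 33), ("2022-07-02", 30),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE weather_data (day TEXT, temp INTEGER)")
conn.executemany("INSERT INTO weather_data VALUES (?, ?)", readings)

# Variant 1: the WHERE clause removes cool days before grouping,
# so a month without any hot day produces no row at all.
where_query = """
SELECT strftime('%Y-%m', day) AS ym, COUNT(*) AS hot_days
FROM weather_data
WHERE temp > 35
GROUP BY ym
ORDER BY ym
"""

# Variant 2: every row in the half-open range [June 1, August 1) is grouped,
# but only hot days yield a non-null value for COUNT to count.
case_query = """
SELECT strftime('%Y-%m', day) AS ym,
       COUNT(CASE WHEN temp > 35 THEN 1 END) AS hot_days
FROM weather_data
WHERE day >= '2022-06-01'
  AND day <  '2022-08-01'
GROUP BY ym
ORDER BY ym
"""

where_result = conn.execute(where_query).fetchall()
case_result = conn.execute(case_query).fetchall()
print(where_result)
print(case_result)
```

July has no day above 35 degrees, so it is missing from the first result but appears with a count of 0 in the second.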
Try this:
SELECT temp, count(*)
FROM weather_data
WHERE day >= '01-jun-2022' AND day <= '30-jun-2022' AND temp > 35
GROUP BY temp;

PostgreSQL: Simplifying a SQL query into a shorter query

I have a table called 'daily_prices' where I have 'sale_date', 'last_sale_price', 'symbol' as columns.
I need to calculate how many times 'last_sale_price' has gone up compared to previous day's 'last_sale_price' in 10 weeks.
Currently I have my query like this for 2 weeks:
select count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp", 'wk1' as "week"
from
(
select last_sale_price, LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where sale_date <= CAST('2020-09-18' AS DATE) AND sale_date >= CAST('2020-09-14' AS DATE)
and symbol='AAPL'
) nest
where last_sale_price > prev_price
UNION
select count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp", 'wk2' as "week"
from
(
select last_sale_price, LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where sale_date <= CAST('2020-09-11' AS DATE) AND sale_date >= CAST('2020-09-07' AS DATE)
and symbol='AAPL'
) nest
where last_sale_price > prev_price
I'm using 'UNION' to combine the weekly data. But as the number of weeks increases, the query is going to be huge.
Is there a simpler way to write this query?
Any help is much appreciated. Thanks in advance.
You can extract the week from sale_date, then apply GROUP BY in the outer query:
select EXTRACT(year from sale_date) YEAR, EXTRACT('week' FROM sale_date) week, count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp"
from (
select
sale_date,
last_sale_price,
LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where symbol='AAPL'
) nest
where last_sale_price > prev_price
group by EXTRACT(year from sale_date), EXTRACT('week' FROM sale_date)
To keep only weekdays, you can add this filter:
EXTRACT(dow FROM sale_date) in (1,2,3,4,5)
PS: Make sure that Monday is the first day of the week. In some countries Sunday is the first day of the week.
You can filter on the last 8 weeks in the where clause, then group by week and do conditional aggregation:
select extract(year from sale_date) yyyy, extract(week from sale_date) ww,
sum(last_sale_price - lag_last_sale_price) filter(where last_sale_price > lag_last_sale_price) sum_dollars_up,
count(*) filter(where last_sale_price > lag_last_sale_price) cnt_dollars_up
from (
select dp.*,
lag(last_sale_price) over(partition by extract(year from sale_date), extract(week from sale_date) order by sale_date) lag_last_sale_price
from daily_prices dp
where symbol = 'AAPL'
and sale_date >= date_trunc('week', current_date) - '8 week'::interval
) dp
group by 1, 2
Notes:
I am assuming that you don't want to compare the first price of a week to the last price of the previous week; if you do, just remove the partition by clause from the over() clause of lag().
This dynamically computes the date as of 8 (entire) weeks ago.
If there is no price increase during a whole week, the query still gives you a row, with 0 as cnt_dollars_up and NULL as sum_dollars_up (wrap the sum in coalesce() if you want 0 there).
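For anyone who wants to try the "LAG, then group by week" idea without a PostgreSQL instance, here is a hedged sketch on SQLite via Python. It assumes SQLite 3.25 or newer (window functions), uses invented prices, and replaces extract(week ...) with strftime('%W', ...), which numbers weeks starting on Monday. It compares each day with the previous trading day across week boundaries, i.e. the variant without partition by, and uses CASE with ELSE 0 so that a week without increases reports 0.

```python
import sqlite3

# Hypothetical prices: (sale_date, last_sale_price, symbol).
prices = [
    ("2020-09-07", 100.0, "AAPL"), ("2020-09-08", 102.0, "AAPL"),
    ("2020-09-09", 101.0, "AAPL"), ("2020-09-10", 105.0, "AAPL"),
    ("2020-09-11", 104.0, "AAPL"),
    ("2020-09-14", 103.0, "AAPL"), ("2020-09-15", 103.0, "AAPL"),
    ("2020-09-16", 106.0, "AAPL"), ("2020-09-17", 107.0, "AAPL"),
    ("2020-09-18", 105.0, "AAPL"),
    ("2020-09-21", 104.0, "AAPL"), ("2020-09-22", 103.0, "AAPL"),
    ("2020-09-08", 999.0, "MSFT"),  # other symbol, excluded by the WHERE clause
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE daily_prices (sale_date TEXT, last_sale_price REAL, symbol TEXT)"
)
conn.executemany("INSERT INTO daily_prices VALUES (?, ?, ?)", prices)

query = """
SELECT strftime('%Y', sale_date) AS yyyy,
       strftime('%W', sale_date) AS ww,
       -- count only the days that closed above the previous day
       COUNT(CASE WHEN last_sale_price > prev_price THEN 1 END) AS times_up,
       -- add up only the positive day-over-day differences
       SUM(CASE WHEN last_sale_price > prev_price
                THEN last_sale_price - prev_price ELSE 0 END) AS dollars_up
FROM (
    SELECT sale_date,
           last_sale_price,
           LAG(last_sale_price) OVER (ORDER BY sale_date) AS prev_price
    FROM daily_prices
    WHERE symbol = ?
) AS priced
GROUP BY yyyy, ww
ORDER BY yyyy, ww
"""

weekly = conn.execute(query, ("AAPL",)).fetchall()
for row in weekly:
    print(row)
```

One query covers any number of weeks, so no UNION per week is needed; the third week has no increase and still shows up with zeros.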

ORA-00907 "missing right parenthesis" extract month and year

Having a horrible time with everything today. I am trying to get a list of months and years and then average the order total for each month. I am getting ORA-00907 "missing right parenthesis" and I am not sure why. Again, I am very new to this, but the code that I have is based on https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions050.htm. Thanks in advance.
SELECT EXTRACT (MONTH, YEAR FROM ORDERDATE) "DATE"
AVG (ORDERDATE) "NO. OF ORDERS"
FROM ORDERINFO
GROUP BY EXTRACT (MONTH, YEAR FROM ORDERDATE)
ORDER BY "MONTH" ASC;
The docs you referenced show that the extract function accepts only one date part specifier at a time.
For example, extract(month from orderdate) or extract(year from orderdate).
I'm guessing you really want to truncate the orderdate instead. See https://docs.oracle.com/cd/B19306_01/server.102/b14200/functions201.htm.
SELECT trunc(orderdate, 'MONTH') AS "date"
Either:
select
extract(month from orderdate) as month,
extract(year from orderdate) as year,
...
from ...
group by extract(month from orderdate), extract(year from orderdate)
or:
select
to_char(orderdate, 'YYYY-MM') as year_month,
...
from ...
group by to_char(orderdate, 'YYYY-MM')
This will work:
select extract(month from orderdate) "MONTH",
extract(year from orderdate) "YEAR"
from orderinfo
group by extract(month from orderdate), extract(year from orderdate)
order by extract(year from orderdate) asc, extract(month from orderdate) asc;

Choosing Specific Year when using EXTRACT

I'm trying to pull a few SUMs, but I'm getting stuck on how to narrow it down to a specific year.
I have the following code...
SELECT SITE_ID,
Extract(YEAR FROM DATE_ORDERED) YEAR,
Extract(MONTH FROM DATE_ORDERED) MONTH,
SUM(TOTAL_PRICE),
SUM(TOTAL_PRICE),
SUM(TOTAL_SAVINGS)
FROM DB.ACTUAL_SAVINGS_MVIEW
WHERE SITE_ID = 561
GROUP BY SITE_ID,
Extract(YEAR FROM DATE_ORDERED),
Extract(MONTH FROM DATE_ORDERED)
ORDER BY YEAR DESC,
MONTH DESC
This returns all available years, when I'm only looking for 2016.
Any and all help would be greatly appreciated!
How about adding it to your where clause:
SELECT SITE_ID,
Extract(YEAR FROM DATE_ORDERED) YEAR,
Extract(MONTH FROM DATE_ORDERED) MONTH,
SUM(TOTAL_PRICE),
SUM(TOTAL_PRICE),
SUM(TOTAL_SAVINGS)
FROM DB.ACTUAL_SAVINGS_MVIEW
WHERE SITE_ID = 561
AND Extract(YEAR FROM DATE_ORDERED) = 2016
GROUP BY SITE_ID,
Extract(YEAR FROM DATE_ORDERED),
Extract(MONTH FROM DATE_ORDERED)
ORDER BY YEAR DESC,
MONTH DESC
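A quick way to convince yourself that the extra WHERE condition is all that is needed: the sketch below replays the query on an in-memory SQLite table from Python. The table name actual_savings and the rows are hypothetical, and CAST(strftime('%Y', ...) AS INTEGER) plays the role of Extract(YEAR FROM ...), which SQLite does not have.

```python
import sqlite3

# Hypothetical rows: (site_id, date_ordered, total_price, total_savings).
orders = [
    (561, "2015-12-30", 10.0, 1.0),
    (561, "2016-01-05", 20.0, 2.0),
    (561, "2016-01-20", 30.0, 3.0),
    (561, "2016-02-01", 40.0, 4.0),
    (999, "2016-01-07", 500.0, 50.0),  # different site, filtered out
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE actual_savings ("
    "site_id INTEGER, date_ordered TEXT, total_price REAL, total_savings REAL)"
)
conn.executemany("INSERT INTO actual_savings VALUES (?, ?, ?, ?)", orders)

query = """
SELECT site_id,
       CAST(strftime('%Y', date_ordered) AS INTEGER) AS year,
       CAST(strftime('%m', date_ordered) AS INTEGER) AS month,
       SUM(total_price),
       SUM(total_savings)
FROM actual_savings
WHERE site_id = ?
  -- the year filter is applied to the rows before they are grouped
  AND CAST(strftime('%Y', date_ordered) AS INTEGER) = ?
GROUP BY site_id, year, month
ORDER BY year DESC, month DESC
"""

result = conn.execute(query, (561, 2016)).fetchall()
for row in result:
    print(row)
```

The 2015 order and the other site's order never reach the GROUP BY, so only the 2016 months for site 561 remain.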

Append select with missing month and year values

I have SELECT:
SELECT month, year, ROUND(AVG(q_overall) OVER (rows BETWEEN 10000 preceding and current row),2) as avg
FROM (
SELECT EXTRACT(Month FROM date) as month, EXTRACT(Year FROM date) as year, ROUND(AVG(q_overall),1) as q_overall
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month
ORDER BY year, month) a
output:
month year avg
-----------------
12 2012 5
1 2013 4.5
2 2013 4.1
4 2013 4.8
5 2013 4.7
I have to extend this result with the missing months (in this example, the 3rd month of 2013). The avg must be the same as in the previous row, which means I need to add this row:
3 2013 4.1
Can I do this with SELF JOINS and generate_series, or with some UNION select?
You can simplify your select. It doesn't need a subquery:
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall), 1) as q_overall,
ROUND(AVG(AVG(q_overall)) OVER (rows BETWEEN 10000 preceding and current row), 2)
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month;
The window function needs an ORDER BY. I assume you really intend:
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall), 1) as q_overall,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY year, month), 2)
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month;
Then, to fill in the values you can use generate_series():
SELECT EXTRACT(Month FROM ym.date) as month,
EXTRACT(Year FROM ym.date) as year,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY year, month), 2)
FROM (SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed
) ym LEFT JOIN
fb_parsed p
ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND
EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND
p.business_id = 1
GROUP BY year, month;
I think this will do what you want.
Final query:
SELECT EXTRACT(Month FROM ym.date) as month,
EXTRACT(Year FROM ym.date) as year,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY EXTRACT(Year FROM ym.date), EXTRACT(Month FROM ym.date)), 2)
FROM
(SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed WHERE business_id = 1 AND site = 'facebook')
ym LEFT JOIN
fb_parsed p
ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND
EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND
p.business_id = 1 AND site = 'facebook'
GROUP BY year, month;
Can I do this with SELF JOINS and generate_series?
Yep, you're close, but your current query computes a Cumulative Average. The tricky part is to fill the gaps with the previous value (if PostgreSQL supported the IGNORE NULLS option of LAST_VALUE this would be easier...)
SELECT month,
year,
MAX(q_overall) -- assign the value to all rows within the same group
OVER (PARTITION BY grp)
FROM
(
SELECT EXTRACT(Month FROM all_months.date) AS month,
EXTRACT(Year FROM all_months.date) AS year,
p.q_overall,
-- assign a new group number whenever there's a value in q_overall
SUM(CASE WHEN p.q_overall IS NULL THEN 0 ELSE 1 END)
OVER (ORDER BY all_months.date
ROWS UNBOUNDED PRECEDING) AS grp
FROM
( -- create all months between min and max date
SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed
) AS all_months
LEFT JOIN
( -- do the average per month calculation
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall),1) as q_overall
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month
) AS p
ON EXTRACT(year FROM all_months.date) = p.year
AND EXTRACT(month FROM all_months.date) = p.month
) AS dt
ORDER BY year, month
Edit:
Oops, this was overly complicated. The question asked for a Cumulative Average, and in that case NULLs do not change the result, so there's no need to fill the gaps.
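If you do want the literal "repeat the previous row" behaviour from the question, the grouping trick above can be tried out on SQLite from Python. This is only a sketch under assumptions: the rows are invented so that the monthly averages match the question's shape, a recursive CTE replaces generate_series (which stock SQLite lacks), and window functions need SQLite 3.25 or newer.

```python
import sqlite3

# Hypothetical ratings: (date, business_id, q_overall). March 2013 is missing.
ratings = [
    ("2012-12-03", 1, 5.0),
    ("2013-01-10", 1, 4.0), ("2013-01-25", 1, 5.0),  # January average: 4.5
    ("2013-02-14", 1, 4.1),
    ("2013-04-02", 1, 4.8),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fb_parsed (date TEXT, business_id INTEGER, q_overall REAL)")
conn.executemany("INSERT INTO fb_parsed VALUES (?, ?, ?)", ratings)

query = """
WITH RECURSIVE bounds AS (
    SELECT date(MIN(date), 'start of month') AS first_month,
           date(MAX(date), 'start of month') AS last_month
    FROM fb_parsed
    WHERE business_id = 1
),
all_months(month_start) AS (
    -- one row per calendar month between the first and last rating
    SELECT first_month FROM bounds
    UNION ALL
    SELECT date(month_start, '+1 month')
    FROM all_months
    WHERE month_start < (SELECT last_month FROM bounds)
),
monthly AS (
    SELECT date(date, 'start of month') AS month_start,
           ROUND(AVG(q_overall), 1)     AS q_overall
    FROM fb_parsed
    WHERE business_id = 1
    GROUP BY month_start
),
grouped AS (
    SELECT a.month_start,
           m.q_overall,
           -- the running count increases only on months that have a value,
           -- so each gap month shares the group of the month before it
           SUM(CASE WHEN m.q_overall IS NULL THEN 0 ELSE 1 END)
               OVER (ORDER BY a.month_start
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS grp
    FROM all_months AS a
    LEFT JOIN monthly AS m ON m.month_start = a.month_start
)
SELECT CAST(strftime('%m', month_start) AS INTEGER) AS month,
       CAST(strftime('%Y', month_start) AS INTEGER) AS year,
       -- each group holds exactly one non-null value; MAX spreads it over the group
       MAX(q_overall) OVER (PARTITION BY grp) AS q_overall
FROM grouped
ORDER BY month_start
"""

filled = conn.execute(query).fetchall()
for row in filled:
    print(row)
```

March 2013 has no ratings, so it inherits the February value, which is the row the question wanted to add.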