date_part - gives error when using where clause - sql

I have a table that looks like this
I'm wanting to sum the sales and only show sales from 2010. Using DATE_PART works except when I try to use there WHERE clause
SELECT date_part('year', date) AS year, SUM(weekly_sales) as yearly_sales
FROM sales
GROUP BY year
SELECT date_part('year', date) AS year, SUM(weekly_sales) as yearly_sales
FROM sales
GROUP BY year
WHERE year = 2010;
Is it because the date_part is a double precision?

The WHERE clause goes before the GROUP BY clause, and it cannot access column aliases defined in the SELECT clause.
You could do:
SELECT date_part('year', date) AS year, SUM(weekly_sales) as yearly_sales
FROM sales
WHERE date_part('year', date) = 2010
GROUP BY year
Or, more efficiently:
SELECT 2010 AS year, SUM(weekly_sales) as yearly_sales
FROM sales
WHERE date >= date '2010-01-01' and date < date '2011-01-01'

Related

Group by days of a month in CockroachDB

In CockroachDB, I want to have such this query on a specific month for its every day:
select count(*), sum(amount)
from request
where code = 'code_string'
and created_at >= '2022-07-31T20:30:00Z' and created_at < '2022-08-31T20:30:00Z'
the problem is that I want it on my local date. What should I do?
My goal is:
"month, day, count, sum" as result columns for a month.
UPDATE:
I have found a suitable query for this purpose:
select count(amount), sum(amount), extract(month from created_at) as monthTime, extract(day from created_at) as dayTime
from request
where code = 'code_string' and created_at >= '2022-07-31T20:30:00Z' and created_at < '2022-08-31T20:30:00Z'
group by dayTime, monthTime
Thanks to #histocrat for easier answer :) by replacing
extract(month from created_at) as monthTime, extract(day from created_at) as dayTime
by this:
date_part('month', created_at) as monthTime, date_part('day', created_at) as dayTime
To group results by both month and day, you can use the date_part function.
select month, day, count(*), sum(things)
from request
where code = 'code_string'
group by date_part('month', created_at) as month, date_part('day', created_at) as day;
Depending on what type created_at is, you may need to cast or convert it first (for example, group by date_part('month', created_at::timestamptz)).

how to write a query that returns the second quarter of a year .....example total profit of second quarter from 2011 to 2016

SELECT SUM(profit),
DATE_PART('quarter',2,sales_date) AS Quarter,sales_year
FROM sales_data
GROUP BY sales_year,sales_date
this is what i have tried
You need a WHERE clause that only selects dates from the second quarter, then you need to GROUP BY the quarter, not the full date.
select sales_year,
date_part('quarter', sales_date) as quarter,
sum(profit)
from sales_data
where date_part('quarter', sales_date) = 2 -- get only dates from second quarter
and sales_date >= date '2011-01-01'
and sales_date < date '2017-01-01' -- only from 2011 to 2016
group by sales_year, date_part('quarter', sales_date)

PostgreSQL: Simplifying a SQL query into a shorter query

I have a table called 'daily_prices' where I have 'sale_date', 'last_sale_price', 'symbol' as columns.
I need to calculate how many times 'last_sale_price' has gone up compared to previous day's 'last_sale_price' in 10 weeks.
Currently I have my query like this for 2 weeks:
select count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp", 'wk1' as "week"
from
(
select last_sale_price, LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where sale_date <= CAST('2020-09-18' AS DATE) AND sale_date >= CAST('2020-09-14' AS DATE)
and symbol='AAPL'
) nest
where last_sale_price > prev_price
UNION
select count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp", 'wk2' as "week"
from
(
select last_sale_price, LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where sale_date <= CAST('2020-09-11' AS DATE) AND sale_date >= CAST('2020-09-07' AS DATE)
and symbol='AAPL'
) nest
where last_sale_price > prev_price
I'm using 'UNION' to combine the weekly data. But as the number of weeks increase the query is going to be huge.
Is there a simpler way to write this query?
Any help is much appreciated. Thanks in advance.
you can extract week from sale_date. then apply group by on the upper query
select EXTRACT(year from sale_date) YEAR, EXTRACT('week' FROM sale_date) week, count(*) as "timesUp", sum(last_sale_price-prev_price) as "dollarsUp"
from (
select
sale_date,
last_sale_price,
LAG(last_sale_price, 1) OVER (ORDER BY sale_date) as prev_price
from daily_prices
where symbol='AAPL'
)
where last_sale_price > prev_price
group by EXTRACT(year from sale_date), EXTRACT('week' FROM sale_date)
to extract only weekdays you can add this filter
EXTRACT(dow FROM sale_date) in (1,2,3,4,5)
PS: make sure that monday is first day of the week. In some countries sunday is the first day of the week
You can filter on the last 8 weeks in the where clause, then group by week and do conditional aggregation:
select extract(year from sale_date) yyyy, extract(week from saledate) ww,
sum(last_sale_price - lag_last_sale_price) filter(where lag_last_sale_price > last_sale_price) sum_dollars_up,
count(*) filter(where lag_last_sale_price > last_sale_price) cnt_dollars_up
from (
select dp.*,
lag(last_sale_price) over(partition by extract(year from sale_date), extract(week from saledate) order by sale_date) lag_last_sale_price
from daily_price
where symbol = 'AAPL'
and sale_date >= date_trunc('week', current_date) - '8 week'::interval
) dp
group by 1, 2
Notes:
I am asssuming that you don't want to compare the first price of a week to the last price of the previous week; if you do, then just remove the partition by clause from the over() clause of lag()
this dynamically computes the date as of 8 (entire) weeks ago
if there is no price increase during a whole week, the query still gives you a row, with 0 as sum_dollars_up and cnt_dollars_up

unable to typecast timestamp to date in Group By

I am unable to typecast timestamp to date type in the Group By of my SQL Select statement.
SELECT geography_id,
listed_at::DATE,
EXTRACT(YEAR FROM listed_at) AS year,
EXTRACT(MONTH FROM listed_at) AS month,
EXTRACT(day FROM listed_at) AS day,
Count(*) AS active_listing_count,
SUM(list_price) AS sum_of_listing_price,
Date_part('day', current_date :: timestamp - listed_at :: timestamp) AS days_on_market,
COUNT(num_bathrooms) AS total_bathrooms,
COUNT(num_bedrooms) AS total_bedrooms
FROM properties
WHERE expired_at IS NULL
GROUP BY geography_id,
listed_at::DATE
ORDER BY listed_at::DATE DESC;
I am getting this error:
ERROR: column "properties.listed_at" must appear in the GROUP BY clause or be used in an aggregate function
Each occurrence of listed_at in select list should be casted to date:
SELECT geography_id,
listed_at::DATE,
EXTRACT(YEAR FROM listed_at::date) AS year,
EXTRACT(MONTH FROM listed_at::date) AS month,
EXTRACT(day FROM listed_at::date) AS day,
count(*) AS active_listing_count,
SUM(list_price) AS sum_of_listing_price,
date_part('day', current_date::timestamp - listed_at::date) AS days_on_market,
COUNT(num_bathrooms) AS total_bathrooms,
COUNT(num_bedrooms) AS total_bedrooms
FROM properties
WHERE expired_at IS NULL
GROUP BY geography_id,
listed_at::DATE
ORDER BY listed_at::DATE DESC;

Append select with missing month and year values

I have SELECT:
SELECT month, year, ROUND(AVG(q_overall) OVER (rows BETWEEN 10000 preceding and current row),2) as avg
FROM (
SELECT EXTRACT(Month FROM date) as month, EXTRACT(Year FROM date) as year, ROUND(AVG(q_overall),1) as q_overall
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month
ORDER BY year, month) a
output:
month year avg
-----------------
12 2012 5
1 2013 4.5
2 2013 4.1
4 2013 4.8
5 2013 4.7
And I have to append this table with missing values (in this example with 3-rd month in 2013 year). The avg must be same as in previous row, that means I need to append this table with:
3 2013 4.1
Can I do this with SELF JOINS and generate_series, or with some UNION select?
You can simplify your select. It doesn't need a subquery:
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall), 1) as q_overall,
ROUND(AVG(AVG(q_overall)) OVER (rows BETWEEN 10000 preceding and current row), 2)
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month;
The windows function needs an order by. I assume you really intend:
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall), 1) as q_overall,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY year, month)), 2)
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month;
Then, to fill in the values you can use generate_series():
SELECT EXTRACT(Month FROM ym.date) as month,
EXTRACT(Year FROM ym.date) as year,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY year, month)), 2)
FROM (SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed
) ym LEFT JOIN
fb_parsed p
ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND
EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND
p.business_id = 1
GROUP BY year, month;
I think this will do what you want.
Final query:
SELECT EXTRACT(Month FROM ym.date) as month,
EXTRACT(Year FROM ym.date) as year,
ROUND(AVG(AVG(q_overall)) OVER (ORDER BY EXTRACT(Year FROM ym.date), EXTRACT(Month FROM ym.date)), 2)
FROM
(SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed WHERE business_id = 1 AND site = 'facebook')
ym LEFT JOIN
fb_parsed p
ON EXTRACT(year FROM ym.date) = EXTRACT(year FROM p.date) AND
EXTRACT(month FROM ym.date) = EXTRACT(month FROM p.date) AND
p.business_id = 1 AND site = 'facebook'
GROUP BY year, month;
Can I do this with SELF JOINS and generate_series?
Yep, you're close, but your current query does a Cumulative Average. The tricky part is the fill the gaps with the previous value (If PostgreSQL supported the IGNORE NULLS option of LAST_VALUE this would be easier...)
SELECT month,
year,
MAX(q_overall) -- assign the value to all rows within the same group
OVER (PARTITION BY grp)
FROM
(
SELECT all_months.month, all_months.year, p.q_overall,
-- assign a new group number whenever there's a value in q_overall
SUM(CASE WHEN q_overall IS NULL THEN 0 ELSE 1 END)
OVER (ORDER BY all_months.month, all_months.year
ROWS UNBOUNDED PRECEDING) AS grp
FROM
( -- create all months with min and max date
SELECT generate_series(date_trunc('month', min(date)),
date_trunc('month', max(date)),
interval '1 month') as date
FROM fb_parsed
) AS all_months
LEFT JOIN
( -- do the average per month calculation
SELECT EXTRACT(Month FROM date) as month,
EXTRACT(Year FROM date) as year,
ROUND(AVG(q_overall),1) as q_overall
FROM fb_parsed
WHERE business_id = 1
GROUP BY year, month
) AS p
ON EXTRACT(year FROM ym.date) = all_months.month
AND EXTRACT(month FROM ym.date) = all_months.year
) AS dt
Edit:
Oops, this was overly complicated, the question asked for a Cumulative Average and then NULLs will not change the result and there's no need to fill the gaps