Column alias is not recognized [duplicate] - sql

This minimal example is supposed to extract year from time stamps, then count something in a given year.
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
GROUP BY year
HAVING year=2020
Running it, I get the error column "year" does not exist. What is the reason for this?
Code with an explicit HAVING EXTRACT(YEAR FROM rental_ts)=2020 works without problems, but it is not very convenient.
The same happens if I use year in a WHERE clause instead.
I practice in this playground. It uses PostgreSQL.

Alas, that is true. PostgreSQL does not allow column aliases in the HAVING or WHERE clause, although it does accept them in GROUP BY and ORDER BY. One solution is to repeat the expression:
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
GROUP BY year
HAVING EXTRACT(YEAR FROM rental_ts) = 2020;
A better solution is to filter before aggregating:
SELECT EXTRACT(YEAR FROM rental_ts) as year,
COUNT(DISTINCT rental_id)
FROM rental
WHERE rental_ts >= '2020-01-01' AND rental_ts < '2021-01-01'
GROUP BY year;
This is better for two reasons. First, it is index (and partition) compatible. Second, it reduces the amount of data needed for the aggregation.
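The filter-first version can be tried end to end with a small script. This is only a sketch: it uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, so strftime('%Y', ...) replaces EXTRACT(YEAR FROM ...), and the sample rows and the year_bounds helper are made up for illustration.

```python
import sqlite3


def year_bounds(year):
    """Return the half-open [start, end) date range covering one calendar year."""
    return f"{year:04d}-01-01", f"{year + 1:04d}-01-01"


def count_rentals(conn, year):
    """Count distinct rentals, filtering on the raw column before grouping."""
    start, end = year_bounds(year)
    return conn.execute(
        """
        SELECT CAST(strftime('%Y', rental_ts) AS INTEGER) AS year,
               COUNT(DISTINCT rental_id)
        FROM rental
        WHERE rental_ts >= ? AND rental_ts < ?
        GROUP BY year
        """,
        (start, end),
    ).fetchall()


# Hypothetical sample data, not from the original question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE rental (rental_id INTEGER, rental_ts TEXT)")
conn.executemany(
    "INSERT INTO rental VALUES (?, ?)",
    [
        (1, "2019-12-31 23:59:59"),
        (2, "2020-01-01 00:00:00"),
        (3, "2020-06-15 12:30:00"),
        (3, "2020-06-15 12:30:00"),  # duplicate rental_id, counted once
        (4, "2021-01-01 00:00:00"),
    ],
)
result = count_rentals(conn, 2020)
print(result)
```

Because the WHERE clause compares the bare rental_ts column against constants, the database is free to use an index on that column; wrapping the column in a function call usually prevents that.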

Related

What is missing (if anything) from this SQL statement? [duplicate]

SELECT EXTRACT(YEAR FROM created_at) AS yr, count(*) AS users_count
FROM users;
I'm new to SQL (so new that I can't find anything wrong with this). There may not be, and it may be a trick question. Would appreciate feedback!
You need a GROUP BY clause as well, because you select the COUNT aggregate alongside another, non-aggregated column.
For example:
SELECT EXTRACT(YEAR FROM created_at) AS yr, count(*) AS users_count FROM users group by YR;

SQL Monthname Display in Query

I am new to SQL and currently going through some training materials.
I need to: Display the month and number of quotations received in each month.
My Query
SELECT Qdate AS MONTH ,COUNT (Quotationid)AS QUOTATIONCOUNT
FROM Quotation
GROUP BY Qdate
ORDER BY Qdate ASC;
DB Structure
Quotation (Quotationid, Sname, Itemcode, Quotedprice, Qdate, Qstatus)
You are almost correct.
The query below is for Oracle.
SELECT to_char(Qdate, 'MONTH') AS "MONTH" ,COUNT (Quotationid)AS QUOTATIONCOUNT
FROM Quotation
GROUP BY to_char(Qdate, 'MONTH')
ORDER BY "MONTH";
And to show month name:
In SQL Server, you can use the DATENAME function in combination with the DATEADD function. @MonthNum is your month number, so you can use: SELECT DateName(month, DateAdd(month, @MonthNum, 0) - 1)
In MySQL, you can use: MONTHNAME(date)
You need to extract the month and year to get what you want. Date/time functions are notoriously database-dependent. In standard SQL you can do:
SELECT EXTRACT(YEAR FROM Qdate) AS YYYY,
EXTRACT(MONTH FROM Qdate) AS MM,
COUNT(*) AS QUOTATIONCOUNT
FROM Quotation
GROUP BY EXTRACT(YEAR FROM Qdate), EXTRACT(MONTH FROM Qdate)
ORDER BY MIN(Qdate) ASC;
The specific functions you want depend on the database you are using.
Important points:
You should not be considering the month without the year, unless you have a really good reason.
In general, you need to repeat the expressions in the GROUP BY, although some databases do allow column aliases (such as GROUP BY YYYY, MM in the above example).
The ORDER BY needs to use an expression that only consists of expressions in the GROUP BY or aggregation functions (so ORDER BY QDate wouldn't work).
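The first point is easy to see with a few made-up dates. The sketch below is plain Python rather than SQL, and the sample quotation dates are hypothetical; it only contrasts grouping by month alone with grouping by year and month.

```python
from collections import Counter
from datetime import date

# Hypothetical quotation dates spanning two years.
qdates = [date(2019, 3, 5), date(2019, 3, 20), date(2020, 3, 1), date(2020, 4, 9)]

# Grouping by month alone merges March 2019 with March 2020.
by_month = Counter(d.month for d in qdates)

# Grouping by (year, month) keeps them apart, like
# GROUP BY EXTRACT(YEAR FROM Qdate), EXTRACT(MONTH FROM Qdate).
by_year_month = Counter((d.year, d.month) for d in qdates)

print(sorted(by_month.items()))
print(sorted(by_year_month.items()))
```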

PyPika how to generate IF statement

How do I generate an IF statement in PyPika?
I am trying to generate a BigQuery query that pivots a row to a column. I found that if I use the following in a query (where date_range is from a WITH statement):
IF (date_range.kind = 'year', date_range.name, NULL) as year
then this will work. However, I haven't found a way to generate this SQL fragment in PyPika.
For completeness, this is an example of a query I need to run in BigQuery:
WITH date_range AS (
SELECT
CAST(EXTRACT(year FROM year) as string) name,
'year' kind,
year start_date,
DATE_ADD(year, INTERVAL 1 year) end_date
FROM UNNEST(GENERATE_DATE_ARRAY('2010-01-01','2020-06-01',INTERVAL 1 year)) year
UNION ALL
SELECT
FORMAT_DATE('%B', month)||' '||EXTRACT(year FROM month) name,
'month' kind,
month start_date,
DATE_ADD(month,INTERVAL 1 month) end_date
FROM
UNNEST(GENERATE_DATE_ARRAY('2010-01-01','2020-06-01',INTERVAL 1 month)) month
)
SELECT
IF(date_range.kind='year', date_range.name, null) as year,
IF(date_range.kind='month', date_range.name, null) as month,
SUM(sales.sales_value) sales_value,
FROM sales
JOIN date_range ON sales.start_date>=date_range.start_date AND sales.end_date<date_range.end_date
GROUP BY year, month
ORDER BY year, month
The more general question I have is: is there a way to pass literal strings to PyPika so that they are included in the resulting query string? There are several SQL fragments that PyPika does not generate (such as GENERATE_DATE_ARRAY and UNNEST, at least as far as I can find), and passing the actual SQL fragment to PyPika would solve the problem.
Thanks!
Not sure if it applies but be sure to also check whether the CASE statement can help you.
Other than that, you can either subclass PyPika's Function class and override get_sql, or (ab)use the CustomFunction and PseudoColumn utility classes like this:
from pypika import CustomFunction, Query, Table
from pypika.terms import PseudoColumn
sales_table = Table('sales')
MyIf = CustomFunction('IF', ['condition', 'if', 'else'])
q = Query.from_(sales_table).select(
MyIf(PseudoColumn("date_range.kind = 'year'"), PseudoColumn("date_range.name"), None, alias="year")
)
However, I'd probably recommend making a ticket on the PyPika Github.
Note: I wasn't able to test this.

Display By month using select statement

SELECT SUM(Total_A ) FROM Materials_List
This is the snippet of code that I have.
I need it to calculate and display the sum by month using SQL.
I would also like the query to work for any month of the year, not just one month at a time.
You seem to be looking for simple aggregation:
select
year(materials_datetime) yr,
month(materials_datetime) mn,
sum(total_a) sum_total_a
from materials_list
group by
year(materials_datetime),
month(materials_datetime)
order by yr, mn
This assumes that column materials_datetime contains the date/time that you want to use to aggregate the data.
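To try the aggregation without a database server, here is a minimal sketch using Python's built-in sqlite3 module. It is an approximation of the query above, not the same dialect: SQLite has no year() or month() functions, so strftime is used instead, and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE materials_list (materials_datetime TEXT, total_a REAL)")

# Hypothetical sample rows: two in January, one in February.
conn.executemany(
    "INSERT INTO materials_list VALUES (?, ?)",
    [
        ("2020-01-05 10:00:00", 10.0),
        ("2020-01-20 09:30:00", 5.0),
        ("2020-02-01 08:00:00", 7.5),
    ],
)

# One output row per (year, month), covering every month present in the data.
rows = conn.execute(
    """
    SELECT CAST(strftime('%Y', materials_datetime) AS INTEGER) AS yr,
           CAST(strftime('%m', materials_datetime) AS INTEGER) AS mn,
           SUM(total_a) AS sum_total_a
    FROM materials_list
    GROUP BY yr, mn
    ORDER BY yr, mn
    """
).fetchall()
print(rows)
```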

Teradata Current year and year-1

How can I compute the years dynamically in the WHERE condition of a query? I need to fetch data for 2017, 2018 and 2019. Currently I am hard-coding them (where FSC_YR in (2017,2018,2019)), but I need a dynamic way instead. How can I do this in Teradata?
I tried extract(year from current_date)-2,extract(year from current_date)-1,extract(year from current_date)-3), but I am getting the error too many expression.
Since you're looking for a range of year numbers, why not just use a BETWEEN?
SELECT *
FROM data
WHERE fsc_yr BETWEEN EXTRACT(year FROM current_date - interval '2' year) AND EXTRACT(year FROM current_date)
But as @dnoeth pointed out in the comments, using INTERVAL might not be the safest method, because it can raise an error when the query runs on Feb. 29.
Just subtracting from the year number isn't so bad, really:
SELECT *
FROM data
WHERE fsc_yr BETWEEN EXTRACT(year FROM current_date)-2 AND EXTRACT(year FROM current_date)
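The Feb. 29 concern can be illustrated outside the database. The sketch below is a Python analogy, not Teradata code: moving a date back by whole years has no valid result when the target year lacks a Feb. 29, whereas subtracting from the year number always works. The year_window helper is a hypothetical name used only for this example.

```python
from datetime import date


def year_window(today, years_back=2):
    """Return (first_year, last_year) by subtracting from the year number."""
    return today.year - years_back, today.year


leap_day = date(2020, 2, 29)

# Subtracting from the year number is always safe.
window = year_window(leap_day)

# Moving the date itself back two years fails: 2018-02-29 does not exist.
try:
    leap_day.replace(year=leap_day.year - 2)
    shift_failed = False
except ValueError:
    shift_failed = True

print(window, shift_failed)
```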
Also note that such an error can come from selecting more than one column in the subquery used for an IN.
For example, this would fail:
SELECT * FROM Table1
WHERE Col1 IN (SELECT Col1, Col2 FROM Table2)
So if the subquery for the IN used a SELECT *, it would still result in that error.