How to group by month in Presto SQL

I am trying to group by month in Presto SQL.
I tried this:
select
date_trunc('month', CAST(date AS date)) date_month,
sum(gross_revenue,0) AS 'monthly_net_revenue'
from gross_revenue_calculator
group by date_trunc('month', date)
This gives me the following error:
Malformed query: line 61:27: mismatched input ''monthly_net_revenue''. Expecting: <identifier>
Expected output:
October: $102.12
November: $90.12

Do not wrap a column alias in single quotes; use either no quotes or double quotes. You can also reference columns by position in GROUP BY, as you do in your WITH clause:
select
date_trunc('month', CAST(date AS date)) date_month,
sum(gross_revenue) AS monthly_net_revenue -- sum() takes a single argument
from gross_revenue_calculator
group by 1
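To make the grouping logic concrete without a Presto cluster, here is a minimal Python sketch of what date_trunc('month', ...) combined with sum() computes. The rows and the year are hypothetical; they were chosen only so the totals match the expected output in the question, and trunc_month is an illustrative helper, not a Presto API.

```python
from collections import defaultdict
from datetime import date
from decimal import Decimal

# Hypothetical rows standing in for gross_revenue_calculator (date, gross_revenue).
rows = [
    (date(2021, 10, 3), Decimal("50.00")),
    (date(2021, 10, 19), Decimal("52.12")),
    (date(2021, 11, 7), Decimal("90.12")),
]


def trunc_month(d):
    """Mimic date_trunc('month', d): snap a date to the first day of its month."""
    return d.replace(day=1)


# Equivalent of GROUP BY 1 with sum(gross_revenue): one running total per month.
monthly = defaultdict(Decimal)
for d, revenue in rows:
    monthly[trunc_month(d)] += revenue

for month, total in sorted(monthly.items()):
    print(f"{month:%B}: ${total}")
```

Every date in the same month collapses onto the same key, which is why grouping by the truncated value yields one row per month.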

Related

Oracle SQL group by to_char - not a group by expression

I want to group by the dd-mm-yyyy format to show working_hours per employee (person) per day, but I get the error message ORA-00979: not a GROUP BY expression. When I remove TO_CHAR from the GROUP BY it works fine, but that is not what I want, because I want to group by day regardless of the hour. What am I doing wrong here?
SELECT papf.person_number emp_id,
to_char(sh21.start_time,'dd/mm/yyyy') start_time,
to_char(sh21.stop_time,'dd/mm/yyyy') stop_time,
SUM(sh21.measure) working_hours
FROM per_all_people_f papf,
hwm_tm_rec sh21
WHERE ...
GROUP BY
papf.person_number,
to_char(sh21.start_time,'dd/mm/yyyy'),
to_char(sh21.stop_time,'dd/mm/yyyy')
ORDER BY sh21.start_time
The clause
ORDER BY sh21.start_time
needs to be either just the column alias defined in the SELECT clause:
ORDER BY start_time
or use the expression in the GROUP BY clause:
ORDER BY to_char(sh21.start_time,'dd/mm/yyyy')
If you use sh21.start_time, the table_alias.column_name syntax refers to the underlying column from the table, which you are neither selecting nor grouping by.

PRESTO SQL count group by date

I have a Presto sql table called "imp_pixel".
Here are two records from the table:
date_time            ip          impression_id
2022-08-27 07:05:48  192.0.0.1   001
2022-08-27 07:05:58  192.0.0.12  002
I would like to show the count of impression_id grouped by hour.
I tried this code:
select
date_trunc('hour', CAST(date_time AS date)) date_time,
COUNT(impression_id,0) AS 'impression_id'
from parquet_db.imp_pixel
group by date_trunc('hour', date)
But I got this error:
line 3:31: mismatched input ''impression_id''. Expecting: <identifier>
Can you please help me fix this error?
Thanks
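Casting date_time to date discards the time of day, so the hourly data is lost. Cast to timestamp instead, leave the alias unquoted, and pass a single argument to COUNT: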
select
date_trunc('hour', CAST(date_time AS timestamp)) date_time,
COUNT(impression_id) AS impression_id
from parquet_db.imp_pixel
group by 1
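The effect of the cast can be sketched in plain Python using the two sample rows from the question. This is only an illustration of the reasoning: trunc_hour is a hypothetical helper that imitates date_trunc('hour', ...), and the date-only path is modelled as midnight, which is an assumption about how a date behaves once its time part is gone, not a statement about what Presto returns.

```python
from collections import Counter
from datetime import datetime

# The two sample rows from the question: (date_time, impression_id).
rows = [
    ("2022-08-27 07:05:48", "001"),
    ("2022-08-27 07:05:58", "002"),
]


def trunc_hour(ts):
    """Imitate date_trunc('hour', ts) on a timestamp."""
    return ts.replace(minute=0, second=0, microsecond=0)


parsed = [datetime.strptime(text, "%Y-%m-%d %H:%M:%S") for text, _ in rows]

# Reducing to a date first throws the time away, so every row lands on midnight.
via_date = Counter(
    trunc_hour(datetime.combine(ts.date(), datetime.min.time())) for ts in parsed
)

# Truncating the timestamp itself keeps the hour.
via_timestamp = Counter(trunc_hour(ts) for ts in parsed)

print(via_date)
print(via_timestamp)
```

Both rows end up in the 07:00 bucket only on the timestamp path; on the date path the hour is no longer recoverable.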

How to filter date with where on SQLite with '2021-07-31 13:53:26' format?

I want to take just the year and month from values like '2021-07-31 13:53:26' and count the rows in each group.
I tried the date, datetime, and strftime functions.
date and datetime return null. strftime returns something, but I can't combine the year and month I get with the count I want; the result is null again.
Here is the preview of the data.
The expected result is something like '2021-07' together with the count of how many times this year and month occurs.
This is the syntax I tried with strftime:
select strftime('%Y%m', started_at) year_month, count(year_month) from bike_trip
group by year_month
Thank you
SQLite doesn't have a dedicated date data type (dates are typically stored as text), so one way to achieve this is to slice the string directly:
with d as (
select '2021-07-31 13:53:26' as d, 'A' val union all
select '2021-08-30 13:53:26' as d, 'B' val
)
select substr(d,1,4) as yyyy, substr(d,6,2) as mm, count(*)
from d
group by substr(d,1,4), substr(d,6,2)
In your query:
select substr(started_at,1,4) as yyyy, substr(started_at,6,2) as mm, count(*)
from bike_trip
group by substr(started_at,1,4), substr(started_at,6,2)
Use a CTE to get your answer.
with
-- uncomment to test
/*bike_trip(started_at) as (
values
('2021-07-31 13:53:26'),
('2021-07-17 19:06:01'),
('2021-08-30 13:53:26')
),*/
bike_months(year_month) as (
select strftime('%Y-%m', started_at) year_month from bike_trip
)
select year_month, count(year_month) count_year_month from bike_months
group by year_month;
Output:
year_month|count_year_month
2021-07|2
2021-08|1
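The strftime approach can be checked end to end with Python's built-in sqlite3 module. The sketch below is a minimal, self-contained version: the in-memory bike_trip table holds only the three sample values from the answer above, and the final query on a non-ISO string is an assumption about one possible cause of the NULL results mentioned in the question, not a diagnosis of the asker's data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bike_trip (started_at TEXT)")
conn.executemany(
    "INSERT INTO bike_trip (started_at) VALUES (?)",
    [("2021-07-31 13:53:26",), ("2021-07-17 19:06:01",), ("2021-08-30 13:53:26",)],
)

# Group on the formatted year-month and count rows, not the alias.
query = """
    SELECT strftime('%Y-%m', started_at) AS year_month,
           count(*) AS trips
    FROM bike_trip
    GROUP BY strftime('%Y-%m', started_at)
    ORDER BY year_month
"""
result = conn.execute(query).fetchall()
print(result)

# A string that is not in a format SQLite recognizes yields NULL (None in Python).
bad = conn.execute("SELECT strftime('%Y-%m', '07/31/2021 13:53:26')").fetchone()[0]
print(bad)

conn.close()
```

If the real started_at values are not stored as 'YYYY-MM-DD HH:MM:SS' text, the substr variant above may be the more robust choice.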

CAST a date in Presto to next count

I would like to query JSON files with Athena. I paired creation_date with id because I want a heatmap with the month on the Y axis and the day on the X axis, counting the ids in each cell. I created a table with 2 columns:
creation_date date, id int. Then I ran the query below:
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(id) as Total_ads
FROM default.test
GROUP BY CAST(creation_at_first as DATE)
Unfortunately, I am getting this error:
DatabaseError: Execution failed on sql: SELECT CAST(creation_date as DATE) as ad_creation, COUNT(id) as Total_ads FROM default.testing_fresh_1 GROUP BY CAST(creation_date as DATE)
When I query Select * from...
I get results formatted like this:
creation_date
2018-07-01 02:02:09
2018-06-05 01:39:30
2018-05-16 21:28:48
2017-04-23 17:03:53
Any idea what I am doing wrong?
Judging from your select * result set, I guess there is no id column in your table.
You can try COUNT(*) instead of COUNT(id):
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(*) as Total_ads
FROM default.test
GROUP BY CAST(creation_date as DATE)
Try the code below:
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(id) as Total_ads
FROM default.testing_fresh_1
GROUP BY ad_creation

Athena greater than condition in date column

I have the following query that I am trying to run on Athena.
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > '2017-12-31'
GROUP BY observation_date
However it is producing this error:
SYNTAX_ERROR: line 3:24: '>' cannot be applied to date, varchar(10)
This seems odd to me. Is there an error in my query or is Athena not able to handle greater than operators on date columns?
Thanks!
You need to cast the string literal to a date before making this comparison, because the column is a date and the literal is a varchar. Try the following:
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > CAST('2017-12-31' AS DATE)
GROUP BY observation_date
UPDATE 17/07/2019
To reflect the comments:
SELECT observation_date, COUNT(*) AS count
FROM db.table_name
WHERE observation_date > DATE('2017-12-31')
GROUP BY observation_date
You can also use the date function which is a convenient alias for CAST(x AS date):
SELECT *
FROM date_data
WHERE trading_date >= DATE('2018-07-06');
select *
from my_schema.my_table_name
where date_column = cast('2017-03-29' as DATE)
limit 5
One addition: if you have a date column in ISO-8601 format, for example 2022-08-02T01:46:46.963120Z, you can use the parse_datetime function.
In my case, the query looks like this:
SELECT * FROM internal_alb_logs
WHERE elb_status_code >= 500 AND parse_datetime(time,'yyyy-MM-dd''T''HH:mm:ss.SSSSSS''Z') > parse_datetime('2022-08-01-23:00:00','yyyy-MM-dd-HH:mm:ss')
ORDER BY time DESC
See more examples here: https://docs.aws.amazon.com/athena/latest/ug/application-load-balancer-logs.html#query-alb-logs-examples
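The common thread in these answers is that both sides of a comparison need the same type. As a loose analogy only, the Python sketch below shows the same idea: comparing a date with a plain string fails, while converting the literal first works, and an ISO-8601 string can be parsed before comparing. The observation_date value is hypothetical, and the strptime patterns are Python's notation, not the Joda-style patterns that parse_datetime takes.

```python
from datetime import date, datetime

observation_date = date(2018, 1, 15)  # hypothetical value of a date column

# Comparing a date with a plain string is a type error here too, loosely
# analogous to "'>' cannot be applied to date, varchar(10)".
try:
    observation_date > "2017-12-31"
    mixed_types_ok = True
except TypeError:
    mixed_types_ok = False

# Converting the literal first (the CAST('2017-12-31' AS DATE) step) works.
is_after = observation_date > date.fromisoformat("2017-12-31")

# ISO-8601 text with microseconds and a literal Z, like the ALB log example.
log_time = datetime.strptime("2022-08-02T01:46:46.963120Z", "%Y-%m-%dT%H:%M:%S.%fZ")
cutoff = datetime.strptime("2022-08-01-23:00:00", "%Y-%m-%d-%H:%M:%S")

print(mixed_types_ok, is_after, log_time > cutoff)
```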