Presto SQL count group by date

I have a Presto sql table called "imp_pixel".
Here is a sample of the table:
date_time ip impression_id
2022-08-27 07:05:48 192.0.0.1 001
2022-08-27 07:05:58 192.0.0.12 002
I would like to show the number of impressions (count of impression_id) grouped by hour.
I tried this code:
select
date_trunc('hour', CAST(date_time AS date)) date_time,
COUNT(impression_id,0) AS 'impression_id'
from parquet_db.imp_pixel
group by date_trunc('hour', date)
But I got this error :
line 3:31: mismatched input ''impression_id''. Expecting: <identifier>
Can you help me please to fix this error?
thanks

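Casting date_time to date discards the time of day, so the hourly data is lost; cast to timestamp instead. The parse error itself comes from the single-quoted alias: use an unquoted (or double-quoted) identifier. COUNT also takes a single argument: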
select
date_trunc('hour', CAST(date_time AS timestamp)) date_time,
COUNT(impression_id) AS impression_id
from parquet_db.imp_pixel
group by 1
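To see why the cast target matters, here is a minimal Python sketch of the same idea (an analogy, not Presto itself). It uses the two sample rows from the question; the helper name trunc_hour is made up for illustration. Once a value has been reduced to a date, the hour is gone before any truncation happens.

```python
from datetime import datetime

def trunc_hour(ts: datetime) -> datetime:
    """Hypothetical helper: drop minutes and seconds, keep the hour."""
    return ts.replace(minute=0, second=0, microsecond=0)

# The two sample rows from the question.
rows = ["2022-08-27 07:05:48", "2022-08-27 07:05:58"]
parsed = [datetime.strptime(r, "%Y-%m-%d %H:%M:%S") for r in rows]

# Timestamp path: the hour survives truncation.
by_hour = [trunc_hour(p) for p in parsed]

# Date path: converting to a date first leaves only midnight to truncate.
by_date = [trunc_hour(datetime.combine(p.date(), datetime.min.time())) for p in parsed]

print(by_hour[0])  # 2022-08-27 07:00:00
print(by_date[0])  # 2022-08-27 00:00:00
```

Both rows land in the 07:00 bucket on the timestamp path, while the date path can only ever produce one bucket per day.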

Related

select rows with condition of date presto

I am trying to select the number of impressions per hour for a particular day.
I tried this code:
SELECT
date_trunc('hour', CAST(date_time AS timestamp)) date_time,
COUNT(impression_id) AS count_impression_id
FROM
parquet_db.imp_pixel
WHERE
date_time = '2022-07-27'
LIMIT 100
GROUP BY 1
But I got this error when I added the WHERE clause:
line 5:1: mismatched input 'group'. Expecting:
Can you help me to fix it? thanks
LIMIT usually comes last in a SQL query. Also, you should not be using LIMIT without ORDER BY. Use this version:
SELECT DATE_TRUNC('hour', CAST(date_time AS timestamp)) date_time,
COUNT(impression_id) AS count_impression_id
FROM parquet_db.imp_pixel
WHERE CAST(date_time AS date) = DATE '2022-07-27'
GROUP BY 1
ORDER BY <something>
LIMIT 100;
Note that the ORDER BY clause determines which 100 records you get in the result set. Your current (intended) query lets Presto decide on its own which 100 records get returned.
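If you want to try the clause order without a Presto cluster, a rough sketch using Python's built-in sqlite3 module follows. Everything in it is an assumption for illustration: the rows are made up, and SQLite's strftime and date functions stand in for Presto's DATE_TRUNC and CAST. Only the WHERE, GROUP BY, ORDER BY, LIMIT ordering carries over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE imp_pixel (date_time TEXT, ip TEXT, impression_id TEXT)")

# Made-up sample rows, not data from the question.
conn.executemany(
    "INSERT INTO imp_pixel VALUES (?, ?, ?)",
    [
        ("2022-07-27 07:05:48", "192.0.0.1", "001"),
        ("2022-07-27 07:05:58", "192.0.0.12", "002"),
        ("2022-07-27 09:15:00", "192.0.0.1", "003"),
        ("2022-07-28 07:00:00", "192.0.0.1", "004"),
    ],
)

# SQLite dialect: strftime builds the hour bucket that DATE_TRUNC gives in Presto.
# Clause order is the point: WHERE, then GROUP BY, then ORDER BY, then LIMIT.
query = """
SELECT strftime('%Y-%m-%d %H:00:00', date_time) AS hour_bucket,
       COUNT(impression_id) AS count_impression_id
FROM imp_pixel
WHERE date(date_time) = '2022-07-27'
GROUP BY 1
ORDER BY 1
LIMIT 100
"""
hourly_counts = conn.execute(query).fetchall()
print(hourly_counts)
```

Ordering by the hour bucket makes the result deterministic, so the LIMIT always keeps the earliest hours.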

get count all with groupby timestamp into hourly intervals

I have a hive table that has a timestamp in string format as below,
20190516093836, 20190304125015, 20181115101358
I want to get the row count aggregated into hourly intervals, as below:
date_time count
-----------------------------
2019:05:16: 00:00:00 23
2019:05:16: 01:00:00 64
I followed several links like this but was unable to generate the desired results yet.
This is my final query:
SELECT
DATE_PART('day', b.date_time) AS date_prt,
DATE_PART('hour', b.date_time) AS hour_prt,
COUNT(*)
FROM
(SELECT
from_unixtime(unix_timestamp(`timestamp`, "yyyyMMddHHmmss")) AS date_time
FROM table_name
WHERE from_unixtime(unix_timestamp(`timestamp`, "yyyyMMddHHmmss"))
BETWEEN '2018-12-10 07:02:30' AND '2018-12-12 08:02:30') b
GROUP BY
date_prt, hour_prt
I hope for some guidance from you, thanks in advance
You can extract date_time directly in the required format, 'yyyy-MM-dd HH:00:00'. I prefer using regexp_replace:
SELECT
date_time,
COUNT(*) as `count`
FROM
(SELECT
regexp_replace(`timestamp`, '^(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})$','$1-$2-$3 $4:00:00') AS date_time
FROM table_name
WHERE regexp_replace(`timestamp`, '^(\\d{4})(\\d{2})(\\d{2})(\\d{2})(\\d{2})(\\d{2})$','$1-$2-$3 $4:$5:$6')
BETWEEN '2018-12-10 07:02:30' AND '2018-12-12 08:02:30') b
GROUP BY
date_time
This will also work:
from_unixtime(unix_timestamp('20190516093836', "yyyyMMddHHmmss"),'yyyy-MM-dd HH:00:00') AS date_time
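As a quick sanity check of the regular expression outside Hive, the same pattern can be tried in Python. This is only a sketch: the function name hour_bucket is made up, Python writes group references as \1 where Hive uses $1, and the inputs are the three sample strings from the question.

```python
import re
from collections import Counter

# Same six capture groups as in the Hive regexp_replace call.
PATTERN = re.compile(r"^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})$")

def hour_bucket(ts: str) -> str:
    """Hypothetical helper: rewrite yyyyMMddHHmmss as 'yyyy-MM-dd HH:00:00'."""
    return PATTERN.sub(r"\1-\2-\3 \4:00:00", ts)

samples = ["20190516093836", "20190304125015", "20181115101358"]
buckets = [hour_bucket(s) for s in samples]

# Counting per bucket plays the role of GROUP BY date_time with COUNT(*).
counts = Counter(buckets)
print(buckets)
```

Minutes and seconds are matched but never written back, which is what collapses every row within an hour into the same group key.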

How to group by month in presto SQL

I am trying to group by per month in Presto SQL.
I tried this:
select
date_trunc('month', CAST(date AS date)) date_month,
sum(gross_revenue,0) AS 'monthly_net_revenue'
from gross_revenue_calculator
group by date_trunc('month', date)
This gives me the following error:
Malformed query: line 61:27: mismatched input ''monthly_net_revenue''. Expecting: <identifier>
Expected output:
October: $102.12
November: $90.12
Do not wrap a column alias in single quotes; use either no quotes or double quotes. You can also reference columns by position in GROUP BY, as you already do in your WITH clause:
select
date_trunc('month', CAST(date AS date)) date_month,
sum(gross_revenue) AS monthly_net_revenue
from gross_revenue_calculator
group by 1

CAST a date in Presto to next count

I would like to query JSON files with Athena. I paired creation_date with id because I want to build a heatmap with the month on the Y axis and the day on the X axis, counting the ids in each cell. I created a table with 2 columns:
creation_date date, id int. I then query it with the code below:
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(id) as Total_ads
FROM default.test
GROUP BY CAST(creation_at_first as DATE)
Unfortunately, I am getting this error:
DatabaseError: Execution failed on sql: SELECT CAST(creation_date as DATE) as ad_creation, COUNT(id) as Total_ads FROM default.testing_fresh_1 GROUP BY CAST(creation_date as DATE)
When I query Select * from...
I get results formatted like this:
creation_date
2018-07-01 02:02:09
2018-06-05 01:39:30
2018-05-16 21:28:48
2017-04-23 17:03:53
Any idea what I am doing wrong?
Judging from your select * result set, I guess there is no id column in your table.
You can try COUNT(*) instead of COUNT(id):
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(*) as Total_ads
FROM default.test
GROUP BY CAST(creation_date as DATE)
Try the code below.
SELECT CAST(creation_date as DATE) as ad_creation,
COUNT(id) as Total_ads
FROM default.testing_fresh_1
GROUP BY 1

Hive + Pass previous day date to like clause

I am trying to fetch records from a Hive table based on the previous date.
Example: If table is as follows
CustomerVisit table
ID NAME VISIT_DATE
------------------
01 Harish 2018-02-31
03 Satish 2017-02-13
04 Shiva 2018-03-04
Now I need to fetch all records that have visit_date = 2018-03-04 (i.e. today's date - 1).
Expected Query something like:
select ID, Name from CustomerVisit where
visit_date like concat((select date_sub(current_date, 1)),'%')
I have tried following
select current_date; - Gives current date
select date_sub(current_date, 1) - Gives previous date
select concat(tab1.date1,'%') from
(select date_sub(current_date, 1) as date1) as tab1; -- Gives the previous date appended with % which i need to use in like
But when I use the above as a sub-query like below, it fails with:
select tab2.id, (select concat(tab1.date1,'%') as pattern from
(select date_sub(current_date, 1) as date1) as tab1) from CustomerVisit as tab2 limit 1;
FAILED: ParseException line 1:0 cannot recognize input near 'seelct' 'current_date' ''
How to write query to get results for previous date?
You don't need a LIKE clause. Just select using an equal to (=)
select ID, Name from CustomerVisit where
visit_date = date_sub(current_date, 1);
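To illustrate why plain equality is enough, here is a small Python sketch of the same logic. It assumes visit_date holds a bare 'yyyy-MM-dd' string and that "today" is 2018-03-05, so that yesterday matches the example; the function name previous_day is made up.

```python
from datetime import date, timedelta

def previous_day(today: date) -> str:
    """Hypothetical helper mirroring date_sub(current_date, 1)."""
    return (today - timedelta(days=1)).isoformat()

# Rows from the question's CustomerVisit table, kept as strings.
visits = [
    ("01", "Harish", "2018-02-31"),
    ("03", "Satish", "2017-02-13"),
    ("04", "Shiva", "2018-03-04"),
]

# Assumed "today" so that the previous day is 2018-03-04.
target = previous_day(date(2018, 3, 5))

# Exact match on the date string; no wildcard pattern is needed.
matches = [(vid, name) for vid, name, visit_date in visits if visit_date == target]
print(matches)  # [('04', 'Shiva')]
```

A LIKE pattern with a trailing % would only be needed if visit_date also carried a time part after the date.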