Check if timestamp is contained in date - google-bigquery

I'm trying to check whether a timestamp falls on the current date, but I can't get it to work.
This is my query:
select
date(timestamp) as event_date,
count(*)
from pixel_logs.full_logs f
where 1=1
where event_date = CUR_DATE()
How can I fix it?

Like Mikhail said, you need to use CURRENT_DATE(). Also, selecting COUNT(*) alongside the date requires you to GROUP BY the date in your example. I do not know how your data is formatted, but here is one way to modify your query:
#standardSQL
WITH
table AS (
SELECT
1494977678 AS timestamp_secs) -- Sample Unix timestamp in seconds (2017-05-16 UTC)
SELECT
event_date,
COUNT(*) as count
FROM (
SELECT
DATE(TIMESTAMP_SECONDS(timestamp_secs)) AS event_date,
CURRENT_DATE()
FROM
table)
WHERE
event_date = CURRENT_DATE()
GROUP BY
event_date;
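Outside BigQuery, the same check can be sketched in Python. This is only an illustration, assuming the column holds Unix seconds as in the sample table above; `is_on_date` and `count_events_on` are hypothetical helper names, and UTC is used because CURRENT_DATE() defaults to UTC when no time zone is passed.

```python
from datetime import date, datetime, timezone
from typing import Iterable, Optional


def is_on_date(timestamp_secs: int, day: Optional[date] = None) -> bool:
    """Return True if the Unix timestamp (seconds) falls on `day` in UTC.

    `day` defaults to the current UTC date, mirroring CURRENT_DATE().
    """
    if day is None:
        day = datetime.now(timezone.utc).date()
    event_date = datetime.fromtimestamp(timestamp_secs, tz=timezone.utc).date()
    return event_date == day


def count_events_on(timestamps: Iterable[int], day: Optional[date] = None) -> int:
    """Count timestamps whose UTC date equals `day` (COUNT(*) with a date filter)."""
    return sum(1 for ts in timestamps if is_on_date(ts, day))


if __name__ == "__main__":
    # 1494977678 is the sample value from the query above.
    print(is_on_date(1494977678, date(2017, 5, 16)))
```

Passing the day explicitly keeps the check deterministic; leaving it out compares against today's UTC date, as the query does.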

Related

SQL count of distinct values over two columns

I have the following query that allows me to aggregate the number of unique sellers/buyers for every single day from the Flipside API:
SELECT
date_trunc('day', block_timestamp) AS date,
COUNT(DISTINCT(seller_address)) AS unique_sellers,
COUNT(DISTINCT(buyer_address)) AS unique_buyers
FROM ethereum.core.ez_nft_sales
GROUP BY date
Now, I've been trying a lot of different things, but I can't for the life of me figure out how it would be possible to get the number of unique active addresses on a given day as I would need to somehow merge the sellers and buyers and then count the unique addresses. I would greatly appreciate any kind of help. Thanks in advance!
This is how I managed to solve the issue by using a separate query for the unique_active and merging them:
WITH
other_values AS (
SELECT
date_trunc('day', block_timestamp) AS date,
COUNT(DISTINCT seller_address) AS unique_sellers,
COUNT(DISTINCT buyer_address) AS unique_buyers
FROM ethereum.core.ez_nft_sales
GROUP BY date
),
unique_addresses AS (
SELECT
date,
COUNT(*) as unique_active
FROM (
SELECT
date_trunc('day', block_timestamp) as date,
seller_address as address
FROM ethereum.core.ez_nft_sales
GROUP BY date, seller_address
UNION
SELECT
date_trunc('day', block_timestamp) as date,
buyer_address as address
FROM ethereum.core.ez_nft_sales
GROUP BY date, buyer_address
)
GROUP BY date
)
SELECT * FROM other_values
LEFT JOIN unique_addresses
ON other_values.date = unique_addresses.date
ORDER BY other_values.date DESC
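The idea behind the UNION (pool sellers and buyers per day, then count distinct addresses) can be sketched in Python as a sanity check on small samples. This is a minimal sketch, not the Flipside API: the `(day, seller, buyer)` tuple layout and the name `unique_active_per_day` are assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, Set, Tuple

# Hypothetical row layout: (day, seller_address, buyer_address).
Sale = Tuple[date, str, str]


def unique_active_per_day(sales: Iterable[Sale]) -> Dict[date, int]:
    """Count distinct addresses per day, whether they sold or bought."""
    active: Dict[date, Set[str]] = defaultdict(set)
    for day, seller, buyer in sales:
        # A set removes duplicates, like UNION followed by COUNT(*).
        active[day].update((seller, buyer))
    return {day: len(addresses) for day, addresses in active.items()}


if __name__ == "__main__":
    sample = [
        (date(2022, 1, 1), "0xa", "0xb"),
        (date(2022, 1, 1), "0xb", "0xc"),
    ]
    print(unique_active_per_day(sample))
```

An address that both sells and buys on the same day is counted once, which is the behaviour the UNION gives in the query.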

DATE_TRUNC with :: and without

The query only works when there is :: DATE.
-- Wrap the query you wrote in a CTE named reg_dates
WITH reg_dates AS (
SELECT
user_id,
MIN(order_date) AS reg_date
FROM orders
GROUP BY user_id)
SELECT
-- Count the unique user IDs by registration month
DATE_TRUNC('month', reg_date) :: DATE AS delivr_month,
COUNT(DISTINCT user_id) AS regs
FROM reg_dates
GROUP BY delivr_month
ORDER BY delivr_month ASC;
Why is that required? When I run the query below, without :: DATE, it does not work.
-- Wrap the query you wrote in a CTE named reg_dates
WITH reg_dates AS (
SELECT
user_id,
MIN(order_date) AS reg_date
FROM orders
GROUP BY user_id)
SELECT
-- Count the unique user IDs by registration month
DATE_TRUNC('month', reg_date) AS delivr_month,
COUNT(DISTINCT user_id) AS regs
FROM reg_dates
GROUP BY delivr_month
ORDER BY delivr_month ASC;
Your RDBMS is most likely PostgreSQL. There, :: is the cast operator: in your query it converts the result of DATE_TRUNC to the date type. It is shorthand for CAST(expression AS type).
Equivalently:
CAST (DATE_TRUNC('month', reg_date) AS DATE) AS delivr_month
What does "does not work" mean? Note that date_trunc in PostgreSQL returns a timestamp, not a date. So if your query needs a date to work, that is why you need ::date.
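The distinction between a truncated timestamp and a date can be illustrated with a rough Python analogy. This is not PostgreSQL itself; `trunc_month` and `trunc_month_as_date` are hypothetical names standing in for DATE_TRUNC('month', ...) with and without the cast.

```python
from datetime import date, datetime


def trunc_month(ts: datetime) -> datetime:
    """Analogue of DATE_TRUNC('month', ts): still a timestamp, time set to midnight."""
    return ts.replace(day=1, hour=0, minute=0, second=0, microsecond=0)


def trunc_month_as_date(ts: datetime) -> date:
    """Analogue of DATE_TRUNC('month', ts)::DATE: a plain calendar date."""
    return trunc_month(ts).date()


if __name__ == "__main__":
    order_ts = datetime(2018, 6, 15, 13, 45)
    print(repr(trunc_month(order_ts)))          # a datetime value
    print(repr(trunc_month_as_date(order_ts)))  # a date value
```

Both values name the same month, but they have different types, which matters when the result is compared against or checked as a date.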

Select latest 30 dates for each unique ID

This is a sample data file
The data contains unique IDs with different latitudes and longitudes at multiple timestamps. I would like to select the rows for the latest 30 days of coordinates for each unique ID. Please help me with how to write the query. This data is in a Hive table.
Regards,
Akshay
According to your example above (where there are no current-year dates for id=2,3), you can number the dates for each id (ordered by date descending) using the window function ROW_NUMBER(), then keep the latest 30 values:
--get all rows for each id where num<=30 (the latest 30 rows for each id)
select * from
(
--numbering each date for each id order by descending
select *, row_number()over(partition by ID order by DATE desc)num from Table
)X
where num<=30
If you need only unique dates (ignoring the time of day) for each id, you can try this query:
select * from
(
--numbering date for each id
select *, row_number()over(partition by ID order by new_date desc)num
from
(
-- remove duplicates using distinct
select distinct ID,cast(DATE as date)new_date from Table
)X
)Y
where num<=30
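The logic of that query (deduplicate dates per ID, sort descending, keep the first 30) can be sketched in Python for checking results on a small extract. The `(id, timestamp)` row layout and the name `latest_dates_per_id` are assumptions for illustration only.

```python
from collections import defaultdict
from datetime import date, datetime
from typing import Dict, Iterable, List, Set, Tuple


def latest_dates_per_id(
    rows: Iterable[Tuple[int, datetime]], limit: int = 30
) -> Dict[int, List[date]]:
    """Return up to `limit` most recent distinct dates for each ID."""
    dates_by_id: Dict[int, Set[date]] = defaultdict(set)
    for row_id, ts in rows:
        # Dropping the time of day mirrors cast(DATE as date) plus DISTINCT.
        dates_by_id[row_id].add(ts.date())
    # Newest first, then cut off after `limit`, like row_number() <= 30.
    return {
        row_id: sorted(dates, reverse=True)[:limit]
        for row_id, dates in dates_by_id.items()
    }


if __name__ == "__main__":
    sample = [
        (1, datetime(2017, 3, 1, 8, 0)),
        (1, datetime(2017, 3, 1, 17, 30)),
        (1, datetime(2017, 2, 27, 9, 15)),
    ]
    print(latest_dates_per_id(sample, limit=30))
```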
In Oracle this will be:
SELECT * FROM TEST_DATE1
WHERE DATEUPDT > SYSDATE - 30;
In SQL Server, the equivalent filter is:
select * from MyTable
where
[Date]>=dateadd(d, -30, getdate());
To group by ID and aggregate, use something like this:
select ID,
count(*) row_count,
max(Latitude) max_lat,
max(Longitude) max_long
from MyTable
where
[Date]>=dateadd(d, -30, getdate())
group by ID;

BigQuery Cross Join Failing

I'm trying to pull user activity by date. I am trying to build a table of every day since a user account was created, using a cross join and a where clause. In my case, the cross join cannot be avoided. The calendar table is just a list of all dates for the last 365 days (365 rows). The user table has ~1b rows.
Here is the query that fails with insufficient resources:
SELECT
u.user_id as user_id,
date(u.created) as signup_date,
cal.date as date,
from (select date(dt) as date from [dw.calendar] where date(dt) <
CURRENT_DATE() ) cal
cross join each dw.user u
where
date(u.created) <= cal.date
Based on https://cloud.google.com/bigquery/query-reference, cross joins do not even support the "each" clause. How do I perform the above operation to successfully create a table?
You do not need to fill in the "empty" days just to calculate a daily count and then apply a window function for the aggregated sum, so you don't even need a calendar table. To make this work, use RANGE rather than ROWS in your window frame. See the example below (for BigQuery Standard SQL):
#standardSQL
SELECT
user_id, created, daily_count,
SUM(daily_count) OVER(
PARTITION BY user_id ORDER BY created_unix_date DESC
RANGE BETWEEN CURRENT ROW AND 6 FOLLOWING
) weekly_avg
FROM `dw.user`, UNNEST([UNIX_DATE(created)]) AS created_unix_date
ORDER BY user_id, created DESC
I am not sure about the exact schema and types of your table, so you might need to adjust the above accordingly, but in the meantime you can test and play with the dummy data below:
#standardSQL
WITH `dw.user` AS (
SELECT
day AS created,
CAST(1 + 10 * RAND() AS INT64) AS user_id,
CAST(100 * RAND() AS INT64) AS daily_count
FROM UNNEST(GENERATE_DATE_ARRAY('2017-01-01', '2017-04-26')) AS day
)
SELECT
user_id, created, daily_count,
SUM(daily_count) OVER(
PARTITION BY user_id ORDER BY created_unix_date DESC
RANGE BETWEEN CURRENT ROW AND 6 FOLLOWING
) weekly_avg
FROM `dw.user`, UNNEST([UNIX_DATE(created)]) AS created_unix_date
ORDER BY user_id, created DESC
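What the RANGE frame computes can be sketched in plain Python: for each row, sum `daily_count` over the same user's rows dated within the 6 days before it, inclusive, regardless of how many rows that is. This is a simple quadratic illustration, not how BigQuery evaluates the window; `rolling_7_day_sum` is a hypothetical name. Note that the column aliased weekly_avg in the query is computed with SUM, so it is a 7-day total.

```python
from datetime import date, timedelta
from typing import List, Tuple

# (user_id, created, daily_count), matching the columns used above.
Row = Tuple[int, date, int]


def rolling_7_day_sum(rows: List[Row]) -> List[Tuple[int, date, int, int]]:
    """Append a 7-day trailing sum per user, based on dates rather than row counts."""
    result = []
    for user_id, created, daily_count in rows:
        window_start = created - timedelta(days=6)
        # RANGE semantics: include every row whose date lies in the interval,
        # so missing days simply contribute nothing.
        total = sum(
            count
            for other_user, other_created, count in rows
            if other_user == user_id and window_start <= other_created <= created
        )
        result.append((user_id, created, daily_count, total))
    return result


if __name__ == "__main__":
    sample = [(1, date(2017, 1, 1), 10), (1, date(2017, 1, 3), 5)]
    for row in rolling_7_day_sum(sample):
        print(row)
```

A ROWS frame would instead take a fixed number of neighbouring rows, which gives wrong totals as soon as a day is missing; that is why no calendar table is needed with RANGE.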

Compare timestamps stored as strings to a string formatted date

event_date contains timestamps stored as strings.
1382623200
1382682600
1384248600
...
How can I SELECT rows where event_date is less than a string formatted date? This is my best attempt:
SELECT *
FROM [analytics:workspace.events]
WHERE TIMESTAMP(event_date) < PARSE_UTC_USEC("2013-05-02 09:09:29");
I get all rows regardless of what date I pass to PARSE_UTC_USEC()
It looks like the event_date strings represent Unix seconds. Try this using standard SQL (uncheck "Use Legacy SQL" under "Show Options"):
WITH T AS (
SELECT x, event_date
FROM UNNEST(['1382623200',
'1382682600',
'1384248600']) AS event_date WITH OFFSET x
)
SELECT *
FROM (
SELECT * REPLACE (TIMESTAMP_SECONDS(CAST(event_date AS INT64)) AS event_date)
FROM T
)
WHERE event_date < '2013-05-02 09:09:29';
The subquery converts the event_date string into a timestamp using the REPLACE clause.
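The same conversion can be checked in Python on the sample values. This is a minimal sketch assuming, as the answer does, that the strings are Unix seconds in UTC; `events_before` is a hypothetical helper name. All three sample values fall in October and November 2013, so a cutoff of 2013-05-02 correctly matches none of them.

```python
from datetime import datetime, timezone
from typing import Iterable, List


def events_before(event_dates: Iterable[str], cutoff: str) -> List[str]:
    """Return the raw strings whose Unix-seconds value is before `cutoff` (UTC)."""
    cutoff_ts = datetime.strptime(cutoff, "%Y-%m-%d %H:%M:%S").replace(
        tzinfo=timezone.utc
    )
    return [
        raw
        for raw in event_dates
        # Cast the string to an integer first, like CAST(event_date AS INT64).
        if datetime.fromtimestamp(int(raw), tz=timezone.utc) < cutoff_ts
    ]


if __name__ == "__main__":
    sample = ["1382623200", "1382682600", "1384248600"]
    print(events_before(sample, "2013-10-25 07:30:00"))
```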
Try the query below:
SELECT event_date, TIMESTAMP(event_date) as ts
FROM -- [analytics:workspace.events]
(
SELECT event_date FROM
(SELECT '1382623200' AS event_date),
(SELECT '1382682600' AS event_date),
(SELECT '1384248600' AS event_date)
)
WHERE TIMESTAMP(event_date) < PARSE_UTC_USEC("2013-10-25 07:30:00")
The above is just an example. With your real table, the query becomes:
SELECT event_date, TIMESTAMP(event_date) as ts
FROM [analytics:workspace.events]
WHERE TIMESTAMP(event_date) < PARSE_UTC_USEC("2013-10-25 07:30:00")