BigQuery: Getting the first day of the week/month in standard SQL

In BigQuery's legacy SQL, I can get the start of the week for a date by using:
SELECT DATE((UTC_USEC_TO_WEEK(TIMESTAMP_TO_USEC(TIMESTAMP('2017-04-13 20:58:06 UTC')), 0)))
which returns 2017-04-09.
Is there a way to do this in BigQuery's standard SQL? There don't seem to be any equivalents for UTC_USEC_TO_WEEK and UTC_USEC_TO_MONTH.

BigQuery's standard SQL has a function named TIMESTAMP_TRUNC that may do what you want. With the DAY date part, it is documented as the replacement for legacy SQL's UTC_USEC_TO_DAY(t). It also accepts WEEK and MONTH as the date part, which may meet your requirements:
TIMESTAMP_TRUNC(TIMESTAMP '2008-12-25 15:30:00', WEEK, 'UTC')
Here is the page for migrating from legacy SQL to standard SQL.

This is a better option that works now:
SELECT DATE_TRUNC(DATE('2008-12-25 15:30:00'), MONTH)
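To make the truncation semantics concrete, here is a minimal Python sketch of what week and month truncation compute. It assumes weeks start on Sunday, which matches the 2017-04-09 result quoted in the question; the function names are hypothetical and not part of BigQuery.

```python
from datetime import date, timedelta


def start_of_week(d: date) -> date:
    """Return the Sunday on or before d (assumes weeks start on Sunday)."""
    # date.weekday() numbers Monday as 0 and Sunday as 6,
    # so shift by one to count the days elapsed since Sunday.
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


def start_of_month(d: date) -> date:
    """Return the first day of the month containing d."""
    return d.replace(day=1)


print(start_of_week(date(2017, 4, 13)))
print(start_of_month(date(2008, 12, 25)))
```

The first call reproduces the date from the question's legacy SQL example; the second mirrors the DATE_TRUNC(..., MONTH) answer.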

Related

How to add a certain number of days based on field value

I'm working on creating a view to convert some values that we get coming in for dates/times. (I've figured out the times already.) In this table, dates come in the format of "Days after 12-31-1840." I'm trying to create a view that shows the actual dates/times rather than in this format. This is what I have so far:
CASE WHEN UPPER(FLWSHT_MEAS_NM) LIKE '%DATE' THEN TO_CHAR(DATE '1840-12-31' + INTERVAL ACUTE_MEASURE_VALUE DAY)
I know that this is the correct syntax for adding dates, because I'm able to get the view working with this instead:
CASE WHEN UPPER(FLWSHT_MEAS_NM) LIKE '%DATE' THEN TO_CHAR(DATE '1840-12-31' + INTERVAL '30' DAY)
My question is, how do I add a specific number of days based on the ACUTE_MEASURE_VALUE field? I'm not able to run the code to get a runtime error as it's coming up as a syntax error.
From the Teradata documentation:
Teradata SQL extends the ANSI SQL:2011 standard to allow the operations of adding or subtracting a number of days from an ANSI DATE value. Teradata SQL treats the number as an INTERVAL DAY value.
I'm assuming your field ACUTE_MEASURE_VALUE is already in your table and is an integer. The words INTERVAL and DAY are part of the syntax for interval constants (literals). ACUTE_MEASURE_VALUE is a column, not a constant, so syntactically you don't use those keywords with it:
...TO_CHAR(DATE '1840-12-31' + ACUTE_MEASURE_VALUE)...
Just drop the INTERVAL and DAY keywords and it should work.
By the way, why are you using TO_CHAR() here? Converting the date into a character string prevents anyone using this view from performing calculations on it. If you leave the column in DATE format, any subsequent SELECT from this view has a lot more flexibility in manipulating the field.
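The same arithmetic can be sketched outside SQL. The Python below is only an illustration of "days after 1840-12-31": the function name is hypothetical, and it assumes the stored value is a whole number of days. It returns a date rather than a string, in line with the advice above about keeping the value usable for further calculations.

```python
from datetime import date, timedelta

# Base date taken from the question: values are "days after 12-31-1840".
BASE_DATE = date(1840, 12, 31)


def from_day_count(days: int) -> date:
    """Convert a day count relative to 1840-12-31 into a calendar date."""
    return BASE_DATE + timedelta(days=days)


print(from_day_count(30))
```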

Spark Scala where date is greater than

I want to create a function to get the last 4 days of data, including today. Here's my function. What am I missing? When I run a test, I get an empty table.
df.where(trunc(col("date"),"day") >= date_add(current_date(),-4))
Try date_trunc instead. trunc works with coarser units such as month and year, not day, which would explain the empty result. Also note that date_trunc takes its arguments in the reverse order of trunc.
df.where(date_trunc("day",col("date")) >= date_add(current_date(),-4))
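The intended filter can be sketched in plain Python, without Spark, to check the boundary behaviour. This assumes the "date" column holds timestamps; the function name and the explicit today parameter are hypothetical and exist only to make the example deterministic. Note that a cutoff of today minus 4 keeps five calendar days (today plus the four before it).

```python
from datetime import date, datetime, timedelta


def within_last_days(ts: datetime, today: date, days_back: int = 4) -> bool:
    """Truncate ts to its calendar day and compare it with the cutoff date."""
    cutoff = today - timedelta(days=days_back)
    return ts.date() >= cutoff


today = date(2020, 3, 10)
rows = [
    datetime(2020, 3, 10, 23, 59),  # today
    datetime(2020, 3, 6, 0, 0),     # exactly on the cutoff day
    datetime(2020, 3, 5, 23, 59),   # one minute before the cutoff day
]
print([within_last_days(r, today) for r in rows])
```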

Is there an easy way to generate a TABLE_QUERY based on a timezone difference in BigQuery?

Legacy SQL
I'm using GBQ's legacy SQL to query tables dynamically using the TABLE_QUERY function. I dynamically generate the table name to query based on CURRENT_TIMESTAMP. For example, I select devices from the past 14 days of hit data in tables that are split by quarter (i.e. mydataset.hit_data_[1-4]).
Standard SQL
I need to convert the timezones to PST. GBQ Standard SQL has TIME ZONE conversions. Switching to Standard SQL, I am able to convert timezones using the GBQ Standard SQL. But if I now try to use a TABLE_QUERY in the same query, to do what I was doing in the Legacy SQL version, I get:
Error: Table-valued functions are not supported
Using both
Is there a way to have the best of both worlds? I would like to query mydataset.hit_data_3 and mydataset.hit_data_4 based on the current timestamp in Q4, if the previous 14 days overlap into Q3.
SELECT
device
FROM
TABLE_QUERY(mydataset, 'table_id = CONCAT(\"hit_data_\", STRING(QUARTER(TIMESTAMP(CURRENT_TIMESTAMP, "America/Los_Angeles")))) OR table_id = CONCAT(\"hit_data_\", STRING(QUARTER(DATE_ADD(TIMESTAMP(CURRENT_TIMESTAMP, "America/Los_Angeles"), INTERVAL -14 DAY)))) ')
WHERE
DATE(date_time) BETWEEN DATE(DATE_ADD(TIMESTAMP(CURRENT_TIMESTAMP, 'America/Los_Angeles'), INTERVAL -14 DAY))
AND DATE(CURRENT_DATE())
;
It looks ugly, but in GBQ it should be valid.
Standard SQL doesn't support the TABLE_QUERY or TABLE_DATE_RANGE functions. Instead, it supports wildcard tables with a special pseudo column, _TABLE_SUFFIX.
You should be able to rewrite your query with a WHERE clause on the _TABLE_SUFFIX pseudo column:
https://cloud.google.com/bigquery/docs/querying-wildcard-tables
With BigQuery standard SQL, you should use the _TABLE_SUFFIX pseudo column, which lets you choose the table(s) to query.
The query below shows the direction to take:
SELECT *
FROM `mydataset.hit_data_*`
WHERE (_TABLE_SUFFIX = STRING(QUARTER(TIMESTAMP(CURRENT_TIMESTAMP, "America/Los_Angeles")))
OR _TABLE_SUFFIX = STRING(QUARTER(DATE_ADD(TIMESTAMP(CURRENT_TIMESTAMP, "America/Los_Angeles"), INTERVAL -14 DAY)))
)
AND DATE(date_time) BETWEEN
DATE(DATE_ADD(TIMESTAMP(CURRENT_TIMESTAMP, 'America/Los_Angeles'), INTERVAL -14 DAY))
AND DATE(CURRENT_DATE())
Note: you need to make sure you are using functions supported by standard SQL.
For example, instead of
QUARTER(TIMESTAMP(...))
you should use
EXTRACT(QUARTER FROM TIMESTAMP(...))
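The suffix selection itself is simple date arithmetic. As a rough Python sketch (function names are hypothetical; it assumes the caller has already converted "now" to a local date in America/Los_Angeles and that tables are named hit_data_1 through hit_data_4), the set of suffixes to match is the quarter of today plus the quarter of the date 14 days earlier:

```python
from datetime import date, timedelta


def quarter(d: date) -> int:
    """Return the calendar quarter (1-4) of d."""
    return (d.month - 1) // 3 + 1


def table_suffixes(today: date, lookback_days: int = 14) -> list:
    """Return the distinct quarter suffixes covering the lookback window."""
    start = today - timedelta(days=lookback_days)
    # A 14-day window spans at most two quarters, so checking
    # the two endpoints is enough.
    return sorted({str(quarter(start)), str(quarter(today))})


print(table_suffixes(date(2017, 10, 5)))
print(table_suffixes(date(2017, 11, 20)))
```

Early in Q4 the window reaches back into Q3, so both suffixes are returned; later in the quarter only one is.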

Cannot use calculated offset in BigQuery's DATE_ADD function

I'm trying to create a custom query in Tableau to use on Google's BigQuery. The goal is to have an offset parameter in Tableau that changes the offsets used in a date-based WHERE clause.
In Tableau it would look like this:
SELECT
DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-1,"MONTH") as month_index,
COUNT(DISTINCT user_id, 1000000) as distinct_count
FROM
[Orders]
WHERE
order_date >= DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-12,"MONTH")
AND
order_date < DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),<Parameters.Offset>-1,"MONTH")
However, BigQuery always returns an error:
Error: DATE_ADD 2nd argument must have INT32 type.
When I try the same query in the BigQuery editor using simple arithmetic it fails with the same error.
SELECT
DATE_ADD(UTC_USEC_TO_MONTH(CURRENT_DATE()),5-3,"MONTH") as month_index,
FROM [Orders]
Any workaround for this? My only option so far is to make multiple offsets in Tableau, it seems.
Thanks for the help!
I acknowledge that this is a hole in the functionality of DATE_ADD. It can be fixed, but it will take some time until the fix is rolled into production.
Here is a possible workaround. It seems to work if the first argument to DATE_ADD is a string. Then you can truncate the result to a month boundary and convert it from a timestamp to a string.
SELECT
FORMAT_UTC_USEC(UTC_USEC_TO_MONTH(DATE_ADD(CURRENT_DATE(),5-3,"MONTH"))) as month_index;
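For reference, the value the query is after (the first day of the current month, shifted by a computed number of months) can be sketched in Python. The function name is hypothetical; the sketch only illustrates the month arithmetic, not BigQuery's DATE_ADD.

```python
from datetime import date


def month_start_with_offset(d: date, offset: int) -> date:
    """Return the first day of d's month, shifted by offset months."""
    # Count months since year 0 so that negative offsets and
    # year boundaries are handled by integer division.
    months = d.year * 12 + (d.month - 1) + offset
    return date(months // 12, months % 12 + 1, 1)


print(month_start_with_offset(date(2015, 3, 17), 5 - 3))
```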

Using DAY(), WEEK(), and YEAR() in one query

I'm using a MySQL query for my task, and I'm interested in using the date and time functions.
Can I use DAY(), WEEK(), and YEAR() in one query?
SELECT Object
FROM table
WHERE DAY(date) BETWEEN 1 AND 7
GROUP BY WEEK(date, 1), YEAR(date)
I want to do this because I'm worried that my program might sometimes fail because of the date settings and not recognize some dates. Please give me your input.
Yes, you can use them all in a single query.
The only disadvantage I can think of is that a condition that applies DAY, WEEK or YEAR to a column won't be able to use an index on that column, assuming one is present.
If you're having issues relating to date formatting, you should get familiar with:
DATE_FORMAT
STR_TO_DATE
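As a loose analogy only (this is Python, not MySQL), the two functions correspond to parsing a string with an explicit format and formatting a date back into a string. The day/month/year input format below is an assumption chosen for illustration.

```python
from datetime import datetime

# Parse with an explicit format, similar in spirit to STR_TO_DATE.
parsed = datetime.strptime("13/04/2017", "%d/%m/%Y").date()

# Format back to a string, similar in spirit to DATE_FORMAT.
formatted = parsed.strftime("%Y-%m-%d")

print(parsed.year, parsed.month, parsed.day)
print(formatted)
```

Stating the format explicitly, rather than relying on locale or server settings, is what avoids a date being misread.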