EXTRACT vs CAST in SQL - sql

This question is mostly about code optimization.
In SQL, specifically Google BigQuery, there are two ways to transform a timestamp into a date or an hour: using EXTRACT() or using CAST().
There might be more ways to do so, but at least those are the ones I know of currently.
CAST() example:
SELECT
CAST(tb.timestamp_field AS DATE) AS date_field, COUNT(*)
FROM
database.table tb
GROUP BY
CAST(tb.timestamp_field AS DATE)
EXTRACT() example:
SELECT
EXTRACT(DATE FROM tb.timestamp_field) AS date_field, COUNT(*)
FROM
database.table tb
GROUP BY
EXTRACT(DATE FROM tb.timestamp_field)
Both methods work for what I'm trying to do, but I would like to know which one would be considered a "best practice". Or maybe the whole question is silly, like asking which is better, "4+3-2" or "4-2+3", when the two are basically the same.

My two cents -
Cast - preferable, because a lot of other big data tools use a similar syntax, so if you ever have to migrate to another big data platform, you can migrate smoothly.
Also, in your SQL, CAST is a direct operation, so I think it can be faster. I tested this using one record, and this SQL took 0.011 sec.
SELECT cast( TIMESTAMP "2018-12-25 05:30:00" as date)
Extract - The form you are using is not standard SQL. EXTRACT(DATE FROM timestamp_col) is a BigQuery-specific extension, so it is less portable. A good approach is to use what @Mikhail Berlyant mentioned. Your SQL does work, though, so I think that internally the BigQuery engine converts the timestamp to a date and then removes the time part. That would make it a two-part operation that depends heavily on internal conversion, which seems a little unreliable to me. I would also run both of your queries and compare their performance, because performance depends on a lot of factors, such as the environment, the amount of data, and how the table is optimized.
Also, the SQL below took about 0.012 sec (not a great performance indicator, though).
SELECT EXTRACT(DATE FROM TIMESTAMP "2018-12-25 05:30:00")
You can refer to the link below for more on EXTRACT or DATE:
https://cloud.google.com/bigquery/docs/reference/standard-sql/date_functions#extract

Related

Amazon AMC SQL Queries - How to declare a variable

Does anyone know how to declare a date variable in SQL for Amazon Marketing Cloud? The query UI for AMC uses syntax specific to Amazon, and I cannot find this in the documentation or instructional queries.
I'm trying to do something like this, adding a couple date parameters I can then re-use across a few different tables and date fields:
declare start_date date constant cast('2022-6-1' as date)
select
click_date,
sum(clicks) as clicks
from dsp_clicks
where click_date > start_date
group by click_date
I'm learning this database myself and just stumbled upon your question. It's probably too late, and I'm not great at general SQL since this is my first time really learning it for work, but if you're just looking to get a usable or custom date, it should go something like this:
CAST(campaign_start_date AS DATE) > CAST('2022-02-02' AS DATE)
I hope this helps.

SQL Server : best practice query for date manipulation

Long time listener, first time caller.
At work, the date columns for most tables are stored as simple strings (varchar), in formats such as yyyymmdd (e.g. 20220625) or yyyymm (e.g. 202206).
For a lot of time-based queries we need to compare to the current date, or to some fixed offset from the current date.
The two obvious ways I know of to get the current UTC date into either of those formats are the following (using yyyymm as an example):
SELECT LEFT(CONVERT(VARCHAR, GETUTCDATE(), 112), 6) ...
SELECT CONVERT(VARCHAR(6), GETUTCDATE(), 112) ...
I'm wondering if anyone knows of a better way, either idiomatically or performance-wise, to do this conversion. Also, is there anything about the second one to be worried about versus the first, in terms of security, reliability, etc.? The second one definitely satisfies my code golf sensibilities, but not if it comes at the expense of something I'm unaware of.
Also for some extra context the majority of our code runs in SQL Server or T-SQL, BUT we also need to attempt to be as platform agnostic as possible as there are customers on Oracle and/or Mysql.
Any insight/help would be highly appreciated.
There is no problem with either approach. Both work just fine, and which one to choose is a matter of personal preference. The first looks more explicit; the second is shorter and thus maybe easier to read. As to performance: you only need to get the current day or month once per query, so the call doesn't really affect query runtime.
Making this platform agnostic is quite a different story. SQL dialects differ, especially when it comes to date/time handling. You have already noticed that SQL Server's date functions are quite restricted. In Oracle and MySQL you would simply state the format you want (TO_CHAR(SYSDATE, 'YYYYMM') in Oracle and DATE_FORMAT(CURRENT_DATE, '%Y%m') in MySQL). But you can also see that the function calls differ.
Now, you could write a user defined function GET_CURRENT_MONTH_FORMATTED for this which would return the string for the current month, e.g. '202206'. Then you'd have the different codes hidden in that function and the SQL queries would all look the same. The problem, though, is how to tell the DBMS that the function result is deterministic for a particular timestamp? If you run the query on December 31, 2022 at 23:50 and it runs until January 1, 2023 at 0:20, you want the DBMS to call this function only once for the query resulting in '202212' and not being called again, suddenly resulting in another string '202301'. I don't even know whether this is possible. I guess it is not.
I think you cannot write a query that does what you want and looks the same in all mentioned DBMS.
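One possible way around the "called only once" concern, offered as a sketch and not as something this answer proposes: compute the month string once in application code and pass it to the query as a bound parameter. The example uses Python's built-in sqlite3 module purely as a runnable stand-in; the `orders` table, its `period` column, and the helper names are hypothetical.

```python
import sqlite3
from datetime import datetime, timezone


def current_month_key(now=None):
    """Return the UTC month as a 'yyyymm' string, e.g. '202206'."""
    now = now or datetime.now(timezone.utc)
    return now.strftime("%Y%m")


def rows_for_month(conn, month_key):
    # The key is computed before the query starts and bound once,
    # so it cannot change while the query is running.
    cur = conn.execute(
        "SELECT id, period FROM orders WHERE period = ? ORDER BY id",
        (month_key,),
    )
    return cur.fetchall()


def demo():
    # Hypothetical table with dates stored as 'yyyymm' strings.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, period TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "202205"), (2, "202206"), (3, "202206")],
    )
    key = current_month_key(datetime(2022, 6, 25, tzinfo=timezone.utc))
    return rows_for_month(conn, key)
```

With this approach the SQL text stays the same across databases apart from the placeholder style, which varies by driver.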

Where in my query to place the CONVERT to convert DateTime to Date

Just learning SQL and I've searched many options about converting a DateTime into a Date, and I do not want current date. It's a super simple query from this website: https://sqlzoo.net/wiki/Guest_House_Assessment_Easy
SELECT booking_date, nights
FROM booking
WHERE guest_id=1183
But the output is with the timestamp and I just want the date. I've searched so many forums and tried all their suggestions, including this:
SELECT CONVERT(varchar(10), <col>, 101)
So I've done:
SELECT CONVERT(varchar(10), booking_date,101), nights
FROM booking
WHERE guest_id=1183
But I'm getting syntax errors. This is probably so simple and you'll all think me an idiot, but I'd greatly appreciate help. It's driving me nuts.
When I fiddled about at your sqlzoo link I got the error
execute command denied to user 'scott'@'localhost' for routine 'gisq.to_date'
When I googled gisq.to_date I got this link https://sqlzoo.net/wiki/Format_a_date_and_time
Which has examples of how this dialect represents dates. See if you can work it out. Something like this:
SELECT date_format(booking_date,'%d/%m/%Y')
FROM booking
You didn't post the error in your question, which is a big mistake. When you get an error message, you actually have something to work from.
It is also very important to note that the query above returns a string, not a date. It's only good for display, not for date arithmetic.
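To illustrate why a formatted string is not a substitute for a date, here is a small sketch in Python (not in the site's SQL dialect). The `format_dmy` helper is hypothetical and only mimics the '%d/%m/%Y' format used above.

```python
from datetime import date


def format_dmy(d):
    """Format a date as dd/mm/yyyy, like the '%d/%m/%Y' pattern above."""
    return d.strftime("%d/%m/%Y")


dates = [date(2016, 12, 1), date(2016, 1, 15)]
as_text = [format_dmy(d) for d in dates]

# Sorting real dates gives chronological order ...
sorted_dates = sorted(dates)
# ... but sorting the formatted strings compares the day digits first,
# so 1 December ends up before 15 January.
sorted_text = sorted(as_text)
```

The same applies to comparisons and arithmetic: once the value is text, the database compares characters, not points in time.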
TBH that seems like a terrible site to learn on, as it gives no clues about the dialect. It looks like Oracle, but to_date and trunc don't work.
The use of CONVERT() suggests that you think you are using SQL Server. If you only want the date component of a date/time data type, then you can use:
SELECT CONVERT(DATE, booking_date), nights
FROM booking
WHERE guest_id = 1183;
The syntax error suggests that you are not using SQL Server.
CONVERT() is bespoke syntax for SQL Server. Examples of similar functionality in other databases are:
DATE(booking_date)
TRUNC(booking_date)
DATE_TRUNC('day', booking_date)
In addition, what you see also depends on the user-interface.
In your case, the data is being stored as a date with no time component, but the UI is showing the time. For that, you want to convert to a string. That site uses MariaDB -- which is really a flavor of MySQL-- and you would use:
DATE_FORMAT(booking_date, '%Y-%m-%d')

TABLE_DATE_RANGE for xxxx_yyyymm format tables

I'm having a problem trying to query for 15 months worth of data.
I know about bigquery's wildcard functions, but I can't seem to get them to work with my tables.
For example, if my tables are called:
xxxx_201501,
xxxx_201502,
xxxx_201503,
...
xxxx_201606
How can I select everything from 201501 until today (current_timestamp)?
It seems that it's necessary to have the tables per day, am I wrong?
I've also read that you can use regex but can't find the way.
With Standard SQL, you can use a WHERE clause on a _TABLE_SUFFIX pseudo column as described here:
Is there an equivalent of table wildcard functions in BigQuery with standard SQL?
In this particular case, it would be:
SELECT ... FROM `mydataset.xxxx_*` WHERE _TABLE_SUFFIX >= '201501';
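The suffix filter works with monthly tables because zero-padded yyyymm strings sort in chronological order, so a plain string comparison is enough. A minimal sketch of that selection logic in Python, outside BigQuery; the `tables_from` helper is hypothetical:

```python
def tables_from(table_names, prefix, first_suffix):
    """Keep tables named prefix + yyyymm whose suffix is >= first_suffix."""
    selected = []
    for name in table_names:
        if not name.startswith(prefix):
            continue
        suffix = name[len(prefix):]
        # Zero-padded yyyymm strings compare in chronological order,
        # which is what the string comparison on the suffix relies on.
        if len(suffix) == 6 and suffix.isdigit() and suffix >= first_suffix:
            selected.append(name)
    return sorted(selected)


names = ["xxxx_201412", "xxxx_201501", "xxxx_201606", "other_201601"]
picked = tables_from(names, "xxxx_", "201501")
```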
This is a bit long for a comment.
If you are using the standard SQL dialect, then I don't think the functionality is yet implemented.
If you are using the legacy SQL dialect, then you can use a function such as TABLE_DATE_RANGE(). This and other table wildcard functions are well documented.
EDIT:
Oh, I see. The simplest way would be to store the tables as YYYYMM01 so you can use the range query.
But, you can also use table_query():
from table_query(t, 'right(table_id, 6) >= ''201501'' ')

Subtract hours from SQL Server 2012 query result

I am running queries on an alarm system signal automation platform database in SQL Server 2012 Management Studio, and I am running into some hiccups.
My queries run just fine, but I am unable to refine my results to the level that I would like.
I am selecting some columns that are formatted as DATETIME, and I simply want to take the value in the column and subtract 4 hours from it (i.e., from GMT to EST) and then output that value into the query results.
All of the documentation I can find regarding DATESUB() or similar commands shows examples with a specific DATETIME in the syntax, and I don't have anything specific, just 138,000 rows with columns I want to adjust time zones for.
Am I missing something big, or will I just need to continue to adjust manually after I begin manipulating my query result? Also, in case it makes a difference, I have a read-only access level, and am not interested in altering table data in any way.
Well, for starters, you need to know that you aren't restricted to use functions only on static values, you can use them on columns.
It seems that what you want is simply:
SELECT DATEADD(HOUR,-4,YourColumnWithDateTimes)
FROM dbo.YourTable
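The same idea, a date function applied to a column so that every row is shifted in the result while the table stays untouched, can be tried out locally. This is only a sketch using Python's built-in sqlite3 module, whose datetime() modifier syntax differs from SQL Server's DATEADD; the `signals` table and `received_at` column are hypothetical.

```python
import sqlite3


def shifted_times(conn, hours):
    """Return received_at shifted by a whole number of hours, row by row."""
    # The function is applied to the column, so it runs for every row;
    # this is a plain SELECT and nothing is written back to the table.
    cur = conn.execute(
        "SELECT datetime(received_at, ?) FROM signals ORDER BY received_at",
        (f"{hours:+d} hours",),
    )
    return [row[0] for row in cur]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE signals (received_at TEXT)")
conn.executemany(
    "INSERT INTO signals VALUES (?)",
    [("2016-05-10 02:30:00",), ("2016-05-10 12:00:00",)],
)
result = shifted_times(conn, -4)
```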
Maybe it will be helpful
SELECT DATEPART(HOUR, GETDATE());
DATEPART docs from MSDN