Why is BigQuery converting some dates to timestamps but not others? - sql

Our SQL query shown below is converting strings to timestamp fields but failing on some dates and not on others. What is causing this conversion to fail?
SELECT birthdate, TIMESTAMP(REGEXP_REPLACE(birthdate, r'(..)/(..)/(....)', r'\3-\2-\1')) ts
FROM [our_project:our_table] LIMIT 1000
Here are the results. Notice that BigQuery is giving "null" for many of the dates. Why is the regex failing? Is there something to add to make it more robust?
Here is a second conversion query we tried.
SELECT birthdate, TIMESTAMP(year + '-' + month + '-' + day) as output_timestamp
FROM (
SELECT
birthdate,
REGEXP_EXTRACT(birthdate, '.*/([0-9]{4})$') as year,
REGEXP_EXTRACT(birthdate, '^([0-9]{2}).*') as day,
REGEXP_EXTRACT(birthdate, '.*/([0-9]{2})/.*') AS month
FROM
[our_project:our_table]
)
LIMIT 1000
Notice that nulls appeared in these results as well.
How might we fix what is going wrong?

Is there a reason you're not using the supported TIMESTAMP data type?
From the docs:
You can describe TIMESTAMP data types as either UNIX timestamps or calendar datetimes.
Datetimes need to be in a specific format:
A date and time string in the format YYYY-MM-DD HH:MM:SS. The UTC and Z specifiers are supported.
This would also make it easier to query this particular column, as it would allow you to use BigQuery's built-in date and time functions such as HOUR, DAYOFWEEK, and DAYOFYEAR.
Here's an example query using one of BQ's public datasets to find the most popular pickup hour using a timestamp field:
SELECT
HOUR(pickup_datetime) as pickup_hour,
COUNT(*) as pickup_count
FROM
[nyc-tlc:green.trips_2014]
GROUP BY
1
ORDER BY
pickup_count DESC
will yield:
Row  pickup_hour  pickup_count
1    19           1059068
2    18           1051326
3    20           985664
4    17           957583
5    21           938378
6    22           908296

It turns out that the month and the day were swapped (international versus U.S. ordering). As a result, some values fell outside the valid ranges for a timestamp. Once we swapped the day and the month, the conversions worked without problems.
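For anyone hitting the same thing, here is a minimal sketch of the failure mode in Python rather than BigQuery. It applies the same regex rewrite as the first query and then tries to parse the result as YYYY-MM-DD. The helper name to_timestamp, the two replacement constants, and the sample birthdates are made up for illustration; only the regex pattern comes from the question.

```python
import re
from datetime import datetime

def to_timestamp(text, replacement):
    """Rewrite NN/NN/YYYY with a regex, then parse the result as YYYY-MM-DD."""
    iso = re.sub(r'(..)/(..)/(....)', replacement, text)
    try:
        return datetime.strptime(iso, '%Y-%m-%d')
    except ValueError:
        return None  # plays the role of the null seen in the query results

DAY_FIRST = r'\3-\2-\1'    # replacement used in the question: assumes DD/MM/YYYY
MONTH_FIRST = r'\3-\1-\2'  # swapped replacement: assumes MM/DD/YYYY

# Made-up sample values, not taken from the original table.
for birthdate in ['04/07/1985', '12/25/1990']:
    print(birthdate,
          to_timestamp(birthdate, DAY_FIRST),
          to_timestamp(birthdate, MONTH_FIRST))
```

Note that a value such as 04/07/1985 parses under either ordering, just with a different meaning, so only rows whose second number is above 12 show up as nulls.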

If your data has custom timestamp formatting, you can always use the PARSE_TIMESTAMP function in Standard (non-legacy) SQL - https://cloud.google.com/bigquery/sql-reference/functions-and-operators#parse_timestamp
For example, both of the following queries
select parse_timestamp("%Y-%d-%m", x) from
unnest(["2016-31-12", "1999-01-02"]) x
select parse_timestamp("%Y-%b-%d", x) from
unnest(["2016-Dec-31", "1999-Feb-01"]) x
result in
f0_
1 2016-12-31 00:00:00 UTC
2 1999-02-01 00:00:00 UTC
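The same format elements can be tried out locally with Python's strptime, which uses the same %Y, %d, %m and %b directive names. This is only a rough analogue and not BigQuery itself. The wrapper name parse_timestamp below is a hypothetical helper, and %b assumes an English (or default C) locale for the month abbreviations.

```python
from datetime import datetime, timezone

def parse_timestamp(fmt, text):
    # Hypothetical helper: parse with an explicit format and label the result
    # as UTC, to resemble the output shown above.
    return datetime.strptime(text, fmt).replace(tzinfo=timezone.utc)

# Year-day-month ordering.
day_then_month = [parse_timestamp('%Y-%d-%m', x)
                  for x in ['2016-31-12', '1999-01-02']]

# Abbreviated month names.
month_names = [parse_timestamp('%Y-%b-%d', x)
               for x in ['2016-Dec-31', '1999-Feb-01']]

for ts in day_then_month + month_names:
    print(ts)
```

Both lists hold the same two timestamps, which is the point: the format string, not the input layout, decides which number is read as the month.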

Related

getting sql error:hour must be between 1 and 12

There is a problem with a query I use for reporting. I get an error when comparing a value stored as a timestamp with the data saved yesterday.
query:
SELECT * FROM PIECE P, PIECE_ATTRB PA WHERE P.PIECE_NUM_ID=PA.PIECE_NUM_ID
AND PA.ATTRB_CODE='PRODUCTION_CUT_DATE'
AND PA.ATTRB_AN_VALUE >=cast(TRUNC(SYSDATE-1)+ INTERVAL '00:00:00' HOUR TO SECOND AS timestamp)
AND pa.ATTRB_AN_VALUE < CAST(TRUNC(SYSDATE)+ INTERVAL '00:00:00' HOUR TO SECOND AS timestamp)
Sample value for pa.attrb_an_value : 03-FEB-21 23:43:26,000000
But I get the following error.
hour must be between 1 and 12
You can first convert the value into a timestamp. Instead of ATTRB_AN_VALUE, please use
to_timestamp(substr(ATTRB_AN_VALUE,1,18),'DD.MM.YYYY HH24:MI:SSFF3')
This will convert the value into 03-FEB-21 11.43.26.000000 PM and it will eliminate the error.
Since the column attrb_an_value is not a DATE or TIMESTAMP but a VARCHAR2, you cannot compare it to a date without some casting. The TO_TIMESTAMP function will take a string and convert that to a timestamp value with a given format mask.
SELECT
*
FROM
piece p,
piece_attrb pa
WHERE
p.piece_num_id = pa.piece_num_id AND
pa.attrb_code = 'PRODUCTION_CUT_DATE' AND
TO_TIMESTAMP(pa.attrb_an_value,'DD-MON-YY HH24:MI:SS,FF6') >= TRUNC(systimestamp,'DD') - INTERVAL '1' DAY AND
TO_TIMESTAMP(pa.attrb_an_value,'DD-MON-YY HH24:MI:SS,FF6') < TRUNC(systimestamp,'DD')
Note 1: This will fail as soon as a row does not contain a string matching the DD-MON-YY HH24:MI:SS,FF6 format mask.
Note 2: As others pointed out, this is a serious design flaw. No date or timestamp data should be stored in VARCHAR2 columns.
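To illustrate why the hour element of the format mask matters, here is a small sketch using Python's strptime instead of Oracle. %I is a 12-hour field and %H is a 24-hour field, roughly analogous to HH and HH24 in an Oracle mask. The helper name parse is made up; the sample string is the one from the question.

```python
from datetime import datetime

SAMPLE = '03-FEB-21 23:43:26,000000'  # sample value from the question

def parse(fmt):
    """Return the parsed datetime, or None when the string does not fit fmt."""
    try:
        return datetime.strptime(SAMPLE, fmt)
    except ValueError:
        return None

# %I only accepts hours 01-12, so 23 cannot be parsed with it.
twelve_hour = parse('%d-%b-%y %I:%M:%S,%f')

# %H accepts hours 00-23, so the same string parses cleanly.
twenty_four_hour = parse('%d-%b-%y %H:%M:%S,%f')

print(twelve_hour, twenty_four_hour)
```

The same string is rejected or accepted purely on the choice of hour field, which is why an explicit mask with a 24-hour hour element avoids relying on whatever default format the session happens to use.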
I think your problem is about formatting the date. Here's the correct formatting. Also, I assumed that you wanted a result set containing the rows whose ATTRB_AN_VALUE falls between the beginning of yesterday and the beginning of today, so the answer uses a simplified version of the date comparisons.
SELECT * FROM PIECE P, PIECE_ATTRB PA WHERE P.PIECE_NUM_ID=PA.PIECE_NUM_ID
AND PA.ATTRB_CODE='PRODUCTION_CUT_DATE'
AND to_timestamp(PA.ATTRB_AN_VALUE,'DD-MON-RR HH24:MI:SS,FF') >=to_timestamp(trunc(sysdate-1))
AND to_timestamp(pa.ATTRB_AN_VALUE,'DD-MON-RR HH24:MI:SS,FF') < to_timestamp(trunc(sysdate));

SQLite Date Adjustment Incorrect

I am trying to convert my date (15768) to a normal format within SQLite Studio....
I have the following formula that works but it's giving me the incorrect end result (it puts it in 1967 rather than the mid-2010's)
DATETIME(ReportDate,'unixepoch','localtime') ReportDate
Is it also possible to convert this to just the date, not time?
It might be the case that your 5 digit dates represent the number of days since 1970-01-01.
So you can try:
SELECT DATE('1970-01-01', ReportDate || ' day') ReportDate
FROM tablename
Result:
ReportDate
----------
2013-03-04
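The days-since-1970 reading can be double-checked outside SQLite Studio. The sketch below does the arithmetic once with Python's datetime module and once by running the same DATE expression through Python's built-in sqlite3 module against an in-memory database. The variable names are made up; 15768 is the value from the question.

```python
import sqlite3
from datetime import date, timedelta

report_date = 15768  # value from the question

# Interpret the number as whole days since 1970-01-01.
as_days = date(1970, 1, 1) + timedelta(days=report_date)

# Same calculation, done by SQLite itself in an in-memory database.
conn = sqlite3.connect(':memory:')
(sqlite_result,) = conn.execute(
    "SELECT DATE('1970-01-01', ? || ' day')", (report_date,)
).fetchone()
conn.close()

print(as_days, sqlite_result)
```

Since DATE() returns only the date part, this also covers the "just the date, not time" part of the question.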

Why do I get an incompatible value type for my column?

I am trying to calculate the difference between two dates in an oracle database using a JDBC connection. I followed the advice from this question using a query like this:
SELECT CREATE_DATE - CLOSED
FROM TRANSACTIONS;
and I get the following error:
Incompatible value type specified for
column:CREATE_DATE-CLOSED. Column Type = 11 and Value Type =
8.[10176] Error Code: 10176
What should I change so I can successfully calculate the difference between the dates?
note: CREATE_DATE and CLOSED both have TIMESTAMP type
The answer you found is related to date datatypes, but you are dealing with timestamps. While subtracting two Oracle dates returns a number, subtracting timestamps produces an interval datatype. This is probably not what you want, and, apparently, your driver does not properly handle this datatype.
For this use case one solution is to cast the timestamps to dates before subtracting them:
select cast(create_date as date) - cast(closed as date) from transactions;
As was mentioned, it seems that JDBC cannot work with the INTERVAL datatype. What about using the EXTRACT function to turn the interval into a number? To get the seconds field of the difference between those two timestamps (only the seconds component, not the total elapsed seconds), it would be:
SELECT EXTRACT(SECOND FROM (CREATE_DATE - CLOSED)) FROM TRANSACTIONS;
Here is the list of fields which can be used instead of SECOND:
https://docs.oracle.com/database/121/SQLRF/functions067.htm#SQLRF00639
When we subtract one date from another Oracle gives us the difference as a number: it's straightforward arithmetic. But when we subtract one timestamp from another, which is what you're doing, the result is an INTERVAL. Older versions of JDBC don't like the INTERVAL datatype (docs).
Here are a couple of workarounds, depending on what you want to do with the result. The first is to calculate the number of seconds from the interval result. extract (second from ...) only gives us the seconds field of the interval. This will be fine provided none of your intervals is longer than fifty-nine seconds. Longer intervals require us to extract the minutes, hours and even days as well. So that solution would be:
select t.*
, extract (day from (t.closed - t.create_date)) * 86400
+ extract (hour from (t.closed - t.create_date)) * 3600
+ extract (minute from (t.closed - t.create_date)) * 60
+ extract (second from (t.closed - t.create_date)) as no_of_secs
from transactions t
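As a sanity check on the arithmetic, here is the same field-by-field sum in Python, using a timedelta as a stand-in for the Oracle interval. A day contributes 24 * 60 * 60 = 86400 seconds. The interval value is the one that appears as an example string later in this answer; the variable names are made up.

```python
from datetime import timedelta

# Stand-in for an interval of 1 day, 04:40:59.710000.
interval = timedelta(days=1, hours=4, minutes=40, seconds=59, microseconds=710000)

# Split the interval into the fields that EXTRACT would return.
days = interval.days
hours, remainder = divmod(interval.seconds, 3600)
minutes, seconds = divmod(remainder, 60)
seconds += interval.microseconds / 1_000_000  # fractional seconds stay in the seconds field

# Weighted sum of the fields, mirroring the query above.
no_of_secs = days * 86400 + hours * 3600 + minutes * 60 + seconds

print(no_of_secs, interval.total_seconds())
```

The weighted sum and total_seconds() agree, whereas the seconds field alone would report only the 59.71 part of the interval.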
A second solution is to follow the advice in the JDBC mapping guide and turn the interval into a string:
select t.*
, cast ((t.closed - t.create_date) as varchar2(128 char)) as intrvl_str
from transactions t
The format of a string interval is verbose: INTERVAL'+000000001 04:40:59.710000'DAY(9)TO SECOND. This may not be useful on the Java side of the application. But with regex we can turn it into a string which can be converted into a Java 8 Duration object (docs): PnDTnHnMn.nS.
select t.id
, regexp_replace(cast ((t.closed - t.create_date) as varchar2(128 char))
, 'INTERVAL''\+([0-9]+) ([0-9]{2}):([0-9]{2}):([0-9]{2})\.([0-9]+)''DAY\(9\)TO SECOND'
, 'P\1DT\2H\3M\4.\5S')
as duration
from transactions t
There is a demo on db<>fiddle
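The regex can also be tried offline. The sketch below runs the same pattern and replacement through Python's re module, whose syntax for this pattern matches closely enough to serve as a check. The function name to_duration is made up, and the negative input is a hypothetical string that assumes only the sign differs.

```python
import re

# Same pattern and replacement as in the query above.
PATTERN = (r"INTERVAL'\+([0-9]+) ([0-9]{2}):([0-9]{2}):([0-9]{2})\.([0-9]+)'"
           r"DAY\(9\)TO SECOND")
REPLACEMENT = r"P\1DT\2H\3M\4.\5S"

def to_duration(interval_str):
    """Rewrite the verbose interval string into the PnDTnHnMn.nS shape."""
    return re.sub(PATTERN, REPLACEMENT, interval_str)

positive = to_duration("INTERVAL'+000000001 04:40:59.710000'DAY(9)TO SECOND")

# Hypothetical negative interval: the pattern requires a literal '+',
# so this input does not match and comes back unchanged.
negative = to_duration("INTERVAL'-000000001 04:40:59.710000'DAY(9)TO SECOND")

print(positive)
print(negative)
```

If closed can be earlier than create_date, the pattern would need a separate branch for the sign, because a non-matching string passes through regexp_replace untouched.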

How to take the difference between 2 dates of different formats in SQL

I have a table with a LOAD_STRT_DTM column. This is a date column and the values look like this - 18-JUL-14 08.20.34.000000000 AM.
I want to find the data which came before 5 days.
My logic is -
Select * from Table where 24 *(To_DATE(Sysdate,'DD-MM-YY') - To_DATE(LOAD_STRT_DTM,'DD-MM-YY')) >120
The issue is -
Select (To_DATE(Sysdate,'DD-MM-YY') - To_DATE(LOAD_STRT_DTM,'DD-MM-YY')) from table
This query should give the number of days between the two dates, but it is not working. I suspect the issue is the format of the LOAD_STRT_DTM column.
Please let me know where I am going wrong.
If your column is DATE datatype everything is ok, just shoot an:
select * from table where LOAD_STRT_DTM > sysdate - 5;
No need to convert dates to DATE datatype.
(To_DATE(Sysdate,'DD-MM-YY') - To_DATE(LOAD_STRT_DTM,'DD-MM-YY'))
You don't have to convert a DATE into a DATE again. It is already a DATE. You just need to use it for date calculations. You use TO_DATE to convert a STRING into a DATE.
For example, if you have a string value like '18-JUL-14', then you would need to convert it into date using TO_DATE. Since your column is DATE data type, you just need to use as it is.
This is a date column
I want to find the data which came before 5 days.
Simply use the filter predicate as:
WHERE load_strt_dtm > SYSDATE - 5;
NOTE: SYSDATE has both date and time elements, so it will filter based on the time too. If you want to use only the date part in the filter criteria, then you could use TRUNC. It would truncate the time element.
I have answered a similar question, have a look at this https://stackoverflow.com/a/29005418/3989608
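To make the note about TRUNC concrete, here is a small Python sketch of the two cutoffs. The now value is a made-up stand-in for SYSDATE, and resetting the time to midnight plays the role of TRUNC. The load_strt_dtm value is the sample from the question.

```python
from datetime import datetime, timedelta

now = datetime(2014, 7, 23, 15, 30, 0)  # made-up stand-in for SYSDATE

# Cutoff that keeps the time of day: SYSDATE - 5
cutoff_with_time = now - timedelta(days=5)

# Cutoff at midnight: TRUNC(SYSDATE) - 5
midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
cutoff_midnight = midnight - timedelta(days=5)

load_strt_dtm = datetime(2014, 7, 18, 8, 20, 34)  # sample value from the question

print(load_strt_dtm > cutoff_with_time)  # row compared against 15:30 on the 18th
print(load_strt_dtm > cutoff_midnight)   # row compared against 00:00 on the 18th
```

With these values the row is excluded by the first comparison and included by the second, so the choice depends on whether "5 days" means 120 hours or five calendar days.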
It looks like LOAD_STRT_DTM is a TIMESTAMP rather than a DATE, given the number of decimal places following the seconds. The only thing you have to be cautious about is that Oracle will convert a DATE to a TIMESTAMP implicitly where one of the operands is a TIMESTAMP. So the solution
WHERE load_strt_dtm > SYSDATE - 5
will work; as will
WHERE load_strt_dtm + 5 > SYSDATE
but the following will not:
WHERE SYSDATE - load_strt_dtm < 5
the reason being that TIMESTAMP arithmetic produces an INTERVAL rather than a NUMBER.
First convert the two dates to the same format:
select datediff(dd,convert(varchar(20),'2015-01-01',112),convert(varchar(20),'01-10-2015',112))

check for dates syntax - teradata SQL

I am trying to check for dates but after running the query below, it displays no result. Could someone recommend me the correct syntax?
SELECT TOP 10 * FROM MY_DATABASE.AGREEMENT
WHERE end_dt=12/31/9999
12/31/9999 might look like a date to you, but for the database it's a calculation:
12 divided by 31 divided by 9999, and because this involves INTEGER division, the result is the INTEGER 0.
So in the end you compare a DATE to an INT, and this results in typecasting the DATE to an INT.
The only reliable way to write a date literal in Teradata is DATE followed by a string with a YYYY-MM-DD format:
DATE '9999-12-31'
Similar for TIME '12:34:56.1' and TIMESTAMP '2014-08-20 12:34:56.1'
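The arithmetic is easy to reproduce. The sketch below uses Python's integer division operator as a stand-in for Teradata's INTEGER division. The function teradata_date_as_int is a hypothetical helper based on the assumption that Teradata stores a DATE internally as (year - 1900) * 10000 + month * 100 + day; treat that encoding as an assumption to check against the Teradata documentation.

```python
from datetime import date

# 12/31/9999 evaluated left to right with integer division.
literal_as_arithmetic = 12 // 31 // 9999

def teradata_date_as_int(d):
    # Assumed internal encoding: (year - 1900) * 10000 + month * 100 + day
    return (d.year - 1900) * 10000 + d.month * 100 + d.day

end_dt = date(9999, 12, 31)

print(literal_as_arithmetic)
print(teradata_date_as_int(end_dt))
```

Under that assumption the stored value for 9999-12-31 is nowhere near 0, which would explain why the original WHERE clause matches no rows.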
Is it a date column? Then try where end_dt = '9999-12-31'.
The question you ask is not very clear. The date you specify is language dependent.
Try
SELECT TOP 10 * FROM MY_DATABASE.AGREEMENT WHERE end_dt='99991231'