Athena Timestamp - sql

What is the appropriate format for my datetime? I've tried several combinations and keep getting various errors. The data is a string; here is an example: "2022-10-28T00:00:00Z"
Neither of these works:
WHERE MONTH(parse_datetime(start, 'yyyy-MM-dd"T"HH:mm:ss"Z"')) = 12
WHERE MONTH(parse_datetime(start, 'yyyy-MM-dd HH:mm:ss')) = 12

Java-style date format patterns use single quotes (') to mark literal text such as T and Z. Because the pattern itself sits inside a SQL string literal, each of those single quotes has to be escaped by doubling it:
select parse_datetime('2023-01-30T20:00:02Z', 'yyyy-MM-dd''T''HH:mm:ss''Z''');
Output:
_col0
2023-01-30 20:00:02.000 UTC
Note that in this case you can just use the from_iso8601_timestamp function, which is generally the more appropriate approach:
select from_iso8601_timestamp('2023-01-30T20:00:02Z');
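The two layers of quoting can be made explicit with a small sketch. It assumes the query text is assembled in Python; the helper name `sql_literal` and the table name `t` are hypothetical and not part of Athena. The pattern is written once with single-quoted literals, and the doubling is applied only when it is embedded in a SQL string literal:

```python
def sql_literal(text: str) -> str:
    """Wrap text in single quotes for SQL, doubling any embedded single quotes."""
    return "'" + text.replace("'", "''") + "'"

# Java-style pattern: T and Z are literal text, so each is wrapped in single quotes.
pattern = "yyyy-MM-dd'T'HH:mm:ss'Z'"

# Hypothetical query text; `start` is the column name from the question.
query = f"SELECT parse_datetime(start, {sql_literal(pattern)}) FROM t"
print(query)
```

This is only meant to show where each quote comes from; for user-supplied values, parameterized queries are the safer choice.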

Related

Snowflake - Date string has T character, can't convert to datetime

I have several columns that look like this with a T and +: 2020-04-11T21:00:09+0000
I want to convert them to datetime if possible, I've tried to_timestamp_ntz() and to_date():
to_timestamp_ntz('2020-04-11T21:00:09+0000', 'YYYY-MM-DD HH24:MI:SS.FF+00')
but I keep seeing:
Can't parse '2020-04-11T21:00:09+0000' as timestamp with format...
It is a matter of format:
SELECT to_timestamp_ntz('2020-04-11T21:00:09+0000',
'YYYY-MM-DD"T"HH24:MI:SSTZHTZM') AS res
To handle T it needs to be provided as "T".
The pattern "..." inside format works for arbitrary text:
SELECT to_timestamp_ntz('2020-04-11aaaa21:00:09+0000',
'YYYY-MM-DD"aaaa"HH24:MI:SSTZHTZM') AS res
-- 2020-04-11 21:00:09.000
I don't think there's a way to specify a date/time format to skip over a character like that. You may have to do something like this:
select to_timestamp_ntz(replace('2020-04-11T21:00:09+0000', 'T', ' '), 'YYYY-MM-DD HH24:MI:SSTZHTZM')
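For a quick cross-check outside Snowflake, the same string can be parsed with Python's standard library. This is only an illustrative sketch, not Snowflake behavior: `%z` stands in for the numeric offset, the T is matched as literal text, and dropping the offset afterwards gives a plain wall-clock value.

```python
from datetime import datetime

raw = "2020-04-11T21:00:09+0000"

# %z accepts a numeric UTC offset such as +0000; T is matched as literal text.
aware = datetime.strptime(raw, "%Y-%m-%dT%H:%M:%S%z")

# Drop the offset to get a naive wall-clock value.
naive = aware.replace(tzinfo=None)
print(naive)  # 2020-04-11 21:00:09
```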

How to delete "" and change it to date format

My dataset's date column (tweet_stamp) is shown as "2020-01-29 00-21-29" and is stored as a STRING.
I would like the result to be 2020-01-29.
How can I remove the "" from the string and convert it to a date?
I tried the code below:
select to_date(from_unixtime(unix_timestamp(tweet_timestamp,'"yyyy-MM-dd 00-00-00"'), 'yyyy-MM-dd')) as tweet_date;
However, the result is NULL.
You do not need the unix_timestamp + from_unixtime conversion, because the date part of the string is already in the right format. Just remove the double quotes, take the substring, and optionally convert it to a date:
select date(substr(regexp_replace('"2020-01-29 00-21-29"','"',''),1,10)) --returns 2020-01-29
Or, even simpler, use the to_date function:
select to_date(regexp_replace('"2020-01-29 00-21-29"','"','')) --2020-01-29
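The same steps (strip the quotes, keep the first ten characters, convert to a date) can be checked outside the database. A minimal Python sketch, using the sample value from the question:

```python
from datetime import date

raw = '"2020-01-29 00-21-29"'

# Strip the double quotes, then keep the first 10 characters (yyyy-MM-dd).
cleaned = raw.replace('"', "")
tweet_date = date.fromisoformat(cleaned[:10])
print(tweet_date)  # 2020-01-29
```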

postgres sql to extract year-month

Have a table with a column like this:
first_day_month
01/07/2020
01/07/2020
01/08/2020
01/09/2020
.......
Need to create a column like year-month,
Tried to_char(first_day_month, 'MM/YYYY') but got an error:
Error running query: INVALID_FUNCTION_ARGUMENT: Failed to tokenize string [M] at offset [0]
Tried
concat(extract(year from first_day_month),'-',extract(month from first_day_month) ) as month,
with an error:
Error running query: SYNTAX_ERROR: line 2:1: Unexpected parameters (bigint, varchar(1), bigint) for function concat. Expected: concat(array(E), E) E, concat(E, array(E)) E, concat(array(E)) E, concat(varchar)
Also tried date_parse but didn't get it right, any idea?
Thanks
You need to use TO_DATE first, to convert the column to a proper date. Then use TO_CHAR to format as you want:
SELECT TO_CHAR(TO_DATE(first_day_month, 'DD/MM/YYYY'), 'MM/YYYY') AS my
FROM yourTable;
Note that in this case, since the month/year text you want is just the rightmost seven characters of the string, you could also use RIGHT directly:
SELECT RIGHT(first_day_month, 7)
FROM yourTable;
Finally, note that YYYY/MM would generally be a better format to use, as it sorts properly. So perhaps consider using this version:
SELECT TO_CHAR(TO_DATE(first_day_month, 'DD/MM/YYYY'), 'YYYY/MM') AS ym
FROM yourTable;
Your data doesn't seem to be of DATE type; it is probably a string. In that case, convert it to DATE first and then format it with the desired pattern (note that a plain cast interprets the string according to the session's DateStyle setting):
SELECT TO_CHAR(first_day_month::DATE,'MM/YYYY') AS first_day_month
FROM t
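The point about YYYY/MM sorting properly is easy to see with a few values. The sketch below is in Python rather than SQL; the list reuses the sample dates from the question plus one assumed value from an earlier year (01/12/2019) to make the difference visible.

```python
from datetime import datetime

# Sample DD/MM/YYYY strings; "01/12/2019" is an assumed extra value.
values = ["01/07/2020", "01/08/2020", "01/09/2020", "01/12/2019"]

def year_month(text: str) -> str:
    """Parse a DD/MM/YYYY string and format it as YYYY/MM."""
    return datetime.strptime(text, "%d/%m/%Y").strftime("%Y/%m")

def month_year(text: str) -> str:
    """Parse a DD/MM/YYYY string and format it as MM/YYYY."""
    return datetime.strptime(text, "%d/%m/%Y").strftime("%m/%Y")

print(sorted(year_month(v) for v in values))
# ['2019/12', '2020/07', '2020/08', '2020/09']
print(sorted(month_year(v) for v in values))
# ['07/2020', '08/2020', '09/2020', '12/2019']
```

With MM/YYYY, December 2019 sorts after September 2020 because the comparison is done on text, starting with the month.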

BigQuery: Validate that all dates are formatted as yyyy-mm-dd

Using Google BIGQUERY, I need to check that the values in a column called birth_day_col are the correct and desired date format: YYYY-MM-DD. The values in this column are defined as STRING. Also the values in this column are currently of the following format: YYYY-MM-DD.
I researched a lot on the internet and found an interesting workaround. The following query:
SELECT
DISTINCT birth_day_col
FROM `project.dataset.datatable`
WHERE birth_day_col LIKE '[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]'
AND country_code = 'country1'
But the result is: "This query returned no results."
I then checked with NOT, using the following code:
SELECT
DISTINCT birth_day_col
FROM `project.dataset.datatable`
WHERE NOT(birth_day_col LIKE '[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]')
AND country_code = 'country1'
Surprisingly, it returned all the values in birth_day_col, which I have verified are in the correct date format, but this result could very well be a coincidence.
It also seems strange (wrong) that a query meant to return only the incorrectly formatted dates actually returns the correct ones. The two queries appear to have swapped roles.
The expected result of any query for this business case is to make a count of all incorrect formatted dates (even if currently this is 0).
Thank you for your help!
Robert
A couple of things here:
Read the documentation for the LIKE operator if you want to understand how to use it. It looks like you're trying to use regular expression syntax, but the LIKE operator does not take a regular expression as input.
The standard format for BigQuery's dates is YYYY-MM-DD, so you can just try casting and see if the result is a valid date, e.g.:
SELECT SAFE_CAST(birth_day_col AS DATE) AS birth_day_col
FROM `project`.dataset.table
This will return null for any values that don't have the correct format. If you want to find all of the ones that don't have the correct format, you can use SAFE_CAST inside a filter:
SELECT DISTINCT birth_day_col AS invalid_date
FROM `project`.dataset.table
WHERE SAFE_CAST(birth_day_col AS DATE) IS NULL
The result of this query will be all of the date strings that don't use YYYY-MM-DD format. If you want to check for slashes instead, you can use REGEXP_CONTAINS, e.g. try this:
SELECT
date,
REGEXP_CONTAINS(date, r'^[0-9]{4}/[0-9]{2}/[0-9]{2}$')
FROM (
SELECT '2019/05/10' AS date UNION ALL
SELECT '2019-05-10' UNION ALL
SELECT '05/10/2019'
)
If you want to find all dates with either YYYY-MM-DD format or YYYY/MM/DD format, you can use a query like this:
SELECT
DISTINCT date
FROM `project`.dataset.table
WHERE REGEXP_CONTAINS(date, r'^[0-9]{4}[/\-][0-9]{2}[/\-][0-9]{2}$')
For example:
SELECT
DISTINCT date
FROM (
SELECT '2019/05/10' AS date UNION ALL
SELECT '2019-05-10' UNION ALL
SELECT '05/10/2019'
)
WHERE REGEXP_CONTAINS(date, r'^[0-9]{4}[/\-][0-9]{2}[/\-][0-9]{2}$')
Yet another example for BigQuery Standard SQL, using SAFE.PARSE_DATE:
#standardSQL
WITH `project.dataset.table` AS (
SELECT '1980/08/10' AS birth_day_col UNION ALL
SELECT '1980-08-10' UNION ALL
SELECT '08/10/1980'
)
SELECT birth_day_col
FROM `project.dataset.table`
WHERE SAFE.PARSE_DATE('%Y-%m-%d', birth_day_col) IS NULL
The result lists all values that are not formatted as yyyy-mm-dd:
Row birth_day_col
1 1980/08/10
2 08/10/1980
Google BigQuery's LIKE operator does not support matching digits, nor does it use the [ character in its syntax (I don't think ISO standard SQL does either; LIKE is nowhere near as powerful as regex).
X [NOT] LIKE Y
Checks if the STRING in the first operand X matches a pattern specified by the second operand Y. Expressions can contain these characters:
A percent sign "%" matches any number of characters or bytes
An underscore "_" matches a single character or byte
You can escape "\", "_", or "%" using two backslashes. For example, "\\%". If you are using raw strings, only a single backslash is required. For example, r"\%".
You should use REGEXP_CONTAINS instead.
I note that string format tests won't tell you if a date is valid or not, however. Consider that 2019-02-31 has a valid date format, but an invalid date value. I suggest using a datatype conversion function (to convert the STRING to a DATE value) instead.
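The difference between a valid format and a valid value can be shown with a short sketch. It is written in Python rather than BigQuery SQL, and the function names are hypothetical; the regex plays the role of a REGEXP_CONTAINS check, and the parse attempt plays the role of a safe conversion to DATE.

```python
import re
from datetime import datetime

# Shape check only: four digits, dash, two digits, dash, two digits.
ISO_SHAPE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")

def has_iso_shape(text: str) -> bool:
    """True if the string looks like YYYY-MM-DD, whatever the digits are."""
    return ISO_SHAPE.fullmatch(text) is not None

def is_real_date(text: str) -> bool:
    """True if the string has the YYYY-MM-DD shape and names a real calendar date."""
    if not has_iso_shape(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
        return True
    except ValueError:
        return False

for sample in ["2019-05-10", "2019-02-31", "2019/05/10"]:
    print(sample, has_iso_shape(sample), is_real_date(sample))
# 2019-05-10 True True
# 2019-02-31 True False
# 2019/05/10 False False
```

The middle row is the case a pattern test alone would miss: the shape is right, but February 31 does not exist.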

Oracle query cast attribute

I am trying to convert a decimal to text. Every time I try to cast, it converts the number this way:
number: converted:
---------------------
0.1234 .1234
I tried using TO_CHAR, but without success.
Use the second (format) parameter of the TO_CHAR function. A 0 in the position just before the decimal point forces the leading zero, and FM suppresses the padding:
SELECT TO_CHAR(0.1234, 'FM9990.9999') FROM DUAL