Want to convert timestamp to date format in hive - sql

I want to convert the number '20210412070422' to the date format '2021-04-12' in Hive.
I tried the following, but it returns NULL:
from_unixtime(unix_timestamp(eap_as_of_dt, 'MM/dd/yyyy'))

The best method is to avoid unix_timestamp/from_unixtime where possible, and in your case it is possible. The date() call can be removed, because a string in yyyy-MM-dd format is compatible with the date type:
select date(concat_ws('-',substr(ts,1,4),substr(ts,5,2),substr(ts,7,2)))
from
(
select '20210412070422' as ts
)s
Result:
2021-04-12
Another efficient method using regexp_replace:
select regexp_replace(ts,'^(\\d{4})(\\d{2})(\\d{2}).*','$1-$2-$3')
If you prefer using unix_timestamp/from_unixtime
select date(from_unixtime(unix_timestamp(ts, 'yyyyMMddHHmmss')))
from
(
select '20210412070422' as ts
)s
But it is more complex, slower (the SimpleDateFormat class is involved), and error-prone, because it will not work if the data is not exactly in the expected format, for example '202104120700'.
Of course, you can make it more reliable by taking a substring of the required length and using the yyyyMMdd template:
select date(from_unixtime(unix_timestamp(substr(ts,1,8), 'yyyyMMdd')))
from
(
select '20210412070422' as ts
)s
It makes it even more complex.
Use unix_timestamp/from_unixtime only if simple substr or regexp_replace do not work for data format like '2021Apr12blabla'.
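For anyone who wants to sanity-check the slicing logic outside Hive, here is a minimal Python sketch of the same two ideas (fixed-position substrings and a regex rewrite). The function names are made up for illustration, and Python's replacement syntax (\1) differs from the $1 used by Hive's regexp_replace.

```python
import re
from datetime import date

def ts_to_date_substr(ts: str) -> date:
    # Same slices as substr(ts,1,4), substr(ts,5,2), substr(ts,7,2) in the Hive query.
    iso = "-".join([ts[0:4], ts[4:6], ts[6:8]])
    # fromisoformat raises ValueError if the result is not a real calendar date.
    return date.fromisoformat(iso)

def ts_to_date_regex(ts: str) -> str:
    # Keep the first 8 digits and drop whatever follows them.
    return re.sub(r"^(\d{4})(\d{2})(\d{2}).*", r"\1-\2-\3", ts)

print(ts_to_date_substr("20210412070422"))  # 2021-04-12
print(ts_to_date_regex("202104120700"))     # 2021-04-12
```

Both helpers only look at the first 8 characters, so a shorter or longer time part does not affect the result.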

Related

BigQuery CAST whole string column to date after LEFT

The data came in datetime format but was uploaded as a string to BigQuery. I want to extract only the date part, not the time part. I tried a subquery using this code:
SELECT CAST(LEFT(SleepDay,9) AS DATE)
but I kept getting the error: Invalid datetime string "4/12/2016". Please help me fix it, thank you so much.
Try using BigQuery's PARSE_DATE() function.
SELECT PARSE_DATE('%m/%d/%Y', LEFT(SleepDay,9))
I would not recommend using LEFT(date_str, 9), since you don't want to parse 12/31/2022 as 12/31/202. Instead, split the date part out of the string and pass it to PARSE_DATE() with the proper format string.
WITH
dataset AS (
SELECT "4/12/2016T12:34:56" as SleepDay
UNION ALL SELECT "4/15/2016T12:34:56" as SleepDay
),
some_how_parse_the_date_string AS (
SELECT SPLIT(SleepDay, 'T')[OFFSET(0)] as SleepDay_date_string
FROM dataset
)
SELECT PARSE_DATE('%m/%d/%Y', SleepDay_date_string) as SleepDay_date
FROM some_how_parse_the_date_string
;
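The truncation problem with a fixed-width LEFT is easy to reproduce outside BigQuery. A rough Python sketch, assuming (as the sample data above does) that a 'T' separates the date from the time; parse_sleep_day is a hypothetical helper, not a BigQuery function:

```python
from datetime import date, datetime

def parse_sleep_day(value: str) -> date:
    # Split on the separator instead of counting characters, then parse m/d/Y.
    date_part = value.split("T")[0]
    return datetime.strptime(date_part, "%m/%d/%Y").date()

# A fixed width of 9 happens to fit "4/12/2016" but cuts a longer date short.
print("4/12/2016T12:34:56"[:9])    # 4/12/2016
print("12/31/2022T12:34:56"[:9])   # 12/31/202
print(parse_sleep_day("12/31/2022T12:34:56"))  # 2022-12-31
```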

[SQL]Removing day in yyyy/mm/dd datetime format in sql

I'm using PostgreSQL, but this question is for any modern dbms
Basically, I want to convert a datetime column that holds yyyy/mm/dd values into just yyyy/mm.
I tried getting the month and year separately and using concat, but the month comes back as a single-digit integer for values < 10, and that messes up ordering:
select *,
concat(date_part('year' , date_old), '/', date_part('month' , date_old)) as date_new
from table
date_old date_new
2010-01-20 2010-1
2010-01-22 2010-1
2010-11-22 2010-11
You can use to_char()
to_char(date_old, 'yyyy/mm')
If you want to display your date in the format YYYY/MM:
In PostgreSQL and Oracle, use TO_CHAR:
SELECT TO_CHAR(date_old, 'YYYY/MM') FROM table_name;
In MySQL, use DATE_FORMAT:
SELECT DATE_FORMAT(date_old, '%Y/%m') FROM table_name;
In SQL Server, use CONVERT or, if you are using SQL Server 2012 or later, FORMAT:
SELECT CONVERT(varchar(7), date_old, 111) FROM table_name;
SELECT FORMAT(date_old,'yyyy/MM') FROM table_name;
Don't do this.
If you're able to use the date_part() function, what you have is not actually formatted as the yyyy/mm/dd value you say it is. Instead, it's a binary value that's not human-readable, and what you see is a convenience shown to you by your tooling.
You should leave this binary value in place!
If you convert to yyyy/mm, you will lose the ability to directly call functions like date_part(), and you will lose the ability to index the column properly.
What you'll have left is a varchar column that only pretends to be a date value. Schemas that do this are considered BROKEN.
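The ordering problem from the question, and the reason to format only at display time, can be shown with a small Python sketch. It is illustrative only; the sample dates are made up to include a single-digit and a two-digit month.

```python
from datetime import date

dates = [date(2010, 11, 22), date(2010, 2, 5), date(2010, 1, 20)]

# Unpadded months sort as text, so November lands before February.
unpadded = sorted(f"{d.year}/{d.month}" for d in dates)
print(unpadded)  # ['2010/1', '2010/11', '2010/2']

# Sort on the real date values, then format with a zero-padded month for display.
padded = [d.strftime("%Y/%m") for d in sorted(dates)]
print(padded)    # ['2010/01', '2010/02', '2010/11']
```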

How to retrieve data from MariaDB where any dates are in the format YYYY-MM-DD?

I'm retrieving data from MariaDB using:
SELECT * FROM table_name
Three columns in this data contain dates (and are formatted as dates in the form YYYY-MM-DD). When I receive them on the client side, they appear as "2021-07-11T14:00:00.000Z" but I instead want "2021-07-11". I've tried lots of things, including:
SELECT * FROM table_name DATE_FORMAT(date_column_one,'dd/mm/yyyy')
which doesn't work, as well as
SELECT *, DATE_FORMAT(date_column_one,'dd/mm/yyyy') FROM table_name
but this simply adds another column of data - I still get the other dates in the wrong format.
There is a lot of info about this stuff online but I can't find anywhere where they actually combine the formatting of dates with a basic select statement.
Your call to DATE_FORMAT is not using the format mask you need here. DATE_FORMAT expects %-prefixed specifiers, so the mask for YYYY-MM-DD is '%Y-%m-%d'. Use that mask, and don't also select the date columns via SELECT *:
SELECT col1, col2, col3, ..., -- excluding date_column_one etc.
DATE_FORMAT(date_column_one, '%Y-%m-%d') AS date_column_one
FROM table_name;
If you just want to return a date, then you can use the date function:
SELECT . . ., -- the other columns
DATE(date_column_one)
FROM table_name;
However, this returns a column with the type of date and you are at the mercy of your application to display it. Some applications might do you the "favor" of deciding that you want to see the time and timezone.
You can control this by converting the value to a string using DATE_FORMAT():
SELECT . . ., -- the other columns
DATE_FORMAT(date_column_one, '%d/%m/%Y')
FROM table_name;
Now the value is a string and its format will not be changed. The format '%Y-%m-%d' is the standard YYYY-MM-DD format, and I much prefer that.

Change date format in oracle query

When running
select processing_date from table;
I got this result: "04-30-2020 20.12.49.978711"
I want to change the format of the result to "30-APR-20".
Is there a way I can do that?
I tried select to_date(processing_date,'mm-dd-yyyy') from table; but it gives me errors.
Any help?
You want to_char():
select to_char(processing_date, 'DD-MON-YY')
Dates are stored in an internal format, which you cannot change. If you want the date formatted in a particular way, then one solution is to convert it to a string with the format you want.
EDIT:
The date appears to be a string. You can convert it to a date using:
select to_date(substr(processing_date, 1, 10), 'MM-DD-YYYY')
You can then either use it as-is or use to_char() to get the format you really want.
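As a cross-check of the two steps (string to date, then date to the display format), here is a small Python sketch. It assumes the first 10 characters of the value are always MM-DD-YYYY, and it spells out the month abbreviations itself rather than relying on locale settings; to_dd_mon_yy is a made-up name.

```python
from datetime import datetime

MONTHS = ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
          "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"]

def to_dd_mon_yy(raw: str) -> str:
    # Step 1: parse the leading MM-DD-YYYY part, like to_date(substr(..., 1, 10), 'MM-DD-YYYY').
    d = datetime.strptime(raw[:10], "%m-%d-%Y").date()
    # Step 2: render the date as DD-MON-YY.
    return f"{d.day:02d}-{MONTHS[d.month - 1]}-{d.year % 100:02d}"

print(to_dd_mon_yy("04-30-2020 20.12.49.978711"))  # 30-APR-20
```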

BigQuery: Validate that all dates are formatted as yyyy-mm-dd

Using Google BIGQUERY, I need to check that the values in a column called birth_day_col are the correct and desired date format: YYYY-MM-DD. The values in this column are defined as STRING. Also the values in this column are currently of the following format: YYYY-MM-DD.
I researched a lot on the internet and found an interesting workaround. The following query:
SELECT
DISTINCT birth_day_col
FROM `project.dataset.datatable`
WHERE birth_day_col LIKE '[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]'
AND country_code = 'country1'
But the result is: "This query returned no results."
I then checked with NOT, using the following code:
SELECT
DISTINCT birth_day_col
FROM `project.dataset.datatable`
WHERE NOT(birth_day_col LIKE '[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]')
AND country_code = 'country1'
Surprisingly, it returned all the values in birth_day_col, which I have verified and which are in the correct date format, but this result could very much be a coincidence.
It is also very strange (wrong) that a query that should return only the wrongly formatted dates actually gives me the correct ones. Everything about these two queries looks like an inversion of each one's role.
The expected result of any query for this business case is to make a count of all incorrect formatted dates (even if currently this is 0).
Thank you for your help!
Robert
A couple of things here:
Read the documentation for the LIKE operator if you want to understand how to use it. It looks like you're trying to use regular expression syntax, but the LIKE operator does not take a regular expression as input.
The standard format for BigQuery's dates is YYYY-MM-DD, so you can just try casting and see if the result is a valid date, e.g.:
SELECT SAFE_CAST(birth_day_col AS DATE) AS birth_day_col
FROM `project`.dataset.table
This will return null for any values that don't have the correct format. If you want to find all of the ones that don't have the correct format, you can use SAFE_CAST inside a filter:
SELECT DISTINCT birth_day_col AS invalid_date
FROM `project`.dataset.table
WHERE SAFE_CAST(birth_day_col AS DATE) IS NULL
The result of this query will be all of the date strings that don't use YYYY-MM-DD format. If you want to check for slashes instead, you can use REGEXP_CONTAINS, e.g. try this:
SELECT
date,
REGEXP_CONTAINS(date, r'^[0-9]{4}/[0-9]{2}/[0-9]{2}$')
FROM (
SELECT '2019/05/10' AS date UNION ALL
SELECT '2019-05-10' UNION ALL
SELECT '05/10/2019'
)
If you want to find all dates with either YYYY-MM-DD format or YYYY/MM/DD format, you can use a query like this:
SELECT
DISTINCT date
FROM `project`.dataset.table
WHERE REGEXP_CONTAINS(date, r'^[0-9]{4}[/\-][0-9]{2}[/\-][0-9]{2}$')
For example:
SELECT
DISTINCT date
FROM (
SELECT '2019/05/10' AS date UNION ALL
SELECT '2019-05-10' UNION ALL
SELECT '05/10/2019'
)
WHERE REGEXP_CONTAINS(date, r'^[0-9]{4}[/\-][0-9]{2}[/\-][0-9]{2}$')
Yet another example for BigQuery Standard SQL, using SAFE.PARSE_DATE:
#standardSQL
WITH `project.dataset.table` AS (
SELECT '1980/08/10' AS birth_day_col UNION ALL
SELECT '1980-08-10' UNION ALL
SELECT '08/10/1980'
)
SELECT birth_day_col
FROM `project.dataset.table`
WHERE SAFE.PARSE_DATE('%Y-%m-%d', birth_day_col) IS NULL
with the result being a list of all dates that are not formatted as yyyy-mm-dd:
Row birth_day_col
1 1980/08/10
2 08/10/1980
Google BigQuery's LIKE operator does not support matching digits, nor does it use the [ character in its syntax (I don't think ISO standard SQL does either; LIKE is nowhere near as powerful as regex).
X [NOT] LIKE Y
Checks if the STRING in the first operand X matches a pattern specified by the second operand Y. Expressions can contain these characters:
A percent sign "%" matches any number of characters or bytes
An underscore "_" matches a single character or byte
You can escape "\", "_", or "%" using two backslashes. For example, "\\%". If you are using raw strings, only a single backslash is required. For example, r"\%".
You should use REGEXP_CONTAINS instead.
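To see why the bracket pattern matched nothing, it may help to emulate LIKE with a simplified Python sketch. It handles only % and _ as described in the quoted documentation, ignores backslash escapes, and the helper names are hypothetical.

```python
import re

def like_to_regex(pattern: str) -> str:
    # Translate a LIKE pattern: % is any run of characters, _ is one character,
    # and everything else (including [ and ]) is taken literally.
    parts = []
    for ch in pattern:
        if ch == "%":
            parts.append(".*")
        elif ch == "_":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    return "".join(parts)

def like(value: str, pattern: str) -> bool:
    return re.fullmatch(like_to_regex(pattern), value, flags=re.DOTALL) is not None

print(like("1980-08-10", "[1-2][0-9][0-9][0-9]/[0-1][0-9]/[0-3][0-9]"))  # False
print(like("1980-08-10", "____-__-__"))  # True
```

Because the brackets are literal, the pattern from the question could only match a string that itself contains the text [1-2][0-9]..., which is why the NOT version returned every row.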
I note that string format tests won't tell you if a date is valid or not, however. Consider that 2019-02-31 has a valid date format, but an invalid date value. I suggest using a datatype conversion function (to convert the STRING to a DATE value) instead.
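A short Python sketch of that distinction, combining a format check with a real calendar check and counting the failures, which is what the question asks for. The function names are illustrative.

```python
import re
from datetime import date

ISO_SHAPE = re.compile(r"(\d{4})-(\d{2})-(\d{2})")

def is_valid_iso_date(value: str) -> bool:
    m = ISO_SHAPE.fullmatch(value)
    if m is None:
        return False  # wrong format, e.g. slashes or wrong field order
    try:
        date(int(m.group(1)), int(m.group(2)), int(m.group(3)))
    except ValueError:
        return False  # right format, but not a real calendar date
    return True

def count_invalid(values) -> int:
    # Count the values that fail either the format check or the calendar check.
    return sum(not is_valid_iso_date(v) for v in values)

sample = ["1980-08-10", "1980/08/10", "08/10/1980", "2019-02-31"]
print(count_invalid(sample))  # 3
```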