Change data type of column from STRING format to DATE format - sql

I am reading a file from an ADLS location. One of its columns, Period_Ending_Date, has the data type STRING.
Period_Ending_Date holds many dates in random order, and I need to apply a filter to get the latest date.
I'm trying this code:
select * from final_table
WHERE Period_Ending_Date = (SELECT MAX(Period_Ending_Date) FROM final_table)
But the problem is that I get the row with the largest day value, not the latest date. I understand this happens because of the STRING data type. Please guide me on how to change this column to the DATE data type, or suggest an alternative way to solve this.
I'm working with Scala and SQL on Azure Databricks.

What about changing SELECT MAX(Period_Ending_Date) FROM final_table to SELECT MAX(cast(Period_Ending_Date as date)) FROM final_table? That performs an explicit cast to date, which works if the date format is ISO 8601 (YYYY-MM-DD). For non-standard date formats, use the to_date function with a format pattern instead.
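To see why the string MAX goes wrong, here is a minimal Python sketch of the same comparison outside the database. The sample values and the day-month-year format are assumptions, since the question does not state the real format of Period_Ending_Date.

```python
from datetime import datetime

# Hypothetical sample values in dd-MM-yyyy order (the real format is not given in the question).
period_ending_dates = ["31-01-2021", "15-06-2022", "28-02-2022"]

# Plain string comparison starts with the leading day digits, so "31-..." wins.
max_as_string = max(period_ending_dates)

# Parsing each value first makes the comparison use real calendar dates.
max_as_date = max(
    period_ending_dates,
    key=lambda s: datetime.strptime(s, "%d-%m-%Y"),
)

print(max_as_string)  # 31-01-2021
print(max_as_date)    # 15-06-2022
```

The cast or to_date call in the SQL plays the same role as strptime here: it turns the text into a value that orders chronologically before MAX is applied.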

Related

Convert YYYYMMDD to MM/DD/YYYY in Snowflake

I need help in figuring out the date conversion logic in Snowflake. The documentation isn't clear enough on this.
In SQL Server, I would try
SELECT CONVERT(DATE, '20200730', 101)
and it gives me '07/30/2020'.
If I try the following in Snowflake,
to_varchar('20200730'::date, 'mm/dd/yyyy')
it gives me '08/22/1970'. Why would it give an entirely different date? I need help getting the logic right so that it returns the correct date.
The issue is that you are assuming Snowflake converts '20200730'::DATE to 2020-07-30. It doesn't. Without an input format, a string made up only of digits is read as a number of seconds since the Unix epoch, which is why you end up with a date in 1970. You need to specify the input format of the date. So, here are 2 options, since your question is a bit vague:
If you have a string in a table and you wish to transform that into a date and then present it back as a formatted string:
SELECT TO_VARCHAR(TO_DATE('20200730','YYYYMMDD'),'MM/DD/YYYY');
--07/30/2020
If the field in the table is already a date, then you just need to apply the TO_VARCHAR() piece directly against that field.
Unlike SQL Server, Snowflake stores date fields in the same format regardless of what you provide it. You need to use TO_VARCHAR to present that date in a different format. Alternatively, ALTER SESSION SET DATE_OUTPUT_FORMAT will also work.
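The '08/22/1970' result is consistent with the digits being read as epoch seconds rather than as YYYYMMDD. The following Python sketch only reproduces that arithmetic. It is not Snowflake code, and the epoch-seconds interpretation is my reading of the Snowflake documentation.

```python
from datetime import datetime, timezone

# Interpretation 1: the digits are seconds since 1970-01-01 00:00:00 UTC.
as_epoch_seconds = datetime.fromtimestamp(20200730, tz=timezone.utc).date()

# Interpretation 2: the digits are a calendar date written as YYYYMMDD.
as_yyyymmdd = datetime.strptime("20200730", "%Y%m%d").date()

print(as_epoch_seconds.strftime("%m/%d/%Y"))  # 08/22/1970
print(as_yyyymmdd.strftime("%m/%d/%Y"))       # 07/30/2020
```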
Try select to_varchar(TO_DATE('20200730', 'YYYYMMDD'), 'MM/DD/YYYY'); which produces 07/30/2020.
You may need to refer to https://docs.snowflake.com/en/user-guide/date-time-input-output.html#timestamp-formats

Google Analytics to Big Query

The "date" field from GA arrives in BQ as a "yyyymmdd" string, which cannot be converted to the DATE type directly.
Is there any way to make BQ recognize it as a date?
Thank you,
According to the documentation, the date field is exported as String from your GA data.
However, it is possible to change that after you export your data to BigQuery. You can overwrite your current table or create a new one with the date format you desire. To achieve this, use the built-in PARSE_DATE() function. It receives a string and casts it to a date according to the format string you supply. Below is the Standard SQL syntax in BigQuery:
SELECT PARSE_DATE("%Y%m%d", date) as date FROM `project.dataset.table`
The date will be output as YYYY-MM-DD. In addition, if you want to change the date format, you can use the built-in FORMAT_DATE() function with one of the formatting elements.
In your case, where you want to replace the whole table so that the date column has the desired type, you could use the following syntax:
CREATE OR REPLACE TABLE `project.dataset.table` AS
( SELECT * REPLACE(PARSE_DATE("%Y%m%d",date) as date) FROM `project.dataset.table`)
Your table will then have all the same columns, but the date field will have the DATE type.

convert TEXT dd/mm/yyyy in SQL column to DATE YYYY-MM-DD

I would love to know the best way to handle data that has been inputted incorrectly as dd/mm/yyyy into a sql database as TEXT and to have it converted into a new column of the table with the datatype as DATE so it is actually stored as yyyy-mm-dd.
Existing text date column name is called "olddate" with an empty column created called "truedate" to house the new data. Each row has the date field, but none are able to be sorted correctly because of this issue.
Any ideas how I can slice and dice the current date into a new DATE field friendly version?
Thanks in advance :-)
In SQL Server, dd/mm/yyyy is style 103. So use:
select convert(date, olddate, 103)
Are you using Oracle? If so, TO_DATE is what you want. It takes a string that represents a date and converts it to a date using the format you pass it.
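Whichever database function you use, the conversion is a parse with an explicit dd/mm/yyyy pattern followed by storage as a real date. Here is a minimal Python sketch of that step. The helper name to_iso_date and the sample values are hypothetical, and returning None for bad rows is only one possible way to handle them.

```python
from datetime import datetime
from typing import Optional


def to_iso_date(text: str) -> Optional[str]:
    """Parse a dd/mm/yyyy string and return yyyy-mm-dd, or None if it does not parse."""
    try:
        return datetime.strptime(text.strip(), "%d/%m/%Y").date().isoformat()
    except ValueError:
        # Leave rows that do not match the expected pattern empty for manual review.
        return None


# Hypothetical contents of the "olddate" text column.
olddate_values = ["25/12/2019", "01/02/2020", "2020-02-01"]
truedate_values = [to_iso_date(v) for v in olddate_values]

print(truedate_values)  # ['2019-12-25', '2020-02-01', None]
```

Checking for values that fail to parse before filling "truedate" is worthwhile, because free-text date columns often contain a few entries in a different format.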

How to change date format in hive?

My table in Hive has a date field in the format '2016/06/01', but I find that it is not consistent with the format '2016-06-01'.
The two cannot be compared, for instance.
Both of them are strings.
So I want to know how to make them consistent so that I can compare them. In other words, how do I change '2016/06/01' to '2016-06-01' so that they can be compared?
Many thanks.
To convert a date string from one format to another, you have to use two Hive date functions:
unix_timestamp(string date, string pattern) converts a time string with the given pattern to a Unix timestamp (in seconds), return 0 if fail.
from_unixtime(bigint unixtime[, string format]) converts the number of seconds from the Unix epoch (1970-01-01 00:00:00 UTC) to a string representing the timestamp of that moment in the current system time zone.
Using the above two functions you can achieve your desired result.
The final query is
select from_unixtime(unix_timestamp('2016/06/01','yyyy/MM/dd'),'yyyy-MM-dd') from table1;
where table1 is the name of a table in my Hive database.
I hope this helps!
Let's say you have a column 'birth_day' in your table which is in your format.
You can use the following expression to convert birth_day into the required format:
date_format(birth_day, 'yyyy-MM-dd')
You can use it in a query in the following way:
select * from yourtable
where
date_Format(birth_day, 'yyyy-MM-dd') = '2019-04-16';
Use:
unix_timestamp(DATE_COLUMN, string pattern)
The above converts the date string to a Unix timestamp, which you can then format as you want using a SimpleDateFormat pattern. For example:
cast(to_date(from_unixtime(unix_timestamp(yourdate , 'MM-dd-yyyy'))) as date)
Here is my solution (for string to real Date type):
select to_date(replace('2000/01/01', '/', '-')) as dt ;
PS: to_date() returns the Date type as of Hive 2.1; before 2.1, it returns a String.
PS2: Hive's to_date(), date_format(), and even cast() cannot recognise the 'yyyy/MM/dd' or 'yyyyMMdd' format, which I find quite frustrating.

Highest date from hive table with string data type

I am a newbie to hive and need your help. My requirement is to get the highest date from the table and my date datatype is string. I tried with max(), but it's not working for string data type... please help me on this.
Use the built-in date function unix_timestamp(string date, string pattern).
unix_timestamp converts a date string to a Unix timestamp (an integer), which is comparable.
Assume your table name is t and the time column is tt.
select max(unix_timestamp(tt, 'yyyyMMdd')) from t
would find the max Unix timestamp for you, which corresponds to the latest date.
You're asserting the MAX doesn't work on Strings in Hive, but in fact it does:
Select MAX(dt) FROM (Select explode(Array("20150103", "20150102")) as dt) a;
As long as your date string is in a format which can be sorted lexicographically, MAX should work fine.
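A quick way to check whether a given format sorts lexicographically is to compare the string maximum with the chronologically latest value. The Python sketch below does that for two sample formats. The helper name string_max_is_latest and the sample values are made up for illustration.

```python
from datetime import datetime


def string_max_is_latest(values, fmt):
    """Return True when the lexicographic max equals the chronologically latest value."""
    latest = max(values, key=lambda s: datetime.strptime(s, fmt))
    return max(values) == latest


# Year-first, zero-padded strings order the same way as the dates they represent.
print(string_max_is_latest(["20150103", "20150102", "20141231"], "%Y%m%d"))  # True

# Day-first strings do not: "31/12/2014" sorts above "03/01/2015".
print(string_max_is_latest(["03/01/2015", "02/01/2015", "31/12/2014"], "%d/%m/%Y"))  # False
```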
Since version 0.12.0, max(date) will just work.
If all the values in that column match the pattern 'yyyy-mm-dd', the above syntax should do the job.