PySpark Keep only Year and Month in Date - dataframe

I have a dataframe with a column date_key of DateType. I want to create another column holding only the yyyy-MM part of date_key while keeping it a date type. I tried to_date(df[date_key], 'YYYY-MM'), which does not work. I also tried date_format(df[date_key], 'YYYY-MM'), but the result is a string rather than a date. Could someone please help? Many thanks. The result I need is in the format 2020-09, with no day or timestamp after it.

You can use date_trunc to reduce the precision of a timestamp:
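from pyspark.sql import functions as f
from pyspark.sql.functions import col, to_date
# assumes an existing SparkSession named spark
df = spark.createDataFrame([['2020-09-30'], ['2020-11-11']], ['date'])\
.select(to_date(col('date'), 'yyyy-MM-dd').alias('date_key'))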
df.show()
+----------+
| date_key|
+----------+
|2020-09-30|
|2020-11-11|
+----------+
Then truncate:
df.select(f.date_trunc('mm', col('date_key'))).show()
+------------------------+
|date_trunc(mm, date_key)|
+------------------------+
| 2020-09-01 00:00:00|
| 2020-11-01 00:00:00|
+------------------------+
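date_trunc keeps the precision up to the specified unit (mm here, meaning month) and resets everything below it. Note that it returns a timestamp; to get a DateType result, use trunc(col('date_key'), 'month') instead. A date always carries a day, so a bare 2020-09 is only possible as a string, for example via date_format(col('date_key'), 'yyyy-MM').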
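The difference between truncating a date and formatting it can be sketched in plain Python, without Spark. This is only an illustration of the idea; the helper names month_start and year_month_label are made up for this example and are not part of PySpark.

```python
from datetime import date


def month_start(d: date) -> date:
    """Truncate a date to the first day of its month; the result is still a date."""
    return d.replace(day=1)


def year_month_label(d: date) -> str:
    """Render a date as 'YYYY-MM'; the result is a string, not a date."""
    return d.strftime("%Y-%m")


d = date(2020, 9, 30)
print(month_start(d))       # 2020-09-01
print(year_month_label(d))  # 2020-09
```

The truncated value can still be sorted, compared and joined as a date, while the 'YYYY-MM' label is better kept for display or grouping keys.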

Related

Translate Teradata DATE function (division/extract and sum) into BigQuery

I have this code in Teradata that reads "x_date/100+190000". So from my understanding it removes the 'day' portion from DATE and then adds an INT number of days. Now I have to translate the same into BigQuery but can't see how.
edit: so what I have is a SELECT statement that includes the "x_date" field, which has a DATE format. It contains a list of dates in the form of 'yyyy-mm-dd'. The query reads something like:
SELECT x_date/100+190000
FROM x_table
and the field has this sort of rows:
| '2022-06-06' |
| '2020-03-06' |
| '2019-09-01' |
| '2028-05-06' |
What I don't understand exactly is what these functions are doing in Teradata.
My expected output should be in DATE format and should be copying (in BigQuery), whatever the Teradata function is doing to the field.
Use the query below; it returns the year and month as a YYYYMM string:
SELECT FORMAT_DATE('%Y%m', x_date)
FROM x_table
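To see why FORMAT_DATE('%Y%m', ...) matches the Teradata expression, it may help to reproduce the arithmetic. As far as I know, Teradata stores a DATE internally as the integer (year - 1900) * 10000 + month * 100 + day, so dividing by 100 drops the day and adding 190000 restores the century. The sketch below assumes that encoding and dates from 1900 onwards; the function names are hypothetical.

```python
from datetime import date


def teradata_date_int(d: date) -> int:
    """Integer assumed to back a Teradata DATE: (year - 1900) * 10000 + month * 100 + day."""
    return (d.year - 1900) * 10000 + d.month * 100 + d.day


def teradata_yyyymm(d: date) -> int:
    """Mimic x_date/100+190000: integer division drops the day, the offset restores the century."""
    return teradata_date_int(d) // 100 + 190000


for d in (date(2022, 6, 6), date(2020, 3, 6), date(2019, 9, 1), date(2028, 5, 6)):
    # The Teradata-style integer and the '%Y%m' rendering show the same digits.
    print(d, teradata_yyyymm(d), d.strftime("%Y%m"))
```

So the expression does not add days; it yields an integer such as 202206. FORMAT_DATE produces the same digits as a string, which can be cast to an integer if a numeric result is needed.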

Replace the time as 0's in datetime column

Experts,
I need to set the time portion of a datetime column to zeros, leaving 00:00:00.000.
Sample data:
2019-04-17 08:47:51.433
2019-04-17 00:00:00.000
Kindly suggest a key code.
Thanks in advance!
As you appear to want to keep a time of 00:00:00.000 you could use
SELECT DATE_FORMAT('2019-04-17 08:47:51.433' , '%Y-%m-%d 00:00:00.000');
RESULT
2019-04-17 00:00:00.000
To trim the time part, you can just use the MySQL date function DATE():
select DATE('2019-04-17 08:47:51.433')
| DATE('2019-04-17 08:47:51.433') |
| :------------------------------ |
| 2019-04-17 |
I will assume that you really only want to view your datetime data this way. If so, then you should use DATE_FORMAT with a mask containing only the date portion:
SELECT
dt AS datetime,
DATE_FORMAT(dt, '%Y-%m-%d') AS dateonly
FROM yourTable;
Unless you are certain that you would never need the time information, it makes no sense to throw that away.
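The three answers above do slightly different things: one keeps a datetime with the time zeroed, the others reduce the value to a date. A rough Python equivalent, using the sample value from the question, shows the distinction; it is only a sketch and not MySQL behaviour.

```python
from datetime import datetime

# Parse the sample value; %f accepts the three-digit fraction as milliseconds.
ts = datetime.strptime("2019-04-17 08:47:51.433", "%Y-%m-%d %H:%M:%S.%f")

# Keep a datetime but zero the time portion.
midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)

# Drop the time portion entirely.
date_only = ts.date()

# %f prints six digits, so trim to milliseconds for display.
print(midnight.strftime("%Y-%m-%d %H:%M:%S.%f")[:-3])  # 2019-04-17 00:00:00.000
print(date_only)                                       # 2019-04-17
```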

Getting an error when converting timestamp to time

I have a PostgreSQL table named c_pay_daily_attend (aliased as da) like this:
| Name | da.outime6 |
| Zakir | 2018-09-06 15:00:00 |
I want just time like this:
| Name | da.outime6 |
| Zakir | 15:00:00 |
I am using
TO_TIMESTAMP(da.outime6, 'HH24:MI:SS')::TIME
but it is getting the following error
ERROR: function to_timestamp(timestamp without time zone, unknown) does not exist
LINE 1: select TO_TIMESTAMP(da.intime6, 'HH24:MI:SS')::TIME ee,bp.c_...
^
HINT: No function matches the given name and argument types. You might need to add explicit type casts.
How do I solve this?
Cast the timestamp to time, as below:
select CURRENT_TIMESTAMP::time
Use the CAST() function:
SELECT Cast(da.outime6 :: timestamp AS TIME)
FROM c_pay_daily_attend AS da
outime6 seems to be already a timestamp, so there is no need to convert it first.
Just cast it to a time value:
da.outime6::time

SAS internal Date format to yyyy-MM-dd in HIVE

I received a SAS dataset as a txt file from a client, but the client didn't convert the SAS dates to mm/dd/yyyy or yyyy-MM-dd. As SAS uses seconds since Jan 1, 1960, the dates come through like this:
| response_dt |
+----------------
| 19724 |
| 19673 |
| 19698 |
| 19738 |
| 19738 |
I want to convert this to yyyy-MM-dd format in hive. Kindly help
Just read in the numbers. They are indeed days since Jan 1, 1960.
Then assign a format to them in one of the several ways you can, like
Data myData;
set myData;
format response_dt ddmmyy8.;
run;
If the dataset is huge, consider using proc datasets
SAS dates are days since 1jan1960.
I made this a wiki.
Can anyone add the correct function to add days to a date in Hive?
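The conversion itself is just an offset from the SAS epoch. Here is a minimal Python sketch using the sample values from the question; the function name sas_date_to_iso is made up for illustration. In Hive, the built-in date_add('1960-01-01', response_dt) should apply the same offset, though I have not run it against this data.

```python
from datetime import date, timedelta

# SAS date values count days from 1 January 1960.
SAS_EPOCH = date(1960, 1, 1)


def sas_date_to_iso(days: int) -> str:
    """Convert a SAS date value (days since 1960-01-01) to a yyyy-MM-dd string."""
    return (SAS_EPOCH + timedelta(days=days)).isoformat()


for value in (19724, 19673, 19698, 19738):
    print(value, sas_date_to_iso(value))
# 19724 2014-01-01
# 19673 2013-11-11
# 19698 2013-12-06
# 19738 2014-01-15
```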

Changing the format of data in a column

I'm trying to change the date column from YYYYMMDD to MMDDYYYY while keeping it a varchar value. The column is currently varchar(10). Is there a way to change the strings in bulk? I have thousands of rows that need the format converted.
For example:
| ID | Date |
------------------------
| 1 | 20140911 |
| 2 | 20140101 |
| 3 | 20140829 |
What I want my table to look like:
| ID | Date |
------------------------
| 1 | 09112014 |
| 2 | 01012014 |
| 3 | 08292014 |
Bonus question: Would it cause an issue while trying to convert this column if there is data such as 91212 for 09/12/2012 or something like 1381 which is supposed to be 08/01/2013?
Instead of storing the formatted date in a separate column, just correct the format while fetching, using the STR_TO_DATE function (since, as you said, your dates are stored as string/varchar), as below. The format string must describe how the stored value is laid out, which is '%Y%m%d' for yyyymmdd. Again, as others have suggested, don't store date data as a string; use a date or datetime data type instead.
SELECT STR_TO_DATE(`Date`, '%Y%m%d')
FROM yourtable
EDIT:
In that case, I would suggest not updating your original table. Instead, store the formatted data in a view or in a separate table altogether, as below:
create view formatted_date_view
as
SELECT ID,STR_TO_DATE(`Date`, '%Y%m%d') as 'Formatted_Date'
FROM yourtable
(OR)
create table formatted_date_table
as
SELECT ID,STR_TO_DATE(`Date`, '%Y%m%d') as 'Formatted_Date'
FROM yourtable
EDIT1:
For SQL Server, use the CONVERT function, e.g. CONVERT(datetime, Date, 110), where 110 is the style for the mm-dd-yyyy format:
SELECT ID,convert(datetime,[Date],110) as 'Formatted_Date'
FROM yourtable
(OR)
use the CAST function as below (the only drawback is that you can't specify a style to format the date):
SELECT ID, cast([Date] as datetime) as 'Formatted_Date'
FROM yourtable
MS SQL Server Solution:
Which SQL are you trying with?
MSSQL Server 2008 R2
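You can use the CONVERT function on your date field, specifying the date format style.
For the mm/dd/yyyy format the style value is 101, and for yyyymmdd it is 112.
A style only applies when converting to or from a date type, so the update statement first parses the yyyymmdd string with style 112, then formats it with style 101 and strips the slashes to get mmddyyyy:
UPDATE table_name
SET date = REPLACE( CONVERT( VARCHAR(10), CONVERT( DATE, date, 112 ), 101 ), '/', '' )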
Refer To:
How to format datetime & date in Sql Server
SQL Server 2008 Date Format
Demo # MS SQL Server 2008 Fiddle
MySQL Solution:
it needs to stay in varchar or int and the dates are yyyymmdd and I need to change thousands of rows of data to be in mmddyyyy format.
Change to date type using str_to_date and then change again to string using date_format.
UPDATE table_name
SET date = DATE_FORMAT( STR_TO_DATE( date, '%Y%m%d' ), '%m%d%Y' )
The value 20140911 when converted from yyyymmdd to mmddyyyy format, will retain the leading 0 as 09112014.
Bonus question: Would it cause an issue while trying to convert this column if there is data such as 91212 for 09/12/2012 or something like 1381 which is supposed to be 08/01/2013
You can use str_to_date( '91212', '%c%e%y' ) to try to convert such a value to a valid date. But although MySQL is documented to support single-digit month and day numbers, it does not parse a value like this correctly and returns NULL.
mysql> select str_to_date( '91212', '%c%e%y' ) s1, str_to_date( '091212', '%c%e%y' ) s2;
+------+------------+
| s1 | s2 |
+------+------------+
| NULL | 2012-09-12 |
+------+------------+
1 row in set, 1 warning (0.00 sec)
mysql> show warnings;
+---------+------+------------------------------------------------------------+
| Level | Code | Message |
+---------+------+------------------------------------------------------------+
| Warning | 1411 | Incorrect datetime value: '91212' for function str_to_date |
+---------+------+------------------------------------------------------------+
1 row in set (0.00 sec)
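The same parse-then-format round trip, along with a guard for the malformed values from the bonus question, can be sketched in Python. This is an illustration only, not a replacement for the UPDATE statements above; the function name and the decision to reject anything that is not exactly eight digits are assumptions made for this example.

```python
from datetime import datetime


def yyyymmdd_to_mmddyyyy(value: str) -> str:
    """Reformat a 'YYYYMMDD' string as 'MMDDYYYY', rejecting anything else."""
    # Short values such as '91212' or '1381' are ambiguous, so refuse to guess.
    if len(value) != 8 or not value.isdigit():
        raise ValueError(f"not a YYYYMMDD string: {value!r}")
    # strptime also rejects impossible dates such as month 13.
    parsed = datetime.strptime(value, "%Y%m%d")
    return parsed.strftime("%m%d%Y")


for raw in ("20140911", "20140101", "20140829"):
    print(raw, "->", yyyymmdd_to_mmddyyyy(raw))
# 20140911 -> 09112014
# 20140101 -> 01012014
# 20140829 -> 08292014
```

Rows that raise ValueError are probably best collected and fixed by hand, since a value like 1381 cannot be mapped to a date without knowing how it was originally written.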