I'm unable to parse a date in Pig.
The date format is Mon, 10/11/10 01:02 PM
I load the data using the following command:
data = load 'CampaignData.csv' using PigStorage(';');
Next I generate the date column as a chararray using the following command:
date_data = foreach data generate (chararray) $272 as dates;
When I dump date_data I get the following output:
Mon
How do I get the complete date?
You don't need $272 to convert the date to a datetime object. The value appears to be split at the comma after Mon, so the date and time end up in $273. You can simply do this:
date_data = foreach data generate ToDate($273, ' MM/dd/yy hh:mm aaa');
Just make sure $273 is a chararray and that there is a space at the start of the format string passed to ToDate above. The space is needed only so that the format string matches the data exactly as it appears after the row is split on the comma delimiter.
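To see why the leading space matters, here is a minimal sketch in Python rather than Pig (so the pattern letters differ from ToDate's). It assumes the row is split on commas, as the answer describes, and that the date is month-first as in the MM/dd/yy pattern.

```python
from datetime import datetime

raw = "Mon, 10/11/10 01:02 PM"

# Splitting on the comma mirrors what a comma delimiter does to the value.
fields = raw.split(",")
day_name, date_part = fields[0], fields[1]

# The second field keeps its leading space: " 10/11/10 01:02 PM".
# Month-first order is an assumption taken from the answer's MM/dd/yy pattern.
parsed = datetime.strptime(date_part, " %m/%d/%y %I:%M %p")

print(repr(day_name), repr(date_part), parsed)
```

The first field is all that a cast of $272 can return, which is why the dump shows only Mon.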
I started with a date in string format from a JSON extraction using json_value(answer, '$.date_created') and got the output 2020-01-02T10:26:47.056-04:00.
Because I couldn't convert that output to a date directly, I transformed it with a series of functions such as regexp_replace, left and CAST into a date-like string: 2020-01-02 10:26:47
I need to be able to transform this new string into a date. So far I've tried FORMAT_DATETIME and FORMAT_TIMESTAMP, but I'm getting an error: Failed to parse input string
Your original timestamp string is just fine to do this:
select Date(timestamp("2020-01-02T10:26:47.056-04:00"))
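The only thing to check here is the time zone: the string carries a -04:00 offset from UTC, and DATE() on a timestamp returns the UTC date unless you pass a time zone as its second argument. As long as you account for that, the above should work fine.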
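The effect of the offset can be sketched outside BigQuery. The Python below is only an illustration of the idea, not BigQuery's implementation; the helper names and the second, late-evening timestamp are made up for the example.

```python
from datetime import datetime, timezone

def utc_date(iso_string):
    """Date of the instant after converting it to UTC."""
    ts = datetime.fromisoformat(iso_string)
    return ts.astimezone(timezone.utc).date()

def local_date(iso_string):
    """Date as written in the string, in its own offset."""
    return datetime.fromisoformat(iso_string).date()

original = "2020-01-02T10:26:47.056-04:00"
late_evening = "2020-01-02T22:26:47.056-04:00"  # hypothetical value

print(utc_date(original), local_date(original))
print(utc_date(late_evening), local_date(late_evening))
```

For the original value both dates agree; for a value late in the evening at -04:00 the UTC date is already the next day.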
I am trying to convert a string value to a date. The string has the format yyyy-MM-dd. But when I try to convert it using Select values (in the Meta-data tab I selected the fieldname, type = Date and currency = dd/MM/yyyy), I get this error:
String : couldn't convert string [2017-01-30] to a date using format [yyyy/MM/dd HH:mm:ss.SSS] on offset location 4
If I do it in a Calculator step (create a new field, Final_date, as a copy of field A; in Field A put the name of the input string; data type is Date, and for Conversion mask choose the yyyy-MM-dd format, which you can type in yourself instead of picking from the dropdown menu), I get the same error.
I am using Pentaho Data Integration 9.
I am trying to convert a string in yyyy-MM-dd format to a date in dd/MM/yyyy format. In this case, how do I convert the string to a date?
When converting from string to date you specify the source format that the string is using, so in this case yyyy-MM-dd. That should be in the format selection list, but you can also manually type in any format needed.
Once the field is in date format, it will be correctly output to most database types. For files, you can define the new format (dd/MM/yyyy) in the output step, such as Text File Output or Excel Writer. Alternatively, you can convert the date back into a string with the desired format using Select Values.
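The two-step idea (parse with the source format, then format with the target format) looks like this as a small Python sketch. It is not Pentaho code; the function name and the Python format codes are stand-ins for the yyyy-MM-dd and dd/MM/yyyy masks.

```python
from datetime import datetime

def convert_date_string(value, source_format="%Y-%m-%d", target_format="%d/%m/%Y"):
    """Parse with the format the string is in, then render the target format."""
    parsed = datetime.strptime(value, source_format)
    return parsed.strftime(target_format)

def parses(value, fmt):
    """Return True if value matches fmt, False otherwise."""
    try:
        datetime.strptime(value, fmt)
        return True
    except ValueError:
        return False

print(convert_date_string("2017-01-30"))
# Parsing with the target mask instead of the source mask fails.
print(parses("2017-01-30", "%d/%m/%Y"))
```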
I'm working with Data Factory at the moment, which is why I'm asking a lot of questions about it.
My new problem: my source (a CSV file) contains a column DeleveryDate full of dates in dd/MM/YYYY format, and in my SQL table I defined DeleveryDate as DateTime. When I map between source and sink, the source data preview
shows duplicate columns, as in the picture below, but in the sink data preview the column is always NULL, and it is NULL in my table as well.
Thanks
You said the column DeleveryDate is full of dates in dd/MM/YYYY format. Can you tell me why the column has values like '3' and '1' in your screenshot? The strings '3' and '1' are not date strings in dd/MM/YYYY format.
If you want to do data conversion in Data Factory, I still suggest you learn more about Data Flow.
For now, we cannot convert a date from dd/MM/YYYY to datetime yyyy-MM-dd HH:mm:ss.SSS directly; some intermediate conversion is needed.
Look at the example below. I have a CSV file containing a column of dd/MM/YYYY date strings, and I am again using DerivedColumn:
Add DerivedColumn:
First, use the expression below to take substrings and rearrange dd/MM/YYYY into YYYY-MM-dd:
substring(Column_2, 7, 4)+'-'+substring(Column_2, 4, 2)+'-'+substring(Column_2, 1,2)
Then use toTimestamp() to convert it:
toTimestamp(substring(Column_2, 7, 4)+'-'+substring(Column_2, 4, 2)+'-'+substring(Column_2, 1,2), 'yyyy-MM-dd')
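The same rearrangement can be checked quickly in Python. This is only a sketch of the logic, not a data flow expression: the substring helper is written here to mimic the 1-based position and length arguments used above, and the sample value is made up.

```python
from datetime import datetime

def substring(text, start, length):
    """1-based substring, mimicking substring(column, start, length) above."""
    return text[start - 1:start - 1 + length]

def to_iso_date(value):
    """Rearrange dd/MM/YYYY into YYYY-MM-dd using the same three slices."""
    return substring(value, 7, 4) + "-" + substring(value, 4, 2) + "-" + substring(value, 1, 2)

sample = "25/12/2020"  # hypothetical dd/MM/YYYY value
iso = to_iso_date(sample)
timestamp = datetime.strptime(iso, "%Y-%m-%d")

print(iso, timestamp)
```

Note that the slices only line up when day and month are zero-padded to two digits; a value such as '3' cannot be rearranged this way.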
Sink settings and preview:
My sink table column tt has the data type datetime:
Execute the pipeline:
Check the data in sink table:
Hope this helps.
This was a blocker for me too. Please try this:
Go to sink
Mapping
Click on output format
Select the data format or time format you prefer to store the data into the sink.
I am new to Pig and I have sample test data of 500 KB which I need to multiply several times to make the file bigger for testing. A single row in my data is as follows:
( card_description:chararray,
transaction_date:chararray,
merchant_name:chararray,
merchant_city:chararray,
transaction_amount:float
) ;
I want to simply change the transaction_amount and transaction_date for each row several times and then join all the results to make a single big file.
I am stuck in trying to change the transaction_date.
The date value in the file is
27/05/2010 00:00
r1 = FOREACH data GENERATE card_description,ToDate(transaction_date),merchant_name,merchant_city,
ROUND(RANDOM()*5)*transaction_amount;
result =union data,r1;
In order to alter the transaction date I want to use the AddDuration function, but in trying to convert the chararray to a date I am running into format-related issues and can't work out the solution.
Can someone guide me?
After checking the ways you can invoke ToDate, you are currently invoking it as one of:
ToDate(milliseconds)
ToDate(isostring)
Your value is not in milliseconds, nor does it follow the ISO 8601 format. You should be invoking it like:
ToDate(userstring, format)
Where format is a pattern string that follows these rules.
Therefore, ToDate should be called like:
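-- For a 12hr clock (hours 01-12, so the sample value 00:00 would not parse)
ToDate(transaction_date, 'dd/MM/yyyy hh:mm')
-- For a 24hr clock, which matches the sample value 27/05/2010 00:00
ToDate(transaction_date, 'dd/MM/yyyy HH:mm')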
For AddDuration, remember that the second parameter you provide to it must be a string in the ISO 8601 duration format, for example 'P1D' for one day or 'PT1H' for one hour.
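As a sanity check on the pattern, here is a rough Python analogue (not Pig) for the sample value 27/05/2010 00:00. The format codes are Python's, and timedelta(days=1) stands in for adding a one-day duration; both are stand-ins chosen for this sketch.

```python
from datetime import datetime, timedelta

value = "27/05/2010 00:00"

def parses(text, fmt):
    """Return True if text matches fmt, False otherwise."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False

# Day-first pattern with a 24-hour clock matches the sample value.
parsed = datetime.strptime(value, "%d/%m/%Y %H:%M")

# Adding one day, the counterpart of a one-day duration.
shifted = parsed + timedelta(days=1)

print(parsed, shifted)
print(parses(value, "%Y/%m/%d %H:%M"))  # year-first pattern does not match
print(parses(value, "%d/%m/%Y %I:%M"))  # 12-hour clock rejects hour 00
```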
I am trying to import a file of records into a table and am getting
Rejected - Error on table FUNDPRICE_TEST, column FUNDDATE.
ORA-01843: not a valid month
I did some research and read that Oracle by default expects dates formatted yyyymmdd. Clearly this would throw an error when the first day value (the first record in my case) is more than 12.
How can I format the date in my exported csv file from mm/dd/yyyy 00:00:00 to this yyyymmdd format?
This is of course assuming this is my issue. I could be barking up the wrong tree.
Here is the ctl file after trying date formatting:
load data
infile 'commands/FundPriceDataNEW2.txt'
into table FundPrice
fields terminated by "," optionally enclosed by '"'
( FUND, FUNDDATE DATE "mm/dd/yyyy HH24:MI:SS", FAV, FUNDVALUE )
Here are a few records from the FundPriceDataNEW2.txt:
16,9/30/1999 0:00:00,"999999",9.64
16,10/31/1999 0:00:00,"999999",10.06
16,11/30/1999 0:00:00,"999999",10.40
The order is fund, funddate, fav, fundvalue. The error is on the date, saying the format is invalid.
FUNDDATE DATE "mm/dd/yyyy HH24:MI:SS"
In your control file, try adding the above format for your date column.
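To check that a month-first mask really fits the sample records, and what the yyyymmdd form would look like, here is a small Python sketch. It is not SQL*Loader; the format codes are Python's equivalents of mm/dd/yyyy HH24:MI:SS and the function names are made up.

```python
from datetime import datetime

def to_yyyymmdd(value):
    """Parse a month-first value such as 9/30/1999 0:00:00 and render yyyymmdd."""
    parsed = datetime.strptime(value, "%m/%d/%Y %H:%M:%S")
    return parsed.strftime("%Y%m%d")

def parses_day_first(value):
    """Return True if the value also parses when read as day/month/year."""
    try:
        datetime.strptime(value, "%d/%m/%Y %H:%M:%S")
        return True
    except ValueError:
        return False

samples = ["9/30/1999 0:00:00", "10/31/1999 0:00:00", "11/30/1999 0:00:00"]
for sample in samples:
    print(sample, to_yyyymmdd(sample), parses_day_first(sample))
```

Reading these values day-first fails because 30 and 31 are not valid months, which is the same kind of mismatch ORA-01843 reports.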