ToDate function provides unexpected output - apache-pig

I used the ToDate(userinput, format) function to convert my chararray field. Specifically, I used ToDate(userinput, 'MM/dd/yyyy') to convert the field from chararray to date, but the output is not what I expected.
Here is the code:
l_dat = load 'textfile' using PigStorage('|') as (first:chararray,last:chararray,dob:chararray);
c_dat = foreach l_dat generate ToDate(dob,'MM/dd/yyyy') as mydate;
describe c_dat;
dump c_dat;
The data looks like this:
(firstname1,lastname1,02/02/1967)
(John,deloy,05/26/1967)
(frank,fun,05/18/1967)
Output looks like this:
c_dat: {mydate: datetime}
(1967-05-26T00:00:00.000-04:00)
(1967-05-18T00:00:00.000-04:00)
(1967-02-02T00:00:00.000-05:00)
The output I was expecting was date objects with data as shown below:
(05/26/1967)
(05/18/1967)
(02/02/1967)
Please advise: am I doing anything wrong?

According to http://pig.apache.org/docs/r0.12.0/func.html#to-date, the return type of the ToDate function is a DateTime object. You can see that in the schema description in your output:
c_dat: {mydate: datetime}
If the date is already in the required format, you do not need any conversion.
c_dat = foreach l_dat generate dob as mydate;
If you want to convert the chararray date to another format, use the ToString() function after getting the DateTime object.
Step 1: Convert the date chararray to a DateTime object using ToDate(datestring, inputformat).
Step 2: Use ToString(DateTime object, requiredformat) to get the date as a string in the required format.
Both steps can be combined in a single expression, as below.
ToString(ToDate(date,inputformat),requiredformat);
Ref : http://pig.apache.org/docs/r0.12.0/func.html#to-string for details.
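The same parse-then-format idea can be sketched outside Pig. The Python snippet below is only an illustration of the two steps, not Pig code: reformat_date is a hypothetical helper name, and Python uses strptime/strftime directives (%m, %d, %Y) rather than the Joda-style patterns that Pig's ToDate and ToString take.

```python
from datetime import datetime


def reformat_date(text, input_format, output_format):
    """Hypothetical helper: parse a date string, then re-emit it in another format."""
    # Step 1: string -> datetime object (the analogue of ToDate)
    parsed = datetime.strptime(text, input_format)
    # Step 2: datetime object -> string in the required format (the analogue of ToString)
    return parsed.strftime(output_format)


print(reformat_date("05/26/1967", "%m/%d/%Y", "%Y-%m-%d"))  # 1967-05-26
```

The intermediate value is a real date object, which is why printing it directly shows a full timestamp instead of the original string.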

Related

Prisma queryRaw returning date as string

Issue: When I use prisma.$queryRaw, it returns my date as a string, even though I specify the query's return type. If I use prisma.find then it returns it correctly. But, I have to use queryRaw because of the complexity of the query.
schema.prisma has the date defined like such:
effectiveDate DateTime? @map("effective_date") @db.Date
So, the model object has the field defined like effectiveDate: Date | null
The query looks something like this:
const catalogCourses: CatalogCourse[] = await prisma.$queryRaw<CatalogCourse[]>`
SELECT
id,
campus,
effective_date as "effectiveDate",
...rest of the query omitted here because it's not important
If I then do something like
console.log(`typeof date: ${typeof catalogCourses[0].effectiveDate}, value ${catalogCourses[0].effectiveDate}`)
The result shows typeof date: string, value 2000-12-31. Why isn't it a date? I need to be able to work with it as a Date, but if I call effectiveDate.getTime(), for example, it fails at runtime with 'getTime is not a function'. I thought new Date(effectiveDate) didn't work either, because TypeScript already sees the field as a Date object. EDIT: I was incorrect about why the previous statement wasn't working; doing new Date(effectiveDate) does work.
I do see in the prisma docs that it says:
Type caveats when using raw SQL When you type the results of
$queryRaw, the raw data does not always match the suggested TypeScript
type.
Is there a way for queryRaw to return my date as a Date object?

datetime, string format in Python/Pandas

New to Python/Pandas. Just wondering if it is correct to assume the following are the same:
pd.to_datetime(str(20061231), format='%Y%m%d')
and
pd.Timestamp('2006-12-31')
also
start_date = pd.to_datetime(str(20061231), format='%Y%m%d')
str(20061231) of course produces a string, and is the same as start_date.strftime('%Y%m%d')
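One way to check the assumption is to compare the values directly. This is a minimal sketch that assumes pandas is installed; same_day is a name introduced here for illustration.

```python
import pandas as pd

# Parse the integer-like string with an explicit format
start_date = pd.to_datetime(str(20061231), format="%Y%m%d")
# Build the timestamp from an ISO-style string (name is illustrative)
same_day = pd.Timestamp("2006-12-31")

# Both are pandas Timestamp objects for the same instant
print(type(start_date).__name__)  # Timestamp
print(start_date == same_day)     # True

# Formatting the timestamp back gives the original digit string
print(start_date.strftime("%Y%m%d") == str(20061231))  # True
```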

Handling Null DataType

I'm using the Over function from Piggybank to get the Lag of a row
res= foreach (group table by fieldA) {
Aord = order table by fieldB;
generate flatten(Stitch(Aord, Over(Aord.fieldB, 'lag'))) as (fieldA,fieldB,lag_fieldB) ;}
This works correctly, and when I do a dump I get the expected result. The problem is that when I want to use lag_fieldB for any comparison or transformation, I get datatype issues.
If I do a describe, it returns fieldA: long,fieldB: chararray,lag_fieldB: NULL
I'm new to Pig, but I have already tried casting to chararray and using ToString(), and I keep getting errors like these:
ERROR 1052: Cannot cast bytearray to chararray
ERROR 1051: Cannot cast to bytearray
Thanks for your help
OK, after looking into the code of the Over function, I found that you can instantiate the Over class with an argument that sets the return type. What worked for me was:
DEFINE ChOver org.apache.pig.piggybank.evaluation.Over('chararray');
res= foreach (group table by fieldA) {
Aord = order table by fieldB;
generate flatten(Stitch(Aord, ChOver(Aord.fieldB, 'lag'))) as (fieldA,fieldB,lag_fieldB) ;}
Now the describe is telling me
fieldA: long,fieldB: chararray,lag_fieldB: chararray
And I'm able to use the columns as expected, hope this can save some time for someone else.

pig ToDate function not working properly

I am trying to cast a field with the ToDate function.
raw_data = LOAD '/user/cloudera/Chicago_Traffic_Tracker_- _Historical_Congestion_Estimates_by_Region.csv' USING PigStorage(',') AS ( TIME :chararray,REGION_ID:int,BUS_COUNT:int,NUMBER_OF_READS:int,SPEED:double);
raw_clean = FOREACH raw_data GENERATE ToDate(raw_data.TIME,'yyyy/MM/dd HH:mm:ss')as date_time:DateTime ;
I get the below error
Scalar has more than one row in the output. 1st :
(01/29/2015 01:40:35 PM,22,33,429,25.23), 2nd :(01/05/2015 01:10:46 PM,18,58,1058,21.14)
Input
01/29/2015 01:40:35 PM,22,33,429,25.23,a61e11c83f811b63e1dc64362f799dcac322fca8
01/05/2015 01:10:46 PM,18,58,1058,21.14,39c63427d0e1401a06f967fd43c30e291140c26e
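I haven't tested this, but your input date is in the format 01/29/2015 01:40:35 PM, i.e. MM/dd/yyyy hh:mm:ss a, whereas you have specified 'yyyy/MM/dd HH:mm:ss'.
Also, inside FOREACH raw_data, refer to the field as TIME rather than raw_data.TIME. The raw_data.TIME form is treated as a scalar projection of the whole relation, which is what raises the "Scalar has more than one row in the output" error.
Try something like:
raw_clean = FOREACH raw_data GENERATE ToDate(TIME,'MM/dd/yyyy hh:mm:ss a') AS date_time;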

how to convert a string to date format using dataweave

I am performing a CSV to CSV transformation using DataWeave.
One of the Input fields is a string 13/01/2015. My requirement is to convert this string to a date format as 13-Jan-2015.
I have tried using as :string{format:"dd/MMM/yyyy"} and as :date{format:"dd/M/yyyy"}, but did not succeed in changing the format.
Here is what I tried:
payload map {
"Order Number":$[0],
"Order Date": ($[1] as :date{format:"d/M/yyyy"}),
}
This conversion gave the output as
Order Number,Order Date
14710655,2015-08-17
Then I tried the following:
payload map {
"Order Number":$[0],
"Order Date": (($[1] as :date{format:"d/M/yyyy"}) as :string{format:"d/MMM/yyyy"})
}
This conversion gave the output as
Order Number,Order Date
14710655,17/8/2015
When I tried :string { format: "dd-MMM-YYYY"}, it gave me 13-Jan-2015.