ERROR 1070: Could not resolve ToDate using imports - apache-pig

Following are the details:
date2.txt
B02617,2/27/2015,1551,14677
B02598,2/27/2015,1114,10755
B02512,2/27/2015,272,2056
B02764,2/27/2015,4253,38780
pig-script:
A = Load '/files/date2.txt' using PigStorage(',') as (base:chararray, tripdate:chararray, cars:int, tripkms:int);
B = FOREACH A GENERATE tripdate;
C = FOREACH B GENERATE ToDate(tripdate,'yyyy-MM-dd') as mytripdate;
This is the error I am getting:
main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1070: Could not resolve ToDate using imports: [, org.apache.pig.builtin., org.apache.pig.impl.builtin.]

The input date format is MM/dd/yyyy, not yyyy-MM-dd, so the pattern passed to ToDate must match it:
C = FOREACH B GENERATE ToDate(tripdate,'MM/dd/yyyy') as mytripdate;
If you want the date in 'yyyy-MM-dd' format, wrap the result in ToString():
C = FOREACH B GENERATE ToString(ToDate(tripdate,'MM/dd/yyyy'),'yyyy-MM-dd') as mytripdate;
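
For readers who want to check the pattern logic outside Pig, here is a minimal Python sketch of the same two steps: parse the string with a pattern that matches the input, then format the result with the desired output pattern. The function name reformat_trip_date is hypothetical, and Python's %m/%d/%Y and %Y-%m-%d directives stand in for the MM/dd/yyyy and yyyy-MM-dd patterns; it illustrates the idea and is not Pig's own implementation.

```python
from datetime import datetime


def reformat_trip_date(text):
    """Parse a month/day/year string and return it as year-month-day.

    Hypothetical helper: it mirrors ToString(ToDate(...)) from the answer
    above, using Python's strptime/strftime directives instead of Pig patterns.
    """
    parsed = datetime.strptime(text, "%m/%d/%Y")  # input pattern must match the data
    return parsed.strftime("%Y-%m-%d")            # output pattern is chosen freely


# Value taken from date2.txt in the question.
print(reformat_trip_date("2/27/2015"))
```

Parsing the same value with a year-month-day input pattern raises an error in this sketch, because the pattern does not describe the text.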

DataWeave: Unable to obtain ZonedDateTime from TemporalAccessor when parsing a String to DateTime

I have a problem with DataWeave 2 transformation. I have:
var parseDate = (dateStr) -> dateStr as DateTime {format: "yyyy-MM-dd"}
But when I am running this code I get:
Caused by: org.mule.runtime.api.el.ExpressionExecutionException:
Cannot coerce String (2019-03-26) to DateTime, caused by: Text
'2019-03-26' could not be parsed: Unable to obtain ZonedDateTime from
TemporalAccessor: {},ISO resolved to 2019-03-26 of type
java.time.format.Parsed
I am using DateTime because it is detected as such when creating metadata. But the class itself has a LocalDate bookingDate; field. The problem is that when I try to use LocalDate, I get an error:
Unable to resolve reference of: `LocalDate`.
What can I do about this problem? Can I parse it correctly somehow? Or what can I do about the LocalDate problem mentioned above?
As your input string has only the date part, you can use the following DataWeave expression:
var parseDate = (dateStr) -> dateStr as Date {format: "yyyy-MM-dd"}
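
As an analogy only (this is Python, not DataWeave), the sketch below shows the same distinction: a string with just a date part maps cleanly to a date value, while a pattern that also expects a time zone offset cannot be satisfied by it. The name parse_booking_date is hypothetical.

```python
from datetime import datetime


def parse_booking_date(text):
    """Parse a year-month-day string into a date with no time or zone part."""
    return datetime.strptime(text, "%Y-%m-%d").date()


booking = parse_booking_date("2019-03-26")
print(booking.year, booking.month, booking.day)

# Asking for a zone offset that the text does not contain fails,
# which is loosely comparable to the coercion error in the question.
try:
    datetime.strptime("2019-03-26", "%Y-%m-%d%z")
except ValueError as exc:
    print("cannot parse as zoned date-time:", exc)
```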

Identifying columns through PiG

I have a data set like the one below:
"column,1A",column2A,column3A
"column,1B",column2B,column3B
"column,1C",column2C,column3C
"column,1D",column2D,column3D
What separator should I use in this case to separate out the 3 columns above?
First column value is => Column,1A
Second column value is => Column2A
Third column value is => Column3A
Try this code:
a = LOAD '/home/hduser/pig_ex' USING PigStorage(',') AS (col1,col2,col3,col4);
b = FOREACH a GENERATE REGEX_EXTRACT(col1, '^\\"(.*)', 1) AS (modfirstcol),REGEX_EXTRACT(col2, '^(.*)\\"', 1) AS (modsecondcol),col3,col4;
c = foreach b generate CONCAT($0, CONCAT(',', $1)), $2 , $3;
dump c;
I was able to resolve it using the steps below:
Input:-
"column,1A",column2A,column3A
"column,1B",column2B,column3B
"column,1C",column2C,column3C
"column,1D",column2D,column3D
PiG Script :-
A = load '/home/hduser/pig_ex' AS line;
B = FOREACH A GENERATE FLATTEN(STRSPLIT(line,'\\,',4)) AS (firstcol:chararray,secondcol:chararray,thirdcol:chararray,forthcol:chararray);
C = FOREACH B GENERATE REGEX_EXTRACT(firstcol, '^\\"(.*)', 1) AS (modfirstcol),REGEX_EXTRACT(secondcol, '^(.*)\\"', 1) AS (modsecondcol),thirdcol,forthcol;
D = FOREACH C GENERATE CONCAT(modfirstcol,',',modsecondcol),thirdcol,forthcol;
DUMP D;
Output :-
(column,1A,column2A,column3A)
(column,1B,column2B,column3B)
(column,1C,column2C,column3C)
(column,1D,column2D,column3D)
Please let me know if there is a better way.
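
The script above splits on every comma and then joins the first two pieces back together. For comparison, here is a minimal sketch of the alternative idea, a quote-aware parser that treats a comma inside double quotes as data. It uses Python's standard csv module rather than Pig, and the names SAMPLE and parse_quoted_rows are hypothetical.

```python
import csv
import io

# First two rows of the input shown in the question.
SAMPLE = (
    '"column,1A",column2A,column3A\n'
    '"column,1B",column2B,column3B\n'
)


def parse_quoted_rows(text):
    """Split comma-separated text into rows, keeping quoted commas intact."""
    return [tuple(row) for row in csv.reader(io.StringIO(text))]


rows = parse_quoted_rows(SAMPLE)
for row in rows:
    print(row)
```

Each row comes back with exactly three fields, and the first field keeps its embedded comma without any regex cleanup.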

Regex to extract first part of string in Apache Pig

I need to extract the postcode district from the input data below:
AB55 4
DD7 6LL
DD5 2HI
My Code
A = load 'data' as postcode:chararray;
B = foreach A {
    code_district = REGEX_EXTRACT(postcode,'<SOME EXP>',1);
    generate code_district;
};
dump B;
Output should look like
AB55
DD7
DD5
What regular expression should I use to extract the first part of the string?
Can you try the regex below?
Option1:
A = LOAD 'input' as postcode:chararray;
code_district = FOREACH A GENERATE REGEX_EXTRACT(postcode,'(\\w+).*',1);
DUMP code_district;
Option2:
A = LOAD 'input' as postcode:chararray;
code_district = FOREACH A GENERATE REGEX_EXTRACT(postcode,'([a-zA-Z0-9]+).*',1);
DUMP code_district;
Output:
(AB55)
(DD7)
(DD5)
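
To check the pattern from Option1 outside Pig, here is a small Python sketch run against the three sample postcodes. The pattern is written once with a single backslash because Python raw strings do not need the doubled escape used in the Pig script; the names DISTRICT and code_district are hypothetical.

```python
import re

# Same pattern as Option1: capture the leading run of word characters.
DISTRICT = re.compile(r"(\w+).*")


def code_district(postcode):
    """Return the first whitespace-delimited token of a postcode, or None."""
    match = DISTRICT.match(postcode)
    return match.group(1) if match else None


for postcode in ["AB55 4", "DD7 6LL", "DD5 2HI"]:
    print(code_district(postcode))
```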

Piggybank running total: Sum Over()

I am using the following Pig script to calculate a running total (Pig local mode):
Register /home/ec2-user/pig*/bin/piggybank-0.12.0.jar ;
define Sum org.apache.pig.piggybank.evaluation.int.Sum();
define Over org.apache.pig.piggybank.evaluation.Over();
define Stitch org.apache.pig.piggybank.evaluation.Stitch();
A = load '/home/ec2-user/staff_data.csv' using PigStorage(',') as (id:int, name:chararray, salary:int, department:chararray);
B = group A by department;
C = foreach B {
    C1 = order A by salary;
    generate flatten(Stitch(C1, Over(C1.department, 'Sum(C1.salary)')));
};
However, I am getting the following error:
Unknown aggregate Sum(C1.salary)
Does anyone have any ideas?
Edit:
Figured out the answer myself. Here it is:
Register /home/ec2-user/pig*/bin/piggybank-0.12.0.jar ;
define Over org.apache.pig.piggybank.evaluation.Over();
define Stitch org.apache.pig.piggybank.evaluation.Stitch();
A = load '/home/ec2-user/staff_data.csv' using PigStorage(',') as (id:int, name:chararray, salary:int, department:chararray);
B = group A by department;
C = foreach B {
    C1 = order A by salary;
    generate flatten(Stitch(C1, Over(C1.salary, 'sum(int)')));
};
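
As a reference for the computation the question is after, here is a minimal Python sketch of a per-department running total: group by department, order each group by salary, and carry a cumulative sum that is appended to each row. The rows in STAFF are made up for illustration (the original staff_data.csv is not shown), and running_totals is a hypothetical name; this does not reproduce how Over and Stitch work internally.

```python
from itertools import accumulate, groupby
from operator import itemgetter

# Hypothetical rows in the (id, name, salary, department) layout of the script.
STAFF = [
    (1, "Ann", 300, "sales"),
    (2, "Bob", 100, "sales"),
    (3, "Cid", 200, "ops"),
    (4, "Dee", 500, "ops"),
    (5, "Eve", 200, "sales"),
]


def running_totals(rows):
    """Append a per-department cumulative salary to each row."""
    result = []
    # Sort by department, then salary, so groupby sees contiguous groups.
    ordered = sorted(rows, key=itemgetter(3, 2))
    for _, group in groupby(ordered, key=itemgetter(3)):
        members = list(group)
        totals = accumulate(member[2] for member in members)
        # Pair each row with its running total, similar in spirit to Stitch.
        result.extend(member + (total,) for member, total in zip(members, totals))
    return result


for row in running_totals(STAFF):
    print(row)
```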

STR DATE with french format sql

I have some dates in my db which look like this:
FCOL_DATE = 31/08/2007 00:00
I'm trying to run this query, but I receive this error:
Parse error: syntax error, unexpected '%' in
$from1 = "01/08/2013";
$to1 = "31/08/2013";
foreach ($query=$db->query("SELECT
t1.FCOL_ID, t1.FCOL_FRN_ID, t1.FCOL_DATE, t1.FCOL_PERIODE,
t2.STK_IDFACTCOLISAGE
FROM colisage t1
INNER JOIN mvt_stock t2
ON t1.FCOL_ID = t2.STK_IDFACTCOLISAGE
WHERE STR_TO_DATE(FCOL_DATE, "%d/%m/%Y") BETWEEN $from1 AND $to1 AND t1.FCOL_FRN_ID = $frnId") AS $donnees):