I'm using CurrentTime(), which returns a datetime. However, I need it as a chararray. I have the following:
A = LOAD ...
B = FOREACH A GENERATE CurrentTime() AS todaysDate;
I've tried various approaches, such as the following:
B = FOREACH A GENERATE (chararray)CurrentTime() AS todaysDate;
However, I always get ERROR 1052: Cannot cast datetime to chararray.
Does anyone know how I can do this? By the way, I'm very new to Pig. Thanks in advance!
I had a similar issue and didn't want to write a custom UDF as described in the other answer. I'm pretty new to Pig, but this seems too basic an operation to justify a UDF. This command works great for me:
B = FOREACH A GENERATE ToString(yourdatetimeobject, 'yyyy-MM-dd\'T\'HH:mm:ssz') AS yourfieldname;
You can select the format you want by looking at the SimpleDateFormat javadoc
You need to create a custom UDF that does the conversion (e.g., see the CurrentTime() implementation). Alternatively, you may check out my answer on a similar topic for workarounds.
If you are on AWS, then use their DATE_TIME UDF.
I have the following BigQuery code:
CREATE TEMP FUNCTION to_struct_attributes(input STRING)
RETURNS STRUCT<status_code STRING, created_time TIMESTAMP>
LANGUAGE js AS """
let res = JSON.parse(input);
res['created_time'] = Date(res['created_time'])
return res;
""";
SELECT
5 AS ID,
to_struct_attributes(
TO_JSON_STRING(
STRUCT(
TIMESTAMP(PARSE_TIMESTAMP('%Y%m%d%H%M%S', '20220215175959','America/Los_Angeles')) AS created_time
)
)
) AS ATTRIBUTES;
When I execute this, I'm getting the following error:
Failed to coerce output value "2022-02-16 01:59:59+00" to type TIMESTAMP
I feel this is quite strange, since BigQuery should be able to interpret it correctly and I haven't had this issue with any other datatypes. Also, if I do:
SELECT TIMESTAMP("2022-02-16 01:59:59+00")
It returns:
2022-02-16 01:59:59 UTC
So BigQuery can indeed parse it correctly. I'm not sure why it doesn't happen for the UDF. On searching the internet, I found this question and as the answer suggests, if I change the return statement to:
return Date(res.created_time);
It resolves the issue. But for a project of mine, doing it for every timestamp is not feasible due to the high number of struct columns.
So, does anyone have a better alternative?
PS: I have removed a lot of non-essential parts from the above example, so it might look a bit abstract. The actual use case is different and more complex, which is why I need the JS UDF.
The best way to do what you want is to implement the following code.
return Date(res.created_time);
This happens because a TIMESTAMP passed to a JavaScript UDF is represented as a Date object, as stated in the documentation. The same applies in the other direction: to return a TIMESTAMP from a JavaScript UDF, you need to construct and return a Date object.
I am struggling a bit as I am new to programming. I am currently writing a Python script and I am a bit stuck. The goal is to parse some spatial information that gets pulled from SQL into a format that is usable by my script down the line.
I was able to CAST through a SQL query and fetchall using the odbc module. However, once I fetch the data, that is where it gets tricky for me. Here is an example of a print from the fetchall:
[(u'POLYGON ((7014.186279296875 6602.99658203125 1612.5, 7015.984375 6600.416015625 1612.5))',), (u'POLYGON ((6730.962646484375 6715.2490234375 1522.5, 6730.0869140625 6714.13916015625 1522.5))',)]
I am not exactly sure what I am getting here; it looks like a list of tuples. I have tried converting it to a list of lists, but there must be something I am missing.
Here is the usable format I am looking for:
[[7014.186279296875, 6602.99658203125, 1612.5], [7015.984375, 6600.416015625, 1612.5]]
[[6730.962646484375, 6715.2490234375, 1522.5], [6730.0869140625, 6714.13916015625, 1522.5]]
Any ideas on how I can accomplish this? Maybe there is a better way to CAST in SQL, or a Python module that would be easier to use than doing a cursor.fetchall() and parsing? Any parsing help would be useful too. Thanks.
If you want to parse it yourself, that should be straightforward. For the example you've provided, the following code would do it:
result = []
for element in data:
    single_elements = element[0][10:-2].split(', ')
    for se in single_elements:
        row = str(se).split(' ')
        result.append([float(a) for a in row])
Result will contain what you need. If parsing is not an option, then paste some of your code so I can see how you're fetching data.
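Note that the loop above collects the points of all polygons into one flat list. If you want one list per polygon, as in your desired output, here is a minimal sketch. The function name parse_polygon_rows is made up for illustration, and it assumes every row is a 1-tuple holding a single-ring 'POLYGON ((x y z, ...))' string like the ones in your printout; polygons with holes or other geometry types would need more work.

```python
import re


def parse_polygon_rows(rows):
    """Hypothetical helper: turn fetchall() rows into one point list per polygon.

    Assumes each row is a 1-tuple containing 'POLYGON ((x y z, x y z, ...))'
    with a single ring, as in the sample output from the question.
    """
    polygons = []
    for (wkt,) in rows:
        # Capture everything between the double parentheses.
        match = re.search(r"\(\((.*)\)\)", wkt)
        if match is None:
            raise ValueError("unexpected geometry text: %r" % wkt)
        # Points are comma-separated; coordinates are whitespace-separated.
        points = [
            [float(value) for value in point.split()]
            for point in match.group(1).split(",")
        ]
        polygons.append(points)
    return polygons


# Sample data copied from the question.
data = [
    (u'POLYGON ((7014.186279296875 6602.99658203125 1612.5, 7015.984375 6600.416015625 1612.5))',),
    (u'POLYGON ((6730.962646484375 6715.2490234375 1522.5, 6730.0869140625 6714.13916015625 1522.5))',),
]

for polygon in parse_polygon_rows(data):
    print(polygon)
```

Using a regular expression instead of fixed slice offsets means the code does not depend on the exact length of the 'POLYGON ((' prefix, and it fails loudly on rows that do not match the expected shape.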
I am trying to calculate number of months between two datetime objects with the following code.
abc = load '/tmp/abc_2013_06_29/*' using PigStorage('\u0001') as ( open_dte: datetime, clsd_dte: datetime);
duration_in_months = MonthsBetween(open_dte, clsd_dte);
I am trying to generate the relation duration_in_months in another relation. However I am facing the following error,
Could not infer the matching function for org.apache.pig.builtin.GetMonth as multiple or none of them fit. Please use an explicit cast.
I would appreciate any help, and also pointers to any in-depth guide for learning casting and functions in Pig.
Thanks,
Murali
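Your code does not look correct: a function such as MonthsBetween has to be applied inside a FOREACH ... GENERATE statement rather than assigned directly to a relation.
Try this instead: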
duration_in_months = FOREACH abc GENERATE MonthsBetween(open_dte, clsd_dte);
I have the following schema
x = foreach a generate ids as ids:bag{(mid: long)};
This works fine. But I actually need to do the following:
x = foreach a generate ids as ids:bag{((int)mid)};
This will give an error. And I found
x = foreach a generate ids as ids:bag{(mid:int)};
is not good enough. Can anybody please help me?
Thank you.
There is a bug in Pig about casting after a colon:
https://issues.apache.org/jira/browse/PIG-2315
What you need is to issue another FOREACH statement.
As Ruslan mentioned, this is a bug. You can get around it with an "explicit" cast using parentheses:
x = foreach a generate ids as (bag{(mid:int)}) ids;
I want to convert a date to a string, and I see that Sencha 2 has this class for the job. It has a lot of conversion methods, but I can't find one where I can customize how the string is formatted. I want the date as 'dd-MM-yyyy'.
In Java you have the SimpleDateFormat class, where you pass the pattern you want as a parameter; I would expect there to be something like this in the Date class. If not, what's the best way to do this in pure JavaScript (no third-party libraries)? I know the trivial way (getFullYear(), getMonth() and such), but it's error-prone.
http://docs.sencha.com/touch/2-0/#!/api/Date-method-toDateString
Have a look at http://docs-devel.sencha.com/touch/2-0/#!/api/Ext.Date
It contains a ton of format options :)
Cheers, Oleg