Convert chararray to Date and add duration in Pig - apache-pig

I am new to Pig and have sample test data of about 500 KB that I need to replicate several times to make the file bigger for testing purposes. A single row in my data looks like this:
( card_description:chararray,
transaction_date:chararray,
merchant_name:chararray,
merchant_city:chararray,
transaction_amount:float
) ;
I want to simply change the transaction_amount and transaction_date for each row several times and then join all the results to make a single big file.
I am stuck in trying to change the transaction_date.
The date value in the file is
27/05/2010 00:00
r1 = FOREACH data GENERATE card_description,ToDate(transaction_date),merchant_name,merchant_city,
ROUND(RANDOM()*5)*transaction_amount;
result =union data,r1;
To alter the transaction date I want to use the AddDuration function, but when trying to convert the chararray to a date I run into format-related issues and I don't understand the solution.
Can someone guide?

Looking at the ways ToDate can be invoked, you are currently calling it in one of these forms:
ToDate(milliseconds)
ToDate(isostring)
But your value is neither in milliseconds nor in the ISO 8601 format. You should be invoking it like:
ToDate(userstring, format)
Where format is a pattern string that follows these rules.
Therefore, for a value such as 27/05/2010 00:00, ToDate should be called like:
-- For a 12hr clock (hh is the 1-12 clock hour, so it does not fit a value like 00:00)
ToDate(transaction_date, 'dd/MM/yyyy hh:mm')
-- For a 24hr clock
ToDate(transaction_date, 'dd/MM/yyyy HH:mm')
For AddDuration, remember that the second parameter you provide to it must be a string in the ISO 8601 duration format, for example 'P1D' for one day. Make sure to read the link so you format the string correctly.
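The parse-then-shift idea can be checked outside Pig with a short Python sketch. This is only an analogue, not Pig code: `strptime` with a day-first pattern plays the role of ToDate, and `timedelta(days=1)` stands in for an ISO 8601 duration such as 'P1D'. The variable names are made up for illustration.

```python
from datetime import datetime, timedelta

# Sample value from the question: day first, then month, then year.
raw = "27/05/2010 00:00"

# Python equivalent of the day-first, 24-hour pattern dd/MM/yyyy HH:mm.
parsed = datetime.strptime(raw, "%d/%m/%Y %H:%M")

# Shift by one day, the same amount an ISO 8601 duration 'P1D' describes.
shifted = parsed + timedelta(days=1)

# Format back to the original layout.
shifted_text = shifted.strftime("%d/%m/%Y %H:%M")
print(shifted_text)  # 28/05/2010 00:00
```

If the pattern does not match the text (for example a year-first pattern applied to this value), `strptime` raises `ValueError`, which mirrors the format-related failure described in the question.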

Related

I have a Date in a String format and I can't convert it to date in BigQuery

I started with a date in a string format from a JSON extraction using this: json_value(answer, '$.date_created') and got an output 2020-01-02T10:26:47.056-04:00.
From there, because I couldn't convert it to a date directly, I transformed the output with a series of functions (regexp_replace, left and CAST) into a date-like string: 2020-01-02 10:26:47
I need to be able to transform this new string to a date. So far, I've tried with FORMAT_DATETIME and FORMAT_TIMESTAMP but I'm getting an error: Failed to parse input string bigquery
Your original timestamp string is just fine to do this:
select Date(timestamp("2020-01-02T10:26:47.056-04:00"))
The only thing to check is the time zone: the string carries a -04:00 offset from UTC, and DATE() on a TIMESTAMP uses UTC unless you pass a time zone. As long as you take care of that, the above style should work fine.
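To see why the offset matters, here is a small Python sketch (an analogue, not BigQuery code). It parses the timestamp from the question plus a second, hypothetical late-evening value, and compares the calendar date in the original offset with the calendar date in UTC.

```python
from datetime import datetime, timezone

# Timestamp from the question, plus a hypothetical late-evening value.
morning = datetime.fromisoformat("2020-01-02T10:26:47.056-04:00")
evening = datetime.fromisoformat("2020-01-02T22:26:47.056-04:00")

def utc_date(ts):
    """Calendar date of the instant when expressed in UTC."""
    return ts.astimezone(timezone.utc).date()

# 10:26 at -04:00 is 14:26 UTC: same calendar day either way.
print(morning.date(), utc_date(morning))  # 2020-01-02 2020-01-02

# 22:26 at -04:00 is 02:26 UTC on the next day: the dates differ.
print(evening.date(), utc_date(evening))  # 2020-01-02 2020-01-03
```

For values late in the local day, the UTC date is one day ahead, so it is worth deciding explicitly which time zone the extracted date should be in.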

Conversion of Date for direct sql query in OBIEE using presentation variable

I am trying to achieve this use case: when there is no date picked I want to show all the results but when I have date picked I want it to filter.
fyi:
The dates being picked arrive in the presentation variable as YYYY-MM-DD HH24:MI:SS, but the date format in my query is dd-mon-yy, so I need to convert the value of the presentation variable to dd-mon-yy.
OBIEE doesn't like it when I play around with the values, and the BI server does not let me look at the error message.
I don't have access to change the format at the server level, so my only option is to use formulas.
I'm new to presentation variables.
Also, please keep in mind that if no date is selected in the prompt, I want all values returned.
code:
and
( ( main_query.schd_compare >= (#{pv_task_sch_st_date}['#']{NVL(main_query.schd_compare,'None')})
)
AND (
main_query.schd_compare <= (#{pv_task_sch_end_date}['#']{NVL(main_query.schd_compare,'None')})
) )
I need help with syntax for obiee
Inside the database, it doesn't care if a date is "DD Mon YY" or "YYYY-MM-DD HH24:MI:SS". Those are just formatting, and the actual bit value of that date will be the same. So if both "dates" are actually a date datatype, then you can just use something like:
....
AND (
(
main_query.schd_compare >= NVL(pv_task_sch_st_date,main_query.schd_compare)
)
AND
(
main_query.schd_compare <= NVL(pv_task_sch_end_date,main_query.schd_compare)
)
)
That's if your pv_task_sch_???_date are passed as NULL values when not selected. Oracle does seem to treat empty strings and NULL the same in comparisons like this, but it can be hard to debug if you get in the habit of relying on NULL and '' being the same.
As for your query, my guess is that your pv_task... values are actually being passed to your query as some sort of string. If that's the case, then you'll need to put a TO_DATE() around your pv_task... variables.
Take a look at https://dbfiddle.uk/?rdbms=oracle_11.2&fiddle=2066b2005a22769e785815f6b03750a1. I stepped through a few examples of how your dates can be treated.
I did say earlier that the database doesn't care what format your date is in, and it is stored the same no matter the format. But when you're using TO_DATE() or other similar functions, Oracle wants you to specify the proper mask for your data. If you send "01 Jan 99" to TO_DATE(), Oracle needs to know how to interpret that value, so you tell it that the string is "DD Mon YY". You can't do TO_DATE('2018-09-10','YYYY-MM-DD HH24:MI:SS') because the input doesn't have a time component. ( I would also caution against using 2-digit years. )
By the way, I hate dates. And dealing with dates in Oracle reinforces that hatred.

Query "Select max(date) from table where date <= somedate" not working

I am querying a SQLite database table as follows:
SELECT MAX(Date) from Intra360 WHERE Date <= "05/04/2013 00:00"
The right record in return should be the number 47, i.e. 04/04/2013 23:00:
However, the execution of this statement returns a different value:
I confess I know almost nothing about SQL, but this outcome is strange. Where am I going wrong?
NOTE "Intra360" is the name of the table and the field containing the dates is called "Date"
ADDITIONAL NOTE What I need is the closest available date to a user input. A Python program runs some analysis, but the dates the user enters do not necessarily exist in the database. So I'm trying to re-select them so that the SQL statement that loads the data for the analysis won't fail because of a missing record. "05/04/2013 00:00" is the user input, so the query should start from 04/04/2013 (and definitely not 04/06/2013).
The comparisons are performed on strings with alphabetical ordering, not on datetime stamps with chronological ordering.
Store your datetimes in a format that compares the way you want. For example, unix epoch timestamps and ISO 8601 yyyy-MM-dd'T'HH:mm:ss datetimes have this property.
If you cannot influence how the data is stored, you can use substr() to mangle the timestamps in SQL. See e.g. Sqlite convert string to date for more.
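A minimal sketch of both the problem and the substr() workaround, using Python's built-in sqlite3 module with an in-memory table. It assumes the stored strings are day-first (dd/MM/yyyy HH:mm), as the question implies; the sample rows are made up for illustration.

```python
import sqlite3
from datetime import datetime

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE Intra360 ("Date" TEXT)')
rows = ["03/04/2013 22:00", "04/04/2013 23:00",
        "04/06/2013 10:00", "05/04/2013 01:00"]
conn.executemany("INSERT INTO Intra360 VALUES (?)", [(r,) for r in rows])

user_input = "05/04/2013 00:00"

# Naive query: plain string comparison, so "04/06/..." sorts before "05/04/...".
naive = conn.execute(
    'SELECT MAX("Date") FROM Intra360 WHERE "Date" <= ?', (user_input,)
).fetchone()[0]

# Rearrange 'dd/MM/yyyy HH:mm' into 'yyyy-MM-dd HH:mm', which sorts chronologically.
KEY = ("substr(\"Date\", 7, 4) || '-' || substr(\"Date\", 4, 2) || '-' || "
       "substr(\"Date\", 1, 2) || substr(\"Date\", 11)")

def to_sortable(text):
    """Convert the user's day-first input into the same sortable layout."""
    return datetime.strptime(text, "%d/%m/%Y %H:%M").strftime("%Y-%m-%d %H:%M")

fixed = conn.execute(
    f'SELECT "Date" FROM Intra360 WHERE {KEY} <= ? ORDER BY {KEY} DESC LIMIT 1',
    (to_sortable(user_input),),
).fetchone()[0]

print(naive)  # 04/06/2013 10:00
print(fixed)  # 04/04/2013 23:00
```

Rebuilding the key on every query prevents SQLite from using an ordinary index on the column, so storing the values in a sortable format in the first place remains the better option when that is possible.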

How do you parse a custom formatted date time string into a datetime?

This is for Microsoft SQL Server
I have an audit table with a timestamp represented as a string - timestamps are in multiple locale-specific representations (eg some are in mm/dd others are dd/mm)
I know some rows that I'm interested in have a timestamp string in the format of dd/MM/yy HH:mm:ss
I want to write a query that will return rows where the timestamp string is NOT in that format so I imagine something like this (with an imaginary PARSEDATE function)
WHERE PARSEDATE(timestamp) IS NOT NULL
Everything I've read about T-SQL datetime functions seem to involve well defined format codes eg 112 but I don't see a generalized way of being able to provide a custom date time format string for parsing?
Set the format before running your query.
SET LANGUAGE us_english;
SET DATEFORMAT dmy;
In your query
WHERE ISDATE(timestamp) = 1
More information can be found here
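For comparison, the behaviour of the imaginary PARSEDATE function can be sketched in Python (this is not T-SQL, and `matches_format` is a hypothetical helper name). It tests whether a string parses under the dd/MM/yy HH:mm:ss layout.

```python
from datetime import datetime

def matches_format(text, pattern="%d/%m/%y %H:%M:%S"):
    """Return True if text parses under pattern, else False."""
    try:
        datetime.strptime(text, pattern)
        return True
    except ValueError:
        return False

# Hypothetical audit values: day-first, month-first, and ISO-like.
samples = ["25/12/13 14:30:00", "12/25/13 14:30:00", "2013-12-25 14:30:00"]
flags = [matches_format(s) for s in samples]
print(flags)  # [True, False, False]
```

Note that an ambiguous value such as 03/04/13 10:00:00 parses under both day-first and month-first layouts, so no parser-based check can tell which one the writer intended.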

character_length Teradata SQL Assistant

I have to run column checks for data consistency and the only thing that is throwing off my code is checking for character lengths for dates between certain parameters.
SEL
sum(case when ( A.date is null or (character_length(A.date) >8)) then 1 else 0 end ) as Date
from
table A
;
The date format of the column is YYYY-MM-DD, and the type is DA. When I run the script in SQL Assistant, I get an error 3580 "Illegal use of CHARACTERS, MCHARACTERS, or OCTET_LENGTH functions."
Preliminary research suggests that SQL Assistant has issues with the character_length function, but I don't know how to adjust the code to make it run.
With character_length, are you trying to get the memory used? Because if so, that is constant for a date field. If you are trying to get the length of the string representation, I think LENGTH(A.date) will suffice. Unfortunately, since Teradata will pad zeros on conversions to string, I think this might always return 10.
UPDATE :
Okay, so if you want a date in a special 'form' when you output it, you need to select it properly. In Teradata, as with most DBs, dates are not stored as strings but rather as ints, counting days from a given 'epoch' date for the database (for example, the epoch might be 01/01/0000). Each date type in Teradata has a format parameter, which places instructions in the record header on how to format the output on select. By default, I believe the date format is set to DATE FORMAT 'MM/DD/YYYY'. You can change that by casting.
Try SELECT cast(cast(A.date as DATE FORMAT 'MM-DD-YYYY') as CHAR(10)) FROM A and see what happens. There should be no need to validate the form of the dates beyond a small sample to see if the format is correct. The second cast forces the database to perform the conversion and use the format header specified. Otherwise, what you might see is that the database passes the date in date form to SQL Assistant, and SQL Assistant performs the conversion at the application level, using the format specified in its own settings rather than the one set in the database.