Why does ToDate not work with a string format? - apache-pig

Assume that I have a string column named DAT_015_X in the format 'yyyyMMdd'.
I want to keep only the rows dated after '20190101'.
I load my data like this:
CRE_22002 = LOAD '$input' USING PigStorage(';') AS (GM_COMPTEUR:chararray,CIA_CD_CRV_CIA:chararray,CIA_DA_EM_CRV:chararray,CIA_CD_CTRL_BLCE:chararray,CIA_IDC_EXTR_RDJ:chararray,CIA_VLR_IDT_CRV_LOQ:chararray,CIA_VLR_REF_CRV:chararray,CIA_NO_SEQ_CRV:chararray,CIA_VLR_LG_ZON_RTG:chararray,CIA_HEU_CIA:chararray,CIA_TM_STP_CRE:chararray,CIA_CD_SI:chararray,CIA_VLR_1:chararray,CIA_DA_ARR_FIC:chararray,CIA_TY_ENR:chararray,CIA_CD_BTE:chararray,CIA_CD_PER:chararray,CIA_CD_EFS:chararray,CIA_CD_ETA_VAL_CRV:chararray,CIA_CD_EVE_CPR:chararray,CIA_CD_APLI_TDU:chararray,CIA_CD_STE_RTG:chararray,CIA_DA_TT_RTG:chararray,CIA_NO_ENR_RTG:chararray,CIA_DA_VAL_EVE:chararray,PSE_001:chararray,STR_002:chararray,STR_003:chararray,CPR_006_VLR:chararray,CPR_006_DCM:chararray,CPR_006_CD_DVS:chararray,CPR_008_VLR:chararray,CPR_008_DCM:chararray,CPR_008_CD_DVS:chararray,CPR_009_VLR:chararray,CPR_009_DCM:chararray,CPR_009_CD_DVS:chararray,CPR_059_VLR:chararray,CPR_059_DCM:chararray,CPR_059_CD_DVS:chararray,CPR_060_VLR:chararray,CPR_060_DCM:chararray,CPR_060_CD_DVS:chararray,RUB_205:chararray,RUB_216:int,DAT_015_X:chararray,NB_005_VLR:chararray,NB_005_DCM:chararray,NB_007_VLR:chararray,NB_007_DCM:chararray,NB_012_VLR:chararray,NB_012_DCM:chararray,EUR_061_VLR:chararray,EUR_061_DCM:chararray,EUR_061_CD_DVS:chararray,EUR_062_VLR:chararray,EUR_062_DCM:chararray,EUR_062_CD_DVS:chararray);
I dumped it: the data is there, with no stray spaces or other problems.
To apply the filter I write:
CRE_22002_DATA_FILTER = FILTER CRE_22002 BY (ToDate($45,'yyyyMMdd')> ToDate('20190101','yyyyMMdd'));
I get this error:
Could not infer the matching function for
org.apache.pig.builtin.ToDate as multiple or none of them fit. Please
use an explicit cast.
EDIT
As #VK_217 said, I now use $45 instead of $44 because it is the correct position.
(DAT_015_X)
(99991231)
(20200605)
(20190605)
(20200605)
(20200305)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(20200110)
(99991231)
(99991231)
(99991231)
(20190501)
(99991231)
(20190905)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(20190605)
(99991231)
(99991231)
(20190905)
(99991231)
(20190915)
(20190805)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(99991231)
(20200110)
(99991231)
(20190705)
But I still get an error:
[]]: java.lang.IllegalArgumentException: Invalid format: "DAT_015_X"
Failed Jobs:
JobId Alias Feature Message Outputs
job_1549794175705_3562351 CRE_22002,CRE_22002_DATA_FILTER,LIMITED_DATA Message: Job failed!
Input(s):
Failed to read data from "/hdfs/data/adhoc/PR/02/RDO0/BB0/MGM22002-2019-09-04.csv"
Output(s):
How can this be explained?
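The dump above lists (DAT_015_X) as its first tuple, and the exception names that same string, which suggests the CSV header line is being read as data. The following is a minimal Python sketch of that mechanism, not Pig code: a strict yyyyMMdd parse fails on the header text and succeeds on the date values. The helper name parse_yyyymmdd and the short sample list are assumptions made for illustration.

```python
from datetime import datetime

def parse_yyyymmdd(text):
    """Parse a yyyyMMdd string; return None when the text is not a date."""
    try:
        return datetime.strptime(text, "%Y%m%d")
    except ValueError:
        # The header text "DAT_015_X" ends up here.
        return None

# First values of the dump in the question, including the header text.
rows = ["DAT_015_X", "99991231", "20200605", "20190605"]
cutoff = datetime(2019, 1, 1)

# Keep only the rows that parse as dates and fall after the cutoff.
kept = []
for row in rows:
    parsed = parse_yyyymmdd(row)
    if parsed is not None and parsed > cutoff:
        kept.append(row)

print(kept)
```

In Pig, one possible equivalent is to drop the header row before calling ToDate, for example with a condition such as DAT_015_X != 'DAT_015_X'; this is a suggestion under the assumption above, not something confirmed in the thread.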

Related

How to get the 1st day of the current week if week starts from Monday in BigQuery?

SELECT TIMESTAMP_TRUNC(CURRENT_DATE(), WEEK(MONDAY)) AS FirstDayOfWeek
I am a beginner in BigQuery. The query above gives me an error; please help me correct it.
Error message:
No matching signature for function TIMESTAMP_TRUNC for argument types: DATE, DATE_TIME_PART. Supported signature: TIMESTAMP_TRUNC(TIMESTAMP, DATE_TIME_PART, [STRING]) at [1:8]
CURRENT_DATE() returns a DATE, so use DATE_TRUNC instead of TIMESTAMP_TRUNC:
SELECT DATE_TRUNC(CURRENT_DATE(), WEEK(MONDAY)) AS FirstDayOfWeek
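
For comparison, the same Monday-based truncation can be sketched outside BigQuery. This Python snippet only illustrates the arithmetic behind WEEK(MONDAY); the function name week_start_monday is a hypothetical helper, not a BigQuery API.

```python
from datetime import date, timedelta

def week_start_monday(day):
    """Return the Monday on or before the given date."""
    # date.weekday() is 0 for Monday through 6 for Sunday.
    return day - timedelta(days=day.weekday())

print(week_start_monday(date.today()))
```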

Oracle SQL - select query with where condition greater than certain moment defined in a timestamp column

I have a table with a column, report_time, of type timestamp. A value looks like "25-May-20 05.03.20.12000 PM". I want to select all rows whose report_time is greater than or equal to that moment; a pseudo where clause looks like:
where report_time >= to_timestamp("25-May-20 05.03.20.12000 PM", "DD-MON-YY HH.MI.SS.FF PM")
Somehow I failed to get it to work, even after googling for quite some time.
Please help.
The idea is right; use single quotes for the string literals, since Oracle treats double-quoted text as an identifier:
report_time >= to_timestamp('25-May-20 05.03.20.12000 PM', 'DD-MON-YY HH.MI.SS.FF PM')

How can I do to get the hour from a date column with Postgres? [duplicate]

I have a timestamp column in my table and I want to extract only the hour from it. I searched and found the extract function but could not get it to work in a query. Do I need to convert the timestamp to varchar first and then extract the hour from it?
Here is my query:
select extract(hour from timestamp '2001-02-16 20:38:40') -- example
Actual query:
select extract(hour from timestamp observationtime) from smartvakt_device_report
The following should work:
select extract(hour from observationtime) from smartvakt_device_report
SELECT to_char(now(), 'HH24:MI:SS') hour_minute_second
The word timestamp is redundant (read: wrong). You just need to give the column's name. E.g.:
db=> select extract(hour from observationtime) from smartvakt_device_report;
date_part
-----------
19
(1 row)
EXTRACT does not work with Grafana but date_part does.
The solution for me was:
SELECT date_part('hour', observationtime::TIMESTAMP) FROM smartvakt_device_report;
Reference: https://www.postgresql.org/docs/current/functions-datetime.html#FUNCTIONS-DATETIME-EXTRACT

search date and time in oracle using to_char

In Oracle, the query below fetches the wrong records (see the attached screenshot). Can someone suggest the correct format for 12-hour time?
to_char(a.created, 'MM/DD/YYYY HH12:MI:SS') >='05/23/2012 12:00:00'
Thanks,
Kiran.
Don't search based on a string. Search based on a date. If you search on a string, you'll get string comparison semantics which are not what you want. The string '06/01/1900' is alphabetically after the string '05/23/2012' despite the date that it represents being much earlier.
a.created >= to_date('05/23/2012 12:00:00', 'mm/dd/yyyy hh24:mi:ss' )
or using a 12-hour clock
a.created >= to_date('05/23/2012 03:15:00 pm', 'mm/dd/yyyy hh:mi:ss am' )
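
The difference between string and date comparison described above can be checked with a small Python sketch. It reuses the two example values from the answer; the variable names are illustrative only.

```python
from datetime import datetime

fmt = "%m/%d/%Y"
old, new = "06/01/1900", "05/23/2012"

# Compared as text, "06..." sorts after "05...", regardless of the year.
as_strings = old > new
# Compared as dates, the order is chronological.
as_dates = datetime.strptime(old, fmt) > datetime.strptime(new, fmt)

print(as_strings, as_dates)
```

The first comparison is true and the second is false, which is why the filter should be applied to the date column itself rather than to its to_char output.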

SQL query to return the rows between two dates

I am using this query to get the rows between two dates, but it returns an empty set even though the database has rows for that day.
Query:
select email
from users
where date_email_sent between '17-May-12' AND '17-May-12'
When I try the query below, I get only the rows for the 17th.
Query:
select email
from users
where date_email_sent between '17-May-12' AND '18-May-12'
Can anyone suggest how to get only the records for the 17th when the start date and end date are the same?
Thanks in advance.
Does date_email_sent hold a datetime value? If the column value has both date and time, like 2012-05-17 09:30:00.000, then you may need to include the time with your date value in the where clause,
e.g.
where date_email_sent between '17-May-12 00:00:00.000' AND '17-May-12 23:59:59.000'
or you can look at the date value only for the field
where DATEADD(dd, 0, DATEDIFF(dd, 0, date_email_sent )) between '17-May-12' AND '17-May-12'
SQL always assumes 12 AM when you supply only a date, so comparing a date-only value with a datetime value causes this problem.
When you specify a date without a time portion, the time is automatically assumed as 12:00 am. So when you write
date_email_sent between '17-May-12' AND '17-May-12'
This is effectively the same as
date_email_sent between '17-May-12 12:00AM' AND '17-May-12 12:00AM'
So as you can see, the two times are identical and naturally there are no records in the specified interval.
If you want all the records on one day, you need to measure from midnight until midnight the next day:
date_email_sent >= '17-May-12 12:00AM' and date_email_sent < '18-May-12 12:00AM'
or just:
date_email_sent >= '17-May-12' and date_email_sent < '18-May-12'
Alternatively, you can extract the day portion of the date and check that for the correct value. The specific date handling functions vary depending on your dbms.
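
The midnight-to-midnight range can be illustrated with a short Python sketch. The sample timestamps are made up for the illustration and are not data from the question.

```python
from datetime import datetime, timedelta

# Hypothetical values for date_email_sent around 17 May 2012.
sent = [
    datetime(2012, 5, 16, 23, 59, 59),
    datetime(2012, 5, 17, 0, 0, 0),
    datetime(2012, 5, 17, 9, 30, 0),
    datetime(2012, 5, 17, 23, 59, 59, 500000),
    datetime(2012, 5, 18, 0, 0, 0),
]
day = datetime(2012, 5, 17)
next_day = day + timedelta(days=1)

# BETWEEN day AND day: both bounds are midnight, so only an exact
# midnight value can match.
between_same_day = [t for t in sent if day <= t <= day]
# Half-open range: midnight inclusive up to the next midnight exclusive.
half_open = [t for t in sent if day <= t < next_day]

print(len(between_same_day), len(half_open))
```

The half-open form also catches values in the final second of the day, which an upper bound such as 23:59:59.000 would miss.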
In SQL Server you should do it like this.
select email
from users
where date_email_sent >= '20120517' and
date_email_sent < dateadd(day, 1, '20120517')
If you use something else you have to replace dateadd with something else.
select email
from users
where date_email_sent >= '17-May-12' AND date_email_sent < '18-May-12'
When you write
BETWEEN '17-May-12' AND '17-May-12'
no time is given, so for a date datatype it is treated as
'17-May-12 00:00:00' AND '17-May-12 00:00:00'
so the range covers a single instant, which is why you did not get any records.
You should also specify the time to get the records for a single date.
In Oracle you can write it like this (the value comes first, then the format):
BETWEEN to_date('17-05-2012 00:00','dd-mm-yyyy hh24:mi') AND to_date('17-05-2012 23:59','dd-mm-yyyy hh24:mi')