FETCH the latest partition from HIVE table - sql

Hi, I am very new to this.
I have three columns, YEAR, MONTH and DAY, stored as INTEGER.
I want to load them in a script, combine YEAR, MONTH and DAY into a single column, and fetch the maximum.
I tried:
Load year,month,day from HIVE.`abc`.`abc1`;
SELECT max(cast(year as String) || '_' || cast(month as string) || '_' || cast(day as string)) as result FROM HIVE.`abc`.`abc1`;
This gives a result like 2020_5_21, but I need to use a separator and still find the maximum date.
The following error occurred: Connector reply error: SQL##f -
SqlState: S1000, ErrorCode: 35, ErrorMsg: [Cloudera][Hardy] (35) Error
from server: error code: '1' error message: 'Error while compiling
statement: FAILED: Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.tez.TezTask'.
I want to use the result in a WHERE clause, but I don't know how to write the statement.
SQL select * from HIVE.`abc`.`abc1` where ---- ;
Please help.

If month and day are stored as integers, you need lpad() to left-pad single-digit months and days with a zero. For example, month 5 should become 05. Without this, max() may return the wrong result, because the concatenated values are compared as strings. Also use a dash as the separator so the result is in a date-compatible format.
max(concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0)))
To use it in a WHERE clause, use WHERE date IN (select max ...):
SELECT * from your_table
WHERE concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0)) in (select max(concat(year,'-',lpad(month, 2,0),'-',lpad(day, 2,0))) from your_table)
You may also need to quote names like year, month and day in backticks everywhere in the SQL:
max(concat(`year`,'-',lpad(`month`, 2,0),'-',lpad(`day`, 2,0)))
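To see why the zero padding matters, here is a minimal sketch in Python rather than Hive SQL. The partition values and the helper name date_key are made up for illustration; the point is only that max() over unpadded strings picks the wrong row, while the padded form sorts like a real date.

```python
# Hypothetical (year, month, day) partition values, not taken from the real table.
partitions = [(2020, 5, 21), (2020, 10, 3), (2020, 9, 30)]


def date_key(year: int, month: int, day: int) -> str:
    """Build a zero-padded yyyy-mm-dd key, like concat + lpad in the answer."""
    return f"{year:04d}-{month:02d}-{day:02d}"


# Unpadded keys compare character by character, so "9" beats "10".
unpadded_max = max(f"{y}_{m}_{d}" for y, m, d in partitions)

# Padded keys sort in the same order as the dates they represent.
padded_max = max(date_key(y, m, d) for y, m, d in partitions)

print(unpadded_max)  # 2020_9_30  (wrong: October 3rd is later)
print(padded_max)    # 2020-10-03
```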

Related

What causes error "Strings cannot be added or subtracted in dialect 3"

I have the query:
WITH STAN_IND
AS (
SELECT ro.kod_stanow, ro.ind_wyrob||' - '||ro.LP_OPER INDEKS_OPERACJA, count(*) ILE_POWT
FROM M_REJ_OPERACJI ro
JOIN M_TABST st ON st.SYMBOL = ro.kod_stanow
WHERE (st.KOD_GRST starting with 'F' or (st.KOD_GRST starting with 'T') ) AND ro.DATA_WYKON>'NOW'-100
GROUP BY 1,2)
SELECT S.kod_stanow, count(*) ILE_INDEKS, SUM(ILE_POWT-1) POWTORZEN
from STAN_IND S
GROUP BY S.kod_stanow
ORDER BY ILE_INDEKS
That should be working, but I get an error:
SQL Error [335544606] [42000]: Dynamic SQL Error; expression evaluation not supported; Strings cannot be added or subtracted in dialect 3 [SQLState:42000, ISC error code:335544606]
I tried casting it to a larger varchar, but with no success. What is wrong here? The database is Firebird 2.1.
Your problem is 'NOW'-100. The literal 'NOW' is not a date/timestamp by itself, but a CHAR(3) literal. Only when compared to (or assigned to) a date or timestamp column will it be converted, and here the subtraction happens before that point. And the subtraction fails, because subtraction from a string literal is not defined.
Use CAST('NOW' as TIMESTAMP) - 100 or CURRENT_TIMESTAMP - 100 (or cast to DATE or use CURRENT_DATE if the column DATA_WYKON is a DATE).
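As a loose analogy only (this is Python, not Firebird, and the function name cutoff is made up), the same thing happens in most typed languages: a string that merely looks like a time cannot take part in arithmetic until it has been turned into a real date/time value.

```python
from datetime import datetime, timedelta


def cutoff(now: datetime, days: int = 100) -> datetime:
    """Subtract a number of days from a real timestamp value."""
    return now - timedelta(days=days)


# Subtracting from the string itself fails, much like 'NOW' - 100 in the query.
try:
    "NOW" - 100
    string_subtraction_fails = False
except TypeError:
    string_subtraction_fails = True

print(string_subtraction_fails)        # True
print(cutoff(datetime(2021, 4, 11)))   # 2021-01-01 00:00:00
```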

How to extract month name on a string datatype on athena

SELECT sales_invoice_date,
MONTH( DATE_TRUNC('month',
CASE
WHEN TRIM(sales_invoice_date) = '' THEN
DATE('1999-12-31')
ELSE
DATE_PARSE(sales_invoice_date, '%m/%d/%Y')
END) ) AS DT
FROM testdata_parquet
I used the query above to convert the string into a date and was able to get the month number on AWS Athena, but I wasn't able to get the corresponding month name.
I have already tried monthname and datename('month', ...), but they gave the following error messages respectively:
SYNTAX_ERROR: line 2:1: Function monthname not registered
SYNTAX_ERROR: line 2:1: Function datename not registered
Athena is currently based on Presto 0.172, so you should refer to https://trino.io/docs/0.172/functions/datetime.html for available functions on date/time values.
You can get month name with date_format():
date_format(value, '%M')
or similarly format_datetime(), where 'MMM' gives the abbreviated month name and 'MMMM' the full one.
format_datetime(value, 'MMM')
Example:
presto:default> SELECT date_format(current_date, '%M');
_col0
----------
December
(1 row)
(verified on Presto 327, but will work in Athena too)
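For comparison, the same parse-then-format steps can be sketched outside Athena. This is Python, not Presto SQL; the function name month_name is made up, the fallback date mirrors the CASE expression in the question, and the English output assumes the default locale.

```python
from datetime import date, datetime

# Same fallback the question uses for blank invoice dates.
FALLBACK = date(1999, 12, 31)


def month_name(raw: str) -> str:
    """Parse an mm/dd/yyyy string and return the full month name."""
    text = raw.strip()
    parsed = datetime.strptime(text, "%m/%d/%Y").date() if text else FALLBACK
    # %B is the full month name; it is English under the default locale.
    return parsed.strftime("%B")


print(month_name("03/15/2021"))  # March
print(month_name("   "))         # December (fallback date)
```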
You can use to_char() function with 'month' argument :
to_char(sales_invoice_date, 'month')
in order to return the month names.

DATE_PART and Postgresql

I have a problem when I subtract two dates inside the DATE_PART function in this query.
SELECT
TO_CHAR(date_trunc('month',sql_activity_days.created_month),'YYYY-MM') AS "sql_activity_days.created_month",
coalesce(SUM(
CASE
WHEN(date_part('day', (sql_activity_days.sale_date + 1) - sql_activity_days.start_date) < 122)
THEN sql_activity_days.cad_net_invoiced
ELSE NULL
END
),0) AS "sql_activity_days.activity_over_122_day_after_signup"
FROM
camel.f_subscription_touch AS subscription_touch
LEFT JOIN sql_activity_days ON subscription_touch.id = sql_activity_days.customer_id
group by date_trunc('month',sql_activity_days.created_month)
order by 1 desc limit
500
The PostgreSQL database encountered an error while running this query.
ERROR: function date_part(unknown, integer) does not exist Hint: No
function matches the given name and argument types. You might need to
add explicit type casts. Position: 1340
The second argument of the function date_part can be either a timestamp or an interval.
Your expression (sql_activity_days.sale_date + 1) - sql_activity_days.start_date subtracts two dates, and the result of that subtraction is an integer, hence the error.
The solution is to remove date_part and use the expression directly. The difference of two dates is always a number of days.
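To illustrate the arithmetic, here is a small sketch in Python rather than PostgreSQL. The function names and sample dates are invented; it only shows that (sale_date + 1) - start_date is already a whole number of days, so no extra "extract the day part" step is needed.

```python
from datetime import date, timedelta


def days_since_signup(sale_date: date, start_date: date) -> int:
    """Mirror (sale_date + 1) - start_date: a plain count of days."""
    return ((sale_date + timedelta(days=1)) - start_date).days


def within_122_days(sale_date: date, start_date: date) -> bool:
    """The condition from the CASE expression, applied to the day count."""
    return days_since_signup(sale_date, start_date) < 122


print(days_since_signup(date(2021, 1, 10), date(2021, 1, 1)))  # 10
print(within_122_days(date(2021, 5, 1), date(2021, 1, 1)))     # True
print(within_122_days(date(2021, 5, 2), date(2021, 1, 1)))     # False
```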

query to subtract date from systimestamp in oracle 11g

I want to subtract a date returned by another query from the system time in Oracle SQL. So far I have been able to use the result of the other query, but when I try to subtract it from systimestamp I get the following error:
ORA-01722: invalid number
01722. 00000 - "invalid number"
*Cause: The specified number was invalid.
*Action: Specify a valid number.
Below is my query
select round(to_number(systimestamp - e.last_time) * 24) as lag
from (
select ATTR_VALUE as last_time
from CONFIG
where ATTR_NAME='last_time'
and PROCESS_TYPE='new'
) e;
I have also tried this
select to_char(sys_extract_utc(systimestamp)-e.last_time,'YYYY-MM-DD HH24:MI:SS') as lag
from (
select ATTR_VALUE as last_time
from CONFIG
where ATTR_NAME='last_time'
and PROCESS_TYPE='new'
) e;
I want the difference between the time intervals to be in hours.
Thank you for any help in advance.
P.S. The datatype of ATTR_VALUE is VARCHAR2(150). A sample result of e.last_time is 2016-09-05 22:43:81796
"its VARCHAR2(150). That means I need to convert that to date"
ATTR_VALUE is a string so yes you need to convert it to the correct type before attempting to compare it with another datatype. Given your sample data the correct type would be timestamp, in which case your subquery should be:
(
select to_timestamp(ATTR_VALUE, 'yyyy-mm-dd hh24:mi:ss.ff5') as last_time
from CONFIG
where ATTR_NAME='last_time'
and PROCESS_TYPE='new'
)
The assumption is that your sample is representative of all the values in your CONFIG table for the given keys. If you have values in different formats, your query will break in some other way: that's the danger of using this approach.
So finally, after a lot of trial and error, I got this one working.
1. Turns out initially the error was because the data_type of e.last_time was VARCHAR(150).
To find out the datatype of a given column in the table I used
desc <table_name>
which in my case was desc CONFIG
2. To convert the VARCHAR to a date/time value I have two options, to_timestamp and to_date. If I use to_timestamp like
select round((systimestamp - to_timestamp(e.last_time,'YYYY-MM-DD HH24:MI:SSSSS')) * 24, 2) as lag
from (
select ATTR_VALUE as last_time
from CONFIG
where ATTR_NAME='last_time'
and PROCESS_TYPE='new'
) e;
I get an error that round expects NUMBER and got INTERVAL DAY TO SECOND, since the difference of two timestamps is an interval such as +41 13:55:20.663990. Converting that into hours would require more complex logic.
An alternative is to use to_date, which I preferred and used as
select round((sysdate - to_date(e.last_time,'YYYY-MM-DD HH24:MI:SSSSS')) * 24, 2) as lag
from (
select ATTR_VALUE as last_time
from CONFIG
where ATTR_NAME='last_time'
and PROCESS_TYPE='new'
) e;
This returns the desired result, i.e. the difference in hours rounded to 2 decimal places.
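For what it's worth, the interval-to-hours conversion itself is just "total seconds divided by 3600". A rough sketch in Python rather than Oracle SQL, using the sample interval quoted above (the function name interval_to_hours is made up):

```python
from datetime import timedelta


def interval_to_hours(interval: timedelta, digits: int = 2) -> float:
    """Convert a day-to-second interval into hours, rounded to `digits` places."""
    return round(interval.total_seconds() / 3600, digits)


# The interval from the post: +41 13:55:20.663990
lag = timedelta(days=41, hours=13, minutes=55, seconds=20, microseconds=663990)
lag_hours = interval_to_hours(lag)

print(lag_hours)  # 997.92
```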

Oracle error: ORA-01839: date not valid for month specified

OK guys, first of all, I have checked many websites about this error and, unfortunately, none of them helped me. I have the simple following query:
select * from (
select to_date(cal.year || cal.month || cal.day, 'yyyymmdd') as datew,
cal.daytype as type
from vw_calendar cal)
where datew > sysdate;
When I try to execute the entire query, this error shows up:
ORA-01839: date not valid for month specified
If I execute only the query:
select to_date(cal.year || cal.month || cal.day, 'yyyymmdd') as datew,
cal.daytype as type
from vw_calendar cal;
It worked absolutely fine. If you want to view the results of the query: http://pastebin.com/PV95g3ac
I checked for cases like day 31 and leap years, and everything seems correct. I don't know what to do anymore. My database is Oracle 10g, and I tried to execute the query in SQL Developer and PL/SQL Developer. Same error in both IDEs.
The issue is that some months have 30 days and some have 31, so the query you are building is trying to construct a 31st day for a month that does not have one, which is invalid, hence the error. For example:
it may be trying a date like 31-NOV-2016, which is not valid because November has only 30 days.
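One way to track down such rows is to test every year/month/day combination and list the ones that do not form a real date. A minimal sketch in Python, assuming the calendar rows have been exported as string triples; the sample rows and the function name invalid_dates are made up for illustration.

```python
from datetime import date


def invalid_dates(rows):
    """Return the (year, month, day) triples that are not real calendar dates."""
    bad = []
    for year, month, day in rows:
        try:
            date(int(year), int(month), int(day))
        except ValueError:
            # Raised for things like November 31st or February 29th in a non-leap year.
            bad.append((year, month, day))
    return bad


# Hypothetical rows, in the same string form the view concatenates.
sample = [
    ("2016", "11", "30"),
    ("2016", "11", "31"),
    ("2015", "02", "29"),
    ("2016", "02", "29"),
]

print(invalid_dates(sample))  # [('2016', '11', '31'), ('2015', '02', '29')]
```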
Well, I found a workaround, not a solution. If anyone knows the "correct" solution to this problem, I would appreciate it if you shared it with us. My "solution" converts the date to a number and then compares it with sysdate (converted too). Take a look:
select * from
( select to_number(cal.year||cal.month||cal.day) as datew,
cal.daytype as type from vw_calendar cal ) a
where a.datew > to_number(to_char(sysdate, 'yyyymmdd'));
Thanks to everyone!
Use alias name for the inner query
select * from (
select to_date(cal.year || cal.month || cal.day, 'yyyymmdd') as datew,
cal.daytype as type
from vw_calendar cal) a
where a.datew > sysdate;