I am trying the following:
select case when ta_end_datetime_berekenen = 'Y' then lead(ta_update_datetime) over ( partition by dn_waarde_van, dn_waarde_tot order by ta_update_datetime ) else ea_end_datetime end ea_end_datetime, ta_insert_datetime, ta_update_datetime from tmp_wtdh_bestedingsklasse_10_s2_stap2
However, when I try that, I get the following error:
NoViableAltException(86#[129:7: ( ( ( KW_AS )? identifier ) | ( KW_AS LPAREN identifier ( COMMA identifier )* RPAREN ) )?])
FAILED: ParseException line 1:175 missing KW_END at 'over' near ')' in selection target
line 1:254 cannot recognize input near 'else' 'ea_end_datetime' 'end' in selection target
Would I be correct in assuming that it's not possible to wrap an analytical function in another function?
This is with Hive 0.11.
Not sure this is the root of your problem, but seems like your query is missing an AS keyword (note the all-caps AS on line 8 below).
select
case
when ta_end_datetime_berekenen = 'Y'
then
lead(ta_update_datetime) over ( partition by dn_waarde_van, dn_waarde_tot order by ta_update_datetime )
else
ea_end_datetime
end AS ea_end_datetime,
ta_insert_datetime,
ta_update_datetime
from tmp_wtdh_bestedingsklasse_10_s2_stap2
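If adding the AS alone doesn't help on Hive 0.11 (early windowing support was fragile around expressions wrapping analytic functions), a common workaround is to compute the lead() in a subquery and apply the CASE in the outer query. This is only a sketch using the table and column names from the question; `next_update_datetime` is an alias I introduced:

```sql
-- Compute the window function in a subquery, then wrap it outside.
select
  case
    when ta_end_datetime_berekenen = 'Y' then next_update_datetime
    else ea_end_datetime
  end as ea_end_datetime,
  ta_insert_datetime,
  ta_update_datetime
from (
  select
    t.*,
    lead(ta_update_datetime) over (
      partition by dn_waarde_van, dn_waarde_tot
      order by ta_update_datetime
    ) as next_update_datetime
  from tmp_wtdh_bestedingsklasse_10_s2_stap2 t
) s;
```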

with recursive tree_Gy_Department as(
select PreGD.*, 1::integer recursion_level
from GY_DEPARTMENT PreGD
where PreGD.dept_id = :deptId
union all
select NextGD.*, recursion_level +1
from GY_DEPARTMENT NextGD
join tree_Gy_Department treeGD on treeGD.parent_id = NextGD.dept_id)
select recursion_level, a.dept_name,
case
when recursion_level = 1 then REGEXP_replace(initcap(a.DEPT_NAME), '\\s', '')
else REGEXP_replace(initcap(a.DEPT_NAME), '[[:lower:]]|\\s', '', 'g') END
AS Result
from tree_Gy_Department a;
I'm trying to run this query; it works in the PostgreSQL query console, but when I put it in the repository I get an error: ERROR: syntax error at or near ":". I think the error occurs where I set the value for recursion_level ("1::integer recursion_level"); maybe it conflicts with Hibernate. Does anyone have a replacement for the double colon?
Use the standard cast() syntax instead. Hibernate gets confused by the ::
cast(1 as integer) as recursion_level
But I don't think you need the cast at all. A simple 1 as recursion_level will work just as well.
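Applied to the anchor branch of the recursive CTE from the question, that would look like the following sketch (only the first SELECT changes):

```sql
-- The :: cast replaced by standard cast() syntax, which Hibernate parses fine.
-- (Or drop the cast entirely: "1 recursion_level".)
select PreGD.*, cast(1 as integer) as recursion_level
from GY_DEPARTMENT PreGD
where PreGD.dept_id = :deptId
```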
I am compiling a dbt base model and currently get the error below. Line 6 looks the same as the lines above it, so it might be a small syntax error that I could not spot.
15:40:22 Database Error in model base_datacenter_handling_unit (models/l10_staging_datacenter/base_unit.sql)
15:40:22 000904 (42000): SQL compilation error: error line 6 at position 3
15:40:22 invalid identifier 'VALUE'
15:40:22 compiled SQL at target/run/dbt/models/l10_staging_datacenter/base_unit.sql
This is what my file looks like:
SELECT
JSON_DATA:"key"::text AS KEY
, value:"description"::text AS DESCRIPTION
, value:"globalHandlingUnitId"::text AS GLOBAL_HANDLING_UNIT_ID
, value:"tareWeight"::NUMBER(38,0) AS TARTE_WEIGHT
, value:"tareWeight_unit"::text AS TARTE_WEIGHT_UNIT
, value:"width"::NUMBER(38,0) AS WIDTH
, value:"width_unit"::text AS WIDTH_UNIT
, value:"length"::NUMBER(38,0) AS LENGTH
, value:"validFrom"::TIMESTAMP_NTZ AS VALID_FROM_TS_UTC
, value:"validTo"::TIMESTAMP_NTZ AS VALID_TO_TS_UTC
, value:"lastModified"::TIMESTAMP_NTZ AS LAST_MODIFIED_TS_UTC
, value:"status"::text AS STATUS
, md5(KEY::STRING || MASTERCLIENT_ID) AS HANDLING_UNIT_KEY --different logic than in POSTGRESDWH!
,MASTERCLIENT_ID
,{{ extract_masterclientname_clause('META_FILENAME') }} AS MASTERCLIENT_NAME
,META_ROW_NUM
,META_FILENAME
,META_LOAD_TS_UTC
,META_FILE_TS_UTC
,CASE WHEN {{table_dedup_clause('HANDLING_UNIT_KEY')}}
THEN True
ELSE False
END AS IS_RECORD_CURRENT
FROM {{ source('INGEST_DATACENTER', 'HANDLING_UNIT') }} src
QUALIFY {{table_dedup_clause('HANDLING_UNIT_KEY')}}
It could also be because of the STRING cast in md5(KEY::STRING || MASTERCLIENT_ID), but I have another file based on the same pattern that does not throw an error, though:
SELECT
JSON_DATA:"issueId"::NUMBER(38,0) AS ISSUE_ID
, value:"slaName"::text AS SLA_NAME
, value:"slaTimeLeft"::NUMBER(38,0) AS SLA_TIME_USED_SECONDS
, md5(ISSUE_ID::STRING || SLA_NAME) AS ISSUE_SLA_ID
,MASTERCLIENT_ID
,{{ extract_masterclientname_clause('META_FILENAME') }} AS MASTERCLIENT_NAME
,META_ROW_NUM
,META_FILENAME
,META_LOAD_TS_UTC
,META_FILE_TS_UTC
,CASE WHEN {{table_dedup_clause('ISSUE_SLA_ID')}}
THEN True
ELSE False
END AS IS_RECORD_CURRENT
FROM {{ source('INGEST_EMS', 'ISSUES') }} src
, lateral flatten ( input => JSON_DATA:slas)
QUALIFY {{table_dedup_clause('ISSUE_SLA_ID')}}
I don't see any significant difference between the two.
value is the output column of a FLATTEN, which you have in your second SQL but not in your first.
This is where giving every table an alias, and using it on EVERY column reference, pays off; you would see something like
SELECT t.json_data:"key",
f.value:"json_prop_name"
FROM table AS t;
and immediately wonder: where does f come from?
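Concretely, the first model would need its own LATERAL FLATTEN before `value` means anything. This is only a sketch: the flatten input path (`src.JSON_DATA:handlingUnits`) and the `f` alias are assumptions, since the question doesn't show the JSON layout; only the first few columns are repeated here:

```sql
SELECT
    src.JSON_DATA:"key"::text             AS KEY
  , f.value:"description"::text           AS DESCRIPTION
  , f.value:"tareWeight"::NUMBER(38,0)    AS TARE_WEIGHT
  -- ... the remaining value:"..." columns follow the same f.value pattern ...
FROM {{ source('INGEST_DATACENTER', 'HANDLING_UNIT') }} src
   , LATERAL FLATTEN ( input => src.JSON_DATA:handlingUnits ) f
```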
The most likely reason is the column is not named "tareWeight_unit". Snowflake creates column names in upper case regardless of how they are written unless the original create statement puts the columns names in double quotes (e.g. "MyColumn") in which case it will create the column names with the exact case specified. Use SHOW COLUMNS IN TABLE and check the actual column name.
I've created 3 computed columns as aliases and then used the aliased columns to calculate the total cost. This is the query:
SELECT TOP 1000 [Id]
,[QuantityOfProduct]
,[Redundant_ProductName]
,[Order_Id]
,(CASE
WHEN [PriceForUnitOverride] is NULL
THEN [Redundant_PriceForUnit]
ELSE
[PriceForUnitOverride]
END
) AS [FinalPriceForUnit]
,(CASE
WHEN [QuantityUnit_Override] is NULL
THEN [Redundant_QuantityUnit]
ELSE
[QuantityUnit_Override]
END
) AS [FinalQuantityUnit]
,(CASE
WHEN [QuantityAtomic_Override] is NULL
THEN [Redundant_QuantityAtomic]
ELSE
[QuantityAtomic_Override]
END
) AS [Final_QuantityAtomic]
--***THIS IS WHERE THE QUERY CREATES AN ERROR***--
,([QuantityOfProduct]*[FinalPriceForUnit]*
([Final_QuantityAtomic]/[FinalQuantityUnit])) AS [Final_TotalPrice]
FROM [dbo].[ItemInOrder]
WHERE [IsSoftDeleted] = 0
ORDER BY [Order_Id]
The console returns this ERROR message:
Msg 207, Level 16, State 1, Line 55
Invalid column name 'FinalPriceForUnit'.
Msg 207, Level 16, State 1, Line 55
Invalid column name 'Final_QuantityAtomic'.
Msg 207, Level 16, State 1, Line 55
Invalid column name 'FinalQuantityUnit'.
If I remove the "AS [Final_TotalPrice]" computed column, no error occurs, but I need the total price. How can I solve this issue? It seems as if the other aliases have not yet been created when Final_TotalPrice is reached.
You can't use column aliases in the same SELECT where they are defined. The normal solution is CTEs or subqueries. But SQL Server also offers APPLY. (Oracle also supports APPLY, and other databases such as Postgres support lateral joins using the LATERAL keyword.)
I like this solution, because you can create arbitrarily nested expressions and don't have to worry about indenting:
SELECT TOP 1000 io.Id, io.QuantityOfProduct, io.Redundant_ProductName,
       io.Order_Id,
       x.FinalPriceForUnit, x.FinalQuantityUnit, x.Final_QuantityAtomic,
       (io.QuantityOfProduct * x.FinalPriceForUnit * x.Final_QuantityAtomic / x.FinalQuantityUnit
       ) as Final_TotalPrice
FROM dbo.ItemInOrder io OUTER APPLY
     (SELECT COALESCE(io.PriceForUnitOverride, io.Redundant_PriceForUnit) as FinalPriceForUnit,
             COALESCE(io.QuantityUnit_Override, io.Redundant_QuantityUnit) as FinalQuantityUnit,
             COALESCE(io.QuantityAtomic_Override, io.Redundant_QuantityAtomic) as Final_QuantityAtomic
     ) x
WHERE io.IsSoftDeleted = 0
ORDER BY io.Order_Id;
Notes:
I don't find that [ and ] help me read or write queries at all.
COALESCE() is much simpler than your CASE statements.
With COALESCE() you might consider just putting the COALESCE() expression in the final calculation.
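That last note would look like the following sketch: skip the APPLY entirely and inline the COALESCE() expressions in the final calculation (column names as in the question):

```sql
SELECT TOP 1000 Id, QuantityOfProduct, Redundant_ProductName, Order_Id,
       COALESCE(PriceForUnitOverride, Redundant_PriceForUnit) AS FinalPriceForUnit,
       COALESCE(QuantityUnit_Override, Redundant_QuantityUnit) AS FinalQuantityUnit,
       COALESCE(QuantityAtomic_Override, Redundant_QuantityAtomic) AS Final_QuantityAtomic,
       -- Repeat the COALESCE()s instead of referencing the aliases above:
       QuantityOfProduct
         * COALESCE(PriceForUnitOverride, Redundant_PriceForUnit)
         * COALESCE(QuantityAtomic_Override, Redundant_QuantityAtomic)
         / COALESCE(QuantityUnit_Override, Redundant_QuantityUnit) AS Final_TotalPrice
FROM dbo.ItemInOrder
WHERE IsSoftDeleted = 0
ORDER BY Order_Id;
```

The trade-off is repetition: each override/fallback pair appears twice, but everything stays in one SELECT.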
You can't use an alias in the same SELECT where it is defined. What you can do is compute the value in a subquery and then use it outside in the expression (or repeat the whole CASE expression). Also, use COALESCE instead of CASE.
select t.*,
([QuantityOfProduct] * [FinalPriceForUnit] * ([Final_QuantityAtomic] / [FinalQuantityUnit])) as [Final_TotalPrice]
from (
select top 1000 [Id],
[QuantityOfProduct],
[Redundant_ProductName],
[Order_Id],
coalesce([PriceForUnitOverride], [Redundant_PriceForUnit]) as [FinalPriceForUnit],
coalesce([QuantityUnit_Override], [Redundant_QuantityUnit]) as [FinalQuantityUnit],
coalesce([QuantityAtomic_Override], [Redundant_QuantityAtomic]) as [Final_QuantityAtomic]
from [dbo].[ItemInOrder]
where [IsSoftDeleted] = 0
order by [Order_Id]
) t;
I have fields like date_column = 20140228 in table1. When I hard-code the value as below it works, but when I specify the column name it fails with this error:
H110 Unable to submit statement. Error while compiling statement: FAILED: ParseException line 2:1 cannot recognize input near 'select' 'date_format' '(' in select clause [ERROR_STATUS]
Working:
select date_format(from_unixtime(unix_timestamp(cast('2014022817' as string),'yyyyMMddHH')),'yyyy-MM-dd HH');
Failing:
select
select date_format(from_unixtime(unix_timestamp(cast(date_column as string),'yyyyMMddHH')),'yyyy-MM-dd HH')
from
table1
Why are you repeating the select? Try this:
select date_format(from_unixtime(unix_timestamp(cast(date_column as string), 'yyyyMMddHH')), 'yyyy-MM-dd HH')
from table1
Is there any way to convert the LEAD function below into Hive QL format?
NVL(LEAD(START_DT) OVER (PARTITION BY EV_ID,AR_EV_RLTNSHP_TYPE_CD ORDER BY START_DT)-1,'2099-12-31') AS DERIVED_END_DT
Here is the error:
FAILED: ParseException line 1:1599 missing ) at 'OVER' near '(' in
subquery source line 1:1603 missing FROM at '(' near '(' in subquery
source line 1:1604 cannot recognize input near 'PARTITION' 'BY'
'EV_ID' in subquery source
It is complicated in Hive QL, but you can do it with a left join and aggregation:
select t.ev_id, t.ar_ev_rltnshp_type_cd, t.start_dt,
       coalesce(min(tnext.start_dt) - 1, '2099-12-31') as derived_end_dt
from table t left join
     table tnext
     on t.ev_id = tnext.ev_id and t.ar_ev_rltnshp_type_cd = tnext.ar_ev_rltnshp_type_cd and
        tnext.start_dt > t.start_dt
group by t.ev_id, t.ar_ev_rltnshp_type_cd, t.start_dt;
This makes certain assumptions about start_dt being unique within a given group, but it will probably work for your purposes.
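Note that if your Hive version is 0.11 or later, LEAD is supported natively, so the direct translation may be simpler than the join. A sketch, assuming start_dt is a DATE so date_sub can replace the "- 1" day arithmetic (`your_table` is a placeholder name):

```sql
select ev_id, ar_ev_rltnshp_type_cd, start_dt,
       coalesce(
         -- NVL becomes coalesce; "- 1" on a date becomes date_sub(..., 1)
         date_sub(lead(start_dt) over (
           partition by ev_id, ar_ev_rltnshp_type_cd
           order by start_dt
         ), 1),
         '2099-12-31'
       ) as derived_end_dt
from your_table;
```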