Redshift Correlated Subquery Internal Error - sql

So I have a table of bids in Amazon Redshift. Each bid has a description and a user who made the bid, and for each bid I want to know if a user made a bid with the same description in the last 5 days.
The query looks like this:
select b1.bid_id, case when
exists(select b2.bid_id from dim_bid b2 WHERE b1.user_id = b2.user_id
and b2.bid_timestamp < b1.bid_timestamp and b2.bid_timestamp > b1.bid_timestamp - INTERVAL '5 day'
and b2.description = b1.description and b2.bid_timestamp > '2017-04-25') then 'good bid' else 'duplicate bid' END
from dim_bid b1
where b1.hidden
This fails with the error "this type of correlated subquery is not supported due to internal error". However, when I just add "= True" at the end, it works:
select b1.bid_id, case when
exists(select b2.bid_id from dim_bid b2 WHERE b1.user_id = b2.user_id
and b2.bid_timestamp < b1.bid_timestamp and b2.bid_timestamp > b1.bid_timestamp - INTERVAL '5 day'
and b2.description = b1.description and b2.bid_timestamp > '2017-04-25') then 'good bid' else 'duplicate bid' END
from dim_bid b1
where b1.hidden = True
Is this just a bug, or is there some deep reason why the first one can't be done?

I think the better way to write the query uses lag():
select b.*,
(case when lag(b.bid_timestamp) over (partition by b.user_id, b.description order by b.bid_timestamp) > b.bid_timestamp - interval '5 day'
then 'good bid' else 'duplicate bid'
end)
from dim_bid b;
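To make the lag() idea concrete, here is a minimal sketch in plain Python, not Redshift code. It assumes bids arrive as (bid_id, user_id, description, bid_timestamp) tuples with distinct timestamps inside each user/description group; the function name and the sample rows are made up for illustration. It only returns a flag saying whether the previous matching bid falls inside the 5-day window, so which label you attach to that flag is up to you.

```python
from datetime import datetime, timedelta


def recent_repeat_flags(bids, window=timedelta(days=5)):
    """Return {bid_id: True/False}, True when the previous bid by the same
    user with the same description is less than `window` older."""
    flags = {}
    previous = {}  # (user_id, description) -> timestamp of the preceding bid
    # Sorting by timestamp plays the role of ORDER BY inside the window.
    for bid_id, user_id, description, ts in sorted(bids, key=lambda b: b[3]):
        key = (user_id, description)   # plays the role of PARTITION BY
        prev_ts = previous.get(key)    # plays the role of lag(bid_timestamp)
        flags[bid_id] = prev_ts is not None and prev_ts > ts - window
        previous[key] = ts
    return flags


# Hypothetical sample data.
bids = [
    (1, "u1", "bike", datetime(2017, 5, 1)),
    (2, "u1", "bike", datetime(2017, 5, 3)),   # 2 days after bid 1
    (3, "u1", "bike", datetime(2017, 5, 20)),  # 17 days after bid 2
    (4, "u2", "bike", datetime(2017, 5, 4)),   # different user
    (5, "u1", "car", datetime(2017, 5, 4)),    # different description
]
flags = recent_repeat_flags(bids)
```

Looking only at the immediately preceding bid is enough here: if any earlier bid falls inside the window, the most recent earlier one does too.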

Try to run this first:
select b1.bid_id
from dim_bid b1
where b1.hidden
You will see that Redshift raises a different error (e.g. WHERE must be type boolean...). The argument of WHERE must be a boolean for the query to run, so when you add '= True' the argument is boolean and the query runs. I have noticed that when a query contains a correlated subquery and also an invalid operation, Redshift raises the correlated subquery error instead. This might be because Redshift does not support some kinds of correlated subqueries.

The docs state the following:
We recommend always checking Boolean values explicitly, as shown in the examples following. Implicit comparisons, such as WHERE flag or WHERE NOT flag might return unexpected results.
Reference: http://docs.aws.amazon.com/redshift/latest/dg/r_Boolean_type.html
I do not think this is necessarily a bug. I would recommend always checking boolean values explicitly, as in where b1.hidden is True. I have seen this error quite a few times when using correlated subqueries, but I have always been able to fix it by explicitly checking the boolean values using is true/false/unknown.

Related

Using BETWEEN operator in a WHERE clause with dates from an internal table

I have an internal table populated with start and end dates for each type of period. I want to use this internal table in a WHERE clause of an SQL query to select items whose start and end dates are within the open period of their respective type.
TYPES: BEGIN OF s_openprd,
TETXT TYPE TETXT,
fromdate TYPE d,
todate TYPE d,
END OF s_openprd.
DATA: it_openprd TYPE TABLE OF s_openprd WITH KEY TETXT.
SELECT * FROM FPLT
INNER JOIN @it_openprd AS OP ON FPLT~TETXT = OP~TETXT
WHERE FPLT~FKDAT BETWEEN OP~fromdate AND OP~todate
AND FPLT~NFDAT BETWEEN OP~fromdate AND OP~todate
However I get the error saying that OP~fromdate should be of a compatible type to be used as an operator with BETWEEN. The types listed include the date type d.
I've tried replacing BETWEEN with regular >= and <= operators:
SELECT * FROM FPLT
INNER JOIN @it_openprd AS OP ON FPLT~TETXT = OP~TETXT
WHERE FPLT~FKDAT >= OP~fromdate AND FPLT~FKDAT <= OP~todate
AND FPLT~NFDAT >= OP~fromdate AND FPLT~NFDAT <= OP~todate
But the query returns incorrect results.
I assume the ABAP type d is incompatible with SQL type d?
How can I use an internal table to restrict the selection in this way?
Nothing prevents you from using FOR ALL ENTRIES instead of joining with the internal table, if your ABAP version does not support the join.
Regarding the "incorrect results", I agree with Sandra: BETWEEN and the >= / <= pair have exactly the same meaning, so it is more a matter of what you expect than of correctness. I'd rather use standard logic to deal with the issue that bothers you:
The problem with FPLT-NFDAT and FPLT-FKDAT is that their order is not consistent. In one entry, the value of NFDAT may be anterior to FKDAT and in another entry it's the opposite.
Following the same approach for your SQL query, you can write something like this:
TYPES: BEGIN OF ty_fplt,
fplnr TYPE fplnr,
fkdat TYPE fkdat,
nfdat TYPE nfdat,
END OF ty_fplt,
tt_fplt TYPE STANDARD TABLE OF ty_fplt WITH NON-UNIQUE KEY fkdat nfdat.
DATA(lt_fplt_base) = VALUE tt_fplt( ).
SELECT fplnr, CASE WHEN nfdat < fkdat THEN nfdat ELSE fkdat END AS fkdat,
CASE WHEN nfdat < fkdat THEN fkdat ELSE nfdat END AS nfdat
FROM fplt
INTO TABLE @lt_fplt_base.
SELECT *
FROM fplt AS f
INTO TABLE @DATA(result)
FOR ALL ENTRIES IN @lt_fplt_base
WHERE f~fplnr = @lt_fplt_base-fplnr
AND f~fkdat >= @lt_fplt_base-fkdat
AND f~nfdat <= @lt_fplt_base-nfdat.
Don't take it as a rule of thumb, it is just a quick suggestion.
P.S. Joining by a text field (INNER JOIN @it_openprd AS OP ON FPLT~TETXT = OP~TETXT) does not make sense in any context. Text/string fields are often ambiguous; they often contain control characters, whitespace, etc., which makes them useless as a primary key.
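The date-ordering point above may be easier to see outside ABAP. Below is a small Python sketch, not ABAP code, of the same normalisation the two CASE expressions perform: put the earlier of FKDAT/NFDAT first, then check the resulting range against the open period. The period name, the sample rows and the helper in_open_period are hypothetical.

```python
from datetime import date


def in_open_period(fkdat, nfdat, fromdate, todate):
    """True when both dates lie inside [fromdate, todate], whichever comes first."""
    # min/max play the same role as the two CASE WHEN nfdat < fkdat expressions.
    lo, hi = min(fkdat, nfdat), max(fkdat, nfdat)
    return fromdate <= lo and hi <= todate


# Hypothetical open periods keyed by period type.
open_periods = {"MONTHLY": (date(2023, 1, 1), date(2023, 1, 31))}

# Hypothetical rows: (fplnr, tetxt, fkdat, nfdat).
rows = [
    ("0001", "MONTHLY", date(2023, 1, 5), date(2023, 1, 20)),
    ("0002", "MONTHLY", date(2023, 1, 25), date(2023, 1, 10)),  # nfdat before fkdat
    ("0003", "MONTHLY", date(2023, 1, 15), date(2023, 2, 2)),   # ends outside the period
]

selected = [
    fplnr
    for fplnr, tetxt, fkdat, nfdat in rows
    if tetxt in open_periods and in_open_period(fkdat, nfdat, *open_periods[tetxt])
]
```

Row 0002 is kept even though its dates are stored in reverse order, which is the case that tends to surprise people.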

Issue With SQL Pivot Function

I have a SQL query where I am trying to replace null results with zero. My code is producing an error:
ORA-00923: FROM keyword not found where expected
I am using an Oracle Database.
Select service_sub_type_descr,
nvl('Single-occupancy',0) as 'Single-occupancy',
nvl('Multi-occupancy',0) as 'Multi-occupancy'
From
(select s.service_sub_type_descr as service_sub_type_descr, ch.claim_id,nvl(ci.item_paid_amt,0) as item_paid_amt
from table_1 ch, table_2 ci, table_3 s, table_4 ppd
where ch.claim_id = ci.claim_id and ci.service_type_id = s.service_type_id
and ci.service_sub_type_id = s.service_sub_type_id and ch.policy_no = ppd.policy_no)
Pivot (
count(distinct claim_id), sum(item_paid_amt) as paid_amount For service_sub_type_descr IN ('Single-occupancy', 'Multi-occupancy')
)
This expression:
nvl('Single-occupancy',0) as 'Single-occupancy',
is using an Oracle-specific function to say: if the value of the string 'Single-occupancy' is null, then return the number 0.
That logic doesn't really make sense. The string value is never null. And, the return value is sometimes a string and sometimes a number. This should generate a type-conversion error, because the first value cannot be converted to a number.
I think you intend:
coalesce("Single-occupancy", 0) as "Single-occupancy",
The double quotes are used to quote identifiers, so this refers to the column called Single-occupancy.
All that said, fix your data model. Don't have identifiers that need to be quoted. You might not have control in the source data but you definitely have control within your query:
coalesce("Single-occupancy", 0) as Single_occupancy,
EDIT:
Just write the query using conditional aggregation and proper JOINs:
select s.service_sub_type_descr, ch.claim_id,
sum(case when service_sub_type_descr = 'Single-occupancy' then item_paid_amt else 0 end) as single_occupancy,
sum(case when service_sub_type_descr = 'Multi-occupancy' then item_paid_amt else 0 end) as multi_occupancy
from table_1 ch join
table_2 ci
on ch.claim_id = ci.claim_id join
table_3 s
on ci.service_type_id = s.service_type_id join
table_4 ppd
on ch.policy_no = ppd.policy_no
group by s.service_sub_type_descr, ch.claim_id;
Much simpler in my opinion.
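If it helps, here is a small self-contained sketch of conditional aggregation that you can run locally. It uses Python's sqlite3 module as a stand-in for Oracle, and the single claims table with its rows is invented for the example (the real query joins four tables). The point is only that SUM(CASE ... ELSE 0 END) already yields 0 instead of NULL, so no NVL is needed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical flattened data standing in for the joined tables.
conn.executescript("""
CREATE TABLE claims (claim_id INTEGER, service_sub_type_descr TEXT, item_paid_amt INTEGER);
INSERT INTO claims VALUES
  (1, 'Single-occupancy', 100),
  (1, 'Single-occupancy', 50),
  (2, 'Multi-occupancy', 30),
  (3, 'Other', 10);
""")

totals = conn.execute("""
SELECT
  SUM(CASE WHEN service_sub_type_descr = 'Single-occupancy' THEN item_paid_amt ELSE 0 END) AS single_paid,
  SUM(CASE WHEN service_sub_type_descr = 'Multi-occupancy'  THEN item_paid_amt ELSE 0 END) AS multi_paid,
  -- CASE without ELSE yields NULL, which COUNT(DISTINCT ...) ignores
  COUNT(DISTINCT CASE WHEN service_sub_type_descr = 'Single-occupancy' THEN claim_id END) AS single_claims,
  COUNT(DISTINCT CASE WHEN service_sub_type_descr = 'Multi-occupancy'  THEN claim_id END) AS multi_claims
FROM claims
""").fetchone()
```

The two COUNT(DISTINCT ...) columns cover the count(distinct claim_id) part of the original PIVOT.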
For column aliases, you have to use double quotes!
Don't use
as 'Single-occupancy'
but:
as "Single-occupancy",

How to use CASE expression to update a table with inner queries

I have used the following update query :
UPDATE datA_table T
SET T.VALUE=
(SELECT
CASE
WHEN t3.h1 =(t3.h2) and t3.h1=(t3.h3) THEN t3.h1
ELSE
Case
wHEN T3.h1 <> T3.h2 THEN T3.h2
ELSE
cASE
wHEN T3.h1 <> T3.h3 THEN T3.h3
eND
eND
END
from datA_table t3)T1
where t.time=t1.time and t.name=t1.name
But this is giving the following error:
Error report - SQL Error: ORA-00933: SQL command not properly ended
00933. 00000 - "SQL command not properly ended"
Is there any way to resolve this issue?
You can have many WHEN in each CASE, which is also likely to perform better than nested CASEs. I'm unfamiliar with the way you linked the updated table with the subselect (normally you can't refer to internal aliases in a subquery from outside), and don't know if it works at all, so I coded it in a way I know for sure it works:
UPDATE datA_table T
SET T.VALUE=
(SELECT
CASE
WHEN t3.h1 =(t3.h2) and t3.h1=(t3.h3) THEN t3.h1
WHEN T3.h1 <> T3.h2 THEN T3.h2
WHEN T3.h1 <> T3.h3 THEN T3.h3
END
from datA_table t3
where t.id=t3.id AND T.name=T3.name
)
where <your condition for datA_table rows to be updated>
Update
I just noticed that both the table being modified and the table in the subquery are the same table, and that the joining condition is likely to be a key. Therefore, there is no need to specify a subquery at all. The following simpler UPDATE will do the same, but likely much faster (I also improved the CASE logic, as the first test was superfluous):
UPDATE datA_table
SET VALUE = CASE
WHEN h1 <> h2 THEN h2
WHEN h1 <> h3 THEN h3
ELSE h1
END
WHERE <your condition for rows to be updated>
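As a quick sanity check of the simplified CASE, here is a sketch that runs both versions against a throwaway SQLite table through Python's sqlite3 module (a stand-in for Oracle; the table contents are made up). It assumes h1, h2 and h3 are never NULL. With NULLs the two forms can differ, because the three-branch version has no ELSE and returns NULL when none of the comparisons is true, whereas the simplified one falls back to h1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical rows covering: all equal, h2 differs, h3 differs, both differ.
conn.executescript("""
CREATE TABLE data_table (id INTEGER PRIMARY KEY, h1 INTEGER, h2 INTEGER, h3 INTEGER, value INTEGER);
INSERT INTO data_table (id, h1, h2, h3) VALUES
  (1, 5, 5, 5),
  (2, 5, 7, 5),
  (3, 5, 5, 9),
  (4, 5, 7, 9);
""")

# The three-branch CASE from the first version of the answer.
three_branch = conn.execute("""
SELECT id,
       CASE WHEN h1 = h2 AND h1 = h3 THEN h1
            WHEN h1 <> h2 THEN h2
            WHEN h1 <> h3 THEN h3
       END
FROM data_table ORDER BY id
""").fetchall()

# The simplified UPDATE without a subquery.
conn.execute("""
UPDATE data_table
SET value = CASE WHEN h1 <> h2 THEN h2
                 WHEN h1 <> h3 THEN h3
                 ELSE h1
            END
""")
updated = conn.execute("SELECT id, value FROM data_table ORDER BY id").fetchall()
```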
The syntax is incorrect. Everything after the parenthesis ending the SELECT should be omitted.
The subselect takes the place of <expression> in
UPDATE data_table SET value = <expression>
By the way, the CASE construction is overly complicated; you could use a single CASE expression, as in
CASE WHEN <condition1>
THEN <expression1>
WHEN <condition2>
THEN <expression2>
...
ELSE <expression>
END

Use Case Statement in Join

Hi everyone, I want to use a CASE statement in a join using this query, and I got an error:
Select CONVERT(VARCHAR(10), SII.SIDATE,103)DATE,SII.SALEID,SII.ItemName,SI.TenancyID
FROM F_SALESINVOICEITEM SII
INNER JOIN F_SALESINVOICE SI ON SI.SALEID=SII.SALEID
INNER JOIN #TempTableSearch ts ON CASE
WHEN ts.ACCOUNTTYPE = '1' THEN ts.ACCOUNTID=SI.TENANCYID
WHEN ts.ACCOUNTTYPE='2' THEN ts.ACCOUNTID=SI.EMPLOYEEID
WHEN ts.ACCOUNTTYPE='3' THEN ts.ACCOUNTID=SI.SUPPLIERID
WHEN ts.ACCOUNTTYPE='4' THEN ts.ACCOUNTID=SI.SALESCUSTOMERID
Error
Incorrect syntax near '='.
Please help me to solve this error.
It should be:
ON
ts.ACCOUNTID = CASE
WHEN ts.ACCOUNTTYPE = '1' THEN SI.TENANCYID
WHEN ts.ACCOUNTTYPE = '2' THEN SI.EMPLOYEEID
WHEN ts.ACCOUNTTYPE = '3' THEN SI.SUPPLIERID
WHEN ts.ACCOUNTTYPE = '4' THEN SI.SALESCUSTOMERID
END
Instead of using CASE, I'd much rather do this:
Select CONVERT(VARCHAR(10), SII.SIDATE,103)DATE,SII.SALEID,SII.ItemName,SI.TenancyID
FROM F_SALESINVOICEITEM SII
INNER JOIN F_SALESINVOICE SI ON SI.SALEID=SII.SALEID
INNER JOIN #TempTableSearch ts ON
(ts.ACCOUNTTYPE='1' AND ts.ACCOUNTID=SI.TENANCYID)
OR (ts.ACCOUNTTYPE='2' AND ts.ACCOUNTID=SI.EMPLOYEEID)
OR (ts.ACCOUNTTYPE='3' AND ts.ACCOUNTID=SI.SUPPLIERID)
OR (ts.ACCOUNTTYPE='4' AND ts.ACCOUNTID=SI.SALESCUSTOMERID)
To explain why the query didn't work for you: the syntax of the CASE requires an END at the end of the clause. It would work, as the other solutions proposed suggest, but I find this version to be more convenient to understand - although this part is highly subjective.
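For anyone who wants to compare the two join styles side by side, here is a small sketch using Python's sqlite3 module rather than SQL Server. The table and column names are simplified stand-ins for F_SALESINVOICE and #TempTableSearch, and the rows are invented. It runs the "CASE returns a value" join and the AND/OR join and keeps both results.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, simplified versions of the invoice and search tables.
conn.executescript("""
CREATE TABLE invoice (sale_id INTEGER, tenancy_id INTEGER, employee_id INTEGER,
                      supplier_id INTEGER, sales_customer_id INTEGER);
CREATE TABLE search (account_type TEXT, account_id INTEGER);
INSERT INTO invoice VALUES (10, 1, 2, 3, 4), (11, 5, 6, 7, 8);
INSERT INTO search VALUES ('1', 1), ('2', 6), ('3', 99), ('4', 2);
""")

# CASE used as an expression that picks which column to compare against.
case_join = conn.execute("""
SELECT i.sale_id, s.account_type
FROM invoice i
JOIN search s ON s.account_id = CASE s.account_type
                                  WHEN '1' THEN i.tenancy_id
                                  WHEN '2' THEN i.employee_id
                                  WHEN '3' THEN i.supplier_id
                                  WHEN '4' THEN i.sales_customer_id
                                END
ORDER BY i.sale_id, s.account_type
""").fetchall()

# The same condition written as a chain of (type AND id) pairs joined by OR.
or_join = conn.execute("""
SELECT i.sale_id, s.account_type
FROM invoice i
JOIN search s ON (s.account_type = '1' AND s.account_id = i.tenancy_id)
              OR (s.account_type = '2' AND s.account_id = i.employee_id)
              OR (s.account_type = '3' AND s.account_id = i.supplier_id)
              OR (s.account_type = '4' AND s.account_id = i.sales_customer_id)
ORDER BY i.sale_id, s.account_type
""").fetchall()
```

Note that the search row ('4', 2) matches nothing: 2 is an employee id on invoice 10, not a sales customer id, so the account type really does decide which column is compared.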
You can also do it this way, which leaves no chance to misspell something (note that ACCOUNTTYPE and ACCOUNTID are used only where needed, so you don't have to copy-paste them):
select
convert(varchar(10), SII.SIDATE,103) as DATE,
SII.SALEID, SII.ItemName, SI.TenancyID
from F_SALESINVOICEITEM as SII
inner join F_SALESINVOICE as SI on SI.SALEID = SII.SALEID
outer apply (values
('1', SI.TENANCYID),
('2', SI.EMPLOYEEID),
('3', SI.SUPPLIERID),
('4', SI.SALESCUSTOMERID)
) as C(ACCOUNTTYPE, ACCOUNTID)
inner join #TempTableSearch as ts on
ts.ACCOUNTTYPE = C.ACCOUNTTYPE and ts.ACCOUNTID = C.ACCOUNTID
You have a syntax error: you are missing END there.
You must understand that a CASE ... END block is NOT equivalent to IF { } in C-like languages. Rather, it is equivalent to an elaborate version of the ... ? ... : ... operator from C-like languages. This means that the WHOLE CASE block must evaluate to a single value, and that this value has to be of the same type no matter which branch of the block is executed. This means that:
CASE
WHEN ts.ACCOUNTTYPE = '1' THEN ts.ACCOUNTID=SI.TENANCYID ...
END
is fundamentally incorrect unless you work on a database version that allows a boolean as a value (SQL Server won't allow it, for example, but I think some MySQL versions used to allow it; I'm not sure about this). You probably should write something like:
CASE
WHEN ts.ACCOUNTTYPE = '1' AND ts.ACCOUNTID=SI.TENANCYID THEN 1
WHEN ts.ACCOUNTTYPE='2' AND ts.ACCOUNTID=SI.EMPLOYEEID THEN 1
WHEN ts.ACCOUNTTYPE='3' AND ts.ACCOUNTID=SI.SUPPLIERID THEN 1
WHEN ts.ACCOUNTTYPE='4' AND ts.ACCOUNTID=SI.SALESCUSTOMERID THEN 1
ELSE 0
END = 1
Notice how the whole CASE block evaluates to 1 or 0 and is then compared to 1. Of course, instead of four WHENs you could use one WHEN with a combination of ANDs, ORs and parentheses. In this particular case the answer by @ppeterka 66 is correct, as CASE is not suited to what you really wanted to do; I'm just trying to clarify what CASE really is.

HIVE - hive subquery is not working with case when statement with IN clause

I am trying to migrate data from MySQL to Hive. I am not able to write a CASE WHEN statement that uses a subquery in an IN clause. This is my query. Can you please help? Am I not following the proper syntax?
CREATE TABLE HIVE_TPCE_TEMP.TMP_CDMA_CD AS
SELECT A.DRI,C.BOUND_ID,A.CT_ID,A.CD_ID,A.CID,
A.TID,A.TASK_SEQ_ID,A.DIV_ID,C.BLOCK_GROUP_ID,C.ZIP_CODE,C.ROAD_CATEGORY_ID,A.RXPOWER,"${hiveconf:C_CDMA_DEVICE_ONLINE_RXPOWER_METRIC_ID}" METRIC_ID,
CASE WHEN
((A.DRI,A.DIV_ID,A.RFID) in (SELECT DRI,DIV_ID,HOME_RFID FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS)) THEN
CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
WHEN (COALESCE(A.RFID,0) = 0) AND ((A.DRI,A.DIV_ID,D.FR,D.SUBBAND) IN (SELECT DRI,DIV_ID,HOME_FR,
HOME_SUBBAND FROM HIVE_TPCE_TEMP.TMP_HOME_NETWORKS))
THEN CASE WHEN MODE IN ('A','N') THEN "${hiveconf:HAD}" ELSE "${hiveconf:HD}" END
ELSE CASE WHEN MODE IN ('A','N') THEN "${hiveconf:PAI}" ELSE "${hiveconf:PDI}" END END HPDA_ID
FROM HIVE_TPCE.VW_CDMA_CD A INNER JOIN HIVE_TPCE.STG_CURRENT_FILES B
ON A.DRI = B.DRI AND A.SOURCE_FILE_ID = B.SOURCE_FILE_ID
INNER JOIN
HIVE_TPCE.WRK_LOCATION C
ON A.DRI = C.DRI AND A.LOCATION_ID = C.LOCATION_ID
INNER JOIN
HIVE_TPCE.LU_RADIO D
ON A.RADIO_ID = D.RADIO_ID WHERE A.CID > 0 AND D.MODE IN ('A','N') AND A.RXPOWER IS NOT NULL AND A.CALL_RESULT_ID BETWEEN 1 AND 16;
My error signature is
FAILED: ParseException line 10:42 mismatched input ',' expecting ) near 'DRI' in expression specification
According to the Hive Language Manual: "Hive supports subqueries only in the FROM clause".
Your CASE WHEN is part of the SELECT clause, but it includes a SELECT subquery. It seems that this is not supported, so your syntax is not correct (in Hive).
Perhaps you could stage the data in MySQL using the query you have and then load it into Hive using a simple SELECT without CASE WHEN?
See official document.
It says
Assumptions
We plan to limit the scope with the following assumptions and limitations.
Subqueries could only be top-level expressions in SELECT. That is, subqueries in complex expressions, aggregates, UDFs, etc. will not be supported for now
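One workaround that is not mentioned in the answers above, so treat it as a suggestion rather than the documented fix: since subqueries in the FROM clause are accepted, the tuple IN (subquery) test can usually be rewritten as a LEFT JOIN against a de-duplicated subquery, with the CASE checking whether the join found a match. Here is a minimal sketch of that rewrite, run against SQLite through Python's sqlite3 module instead of Hive. The table contents and the 'HOME'/'OTHER' labels are made up, and only the first IN condition from the question is modelled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical, cut-down versions of VW_CDMA_CD and TMP_HOME_NETWORKS.
conn.executescript("""
CREATE TABLE cd (cd_id INTEGER, dri INTEGER, div_id INTEGER, rfid INTEGER);
CREATE TABLE home_networks (dri INTEGER, div_id INTEGER, home_rfid INTEGER);
INSERT INTO cd VALUES (1, 100, 1, 7), (2, 100, 1, 8), (3, 200, 2, 7);
INSERT INTO home_networks VALUES (100, 1, 7), (100, 1, 7), (200, 2, 9);
""")

rows = conn.execute("""
SELECT a.cd_id,
       -- a matched join row takes the place of "(dri, div_id, rfid) IN (subquery)"
       CASE WHEN h.dri IS NOT NULL THEN 'HOME' ELSE 'OTHER' END AS network_class
FROM cd a
LEFT JOIN (SELECT DISTINCT dri, div_id, home_rfid FROM home_networks) h
       ON a.dri = h.dri AND a.div_id = h.div_id AND a.rfid = h.home_rfid
ORDER BY a.cd_id
""").fetchall()
```

The DISTINCT in the subquery matters: without it, duplicate rows in the lookup table would multiply the rows on the left side, which IN never does.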