Convert from lateral view to case statements in hive - sql

I need to place below code in case statement:
select
count (*)
from db.tab1
lateral view explode(secondary.tertiary) exp as lv
where id IN ('6','1') and array_contains (lv.ci, "1");
I have tried:
select
sum(
case
when id IN ('6','1')
and array_contains ((lateral view explode(secondary.tertiary)).ci, "1")
then 1
else 0
end)
from db.tab1;
But getting error.

select
count(*),
sum(if(..., 1, 0))
from db.tab1
lateral view explode(secondary.tertiary) exp as lv
;
For the provided SQL on table tab1, the actual logic is like:
Explode field secondary.tertiary, alias it as lv, which results in a temporary result set (table) tab2;
A join-like operation to concatenate tab2's fields back to rows in tab1, resulting in another intermediate table tab3;
Select from tab3, upon which where conditions are applied.

Related

Stacking my conditions in a CASE statement it's not returning all cases for each member

SELECT DISTINCT
Member_ID,
CASE
WHEN a.ASTHMA_MBR = 1 THEN 'ASTHMA'
WHEN a.COPD_MBR = 1 THEN 'COPD'
WHEN a.HYPERTENSION_MBR = 1 THEN 'HYPERTENSION'
END AS DX_FLAG
So a member may have more than one, but my statement is only returning one of them.
I'm using Teradata and trying to convert multiple columns of boolean data into one column. The statement is only returning one condition when members may have 2 or more. I tried using Select instead of Select Distinct and it made no difference.
This is a kind of UNPIVOT:
with base_data as
( -- select the columns you want to unpivot
select
member_id
,date_col
-- the aliases will be the final column value
,ASTHMA_MBR AS ASTHMA
,COPD_MBR AS COPD
,HYPERTENSION_MBR AS HYPERTENSION
from your_table
)
,unpvt as
(
select member_id, date_col, x, DX_FLAG
from base_data
-- now unpivot those columns into rows
UNPIVOT(x FOR DX_FLAG IN (ASTHMA, COPD, HYPERTENSION)
) dt
)
select member_id, DX_FLAG, date_col
from unpvt
-- only show rows where the condition is true
where x = 1

BigQuery use the where clause to filter on a column that not always exists in the table

I need to create some kind of a uniform query for multiple tables. Some tables contain a certain column with a type. If this is the case, I need to apply filtering to it. I don't know how to do this.
I have for example two tables
table_customer_1
CustomerId, CustomerType
1, 1
2, 1
3, 2
Table_customer_2
Customerid
4
5
6
The query needs to be something like the one below and should work for both tables (the table name wil be replaced by the customer that uses the query):
With input1 as(
SELECT
(CASE WHEN exists(customerType) THEN customerType ELSE "0" END) as customerType, *
FROM table_customer_1)
SELECT * from input1
WHERE customerType != 2
Below is for BigQuery Standard SQL
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE SAFE_CAST(IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') AS INT64) != 2
or as a simplification you can ignore casting to INT64 and use comparison to STRING
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') != '2'
above will work for whatever table you put instead of project.dataset.table: either project.dataset.table_customer_1 or project.dataset.table_customer_2 - so quite generic I think
I can think of no good reason for doing this. However, it is possible by playing with the scoping rules for subqueries:
SELECT t.*
FROM (SELECT t.*,
(SELECT customerType -- will choose from tt if available, otherwise x
FROM table_customer_1 tt
WHERE tt.Customerid = t.Customerid
) as customerType
FROM (SELECT t.* EXCEPT (Customerid)
FROM table_customer_1 t
) t CROSS JOIN
(SELECT 0 as customerType) x
) t
WHERE customerType <> 2

Using a case statement as an if statement

I am attempting to create an IF statement in BigQuery. I have built a concept that will work but it does not select the data from a table, I can only get it to display 1 or 0
Example:
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN
(Select * from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest`) -- This Does not
work Scalar subquery cannot have more than one column unless using SELECT AS
STRUCT to build STRUCT values at [16:4] END
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN 1 --- This does work
Else
0
END
How can I Get this query to return results from an existing table?
You question is still a little generic, so my answer same as well - and just mimic your use case at extend I can reverse engineer it from your comments
So, in below code - project.dataset.yourtable mimics your table ; whereas
project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered mimic your respective views
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'bbb' cols, 'latest' filter
), `project.dataset.yourtable_Prior_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'prior'
), `project.dataset.yourtable_Latest_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'latest'
), check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
the result is
Row cols filter
1 aaa prior
2 bbb latest
if you changed your table to
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'aaa' cols, 'latest' filter
) ......
the result will be
Row cols filter
Query returned zero records.
I hope this gives you right direction
Added more explanations:
I can be wrong - but per your question - it looks like you have one table project.dataset.yourtable and two views project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered which present state of your table prior and after some event
So, first three CTE in the answer above just mimic those table and views which you described in your question.
They are here so you can see concept and can play with it without any extra work before adjusting this to your real use-case.
For your real use-case you should omit them and use your real table and views names and whatever columns the have.
So the query for you to play with is:
#standardSQL
WITH check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
It should be a very simple IF statement in any language.
Unfortunately NO! it cannot be done with just simple IF and if you see it fit you can submit a feature request to BigQuery team for whatever you think makes sense

Yield Return equivalent in SQL Server

I am writing down a view in SQL server (DWH) and the use case pseudo code is:
-- Do some calculation and generate #Temp1
-- ... contains other selects
-- Select statement 1
SELECT * FROM Foo
JOIN #Temp1 tmp on tmp.ID = Foo.ID
WHERE Foo.Deleted = 1
-- Do some calculation and generate #Temp2
-- ... contains other selects
-- Select statement 2
SELECT * FROM Foo
JOIN #Temp2 tmp on tmp.ID = Foo.ID
WHERE Foo.Deleted = 1
The result of the view should be:
Select Statement 1
UNION
Select Statement 2
The intended behavior is the same as the yield returnin C#. Is there a way to tell the view which SELECT statements are actually part of the result and which are not? since the small calculations preceding what I need also contain selects.
Thank you!
Yield return in C# returns rows one at a time as they appear in some underlying function. This concept does not exist in SQL statements. SQl is set-based, returning the entire result set, conceptually as a unit. (That said, sometimes queries run slowly and you will see rows returned slowly or in batches.)
You can control the number of rows being returns using TOP (in SQL Server). You can select particular rows to be returned using WHERE statements. However, you cannot specify a UNION statement that conditionally returns rows from some components but not others.
The closest you may be able to come is something like:
if UseTable1Only = 'Y'
select *
from Table1
else if UseTable2Only = 'Y'
select *
from Table2
else
select *
from table1
union
select *
from table2
You can do something similar using dynamic SQL, by constructing the statement as a string and then executing it.
I found a better work around. It might be helpful for someone else. It is actually to include all the calculation inside WITH statements instead of doing them in the view core:
WITH Temp1 (ID)
AS
(
-- Do some calculation and generate #Temp1
-- ... contains other selects
)
, Temp2 (ID)
AS
(
-- Do some calculation and generate #Temp2
-- ... contains other selects
)
-- Select statement 1
SELECT * FROM Foo
JOIN Temp1 tmp on tmp.ID = Foo.ID
WHERE Foo.Deleted = 1
UNION
-- Select statement 2
SELECT * FROM Foo
JOIN Temp2 tmp on tmp.ID = Foo.ID
WHERE Foo.Deleted = 1
The result will be of course the UNION of all the outiside SELECT statements.

return a default record from a sql query

I have a sql query that I run against a sql server database eg.
SELECT * FROM MyTable WHERE Id = 2
This may return a number of records or may return none. If it returns none, I would like to alter my sql query to return a default record, is this possible and if so, how? If records are returned, the default record should not be returned. I cannot update the data so will need to alter the sql query for this.
Another way (you would get an empty initial rowset returned);
SELECT * FROM MyTable WHERE Id = 2
IF (##ROWCOUNT = 0)
SELECT ...
SELECT TOP 1 * FROM (
SELECT ID,1 as Flag FROM MyTable WHERE Id = 2
UNION ALL
SELECT 1,2
) qry
ORDER BY qry.Flag ASC
You can have a look to this post. It is similar to what you are asking
Return a value if no rows are found SQL
I hope that it can guide you to the correct path.
if not exists (SELECT top 1 * FROM mytable WHERE id = 2)
select * from mytable where id= 'whatever_the_default_id_is'
else
select * from mytable where id = 2
If you have to return whole rows of data (and not just a single column) and you have to create a single SQL query then do this:
Left join actual table to defaults single-row table
select
coalesce(a.col1, d.col1) as col1,
coalesce(a.col2, d.col2) as col2,
...
from (
-- your defaults record
select
default1 as col1,
default2 as col2,
...) as d
left join actual as a
on ((1 = 1) /* or any actual table "where" conditions */)
The query need to return the same number of fields, so you shouldn't do a SELECT * FROM but a SELECT value FROM if you want to return a default value.
With that in mind
SELECT value FROM MyTable WHERE Id = 2
UNION
SELECT CASE (SELECT count(*) FROM MyTable WHERE Id = 2)
WHEN 0 THEN 'defaultvalue'
END