Using a case statement as an if statement - google-bigquery

I am attempting to create an IF statement in BigQuery. I have built a concept that will work but it does not select the data from a table, I can only get it to display 1 or 0
Example:
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN
(Select * from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest`) -- This Does not
work Scalar subquery cannot have more than one column unless using SELECT AS
STRUCT to build STRUCT values at [16:4] END
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN 1 --- This does work
Else
0
END
How can I Get this query to return results from an existing table?

You question is still a little generic, so my answer same as well - and just mimic your use case at extend I can reverse engineer it from your comments
So, in below code - project.dataset.yourtable mimics your table ; whereas
project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered mimic your respective views
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'bbb' cols, 'latest' filter
), `project.dataset.yourtable_Prior_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'prior'
), `project.dataset.yourtable_Latest_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'latest'
), check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
the result is
Row cols filter
1 aaa prior
2 bbb latest
if you changed your table to
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'aaa' cols, 'latest' filter
) ......
the result will be
Row cols filter
Query returned zero records.
I hope this gives you right direction
Added more explanations:
I can be wrong - but per your question - it looks like you have one table project.dataset.yourtable and two views project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered which present state of your table prior and after some event
So, first three CTE in the answer above just mimic those table and views which you described in your question.
They are here so you can see concept and can play with it without any extra work before adjusting this to your real use-case.
For your real use-case you should omit them and use your real table and views names and whatever columns the have.
So the query for you to play with is:
#standardSQL
WITH check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
It should be a very simple IF statement in any language.
Unfortunately NO! it cannot be done with just simple IF and if you see it fit you can submit a feature request to BigQuery team for whatever you think makes sense

Related

Stacking my conditions in a CASE statement it's not returning all cases for each member

SELECT DISTINCT
Member_ID,
CASE
WHEN a.ASTHMA_MBR = 1 THEN 'ASTHMA'
WHEN a.COPD_MBR = 1 THEN 'COPD'
WHEN a.HYPERTENSION_MBR = 1 THEN 'HYPERTENSION'
END AS DX_FLAG
So a member may have more than one, but my statement is only returning one of them.
I'm using Teradata and trying to convert multiple columns of boolean data into one column. The statement is only returning one condition when members may have 2 or more. I tried using Select instead of Select Distinct and it made no difference.
This is a kind of UNPIVOT:
with base_data as
( -- select the columns you want to unpivot
select
member_id
,date_col
-- the aliases will be the final column value
,ASTHMA_MBR AS ASTHMA
,COPD_MBR AS COPD
,HYPERTENSION_MBR AS HYPERTENSION
from your_table
)
,unpvt as
(
select member_id, date_col, x, DX_FLAG
from base_data
-- now unpivot those columns into rows
UNPIVOT(x FOR DX_FLAG IN (ASTHMA, COPD, HYPERTENSION)
) dt
)
select member_id, DX_FLAG, date_col
from unpvt
-- only show rows where the condition is true
where x = 1

BigQuery use the where clause to filter on a column that not always exists in the table

I need to create some kind of a uniform query for multiple tables. Some tables contain a certain column with a type. If this is the case, I need to apply filtering to it. I don't know how to do this.
I have for example two tables
table_customer_1
CustomerId, CustomerType
1, 1
2, 1
3, 2
Table_customer_2
Customerid
4
5
6
The query needs to be something like the one below and should work for both tables (the table name wil be replaced by the customer that uses the query):
With input1 as(
SELECT
(CASE WHEN exists(customerType) THEN customerType ELSE "0" END) as customerType, *
FROM table_customer_1)
SELECT * from input1
WHERE customerType != 2
Below is for BigQuery Standard SQL
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE SAFE_CAST(IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') AS INT64) != 2
or as a simplification you can ignore casting to INT64 and use comparison to STRING
#standardSQL
SELECT *
FROM `project.dataset.table` t
WHERE IFNULL(JSON_EXTRACT_SCALAR(TO_JSON_STRING(t), '$.CustomerType'), '0') != '2'
above will work for whatever table you put instead of project.dataset.table: either project.dataset.table_customer_1 or project.dataset.table_customer_2 - so quite generic I think
I can think of no good reason for doing this. However, it is possible by playing with the scoping rules for subqueries:
SELECT t.*
FROM (SELECT t.*,
(SELECT customerType -- will choose from tt if available, otherwise x
FROM table_customer_1 tt
WHERE tt.Customerid = t.Customerid
) as customerType
FROM (SELECT t.* EXCEPT (Customerid)
FROM table_customer_1 t
) t CROSS JOIN
(SELECT 0 as customerType) x
) t
WHERE customerType <> 2

MS SQL does not return the expected top row when ordering by DIFFERENCE()

I have noticed strange behaviour in some SQL code used for address matching at the company I work for & have created some test SQL to illustrate the issue.
; WITH Temp (Id, Diff) AS (
SELECT 9218, 0
UNION
SELECT 9219, 0
UNION
SELECT 9220, 0
)
SELECT TOP 1 * FROM Temp ORDER BY Diff DESC
Returns 9218 but
; WITH Temp (Id, Name) AS (
SELECT 9218, 'Sonnedal'
UNION
SELECT 9219, 'Lammermoor'
UNION
SELECT 9220, 'Honeydew'
)
SELECT TOP 1 *, DIFFERENCE(Name, '') FROM Temp ORDER BY DIFFERENCE(Name, '') DESC
returns 9219 even though the Difference() is 0 for all records as you can see here:
; WITH Temp (Id, Name) AS (
SELECT 9218, 'Sonnedal'
UNION
SELECT 9219, 'Lammermoor'
UNION
SELECT 9220, 'Honeydew'
)
SELECT *, DIFFERENCE(Name, '') FROM Temp ORDER BY DIFFERENCE(Name, '') DESC
which returns
9218 Sonnedal 0
9219 Lammermoor 0
9220 Honeydew 0
Does anyone know why this happens? I am writing C# to replace existing SQL & need to return the same results so I can test that my code produces the same results. But I can't see why the actual SQL used returns 9219 rather than 9218 & it doesn't seem to make sense. It seems it's down to the Difference() function but it returns 0 for all the record in question.
When you call:
SELECT TOP 1 *, DIFFERENCE(Name, '')
FROM Temp l
ORDER BY DIFFERENCE(Name, '') DESC
All three records have a DIFFERENCE value of zero, and hence SQL Server is free to choose from any of the three records for ordering. That is to say, there is no guarantee which order you will get. The same is true for your second query. Actually, it is possible that the ordering for the same query could even change over time. In practice, if you expect a certain ordering, you should provide exact logic for it, e.g.
SELECT TOP 1 *
FROM Temp
ORDER BY Id;

Combining Columns from different tables

I've write a SQL code to combine several columns from different tables.
SELECT *
FROM
(
SELECT PD_BARCODE
FROM docsadm.PD_BARCODE
WHERE SYSTEM_ID = 11660081
) t,
(
SELECT A_JAHRE
FROM docsadm.A_PD_DATENSCHUTZ
WHERE system_ID = 2066
) t2,
(
SELECT PD_PART_NAME
FROM docsadm.PD_FILE_PART
WHERE system_id = 11660082
) t3;
code works fine but if one of my where clause is not found in a table,the result is null even the other columns have value. How you can solve this problem?
It looks like you are doing a cross join between the three subquery tables. This would probably only yield output which makes sense if each subquery return a single value. I might suggest instead that you use a UNION ALL here:
SELECT ISNULL(PD_BARCODE, 'NA' AS value
FROM docsadm.PD_BARCODE WHERE SYSTEM_ID = 11660081
UNION ALL
SELECT ISNULL(A_JAHRE, 'NA')
FROM docsadm.A_PD_DATENSCHUTZ WHERE system_ID = 2066
UNION ALL
SELECT ISULL(PD_PART_NAME, 'NA')
FROM docsadm.PD_FILE_PART WHERE system_id = 11660082
The above union query might require a slight modification if the three columns being select don't all have the same type (which I assume to be varchar in my query).
If you really need these three points of data as separate columns, then you can just include the three subqueries as items in an outer select:
SELECT
(SELECT ISNULL(PD_BARCODE, 'NA')
FROM docsadm.PD_BARCODE WHERE SYSTEM_ID = 11660081) AS PD_BARCODE,
(SELECT ISNULL(A_JAHRE, 'NA')
FROM docsadm.A_PD_DATENSCHUTZ WHERE system_ID = 2066) AS A_JAHRE,
(SELECT ISNULL(PD_PART_NAME, 'NA')
FROM docsadm.PD_FILE_PART WHERE system_id = 11660082) AS PD_PART_NAME;
Note that as the above is written we simply including the subqueries as values in the select statement. But as you wrote your original query, you are joining the subqueries as separate tables.
Here is Query.
You can replace the word 'empty' by your required word or value
SELECT isnull(
(
SELECT PD_BARCODE
FROM docsadm.PD_BARCODE
WHERE SYSTEM_ID = 11660081
), 'Empty') AS PD_BARCODE,
isnull(
(
SELECT A_JAHRE
FROM docsadm.A_PD_DATENSCHUTZ
WHERE system_ID = 2066
), 'Empty') AS A_JAHRE,
isnull(
(
SELECT PD_PART_NAME
FROM docsadm.PD_FILE_PART
WHERE system_id = 11660082
), 'Empty') AS PD_PART_NAME;
This is a special case since you would be getting only one value per query of yours which you are trying to put up as column. In this case, you can use SQL PIVOT clause with UNION ALL of your queries like shown below. MIN can be used for aggregation in this case.
This would mean, you can get your data row-wise as you like, even for multiple different fields and then pivot it into columns in one go.
SELECT * FROM
(
SELECT 'PD_BARCODE' as KeyItem, PD_BARCODE Name from docsadm.PD_BARCODE
where SYSTEM_ID=11660081
UNION ALL
SELECT 'A_JAHRE', A_JAHRE from docsadm.A_PD_DATENSCHUTZ
where system_ID=2066
UNION ALL
select 'PD_PART_NAME', PD_PART_NAME from docsadm.PD_FILE_PART
where system_id=11660082
) VTABLE
PIVOT(MIN(Name)
FOR KeyItem IN ([PD_BARCODE], [A_JAHRE], [PD_PART_NAME])) as pivotData;

Select where record does not exists

I am trying out my hands on oracle 11g. I have a requirement such that I want to fetch those id from list which does not exists in table.
For example:
SELECT * FROM STOCK
where item_id in ('1','2'); // Return those records where result is null
I mean if item_id '1' is not present in db then the query should return me 1.
How can I achieve this?
You need to store the values in some sort of "table". Then you can use left join or not exists or something similar:
with ids as (
select 1 as id from dual union all
select 2 from dual
)
select ids.id
from ids
where not exists (select 1 from stock s where s.item_id = ids.id);
You can use a LEFT JOIN to an in-line table that contains the values to be searched:
SELECT t1.val
FROM (
SELECT '1' val UNION ALL SELECT '2'
) t1
LEFT JOIN STOCK t2 ON t1.val = t2.item_id
WHERE t2.item_id IS NULL
First create the list of possible IDs (e.g. 0 to 99 in below query). You can use a recursive cte for this. Then select these IDs and remove the IDs already present in the table from the result:
with possible_ids(id) as
(
select 0 as id from dual
union all
select id + 1 as id from possible_ids where id < 99
)
select id from possible_ids
minus
select item_id from stock;
A primary concern of the OP seems to be a terse notation of the query, notably the set of values to test for. The straightforwwrd recommendation would be to retrieve these values by another query or to generate them as a union of queries from the dual table (see the other answers for this).
The following alternative solution allows for a verbatim specification of the test values under the following conditions:
There is a character that does not occur in any of the test values provided ( in the example that will be - )
The number of values to test stays well below 2000 (to be precise, the list of values plus separators must be written as a varchar2 literal, which imposes the length limit ). However, this should not be an actual concern - If the test involves lists of hundreds of ids, these lists should definitely be retrieved froma table/view.
Caveat
Whether this method is worth the hassle ( not to mention potential performance impacts ) is questionable, imho.
Solution
The test values will be provided as a single varchar2 literal with - separating the values which is as terse as the specification as a list argument to the IN operator. The string starts and ends with -.
'-1-2-3-156-489-4654648-'
The number of items is computed as follows:
select cond, regexp_count ( cond, '[-]' ) - 1 cnt_items from (select '-1-2-3-156-489-4654648-' cond from dual)
A list of integers up to the number of items starting with 1 can be generated using the LEVEL pseudocolumn from hierarchical queries:
select level from dual connect by level < 42;
The n-th integer from that list will serve to extract the n-th value from the string (exemplified for the 4th value) :
select substr ( cond, instr(cond,'-', 1, 4 )+1, instr(cond,'-', 1, 4+1 ) - instr(cond,'-', 1, 4 ) - 1 ) si from (select cond, regexp_count ( cond, '[-]' ) - 1 cnt_items from (select '-1-2-3-156-489-4654648-' cond from dual) );
The non-existent stock ids are generated by subtracting the set of stock ids from the set of values. Putting it all together:
select substr ( cond, instr(cond,'-',1,level )+1, instr(cond,'-',1,level+1 ) - instr(cond,'-',1,level ) - 1 ) si
from (
select cond
, regexp_count ( cond, '[-]' ) - 1 cnt_items
from (
select '-1-2-3-156-489-4654648-' cond from dual
)
)
connect by level <= cnt_items + 1
minus
select item_id from stock
;