Binary Case when in Union Query - sql

I am stuck building this query.
I would like to stack two columns from the same table into a single long column (I use a UNION statement), and then produce a new variable that tells me whether each number (organism_id, the stack of column1 and column2) comes from column1 or from column2. So far I have been trying the approach below, but the query fails for a reason I do not understand:
SELECT u.organism_id, case when u.organism_id IN cpl.column1 then 1
else 0
end as is_column1
FROM
(select column1 as organism_id
from table1
UNION
select column2
from table1) as u,
table1 as cpl;
Does someone have a clue on how to solve this problem?
Thanks in advance!

In general, and if I understand you correctly, you can add a source column to each SELECT before unioning them. I'd also suggest UNION ALL, because plain UNION silently removes duplicate rows:
SELECT
*
FROM
(
SELECT
'Column1' AS Source,
Column1
FROM
Table1
UNION ALL
SELECT
'Column2' AS Source,
Column2
FROM
Table1
) u
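To connect this back to the 0/1 flag asked for in the question, here is a minimal sketch using Python's built-in sqlite3 module. The table contents are made up for illustration, and the flag name is_column1 is taken from the question. Other engines should accept the same SQL, but that is an assumption rather than something tested here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (column1 INTEGER, column2 INTEGER)")
# Made-up sample rows, for illustration only.
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(10, 20), (11, 21)])

query = """
SELECT column1 AS organism_id, 1 AS is_column1 FROM table1
UNION ALL
SELECT column2, 0 FROM table1
ORDER BY 2 DESC, 1
"""
# Each row is (organism_id, is_column1); the flag is a constant per branch.
rows = conn.execute(query).fetchall()
print(rows)
```

Because each branch of the UNION ALL emits its own constant, no CASE expression or second join against table1 is needed.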

Exclude Redundant Wildcards (SQL)

This is not a difficult problem to solve, but it doesn't seem to have been asked on the site so I thought I should post it
Sample Data:
wildcard
%abc%
abc
something abc%
ijk
this is mno
%mno%
Expected Output:
wildcard
%abc%
ijk
%mno%
Problem Statement:
List all wildcards except those that are redundant. For example, anything that matches "this is mno" also matches %mno%, which makes the former redundant.
I found it easiest to solve with NOT EXISTS:
with cte(wildcard) as
(select '%abc%' union all
select 'abc' union all
select 'something abc%' union all
select 'ijk' union all
select 'this is mno' union all
select '%mno%')
select *
from cte a
where not exists(select *
from cte b
where a.wildcard <> b.wildcard and
a.wildcard like b.wildcard)
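As a quick check, the same query can be run through Python's built-in sqlite3 module. This is only a sketch: SQLite stands in for whatever engine the original was written for, and its LIKE is case-insensitive for ASCII by default, which does not matter for this sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

query = """
WITH cte(wildcard) AS (
    SELECT '%abc%' UNION ALL
    SELECT 'abc' UNION ALL
    SELECT 'something abc%' UNION ALL
    SELECT 'ijk' UNION ALL
    SELECT 'this is mno' UNION ALL
    SELECT '%mno%'
)
SELECT a.wildcard
FROM cte a
WHERE NOT EXISTS (
    -- Drop a.wildcard if some *other* pattern already matches it.
    SELECT 1
    FROM cte b
    WHERE a.wildcard <> b.wildcard
      AND a.wildcard LIKE b.wildcard
)
ORDER BY a.wildcard
"""
kept = [row[0] for row in conn.execute(query)]
print(kept)
```

The a.wildcard <> b.wildcard condition matters: without it every pattern would match itself and nothing would be returned.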

Count for a list of items with zero for does not exist

If I have a table t1 with:
my_col
------
foo
foo
bar
And I have a list with foo and hello
How can I get:
my_col | count
-------|-------
foo | 2
hello | 0
If I just do
SELECT my_col, COUNT(*)
FROM t1
WHERE my_col in ('foo', 'hello')
GROUP BY my_col
I get
my_col | count
-------|------
foo | 2
without any value for hello.
I'm specifically wanting this to be in reference to a list of items because this will be called in a program where the list is a variable.
Ideally you should maintain a separate table with all the possible column values which you want to appear in your report. In the absence of that, we can try using a CTE here:
WITH cte AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'bar' UNION ALL
SELECT 'hello'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM cte a
LEFT JOIN t1 b
ON a.my_col = b.my_col
WHERE
a.my_col IN ('foo', 'hello')
GROUP BY
a.my_col;
Here's yet another way, using values:
select
t2.my_col, count (t1.my_col)
from
(values ('foo'), ('hello')) as t2 (my_col)
left join t1 on t1.my_col = t2.my_col
group by
t2.my_col
Note that count (t1.my_col) returns 0 for "hello" since nulls are not counted. count (*), by contrast, would have returned 1 for "hello" because it counts the row itself.
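Since the question mentions that the list arrives as a program variable, here is a hedged sketch of how the VALUES approach might be driven from code, using Python's built-in sqlite3 module. The helper name count_for_items and the CTE name wanted are made up for illustration. SQLite does not accept the "AS t2 (my_col)" alias form, so the VALUES list is wrapped in a CTE instead. The sketch assumes a non-empty list without duplicates.

```python
import sqlite3

def count_for_items(conn, items):
    """Return (item, count) pairs, with 0 for items that are missing from t1."""
    # One "(?)" row per item; the values themselves are bound as parameters.
    values_rows = ", ".join("(?)" for _ in items)
    query = f"""
        WITH wanted(my_col) AS (VALUES {values_rows})
        SELECT w.my_col, COUNT(t1.my_col)
        FROM wanted w
        LEFT JOIN t1 ON t1.my_col = w.my_col
        GROUP BY w.my_col
        ORDER BY w.my_col
    """
    return conn.execute(query, list(items)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (my_col TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?)", [("foo",), ("foo",), ("bar",)])

counts = count_for_items(conn, ["foo", "hello"])
print(counts)
```

Only the placeholders are interpolated into the SQL text; the list values are passed as bound parameters, which avoids manual quoting.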
You can turn your list into a set of rows and use a LEFT JOIN, like this:
SELECT x.val, COUNT(t.my_col)
FROM
(SELECT 'foo' val UNION SELECT 'hello') x
LEFT JOIN t1 t ON t.my_col = x.val
GROUP BY x.val
Postgres solution:
One way is to place the 'list' into an ARRAY, and then convert the ARRAY into a column using unnest. Then perform a left join on that column with the other table and perform a count.
WITH t1 AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'foo' UNION ALL
SELECT 'bar'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM unnest(ARRAY['foo', 'hello']) a (my_col)
LEFT JOIN t1 b
ON a.my_col = b.my_col
GROUP BY
a.my_col;
The issue I had with the other answers is that, while they helped me get to the solution, they did not provide a solution where the items of interest were in a single list (which isn't an actual SQL term, so the fault is on me).
However, my real use case is to perform a native query using Java and Hibernate, and unfortunately the above does not work because the typing cannot be determined. Instead, I converted my list into a single string and used string_to_array in place of the ARRAY constructor.
So the solution that worked best for my use case is below. At this point the other answers would be just as correct, since I now have to do manual string manipulation, but I'm leaving this here for the sake of posterity.
WITH t1 AS (
SELECT 'foo' AS my_col UNION ALL
SELECT 'foo' UNION ALL
SELECT 'bar'
)
SELECT
a.my_col,
COUNT(b.my_col) AS count
FROM unnest(string_to_array('foo,hello', ',')) a (my_col) -- no space after the comma, otherwise the second item would be ' hello'
LEFT JOIN t1 b
ON a.my_col = b.my_col
GROUP BY
a.my_col;

Using a case statement as an if statement

I am attempting to create an IF statement in BigQuery. I have built a concept that works, but it does not select the data from a table; I can only get it to display 1 or 0.
Example:
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN
(Select * from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest`) -- This does not work: Scalar subquery cannot have more than one column unless using SELECT AS STRUCT to build STRUCT values at [16:4]
END
SELECT --AS STRUCT
CASE
WHEN (
Select Count(1) FROM ( -- If the records are the same, then return = 0, if the records are not the same then > 1
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Prior_Filtered`
Except Distinct
Select Distinct ESCO, SOURCE, LDCTEXT, STATUS,DDR_DATE, TempF, HeatingDegreeDays, DecaTherms
from `gas-ddr.gas_ddr_outbound.LexingtonDDRsOutbound_onchange_Latest_Filtered`
)
)= 0
THEN 1 --- This does work
Else
0
END
How can I get this query to return results from an existing table?
Your question is still a little generic, so my answer is generic as well. It mimics your use case to the extent that I can reverse engineer it from your comments.
In the code below, project.dataset.yourtable mimics your table, whereas project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered mimic your respective views.
#standardSQL
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'bbb' cols, 'latest' filter
), `project.dataset.yourtable_Prior_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'prior'
), `project.dataset.yourtable_Latest_Filtered` AS (
SELECT cols FROM `project.dataset.yourtable` WHERE filter = 'latest'
), check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
The result is:
Row cols filter
1 aaa prior
2 bbb latest
If you change your table to
WITH `project.dataset.yourtable` AS (
SELECT 'aaa' cols, 'prior' filter UNION ALL
SELECT 'aaa' cols, 'latest' filter
) ......
the result will be:
Row cols filter
Query returned zero records.
I hope this points you in the right direction.
Added more explanation:
I may be wrong, but from your question it looks like you have one table, project.dataset.yourtable, and two views, project.dataset.yourtable_Prior_Filtered and project.dataset.yourtable_Latest_Filtered, which present the state of your table before and after some event.
So the first three CTEs in the answer above just mimic the table and views you described in your question.
They are there so you can see the concept and play with it without any extra work before adjusting it to your real use case.
For your real use case, you should omit them and use your real table and view names and whatever columns they have.
So the query for you to play with is:
#standardSQL
WITH check AS (
SELECT COUNT(1) > 0 changed FROM (
SELECT DISTINCT cols FROM `project.dataset.yourtable_Latest_Filtered`
EXCEPT DISTINCT
SELECT DISTINCT cols FROM `project.dataset.yourtable_Prior_Filtered`
)
)
SELECT t.* FROM `project.dataset.yourtable` t
CROSS JOIN check WHERE check.changed
In reply to the comment "It should be a very simple IF statement in any language":
Unfortunately, no. It cannot be done with just a simple IF. If you see fit, you can submit a feature request to the BigQuery team for whatever you think makes sense.
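The gating idea (a one-row check result cross joined to the table, so that either all rows or no rows come back) can be tried locally without BigQuery. The sketch below uses Python's built-in sqlite3 module as a stand-in, so it is not BigQuery syntax. The names chk, snapshot and rows_if_changed are assumptions made for this example: the CTE is called chk because CHECK is a keyword in SQLite, and the filter column is renamed snapshot to stay clear of keywords.

```python
import sqlite3

QUERY = """
WITH prior_filtered AS (
    SELECT cols FROM yourtable WHERE snapshot = 'prior'
), latest_filtered AS (
    SELECT cols FROM yourtable WHERE snapshot = 'latest'
), chk AS (
    -- Exactly one row: 1 if latest has rows that prior lacks, else 0.
    SELECT COUNT(*) > 0 AS changed FROM (
        SELECT cols FROM latest_filtered
        EXCEPT
        SELECT cols FROM prior_filtered
    )
)
SELECT t.cols, t.snapshot
FROM yourtable t
CROSS JOIN chk
WHERE chk.changed
ORDER BY t.cols, t.snapshot
"""

def rows_if_changed(table_rows):
    """Load the sample rows into a fresh in-memory table and run the gated query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (cols TEXT, snapshot TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?)", table_rows)
    return conn.execute(QUERY).fetchall()

changed = rows_if_changed([("aaa", "prior"), ("bbb", "latest")])
unchanged = rows_if_changed([("aaa", "prior"), ("aaa", "latest")])
print(changed, unchanged)
```

Because chk always yields exactly one row, the cross join never multiplies the table; the WHERE clause simply switches the whole result on or off.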

SQL - Query conditional from multiple tables

I am trying to find the moments in which column5 data is greater than column6. The following command works for me for results in one table
SELECT * FROM table1 WHERE column5 > column6;
But I would like to run the same condition, column5 > column6, against other tables as well, and still select all columns, because some of them provide additional relevant information and reference. For example, something like
SELECT * FROM table1, table2 WHERE column5 > column6;
but I get an error complaining about an ambiguous column name, column5. I then switched the command to something like
SELECT * FROM table1, table2 WHERE table1.column5 > table1.column6 AND table2.column5 > table2.column6;
The above command does not produce an error, but the query returns an empty result. I have also tried an INNER JOIN, as follows:
SELECT * FROM table1 INNER JOIN table2 ON table1.seconds = table2.seconds AND table1.column5 > table1.column6 AND table2.column5 > table2.column6;
This yields a result similar to the first command, but each row contains the matching table1 columns repeated twice, and the table2 results are not shown.
Is there another way to achieve this? Most of my attempts did not return the desired results. Again, I just want my query to return the rowid from multiple tables WHERE *.column5 > *.column6
.sql file of the database pasted on codeshare.io to provide sample-data to this problem
Put sample data here:
Table1:
Column5 | Column6
"139062.6" | "224340.8"
"115080" | "154723.6667"
"279718.5" | "202486.8333"
"106184" | "107184.8333"
"109483" | "110674"
"152253" | "257038.6667"
"159030.3333" | "151057"
"144092.5" | "190702.6667"
"154913.8333" | "229714"
"52166.83333" | "37816.16667"
"18257.5" | "18061.83333"
"8907" | "6606.666667"
Table2:
Column5 | Column6
"7544.6" | "10109"
"10165.16667" | "14526.83333"
"11574.16667" | "12070.66667"
"9400.833333" | "7819.333333"
"11421.5" | "9247.833333"
"11368.5" | "7201.833333"
"11925.83333" | "8166.833333"
"9108.833333" | "4928"
"8276.666667" | "9135.5"
"8650.5" | "9666.166667"
"14998.16667" | "8201.166667"
"16229.83333" | "10186"
Expected Result "WHERE column5 > column6" on table 1 and 2:
TABLE 1
Column5 | Column6
"279718.5" | "202486.8333"
"159030.3333" | "151057"
"52166.83333" | "37816.16667"
"18257.5" | "18061.83333"
"8907" | "6606.666667"
TABLE 2
Column5 | Column6
"9400.833333" | "7819.333333"
"11421.5" | "9247.833333"
"11368.5" | "7201.833333"
"11925.83333" | "8166.833333"
"9108.833333" | "4928"
"14998.16667" | "8201.166667"
"16229.83333" | "10186"
I think you are looking for the union operator (assuming your tables have the same number of columns):
SELECT * FROM table1 WHERE column5 > column6
UNION
SELECT * FROM table2 WHERE column5 > column6;
If you only want the rowids listed, SELECT rowid instead. Note that with this approach you cannot tell which table a rowid refers to, so whether it is suitable depends on how you want to use the result later.
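One way to keep track of which table a rowid belongs to is to add a constant label to each branch. The sketch below is an illustration using Python's built-in sqlite3 module; it loads only the first few rows of the sample data from the question, stored as REAL rather than quoted text, and the column aliases source and row_id are made up for this example. It uses UNION ALL so that identical rows from the two tables are not merged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
for name in ("table1", "table2"):
    conn.execute(f"CREATE TABLE {name} (column5 REAL, column6 REAL)")

# First rows of the question's sample data (a subset, for brevity).
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [
    (139062.6, 224340.8), (115080, 154723.6667), (279718.5, 202486.8333),
])
conn.executemany("INSERT INTO table2 VALUES (?, ?)", [
    (7544.6, 10109), (10165.16667, 14526.83333),
    (11574.16667, 12070.66667), (9400.833333, 7819.333333),
])

query = """
SELECT 'table1' AS source, rowid AS row_id, column5, column6
FROM table1 WHERE column5 > column6
UNION ALL
SELECT 'table2', rowid, column5, column6
FROM table2 WHERE column5 > column6
ORDER BY 1, 2
"""
matches = conn.execute(query).fetchall()
for source, row_id, c5, c6 in matches:
    print(source, row_id, c5, c6)
```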

Same query but different tables

I'm faced with a big query that is built in a string and executed with "OPEN pCursor FOR vQuery", and I'm trying to move the query out of the string variable into a proper "compilable" query.
The problem is that a different table is queried depending on a variable:
vQuery := 'SELECT ...';
IF pVar = 1 Then
vQuery := vQuery || ' FROM table1';
ELSE
vQuery := vQuery || ' FROM table2';
END IF;
vQuery := vQuery || ' WHERE ...';
The two tables have pretty much the same column names. Is there a way to write this as a single query, such as
OPEN Pcursorout FOR
SELECT ... FROM CASE WHEN pVar = 1 THEN table1 ELSE table2 END WHERE ...;
Or am I stuck with having two queries?
IF pVar = 1 Then
OPEN Pcursorout FOR SELECT ... FROM table1 WHERE ...;
ELSE
OPEN Pcursorout FOR SELECT ... FROM table2 WHERE ...;
END IF;
The select and where parts are large and exactly the same for both tables.
You could use a UNION and use your variable pVar to only include the results from one query in the result set.
SELECT t1.col1, t1.col2, ..., t1.col10
FROM table1 t1
WHERE pVar = 1 and ...
UNION
SELECT t2.col1, t2.col2, ..., t2.col10
FROM table2 t2
WHERE pVar <> 1 and ...
This isn't exactly what you asked about -- not being required to have duplicate lists of columns for the two select statements -- but I think it might capture your intent. It will require that the columns selected by both queries have the same datatype so there will be a (somewhat weak) constraint that the columns of both query results are the same. For example, you won't be able to add a new column to one query but not the other.
Perhaps use UNION / UNION ALL to combine both queries? The requirement for UNION / UNION ALL is that all SELECTs being combined return the same number of columns with compatible data types; the result takes its column names from the first SELECT.
So if you have
SELECT t.f1,
t.f2,
t.f3
FROM t
WHERE ...
and your other query is
SELECT q.f1,
q.f2,
q.f3
FROM q
WHERE ...
you can have both running as a single SQL statement with UNION:
SELECT t.f1,
t.f2,
t.f3
FROM t
WHERE ...
UNION
SELECT q.f1,
q.f2,
q.f3
FROM q
WHERE ...
Keep in mind that if you need to return columns that exist in one table but not in the other, you can still use UNION: return NULL for that column in the SELECT that lacks it, aliased to the column name used by the table that has it.
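A small sketch of the NULL padding idea, run through Python's built-in sqlite3 module as a stand-in for the actual database. The tables t and q and their contents are made up for illustration; here q is assumed to have no f3 column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (f1 INTEGER, f2 TEXT, f3 TEXT)")
conn.execute("CREATE TABLE q (f1 INTEGER, f2 TEXT)")  # q has no f3 column
conn.execute("INSERT INTO t VALUES (1, 'a', 'x')")
conn.execute("INSERT INTO q VALUES (2, 'b')")

query = """
SELECT t.f1, t.f2, t.f3 FROM t
UNION ALL
SELECT q.f1, q.f2, NULL AS f3 FROM q
ORDER BY 1
"""
# Rows from q carry None (SQL NULL) in the padded third column.
rows = conn.execute(query).fetchall()
print(rows)
```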
It's a bit of a kludge and you might need to look at the performance impact, but you could use an inline view that unions the two base tables, with a flag on each part that you then compare to your variable:
SELECT ...
FROM (
SELECT 1 as var, table1.*
FROM table1
UNION ALL
SELECT 2 as var, table2.*
FROM table2
) t
WHERE t.var = pVar
AND ...;
Using an inline view means you don't have to duplicate the main select-list or the where clause etc. If the tables have different columns then you can (and maybe should anyway) only select the columns in the inner queries that will be referenced in the outer select-list.
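To see the flag comparison in action, here is a minimal sketch using Python's built-in sqlite3 module rather than PL/SQL. The column names col1 and col2, the sample rows, the bind name p_var and the helper fetch_for are all made up for illustration; in the original setting pVar would be the PL/SQL variable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
conn.execute("CREATE TABLE table2 (col1 INTEGER, col2 TEXT)")
conn.execute("INSERT INTO table1 VALUES (1, 'from table1')")
conn.execute("INSERT INTO table2 VALUES (2, 'from table2')")

QUERY = """
SELECT t.col1, t.col2
FROM (
    SELECT 1 AS var, table1.* FROM table1
    UNION ALL
    SELECT 2 AS var, table2.* FROM table2
) t
WHERE t.var = :p_var
"""

def fetch_for(p_var):
    """Run the single static query; p_var selects which base table contributes."""
    return conn.execute(QUERY, {"p_var": p_var}).fetchall()

from_first = fetch_for(1)
from_second = fetch_for(2)
print(from_first, from_second)
```

The query text never changes; only the bound value does, which is what allows it to be compiled once instead of being assembled as a string.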