DB Query. How to capture 'mixed' status on a single row - sql

I have a database query ...
select foo, bar, status
from mytable
where bar in (bar1, bar2, bar3);
The status is a status associated with the pair foo-bar. The GUI is going to display 1 row for every foo, and should display a checked checkbox if, for that foo, the statuses of bar1, bar2 and bar3 are all 1, and an unchecked checkbox if they are all zero. If, again for a given foo, different bars have a different status, I am required to display some other token (a question mark, say).
My knowledge of SQL isn't sufficient for this task. Can this be done in SQL? It's in Oracle, if that makes a difference. I'm thinking I may have to suck it into Perl and check for the condition there, but I'm not happy with that idea.

In T-SQL I'd do this:
create table mytable (foo nvarchar(128), bar nvarchar(128), status int)
go
select foo, (MAX(status) + MIN(status)) as status
from mytable
group by foo
Then, in the client app, the resulting status value will be 0 if all are unchecked, 1 if only some are checked (mixed), and 2 if all are checked. This relies on status only ever being 0 or 1.
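A minimal sketch of the MAX + MIN idea, run against an in-memory SQLite database rather than T-SQL or Oracle (that swap is an assumption made only so the example is self-contained). The sample rows and the LABELS mapping are illustrative, not from the original schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (foo TEXT, bar TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("foo1", "bar1", 0), ("foo1", "bar2", 0), ("foo1", "bar3", 0),
        ("foo2", "bar1", 0), ("foo2", "bar2", 1), ("foo2", "bar3", 0),
        ("foo3", "bar1", 1), ("foo3", "bar2", 1), ("foo3", "bar3", 1),
    ],
)

# MAX + MIN is 0 when every status is 0, 2 when every status is 1,
# and 1 when the group contains both values.
rows = conn.execute(
    """
    SELECT foo, MAX(status) + MIN(status) AS status
    FROM mytable
    WHERE bar IN ('bar1', 'bar2', 'bar3')
    GROUP BY foo
    ORDER BY foo
    """
).fetchall()

# Hypothetical client-side mapping from the numeric code to a display token.
LABELS = {0: "unchecked", 1: "mixed", 2: "checked"}
display = {foo: LABELS[code] for foo, code in rows}
print(display)
```

The query itself is plain standard SQL, so it should carry over to Oracle unchanged apart from the table setup.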

With a CTE to supply sample data, different combinations of zeros and ones in the row statuses give different output values:
with tmp_tab as (
select 'foo1' as foo, 'bar1' as bar, 0 as status from dual
union
select 'foo1' as foo, 'bar2' as bar, 0 as status from dual
union
select 'foo1' as foo, 'bar3' as bar, 0 as status from dual
union
select 'foo2' as foo, 'bar1' as bar, 0 as status from dual
union
select 'foo2' as foo, 'bar2' as bar, 1 as status from dual
union
select 'foo2' as foo, 'bar3' as bar, 0 as status from dual
union
select 'foo3' as foo, 'bar1' as bar, 1 as status from dual
union
select 'foo3' as foo, 'bar2' as bar, 1 as status from dual
union
select 'foo3' as foo, 'bar3' as bar, 1 as status from dual
)
select foo,
case
when sum(status) = 0 then 'Unchecked'
when sum(status) = count(bar) then 'Checked'
else 'Unknown'
end as status
from tmp_tab
where bar in ('bar1','bar2','bar3')
group by foo;
FOO STATUS
---- ---------
foo1 Unchecked
foo2 Unknown
foo3 Checked

Perhaps restructure your query to perform a UNION instead of the IN.
That way, you will have an explicit value (unique to each unioned SELECT statement) that tells you which value has been matched.

If the designer were smart, bar1, bar2, and bar3 should be numeric powers of 2, so one can apply bitwise operators to them - then it would be trivial to know which of the particular bars are set.
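One possible reading of the powers-of-2 suggestion, sketched with SQLite. The bar_bit column (values 1, 2, 4 standing in for bar1, bar2, bar3) is a hypothetical schema change, not something the original table has, and SUM only behaves like a bitwise OR here because each bit is assumed to occur at most once per foo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: bar_bit is 1, 2 or 4 instead of 'bar1', 'bar2', 'bar3'.
conn.execute("CREATE TABLE mytable (foo TEXT, bar_bit INTEGER, status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("foo1", 1, 0), ("foo1", 2, 0), ("foo1", 4, 0),
        ("foo2", 1, 0), ("foo2", 2, 1), ("foo2", 4, 0),
        ("foo3", 1, 1), ("foo3", 2, 1), ("foo3", 4, 1),
    ],
)

# Each set bar contributes its own bit, so the sum is a bitmask per foo.
masks = dict(
    conn.execute(
        "SELECT foo, SUM(bar_bit * status) AS mask FROM mytable GROUP BY foo"
    )
)

ALL_BARS = 1 | 2 | 4
for foo, mask in sorted(masks.items()):
    state = "unchecked" if mask == 0 else "checked" if mask == ALL_BARS else "mixed"
    # mask & 2 is non-zero exactly when the bar with bit value 2 is set.
    print(foo, mask, state, bool(mask & 2))
```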

I'm assuming bar1/2/3 can only be 0 or 1:
SELECT foo, bar,
CASE WHEN ( bar1 + bar2 + bar3 = 3 ) THEN 'checked'
WHEN ( bar1 + bar2 + bar3 = 0 ) THEN 'unchecked'
ELSE 'something else' END
FROM mytable
WHERE bar in (bar1, bar2, bar3);
Or am I missing something?
Edit: Looks like I originally misunderstood the problem. I think this will work.
SELECT foo,
DECODE( sum_status, 0, 'unchecked', 3, 'checked', 'something else' )
FROM ( SELECT foo, SUM( status ) AS sum_status
FROM mytable
WHERE bar in (bar1, bar2, bar3)
GROUP BY foo )
Again, this assumes status can only be 0 or 1, and that every foo has exactly three matching rows (a sum of 3 is what marks 'checked').
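To illustrate why the hard-coded 3 matters, here is a small sketch in SQLite (which has no DECODE, so CASE is used instead). The foo4 row set, which lacks a bar3 row, is a made-up example; comparing SUM(status) with COUNT(*) is one way to avoid depending on the number of rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (foo TEXT, bar TEXT, status INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("foo3", "bar1", 1), ("foo3", "bar2", 1), ("foo3", "bar3", 1),
        # Hypothetical foo that has no bar3 row at all.
        ("foo4", "bar1", 1), ("foo4", "bar2", 1),
    ],
)

rows = conn.execute(
    """
    SELECT foo,
           CASE SUM(status) WHEN 0 THEN 'unchecked'
                            WHEN 3 THEN 'checked'
                            ELSE 'something else' END AS hard_coded,
           CASE WHEN SUM(status) = 0 THEN 'unchecked'
                WHEN SUM(status) = COUNT(*) THEN 'checked'
                ELSE 'something else' END AS counted
    FROM mytable
    WHERE bar IN ('bar1', 'bar2', 'bar3')
    GROUP BY foo
    ORDER BY foo
    """
).fetchall()

# Map each foo to (result with hard-coded 3, result with COUNT(*)).
labels = {foo: (hard_coded, counted) for foo, hard_coded, counted in rows}
print(labels)
```

With only two rows, foo4 falls through to 'something else' under the hard-coded comparison even though all of its statuses are 1.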

Related

How to avoid alias shadowing in nested fields with non-unique names?

Given the following table:
I'd like to rename fred to freddy.
For this, I've written the following code:
WITH foo AS (
SELECT
1 corge,
STRUCT(
[STRUCT(
2 AS bar,
3 AS fred)
] AS qux,
4 AS plugh
) bar
)
SELECT
corge as corge,
(SELECT AS STRUCT ARRAY(
SELECT AS STRUCT
bar.qux.bar as bar,
bar.qux.fred as freddy
FROM
foo.bar.qux)
as qux)
as bar,
plugh as plugh
FROM
foo
But it results in the following error:
Cannot access field qux on a value with type INT64 at [17:17]
It seems like the inner bar is shadowing the outer bar. How can I avoid this and make it work?
How about avoiding all of those UNNESTs and array rebuilds, and instead simply forcing the new names, as in the example below:
WITH foo AS (
SELECT
1 corge,
STRUCT(
[STRUCT(
2 AS bar,
3 AS fred)
] AS qux,
4 AS plugh
) bar
), foo_with_new_names AS (
SELECT
-1 corge,
STRUCT(
[STRUCT(
2 AS bar,
3 AS freddy)
] AS qux,
4 AS plugh
) bar
)
select * from foo_with_new_names where false
union all select * from foo
with output
Try this one:
WITH foo AS (
SELECT
1 corge,
STRUCT(
[STRUCT(2 AS bar, 3 AS fred), (22, 32)] AS qux,
4 AS plugh
) bar
)
SELECT
corge as corge,
(SELECT AS STRUCT
ARRAY(SELECT STRUCT(bar, fred as freddy) FROM unnest(bar.qux)) AS qux,
bar.plugh) AS bar
FROM foo
Based on the very good answer given by Sergey Geron, here is a version that additionally preserves the order of the elements in the qux array:
WITH foo AS (
SELECT
1 corge,
STRUCT(
[STRUCT(
2 AS bar,
3 AS fred)
] AS qux,
4 AS plugh
) bar
)
SELECT
corge AS corge,
(SELECT AS STRUCT ARRAY(
SELECT STRUCT(
bar AS bar,
fred AS freddy)
FROM
UNNEST(bar.qux)
WITH OFFSET AS bar_qux_offset ORDER BY bar_qux_offset)
AS qux,
bar.plugh)
AS bar
FROM
foo

How to get count of matches in field of table for list of phrases from another table in bigquery?

Given an arbitrary list of phrases phrase1, phrase2*, ... phraseN (say these are in another table Phrase_Table), how would one get the count of matches for each phrase in a field F in a bigquery table?
Here, "*" means there must be some non-empty/non-blank string after the phrase.
Let's say you have a table with an ID field and two string fields, Field1 and Field2.
Output would look something like
id, CountOfPhrase1InField1, CountOfPhrase2InField1, CountOfPhrase1InField2, CountOfPhrase2InField2
or I guess instead of all of those output fields maybe there's a single json object field
id, [{"fieldName": Field1, "counts": {phrase1: m, phrase2: mm, ...}},
{"fieldName": Field2, "counts": {phrase1: m2, phrase2: mm2, ...}}, ...]
Thanks!
The example below is for BigQuery Standard SQL:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'foo1 foo foo40' str UNION ALL
SELECT 'test1 test test2 test'
), `project.dataset.keywords` AS (
SELECT 'foo' key UNION ALL
SELECT 'test'
)
SELECT str, ARRAY_AGG(STRUCT(key, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(str, CONCAT(key, r'[^\s]'))) as matches)) all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY str
with result
Row  str                    all_matches.key  all_matches.matches
1    foo1 foo foo40         foo              2
                            test             0
2    test1 test test2 test  foo              0
                            test             2
If you prefer the output as JSON, you can add TO_JSON_STRING(), as in the example below:
#standardSQL
WITH `project.dataset.table` AS (
SELECT 'foo1 foo foo40' str UNION ALL
SELECT 'test1 test test2 test'
), `project.dataset.keywords` AS (
SELECT 'foo' key UNION ALL
SELECT 'test'
)
SELECT str, TO_JSON_STRING(ARRAY_AGG(STRUCT(key, ARRAY_LENGTH(REGEXP_EXTRACT_ALL(str, CONCAT(key, r'[^\s]'))) as matches))) all_matches
FROM `project.dataset.table`
CROSS JOIN `project.dataset.keywords`
GROUP BY str
with output
Row  str                    all_matches
1    foo1 foo foo40         [{"key":"foo","matches":2},{"key":"test","matches":0}]
2    test1 test test2 test  [{"key":"foo","matches":0},{"key":"test","matches":2}]
There are endless ways of presenting output like the above; I hope you can adjust it to exactly what you need :o)
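The counting rule (the keyword immediately followed by one non-whitespace character) can be cross-checked outside BigQuery with a short Python sketch. The count_matches helper is hypothetical, and re.escape is an added assumption that keywords should be treated as literal text; the query above concatenates the key into the pattern unescaped.

```python
import re

rows = ["foo1 foo foo40", "test1 test test2 test"]
keywords = ["foo", "test"]


def count_matches(text, key):
    """Count occurrences of key directly followed by a non-whitespace character."""
    # Mirrors CONCAT(key, r'[^\s]'); re.escape treats the key as literal text.
    pattern = re.escape(key) + r"[^\s]"
    return len(re.findall(pattern, text))


# One dict of per-keyword counts for every input string.
all_matches = {
    text: {key: count_matches(text, key) for key in keywords} for text in rows
}
print(all_matches)
```

Like REGEXP_EXTRACT_ALL, re.findall returns non-overlapping matches, so a bare "foo" followed by a space or by the end of the string is not counted.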

How to make an equality test on an array

I am trying to aggregate values on an ID. I return them if they are all the same, but have to create another value 'C' if both are encountered.
CREATE TABLE foo (
fooid int,
foocomm text
);
INSERT INTO foo (fooid,foocomm)
VALUES (1,'A');
INSERT INTO foo (fooid,foocomm)
VALUES (1,'B');
INSERT INTO foo (fooid,foocomm)
VALUES (2,'A');
SELECT
CASE
WHEN array_remove(array_agg(foocomm),NULL) = {'A'} THEN 'A'
WHEN array_remove(array_agg(foocomm),NULL) = {'B'} THEN 'B'
WHEN array_remove(array_agg(foocomm),NULL) = {'A','B'} THEN 'C'
END AS BAR
FROM foo
GROUP BY fooid;
It should yield
fooid,foocomm
1, 'C'
2, 'A'
t=# SELECT fooid,
CASE
WHEN array_remove(array_agg(foocomm order by foocomm),NULL) = '{A}' THEN 'A'
WHEN array_remove(array_agg(foocomm order by foocomm),NULL) = '{B}' THEN 'B'
WHEN array_remove(array_agg(foocomm order by foocomm),NULL) = '{A,B}' THEN 'C'
END AS BAR
FROM foo
GROUP BY fooid;
fooid | bar
-------+-----
1 | C
2 | A
(2 rows)
Your query works; just fix the array text representation:
https://www.postgresql.org/docs/current/static/arrays.html#ARRAYS-INPUT
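As a side note, the same result can be reached without arrays by comparing MIN and MAX per group. This is a different technique from the array comparison above, sketched here in SQLite (which has no array type). It assumes foocomm only ever holds 'A' or 'B' and that every fooid has at least one non-NULL value; a group of only NULLs would fall through to 'C'.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (fooid INTEGER, foocomm TEXT)")
conn.executemany(
    "INSERT INTO foo (fooid, foocomm) VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A")],
)

# MIN = MAX means every non-NULL value in the group is the same;
# otherwise both 'A' and 'B' were seen and the group is reported as 'C'.
result = dict(
    conn.execute(
        """
        SELECT fooid,
               CASE WHEN MIN(foocomm) = MAX(foocomm) THEN MIN(foocomm)
                    ELSE 'C' END AS bar
        FROM foo
        GROUP BY fooid
        """
    )
)
print(result)
```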

filtering out duplicates in rows based on columns

Say I have 2 columns:
Fruit Condition
apple unripe
banana ripe
apple ripe
banana moldy
peach moldy
peach ripe
apple ripe
pineapple soldout
and I only want to know which fruit are either ripe or unripe and never moldy or sold out (only apple).
Select fruit
from example
where (condition <> 'moldy' or condition <> 'soldout')
and (condition = 'ripe' or condition = 'unripe')
group by fruit
is not working
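You are using OR between two negated comparisons, so the condition is always true: a value that equals 'moldy' is still not equal to 'soldout', and vice versa. This is the wrong approach.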
Use:
where not (condition = 'moldy' or condition = 'soldout')
or use
where (condition <> 'moldy' and condition <> 'soldout')
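A quick way to see the difference is to count the rows each predicate keeps. This sketch uses an in-memory SQLite table loaded with the sample data from the question; the table and column names follow the question, and SQLite is only a stand-in for the real database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (fruit TEXT, condition TEXT)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?)",
    [
        ("apple", "unripe"), ("banana", "ripe"), ("apple", "ripe"),
        ("banana", "moldy"), ("peach", "moldy"), ("peach", "ripe"),
        ("apple", "ripe"), ("pineapple", "soldout"),
    ],
)


def count_where(predicate):
    """Return how many rows of example satisfy the given WHERE predicate."""
    return conn.execute(
        f"SELECT COUNT(*) FROM example WHERE {predicate}"
    ).fetchone()[0]


# OR of two inequalities is true for every non-NULL value, so nothing is filtered.
or_count = count_where("condition <> 'moldy' OR condition <> 'soldout'")
# The two corrected forms are equivalent and drop the moldy/soldout rows.
and_count = count_where("condition <> 'moldy' AND condition <> 'soldout'")
not_count = count_where("NOT (condition = 'moldy' OR condition = 'soldout')")
print(or_count, and_count, not_count)
```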
Then, I assume you want the fruits that are ONLY ripe or unripe.
select distinct Fruit
from Example E1
where Condition in ('ripe','unripe')
and not exists
(
select E2.Fruit
from Example E2
where E1.Fruit = E2.Fruit
and E2.Condition in ('moldy','soldout')
)
How about this:
with example as
( select 'apple' as fruit, 'unripe' as condition from dual union all
select 'banana', 'ripe' from dual union all
select 'apple', 'ripe' from dual union all
select 'banana', 'moldy' from dual union all
select 'peach', 'moldy' from dual union all
select 'peach', 'ripe' from dual union all
select 'apple', 'ripe' from dual union all
select 'pineapple', 'soldout' from dual )
select fruit from example
where condition in ('ripe','unripe')
minus
select fruit from example
where condition in ('moldy','soldout');
You can use the HAVING clause with a CASE expression for this purpose:
SELECT t.fruit FROM YourTable t
GROUP BY t.fruit
HAVING COUNT(CASE WHEN t.condition in ('ripe','unripe') THEN 1 END) > 0
-- makes sure at least 1 attribute of ripe/unripe exists
AND COUNT(CASE WHEN t.condition in ('soldout','moldy') THEN 1 END) = 0
-- makes sure no attribute of soldout/moldy exists
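For a quick check, the HAVING query can be run against the sample data from the question. This is a sketch using in-memory SQLite, with the table named example as in the question rather than YourTable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE example (fruit TEXT, condition TEXT)")
conn.executemany(
    "INSERT INTO example VALUES (?, ?)",
    [
        ("apple", "unripe"), ("banana", "ripe"), ("apple", "ripe"),
        ("banana", "moldy"), ("peach", "moldy"), ("peach", "ripe"),
        ("apple", "ripe"), ("pineapple", "soldout"),
    ],
)

# COUNT ignores the NULLs produced when the CASE has no matching branch,
# so each COUNT is the number of rows in the group with those conditions.
fruits = [
    row[0]
    for row in conn.execute(
        """
        SELECT t.fruit
        FROM example t
        GROUP BY t.fruit
        HAVING COUNT(CASE WHEN t.condition IN ('ripe', 'unripe') THEN 1 END) > 0
           AND COUNT(CASE WHEN t.condition IN ('soldout', 'moldy') THEN 1 END) = 0
        ORDER BY t.fruit
        """
    )
]
print(fruits)
```

Only apple survives: banana and peach each have a moldy row, and pineapple has no ripe or unripe row at all.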
Try this:
select distinct fruit from example where fruit not in
(
select fruit from example
where condition in ('soldout', 'moldy')
);

If I use an alias in a SELECT clause, how do I refer back to that alias?

I want to do something like this:
SELECT round(100*(col_a - col_b)/col_a, 1) as Foo, Foo - col_c as Bar
FROM my_table
WHERE...;
However, I get an error saying Foo is unknown. Since Foo is derived from some calculations on a bunch of other columns, I don't want to repeat the formula again for Bar. Any work-arounds?
SELECT Foo, Foo - col_c as Bar
from (
SELECT round(100*(col_a - col_b)/col_a, 1) as Foo, col_c
FROM my_table
WHERE...
) t;
I would usually handle this with a sub-query:
SELECT Foo, Foo - col_c as Bar
FROM (
SELECT round(100*(col_a - col_b)/col_a, 1) as Foo, col_c
FROM my_table
WHERE ...
)
WHERE ...
If you've got SQL Server, a CTE achieves much the same thing.
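A minimal sketch of the CTE variant, run in SQLite (which also supports WITH) rather than SQL Server. The column values and the CTE name t are made up for illustration, and the elided WHERE clause is simply left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (col_a REAL, col_b REAL, col_c REAL)")
# Illustrative values only: 100 * (8 - 5) / 8 = 37.5
conn.execute("INSERT INTO my_table VALUES (8.0, 5.0, 7.5)")

# Foo is computed once inside the CTE; the outer query can then refer to it by name.
row = conn.execute(
    """
    WITH t AS (
        SELECT round(100 * (col_a - col_b) / col_a, 1) AS Foo, col_c
        FROM my_table
    )
    SELECT Foo, Foo - col_c AS Bar
    FROM t
    """
).fetchone()
print(row)
```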