Count value across multiple columns - sql

I am looking to count the number of times each of a set of values occurs in a table. These values could occur in up to 10 different columns. I need to increment the count regardless of which column the value is in. I know how to count if they were all in the same column, but not when they span multiple columns.
Values can be added in any order. I have about a thousand
Cpt1 Cpt2 Cpt3 Cpt4 Cpt5
63047 63048 63048 NULL NULL
For this row, I'd expect this as the result:
63047 1
63048 2

You could use a UNION ALL query to treat them as one column:
SELECT col, COUNT(*)
FROM (SELECT col1 AS col FROM mytable
UNION ALL
SELECT col2 FROM mytable
UNION ALL
SELECT col3 FROM mytable
-- etc...
) t
GROUP BY col
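As a minimal sketch of the UNION ALL approach, the following runs it against the sample row from the question. It assumes SQLite (through Python's sqlite3 module) as a stand-in database, the column names cpt1 to cpt5 from the question, and a hypothetical helper named count_codes; the NULL filter is an addition so that empty slots are not counted.

```python
import sqlite3

def count_codes(rows):
    """Count how often each code appears across all five cpt columns."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (cpt1 INT, cpt2 INT, cpt3 INT, cpt4 INT, cpt5 INT)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT col, COUNT(*) AS cnt
        FROM (SELECT cpt1 AS col FROM mytable
              UNION ALL SELECT cpt2 FROM mytable
              UNION ALL SELECT cpt3 FROM mytable
              UNION ALL SELECT cpt4 FROM mytable
              UNION ALL SELECT cpt5 FROM mytable) t
        WHERE col IS NOT NULL   -- skip the empty slots
        GROUP BY col
        ORDER BY col
    """
    result = con.execute(query).fetchall()
    con.close()
    return result

counts = count_codes([(63047, 63048, 63048, None, None)])
print(counts)  # [(63047, 1), (63048, 2)]
```

Only the first branch needs the alias; the later branches of a UNION ALL take their column name from it.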

It's not entirely clear what your table exactly looks like, but I'm guessing that what you're looking for is:
SELECT row_count = COUNT(*),
row_count_with_given_value = SUM ( CASE WHEN field1 = 'myValue' THEN 1
WHEN field2 = 'myValue' THEN 1
WHEN field3 = 'myValue' THEN 1
WHEN field4 = 'myValue' THEN 1 ELSE 0 END)
FROM myTable
Assuming the fieldx columns are not NULL-able, you could write it like this too:
SELECT row_count = COUNT(*),
row_count_with_given_value = SUM ( CASE WHEN 'myValue' IN (field1, field2, field3, field4) THEN 1 ELSE 0 END)
FROM myTable

Something like this might work (after adapting to your value domain and data types):
create table t1
(i1 int,
i2 int,
i3 int);
insert into t1 values (1,0,0);
insert into t1 values (1,1,1);
insert into t1 values (1,0,0);
declare @i int = 0;
select @i = @i + i1 + i2 + i3 from t1;
print @i;
drop table t1;
Output is: 5

Many databases support lateral joins, of one type or another. These can be used to simplify this operation. Using the SQL Server/Oracle 12C syntax:
select v.cpt, count(*)
from t cross apply
(values (cpt1), (cpt2), . . .
) v(cpt)
where cpt is not null
group by v.cpt;

Related

Use result from case statement in another case statement in oracle

I have a view which contains a cast statement for one column. This cast statement contains a case statement. This statement works. The result of this statement is 1, 2, or 3.
From here, I need to use the result from the previous case statement (I used a WITH statement and it doesn't work) to determine the value of the column. A simple case statement that assigns yes, no or null to the above statement's value (1,2, or 3)
ANY help is appreciated. Thank you.
Example using pseudo-code:
CAST (
WITH case_output
AS(
SELECT
CASE
WHEN EXISTS
(select from table where blah blah)
THEN
(select column from that table)
ELSE
(select from some another table)
END
)
CASE
WHEN case_output = 1
THEN 'Yes'
WHEN case_output = 2
THEN 'No'
else
NULL
AS VARCHAR2 (10))
column_name,
.... [rest of query]
You're mixing up the query name and the column name of the WITH clause. For example, it's
WITH my_query AS (SELECT c1 AS my_column FROM t1)
SELECT my_column FROM my_query;
Secondly, you'll always need a FROM clause in Oracle's SQL. Use the dummy table DUAL as stand-in:
SELECT CASE WHEN ... THEN ... END AS my_column
FROM DUAL;
Minimal working example:
CREATE TABLE t1 (c1 INT);
CREATE TABLE t2 (c2 INT);
INSERT INTO t1 VALUES (1);
INSERT INTO t2 VALUES (2);
WITH case_query AS (
SELECT CASE WHEN EXISTS (SELECT * FROM t1 WHERE c1=100)
THEN (SELECT c1 FROM t1)
ELSE (SELECT c2 FROM t2)
END AS case_output
FROM dual)
SELECT CASE case_output
WHEN 1 THEN 'Yes'
WHEN 2 THEN 'No'
ELSE NULL
END second_case_output
FROM case_query;
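The minimal working example above can be tried without an Oracle instance. This sketch assumes SQLite through Python's sqlite3 module as a stand-in; SQLite does not need FROM DUAL, so that clause is dropped, and the variable names first_case and second_case are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE t1 (c1 INT);
    CREATE TABLE t2 (c2 INT);
    INSERT INTO t1 VALUES (1);
    INSERT INTO t2 VALUES (2);
""")

query = """
    WITH case_query AS (
        -- inner CASE: no row in t1 has c1 = 100, so the ELSE branch is taken
        SELECT CASE WHEN EXISTS (SELECT * FROM t1 WHERE c1 = 100)
                    THEN (SELECT c1 FROM t1)
                    ELSE (SELECT c2 FROM t2)
               END AS case_output
    )
    -- outer CASE refers to the column name, not the query name
    SELECT case_output,
           CASE case_output WHEN 1 THEN 'Yes' WHEN 2 THEN 'No' ELSE NULL END
    FROM case_query
"""
first_case, second_case = con.execute(query).fetchone()
print(first_case, second_case)  # 2 No
```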

Rank columns by their count of values

I have a table with a bunch of boolean columns. I'd like to rank these columns by the count of true values each one has.
I found a way to count the number of true values in a column using:
SELECT count(CASE WHEN col1 THEN 1 ELSE null END) as col1,
count(CASE WHEN col2 THEN 1 ELSE null END) as col2
....
FROM my_table;
but this approach has two problems:
I have to manually type the names of the columns
I have to then transpose the result and order by value
Is there a way to do the whole operation in one query?
This is not actually a crosstab job (or "pivot" in other RDBMS), but the reverse operation, "unpivot" if you will. One elegant technique is a VALUES expression in a LATERAL join.
The basic query can look like this, which takes care of:
I have to then transpose the result and order by value
SELECT c.col, c.ct
FROM (
SELECT count(col1 OR NULL) AS col1
, count(col2 OR NULL) AS col2
-- etc.
FROM tbl
) t
, LATERAL (
VALUES
('col1', col1)
, ('col2', col2)
-- etc.
) c(col, ct)
ORDER BY 2;
That was the simple part. Your other request is harder:
I have to manually type the names of the columns
This function takes your table name and retrieves meta data from the system catalog pg_attribute. It's a dynamic implementation of the above query, safe against SQL injection:
CREATE OR REPLACE FUNCTION f_true_ct(_tbl regclass)
RETURNS TABLE (col text, ct bigint)
LANGUAGE plpgsql AS
$func$
BEGIN
RETURN QUERY EXECUTE (
SELECT format('
SELECT c.col, c.ct
FROM (SELECT %s FROM %s) t
, LATERAL (VALUES %s) c(col, ct)
ORDER BY 2 DESC'
, string_agg (format('count(%1$I OR NULL) AS %1$I', attname), ', ')
, _tbl -- regclass is rendered as a properly quoted table name
, string_agg (format('(%1$L, %1$I)', attname), ', ')
)
FROM pg_attribute
WHERE attrelid = _tbl -- valid, visible, legal table name
AND attnum >= 1 -- exclude tableoid & friends
AND NOT attisdropped -- exclude dropped columns
AND atttypid = 'bool'::regtype -- only boolean columns
);
END
$func$;
Call:
SELECT * FROM f_true_ct('tbl'); -- table name optionally schema-qualified
Result:
col | ct
------+---
col1 | 3
col3 | 2
col2 | 1
Works for any table to rank all boolean columns by their count of true values.
To understand the function parameter, read this:
Table name as a PostgreSQL function parameter
Related answers with more explanation:
Check whether empty strings are present in character-type columns
Replace empty strings with null values
If I understand correctly, you can do this with a giant union all:
select c.*
from ((select 'col1' as which, sum(case when col1 then 1 else 0 end) as cnt from t
) union all
(select 'col2' as which, sum(case when col2 then 1 else 0 end) as cnt from t
) union all
. . .
) c
order by cnt desc;
Although you still need to type the column names, this does sidestep the transposition.
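If the query is built by a client program, the column names do not have to be typed by hand either. The following is a rough sketch of that idea, assuming SQLite through Python's sqlite3 module, boolean columns declared with the type name BOOLEAN and stored as 0/1, and hypothetical helpers named quote_ident and rank_true_counts. It generates one UNION ALL branch per boolean column.

```python
import sqlite3

def quote_ident(name):
    """Double-quote an identifier so it is safe to splice into SQL text."""
    return '"' + name.replace('"', '""') + '"'

def rank_true_counts(con, table):
    """Return (column, true_count) pairs for all BOOLEAN columns, highest first."""
    info = con.execute(f"PRAGMA table_info({quote_ident(table)})").fetchall()
    # table_info rows: (cid, name, declared_type, notnull, default, pk)
    bool_cols = [row[1] for row in info if row[2].upper() == "BOOLEAN"]
    if not bool_cols:
        return []
    selects = [
        f"SELECT ? AS which, "
        f"SUM(CASE WHEN {quote_ident(c)} THEN 1 ELSE 0 END) AS cnt "
        f"FROM {quote_ident(table)}"
        for c in bool_cols
    ]
    query = " UNION ALL ".join(selects) + " ORDER BY cnt DESC, which"
    # The column labels are passed as bound parameters, one per branch.
    return con.execute(query, bool_cols).fetchall()

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE tbl (id INTEGER, col1 BOOLEAN, col2 BOOLEAN, col3 BOOLEAN)")
con.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [(1, 1, 1, 1), (2, 1, 0, 1), (3, 1, 0, 0), (4, 0, 0, 0)],
)
ranking = rank_true_counts(con, "tbl")
print(ranking)  # [('col1', 3), ('col3', 2), ('col2', 1)]
```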

How do I determine if a group of data exists in a table, given the data that should appear in the group's rows?

I am writing data to a table and allocating a "group-id" for each batch of data that is written. To illustrate, consider the following table.
GroupId Value
------- -----
1 a
1 b
1 c
2 a
2 b
3 a
3 b
3 c
3 d
In this example, there are three groups of data, each with similar but varying values.
How do I query this table to find a group that contains a given set of values? For instance, if I query for (a,b,c) the result should be group 1. Similarly, a query for (b,a) should result in group 2, and a query for (a, b, c, e) should result in the empty set.
I can write a stored procedure that performs the following steps:
select distinct GroupId from Groups -- and store locally
for each distinct GroupId: perform a set-difference (except) between the input and table values (for the group), and vice versa
return the GroupId if both set-difference operations produced empty sets
This seems a bit excessive, and I am hoping to leverage some other commands in SQL to simplify it. Is there a simpler way to perform a set comparison in this context, or to select the group ID that contains the exact input values for the query?
This is a set-within-sets query. I like to solve it using group by and having:
select groupid
from GroupValues gv
group by groupid
having sum(case when value = 'a' then 1 else 0 end) > 0 and
sum(case when value = 'b' then 1 else 0 end) > 0 and
sum(case when value = 'c' then 1 else 0 end) > 0 and
sum(case when value not in ('a', 'b', 'c') then 1 else 0 end) = 0;
The first three conditions in the having clause check that each element exists. The last condition checks that there are no other values. This method is quite flexible and handles various inclusion and exclusion conditions on the values you are looking for.
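To try the three cases from the question, here is a small sketch that generalizes the HAVING conditions to an arbitrary list of values. It is a variant rather than the exact query above: the per-value checks are folded into one COUNT(DISTINCT ...). SQLite through Python's sqlite3 module and the helper name groups_matching are assumptions.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE GroupValues (groupid INT, value TEXT)")
con.executemany(
    "INSERT INTO GroupValues VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "c"), (2, "a"), (2, "b"),
     (3, "a"), (3, "b"), (3, "c"), (3, "d")],
)

def groups_matching(con, values):
    """Return the group ids whose set of values equals the given set exactly."""
    wanted = sorted(set(values))
    marks = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT groupid
        FROM GroupValues
        GROUP BY groupid
        -- every wanted value is present ...
        HAVING COUNT(DISTINCT CASE WHEN value IN ({marks}) THEN value END) = ?
        -- ... and nothing else is
           AND SUM(CASE WHEN value NOT IN ({marks}) THEN 1 ELSE 0 END) = 0
        ORDER BY groupid
    """
    params = wanted + [len(wanted)] + wanted
    return [row[0] for row in con.execute(query, params)]

print(groups_matching(con, ["a", "b", "c"]))       # [1]
print(groups_matching(con, ["b", "a"]))            # [2]
print(groups_matching(con, ["a", "b", "c", "e"]))  # []
```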
EDIT:
If you want to pass in a list, you can use:
with thelist as (
select 'a' as value union all
select 'b' union all
select 'c'
)
select groupid
from GroupValues gv left outer join
thelist
on gv.value = thelist.value
group by groupid
having count(distinct gv.value) = (select count(*) from thelist) and
count(distinct (case when gv.value = thelist.value then gv.value end)) = count(distinct gv.value);
Here the having clause counts the number of matching values and makes sure that this is the same size as the list.
EDIT:
The query failed to compile because a table alias was missing; it has been updated with the correct alias.
This is kind of ugly, but it works. On larger datasets I'm not sure what performance would look like, but the nested instances of #GroupValues key off GroupID in the main table so I think as long as you have a good index on GroupID it probably wouldn't be too horrible.
If Object_ID('tempdb..#GroupValues') Is Not Null Drop Table #GroupValues
Create Table #GroupValues (GroupID Int, Val Varchar(10));
Insert #GroupValues (GroupID, Val)
Values (1,'a'),(1,'b'),(1,'c'),(2,'a'),(2,'b'),(3,'a'),(3,'b'),(3,'c'),(3,'d');
If Object_ID('tempdb..#FindValues') Is Not Null Drop Table #FindValues
Create Table #FindValues (Val Varchar(10));
Insert #FindValues (Val)
Values ('a'),('b'),('c');
Select Distinct gv.GroupID
From (Select Distinct GroupID
From #GroupValues) gv
Where Not Exists (Select 1
From #FindValues fv2
Where Not Exists (Select 1
From #GroupValues gv2
Where gv.GroupID = gv2.GroupID
And fv2.Val = gv2.Val))
And Not Exists (Select 1
From #GroupValues gv3
Where gv3.GroupID = gv.GroupID
And Not Exists (Select 1
From #FindValues fv3
Where gv3.Val = fv3.Val))

Recursive SQL query - using results from query within query

I'm running SQL Server 2012, and here's what I need:
Row Field1 Field2
1 0 1
2 ? 2
3 ? -5
I need a query that will go through the table row by row.
It should take row2,field1 and set it equal to row1,field1+row2,field2
It then would take row3,field1 and set it equal to row2,field1+row3,field2
Initially the table has values in Field1 that are all equal to 0, and so when I run my query it just always uses 0 for the field1 values.
Any help would be appreciated. I was thinking a CTE would be the way to go, but I just don't know where to go with that.
Edit:
Just to clear up some things, in my example. The initial input would be
Row Field1 Field2
1 0 1
2 0 2
3 0 -5
The desired output would be:
Row Field1 Field2
1 1 1
2 3 2
3 -2 -5
My actual table is a bit complicated, but I know I can apply it specifically if I could understand how to pull it off with this example.
Is this what you need? (Unclear if when you refer to row2,field1 for example you mean the before or after update value)
CREATE TABLE YourTable
(
Row INT,
Field1 INT NULL,
Field2 INT
)
INSERT INTO YourTable
VALUES (1,0,1),
(2,0,2),
(3,0,-5);
WITH CTE AS
(
SELECT *,
SUM(Field2) OVER (ORDER BY Row ROWS UNBOUNDED PRECEDING) AS RunningTotal
FROM YourTable
)
UPDATE CTE
SET Field1 = RunningTotal
SELECT *
FROM YourTable
Final Result
Row Field1 Field2
----------- ----------- -----------
1 1 1
2 3 2
3 -2 -5
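The running total itself can be reproduced outside SQL Server. This sketch assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module, and renames the Row column to row_no to stay clear of keywords; it only selects the running total rather than updating through the CTE.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE YourTable (row_no INT, Field1 INT, Field2 INT)")
con.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?)",
    [(1, 0, 1), (2, 0, 2), (3, 0, -5)],
)

query = """
    SELECT row_no,
           SUM(Field2) OVER (ORDER BY row_no
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
               AS RunningTotal,
           Field2
    FROM YourTable
    ORDER BY row_no
"""
running = con.execute(query).fetchall()
print(running)  # [(1, 1, 1), (2, 3, 2), (3, -2, -5)]
```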
Or another (more literal) interpretation of your word problem might be
WITH CTE AS
(
SELECT *,
LAG(Field1) OVER (ORDER BY Row) AS PrevRowField1
FROM YourTable
)
UPDATE CTE
SET Field1 = PrevRowField1 + Field2
WHERE PrevRowField1 IS NOT NULL
Something like this, adapted from "TSQL A recursive update?":
With cte As (
Select
Row,
Field1,
Field2
From
t
Where
Row = 1
Union All
Select
t.Row,
t.Field2 + c.Field1,
t.Field2
From
t
Inner Join
cte c
On t.Row = c.Row + 1
)
Update
t
Set
Field1 = c.Field1
From
t
inner join
cte c
On t.Row = c.Row
http://sqlfiddle.com/#!6/cf843/1
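To see what the recursive CTE produces before it is joined back for the update, here is a small sketch. It assumes SQLite through Python's sqlite3 module (which needs the RECURSIVE keyword), a row_no column in place of Row, and the question's initial data where every Field1 is 0. Because the anchor row keeps its stored Field1, the chain starts from 0.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE t (row_no INT, Field1 INT, Field2 INT)")
con.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, 0, 1), (2, 0, 2), (3, 0, -5)],
)

query = """
    WITH RECURSIVE cte (row_no, Field1, Field2) AS (
        -- anchor: the first row as stored
        SELECT row_no, Field1, Field2 FROM t WHERE row_no = 1
        UNION ALL
        -- step: this row's Field2 plus the previous row's computed Field1
        SELECT t.row_no, t.Field2 + c.Field1, t.Field2
        FROM t
        JOIN cte c ON t.row_no = c.row_no + 1
    )
    SELECT row_no, Field1, Field2 FROM cte ORDER BY row_no
"""
chained = con.execute(query).fetchall()
print(chained)  # [(1, 0, 1), (2, 2, 2), (3, -3, -5)]
```

With this seeding the output follows the literal wording of the question; to get the desired output shown there (1, 3, -2), the anchor row would have to start from its own Field2 instead.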

SQL statement for maximum common element in a set

I have a table like
id contact value
1 A 2
2 A 3
3 B 2
4 B 3
5 B 4
6 C 2
Now I would like to get the common maximum value for a given set of contacts.
For example:
if my contact set was {A,B} it would return 3;
for the set {A,C} it would return 2
for the set {B} it would return 4
What SQL statement(s) can do this?
Try this:
SELECT value, count(distinct contact) as cnt
FROM my_table
WHERE contact IN ('A', 'C')
GROUP BY value
HAVING cnt = 2
ORDER BY value DESC
LIMIT 1
This is MySQL syntax and may differ for your database. The number (2) in the HAVING clause is the number of elements in the set.
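The three examples from the question can be checked with a short sketch of this query. It assumes SQLite through Python's sqlite3 module and a hypothetical helper named max_common_value; the aggregate is repeated in HAVING instead of using the cnt alias, which keeps the statement portable.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE my_table (id INT, contact TEXT, value INT)")
con.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, "A", 2), (2, "A", 3), (3, "B", 2),
     (4, "B", 3), (5, "B", 4), (6, "C", 2)],
)

def max_common_value(con, contacts):
    """Largest value shared by every contact in the set, or None if there is none."""
    wanted = sorted(set(contacts))
    marks = ", ".join("?" for _ in wanted)
    query = f"""
        SELECT value
        FROM my_table
        WHERE contact IN ({marks})
        GROUP BY value
        HAVING COUNT(DISTINCT contact) = ?   -- value seen for every contact
        ORDER BY value DESC
        LIMIT 1
    """
    row = con.execute(query, wanted + [len(wanted)]).fetchone()
    return row[0] if row else None

print(max_common_value(con, ["A", "B"]))  # 3
print(max_common_value(con, ["A", "C"]))  # 2
print(max_common_value(con, ["B"]))       # 4
```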
SELECT max(value) FROM table WHERE contact IN ('A', 'C')
Edit: max common
declare @contacts table ( contact nchar(10) )
insert into @contacts values ('a')
insert into @contacts values ('b')
select MAX(value)
from MyTable
where (select COUNT(*) from @contacts) =
(select COUNT(*)
from MyTable t
join @contacts c on c.contact = t.contact
where t.value = MyTable.value)
Most will tell you to use:
SELECT MAX(t.value)
FROM TABLE t
WHERE t.contact IN ('A', 'C')
GROUP BY t.value
HAVING COUNT(DISTINCT t.contact) = 2
Couple of caveats:
The DISTINCT is key, otherwise you could have two rows of t.contact = 'A'.
The number of COUNT(DISTINCT t.contact) has to equal the number of values specified in the IN clause
My preference is to use JOINs:
SELECT MAX(t.value)
FROM TABLE t
JOIN TABLE t2 ON t2.value = t.value AND t2.contact = 'C'
WHERE t.contact = 'A'
The downside to this is that you have to do a self join (join to the same table) for every criterion (contact value in this case).