SQL: preferred way for keyword search

I have a table in the following format:
row_key  extID  tag  val
-------  -----  ---  ---
1        1      A    a
2        1      A    b
3        1      B    c
4        2      A    d
5        2      C    e
Now I want to find all extIDs for which several (tag, val) pairs have specific values, for example:
(tag, val) = (A, a) AND (tag, val) = (B, c)
or
(tag, val) = (C, e)
The number of constraints can change.
I can think of several ways to do this:
1. Perform a self-join for each constraint (see the sketch below)
2. Do the searching iteratively in the caller program (multiple SQL queries)
3. (Maybe?) write an SQL function to do this
4. Nested SELECT clauses (passing the extID to the outer level and using WHERE extID IN (SELECT extID FROM ...))
5. The only true solution, which I just can't find.
Which one would be the preferred (fastest and most elegant) way to do this? (Except, of course, "Surely, 5. is the correct answer.")
I think a multiple self-join is quite elegant, but I do not know whether it is fast and comparatively memory-efficient.
Furthermore, I would like a way that works with MySQL, PostgreSQL and SQLite without adaptation - that's why I can't use PIVOT, as far as I understand.
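For concreteness, a rough sketch of approach 1 for the two-constraint example (my illustration; the table name tableName is taken from the answers below):

SELECT t1.extID
FROM tableName t1
JOIN tableName t2 ON t2.extID = t1.extID
WHERE t1.tag = 'A' AND t1.val = 'a'
  AND t2.tag = 'B' AND t2.val = 'c';

Each additional constraint adds another self-join to the query.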

SELECT extID
FROM tableName
WHERE (tag = 'A' AND val = 'a') OR
(tag = 'B' AND val = 'c')
GROUP BY extID
HAVING COUNT(*) = 2
SQL of Relational Division
UPDATE 1
Since you haven't mentioned whether there can be duplicate combinations of tag and val, the DISTINCT keyword is needed:
SELECT extID
FROM tableName
WHERE (tag = 'A' AND val = 'a') OR
(tag = 'B' AND val = 'c')
GROUP BY extID
HAVING COUNT(DISTINCT tag, val) = 2

The tuple syntax would work:
SELECT extID
FROM tableName
WHERE (tag, val) in (('A', 'a'), ('B', 'c'))
GROUP BY extID
HAVING COUNT(DISTINCT tag, val) = 2
The HAVING COUNT(DISTINCT tag, val) = 2 ensures that each constraint tuple was present at least once. This means that the 2 needs to be adjusted to the number of constraint tuples in the query.
This would even work if you have two identical rows like this and the condition is ('C', 'e'):
row_key  extID  tag  val
-------  -----  ---  ---
5        2      C    e
6        2      C    e
The query for this would look like this:
SELECT extID
FROM tableName
WHERE (tag, val) in (('C', 'e'))
GROUP BY extID
HAVING COUNT(DISTINCT tag, val) = 1
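A note on portability (my addition, not part of the original answers): the two-argument COUNT(DISTINCT tag, val) is MySQL-specific. When every constraint tuple uses a different tag, as in the examples above, counting distinct tags alone is enough, and the query runs unchanged on MySQL, PostgreSQL and SQLite:

SELECT extID
FROM tableName
WHERE (tag = 'A' AND val = 'a') OR
      (tag = 'B' AND val = 'c')
GROUP BY extID
HAVING COUNT(DISTINCT tag) = 2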

Related

SQL Server - Return Items Only When All Sub-Items Are Available

I have an Item table (denormalized for this example) containing a list of items, parts and whether the part is available. I want to return all the items for which all the parts are available. Each item can have a varying number of parts. For example:
Item  Part  Available
A     1     Y
A     2     N
A     3     N
B     1     Y
B     4     Y
C     2     N
C     5     Y
D     4     Y
D     6     Y
D     7     Y
The query should return the following:
Item  Part
B     1
B     4
D     4
D     6
D     7
Thanks in advance for any assistance.
Here is one trick, using the MAX() OVER() window aggregate function:
SELECT Item,
       Part
FROM   (SELECT Max([Available]) OVER (PARTITION BY [Item]) m_av, *
        FROM   yourtable) a
WHERE  m_av = 'Y'
Or, using a GROUP BY and HAVING clause.
Using an IN clause:
SELECT Item,
       Part
FROM   yourtable
WHERE  Item IN (SELECT Item
                FROM   yourtable
                GROUP  BY Item
                HAVING Count(*) = Sum(Iif(Available = 'Y', 1, 0)))
Using EXISTS:
SELECT Item,
       Part
FROM   yourtable A
WHERE  EXISTS (SELECT 1
               FROM   yourtable B
               WHERE  A.Item = B.Item
               HAVING Count(*) = Sum(Iif(Available = 'Y', 1, 0)))
Using NOT EXISTS:
SELECT Item,
       Part
FROM   yourtable A
WHERE  NOT EXISTS (SELECT *
                   FROM   yourtable B
                   WHERE  A.Item = B.Item
                          AND B.Available = 'N')
I'd start with rephrasing the requirement - you want to return the items that don't have any parts that are not available. Once you put it like that, it's easy to translate the requirement to SQL using the not exists operator:
SELECT item, part
FROM parts a
WHERE NOT EXISTS (SELECT *
FROM parts b
WHERE a.item = b.item AND b.available = 'N')
Using window functions requires only a single read of the table.
MIN and MAX window functions:
select *
from (
  select
    t.*,
    max(available) over (partition by item) a,
    min(available) over (partition by item) b
  from your_table t
) t where a = b and a = 'Y';
COUNT window function:
select *
from (
  select
    t.*,
    count(*) over (partition by item) n1,
    count(case when available = 'Y' then 1 end) over (partition by item) n2
  from your_table t
) t where n1 = n2;
You can use NOT IN or NOT EXISTS to achieve this.
NOT EXISTS:
Select item, part
from yourtable as T1
where not exists (select 1 from yourtable where item = T1.item and available = 'N')
NOT IN:
Select item, part
from yourtable
where item not in (select item from yourtable where available = 'N')
I want to point out that the question in the text is: "I want to return all the items for which all the parts are available". However, your example results include the parts.
If the question is indeed that you want the items only, then you can use simple aggregation:
select item
from parts
group by item
having min(available) = max(available) and min(available) = 'Y';
If you indeed want the detail on the parts as well, then the other answers provide that information.
I do like it when problems lend themselves well to being solved by infrequently used language features:
with cte as (
select * from (values
('A', 1, 'Y'),
('A', 2, 'N'),
('A', 3, 'N'),
('B', 1, 'Y'),
('B', 4, 'Y'),
('C', 2, 'N'),
('C', 5, 'Y'),
('D', 4, 'Y'),
('D', 6, 'Y'),
('D', 7, 'Y')
) as x(Item, Part, Available)
)
select *
into #t
from cte as c;
select *
from #t as c
where 'Y' = all (
select Available
from #t as a
where c.Item = a.Item
)
Here, we use a correlated subquery and the all keyword to see if all of the parts are available. My understanding is that, like exists, this will stop if it finds a counter-example.

ORACLE sum inside a case statement

Hi, I need the result of this: if an entityID matches a value, I need the sum of a certain column. I am getting an "expression missing" error. Can someone point me to where the error is?
Thanks.
SELECT
p.jobTitle,
p.department,
p.person,
ufr.meets,
ufr.exceeds,
CASE
WHEN ufr.entityid = 'AHT' THEN (AD.acdcalls + AD.daacdcalls)
WHEN ufr.entityid = 'ACW' THEN (AD.acdcalls + AD.daacdcalls)
WHEN ufr.entityid = 'Adherence' THEN SUM(AA.totalSched)
WHEN ufr.entityid = 'Conformance' THEN SUM(AS.minutes)
ELSE null
END as weight,
(weight * meets) AS weightedMeets,
(weight * exceeds) AS weightedExceeds
FROM M_PERSON p
JOIN A_TMP5408_UNFLTRDRESULTSAG ufr
ON ufr.department = p.department AND ufr.jobTitle = p.jobTitle
LEFT JOIN M_AvayaDAgentChunk AD
ON AD.person = p.person and ufr.split = AD.split
LEFT JOIN M_AgentAdherenceChunk AA
ON AA.person = p.person
LEFT JOIN M_AgentScheduleChunk AS
ON AS.person = p.person
GROUP BY
p.person,
p.department,
p.jobTitle,
ufr.meets,
ufr.exceeds,
weight,
weightedMeets,
weightedExceeds
As well as the issues mentioned by @GordonLinoff (that AS is a keyword) and @DCookie (you need entityid in the group-by):
you also need acdcalls and daacdcalls in the group-by (unless you can aggregate those);
you can't refer to a column alias at the same level of the query, so (weight * meets) AS weightedMeets isn't allowed - you've only just defined what weight is, in the same select list. You need to use an inline view, or a CTE, if you don't want to repeat the CASE logic.
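A minimal illustration of the alias point, using a hypothetical table t with a numeric column x (my example, not from the question):

-- Fails: weight is defined in the same select list, so it cannot be referenced here
SELECT x + 1 AS weight, weight * 2 AS double_weight FROM t;

-- Works: compute the alias in an inline view first, then reuse it
SELECT weight, weight * 2 AS double_weight
FROM (SELECT x + 1 AS weight FROM t);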
I think this does what you want:
SELECT
jobTitle,
department,
person,
meets,
exceeds,
weight,
(weight * meets) AS weightedMeets,
(weight * exceeds) AS weightedExceeds
FROM
(
SELECT
MP.jobTitle,
MP.department,
MP.person,
ufr.meets,
ufr.exceeds,
CASE
WHEN ufr.entityid = 'AHT' THEN (MADAC.acdcalls + MADAC.daacdcalls)
WHEN ufr.entityid = 'ACW' THEN (MADAC.acdcalls + MADAC.daacdcalls)
WHEN ufr.entityid = 'Adherence' THEN SUM(MAAC.totalSched)
WHEN ufr.entityid = 'Conformance' THEN SUM(MASC.minutes)
ELSE null
END as weight
FROM M_PERSON MP
JOIN A_TMP5408_UNFLTRDRESULTSAG ufr
ON ufr.department = MP.department AND ufr.jobTitle = MP.jobTitle
LEFT JOIN M_AvayaDAgentChunk MADAC
ON MADAC.person = MP.person and ufr.split = MADAC.split
LEFT JOIN M_AgentAdherenceChunk MAAC
ON MAAC.person = MP.person
LEFT JOIN M_AgentScheduleChunk MASC
ON MASC.person = MP.person
GROUP BY
MP.person,
MP.department,
MP.jobTitle,
ufr.meets,
ufr.exceeds,
ufr.entityid,
MADAC.acdcalls,
MADAC.daacdcalls
);
Your first two CASE branches could be combined, since the calculation is the same, but it will work either way.
In addition to the alias issue identified by Gordon, I think you'll find you need to use an aggregate function in all the THEN clauses of your CASE expression, and that you need to GROUP BY ufr.entityid as well. Otherwise you'll start getting ORA-00979 errors (not a GROUP BY expression). If you don't want the aggregate function in all clauses, then you'll have to group by the expressions you're summing as well.
Small illustration:
CREATE TABLE tt (ID varchar2(32), sub_id varchar2(32), x NUMBER, y NUMBER);
INSERT INTO tt VALUES ('ID1', 'A', 1, 6);
INSERT INTO tt VALUES ('ID1', 'B', 1, 7);
INSERT INTO tt VALUES ('ID2', 'A', 2, 6);
INSERT INTO tt VALUES ('ID2', 'B', 2, 7);
INSERT INTO tt VALUES ('ID3', 'A', 3, 6);
INSERT INTO tt VALUES ('ID3', 'B', 3, 7);
INSERT INTO tt VALUES ('ID3', 'C', 3, 8);
SELECT ID, CASE WHEN sub_id = 'A' THEN SUM(y)
WHEN sub_id = 'B' THEN SUM(x)
ELSE (x + y) END tst
FROM tt
GROUP BY ID
ORA-00979: not a GROUP BY expression (points at sub_id in WHEN)
SELECT ID, CASE WHEN sub_id = 'A' THEN SUM(y)
WHEN sub_id = 'B' THEN SUM(x)
ELSE (x + y) END tst
FROM tt
GROUP BY ID, sub_id
ORA-00979: not a GROUP BY expression (points at x in ELSE)
SELECT ID, CASE WHEN sub_id = 'A' THEN SUM(y)
WHEN sub_id = 'B' THEN SUM(x)
ELSE SUM(x + y) END tst
FROM tt
GROUP BY ID, sub_id;

ID   TST
---  ---
ID1    6
ID3    6
ID3    3
ID1    1
ID2    6
ID2    2
ID3   11

Multiple Columns in an "in" statement

I am using DB2 and I am trying to write a query which checks multiple columns against a given set of values, e.g. field a, field b and field c against the values x, y, z, f. One way I can think of is writing the same condition three times with OR, i.e. field a in ('x','y','z','f') or field b in ... and so on. Please let me know if there is some other efficient and easy way to accomplish this. I am looking for a query that returns yes if any of the conditions is true, else no. Please suggest!
This may or may not work on as400:
create table a (a int not null, b int not null);
insert into a (a,b) values (1,1),(1,3),(2,3),(0,23);
select a.*
from a
where a in (1,2) or b in (1,2);
A            B
-----------  -----------
1            1
1            3
2            3
Rewriting as a join:
select a.*
from a
join ( values (1),(2) ) b (x)
on b.x in (a.a, a.b);
A            B
-----------  -----------
1            1
1            3
2            3
Assuming the column data types are the same, create a subquery joining all the columns you want to search with your IN into one column, using a UNION:
SELECT *
FROM (
SELECT
YOUR_TABLE_PRIMARY_KEY
,A AS Col
FROM YOUR_TABLE
UNION ALL
SELECT
YOUR_TABLE_PRIMARY_KEY
,B AS Col
FROM YOUR_TABLE
UNION ALL
SELECT
YOUR_TABLE_PRIMARY_KEY
,C AS Col
FROM YOUR_TABLE
) AS SQ
WHERE
SQ.Col IN ('x','y','z','f')
Make sure to include the table key so you know which row the data refers to
You can create a regular expression that describes the set of characters and use it with XQuery.
Assuming you're on a supported version of the OS (tested on 7.1 TR6), this should work...
with sel (val) as (values ('x'),('y'),('f'))
select * from mytbl
where flda in (select val from sel)
or fldb in (select val from sel)
or fldc in (select val from sel)
Expanding on the above, since the OP asked for "if any condition is true return yes, else no":
Assuming you've got the key of a row to check, would 'yes' or the empty set be good enough? Here somekey is the key for the row you want to check.
with sel (val) as (values ('x'),('y'),('f'))
select 'yes' from mytbl
where thekey = somekey
and ( flda in (select val from sel)
or fldb in (select val from sel)
or fldc in (select val from sel)
)
It's actually rather difficult to return a value when you don't have a matching row. Here's one way. Note I've switched to 1 = yes, 0 = no.
with sel (val) as (values ('x'),('y'),('f'))
select 1 from mytbl
where thekey = somekey
and ( flda in (select val from sel)
or fldb in (select val from sel)
or fldc in (select val from sel)
)
UNION ALL
select 0
from sysibm.sysdummy1
order by 1 desc
fetch first row only
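A variation on the same idea (my sketch, not from the original answers, reusing the hypothetical names above): wrapping the EXISTS test in a CASE expression always returns exactly one row containing 'yes' or 'no'.

with sel (val) as (values ('x'),('y'),('f'))
select case
         when exists (select 1 from mytbl
                      where thekey = somekey
                        and ( flda in (select val from sel)
                           or fldb in (select val from sel)
                           or fldc in (select val from sel) ))
         then 'yes' else 'no'
       end as matched
from sysibm.sysdummy1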

Generating all possible combinations of a timetable using an SQL Query

I have an awkward SQL Puzzle that has bested me.
I am trying to generate a list of possible configurations of student blocks so I can fit their course choices into a timetable. A list of possible qualifications and blocks for a student could be as the following:
Biology A
Biology C
Biology D
Biology E
Chemistry B
Chemistry C
Chemistry D
Chemistry E
Chemistry F
Computing D
Computing F
Tutorial A
Tutorial B
Tutorial E
A possible solution of blocks for a student could be
Biology D
Chemistry C
Computing F
Tutorial E
How would I query the above dataset to produce all possible combinations of lessons and blocks for a student? I could then pare down the list removing the ones that clash and choose one that works. I estimate that in this instance there will be about 120 combinations in total.
I could imagine that it would be some kind of cross join. I have tried all sorts of solutions using window functions and cross apply etc but they have all had some kind of flaw. They all tend to get tripped up because each student has a different number of courses and each course has a different number of blocks.
Cheers for any help you can offer! I can paste in the gnarled mess of a query I have if necessary too!
Alex
For a fixed number of qualifications, the answer is relatively simple - the CROSS JOIN option from the previous answers will work perfectly.
However, if the number of qualifications is unknown, or likely to change in the future, hard-coding four CROSS JOIN operations won't work. In this case, the answer gets more complicated.
For small numbers of rows, you could use a variation of this answer on DBA, which uses powers of two and bit comparisons to generate the combinations. However, this will be limited to a very small number of rows.
For larger numbers of rows, you can use a function to generate every combination of 'M' numbers from 'N' rows. You can then join this back to a ROW_NUMBER value computed on your source data to get the original row.
The function to generate the combinations could be written in TSQL, but it would make more sense to use SQLCLR if possible:
[SqlFunction(
    DataAccess = DataAccessKind.None,
    SystemDataAccess = SystemDataAccessKind.None,
    IsDeterministic = true,
    IsPrecise = true,
    FillRowMethodName = "FillRow",
    TableDefinition = "CombinationId bigint, Value int"
)]
public static IEnumerable Combinations(SqlInt32 TotalCount, SqlInt32 ItemsToPick)
{
    if (TotalCount.IsNull || ItemsToPick.IsNull) yield break;

    int totalCount = TotalCount.Value;
    int itemsToPick = ItemsToPick.Value;
    if (0 >= totalCount || 0 >= itemsToPick) yield break;

    long combinationId = 1;
    var result = new int[itemsToPick];
    var stack = new Stack<int>();
    stack.Push(0);

    while (stack.Count > 0)
    {
        int index = stack.Count - 1;
        int value = stack.Pop();

        while (value < totalCount)
        {
            result[index++] = value++;
            stack.Push(value);

            if (index == itemsToPick)
            {
                for (int i = 0; i < result.Length; i++)
                {
                    yield return new KeyValuePair<long, int>(
                        combinationId, result[i]);
                }
                combinationId++;
                break;
            }
        }
    }
}

public static void FillRow(object row, out long CombinationId, out int Value)
{
    var pair = (KeyValuePair<long, int>)row;
    CombinationId = pair.Key;
    Value = pair.Value;
}
(Based on this function.)
Once the function is in place, generating the list of valid combinations is fairly easy:
DECLARE @Blocks TABLE
(
    Qualification varchar(10) NOT NULL,
    Block char(1) NOT NULL,
    UNIQUE (Qualification, Block)
);
INSERT INTO @Blocks
VALUES
    ('Biology', 'A'),
    ('Biology', 'C'),
    ('Biology', 'D'),
    ('Biology', 'E'),
    ('Chemistry', 'B'),
    ('Chemistry', 'C'),
    ('Chemistry', 'D'),
    ('Chemistry', 'E'),
    ('Chemistry', 'F'),
    ('Computing', 'D'),
    ('Computing', 'F'),
    ('Tutorial', 'A'),
    ('Tutorial', 'B'),
    ('Tutorial', 'E')
;
DECLARE @Count int, @QualificationCount int;
SELECT
    @Count = Count(1),
    @QualificationCount = Count(DISTINCT Qualification)
FROM
    @Blocks
;
WITH cteNumberedBlocks As
(
    SELECT
        ROW_NUMBER() OVER (ORDER BY Qualification, Block) - 1 As RowNumber,
        Qualification,
        Block
    FROM
        @Blocks
),
cteAllCombinations As
(
    SELECT
        C.CombinationId,
        B.Qualification,
        B.Block
    FROM
        dbo.Combinations(@Count, @QualificationCount) As C
        INNER JOIN cteNumberedBlocks As B
            ON B.RowNumber = C.Value
),
cteMatchingCombinations As
(
    SELECT
        CombinationId
    FROM
        cteAllCombinations
    GROUP BY
        CombinationId
    HAVING
        Count(DISTINCT Qualification) = @QualificationCount
        And
        Count(DISTINCT Block) = @QualificationCount
)
SELECT
    DENSE_RANK() OVER(ORDER BY C.CombinationId) As CombinationNumber,
    C.Qualification,
    C.Block
FROM
    cteAllCombinations As C
    INNER JOIN cteMatchingCombinations As MC
        ON MC.CombinationId = C.CombinationId
ORDER BY
    CombinationNumber,
    Qualification
;
This query will generate a list of 172 rows representing the 43 valid combinations:
1 Biology A
1 Chemistry B
1 Computing D
1 Tutorial E
2 Biology A
2 Chemistry B
2 Computing F
2 Tutorial E
...
In case you need the TSQL version of the Combinations function:
CREATE FUNCTION dbo.Combinations
(
    @TotalCount int,
    @ItemsToPick int
)
Returns @Result TABLE
(
    CombinationId bigint NOT NULL,
    ItemNumber int NOT NULL,
    Unique (CombinationId, ItemNumber)
)
As
BEGIN
    DECLARE @CombinationId bigint;
    DECLARE @StackPointer int, @Index int, @Value int;
    DECLARE @Stack TABLE
    (
        ID int NOT NULL Primary Key,
        Value int NOT NULL
    );
    DECLARE @Temp TABLE
    (
        ID int NOT NULL Primary Key,
        Value int NOT NULL Unique
    );

    SET @CombinationId = 1;
    SET @StackPointer = 1;
    INSERT INTO @Stack (ID, Value) VALUES (1, 0);

    WHILE @StackPointer > 0
    BEGIN
        SET @Index = @StackPointer - 1;
        DELETE FROM @Temp WHERE ID >= @Index;

        -- Pop:
        SELECT @Value = Value FROM @Stack WHERE ID = @StackPointer;
        DELETE FROM @Stack WHERE ID = @StackPointer;
        SET @StackPointer -= 1;

        WHILE @Value < @TotalCount
        BEGIN
            INSERT INTO @Temp (ID, Value) VALUES (@Index, @Value);
            SET @Index += 1;
            SET @Value += 1;

            -- Push:
            SET @StackPointer += 1;
            INSERT INTO @Stack (ID, Value) VALUES (@StackPointer, @Value);

            If @Index = @ItemsToPick
            BEGIN
                INSERT INTO @Result (CombinationId, ItemNumber)
                SELECT @CombinationId, Value
                FROM @Temp;

                SET @CombinationId += 1;
                SET @Value = @TotalCount;
            END;
        END;
    END;

    Return;
END
It's virtually the same as the SQLCLR version, except for the fact that TSQL doesn't have stacks or arrays, so I've had to fake them with table variables.
One giant cross join?
select * from tablea,tableb,tablec,tabled
That will actually work for what you need, where tablea is the biology entries, b is chem, c is computing and d is tutorial. You can specify the joins a bit better:
select * from tablea cross join tableb cross join tablec cross join tabled.
Technically both statements are the same - this is all cross joins, so the comma version above is simpler. In more complicated queries you'll want to use the second statement so you can be very explicit about where you are cross joining vs. inner/left joining.
You can replace the 'table' entries with a select union statement to give the values you are looking for in query form:
select * from
(select 'biology' as 'course','a' as 'class' union all select 'biology','c' union all select 'biology','d' union all select 'biology','e') a cross join
(select 'Chemistry' as 'course','b' as 'class' union all select 'Chemistry','c' union all select 'Chemistry','d' union all select 'Chemistry','e' union all select 'Chemistry','f') b cross join
(select 'Computing' as 'course','d' as 'class' union all select 'Computing','f') c cross join
(select 'Tutorial ' as 'course','a' as 'class' union all select 'Tutorial ','b' union all select 'Tutorial ','e') d
There are your 120 results (4*5*3*2).
Not really seeing the problem, but does this sqlFiddle work?
You should be able to do this with a simple union; however, each select of the union would have a filter on only one class type, so you don't get
BIO, BIO, BIO, BIO, BIO
BIO, CHEM, BIO, BIO, BIO
etc...
select
b.course as BioCourse,
c.course as ChemCourse,
co.course as CompCourse,
t.course as Tutorial
from
YourTable b,
YourTable c,
YourTable co,
YourTable t
where
b.course like 'Biology%'
AND c.course like 'Chemistry%'
AND co.course like 'Computing%'
AND t.course like 'Tutorial%'
Let's use the paradigm where Table1 is Biology, Table2 is Chemistry, Table3 is Computing and Table4 is Tutorial. Each table has 1 column and that is the possible blocks for that table or course. To get all possible combinations, we want to Cartesian Product all of the tables together and then filter the rows out that have duplicate letters.
Each column in the end result will represent their respective course. This means column 1 in the finished table will be the block letter for Biology which is Table1.
So the SQL for the answer would look something like this.
SELECT * FROM Table1,Table2,Table3,Table4
WHERE col1 != col2
AND col1 != col3
AND col1 != col4
AND col2 != col3
AND col2 != col4
AND col3 != col4;
Note: this is trivial to extend to the case where each table has two columns, the first being the subject and the second the block. Substitutions just need to be made in the WHERE clause, but ignoring this case makes the code much easier to follow.
This is a little verbose, but it works if each student must have a class from each of the tables and the maximum number of classes is 4. This solution breaks down if a student doesn't have to have 4 classes.
The exact SQL needed may be a little different depending on which database you're using; for example, != could be <>.
Hope this helps!

GROUP BY or COUNT Like Field Values - UNPIVOT?

I have a table with test fields, for example:
id    | test1 | test2 | test3 | test4 | test5
------+-------+-------+-------+-------+-------
12345 | P     | P     | F     | I     | P
So for each record I want to know how many Pass, Failed or Incomplete (P,F or I)
Is there a way to GROUP BY value?
Pseudo:
SELECT ('P' IN (fields)) AS pass
WHERE id = 12345
I have about 40 test fields that I need to somehow group together and I really don't want to write this super ugly, long query. Yes I know I should rewrite the table into two or three separate tables but this is another problem.
Expected Results:
passed | failed | incomplete
-------+--------+-----------
3      | 1      | 1
Suggestions?
Note: I'm running PostgreSQL 7.4 and yes we are upgrading
I may have come up with a solution:
SELECT id
,l - length(replace(t, 'P', '')) AS nr_p
,l - length(replace(t, 'F', '')) AS nr_f
,l - length(replace(t, 'I', '')) AS nr_i
FROM (SELECT id, test::text AS t, length(test::text) AS l FROM test) t
The trick works like this:
Transform the rowtype into its text representation.
Measure character-length.
Replace the character you want to count and measure the change in length.
Compute the length of the original row in the subselect for repeated use.
This requires that P, F, I are present nowhere else in the row. Use a sub-select to exclude any other columns that might interfere.
Tested in 8.4 - 9.1. Nobody uses PostgreSQL 7.4 anymore, so you'll have to test it yourself. I only use basic functions, but I am not sure whether casting the rowtype to text is possible in 7.4. If that doesn't work, you'll have to concatenate all test columns by hand:
SELECT id
,length(t) - length(replace(t, 'P', '')) AS nr_p
,length(t) - length(replace(t, 'F', '')) AS nr_f
,length(t) - length(replace(t, 'I', '')) AS nr_i
FROM (SELECT id, test1||test2||test3||test4 AS t FROM test) t
This requires all columns to be NOT NULL.
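If NULLs are possible, one workaround (my addition, not part of the original answer) is to wrap each column in COALESCE before concatenating:

SELECT id
      ,length(t) - length(replace(t, 'P', '')) AS nr_p
      ,length(t) - length(replace(t, 'F', '')) AS nr_f
      ,length(t) - length(replace(t, 'I', '')) AS nr_i
FROM  (SELECT id, coalesce(test1,'') || coalesce(test2,'') || coalesce(test3,'') || coalesce(test4,'') AS t FROM test) t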
Essentially, you need to unpivot your data by test:
id    | test  | result
------+-------+--------
12345 | test1 | P
12345 | test2 | P
12345 | test3 | F
12345 | test4 | I
12345 | test5 | P
...
- so that you can then group it by test result.
Unfortunately, PostgreSQL doesn't have pivot/unpivot functionality built in, so the simplest way to do this would be something like:
select id, 'test1' test, test1 result from mytable union all
select id, 'test2' test, test2 result from mytable union all
select id, 'test3' test, test3 result from mytable union all
select id, 'test4' test, test4 result from mytable union all
select id, 'test5' test, test5 result from mytable union all
...
There are other ways of approaching this, but with 40 columns of data this is going to get really ugly.
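For completeness, a sketch of the counting step on top of that union (my addition; table and column names as in the snippet above), producing the passed/failed/incomplete columns the question asks for:

select
  sum(case when result = 'P' then 1 else 0 end) as passed,
  sum(case when result = 'F' then 1 else 0 end) as failed,
  sum(case when result = 'I' then 1 else 0 end) as incomplete
from (
  select id, 'test1' test, test1 result from mytable union all
  select id, 'test2' test, test2 result from mytable union all
  select id, 'test3' test, test3 result from mytable union all
  select id, 'test4' test, test4 result from mytable union all
  select id, 'test5' test, test5 result from mytable
) u
where id = 12345;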
EDIT: an alternative approach -
select r.result, sum(char_length(replace(replace(test1||test2||test3||test4||test5,excl1,''),excl2,'')))
from mytable m,
(select 'P' result, 'F' excl1, 'I' excl2 union all
select 'F' result, 'P' excl1, 'I' excl2 union all
select 'I' result, 'F' excl1, 'P' excl2) r
group by r.result
You could use an auxiliary on-the-fly table to turn columns into rows, then you would be able to apply aggregate functions, something like this:
SELECT
SUM(fields = 'P') AS passed,
SUM(fields = 'F') AS failed,
SUM(fields = 'I') AS incomplete
FROM (
SELECT
t.id,
CASE x.idx
WHEN 1 THEN t.test1
WHEN 2 THEN t.test2
WHEN 3 THEN t.test3
WHEN 4 THEN t.test4
WHEN 5 THEN t.test5
END AS fields
FROM atable t
CROSS JOIN (
SELECT 1 AS idx
UNION ALL SELECT 2
UNION ALL SELECT 3
UNION ALL SELECT 4
UNION ALL SELECT 5
) x
WHERE t.id = 12345
) s
Edit: just saw the comment about 7.4, I don't think this will work with that ancient version (unnest() came a lot later). If anyone thinks this is not worth keeping, I'll delete it.
Taking Erwin's idea to use the "row representation" as a base for the solution a bit further and automatically "normalize" the table on-the-fly:
select id,
sum(case when flag = 'F' then 1 else null end) as failed,
sum(case when flag = 'P' then 1 else null end) as passed,
sum(case when flag = 'I' then 1 else null end) as incomplete
from (
select id,
unnest(string_to_array(trim(trailing ')' from substr(all_column_values,strpos(all_column_values, ',') + 1)), ',')) flag
from (
SELECT id,
not_normalized::text AS all_column_values
FROM not_normalized
) t1
) t2
group by id
The heart of the solution is Erwin's trick of making a single value out of the complete row using the cast not_normalized::text. The string functions are applied to strip off the leading id value and the brackets around it.
The result of that is transformed into an array and that array is transformed into a result set using the unnest() function.
To understand that part, simply run the inner selects step by step.
Then the result is grouped and the corresponding values are counted.