Recursive query to find previous related row meeting criteria - sql

I have a database full of messages from various chatbots. The chatbots all
follow a decision-tree format: ultimately they present questions with choices,
to which the user responds.
The bot may send a message (Hello would you like A or B?) which has options
attached, A and B for example, and the user responds with one of them. Both of
these messages are recorded, and each message stores the id of the message before it.
id | message                      | options | previous_id
1  | Hello would you like A or B? | A,B     |
2  | A                            |         | 1
The structure of these conversations is not fixed. There may be various forms
of message flow. The above is a simplistic example of how the messages are
chained together. For example
// text question on same message as options, with preceding unrelated messages
Hello -> My name is Ben the bot. -> How are you today? (good, bad) -> [good]
// text question not on same message as options
Pick your favourite colour -> {picture of blue and red} (blue, red) -> [blue]
// no question just option prompt - here preceding text wasn't a question
[red] -> (ferrari, lamborghini) -> [ferrari]
-> denotes separation of messages
[] denotes reply to bot from user
() denotes options attached to messages
{} denotes attachments
What I am trying to get from this data is a row for every question with its
corresponding answer. The problem I'm facing is the recursion I would
presumably have to use: for each answer, retrieve the previous message again and
again until one meets the criteria indicating that I have gone back far enough
in the chain of messages.
In theory what I am trying to achieve is:
1. Find all answers to questions
2. From those results look at the previous message
   2a. If previous message has text and is not an answer itself then use said text and stop recursing
   2b. Else move onto the next previous message until the criteria is met.
3. Return rows containing answer/response, with question and other columns from question row (id, timestamp for example)
This would leave me with lots of rows containing a message and a response
in the following dataset for example,
id | message                      | other | previous_id
1  | Hello would you like A or B? |       |
2  | B                            |       | 1
3  | Hello would you like A or B? |       |
4  | A                            |       | 3
5  | Hello would you like A or B? |       |
6  | B                            |       | 5
7  | A is a great answer. C or D? |       | 4
8  | D                            |       | 7
9  | Green or red?                |       |
10 |                              | image | 9
11 | Red                          |       | 10
I'd hope to end up with
id | message                      | response
1  | Hello would you like A or B? | B
3  | Hello would you like A or B? | A
5  | Hello would you like A or B? | B
7  | A is a great answer. C or D? | D
8  | Green or red?                | Red
I have made a (somewhat) simplified version of some sample data which is at the bottom of this question for reference/use.
It uses the following structure
WITH data ( id, message, node, options, previous, attachment) AS ()
Answers can be found with select where node is null so I assumed that is the
best starting point and I can work backwards towards the question. previous
and options are json columns because that's how they are in the real data so
I left them as they were.
I have tried various ways to get the data I want, but I haven't managed the part that recurses over an unknown number of levels.
For example, this attempt can dig two levels deep, but I couldn't coalesce the
id of the message I found because both joined rows have non-null ids.
select COALESCE(d2.message, d3.message) as question, d.message as answer
-- select COALESCE(d2.message, d2.attachment, d3.message, d3.attachment) as question, d.message as answer
from data as d
left join data as d2 on (d.previous->>'id')::int = d2.id
left join data as d3 on (d2.previous->>'id')::int = d3.id
where d.previous->>'node' in (
SELECT node from data where options is not null group by node
)
I believe this answer https://dba.stackexchange.com/a/215125/4660 may be the
path to what I need, but so far I've been unable to get it to run as I'd like.
I think it would allow me to replace the two left joins in my example above
with a recursive union, using conditions in the ON clause to stop
it at the right point. Hopefully this sounds like it is along the right
lines and someone can point me in the right direction. Something like the below
perhaps?
WITH data (
id,
message,
node,
options,
previous,
attachment
) AS (
VALUES ...
), RecursiveTable as (
select * from data d where node is null -- all answers?
union all
select * from RecursiveTable where ??
)
select * from RecursiveTable
--
Basic sample dataset
WITH data (
id,
message,
node,
options,
previous,
attachment
) AS (
VALUES
-- QUESTION TYPE 1
-- pineapple questions
(1, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(2, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(3, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(4, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
(5, 'Pineapple on pizza?', 'pineapple', '["Yes","No"]'::json, null::json, null),
-- pineapple answers
(6, 'No', null, null, '{"id": 1, "node": "pineapple"}'::json, null),
(7, 'Yes', null, null, '{"id": 2, "node": "pineapple"}'::json, null),
(8, 'No', null, null, '{"id": 3, "node": "pineapple"}'::json, null),
(9, 'Yes', null, null, '{"id": 4, "node": "pineapple"}'::json, null),
(10, 'No', null, null, '{"id": 5, "node": "pineapple"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 2 - Previous message, then question with text + options followed by answer
--- previous messages to stuffed crust questions (we don't care about
--these but they're here to ensure we aren't accidentally getting them
--as the question in results)
(11, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
(12, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
(13, 'Hello', 'hello_pre_stuffed_crust', null, null::json, null),
-- stuffed crust questions
(14, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 11, "node": "hello_pre_stuffed_crust"}'::json, null),
(15, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 12, "node": "hello_pre_stuffed_crust"}'::json, null),
(16, 'Stuffed crust?', 'stuffed_crust', '["Crunchy crust","More cheese!"]'::json, '{"id": 13, "node": "hello_pre_stuffed_crust"}'::json, null),
-- stuffed crust answers
(17, 'More cheese!', null, null, '{"id": 14, "node": "stuffed_crust"}'::json, null),
(18, 'Crunchy crust', null, null, '{"id": 15, "node": "stuffed_crust"}'::json, null),
(19, 'Crunchy crust', null, null, '{"id": 16, "node": "stuffed_crust"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 3
-- two part question, no text with options only image, should get text from previous
-- part 1
(20, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
(21, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
(22, 'What do you think of this pizza?', 'check_this_image', null, null::json, null),
-- part two
(23, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 20, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
(24, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 21, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
(25, null, 'image', '["Looks amazing!","Not my cup of tea"]'::json, '{"id": 22, "node": "check_this_image"}'::json, 'https://images.unsplash.com/photo-1544982503-9f984c14501a'),
-- two part answers
(26, 'Looks amazing!', null, null, '{"id": 23, "node": "image"}'::json, null),
(27, 'Not my cup of tea', null, null, '{"id": 24, "node": "image"}'::json, null),
(28, 'Looks amazing!', null, null, '{"id": 25, "node": "image"}'::json, null),
-- ----------------------------
-- QUESTION TYPE 4
-- no text, just options straight after responding to something else - options for text value would be options, or image
-- directly after question 3 was answered, previous message was user message - but we don't have text here - just an image and options
(29, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 27}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg'),
(30, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 28}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg'),
(31, null, 'which_brand', '["Dominos","Papa Johns"]'::json, '{"id": 29}'::json, 'https://peakstudentmediadotcom.files.wordpress.com/2018/11/vs.jpg')
)
SELECT * from data

You can use WITH RECURSIVE to achieve your goal. You just need to specify when to stop the recursion, and then find a way to select only those records for which the recursion did not produce any further rows.
Have a look here:
WITH RECURSIVE comp (
id, message, node, options, previous, attachment,
id2, message2, node2, options2, previous2, attachment2,
rec_depth
) AS (
SELECT
t.id, t.message, t.node, t.options, t.previous, t.attachment,
null::integer AS id2, null::text AS message2, null::text AS node2, null::json AS options2, null::json AS previous2, null::text AS attachment2,
0
FROM data t
WHERE t.node IS NULL
UNION ALL
SELECT
c.id, c.message, c.node, c.options, c.previous, c.attachment,
prev.id, prev.message, prev.node, prev.options, prev.previous, prev.attachment,
c.rec_depth + 1
FROM comp c
INNER JOIN data prev ON prev.id = ((COALESCE(c.previous2, c.previous))->>'id')::int
WHERE prev.node IS NOT NULL -- do not reach back to the next answer
AND c.message2 IS NULL -- do not reach back beyond a message with text (the question text)
), data (id, message, node, options, previous, attachment) AS (
VALUES [...]
) SELECT
c.id2 AS question_id, c.id AS answer_id
FROM comp c
WHERE
NOT EXISTS(
SELECT 1
FROM comp c2
WHERE c2.id = c.id
AND c2.rec_depth > c.rec_depth
)
Before the recursion, comp holds only the "answers" (this is the part above UNION ALL). In the first recursion step, they are joined with their predecessors. In the second step, another new record is created per answer-predecessor pair, in which the predecessor is replaced by its own predecessor. This continues until the base condition is reached, that is, until no new records get created: either the joined partner is a record with a message (the question text), or the next partner is a record without a node (an answer).
As we also compute the recursion depth (rec_depth) of each row, we can finally keep, for each answer, only the record with the maximal recursion depth.
The second WITH query (data) can and should of course be removed, and you should reference your real table in the WITH RECURSIVE part.
I chose to select only the ids of the answer and the corresponding question, but the WITH RECURSIVE is built in such a way that you can use all of the columns.
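The walk-back described above can be tried out on a small scale. The following is a minimal sketch, not the query from this answer: it assumes SQLite (through Python's sqlite3 module) instead of PostgreSQL, a hypothetical messages table with a plain integer previous_id in place of the JSON previous column, and made-up node names. Rather than comparing rec_depth values, it simply keeps the rows where question text was found.

```python
import sqlite3

# Hypothetical, simplified data: node IS NULL marks a user answer,
# previous_id replaces the JSON "previous" column of the real table.
ROWS = [
    (1, "Hello would you like A or B?", "ab", None),
    (2, "B", None, 1),
    (3, "Hello would you like A or B?", "ab", None),
    (4, "A", None, 3),
    (5, "Hello would you like A or B?", "ab", None),
    (6, "B", None, 5),
    (7, "A is a great answer. C or D?", "cd", 4),
    (8, "D", None, 7),
    (9, "Green or red?", "colour", None),
    (10, None, "image", 9),  # options-only message without text
    (11, "Red", None, 10),
]

QUERY = """
WITH RECURSIVE walk (answer_id, response, cur_id, cur_message, cur_prev) AS (
    -- anchor: every answer, with no question text found yet
    SELECT id, message, id, NULL, previous_id
    FROM messages
    WHERE node IS NULL
    UNION ALL
    -- step: move to the message preceding the current one
    SELECT w.answer_id, w.response, p.id, p.message, p.previous_id
    FROM walk w
    JOIN messages p ON p.id = w.cur_prev
    WHERE p.node IS NOT NULL      -- never walk into an earlier answer
      AND w.cur_message IS NULL   -- stop once question text was found
)
SELECT cur_id, cur_message, response
FROM walk
WHERE cur_message IS NOT NULL
ORDER BY cur_id
"""


def question_answer_pairs():
    """Return (question_id, question_text, response) for each answer."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages ("
        "id INTEGER PRIMARY KEY, message TEXT, node TEXT, previous_id INTEGER)"
    )
    conn.executemany("INSERT INTO messages VALUES (?, ?, ?, ?)", ROWS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in question_answer_pairs():
        print(row)
```

An answer whose chain runs into another answer before any text is found (question type 4 in the sample data) produces no row in this sketch; whether that is the desired behaviour depends on the real data.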
Further reading in the docs:
https://www.postgresql.org/docs/13/sql-select.html#SQL-WITH
https://www.postgresql.org/docs/13/queries-with.html

Related

PostgreSQL: Get first non-true row or last row

I have a table containing measurement information of several tests per devices:
device_id gives the information on which device was tested
measurement_no is an incrementing number giving the order in which the tests have been performed
test gives you the name of the test which is performed
is_last_measurement_on_test is a boolean field telling whether the specific row is the last measurement of a test. It is true if the row is the last row of the device for a specific test. It is false if there is a subsequent row of the same device for the same test.
ok tells whether the test was okay (=true) or not okay (=false)
error_code gives you a specific error code if ok=false, or 0 if ok=true
WITH measurements (device_id,measurement_no,test,is_last_measurement_on_test,ok,error_code) AS ( VALUES
-- case 1: all measurements good, expecting to show test 3 only
('d1',1,'test1',true,true,0),
('d1',2,'test2',true,true,0),
('d1',3,'test3',true,true,0),
-- case 2: test 2 bad, expecting to show test 2 only
('d2',1,'test1',true,true,0),
('d2',2,'test2',true,false,100),
('d2',3,'test3',true,true,0),
-- case 3: test 2 and 3 bad, expecting to show test 2 only
('d3',1,'test1',true,true,0),
('d3',2,'test2',true,false,100),
('d3',3,'test3',true,false,200),
-- case 4: test 2 bad on first try, second time good, expecting to show test 3 only
('d4',1,'test1',true,true,0),
('d4',2,'test2',false,false,100),
('d4',3,'test2',true,true,0),
('d4',4,'test3',true,true,0)
)
select * from measurements
where is_last_measurement_on_test=true
Now I want to filter these rows on the following conditions per device:
Only the last measurement on each test should be considered -> that's easy: filtering on is_last_measurement_on_test=true
For every device: If there is a bad result (ok=false) in any test where is_last_measurement_on_test=true, I want to display the first test on which the device failed.
For every device: If there is no bad result at all (ok=true in every test where is_last_measurement_on_test=true), I want to display the last test on which the device passed.
For the example above, I expect only these rows to be displayed:
('d1',3,'test3',true,true,0)
('d2',2,'test2',true,false,100)
('d3',2,'test2',true,false,100)
('d4',4,'test3',true,true,0)
How can I get this result? I have already tried a lot with first_value, for example
first_value(nullif(error_code,0)) over (partition by device_id)
but I wasn't able to make it behave the way I wanted.
Given this sample data:
CREATE TABLE measurements (
device_id text,
measurement_no integer,
test text,
is_last_measurement_on_test boolean,
ok boolean,
error_code integer
);
INSERT INTO measurements (device_id, measurement_no, test, is_last_measurement_on_test, ok, error_code)
VALUES
('d1', 1, 'test1', true, true, 0),
('d1', 2, 'test2', true, true, 0),
('d1', 3, 'test3', true, true, 0),
('d2', 1, 'test1', true, true, 0),
('d2', 2, 'test2', true, false, 100),
('d2', 3, 'test3', true, true, 0),
('d3', 1, 'test1', true, true, 0),
('d3', 2, 'test2', true, false, 100),
('d3', 3, 'test3', true, false, 200),
('d4', 1, 'test1', true, true, 0),
('d4', 2, 'test2', false, false, 100),
('d4', 3, 'test2', true, true, 0),
('d4', 4, 'test3', true, true, 0);
the query could look like this:
WITH DataSource AS
(
SELECT *
,MIN(CASE WHEN ok = false THEN test END) OVER (PARTITION BY device_id) AS first_failed_test
,ROW_NUMBER() OVER (PARTITION BY device_id ORDER BY test DESC) AS test_id
FROM measurements
WHERE is_last_measurement_on_test = true
)
SELECT device_id, measurement_no, test, is_last_measurement_on_test, ok, error_code
FROM DataSource
WHERE (first_failed_test IS NULL and test_id = 1)
OR (first_failed_test = test)
The idea is to get the name of the first failed test and to number the tests with row_number, starting from the latest one.
The important part is that here I am ordering the tests by their name. In your real scenario, I am guessing you have a record_id or date which can be used to do this, so you will need to change the code a little.
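As a sketch of that adjustment, the variant below orders by measurement_no instead of the test name and folds both cases into a single ROW_NUMBER. It is an illustration under assumptions, not the query above: it runs on SQLite through Python's sqlite3 module (window functions need SQLite 3.25 or newer), and booleans are stored as 0/1 integers.

```python
import sqlite3

# Sample rows from the question, with booleans stored as 0/1 for SQLite.
MEASUREMENTS = [
    ("d1", 1, "test1", 1, 1, 0),
    ("d1", 2, "test2", 1, 1, 0),
    ("d1", 3, "test3", 1, 1, 0),
    ("d2", 1, "test1", 1, 1, 0),
    ("d2", 2, "test2", 1, 0, 100),
    ("d2", 3, "test3", 1, 1, 0),
    ("d3", 1, "test1", 1, 1, 0),
    ("d3", 2, "test2", 1, 0, 100),
    ("d3", 3, "test3", 1, 0, 200),
    ("d4", 1, "test1", 1, 1, 0),
    ("d4", 2, "test2", 0, 0, 100),
    ("d4", 3, "test2", 1, 1, 0),
    ("d4", 4, "test3", 1, 1, 0),
]

QUERY = """
WITH ranked AS (
    SELECT *,
           ROW_NUMBER() OVER (
               PARTITION BY device_id
               ORDER BY ok,  -- failed rows (0) sort before passed rows (1)
                        CASE WHEN ok = 1 THEN -measurement_no
                             ELSE measurement_no END
           ) AS rn
    FROM measurements
    WHERE is_last_measurement_on_test = 1
)
SELECT device_id, measurement_no, test, ok, error_code
FROM ranked
WHERE rn = 1
ORDER BY device_id
"""


def first_failure_or_last_pass():
    """Per device: earliest failed test, or the latest test if none failed."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE measurements (device_id TEXT, measurement_no INTEGER, "
        "test TEXT, is_last_measurement_on_test INTEGER, ok INTEGER, "
        "error_code INTEGER)"
    )
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?, ?, ?, ?)", MEASUREMENTS)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in first_failure_or_last_pass():
        print(row)
```

The CASE expression flips the sort direction: among failed rows the smallest measurement_no wins, while among passed rows the largest one does.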
distinct on gets you the one "top" record per device_id.
order by lets you establish the order that decides which record ends up on "top". Your 2nd and 3rd conditions require opposite ordering/priority of tests per device_id:
earliest negative record when there are negatives
latest record if there are no negatives
You can flip that order accordingly with a case expression.
select distinct on (device_id) *
from measurements
where is_last_measurement_on_test
order by device_id, --Necessary for distinct on to return one row per device_id
not ok desc, --If all is ok for a device_id, this does nothing.
--Otherwise it'll put negative tests results first
(case when not ok then -1 else 1 end)*measurement_no desc;
--When considering negative test results, it'll
--put earliest first. Otherwise it'll put latest first.
It's a common misconception that the order by clause is somehow restricted to plain column names or aliases. In fact, quoting the doc, it gives you as much freedom as the select list does:
The sort expression(s) can be any expression that would be valid in the query's select list.
Online demo.

Recursion - chaining data in sql - stuck

I have several tables, and the goal is to obtain the approval workflow for every customer.
Customers have different approval workflows. Take a look at this:
In my table "entities" I have this:
(12, 'Math Andrew', 308, 'CHAIN1-MathAndrew').
It means that when the row was created the number 12 was assigned to Math Andrew, and 308 is the number that says Math Andrew is a CLIENT.
Table type_entities
(308,'CLIENT'),
(309,'APPROVER1'),
(310,'APPROVER2'),
(311,'APPROVER3'),
(312,'J3 APPROVER4'),
(313,'J4 APPROVER4'),
(314,'J5 APPROVER4'),
(315, 'J6 APPROVER4'),
(316,'J7 APPROVER4');
Because Math Andrew is a CLIENT (also known as CUSTOMER) he must be linked to one or more APPROVERS.
A client can have 1, 2, 3 or 4 APPROVERS. Different approvers exist inside the entities table:
(18, 'ZATCH', 309, null),
(19, 'MAX', 309, null),
(20, 'Ger',310, null),
(21, 'Mar',310, null),
(22, 'Maxwell',311, null),
(23, 'Ryan',312, null),
(24, 'Juy',313, null),
(25, 'Angel',314, null),
(26, 'John',315, null);
Types of relations between entities:
(444,'J6 CLIENT-APPROVER4'),
(445,'J3 CLIENT-APPROVER4'),
(446,'J4 CLIENT-APPROVER4'),
(447,'J10 CLIENT-APPROVER4'),
(448,'J4 CLIENT-APPROVER4'),
(449,'J5 CLIENT-APPROVER4'),
(450,'J10 CLIENT-APPROVER4'),
(451,'J3 CLIENT-APPROVER4'),
(452,'J8 CLIENT-APPROVER4'),
(453,'J5 CLIENT-APPROVER4'),
(454,'J6 CLIENT-APPROVER4'),
(455,'J7 CLIENT-APPROVER4'),
(456,'J7 CLIENT-APPROVER4'),
(457,'J8 CLIENT-APPROVER4'),
(458,'CLIENT-APPROVER3'),
(459,'CLIENT-APPROVER1'),
(460,'APPROVER1-APPROVER2'),
(461,'APPROVER1-APPROVER3'),
(462,'J3 APPROVER1-APPROVER4'),
(463,'APPROVER2-APPROVER3'),
(464,'J3 APPROVER3-APPROVER4'),
(465,'J4 APPROVER3-APPROVER4'),
(466,'J5 APPROVER3-APPROVER4'),
(467,'J6 APPROVER3-APPROVER4'),
(468,'J7 APPROVER3-APPROVER4'),
(469,'J8 APPROVER3-APPROVER4'),
(470,'J10 APPROVER3-APPROVER4'),
(471,'CLIENT-APPROVER2');
This is the important part: when a client is linked to an approver, a row is created in the relationships table.
In this case MathAndrew was linked to approver #18 (ZATCH), and this row was created after the assignment:
(787,459,'CHAIN1-MathAndrew',18)
787 is the number that was assigned when that row was created, 459 represents the relation CLIENT - APPROVER, CHAIN1-MathAndrew is the client, and 18 is the approver.
Also, in this case APPROVER1 was linked to APPROVER2
(788,460,18,20)
Then, APPROVER2 was linked to APPROVER3
(789,463,20,21)
Finally, APPROVER3 was linked to APPROVER4
(790,467,21,26)
I want to obtain the complete approval workflow chain, I mean this:
CHAIN1-MathAndrew-ZATCH-Ger-Mar-John
I did this, but I am not getting what I want:
WITH relationships_CTE as
select description_entity_1,description_entitiy_2
from relationships
where description_entitiy_1 like 'CHAIN1-MathAndrew'
UNION ALL
select description_entity_1,description_entitiy_2
from relationships
where relationships.description_entitiy_2 = relationships_CTE.description_entitiy_2
select *
from relationships_CTE ma
left join relationships_CTE na
This is my SQL FIDDLE:
http://sqlfiddle.com/#!9/51bb39/4
Could you please help me?
So you have a couple of major issues with your demo. Firstly, you are trying to use a CTE on a version of MySQL that doesn't support it (CTE support was introduced in MySQL version 8). Secondly, you are trying to insert a string into a column in the relationships table (which should have been left as a reference to the entities table). Having corrected those issues, we can look at the CTE. There you have a syntax error because you have not enclosed your CTE query in (), and you have also failed to declare the CTE as recursive (which is required since it refers to itself).
Now, based on your question, you want to get names out of the entities table to correspond to the values in the relationships table. So we start the CTE by finding the appropriate entities.id value for CHAIN1-MathAndrew, and then in the recursive part of the CTE we loop through all the entities that are related to that entity, grabbing the names as we go. This gives us this query:
WITH recursive relationships_CTE as (
select e.id, e.description AS name
from entities e
where e.description like 'CHAIN1-MathAndrew'
UNION ALL
select r.description_entitiy_2, e.name
from relationships_CTE cte
left join relationships r
on r.description_entitiy_1 = cte.id
join entities e ON r.description_entitiy_2 = e.id
)
If we now
select *
from relationships_CTE
we get
id name
12 CHAIN1-MathAndrew
18 ZATCH
20 Ger
21 Mar
26 John
or we can use GROUP_CONCAT to string those names together:
select group_concat(name separator '-')
from relationships_CTE
Output:
CHAIN1-MathAndrew-ZATCH-Ger-Mar-John
Demo on dbfiddle
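For readers without MySQL 8 at hand, the same chain walk can be sketched with SQLite through Python's sqlite3 module. This is an illustration under assumptions: the column names (type_id, entity_1, entity_2) are hypothetical, relationship 787 references entity 12 instead of the string as suggested above, and a depth counter is added so the names can be joined in a guaranteed order.

```python
import sqlite3

# A subset of the entities from the question: (id, name, type_id, description)
ENTITIES = [
    (12, "Math Andrew", 308, "CHAIN1-MathAndrew"),
    (18, "ZATCH", 309, None),
    (19, "MAX", 309, None),
    (20, "Ger", 310, None),
    (21, "Mar", 310, None),
    (22, "Maxwell", 311, None),
    (26, "John", 315, None),
]

# (id, type_id, entity_1, entity_2); entity_1 of row 787 points at entity 12
RELATIONSHIPS = [
    (787, 459, 12, 18),
    (788, 460, 18, 20),
    (789, 463, 20, 21),
    (790, 467, 21, 26),
]

QUERY = """
WITH RECURSIVE chain (id, name, depth) AS (
    -- anchor: the client, identified by its chain description
    SELECT e.id, e.description, 0
    FROM entities e
    WHERE e.description = ?
    UNION ALL
    -- step: follow the relationship from the current entity to the next
    SELECT e.id, e.name, c.depth + 1
    FROM chain c
    JOIN relationships r ON r.entity_1 = c.id
    JOIN entities e ON e.id = r.entity_2
)
SELECT name FROM chain ORDER BY depth
"""


def approval_chain(description):
    """Return the approval chain for a client as a dash-separated string."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE entities (id INTEGER PRIMARY KEY, name TEXT, "
        "type_id INTEGER, description TEXT)"
    )
    conn.execute(
        "CREATE TABLE relationships (id INTEGER PRIMARY KEY, type_id INTEGER, "
        "entity_1 INTEGER, entity_2 INTEGER)"
    )
    conn.executemany("INSERT INTO entities VALUES (?, ?, ?, ?)", ENTITIES)
    conn.executemany("INSERT INTO relationships VALUES (?, ?, ?, ?)", RELATIONSHIPS)
    names = [row[0] for row in conn.execute(QUERY, (description,))]
    conn.close()
    return "-".join(names)


if __name__ == "__main__":
    print(approval_chain("CHAIN1-MathAndrew"))
```

Ordering by depth before joining avoids relying on the order in which an aggregate such as group_concat happens to receive its rows.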

How can I sort version numbers such as ‘A’, ‘B’, ‘AA’, ‘AB’ and ‘AAA’?

In reference to this question: How do I grab only the latest Invoice Number
I accepted an answer that uses the MAX function but Robert McKee pointed out that will result in sorted values such as:
‘A’
‘AA’
‘AAA’
‘AB’
‘B’
When what I need is:
‘A’
‘B’
‘AA’
‘AB’
‘AAA’
I am trying to find a way to find the latest Version of an Invoice. The accepted answer from the referenced question will work up to a point. And it did satisfy my question... But now a new problem deserves its own question and not for me to go back and modify my original question. So…
The only thing I have to work with is the Invoice Number itself.
The Invoice number has a format of #####XXX, where ##### is the actual Invoice Number and the XXX is the version number. XXX can be anywhere from ‘A’ to ‘ZZZ’.
Here is my attempt to find a plausible work around (a sql test case):
DECLARE @TempTable TABLE (MyNumber int, MyString varchar(15));
INSERT @TempTable
VALUES (100, 'A'), (100, 'AAZ'), (100, 'B'), (100, 'AZ'), (100, 'C'), (100, 'Z'), (100, 'AA'), (100, 'AB');
SELECT TOP 1
RTRIM(CAST(MyNumber AS NVARCHAR(15)) + MyString) AS InvoiceNumber
FROM @TempTable
ORDER BY RIGHT(LEFT(MyString + SPACE(2), 3), 1) DESC, RIGHT(LEFT(MyString + SPACE(2), 2), 1) DESC, LEFT(MyString, 1) DESC;
Would anyone care to provide a better answer or point me in the right direction to clean mine up?
Thanks in advance,
Try something like this:
ORDER BY LEN(myValue),myValue
This orders the 1-character values first, then the 2-character values, and so on, sorting alphabetically within each length.
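The length-then-value ordering can be checked quickly against the version strings from the question. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 module, so LENGTH stands in for SQL Server's LEN, and the table name invoice_versions is made up.

```python
import sqlite3


def sort_versions(versions):
    """Sort version suffixes by length first, then alphabetically."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE invoice_versions (version TEXT)")
    conn.executemany(
        "INSERT INTO invoice_versions VALUES (?)", [(v,) for v in versions]
    )
    rows = conn.execute(
        "SELECT version FROM invoice_versions ORDER BY LENGTH(version), version"
    ).fetchall()
    conn.close()
    return [row[0] for row in rows]


if __name__ == "__main__":
    ordered = sort_versions(["A", "AAZ", "B", "AZ", "C", "Z", "AA", "AB"])
    print(ordered)      # full ordering, oldest version first
    print(ordered[-1])  # the latest version is the last element
```

To pick only the latest version, the same ORDER BY can be reversed (both keys descending) and limited to one row.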
Not sure if this meets your definition of "better" or "cleaned up":
ORDER BY LEFT(MyString,1),SUBSTRING(MyString,2,1),SUBSTRING(MyString,3,1)

How to get the root parentID of a product or menu Item [duplicate]

This question already has answers here:
SQL - how to get the top parent of a given value in a self referencing table
(3 answers)
Closed 8 years ago.
I need to get the root PageID of a child item, which in my case is a menu item.
I have a table structure as below
[PageId], [PageName], [PagePath], [PageInheritance]
What I want is a SQL query that will get me the root PageID if users select a level 2 or level 3 menu item, so that I can always highlight the parent menu irrespective of its level.
For example, if PageID = 6 then it should get me Root PageID as 2.
I also tried to set up a SQL Fiddle for this page but it fails for some reason.
CREATE TABLE PageMenu
([PageId] int, [PageName] varchar(5), [PagePath] varchar(50), [PageInheritance] int)
;
INSERT INTO PageMenu
([PageId], [PageName], [PagePath], [PageInheritance])
VALUES
(1, 'Home', '/en/', 0),
(2, 'Menu1', '/en/Menu1/', 0),
(3, 'Child1', '/en/Menu1/Child1/', 2),
(4, 'Child1', '/en/Menu1/Child2/', 2),
(5, 'GrandChild1', '/en/Menu1/Child1/GrandChild1/', 4),
(6, 'GrandChild2', '/en/Menu1/Child1/GrandChild2/', 5)
;
My Solution : http://rextester.com/IXEKB9577
This is a bit challenging because you do not want "1", but "2". So, you want the second level, which makes this a bit different from most such problems. Here is one way to get the top level:
select pm.*, t.PageId
from PageMenu pm cross apply
(select top 1 pm2.PageId
from PageMenu pm2
where pm.PagePath like pm2.PagePath + '%'
order by len(pm2.PagePath)
) t;
However, this returns "1", and not "2". You can fix this with simple filtering:
select pm.*, t.PageId
from PageMenu pm cross apply
(select top 1 pm2.PageId
from PageMenu pm2
where pm.PagePath like pm2.PagePath + '%' and
pm2.PageName <> 'Home'
order by len(pm2.PagePath)
) t;
Here is a SQL Fiddle
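The path-prefix idea can be tried on the sample rows as follows. This is a minimal sketch that assumes SQLite through Python's sqlite3 module, so CROSS APPLY with TOP 1 is replaced by a correlated scalar subquery with LIMIT 1, and the prefix match and ordering both use PagePath.

```python
import sqlite3

# Sample rows from the question: (PageId, PageName, PagePath, PageInheritance)
PAGES = [
    (1, "Home", "/en/", 0),
    (2, "Menu1", "/en/Menu1/", 0),
    (3, "Child1", "/en/Menu1/Child1/", 2),
    (4, "Child1", "/en/Menu1/Child2/", 2),
    (5, "GrandChild1", "/en/Menu1/Child1/GrandChild1/", 4),
    (6, "GrandChild2", "/en/Menu1/Child1/GrandChild2/", 5),
]

QUERY = """
SELECT pm.PageId,
       (SELECT pm2.PageId
        FROM PageMenu pm2
        WHERE pm.PagePath LIKE pm2.PagePath || '%'  -- pm2 is a path prefix of pm
          AND pm2.PageName <> 'Home'                -- skip the site root
        ORDER BY LENGTH(pm2.PagePath)               -- shortest prefix = top level
        LIMIT 1) AS RootPageId
FROM PageMenu pm
ORDER BY pm.PageId
"""


def root_pages():
    """Return (PageId, RootPageId) pairs; RootPageId is None for Home."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE PageMenu (PageId INTEGER, PageName TEXT, "
        "PagePath TEXT, PageInheritance INTEGER)"
    )
    conn.executemany("INSERT INTO PageMenu VALUES (?, ?, ?, ?)", PAGES)
    rows = conn.execute(QUERY).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for row in root_pages():
        print(row)
```

Unlike CROSS APPLY, the scalar subquery keeps the Home row and reports a NULL root for it. The match depends on the paths being consistent and free of LIKE wildcards such as % or _.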

Implementation discussion

I need multiple counts from a single table, each based on a different combination of conditions.
The table has 2 flags: A & B.
I want counts for the following criteria on the same page:
A is true (Don't care about B)
A is false (Don't care about B)
A is true AND B is true
A is false AND B is true
A is true AND B is false
A is false AND B is false
B is true (Don't care about A)
B is false (Don't care about A)
I want all of the above counts on the same page. Which of the following would be a good approach for this:
Query the table for a count for each condition. [That means firing 8 queries every time the user gives the command.]
Query the list of data from the database and then count the values for each condition in the UI.
Which option should I choose? Do you know of any other alternative?
Your table essentially looks like this (The ID column is redundant, but I expect you have other data in your actual table anyway.):
CREATE TABLE `stuff` (
`id` INT(11) UNSIGNED NOT NULL AUTO_INCREMENT,
`a` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
`b` TINYINT(3) UNSIGNED NOT NULL DEFAULT '0',
PRIMARY KEY (`id`)
)
Some sample data:
INSERT INTO `stuff` (`id`, `a`, `b`) VALUES (1, 0, 0);
INSERT INTO `stuff` (`id`, `a`, `b`) VALUES (2, 0, 1);
INSERT INTO `stuff` (`id`, `a`, `b`) VALUES (3, 1, 0);
INSERT INTO `stuff` (`id`, `a`, `b`) VALUES (4, 1, 1);
This query (in mysql, I'm not sure about other DBMS) should produce the results you want.
select
count(if (a = 1, 1, NULL)) as one,
count(if (a = 0, 1, NULL)) as two,
count(if (a = 1 && b = 1, 1, NULL)) as three,
count(if (a = 0 && b = 1, 1, NULL)) as four,
count(if (a = 1 && b = 0, 1, NULL)) as five,
count(if (a = 0 && b = 0, 1, NULL)) as six,
count(if (b = 1, 1, NULL)) as seven,
count(if (b = 0, 1, NULL)) as eight
from stuff
group by null
With the sample, simple data above, the query generates:
one, two, three, four, five, six, seven, eight
2 , 2 , 1, 1, 1, 1, 2, 2
Notes:
group by null
This just causes every row to be in the same group.
count(...)
This function counts all the NON null values in the group, which is why we use the if(...) to return null if the condition is not met.
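The same single-pass counting can be written with standard CASE expressions instead of MySQL's if(...). The sketch below is an illustration under assumptions: it runs on SQLite through Python's sqlite3 module, and the column aliases are made up for readability.

```python
import sqlite3

# Sample rows from above: (id, a, b)
STUFF = [(1, 0, 0), (2, 0, 1), (3, 1, 0), (4, 1, 1)]

QUERY = """
SELECT
    COUNT(CASE WHEN a = 1 THEN 1 END)           AS a_true,
    COUNT(CASE WHEN a = 0 THEN 1 END)           AS a_false,
    COUNT(CASE WHEN a = 1 AND b = 1 THEN 1 END) AS a_true_b_true,
    COUNT(CASE WHEN a = 0 AND b = 1 THEN 1 END) AS a_false_b_true,
    COUNT(CASE WHEN a = 1 AND b = 0 THEN 1 END) AS a_true_b_false,
    COUNT(CASE WHEN a = 0 AND b = 0 THEN 1 END) AS a_false_b_false,
    COUNT(CASE WHEN b = 1 THEN 1 END)           AS b_true,
    COUNT(CASE WHEN b = 0 THEN 1 END)           AS b_false
FROM stuff
"""


def flag_counts():
    """Return all eight counts from a single scan of the table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stuff (id INTEGER PRIMARY KEY, a INTEGER, b INTEGER)")
    conn.executemany("INSERT INTO stuff VALUES (?, ?, ?)", STUFF)
    row = conn.execute(QUERY).fetchone()
    conn.close()
    return row


if __name__ == "__main__":
    print(flag_counts())
```

A CASE without ELSE yields NULL when the condition is not met, so COUNT skips those rows, which is the same trick as if(..., 1, NULL).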
Create a query that already does the counting. At least with SQL this is not hard.
In my opinion the 2nd option is better, as you are querying only once. Firing 8 queries at the DB might hurt performance later.
Databases are designed to give you the data you want. In almost all cases, asking for what you want is quicker than asking for everything and calculating or filtering yourself. I'd say you should go for option 1 (ask for what you need) by default and, only if it really does not work, consider option 2 (or something else).
If every flag is true or false (no null values), you don't need 8 queries; 4 are enough:
Get the total
A true (don't care about B)
B true (don't care about A)
A and B true
'A true and B false' is the second minus the fourth: (A true) - (A and B true). And 'A and B false' = total - A true - B true + A and B true. Look up the inclusion-exclusion principle for more information.
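The arithmetic above can be spelled out as a small helper. This is a sketch assuming no NULL flags; the function and key names are made up for illustration.

```python
def derive_counts(total, a_true, b_true, both_true):
    """Derive all eight counts from the four queried ones (no NULL flags)."""
    return {
        "a_true": a_true,
        "a_false": total - a_true,
        "a_true_b_true": both_true,
        "a_false_b_true": b_true - both_true,
        "a_true_b_false": a_true - both_true,
        # inclusion-exclusion: rows where neither flag is set
        "a_false_b_false": total - a_true - b_true + both_true,
        "b_true": b_true,
        "b_false": total - b_true,
    }


if __name__ == "__main__":
    # Values for a four-row table holding every a/b combination exactly once
    print(derive_counts(total=4, a_true=2, b_true=2, both_true=1))
```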