Merging results present in all sub queries - sql

I have a PostgreSQL table with data similar to the following.
Caters Table:
-----------------------
| Name | Option |
-----------------------
| jane | social |
| jane | vegan |
| jane | gmo-free |
| jane | italian |
| jack | social |
| jack | corporate |
| jack | gmo-free |
| jack | greek |
| rodz | social |
| rodz | wedding |
| rodz | gmo-free |
| rodz | vegan |
| rodz | french |
This is the "pseudo" query I'm trying to run
SELECT * FROM caters
WHERE option is either ['italian', 'french']
AND WHERE option is both ['wedding', 'social']
This pseudo query should return rodz. Because it either has italian or french and it has both wedding and social.
This is the query I wrote to implement my pseudo query:
SELECT c.name FROM caters c
WHERE c.option in ('italian', 'french')
GROUP BY c.name
HAVING array_agg(c.option) @> array['wedding', 'social']
However, this returns no results. Running each part individually:
SELECT c.name FROM caters c
WHERE c.option in ('italian', 'french')
GROUP BY c.name
Result:
-----------
| Name |
-----------
| jane | // has italian
| rodz | // has french
The other query
SELECT c.name FROM caters c
GROUP BY c.name
HAVING array_agg(c.option) @> array['wedding', 'social']
Result:
-----------
| Name |
-----------
| rodz | // has wedding and social
So I can see that, individually, the queries are correct. This made me think: if I have two queries that each give correct results, and I just need the names that appear in both, why not JOIN them?
So I tried
SELECT c.name FROM caters c
JOIN caters c1
ON c1.name = c.name and c1.option = c.option
WHERE c1.option in ('italian', 'french')
GROUP BY c.name
HAVING array_agg(c.option) @> array['wedding', 'social']
But this also yields no results. Any idea how I can go about this?
NOTE: The query is dynamic. Each time it is run, the values being used could be different: sometimes it is 5 languages, sometimes it is 2 languages, as in this example ('italian', 'french'). To give an example of what I mean by a dynamic query, other queries could be:
SELECT * FROM caters
WHERE option is either ['italian']
AND WHERE option is both ['corporate', 'social']
// returns none
----------------------------------------------------------
SELECT * FROM caters
WHERE option is either ['french', 'greek']
AND WHERE option is either ['gmo-free', 'vegan']
AND WHERE option is both ['corporate', 'social']
// returns jack
----------------------------------------------------------
SELECT * FROM caters WHERE option is ['social']
// returns jane, jack, and rodz

You can try using a correlated subquery
DEMO
select distinct name from tablename a
where option in ('italian', 'french') and exists
(
select 1 from tablename b where a.name=b.name and option in ('wedding', 'social')
group by b.name having count(distinct option)=2
)
OUTPUT:
name
rodz

Here is one method:
SELECT c.name
FROM caters c
WHERE c.option in ('italian', 'french', 'wedding', 'social')
GROUP BY c.name
HAVING COUNT(*) FILTER (WHERE c.option IN ('italian', 'french')) >= 1 AND
COUNT(*) FILTER (WHERE c.option IN ('wedding', 'social')) = 2;
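The original attempt returns nothing because WHERE runs before GROUP BY: once only the 'italian' and 'french' rows are kept, the 'wedding' and 'social' rows are no longer there for HAVING to see. Since the question says the option lists change from run to run, here is a minimal sketch of building the conditional-aggregation query dynamically. It is an illustration, not the answerers' code: it uses Python's built-in sqlite3 as a stand-in for PostgreSQL, and the names make_db and find_caterers are hypothetical. SUM(CASE ...) is used instead of FILTER so the SQL stays portable.

```python
import sqlite3

# Sample rows copied from the question's Caters table.
ROWS = [
    ("jane", "social"), ("jane", "vegan"), ("jane", "gmo-free"), ("jane", "italian"),
    ("jack", "social"), ("jack", "corporate"), ("jack", "gmo-free"), ("jack", "greek"),
    ("rodz", "social"), ("rodz", "wedding"), ("rodz", "gmo-free"), ("rodz", "vegan"),
    ("rodz", "french"),
]


def make_db():
    """Load the sample rows into an in-memory SQLite database (stand-in for PostgreSQL)."""
    conn = sqlite3.connect(":memory:")
    conn.execute('CREATE TABLE caters (name TEXT, "option" TEXT)')
    conn.executemany("INSERT INTO caters VALUES (?, ?)", ROWS)
    return conn


def find_caterers(conn, either_groups=(), both=()):
    """Return names having at least one option from every 'either' group and all 'both' options."""
    conditions, params = [], []
    for group in either_groups:
        marks = ", ".join("?" for _ in group)
        # At least one row of this caterer falls inside the group.
        conditions.append(
            f'SUM(CASE WHEN "option" IN ({marks}) THEN 1 ELSE 0 END) >= 1'
        )
        params.extend(group)
    if both:
        marks = ", ".join("?" for _ in both)
        # All required options are present: count the distinct matches.
        conditions.append(
            f'COUNT(DISTINCT CASE WHEN "option" IN ({marks}) THEN "option" END) = ?'
        )
        params.extend(both)
        params.append(len(set(both)))
    sql = "SELECT name FROM caters GROUP BY name"
    if conditions:
        sql += " HAVING " + " AND ".join(conditions)
    sql += " ORDER BY name"
    return [row[0] for row in conn.execute(sql, params)]


if __name__ == "__main__":
    conn = make_db()
    print(find_caterers(conn, [["italian", "french"]], ["wedding", "social"]))
```

Only the placeholders are generated from the list lengths; the option values themselves are always passed as bound parameters, so the dynamic part does not concatenate user input into the SQL text.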

Related

Why the output of a SELECT can be another SELECT?

I am rather confused about the following SQL query:
SELECT (SELECT S.name FROM student AS S
WHERE S.sid = E.sid) AS sname
FROM enrolled as E
WHERE cid='15-445';
SELECT should be followed by an output column, so why is there another SELECT here? How should I understand the step-by-step meaning of this query?
The following query returns the same result as the one above, but its meaning is more explicit: the output of the second SELECT is passed into the IN() condition.
SELECT name FROM student
WHERE sid IN (
SELECT sid FROM enrolled
WHERE cid = '15-445'
);
Here are the original tables of this question:
mysql> select * from student;
+-------+--------+------------+------+---------+
| sid | name | login | age | gpa |
+-------+--------+------------+------+---------+
| 53666 | Kanye | kayne#cs | 39 | 4.00000 |
| 53688 | Bieber | jbieber#cs | 22 | 3.90000 |
| 53655 | Tupac | shakur#cs | 26 | 3.50000 |
+-------+--------+------------+------+---------+
mysql> select * from enrolled;
+-------+--------+-------+
| sid | cid | grade |
+-------+--------+-------+
| 53666 | 15-445 | C |
| 53688 | 15-721 | A |
| 53688 | 15-826 | B |
| 53655 | 15-445 | B |
| 53666 | 15-721 | C |
+-------+--------+-------+
mysql> select * from course;
+--------+------------------------------+
| cid | name |
+--------+------------------------------+
| 15-445 | Database Systems |
| 15-721 | Advanced Database Systems |
| 15-826 | Data Mining |
| 15-823 | Advanced Topics in Databases |
+--------+------------------------------+
In real life I'd say both queries are just two creepy ways to avoid joins.
But in this particular case they were included in the slides you've found in order to show in how many places nested loops can be used.
They all do the same thing as the following
SELECT name
FROM student s
JOIN enrolled e
ON s.sid = e.sid
WHERE cid = '15-445';
As for your question about the step-by-step meaning of the first query, it is the following:
1. This will loop through every record from the "enrolled" table that has cid = '15-445'.
FROM enrolled as E
WHERE cid='15-445';
2. For every record from step 1, it will perform the following query:
SELECT S.name
FROM student AS S
WHERE S.sid = E.sid;
This construct:
SELECT (SELECT S.name FROM student S WHERE S.sid = E.sid) AS sname
-------^
is called a scalar subquery. This is a special type of subquery that has two important properties:
It returns one column.
It returns at most one row.
In this case, the scalar subquery is also a correlated subquery meaning that it references columns in the outer query, via the where clause.
A scalar subquery can be used almost anywhere that a scalar (i.e. constant value) can be used in a query. They can be handy. They are not exactly equivalent to a join, because:
An inner join can filter values. A scalar subquery returns NULL if there are no rows returned.
A join can multiply the number of rows. A scalar subquery returns an error if it returns more than one row.
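The first difference can be seen in a small experiment. The sketch below is an illustration only: it uses Python's built-in sqlite3 rather than MySQL, trims the tables to the columns needed, and adds a hypothetical enrollment (sid 99999) that has no matching student. Note that SQLite does not reproduce the second difference: when a scalar subquery yields several rows it silently takes the first one instead of raising an error.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (sid INTEGER, name TEXT);
    CREATE TABLE enrolled (sid INTEGER, cid TEXT);
    INSERT INTO student VALUES (53666, 'Kanye'), (53655, 'Tupac');
    -- 99999 is a hypothetical enrollment with no matching student row.
    INSERT INTO enrolled VALUES (53666, '15-445'), (53655, '15-445'), (99999, '15-445');
""")

# Scalar subquery: one output row per enrolled row, NULL when no student matches.
scalar_rows = conn.execute("""
    SELECT (SELECT s.name FROM student AS s WHERE s.sid = e.sid) AS sname
    FROM enrolled AS e
    WHERE e.cid = '15-445'
""").fetchall()

# Inner join: enrolled rows without a matching student are filtered out.
join_rows = conn.execute("""
    SELECT s.name
    FROM student AS s
    JOIN enrolled AS e ON s.sid = e.sid
    WHERE e.cid = '15-445'
""").fetchall()

print(scalar_rows)  # three rows; the unmatched enrollment shows up as None (NULL)
print(join_rows)    # two rows; the unmatched enrollment is gone
```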
If you want to get information like:
Name of student | CID | Grade |
You can do something like:
select t.name, e.cid, e.grade
from enrolled e
inner join student t on (e.sid = t.sid)
Or without a join, using a scalar subquery:
select (select t.name from student t where t.sid = e.sid) as name, e.cid, e.grade
from enrolled e
The results are the same as long as every enrolled row has a matching student, but the second form avoids an explicit join.

Oracle SQL query comparing multiple rows with same identifier

I'm honestly not sure how to title this - so apologies if it is unclear.
I have two tables I need to compare. One table contains tree names and nodes that belong to that tree. Each Tree_name/Tree_node combo will have its own line. For example:
Table: treenode
| TREE_NAME | TREE_NODE |
|-----------|-----------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 1 | E |
| 2 | A |
| 2 | B |
| 2 | D |
| 3 | C |
| 3 | D |
| 3 | E |
| 3 | F |
I have another table that contains names of queries and what tree_nodes they use. Example:
Table: queryrecord
| QUERY | TREE_NODE |
|---------|-----------|
| Alpha | A |
| Alpha | B |
| Alpha | D |
| BRAVO | A |
| BRAVO | B |
| BRAVO | D |
| CHARLIE | A |
| CHARLIE | B |
| CHARLIE | F |
I need to create an SQL where I input the QUERY name, and it returns any ‘TREE_NAME’ that includes all the nodes associated with the query. So if I input ‘ALPHA’, it would return TREE_NAME 1 & 2. If I ask it for CHARLIE, it would return nothing.
I only have read access, and don’t believe I can create temp tables, so I’m not sure if this is possible. Any advice would be amazing. Thank you!
You can use group by and having as follows:
Select t.tree_name
From treenode t
join queryrecord q
on t.tree_node = q.tree_node
WHERE q.query = 'ALPHA'
Group by t.tree_name
Having count(distinct t.tree_node)
= (Select count(distinct q2.tree_node) From queryrecord q2 WHERE q2.query = 'ALPHA');
Using an IN condition (a semi-join, which saves time over a join):
with prep (tree_node) as (select tree_node from queryrecord where query = :q)
select tree_name
from treenode
where tree_node in (select tree_node from prep)
group by tree_name
having count(*) = (select count(*) from prep)
;
:q in the prep subquery (in the with clause) is the bind variable to which you will assign the various QUERY values at runtime.
EDIT
I don't generally set up the test case on online engines; but in a comment below this answer, the OP said the query didn't work for him. So, I set up the example on SQLFiddle, here:
http://sqlfiddle.com/#!4/b575e/2
A couple of notes: for some reason, SQLFiddle thinks table names should be at most eight characters, so I had to change the second table name to queryrec (instead of queryrecord). I changed the name in the query, too, of course. And, second, I don't know how I can give bind values on SQLFiddle; I hard-coded the name 'Alpha'. (Note also that in the OP's sample data, this query value is not capitalized, while the other two are; of course, text values in SQL are case sensitive, so one should pay attention when testing.)
You can do this with a join and aggregation. The trick is to count the number of nodes in queryrecord before joining:
select qr.query, t.tree_name
from (select qr.*,
count(*) over (partition by query) as num_tree_node
from queryrecord qr
) qr join
treenode t
on t.tree_node = qr.tree_node
where qr.query = 'ALPHA'
group by qr.query, t.tree_name, qr.num_tree_node
having count(*) = qr.num_tree_node;
Here is a db<>fiddle.
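All three answers are forms of relational division: keep the trees whose count of matched nodes equals the count of nodes the query needs. As a rough, runnable check of the GROUP BY / HAVING variant against the question's sample data, here is a sketch using Python's built-in sqlite3 in place of Oracle. The function name trees_covering is hypothetical, and the table names follow the question (treenode, queryrecord).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE treenode (tree_name INTEGER, tree_node TEXT);
    CREATE TABLE queryrecord ("query" TEXT, tree_node TEXT);
    INSERT INTO treenode VALUES
        (1,'A'),(1,'B'),(1,'C'),(1,'D'),(1,'E'),
        (2,'A'),(2,'B'),(2,'D'),
        (3,'C'),(3,'D'),(3,'E'),(3,'F');
    INSERT INTO queryrecord VALUES
        ('Alpha','A'),('Alpha','B'),('Alpha','D'),
        ('BRAVO','A'),('BRAVO','B'),('BRAVO','D'),
        ('CHARLIE','A'),('CHARLIE','B'),('CHARLIE','F');
""")


def trees_covering(query_name):
    """Return the tree names that contain every node used by the given query."""
    sql = """
        SELECT t.tree_name
        FROM treenode AS t
        JOIN queryrecord AS q ON q.tree_node = t.tree_node
        WHERE q."query" = :q
        GROUP BY t.tree_name
        -- A tree qualifies when it matched as many distinct nodes as the query needs.
        HAVING COUNT(DISTINCT t.tree_node) =
               (SELECT COUNT(DISTINCT q2.tree_node)
                FROM queryrecord AS q2
                WHERE q2."query" = :q)
        ORDER BY t.tree_name
    """
    return [row[0] for row in conn.execute(sql, {"q": query_name})]


if __name__ == "__main__":
    print(trees_covering("Alpha"))    # trees 1 and 2 contain A, B and D
    print(trees_covering("CHARLIE"))  # no tree contains A, B and F together
```

The bind parameter :q plays the same role as in the WITH-clause answer. The value must match the stored case ('Alpha', not 'ALPHA'), since the comparison is case sensitive.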

How to select table with a concatenated column?

I have the following data:
select * from art_skills_table;
+----+------+---------------------------+
| ID | Name | skills |
+----+------+---------------------------+
| 1 | Anna | ["painting","photography"]|
| 2 | Bob | ["drawing","sculpting"] |
| 3 | Cat | ["pastel"] |
+----+------+---------------------------+
select * from computer_table;
+------+------+-------------------------+
| ID | Name | skills |
+------+------+-------------------------+
| 1 | Anna | ["word","typing"] |
| 2 | Cat | ["code","editing"] |
| 3 | Bob | ["excel","code"] |
+------+------+-------------------------+
I would like to write an SQL statement which results in the following table.
+------+------+-----------------------------------------------+
| ID | Name | skills |
+------+------+-----------------------------------------------+
| 1 | Anna | ["painting","photography","word","typing"] |
| 2 | Bob | ["drawing","sculpting","excel","code"] |
| 3 | Cat | ["pastel","code","editing"] |
+------+------+-----------------------------------------------+
I've tried something like SELECT * from art_skills_table LEFT JOIN computer_table ON name. However it doesn't give what I need. I've read about array_cat but I'm having a bit of trouble implementing it.
If the skills columns in both tables are arrays, then you should be able to get away with this:
SELECT a.ID, a.name, array_cat(a.skills, c.skills) AS skills
FROM art_skills_table a LEFT JOIN computer_table c
ON c.name = a.name -- join on name: Bob and Cat have different IDs in the two tables
That said, while you used a LEFT join in your sample, I think either an INNER or FULL (OUTER) join might serve you better.
First, I wondered why the data is stored in such a model.
I was of the opinion that NoSQL databases lack the ability to do joins, and that ...
... a semantic triple would be in the form of subject–predicate–object.
... a key-value (KV) store uses associative arrays.
... a relational database would be normalized.
Some information about the use case would have helped.
Nevertheless, you can select the data with CONCAT and REPLACE to get the desired form.
SELECT art_skills_table.ID, computer_table.name,
CONCAT(
REPLACE(art_skills_table.skills, '}',','),
REPLACE(computer_table.skills, '{','')
)
FROM art_skills_table JOIN computer_table ON art_skills_table.ID = computer_table.ID
The query returns the following result:
+----+------+--------------------------------------------+
| ID | Name | Skills |
+----+------+--------------------------------------------+
| 1 | Anna | {"painting","photography","word","typing"} |
| 2 | Cat | {"drawing","sculpting","code","editing"} |
| 3 | Bob | {"pastel","excel","code"} |
+----+------+--------------------------------------------+
I've used the ID for the JOIN, even though Bob has different values.
The JOIN should probably be done over the name.
JOIN computer_table ON art_skills_table.Name = computer_table.Name
BTW, you need to tell us what SQL engine you're running on.
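Because the engine and the column type are not stated, here is an engine-neutral sketch of the same idea: join the two tables on name and concatenate the skill lists. It assumes, purely for illustration, that skills is stored as JSON text, uses Python's built-in sqlite3 as the database, and does the concatenation in Python. The function name merged_skills is hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE art_skills_table (id INTEGER, name TEXT, skills TEXT);
    CREATE TABLE computer_table (id INTEGER, name TEXT, skills TEXT);
    INSERT INTO art_skills_table VALUES
        (1, 'Anna', '["painting","photography"]'),
        (2, 'Bob',  '["drawing","sculpting"]'),
        (3, 'Cat',  '["pastel"]');
    INSERT INTO computer_table VALUES
        (1, 'Anna', '["word","typing"]'),
        (2, 'Cat',  '["code","editing"]'),
        (3, 'Bob',  '["excel","code"]');
""")


def merged_skills():
    """Join on name (the IDs differ between the tables) and concatenate the skill lists."""
    rows = conn.execute("""
        SELECT a.id, a.name, a.skills, c.skills
        FROM art_skills_table AS a
        LEFT JOIN computer_table AS c ON c.name = a.name
        ORDER BY a.id
    """)
    result = []
    for art_id, name, art, computer in rows:
        # A LEFT JOIN leaves computer skills as NULL when the name has no match.
        combined = json.loads(art) + json.loads(computer or "[]")
        result.append((art_id, name, combined))
    return result


if __name__ == "__main__":
    for row in merged_skills():
        print(row)
```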

JOIN, aggregate and convert in postgres between two tables

Here are the two tables I have (all columns in both tables are of type "text"):
Names
--------------------------------
Name | DoB | Team |
--------------------------------
Harry | 3/12/85 | England
Kevin | 8/07/86 | England
James | 5/05/89 | England
Scores
------------------------
ScoreName | Score
------------------------
James-1 | 120
Harry-1 | 30
Harry-2 | 40
James-2 | 56
The end result I need is a table that has the following:
NameScores
---------------------------------------------
Name | DoB | Team | ScoreData
---------------------------------------------
Harry | 3/12/85 | England | "{"ScoreName":"Harry-1", "Score":"30"}, {"ScoreName":"Harry-2", "Score":"40"}"
Kevin | 8/07/86 | England | null
James | 5/05/89 | England | "{"ScoreName":"James-1", "Score":"120"}, {"ScoreName":"James-2", "Score":"56"}"
I need to do this using a single SQL command, which I will use to create a materialized view.
I have gotten as far as realising that it will involve a combination of string_agg, JOIN and JSON, but haven't been able to crack it fully. Please help :)
I don't think the join is tricky. The complication is building the JSON object:
select n.name, n.dob, n.team,
json_agg(json_build_object('ScoreName', s.scorename,
'Score', s.score)) as ScoreData
from names n left join
scores s
on s.scorename like concat(n.name, '-', '%')
group by n.name, n.dob, n.team;
Note: json_build_object() was introduced in Postgres 9.4.
EDIT:
I think you can add a case expression to get the simple NULL:
(case when count(s.scorename) = 0 then NULL
else json_agg(json_build_object('ScoreName', s.scorename,
'Score', s.score))
end) as ScoreData
Use json_agg() with row_to_json() to aggregate scores data into a json value:
select n.*, json_agg(row_to_json(s)) "ScoreData"
from "Names" n
left join "Scores" s
on n."Name" = regexp_replace(s."ScoreName", '(.*)-.*', '\1')
group by 1, 2, 3;
Name | DoB | Team | ScoreData
-------+---------+---------+---------------------------------------------------------------------------
Harry | 3/12/85 | England | [{"ScoreName":"Harry-1","Score":30}, {"ScoreName":"Harry-2","Score":40}]
James | 5/05/89 | England | [{"ScoreName":"James-1","Score":120}, {"ScoreName":"James-2","Score":56}]
Kevin | 8/07/86 | England | [null]
(3 rows)
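To make the moving parts visible outside PostgreSQL, here is a rough sketch of the same pipeline: LEFT JOIN on the name prefix, group per name, and emit NULL rather than [null] when a name has no scores. It is an illustration under assumptions: it uses Python's built-in sqlite3, builds the JSON in Python instead of with json_agg, and the function name name_scores is hypothetical.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE names (name TEXT, dob TEXT, team TEXT);
    CREATE TABLE scores (scorename TEXT, score TEXT);
    INSERT INTO names VALUES
        ('Harry', '3/12/85', 'England'),
        ('Kevin', '8/07/86', 'England'),
        ('James', '5/05/89', 'England');
    INSERT INTO scores VALUES
        ('James-1', '120'), ('Harry-1', '30'), ('Harry-2', '40'), ('James-2', '56');
""")


def name_scores():
    """Return (name, dob, team, score_json) rows; score_json is None without scores."""
    rows = conn.execute("""
        SELECT n.name, n.dob, n.team, s.scorename, s.score
        FROM names AS n
        LEFT JOIN scores AS s ON s.scorename LIKE n.name || '-%'
        ORDER BY n.name, s.scorename
    """)
    grouped = {}
    for name, dob, team, scorename, score in rows:
        items = grouped.setdefault((name, dob, team), [])
        # The LEFT JOIN yields one row with NULL score columns for unmatched names.
        if scorename is not None:
            items.append({"ScoreName": scorename, "Score": score})
    return [
        key + (json.dumps(items) if items else None,)
        for key, items in grouped.items()
    ]


if __name__ == "__main__":
    for row in name_scores():
        print(row)
```

Matching on a name prefix with LIKE is only as reliable as the naming scheme: a name that is itself a prefix of another name followed by a hyphen would pick up the wrong scores.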

SQL Server - Given a Set of Columns, finding missing combinations within the Set

I have the following Table. What I want to get is the missing combinations of Student, Class, Book. I have a query below that does it, but I would like others to provide more efficient queries (ie possibly ones that use group by) to find the missing combos.
SQL FIDDLE HERE - http://sqlfiddle.com/#!6/16e2b/3
StudentBook Table
+---------+---------+--------------+
| Student | Class | Book |
+---------+---------+--------------+
| Albert | Math | AlgebraBook |
| Albert | Math | FractionBook |
| Bridget | Math | AlgebraBook |
| Bridget | Math | FractionBook |
| Charles | Math | AlgebraBook |
| Charles | Math | FractionBook |
| Debbie | English | NovelBook |
| Debbie | English | PoemBook |
| Edward | English | PoemBook |
| Frank | English | PoemBook |
+---------+---------+--------------+
The following Rows in the Set are the missing combinations
Correct Result of My Query Below
+---------+---------+-----------+
| Student | Class | Book |
+---------+---------+-----------+
| Edward | English | NovelBook |
| Frank | English | NovelBook |
+---------+---------+-----------+
I can use the following query to get the missing combinations, but I want a faster, more efficient solution. Basically, I'm looking for other, more effective techniques, possibly using GROUP BY.
WITH CTE_ClassBooks AS
(
SELECT DISTINCT Class, Book FROM StudentBook
),
CTE_StudentClasses AS
(
SELECT DISTINCT Student, Class FROM StudentBook
),
CTE_CombosOfStudentClassBooks AS
(
SELECT DISTINCT b.Student, a.Class, a.Book
FROM CTE_ClassBooks a
INNER JOIN CTE_StudentClasses b ON a.Class = B.Class
)
SELECT * FROM CTE_CombosOfStudentClassBooks
EXCEPT
SELECT * FROM StudentBook
This might be a little faster, though your route doesn't seem terribly inefficient.
;WITH cte AS (SELECT DISTINCT Class,Book FROM Table1)
SELECT b.Student,a.*
FROM cte a
JOIN Table1 b
ON a.Class = b.Class
LEFT JOIN Table1 c
ON a.Class = c.Class
AND a.Book = c.Book
AND b.Student = c.Student
WHERE c.Class IS NULL
Demo: SQL Fiddle
SELECT S1.STUDENT,S1.CLASS,S2.BOOK FROM
STUDENTBOOK S1,(SELECT DISTINCT CLASS,BOOK FROM STUDENTBOOK) S2
WHERE S1.CLASS = S2.CLASS
AND S1.BOOK <> S2.BOOK
EXCEPT
SELECT STUDENT,CLASS,BOOK FROM STUDENTBOOK
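For comparison, the same result can be obtained with a NOT EXISTS anti-join over the distinct (Student, Class) and (Class, Book) pairs. This is a different technique from the EXCEPT and LEFT JOIN answers above, and whether it is faster on SQL Server depends on indexes and data volume. The sketch below only checks correctness on the question's sample data, using Python's built-in sqlite3 as a stand-in; the function name missing_combinations is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE StudentBook (Student TEXT, Class TEXT, Book TEXT)")
conn.executemany(
    "INSERT INTO StudentBook VALUES (?, ?, ?)",
    [
        ("Albert", "Math", "AlgebraBook"), ("Albert", "Math", "FractionBook"),
        ("Bridget", "Math", "AlgebraBook"), ("Bridget", "Math", "FractionBook"),
        ("Charles", "Math", "AlgebraBook"), ("Charles", "Math", "FractionBook"),
        ("Debbie", "English", "NovelBook"), ("Debbie", "English", "PoemBook"),
        ("Edward", "English", "PoemBook"), ("Frank", "English", "PoemBook"),
    ],
)


def missing_combinations():
    """Return (Student, Class, Book) rows that a student's class implies but the table lacks."""
    sql = """
        SELECT sc.Student, cb.Class, cb.Book
        FROM (SELECT DISTINCT Student, Class FROM StudentBook) AS sc
        JOIN (SELECT DISTINCT Class, Book FROM StudentBook) AS cb
          ON cb.Class = sc.Class
        -- Keep only the expected combinations that have no actual row.
        WHERE NOT EXISTS (
            SELECT 1
            FROM StudentBook AS sb
            WHERE sb.Student = sc.Student
              AND sb.Class = cb.Class
              AND sb.Book = cb.Book
        )
        ORDER BY sc.Student, cb.Book
    """
    return conn.execute(sql).fetchall()


if __name__ == "__main__":
    for row in missing_combinations():
        print(row)
```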