SQL JOIN each id in JSON object - sql

I have a JSON column containing col_values for another table. I want to return rows from that other table for each item in the JSON object.
If this were an INT column, I would use JOIN, but I need to JOIN on every entry in the JSON object.
Take:
writers :
| id | name | projects (JSON) |
|:-- |:-----|:------------------|
| 1 | Andy | ["1","2","3","4"] |
| 2 | Hank | ["3","4","5","6"] |
| 3 | Alex | ["1","7","8","9"] |
| 4 | Joe | ["1","5","6","7"] |
| 5 | Ken | ["2","4","5","6"] |
| 6 | Zach | ["2","7","8","9"] |
| 7 | Walt | ["2","5","6","7"] |
| 8 | Mike | ["2","3","4","5"] |
cities :
| id | name | project |
|:-- |:---------|:--------|
| 1 | Boston | 1 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 4 | Seattle | 4 |
| 5 | North | 5 |
| 6 | West | 6 |
| 7 | Miami | 7 |
| 8 | York | 8 |
| 9 | Tainan | 9 |
| 10 | Seoul | 1 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |
| 13 | Carlisle | 4 |
| 14 | Fugging | 5 |
| 15 | Turkey | 6 |
| 16 | Paris | 7 |
| 17 | Midguard | 8 |
| 18 | Fugging | 9 |
| 19 | Madrid | 1 |
| 20 | Salvador | 2 |
| 21 | Everett | 3 |
I need every city ordered by name for Mike (id=8).
Desired results:
This is what I'm getting and what I need to get (ORDER BY name).
Output :
| id | name | project |
|:---|:---------|:--------|
| 13 | Carlisle | 4 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 21 | Everett | 3 |
| 14 | Fugging | 5 |
| 5 | North | 5 |
| 20 | Salvador | 2 |
| 4 | Seattle | 4 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |
Current query, but this can't be the best way...
SQL >
SELECT c.*
FROM cities c
WHERE EXISTS (
SELECT 1
FROM writers w
WHERE JSON_CONTAINS(
w.projects, CONCAT('\"', c.project, '\"'))
AND w.id = '8'
)
ORDER BY c.name;
DB Fiddle with the above. Is there a better way to do this "properly"?
Background
If it matters, I need to keep using JSON as the datatype because my server-side software that uses this database normally reads that column best if presented as a JSON object.
I would normally make several database calls and iterate through that JSON object in my server-side language, but that many calls is far too expensive, and pagination would multiply the number of calls even further.
I need all the results in a single database call. So, I need to JOIN or otherwise loop through each item in the JSON object within SQL.

Start with JOIN
Per a comment from a user, there is a better way...
SQL >
SELECT c.*
FROM writers w
JOIN cities c ON JSON_CONTAINS(w.projects, CONCAT('\"', c.project, '\"'))
WHERE w.id = '8'
ORDER BY c.name;
Output is the same...
Output :
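| id | name | project |
|:---|:---------|:--------|
| 13 | Carlisle | 4 |
| 2 | Chicago | 2 |
| 3 | Cisco | 3 |
| 21 | Everett | 3 |
| 14 | Fugging | 5 |
| 5 | North | 5 |
| 20 | Salvador | 2 |
| 4 | Seattle | 4 |
| 11 | South | 2 |
| 12 | Tokyo | 3 |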
DB Fiddle
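If the server runs MySQL 8.0 or later, another option may be to unnest the JSON array into rows with JSON_TABLE and then join on plain values instead of calling JSON_CONTAINS for every city. This is only a sketch: it assumes cities.project is an integer column and that every array element is a numeric string, as in the sample data. The alias jt and the column name project_id are made up for illustration.

```sql
-- Sketch, assuming MySQL 8.0+ (JSON_TABLE is not available in 5.7).
-- jt and project_id are illustrative names, not part of the original schema.
SELECT c.*
FROM writers AS w
CROSS JOIN JSON_TABLE(
    w.projects,
    '$[*]' COLUMNS (project_id VARCHAR(20) PATH '$')  -- one row per array element
) AS jt
JOIN cities AS c
    ON c.project = CAST(jt.project_id AS UNSIGNED)     -- "2" -> 2
WHERE w.id = 8
ORDER BY c.name;
```

Because the join condition compares ordinary column values, the optimizer can potentially use an index on cities.project, which the JSON_CONTAINS form cannot.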

Related

How to structure a proper SQL subquery?

I'm trying to wrap my head around how to do a proper subquery, and it's not making sense to me. Let's say I have two tables, books and chapters:
Books
+----+------------------+----------+---------------------+
| id | name | author | last_great_chapters |
+----+------------------+----------+---------------------+
| 1 | some book title | john doe | 2 |
| 2 | foo novel title | some guy | 4 |
| 3 | other book title | lol man | 3 |
+----+------------------+----------+---------------------+
Chapters
+----+---------+----------------+
| id | book_id | chapter_number |
+----+---------+----------------+
| 1 | 1 | 1 |
| 2 | 1 | 3 |
| 3 | 1 | 4 |
| 4 | 1 | 5 |
| 5 | 2 | 1 |
| 6 | 2 | 2 |
| 7 | 2 | 3 |
| 8 | 2 | 4 |
| 9 | 2 | 5 |
| 10 | 3 | 1 |
| 11 | 3 | 2 |
| 12 | 3 | 3 |
| 13 | 3 | 4 |
| 14 | 3 | 5 |
+----+---------+----------------+
How can I join the two tables, and just print out the number of rows (sorted limit(last_great_chapters)) of the "last_great_chapters" from the books table list for each book?
If I understood correctly, you want to print the Books table together with a count of the matching last_great_chapters rows in the Chapters table?
If so, try this:
select b.id, b.name, b.author, b.last_great_chapters, COUNT(c.chapter_number) as rownumbers FROM Books as b
LEFT JOIN Chapters AS c ON c.book_id = b.id AND c.chapter_number = b.last_great_chapters
group by b.id, b.name, b.author, b.last_great_chapters
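Another reading of the question is that each book should return its last N chapters, where N is that book's last_great_chapters value. Under that assumption, and assuming an engine with window functions (for example MySQL 8.0+), a minimal sketch could number each book's chapters from the end and keep only the first N. The alias r and the column rn are illustrative names.

```sql
-- Sketch, assuming window-function support (e.g. MySQL 8.0+).
-- rn = 1 is the highest chapter_number of each book, rn = 2 the next, and so on.
SELECT b.id AS book_id, b.name, r.chapter_number
FROM Books AS b
JOIN (
    SELECT book_id,
           chapter_number,
           ROW_NUMBER() OVER (PARTITION BY book_id
                              ORDER BY chapter_number DESC) AS rn
    FROM Chapters
) AS r ON r.book_id = b.id
WHERE r.rn <= b.last_great_chapters   -- keep only the last N rows per book
ORDER BY b.id, r.chapter_number;
```

With the sample data, book 1 (last_great_chapters = 2) would keep chapters 4 and 5, since chapter 2 does not exist for that book.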

Percentage to total in BigQuery Legacy SQL (Subqueries?)

I can't understand how to calculate percentage to total in BigQuery Legacy SQL.
So, I have a table:
ID | Name | Group | Mark
1 | John | A | 10
2 | Lucy | A | 5
3 | Jane | A | 7
4 | Lily | B | 9
5 | Steve | B | 14
6 | Rita | B | 11
I want to calculate percentage like this:
ID | Name | Group | Mark | Percent
1 | John | A | 10 | 10/(10+5+7)≈45%
2 | Lucy | A | 5 | 5/(10+5+7)≈23%
3 | Jane | A | 7 | 7/(10+5+7)≈32%
4 | Lily | B | 9 | 9/(9+14+11)≈26%
5 | Steve | B | 14 | 14/(9+14+11)≈41%
6 | Rita | B | 11 | 11/(9+14+11)≈32%
My table is quite long for me (3 million rows).
I thought that I could do it with subqueries, but in SELECT I can't use subqueries.
Does anyone know a way to do it?
SELECT
ID, Name, [Group], Mark,
RATIO_TO_REPORT(Mark) OVER(PARTITION BY [Group]) AS percent
FROM YourTable
Check more about RATIO_TO_REPORT
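For comparison, if the query can be run in BigQuery standard SQL instead of Legacy SQL, the same ratio can be written as a plain window aggregate. This is a sketch under that assumption; YourTable is the placeholder name from the answer above and would normally need a dataset qualifier.

```sql
#standardSQL
-- Sketch, assuming BigQuery standard SQL rather than Legacy SQL.
-- Group is a reserved word, so it is quoted with backticks here.
SELECT
  ID, Name, `Group`, Mark,
  Mark / SUM(Mark) OVER (PARTITION BY `Group`) AS percent  -- fraction between 0 and 1
FROM YourTable
```

Like RATIO_TO_REPORT, this yields a fraction; multiply by 100 and round if a whole-number percentage is wanted.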

Selecting Multiple ID's in one Select

I have a database with entries that have to be grouped together:
id | Name | Surname | Time
1 | Michael | Kane | 3
2 | Torben | Dane | 4
3 | Dinge | Chain | 5
4 | Django | Fain | 5
5 | Juliett | Bravo | 6
6 | Django | Fain | 7
7 | Django | Fain | 3
8 | Django | Fain | 4
9 | Dinge | Chain | 4
10 | Torben | Dane | 4
Now I want to group the items while maintaining all ids. I'm coming close with the following query, but I am losing my ids:
SELECT id, Name, Surname, sum(Time) from Names group by(Name)
The Result of the Query is
id | Name | Surname | Time
9 | Dinge | Chain | 9
8 | Django | Fain | 19
5 | Juliett | Bravo | 6
1 | Michael | Kane | 3
10 | Torben | Dane | 8
while I would need all ids like this
ids | Name | Surname | Time
3,9 | Dinge | Chain | 9
4,6,7,8 | Django | Fain | 19
5 | Juliett | Bravo | 6
1 | Michael | Kane | 3
2,10 | Torben | Dane | 8
How can I accomplish this?
You would do this using group_concat():
select group_concat(id, ',') as ids, name, surname, sum(time) as time
from Names
group by name, surname;
Just don't store the results back in the database. Comma-separated values are useful for returning results, but it is the wrong format for storing data in the database.

List the name of division that all employees are working on some project(s)

"List the name of division that ALL employees are working on some project(s). Namely, there not exists an employee who do" is the full question. I'm having trouble getting an actual answer for this one, and my professor is no help in telling me what I'm doing wrong. The code I have is
select dname
from division d, employee e, workon w
where e.did = d.did
and w.empid = e.empid
and not exists
(select empid
from workon
group by empid
having count (empid) >= all(select e.empid
from employee ee
where e.did = ee.did
group by ee.empid))
group by dname
The tables I have are
Employee
| EMPID | NAME | SALARY | DID |
--------------------------------
| 1 | kevin | 32000 | 2 |
| 2 | joan | 46200 | 1 |
| 3 | brian | 37000 | 3 |
| 4 | larry | 82000 | 5 |
| 5 | harry | 92000 | 4 |
| 6 | peter | 45000 | 2 |
| 7 | peter | 68000 | 3 |
| 8 | smith | 39000 | 4 |
| 9 | chen | 71000 | 1 |
| 10 | kim | 46000 | 5 |
Division
| DID | DNAME | MANAGERID |
----------------------------------------------
| 1 | engineering | 2 |
| 2 | marketing | 1 |
| 3 | human resource | 3 |
| 4 | Research and development | 5 |
| 5 | accounting | 4 |
Workon
| PID | EMPID | HOURS |
-----------------------
| 3 | 1 | 30 |
| 2 | 3 | 40 |
| 5 | 4 | 30 |
| 6 | 6 | 60 |
| 4 | 3 | 70 |
| 2 | 4 | 45 |
| 5 | 3 | 90 |
| 3 | 3 | 100 |
| 6 | 8 | 30 |
| 4 | 4 | 30 |
| 5 | 8 | 30 |
| 6 | 7 | 30 |
| 6 | 9 | 40 |
| 5 | 9 | 50 |
| 4 | 6 | 45 |
| 2 | 7 | 30 |
| 2 | 8 | 30 |
| 2 | 9 | 30 |
| 1 | 9 | 30 |
| 1 | 8 | 30 |
| 1 | 7 | 30 |
| 1 | 5 | 30 |
| 1 | 6 | 30 |
| 2 | 6 | 30 |
You're very close. What you're trying to do is called a "correlated subquery". You're relating a key from a table you are querying to a key in a query that doesn't contribute to the candidate set, but does act as a filter in your where clause.
The key line in your code that demonstrates this is the line in the NOT EXISTS clause that says:
e.did = ee.did
Instead of trying to do this by comparing aggregate COUNT(...) results, do an outer join between the Employee and Workon tables to find out if there are any employees who aren't doing anything, then find your departments based on those employees not existing for a given department.
Here's an example query using the Oracle standard HR example tutorial tables representing the same join conditions as you have here. You probably have access to these tables wherever you're running the query, and so should anyone else here who might be interested in the answer, so they can run the query without building your tables to play around with the answer. It's a relatively trivial matter to convert the query to your tables, so I'll leave that exercise to you! :)
The final capitalized line in my query below is the join condition that makes this query a correlated subquery, like you tried to do in yours.
select
*
from
hr.departments d
where
not exists
(
select
ee.employee_id
,ee.first_name
,ee.last_name
,dd.department_id
,dd.department_name
,jj.job_id
from
hr.employees ee
,hr.departments dd
,hr.job_history jj
where
ee.department_id = dd.department_id
and ee.employee_id = jj.employee_id (+)
and jj.job_id is null
AND D.DEPARTMENT_ID = DD.DEPARTMENT_ID
)
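Translated to the tables in the question, the same idea might look like the sketch below. It assumes the table and column names shown in the question (division, employee, workon, did, empid) and is not the only way to write it.

```sql
-- Sketch: divisions for which no employee lacks a workon row.
SELECT d.dname
FROM division d
WHERE NOT EXISTS (
    SELECT 1
    FROM employee e
    LEFT JOIN workon w ON w.empid = e.empid
    WHERE e.did = d.did        -- correlation with the outer query
      AND w.empid IS NULL      -- employee has no project at all
);
```

With the sample data this should leave out engineering and accounting, because joan (EMPID 2) and kim (EMPID 10) have no Workon rows. Note that a division with no employees at all would also be listed, since the NOT EXISTS condition is then trivially true.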

SQL Query - Grouping Data

So every morning at work we have a stand-up meeting. We throw the nearest object to hand around the room as a method of deciding who speaks in what order. Being slightly odd I decided it could be fun to get some data on these throws. So, every morning I memorise the order of throws (as well as other relevant things like who dropped the ball/strange sponge object that was probably once a ball too and who threw to someone who'd already been or just gave an atrocious throw), and record this data in a table:
+---------+-----+------------+----------+---------+----------+--------+--------------+
| throwid | day | date | thrownum | thrower | receiver | caught | correctthrow |
+---------+-----+------------+----------+---------+----------+--------+--------------+
| 1 | 1 | 10/01/2012 | 1 | dan | steve | 1 | 1 |
| 2 | 1 | 10/01/2012 | 2 | steve | alice | 1 | 1 |
| 3 | 1 | 10/01/2012 | 3 | alice | matt | 1 | 1 |
| 4 | 1 | 10/01/2012 | 4 | matt | justin | 1 | 1 |
| 5 | 1 | 10/01/2012 | 5 | justin | arif | 1 | 1 |
| 6 | 1 | 10/01/2012 | 6 | arif | pete | 1 | 1 |
| 7 | 1 | 10/01/2012 | 7 | pete | greg | 0 | 1 |
| 8 | 1 | 10/01/2012 | 8 | greg | alan | 1 | 1 |
| 9 | 1 | 10/01/2012 | 9 | alan | david | 1 | 1 |
| 10 | 1 | 10/01/2012 | 10 | david | dan | 1 | 1 |
| 11 | 2 | 11/01/2012 | 1 | dan | david | 1 | 1 |
| 12 | 2 | 11/01/2012 | 2 | david | alice | 1 | 1 |
| 13 | 2 | 11/01/2012 | 3 | alice | steve | 1 | 1 |
| 14 | 2 | 11/01/2012 | 4 | steve | arif | 1 | 1 |
| 15 | 2 | 11/01/2012 | 5 | arif | pete | 0 | 1 |
| 16 | 2 | 11/01/2012 | 6 | pete | justin | 1 | 1 |
| 17 | 2 | 11/01/2012 | 7 | justin | alan | 1 | 1 |
| 18 | 2 | 11/01/2012 | 8 | alan | dan | 1 | 1 |
| 19 | 2 | 11/01/2012 | 9 | dan | greg | 1 | 1 |
+---------+-----+------------+----------+---------+----------+--------+--------------+
I've now got quite a few days' worth of data for this, and I'm starting to run some queries on it for my own purposes (I've not told the rest of the team yet... I wouldn't like to influence the results). I've done a few with no issues, but I'm stuck trying to get a certain result out.
What I'm looking for is the number of times each person has been the last team member to receive the ball. Now, as you can see on the table, due to absences etc the number of throws per day is not always constant, so I can't simply select the receiver by thrownum.
In the case for the data above, it would return:
+--------+-------------------+
| person | LastReceiverTotal |
+--------+-------------------+
| dan | 1 |
| greg | 1 |
+--------+-------------------+
I've got this far:
SELECT MAX(thrownum) AS LastThrowNum, day FROM Throws GROUP BY day
Now, this returns some useful data. I get the highest thrownum for each and every day. It would seem like all I need to do is get the receiver for this value, and then get a count grouped by receiver to get my answer. This doesn't work, though, because the resultset isn't what it seems due to the above query using aggregate functions.
I suspect there's a much better way of designing tables to store the data to be honest, but equally I'm also sure there's a way to get this information with the tables as they are - some kind of inner query? I can't figure out how it would work. Can anyone shed some light on how this would be done?
The query that you have gives you the highest thrownum for each day.
With that, you just do an inner join back to your table on both day and thrownum to get the receiver of each day's last throw, then count the number of times each receiver appears.
select t.receiver as person, count(t.day) as LastReceiverTotal from Throws t
inner join (SELECT MAX(thrownum) AS LastThrowNum, day FROM Throws GROUP BY day) a on a.LastThrowNum = t.thrownum and a.day = t.day
group by t.receiver