SQLite: join the same table twice with different "ON" conditions - sql

I couldn't find an answer to my question, and I don't know whether my query is wrong or whether this is a SQLite issue. Please help me solve the problem.
I have two tables in my database:
processTable {id}
tasksTable {id, processId, amount, done}
There is a many-to-one relation (one process can have multiple tasks assigned). "amount" and "done" are integer values that provide task progress information. If "done" >= "amount", the task is done. I need to query the database to get something like this:
+---------+-----------+------------+
| process | tasksDone | tasksCount |
+---------+-----------+------------+
| 1 | 1 | 3 |
+---------+-----------+------------+
| 2 | 2 | 5 |
+---------+-----------+------------+
Based on the data I have in my tables:
processTable
+----+
| id |
+----+
| 1 |
+----+
| 2 |
+----+
tasksTable
+----+-----------+--------+------+
| id | processId | amount | done |
+----+-----------+--------+------+
| 1 | 1 | 10 | 10 | <- this task is done
+----+-----------+--------+------+
| 2 | 1 | 15 | 5 |
+----+-----------+--------+------+
| 3 | 1 | 80 | 5 |
+----+-----------+--------+------+
| 4 | 2 | 25 | 0 |
+----+-----------+--------+------+
| 5 | 2 | 60 | 60 | <- this task is done
+----+-----------+--------+------+
| 6 | 2 | 30 | 15 |
+----+-----------+--------+------+
| 7 | 2 | 40 | 40 | <- this task is done
+----+-----------+--------+------+
| 8 | 2 | 100 | 50 |
+----+-----------+--------+------+
So, I wrote this query:
SELECT processTable.id AS process,
COUNT(tasksTableDone.id) AS tasksDone,
COUNT(tasksTableAll.id) AS tasksCount
FROM processTable
LEFT JOIN tasksTable AS tasksTableAll
ON tasksTableAll.processId = processTable.id
LEFT JOIN tasksTable AS tasksTableDone
ON tasksTableDone.processId = processTable.id
AND
tasksTableDone.done >= tasksTableDone.amount
But what I've got is:
+---------+-----------+------------+
| process | tasksDone | tasksCount |
+---------+-----------+------------+
| 1 | 3 | 3 |
+---------+-----------+------------+
| 2 | 5 | 5 |
+---------+-----------+------------+
I tried running the query with only one join at a time, and each one worked correctly.
Query with first join only:
SELECT processTable.id AS process,
COUNT(tasksTableAll.id) AS tasksCount
FROM processTable
LEFT JOIN tasksTable AS tasksTableAll
ON tasksTableAll.processId = processTable.id
Result:
+---------+------------+
| process | tasksCount |
+---------+------------+
| 1 | 3 |
+---------+------------+
| 2 | 5 |
+---------+------------+
Query with second join only:
SELECT processTable.id AS process,
COUNT(tasksTableDone.id) AS tasksDone
FROM processTable
LEFT JOIN tasksTable AS tasksTableDone
ON tasksTableDone.processId = processTable.id
AND
tasksTableDone.done >= tasksTableDone.amount
Result:
+---------+-----------+
| process | tasksDone |
+---------+-----------+
| 1 | 1 |
+---------+-----------+
| 2 | 2 |
+---------+-----------+
How can I use these two joins in one query and get the proper results? I know that instead of a JOIN I could use another SELECT, but I think that would cost more in terms of performance.

You can use a CASE expression inside an aggregate:
Version using SUM():
SELECT p.id AS process,
sum(case when t.done >= t.amount then 1 else 0 end) AS tasksDone,
count(t.id) AS tasksCount
FROM processTable p
LEFT JOIN tasksTable t
ON t.processId = p.id
group by p.id
Version using COUNT():
SELECT p.id AS process,
count(case when t.done >= t.amount then 1 else null end) AS tasksDone,
count(t.id) AS tasksCount
FROM processTable p
LEFT JOIN tasksTable t
ON t.processId = p.id
group by p.id
Edit: after your comment, you can wrap this in a select to get the progress (multiplying by 1.0 avoids integer division):
select process,
tasksDone,
tasksCount,
(1.0 * tasksDone / tasksCount) progress
from
(
SELECT p.id AS process,
count(case when t.done >= t.amount then 1 else null end) AS tasksDone,
count(t.id) AS tasksCount
FROM processTable p
LEFT JOIN tasksTable t
ON t.processId = p.id
group by p.id
) src
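To check the conditional-aggregation approach against the sample data, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The Python wrapper and the variable names are assumptions added for illustration; the SQL follows the question's definition of a finished task (done >= amount). The second query keeps the two joins from the question and shows that COUNT(DISTINCT ...) is one way to undo the row multiplication they cause.

```python
import sqlite3

# In-memory database loaded with the sample data from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE processTable (id INTEGER PRIMARY KEY);
    CREATE TABLE tasksTable (id INTEGER PRIMARY KEY, processId INTEGER,
                             amount INTEGER, done INTEGER);
    INSERT INTO processTable VALUES (1), (2);
    INSERT INTO tasksTable VALUES
        (1, 1, 10, 10), (2, 1, 15, 5), (3, 1, 80, 5),
        (4, 2, 25, 0), (5, 2, 60, 60), (6, 2, 30, 15),
        (7, 2, 40, 40), (8, 2, 100, 50);
""")

# One join plus conditional aggregation: every task row is seen exactly once.
single_join = conn.execute("""
    SELECT p.id AS process,
           SUM(CASE WHEN t.done >= t.amount THEN 1 ELSE 0 END) AS tasksDone,
           COUNT(t.id) AS tasksCount
    FROM processTable p
    LEFT JOIN tasksTable t ON t.processId = p.id
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

# Two joins pair every task with every finished task of the same process,
# so plain COUNT() sees multiplied rows; COUNT(DISTINCT ...) removes them.
double_join = conn.execute("""
    SELECT p.id AS process,
           COUNT(DISTINCT d.id) AS tasksDone,
           COUNT(DISTINCT a.id) AS tasksCount
    FROM processTable p
    LEFT JOIN tasksTable a ON a.processId = p.id
    LEFT JOIN tasksTable d ON d.processId = p.id AND d.done >= d.amount
    GROUP BY p.id
    ORDER BY p.id
""").fetchall()

print(single_join)  # [(1, 1, 3), (2, 2, 5)]
print(double_join)  # [(1, 1, 3), (2, 2, 5)]
```

The single-join version reads each task once, which is usually the cheaper option; the two-join version is shown only to make the cause of the inflated counts visible.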


Sql join multiple tables, get count of certain rows, and also check some rows satisfy condition

I have a Zoo, each Zoo has many Cages, each Cage has many Animals.
Zoo:
+----+
| Id |
+----+
| 1 |
| 2 |
+----+
Cage:
+----+-------+
| Id | ZooId |
+----+-------+
| 1 | 1 |
| 2 | 1 |
| 3 | 2 |
| 4 | 2 |
| 5 | 2 |
+----+-------+
Animal:
+----+--------+----------+
| Id | CageId | IsHungry |
+----+--------+----------+
| 1 | 1 | 0 |
| 2 | 1 | 0 |
| 3 | 1 | 0 |
| 4 | 2 | 1 |
| 5 | 3 | 0 |
| 6 | 4 | 0 |
| 7 | 5 | 0 |
+----+--------+----------+
I'm trying to design a query to show each Zoo, the number of cages in that Zoo, and whether or not the Zoo has hungry Animals.
Here are the results I expect:
+-------+-----------+--------------+
| ZooID | CageCount | AnyoneHungry |
+-------+-----------+--------------+
| 1 | 2 | 1 |
| 2 | 3 | 0 |
+-------+-----------+--------------+
I can get the number of Cages in a Zoo:
SELECT
[c].[ZooId],
COUNT(*) AS [NumCages]
FROM [Cage] [c]
GROUP BY [c].[ZooId]
ORDER BY [NumCages] DESC
I can determine if a Cage has a hungry animal or not:
SELECT CASE WHEN EXISTS (
SELECT NULL
FROM [Animal] [a]
WHERE [a].[CageId] = @CageId AND [a].[IsHungry] = 1
) THEN 1 ELSE 0 END
But I'm having trouble combining these two into a single query that runs efficiently (in this universe zoos are very popular and have millions of cages and animals).
SELECT
[c].[ZooId],
COUNT(*) AS [CageCount],
MAX(CONVERT(INT, [x].[AnyoneHungry])) AS [AnyoneHungry]
FROM [Cage] [c]
INNER JOIN (
SELECT [a].[CageId], MAX(CONVERT(INT, [a].[IsHungry])) AS [AnyoneHungry]
FROM [Animal] [a]
GROUP BY [a].[CageId]
) [x] on [x].[CageId] = [c].[Id]
GROUP BY [c].[ZooId]
I feel like I'm missing something and it should be possible to run this query using a simpler statement.
This should do
SELECT
Z.Id,
COUNT(DISTINCT C.Id) AS CageCount,
COALESCE(MAX(CAST(A.IsHungry AS INT)), 0) AS AnyHungry /*The cast is only required if A.IsHungry is BIT and not INT*/
FROM Zoo Z
LEFT JOIN Cage C ON Z.Id = C.ZooId
LEFT JOIN Animal A ON C.Id = A.CageId
GROUP BY Z.Id
If you only need the zoo id and hungry animals:
SELECT c.zooid,
COUNT(DISTINCT C.Id) as CageCount,
COALESCE(MAX(CONVERT(int, a.IsHungry)), 0) AS AnyHungry
FROM Cage C LEFT JOIN
Animal A
ON c.Id = a.CageId AND a.IsHungry = 1
GROUP BY c.zooid;
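The reason COUNT(DISTINCT C.Id) is needed can be seen by running the first query against the sample data. This is a minimal sketch using Python's sqlite3 module, not SQL Server: the Python wrapper and the extra JoinedRows column are assumptions added for illustration, and the CAST is dropped because IsHungry is stored as an integer here.

```python
import sqlite3

# In-memory database loaded with the Zoo / Cage / Animal sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Zoo (Id INTEGER PRIMARY KEY);
    CREATE TABLE Cage (Id INTEGER PRIMARY KEY, ZooId INTEGER);
    CREATE TABLE Animal (Id INTEGER PRIMARY KEY, CageId INTEGER, IsHungry INTEGER);
    INSERT INTO Zoo VALUES (1), (2);
    INSERT INTO Cage VALUES (1, 1), (2, 1), (3, 2), (4, 2), (5, 2);
    INSERT INTO Animal VALUES
        (1, 1, 0), (2, 1, 0), (3, 1, 0), (4, 2, 1),
        (5, 3, 0), (6, 4, 0), (7, 5, 0);
""")

# Joining Animal repeats each cage once per animal, so COUNT(c.Id) counts
# joined rows while COUNT(DISTINCT c.Id) counts cages.
rows = conn.execute("""
    SELECT z.Id,
           COUNT(c.Id)                  AS JoinedRows,  -- illustration only
           COUNT(DISTINCT c.Id)         AS CageCount,
           COALESCE(MAX(a.IsHungry), 0) AS AnyHungry
    FROM Zoo z
    LEFT JOIN Cage c ON c.ZooId = z.Id
    LEFT JOIN Animal a ON a.CageId = c.Id
    GROUP BY z.Id
    ORDER BY z.Id
""").fetchall()

print(rows)  # [(1, 4, 2, 1), (2, 3, 3, 0)]
```

Zoo 1 has two cages but four joined rows, because cage 1 holds three animals; the DISTINCT count is what matches the expected CageCount.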

Replace subqueries in where statement

I've built a query that intends to find products (products table) with both a 'used' offer and a 'new' offer, and get the lowest price for each. A product can have multiple offers (link_prices table). The offer's condition is determined by the name of the merchant (merchants table): a name without used and occasion is a 'new' offer, a name with used is a 'used' offer.
Here's a sample of the tables (PostgreSQL):
merchants
+----+---------------+
| id | name |
+----+---------------+
| 1 | amazon_used |
| 2 | ebay_location |
| 3 | amazon |
| 4 | target |
| 5 | target_used |
+----+---------------+
link_prices
+----+-------------+------------+-------+
| id | merchant_id | product_id | price |
+----+-------------+------------+-------+
| 1 | 1 | 1 | |
| 2 | 1 | 2 | 20 |
| 3 | 4 | 2 | 30 |
| 4 | 5 | 2 | 5 |
| 5 | 2 | 3 | 10 |
| 6 | 1 | 4 | 80 |
| 7 | 1 | 3 | 100 |
+----+-------------+------------+-------+
In this case, I'm expecting my query to return
+------------+----------------+---------------+
| product_id | min_used_price | min_new_price |
+------------+----------------+---------------+
| 2 | 5 | 30 |
+------------+----------------+---------------+
I've got the following query to work, but I feel like I shouldn't need subqueries to achieve this. I just can't wrap my head around it. Any help optimizing this query would be appreciated.
SELECT products.id,
MIN(CASE WHEN merchants.name ILIKE '%used%' THEN link_prices.price END) as min_used_price,
MIN(CASE WHEN merchants.name NOT ILIKE '%used%' THEN link_prices.price END) as min_new_price
FROM products
INNER JOIN link_prices ON link_prices.product_id = products.id
INNER JOIN merchants ON merchants.id = link_prices.merchant_id
WHERE
products.id IN (
SELECT products.id
FROM products
INNER JOIN link_prices ON link_prices.product_id = products.id
INNER JOIN merchants ON merchants.id = link_prices.merchant_id
AND merchants.name ILIKE '%used%'
AND link_prices.price IS NOT NULL
AND link_prices.price <> 0
)
AND products.id IN (
SELECT products.id
FROM products
INNER JOIN link_prices ON link_prices.product_id = products.id
INNER JOIN merchants ON merchants.id = link_prices.merchant_id
AND merchants.name NOT ILIKE '%used%'
AND merchants.name NOT ILIKE '%location%'
AND link_prices.price IS NOT NULL
AND link_prices.price <> 0
)
GROUP BY products.id
Thanks a ton!
Your description makes this sound like conditional aggregation:
select lp.product_id,
min(lp.price) filter (where m.name like '%used') as min_used_price,
min(lp.price) filter (where m.name not like '%used') as min_new_price
from merchants m join
link_prices lp
on lp.merchant_id = m.id
group by lp.product_id;
Your sample query is much more complicated and has conditions that are not mentioned in the text of the question. But I think this structure will work for what you want to do.
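The extra conditions from the question's query (skip null or zero prices, ignore "location" merchants, keep only products that have both kinds of offer) can be folded into the same conditional-aggregation shape. The following is a rough sketch under several assumptions: it runs on SQLite through Python's sqlite3 module rather than PostgreSQL, it uses CASE instead of FILTER so that it also runs on SQLite builds without aggregate FILTER support, it relies on SQLite's LIKE being case-insensitive for ASCII in place of ILIKE, and the USED and NEW string constants are hypothetical helpers that avoid repeating the conditions.

```python
import sqlite3

# In-memory database loaded with the merchants / link_prices sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE merchants (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE link_prices (id INTEGER PRIMARY KEY, merchant_id INTEGER,
                              product_id INTEGER, price INTEGER);
    INSERT INTO merchants VALUES
        (1, 'amazon_used'), (2, 'ebay_location'), (3, 'amazon'),
        (4, 'target'), (5, 'target_used');
    INSERT INTO link_prices VALUES
        (1, 1, 1, NULL), (2, 1, 2, 20), (3, 4, 2, 30), (4, 5, 2, 5),
        (5, 2, 3, 10), (6, 1, 4, 80), (7, 1, 3, 100);
""")

# Hypothetical helper strings; "price <> 0" is also false for NULL prices.
USED = "m.name LIKE '%used%' AND lp.price <> 0"
NEW = ("m.name NOT LIKE '%used%' AND m.name NOT LIKE '%location%' "
       "AND lp.price <> 0")

query = f"""
    SELECT lp.product_id,
           MIN(CASE WHEN {USED} THEN lp.price END) AS min_used_price,
           MIN(CASE WHEN {NEW} THEN lp.price END) AS min_new_price
    FROM link_prices lp
    JOIN merchants m ON m.id = lp.merchant_id
    GROUP BY lp.product_id
    -- keep only products that have at least one offer of each kind
    HAVING MIN(CASE WHEN {USED} THEN lp.price END) IS NOT NULL
       AND MIN(CASE WHEN {NEW} THEN lp.price END) IS NOT NULL
"""
rows = conn.execute(query).fetchall()

print(rows)  # [(2, 5, 30)]
```

The HAVING clause plays the role of the two IN subqueries: a product survives only if each conditional MIN found at least one qualifying price.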

How can I perform the self join in left join table?

I have two tables.
The first one is the 'blog' table:
+----+--------+--------+
| id | title | status |
+----+--------+--------+
| 1 | blog 1 | 1 |
| 2 | blog 2 | 1 |
+----+--------+--------+
The second one is blog_activity:
status 1 means: created
status 2 means: opened
+----+---------+--------+------------+
| id | blog_id | status | date |
+----+---------+--------+------------+
| 1 | 1 | 1 | 2019-09-09 |
| 2 | 2 | 1 | 2019-09-10 |
| 2 | 2 | 2 | 2019-09-11 |
+----+---------+--------+------------+
I want the records of blogs that have not been opened, with all the details from the blog table.
Example :
+----+---------+--------+------------+--------------------+
| id | blog_id | title | blog.date | blog_activity.date |
+----+---------+--------+------------+--------------------+
| 1 | 1 | blog 1 | 2019-09-09 | 2019-09-09 |
+----+---------+--------+------------+--------------------+
I think I would use exists and join:
select b.*, ba.date as created_date
from blog b join
blog_activity ba
on ba.blog_id = b.id and ba.status = 1
where not exists (select 1
from blog_activity ba2
where ba2.blog_id = b.id and ba2.status = 2
);
This avoids aggregation and it can use an index on blog_activity(blog_id, status).
One approach uses aggregation:
SELECT
ba.id,
ba.blog_id,
b.title,
ba.date
FROM blog b
INNER JOIN blog_activity ba
ON b.id = ba.blog_id
INNER JOIN
(
SELECT blog_id
FROM blog_activity
GROUP BY blog_id
HAVING COUNT(CASE WHEN status = 2 THEN 1 END) = 0
) t
ON b.id = t.blog_id;
The subquery aliased as t finds all blogs which do not have an opened status associated with them. In this case, only blog_id = 1 meets this condition.
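As a quick check of the NOT EXISTS variant, here is a minimal sketch that runs it on the sample data through Python's sqlite3 module. The Python wrapper is an assumption added for illustration, and the third blog_activity row is given id 3 here because the sample table lists id 2 twice.

```python
import sqlite3

# In-memory database loaded with the blog / blog_activity sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE blog (id INTEGER PRIMARY KEY, title TEXT, status INTEGER);
    CREATE TABLE blog_activity (id INTEGER PRIMARY KEY, blog_id INTEGER,
                                status INTEGER, date TEXT);
    INSERT INTO blog VALUES (1, 'blog 1', 1), (2, 'blog 2', 1);
    INSERT INTO blog_activity VALUES
        (1, 1, 1, '2019-09-09'),
        (2, 2, 1, '2019-09-10'),
        (3, 2, 2, '2019-09-11');  -- id 3 is an assumption (source repeats 2)
""")

# Join to the "created" activity row, then drop every blog that also has
# an "opened" (status = 2) activity row.
rows = conn.execute("""
    SELECT b.id, b.title, ba.date AS created_date
    FROM blog b
    JOIN blog_activity ba ON ba.blog_id = b.id AND ba.status = 1
    WHERE NOT EXISTS (SELECT 1
                      FROM blog_activity ba2
                      WHERE ba2.blog_id = b.id AND ba2.status = 2)
""").fetchall()

print(rows)  # [(1, 'blog 1', '2019-09-09')]
```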

Can't show all records with the same id while join in oracle xe 11g

I'm getting this message while using this query, is there anything wrong?
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t, transaksi_detail td, operator o
WHERE td.transaksi_id = t.transaksi_id AND o.operator_id = t.operator_id
GROUP BY t.transaksi_id
Update:
After applying the answer from @Barbaros Özhan, I used this query:
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
INNER JOIN operator o ON ( o.operator_id = t.operator_id )
GROUP BY t.tanggal_transaksi, o.nama_lengkap;
The data is displayed successfully, but one problem remains: transactions with the same operator_id do not appear more than once. Here is the sample data:
+--------------+-------------+-------------------+
| TRANSAKSI_ID | OPERATOR_ID | TANGGAL_TRANSAKSI |
+--------------+-------------+-------------------+
| 1 | 5 | 09/29/2018 |
| 2 | 3 | 09/29/2018 |
| 3 | 3 | 09/29/2018 |
| 4 | 1 | 09/29/2018 |
| 5 | 1 | 09/29/2018 |
+--------------+-------------+-------------------+
After running the query, the output is:
+-------------------+------------------+--------+
| TANGGAL_TRANSAKSI | NAMA_LENGKAP | TOTAL |
+-------------------+------------------+--------+
| 09/29/2018 | Lina Harun | 419800 |
| 09/29/2018 | Titro Kusumo | 484000 |
| 09/29/2018 | Muhammad Kusnadi | 402000 |
+-------------------+------------------+--------+
Compared with the operator table, the two operators that each have two transactions with the same operator_id are shown only once:
+-------------+------------------+
| OPERATOR_ID | NAMA_LENGKAP |
+-------------+------------------+
| 1 | Muhammad Kusnadi |
| 3 | Lina Harun |
| 5 | Tirto Kusumo |
+-------------+------------------+
You need to include the non-aggregated columns of the SELECT list, t.tanggal_transaksi and o.nama_lengkap, in the GROUP BY list, instead of other columns such as t.transaksi_id. So you might use the following without any issue:
SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
INNER JOIN operator o ON ( o.operator_id = t.operator_id )
GROUP BY t.tanggal_transaksi, o.nama_lengkap;
Or this one :
SELECT t.transaksi_id, SUM(td.harga * td.qty) total
FROM transaksi t
INNER JOIN transaksi_detail td ON ( td.transaksi_id = t.transaksi_id )
GROUP BY t.transaksi_id;
P.S. Prefer the ANSI-92 JOIN syntax over old-style comma-separated joins.
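If each transaction should stay on its own row while still showing the date and operator name, one option is to select and group by t.transaksi_id together with the other two columns. The sketch below uses Python's sqlite3 module rather than Oracle, and the transaksi_detail rows (harga, qty) are made-up values, since the question does not show that table; only the transaksi and operator rows come from the sample data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE operator (operator_id INTEGER PRIMARY KEY, nama_lengkap TEXT);
    CREATE TABLE transaksi (transaksi_id INTEGER PRIMARY KEY,
                            operator_id INTEGER, tanggal_transaksi TEXT);
    CREATE TABLE transaksi_detail (transaksi_id INTEGER, harga INTEGER, qty INTEGER);
    INSERT INTO operator VALUES
        (1, 'Muhammad Kusnadi'), (3, 'Lina Harun'), (5, 'Tirto Kusumo');
    INSERT INTO transaksi VALUES
        (1, 5, '09/29/2018'), (2, 3, '09/29/2018'), (3, 3, '09/29/2018'),
        (4, 1, '09/29/2018'), (5, 1, '09/29/2018');
    -- Hypothetical detail rows, invented for this sketch.
    INSERT INTO transaksi_detail VALUES
        (1, 1000, 2), (2, 500, 1), (3, 700, 3), (4, 200, 5), (5, 300, 1);
""")

# Grouping by date and name merges all transactions of one operator per day.
per_operator = conn.execute("""
    SELECT t.tanggal_transaksi, o.nama_lengkap, SUM(td.harga * td.qty) AS total
    FROM transaksi t
    INNER JOIN transaksi_detail td ON td.transaksi_id = t.transaksi_id
    INNER JOIN operator o ON o.operator_id = t.operator_id
    GROUP BY t.tanggal_transaksi, o.nama_lengkap
    ORDER BY o.nama_lengkap
""").fetchall()

# Adding the transaction id to SELECT and GROUP BY keeps one row per transaction.
per_transaction = conn.execute("""
    SELECT t.transaksi_id, t.tanggal_transaksi, o.nama_lengkap,
           SUM(td.harga * td.qty) AS total
    FROM transaksi t
    INNER JOIN transaksi_detail td ON td.transaksi_id = t.transaksi_id
    INNER JOIN operator o ON o.operator_id = t.operator_id
    GROUP BY t.transaksi_id, t.tanggal_transaksi, o.nama_lengkap
    ORDER BY t.transaksi_id
""").fetchall()

print(len(per_operator))     # 3
print(len(per_transaction))  # 5
```

Every non-aggregated column in the second query is listed in GROUP BY, so the same shape should also be acceptable to databases that enforce that rule strictly.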

SQL join 3 tables (based on 2 criteria?)

I have 3 tables setup like this (a bit simplified):
time_tracking: id, tr_proj_id, tr_min, tr_type
time_projects: id, project_name
time_tasks: id, task_name
Basically, I want to retrieve either project_name or task_name based on tr_type which can be of value "project" or "task"
An example
time_tracking
+----+------------+--------+---------+
| id | tr_proj_id | tr_min | tr_type |
+----+------------+--------+---------+
| 1 | 3 | 60 | project |
| 2 | 3 | 360 | task |
| 3 | 1 | 120 | project |
| 4 | 2 | 30 | project |
| 5 | 2 | 30 | task |
| 6 | 1 | 90 | task |
+----+------------+--------+---------+
time_projects
+----+------------------------+
| id | project_name |
+----+------------------------+
| 1 | Make someone happy |
| 2 | Start a project |
| 3 | Jump out of the window |
+----+------------------------+
time_tasks
+----+---------------------+
| id | task_name |
+----+---------------------+
| 1 | drink a beer |
| 2 | drink a second beer |
| 3 | drink more |
+----+---------------------+
Desired output
+----+------------------------+------------+--------+---------+
| id | name | tr_proj_id | tr_min | tr_type |
+----+------------------------+------------+--------+---------+
| 1 | Jump out of the window | 3 | 60 | project |
| 2 | drink more | 3 | 360 | task |
| 3 | Make someone happy | 1 | 120 | project |
| 4 | Start a project | 2 | 30 | project |
| 5 | drink a second beer | 2 | 30 | task |
| 6 | drink a beer | 1 | 90 | task |
+----+------------------------+------------+--------+---------+
And being really bad at the whole JOIN thing, here's the only thing I've come up with so far (which doesn't work..):
SELECT tt.tr_proj_id, tt.tr_type, tt.tr_min, pp.project_name, pp.id, ta.task_name, ta.id
FROM time_tracking as tt, time_projects as pp, time_tasks as ta
WHERE ((tt.tr_type = 'project' AND pp.id = tt.tr_proj_id) OR (tt.tr_type = 'task' AND ta.id = tt.tr_proj_id))
AND tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC
If anyone has an idea on how to do this, feel free to share!
Update: Looks like I forgot to specify that I'm using an Access database, which apparently doesn't accept things like CASE or COALESCE. Apparently there is IIF(), but I'm not quite sure how to use it in this case.
Use join clauses and move your join conditions from the where clause into the on clauses:
SELECT
tt.tr_proj_id,
tt.tr_type,
tt.tr_min,
pp.project_name,
pp.id,
ta.task_name,
ta.id
FROM time_tracking as tt
left join time_projects as pp on tt.tr_type = 'project' AND pp.id = tt.tr_proj_id
left join time_tasks as ta on tt.tr_type = 'task' AND ta.id = tt.tr_proj_id
WHERE tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC,tt.tr_day ASC
I've used left join, which gives you a row from the main table even if one doesn't exist for the join (you get nulls from columns in the joined table if there's no join)
A key point here, that many SQL programmers do not realise, is that the ON clause may contain any conditions, even ones not from the joined table (as in this example). Many programmers assume that the conditions must be only those relating to the formal foreign key relationship.
Try this:
SELECT
tt.id,
CASE WHEN tt.tr_type = 'project' THEN pp.project_name
WHEN tt.tr_type = 'task' THEN ta.task_name END as name,
tt.tr_proj_id,
tt.tr_type,
tt.tr_min
FROM time_tracking as tt
left join time_projects as pp on pp.id = tt.tr_proj_id
left join time_tasks as ta on ta.id = tt.tr_proj_id
WHERE tt.tr_min > 0
ORDER BY tt.tr_proj_id DESC
Perform a union on two joins:
select tt.id, tp.project_name name, tt.tr_proj_id, tt.tr_min, tt.tr_type
from time_tracking tt
inner join time_projects tp on tp.id = tt.tr_proj_id
where tt.tr_type = 'project'
union all
select tt.id, tk.task_name name, tt.tr_proj_id, tt.tr_min, tt.tr_type
from time_tracking tt
inner join time_tasks tk on tk.id = tt.tr_proj_id
where tt.tr_type = 'task'
That will give you the exact table results you want
SELECT
time_tracking.id,
time_tracking.tr_min,
time_tracking.tr_type,
coalesce(time_projects.project_name, time_tasks.task_name) as name
FROM time_tracking
LEFT OUTER JOIN time_projects on time_projects.id = time_tracking.tr_proj_id AND time_tracking.tr_type = 'project'
LEFT OUTER JOIN time_tasks on time_tasks.id = time_tracking.tr_proj_id AND time_tracking.tr_type = 'task'
WHERE time_tracking.tr_min > 0
ORDER BY time_tracking.id DESC -- ...
COALESCE is standard SQL and works in most databases; SQL Server also offers ISNULL, and other database technologies have similar functions.
The idea is you join to the tables and if the join fails, you'll get NULL where the join failed. Then you use COALESCE to pick out the successful join value.
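To see the conditional joins and COALESCE working together on the sample data, here is a minimal sketch using Python's sqlite3 module. It runs on SQLite, not Access, so it only illustrates the join logic; the Python wrapper and the ascending ORDER BY are assumptions chosen so that the rows line up with the desired output table.

```python
import sqlite3

# In-memory database loaded with the three sample tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE time_tracking (id INTEGER PRIMARY KEY, tr_proj_id INTEGER,
                                tr_min INTEGER, tr_type TEXT);
    CREATE TABLE time_projects (id INTEGER PRIMARY KEY, project_name TEXT);
    CREATE TABLE time_tasks (id INTEGER PRIMARY KEY, task_name TEXT);
    INSERT INTO time_tracking VALUES
        (1, 3, 60, 'project'), (2, 3, 360, 'task'), (3, 1, 120, 'project'),
        (4, 2, 30, 'project'), (5, 2, 30, 'task'), (6, 1, 90, 'task');
    INSERT INTO time_projects VALUES
        (1, 'Make someone happy'), (2, 'Start a project'),
        (3, 'Jump out of the window');
    INSERT INTO time_tasks VALUES
        (1, 'drink a beer'), (2, 'drink a second beer'), (3, 'drink more');
""")

# Each LEFT JOIN matches only rows of its own tr_type, so exactly one of
# project_name / task_name is non-NULL per row and COALESCE picks it.
rows = conn.execute("""
    SELECT tt.id,
           COALESCE(pp.project_name, ta.task_name) AS name,
           tt.tr_proj_id, tt.tr_min, tt.tr_type
    FROM time_tracking tt
    LEFT JOIN time_projects pp
           ON tt.tr_type = 'project' AND pp.id = tt.tr_proj_id
    LEFT JOIN time_tasks ta
           ON tt.tr_type = 'task' AND ta.id = tt.tr_proj_id
    WHERE tt.tr_min > 0
    ORDER BY tt.id
""").fetchall()

for row in rows:
    print(row)
```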