Help with optimising SQL query - sql

Hi, I need some help with this problem.
I am working on a web application and I am using SQLite for the database. Can someone help me optimize one query so that it runs fast? =)
I have table x:
ID | ID_DISH | ID_INGREDIENT
1 | 1 | 2
2 | 1 | 3
3 | 1 | 8
4 | 1 | 12
5 | 2 | 13
6 | 2 | 5
7 | 2 | 3
8 | 3 | 5
9 | 3 | 8
10 | 3 | 2
....
ID_DISH is the id of a dish and ID_INGREDIENT is the id of an ingredient that the dish is made of, so in my case the dish with id 1 is made of the ingredients with ids 2, 3, 8 and 12.
The table has more than 15000 rows. I need a query that returns the ids of the dishes containing at least one of the ingredients I have already added to my algorithm, ordered ascending by the number of ingredients I have not added yet.
Example: foo(2,4)
will return rows in this order:
ID_DISH | count(stillMissing)
10 | 2
1 | 3
The dish with id 10 has the ingredients with ids 2 and 4 and is still missing 2 more, then is
My query is:
SELECT
t2.ID_dish,
(SELECT COUNT(*) as c FROM dishIngredient as t1
WHERE t1.ID_ingredient NOT IN (2,4)
AND t1.ID_dish = t2.ID_dish
GROUP BY ID_dish) as c
FROM dishIngredient as t2
WHERE t2.ID_ingredient IN (2,4)
GROUP BY t2.ID_dish
ORDER BY c ASC
It works, but it is slow.

select ID_DISH, sum(ID_INGREDIENT not in (2, 4)) stillMissing
from x
group by ID_DISH
having stillMissing != count(*)
order by stillMissing
This is the solution. My previous query took 5 to 20 s; this one takes about 80 ms.
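For anyone who wants to try the single-pass version locally, here is a minimal sketch using Python's built-in sqlite3 module and only the sample rows shown in the question. The function name dishes_by_missing and the placeholder handling are my own assumptions, not part of the original post; the HAVING clause repeats the SUM expression instead of using the alias, which should be equivalent in SQLite.

```python
import sqlite3

# Sample rows copied from the question's table x: (ID_DISH, ID_INGREDIENT).
ROWS = [(1, 2), (1, 3), (1, 8), (1, 12),
        (2, 13), (2, 5), (2, 3),
        (3, 5), (3, 8), (3, 2)]


def dishes_by_missing(conn, have):
    """Return (ID_DISH, stillMissing) for dishes using at least one id in `have`.

    Hypothetical helper: one grouped scan, no correlated subquery.
    """
    marks = ",".join("?" * len(have))  # one placeholder per ingredient id
    sql = f"""
        SELECT ID_DISH,
               SUM(ID_INGREDIENT NOT IN ({marks})) AS stillMissing
        FROM x
        GROUP BY ID_DISH
        HAVING SUM(ID_INGREDIENT NOT IN ({marks})) <> COUNT(*)
        ORDER BY stillMissing, ID_DISH
    """
    # The placeholder list appears twice in the statement, so bind it twice.
    return conn.execute(sql, list(have) * 2).fetchall()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE x (ID INTEGER PRIMARY KEY, ID_DISH INTEGER, ID_INGREDIENT INTEGER)"
)
conn.executemany("INSERT INTO x (ID_DISH, ID_INGREDIENT) VALUES (?, ?)", ROWS)

print(dishes_by_missing(conn, [2, 4]))
```

With the ten sample rows, dish 2 is filtered out because it contains neither ingredient 2 nor 4. The speed-up reported above presumably comes from replacing the per-row subquery with one aggregate pass over the table.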

This is from memory, as I don't know the SQL dialect of sqlite.
SELECT DISTINCT T1.ID_DISH, COUNT(T1.ID_INGREDIENT) as COUNT
FROM dishIngredient as T1 LEFT JOIN dishIngredient as T2
ON T1.ID_DISH = T2.ID_DISH
WHERE T2.ID_INGREDIENT IN (2,4)
GROUP BY T1.ID_DISH
ORDER BY T1.ID_DISH

Related

Why does my PostgreSQL query not work as expected?

I currently have two tables called calendars and events in my PostgreSQL database which are joined on calendars.uuid = events.calendar_id.
At present, a person can have more than one calendar in the calendars table, however I need to change this so the person_id has a unique constraint, and hence they should only be able to have one entry moving forward.
I therefore need to identify only the person(s) who currently have more than one calendar, along with all their associated records from the events table, i.e. person_id = 4
calendars:
uuid | person_id
-----+---------------
1 | 1
2 | 2
3 | 3
4 | 4
5 | 4
6 | 4
7 | 5
8 | 5
events:
uuid | calendar_id | event_id
-----+-------------+---------
1 | 1 | 4728
2 | 1 | 8942
3 | 1 | 7842
4 | 2 | 9784
5 | 3 | 9852
6 | 3 | 1298
7 | 4 | 4983
8 | 5 | 4892
9 | 5 | 8522
My query is as follows; however, it is not working, and as I'm fairly new to SQL/PSQL I'm struggling to figure this one out:
SELECT
calendars.uuid,
calendars.person_id,
events.uuid,
events.calendar_id,
events.event_id
FROM
events
INNER JOIN (
SELECT
person_id,
count(*)
FROM
calendars
GROUP BY
person_id
HAVING
count(*) > 1) AS calendars ON calendars.uuid = events.calendar_id
Any help would be much appreciated.
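The derived table in your query only returns person_id and count(*), so there is no calendars.uuid column to join on. You can join events and calendars directly, then filter on person_id in the WHERE clause: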
SELECT
calendars.uuid,
calendars.person_id,
events.uuid,
events.calendar_id,
events.event_id
FROM
calendars
INNER JOIN events
ON events.calendar_id = calendars.uuid
WHERE calendars.person_id in (
SELECT
person_id
FROM
calendars
GROUP BY
person_id
HAVING
count(*) > 1 )
uuid | person_id | uuid | calendar_id | event_id
-----+-----------+------+-------------+---------
4 | 4 | 7 | 4 | 4983
5 | 4 | 8 | 5 | 4892
5 | 4 | 9 | 5 | 8522
I find it helps to structure your query such that you segregate the parts that are most restrictive first. So I would use a cte to restrict the persons to those wanted and then include the cte as an inner join to a standard query. Something like this:
WITH cte as
(SELECT person_id
FROM
calendars
GROUP BY
person_id
HAVING
count(*) > 1)
SELECT
calendars.uuid,
calendars.person_id,
events.uuid,
events.calendar_id,
events.event_id
FROM
events
INNER JOIN calendars ON calendars.uuid = events.calendar_id
INNER JOIN cte ON cte.person_id = calendars.person_id
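To check the CTE version against the sample data, here is a rough sketch that loads the two tables into an in-memory SQLite database from Python. SQLite is only a stand-in for PostgreSQL here (an assumption on my part; the query uses nothing dialect-specific), and the ORDER BY is added just to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Sample data copied from the question.
conn.executescript("""
    CREATE TABLE calendars (uuid INTEGER PRIMARY KEY, person_id INTEGER);
    CREATE TABLE events (uuid INTEGER PRIMARY KEY, calendar_id INTEGER, event_id INTEGER);
    INSERT INTO calendars VALUES (1,1),(2,2),(3,3),(4,4),(5,4),(6,4),(7,5),(8,5);
    INSERT INTO events VALUES
        (1,1,4728),(2,1,8942),(3,1,7842),(4,2,9784),(5,3,9852),
        (6,3,1298),(7,4,4983),(8,5,4892),(9,5,8522);
""")

QUERY = """
    WITH cte AS (
        -- persons owning more than one calendar
        SELECT person_id
        FROM calendars
        GROUP BY person_id
        HAVING COUNT(*) > 1
    )
    SELECT calendars.uuid, calendars.person_id,
           events.uuid, events.calendar_id, events.event_id
    FROM events
    INNER JOIN calendars ON calendars.uuid = events.calendar_id
    INNER JOIN cte ON cte.person_id = calendars.person_id
    ORDER BY events.uuid
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
```

Note that person 5 also has two calendars, but neither of them has any events in the sample, so the inner join returns rows for person 4 only.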

Include zero counts when grouping by multiple columns

I have a table (TCAP) containing the gender (2 categories), race/ethnicity (3 categories), and height (integer in inches) for multiple individuals. For example:
GND RCE HGT
1 3 65
1 2 72
2 1 62
1 2 68
2 1 65
2 2 64
1 3 69
1 1 70
I want to get a count of the number of individuals in each possible gender and race/ethnicity combination. When I group by GND and RCE, however, it doesn't show zero counts. I've tried the following code:
SELECT
GND,
RCE,
COUNT(*) TotalRecords
FROM TCAP
GROUP BY GND, RCE;
This gives me:
GND RCE TotalRecords
1 1 1
1 2 2
1 3 2
2 1 2
2 2 1
I want it to show all possible combinations though. In other words, even though there are no individuals with a gender of 2 and race/ethnicity of 3 in the table, I want that to display as a zero count. So, like this:
GND RCE TotalRecords
1 1 1
1 2 2
1 3 2
2 1 2
2 2 1
2 3 0
I've looked at the responses to similar questions, but they are based on a single group, resolved using an outer join with a table that has all possible values. Would I use a similar process here? Would I create a single table that has all 6 combinations of GND and RCE to join on? Is there another way to accomplish this, especially if the number of combinations increases (for example, 1 group with 5 values and 1 group with 10 values)?
Any help would be much appreciated! Thanks!
You can use a CROSS JOIN to build every GND, RCE combination, then do an OUTER JOIN against it.
Query #1
SELECT t1.GND,t1.RCE,COUNT(t3.GND) TotalRecords
FROM (
SELECT GND,RCE
FROM (
SELECT DISTINCT GND
FROM TCAP
) t1 CROSS JOIN
(
SELECT DISTINCT RCE FROM TCAP
) t2
) t1
LEFT JOIN TCAP t3 ON t3.GND = t1.GND and t3.RCE = t1.RCE
group by t1.GND,t1.RCE;
| GND | RCE | TotalRecords |
| --- | --- | ------------ |
| 1 | 1 | 1 |
| 1 | 2 | 2 |
| 1 | 3 | 2 |
| 2 | 1 | 2 |
| 2 | 2 | 1 |
| 2 | 3 | 0 |
Use a cross join to generate the rows and a left join to bring in the results -- with a final group by:
select g.gnd, r.rce, count(t.gnd) as cnt
from (select distinct gnd from tcap) g cross join
(select distinct rce from tcap) r left join
tcap t
on t.gnd = g.gnd and t.rce = r.rce
group by g.gnd, r.rce;
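Here is a small sketch that runs the cross join plus left join pattern on the eight sample rows, using Python's sqlite3 module as a convenient test bed (my choice, the question does not name a database). The ORDER BY is added only so the output order is predictable.

```python
import sqlite3

# (GND, RCE, HGT) rows copied from the question.
ROWS = [(1, 3, 65), (1, 2, 72), (2, 1, 62), (1, 2, 68),
        (2, 1, 65), (2, 2, 64), (1, 3, 69), (1, 1, 70)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tcap (gnd INTEGER, rce INTEGER, hgt INTEGER)")
conn.executemany("INSERT INTO tcap VALUES (?, ?, ?)", ROWS)

QUERY = """
    SELECT g.gnd, r.rce, COUNT(t.gnd) AS cnt   -- COUNT skips the NULLs from unmatched pairs
    FROM (SELECT DISTINCT gnd FROM tcap) AS g
    CROSS JOIN (SELECT DISTINCT rce FROM tcap) AS r
    LEFT JOIN tcap AS t ON t.gnd = g.gnd AND t.rce = r.rce
    GROUP BY g.gnd, r.rce
    ORDER BY g.gnd, r.rce
"""

counts = conn.execute(QUERY).fetchall()
for row in counts:
    print(row)
```

One caveat: the combinations are built from the values that actually occur in TCAP, so a category that appears in no row at all would still be absent. In that case a lookup table of all valid values would have to replace the SELECT DISTINCT subqueries.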

SQL: Add values to STDEVP calculation

I have the following table.
Key | Count | Amount
----| ----- | ------
1 | 2 | 10
1 | 2 | 15
2 | 5 | 1
2 | 5 | 2
2 | 5 | 3
2 | 5 | 50
2 | 5 | 20
3 | 3 | 5
3 | 3 | 4
3 | 3 | 5
Sorry, I couldn't figure out how to make the above a table.
I'm running this on SQL Server Management Studio 2012.
I'd like the stdevp of the Amount column for each key, but if the number of records is less than some value 'x' (there will never be more than x records for a given key), then I want to add zeros to account for the remainder.
For example, if 'x' is 6:
for key 1, I need stdevp(10,15,0,0,0,0)
for key 2, I need stdevp(1,2,3,50,20,0)
for key 3, I need stdevp(5,4,5,0,0,0)
I just need to be able to add zeros to the calculation. I could insert records to my table, but that seems rather tedious.
This seems complicated -- padding data for each key. Here is one approach:
with xs as (
select 0 as val, 1 as n
union all
select 0, n + 1
from xs
where xs.n < 6
)
select k.[key], stdevp(coalesce(t.amount, 0))
from xs cross join
(select distinct [key] from t) k left join
(select t.*, row_number() over (partition by [key] order by [key]) as seqnum
from t
) t
on t.[key] = k.[key] and t.seqnum = xs.n
group by k.[key];
The idea is that the cross join generates 6 rows for each key. Then the left join brings in available rows, up to the maximum.
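To sanity-check what the padded result should look like, the same idea can be sketched outside the database. This is only an illustration in Python, with statistics.pstdev as the population standard deviation; the function name padded_pstdev is made up for this example, and it is not a replacement for the SQL Server query.

```python
from collections import defaultdict
from statistics import pstdev

# (Key, Amount) pairs copied from the question's table.
ROWS = [(1, 10), (1, 15),
        (2, 1), (2, 2), (2, 3), (2, 50), (2, 20),
        (3, 5), (3, 4), (3, 5)]


def padded_pstdev(rows, x):
    """Population std dev per key after padding each key's amounts with zeros up to x values."""
    amounts = defaultdict(list)
    for key, amount in rows:
        amounts[key].append(amount)

    result = {}
    for key, values in amounts.items():
        # Mirrors the cross join: x slots per key, missing slots become 0.
        padded = values + [0] * (x - len(values))
        result[key] = pstdev(padded)
    return result


for key, value in sorted(padded_pstdev(ROWS, 6).items()):
    print(key, round(value, 4))
```

Comparing these numbers with the query output is a quick way to confirm that the left join really produced six rows per key.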

SQL: Find top-rated article in each category

I have a table articles, with fields id, rating (an integer from 1-10), and category_id (an integer representing to which category it belongs).
How can I, in one MySQL query, find the single article with the highest rating from each category? ORDER BY and LIMIT would usually be how I would find the top-rated article, I suppose, but I'm not sure how to mix that with grouping to get the desired result, if I even can. (A dependent subquery would likely be an easy answer, but ewwww. Is there something better?)
For the following data:
id | category_id | rating
---+-------------+-------
1 | 1 | 10
2 | 1 | 8
3 | 2 | 7
4 | 3 | 5
5 | 3 | 2
6 | 3 | 6
I would like the following to be returned:
id | category_id | rating
---+-------------+-------
1 | 1 | 10
3 | 2 | 7
6 | 3 | 6
Try these:
SELECT id, category_id, rating
FROM articles a1
WHERE rating =
(SELECT MAX(a2.rating) FROM articles a2 WHERE a1.category_id = a2.category_id)
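or (this relies on MySQL's non-standard GROUP BY handling, so the row picked for each category is not guaranteed to be the top-rated one):
SELECT * FROM (SELECT * FROM articles ORDER BY rating DESC) AS a1 GROUP BY a1.category_id;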
You can use a subselect as the target of a FROM clause, too, which reads funny but makes for a slightly easier-to-understand query.
SELECT a1.id, a1.category_id, a1.rating
FROM articles as a1,
(SELECT category_id, max(rating) AS mrating FROM articles AS a2
GROUP BY a2.category_id) AS a_inner
WHERE
a_inner.category_id = a1.category_id AND
a_inner.mrating = a1.rating;
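A quick way to try the derived-table approach on the sample data is the sketch below. It uses Python's sqlite3 module purely as a test harness (an assumption, since the question is about MySQL; the query itself is plain standard SQL) and rewrites the comma join as an explicit INNER JOIN, which should be equivalent.

```python
import sqlite3

# (id, category_id, rating) rows copied from the question.
ARTICLES = [(1, 1, 10), (2, 1, 8), (3, 2, 7), (4, 3, 5), (5, 3, 2), (6, 3, 6)]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE articles (id INTEGER PRIMARY KEY, category_id INTEGER, rating INTEGER)"
)
conn.executemany("INSERT INTO articles VALUES (?, ?, ?)", ARTICLES)

QUERY = """
    SELECT a1.id, a1.category_id, a1.rating
    FROM articles AS a1
    INNER JOIN (
        -- best rating per category
        SELECT category_id, MAX(rating) AS mrating
        FROM articles
        GROUP BY category_id
    ) AS a_inner
        ON a_inner.category_id = a1.category_id
       AND a_inner.mrating = a1.rating
    ORDER BY a1.category_id
"""

top_rated = conn.execute(QUERY).fetchall()
for row in top_rated:
    print(row)
```

If two articles in one category share the top rating, this returns both of them, so a tie-breaker (for example the lowest id) would be needed to get strictly one row per category.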

SQL AVG(COUNT(*))?

I'm trying to find out the average number of times a value appears in a column, group it based on another column and then perform a calculation on it.
I have 3 tables a little like this
DVD
ID | NAME
1 | 1
2 | 1
3 | 2
4 | 3
COPY
ID | DVDID
1 | 1
2 | 1
3 | 2
4 | 3
5 | 1
LOAN
ID | DVDID | COPYID
1 | 1 | 1
2 | 1 | 2
3 | 2 | 3
4 | 3 | 4
5 | 1 | 5
6 | 1 | 5
7 | 1 | 5
8 | 1 | 2
etc
Basically, I'm trying to find all the copy ids that appear in the loan table FEWER times than the average number of times for all copies of that DVD.
So in the example above, copy 5 of dvd 1 appears 3 times, copy 2 twice and copy 1 once so the average for that DVD is 2. I want to list all the copies of that (and each other) dvd that appear less than that number in the Loan table.
I hope that makes a bit more sense...
Thanks
Similar to dotjoe's solution, but using an analytic function to avoid the extra join. May be more or less efficient.
with
loan_copy_total as
(
select dvdid, copyid, count(*) as cnt
from loan
group by dvdid, copyid
),
loan_copy_avg as
(
select dvdid, copyid, cnt, avg(cnt) over (partition by dvdid) as copy_avg
from loan_copy_total
)
select *
from loan_copy_avg lca
where cnt < copy_avg;
This should work in Oracle:
create view dvd_count_view as
select dvdid, count(1) as howmanytimes
from loan
group by dvdid;
select avg(howmanytimes) from dvd_count_view;
Untested...
with
loan_copy_total as
(
select dvdid, copyid, count(*) as cnt
from loan
group by dvdid, copyid
),
loan_copy_avg as
(
select dvdid, avg(cnt) as copy_avg
from loan_copy_total
group by dvdid
)
select lct.*, lca.copy_avg
from loan_copy_avg lca
inner join loan_copy_total lct on lca.dvdid = lct.dvdid
and lct.cnt < lca.copy_avg;
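Since this one is marked untested, here is a small sketch that runs the two-CTE join on the sample LOAN rows, using Python's sqlite3 module as a stand-in database (my assumption; the question does not say which engine is used). It compares with a strict less-than, as the question asks for copies loaned fewer times than the average.

```python
import sqlite3

# (ID, DVDID, COPYID) rows copied from the question's LOAN table.
LOANS = [(1, 1, 1), (2, 1, 2), (3, 2, 3), (4, 3, 4),
         (5, 1, 5), (6, 1, 5), (7, 1, 5), (8, 1, 2)]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE loan (id INTEGER PRIMARY KEY, dvdid INTEGER, copyid INTEGER)")
conn.executemany("INSERT INTO loan VALUES (?, ?, ?)", LOANS)

QUERY = """
    WITH loan_copy_total AS (
        -- loans per copy
        SELECT dvdid, copyid, COUNT(*) AS cnt
        FROM loan
        GROUP BY dvdid, copyid
    ),
    loan_copy_avg AS (
        -- average of those per-copy counts for each DVD
        SELECT dvdid, AVG(cnt) AS copy_avg
        FROM loan_copy_total
        GROUP BY dvdid
    )
    SELECT lct.dvdid, lct.copyid, lct.cnt, lca.copy_avg
    FROM loan_copy_avg AS lca
    INNER JOIN loan_copy_total AS lct
        ON lca.dvdid = lct.dvdid
       AND lct.cnt < lca.copy_avg
    ORDER BY lct.dvdid, lct.copyid
"""

below_average = conn.execute(QUERY).fetchall()
print(below_average)
```

On the sample data only copy 1 of DVD 1 is below its DVD's average of 2. Copies that were never loaned do not appear in LOAN at all, so they would need a left join from COPY to be counted as zero.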