Query different IDs with different values? - sql

I'm trying to write a query for a golf database. It needs to return players who have statisticID = 1 with a p2sStatistic > 65 and who also have statisticID = 3 with p2sStatistic > 295.
One statisticID is driving distance, the other accuracy, etc. I've tried the following but it doesn't work and can't seem to find an answer online. How would I go about this without doing a view?
SELECT playerFirstName, playerLastName
FROM player2Statistic, player
WHERE player.playerID=player2Statistic.playerID
AND player2Statistic.statisticID=statistic.statisticID
AND p2sStatistic.3 > 295
AND p2sStatistic.1 > 65;
http://i.imgur.com/o8epk.png - pic of db
I'm trying to get it to output just the list of players that satisfy those two conditions.

For a list of players without duplicates an EXISTS semi-join is probably best:
SELECT playerFirstName, playerLastName
FROM player AS p
WHERE EXISTS (
SELECT 1
FROM player2Statistic AS ps
WHERE ps.playerID = p.playerID
AND ps.StatisticID = 1
AND ps.p2sStatistic > 65
)
AND EXISTS (
SELECT 1
FROM player2Statistic AS ps
WHERE ps.playerID = p.playerID
AND ps.StatisticID = 3
AND ps.p2sStatistic > 295
);
Column names and context are derived from the provided screenshots. The query in the question does not quite cover it.
Note the parentheses; they are needed to cope with operator precedence.
This is probably faster (duplicates are probably not possible):
SELECT p.playerFirstName, p.playerLastName
FROM player AS p
JOIN player2Statistic AS ps1 USING (playerID)
JOIN player2Statistic AS ps3 USING (playerID)
WHERE ps1.StatisticID = 1
AND ps1.p2sStatistic > 65
AND ps3.StatisticID = 3
AND ps3.p2sStatistic > 295;
If your top-secret brand of RDBMS does not support the SQL-standard USING (playerID), substitute ON ps1.playerID = p.playerID (and likewise ON ps3.playerID = p.playerID) to the same effect.
It's a case of relational division. Find many more query techniques to deal with it under this related question:
How to filter SQL results in a has-many-through relation
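To check that the EXISTS form and the double-join form agree, here is a minimal sketch that runs both on toy data. It assumes SQLite through Python's standard sqlite3 module rather than the asker's RDBMS, and the sample players and values are made up; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical toy data; table and column names follow the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE player (playerID INTEGER PRIMARY KEY, playerFirstName TEXT, playerLastName TEXT);
CREATE TABLE player2Statistic (
    playerID INTEGER, statisticID INTEGER, p2sStatistic REAL,
    PRIMARY KEY (playerID, statisticID)
);
INSERT INTO player VALUES (1, 'Ann', 'Archer'), (2, 'Bob', 'Baker'), (3, 'Cy', 'Cole');
INSERT INTO player2Statistic VALUES
    (1, 1, 70), (1, 3, 300),   -- passes both thresholds
    (2, 1, 70), (2, 3, 280),   -- fails statisticID 3
    (3, 1, 60), (3, 3, 310);   -- fails statisticID 1
""")

# Variant 1: one EXISTS semi-join per required statistic.
exists_sql = """
SELECT p.playerFirstName, p.playerLastName
FROM player AS p
WHERE EXISTS (SELECT 1 FROM player2Statistic AS ps
              WHERE ps.playerID = p.playerID AND ps.statisticID = 1 AND ps.p2sStatistic > 65)
  AND EXISTS (SELECT 1 FROM player2Statistic AS ps
              WHERE ps.playerID = p.playerID AND ps.statisticID = 3 AND ps.p2sStatistic > 295)
"""

# Variant 2: join player2Statistic once per required statistic.
join_sql = """
SELECT p.playerFirstName, p.playerLastName
FROM player AS p
JOIN player2Statistic AS ps1 ON ps1.playerID = p.playerID
JOIN player2Statistic AS ps3 ON ps3.playerID = p.playerID
WHERE ps1.statisticID = 1 AND ps1.p2sStatistic > 65
  AND ps3.statisticID = 3 AND ps3.p2sStatistic > 295
"""

exists_rows = conn.execute(exists_sql).fetchall()
join_rows = conn.execute(join_sql).fetchall()
print(exists_rows)
print(join_rows)
```

The join variant returns no duplicates here only because the assumed primary key (playerID, statisticID) allows at most one row per player and statistic.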

You are missing the statistic table in your query. You need to join it in, based on your where clause.
You also need to use proper join syntax.
The following version joins in the statistics table twice, once for the "1" and once for the "3":
SELECT distinct p.playerFirstName, p.playerLastName
FROM player p join
player2Statistic p2s1
on p2s1.playerId = p.playerId join
statistic s1
on s1.StatisticId = p2s1.statisticId and
s1.StatisticId = 1 join
player2Statistic p2s3
on p2s3.playerId = p.playerId join
statistic s3
on s3.StatisticId = p2s3.statisticId and
s3.StatisticId = 3
WHERE (p2s3.p2sStatistic > 295 and p2s1.p2sStatistic > 65)

You will want to join to the player2Statistic table twice:
SELECT playerFirstName, playerLastName
FROM player p
JOIN player2Statistic s1
on p.playerID=s1.playerID and s1.statisticID = 1
JOIN player2Statistic s3
on p.playerID=s3.playerID and s3.statisticID = 3
WHERE s1.p2sStatistic > 65 and s3.p2sStatistic > 295;

Related

Long query execution - SSP Datatables

I use DataTables with the SSP class, and I have a query that returns about 2000 rows.
But when I run the query I get a 500 or 504 error (with the 504, the table doesn't load).
I use OVH CloudDB with MariaDB 10.2, and the server runs PHP 7.2.
My joinQuery looks like:
FROM data_platforms p
LEFT JOIN game_platforms gp ON gp.platform_id = p.platform_id
LEFT JOIN games g ON g.game_id = gp.game_id
LEFT JOIN tastings t ON t.tasting_game_id = g.game_id
LEFT JOIN notes n ON n.note_game_id = g.game_id
LEFT JOIN ratings r ON r.rating_game_id = g.game_id
LEFT JOIN images i ON i.image_type_id = g.game_id AND (i.image_type = 2 || i.image_type = 1)
LEFT JOIN game_generes gg ON gg.game_id = g.game_id
LEFT JOIN generes gen ON gen.id = gg.genere_id
And extraWhere
p.platform_id = '.$platformID.' AND g.game_status = 1
And groupBy
gp.game_id
Is there any way to be able to optimize this query?
Am I doomed to fail at this point, or should I use a different SSP class?
Don't use LEFT unless you really expect rows in the 'right' table to be missing.
Don't "over-normalize". For example, I would expect genre to simply be a column, not a many-to-many mapping table plus a genre table.
(i.image_type = 2 || i.image_type = 1) --> i.image_type IN (1,2) might optimize better.
See this for a likely improvement in many-to-many table indexes: http://mysql.rjweb.org/doc.php/index_cookbook_mysql#many_to_many_mapping_table
Please provide SHOW CREATE TABLE so we can check other indexes, such as on platform_id.
Let's see the entire query, plus EXPLAIN SELECT .... GROUP BY gp.game_id may be invalid (unless everything else is dependent on it).

How does this SQL query return results with same id_product?

I am facing a complex SQL query in some code, which is supposed to return products without duplicates (by using the DISTINCT keyword at the beginning). Here is the query:
SELECT DISTINCT p.`id_product`, p.*, product_shop.*, pl.* , m.`name` AS manufacturer_name, x.`id_feature` , x.`id_feature_value` , s.`name` AS supplier_name
FROM `ps_product` p
INNER JOIN ps_product_shop product_shop
ON (product_shop.id_product = p.id_product AND product_shop.id_shop = 1)
LEFT JOIN `ps_product_attribute` y ON (y.`id_product` = p.`id_product`)
LEFT JOIN `ps_product_attribute_combination` ac ON (y.`id_product_attribute` = ac.`id_product_attribute`)
LEFT JOIN `ps_product_lang` pl ON (p.`id_product` = pl.`id_product` AND pl.id_shop = 1 )
LEFT JOIN `ps_manufacturer` m ON (m.`id_manufacturer` = p.`id_manufacturer`)
LEFT JOIN `ps_feature_product` x ON (x.`id_product` = p.`id_product`)
LEFT JOIN `ps_supplier` s ON (s.`id_supplier` = p.`id_supplier`)
LEFT JOIN `ps_category_product` c ON (c.`id_product` = p.`id_product`)
WHERE pl.`id_lang` = 1 AND c.`id_category` = 18 AND p.`price` between 0 and 1000
AND product_shop.`visibility` IN ("both", "catalog") AND product_shop.`active` = 1
ORDER BY p.`id_product` ASC LIMIT 1,4
But it returns 4 rows, 2 of which have the same "id_product" (11941).
What I need is to return 4 products, each with a different id.
Anyone ?
Thanks a lot
Aymeric
[EDIT]
The result of this query shows 4 rows, 2 of which have exactly the same column values EXCEPT for the id_feature_value column, which is 36 for one and 38 for the other.
SELECT DISTINCT gets all the distinct combinations of all selected fields in your query, not just the first field.
Now, you could solve that by using GROUP BY to select only distinct values of id_product specifically, like:
SELECT p.`id_product`, p.*, product_shop.*, pl.* , m.`name` AS manufacturer_name, x.`id_feature` , x.`id_feature_value` , s.`name` AS supplier_name
FROM `ps_product` p
INNER JOIN ps_product_shop product_shop
ON (product_shop.id_product = p.id_product AND product_shop.id_shop = 1)
LEFT JOIN `ps_product_attribute` y ON (y.`id_product` = p.`id_product`)
LEFT JOIN `ps_product_attribute_combination` ac ON (y.`id_product_attribute` = ac.`id_product_attribute`)
LEFT JOIN `ps_product_lang` pl ON (p.`id_product` = pl.`id_product` AND pl.id_shop = 1 )
LEFT JOIN `ps_manufacturer` m ON (m.`id_manufacturer` = p.`id_manufacturer`)
LEFT JOIN `ps_feature_product` x ON (x.`id_product` = p.`id_product`)
LEFT JOIN `ps_supplier` s ON (s.`id_supplier` = p.`id_supplier`)
LEFT JOIN `ps_category_product` c ON (c.`id_product` = p.`id_product`)
WHERE pl.`id_lang` = 1 AND c.`id_category` = 18 AND p.`price` between 0 and 1000
AND product_shop.`visibility` IN ("both", "catalog") AND product_shop.`active` = 1
GROUP BY p.`id_product`
ORDER BY p.`id_product` ASC LIMIT 1,4
However, the problem now is that your query has multiple different values to choose from for all the other fields you have selected, and no deterministic way to pick among them. Even though id_product is unique in its table, it's not unique in the result set, because at least one of your JOINs has a one-to-many relationship, meaning several rows match the JOIN conditions.
Older versions of MySQL (before ONLY_FULL_GROUP_BY became part of the default SQL mode in 5.7.5) will just pick one of the values in this case, but SQL Server will actually error out and tell you that the remaining fields either have to be mentioned in the GROUP BY clause or have to be aggregated. So, you've got a few ways you can go from here:
You are on an old version of MySQL and you don't particularly care which values are returned for the rest of the fields, so leave the query as I've posted and use that. I wouldn't recommend this, as which values you get is indeterminate, so in theory it could change at MySQL's whim. In practice the values tend to come from the same result row, though MySQL does not guarantee that.
Add aggregate functions, such as MIN() or MAX() to the rest of the remaining fields in the select clause. This will reduce the possible values for the fields down to one, but you will probably end up with a mixture of values from different rows.
Remove any one-to-many JOINs from your query so that you only ever get one row back in the result set for each individual id_product. Then, fetch the remaining data you need in a separate query.
There may be other solutions, but it depends a lot on which values you want returned for the rest of the fields and which RDBMS you are using. For example, on SQL Server you could potentially use ROW_NUMBER() OVER (PARTITION BY ...) to select the first row for each distinct id_product deterministically.
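The difference between DISTINCT over whole rows and one row per id_product can be reproduced on a small scale. The sketch below assumes SQLite through Python's sqlite3 module and uses two simplified, hypothetical tables (product, feature_product) in place of the PrestaShop schema; it illustrates the aggregate option from the list above, not the original query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical, simplified stand-ins for ps_product and ps_feature_product.
CREATE TABLE product (id_product INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE feature_product (id_product INTEGER, id_feature_value INTEGER);
INSERT INTO product VALUES (11941, 'widget'), (11942, 'gadget');
INSERT INTO feature_product VALUES (11941, 36), (11941, 38), (11942, 40);
""")

# DISTINCT compares whole rows, so product 11941 appears once per feature value.
distinct_rows = conn.execute("""
    SELECT DISTINCT p.id_product, p.name, x.id_feature_value
    FROM product p
    LEFT JOIN feature_product x ON x.id_product = p.id_product
    ORDER BY p.id_product, x.id_feature_value
""").fetchall()

# Grouping by the product columns and aggregating the one-to-many column
# yields exactly one row per product.
grouped_rows = conn.execute("""
    SELECT p.id_product, p.name, MIN(x.id_feature_value) AS id_feature_value
    FROM product p
    LEFT JOIN feature_product x ON x.id_product = p.id_product
    GROUP BY p.id_product, p.name
    ORDER BY p.id_product
""").fetchall()

print(distinct_rows)
print(grouped_rows)
```

With several aggregated columns, each MIN() or MAX() is computed independently, which is how values from different source rows can end up mixed in one result row.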

Refactoring slow SQL query

I currently have this very very slow query:
SELECT generators.id AS generator_id, COUNT(*) AS cnt
FROM generator_rows
JOIN generators ON generators.id = generator_rows.generator_id
WHERE
generators.id IN (SELECT "generators"."id" FROM "generators" WHERE "generators"."client_id" = 5212 AND ("generators"."state" IN ('enabled'))) AND
(
generators.single_use = 'f' OR generators.single_use IS NULL OR
generator_rows.id NOT IN (SELECT run_generator_rows.generator_row_id FROM run_generator_rows)
)
GROUP BY generators.id;
And I'm trying to refactor and improve it with this query:
SELECT g.id AS generator_id, COUNT(*) AS cnt
from generator_rows gr
join generators g on g.id = gr.generator_id
join lateral(select case when exists(select * from run_generator_rows rgr where rgr.generator_row_id = gr.id) then 0 else 1 end as noRows) has on true
where g.client_id = 5212 and "g"."state" IN ('enabled') AND
(g.single_use = 'f' OR g.single_use IS NULL OR has.norows = 1)
group by g.id
For some reason it doesn't quite work as expected (it returns 0 rows). I think I'm pretty close to the end result but can't get it to work.
I'm running on PostgreSQL 9.6.1.
This appears to be the query, formatted so I can read it:
SELECT gr.generator_id, COUNT(*) AS cnt
FROM generators g JOIN
generator_rows gr
ON g.id = gr.generator_id
WHERE gr.generator_id IN (SELECT g.id
FROM generators g
WHERE g.client_id = 5212 AND
g.state = 'enabled'
) AND
(g.single_use = 'f' OR
g.single_use IS NULL OR
gr.id NOT IN (SELECT rgr.generator_row_id FROM run_generator_rows rgr)
)
GROUP BY gr.generator_id;
I would be inclined to do most of this work in the FROM clause:
SELECT gr.generator_id, COUNT(*) AS cnt
FROM generators g JOIN
generator_rows gr
ON g.id = gr.generator_id JOIN
generators gg
on g.id = gg.id AND
gg.client_id = 5212 AND gg.state = 'enabled' LEFT JOIN
run_generator_rows rgr
ON gr.id = rgr.generator_row_id
WHERE g.single_use = 'f' OR
g.single_use IS NULL OR
rgr.generator_row_id IS NULL
GROUP BY gr.generator_id;
This does make two assumptions that I think are reasonable:
generators.id is unique
run_generator_rows.generator_row_id is unique
(It is easy to avoid these assumptions, but the duplicate elimination is more work.)
Then, some indexes could help:
generators(client_id, state, id)
run_generator_rows(generator_row_id)
generator_rows(generator_id)
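A quick way to gain confidence in a rewrite like this is to run both forms on toy data and compare the results. The sketch below assumes SQLite through Python's sqlite3 module instead of PostgreSQL, stores single_use as the text values 't' and 'f', and uses made-up rows. It also filters generators directly rather than through a self-join, and joins run_generator_rows on gr.id; both are my reading of the intent, not a quotation of the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE generators (id INTEGER PRIMARY KEY, client_id INTEGER, state TEXT, single_use TEXT);
CREATE TABLE generator_rows (id INTEGER PRIMARY KEY, generator_id INTEGER);
-- UNIQUE mirrors the answer's assumption that generator_row_id is unique.
CREATE TABLE run_generator_rows (id INTEGER PRIMARY KEY, generator_row_id INTEGER UNIQUE);
INSERT INTO generators VALUES
    (1, 5212, 'enabled', 'f'), (2, 5212, 'enabled', 't'), (3, 5212, 'disabled', 'f'),
    (4, 9999, 'enabled', NULL), (5, 5212, 'enabled', NULL);
INSERT INTO generator_rows VALUES (10, 1), (11, 1), (20, 2), (21, 2), (22, 2), (30, 3), (40, 4), (50, 5);
INSERT INTO run_generator_rows VALUES (100, 10), (101, 20), (102, 21);
""")

# Original shape: IN / NOT IN subqueries in the WHERE clause.
original_sql = """
SELECT g.id, COUNT(*) AS cnt
FROM generator_rows gr
JOIN generators g ON g.id = gr.generator_id
WHERE g.id IN (SELECT id FROM generators WHERE client_id = 5212 AND state IN ('enabled'))
  AND (g.single_use = 'f' OR g.single_use IS NULL
       OR gr.id NOT IN (SELECT generator_row_id FROM run_generator_rows))
GROUP BY g.id ORDER BY g.id
"""

# Rewritten shape: LEFT JOIN plus IS NULL as an anti-join on the row id.
rewritten_sql = """
SELECT g.id, COUNT(*) AS cnt
FROM generators g
JOIN generator_rows gr ON gr.generator_id = g.id
LEFT JOIN run_generator_rows rgr ON rgr.generator_row_id = gr.id
WHERE g.client_id = 5212 AND g.state = 'enabled'
  AND (g.single_use = 'f' OR g.single_use IS NULL OR rgr.generator_row_id IS NULL)
GROUP BY g.id ORDER BY g.id
"""

original_rows = conn.execute(original_sql).fetchall()
rewritten_rows = conn.execute(rewritten_sql).fetchall()
print(original_rows)
print(rewritten_rows)
```

If generator_row_id were not unique, the LEFT JOIN would multiply rows and inflate COUNT(*), which is why the uniqueness assumption matters.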
Generally avoid inner selects as in
WHERE ... IN (SELECT ...)
as they are usually slow.
As has already been shown for your problem, it is a good idea to think of SQL in terms of set theory.
You do NOT have to join tables on their IDs alone:
In fact, SQL takes the set (that is, all rows) of the first table and "multiplies" it with the set of the second table, conceptually ending up with n times m rows.
Then the ON clause is used to reduce the result (often strongly) by evaluating its condition for each of those combinations as either true (keep) or false (drop). This way you can use any arbitrary logic to select the combinations you want.
Things get trickier with LEFT JOIN and RIGHT JOIN, but you can think of them as always keeping the rows of one side. For each row of that side, they either:
output the combinations of that row for which the logic yields true (if there is at least one), exactly like JOIN does, or
output exactly ONE row, with 'the other side' (the right side for LEFT JOIN and vice versa) consisting of NULL in every column, if the logic never yields true.
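The mental model above can be written down in a few lines of Python. This is only a conceptual sketch with hypothetical helper names (inner_join, left_join); real databases do not materialize the full n times m product, and None stands in here for a right side that is NULL in every column.

```python
from itertools import product


def inner_join(left, right, on):
    """Model JOIN ... ON as a filtered Cartesian product."""
    return [(lrow, rrow) for lrow, rrow in product(left, right) if on(lrow, rrow)]


def left_join(left, right, on):
    """Model LEFT JOIN: keep all matches, or emit (row, None) when there are none."""
    out = []
    for lrow in left:
        matches = [(lrow, rrow) for rrow in right if on(lrow, rrow)]
        out.extend(matches if matches else [(lrow, None)])
    return out


# Made-up rows: generator 1 has two rows, generator 2 has none.
generators = [{"id": 1}, {"id": 2}]
generator_rows = [{"id": 10, "generator_id": 1}, {"id": 11, "generator_id": 1}]


def same_generator(g, gr):
    # Any boolean logic may go here, not just an equality on ids.
    return g["id"] == gr["generator_id"]


inner = inner_join(generators, generator_rows, same_generator)
left = left_join(generators, generator_rows, same_generator)
print(len(inner), len(left))
```

Generator 2 disappears from the inner join but survives the left join as a single row paired with None.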
COUNT(*) is great too, but if things get complicated don't stick to it: use sub-selects for the keys only, and once all the hard work is done, join the fun stuff to it. Like in
SELECT sub.*, details.*
FROM (
SELECT SUM(VALID) AS valid_count, ID
FROM (
SELECT (CASE WHEN X THEN 1 ELSE 0 END) AS VALID, ID
FROM ...
) AS flagged
GROUP BY ID
) AS sub
JOIN ... AS details ON sub.id = details.id
The difference is that the inner query is executed only once. The outer query usually has no indexes left to work with and will be slow, but as long as the inner select doesn't make the data explode, this is usually many times faster than SELECT ... WHERE ... IN (SELECT ...) constructs.
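As a concrete instance of "aggregate the keys first, then join the details", here is a minimal sketch. It assumes SQLite through Python's sqlite3 module, and the tables events and details, their columns, and the condition x > 0 are all hypothetical placeholders for the elided parts of the pattern.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical tables: events holds the rows to count, details the descriptive columns.
CREATE TABLE events (id INTEGER, x INTEGER);
CREATE TABLE details (id INTEGER PRIMARY KEY, label TEXT);
INSERT INTO events VALUES (1, 5), (1, -1), (1, 3), (2, -2);
INSERT INTO details VALUES (1, 'alpha'), (2, 'beta');
""")

# The derived table does the counting per key; details are joined afterwards.
rows = conn.execute("""
    SELECT details.label, sub.valid_count
    FROM (
        SELECT id, SUM(CASE WHEN x > 0 THEN 1 ELSE 0 END) AS valid_count
        FROM events
        GROUP BY id
    ) AS sub
    JOIN details ON details.id = sub.id
    ORDER BY details.label
""").fetchall()
print(rows)
```

Unlike filtering with WHERE x > 0 before grouping, the CASE expression keeps keys whose count is zero in the result.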

Simplify SQL query with multiple Sub-Selects

Goal
I would like to simplify the following SQL query in terms of visual length (fewer nested sub-selects) and/or performance and/or readability. The query is written for MS Access, which is why there are parentheses around the INNER JOINs.
Explanation
To keep the overall view, all table relations after the ON clauses are abbreviated.
S1 and S2 are the same table S. To work, the query (MS Access, to be precise) needs different names for S in the different sub-selects.
The level 1 query (T1) calculates the quantity of defects, the quantity of production, and the defect rate in PPM for all production areas and subareas in a specific period of time.
The level 2 query (T2) calculates the sum of the defect rates from T1 per area and adds some additional information, such as the one main user among the many attached to an area, the area name itself, and the target for a specific year and month.
The level 3 query takes everything from level 2, adds a comment defined for a specific year and week, and filters the list down to areas where the sum of defect rates exceeds the target.
This query works as expected, and all the columns in the result table show properly calculated values and fields.
Is it possible to simplify this query?
Note
I added sql-server tag to reach more pro-users. Sorry if it doesn't fit.
Query
SELECT s1id, s2name, user, target, ratio, C.msg AS msg
FROM
(SELECT s1id, S2.nam AS s2name, (U.lastname & ' ' & U.firstname) AS user, T.m1 AS target, SUM(ratio1) as ratio
FROM ((((
(SELECT S1.id AS s1id, WG.id AS wgid, SUM(IIF(D.ok=False,F.qty,0)) AS nok, SUM(F.qty) AS production, IIF(production > 0, CLNG(nok/production * 1000000), 0) AS ratio1
FROM (((F
INNER JOIN D ON D.x = F.x)
INNER JOIN W ON W.x = D.x)
INNER JOIN WG ON WG.x = W.x)
INNER JOIN S1 ON S1.x = WG.x
WHERE F.entrydate BETWEEN #2017-01-23# And #2017-01-29#
GROUP BY S1.id, WG.id) AS T1
INNER JOIN S2 ON S2.x = T1.x)
INNER JOIN UPS ON UPS.x = S2.x)
INNER JOIN UP ON UP.x = UPS.x)
INNER JOIN U ON U.x = UP.x)
INNER JOIN T ON T.x = S2.x
WHERE UPS.main = true AND UP.positionid = 3 AND T.y = 2017
GROUP BY sectorid, S2.nam, U.lastname, U.firstname, T.m1) AS T2
INNER JOIN C ON C.x = T2.x
WHERE C.yearnum = 2017 AND C.weeknum = 4 AND ratio > target

Receiving 1 row from joined (1 to many) postgresql

I have this problem:
I have 2 major tables (apartments, tenants) that have a connection of 1 to many (1 apartment, many tenants).
I'm trying to pull all apartments in my building, each with one of its tenants.
The preferred tenant is the one who has ot=2 (there are 2 possible values: 1 or 2).
I tried to use subqueries, but in PostgreSQL a subquery in the SELECT list can't return more than 1 column.
I don't know how to solve it. Here is my latest code:
SELECT a.apartment_id, a.apartment_num, a.floor, at.app_type_desc_he, tn.otype_desc_he, tn.e_name
FROM
public.apartments a INNER JOIN public.apartment_types at ON
at.app_type_id = a.apartment_type INNER JOIN
(select t.apartment_id, t.building_id, ot.otype_id, ot.otype_desc_he, e.e_name
from public.tenants t INNER JOIN public.ownership_types ot ON
ot.otype_id = t.ownership_type INNER JOIN entities e ON
t.entity_id = e.entity_id
) tn ON
a.apartment_id = tn.apartment_id AND tn.building_id = a.building_id
WHERE
a.building_id = 4 AND tn.building_id=4
ORDER BY
a.apartment_num ASC,
tn.otype_id DESC
Thanks in advance
SELECT a.apartment_id, a.apartment_num, a.floor
,at.app_type_desc_he, tn.otype_desc_he, tn.e_name
FROM public.apartments a
JOIN public.apartment_types at ON at.app_type_id = a.apartment_type
LEFT JOIN LATERAL ( -- LATERAL requires PostgreSQL 9.3 or later
SELECT ot.otype_id, ot.otype_desc_he, e.e_name
FROM public.tenants t
JOIN public.ownership_types ot ON ot.otype_id = t.ownership_type
JOIN entities e ON t.entity_id = e.entity_id
WHERE (t.apartment_id, t.building_id) = (a.apartment_id, a.building_id) -- one pick per apartment
ORDER BY (ot.otype_id = 2) DESC
LIMIT 1
) tn ON true
WHERE a.building_id = 4
ORDER BY a.apartment_num; -- , tn.otype_id DESC -- pointless
The crucial part is the ORDER BY ... LIMIT 1 inside the subquery.
This works in either case.
If there are tenants for an apartment, exactly 1 will be returned.
If there is one (or more) tenant of ot.otype_id = 2, it will be one of that type.
If there are no tenants, the apartment is still returned.
If, as you say, there are only 2 possible values for ot.otype_id (1 or 2), you can simplify to:
ORDER BY ot.otype_id DESC
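The "one preferred tenant per apartment" logic can be tried out on toy data. The sketch below assumes SQLite through Python's sqlite3 module, which has neither LATERAL nor DISTINCT ON, so a correlated subquery in the join condition stands in for the PostgreSQL construct. The tables are simplified, and the tenant_id key column and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical versions of the question's tables.
CREATE TABLE apartments (apartment_id INTEGER PRIMARY KEY, apartment_num INTEGER);
CREATE TABLE tenants (
    tenant_id INTEGER PRIMARY KEY,   -- assumed surrogate key, not shown in the question
    apartment_id INTEGER, ownership_type INTEGER, e_name TEXT
);
INSERT INTO apartments VALUES (1, 101), (2, 102), (3, 103);
INSERT INTO tenants VALUES
    (1, 1, 1, 'renter A'), (2, 1, 2, 'owner A'),   -- apartment 101: type 2 should win
    (3, 2, 1, 'renter B');                         -- apartment 102: only a type 1 tenant
""")

# For each apartment, pick at most one tenant, preferring ownership_type = 2.
rows = conn.execute("""
    SELECT a.apartment_num, t.e_name, t.ownership_type
    FROM apartments a
    LEFT JOIN tenants t ON t.tenant_id = (
        SELECT t2.tenant_id
        FROM tenants t2
        WHERE t2.apartment_id = a.apartment_id
        ORDER BY (t2.ownership_type = 2) DESC, t2.tenant_id
        LIMIT 1
    )
    ORDER BY a.apartment_num
""").fetchall()
print(rows)
```

Apartment 103 has no tenants and still appears, with NULL (None in Python) in the tenant columns.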
Debug query
Try removing the WHERE clauses from the base query and change
JOIN public.apartment_types
to
LEFT JOIN public.apartment_types
and add them back one by one to see which condition excludes all rows.
Do at.app_type_id and a.apartment_type really match?