Limiting MySQL results within query - sql

I'm looking to see if I can get the results I need with a single query, and my MySQL skills are still in their adolescence over here.
I have 4 tables: shows, artists, venues and tours. A simplified version of my main query right now looks like this:
SELECT *
FROM artists AS a,
venues AS v,
shows AS s
LEFT JOIN tours AS t ON s.show_tour_id = t.tour_id
WHERE s.show_artist_id = a.artist_id
AND s.show_venue_id = v.venue_id
ORDER BY a.artist_name ASC, s.show_date ASC;
What I want to add is a limit on how many shows are returned per artist. I know I could SELECT * FROM artists, and then run a query with a simple LIMIT clause for each returned row, but I figure there must be a more efficient way.
UPDATE: to put this more simply, I want to select up to 5 shows for each artist. I know I could do this (stripping away all irrelevancies):
<?php
$artists = $db->query("SELECT * FROM artists");
foreach($artists as $artist) {
$db->query("SELECT * FROM shows WHERE show_artist_id = $artist->artist_id LIMIT 5");
}
?>
But it seems wrong to be putting another query within a foreach loop. I'm looking for a way to achieve this within one result set.

This is the kind of thing stored procedures are for.
Select a list of artists, then loop through that list, adding 5 or fewer shows for each artist to a temp table.
Then, return the temp table.

As a plan B, if you can't work out the proper SQL statement, you can read the whole result set into an in-memory structure (array, class, etc.) and loop over it there. If the data is small enough and enough memory is available, this lets you run only one query. Not elegant, but it may work for you.

Well I hesitate to suggest this because it certainly won't be computationally efficient (see the stored procedures answer for that...) but it will all be in one query like you wanted. I'm also taking some liberties and assuming that you want the 5 most recent shows...hopefully you can modify to your actual requirements.
SELECT *
FROM artists AS a,
venues AS v,
shows AS s
LEFT JOIN tours AS t ON s.show_tour_id = t.tour_id
WHERE s.show_artist_id = a.artist_id
AND s.show_venue_id = v.venue_id
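AND (
-- MySQL rejects LIMIT inside an IN subquery, so count the more recent shows instead:
-- a show is kept only if fewer than 5 shows by the same artist are later than it.
SELECT COUNT(*) FROM shows subS
WHERE subS.show_artist_id = s.show_artist_id
AND subS.show_date > s.show_date
) < 5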
ORDER BY a.artist_name ASC, s.show_date ASC;
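If window functions are available (MySQL 8.0+, or SQLite 3.25+ as used here), "up to 5 shows per artist" can also be written by numbering each artist's shows and keeping the first five. This is only a sketch run against an in-memory SQLite database; the table contents are made up for illustration, and "most recent first" is the same assumption as in the answer above.

```python
import sqlite3

# Hypothetical miniature of the artists/shows tables from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists (artist_id INTEGER PRIMARY KEY, artist_name TEXT);
    CREATE TABLE shows (
        show_id INTEGER PRIMARY KEY,
        show_artist_id INTEGER,
        show_date TEXT
    );
""")
conn.executemany("INSERT INTO artists VALUES (?, ?)", [(1, "Alpha"), (2, "Beta")])

# Artist 1 gets 7 shows, artist 2 gets 3.
shows = [(None, 1, f"2020-01-{d:02d}") for d in range(1, 8)]
shows += [(None, 2, f"2020-02-{d:02d}") for d in range(1, 4)]
conn.executemany("INSERT INTO shows VALUES (?, ?, ?)", shows)

# Number each artist's shows from newest to oldest, then keep rows 1..5.
rows = conn.execute("""
    SELECT artist_name, show_date
    FROM (
        SELECT a.artist_name, s.show_date,
               ROW_NUMBER() OVER (
                   PARTITION BY s.show_artist_id
                   ORDER BY s.show_date DESC
               ) AS rn
        FROM shows AS s
        JOIN artists AS a ON a.artist_id = s.show_artist_id
    ) AS ranked
    WHERE rn <= 5
    ORDER BY artist_name ASC, show_date ASC
""").fetchall()

for row in rows:
    print(row)
```

The venues and tours joins from the question can be added inside the inner SELECT in the same way; they are left out here to keep the sketch short.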

Related

How to select data with a complex condition?

Using Microsoft Access, I normally use conditions (mostly WHERE) to obtain the data I want to display.
So far, that has gone well. However, I now have a more complex filter and I'm not sure of the best way to do it. I will explain how I do it with several queries, and I'd like to know if there is something simpler, since it feels like too much work for what I accomplish.
I have Building and Energy tables. Between them, I have a link table, since a Building has a list of possible energies.
My goal is to display ALL energies not already associated with the building.
I first have a simple query that lists the IDs of all energies in the link table for the building of interest.
I then have a second query, built on the first, which displays an energy only if it is absent from that list.
This takes two queries, and I feel there must be a better way to do it. I'm fairly new to MS Access, so any suggestion is welcome.
Here is the first query, which obtains the list of energies:
SELECT
Batiments.ID, Energies.ID, Energies.Type
FROM
Energies
INNER JOIN
(Batiments
INNER JOIN
Batiment_Energie ON Batiments.ID = Batiment_Energie.Batiment_ID) ON Energies.ID = Batiment_Energie.Energie_ID
WHERE
(((Batiments.ID) = " & cbxBatiments.Column(0) & "));"
You can query the non-associated energy types with
SELECT
ID, Type
FROM
Energies
WHERE
ID NOT IN (SELECT Energie_ID
FROM Batiment_Energie
WHERE Batiment_ID = 123)
where 123 is to be replaced by the ID coming from cbxBatiments.Column(0).
You can use not exists:
select e.*
from Energies as e
where not exists (select 1
from Batiment_Energie as be
where be.energie_id = e.id and be.batiment_id = <your id>
);
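To compare the two answers side by side, here is a small sketch that runs both forms against an in-memory SQLite database (not Access; the sample rows and function names are invented for illustration). It also shows one difference worth knowing: if the link table ever holds a NULL Energie_ID for that building, NOT IN returns nothing, while NOT EXISTS is unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Energies (ID INTEGER PRIMARY KEY, Type TEXT);
    CREATE TABLE Batiment_Energie (Batiment_ID INTEGER, Energie_ID INTEGER);
    INSERT INTO Energies VALUES (1, 'Electricity'), (2, 'Gas'), (3, 'Solar');
    INSERT INTO Batiment_Energie VALUES (123, 1), (123, 3), (456, 2);
""")


def unlinked_not_in(building_id):
    """Energies not linked to the building, using NOT IN."""
    return conn.execute("""
        SELECT ID, Type FROM Energies
        WHERE ID NOT IN (SELECT Energie_ID FROM Batiment_Energie
                         WHERE Batiment_ID = ?)
        ORDER BY ID
    """, (building_id,)).fetchall()


def unlinked_not_exists(building_id):
    """Energies not linked to the building, using NOT EXISTS."""
    return conn.execute("""
        SELECT e.ID, e.Type FROM Energies AS e
        WHERE NOT EXISTS (SELECT 1 FROM Batiment_Energie AS be
                          WHERE be.Energie_ID = e.ID AND be.Batiment_ID = ?)
        ORDER BY e.ID
    """, (building_id,)).fetchall()


before_in = unlinked_not_in(123)
before_exists = unlinked_not_exists(123)

# A NULL in the subquery result makes every NOT IN comparison unknown.
conn.execute("INSERT INTO Batiment_Energie VALUES (123, NULL)")
after_in = unlinked_not_in(123)
after_exists = unlinked_not_exists(123)

print(before_in, before_exists, after_in, after_exists)
```

The building ID is passed as a bound parameter here rather than concatenated into the SQL string.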

Oracle view join to table very weird slow issue

I have a table, order, which is very straightforward: it stores order data.
I also have a view that exposes currency pairs and their rates. The view is created as below:
create or replace view view_currency_rate as (
select c.* from currency_rate c, (
select curr_from, curr_to, max(rate_date) max_rate_date from currency_rate
where system_rate > 0
group by curr_from, curr_to) r
where c.curr_from = r.curr_from
and c.curr_to = r.curr_to
and c.rate_date = r.max_rate_date
and c.system_rate > 0
);
Nothing fancy here: this view returns the latest currency rate (curr_from -> curr_to) from the currency_rate table.
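For readers who want to see what the view returns, here is a minimal sketch of the same "latest row per currency pair" logic, run against in-memory SQLite instead of Oracle. The sample rates are invented, and the implicit join is rewritten as an explicit JOIN; it says nothing about the Oracle execution plan.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE currency_rate (
        curr_from TEXT, curr_to TEXT, rate_date TEXT, system_rate REAL
    );
    INSERT INTO currency_rate VALUES
        ('USD', 'EUR', '2016-09-01', 0.89),
        ('USD', 'EUR', '2016-09-05', 0.90),
        ('USD', 'EUR', '2016-09-06', 0),      -- skipped: system_rate is not > 0
        ('EUR', 'USD', '2016-09-02', 1.12);

    -- Same shape as the view in the question: join each row to the
    -- latest rate_date of its (curr_from, curr_to) pair.
    CREATE VIEW view_currency_rate AS
    SELECT c.*
    FROM currency_rate AS c
    JOIN (
        SELECT curr_from, curr_to, MAX(rate_date) AS max_rate_date
        FROM currency_rate
        WHERE system_rate > 0
        GROUP BY curr_from, curr_to
    ) AS r
      ON c.curr_from = r.curr_from
     AND c.curr_to = r.curr_to
     AND c.rate_date = r.max_rate_date
    WHERE c.system_rate > 0;
""")

latest = conn.execute(
    "SELECT curr_from, curr_to, rate_date, system_rate "
    "FROM view_currency_rate ORDER BY curr_from, curr_to"
).fetchall()

for row in latest:
    print(row)
```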
When I run the query below, it returns 80K rows (all the data), because I have plenty of records in the order table. It takes less than 5 seconds.
First Query:
select * from
VIEW_CURRENCY_RATE c, order a
where
c.curr_from = A.CURRENCY;
I wanted to add another filter, which I expected to make it faster, so I added this:
Second Query:
select * from
VIEW_CURRENCY_RATE c, order a
where
a.id = 'xxxx'
and c.curr_from = A.CURRENCY;
And now it runs for over 1 minute! I have no idea what is happening here. I thought the Oracle optimizer might be going wrong, so I tried another approach: since the full 80K rows can be returned quite fast, I tried filtering that result instead, nesting the SQL as below:
select * from (
select * from
VIEW_CURRENCY_RATE c, order a
where
c.curr_from = A.CURRENCY
)
where id = 'xxxx';
It runs just as slowly! I'm running out of ideas. Can anyone explain what is happening with my script?
Updated on 6-Sep-2016
After learning how to use 'explain plan', I captured the plans:
First query (fast one with 80K data):
Second query (slow one):
The slow one completely breaks up the view and forms a new SQL statement! How can Oracle optimize it like that?
It seems the problem relates to the plan of the second query, because it uses nested loops in place of a hash join.
First, check whether _hash_join_enabled is true; if it isn't, change it to true. If it is already true, the problem lies with the Oracle optimizer. To test this, use the USE_HASH(tab2 tab1) hint.
Regards
mohsen
I am using Mike's solution: I rewrote the script and it is running fast now, although the root cause has not been determined. It is probably due to the Oracle optimizer working differently from what I expected.

Optimize annotated query in Django

I'm trying to convert this simple SQL query into something Django can handle:
SELECT *
FROM location AS a
WHERE a.travel_distance = (
SELECT MAX(travel_distance)
FROM location AS b
WHERE b.person_id = a.person_id
)
ORDER BY a.travel_distance DESC
This fetches all travelled locations and keeps, for each person, only the rows that contain the maximum travel distance.
This is what I've got so far:
travels = Location.objects.filter(pk__in=Location.objects.order_by().values('person_id').annotate(max_id=Max('id')).values('max_id')).order_by('travel_distance')[::-1]
Although the results match, the second method takes a whole lot longer to return results.
Is there any way I can rewrite this query so that it becomes faster?
If I understand correctly you want the maximum distance travelled for each person. Assuming there is a Person model, perhaps ask from the other direction. Something like:
Person.objects.values('id').annotate(max_distance=Max('location__travel_distance'))
I haven't tested this since I don't have an equivalent data schema handy, but does this work for you?
Wouldn't this work? Something like:
select max(id), sum(travel_distance) from table group by person_id;
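At the SQL level, the correlated subquery from the question can also be written as a join against a per-person MAX. The sketch below checks that both forms return the same rows on a tiny invented data set; it uses plain sqlite3 rather than the Django ORM, so it only illustrates the query shape, not how to express it in a queryset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE location (
        id INTEGER PRIMARY KEY,
        person_id INTEGER,
        travel_distance REAL
    );
    INSERT INTO location (person_id, travel_distance) VALUES
        (1, 10.0), (1, 42.5), (2, 7.0), (2, 3.5), (3, 99.0);
""")

# Original form: correlated subquery evaluated per outer row.
correlated = conn.execute("""
    SELECT a.id, a.person_id, a.travel_distance
    FROM location AS a
    WHERE a.travel_distance = (
        SELECT MAX(b.travel_distance)
        FROM location AS b
        WHERE b.person_id = a.person_id
    )
    ORDER BY a.travel_distance DESC
""").fetchall()

# Alternative form: compute each person's maximum once, then join back.
joined = conn.execute("""
    SELECT a.id, a.person_id, a.travel_distance
    FROM location AS a
    JOIN (
        SELECT person_id, MAX(travel_distance) AS max_distance
        FROM location
        GROUP BY person_id
    ) AS m
      ON m.person_id = a.person_id
     AND m.max_distance = a.travel_distance
    ORDER BY a.travel_distance DESC
""").fetchall()

print(correlated)
print(joined)
```

Note that both forms select by maximum travel_distance, whereas a GROUP BY that takes MAX(id) selects each person's newest row, which is only the same thing if the farthest location was also recorded last.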

mysql query tuning

I have a query that I have made into a MySQL view. This particular view is central to our application, so we are looking at tuning it. There is a primary key on Map_Id,User_No,X,Y. I am pretty comfortable tuning SQL Server queries but not totally sure how MySQL works in this respect. Would it help to add an index that covers Points and Update_Stamp as well? About 90% of the operations on this table are reads, so while it gets lots of inserts, they are few compared to the reads.
Description: Get the person with the most points for each x,y coord in a given map. Tie break by who has the latest update stamp and then by user id.
SELECT GP.Map_Id AS Map_Id,GP.User_No AS User_No,GP.X AS X,GP.Y AS Y, GP.Points AS Points,GP.Update_Stamp AS Update_Stamp
FROM (Grid_Points GP LEFT JOIN Grid_Points GP2
ON (
(
(GP2.Map_Id = GP.Map_Id) AND (GP2.X = GP.X) AND (GP2.Y = GP.Y) AND
((GP2.Points > GP.Points) OR ((GP2.Points = GP.Points) AND (GP2.Update_Stamp > GP.Update_Stamp)) OR
((GP2.Points = GP.Points) AND (GP2.Update_Stamp = GP.Update_Stamp) AND (GP2.User_No < GP.User_No)))
)
)
)
WHERE ISNULL(GP2.User_No);
Wow man, you really like to use parentheses. :-)
You're right, a compound index may help. You might even be able to make it a covering index. Probably either an index on Grid_Points(Map_Id,X,Y) or else an index on Grid_Points(Points,Update_Stamp,User_No) would be what I try.
Always test query optimization with EXPLAIN to see if the optimizer is using your index. Read that documentation section until you understand the cryptic notes in the EXPLAIN report.
The EXPLAIN report will show you which index, if any, it decides to use. Be aware that MySQL generally uses only one index per table reference in a given query.
Here's how I would write that query, relying on order of precedence between AND and OR instead of so many nested parentheses:
SELECT GP.Map_Id, GP.User_No, GP.X, GP.Y, GP.Points, GP.Update_Stamp
FROM Grid_Points GP LEFT JOIN Grid_Points GP2
ON GP2.Map_Id = GP.Map_Id AND GP2.X = GP.X AND GP2.Y = GP.Y
AND (
GP2.Points > GP.Points
OR
GP2.Points = GP.Points
AND GP2.Update_Stamp > GP.Update_Stamp
OR
GP2.Points = GP.Points
AND GP2.Update_Stamp = GP.Update_Stamp
AND GP2.User_No < GP.User_No
)
WHERE GP2.User_No IS NULL;
You are using my favorite method for finding greatest-n-per-group in MySQL. MySQL doesn't optimize GROUP BY very well (it often incurs a temporary table which gets serialized to disk), so the left outer join solution that you're using is usually a lot better, at least for MySQL. In other brands of RDBMS, this solution may not have such an advantage.
I wouldn't match it to itself, I'd do it as a "group by" and then possibly match back to get who the person is.
SELECT Map_Id, X, Y, max(Points)
FROM Grid_Points
GROUP BY Map_Id, X, Y;
This would give you a table of Map_Id, X and Y, plus the maximum points for each.
You could then join those results back to Grid_Points to find which user has that number of points.
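The two approaches in this thread behave differently when points are tied, which a small experiment makes visible. The sketch below uses in-memory SQLite with invented rows: the self-join with tie-breakers returns exactly one winner per cell, while GROUP BY plus a join back on MAX(Points) returns every user who shares the top score.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Grid_Points (
        Map_Id INTEGER, User_No INTEGER, X INTEGER, Y INTEGER,
        Points INTEGER, Update_Stamp TEXT,
        PRIMARY KEY (Map_Id, User_No, X, Y)
    );
    INSERT INTO Grid_Points VALUES
        (1, 10, 0, 0, 50, '2020-01-01'),
        (1, 11, 0, 0, 70, '2020-01-01'),   -- most points at (0, 0)
        (1, 10, 0, 1, 30, '2020-01-01'),
        (1, 11, 0, 1, 30, '2020-01-02'),   -- tie on points, later stamp wins
        (1, 12, 1, 1, 20, '2020-01-01'),
        (1, 13, 1, 1, 20, '2020-01-01');   -- full tie, lower User_No wins
""")

# Self-join: a row wins when no other row in the same cell beats it.
winners = conn.execute("""
    SELECT GP.X, GP.Y, GP.User_No
    FROM Grid_Points AS GP
    LEFT JOIN Grid_Points AS GP2
      ON GP2.Map_Id = GP.Map_Id AND GP2.X = GP.X AND GP2.Y = GP.Y
     AND (
            GP2.Points > GP.Points
         OR GP2.Points = GP.Points AND GP2.Update_Stamp > GP.Update_Stamp
         OR GP2.Points = GP.Points AND GP2.Update_Stamp = GP.Update_Stamp
            AND GP2.User_No < GP.User_No
     )
    WHERE GP2.User_No IS NULL
    ORDER BY GP.X, GP.Y
""").fetchall()

# GROUP BY, then join back on the maximum: ties are not broken.
grouped = conn.execute("""
    SELECT GP.X, GP.Y, GP.User_No
    FROM Grid_Points AS GP
    JOIN (
        SELECT Map_Id, X, Y, MAX(Points) AS max_points
        FROM Grid_Points
        GROUP BY Map_Id, X, Y
    ) AS m
      ON m.Map_Id = GP.Map_Id AND m.X = GP.X AND m.Y = GP.Y
     AND m.max_points = GP.Points
    ORDER BY GP.X, GP.Y, GP.User_No
""").fetchall()

print(winners)
print(grouped)
```

This says nothing about which form is faster on MySQL; EXPLAIN against the real table is still the way to decide that.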

changing sorting criteria after the first result

I am selecting from a database of news articles, and I'd prefer to do it all in one query if possible. In the results, I need a sorting criteria that applies ONLY to the first result.
In my case, the first result must have an image, but the others should be sorted without caring about their image status.
Is this something I can do with some sort of conditionals or user variables in a MySQL query?
Even if you manage to find a query that looks like one query, it is going to be logically two queries. Have a look at MySQL UNION if you really must make it one query (but it will still be two logical queries). You can select the article with an image in the first part, with a LIMIT of 1, and the rest in the second.
Something like this ensures an article with an image on the top.
SELECT
id,
title,
newsdate,
article
FROM
news
ORDER BY
CASE WHEN HasImage = 'Y' THEN 0 ELSE 1 END,
newsdate DESC
Unless you define "the first result" more precisely, of course. This query prefers articles with images; articles without will appear at the end.
Another variant (thanks to le dorfier, who deleted his answer for some reason) would be this:
SELECT
id,
title,
newsdate,
article
FROM
news
ORDER BY
CASE WHEN id = (
SELECT MIN(id) FROM news WHERE HasImage = 'Y'
) THEN 0 ELSE 1 END,
newsdate DESC
This sorts the earliest (assuming MIN(id) means "earliest") article with an image to the top.
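If "first" should mean the newest article that has an image, the same ORDER BY CASE idea works with a scalar subquery that picks that article by date instead of by MIN(id). The sketch below runs it against in-memory SQLite with invented rows; the HasImage column and its 'Y'/'N' values are the same assumption as in the queries above, and it has not been checked against a specific MySQL version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE news (
        id INTEGER PRIMARY KEY,
        title TEXT,
        newsdate TEXT,
        HasImage TEXT
    );
    INSERT INTO news VALUES
        (1, 'Old with image',  '2020-01-01', 'Y'),
        (2, 'Mid no image',    '2020-01-02', 'N'),
        (3, 'Mid with image',  '2020-01-03', 'Y'),
        (4, 'Newest no image', '2020-01-04', 'N');
""")

# Pin the newest article with an image to the top; sort everything
# else by date alone, regardless of image status.
titles = [row[0] for row in conn.execute("""
    SELECT title
    FROM news
    ORDER BY
        CASE WHEN id = (
            SELECT id FROM news
            WHERE HasImage = 'Y'
            ORDER BY newsdate DESC
            LIMIT 1
        ) THEN 0 ELSE 1 END,
        newsdate DESC
""")]

print(titles)
```

Only one row gets sort key 0, so the remaining articles, with or without images, stay in plain date order.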
I don't think it's possible, as it's effectively two queries (one that sorts the table to find the first row, and one that sorts the rest without regard to images), so you might as well use two queries with a LIMIT 1 in the first.