How does DISTINCT interact with ORDER BY?

Consider the two tables below:
user:
ID | name
---+--------
1 | Alice
2 | Bob
3 | Charlie
event:
order | user
------+------------
1 | 1 (Alice)
2 | 2 (Bob)
3 | 3 (Charlie)
4 | 3 (Charlie)
5 | 2 (Bob)
6 | 1 (Alice)
If I run the following query:
SELECT DISTINCT user FROM event ORDER BY "order" DESC;
will it be guaranteed that I get the results in the following order?
1 (Alice)
2 (Bob)
3 (Charlie)
If the last three rows of event are selected, I know this is the order I get, because it would be ordering 4, 5, 6 in descending order. But if the first three rows are selected, and DISTINCT then prevents the last three from being loaded for consideration, I would get the results in reversed order.
Is this behavior well defined in SQL? Which of the two will happen? What about in SQLite?

No, it is not guaranteed.
See Itzik Ben-Gan's Logical Query Processing Phases poster for MS SQL Server. It has moved between sites; at the time of writing it could be found at https://accessexperts.com/wp-content/uploads/2015/07/Logical-Query-Processing-Poster.pdf .
DISTINCT precedes ORDER BY .. TOP in the logical processing order, and SQL Server is free to keep either the 1 | 1 (Alice) row or the 6 | 1 (Alice) row for Alice. So any of (1,2,3), (1,4,5) and so on are valid sets of order values to survive DISTINCT.

Here's a query solution that I believe solves your problem.
SELECT
MAX([order]) AS MaxOrd
, [user]
FROM Event
GROUP BY [User]
ORDER BY MaxOrd DESC
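
As a rough check of the GROUP BY approach above, here is a minimal sketch that runs the same idea against an in-memory SQLite database through Python's sqlite3 module. The table layout mirrors the question; storing user as a plain integer id and the alias max_ord are assumptions made for illustration.

```python
import sqlite3

# In-memory database with the event table from the question.
# "order" and "user" are quoted because ORDER is a reserved word.
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE event ("order" INTEGER, "user" INTEGER)')
conn.executemany(
    "INSERT INTO event VALUES (?, ?)",
    [(1, 1), (2, 2), (3, 3), (4, 3), (5, 2), (6, 1)],
)

# One row per user, ordered by that user's most recent event.
rows = conn.execute(
    'SELECT "user", MAX("order") AS max_ord '
    'FROM event GROUP BY "user" ORDER BY max_ord DESC'
).fetchall()

print(rows)  # [(1, 6), (2, 5), (3, 4)]
```

Because each group carries an explicit MAX("order"), the sort key is well defined for every user, which is exactly what the DISTINCT version lacks.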

Related

How do I query for records that all have the same relationships (MS Access)?

I'm trying to query for records that share common relationships. In order to avoid gratuitous context, here is a hypothetical example from my favorite old-school Nintendo game:
Consider a table of boxers:
tableBoxers
ID | boxerName
----------------
1 | Little Mac
2 | King Hippo
3 | Von Kaiser
4 | Don Flamenco
5 | Bald Bull
Now I have a relationship table that links them together
boxingMatches
boxerID1 | boxerID2
-----------------------
1 | 3
2 | 5
2 | 4
5 | 1
4 | 1
Since I don't want to discriminate between ID1 and ID2, I create a query that UNIONs them together:
SELECT firstID AS boxerID1, secondID AS boxerID2 FROM
(
SELECT boxerID1 AS firstID, boxerID2 AS secondID FROM boxingMatches
UNION ALL
SELECT boxerID2 AS firstID, boxerID1 AS secondID FROM boxingMatches
) ORDER BY firstID, secondID
I get:
queryBoxingMatches
boxerID1 | boxerID2
-----------------------
1 | 3
1 | 4
1 | 5
2 | 4
2 | 5
3 | 1
4 | 1
4 | 2
5 | 1
5 | 2
Now I have VBA script where a user can select the boxers he's interested in. Let's say he selects Little Mac (1) and King Hippo (2). This gets appended into a temporary table:
summaryRequest
boxerID
--------
1
2
Using table [summaryRequest] and [queryBoxingMatches], how do I find out whom Little Mac (1) and King Hippo (2) have similarly fought against? The result should be Bald Bull (5) and Don Flamenco (4).
Bear in mind that [summaryRequest] could have 0 or more records. I have considered an INTERSECT, but I'm not sure that's the right function for this. I've tried using COUNT numerous ways, but it gives undesired data when there are multiple relationships (e.g. if Little Mac fought Bald Bull twice and King Hippo only fought him once).
I can't help but feel like the answer is plain and simple and I'm just overthinking it. Any help is appreciated. Thanks.

Not sure if that works like this in MS Access:
SELECT boxerID2, COUNT(*) as cnt
FROM queryBoxingMatches
WHERE boxerID1 IN (SELECT boxerID FROM summaryRequest)
GROUP BY boxerID2
HAVING COUNT(DISTINCT boxerID1) = (SELECT COUNT(DISTINCT boxerID)
FROM summaryRequest)
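
To illustrate why the DISTINCT inside COUNT matters, here is a hedged sketch in Python with SQLite (not MS Access, so it says nothing about whether Access accepts this syntax). It uses the tables from the question plus one extra rematch row, which is an assumption added only to trigger the duplicate problem described in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE boxingMatches (boxerID1 INTEGER, boxerID2 INTEGER);
INSERT INTO boxingMatches VALUES (1,3),(2,5),(2,4),(5,1),(4,1);
-- Hypothetical rematch: Little Mac (1) fights Von Kaiser (3) a second time.
INSERT INTO boxingMatches VALUES (3,1);
CREATE VIEW queryBoxingMatches AS
    SELECT boxerID1, boxerID2 FROM boxingMatches
    UNION ALL
    SELECT boxerID2, boxerID1 FROM boxingMatches;
CREATE TABLE summaryRequest (boxerID INTEGER);
INSERT INTO summaryRequest VALUES (1),(2);
""")

def opponents(count_expr):
    # count_expr is either COUNT(*) or COUNT(DISTINCT boxerID1).
    sql = f"""
        SELECT boxerID2 FROM queryBoxingMatches
        WHERE boxerID1 IN (SELECT boxerID FROM summaryRequest)
        GROUP BY boxerID2
        HAVING {count_expr} = (SELECT COUNT(DISTINCT boxerID) FROM summaryRequest)
        ORDER BY boxerID2
    """
    return [row[0] for row in conn.execute(sql)]

naive = opponents("COUNT(*)")                    # counts the rematch twice
common = opponents("COUNT(DISTINCT boxerID1)")   # counts each selected boxer once

print(naive)   # [3, 4, 5]
print(common)  # [4, 5]
```

With the plain count, Von Kaiser (3) slips in because his two fights against Little Mac add up to the requested number of boxers; counting distinct boxerID1 values avoids that.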

Okay, I think I found the answer:
SELECT boxerID2 FROM
(
SELECT boxerID2, COUNT(boxerID2) AS getCount FROM queryBoxingMatches
WHERE boxerID1 IN (SELECT boxerID FROM summaryRequest)
GROUP BY boxerID2
)
WHERE getCount = (SELECT COUNT(*) FROM summaryRequest)
It's a play off of COUNT(), which I don't like because COUNT() doesn't guarantee a full-fledged relationship--just a pattern.

T-SQL: change an IN query into an AND query

I have a one-to-many relationship table:
ReviewId | EffectId
1 | 2
1 | 5
1 | 8
2 | 2
2 | 5
2 | 9
2 | 3
3 | 3
3 | 2
3 | 9
On the site, users select the effects they are interested in, and I return all the relevant reviews.
Currently I use an IN query.
For example, if the user selects effects 2 and 5, my query is:
select ReviewId from table_name where EffectId in (2,5)
Now I need to get all the reviews that contain both effects,
that is, all reviews that have effect 2 and effect 5.
What is the best query for this?
It is important to me that the query runs as quickly as possible.
To that end I can also change the table schema if needed, for example by adding a cached field that contains all the effects, comma-separated:
ReviewId | cachedEffects
1 | ,2,5,8
2 | ,2,5,9,3,
3 | ,3,2,9

You can do it this way:
select reviewid
from
tbl
where effectid in (2,5)
group by reviewid
having count(distinct effectid) > 1
count (distinct effectid) is used to ensure that the results contain only those reviewIDs which have multiple records with different values of effectID. The where clause is used to filter out based on your filter condition of having both 2 and 5.
The key thing to note here is that we are grouping by reviewID, and also using the count of distinct effectID values to ensure that only those records which have both 2 and 5 are returned. If we did not do so, the query would return all rows which have effectID equal to either 2 or 5.
For improving performance, you could create an index on reviewID.
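
The same pattern can be generalized to any number of selected effects by comparing the distinct count against the size of the selection. Below is a minimal sketch in Python with SQLite; the helper name reviews_with_all is hypothetical, and tbl holds the sample data from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (reviewid INTEGER, effectid INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?)",
    [(1, 2), (1, 5), (1, 8), (2, 2), (2, 5), (2, 9), (2, 3),
     (3, 3), (3, 2), (3, 9)],
)

def reviews_with_all(conn, effects):
    """Return review ids that have every effect in `effects` (non-empty)."""
    effects = sorted(set(effects))            # ignore duplicate selections
    placeholders = ",".join("?" * len(effects))
    sql = (
        f"SELECT reviewid FROM tbl WHERE effectid IN ({placeholders}) "
        "GROUP BY reviewid "
        "HAVING COUNT(DISTINCT effectid) = ? "
        "ORDER BY reviewid"
    )
    return [row[0] for row in conn.execute(sql, (*effects, len(effects)))]

print(reviews_with_all(conn, [2, 5]))     # [1, 2]
print(reviews_with_all(conn, [2, 3, 9]))  # [2, 3]
```

For exactly two effects, "= 2" and "> 1" select the same rows; the equality form is used here only because it keeps working when the selection grows.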

Matching algorithm in SQL

I have the following table in my database.
# select * FROM matches;
name | prop | rank
------+------+-------
carl | 1 | 4
carl | 2 | 3
carl | 3 | 9
alex | 1 | 8
alex | 2 | 5
alex | 3 | 6
alex | 3 | 8
alex | 2 | 11
anna | 3 | 8
anna | 3 | 13
anna | 2 | 14
(11 rows)
Each person is ranked at work by different properties/criteria called 'prop', and the performance is called 'rank'. The table contains multiple values of (name, prop), as the example shows. I want to get the best candidate given some requirements. E.g. I need a candidate that has (prop=1 AND rank > 5) and (prop=3 AND rank >= 8). Then we must be able to sort the candidates by their rankings to get the best candidate.
EDIT: Each person must fulfill ALL requirements
How can I do this in SQL?

select x.name, max(x.rank)
from matches x
join (
select name from matches where prop = 1 AND rank > 5
intersect
select name from matches where prop = 3 AND rank >= 8
) y
on x.name = y.name
group by x.name
order by max(rank);

Filtering the data to match your criteria here is quite simple (as shown by both Amir and sternze):
SELECT *
FROM matches
WHERE (prop=1 AND rank>5) OR (prop=3 AND rank>=8)
The problem is how to aggregate this data so as to have just one row per candidate.
I suggest you do something like this:
SELECT m.name,
MAX(DeltaRank1) AS MaxDeltaRank1,
MAX(DeltaRank3) AS MaxDeltaRank3
FROM (
SELECT name,
-- no ELSE: rows for other props yield NULL, which MAX ignores
(CASE WHEN prop=1 THEN rank-5 END) AS DeltaRank1,
(CASE WHEN prop=3 THEN rank-8 END) AS DeltaRank3
FROM matches
) m
GROUP BY m.name
-- prop 1 requires rank > 5 (delta > 0); prop 3 requires rank >= 8 (delta >= 0)
HAVING MAX(DeltaRank1)>0 AND MAX(DeltaRank3)>=0
ORDER BY MAX(DeltaRank1)+MAX(DeltaRank3) DESC;
This will order the candidates by the sum of how much they exceeded the target rank in prop1 and prop3. You could use different logic to indicate which is best though.
In the case above, this should be the result:
name | MaxDeltaRank1 | MaxDeltaRank3
------+---------------+--------------
alex | 3 | 0
... because neither anna nor carl reach both the required ranks.
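
The conditional aggregation idea can be checked against the sample data with a small sketch in Python and SQLite. It assumes the deltas are measured as rank-5 for prop 1 and rank-8 for prop 3, and that rows for other props produce NULL rather than 0 so they cannot satisfy a requirement by accident; the aliases delta1 and delta3 are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE matches (name TEXT, prop INTEGER, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [("carl", 1, 4), ("carl", 2, 3), ("carl", 3, 9),
     ("alex", 1, 8), ("alex", 2, 5), ("alex", 3, 6), ("alex", 3, 8),
     ("alex", 2, 11),
     ("anna", 3, 8), ("anna", 3, 13), ("anna", 2, 14)],
)

candidates = conn.execute("""
    SELECT name, delta1, delta3
    FROM (
        SELECT name,
               -- CASE without ELSE yields NULL, which MAX skips
               MAX(CASE WHEN prop = 1 THEN "rank" - 5 END) AS delta1,
               MAX(CASE WHEN prop = 3 THEN "rank" - 8 END) AS delta3
        FROM matches
        GROUP BY name
    )
    WHERE delta1 > 0      -- prop 1: rank > 5
      AND delta3 >= 0     -- prop 3: rank >= 8
    ORDER BY delta1 + delta3 DESC
""").fetchall()

print(candidates)  # [('alex', 3, 0)]
```

carl is dropped because his best prop 1 rank is 4, and anna because she has no prop 1 row at all, so her delta1 is NULL and the comparison is not true.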

A typical case of relational division. We assembled a whole arsenal of techniques under this related question:
How to filter SQL results in a has-many-through relation
Assuming you want the minimum rank of a person, I might solve your particular case with LEAST():
SELECT m1.name, LEAST(m1.rank, m2.rank, ...) AS best_rank
FROM matches m1
JOIN matches m2 USING (name)
...
WHERE m1.prop = 1 AND m1.rank > 5
AND m2.prop = 3 AND m2.rank >= 8
...
ORDER BY best_rank;
Also assuming name to be unique per individual person. You'd probably use some kind of foreign key to a pk column of a person table in reality.
And if you have such a person table like you should, the best rank would be stored in a column there ...

If I understand your question, then you just need to run the following query:
SELECT * FROM matches where (prop = 1 AND rank > 5) OR (prop = 3 AND rank >= 8) ORDER BY rank
It gives you the candidates that have either prop=1 and rank > 5 or prop=3 and rank >= 8, sorted by their rankings.

Sort by data from multiple columns

For customer reviews on my products, I have them stored in SQL something like the below:
durability | cost | appearance
----------------------------------
5 | 3 | 4
2 | 4 | 2
1 | 5 | 5
Each value is an out of five score in the three categories.
When I want to print this information on page, I'd like to order them in descending order by the average score of an individual review.
SELECT *
FROM reviews
ORDER BY (durability+cost+appearance)/3 DESC
Obviously this doesn't work, but is there a way to get my result? I don't want to include an average column in SQL because outside of this one small application, it serves zero purpose.

Use ORDER BY instead of SORT BY:
SELECT *
FROM reviews
ORDER BY (durability+cost+appearance)/3 DESC
EDIT:
To see the order by value, try adding one more column in the select clause:
SELECT *,(durability+cost+appearance)/3 as OrderValue
FROM reviews
ORDER BY (durability+cost+appearance)/3 DESC
Sample output:
DURABILITY COST APPEARANCE ORDERVALUE
5 3 4 4
1 5 5 3
2 4 2 2
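
One detail worth noting in the sample output is that the averages are whole numbers, which suggests integer division. A minimal sketch in Python with SQLite (which also divides integers by integers without a fractional part) shows the same ordering with a fractional average by dividing by 3.0; the alias avg_score is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reviews (durability INTEGER, cost INTEGER, appearance INTEGER)"
)
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?, ?)",
    [(5, 3, 4), (2, 4, 2), (1, 5, 5)],
)

# Dividing by 3.0 forces floating-point division; the average is computed
# on the fly and never stored in the table.
rows = conn.execute(
    "SELECT durability, cost, appearance, "
    "       (durability + cost + appearance) / 3.0 AS avg_score "
    "FROM reviews ORDER BY avg_score DESC"
).fetchall()

for row in rows:
    print(row)
```

The row order matches the sample output above; only the last column changes from truncated integers to fractional averages.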

Problem with advanced distinct SQL query

OK, this one is really tricky :D
I have this table:
bills_products:
- bill_id - product_id - action -
| 1 | 4 | add |
| 1 | 5 | add |
| 2 | 4 | remove |
| 2 | 1 | add |
| 3 | 4 | add |
As you can see, the product with id 4 was added in bill 1, then removed in bill 2, and added again in bill 3.
All bills belong to a bill_group, but for simplicity's sake let's assume all the bills are in the same group.
Now I need a SQL query that shows all the products that are currently added in this group.
In this example that would be 5, 1 and 4. If we removed the bill with id 3, that would be 5 and 1.
I've tried to do this with DISTINCT, but it's not powerful enough or maybe I'm doing it wrong.

This seems to work in SQL Server at least:
select product_id
from (
select product_id,
sum((case when action='add' then 1 else -1 end)) as number
from bills_products
group by product_id
) as counts
where number > 0

SELECT DISTINCT product_id FROM bills_products WHERE action = 'add';

GSto almost had it, but you have to ORDER BY bill_id DESC to ensure you get the latest records.
SELECT DISTINCT product_id FROM bills_products
WHERE action = 'add'
ORDER BY bill_id DESC;
(P.S. I think most people would say it's a best practice to have a timestamp column on tables like this where you need to be able to know what the "newest" row is. You can't always rely on ids only ascending.)
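
A different way to express "latest record wins", sketched here in Python with SQLite, is to look up each product's highest bill_id and keep the product only if that row is an 'add'. This is not any of the answers above; it assumes that a higher bill_id means a newer bill and that a product appears at most once per bill. The helper name current_products is hypothetical.

```python
import sqlite3

def current_products(conn):
    """Products whose most recent row (highest bill_id) is an 'add'."""
    sql = """
        SELECT bp.product_id
        FROM bills_products AS bp
        JOIN (
            SELECT product_id, MAX(bill_id) AS last_bill
            FROM bills_products
            GROUP BY product_id
        ) AS latest
          ON latest.product_id = bp.product_id
         AND latest.last_bill = bp.bill_id
        WHERE bp."action" = 'add'
        ORDER BY bp.product_id
    """
    return [row[0] for row in conn.execute(sql)]

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE bills_products '
    '(bill_id INTEGER, product_id INTEGER, "action" TEXT)'
)
conn.executemany(
    "INSERT INTO bills_products VALUES (?, ?, ?)",
    [(1, 4, "add"), (1, 5, "add"), (2, 4, "remove"),
     (2, 1, "add"), (3, 4, "add")],
)

with_bill_3 = current_products(conn)
conn.execute("DELETE FROM bills_products WHERE bill_id = 3")
without_bill_3 = current_products(conn)

print(with_bill_3)     # [1, 4, 5]
print(without_bill_3)  # [1, 5]
```

Both results match the expectations stated in the question: products 5, 1 and 4 with all three bills, and 5 and 1 once bill 3 is gone.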
