MySQL get row position in ORDER BY - sql

With the following MySQL table:
+-----------------------------+
+ id INT UNSIGNED +
+ name VARCHAR(100) +
+-----------------------------+
How can I select a single row AND its position amongst the other rows in the table, when sorted by name ASC. So if the table data looks like this, when sorted by name:
+-----------------------------+
+ id | name +
+-----------------------------+
+ 5 | Alpha +
+ 7 | Beta +
+ 3 | Delta +
+ ..... +
+ 1 | Zed +
+-----------------------------+
How could I select the Beta row getting the current position of that row? The result set I'm looking for would be something like this:
+-----------------------------+
+ id | position | name +
+-----------------------------+
+ 7 | 2 | Beta +
+-----------------------------+
I can do a simple SELECT * FROM tbl ORDER BY name ASC then enumerate the rows in PHP, but it seems wasteful to load a potentially large resultset just for a single row.

Use this:
SELECT x.id,
x.position,
x.name
FROM (SELECT t.id,
t.name,
#rownum := #rownum + 1 AS position
FROM TABLE t
JOIN (SELECT #rownum := 0) r
ORDER BY t.name) x
WHERE x.name = 'Beta'
...to get a unique position value. This:
SELECT t.id,
(SELECT COUNT(*)
FROM TABLE x
WHERE x.name <= t.name) AS position,
t.name
FROM TABLE t
WHERE t.name = 'Beta'
...will give ties the same value. IE: If there are two values at second place, they'll both have a position of 2 when the first query will give a position of 2 to one of them, and 3 to the other...

This is the only way that I can think of:
SELECT `id`,
(SELECT COUNT(*) FROM `table` WHERE `name` <= 'Beta') AS `position`,
`name`
FROM `table`
WHERE `name` = 'Beta'

If the query is simple and the size of returned result set is potentially large, then you may try to split it into two queries.
The first query with a narrow-down filtering criteria just to retrieve data of that row, and the second query uses COUNT with WHERE clause to calculate the position.
For example in your case
Query 1:
SELECT * FROM tbl WHERE name = 'Beta'
Query 2:
SELECT COUNT(1) FROM tbl WHERE name >= 'Beta'
We use this approach in a table with 2M record and this is way more scalable than OMG Ponies's approach.

The other answers seem too complicated for me.
Here comes an easy example, let's say you have a table with columns:
userid | points
and you want to sort the userids by points and get the row position (the "ranking" of the user), then you use:
SET #row_number = 0;
SELECT
(#row_number:=#row_number + 1) AS num, userid, points
FROM
ourtable
ORDER BY points DESC
num gives you the row postion (ranking).
If you have MySQL 8.0+ then you might want to use ROW_NUMBER()

The position of a row in the table represents how many rows are "better" than the targeted row.
So, you must count those rows.
SELECT COUNT(*)+1 FROM table WHERE name<'Beta'
In case of a tie, the highest position is returned.
If you add another row with same name of "Beta" after the existing "Beta" row, then the position returned would be still 2, as they would share same place in the classification.
Hope this helps people that will search for something similar in the future, as I believe that the question owner already solved his issue.

I've got a very very similar issue, that's why I won't ask the same question, but I will share here what did I do, I had to use also a group by, and order by AVG.
There are students, with signatures and socore, and I had to rank them (in other words, I first calc the AVG, then order them in DESC, and then finally I needed to add the position (rank for me), So I did something Very similar as the best answer here, with a little changes that adjust to my problem):
I put finally the position (rank for me) column in the external SELECT
SET #rank=0;
SELECT #rank := #rank + 1 AS ranking, t.avg, t.name
FROM(SELECT avg(students_signatures.score) as avg, students.name as name
FROM alumnos_materia
JOIN (SELECT #rownum := 0) r
left JOIN students ON students.id=students_signatures.id_student
GROUP BY students.name order by avg DESC) t

I was going through the accepted answer and it seemed bit complicated so here is the simplified version of it.
SELECT t,COUNT(*) AS position FROM t
WHERE name <= 'search string' ORDER BY name

I have similar types of problem where I require rank(Index) of table order by votes desc. The following works fine with for me.
Select *, ROW_NUMBER() OVER(ORDER BY votes DESC) as "rank"
From "category_model"
where ("model_type" = ? and "category_id" = ?)

may be what you need is with add syntax
LIMIT
so use
SELECT * FROM tbl ORDER BY name ASC LIMIT 1
if you just need one row..

Related

Ranking players in SQL database will return always 1 when where id_user is used

I basically have a table "race" with columns for "id_race", "id_user" and columns for user predictions "pole_position", "1st", "2nd", "3rd" and "fastest_lap". In addition to those columns, each prediction column also has a control column such as "PPC", "1eC", "2eC", "3eC" and "srC". Those control columns are then compared by a query against a "result" table. Then the control columns in race are awarded points for a correct prediction.
table race
I want to add up those results per user and then rank them per user. I want to show that rank on the player's user page. I have a query for my SQL which works fine in itself and gives me a list with rank column.
SELECT
#rownum := #rownum +1 AS rank,
total,
id_user
FROM
(SELECT
SUM(PPC + 1eC + 2eC + 3eC + srC ) AS total,
id_user
FROM
race
GROUP BY
id_user
ORDER BY
total DESC) T,
(SELECT #rownum := 0) a;
Output of rank query:
However when I add the where id_user it always gets the first rank. Does anyone have an idea if this can be solved and how I could achieve it to add where to my rank query?
I've already tried filtering. In addition, I have tried to use the Row_number function. It also always gives a result of 1 because only 1 user remains after filtering. I am unable to filter out the correct position. So please help!
You have to create a view to extracting the correct rank. Once you use WHERE clause, you will get the rank based on the population rather that the subset.
Please find an indicative answer on fiddle where a CTE and ROW function are used. The indicative code is:
WITH sum_cte AS (
SELECT ROW_NUMBER() OVER(ORDER BY SUM(PPC + 1EC + 2eC + 3eC + srC) DESC) AS Row,
id_user,
SUM(PPC + 1EC + 2eC + 3eC + srC) AS total_sum
FROM race
GROUP BY id_user)
SELECT Row, id_user, total_sum
FROM sum_cte
WHERE id_user = 1
User 1 with the second score will appear with a row valuation 2.

Substring in a column

I have a column that has several items in which I need to count the times it is called, my column table looks something like this:
Table Example
Id_TR Triggered
-------------- ------------------
A1_6547 R1:23;R2:0;R4:9000
A2_1235 R2:0;R2:100;R3:-100
A3_5436 R1:23;R2:100;R4:9000
A4_1245 R2:0;R5:150
And I would like the result to be like this:
Expected Results
Triggered Count(1)
--------------- --------
R1:23 2
R2:0 3
R2:100 2
R3:-100 1
R4:9000 2
R5:150 1
I've tried to do some substring, but cant seem to find how to solve this problem. Can anyone help?
This solution is X3 times faster than the CONNECT BY solution
performance: 15K records per second
with cte (token,suffix)
as
(
select substr(triggered||';',1,instr(triggered,';')-1) as token
,substr(triggered||';',instr(triggered,';')+1) as suffix
from t
union all
select substr(suffix,1,instr(suffix,';')-1) as token
,substr(suffix,instr(suffix,';')+1) as suffix
from cte
where suffix is not null
)
select token,count(*)
from cte
group by token
;
with x as (
select listagg(Triggered, ';') within group (order by Id_TR) str from table
)
select regexp_substr(str,'[^;]+',1,level) element, count(*)
from x
connect by level <= length(regexp_replace(str,'[^;]+')) + 1
group by regexp_substr(str,'[^;]+',1,level);
First concatenate all values of triggered into one list using listagg then parse it and do group by.
Another methods of parsing list you can find here or here
This is a fair solution.
performance: 5K records per second
select triggered
,count(*) as cnt
from (select id_tr
,regexp_substr(triggered,'[^;]+',1,level) as triggered
from t
connect by id_tr = prior id_tr
and level <= regexp_count(triggered,';')+1
and prior sys_guid() is not null
) t
group by triggered
;
This is just for learning purposes.
Check my other solutions.
performance: 1K records per second
select x.triggered
,count(*)
from t
,xmltable
(
'/r/x'
passing xmltype('<r><x>' || replace(triggered,';', '</x><x>') || '</x></r>')
columns triggered varchar(100) path '.'
) x
group by x.triggered
;

Returning the lowest integer not in a list in SQL

Supposed you have a table T(A) with only positive integers allowed, like:
1,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18
In the above example, the result is 10. We always can use ORDER BY and DISTINCT to sort and remove duplicates. However, to find the lowest integer not in the list, I came up with the following SQL query:
select list.x + 1
from (select x from (select distinct a as x from T order by a)) as list, T
where list.x + 1 not in T limit 1;
My idea is start a counter and 1, check if that counter is in list: if it is, return it, otherwise increment and look again. However, I have to start that counter as 1, and then increment. That query works most of the cases, by there are some corner cases like in 1. How can I accomplish that in SQL or should I go about a completely different direction to solve this problem?
Because SQL works on sets, the intermediate SELECT DISTINCT a AS x FROM t ORDER BY a is redundant.
The basic technique of looking for a gap in a column of integers is to find where the current entry plus 1 does not exist. This requires a self-join of some sort.
Your query is not far off, but I think it can be simplified to:
SELECT MIN(a) + 1
FROM t
WHERE a + 1 NOT IN (SELECT a FROM t)
The NOT IN acts as a sort of self-join. This won't produce anything from an empty table, but should be OK otherwise.
SQL Fiddle
select min(y.a) as a
from
t x
right join
(
select a + 1 as a from t
union
select 1
) y on y.a = x.a
where x.a is null
It will work even in an empty table
SELECT min(t.a) - 1
FROM t
LEFT JOIN t t1 ON t1.a = t.a - 1
WHERE t1.a IS NULL
AND t.a > 1; -- exclude 0
This finds the smallest number greater than 1, where the next-smaller number is not in the same table. That missing number is returned.
This works even for a missing 1. There are multiple answers checking in the opposite direction. All of them would fail with a missing 1.
SQL Fiddle.
You can do the following, although you may also want to define a range - in which case you might need a couple of UNIONs
SELECT x.id+1
FROM my_table x
LEFT
JOIN my_table y
ON x.id+1 = y.id
WHERE y.id IS NULL
ORDER
BY x.id LIMIT 1;
You can always create a table with all of the numbers from 1 to X and then join that table with the table you are comparing. Then just find the TOP value in your SELECT statement that isn't present in the table you are comparing
SELECT TOP 1 table_with_all_numbers.number, table_with_missing_numbers.number
FROM table_with_all_numbers
LEFT JOIN table_with_missing_numbers
ON table_with_missing_numbers.number = table_with_all_numbers.number
WHERE table_with_missing_numbers.number IS NULL
ORDER BY table_with_all_numbers.number ASC;
In SQLite 3.8.3 or later, you can use a recursive common table expression to create a counter.
Here, we stop counting when we find a value not in the table:
WITH RECURSIVE counter(c) AS (
SELECT 1
UNION ALL
SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
(This works for an empty table or a missing 1.)
This query ranks (starting from rank 1) each distinct number in ascending order and selects the lowest rank that's less than its number. If no rank is lower than its number (i.e. there are no gaps in the table) the query returns the max number + 1.
select coalesce(min(number),1) from (
select min(cnt) number
from (
select
number,
(select count(*) from (select distinct number from numbers) b where b.number <= a.number) as cnt
from (select distinct number from numbers) a
) t1 where number > cnt
union
select max(number) + 1 number from numbers
) t1
http://sqlfiddle.com/#!7/720cc/3
Just another method, using EXCEPT this time:
SELECT a + 1 AS missing FROM T
EXCEPT
SELECT a FROM T
ORDER BY missing
LIMIT 1;

How can I select adjacent rows to an arbitrary row (in sql or postgresql)?

I want to select some rows based on certain criteria, and then take one entry from that set and the 5 rows before it and after it.
Now, I can do this numerically if there is a primary key on the table, (e.g. primary keys that are numerically 5 less than the target row's key and 5 more than the target row's key).
So select the row with the primary key of 7 and the nearby rows:
select primary_key from table where primary_key > (7-5) order by primary_key limit 11;
2
3
4
5
6
-=7=-
8
9
10
11
12
But if I select only certain rows to begin with, I lose that numeric method of using primary keys (and that was assuming the keys didn't have any gaps in their order anyway), and need another way to get the closest rows before and after a certain targeted row.
The primary key output of such a select might look more random and thus less succeptable to mathematical locating (since some results would be filtered, out, e.g. with a where active=1):
select primary_key from table where primary_key > (34-5)
order by primary_key where active=1 limit 11;
30
-=34=-
80
83
100
113
125
126
127
128
129
Note how due to the gaps in the primary keys caused by the example where condition (for example becaseu there are many inactive items), I'm no longer getting the closest 5 above and 5 below, instead I'm getting the closest 1 below and the closest 9 above, instead.
There's a lot of ways to do it if you run two queries with a programming language, but here's one way to do it in one SQL query:
(SELECT * FROM table WHERE id >= 34 AND active = 1 ORDER BY id ASC LIMIT 6)
UNION
(SELECT * FROM table WHERE id < 34 AND active = 1 ORDER BY id DESC LIMIT 5)
ORDER BY id ASC
This would return the 5 rows above, the target row, and 5 rows below.
Here's another way to do it with analytic functions lead and lag. It would be nice if we could use analytic functions in the WHERE clause. So instead you need to use subqueries or CTE's. Here's an example that will work with the pagila sample database.
WITH base AS (
SELECT lag(customer_id, 5) OVER (ORDER BY customer_id) lag,
lead(customer_id, 5) OVER (ORDER BY customer_id) lead,
c.*
FROM customer c
WHERE c.active = 1
AND c.last_name LIKE 'B%'
)
SELECT base.* FROM base
JOIN (
-- Select the center row, coalesce so it still works if there aren't
-- 5 rows in front or behind
SELECT COALESCE(lag, 0) AS lag, COALESCE(lead, 99999) AS lead
FROM base WHERE customer_id = 280
) sub ON base.customer_id BETWEEN sub.lag AND sub.lead
The problem with sgriffinusa's solution is that you don't know which row_number your center row will end up being. He assumed it will be row 30.
For similar query I use analytic functions without CTE. Something like:
select ...,
LEAD(gm.id) OVER (ORDER BY Cit DESC) as leadId,
LEAD(gm.id, 2) OVER (ORDER BY Cit DESC) as leadId2,
LAG(gm.id) OVER (ORDER BY Cit DESC) as lagId,
LAG(gm.id, 2) OVER (ORDER BY Cit DESC) as lagId2
...
where id = 25912
or leadId = 25912 or leadId2 = 25912
or lagId = 25912 or lagId2 = 25912
such query works more faster for me than CTE with join (answer from Scott Bailey). But of course less elegant
You could do this utilizing row_number() (available as of 8.4). This may not be the correct syntax (not familiar with postgresql), but hopefully the idea will be illustrated:
SELECT *
FROM (SELECT ROW_NUMBER() OVER (ORDER BY primary_key) AS r, *
FROM table
WHERE active=1) t
WHERE 25 < r and r < 35
This will generate a first column having sequential numbers. You can use this to identify the single row and the rows above and below it.
If you wanted to do it in a 'relationally pure' way, you could write a query that sorted and numbered the rows. Like:
select (
select count(*) from employees b
where b.name < a.name
) as idx, name
from employees a
order by name
Then use that as a common table expression. Write a select which filters it down to the rows you're interested in, then join it back onto itself using a criterion that the index of the right-hand copy of the table is no more than k larger or smaller than the index of the row on the left. Project over just the rows on the right. Like:
with numbered_emps as (
select (
select count(*)
from employees b
where b.name < a.name
) as idx, name
from employees a
order by name
)
select b.*
from numbered_emps a, numbered_emps b
where a.name like '% Smith' -- this is your main selection criterion
and ((b.idx - a.idx) between -5 and 5) -- this is your adjacency fuzzy-join criterion
What could be simpler!
I'd imagine the row-number based solutions will be faster, though.

How to select a row for certain (or give preference in the selection) in mysql?

Need your help guys in forming a query.
Example.
Company - Car Rental
Table - Cars
ID NAME STATUS
1 Mercedes Showroom
2 Mercedes On-Road
Now, how do I select only one entry from this table which satisfies the below conditions?
If Mercedes is available in Showroom, then fetch only that row. (i.e. row 1 in above example)
But If none of the Mercedes are available in the showroom, then fetch any one of the rows. (i.e. row 1 or row 2) - (This is just to say that all the mercedes are on-road)
Using distinct ain't helping here as the ID's are also fetched in the select statement
Thanks!
Here's a common way of solving that problem:
SELECT *,
CASE STATUS
WHEN 'Showroom' THEN 0
ELSE 1
END AS InShowRoom
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY InShowRoom
LIMIT 1
Here's how to get all the cars, which also shows another way to solve the problem:
SELECT ID, NAME, IFNULL(c2.STATUS, c1.STATUS)
FROM Cars c1
LEFT OUTER JOIN Cars c2
ON c2.NAME = c1.NAME AND c2.STATUS = 'Showroom'
GROUP BY NAME
ORDER BY NAME
You would want to use the FIND_IN_SET() function to do that.
SELECT *
FROM Cars
WHERE NAME = 'Mercedes'
ORDER BY FIND_IN_SET(`STATUS`,'Showroom') DESC
LIMIT 1
If you have a preferred order of other statuses, just add them to the second parameter.
ORDER BY FIND_IN_SET(`STATUS`,'On-Road,Showroom' ) DESC
To fetch 'best' status for all cars you can simply do this:
SELECT *
FROM Cars
GROUP BY NAME
ORDER BY FIND_IN_SET(`STATUS`,'Showroom') DESC
SELECT * FROM cars
WHERE name = 'Mercedes'
AND status = 'Showroom'
UNION SELECT * FROM cars
WHERE name = 'Mercedes'
LIMIT 1;
EDIT Removed the ALL on the UNION since we only want distinct rows anyway.
MySQL doesn't have ranking/analytic/windowing functions, but you can use a variable to simulate ROW_NUMBER functionality (when you see "--", it's a comment):
SELECT x.id, x.name, x.status
FROM (SELECT t.id,
t.name,
t.status,
CASE
WHEN #car_name != t.name THEN #rownum := 1 -- reset on diff name
ELSE #rownum := #rownum + 1
END AS rank,
#car_name := t.name -- necessary to set #car_name for the comparison
FROM CARS t
JOIN (SELECT #rownum := NULL, #car_name := '') r
ORDER BY t.name, t.status DESC) x --ORDER BY is necessary for rank value
WHERE x.rank = 1
Ordering by status DESC means that "Showroom" will be at the top of the list, so it'll be ranked as 1. If the car name doesn't have a "Showroom" status, the row ranked as 1 will be whatever status comes after "Showroom". The WHERE clause will only return the first row for each car in the table.
The status being a text based data type tells me your data is not normalized - I could add records with "Showroom", "SHOWroom", and "showROOM". They'd be valid, but you're looking at using functions like LOWER & UPPER when you are grouping things for counting, sum, etc. The use of functions would also render an index on the column useless... You'll want to consider making a CAR_STATUS_TYPE_CODE table, and use a foreign key relationship to make sure bad data doesn't get into your table:
DROP TABLE IF EXISTS `example`.`car_status_type_code`;
CREATE TABLE `example`.`car_status_type_code` (
`car_status_type_code_id` int(10) unsigned NOT NULL auto_increment,
`description` varchar(45) NOT NULL default '',
PRIMARY KEY (`car_status_type_code_id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;