Excluding only one MIN value on Oracle SQL - sql

I am trying to select all but the lowest value in a column (GameScore), but when there are two of this lowest value, my code excludes both (I know why it does this, I just don't know exactly how to correct it and include one of the two lowest values).
The code looks something like this:
SELECT Id, SUM(Score) / COUNT(Score) AS Score
FROM
(SELECT Id, Score
FROM GameScore
WHERE Game_No = 1
AND Score NOT IN
(SELECT MIN(Score)
FROM GameScore
WHERE Game_No = 1
GROUP BY Id))
GROUP BY Id
So if I am drawing from 5 values, but one of the rows only pulls 3 scores because the bottom two are the same, how do I include the 4th? Thanks.

In order to do this you have to separate them up somehow; your current issue is that the 2 lowest scores are the same so any (in)equality operation performed on either values treats the other one identically.
You could use something like the analytic query ROW_NUMBER() to uniquely identify rows:
select id, sum(score) / count(score) as score
from ( select id, score, row_number() over (order by score) as score_rank
from gamescore
where gameno = 1
)
where score_rank <> 1
group by id
ROW_NUMBER():
assigns a unique number to each row to which it is applied (either each row in the partition or each row returned by the query), in the ordered sequence of rows specified in the order_by_clause, beginning with 1.
As the ORDER BY clause is on SCORE in ascending order one of the lowest score will be removed. This will be a random value unless you add other tie-breaker conditions to the ORDER BY.

You can do this a few ways, including what #Ben shows. From a mostly SQL Server background I was curious if just ROWNUM could be used and found this piece on ROWNUM vs ROW_NUMBER interesting. I'm not sure if it is dated.
All in a SQLFiddle.
Note: I'm using a subquery factoring/CTE as I think the read more clearly than in-line subqueries.
Using ROWNUM:
WITH OrderedScore AS (
SELECT id, game_no, score
,rownum as score_rank
FROM GameScore
WHERE game_no = 1
ORDER BY Score ASC
)
SELECT id
,sum(score)/count(score)
FROM OrderedScore
WHERE score_rank > 1
GROUP BY id;
Using ROW_NUMBER() OVER(ORDER BY...) as Ben does:
WITH OrderedScore AS (
SELECT id, game_no, score
,ROW_NUMBER() OVER(ORDER BY score ASC) as score_rank
FROM GameScore
WHERE game_no = 1
ORDER BY Score ASC
)
SELECT id
,sum(score)/count(score)
FROM OrderedScore
WHERE score_rank > 1
GROUP BY id;
Using ROW_NUMBER() OVER(PARTION BY...ORDER BY...) which I think leads to more flexibility if you want to remove the low score by game_no or id at some point:
WITH OrderedScore AS (
SELECT id, game_no, score
,ROW_NUMBER() OVER(PARTITION BY id ORDER BY score ASC) as score_rank
FROM GameScore
WHERE game_no = 1
ORDER BY Score ASC
)
SELECT id
,sum(score)/count(score)
FROM OrderedScore
WHERE score_rank > 1
GROUP BY id;

Related

Complex Ranking in SQL (Teradata)

I have a peculiar problem at hand. I need to rank in the following manner:
Each ID gets a new rank.
rank #1 is assigned to the ID with the lowest date. However, the subsequent dates for that particular ID can be higher but they will get the incremental rank w.r.t other IDs.
(E.g. ADF32 series will be considered to be ranked first as it had the lowest date, although it ends with dates 09-Nov, and RT659 starts with 13-Aug it will be ranked subsequently)
For a particular ID, if the days are consecutive then ranks are same, else they add by 1.
For a particular ID, ranks are given in date ASC.
How to formulate a query?
You need two steps:
select
id_col
,dt_col
,dense_rank()
over (order by min_dt, id_col, dt_col - rnk) as part_col
from
(
select
id_col
,dt_col
,min(dt_col)
over (partition by id_col) as min_dt
,rank()
over (partition by id_col
order by dt_col) as rnk
from tab
) as dt
dt_col - rnk caluclates the same result for consecutives dates -> same rank
Try datediff on lead/lag and then perform partitioned ranking
select t.ID_COL,t.dt_col,
rank() over(partition by t.ID_COL, t.date_diff order by t.dt_col desc) as rankk
from ( SELECT ID_COL,dt_col,
DATEDIFF(day, Lag(dt_col, 1) OVER(ORDER BY dt_col),dt_col) as date_diff FROM table1 ) t
One way to think about this problem is "when to add 1 to the rank". Well, that occurs when the previous value on a row with the same id_col differs by more than one day. Or when the row is the earliest day for an id.
This turns the problem into a cumulative sum:
select t.*,
sum(case when prev_dt_col = dt_col - 1 then 0 else 1
end) over
(order by min_dt_col, id_col, dt_col) as ranking
from (select t.*,
lag(dt_col) over (partition by id_col order by dt_col) as prev_dt_col,
min(dt_col) over (partition by id_col) as min_dt_col
from t
) t;

Multiple maximum values

So I am on SQLite and have the following table and need to print the two names to which correspond the highest number of wins(25). By using the MAX() function, I only get the first of the two rows. How is it possible to print both rows that have the maximum value for wins?
If you query this way, it will return the highest two rows.
SELECT
name,
Wins
FROM
table_name
ORDER BY Wins DESC
LIMIT 2;
Use RANK() window function:
SELECT name, Wins
FROM (
SELECT *, RANK() OVER(ORDER BY Wins DESC) rnk
FROM tablename
)
WHERE rnk = 1
This will return all the rows which will have the max number of Wins because all of them will be ranked as 1.
if you dont mind nested selects I think this should be getting all records have max wins
select name, wins from table where wins=(select max(wins) from table )
All names with top 2 score
WITH maxw (wins) AS(
SELECT DISTINCT wins
FROM tbl
ORDER BY wins DESC
LIMIT 2
)
SELECT tbl.*
FROM tbl
JOIN maxw ON tbl.wins = maxw.wins;

How do I create a new SQL table with custom column names and populate these columns

So I currently have an SQL statement that generates a table with the most frequent occurring value as well as the least frequent occurring value in a table. However this table has 2 rows with the row values as well as the fields. I need to create a custom table with 2 columns with min and max. Then have one row with one value for each. The value for these columns needs to be from the same row.
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency DESC limit 1)
UNION
(SELECT name, COUNT(name) AS frequency
FROM firefighter_certifications
GROUP BY name
ORDER BY frequency ASC limit 1);
So for the above query I would need the names of the min and max values in one row. I need to be able to define the name of new columns for the generated SQL query as well.
Min_Name | Max_Name
Certif_1 | Certif_2
I think this query should give you the results you want. It ranks each name according to the number of times it appears in the table, then uses conditional aggregation to select the min and max frequency names in one row:
with cte as (
select name,
row_number() over (order by count(*) desc) as maxr,
row_number() over (order by count(*)) as minr
from firefighter_certifications
group by name
)
select max(case when minr = 1 then name end) as Min_Name,
max(case when maxr = 1 then name end) as Max_Name
from cte
Postgres doesn't offer "first" and "last" aggregation functions. But there are other, similar methods:
select distinct first_value(name) over (order by cnt desc, name) as name_at_max,
first_value(name) over (order by cnt asc, name) as name_at_min
from (select name, count(*) as cnt
from firefighter_certifications
group by name
) n;
Or without any subquery at all:
select first_value(name) over (order by count(*) desc, name) as name_at_max,
first_value(name) over (order by count(*) asc, name) as name_at_min
from firefighter_certifications
group by name
limit 1;
Here is a db<>fiddle

Aggregate function like MAX for most common cell in column?

Group by the highest Number in a column worked great with MAX(), but what if I would like to get the cell that is at most common.
As example:
ID
100
250
250
300
200
250
So I would like to group by ID and instead of get the lowest (MIN) or highest (MAX) number, I would like to get the most common one (that would be 250, because there 3x).
Is there an easy way in SQL Server 2012 or am I forced to add a second SELECT where I COUNT(DISTINCT ID) and add that somehow to my first SELECT statement?
You can use dense_rank to return all the id's with the highest counts. This would handle cases when there are ties for the highest counts as well.
select id from
(select id, dense_rank() over(order by count(*) desc) as rnk from tablename group by id) t
where rnk = 1
A simple way to do what you want uses top and order by:
SELECT top 1 id
FROM t
GROUP BY id
ORDER BY COUNT(*) DESC;
This is a statistic called the mode. Getting the mode and max is a bit challenging in SQL Server. I would approach it as:
WITH cte AS (
SELECT t.id, COUNT(*) AS cnt,
row_number() OVER (ORDER BY COUNT(*) DESC) AS seqnum
FROM t
GROUP BY id
)
SELECT MAX(id) AS themax, MAX(CASE WHEN seqnum = 1 THEN id END) AS MODE
FROM cte;

oracle sql wih rownum <=

why below query is not giving results if I remove the < sign from query.Because even without < it must match with results?
Query used to get second max id value:
select min(id)
from(
select distinct id
from student
order by id desc
)
where rownum <=2
student id
1
2
3
4
Rownum has a special meaning in Oracle. It is increased with every row, but the optimizer knows that is increasing continuously and all consecutive rows must met the rownum condition. So if you specify rownum = 2 it will never occur since the first row is already rejected.
You can see this very nice if you do an explain plan on your query. It will show something like:
Plan for rownum <=:
COUNT STOPKEY
Plan for rownum =:
FILTER
A ROWNUM value is not assigned permanently to a row (this is a common misconception). A row in a table does not have a number; you cannot ask for row 2 or 3 from a table
click Here for more Info.
This is from the link provided:
Also confusing to many people is when a ROWNUM value is actually assigned. A ROWNUM value is assigned to a row after it passes the predicate phase of the query but before the query does any sorting or aggregation. Also, a ROWNUM value is incremented only after it is assigned, which is why the following query will never return a row:
select *
from t
where ROWNUM > 1;
Because ROWNUM > 1 is not true for the first row, ROWNUM does not advance to 2. Hence, no ROWNUM value ever gets to be greater than 1. Consider a query with this structure:
select ..., ROWNUM
from t
where <where clause>
group by <columns>
having <having clause>
order by <columns>;
I think this is the query you are looking for:
select id
from (select distinct id
from student
order by id desc
) t
where rownum <= 2;
Oracle processes the rownum before the order by, so you need a subquery to get the first two rows. The min() was forcing an aggregation that returned only one result, but before the rownum was applied.
If you actually want only the second value, you need an additional layer of subqueries:
select min(id)
from (select id
from (select distinct id
from student
order by id desc
) t
where rownum <= 2
) t;
However, I would do:
select id
from (select id, dense_rank() over (order by id) as seqnum
from student
) t
where seqnum = 2;
Order asc instead of desc
select id from student where rownum <=2 order by id asc;
Why not just use
select id
from ( select distinct id
, row_number() over (order by id desc) x
from student
)
where x = 2
Or even really bad. Getting the count and index :)
select id
from ( select id
, row_number() over (order by id desc) idx
, sum(1) over (order by null) cnt
from student
group
by id
)
where idx = cnt - 1 -- get the pre-last
Or
where idx = cnt - 2 -- get the 2nd-last
Or
where idx = 3 -- get the 3rd
Try this
SELECT *
FROM (
SELECT id, row_number() over (order by id asc) row_num
FROM student
) AS T
WHERE row_num = 2 -- or 3 ... n
ROW_NUMBER