SQL: set increasing integers where the value of a column is 1

I have a data set which looks like:
Id INT,
Choice VARCHAR,
Order INT
Id + Choice form the primary key.
Currently a lot of the rows have Order = 1.
What I would like to do is, for each Id, if there are multiple rows with that Id where Order = 1, set them to be 1, 2, 3, 4, etc.
I can't work out the SQL to do this.
Example data:
+----+--------+-------+
| Id | Choice | Order |
+----+--------+-------+
| 4 | hello | 1 |
| 4 | world | 1 |
| 4 | test | 1 |
+----+--------+-------+
Would become:
+----+--------+-------+
| Id | Choice | Order |
+----+--------+-------+
| 4 | hello | 1 |
| 4 | world | 2 |
| 4 | test | 3 |
+----+--------+-------+

We can try using ROW_NUMBER here with a partition by Id. As for the ordering in your Order column, I don't see any logic present for how you numbered things. In the absence of this, I use the Choice column to decide how to order the row numbering.
WITH cte AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Choice) rn
FROM yourTable
WHERE [Order] = 1
)
UPDATE cte
SET [Order] = rn;
Note: Please avoid naming your columns (tables, etc.) using reserved SQL keywords like ORDER. You will forever have to put that column name in square brackets, like this: [Order].
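Updating through a CTE as shown above is SQL Server syntax. As a minimal sketch of the same ROW_NUMBER idea elsewhere, the snippet below uses Python's built-in sqlite3 module (assuming the bundled SQLite is 3.25 or newer, which is when window functions arrived). It computes the row numbers in a SELECT, then writes them back with a plain keyed UPDATE. The extra row for Id 5 is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    'CREATE TABLE yourTable (Id INT, Choice TEXT, "Order" INT, '
    "PRIMARY KEY (Id, Choice))"
)
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [(4, "hello", 1), (4, "world", 1), (4, "test", 1), (5, "solo", 1)],
)

# Number the rows that currently have Order = 1, restarting at 1 for each Id.
# Choice decides the numbering because the question gives no other ordering.
numbered = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Choice) AS rn, Id, Choice
    FROM yourTable
    WHERE "Order" = 1
    """
).fetchall()

# Write the numbers back, addressing each row by its primary key (Id, Choice).
conn.executemany(
    'UPDATE yourTable SET "Order" = ? WHERE Id = ? AND Choice = ?', numbered
)

result = conn.execute(
    'SELECT Id, Choice, "Order" FROM yourTable ORDER BY Id, "Order"'
).fetchall()
print(result)
```

Because the numbering follows Choice alphabetically, test ends up before world here, which differs from the order shown in the question's expected output.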

Related

Oracle SQL: Counting how often an attribute occurs for a given entry and choosing the attribute with the maximum number of occurs

I have a table that has a number column and an attribute column like this:
1.
+-----+-----+
| num | att |
-------------
| 1 | a |
| 1 | b |
| 1 | a |
| 2 | a |
| 2 | b |
| 2 | b |
+------------
I want to make the number unique, and the attribute to be whichever attribute occurred most often for that number, like this (this is the end product I'm interested in):
2.
+-----+-----+
| num | att |
-------------
| 1 | a |
| 2 | b |
+------------
I have been working on this for a while and managed to write myself a query that looks up how many times an attribute occurs for a given number like this:
3.
+-----+-----+-----+
| num | att |count|
------------------+
| 1 | a | 2 |
| 1 | b | 1 |
| 2 | a | 1 |
| 2 | b | 2 |
+-----------------+
But I can't think of a way to only select those rows from the above table where the count is the highest (for each number of course).
So basically, what I am asking is: given table 3, how do I select only the rows with the highest count for each number? (Of course, an answer describing a way to get from table 1 to table 2 directly also works. :) )
You can use aggregation and window functions:
select num, att
from (
select num, att, row_number() over(partition by num order by count(*) desc, att) rn
from mytable
group by num, att
) t
where rn = 1
For each num, this brings the most frequent att; if there are ties, the smaller att is retained.
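As a rough check of this approach, here is a small sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions) with the data from table 1. To stay on the safe side across engines, the count is computed in an inner query and ranked in an outer one rather than ordering by count(*) directly. The names cnt and c are illustrative aliases.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (num INT, att TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "a"), (1, "b"), (1, "a"), (2, "a"), (2, "b"), (2, "b")],
)

# Step 1: how often each attribute occurs per number (the question's table 3).
counts = conn.execute(
    "SELECT num, att, COUNT(*) FROM mytable GROUP BY num, att ORDER BY num, att"
).fetchall()

# Step 2: keep only the most frequent attribute per number.
# Ties are broken by taking the smaller att.
modes = conn.execute(
    """
    SELECT num, att
    FROM (
        SELECT num, att,
               ROW_NUMBER() OVER (PARTITION BY num ORDER BY cnt DESC, att) AS rn
        FROM (SELECT num, att, COUNT(*) AS cnt FROM mytable GROUP BY num, att) AS c
    ) AS t
    WHERE rn = 1
    ORDER BY num
    """
).fetchall()

print(counts)
print(modes)
```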
Oracle has an aggregation function that does this, stats_mode():
select num, stats_mode(att)
from t
group by num;
In statistics, the most common value is called the mode -- hence the name of the function.
You can use group by and count as below:
select id, col, count(col) as count
from
df_b_sql
group by id, col

Replacing numeric values in a column by their ranking relative to the other values in the column

Let's say I have a table that looks like this:
+---+--------------+------+
|NUM| NAME |POINTS|
+-------------------------+
| 1 | Peter | 92 |
| 1 | Rose | 93 |
| 1 | Karl | 94 |
| 2 | Frank | 15 |
| 2 | Sarah | 16 |
+-------------------------+
With the primary key being combination of NUM and NAME.
Now I would like to replace the numbers in POINTS with their ranking, starting by 1 for every num. I want to actually update the table.
Example:
+---+--------------+------+
|NUM| NAME |POINTS|
+-------------------------+
| 1 | Peter | 3 |
| 1 | Rose | 2 |
| 1 | Karl | 1 |
| 2 | Frank | 1 |
| 2 | Sarah | 2 |
+-------------------------+
What would be the best way of doing that?
If you want to actually change the values in the table, you can use a MERGE statement:
merge into the_table t
using (
select num, name,
dense_rank() over (partition by num order by points) as rnk
from the_table
) x on (x.num = t.num and x.name = t.name)
when matched then update
set points = x.rnk;
If you just want to display the values, use the inner select on its own:
select num, name,
dense_rank() over (partition by num order by points) as points
from the_table
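The sort direction inside the window matters here. In the question's example, NUM 1 looks ranked from highest to lowest points while NUM 2 looks ranked from lowest to highest, so it is not obvious which one is wanted. The sketch below, using Python's built-in sqlite3 module (assuming SQLite 3.25 or newer), shows both directions side by side. The helper name ranked is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE the_table (num INT, name TEXT, points INT, PRIMARY KEY (num, name))"
)
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?)",
    [(1, "Peter", 92), (1, "Rose", 93), (1, "Karl", 94),
     (2, "Frank", 15), (2, "Sarah", 16)],
)

def ranked(direction):
    """Return (num, name, rank) rows; direction must be 'ASC' or 'DESC'."""
    # The direction is interpolated into the SQL text, so whitelist it first.
    if direction not in ("ASC", "DESC"):
        raise ValueError("direction must be 'ASC' or 'DESC'")
    sql = f"""
        SELECT num, name,
               DENSE_RANK() OVER (PARTITION BY num ORDER BY points {direction}) AS rnk
        FROM the_table
        ORDER BY num, name
    """
    return conn.execute(sql).fetchall()

ascending = ranked("ASC")    # lowest points gets rank 1, as in the answer above
descending = ranked("DESC")  # highest points gets rank 1

print(ascending)
print(descending)
```

With ascending order Peter (92) gets rank 1 and Karl (94) gets rank 3; with descending order it is the other way around.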
I would add a ranking column to the table, then loop while any row still has no ranking.
On each pass, find the maximum points among the rows where ranking is null, and assign the next ranking to every row that has those points.
The next ranking is either a simple counter, if the points are unique, or the number of rows that already have a ranking plus one, recounted on every pass.
All this would look something like this:
DECLARE
temp_points people.points%TYPE;
temp_ranking PLS_INTEGER;
BEGIN
-- One pass per unranked row; note this ranks across the whole table
FOR r IN (SELECT * FROM people WHERE ranking IS NULL) LOOP
-- Highest points among the rows not ranked yet
SELECT MAX(points)
INTO temp_points
FROM people
WHERE ranking IS NULL;
-- Next ranking = number of rows already ranked + 1
SELECT COUNT(*)
INTO temp_ranking
FROM people
WHERE ranking IS NOT NULL;
temp_ranking := temp_ranking + 1;
UPDATE people
SET ranking = temp_ranking
WHERE points = temp_points;
END LOOP;
END;

SQL query to create ascending values within groups

I have the following table:
+----+--------+-----+
| id | fk_did | pos |
+----+--------+-----+
This table contains hundreds of rows, each of them referencing another table with fk_did. The value in pos is currently always zero which I want to change.
Basically, for each group of fk_did, the pos-column should start at zero and be ascending. It doesn't matter how the rows are ordered.
Example output (select * from table order by fk_did, pos) that I wanna get:
+----+--------+-----+
| id | fk_did | pos |
+----+--------+-----+
| xx | 0 | 0 |
| xx | 0 | 1 |
| xx | 0 | 2 |
| xx | 1 | 0 |
| xx | 1 | 1 |
| xx | 1 | 2 |
| xx | 4 | 0 |
| xx | 8 | 0 |
| xx | 8 | 1 |
| xx | 8 | 2 |
+----+--------+-----+
There must be no two rows that have the same combination of fk_did and pos
pos must be ascending for each fk_did
If there is a row with pos > 0, there must also be a row with the same fk_did and a lower pos.
Can this be done with a single update query?
You can do this using a window function:
update the_table
set pos = t.rn - 1
from (
select id,
row_number() over (partition by fk_did) as rn
from the_table
) t
where t.id = the_table.id;
The ordering of pos will be more or less random, as there is no order by, but you said that doesn't matter.
This assumes that id is unique. If it is not, you can use the internal column ctid instead.
If id is the PK of your table, then you can use the following query to update your table:
UPDATE mytable
SET pos = t.rn
FROM (
SELECT id, fk_did, pos,
ROW_NUMBER() OVER (PARTITION BY fk_did ORDER BY id) - 1 AS rn
FROM mytable) AS t
WHERE mytable.id = t.id
The ROW_NUMBER window function, used with a PARTITION BY clause, generates sequence numbers starting from 1 for each fk_did slice; subtracting 1 makes pos start at 0.
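The three requirements from the question (no duplicate fk_did/pos pairs, ascending pos, no gaps) can be checked mechanically after the renumbering. The following is a minimal sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions). It computes ROW_NUMBER() - 1 in a SELECT and writes it back by id, which assumes id is unique. The helper positions_by_group is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER PRIMARY KEY, fk_did INT, pos INT)")
conn.executemany(
    "INSERT INTO the_table (fk_did, pos) VALUES (?, 0)",
    [(0,), (0,), (0,), (1,), (1,), (1,), (4,), (8,), (8,), (8,)],
)

# Zero-based position inside each fk_did group, ordered by id for determinism.
rows = conn.execute(
    """
    SELECT ROW_NUMBER() OVER (PARTITION BY fk_did ORDER BY id) - 1 AS pos, id
    FROM the_table
    """
).fetchall()
conn.executemany("UPDATE the_table SET pos = ? WHERE id = ?", rows)

def positions_by_group(conn):
    """Map each fk_did to its sorted list of pos values."""
    groups = {}
    query = "SELECT fk_did, pos FROM the_table ORDER BY fk_did, pos"
    for fk_did, pos in conn.execute(query):
        groups.setdefault(fk_did, []).append(pos)
    return groups

groups = positions_by_group(conn)
# Unique, ascending and gapless all hold exactly when each list is 0..n-1.
gapless = all(p == list(range(len(p))) for p in groups.values())
print(groups, gapless)
```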
I'd suggest creating a temporary table if the id column is not unique:
create temp table tmp_table as
select id, fk_did, row_number() over (partition by fk_did) - 1 pos
from table_name
And then truncate the current table and insert the records from the temp table.

SQL Change Rank based on any value in group of values

I'm not looking for the answer as much as what to search for as I think this is possible. I have a query where the result can be as such:
| ID | CODE | RANK |
I want to base the rank on the code so that I get these results:
| 1 | A | 1 |
| 1 | B | 1 |
| 2 | A | 1 |
| 2 | C | 1 |
| 3 | B | 2 |
| 3 | C | 2 |
| 4 | C | 3 |
Basically, based on the group of IDs, if any of the CODEs = a certain value I want to adjust the rank so then I can order by rank first and then other columns. Never sure how to phrase things in SQL.
I tried
CASE WHEN CODE = 'A' THEN 1 WHEN CODE = 'B' THEN 2 ELSE 3 END rank
ORDER BY rank DESC
But I want to keep the ids together; I don't want them broken apart. I was thinking of giving all rows of an id the same rank, based on the highest, if I can't solve it another way.
Any thoughts on a SQL function to look at?
You could use the MIN() OVER() analytic function to get the minimum rank value per group, and just order by that:
WITH cte AS (
SELECT id, code,
MIN(CASE WHEN code='A' THEN 1 WHEN code='B' THEN 2 ELSE 3 END)
OVER (PARTITION BY id) rank
FROM mytable
)
SELECT * FROM cte
ORDER BY rank, id, code
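To see the grouping effect on the sample rows, here is a small sketch using Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions). The alias grp_rank is used instead of rank purely as a precaution, since rank is also a function name in many engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INT, code TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "A"), (2, "C"), (3, "B"), (3, "C"), (4, "C")],
)

# Every row of an id receives the best (lowest) rank found among that id's codes,
# so ordering by the rank first never splits an id apart.
ordered = conn.execute(
    """
    SELECT id, code,
           MIN(CASE WHEN code = 'A' THEN 1 WHEN code = 'B' THEN 2 ELSE 3 END)
               OVER (PARTITION BY id) AS grp_rank
    FROM mytable
    ORDER BY grp_rank, id, code
    """
).fetchall()

for row in ordered:
    print(row)
```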

Select rows appearing after a row with a given ID when sorted by criteria unrelated to the ID

Given the data in the table "people":
+----+-------+
| id | name |
+----+-------+
| 1 | Jane |
| 2 | Joe |
| 4 | John |
| 5 | Alice |
| 6 | Bob |
+----+-------+
And the order:
SELECT * FROM people ORDER BY name
... which would return:
+----+-------+
| id | name |
+----+-------+
| 5 | Alice |
| 6 | Bob |
| 1 | Jane |
| 2 | Joe |
| 4 | John |
+----+-------+
How could one write a query--including the order above--which would return only rows after the one with a given id, e.g., if given an id of 1, it would return:
+----+-------+
| id | name |
+----+-------+
| 2 | Joe |
| 4 | John |
+----+-------+
To be clear, the id is variable and not known beforehand.
An approach using commonly supported SQL would be great, but I'm using PostgreSQL 9.2 and ActiveRecord 3.2 if they have anything additional of use, e.g., OVER() and ROW_NUMBER().
[Edit] I had previously shown the wrong desired result set, including the row with the given id. The result set, as described in the question, should only include rows after the given id.
select *
from people
where
name >= (
select name
from people
where id = 1
)
and id != 1
order by name
So far, the simplest approach I've found for situations where precision is needed (e.g., no missing or duplicate results across multiple calls with varying values for the id) is to combine window functions and CTEs, as in:
WITH ordered_people AS (
SELECT *, ROW_NUMBER() OVER (ORDER BY name) AS n
FROM people
ORDER BY name
)
SELECT *
FROM ordered_people
WHERE n > (SELECT n FROM ordered_people WHERE id = 1)
ORDER BY name
;
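A parameterized version of this query can be tried out with Python's built-in sqlite3 module (assuming SQLite 3.25 or newer for window functions). This is only a sketch: the function name rows_after is hypothetical, and id is added as a tie-breaker in the window ordering, an assumption that keeps the numbering stable if two people share a name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?)",
    [(1, "Jane"), (2, "Joe"), (4, "John"), (5, "Alice"), (6, "Bob")],
)

def rows_after(conn, given_id):
    """Return the rows that sort after the row with given_id, ordered by name."""
    sql = """
        WITH ordered_people AS (
            SELECT id, name, ROW_NUMBER() OVER (ORDER BY name, id) AS n
            FROM people
        )
        SELECT id, name
        FROM ordered_people
        WHERE n > (SELECT n FROM ordered_people WHERE id = ?)
        ORDER BY n
    """
    return conn.execute(sql, (given_id,)).fetchall()

print(rows_after(conn, 1))
print(rows_after(conn, 6))
```

If the given id does not exist, the subquery yields NULL, the comparison is never true, and the function returns an empty list.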