Update multiple records using index result of sub-query


Let's say I have a table with data...
| person_id | priority |
|------------|------------|
| 678 | 2 |
| 413 | 4 |
| 912 | 1 |
| 111 | 5 |
How can I update priority so that the values are contiguous? I.e....
| person_id | priority |
|------------|------------|
| 678 | 2 |
| 413 | 3 | -- updated from 4 to 3
| 912 | 1 |
| 111 | 4 | -- updated from 5 to 4
I know that I can use something like...
select
row_number() over (order by [priority]) as position
from
table_name
...to find a person's 'position', but how can I use this to update the same row?
The priority values should always start at 1.

You can use an updatable CTE or subquery:
with toupdate as (
select t.*, row_number() over (order by [priority]) as new_priority
from table_name t
)
update toupdate
set priority = new_priority
where priority <> new_priority;
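To check the renumbering without a SQL Server instance, here is a minimal sketch using Python's built-in sqlite3 module. SQLite cannot update through a CTE, so this sketch swaps in a different technique: it materializes the same row_number() ranking into a temporary table (the name ranked is hypothetical) and copies it back. The table name and sample rows come from the question, and a SQLite build with window functions (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_name (person_id INTEGER PRIMARY KEY, priority INTEGER)")
conn.executemany(
    "INSERT INTO table_name VALUES (?, ?)",
    [(678, 2), (413, 4), (912, 1), (111, 5)],
)

# Step 1: compute the contiguous ranking once, from the current priorities.
conn.execute("""
    CREATE TEMP TABLE ranked AS
    SELECT person_id,
           ROW_NUMBER() OVER (ORDER BY priority) AS new_priority
    FROM table_name
""")

# Step 2: copy each row's new rank back into the base table.
conn.execute("""
    UPDATE table_name
    SET priority = (
        SELECT new_priority
        FROM ranked
        WHERE ranked.person_id = table_name.person_id
    )
""")

result = dict(conn.execute("SELECT person_id, priority FROM table_name"))
print(result)
```

Materializing the ranking first keeps the update from reading rows it has already changed, which is why the sketch uses two statements instead of one.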

Related

Get the position of X user in the ranking

I have these tables
RANKING
+-----------+----------+
| id_users | points |
+-----------+----------+
| 1 | 27 | //3rd
| 2 | 55 | //1st
| 3 | 9 | //5th
| 4 | 14 | //4th
| 5 | 38 | //2nd
+-----------+----------+
I would like to retrieve a user's data along with their ranking position, filtering by id. So for example, if I want info for id 3, I should get
+----------+--------|---------------+
| id_users | points | rank_position |
+----------+--------|---------------+
| 3 | 9 | 5 |
+----------+--------|---------------+
My query actually has the following:
SELECT
ROW_NUMBER() OVER (ORDER BY points ASC) AS RowNum,
id_users
FROM
RANKING
And I don't know how to continue.

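If you use ROW_NUMBER(), you need to use a subquery, because the WHERE clause is applied before window functions are evaluated. Order by points descending so that the highest score gets position 1:
SELECT r.*
FROM (SELECT r.*,
ROW_NUMBER() OVER (ORDER BY points DESC) AS rank_position
FROM RANKING r
) r
WHERE id_users = 3;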
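The subquery approach can be tried out with a small sketch in Python's sqlite3 module, using the sample rows from the question. The helper name rank_of is hypothetical, the ordering is descending so that the highest score ranks first (as the question's expected output implies), and SQLite 3.25 or later is assumed for window function support.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ranking (id_users INTEGER, points INTEGER)")
conn.executemany(
    "INSERT INTO ranking VALUES (?, ?)",
    [(1, 27), (2, 55), (3, 9), (4, 14), (5, 38)],
)

def rank_of(user_id):
    """Return (id_users, points, rank_position) for one user."""
    # The ranking is computed over all rows in the inner query;
    # the filter on id_users is applied only afterwards.
    return conn.execute(
        """
        SELECT id_users, points, rank_position
        FROM (
            SELECT r.*,
                   ROW_NUMBER() OVER (ORDER BY points DESC) AS rank_position
            FROM ranking r
        ) AS ranked
        WHERE id_users = ?
        """,
        (user_id,),
    ).fetchone()

row = rank_of(3)
print(row)
```

Putting the filter in the same query level as ROW_NUMBER() would rank only the single surviving row, so every user would come back with position 1.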

More efficient way to SELECT rows from PARTITION BY

Suppose I have the following table:
+----+-------------+-------------+
| id | step_number | employee_id |
+----+-------------+-------------+
| 1 | 1 | 3 |
| 1 | 2 | 3 |
| 1 | 3 | 4 |
| 2 | 2 | 3 |
| 2 | 3 | 4 |
| 2 | 4 | 5 |
+----+-------------+-------------+
My desired results are:
+----+-------------+-------------+
| id | step_number | employee_id |
+----+-------------+-------------+
| 1 | 1 | 3 |
| 2 | 2 | 3 |
+----+-------------+-------------+
My current solution is:
SELECT
*
FROM
(SELECT
id,
step_number,
MIN(step_number) OVER (PARTITION BY id) AS min_step_number,
employee_id
FROM
table_name) AS t
WHERE
t.step_number = t.min_step_number
Is there a more efficient way I could be doing this?
I'm currently using postgresql, version 12.

In Postgres, I would recommend using distinct on to address this greatest-n-per-group problem:
select distinct on (id) t.*
from table_name t
order by id, step_number
This Postgres extension to the SQL standard usually performs better than the standard approach using window functions (and, as a bonus, the syntax is neater).
Note that this assumes that (id, step_number) tuples are unique: otherwise, the results might differ from those of your query (which returns ties, while distinct on returns exactly one row per id).
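The caveat about ties can be illustrated with a rough sketch in Python's sqlite3 module. SQLite has no distinct on, so ROW_NUMBER() stands in for the "exactly one row per id" behaviour; the extra row (1, 1, 9) is a hypothetical duplicate added only to create a tie, and SQLite 3.25 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_name (id INTEGER, step_number INTEGER, employee_id INTEGER)"
)
rows = [
    (1, 1, 3), (1, 2, 3), (1, 3, 4),
    (2, 2, 3), (2, 3, 4), (2, 4, 5),
    (1, 1, 9),  # hypothetical duplicate of (id=1, step_number=1)
]
conn.executemany("INSERT INTO table_name VALUES (?, ?, ?)", rows)

# The question's approach: every row that matches the minimum is kept.
with_ties = conn.execute("""
    SELECT id, step_number, employee_id
    FROM (SELECT t.*,
                 MIN(step_number) OVER (PARTITION BY id) AS min_step_number
          FROM table_name t) AS s
    WHERE step_number = min_step_number
    ORDER BY id, employee_id
""").fetchall()

# Stand-in for distinct on: exactly one row per id, ties broken by employee_id.
one_per_id = conn.execute("""
    SELECT id, step_number, employee_id
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY id
                                    ORDER BY step_number, employee_id) AS rn
          FROM table_name t) AS s
    WHERE rn = 1
    ORDER BY id
""").fetchall()

print(with_ties)
print(one_per_id)
```

With unique (id, step_number) pairs the two result sets would be identical; they diverge only when a tie exists.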

How to select the latest date for each group by number?

I've been stuck on this question for a while, and I was wondering if the community would be able to point me in the right direction.
I have some tag IDs that need to be grouped, with exceptions (column: deleted) that need to be retained in the results. After that, for each grouped tag ID, I need to select the row with the latest date. How can I do this? An example is below:
ID | TAG_ID | DATE | DELETED
1 | 300 | 05/01/20 | null
2 | 300 | 03/01/20 | 04/01/20
3 | 400 | 06/01/20 | null
4 | 400 | 05/01/20 | null
5 | 400 | 04/01/20 | null
6 | 500 | 03/01/20 | null
7 | 500 | 02/01/20 | null
I am trying to reach this outcome:
ID | TAG_ID | DATE | DELETED
1 | 300 | 05/01/20 | null
2 | 300 | 03/01/20 | 04/01/20
3 | 400 | 06/01/20 | null
6 | 500 | 03/01/20 | null
So, firstly if there is a date in the "DELETED" column, I would like the row to be present. Secondly, for each unique tag ID, I would like the row with the latest "DATE" to be present.
Hopefully this question is clear. Would appreciate your feedback and help! A big thanks in advance.

Your desired results seem to correspond to something like this:
select t.*
from (select t.*,
row_number() over (partition by tag_id, deleted order by date desc) as seqnum
from t
) t
where seqnum = 1 or deleted is not null;
For each tag_id, this keeps the most recent row where deleted is null. It also keeps every row where deleted is not null.

You need two conditions combined with OR in the WHERE clause:
1. deleted is not null, or
2. there is no other row with the same tag_id and a later date, meaning that the current row's date is the latest:
select t.* from tablename t
where t.deleted is not null
or not exists (
select 1 from tablename
where tag_id = t.tag_id and date > t.date
)
Results:
| id | tag_id | date | deleted |
| --- | ------ | ---------- | -------- |
| 1 | 300 | 2020-05-01 | |
| 2 | 300 | 2020-03-01 | 04/01/20 |
| 3 | 400 | 2020-06-01 | |
| 6 | 500 | 2020-03-01 | |
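Both answers can be compared side by side with a short sketch in Python's sqlite3 module. It assumes the dates are stored as ISO-8601 text (so that string comparison orders them correctly) and reads the question's dates as month/day/year, as the results table above does; the table name tablename follows the second answer, and SQLite 3.25 or later is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tablename (id INTEGER, tag_id INTEGER, date TEXT, deleted TEXT)"
)
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?, ?, ?)",
    [
        (1, 300, "2020-05-01", None),
        (2, 300, "2020-03-01", "2020-04-01"),
        (3, 400, "2020-06-01", None),
        (4, 400, "2020-05-01", None),
        (5, 400, "2020-04-01", None),
        (6, 500, "2020-03-01", None),
        (7, 500, "2020-02-01", None),
    ],
)

# Answer 1: rank rows inside each (tag_id, deleted) group.
window_ids = [r[0] for r in conn.execute("""
    SELECT s.id
    FROM (SELECT t.*,
                 ROW_NUMBER() OVER (PARTITION BY t.tag_id, t.deleted
                                    ORDER BY t.date DESC) AS seqnum
          FROM tablename t) AS s
    WHERE s.seqnum = 1 OR s.deleted IS NOT NULL
    ORDER BY s.id
""")]

# Answer 2: keep a row if it is deleted or no later row shares its tag_id.
exists_ids = [r[0] for r in conn.execute("""
    SELECT t.id
    FROM tablename t
    WHERE t.deleted IS NOT NULL
       OR NOT EXISTS (SELECT 1 FROM tablename o
                      WHERE o.tag_id = t.tag_id AND o.date > t.date)
    ORDER BY t.id
""")]

print(window_ids, exists_ids)
```

On this sample the two queries agree. They can differ when a deleted row carries the latest date for its tag: the NOT EXISTS version then drops every non-deleted row for that tag, while the window version still keeps the most recent non-deleted one.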

Filtering using aggregation functions

I would like to filter my table by the MIN() function but still keep columns which can't be grouped.
I have this table:
+----+----------+----------------------+
| ID | distance | geom |
+----+----------+----------------------+
| 1 | 2 | DSDGSAsd23423DSFF |
| 2 | 11.2 | SXSADVERG678BNDVS4 |
| 2 | 2 | XCZFETEFD567687SDF |
| 3 | 24 | SADASDSVG3423FD |
| 3 | 10 | SDFSDFSDF343DFDGF |
| 4 | 34 | SFDHGHJ546GHJHJHJ |
| 5 | 22 | SDFSGTHHGHGFHUKJYU45 |
| 6 | 78 | SDFDGDHKIKUI45 |
| 6 | 15 | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+
This is what I would like to achieve:
+----+----------+----------------------+
| ID | distance | geom |
+----+----------+----------------------+
| 1 | 2 | DSDGSAsd23423DSFF |
| 2 | 2 | XCZFETEFD567687SDF |
| 3 | 10 | SDFSDFSDF343DFDGF |
| 4 | 34 | SFDHGHJ546GHJHJHJ |
| 5 | 22 | SDFSGTHHGHGFHUKJYU45 |
| 6 | 15 | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+
This is possible when I use MIN() on the distance column and group by ID, but then I lose my geom, which is essential.
The query looks like this:
SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID"
The result is:
+----+----------+
| ID | distance |
+----+----------+
| 1 | 2 |
| 2 | 2 |
| 3 | 10 |
| 4 | 34 |
| 5 | 22 |
| 6 | 15 |
+----+----------+
But this is not what I want.
Any suggestions?

One common approach to this is to find the minimum values in a derived table that you join with:
SELECT somefile."ID", somefile.distance, somefile.geom
FROM somefile
JOIN (
SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID"
) t ON t.distance = somefile.distance AND t."ID" = somefile."ID";

You need a window function to do this:
SELECT "ID", distance, geom
FROM (
SELECT "ID", distance, geom, rank() OVER (PARTITION BY "ID" ORDER BY distance) AS rnk
FROM somefile) sub
WHERE rnk = 1;
This effectively partitions the rows by the "ID" value, orders each partition by distance, and returns the rows for each "ID" where the distance is minimal (including ties), with no need for a GROUP BY.
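As a quick check that the window-function query and the derived-table join return the same rows, here is a minimal sketch in Python's sqlite3 module using the question's data. It assumes SQLite 3.25 or later; because the distance column is declared REAL, values come back as floats (2.0 rather than 2).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE somefile ("ID" INTEGER, distance REAL, geom TEXT)')
conn.executemany(
    "INSERT INTO somefile VALUES (?, ?, ?)",
    [
        (1, 2, "DSDGSAsd23423DSFF"),
        (2, 11.2, "SXSADVERG678BNDVS4"),
        (2, 2, "XCZFETEFD567687SDF"),
        (3, 24, "SADASDSVG3423FD"),
        (3, 10, "SDFSDFSDF343DFDGF"),
        (4, 34, "SFDHGHJ546GHJHJHJ"),
        (5, 22, "SDFSGTHHGHGFHUKJYU45"),
        (6, 78, "SDFDGDHKIKUI45"),
        (6, 15, "DSGDHHJGHJKHGKHJKJ65"),
    ],
)

# Window-function version: rank rows within each "ID" by distance.
ranked = conn.execute("""
    SELECT "ID", distance, geom
    FROM (SELECT "ID", distance, geom,
                 RANK() OVER (PARTITION BY "ID" ORDER BY distance) AS rnk
          FROM somefile) AS sub
    WHERE rnk = 1
    ORDER BY "ID"
""").fetchall()

# Join version: match each row against the per-"ID" minimum.
joined = conn.execute("""
    SELECT s."ID", s.distance, s.geom
    FROM somefile s
    JOIN (SELECT "ID", MIN(distance) AS distance
          FROM somefile GROUP BY "ID") AS t
      ON t."ID" = s."ID" AND t.distance = s.distance
    ORDER BY s."ID"
""").fetchall()

for row in ranked:
    print(row)
```

Both versions return every row that ties for the minimum, so an "ID" with two equally close geometries would appear twice in either result.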

select a.*, b.geom from
(SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID") as a
inner join somefile as b on a."ID" = b."ID" and a.distance = b.distance

You can use the "distinct on" clause of PostgreSQL. The distinct on expression must match the leftmost order by expression, so order by "ID" first and then by distance:
select distinct on ("ID") "ID", distance, geom
from somefile
order by "ID", distance;
I think this is exactly what you are looking for.
For more details on how "distinct on" works, refer to the PostgreSQL documentation.
But remember, "distinct on" is not part of the SQL standard.

sql - select row from group based on multiple values

I have a table like:
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| abc-2 | 30 |
| cde-1 | 10 |
| cde-2 | 10 |
| efg-1 | 20 |
| efg-2 | 11 |
and would like to get the result based on substring(ID, 1, 3) and the minimum value, and it must be only the first row in case Val has duplicates
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| cde-1 | 10 |
| efg-2 | 11 |
The problem is that I am stuck, because I cannot use group by substring(id,1,3), ID, since it would then again return 2 rows (one each for abc-1 and abc-2).

WITH
sorted
AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id
FROM
yourTable
)
SELECT
*
FROM
sorted
WHERE
sequence_id = 1
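The CTE above can be exercised with a small sketch in Python's sqlite3 module, using the rows from the question. It uses substr(), which SQLite has long supported, in place of substring(), and assumes SQLite 3.25 or later for ROW_NUMBER().

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (id TEXT, val INTEGER)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?)",
    [
        ("abc-1", 10), ("abc-2", 30),
        ("cde-1", 10), ("cde-2", 10),
        ("efg-1", 20), ("efg-2", 11),
    ],
)

# Number the rows inside each 3-character prefix group, lowest val first;
# id is the tie-breaker, so cde-1 wins over cde-2.
result = conn.execute("""
    WITH sorted AS (
        SELECT id, val,
               ROW_NUMBER() OVER (PARTITION BY substr(id, 1, 3)
                                  ORDER BY val, id) AS sequence_id
        FROM yourTable
    )
    SELECT id, val
    FROM sorted
    WHERE sequence_id = 1
    ORDER BY id
""").fetchall()

print(result)
```

Unlike a plain GROUP BY on the prefix, this keeps the full id of the winning row.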

SELECT SUBSTRING(id,1,3), MIN(val) FROM Table1 GROUP BY SUBSTRING(id,1,3);
You were grouping by both SUBSTRING(id,1,3) and id instead of just SUBSTRING(id,1,3). Grouping by the substring alone works. Note that this returns the prefix and the minimum Val, not the full ID. See the same example at the link below.
http://sqlfiddle.com/#!3/fd9fc/1
