sql - select row from group based on multiple values

sql - select row from group based on multiple values - sql

I have a table like:
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| abc-2 | 30 |
| cde-1 | 10 |
| cde-2 | 10 |
| efg-1 | 20 |
| efg-2 | 11 |
and would like to get the result based on the substring(ID, 1, 3) and minimum value and ist must be only the first in case the Val has duplicates
| ID | Val |
+-------+-----+
| abc-1 | 10 |
| cde-1 | 10 |
| efg-2 | 11 |
the problem is that I am stuck, because I cannot use group by substring(id,1,3), ID since it will then have again 2 rows (each for abc-1 and abc-2)

WITH
sorted
AS
(
SELECT
*,
ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id
FROM
yourTable
)
SELECT
*
FROM
sorted
WHERE
sequence_id = 1

SELECT SUBSTRING(id,1,3),MIN(val) FROM Table1 GROUP BY SUBSTRING(id,1,3);
You were grouping the columns using both SUBSTRING(id,1,3),id instead of just SUBSTRING(id,1,3). It works perfectly fine.Check the same example in this below link.
http://sqlfiddle.com/#!3/fd9fc/1

Related

Combine PARTITION BY and GROUP BY

I have a (mssql) table like this:
+----+----------+---------+--------+--------+
| id | username | date | scoreA | scoreB |
+----+----------+---------+--------+--------+
| 1 | jim | 01/2020 | 100 | 0 |
| 2 | max | 01/2020 | 0 | 200 |
| 3 | jim | 01/2020 | 0 | 150 |
| 4 | max | 02/2020 | 150 | 0 |
| 5 | jim | 02/2020 | 0 | 300 |
| 6 | lee | 02/2020 | 100 | 0 |
| 7 | max | 02/2020 | 0 | 200 |
+----+----------+---------+--------+--------+
What I need is to get the best "combined" score per date. (With "combined" score I mean the best scores per user and per date summarized)
The result should look like this:
+----------+---------+--------------------------------------------+
| username | date | combined_score (max(scoreA) + max(scoreB)) |
+----------+---------+--------------------------------------------+
| jim | 01/2020 | 250 |
| max | 02/2020 | 350 |
+----------+---------+--------------------------------------------+
I came this far:
I can group the scores by user like this:
SELECT
username, (max(scoreA) + max(scoreB)) AS combined_score,
FROM score_table
GROUP BY username
ORDER BY combined_score DESC
And I can get the best score per date with PARTITION BY like this:
SELECT *
FROM
(SELECT t.*, row_number() OVER (PARTITION BY date ORDER BY scoreA DESC) rn
FROM score_table t) as tmp
WHERE tmp.rn = 1
ORDER BY date
Is there a proper way to combine these statements and get the result I need? Thank you!
Btw. Don't care about possible ties!

You can combine window functions and aggregation functions like this:
SELECT s.*
FROM (SELECT username, date, (max(scoreA) + max(scoreB)) AS combined_score,
ROW_NUMBER() OVER (PARTITION BY date ORDER BY max(scoreA) + max(scoreB) DESC) as seqnum
FROM score_table
GROUP BY username, date
) s
ORDER BY combined_score DESC;
Note that date needs to be part of the aggregation.

How to select values, where each one depends on a previously aggregated state?

I have the following table:
|-----|-----|
| i d | val |
|-----|-----|
| 1 | 1 |
|-----|-----|
| 2 | 4 |
|-----|-----|
| 3 | 3 |
|-----|-----|
| 4 | 7 |
|-----|-----|
Can I get the following output:
|-----|
| sum |
|-----|
| 1 |
|-----|
| 5 |
|-----|
| 8 |
|-----|
| 1 5 |
|-----|
using a single SQLite3 SELECT-query? I know it could be easily achieved using variables, but SQLite3 lacks those. Maybe some recursive query? Thanks.

No.
In a relational database table rows do not have any order. If you specify an order for the rows, then it's possible to write a query.
Now, you could add an extra column to sort the rows. For example:
| val | sort
|-----|-----
| 1 | 10
| 4 | 20
| 3 | 30
| 7 | 40
The query could be:
select
sum(val) over(order by sort)
from my_table

For the updated question, you can write:
select
sum(val) over(order by id)
from my_table

By using the order of the id column and if you want only the sum column, you can do this:
select (select sum(val) from tablename where id <= t.id) sum
from tablename t

calculating sum of rows with identical id

Let's imagine a table with two columns ex:
| Value | ID |
+-------+----+
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 1 | 2 |
| 2 | 2 |
| 2 | 2 |
What I am trying to do is to calculate the sum of those with similar id and display them in different table like:
| Sum | ID |
+-----+----+
| 9 | 1 |
| 5 | 2 |
and so on.
I could find a sum of a known id by
SELECT SUM(VALUE) FROM MYTABLE WHERE ID = 1;
However not sure on how to find sum of different id's separately, could you give an idea on how to proceed?

Select SUM(VALUE),ID FROM MYTABLE GROUP BY ID

Use GROUP BY clause:
SELECT SUM(VALUE) Sum, ID FROM MYTABLE GROUP BY ID;

SELECT SUM(VALUE),ID FROM MYTABLE Group By ID

Filtering using aggregation functions

I would like to filter my table by MIN() function but still keep columns which cant be grouped.
I have table:
+----+----------+----------------------+
| ID | distance | geom |
+----+----------+----------------------+
| 1 | 2 | DSDGSAsd23423DSFF |
| 2 | 11.2 | SXSADVERG678BNDVS4 |
| 2 | 2 | XCZFETEFD567687SDF |
| 3 | 24 | SADASDSVG3423FD |
| 3 | 10 | SDFSDFSDF343DFDGF |
| 4 | 34 | SFDHGHJ546GHJHJHJ |
| 5 | 22 | SDFSGTHHGHGFHUKJYU45 |
| 6 | 78 | SDFDGDHKIKUI45 |
| 6 | 15 | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+
This is what I would like to achieve:
+----+----------+----------------------+
| ID | distance | geom |
+----+----------+----------------------+
| 1 | 2 | DSDGSAsd23423DSFF |
| 2 | 2 | XCZFETEFD567687SDF |
| 3 | 10 | SDFSDFSDF343DFDGF |
| 4 | 34 | SFDHGHJ546GHJHJHJ |
| 5 | 22 | SDFSGTHHGHGFHUKJYU45 |
| 6 | 15 | DSGDHHJGHJKHGKHJKJ65 |
+----+----------+----------------------+
it is possible when I use MIN() on distance column and grouping by ID but then I loose my geom which is essential.
The query looks like this:
SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID"
the result is:
+----+----------+
| ID | distance |
+----+----------+
| 1 | 2 |
| 2 | 2 |
| 3 | 10 |
| 4 | 34 |
| 5 | 22 |
| 6 | 15 |
+----+----------+
but this is not what I want.
Any suggestions?

One common approach to this is to find the minimum values in a derived table that you join with:
SELECT somefile."ID", somefile.distance, somefile.geom
FROM somefile
JOIN (
SELECT "ID", MIN(distance) AS distance FROM somefile GROUP BY "ID"
) t ON t.distance = somefile.distance AND t.ID = somefile.ID;
Sample SQL Fiddle

You need a window function to do this:
SELECT "ID", distance, geom
FROM (
SELECT "ID", distance, geom, rank() OVER (PARTITION BY "ID" ORDER BY distance) AS rnk
FROM somefile) sub
WHERE rnk = 1;
This effectively orders the entire set of rows first by the "ID" value, then by the distance and returns the record for each "ID" where the distance is minimal - no need to do a GROUP BY.

select a.*,b.geom from
(SELECT ID, MIN(distance) AS distance FROM somefile GROUP BY ID) as a
inner join somefile as b on a.id=b.id and a.distance=b.distance

You can use "distinct on" clause of the PostgreSQL.
select distinct on(id) id, distance, geom
from table_name
order by distance;
I think this is what you are exactly looking for.
For more details on how "distinct on" works, refer the documentation and the example.
But, remember, using "distinct on" does not comply to SQL standards.

selecting data with highest field value in a field

I have a table, and I'd like to select rows with the highest value. For example:
----------------
| user | index |
----------------
| 1 | 1 |
| 2 | 1 |
| 2 | 2 |
| 3 | 4 |
| 3 | 7 |
| 4 | 1 |
| 5 | 1 |
----------------
Expected result:
----------------
| user | index |
----------------
| 1 | 1 |
| 2 | 2 |
| 3 | 7 |
| 4 | 1 |
| 5 | 1 |
----------------
How may I do so? I assume it can be done by some oracle function I am not aware of?
Thanks in advance :-)

You can use MAX() function for that with grouping user column like this:
SELECT "user"
,MAX("index") AS "index"
FROM Table1
GROUP BY "user"
ORDER BY "user";
Result:
| USER | INDEX |
----------------
| 1 | 1 |
| 2 | 2 |
| 3 | 7 |
| 4 | 1 |
| 5 | 1 |
See this SQLFiddle

if you have more than one column
select user , index
from (
select u.* , row_number() over (partition by user order by index desc) as rnk
from some_table u)
where rnk = 1
user is a reserved word - you should use a different name for the column.

select user,max(index) index from tbl
group by user;

Alternatively, you can use analytic functions:
select user,index, max(index) over (partition by user order by 1 ) highest from YOURTABLE
Note: Try NOT to use words like user, index, date etc.. as your column names, as they are reserved words for Oracle. If you will use, then use them with quotation marks, eg. "index", "date"...

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

sql - select row from group based on multiple values - sql

WITH sorted AS ( SELECT , ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id FROM yourTable ) SELECT FROM sorted WHERE sequence_id = 1

SELECT SUBSTRING(id,1,3),MIN(val) FROM Table1 GROUP BY SUBSTRING(id,1,3); You were grouping the columns using both SUBSTRING(id,1,3),id instead of just SUBSTRING(id,1,3). It works perfectly fine.Check the same example in this below link. http://sqlfiddle.com/#!3/fd9fc/1

Related

Combine PARTITION BY and GROUP BY

How to select values, where each one depends on a previously aggregated state?

calculating sum of rows with identical id

Filtering using aggregation functions

selecting data with highest field value in a field

Categories

Resources

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

sql - select row from group based on multiple values - sql

WITH sorted AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id FROM yourTable ) SELECT * FROM sorted WHERE sequence_id = 1

SELECT SUBSTRING(id,1,3),MIN(val) FROM Table1 GROUP BY SUBSTRING(id,1,3); You were grouping the columns using both SUBSTRING(id,1,3),id instead of just SUBSTRING(id,1,3). It works perfectly fine.Check the same example in this below link. http://sqlfiddle.com/#!3/fd9fc/1

Related

Combine PARTITION BY and GROUP BY

How to select values, where each one depends on a previously aggregated state?

calculating sum of rows with identical id

Filtering using aggregation functions

selecting data with highest field value in a field

Categories

Resources

WITH sorted AS ( SELECT , ROW_NUMBER() OVER (PARTITION BY substring(id,1,3) ORDER BY val, id) AS sequence_id FROM yourTable ) SELECT FROM sorted WHERE sequence_id = 1