SQL - select distinct only on one column [duplicate]

I have searched far and wide for an answer to this problem. I'm using Microsoft SQL Server. Suppose I have a table that looks like this:
+----+--------+---------+---------+
| ID | NUMBER | COUNTRY | LANG    |
+----+--------+---------+---------+
| 1  | 3968   | UK      | English |
| 2  | 3968   | Spain   | Spanish |
| 3  | 3968   | USA     | English |
| 4  | 1234   | Greece  | Greek   |
| 5  | 1234   | Italy   | Italian |
+----+--------+---------+---------+
I want to perform one query that selects only one row per unique value in the 'NUMBER' column (whether it is the first or last row doesn't bother me). So this would give me:
+----+--------+---------+---------+
| ID | NUMBER | COUNTRY | LANG    |
+----+--------+---------+---------+
| 1  | 3968   | UK      | English |
| 4  | 1234   | Greece  | Greek   |
+----+--------+---------+---------+
How is this achievable?

A very typical approach to this type of problem is to use row_number():
select t.*
from (select t.*,
             row_number() over (partition by number order by id) as seqnum
      from t
     ) t
where seqnum = 1;
This is more generalizable than using a comparison to the minimum id. For instance, you can get a random row by using order by newid(). You can select 2 rows by using where seqnum <= 2.
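The row_number() approach can be tried end to end with a small script. This is only a sketch under assumptions: it uses Python's sqlite3 module with an in-memory SQLite database (3.25 or later, for window functions) rather than the SQL Server from the question, and the helper name first_rows_per_number is made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for SQL Server; the window-function syntax is the same here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, number INTEGER, country TEXT, lang TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 3968, "UK", "English"),
        (2, 3968, "Spain", "Spanish"),
        (3, 3968, "USA", "English"),
        (4, 1234, "Greece", "Greek"),
        (5, 1234, "Italy", "Italian"),
    ],
)


def first_rows_per_number(conn, per_group=1):
    """Hypothetical helper: return up to per_group rows per number, lowest id first."""
    query = """
        SELECT id, number, country, lang
        FROM (SELECT t.*,
                     row_number() OVER (PARTITION BY number ORDER BY id) AS seqnum
              FROM t) AS ranked
        WHERE seqnum <= ?
        ORDER BY id
    """
    return conn.execute(query, (per_group,)).fetchall()


for row in first_rows_per_number(conn):
    print(row)
```

Calling first_rows_per_number(conn, 2) keeps two rows per number, which corresponds to the seqnum <= 2 variant mentioned above.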

Since you don't care, I chose the max ID for each number.
select tbl.* from tbl
inner join (
    select max(id) as maxID, number from tbl group by number) maxID
on maxID.maxID = tbl.id
Query Explanation
select
    tbl.* -- give me all the data from the base table (tbl)
from
    tbl
    inner join ( -- only return rows in tbl which match this subquery
        select
            max(id) as maxID -- MAX (i.e. distinct) ID per GROUP BY below
        from
            tbl
        group by
            NUMBER -- how to group rows for the MAX aggregation
    ) maxID
        on maxID.maxID = tbl.id -- join condition, i.e. only return rows in tbl
                                -- whose ID is also a MAX ID for a given NUMBER
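The join against the per-number maximum id can be checked with the sample data from the question. A minimal sketch, assuming an in-memory SQLite database driven from Python's sqlite3 module instead of SQL Server; the alias names latest and max_id are illustrative choices, not part of the original answer.

```python
import sqlite3

# Assumption: SQLite is used only to make the query runnable; the table name tbl follows the answer.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tbl (id INTEGER PRIMARY KEY, number INTEGER, country TEXT, lang TEXT)"
)
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?)",
    [
        (1, 3968, "UK", "English"),
        (2, 3968, "Spain", "Spanish"),
        (3, 3968, "USA", "English"),
        (4, 1234, "Greece", "Greek"),
        (5, 1234, "Italy", "Italian"),
    ],
)

# Keep only the rows whose id is the largest id within their number group.
query = """
    SELECT tbl.id, tbl.number, tbl.country, tbl.lang
    FROM tbl
    INNER JOIN (SELECT max(id) AS max_id, number
                FROM tbl
                GROUP BY number) AS latest
        ON latest.max_id = tbl.id
    ORDER BY tbl.id
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

With this data the query keeps ids 3 and 5, the last row of each number group; swapping max for min would keep ids 1 and 4 instead.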

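If your database allows non-aggregated columns in a GROUP BY query (for example MySQL with ONLY_FULL_GROUP_BY disabled, or SQLite), you can use the following query:
SELECT * FROM [table] GROUP BY NUMBER;
Where [table] is the name of the table.
This provides a unique listing for the NUMBER column; however, the other columns may be meaningless depending on the vendor implementation, which is to say they may not together correspond to a specific row or rows. SQL Server, which the question uses, rejects this query because the selected columns are neither aggregated nor listed in the GROUP BY clause.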

Related

Oracle Query that fetches one row per each _id

I have a table like :
ID | Val | Kind
----------------------
1 | a | 2
2 | b | 1
3 | c | 4
3 | c | 33
I need to fetch one row per id in Oracle SQL.
Any ideas?
You can use row_number() to enumerate the rows. For an arbitrary row:
select t.*
from (select t.*,
             row_number() over (partition by id order by id) as seqnum
      from t
     ) t
where seqnum = 1;
As I point out in a comment, though, this is unnecessary based on the data in your question. The ids are already unique.

Order By Id and Limit Offset By Id from a table

I have an issue similar to the following query:
select name, number, id
from tableName
order by id
limit 10 offset 5
But this applies the limit of 10 and the offset of 5 to the result set as a whole.
Is there a way to set the limit and offset per id?
For example if I have a set:
|------------------------------------|---|---------------------------------------|
| Ana | 1 | 589d0011-ef54-4708-a64a-f85228149651 |
| Jana | 2 | 589d0011-ef54-4708-a64a-f85228149651 |
| Jan | 3 | 589d0011-ef54-4708-a64a-f85228149651 |
| Joe | 2 | 64ed0011-ef54-4708-a64a-f85228149651 |
and I skip 1, I should get:
|------------------------------------|---|---------------------------------------|
| Jana | 2 | 589d0011-ef54-4708-a64a-f85228149651 |
| Jan | 3 | 589d0011-ef54-4708-a64a-f85228149651 |
I think that you want to filter by row_number():
select name, number, id
from (
    select t.*, row_number() over(partition by id order by number) rn
    from mytable t
) t
where
    rn > :number_of_records_per_group_to_skip
    and rn <= :number_of_records_per_group_to_skip + :number_of_records_per_group_to_keep
The query ranks records by number within groups of records having the same id, and then filters using two parameters:
:number_of_records_per_group_to_skip: how many records per group should be skipped
:number_of_records_per_group_to_keep: how many records per group should be kept (after skipping :number_of_records_per_group_to_skip records)
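A per-group skip/keep filter on row_number() can be tried against the sample rows from the question. This is a sketch under assumptions: SQLite (3.25 or later) through Python's sqlite3 module stands in for the asker's database, the table name people and the helper page_per_group are hypothetical, and rows are grouped by id and ordered by number to match the example. Because row_number() starts at 1, skipping n rows means keeping rn > n.

```python
import sqlite3

GROUP_A = "589d0011-ef54-4708-a64a-f85228149651"
GROUP_B = "64ed0011-ef54-4708-a64a-f85228149651"

conn = sqlite3.connect(":memory:")
# Hypothetical table name; the columns follow the question's select list.
conn.execute("CREATE TABLE people (name TEXT, number INTEGER, id TEXT)")
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [
        ("Ana", 1, GROUP_A),
        ("Jana", 2, GROUP_A),
        ("Jan", 3, GROUP_A),
        ("Joe", 2, GROUP_B),
    ],
)


def page_per_group(conn, skip, keep):
    """Skip `skip` rows and keep up to `keep` rows within each id group."""
    query = """
        SELECT name, number, id
        FROM (SELECT p.*,
                     row_number() OVER (PARTITION BY id ORDER BY number) AS rn
              FROM people AS p) AS ranked
        WHERE rn > :skip AND rn <= :skip + :keep
        ORDER BY id, number
    """
    return conn.execute(query, {"skip": skip, "keep": keep}).fetchall()


for row in page_per_group(conn, skip=1, keep=10):
    print(row)
```

With skip=1 the first row of every group is dropped, so Joe (the only row in his group) disappears and Jana and Jan remain, as in the expected output of the question.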
This might not be the answer you are looking for, but it gives the results your example shows:
select name, number, id
from (
    select * from tableName
    order by id
    limit 3 offset 0
) d
where number > 1;
Best regards,
Bjarni

Update statement on same table with multi column subquery

Hi everyone, I have been working for the last hour on a query that sorts the country field alphabetically and updates the sort_order field with incremental numbers.
I wrote a select query which sorts my country field in alphabetical order and I added a row number. So far so good.
My select query response is:
objid | country | sort_order | my_rownum
--------------------------------------------
c1 | America | 0 | 1
g2 | Englanc | 0 | 2
k1 | France | 0 | 3
Now, I need to update my sort_order field with my_rownum values.
Country_Info Table :
objid | country | sort_order
----------------------------
c1 | America | 0
g2 | Englanc | 0
k1 | France | 0
After the update, the table must look like this:
objid | country | sort_order
----------------------------
c1 | America | 1
g2 | Englanc | 2
k1 | France | 3
I tried many SQL queries, but something is wrong with each of them.
Here is a sample pseudo query to explain what I need. I cannot merge the logic.
-- sort_order is set where the subquery objid and the country_info objid are equal
update country_info
set sort_order = (value from subquery here)
where (matching objid value from subquery here)
-- Subquery
(select c.objid, c.country, c.sort_order, row_number() over (partition by 1 order by 1) as my_rownumber
from country_info c where <bla bla> order by c.country asc)
Regards...
Assuming your table is named table1, you can use a MERGE statement like this:
MERGE INTO table1 t1
USING (SELECT s.*, ROWNUM my_rownum
       FROM (SELECT *
             FROM table1
             ORDER BY country) s) t2
ON (t1.objid = t2.objid)
WHEN MATCHED THEN
    UPDATE SET t1.sort_order = t2.my_rownum
I hope this helps.
If you are using SQL Server 2005 or later, try:
select *, sort_order = dense_rank() over (order by country)
from yourtable
This would give all occurrences of the first country 1, all of the second 2, and so on. If you want unique sort values, use row_number().
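Writing the row number back into sort_order can also be done with a correlated subquery. The following is a sketch only, assuming SQLite (3.25 or later) through Python's sqlite3 module rather than Oracle or SQL Server, so it uses neither MERGE nor ROWNUM; the alias ranked is an illustrative name. The ranking depends only on country, which the update does not change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE country_info ("
    "objid TEXT PRIMARY KEY, country TEXT, sort_order INTEGER DEFAULT 0)"
)
# Inserted out of alphabetical order on purpose; values are the question's sample data.
conn.executemany(
    "INSERT INTO country_info (objid, country) VALUES (?, ?)",
    [("k1", "France"), ("c1", "America"), ("g2", "Englanc")],
)

# For every row, look up its alphabetical position by matching on objid.
conn.execute(
    """
    UPDATE country_info
    SET sort_order = (
        SELECT ranked.rn
        FROM (SELECT objid,
                     row_number() OVER (ORDER BY country) AS rn
              FROM country_info) AS ranked
        WHERE ranked.objid = country_info.objid
    )
    """
)

rows = conn.execute(
    "SELECT objid, country, sort_order FROM country_info ORDER BY sort_order"
).fetchall()
for row in rows:
    print(row)
```

Replacing row_number() with dense_rank() in the subquery would give duplicate country names the same sort_order, as described above.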

Select the most common item for each category

Each row in my table belongs to some category, has some value and other data.
I would like to select each category with the most common value for it (doesn't matter which one if there are multiple), ordered by category.
some_table: expected result:
+--------+-----+--- +--------+-----+
|category|value|... |category|value|
+--------+-----+--- +--------+-----+
| 1 | a | | 1 | a |
| 1 | a | | 2 | b |
| 1 | b | | 3 | a # or b
| 2 | a | +--------+-----+
| 2 | b |
| 2 | c |
| 2 | b |
| 3 | a |
| 3 | a |
| 3 | b |
| 3 | b |
+--------+-----+---
I have a solution (posting it as an answer) but it seems suboptimal to me. So I'm looking for better solutions.
My table will have up to 10000 rows (possibly, but not likely, beyond that).
I'm planning to use SQLite but I'm not tied to it, so I may reconsider if SQLite can't do this with reasonable performance.
I would be inclined to do this using a correlated subquery:
select distinct category,
       (select value
        from some_table t2
        where t2.category = t.category
        group by value
        order by count(*) desc
        limit 1
       ) as mode_value
from some_table t;
The name for the most common value is "mode" in statistics.
And, if you had a categories table, this would be written as:
select category,
       (select value
        from some_table t2
        where t2.category = c.category
        group by value
        order by count(*) desc
        limit 1
       ) as mode_value
from categories c;
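Since the question mentions SQLite, the correlated-subquery version can be run directly with Python's sqlite3 module. A minimal sketch using the sample rows from the question; the added ORDER BY category is only there to make the printed order predictable, and for category 3 either 'a' or 'b' may come back because the two values tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE some_table (category INTEGER, value TEXT)")
conn.executemany(
    "INSERT INTO some_table VALUES (?, ?)",
    [
        (1, "a"), (1, "a"), (1, "b"),
        (2, "a"), (2, "b"), (2, "c"), (2, "b"),
        (3, "a"), (3, "a"), (3, "b"), (3, "b"),
    ],
)

# For each category, pick the value with the highest count within that category.
query = """
    SELECT DISTINCT category,
           (SELECT value
            FROM some_table AS t2
            WHERE t2.category = t.category
            GROUP BY value
            ORDER BY count(*) DESC
            LIMIT 1) AS mode_value
    FROM some_table AS t
    ORDER BY category
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

If a deterministic tie-break is wanted, adding a second sort key such as value to the inner ORDER BY is one option.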
Here is one option, but I think it's slow...
SELECT DISTINCT `category` AS `the_category`, `value`
FROM `some_table`
WHERE `value`=(
    SELECT `value`
    FROM `some_table`
    WHERE `category`=`the_category`
    GROUP BY `value`
    ORDER BY COUNT(`value`) DESC LIMIT 1)
ORDER BY `category`;
If the table has a unique or primary key column, you can replace part of this with WHERE `id`=( SELECT `id` ...; the LIMIT 1 is then not needed.
select category, value, count(*) value_count
from some_table t
group by category, value
order by category, value_count DESC;
This returns the number of occurrences of each value in each category.
select category, value
from (
    select category, value, count(*) value_count
    from some_table t
    group by category, value) sub
group by category
We then need only the first value for each category from that sorted result.
I am not sure SQLite keeps the first one, and I can't test it, but I think it should work.

Select multiple distinct rows from table SQL

I am attempting to select distinct (last updated) rows from a table in my database. I am trying to get the last updated row for each "Sub section". However I cannot find a way to achieve this.
The table looks like:
ID | Name  | LastUpdated             | Section | Sub |
1  | Name1 | 2013-04-07 16:38:18.837 | 1       | 1   |
2  | Name2 | 2013-04-07 15:38:18.837 | 1       | 2   |
3  | Name3 | 2013-04-07 12:38:18.837 | 1       | 1   |
4  | Name4 | 2013-04-07 13:38:18.837 | 1       | 3   |
5  | Name5 | 2013-04-07 17:38:18.837 | 1       | 3   |
What I am trying to get my SQL Statement to do is return rows:
1, 2, and 5.
They are distinct for the Sub, and the most recent.
I have tried:
SELECT DISTINCT Sub, LastUpdated, Name
FROM TABLE
WHERE LastUpdated = (SELECT MAX(LastUpdated) FROM TABLE WHERE Section = 1)
This only returns the single most recently updated row, which makes sense.
I have googled the problem and checked the relevant posts on here, but I have not managed to find one that really answers what I am trying to do.
You can use the row_number() window function to assign numbers for each partition of rows with the same value of Sub. Using order by LastUpdated desc, the row with row number one will be the latest row:
select *
from (
    select row_number() over (
               partition by Sub
               order by LastUpdated desc) as rn
         , *
    from YourTable
) as SubQueryAlias
where rn = 1
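The latest-row-per-Sub query can be verified against the table from the question. This is a sketch under assumptions: SQLite (3.25 or later) via Python's sqlite3 module, timestamps stored as ISO-formatted text so that string order equals chronological order, and the table name YourTable taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE YourTable ("
    "ID INTEGER PRIMARY KEY, Name TEXT, LastUpdated TEXT, Section INTEGER, Sub INTEGER)"
)
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Name1", "2013-04-07 16:38:18.837", 1, 1),
        (2, "Name2", "2013-04-07 15:38:18.837", 1, 2),
        (3, "Name3", "2013-04-07 12:38:18.837", 1, 1),
        (4, "Name4", "2013-04-07 13:38:18.837", 1, 3),
        (5, "Name5", "2013-04-07 17:38:18.837", 1, 3),
    ],
)

# Number the rows of each Sub from newest to oldest and keep the newest one.
query = """
    SELECT ID, Name, LastUpdated, Sub
    FROM (SELECT row_number() OVER (PARTITION BY Sub
                                    ORDER BY LastUpdated DESC) AS rn,
                 *
          FROM YourTable) AS ranked
    WHERE rn = 1
    ORDER BY ID
"""
latest_ids = [row[0] for row in conn.execute(query)]
print(latest_ids)  # [1, 2, 5]
```

Unlike an aggregate-only query, this returns whole rows, so Name always comes from the same row as the latest LastUpdated.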
Wouldn't it be enough to use group by?
SELECT DISTINCT MIN(Sub), MAX(LastUpdated), MIN(Name) FROM TABLE WHERE Section = 1 GROUP BY Sub
Note that MIN(Name) is not guaranteed to come from the same row as MAX(LastUpdated).