Only the distinct values after a group by in SQL Server 2014 - sql

Here is the sample of the data:
ID Value NumPeriod
------------------------
1681642 596.8 2
1681642 596.8 3
1681663 445.4 2
1681663 445.4 3
1681688 461.9 3
1681707 282.2 3
1681724 407.1 3
1681743 467 2
1681743 467 3
1681767 502 3
I want to group by [ID], keep only the distinct values of [Value] within each group, and take the "first" distinct [Value] according to [NumPeriod]. So the result would look something like this:
ID Value NumPeriod
-------------------------
1681642 596.8 2
1681663 445.4 2
1681688 461.9 3
1681707 282.2 3
1681724 407.1 3
1681743 467 2
1681767 502 3
So I thought something like this would work, but no luck:
select
ID, distinct(Value), NumPeriod
from
MyTable
group by
ID, Value, NumPeriod
order by
ID, NumPeriod
Any help would be appreciated. Thanks!

You can use a ranking function and a CTE:
WITH CTE AS
(
SELECT ID, Value, NumPeriod,
RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NumPeriod ASC)
FROM MyTable
)
SELECT ID, Value, NumPeriod
FROM CTE
WHERE RN = 1
ORDER BY ID, Value
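The ROW_NUMBER() approach above can be checked against the sample data with a small script. This is only a sketch: it assumes SQLite (through Python's sqlite3 module, SQLite 3.25 or later for window functions) as a stand-in for SQL Server 2014, and it writes the alias as `... AS RN` because the `RN = ...` form is SQL Server specific.

```python
import sqlite3

# In-memory SQLite database, used here only as a stand-in for SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (ID INTEGER, Value REAL, NumPeriod INTEGER)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [
        (1681642, 596.8, 2), (1681642, 596.8, 3),
        (1681663, 445.4, 2), (1681663, 445.4, 3),
        (1681688, 461.9, 3), (1681707, 282.2, 3),
        (1681724, 407.1, 3),
        (1681743, 467, 2), (1681743, 467, 3),
        (1681767, 502, 3),
    ],
)

# Number the rows of each ID by NumPeriod and keep only the first one.
query = """
WITH CTE AS (
    SELECT ID, Value, NumPeriod,
           ROW_NUMBER() OVER (PARTITION BY ID ORDER BY NumPeriod ASC) AS RN
    FROM MyTable
)
SELECT ID, Value, NumPeriod
FROM CTE
WHERE RN = 1
ORDER BY ID
"""
first_per_id = conn.execute(query).fetchall()
for row in first_per_id:
    print(row)
```

Each ID appears once in the output, with the lowest NumPeriod it has in the table.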

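I think all you have to change is the grouping: DISTINCT over all three columns still returns every row, so group by ID and Value only and take the lowest NumPeriod.
Try this:
select ID, Value, min(NumPeriod) as NumPeriod
from MyTable
group by ID, Value
order by ID, NumPeriod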

Related

Select row in group with largest value in particular column postgres

I have a database table which looks like this.
id account_id action time_point
3 234 delete 100
1 656 create 600
1 4435 update 900
3 645 create 50
I need to group this table by id and select the row where time_point has the largest value.
Result table should look like this:
id account_id action time_point
3 234 delete 100
1 4435 update 900
Thanks for help,
qwew
In Postgres, I would recommend distinct on to solve this top 1 per group problem:
select distinct on (id) *
from mytable
order by id, time_point desc
However, this does not allow for ties. If you need them, rank() is a better solution:
select *
from (
select t.*, rank() over(partition by id order by time_point desc) rn
from mytable t
) t
where rn = 1
Or, if you are running Postgres 13:
select *
from mytable t
order by rank() over(partition by id order by time_point desc)
fetch first row with ties
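To see how ties change the result, here is a rough sketch that runs the rank() subquery next to a row_number() variant. It assumes SQLite (via Python's sqlite3, version 3.25 or later) instead of Postgres, so DISTINCT ON and FETCH FIRST ... WITH TIES are not used, and it adds one hypothetical row that ties with the existing top row for id 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable (id INTEGER, account_id INTEGER, action TEXT, time_point INTEGER)"
)
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?)",
    [
        (3, 234, "delete", 100),
        (1, 656, "create", 600),
        (1, 4435, "update", 900),
        (3, 645, "create", 50),
        (3, 999, "update", 100),  # hypothetical extra row that ties with (3, 234)
    ],
)

def top_rows(window_fn):
    """Return the rows ranked first per id by the given window function."""
    sql = f"""
    SELECT id, account_id, action, time_point
    FROM (
        SELECT t.*,
               {window_fn}() OVER (PARTITION BY id ORDER BY time_point DESC) AS rn
        FROM mytable t
    ) ranked
    WHERE rn = 1
    ORDER BY id, account_id
    """
    return conn.execute(sql).fetchall()

with_ties = top_rows("rank")            # keeps every row that shares the maximum
one_per_group = top_rows("row_number")  # keeps exactly one row per id

print(with_ties)
print(one_per_group)
```

With rank(), both tied rows for id 3 come back; with row_number(), only one of them does, and which one is arbitrary unless the ORDER BY includes a tiebreaker column.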
Check this:
select * from x
where exists (
select 1 from x xin
where xin.id = x.id
having max(xin.time_point) = x.time_point
);

SQL to get unique rows in Netezza DB

I have a table with rows like:
id group_name_code
1 999
2 16
3 789
4 999
5 231
6 999
7 349
8 16
9 819
10 999
11 654
But I want output rows like this:
id group_name_code
1 999
2 16
3 789
4 231
5 349
6 819
7 654
Will this query help?
select id, distinct(group_name_code) from group_table;
You seem to want:
Distinct values for group_name_code and a sequential id ordered by minimum id per set of group_name_code.
Netezza has the DISTINCT key word, but not DISTINCT ON () (Postgres feature):
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_select.html
You could:
SELECT DISTINCT group_name_code FROM group_table;
No parentheses, the DISTINCT key word does not require parentheses.
But you would not get the sequential id you show with this.
There are "analytic functions" a.k.a. window functions:
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/c_dbuser_overview_analytic_funcs.html
And there is also row_number():
https://www.ibm.com/support/knowledgecenter/en/SSULQD_7.2.1/com.ibm.nz.dbu.doc/r_dbuser_functions.html
So this should work:
SELECT row_number() OVER (ORDER BY min(id)) AS new_id, group_name_code
FROM group_table
GROUP BY group_name_code
ORDER BY min(id);
Or use a subquery, in case Netezza does not allow nesting aggregate and window functions:
SELECT row_number() OVER (ORDER BY id) AS new_id, group_name_code
FROM (
SELECT min(id) AS id, group_name_code
FROM group_table
GROUP BY group_name_code
) sub
ORDER BY id;
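The subquery variant can be tried on the sample rows from the question. This is a sketch under the assumption that SQLite (through Python's sqlite3, version 3.25 or later) behaves like Netezza for this query; it has not been run against Netezza itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE group_table (id INTEGER, group_name_code INTEGER)")

# Sample data from the question: ids 1..11 with their codes.
codes = [999, 16, 789, 999, 231, 999, 349, 16, 819, 999, 654]
conn.executemany(
    "INSERT INTO group_table VALUES (?, ?)",
    list(enumerate(codes, start=1)),
)

# Inner query: one row per code with its lowest id.
# Outer query: renumber those rows sequentially in order of that lowest id.
sql = """
SELECT row_number() OVER (ORDER BY id) AS new_id, group_name_code
FROM (
    SELECT min(id) AS id, group_name_code
    FROM group_table
    GROUP BY group_name_code
) sub
ORDER BY id
"""
renumbered = conn.execute(sql).fetchall()
for row in renumbered:
    print(row)
```

The result has seven rows, numbered 1 to 7, with the codes in order of first appearance.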
If you do not mind losing data on id you can use an aggregate function on that column and group by group_name_code:
select min(id) as id, group_name_code
from group_table
group by group_name_code
order by id;
This way you pull unique values for group_name_code and the lowest id for each code.
If you don't need id in your output (it does not seem to correspond to the input table) and just want the unique codes, try this:
select group_name_code
from group_table
group by group_name_code
order by min(id);
This gets the codes you want. If you want id to be the row number, the syntax will depend on which RDBMS you are using.
You can get that result using a CTE; replace #t with your table name and value with group_name_code:
; WITH tbl AS (
SELECT DISTINCT value FROM #t
)
SELECT ROW_NUMBER() OVER (ORDER BY value) AS id,* FROM tbl

Getting all fields from table filtered by MAX(Column1)

I have table with some data, for example
ID Specified TIN Value
----------------------
1 0 tin1 45
2 1 tin1 34
3 0 tin2 23
4 3 tin2 47
5 3 tin2 12
For each TIN, I need the full row that has the MAX(Specified) value. If several rows share that maximum (IDs 4 and 5 in the example), I must take the last one (ID 5).
The final result must be:
ID Specified TIN Value
-----------------------
2 1 tin1 34
5 3 tin2 12
This will give the desired result with using window function:
;with cte as(select *, row_number() over(partition by tin order by specified desc, id desc) as rn
from tablename)
select * from cte where rn = 1
Edit: Updated query after question edit.
Here is the fiddle
http://sqlfiddle.com/#!9/20e1b/1/0
SELECT * FROM TBL WHERE ID IN (
SELECT max(id) FROM
TBL WHERE SPECIFIED IN
(SELECT MAX(SPECIFIED) FROM TBL
GROUP BY TIN)
group by specified)
I am sure we can simplify it further, but this will work.
select * from tbl where id =(
SELECT MAX(ID) FROM
tbl where specified =(SELECT MAX(SPECIFIED) FROM tbl))
One method is to use window functions, row_number():
select t.*
from (select t.*, row_number() over (partition by tin
order by specified desc, id desc
) as seqnum
from t
) t
where seqnum = 1;
However, if you have an index on (tin, specified, id) and on id, the most efficient method is probably:
select t.*
from t
where t.id = (select top 1 t2.id
from t t2
where t2.tin = t.tin
order by t2.specified desc, t2.id desc
);
The reason this is better is that the index will be used for the subquery, and then the index on id will be used for the outer query as well. This is highly efficient. An index may also be used for the window function, but the resulting execution plan probably requires scanning the entire table.
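As a quick check that the window-function query and the correlated subquery return the same rows, here is a sketch on the sample data. It assumes SQLite (via Python's sqlite3, version 3.25 or later) rather than SQL Server, so TOP 1 is written as LIMIT 1, and the index name is made up for the example. It compares results only and says nothing about the execution plans on SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER PRIMARY KEY, specified INTEGER, tin TEXT, value INTEGER)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [
        (1, 0, "tin1", 45), (2, 1, "tin1", 34),
        (3, 0, "tin2", 23), (4, 3, "tin2", 47), (5, 3, "tin2", 12),
    ],
)
# Hypothetical index name; the columns follow the lookup: filter on tin, then sort.
conn.execute("CREATE INDEX idx_t_tin_specified_id ON t (tin, specified, id)")

window_sql = """
SELECT id, specified, tin, value
FROM (
    SELECT t.*,
           row_number() OVER (PARTITION BY tin ORDER BY specified DESC, id DESC) AS seqnum
    FROM t
) ranked
WHERE seqnum = 1
ORDER BY id
"""

correlated_sql = """
SELECT t.id, t.specified, t.tin, t.value
FROM t
WHERE t.id = (SELECT t2.id
              FROM t t2
              WHERE t2.tin = t.tin
              ORDER BY t2.specified DESC, t2.id DESC
              LIMIT 1)  -- SQLite spelling of SQL Server's TOP 1
ORDER BY t.id
"""

window_rows = conn.execute(window_sql).fetchall()
correlated_rows = conn.execute(correlated_sql).fetchall()
print(window_rows)
print(correlated_rows)
```

Both queries return the rows with ID 2 and ID 5, matching the expected result in the question.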

Tsql to get first random product in a category

I've this result set:
select a.id, a.categoria from Articolo a
where novita = 1
order by a.categoria, newid()
id categoria
----------- -----------
3 4
11 4
1 4
12 5
13 5
4 6
and I would like to get the first product (in a random order) from each different category:
id categoria
----------- -----------
3 4
12 5
4 6
Ideally something like
select FIRST(a.id), a.categoria from Articolo a
where novita = 1
order by a.categoria, newid()
Any ideas?
Use MAX(a.id) with GROUP BY a.categoria
SELECT MAX(a.id), a.categoria
from Articolo a
where novita = 1
GROUP BY a.categoria
Update
To get a random id for each categoria, you can use the ranking function ROW_NUMBER() with PARTITION BY categoria and ORDER BY NEWID() to get a random ordering, like this:
WITH CTE
AS
(
SELECT id, categoria, ROW_NUMBER() OVER(PARTITION BY categoria
ORDER BY NEwID()) AS rn
FROM Articolo
WHERE novita = 1
)
SELECT id, categoria
FROM CTE
WHERE rn = 1;
This way, it will give you a random id for each categoria each time.
However, if you want the first, you can use ORDER BY (SELECT 1) inside the ranking function ROW_NUMBER():
WITH CTE
AS
(
SELECT id, categoria, ROW_NUMBER() OVER(PARTITION BY categoria
ORDER BY (select 1)) AS rn
FROM Articolo
WHERE novita = 1
)
SELECT id, categoria
FROM CTE
WHERE rn = 1;
This will give you the first id for each categoria.
Note that "the first row" has no inherent meaning in a database, because in the relational model the order of rows is not significant. ORDER BY (SELECT 1) is therefore not guaranteed to return the same row each time; you have to ORDER BY a specific column to get consistent results.
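The difference between a random pick and a deterministic one can be sketched as follows. The example assumes SQLite (via Python's sqlite3, version 3.25 or later) in place of SQL Server, so random() stands in for NEWID(), and every sample row is assumed to have novita = 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Articolo (id INTEGER, categoria INTEGER, novita INTEGER)")
# Sample rows from the question; novita = 1 for all of them is an assumption.
conn.executemany(
    "INSERT INTO Articolo VALUES (?, ?, 1)",
    [(3, 4), (11, 4), (1, 4), (12, 5), (13, 5), (4, 6)],
)

def pick(order_by):
    """Return one (id, categoria) row per categoria, chosen by the given ordering."""
    sql = f"""
    WITH CTE AS (
        SELECT id, categoria,
               ROW_NUMBER() OVER (PARTITION BY categoria ORDER BY {order_by}) AS rn
        FROM Articolo
        WHERE novita = 1
    )
    SELECT id, categoria
    FROM CTE
    WHERE rn = 1
    ORDER BY categoria
    """
    return conn.execute(sql).fetchall()

random_pick = pick("random()")  # may differ from run to run
stable_pick = pick("id")        # always the lowest id of each categoria

print(random_pick)
print(stable_pick)
```

Ordering by a real column such as id gives the same rows on every run, while the random ordering only guarantees one row per categoria.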

Getting rows with duplicate column values

I tried this with solutions available online, but none worked for me.
Table :
Id rank
1 100
1 100
2 75
2 45
3 50
3 50
I want Ids 1 and 3 returned, because they have duplicates.
I tried something like
select * from A where rank in (
select rank from A group by rank having count(rank) > 1)
This also returned ids without any duplicates. Please help.
Try this:
select id from table
group by id, rank
having count(*) > 1
select id, rank
from
(
select id, rank, count(*) cnt
from rank_tab
group by id, rank
having count(*) > 1
) t
This general idea should work:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > 1 AND COUNT(DISTINCT rank) = 1
In plain English: get every id that exists in multiple rows, but all these rows have the same value in rank.
If you want ids that have some duplicated ranks (but not necessarily all), something like this should work:
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > COUNT(DISTINCT rank)
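The two HAVING conditions differ only when an id has some repeated ranks and some distinct ones. The sketch below shows that, assuming SQLite (via Python's sqlite3) as the engine and adding a hypothetical id 4 with ranks 10, 10 and 20 to the sample table; the column name is quoted because rank is also a function name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE your_table (id INTEGER, "rank" INTEGER)')
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [
        (1, 100), (1, 100),
        (2, 75), (2, 45),
        (3, 50), (3, 50),
        (4, 10), (4, 10), (4, 20),  # hypothetical id: only some ranks repeat
    ],
)

# ids that occur in several rows, all with the same rank
all_same_sql = """
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > 1 AND COUNT(DISTINCT "rank") = 1
ORDER BY id
"""

# ids where at least one rank occurs more than once
some_duplicates_sql = """
SELECT id
FROM your_table
GROUP BY id
HAVING COUNT(*) > COUNT(DISTINCT "rank")
ORDER BY id
"""

all_same = [r[0] for r in conn.execute(all_same_sql)]
some_duplicates = [r[0] for r in conn.execute(some_duplicates_sql)]
print(all_same)
print(some_duplicates)
```

Ids 1 and 3 satisfy both conditions; the hypothetical id 4 satisfies only the second one, and id 2 satisfies neither.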