Select distinct rows based on two columns with additional columns


I have a table such as this
Blog
Id ColumnId DateCreated Type
1 1 2018-01-01 1
1 1 2018-01-02 2
1 1 2018-02-01 3
I need to select all unique rows based on the combination of Id and ColumnId. Then it needs to grab me the latest date and largest Type. I can't seem to figure this out.
I started off with just getting distinct values like this:
SELECT Id, ColumnId
FROM Blog
GROUP BY Id, ColumnId
Then I figured that if I joined it on itself I could pull out the rest, but I'm having no luck working out how to do this:
SELECT *
FROM ( SELECT Id, ColumnId
FROM Blog
GROUP BY Id, ColumnId) A
INNER JOIN Blog B ON A.Id = B.Id AND A.ColumnId = B.ColumnId
But that just gives me back all 3 rows.
In my live example the Id column is not int, but uniqueidentifier. For sake of simplicity I made it int in my example.
Expected Result Sample:
Id ColumnId DateCreated Type
1 1 2018-02-01 3

Have you tried using MAX?
SELECT Id, ColumnId, MAX(DateCreated), MAX([Type])
FROM Blog
GROUP BY Id, ColumnId

Use row_number():
select b.*
from (select b.*,
row_number() over (partition by id, columnid order by DateCreated desc, Type desc) as seqnum
from blog b
) b
where seqnum = 1;
This grabs the largest type on the latest date, which is how I interpret "it needs to grab me the latest date and largest Type."
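To try this outside the real database, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the same ROW_NUMBER() pattern. The use of SQLite (3.25 or later, for window functions) through Python's sqlite3 module is an assumption made for the demo; the original table lives on another server and uses a uniqueidentifier Id.

```python
import sqlite3

# In-memory SQLite database standing in for the real server (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Blog (Id INTEGER, ColumnId INTEGER, DateCreated TEXT, Type INTEGER)"
)
conn.executemany(
    "INSERT INTO Blog VALUES (?, ?, ?, ?)",
    [(1, 1, "2018-01-01", 1), (1, 1, "2018-01-02", 2), (1, 1, "2018-02-01", 3)],
)

# Number the rows inside each (Id, ColumnId) group, newest date first and
# largest Type first within a date, then keep only the first row per group.
latest_rows = conn.execute("""
    SELECT Id, ColumnId, DateCreated, Type
    FROM (SELECT b.*,
                 ROW_NUMBER() OVER (PARTITION BY Id, ColumnId
                                    ORDER BY DateCreated DESC, Type DESC) AS seqnum
          FROM Blog b) ranked
    WHERE seqnum = 1
""").fetchall()

print(latest_rows)
```

Because the whole row is kept, DateCreated and Type always come from the same original record.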

It seems you need the latest row for each combination.
You could use an inner join on the max date:
select * from blog
inner join (
select id, columnId, max(DateCreated) as max_date
from blog
group by id, columnId
) t on t.id = blog.id and t.columnId = blog.columnId and t.max_date = blog.DateCreated

This may not be the best solution, but you could try using a correlated subselect in your query.
Something like this:
SELECT DISTINCT b.Id, b.ColumnId, (SELECT MAX(b2.Type) FROM Blog b2 WHERE b2.Id = b.Id AND b2.ColumnId = b.ColumnId) AS type
FROM Blog b

Related

Finding top count of a value in a table using SQL

I'm looking for a way to find the top count value of a column by SQL.
If for example this is my data
id type
----------
1 A
1 B
1 A
2 C
2 D
2 D
I would like the result to be:
1 A
2 D
I'm looking for a way to do it without grouping by the column I count (type in the example).
Thanks

Statistically, this is called the "mode". You can calculate it using window functions:
select id, type, cnt
from (select id, type, count(*) as cnt,
row_number() over (partition by id order by count(*) desc) as seqnum
from t
group by id, type
) t
where seqnum = 1;
If there are ties, then an arbitrary value is chosen from among the ties.
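The following is a small sketch of the same idea, run against the sample data from the question in an in-memory SQLite database (an assumption; any engine with window functions should behave similarly). As a further assumption for portability, the counting and the ranking are split into two nested subqueries instead of combining COUNT(*) and ROW_NUMBER() in one SELECT.

```python
import sqlite3

# In-memory SQLite database used only for this demo (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "A"), (2, "C"), (2, "D"), (2, "D")],
)

# Inner query: count each (id, type) pair.
# Middle query: rank the pairs inside each id by descending count.
# Outer query: keep the top-ranked pair per id.
modes = conn.execute("""
    SELECT id, type, cnt
    FROM (SELECT id, type, cnt,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY cnt DESC) AS seqnum
          FROM (SELECT id, type, COUNT(*) AS cnt
                FROM t
                GROUP BY id, type) counted) ranked
    WHERE seqnum = 1
    ORDER BY id
""").fetchall()

print(modes)
```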

You are looking for the statistical mode (the most frequently occurring value):
select id, stats_mode(type)
from mytable
group by id
order by id;
Not all DBMSs support this, however. Check your docs to see whether this function or a similar one is available in your DBMS.

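Just GROUP BY id, type and keep the rows with the maximum count. Note that the subquery computes the maximum over the whole table, so this returns one row per id only when every id's top count equals that overall maximum, as it does in the sample data: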
select id, type
from tablename
group by id, type
having count(*) = (
select count(*) from tablename group by id, type order by count(*) desc limit 1
)
Or
select id, type
from tablename
group by id, type
having count(*) = (
select max(t.counter) from (select count(*) counter from tablename group by id, type) t
)

Select record according to column value

I have a table as shown below, and I want to select each u_id with its type. If a u_id has more than one record, I want the record with the best type; if there is no best, then good, and so on (best > good > worst). So far I am only able to get the first row returned for each u_id.
u_id type
1 best
2 good
3 worst
2 best

You can prioritize this with row_number and select one row per u_id.
select u_id,type
from (
select u_id,type,
row_number() over(partition by u_id order by case when type='best' then 1
when type='good' then 2
when type='worst' then 3
else 4 end) as rn
from tablename
) t
where rn=1
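As a quick check of this approach, the sketch below runs the same CASE-based ordering on the four sample rows from the question. It assumes an in-memory SQLite database (3.25 or later) accessed through Python's sqlite3 module, purely for demonstration; the table name tablename is taken from the query above.

```python
import sqlite3

# In-memory SQLite database used only for this demo (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tablename (u_id INTEGER, type TEXT)")
conn.executemany(
    "INSERT INTO tablename VALUES (?, ?)",
    [(1, "best"), (2, "good"), (3, "worst"), (2, "best")],
)

# The CASE expression maps each type to a priority (lower is better);
# ROW_NUMBER() then puts the best available type first within each u_id.
best_per_user = conn.execute("""
    SELECT u_id, type
    FROM (SELECT u_id, type,
                 ROW_NUMBER() OVER (
                     PARTITION BY u_id
                     ORDER BY CASE WHEN type = 'best'  THEN 1
                                   WHEN type = 'good'  THEN 2
                                   WHEN type = 'worst' THEN 3
                                   ELSE 4 END) AS rn
          FROM tablename) t
    WHERE rn = 1
    ORDER BY u_id
""").fetchall()

print(best_per_user)
```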

with type (n, type) as (values
(1, 'best'),(2,'good'),(3,'worst')
)
select distinct on (u_id) u_id, type
from t inner join type using (type)
order by u_id, n

Both the other answers are really good. This is a variant on the distinct on version that doesn't require a join:
select distinct on (u_id) u_id, type
from t
order by u_id, array_position(array[('best'), ('good'), ('worst')], type)

Return two rows from SQL table with a difference in values [duplicate]

This question already has answers here:
How to request a random row in SQL?
(30 answers)
Closed 6 years ago.
I am trying to return 2 rows from a table that differ in their DATA values. Not being an SQL expert, I am stuck; any help would be appreciated :-)
TABLE A:
NAME DATA
Oscar HOME1
Jens HOME2
Will HOME1
Jeremy HOME2
Al HOME1
The result should be 2 random rows with different DATA values:
NAME DATA
Oscar HOME1
Jeremy HOME2
Anyone?

Easy way to have random data.
;with tblA as (
select name,data,
row_number() over(partition by data order by newid()) rn
from A
)
select name,data
from tblA
where rn = 1

It could be that you need:
select * from my_table a
inner join my_table b on a.data !=b.data
where a.data in ( SELECT data FROM my_table ORDER BY RAND() LIMIT 1);
For your code
SELECT *
FROM [dbo].[ComputerState] as a
INNER JOIN [dbo].[ComputerState] as b ON a.ServiceName != b.ServiceName
WHERE a.ServiceName IN (
SELECT top 1 [ServiceName] FROM [dbo].[ComputerState]
);

If the question is really this simple, you can use an aggregate such as MAX() or MIN() to grab one row for each different DATA:
SELECT MAX(NAME), DATA
FROM TABLE_A
GROUP BY DATA
Of course, if any other variables are introduced to the requirements, this may no longer work.

;WITH cteA AS (
SELECT
name
,data
,ROW_NUMBER() OVER (PARTITION BY data ORDER BY NEWID()) as DataRowNumber
,ROW_NUMBER() OVER (PARTITION BY 1 ORDER BY NEWID()) as RandomRowNumber
FROM
A
)
SELECT *
FROM
cteA
WHERE
DataRowNumber = 1
AND RandomRowNumber <= 2
This expands on @AlexKudryashev's answer a little.
;with tblA as (
select name,data,
row_number() over(partition by data order by newid()) rn
from A
)
select name,data
from tblA
where rn = 1
The only issue with that answer is that the number of rows where rn = 1 depends on COUNT(DISTINCT data), so it can return more than 2 results. To fix that, one could add a SELECT TOP 2 clause, but without an explicit ordering the result may not be fully random: it depends on the order in which SQL Server happens to return the rows, which is likely to be consistent. To get a truly random pair, add a second random row number and limit the results to the top 2 of those. Note that the two row numbers are computed independently, so this filter can return fewer than two rows.
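A different way to cap the result at two rows, sketched here as an alternative and not as the method used in the answers above, is to pick one random row per DATA value and then take two of those rows in random order. The sketch assumes an in-memory SQLite database, where RANDOM() stands in for NEWID() and LIMIT 2 stands in for TOP 2.

```python
import sqlite3

# In-memory SQLite database used only for this demo (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (name TEXT, data TEXT)")
conn.executemany(
    "INSERT INTO A VALUES (?, ?)",
    [("Oscar", "HOME1"), ("Jens", "HOME2"), ("Will", "HOME1"),
     ("Jeremy", "HOME2"), ("Al", "HOME1")],
)

# Step 1: number the rows of each data group in random order (rn = 1 is a
#         random representative of that group).
# Step 2: shuffle the representatives and keep at most two of them.
picked = conn.execute("""
    WITH one_per_data AS (
        SELECT name, data,
               ROW_NUMBER() OVER (PARTITION BY data ORDER BY RANDOM()) AS rn
        FROM A
    )
    SELECT name, data
    FROM one_per_data
    WHERE rn = 1
    ORDER BY RANDOM()
    LIMIT 2
""").fetchall()

print(picked)  # the names vary from run to run
```

Every row that survives the rn = 1 filter has a distinct DATA value, so the two rows returned always differ in DATA as long as the table holds at least two distinct values.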

tsql group by get alphanumeric column value with maximum length

I have a sql view, let's call it SampleView, whose results have the following format.
Id (INT), NameA (VARCHAR(50)), NameB (VARCHAR(50)), ValueA (INT), ValueB (INT)
The result set of the view contains rows that may have the same Id or not. When there are two or more rows with the same Id, I would like to get something like the following
SELECT
Id,
MAX(NameA),
MAX(NameB),
MAX(ValueA),
MAX(ValueB)
FROM SampleView
GROUP BY Id
ORDER BY Id
There is no problem with the columns Id, ValueA and ValueB. With MAX on NameA and NameB, however, the results are not what I expected. After some googling I realized that MAX does not behave the way I "expected" for alphanumeric columns. By "expected" I mean that MAX would return the value of NameA with the maximum number of characters, i.e. the one with MAX(LEN(NameA)). I should mention that NameA can never have two values of the same length for the same Id, which might make the problem easier to solve.
I use SQL Server 2012 and TSQL.
Have you any suggestion on how I could deal with this problem?
Thank you very much in advance for any help.

You can use window functions:
SELECT DISTINCT
id,
FIRST_VALUE(NameA) OVER (PARTITION BY id
ORDER BY len(NameA) DESC) AS MaxNameA,
MAX(ValueA) OVER (PARTITION BY id) AS MaxValueA,
FIRST_VALUE(NameB) OVER (PARTITION BY id
ORDER BY len(NameB) DESC) AS MaxNameB,
MAX(ValueB) OVER (PARTITION BY id) AS MaxValueB
FROM SampleView
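A small sketch of the FIRST_VALUE approach follows. It assumes an in-memory SQLite database in which SampleView is an ordinary table filled with made-up rows (the names and values are hypothetical, chosen only for illustration), and it uses SQLite's LENGTH() where T-SQL uses LEN().

```python
import sqlite3

# In-memory SQLite database; SampleView is modelled as a plain table and the
# rows below are invented for illustration (assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SampleView "
    "(Id INTEGER, NameA TEXT, NameB TEXT, ValueA INTEGER, ValueB INTEGER)"
)
conn.executemany(
    "INSERT INTO SampleView VALUES (?, ?, ?, ?, ?)",
    [(1, "Al", "Bobby", 10, 5),
     (1, "Alexander", "Bo", 3, 7),
     (2, "Eve", "Dan", 1, 2)],
)

# FIRST_VALUE over a window ordered by descending length returns the longest
# name per Id; MAX over the partition returns the largest value per Id.
# DISTINCT collapses the identical per-row results into one row per Id.
longest = conn.execute("""
    SELECT DISTINCT
           Id,
           FIRST_VALUE(NameA) OVER (PARTITION BY Id
                                    ORDER BY LENGTH(NameA) DESC) AS MaxNameA,
           MAX(ValueA) OVER (PARTITION BY Id) AS MaxValueA,
           FIRST_VALUE(NameB) OVER (PARTITION BY Id
                                    ORDER BY LENGTH(NameB) DESC) AS MaxNameB,
           MAX(ValueB) OVER (PARTITION BY Id) AS MaxValueB
    FROM SampleView
    ORDER BY Id
""").fetchall()

print(longest)
```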

You can use correlated queries like this:
SELECT
t.Id,
(SELECT TOP 1 s.NameA FROM SampleView s
WHERE s.id = t.id
ORDER BY LEN(s.NameA) DESC) as NameA,
(SELECT TOP 1 s.NameB FROM SampleView s
WHERE s.id = t.id
ORDER BY LEN(s.NameB) DESC) as NameB,
MAX(t.ValueA),
MAX(t.ValueB)
FROM SampleView t
GROUP BY t.Id
ORDER BY t.Id

One possible variant is to use ROW_NUMBER twice:
WITH
CTE_NameA
AS
(
SELECT
Id,
NameA,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY LEN(NameA) DESC) AS rnA
FROM SampleView
)
,CTE_NameB
AS
(
SELECT
Id,
NameB,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY LEN(NameB) DESC) AS rnB
FROM SampleView
)
SELECT
SampleView.Id,
CTE_NameA.NameA,
CTE_NameB.NameB,
MAX(SampleView.ValueA),
MAX(SampleView.ValueB)
FROM
SampleView
INNER JOIN CTE_NameA ON CTE_NameA.Id = SampleView.Id AND CTE_NameA.rnA = 1
INNER JOIN CTE_NameB ON CTE_NameB.Id = SampleView.Id AND CTE_NameB.rnB = 1
GROUP BY SampleView.Id, CTE_NameA.NameA, CTE_NameB.NameB
ORDER BY SampleView.Id;

How do I return a single row based on the aggregate of more than one column

Sorry for the ambiguous title, not sure how to search, or ask this question.
Let's say we have TableA:
RowID FkId Rank Date
ID1 A 1 2013-3-1
ID2 A 2 2013-3-2
ID3 A 2 2013-3-3
ID4 B 3 2013-3-4
ID5 A 1 2013-3-5
I need to create a view, that will return 1 row for each FkId. The row should be the max rank, and max date. So for FkId "A", the query would return the row for "ID3".
I was able to return a single row by using sub-queries; first I get the MAX(Rank), then join to another query that gets MAX(Date) group by FkId & Rank.
SELECT TableA.*
FROM (Select FkId, MAX(Rank) AS Rank FROM TableA GROUP BY FkId) s1
INNER JOIN (Select FkId, Rank, MAX(Date) AS Date FROM TableA GROUP BY FkId,Rank) s2 ON s1.FkId = s2.FkId AND s1.Rank = s2.Rank
INNER JOIN TableA ON s2.FkId = TableA.FkId AND s2.Rank = TableA.Rank AND s2.Date = TableA.Date
Is there a more efficient query that would achieve the same results? Thanks for looking.
Edit: Added ID5 since the last answer. If I tried a normal MAX(rank),MAX(Date) GROUP BY FkId, then for "A", I would get A; 2; 2013-3-5. This result would not match up to a RowId.

You can use ROW_NUMBER with a CTE (presuming sql-server >= 2005):
WITH CTE AS
(
SELECT TableA.*,
RN = ROW_NUMBER() OVER (PARTITION BY FkId Order By Rank Desc, Date DESC)
FROM TableA
)
SELECT RowID,FkId, Rank,Date
FROM CTE WHERE RN = 1
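The sketch below contrasts a plain MAX/MAX GROUP BY with the ROW_NUMBER approach on the sample rows from the question, including ID5. It assumes an in-memory SQLite database; the column names row_id, fk_id, rnk and created are renamed stand-ins for RowID, FkId, Rank and Date, and the dates are zero-padded so that text comparison orders them correctly (all assumptions made for the demo).

```python
import sqlite3

# In-memory SQLite database; column names are renamed stand-ins (assumptions).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TableA (row_id TEXT, fk_id TEXT, rnk INTEGER, created TEXT)"
)
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?, ?)",
    [("ID1", "A", 1, "2013-03-01"),
     ("ID2", "A", 2, "2013-03-02"),
     ("ID3", "A", 2, "2013-03-03"),
     ("ID4", "B", 3, "2013-03-04"),
     ("ID5", "A", 1, "2013-03-05")],
)

# Independent maxima: for A this mixes the rank of ID2/ID3 with the date of ID5.
grouped = conn.execute("""
    SELECT fk_id, MAX(rnk), MAX(created)
    FROM TableA
    GROUP BY fk_id
    ORDER BY fk_id
""").fetchall()

# ROW_NUMBER keeps whole rows: highest rank first, latest date as tie-breaker.
top_rows = conn.execute("""
    WITH cte AS (
        SELECT TableA.*,
               ROW_NUMBER() OVER (PARTITION BY fk_id
                                  ORDER BY rnk DESC, created DESC) AS rn
        FROM TableA
    )
    SELECT row_id, fk_id, rnk, created
    FROM cte
    WHERE rn = 1
    ORDER BY fk_id
""").fetchall()

print(grouped)
print(top_rows)
```

For FkId A the grouped query yields rank 2 with the date of ID5, which matches no single row, while the ROW_NUMBER query returns the actual row ID3.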

Your question (clarified in the comments to this answer) asks for:
1. A single row for each FkId
2. The Max Date and Rank
3. The results to correspond to a row in the original data.
In the case that there are FkIds with rows such that the maximum date and maximum rank are in separate rows, you'll have to relax at least one of these requirements.
If you're willing to relax requirement (3), then you can use GROUP BY:
SELECT FkId, MAX(Rank) AS Rank, Max(Date) AS Date
FROM TableA
GROUP BY FkId
Given the extra information in the comments, namely that you want the latest of the highest-ranked entries for each FkId, the following should work:
SELECT FkId, Rank, MAX(Date) AS Date
FROM TableA A
WHERE Rank = (SELECT MAX(Rank)
FROM TableA sub
WHERE A.FkId = sub.FkId
GROUP BY sub.FkId)
GROUP BY FkId, Rank

You can use Rank() and inline query to achieve it.
select * from TableA
where RowID in (
select rowID from (
select FKID, RowID,
rank() over (partition by FKID order by [Rank] desc, [Date] desc) as RankNumber
from TableA ) A
where A.RankNumber=1 )

You can also be sneaky and accomplish what ljh suggested like this:
select top 1 with ties *
from TableA
order by rank() over (
partition by FKID
order by [Rank] desc, [Date] desc
)
