SQL rank/dense_rank and how to query/calculate with the result

I have a query that applies DENSE_RANK to my rows.
Here is the result:
COL1 | COL2 | COL3 | DENSE_RANK |
a | b | c | 1 |
a | s | r | 1 |
a | w | f | 1 |
b | b | c | 2 |
c | f | r | 3 |
c | q | d | 3 |
Now I want to select only the rows whose rank appears exactly once: rank 2 stands alone, but ranks 1 and 3 do not. How do I select all the rows where this occurs?
Some ideas:
-COUNT DISTINCT (RANK())
-COUNT RANK()
but neither of those works. Any ideas? Please and thank you!
happy hacking
actual code:
SELECT events.event_type AS "event",
DENSE_RANK() OVER (ORDER BY bw_user_event.pad_id) as rank
FROM user_event
WHERE (software_events.software_id = '8' OR software_events.software_id = '14')
AND (software_events.event_type = 'install')

WITH Dense_ranked_table as (
-- Your select query that generates the table with dense ranks
)
SELECT DENSE_RANK
FROM Dense_ranked_table
GROUP BY DENSE_RANK
HAVING COUNT(DENSE_RANK) = 1;
I don't have SQL Server to test this. So please let me know whether this works or not.

I would think you can add a COUNT(*) OVER (PARTITION BY XXXXX), where XXXXX is whatever you order by in your dense rank.
Then wrap this in a common table expression and select the rows where the new count equals 1.
Something like this SQL Fiddle:
http://sqlfiddle.com/#!6/ae774/1
Code included here as well:
CREATE TABLE T
(
COL1 CHAR,
COL2 CHAR,
COL3 CHAR
);
INSERT INTO T
VALUES
('a','b','c'),
('a','s','r'),
('a','w','f'),
('b','b','c'),
('c','f','r'),
('c','q','d');
WITH CTE AS (
SELECT COL1 ,
COL2 ,
COL3,
DENSE_RANK() OVER (ORDER BY COL1) AS DR,
COUNT(*) OVER (PARTITION BY COL1) AS C
FROM dbo.T AS t
)
SELECT COL1, COL2, COL3, DR
FROM CTE
WHERE C = 1
Would return just the
b, b, c, 2
row from your test data.
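The COUNT(*) OVER (PARTITION BY ...) idea above can be checked without a SQL Server instance. The following is a minimal sketch that runs the same CTE through Python's built-in sqlite3 module; SQLite (3.25 or later, for window functions) is an assumption made here, standing in for SQL Server, and the `dbo.` schema prefix is dropped because SQLite has no such schema.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (COL1 TEXT, COL2 TEXT, COL3 TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?, ?)",
    [
        ("a", "b", "c"),
        ("a", "s", "r"),
        ("a", "w", "f"),
        ("b", "b", "c"),
        ("c", "f", "r"),
        ("c", "q", "d"),
    ],
)

# C counts how many rows share the value that drives the dense rank,
# so C = 1 keeps only the ranks that occur exactly once.
single_rank_query = """
WITH CTE AS (
    SELECT COL1, COL2, COL3,
           DENSE_RANK() OVER (ORDER BY COL1) AS DR,
           COUNT(*) OVER (PARTITION BY COL1) AS C
    FROM T
)
SELECT COL1, COL2, COL3, DR
FROM CTE
WHERE C = 1
"""
single_rank_rows = conn.execute(single_rank_query).fetchall()
print(single_rank_rows)
```

The partition column must match the ORDER BY column of the dense rank; otherwise the count no longer describes the size of each rank group.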

Related

Two rows with the same id and two different values, getting the second value into another column

I have two rows with the same ID but different values. I want a query that gets the second value and displays it in the first row.
There are only two rows for each productId, with two different values.
I've looked for a solution everywhere.
What I have, example:
+-----+-------+
| ID | Value |
+-----+-------+
| 123 | 1 |
| 123 | 2 |
+-----+-------+
What I want
+------+-------+---------+
| ID | Value | Value 1 |
+------+-------+---------+
| 123 | 1 | 2 |
+------+-------+---------+
Not sure whether order matters to you. Here is one way:
SELECT MIN(Value), MAX(Value), ID
FROM Table
GROUP BY ID;
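This is a self-join:
SELECT a.ID, a.Value, b.Value
FROM table a
JOIN table b on a.ID = b.ID
and a.Value <> b.Value
Note that <> returns each pair twice, once in each order; change it to a.Value < b.Value to keep a single row per ID.
You can use a LEFT JOIN instead if there are IDs that have only one value and would be lost by the above JOIN.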
Maybe you can try this:
DECLARE @T TABLE
(
Id INT,
Val INT
)
INSERT INTO @T
VALUES(123,1),(123,2),
(456,1),(789,1),(789,2)
;WITH CTE
AS
(
SELECT
RN = ROW_NUMBER() OVER(PARTITION BY Id ORDER BY Val),
*
FROM @T
)
SELECT
*
FROM CTE
PIVOT
(
MAX(Val)
FOR
RN IN
(
[1],[2]--Add more numbers here if there are more values
)
)Q
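PIVOT is specific to SQL Server. As a hedged, portable sketch of the same idea, the row number can be spread into columns with conditional aggregation instead. The example below uses Python's sqlite3 module as a stand-in engine (an assumption, requiring SQLite 3.25 or later), and the output column names Value1 and Value2 are chosen here for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (Id INTEGER, Val INTEGER)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(123, 1), (123, 2), (456, 1), (789, 1), (789, 2)],
)

# Number the values within each Id, then move value 1 and value 2
# into their own columns with MAX(CASE ...), one output row per Id.
pivot_query = """
WITH CTE AS (
    SELECT Id, Val,
           ROW_NUMBER() OVER (PARTITION BY Id ORDER BY Val) AS RN
    FROM T
)
SELECT Id,
       MAX(CASE WHEN RN = 1 THEN Val END) AS Value1,
       MAX(CASE WHEN RN = 2 THEN Val END) AS Value2
FROM CTE
GROUP BY Id
ORDER BY Id
"""
pivot_rows = conn.execute(pivot_query).fetchall()
print(pivot_rows)
```

An Id with only one value (456 here) keeps its row and gets NULL (None in Python) in the second column.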

Get row which matched in each group

I am trying to write a SQL query. I get the results below from two tables, and they are fine. Now I want only the values that are present in every group. For example, A and B are present in every group (in each ID), so I want only A and B in the result. I also want the query to be dynamic. Could anyone help?
| ID | Value |
|----|-------|
| 1 | A |
| 1 | B |
| 1 | C |
| 1 | D |
| 2 | A |
| 2 | B |
| 2 | C |
| 3 | A |
| 3 | B |
In the following query, I have placed your current query into a CTE for further use. We can try selecting those values for which every ID in your current result appears. This would imply that such values are associated with every ID.
WITH cte AS (
-- your current query
)
SELECT Value
FROM cte
GROUP BY Value
HAVING COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) FROM cte);
One solution is to group by letter (Value), count the distinct IDs in each group, and keep the letters whose count equals the total number of distinct IDs:
select Value from MyTable group by Value
having COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) from MyTable)
A SUM-based variant also works, but only if every ID is positive and no (ID, Value) pair is repeated:
select Value from MyTable group by Value
having SUM(ID) = (SELECT SUM(DISTINCT ID) from MyTable)
Use this:
WITH CTE
AS
(
SELECT
Value,
Cnt = COUNT(DISTINCT ID)
FROM T1
GROUP BY Value
)
SELECT
Value
FROM CTE
WHERE Cnt = (SELECT COUNT(DISTINCT ID) FROM T1)
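The answers above all compare a per-value count of distinct IDs with the overall count of distinct IDs. Here is a small sketch of that comparison on the question's sample data, run through Python's sqlite3 module as a stand-in engine (an assumption; the table name T1 follows the last answer).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T1 (ID INTEGER, Value TEXT)")
conn.executemany(
    "INSERT INTO T1 VALUES (?, ?)",
    [
        (1, "A"), (1, "B"), (1, "C"), (1, "D"),
        (2, "A"), (2, "B"), (2, "C"),
        (3, "A"), (3, "B"),
    ],
)

# A value is "present in every group" when the number of distinct IDs
# it appears with equals the number of distinct IDs overall.
common_query = """
SELECT Value
FROM T1
GROUP BY Value
HAVING COUNT(DISTINCT ID) = (SELECT COUNT(DISTINCT ID) FROM T1)
ORDER BY Value
"""
common_values = [row[0] for row in conn.execute(common_query)]
print(common_values)
```

Because the right-hand side is computed from the data, the query needs no changes when IDs are added or removed, which is one reading of the "dynamic" requirement.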

TSQL Number Rows Based on change in fieldvalue and sorted on date with incremented numbers on duplicates

Say I have data like the following:
X | 2/2/2000
X | 2/3/2000
B | 2/4/2000
B | 2/10/2000
B | 2/10/2000
J | 2/11/2000
X | 3/1/2000
I would like to get a dataset like this:
1 | X | 2/2/2000
1 | X | 2/3/2000
2 | B | 2/4/2000
2 | B | 2/10/2000
2 | B | 2/10/2000
3 | J | 2/11/2000
4 | X | 3/1/2000
So far, everything I have tried has either restarted the count at each change of field value or, as in the example, left the last X numbered 1.
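This is a gaps-and-islands problem. You can use a difference of row numbers to identify each island, then number the islands by the date on which each one starts:
select dense_rank() over (order by grp_start) as col0,
col1, col2
from (select t.*,
min(col2) over (partition by col1, seqnum_1 - seqnum_2) as grp_start
from (select t.*,
row_number() over (order by col2) as seqnum_1,
row_number() over (partition by col1 order by col2) as seqnum_2
from t
) t
) t;
Ordering the dense_rank() by col1 and the difference directly would number the islands alphabetically (B, J, X, X) rather than in date order.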
Explaining why this works is a bit cumbersome. If you run the subquery, you will see how the sequence numbers are assigned and why the difference is what you want.
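To make the difference-of-row-numbers idea concrete, here is a sketch on the question's data using Python's sqlite3 module (an assumption standing in for SQL Server; SQLite 3.25 or later). Dates are stored as ISO strings so that they sort correctly as text, and the names island_start, numbered, and islands are illustrative. Within one run of equal col1 values both row numbers advance together, so their difference stays constant; each island is then numbered by its earliest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT, col2 TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [
        ("X", "2000-02-02"), ("X", "2000-02-03"),
        ("B", "2000-02-04"), ("B", "2000-02-10"), ("B", "2000-02-10"),
        ("J", "2000-02-11"),
        ("X", "2000-03-01"),
    ],
)

island_query = """
WITH numbered AS (
    SELECT col1, col2,
           ROW_NUMBER() OVER (ORDER BY col2) AS seqnum_1,
           ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS seqnum_2
    FROM t
), islands AS (
    -- seqnum_1 - seqnum_2 is constant inside each run of equal col1 values
    SELECT col1, col2,
           MIN(col2) OVER (PARTITION BY col1, seqnum_1 - seqnum_2) AS island_start
    FROM numbered
)
SELECT DENSE_RANK() OVER (ORDER BY island_start) AS col0, col1, col2
FROM islands
ORDER BY col2
"""
island_rows = conn.execute(island_query).fetchall()
for row in island_rows:
    print(row)
```

Note that the last X receives 4 rather than 1, because its row-number difference differs from that of the first two X rows.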
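You can query like this:
SELECT dense_rank() over(order by yourcolumn1), * from yourtable
Note, however, that this numbers rows by value alone, so the final X would get the same number as the first two X rows.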

SQL Select First column and for each row select unique ID and the last date

I have a problem this morning: I have tried many solutions and none gave me the expected result.
I have a table that looks like this:
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 1 | 2001 |
| 1 | 2 | 2002 |
| 1 | 3 | 2003 |
| 1 | 4 | 2004 |
| 2 | 1 | 2001 |
| 2 | 2 | 2002 |
| 2 | 3 | 2003 |
| 2 | 4 | 2004 |
+----+----------+-------+
I want a query that returns a result like this: for each unique ID, the row with the last date for that ID.
+----+----------+-------+
| ID | COL2 | DATE |
+----+----------+-------+
| 1 | 4 | 2004 |
| 2 | 4 | 2004 |
+----+----------+-------+
But I have no idea how to do that.
I tried JOIN and CROSS APPLY.
If you have any ideas, please let me know.
Thank you
Clement FAYARD
declare @t table (ID INT,Col2 INT,Date INT)
insert into @t(ID,Col2,Date)values (1,1,2001)
insert into @t(ID,Col2,Date)values (1,2,2001)
insert into @t(ID,Col2,Date)values (1,3,2001)
insert into @t(ID,Col2,Date)values (1,4,2001)
insert into @t(ID,Col2,Date)values (2,1,2002)
insert into @t(ID,Col2,Date)values (2,2,2002)
insert into @t(ID,Col2,Date)values (2,3,2002)
insert into @t(ID,Col2,Date)values (2,4,2002)
;with cte as(
select
*,
rn = row_number() over(partition by ID order by Col2 desc)
from @t
)
select
ID,
Col2,
Date
from cte
where
rn = 1
SELECT ID,MAX(Col2),MAX(Date) FROM tableName GROUP BY ID
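If COL2 and DATE always reach their highest values in the same row, you can try:
SELECT ID, MAX(COL2), MAX(DATE)
FROM Table1
GROUP BY ID
But this is fragile: if the two maxima come from different rows, the result combines values that never occurred together.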
The alternative is a correlated subquery with CROSS APPLY:
SELECT ids.ID, sub1.COL2, sub1.DATE
FROM (SELECT DISTINCT ID FROM yourtable) ids
CROSS APPLY -- a plain INNER JOIN cannot reference ids.ID inside the derived table
(SELECT TOP 1 sub2.COL2, sub2.DATE
FROM yourtable sub2
WHERE sub2.ID = ids.ID
ORDER BY sub2.DATE DESC, sub2.COL2 DESC) sub1
You didn't say what your table is called, so I'll assume below that it is tbl:
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE
WHERE o.DATE is NULL
ORDER BY m.ID ASC
Explanation:
The query left joins the table tbl aliased as m (for "max") against itself (alias o, for "others") using the column ID; the condition m.DATE < o.DATE will combine all the rows from m with rows from o having a greater value in DATE. The row having the maximum value of DATE for a given value of ID from m has no pair in o (there is no value greater than the maximum value). Because of the LEFT JOIN this row will be combined with a row of NULLs. The WHERE clause selects only these rows that have NULL for o.DATE (i.e. they have the maximum value of m.DATE).
Check the SQL Antipatterns: Avoiding the Pitfalls of Database Programming book for other SQL tips.
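The LEFT JOIN explanation above can be tried on the question's data. This is a minimal sketch using Python's sqlite3 module as a stand-in engine (an assumption); the table name tbl follows the answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER, COL2 INTEGER, DATE INTEGER)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?)",
    [
        (1, 1, 2001), (1, 2, 2002), (1, 3, 2003), (1, 4, 2004),
        (2, 1, 2001), (2, 2, 2002), (2, 3, 2003), (2, 4, 2004),
    ],
)

# Each row m is paired with every row o of the same ID that has a later
# DATE. Only the row with the latest DATE finds no partner, so o.DATE is
# NULL for exactly that row.
latest_query = """
SELECT m.ID, m.COL2, m.DATE
FROM tbl m
LEFT JOIN tbl o ON m.ID = o.ID AND m.DATE < o.DATE
WHERE o.DATE IS NULL
ORDER BY m.ID ASC
"""
latest_rows = conn.execute(latest_query).fetchall()
print(latest_rows)
```

If two rows of the same ID share the maximum DATE, both are returned, since neither has a strictly later partner.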
In order to do this you must exclude COL2. Your query should look like this:
SELECT ID, MAX(DATE)
FROM table_name
GROUP BY ID
The above query produces the maximum date for each ID.
Including COL2 in that query does not make sense unless you want the maximum date for each combination of ID and COL2.
In that case you can run:
SELECT ID, COL2, MAX(DATE)
FROM table_name
GROUP BY ID, COL2;
When you use aggregate functions (like MAX()), you must group by every other column in the select list that is not itself aggregated.
I think you are facing this problem because of a fundamental flaw in the design of the table. Usually ID should be a primary key (which is unique), but this table has repeated IDs. I do not understand the business logic behind the table, but it seems flawed to me.

Grouping SQL Results based on order

I have a table with data something like this:
ID | RowNumber | Data
------------------------------
1 | 1 | Data
2 | 2 | Data
3 | 3 | Data
4 | 1 | Data
5 | 2 | Data
6 | 1 | Data
7 | 2 | Data
8 | 3 | Data
9 | 4 | Data
I want to group each set of RowNumbers so that my result is something like this:
ID | RowNumber | Group | Data
--------------------------------------
1 | 1 | a | Data
2 | 2 | a | Data
3 | 3 | a | Data
4 | 1 | b | Data
5 | 2 | b | Data
6 | 1 | c | Data
7 | 2 | c | Data
8 | 3 | c | Data
9 | 4 | c | Data
The only way I can tell where each group starts and stops is that the RowNumber starts over. How can I accomplish this? It also needs to be fairly efficient, since the table I need to do this on has 52 million rows.
Additional Info
ID is truly sequential, but RowNumber may not be. I think RowNumber will always begin with 1 but for example the RowNumbers for group1 could be "1,1,2,2,3,4" and for group2 they could be "1,2,4,6", etc.
For the clarified requirements in the comments
The rownumbers for group1 could be "1,1,2,2,3,4" and for group2 they
could be "1,2,4,6" ... a higher number followed by a lower would be a
new group.
A SQL Server 2012 solution could be as follows.
Use LAG to access the previous row and set a flag to 1 if that row is the start of a new group or 0 otherwise.
Calculate a running sum of these flags to use as the grouping value.
Code
WITH T1 AS
(
SELECT *,
LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
FROM YourTable
), T2 AS
(
SELECT *,
IIF(PrevRowNumber IS NULL OR PrevRowNumber > RowNumber, 1, 0) AS NewGroup
FROM T1
)
SELECT ID,
RowNumber,
Data,
SUM(NewGroup) OVER (ORDER BY ID
ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
Assuming ID is the clustered index the plan for this has one scan against YourTable and avoids any sort operations.
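As a sketch of the LAG-plus-running-sum steps above, the example below uses the RowNumber sequences from the clarified requirements ("1,1,2,2,3,4" followed by "1,2,4,6"). It runs on Python's sqlite3 module as a stand-in for SQL Server 2012 (an assumption; SQLite 3.25 or later), and CASE replaces IIF for portability. The Data column is omitted because it plays no part in the grouping.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE YourTable (ID INTEGER PRIMARY KEY, RowNumber INTEGER)")
conn.executemany(
    "INSERT INTO YourTable VALUES (?, ?)",
    list(enumerate([1, 1, 2, 2, 3, 4, 1, 2, 4, 6], start=1)),
)

grouping_query = """
WITH T1 AS (
    SELECT ID, RowNumber,
           LAG(RowNumber) OVER (ORDER BY ID) AS PrevRowNumber
    FROM YourTable
), T2 AS (
    -- Flag a row as the start of a group when it is the first row
    -- or when the previous RowNumber was higher.
    SELECT ID, RowNumber,
           CASE WHEN PrevRowNumber IS NULL OR PrevRowNumber > RowNumber
                THEN 1 ELSE 0 END AS NewGroup
    FROM T1
)
SELECT ID, RowNumber,
       SUM(NewGroup) OVER (ORDER BY ID
                           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS Grp
FROM T2
ORDER BY ID
"""
group_rows = conn.execute(grouping_query).fetchall()
print([grp for _, _, grp in group_rows])
```

Repeated RowNumber values (the two 1s and two 2s) do not start a new group, because only a drop in RowNumber sets the flag.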
If both the IDs and the RowNumbers are truly sequential (no gaps or repeats within a group), you can do:
select t.*,
(id - rowNumber) as grp
from t
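You can also use a recursive CTE. This assumes that the IDs start at 1 with no gaps and that each group begins with exactly one row where RowNumber = 1. SQL Server stops after 100 recursion levels by default, so a table with more rows than that needs OPTION (MAXRECURSION 0):
;WITH cte AS
(
SELECT ID, RowNumber, Data, 1 AS [Group]
FROM dbo.test1
WHERE ID = 1
UNION ALL
SELECT t.ID, t.RowNumber, t.Data,
CASE WHEN t.RowNumber != 1 THEN c.[Group] ELSE c.[Group] + 1 END
FROM dbo.test1 t JOIN cte c ON t.ID = c.ID + 1
)
SELECT *
FROM cte
OPTION (MAXRECURSION 0)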
How about:
select ID, RowNumber, Data, dense_rank() over (order by grp) as Grp
from (
select *, (select max(ID) from [Your Table] where ID <= t.ID and RowNumber = 1) as grp
from [Your Table] t
) t
order by ID
Here grp is the ID of the most recent row with RowNumber = 1, that is, the start of the current group. This should work on SQL Server 2005. You could also use rank() instead if you don't care about consecutive numbers.