SQL Adding row numbers - sql

I am looking for a way to add row numbers, repeating the same row number when the values in one of the columns are duplicated.
Logic
* RowNo restarts from 1 for each new Col1 value
* Rows where Col1 + Col2 are the same get the same RowNo
Table1
Col1 Col2
1 A
1 B
1 B
2 C
2 D
2 E
3 F
4 G
Output should be
Col1 Col2 RowNo
1 A 1
1 B 2
1 B 2
2 C 1
2 D 2
2 E 3
3 F 1
4 G 1
I have tried the following, but the output is not correct:
select col1,col2
,row_number() over(partition by (col1+col2) order by col1)
from Table1

Use DENSE_RANK():
SELECT Col1, Col2,
DENSE_RANK() OVER (PARTITION BY Col1 ORDER BY Col2) RowNo
FROM yourTable
ORDER BY Col1, Col2;

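You can use the row_number window function, partitioning on the col1 column and ordering by col2. Note that row_number gives tied rows different numbers, so the two (1, B) rows get 2 and 3; dense_rank is needed if they must share a number.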
select t.*,
row_number() over (partition by col1 order by col2) as col3
from your_table t;
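To see how the two answers above differ on the duplicated (1, B) rows, here is a minimal sketch that runs both window functions side by side. It assumes an in-memory SQLite database (via Python's sqlite3 module, SQLite 3.25 or later) as a stand-in for the asker's database; the column aliases dense_no and row_no are made up for illustration.

```python
import sqlite3

# In-memory SQLite is an assumption; it stands in for the asker's database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Col1 INTEGER, Col2 TEXT)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "B"), (2, "C"),
     (2, "D"), (2, "E"), (3, "F"), (4, "G")],
)

# dense_no and row_no are illustrative aliases, not names from the question.
rows = conn.execute(
    """
    SELECT Col1, Col2,
           DENSE_RANK() OVER (PARTITION BY Col1 ORDER BY Col2) AS dense_no,
           ROW_NUMBER() OVER (PARTITION BY Col1 ORDER BY Col2) AS row_no
    FROM Table1
    ORDER BY Col1, Col2, row_no
    """
).fetchall()

for row in rows:
    print(row)
```

In this sketch both (1, B) rows get dense_no 2, while row_no gives them 2 and 3, so only DENSE_RANK matches the requested output.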

Related

How can I find groups with more than one row and list the rows in each such group?

I have a table "mytable" in a database.
Given a subset of the columns of the table, I would like to group by that subset and find the groups with more than one row.
For example, if the table is
col1 col2 col3
1 1 1
1 1 2
1 2 1
2 2 1
2 2 3
2 1 1
I am interested in finding the groups by col1 and col2 with more than one row, which are:
col1 col2 col3
1 1 1
1 1 2
and
col1 col2 col3
2 2 1
2 2 3
I was wondering how to write a SQL query for that purpose?
Is the following the best way to do that?
First get the col1 and col2 values of such groups:
SELECT col1, col2, COUNT(*)
FROM mytable
GROUP BY col1, col2
HAVING COUNT(*) > 1
Then based on the output of the previous query, manually write a query for each group:
SELECT *
FROM mytable
WHERE col1 = val1 AND col2 = val2
If there are many such groups, then I will have to manually write many queries, which can be a disadvantage.
I am using SQL Server.
Thanks.
This is a common problem. One solution is to get the "keys" in a derived table and join to that to get the rows.
declare @test table (col1 int, col2 int, col3 int)
insert into @test values (1,1,1),(1,1,2),(1,2,1),(2,2,1),(2,2,3),(2,1,1)
select t.*
from @test t
inner join (
select col1, col2
from @test
group by col1, col2
having count(*) > 1
) k
on k.col1 = t.col1 and k.col2 = t.col2
col1 col2 col3
----------- ----------- -----------
1 1 1
1 1 2
2 2 1
2 2 3
The window function sum() over() may help here
Example
with cte as (
Select *
,Cnt = sum(1) over (partition by Col1,Col2)
From YourTable
)
Select *
From cte
Where Cnt>=2
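As a rough check that the derived-table join and the windowed count select the same rows, here is a minimal sketch on the question's sample data. It assumes an in-memory SQLite database (via Python's sqlite3 module) rather than SQL Server, and it uses COUNT(*) OVER in place of sum(1) over, which counts the same rows.

```python
import sqlite3

# In-memory SQLite is an assumption; the question itself targets SQL Server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 INTEGER, col3 INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [(1, 1, 1), (1, 1, 2), (1, 2, 1), (2, 2, 1), (2, 2, 3), (2, 1, 1)],
)

# Approach 1: find the duplicated (col1, col2) keys, then join back to the rows.
joined = conn.execute(
    """
    SELECT t.col1, t.col2, t.col3
    FROM mytable t
    JOIN (SELECT col1, col2
          FROM mytable
          GROUP BY col1, col2
          HAVING COUNT(*) > 1) k
      ON k.col1 = t.col1 AND k.col2 = t.col2
    ORDER BY t.col1, t.col2, t.col3
    """
).fetchall()

# Approach 2: attach the group size to every row, then filter on it.
windowed = conn.execute(
    """
    SELECT col1, col2, col3
    FROM (SELECT *, COUNT(*) OVER (PARTITION BY col1, col2) AS cnt
          FROM mytable)
    WHERE cnt >= 2
    ORDER BY col1, col2, col3
    """
).fetchall()

print(joined)
print(windowed)
```

Both queries return the four rows of the (1, 1) and (2, 2) groups, without writing one query per group by hand.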
Another option (less performant)
Select top 1 with ties *
From YourTable
Order By case when sum(1) over (partition by Col1,Col2) > 1 then 1 else 2 end

How to keep track of values which are present in a group as well as in all previous group in oracle SQL?

Let's say I have a table with col1 and col2
I group by col1 and order by col1
From the first group, I want to have all values of col2 but from the second group, I want to have only those values which were present in the first group and so on with the consecutive groups.
sample table
col1 col2
1 A
1 B
1 C
1 D
2 E
2 A
2 B
2 G
3 B
3 D
And the output should be
col1 col2
1 A
1 B
1 C
1 D
2 A
2 B
3 B
You can use window functions to avoid reading the same table twice:
* Number the groups, so that they are 1, 2, 3, ... without gaps.
* Get a running count of each col2 value, in other words the cumulative number of its appearances.
* Only show rows where the group number equals the count.
The query:
select col1, col2
from
(
select
col1, col2,
dense_rank() over (order by col1) as rn,
count(*) over (partition by col2 order by col1) as cnt
from mytable
) numbered_and_counted
where rn = cnt
order by col1, col2;
Demo: https://dbfiddle.uk/?rdbms=oracle_18&fiddle=f0cc6a211a1a4c767c9e3ce9deb8c28f
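The query above can also be tried locally. This is a minimal sketch that assumes an in-memory SQLite database (via Python's sqlite3 module) instead of Oracle; the window functions used here behave the same way for this sample data. Like the original query, it assumes each (col1, col2) pair appears at most once.

```python
import sqlite3

# In-memory SQLite is an assumption; the answer above was written for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (col1 INTEGER, col2 TEXT)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (1, "D"),
     (2, "E"), (2, "A"), (2, "B"), (2, "G"),
     (3, "B"), (3, "D")],
)

# rn  : gapless group number (1, 2, 3, ...)
# cnt : how many groups up to and including this one contain this col2 value
# A value has appeared in every group so far exactly when rn = cnt.
kept = conn.execute(
    """
    SELECT col1, col2
    FROM (SELECT col1, col2,
                 DENSE_RANK() OVER (ORDER BY col1) AS rn,
                 COUNT(*) OVER (PARTITION BY col2 ORDER BY col1) AS cnt
          FROM mytable) numbered_and_counted
    WHERE rn = cnt
    ORDER BY col1, col2
    """
).fetchall()

for row in kept:
    print(row)
```

For example, D appears in groups 1 and 3 but not 2, so in group 3 its count is 2 while the group number is 3, and the row is dropped.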

Count records in query in groups based on column value

Let's suppose I have a very simple query in SQL
SELECT Col1,Col2 From Table1
and it gives me result:
Col1 Col2
A 5
A 7
A 2
B 1
B 1
B 4
B 0
C 4
C 1
C 2
I want to number the rows within each group defined by Col1, in the order given by Col2. If some rows in a group have equal values in Col2, they should still get different numbers, as shown in the example.
So I want to have
Col1 Col2 Nr
A 5 2
A 7 3
A 2 1
B 0 1
B 1 2
B 1 3
B 4 4
C 4 3
C 1 1
C 2 2
Any ideas how to make it?
If your database supports window functions, use ROW_NUMBER
select col1,col2,row_number() over(partition by col1 order by col2) as nr
from tablename
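If your database doesn't support window functions, use a correlated subquery (note that with this approach rows with equal col2 values get the same number):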
select col1,col2,
(select count(*)+1 from tablename t1 where t1.col1=t.col1 and t1.col2<t.col2) as nr
from tablename t
You can use the row_number window function:
SELECT col1,
col2,
ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2 ASC) AS Nr
FROM table1
ORDER BY 1, 2, 3
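The difference between the window function and the correlated subquery shows up only on tied values. Here is a minimal sketch that puts both side by side, assuming an in-memory SQLite database (via Python's sqlite3 module); the aliases nr_window and nr_subquery are made up for illustration.

```python
import sqlite3

# In-memory SQLite is an assumption; the question does not name a database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?)",
    [("A", 5), ("A", 7), ("A", 2),
     ("B", 1), ("B", 1), ("B", 4), ("B", 0),
     ("C", 4), ("C", 1), ("C", 2)],
)

# nr_window   : ROW_NUMBER, always distinct within a col1 group
# nr_subquery : 1 + number of rows in the group with a smaller col2
numbered = conn.execute(
    """
    SELECT t.col1, t.col2,
           ROW_NUMBER() OVER (PARTITION BY t.col1 ORDER BY t.col2) AS nr_window,
           (SELECT COUNT(*) + 1
            FROM table1 t1
            WHERE t1.col1 = t.col1 AND t1.col2 < t.col2) AS nr_subquery
    FROM table1 t
    ORDER BY t.col1, t.col2, nr_window
    """
).fetchall()

for row in numbered:
    print(row)
```

In group B the two rows with Col2 = 1 get 2 and 3 from ROW_NUMBER but 2 and 2 from the subquery, so only the window function gives the distinct numbering the question asks for.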

select query to fetch rows corresponding to all values in a column

Consider this example table "Table1".
Col1 Col2
A 1
B 1
A 4
A 5
A 3
A 2
D 1
B 2
C 3
B 4
I am trying to fetch those values from Col1 that correspond to all the values in Col2 (in this case 1,2,3,4,5). Here the query should return 'A', as none of the others have all of the values 1,2,3,4,5 in Col2.
Note that the values in Col2 are decided by other parameters in the query and they will always return some numeric values. Out of those values the query needs to fetch values from Col1 corresponding to all in Col2. The values in Col2 could be 11,12,1,2,3,4 for instance (meaning not necessarily in sequence).
I have tried the following select query:
select distinct Col1 from Table1 where Col1 in (1,2,3,4,5);
select distinct Col1 from Table1 where Col1 exists (select distinct Col2 from Table1);
and different variations of them. But the problem is that I need to apply an 'and' for Col2, not an 'or',
like: return a value from Col1 where Col2 'contains' all values between 1 and 5.
Appreciate any suggestion.
You could use analytic ROW_NUMBER() function.
SQL Fiddle for a setup and working demonstration.
SELECT col1
FROM
(SELECT col1,
col2,
row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
FROM your_table
WHERE col2 IN (1,2,3,4,5)
)
WHERE rn =5;
UPDATE As requested by OP, some explanation about how the query works.
The inner sub-query gives you the following resultset:
SQL> SELECT col1,
2 col2,
3 row_number() OVER(PARTITION BY col1 ORDER BY col2) rn
4 FROM t
5 WHERE col2 IN (1,2,3,4,5);
C COL2 RN
- ---------- ----------
A 1 1
A 2 2
A 3 3
A 4 4
A 5 5
B 1 1
B 2 2
B 4 3
C 3 1
D 1 1
10 rows selected.
The PARTITION BY clause groups the rows by col1, and ORDER BY sorts col2 within each group, so the sub-query numbers the rows of each col1 group in order. Because the WHERE clause keeps only the five wanted values, a group can reach row number 5 only if it contains all of them (assuming no duplicate col1/col2 pairs). So all the outer query needs to do is filter with WHERE rn = 5.
You can use listagg function, like
SELECT Col1
FROM
(select Col1,listagg(Col2,',') within group (order by Col2) Col2List from Table1
group by Col1)
WHERE Col2List = '1,2,3,4,5'
You can also use below
SELECT COL1
FROM TABLE_NAME
GROUP BY COL1
HAVING
COUNT(COL1)=5
AND
SUM(
(CASE WHEN COL2=1 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=2 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=3 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=4 THEN 1 ELSE 0
END)
+
(CASE WHEN COL2=5 THEN 1 ELSE 0
END))=5
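A more compact variant of the same idea, not taken from the answers above, is to filter on the wanted values and compare a distinct count against the size of the list. This is a minimal sketch assuming an in-memory SQLite database (via Python's sqlite3 module); the variable wanted is a hypothetical stand-in for whatever the rest of the query produces.

```python
import sqlite3

# In-memory SQLite is an assumption; the answers above target Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (Col1 TEXT, Col2 INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?)",
    [("A", 1), ("B", 1), ("A", 4), ("A", 5), ("A", 3),
     ("A", 2), ("D", 1), ("B", 2), ("C", 3), ("B", 4)],
)

# Hypothetical list of required Col2 values; it need not be a sequence.
wanted = [1, 2, 3, 4, 5]
placeholders = ", ".join("?" for _ in wanted)

# Keep only rows with a wanted value, then require that each Col1 group
# covers every one of them. COUNT(DISTINCT ...) tolerates duplicate rows.
query = f"""
    SELECT Col1
    FROM Table1
    WHERE Col2 IN ({placeholders})
    GROUP BY Col1
    HAVING COUNT(DISTINCT Col2) = ?
    ORDER BY Col1
"""
matches = conn.execute(query, wanted + [len(wanted)]).fetchall()

print(matches)
```

With the sample data only 'A' has all five values. Because the list is passed as parameters, the same query works for a non-sequential list such as 11,12,1,2,3,4.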

Group rows Keeping the Order of values

How can I group the following set of data:
DATENO COL1 COL2
1 A 1
2 B 1
3 C 1
4 C 1
5 D 1
6 C 1
7 D 1
8 D 1
9 E 1
To get something like this:
DATENO COL1 COL2
1 A 1
2 B 1
3 C 2
5 D 1
6 C 1
7 D 2
9 E 1
The sums for consecutive C and D rows are grouped while keeping the order intact. Any ideas?
Updated: answer corrected according to comments.
Rows can be grouped as required in this way:
-- leave only first rows of each group and substitute col2 with a sum.
select
dateno,
col1,
group_sum as col2
from (
-- Get sum of col2 for each bucket
select
dateno,
col1,
is_start,
sum(col2) over (partition by bucket_number) group_sum
from (
-- divide rows into buckets based on previous col1 change count
select
dateno, col1, col2, is_start,
sum(is_start) over(order by dateno rows unbounded preceding) bucket_number
from (
-- mark rows with change of col1 value as start of new sequence
select
dateno, col1, col2,
decode (nvl(prev_col1, col1||'X'), col1, 0, 1) is_start
from (
-- determine for each row value of col1 in previous row.
select
dateno,
col1,
col2,
lag(col1) over (order by dateno) prev_col1
from t
)
)
)
)
where is_start = 1
order by dateno
Example at SQLFiddle
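The same mark, bucket and sum steps can be sketched more compactly with common table expressions. This is an adaptation rather than the answer's exact query: it assumes an in-memory SQLite database (via Python's sqlite3 module), replaces Oracle's decode/nvl with a CASE expression, and uses GROUP BY on the bucket instead of a windowed sum plus the is_start filter. The names marked, bucketed, bucket and first_dateno are made up for illustration.

```python
import sqlite3

# In-memory SQLite is an assumption; the answer above was written for Oracle.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (dateno INTEGER, col1 TEXT, col2 INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [(1, "A", 1), (2, "B", 1), (3, "C", 1), (4, "C", 1), (5, "D", 1),
     (6, "C", 1), (7, "D", 1), (8, "D", 1), (9, "E", 1)],
)

grouped = conn.execute(
    """
    WITH marked AS (
        -- 1 when col1 differs from the previous row (or there is none), else 0
        SELECT dateno, col1, col2,
               CASE WHEN col1 = LAG(col1) OVER (ORDER BY dateno)
                    THEN 0 ELSE 1 END AS is_start
        FROM t
    ),
    bucketed AS (
        -- running total of starts gives each consecutive run its own number
        SELECT dateno, col1, col2,
               SUM(is_start) OVER (
                   ORDER BY dateno
                   ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
               ) AS bucket
        FROM marked
    )
    SELECT MIN(dateno) AS first_dateno, col1, SUM(col2) AS col2
    FROM bucketed
    GROUP BY bucket, col1
    ORDER BY first_dateno
    """
).fetchall()

for row in grouped:
    print(row)
```

Each run of equal col1 values collapses to one row carrying the first dateno of the run and the sum of col2, so the second run of C (dateno 6) stays separate from the first.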