row_number or rank similar data - sql

I have a table like this:
col1 col2
A A
A A
A F
B B
B B
B H
C L
A A
A A
A A
A E
C C
C C
C C
C C
C C
C J
And I want result like this:
col1 count
A 3
B 3
C 1
A 4
C 6
If col1 <> col2, the count should reset. I only want plain SQL, not PL/SQL etc.
Maybe something like row_number() over (RESET WHEN col1 <> col2).
Please help me.
OK friends, thank you. Sorry for my bad English.
In fact my table like this :
id col1 col2
1000 A A
2000 A A
3000 A F
4000 B B
5000 B B
6000 B H
7000 C L
8000 A A
9000 A A
10000 A A
11000 A E
12000 C C
13000 C C
14000 C C
15000 C C
16000 C C
17000 C J
The id column is unique and its values are always in order. Maybe this will help to solve the problem. Sorry for the missing information. I want a solution like the one above.
I only want col1 and count. col1 should not be made unique; the count must run 1, 2, 3, ... until col1 <> col2.
After that row the count must be reset.

First, I'd like to note that without an ORDER BY clause you cannot guarantee the order of the results. For this sort of calculation, it would be useful to have an identity (auto-increment) field to establish an order.
That said, you can attempt to use ROW_NUMBER() to create a field to order on.
with yourtablewithrn as (
select col1, col2, row_number() over (order by (select null)) rn
from yourtable
),
yourtablegrouped as (
select *,
rn - row_number() over (partition by col1 order by rn) as grp
from yourtablewithrn
)
select col1,
count(col2) AS cnt
from yourtablegrouped
group by col1, grp
order by min(rn)

As mentioned above, I agree with sgeddes that we need some kind of order that we can rely on for this kind of problem. row_number() over () won't do, since it is more or less a random number:
create table yourtable
( n int
, col1 varchar(1)
, col2 varchar(1));
insert into yourtable values
(1,'A','A'),
(2,'A','A'),
(3,'A','F'),
(4,'B','B'),
(5,'B','B'),
(6,'B','H'),
(7,'C','L'),
(8,'A','A'),
(9,'A','A'),
(10,'A','A'),
(11,'A','E'),
(12,'C','C'),
(13,'C','C'),
(14,'C','C'),
(15,'C','C'),
(16,'C','C'),
(17,'C','J');
For this sample data col2 has no impact. We could do (a slight variation of sgeddes's solution):
select col1, count(1)
from (
select n, col1
, col2
, row_number() over (order by n)
- row_number() over (partition by col1
order by n) as grp
from yourtable
) t
group by col1, grp
order by min(n)
But, what should the result be with a sample like below?
delete from yourtable;
insert into yourtable values
(1,'A','A'),
(2,'A','A'),
(3,'A','F'),
(4,'A','A'),
(5,'A','G');
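One possible reading of the requirement in the question (reset the count after every row where col1 <> col2) can be sketched as follows. This is only a sketch under assumptions: it runs the SQL through Python's sqlite3 module (SQLite 3.25 or later is needed for window functions), the helper name reset_counts and the generated id values are made up for illustration, and the grouping key is simply the number of "breaker" rows (col1 <> col2) seen before the current row.

```python
import sqlite3

# grp = how many rows with col1 <> col2 occur strictly before the current row
# (ordered by id). Rows sharing a grp value belong to the same run.
QUERY = """
SELECT col1, COUNT(*) AS cnt
FROM (
    SELECT id, col1,
           COALESCE(SUM(CASE WHEN col1 <> col2 THEN 1 ELSE 0 END)
                    OVER (ORDER BY id
                          ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING), 0) AS grp
    FROM yourtable
) t
GROUP BY grp, col1
ORDER BY MIN(id)
"""


def reset_counts(pairs):
    """Hypothetical helper: load (col1, col2) pairs in order and run QUERY."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE yourtable (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT)"
    )
    # ids 1000, 2000, ... are generated here only to give the rows an order
    conn.executemany(
        "INSERT INTO yourtable VALUES (?, ?, ?)",
        [(i * 1000, c1, c2) for i, (c1, c2) in enumerate(pairs, start=1)],
    )
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


# The 17 rows from the question, written as two-letter col1/col2 pairs
question_sample = [
    tuple(p) for p in "AA AA AF BB BB BH CL AA AA AA AE CC CC CC CC CC CJ".split()
]
# The five-row sample from the answer above
ambiguous_sample = [tuple(p) for p in "AA AA AF AA AG".split()]

print(reset_counts(question_sample))
print(reset_counts(ambiguous_sample))
```

With the question's 17 rows this reading gives A 3, B 3, C 1, A 4, C 6. For the five-row sample it gives A 3, A 2, whereas the row_number difference on col1 alone would return a single group A 5, which is why the intended result for that sample matters.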

Related

Add rownum from specific number - Oracle SQL

I have a table:
table1
col1 col2
1 a
1 b
1 c
I want to add rownum but from a specific number, for ex. starting from 100, so it would look like:
col1 col2 rn
1 a 100
1 b 101
1 c 102
I know how to add rownum like below:
select a.*, rownum as rn from table1 a;
But I don't know how to add from a specific number. How to do it in Oracle SQL?
The ANSI SQL way of doing this would be to use ROW_NUMBER:
SELECT col1, col2, 99 + ROW_NUMBER() OVER (ORDER BY col2) rn
FROM table1;
You might also be able to use Oracle's ROWNUM pseudocolumn here, but ROWNUM is assigned before ORDER BY is applied, so the ordering has to be done in a subquery first:
SELECT col1, col2, 99 + ROWNUM AS rn
FROM (SELECT col1, col2
FROM table1
ORDER BY col2);
I don't think anything more elaborate is needed to get this kind of row number; you can simply add an offset to rownum, for example:
select a.*, 99+rownum as rn from table1 a;
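The ROW_NUMBER form from the first answer is not Oracle-specific, so it can be tried out in other engines. A minimal sketch, assuming SQLite 3.25 or later driven from Python's sqlite3 module; the OFFSET name is made up for illustration, and the rows are inserted out of order on purpose to show that the numbering follows the ORDER BY inside OVER rather than insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table1 (col1 INTEGER, col2 TEXT)")
# Deliberately inserted out of order
conn.executemany("INSERT INTO table1 VALUES (?, ?)", [(1, "c"), (1, "a"), (1, "b")])

OFFSET = 99  # hypothetical name: desired first number minus one

numbered = conn.execute(
    "SELECT col1, col2, ? + ROW_NUMBER() OVER (ORDER BY col2) AS rn "
    "FROM table1 ORDER BY rn",
    (OFFSET,),
).fetchall()
conn.close()

for row in numbered:
    print(row)
```

The rows come back as a 100, b 101, c 102 regardless of the order in which they were stored.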

Add key to unique values in the SQL database

My SQL data looks like this:
Col1
A
A
A
B
B
C
D
I want to add a key to only unique values. So the end result will look like this:
Col1 Col2
A 1
A 1
A 1
B 2
B 2
C 3
D 4
How can I do this?
You can do this with the dense_rank() window function:
select col1, dense_rank() over (order by col1) as col2
from t;
This solves the problem as a query. If you want to actually change the table, then the code is more like:
alter table t add col2 int;
with toupdate as (
select t.*, dense_rank() over (order by col1) as newcol2
from t
)
update toupdate
set col2 = newcol2;
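The query form of the dense_rank() answer can be checked quickly against the sample data. A small sketch, assuming SQLite 3.25 or later via Python's sqlite3 module (the answer's UPDATE through a CTE is not reproduced here, only the SELECT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 TEXT)")
# Sample values from the question: A, A, A, B, B, C, D
conn.executemany("INSERT INTO t VALUES (?)", [(v,) for v in "AAABBCD"])

keyed = conn.execute(
    "SELECT col1, DENSE_RANK() OVER (ORDER BY col1) AS col2 "
    "FROM t ORDER BY col1"
).fetchall()
conn.close()

for row in keyed:
    print(row)
```

Each distinct value receives its own consecutive key, so A maps to 1, B to 2, C to 3 and D to 4.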

Find duplicate symmetric rows in a table

I have a table which contains data as
col1 col2
a b
b a
c d
d c
a d
a c
For me row 1 and row 2 are duplicate because a, b & b, a are the same. The same stands for row 3 and row 4.
I need an SQL (not PL/SQL) query which gives output as
col1 col2
a b
c d
a d
a c
select distinct least(col1, col2), greatest(col1, col2)
from your_table
Edit: for those using a DBMS that does not support the standard SQL functions least and greatest, this can be simulated using a CASE expression:
select distinct
case
when col1 < col2 then col1
else col2
end as least_col,
case
when col1 > col2 then col1
else col2
end as greatest_col
from your_table
Try this:
CREATE TABLE t_1(col1 varchar(10),col2 varchar(10))
INSERT INTO t_1
VALUES ('a','b'),
('b','a'),
('c','d'),
('d','c'),
('a','d'),
('a','c')
;with CTE as (select ROW_NUMBER() over (order by (select 0)) as id,col1,col2,col1+col2 as col3 from t_1)
,CTE1 as (
select id,col1,col2,col3 from CTE where id=1
union all
select c.id,c.col1,c.col2,CASE when c.col3=REVERSE(c1.col3) then null else c.col3 end from CTE c inner join CTE1 c1
on c.id-1=c1.id
)
select col1,col2 from CTE1 where col3 is not null
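The CASE-based normalization from the first answer does not depend on where the mirrored rows sit in the table, whereas the recursive CTE above appears to compare each row only with the one directly before it. A sketch to illustrate, assuming SQLite through Python's sqlite3 module; the rows are the question's data, reordered here on purpose so that mirrored pairs are not adjacent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE your_table (col1 TEXT, col2 TEXT)")
# Same six rows as the question, shuffled so (a,b)/(b,a) and (c,d)/(d,c)
# are no longer next to each other
conn.executemany(
    "INSERT INTO your_table VALUES (?, ?)",
    [("a", "b"), ("c", "d"), ("a", "d"), ("b", "a"), ("a", "c"), ("d", "c")],
)

# Put the smaller value first in every pair, then let DISTINCT drop repeats
unique_pairs = conn.execute(
    """
    SELECT DISTINCT
           CASE WHEN col1 < col2 THEN col1 ELSE col2 END AS least_col,
           CASE WHEN col1 > col2 THEN col1 ELSE col2 END AS greatest_col
    FROM your_table
    ORDER BY least_col, greatest_col
    """
).fetchall()
conn.close()

for pair in unique_pairs:
    print(pair)
```

Four pairs remain (a b, a c, a d, c d), the same set as the expected output in the question, although sorted rather than in the original row order.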

sort items based on their appearance count

I have data like this
d b c
a d
c b
a b
c a
c a d
c
If you analyse it, you will find that each element appears the following number of times:
a: 4
b: 3
c: 5
d: 3
According to appearance my sorted elements would be
c,a,b,d
and final output should be
c b d
a d
c b
a b
c a
c a d
c
Any clue how we can achieve this using a SQL query?
Unless there is another column which dictates the order of the input rows, it will not be possible to guarantee that the output rows are returned in the same order. I've made an assumption here to order them by the three column values so that the result is deterministic.
It's likely to be possible to compact this code into fewer steps, but this version shows the steps reasonably clearly.
Note that for a large dataset, it may be more efficient to partition some of these steps into SELECT INTO operations creating temporary tables or work tables.
CREATE TABLE #t
(col1 CHAR(1)
,col2 CHAR(1)
,col3 CHAR(1)
)
INSERT #t
SELECT 'd','b','c'
UNION SELECT 'a','d',NULL
UNION SELECT 'c','b',NULL
UNION SELECT 'a','b',NULL
UNION SELECT 'c','a',NULL
UNION SELECT 'c','a','d'
UNION SELECT 'c',NULL,NULL
;WITH freqCTE
AS
(
SELECT col1 FROM #t WHERE col1 IS NOT NULL
UNION ALL
SELECT col2 FROM #t WHERE col2 IS NOT NULL
UNION ALL
SELECT col3 FROM #t WHERE col3 IS NOT NULL
)
,grpCTE
AS
(
SELECT col1 AS val
,COUNT(1) AS cnt
FROM freqCTE
GROUP BY col1
)
,rowNCTE
AS
(
SELECT *
,ROW_NUMBER() OVER (ORDER BY col1
,col2
,col3
) AS rowN
FROM #t
)
,buildCTE
AS
(
SELECT rowN
,val
,cnt
,ROW_NUMBER() OVER (PARTITION BY rowN
ORDER BY ISNULL(cnt,-1) DESC
,ISNULL(val,'z')
) AS colOrd
FROM (
SELECT *
FROM rowNCTE AS t
JOIN grpCTE AS g1
ON g1.val = t.col1
UNION ALL
SELECT *
FROM rowNCTE AS t
LEFT JOIN grpCTE AS g2
ON g2.val = t.col2
UNION ALL
SELECT *
FROM rowNCTE AS t
LEFT JOIN grpCTE AS g3
ON g3.val = t.col3
) AS x
)
SELECT b1.val AS col1
,b2.val AS col2
,b3.val AS col3
FROM buildCTE AS b1
JOIN buildCTE AS b2
ON b2.rowN = b1.rowN
AND b2.colOrd = 2
JOIN buildCTE AS b3
ON b3.rowN = b1.rowN
AND b3.colOrd = 3
WHERE b1.colOrd = 1
ORDER BY b1.rowN
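The logic of the query (count every value across all columns, then reorder each row by descending count) can be sketched outside SQL to check the expected output. This is plain Python rather than the answer's T-SQL, and the alphabetical tie-break is an assumption that mirrors the secondary sort key in the query above. Counting the sample data gives d three appearances, tied with b, so the tie-break is what places b before d.

```python
from collections import Counter

# Rows from the question; missing columns are simply left out
rows = [
    ["d", "b", "c"],
    ["a", "d"],
    ["c", "b"],
    ["a", "b"],
    ["c", "a"],
    ["c", "a", "d"],
    ["c"],
]

# Step 1: how often does each value appear anywhere in the data?
freq = Counter(item for row in rows for item in row)

# Step 2: within each row, most frequent value first; ties broken alphabetically
sorted_rows = [sorted(row, key=lambda item: (-freq[item], item)) for row in rows]

for row in sorted_rows:
    print(" ".join(row))
```

The rows themselves stay in their input order here; only the values inside each row are rearranged.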

In SQL in a "group by" expression: how to get the string that occurs most often in a group?

Assume we have the following table:
Id A B
1 10 ABC
2 10 ABC
3 10 FFF
4 20 HHH
As result of a "group by A" expression I want to have the value of the B-Column that occurs most often:
select A, mostoften(B) from table group by A;
A mostoften(B)
10 ABC
20 HHH
How do I achieve this in Oracle 10g?
Remark: in the case of a tie (when more than one value occurs most often) it does not matter which value is selected.
select A, B
from (
select A, B, ROW_NUMBER() OVER (PARTITION BY A ORDER BY C_B DESC) as rn
from (
select A, COUNT (B) as C_B, B
from table
group by A, B
) count_table
) order_table
where rn = 1;
You want the Bs with the MAX of COUNT group by A, B.
Old school solution, it took me some time and some cursing :)
select a,b
from ta ta1
group by a,b
having count(*) = (select max(count(*))
from ta ta2
where ta1.a = ta2.a
group by b)
This problem can be clarified by creating a view for the count in each A & B group:
CREATE VIEW MyTableCounts AS
SELECT A, B, COUNT(*) C
FROM MyTable
GROUP BY A, B;
Now we can do a query that finds the row c1 where the count is greatest. That is, no other row that has the same A has a greater count. Therefore if we try to find a row c2 with a greater count, no match is found.
SELECT c1.A, c1.B
FROM MyTableCounts c1
LEFT OUTER JOIN MyTableCounts c2
ON (c1.A = c2.A AND (c1.C < c2.C OR (c1.C = c2.C AND c1.B < c2.B)))
WHERE c2.A IS NULL
ORDER BY c1.A;
To resolve tied counts (c1.C = c2.C), we use the value of B which we know is unique within a given group of A.
Try this (works on SQL Server 2005):
create table #yourtable (rowid int, a int, b char(3))
insert into #yourtable values (1,10,'ABC')
insert into #yourtable values (2,10,'ABC')
insert into #yourtable values (3,10,'FFF')
insert into #yourtable values (4,20,'HHH')
;WITH YourTableCTE AS
(
SELECT
*, ROW_NUMBER() OVER(partition by A ORDER BY A ASC,CountOfB DESC) AS RowRank
FROM (SELECT
A, B, COUNT(B) AS CountOfB
FROM #yourtable
GROUP BY A,B
) dt
)
SELECT
A,B
FROM YourTableCTE
WHERE RowRank=1
EDIT without CTE...
SELECT
A,B
FROM (SELECT
*, ROW_NUMBER() OVER(partition by A ORDER BY A ASC,CountOfB DESC) AS RowRank
FROM (SELECT
A, B, COUNT(B) AS CountOfB
FROM #yourtable
GROUP BY A,B
) dt
) dt2
WHERE RowRank=1
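The count-then-rank pattern shared by several of the answers above can be tried on the sample data as follows. A minimal sketch, assuming SQLite 3.25 or later through Python's sqlite3 module rather than Oracle 10g or SQL Server; the table name my_table is a placeholder, and the extra ORDER BY on b is an added assumption to make ties deterministic (the question says any tied value is acceptable).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (id INTEGER, a INTEGER, b TEXT)")
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?)",
    [(1, 10, "ABC"), (2, 10, "ABC"), (3, 10, "FFF"), (4, 20, "HHH")],
)

most_often = conn.execute(
    """
    SELECT a, b
    FROM (
        SELECT a, b,
               -- rank the (a, b) pairs inside each a by how often they occur
               ROW_NUMBER() OVER (PARTITION BY a ORDER BY cnt DESC, b) AS rn
        FROM (
            SELECT a, b, COUNT(*) AS cnt
            FROM my_table
            GROUP BY a, b
        ) counted
    ) ranked
    WHERE rn = 1
    ORDER BY a
    """
).fetchall()
conn.close()

for row in most_often:
    print(row)
```

For the sample table this yields 10 ABC and 20 HHH, matching the result asked for in the question.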