Remove duplicate values from string in oracle - sql

I have a requirement where the input data looks like this:
Col1 COl2 Col3
A1 2 B
A1 1 A
A1 3 B
B1 1 A
B2 2 B
B2 3 C
B4 4 C
B5 5 A
B6 6 B
Output Required:
Col1 COl2 Col3
A1 2 AB
A1 1 AB
A1 3 AB
B1 1 ABC
B2 2 ABC
B2 3 ABC
B4 4 ABC
B5 5 ABC
B6 6 ABC
Solution Tried:
select col1,col2,listagg(col3,'') within group (order by col3) over(partition by col1)
from tab
Output of the query:
Col1 COl2 Col3
A1 2 ABB
A1 1 ABB
A1 3 ABB
B1 1 AABBCC
B2 2 AABBCC
B2 3 AABBCC
B4 4 AABBCC
B5 5 AABBCC
B6 6 AABBCC
Can someone help me remove the repeated letters?
Thanks

You can use a subquery:
select col1, col2,
listagg(case when seqnum = 1 then col3 end, '') within group (order by col3) over (partition by col1)
from (select t.*,
row_number() over (partition by col1, col3 order by col3) as seqnum
from tab t
) t
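A minimal sketch of the same idea that can be run locally. It assumes SQLite (via Python's sqlite3 module) as a stand-in for Oracle, so group_concat replaces listagg, and the table contents are made up for illustration. Only the first row of each col1/col3 pair contributes its letter; the other rows contribute NULL, which the aggregate skips.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for Oracle (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("create table tab (col1 text, col2 integer, col3 text)")
conn.executemany(
    "insert into tab values (?, ?, ?)",
    [("A", 1, "A"), ("A", 2, "B"), ("A", 3, "B"),
     ("B", 1, "A"), ("B", 2, "C"), ("B", 3, "C")],  # hypothetical rows
)

query = """
select col1, col2,
       -- group_concat plays the role of listagg here; NULLs are ignored
       group_concat(case when seqnum = 1 then col3 end, '')
           over (partition by col1 order by col3
                 rows between unbounded preceding and unbounded following) as col3_list
from (select t.*,
             row_number() over (partition by col1, col3 order by col3) as seqnum
      from tab t) t
order by col1, col2
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every row keeps its own col1 and col2, and the third column holds each distinct col3 value of that col1 exactly once.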

Related

How to group specific SQL columns and retrieve rows with highest counts for those columns? [duplicate]

I have the following data:
col_1 | col_2 | col_3 | col_4
-----------------------------
a1 b1 c1 d1
a1 b2 c1 d1
a1 b3 c1 d1
a1 b4 c1 d2
a1 b5 c2 d2
a1 b6 c2 d2
a1 b7 c1 d3
a1 b8 c2 d3
a1 b9 c3 d3
a1 b10 c1 d2
a1 b11 c2 d3
a2 b12 c1 d1
a3 b13 c1 d1
I am interested in being able to:
Return rows where the value for col_1 is unique
For each row in the result, return the values for the columns that have the highest counts when grouping by col_3 and col_4
For example, I would like the output to return the following:
col_1 | col_2 | col_3 | col_4
-----------------------------
a1 b1 c1 d1
a2 b12 c1 d1
a3 b13 c1 d1
Notice in the result that each value in col_1 is unique. Also note that for a1, it returned with c1 and d1 as they had the highest group by counts for a1.
How can I achieve this by SQL query? I will be using it for a Hive SQL query.
With row_number() window function:
select t.col_1, t.col_2, t.col_3, t.col_4
from (
select col_1, min(col_2) col_2, col_3, col_4,
row_number() over (partition by col_1 order by count(*) desc) rn
from tablename
group by col_1, col_3, col_4
) t
where t.rn = 1
See the demo.
Results:
| col_1 | col_2 | col_3 | col_4 |
| ----- | ----- | ----- | ----- |
| a1 | b1 | c1 | d1 |
| a2 | b12 | c1 | d1 |
| a3 | b13 | c1 | d1 |
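A variant of the query above, as a sketch that can be run locally. It assumes SQLite (through Python's sqlite3 module) instead of Hive, and it adds col_3 and col_4 as tie-breakers in the ordering, which is my addition so that row_number() picks the same group every time when two groups have the same count. The aggregation is moved into its own subquery only to keep the window step simple.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stand-in for Hive (assumption)
conn.execute(
    "create table tablename (col_1 text, col_2 text, col_3 text, col_4 text)"
)
conn.executemany("insert into tablename values (?, ?, ?, ?)", [
    ("a1", "b1", "c1", "d1"), ("a1", "b2", "c1", "d1"), ("a1", "b3", "c1", "d1"),
    ("a1", "b4", "c1", "d2"), ("a1", "b5", "c2", "d2"), ("a1", "b6", "c2", "d2"),
    ("a1", "b7", "c1", "d3"), ("a1", "b8", "c2", "d3"), ("a1", "b9", "c3", "d3"),
    ("a1", "b10", "c1", "d2"), ("a1", "b11", "c2", "d3"),
    ("a2", "b12", "c1", "d1"), ("a3", "b13", "c1", "d1"),
])

top_groups = conn.execute("""
select col_1, col_2, col_3, col_4
from (
    select g.*,
           -- highest count first; col_3, col_4 break ties deterministically
           row_number() over (partition by col_1
                              order by cnt desc, col_3, col_4) as rn
    from (select col_1, min(col_2) as col_2, col_3, col_4, count(*) as cnt
          from tablename
          group by col_1, col_3, col_4) g
) t
where rn = 1
order by col_1
""").fetchall()

for row in top_groups:
    print(row)
```

Note that min(col_2) compares strings, so it would pick b10 ahead of b2 if both were in the winning group.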
You can use aggregation and window functions:
select col_1, col_2, col_3, col_4
from (
select
col_1,
min(col_2) as col_2,
col_3,
col_4,
rank() over(partition by col_1 order by count(*) desc) rn
from mytable t
-- col_2 is left out of the group by; grouping on it would make every count 1
group by col_1, col_3, col_4
) t
where rn = 1
You can use window functions if you want the complete rows:
select t.*
from (select t.*,
rank() over (partition by col_1 order by cnt desc) as seqnum
from (select t.*, count(*) over (partition by col_1, col_3, col_4) as cnt
from t
) t
) t
where seqnum = 1;
The innermost subquery counts the number of rows for each col_1/col_3/col_4 combination. The middle subquery ranks the rows by that count within each col_1. The outermost query keeps only the rows with the highest count.

Filter data based on result set of group and count

I have the following table
Col1 Col2 Col3
A1 B1 C1
A1 B1 C2
A1 B2 C1
A1 B2 C2
A1 B2 C3
A2 B1 C1
A2 B1 C2
A2 B2 C1
A2 B2 C2
From this table I want the distinct Col1 values whose Col1/Col2 combinations do not all have the same row count. The only possible answer in the table above is A1.
The following query gives me the count of each col1 and col2.
select col1, col2, count(*) from table
group by col1, col2;
Col1 Col2 Count
A1 B1 2
A1 B2 3
A2 B1 2
A2 B2 2
From the above query I can see that A1 has two records with a different count. How do I return A1 in a single query?
You can use another level of aggregation:
select col1
from (select col1, col2, count(*) as cnt
from table
group by col1, col2
) t
group by col1
having min(cnt) <> max(cnt);
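To check the two-level aggregation against the sample data, here is a small sketch. It assumes SQLite through Python's sqlite3 module, and the table is named mytable (a made-up name, since table is a reserved word).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table mytable (col1 text, col2 text, col3 text)")
conn.executemany("insert into mytable values (?, ?, ?)", [
    ("A1", "B1", "C1"), ("A1", "B1", "C2"),
    ("A1", "B2", "C1"), ("A1", "B2", "C2"), ("A1", "B2", "C3"),
    ("A2", "B1", "C1"), ("A2", "B1", "C2"),
    ("A2", "B2", "C1"), ("A2", "B2", "C2"),
])

uneven = conn.execute("""
select col1
from (select col1, col2, count(*) as cnt   -- one row per col1/col2 pair
      from mytable
      group by col1, col2) t
group by col1
having min(cnt) <> max(cnt)                -- counts differ within this col1
""").fetchall()

print(uneven)
```

A1 has counts 2 and 3, so min and max differ; A2 has 2 and 2, so it is filtered out.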

SQL Query to achieve the sequence in the below format

I am trying to find a solution that produces the result in the format below using SQL.
I have two columns:
col1 col2
1 e
1 e
1 e
2 e2
2 e2
2 e2
3 e3
3 e3
4 e4
4 e4
4 e4
4 e4
4 e4
4 e4
6 e6
6 e6
6 e6
Here col1 holds a sequence number and col2 holds the event. The numbers in col1 start at 1 and increase per batch of events: the first batch gets 1, the next gets 2, and so on.
I am trying to renumber col1 so that it restarts at 1 within each batch, as shown below:
col1 col2
1 e
2 e
3 e
1 e2
2 e2
3 e2
1 e3
2 e3
1 e4
2 e4
3 e4
4 e4
5 e4
6 e4
1 e6
2 e6
3 e6
Maybe you want this:
SELECT
ROW_NUMBER() OVER (PARTITION BY col2 ORDER BY col2) col1,
col2
FROM
table_name
ORDER BY
col2;
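A runnable sketch of the ROW_NUMBER() approach, assuming SQLite through Python's sqlite3 module. The alias new_col1 and the extra sort key in the final ORDER BY are my additions so the printed order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table table_name (col1 integer, col2 text)")

# Sample batches from the question: (old sequence number, event, row count)
batches = [(1, "e", 3), (2, "e2", 3), (3, "e3", 2), (4, "e4", 6), (6, "e6", 3)]
for seq, event, n in batches:
    conn.executemany("insert into table_name values (?, ?)", [(seq, event)] * n)

renumbered = conn.execute("""
select row_number() over (partition by col2 order by col2) as new_col1,
       col2
from table_name
order by col2, new_col1
""").fetchall()

for row in renumbered:
    print(row)
```

The numbering restarts at 1 for every distinct col2 value. If the rows within a batch had a meaningful order (for example a timestamp column), that column would belong in the ORDER BY inside OVER.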
Try this:
SELECT col1, col2 FROM YourTable ORDER BY col2, col1
The ORDER BY clause returns the rows in that order.
Check col2 in your procedure and use logic like this:
IF col2='e' Then
Begin
RESET SEQUENCE
SEQUENCE
end
else IF col2='p' Then
Begin
RESET SEQUENCE
SEQUENCE
end

SQL special select from table

I have a table with about 20 columns and 2000 rows.
Example:
Col1 Col2 Col3 Col4 ...
A01 22 AB 11
A01 22 AX 112
A01 23 A5 11
A02 20 AB AA
A04 21 AB 11
A04 21 AU 11
A04 29 AB BA
A05 21 AB 11
AAA 111 XX 18
AAA 222 GT 1O
...
I need a select which displays all rows and all columns that satisfy the requirement of two columns (Col1 and Col2) based on the following:
if Col1 is unique, show the row,
or
if Col1 is not unique, show only the rows where both Col1 and Col2 are the same.
For the table above, the result of the select should be:
Col1 Col2 Col3 Col4 ...
A01 22 AB 11
A01 22 AX 112
A02 20 AB AA
A04 21 AB 11
A04 21 AU 11
A05 21 AB 11
AAA 111 XX 18
...
The new table (your solution) contains data:
Col1 Col2 Col3 Col4 ...
A01 22 AB 11
A01 22 AX 112
A02 20 AB AA
A04 21 AB 11
A04 21 AU 11
A05 21 AB 11
AAA 111 XX 18
...
What I want to see from this is:
Col1 Col2 Col3 Col4 ...
A01 2 AB 11
A02 1 AB AA
A04 2 AB 11
A05 1 AB 11
AAA 1 XX 18
...
In Oracle and MS SQL I would use analytical functions:
select * from
(
select
t.* ,
count(Col1) over (partition by Col1) as count_col1,
count(Col2) over (partition by Col1, Col2) as count_col2
from yourTable t
) t
where count_col1 = 1 or count_col2 > 1;
See this fiddle (Oracle) and this fiddle (MSSQL) as proof.
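For anyone who wants to try the analytic-count query without Oracle or SQL Server, here is a sketch that assumes SQLite through Python's sqlite3 module and uses the first three columns of the sample rows. Note that with this predicate a Col1 whose Col2 values are all different (AAA in the sample) returns no rows at all.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table yourTable (Col1 text, Col2 text, Col3 text)")
conn.executemany("insert into yourTable values (?, ?, ?)", [
    ("A01", "22", "AB"), ("A01", "22", "AX"), ("A01", "23", "A5"),
    ("A02", "20", "AB"),
    ("A04", "21", "AB"), ("A04", "21", "AU"), ("A04", "29", "AB"),
    ("A05", "21", "AB"),
    ("AAA", "111", "XX"), ("AAA", "222", "GT"),
])

kept = conn.execute("""
select Col1, Col2, Col3
from (
    select t.*,
           count(Col1) over (partition by Col1)       as count_col1,
           count(Col2) over (partition by Col1, Col2) as count_col2
    from yourTable t
) t
where count_col1 = 1      -- Col1 occurs once
   or count_col2 > 1      -- or the Col1/Col2 pair is repeated
order by Col1, Col2, Col3
""").fetchall()

for row in kept:
    print(row)
```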
select *
from table t1
join (select col1
from table
group by col1
having avg(col2)=max(col2)) t2
on t1.col1=t2.col1
I hadn't looked at your example closely, and your request is slightly different from the example. The query above checks that, for a given col1, all col2 values are the same, so it does not display the matching rows of a col1 that also has other col2 values.
In this case the answer will be
select *
from table1 t1
join (select col1,col2
from table1
group by col1,col2
having count(*)>1
union
select col1,cast(null as varchar)
from table1 group by col1
having count(*)=1) t2
on t1.col1=t2.col1 and t1.col2=isnull(t2.col2,t1.col2)
This is the updated query, and the fiddle for it http://sqlfiddle.com/#!3/e944b/2/0
Ok .. updated one more time:
select *
from table1 t1
join (select col1,col2
from table1
group by col1,col2
having count(*)>1
union
select col1,min(col2)
from table1 group by col1
having count(*)=1 or count(*)=count(distinct col2)) t2
on t1.col1=t2.col1 and t1.col2=t2.col2
and with fiddle http://sqlfiddle.com/#!3/d5437/12/0
This should be enough for the second problem:
select t3.*
from (select distinct col1 from table1)t1
cross apply (select top 1 * from table1 t2 where t1.col1=t2.col1) t3
and the fiddle: http://sqlfiddle.com/#!3/e944b/4/0

SQL Selecting all records with composite key

Here is the table structure; the first 6 columns form the composite key.
col1 col2 col3 col4 col5 col6 col7 col8
A1 A2 A3 A4 A5 1 xx yy
A1 A2 A3 A4 A5 2 xxx yyy
A1 A2 A3 A4 A5 3 a b
A1 B2 A3 A4 A5 4 aa bb
B1 A2 A3 A4 A5 5 aaa bbb
B1 B2 B3 B4 B5 6 d e
B1 B2 B3 B4 B5 7 dd ee
B1 B3 C3 B4 B5 8 ddd eee
I need a stored procedure which returns the values like below
A1 A2 A3 A4 A5 xx yy xxx yyy a b
A1 B2 A3 A4 A5 aa bb
B1 A2 A3 A4 A5 aaa bbb
B1 B2 B3 B4 B5 d e dd ee
B1 B3 C3 B4 B5 ddd eee
Any pointers or help is appreciated.
If you just need a result like this:
A1 A2 A3 A4 A5 xx,yy,xxx,yyy,a,b
A1 B2 A3 A4 A5 aa,bb
B1 A2 A3 A4 A5 aaa,bbb
B1 B2 B3 B4 B5 d,e,dd,ee
B1 B3 C3 B4 B5 ddd,eee
Then a simple query will do:
select COL1
,COL2
,COL3
,COL4
,COL5
,LISTAGG(COL7 || ',' || COL8, ',') within group (order by COL6)
from TAB1
group by COL1, COL2, COL3, COL4, COL5
order by COL1, COL2, COL3, COL4, COL5
To get a dynamic number of columns you need to build dynamic SQL. See this article for a guide on that.
First, a standard query result set cannot have a variable number of columns. You can, however, return one column of type xml (introduced in SQL Server 2005) http://msdn.microsoft.com/en-us/library/ms345117%28v=sql.90%29.aspx and bind to that; the binding will have to be dynamic. The pivot function http://msdn.microsoft.com/en-us/library/ms177410.aspx may help in presenting the data in the format you want, or, as stated above, you can use dynamic SQL. If you do, generate the dynamic SQL inside a stored procedure and execute the generated string with sp_executesql http://msdn.microsoft.com/en-us/library/ms175170.aspx. You will probably need to pass parameters http://support.microsoft.com/kb/262499 to guard against injection attacks. That is quite a few steps to learn, but none of them is particularly difficult and it is a great exercise!
Check this; I don't know whether it helps you or not. I used SQL Server 2008, and it returns the result as per your requirements.
SELECT col1, col2, col3, col4, col5,
(SELECT ' ' + col7 + ' ' + col8 FROM TABLE1 t1
WHERE t1.col1 = t.col1 and t1.col2 = t.col2 and
t1.col3 = t.col3 and t1.col4 = t.col4 and t1.col5 = t.col5
ORDER BY col6
FOR XML PATH('')) AS mearge_col
FROM TABLE1 t
GROUP BY col1, col2, col3, col4, col5