concatenate column values from multiple rows in Oracle without duplicates - sql

I can concatenate column values from multiple rows in Oracle using LISTAGG,
but I want to avoid duplicates.
Currently it returns duplicates:
select LISTAGG( t.id,',') WITHIN GROUP (ORDER BY t.id) from table t;
For example, for the data
ID
10
10
20
30
30
40
it returns 10,10,20,30,30,40
instead of 10,20,30,40.
And I can't use DISTINCT inside LISTAGG:
select LISTAGG( distinct t.id,',') WITHIN GROUP (ORDER BY t.id) from table t;
Error
ORA-30482: DISTINCT option not allowed for this function

One option would be using regexp_replace():
select regexp_replace(
listagg( t.id,',') within group (order by t.id)
, '([^,]+)(,\1)+', '\1') as "Result"
from t
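One caveat with the pattern above: nothing forces the repeated part to end at a comma, so `,10` can also match the beginning of `,100`, and values that share a prefix may be merged incorrectly. A minimal sketch of a stricter variant follows, which requires a comma or end-of-string after the repeats and puts that delimiter back in the replacement. The inline sample data (10, 10, 100) is an assumption for illustration only.

```sql
-- Assumed sample data: ids 10, 10 and 100, built inline so the query is self-contained.
with t as (
  select 10 as id from dual union all
  select 10 from dual union all
  select 100 from dual
)
select regexp_replace(
         listagg(t.id, ',') within group (order by t.id),
         '([^,]+)(,\1)*(,|$)',  -- a value, its adjacent repeats, then a delimiter or the end
         '\1\3'                 -- keep one copy of the value plus the delimiter
       ) as "Result"
from t;
-- Result: 10,100
```

Like the original, this relies on duplicates being adjacent, which the ORDER BY inside LISTAGG ensures.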

You can put the DISTINCT in a subquery:
select LISTAGG( t.id,',') WITHIN GROUP (ORDER BY t.id) from (SELECT DISTINCT id FROM table) t
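On newer Oracle releases the workaround may not be needed: Oracle Database 19c added support for DISTINCT inside LISTAGG. A minimal sketch, assuming 19c or later and using inline sample data that is not part of the original question:

```sql
-- Assumes Oracle Database 19c or later; earlier versions raise ORA-30482 here.
with t as (
  select 10 as id from dual union all
  select 10 from dual union all
  select 20 from dual
)
select listagg(distinct t.id, ',') within group (order by t.id) as ids
from t;
-- ids: 10,20
```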

Related

SQL: histograms for multiple columns

Given the following table:
Column A  Column B
east      red
west      blue
east      green
I want to find the distinct values of each column and how many times each value is present in the table. Given the table above, the result should look like:
A values  A value counts  B values  B value counts
east      2               red       1
west      1               blue      1
                          green     1
This is achievable by running SELECT colX, count(colX) FROM Table GROUP BY colX for each column. This is not a scalable solution if there is a complex WHERE condition, since the condition has to be evaluated again for every column's query.
An alternative is to execute the complex where query once and compute the aggregations in the server code. But is there a single SQL query that can compute that?
You can use window functions:
select cola, count(*) over (partition by cola) as a_count,
colb, count(*) over (partition by colb) as b_count
from t;
This keeps every row of the table and shows, next to each cola and colb value, how many times that value occurs.
You can use subqueries to aggregate, then union all and aggregate again to combine the results:
select max(a) as a, max(a_cnt) as a_cnt, max(b) as b, max(b_cnt) as b_cnt
from ((select a, count(*) as a_cnt, null as b, null as b_cnt,
row_number() over (order by count(*) desc) as seqnum
from t
group by a
) union all
(select null, null, b, count(*),
row_number() over (order by count(*) desc) as seqnum
from t
group by b
)
) ab
group by seqnum
order by seqnum;
If you are using Oracle, you can use user_tab_cols to generate the SQL for all columns in your table:
SELECT 'SELECT '
|| Listagg(column_name
||',count(1) over (partition by '
||column_name
||') as '
||column_name
||'_cnt', ',')
within GROUP (ORDER BY column_id)
||' FROM '
||'TEST_DATA'
FROM user_tab_cols
WHERE table_name = 'TEST_DATA'
Sample output is below
SELECT ID,count(1) over (partition by ID) as ID_cnt,VALUE,count(1) over (partition by
VALUE) as VALUE_cnt FROM TEST_DATA

use distinct and order by in STRING_AGG function

I am trying to string_agg a column while at the same time ordering it and showing only unique values. Consider the following demo. Is there a syntax issue, or is this simply not possible with the method I am using?
SELECT STRING_AGG(DISTINCT foo.a::TEXT,',' ORDER BY foo.a DESC)
FROM (
SELECT 1 As a
UNION ALL
SELECT 1
UNION ALL
SELECT 1
UNION ALL
SELECT 2
) AS foo
[2019-11-22 13:29:32] [42P10] ERROR: in an aggregate with DISTINCT, ORDER BY expressions must appear in argument list
[2019-11-22 13:29:32] Position: 53
The error message is quite clear. The expression that you use in the ORDER BY clause must also appear in the aggregated part.
You could do:
SELECT STRING_AGG(DISTINCT foo.a::TEXT, ',' ORDER BY foo.a::TEXT DESC)
FROM (
SELECT 1 As a
UNION ALL SELECT 1
UNION ALL SELECT 1
UNION ALL SELECT 2
) AS foo
While this will work, the problem with this solution is that it orders the numbers as strings, which follow different ordering rules: as strings, '10' sorts before '2'.
Another option is to use arrays: first, ARRAY_AGG() can be used to aggregate the numbers (with proper, numeric ordering), then you can turn it to a comma-separated list of strings with ARRAY_TO_STRING().
SELECT ARRAY_TO_STRING(ARRAY_AGG(DISTINCT a ORDER BY a DESC), ',')
FROM (
SELECT 1 As a
UNION ALL SELECT 1
UNION ALL SELECT 1
UNION ALL SELECT 2
) AS foo
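To make the difference between the two approaches concrete, here is a small PostgreSQL sketch that runs both on the same input. The sample values (2, 10, 10) are chosen for illustration and are not from the original question.

```sql
-- Assumed sample values 2, 10, 10; both aggregates sort in descending order.
SELECT STRING_AGG(DISTINCT a::TEXT, ',' ORDER BY a::TEXT DESC)     AS sorted_as_text,
       ARRAY_TO_STRING(ARRAY_AGG(DISTINCT a ORDER BY a DESC), ',') AS sorted_as_number
FROM (VALUES (2), (10), (10)) AS foo(a);
-- sorted_as_text:   2,10   (string comparison: '2' > '10')
-- sorted_as_number: 10,2   (numeric comparison)
```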

GROUP BY on specific columns in hive

I have a Hive query with 38 columns, and only one column uses an aggregate function. But I need to group only by columns 1 and 2 instead of all of them. How can this be accomplished?
For example, what I need is:
SELECT
1
,2
,3
,4
,5
,MAX(6)
FROM
table_x
GROUP BY
1,2
To select all the columns you want while grouping only by columns 1 and 2, use analytic functions.
Use the query below:
select col1,col2.....col38,
max(col6) over(partition by col1,col2 order by col1) as max_val
from tablename
Alternatively, use the row_number() function to keep only the row with the highest value of column 6 in each group:
select * from
(
SELECT 1,2,3,4,5,6,row_number() over(partition by 1,2 order by 6 desc) as rn
FROM table_x
)A where rn=1
The query in the question does not comply with the definition of GROUP BY. When you group by some columns, every other selected column must be aggregated so that it yields a single value per group.
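As a minimal sketch of what that rule implies, every column outside the GROUP BY list gets wrapped in some aggregate. The column names col1 to col6 are hypothetical stand-ins for the placeholders 1 to 6 in the question, and MAX is only one possible choice of aggregate for the extra columns:

```sql
-- Hypothetical column names; table_x is the table name from the question.
SELECT col1,
       col2,
       MAX(col3) AS col3,      -- any aggregate that picks one value per group
       MAX(col4) AS col4,
       MAX(col5) AS col5,
       MAX(col6) AS max_col6
FROM table_x
GROUP BY col1, col2;
```

Note that aggregating each column independently can combine values from different rows of a group. If whole rows must be preserved, the row_number() approach is the better fit.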

oracle distinct with listagg

I have a table. I can show all the values of a column on a single line, separated by ','. But I can't make the list distinct. Help, please.
This is tricky. One simple suggestion is to use select distinct:
select listagg(col, ',') within group (order by col)
from (select distinct col from t) x;
However, that makes it difficult to calculate other aggregations (or to generate more than one listagg() result). Another way is to use window functions in combination with listagg():
select listagg(case when seqnum = 1 then col end, ',') within group (order by col)
from (select t.*,
row_number() over (partition by col order by col) as seqnum
from t
) t
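This works because LISTAGG ignores NULL values: the CASE expression returns the value only for the first row of each group and NULL for the rest. A minimal sketch of the benefit mentioned above, combining the distinct list with another aggregate in one query; the inline sample data and the alias x are assumptions for illustration:

```sql
-- Assumed sample data: 10, 10, 20, 30, 30, 40.
with t as (
  select 10 as col from dual union all
  select 10 from dual union all
  select 20 from dual union all
  select 30 from dual union all
  select 30 from dual union all
  select 40 from dual
)
select listagg(case when seqnum = 1 then col end, ',')
         within group (order by col) as distinct_cols,  -- NULLs from repeats are skipped
       count(*) as total_rows                            -- still counts every row
from (select t.*,
             row_number() over (partition by col order by col) as seqnum
      from t
     ) x;
-- distinct_cols: 10,20,30,40   total_rows: 6
```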

oracle pl/sql results into one string

I'm trying to create a simple stored procedure that stores queried result into one string.
v_string1 varchar2(100);
Select column1
From dual;
Will return
column 1
--------
aaaa
bbbb
cccc
I want to store "aaaa, bbbb, cccc" into v_string1.
And all I can think of is a Cursor...
Is there a better way to handle this?
Using SQL Fiddle:
select LISTAGG(name, ',') WITHIN GROUP (ORDER BY name) AS names
from temp_table
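To get the result into the variable from the question, the same query can be used with SELECT ... INTO inside PL/SQL. A minimal sketch, assuming the temp_table and name column used in this answer already exist, and using ', ' as the separator to match the desired output:

```sql
-- Run with SET SERVEROUTPUT ON to see the printed value.
DECLARE
  v_string1 VARCHAR2(100);
BEGIN
  -- An aggregate without GROUP BY returns exactly one row, so INTO is safe here.
  SELECT LISTAGG(name, ', ') WITHIN GROUP (ORDER BY name)
    INTO v_string1
    FROM temp_table;

  DBMS_OUTPUT.PUT_LINE(v_string1);
END;
/
```

If the concatenated text can exceed 100 characters, the variable needs to be declared larger, otherwise the assignment raises an error.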
Another option, using pure SQL, works on versions before Oracle 11g, although the resulting string is still limited to 4000 characters.
Select ltrim(max(names), ', ') as names
From (
Select sys_connect_by_path(name, ', ') as names
From (
Select name, row_number() over (order by name) as rown
From temp_table
)
Start with rown = 1
Connect by rown = prior rown + 1
)