Oracle : Combine unique row values into comma-separated string [duplicate] - sql

I have a query that returns data like the sample below. I need to combine the v_data column into a comma-separated string, excluding duplicates and null values.
g_name g_id v_data
----- ---- ------
Test 123 ABC
Test 123 ABC
Test 123 DEG
Test 123 None
Test 123
Test 123 HIJ
Desired output :
g_name g_id v_data
----- ---- ------
Test 123 ABC,DEG,HIJ
I have tried using XMLAGG but can't remove duplicates and null values.
select g_name,
g_id,
RTRIM(XMLAGG(XMLELEMENT(e, v_data || ',')).EXTRACT('//text()'), ',')
from tblData
group by g_name, g_id

Just pre-filter your rows with a common table expression, then do your string concatenation.
with cte1 as (
select distinct *
from tblData
where v_data is not null
)
select g_name,
g_id,
RTRIM(XMLAGG(XMLELEMENT(e, v_data || ',')).EXTRACT('//text()'), ',')
from cte1
group by g_name, g_id

You can use the LISTAGG function:
select g_name, g_id, listagg(v_data,',') within group (order by v_data) v_data
from (
select distinct g_name, g_id, v_data
from tblData
where v_data is not null
)
group by g_name, g_id
In the inner select, we handle the DISTINCT and NULL removal (you could also filter out the value None here if you wanted).
The outer query groups by name and id and concatenates the values.
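The two-step idea (de-duplicate and drop NULLs first, aggregate second) can be tried out without an Oracle instance. The following is only a sketch under assumptions: it uses Python's built-in sqlite3 module, where group_concat plays roughly the role of LISTAGG, and it treats the blank and None rows of the sample as NULL. SQLite does not guarantee the concatenation order here, so the sketch sorts the values in Python.

```python
import sqlite3

# Hypothetical in-memory copy of the question's sample rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tblData (g_name TEXT, g_id INTEGER, v_data TEXT)")
conn.executemany(
    "INSERT INTO tblData VALUES (?, ?, ?)",
    [
        ("Test", 123, "ABC"),
        ("Test", 123, "ABC"),
        ("Test", 123, "DEG"),
        ("Test", 123, None),  # assumption: the 'None' row is a NULL
        ("Test", 123, None),  # assumption: the blank row is a NULL
        ("Test", 123, "HIJ"),
    ],
)

# Inner query: remove duplicates and NULLs.
# Outer query: one row per (g_name, g_id) with the values concatenated.
rows = conn.execute(
    """
    SELECT g_name, g_id, group_concat(v_data, ',') AS v_data
    FROM (
        SELECT DISTINCT g_name, g_id, v_data
        FROM tblData
        WHERE v_data IS NOT NULL
    )
    GROUP BY g_name, g_id
    """
).fetchall()

# Sort in Python because SQLite leaves the concatenation order unspecified.
values = sorted(rows[0][2].split(","))
print(rows[0][0], rows[0][1], ",".join(values))
```

In Oracle the ordering is handled by the within group (order by ...) clause instead of sorting afterwards.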

Related

Using query results as In clause parameter

I know I've seen this before, but can't come up with the search terms to find it.
I have a CTE returning a comma separated list from the below table:
Table
Create table Table
(
ID Number
, Name varchar2(100)
);
insert all
into Table (ID, Name) values (1, 'Alex')
into Table (ID, Name) values (2, 'Amy')
into Table (ID, Name) values (3, 'Jim')
select * from dual;
ID Name
1 Alex
2 Amy
3 Jim
select substr(
listagg(Table.ID || ',') within group (order by null)
, 1
, length(listagg(Table.ID || ',') within group (order by null)) - 1
) IDs
from Table
where Name like 'A%'
Which gives me the results: 1,2
I'm trying to use this result in a query's in clause:
with CTE as
(
select substr(
listagg(tbl.ID || ',') within group (order by null)
, 1
, length(listagg(tbl.ID || ',') within group (order by null)) - 1
) IDs
from Table
where Name like 'A%'
)
select *
from Table
where cast(ID as varchar2(1000)) in (select IDs from CTE) --Use results here
--believe the cast is required to compare, otherwise get a ORA-01722: invalid number
Which I want to return:
ID Name
1 Alex
2 Amy
How can I use the CTE's resulting IDs string as the parameter of my in clause?
I'm afraid I don't understand your "problem". CTE is really strange; SUBSTR of something? Why? LISTAGG returns the same result anyway. Then you want to ... what? split that result so that you could use it in another query? As if you want to make it as complex as possible (and beyond) to solve something "simple". Therefore: what real problem are you trying to solve?
Anyway, here you go: you'll have to split aggregated string into rows if you want to use it in IN clause:
SQL> with CTE as
(select listagg(ID || ',') within group (order by null) IDs
from Test
where Name like 'A%'
)
select *
from Test
where id in (select regexp_substr(IDs, '[^,]+', 1, level)
from CTE
connect by level <= regexp_count(IDS, ',') + 1
);
ID NAME
---------- ----------
1 Alex
2 Amy
SQL>
The same result is returned by a simple
SQL> select *
from Test
where Name like 'A%';
ID NAME
---------- ----------
1 Alex
2 Amy
SQL>
That's why I asked: what problem are you trying to solve?
[EDIT] As for the trailing comma: there is none when the delimiter is passed as LISTAGG's second argument, at least not in any Oracle version I have used (11g, 12c, 18cXE, 21cXE):
SQL> select listagg(id, ',') within group (order by null) result from test;
RESULT
------------------------------
1,2,3
SQL> select listagg(name, ',') within group (order by null) result from test;
RESULT
------------------------------
Alex,Amy,Jim
SQL>
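The underlying point is that IN compares against a set of rows, not against one comma-separated string, so the CTE can return the IDs as rows and skip the aggregate-then-split round trip. A minimal sketch of that, assuming Python's sqlite3 module as a stand-in for Oracle and a lowercase table name test:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE test (id INTEGER, name TEXT)")
conn.executemany(
    "INSERT INTO test VALUES (?, ?)",
    [(1, "Alex"), (2, "Amy"), (3, "Jim")],
)

# The CTE returns one row per matching id; IN consumes those rows directly,
# so no string aggregation or splitting is involved.
matches = conn.execute(
    """
    WITH cte AS (
        SELECT id FROM test WHERE name LIKE 'A%'
    )
    SELECT id, name
    FROM test
    WHERE id IN (SELECT id FROM cte)
    ORDER BY id
    """
).fetchall()

print(matches)
```

The same WITH ... IN (SELECT ...) shape is valid Oracle syntax as well; only the driver code around it is specific to this sketch.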

select max from a group by [duplicate]

I've got a very simple question, I hope.
I have a table MYTABLE like this:
FIELD1 FIELD2 FREQ
A1 B 10
A1 C 20
A1 D 5
A2 X 7
A2 Y 12
...
and I want to get something like this:
FIELD1 FIELD2
A1 C
A2 Y
that is I want to get for every distinct FIELD1 the row having the max(FREQ) but also with its FIELD2 value
Oracle 10g
Thanks in advance!
Mark
You can use analytic functions to get this done:
select * from (
select x.*,row_number() over(partition by field1,field2 order by field1 asc) as rnk from (
select field1
,field2
,freq
,max(freq) over(partition by field1) as max_value
from mytable
)x
where max_value=freq
)y
where y.rnk=1
Use window functions.
select FIELD1, FIELD2
from (
select m.*, rank() over (partition by FIELD1 order by FREQ desc) rn
from mytable m
) t
where t.rn = 1
If you use RANK() then you return all rows with MAX(FREQ) for each FIELD1.
Use row_number():
select b.* from
(select a.*, row_number() over (partition by FIELD1 order by FREQ desc) rn
from table_name a
) b where b.rn=1
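To see the row_number() approach run end to end against the question's sample data, here is a small sketch. It assumes Python's sqlite3 module with an SQLite build of 3.25 or newer (the first version with window functions) rather than Oracle 10g, and a lowercase copy of MYTABLE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (field1 TEXT, field2 TEXT, freq INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        ("A1", "B", 10),
        ("A1", "C", 20),
        ("A1", "D", 5),
        ("A2", "X", 7),
        ("A2", "Y", 12),
    ],
)

# Number the rows inside each field1 group from highest freq to lowest,
# then keep only the first row of every group.
top_rows = conn.execute(
    """
    SELECT field1, field2
    FROM (
        SELECT field1, field2,
               ROW_NUMBER() OVER (PARTITION BY field1 ORDER BY freq DESC) AS rn
        FROM mytable
    )
    WHERE rn = 1
    ORDER BY field1
    """
).fetchall()

print(top_rows)
```

With row_number() a tie on freq yields one arbitrary row per group; swapping in rank() would return every tied row instead.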

How to select values from a column that have only specific values from another column and not other values?

I have a pgsql schema with a table that has two columns, among others: id and status. The status values are of varchar type, ranging from '1' to '6'. I want to select the ids that have only specific statuses: one id having only one status ('1'), another having only two ('1' and '2'), another having only three ('1', '2' and '3'), and so on.
This is for a pgsql database. I have tried using inner query joining with the same table.
select *
from srt s
join ( select id
from srt
group by id
having count(distinct status) = 2
) t on t.id = s.id
where srt.status in ('1', '2')
limit 10
I used this to get the IDs having only status values 1 and 2 (and not having any rows with status values 3, 4, 5, 6) but didn't get the expected result
The expected result would be something like this
id status
123 1
234 1
234 2
345 1
345 2
345 3
456 1
456 2
456 3
456 4
567 1
567 2
567 3
567 4
567 5
678 1
678 2
678 3
678 4
678 5
678 6
Move your where condition inside sub-query -
select *
from srt s
join ( select id
from srt
where status in ('1', '2')
group by id
having count(distinct status) = 2
) t on t.id = s.id
limit 10
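Note that filtering inside the subquery finds ids that have both statuses, but an id that also has a status '3' row still passes the filter. One possible way to require exactly the set ('1', '2') is to count the unwanted rows in the HAVING clause. This is a sketch only, assuming Python's sqlite3 module in place of PostgreSQL and a tiny made-up subset of the srt table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE srt (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO srt VALUES (?, ?)",
    [
        (123, "1"),
        (234, "1"), (234, "2"),
        (345, "1"), (345, "2"), (345, "3"),
    ],
)

# Keep an id only if it has two distinct statuses in total
# and none of its rows falls outside the wanted set.
only_1_and_2 = [
    row[0]
    for row in conn.execute(
        """
        SELECT id
        FROM srt
        GROUP BY id
        HAVING COUNT(DISTINCT status) = 2
           AND SUM(CASE WHEN status IN ('1', '2') THEN 0 ELSE 1 END) = 0
        ORDER BY id
        """
    )
]

print(only_1_and_2)
```

Here id 345 is rejected because of its status '3' row, and id 123 because it has only one distinct status.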
To identify the ids with consecutive statuses, you can do:
select id, max(status) as max_status
from srt s
group by id
having min(status::int) = 1 and
max(status::int) = count(*);
Then, you can narrow this down to one example using distinct on and use join to bring in your results:
select s.*
from srt s join
(select distinct on (max(status)) id, max(status) as max_status
from srt s
group by id
having min(status::int) = 1 and
max(status::int) = count(*)
order by max_status asc
) ss
on ss.id = s.id
order by ss.max_status, s.status;
This is a tricky one. My solution is to first specify a list of the "target statuses" you want to match:
with target_statuses(s) as ( values (1),(2),(3) )
Then JOIN your srt table to it and count the occurrences grouped by id.
with target_statuses(s) as ( values (1),(2),(3) )
select id, count(*), row_number() OVER (partition by count(*) order by id) rownum
from srt
join target_statuses on status=s
group by id
This query also captures a row number, which we'll later use to limit it to the first id that has one match, the first id that has two matches, etc. Note the order by clause... I assume you want the alphabetically lowest id first in each case, but you may change that.
Since you can't put a window function in a HAVING clause, I wrap up the whole result at ids_and_counts_of_statuses and perform a follow-up query that rejoins it with the srt table to output it:
with ids_and_counts_of_statuses as(
with target_statuses(s) as ( values (1),(2),(3) )
select id, count(*), row_number() OVER (partition by count(*) order by id) rownum
from srt
join target_statuses on status=s
group by id
)
select srt.id, srt.status
from ids_and_counts_of_statuses
join srt on ids_and_counts_of_statuses.id=srt.id
where rownum=1;
Note that I have changed your varchar values to integers just so I didn't have to type quite so much punctuation. It works, here's an example: https://www.db-fiddle.com/f/wwob31uiNgr9aAkZoe1Jgs/0

concatenate certain columns across multiple rows

I have a dataset that looks like this
RID SID MID QID QText
------------------------------------------------------------------
NULL NULL NULL 10,20,30 't1','t2','t3'
10 14 13 4 'text2'
100 141 131 5,6 't5','t6'
I'd like to run some sql command that would basically take the row with the nulls and concatenate the QID and QText columns to each row that has a valid RID, SID, MID
so the end result would be a dataset similar to this (in this case the first row doesn't need to be there because I've concatenated the info I've got in that row to the other rows).
RID SID MID QID QText
------------------------------------------------------------------
NULL NULL NULL 10,20,30 't1','t2','t3'
10 14 13 4,10,20,30 'text2','t1','t2','t3'
100 141 131 5,6,10,20,30 't5','t6','t1','t2','t3'
I've tried several group_concats with different grouping but can't quite get it to work the way I need it to. Is this transform possible with raw SQL (mysql) ?
Some of what I've tried so far (really bad attempts because I just don't know what will do what I'm trying to do) are
select group_concat(QText) from myTable group by ? <--- I don't know of anything that I can group by that will give me what i'm looking for. That's what I mean by really bad attempts. I know they are wrong (group by id, qid, etc, etc). Also thought about and tried a sum on the columns that I want to concatenate.
If you want the NULL row(s) to be grouped with all non-NULL rows, make a copy for every group. You could, for instance, derive the list of distinct RID, SID, MID combinations:
SELECT DISTINCT RID, SID, MID
FROM myTable
WHERE RID IS NOT NULL
OR SID IS NOT NULL
OR MID IS NOT NULL
and cross join it with the NULL row(s):
SELECT
groups.RID, groups.SID, groups.MID,
empty.QID, empty.QText
FROM
(
SELECT DISTINCT RID, SID, MID
FROM myTable
WHERE RID IS NOT NULL
OR SID IS NOT NULL
OR MID IS NOT NULL
) AS groups
CROSS JOIN
(
SELECT QID, QText
FROM myTable
WHERE RID IS NULL
AND SID IS NULL
AND MID IS NULL
) AS empty
then combine the resulting set with the original set:
SELECT RID, SID, MID, QID, QText
FROM myTable
UNION ALL
SELECT
groups.RID, groups.SID, groups.MID,
empty.QID, empty.QText
FROM
(
SELECT DISTINCT RID, SID, MID
FROM myTable
WHERE RID IS NOT NULL
OR SID IS NOT NULL
OR MID IS NOT NULL
) AS groups
CROSS JOIN
(
SELECT QID, QText
FROM myTable
WHERE RID IS NULL
AND SID IS NULL
AND MID IS NULL
) AS empty
Now just use the combined result set as a derived table and get your GROUP_CONCATs from it:
SELECT
RID, SID, MID,
GROUP_CONCAT(QID) AS QID,
GROUP_CONCAT(QText) AS QText
FROM
(
SELECT … /* the above UNION ALL query here */
) AS s
GROUP BY
RID, SID, MID
;
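Putting the pieces together, the whole pipeline (original rows, plus a copy of the NULL row for every group, then GROUP_CONCAT) can be checked on the sample data. The sketch below assumes Python's sqlite3 module instead of MySQL; SQLite's group_concat behaves similarly for this purpose but does not promise a concatenation order, so the check compares sets of values. The aliases g and e are illustrative stand-ins for groups and empty.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE myTable (RID INTEGER, SID INTEGER, MID INTEGER, QID TEXT, QText TEXT)"
)
conn.executemany(
    "INSERT INTO myTable VALUES (?, ?, ?, ?, ?)",
    [
        (None, None, None, "10,20,30", "'t1','t2','t3'"),
        (10, 14, 13, "4", "'text2'"),
        (100, 141, 131, "5,6", "'t5','t6'"),
    ],
)

rows = conn.execute(
    """
    SELECT RID, SID, MID,
           group_concat(QID)   AS QID,
           group_concat(QText) AS QText
    FROM (
        -- the original rows ...
        SELECT RID, SID, MID, QID, QText FROM myTable
        UNION ALL
        -- ... plus one copy of the all-NULL row for every real group
        SELECT g.RID, g.SID, g.MID, e.QID, e.QText
        FROM (
            SELECT DISTINCT RID, SID, MID
            FROM myTable
            WHERE RID IS NOT NULL OR SID IS NOT NULL OR MID IS NOT NULL
        ) AS g
        CROSS JOIN (
            SELECT QID, QText
            FROM myTable
            WHERE RID IS NULL AND SID IS NULL AND MID IS NULL
        ) AS e
    ) AS s
    GROUP BY RID, SID, MID
    """
).fetchall()

# Map each (RID, SID, MID) key to the set of QID values it ended up with.
qids_by_group = {row[:3]: set(row[3].split(",")) for row in rows}
for key, qids in qids_by_group.items():
    print(key, sorted(qids, key=int))
```

If the order inside the concatenated string matters in MySQL, GROUP_CONCAT accepts its own ORDER BY clause.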

GROUP BY without aggregate function

I am trying to understand GROUP BY (new to oracle dbms) without aggregate function.
How does it operate?
Here is what i have tried.
EMP table on which i will run my SQL.
SELECT ename , sal
FROM emp
GROUP BY ename , sal
SELECT ename , sal
FROM emp
GROUP BY ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action:
Error at Line: 397 Column: 16
SELECT ename , sal
FROM emp
GROUP BY sal;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 411 Column: 8
SELECT empno , ename , sal
FROM emp
GROUP BY sal , ename;
Result
ORA-00979: not a GROUP BY expression
00979. 00000 - "not a GROUP BY expression"
*Cause:
*Action: Error at Line: 425 Column: 8
SELECT empno , ename , sal
FROM emp
GROUP BY empno , ename , sal;
So, basically the number of columns have to be equal to the number of columns in the GROUP BY clause, but i still do not understand why or what is going on.
That's how GROUP BY works. It takes several rows and turns them into one row. Because of this, it has to know what to do with the combined rows when they have different values for some columns (fields). This is why you have two options for every field you want to SELECT: either include it in the GROUP BY clause, or use it in an aggregate function so the system knows how you want to combine the field.
For example, let's say you have this table:
Name | OrderNumber
------------------
John | 1
John | 2
If you say GROUP BY Name, how will it know which OrderNumber to show in the result? So you either include OrderNumber in group by, which will result in these two rows. Or, you use an aggregate function to show how to handle the OrderNumbers. For example, MAX(OrderNumber), which means the result is John | 2 or SUM(OrderNumber) which means the result is John | 3.
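The three options for the John example can be run side by side. This is a small sketch, assuming Python's sqlite3 module and a hypothetical table named orders. Note that SQLite is more lenient than Oracle and would not raise ORA-00979 for a bare non-grouped column, so the sketch only runs the forms that are valid in both.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (name TEXT, order_number INTEGER)")
conn.executemany("INSERT INTO orders VALUES (?, ?)", [("John", 1), ("John", 2)])

# Option 1: put OrderNumber in the GROUP BY; both rows survive.
grouped_by_both = conn.execute(
    "SELECT name, order_number FROM orders "
    "GROUP BY name, order_number ORDER BY order_number"
).fetchall()

# Option 2: collapse the group with MAX.
with_max = conn.execute(
    "SELECT name, MAX(order_number) FROM orders GROUP BY name"
).fetchall()

# Option 3: collapse the group with SUM.
with_sum = conn.execute(
    "SELECT name, SUM(order_number) FROM orders GROUP BY name"
).fetchall()

print(grouped_by_both, with_max, with_sum)
```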
Given this data:
Col1 Col2 Col3
A X 1
A Y 2
A Y 3
B X 0
B Y 3
B Z 1
This query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
Would result in exactly the same table.
However, this query:
SELECT Col1, Col2 FROM data GROUP BY Col1, Col2
Would result in:
Col1 Col2
A X
A Y
B X
B Y
B Z
Now, a query:
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2
Would create a problem: the line with A, Y is the result of grouping the two lines
A Y 2
A Y 3
So, which value should be in Col3, '2' or '3'?
Normally you would use a GROUP BY to calculate e.g. a sum:
SELECT Col1, Col2, SUM(Col3) FROM data GROUP BY Col1, Col2
So in the line we had a problem with, we now get (2+3) = 5.
Grouping by all the columns in your select is effectively the same as using DISTINCT, and it is preferable to use the DISTINCT keyword for readability in this case.
So instead of
SELECT Col1, Col2, Col3 FROM data GROUP BY Col1, Col2, Col3
use
SELECT DISTINCT Col1, Col2, Col3 FROM data
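The equivalence is easy to check on the sample data from this answer. A minimal sketch, assuming Python's sqlite3 module and the table name data used in the queries above; it compares the two-column variant, where grouping actually removes a row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (col1 TEXT, col2 TEXT, col3 INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?)",
    [
        ("A", "X", 1),
        ("A", "Y", 2),
        ("A", "Y", 3),
        ("B", "X", 0),
        ("B", "Y", 3),
        ("B", "Z", 1),
    ],
)

# GROUP BY over every selected column ...
grouped = sorted(
    conn.execute("SELECT col1, col2 FROM data GROUP BY col1, col2").fetchall()
)
# ... returns the same rows as DISTINCT over those columns.
distinct = sorted(conn.execute("SELECT DISTINCT col1, col2 FROM data").fetchall())

print(grouped == distinct, grouped)
```

Both lists are sorted in Python because neither query specifies an ORDER BY.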
You're experiencing a strict requirement of the GROUP BY clause. Every column not in the group-by clause must have a function applied to reduce all records for the matching "group" to a single record (sum, max, min, etc).
If you list all queried (selected) columns in the GROUP BY clause, you are essentially requesting that duplicate records be excluded from the result set. That gives the same effect as SELECT DISTINCT which also eliminates duplicate rows from the result set.
The only real use case for GROUP BY without aggregation is when you GROUP BY more columns than are selected, in which case the selected columns might be repeated. Otherwise you might as well use a DISTINCT.
It's worth noting that other RDBMS's do not require that all non-aggregated columns be included in the GROUP BY. For example in PostgreSQL if the primary key columns of a table are included in the GROUP BY then other columns of that table need not be as they are guaranteed to be distinct for every distinct primary key column. I've wished in the past that Oracle did the same as it would have made for more compact SQL in many cases.
Let me give some examples.
Consider this data.
CREATE TABLE DATASET ( VAL1 CHAR ( 1 CHAR ),
VAL2 VARCHAR2 ( 10 CHAR ),
VAL3 NUMBER );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'b', 'b-details', 2 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'a-details', 1 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 3 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'a', 'dup', 4 );
INSERT INTO
DATASET ( VAL1, VAL2, VAL3 )
VALUES
( 'c', 'c-details', 5 );
COMMIT;
What's in the table now:
SELECT * FROM DATASET;
VAL1 VAL2 VAL3
---- ---------- ----------
b b-details 2
a a-details 1
c c-details 3
a dup 4
c c-details 5
5 rows selected.
--aggregate with group by
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1;
VAL1 COUNT(*)
---- ----------
b 1
a 2
c 2
3 rows selected.
--aggregate with group by multiple columns but select partial column
SELECT
VAL1,
COUNT ( * )
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1 COUNT(*)
---- ----------
b 1
c 2
a 1
a 1
4 rows selected.
--No aggregate with group by multiple columns
SELECT
VAL1,
VAL2
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1 VAL2
---- ----------
b b-details
c c-details
a dup
a a-details
4 rows selected.
--No aggregate with group by multiple columns but select partial column
SELECT
VAL1
FROM
DATASET A
GROUP BY
VAL1,
VAL2;
VAL1
----
b
c
a
a
4 rows selected.
If you have N columns in the SELECT (excluding aggregations), then you should have N or N+x columns in the GROUP BY.
Use sub query e.g:
SELECT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1 GROUP BY field1,field2
OR
SELECT DISTINCT field1,field2,(SELECT distinct field3 FROM tbl2 WHERE criteria) AS field3
FROM tbl1
If you have a column in the SELECT clause, how should the database pick its value when there are several rows in the group? So yes, every column in the SELECT clause should also be in the GROUP BY clause, unless it is used inside an aggregate function.
You can have a column in the GROUP BY clause that is not in the SELECT clause, but not the other way around.
As an addition, the statement
basically the number of columns have to be equal to the number of columns in the GROUP BY clause
is not correct.
Any attribute that is not part of the GROUP BY clause cannot be selected directly (only inside an aggregate function).
Any attribute that is part of the GROUP BY clause can be selected, but does not have to be.
For anyone trying to group data (from foreign tables, for example) into something like a JSON object with nested arrays of data, you can achieve this in SQL with array_agg (you can also use it in conjunction with json_build_object to create a JSON object with key-value pairs).
As a reference, I found this YouTube video helpful: https://www.youtube.com/watch?v=A6N1h9mcJf4
-- Edit
If you want to have a nested array inside a nested array, you could do it by using array.
In the following example, 'variation_images' (subquery 2 - in relation to the variation table) are nested under the 'variation' query (subquery 1 - in relation to product table) which is nested under the product query (main query):
SELECT product.title, product.slug, product.description,
ARRAY(SELECT jsonb_build_object(
'var_id', variation.id, 'var_name', variation.name, 'images',
ARRAY(SELECT json_build_object('img_url', variation_images.images)
FROM variation_images WHERE variation_images.variation_id = variation.id)
)
FROM variation WHERE variation.product_id = product.id)
FROM product
I know you said you want to understand group by if you have data like this:
COL-A COL-B COL-C COL-D
1 Ac C1 D1
2 Bd C2 D2
3 Ba C1 D3
4 Ab C1 D4
5 C C2 D5
And you want to make the data appear like:
COL-A COL-B COL-C COL-D
4 Ab C1 D4
1 Ac C1 D1
3 Ba C1 D3
2 Bd C2 D2
5 C C2 D5
You use:
select * from table_name
order by "COL-C", "COL-B"
Because I think this is what you intend to do.