BigQuery from column to rows by separator - sql

I want to generate a table that collects all values for each distinct id into one row, using BigQuery.
Example:
id label
000756f4-1af2-439b-b607-ce7384a6b8ee fast
000756f4-1af2-439b-b607-ce7384a6b8ee streaming
000756f4-1af2-439b-b607-ce7384a6b8ee other
0007bac4-1bed-4bf0-8b55-d21216723ef5 issue
000a03d2-f88c-4150-aa96-40b9fdaccb17 fast
000a03d2-f88c-4150-aa96-40b9fdaccb17 other
I would like to get a table like this:
id label
000756f4-1af2-439b-b607-ce7384a6b8ee fast, streaming, other
0007bac4-1bed-4bf0-8b55-d21216723ef5 issue
000a03d2-f88c-4150-aa96-40b9fdaccb17 fast, other
Is it possible to achieve it with BigQuery?

You can just use string_agg():
select id, string_agg(label, ', ') as labels
from t
group by id;
Note that the ordering is arbitrary (and might even vary from one run to another). You might want to include an order by as well:
select id, string_agg(label, ', ' order by label) as labels
from t
group by id;
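To try the aggregation locally without a BigQuery project, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite's group_concat() as a stand-in for string_agg() (it is not BigQuery syntax), and the helper name labels_per_id is hypothetical. The table name t and the sample rows mirror the question. Because the aggregate's element order is not guaranteed, the sketch sorts the labels in Python.

```python
import sqlite3

def labels_per_id(rows):
    """Return {id: 'label1, label2, ...'} with labels sorted alphabetically."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id TEXT, label TEXT)")
    conn.executemany("INSERT INTO t VALUES (?, ?)", rows)
    # group_concat stands in for BigQuery's string_agg (assumption for local testing).
    cur = conn.execute("SELECT id, group_concat(label, ',') FROM t GROUP BY id")
    result = {}
    for id_, joined in cur:
        # Sort here because the aggregate does not promise any element order.
        result[id_] = ", ".join(sorted(joined.split(",")))
    conn.close()
    return result

SAMPLE_ROWS = [
    ("000756f4-1af2-439b-b607-ce7384a6b8ee", "fast"),
    ("000756f4-1af2-439b-b607-ce7384a6b8ee", "streaming"),
    ("000756f4-1af2-439b-b607-ce7384a6b8ee", "other"),
    ("0007bac4-1bed-4bf0-8b55-d21216723ef5", "issue"),
    ("000a03d2-f88c-4150-aa96-40b9fdaccb17", "fast"),
    ("000a03d2-f88c-4150-aa96-40b9fdaccb17", "other"),
]

if __name__ == "__main__":
    for id_, labels in sorted(labels_per_id(SAMPLE_ROWS).items()):
        print(id_, labels)
```

Sorting on the client is only needed because this stand-in has no portable in-aggregate ordering; in BigQuery the order by inside string_agg() shown above does the same job.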

Update
Use string_agg:
select id, string_agg(label, ', ')
from mytable
group by id
Original answer
Use array_agg and array_to_string:
select id, array_to_string(array_agg(label), ', ')
from mytable
group by id

Related

Listagg + Count in Select duplicates

I'm writing up a query and cannot seem to get over this hurdle.
I am using both LISTAGG and COUNT (side-by-side) in it and whenever I do so, the ListAgg will duplicate when count is more than 1. Moreover, it adds more into the count when the ListAgg is more than one. They're each messing with each other, and I want to know how to keep them within the same query, but keep duplicates from appearing in the ListAgg while finding only the correct amount of instances for the Count.
I've tried using DISTINCT and various groupings, but to no avail.
Here is my (simplified) SQL:
SELECT DISTINCT /*+PARALLEL */ ID, NAME, LISTAGG(USERID, ';'), COUNT(MAIN_DATA)
FROM MAIN m
JOIN USERS u on m.pk1 = u.main_pk1
WHERE MAIN_DATA like '%keyword%'
GROUP BY ID, NAME
which yields something similar to this:
ID|NAME|USERID|MAIN_DATA
1|Hello|Jim|1
2|Hi|Arthur;Arthur;Arthur|3
3|Bonjour|Jane;Jane;Jim;Jim|4
When ID 2 should only have Arthur once, and there are only 2 instances of the keyword in ID 3, not 4. How can I achieve this?
Unfortunately, LISTAGG() doesn't support DISTINCT in Oracle versions before 19c.
To remove duplicates, you need a subquery:
SELECT ID, NAME, LISTAGG(USERID, ';') WITHIN GROUP (ORDER BY USERID), SUM(cnt)
FROM (SELECT ID, NAME, USERID, COUNT(*) as cnt
FROM MAIN m JOIN
USERS u
ON m.pk1 = u.main_pk1
WHERE m.MAIN_DATA like '%keyword%'
GROUP BY ID, NAME, USERID
) mu
GROUP BY ID, NAME;
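The effect of the subquery can be checked locally with a small sketch. It assumes Python's sqlite3 module, with group_concat() standing in for LISTAGG(), and a hypothetical table joined that holds the rows the MAIN/USERS join would produce. The helper name users_and_counts is also made up for illustration.

```python
import sqlite3

def users_and_counts(pairs):
    """pairs: (id, userid) rows as the join would emit them, duplicates included."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE joined (id INTEGER, userid TEXT)")
    conn.executemany("INSERT INTO joined VALUES (?, ?)", pairs)
    sql = """
        SELECT id, group_concat(userid, ';'), SUM(cnt)
        FROM (SELECT id, userid, COUNT(*) AS cnt   -- one row per (id, userid)
              FROM joined
              GROUP BY id, userid) AS mu
        GROUP BY id
        ORDER BY id
    """
    # Split and sort the user list so the result does not depend on aggregate order.
    out = [(i, sorted(users.split(";")), total) for i, users, total in conn.execute(sql)]
    conn.close()
    return out

JOINED_ROWS = [
    (1, "Jim"),
    (2, "Arthur"), (2, "Arthur"), (2, "Arthur"),
    (3, "Jane"), (3, "Jane"), (3, "Jim"), (3, "Jim"),
]

if __name__ == "__main__":
    for row in users_and_counts(JOINED_ROWS):
        print(row)
```

Each user now appears once per id, while SUM(cnt) still equals the number of joined rows. If the join itself inflates the count (as with ID 3 in the question), the count would have to be computed on MAIN before joining.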

How to Convert Column data into row data with comma separated value using SQL

I have a table like the following:
(image not available)
I want output like this:
(image not available)
How can I achieve this using SQL in an Oracle database?
You are looking for listagg(). The only caveat is that you need to specify the ordering for the values:
select stdname, listagg(marks, ', ') within group (order by ?) as marks
from t
group by stdname;
If you want them in order of the marks:
select stdname, listagg(marks, ', ') within group (order by marks desc) as marks
from t
group by stdname;

Grouping by a custom column composed of multiple actual columns

I want to display location info constituted by multiple columns in the DB but then I need to group it by the ID. The solution I've got is to list the constituting columns as groupees in the following way.
select
Id,
Name,
Here + ' and ' + There as Location,
count(*) as Count
from KnownStuff
group by Id, Name, Here, There
However, I'd like to know if there is a more like-a-bossy way to group by that column, i.e. something along the lines of this.
group by Id, Name, Location
Or, even better (although, based on my googlearching, I'm pretty sure that it's not possible), I'd like to exclude all the other columns except for Id from the grouping constraints. In some cases I'll use sum or some other aggregating function, but it'd be nice to just tell the server not to bother. And if there are non-identical occurrences, then so be it - let it crash, burn, cry or lie - after all, it's my problem that I wrote a faulty script.
So:
Is there a like-a-bossy approach to grouping a custom column?
Is there a bite-me-in-the-ass-laterish approach to make it easier for now?
Wrap the query in a derived table, then GROUP BY its result:
select Id, Name, Location, count(*)
from
(
select
Id,
Name,
Here + ' and ' + There as Location
from KnownStuff
) dt
group by Id, Name, Location
Wrapping the SELECT in a CTE (common table expression) is another option:
;with cte as (
select
Id,
Name,
Here + ' and ' + There as Location
--count(*) as Count
from KnownStuff
--group by Id, Name, Here, There
)
select
Id, Name, Location, count(*) as [Count]
from cte
group by Id, Name, Location
This behaves much like a sub-query.
You can add a computed column to your table.
alter table knownstuff add Location as (Here + ' and ' + There)
Computed columns are not stored physically unless they are declared PERSISTED. Then you can rewrite the query as:
select
Id,
Name,
Location,
count(*) as Count
from KnownStuff
group by Id, Name,Location
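The derived-table idea from the answers above can be tried out with a short sketch. It assumes Python's sqlite3 module, so string concatenation uses || instead of the + shown in the answers. The helper name count_by_location and the sample rows are hypothetical.

```python
import sqlite3

def count_by_location(rows):
    """rows: (Id, Name, Here, There) tuples; returns (Id, Name, Location, count) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE KnownStuff (Id INTEGER, Name TEXT, Here TEXT, There TEXT)")
    conn.executemany("INSERT INTO KnownStuff VALUES (?, ?, ?, ?)", rows)
    sql = """
        SELECT Id, Name, Location, COUNT(*) AS Cnt
        FROM (SELECT Id, Name,
                     Here || ' and ' || There AS Location  -- computed once, in the inner query
              FROM KnownStuff) AS dt
        GROUP BY Id, Name, Location                         -- the alias is a real column here
        ORDER BY Id, Name, Location
    """
    out = conn.execute(sql).fetchall()
    conn.close()
    return out

SAMPLE = [
    (1, "a", "x", "y"),
    (1, "a", "x", "y"),
    (2, "b", "x", "z"),
]

if __name__ == "__main__":
    for row in count_by_location(SAMPLE):
        print(row)
```

The expression is written once in the inner query, and the outer query groups by its alias as if it were an ordinary column.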

Group query rows result in one result

I need to write a query that returns its result on a single line, with the values separated by commas.
For example, I have this query:
SELECT
SIGLA
FROM
LANGUAGES
This query returns the result below:
SIGLA
ESP
EN
BRA
I need to get this result on one single line, like this:
SIGLA
ESP,EN,BRA
Can anyone help me?
Thank you!
SELECT LISTAGG(SIGLA, ', ') WITHIN GROUP (ORDER BY SIGLA) AS "S_List" FROM LANGUAGES
This should be the LISTAGG call you need.
Try:
SELECT LISTAGG( SIGLA, ',' ) within group (order by SIGLA) as NewSigla FROM LANGUAGES
If you want to get the values grouped together in the order that Oracle produces the rows then:
SELECT LISTAGG( SIGLA, ',' ) WITHIN GROUP ( ORDER BY ROWNUM ) AS SIGLA
FROM LANGUAGES;
If you want alphabetical ordering then replace ORDER BY ROWNUM with ORDER BY SIGLA (or, curiously, ORDER BY NULL).

Select all columns for a distinct column

I need to fetch distinct rows for a particular column from a table having over 200 columns. How can I achieve this?
I used
Select distinct col1 , * from table;
but it failed.
Please suggest an elegant way to achieve this.
Regards,
Tarun
The solution if you want one row for each distinct value in a particular column is:
SELECT col1, MAX(col2), MAX(col3), MAX(col4), ...
FROM mytable
GROUP BY col1
I chose MAX() arbitrarily. It could also be MIN() or some other aggregate function. The point is if you use GROUP BY to make sure to get one row per value in col1, then all the other columns must be inside aggregate functions.
There is no way to write MAX(*) to get that treatment for all columns. Sorry, you will have to write out all the column names (at least those that you need in this query, which might not be all 200).
We can generate a sequence of ROW_NUMBER() for every COL1 and then select the first entry of every sequence.
SELECT * FROM
(
SELECT E.* , ROW_NUMBER() OVER( PARTITION BY COL1 ORDER BY 1) AS ID
FROM YOURTABLE E
) MYDATA
WHERE MYDATA.ID = 1
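As a rough local check of the ROW_NUMBER() approach, the sketch below uses Python's sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer, since window functions need that. The three-column table, the helper name first_row_per_col1, and the choice of ordering by col2 (rather than a constant) are assumptions made so the picked row is deterministic.

```python
import sqlite3

def first_row_per_col1(rows):
    """rows: (col1, col2, col3) tuples; returns one full row per distinct col1."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE yourtable (col1 TEXT, col2 INTEGER, col3 TEXT)")
    conn.executemany("INSERT INTO yourtable VALUES (?, ?, ?)", rows)
    sql = """
        SELECT col1, col2, col3
        FROM (SELECT e.*,
                     -- number the rows inside each col1 group, smallest col2 first
                     ROW_NUMBER() OVER (PARTITION BY col1 ORDER BY col2) AS rn
              FROM yourtable e) AS mydata
        WHERE rn = 1          -- keep only the first row of every group
        ORDER BY col1
    """
    out = conn.execute(sql).fetchall()
    conn.close()
    return out

SAMPLE = [
    ("a", 2, "x"),
    ("a", 1, "y"),
    ("b", 5, "z"),
]

if __name__ == "__main__":
    for row in first_row_per_col1(SAMPLE):
        print(row)
```

Unlike the MAX() approach, every value in a returned row comes from the same source row, which matters when the columns only make sense together.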