Concat data from different rows into one in BigQuery - sql

I want to concatenate the data from one column that is spread across different rows.
The data looks like this:
id | Name |
1 | Jack, John |
2 | John |
3 | John, Julie |
4 | Jack |
5 | Jack, Julie |
I want the output as Jack, John, Julie. Every name should be unique.
I tried using string_agg(distinct Name), but the result is coming out as (Jack, John, John, John, Julie, Jack, Jack, Julie).
How can I solve this issue and get the desired result?
Thanks in advance

Use the query below:
select string_agg(distinct trim(nm), ', ') as names
from your_table, unnest(split(name)) nm
Applied to the sample data in your question, this returns a single row whose names value lists Jack, John and Julie once each.
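To check the logic outside BigQuery, here is a rough Python model of what the query does: split each value on commas, trim each piece, and keep the distinct names. The function name `distinct_names` and the `rows` list are illustrative only. The result is sorted purely to make it reproducible, since `string_agg(distinct ...)` without an ORDER BY does not promise any particular order.

```python
# Sample values copied from the Name column in the question.
rows = ["Jack, John", "John", "John, Julie", "Jack", "Jack, Julie"]


def distinct_names(values):
    """Split each value on commas, trim the pieces, and join the unique names."""
    unique = set()
    for value in values:
        for piece in value.split(","):   # mirrors unnest(split(name))
            name = piece.strip()          # mirrors trim(nm)
            if name:
                unique.add(name)
    # Sorting is an assumption made here for a stable result.
    return ", ".join(sorted(unique))


print(distinct_names(rows))
```

The key point is that the trim happens after the split; trimming the whole column value first would leave " John" and "John" as different strings.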

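Does this work for you? If it does, please mark it as the answer. Note that this is SQL Server (T-SQL) syntax (FOR XML PATH, CROSS APPLY, STRING_SPLIT), so it will not run on BigQuery as written.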
WITH DistinctValues AS(
SELECT DISTINCT
V.DenormalisedData,
TRIM(SS.[Value]) AS [Value] -- trim so that ' John' and 'John' count as the same value
FROM (VALUES((Select SUBSTRING(( SELECT ',' + trim(Name) AS 'data()' FROM TableName FOR XML PATH('') ), 2 , 9999))))V(DenormalisedData)
CROSS APPLY STRING_SPLIT(V.DenormalisedData,',') SS)
SELECT STRING_AGG(DV.[Value],',') AS RedenormalisedData
FROM DistinctValues DV
GROUP BY DenormalisedData;

Related

Group by portion of field

I have a field in a PostgreSQL table, name, with this format:
JOHN^DOE
BILLY^SMITH
FIRL^GREGOIRE
NOEL^JOHN
and so on. The format is LASTNAME^FIRSTNAME. The table has ID, name, birthdate and sex fields.
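How can I write a SQL statement with GROUP BY FIRSTNAME only? I have tried several things, and I guess regexp_match could be the way, but I don't know how to write a correct regular expression for this task. Can you help me?
I would recommend split_part(). Its third argument selects the field: 1 returns the part before the ^ and 2 the part after it, so use whichever one holds the first name in your data:
group by split_part(mycol, '^', 1)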
Demo on DB Fiddle:
mycol | split_part
:------------ | :---------
JOHN^DOE | JOHN
BILLY^SMITH | BILLY
FIRL^GREGOIRE | FIRL
NOEL^JOHN | NOEL
Use regexp_replace. Note that '^' needs to be escaped, since in many regexp dialects it means the beginning of the line or of the string. Extending your example with one more name, and grouping by the first field:
select
count(*)
, regexp_replace(tmp_col, '\^.*', '')
from
(values
('JOHN^DOE')
, ('BILLY^SMITH')
, ('FIRL^GREGOIRE')
, ('NOEL^JOHN')
, ('JOHN^SMITH')
)
as tmp_table(tmp_col)
group by regexp_replace(tmp_col, '\^.*', '')
;
Prints:
count | regexp_replace
-------+----------------
1 | BILLY
2 | JOHN
1 | NOEL
1 | FIRL
(4 rows)
To group by on the second field, use a similar regex:
select
count(*)
, regexp_replace(tmp_col, '.*\^', '')
from
(values
('JOHN^DOE')
, ('BILLY^SMITH')
, ('FIRL^GREGOIRE')
, ('NOEL^JOHN')
, ('JOHN^SMITH')
)
as tmp_table(tmp_col)
group by regexp_replace(tmp_col, '.*\^', '')
;
Prints:
count | regexp_replace
-------+----------------
1 | JOHN
1 | GREGOIRE
1 | DOE
2 | SMITH
(4 rows)
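The two regular expressions can be tried outside the database as well. The sketch below uses Python's `re.sub` as a stand-in for `regexp_replace` (with `count=1`, because the PostgreSQL function replaces only the first match unless the 'g' flag is given) and `collections.Counter` as a stand-in for `count(*) ... group by`. The helper name `count_by` is made up for this example.

```python
import re
from collections import Counter

# Same five values as in the queries above.
names = ["JOHN^DOE", "BILLY^SMITH", "FIRL^GREGOIRE", "NOEL^JOHN", "JOHN^SMITH"]


def count_by(pattern, values):
    """Strip the part matched by `pattern` and count the remaining keys."""
    return Counter(re.sub(pattern, "", value, count=1) for value in values)


# '\^.*' removes the caret and everything after it, leaving the first field.
first_field = count_by(r"\^.*", names)
# '.*\^' removes everything up to and including the caret, leaving the second field.
second_field = count_by(r".*\^", names)

print(first_field)
print(second_field)
```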

Concat multiple rows PSQL

id | name | Subject | Lectured_Times | Faculty
3258132 | Chris Smith | SATS1364 | 10 | Science
3258132 | Chris Smith | ECTS4605 | 9 | Engineering
How would I go about creating the following?
3258132 Chris Smith SATS1364, 10, Science + ECTS4605, 9,Engineering
where the + is just a new line. Notice that after the '+' (new line) the id and name are not repeated.
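Try this. Every non-aggregated column has to appear in GROUP BY, and concat_ws adds the ', ' separators shown in the desired output:
SELECT concat(id, ' ', "name", ' ', string_agg(concat_ws(', ', subject, Lectured_Times, Faculty), chr(10)))
from tn
where id = 3258132
group by id, "name";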
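As mentioned above, string_agg is a perfect solution for this. Use E'\n' for the line break; a plain '\n' is a literal backslash followed by n when standard_conforming_strings is on, which is the default.
select
id, name, string_agg(concat_ws(', ', subject, Lectured_Times, Faculty), E'\n')
from your_table
group by id, name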
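As a rough illustration of what grouping plus string_agg is meant to produce here, the Python sketch below groups the two sample rows by (id, name) and joins the per-subject details with a newline. The function name `summarise` and the tuple layout are assumptions made for the example, not part of any answer above.

```python
# The two sample rows from the question: (id, name, subject, lectured_times, faculty).
rows = [
    (3258132, "Chris Smith", "SATS1364", 10, "Science"),
    (3258132, "Chris Smith", "ECTS4605", 9, "Engineering"),
]


def summarise(rows):
    """Group rows by (id, name) and join each group's details with newlines."""
    grouped = {}
    for id_, name, subject, times, faculty in rows:
        # One 'subject, times, faculty' string per source row.
        grouped.setdefault((id_, name), []).append(f"{subject}, {times}, {faculty}")
    # The id and name appear once per group, followed by the joined details.
    return [
        f"{id_} {name} " + "\n".join(details)
        for (id_, name), details in grouped.items()
    ]


for line in summarise(rows):
    print(line)
```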

SQL comma separated column loop

I have a serious problem with SQL that already took me 3 hours. I have two tables like these:
First table: Employees
ID | NAME
---+--------
1 | John
2 | Mike
3 | Robert
Second table: Customers
ID | NAME | EMPLOYEES
---+---------+--------------
1 | Michael | 2,3
2 | Julia | 1
3 | Mila | 1,2,3
I want the output like this:
Michael | Mike, Robert
Julia | John
Mila | John, Mike, Robert
What should the SQL command be to get the expected output?
Select A.Name
,Employees = (Select Stuff((Select Distinct ',' +Name From Employees Where charindex(','+cast(ID as varchar(25))+',',','+A.EMPLOYEES+',')>0 For XML Path ('')),1,1,'') )
From Customers A
Returns
Name Employees
Michael Mike,Robert
Julia John
Mila John,Mike,Robert
This is an awful data structure and you should fix it. That is the primary thing. Storing numbers as strings is bad. Storing multiple values in a column is bad. Not declaring foreign key relationships is bad.
That said, what can you do if someone else set up such a database and did so in this bad way? Well, you can do:
select c.*, e.name
from customers c join
employees e
on ',' + c.employees + ',' like '%,' + cast(e.id as varchar(255)) + ',%';
Note that this query cannot be optimized using normal SQL methods, such as indexes, because the JOIN condition is too complicated.
This will give you more rows than you have asked for:
Michael Mike
Michael Robert
Julia John
Mila John
Mila Mike
Mila Robert
However, this is the normal way that SQL works, so you should get used to it.
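For comparison, the lookup the question asks for can be modelled in a few lines of Python: split the comma-separated id list and map each id to a name. This is only a sketch of the intended result using the sample data; the names `employees`, `customers` and `employee_names` are chosen for the example.

```python
# Sample data from the question.
employees = {1: "John", 2: "Mike", 3: "Robert"}
customers = [("Michael", "2,3"), ("Julia", "1"), ("Mila", "1,2,3")]


def employee_names(id_list):
    """Turn a string such as '2,3' into 'Mike, Robert'."""
    ids = [int(part) for part in id_list.split(",") if part.strip()]
    # Ids with no matching employee are skipped, as an inner join would do.
    return ", ".join(employees[i] for i in ids if i in employees)


result = {name: employee_names(id_list) for name, id_list in customers}
for name, names in result.items():
    print(name, "|", names)
```

Comparing whole ids after splitting avoids the false matches a plain substring test would give (for example id 1 matching inside 11), which is the reason the SQL versions wrap both sides in commas.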

How to get initials easily out of text field using Postgres

I am using Postgres version 9.4 and I have a full_name field in a table.
In some cases, I want to put initials instead of the full_name of the person in my table.
Something like:
Name | Initials
------------------------
Joe Blow | J. B.
Phil Smith | P. S.
The full_name field is a string value (obviously), and I think the best way to go about this is to split the string into an array on each space, i.e.:
select full_name, string_to_array(full_name,' ') initials
from my_table
This produces the following result-set:
Eric A. Korver;{Eric,A.,Korver}
Ignacio Bueno;{Ignacio,Bueno}
Igmar Mendoza;{Igmar,Mendoza}
Now, the only thing I am missing is how to loop through each array element and pull the first character out of it. I will end up using substring() to get the initial character of each element; however, I am stuck on how to loop through them on the fly.
Anybody have a simple way to go about this?
Use unnest with string_agg:
select full_name, string_agg(substr(initials, 1,1)||'.', ' ') initials
from (
select full_name, unnest(string_to_array(full_name,' ')) initials
from my_table
) sub
group by 1;
full_name | initials
------------------------+-------------
Phil Smith | P. S.
Joe Blow | J. B.
Jose Maria Allan Pride | J. M. A. P.
Eric A. Korver | E. A. K.
(4 rows)
In Postgres 14+ you can replace unnest(string_to_array(...)) with string_to_table(...).
Test it in db<>fiddle.
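The same split, take-first-letter, and re-join steps can be sketched in Python to sanity-check the expected results. The function name `initials` is hypothetical and this is only a model of the query's logic, not a replacement for it.

```python
def initials(full_name):
    """Return the first letter of each whitespace-separated part, e.g. 'J. B.'."""
    # str.split() with no argument also collapses repeated spaces,
    # so no empty parts are produced.
    return " ".join(part[0] + "." for part in full_name.split())


for name in ["Phil Smith", "Joe Blow", "Jose Maria Allan Pride", "Eric A. Korver"]:
    print(name, "->", initials(name))
```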
You can also create a helper function for this, in case you want to reuse the logic in several queries:
--
-- Function to extract a person's initials from the full name.
--
DROP FUNCTION IF EXISTS get_name_initials(TEXT);
CREATE OR REPLACE FUNCTION get_name_initials(full_name TEXT)
RETURNS TEXT AS $$
DECLARE
result TEXT :='';
part VARCHAR :='';
BEGIN
FOREACH part IN ARRAY string_to_array($1, ' ') LOOP
-- Append each initial followed by a period and a space.
result := result || substr(part, 1, 1) || '. ';
END LOOP;
-- Drop the trailing space.
RETURN rtrim(result);
END;
$$ LANGUAGE plpgsql;
Now you can simply use this function to get the initials:
SELECT full_name, get_name_initials(full_name) as initials
FROM my_table;
SELECT get_name_initials('Phil Smith'); -- Returns P. S.
SELECT get_name_initials('Joe Blow'); -- Returns J. B.
SqlFiddleDemo
WITH add_id AS (
SELECT n.*, row_number() OVER (ORDER BY "Name") AS id
FROM names n
),
split_names AS (
SELECT id, regexp_split_to_table("Name", E'\\s+') AS single_name
FROM add_id
),
initials AS (
SELECT id, left(single_name, 1) || '.' AS initial
FROM split_names
),
final AS (
SELECT id, string_agg(initial, ' ')
FROM initials
GROUP BY id
)
SELECT a.*, f.*
FROM add_id a
JOIN final f USING (id)
For debugging, I added an Initials column to the table to show that it matches the string_agg result:
| Name | Initials | id | id | string_agg |
|----------------|----------|----|----|------------|
| Eric A. Korver | E. A. K. | 1 | 1 | E. A. K. |
| Igmar Mendoza | I. M. | 2 | 2 | I. M. |
| Ignacio Bueno | I. B. | 3 | 3 | I. B. |
| Joe Blow | J. B. | 4 | 4 | J. B. |
| Phil Smith | P. S. | 5 | 5 | P. S. |
After some work I got a more compact version (SqlFiddleDemo):
SELECT "Name", string_agg(left(single_name, 1) || '.', '') AS Initials
FROM (
SELECT
"Name",
regexp_split_to_table("Name", E'\\s+') AS single_name
FROM names
) split_names
GROUP BY "Name"
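Output (some rows show the initials out of order, because string_agg has no ORDER BY here and so the order is not guaranteed):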
| Name | initials |
|----------------|----------|
| Eric A. Korver | E.K.A. |
| Igmar Mendoza | M.I. |
| Ignacio Bueno | I.B. |
| Joe Blow | B.J. |
| Phil Smith | P.S. |

SQL - Group by Elements of Comma Delineation

How can I group by a comma delineated list within a row?
Situation:
I have a view that shows me information on support tickets. Each ticket is assigned to an indefinite number of resources. It might have one name in the resource list, it might have 5.
I would like to aggregate by individual names, so:
| Ticket ID | resource list
+-----------+----------
| 1 | Smith, Fred, Joe
| 2 | Fred
| 3 | Smith, Joe
| 4 | Joe, Fred
Would become:
| Name | # of Tickets
+-----------+----------
| Fred | 3
| Smith | 2
| Joe | 3
I did not design the database, so I am stuck with this awkward resource list column.
I've tried something like this:
SELECT DISTINCT resource_list
, Count(*) AS '# of Tickets'
FROM IEG.vServiceIEG
GROUP BY resource_list
ORDER BY '# of Tickets' DESC
...which gives me ticket counts based on particular combinations, but I'm having trouble getting this one step further to separate that out.
I also have access to a list of these individual names that I could do a join from, but I'm not sure how I would make that work. Previously in reports, I've used WHERE resource_list LIKE '%' + #tech + '%', but I'm not sure how I would iterate through this for all names.
EDIT:
This is my final query that gave me the information I was looking for:
select b.Item, Count(*) AS 'Ticket Count'
from IEG.vServiceIEG a
cross apply (Select * from dbo.Split(REPLACE(a.resource_list, ' ', ''),',')) b
Group by b.Item
order by 2 desc
Check this post (function definition by Romil) for splitting strings into a table:
How to split string and insert values into table in SQL Server
Use it this way:
select b.Item, Count(*) from IEG.vServiceIEG a
cross apply (
Select * from dbo.Split (a.resource_list,',')
) b
Group by b.Item
order by 2 desc
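To see what the split-then-group step should return for the sample data, here is a small Python model. It assumes the resource list is a plain comma-separated string, and the names `tickets` and `tickets_per_resource` are invented for the example. The strip() call plays the role of the REPLACE(..., ' ', '') in the final query above, so that ' Fred' and 'Fred' are counted together.

```python
from collections import Counter

# Ticket ID -> resource list, copied from the question.
tickets = {
    1: "Smith, Fred, Joe",
    2: "Fred",
    3: "Smith, Joe",
    4: "Joe, Fred",
}


def tickets_per_resource(tickets):
    """Count how many times each name appears across all resource lists."""
    counts = Counter()
    for resource_list in tickets.values():
        # One entry per split item, like one row per item from the split function.
        counts.update(name.strip() for name in resource_list.split(",") if name.strip())
    return counts


for name, count in tickets_per_resource(tickets).most_common():
    print(name, count)
```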