MySQL GROUP_CONCAT headache

For performance, I need to set a limit on the GROUP_CONCAT result,
and I need to know if there are rows not included.
How can I do it?
EDIT
Let me provide a contrived example:
create table t(qid integer unsigned, name varchar(30));
insert into t values(1,'test1');
insert into t values(1,'test2');
insert into t values(1,'test3');
select group_concat(name separator ',')
from t
where qid=1;
+----------------------------------+
| group_concat(name separator ',') |
+----------------------------------+
| test1,test2,test3                |
+----------------------------------+
But now I want to concatenate at most 2 entries, and I need to know if some entry was not included in the result:
+----------------------------------+
| group_concat(name separator ',') |
+----------------------------------+
| test1,test2                      |
+----------------------------------+
And I need to know that there is another entry left over (in this case it's "test3").

This should do the trick:
SELECT
  SUBSTRING_INDEX(GROUP_CONCAT(name), ',', 2) AS list,
  IF(COUNT(*) > 2, 1, 0) AS more
FROM t
WHERE qid = 1
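With the three sample rows above, this should return something along these lines (note that GROUP_CONCAT does not guarantee an order unless you add an ORDER BY inside it, so which two names end up in list can vary):
+-------------+------+
| list        | more |
+-------------+------+
| test1,test2 |    1 |
+-------------+------+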

How are you going to set the limit? And what performance issues will it solve?
You can get the number of rows in a group using count(*) and compare it to the limit.
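For example, a minimal sketch of that idea, reusing the sample table t from the question with a limit of 2:
SELECT
  GROUP_CONCAT(name SEPARATOR ',') AS list,
  COUNT(*) AS total_rows,        -- compare this to your limit
  COUNT(*) > 2 AS has_more       -- 1 when the group holds more rows than the limit of 2
FROM t
WHERE qid = 1;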

Related

Sort each character in a string from a specific column in Snowflake SQL

I am trying to alphabetically sort each value in a column with Snowflake. For example I have:
| NAME |
| ---- |
| abc |
| bca |
| acb |
and want
| NAME |
| ---- |
| abc |
| abc |
| abc |
how would I go about doing that? I've tried using SPLIT and then ordering the rows, but that doesn't seem to work without a specific delimiter.
Using REGEXP_REPLACE to introduce a separator between each character, STRTOK_SPLIT_TO_TABLE to get the individual letters as rows, and LISTAGG to combine them again as a sorted string:
SELECT tab.col, LISTAGG(s.value) WITHIN GROUP (ORDER BY s.value) AS result
FROM tab
, TABLE(STRTOK_SPLIT_TO_TABLE(REGEXP_REPLACE(tab.col, '(.)', '\\1~'), '~')) AS s
GROUP BY tab.col;
For sample data:
CREATE OR REPLACE TABLE tab
AS
SELECT 'abc' AS col UNION
SELECT 'bca' UNION
SELECT 'acb';
Output:
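| COL | RESULT |
| --- | ------ |
| abc | abc    |
| acb | abc    |
| bca | abc    |
(row order may vary, since the query has no ORDER BY on the outer result)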
A similar implementation to Lukasz's, but using regexp_extract_all to extract the individual characters as an array, which we then split into rows using flatten. The listagg then stitches them back together in the order specified in the within group clause.
with cte (col) as
(select 'abc' union
select 'bca' union
select 'acb')
select col, listagg(b.value) within group (order by b.value) as col2
from cte, lateral flatten(regexp_extract_all(col,'.')) b
group by col;
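For the same three input strings this also yields abc in every row, e.g.:
| COL | COL2 |
| --- | ---- |
| abc | abc  |
| acb | abc  |
| bca | abc  |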

Group by portion of field

I have a field in a PostgreSQL table, name, with this format:
JOHN^DOE
BILLY^SMITH
FIRL^GREGOIRE
NOEL^JOHN
and so on. The format is LASTNAME^FIRSTNAME. The table has ID, name, birthdate and sex fields.
How can I write a SQL statement that does GROUP BY on the FIRSTNAME only? I have tried several things, and I guess regexp_match could be the way, but I don't know how to write a correct regular expression for this task. Can you help me?
I would recommend split_part():
group by split_part(mycol, '^', 1)
Demo on DB Fiddle:
mycol | split_part
:------------ | :---------
JOHN^DOE | JOHN
BILLY^SMITH | BILLY
FIRL^GREGOIRE | FIRL
NOEL^JOHN | NOEL
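For reference, a complete statement built on that idea could look like the sketch below (mytable is only a placeholder for your table name; the question's column is name, and with the LASTNAME^FIRSTNAME format split_part(..., 2) picks the first name while split_part(..., 1) picks the last name):
select split_part(name, '^', 2) as firstname, count(*) as cnt
from mytable
group by split_part(name, '^', 2);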
Use regexp_replace. Note that '^' needs to be escaped, since in many regexp dialects it means the beginning of the line or the string. Extending your example with one more name, and using group by on the first field:
select
    count(*)
  , regexp_replace(tmp_col, '\^.*', '')
from
    (values
        ('JOHN^DOE')
      , ('BILLY^SMITH')
      , ('FIRL^GREGOIRE')
      , ('NOEL^JOHN')
      , ('JOHN^SMITH')
    ) as tmp_table(tmp_col)
group by regexp_replace(tmp_col, '\^.*', '')
;
Prints:
count | regexp_replace
-------+----------------
1 | BILLY
2 | JOHN
1 | NOEL
1 | FIRL
(4 rows)
To group by on the second field, use a similar regex:
select
    count(*)
  , regexp_replace(tmp_col, '.*\^', '')
from
    (values
        ('JOHN^DOE')
      , ('BILLY^SMITH')
      , ('FIRL^GREGOIRE')
      , ('NOEL^JOHN')
      , ('JOHN^SMITH')
    ) as tmp_table(tmp_col)
group by regexp_replace(tmp_col, '.*\^', '')
;
Prints:
count | regexp_replace
-------+----------------
1 | JOHN
1 | GREGOIRE
1 | DOE
2 | SMITH
(4 rows)

Group data on different key-value pairs in a string

I have a table LOG that contains a field NOTES. Table LOG also contains a field NrofItems. This is on Azure SQL. NOTES is a string that contains key-value pairs separated by semicolons. The order of the key-value pairs is random. The keys are known.
Example of three records:
NOTES | NrofItems
"customer=customer1;code=blablabla;application=SomeApplication" | 23
"code=adfadfadf;customer=customer99;application=AlsoApplication" | 33
"code=xyzxyzxyz;application=AlsoApplication;customer=customer1" | 13
"code=blablabla;customer=customer1;application=SomeApplication" | 2
I need to sum the value of NrofItems per customer per application per... like this:
customer1 | blablabla | SomeApplication | 25
customer1 | xyzxyzxyz | AlsoApplication | 13
customer99 | adfadfadf | AlsoApplication | 33
I would like to be able to use one or more of the key-value pairs to make groupings.
I do know how to do it for one grouping, but how do I do it for more?
See this URL to see how to do it for one grouping: Group By on part of string
Hmmm. For this, I'm thinking that extracting the customer and application separately is a convenient way to go:
select c.customer, a.application, sum(nrofitems)
from t outer apply
     (select top (1) stuff(s.value, 1, 9, '') as customer       -- strip the 9-character prefix 'customer='
      from string_split(t.notes, ';') s
      where s.value like 'customer=%'
     ) c outer apply
     (select top (1) stuff(s.value, 1, 12, '') as application   -- strip the 12-character prefix 'application='
      from string_split(t.notes, ';') s
      where s.value like 'application=%'
     ) a
group by c.customer, a.application;
Here is a db<>fiddle.
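If you also need the code key from the desired output, the same pattern extends with one more apply; this is just a sketch along the same lines, not part of the fiddle above:
select c.customer, cd.code, a.application, sum(nrofitems)
from t outer apply
     (select top (1) stuff(s.value, 1, 9, '') as customer       -- 'customer=' is 9 characters
      from string_split(t.notes, ';') s
      where s.value like 'customer=%'
     ) c outer apply
     (select top (1) stuff(s.value, 1, 5, '') as code           -- 'code=' is 5 characters
      from string_split(t.notes, ';') s
      where s.value like 'code=%'
     ) cd outer apply
     (select top (1) stuff(s.value, 1, 12, '') as application   -- 'application=' is 12 characters
      from string_split(t.notes, ';') s
      where s.value like 'application=%'
     ) a
group by c.customer, cd.code, a.application;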

Get Count Between Comma and First Character

In my table, the values appear like this in a column:
Names
----------------
Doe,John P
Woods, Adam
Hart, Keeve
Hensen,Sarah J
Is it possible to get a count of the spaces between the comma and the first character after it? Expected result:
Names |Count_of_spaces_before_next_character
-----------------|--------------------------------------
Doe,John P | 0
Woods, Adam | 1
Hart, Keeve | 5
Hensen,Sarah J | 0
Thanks for any direction, much appreciated!
You may try the following statement:
Table:
CREATE TABLE Data (Names varchar(1000))
INSERT INTO Data
(Names)
VALUES
('Doe,John P'),
('Woods, Adam'),
('Hart, Keeve'),
('Hensen,Sarah J')
Statement:
SELECT Names, LEN(After) - LEN(LTRIM(After)) AS [Count]
FROM (
SELECT
Names,
RIGHT(Names, LEN(Names) - CHARINDEX(',', Names)) AS After
FROM Data
) t
Result:
Names            Count
--------------   -----
Doe,John P       0
Woods, Adam      1
Hart, Keeve      5
Hensen,Sarah J   0
You can remove everything up to the comma and then measure the length after trimming off the spaces:
select len(rest) - len(ltrim(rest))
from t cross apply
(values (stuff(name, 1, charindex(',', name + ','), ''))
) v(rest);
The + ',' handles the case where there is no comma in name.
Here is a db<>fiddle.
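As a quick illustration of that edge case (using a made-up value, 'Madonna', that is not in the sample data):
select t.name, len(v.rest) - len(ltrim(v.rest)) as cnt
from (values ('Madonna')) t(name) cross apply
     (values (stuff(name, 1, charindex(',', name + ','), ''))) v(rest);
-- charindex finds the appended trailing comma, stuff strips the whole string,
-- so rest is '' and cnt is 0 instead of counting spaces inside the name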
Updated
Even shorter and cleaner:
select names,
       patindex('%[^ ]%', stuff(names, 1, charindex(',', names), '')) - 1 as Count_of_spaces_before_next_character
from mytab
+----------------+---------------------------------------+
| names | Count_of_spaces_before_next_character |
+----------------+---------------------------------------+
| Doe,John P | 0 |
| Woods, Adam | 1 |
| Hart, Keeve | 5 |
| Hensen,Sarah J | 0 |
| Hello World | 0 |
+----------------+---------------------------------------+
SQL Fiddle
Building off of Gordon's answer: if you are not familiar with the table constructor approach in cross apply, you might find this more readable. Hopefully the select inside the cross apply clears up how it can behave like a subquery. Also, I don't think you need to pad the name column with an additional ',', because charindex will return 0 if it doesn't find that character.
select name, len(rest) - len(ltrim(rest))
from t1
cross apply (select stuff(name, 1, charindex(',', name), '') as rest) t2
Logic: LEN(string after first comma) - LEN(LTRIM(string after first comma))
SELECT Names,
       LEN(SUBSTRING(Names, CHARINDEX(',', Names, 1) + 1, LEN(Names) - CHARINDEX(',', Names, 1)))
     - LEN(LTRIM(SUBSTRING(Names, CHARINDEX(',', Names, 1) + 1, LEN(Names) - CHARINDEX(',', Names, 1))))
FROM #Table

SQL displaying results as columns from rows

SQL noob here.
I realize that many variations of this question have been asked, but none seems to work or be fully applicable to my annoying situation, i.e. I don't think PIVOT would work for what I require. I can't fathom the necessary words to google what I need efficiently.
I have the following query:
Select w.WORKORDERID, y.Answer
From
[SD].[dbo].[WORKORDERSTATES] w
LEFT JOIN [SD].[dbo].[WO_RESOURCES] x
ON w.workorderid = x.woid
Full Outer Join [SD].[dbo].ResourcesQAMapping y
ON x.UID = y.MAPPINGID
WHERE w.APPR_STATUSID = '2'
AND w.STATUSID = '1'
AND w.REOPENED = 'false'
It will bring back the following result:
+-----------+---------------------+
| WORKORDER | Answer |
+-----------+---------------------+
| 55693 | Brad Pitt |
| 55693 | brad.pitt#mycom.com |
| 55693 | Location |
| 55693 | NULL |
| 55693 | george |
+-----------+---------------------+
I would like all rows related to the value 55693 to output as columns like below:
+-----------+-----------+---------------------+----------+--------+--------+
| WORKORDER | VALUE1 | VALUE2 | VALUE3 | VALUE4 | VALUE5 |
+-----------+-----------+---------------------+----------+--------+--------+
| 55693 | Brad Pitt | brad.pitt#mycom.com | Location | NULL | george |
+-----------+-----------+---------------------+----------+--------+--------+
There will always be the same number of values, and I am almost sure that the solution involves creating a temporary table, but I can't get it to work any which way.
Any help would be greatly appreciated.
If you always have the same number of values (5) you can use a static PIVOT, otherwise you need a dynamic TSQL statement with PIVOT.
In both cases you'll need to add a column to guarantee rows/columns ordering otherwise there is no guarantee that you'll see the correct value in each column.
Here is a sample query that uses a static PIVOT on 5 values (but remember to add a column to properly order the data, replacing ORDER BY WORKORDER with ORDER BY YOUR_COLUMN_NAME):
declare @tmp table (WORKORDER int, Answer varchar(50))
insert into @tmp values
  (55693, 'Brad Pitt')
, (55693, 'brad.pitt#mycom.com')
, (55693, 'Location')
, (55693, 'NULL')
, (55693, 'george')
select * from
(
    select
        WORKORDER,
        Answer,
        CONCAT('VALUE', ROW_NUMBER() OVER (PARTITION BY WORKORDER ORDER BY WORKORDER)) AS COL
    from @tmp
) as src
pivot
(
    max(Answer) for COL in ([VALUE1], [VALUE2], [VALUE3], [VALUE4], [VALUE5])
) as pvt
Results:
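(exact column assignment may vary because no real ordering column is available; with the sample data above it typically comes out as)
+-----------+-----------+---------------------+----------+--------+--------+
| WORKORDER | VALUE1    | VALUE2              | VALUE3   | VALUE4 | VALUE5 |
+-----------+-----------+---------------------+----------+--------+--------+
| 55693     | Brad Pitt | brad.pitt#mycom.com | Location | NULL   | george |
+-----------+-----------+---------------------+----------+--------+--------+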
Try selecting another column that has different values as the answer column and run the pivot on that; it will work.