Concatenating records in a single column without looping?

Concatenating records in a single column without looping? - sql

I have a table with 1 column of varchar values. I am looking for a way to concatenate those values into a single value without a loop, if possible. If a loop is the most efficient way of going about this, then I'll go that way but figured I'd ask for other options before defaulting to that method. I'd also like to keep this inside of a SQL query.
Ultimately, I want to do the opposite of a split function.
Is it possible to do without a loop (or cursor) or should I just use a loop to make this happen?
Edit:
Since there was a very good answer associated with how to do it in MySql (as opposed to MS Sql like I initially intended), I decided to retag so others may be able to find the answer as well.

declare #concat varchar(max)
set #concat = ''
select #concat = #concat + col1 + ','
from tablename1

try this:
DECLARE #YourTable table (Col1 int)
INSERT INTO #YourTable VALUES (1)
INSERT INTO #YourTable VALUES (2)
INSERT INTO #YourTable VALUES (30)
INSERT INTO #YourTable VALUES (400)
INSERT INTO #YourTable VALUES (12)
INSERT INTO #YourTable VALUES (46454)
SELECT
STUFF(
(
SELECT
', ' + cast(Col1 as varchar(30))
FROM #YourTable
WHERE Col1<=400
ORDER BY Col1
FOR XML PATH('')
), 1, 2, ''
)
OUTPUT:
-------------------
1, 2, 12, 30, 400
(1 row(s) affected)

I just tackled a problem like this and looping took forever. So, I concantenated the values in the presentation medium (in this case Crystal Reports) and it was very fast.
Just an idea.

If it is MySQL, you can use GROUP_CONCAT
SELECT a, GROUP_CONCAT(b SEPARATOR ',') FROM table GROUP BY a;

Probably dated now but check out Adam Machanic's post on the topic.
And this one is certainly dated; I wrote it in 2004.
Why do I prefer a function over "keeping it inside a SQL query"? Because you'll probably have to do this more than once. Why not encapsulate that code into a single module instead of repeating it all over the place?

Related

SQL Server : exploding CSV for SELECT statement

I have a table structure as below;
id txtName intReferences
------------------------------
1 Fred 1,4,6,444,56,43,
2 Sam 5,33,5904,43
3 Tom 1200
4 Samantha 43,44,888,99
I'd like to write a T-SQL query to return all the records based on a series of numbers provided.
For example, querying for 43 would return Fred, Sam and Samantha. The catch is, when querying for 3, it shouldn't return results for Sam or Samantha, given that that isn't the number in its entirety. Looking for a direct and whole number match.
The CSV value may end in a comma.
I've tried to use the "IN" statement, but it returns results if any portion of the number exists. Ideally trying to achieve without creating a function given some database restrictions.

Use string_split():
select t.*
from t cross apply
string_split(t.intReferences, ',') s
where s.value = '3';
Then, fix your data model so your are not storing integer values in strings. This is bad, bad, bad. Here are some reasons why:
Numbers should be stored as numbers, not strings (using the correct type).
SQL Server has lousy string manipulation functions.
Only one value should be stored in a column.
Foreign key relationships should be properly declared.
Resulting queries cannot be optimized to using indexes or partitions.
SQL has a great way to store lists. It is called a table not a string.

Clearly, the best way to accommodate this situation is to have properly normalized data.
Another method for querying the data with the current structure would be to check for comma + (your number) + comma. Something like this...
Declare #Temp Table(id int, txtName varchar(200), intReferences varchar(200))
Insert Into #Temp Values(1, 'Fred', '1,4,6,444,56,43,')
Insert Into #Temp Values(2, 'Sam', '5,33,5904,43')
Insert Into #Temp Values(3, 'Tom', '1200')
Insert Into #Temp Values(4, 'Samantha', '43,44,888,99')
Select *
From #Temp
Where ',' + intReferences + ',' like '%,' + '43' + ',%'
Select *
From #Temp
Where ',' + intReferences + ',' like '%,' + '3' + ',%'

'LIKE' issues with FLOAT: SQL query needed to find values >= 4 decimal places

I have a conundrum....
There is a table with one NVARCHAR(50) Float column that has many rows with many numbers of various decimal lengths:
'3304.063'
'3304.0625'
'39.53'
'39.2'
I need to write a query to find only numbers with decimal places >= 4
First the query I wrote was:
SELECT
Column
FROM Tablename
WHERE Column LIKE '%.[0-9][0-9]%'
The above code finds all numbers with decimal places >= 2:
'3304.063'
'3304.0625'
'39.53'
Perfect! Now, I just need to increase the [0-9] by 2...
SELECT
Column
FROM Tablename
WHERE Column LIKE '%.[0-9][0-9][0-9][0-9]%'
this returned nothing! What?
Does anyone have an explanation as to what went wrong as well and/or a possible solution? I'm kind of stumped and my hunch is that it is some sort of 'LIKE' limitation..
Any help would be appreciated!
Thanks.

After your edit, you stated you are using FLOAT which is an approximate value stored as 4 or 8 bytes, or 7 or 15 digits of precision. The documents explicitly state that not all values in the data type range can be represented exactly. It also states you can use the STR() function when converting it which you'll need to get your formatting right. Here is how:
declare #table table (columnName float)
insert into #table
values
('3304.063'),
('3304.0625'),
('39.53'),
('39.2')
--see the conversion
select * , str(columnName,20,4)
from #table
--now use it in a where clause.
--Return all values where the last digit isn't 0 from STR() the conversion
select *
from #table
where right(str(columnName,20,4),1) != 0
OLD ANSWER
Your LIKE statement would do it, and here is another way just to show they both work.
declare #table table (columnName varchar(64))
insert into #table
values
('3304.063'),
('3304.0625'),
('39.53'),
('39.2')
select *
from #table
where len(right(columnName,len(columnName) - charindex('.',columnName))) >= 4
select *
from #table
where columnName like '%.[0-9][0-9][0-9][0-9]%'
One thing that could be causing this is a space in the number somewhere... since you said the column type was VARCHAR this is a possibility, and could be avoided by storing the value as DECIMAL
declare #table table (columnName varchar(64))
insert into #table
values
('3304.063'),
('3304. 0625'), --notice the space here
('39.53'),
('39.2')
--this would return nothing
select *
from #table
where columnName like '%.[0-9][0-9][0-9][0-9]%'
How to find out if this is the case?
select *
from #table
where columnName like '% %'
Or, anything but numbers and decimals:
select *
from #table
where columnName like '%[^.0-9]%'

The following is working fine for me:
declare #tab table (val varchar(50))
insert into #tab
select '3304.063'
union select '3304.0625'
union select '39.53'
union select '39.2'
select * from #tab
where val like '%.[0-9][0-9][0-9][0-9]%'

Assuming your table only has numerical data, you can cast them to decimal and then compare:
SELECT COLUMN
FROM tablename
WHERE CAST(COLUMN AS DECIMAL(19,4)) <> CAST(COLUMN AS DECIMAL(19,3))
You'd want to test the performance of this against using the character data type solutions that others have already suggested.

You can use REVERSE:
declare #vals table ([Val] nvarchar(50))
insert into #vals values ('3304.063'), ('3304.0625'), ('39.53'), ('39.2')
select [Val]
from #Vals
where charindex('.',reverse([Val]))>4

Splitting delimited values in a SQL column into multiple rows

I would really like some advice here, to give some background info I am working with inserting Message Tracking logs from Exchange 2007 into SQL. As we have millions upon millions of rows per day I am using a Bulk Insert statement to insert the data into a SQL table.
In fact I actually Bulk Insert into a temp table and then from there I MERGE the data into the live table, this is for test parsing issues as certain fields otherwise have quotes and such around the values.
This works well, with the exception of the fact that the recipient-address column is a delimited field seperated by a ; character, and it can be incredibly long sometimes as there can be many email recipients.
I would like to take this column, and split the values into multiple rows which would then be inserted into another table. Problem is anything I am trying is either taking too long or not working the way I want.
Take this example data:
message-id recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5#Server.com user1#domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E#Server.com user2#domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user3#domain3.com;user4#domain4.com;user5#domain5.com
I would like this to be formatted as followed in my Recipients table:
message-id recipient-address
2D5E558D4B5A3D4F962DA5051EE364BE06CF37A3A5#Server.com user1#domain1.com
E52F650C53A275488552FFD49F98E9A6BEA1262E#Server.com user2#domain2.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user3#domain3.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user4#domain4.com
4fd70c47.4d600e0a.0a7b.ffff87e1#Server.com user5#domain5.com
Does anyone have any ideas about how I can go about doing this?
I know PowerShell pretty well, so I tried in that, but a foreach loop even on 28K records took forever to process, I need something that will run as quickly/efficiently as possible.
Thanks!

If you are on SQL Server 2016+
You can use the new STRING_SPLIT function, which I've blogged about here, and Brent Ozar has blogged about here.
SELECT s.[message-id], f.value
FROM dbo.SourceData AS s
CROSS APPLY STRING_SPLIT(s.[recipient-address], ';') as f;
If you are still on a version prior to SQL Server 2016
Create a split function. This is just one of many examples out there:
CREATE FUNCTION dbo.SplitStrings
(
#List NVARCHAR(MAX),
#Delimiter NVARCHAR(255)
)
RETURNS TABLE
AS
RETURN (SELECT Number = ROW_NUMBER() OVER (ORDER BY Number),
Item FROM (SELECT Number, Item = LTRIM(RTRIM(SUBSTRING(#List, Number,
CHARINDEX(#Delimiter, #List + #Delimiter, Number) - Number)))
FROM (SELECT ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1 CROSS APPLY sys.all_objects) AS n(Number)
WHERE Number <= CONVERT(INT, LEN(#List))
AND SUBSTRING(#Delimiter + #List, Number, 1) = #Delimiter
) AS y);
GO
I've discussed a few others here, here, and a better approach than splitting in the first place here.
Now you can extrapolate simply by:
SELECT s.[message-id], f.Item
FROM dbo.SourceData AS s
CROSS APPLY dbo.SplitStrings(s.[recipient-address], ';') as f;
Also I suggest not putting dashes in column names. It means you always have to put them in [square brackets].

SQL Server 2016 include a new table function string_split(), similar to the previous solution.
The only requirement is Set compatibility level to 130 (SQL Server 2016)

You may use CROSS APPLY (available in SQL Server 2005 and above) and STRING_SPLIT function (available in SQL Server 2016 and above):
DECLARE #delimiter nvarchar(255) = ';';
-- create tables
CREATE TABLE MessageRecipients (MessageId int, Recipients nvarchar(max));
CREATE TABLE MessageRecipient (MessageId int, Recipient nvarchar(max));
-- insert data
INSERT INTO MessageRecipients VALUES (1, 'user1#domain.com; user2#domain.com; user3#domain.com');
INSERT INTO MessageRecipients VALUES (2, 'user#domain1.com; user#domain2.com');
-- insert into MessageRecipient
INSERT INTO MessageRecipient
SELECT MessageId, ltrim(rtrim(value))
FROM MessageRecipients
CROSS APPLY STRING_SPLIT(Recipients, #delimiter)
-- output results
SELECT * FROM MessageRecipients;
SELECT * FROM MessageRecipient;
-- delete tables
DROP TABLE MessageRecipients;
DROP TABLE MessageRecipient;
Results:
MessageId Recipients
----------- ----------------------------------------------------
1 user1#domain.com; user2#domain.com; user3#domain.com
2 user#domain1.com; user#domain2.com
and
MessageId Recipient
----------- ----------------
1 user1#domain.com
1 user2#domain.com
1 user3#domain.com
2 user#domain1.com
2 user#domain2.com

for table = "yelp_business", split the column categories values separated by ; into rows and display as category column.
SELECT unnest(string_to_array(categories, ';')) AS category
FROM yelp_business;

String manipulation SQL

I have a row of strings that are in the following format:
'Order was assigned to lastname,firsname'
I need to cut this string down into just the last and first name but it is always a different name for each record.
The 'Order was assigned to' part is always the same.......
Thanks
I am using SQL Server. It is multiple records with different names in each record.

In your specific case you can use something like:
SELECT SUBSTRING(str, 23) FROM table
However, this is not very scalable, should the format of your strings ever change.
If you are using an Oracle database, you would want to use SUBSTR instead.
Edit:
For databases where the third parameter is not optional, you could use SUBSTRING(str, 23, LEN(str))
Somebody would have to test to see if this is better or worse than subtraction, as in Martin Smith's solution but gives you the same result in the end.

In addition to the SUBSTRING methods, you could also use a REPLACE function. I don't know which would have better performance over millions of rows, although I suspect that it would be the SUBSTRING - especially if you were working with CHAR instead of VARCHAR.
SELECT REPLACE(my_column, 'Order was assigned to ', '')

For SQL Server
WITH testData AS
(
SELECT 'Order was assigned to lastname,firsname' as Col1 UNION ALL
SELECT 'Order was assigned to Bloggs, Jo' as Col1
)
SELECT SUBSTRING(Col1,23,LEN(Col1)-22) AS Name
from testData
Returns
Name
---------------------------------------
lastname,firsname
Bloggs, Jo

on MS SQL Server:
declare #str varchar(100) = 'Order was assigned to lastname,firsname'
declare #strLen1 int = DATALENGTH('Order was assigned to ')
declare #strLen2 int = len(#str)
select #strlen1, #strLen2, substring(#str,#strLen1,#strLen2),
RIGHT(#str, #strlen2-#strlen1)

I would require that a colon or some other delimiter be between the message and the name.
Then you could just search for the index of that character and know that anything after it was the data you need...
Example with format changing over time:
CREATE TABLE #Temp (OrderInfo NVARCHAR(MAX))
INSERT INTO #Temp VALUES ('Order was assigned to :Smith,Mary')
INSERT INTO #Temp VALUES ('Order was assigned to :Holmes,Larry')
INSERT INTO #Temp VALUES ('New Format over time :LootAt,Me')
SELECT SUBSTRING(OrderInfo, CHARINDEX(':',OrderInfo)+1, LEN(OrderInfo))
FROM #Temp
DROP TABLE #Temp

SQL Server 2008 query to find rows containing non-alphanumeric characters in a column

I was actually asked this myself a few weeks ago, whereas I know exactly how to do this with a SP or UDF but I was wondering if there was a quick and easy way of doing this without these methods. I'm assuming that there is and I just can't find it.
A point I need to make is that although we know what characters are allowed (a-z, A-Z, 0-9) we don't want to specify what is not allowed (##!$ etc...). Also, we want to pull the rows which have the illegal characters so that it can be listed to the user to fix (as we have no control over the input process we can't do anything at that point).
I have looked through SO and Google previously, but was unable to find anything that did what I wanted. I have seen many examples which can tell you if it contains alphanumeric characters, or doesn't, but something that is able to pull out an apostrophe in a sentence I have not found in query form.
Please note also that values can be null or '' (empty) in this varchar column.

Won't this do it?
SELECT * FROM TABLE
WHERE COLUMN_NAME LIKE '%[^a-zA-Z0-9]%'
Setup
use tempdb
create table mytable ( mycol varchar(40) NULL)
insert into mytable VALUES ('abcd')
insert into mytable VALUES ('ABCD')
insert into mytable VALUES ('1234')
insert into mytable VALUES ('efg%^&hji')
insert into mytable VALUES (NULL)
insert into mytable VALUES ('')
insert into mytable VALUES ('apostrophe '' in a sentence')
SELECT * FROM mytable
WHERE mycol LIKE '%[^a-zA-Z0-9]%'
drop table mytable
Results
mycol
----------------------------------------
efg%^&hji
apostrophe ' in a sentence

Sql server has very limited Regex support. You can use PATINDEX with something like this
PATINDEX('%[a-zA-Z0-9]%',Col)
Have a look at PATINDEX (Transact-SQL)
and Pattern Matching in Search Conditions

I found this page with quite a neat solution. What makes it great is that you get an indication of what the character is and where it is. Then it gives a super simple way to fix it (which can be combined and built into a piece of driver code to scale up it's application).
DECLARE #tablename VARCHAR(1000) ='Schema.Table'
DECLARE #columnname VARCHAR(100)='ColumnName'
DECLARE #counter INT = 0
DECLARE #sql VARCHAR(MAX)
WHILE #counter <=255
BEGIN
SET #sql=
'SELECT TOP 10 '+#columnname+','+CAST(#counter AS VARCHAR(3))+' as CharacterSet, CHARINDEX(CHAR('+CAST(#counter AS VARCHAR(3))+'),'+#columnname+') as LocationOfChar
FROM '+#tablename+'
WHERE CHARINDEX(CHAR('+CAST(#counter AS VARCHAR(3))+'),'+#columnname+') <> 0'
PRINT (#sql)
EXEC (#sql)
SET #counter = #counter + 1
END
and then...
UPDATE Schema.Table
SET ColumnName= REPLACE(Columnname,CHAR(13),'')
Credit to Ayman El-Ghazali.

SELECT * FROM TABLE_NAME WHERE COL_NAME LIKE '%[^0-9a-zA-Z $#$.$-$''''$,]%'
This works best for me when I'm trying to find any special characters in a string

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Concatenating records in a single column without looping? - sql

declare #concat varchar(max) set #concat = '' select #concat = #concat + col1 + ',' from tablename1

I just tackled a problem like this and looping took forever. So, I concantenated the values in the presentation medium (in this case Crystal Reports) and it was very fast. Just an idea.

If it is MySQL, you can use GROUP_CONCAT SELECT a, GROUP_CONCAT(b SEPARATOR ',') FROM table GROUP BY a;

Related

SQL Server : exploding CSV for SELECT statement

'LIKE' issues with FLOAT: SQL query needed to find values >= 4 decimal places

Splitting delimited values in a SQL column into multiple rows

String manipulation SQL

SQL Server 2008 query to find rows containing non-alphanumeric characters in a column

Categories

Resources