SQL Server: any equivalent of strpos()? - sql

I'm dealing with an annoying database where one field contains what really should be stored as two separate fields. So the column is stored something like "The first string~#~The second string", where "~#~" is the delimiter. (Again, I didn't design this, I'm just trying to fix it.)
I want a query to move this into two columns, that would look something like this:
UPDATE UserAttributes
SET str1 = SUBSTRING(Data, 1, STRPOS(Data, '~#~')),
str2 = SUBSTRING(Data, STRPOS(Data, '~#~')+3, LEN(Data)-(STRPOS(Data, '~#~')+3))
But I can't find that any equivalent to strpos exists.

Use CHARINDEX:
Select CHARINDEX ('S','MICROSOFT SQL SERVER 2000')
Result: 6
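Applied to the UPDATE in the question, a minimal sketch could look like the following. The table variable, its column sizes and the sample rows are made up for illustration; only the column names come from the question. NULLIF turns a missing delimiter into NULL rather than a negative length, and the WHERE clause leaves rows without a delimiter untouched.

```sql
-- Illustrative stand-in for the real UserAttributes table (assumed column sizes).
DECLARE @UserAttributes TABLE (Data varchar(200), str1 varchar(200), str2 varchar(200));

INSERT INTO @UserAttributes (Data)
VALUES ('The first string~#~The second string'),
       ('no delimiter in this row');

UPDATE @UserAttributes
SET str1 = LEFT(Data, NULLIF(CHARINDEX('~#~', Data), 0) - 1),
    -- skip the 3-character delimiter; a length running past the end of the string is allowed
    str2 = SUBSTRING(Data, CHARINDEX('~#~', Data) + 3, LEN(Data))
WHERE CHARINDEX('~#~', Data) > 0;

SELECT Data, str1, str2 FROM @UserAttributes;
```

Note that CHARINDEX takes the string to find first and the string to search second, which is the reverse of strpos() in PHP or PostgreSQL.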

The PatIndex function should give you the location of the pattern as a part of a string.
PATINDEX ( '%pattern%' , expression )
http://msdn.microsoft.com/en-us/library/ms188395.aspx

If you need your data in columns here is what I use:
create FUNCTION [dbo].[fncTableFromCommaString] (@strList varchar(8000))
RETURNS @retTable Table (intValue int) AS
BEGIN
-- int rather than tinyint, so a comma beyond position 255 does not overflow
DECLARE @intPos int
WHILE CHARINDEX(',',@strList) > 0
BEGIN
SET @intPos=CHARINDEX(',',@strList)
INSERT INTO @retTable (intValue) values (CONVERT(int, LEFT(@strList,@intPos-1)))
SET @strList = RIGHT(@strList, LEN(@strList)-@intPos)
END
IF LEN(@strList)>0
INSERT INTO @retTable (intValue) values (CONVERT(int, @strList))
RETURN
END
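Just replace ',' in the function with your delimiter (or maybe even parametrize it). For a delimiter longer than one character, such as '~#~', the RIGHT() length also has to be reduced by the delimiter's length, and the CONVERT(int, ...) calls have to go if the values are strings.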
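A parametrized variant might look roughly like this. The function name, the @delimiter parameter and the strValue column are assumptions made for this sketch, not part of the original answer. It returns strings instead of ints and steps over the full delimiter length, so a multi-character delimiter like '~#~' works.

```sql
-- Hypothetical generalisation of fncTableFromCommaString with a delimiter parameter.
CREATE FUNCTION dbo.fncTableFromDelimitedString
    (@strList varchar(8000), @delimiter varchar(10))
RETURNS @retTable TABLE (strValue varchar(8000)) AS
BEGIN
    DECLARE @intPos int = CHARINDEX(@delimiter, @strList);
    WHILE @intPos > 0
    BEGIN
        -- everything before the delimiter is one value
        INSERT INTO @retTable (strValue) VALUES (LEFT(@strList, @intPos - 1));
        -- DATALENGTH counts trailing spaces, so delimiters like ', ' are skipped in full
        SET @strList = SUBSTRING(@strList, @intPos + DATALENGTH(@delimiter), 8000);
        SET @intPos = CHARINDEX(@delimiter, @strList);
    END
    -- whatever remains after the last delimiter is the final value
    IF LEN(@strList) > 0
        INSERT INTO @retTable (strValue) VALUES (@strList);
    RETURN;
END;
GO
SELECT strValue
FROM dbo.fncTableFromDelimitedString('The first string~#~The second string', '~#~');
```

The rows come back without a guaranteed order; if the position of each piece matters, an extra identity or position column in the returned table would be needed.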


How to replace all special characters in string

I have a table with the following columns:
dbo.SomeInfo
- Id
- Name
- InfoCode
Now I need to update the above table's InfoCode as
Update dbo.SomeInfo
Set InfoCode= REPLACE(Replace(RTRIM(LOWER(Name)),' ','-'),':','')
This lowercases the name, replaces every space with - and removes colons.
When I check InfoCode, I see that some Names contain special characters, such as:
Cathe Friedrich''s Low Impact
coffeyfit-cardio-box-&-burn
Jillian Michaels: Cardio
Then I am manually writing the update sql against this as
Update dbo.SomeInfo
SET InfoCode= 'cathe-friedrichs-low-impact'
where Name ='Cathe Friedrich''s Low Impact '
Now, this solution is not realistic for me. I checked the following links related to Regex & others around it.
UPDATE and REPLACE part of a string
https://www.codeproject.com/Questions/456246/replace-special-characters-in-sql
But none of them meets the requirement.
What I need is: replace any character other than [a-z0-9] with -, and make sure InfoCode never contains consecutive dashes (--).
The above Update sql has set some values of InfoCode as the-dancer's-workout®----starter-package
Some Names have value as
Sleek Technique™
The Dancer's-workout®
How can I write Update sql that could handle all such special characters?
Using NGrams8K you could split the string into characters and then rather than replacing every non-acceptable character, retain only certain ones:
SELECT (SELECT '' + CASE WHEN N.token COLLATE Latin1_General_BIN LIKE '[A-Za-z0-9]' THEN N.token ELSE '-' END
FROM dbo.NGrams8k(V.S,1) N
ORDER BY position
FOR XML PATH(''))
FROM (VALUES('Sleek Technique™'),('The Dancer''s-workout®'))V(S);
I use COLLATE here as on my default collation in my instance the '™' is ignored, therefore I use a binary collation. You may want to use COLLATE to switch the string back to its original collation outside of the subquery.
This approach is fully inlinable:
First we need a mock-up table with some test data:
DECLARE @SomeInfo TABLE (Id INT IDENTITY, InfoCode VARCHAR(100));
INSERT INTO @SomeInfo (InfoCode) VALUES
('Cathe Friedrich''s Low Impact')
,('coffeyfit-cardio-box-&-burn')
,('Jillian Michaels: Cardio')
,('Sleek Technique™')
,('The Dancer''s-workout®');
--This is the query
WITH cte AS
(
SELECT 1 AS position
,si.Id
,LOWER(si.InfoCode) AS SourceText
,SUBSTRING(LOWER(si.InfoCode),1,1) AS OneChar
FROM @SomeInfo si
UNION ALL
SELECT cte.position +1
,cte.Id
,cte.SourceText
,SUBSTRING(LOWER(cte.SourceText),cte.position+1,1) AS OneChar
FROM cte
WHERE position < DATALENGTH(SourceText)
)
,Cleaned AS
(
SELECT cte.Id
,(
SELECT CASE WHEN ASCII(cte2.OneChar) BETWEEN 65 AND 90 --A-Z
OR ASCII(cte2.OneChar) BETWEEN 97 AND 122--a-z
OR ASCII(cte2.OneChar) BETWEEN 48 AND 57 --0-9
--You can easily add more ranges
THEN cte2.OneChar ELSE '-'
--You can easily nest another CASE to deal with special characters like the single quote in your examples...
END
FROM cte AS cte2
WHERE cte2.Id=cte.Id
ORDER BY cte2.position
FOR XML PATH('')
) AS normalised
FROM cte
GROUP BY cte.Id
)
,NoDoubleHyphens AS
(
SELECT REPLACE(REPLACE(REPLACE(normalised,'-','<>'),'><',''),'<>','-') AS normalised2
FROM Cleaned
)
SELECT CASE WHEN RIGHT(normalised2,1)='-' THEN SUBSTRING(normalised2,1,LEN(normalised2)-1) ELSE normalised2 END AS FinalResult
FROM NoDoubleHyphens;
The first CTE will recursively (well, rather iteratively) traverse the string character by character and return a very slim set with one row per character.
The second CTE will then GROUP the Ids. This allows for a correlated sub-query, where the actual check is performed using ASCII ranges. FOR XML PATH('') is used to re-concatenate the string. With SQL Server 2017+ I'd suggest using STRING_AGG() instead.
The third CTE will use a well-known trick to get rid of multiple occurrences of a character. Take any two characters which will never occur in your string, I use < and >. A string like a--b---c will come back as a<><>b<><><>c. After replacing >< with nothing we get a<>b<>c, and replacing <> with - then gives a-b-c.
The final SELECT will cut away a trailing hyphen. If needed you can add similar logic to get rid of a leading hyphen. With v2017+ there is TRIM('-' FROM ...) to make this easier.
The result
cathe-friedrich-s-low-impact
coffeyfit-cardio-box-burn
jillian-michaels-cardio
sleek-technique
the-dancer-s-workout
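The repeated-hyphen trick and the STRING_AGG() suggestion can each be tried in isolation. The inputs below are invented for illustration, and the second statement assumes SQL Server 2017 or later.

```sql
-- Collapse runs of '-' into one: '-' becomes '<>', adjacent '><' pairs vanish,
-- and each surviving '<>' becomes a single '-'.
-- Assumes '<' and '>' never occur in the input.
SELECT v.s AS original,
       REPLACE(REPLACE(REPLACE(v.s, '-', '<>'), '><', ''), '<>', '-') AS collapsed
FROM (VALUES ('a--b---c'),
             ('the-dancer-s-workout----starter-package'),
             ('no-repeats')) AS v(s);

-- SQL Server 2017+: STRING_AGG re-concatenates one-character rows in a defined order,
-- replacing the FOR XML PATH('') step.
SELECT STRING_AGG(c.ch, '') WITHIN GROUP (ORDER BY c.pos) AS rebuilt
FROM (VALUES (1, 'a'), (2, '-'), (3, 'b')) AS c(pos, ch);
```

The first query turns a--b---c into a-b-c and leaves no-repeats unchanged; the second rebuilds a-b from its three character rows.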
You can create a User-Defined-Function for something like that.
Then use the UDF in the update.
CREATE FUNCTION [dbo].LowerDashString (@str varchar(255))
RETURNS varchar(255)
AS
BEGIN
DECLARE @result varchar(255);
DECLARE @chr varchar(1);
DECLARE @pos int;
SET @result = '';
SET @pos = 1;
-- lowercase the input and remove the single-quotes
SET @str = REPLACE(LOWER(@str),'''','');
-- loop through the characters
-- while replacing anything that's not a letter or digit with a dash
WHILE @pos <= LEN(@str)
BEGIN
SET @chr = SUBSTRING(@str, @pos, 1)
IF @chr LIKE '[a-z0-9]' SET @result += @chr;
ELSE SET @result += '-';
SET @pos += 1;
END;
-- SET @result = TRIM('-' FROM @result); -- SqlServer 2017 and beyond
-- multiple dashes to one dash
WHILE @result LIKE '%--%' SET @result = REPLACE(@result,'--','-');
RETURN @result;
END;
GO
Example snippet using the function:
-- using a table variable for demonstration purposes
declare @SomeInfo table (Id int primary key identity(1,1) not null, InfoCode varchar(100) not null);
-- sample data
insert into @SomeInfo (InfoCode) values
('Cathe Friedrich''s Low Impact'),
('coffeyfit-cardio-box-&-burn'),
('Jillian Michaels: Cardio'),
('Sleek Technique™'),
('The Dancer''s-workout®');
update @SomeInfo
set InfoCode = dbo.LowerDashString(InfoCode)
where (InfoCode LIKE '%[^A-Z-]%' OR InfoCode != LOWER(InfoCode));
select *
from @SomeInfo;
Result:
Id InfoCode
-- -----------------------------
1 cathe-friedrichs-low-impact
2 coffeyfit-cardio-box-burn
3 jillian-michaels-cardio
4 sleek-technique-
5 the-dancers-workout-

How to select specific row in SQL from a bad designed schema?

I have a string in a column of a db schema I did not design, like this:
numbers column
--------------------
First: 1,2,33,34,43,5
Second: 1,2,3,4,5
Although I know this is not best practice, I still want to select the row that contains the value '3' itself, not '33', '34' or '43'.
How could I select only second row?
SELECT *
FROM tblNumbers
WHERE numbers like '%,3,%' OR numbers like '3,%' OR numbers like '%,3'
This query selected both rows. How can I get just the second row?
Thanks.
You should be storing the values in a separate table, with one row per column and per number.
Sometimes, though, we are stuck with other people's bad data structures. If so, you can do what you want in this rather cumbersome way:
where replace(replace(numbers, '{', ','), '}', ',') like '%,3,%'
That is, put the delimiters around all the numbers in numbers.
Let me repeat, though: the proper way to store this data is using a separate table. If you need to store multiple values in a column like this, then do some research on XML and JSON formats (XML has been supported since SQL Server 2005; JSON support was added in SQL Server 2016).
EDIT:
Exactly the same idea applies, the code is just simpler:
where ',' + numbers + ',' like '%,3,%'
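Here is a small self-contained sketch of that delimiter-wrapping idea. The table variable, its label column and the extra rows are made up for the example; only the first two rows mirror the question.

```sql
-- Made-up sample data; table and column names are assumptions.
DECLARE @tblNumbers TABLE (label varchar(20), numbers varchar(100));

INSERT INTO @tblNumbers (label, numbers)
VALUES ('First',  '1,2,33,34,43,5'),
       ('Second', '1,2,3,4,5'),
       ('Third',  '3'),          -- single value, no commas at all
       ('Fourth', '13,30,31');   -- contains the digit 3 but not the value 3

-- Wrapping the list in commas means every value, including the first and last,
-- is surrounded by delimiters, so one LIKE pattern is enough.
SELECT label, numbers
FROM @tblNumbers
WHERE ',' + numbers + ',' LIKE '%,3,%'
ORDER BY label;
```

Only the Second and Third rows match. Values stored with stray spaces (for example '2, 3') would not match, so trimming or REPLACE(numbers, ' ', '') may be needed first. The predicate also cannot use an index, which is one more reason to prefer a separate table.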
Did you try it like this?
SELECT *
FROM tblNumbers
WHERE number = '3' OR ReportedGMY = '3'
if you are storing numbers as integers
SELECT *
FROM tblNumbers
WHERE number = '3'
if you are storing as string
SELECT *
FROM tblNumbers
WHERE number like '3'
It is bad practice to store comma-separated values in a column, and this should be avoided as much as possible. If you really need to do it, it can be done with a user-defined function.
CREATE FUNCTION dbo.HasDigit (@String VARCHAR(MAX), @DigitToCheck INT, @Delimiter VARCHAR(10))
RETURNS BIT
AS
BEGIN
DECLARE @DelimiterPosition INT
DECLARE @Digit INT
DECLARE @ContainsDigit BIT = 0
WHILE CHARINDEX(@Delimiter, @String) > 0
BEGIN
SELECT @DelimiterPosition = CHARINDEX(@Delimiter, @String)
SELECT @Digit = CAST(SUBSTRING(@String, 1, @DelimiterPosition - 1) AS INT)
IF(@Digit = @DigitToCheck)
BEGIN
SET @ContainsDigit = 1
END
SELECT @String = SUBSTRING(@String, @DelimiterPosition + 1, LEN(@String) - @DelimiterPosition)
END
-- the text after the last delimiter is a value too, so check it as well
IF LEN(@String) > 0 AND CAST(@String AS INT) = @DigitToCheck
SET @ContainsDigit = 1
RETURN @ContainsDigit
END;
GO
CREATE TABLE TEST (
Numbers VARCHAR(MAX),
COLUMNNAME VARCHAR(MAX)
)
GO
INSERT INTO TEST VALUES('First:', '1,2,33,34,43,5')
INSERT INTO TEST VALUES('Second:', ' 1,2,3,4,5')
GO
SELECT * FROM TEST WHERE dbo.HasDigit(COLUMNNAME, 3, ',') = 1
Output:
--Numbers COLUMNNAME
--------- ----------------
--Second: 1,2,3,4,5

SQL: SELECT number text base on a number

Background: I have an SQL database that contains a column (foo) of a text type rather than an integer type. In that column I store integers in text form.
Question: Is it possible to SELECT the row that contains (in foo column) number greater/lesser than n?
PS: I have a very good reason to store them as text form. Please refrain from commenting on that.
Update: (Forgot to mention) I am storing it in SQLite3.
SELECT foo
FROM Table
WHERE CAST(foo as int) > @n
select *
from tableName
where cast(textColumn as int) > 5
A simple CAST in the WHERE clause will work as long as you are sure that the data in the foo column is going to properly convert to an integer. If not, your SELECT statement will throw an error. I would suggest you add an extra step here and take out the non-numeric characters before casting the field to an int. Here is a link on how to do something similar:
http://blog.sqlauthority.com/2007/05/13/sql-server-udf-function-to-parse-alphanumeric-characters-from-string/
The only real modification you would need to do on this function would be to change the following lines:
PATINDEX('%[^0-9A-Za-z]%', @string)
to
PATINDEX('%[^0-9]%', @string)
The results from that UDF should then be castable to an int without it throwing an error. It will further slow down your query, but it will be safer. You could even put your CAST inside the UDF and make it one call. The final UDF would look like this:
CREATE FUNCTION dbo.UDF_ParseAlphaChars
(
@string VARCHAR(8000)
)
RETURNS int
AS
BEGIN
DECLARE @IncorrectCharLoc SMALLINT
SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)
-- strip non-digit characters one at a time until none are left
WHILE @IncorrectCharLoc > 0
BEGIN
SET @string = STUFF(@string, @IncorrectCharLoc, 1, '')
SET @IncorrectCharLoc = PATINDEX('%[^0-9]%', @string)
END
RETURN CAST(@string as int)
END
GO
Your final SELECT statement would look something like this:
SELECT *
FROM Table
WHERE dbo.UDF_ParseAlphaChars(Foo) > 5
EDIT
Based upon the new information that the database is SQLite, the above probably won't work directly. I don't believe SQLite has native support for UDFs. You might be able to create a type of UDF using your programming language of choice (like this: http://www.christian-etter.de/?p=439)
The other option I see to safely get all of your data (an IsNumeric would exclude certain rows from your results, which might not be what you want) would probably be to create an extra column that has the int representation of the string. It is a little more dangerous in that you need to keep two fields in sync, but it will allow you to quickly sort and filter the table data.
SELECT *
FROM Table
WHERE CAST(foo as int) > 2000
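Since the question turned out to be about SQLite, here is a small sketch in SQLite's dialect. The table name demo and its rows are invented. In SQLite, a CAST to INTEGER does not raise an error on non-numeric text: it uses the longest numeric prefix and yields 0 when there is none, so such rows are silently treated as 0 rather than failing the query.

```sql
-- SQLite dialect; the table name and rows are illustrative only.
CREATE TABLE demo (foo TEXT);
INSERT INTO demo (foo) VALUES ('5'), ('12'), ('2500'), ('abc'), ('40 units');

-- Numeric comparison: '12', '2500' and '40 units' (cast to 40) qualify; 'abc' casts to 0.
SELECT foo FROM demo WHERE CAST(foo AS INTEGER) > 10;

-- Comparing the text directly is a string comparison, which orders '5' after '2500'.
SELECT foo FROM demo WHERE foo > '2500';
```

The second query is there only to show why the CAST matters: without it the comparison is done character by character.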

String manipulation SQL

I have a row of strings that are in the following format:
'Order was assigned to lastname,firsname'
I need to cut this string down into just the last and first name but it is always a different name for each record.
The 'Order was assigned to' part is always the same.......
Thanks
I am using SQL Server. It is multiple records with different names in each record.
In your specific case you can use something like:
SELECT SUBSTRING(str, 23) FROM table
However, this is not very scalable, should the format of your strings ever change.
If you are using an Oracle database, you would want to use SUBSTR instead.
Edit:
For databases where the third parameter is not optional, you could use SUBSTRING(str, 23, LEN(str))
Somebody would have to test whether this is better or worse than the subtraction in Martin Smith's solution, but it gives you the same result in the end.
In addition to the SUBSTRING methods, you could also use a REPLACE function. I don't know which would have better performance over millions of rows, although I suspect that it would be the SUBSTRING - especially if you were working with CHAR instead of VARCHAR.
SELECT REPLACE(my_column, 'Order was assigned to ', '')
For SQL Server
WITH testData AS
(
SELECT 'Order was assigned to lastname,firsname' as Col1 UNION ALL
SELECT 'Order was assigned to Bloggs, Jo' as Col1
)
SELECT SUBSTRING(Col1,23,LEN(Col1)-22) AS Name
from testData
Returns
Name
---------------------------------------
lastname,firsname
Bloggs, Jo
on MS SQL Server:
declare @str varchar(100) = 'Order was assigned to lastname,firsname'
declare @strLen1 int = DATALENGTH('Order was assigned to ')
declare @strLen2 int = len(@str)
-- start one character past the prefix, otherwise the result keeps the leading space
select @strLen1, @strLen2, substring(@str,@strLen1+1,@strLen2),
RIGHT(@str, @strLen2-@strLen1)
I would require that a colon or some other delimiter be between the message and the name.
Then you could just search for the index of that character and know that anything after it was the data you need...
Example with format changing over time:
CREATE TABLE #Temp (OrderInfo NVARCHAR(MAX))
INSERT INTO #Temp VALUES ('Order was assigned to :Smith,Mary')
INSERT INTO #Temp VALUES ('Order was assigned to :Holmes,Larry')
INSERT INTO #Temp VALUES ('New Format over time :LootAt,Me')
SELECT SUBSTRING(OrderInfo, CHARINDEX(':',OrderInfo)+1, LEN(OrderInfo))
FROM #Temp
DROP TABLE #Temp

How do I make a function in SQL Server that accepts a column of data?

I made the following function in SQL Server 2008 earlier this week that takes two parameters and uses them to select a column of "detail" records and returns them as a single varchar list of comma separated values. Now that I get to thinking about it, I would like to take this table and application-specific function and make it more generic.
I am not well-versed in defining SQL functions, as this is my first. How can I change this function to accept a single "column" worth of data, so that I can use it in a more generic way?
Instead of calling:
SELECT ejc_concatFormDetails(formuid, categoryName)
I would like to make it work like:
SELECT concatColumnValues(SELECT someColumn FROM SomeTable)
Here is my function definition:
FUNCTION [DNet].[ejc_concatFormDetails](@formuid AS int, @category as VARCHAR(75))
RETURNS VARCHAR(1000) AS
BEGIN
DECLARE @returnData VARCHAR(1000)
DECLARE @currentData VARCHAR(75)
DECLARE dataCursor CURSOR FAST_FORWARD FOR
SELECT data FROM DNet.ejc_FormDetails WHERE formuid = @formuid AND category = @category
SET @returnData = ''
OPEN dataCursor
FETCH NEXT FROM dataCursor INTO @currentData
WHILE (@@FETCH_STATUS = 0)
BEGIN
SET @returnData = @returnData + ', ' + @currentData
FETCH NEXT FROM dataCursor INTO @currentData
END
CLOSE dataCursor
DEALLOCATE dataCursor
RETURN SUBSTRING(@returnData,3,1000)
END
As you can see, I am selecting the column data within my function and then looping over the results with a cursor to build my comma separated varchar.
How can I alter this to accept a single parameter that is a result set and then access that result set with a cursor?
Others have answered your main question - but let me point out another problem with your function - the terrible use of a CURSOR!
You can easily rewrite this function to use no cursor, no WHILE loop - nothing like that. It'll be tons faster, and a lot easier, too - much less code:
FUNCTION DNet.ejc_concatFormDetails
(@formuid AS int, @category as VARCHAR(75))
RETURNS VARCHAR(1000)
AS
BEGIN
RETURN
SUBSTRING(
(SELECT ', ' + data
FROM DNet.ejc_FormDetails
WHERE formuid = @formuid AND category = @category
FOR XML PATH('')
), 3, 1000)
END
The trick is to use FOR XML PATH(''): it returns a concatenated list of your data columns and your fixed ', ' delimiters. Add a SUBSTRING() on that and you're done! As easy as that: no dog-slow CURSOR, no messy concatenation and all that gooey code, just one statement and that's all there is.
You can use table-valued parameters:
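-- a table-valued parameter needs a user-defined table type (the name MyDataType is illustrative)
CREATE TYPE dbo.MyDataType AS TABLE (
Column1 int,
Column2 nvarchar(50),
Column3 datetime
)
GO
CREATE FUNCTION dbo.MyFunction(@Data dbo.MyDataType READONLY)
RETURNS NVARCHAR(MAX)
AS BEGIN
/* here you can do what you want */
RETURN NULL
END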
You can use Table Valued Parameters as of SQL Server 2008, which would allow you to pass a TABLE variable in as a parameter. The limitations and examples for this are all in that linked article.
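To make the table-valued parameter idea concrete, here is a rough sketch of the generic concatColumnValues function asked about in the question. The type name dbo.StringList and its single Value column are assumptions made for this example. Table-valued parameters have to be declared against a user-defined table type and marked READONLY, and the caller fills a variable of that type rather than passing a SELECT directly.

```sql
-- Hypothetical table type describing "one column of strings".
CREATE TYPE dbo.StringList AS TABLE (Value varchar(75));
GO
CREATE FUNCTION dbo.concatColumnValues (@Values dbo.StringList READONLY)
RETURNS varchar(1000)
AS
BEGIN
    -- FOR XML PATH('') glues the rows together; STUFF removes the leading ', '.
    -- Row order is not guaranteed, and &, < and > come back XML-escaped.
    RETURN STUFF((SELECT ', ' + Value FROM @Values FOR XML PATH('')), 1, 2, '');
END;
GO
-- Usage: load the column into a variable of the table type, then call the function.
DECLARE @list dbo.StringList;
INSERT INTO @list (Value) VALUES ('alpha'), ('beta'), ('gamma');
SELECT dbo.concatColumnValues(@list) AS csv;
```

In real use the INSERT would typically be INSERT INTO @list (Value) SELECT someColumn FROM SomeTable. So the call is two statements rather than the single nested SELECT the question hoped for.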
However, I'd also point out that using a cursor could well be painful for performance.
You don't need to use a cursor, as you can do it all in 1 SELECT statement:
SELECT @MyCSVString = COALESCE(@MyCSVString + ', ', '') + data
FROM DNet.ejc_FormDetails
WHERE formuid = @formuid AND category = @category
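A self-contained way to try this variable-concatenation pattern is sketched below. The table variable and its rows stand in for DNet.ejc_FormDetails and are invented for the example. As far as I know, Microsoft does not guarantee the behaviour of assigning to a variable across multiple rows like this, particularly when an ORDER BY is added, so treat it as a convenience rather than a contract.

```sql
-- Invented stand-in for DNet.ejc_FormDetails.
DECLARE @FormDetails TABLE (formuid int, category varchar(75), data varchar(75));
INSERT INTO @FormDetails (formuid, category, data)
VALUES (1, 'colour', 'red'),
       (1, 'colour', 'green'),
       (1, 'size',   'large'),
       (2, 'colour', 'blue');

DECLARE @formuid int = 1, @category varchar(75) = 'colour';
-- Starts as NULL, so COALESCE supplies '' for the first row and no leading ', ' appears.
DECLARE @MyCSVString varchar(1000);

SELECT @MyCSVString = COALESCE(@MyCSVString + ', ', '') + data
FROM @FormDetails
WHERE formuid = @formuid AND category = @category;

SELECT @MyCSVString AS csv;
```

With this data the result contains red and green separated by ', ', in whichever order the rows happen to be read.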
No need for a cursor
Your question is a bit unclear. In your first SQL statement it looks like you're trying to pass columns to the function, but there is no WHERE clause. In the second SQL statement you're passing a collection of rows (results from a SELECT). Can you supply some sample data and expected outcome?
Without fully understanding your goal, you could look into changing the parameter to be a table variable. Fill a table variable local to the calling code and pass that into the function. You could do that as a stored procedure though and wouldn't need a function.