Need Help Using CHARINDEX to Create a New Variable - sql

I'm trying to take the second number in the variable Description and make a new variable from it. However, I'm not too familiar with string manipulation. Below is a small example of the variable Description. I'm having a hard time with it because the first number changes in size.
Description
4 Matching notifications 11 Updates
32 Matching notifications 12 Updates
1211 Matching notifications 1 Updates
Below is a rough idea of the code I originally thought would work.
SELECT
LEFT(Description, CHARINDEX('Updates', Description)-1) AS Second_variable
FROM X

This is a bit tricky in SQL Server. One method is:
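select s.value as second_variable
from t cross apply
     (select top (1) s.value
      from string_split(description, ' ') s                      -- one row per space-separated word (SQL Server 2016+)
      where description like concat('% ', s.value, ' Updates')   -- keep the word directly before the final ' Updates'
     ) s;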
This extracts the individual "words" from the string and then chooses the one that matches the value before the last ' Updates'.
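Since the question asked about CHARINDEX, a variant built on CHARINDEX and SUBSTRING may also be worth a look. This is only a sketch, and it assumes every Description has exactly the form '<n> Matching notifications <m> Updates'. The table variable @t and the alias start_pos are made up for the example.

```sql
-- Hypothetical sample data copied from the question.
DECLARE @t TABLE (Description varchar(100));
INSERT INTO @t (Description) VALUES
    ('4 Matching notifications 11 Updates'),
    ('32 Matching notifications 12 Updates'),
    ('1211 Matching notifications 1 Updates');

SELECT t.Description,
       -- take everything from start_pos up to the space before 'Updates'
       SUBSTRING(t.Description,
                 p.start_pos,
                 CHARINDEX(' Updates', t.Description) - p.start_pos) AS Second_variable
FROM @t AS t
CROSS APPLY (
    -- position of the first character after 'notifications '
    -- (LEN ignores trailing spaces, hence the explicit + 1 for the space)
    VALUES (CHARINDEX('notifications', t.Description) + LEN('notifications') + 1)
) AS p(start_pos);
-- Second_variable for the three sample rows: 11, 12 and 1
```

A row that does not follow the assumed pattern can produce a negative length and make SUBSTRING fail, so real data would probably need a WHERE filter such as Description LIKE '% Matching notifications % Updates' first.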

Related

SQL code to remove extra spaces and line breaks in free text?

I'm currently working with a table that deals with patients who have visited a clinic. One of the fields in this table shows the reason for the visit, and it's free text so whoever's booking the appointment can leave a custom note for the doctor depending on what the issue is. Yes, I'm well aware free text is the actual worst, but I did not design this database or the front-end medical record system (which is also the worst) and I'm simply stuck dealing with it. Bear with me.
Because of the special characters, extra spaces, and carriage returns that often find their way into that free text field on the front end, all its contents would show up on a single line in SSMS but would cause all sorts of formatting issues with extra line breaks when the SQL results were pasted into Excel. I did a little research and found a snippet of code that would replace carriage returns, etc. in a given field, thus forcing all the contents of that field to remain in a single cell:
REPLACE(REPLACE(FieldName,char(10),''),char(13),'') as FieldName
This has worked splendidly for this VisitReason field and any other free text fields I've been forced to work with. However, does it account for every possible issue one might find in free text? Yesterday I was working with this table and pasted the results from SSMS into Excel, and there were two people whose VisitReason fields were cut off prematurely and then had all the results (as in multiple fields) from a bunch of other people's visits crammed into that same field (thus making for one really long cell in Excel).
For example, the VisitReason for one of these people showed up in SSMS as "complaining of rash, see note". But then when it was pasted into Excel, the results looked like...
PatientID PatientName VisitDate ... VisitReason
----------------------------------------------------------------------------------------------
1001 Smith, John 01/08/2023 ... complaining of rash, see
PatientID1002PatientNameJaneDoeVisitDate01/08/2023VisitRe
asondiabetesfollowupPatientID1003PatientNameBobBrownVisitDa
(and so on)
I can't tell if this has something to do with the free text field, and there's some hidden character in there that's causing the weird line breaks and field merging that my REPLACE function isn't catching, or whether it's an error with Excel (in which case this obviously isn't the right place to be asking). But I wanted to check and see if there was anything that potentially needed to be added to the REPLACE line that would fix the problem.
My full query is really simple:
SELECT
d.PatientID,
d.PatientName,
v.VisitDate,
[some other visit-related fields, none of which are free text],
REPLACE(REPLACE(v.VisitReason,char(10),''),char(13),'') as VisitReason,
[some other demographic fields, none of which are free text]
FROM Demographics d
JOIN Visit v ON d.PatientID = v.PatientID
The REPLACE function works perfectly fine for literally every other patient in the list except for the two with results like what's shown above, which then go on to affect a number of other rows following them. Anyone have any thoughts?
Please try the following solution.
The xs:token data type is stripping out the white space characters.
USE tempdb;
GO
DROP FUNCTION IF EXISTS dbo.udf_tokenize;
GO
/*
1. All invisible TAB, Carriage Return, and Line Feed characters will be replaced with spaces.
2. Then leading and trailing spaces are removed from the value.
3. Further, contiguous occurrences of more than one space will be replaced with a single space.
*/
CREATE FUNCTION dbo.udf_tokenize(@input VARCHAR(MAX))
RETURNS VARCHAR(MAX)
AS
BEGIN
RETURN (SELECT CAST('<r><![CDATA[' + @input + ' ' + ']]></r>' AS XML).value('(/r/text())[1] cast as xs:token?','VARCHAR(MAX)'));
END
GO
-- DDL and sample data population, start
DECLARE @mockTbl TABLE (ID INT IDENTITY(1,1), col_1 VARCHAR(100), col_2 VARCHAR(100));
INSERT INTO @mockTbl (col_1, col_2) VALUES
(CHAR(13) + ' FL ' + CHAR(9), CHAR(10) + ' Miami'),
(' FL ', ' Fort Lauderdale '),
(' NY ', ' New York '),
(' NY ', ''),
(' NY ', NULL);
-- DDL and sample data population, end
-- before
SELECT *, LEN(col_2) AS [col_2_len]
FROM @mockTbl;
-- remove invisible white space chars
UPDATE @mockTbl
SET col_1 = dbo.udf_tokenize(col_1)
, col_2 = dbo.udf_tokenize(col_2);
-- after
SELECT *, LEN(col_2) AS [col_2_len]
FROM @mockTbl;
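To find out which hidden characters a REPLACE chain is missing, it can help to list the character codes that are actually present in a value. The sketch below does that for a made-up string. The variable @s and the use of sys.all_objects as a row source are assumptions for the example, not part of the answer above.

```sql
-- Made-up value containing a tab, a carriage return and a line feed.
DECLARE @s varchar(100) = 'rash,' + CHAR(9) + 'see' + CHAR(13) + CHAR(10) + 'note';

WITH n AS (
    -- one row per character position in @s
    SELECT TOP (LEN(@s)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS pos
    FROM sys.all_objects
)
SELECT pos,
       ASCII(SUBSTRING(@s, pos, 1)) AS char_code
FROM n
WHERE ASCII(SUBSTRING(@s, pos, 1)) < 32   -- control characters only
ORDER BY pos;
-- pos 6 -> 9 (tab), pos 10 -> 13 (CR), pos 11 -> 10 (LF)
```

A tab (CHAR(9)) is not covered by the two-level REPLACE in the question. Whether a tab is really what breaks the Excel paste is something only the output for the two affected rows could confirm.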

SQL Server search using like while ignoring blank spaces

I have a phone column in the database, and the records contain unwanted spaces on the right. I tried to use trim and replace, but it didn't return the correct results.
If I use
phone like '%2581254%'
it returns
customerid
-----------
33470
33472
33473
33474
But I need to use the percent sign (wildcard) at the beginning only, so that the wildcard applies to the left side only.
So if I use it like this
phone like '%2581254'
I get nothing, because of the spaces on the right!
So I tried to use trim and replace, and I get one result only
LTRIM(RTRIM(phone)) LIKE '%2581254'
returns
customerid
-----------
33474
Note that these four ids have same phone number!
Table data
customerid phone
-------------------------------------
33470 96506217601532388254
33472 96506217601532388254
33473 96506217601532388254
33474 96506217601532388254
33475 966508307940
I added many numbers for testing purposes.
The PHP function takes the last 7 digits and compares them.
For example
01532388254 will be 2581254
and I want to search for all users that have these 7 digits in their phone number
2581254
I can't figure out where the problem is!
It should return 4 ids instead of 1 id.
Given the sample data, I suspect you have control characters in your data, for example char(13) or char(10).
To confirm this, just run the following
Select customerid,phone
From YourTable
Where CharIndex(CHAR(0),[phone])+CharIndex(CHAR(1),[phone])+CharIndex(CHAR(2),[phone])+CharIndex(CHAR(3),[phone])
+CharIndex(CHAR(4),[phone])+CharIndex(CHAR(5),[phone])+CharIndex(CHAR(6),[phone])+CharIndex(CHAR(7),[phone])
+CharIndex(CHAR(8),[phone])+CharIndex(CHAR(9),[phone])+CharIndex(CHAR(10),[phone])+CharIndex(CHAR(11),[phone])
+CharIndex(CHAR(12),[phone])+CharIndex(CHAR(13),[phone])+CharIndex(CHAR(14),[phone])+CharIndex(CHAR(15),[phone])
+CharIndex(CHAR(16),[phone])+CharIndex(CHAR(17),[phone])+CharIndex(CHAR(18),[phone])+CharIndex(CHAR(19),[phone])
+CharIndex(CHAR(20),[phone])+CharIndex(CHAR(21),[phone])+CharIndex(CHAR(22),[phone])+CharIndex(CHAR(23),[phone])
+CharIndex(CHAR(24),[phone])+CharIndex(CHAR(25),[phone])+CharIndex(CHAR(26),[phone])+CharIndex(CHAR(27),[phone])
+CharIndex(CHAR(28),[phone])+CharIndex(CHAR(29),[phone])+CharIndex(CHAR(30),[phone])+CharIndex(CHAR(31),[phone])
+CharIndex(CHAR(127),[phone]) >0
If the Test Results are Positive
The following UDF can be used to strip the control characters from your data via an update
Update YourTable Set Phone=[dbo].[udf-Str-Strip-Control](Phone)
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Strip-Control](@S varchar(max))
Returns varchar(max)
Begin
    -- cte1: 10 rows; cte2: the 32 control characters Char(0)..Char(31)
    ;with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
    cte2(C) As (Select Top (32) Char(Row_Number() over (Order By (Select NULL))-1) From cte1 a,cte1 b)
    -- replace every control character with a space
    Select @S = Replace(@S,C,' ')
    From cte2
    -- collapse runs of spaces into a single space, then trim both ends
    Return LTrim(RTrim(Replace(Replace(Replace(@S,' ','><'),'<>',''),'><',' ')))
End
--Select [dbo].[udf-Str-Strip-Control]('Michael '+char(13)+char(10)+'LastName') --Returns: Michael LastName
As promised (and nudged by Bill), the following is a little commentary on the UDF.
We pass a string that we want stripped of control characters.
We create an ad-hoc tally table of ASCII characters 0 - 31.
We then run a global search-and-replace for each character in the tally table. Each character found will be replaced with a space.
The final string is stripped of repeating spaces (a little trick Gordon demonstrated several weeks ago; I don't have the original link).
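The repeating-space trick in the UDF's Return line is easier to follow when each nested REPLACE is shown on its own. A minimal sketch with a made-up input value:

```sql
DECLARE @S varchar(100) = 'a   b  c';   -- three spaces, then two spaces

SELECT REPLACE(@S, ' ', '><')                                      AS step1,  -- 'a><><><b><><c'
       REPLACE(REPLACE(@S, ' ', '><'), '<>', '')                   AS step2,  -- 'a><b><c'
       REPLACE(REPLACE(REPLACE(@S, ' ', '><'), '<>', ''), '><', ' ') AS step3; -- 'a b c'
```

Every space first becomes '><', so a run of spaces produces '<>' pairs wherever two spaces met. Removing those pairs leaves exactly one '><' per run, which is then turned back into a single space. The trick assumes the data itself contains no '<>' or '><' sequences, since those would be removed or turned into spaces as well.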

Pass multiple values in Sybase

I am back again with a small problem. I hope I can get some help here.
I am working on a report in SQL Server Reporting Services 2012, and the database is Sybase ASE.
One of my report parameters can have multiple values. Let's name the parameter @Fruit. If the user selects Apple and Mango from the list, both values should be passed to the query at the backend.
The parameter gives the values as: Apple,Mango
Now I need to pass it to the query in the following way.
SELECT
COLUMN1,
COLUMN2,
COLUMN3
FROM DBO.TABLE_NAME
WHERE COLUMN2 IN ('Apple','Mango')
Problem: I am able to pass a single fruit name at a time, but not more than one value. I did a bit of research and found that it's a problem with Sybase: it cannot take multiple values.
I believe someone might have found a workaround. I just need to get it working.
Thanks in advance!
You can create a comma-separated string from the list using Join in an expression:
strFruits= Join(Parameters!fruits.Value,",")
Then your WHERE clause would look like:
WHERE CHARINDEX(',' + COLUMN2 + ',' , ',' + @strFruits + ',')>0
The ',' added at the beginning and end of both strings makes sure the search string is found even if it is located at the beginning or end of the comma-separated list, and that only whole values match.
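The effect of wrapping both sides in commas can be seen by comparing it with a plain CHARINDEX. The sketch below uses SQL Server syntax purely for illustration (the VALUES row constructor may not be available in Sybase ASE), and the fruit names other than Apple and Mango are made up.

```sql
DECLARE @strFruits varchar(200) = 'Apple,Mango';

SELECT f.name,
       CASE WHEN CHARINDEX(',' + f.name + ',', ',' + @strFruits + ',') > 0
            THEN 'match' ELSE 'no match' END AS wrapped,
       CASE WHEN CHARINDEX(f.name, @strFruits) > 0
            THEN 'match' ELSE 'no match' END AS unwrapped
FROM (VALUES ('Apple'), ('Mango'), ('Man'), ('Pear')) AS f(name);
-- Apple and Mango match in both columns, Pear in neither.
-- 'Man' is 'no match' when wrapped but 'match' when unwrapped,
-- because 'Man' is a substring of 'Mango'.
```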

SQL Server Substring buffer padding

This is probably a simple question but I'm trying to create a new column in SQL server based off of 4 others. The idea is to create a customer ID based off the first 5 characters of Zip, Last name, first name, and address.
My question is: how do I ensure that I get a spacing buffer if the name is too short? For example, I have a guy with the first name of Tom. How do I get it to return 'TOM  ' with the two spaces at the end?
NOTE: I'm fully aware that creating a customer key based off something that can change like an address can cause problems. I've discussed that with the client and they said to do it anyway.
A couple more ways:
select STUFF('     ', 1, LEN(name), name)
select name + REPLICATE(' ', 5 - len(name))
left(rtrim(@str) + '     ', 5)
That's five spaces ;-)
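To check the padding, it helps to wrap the result in brackets so the trailing spaces become visible. The following is a small sketch with made-up names. It also shows CAST to char(5) as a further option, which pads short values and silently truncates long ones.

```sql
SELECT n.name,
       '[' + LEFT(RTRIM(n.name) + REPLICATE(' ', 5), 5) + ']' AS padded_left,
       '[' + CAST(n.name AS char(5)) + ']'                    AS padded_cast
FROM (VALUES ('Tom'), ('Alice'), ('Jonathan')) AS n(name);
-- Tom      -> [Tom  ]  [Tom  ]
-- Alice    -> [Alice]  [Alice]
-- Jonathan -> [Jonat]  [Jonat]
```

Both variants also cope with names longer than five characters, which matters when the padded pieces are concatenated into a fixed-width key.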

How to do string manipulation in SQL query

I know I'm close to figuring this out but need a little help. What I'm trying to do is grab a column from a particular table, but chop off the first 4 characters. For example, if the value in a column is "KPIT08L", the result I want is 08L. Here is what I have so far, but I'm not getting the desired results.
SELECT LEFT(FIELD_NAME, 4)
FROM TABLE_NAME
First up, left will give you the leftmost characters. If you want the characters starting at a specific location, you need to look into mid:
select mid (field_name,5) ...
Secondly, if you value performance, portability and scalability at all, this sort of "sub-column" manipulation should generally be avoided. It's usually far easier (and faster) to patch columns together than to split them apart.
In other words, keep the first four characters in their own column and the rest in a separate column, and do your selects on the relevant one. If you're using anything less than a full column, then it's technically not one attribute of the row.
Try with
SELECT MID(FIELD_NAME, 5) FROM TABLE_NAME
Mid is very powerful: it lets you select the starting point and all the remainder or,
if specified, the desired length, as in
SELECT MID(FIELD_NAME, 5, 2) FROM TABLE_NAME ' gives 08 in your example text
SELECT RIGHT(FIELD_NAME,LEN(FIELD_NAME)-4)
FROM TABLE_NAME;
If it is for a generic string then the above one will work...
Don't have Access at my current location, but please try this.
SELECT RIGHT(FIELD_NAME, LEN(FIELD_NAME)-4)
FROM TABLE_NAME
LEFT(FIELD_NAME, 4) will return the first 4 characters of FIELD_NAME.
What you need to do is:
SELECT MID(FIELD_NAME, 5)
FROM TABLE_NAME
If you have a FIELD_NAME of 10 characters, the function will return the last 6 characters (chopping off the first 4)!
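MID is available in Access (and MySQL) but not in SQL Server's T-SQL. If the same query ever has to run on SQL Server, a rough equivalent could look like the sketch below. The inline sample values are made up.

```sql
SELECT v.FIELD_NAME,
       SUBSTRING(v.FIELD_NAME, 5, LEN(v.FIELD_NAME)) AS remainder,  -- from character 5 to the end
       SUBSTRING(v.FIELD_NAME, 5, 2)                 AS two_chars   -- two characters starting at position 5
FROM (VALUES ('KPIT08L'), ('KPIT1234XYZ')) AS v(FIELD_NAME);
-- KPIT08L     -> remainder '08L',     two_chars '08'
-- KPIT1234XYZ -> remainder '1234XYZ', two_chars '12'
```

SUBSTRING always needs a length argument in T-SQL, so passing LEN(FIELD_NAME) is a simple way to say "everything that is left".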