Query for blank white space before AND after a number string - sql

How would i go about constructing a query, that would return all material numbers that have a "blank white space" either BEFORE or AFTER the number string? We are exporting straight from SSMS to excel and we see the problem in the spreadsheet. If i could return all of the material numbers with spaces.. i could go in and edit them or do a replace to fix this issue prior to exporting! (the mtrl numbers are imported in via a windows application that users upload an excel template to. This template has all of this data and sometimes they place in spaces in or after the material number). The query we have used to work but now it does not return anything, but upon export we identify these problems you see highlighted in the screenshot (left screenshot) and then query to find that mtrl # in the table (right screenshot). And indeed, it has a space before the 1.
Currently the query we use looks like:
SELECT Mtrl
FROM dbo.Source
WHERE Mtrl LIKE '% %'

Since you are getting the data from a query, you should just have that query remove any potential spaces using LTRIM and RTRIM:
LTRIM(RTRIM([MTRL]))
Keep in mind that these two commands remove only spaces, not tabs or returns or other white-space characters.
Doing the above will make sure that the data for the entire set of data is fine, whether or not you find it and/or fix it.
Or, since you are copying-and-pasting from the Results Grid into Excel, you can just CONVERT the value to a number which will naturally remove any spaces:
SELECT CONVERT(INT, ' 12 ');
Returns:
12
So you would just use:
CONVERT(INT, [MRTL])
Now, if you want to find the data that has anything that is not a digit in it, you would use this:
SELECT Mtrl
FROM dbo.Source
WHERE [Mtrl] LIKE '%[^0-9]%'; -- any single non-digit character
If the issue is with non-space white-space characters, you can find out which ones they are via the following (to find them at the beginning instead of at the end, change the RIGHT to be LEFT):
;WITH cte AS
(
SELECT UNICODE(RIGHT([MTRL], 1)) AS [CharVal]
FROM dbo.Source
)
SELECT *
FROM cte
WHERE cte.[CharVal] NOT BETWEEN 48 AND 57 -- digits 0 - 9
AND cte.[CharVal] <> 32; -- space
And you can fix in one shot using the following, which removes regular spaces (char 32 via LTRIM/RTRIM), tabs (char 9), and non-breaking spaces (char 160):
UPDATE src
SET src.[Mtrl] = REPLACE(
REPLACE(
LTRIM(RTRIM(src.[Mtrl])),
CHAR(160),
''),
CHAR(9),
'')
FROM dbo.Source src
WHERE src.[Mtrl] LIKE '%[' -- find rows with any of the following characters
+ CHAR(9) -- tab
+ CHAR(32) -- space
+ CHAR(160) -- non-breaking space
+ ']%';
Here I used the same WHERE condition that you have since if there can't be any spaces then it doesn't matter if you check both ends or for any at all (and maybe it is faster to have a single LIKE instead of two).

Related

Removing white spaces and special characters from SQL

I have a table where I have a ColumnA which has data with white spaces and special characters. I want to generate ColumnB with the data from ColumnA with the removal of white spaces and special characters.
For example, ColumnA has values like:
N/A
#email
Hot-topic
#sql#%
White paper.
I want a new column with values:
NA
email
HotTopic
sql
Whitepaper
I tried below SQL in SSMS, but it is not working completely. Could someone help me out?
SELECT code,
REPLACE(REPLACE(code, TRIM(TRANSLATE(code,'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',' '))
,'') ,' ','')
FROM SAMP
It is not working for the record with value: #sql#%
Added as a wiki answer in order to retain the comment made by #lptr. Query by #lptr explanation mine (#DaleK).
Your attempt was close, but only worked for single characters... the one that failed was because you had multiple characters that needed replacing and once you remove the white space they are all next to each other and don't match the original string anymore.
This answer cleverly replaces all the letter characters with a "*" using translate as step 1, then using translate again on the original column value, replaces all the non-letter characters with a "*" as step 2, then finally replaces all "*" characters with an empty string.
Note also the use of replication to avoid typing the same character in multiple times.
create table samp(code varchar(50));
insert into samp(code)
values
('N/A'),
('#email'),
('Hot-topic'),
('#sql#%'),
('White paper. ');
select s.code, n.nonletters, l.letters
from samp as s
cross apply (values(translate(s.code, 'abcdefghijklmnopqrstuvwxyz', replicate('*', 26)))) as n (nonletters)
cross apply (values(replace(translate(s.code, n.nonletters, replicate('*', len(n.nonletters+'.')-1)), '*', ''))) as l (letters);

Passing Same In Function but output are different

Facing issue with #Detail when passing value in function through string 'ABC' the getting correct output but when passing via variable then getting incorrect output. What i am doing wrong.
DECLARE #Detail varchar(4000)
SELECT #Detail = detail FROM table1
select #Detail
select * from dbo.SDF_SplitString(#Detail,' ',88)
INCORRECT OUTPUT
select * from dbo.SDF_SplitString(' 11/13/2019 12:15:0 11/13/2019 12:15:0 11/13/2019 14:30:0 BOM GOP SG 438 24A MR Kamleshwar Prasad 4',' ',88)
CORRECT OUTPUT
OK, so that query I had you run is showing that the character between 12:15:0 11/13/2019 is a TAB (ascii character 9), not a space (ascii character 32)
This is why you get the output you do; you should perhaps consider to replace tabs with spaces before you split, or upgrade your split function so you can pass multiple chars to it for splitting (I assume that's what the ' ' parameter is for)
When you copied your value out of SSMS results grid it swapped the tabs for spaces already (SSMS converted the tabs to spaces), so when you ran it, it "looked the right output" when using the hardcoded value that contained only spaces..
..but the data in the DB table definitely has tabs! Always be careful copying stuff out of SSMS results grid; it drops characters, changes characters and cuts longer data off at the end, so it cannot be guaranteed to be an accurate reflection of what is in the table.
select * from dbo.SDF_SplitString(REPLACE(#Detail, CHAR(9), ' '),' ',88)
should give the output you expect..
If you have the option of changing this field so it doesn't have delimited data in it would be safer; splitting can sometimes give unusable results if the positions change

SQL finding rows that only contain chars from a certain Unicode range

I recently asked a question to obtain rows that contain characters in a certain Unicode range.
SELECT *
FROM #kanjinames
WHERE UNICODE(LEFT(ForeNames, 1)) BETWEEN 0x4e00 AND 0x9fff
A very helpful user shared the above with me. To my understanding it checks the first character on the left and if it is within the Unicode range it returns an a the row. Through testing I believe this works.
My current problem is how do I go about checking the entire column is within the range? For example:
石山コンタクトレンズ
The above contains characters outside of the range (the first two characters are within range) in the query above but I am not sure about how I go about checking the entire field. I am away of using stuff like
is not like N'%^a-z%'
for the English alphabet. Just not sure how to apply it for this situation.
Any help would be great on this.
I think this will work:
SELECT *
FROM #kanjinames
WHERE ForeNames NOT LIKE '%[^' + NCHAR(0x4e00) + '-' NCHAR(0x9fff) + ']%';
That is, the string contains no characters outside that sequence.
Edit: I had to alter this slightly to get it to work. I had to use the decimal values instead of the hex.
SELECT *
FROM #kanjinames
WHERE ForeNames NOT LIKE '%[^' + NCHAR(19968) + '-' + NCHAR(40802) + ']%';
This still returns blank values but I removed those separately.

SQL Server search using like while ignoring blank spaces

I have a phone column in the database, and the records contain unwanted spaces on the right. I tried to use trim and replace, but it didn't return the correct results.
If I use
phone like '%2581254%'
it returns
customerid
-----------
33470
33472
33473
33474
but I need use percent sign or wild card in the beginning only, I want to match the left side only.
So if I use it like this
phone like '%2581254'
I get nothing, because of the spaces on the right!
So I tried to use trim and replace, and I get one result only
LTRIM(RTRIM(phone)) LIKE '%2581254'
returns
customerid
-----------
33474
Note that these four ids have same phone number!
Table data
customerid phone
-------------------------------------
33470 96506217601532388254
33472 96506217601532388254
33473 96506217601532388254
33474 96506217601532388254
33475 966508307940
I added many number for test propose
The php function takes last 7 digits and compare them.
For example
01532388254 will be 2581254
and I want to search for all users that has this 7 digits in their phone number
2581254
I can't figure out where's the problem!
It should return 4 ids instead of 1 id
Given the sample data, I suspect you have control characters in your data. For example char(13), char(10)
To confirm this, just run the following
Select customerid,phone
From YourTable
Where CharIndex(CHAR(0),[phone])+CharIndex(CHAR(1),[phone])+CharIndex(CHAR(2),[phone])+CharIndex(CHAR(3),[phone])
+CharIndex(CHAR(4),[phone])+CharIndex(CHAR(5),[phone])+CharIndex(CHAR(6),[phone])+CharIndex(CHAR(7),[phone])
+CharIndex(CHAR(8),[phone])+CharIndex(CHAR(9),[phone])+CharIndex(CHAR(10),[phone])+CharIndex(CHAR(11),[phone])
+CharIndex(CHAR(12),[phone])+CharIndex(CHAR(13),[phone])+CharIndex(CHAR(14),[phone])+CharIndex(CHAR(15),[phone])
+CharIndex(CHAR(16),[phone])+CharIndex(CHAR(17),[phone])+CharIndex(CHAR(18),[phone])+CharIndex(CHAR(19),[phone])
+CharIndex(CHAR(20),[phone])+CharIndex(CHAR(21),[phone])+CharIndex(CHAR(22),[phone])+CharIndex(CHAR(23),[phone])
+CharIndex(CHAR(24),[phone])+CharIndex(CHAR(25),[phone])+CharIndex(CHAR(26),[phone])+CharIndex(CHAR(27),[phone])
+CharIndex(CHAR(28),[phone])+CharIndex(CHAR(29),[phone])+CharIndex(CHAR(30),[phone])+CharIndex(CHAR(31),[phone])
+CharIndex(CHAR(127),[phone]) >0
If the Test Results are Positive
The following UDF can be used to strip the control characters from your data via an update
Update YourTable Set Phone=[dbo].[udf-Str-Strip-Control](Phone)
The UDF if Interested
CREATE FUNCTION [dbo].[udf-Str-Strip-Control](#S varchar(max))
Returns varchar(max)
Begin
;with cte1(N) As (Select 1 From (Values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) N(N)),
cte2(C) As (Select Top (32) Char(Row_Number() over (Order By (Select NULL))-1) From cte1 a,cte1 b)
Select #S = Replace(#S,C,' ')
From cte2
Return LTrim(RTrim(Replace(Replace(Replace(#S,' ','><'),'<>',''),'><',' ')))
End
--Select [dbo].[udf-Str-Strip-Control]('Michael '+char(13)+char(10)+'LastName') --Returns: Michael LastName
As promised (and nudged by Bill), the following is a little commentary on the UDF.
We pass a string that we want stripped of Control Characters
We create an ad-hoc tally table of ascii characters 0 - 31
We then run a global search-and-replace for each character in the
tally-table. Each character found will be replaced with a space
The final string is stripped of repeating spaces (a little trick
Gordon demonstrated several weeks ago - don't have the original
link)

SQL Server 2000 - How to remove the hidden characters in the column?

I'm trying to remove a hidden characters from a varchar column, these hidden characters (i.e. period, space) was taken from a scanned bar code and it is not visible in the result set once query was executed. I have tried to use below script but it failed to remove the hidden characters(see attached screenshot for reference.)
Any help is highly appreciated.
SELECT Replace(Replace(LTrim(RTrim(mycolumn)), '.', ''), ' ', '')
FROM MyTable
WHERE serialno = '123456789'
One thing that has worked for me is to select the column with the special characters, then paste the data into notepad++ then turn on View>Show Symbol>Show All Characters. Then I could copy the special characters from Notepad++ into the second argument of the REPLACE() function in SQL.