LEN(firstName) + LEN(lastName) does not equal LEN(firstName + lastName) - sql

I'm trying to search a table based on the concatenation of the firstName and lastName columns. Both of these are defined as NVARCHAR(50) NOT NULL.
The query sometimes fails to find a match because the concatenated column is padded with extra spaces. Here's the query:
SELECT firstName + lastName AS fullName, LEN(firstName) + LEN(lastName) AS realLength, LEN(firstName + lastName) AS concatLength FROM UsersTable
And here's an image with the results:
What is the deal with this? How can I avoid the extra spaces? If I do SELECT RTRIM(firstName) + RTRIM(lastName) ... I get the correct full name with no extra spaces, but using RTRIM is too expensive because my data set is very big. This would lead me to think that the issue is the data itself, except that LEN(firstName) is the same as LEN(RTRIM(firstName))

You have spaces at the end of your FirstName. It is easy enough to check that the following returns 4:
select len(N'abcd ')
This is a property of the varchar() data types and len(): len() ignores trailing spaces. When you concatenate the columns, the spaces at the end of firstName end up in the middle of the result, so they are no longer trailing and SQL Server counts them.
This behavior is documented in the "Remarks" section of the documentation:
Remarks
LEN excludes trailing blanks. If that is a problem, consider using the
DATALENGTH (Transact-SQL) function which does not trim the string. If
processing a unicode string, DATALENGTH will return twice the number
of characters.
As the comments suggest, you can ltrim()/rtrim() the values before concatenating them. Or, use like.
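The behavior can be reproduced without a table. A minimal sketch in T-SQL, assuming two hypothetical variables (the names and values are made up for illustration) where the first name carries one trailing space:

```sql
-- Hypothetical values: @firstName carries one trailing space.
DECLARE @firstName NVARCHAR(50) = N'John ';
DECLARE @lastName  NVARCHAR(50) = N'Smith';

SELECT
    LEN(@firstName)                  AS lenFirst,     -- 4: LEN ignores the trailing space
    DATALENGTH(@firstName) / 2       AS charsFirst,   -- 5: 10 bytes, 2 bytes per character here
    LEN(@firstName) + LEN(@lastName) AS realLength,   -- 9
    LEN(@firstName + @lastName)      AS concatLength; -- 10: the space is now in the middle, so it counts
```

This also shows why LEN(firstName) equals LEN(RTRIM(firstName)) even when the data has trailing spaces: LEN discards them in both cases.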

Related

Removing white spaces and special characters from SQL

I have a table where I have a ColumnA which has data with white spaces and special characters. I want to generate ColumnB with the data from ColumnA with the removal of white spaces and special characters.
For example, ColumnA has values like:
N/A
#email
Hot-topic
#sql#%
White paper.
I want a new column with values:
NA
email
HotTopic
sql
Whitepaper
I tried below SQL in SSMS, but it is not working completely. Could someone help me out?
SELECT code,
REPLACE(REPLACE(code, TRIM(TRANSLATE(code,'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz',' '))
,'') ,' ','')
FROM SAMP
It is not working for the record with value: #sql#%
Added as a wiki answer in order to retain the comment made by @lptr. Query by @lptr, explanation mine (@DaleK).
Your attempt was close, but only worked for single characters... the one that failed was because you had multiple characters that needed replacing and once you remove the white space they are all next to each other and don't match the original string anymore.
This answer cleverly replaces all the letter characters with a "*" using translate as step 1, then using translate again on the original column value, replaces all the non-letter characters with a "*" as step 2, then finally replaces all "*" characters with an empty string.
Note also the use of replicate to avoid typing the same character multiple times.
create table samp(code varchar(50));
insert into samp(code)
values
('N/A'),
('#email'),
('Hot-topic'),
('#sql#%'),
('White paper. ');
select s.code, n.nonletters, l.letters
from samp as s
cross apply (values(translate(s.code, 'abcdefghijklmnopqrstuvwxyz', replicate('*', 26)))) as n (nonletters)
cross apply (values(replace(translate(s.code, n.nonletters, replicate('*', len(n.nonletters+'.')-1)), '*', ''))) as l (letters);

Getting unwanted data in select statement of NChar column

On running the below query:
SELECT DISTINCT [RULE], LEN([RULE]) FROM MYTBL WHERE [RULE]
LIKE 'Trademarks, logos, slogans, company names, copyrighted material or brands of any third party%'
I am getting the output as:
The column datatype is NCHAR(120) and the collation used is SQL_Latin1_General_CP1_CI_AS
The data was inserted with an extra trailing space at the end, but even with the RTRIM function I am not able to trim it. I am not sure which kind of (encoded) space character was inserted here.
Can you please suggest an alternative to RTRIM to get rid of the extra white space at the end, given that the column is NCHAR?
Below are the things which I have already tried:
RTRIM(LTRIM(CAST([RULE] as VARCHAR(200))))
RTRIM([RULE])
Update to Question
Please download the Database from here TestDB
Please use below query for your reference:
SELECT DISTINCT [RULE], LEN([RULE]) FROM [TestDB].[BI].[DimRejectionReasonData]
WHERE [RULE]
LIKE 'Trademarks, logos, slogans, company names, copyrighted material or brands of any third party%'
You may have a non-breaking space nchar(160) inside the string.
You can convert it to a simple space and then use the usual trim function
LTRIM(RTRIM(REPLACE([RULE], NCHAR(160), ' ')))
The same replacement, writing the non-breaking space as its hexadecimal Unicode code point:
LTRIM(RTRIM(REPLACE([RULE], NCHAR(0x00A0), ' ')))
This may be what you are looking for (not sure): the trailing characters could be a carriage return or a line feed. Try this approach:
SELECT REPLACE(REPLACE([RULE], CHAR(13), ''), CHAR(10), '')
Change the column type from nchar to varchar; it will then return the result without the extra padding spaces.
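Before choosing a fix, it can help to find out which character is actually sitting at the end of the value. A minimal T-SQL sketch, assuming a hypothetical variable @rule that stands in for the [RULE] column and ends with a non-breaking space:

```sql
-- Hypothetical value: NCHAR(20) pads with ordinary spaces after the NBSP.
DECLARE @rule NCHAR(20) = N'abc' + NCHAR(160);

SELECT
    LEN(@rule)                                   AS lenRaw,        -- 4: the NBSP is not a blank to LEN
    UNICODE(RIGHT(RTRIM(@rule), 1))              AS lastCodePoint, -- 160
    LEN(RTRIM(REPLACE(@rule, NCHAR(160), N' '))) AS lenCleaned;    -- 3
```

RTRIM is applied first so that RIGHT(..., 1) skips the NCHAR padding and returns the last stored character. If lastCodePoint comes back as 13, 10 or 9 instead of 160, the trailing character is a carriage return, line feed or tab, and the REPLACE call has to target that character instead.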

SQL Server CONCAT function

I'm trying to write a statement like this:
SELECT CONCAT(street_name, ' ', street_number) as 'street_detail'
FROM geo_map
WHERE CONCAT(street_name, ' ', street_number) LIKE '%'
My table is something like this
postal_code int
building_name nchar(200)
street_number nchar(60)
street_name nchar(120)
The result I get is just the street name, without the street number, although my street number column has a value. Any idea what went wrong in my concat?
I am using SQL Server
It is best to use NVARCHAR(...) instead of NCHAR(...) types for storing information like what you have. The reason is that for NCHAR(...) types, strings are padded with trailing spaces to fill the whole length of the field.
A string in an NCHAR(120) field is always 120 characters wide. The concatenation of street_name (NCHAR(120)), a space and street_number (NCHAR(60)) will therefore be 181 characters wide, and the street number will start at the 122nd character of the concatenation.
Perhaps you are not seeing a street number in your concatenation because your display field (in your program, SSMS, webpage, ...) just isn't wide enough.
Now with storing your street name in an NVARCHAR(200) and pretty much all other related information in NVARCHAR(...) fields, you would not have that problem. Strings stored in those fields are not padded with trailing spaces, and you would see your street number at the place you expected in your concatenation.
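If changing the column types is not an option right away, trimming inside the CONCAT gives the same visible result. A minimal T-SQL sketch, assuming hypothetical variables that mimic the street_name and street_number columns (the values are made up):

```sql
-- Hypothetical values with the same types as the columns in the question.
DECLARE @street_name   NCHAR(120) = N'Main Street';
DECLARE @street_number NCHAR(60)  = N'42';

SELECT
    -- Padded: the number sits far to the right, after the NCHAR padding.
    CONCAT(@street_name, ' ', @street_number)               AS padded,
    -- Trimmed: 'Main Street 42'
    CONCAT(RTRIM(@street_name), ' ', RTRIM(@street_number)) AS street_detail;
```

Applying RTRIM to a column in a WHERE clause can prevent index use, so converting the columns to NVARCHAR remains the cleaner long-term fix.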

Regular expressions in SQL

I'm curious if and how you can use regular expressions to find white space in SQL statements.
I have a string that can have an unlimited amount of white space after the actual string.
For example:
"STRING "
"STRING "
would match, but
"STRING A"
"STRINGB"
would not.
Right now I have:
like 'STRING%'
which doesn't quite return the results I would like.
I am using Sql Server 2008.
A simple like can find any string with spaces at the end:
where col1 like '% '
To also allow tabs, carriage returns or line feeds:
where col1 like '%[ ' + char(9) + char(10) + char(13) + ']'
Per your comment, to find "string" followed by any number of whitespace:
where rtrim(col1) = 'string'
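You could try comparing byte lengths (LEN ignores trailing spaces, so comparing LEN values would never find them):
where datalength(col1) <> datalength(rtrim(col1))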
Andomar's answer will find the strings for you, but my spidey sense tells me maybe the scope of the problem is bigger than simply finding the whitespace.
If, as I suspect, you are finding the whitespace so that you can then clean it up, a simple
UPDATE Table1
SET col1 = RTRIM(col1)
will remove any trailing whitespace from the column.
Or RTRIM(LTRIM(col1)) to remove both leading and trailing whitespace.
Or REPLACE(col1, ' ', '') to remove all spaces, including spaces within the string.
Note that RTRIM and LTRIM only work on spaces, so to remove tabs/CRs/LFs you would have to use REPLACE. To remove those only from the leading/trailing portion of the string is feasible but not entirely simple. Bug your database vendor to implement the ANSI SQL 99 standard TRIM function that would make this much easier.
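One possible way to strip trailing tabs, CRs and LFs along with spaces is to locate the last non-whitespace character with PATINDEX on the reversed string. This is only a sketch; the variable names and the sample value are hypothetical:

```sql
-- Hypothetical value: 'STRING' followed by space, tab, CR, LF.
DECLARE @s VARCHAR(50) = 'STRING ' + CHAR(9) + CHAR(13) + CHAR(10);
-- Pattern matching the first character that is NOT space, tab, LF or CR.
DECLARE @notWs VARCHAR(20) = '%[^ ' + CHAR(9) + CHAR(10) + CHAR(13) + ']%';

-- LEN(@s + 'x') - 1 is the full length including trailing spaces;
-- PATINDEX on the reversed string counts how far the last real character is from the end.
SELECT LEFT(@s, LEN(@s + 'x') - PATINDEX(@notWs, REVERSE(@s))) AS trimmed; -- 'STRING'
```

If the value consists only of whitespace, PATINDEX returns 0 and the expression returns the string unchanged, so that case would need separate handling.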
where len(col1 + 'x') <> len(rtrim(col1)) + 1
BOL provides workarounds for LEN() with trailing spaces : http://msdn.microsoft.com/en-us/library/ms190329.aspx
LEN(Column + '_') - 1
or using DATALENGTH

MYSQL merge columns

I'm using MySQL and do a select:
SELECT LTRIM(Firstname + ' ' + Lastname) AS Fullname FROM Persons
My result is 0 for every result.
Even if I remove the LTRIM, I get the same result. Using CONCAT gives the same problem.
You are arithmetically adding the string values together; unless you have "1ohn 5mith" in the db, this will always be 0.
Does SELECT LTRIM(CONCAT(Firstname,' ',Lastname)) AS Fullname FROM Persons give you the same problem? (note that there are 3 parameters to CONCAT() here: Firstname, a one-character string containing a space, and Lastname; this function takes as many arguments as you throw at it and outputs them as a string)
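The difference is easy to see with literals. A small MySQL sketch (the names are placeholders, not data from the question):

```sql
-- MySQL: + is numeric addition, so the strings are cast to numbers first.
SELECT 'John' + ' ' + 'Smith' AS added;             -- 0 (with truncation warnings)
SELECT '1ohn' + '5mith' AS added;                   -- 6: leading digits are used as the number
SELECT CONCAT('John', ' ', 'Smith') AS Fullname;    -- 'John Smith'
SELECT CONCAT_WS(' ', 'John', 'Smith') AS Fullname; -- 'John Smith'
```

CONCAT returns NULL if any argument is NULL, whereas CONCAT_WS skips NULL arguments, which may matter if either name column is nullable.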