SQL: how to replace a string that contains only newlines? - sql

In some cases my field contains several newlines and nothing else.
I can't use this:
SELECT REPLACE(REPLACE(fielddata, CHAR(13), ''), CHAR(10), '')
because I also have normal values, and as I understand it this would remove the newlines from those as well.
As I understand it, I need to check whether the string contains only newlines and, if it does, replace them with ''.
How can I accomplish that?

One way to solve it is to use CASE, LEN, and trim to figure out whether the column has any data in that specific row:
SELECT CASE WHEN LEN(
           LTRIM(
             RTRIM(
               REPLACE(
                 REPLACE(fielddata, CHAR(13), ''),
                 CHAR(10), ''
               )
             )
           )
         ) = 0 THEN
         ''
       ELSE
         fielddata
       END as fielddata

You can use a CASE expression that strips both CHAR(13) and CHAR(10) in the test and returns the original value when anything else is left:
CASE WHEN REPLACE(REPLACE(fielddata, CHAR(13), ''), CHAR(10), '') = ''
THEN ''
ELSE fielddata END AS Computed_Column
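As a quick way to check how a CASE expression like the ones above behaves, here is a minimal sketch for SQL Server that runs against a few made-up rows instead of a real table. The derived table v and its label column are assumptions for illustration only; fielddata stands in for the real column.

```sql
-- Hypothetical sample rows; no real table is needed to run this.
SELECT v.label,
       CASE
           -- Nothing left once CR and LF are stripped: treat the value as empty.
           WHEN REPLACE(REPLACE(v.fielddata, CHAR(13), ''), CHAR(10), '') = ''
               THEN ''
           -- Otherwise keep the value untouched, including its newlines.
           ELSE v.fielddata
       END AS cleaned
FROM (VALUES
        ('only newlines',     CHAR(13) + CHAR(10) + CHAR(13) + CHAR(10)),
        ('text with newline', 'line one' + CHAR(13) + CHAR(10) + 'line two'),
        ('already empty',     '')
     ) AS v(label, fielddata);
```

Note that SQL Server ignores trailing spaces when comparing strings with =, so a value made up of newlines and spaces would also come back as ''. If that is not wanted, testing DATALENGTH of the stripped value against 0 is a stricter check.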

Related

LIKE statement to compare strings with hyphen

I am working in SQL and I have 3 columns: Current Name, Given Full Name, and whether the names match (Y or N).
The problem is that when I compare the strings in the first 2 columns, I do not get the correct result. For example, I cannot find a way to show that 'Tushar Sharma' is the same as 'Tushar-Sharma', where Tushar Sharma is the current full name and Tushar-Sharma is the name extracted from a report.
I am stuck on the LIKE statement: how do I account for the hyphen (-) in the comparison so that I get a Y in the 3rd column?
Thank you
One option is to remove hyphens and spaces from both names for the comparison:
select (case when replace(replace(given_name, '-', ''), ' ', '') = replace(replace(full_name, '-', ''), ' ', '') then 'Y' else 'N' end) as names_match
You can use replace() with like as well:
select (case when replace(replace(given_name, '-', ''), ' ', '') like '%' + replace(replace(full_name, '-', ''), ' ', '') + '%' then 'Y' else 'N' end) as names_match
Replace - with a space and compare. You can also use regex or fuzzy matching to improve the match for other conditions.
AND REPLACE(CurrentName, '-', ' ') = REPLACE(GivenName, '-', ' ');
Ex:
AND REPLACE('Tushar Sharma', '-', ' ') = REPLACE('Tushar-Sharma', '-', ' ')
will eval to
AND 'Tushar Sharma' = 'Tushar Sharma'
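This will work in a dialect that has regexp_replace (Oracle, for example), provided both spaces and hyphens are stripped before comparing:
select currentname, givenfullname,
case when regexp_replace(currentname,'[ -]','') like regexp_replace(givenfullname,'[ -]','') then 'Y' else 'N' end as matchstatus
from table_name;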

Parsing Name Field in SQL

I am trying to split a name field into the appropriate fields. The format of the name field is not consistent. It can show up as Doe III,John w or Doe,John or Doe III,John or Doe,John W; that is, the suffix and/or the middle initial may be missing. Any ideas would be greatly appreciated.
SELECT (
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ' ') - 1
THEN PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 3)
END
) AS LastName
,(
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ',') - 1
THEN NULL
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
END
) AS Suffix
,PARSENAME(REPLACE(FirstName, ' ', '.'), 1) AS FirstName
FROM Trusts.dbo.tblMember
I need the name, in any of the formats above, parsed into the fields LastName, Suffix, FirstName and MiddleInitial, whether or not it has a suffix or a middle initial.
If the four formats above are the only cases, you can use something like the query below.
Note: I used a CTE, tbl2, to compute comma_pos, first_space and second_space separately so that the main query is easier to read. You can instead inline them, replacing comma_pos in the main query with charindex(',',name) and so on, which may make the main query faster.
Also, I am assuming there are no leading, trailing or extra whitespace characters and no junk characters in the name column. If there are, sanitize your data first before proceeding.
with tbl2 as (
select tbl.*,
charindex(',',name) as comma_pos,
charindex(' ',name,1) first_space,
charindex(' ',name,charindex(' ',name,1)+1) second_space
from tbl)
select tbl2.name
,case when second_space <> 0
then substring(name,comma_pos+1,second_space-comma_pos-1)
when first_space > comma_pos
then substring(name,comma_pos+1,first_space-comma_pos-1)
else substring(name,comma_pos+1,len(name)-comma_pos)
end as first_name
,case when second_space <> 0
then substring(name,second_space+1,len(name)-second_space)
when first_space > comma_pos
then substring(name,first_space+1,len(name)-first_space)
end as middle_name
,case when first_space=0 or first_space>comma_pos
then substring(name,1,comma_pos-1)
else substring(name,1,first_space-1)
end as last_name
,case when first_space=0 or first_space>comma_pos
then null
else substring(name,first_space+1,comma_pos-first_space-1)
end as suffix
from tbl2;

Scalar Function in Where Clause Really Slow? How to use Cross Apply Instead?

I have some data that was imported with different separators such as * - . or a space. Some of the separators were removed on import and some were not. Some of the external values being compared to it have the same issue. So we remove all separators and compare that way; I don't want to just update the columns yet as the data isn't "mine".
Since I see this over and over in the code I am moving to stored procedures, I wrote a stored function to do it for me.
ALTER FUNCTION [dbo].[fn_AccountNumber_Format2]
(@parAcctNum NVARCHAR(50))
RETURNS NVARCHAR(50)
AS
BEGIN
SET @parAcctNum = REPLACE(REPLACE(REPLACE(REPLACE(@parAcctNum, '.', ''), '*', ''), '-', ''), ' ', '');
RETURN @parAcctNum
END
Normally the queries looked something like this, and one takes less than a second to run on a few million rows:
SELECT name1, accountID FROM tblAccounts WHERE (Replace(Replace(Replace(accountnumber, '.', ''), '*', ''), '-', '') = Replace(Replace(Replace('123-456-789', '.', ''), '*', ''), '-', ''));
My first attempt with the function, like this, takes 24 seconds to execute:
SELECT name1, accountID FROM tblAccounts WHERE (dbo.fn_AccountNumber_Format2 ([accountnumber])) = Replace(Replace(Replace('123-456-789', '.', ''), '*', ''), '-', '');
This one 43 seconds:
SELECT name1, accountID FROM tblAccounts WHERE (dbo.fn_AccountNumber_Format2(accountnumber)) = (dbo.fn_AccountNumber_Format2 ('123-456-789'));
So the drastic slow down came as a complete shock to me as I expected the user defined function to run just the same as the system function REPLACE... After some research on stackexchange and google it seems that using Cross Apply and creating a table with the function may be a better solution but I have no idea how that works, can anyone help me with that?
Inline Function
CREATE FUNCTION [dbo].[uspAccountNumber_Format3]
(
@parAcctNum NVARCHAR(50))
RETURNS TABLE
AS
RETURN
(
SELECT REPLACE(REPLACE(REPLACE(REPLACE(@parAcctNum, '.', ''), '*', ''),'-', ''), ' ', '') AS AccountNumber
)
Usage
SELECT name1 ,
accountID
FROM tblAccounts
CROSS APPLY dbo.uspAccountNumber_Format3(accountnumber) AS a
CROSS APPLY dbo.uspAccountNumber_Format3('123-456-789') AS b
WHERE a.AccountNumber = b.AccountNumber

ORDER BY name without leading "the" (and other complex code)

I'm sorting song names from a SQLite database, and I'd like to sort ignoring any leading "The ". So, for example:
1 Labradors are Lovely
2 The Last Starfighter
3 Last Stop before Heaven
This answer solves this need in the simple case:
SELECT name FROM songs
ORDER BY
CASE WHEN instr(lower(name),'the ')=1 THEN substr(name,5)
ELSE name
END
COLLATE NOCASE;
However, I'm already using a complex transformation on the name column. Combining the two I get this ugly, non-DRY code:
SELECT n, name
FROM songs
ORDER BY
CASE WHEN name GLOB '[0-9]*' THEN 1
ELSE 0
END,
CASE WHEN name GLOB '[0-9]*' THEN CAST(name AS INT)
ELSE CASE WHEN instr(lower(name),'the ')=1 THEN
replace(
replace(
replace(
replace(
substr(name,5),
'.',''
),
'(',''
),
'''',''
),
'  ',' '
)
ELSE
replace(
replace(
replace(
replace(name,'.',''),
'(',''
),
'''',''
),
'  ',' '
)
END
END
COLLATE NOCASE;
Is there a way to use a variable or something during the query so that I can DRY up the code, and only have all that punctuation-replacement taking place in one location instead of two different case branches?
Something like this should work.
SELECT n, name FROM (
SELECT n, name,
CASE WHEN instr(lower(name),'the ')=1 THEN substr(name,5)
ELSE name
END AS NameWithoutThe
FROM songs
) AS inr
ORDER BY
CASE WHEN name GLOB '[0-9]*' THEN 1
ELSE 0
END,
CASE WHEN name GLOB '[0-9]*' THEN CAST(NameWithoutThe AS INT)
ELSE
replace(
replace(
replace(
replace(
NameWithoutThe,
'.',''
),
'(',''
),
'''',''
),
'  ',' '
)
END
COLLATE NOCASE;
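The same idea can also be written with common table expressions, so that both the "The " stripping and the punctuation replacement live in one place. This is only a sketch, assuming SQLite 3.8.3 or later (for WITH); the inline songs rows, the keyed CTE and the starts_with_digit and sort_key names are made up for illustration, and only three of the punctuation replacements are shown.

```sql
-- Hypothetical sample rows standing in for the real songs table.
WITH songs(n, name) AS (
    VALUES (1, 'The Last Starfighter'),
           (2, 'Labradors are Lovely'),
           (3, 'Last Stop before Heaven'),
           (4, '99 Red Balloons')
),
keyed AS (
    SELECT n,
           name,
           -- 1 when the name starts with a digit, 0 otherwise.
           name GLOB '[0-9]*' AS starts_with_digit,
           -- Strip a leading "The " once, then remove punctuation once.
           replace(replace(replace(
               CASE WHEN instr(lower(name), 'the ') = 1
                    THEN substr(name, 5)
                    ELSE name
               END,
               '.', ''), '(', ''), '''', '') AS sort_key
    FROM songs
)
SELECT n, name
FROM keyed
ORDER BY starts_with_digit,
         CASE WHEN starts_with_digit THEN CAST(name AS INT)
              ELSE sort_key
         END COLLATE NOCASE;
```

Against a real database, drop the first CTE so that keyed reads from the actual songs table.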

Concatenate and format text in SQL

I need to concatenate the City, State and Country columns into something like City, State, Country.
This is my code:
Select City + ', ' + State + ', ' + Country as outputText from Places
However, because City and State allow null (or empty) values, if (for example) the City is null/empty, my output will look like , Iowa, USA; or if the State is empty, the output will look like Seattle, , USA
Is there any way I can format the output and remove the "unnecessary" commas?
Edited: Because of the requirements, I should not use any other means (such as PL/SQL or a stored procedure), so it has to be a plain SQL statement
select
isnull(City, '') +
case when isnull(City, '') != '' then ', ' else '' end +
isnull(State, '') +
case when isnull(State, '') != '' then ', ' else '' end +
isnull(Country, '') as outputText
from
Places
Since concatenating a string with null results in null, if the values are null (not empty strings) this will give you the desired result:
Select isnull(City + ', ','') + isnull(State + ', ' ,'') + isnull(Country,'') as outputText from Places
Use the COALESCE (Transact-SQL) function.
SELECT COALESCE(City + ', ', '') + COALESCE(State + ', ', '')...
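In SQL Server 2012 and later you can use the CONCAT function, which treats NULL as an empty string. Wrapping each column in NULLIF and appending the comma with + makes the separator disappear together with a null or empty value:
select concat(nullif(City,'') + ', ', nullif(State,'') + ', ', Country) as outputText from Places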
Not elegant by any means...
First it changes city and state to null if they are blank,
then it substitutes a single space for each null, so that a missing value leaves a space before the following comma,
then it replaces any space comma space ( , ) with an empty string.
Query:
SELECT replace(coalesce(nullif(City,''),' ') + ', ' +
coalesce(nullif(State,''),' ') + ', ' +
coalesce(Country,''), ' , ','') as outputText
FROM Places
Assumes no city, state or country will contain space comma space.
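Another common pattern, sketched here for SQL Server, is to put the separator in front of every non-empty part and then cut the first separator off with STUFF. That way it does not matter which of the three columns is missing. The inline rows below are invented sample data standing in for the Places table, and the sketch assumes the default CONCAT_NULL_YIELDS_NULL ON setting, under which NULL + ', ' is NULL.

```sql
-- Hypothetical sample rows standing in for the Places table.
SELECT p.City, p.State, p.Country,
       STUFF(
           -- NULLIF turns '' into NULL; ', ' + NULL is NULL; COALESCE drops it.
           COALESCE(', ' + NULLIF(p.City, ''), '') +
           COALESCE(', ' + NULLIF(p.State, ''), '') +
           COALESCE(', ' + NULLIF(p.Country, ''), ''),
           1, 2, ''   -- remove the leading ', ' left by the first present part
       ) AS outputText
FROM (VALUES
        ('Des Moines', 'Iowa', 'USA'),
        (NULL,         'Iowa', 'USA'),
        ('Seattle',    '',     'USA')
     ) AS p(City, State, Country);
```

If all three parts are missing, STUFF receives an empty string and returns NULL, so wrap the result in COALESCE(..., '') if an empty string is preferred.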