Convert varchar to 3 (sometimes 4) chars in T-SQL - sql

I select data from a database. The values are (field name is ADR_KOMP_VL) :
4 , 61A, 100, 12, 58, 123C, 6 A, 5
I need to convert these values to 3 digits (except when there is a letter then it is 4)
So the converted values should be:
004, 061A, 100, 012, 058, 123C, 006A, 005
The rules are:
Always 3 digits
No spaces
If the original value is less than three digits, put 0's in front of it.(The length is 3)
If the original value contains a letter, put 0's in front of it (but the length is 4)
For the "no space" part I have this:
select REPLACE(ADR_KOMP_VL, ' ','')
The solution I have so far is:
SELECT RIGHT('000' + CONVERT(VARCHAR(4),REPLACE(ADR_KOMP_VL, ' ','')), 3)
But this only gives me the right length when there is no letter in the value. How do I handle the values that contain a letter?

This only checks whether the last character is a letter. Additional logic will be required if the letter can appear elsewhere:
SELECT REPLICATE('0', CASE WHEN ISNUMERIC(RIGHT(ADR_KOMP_VL, 1)) = 0 THEN 4
ELSE 3
END - LEN(REPLACE(ADR_KOMP_VL, ' ', '')))
+ REPLACE(ADR_KOMP_VL, ' ', '')
FROM TX
EDIT - actually this might work better; it checks whether the whole ADR_KOMP_VL value (with spaces removed) is numeric:
SELECT REPLICATE('0', CASE WHEN ISNUMERIC(REPLACE(ADR_KOMP_VL, ' ', '')) = 0 THEN 4
ELSE 3
END - LEN(REPLACE(ADR_KOMP_VL, ' ', '')))
+ REPLACE(ADR_KOMP_VL, ' ', '')
FROM TX

You can use a CASE expression:
SELECT (case when ADR_KOMP_VL like '%[A-Z]%'
then RIGHT('0000' + CONVERT(VARCHAR(4),REPLACE(ADR_KOMP_VL, ' ','')), 4)
else RIGHT('000' + CONVERT(VARCHAR(4),REPLACE(ADR_KOMP_VL, ' ','')), 3)
end)
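As a quick check of the CASE approach, the following T-SQL sketch runs it against the sample values from the question. The values are inlined with a VALUES constructor, so no table is needed; the aliases v, c and cleaned are made up for this illustration, and the letter test is applied to the value after the spaces have been removed.

```sql
-- Sample values from the question, inlined so the query is self-contained.
SELECT v.ADR_KOMP_VL,
       CASE WHEN c.cleaned LIKE '%[A-Za-z]%'
            THEN RIGHT('0000' + c.cleaned, 4)   -- contains a letter: pad to 4
            ELSE RIGHT('000'  + c.cleaned, 3)   -- digits only: pad to 3
       END AS padded
FROM (VALUES ('4'), ('61A'), ('100'), ('12'), ('58'), ('123C'), ('6 A'), ('5')) AS v(ADR_KOMP_VL)
CROSS APPLY (VALUES (REPLACE(v.ADR_KOMP_VL, ' ', ''))) AS c(cleaned);  -- strip spaces once
```

On the sample data this should yield 004, 061A, 100, 012, 058, 123C, 006A and 005, matching the expected values in the question.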

Related

SQL trying to replace middle characters with *

I am trying to mask SQL results by replacing the middle characters of every word with asterisks (*). All results are words. I am using SSMS.
For words of 4-5 letters, only the first and the last letter should be shown.
For words of 6 or more letters, only the first 2 and the last 2 letters should be shown.
Words of 1-3 letters are not replaced.
For example:
(I am using - instead of * here so that the text does not turn bold.)
"Banana" (6 letters) should become ba--na
"False" (5 letters) should become F---e
"a" stays the same
"Selin is a vegetable and banana is a fruit" becomes "S---n is a ve-----le and ba--na is a f---t."
What I have done so far works for emails, masking the parts around the @. But now I want it to happen to every word of the result.
What I've done:
DECLARE @String VARCHAR(100) = 'sample@gmail.com'
SELECT STUFF(STUFF(@String,
CHARINDEX('@',@String)+2,
(CHARINDEX('.',@String, CHARINDEX('@',@String))-CHARINDEX('@',@String)-3),
REPLICATE('*',CHARINDEX('.',@String, CHARINDEX('@',@String))-CHARINDEX('@',@String)))
,2
,CHARINDEX('@',@String)-3
,REPLICATE('*',CHARINDEX('@',@String)-3))
With result s----e@g------l.com (with * instead of -).
And I tried the mask method:
Select
--select the first character of the value and use REPLICATE for the asterisks
SUBSTRING(Sxolia,1,1) + REPLICATE('*',5)+
--append everything from the @ onwards
SUBSTRING(Sxolia,CHARINDEX('@',Sxolia),len(Sxolia)-CHARINDEX('@',Sxolia)+1)
--this keeps the @gmail.com part, so the value becomes like A*****@gmail.com
as Emailmask
From [mytable]
With result
B***** Bana is a fruit
And
declare @str nvarchar(max)
select @str = '123456'
select '****' + substring(@str, 5, len(@str) - 3)
Result: ****56
Not what I am looking for.
How should I look into this?
If I had to deal with this in SQL Server I'd operate on each word as a row, however using string_split is not (currently) an option since it does not guarantee ordering.
The following uses json to split the string as an array and provides a key value for ordering, which allows the words to be aggregated in the correct order:
select t.Sentence,
String_Agg( masked, ' ') within group(order by seq) Masked
from t
cross apply (
select seq, [value] word,
case
when l<=3 then [value]
when l<=5 then Stuff([value],2,l-2,Replicate('*',l-2))
else
Stuff([value],3,l-4,Replicate('*',l-4))
end Masked
from (
select j.[value], 1 + Convert(tinyint,j.[key]) Seq
from OpenJson(Concat('["',replace(t.Sentence,' ', '","'),'"]')) j
)w
cross apply (values(Len([value])))x(l)
)w
group by t.Sentence;
See working demo
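To see why the JSON trick keeps the word order, it can help to look at the split on its own. This is a minimal T-SQL sketch, assuming SQL Server 2016 or later with database compatibility level 130 or higher (needed for OPENJSON). The variable @Sentence is made up for this illustration, and the STRING_ESCAPE call is an addition that is not part of the answer above; it is there so that quotes or backslashes in the text do not break the JSON array.

```sql
DECLARE @Sentence varchar(200) = 'Selin is a vegetable and banana is a fruit';

-- Turn 'a b c' into the JSON array ["a","b","c"]. OPENJSON then returns one
-- row per element, with [key] holding the zero-based array index.
SELECT 1 + CONVERT(int, j.[key]) AS seq,
       j.[value]                 AS word
FROM OPENJSON(
         CONCAT('["', REPLACE(STRING_ESCAPE(@Sentence, 'json'), ' ', '","'), '"]')
     ) AS j
ORDER BY seq;
```

This should return nine rows, from (1, Selin) to (9, fruit). The seq column is what the WITHIN GROUP (ORDER BY seq) clause relies on when the masked words are aggregated back into a sentence.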
I'm not sure how e-mail fits into all this because you're asking for word masks, so I'm going to assume you actually want this. Use divide and conquer to implement this, so first implement an expression that would do this for simplest cases (e.g. single words). Then if you need it for e-mails, just split the e-mails however you see fit and then apply the same expression.
The expression itself is rather simple:
SELECT *
FROM (VALUES
('banana'),
('selin'),
('vegetable')
) words(word)
CROSS
APPLY (SELECT CASE
WHEN ln BETWEEN 4 AND 5
THEN LEFT(word, 1) + REPLICATE('*', ln-2) + RIGHT(word, 1)
WHEN ln >= 6
THEN LEFT(word, 2) + REPLICATE('*', ln-4) + RIGHT(word, 2)
ELSE word
END as result
FROM (VALUES (LEN(words.word))) x(ln)
) calc
This already provides the expected result. You could define a function out of this, if you have the permissions, and use it like so:
SELECT *
FROM (VALUES
('banana'),
('selin'),
('vegetable')
) words(word)
CROSS
APPLY fnMaskWord(word)
Here's a working demo on dbfiddle, it includes the statement to create the function.
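The function definition itself is only in the linked demo, not in the text. A plausible sketch of such an inline table-valued function is shown below; the parameter name and type, the dbo schema and the sample words are assumptions rather than the demo's actual code. GO is the batch separator used by SSMS and sqlcmd, needed because CREATE FUNCTION must be the first statement in its batch.

```sql
-- Hypothetical definition; the real one lives in the linked demo.
CREATE FUNCTION dbo.fnMaskWord (@word varchar(100))
RETURNS TABLE
AS
RETURN
    SELECT CASE
               WHEN x.ln BETWEEN 4 AND 5   -- keep 1 character on each side
                   THEN LEFT(@word, 1) + REPLICATE('*', x.ln - 2) + RIGHT(@word, 1)
               WHEN x.ln >= 6              -- keep 2 characters on each side
                   THEN LEFT(@word, 2) + REPLICATE('*', x.ln - 4) + RIGHT(@word, 2)
               ELSE @word                  -- 1-3 characters: unchanged
           END AS result
    FROM (VALUES (LEN(@word))) AS x(ln);
GO

-- Usage: one masked value per input row.
SELECT w.word, m.result
FROM (VALUES ('banana'), ('selin'), ('a')) AS w(word)
CROSS APPLY dbo.fnMaskWord(w.word) AS m;
```

With these three sample words the result column should read ba**na, s***n and a.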
Expanding on a few answers:
declare @String varchar(100) = 'banana'
select case when len(@String) <= 3 then @String
when len(@String) > 3 AND len(@String) <= 5 then
substring(@String, 1, 1) +
REPLICATE('*', Len(@String) - 2) +
substring(@String, Len(@String), 1)
when len(@String) >= 6 then
substring(@String, 1, 2) +
REPLICATE('*', Len(@String) - 4) +
substring(@String, Len(@String) - 1, 2)
else 'unrecognized length!'
end
If the length of the string is less than or equal to 3, return the string unchanged.
If the length of the string is more than 3 and less than or equal to 5, take the first character, then replicate * by the length of the string minus 2, and finally add the last character of the string.
If the length is 6 or more, do the same but keep two characters at each end and replicate * by the length minus 4.
Otherwise (for example, when the string is NULL) return 'unrecognized length!'.
Hope this helps understand what's going on!
Maybe this can help
declare @t table (word varchar(50))
insert into @t values ('banana'), ('selin'), ('vegetable')
select case when len(t.word) < 4 then t.word -- 1-3 letters: no replacement
else left(t.word, 1) + -- take first char from left
replicate('*', Len(t.word) - 2) + -- fill middle with *
right(t.word, 1) -- take last char from right
end
from @t t
this returns
COLUMN1
b****a
s***n
v*******e
If you want to keep 2 chars left and right when the len > 5 then maybe this
select case when len(t.word) < 4 then t.word
when len(t.word) < 6 then
left(t.word, 1) +
replicate('*', len(t.word) - 2) +
right(t.word, 1)
else left(t.word, 2) +
replicate('*', len(t.word) - 4) +
right(t.word, 2)
end
from @t t
The result
COLUMN1
ba**na
s***n
ve*****le
EDIT: What if there is a whole sentence?
Then we first split the sentence into words, and then concatenate the individual words back together while putting the * in them:
declare @t table (word varchar(50))
insert into @t values ('banana'), ('selin'), ('vegetable'), ('Banana is a fruit')
select t.word,
-- put the words back together into the sentence, and mask them while we are at it
( select string_agg(case when len(value) < 4 then value
when len(value) < 6 then
left(value, 1) +
replicate('*', len(value) - 2) +
right(value, 1)
else left(value, 2) +
replicate('*', len(value) - 4) +
right(value, 2)
end,
' ')
)
from @t t
cross apply string_split(t.word, ' ') s -- split the sentence into words (string_split does not guarantee their order)
group by t.word
The result is
word               COLUMN1
-----------------  -----------------
banana             ba**na
Banana is a fruit  Ba**na is a f***t
selin              s***n
vegetable          ve*****le

how to replace a value for string if letters are missing?

I ran into a problem where I have to create a 'LettersOfName' column. As the name suggests, I have to take letters 2, 3 and 5 from the ORGANISATIONNAME column and letters 2 and 3 from the CLIENTLASTNAME column, then concatenate them to form the LettersOfName column. The condition is: if the result is not exactly 5 characters long, replace it with '22222'; likewise, if any of the letters is missing from the first name or the last name, replace it with '22222'. I am using this query:
select
( CASE WHEN LENGTH (UPPER( SUBSTR(ORGANISATIONNAME, 2,2) || SUBSTR(ORGANISATIONNAME,5,1)) || UPPER(SUBSTR(CLIENTLASTNAME,2,2))) != '5' THEN '22222'
ELSE UPPER( SUBSTR(ORGANISATIONNAME, 2,2) || SUBSTR(ORGANISATIONNAME,5,1)) || UPPER(SUBSTR(CLIENTLASTNAME,2,2)) END)
AS LETTERSOFNAME
from client;
So far this query runs fine, but when we have a name like 'Jo Anne' or 'J Shark', letters 2 and 3 are missing and yet the string is not replaced with '22222'. It is only replaced when the length is not equal to 5. I am using Oracle 12c.
If after the concatenations of the letters you remove all the spaces and the length of the remaining string is less than 5 then replace with '22222':
SELECT
CASE
WHEN LENGTH(REPLACE(SUBSTR(ORGANISATIONNAME, 2, 2) || SUBSTR(ORGANISATIONNAME, 5, 1) || SUBSTR(CLIENTLASTNAME, 2, 2), ' ', '')) < 5 THEN '22222'
ELSE UPPER(SUBSTR(ORGANISATIONNAME, 2, 2) || SUBSTR(ORGANISATIONNAME, 5, 1) || SUBSTR(CLIENTLASTNAME, 2, 2))
END LETTERSOFNAME
FROM client
Or with a CTE:
WITH cte AS (
SELECT
UPPER(REPLACE(
SUBSTR(ORGANISATIONNAME, 2, 2) ||
SUBSTR(ORGANISATIONNAME, 5, 1) ||
SUBSTR(CLIENTLASTNAME, 2, 2),
' ',
''
)) LETTERSOFNAME
FROM client
)
SELECT
CASE
WHEN LENGTH(LETTERSOFNAME) < 5 THEN '22222'
ELSE LETTERSOFNAME
END LETTERSOFNAME
FROM cte
See the demo.
You should first remove the white space from the string and then apply your CASE expression to it:
replace ('J Shark', ' ', '')
The reason is that the white space in 'J Shark' is counted as a character, which is why the second and third letters appear to be missing.
Here is an example demo.
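The effect of the embedded space can be made visible with a small Oracle query. This is only a sketch: the sample names come from the question plus one made-up control value ('Smith'), and the column aliases are illustrative. The NVL wrapper is an addition, included because Oracle treats an empty string as NULL, so LENGTH returns NULL rather than 0 when nothing is left after removing spaces, and a comparison such as `< 5` against NULL would not be true.

```sql
-- Letters 2 and 3 of each name, before and after removing spaces.
SELECT full_name,
       '[' || SUBSTR(full_name, 2, 2) || ']'                     AS raw_letters,
       LENGTH(SUBSTR(full_name, 2, 2))                           AS raw_len,
       NVL(LENGTH(REPLACE(SUBSTR(full_name, 2, 2), ' ', '')), 0) AS len_without_spaces
FROM (
    SELECT 'J Shark' AS full_name FROM dual UNION ALL
    SELECT 'Jo Anne'              FROM dual UNION ALL
    SELECT 'Smith'                FROM dual
);
```

For 'J Shark' and 'Jo Anne' the raw length is 2 but only 1 character remains once the space is removed, which is why a plain length check on the concatenated letters does not catch these names; 'Smith' gives 2 in both columns.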
Here is my approach:
1. Put both columns ORGANISATIONNAME and CLIENTLASTNAME into another table with an identity column (to identify each row)
2. Write a function to split text by a string (in this case pass a space)
3. Get the identity and the split data into 2 tables, one each for column 1 and column 2
4. Consider each table and apply your logic
5. Concatenate the row values separated by space, with the ID (1 record per ID)
6. Join the 2 tables (by IDs)
7. Join the 2 tables for matches in the split data, and get the IDs
8. Now query for the data in the table from step 1

Parsing Name Field in SQL

I am trying to separate a name field into the appropriate fields. The name field is not consistently formatted. It can show up as "Doe III,John w", "Doe,John", "Doe III,John" or "Doe,John W"; that is, it may be lacking the suffix and/or the middle initial. Any ideas would be greatly appreciated.
SELECT (
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ' ') - 1
THEN PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 3)
END
) AS LastName
,(
CASE LEN(REPLACE(FirstName, ' ', ''))
WHEN LEN(FirstName + ',') - 1
THEN NULL
ELSE PARSENAME(REPLACE(FirstName, ' ', '.'), 2)
END
) AS Suffix
,PARSENAME(REPLACE(FirstName, ' ', '.'), 1) AS FirstName
FROM Trusts.dbo.tblMember
I need the name regardless of the format, as stated above, to parse into the appropriate fields of LastName,Suffix,FirstName,MiddleInitial, regardless of whether it has a suffix or a middle initial
If the given 4 names are the only type of cases, then you can use something like below.
Note: I used a CTE, tbl2, to compute comma_pos, first_space and second_space separately, so that the main query is easier to understand. You can replace these values in the main query with their corresponding function calls from the CTE to make the main query faster; that is, replace comma_pos in the main query with charindex(',',name) and so on.
Also I am assuming that there are no leading/trailing or extra whitespaces or any junk character in name column. If you have, then sanitize your data first before proceeding.
with tbl2 as (
select tbl.*,
charindex(',',name) as comma_pos,
charindex(' ',name,1) first_space,
charindex(' ',name,charindex(' ',name,1)+1) second_space
from tbl)
select tbl2.name
,case when second_space <> 0
then substring(name,comma_pos+1,second_space-comma_pos-1)
when first_space > comma_pos
then substring(name,comma_pos+1,first_space-comma_pos-1)
else substring(name,comma_pos+1,len(name)-comma_pos)
end as first_name
,case when second_space <> 0
then substring(name,second_space+1,len(name)-second_space)
when first_space > comma_pos
then substring(name,first_space+1,len(name)-first_space)
end as middle_name
,case when first_space=0 or first_space>comma_pos
then substring(name,1,comma_pos-1)
else substring(name,1,first_space-1)
end as last_name
,case when first_space=0 or first_space>comma_pos
then null
else substring(name,first_space+1,comma_pos-first_space-1)
end as suffix
from tbl2;
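The CASE branches above are driven entirely by the three positions computed in the CTE, so it can be useful to look at those positions for the four name formats from the question. A small T-SQL sketch with the sample names inlined (the aliases v and p are made up for this illustration):

```sql
-- Positions of the comma and of the first and second space for each format.
SELECT v.name, p.comma_pos, p.first_space, p.second_space
FROM (VALUES ('Doe III,John w'),   -- comma 8, spaces at 4 and 13
             ('Doe,John'),         -- comma 4, no spaces (0 and 0)
             ('Doe III,John'),     -- comma 8, space at 4, no second space
             ('Doe,John W')        -- comma 4, space at 9, no second space
     ) AS v(name)
CROSS APPLY (
    SELECT CHARINDEX(',', v.name)                             AS comma_pos,
           CHARINDEX(' ', v.name)                             AS first_space,
           CHARINDEX(' ', v.name, CHARINDEX(' ', v.name) + 1) AS second_space
) AS p;
```

A first space that comes before the comma means the name has a suffix, a space after the comma means it has a middle initial, and a non-zero second space means it has both.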

CHARINDEX issue with NULL CHAR variable

DECLARE @MyChar CHAR = NULL
SELECT CHARINDEX(' ', ISNULL(NULL, '')),
CHARINDEX(' ', ISNULL(@MyChar, '')),
CHARINDEX(' ', ISNULL(CONVERT(VARCHAR, @MyChar), ''))
The above query returns the values 0, 1 and 0, in that order.
This result should be 0, 0 and 0. Is this an issue with MS SQL or there is some functionality here which I haven't understood?
I believe this will answer the question:
DECLARE @MyChar CHAR = NULL
SELECT CHARINDEX(' ', ISNULL(NULL, '')) a,
CHARINDEX(' ', ISNULL(@MyChar, '')) b,
CHARINDEX(' ', ISNULL(CONVERT(VARCHAR, @MyChar), '')) c
Results:
a b c
----------- ----------- -----------
0 1 0
Testing the values:
SELECT '|' + @MyChar + '|' a,
'|' + ISNULL(@MyChar, '') + '|' b,
'|' + ISNULL(CONVERT(VARCHAR, @MyChar), '') + '|' c
Results:
a    b    c
---- ---- --------------------------------
NULL | |  ||
The ISNULL function returns the data type of the first argument it receives. Since char has a minimum length of 1 and pads the value with trailing spaces if needed, the result of ISNULL(@MyChar, '') is a string with a single space, hence the 1 you get in your result.
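The padding can be observed directly by comparing a char(1) variable with a varchar(1) one. A minimal T-SQL sketch; the second variable, @MyVarchar, and the column aliases are additions for this illustration:

```sql
DECLARE @MyChar    char(1)    = NULL;
DECLARE @MyVarchar varchar(1) = NULL;

-- ISNULL takes its result type from the first argument, so the empty string
-- is padded to one space for char(1) but stays empty for varchar(1).
SELECT SQL_VARIANT_PROPERTY(ISNULL(@MyChar, ''), 'BaseType')    AS char_type,     -- char
       DATALENGTH(ISNULL(@MyChar, ''))                          AS char_bytes,    -- 1
       CHARINDEX(' ', ISNULL(@MyChar, ''))                      AS char_pos,      -- 1
       SQL_VARIANT_PROPERTY(ISNULL(@MyVarchar, ''), 'BaseType') AS varchar_type,  -- varchar
       DATALENGTH(ISNULL(@MyVarchar, ''))                       AS varchar_bytes, -- 0
       CHARINDEX(' ', ISNULL(@MyVarchar, ''))                   AS varchar_pos;   -- 0
```

Declaring the variable as varchar, or converting it before the ISNULL call as in the third expression of the question, avoids the padded space.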
Let's try to understand the second query in two parts.
First part: SELECT ISNULL(@MyChar, '')
As per MSDN regarding the ISNULL function:
Data type determination of the resulting expression is determined based on the data type of the first parameter.
So your first parameter, @MyChar, is of type CHAR with the value NULL, and when you use it in the ISNULL function, the second parameter, '' (an empty string), is implicitly converted to CHAR like this:
SELECT CAST('' AS CHAR)
When you execute this query it'll give you whitespace.
Now when you execute your actual query with CHARINDEX
SELECT CHARINDEX(' ', ISNULL(@MyChar, ''))
You'll get 1
Since you have defined the variable as CHAR and given the value as NULL hence it would occupy some space(2 bytes). So if the variable is a fixed width column and if you are trying to store NULL value in it then it will occupy the same amount of space as any other. From here:
There is a misconception that if we have the NULL values in a table it
doesn't occupy storage space. The fact is, a NULL value occupies space
– 2 bytes
If you change the datatype to varchar(1) and then provide the value as NULL you will find that you are getting the result as 0,0,0. So in case of variable width provided to the variable the NULL takes no space.
A good read article: How does SQL Server really store NULL-s

Oracle replacing text between first and last spaces

Here is the table data, with the column named Ships.
+-----------------+
| Ships           |
+-----------------+
| Duke of north   |
| Prince of Wales |
| Baltic          |
+-----------------+
Replace all characters between the first and the last spaces (excluding these spaces) by symbols
of an asterisk (*). The number of asterisks must be equal to number of replaced characters.
Regular expressions are your friend :)
First match the space, followed by any other characters, ending in a space.
Then replace that with a string that consists of the starting and trailing space and, in between, a string of asterisks.
The string of asterisks is made by right padding a single asterisk with further asterisks to the appropriate length. That length is the length of the regular expression matched minus two characters for the leading/trailing space.
select regexp_replace(column_value,' .* ',
' '||rpad('*',length(regexp_substr(column_value,' .* '))-2,'*')||' ')
from table(sys.dbms_debug_vc2coll(
'Duke of north','Prince of Wales','Baltic','what if two spaces'));
Duke ** north
Prince ** Wales
Baltic
what ****** spaces
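To make the length arithmetic above easier to follow, the matched segment and the resulting asterisk count can be selected on their own. This is an Oracle sketch with three of the sample strings inlined; the aliases s, matched and star_count are made up for this illustration, and the brackets are only there to make the leading and trailing spaces visible.

```sql
SELECT s,
       '[' || REGEXP_SUBSTR(s, ' .* ') || ']' AS matched,     -- greedy: first space to last space
       LENGTH(REGEXP_SUBSTR(s, ' .* ')) - 2   AS star_count   -- minus the two outer spaces
FROM (
    SELECT 'Duke of north' AS s FROM dual UNION ALL
    SELECT 'what if two spaces' FROM dual UNION ALL
    SELECT 'Baltic'             FROM dual
);
```

'Duke of north' matches [ of ] and needs 2 asterisks, while 'what if two spaces' matches [ if two ] and needs 6, so the space between the middle words is masked as well. For 'Baltic' there is no match, REGEXP_SUBSTR returns NULL, and the string is left unchanged by the replace.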
This really smells like homework. So I won't provide you with the full deal, but point you in the right direction instead:
Check out the function InStr, especially its 3rd and 4th parameters, which allow you to start searching at the Xth character and/or to search for the Yth occurrence.
Edit: If someone finds this thread in a search and hopes for a solution that works in older versions of Oracle, this is how I'd have done it.
(I posted it as a comment to another post, but the author deleted his answer for some inexplicable reason o_O )
SELECT case
when InStr(Name, ' ', 1) > 0 and
InStr(Name, ' ', 1) <> InStr(Name, ' ', -1) then
SubStr(Name, 1, InStr(Name, ' ', 1)) ||
lPad('*', InStr(Name, ' ', -1) - InStr(Name, ' ', 1) - 1, '*') ||
SubStr(Name, InStr(Name, ' ', -1))
else
Trim(Name)
end
FROM SomeTable
Although the data in the original question only had one word in between, it is possible to have more than one word between the first and the last word. For example: "This is an example with more than one word"
I suppose the solution should handle these cases as well.
Anyway, here is another solution:
With
I As(
/*Serves as an input parameter*/
Select 'This is an example with more than one word' Str From Dual
)
,D As(
/*Split words into rows*/
Select RegExp_SubStr(Str,'[^ ]+',1,Level) Word,RowNum Seq,First_value(RowNum) Over(Order By RowNum Desc) L
From I
Connect By RegExp_SubStr(Str,'[^ ]+',1,Level) Is Not NULL
)
Select
/*Assemble all together - other than the first and the last word, replace all the rest into "*"*/
--uncomment the ListAgg statement if using 11g--
--ListAgg(Decode(Seq,1,Word,L,Word,RegExp_Replace(Word,'.','*')),' ') Within Group(Order By Seq) Statement
--If using earlier version of Oracle then use the following--
Trim(RegExp_Replace(XMLAgg(XMLElement(R,Decode(Seq,1,Word,L,Word,RegExp_Replace(Word,'.','*'))||' ') Order By Seq),'</?R>')) Statement
From D
/
OUTPUT:
This ** ** ******* **** **** **** *** word
SELECT a actual_string,
first_word,
SUBSTR(output1,1,LENGTH(output1)-LENGTH(SUBSTR(output1,(
CASE
WHEN regexp_count(output1,' ')=0
THEN 0
ELSE regexp_instr(output1,' ',1,regexp_count(output1,' '))
END)+1))) middle_words,
last_word,
CASE
WHEN first_word=last_word
THEN first_word
ELSE first_word
||TRANSLATE(upper(SUBSTR(output1,1,LENGTH(output1)-LENGTH(SUBSTR(output1,(
CASE
WHEN regexp_count(output1,' ')=0
THEN 0
ELSE regexp_instr(output1,' ',1,regexp_count(output1,' '))
END)+1)))),'ABCDEFGHIJKLMNOPQRSTUVWXYZ','**************************')
||last_word
END final_result
FROM
(SELECT a,
CASE
WHEN SUBSTR(a,1,regexp_instr(a,' ',1)) IS NULL
THEN a
ELSE SUBSTR(a,1,regexp_instr(a,' ',1))
END first_word,
SUBSTR(a,(
CASE
WHEN regexp_count(a,' ')=0
THEN 0
ELSE regexp_instr(a,' ',1,regexp_count(a,' '))
END)+1) last_word,
SUBSTR(a, LENGTH(
CASE
WHEN SUBSTR(a,1,regexp_instr(a,' ',1)) IS NULL
THEN a
ELSE SUBSTR(a,1,regexp_instr(a,' ',1))
END)+1, LENGTH(SUBSTR(a,(
CASE
WHEN regexp_count(a,' ')=0
THEN 0
ELSE regexp_instr(a,' ',1,regexp_count(a,' '))
END)+1))-2) middle_words,
CASE
WHEN regexp_instr(a,' ',1) +1>1
THEN SUBSTR(a,regexp_instr(a,' ',1)+1,
CASE
WHEN regexp_count(a,' ')=0
THEN 0
ELSE regexp_instr(a,' ',1,regexp_count(a,' '))
END )
ELSE a
END output1--,
FROM
( SELECT 'Duke of north' a FROM dual
UNION
SELECT 'Prince of Wales' a FROM dual
UNION
SELECT 'Baltic' a FROM dual
UNION
SELECT 'what if two spaces' a FROM dual
UNION
SELECT 'what if two or spaces' a FROM dual
)
)