VBA equivalent User Defined String formats in TSQL?

VBA allows user-defined string formats in Format(). I am particularly interested in replicating the placeholder characters @ and ! in SQL Server (using its FORMAT() function, though I am open to alternatives).
My use case requires a mix of characters and numbers stored as a Variant type in VBA.
Using the @ and ! placeholder characters, here is what I would like to mimic from VBA in SQL Server.
VBA:
Format("12DFR89", "!@@-@-@@@@")
Output: 12-D-FR89

Since you are asking about reproducing the fixed format "!@@-@-@@@@", you can do this with a UDF that replicates the VBA behaviour:
CREATE FUNCTION dbo.CustomFormat(@VALUE VARCHAR(MAX)) RETURNS VARCHAR(MAX) AS
BEGIN
    DECLARE @MAX_LENGTH INT = 7
    SET @VALUE = RIGHT(@VALUE + ISNULL(RIGHT(REPLICATE('?', @MAX_LENGTH - LEN(@VALUE)), @MAX_LENGTH), ''), @MAX_LENGTH)
    RETURN CONCAT(
        LEFT(@VALUE, 2),
        '-',
        SUBSTRING(@VALUE, 3, 1),
        '-',
        SUBSTRING(@VALUE, 4, @MAX_LENGTH)
    )
END
GO
Example:
SELECT
test,
dbo.CustomFormat(test)
FROM ( VALUES
('1'), ('12'), ('123'), ('1234'), ('12345'), ('123456'), ('1234567'), ('12345678'), ('123456789'), ('1234567890')
) AS T(test)
For:
test        (No column name)
1           1?-?-????
12          12-?-????
123         12-3-????
1234        12-3-4???
12345       12-3-45??
123456      12-3-456?
1234567     12-3-4567
12345678    23-4-5678
123456789   34-5-6789
1234567890  45-6-7890
(Replace '?' with ' ' in the function to get spaces)
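For checking the expected outputs outside the database, the same pad-then-slice logic can be sketched in Python. This is only an illustration of what the UDF above does, not a replacement for it; the name custom_format and the pad parameter are made up for this sketch.

```python
def custom_format(value: str, pad: str = "?") -> str:
    """Mimic dbo.CustomFormat: pad to 7 characters, keep the rightmost 7, insert dashes."""
    max_length = 7
    # Pad on the right when the input is too short (no padding when it is long enough).
    padded = value + pad * max(0, max_length - len(value))
    # Keep only the rightmost 7 characters, like RIGHT(..., 7) in the UDF.
    padded = padded[-max_length:]
    # Split as 2 characters, 1 character, 4 characters.
    return f"{padded[:2]}-{padded[2]}-{padded[3:]}"


if __name__ == "__main__":
    for test in ["1", "123", "1234567", "12345678", "12DFR89"]:
        print(test, custom_format(test))
```

Passing pad=" " corresponds to replacing '?' with ' ' in the SQL function.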

Related

Is there a way to find default values that is a combination of the same number like 000/000/0 or 11111 or 99999?

I want to find values in the SQL database that consist of a single repeated digit, such as 0000, 000/000/0, 11111 or 99999.
Is there a way to find these values without hardcoding?
What I am currently doing is:
select * from XXXX where value = '000/000/0'

A simple solution is to strip the '/' separators, remove all instances of the first character of the string, and check whether the result is an empty string:
select *
from t
where replace(replace(str, '/', ''), substring(str, 1, 1), '') = ''
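The idea behind that query can be sketched in Python to make the logic easier to follow. This is a rough equivalent under my own assumptions: the function name is hypothetical, '/' is assumed to be the only separator, and empty input is treated as "no match", which the SQL version does not address.

```python
def is_single_repeated_char(value: str, separator: str = "/") -> bool:
    """Return True when value, ignoring separators, is one character repeated."""
    stripped = value.replace(separator, "")
    if not stripped:
        # Assumption: empty or separator-only input does not count as a match.
        return False
    # Removing every copy of the first character leaves nothing
    # only if the whole string was made of that character.
    return stripped.replace(stripped[0], "") == ""


if __name__ == "__main__":
    for candidate in ["000/000/0", "11111", "99999", "12345", "000/100/0"]:
        print(candidate, is_single_repeated_char(candidate))
```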

Try this:
SELECT *
FROM XXXX
WHERE value IN ('000/000/0', '11111', '99999', '0000')
If the values are supplied by an application or another third party, you can use a stored procedure like the one below:
CREATE PROC dbo.usp_ListOfNumbers @NumberValues NVARCHAR(200)
AS
BEGIN
SELECT *
FROM XXXX
WHERE value = @NumberValues
END
To call it, just use:
EXEC dbo.usp_ListOfNumbers @NumberValues = '000/000/0'

How to identify and redact all instances of a matching pattern in T-SQL

I have a requirement to run a function over certain fields to identify and redact any numbers which are 5 digits or longer, ensuring all but the last 4 digits are replaced with *
For example: "Some text with 12345 and 1234 and 12345678" would become "Some text with *2345 and 1234 and ****5678"
I've used PATINDEX to identify the starting character of the pattern:
PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', TEST_TEXT)
I can recursively call that to get the starting character of all the occurrences, but I'm struggling with the actual redaction.
Does anyone have any pointers on how this can be done? I know to use REPLACE to insert the *s where they need to be, it's just the identification of what I should actually be replacing I'm struggling with.
Could do it on a program, but I need it to be T-SQL (can be a function if needed).
Any tips greatly appreciated!

You can do this using the built-in functions of SQL Server. All of the functions used in this example are present in SQL Server 2008 and higher.
DECLARE @String VARCHAR(500) = 'Example Input: 1234567890, 1234, 12345, 123456, 1234567, 123asd456'
DECLARE @StartPos INT = 1, @EndPos INT = 1;
DECLARE @Input VARCHAR(500) = ISNULL(@String, '') + ' '; --Sets the input and adds a control character at the end to make the loop easier.
DECLARE @OutputString VARCHAR(500) = ''; --Initialize an empty string to avoid NULL concatenation problems
WHILE (@StartPos <> 0)
BEGIN
    SET @StartPos = PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @Input);
    IF @StartPos <> 0
    BEGIN
        SET @OutputString += SUBSTRING(@Input, 1, @StartPos - 1); --Separate all contents before the first occurrence of our filter
        SET @Input = SUBSTRING(@Input, @StartPos, 500); --Keep everything from the match to the end. The length must be at least the original string length so nothing is cut off.
        SET @EndPos = (PATINDEX('%[0-9][0-9][0-9][0-9][^0-9]%', @Input)); --First occurrence of 4 digits followed by a non-digit.
        SET @Input = STUFF(@Input, 1, (@EndPos - 1), REPLICATE('*', (@EndPos - 1))); --@EndPos - 1 is the number of characters we want to replace.
    END
END
SET @OutputString += @Input; --Append the last element
SET @OutputString = LEFT(@OutputString, LEN(@OutputString)) --LEN ignores trailing spaces, so this drops the control character
SELECT @OutputString;
Which outputs the following:
Example Input: ******7890, 1234, *2345, **3456, ***4567, 123asd456
This entire code could also be made as a function since it only requires an input text.
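To sanity-check the expected output independently of the T-SQL loop, here is a small Python sketch of the same redaction rule (mask all but the last 4 digits of any run of 5 or more digits). It uses a regular expression rather than the PATINDEX loop, so treat it as a reference for the expected results only; the function name is made up.

```python
import re


def redact_long_numbers(text: str) -> str:
    """Mask all but the last 4 digits of every run of 5 or more digits."""

    def mask(match: "re.Match[str]") -> str:
        digits = match.group(0)
        return "*" * (len(digits) - 4) + digits[-4:]

    # [0-9]{5,} matches a whole run of digits, so each run is masked once.
    return re.sub(r"[0-9]{5,}", mask, text)


if __name__ == "__main__":
    print(redact_long_numbers("Some text with 12345 and 1234 and 12345678"))
```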

A quick and dirty solution with a recursive CTE:
DECLARE
@tags nvarchar(max) = N'Some text with 12345 and 1234 and 12345678';
WITH Process (s, i)
as
(
SELECT @tags, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', @tags)
UNION ALL
SELECT value, PATINDEX('%[0-9][0-9][0-9][0-9][0-9]%', value)
FROM
(SELECT SUBSTRING(s,0,i)+'*'+SUBSTRING(s,i+1,len(s)) value
FROM Process
WHERE i >0) calc
-- each step masks the first digit of the leftmost run of 5 or more digits,
-- so the recursion stops once no such run remains
)
SELECT * FROM Process
WHERE i=0
I think a better solution is to add a CLR function to MS SQL Server to handle regular expressions.
sql-clr/RegEx

Here is an option using DelimitedSplit8K_LEAD, which can be found here: https://www.sqlservercentral.com/articles/reaping-the-benefits-of-the-window-functions-in-t-sql-2 This is an extension of Jeff Moden's splitter that is even a little bit faster than the original. The big advantage this splitter has over most of the others is that it returns the ordinal position of each element. One caveat is that I am splitting on a space, based on your sample data. If you have numbers crammed in the middle of other characters, this will ignore them. That may be good or bad depending on your specific requirements.
declare @Something varchar(100) = 'Some text with 12345 and 1234 and 12345678';
with MyCTE as
(
    select x.ItemNumber
        , Result = isnull(case when TRY_CONVERT(bigint, x.Item) is not null then isnull(replicate('*', len(convert(varchar(20), TRY_CONVERT(bigint, x.Item))) - 4), '') + right(convert(varchar(20), TRY_CONVERT(bigint, x.Item)), 4) end, x.Item)
    from dbo.DelimitedSplit8K_LEAD(@Something, ' ') x
)
select Output = stuff((select ' ' + Result
    from MyCTE
    order by ItemNumber
    FOR XML PATH('')), 1, 1, '')
This produces: Some text with *2345 and 1234 and ****5678

find the end point of a pattern in SQL server

There is a comma separated string in a column which looks like
test=1,value=2.2,system=321
I want to extract the value from the string. I can use select PatIndex('%value=%',columnName) and then LEFT, but this only finds the beginning of the pattern.
How can I identify the end of the pattern value=%, so that I can extract the value?

Chain a few SUBSTRING calls with CHARINDEX and your PATINDEX.
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321'
SELECT
    Original = @text,
    Parsed = SUBSTRING( -- Get a portion of the original value
        @text,
        PATINDEX('%value=%',@text) + 6, -- ... starting right after the 'value=' (6 characters long)
        -1 + CHARINDEX( -- ... and take as many characters as there are before the first comma
            ',',
            SUBSTRING( -- ... (find the comma starting from the 'value=' onwards)
                @text,
                PATINDEX('%value=%',@text) + 6,
                100)))
Result:
Original Parsed
test=1,value=2.2,system=321 2.2
Note that the CHARINDEX will fail if there is no comma after your value=. You can filter this with a WHERE.
I strongly suggest storing your values already split in a proper table, so you won't have to deal with string nightmares like this.
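The find-start-then-find-end idea, including the missing-comma case mentioned above, can be sketched in Python as follows. This is just an illustration of the logic with hypothetical names; like the PATINDEX version, it matches the first occurrence of "value=" anywhere in the string.

```python
from typing import Optional


def extract_value(text: str, key: str = "value") -> Optional[str]:
    """Return the text between 'key=' and the next comma (or the end of the string)."""
    marker = key + "="
    start = text.find(marker)
    if start == -1:
        return None  # the key is not present at all
    start += len(marker)  # skip past 'value='
    end = text.find(",", start)
    if end == -1:
        end = len(text)  # no trailing comma: the value runs to the end
    return text[start:end]


if __name__ == "__main__":
    print(extract_value("test=1,value=2.2,system=321"))
    print(extract_value("test=1,value=2.2"))
```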

You can use CHARINDEX with starting position to find the first comma after the pattern. CROSS APPLY is used to keep the query easier to read:
WITH tests(str) AS (
SELECT 'test=1,value=2.2,system=321'
)
SELECT str, substring(str, pos1, pos2 - pos1) AS match
FROM tests
CROSS APPLY (SELECT PATINDEX('%value=%', str) + 6) AS ca1(pos1)
CROSS APPLY (SELECT CHARINDEX(',', str, pos1 + 1)) AS ca2(pos2)
-- 2.2

First of all, don't store denormalized data in this way, if you want to query them. SQL, the language, isn't good at string manipulation. Parsing and splitting strings can't take advantage of indexes either, which means any query that tried to find eg all records that refer to system 321 would have to scan and parse all rows.
SQL Server 2016 and JSON
SQL Server 2016 added support for JSON and the STRING_SPLIT function. Earlier versions already provided the XML type. It's better to store complex values as JSON or XML instead of trying to parse the string.
One option is to convert the string into a JSON object and retrieve the value contents, e.g.:
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321'
select json_value('{"' + replace(replace(@text,',','","'),'=','":"') + '"}','$.value')
This returns 2.2.
The replacements converted the original string into
{"test":"1","value":"2.2","system":"321"}
JSON_VALUE(@json,'$.value') will return the value property of that object.
Earlier SQL Server versions
In earlier SQL Server versions, you can convert that string into an XML element the same way and use XQuery:
DECLARE @text VARCHAR(100) = 'test=1,value=2.2,system=321';
declare @xml varchar(100)='<r ' + replace(replace(@text,',','" '),'=',' ="') + '" />';
select @xml
select cast(@xml as xml).value('(/r[1]/@value)','varchar(20)')
In this case @xml contains:
<r test ="1" value ="2.2" system ="321" />
The query result is 2.2

You can try the following.
DECLARE @xml AS XML
SELECT @xml = Cast(( '<X>' + Replace(txt, ',', '</X><X>') + '</X>' ) AS XML)
FROM (VALUES ('test=1,value=2.2,system=321')) v(txt)
SELECT LEFT(value, Charindex('=', value) - 1) AS LeftPart,
       RIGHT(value, Charindex('=', Reverse(value)) - 1) AS RightPart
FROM (SELECT n.value('.', 'varchar(100)') AS value
      FROM @xml.nodes('X') AS T(n))T
Output
+----------+-----------+
| LeftPart | RightPart |
+----------+-----------+
| test | 1 |
+----------+-----------+
| value | 2.2 |
+----------+-----------+
| system | 321 |
+----------+-----------+

You can try the below query if you are using SQL Server (2016 or above)
SELECT RIGHT(Value,CHARINDEX('=',REVERSE(Value))-1) FROM YourTableName
CROSS APPLY STRING_SPLIT ( ColumnName , ',' )
WHERE Value Like 'Value=%'

Get part of the string between 2 different strings

I'm using SQL-Server 2008 R2.
First of all, I know that storing strings like this is very bad practice, but as a SQL developer I am not able to change it: third-party software generates the output and inserts it into the database like this.
Explanation
Sample value looks like:
Name: 'Document No. 996'
Unique No: 'A 54 x. 488sCHU'
No 2: 'RF123456789'
String 'This is dynamic text' value 'test' wrong data
Values 'ETC1 ETC2'.
Note: this is 1 value (1 column, 1 row)
As you can see above, the structure is: the word Name, followed by a colon, then a document number in single quotes, then a line break, and so on.
What I need (desired results)
I need to extract from that string this part: String 'This is dynamic text'.
This part always starts with word String, after it will be 1 space and in single quotes will be some text.
So it looks like I have to look between 2 strings: the first would be String ' and the second '.
I probably have to use SUBSTRING and CHARINDEX, but I can't get it to work.
What I've tried
Here is sample data and what I've tried, without success:
DECLARE @c varchar(100)
SET @c = 'Name: ''Document No. 996''
Unique No: ''A 54 x. 488sCHU''
No 2: ''RF123456789''
String ''This is dynamic text'' value ''test'' wrong data
Values ''ETC1 ETC2''.'
SELECT SUBSTRING(STUFF(@c, 1, CHARINDEX('String ''',@c), ''), 0, CHARINDEX('''', STUFF(@c, 1, CHARINDEX('String ''',@c), '')))

You can use this:
DECLARE @c varchar(1000)
SET @c = 'Name: ''Document No. 996''
Unique No: ''A 54 x. 488sCHU''
No 2: ''RF123456789''
String ''This is dynamic text'' value ''test'' wrong data
Values ''ETC1 ETC2''.'
SELECT SUBSTRING( @c, CHARINDEX('String ''',@c) , CHARINDEX('''', @c, CHARINDEX('String ''',@c)+8 ) - CHARINDEX('String ''',@c)+1)
Result:
String 'This is dynamic text'

DECLARE @c varchar(255) --100 will truncate your string
SET @c = 'Name: ''Document No. 996''
Unique No: ''A 54 x. 488sCHU''
No 2: ''RF123456789''
String ''This is dynamic text'' value ''test'' wrong data
Values ''ETC1 ETC2''.'
Here is the solution split into two parts for better understanding. The first part finds the substring that starts right after the String keyword and its opening quote and runs to the end of the original string. We store it in @c1 so we can use it twice. The second part finds the next ' within @c1 and cuts off everything to the right of it.
DECLARE @c1 Varchar(255)
SELECT @c1 = SUBSTRING(@c, CHARINDEX('String ''',@c) + 8, 255)
--This is dynamic text' value 'test' wrong data Values 'ETC1 ETC2'.
SELECT LEFT(@c1, CHARINDEX('''',@c1) - 1)
--This is dynamic text
All put together in a single query:
SELECT LEFT(SUBSTRING(@c, CHARINDEX('String ''',@c) + 8, 255), CHARINDEX('''',SUBSTRING(@c, CHARINDEX('String ''',@c) + 8, 255)) - 1)
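The same two-step approach (skip past the start marker, then cut at the next closing quote) can be sketched in Python for quick experimentation. The function name and parameters are made up for this example, and returning None when a marker is missing is my own choice rather than something the T-SQL does.

```python
from typing import Optional

SAMPLE = (
    "Name: 'Document No. 996'\n"
    "Unique No: 'A 54 x. 488sCHU'\n"
    "No 2: 'RF123456789'\n"
    "String 'This is dynamic text' value 'test' wrong data\n"
    "Values 'ETC1 ETC2'."
)


def text_between(text: str, start_marker: str, end_marker: str) -> Optional[str]:
    """Return the text after start_marker up to the next end_marker, or None."""
    start = text.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)  # step 1: skip past "String '"
    end = text.find(end_marker, start)  # step 2: next quote after that point
    if end == -1:
        return None
    return text[start:end]


if __name__ == "__main__":
    print(text_between(SAMPLE, "String '", "'"))
```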

Not sure, but you may be looking for something like the following:
DECLARE @DATA NVARCHAR(MAX);
SET @DATA = 'Name: ''Document No. 996''
Unique No: ''A 54 x. 488sCHU''
No 2: ''RF123456789''
String ''This is dynamic text'' value ''test'' wrong data
Values ''ETC1 ETC2''.';
SELECT SUBSTRING(SUBSTRING(@DATA, CHARINDEX('String', @DATA), CHARINDEX('Values', @DATA)-CHARINDEX('String', @DATA)), 1, CHARINDEX('''', SUBSTRING(SUBSTRING(@DATA, CHARINDEX('String', @DATA), CHARINDEX('Values', @DATA)-CHARINDEX('String', @DATA)), CHARINDEX('''', SUBSTRING(@DATA, CHARINDEX('String', @DATA), CHARINDEX('Values', @DATA)-CHARINDEX('String', @DATA)))+1, LEN(SUBSTRING(@DATA, CHARINDEX('String', @DATA), CHARINDEX('Values', @DATA)-CHARINDEX('String', @DATA)))))+CHARINDEX('''', SUBSTRING(@DATA, CHARINDEX('String', @DATA), CHARINDEX('Values', @DATA)-CHARINDEX('String', @DATA))));
Result :
String 'This is dynamic text'

Determine if zip code contains numbers only

I have a field called zip, type char(5), which contains zip codes like
12345
54321
ABCDE
I'd like to check with an sql statement if a zip code contains numbers only.
The following isn't working
SELECT * FROM S1234.PERSON
WHERE ZIP NOT LIKE '%'
It can't work because even '12345' is an "array" of characters (it matches '%', right?).
I found out that the following is working:
SELECT * FROM S1234.PERSON
WHERE ZIP NOT LIKE ' %'
It has a space before %. Why is this working?

If you use SQL Server 2012 or later, the following script should work.
DECLARE @t TABLE (Zip VARCHAR(10))
INSERT INTO @t VALUES ('12345')
INSERT INTO @t VALUES ('54321')
INSERT INTO @t VALUES ('ABCDE')
SELECT *
FROM @t AS t
WHERE TRY_CAST(Zip AS NUMERIC) IS NOT NULL

Using the answer from here to check whether all characters are digits:
SELECT col1,col2
FROM
(
SELECT col1,col2,
CASE
WHEN LENGTH(RTRIM(TRANSLATE(ZIP , '*', ' 0123456789'))) = 0
THEN 0 ELSE 1
END as IsAllDigit
FROM S1234.PERSON
) AS Z
WHERE IsAllDigit=0
DB2 does not have a regular expression facility like MySQL's REGEXP.

Use the ISNUMERIC function.
ISNUMERIC returns 1 if the parameter can be converted to a numeric type and 0 if it cannot. Note that it also returns 1 for values such as '1e5' or '$', so it is not a strict digits-only test.
Example:
SELECT * FROM S1234.PERSON
WHERE ISNUMERIC(ZIP) = 1
Your statement doesn't test for numbers; it returns everything that doesn't start with a space.

Let's suppose your ZIP code is a US zip code, composed of 5 digits.
db2 "with val as (
select *
from S1234.PERSON t
where xmlcast(xmlquery('fn:matches(\$ZIP,''^\d{5}$'')') as integer) = 1
)
select * from val"
For more information about xQuery:fn:matches: http://pic.dhe.ibm.com/infocenter/db2luw/v10r5/topic/com.ibm.db2.luw.xml.doc/doc/xqrfnmat.html

MySQL does not have a native ISNUMERIC() function. This would be pretty straightforward in Excel with the ISNUMBER() function, or in T-SQL with ISNUMERIC(), but neither works in MySQL, so after a little searching around I came across this solution:
SELECT * FROM S1234.PERSON
WHERE ZIP REGEXP '^[0-9]+$'
Effectively we're running a regular expression over the contents of the 'ZIP' field. The ^ and $ anchors matter: without them the pattern matches any value that merely contains a digit. It may seem like using a sledgehammer to crack a nut, and I've no idea how performance would differ from a simpler approach, but it worked and I guess that's the point.
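To illustrate why anchoring matters for a digits-only test, here is a short Python sketch comparing an unanchored search with a full match. It uses Python's re module rather than MySQL's REGEXP, so it only demonstrates the matching behaviour, and the function names are invented for the example.

```python
import re


def contains_digit(zip_code: str) -> bool:
    """Unanchored test: True as soon as any digit appears anywhere."""
    return re.search(r"[0-9]", zip_code) is not None


def is_all_digits(zip_code: str) -> bool:
    """Anchored test: True only when every character is a digit (and there is at least one)."""
    return re.fullmatch(r"[0-9]+", zip_code) is not None


if __name__ == "__main__":
    for zip_code in ["12345", "54321", "ABCDE", "A2345"]:
        print(zip_code, contains_digit(zip_code), is_all_digits(zip_code))
```

Note how 'A2345' passes the unanchored test but fails the anchored one.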

I have made a less error-prone version based on the solution https://stackoverflow.com/a/36211270/565525, adding the intermediate result and some examples:
select
test_str
, TRIM(TRANSLATE(replace(trim(test_str), ' ', 'x'), 'yyyyyyyyyyy', '0123456789'))
, case when length(TRIM(TRANSLATE(replace(trim(test_str), ' ', 'x'), 'yyyyyyyyyyy', '0123456789')))=5 then '5-digit-zip' else 'not 5d-zip' end is_zip
from (VALUES
(' 123 ' )
,(' abc ' )
,(' a12 ' )
,(' 12 3 ')
,(' 99435 ')
,('99323' )
) AS X(test_str)
;
The result for this example set is:
TEST_STR 2 IS_ZIP
-------- -------- -----------
123 yyy not 5d-zip
abc abc not 5d-zip
a12 ayy not 5d-zip
12 3 yyxy not 5d-zip
99435 yyyyy 5-digit-zip
99323 yyyyy 5-digit-zip

Try checking if there's a difference between lower case and upper case. Numerics and special chars will look the same:
SELECT *
FROM S1234.PERSON
WHERE UPPER(ZIP COLLATE Latin1_General_CS_AI ) = LOWER(ZIP COLLATE Latin1_General_CS_AI)

Here's a working example for the case where you'd want to check zip codes in a range. You could use this code for inspiration to make a simple single post code check, if you want:
if local_test_environment?
# SQLite supports GLOB which is similar to LIKE (which it only has limited support for), for matching in strings.
where("(zip_code NOT GLOB '*[^0-9]*' AND zip_code <> '') AND (CAST(zip_code AS int) >= :range_start AND CAST(zip_code AS int) <= :range_finish)", range_start: range_start, range_finish: range_finish)
else
# SQLServer supports LIKE with more advanced matching in strings than what SQLite supports.
# SQLServer supports TRY_PARSE which is non-standard SQL, but fixes the error SQLServer gives with CAST, namely: Conversion failed when converting the nvarchar value 'US-19803' to data type int.
where("(zip_code NOT LIKE '%[^0-9]%' AND zip_code <> '') AND (TRY_PARSE(zip_code AS int) >= :range_start AND TRY_PARSE(zip_code AS int) <= :range_finish)", range_start: range_start, range_finish: range_finish)
end

Use a regex, anchored so that the whole value must consist of digits:
SELECT * FROM S1234.PERSON
WHERE ZIP REGEXP '^[0-9]+$'
