Strip out non-numeric characters in SELECT - sql

In an MS Access 2007 project report, I have the following (redacted) query:
SELECT SomeCol FROM SomeTable
The problem is, that SomeCol apparently contains some invisible characters. For example, I see one result returned as 123456 but SELECT LEN(SomeCol) returns 7. When I copy the result to Notepad++, it shows as ?123456.
The column is set to TEXT. I have no control over this data type, so I can't change it.
How can I modify my SELECT query to strip out anything non-numeric. I suspect RegEx is the way to go... alternatively, is there a CAST or CONVERT function?

You mentioned using a regular expression for this. It is true that Access' db engine doesn't support regular expressions directly. However, it seems you are willing to use a VBA user-defined function in your query ... and a UDF can use a regular expression approach. That approach should be simple, easy, and faster performing than iterating through each character of the input string and storing only those characters you want to keep in a new output string.
Public Function OnlyDigits(ByVal pInput As String) As String
Static objRegExp As Object
If objRegExp Is Nothing Then
Set objRegExp = CreateObject("VBScript.RegExp")
With objRegExp
.Global = True
.Pattern = "[^\d]"
End With
End If
OnlyDigits = objRegExp.Replace(pInput, vbNullString)
End Function
Here is an example of that function in the Immediate window with "x" characters as proxies for your invisible characters. (Any characters not included in the "digits" character class will be discarded.)
? OnlyDigits("x1x23x")
123
If that is the output you want, just use the function in your query.
SELECT OnlyDigits(SomeCol) FROM SomeTable;

There is no RegEx in Access, at least not in SQL. If you venture to VBA, you might as well use a custom StripNonNumeric VBA function in the SQL statement.
e.g. SELECT StripNonNumeric(SomeCol) as SomeCol from SomeTable
Function StripNonNumeric(str)
keep = "0123456789"
outstr = ""
For i = 1 to len(str)
strChar = mid(str,i,1)
If instr(keep,strChar) Then
outstr = outstr & strChar
End If
Next
StripNonNumeric = outstr
End Function

You can do it all in a query, combining this question with your previous question, you get:
SELECT IIf(IsNumeric([atext]),
IIf(Len([atext])<4,Format([atext],"000"),
Replace(Format(Val([atext]),"#,###"),",",".")),
IIf(Len(Mid([atext],2))<4,Format(Mid([atext],2),"000"),
Replace(Format(Val(Mid([atext],2)),"#,###"),",","."))) AS FmtNumber
FROM Table AS t;

Public Function fExtractNumeric(strInput) As String
' Returns the numeric characters within a string in
' sequence in which they are found within the string
Dim strResult As String, strCh As String
Dim intI As Integer
If Not IsNull(strInput) Then
For intI = 1 To Len(strInput)
strCh = Mid(strInput, intI, 1)
Select Case strCh
Case "0" To "9"
strResult = strResult & strCh
Case Else
End Select
Next intI
End If
fExtractNumeric = strResult
End Function

Related

Look for multiple words in string

Need to build a VBA function that simulates the REGEXP_INSTR or REGEXP_LIKE in Oracle. Those make possible to look for words in string without having to loop word by word.
I've found this code, that find Names that starts with "Mr|Mrs|Ms|Dr", that meant to be used like:
Function StringStarts(strCheck As String, options As String) As Boolean
With CreateObject("VBScript.RegExp")
.IgnoreCase = True
.Pattern = "^(" & options & ")\.*\b"
StringStarts = .Test(strCheck)
End With
End Function
Debug.print StringStarts("Dr leopoldo malmeida", "Mr|Mrs|Ms|Dr")
In fact, I need help to find if it's possible to alter this function, in order to find any word, or parts of words (case-insensitive matching), in the pattern on any location of the string to search. For example:
Debug.print StringStarts("Looking for multiple words", "Word|like|for")
That should return true: "Word found in 'words'"; "for was a complete match in string".
Most probably RegEx is not needed, because it is a bit slow. A simple for-each loop in a function would be ok:
Public Function PatternPresent(testedString As String, _
Optional pattern As String = "Mr|Mrs|Ms|Dr") As Boolean
Const separator = "|"
Dim patterns As Variant: patterns = Split(pattern, separator)
Dim myVar As Variant
For Each myVar In patterns
If InStr(1, testedString, myVar) Then
PatternPresent = True
Exit Function
End If
Next myVar
End Function

How to split a string in VBA by more than one character

In C# one can easily split a split string by more than one character, one supplies an array of split characters. I was wondering what is best way to achieve this in VBA. I use VBA.Split typically but to split on more than one characters requires drilling in to the results and sub-splitting the elements. Then one has to re-dimension arrays etc. Quite painful.
Contraints
VBA responses only please. You may use .NET collection classes if you wish (yes they are creatable and callable in VBA). You may use JSON, XML as vessels for the list of split segments if you wish. You may use the humble VBA.Collection class if you wish, or even a Scripting.Dictionary. You may use even a fabricated recordset if you wish.
I know full well one can write a .NET asssembly to call the .NET String.Split method and expose assembly to VBA with COM interfaces but where is the challenge in that.
This should be fairly easy to do with a regular expression. If you match on the negation of the passed characters to split on, the matches will be the members of the output array. The upside to doing this is that the output array only needs to be sized once because you can get a count of the matches returned by the RegExp. The pattern is fairly simple to build - it boils down to something like [^abc]+ where 'a', 'b', and 'c' are the characters to split on. About the only thing that you need to do to prepare the expression is to escape a couple characters that have special meaning in that context inside a regular expression (I probably forgot some):
Private Function BuildRegexPattern(ByVal inputString As String) As String
Dim escapeTargets() As String
escapeTargets = VBA.Split("- ^ \ ]")
Dim returnValue As String
returnValue = inputString
Dim idx As Long
For idx = LBound(escapeTargets) To UBound(escapeTargets)
returnValue = Replace$(returnValue, escapeTargets(idx), "\" & escapeTargets(idx))
Next
BuildRegexPattern = "[^" & returnValue & "]+"
End Function
Once you have the pattern, it's just a simple matter of sizing the array and iterating over the matches to assign them (plus some other special case handling, etc.):
Public Function MultiSplit(ByVal toSplit As String, Optional ByVal delimiters As String = " ") As String()
Dim returnValue() As String
If toSplit = vbNullString Then
returnValue = VBA.Split(vbNullString)
Else
With New RegExp
.Pattern = BuildRegexPattern(IIf(delimiters = vbNullString, " ", delimiters))
.MultiLine = True
.Global = True
If Not .Test(toSplit) Then
'Only delimiters.
ReDim returnValue(Len(toSplit) - 1)
Else
Dim matches As Object
Set matches = .Execute(toSplit)
ReDim returnValue(matches.Count - 1)
Dim idx As Long
For idx = LBound(returnValue) To UBound(returnValue)
returnValue(idx) = matches(idx)
Next
End If
End With
End If
MultiSplit = returnValue
End Function
In my attempt, I replace all the other characters with space before splitting on space. (So I cheat a little.)
Private Function SplitByMoreThanOneChars(ByVal sLine As String)
'*
'* Brought to you by the Excel Development Platform Blog
'* http://exceldevelopmentplatform.blogspot.com/2018/11/
'*
'* Don't get excited, this splits by spaces only
'* we fake splitting by multiple characters by replacing those characters
'* with spaces
'*
Dim vChars2 As Variant
vChars2 = Array(" ", "<", ">", "[", "]", "(", ")", ";")
Dim sLine2 As String
sLine2 = sLine
Dim lCharLoop As Long
For lCharLoop = LBound(vChars2) To UBound(vChars2)
Debug.Assert Len(vChars2(lCharLoop)) = 1
sLine2 = VBA.Replace(sLine2, vChars2(lCharLoop), " ")
Next
SplitByMoreThanOneChars = VBA.Split(sLine2)
End Function

vb.net if statement between range of numbers

is there a way in vb.net that i can run an if statement to say if a variable starts with 07 then between 1-9
i know i could do this using substring which would work but it would make the if statement rather large
number_called.Substring(0, 3) = "071" or number_called.Substring(0, 3) = "072"
and so on up to 079 but can i create a shorter if statement for the whole range?
This would do it
Private Function CheckNumber(myNumber As String) As Boolean
Dim regex As Regex = New Regex("^07[1-9]]")
Dim match As Match = regex.Match(myNumber)
Return match.Success
End Function
Just call CheckNumber("071") or CheckNumber(number_called)
Remember to import the references Imports System.Text.RegularExpressions
Updated Expression, thank you Veeke
You could use String.StartsWith("07") and check the last character of the String - it must be a number and not 0 like this:
If str.Length = 3 And str.StartsWith("07") And Char.IsNumber(str(2)) And str(2) <> "0" Then
End If
If you know that it will always start with 3 numbers you can parse it
Dim num = Int32.Parse(number_called.Substring(0, 3))
Dim Valid= num>69 and num<80
If you dont know if it will start with 3 numbers, surround it with a TryCatch
Small correction on Malcor's post, which doesn't check if it begins with '07' (just if it contains '07'):
Private Function CheckNumber(myNumber As String) As Boolean
Dim regex As Regex = New Regex("^07[1-9]")
Dim match As Match = regex.Match(myNumber)
Return match.Success
End Function
You could use StartsWith coupled with a Select Case if you are sure it is always a numeric string:
If number_called.StartsWith("07") Then
Select Case CInt(number_called.SubString(2, 1))
Case 1 to 9
'it matched
Case Else
'it didn't match
End Select
End If

How to extract numbers UNTIL a space is reached in a string using Excel 2010?

I need to pull the code from the following string: 72381 Test 4Dx for Worms. The code is 72381 and the function that I'm using does a wonderful job of pulling ALL the numbers from a string and gives me back 723814, which pulls the 4 from the description of the code. The actual code is only the 72381. The codes are of varying length and are always followed by a space before the description begins; however there are spaces in the descriptions as well. This is the function I am using that I found from a previous search:
Function OnlyNums(sWord As String)
Dim sChar As String
Dim x As Integer
Dim sTemp As String
sTemp = ""
For x = 1 To Len(sWord)
sChar = Mid(sWord, x, 1)
If Asc(sChar) >= 48 And _
Asc(sChar) <= 57 Then
sTemp = sTemp & sChar
End If
Next
OnlyNums = Val(sTemp)
End Function
If the first character in the description part of your string is never numeric, you could use the VBA Val(string) function to return all of the numeric characters before the first non-numeric character.
Function GetNum(sWord As String)
GetNum = Val(sWord)
End Function
See the syntax of the Val(string) function for full details of it's usage.
You're looking for the find function.. Example:
or in VBA instr() and left()
Since you know the pattern is always code followed by space just use left of the string for the number of characters to the first space found using instr. Sample in immediate window above. Loop is going to be slow, and while it may validate they are numeric why bother if you know pattern is code then space?
In similar situations in C# code, I leave the loop early after finding the first instance of a space character (32). In VBA, you'd use Exit For.
You can get rid of the function altogether and use this:
split("72381 Test 4Dx for Worms"," ")(0)
This will split the string into an array using " " as the split char. Then it shows us address 0 in the array (the first element)
In the context of your function if you are dead set on using one it is this:
Function OnlyNums(sWord As String)
OnlyNums = Split(sWord, " ")(0)
End Function
While I like the simplicity of Mark's solution, you could use an efficient parser below to improve your character by character search (to cope with strings that don't start with numbers).
test
Sub test()
MsgBox StrOut("72381 Test 4Dx")
End Sub
code
Function StrOut(strIn As String)
Dim objRegex As Object
Set objRegex = CreateObject("vbscript.regexp")
With objRegex
.Pattern = "^(\d+)(\s.+)$"
If .test(strIn) Then
StrOut = .Replace(strIn, "$1")
Else
StrOut = "no match"
End If
End With
End Function

Passing a string to a database with ToUpper? VB.NET 2010

I'm trying to validate a textbox. If user enters all lower case characters then convert it to all upper case. So when its added to the database its all upper case. I need help with my function, I don't know if its a syntax error or something, here it is. I'm using VB.Net 2010
Public Function CheckLetters(ByVal strIn As String) As Boolean
Dim i As Integer
Dim strOne As String
For i = 0 To Len(strIn) - 1
strOne = strIn.Substring(i, 1).ToUpper
Select Case strOne
Case "A" To "Z"
Case Else
Return False
End Select
Next
Return True
End Function
How 'bout this:
Return Regex.IsMatch(strIn, "^[a-z]*$")
or this
Return strIn.All(Function(c) Char.IsLower(c))
There's no good reason for this method to be anything other than a one-liner. But if you really want to loop through the characters:
For Each c As Char In strIn
If Not Char.IsLower(c) Then Return False
Next c
Return True
Your code runs just fine here (VB does not require the brackets after ToUpper, but it's good for readability) - it accepts a string and returns True if that string only contains the characters "A" through "Z", false otherwise. The function doesn't convert anything - just returns that Boolean result.
If you want to actually convert the string to upper case, just use .ToUpper() on the string itself, e.g.
Dim MyString As String = "Contains some text"
MyString = MyString.ToUpper()
' Above would set MyString to "CONTAINS SOME TEXT"