Stripping out Non Numerical Characters using a Query - sql

I'm a light user of MS Access, not a power user by any means.
I have two tables. One holds a username in the form XXX99999 (3 letters followed by 5 digits); the other holds just 99999 (5 digits).
They refer to one and the same user, so for the most part I can safely drop the first 3 letters and link the tables on the 5 digits alone.
I imagine doing this with a query.
My question is: how would I mask the username to build that query?
All the 5-digit values are unique.

If you take the function by @paxdiablo here (VBA: How to Find Numbers from String), which is
Public Function onlyDigits(s As String) As String
    ' Variables needed (remember to use "option explicit"). '
    Dim retval As String ' This is the return string. '
    Dim i As Integer ' Counter for character position. '
    ' Initialise return string to empty '
    retval = ""
    ' For every character in input string, copy digits to '
    ' return string. '
    For i = 1 To Len(s)
        If Mid(s, i, 1) >= "0" And Mid(s, i, 1) <= "9" Then
            retval = retval + Mid(s, i, 1)
        End If
    Next
    ' Then return the return string. '
    onlyDigits = retval
End Function
paste it into a Module and save it, you should be able to link both tables in a query like this (assuming Table1 has the numbers-only field and Table2 has the field with letters and numbers):
SELECT Table1.MyField1, Table2.MyField2
FROM Table1 INNER JOIN Table2
ON CStr(Table1.OnlyNumbersField) = onlyDigits(Table2.TextAndNumberField);
This will strip the alpha characters behind the scenes, make sure both datatypes are the same, then "link" them to produce the joined result in your query. I know you said you are not a power user and this may be complicated for you, but there is no "easy" way to do this. I could walk you through doing it through multiple queries, which may make more sense to you, but it is a lot more to explain.
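As a rough illustration of what the query above does behind the scenes, here is a minimal sketch in Python (not Access SQL or VBA). The names only_digits and join_on_digits and the sample usernames are hypothetical, chosen for this example only.

```python
def only_digits(text):
    """Keep only the characters '0'..'9', mirroring the VBA onlyDigits function."""
    return "".join(ch for ch in text if "0" <= ch <= "9")


def join_on_digits(numeric_keys, usernames):
    """Pair each numeric key with the usernames whose digits equal it."""
    by_digits = {}
    for name in usernames:
        by_digits.setdefault(only_digits(name), []).append(name)
    # str(key) plays the role of CStr(Table1.OnlyNumbersField) in the query
    return [(str(key), name)
            for key in numeric_keys
            for name in by_digits.get(str(key), [])]


print(join_on_digits([12345, 99999], ["ABC12345", "XYZ00001"]))
```

One thing the sketch makes visible: if the numeric column is stored as a number, a value such as 00001 becomes 1 and no longer equals the text "00001". Padding the number back to five digits before comparing (in Access, possibly with Format) would be one way around that, though it is untested here. Because the usernames have a fixed layout, taking the last five characters (Right or Mid in Access) may also be enough without a custom function.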

Related

splitting array of strings by specific length of character with condition in it

Function SplitString(ByVal str As String, ByVal numOfChar As Long) As String()
    Dim sArr() As String
    Dim nCount As Long
    ReDim sArr((Len(str) - 1) \ numOfChar)
    Do While Len(str)
        sArr(nCount) = Left$(str, numOfChar)
        str = Mid$(str, numOfChar + 1)
        nCount = nCount + 1
    Loop
    SplitString = sArr
End Function
Hi, I was trying to split a string inside a cell into pieces of a specific length with the above code, and it works.
But now take this input as an example: "abcd,efghijk,lmn,opqrs,tuvw". Splitting it into substrings of 8 characters each gives "abcd,efg", "hijk,lmn", ",opqrs,t" and "uvw" (the remaining characters). What I need instead is for every piece to end at a comma: a word that does not fit completely within the 8 characters should move to the next piece, even if that leaves the current piece shorter than 8 characters. For the example above the result should be "abcd", "efghijk", "lmn", "opqrs" and "tuvw". A piece may hold several comma-separated words if together they fit within the 8-character limit. Please advise.
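One way to read the requirement is as greedy packing: keep adding comma-separated words to the current piece while the piece, commas included, stays within the limit. The sketch below shows that idea in Python rather than VBA. The function name split_on_commas is hypothetical, and it assumes that a single word longer than the limit is emitted whole rather than cut.

```python
def split_on_commas(text, limit):
    """Greedily pack comma-separated words into pieces of at most `limit` characters."""
    if not text:
        return []
    pieces = []
    current = []  # words collected for the piece being built
    for word in text.split(","):
        candidate = ",".join(current + [word])
        if current and len(candidate) > limit:
            # The word does not fit: close the current piece and start a new one.
            pieces.append(",".join(current))
            current = [word]
        else:
            current.append(word)
    pieces.append(",".join(current))
    return pieces


print(split_on_commas("abcd,efghijk,lmn,opqrs,tuvw", 8))
print(split_on_commas("ab,cd,ef,gh", 8))
```

The same loop should port to VBA with Split and Join, collecting the pieces in a dynamic array.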

Extract first two digits that comes after some string in Excel

I have rows with values something like the ones below. How can I extract the first two digits that come after the text 'ABCD' into another cell, with a formula or VBA? There may be a few characters in between, or sometimes none.
ABCD 10 sadkf sdfas
ABCD-20sdf asdf
ABCD 40
ABCD50 asdf
You can do this with a worksheet formula. No need for VBA.
Assuming you do not need to test for the presence of two digits:
=MID(A1,MIN(FIND({1,2,3,4,5,6,7,8,9,0},A1&"1234567890")),2)
If you need to test for the presence of two digits, you can try:
=IF(ISNUMBER(-RIGHT(MID(A1,MIN(FIND({1,2,3,4,5,6,7,8,9,0},A1&"1234567890")),2),1)),MID(A1,MIN(FIND({1,2,3,4,5,6,7,8,9,0},A1&"1234567890")),2),"Invalid")
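The formulas above find the position of the first digit of any kind and then take two characters from there. A small Python sketch of that logic (not an Excel formula; the function name is hypothetical) may make the trick easier to follow, and it also shows why the second formula checks the second character.

```python
def two_chars_from_first_digit(text):
    """Return the two characters starting at the first digit, or '' if there is no digit."""
    # Equivalent of MIN(FIND({1,2,...,0}, A1 & "1234567890")): earliest position of any digit.
    positions = [text.find(d) for d in "0123456789" if d in text]
    if not positions:
        return ""
    start = min(positions)
    return text[start:start + 2]


for sample in ["ABCD 10 sadkf sdfas", "ABCD-20sdf asdf", "ABCD 40", "ABCD50 asdf", "ABCD 4 x"]:
    print(repr(two_chars_from_first_digit(sample)))
```

For the last sample the result is a digit followed by a space, which is the case the ISNUMBER test in the second formula is meant to catch.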
In general, it is always a good idea to show some code in StackOverflow. Thus, you show that you have tried something and you give some directions for the answer.
Concerning the first two digits extract, there are many ways to do this. Starting from RegEx and finishing with a simple looping of the chars and checking each one of them.
This is the loop option:
Public Function ExtractTwoDigits(inputString As String) As Long
    Application.Volatile
    Dim cnt As Long
    Dim curChar As String
    ' Collect the digits in a String: Len() of the Long return value
    ' gives its size in bytes, not the number of digits found so far.
    Dim digits As String
    For cnt = 1 To Len(inputString)
        curChar = Mid(inputString, cnt, 1)
        If IsNumeric(curChar) Then
            digits = digits & curChar
            If Len(digits) = 2 Then
                ExtractTwoDigits = CLng(digits)
                Exit Function
            End If
        End If
    Next cnt
    ' Fewer than two digits were found.
    ExtractTwoDigits = -1
End Function
Application.Volatile makes the function recalculate whenever any cell on the worksheet recalculates;
-1 is returned if the inputString does not contain two digits;
IsNumeric checks whether the character is numeric.
As a further step, you may try to make the function more flexible, extracting the first 1, 3, 4 or 5 digits depending on a parameter that you pass. Something like =ExtractTwoDigits("tarato123ra2",4), returning 1232.
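The parameterised variant suggested above could look roughly like this. It is a Python sketch rather than VBA, the name first_digits is hypothetical, and it returns a string (so leading zeros survive) or None when there are too few digits, which is a design choice of this example rather than something the answer specifies.

```python
def first_digits(text, count=2):
    """Return the first `count` digits found in `text` (they need not be adjacent)."""
    digits = [ch for ch in text if "0" <= ch <= "9"][:count]
    return "".join(digits) if len(digits) == count else None


print(first_digits("tarato123ra2", 4))
print(first_digits("ABCD-20sdf asdf"))
print(first_digits("ABCD 4"))
```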
RegEx Version:
Public Function GetFirstTwoNumbers(ByVal strInput As String) As Integer
    Dim reg As New RegExp, matches As MatchCollection
    With reg
        .Global = False          ' only the first match is needed
        .Pattern = "(\d{2})"     ' two consecutive digits
    End With
    Set matches = reg.Execute(strInput)
    If matches.Count > 0 Then
        GetFirstTwoNumbers = CInt(matches(0).Value)
    Else
        GetFirstTwoNumbers = -1
    End If
End Function
You have to enable Microsoft VBScript Regular Expressions 5.5 under Tools > References in the VBA editor. The pattern (\d{2}) matches two digits; the return value is the number, or -1 if there is none.
Note: it only extracts two consecutive digits.
If you place this function in a module, you can use it like a normal worksheet formula.
Here is a great site to get into RegEx.

Get the nth character, string, or number after delimiter in Visual Basic

In VB.NET (Visual Studio 2015), how can I get the nth string (or number) in a comma-separated list?
Say I have a comma-separated list of numbers like so:
13,1,6,7,2,12,9,3,5,11,4,8,10
How can I get, say, the 5th value in this string (counting from zero), in this case 12?
I've looked at the Split function, but it converts a string into an array. I guess I could do that and then get the 5th element of that array, but that seems like a lot to go through just to get one element. Is there a more direct way to do this, or am I pretty much limited to the Split function?
In case you are looking for an alternative method, which is more basic, you can try this:
Module Module1
    Sub Main()
        Dim a As String = "13,1,6,7,2,12,9,3,5,11,4,8,10"
        Dim counter As Integer = 5 'how many commas to skip (we want the value after the 5th comma)
        Dim movingcounter As Integer = 0 'how many commas we have passed so far
        Dim startofnumber, endofnumber, i As Integer
        Dim numberthatIwant As String
        Do Until movingcounter = counter
            startofnumber = InStr(i + 1, a, ",")
            i = startofnumber
            movingcounter = movingcounter + 1
        Loop
        endofnumber = InStr(startofnumber + 1, a, ",")
        'no comma follows the last value, so read up to the end of the string
        If endofnumber = 0 Then endofnumber = Len(a) + 1
        numberthatIwant = Mid(a, startofnumber + 1, endofnumber - startofnumber - 1)
        Console.WriteLine("The number that I want: " + numberthatIwant)
        Console.ReadLine()
    End Sub
End Module
Edit: You can make this into a procedure or function if you wish to use it in a larger program, but this code run in console mode will give the output of 12.
The solution provided by Plutonix as a comment to my question is straightforward and exactly what I was looking for, to wit:
result = csv.Split(","c)(5)
In my case I was incrementing a variable each time my program ran and needed to get the nth character or string after the incremented value. That is, if my program had incremented the variable 5 times, then I needed the string after the 4th comma, which of course, is the 5th string. So my solution was something like this:
result = WholeString.Split(","c)(IncrementedVariable)
Note that this is a zero-based index.
Thanks, Plutonix.
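The zero-based indexing is easy to get wrong, so here is a small sketch of the split-and-index approach with a bounds check. It is written in Python rather than VB.NET, and the name nth_item is hypothetical.

```python
def nth_item(csv_text, index, sep=","):
    """Return the item at zero-based `index`, or None if the list is too short."""
    parts = csv_text.split(sep)
    if 0 <= index < len(parts):
        return parts[index]
    return None


numbers = "13,1,6,7,2,12,9,3,5,11,4,8,10"
print(nth_item(numbers, 5))   # item after the 5th comma
print(nth_item(numbers, 4))   # item after the 4th comma
print(nth_item(numbers, 13))  # past the end of the list
```

Index 5 selects the item after the fifth comma, while the fifth item in everyday counting sits at index 4.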

Extract 5-digit number from one column to another

I need help with extracting 5-digit numbers only from one column to another in Excel 2010. These numbers can be in any position of the string (beginning of the string, anywhere in the middle, or at the end). They can be within brackets or quotes like:
(15478) or "15478" or '15478' or [15478]
I need to ignore any numbers that are less than 5 digits and include numbers that start with 1 or more leading zeros (like 00052, 00278, etc.) and ensure that leading zeros are copied over to the next column. Could someone help me with either creating a formula or UDF?
Here is a formula-based alternative that will extract the first 5 digit number found in cell A1. I tend to prefer reasonably simple formula solutions over VBA in most situations as formulas are more portable. This formula is an array formula and thus must be entered with Ctrl+Shift+Enter. The idea is to split the string up into every possible 5 character chunk and test each one and return the first match.
=MID(A1,MIN(IF(NOT(ISERROR(("1"&MID(A1,ROW(INDIRECT("R1C[1]:R"&(LEN(A1)-4)&"C[1]",FALSE)),5)&".1")*1))*ISERROR(MID(A1,ROW(INDIRECT("R1C[1]:R"&(LEN(A1)-4)&"C[1]",FALSE))+5,1)*1)*ISERROR(MID(A1,ROW(INDIRECT("R1C[1]:R"&(LEN(A1)-4)&"C[1]",FALSE))-1,1)*1),ROW(INDIRECT("R1C[1]:R"&(LEN(A1)-4)&"C[1]",FALSE)),9999999999)),5)
Let's break this down. First we have an expression, used several times in the formula, that returns an array of numbers from 1 up to the length of your text minus 4. So if you have a string of length 10 it returns {1,2,3,4,5,6}. Hereafter the expression below will be referred to as rowlist. I used R1C1 notation to avoid potential circular references.
ROW(INDIRECT("R1C[1]:R"&(LEN(A1)-4)&"C[1]",FALSE))
Next we use that array to split the text into an array of 5-character chunks and test each chunk. The test prepends "1" and appends ".1", then verifies that the result is numeric. The prepend and append rule out white space and decimal points inside the chunk. We then check the character before and the character after to make sure they are not digits. Hereafter the expression below will be referred to as isnumarray.
NOT(ISERROR(("1"&MID(A1,rowlist,5)&".1")*1))
*ISERROR(MID(A1,rowlist+5,1)*1)
*ISERROR(MID(A1,rowlist-1,1)*1)
Next we need to find the first valid 5 digit number in the string by returning the current index from a duplicate of the rowlist formula and returning a large number for non-matches. Then we can use the MIN function to grab that first match. Hereafter the below will be referred to as minindex.
MIN(IF(isnumarray,rowlist,9999999999))
Finally we need to grab the numeric string that started at the index returned by the MIN function.
MID(A1,minindex,5)
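The steps above amount to sliding a 5-character window over the text and accepting the first window that is all digits and has no digit on either side. A Python sketch of that idea follows; it is not an Excel formula and the function name is hypothetical.

```python
def first_five_digit_run(text, width=5):
    """Return the first run of exactly `width` digits, or None if there is none."""
    for start in range(len(text) - width + 1):        # plays the role of rowlist
        chunk = text[start:start + width]
        before = text[start - 1] if start > 0 else ""
        after = text[start + width] if start + width < len(text) else ""
        # isnumarray: the chunk is all digits and neither neighbour is a digit
        if (chunk.isascii() and chunk.isdigit()
                and not before.isdigit() and not after.isdigit()):
            return chunk                              # first match, like MIN(...)
    return None


print(first_five_digit_run("ref (15478) end"))
print(first_five_digit_run("id 123456 then 00052"))
print(first_five_digit_run("only 1234 here"))
```

Returning the chunk as text keeps leading zeros, which matches what MID does in the formula.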
The following UDF will return the first five-digit number in the string, including any leading zeros. If you need to detect whether there is more than one five-digit number, the modifications are trivial. It will return a #VALUE! error if there are no five-digit numbers.
Option Explicit
Function FiveDigit(S As String, Optional index As Long = 0) As String
    Dim RE As Object
    Set RE = CreateObject("vbscript.regexp")
    With RE
        .Pattern = "(?:\b|\D)(\d{5})(?:\b|\D)"
        .Global = True
        FiveDigit = .Execute(S)(index).submatches(0)
    End With
End Function
As you may see from the discussion between Mark and myself, some of your specifications are unclear. But if you would want to exclude decimal numbers, when the decimal portion has five digits, then the regex pattern in my code above should be changed:
.Pattern = "(?:\d+\.\d+)|(?:\b|\D)(\d{5})(?:\b|\D)"
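For comparison, regex engines that support lookbehind can state "exactly five digits" without consuming the neighbouring characters. The sketch below uses Python's re module, not VBScript regular expressions (which do not offer lookbehind, hence the (?:\b|\D) groups in the UDF). The function name is hypothetical, and unlike the modified pattern above it makes no attempt to exclude decimals.

```python
import re

# Five digits with no digit immediately before or after them.
FIVE_DIGITS = re.compile(r"(?<!\d)\d{5}(?!\d)", re.ASCII)


def five_digit_numbers(text):
    """Return every standalone five-digit number in `text`, as strings."""
    return FIVE_DIGITS.findall(text)


print(five_digit_numbers('(15478) or "00052" and 123456 or 1234'))
```

Returning strings keeps leading zeros such as 00052 intact.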
I just wrote this UDF for you. It is basic, but it will do the job.
It finds the first 5 consecutive digits in a string. The error checking is very crude: it just returns Error if anything isn't right.
Public Function GET5DIGITS(value As String) As String
    Dim sResult As String
    Dim iLen As Integer
    Dim i As Long ' loop counter (must be declared under Option Explicit)
    sResult = ""
    iLen = 0
    ' Note: a longer run such as 123456 yields its first five digits.
    For i = 1 To Len(value)
        If IsNumeric(Mid(value, i, 1)) Then
            sResult = sResult & Mid(value, i, 1)
            iLen = iLen + 1
        Else
            sResult = ""
            iLen = 0
        End If
        If iLen = 5 Then Exit For
    Next
    If iLen = 5 Then
        GET5DIGITS = Format(sResult, "00000")
    Else
        GET5DIGITS = "Error"
    End If
End Function

Shortening a repeating sequence in a string

I have built a blog platform in VB.NET where the audience are very young, and for some reason like to express their commitment by repeating sequences of characters in their comments.
Examples:
Hi!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3<3
LOLOLOLOLOLOLOLOLOLOLOLOLLOLOLOLOLOLOLOLOLOLOLOLOL
..and so on.
I don't want to filter this out completely, however, I would like to shorten it down to a maximum of 5 repeating characters or sequences in a row.
I have no problem writing a function to handle a single repeating character. But what is the most effective way to filter out a repeating sequence as well?
This is what I used earlier for the single repeating characters
Private Shared Function RemoveSequence(ByVal str As String) As String
    Const MaxRepeat As Integer = 5 ' longest run of one character to keep
    Dim sb As New System.Text.StringBuilder(str.Length)
    Dim c As Char
    Dim prev As Char = Nothing
    Dim runLength As Integer = 0
    For i As Integer = 0 To str.Length - 1
        c = str(i)
        If i > 0 AndAlso c = prev Then
            runLength += 1
        Else
            runLength = 1
        End If
        ' Keep the character only while the run is within the limit.
        If runLength <= MaxRepeat Then
            sb.Append(c)
        End If
        prev = c
    Next
    Return sb.ToString
End Function
Any help would be greatly appreciated
You should be able to apply the 'longest repeated substring problem' recursively to solve this. On the first pass you will get two matching substrings and will need to check whether they are contiguous. Then repeat the step for one of the substrings. Stop the algorithm if the strings are not contiguous, or if the string size drops below a certain number of characters. Finally, you should be able to keep the last match and discard the rest. You will need to dig around for an implementation :(
Also have a look at this previously asked question: finding long repeated substrings in a massive string
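The answer above points at the longest-repeated-substring problem without giving code. A different and simpler technique, a regular-expression backreference, may be enough for comment-length text. The sketch below is Python, not VB.NET, and the function name is hypothetical. The .NET Regex class also supports backreferences, so a port should be possible, but that is untested here.

```python
import re


def shorten_repeats(text, max_repeats=5):
    """Collapse any unit repeated more than `max_repeats` times in a row down to `max_repeats`."""
    # (.+?) captures the shortest repeating unit; \1{n,} requires at least n more copies of it.
    pattern = re.compile(r"(.+?)\1{%d,}" % max_repeats)
    return pattern.sub(lambda m: m.group(1) * max_repeats, text)


print(shorten_repeats("Hi" + "!" * 30))
print(shorten_repeats("<3" * 27))
print(shorten_repeats("LO" * 12))
```

Backtracking makes this roughly quadratic in the worst case, so for very long inputs the suffix-based approach from the answer may still be the better choice.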