Extracting a substring based on different criteria and placing the extracted string in another cell of the same row - vba

I have an Excel file in which one column contains a string that doesn't follow any particular pattern. I need to extract a substring (the product id), which is a run of 8 consecutive digits. It may be preceded or followed by any number of other characters, or it may sit at the start or end of the string.
Following are some examples.
Scenario 1:
product id is preceded by #
Id#53298632/BS TVR:003519
Function used in excel
=MID(N88,FIND("#",N88)+1,8)
* result : 53298632 *
Scenario 2:
product id is at the beginning
53298632:003519
Function used in excel
=MID(A1,1,8)
* result : 53298632 *
At first I only had to deal with scenario 1, so I used the formula shown for it. Nowadays the string doesn't follow any particular pattern, but the product id still appears as 8 consecutive digits. I searched for a suitable solution and found this formula (which I don't clearly understand).
=LOOKUP(10^8,MID(N132,ROW(INDIRECT("1:"&LEN(N132)-7)),8)+0)
This does work in most of the cases but in some cases it fails
For example
Pdr#53298632/ QTY NOS 1031949
Here the result is 1031949 which is definitely not what I want. The result should have been 53298632
Please help me fix this. Can this be done using a VBA macro? I am completely new to Excel functions, VBA and macros.
Any help will be highly appreciated!
Thanks in advance.

If you are happy to add a reference to the Microsoft regular expression library to your Excel project, regular expressions will solve this reasonably quickly.
To make RegExp available to your macros, open the Developer tab in Excel and start the Visual Basic editor. In the Visual Basic for Applications window, select Tools->References and tick Microsoft VBScript Regular Expressions 5.5.
Create a new module for your VBAProject (right-click your Excel file name in the project tree and click Insert->Module).
Double-click the newly created module (within the project tree) and enter the following code in the Module1 (Code) window:
Public Function getProductCode(source As String) As String
    ' Requires a reference to Microsoft VBScript Regular Expressions 5.5
    Dim strPattern As String: strPattern = "(\d{8})"   ' 8 consecutive digits
    Dim result As String: result = ""
    Dim results As Object
    Dim regEx As New RegExp
    With regEx
        .Global = True
        .MultiLine = False
        .IgnoreCase = False
        .Pattern = strPattern
    End With
    If regEx.Test(source) Then
        Set results = regEx.Execute(source)
        If (results.Count <> 0) Then
            result = results.Item(0).Value   ' first (leftmost) match
        End If
    End If
    getProductCode = result
End Function
From the relevant cell in Excel, you can now call the function:
=getProductCode(A1)
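One caveat of the pattern \d{8}: it also matches the first eight digits of a longer run of digits. If the product id must be exactly eight digits, a stricter variant can be sketched as follows. The name GetProductCodeStrict is made up for illustration, and the sketch uses late binding (CreateObject), so it does not need the library reference.

```vba
' Hypothetical stricter variant: the 8 digits must not be part of a longer number.
Public Function GetProductCodeStrict(ByVal source As String) As String
    Dim regEx As Object
    Dim matches As Object

    Set regEx = CreateObject("VBScript.RegExp")
    regEx.Global = False
    ' Start of string or a non-digit, then exactly 8 digits, then no further digit.
    regEx.Pattern = "(?:^|\D)(\d{8})(?!\d)"

    Set matches = regEx.Execute(source)
    If matches.Count > 0 Then
        ' SubMatches(0) is the captured group, i.e. the 8 digits only.
        GetProductCodeStrict = matches(0).SubMatches(0)
    End If
End Function
```

It is called the same way, for example =GetProductCodeStrict(A1), and returns an empty string when no isolated 8-digit run exists.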

The LOOKUP version returns the last numeric 8-character window in the string, which is why it picked up 1031949. I guess you could also modify the original formula to pick up the first match of an 8-digit number:
=MID(A1,MATCH(TRUE,ISNUMBER(MID(A1,ROW(INDIRECT("1:"&LEN(A1)-7)),8)+0),0),8)
(must be entered as an array formula using Ctrl+Shift+Enter).
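For readers who do not follow the array formula: it slides an 8-character window along the string and returns the first window that Excel can treat as a number. The same idea can be sketched in VBA without regular expressions. FirstEightDigits is a hypothetical name. Note that the Like pattern below accepts only eight plain digits, whereas the +0 coercion in the formula also accepts windows such as a signed or decimal number.

```vba
' Hypothetical helper: returns the first window of 8 plain digits, or "" if none.
Public Function FirstEightDigits(ByVal s As String) As String
    Dim i As Long

    ' Slide an 8-character window over the string, left to right.
    For i = 1 To Len(s) - 7
        ' In a Like pattern, # matches exactly one digit.
        If Mid$(s, i, 8) Like "########" Then
            FirstEightDigits = Mid$(s, i, 8)
            Exit Function
        End If
    Next i
End Function
```

Used from a cell as =FirstEightDigits(A1), it needs no array entry.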

Related

Set Workbook Language in VBA Excel

I have a workbook that gets copied to different users with different regional language settings.
The problem I have is that, because of the regional settings, some of the macros I built won't work because they refer to translated values.
Examples are Date names and the value for Year.
(Year to Jaar.)
ActiveSheet.PivotTables("OpenPO").PivotFields("Years").ShowDetail = False
This refers to selected data named Year inside a pivot table in the original workbook. However, this gets automatically translated into the language of the user.
To solve this I was hoping I could lock or set this specific workbook's language.
Unfortunately, it is not an option to:
run the macro before copying,
set the regional settings for every user,
set Excel's regional settings for every user,
rewrite the macros for every language.
I have been searching for a long time to find a solution but have yet to find an answer. I really hope someone on here can help me.
My most desired solution would be a simple VBA code to set the language of the workbook.
In the example you are passing "Years" as a String:
ActiveSheet.PivotTables("OpenPO").PivotFields("Years").ShowDetail = False
If a Dutch Excel is expecting Jaar, then you need some specific way of telling it. E.g.:
Option Explicit
Public Sub TestMe()
    Dim yearIndependent As String
    ' Pick the field name that matches the Excel UI language.
    ' The localized names must match exactly what Excel shows in the pivot table.
    Select Case Application.LanguageSettings.LanguageID(msoLanguageIDUI)
        Case 1043 'Dutch
            yearIndependent = "Jaar"
        Case 1031 'German
            yearIndependent = "Jahr"
        Case Else 'English (default)
            yearIndependent = "Years"
    End Select
    ActiveSheet.PivotTables("OpenPO").PivotFields(yearIndependent).ShowDetail = False
End Sub
As you can see, every language-dependent name in the code has to be handled this way. The good news is that there are usually not many language-dependent words, so you can separate them from the logic (MVC style), write the words on a sheet and read them from there. If you move that logic into a function that returns the correct string, the code stays quite digestible.
Some good reading about it - RonDeBruin.nl
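The "words on a sheet" idea could look roughly like this. Everything about the layout is an assumption made for illustration: a hypothetical sheet named Translations with the English keys in column A (from row 2) and one column per language, headed by its numeric language ID in row 1 (for example 1033, 1043, 1031). The function name LocalizedName is also made up.

```vba
' Hypothetical lookup: returns the text for key in the current Excel UI language.
' Assumes a sheet "Translations": keys in column A, language IDs in row 1.
Public Function LocalizedName(ByVal key As String) As String
    Dim ws As Worksheet
    Dim langId As Long
    Dim rowIdx As Variant
    Dim colIdx As Variant

    Set ws = ThisWorkbook.Worksheets("Translations")
    langId = Application.LanguageSettings.LanguageID(msoLanguageIDUI)

    ' Application.Match returns an error value (instead of raising) when nothing is found.
    rowIdx = Application.Match(key, ws.Columns(1), 0)
    colIdx = Application.Match(langId, ws.Rows(1), 0)

    If IsError(rowIdx) Or IsError(colIdx) Then
        LocalizedName = key   ' fall back to the untranslated key
    Else
        LocalizedName = CStr(ws.Cells(rowIdx, colIdx).Value)
    End If
End Function
```

The pivot line would then read ActiveSheet.PivotTables("OpenPO").PivotFields(LocalizedName("Years")).ShowDetail = False, and supporting a new language means adding a column rather than editing code.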

Excel / VBA Dynamic Numbers and Text Separation

I have a value I take from the internet, specifically the market cap of some cryptocurrencies. I get these through the "Get External Data" feature in Excel, which loads the data as text: for a value of 1000 bitcoin, instead of 1000 I get a cell containing 1000 BTC, so I cannot use the data in further calculations. If I use the RIGHT and LEFT functions with a fixed length, they break as soon as the number of digits changes, since the value can range from 1 to 1000 to 1000000, and the result is text again. I also cannot use Text to Columns, because every time I refresh, the table goes back to its original state and drops any columns added by the user. Any suggestions for extracting the numbers from each cell dynamically are welcome. Thanks
You can use the Val function.
Assuming the cell containing the value you are after is A1 of the active sheet, you can use:
Dim numericPart As Double
numericPart = Val(Range("A1").Value)
If you wanted an Excel formula instead, and there is always a space between the value and the text part, you could use:
=VALUE(LEFT(A1,FIND(" ",A1)-1))
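A few cases worth knowing before relying on Val: it reads digits from the start of the string, skips spaces, and stops at the first character it cannot interpret as part of a number. The sub name ValDemo is only for illustration, and the values in the comments are what Val returns for each input.

```vba
' Illustrative only: how Val treats some typical inputs.
Sub ValDemo()
    Debug.Print Val("1000 BTC")     ' 1000
    Debug.Print Val("  1000 BTC")   ' 1000 (leading spaces are skipped)
    Debug.Print Val("1,000 BTC")    ' 1 (parsing stops at the comma)
    Debug.Print Val("BTC 1000")     ' 0 (the string does not start with a number)
End Sub
```

So Val fits the "number first, then text" layout from the question, but not values that arrive with thousands separators.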
Another worksheet function, which will work without a space separating the digits from the letters (so long as the digits come first):
=LOOKUP(9E+307,--MID(B9,1,{1,2,3,4,5,6,7,8,9,10}))
If there might be more than 10 places in the number, merely extend the array constant.
When external data is pulled from the internet, it may contain leading or trailing spaces or non-printable characters.
In that case the safest way is to use a regular expression to get the numeric part of the string.
Place the following UDF (User Defined Function) in a standard module and then use it on the worksheet as shown below.
Assuming " 1000 BTC " is in A2, call the UDF like this:
=GetNumber(A2)
UDF:
Function GetNumber(rng As Range) As Double
    ' Returns the first run of digits in the cell as a number (0 if none is found).
    ' Double is used so that large values do not overflow a Long.
    Dim RE As Object, Matches As Object
    Set RE = CreateObject("VBScript.RegExp")
    With RE
        .Global = False
        .Pattern = "\d+"
    End With
    If RE.test(rng.Value) Then
        Set Matches = RE.Execute(rng.Value)
        GetNumber = CDbl(Matches(0).Value)
    End If
End Function

How to find and remove special characters in Excel sheet?

My Excel sheet contains more than 20000 records in between some where i have values like this सोनिगरा केसर .
How can I find these kinds of characters?
I am trying to import this Excel sheet into my SQL Server DB, but I am getting
"Text was truncated or one or more characters had no match in the target code page."
How do I solve this issue?
Ablebits is a good tool for removing unwanted special characters from Excel, especially ones introduced during migration. You just need to provide a list of the characters to be removed.
Reference : https://www.ablebits.com/excel-clean-cells/howto-remove-chars.php
Can you try this vba?
Function RemChrs(s As String) As String
Static RegEx As Object
If RegEx Is Nothing Then
Set RegEx = CreateObject("VBScript.RegExp")
RegEx.Global = True
End If
RegEx.Pattern = "à|¤|¸|¥|‹|¨|¿|—|°|¾|•|‡|°"
RemChrs = RegEx.Replace(s, "")
End Function
You can extend the pattern by appending |x for each further character x. You can also wrap the call in a For loop to process every cell of the sheet. In this example you would use =RemChrs(A1).
I hope this helps.
Edit:
If you don't want to use VBA, you can do it manually with Ctrl+H and replace the unwanted characters one by one.
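For the "how to find them" part of the question, one possible approach is a small UDF that flags any cell containing a character outside the plain ASCII range. This is only a sketch: the name HasNonAscii is hypothetical, and treating everything above code 127 as suspicious is an assumption that suits data expected to be plain English text.

```vba
' Hypothetical helper: True if the text contains any character outside ASCII (0-127).
Public Function HasNonAscii(ByVal s As String) As Boolean
    Dim i As Long
    Dim code As Long

    For i = 1 To Len(s)
        code = AscW(Mid$(s, i, 1))
        ' AscW returns a signed 16-bit value, so high code points come back negative.
        If code < 0 Or code > 127 Then
            HasNonAscii = True
            Exit Function
        End If
    Next i
End Function
```

Entering =HasNonAscii(A1) in a helper column and filtering on TRUE lists the rows that need attention before the import.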

Get Integer value from String using VBA

I'm trying to automate a script using QTP and SAP.
In SAP, an order is generated and the document number is shown in the status bar as "Standard PO created under the number 4500290636".
My challenge is how to convert that string to an Integer value.
Since you are using SAP, I think it is safe to assume that the document number will always have a length of 10 characters. So, you can extract it using the Right function, then convert it using Val. Val returns a Double, which matters here because 4500290636 is too large for a VBA Integer or Long. Something like this:
yourInteger = Val(Right(yourSAPString, 10))
Note: in .NET, you could use Convert.ToInt64 instead of Val (Convert.ToInt32 would overflow for 4500290636).
You could use the Split function:
Dim strSplit As Variant
Dim yourNumericOut As Variant
' Split on spaces and look at the last word
strSplit = Split("Standard PO created under the number 4500290636")
' Convert only if the last word really is numeric
If IsNumeric(strSplit(UBound(strSplit))) Then yourNumericOut = CDbl(strSplit(UBound(strSplit)))
Testing for numeric first allows for numbers of any length, and the same approach can be adapted if the position of the number in the returned array changes.
I would recommend using a custom function.
Here is the function:
Public Function ExtractNumber(fromString As String) As Double
    Dim RegEx As Object
    Set RegEx = CreateObject("VBScript.RegExp")
    With RegEx
        ' Digits in groups of up to three, optional thousands commas, optional two decimals
        .Pattern = "(\d{1,3},?)+(\.\d{2})?"
        .Global = True
        If .Test(fromString) Then
            ExtractNumber = CDbl(.Execute(fromString)(0))
        End If
    End With
End Function
Here is the usage example:
Sub Example()
    Debug.Print ExtractNumber("Standard PO created under the number 4500290636")
End Sub
This code was taken from a similar, more recent answer that has better solutions. See here:
Excel VBA - Extract numeric value in string

Passing Excel VBA array to parental function?

I don't know if this question will make sense to begin with...
Example Given: the following value is given for a single cell (we'll call it A1): Sub-value #1|Here's another sub-value #2|Yet again, last but not least, sub-value #3. I already know someone will tell me that this is where a database should be used (trust me, my major is DB Management, I know, but I need my data in this fashion). My delimiter is the |. Now say I want to create a function that will take the LEN() of each sub-value and return the AVERAGE() of all the sub-values. If I wanted to create a single function to do this, I could use an split(), take each value, do an LEN() and return the AVERAGE().
For the example given, let's utilize cell B1. I have created similar functions in the past that would work by the following method (although not this exact one), but it requires splitting and joining the array/cell value(s) each time: =ARRAY_AVERAGE(ARRAY_LEN(A1,"|","|"),"|","|").
ARRAY_LEN(cell,delimiter[,Optional new_delimiter])
ARRAY_AVERAGE(cell,delimiter[,Optional new_delimiter])
However, I'm wondering if there might be a more dynamic approach to this. Basically, I want to split() a cell value into an array with some custom VBA function, pass it to parent cell functions, and wrap up with a function that merges the array back together.
Here's how the cell function will run:
=ARRAY_AVERAGE(ARRAY_LEN(ARRAY_SPLIT(A1,"|"))).
ARRAY_SPLIT(cell,delimiter) will split the array.
ARRAY_LEN(array) will return the length of each sub-value of the array.
ARRAY_AVERAGE(array) will return the average of all the sub-values of the array. Since this function reduces multiple values to a single value, it takes the place of an imaginary ARRAY_JOIN(array,delimiter) that would merge the array back together.
This requires one or two additional functions in the cells, but it also lowers the number of iterations that the cell would be converting to and from a single cell value and VBA array.
What do you think? Possible? Feasible? More or less code efficient?
Now, this is a very crude example but it should give you an idea of how to get started and how you can customize this method to suit your needs. Assume you have the following data in a text file called example.txt :
Name|Age|DoB|Data1|Data2|Data3
David|25|1987-04-08|100|200|300
John|42|1960-06-21|400|500|600
Sarah|15|1997-02-01|700|800|900
This file resides in the folder C:\Downloads. To query this in VBA using ADO you'll need to reference the Microsoft ActiveX Data Objects 2.X Library, where X is the latest version you have installed. I also reference the Microsoft Scripting Runtime to create my Schema.ini files at run-time to ensure that my data is read properly. Without the Schema.ini file you run the risk of your data not being read by the driver as you expect. Numbers stored as text can occasionally be read as null for no apparent reason, and dates often get returned as null as well. The Schema.ini file gives the text driver an exact definition of your data and how to handle it. You don't HAVE to define every column explicitly like I have done, but at the very least you should set your Format, ColNameHeader, and DateTimeFormat values.
Example Schema.ini file used:
[example.txt]
Format=Delimited(|)
ColNameHeader=True
DateTimeFormat=yyyy-mm-dd
Col1=Name Char
Col2=Age Integer
Col3=DoB Date
Col4=Data1 Integer
Col5=Data2 Integer
Col6=Data3 Integer
You'll notice that the file name is enclosed in brackets on the first line. This is NOT optional and it also allows you to define different schemas for different files. As mentioned earlier I create my Schema.ini file in VBA at run-time with something like the following:
Sub CreateSchema()
    ' Requires a reference to Microsoft Scripting Runtime
    Dim fso As New FileSystemObject
    Dim ts As TextStream
    ' Create (or overwrite) Schema.ini in the same folder as the data file
    Set ts = fso.CreateTextFile(FILE_DIR & "Schema.ini", True)
    ts.WriteLine "[example.txt]"
    ts.WriteLine "Format=Delimited(|)"
    ts.WriteLine "ColNameHeader=True"
    ts.WriteLine "DateTimeFormat=yyyy-mm-dd"
    ts.WriteLine "Col1=Name Char"
    ts.WriteLine "Col2=Age Integer"
    ts.WriteLine "Col3=DoB Date"
    ts.WriteLine "Col4=Data1 Integer"
    ts.WriteLine "Col5=Data2 Integer"
    ts.WriteLine "Col6=Data3 Integer"
    ts.Close 'Flush and release the file before it is queried
    Set ts = Nothing
    Set fso = Nothing
End Sub
You'll notice that I use the variable FILE_DIR which is a constant I define at the top of my module. Your Schema.ini file -MUST- reside in the same location as your data file. The connection string for your query also uses this directory so I define the constant to make sure they reference the same place. Here's the top of my module with the FILE_DIR constant along with the connection string and SQL query:
Option Explicit
Const FILE_DIR = "C:\Downloads\"
Const TXT_CONN = "Driver={Microsoft Text Driver (*.txt; *.csv)};Dbq=" & FILE_DIR & ";Extensions=asc,csv,tab,txt;"
Const SQL = "SELECT Name, DoB, ((Data1 + Data2 + Data3)/3) AS [Avg_of_Data] " & _
    "FROM example.txt"
Notice the portion in TXT_CONN called Dbq. This is the directory where your data file(s) are stored. You'll actually name the specific file you use in the FROM clause of your SQL string. The SQL constant contains your query string. In this case we're just selecting Name, DoB, and the average of the three data values. With all of that out of the way you're ready to actually execute your query:
Sub QueryText()
    Dim cn As New ADODB.Connection
    Dim rs As New ADODB.Recordset
    Dim i As Integer
    'Define/open connection
    With cn
        .ConnectionString = TXT_CONN
        .Open
        'Query text file
        With rs
            .Open SQL, cn
            .MoveFirst
            'Loop through/print column names to Immediate Window
            For i = 0 To .Fields.Count - 1
                Debug.Print .Fields(i).Name
            Next i
            'Loop through recordset
            While Not (.EOF Or .BOF)
                'Loop through/print each column value to Immediate Window
                For i = 0 To .Fields.Count - 1
                    Debug.Print .Fields(i)
                Next i
                .MoveNext
            Wend
            .Close 'Close recordset
        End With
        .Close 'Close connection to file
    End With
    Set rs = Nothing
    Set cn = Nothing
End Sub
I know that I said doing this is extremely simple in my comments above and that this looks like a lot of work but I assure you it's not. You could use ONLY the QueryText() method and end up with similar results. However, I've included everything else to try and give you some ideas of where you can take this for your project as well as to show you how to solve problems you might run into if you're not getting the results back that you expected.
This is the guide I originally learned from: http://msdn.microsoft.com/en-us/library/ms974559.aspx
Here is a guide for doing the same thing to actual Excel files: http://support.microsoft.com/kb/257819
Lastly, here's more info on Schema.ini files: http://msdn.microsoft.com/en-us/library/windows/desktop/ms709353(v=vs.85).aspx
Hopefully you're able to find a way to make use of all this information in your line of work! A side benefit to learning all of this is that you can use ADO to query actual databases like Access, SQL Server, and Oracle. The code is nearly identical to what is printed here. Just swap out the connection string, sql string, and ignore the whole bit about a Schema.ini file.
Here are 2 example VBA UDFs that work on a single cell: enter the formula as
=AVERAGE(len_text(SPLIT_TEXT(A1,"|")))
Note that in this particular case you don't actually need the len_text function, you could use Excel's LEN() instead, but then you would have to enter the AVERAGE(..) as an array formula.
Option Explicit
Public Function Split_Text(theText As Variant, Delimiter As Variant) As Variant
    Dim var As Variant
    var = Split(theText, Delimiter)
    Split_Text = Application.WorksheetFunction.Transpose(var)
End Function
Public Function Len_Text(something As Variant) As Variant
    Dim j As Long
    Dim k As Long
    Dim var() As Variant
    If IsObject(something) Then
        something = something.Value2
    End If
    ReDim var(LBound(something) To UBound(something), LBound(something, 2) To UBound(something, 2))
    For j = LBound(something) To UBound(something)
        For k = LBound(something, 2) To UBound(something, 2)
            var(j, k) = Len(something(j, k))
        Next k
    Next j
    Len_Text = var
End Function
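The question also mentions an imaginary ARRAY_JOIN that merges an array back into one delimited cell value. In the same style as the two UDFs above, it could be sketched as follows. Join_Text is a hypothetical name, and the sketch assumes a two-dimensional input (as produced by Split_Text or Len_Text, or a multi-cell range); a single-cell range is not handled.

```vba
' Hypothetical counterpart to Split_Text: joins a 2-D array into one delimited string.
Public Function Join_Text(ByVal items As Variant, ByVal Delimiter As String) As String
    Dim j As Long
    Dim k As Long
    Dim result As String
    Dim started As Boolean

    ' Accept a multi-cell Range as well as an array.
    If IsObject(items) Then items = items.Value2

    For j = LBound(items, 1) To UBound(items, 1)
        For k = LBound(items, 2) To UBound(items, 2)
            ' Put the delimiter between elements, not before the first one.
            If started Then result = result & Delimiter
            result = result & CStr(items(j, k))
            started = True
        Next k
    Next j

    Join_Text = result
End Function
```

A formula such as =Join_Text(Len_Text(Split_Text(A1,"|")),"|") would then return the lengths of the sub-values as one delimited string.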