How to VBA Excel Macro part of a string - vba

I'm currently busy with Excel tooling and learning a lot but i got a question. Currently i have a couple rows with data in the rows. In the rows there is a lot of data but i need a specific part of the row. Of course i can delete it all manually but to do that for 3000 rows i will be wasting a lot of time.
Can any one help me with a macro that filters data. The data i need is between [ and ] so for example [data]
I hope you guys can help me out and if you need more information just ask me! I hope you guys can help me!
Example String ROW:
[Sandwitch]><xsd:element name="T8436283"
So what do i need?
So i need a macro that only gets the Sandwitch out of it and paste it in the B column. The string with all the information stays at column A and the Sandwitch goes to Column B and that for all rows.

Option 1: Find/Replace
1) Copy data in another column (just saving original copy)
2) Perform Find/Replace "*["
3) Perform Find/Replace "]"
Now you have data which was between [].
Option 2: Use formulas
1) Lets assume that original data in Column "A"
2) Apply this formula in column "B" which will extract data between []
=MID(A1,FIND("[",A1)+1,FIND("]",A1)-FIND("[",A1)-1)
Option 3: Macro
If it is absolutely needed, I can help create a macro, otherwise try first two easier options.

A general purpose "find element in s starting x up to next y":
Function GenExtract(FromStr As String, _
StartSep As String, EndSep As String) _
As Variant
Dim StPos As Long
Dim EnPos As Long
GenExtract = CVErr(xlErrNA)
If StartSep = "" Or EndSep = "" Then Exit Function 'fail
StPos = InStr(1, FromStr, Left(StartSep, 1))
If StPos = 0 Or StPos = Len(FromStr) Then Exit Function 'fail
EnPos = InStr(StPos + 1, FromStr, Left(EndSep, 1))
If EnPos = 0 Then Exit Function 'fail
GenExtract = Mid(FromStr, StPos + 1, EnPos - StPos - 1)
End Function
If the two separators are the same, as per quotes, it gives the first string enclosed by those.

If you want to get your feet wet in Regular Expressions, the following code will take you there. You have to add a reference to the VB Scripting Library
Tools > References > Microsoft VBScript Regular Expressions 5.5
Then the code is as follows:
Sub textBetweenStuffs()
Dim str As String
Dim regEx As RegExp
Dim m As Match
Dim sHolder As MatchCollection
Dim bracketCollection As Collection
Dim quoteCollection As Collection
Set regEx = New RegExp
'Matches anything in between an opening bracket and first closing bracket
regEx.Pattern = "\[(.*?\])"
str = "[Sandwitch]><xsd:element name=""T8436283"""
'populates matches into match collection
Set sHolder = regEx.Execute(str)
Set bracketCollection = New Collection
'loop through values in match collection to do with as you wish
For Each m In sHolder
bracketCollection.Add m.Value
Next i
Set sHolder = Nothing
'get values between Quotations
regEx.Pattern = "\"(.*?\")"
'populates matches into match collection
Set sHolder = regEx.Execute(str)
Set quoteCollection = New Collection
'loop through values in match collection to do with as you wish
For Each m In sHolder
quoteCollection.Add m.Value
Next i
End Sub

Related

Excel - Add sequential numbers to cell text based on line breaks

I have a worksheet that contains information on projects. The worksheet contains a column which contains risks for each project. There is a one-to-many relationship between a project and its risks.
Currently the risks for a projects are added to single cell and separated by a line break. I need to add sequential identifiers at the start of each risk. So for example inside a particular cell it should look like this. The sequential number should be bold if at all possible.
1).**Risk 1
2).**Risk 2
3).**Risk 3
etc.
Any suggestions on how to tackle this would be appreciated.
Here's how I'd approach it via UDF:
' Reformats a list from a simple delimitation to a numbered list
' Accepts arrays of strings for inList (allowing array formulas)
' numFormat is a standard Excel-style format string (default "0. ")
' inDelimiter is the delimiter in the input list
' outDelimiter is the delimiter for the output list
Public Function TO_NUMBERED_LIST(inList As Variant, Optional numFormat As Variant, _
Optional inDelimiter As Variant, Optional outDelimiter As Variant) As Variant
Dim i As Integer, j As Integer
' Set default parameters
If IsMissing(numFormat) Then numFormat = "0). "
If IsMissing(inDelimiter) Then inDelimiter = vbNewLine
If IsMissing(outDelimiter) Then outDelimiter = inDelimiter
If IsArray(inList) Then ' Must loop through each entry if using as an array formula
Dim outList() As Variant
ReDim outList(0 To (UBound(inList) - LBound(inList)), 1 To 1)
j = 0
For i = LBound(inList) To UBound(inList)
If IsError(inList(i, 1)) Then
outList(j, 1) = inList(i, 1)
Else
outList(j, 1) = MakeNumbered(CStr(inList(i, 1)), CStr(numFormat), CStr(inDelimiter), CStr(outDelimiter))
End If
j = j + 1
Next
TO_NUMBERED_LIST = outList
Else
TO_NUMBERED_LIST = MakeNumbered(CStr(inList), CStr(numFormat), CStr(inDelimiter), CStr(outDelimiter))
End If
End Function
' Helper function to do the actual work of splitting lists, numbering them, and recombining them
Private Function MakeNumbered(inList As String, Optional numFormat As String, _
Optional inDelimiter As String, Optional outDelimiter As String) As String
Dim i As Integer
Dim tokenArr() As String
tokenArr = Split(inList, inDelimiter)
For i = 0 To UBound(tokenArr)
tokenArr(i) = Format(i + 1, numFormat) & tokenArr(i)
Next
MakeNumbered = Join(tokenArr, outDelimiter)
End Function
I leverage some knowledge from your previous thread, like the fact that the input might be an array (and the whole function might be used in an array formula) but will only ever be 1-dimensional.
I've made this pretty general for reformatting. It can take in lists with any input delimiter (in your case, a newline) and output using any desired delimiter (in your case, still a newline). The numFormat parameter acts using the Format function and supports formats like you would commonly see in Excel. Check the documentation if you need help there.
Default parameters have already been tweaked for your example - newline as delimiter(s) and "0). " as numbering format.
You can use the Split function on each cell value to create an array of risks and then prefix each risk with the sequence id. Then you can use the Join function to put the array back into a single value to update the cell with.
Depending on how the newlines got into the cell you might need to use vbCrLf, or vbNewLine instead of vbLf in the following example code:
Option Explicit
Sub AddRiskSequence()
Dim rngRisks As Range
Dim rngCell As Range
Dim varRisks As Variant
Dim lngIndex As Long
'set range with risk values
Set rngRisks = Sheet2.Range("B2:B4")
'iterate cells in risk column
For Each rngCell In rngRisks
'split cell contents by line feed into array
varRisks = VBA.Split(rngCell.Value, vbLf)
'iterate array and add sequence ids
For lngIndex = 0 To UBound(varRisks)
varRisks(lngIndex) = VBA.CStr(lngIndex + 1) & ") " & varRisks(lngIndex)
Next lngIndex
'rejoin array and update cell value
rngCell.Value = VBA.Join(varRisks, vbLf)
Next rngCell
End Sub
Before:
After:

Read a table in outlook mail using macro

I'm writing a macro to read the below Email:
Start Date: July-07-2016
Name Accept Approved
John Yes No
Peter No No
I'm good with search the word "Start date" and get the next 13 character to copy and paste that in a text file. But my problem is the next part is in a Table format. So when I'm searching for the name "John" and trying to copy the next 10 Characters. It doesn't work.
Is there a way to search for the word "Accept" and get the First Row data(Which will be No) and then Second Row data(Which will be No)? Is that possible?
This EMail's table will have only 2 Rows. So, I don't need any dynamic way to get the data. Can someone guide me?
I've tried searching the internet first, but the solutions are too huge for me to understand. Is there any simple way?
I have even tried the solution give here: How to read table pasted in outlook message body using vba? but that method works when the body has ONLY TABLE. But my EMail will have text as well as table.
I've never actually programmed in vba, but I think I can help (a bit) nevertheless.
In the answer on the post you linked to, there is the line
Set msg = ActiveExplorer.Selection.item(1)
I think you can change this to something like
Set msg = Right(ActiveExplorer.Selection.item(1), 25)
to get rid of the text before the table (I got the Right part from here: http://www.exceltrick.com/formulas_macros/vba-substring-function/, but it should also work in Outlook).
This way, you run the code on the table itself instead of on the whole message.If there is also text after the table, it might be more difficult, but you might get that done by searching for the table ending.
I hope this helps!
Attempt 2
After some searching and thinking, I came up with the idea to get the html of the message and use that to parse the table (Ok, not really, I got it from the comments here: http://www.codeproject.com/Questions/567073/Howplustoplusrecognizeplusandplusreadplustableplus). Based on that and other sources, it is possible to write a code that gets the table from an email.
I've written some code that might work, but I couldn't test it as I do not have Outlook. Also, this is my first time writing vba, so there may be a lot of syntax errors (and the code is ugly).
Sub GetTable()
Dim msg As Outlook.mailItem
Dim html As String
Dim tableBegin As String
Dim tableEnd As String
Dim posTableBegin As Long
Dim posTableEnd As Long
Dim table As String
Dim rowBegin As String
Dim rowEnd As String
Dim rowCount As Long
Dim columnBegin As String
Dim columnBeginLen As Long
Dim columnEnd As String
Dim posRowBegin As Long
Dim posRowEnd As Long
Dim values As String(0, 3)
Dim beginValue0 As Long
Dim beginValue1 As Long
Dim beginValue2 As Long
Dim EndValue0 As Long
Dim EndValue1 As Long
Dim EndValue2 As Long
' Get the message and the html
Set msg = ActiveExplorer.Selection.item(1)
html = msg.HTMLbody
' Get the begin and end positions of the table (within the html)
tableBegin = "<table>"
tableEnd = "</table>"
posTableBegin = InStr(1, html, tableBegin)
posTableEnd = InStr(posTableBegin, html, tableEnd)
' Get the html table
table = Mid(html, posTableBegin + Len(tableBegin), posTableEnd - posTableBegin - Len(tableBegin))
' Set the variables for the loop
rowBegin = "<tr>"
rowEnd = "</tr>"
rowCount = 0
columnBegin = "<td>"
columnBeginLen = Len(columnBegin)
columnEnd = "</td>"
' Loop trough all rows
posRowBegin = InStr(lastPos, table, rowBegin)
Do While posRowBegin != 0
' Get the end from the current row
posRowEnd = InStr(posRowBegin, table, rowEnd)
rowCount = rowCount + 1
' Make the array larger
ReDim Preserve values(rowCount + 1, 3)
' Get the contents from that row
row = Mid(table, posRowBegin + Len(rowBegin), posRowEnd - posRowBegin - Len(rowBegin))
' Get the three values from that row (name, Accept, Approved) and put it in the array
beginValue0 = InStr(1, row, columnBegin) + columnBeginLen
endValue0 = InStr(beginValue0, row, columnEnd)
beginValue1 = InStr(endValue0, row, columnBegin) + columnBeginLen
endValue1 = InStr(beginValue1, row, columnEnd)
beginValue2 = InStr(endValue1, row, columnBegin) + columnBeginLen
endValue2 = InStr(beginValue2, row, columnEnd)
values(rowCount, 0) = Mid(row, beginValue0, endValue0)
values(rowCount, 1) = Mid(row, beginValue1, endValue1)
values(rowCount, 2) = Mid(row, beginValue2, endValue2)
' Get the beginning of the next row
posRowBegin = InStr(lastPos, table, rowBegin)
Loop
' The values are now in the (double) array 'values'.
' values(0, [1-3]) contains the headers.
End Sub
As said before, the original idea came from http://www.codeproject.com/Questions/567073/Howplustoplusrecognizeplusandplusreadplustableplus. Additionally, I used Word VBA how to select text between two substrings and assign to variable? and the Microsoft documentation to write this.
While it is likely that the code does not work out of the box, I think it still gets the general idea (and some specifics) across, so that it can be used as a guide. I hope this is the solution you need!
You can actually use the Word Object Model to parse out the text from the table - assuming that the email is in HTML format.
Get a Word.Document object from the Inspector.WordEditor property and use Word objects and methods to get the text, like the following below example from MSDN. Just replace ActiveDocument with the variable you declare and set from WordEditor.
Sub ReturnCellContentsToArray()
Dim intCells As Integer
Dim celTable As Cell
Dim strCells() As String
Dim intCount As Integer
Dim rngText As Range
If ActiveDocument.Tables.Count >= 1 Then
With ActiveDocument.Tables(1).Range
intCells = .Cells.Count
ReDim strCells(intCells)
intCount = 1
For Each celTable In .Cells
Set rngText = celTable.Range
rngText.MoveEnd Unit:=wdCharacter, Count:=-1
strCells(intCount) = rngText
intCount = intCount + 1
Next celTable
End With
End If
End Sub

Search cell for text and copy text to another cell in VBA?

I've got a column which contains rows that have parameters in them. For example
W2 = [PROD][FO][2.0][Customer]
W3 = [PROD][GD][1.0][P3]
W4 = Issues in production for customer
I have a function that is copying other columns into another sheet, however for this column, I need to do the following
Search the cell and look for [P*]
The asterisk represents a number between 1 and 5
If it finds [P*] then copy P* to the sheet "Calculations" in column 4
Basically, remove everything from the cell except where there is a square bracket, followed by P, a number and a square bracket
Does anyone know how I can do this? Alternatively, it might be easier to copy the column across and then remove everything that doesn't meet the above criteria.
Second Edit:
I edited here to use a regular expression instead of a loop. This may be the most efficient method to achieve your goal. See below and let us know if it works for you:
Function MatchWithRegex(sInput As String) As String
Dim oReg As Object
Dim sOutput As String
Set oReg = CreateObject("VBScript.RegExp")
With oReg
.Pattern = "[[](P[1-5])[]]"
End With
If oReg.test(sInput) Then
sOutput = oReg.Execute(sInput)(0).Submatches(0)
Else
sOutput = ""
End If
MatchWithRegex = sOutput
End Function
Sub test2()
Dim a As String
a = MatchWithRegex(Range("A1").Value)
If a = vbNullString Then
MsgBox "None"
Else
MsgBox MatchWithRegex(Range("A1").Value)
End If
End Sub
First EDIT:
My solution would be something as follows. I'd write a function that first tests if the Pattern exists in the string, then if it does, I'd split it based on brackets, and choose the bracket that matches the pattern. Let me know if that works for you.
Function ExtractPNumber(sInput As String) As String
Dim aValues
Dim sOutput As String
sOutput = ""
If sInput Like "*[[]P[1-5][]]*" Then
aValues = Split(sInput, "[")
For Each aVal In aValues
If aVal Like "P[1-5][]]*" Then
sOutput = aVal
End If
Next aVal
End If
ExtractPNumber = Left(sOutput, 2)
End Function
Sub TestFunction()
Dim sPValue As String
sPValue = ExtractPNumber(Range("A2").Value)
If sPValue = vbNullString Then
'Do nothing or input whatever business logic you want
Else
Sheet2.Range("A1").Value = sPValue
End If
End Sub
OLD POST:
In VBA, you can use the Like Operator with a Pattern to represent an Open Bracket, the letter P, any number from 1-5, then a Closed Bracket using the below syntax:
Range("A1").Value LIke "*[[]P[1-5][]]*"
EDIT: Fixed faulty solution
If you're ok with blanks and don't care if *>5, I would do this and copy down column 4:
=IF(ISNUMBER(SEARCH("[P?]",FirstSheet!$W2)), FirstSheet!$W2, "")
Important things to note:
? is the wildcard symbol for a single character; you can use * if you're ok with multiple characters at that location
will display cell's original value if found, leave blank otherwise
Afterwards, you can highlight the column and remove blanks if needed. Alternatively, you can replace the blank with a placeholder string.
If * must be 1-5, use two columns, E and D, respectively:
=MID(FirstSheet!$W2,SEARCH("[P",FirstSheet!$W2)+2,1)
=IF(AND(ISNUMBER($E2),$E2>0,$E2<=5,MID($W2,SEARCH("[P",FirstSheet!$W2)+3,1))), FirstSheet!$W2, "")
where FirstSheet is the name of your initial sheet.

VBA UBound returns a negative value

I would like to know what I'm doing wrong...
I have a word document open (in word 2010) with three tables in it. I wanted to test basic table extraction in VBA and followed the instructions http://msdn.microsoft.com/en-us/library/office/aa537149(v=office.11).aspx.
Sub ExtractTableData()
Dim doc As Word.Document
Dim tbl As Word.Table
Dim rng As Word.Range
Dim sData As String
Dim aData1() As String
Dim aData2() As String
Dim aDataAll() As String
Dim nrRecs As Long
Dim nrFields As Long
Dim lRecs As Long
Dim lFields As Long
Set doc = ActiveDocument
Set tbl = doc.Tables(1)
Set rng = tbl.ConvertToText(Separator:=vbTab, _
NestedTables:=False)
' Pick up the delimited text into and put it into a string variable.
sData = rng.Text
' Restore the original table.
doc.Undo
' Strip off last paragraph mark.
sData = Mid(sData, 1, Len(sData) - 1)
' Break up each table row into an array element.
aData1() = Split(sData, vbCr)
nrRecs = UBound(aData1())
' The messagebox below is for debugging purposes and tells you
' how many rows are in the table. It is commented out but can
' be used simply by uncommenting it.
'MsgBox "The table contained " & nrRecs + 1 & " rows"
'Process each row to break down the field information
'into another array.
For lRecs = LBound(aData1()) To nrRecs
aData2() = Split(aData1(lRecs), vbTab)
' We need to do this only once!
If lRecs = LBound(aData1()) Then
nrFields = UBound(aData2())
ReDim Preserve aDataAll(nrRecs, nrFields)
End If
' Now bring the row and field information together
' in a single, two-dimensional array.
For lFields = LBound(aData2()) To nrFields
aDataAll(lRecs, lFields) = aData2(j)
Next
Next
End Sub
I'm getting an error at this line: ReDim Preserve aDataAll(nrRecs, nrFields), which is due to "nrFields" being set to a negative value (-1)...
No idea how the upper bound of the array is a negative value... Any help on this would be much appreciated.
I figured it out - I was trying to extract a nested table. I had to cycle through all sub-tables and extract individually. Also, I had to search for and remove ^p before extraction to retain table structure.
After I had figured it out, I noticed that the MS code sample had an error: aData2(j) should actually be aData2(lFields).
Hope this helps some other newbie!
If UBound is -1 and LBound = 0, the array is empty. You can generate an empty array as follows:
Dim EmptyArray() As String
Dim s As String
EmptyArray = Split("")
Debug.Print (UBound(EmptyArray)) ' displays -1
Debug.Print (LBound(EmptyArray)) ' displays 0
In your case I suspect you need to skip the processing if the array is empty:
aData1 = Split(...)
If (UBound(aData1) < LBound(aData1) Then
' UBound is -1 and LBound is 0, array is empty, nothing to do
Else
' Array is non-empty, do your stuff
End If
Although quite bizarre, it is possible for VARIANT SAFEARRAY to have negative lower and upper bound values for any of the dimensions. The array extent is LBound(,dimension) to UBound(,dimension).
What must be true is UBound >= LBound.
To get the array size, use UBound - LBound + 1.
It used to be convention to set the lower bound using an Option Base statement at the top of VBA code although, of course, that didn't affect arrays being returned by 3rd party libraries. Most folk used to use 1 as the lower bound.

pulling out data from a colums in Excel

I have the following Data in Excel.
CHM0123456 SRM0123:01
CHM0123456 SRM0123:02
CHM0123456 SRM0256:12
CHM0123456 SRM0123:03
CHM0123457 SRM0789:01
CHM0123457 SRM0789:02
CHM0123457 SRM0789:03
CHM0123457 SRM0789:04
What I need to do is pull out all the relevent SRM numbers that relate to a single CHM ref. now I have a formular that will do some thing like this
=INDEX($C$2:$C$6, SMALL(IF($B$8=$B$2:$B$6, ROW($B$2:$B$6)-MIN(ROW($B$2:$B$6))+1, ""), ROW(A1)))
however this is a bit untidy and I really want to produce this same using short vb script, do i jsut have to right a loop that will run though and check each row in turn.
For x = 1 to 6555
if Ax = Chm123456
string = string + Bx
else
next
which should give me a final string of
SRM0123:01,SRM123:02,SRM0256:12,SRM0123:03
to use with how i want.
Or is ther a neater way to do this ?
Cheers
Aaron
my current code
For x = 2 To 6555
If Cells(x, 1).Value = "CHM0123456" Then
outstring = outstring + vbCr + Cells(x, 2).Value
End If
Next
MsgBox (outstring)
End Function
I'm not sure what your definition of 'neat' is, but here is a VBA function that I consider very neat and also flexible and it's lightning fast (10k+ entires with no lag). You pass it the CHM you want to look for, then the range to look in. You can pass a third optional paramater to set how each entry is seperated. So in your case you could write (assuming your list is :
=ListUnique(B2, B2:B6555)
You can also use Char(10) as the third parameter to seperat by line breaks, etc.
Function ListUnique(ByVal search_text As String, _
ByVal cell_range As range, _
Optional seperator As String = ", ") As String
Application.ScreenUpdating = False
Dim result As String
Dim i as Long
Dim cell As range
Dim keys As Variant
Dim dict As Object
Set dict = CreateObject("scripting.dictionary")
On Error Resume Next
For Each cell In cell_range
If cell.Value = search_text Then
dict.Add cell.Offset(, 1).Value, 1
End If
Next
keys = dict.keys
For i = 0 To UBound(keys)
result = result & (seperator & keys(i))
Next
If Len(result) <> 0 Then
result = Right$(result, (Len(result) - Len(seperator)))
End If
ListUnique = result
Application.ScreenUpdating = True
End Function
How it works: It simple loops through your range looking for the search_string you give it. If it finds it, it adds it to a dictionary object (which will eliminate all dupes). You dump the results in an array then create a string out of them. Technically you can just pass it "B:B" as the search array if you aren't sure where the end of the column is and this function will still work just fine (1/5th of a second for scanning every cell in column B with 1000 unique hits returned).
Another solution would be to do an advancedfilter for Chm123456 and then you could copy those to another range. If you get them in a string array you can use the built-in excel function Join(saString, ",") (only works with string arrays).
Not actual code for you but it points you in a possible direction that can be helpful.
OK, this might be pretty fast for a ton of data. Grabbing the data for each cell takes a ton of time, it is better to grab it all at once. The the unique to paste and then grab the data using
vData=rUnique
where vData is a variant and rUnique is the is the copied cells. This might actually be faster than grabbing each data point point by point (excel internally can copy and paste extremely fast). Another option would be to grab the unique data without having the copy and past happen, here's how:
dim i as long
dim runique as range, reach as range
dim sData as string
dim vdata as variant
set runique=advancedfilter(...) 'Filter in place
set runique=runique.specialcells(xlCellTypeVisible)
for each reach in runique.areas
vdata=reach
for i=lbound(vdata) to ubound(vdata)
sdata=sdata & vdata(i,1)
next l
next reach
Personally, I would prefer the internal copy paste then you could go through each sheet and then grab the data at the very end (this would be pretty fast, faster than looping through each cell). So going through each sheet.
dim wks as worksheet
for each wks in Activeworkbook.Worksheets
if wks.name <> "CopiedToWorksheet" then
advancedfilter(...) 'Copy to bottom of list, so you'll need code for that
end if
next wks
vdata=activeworkbook.sheets("CopiedToWorksheet").usedrange
sData=vdata(1,1)
for i=lbound(vdata) + 1 to ubound(vdata)
sData=sData & ","
next i
The above code should be blazing fast. I don't think you can use Join on a variant, but you could always attempt it, that would make it even faster. You could also try application.worksheetfunctions.contat (or whatever the contatenate function is) to combine the results and then just grab the final result.
On Error Resume Next
wks.ShowAllData
On Error GoTo 0
wks.UsedRange.Rows.Hidden = False
wks.UsedRange.Columns.Hidden = False
rFilterLocation.ClearContents