Excel macro - how to break wrapped text into rows for merged columns? - vba

I have to import data from PDF into SAS, and one step involves converting the PDF data to an Excel spreadsheet before converting it to text for simpler SAS import. Usually the PDF data converts fine into Excel, with few errors. Now that I am importing older data, the output is getting quite messy, and several rows end up wrapped inside a single cell. I am trying to figure out whether a macro could fix this in the sheets without too much manual manipulation. I have never programmed in VBA before, so I am quite new to Excel macros.
Here is the example of messy data:
Here is the example of normal data:
(*Note the data values in the two images are different, just for example formatting)
I have tried working on a macro. For this, I copy the messy data into another sheet and run the macro, which outputs the corrected data on a separate sheet. I then copy the corrected data over the messy data in the original spreadsheet.
After trying to code the macro, I was unable to figure out how to tell excel to take the data in columns C,D,E,F which are all merged into one cell and break that wrapped text, and so on for other merged columns (as shown in messy data image).
Here is my current code that I got after watching some tutorials:
Sub Split_Text_to_Rows()
Dim splitVals1 As Variant
Dim splitVals2 As Variant
Dim totalVals As Long
Set sh1 = ThisWorkbook.Sheets(2)
Set sh2 = ThisWorkbook.Sheets(3)
sh2.Cells.Clear
lrow1 = sh1.Range("A65356").End(xlUp).Row
For j = 1 To lrow1
splitVals1 = Split(sh1.Cells(j, 1), Chr(10))
splitVals2 = Split(sh1.Cells(j, 2), Chr(10))
For i = LBound(splitVals1) To UBound(splitVals1)
lrow2 = sh2.Range("A65356").End(xlUp).Row
sh2.Cells(lrow2 + 1, 1) = splitVals1(i)
Next i
For k = LBound(splitVals2) To UBound(splitVals2)
lrow3 = sh2.Range("B65356").End(xlUp).Row
sh2.Cells(lrow3 + 1, 2) = splitVals2(k)
Next k
Next j
End Sub
As you can see, my code is also quite messy. Although I got the code to work for columns A and B, when I get to column C ("Motor Vehicle Theft") and so on, I am not sure how to separate the wrapped text, since it sits in a cell merged across columns C, D, E, F. I would also like to keep columns I to Q as two merged rows even after the macro splits 1 row into 2 (shown in the normal data image), and then continue splitting cells up to column Z.
Any tips would be helpful! Please let me know if more information or clarification is needed.

I often find that the best approach is to first paste the data into Word, do some clean-up there, format it as a Word table, and then transfer it into Excel. The reason is that Word has very powerful find/replace features which allow you to quickly convert a mess into something sensible. Since you didn't provide example data I could paste in, I randomly found a pdf on the web to show one approach. The key in this case was noticing that each column begins with a space followed by a digit. So I did a search for " ^#" (a space followed by 'any digit') and replaced it by "^t" (tab character). Next, I used Word's 'Convert to Table' feature, and after that the data table is ready for pasting into Excel.

Related

Looking for matching data in two workbooks of Excel, and format the matching results over a large data set

I am working on a problem in Excel in which I have to compare data in two separate workbooks and look for matching pieces of information over several columns. Because the time stamp in column A isn't an exact match in both workbooks, I need to rely on the data points from the other columns, out to column G. Rows in which a match is found should then be formatted in a certain colour.
I could do this manually, but due to the amount of rows numbering into 5 figures, I think VBA seems the best way to do this. Having only basic programming knowledge, I am struggling to see the correct way to do this with VBA.
Here is what I have tried so far, and I will also include the error shown when I run the code.
I am aware I said that more than column A is needed to make the comparison, but I am not sure how to apply the rules to a matrix over columns and rows.
Sub vbax_53997_Compare_Two_Ranges()
Dim i As Long
Dim wb1ws1, wb2ws2
Dim blnSame As Boolean
wb1ws1 = Workbooks("Copy_of_data.xlsm").Worksheets("Worksheet").Range("A1:A63").Value
wb2ws2 = Workbooks("Copy_of_data_to_be_compared.xlsm").Worksheets("Archive").Range("A1:A23067").Value
For i = LBound(wb1ws1) To UBound(wb1ws1)
If wb1ws1(i, 1) = wb2ws2(i, 1) Then
blnSame = True
End If
Next i
If blnSame = True Then
Sheets(sheetName).Cells(lRow, "A").Interior.ColorIndex = 3 'Set Color to Red'
End If
End Sub
This is the error when I run the code;
Run-time error '9': Subscript out of range.
I have no doubt it's an easy fix, and I could maybe do a nested VLOOKUP/IF statement, but taking longer in the short term to find a solution will probably pay off by automating the process in the long term.
Any help would be much appreciated as I have now run out of ideas.

Simple macro to delete all unused columns and rows

I have linked a spreadsheet to a database and have created a macro that clears the contents of the worksheet and pastes in the results of the SQL I send to the database.
It returns about 30 columns of data with about 6000 rows of data, yet the file size ballooned to 22mb!? I read on here about deleting/hiding empty columns and cells and this reduced the file to 2.5mb.
Is there a quick macro that will save me having to do this manually each day please? I've seen a lot of variants on Google and I can't get one to work correctly.
The number of columns stays the same each day but the number of rows fluctuate.
Sub hide_Empty_Columns()
Dim lastCol&, i&
' Last used column, judged by row 1
lastCol = Cells(1, Columns.Count).End(xlToLeft).Column
' Loop backwards so the loop stays valid if Hidden is swapped for Delete
For i = lastCol To 1 Step -1
If WorksheetFunction.CountA(Columns(i)) = 0 Then
Columns(i).Hidden = True
End If
Next i
End Sub
Pretty straightforward. You can tweak as needed. It hides any column that is completely empty. If you want instead to delete the column, change the Columns(i).Hidden line to Columns(i).EntireColumn.Delete.
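Since the question also asks about unused rows, here is a minimal sketch that deletes everything below and to the right of the data. It assumes the data starts at A1, that column A is filled for every data row, and that row 1 has a header for every data column; the macro name is made up for illustration. As far as I know, the file size only shrinks once the workbook is saved afterwards.

```vba
Sub Delete_Unused_Rows_And_Columns()
    ' Sketch only. Assumptions: data starts at A1, column A has a value in
    ' every data row, and row 1 has a header in every data column.
    Dim ws As Worksheet
    Dim lastRow As Long, lastCol As Long

    Set ws = ActiveSheet
    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    lastCol = ws.Cells(1, ws.Columns.Count).End(xlToLeft).Column

    ' Delete every row below the last data row
    If lastRow < ws.Rows.Count Then
        ws.Rows(lastRow + 1 & ":" & ws.Rows.Count).Delete
    End If

    ' Delete every column to the right of the last data column
    If lastCol < ws.Columns.Count Then
        ws.Range(ws.Cells(1, lastCol + 1), ws.Cells(1, ws.Columns.Count)).EntireColumn.Delete
    End If
End Sub
```

Because the number of rows fluctuates each day, the last row is recalculated on every run rather than hard-coded.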
Hi, I too have had this problem in the past. Mostly, the macros to delete unused columns and rows didn't help much (some, but not much).
I found that most of the problem was two things:
1) Excel saving a bunch of XML history.
2) Cluttered VBA.
Try these two things:
1) Save your file as an xlsb file. This will change everything to binary and save wasted XML space.
2) Use the code cleaner utility found here: http://www.appspro.com/Utilities/CodeCleaner.htm It's amazing how much space that thing can save for you.
BTW, if you rename a copy of your file to myfile.zip, you can open it up and see where the bulk of your issue is. Be careful about editing there, because you may corrupt things.
Hope this helps.

Run VBA format code on all sheets with different number of rows

VBA noob here needs a little bit of assistance. I cannot seem to find a solution or get something to work.
I've tried to simplify it as much as I could to get a proof of concept.
The basic idea is to format one cell (A1 say) with all borders, copy that format down across all data in the first sheet (A1:C10 for example), then do the same with data in subsequent sheets. What I'm struggling with is that subsequent sheets all have a different number of rows and anything I try just formats the additional sheets to the (A1:C10) of the original even if there is no data present.
Any help would be greatly appreciated.
What you need is a variable that identifies the last row of any given sheet. For instance
LastRow = Worksheets("Sheet1").cells(65000,1).end(xlup).row
Now you can loop through your cells
for i = 1 to LastRow
for j = 1 to 3
Worksheets("Sheet1").Cells(i, j).Borders.LineStyle = xlContinuous ' apply your formatting here
next j
next i
You can find the last row in a column by using this VBA code:
lastrow = Sheets("SheetName").Cells(rows.count,columnnumber).end(xlup).row
Change columnnumber to the number of the column you are looking in, for example column A = 1.
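Putting the last-row idea together for every sheet, a minimal sketch could look like the following. It assumes the data on each sheet sits in columns A to C starting at row 1 and that column A is a reliable indicator of the last row; the macro name is made up for illustration.

```vba
Sub Format_All_Sheets()
    ' Sketch only. Assumptions: data is in columns A:C starting at row 1,
    ' and column A is filled down to the last data row on every sheet.
    Dim ws As Worksheet
    Dim lastRow As Long

    For Each ws In ThisWorkbook.Worksheets
        ' Recalculate the last row for each sheet individually
        lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row

        ' Skip sheets where column A is completely empty
        If Application.WorksheetFunction.CountA(ws.Columns(1)) > 0 Then
            ws.Range(ws.Cells(1, 1), ws.Cells(lastRow, 3)).Borders.LineStyle = xlContinuous
        End If
    Next ws
End Sub
```

Because lastRow is computed inside the loop, each sheet is formatted only as far down as its own data goes, instead of reusing A1:C10 from the first sheet.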

Move row entries to a third file based on comparison among Files or workbooks one and two

I think that my question was not clear earlier, so I am attaching sample data along with a detailed description of the requirement. Please advise.
https://docs.google.com/spreadsheets/d/1GUuNFkJdgpStfLH1oBTAvxEgW9V1v13Z5aJ9goA8C0M/edit?usp=sharing
https://docs.google.com/spreadsheets/d/1B9LObbHmu0G9pBHbFqbcR4fNJuSr8BvpqJHfVi9J2fg/edit?usp=sharing
Requirement:
a) Compare the data in the files named John.xlsx and Jack.xlsx.
b) Specifically, compare columns B and C.
c) If both columns match, move the entire row from Jack.xlsx to a third file, Lilian.xlsx, which will have the same column headers and is just a blank file at the moment.
d) Delete the moved row from Jack.xlsx.
e) Save Jack.xlsx and Lilian.xlsx.
Does that make any sense?
Thanks for the effort :)
PS: ( sorry, but I am not able to attach more than 2 links in the post coz of my reputation point in the forum is quite low. New to the forum -.-' ). Otherwise, I will put a link for the Lilian.xlsx file as well.
You might consider using native Excel features, rather than trying to code something in visual basic. For example, try =MATCH() formula combined with the autofilter feature, and set the criteria to hide anything that does not match, i.e. comes up "#N/A".
Once you get the data to display as you expect, then copy and paste to new tab, and save-as new.csv.
Assuming you have 3 sheets. Sheet1, Sheet2 and Sheet3, the code below checks columns C for sheet1 and sheet2 for the first 10 rows. If it finds a match it will copy the data to the first column in sheet3:
Sub main()
Dim i As Integer
Dim intCurrentRow As Integer
intCurrentRow = 1
For i = 1 To 10
If Sheet1.Cells(i, 3) = Sheet2.Cells(i, 3) Then
Sheet3.Cells(intCurrentRow, 1) = Sheet1.Cells(i, 3)
intCurrentRow = intCurrentRow + 1
End If
Next i
End Sub
Data in sheet 1:
Data in sheet2:
Result, Data in sheet 3:

Iterating 100 cells takes too long

In my Excel VBA code, I need to move some data from a range to another sheet.
As of now, I'm iterating through the range and copying the values like this:
For offset = 0 To 101
ActiveWorkbook.Sheets(Sheet).Range("C3").offset(offset, 0).Value = ActiveSheet.Range("D4").offset(offset, 0).Value
Next offset
However, it takes almost a minute to iterate and copy the values for the 100 cells.
Would I be better off using Copy-Paste programmatically, or is there a way to copy the entire range at once? Something like:
ActiveWorkbook.Sheets(Sheet).Range("C3:C102").Value = ActiveSheet.Range("D4:D104").Value
You can read the entire range at once into a Variant array, and then write it back to another range. This is also quick, flickerless, and has the added bonus that you can code some operations on the data if you are so inclined.
Dim varDummy As Variant
varDummy = ActiveSheet.Range("D4:D104")
' Can insert code to do stuff with varDummy here
ActiveWorkbook.Sheets(Sheet).Range("C3:C103") = varDummy
This I learned the hard way: Avoid Copy/Paste if at all possible! Copy and Paste use the clipboard. Other programs may read from / write to the clipboard while your code is running, which will cause wild, unpredictable results.
Also, it's generally a good idea to minimize the number of interactions between VBA and Excel, because they are slow. Having such interactions in a loop is multiply slow.
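As a rough illustration of doing work on the data while it is in memory, the sketch below reads the range into an array, trims any text values, and writes everything back in a single assignment. The sheet name "Target" and the macro name are hypothetical; the ranges are the ones from the question.

```vba
Sub Copy_Range_Via_Array()
    ' Sketch only. "Target" is a hypothetical destination sheet name.
    Dim data As Variant
    Dim r As Long

    ' One read from the sheet: a 1-based array of 101 rows by 1 column
    data = ActiveSheet.Range("D4:D104").Value

    ' Work on the values in memory instead of touching cells in a loop
    For r = LBound(data, 1) To UBound(data, 1)
        If VarType(data(r, 1)) = vbString Then data(r, 1) = Trim$(data(r, 1))
    Next r

    ' One write back to the sheet, sized to match the array
    ActiveWorkbook.Sheets("Target").Range("C3").Resize(UBound(data, 1), 1).Value = data
End Sub
```

Sizing the destination with Resize from the array bounds avoids the source and destination ranges drifting out of step.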
So, silly me did not try before posting here. Apparently, I can move data for an entire range this way:
ActiveWorkbook.Sheets(Sheet).Range("C3:C103").Value = ActiveSheet.Range("D4:D104").Value
It's as fast as copy-paste, without the switching of sheets.
Iterating through the range using a For loop takes about 45s for 100 cells, while the above two options are instant.
You can speed up code and stop flickering with:
Application.ScreenUpdating = False
'YOUR CODE
Application.ScreenUpdating = True
More: http://www.ozgrid.com/VBA/excel-macro-screen-flicker.htm
Columns("A:Z").Select
Selection.Copy
Sheets("Sheet2").Select
Range("A1").Select
ActiveSheet.Paste
That will copy columns A to Z from Sheet 1 to Sheet 2. This was generated by recording the macro. You can also apply it to ranges with something like this:
Range("D4:G14").Select
Selection.Copy
Sheets("Sheet2").Select
Range("D4").Select
ActiveSheet.Paste
Is this something like what you're after?
If you need anything specific and you can do it manually (e.g. copy and paste), record the macro to get the VBA code for it.
Copying and pasting has a decent amount of overhead in VBA, as does dealing with ranges like that. It's been a while since I have done VBA, but if I recall correctly the fastest way to do something like this is to write the values you want into an array and then use Resize. So something like this:
Option Base 0
Dim firstrow As Long
Dim lastrow As Long
Dim valuesArray() As Variant
Dim i As Long
' Set firstrow and lastrow however you deem appropriate
...
' Subtracting firstrow from lastrow gives the upper bound of the 0-based array.
' The array has two dimensions (rows x 1 column) so it can be written to a vertical range.
ReDim valuesArray(lastrow - firstrow, 0)
For i = 0 To (lastrow - firstrow)
valuesArray(i, 0) = Cells(i + firstrow, COLUMNNUMBER).Value
Next i
Of course, replace COLUMNNUMBER with whatever column it is you are iterating over. This should fill your array with your desired values. Then pick your destination cell and use Resize to put the values in. So if your destination cell is D4:
Range("D4").Resize(UBound(valuesArray, 1) + 1, 1).Value = valuesArray
That writes all the values in the array, starting at D4 and going down as many cells as there are rows in the array. It is slightly more complicated, but if you are going for speed I don't think I have ever come up with anything faster. Also, I did this off the top of my head, so please test and make sure that you don't cut off a cell here and there.
That OZGrid page has very useful info - http://www.ozgrid.com/VBA/SpeedingUpVBACode.htm
In my case, I need the formatting to be copied as well so I have been using this:
Sheet1.Range("A1:A200").Copy Destination:=Sheet2.Range("B1")
but I was still seeing very slow execution, to the point of locking up the application. I finally found the problem: at some point in the past, a number of empty text boxes had got onto my sheet. They were copied each time my code ran, but they were not erased by my code that clears the working area. The result was something like 4,500 empty text boxes, each of which was copied and pasted even by the code above.
To check for this, use Edit - Go To... - Special and choose Objects. If nothing gets selected, that is good; if you see a bunch of objects that you were not aware of on your sheet, that is not good.