How to speed up reading a big Word table in VBA?

I have a Word table with 1,300 rows and 4 columns. It contains a list of documents with their main characteristics.
I want to load the content of this table into an array to get maximum performance from some filtering functions.
My problem is that the load time grows much faster than the size of the Word table: for 300 rows it takes a couple of seconds, for 600 rows about 15 seconds, and for 1,200 rows more than 60 seconds.
I have tested several ways of reading my table, for instance:
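Dim My_Table As Table
Dim My_Array(1 To 1300, 1 To 4) As String
Dim Nb_Rows As Long, Nb_Cols As Long, I As Long, J As Long
Set My_Table = ActiveDocument.Tables(1) ' assumed: the table is the first one in the document
Nb_Cols = My_Table.Columns.Count
Nb_Rows = My_Table.Rows.Count
For I = 1 To Nb_Rows
    For J = 1 To Nb_Cols
        ' Range.Text also includes the end-of-cell marker, Chr(13) & Chr(7)
        My_Array(I, J) = My_Table.Cell(I, J).Range.Text
    Next J
Next I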
None of the other methods I have tested improved performance, and some of them made it worse! For instance:
- define a range for My_Table, and loop over the cells of this range.
- convert the table to text and then loop over the paragraphs (converting to text takes ages!)
Is there any way to get close to what I would dream of, like in Excel:
My_array = My_table ???
Thanks for any clue to solve this.
P.S.: of course I have considered storing my big table in an Excel spreadsheet, but this implies creating a dynamic link to Excel in my set of programs, which has other drawbacks.
Very best regards.
S.C. (63200 RIOM - France)
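One direction that may be worth trying, offered as a sketch rather than a tested fix: instead of calling Cell(I, J) for every cell, read the whole table's Range.Text in a single call and split it in memory. The sketch assumes the table is uniform (no merged cells, no nested tables) and that Word ends each cell and each row with Chr(13) & Chr(7), so every row contributes Nb_Cols + 1 pieces. The function name LoadTableIntoArray is hypothetical.

```vba
' Sketch only: returns the cell texts of a uniform Word table as a 2-D array.
' Assumption: each cell AND each row ends with Chr(13) & Chr(7) in Range.Text.
Function LoadTableIntoArray(ByVal tbl As Table) As String()
    Dim parts() As String
    Dim result() As String
    Dim nRows As Long, nCols As Long
    Dim i As Long, j As Long

    nRows = tbl.Rows.Count
    nCols = tbl.Columns.Count

    ' One call into Word instead of nRows * nCols calls
    parts = Split(tbl.Range.Text, Chr(13) & Chr(7))

    ' Guard: a merged or nested cell would break the fixed stride used below
    If UBound(parts) + 1 < nRows * (nCols + 1) Then
        Err.Raise vbObjectError + 1, "LoadTableIntoArray", _
                  "Table is not uniform; cannot map text to cells."
    End If

    ReDim result(1 To nRows, 1 To nCols)
    For i = 1 To nRows
        For j = 1 To nCols
            ' Each row holds nCols cell texts plus one empty end-of-row piece
            result(i, j) = parts((i - 1) * (nCols + 1) + (j - 1))
        Next j
    Next i

    LoadTableIntoArray = result
End Function
```

It could be called as My_Array = LoadTableIntoArray(ActiveDocument.Tables(1)), with My_Array declared as Dim My_Array() As String. Unlike Cell(I, J).Range.Text, the pieces returned here no longer carry the trailing end-of-cell marker.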

Related

Is there a script to bypass 50000 characters for in-cell formula?

I have this (insanely) long formula I need to run in Google Sheets, and I came across the limit error:
There was a problem
Your input contains more than the maximum of 50000 characters in a single cell.
Is there a workaround for this?
my formula is:
=ARRAYFORMULA(SPLIT(QUERY({B!A1:A100; ........ ; CA!DZ1:DZ100},
"select * where Col1 is not null order by Col1 asc", 0), " "))
full formula is: pastebin.com/raw/ZCkZahpw
apologies for Pastebin... I got a few errors here too:
note 1: even though it's a long formula, its output should be only ~100 rows × 3 columns
note 2: so far I have managed to bypass the JOIN/TEXTJOIN limit of 50000+ characters and even the 500000 limit for total cells
If the length of {B!A1:A100; ........ ; CA!DZ1:DZ100} is greater than 50 thousand characters, consider building a custom function that builds the array for you. You could "hard-code" the references or list them as text in a range to be read by your script.
Then, the resulting formula could look like this:
=ARRAYFORMULA(SPLIT(QUERY(MYCUSTOMFUNCTION(),
"select * where Col1 is not null order by Col1 asc", 0), " "))
or like this
=ARRAYFORMULA(SPLIT(QUERY(MYCUSTOMFUNCTION(A1:A1000),
"select * where Col1 is not null order by Col1 asc", 0), " "))
(assuming that you have 1000 references).
A custom function works because, on the Google Sheets side, the cell holds only a few characters instead of a formula that exceeds the cell content limit, and because, by following good practices, it's possible to keep its run time under the 30-second execution limit for custom functions.
It's worth noting that the MYCUSTOMFUNCTION() variant (without arguments) will only be recalculated when the spreadsheet is opened, whereas the MYCUSTOMFUNCTION(A1:A1000) variant (with a range reference as argument) will be recalculated every time a cell in the referenced range changes.
References
Custom Functions in Google Sheets
getDataRange
UPDATE:
I managed to enter up to 323461 characters as a formula by using CTRL + H to replace a simple =SUM(1) formula with my huge formula, as described in this answer: https://webapps.stackexchange.com/a/131019/186471
___________________________________________________________
After some research, it looks like there isn't any workaround to pull this off.
The savings that were suggested (shortening: A!A:A, dropping: select *, asc, shortening: "where Col1!=''order by Col1") reduced it a bit, and the rest was split into two formulas in a VR {} array solution.

Setting number format using Excel object quickly

I am writing 12,000 records with 46 columns, as a production report, to an Excel file. The worksheet is not displayed while it is filled with data.
Previous StackOverflow information taught me to use arrays of objects to put values in ranges for speed. I had hoped this would work for formatting the values as well.
Code snippet:
objExcel.Calculation = XlCalculation.xlCalculationManual
objExcel.ScreenUpdating = False
dcel = objWS.Range(objWS.Cells(rowdatastart, 1), objWS.Cells(rowdataend, nProdReportCol.ProdReportColCount - 2))
dcel.Value = aobj
dcel.NumberFormat = bobj
objExcel.ScreenUpdating = True
objExcel.Calculation = XlCalculation.xlCalculationAutomatic
aobj and bobj are object(,) arrays that fit the range. bobj contains strings such as "h:mma/p" to display time as "12:23a", and "0.00" to show numbers as "53.25".
The "dcel.Value = aobj" takes half a second.
The "dcel.NumberFormat = bobj" takes 38 seconds.
Any suggestions for what I've missed or misunderstood? I'd rather a 7-second report (start Excel, write, save, close Excel) not take 45 seconds just because I want the numbers/dates/times to appear as I choose.
If each column has its own format, then try formatting a whole column at a time. Also, do not use an array for each column: if the formats within a column are identical, you can just use a single string.
Do so for each column.
Please try it and provide feedback.
After some more experimentation, I found the same solution as S Meaden. As only 14 of the 46 columns are not "General", I collected the column numbers for dates, times and 2-decimal numerics. Looping through each of these, I build a range covering the 12,000 records in one column and set its format. This takes about half a second for all the columns.
Odd that setting a row of 46 cells using a 46 object array takes so much longer than 12000 cells in a column, but there you are.
Thanks everyone.
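For readers who want to see the per-column idea spelled out, here is a minimal sketch in Excel VBA (the question's own code is VB.NET, so it would need translating). The procedure name FormatReportColumns and the column numbers are hypothetical placeholders; only the two format strings come from the question.

```vba
' Sketch only: apply one NumberFormat string per whole column
' instead of assigning a 2-D array of format strings to the full range.
Sub FormatReportColumns(ByVal ws As Worksheet, _
                        ByVal firstRow As Long, ByVal lastRow As Long)
    Dim timeCols As Variant, numCols As Variant
    Dim c As Variant

    ' Hypothetical column numbers; replace with the real non-"General" columns
    timeCols = Array(3, 4)
    numCols = Array(7, 9, 12)

    ' One assignment per column: every cell in the column gets the same format
    For Each c In timeCols
        ws.Range(ws.Cells(firstRow, c), ws.Cells(lastRow, c)).NumberFormat = "h:mma/p"
    Next c

    For Each c In numCols
        ws.Range(ws.Cells(firstRow, c), ws.Cells(lastRow, c)).NumberFormat = "0.00"
    Next c
End Sub
```

Columns left out of both lists keep the default "General" format, so only the handful of formatted columns cost any time.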

Excel Compare two sheets and update sheet 1

Okay, this has been asked multiple times, but I am asking again for the best possible solution:
I have two Excel files (not sheets). The first Excel file is huge and has close to 200,000 records. One of the columns (Gender) is corrupted and I have to fix it.
I have a second Excel file with only around 200 records; these have the correct values for the ones that are messed up.
For example:
and this is the file that has the correct values, with only around 200 records (only the corrupted ones).
Now I need a macro that finds these exact 200 records among the 200,000 records (by employee ID) and replaces the Gender value with the correct one.
I found something similar here, but I don't want to loop over 200,000 records 200 times. That feels like a performance overhead.
Is there a better option?
I am thinking an ideal solution would be:
- Loop through the 200 items, taking one employee ID per iteration
- Take that employee ID and do a "Find" operation in the employee ID column of the master Excel file
- If found, replace the Gender column value
Would there be any better solution? Any input is gladly appreciated.
One way to do this through VBA is to just loop through the 200 corrections, using the MATCH function on each ID to find the row it belongs to, as opposed to a second loop (a second loop through 200,000 would take ages, like you say).
For the sub below I have copied and pasted the 200-row table into columns 5:7 of the 200,000-row table. You can either automate this part easily enough, or just put in the correct sheet references for each part of the code.
I've also put in a checking line to make sure there IS a match for the current ID from the small table, otherwise it would throw an error. You could put an ELSE in front of the END IF in this check to highlight any IDs which weren't actually found. Here's the code, hope this method helps!
Sub replace_things()
    Dim x As Long, theRow As Long, aMatch As Long
    Dim cur As Variant
    With ActiveSheet
        For x = 2 To 200 'Change this to however many rows are in the small table
            cur = .Cells(x, 5) 'cur is the ID from the small table
            aMatch = Application.WorksheetFunction.CountIf(.Range("A:A"), cur) 'Check to see there's a match in the large table
            If aMatch > 0 Then 'if there's a match then...
                theRow = Application.WorksheetFunction.Match(cur, .Range("A:A"), 0) 'get the row number the match is actually on
                .Cells(theRow, 3) = .Cells(x, 7) 'when the row is found, replace with the relevant value from col 7 (col 3 of small table)
            End If
        Next x
    End With
End Sub
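As a variation on the sub above, here is a sketch of the ELSE branch mentioned earlier. It assumes the same layout (IDs of the large table in column A, small table pasted into columns 5:7) and swaps the CountIf check for Application.Match, which returns an error value instead of raising a run-time error when nothing is found. The name ReplaceAndFlagMissing and the yellow highlight are illustrative choices, not part of the original answer.

```vba
' Sketch only: one lookup per correction, and unmatched IDs are highlighted.
Sub ReplaceAndFlagMissing()
    Dim x As Long
    Dim cur As Variant
    Dim hit As Variant

    With ActiveSheet
        For x = 2 To 200 'Change this to however many rows are in the small table
            cur = .Cells(x, 5).Value
            ' Application.Match (unlike WorksheetFunction.Match) returns
            ' an error value when the ID is not present
            hit = Application.Match(cur, .Range("A:A"), 0)
            If IsError(hit) Then
                ' ID not found in the large table: mark it for manual review
                .Cells(x, 5).Interior.Color = vbYellow
            Else
                ' hit is the row number of the ID in column A
                .Cells(hit, 3).Value = .Cells(x, 7).Value
            End If
        Next x
    End With
End Sub
```

This scans column A once per correction rather than twice, which may matter little for 200 IDs but keeps the found and not-found paths in one place.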
A super quick way: copy your CORRECT employee ID list and paste it below the CORRUPT employee ID list, highlight duplicates, and correct the highlighted rows.
Otherwise, a VLOOKUP could label which ones are corrupt: basically, take a unique field from your correct list, compare it to your corrupt list, then fix the ~200 errors.
I assume that the employee ID is unique, so you can paste the correct ones under the existing ones, sort by empID and highlight duplicates to find them easily.

Recalculate (refresh) and save each outcome to another sheet

I'm an R programmer but am attempting VBA for the first time. I have a pretty simple population projection based on a series of randomly selected birth and survival values. I have predicted population values for years 1 to 20 in cells BC4:BC23 in Sheet1. Every time I refresh, the values change. I would like to refresh 100 times and store each set of population values in Sheet2 (100 columns with 20 values).
Based on my internet searching, it seems that a combination of a loop and EnableCalculation is a viable VBA approach for this. I've tried different coding approaches (Application.EnableEvents, CalculateManyTimes, etc) with no luck. Surely this kind of question has been asked before but I could not find it. Any tips would be appreciated. Thank you.
The key is Application.CalculateFull so the code could be:
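Sub CalculateAndSave()
    Dim Ws As Worksheet
    Dim i As Long
    Set Ws = Worksheets(2)
    For i = 1 To 100
        Application.CalculateFull 'force a full recalculation so the random values change
        'copy the 20 projected values into column i of the second sheet
        Ws.Range(Ws.Cells(4, i), Ws.Cells(23, i)).Value = Sheets(1).Range("BC4:BC23").Value
    Next i
End Sub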

For-Loop to assign index value Excel VBA

Apologies for the naive nature of this question, but I am very new to VBA.
I have a column of data with the number of page views a particular page has had.
I then have a separate sheet that assigns an index value between 1 and 30 depending on whether the number of page views exceeds a specific number.
For example, if a page has 10,000 page views then that is an index value of 4, as index 4 is any number from 8,640 (inclusive) up to 10,368, where it would become index 5.
As I have many rows of data to index, I would like to create a loop that checks which index each page should be assigned and then prints the index in a new column in the same row.
I have been looking at tutorials but can't find anything specific enough to help me out. If anyone has any advice or a quick example to get me started, that would be much appreciated :)
Yes, you can do it with VBA, although as others have mentioned it's not required.
Forget about looping, it's slow and unnecessary.
Sub HTH()
    With Sheet1.Range("A1", Sheet1.Cells(Rows.Count, "A").End(xlUp)).Offset(, 1)
        .Formula = "=VLOOKUP(A1,Sheet2!A$1:B$5,2)"
        .Value = .Value
    End With
End Sub
Assumes you have the index laid out like this in columns A and B on Sheet2 (starting at A1 and B1):
A       B
1       1
5       2
500     3
8640    4
10368   5
And your page views in column A on Sheet1.