Quicker alternative to TimeScaleData for updating work in assignment for a specific time period - vba

I'm using VBA to update actual work in my projects. From an external csv sheet, I get actual work per task, resource and week, which then needs to be fed into Project in the same three dimensions.
I find that many things can be coded in two very different ways in Project, depending on whether I get the inspiration from recording a macro, or from exploring methods and properties in the Object Browser. For this operation, I haven't found any other way than what I learned by recording a macro while I update actual work in the Time Scale window in Task Usage view. This has given me the TimeScaleData method.
Here is a simplified version of my code. The variables MyStartDateString, MyEndDateString and MyActualWork are defined elsewhere.
Dim t As Task
Dim a As Assignment
For Each t In ActiveProject.Tasks
    For Each a In t.Assignments
        ' Type:=10 is pjAssignmentTimescaledActualWork; TimeScaleUnit:=3 is pjTimescaleWeeks
        a.TimeScaleData(StartDate:=MyStartDateString, _
            EndDate:=MyEndDateString, Type:=10, TimeScaleUnit:=3, _
            Count:=1).Item(1).Value = MyActualWork
    Next a
Next t
There is actually a lot more going on before this part, where I step through each week in the csv file, match the names of the task and resource with those in Project, and so on, but this is the critical part. I have found that in a project with around 1000 of these TimeScaleData operations, the whole thing takes around 45 seconds, which is annoyingly slow. Is there a faster and more elegant way?
Thanks a lot for your help!

Yes, there are often two ways of doing the same thing in Project, but in this case there is only one. However, it should help to make fewer calls to the TimeScaleData method--only one per assignment and then loop through the TimeScaleValues collection and set the Value property for each TimeScaleValue object.
Also, pass in date variables for the start and end dates, rather than strings. And turn off Calculation and ScreenUpdating.
Also look at similar Stack Overflow posts, such as "TimeScaleData in Project using .net".
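The suggestions above could be combined roughly as follows. This is only a sketch under assumptions: SetWeeklyActualWork, RunWithUpdatesSuspended, and the WeeklyMinutes array (one value per week, filled from the csv beforehand) are hypothetical names that do not appear in the question, and the values are assumed to be in minutes, which is how Project normally stores timescaled work.

```vba
Option Explicit

' Writes one actual-work value per week to a single assignment using ONE
' TimeScaleData call. WeeklyMinutes is a hypothetical array with one entry
' per week (assumed to be minutes), starting with the week of FirstWeekStart.
Sub SetWeeklyActualWork(ByVal a As Assignment, ByVal FirstWeekStart As Date, _
                        ByRef WeeklyMinutes() As Double)
    Dim tsvs As TimeScaleValues
    Dim weekCount As Long
    Dim lastWeekStart As Date
    Dim i As Long

    weekCount = UBound(WeeklyMinutes) - LBound(WeeklyMinutes) + 1
    ' Any date inside the last week should be enough to include its bucket.
    lastWeekStart = DateAdd("ww", weekCount - 1, FirstWeekStart)

    ' Date variables (not strings) are passed for the start and end dates.
    Set tsvs = a.TimeScaleData(StartDate:=FirstWeekStart, _
                               EndDate:=lastWeekStart, _
                               Type:=pjAssignmentTimescaledActualWork, _
                               TimeScaleUnit:=pjTimescaleWeeks, Count:=1)

    ' Loop through the returned collection instead of calling TimeScaleData per week.
    For i = 1 To tsvs.Count
        If i > weekCount Then Exit For
        tsvs.Item(i).Value = WeeklyMinutes(LBound(WeeklyMinutes) + i - 1)
    Next i
End Sub

' Suspends recalculation and screen updating while the updates run,
' then restores the previous settings even if an error occurs.
Sub RunWithUpdatesSuspended()
    Dim previousCalc As PjCalculation

    previousCalc = Application.Calculation
    Application.ScreenUpdating = False
    Application.Calculation = pjManual
    On Error GoTo CleanUp

    ' Match the csv rows to tasks and assignments here, then call
    ' SetWeeklyActualWork once per assignment.

CleanUp:
    Application.Calculation = previousCalc
    Application.ScreenUpdating = True
End Sub
```

Whether this is noticeably faster depends on how many weeks each assignment covers; the saving comes from replacing one TimeScaleData call per week with one per assignment.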

Related

Most Efficient Way To Find Which Range an Input Falls into from a List of ranges (as string)

Here is the situation: My workplace files engineering drawing pdfs by drawing number and sorts them into folders. There are 200+ of these folders. The folder is named with the range of drawing numbers that will go in the folder. For example:
"0.001.000 - 0.001.999"
Most of them are from x.xxx.000 - x.xxx.999, however, there are a few that are x.000.000 - x.500.999. Additionally, any with a first digit 9 are done in the format 9.xxx.x00 - 9.xxx.x99. So basically, the range they will fall into is not always consistent.
I currently have two ways I have figured out to tackle this:
(1) I take the input and use substrings to create the x.xxx. part, and then add the 000 and 999 on the end. I use if statements to handle the first digit 9 case and the others that don't follow the usual format, since I already know which ones these are.
(2) The other method I came up with seems more elegant, but it also seems to be slower:
I get all the folder names into a list. Then, using a for loop, I go through every folder in the list. In this loop I first strip the dots from the input and the folder name, and then call foldername.split("-"c) to get a min and a max. Then I use a select case to see if the input is between those two numbers. If it is, I set that as the folder to look in and exit the loop. If not, I go to the next folder and repeat. The problem is that, because it loops through 200+ folders, this process seems slow when you're looking up 10 or 20 drawing numbers.
Is there a better way to do this? Perhaps a way to go directly to which folder it is without having to loop through a bunch for every input? I have some ideas for reducing the number of folders the loop will have to go through (for example, only looping through folders with the same first digit would in some cases speed it up quite a bit), but I am not sure if it's possible to bypass the loop entirely without using the first method, which while simple also seems a little more brute force.
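One possible shape for method (2) that avoids re-parsing the folder names on every lookup is sketched below. It is written in VBA rather than the VB.NET the question appears to use, and it rests on assumptions: the names LoadRanges and FindFolder are made up for illustration, the ranges are assumed not to overlap, and the folder list is assumed to be sorted by lower bound (which sorting the fixed-width names alphabetically would give).

```vba
Option Explicit

Private mLow() As Long
Private mHigh() As Long
Private mNames() As String

' "0.001.999" -> 1999: drop the dots and read the rest as a number.
Private Function ToNumber(ByVal drawingNo As String) As Long
    ToNumber = CLng(Replace(Trim$(drawingNo), ".", ""))
End Function

' Parse every folder name ONCE, e.g. "0.001.000 - 0.001.999".
Public Sub LoadRanges(folderNames As Variant)
    Dim i As Long
    Dim parts() As String

    ReDim mLow(LBound(folderNames) To UBound(folderNames))
    ReDim mHigh(LBound(folderNames) To UBound(folderNames))
    ReDim mNames(LBound(folderNames) To UBound(folderNames))

    For i = LBound(folderNames) To UBound(folderNames)
        parts = Split(folderNames(i), "-")
        mLow(i) = ToNumber(parts(0))
        mHigh(i) = ToNumber(parts(1))
        mNames(i) = folderNames(i)
    Next i
End Sub

' Binary search over the parsed ranges.
' Returns the folder name, or "" when no range contains the drawing number.
Public Function FindFolder(ByVal drawingNo As String) As String
    Dim n As Long, first As Long, last As Long, m As Long

    n = ToNumber(drawingNo)
    first = LBound(mLow)
    last = UBound(mLow)

    Do While first <= last
        m = (first + last) \ 2
        If n < mLow(m) Then
            last = m - 1
        ElseIf n > mHigh(m) Then
            first = m + 1
        Else
            FindFolder = mNames(m)
            Exit Function
        End If
    Loop
End Function
```

LoadRanges would be called once after listing the folders; each FindFolder call then needs about eight numeric comparisons for 200 folders, and irregular ranges such as x.000.000 - x.500.999 need no special cases because only the parsed bounds are compared.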

How can I speed up my Microsoft Project VBA code?

I have a macro I use for Microsoft Project that loops through each task in the project and performs several checks to find any problems with the tasks. These checks include several If and Select Case statements. With large projects that have many tasks, the macro can take a long time to run. Is there anything I can do to speed up the macro? I have already turned off screen updating and set calculation to manual.
Turning off screen updating and setting calculation mode to Manual are the only application settings you can use to improve performance; the rest depends on your algorithm.
Your description of the problem is a bit vague: How large are your projects and how long does the macro take? If your projects are 1,000 tasks and you are making a dozen checks and your code takes more than five minutes, then yes, there is surely room for improvement. But if it's 20,000 tasks and 50 checks and the macro takes two minutes, stop trying to improve it--that's great performance.
Bottom line: it is impossible to tell if there is room for improvement without seeing your code.
If you use the same property (e.g. objTask.Start) in several different comparisons in your code then set the property into a local variable once and then perform your comparisons on the local variable.
For example:
Slow code:
If objTask.Start < TestDate1 And objTask.Start > TestDate2 Then ...
Fast code:
Dim dteStart As Date
dteStart = objTask.Start
If dteStart < TestDate1 And dteStart > TestDate2 Then ...
Calls to the COM object model are expensive. The second code example will be quite a bit faster, although, as noted by Rachel above, it really does depend on the volume of data being processed.
Also, make sure you define your variables with appropriate types as relying on the default Variant data type is very slow.
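Put together, a check loop that follows both pieces of advice might look like the sketch below. The function name and the particular check (tasks falling outside a date window) are invented for illustration; only the pattern of typed declarations and reading each property once per task comes from the answers above.

```vba
Option Explicit

' Counts tasks that start before windowStart or finish after windowEnd.
' Every variable has an explicit type, and each task property is read
' from the object model once and then compared as a local value.
Function CountTasksOutsideWindow(ByVal windowStart As Date, _
                                 ByVal windowEnd As Date) As Long
    Dim objTask As Task
    Dim dteStart As Date
    Dim dteFinish As Date
    Dim lngCount As Long

    For Each objTask In ActiveProject.Tasks
        If Not objTask Is Nothing Then        ' blank rows come back as Nothing
            dteStart = objTask.Start
            dteFinish = objTask.Finish
            If dteStart < windowStart Or dteFinish > windowEnd Then
                lngCount = lngCount + 1
            End If
        End If
    Next objTask

    CountTasksOutsideWindow = lngCount
End Function
```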
If you have variables that hold a lot of data, such as collections, consider setting them to Nothing at the end of your function:
Set TasksCollection=Nothing

What do I need to do in order to structure these results?

I'm trying to learn VBA, but it's very different from the type of programming I'm used to. For that reason I would appreciate your guidance.
I want to structure the results I get from simulations. There are screenshots below illustrating what I'm trying to describe with words here:
What I want to do is:
Copy all the results from one sheet to a new sheet (to keep the original data).
Delete certain columns, for instance B & D:E
Move (or copy, doesn't matter) rows 30:38 up besides rows 11:19, with one empty column in between. The result will be as shown in the last figure below. (The number of rows in each block varies, and there are 4 blocks).
I don't know if these are the recommended procedures, but I know I can:
Delete columns this way:
Sub sbDeleteAColumnMulti()
Columns("D").Delete
End Sub
Copy/paste a range like this:
Sub copyRangeOver()
Dim i As Integer
i = 6
Dim copyRange As Range
Set copyRange = ThisWorkbook.Worksheets("Sheet2").Range("A" & i + 1 & ":CA" & i + 1)
Dim countD As Integer
countD = 10
copyRange.Copy Destination:=Cells(countD, 2)
End Sub
A few things that are complicating this for me: the headers (except the first one), Bus ( A ) -LL Fault, are shifted one column to the right (they are now above Ik'', not Bus Name).
I don't know in advance how many rows are in each "block", thus I need to check this (but I know there are only 4 "blocks"). All "blocks" are the same size, so I can just check the number of rows between two Bus Names.
Now, I don't want someone to write the code for me! What I hope is that someone will suggest a procedure I can follow to make this work. I know that many roads lead to Rome, and I see that this question might come off as either a "Primarily opinion-based question" or "Too broad". However, I think it's a legitimate question that belongs here. I'm not trying to start a debate over what the "best" way of doing this is; I just want to find a way that works, as I currently don't know where to start. I'm not afraid of "the complicated way", if it's more robust and cleaner.
What I don't know is what kind of Modules, Class Modules (if any) etc. I need. Do I need Collections, or to create Public/Private subs? What would the purpose of each of those be in this case?
What I start with: (Edit: none of the cells are merged, it's just a bunch of whitespaces)
What I want:
Update:
Here's the first chunk of code I get when recording a macro (note that my workbook has more columns and rows than in the example I gave):
Range("D:I,K:M,O:P").Select
Range("O1").Activate
Selection.Delete Shift:=xlToLeft
ActiveWindow.SmallScroll Down:=39
Range("C52:E78").Select
Selection.Copy
ActiveWindow.SmallScroll Down:=-42
Range("G13").Select
ActiveSheet.Paste
ActiveWindow.SmallScroll Down:=84
Range("C91:E117").Select
To me, this looks like a piece of crap. Of course, it's possible that I should have created the macro differently, but if I did it the right way, I don't think it's much to work with. I guess I can delete all the SmallScroll-lines, but still...
Also, I don't see how I can adapt this, so that it will work if I have a different number of rows in each block.
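For reference, the recorded steps can be expressed without Select, Activate or SmallScroll, with the block geometry passed in as parameters so that a different block size only changes the arguments. This is a sketch under assumptions: the procedure names are hypothetical, and the numbers in RestructureResults are read off the recorded macro above (blocks of 27 rows starting at rows 13, 52, 91, ..., in columns C:E after the deletion), so they would have to be replaced or computed for other data.

```vba
Option Explicit

' Copies blocks 2..blockCount up beside block 1, leaving one empty column
' between neighbouring blocks.
'   firstRow   - first row of block 1
'   blockRows  - number of rows in each block
'   blockStep  - distance in rows from the start of one block to the next
'   firstCol   - first column of the blocks (3 = column C)
'   colCount   - number of columns in each block
Sub PlaceBlocksSideBySide(ByVal ws As Worksheet, ByVal firstRow As Long, _
                          ByVal blockRows As Long, ByVal blockStep As Long, _
                          ByVal blockCount As Long, ByVal firstCol As Long, _
                          ByVal colCount As Long)
    Dim b As Long
    Dim src As Range
    Dim destCol As Long

    For b = 2 To blockCount
        Set src = ws.Cells(firstRow + (b - 1) * blockStep, firstCol) _
                    .Resize(blockRows, colCount)
        destCol = firstCol + (b - 1) * (colCount + 1)
        src.Copy Destination:=ws.Cells(firstRow, destCol)
    Next b
End Sub

Sub RestructureResults()
    Dim ws As Worksheet

    ActiveSheet.Copy After:=ActiveSheet      ' keep the original data untouched
    Set ws = ActiveSheet                     ' the copy is now the active sheet

    ws.Range("D:I,K:M,O:P").Delete Shift:=xlToLeft
    PlaceBlocksSideBySide ws, 13, 27, 39, 4, 3, 3
End Sub
```

With these arguments the second block is taken from C52:E78 and pasted at G13, and the third from C91:E117, matching the recorded steps. The block size could instead be found by counting the rows between two Bus Name headers and passed in the same way.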
To get this, you're going to want to start with using the Macro Recorder from Excel.
If you are doing the exact same formatting options for the exact same data output each time, this is by far your best bet. The recorder will copy whatever you do for formatting and write the code you need. It may not be the best code but it will by far be the best option for what you are describing.
If (when?) you need to start adding logic other than the same formatting, you will then have functional code which will make your life easier.
But isn't the macro recorder going to generate bad code and/or it's better to just code from scratch?
I'm fairly experienced at this point and often use the macro recorder because... while it does put a lot of code there which isn't strictly speaking necessary, it gets you a ton of the more obscure stuff (how do I format the cell border to be this way?) etc. Of course it's better to not only use the recorder, but for your example it's even more perfect, you get all the formatting recorded and then can modify the logic and not have to waste time figuring out syntax for formatting, deleting columns, etc.
Very few languages offer the ability to basically say, "I want to do what I am doing now programmatically - how can I start?" the way VBA does. You can bypass a lot of annoying syntax issues when learning (especially if you've previously done any sort of coding) and focus right on the logic you want to add. It works out pretty well, honestly.

VBA: Performance of multidimensional List, Array, Collection or Dictionary

I'm currently writing code to combine two worksheets containing different versions of data.
Hereby I first want to sort both via a Key Column, combine 'em and subsequently mark changes between the versions in the output worksheet.
As the data amounts to already several 10000 lines and might some day exceed the lines-per-worksheet limit of excel, I want these calculations to run outside of a worksheet. Also it should perform better.
Currently I'm thinking of a Quicksort of first and second data and then comparing the data sets per key/line. Using the result of the comparison to subsequently format the cells accordingly.
Question
I'd just love to know, whether I should use:
List OR Array OR Collection OR Dictionary
OF Lists OR Arrays OR Collections OR Dictionaries
So far I have been unable to determine the differences in codability and performance between these 16 possibilities. Currently I'm implementing an Array OF Arrays approach, constantly wondering whether this makes sense at all.
Thanks in advance, appreciate your input and wisdom!
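To make the options concrete, here is a sketch of just one of the 16 combinations: two 2-D Variant arrays (as returned by Range.Value) plus a Dictionary that maps each key to its row index, so that no sorting is needed before comparing. It is not a performance claim. The function names are hypothetical, Scripting.Dictionary is only available on Windows, and both arrays are assumed to have the same columns and no error values.

```vba
Option Explicit

' Maps each key found in column keyCol of a 2-D array to its row index.
Function IndexByKey(data As Variant, ByVal keyCol As Long) As Object
    Dim d As Object
    Dim r As Long

    Set d = CreateObject("Scripting.Dictionary")
    For r = LBound(data, 1) To UBound(data, 1)
        d(CStr(data(r, keyCol))) = r
    Next r
    Set IndexByKey = d
End Function

' Returns the keys of newData whose row is missing from oldData or differs
' from the row with the same key in oldData.
Function ChangedKeys(oldData As Variant, newData As Variant, _
                     ByVal keyCol As Long) As Collection
    Dim oldIndex As Object
    Dim result As New Collection
    Dim r As Long, c As Long, oldRow As Long
    Dim k As String
    Dim differs As Boolean

    Set oldIndex = IndexByKey(oldData, keyCol)

    For r = LBound(newData, 1) To UBound(newData, 1)
        k = CStr(newData(r, keyCol))
        If Not oldIndex.Exists(k) Then
            result.Add k                      ' key only exists in the new version
        Else
            oldRow = oldIndex(k)
            differs = False
            For c = LBound(newData, 2) To UBound(newData, 2)
                If newData(r, c) <> oldData(oldRow, c) Then
                    differs = True
                    Exit For
                End If
            Next c
            If differs Then result.Add k
        End If
    Next r

    Set ChangedKeys = result
End Function
```

The arrays could be loaded with something like Worksheets("Old").Range("A1").CurrentRegion.Value (sheet name assumed), and the returned keys then used to decide which output cells to format.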
Some time ago, I had the same problem with a client's macro. In addition to the very large number of rows (over 50000 and growing), it became tremendously slow beyond a certain row number (around 5000) when a "standard approach" was taken, that is, when the inputs for the calculations on each row were read from the same worksheet (a couple of rows above). This reading and writing was what made the process slower and slower (apparently, Excel starts from row 1, so the further down the row is, the longer it takes to reach it).
I improved this situation by relying on two different solutions: firstly, setting a maximum number of rows per worksheet, once reached, a new worksheet was created and the reading/writing continued there (from the first rows). The other change was moving the reading/writing in Excel to reading from temporary .txt files and writing to Excel (all the lines were read right at the start to populate the files). These two modifications improved the speed a lot (from half an hour to a couple of minutes).
Regarding your question, I wouldn't rely too much on arrays with a macro (although I am not sure how much information each of these 10000 lines contains), but I guess that this is a personal decision. I don't like collections too much because they are less efficient than arrays, and the same goes for dictionaries.
I hope that this "short" comment will be of any help.

Is there a way for VBA UDF to "know" what other functions will be run?

Assume I have a UDF that will be used in a worksheet 100,000+ times. Is there a way, within the function, for it to know how many more times it is going to be called in the batch? Basically what I want to do is have every function create a to-do list of work to do. I want to do something like:
IF remaining functions to be executed after this one = 0 then ...
Is there a way to do this?
Background:
I want to make a UDF that will perform SQL queries with the user just giving parameters (date, hour, node, type). This is pretty easy to make if you're willing to actually execute the SQL query every time the function is run. I know it's easy because I did this and it was ridiculously slow. My new idea is to have the function first see if the data it is looking for exists in a global cache variable and, if it doesn't, to add it to a global variable "job-list".
What I want it to do is when the last function is called to then go through the job list and perform the fewest number of SQL queries and fill the global cache variable. Once the cache variable is full it would do a table refresh to make all the other functions get called again since on the subsequent call they'll find the data they need in the cache.
Firstly:
VBA UDF performance is extremely sensitive to the way the UDF is coded:
see my series of posts about writing efficient VBA UDFs:
http://fastexcel.wordpress.com/2011/06/13/writing-efficient-vba-udfs-part-3-avoiding-the-vbe-refresh-bug/
http://fastexcel.wordpress.com/2011/05/25/writing-efficient-vba-udfs-part-1/
You should also consider using an Array UDF to return multiple results:
http://fastexcel.wordpress.com/2011/06/20/writing-efiicient-vba-udfs-part5-udf-array-formulas-go-faster/
Secondly:
The 12th post in this series outlines using the AfterCalculate event and a cache
http://fastexcel.wordpress.com/2012/12/05/writing-efficient-udfs-part-12-getting-used-range-fast-using-application-events-and-a-cache/
Basically the approach you would need is for the UDF to check the cache and, if the value is not current or not available, add a request to the queue. Then use the AfterCalculate event to process the queue and, if necessary, trigger another recalc.
Performing 100,000 SQL queries from an Excel spreadsheet seems like a poor design. Creating a caching mechanism on top of these seems to compound the problem, making it more complicated than it probably needs to be. There are some circumstances where this might be appropriate, but I would consider other design approaches instead.
The most obvious is to take the data from the Excel spreadsheet and load it into a table in the database. Then use the database to do the processing on all the rows at once. The final step is to read the result back into Excel.
I find that the best way to get large numbers of rows from Excel into a database is to save the Excel file as csv and bulk insert them.
This approach may not work for your problem. In general, though, set-based approaches running in the database are going to perform much better.
As for the caching mechanism, if you have to go down that route, I can imagine a function with the following pseudo-code:
Check if input values are in cache.
If so, read values from cache.
Else do complex processing.
Load values in cache.
This logic could go in the function. As @Bulat suggests, though, it is probably better to add an additional caching layer around the function.
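In VBA, the pseudo-code above might be sketched like this. The names CachedLookup, ExpensiveLookup and CacheMisses are hypothetical, the body of ExpensiveLookup is a placeholder for the real SQL query, and Scripting.Dictionary assumes Windows. The cache lives in a module-level variable, so it is lost whenever the VBA project is reset.

```vba
Option Explicit

Private mCache As Object        ' Scripting.Dictionary, created on first use
Private mMisses As Long         ' how often the expensive step actually ran

' Placeholder for the expensive step (the SQL query in the question).
Private Function ExpensiveLookup(ByVal node As String, ByVal hr As Long) As Double
    mMisses = mMisses + 1
    ExpensiveLookup = Len(node) * 100 + hr    ' dummy result, not real data
End Function

' Check the cache first; only do the complex processing on a miss,
' then store the result so later calls with the same inputs are cheap.
Public Function CachedLookup(ByVal node As String, ByVal hr As Long) As Double
    Dim cacheKey As String

    If mCache Is Nothing Then Set mCache = CreateObject("Scripting.Dictionary")

    cacheKey = node & "|" & CStr(hr)
    If Not mCache.Exists(cacheKey) Then
        mCache.Add cacheKey, ExpensiveLookup(node, hr)
    End If
    CachedLookup = mCache(cacheKey)
End Function

' Exposes the miss counter so the effect of the cache can be checked.
Public Function CacheMisses() As Long
    CacheMisses = mMisses
End Function
```

Calling CachedLookup twice with the same arguments runs the expensive step once. This caches per input and does not batch the queries, so it does not by itself reduce the number of distinct SQL calls.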