Using a Linest Function within a Nested Loop - vba

I am working with VBA and am very much a novice. I have 3 columns of data which act as the independent variables (MSCI Value, Growth and Small Cap), then a blank column, followed by numerous columns containing fund data (the dependent variables). Most of these have the same number of rows but a few do not.
I am looking to use the LINEST function in Excel to produce the coefficient (beta) of each fund against each independent variable separately (MSCI Growth, Value, Small Cap). I am unsure of the best way to set out my data and VBA. Any thoughts/ideas would be much appreciated.
Currently my thoughts are a nested loop, whereby I use the LINEST function to regress the first independent variable (MSCI Growth, column 2) against the first dependent variable (column 6), and this column number is incremented each time until the column is blank (there are no more funds). When this happens it loops back to the first fund but changes to the next independent variable (MSCI Value, column 3). This process is repeated until the last independent variable (MSCI Small Cap, column 4) has been regressed against the last fund.
My problems so far have been 1) creating a LINEST function using named ranges
2) creating a table where the results of the loop are placed.
Set StartCell = Range("B9")
LastRow = Cells(Rows.Count, 1).End(xlUp).Row
Set gRange = Range(StartCell, Cells(LastRow, 2)) 'MSCI growth range
Range("M21").value = Evaluate("Linest(gRange,G9:G112)") 'column G contains the first fund.
This code doesn't run. I think it has something to do with the array formula; I only need the coefficient, so I do not need the whole array.
I tried using cell references, but when I ran the code I got #VALUE
Range("M22").value = Evaluate("Linest(Range((cells(9,2):cells(112,2)),Range(cells(9,7):cells(112,7)))")
Maybe I am going about this the wrong way. I want to create a global macro I can use on other sheets, but I am unsure how to approach the task.

You need to take the VBA part out of the quotes and concatenate its address into the formula string.
ActiveSheet.Range("M21").value = ActiveSheet.Evaluate("Linest(" & gRange.Address(0,0) & ",G9:G112)")(1)
The second one:
With ActiveSheet
.Range("M22").value = .Evaluate("Linest(" & .Range(.cells(9,2),.cells(112,2)).Address(0,0) & "," & .Range(.cells(9,7),.cells(112,7)).Address(0,0) & ")")(1)
End With
This will also error if the two ranges are not the same size, so make sure they are.
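Putting this together for the nested loop described in the question, a minimal sketch might look like the following. It assumes the layout from the question (index series in columns 2 to 4, first fund in column G, observations starting in row 9) and a hypothetical, already existing worksheet named "Results" for the output; neither the sheet name nor the output layout comes from the original post. INDEX(...,1) is wrapped around LINEST so that Evaluate returns only the slope. Note that LINEST expects the known y values (the fund) as its first argument and the known x values (the index) as its second.

```vba
Sub FundBetas()
    ' Assumptions (not from the original post): data on the active sheet,
    ' observations from row 9 down, index series in columns B:D (2 to 4),
    ' funds from column G (7) rightwards until a blank column is met,
    ' and an existing output sheet called "Results" (hypothetical name).
    Const FIRST_ROW As Long = 9
    Const FIRST_FUND_COL As Long = 7
    Dim src As Worksheet, out As Worksheet
    Dim xCol As Long, yCol As Long, lastRow As Long
    Dim xAddr As String, yAddr As String

    Set src = ActiveSheet
    Set out = src.Parent.Worksheets("Results")

    For xCol = 2 To 4                                   ' outer loop: each index
        yCol = FIRST_FUND_COL
        Do While Not IsEmpty(src.Cells(FIRST_ROW, yCol).Value)   ' inner loop: each fund
            ' Size both ranges to this fund's own last row so they always match
            lastRow = src.Cells(src.Rows.Count, yCol).End(xlUp).Row
            xAddr = src.Range(src.Cells(FIRST_ROW, xCol), src.Cells(lastRow, xCol)).Address
            yAddr = src.Range(src.Cells(FIRST_ROW, yCol), src.Cells(lastRow, yCol)).Address
            ' known_y's (fund) first, known_x's (index) second; INDEX(...,1) is the slope
            out.Cells(yCol - FIRST_FUND_COL + 2, xCol).Value = _
                src.Evaluate("INDEX(LINEST(" & yAddr & "," & xAddr & "),1)")
            yCol = yCol + 1
        Loop
    Next xCol
End Sub
```

Each fund gets one row on the output sheet and each index one column, so the results form a simple table. If a fund's rows contain blanks or text, LINEST returns an error value, which is written to the cell as is.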

Related

Compare and Select ranges based off most up-to-date Reading Date VBA

I am working on an Excel workbook where the user imports text files into a "Data Importation Sheet"; the number of text files imported is dynamic. See image.
Here is what I need to happen:
1) Need to find the most up-to-date Reading Date (in this example 2016)
2) Need to copy and paste the range of Depth values of the most up-to-date Reading Date to a separate sheet (in this example I would want to copy and paste values 1-17.5).
3) Need to check whether all other data sets contain this same range of Depth values. For the year 2014 you can see its depth goes from 0.5-17.5. I want to copy only the data within the range of the most up-to-date Reading Date, i.e. the range 1-17.5.
Here is my code to find the most up-to-date Reading date and to copy those depths to the other sheets.
Sub Copy_Depth()
Dim dataws As Worksheet, hiddenws As Worksheet
Dim tempDate As String, mostRecentDate As String
Dim datesRng As Range, recentCol As Range, headerRng As Range, dateRow As Range, cel As Range
Dim lRow As Long
Dim x As Double
Set dataws = Worksheets("Data Importation Sheet")
Set hiddenws = Worksheets("Hidden2")
Set calcws = Worksheets("Incre_Calc_A")
Set headerRng = dataws.Range(dataws.Cells(1, 1), dataws.Cells(1, dataws.Cells(1, Columns.Count).End(xlToLeft).Column))
'headerRng.Select
For Each cel In headerRng
If cel.Value = "Depth" Then
Set dateRow = cel.EntireColumn.Find(What:="Reading Date:", LookIn:=xlValues, lookat:=xlPart)
Set datesRng = dataws.Cells(dateRow.Row + 1, dateRow.Column)
'datesRng.Select
' Find the most recent date
tempDate = Left(datesRng, 10)
If tempDate > mostRecentDate Then
mostRecentDate = tempDate
Set recentCol = datesRng
End If
End If
Next cel
Dim copyRng As Range
With dataws
Set copyRng = .Range(.Cells(2, recentCol.Column), .Cells(.Cells(2, recentCol.Column).End(xlDown).Row, recentCol.Column))
End With
hiddenws.Range(hiddenws.Cells(2, 1), hiddenws.Cells(copyRng.Rows(copyRng.Rows.Count).Row, 1)).Value = copyRng.Value
calcws.Range(calcws.Cells(2, 1), calcws.Cells(copyRng.Rows(copyRng.Rows.Count).Row, 1)).Value = copyRng.Value
Worksheets("Incre_Calc_A").Activate
lRow = Cells(Rows.Count, 1).End(xlUp).Row
x = Cells(lRow, 1).Value
Cells(lRow + 1, 1) = x + 0.5
End Sub
Any tips/help would be greatly appreciated. I am fairly new to VBA and don't know how to go about comparing the depth ranges! Thanks in advance!
Assuming that your datasets are as regularly organised as your screenshot suggests then quite a lot of processing can be done in Excel.
The image below shows a possible approach based on the data shown in your example.
The approach exploits the fact that each data set occupies 7 columns of the importation worksheet. The =ADDRESS() function is used to build text strings which look like cell addresses and these are further manipulated to create text strings which look like range addresses. The approach also assumes that the reading date is always located in the third row following the final row of depth data.
The solution is slightly different to your problem, in that it identifies the common range of depth values across all datasets. For the example in the question this amounts to the same thing as identifying the depth values associated with the latest reading date.
This approach was taken because it is not clear from the question what should happen if, say, a dataset had depth values starting at 1.5 (so greater than the first value for the latest reading date) or ending at 17 (so less than the last value for the latest reading date). The approach can obviously be adapted if these possibilities will never occur.
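For readers who prefer to see the rule in code, the "common range" idea can be sketched in VBA as follows: the common range starts at the largest of the first depths and ends at the smallest of the last depths. The function name and the way the depths are passed in (two parallel arrays) are assumptions made for illustration; they are not part of the worksheet solution described here.

```vba
' Returns Array(commonStart, commonEnd) for a set of datasets.
' firstDepths(i) and lastDepths(i) are the first and last depth of dataset i;
' both arrays are assumed to have the same bounds (hypothetical input format).
Function CommonDepthRange(firstDepths As Variant, lastDepths As Variant) As Variant
    Dim i As Long
    Dim commonStart As Double, commonEnd As Double

    commonStart = firstDepths(LBound(firstDepths))
    commonEnd = lastDepths(LBound(lastDepths))
    For i = LBound(firstDepths) To UBound(firstDepths)
        If firstDepths(i) > commonStart Then commonStart = firstDepths(i)  ' latest start wins
        If lastDepths(i) < commonEnd Then commonEnd = lastDepths(i)        ' earliest end wins
    Next i
    CommonDepthRange = Array(commonStart, commonEnd)
End Function

Sub DemoCommonDepthRange()
    Dim r As Variant
    ' 2016 runs 1 to 17.5 and 2014 runs 0.5 to 17.5, as in the question
    r = CommonDepthRange(Array(1, 0.5), Array(17.5, 17.5))
    Debug.Print r(0), r(1)   ' prints 1 and 17.5
End Sub
```

If the common start turns out to be greater than the common end, the datasets do not overlap at all, which is a case the calling code would need to handle separately.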
The table shown in the image above has in its final column, a text representation of the ranges to be copied from the Data Importation Sheet. A simple bit of VBA can read this column, a cell at a time and use the text to assign an appropriate range object to which copy and paste methods can then be applied.
Additional bit of answer
The image above could be set-up as a "helper" worksheet. If there is always the same number of datasets on the Data Importation Worksheet then set up this helper sheet so that the number of rows in Table 2 is equal to this number of datasets. If the number of datasets is variable, then set up the helper sheet so that the number of rows in Table 2 is equal to the maximum number of datasets that is ever likely to be encountered. In this situation, when the number of datasets imported is fewer than this maximum, some rows of Table 2 will be unused and these unused rows will contain meaningless values in some columns.
Your VBA program should be organised to read the value in cell D2 of the helper sheet and then use this to determine how many rows of Table 2 to examine with the rest of your VBA code. This allows unused rows (if any) to be ignored.
If your VBA code identifies a value of, say, 10 in cell D2 of the helper sheet, then you will want your code to read, one at a time, the 10 values in the range Q12:Q21 (so in a loop). Each of these cells holds, as a string, the address of the range containing a single dataset's values, and so can be assigned to a Range object using code such as
Set datasetRng = Range(datasetStr)
where datasetStr is the text string read from a cell in Q12:Q21.
Still within the loop, datasetRng can then be copied and pasted to your output worksheet.
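A minimal sketch of that loop is shown below. The cell D2 and the start of the address column at Q12 come from the description above; the sheet names "Helper" and "Output", and the choice to paste the datasets side by side starting at A1, are assumptions made for illustration. It also assumes the address strings are unqualified (for example "A2:G36") and refer to the Data Importation Sheet.

```vba
Sub CopyDatasets()
    ' "Helper" and "Output" are hypothetical sheet names
    Dim helperWs As Worksheet, dataWs As Worksheet, outWs As Worksheet
    Dim nSets As Long, k As Long, nextCol As Long
    Dim datasetStr As String
    Dim datasetRng As Range

    Set helperWs = Worksheets("Helper")
    Set dataWs = Worksheets("Data Importation Sheet")
    Set outWs = Worksheets("Output")

    nSets = helperWs.Range("D2").Value          ' number of datasets actually imported
    nextCol = 1
    For k = 1 To nSets
        ' Read the k-th address string from Q12 downwards
        datasetStr = helperWs.Range("Q12").Offset(k - 1, 0).Value
        Set datasetRng = dataWs.Range(datasetStr)
        ' Paste each dataset to the right of the previous one (assumed layout)
        datasetRng.Copy Destination:=outWs.Cells(1, nextCol)
        nextCol = nextCol + datasetRng.Columns.Count
    Next k
End Sub
```

Because the loop stops at the count held in D2, any unused rows at the bottom of Table 2 are never read.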
Because the same helper worksheet can be re-used for each data importation, you should be able to incorporate it into your automation scheme. No need for copying and pasting formula down rows to create a different helper for each importation, just apply the same helper template to each data importation.
The approach adopted makes as much use of Excel as possible to determine relevant information about the imported data sets and summarises this information within the helper worksheet. This means VBA can be limited to automating the copy/paste operations on the datasets, and it reads information from the helper sheet to determine what to copy for each dataset.
It is of course possible to do everything in VBA but as you indicated you were fairly new to VBA it seemed sensible to tip the balance towards using less VBA and more Excel.
Incidentally, the problem of comparing the depth ranges is not really one of Excel or programming; it is one of analysis, i.e. looking at a range of cases, figuring out what needs to happen for each case, and distilling this into a set of processing rules (what some would call an algorithm). Only then should attempts be made to implement these processing rules (either via Excel formulas or VBA code). I have hinted at my analysis of the problem (finding the common range of depth values across all datasets) and you should be able to track through how I have implemented this in Excel to cater for cases where some datasets might contain Depth values which are less than the minimum of the common range or which are greater than its maximum (or possibly both).
End of additional bit
The formulas used are shown in the table below.

Referring Range with Known Columns and Unknown Rows Excel VBA

How do you refer to a range where the number of columns is known but you don't know which row? What's the correct way of rendering Range("A&i:J&i")?
For i = 8 To WSData.Range("A8").End(xlDown).Row
If Cells(i, 1) = "Overall Totals:" Then
WSData.Range("A&i:J&i").Interior.Color = RGB(217, 217, 217)
End If
Next
Scott's answer is of course quite correct. However, there are several other ways of referring to a variable range which you might find useful.
1) You could also use WSData.Range("A10", "J10"), i.e. you specify the top left and bottom right cells as two separate parameters. (The order of the parameters doesn't actually matter!)
In your example, you would use: WSData.Range("A" & i , "J" & i)
2) I find using numbers, rather than letters for columns is useful, especially if your columns will be unknown in advance. The basic structure is as follows.
WSData.Range(Cells(10, 1), Cells(10, 10)) 'A10 to J10
or in your example
WSData.Range(Cells(i, 1), Cells(i, 10))
However, one has to be careful! An unqualified Cells refers to the active worksheet. If this is not the same sheet as WSData, it will lead to a run-time error. This can easily be avoided by specifying the worksheet to which the "Cells" belong:
WSData.Range(WSData.Cells(i, 1), WSData.Cells(i, 10))
This may look rather long-winded but it gives you complete flexibility in specifying your range as you can use variables for each of the cell parameters.
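Applied to the loop from the question, a fully qualified version might look like the sketch below. Setting WSData to the active sheet is an assumption made only so that the example is self-contained; in the original code WSData is set elsewhere.

```vba
Sub ShadeTotalsRows()
    Dim WSData As Worksheet
    Dim i As Long, lastRow As Long

    Set WSData = ActiveSheet                      ' assumption: replace with the real data sheet
    lastRow = WSData.Range("A8").End(xlDown).Row

    For i = 8 To lastRow
        ' Cells(row, column): column 1 is A, column 10 is J
        If WSData.Cells(i, 1).Value = "Overall Totals:" Then
            WSData.Range(WSData.Cells(i, 1), WSData.Cells(i, 10)).Interior.Color = RGB(217, 217, 217)
        End If
    Next i
End Sub
```

Every Cells and Range call is qualified with WSData, so the macro behaves the same whichever sheet happens to be active.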

Find a value from a column and quickly return the row number of its cell

What I have
I have a file with part numbers and several suppliers for each part. There are 1500 parts with around 20 possible suppliers each. For the sake of simplicity let's say parts are listed in column A, with each supplier occupying a column after that. Values under the suppliers are entered manually but don't really matter.
In another sheet, I have a list of parts that is imported from an Access database. The parts list is imported, but not the supplier info. In both cases, each part appears only once.
What I want to do
I simply want to match the supplier info from the first sheet with the parts in the imported list. Right now, I have a function which goes through each part in the list with suppliers, copies the supplier information into an array, finds the part number in the imported part list (there is always a unique match) and copies the array next to it (with the supplier info inside). It works. Unfortunately, the Find function slows down considerably each time it is used. I know from various tests that it is the culprit, and I can't understand why it slows down (it starts at 200 loop iterations per second, slows down to 1 per second, and then Excel crashes). I may have a leak of some sort? The file size remains 7 MB throughout. Here it is:
Function LigneNum(numAHNS As String) As Integer
    Dim oRange As Range, aCell As Range
    Dim SearchString As String
    Set oRange = f_TableMatrice.Range("A1:A1500")
    SearchString = numAHNS
    Set aCell = oRange.Find(What:=SearchString, LookIn:=xlValues, _
        LookAt:=xlPart, SearchOrder:=xlByRows, SearchDirection:=xlNext, _
        MatchCase:=False, SearchFormat:=False)
    If Not aCell Is Nothing Then
        'We have found the number by now:
        LigneNum = aCell.Row
        Exit Function
    Else
        MsgBox "An AHNS number was not found: " & SearchString
        Debug.Print SearchString & " not found!"
        LigneNum = 0
        Exit Function
    End If
End Function
The function simply returns the row number on which the value is found, or 0 if it doesn't find it which should never happen.
What I need help with
I'd like either to identify the cause of the slowdown, or to find a replacement for the Find method. I have used Find before and this is the first time this has happened to me. The code was initially taken from Siddharth Rout's website: http://www.siddharthrout.com/2011/07/14/find-and-findnext-in-excel-vba/ What is strange is that it doesn't start slow, it just becomes sluggish as it goes on.
I think using Match could work, or maybe dumping the range to search (the part numbers) into an array and trying to match these with the imported parts number list could work. I am unsure how to do it, but my question is more about which one would be faster (as long as it remains under 15 seconds I don't really care, though, but looping over 1500 items 1500 times right out of the sheet is out of the question). Would anyone suggest match over the array solution / spending more hours fixing my code?
EDIT
Here is the loop it is being called from. I don't think it is problematic:
For Each cellToMatch In rngToMatch
Debug.Print cellToMatch.Row
'The cellsToMatch's values are the numbers I want, rngToMatch is the column where they are.
For i = 2 To nbSup + 1
infoSup(i - 2) = f_TableMatrice.Cells(cellToMatch.Row, i)
Next
'infoSup contains the required supplier data now
'I call the find function here to find the row where the number appears in the imported sheet
'To copy the array nbSup on that line
LigneAHNS = LigneNum(cellToMatch.Value) 'This is the Find function
If LigneAHNS = 0 Then Exit Sub
'This loop just empties the array in the right line.
For i = LBound(infoSup) To UBound(infoSup)
f_symix.Cells(LigneAHNS, debutsuppliers + i) = infoSup(i)
Next
Next
If I replace LigneAHNS = LigneNum by LigneAHNS = 20, for example, the code executes extremely fast. The leak therefore comes from the find function itself.
Another way to do it without using the find function might be something like this. Firstly, put the part IDs and their line numbers into a scripting dictionary. These are really quick to lookup from. Like this:
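' Requires a reference to "Microsoft Scripting Runtime" (Tools > References)
Dim Dict As New Scripting.Dictionary
Dim ColA As Variant
Dim LastRow As Long, i As Long
LastRow = Range("A50000").End(xlUp).Row
' Read column A into an array in one go, then index each part ID by its row
ColA = Range("A1:A" & LastRow).Value
For i = 1 To LastRow
    Dict.Add ColA(i, 1), i
Next i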
To further optimise, you could declare the Dict as a public variable, populate it once, and refer to it many times in your lookups. I expect this would be faster than running a cells.find over a range every time you do a lookup.
For syntax of looking up items in the dictionary, refer to Looping through a Scripting.Dictionary using index/item number
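As a rough illustration of the "populate once, look up many times" idea, the sketch below keeps the dictionary in a module-level variable and exposes a lookup function that returns 0 when a part is missing, mirroring the behaviour of the function in the question. The names BuildPartIndex and RowOfPart are made up for this example, and the dictionary is late bound so that no library reference is needed. Note that, unlike Find with LookAt:=xlPart, a dictionary lookup is an exact match on the whole part number; the sketch also assumes column A holds no error values.

```vba
' Module-level cache so the index is built only once
Private partRows As Object

' Build a part-number -> row-number index from column A of ws
Public Sub BuildPartIndex(ByVal ws As Worksheet)
    Dim colA As Variant
    Dim lastRow As Long, i As Long
    Dim key As String

    Set partRows = CreateObject("Scripting.Dictionary")   ' late bound: no reference needed
    partRows.CompareMode = vbTextCompare                   ' case-insensitive, like MatchCase:=False

    lastRow = ws.Cells(ws.Rows.Count, 1).End(xlUp).Row
    If lastRow < 2 Then Exit Sub                           ' a single cell would not come back as an array
    colA = ws.Range("A1:A" & lastRow).Value
    For i = 1 To lastRow
        key = CStr(colA(i, 1))
        ' Skip blanks and keep the first row if a part number repeats
        If Len(key) > 0 And Not partRows.Exists(key) Then partRows.Add key, i
    Next i
End Sub

' Return the row of a part number, or 0 when it is absent
Public Function RowOfPart(ByVal partNumber As String) As Long
    If partRows Is Nothing Then Exit Function              ' index not built yet: returns 0
    If partRows.Exists(partNumber) Then RowOfPart = partRows(partNumber)
End Function
```

In the loop from the question, BuildPartIndex would be called once before the loop on whichever sheet is being searched, and the call to the Find-based function would be replaced by RowOfPart(CStr(cellToMatch.Value)).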
You could achieve this with only Excel cell formulas and no VB if you are willing to devote a separate column to each supplier on your main parts sheet. You could then use conditional formatting to make it more visually appealing. I've tried it with 1500 rows and it's very quick. Increasing it to 5000 rows becomes noticeably slower, but you say you have only 1500 rows for now, so it should be suitable.
On Sheet 1, define a part number column and a separate column for each supplier.
Create a separate sheet for each supplier with all part numbers available from that supplier listed in column A. Make sure the rows on the supplier sheets are ordered by part number.
Name each of the supplier sheets the same as the associated column heading shown on Sheet 1.
Assign the following formula in each cell beneath each supplier column heading on Sheet 1:
=NOT(ISNA(VLOOKUP($A2,INDIRECT("'"&B$1&"'!A:A"),1,FALSE)))
The following screen cap shows this implemented along with conditional formatting to highlight which suppliers have which parts:
If you wanted to show quantities available from suppliers, then you could always have a second column (B) on the supplier sheets containing last known quantities for each part and use VLOOKUP to retrieve column B instead of A.

Not able to get range of excel worksheet

I'm currently trying to extract certain data from a workbook to put into a different workbook. I've got the workbook to open using Application.GetOpenFilename, and then assigning that to a workbook. Then I assign a sheet to the active worksheet from that workbook.
My problem is coming from trying to get the range of the worksheet. I'm using an array of strings, like
columnLetter(0) = "A"
columnLetter(1) = "B"
and so on, to check through all of the columns for certain strings (which are listed in an If statement with a ton of "Or"s). The specific place of the strings varies from file to file, so my plan was to search the first row, then the second row, etc. until it finds one of the strings. So, I'm using this:
lastRow = brokerSheet.Range(columnLetter(i) & Rows.Count).End(xlUp).Row
to get the number of rows in that column specifically. When I run the program, though, I get the error
Method 'Range' of object '_Worksheet' failed
on that line. I'm guessing that's because I'm trying to use columnLetter(i), which has A through R assigned to it, instead of "A" or something like that for the column name. However, I tried using 1, 2, 3, etc. to represent first column, second column, third column... but that didn't work. The worksheet only has around 90 rows (though some will have upwards of 400 once I get this working). Is there another way to do this? I could write out "A" "B" "C" etc. for all of them, but there has to be a better way to designate which column to check than that.
Why not try creating a string object strRange and injecting this into Range():
strRange = columnLetter(i) & CStr(Rows.Count)
lastRow = brokerSheet.Range(strRange).End(xlUp).Row
This should work.
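If the error persists, one possible cause (an assumption, since the question does not show enough to confirm it) is that the unqualified Rows.Count refers to the active sheet, which can have more rows than brokerSheet, for example when one workbook is in the newer .xlsx format and the other is an older .xls file. The sketch below qualifies Rows.Count with the sheet and uses column numbers with Cells, so no array of column letters is needed. The helper names are made up for illustration.

```vba
' Last used row of a given column, counted on the sheet itself
Function LastUsedRow(ByVal ws As Worksheet, ByVal colIndex As Long) As Long
    LastUsedRow = ws.Cells(ws.Rows.Count, colIndex).End(xlUp).Row
End Function

' Example: report the last used row of columns A to R (1 to 18)
Sub ScanColumns(ByVal brokerSheet As Worksheet)
    Dim c As Long
    For c = 1 To 18
        Debug.Print "Column " & c & ": last used row = " & LastUsedRow(brokerSheet, c)
    Next c
End Sub
```

Called as ScanColumns brokerSheet, this prints one line per column to the Immediate window.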

Why is my conditional format offset when added by VBA?

I was trying to add conditional formats like this:
If the expression =($G5<>"") is true, then set the interior to green; apply this to $A$5:$H$25.
I tried this in the dialog and it worked fine, as expected. I then tried to adapt it as VBA with the following code, which runs, but not as expected:
With ActiveSheet.UsedRange.Offset(1)
.FormatConditions.Delete
'set used row range to green interior color, if "Erledigt Datum" is not empty
With .FormatConditions.Add(Type:=xlExpression, _
Formula1:="=($" & cstrDefaultProgressColumn & _
.row & "<>"""")")
.Interior.ColorIndex = 4
End With
End With
The problem is that .Row provides the right row while debugging, but the conditional formula that gets added is one or more rows off, depending on how I set the row. So I end up with conditional formatting that is offset from the row which should have been formatted.
In the dialog it is then =($G6<>"") or G3 or G100310 or something like this. But not my desired G5.
Setting the row has to be dynamic, because this is used to set up conditional formats on different worksheets, which can have their data starting at different rows.
I was suspecting my With arrangement, but it did not fix this problem.
edit: To be more specific, this is NOT a UsedRange problem, having the same trouble with this:
Dim rngData As Range
Set rngData = ActiveSheet.Range("A:H") 'ActiveSheet.UsedRange.Offset(1)
rngData.FormatConditions.Delete
With rngData.FormatConditions.Add(Type:=xlExpression, _
Formula1:="=($" & cstrDefaultProgressColumn & _
1 & "<>"""")")
.Interior.ColorIndex = 4
End With
My Data looks like this:
1 -> empty cells
2 -> empty cells
3 -> empty cells
4 -> TitleCols -> A;B;C;...;H
5 -> Data to TitleCols
. .
. .
. .
25
When I execute this edited code on Excel 2007 and look up the formula in the conditional formatting dialog, it is =($G1048571<>""). It should be =($G1<>""), in which case everything would work fine.
What's even more strange: this is an edited version of working code, which used to add conditional formats for each row. Then I realized that it's possible to write an expression which formats a whole row or parts of it. I thought this would be adapted in a minute, and now this ^^
edit: Additional task information
I use conditional formatting here because this function shall set up a table to react to user input. So, if it is properly set up and a user edits some cell in my conditionalized column of this table, the corresponding row will turn green for the used range of rows.
Now, because there might be rows before the main header row, there might be a varying number of data columns, and the targeted column may also change, I do of course use some specific information.
To keep it minimal, I use named ranges to determine the correct offset and to determine the correct DefaultProgressColumn.
GetTitleRow is used to determine the header row by named range or header contents.
With ActiveSheet.UsedRange.Offset(GetTitleRow(ActiveSheet.UsedRange) - _
ActiveSheet.UsedRange.Rows(1).row + 1)
I corrected my Formula1, because I found the earlier construct was not well formed.
Formula1:="=(" & Cells(.row, _
Range(strMatchCol1).Column).Address(RowAbsolute:=False) & _
"<>"""")"
strMatchCol1 - is the name of a range.
Got it, lol. Set the ActiveCell before doing the grunt work...
ActiveSheet.Range("A1").Activate
Excel is pulling its automagic range adjusting, which is throwing off the formula when the FormatCondition is added.
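A variation on this idea, sketched below, is to make the first cell of the target range the active cell, so that the formula as written applies to the first row of that range and Excel has nothing to shift. The range A5:H25 and column G come from the question; the constant name and the fixed address are placeholders for whatever the real code determines dynamically.

```vba
Sub AddRowHighlight()
    Const PROGRESS_COL As String = "G"     ' placeholder for the real progress column
    Dim target As Range

    Set target = ActiveSheet.Range("A5:H25")

    ' The formula passed to FormatConditions.Add is read relative to the active cell,
    ' so put the active cell on the first cell of the range being formatted
    target.Cells(1, 1).Select

    target.FormatConditions.Delete
    With target.FormatConditions.Add(Type:=xlExpression, _
            Formula1:="=($" & PROGRESS_COL & target.Row & "<>"""")")
        .Interior.ColorIndex = 4           ' green
    End With
End Sub
```

Select only works here because the range is on the active sheet; for a range on another sheet that sheet would have to be activated first.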
The reason that Conditional Formatting and Data Validation exhibit this strange behavior is because the formulas they use are outside the normal calculation chain. They have to be so that you can refer to the active cell in the formula. If you're in G1, you can't type =G1="" because you'll create a circular reference. But in CF or DV, you can type that formula. Those formulas are disassociated with the current cell unlike real formulas.
When you enter a CF formula, it's always relative to the active cell. If, in CF, you make a formula
=ISBLANK($G2)
and you're in A5, Excel converts it to
=ISBLANK(R[-3]C7)
and when that gets put into the CF, it ends up being relative to the cell it's applied to. So in row 2, the formula comes out to
=ISBLANK($G65535)
(for Excel 2003). It offsets -3 rows and that wraps to the bottom of the spreadsheet.
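The wrap-around arithmetic can be written out as a small helper. This is only an illustration of the calculation described above, assuming a simple wrap from above row 1 to the bottom of the sheet; the function name is made up.

```vba
' Row that a relative reference lands on when applied to appliedRow,
' wrapping to the bottom of the sheet if it would go above row 1
Function WrappedRow(ByVal appliedRow As Long, ByVal rowOffset As Long, _
                    ByVal totalRows As Long) As Long
    Dim r As Long
    r = appliedRow + rowOffset
    If r < 1 Then r = r + totalRows
    WrappedRow = r
End Function

Sub DemoWrappedRow()
    ' Row 2 with R[-3] on a sheet with 65,536 rows: 2 - 3 = -1, and -1 + 65536 = 65535
    Debug.Print WrappedRow(2, -3, 65536)
End Sub
```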
You can use Application.ConvertFormula to make the formula relative to some other cell. If I'm in row 5 and the start of my range is in row 2, I make the formula relative to row 8. That way the R[-3] will put the formula in A5 as $G5 (three rows up from A8).
Sub test()
    Dim cstrDefaultProgressColumn As String
    Dim sFormula As String
    cstrDefaultProgressColumn = "$G"
    With ActiveSheet.UsedRange.Offset(1)
        .FormatConditions.Delete
        'set used row range to green interior color, if "Erledigt Datum" is not empty
        'Build formula
        sFormula = "=ISBLANK(" & cstrDefaultProgressColumn & .Row & ")"
        'convert to r1c1
        sFormula = Application.ConvertFormula(sFormula, xlA1, xlR1C1)
        'convert to a1 and make relative
        sFormula = Application.ConvertFormula(sFormula, xlR1C1, xlA1, , ActiveCell.Offset(ActiveCell.Row - .Cells(1).Row))
        With .FormatConditions.Add(Type:=xlExpression, _
            Formula1:=sFormula)
            .Interior.ColorIndex = 4
        End With
    End With
End Sub
I only offset .Cells(1) row-wise because the column is absolute in this example. If both row and column are relative in your CF formula, you need more offsetting. Also, this only works if the active cell is below the first cell in your range. To make it more general purpose, you would have to determine where the activecell is relative to the range and offset appropriately. If the offset put you above row 1, you would need to code it so that it referred to a cell nearer the bottom of the total number of rows for your version of Excel.
If you thought selecting was a bit of a kludge, I'm sure you'll agree that this is worse. Even though I abhor unnecessary Selecting and Activating, Conditional Formatting and Data Validation are two places where it's a necessary evil.
A brief example:
Sub Format_Range()
    Dim oRange As Range
    Dim iRange_Rows As Integer
    Dim iCnt As Integer
    'First, create a named range manually in Excel (eg. "FORMAT_RANGE")
    'In your case that would be range "$A$5:$H$25".
    'You only need to do this once,
    'through VBA you can afterwards dynamically adapt size + location at any time.
    'If you don't feel comfortable with that, you can create headers
    'and look for the headers dynamically in the sheet to retrieve
    'their position dynamically too.
    'Setting this range makes it independent
    'from which sheet in the workbook is active
    'No unnecessary .Activate is needed and certainly no hard coded "A1" cell.
    '(which makes it more potentially subject to bugs later on)
    Set oRange = ThisWorkbook.Names("FORMAT_RANGE").RefersToRange
    iRange_Rows = oRange.Rows.Count
    For iCnt = 1 To iRange_Rows
        If oRange(iCnt, 1) <> oRange(iCnt, 2) Then
            oRange(iCnt, 2).Interior.ColorIndex = 4
        End If
    Next iCnt
End Sub
Regarding my comments given on the other reply:
If you have to do this for many rows, it is definitely faster to load the entire range into memory (an array) and check the conditions within the array, after which you do the writing on those cells that need to be written (formatted).
I could agree that this technique is not "necessary" in this case; however, it is good practice because it is flexible for many (any type of) customizations afterwards and easier to debug (using the Immediate / Locals / Watches windows).
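A minimal sketch of that array-based variant is shown below. It reuses the named range "FORMAT_RANGE" and the column 1 versus column 2 comparison from the example above, and assumes the range has at least two columns and more than one cell (so that .Value returns a 2-D array) and contains no error values.

```vba
Sub Format_Range_Array()
    Dim oRange As Range
    Dim vData As Variant
    Dim i As Long

    Set oRange = ThisWorkbook.Names("FORMAT_RANGE").RefersToRange

    ' One read from the sheet: vData is a 1-based 2-D array (row, column)
    vData = oRange.Value

    For i = 1 To UBound(vData, 1)
        ' Test in memory; touch the sheet only for cells that need formatting
        If vData(i, 1) <> vData(i, 2) Then
            oRange.Cells(i, 2).Interior.ColorIndex = 4
        End If
    Next i
End Sub
```

The comparisons run against the in-memory copy, so the worksheet is only accessed once for reading and then once per cell that actually changes colour.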
I'm not a fan of Offset although I don't state it doesn't work as it should and in some limited scenarios I could say that the chance for problems "could" be small: I experienced that some business users tend to use it constantly (here offset +3, there offset -3, then again -2, etc...); although it is easy to write, I can tell you it is hell to revise. It is also very often subject to bugs when changes are made by end users.
I am very much "for" the use of headers (although I'm also a fan of reducing database capabilities for Excel, because for many it results in avoiding Access), because it will allow you a great deal of flexibility. Even though I used columns 1 and 2 here, it is better to retrieve the column number dynamically based on the location of the named range of the header. If another column is then inserted, no bugs will appear.
Last but not least, it may sound exaggerated, but the last time, I used a class module with properties and functions to perform all retrievals of potential data within each sheet dynamically, perform checks on all bugs I could think of and some additional functions to execute specific tasks.
So if you need many types of data from a specific sheet, you can instantiate that class and have all the data at your disposal, accessible through defined functions. I haven't noticed anyone doing it so far, but it gives you little trouble despite a bit more work (you can use the same principles again over and over).
Now I don't think that this is what you need; but there may come a day that you need to make large tools for end users who don't know how it works but will complain a lot about things because of something they might have done themselves (even when it's not your "fault"); it's good to keep this in mind.