I often see questions relating to Overflow errors with VBA.
My question: why use the Integer declaration at all, instead of simply defining all whole-number variables (excluding Double etc.) as Long?
Unless you're performing an operation such as a For loop, where you can guarantee that the value won't exceed the 32,767 limit, is there a performance impact or something else that would dictate not using Long?
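For reference, the overflow being asked about is easy to reproduce:

Sub IntegerOverflowDemo()
    Dim i As Integer
    i = 32767     ' the largest value an Integer can hold
    i = i + 1     ' raises run-time error 6, "Overflow"
End Sub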
Integer variables are stored as 16-bit (2-byte) numbers. (Office VBA Reference)
Long (long integer) variables are stored as signed 32-bit (4-byte) numbers. (Office VBA Reference)
So, the benefit is in reduced memory space. An Integer takes up half the memory that a Long does. Now, we are talking about 2 bytes, so it's not going to make a real difference for individual integers; it only becomes a concern when you are dealing with TONS of integers (e.g. large arrays) and memory usage is critical.
BUT on a 32-bit system, the halved memory usage comes at a performance cost. When the processor actually performs some computation with a 16-bit integer (e.g. incrementing a loop counter), the value silently gets converted to a temporary Long, without the benefit of the larger range of numbers to work with. Overflows still happen, and the register the processor uses to store the values for the calculation takes the same amount of memory (32 bits) either way. Performance may even be hurt because the data type has to be converted (at a very low level).
Not the reference I was looking for but....
My understanding is that the underlying VB engine converts Integers to Longs even if a variable is declared as an Integer, so a slight speed decrease can be noted. I have believed this for some time, and perhaps that's also why the above statement was made; I didn't ask for the reasoning.
ozgrid forums
This is the reference I was looking for.
Short answer: in 32-bit systems, 2-byte Integers are converted to 4-byte Longs. There really is no other way for the respective bits to line up correctly for any form of processing. Consider the following:
MsgBox Hex(-1) = Hex(65535) ' = True
Obviously -1 does not equal 65535, yet the computer is returning the correct answer, namely
"FFFF" = "FFFF"
However, had we coerced the -1 to a Long first, we would have got the right answer (the 65535, being greater than 32k, is automatically a Long):
MsgBox Hex(-1&) = Hex(65535) ' = False
"FFFFFFFF" = "FFFF"
Generally there is no point in VBA declaring "As Integer" on modern systems, except perhaps for some legacy APIs that expect to receive an Integer.
pcreview forum
And at long last, I found the MSDN documentation I was really looking for.
Traditionally, VBA programmers have used integers to hold small numbers, because they required less memory. In recent versions, however, VBA converts all integer values to type Long, even if they're declared as type Integer. So there's no longer a performance advantage to using Integer variables; in fact, Long variables may be slightly faster because VBA does not have to convert them.
To clarify based on the comments: Integers still require less memory to store - a large array of Integers will need significantly less RAM than a Long array with the same dimensions. But because the processor needs to work with 32-bit chunks of memory, VBA converts Integers to Longs temporarily when it performs calculations.
So, in summary: there's almost no good reason to use an Integer type these days, unless you need to interop with an old API call that expects a 16-bit int, or you are working with large arrays of small integers and memory is at a premium.
One thing worth pointing out is that some old API functions may expect parameters that are 16-bit (2-byte) Integers; if you are on a 32-bit system and try to pass a variable declared as a 4-byte Long by reference instead, it will not work, due to the difference in length in bytes.
Thanks to Vba4All for pointing that out.
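To illustrate the API point (my own sketch, not from the original answer): GetKeyState in user32 returns a C short, which maps to a 16-bit VBA Integer, so the declaration genuinely needs the Integer type. Omit PtrSafe on pre-VBA7 hosts.

Declare PtrSafe Function GetKeyState Lib "user32" (ByVal nVirtKey As Long) As Integer

Sub ShiftKeyDemo()
    Const VK_SHIFT As Long = &H10
    ' the high bit of the returned 16-bit value is set while the key is down,
    ' which makes the Integer negative
    If GetKeyState(VK_SHIFT) < 0 Then Debug.Print "Shift is down"
End Sub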
Even though this post is four years old, I was curious about this and ran some tests. The most important thing to note is that a coder should ALWAYS declare a variable as SOMETHING. Undeclared variables clearly performed the worst (undeclared variables are technically Variants).
Long did perform the fastest, so I have to think that Microsoft's recommendation to always use Long instead of Integer makes sense. I'm guessing the same is true of Byte, but most coders don't use it.
RESULTS ON 64-BIT WINDOWS 10 LAPTOP
Code Used:
Sub VariableOlymics()
    'Run this macro as many times as you'd like, with an activesheet ready for data
    'in cells B2 to D6
    Dim beginTIME As Double, trials As Long, i As Long, p As Long

    trials = 1000000000
    p = 0

    beginTIME = Now
    For i = 1 To trials
        Call boomBYTE
    Next i
    Call Finished(p, Now - beginTIME, CDbl(trials))
    p = p + 1

    beginTIME = Now
    For i = 1 To trials
        Call boomINTEGER
    Next i
    Call Finished(p, Now - beginTIME, CDbl(trials))
    p = p + 1

    beginTIME = Now
    For i = 1 To trials
        Call boomLONG
    Next i
    Call Finished(p, Now - beginTIME, CDbl(trials))
    p = p + 1

    beginTIME = Now
    For i = 1 To trials
        Call boomDOUBLE
    Next i
    Call Finished(p, Now - beginTIME, CDbl(trials))
    p = p + 1

    beginTIME = Now
    For i = 1 To trials
        Call boomUNDECLARED
    Next i
    Call Finished(p, Now - beginTIME, CDbl(trials))
    p = p + 1
End Sub

Private Sub boomBYTE()
    Dim a As Byte, b As Byte, c As Byte
    a = 1
    b = 1 + a
    c = 1 + b
    c = c + 1
End Sub

Private Sub boomINTEGER()
    Dim a As Integer, b As Integer, c As Integer
    a = 1
    b = 1 + a
    c = 1 + b
    c = c + 1
End Sub

Private Sub boomLONG()
    Dim a As Long, b As Long, c As Long
    a = 1
    b = 1 + a
    c = 1 + b
    c = c + 1
End Sub

Private Sub boomDOUBLE()
    Dim a As Double, b As Double, c As Double
    a = 1
    b = 1 + a
    c = 1 + b
    c = c + 1
End Sub

Private Sub boomUNDECLARED()
    a = 1
    b = 1 + a
    c = 1 + b
    c = c + 1
End Sub

Private Sub Finished(i As Long, timeUSED As Double, trials As Double)
    With Range("B2").Offset(i, 0)
        .Value = .Value + trials
        .Offset(0, 1).Value = .Offset(0, 1).Value + timeUSED
        .Offset(0, 2).FormulaR1C1 = "=ROUND(RC[-1]*3600*24,0)"
    End With
End Sub
As noted in other answers, the real difference between Integer and Long is the size of the memory allocated, and therefore the size of the number each can hold.
Here is the full documentation on these data types:
http://msdn.microsoft.com/en-us/library/office/ms474284(v=office.14).aspx
An Integer is 16 bits and can represent a value between -32,768 and 32,767.
A Long is 32 bits and can represent -2,147,483,648 to 2,147,483,647.
And there is a LongLong, which is 64 bits (available only in 64-bit Office) and can handle values up to about 9.2 quintillion.
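Since LongLong only compiles in 64-bit hosts, a portable sketch (my example) guards it with conditional compilation:

Sub LongLongDemo()
    #If Win64 Then
        Dim big As LongLong
        big = 2 ^ 40 ' comfortably beyond the Long limit of 2,147,483,647
        Debug.Print big
    #Else
        Debug.Print "LongLong requires 64-bit Office"
    #End If
End Sub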
One of the most important things to remember is that data types differ by both language and operating system/platform. In your world of VBA a Long is 32 bits, but in C# a long is 64 bits (and in C the width of long varies with platform and compiler). This can introduce significant confusion.
Although VBA does not have support for them, when I move to any other language such as those on .NET or Java, I much prefer to use the system data types Int16, Int32 and Int64, which let me be much more transparent about the values that can be held in these data types.
VBA has a lot of historical baggage.
An Integer is 16 bits wide and was a good default numeric type back when 16 bit architecture/word sizes were prevalent.
A Long is 32 bits wide and (IMO) should be used wherever possible.
I have taken #PGSystemTester's method and updated it to remove some potential variability. Placing the loops inside the routines removes the time taken to call the routine (which is a lot of time). I have also turned off screen updating to remove any delays that may cause.
Long still performed the best, and as these results are more closely limited to the impacts of the variable types alone, the magnitude of variation is worth noting.
My results (desktop, Windows 7, Excel 2010):
Code used:
Option Explicit

Sub VariableOlympics()
    'Run this macro as many times as you'd like, with an activesheet ready for data
    'in cells B2 to D6
    Dim beginTIME As Double, trials As Long, i As Long, p As Long
    Dim chosenWorksheet As Worksheet
    Set chosenWorksheet = ThisWorkbook.Sheets("TimeTrialInfo")

    Application.EnableEvents = False
    Application.Calculation = xlCalculationManual
    Application.ScreenUpdating = False

    trials = 1000000000 ' 1,000,000,000 - not 10,000,000,000 as used by #PGSystemTester
    p = 0

    beginTIME = Now
    boomBYTE trials
    Finished p, Now - beginTIME, CDbl(trials), chosenWorksheet.Range("B2")
    p = p + 1

    beginTIME = Now
    boomINTEGER trials
    Finished p, Now - beginTIME, CDbl(trials), chosenWorksheet.Range("B2")
    p = p + 1

    beginTIME = Now
    boomLONG trials
    Finished p, Now - beginTIME, CDbl(trials), chosenWorksheet.Range("B2")
    p = p + 1

    beginTIME = Now
    boomDOUBLE trials
    Finished p, Now - beginTIME, CDbl(trials), chosenWorksheet.Range("B2")
    p = p + 1

    beginTIME = Now
    boomUNDECLARED trials
    Finished p, Now - beginTIME, CDbl(trials), chosenWorksheet.Range("B2")
    p = p + 1

    Application.EnableEvents = True
    Application.Calculation = xlCalculationAutomatic
    Application.ScreenUpdating = True
    chosenWorksheet.Calculate
End Sub

Private Sub boomBYTE(numTrials As Long)
    Dim a As Byte, b As Byte, c As Byte
    Dim i As Long
    For i = 1 To numTrials
        a = 1
        b = 1 + a
        c = 1 + b
        c = c + 1
    Next i
End Sub

Private Sub boomINTEGER(numTrials As Long)
    Dim a As Integer, b As Integer, c As Integer
    Dim i As Long
    For i = 1 To numTrials
        a = 1
        b = 1 + a
        c = 1 + b
        c = c + 1
    Next i
End Sub

Private Sub boomLONG(numTrials As Long)
    Dim a As Long, b As Long, c As Long
    Dim i As Long
    For i = 1 To numTrials
        a = 1
        b = 1 + a
        c = 1 + b
        c = c + 1
    Next i
End Sub

Private Sub boomDOUBLE(numTrials As Long)
    Dim a As Double, b As Double, c As Double
    Dim i As Long
    For i = 1 To numTrials
        a = 1
        b = 1 + a
        c = 1 + b
        c = c + 1
    Next i
End Sub

Private Sub boomUNDECLARED(numTrials As Long)
    Dim a As Variant, b As Variant, c As Variant
    Dim i As Long
    For i = 1 To numTrials
        a = 1
        b = 1 + a
        c = 1 + b
        c = c + 1
    Next i
End Sub

Private Sub Finished(i As Long, timeUSED As Double, trials As Double, initialCell As Range)
    With initialCell.Offset(i, 0)
        .Value = trials
        .Offset(0, 1).Value = timeUSED
        .Offset(0, 2).FormulaR1C1 = "=ROUND(RC[-1]*3600*24,2)"
    End With
End Sub
This is a space vs. necessity problem.
In some situations it's a necessity to use a Long. If you're looping through rows in a large Excel file, the variable that holds the row number should be a Long.
However, sometimes you will know that an Integer can handle your problem, and using a Long would be a waste of space (memory). Individual variables really don't make much of a difference, but when you start dealing with arrays it can make a big difference.
In VBA7, Integers are 2 bytes and Longs are 4 bytes.
If you have an array of 1 million numbers between 1 and 10, using an Integer array would take up about 2 MB of RAM, compared to roughly 4 MB of RAM for a Long array.
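A quick sketch of that comparison (sizes approximate, ignoring the array descriptor overhead):

Sub ArrayFootprint()
    Dim smallInts() As Integer, bigInts() As Long
    ReDim smallInts(1 To 1000000) ' ~2 MB: 2 bytes per element
    ReDim bigInts(1 To 1000000)   ' ~4 MB: 4 bytes per element
End Sub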
As others already mentioned, a Long takes twice as much space as an Integer. As others also mentioned, the capacity of current computers means you will see no difference in performance whatsoever unless you are dealing with extremely large amounts of data:
Memory
Considering 1 million values, the difference between using Integers and Longs is 2 bytes per value, i.e. 2 x 1,000,000 / 1,024 / 1,024, which is less than 2 MB of difference in your RAM; likely much less than 1% or even 0.1% of your RAM capacity.
Processing
Considering the benchmark done by PGSystemTester, you can see a difference of 811 - 745 = 66 seconds between Longs and Integers when processing 10 billion batches of 4 operations each. Reduce that to 1 million operations and we can expect 66 / 10,000 / 4, which is less than 2 ms of difference in execution time.
I personally use Integers and Longs to help the readability of my code, particularly in loops, where an Integer indicates the loop is expected to be small (fewer than 1,000 iterations), whereas a Long tells me the loop is expected to be rather large (more than 1,000).
Note this subjective threshold is way below the Integer upper limit; I use Longs just to make the distinction between my own definitions of small and large.
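For example (purely illustrating that naming convention):

Sub ReadabilityConvention()
    Dim col As Integer ' signals a small loop, well under the 32,767 limit
    Dim row As Long    ' signals a potentially large loop
    For col = 1 To 100
        For row = 1 To 500000
            ' ...
        Next row
    Next col
End Sub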
I know it is an old post, but I found it while browsing and I wanted to share my findings too:
following #PGSystemTester's method I got these results,
and following AJD's these.
My setup: Intel i5-8500T CPU,
8 GB of RAM,
64-bit Windows 10 Enterprise 21H1 (OS build 19043.2006),
Excel Version 2108 (Build 14326.20852).
I don't know whether it had any influence, but I also have Rubberduck 2.5.2.5906 installed.
Either way, it seems my Integer was faster in both cases.
Thanks AJD and PGSystemTester. I confirm AJD's results of Integer 33 and Long 20. Yet Long is much, much faster here only because the test program is small enough to fit entirely in the processor's cache memory. The fastest RAM is the L1 cache, and that is only 32 kB for data plus 32 kB for program. Testing Byte, Integer and Long with one or more arrays that barely fit into the 32 kB of L1 data cache will give you totally different, even opposite, results for speed.
In my case, with the same arrays totalling 120 kB as Integers and 240 kB as Longs, I got the same result for Long as for Integer.
That is because changing to Long doubles the size of the same arrays compared with Integer, so more and more of the data falls outside the L1 cache. Reaching data outside the L1 cache takes many more clock cycles.
Therefore your test is good only as a micro-benchmark; in real life it can be misleading, as can the msdn.microsoft recommendation to use Long regardless. Also, those who emphasize that the RAM size is double for Long do not emphasize the consequence: the processor's waiting time to reach data outside the L1 cache, or worse, outside the L2 or L3 cache. For each step outside L1, L2 and L3, the time to reach the data increases dramatically, and this matters most for speed.
To summarize:
if your data fits inside the L1 cache then Long is fastest, but that is only about 4k values times 4 bytes for Long = 16 kB (because other programs and the OS populate the rest of the 32 kB of L1 cache);
Byte and Integer will drastically increase the speed for arrays of at least 16 kB in size, because changing to Long increases the size, and this forces more data to reside outside the fastest L1 cache RAM.
Try the same test, but instead of Dim a As Byte use Dim a() As Byte, for example:
Dim a() As Byte, b() As Byte, c() As Byte
ReDim a(7, 24, 60), b(24, 7, 60), c(24, 60, 7)
Dim h As Long, loops As Long
Dim i As Long, j As Long, k As Long ' these i, j, k always As Long
loops = 1
For h = 1 To loops
    For i = 1 To 6: For j = 0 To 23: For k = 1 To 58
        a(i, j, k) = a(i + 1, j, k): b(j, i, k) = b(j, i - 1, k)
        c(j, k, i) = a(i - 1, j, k + 1) + b(j, i - 1, k - 1)
    Next k: Next j: Next i
    For i = 6 To 1 Step -1: For j = 23 To 0 Step -1: For k = 58 To 1 Step -1
        a(i, j, k) = a(i + 1, j, k): b(j, i, k) = b(j, i - 1, k)
        c(j, k, i) = a(i - 1, j, k + 1) + b(j, i - 1, k - 1)
    Next k: Next j: Next i
Next h
First set "loops" to 1 to see how long it takes. Then increase it gradually aiming for several seconds for As Bytes. It will take longer for As Integer, and even longer for As Long...
The size of each of the 3 arrays is 8 x 25 x 61 = 12,200 elements, which is
12,200 bytes times 3 = 36,600 bytes (about 36 kB) for As Byte,
24,400 bytes times 3 = 73,200 bytes (about 71 kB) for As Integer,
48,800 bytes times 3 = 146,400 bytes (about 143 kB) for As Long.
Run the same code with Dim a() As Integer, b() As Integer, c() As Integer, then the same with Dim a() As Long, b() As Long, c() As Long and so on.
Now if you increase one dimension 20 times, you would expect a 20x increase in duration, but it will be much more, because the data now falls outside the L2 cache (1 MB shared across all 4 cores).
If you increase one dimension 200 times, you would expect a 200x increase in duration, but it will again be much more, because the data now falls outside the L3 cache (6-8 MB shared across all 4 cores, the same 8 MB for 8 cores, or 16 MB on a Ryzen 5800...).
I can't understand why, after 20 years or more, the L1 cache is still only 64 kB when it could be at least 16 x 16 = 256 kB. With 16 bits for the row address and 16 bits for the column address you only have to read 32 bits, and that is a single read for a 32-bit processor. I suspect it is because the core perhaps still works on 16 bits (8 for row + 8 for column address, 8 x 8 = 64 kB), or worse, on only 8 bits.
After testing please post your results.
Related
I have a number that I would like to distribute normally into 15 bins or cells, and I want the 15 numbers to be in sequence.
Example:
Number to be distributed - 340
Output: 6 9 12 16 20 24 27 30 32 32 32 30 27 24 20
...yes, my series is not perfectly distributed, but currently I'm doing this by:
first creating a linear series of numbers 1 2 3 4 ... 14 15;
then using Norm.Dist(x, mean, standard_dev) to generate a series of z-score values where x = 1, 2, 3 ... 14, 15;
then scaling those values using similar triangles, i.e. x1/y1 = x2/y2 where x1 = z-score; y1 = sum(z-scores); x2 = number I want; y2 = 340.
Is there a better way to do this? Because I have to generate multiple matrices like this, and something is not quite right...
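For reference, a minimal sketch of the approach described above, as I read it (the mean of 8 and standard deviation of 4 are assumed values):

Sub ScaleNormalShape()
    ' Build a normal-density shape over x = 1..15, then rescale it so the
    ' values sum to 340, mirroring the Norm.Dist-then-similar-triangles steps.
    Dim x As Long, total As Double, shape(1 To 15) As Double
    For x = 1 To 15
        shape(x) = Application.WorksheetFunction.Norm_Dist(x, 8, 4, False)
        total = total + shape(x)
    Next x
    For x = 1 To 15
        Debug.Print Round(340 * shape(x) / total, 0)
    Next x
End Sub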
Here is a hit-and-miss approach: it searches for a random vector of independent normal variables whose sum falls within a given tolerance of the target sum and, if one is found, rescales all of the numbers so that they equal the target sum exactly:
Function RandNorm(mu As Double, sigma As Double) As Double
    'assumes that Randomize has been called
    Dim r As Double
    r = Rnd()
    Do While r = 0
        r = Rnd()
    Loop
    RandNorm = Application.WorksheetFunction.Norm_Inv(r, mu, sigma)
End Function

Function RandSemiNormVect(target As Double, n As Long, mu As Double, sigma As Double, Optional tol As Double = 1) As Variant
    Dim sum As Double
    Dim rescale As Double
    Dim v As Variant
    Dim i As Long, j As Long

    Randomize
    ReDim v(1 To n)
    For j = 1 To 10000 'for safety -- can increase if wanted
        sum = 0
        For i = 1 To n
            v(i) = RandNorm(mu, sigma)
            sum = sum + v(i)
        Next i
        If Abs(sum - target) < tol Then
            rescale = target / sum
            For i = 1 To n
                v(i) = rescale * v(i)
            Next i
            RandSemiNormVect = v
            Exit Function
        End If
    Next j
    RandSemiNormVect = CVErr(xlErrValue)
End Function
Tested like this:
Sub test()
    On Error Resume Next
    Range("A1:A15").Value = Application.WorksheetFunction.Transpose(RandSemiNormVect(340, 15, 20, 3))
    If Err.Number > 0 Then MsgBox "No Solution Found"
End Sub
Typical output with those parameters:
On the other hand, if I change the standard deviation to 1, I just get the message that no solution is found because then the probability of getting a solution within the specified tolerance of the target sum is vanishingly small.
In response to this question I ran the following VBA experiment:
Sub Test()
    Dim i As Long, A As Variant
    Dim count1 As Long, count2 As Long

    ReDim A(1 To 10000)
    For i = 1 To 10000
        Randomize
        A(i) = IIf(Rnd() < 0.5, 0, 1)
    Next i
    'count how often A(i) = A(i+1)
    For i = 1 To 9999
        If A(i) = A(i + 1) Then count1 = count1 + 1
    Next i

    For i = 1 To 10000
        A(i) = IIf(Rnd() < 0.5, 0, 1)
    Next i
    'count how often A(i) = A(i+1)
    For i = 1 To 9999
        If A(i) = A(i + 1) Then count2 = count2 + 1
    Next i

    Debug.Print "First Loop: " & count1
    Debug.Print "Second Loop: " & count2 & vbCrLf
End Sub
When I saw output like this:
First Loop: 5550
Second Loop: 4976
I was pretty sure that I knew what was happening: VBA was converting the system clock into something of lower resolution (perhaps microseconds), which as a consequence would lead to Randomize sometimes producing identical seeds in two or more passes through the loop. In my original answer I even confidently asserted this. But then I ran the code some more and noticed that the output was sometimes like this:
First Loop: 4449
Second Loop: 5042
The overseeding is still causing a noticeable autocorrelation -- but in the opposite (and unexpected) direction. Successive passes through the loop with the same seed should produce identical outputs, hence we should see successive values agreeing more often than chance would predict, not disagreeing more often than chance would predict.
Curious now, I modified the code to:
Sub Test2()
    Dim i As Long, A As Variant
    Dim count1 As Long, count2 As Long

    ReDim A(1 To 10000)
    For i = 1 To 10000
        Randomize
        A(i) = Rnd()
    Next i
    'count how often A(i) = A(i+1)
    For i = 1 To 9999
        If A(i) = A(i + 1) Then count1 = count1 + 1
    Next i

    For i = 1 To 10000
        A(i) = Rnd()
    Next i
    'count how often A(i) = A(i+1)
    For i = 1 To 9999
        If A(i) = A(i + 1) Then count2 = count2 + 1
    Next i

    Debug.Print "First Loop: " & count1
    Debug.Print "Second Loop: " & count2 & vbCrLf
End Sub
Which always gives the following output:
First Loop: 0
Second Loop: 0
It seems that it isn't the case that successive calls to Randomize sometimes return the same seed (at least, not often enough to make a difference).
But if that isn't the source of the autocorrelation -- what is? And -- why does it sometimes manifest itself as a negative rather than a positive autocorrelation?
Partial answer only; feel free to edit and complete.
Well, there is clearly a correlation when you overuse the Randomize function.
I tried the following code, with conditional formatting (black fill for values > 0.5), and there are clearly patterns appearing (try commenting out the Randomize to see a more "random" pattern; best seen with 20 pt columns at 10% zoom).
Function Rndmap()
    Dim i As Long, j As Long
    Dim bmp(1 To 512, 1 To 512) As Long
    For i = 1 To 512
        For j = 1 To 512
            ' Rnd -1 ' uncomment this line to get a big white and black lines pattern.
            Randomize 'comment this line to have a random pattern
            bmp(i, j) = IIf(Rnd() < 0.5, 0, 1)
        Next j
    Next i
    Range(Cells(1, 1), Cells(512, 512)) = bmp
End Function
So while the MSDN states that "Using Randomize with the same value for number does not repeat the previous sequence.", implying that if the Timer returns the same value twice, Rnd should continue the same random sequence without resetting, there is still some behind-the-scenes link...
Some screenshots:
Rnd() only:
Using Randomize:
Using Rnd -1 and Randomize:
The Randomize statement initialises the Rnd function with the current system time as its seed; you can also give Randomize a number to be used as the seed.
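Note that, per the documentation, repeating a sequence requires first rewinding the generator by calling Rnd with a negative argument immediately before Randomize; a small sketch:

Sub RepeatableSequence()
    Rnd -1            ' rewind the generator
    Randomize 42      ' reseed (42 is an arbitrary choice)
    Debug.Print Rnd(), Rnd(), Rnd()

    Rnd -1
    Randomize 42
    Debug.Print Rnd(), Rnd(), Rnd() ' prints the same three numbers again
End Sub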
I decided to test how long a sequence continues before repeating itself:
Sub randomRepeatTest()
    Dim i As Long
    Dim randomThread As String
    For i = 1 To 100000
        Randomize
        randomThread = randomThread & Int(9 * Rnd + 1)
        If i Mod 2 = 0 Then
            If Left(randomThread, i / 2) = Right(randomThread, i / 2) Then
                Debug.Print i / 2
                Exit Sub
            End If
        End If
    Next i
End Sub
This sub generates a random sequence of the digits 1 - 9, and each time the sequence reaches an even length it is tested to see whether the first half of the sequence matches the second half; if so, it outputs the length the sequence reached before repeating. After running it a number of times, and discounting runs where a digit is repeated twice at the beginning, the result comes out at 256 (nice).
Providing any value to Randomize will still return a result of 256.
We're randomizing Rnd every loop, so what's going on here?
Well, as I said at the beginning, if no value is given to Randomize, it will use the system time as the seed. The resolution of this time is something I can't seem to find sourced; however, I believe it to be low.
I have tested using the value of timer which returns the time of day in seconds to 2 decimal places (e.g. 60287.81). I have also tried GetTickCount which returns the system active time (starts counting at boot) in milliseconds. Both of these still result in the 256 sequence limit.
So, why does the sequence repeat when we're randomizing every loop? Well, the reality is that the code executes within a millisecond. Essentially, we're providing the same number to Randomize on every loop, and so we're not actually shuffling the seed.
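A quick check (my sketch) shows how coarse the clock is: Timer returns the identical value for a great many consecutive reads within a single tick:

Sub TimerResolution()
    Dim t0 As Single, n As Long
    t0 = Timer
    Do While Timer = t0
        n = n + 1 ' count how many reads return the identical clock value
    Loop
    Debug.Print "Timer was unchanged for"; n; "consecutive reads"
End Sub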
So, is Rnd more random without Randomize?
I ran the above sub again without Randomize; nothing returned. I upped the loop count to 2,000,000; still nothing.
I've managed to source the algorithm used by the workbook Rand formula, which I believe is the same as Rnd with no initialised seed:
C IX, IY, IZ SHOULD BE SET TO INTEGER VALUES BETWEEN 1 AND 30000 BEFORE FIRST ENTRY
IX = MOD(171 * IX, 30269)
IY = MOD(172 * IY, 30307)
IZ = MOD(170 * IZ, 30323)
RANDOM = AMOD(FLOAT(IX) / 30269.0 + FLOAT(IY) / 30307.0 + FLOAT(IZ) / 30323.0, 1.0)
It is an iterative function which uses the result of the previous call to generate a new number. Known as the Wichmann-Hill procedure, it guarantees that more than 10^13 numbers will be generated before the sequence repeats itself.
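A direct VBA transcription of that procedure (my sketch; the seeds must be set to values between 1 and 30,000 before first use):

' module-level state for the three seed values
Private ix As Long, iy As Long, iz As Long

Sub WHSeed(s1 As Long, s2 As Long, s3 As Long)
    ix = s1: iy = s2: iz = s3
End Sub

Function WHRandom() As Double
    Dim r As Double
    ix = (171 * ix) Mod 30269
    iy = (172 * iy) Mod 30307
    iz = (170 * iz) Mod 30323
    r = ix / 30269# + iy / 30307# + iz / 30323#
    WHRandom = r - Int(r) ' fractional part, i.e. the AMOD(..., 1.0) step
End Function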
The problem with Rnd
For the algorithm to work, it first needs to be initialised with values for IX, IY & IZ. The problem we have here is that we can't initialise the algorithm with random variables, as it is this algorithm we need in order to get random values, so the only option is to provide some static values to get it going.
I have tested this and it seems to be the case. Opening a fresh instance of Excel, ? Rnd() returns 0.70554. Doing the same again returns the exact same number.
So the problem we have is that Rnd without Randomize gives us a much longer sequence of random numbers; however, that sequence will start at the same place each time we open Excel. Where functions depend on random generation, such as password generation, this doesn't suffice, as we will get the same repeated results each time we open Excel.
The solution
Here's a function I have come up with and it seems to work well:
Public Declare PtrSafe Sub Sleep Lib "kernel32" (ByVal Milliseconds As LongPtr)
Public Declare PtrSafe Function GetTickCount Lib "kernel32" () As Long

Public randomCount As Long

Function getRandom() As Double
    If randomCount Mod 255 = 0 Then
        Sleep 1
    End If
    Randomize GetTickCount
    getRandom = Rnd()
    randomCount = randomCount + 1
End Function
It makes use of the GetTickCount function as the Randomize seed. Each call adds 1 to a randomCount variable, and after every 255 runs the macro is forced to sleep for 1 millisecond (although this actually works out at around 15 ms on my system) so that the seed from GetTickCount will have changed, and so a new sequence of numbers will be returned by Rnd.
This will of course return the same sequence if by chance it is used at the same system time; however, for most cases it will be a sufficient method for generating more random numbers. If not, it would need some fancy work using something like the Random.org API.
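A quick usage sketch:

Sub FillRandoms()
    Dim i As Long
    For i = 1 To 500
        ' each call reseeds from GetTickCount (sleeping on every 255th call)
        Cells(i, 1).Value = getRandom()
    Next i
End Sub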
I have a VBA Macro that needs to run through a very large Excel file (about 550,000 rows).
I have found that the Macro can only run through about 20,000 lines at a time - any bigger than that and it gives an 'Overflow' error.
My only solution right now is to set j (row number) = 1 to 20,000, run it, then set j from 20,001 to 40,000, run it, then set j from 40,001 to 60,000 etc....all the way to 550,000.
This is a tedious process and I'm wondering is there any way to automate it so that I don't have to keep going back in to change the numbers?
Code looks something like:
Sub ExposedDays()
    Dim j As Integer
    Dim k As Integer
    Dim l As Integer
    Dim result As String
    For j = 2 To 10000
        k = 1
        Do While Len(Worksheets("my8").Cells(j, k)) > 0
            k = k + 1
        Loop
        Worksheets("my8").Cells(j, 131) = Worksheets("my8").Cells(j, k - 6)
        Worksheets("my8").Cells(j, 132) = Worksheets("my8").Cells(j, k - 5)
        .......
The Integer data type allocates only enough memory to hold integers from -32,768 to 32,767. When you:

Dim j As Integer

an 'Overflow' error will result as soon as j exceeds 32,767. To solve this problem you need to use a larger data type.

The Long data type is large enough to hold integers from -2,147,483,648 to 2,147,483,647. If you:

Dim j As Long
Dim k As Long

you should be able to avoid this error.
Hope this helps.
I'm writing a macro that reads price data from an Excel file, writes it to another Excel file and does basic calculations (e.g. returns). The data is made of 2,600 x 100 data points.
When I run it I get an overflow error! It looks like it's handling too much data. Is it an issue with the computer's memory? (I have a modern laptop with 4 GB of RAM.) Or does it have to do with the way the data is stored during the calculations? What I don't get is that if I do that kind of calculation in the spreadsheet directly, I don't get an overflow message.
Btw, I'm trying to avoid using Access.
Thanks for your help.
Sam
Dim i As Long
Dim j As Long
Dim k As Long
Dim m As Long
Dim n As Long
Dim o As Long

' FYI YearsConsidered = 10
For j = 1 To Group(k).TickerCount
    For m = 1 To Quotes(j, k).ContinuationCount
        For i = 1 To YearsConsidered * 261
            If Sheets(j).Cells(i - GivenPeriod, 4 * m - 3) = 0 Then
                Sheets(j).Cells(i, 4 * m - 2) = 0
            Else
                Sheets(j).Cells(i, 4 * m - 2) = (Sheets(j).Cells(i, 4 * m - 3) - Sheets(j).Cells(i - GivenPeriod, 4 * m - 3)) / Sheets(j).Cells(i - GivenPeriod, 4 * m - 3)
            End If
        Next i
    Next m
Next j
I have experienced a similar situation when looping through a large number of data points. For some reason, this does not happen if I enter the formula into the whole range of cells at once; like when you manually select a range in the spreadsheet, type the formula once, then hit Ctrl+Enter. This requires the proper use of the $ signs (absolute vs. relative references).
It's hard to tell exactly what the issue is without seeing the code, so I am not sure if this is your problem...
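A minimal sketch of that enter-once technique (hypothetical layout: prices in column A, returns written to column B), the code equivalent of selecting the range and pressing Ctrl+Enter:

Sub EnterFormulaAtOnce()
    With Worksheets(1).Range("B3:B2600")
        ' RC[-1] is this row's price, R[-1]C[-1] the previous row's price
        .FormulaR1C1 = "=IF(R[-1]C[-1]=0,0,(RC[-1]-R[-1]C[-1])/R[-1]C[-1])"
    End With
End Sub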
The following evaluates to 32759, an Integer:
Debug.Print (20 * 1638 - 1)
But this raises an overflow error:
Dim t As Integer
t = 1639
Debug.Print (20 * t - 1)
It seems like the expression implicitly expects an Integer return value, because if I do any of the following, the error is avoided:

Dim t As Long
' or
Debug.Print (20 * CLng(t) - 1)
' or
Debug.Print (20# * t - 1)
Is this behavior documented?
Is my assumption accurate? Namely, that arithmetic on Integers implies an Integer return value, and that simply introducing one Long or Double value into the equation will avoid the error?
If my logic is correct, Dim (or 'Dimension') is a way of telling the application that you expect to use a variable of a certain type, and that type pertains to a certain number of bits (of memory).
This reserves a section of the system's memory, which is allocated a certain number of bits dependent on the variable type you declared in your code. Those bits in turn define the range of values the variable can hold. (If you're familiar with C++ or similar then you will probably already know all this...)
An Integer is 16 bits in VBA and is a signed integer, which means we can store negative values too, so the limit is 32,767, because this is the biggest number we can achieve with 16 bits:
(generally a variable can hold 2^n values, where n = number of bits)
unsigned 16 bits = 0 to 65,535 (2^16 values)
signed 16 bits = -32,768 to 32,767
32,767 = 111111111111111 (binary)
32,768 = 1000000000000000 <--- note the extra bit
This extra bit is what causes the "overflow" error - the number of bits required to represent the result overflows the number of bits the memory has reserved to store the number safely.
I don't think the method of calculation is documented to this extent; however, your code snippet:
Dim t As Integer
t = 1639
Debug.Print (20 * t - 1)
would require t to first be multiplied by 20, resulting in a figure of 32,780:
20 * t = 20 * 1639 = 32,780
32,780 = 1000000000001100 (Binary)
which overflows the bit limit for the Integer data type. At this point the system throws an error before it has the chance to proceed with the rest of the calculation, because it tries to compute with t while it is still in its allocated memory address, for which only 16 bits of memory have been reserved.
Also, not declaring t with a type forces VBA to default to type Variant, which will determine that t needs more memory allocated when the calculation runs and push it over into the Long boundary automatically.
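A small sketch of that Variant behaviour (my illustration, not the questioner's code):

Sub VariantPromotion()
    Dim t As Variant
    t = 1639                   ' held internally with an Integer subtype
    Debug.Print (20 * t - 1)   ' prints 32779: Variant math promotes to Long rather than overflowing
End Sub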
Update: It would appear that VBA bases the type in which an expression is evaluated on the widest type held by a variable within the expression, as can be seen in this example:
Sub SO()
    Dim t As Integer, c As Long
    t = 1639
    c = 20
    Debug.Print (20 * (t - 1)) '// No Error
    Debug.Print (c * (t - 1))  '// No Error
    Debug.Print ((c * t) - 1)  '// No Error
    c = (20 * t - 1)           '// Error
    Debug.Print (20 * t - 1)   '// Error
End Sub
Although I don't believe this is documented anywhere, it would lead one to believe that VBA computes each expression using the widest data type held by any variable within it at the time.