I am looking for a seeded random number generator that creates a pool of numbers as a context. It doesn't have to be too good. It is used for a game, but it is important, that each instance of the Game Engine has its own pool of numbers, so that different game instances or even other parts of the game that use random numbers don't break the deterministic character of the generated numbers.
Currently I am using rand() which obviously doesn't have this feature.
Are there any c or objective-c generators that are capable of doing what I want?
Use srand to set the seed, and then use rand():
unsigned int seed = 10; /* Choose an arbitrary value for the seed */
int r;
srand(seed); /* Set the seed */
r = rand(); /* Generate a random number */
The man page explicitly states that the sequence of pseudo-random numbers can be repeatable (and hence it is deterministic):
Thesrand() function sets its argument as the seed for a new sequence of pseudo-random integers to be returned by rand(). These sequences are repeatable by calling srand() with the same seed value.
Edit (clarification):
Note that the man page states that srand() is niether reentrant nor thread-safe.
I assumed that by "different game instances" you meant different processes, in which case it is okay to use it.
However, if you plan on changing seeds within the same process, you won't get the functionality that you want. In this case I recommend using rand_r() instead. Take a look at this question for reference.
It seems you don't need a "context" (whatever that may mean); instead you are looking for a PRNG implementation where you can save and restore the current state. This is actually possible with any PRNG implementation that you implement yourself (since you can always save the state), whereas library functions may or may not give you access to the state.
For Linux and MacOS, they actually added a rand_r in addition to rand -- this is documented as a thread safe, reentrant version of rand, but the "magic" behind that is simply that it takes a pointer to the current state instead of keeping it in a static variable. Other random number functions like the drand48 family seem to have versions with additional parameters as well, although I would have to do more reading to find out whether it can be used to store the state.
Either way, if you 'google' or 'wikipedia' for a random number generator to implement yourself, making the 'current state' an explicit parameter will be trivial.
You might be able to use random() and setstate(). I haven't used setstate() myself, but the manpage seems to indicate it might do what you want…
The 'obvious' PRNG to use is the drand48() family of functions. These allow you to provide 48 bits of state, and even allow you to set the multiplier and constant used in the calculations.
Any good PRNG library should be able to do this. GNU Scientific Library provides support for generating random numbers via many diferent algorithms, and from many probability distributions. Each call to gsl_rng_alloc sets up an independent random number generator with its own state, which you can seed using gsl_rng_set. You probably want to use different seeds for different parts of the program, and depending on which PRNG algorithm you use, some particular seeds might not work very well. Copying and pasting a few numbers from random.org is probably a good way of getting seeds.
Related
What is the difference between np.random.seed(0), np.random.seed(42), and np.random.seed(..any number). what is the function of the number in parentheses?
python uses the iterative Mersenne Twister algorithm to generate pseudo-random numbers [1]. The seed is simply where we start iterating.
To be clear, most computers do not have a "true" source of randomness. It is kind of an interesting thing that "randomness" is so valuable to so many applications, and is quite hard to come by (you can buy a specialized device devoted to this purpose). Since it is difficult to make random numbers, but they are nevertheless necessary, many, many, many, many algorithms have been developed to generate numbers that are not random, but nevertheless look as though they are. Algorithms that generate numbers that "look randomish" are called pseudo-random number generators (PRNGs). Since PRNGs are actually deterministic, they can't simply create a number from the aether and have it look randomish. They need an input. It turns out that using some complex operations and modular arithmetic, we can take in an input, and get another number that seems to have little or no relation to the input. Using this intuition, we can simply use the previous output of the PRNG as the next input. We then get a sequence of numbers which, if our PRNG is good, will seem to have no relation to each other.
In order to get our iterative PRNG started, we need an initial input. This initial input is called a "seed". Since the PRNG is deterministic, for a given seed, it will generate an identical sequence of numbers. Usually, there is a default seed that is, itself, sort of randomish. The most common one is the current time. However, the current time isn't a very good random number, so this behavior is known to cause problems sometimes. If you want your program to run in an identical manner each time you run it, you can provide a seed (0 is a popular option, but is entirely arbitrary). Then, you get a sequence of randomish numbers, but if you give your code to someone they can actually entirely recreate the runtime of the program as you witnessed it when you ran it.
That would be the starting key of the generator. Typically if you want to get reproducible results you'll use the same seed over and over again throughout your simulations.
You are setting the seed of the random number generator so you can get reproducible results. Example.
np.random.seed(0)
np.random.randint(0,100,10)
Output:
array([44, 47, 64, 67, 67, 9, 83, 21, 36, 87])
Now, if you ran the same code your computer, you should get the same 10 number output from the random integers from 0 to 100.
Looking for the proper data type (such as IndexedSeq[Double]) to use when designing a domain-specific numerical computing library. For this question, I'm limiting scope to working with 1-Dimensional arrays of Double. The library will define a number functions that are typically applied for each element in the 1D array.
Considerations:
Prefer immutable data types, such as Vector or IndexedSeq
Want to minimize data conversions
Reasonably efficient in space and time
Friendly for other people using the library
Elegant and clean API
Should I use something higher up the collections hierarchy, such as Seq?
Or is it better to just define the single-element functions and leave the mapping/iterating to the end user?
This seems less efficient (since some computations could be done once per set of calls), but at at the same time a more flexible API, since it would work with any type of collection.
Any recommendations?
If your computations are to do anything remotely computationally intensive, use Array, either raw or wrapped in your own classes. You can provide a collection-compatible wrapper, but make that an explicit wrapper for interoperability only. Everything other than Array is generic and thus boxed and thus comparatively slow and bulky.
If you do not use Array, people will be forced to abandon whatever things you have and just use Array instead when performance matters. Maybe that's okay; maybe you want the computations to be there for convenience not efficiency. In that case, I suggest using IndexedSeq for the interface, assuming that you want to let people know that indexing is not outrageously slow (e.g. is not List), and use Vector under the hood. You will use about 4x more memory than Array[Double], and be 3-10x slower for most low-effort operations (e.g. multiplication).
For example, this:
val u = v.map(1.0 / _) // v is Vector[Double]
is about three times slower than this:
val u = new Array[Double](v.length)
var j = 0
while (j<u.length) {
u(j) = 1.0/v(j) // v is Array[Double]
j += 1
}
If you use the map method on Array, it's just as slow as the Vector[Double] way; operations on Array are generic and hence boxed. (And that's where the majority of the penalty comes from.)
I am using Vectors all the time when I deal with numerical values, since it provides very efficient random access as well as append/prepend.
Also notice that, the current default collection for immutable indexed sequences is Vector, so that if you write some code like for (i <- 0 until n) yield {...}, it returns IndexedSeq[...] but the runtime type is Vector. So, it may be a good idea to always use Vectors, since some binary operators that take two sequences as input may benefit from the fact that the two arguments are of the same implementation type. (Not really the case now, but some one has pointed out that vector concatenation could be in log(N) time, as opposed to the current linear time due to the fact that the second parameter is simply treated as a general sequence.)
Nevertheless, I believe that Seq[Double] should already provide most of the function interfaces you need. And since mapping results from Range does not yield Vector directly, I usually put Seq[Double] as the argument type as my input, so that it has some generality. I would expect that efficiency is optimized in the underlying implementation.
Hope that helps.
I am using rand() function, but it always uses the same random sequence. Is there a random function that seeds with the clock value? And how would I do this?
rand() requires you to specify the seed. The best way to specify a seed is to use the current time.
// specify the seed
srand(time(NULL));
Or you can use arc4random.
You are meant to seed rand() and random() (slightly bigger space) yourself, with their respective seed functions, before using them. You can use the time, or whatever other value you desire:
srand(time(0));
srandom(time(0));
Here we get the system time; obviously passing a constant will produce the same sequence every run.
You can also use arc4random() which generates very high-quality random bits and seeds itself using /dev/random.
I have read that it has something to do with time, also you get from including time.h, so I assumed that much, but how does it work exactly? Also, does it have any tendencies towards odd or even numbers or something like that? And finally is there something with better distribution in the C standard library or the Foundation framework?
Briefly:
You use time.h to get a seed, which is an initial random number. C then does a bunch of operations on this number to get the next random number, then operations on that one to get the next, then... you get the picture.
rand() is able to touch on every possible integer. It will not prefer even or odd numbers regardless of the input seed, happily. Still, it has limits - it repeats itself relatively quickly, and in almost every implementation only gives numbers up to 32767.
C does not have another built-in random number generator. If you need a real tough one, there are many packages available online, but the Mersenne Twister algorithm is probably the most popular pick.
Now, if you are interested on the reasons why the above is true, here are the gory details on how rand() works:
rand() is what's called a "linear congruential generator." This means that it employs an equation of the form:
xn+1 = (*a****xn + ***b*) mod m
where xn is the nth random number, and a and b are some predetermined integers. The arithmetic is performed modulo m, with m usually 232 depending on the machine, so that only the lowest 32 bits are kept in the calculation of xn+1.
In English, then, the idea is this: To get the next random number, multiply the last random number by something, add a number to it, and then take the last few digits.
A few limitations are quickly apparent:
First, you need a starting random number. This is the "seed" of your random number generator, and this is where you've heard of time.h being used. Since we want a really random number, it is common practice to ask the system what time it is (in integer form) and use this as the first "random number." Also, this explains why using the same seed twice will always give exactly the same sequence of random numbers. This sounds bad, but is actually useful, since debugging is a lot easier when you control the inputs to your program
Second, a and b have to be chosen very, very carefully or you'll get some disastrous results. Fortunately, the equation for a linear congruential generator is simple enough that the math has been worked out in some detail. It turns out that choosing an a which satisfies *a***mod8 = 5 together with ***b* = 1 will insure that all m integers are equally likely, independent of choice of seed. You also want a value of a that is really big, so that every time you multiply it by xn you trigger a the modulo and chop off a lot of digits, or else many numbers in a row will just be multiples of each other. As a result, two common values of a (for example) are 1566083941 and 1812433253 according to Knuth. The GNU C library happens to use a=1103515245 and b=12345. A list of values for lots of implementations is available at the wikipedia page for LCGs.
Third, the linear congruential generator will actually repeat itself because of that modulo. This gets to be some pretty heady math, but the result of it all is happily very simple: The sequence will repeat itself after m numbers of have been generated. In most cases, this means that your random number generator will repeat every 232 cycles. That sounds like a lot, but it really isn't for many applications. If you are doing serious numerical work with Monte Carlo simulations, this number is hopelessly inadequate.
A fourth much less obvious problem is that the numbers are actually not really random. They have a funny sort of correlation. If you take three consecutive integers, (x, y, z), from an LCG with some value of a and m, those three points will always fall on the lattice of points generated by all linear combinations of the three points (1, a, a2), (0, m, 0), (0, 0, m). This is known as Marsaglia's Theorem, and if you don't understand it, that's okay. All it means is this: Triplets of random numbers from an LCG will show correlations at some deep, deep level. Usually it's too deep for you or I to notice, but its there. It's possible to even reconstruct the first number in a "random" sequence of three numbers if you are given the second and third! This is not good for cryptography at all.
The good part is that LCGs like rand() are very, very low footprint. It typically requires only 32 bits to retain state, which is really nice. It's also very fast, requiring very few operations. These make it good for noncritical embedded systems, video games, casual applications, stuff like that.
PRNGs are a fascinating topic. Wikipedia is always a good place to go if you are hungry to learn more on the history or the various implementations that are around today.
rand returns numbers generated by a pseudo-random number generator (PRNG). The sequence of numbers it returns is deterministic, based on the value with which the PRNG was initialized (by calling srand).
The numbers should be distributed such that they appear somewhat random, so, for example, odd and even numbers should be returned at roughly the same frequency. The actual implementation of the random number generator is left unspecified, so the actual behavior is specific to the implementation.
The important thing to remember is that rand does not return random numbers; it returns pseudo-random numbers, and the values it returns are determined by the seed value and the number of times rand has been called. This behavior is fine for many use cases, but is not appropriate for others (for example, rand would not be appropriate for use in many cryptographic applications).
How does rand() work?
http://en.wikipedia.org/wiki/Pseudorandom_number_generator
I have read that it has something to
do with time, also you get from
including time.h
rand() has nothing at all to do with the time. However, it's very common to use time() to obtain the "seed" for the PRNG so that you get different "random" numbers each time your program is run.
Also, does it have any tendencies
towards odd or even numbers or
something like that?
Depends on the exact method used. There's one popular implementation of rand() that alternates between odd and even numbers. So avoid writing code like rand() % 2 that depends on the lowest bit being random.
I was using the arc4random() function in order to generate a random group and sequence of numbers, but I was told that this was overkill and that I should use the random() function instead. However, the random() function gives me the same group and sequence of numbers every time.
I call srand(time(0)) once when my app first starts in order to seed the random() function. Do you ever need to reseed the random() function?
Am I missing something?
Thanks.
First off, who told you arc4random was overkill? I use it in my projects, and it (a) satisfies my requirements, (b) doesn't suck down resources (at least any visible to the user or obvious to me), and (c) was trivial to implement, so I don't really see how a similar use in your own code could be called "overkill."
Second, srand() seeds the rand() function, not random(), so that may be your issue. And no, you shouldn't have to reseed the generator at any time during your program's execution - once during startup is enough.
No, you do not need to reseed the random number generator. There is some additional uniformity gained by generating some amount of numbers and throwing them away, but unless you are looking for security level random number generation there is no need. For most purposes a properly seeded random number generator is uniform enough.