SQL Server NEWID - how is it created?

I would like to use newId to generate random numbers. Usually you would use it just once, but I might be generating up to 10 random numbers per newId.
Is it random enough?

Usually you would use it just once, but I might be generating up to 10 random numbers per newId. Is it random enough?
It depends on how you extract the numbers from the NEWID value. You cannot treat it as 128 independent random bits.
For example, if you use the first 8 bits to generate one random number between 0 and 255, the next 8 bits to generate another, and so on, you will find that some of your numbers are not uniformly random, because certain bits of the GUID are fixed.
              v
E058D654-35A8-47F2-AE40-1C4EEBBDC549
01461481-ED8D-4B85-90FA-C08621D98DAE
AE861E4E-3469-4BDB-A38B-0031DACC8DAE
AF8905D0-E41B-4300-94F2-33BB45698CD1
003308A6-AE0A-4E20-9F24-047A6955E748
76F9B7ED-79AB-4EB1-B361-8C0AF5177CE3
B8F1CAC0-591D-436B-BB21-FAAD9EECA983
7FBEAEFD-2163-4315-A783-8106909E47D8
85E2FC60-E7B3-400F-B20A-CEFBECAEE4F9
17ED0A03-ADAD-4521-97EE-04815A867B32
              ^
              |
          always 4
You should also try to avoid reusing the same bits to generate different random numbers as your numbers will become related. If in doubt, don't reuse the same number.
Note that there is also a RAND function which you can call. This returns numbers from a uniform distribution.
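The fixed digit in the listing above can be checked outside SQL Server as well. The following is a minimal Python sketch, assuming that NEWID() produces version-4 GUIDs like Python's uuid.uuid4() (the sample values above are consistent with that, but it is an assumption here). The helper name fixed_digit_report is made up for illustration.

```python
import uuid

def fixed_digit_report(samples=1000):
    """Collect the hex digits seen at the two positions a version-4 GUID fixes."""
    version_digits = set()
    variant_digits = set()
    for _ in range(samples):
        text = str(uuid.uuid4()).upper()
        version_digits.add(text[14])  # first digit of the third group
        variant_digits.add(text[19])  # first digit of the fourth group
    return version_digits, variant_digits

version_digits, variant_digits = fixed_digit_report()
print("third-group digit:", sorted(version_digits))   # only '4' appears
print("fourth-group digit:", sorted(variant_digits))  # only 8, 9, A or B appear
```

So a number built from the byte that contains either of these digits cannot be uniform over 0 to 255.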

Yes, it's statistically random. It's simply a GUID.
How do you plan on generating 10 numbers from one seed, though? CHECKSUM(NEWID()) is normally how you'd do it for one value, perhaps combined with ABS and a modulo.

NewID generates a GUID. It is random enough.

Random enough for what? When you say you will use it to generate numbers, are you just going to use it to seed a PRNG? I'm not sure it's any better than a timestamp for that. Or are you going to extract bits from the GUID? That's a bad idea.
http://www.random.org/randomness/
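To make the "seed a PRNG" option concrete, here is a small Python sketch rather than T-SQL. It uses the whole GUID as the seed of a separate generator and draws ten numbers from that generator, instead of slicing bits out of the GUID. The function name numbers_from_guid and the range 0 to 255 are illustrative assumptions, not anything SQL Server provides.

```python
import random
import uuid

def numbers_from_guid(guid, count=10, upper=256):
    """Seed a PRNG with the full 128-bit GUID value and draw `count` numbers below `upper`."""
    rng = random.Random(guid.int)
    return [rng.randrange(upper) for _ in range(count)]

guid = uuid.uuid4()
print(guid, numbers_from_guid(guid))
```

The same GUID always yields the same ten numbers, so the quality of the result is that of the PRNG, with the GUID only providing the starting point.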

Related

Generating distinct random numbers efficiently

My main purpose is to spread a buffer over pixels of an image randomly and efficiently, but I'm stuck at generating distinct random numbers. What I simply want is to generate numbers between 0 and N, but I also want these numbers to be distinct. Also note that N usually will be quite large such as 20 million and the algorithm doesn't have to be cryptographically secure.
I can't use the random shuffle method since N is quite large. I did some searching and found the linear congruential generator, but its parameter m is required to be prime, and my N sometimes is not.
Lastly, I tried the following approach, but it is neither efficient nor reliable, since it can throw a "maximum call stack size exceeded" error.
next(max: number)
{
    let num = LCG.next()
    if (num <= max) return num
    return next(max)
}
If numbers are distinct, then they are not random. Random numbers can repeat; distinct numbers are selected from an ever decreasing set. It is the difference between selecting numbers with replacement and without replacement.
You want numbers from 0 to 20 million. As you have found, that is too large for a shuffle. A better option is encryption. Because encryption is one-to-one, as long as you have distinct inputs you will get distinct outputs. Just encrypt 0, 1, 2, 3, ... and you will get distinct outputs.
You talk about using a linear congruential PRNG, so I assume that security is not of great importance. 20 million lies between 2^24 and 2^25, so you can write a simple four-round Feistel cipher sized to 26 bits (an even width, so both halves are equal) to do the work, re-encrypting any output that falls outside your range. Alternatively, use a standard library cipher with one of the format-preserving methods to keep the output within the bounds you want.
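A rough Python sketch of that idea follows. It is not a secure cipher: the round function (a truncated SHA-256 of key, round number and half-block) and the names feistel_permute and distinct_random are assumptions chosen for illustration. Outputs that land outside 0..n-1 are fed back through the cipher until they fall inside, which keeps the mapping one-to-one on the range.

```python
import hashlib

def _round_value(half, key, round_no, half_bits):
    """Illustrative round function: hash the inputs and keep half_bits bits."""
    digest = hashlib.sha256(f"{key}:{round_no}:{half}".encode()).digest()
    return int.from_bytes(digest[:4], "big") & ((1 << half_bits) - 1)

def feistel_permute(x, key, half_bits, rounds=4):
    """Balanced Feistel network: a bijection on integers of 2 * half_bits bits."""
    mask = (1 << half_bits) - 1
    left, right = x >> half_bits, x & mask
    for r in range(rounds):
        left, right = right, left ^ _round_value(right, key, r, half_bits)
    return (left << half_bits) | right

def distinct_random(index, n, key):
    """Map index in [0, n) to a distinct value in [0, n)."""
    half_bits = max(1, ((n - 1).bit_length() + 1) // 2)
    y = feistel_permute(index, key, half_bits)
    while y >= n:  # out of range: encrypt again until the value is inside
        y = feistel_permute(y, key, half_bits)
    return y

print([distinct_random(i, 20_000_000, "demo-key") for i in range(5)])
```

No table of size N is needed; each output is computed from its index alone, and changing the key gives a different ordering.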

Solitaire: storing guaranteed wins cheaply

Given a list of deals of Klondike Solitaire that are known to win, is there a way to store a reasonable amount of deals (say 10,000+) in a reasonable amount of space (say 5MB) to retrieve on command? (These numbers are arbitrary)
I thought of using a pseudo random generator where a given seed would generate a decimal string of numbers, where each two digits represents a card, and the index represents the location of the deal. In this case, you would only have to store the seed and the PRG code.
The only cons I can think of would be that A) the number of possible deals is 52!, and so the number of possible seeds would be at least 52!, and would be monstrous to store in the higher number range, and B) the generated number can't repeat a two digit number (though they can be ignored in the deck construction)
Given no prior information, the theoretical limit on how compactly you can represent an ordered deck of cards is 226 bits (log2 of 52!, rounded up). Even the simple naive 6-bits-per-card encoding is only 312 bits, so you probably won't gain much by being clever.
If you're willing to sacrifice a large part of the state-space, you could use a 32- or 64-bit PRNG to generate the decks, and then you could reproduce them from the 32- or 64-bit initial PRNG state. But that limits you to 2^64 different decks out of the possible 2^225+.
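As a sketch of the seed-based approach in Python: only the seed is stored, and the deal is rebuilt on demand. The name deal_from_seed is hypothetical. Note the assumption that the same shuffle implementation is used when storing and when replaying; Python does not promise that shuffle results stay identical across versions, so long-term storage would call for a generator and shuffle you control yourself.

```python
import random

def deal_from_seed(seed):
    """Rebuild a 52-card deal (cards numbered 0..51) from a stored integer seed."""
    deck = list(range(52))
    random.Random(seed).shuffle(deck)
    return deck

# Storing a known winning deal then means storing just its seed.
print(deal_from_seed(12345)[:7])
```

At 8 bytes per 64-bit seed, 10,000 deals would take 80,000 bytes, at the cost of only being able to represent deals that some seed happens to produce.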
If you are asking hypothetically, I would say that you would need about 390 KB to store 10,000 possible deals with a simple encoding. You need 6 bits to represent each card (assuming you number them 1-52) and then you would need to store them in order, so 6 * 52 = 312 bits per deal. Multiply that by the number of deals, 312 * 10,000, and you get 3,120,000 bits, which is 3.12 megabits or 390,000 bytes (about 0.39 MB).
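The figures in these answers can be reproduced with a few lines of Python. The sketch below also shows one way to reach the 226-bit limit, by numbering each deck as a permutation; the function names deck_to_int and int_to_deck are made up for this example.

```python
import math

# Smallest number of bits that can index every ordering of 52 cards.
MIN_BITS = (math.factorial(52) - 1).bit_length()  # 226
NAIVE_BITS = 6 * 52                               # 312

def deck_to_int(deck):
    """Encode a permutation of 0..51 as an integer below 52!."""
    remaining = list(range(52))
    n = 0
    for card in deck:
        i = remaining.index(card)
        n = n * len(remaining) + i
        remaining.pop(i)
    return n

def int_to_deck(n):
    """Inverse of deck_to_int."""
    digits = []
    for base in range(1, 53):
        n, d = divmod(n, base)
        digits.append(d)
    remaining = list(range(52))
    return [remaining.pop(d) for d in reversed(digits)]

deals = 10_000
print(MIN_BITS, NAIVE_BITS)
print("naive bytes:", NAIVE_BITS * deals // 8)            # 390000
print("packed bytes:", math.ceil(MIN_BITS / 8) * deals)   # 290000
```

Either way, 10,000 deals fit comfortably within the 5 MB mentioned in the question.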

Runtime Random Generators, not Compile Time [duplicate]

Possible Duplicate:
iPhone: random() function gives me the same random number everytime
I am writing a test iPhone app for a larger project that I am working on that will involve randomized strings, numbers, etc. When I use the rand() or random() functions, I do get randomized numbers and strings every single time, but in the same order! I know that the compiler determines the order at compile time, but I do not want that. I want it to be completely random, so something different every time, not just a predetermined list. What I have tried is a loop that counts up and down to try to take a value from that, but it didn't work.
The compiler doesn't generate the random strings; they're generated at runtime. But they are generated based on an initial seed; for a given value of the seed, you'll get the same sequence of numbers. You need to choose a seed at runtime based on something like the system time, uptime, count of user clicks, etc.
You need to seed rand(). That way it produces a different sequence on every run!
Here is an example of what I mean:
// Seed rand() once at startup; needs <stdlib.h>, <time.h> and <unistd.h>
srand((unsigned int) time(0) + (unsigned int) getpid());
Add the above line of code once, before the first use of rand().
Pseudo-random number generators are never truly "random"; they compute each number deterministically from an initial value. Giving the generator a starting value that differs from run to run is known as "seeding." For rand(), you will want to seed it with srand(time(NULL)). For random() it's the same deal with srandom(). There are also functions available for iOS such as arc4random(), which is self-seeding. To generate a random number up to (but not including) 10, for instance, you could use arc4random_uniform(), which avoids the slight bias of arc4random() % 10:
int value = arc4random_uniform(10);
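The effect of the seed is easy to see in any language. Here is a short Python sketch (Python rather than Objective-C, purely for illustration; first_values is a made-up helper): a fixed seed reproduces the same sequence, while a seed taken from the clock changes from run to run.

```python
import random
import time

def first_values(seed, count=5):
    """Return the first `count` numbers in 0..9 produced from the given seed."""
    rng = random.Random(seed)
    return [rng.randrange(10) for _ in range(count)]

print(first_values(1234))            # same list on every run
print(first_values(1234))            # identical to the line above
print(first_values(time.time_ns()))  # seed differs per run, so the list usually does too
```

This is the behaviour the question describes: without a varying seed, rand() and random() start from the same default state each time the app launches.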

How does rand() work? Does it have certain tendencies? Is there something better to use?

I have read that it has something to do with time, also you get from including time.h, so I assumed that much, but how does it work exactly? Also, does it have any tendencies towards odd or even numbers or something like that? And finally is there something with better distribution in the C standard library or the Foundation framework?
Briefly:
You use time.h to get a seed, which is an initial random number. C then does a bunch of operations on this number to get the next random number, then operations on that one to get the next, then... you get the picture.
rand() is able to touch on every possible integer in its output range. It will not prefer even or odd numbers overall regardless of the input seed, happily. Still, it has limits: it repeats itself relatively quickly, and the C standard only guarantees values up to RAND_MAX, which is 32767 on some implementations and larger on others.
C does not have another built-in random number generator. If you need a real tough one, there are many packages available online, but the Mersenne Twister algorithm is probably the most popular pick.
Now, if you are interested on the reasons why the above is true, here are the gory details on how rand() works:
rand() is what's called a "linear congruential generator." This means that it employs an equation of the form:
x_{n+1} = (a * x_n + b) mod m
where x_n is the nth random number, and a and b are some predetermined integers. The arithmetic is performed modulo m, with m usually 2^32 depending on the machine, so that only the lowest 32 bits are kept in the calculation of x_{n+1}.
In English, then, the idea is this: To get the next random number, multiply the last random number by something, add a number to it, and then take the last few digits.
A few limitations are quickly apparent:
First, you need a starting random number. This is the "seed" of your random number generator, and this is where you've heard of time.h being used. Since we want a really random number, it is common practice to ask the system what time it is (in integer form) and use this as the first "random number." Also, this explains why using the same seed twice will always give exactly the same sequence of random numbers. This sounds bad, but is actually useful, since debugging is a lot easier when you control the inputs to your program.
Second, a and b have to be chosen very, very carefully or you'll get some disastrous results. Fortunately, the equation for a linear congruential generator is simple enough that the math has been worked out in some detail. It turns out that choosing an a which satisfies a mod 8 = 5 together with b = 1 will ensure that all m integers are equally likely, independent of the choice of seed. You also want a value of a that is really big, so that every time you multiply it by x_n you trigger the modulo and chop off a lot of digits, or else many numbers in a row will just be multiples of each other. As a result, two common values of a (for example) are 1566083941 and 1812433253 according to Knuth. The GNU C library happens to use a = 1103515245 and b = 12345. A list of values for lots of implementations is available at the Wikipedia page for LCGs.
Third, the linear congruential generator will actually repeat itself because of that modulo. This gets to be some pretty heady math, but the result of it all is happily very simple: the sequence will repeat itself after m numbers have been generated. In most cases, this means that your random number generator will repeat every 2^32 cycles. That sounds like a lot, but it really isn't for many applications. If you are doing serious numerical work with Monte Carlo simulations, this number is hopelessly inadequate.
A fourth, much less obvious problem is that the numbers are actually not really random. They have a funny sort of correlation. If you take three consecutive integers, (x, y, z), from an LCG with some value of a and m, those three points will always fall on the lattice of points generated by all linear combinations of the three points (1, a, a^2), (0, m, 0), (0, 0, m). This is known as Marsaglia's Theorem, and if you don't understand it, that's okay. All it means is this: triplets of random numbers from an LCG will show correlations at some deep, deep level. Usually it's too deep for you or me to notice, but it's there. It's even possible to reconstruct the first number in a "random" sequence of three numbers if you are given the second and third! This is not good for cryptography at all.
The good part is that LCGs like rand() are very, very low footprint. It typically requires only 32 bits to retain state, which is really nice. It's also very fast, requiring very few operations. These make it good for noncritical embedded systems, video games, casual applications, stuff like that.
PRNGs are a fascinating topic. Wikipedia is always a good place to go if you are hungry to learn more on the history or the various implementations that are around today.
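For readers who prefer code to formulas, the recurrence described above can be sketched in a few lines of Python. The default constants a = 1103515245 and b = 12345 are the ones quoted above for the GNU C library; the modulus 2^31 and the function names lcg and period are assumptions made for this sketch and do not reproduce any particular rand() implementation. The tiny generator with m = 16 shows the "repeats after m numbers" property directly.

```python
from itertools import islice

def lcg(seed, a=1103515245, b=12345, m=2**31):
    """Yield x_{n+1} = (a * x_n + b) mod m forever, starting from the seed."""
    x = seed % m
    while True:
        x = (a * x + b) % m
        yield x

def period(a, b, m, seed=0):
    """Count steps until the first output comes back.

    Assumes the sequence does return to it, which holds when a is odd
    and m is a power of two.
    """
    gen = lcg(seed, a, b, m)
    first = next(gen)
    steps = 1
    while next(gen) != first:
        steps += 1
    return steps

print(list(islice(lcg(1), 3)))
print(period(a=5, b=1, m=16))  # 16: a mod 8 = 5 and b = 1 give the full period
```

The whole state is the single integer x, which is why such generators are so cheap, and also why the same seed always replays the same sequence.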
rand returns numbers generated by a pseudo-random number generator (PRNG). The sequence of numbers it returns is deterministic, based on the value with which the PRNG was initialized (by calling srand).
The numbers should be distributed such that they appear somewhat random, so, for example, odd and even numbers should be returned at roughly the same frequency. The actual implementation of the random number generator is left unspecified, so the actual behavior is specific to the implementation.
The important thing to remember is that rand does not return random numbers; it returns pseudo-random numbers, and the values it returns are determined by the seed value and the number of times rand has been called. This behavior is fine for many use cases, but is not appropriate for others (for example, rand would not be appropriate for use in many cryptographic applications).
How does rand() work?
http://en.wikipedia.org/wiki/Pseudorandom_number_generator
I have read that it has something to do with time, also you get from including time.h
rand() has nothing at all to do with the time. However, it's very common to use time() to obtain the "seed" for the PRNG so that you get different "random" numbers each time your program is run.
Also, does it have any tendencies towards odd or even numbers or something like that?
Depends on the exact method used. There's one popular implementation of rand() that alternates between odd and even numbers. So avoid writing code like rand() % 2 that depends on the lowest bit being random.
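The odd/even alternation is easy to reproduce with a hand-written linear congruential generator. The Python sketch below assumes an odd multiplier, an odd increment and a power-of-two modulus (the constants are illustrative, and low_bits is a made-up helper); under those assumptions each step flips the lowest bit.

```python
def low_bits(seed, count, a=1103515245, b=12345, m=2**31):
    """Return the lowest bit of each of the first `count` LCG outputs."""
    x = seed % m
    bits = []
    for _ in range(count):
        x = (a * x + b) % m
        bits.append(x & 1)
    return bits

print(low_bits(42, 8))  # [1, 0, 1, 0, 1, 0, 1, 0]
```

With such a generator, code like rand() % 2 would give a strictly alternating "coin", which is why taking higher bits or using a better generator is the safer choice.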

Generating Random Numbers in Objective C for iPhone SDK

I was using the arc4random() function in order to generate a random group and sequence of numbers, but I was told that this was overkill and that I should use the random() function instead. However, the random() function gives me the same group and sequence of numbers every time.
I call srand(time(0)) once when my app first starts in order to seed the random() function. Do you ever need to reseed the random() function?
Am I missing something?
Thanks.
First off, who told you arc4random was overkill? I use it in my projects, and it (a) satisfies my requirements, (b) doesn't suck down resources (at least any visible to the user or obvious to me), and (c) was trivial to implement, so I don't really see how a similar use in your own code could be called "overkill."
Second, srand() seeds the rand() function, not random(), so that may be your issue; random() is seeded with srandom(). And no, you shouldn't have to reseed the generator at any time during your program's execution. Once during startup is enough.
No, you do not need to reseed the random number generator. There is some additional uniformity gained by generating some amount of numbers and throwing them away, but unless you are looking for security level random number generation there is no need. For most purposes a properly seeded random number generator is uniform enough.