Smalltalk dictionary as calculator - smalltalk

I'm working on a homework assignment that asks us to create a type of Units class that can keep track of units and perform basic arithmetic on them. The problem description has this bit, which I don't completely understand:
Probably the easiest way to keep track of the units is to give Units a dictionary that maps symbols to integers. If you are dividing by a unit then it has a negative value in the dictionary. You add two Units together by adding the value together for each symbol in the dictionary. When it is zero, throw the symbol away!
For reference, this is also included in the description:
[...] you could write an expression 3 elephants / (1 sec sec) and it would return the right thing.
Could someone shed some light here? How can I use a dictionary to map these types of units? Am I making this way harder than it needs to be?

It sounds like your teacher is giving you a hint about how to wind up with the proper units at the end of the calculation.
As you parse the problem and encounter items that are obviously units, enter them into a dictionary. Each entry pairs a unit name (a string or symbol) with an integer count. A set of rules then increases or decreases that count, and the resulting value tells you how to output the units:
A count of 1 indicates the unit appears in the output.
A count of -1 indicates its inverse appears in the output.
A count of 0 indicates that it doesn't appear in the output at all.
Similarly, a count of 2 indicates that its square appears in the output.
To wit:
5 Hippo + 10 Hippo = 15 Hippos
Parsing:    Dictionary:
--------    -----------
5 Hippo     Hippo:1
+
10 Hippo    Hippo:1 (previous operation was addition or subtraction, and Hippo is already in the dictionary)
But consider this problem:
5 Hippo * 5 sec/Hippo = 25 sec
Parsing:    Dictionary:
--------    -----------
5 Hippo     Hippo:1
*
5 sec       Hippo:1, sec:1
/
Hippo       Hippo:0, sec:1 (previous operation was division by Hippo, so decrement the Hippo count)
Or perhaps:
10 feet / 5 sec = 2 feet/sec
Parsing:    Dictionary:
--------    -----------
10 feet     feet:1
/
5 sec       feet:1, sec:-1 (divided by sec, and sec is not in the dictionary, so sec is implicitly 0; 0 + (-1) = -1)
In the example above, feet goes above the bar because its value is 1, and sec goes below the bar because its value is -1. If its value had been -2, the result would have been feet/(sec*sec), or feet/(sec squared).
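The bookkeeping described above can be sketched in a few lines. This is only an illustration in Python rather than Smalltalk, and the names combine, multiply, divide and format_units are made up for this sketch; they are not part of the assignment.

```python
def combine(left, right, sign):
    """Merge two unit dictionaries (symbol -> integer exponent).

    sign=+1 adds the exponents of `right` (multiplication),
    sign=-1 subtracts them (division).
    """
    result = dict(left)
    for symbol, exponent in right.items():
        total = result.get(symbol, 0) + sign * exponent
        if total == 0:
            result.pop(symbol, None)  # count reached zero: throw the symbol away
        else:
            result[symbol] = total
    return result


def multiply(left, right):
    return combine(left, right, +1)


def divide(left, right):
    return combine(left, right, -1)


def format_units(units):
    """Render positive counts above the bar and negative counts below it."""
    def render(symbol, power):
        return symbol if power == 1 else f"{symbol}^{power}"

    top = [render(s, e) for s, e in sorted(units.items()) if e > 0]
    bottom = [render(s, -e) for s, e in sorted(units.items()) if e < 0]
    text = "*".join(top) if top else "1"
    if bottom:
        below = "*".join(bottom)
        text += "/" + (f"({below})" if len(bottom) > 1 else below)
    return text


# 5 Hippo * 5 sec/Hippo: the Hippo counts cancel and only sec remains.
sec_per_hippo = divide({"sec": 1}, {"Hippo": 1})
print(format_units(multiply({"Hippo": 1}, sec_per_hippo)))  # sec
print(format_units(divide({"feet": 1}, {"sec": 1})))        # feet/sec
```

Only the units are tracked here; the numeric part (5 * 5 = 25) would be carried alongside the dictionary in the real Units class.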

Related

Conversion of MIPS to %

I've been learning T-SQL and need some help converting CPU MIPS into a percentage.
I've built a query that returns the data I expect. In addition, I want to add a column that gives the CPU%. I have a column that gives me the TOTALCPU MIPS, and I want to use it in the query, but expressed as a percentage. For example, I have these values in my TOTAL CPU column:
1623453.66897
0
0
2148441.01573933
3048946.946314
I want to convert these values into percentage and use them. I couldn't find much info on the internet.
Appreciate your response.
I assume that you have 5 numeric quantities (2 of them being zero) and you want to find the percentage that each of them represents out of the sum of the five quantities. Is that right?
To find the percentage that a particular number contributes to the sum, multiply the number by 100 and divide by the sum; the result is the percentage of the total that the number represents.
The sum: 6820841.631023
The percentage of the first number (of MIPS):
1623453.668970 * 100 / 6820841.631023 = 23.80136876 =>
23.80136876% is the percentage of CPU used by the first program.
To give the answer an SQL flavor, with Mips_Table standing for the view/table that contains the MIPS data:
select mips, mips * 100.0 / TotMips as Pct_CPU
from Mips_Table
cross join (select sum(mips) as TotMips from Mips_Table) k
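As a cross-check of the arithmetic, here is a small Python sketch that applies the same "value * 100 / sum" rule to the five values from the question. The function name percent_of_total is made up for this example, and returning zeros when the sum is zero is an assumption, not something the answer specifies.

```python
def percent_of_total(values):
    """Return each value as a percentage of the sum of all values."""
    total = sum(values)
    if total == 0:
        # Assumption for this sketch: with nothing to divide by, report 0% everywhere.
        return [0.0 for _ in values]
    return [value * 100 / total for value in values]


mips = [1623453.66897, 0, 0, 2148441.01573933, 3048946.946314]
for value, pct in zip(mips, percent_of_total(mips)):
    print(f"{value:>18.6f}  {pct:6.2f}%")
```

The first value comes out at about 23.80%, matching the hand calculation above.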

repeated sum of digits big o complexity

Let's say, for example, we have the number 12345.
This sums to 15 when you add 1 + 2 + 3 + 4 + 5, which in turn sums to 6 when you add 1 + 5.
My question is: what would the time complexity be for a repeated digit-summing algorithm like this? The process continues until only a single digit is left.
I know that for any given number n, the number of digits grows like ln(n). I'm thinking this means that the big O would look something like (ln(n))^k, for some k. However, I am not confident, because each time you sum, the number of digits gets smaller (first 5 digits were summed, then only 2).
How would I go about figuring this out?
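One way to get a feel for the cost is to count how many digits each round actually touches. The Python sketch below does that; the function name repeated_digit_sum is made up for this illustration, and it is meant for experimenting, not as a proof of the complexity.

```python
def repeated_digit_sum(n):
    """Sum the digits of n repeatedly until one digit remains.

    Returns the final digit and the number of digits processed in each round.
    """
    digits_per_round = []
    while n >= 10:
        digits = [int(ch) for ch in str(n)]
        digits_per_round.append(len(digits))
        n = sum(digits)
    return n, digits_per_round


print(repeated_digit_sum(12345))  # (6, [5, 2])
```

Since the digit sum of a d-digit number is at most 9d, the number shrinks drastically after the first round, which is why the later rounds process so few digits.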

How to make a biased random number generator in VB.NET?

How do I make a biased random number generator (RNG) in VB.NET?
I know I could make it by fiddling with the output of the Randomize()/Rnd methods, but is there a built-in way of doing this?
I want the biased RNG to give me either a 2 or 4 (though using 1 or 2 as a substitute is also OK by me), with 2 occurring on average 90% of the time and 4 occurring on average 10% of the time.
Create a random number generator that returns values from 1 to 10. If the value is between 1 and 9, return a 2; if the value is 10, return a 4.
You might want to look at this
http://msdn.microsoft.com/en-us/library/vstudio/ctssatww(v=vs.100).aspx?cs-save-lang=1&cs-lang=vb#code-snippet-2
If you want to use a mask (a lookup table) to generate your values, here is what I think you can do.
Dim numbers() As Integer = {2, 2, 2, 2, 4, 2, 2, 2, 2, 2} ' set 10% for 4, 90% for 2
Dim r As New Random()
Return numbers(r.Next(0, 10)) ' Next(0, 10) returns 0 through 9
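The "draw 1 to 10, map 10 to a 4" idea can be sketched as follows. This is Python rather than VB.NET, purely to illustrate the logic; biased_pick is a made-up name, and passing the generator in as a parameter is a choice made here so that the draws can be seeded.

```python
import random


def biased_pick(rng):
    """Return 2 about 90% of the time and 4 about 10% of the time."""
    # randint(1, 10) is inclusive on both ends, so 10 comes up one time in ten.
    return 4 if rng.randint(1, 10) == 10 else 2


rng = random.Random()  # pass a seed, e.g. random.Random(42), for repeatable draws
draws = [biased_pick(rng) for _ in range(10_000)]
print(draws.count(2) / len(draws))  # close to 0.9
```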

Converting binary to decimal without using a function

I'm trying to create a binary to decimal converter and have got stuck on the code. I have searched forums for help, but they all seem to use functions, which cannot be used within a private sub. Can anyone please help me with a solution to this problem?
I would use the positional notation method:
http://en.wikipedia.org/wiki/Positional_notation
http://www.wikihow.com/Convert-from-Binary-to-Decimal
So basically, without giving you the answer, you want to loop through the binary placeholders, filling up a variable as you go along. You would use an index to move from the least significant placeholder to the most significant.
For example: 10011011 in binary is 155 in decimal.
Every placeholder is a power of two. You add the value for each one until you're finished, like so:
placeholder 1 is: 2 pow 0 equals 1.
placeholder 2 is: 2 pow 1 equals 2.
placeholder 3 is: 2 pow 2 equals 4.
placeholder 4 is: 2 pow 3 equals 8.
placeholder 5 is: 2 pow 4 equals 16.
placeholder 6 is: 2 pow 5 equals 32.
placeholder 7 is: 2 pow 6 equals 64.
placeholder 8 is: 2 pow 7 equals 128.
Now we only add for the placeholders that have 1s.
128+16+8+2+1 = 155
What you will need:
A loop looping through indexes, and incrementing the exponent value as you go along, only adding the value if the index equals 1 in the binary number.
Hope my explanation makes sense. Good luck.
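For readers who want to check their own loop against something, here is a sketch of the positional method in Python. It is wrapped in a function (named binary_to_decimal only for this example) so that it can be tested; the loop body is the part that would go inside a private sub.

```python
def binary_to_decimal(bits):
    """Convert a string of 0s and 1s to its decimal value, one placeholder at a time."""
    total = 0
    exponent = 0
    # Walk from the least significant placeholder (rightmost) to the most significant.
    for ch in reversed(bits):
        if ch == "1":
            total += 2 ** exponent  # only placeholders holding a 1 contribute
        elif ch != "0":
            raise ValueError(f"not a binary digit: {ch!r}")
        exponent += 1
    return total


print(binary_to_decimal("10011011"))  # 155
```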

Power-law distribution in T-SQL

I basically need the answer to this SO question that provides a power-law distribution, translated to T-SQL for me.
I want to pull a last name, one at a time, from a census provided table of names. I want to get roughly the same distribution as occurs in the population. The table has 88,799 names ranked by frequency. "Smith" is rank 1 with 1.006% frequency, "Alderink" is rank 88,799 with frequency of 1.7 x 10^-6. "Sanders" is rank 75 with a frequency of 0.100%.
The curve doesn't have to fit precisely at all. Just give me about 1% "Smith" and about 1 in a million "Alderink"
Here's what I have so far.
SELECT [LastName]
FROM [LastNames] as LN
WHERE LN.[Rank] = ROUND(88799 * RAND(), 0)
But this of course yields a uniform distribution.
I promise I'll still be trying to figure this out myself by the time a smarter person responds.
Why settle for the power-law distribution when you can draw from the actual distribution?
I suggest you alter the LastNames table to include a numeric column containing a value that represents the actual number of individuals with a name that is more common. You'll probably want a number on a smaller but proportional scale, say, 10,000 for each percent of representation.
The list would then look something like:
(other than the 3 names mentioned in the question, I'm guessing about White, Johnson et al)
Smith 0
White 10,060
Johnson 19,123
Williams 28,456
...
Sanders 200,987
..
Alderink 999,997
And the name selection would be
SELECT TOP 1 [LastName]
FROM [LastNames] as LN
WHERE LN.[number_described_above] < ROUND(1000000 * RAND(), 0)
ORDER BY [number_described_above] DESC
This picks the first name whose number does not exceed the (uniformly distributed) random number. Note how the query uses less-than and orders in descending order; this guarantees that the very first entry (Smith) gets picked. The alternative would be to start the series with Smith at 10,060 rather than zero and to discard the random draws smaller than this value.
Aside from the matter of boundary management (starting at zero rather than 10,060) mentioned above, this solution, along with the two other responses so far, are the same as the one suggested in dmckee's answer to the question referenced in this question. Essentially the idea is to use the CDF (Cumulative Distribution function).
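The CDF idea can be sketched outside SQL as well. The Python below is a minimal illustration: the three weights are toy values invented for this example (they are not census figures), and build_cdf and draw are made-up names.

```python
import bisect
import itertools
import random


def build_cdf(weighted_names):
    """Turn (name, weight) pairs into parallel lists of names and running totals."""
    names = [name for name, _ in weighted_names]
    cumulative = list(itertools.accumulate(weight for _, weight in weighted_names))
    return names, cumulative


def draw(names, cumulative, u):
    """Pick a name given u, a uniform random number in [0, 1)."""
    target = u * cumulative[-1]
    # First entry whose running total is strictly greater than the target.
    index = bisect.bisect_right(cumulative, target)
    return names[min(index, len(names) - 1)]  # guard against rounding at the top end


# Toy weights, invented for illustration only.
names, cumulative = build_cdf([("Smith", 5), ("Johnson", 3), ("Alderink", 1)])
print(draw(names, cumulative, random.random()))
```

A name with weight w is chosen whenever the target lands in its slice of width w, so each name comes up in proportion to its weight.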
Edit:
If you insist on using a mathematical function rather than the actual distribution, the following should provide a power-law function that conveys something of the "long tail" shape of the real distribution. You may want to tweak the @PwrCoef value (which, BTW, needn't be an integer); essentially, the bigger the coefficient, the more skewed towards the beginning of the list the function is.
DECLARE @PwrCoef INT
SET @PwrCoef = 2
SELECT 88799 - ROUND(POWER(POWER(88799.0, @PwrCoef) * RAND(), 1.0/@PwrCoef), 0)
Notes:
- the extra ".0" in the function above is important: it forces SQL to perform float operations rather than integer operations.
- the reason we subtract the power calculation from 88799 is that the calculation's distribution is such that the closer a number is to the end of our scale, the more likely it is to be drawn. Since the list of family names is sorted in the reverse order (most likely names first), we need this subtraction.
Assuming a power of, say, 3 the query would then look something like
SELECT [LastName]
FROM [LastNames] as LN
WHERE LN.[Rank]
= 88799 - ROUND(POWER(POWER(88799.0, 3) * RAND(), 1.0/3), 0)
Which is the query from the question except for the last line.
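To see how the formula skews the draws, here is the same expression written out in Python. The function name power_law_rank is made up, and clamping the result to at least 1 is an addition in this sketch (the SQL expression can evaluate to 0 when RAND() is very close to 1). Note also that Python's round() and T-SQL's ROUND() treat exact halves differently, so the two can differ by one in rare cases.

```python
import random


def power_law_rank(n_names, coef, u):
    """Map u, a uniform number in [0, 1), to a rank skewed towards 1."""
    # Same shape as: n - ROUND(POWER(POWER(n, coef) * RAND(), 1.0/coef), 0)
    offset = round((float(n_names) ** coef * u) ** (1.0 / coef))
    return max(1, n_names - offset)  # clamp added for this sketch


ranks = [power_law_rank(88799, 3, random.random()) for _ in range(10_000)]
print(sum(rank <= 8880 for rank in ranks) / len(ranks))  # share of draws in the top ~10% of ranks
```

With a coefficient of 1 the draw is uniform; raising the coefficient pushes more of the draws towards the low (most common) ranks.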
Re-Edit:
In looking at the actual distribution, as apparent in the Census data, the curve is extremely steep and would require a very big power coefficient, which in turn would cause overflows and/or extreme rounding errors in the naive formula shown above.
A more sensible approach may be to operate in several tiers, i.e. to perform an equal number of draws in each of the, say, three thirds (or four quarters, or...) of the cumulative distribution; within each of these partial lists, we would draw using a power-law function, possibly with the same coefficient, but with different ranges.
For example
Assuming thirds, the list divides as follows:
First third = 425 names, from Smith to Alvarado
Second third = 6,277 names, from to Gainer
Last third = 82,097 names, from Frisby to the end
If we were to need, say, 1,000 names, we'd draw 334 from the top third of the list, 333 from the second third and 333 from the last third.
For each of the thirds we'd use a similar formula, maybe with a bigger power coefficient for the first third (where we are really interested in favoring the earlier names in the list, and also where the relative frequencies are more statistically relevant). The three selection queries could look like the following:
-- Random Drawing of a single Name in top third
-- Power Coef = 12
SELECT [LastName]
FROM [LastNames] as LN
WHERE LN.[Rank]
= 425 - ROUND(POWER(POWER(425.0, 12) * RAND(), 1.0/12), 0)
-- Second third; Power Coef = 7
...
WHERE LN.[Rank]
= (425 + 6277) - ROUND(POWER(POWER(6277.0, 7) * RAND(), 1.0/7), 0)
-- Bottom third; Power Coef = 4
...
WHERE LN.[Rank]
= (425 + 6277 + 82097) - ROUND(POWER(POWER(82097.0, 4) * RAND(), 1.0/4), 0)
Instead of storing the pdf as rank, store the CDF (the sum of all frequencies up to that name, starting from Alderink).
Then modify your select to retrieve the first LN with a rank greater than your formula result.
I read the question as "I need to get a stream of names which will mirror the frequency of last names from the 1990 US Census"
I might have read the question a bit differently than the other suggestions, and although an answer has been accepted (and a very thorough answer it is), I will contribute my experience with the Census last names.
I had downloaded the same data from the 1990 census. My goal was to produce a large number of names to be submitted for search testing during performance testing of a medical record app. I inserted the last names and the percentage of frequency into a table. I added a column and filled it with an integer which was the product of "total names required * frequency". The frequency data from the census did not add up to exactly 100%, so my total number of names was also a bit short of the requirement. I was able to correct the number by selecting random names from the list and increasing their count until I had exactly the required number; the randomly added count never amounted to more than .05% of the total of 10 million.
I generated 10 million random numbers in the range of 1 to 88799. With each random number I would pick that name from the list and decrement the counter for that name. My approach was to simulate dealing a deck of cards, except my deck had many more distinct cards and a varying number of each card.
Do you store the actual frequencies with the ranks?
Converting the algebra from that accepted answer to MySQL is no bother, if you know what values to use for n. y would be what you currently have ROUND(88799 * RAND(), 0) and x0,x1 = 1,88799 I think, though I might misunderstand it. The only non-standard maths operator involved from a T-SQL perspective is ^ which is just POWER(x,y) == x^y.