Seed random number for unit tests in golang - testing

I have functions that uses math/rand to "randomly" sample from a Poisson and another from the binomial distribution. It is often used by other functions that also return random values like h(g(f())) where f() g() and h() are random functions.
I placed a rand.Seed(n) call in main() to pick a different seed every time the program is run and it works fine.
My question is for unittests for these PRNG functions and the functions that use them using the builtin testing package. I would like to remove the randomness so that I can have a predictable value to compare with.
Where is the best place to place my constant value seed to get deterministic output? At the init() of the test file or inside every test function, or somewhere else?

You should certainly not put it in the test init() function. Why? Because execution order (or even if test functions are run) is non-deterministic. For details, see How to run golang tests sequentially?
What does this mean?
If you have 2 test functions (e.g. TestA() and TestB()) both of which test functions that call into math/rand, you don't have guarantees if TestA() is run first or TestB(), or even if any of those will be called. And so random data returned by math/rand will depend on this order.
A better option would be to put seeding into TestA() and TestB(), but this may also be insufficient, as tests may run parallel, so the random data returned by math/rand may also be non-deterministic.
To really have deterministic test results, functions that need random data would need to receive a math.Rand value and use that explicitly, and in tests you can create separate, distinct math.Rand values that will not be used by other tests, so seeding those to constant values and using those in the tested functions, only then can you have deterministic results that will not depend on how and in which order the test functions are called.

As an alternative to passing in a math.Rand you could monkey patch IF you you don't want dependency injection to be part of your package's API e.g.: https://play.golang.org/p/cIGxhO0wSbo

To stay compatible with parallel test execution, create your own rand.Rand.
My example includes Check() from the standard package testing/quick which is often used to execute tests on a hundred of pseudo-random arguments. (Similarly to OP's case, it's a function that makes your tests very much dependent on RNG seeding).
package main
import (
"math/rand"
"testing"
"testing/quick"
)
func TestRandomly(t *testing.T) {
r := rand.New(rand.NewSource(0))
config := &quick.Config{Rand: r}
assertion := func(num uint8) bool {
// fail test when argument is 254
return num != 254
}
if err := quick.Check(assertion, config); err != nil {
t.Error("failed checks", err)
}
}

Related

Using inline function kotlin

I know there was documented in main kotlin page, but there is no clear explanation about when to use it, why this function need a receiver as a function. What would be the correct way to create a correct definition of inline function.
This is inline function
inline fun String?.toDateString(rawDateFormat: String = MMMM_DD_YYYY, outputDate: String = MM_DD_YYYY, block: (date: String) -> String): String {
return try {
var sdf = SimpleDateFormat(rawDateFormat, Locale.US)
val date = sdf.parse(this.orEmpty())
sdf = SimpleDateFormat(outputDate, Locale.US)
block(sdf.format(date ?: Date()).orEmpty())
} catch (ex: Exception) {
block("")
}
}
The same way we also can do
inline fun String?.toDateString(rawDateFormat: String = MMMM_DD_YYYY, outputDate: String = MM_DD_YYYY): String {
return try {
var sdf = SimpleDateFormat(rawDateFormat, Locale.US)
val date = sdf.parse(this.orEmpty())
sdf = SimpleDateFormat(outputDate, Locale.US)
sdf.format(date ?: Date()).orEmpty()
} catch (ex: Exception) {
""
}
}
If anyone could have a detail explanation about this?
Edit:
I understand that the inline function will insert the code whenever it called by the compiler. But this come to my attention, when I want to use inline function without functional parameter receiver type the warning show as this in which should have a better explain. I also want to understand why this is such recommendation.
There are few things here.
First, you ask about using a function with a receiver.  In both cases here, the receiver is the String? part of String?.toDateString().  It means that you can call the function as if it were a method of String, e.g. "2021-01-15 12:00:00".toDateString(…).
The original String? is accessible as this within the function; you can see it in the sdf.parse(this.orEmpty()) call.  (It's not always as obvious as this; you could simply call sdf.parse(orEmpty()), where the this. is implied.)
Then you ask about inline functions.  All you have to do is to mark the function as inline, and the compiler will automatically insert its code wherever it's called, instead of defining a function in the usual way.  But you don't need to worry about how it's implemented; there are just a few visible effects in the code.  In particular, if a function is inline and accepts a function parameter, then its lambda can do a few things (such as calling return) that it couldn't otherwise do.
Which leads us to what I think is your real question: about the block function parameter.  Your first example has this parameter, with the type (date: String) -> String — i.e. a function taking a single String parameter and returning another String.  (The technical term for this is that toDateString() is a higher-order function.)
The toDateString() function calls this block function before returning, applying it to the date string it has formatted before returning it to the caller.
As to why it does this, it's hard to tell.  That's why we put documentation comments before functions: to explain anything that's not obvious from the code!  Ideally, there would be a comment explaining why you're required to supply a block lamdba (or function reference), when it's not vital to what the function does.
There are times when blocks passed this way are very useful.  For example, the joinToString() function accepts an optional transform parameter, which it applies to each item before joining it to the list.  If it didn't, the effect would be a lot more awkward to obtain.  (You'd probably have to apply a map() to the collection before calling joinToString(), which would be less efficient.)
But this isn't one of those times.  As your second example shows, toDateString() would work perfectly well without the block parameter — and then if you needed to pass the result through another function, you could just call it on toDateString()'s result.
Perhaps if you included a link to the ‘main kotlin page’ where you saw this, it might give some more context?
The edited question also asks about the IDE warning.  This is shown when it thinks inlining a function won't give a significant improvement.
When no lambdas are involved, the only potential benefit from inlining a function is performance, and that's a trade-off.  It might avoid the overhead of a function call wherever it's called — but the Java runtime will often inline small functions anyway, all on its own.  And having the compiler do the inlining comes at the cost of duplicating the function's code everywhere it's called; the increased code size is less likely to fit into memory caches, and less likely to be optimised by the Java runtime — so that can end up reducing the performance overall.  Because this isn't always obvious, the IDE gives a warning.
It's different when lambdas are involved, though.  In that case, inlining affects functionality: for example, it allows non-local returns and reified type parameters.  So in that case there are good reasons for using inline regardless of any performance implications, and the IDE doesn't give the warning.
(In fact, if a function calls a lambda it's passed, inlining can have a more significant performance benefit: not only does the function itself get inlined, but the lambda itself usually does as well, removing two levels of function call — and the lambda is often called repeatedly, so there can be a real saving.)

Why Kotlin doesn't show kotlin.Unit for main function, unlike other functions having Unit return type?

I'm new to Kotlin and while trying the programs when I included functions with Unit return type it showed kotlin.Unit after completion of execution.
Being main function has Unit return type too why it doesn't show kotlin.Unit after execution?
a
b
c
Process finished with exit code 0
This was the output I got for simple program without any other functions
First, please do not upload images of code/errors when asking a question..
But to answer your question: Kotlin is printing the Unit result of calling your function because you're telling it to:
print(compare(a, b))
There's almost never a need to print Unitor manipulate it in any way, but that's perfectly legal to do. And since your main() function also returns Unit, you could print that out too if you wanted (and were calling that from another function). But why would you want to?
Either remove the print(), and simply call compare(a, b) on its own; or change compare() to return a value that you do want to print!

How to make and use an arraylist of functions

How can i make an arraylist of functions, and call each function easily? I have already tried making an ArrayList<Function<Unit>>, but when i tried to do this:
functionList.forEach { it }
and this:
for(i in 0 until functionList.size) functionList[i]
When i tried doing this: it() and this: functionList[i](), but it wouldn't even compile in intellij. How can i do this in kotlin? Also, does the "Unit" in ArrayList<Function<Unit>> mean return value or parameters?
Just like this:
val funs:List<() -> Unit> = listOf({}, { println("fun")})
funs.forEach { it() }
The compiler can successfully infer the type of funs here which is List<() -> Unit>. Note that () -> Unit is a function type in Kotlin which represents a function that does not take any argument and returns Unit.
I think there are two problems with the use of the Function interface here.
The first problem is that it doesn't mean what you might think. As I understand it, it's a very general interface, implemented by all functions, however many parameters they take (or none). So it doesn't have any invoke() method. That's what the compiler is complaining about.
Function has several sub-interfaces, one for each 'arity' (i.e. one for each number of parameters): Function0 for functions that take no parameters, Function1 for functions taking one parameter, and so on. These have the appropriate invoke() methods. So you could probably fix this by replacing Function by Function0.
But that leads me on to the second problem, which is that the Function interfaces aren't supposed to be used this way. I think they're mainly for Java compatibility and/or for internal use by the compiler.
It's usually much better to use the Kotlin syntax for function types: (P1, P2...) -> R. This is much easier to read, and avoids these sorts of problems.
So the real answer is probably to replace Function<Unit> by () -> Unit.
Also, in case it's not clear, Kotlin doesn't have a void type. Instead, it has a type called Unit, which has exactly one value. This might seem strange, but makes better sense in the type system, as it lets the compiler distinguish functions that return without an explicit value, from those which don't return. (The latter might always throw an exception or exit the process. They can be defined to return Nothing -- a type with no values at all.)

What is the difference between an Idempotent and a Deterministic function?

Are idempotent and deterministic functions both just functions that return the same result given the same inputs?
Or is there a distinction that I'm missing?
(And if there is a distinction, could you please help me understand what it is)
In more simple terms:
Pure deterministic function: The output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output. There are no side-effects or other output.
Impure deterministic function: As with a deterministic function that is a pure function: the output is based entirely, and only, on the input values and nothing else: there is no other (hidden) input or state that it relies on to generate its output - however there is other output (side-effects).
Idempotency: The practical definition is that you can safely call the same function multiple times without fear of negative side-effects. More formally: there are no changes of state between subsequent identical calls.
Idempotency does not imply determinacy (as a function can alter state on the first call while being idempotent on subsequent calls), but all pure deterministic functions are inherently idempotent (as there is no internal state to persist between calls). Impure deterministic functions are not necessarily idempotent.
Pure deterministic
Impure deterministic
Pure Nondeterministic
Impure Nondeterministic
Idempotent
Input
Only parameter arguments (incl. this)
Only parameter arguments (incl. this)
Parameter arguments and hidden state
Parameter arguments and hidden state
Any
Output
Only return value
Return value or side-effects
Only return value
Return value or side-effects
Any
Side-effects
None
Yes
None
Yes
After 1st call: Maybe.After 2nd call: None
SQL Example
UCASE
CREATE TABLE
GETDATE
DROP TABLE
C# Example
String.IndexOf
DateTime.Now
Directory.Create(String)Footnote1
Footnote1 - Directory.Create(String) is idempotent because if the directory already exists it doesn't raise an error, instead it returns a new DirectoryInfo instance pointing to the specified extant filesystem directory (instead of creating the filesystem directory first and then returning a new DirectoryInfo instance pointing to it) - this is just like how Win32's CreateFile can be used to open an existing file.
A temporary note on non-scalar parameters, this, and mutating input arguments:
(I'm currently unsure how instance methods in OOP languages (with their hidden this parameter) can be categorized as pure/impure or deterministic or not - especially when it comes to mutating the the target of this - so I've asked the experts in CS.SE to help me come to an answer - once I've got a satisfactory answer there I'll update this answer).
A note on Exceptions
Many (most?) programming languages today treat thrown exceptions as either a separate "kind" of return (i.e. "return to nearest catch") or as an explicit side-effect (often due to how that language's runtime works). However, as far as this answer is concerned, a given function's ability to throw an exception does not alter its pure/impure/deterministic/non-deterministic label - ditto idempotency (in fact: throwing is often how idempotency is implemented in the first place e.g. a function can avoid causing any side-effects simply by throwing right-before it makes those state changes - but alternatively it could simply return too.).
So, for our CS-theoretical purposes, if a given function can throw an exception then you can consider the exception as simply part of that function's output. What does matter is if the exception is thrown deterministically or not, and if (e.g. List<T>.get(int index) deterministically throws if index < 0).
Note that things are very different for functions that catch exceptions, however.
Determinacy of Pure Functions
For example, in SQL UCASE(val), or in C#/.NET String.IndexOf are both deterministic because the output depends only on the input. Note that in instance methods (such as IndexOf) the instance object (i.e. the hidden this parameter) counts as input, even though it's "hidden":
"foo".IndexOf("o") == 1 // first cal
"foo".IndexOf("o") == 1 // second call
// the third call will also be == 1
Whereas in SQL NOW() or in C#/.NET DateTime.UtcNow is not deterministic because the output changes even though the input remains the same (note that property getters in .NET are equivalent to a method that accepts no parameters besides the implicit this parameter):
DateTime.UtcNow == 2016-10-27 18:10:01 // first call
DateTime.UtcNow == 2016-10-27 18:10:02 // second call
Idempotency
A good example in .NET is the Dispose() method: See Should IDisposable.Dispose() implementations be idempotent?
a Dispose method should be callable multiple times without throwing an exception.
So if a parent component X makes an initial call to foo.Dispose() then it will invoke the disposal operation and X can now consider foo to be disposed. Execution/control then passes to another component Y which also then tries to dispose of foo, after Y calls foo.Dispose() it too can expect foo to be disposed (which it is), even though X already disposed it. This means Y does not need to check to see if foo is already disposed, saving the developer time - and also eliminating bugs where calling Dispose a second time might throw an exception, for example.
Another (general) example is in REST: the RFC for HTTP1.1 states that GET, HEAD, PUT, and DELETE are idempotent, but POST is not ( https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html )
Methods can also have the property of "idempotence" in that (aside from error or expiration issues) the side-effects of N > 0 identical requests is the same as for a single request. The methods GET, HEAD, PUT and DELETE share this property. Also, the methods OPTIONS and TRACE SHOULD NOT have side effects, and so are inherently idempotent.
So if you use DELETE then:
Client->Server: DELETE /foo/bar
// `foo/bar` is now deleted
Server->Client: 200 OK
Client->Server DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
// the client asks again:
Client->Server: DELETE /foo/bar
// foo/bar` is already deleted, so there's nothing to do, but inform the client that foo/bar doesn't exist
Server->Client: 404 Not Found
So you see in the above example that DELETE is idempotent in that the state of the server did not change between the last two DELETE requests, but it is not deterministic because the server returned 200 for the first request but 404 for the second request.
A deterministic function is just a function in the mathematical sense. Given the same input, you always get the same output. On the other hand, an idempotent function is a function which satisfies the identity
f(f(x)) = f(x)
As a simple example. If UCase() is a function that converts a string to an upper case string, then clearly UCase(Ucase(s)) = UCase(s).
Idempotent functions are a subset of all functions.
A deterministic function will return the same result for the same inputs, regardless of how many times you call it.
An idempotent function may NOT return the same result (it will return the result in the same form but the value could be different, see http example below). It only guarantees that it will have no side effects. In other words it will not change anything.
For example, the GET verb is meant to be idempotent in HTTP protocol. If you call "~/employees/1" it will return the info for employee with ID of 1 in a specific format. It should never change anything but simply return the employee information. If you call it 10, 100 or so times, the returned format will always be the same. However, by no means can it be deterministic. Maybe if you call it the second time, the employee info has changed or perhaps the employee no longer even exists. But never should it have side effects or return the result in a different format.
My Opinion
Idempotent is a weird word but knowing the origin can be very helpful, idem meaning same and potent meaning power. In other words it means having the same power which clearly doesn't mean no side effects so not sure where that comes from. A classic example of There are only two hard things in computer science, cache invalidation and naming things. Why couldn't they just use read-only? Oh wait, they wanted to sound extra smart, perhaps? Perhaps like cyclomatic complexity?

Execute command block in primitive in NetLogo extension

I'm writing a primitive that takes in two agentsets and a command block. It needs to call a few functions, execute the command block in the current context, and then call another function. Here's what I have so far:
class WithContext(pushGraphContext: GraphContext => Unit, popGraphContext: api.World => GraphContext)
extends api.DefaultCommand {
override def getSyntax = commandSyntax(
Array(AgentsetType, AgentsetType, CommandBlockType))
def perform(args: Array[Argument], context: Context) {
val turtleSet = args(0).getAgentSet.requireTurtleSet
val linkSet = args(1).getAgentSet.requireLinkSet
val world = linkSet.world
val gc = new GraphContext(world, turtleSet, linkSet)
val extContext = context.asInstanceOf[ExtensionContext]
val nvmContext = extContext.nvmContext
pushGraphContext(gc)
// execute command block here
popGraphContext(world)
}
}
I looked at some examples that used nvmContext.runExclusively, but that looked like it's specifically for having a given agentset run the command block. I want the current agent (possibly the observer) to run it. Should I wrap nvm.agent in an agentset and pass that to nvmContext.runExclusively? If so, what's the easiest way to wrap an agent in agentset? If not, what should I do?
Method #1
The quicker-but-arguably-dirtier method is to use runExclusiveJob, as demonstrated in (e.g.) the create-red-turtles command in https://github.com/NetLogo/Sample-Scala-Extension/blob/master/src/SampleScalaExtension.scala .
To wrap the current agent in an agentset, you can use agent.AgentSetBuilder. (You could also pass an Array[Agent] of length 1 to one of the ArrayAgentSet constructors, but I'd recommend AgentSetBuilder since it's less reliant on internal implementation details which are likely to change.)
Method #2
The disadvantage of method #1 is the slight constant overhead associated with creating and setting up the extra AgentSet, Job, and Context objects and directing execution through them.
Creating and running a separate job isn't actually how built-in commands like if and while work. Instead of making a new job, they remain in the current job and cause commands in a command block to run (or not run) by manipulating the instruction pointer (nvm.Context.ip) to jump into them or skip over them.
I believe an extension command could do the same. I'm not certain if it has been tried before, but I can't see any reason it wouldn't work.
Doing it this way would involve understanding more about NetLogo engine internals, as documented at https://github.com/NetLogo/NetLogo/wiki/Engine-architecture . You'd model your primitive after e.g. https://github.com/NetLogo/NetLogo/blob/5.0.x/src/main/org/nlogo/prim/etc/_if.java , including altering your implementation of nvm.CustomAssembled. (Note that prim._extern, which runs extension commands, delegates its assemble method to the wrapped command's own assemble method, so this should work.) In your assemble method, instead of calling done() at the end to terminate the job, you'd just allow execution to fall through.
I could try to construct an example that works this way, but it'd take me a couple hours; it's probably not worth me doing unless there's a real need.