Nim sequence assignment value semantics - sequence

I was under the impression that sequences and strings always get deeply copied on assignment. Today I got burned when interfacing with a C library to which I pass unsafeAddr of a Nim sequence. The C library writes into the memory area starting at the passed pointer.
Since I don't want the original Nim sequence to be changed by the library I thought I'll simply copy the sequence by assigning it to a new variable named copy and pass the address of the copy to the library.
Lo and behold, the modifications showed up in the original Nim sequence nevertheless. What's even more weird is that this behavior depends on whether the copy is declared via let copy = ... (changes do show up) or via var copy = ... (changes do not show up).
The following code demonstrates this in a very simplified Nim example:
proc changeArgDespiteCopyAssignment(x: seq[int], val: int): seq[int] =
let copy = x
let copyPtr = unsafeAddr(copy[0])
copyPtr[] = val
result = copy
proc dontChangeArgWhenCopyIsDeclaredAsVar(x: seq[int], val: int): seq[int] =
var copy = x
let copyPtr = unsafeAddr(copy[0])
copyPtr[] = val
result = copy
let originalSeq = #[1, 2, 3]
var ret = changeArgDespiteCopyAssignment(originalSeq, 9999)
echo originalSeq
echo ret
ret = dontChangeArgWhenCopyIsDeclaredAsVar(originalSeq, 7777)
echo originalSeq
echo ret
This prints
#[9999, 2, 3]
#[9999, 2, 3]
#[9999, 2, 3]
#[7777, 2, 3]
So the first call changes originalSeq while the second doesn't. Can someone explain what is going on under the hood?
I'm using Nim 1.6.6 and a total Nim newbie.

Turns out there are a lot of issues concerned with this behavior in the nim-lang issue tracker. For example:
let semantics gives 3 different results depends on gc, RT vs VM, backend, type, global vs local scope
Seq assignment does not perform a deep copy
Let behaves differently in proc for default gc
assigning var to local let does not properly copy in default gc
clarify spec/implementation for let: move or copy?
RFC give default gc same semantics as --gc:arc as much as possible
Long story short, whether a copy is made depends on a lot of factors, for sequences especially on the scope (global vs. local ) and the gc (refc, arc, orc) in use.
More generally, the type involved (seq vs. array), the code generation backend (C vs. JS) and whatnot can also be relevant.
This behavior has tricked a lot of beginners and is not well received by some of the contributors. It doesn't happen with the newer GCs --gc:arc or --gc:orc where the latter is expected to become the default GC in an upcoming Nim version.
It has never been fixed in the current default gc because of performance concerns, backward compatibility risks and the expectation that it will disappear anyway once we transition to the newer GCs.
Personally, I would have expected that it at least gets clearly singled out in the Nim language manual. Well, it isn't.

I took a quick look at the generated C code, a highly edited and simplified version is below.
The essential thing that is missing in the let code = ... variant is the call to genericSeqAssign() which makes a copy of the argument and assigns that to copy, instead the argument is assigned to copy directly. So, there is definitely no deep copy on assignment in that case.
I don't know if that is intentional or if it is a bug in the code generation (my first impression was astonishment). Any ideas?
tySequence_A* changeArgDespiteCopyAssignment(tySequence_A* x, NI val) {
tySequence_A* result = NIM_NIL;
tySequence_A* copy;
NI* copyPtr;
copy = x; /* NO genericSeqAssign() call !*/
copyPtr = &copy->data[(NI) 0];
*copyPtr = val;
genericSeqAssign(&result, copy, &NTIseqLintT_B);
return result;
}
tySequence_A* dontChangeArgWhenCopyIsDeclaredAsVar(tySequence_A* x, NI val) {
tySequence_A* result = NIM_NIL;
tySequence_A copy;
NI* copyPtr;
genericSeqAssign(&copy, x, &NTIseqLintT_B);
copyPtr = &copy->data[(NI) 0];
*copyPtr = val;
genericSeqAssign(&result, copy, &NTIseqLintT_B);
return result;
}

Related

Why is a variable not immutable?

Why is varx not immutable in this code?
y = { g = 'a' }
z = { g = 1 }
varx = foo y
varx = foo z
Anybody knows? Thanks.
I guess that this code is executed in the Elm REPL. Immutable variables behave a little differently there. From the Beginning Elm book:
https://elmprogramming.com/immutability.html
The repl works a little differently. Whenever we reassign a different value to an existing constant, the repl essentially rebinds the constant to a new value. The rebinding process kills the constant and brings it back to life as if the constant was never pointing to any other value before.
When compiling Elm code the ordinary way with Elm make, this code would lead to an error.

Kotlin stdlib operatios vs for loops

I wrote the following code:
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
for (i in src)
{
if (i % 2 == 0) dest.add(Math.sqrt(i.toDouble()))
}
IntellJ (in my case AndroidStudio) is asking me if I want to replace the for loop with operations from stdlib. This results in the following code:
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.filter { it % 2 == 0 }
.mapTo(dest) { Math.sqrt(it.toDouble()) }
Now I must say, I like the changed code. I find it easier to write than for loops when I come up with similar situations. However upon reading what filter function does, I realized that this is a lot slower code compared to the for loop. filter function creates a new list containing only the elements from src that match the predicate. So there is one more list created and one more loop in the stdlib version of the code. Ofc for small lists it might not be important, but in general this does not sound like a good alternative. Especially if one should chain more methods like this, you can get a lot of additional loops that could be avoided by writing a for loop.
My question is what is considered good practice in Kotlin. Should I stick to for loops or am I missing something and it does not work as I think it works.
If you are concerned about performance, what you need is Sequence. For example, your above code will be
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.asSequence()
.filter { it % 2 == 0 }
.mapTo(dest) { Math.sqrt(it.toDouble()) }
In the above code, filter returns another Sequence, which represents an intermediate step. Nothing is really created yet, no object or array creation (except a new Sequence wrapper). Only when mapTo, a terminal operator, is called does the resulting collection is created.
If you have learned java 8 stream, you may found the above explaination somewhat familiar. Actually, Sequence is roughly the kotlin equivalent of java 8 Stream. They share similiar purpose and performance characteristic. The only difference is Sequence isn't designed to work with ForkJoinPool, thus a lot easier to implement.
When there is multiple steps involved or the collection may be large, it's suggested to use Sequence instead of plain .filter {...}.mapTo{...}. I also suggest you to use the Sequence form instead of your imperative form because it's easier to understand. Imperative form may become complex, thus hard to understand, when there are 5 or more steps involved in the data processing. If there is just one step, you don't need a Sequence, because it just creates garbage and gives you nothing useful.
You're missing something. :-)
In this particular case, you can use an IntProgression:
val progression = 0 until 1_000_000 step 2
You can then create your desired list of squares in various ways:
// may make the list larger than necessary
// its internal array is copied each time the list grows beyond its capacity
// code is very straight forward
progression.map { Math.sqrt(it.toDouble()) }
// will make the list the exact size needed
// no copies are made
// code is more complicated
progression.mapTo(ArrayList(progression.last / 2 + 1)) { Math.sqrt(it.toDouble()) }
// will make the list the exact size needed
// a single intermediate list is made
// code is minimal and makes sense
progression.toList().map { Math.sqrt(it.toDouble()) }
My advice would be to choose whichever coding style you prefer. Kotlin is both object-oriented and functional language, meaning both of your propositions are correct.
Usually, functional constructs favor readability over performance; however, in some cases, procedural code will also be more readable. You should try to stick with one style as much as possible, but don't be afraid to switch some code if you feel like it's better suited to your constraints, either readability, performance, or both.
The converted code does not need the manual creation of the destination list, and can be simplified to:
val src = (0 until 1000000).toList()
val dest = src.filter { it % 2 == 0 }
.map { Math.sqrt(it.toDouble()) }
And as mentioned in the excellent answer by #glee8e you can use a sequence to do a lazy evaluation. The simplified code for using a sequence:
val src = (0 until 1000000).toList()
val dest = src.asSequence() // change to lazy
.filter { it % 2 == 0 }
.map { Math.sqrt(it.toDouble()) }
.toList() // create the final list
Note the addition of the toList() at the end is to change from a sequence back to a final list which is the one copy made during the processing. You can omit that step to remain as a sequence.
It is important to highlight the comments by #hotkey saying that you should not always assume that another iteration or a copy of a list causes worse performance than lazy evaluation. #hotkey says:
Sometimes several loops. even if they copy the whole collection, show good performance because of good locality of reference. See: Kotlin's Iterable and Sequence look exactly same. Why are two types required?
And excerpted from that link:
... in most cases it has good locality of reference thus taking advantage of CPU cache, prediction, prefetching etc. so that even multiple copying of a collection still works good enough and performs better in simple cases with small collections.
#glee8e says that there are similarities between Kotlin sequences and Java 8 streams, for detailed comparisons see: What Java 8 Stream.collect equivalents are available in the standard Kotlin library?

Make interpreter execute faster

I've created an interprter for a simple language. It is AST based (to be more exact, an irregular heterogeneous AST) with visitors executing and evaluating nodes. However I've noticed that it is extremely slow compared to "real" interpreters. For testing I've ran this code:
i = 3
j = 3
has = false
while i < 10000
j = 3
has = false
while j <= i / 2
if i % j == 0 then
has = true
end
j = j+2
end
if has == false then
puts i
end
i = i+2
end
In both ruby and my interpreter (just finding primes primitively). Ruby finished under 0.63 second, and my interpreter was over 15 seconds.
I develop the interpreter in C++ and in Visual Studio, so I've used the profiler to see what takes the most time: the evaluation methods.
50% of the execution time was to call the abstract evaluation method, which then casts the passed expression and calls the proper eval method. Something like this:
Value * eval (Exp * exp)
{
switch (exp->type)
{
case EXP_ADDITION:
eval ((AdditionExp*) exp);
break;
...
}
}
I could put the eval methods into the Exp nodes themselves, but I want to keep the nodes clean (Terence Parr saied something about reusability in his book).
Also at evaluation I always reconstruct the Value object, which stores the result of the evaluated expression. Actually Value is abstract, and it has derived value classes for different types (That's why I work with pointers, to avoid object slicing at returning). I think this could be another reason of slowness.
How could I make my interpreter as optimized as possible? Should I create bytecodes out of the AST and then interpret bytecodes instead? (As far as I know, they could be much faster)
Here is the source if it helps understanding my problem: src
Note: I haven't done any error handling yet, so an illegal statement or an error will simply freeze the program. (Also sorry for the stupid "error messages" :))
The syntax is pretty simple, the currently executed file is in OTZ1core/testfiles/test.txt (which is the prime finder).
I appreciate any help I can get, I'm really beginner at compilers and interpreters.
One possibility for a speed-up would be to use a function table instead of the switch with dynamic retyping. Your call to the typed-eval is going through at least one, and possibly several, levels of indirection. If you distinguish the typed functions instead by name and give them identical signatures, then pointers to the various functions can be packed into an array and indexed by the type member.
value (*evaltab[])(Exp *) = { // the order of functions must match
Exp_Add, // the order type values
//...
};
Then the whole switch becomes:
evaltab[exp->type](exp);
1 indirection, 1 function call. Fast.

Is it possible to avoid specifying a default in order to get an X in Chisel?

The following Chisel code works as expected.
class Memo extends Module {
val io = new Bundle {
val wen = Bool(INPUT)
val wrAddr = UInt(INPUT, 8)
val wrData = UInt(INPUT, 8)
val ren = Bool(INPUT)
val rdAddr = UInt(INPUT, 8)
val rdData = UInt(OUTPUT, 8)
}
val mem = Mem(UInt(width = 8), 256)
when (io.wen) { mem(io.wrAddr) := io.wrData }
io.rdData := UInt(0)
when (io.ren) { io.rdData := mem(io.rdAddr) }
}
However, it's a compile-time error if I don't specify io.rdData := UInt(0), because defaults are required. Is there a way to either explicitly specify X or to have modules without defaults output X, erm, by default?
Some reasons you might want to do this are that nothing should rely on the output if ren isn't asserted, and Xs let you specify it, and specifying an X can tell the synthesis tool that it's a don't care, for optimization purposes.
Chisel does not support X.
This is probably what you want:
io.rdData := mem(io.rdAddr)
(this prevents muxing between zero and mem's readout data).
If, for example, one is trying to infer a single-ported SRAM, the Chisel tutorial/manual describes how to do so (https://chisel.eecs.berkeley.edu/latest/chisel-tutorial.pdf):
val ram1p =
Mem(UInt(width = 32), 1024, seqRead = true)
val reg_raddr = Reg(UInt())
when (wen) { ram1p(waddr) := wdata }
.elsewhen (ren) { reg_raddr := raddr }
val rdata = ram1p(reg_raddr)
In short, the read address needs to be registered (since we're dealing with synchronous memory in this example), and the enable signal on this register dictates that the read-out data only changes with that enable is true. Thus, the backend understands this a read-enable signal exists on the read port, and in this example, the write port and the read port are one and the same (since only one is ever accessed at a time). Or if you want 1w/1r (as per your example), you can change the "elsewhen (ren)" to a "when (ren)".
From the Chisel 2.0 Tutorial paper, only binary logic is supported at the moment:
2 Hardware expressible in Chisel
This version of Chisel also only supports binary logic, and does not
support tri-state signals.
We focus on binary logic designs as they constitute the vast majority
of designs in practice. We omit support for tri-state logic in the
current Chisel language as this is in any case poorly supported by
industry flows, and difficult to use reliably outside of controlled
hard macros.

Clearing numerical values in Mathematica

I am working on fairly large Mathematica projects and the problem arises that I have to intermittently check numerical results but want to easily revert to having all my constructs in analytical form.
The code is fairly fluid I don't want to use scoping constructs everywhere as they add work overhead. Is there an easy way for identifying and clearing all assignments that are numerical?
EDIT: I really do know that scoping is the way to do this correctly ;-). However, for my workflow I am really just looking for a dirty trick to nix all numerical assignments after the fact instead of having the foresight to put down a Block.
If your assignments are on the top level, you can use something like this:
a = 1;
b = c;
d = 3;
e = d + b;
Cases[DownValues[In],
HoldPattern[lhs_ = rhs_?NumericQ] |
HoldPattern[(lhs_ = rhs_?NumericQ;)] :> Unset[lhs],
3]
This will work if you have a sufficient history length $HistoryLength (defaults to infinity). Note however that, in the above example, e was assigned 3+c, and 3 here was not undone. So, the problem is really ambiguous in formulation, because some numbers could make it into definitions. One way to avoid this is to use SetDelayed for assignments, rather than Set.
Another alternative would be to analyze the names in say Global' context (if that is the context where your symbols live), and then say OwnValues and DownValues of the symbols, in a fashion similar to the above, and remove definitions with purely numerical r.h.s.
But IMO neither of these approaches are robust. I'd still use scoping constructs and try to isolate numerics. One possibility is to wrap you final code in Block, and assign numerical values inside this Block. This seems a much cleaner approach. The work overhead is minimal - you just have to remember which symbols you want to assign the values to. Block will automatically ensure that outside it, the symbols will have no definitions.
EDIT
Yet another possibility is to use local rules. For example, one could define rule[a] = a->1; rule[d]=d->3 instead of the assignments above. You could then apply these rules, extracting them as say
DownValues[rule][[All, 2]], whenever you want to test with some numerical arguments.
Building on Andrew Moylan's solution, one can construct a Block like function that would takes rules:
SetAttributes[BlockRules, HoldRest]
BlockRules[rules_, expr_] :=
Block ## Append[Apply[Set, Hold#rules, {2}], Unevaluated[expr]]
You can then save your numeric rules in a variable, and use BlockRules[ savedrules, code ], or even define a function that would apply a fixed set of rules, kind of like so:
In[76]:= NumericCheck =
Function[body, BlockRules[{a -> 3, b -> 2`}, body], HoldAll];
In[78]:= a + b // NumericCheck
Out[78]= 5.
EDIT In response to Timo's comment, it might be possible to use NotebookEvaluate (new in 8) to achieve the requested effect.
SetAttributes[BlockRules, HoldRest]
BlockRules[rules_, expr_] :=
Block ## Append[Apply[Set, Hold#rules, {2}], Unevaluated[expr]]
nb = CreateDocument[{ExpressionCell[
Defer[Plot[Sin[a x], {x, 0, 2 Pi}]], "Input"],
ExpressionCell[Defer[Integrate[Sin[a x^2], {x, 0, 2 Pi}]],
"Input"]}];
BlockRules[{a -> 4}, NotebookEvaluate[nb, InsertResults -> "True"];]
As the result of this evaluation you get a notebook with your commands evaluated when a was locally set to 4. In order to take it further, you would have to take the notebook
with your code, open a new notebook, evaluate Notebooks[] to identify the notebook of interest and then do :
BlockRules[variablerules,
NotebookEvaluate[NotebookPut[NotebookGet[nbobj]],
InsertResults -> "True"]]
I hope you can make this idea work.