Rust Range.contains failed to be inlined/optimized

I was running my code through Clippy and it suggested changing the following:
const SPECIAL_VALUE: u8 = 0; // May change eventually.
pub fn version1(value: u8) -> bool {
(value >= 1 && value <= 9) || value == SPECIAL_VALUE
}
Into
pub fn version2(value: u8) -> bool {
(1..=9).contains(&value) || value == SPECIAL_VALUE
}
Since it is more readable. Unfortunately the resulting assembly output is twice as long, even with optimization level 3. Manually inlining it (two levels of nesting down) gives almost the same code as version1 and is as efficient.
pub fn manually_inlined(value: u8) -> bool {
(1 <= value && value <= 9) || value == SPECIAL_VALUE
}
If I remove the || value == SPECIAL_VALUE, they all compile to the same assembly (though with one more instruction added to decrement the parameter value before the compare). Also, if I change SPECIAL_VALUE to something not adjacent to the range, they all compile to the same assembly code as version2, which is why I kept it at 0 unless I eventually have to change it.
I have a link to Godbolt with the code here: https://rust.godbolt.org/z/bMYzfcYob
Why is the compiler failing to properly inline/optimize version2? Is it an "optimization bug"? Or am I misunderstanding some semantics of Rust, maybe something with the borrowing prevents the optimization, but can't the compiler assume no mutation of value due to the aliasing and referencing rules?
Trying the same in C++ yields the worse option in both cases (https://godbolt.org/z/zahfz65W3).
Edit: Changing the compiler for my C++ version to GCC makes it optimized in both cases.

This was indeed a missed optimization opportunity that has now been corrected in LLVM: https://github.com/rust-lang/rust/issues/90609#issuecomment-1046037263
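To make the expected result concrete, here is a minimal sketch of what the three forms should reduce to. It assumes SPECIAL_VALUE stays 0, so that it sits right next to the range. The helpers single_compare and range_only are hypothetical names introduced only for illustration; they are hand-written guesses at the shape of the optimized code, not the actual compiler output. The loop simply checks that every form agrees on all 256 inputs.

```rust
const SPECIAL_VALUE: u8 = 0; // Assumption: stays adjacent to the range 1..=9.

// The original comparison-based form from the question.
pub fn version1(value: u8) -> bool {
    (value >= 1 && value <= 9) || value == SPECIAL_VALUE
}

// The form Clippy suggests.
pub fn version2(value: u8) -> bool {
    (1..=9).contains(&value) || value == SPECIAL_VALUE
}

// Hypothetical helper: with SPECIAL_VALUE == 0 the whole test is
// equivalent to a single unsigned comparison.
pub fn single_compare(value: u8) -> bool {
    value <= 9
}

// Hypothetical helper: the range check alone, written as
// "subtract the lower bound with wraparound, then compare once".
pub fn range_only(value: u8) -> bool {
    value.wrapping_sub(1) < 9
}

fn main() {
    // Exhaustively compare all forms over the full u8 domain.
    for v in 0..=u8::MAX {
        assert_eq!(version1(v), version2(v));
        assert_eq!(version1(v), single_compare(v));
        assert_eq!((1..=9).contains(&v), range_only(v));
    }
    println!("all 256 inputs agree");
}
```

The range_only form matches the observation in the question that dropping the SPECIAL_VALUE test leaves one extra instruction to decrement the value before the compare.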

Related

Why in kotlin "something != null || return" does not perform smartcast, but "if (something == null) return" yes

Given a function. for example:
suspend fun getUser(userId: Int): User? {
val result: UserApiResult? = fetchTheApi(userId)
//result != null || return null // Not smartcast
if (result == null) return null // Smart-casts result from UserApiResult? to UserApiResult
return User(result.email, result.name)
}
In my IDE (Android Studio), the first condition won't perform a smart cast even though it visibly does the same thing as the second condition (unless it's doing some dark things under the hood).
There is no good technical reason for smart casting to not take effect.
But it is jankier than you are giving it credit for. The only reason result != null || return null compiles is because return null has type Nothing and you can coerce Nothing to anything (in this case: Boolean).
The compiler should be able to reason that result != null, as otherwise we would have obtained an instance of Nothing (which is impossible). But I'm personally glad I'll never have to see || return null in code review, and I imagine the reasons for this not working are not a mistake by the Kotlin devs.
Speculation on my part is that the compiler coerces the Nothing from return null to Boolean and loses the semantics of that branch being impossible to return from.
I think it's just a limitation of the current compiler. Building that code fails with the current compiler, but if you switch to the new K2 compiler (still in Alpha at the moment) compilation succeeds.
Example:
fun returnSomething(): String? = null
fun doSomething(): String? {
val result: String? = returnSomething()
result != null || return null
return result.length.toString()
}
fun main() {
println(doSomething())
}
Build output:
Kotlin: kotlinc-jvm 1.7.10 (JRE 1.8.0_212-b10)
Kotlin: ATTENTION!
This build uses experimental K2 compiler:
-Xuse-k2
Kotlin: performing incremental compilation analysis
Updating dependency information… [coroutines-test]
Running 'after' tasks
Finished, saving caches…
Executing post-compile tasks...
Synchronizing output directories...
01/11/2022, 18:01 - Build completed successfully with 4 warnings in 9 sec, 296 ms

Kotlin's logical 'and' doesn't short-circuit?

I was following along Kotlin's documentation at http://kotlinlang.org/docs/reference/null-safety.html#checking-for-null-in-conditions and tried adapting this example,
val b = "Kotlin"
if (b != null && b.length > 0) {
print("String of length ${b.length}")
} else {
print("Empty string")
}
to the case where b = null. In an IntelliJ Idea Kotlin project I have an app.kt with a main() function defined as:
fun main() {
val b = null
if (b != null && b.length > 0) {
print("String of length ${b.length}")
} else {
print("Empty string")
}
}
However, when I run this, I get two compilation errors:
Information:Kotlin: kotlinc-jvm 1.3.20 (JRE 11+28)
Information:2019-02-02 15:07 - Compilation completed with 2 errors and 0 warnings in 1 s 921 ms
/Users/kurtpeek/IdeaProjects/HelloWorld/src/app.kt
Error:(3, 24) Kotlin: Unresolved reference: length
Error:(4, 37) Kotlin: Unresolved reference: length
I understand that the compiler is evaluating b.length even though the first condition, b != null, is false. This surprises me because I thought that the first check was to 'short-circuit' the Boolean expression if needed and make the call to b.length 'safe'.
For example, in Python, you can do this:
In [1]: "foo" == "bar" and what.the.heck
Out[1]: False
which works even though what is not defined, because the and 'stops' since "foo" is not equal to "bar".
Is this indeed how Kotlin works? It seems like missing Python's 'short-circuiting' feature would be a limitation.
Kotlin's && operator will short circuit (just like Java's) but only at runtime. What you are experiencing is a compile time error. The big difference to remember especially when comparing Kotlin (or Java) to Python is that Kotlin and Java are statically typed and have a compilation phase. So you'll get a compilation error if the types don't match up.
Let's go through these one at a time...
val b = "Kotlin"
if (b != null && b.length > 0) {
...
}
In this case, Kotlin will correctly infer that b is the type String, because you clearly set it to a String ("Kotlin"). We should note here that the String type cannot ever contain null. Knowing that, the b != null part of your if statement is unnecessary. However, after evaluating that (to true, always) it will evaluate b.length because b is a String and therefore has a length property. This example should compile fine (I didn't test it).
And next...
val b = null
if (b != null && b.length > 0) {
...
}
This code will not compile, let's go over why...
This code looks really similar but has one huge difference. In this case, because you just set b to null, Kotlin is going to infer that b is a Nothing?. It has no information as to what type you want b to be, and you've set it to null (and because it's a val, it will always be null). Because b is null, it makes b nullable.
So, given that, the check b != null will always be false, because b can't ever be something that isn't null. But wait! We're compiling now... and when we run into b.length, Kotlin will report a compilation error because Nothing? does not have a length property!
Essentially, by setting b to null and not providing a type hint, Kotlin takes the only path it can to infer the type - Nothing?.
From your linked text: "Note that this only works where b is immutable (i.e. a local variable which is not modified between the check and the usage or a member val which has a backing field and is not overridable)".
val b=null is immutable, but since the type of null cannot be inferred nor stored, it cannot be used as the source in a valid shortcut.
If you changed the code to give it a nullable type and set it to null, this would work.

Binding of private attributes: nqp::bindattr vs :=

I'm trying to find how the binding operation works on attributes and what makes it so different from nqp::bindattr. Consider the following example:
class Foo {
    has @!foo;
    submethod TWEAK {
        my $fval = [<a b c>];
        use nqp;
        nqp::bindattr( nqp::decont(self), $?CLASS, '@!foo',
        #@!foo :=
            Proxy.new(
                FETCH => -> $ { $fval },
                STORE => -> $, $v { $fval = $v }
            )
        );
    }
    method check {
        say @!foo.perl;
    }
}
my $inst = Foo.new;
$inst.check;
It prints:
$["a", "b", "c"]
Replacing nqp::bindattr with the binding operator from the comment gives correct output:
["a", "b", "c"]
Similarly, if foo is a public attribute and the accessor is used, the output is correct too, because decontainerization takes place within the accessor.
I use similar code in my AttrX::Mooish module, where use of := would overcomplicate the implementation. So far, nqp::bindattr did a good job for me, until the above problem arose.
I tried tracing through Rakudo's internals looking for the := implementation, but without any success so far. I would ask here either for advice on how to simulate the operator or for where in the source to look for its implementation.
Before I dig into the answer: most things in this post are implementation-defined, and the implementation is free to define them differently in the future.
To find out what something (naively) compiles into under Rakudo Perl 6, use the --target=ast option (perl6 --target=ast foo.p6). For example, the bind in:
class C {
has $!a;
submethod BUILD() {
my $x = [1,2,3];
$!a := $x
}
}
Comes out as:
- QAST::Op(bind) :statement_id<7>
  - QAST::Var(attribute $!a) <wanted> $!a
    - QAST::Var(lexical self)
    - QAST::WVal(C)
  - QAST::Var(lexical $x) $x
While switching it for @!a like here:
class C {
    has @!a;
    submethod BUILD() {
        my $x = [1,2,3];
        @!a := $x
    }
}
Comes out as:
- QAST::Op(bind) :statement_id<7>
  - QAST::Var(attribute @!a) <wanted> @!a
    - QAST::Var(lexical self)
    - QAST::WVal(C)
  - QAST::Op(p6bindassert)
    - QAST::Op(decont)
      - QAST::Var(lexical $x) $x
    - QAST::WVal(Positional)
The decont instruction is the big difference here: it takes the contents of the Proxy by calling its FETCH, which is why the containerization is gone. You can therefore replicate the behavior by inserting nqp::decont around the Proxy, although that rather begs the question of what the Proxy is doing there if the correct answer is obtained without it!
Both := and = are compiled using case analysis (namely, by looking at what is on the left hand side). := only works for a limited range of simple expressions on the left; it is a decidedly low-level operator. By contrast, = falls back to a sub call if the case analysis doesn't come up with a more efficient form to emit, though in most cases it manages something better.
The case analysis for := inserts a decont when the target is a lexical or attribute with sigil @ or %, since - at a Perl 6 level - having an item bound to an @ or % makes no sense. Using nqp::bindattr is going a level below Perl 6 semantics, and so it's possible to end up with the Proxy bound directly there using that. However, it also violates expectations elsewhere. Don't expect that to go well (but it seems you don't want to do that anyway.)

String comparison in the core language

Taking this simple comparison, loopValue == "Firstname", is the following statement true?
If the internal operation inspecting the first char finds that it does not match the compared string, it aborts early.
So, in the rawer form, loopValue and "Firstname" are both []byte, and it would walk the arrays roughly like this, with a callback testing each pair:
someInspectionFunc(loopValue, "Firstname", func(charA, charB) {
return charA == charB
})
... making it keep on going until it bumps false and checks if the number of iterations was equal to both their lengths. Also does it check length first?
if len(loopValue) != len("Firstname") {
return false
}
I can't really find an explanation in the go source-code on GitHub as it's a bit above me.
The reason I'm asking this is because I'm doing big data processing and am benchmarking and doing cpu, memory and allocation pprof to squeeze some more juice out of the process. From that process it kind of made me think how Go (but also just C in general) would do this under the hood. Is this fully on an assembly level or does the comparison already happen in native Go code (kind of like sketched in the snippets above)?
Please let me know if I'm being too vague or if I missed something. Thank you
Update
When I added a first-character match on big strings of JSON before doing the full comparison, I got about a 3.7% benchmark gain on 100k heavy entries:
<some irrelevant inspection code>.. v[0] == firstChar && v == lookFor {
// Match found when it reaches here
}
The code above (especially on long strings) is faster than just using v == lookFor.
The function is handled in assembly. The amd64 version is:
TEXT runtime·eqstring(SB),NOSPLIT,$0-33
MOVQ s1str+0(FP), SI
MOVQ s2str+16(FP), DI
CMPQ SI, DI
JEQ eq
MOVQ s1len+8(FP), BX
LEAQ v+32(FP), AX
JMP runtime·memeqbody(SB)
eq:
MOVB $1, v+32(FP)
RET
And it's the compiler's job to ensure that the strings are of equal length before that is called. (The runtime·memeqbody function is actually where the optimized memory comparisons happen, but there's probably no need to post the full text here)
The equivalent Go code would be:
func eqstring_generic(s1, s2 string) bool {
if len(s1) != len(s2) {
return false
}
for i := 0; i < len(s1); i++ {
if s1[i] != s2[i] {
return false
}
}
return true
}

Gimpel's PC Lint Value Tracking

I'm a newbie to this site, so if I mess up any question-asking etiquette here I apologize in advance... Thanks!
This is extremely simplified example code, but I think it shows what I'm talking about: I have a C++ method that makes a call into another method to test a value...
char m_array[MAX]; // class member, MAX is a #define
void foo(unsigned int n)
{
    if (validNumber(n)) // test n
    {
        // do stuff
        m_array[n-1] = 0;
    }
}
where: bool validNumber(unsigned int val) { return ((val > 0) && (val <= MAX)); }
The irritation I'm having is that PC Lint's Value Tracking seems to ignore the validNumber() call and gives Warning 661: possible access of out-of-bounds pointer (1 beyond end of data) by operator '['.
However if I do it like this, Lint is happy:
if ((n > 0) && (n <= MAX)) //test n
...
So, does Lint's Value Tracking just not work if the test is a method call?
Thanks again,
HF
I'd guess that validNumber is defined after foo, but in any case, PC Lint normally makes one pass over the code, and in such cases it doesn't see validNumber as a check for the boundaries for n.
You could try the option -passes(2) or even 3, and see what Lint makes out of it. I think (but didn't try) that Lint would then correctly note that the value for n is within the correct bounds.