I received a lint message while writing Kotlin using IntelliJ as my IDE: The argument can be converted to 'Set' to improve performance
My code went something like this:
val variable3 = variable1 - variable2 variables are of type List<Int>
The linter recommended me to change it to val variable3 = variable1 - variable2.toSet()
I would like to find out why it recommends this change and where the documentation is so next time I can look up the messages and learn the reasoning behind the lint check.
This is about performance and efficiency — in particular, about how the performance scales as your data gets bigger.
The - operator calls the standard library's Iterable<T>.minus(elements: Iterable<T>) extension function. If you look at its code (which you can do in IntelliJ), you'll see that that works by taking the first iterable (variable1 in this case) and filtering it to keep only the values not in the second one (variable2).
How does it check whether an element is in the second one? By calling its contains() method. But how that works will depend on the type of iterable. Most Sets, for example, can look up by hash code, which takes a short time, regardless of how big the set is.
However, most Lists and other iterables can't do that: they need to search through the whole list, element by element. How long that takes will obviously depend on the size of the list — it'll be very quick for short lists, but it might take some time to search through a list with thousands or millions of elements.
What makes it particularly important in this case is that it has to do that search repeatedly: once for each element of the first one. So the time can really mount up.
Say that the first iterable has 𝐌 elements, and the second has 𝐍. The subtraction has to make 𝐌 checks; if the second one is a set, then each check takes about the same time, so the overall time is proportional to 𝐌. But if not, then each check will take time proportional to 𝐍, so the overall time is 𝐌×𝐍 — which can get very big very fast! (For example, if you make each list 10× bigger, it'll take 100× as long.)
So if you don't want your program to grind to a halt when it starts handling more data, it can be well worth converting the second list to a set first. For small amounts of data, it adds a little extra work, but that probably won't be noticeable; and for large amounts of data, it can be a really big win.
That's why the IntelliJ inspection suggests it.
(For those of you who know about algorithmic complexity, please forgive the simplifications I've made here :)
Interestingly, when I tried it myself (in IntelliJ 2021.2.3 with Kotlin v1.5.73), it didn't make that suggestion. And looking into that implementation of the standard library, I see that in some cases the minus() method will do the conversion for you! However, I think it doesn't cover some other common cases, so it's still worth doing the conversion yourself if you think the lists could get big.
You can hover over and open the documentation like this:
The general quick fix options can be found and turn on/off under
settings -> inspections -> kotlin (example)
Related
to is an infix function within the standard library. It can be used to create Pairs concisely:
0 to "hero"
in comparison with:
Pair(0, "hero")
Typically, it is used to initialize Maps concisely:
mapOf(0 to "hero", 1 to "one", 2 to "two")
However, there are other situations in which one needs to create a Pair. For instance:
"to be or not" to "be"
(0..10).map { it to it * it }
Is it acceptable, stylistically, to (ab)use to in this manner?
Just because some language features are provided does not mean they are better over certain things. A Pair can be used instead of to and vice versa. What becomes a real issue is that, does your code still remain simple, would it require some reader to read the previous story to understand the current one? In your last map example, it does not give a hint of what it's doing. Imagine someone reading { it to it * it}, they would be most likely confused. I would say this is an abuse.
to infix offer a nice syntactical sugar, IMHO it should be used in conjunction with a nicely named variable that tells the reader what this something to something is. For example:
val heroPair = Ironman to Spiderman //including a 'pair' in the variable name tells the story what 'to' is doing.
Or you could use scoping functions
(Ironman to Spiderman).let { heroPair -> }
I don't think there's an authoritative answer to this. The only examples in the Kotlin docs are for creating simple constant maps with mapOf(), but there's no hint that to shouldn't be used elsewhere.
So it'll come down to a matter of personal taste…
For me, I'd be happy to use it anywhere it represents a mapping of some kind, so in a map{…} expression would seem clear to me, just as much as in a mapOf(…) list. Though (as mentioned elsewhere) it's not often used in complex expressions, so I might use parentheses to keep the precedence clear, and/or simplify the expression so they're not needed.
Where it doesn't indicate a mapping, I'd be much more hesitant to use it. For example, if you have a method that returns two values, it'd probably be clearer to use an explicit Pair. (Though in that case, it'd be clearer still to define a simple data class for the return value.)
You asked for personal perspective so here is mine.
I found this syntax is a huge win for simple code, especial in reading code. Reading code with parenthesis, a lot of them, caused mental stress, imagine you have to review/read thousand lines of code a day ;(
The large majority of SonarLint rules that I've come across in Java seemed plausible and justified. However, ever since I've started using SonarLint for VB.NET, I've come across several rules that left me questioning their usefulness or even whether or not they are working correctly.
I'd like to know if this is simply a problem of me using some VB.NET constructs in a suboptimal way or whether the rule really is flawed.
(Apologies if this question is a little longer. I didn't know if I should create a separate question for each individual rule.)
The following rules I found to leave some cases unconsidered that would actually turn up as false-positives:
S1871: Two branches in the same conditional structure should not have exactly the same implementation
I found this one to bring up a lot of false-positives for me, because sometimes the order in which the conditions are checked actually does matter. Take the following pseudo code as example:
If conditionA() Then
doSomething()
ElseIf conditionB() AndAlso conditionC() Then
doSomethingElse()
ElseIf conditionD() OrElse conditionE() Then
doYetAnotherThing()
'... feel free to have even more cases in between here
Else Then
doSomething() 'Non-compliant
End If
If I wanted to follow this Sonar rule and still make the code behave the same way, I'd have to add the negated version of each ElseIf-condition to the first If-condition.
Another example would be the following switch:
Select Case i
Case 0 To 40
value = 0
Case 41 To 60
value = 1
Case 61 To 80
value = 3
Case 81 To 100
value = 5
Case Else
value = 0 'Non-compliant
There shouldn't be anything wrong with having that last case in a switch. True, I could have initialized value beforehand to 0 and ignored that last case, but then I'd have one more assignment operation than necessary. And the Java ruleset has conditioned me to always put a default case in every switch.
S1764: Identical expressions should not be used on both sides of a binary operator
This rule does not seem to take into account that some functions may return different values every time you call them, for instance collections where accessing an element removes it from the collection:
stack.Push(stack.Pop() / stack.Pop()) 'Non-compliant
I understand if this is too much of an edge case to make special exceptions for it, though.
The following rules I am not actually sure about:
S3385: "Exit" statements should not be used
While I agree that Return is more readable than Exit Sub, is it really bad to use a single Exit For to break out of a For or a For Each loop? The SonarLint rule for Java permits the use of a single break; in a loop before flagging it as an issue. Is there a reason why the default in VB.NET is more strict in that regard? Or is the rule built on the assumption that you can solve nearly all your loop problems with LINQ extension methods and lambdas?
S2374: Signed types should be preferred to unsigned ones
This rule basically states that unsigned types should not be used at all because they "have different arithmetic operators than signed ones - operators that few developers understand". In my code I am only using UInteger for ID values (because I don't need negative values and a Long would be a waste of memory in my case). They are stored in List(Of UInteger) and only ever compared to other UIntegers. Is this rule even relevant to my case (are comparisons part of these "arithmetic operators" mentioned by the rule) and what exactly would be the pitfall? And if not, wouldn't it be better to make that rule apply to arithmetic operations involving unsigned types, rather than their declaration?
S2355: Array literals should be used instead of array creation expressions
Maybe I don't know VB.NET well enough, but how exactly would I satisfy this rule in the following case where I want to create a fixed-size array where the initialization length is only known at runtime? Is this a false-positive?
Dim myObjects As Object() = New Object(someOtherList.Count - 3) {} 'Non-compliant
Sure, I could probably just use a List(Of Object). But I am curious anyway.
Thanks for raising these points. Note that not all rules apply every time. There are cases when we need to balance between false positives/false negatives/real cases. For example with identical expressions on both sides of an operator rule. Is it a bug to have the same operands? No it's not. If it was, then the compiler would report it. Is it a bad smell, is it usually a mistake? Yes in many cases. See this for example in Roslyn. Should we tune this rule to exclude some cases? Yes we should, there's nothing wrong with 2 << 2. So there's a lot of balancing that needs to happen, and we try to settle for an implementation that brings the most value for the users.
For the points you raised:
Two branches in the same conditional structure should not have exactly the same implementation
This rule generally states that having two blocks of code match exactly is a bad sign. Copy-pasted code should be avoided for many reasons, for example if you need to fix the code in one place, you'll need to fix it in the other too. You're right that adding negated conditions would be a mess, but if you extract each condition into its own method (and call the negated methods inside them) with proper names, then it would probably improves the readability of your code.
For the Select Case, again, copy pasted code is always a bad sign. In this case you could do this:
Select Case i
...
Case 0 To 40
Case Else
value = 0 ' Compliant
End Select
Or simply remove the 0-40 case.
Identical expressions should not be used on both sides of a binary operator
I think this is a corner case. See the first paragraph of the answer.
"Exit" statements should not be used
It's almost always true that by choosing another type of loop, or changing the stop condition, you can get away without using any "Exit" statements. It's good practice to have a single exit point from loops.
Signed types should be preferred to unsigned ones
This is a legacy rule from SonarQube VB.NET, and I agree with you that it shouldn't be enabled by default in SonarLint. I created the following ticket in our JIRA: https://jira.sonarsource.com/browse/SLVS-1074
Array literals should be used instead of array creation expressions
Yes, it seems to be a false positive, we shouldn't report on array creations when the size is explicitly specified. https://jira.sonarsource.com/browse/SLVS-1075
Okay, what I really wanted to do is, I have an Array and I want to choose a random element from it. The obvious thing to do is get an integer from a random number generator between 0 and the length minus 1, which I have working already, and then applying Array.get, but that returns a Maybe a. (It appears there's also a package function that does the same thing.) Coming from Haskell, I get the type significance that it's protecting me from the case where my index was out of range, but I have control over the index and don't expect that to happen, so I'd just like to assume I got a Just something and somewhat forcibly convert to a. In Haskell this would be fromJust or, if I was feeling verbose, fromMaybe (error "some message"). How should I do this in Elm?
I found a discussion on the mailing list that seems to be discussing this, but it's been a while and I don't see the function I want in the standard library where the discussion suggests it would be.
Here are some pretty unsatisfying potential solutions I found so far:
Just use withDefault. I do have a default value of a available, but I don't like this as it gives the completely wrong meaning to my code and will probably make debugging harder down the road.
Do some fiddling with ports to interface with Javascript and get an exception thrown there if it's Nothing. I haven't carefully investigated how this works yet, but apparently it's possible. But this just seems to mix up too many dependencies for what would otherwise be simple pure Elm.
(answering my own question)
I found two more-satisfying solutions:
Roll my own partially defined function, which was referenced elsewhere in the linked discussion. But the code kind of feels incomplete this way (I'd hope the compiler would warn me about incomplete pattern matches some day) and the error message is still unclear.
Pattern-match and use Debug.crash if it's a Nothing. This appears similar to Haskell's error and is the solution I'm leaning towards right now.
import Debug
fromJust : Maybe a -> a
fromJust x = case x of
Just y -> y
Nothing -> Debug.crash "error: fromJust Nothing"
(Still, the module name and description also make me hesitate because it doesn't seem like the "right" method intended for my purposes; I want to indicate true programmer error instead of mere debugging.)
Solution
The existence or use of a fromJust or equivalent function is actually code smell and tells you that the API has not been designed correctly. The problem is that you're attempting to make a decision on what to do before you have the information to do it. You can think of this in two cases:
If you know what you're supposed to do with Nothing, then the solution is simple: use withDefault. This will become obvious when you're looking at the right point in your code.
If you don't know what you're supposed to do in the case where you have Nothing, but you still want to make a change, then you need a different way of doing so. Instead of pulling the value out of the Maybe use Maybe.map to change the value while keeping the Maybe. As an example, let's say you're doing the following:
foo : Maybe Int -> Int
foo maybeVal =
let
innerVal = fromJust maybeVal
in
innerVal + 2
Instead, you'll want this:
foo : Maybe Int -> Maybe Int
foo maybeVal =
Maybe.map (\innerVal -> innerVal + 2) maybeVal
Notice that the change you wanted is still done in this case, you've simply not handled the case where you have a Nothing. You can now pass this value up and down the call chain until you've hit a place where it's natural to use withDefault to get rid of the Maybe.
What's happened is that we've separated the concerns of "How do I change this value" and "What do I do when it doesn't exist?". We deal with the former using Maybe.map and the latter with Maybe.withDefault.
Caveat
There are a small number of cases where you simply know that you have a Just value and need to eliminate it using fromJust as you described, but those cases should be few and far between. There's quite a few that actually have a simpler alternative.
Example: Attempting to filter a list and get the value out.
Let's say you have a list of Maybes that you want the values of. A common strategy might be:
foo : List (Maybe a) -> List a
foo hasAnything =
let
onlyHasJustValues = List.filter Maybe.isJust hasAnything
onlyHasRealValues = List.map fromJust onlyHasJustValues
in
onlyHasRealValues
Turns out that even in this case, there are clean ways to avoid fromJust. Most languages with a collection that has a map and a filter have a method to filter using a Maybe built in. Haskell has Maybe.mapMaybe, Scala has flatMap, and Elm has List.filterMap. This transforms your code into:
foo : List (Maybe a) -> List a
foo hasAnything =
let
onlyHasRealValues = List.filterMap (\x -> x) hasAnything
in
onlyHasRealValues
First, i admit all the things i will ask are about our homework but i assure you i am not asking without struggling at least two hours.
Description: We are supposed to add a field called max_cpu_percent to task_struct data type and manipulate process scheduling algorithm so that processes can not use an higher percentage of the cpu.
for example if i set max_cpu_percent field as 20 for the process firefox, firefox will not be able to use more than 20% of the cpu.
We wrote a system call to set max_cpu_percent field. Now we need to see if the system call works or not but we could not get the value of the max_cpu_percent field from a user-spaced program.
Can we do this? and how?
We tried proc/pid/ etc can we get the value using this util?
By the way, We may add additional questions here if we could not get rid of something else
Thanks All
Solution:
The reason was we did not modify the code block writing the output to the proc queries.
There are some methods in array.c file (fs/proc/array.c) we modified the function so that also print the newly added fields value. kernel is now compiling we'll see the result after about an hour =)
It Worked...
(If you simply extended getrlimit/setrlimit, then you'd be done by now…)
There's already a mechanism where similar parts of task_struct are exposed: /proc/$PID/stat (and /proc/$PID/$TID/stat). Look for functions proc_tgid_stat and proc_tid_stat. You can add new fields to the ends of these files.
In algebra if I make the statement x + y = 3, the variables I used will hold the values either 2 and 1 or 1 and 2. I know that assignment in programming is not the same thing, but I got to wondering. If I wanted to represent the value of, say, a quantumly weird particle, I would want my variable to have two values at the same time and to have it resolve into one or the other later. Or maybe I'm just dreaming?
Is it possible to say something like i = 3 or 2;?
This is one of the features planned for Perl 6 (junctions), with syntax that should look like my $a = 1|2|3;
If ever implemented, it would work intuitively, like $a==1 being true at the same time as $a==2. Also, for example, $a+1 would give you a value of 2|3|4.
This feature is actually available in Perl5 as well through Perl6::Junction and Quantum::Superpositions modules, but without the syntax sugar (through 'functions' all and any).
At least for comparison (b < any(1,2,3)) it was also available in Microsoft Cω experimental language, however it was not documented anywhere (I just tried it when I was looking at Cω and it just worked).
You can't do this with native types, but there's nothing stopping you from creating a variable object (presuming you are using an OO language) which has a range of values or even a probability density function rather than an actual value.
You will also need to define all the mathematical operators between your variables and your variables and native scalars. Same goes for the equality and assignment operators.
numpy arrays do something similar for vectors and matrices.
That's also the kind of thing you can do in Prolog. You define rules that constraint your variables and then let Prolog resolve them ...
It takes some time to get used to it, but it is wonderful for certain problems once you know how to use it ...
Damien Conways Quantum::Superpositions might do what you want,
https://metacpan.org/pod/Quantum::Superpositions
You might need your crack-pipe however.
What you're asking seems to be how to implement a Fuzzy Logic system. These have been around for some time and you can undoubtedly pick up a library for the common programming languages quite easily.
You could use a struct and handle the operations manualy. Otherwise, no a variable only has 1 value at a time.
A variable is nothing more than an address into memory. That means a variable describes exactly one place in memory (length depending on the type). So as long as we have no "quantum memory" (and we dont have it, and it doesnt look like we will have it in near future), the answer is a NO.
If you want to program and to modell this behaviour, your way would be to use a an array (with length equal to the number of max. multiple values). With this comes the increased runtime, hence the computations must be done on each of the values (e.g. x+y, must compute with 2 different values x1+y1, x2+y2, x1+y2 and x2+y1).
In Perl , you can .
If you use Scalar::Util , you can have a var take 2 values . One if it's used in string context , and another if it's used in a numerical context .