SonarLint - questions about some of the rules for VB.NET - vb.net

The large majority of SonarLint rules that I've come across in Java seemed plausible and justified. However, ever since I've started using SonarLint for VB.NET, I've come across several rules that left me questioning their usefulness or even whether or not they are working correctly.
I'd like to know if this is simply a problem of me using some VB.NET constructs in a suboptimal way or whether the rule really is flawed.
(Apologies if this question is a little longer. I didn't know if I should create a separate question for each individual rule.)
The following rules I found to leave some cases unconsidered that would actually turn up as false-positives:
S1871: Two branches in the same conditional structure should not have exactly the same implementation
I found this one to bring up a lot of false-positives for me, because sometimes the order in which the conditions are checked actually does matter. Take the following pseudo code as example:
If conditionA() Then
doSomething()
ElseIf conditionB() AndAlso conditionC() Then
doSomethingElse()
ElseIf conditionD() OrElse conditionE() Then
doYetAnotherThing()
'... feel free to have even more cases in between here
Else Then
doSomething() 'Non-compliant
End If
If I wanted to follow this Sonar rule and still make the code behave the same way, I'd have to add the negated version of each ElseIf-condition to the first If-condition.
Another example would be the following switch:
Select Case i
Case 0 To 40
value = 0
Case 41 To 60
value = 1
Case 61 To 80
value = 3
Case 81 To 100
value = 5
Case Else
value = 0 'Non-compliant
There shouldn't be anything wrong with having that last case in a switch. True, I could have initialized value beforehand to 0 and ignored that last case, but then I'd have one more assignment operation than necessary. And the Java ruleset has conditioned me to always put a default case in every switch.
S1764: Identical expressions should not be used on both sides of a binary operator
This rule does not seem to take into account that some functions may return different values every time you call them, for instance collections where accessing an element removes it from the collection:
stack.Push(stack.Pop() / stack.Pop()) 'Non-compliant
I understand if this is too much of an edge case to make special exceptions for it, though.
The following rules I am not actually sure about:
S3385: "Exit" statements should not be used
While I agree that Return is more readable than Exit Sub, is it really bad to use a single Exit For to break out of a For or a For Each loop? The SonarLint rule for Java permits the use of a single break; in a loop before flagging it as an issue. Is there a reason why the default in VB.NET is more strict in that regard? Or is the rule built on the assumption that you can solve nearly all your loop problems with LINQ extension methods and lambdas?
S2374: Signed types should be preferred to unsigned ones
This rule basically states that unsigned types should not be used at all because they "have different arithmetic operators than signed ones - operators that few developers understand". In my code I am only using UInteger for ID values (because I don't need negative values and a Long would be a waste of memory in my case). They are stored in List(Of UInteger) and only ever compared to other UIntegers. Is this rule even relevant to my case (are comparisons part of these "arithmetic operators" mentioned by the rule) and what exactly would be the pitfall? And if not, wouldn't it be better to make that rule apply to arithmetic operations involving unsigned types, rather than their declaration?
S2355: Array literals should be used instead of array creation expressions
Maybe I don't know VB.NET well enough, but how exactly would I satisfy this rule in the following case where I want to create a fixed-size array where the initialization length is only known at runtime? Is this a false-positive?
Dim myObjects As Object() = New Object(someOtherList.Count - 3) {} 'Non-compliant
Sure, I could probably just use a List(Of Object). But I am curious anyway.

Thanks for raising these points. Note that not all rules apply every time. There are cases when we need to balance between false positives/false negatives/real cases. For example with identical expressions on both sides of an operator rule. Is it a bug to have the same operands? No it's not. If it was, then the compiler would report it. Is it a bad smell, is it usually a mistake? Yes in many cases. See this for example in Roslyn. Should we tune this rule to exclude some cases? Yes we should, there's nothing wrong with 2 << 2. So there's a lot of balancing that needs to happen, and we try to settle for an implementation that brings the most value for the users.
For the points you raised:
Two branches in the same conditional structure should not have exactly the same implementation
This rule generally states that having two blocks of code match exactly is a bad sign. Copy-pasted code should be avoided for many reasons, for example if you need to fix the code in one place, you'll need to fix it in the other too. You're right that adding negated conditions would be a mess, but if you extract each condition into its own method (and call the negated methods inside them) with proper names, then it would probably improves the readability of your code.
For the Select Case, again, copy pasted code is always a bad sign. In this case you could do this:
Select Case i
...
Case 0 To 40
Case Else
value = 0 ' Compliant
End Select
Or simply remove the 0-40 case.
Identical expressions should not be used on both sides of a binary operator
I think this is a corner case. See the first paragraph of the answer.
"Exit" statements should not be used
It's almost always true that by choosing another type of loop, or changing the stop condition, you can get away without using any "Exit" statements. It's good practice to have a single exit point from loops.
Signed types should be preferred to unsigned ones
This is a legacy rule from SonarQube VB.NET, and I agree with you that it shouldn't be enabled by default in SonarLint. I created the following ticket in our JIRA: https://jira.sonarsource.com/browse/SLVS-1074
Array literals should be used instead of array creation expressions
Yes, it seems to be a false positive, we shouldn't report on array creations when the size is explicitly specified. https://jira.sonarsource.com/browse/SLVS-1075

Related

Where is contains( Junction) defined?

This code works:
(3,6...66).contains( 9|21 ).say # OUTPUT: «any(True, True)␤»
And returns a Junction. It's also tested, but not documented.
The problem is I can't find its implementation anywhere. The Str code, which is also called from Cool, never returns a Junction (it does not take a Junction, either). There are no other methods contain in source.
Since it's autothreaded, it's probably specially defined somewhere. I have no idea where, though. Any help?
TL;DR Junction autothreading is handled by a single central mechanism. I have a go at explaining it below.
(The body of your question starts with you falling into a trap, one I think you documented a year or two back. It seems pretty irrelevant to what you're really asking but I cover that too.)
How junctions get handled
Where is contains( Junction) defined? ... The problem is I can't find [the Junctional] implementation anywhere. ... Since it's autothreaded, it's probably specially defined somewhere.
Yes. There's a generic mechanism that automatically applies autothreading to all P6 routines (methods, operators etc.) that don't have signatures that explicitly control what happens with Junction arguments.
Only a tiny handful of built in routines have these explicit Junction handling signatures -- print is perhaps the most notable. The same is true of user defined routines.
.contains does not have any special handling. So it is handled automatically by the generic mechanism.
Perhaps the section The magic of Junctions of my answer to an earlier SO Filtering elements matching two regexes will be helpful as a high level description of the low level details that follow below. Just substitute your 9|21 for the foo & bar in that SO, and your .contains for the grep, and it hopefully makes sense.
Spelunking the code
I'll focus on methods. Other routines are handled in a similar fashion.
method AUTOTHREAD does the work for full P6 methods.
This is setup in this code that sets up handling for both nqp and full P6 code.
The above linked P6 setup code in turn calls setup_junction_fallback.
When a method call occurs in a user's program, it involves calling find_method (modulo cache hits as explained in the comment above that code; note that the use of the word "fallback" in that comment is about a cache miss -- which is technically unrelated to the other fallback mechanisms evident in this code we're spelunking thru).
The bit of code near the end of this find_method handles (non-cache-miss) fallbacks.
Which arrives at find_method_fallback which starts off with the actual junction handling stuff.
A trap
This code works:
(3,6...66).contains( 9|21 ).say # OUTPUT: «any(True, True)␤»
It "works" to the degree this does too:
(3,6...66).contains( 2 | '9 1' ).say # OUTPUT: «any(True, True)␤»
See Lists become strings, so beware .contains() and/or discussion of the underlying issues such as pmichaud's comment.
Routines like print, put, infix ~, and .contains are string routines. That means they coerce their arguments to Str. By default the .Str coercion of a listy value is its elements separated by spaces:
put 3,6...18; # 3 6 9 12 15 18
put (3,6...18).contains: '9 1'; # True
It's also tested
Presumably you mean the two tests with a *.contains argument passed to classify:
my $m := #l.classify: *.contains: any 'a'..'f';
my $s := classify *.contains( any 'a'..'f'), #l;
Routines like classify are list routines. While some list routines do a single operation on their list argument/invocant, eg push, most of them, including classify, iterate over their list doing something with/to each element within the list.
Given a sequence invocant/argument, classify will iterate it and pass each element to the test, in this case a *.contains.
The latter will then coerce individual elements to Str. This is a fundamental difference compared to your example which coerces a sequence to Str in one go.

Use String for IF statement conditions

I'm hoping someone can help answer my question, perhaps with an idea of where to go or whether what I'm trying to do is not possible with the way I want to do it.
I've been asked to write a set of rules based on the data held by our ERP form components or variables.
Unfortunately, these components and variables cannot be accessed or used outside of the ERP, so I can't use SQL to query the values and then build some kind of SQL query.
They'd like the ability to put statements like these:
C(MyComponentName) = C(MyOtherComponentName)
V(MyVariableName) > 16
(C(MyComponentName) = "") AND V(MyVariableName) <> "")
((C(MyComponentName) = "") OR C(MyOtherComponentName) = "") AND V(MyVariableName) <> "")
This should be turned into some kind of query which gets the value of MyComponentName and MyOtherComponentName and (in this case) compares them for equality.
They don't necessarily want to just compare for equality, but to be able to determine whether a component / variable value is greaterthan or lessthan etc.
Basically it's a free-form statement that gets converted into something similar to an IF statement.
I've tried this:
Sub TestCondition()
Dim Condition as string = String.Format("{0} = {1}", _
Component("MyComponent").Value, Component("MyOtherComponent").Value)
If (Condition) Then
' Do Something
Else
' Do Something Else
End If
End Sub
Obviously, this does not work and I honestly didn't think it would be so simple.
Ignoring the fact that I'd have to parse the line, extract the required operators, the values from components or variables (denoted by a C or V) - how can I do this?
I've looked at Expression Trees but these were confusing, especially as I'd never heard of them, let alone used them. (Is it possible to create an expression tree for dynamic if statements? - This link provided some detail on expression trees in C#)
I know an easier way to solve this might be to simply populate the form with a multitude of drop-down lists, so users pick what they want from lists or fill in a text box for a specific search criteria.
This wouldn't be a simple matter as the ERP doesn't allow you to dynamically create controls on its forms. You have to drag each component manually and would be next to useless as we'd potentially want at least 1 rule for every form we have (100+).
I'm either looking for someone to say you cannot do this the way you want to do it (with a suitable reason or suggestion as to how I could do it) that I can take to my manager or some hints, perhaps a link or 2 pointing me in the right direction.
If (Condition) Then
This is not possible. There is no way to treat data stored in a string as code. While the above statement is valid, it won't and can't function the way you want it to. Instead, Condition will be evaluated as what it is: a string. (Anything that doesn't boil down to 0 is treated as True; see this question.)
What you are attempting borders on allowing the user to type code dynamically to get a result. I won't say this is impossible per se in VB.Net, but it is incredibly ambitious.
Instead, I would suggest clearly defining what your application can and can't do. Enumerate the operators your code will allow and build code to support each directly. For example:
Public Function TestCondition(value1 As Object, value2 As Object, op as string) As Boolean
Select Case op
Case "="
Return value1 = value2
Case "<"
Return value1 < value2
Case ">"
Return value1 > value2
Case Else
'Error handling
End Select
End Function
Obviously you would need to tailor the above to the types of variables you will be handling and your other specific needs, but this approach should give you a workable solution.
For my particular requirements, using the NCalc library has enabled me to do most of what I was looking to do. Easy to work with and the documentation is quite extensive - lots of examples too.

How do you control a range for type safety?

Imagine you have a function that converts ints to roman string:
public String roman(int)
Only numbers from 1 to 3999 (inclusive) are valid for conversion.
So what do you do if someone passes 4000 in any OO language?
raise an exception
return “” or some other special string
write an assert
…
Number 1: raise an exception. That's what ArgumentOutOfRangeException is for (at least in .NET):
if (intToConvert >= 4000)
{
throw new ArgumentOutOfRangeException("intToConvert ", "Only numbers 1-3000 are valid for conversion.");
}
I find the validation topic very interesting in general. In my opinion option 2 (returning a special value) is not a good one, since you are forcing the client to do if/case to check for the returned value and that code must be repeated everywhere. Also, unlike exceptions that propagate through the calling stack, in this scenario the caller is almost always the one that has to handle that special value.
In the context of OOP raising an exception or having an assertion is, IMO, a more elegant way to cope with it. However i find that inlining verification code in every method doesn't scale well for some reasons:
Many times your validation logic ends up being greater than the method logic itself, so you end up cluttering your code with things that are not entirely relevant to it.
There is no proper validation code reuse (e.g. range validation, e-mail validation, etc).
This one depends on your tastes, but you will be doing defensive programming.
Some years ago I attended to a talk about validators (a similar talk slide's are here. The document explaining it used to be in http://www.caesarsystems.com/resources/caesarsystems/files/Extreme_Validation.pdf but now its a 404 :( ) and totally like the concept. IMHO having a validation framework that adopts the OO philosophy is the way to go. In case you want to read about it I've written a couple of posts about it here and here (disclaimer: the posts are part of the blog of the company I work for).
HTH

GW Basic default variable initialization

I'm working with legacy code and ran across something that I haven't been able to explain after several days of looking up tutorials and handbooks for GW Basic: a variable (P9%) is used in a comparison on line 530 (IF P9% <> 0) before the code would reach its definition on line 860. It's not a complex piece of code, only ~1200 lines total, so I am confident that I haven't missed any goto or gosub or anything that would reach 860 earlier than this comparison.
I am curious as to how this has been effecting the program as it runs. Most of my experience is with c++ where this sort of thing wouldn't compile, and if it did an unassigned variable could potentially contain anything that would fit, but I have no idea what kind of default assignment is given to a variable in Basic.
It has been many years since I wrote much in gwbasic!
If I remember correctly the variable is assigned a zero value in that case. Gwbasic (and Qbasic I think) assigns a default value to all variables when first referenced, this is usually zero or the empty string for a string variable.
Interestingly when creating an array using the DIM statement, all the items in the array are also initialised this way.
Even with this mechanism it is usually better to initialise a variable, just to be clear what is happening.
Many programmers of the era writing for gwbasic omitted as much as they could to minimise the amount of memory used by the program instructions so they had more for other stuff. So that may be why it's not initialised.

can a variable have multiple values

In algebra if I make the statement x + y = 3, the variables I used will hold the values either 2 and 1 or 1 and 2. I know that assignment in programming is not the same thing, but I got to wondering. If I wanted to represent the value of, say, a quantumly weird particle, I would want my variable to have two values at the same time and to have it resolve into one or the other later. Or maybe I'm just dreaming?
Is it possible to say something like i = 3 or 2;?
This is one of the features planned for Perl 6 (junctions), with syntax that should look like my $a = 1|2|3;
If ever implemented, it would work intuitively, like $a==1 being true at the same time as $a==2. Also, for example, $a+1 would give you a value of 2|3|4.
This feature is actually available in Perl5 as well through Perl6::Junction and Quantum::Superpositions modules, but without the syntax sugar (through 'functions' all and any).
At least for comparison (b < any(1,2,3)) it was also available in Microsoft Cω experimental language, however it was not documented anywhere (I just tried it when I was looking at Cω and it just worked).
You can't do this with native types, but there's nothing stopping you from creating a variable object (presuming you are using an OO language) which has a range of values or even a probability density function rather than an actual value.
You will also need to define all the mathematical operators between your variables and your variables and native scalars. Same goes for the equality and assignment operators.
numpy arrays do something similar for vectors and matrices.
That's also the kind of thing you can do in Prolog. You define rules that constraint your variables and then let Prolog resolve them ...
It takes some time to get used to it, but it is wonderful for certain problems once you know how to use it ...
Damien Conways Quantum::Superpositions might do what you want,
https://metacpan.org/pod/Quantum::Superpositions
You might need your crack-pipe however.
What you're asking seems to be how to implement a Fuzzy Logic system. These have been around for some time and you can undoubtedly pick up a library for the common programming languages quite easily.
You could use a struct and handle the operations manualy. Otherwise, no a variable only has 1 value at a time.
A variable is nothing more than an address into memory. That means a variable describes exactly one place in memory (length depending on the type). So as long as we have no "quantum memory" (and we dont have it, and it doesnt look like we will have it in near future), the answer is a NO.
If you want to program and to modell this behaviour, your way would be to use a an array (with length equal to the number of max. multiple values). With this comes the increased runtime, hence the computations must be done on each of the values (e.g. x+y, must compute with 2 different values x1+y1, x2+y2, x1+y2 and x2+y1).
In Perl , you can .
If you use Scalar::Util , you can have a var take 2 values . One if it's used in string context , and another if it's used in a numerical context .