VB.NET "Fixing" Code Clone - vb.net

I'm working on refactoring a decently large project, and I'm interested in searching out and reducing Code Clone for better standardization as well as ease of development.
I've got a code snippet that keeps coming up in "Exact Matches" (using Visual Studio 2012's
"Find Code Clones" feature).
Here it is:
End If
End Using
Catch ex As Exception
MsgBox(ex.Message)
End Try
End Using
End Using
Return Nothing
Basically, I have a whole host of similar functions (similar in structure but not in actual function) that all open with something like this:
Const sql As String = ...
Using cn As SqlConnection...
Using cmd As SqlCommand(sql,cn)
... Maybe add some SqlParameters
cn.Open()
... Something with cmd.Execute...
Now, I recognize that the first code block is IDENTICAL across many, many methods, but I can't figure out a way to pull that code out, write it once and simply call it each time I need that functionality. It seems to me that there is too much control flow occurring.
So, I'm stumped as to how to fix this (and I'm only assuming that I can "fix" this because Microsoft has identified it as "cloned code").
I'm thinking of something along the lines of making a small number of functions that do the same type of thing (like returning a count from a table, returning a top value, etc...) and only really differ by the SQL that is executed. That could be a bit tricky, though, since sometimes the parameters (type and number) differ.
Any thoughts?

I wouldn't concern yourself with those. Your common-sense first impression is correct. While, technically speaking, the syntax is repeated multiple times, it is not really a repetition of logic or an algorithm of any kind. That's not to say there's no way to reduce that repetition, it's just that by doing so, you will likely end up with a worse design in your code.
For instance, you could create a single method that does all the setup and tear-down and then just calls a method in the middle that actually uses the connection to do the work, such as:
Public Function PerformDbTask(task As IMyDbTask)
Using cn As SqlConnection...
Using cmd As SqlCommand = cn.CreateCommand()
Try
cn.Open()
task.Perform(cmd)
Catch ex As Exception
MsgBox(ex.Message)
End Try
End Using
End Using
Return Nothing
End Function
However, what have you really gained? Probably not much at all, yet you've lost a lot of flexibility. So, unless that kind of design is actually necessary for what you are trying to do, I wouldn't waste time trying to solve a problem that doesn't exist.

You could create a class that implements a builder pattern, that winds up looking like
Dim list =
SqlBuilder.Query("SELECT ...")
.WithConnection("...connection string...")
.WithParameter(userName)
.WithParameter(lastTrackId)
.RetrieveTopRows(10)

Your problem is you are using a clone detector that matches token sequences (thus it matches the end-of-block sequences you exhibited), as opposed to clone detectors that match code structures. While the token sequence is technically a clone, it isn't an interesting clone. Token clone detectors produce many many "false positive" clones like this, which simply waste your time.
It is easier to build token sequence detectors, which is why they built that into MS Studio.
If you want better detectors, you have to step outside of Studio.
You should look into clone detectors that match on abstract syntax trees. They don't produce this kind of false positive, and they can find clones with complex parameters (as opposed to single-token parameters).
You should also understand that just because something has been identified as a clone, that it is not always easy or possible to refactor it away. The language you have may not have sufficiently strong abstraction mechanisms to handle that case.

Related

Roslyn context.SemanticModel.GetDeclaredSymbol() returning NULL from InvocationExpression

Trying to develop an VS extension to help with migration from vb6 to Vb.net using Roslyn.
Unfortunately I am not having much luck with detecting the "DoEvents" expression in my source as I get NULL from my GetDeclaredSymbol during the detection.
My bad coding is......
Register the action:
context.RegisterSyntaxNodeAction(AddressOf ExpressionStatementDec, SyntaxKind.InvocationExpression)
Try and detect the "DoEvents" expression:
Private Sub ExpressionStatementDec(context As SyntaxNodeAnalysisContext)
Dim GotYou = context.SemanticModel.GetDeclaredSymbol(context.Node)
Dim WhatExpression = context.Node.ToFullString.ToString
' Find DoEvents.
If RemoveWhitespace(WhatExpression) = "DoEvents" Then
Dim diag = Diagnostic.Create(Rule, GotYou.Locations(0), GotYou.Name)
context.ReportDiagnostic(diag)
End If
End Sub
I have tried loads of options for trying to get the right type of object for "GotYou" but no luck so far.
Any pointers appreciated :)
Edit Additional info:
I have tried GetSymbolInfo but when I am detecting "DoEvents" in the context.Node.ToFullString.ToString I am still not getting anything in the context.SemanticModel.GetSymbolInfo(context.Node) as below.
Thanks,
Richard
If you want to look at what a invocation is referencing, call GetSymbolInfo not GetDeclaredSymbol.
Don’t have Visual Studio handy in order to get the code, but...
I believe what you want is something like:
Dim WhatExpression = TryCast(CType(context.Node, InvocationExpressionSyntax).Expression, IdentifierNameSyntax)?.Identifier.Text
This isn’t all of it, you could be dealing with a memberaccessexpression, in which case it’s probably not what you are looking for. The options would be a bit easier to handle with pattern matching in C#, but that’s the general idea. You don’t need the semantic tree at this point, because you first want to verify that you are dealing with the right text. Once you’ve got that, you can see where it comes from and whether it is something you need to deal with. Getting the semantic model is expensive, no reason to do so when (outside of your unit test) it is rarely going to be needed.

naming a function that exhibits "set if not equal" behavior

This might be an odd question, but I'm looking for a word to use in a function name. I'm normally good at coming up with succinct, meaningful function names, but this one has me stumped so I thought I'd appeal for help.
The function will take some desired state as an argument and compare it to the current state. If no change is needed, the function will exit normally without doing anything. Otherwise, the function will take some action to achieve the desired state.
For example, if wanted to make sure the front door was closed, i might say:
my_house.<something>_front_door('closed')
What word or term should use in place of the something? I'd like it to be short, readable, and minimize the astonishment factor.
A couple clarifying points...
I would want someone calling the function to intuitively know they didn't need to wrap the function an 'if' that checks the current state. For example, this would be bad:
if my_house.front_door_is_open():
my_house.<something>_front_door('closed')
Also, they should know that the function won't throw an exception if the desired state matches the current state. So this should never happen:
try:
my_house.<something>_front_door('closed')
except DoorWasAlreadyClosedException:
pass
Here are some options I've considered:
my_house.set_front_door('closed')
my_house.setne_front_door('closed') # ne=not equal, from the setne x86 instruction
my_house.ensure_front_door('closed')
my_house.configure_front_door('closed')
my_house.update_front_door('closed')
my_house.make_front_door('closed')
my_house.remediate_front_door('closed')
And I'm open to other forms, but most I've thought of don't improve readability. Such as...
my_house.ensure_front_door_is('closed')
my_house.conditionally_update_front_door('closed')
my_house.change_front_door_if_needed('closed')
Thanks for any input!
I would use "ensure" as its succinct, descriptive and to the point:
EnsureCustomerExists(CustomerID)
EnsureDoorState(DoorStates.Closed)
EnsureUserInterface(GUIStates.Disabled)
Interesting question!
From the info that you have supplied, it seems to me that setstate (or simply set, if you are setting other things than states) would be fine, though ensure is good if you want to really emphasize the redundancy of an if.
To me it is however perfectly intuitive that setting a state does not throw an exception, or require an if. Think of setting the state of any other variable:
In C:
int i;
i = 5; // Would you expect this to throw an exception if i was already 5?
// Would you write
if (i != 5)
i = 5;
// ?
Also it only takes about one sentence to document this behaviour:
The function does nothing if the
current state equals the requested
state.
EDIT: Actually, thinking about it, if it is really important to you (for some reason) that the user is not confused about this, I would in fact pick ensure (or some other non-standard name). Why? Because as a user, a name like that would make me scratch my head a bit and look up the documentation ("This is more than just an ordinary set-function, apparently").
EDIT 2: Only you know how you design your programs, and which function name fits in best. From what you are saying, it seems like your setting functions sometimes throw exceptions, and you need to name a setting function that doesn't - e.g. set_missile_target. If that is the case, I think you should consider the set_if, set_when, set_cond or cond_set names. Which one would kind of depend on the rest of your code. I would also add that one line of documentation (or two, if you're generous), which clarifies the whole thing.
For example:
// Sets missile target if current target is not already the requested target,
// in which case it does nothing. No exceptions are thrown.
function cond_set_missile_target ()
or function cond_set_MissileTarget ()
or function condSet_MissileTarget ()
or function condSetMissileTarget ()
ensure is not so bad, but to me it implies only that there is additional logic required to set the state (e.g. multiple states tied together, or other complications). It helps to make the user avoid adding unnecessary ifs, but it does not help much with the exception issue. I would expect an ensure function to throw an exception sooner than a set function, since the ensure function clearly has more responsibilities for, well, ensuring that this setting operation is in fact done right.
I'd go for ensure for the function you describe. I'd also use camelCase, but I suppose you may be in a language that prefers underscores.
You could always document (shock!) your API so that others don't make the mistakes you describe.

Using statement with more than one system resource

I have used the using statement in both C# and VB. I agree with all the critics regarding nesting using statements (C# seems well done, VB not so much)
So with that in mind I was interested in improving my VB using statements by "using" more than one system resource within the same block:
Example:
Using objBitmap As New Bitmap(100,100)
Using objGraphics as Graphics = Graphics.From(objBitmap)
End Using
End Using
Could be written like this:
Using objBitmap As New Bitmap(100,100), objGraphics as Gaphics = Graphics.FromImage(objbitmap)
End Using
So my question is what is the better method?
My gut tells me that if the resources are related/dependent then using more than one resource in a using statement is logical.
My primary language is C#, and there most people prefer "stacked" usings when you have many of them in the same scope:
using (X)
using (Y)
using (Z)
{
// ...
}
The problem with the single using statement that VB.NET has is that it seems cluttered and it would be likely to fall of the edge of the screen. So at least from that perspective, multiple usings look better in VB.NET.
Maybe if you combine the second syntax with line continuations, it would look better:
Using objBitmap As New Bitmap(100,100), _
objGraphics as Graphics = Graphics.FromImage(objbitmap)
' ...
End Using
That gets you closer to what I would consider better readability.
They are both the same, you should choose the one that you find to be the most readable as they are identical at the IL level.
I personally like the way that C# handles this by allowing this syntax:
using (Foo foo = new Foo())
using (Bar bar = new Bar())
{
// ....
}
However I find the VB.NET equivalent of this form (your second example) to be less readable than the nested Using statements from your first example. But this is just my opinion. Choose the style that best suits the readability of the code in question as that is the most important thing considering that the output is identical.

When is it OK to use the GoTo statement in VB.Net?

I have process that needs to create a bunch of records in the database and roll everything back if anything goes wrong. What I want to do is this:
Public Structure Result
Public Success as Boolean
Public Message as String
End Structure
Private _Repository as IEntityRepository
Public Function SaveOrganization( _
ByVal organization As rv_o_Organization) As Result
Dim result = Result.Empty
_Repository.Connection.Open()
_Repository.Transaction = _Repository.Connection.BeginTransaction()
''//Performs validation then saves it to the database
''// using the current transaction
result = SaveMasterOrganization(organization.MasterOrganization)
If (Not result.Success) Then
GoTo somethingBadHappenedButNotAnException
End If
''//Performs validation then saves it to the database
''//using the current transaction
result = SaveOrganziation(dbOrg, organization)
If (Not result.Success) Then GoTo somethingBadHappenedButNotAnException
somethingBadHappenedButNotAnException:
_Repository.Transaction.Commit()
_Repository.Connection.Close()
Return result
End Sub
Is this an ok use of the GoTo statement, or just really bad design? Is there a more elegant solution? Hopefully this sample is able to get the point across
If you have to ask, don't do it.
For your specific code, you could do it like this:
Public Function SaveOrganization(ByVal organization As rv_o_Organization) As Result
Dim result As Result = Result.Empty
_Repository.Connection.Open()
_Repository.Transaction = _Repository.Connection.BeginTransaction()
'Performs validation then saves it to the database
'using the current transaction
result = SaveMasterOrganization(organization.MasterOrganization)
'Performs validation then saves it to the database
'using the current transaction
If result.Success Then result = SaveOrganziation(dbOrg, organization)
_Repository.Transaction.Commit()
_Repository.Connection.Close()
Return result
End Sub
Goto has such a terrible reputation that it will cause other developers to instantly think poorly of your code. Even if you can demonstrate that using a goto was the best design choice - you'll have to explain it again and again to anyone who sees your code.
For the sake of your own reputation, just don't do it.
Really bad design. Yes.
There might be some extreme edge cases where it is applicable, but almost unequivocally, no, do not use it.
In this specific case, you should be using the Using statement to handle this in a better manner. Generally, you would create a class which would implement IDisposable (or use one that already does) and then handle cleanup in the Dispose method. In this case, you would close the connection to the database (apparently, it is reopened again from your design).
Also, I would suggest using the TransactionScope class here as well, you can use that to scope your transaction and then commit it, as well as auto-abort in the face of exceptions.
The only time you should use a goto, is when there is no other alternative.
The only way to find out if there are no other alternatives is to try them all.
In your particular example, you should use try...finally instead, like this (sorry, I only know C#)
void DoStuff()
{
Connection connection = new Connection();
try
{
connection.Open()
if( SomethingBadHappened )
return;
}
finally
{
connection.Close();
}
}
I would say extremely sparingly. Anytime I have had to think about using a GOTO statement, I try and refactor the code. The only exception I can think of was in vb with the statement On Error Goto.
There is nothing inherently wrong with goto, but this isn't a very ideal use of it. I think you are putting too fine a point on the definition of an exception.
Just throw a custom exception and put your rollback code in there. I would assume you would also want to rollback anyway if a REAL exception occured, so you get double duty out of it that way too.
Gotos are simply an implementation detail. A try/catch is much like a goto (an inter-stack goto at that!) A while loop (or any construct) can be written with gotos if you want. Break and early return statements are the most thinly disguised gotos of them all--they are blatant(and some people dislike them because of the similarity)
So technically there is nothing really WRONG with them, but they do make for more difficult code. When you use the looping structures, you are bound to the area of your braces. There is no wondering where you are actually going to, searching or criss-crossing.
On top of that, they have a REALLY BAD rep. If you do decide to use one, even in the best of possible cases, you'll have to defend your decision against everyone who ever reads your code--and many of those people you'll be defending against won't have the capability to make the judgment call themselves, so you're encouraging bad code all around.
One solution for your case might be to use the fact that an early return is the same as a goto (ps. Worst psuedocode ever):
dbMethod() {
start transaction
if(doWriteWorks())
end Transaction success
else
rollback transaction
}
doWriteWorks() {
validate crap
try Write crap
if Fail
return false
validate other crap
try Write other crap
if Fail
return false
return true
}
I think this pattern would work in VB, but I haven't used it since VB 3 (around the time MS bought it) so if transactions are somehow bound to the executing method context or something, then I dunno. I know MS tends to bind the database very closely to the structure of the code or else I wouldn't even consider the possibility of this not working...
I use goto's all the time in particular places, for example right above a Try Catch, in case a you prompt the user, "Retry?, Cancel", if retry then Goto StartMyTask: and increment the Current Try of Maximum retries depending on the scenario.
They're also handy in for each loops.
For each Items in MyList
If VaidationCheck1(Item) = false then goto SkipLine
If ValidationCheck(Item) = false then goto skipline
'Do some logic here, that can be avoided by skipping it to get better performance.
'I use then like short circuit operands, why evaluate more than you actually have to?
SkipLine:
Next
I wouldn't substitute them for functions and make really huge blocks of long code, just in little places where they can really help, mostly to skip over things.
Everytime I've seen a goto used, simple refactoring could have handle it. I'd reccomend never using it unless you "know" that you have to use it
I'm tempted to say never, but I suppose there is always one case where it may be the best solution. However, I've programmed for the last 20 years or so without using a Goto statement and can't foresee needing one any time soon.
Why not wrap each function call in a try catch block and when that is done, if one of your exceptions are thrown, you can catch it and close the connection. This way, you avoid the GOTO statement altogether.
In short, the GOTO statement is NOT a good thing, except in unusual situations, and even then it is usually a matter of refactoring to avoid it. Don't forget, it is a leftover from early languages, in this case BASIC.
I'd say use very sparingly as it's generally associated with introducing spagetti code. Try using methods instead of Labels.
A good case I think for using GOTO is to create a flow through select which is available in C# but not VB.
The go to statement has a tendency to make the program flow difficult to understand. I cannot remember using it during the last ten years, except in visual basic 6 in combination with "on error".
Your use of the go to is acceptable as far as I am concerned, because the program flow is very clear. I don't think using try ... catch would improve things much, because you would need to throw the exceptions on the locations where the go tos are now.
The formatting, however, is not very appealing:-)
I would change the name of the go to label to something else, because this location is also reached when everything is successful. clean_up: would be nice.
Eh, there was a decent use for it in VBscript/ASP for handling errors. We used it to return error handling back to ASP once we were done using on error resume next.
in .net? Heavens, no!
Everyone says avoid it but why.
The GOTO syntax is a jump statement in assembly - very efficient.
Main reason to avoid it is code readability. You need to find the GOTO label in the code which is hard to eyeball.
Some people think that it might cause memory leaks but I've seen that experts say that this is not true in .NET.

What's bad about the VB With/End With keyword?

In this question, a user commented to never use the With block in VB. Why?
"Never" is a strong word.
I think it fine as long as you don't abuse it (like nesting)
IMHO - this is better:
With MyCommand.Parameters
.Count = 1
.Item(0).ParameterName = "#baz"
.Item(0).Value = fuz
End With
Than:
MyCommand.Parameters.Count = 1
MyCommand.Parameters.Item(0).ParameterName = "#baz"
MyCommand.Parameters.Item(0).Value = fuz
There is nothing wrong about the With keyword. It's true that it may reduce readibility when nested but the solution is simply don't use nested With.
There may be namespace problems in Delphi, which doesn't enforce a leading dot but that issue simply doesn't exist in VB.NET so the people that are posting rants about Delphi are losing their time in this question.
I think the real reason many people don't like the With keyword is that is not included in C* languages and many programmers automatically think that every feature not included in his/her favourite language is bad.
It's just not helpful compared to other options.
If you really miss it you can create a one or two character alias for your object instead. The alias only takes one line to setup, rather than two for the With block (With + End With lines).
The alias also gives you a quick mouse-over reference for the type of the variable. It provides a hook for the IDE to help you jump back to the top of the block if you want (though if the block is that large you have other problems). It can be passed as an argument to functions. And you can use it to reference an index property.
So we have an alternative that gives more function with less code.
Also see this question:
Why is the with() construct not included in C#, when it is really cool in VB.NET?
The with keyword is only sideswiped in a passing reference here in an hilarious article by the wonderful Verity Stob, but it's worth it for the vitriol: See the paragraph that starts
While we are on identifier confusion. The with keyword...
Worth reading the entire article!
The With keyword also provides another benefit - the object(s) in the With statement only need to be "qualified" once, which can improve performance. Check out the information on MSDN here:
http://msdn.microsoft.com/en-us/library/wc500chb(VS.80).aspx
So by all means, use it.