Should mid and instr be used, or indexof and substring? - vb.net

Some VB string functions have similar methods in System.String, such as Mid and Substring, or InStr and IndexOf. Is there a good reason to use one or the other?

An example could explain a lot. This is the source code of Mid from Microsoft.VisualBasic
public static string Mid(string str, int Start, int Length)
{
    if (Start <= 0)
    {
        throw new ArgumentException(Utils.GetResourceString("Argument_GTZero1", new string[] { "Start" }));
    }
    if (Length < 0)
    {
        throw new ArgumentException(Utils.GetResourceString("Argument_GEZero1", new string[] { "Length" }));
    }
    if ((Length == 0) || (str == null))
    {
        return "";
    }
    int length = str.Length;
    if (Start > length)
    {
        return "";
    }
    if ((Start + Length) > length)
    {
        return str.Substring(Start - 1);
    }
    return str.Substring(Start - 1, Length);
}
At the end of the day, it calls Substring.
The story is a little more complex for InStr versus IndexOf because InStr accepts a Compare parameter, but even in that case the internal code used in the Microsoft.VisualBasic COMPATIBILITY (Bold is mine) library ends up in the base methods provided by the .NET Framework.
Of course, if you only need to maintain an old program ported from the VB6 days, then it is perfectly fine to use these methods. If instead you plan to keep evolving your program, or you are building a new one, I suggest switching to the .NET Framework core methods as soon as possible.
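To make the InStr/IndexOf difference concrete, here is a minimal sketch. The helper names FindWithInStr and FindWithIndexOf are made up for illustration. Note also that CompareMethod.Text is culture-sensitive, so StringComparison.OrdinalIgnoreCase is only an approximate counterpart, chosen here as an assumption.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module InStrVersusIndexOf
    ' InStr is 1-based and returns 0 when nothing is found.
    Function FindWithInStr(source As String, value As String) As Integer
        Return InStr(source, value, CompareMethod.Text)
    End Function

    ' IndexOf is 0-based and returns -1 when nothing is found.
    Function FindWithIndexOf(source As String, value As String) As Integer
        Return source.IndexOf(value, StringComparison.OrdinalIgnoreCase)
    End Function

    Sub Main()
        Console.WriteLine(FindWithInStr("Hello World", "world"))   ' 7
        Console.WriteLine(FindWithIndexOf("Hello World", "world")) ' 6
        Console.WriteLine(FindWithInStr("Hello World", "xyz"))     ' 0
        Console.WriteLine(FindWithIndexOf("Hello World", "xyz"))   ' -1
    End Sub
End Module
```

When porting, both the off-by-one index and the "not found" value (0 versus -1) need to be adjusted at every call site.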

I'd love to say you should use Mid() or InStr() because some of us have been using those functions for years, but I'd recommend against using those throwbacks, mostly because the portable target platforms (like Xbox and Windows Phone) do not support them. To me that's a sign that they're going to be deprecated sooner rather than later. I've also read that the .NET versions seem to perform better, but I can't find any statistics to support that claim right now.
One other interesting, somewhat related note is that the Trim() function deals with line breaks differently. Sample code:
Dim strTest As String = ControlChars.NewLine ' OR Environment.NewLine OR vbNewLine
Dim oldLength As Integer = Len(Trim(strTest)) '2
Dim newLength As Integer = strTest.Trim().Length '0
So be careful if you're porting code to the .NET versions.

The reason I usually prefer the System.String methods is that they are more compatible with other languages. If you use C# as well as VB, it's much less confusing to stick to the System.String ones. There are also a number of System.String functions which, AFAIK, don't have equivalents in Microsoft.VisualBasic, such as EndsWith, and it's a bit odd to use a mixture. The VB ones are for compatibility with VB6 etc., which is ancient now.
However, I do sometimes like the fact that the VB versions are more fault tolerant. The Mid example that Steve posted shows that Mid returns "" in cases where Substring would have thrown an exception. There are similar differences with some of the others. I have found that quite useful in the past; otherwise you can end up writing those checks yourself before calling Substring. It also means that converting code from the old VB-style functions to System.String can introduce unexpected exceptions. I work on one project which started in VB5, and I learnt quite quickly not to replace the old VB versions without a reason.
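The fault-tolerance difference can be sketched as follows. SafeSubstring is a hypothetical helper, not part of either library; it clamps out-of-range arguments instead of throwing, keeps the 0-based start index of Substring, and returns "" for a negative start (where Mid itself would throw on Start <= 0).

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module MidVersusSubstring
    ' Hypothetical helper: Substring that clamps instead of throwing.
    Function SafeSubstring(source As String, startIndex As Integer, length As Integer) As String
        If source Is Nothing OrElse startIndex < 0 OrElse length <= 0 OrElse startIndex >= source.Length Then
            Return ""
        End If
        ' Never ask for more characters than remain after startIndex.
        Return source.Substring(startIndex, Math.Min(length, source.Length - startIndex))
    End Function

    Sub Main()
        Dim s As String = "hello"
        Console.WriteLine(Mid(s, 4, 10))           ' lo (1-based start, length clamped)
        Console.WriteLine(Mid(s, 10, 3).Length)    ' 0  (start past the end gives "")
        Console.WriteLine(SafeSubstring(s, 3, 10)) ' lo
        Try
            Console.WriteLine(s.Substring(3, 10))
        Catch ex As ArgumentOutOfRangeException
            Console.WriteLine("Substring threw ArgumentOutOfRangeException")
        End Try
    End Sub
End Module
```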

Related

VB.Net Alternative to C# Underscore (discard)

In C# I can do:
_ = Bla();
Can I do that in VB.NET?
I think the answer is no, but I just wanted to make sure.
The underscore (_), as used in your example, is C#'s discard token. Unfortunately, there is (currently) nothing similar in VB. There is a discussion about adding a similar feature on the VB language design GitHub page.
In your example, however, you can just omit assigning the result (both in C# and VB), i.e.
Bla(); // C#
Bla() ' VB
The "discard variable" is particularly useful for out parameters. In VB, you can just pass an arbitrary value instead of a variable to discard unused ByRef parameters. Let me give you an example:
The following two lines are invalid in C#:
var b = Int32.TryParse("3", 0); // won't compile
var b = Int32.TryParse("3", out 0); // won't compile
Starting with C# 7, you can use _ for that purpose:
var b = Int32.TryParse("3", out _); // compiles and discards the out parameter
This, however, is perfectly valid in VB, even with Option Strict On:
Dim b = Int32.TryParse("3", 0)
So, yes, it would be nice to make the fact that "I want to ignore the ByRef value" more explicit, but there is a simple workaround in VB.NET. Obviously, once VB.NET gets pattern matching or deconstructors, this workaround won't be enough.

cli/c++ increment operator overloading

I have a question regarding operator overloading in the C++/CLI environment:
static Length^ operator++(Length^ len)
{
    Length^ temp = gcnew Length(len->feet, len->inches);
    ++temp->inches;
    temp->feet += temp->inches/temp->inchesPerFoot;
    temp->inches %= temp->inchesPerFoot;
    return temp;
}
(The code is from Ivor Horton's book.)
Why do we need to declare a new class object (temp) on the heap just to return it?
I've googled for information on overloading, but there's really not much out there and I feel kind of lost.
This is how operator overloading is implemented in .NET. An overloaded operator is a static function that returns a new instance instead of changing the current one. Therefore, the postfix and prefix ++ operators share the same implementation. Most information about operator overloading covers native C++. You can find .NET-specific information by looking for C# samples, for example this: http://msdn.microsoft.com/en-us/library/aa288467(v=vs.71).aspx
The .NET GC makes it cheap to create many lightweight new instances, which are collected automatically. This is why overloaded operators in .NET are simpler than in native C++.
Yes, because you're overloading the POST-increment operator here. Hence, the original value may be used a lot in the code, copied and stored somewhere else, despite the existence of the new value. Example:
store_length_somewhere( len++ );
While len will be increased, the original value might be stored by the function somewhere else. That means that you might need two different values at the same time. Hence the creation and return of a new value.

Using If...Else vs If()

Does cleanliness trump performance here:
Version 1:
Function MyFunc(ByVal param As String) As String
    Dim returnValue As String
    If param Is Nothing Then
        returnValue = "foo"
    Else
        returnValue = param
    End If
    Return returnValue
End Function
Version 2:
Function MyFunc(ByVal param As String) As String
    Return If(param, "foo")
End Function
Version 1 deals directly with unboxed Strings. Version 2 deals with all boxed Objects. [If() takes a TestExpression as Object, a FalsePart as Object and returns an Object]
[can't add comments]
COMMENT: ja72, fixed my naming.
COMMENT: Marc, so you would go with Version 2?
I think clarity trumps anything.
The If(obj1,obj2) function is the null coalescing operator of VB.NET. It functions the same as obj1 ?? obj2 in C#. As such, everyone should know what it means, and it should be used where conciseness is important.
Although the If/Else statement is clean, simple, and obvious, in this particular case, I would favor the If function.
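As a small illustration of the two-argument If operator, here is a sketch wrapped in a made-up CoalesceDemo module. It compiles with Option Strict On and no cast, because the result is typed as String. Note that only Nothing triggers the fallback; an empty string is returned unchanged.

```vbnet
Option Strict On
Imports System

Module CoalesceDemo
    ' Returns param unless it is Nothing, in which case "foo" is returned.
    Function MyFunc(ByVal param As String) As String
        Return If(param, "foo")
    End Function

    Sub Main()
        Console.WriteLine(MyFunc(Nothing))   ' foo
        Console.WriteLine(MyFunc("bar"))     ' bar
        Console.WriteLine(MyFunc("").Length) ' 0 (empty string is not Nothing)
    End Sub
End Module
```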
The compiler would optimize these two to the same code or nearly the same depending on the optimization level (See project properties).
Write two methods this way, compile them, and use Reflector to look at the decompiled VB.NET code (or even the MSIL), and you will see that there is very little difference in execution (some billionths of a second), or none at all.
Compilers generally handle common patterns, which lets you write if-statements and loops in different ways. For instance, at the IL level the for, foreach, while and do constructs do not exist as such. They are language-specific features that are compiled down to branch instructions, much like goto-statement logic. Use Reflector to look at a few of these and you'll learn a lot! :)
Note that it is possible to write bad code that the compiler can't optimize to its "best state", and it is even possible to do better than the compiler. Understanding .NET assemblies and MSIL means understanding the compiler better.
Really? I don't think this function is going to be a bottleneck in any application, and so just go with brevity/clarity.
I would recommend:
Public Function TXV(ByVal param As String) As String
    Return If(param Is Nothing, "foo", param)
End Function
and make sure the function returns a string (to keep type safety). BTW, why is your Function called MySub ? Shouldn't it be MyFunc ?
I believe that these two implementations are nearly the same, I would use the second one because it's shorter.
Since I come from a C background, I would opt for the ternary operator most times where it is clear what is happening - in a case like this where there is repetition and it can be idiomatic. Similarly in T-SQL where you can use COALESCE(a, b, c, d, e) to avoid having a bunch of conditionals and simply take the first non-null value - this is idiomatic and easily read.
Beware that the old IIf function is different from the new If operator, so while the new one properly handles side-effects and short-circuits, it's only one character away from a completely different behavior which people have long been wary of.
http://secretgeek.net/iif_function.asp
http://visualbasic.about.com/od/usingvbnet/a/ifop.htm
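The side-effect difference between the If operator and the IIf function can be shown with a short sketch. Expensive and the calls counter are hypothetical names, used only to make the evaluation visible.

```vbnet
Option Strict On
Imports System
Imports Microsoft.VisualBasic

Module IfVersusIIf
    ' Counts how often the fallback expression is evaluated.
    Public calls As Integer = 0

    Function Expensive() As String
        calls += 1
        Return "foo"
    End Function

    Sub Main()
        Dim param As String = "bar"

        ' The If operator short-circuits: Expensive() is not evaluated here.
        Dim a As String = If(param IsNot Nothing, param, Expensive())
        Console.WriteLine(calls) ' 0

        ' IIf is an ordinary function: both arguments are evaluated before the call,
        ' and the result comes back as Object, so a conversion is needed.
        Dim b As String = CStr(IIf(param IsNot Nothing, param, Expensive()))
        Console.WriteLine(calls) ' 1
    End Sub
End Module
```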
I don't think it's going to matter in terms of performance, because the optimizer is pretty good about this kind of transform.

VB.NET logical expression evaluator

I need to test a logical expression held in a string to see if it evaluates to TRUE or FALSE (the string is built dynamically).
For example, the resulting string may contain "'dog'<'cat' OR (1>4 AND 4<6)". There are no variables in the string; it will logically evaluate. It will only contain the simple operators = > < >< >= <=, AND, OR, opening and closing brackets, string constants and numbers (converted to the correct syntax, && || etc.).
I currently achieve this by creating a JScript function and compiling it into a .dll. I then reference the .dll in my VB.NET project.
class ExpressionEvaluator
{
    function Evaluate(Expression : String)
    {
        return eval(Expression);
    }
}
Is there a simpler method using built-in .NET functions or lambda expressions?
I tried out the demo for this project, and you might prefer it to your current method of evaluating. Note that it doesn't use lambda expressions or any built-in .NET methods.
http://web1.codeproject.com/KB/vb/expression_evaluator.aspx?msg=1151870
try out: http://www.codeproject.com/KB/cs/ExpressionEval.aspx
More guidance :http://www.thefreakparade.com/2008/07/evaluating-expressions-at-runtime-in-net-c/
Good one: http://flee.codeplex.com/
Boolean Example which you are looking for : http://flee.codeplex.com/wikipage?title=BooleanExpression&referringTitle=Examples (ignore the variable adding part as you are not looking for variable)
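As a sketch of a built-in alternative that none of the answers above mention: DataTable.Compute from System.Data can evaluate constant expressions written in the DataColumn expression syntax. This swaps in a different technique from the linked libraries. It assumes the string keeps AND/OR rather than && / ||, and that "not equal" is written as <> instead of ><. The module and function names are made up.

```vbnet
Imports System
Imports System.Data

Module ComputeEvaluator
    ' Evaluates a constant logical expression with DataTable.Compute.
    Function Evaluate(expression As String) As Boolean
        Using table As New DataTable()
            Return Convert.ToBoolean(table.Compute(expression, ""))
        End Using
    End Function

    Sub Main()
        Console.WriteLine(Evaluate("'dog'<'cat' OR (1>4 AND 4<6)")) ' False
        Console.WriteLine(Evaluate("1<4 AND 4<6"))                  ' True
    End Sub
End Module
```

String comparisons follow the DataTable's locale and case-sensitivity settings, so results for mixed-case strings may differ from what eval produced in JScript.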

P/Invoke with [Out] StringBuilder / LPTSTR and multibyte chars: Garbled text?

I'm trying to use P/Invoke to fetch a string (among other things) from an unmanaged DLL, but the string comes out garbled, no matter what I try.
I'm not a native Windows coder, so I'm unsure about the character encoding bits. The DLL is set to use "Multi-Byte Character Set", which I can't change (because that would break other projects). I'm trying to add a wrapper function to extract some data from some existing classes. The string in question currently exists as a CString, and I'm trying to copy it to an LPTSTR, hoping to get it into a managed StringBuilder.
This is what I have done that I believe is the closest to being correct (I have removed the irrelevant bits, obviously):
// unmanaged function
DLLEXPORT void Test(LPTSTR result)
{
    // eval->result is a CString
    _tcscpy(result, (LPCTSTR)eval->result);
}

// in managed code
[DllImport("Test.dll", CharSet = CharSet.Auto)]
static extern void Test([Out] StringBuilder result);

// using it in managed code
StringBuilder result = new StringBuilder();
Test(result);
// contents in result garbled at this point

// just for comparison, this unmanaged consumer of the same function works
LPTSTR result = new TCHAR[100];
Test(result);
Really appreciate any tips! Thanks!!!
One problem is using CharSet.Auto.
On an NT-based system this will assume that the result parameter in the native DLL will be using Unicode. Change that to CharSet.Ansi and see if you get better results.
You also need to size the buffer of the StringBuilder that you're passing in:
StringBuilder result = new StringBuilder(100); // problem if more than 100 characters are returned
Also, the native C code is using 'TCHAR' types and macros, which means that it could be built for Unicode. If this might happen, it complicates the CharSet situation in the DllImportAttribute somewhat, especially if you don't use the TestA()/TestW() naming convention for the native export.
Don't use the [Out] attribute on the parameter, as you are not allocating the buffer in the C function:
[DllImport("Test.dll", CharSet = CharSet.Auto)]
static extern void Test(StringBuilder result);
StringBuilder result = new StringBuilder(100);
Test(result);
This should work for you
You didn't describe what your garbled string looks like. I suspect you are mixing up some MBCS strings and UCS-2 strings (using 2-byte wchar_ts). If every other byte is 0, then you are looking at a UCS-2 string (and possibly misusing it as an MBCS string). If every other byte is not 0, then you are probably looking at an MBCS string (and possibly misusing it as a Unicode string).
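The "every other byte is 0" check can be reproduced in managed code without the native DLL. In this sketch, Encoding.ASCII stands in for the DLL's multi-byte code page (an assumption made so that it runs anywhere), and HexDump is a made-up helper name.

```vbnet
Imports System
Imports System.Text

Module EncodingMismatch
    ' Formats bytes as hex pairs separated by dashes.
    Function HexDump(bytes As Byte()) As String
        Return BitConverter.ToString(bytes)
    End Function

    Sub Main()
        ' UTF-16 ("Unicode" in Windows terms): every other byte is 0 for ASCII text.
        Console.WriteLine(HexDump(Encoding.Unicode.GetBytes("Hi"))) ' 48-00-69-00

        ' Single-byte encoding: no zero padding.
        Console.WriteLine(HexDump(Encoding.ASCII.GetBytes("Hi")))   ' 48-69

        ' Reading single-byte data as UTF-16 fuses each byte pair into one wrong character.
        Dim garbled As String = Encoding.Unicode.GetString(Encoding.ASCII.GetBytes("Hi!!"))
        Console.WriteLine(garbled.Length) ' 2
    End Sub
End Module
```

This is the kind of mismatch that can occur when CharSet.Auto makes the marshaller treat an ANSI buffer as Unicode.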
In general, I would recommend not using TCHARs (or LPTSTRs). They use macro magic to switch between char (1 byte) and wchar_t (2 bytes), depending on whether _UNICODE is #defined. I prefer to explicitly use char and wchar_t to make the code's intent very clear. However, you will need to call the -A or -W forms of any Win32 APIs that use TCHAR parameters: e.g. MessageBoxA() or MessageBoxW() instead of MessageBox() (which is a macro that checks whether _UNICODE is #defined).
Then you should change CharSet = CharSet.Auto to either CharSet = CharSet.Ansi (if both caller and callee are using MBCS) or CharSet = CharSet.Unicode (if both caller and callee are using UCS-2 Unicode). But it sounds like your DLL is using MBCS, not Unicode.
pinvoke.net is a great wiki reference with many examples of P/Invoke function signatures for Win32 APIs: