I'm working on an application that fills in a number of arrays. But being originally a VB6 application, it doesn't use element zero of any of them. This stops things like
my_array.Min
from working properly. I have no plans to tamper with the innards of the application, but it would be very convenient if I could specify a range of array elements in this sort of statement; something like
my_array(1:100).Min
Does such a construction exist, and if so, what is it?
Unfortunately .NET doesn't have a convenient array slice construct¹ (although you can use LINQ to approximate it), but you're solving the wrong X in an XY problem here.
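For example, a minimal LINQ sketch (assuming my_array holds numbers and System.Linq is imported) that skips the unused element zero before aggregating:

' Skip element zero, then aggregate; roughly my_array(1:100).Min
Dim smallest = my_array.Skip(1).Min()
Dim firstHundred = my_array.Skip(1).Take(100).ToArray()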
The real solution is not to use 1-based arrays. Do change the innards of your application.
Incidentally, the default base for arrays in VB6 was also zero. You explicitly needed to specify Option Base 1 for 1-based arrays.
¹ There's ArraySegment(Of T), but before .NET 4.5 this structure was completely broken since it didn't implement the IList(Of T) interface and was thus unusable. It does implement that interface now, but it's too late – nobody is using it.
I have a very CPU intensive F# program that depends on persistent data-structures - about 40% of the total CPU time is spent in the Map module. So I thought I'd try out the PersistentHashMap in FSharpX collections. (BTW, this is already a big improvement over the previous version of F# in VS2013 where the same program spent 70% of its time in Map. I also notice that running programs with the debugger attached doesn't have the huge penalty it did before - good work guys...) There is also a hot-spot where I'm re-sorting all the time, where instead I should be adding to a Heap, so I thought I'd give that a go as well.
Two issues became immediately apparent:
(1) Swapping one for the other at the interface level proved harder than it seems it should be - i.e., making a shim that let me switch from a Map to a PersistentMap while preserving both the needed module-based let-bound functions and the types necessary to use each map. I know that having full HM type inference (and no type classes) is mostly orthogonal to LSP-style substitutability - but maybe I was missing some way to do this better with a minimal amount of code.
(2) The biggest problem (which I'd like to focus on here) is the reliance of the F# functional data structures on OO-style dispatched equality and comparison via IComparable and friends (the when 't : comparison constraint, etc.).
Even for OO programs, ISTM that dispatched equality and comparison is a bad idea -- an object "knows" how to perform its own domain-specific tasks, but for the most part it doesn't "know" what notion of equality will be needed at various points in the program for various purposes -- so equality/comparison should not be part of the object's interface; wherever these concepts are needed, they should be stated explicitly. For example, there should never be a .Sort(), only a .SortWith(...). One could argue that even something as basic as structural equality in F# should be explicit - a.StructEq(b) or a ~= b - since otherwise you always get Object.Equals. But even stipulating that the current design is best for a multi-paradigm language that's a first-class .NET citizen, there should at least be the option of passing in comparison and equality functions, and that option does not exist.
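(For contrast, here is a minimal sketch of the "always pass it in" style - in VB.NET rather than F#, and purely illustrative, since the BCL collections already expose explicit comparer overloads:)

' Two different orderings for the same data, both passed in explicitly
' rather than dispatched through the element type's CompareTo.
Dim names As New List(Of String) From {"Grace", "Ada", "Linus"}
names.Sort(StringComparer.OrdinalIgnoreCase)             ' explicit ordering #1
names.Sort(Function(a, b) a.Length.CompareTo(b.Length))  ' explicit ordering #2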
This means that: (a) type constraints are enforced even if you don't want them, causing ripples of broken type inference (and hundreds of wavy red lines with no clear indication of where the actual "problem" is); and (b) by implementing a notion of equality or comparison that makes one container type happy in one part of your program (and in my case I want to use the same container and item type with two different notions of ordering in two different places), you are likely to silently break other parts of the code that depended on the default or previous implementation (or make them inefficient, if one notion subsumes the other).
The only way around this that I could think of is wrapping each item in an adapter object using a new ... with object expression - but I really don't want to create that much garbage just to get the code to work.
So, ISTM that we could have a "pure" version of each persistent data structure, loadable if desired (even for basics like List, etc.), that does not depend on dispatched equality/comparison/hashing and does not impose type constraints - all such needs would be met by functions passed in at the call site. (Dispatched eq/cmp would be used only for interop with BCL collections that don't accept delegates.) Then we could have an [EqCmpHashThrowNotImplemented] attribute, and I could be sure that no default operations were happening at all, and I would feel better about the efficiency and predictability of my code. (This also lets one change from a Record to a Class or vice versa without worrying about behavior changes due to default implementations.) Again, this would be optional, enabled by a simple import. (Which does mean that each core collection type would have to be broken out into its own module - which isn't really a bad idea anyway.)
If I've overlooked a better way to do things or there are some patterns people are using here, I'd be interested.
What are the differences between these two? Why would you pick one over the other? Is it just personal preference, or is there an actual reason to use either a built-in function or whatever .length is?
I think using *.length over *.length() or len(*) is something of a historical artifact, probably done to make getting the length of an array as fast as possible. Arrays, after all, are a very basic data structure in many languages, and getting the length of one is an extremely common operation. And accessing a property is much faster than calling a method.
Nowadays a compiler could probably optimize that kind of thing out, but back then I think there was a pull towards ease-of-implementation which guided many languages to simply have *.length as a property.
However, in any OOP language at least, it's more consistent to have *.length(), because while arrays have immutable lengths and can afford to expose *.length as a constant value, other data structures, to which you can add or remove values, would not be able to do this.
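To illustrate in VB.NET (a sketch - the same distinction applies in most languages): an array's length is fixed when it is created, while a resizable collection has to keep its count in sync with every mutation.

' An array's Length is fixed at creation time...
Dim arr(9) As Integer
Console.WriteLine(arr.Length)   ' 10 - a stored value that never changes

' ...while a growable collection must keep its Count up to date.
Dim items As New List(Of Integer)
items.Add(42)
Console.WriteLine(items.Count)  ' 1 - changes as items are added or removed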
So this seems like a fairly simple answer to a common problem: infinite loop detected in Jackson. If, when serializing an object tree, Jackson comes upon an object it has already serialized, why doesn't it just ignore it? Is there a way to do this in Jackson, or has someone created something similar?
Why all this mucking around with JsonManagedReference/JsonBackReference, which is completely insufficient if you are sometimes serializing child objects (which need a reference to the parent) and sometimes serializing parent objects (which obviously shouldn't have the child refer back to them)?
It seems like now I have to create custom views that take into account every type of circular reference and use case possible, which in any non-trivial ORM is a huge task.
EDIT (October 2012)
Jackson 2.x now actually DOES support identity information handling with the @JsonIdentityInfo annotation! So the original answer is a bit out of date...
OBSOLETE
Jackson does not support handling of object identity: this is a non-trivial task, not so much because of identifying shared objects (which can be done by traversing the object graph, incurring some overhead) but because of figuring out how to include the identity information: which ids to use, and how. This in turn is somewhat similar to inclusion of type information, but it adds a second dimension of extra wrapping to handle.
Doing this has been requested before, and some thought has gone into figuring out how to do it, but the ratio of effort to benefit (i.e. the number of requests, how badly it is needed) has been higher than for other features.
So your best bet is to use wrapper objects and implement this manually, or to have a look at XStream, which can solve this (when enabled; it adds significant time overhead) and also has a JSON output mode using Jettison.
Implementing this manually for your use case is a bit easier than solving the general case: you could start with a BeanSerializerModifier to add a wrapper handler that keeps track of object identities and knows when to serialize an object id instead.
Some people have argued that the C# 4.0 feature introduced with the dynamic keyword is the same as the "everything is an Object" feature of VB. However, any call on a dynamic variable is translated into a delegate once, and from then on the delegate is called. In VB, when using Object, no caching is applied, and each call on an untyped method involves a whole lot of under-the-hood reflection, sometimes totaling a whopping 400-fold performance penalty.
Have the dynamic type delegate-optimization and caching also been added to the VB untyped method calls, or is VB's untyped Object still so slow?
Solution
Some research, and a closer reading of the article by Hans Passant referred to earlier, lead to the following conclusions:
VB.NET 2010 supports the DLR;
You can implement IDynamicMetaObjectProvider if you want to explicitly support dynamics, the VB.NET compiler is updated to recognize that;
VB's Object will only use the DLR and method caching if the object implements IDynamicMetaObjectProvider;
BCL and Framework types do not implement IDynamicMetaObjectProvider; using Object with such types, or with your own types, will invoke the classical, non-cached VB.NET late binder.
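A minimal sketch of the dividing line (assuming Option Strict Off; ExpandoObject implements IDynamicMetaObjectProvider, while a plain String does not):

' Both calls compile under Option Strict Off, but they take different paths.
Dim expando As Object = New System.Dynamic.ExpandoObject()
expando.Name = "Ada"             ' DLR path: resolved through cached call sites

Dim s As Object = "hello"
Console.WriteLine(s.ToUpper())   ' classic VB late binder: reflection on each call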
Background: elaborating on why late-binding caching could help VB code performance
Some people (among them Hans Passant; see his answer) may wonder why caching or not caching in late binding could possibly matter. Actually, it makes a large difference, both in VB and in other late-binding technologies (remember QueryInterface with COM?).
Late binding comes down to a simple principle: given a name and its parameter declarations, loop through all the methods of this class and its parent classes via the reflection methods available through the Type class (and in VB, a method, a property, and a field can look the same, making this process even slower). Considering that method tables are unordered, this is easily much more expensive than a single direct (i.e., typed) method call.
If you could look the method up once and then store the method pointer in a lookup table, this would greatly speed the process up. Cached method binding in the DLR goes one step further and, where possible, replaces the method call with a pointer to the actual method. After the first call, every subsequent call becomes an order of magnitude faster (think 200x to 800x faster).
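A hypothetical sketch of the "look it up once" idea (the property name Name and the cache shape are illustrative; real DLR call sites also guard against the receiver's type changing and compile actual delegates):

Imports System
Imports System.Collections.Generic
Imports System.Reflection

Module LateBindingCacheSketch
    ' Cache the reflection lookup per receiver type, so the slow walk
    ' through the member table happens only once per type.
    Private ReadOnly Getters As New Dictionary(Of Type, PropertyInfo)

    Function GetName(obj As Object) As Object
        Dim pi As PropertyInfo = Nothing
        If Not Getters.TryGetValue(obj.GetType(), pi) Then
            pi = obj.GetType().GetProperty("Name")  ' slow: done once per type
            Getters(obj.GetType()) = pi
        End If
        Return pi.GetValue(obj, Nothing)            ' reused on every later call
    End Function
End Module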
As an example of when this matters, here's some code that illustrates the issue. In a case where every class has a .Name string property but the classes share no common ancestor or interface, you can naively sort lists of any of those types like so:
' in the body of some method
Dim listCustomers As List(Of Customers) = GetListCustomers()
Dim listCompanies As List(Of Companies) = GetListCompanies()
listCustomers.Sort(AddressOf MySort.SortByName)
listCompanies.Sort(AddressOf MySort.SortByName)

' sorting function (obj1.Name and obj2.Name are late-bound: Option Strict Off)
Public Shared Function SortByName(obj1 As Object, obj2 As Object) As Integer
    ' for clarity, checks for equality and for nothingness removed
    Return String.Compare(obj1.Name, obj2.Name)
End Function
This code (or at least something very similar) actually made it into production with one of my clients and was used in an often-called AJAX callback. Without manual caching of the .Name properties, on medium-sized lists of fewer than half a million objects, the late-bound code became such a noticeable burden that it eventually brought the whole site down. The issue proved hard to track down, but that's a story for another time. After fixing it, the site regained 95% of its CPU resources.
So the answer to Hans's question "don't you have bigger problems to worry about" is simple: this is a big problem (or can be), especially for VB programmers who have grown careless about using late binding.
In this particular case, and many like it, VB.NET 2010 has apparently not been upgraded to cache late-bound calls on ordinary objects, and as such, Object remains evil for the unaware and should not be compared with dynamic.
PS: late-binding performance issues are very hard to track down, unless you have a good performance profiler and know how late-binding is implemented internally by the compiler.
Quoting from the what's new article:
Visual Basic 2010 has been updated to fully support the DLR in its latebinder
Can't get more explicit than that. It is the DLR that implements the caching.
Good question. I'm guessing the answer is "No", because this article in MSDN magazine says VB.Net has been changed to support the Dynamic Language Runtime, and briefly describes changes to the runtime but doesn't mention caching.
Does anyone know better?
What I would like to do is be able to take a Dictionary of key value pairs and make the key the name of a variable and the value the value.
From searching the net, it seems very unclear whether this is possible.
The equivalent in PHP would be:
foreach ($array as $key => $val)
{
    $$key = $val;
}
Thanks.
You have to declare variables in .NET CLR languages (not to be confused with .NET dynamic runtime languages) at compile time. More than that, it's generally better if you know the types of the variables as well. .NET programmers generally believe that this is a good thing (the link is for C#, but the contents still apply).
What do you want to do with these variables? Tell us, and I'll bet we can give you a better way to accomplish the same thing.
This is doable, but it is a slow operation: you'd have to use reflection to access fields or properties by their string names (true local variables can't be created at runtime in .NET).
If at all possible, it will be much faster to use Generics and store the objects themselves as the keys; there's an example of that in VB.NET with a Dictionary here. Doing it this way means no casting or reflection is needed at run time. Plus, it allows IntelliSense to work on the collection directly, which is yet another awesome thing about Generics.
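A minimal sketch of the Dictionary approach (the key and value names are made up for illustration): instead of conjuring variable names at runtime, keep the pairs in a strongly typed dictionary and index by key wherever the "variable" would have been used.

Dim vars As New Dictionary(Of String, String)
vars("firstName") = "Ada"
vars("lastName") = "Lovelace"

' The PHP $$key = $val trick becomes a plain lookup by key.
For Each pair In vars
    Console.WriteLine(pair.Key & " = " & pair.Value)
Next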