Memory Consumption of Collection Types


I am trying to make a small app that uses a hierarchy of nested lists (a List within a List) some 20-30 levels deep. I first tried
System.Collections.ObjectModel.ObservableCollection, but at run time I got an OutOfMemoryException. Then I tried List, and this time I did not get the error.
What type of collection consumes the least memory? Or what would be a good way to build this kind of hierarchy? I just need a collection; I don't need change notifications, etc. I am using .NET 4 with VB, LINQ, and WPF. The loop that builds the hierarchy runs in parallel via Parallel.ForEach.
Edit:
The program stores file system data in a SQL CE database and retrieves it again, so the hierarchy can be 20-30 levels deep.
Edit: There would be about 80,000 LINQ queries to retrieve the hierarchy. The data type I am using is as follows:
Public Structure FileRecord
    Property ID As String
    Property Namee As String
    Property Size As String
    Property IsFolder As Boolean
    Property DateModified As Date
    Property FullPath As String
    Property Disk As String
    Property ParentID As String
    Property Items As List(Of FileRecord)
End Structure

The per-collection overhead alone would not cause an OutOfMemoryException as described above, because both types implement IList<T>, and ObservableCollection<T> stores its items in a List<T> internally. What ObservableCollection<T> adds is INotifyCollectionChanged, which alerts views and presenters/viewmodels to changes in the collection.
The bigger question is what element type you are storing. If you're running out of memory, each instance might be allocating more memory than it needs. Also, I wouldn't use an ObservableCollection unless you intend to use binding.

The performance characteristics of the various .NET collection classes vary widely, and the type of collection you'll want to use will also depend on how you will want to access the collection. As usual, there are trade-offs to be made between performance (in time and memory) and simplicity or convenience.
That said, one of the simplest and most performant collection types in .NET is probably Array.
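On the question of how to build the hierarchy itself, one option is to read all records once and group them by ParentID, rather than issuing one query per folder. The following is a rough sketch of that idea in Python, not VB.NET. The field names loosely mirror the FileRecord structure from the question; `build_tree`, the sample rows, and the use of `None` as the parent of a root entry are assumptions made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FileRecord:
    id: str
    name: str
    parent_id: Optional[str]   # None marks a root entry (assumption)
    is_folder: bool = False
    items: list = field(default_factory=list)


def build_tree(records):
    """Link every record to its children in two linear passes; return the roots."""
    children = defaultdict(list)
    for rec in records:
        children[rec.parent_id].append(rec)
    for rec in records:
        rec.items = children.get(rec.id, [])
    return children.get(None, [])


# Hypothetical flat rows, as they might come back from a single query.
rows = [
    FileRecord("1", "C:", None, True),
    FileRecord("2", "Docs", "1", True),
    FileRecord("3", "notes.txt", "2"),
    FileRecord("4", "readme.txt", "1"),
]
roots = build_tree(rows)
```

Each record is touched a constant number of times, so the cost grows with the number of records rather than with the number of folders queried.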

Related

How to get multiple data from gemfire cacheloader?

We are going to use GemFire in our project. We are currently syncing the GemFire cache with our DB2 database, and we are running into a problem when putting DB data into the cache.
To put DB data into a region, I have implemented com.gemstone.gemfire.cache.CacheLoader and overridden its load method. As the Javadoc says, the load method returns only one object, but our requirement is to return multiple VOs (value objects) from load:
public List<CmDvceInvtrGemfireBean> load(LoaderHelper<CmDvceInvtrGemfireBean, CmDvceInvtrGemfireBean> helper)
throws CacheLoaderException
When I return multiple VOs as a List<CmDvceInvtrGemfireBean>, the GemFire region treats the list as a single value.
So when I invoke
System.out.println("return COUNT" + cmDvceInvtrRecord.query("SELECT COUNT(*) FROM /cmDvceInvtrRecord"));
it returns a count of one, even though I can see that the value contains 7 records.
So I want a mechanism that puts all 7 values into the region as separate VOs.
Is there any way to do this using a GemFire CacheLoader?

A CacheLoader was meant to load a value only for a single entry in the GemFire Region on a cache miss. As the Javadoc states...
..creates the value for the desired key..
While a key can map to a multi-valued (e.g. an array/Collection) value, the CacheLoader can only populate a single entry.
You will have to resort to other means of populating the cache with multiple "entries" in a single operation.
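The distinction can be modelled in a few lines. This is a plain-Python sketch of the idea, not GemFire code: `Region`, `load_one`, and `fetch_all_from_db` are hypothetical stand-ins. A loader fills one key on a miss, while filling many entries at once is a separate bulk operation (in GemFire, `Region.putAll` plays that role).

```python
class Region:
    """Toy key/value region with a per-key loader (a stand-in, not GemFire)."""

    def __init__(self, loader):
        self._entries = {}
        self._loader = loader

    def get(self, key):
        # Cache miss: the loader creates the value for this one key only.
        if key not in self._entries:
            self._entries[key] = self._loader(key)
        return self._entries[key]

    def put_all(self, mapping):
        # Bulk population: one entry per key/value pair.
        self._entries.update(mapping)

    def size(self):
        return len(self._entries)


def fetch_all_from_db():
    # Hypothetical DB read returning seven value objects keyed by id.
    return {f"dev-{i}": {"id": f"dev-{i}"} for i in range(7)}


def load_one(key):
    # Returning a list from a loader still produces ONE entry for `key`.
    return list(fetch_all_from_db().values())


via_loader = Region(load_one)
via_loader.get("all-devices")          # one entry whose value is a 7-item list

via_bulk = Region(load_one)
via_bulk.put_all(fetch_all_from_db())  # seven separate entries
```

With the loader, a count over the region sees one entry; with the bulk put, it sees seven.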
Out of curiosity, why do you need (requirement?) to load multiple entries (from the DB) at once? Are you trying to minimize the number of round trips to the DB?
Also, what logic are you using to decide what VO from the DB will be loaded based on the information (i.e. key) provided in the CacheLoader?
For instance, are you somehow trying to predictably select values from the DB based on the CacheLoader key that would subsequently minimize cache misses on future Region.get(key) calls?
Sorry, I don't have a better answer for you right now, but answers to some of these questions may help me give you some ideas for alternatives.
Cheers,
John

Fast, efficient method of assigning large array of data to array of clusters?

I'm looking for a faster, more efficient method of assigning data gathered from a DAQ to its proper location in a large cluster containing arrays of subclusters.
My current method relies heavily on the OpenG cluster manipulation tools, but with a large data set the performance is far too slow.
The array and cluster location of each element of data from the DAQ is determined during an initialization phase and doesn't change during acquisition.
Because the data element origin and end points are the same throughout acquisition, I would think an array of memory locations could be created and the data directly assigned to its proper place. I'm just not sure how to implement such a thing.

The following code does what you want:
For each of your cluster elements (AMC, ANLG_PM and PA) you should add a case in the string case structure, for the elements AMC and PA you will need to place a second case structure.

This is really more of a comment, but I do not have the reputation to leave those yet, so here it is:
Regarding adding cases for every possible value of Array name, is there any reason why you cannot use an enum here? Since you are placing it into a cluster anyway, I would suggest making a type-defined enum of your possible array names. That way, when you want to add or remove one, you only have to do it in one place.
You will still need to right-click on your case structures that use this enum and select Add item for every value if you are adding a value, or manually delete the obsolete value if you are removing one. I suppose some maintenance is required either way...

Limiting Amount of Rows in List View

Simple enough question: how can I limit the number of rows in a ListView to the number of items/rows that actually contain information? I know how to count the rows with items by using this code:
ListView1.Items.Count
But how can I limit the number of rows the ListView has to the number of items?

Assuming a version of .NET that includes LINQ (3.5+), you get some really nice features which help a lot. These apply to any IEnumerable(Of T) or IQueryable(Of T), including List(Of T):
Dim MyList = [Some code to get hundreds of items]
Dim MyShortList = MyList.Take(30)
You can also implement paging very easily by using Skip...
Dim MyShortListPage2 = MyList.Skip(30).Take(30)
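For comparison, here is the same Take/Skip paging pattern sketched in Python. `page` is a hypothetical helper, and `itertools.islice` is the standard-library analogue of Skip followed by Take.

```python
from itertools import islice


def page(items, page_number, page_size=30):
    """Return one page of items; page_number starts at 1."""
    start = (page_number - 1) * page_size
    return list(islice(items, start, start + page_size))


my_list = range(100)            # stand-in for "hundreds of items"
first_page = page(my_list, 1)   # like MyList.Take(30)
second_page = page(my_list, 2)  # like MyList.Skip(30).Take(30)
last_page = page(my_list, 4)    # a short final page is returned as-is
```

A final page with fewer than `page_size` items comes back shorter rather than raising an error, which matches how Take behaves.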
You should look into using the Entity Framework or an equivalent that implements IQueryable. These reduce memory overhead by using deferred execution: the query runs only when its results are actually requested.
In short, if I were to do the following using the EF:
Dim Users = DBContext.Set(Of Users)
Users won't actually contain all users in the database; instead it will contain the query to get all users. If I did Users.First, it would run the query against SQL to get the first user. If instead I did Users.Where(Function(x) x.Age = 30).First, it would only query SQL for the first user whose age is 30.
Thus, IQueryable lets you pare down a dataset quickly using the power of the underlying provider instead of doing it in-memory.
If, instead, I did
Dim Users = DBContext.Set(Of Users).ToList()
This would retrieve all users from the database into memory. The ToList() call is what forces this to happen. A List has to be stored in local memory; an IQueryable does not. It can run the appropriate query at the last possible moment and fetch as little as possible to satisfy your request.
Whether you want this to happen or not depends on the use case.
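Deferred execution is not specific to LINQ. As a rough Python analogue, a generator can stand in for the deferred query and a list for the materialized one. `fetch_users` is a hypothetical data source that counts how many rows it actually produces.

```python
rows_read = 0


def fetch_users():
    """Hypothetical source: yields users one at a time, counting reads."""
    global rows_read
    for i in range(1000):
        rows_read += 1
        yield {"id": i, "age": 20 + i % 40}


# Deferred: defining the query reads nothing.
query = (u for u in fetch_users() if u["age"] == 30)
reads_before = rows_read

# Asking for the first match reads only as far as that match.
first_match = next(query)
reads_deferred = rows_read

# Materialized: building a list reads every row up front.
rows_read = 0
all_users = list(fetch_users())
reads_materialized = rows_read
```

The analogy is loose in one respect: a generator still filters on the client, whereas an IQueryable provider can translate the filter into SQL so the database does the work.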

DataSet serialization and OutOfMemoryException

I've got a DataSet with about 250k Rows and 80 Columns causing StringBuilder to throw an OutOfMemoryException (#System.String.GetStringForStringBuilder(String value, Int32 startIndex, Int32 length, Int32 capacity)) when calling .GetXml() on my dataset.
As I read here (last paragraph) this can be overcome by using binary representation instead of xml, which sounds logical.
So I set the RemotingFormat property on my dataset to binary, but the issue still occurs.
I had a closer look at the GetXml implementation, and there seems to be no distinction based on the RemotingFormat. Instead, I found out that GetXmlSchemaForRemoting considers RemotingFormat, but this method is internal, so I can't call it from the outside. It is called by the private SerializeDataSet, which is called by the public GetObjectData.
GetObjectData itself seems to be for custom serialization.
How can I binary (de-)serialize my dataset? Or call at least GetXml without throwing exceptions? Did I overlook any dataset property?

The link you provided in your question is from 2008.
There are some more recent discussions:
dotnetspider 2009
and also one on SO from 2011.
The last one is about a problem with DataAdapter while reading 150K records, but the answer may also be interesting for you:
"The first thing that I'd check is how many columns you are returning, and what their data types are."
and
"...you are either returning way more fields than you need, or perhaps that some of the fields are very large strings or binary data. Try cutting down the select statement to only return the fields that are absolutely needed for the display. If that doesn't work, you may need to move from a DataTable to a list of a custom data type (a class with the appropriate fields)."
(from the accepted answer)

As you discovered, there is no built-in way to serialize DataSets as binary.
The only way to serialize your DataSet as binary data is to implement your own formatter.
Start here: http://msdn.microsoft.com/en-us/magazine/cc163911.aspx

Keeping an array sorted - at setting, getting or later?

As an aid to learning Objective-C/OOP, I'm designing an iOS app to store and display periodic bodyweight measurements. I've got a singleton which returns an NSMutableArray holding the shared store of measurement objects. Each measurement will have at least a date and a body weight, and I want to be able to add historic measurements.
I'd like to display the measurements in date order. What's the best way to do this? As far as I can see the options are as follows: 1) when adding a measurement, I override addObject: to sort the shared store every time after a measurement is added, 2) when retrieving the NSMutableArray I sort it, or 3) I retrieve the NSMutableArray in whatever order it happens to be in the shared store, then sort it when displaying the table/chart.
It's likely that the data will be retrieved more frequently than a new datum is added, so option 1 will reduce redundant sorting of the shared store - so this is the best way, yes?

You can use a modified version of (1). Instead of sorting the complete array each time a new object is inserted, you use the method described here: https://stackoverflow.com/a/8180369/1187415 to insert the new object into the array at the correct place.
Then for each insert you have only a binary search to find the correct index for the new object, and the array is always in correct order.
Since you said that the data is more frequently retrieved than new data is added, this seems to be more efficient.
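A minimal sketch of such a sorted insert, written in Python rather than Objective-C. `Measurement` and `add_measurement` are hypothetical names; `bisect.insort` from the standard library does the binary search and the insertion in one call.

```python
import bisect
from dataclasses import dataclass
from datetime import date


@dataclass(order=True)
class Measurement:
    taken_on: date      # compared first, so ordering is by date
    weight_kg: float


def add_measurement(store, measurement):
    """Insert into an already-sorted list, keeping it sorted."""
    bisect.insort(store, measurement)


store = []
add_measurement(store, Measurement(date(2024, 3, 1), 81.2))
add_measurement(store, Measurement(date(2024, 1, 15), 82.0))  # historic entry
add_measurement(store, Measurement(date(2024, 2, 10), 81.6))
dates = [m.taken_on for m in store]
```

Finding the index takes a logarithmic number of comparisons, but the insertion itself still shifts the later elements, so each insert remains linear in the worst case. For a store of periodic measurements that cost is unlikely to matter.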

Setting your special case aside, this question is not so easy to answer. There are two basic solutions:
1) Keep the array unsorted, and when you access an element while the array is not sorted, sort it first. Let's call it "lazy sorting".
2) Keep the array sorted when inserting elements. Note that this is not about appending the new element at the end and then sorting the whole array. It is about finding where the element belongs (binary search) and placing it there. Let's call it "sorted insert".
Both techniques are correct and useful, and which one is better depends on your use case.
Examples:
1) You want to insert hundreds of elements into the array, then access the elements, then again insert hundreds of elements, then access. In summary, you will be inserting values in big chunks. In this case, lazy sorting will be better.
2) You often insert individual elements and access the elements often. Then sorted insert will perform better.
3) Something in the middle (between inserting one and inserting tens of elements at a time). You probably won't care which method is used.
(Note that you can also use specialized structures to keep a collection sorted that are not based on NSArray, e.g. structures based on a balanced tree that tracks the number of elements in each subtree.)
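The "lazy sorting" strategy can be sketched as a small wrapper that marks itself dirty on insert and sorts only when read. This is an illustration in Python, not Objective-C; `LazySortedList` and its `sort_count` field are hypothetical and exist only to make the behaviour visible.

```python
class LazySortedList:
    """Appends are cheap; sorting happens at most once per batch of inserts."""

    def __init__(self):
        self._items = []
        self._dirty = False
        self.sort_count = 0   # instrumentation for this example only

    def add(self, value):
        self._items.append(value)
        self._dirty = True

    def items(self):
        # Sort only if something was added since the last read.
        if self._dirty:
            self._items.sort()
            self._dirty = False
            self.sort_count += 1
        return list(self._items)


weights = LazySortedList()
for w in (82.0, 80.5, 81.3):
    weights.add(w)
first_read = weights.items()
second_read = weights.items()   # no new inserts, so no second sort
```

Three inserts followed by two reads trigger a single sort, which is why this approach suits inserts that arrive in big chunks.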
