mxnet: how to retrieve the number of examples from the DataIter object?

After using mx.io.ImageRecordIter() to load my training examples, is there a way to retrieve the total number of examples from the returned DataIter object?
Many thanks,

The length of the iterator is unknown until you iterate through it.
You could iterate through it once and count the examples, but depending on the size of the data set this could be a time-consuming operation.
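
A minimal sketch of that counting pass, assuming the iterator behaves like an MXNet DataIter: it has reset(), and each batch exposes data (a list of arrays) and pad (the number of filler examples in the final batch), as mx.io.DataBatch does. The helper name count_examples is made up for this illustration, and it is worth checking on your own data that pad is actually populated by the iterator you use:

```python
def count_examples(data_iter):
    """Count examples by making one full pass over an MXNet-style iterator."""
    data_iter.reset()  # start from the first batch
    total = 0
    for batch in data_iter:
        batch_size = batch.data[0].shape[0]
        # The last batch may be padded up to batch_size; `pad` says by how much.
        # Assumption: `pad` is an int or None, as on mx.io.DataBatch.
        total += batch_size - (batch.pad or 0)
    data_iter.reset()  # leave the iterator ready for the next epoch
    return total
```

Since this reads and decodes every record once, it costs about as much as one epoch of data loading, so it makes sense to run it once and store the result.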

Related

LeavePGroupsOut For multidimensional array

I am working on a research problem and, because my dataset is small and organised by subject, I am trying to implement Leave-N-Out style analyses.
Currently I am doing this ad hoc, and I stumbled upon scikit-learn's LeavePGroupsOut class.
I read the docs but I am unable to understand how to use it with a multidimensional array.
My data are the following: I have 50 subjects, around 20 entries per subject (not fixed), and 20 features per entry, with a ground-truth value (0 or 1) for every entry.
Well the documentation is actually pretty clear:
https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.LeavePGroupsOut.html#sklearn.model_selection.LeavePGroupsOut
In your case you need to concatenate the per-subject arrays so that every entry (row) has a group index, i.e. the ID of the subject it belongs to. With 50 subjects and about 20 entries each, the feature array has shape (1000, 20) (roughly 50*20 data points by 20 features), so the group array needs shape (1000,), and so does the label array.
Then you need to define the cross validation via
lpgo = LeavePGroupsOut(n_groups=n_groups)
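It's important to notice that this enumerates every possible combination of left-out test groups. Leaving out 2 of 50 subjects, for example, already gives C(50, 2) = 1225 train/test splits, so the number of fits grows quickly with n_groups.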
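
As a rough sketch of how the pieces fit together, here is a toy version with 5 subjects instead of 50. The data are random placeholders and the variable names (entries_per_subject, held_out) are made up for the illustration; only LeavePGroupsOut, get_n_splits and split come from scikit-learn:

```python
import numpy as np
from sklearn.model_selection import LeavePGroupsOut

rng = np.random.default_rng(0)

# Toy stand-in for the real data: 5 subjects with a varying number of entries.
entries_per_subject = [3, 4, 2, 3, 4]
n_features = 20

# One group label per row: subject i repeated once per entry of that subject.
groups = np.repeat(np.arange(len(entries_per_subject)), entries_per_subject)
X = rng.normal(size=(len(groups), n_features))  # shape (16, 20)
y = rng.integers(0, 2, size=len(groups))        # ground truth, 0 or 1

lpgo = LeavePGroupsOut(n_groups=2)              # hold out 2 subjects per split
n_splits = lpgo.get_n_splits(X, y, groups)      # C(5, 2) combinations

held_out = []
for train_idx, test_idx in lpgo.split(X, y, groups):
    # Every row of a held-out subject is in the test set, none in training.
    assert set(groups[train_idx]).isdisjoint(groups[test_idx])
    held_out.append(tuple(int(g) for g in np.unique(groups[test_idx])))
```

Because the splitter works on row indices plus a group label per row, subjects with different numbers of entries need no special handling.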

What is the time complexity of Search in ArrayList?

This is an interview question which I couldn't answer and couldn't find any relevant answers to online.
I know that an ArrayList retrieves an element in constant time when given its index.
Suppose an ArrayList holds 10000 elements and we have to search for a particular value (e.g. the integer 3, which happens to be at index 5000, although we are not told that). To find the value we will have to traverse the ArrayList, and that would take linear time, right?
If we are traversing the ArrayList to find the data, it would take linear time and not constant time.
In short, I want to know the internal working of the contains method, where I have to check for a particular value and I don't have the index. It will have to traverse the array to check for that value, and it would take O(n) time, right?
Thanks in advance.
I hope this is what you want to know about search in ArrayList:
Arrays are laid out sequentially in memory. This means that if it is an array of integers that use 4 bytes each and it starts at memory address 1000, the next element will be at 1004, the next at 1008, and so forth. Thus, if I want the element at position 20 in my array, the code in get() will have to compute:
1000 + 20 * 4 = 1080
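to get the exact memory address of the element. RAM got its name, Random Access Memory, because it is built with a hierarchy of hardware multiplexers that lets it access any stored memory unit (byte?) in constant time, given the address.
Thus, two simple arithmetic operations and one access to RAM is said to be O(1). That only covers get(index), though. When you search by value, as contains() does, there is no address to compute, so the list is scanned from the start and each element is compared until a match is found; that takes O(n) in the worst case, as you suspected. See link to original answer.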
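
To make the contrast concrete, here is a small sketch that uses a Python list as a stand-in for ArrayList (both are array-backed). The function linear_search is written for this illustration and is not the JDK's implementation; it only mirrors the idea of scanning element by element, and counts the comparisons it makes:

```python
def linear_search(items, target):
    """Return (index, comparisons) for the first match, or (-1, comparisons)."""
    comparisons = 0
    for index, value in enumerate(items):
        comparisons += 1
        if value == target:
            return index, comparisons
    return -1, comparisons


data = list(range(10_000, 20_000))  # 10000 distinct values, none equal to 3
data[5000] = 3                      # hide the target at index 5000

by_index = data[5000]                     # lookup by index: one step, O(1)
found_at, steps = linear_search(data, 3)  # lookup by value: scans up to the match
missing_at, all_steps = linear_search(data, -1)  # absent value: scans everything
```

With the target at index 5000 the scan makes 5001 comparisons, and a value that is absent costs all 10000, which is why searching by value is O(n).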

Getting the name of the variable with the maximum value in AnyLogic

In AnyLogic, I have some variables (more than 2) and I want to know which of them has the maximum value. How can I do that?
Where can I save the name of the variable with the maximum value?
Getting the maximum of a number of values is not an AnyLogic problem but a general coding problem, for which Stack Overflow probably has hundreds of answers.
In AnyLogic, you could add all your variables to a statistics object. Then call myStatisticsObject.max() to get the maximum.
To save that in a new variable, call newVariable = myStatisticsObject.max()
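
Note that max() gives the value, not the name. As a language-neutral sketch of the general idea (written in Python rather than AnyLogic's Java, with made-up variable names and values), you can keep the variables in a name-to-value map and ask for the key with the largest value:

```python
# Hypothetical variables, keyed by name. In a real model these would be
# filled from the actual variables you want to compare.
variables = {"varA": 12.5, "varB": 40.0, "varC": 7.25}

# max() over the keys, ranked by their values, returns the winning name.
max_name = max(variables, key=variables.get)
max_value = variables[max_name]
```

The same pattern carries over to Java with a Map and a loop over its entries; the name can then be stored in an ordinary String variable.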
cheers

Better to use size or count on collection?

When counting a collection, is it better to do it via size or count?
Size = Ruby (@foobars.size)
Count = SQL (@foobars.count)
I also noticed that count makes another trip to the db.
I tend to suggest using size for everything, just because it's safer. People make fewer silly mistakes using size.
Here's how they work:
length: length will return the number of elements from an array, or otherwise loaded collection - the key point is that the collection will be loaded here regardless. So if you're working with an ActiveRecord association, it will pull the elements from the DB into memory, and then return the number.
count: count issues a database query, so if you have an array already it's a pointless call to your database.
size: best of both worlds - size checks which type you're using and then uses whichever seems more appropriate (so if you have an array, it will use length; if you have an unretrieved ActiveRecord::Association it will use count, and so on).
Source:
http://blog.hasmanythrough.com/2008/2/27/count-length-size/
It depends on the situation. In the example you show I would go with size since you already have the collection loaded and a call to size will just check the length of the array. As you noticed, count will do an extra db query and you really want to avoid that.
However, in the scenario that you only want to display the number of Foobars and not show those objects, then I would go with count because it will not load the instances into memory, just return the number of records.

Core Data: Get a random row from the fetched result

I'm looking for a memory efficient way to take only one row from a fetch result set. This must be random.
I thought of calling [context countForFetchRequest:fetch error:nil]; to get the count, picking a random integer from 0 up to (but not including) that count, and then using it as the fetch offset with the fetch limited to 1 result. But I can't find out whether counting allocates memory for each item it counts.
Is "count" a lightweight operation? Or does it need to instantiate objects in the context before being able to count them?
The documentation is somewhat unclear, but it includes the phrase "number of objects a given fetch request would have returned." Furthermore, Core Data tends to make things like count very lightweight - entity instances, for example, allow you to call count to find out the number of objects on the end of a to-many relationship without instantiating all those objects or firing that fault. I'd say go for it, but profile it yourself - don't optimize prematurely!
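
The count, random offset, limit 1 pattern from the question can be sketched outside Core Data as well. The example below is an analogy only: it uses Python's sqlite3 module (SQLite being a common Core Data store type), and the table name item, its contents and the helper random_row are all made up for the illustration. It says nothing about what Core Data does internally:

```python
import random
import sqlite3


def random_row(conn, table):
    """Fetch one random row using COUNT(*) plus LIMIT 1 OFFSET k."""
    # `table` is interpolated into the SQL, so it must be a trusted name.
    (count,) = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()
    if count == 0:
        return None
    offset = random.randrange(count)  # 0 <= offset < count
    # Only the single selected row is materialised on the Python side.
    return conn.execute(
        f"SELECT * FROM {table} ORDER BY rowid LIMIT 1 OFFSET ?", (offset,)
    ).fetchone()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (name TEXT)")
conn.executemany("INSERT INTO item VALUES (?)", [(f"item{i}",) for i in range(100)])
row = random_row(conn, "item")
```

The point of the pattern is that the count and the offset are resolved by the store, so only one row ever becomes an object in memory.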