Data structure with quick min, delete, insert, search for big compute job - optimization

I'm looking for a data structure that will let me perform the operations I need efficiently. I expect to traverse a loop between 10^11 and 10^13 times, so Ω(n) operations are right out. (I'll try to trim n down so it can fit in cache, but it won't be small.) Each time through the loop I will call
Min exactly once
Delete exactly once (delete the minimum, if that helps)
Insert 0 to 2 times, with an average of somewhat more than 1
Search once for each insert
I only care about average or amortized performance, not worst-case. (The calculation will take ages, it's no concern if bits of the calculation stall from time to time.) The data will not be adversarial.
What kinds of structures should I consider? Maybe there's some kind of heap modified to have quick search?

A balanced tree is quite a good data structure for this usage. All the specified operations run in O(log n). You can also write an optimized tree implementation so that the minimum is retrieved in O(1) (by keeping an iterator to the min node, and possibly its value for faster fetches). The resulting time of the algorithm will be O(m log n), where m is the number of iterations and n the number of items in the data structure.
This is the optimal algorithmic complexity. Indeed, assume each iteration could be done in (amortized) O(1); then each of the four operations must have such a complexity too. Let's assume a data structure S can be built with such properties. One could write the following algorithm (in Python):
def superSort(input):
    s = S()
    inputSize = len(input)
    for i in range(inputSize):
        s.insert(input[i])
    output = list()
    for i in range(inputSize):
        output.append(s.getMin())
        s.deleteMin()
    return output
superSort has an (amortized) complexity of O(n). However, any comparison-based sort has been proven to require Ω(n log n) comparisons. Thus, S cannot exist, and at least one of the four operations must take at least Ω(log n) (amortized) time.
Note that naive binary tree implementations are often pretty inefficient. There is a lot of optimization you can perform to make them much faster. For example, you can pack the nodes (see B-trees), put the nodes in an array (assuming the number of items is bounded), use relaxed balancing possibly based on randomization (see treaps), use small references (e.g. 16-bit or 32-bit indices rather than 64-bit pointers), etc. You can start with a naive AVL tree or a splay tree.
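For reference, here is a minimal sketch of this approach using std::set (a red-black tree in the common standard library implementations). Note that begin() of a standard ordered set already yields the minimum in constant time, so the cached-iterator trick is mainly useful for hand-rolled trees:

#include <cstdio>
#include <set>

int main() {
    std::set<long> s = {42, 7, 19};

    long minVal = *s.begin();              // min: begin() is the smallest key, O(1)
    s.erase(s.begin());                    // delete-min: O(log n)
    bool found = (s.find(19) != s.end());  // search: O(log n)
    s.insert(3);                           // insert: O(log n)

    std::printf("min=%ld found=%d newMin=%ld\n", minVal, (int)found, *s.begin());
    return 0;
}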

My suggested data structure requires more work to implement, but it does achieve the desired results:
A data structure with {insert, delete, findMin, search} operations can be implemented using an AVL tree, which ensures that each operation is done in O(log n) and findMin is done in O(1).
I'm going to dive a bit into the implementation:
The tree would contain a pointer to the minimum node which is updated on each insertion and deletion, so findMin requires only O(1).
insert is implemented as in every AVL tree, which takes O(log n) (using the balance factor and rotations to rebalance the tree). After you insert an element, you would need to update the minimum-node pointer by going all the way to the left from the root of the tree, which requires O(log n) as well since the tree height is O(log n).
Likewise, after a delete you would need to update the minimum pointer in the same fashion, so it also requires O(log n).
Finally, search also requires O(log n).
If more assumptions were given, e.g. that the inserted elements are within a certain range of the minimum, then you could also give each node in the tree successor and predecessor pointers, which can likewise be updated in O(log n) during insertions and deletions and then accessed in O(1) without traversing the tree, so searching for the inserted elements can be done faster.
The successor of an inserted node can be updated by going to the right child and then all the way to the left. But if a right child does not exist then you would need to climb up the parents as long as the current node is not the left child of its parent.
The predecessor is updated in the exact reverse way.
In C++, a node would look something like this:
template <class Key,class Value>
class AvlNode{
private:
Key key;
Value value;
int Height;
int BF; //balance factor
AvlNode* Left;
AvlNode* Right;
AvlNode* Parent;
AvlNode* Succ;
AvlNode* Pred;
public:
...
};
While the tree would look something like this:
template <class Key,class Value>
class AVL {
private:
int NumOfKeys;
int Height;
AvlNode<Key, Value> *Minimum;
AvlNode<Key, Value> *Root;
static void swapLL(AVL<Key, Value> *avl, AvlNode<Key, Value> *root);
static void swapLR(AVL<Key, Value> *avl, AvlNode<Key, Value> *root);
static void swapRL(AVL<Key, Value> *avl, AvlNode<Key, Value> *root);
static void swapRR(AVL<Key, Value> *avl, AvlNode<Key, Value> *root);
public:
...
};
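Below is a stripped-down, hypothetical sketch of the bookkeeping described above (the cached-minimum update and the successor step). The real AvlNode/AVL classes would expose this through accessors; the fields are public here only to keep the example self-contained:

struct Node {
    int key;
    Node* Left;
    Node* Right;
    Node* Parent;
};

// Walk left from a node: O(log n) in a balanced tree.
Node* leftmost(Node* node) {
    while (node && node->Left) node = node->Left;
    return node;
}

// Recompute the cached minimum after an insert or delete, as described above;
// the result would be stored in the tree's Minimum pointer.
Node* updateMinimum(Node* root) {
    return leftmost(root);
}

// In-order successor: right child then all the way left; otherwise climb up
// the parents while the current node is the right child of its parent.
Node* successorOf(Node* node) {
    if (node->Right) return leftmost(node->Right);
    Node* parent = node->Parent;
    while (parent && node == parent->Right) {
        node = parent;
        parent = parent->Parent;
    }
    return parent;  // nullptr if node was the maximum
}

int main() {
    Node a{1, nullptr, nullptr, nullptr};
    Node c{3, nullptr, nullptr, nullptr};
    Node b{2, &a, &c, nullptr};      // b is the root with children a and c
    a.Parent = &b;
    c.Parent = &b;

    Node* min = updateMinimum(&b);   // -> a
    Node* next = successorOf(min);   // -> b
    return (min == &a && next == &b) ? 0 : 1;
}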

From what you told us, I think I would use an open-addressed hash table for search and a heap to keep track of the minimum.
In the heap, instead of storing values, you would store indexes/pointers to the items in the hash table. That way when you delete min from the heap, you can follow the pointer to find the item you need to delete from the hash table.
The total memory overhead will be 3 or 4 words per item -- about the same as a balanced tree, but the implementation is simpler and faster.
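A simplified sketch of this combination, using the standard containers rather than a hand-rolled open-addressed table, and duplicating the keys in the min-heap instead of storing indices (the MinTracker name and the long/double types are placeholders, not part of the original answer):

#include <cstdio>
#include <functional>
#include <queue>
#include <unordered_map>
#include <vector>

// Min-heap of keys for getMin/deleteMin, hash map of key -> payload for search.
struct MinTracker {
    std::unordered_map<long, double> table;
    std::priority_queue<long, std::vector<long>, std::greater<long>> heap;

    void insert(long key, double payload) {
        if (table.emplace(key, payload).second)   // skip duplicates
            heap.push(key);
    }
    bool contains(long key) const { return table.count(key) != 0; }
    long min() const { return heap.top(); }
    void deleteMin() {
        table.erase(heap.top());                  // follow the key back into the table
        heap.pop();
    }
};

int main() {
    MinTracker t;
    t.insert(5, 1.0);
    t.insert(2, 2.0);
    t.insert(9, 3.0);
    std::printf("min=%ld contains(9)=%d\n", t.min(), (int)t.contains(9));
    t.deleteMin();
    std::printf("new min=%ld\n", t.min());
    return 0;
}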

Related

What benefit does a balanced search tree provide over a sorted key-value pair array?

public class Entry{
int key;
String value;
}
If you have an array of Entry:
Entry[]
You can do a binary search on this array to find, insert, or remove an Entry, all in O(log n). I can also do a range search in O(log n).
And this is very simple.
What does a comparatively complicated data structure like a red-black balanced search tree, give me over a simple sorted key value array?
If the data is immutable, the tree has no benefit.
The only benefit of the array is locality of reference, i.e. the data is close together and the CPU can cache it effectively.
Because the array is sorted, search is O(log n).
If you add/remove items, things change.
For a small number of elements, the array is better (faster) because of the locality of reference.
For a larger number of items, a Red-Black tree (or another self-balancing tree) will perform better, because the array has to shift elements on every change,
i.e. insert and delete take O(log n) for the search plus on the order of n/2 element moves for the shift.
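A small C++ sketch of that trade-off: the binary search is cheap either way, but the array insertion has to move the tail, while a tree (std::map here) does not:

#include <algorithm>
#include <map>
#include <vector>

// Insert into a sorted array: O(log n) search + O(n) shift of the tail.
void insertSorted(std::vector<int>& a, int key) {
    auto pos = std::lower_bound(a.begin(), a.end(), key);
    a.insert(pos, key);
}

// Insert into a balanced tree: O(log n), no shifting.
void insertTree(std::map<int, int>& m, int key, int value) {
    m.emplace(key, value);
}

int main() {
    std::vector<int> a = {1, 3, 5, 7};
    insertSorted(a, 4);        // a is now {1, 3, 4, 5, 7}
    std::map<int, int> m;
    insertTree(m, 4, 42);
    return 0;
}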

Why is Hash Table insertion time complexity worst case is not N log N

Looking at the fundamental structure of a hash table, we know that it resizes with respect to the load factor or some other deterministic parameter. I get that if the resizing limit is reached during an insertion, we need to create a bigger hash table and insert everything there. Here is the thing which I don't get.
Let's consider a hash table where each bucket contains an AVL tree (a balanced BST). If my hash function returns the same index for every key, then I would store everything in the same AVL tree. I know that this hash function would be a really bad function and would not be used, but I'm doing a worst-case scenario here. So after some time, let's say that the resizing factor has been reached. So in order to resize, I create a new hash table and try to insert every old element from my previous table. Since the hash function maps everything back into one AVL tree, I would need to insert all the N elements into the same AVL tree. N insertions into an AVL tree take O(N log N). So why is the worst case of insertion for hash tables considered O(N)?
Here is the proof that adding N elements to an AVL tree is O(N log N):
Running time of adding N elements into an empty AVL tree
In short: it depends on how the bucket is implemented. With a linked list, it can be done in O(n) under certain conditions. For an implementation with AVL trees as buckets, this can indeed, in the worst case, result in O(n log n). In order to calculate the time complexity, the implementation of the buckets should be known.
Frequently a bucket is not implemented with an AVL tree, or a tree in general, but with a linked list. If there is a reference to the last entry of the list, appending can be done in O(1). Otherwise we can still reach O(1) by prepending to the linked list (in that case the bucket stores its data in reversed insertion order).
The idea of using a linked list is that a dictionary that uses a reasonable hashing function should result in few collisions. Frequently a bucket has zero or one elements, and sometimes two or three, but rarely much more. In that case, a simple data structure can be faster, since a simpler data structure usually requires fewer cycles per operation.
Some hash tables use open addressing, where buckets are not separate data structures: if a bucket is already taken, the next free bucket is used. In that case, a search will iterate over the used buckets until it has found a matching entry, or it has reached an empty bucket.
The Wikipedia article on Hash tables discusses how the buckets can be implemented.
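As a toy illustration of such a chained layout (fixed bucket count, no resizing, std::string keys purely as an example), note how the prepend stays O(1) regardless of the chain length:

#include <cstddef>
#include <forward_list>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Minimal separate-chaining sketch: each bucket is a singly linked list and
// insertion prepends, so the chain ends up in reversed insertion order.
struct ChainedTable {
    std::vector<std::forward_list<std::pair<std::string, int>>> buckets;

    explicit ChainedTable(std::size_t n = 16) : buckets(n) {}

    void insert(const std::string& key, int value) {
        std::size_t b = std::hash<std::string>{}(key) % buckets.size();
        buckets[b].push_front({key, value});           // O(1) prepend
    }

    const int* find(const std::string& key) const {
        std::size_t b = std::hash<std::string>{}(key) % buckets.size();
        for (const auto& kv : buckets[b])              // O(chain length)
            if (kv.first == key) return &kv.second;
        return nullptr;
    }
};

int main() {
    ChainedTable t;
    t.insert("alpha", 1);
    t.insert("beta", 2);
    const int* v = t.find("beta");
    return (v && *v == 2) ? 0 : 1;
}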

Given an array of N integers how to find the largest element which appears an even number of times in the array with minimum time complexity

You are given an array of N integers. You are asked to find the largest element which appears an even number of times in the array. What is the time complexity of your algorithm? Can you do this without sorting the entire array?
You could do it in O(n log n) with a table lookup method. For each element in the list, look it up in the table. If it is missing, insert a key-value pair with the key being the element and the value being the number of appearances (starting at one); if it is present, increment the count. At the end just loop through the table in O(n) and look for the largest key with an even count.
In theory, for an ideal hash table, a lookup operation is O(1). So you can find and/or insert all n elements in O(n) time, making the total complexity O(n). However, in practice you will have trouble with space allocation (you need considerably more space than the data set size) and with collisions (which is why you need that extra space). This makes the O(1) lookup difficult to guarantee; in the worst case a lookup can degrade to O(n) (though that is unlikely), making the total complexity O(n^2).
Instead you can be safer with a tree-based table, that is, the keys are stored in a binary tree. Lookup and insertion operations are both O(log n) in this case, provided the tree is balanced; there is a wide range of tree structures to help ensure this, e.g. Red-Black trees, AVL trees, splay trees, B-trees, etc. (Google is your friend). This will make the total complexity a guaranteed O(n log n).
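A short sketch of the counting approach described above, assuming int elements (largestEven is just an illustrative name):

#include <cstdio>
#include <unordered_map>
#include <vector>

// One pass to count occurrences, one pass over the table to pick the largest
// key with an even (non-zero) count. Average O(n) with a hash map.
bool largestEven(const std::vector<int>& a, int& result) {
    std::unordered_map<int, long> counts;
    for (int x : a) ++counts[x];

    bool found = false;
    for (const auto& kv : counts) {
        if (kv.second % 2 == 0 && (!found || kv.first > result)) {
            result = kv.first;
            found = true;
        }
    }
    return found;
}

int main() {
    std::vector<int> a = {5, 3, 5, 2, 2, 9};
    int r;
    if (largestEven(a, r)) std::printf("largest even-count element: %d\n", r);  // 5
    return 0;
}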

How to represent a binary relation

I plan to make a class that represents a strict partially ordered set, and I assume the most natural way to model its interface is as a binary relation. This gives functions like:
bool test(elementA, elementB); //return true if elementA < elementB
void set(elementA, elementB); //declare that elementA < elementB
void clear(elementA, elementB); //forget that elementA < elementB
and possibly functions like:
void applyTransitivity(); //if test(a,b) and test(b, c), then set(a, c)
bool checkIrreflexivity(); //return true if for no a, a < a
bool checkAsymmetry(); //return true if for no a and b, a < b and b < a
The naive implementation would be to have a list of pairs such that (a, b) indicates a < b. However, it's probably not optimal. For example, test would be linear time. Perhaps it could be better done as a hash map of lists.
Ideally, though, the in memory representation would by its nature enforce applyTransitivity to always be "in effect" and not permit the creation of edges that cause reflexivity or symmetry. In other words, the degrees of freedom of the data structure represent the degrees of freedom of a strict poset. Is there a known way to do this? Or, more realistically, is there a means of checking for being cyclical, and maintaining transitivity that is amortized and iterative with each call to set and clear, so that the cost of enforcing the correctness is low. Is there a working implementation?
Okay, let's talk about achieving bare metal-scraping micro-efficiency, and you can choose how deep down that abyss you want to go. At this architectural level, there are no data structures like hash maps and lists, there aren't even data types, just bits and bytes in memory.
As an aside, you'll also find a lot of info on representations here by looking into common representations of DAGs. However, most of the common reps are designed more for convenience than efficiency.
Here, we want the data for a to be fused with that adjacency data into a single memory block. So you want to store the 'list', so to speak, of items that have a relation to a in a's own memory block so that we can potentially access a and all the elements related to a within a single cache line (bonus points if those related elements might also fit in the same cache line, but that's an NP-hard problem).
You can do that by storing, say, 32-bit indices in a. We can model such objects like so if we go a little higher level and use C for exemplary purposes:
struct Node
{
// node data
...
int links[]; // variable-length struct
};
This makes the Node a variable-length structure whose size and potentially even address changes, so we need an extra level of indirection to get stability and avoid invalidation, like an index to an index (if you control the memory allocator/array and it's purely contiguous), or an index to a pointer (or reference in some languages).
That makes your test function still involve a linear time search, but linear with respect to the number of elements related to a, not the number of elements total. Because we used a variable-length structure, a and its neighbor indices will potentially fit in a single cache line, and it's likely that a will already be in the cache just to make the query.
It's similar to the basic idea you had of the hash map storing lists, but without the explosion of lists overhead and without the hash lookup (which may be constant time but not nearly as fast as just accessing the connections to a from the same memory block). Most importantly, it's far more cache-friendly, and that's often going to make the difference between a few cycles and hundreds.
Now this means that you still have to roll up your sleeves and check for things like cycles yourself. If you want a data structure that more directly and conveniently models the problem, you'll find a nicer fit with graph data structures revolving around a formalization of a directed edge. However, those are much more convenient than they are efficient.
If you need the container to be generic and a can be any given type, T, then you can always wrap it (using C++ now):
template <class T>
struct Node
{
T node_data;
int links[1]; // VLS, not necessarily actually storing 1 element
};
And still fuse this all into one memory block this way. We need placement new here to preserve the C++ object semantics, and we may need to keep an eye on alignment.
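As a hedged illustration, one way such a fused allocation with placement new might look; indexing past links[0] is the classic "struct hack", which is not strictly standard C++ but matches the variable-length idea above (makeNode/destroyNode are illustrative helpers, not any library's API):

#include <cstdlib>
#include <new>
#include <string>

template <class T>
struct Node {
    T node_data;
    int num_links;
    int links[1];  // over-allocated: the extra ints live right past the struct
};

// Allocate a Node<T> with room for n links in one contiguous block; placement
// new runs T's constructor, and malloc returns maximally aligned storage.
template <class T>
Node<T>* makeNode(const T& data, int n) {
    std::size_t extra = (n > 1) ? (n - 1) * sizeof(int) : 0;
    void* raw = std::malloc(sizeof(Node<T>) + extra);
    return new (raw) Node<T>{data, n, {0}};
}

template <class T>
void destroyNode(Node<T>* node) {
    node->~Node<T>();  // run the node's (and T's) destructor
    std::free(node);
}

int main() {
    Node<std::string>* node = makeNode(std::string("a"), 3);
    node->links[0] = 7;   // indices of the elements related to "a" would go here
    node->links[1] = 11;
    node->links[2] = 13;
    destroyNode(node);
    return 0;
}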
Transitivity checks always involve a search of some sort (breadth-first or depth-first). I don't think there's any representation that avoids this unless you want to memoize/cache a potentially massive explosion of transitive data.
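For completeness, here is a sketch of that search as a plain reachability test, written over a simple adjacency list (a vector of index vectors) rather than the fused node layout, purely for brevity; test(a, b) becomes reachable(a, b) with a != b, and checking reachable(b, a) before set(a, b) catches would-be cycles:

#include <vector>

// Iterative depth-first search: is `to` reachable from `from`?
bool reachable(const std::vector<std::vector<int>>& links, int from, int to) {
    std::vector<char> seen(links.size(), 0);
    std::vector<int> stack = {from};
    while (!stack.empty()) {
        int cur = stack.back();
        stack.pop_back();
        if (cur == to) return true;
        if (seen[cur]) continue;
        seen[cur] = 1;
        for (int next : links[cur]) stack.push_back(next);
    }
    return false;
}

int main() {
    // 0 -> 1, 0 -> 2, 1 -> 2
    std::vector<std::vector<int>> links = {{1, 2}, {2}, {}};
    bool ok = reachable(links, 0, 2) && !reachable(links, 2, 0);
    return ok ? 0 : 1;
}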
At this point you should have something pretty fast if you want to go this deep down the abyss and have a solution that's really hard to maintain and understand. I've unfortunately found that this doesn't impress the ladies very much as with the case of having a car that goes really fast, but it can make your software go really, really fast.

Design a highly optimized datastructure to perform three operations insert, delete and getRandom

I just had a software interview. One of the questions was to design any data structure with three methods, insert, delete and getRandom, in a highly optimized way. The interviewer asked me to think of a combination of data structures to design a new one. Insert can be designed any way, but for getRandom and delete I need to get the position of a specific element. He gave me a hint to think about the data structure which takes minimum time for sorting.
Any answer or discussion is welcome.
Let t be the type of the elements you want to store in the datastructure.
Have an extensible array elements containing all the elements in no particular order. Have a hashtable indices that maps elements of type t to their position in elements.
Inserting e means
add e at the end of elements (i.e. push_back), get its position i
insert the mapping (e, i) into indices
deleting e means
find the position i of e in elements thanks to indices
overwrite e with the last element f of elements, then shrink elements by one (pop_back)
update indices: remove the mapping for e and change f's mapping to (f, i)
drawing one element at random (leaving it in the data structure, i.e. it's a peek, not a pop) is simply drawing an integer i in [0, elements.size()) and returning elements[i].
Assuming the hashtable is well suited for your elements of type t, all three operations are O(1); a minimal sketch is given below.
Be careful about the cases where there are 0 or 1 elements in the data structure.
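Here is a minimal C++ sketch of the scheme above, assuming int elements (the RandomizedSet name and the choice of std::mt19937 are mine, not part of the original answer):

#include <cstddef>
#include <random>
#include <unordered_map>
#include <vector>

// Extensible array of elements in no particular order, plus a hash map from
// element to its current position in the array.
struct RandomizedSet {
    std::vector<int> elements;
    std::unordered_map<int, std::size_t> indices;
    std::mt19937 rng{std::random_device{}()};

    bool insert(int e) {
        if (indices.count(e)) return false;     // already present
        indices[e] = elements.size();
        elements.push_back(e);                  // O(1) amortized
        return true;
    }

    bool erase(int e) {
        auto it = indices.find(e);
        if (it == indices.end()) return false;
        std::size_t i = it->second;
        int last = elements.back();
        elements[i] = last;                     // overwrite e with the last element f
        indices[last] = i;                      // ...and fix up f's index
        elements.pop_back();
        indices.erase(it);                      // finally drop e's mapping
        return true;
    }

    int getRandom() {                           // precondition: not empty
        std::uniform_int_distribution<std::size_t> d(0, elements.size() - 1);
        return elements[d(rng)];
    }
};

int main() {
    RandomizedSet s;
    s.insert(3);
    s.insert(7);
    s.insert(11);
    s.erase(7);
    int r = s.getRandom();                      // 3 or 11, uniformly
    return (r == 3 || r == 11) ? 0 : 1;
}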
A tree might work well here: O(log n) insert and delete, and choosing a random element could also be O(log n): start at the root node and at each junction choose a child at random (weighted by the total number of leaf nodes per child) until you reach a leaf.
The data structure which takes the least time for sorting is a sorted array.
getRandom() is trivial there (pick a random index), and search is a binary search, so O(log n).
But insert() and delete() involve adding/removing the element in question and then shifting everything after it, which is O(n), i.e. horrendous.
I think his hint was poor. You may have been in a bad interview.
What I feel is that you can use some balanced tree, like a Red-Black tree. This will give O(log n) insertion and deletion time.
For getting a random element, maybe you can have an additional hash table to keep track of the elements which are in the tree structure.
It might be a Heap (data structure).