How to implement a stack with a limited number of elements? - vb.net

I have recently created an elaborate Undo/Redo mechanism for a program of mine. It is an editor that works with specific XML files. However, since a single change may affect any number of nodes in the XML file, I am currently backing up the whole XML document as a clone.
So far, I've been using two System.Collections.Generic.Stack(Of XmlNode) objects to store them, and skipping back and forth works very well. But now I want to limit the number of steps one can undo, i.e. I need to throw out the oldest entries when the number of items in the undo stack exceeds a certain threshold.
How would I do that?
P.S.: It occurred to me that I might use something like a deque, so I already implemented my own DoubleEndedQueue(Of T). I could easily emulate a limited stack with that. It uses a System.Collections.Generic.List(Of T) internally, though, and I don't know whether List.Insert(0, item) is O(1) or O(n).
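For reference, List(Of T).Insert(0, item) has to shift every existing element, so it is O(n). One possible shape for a capacity-bounded stack - sketched here in C# with illustrative names only - uses a LinkedList so that pushing and discarding the oldest entry are both O(1):

    // Minimal sketch of a bounded "undo stack". Push/Pop work at the front in O(1);
    // when the capacity is exceeded, the oldest entry is dropped from the back,
    // also in O(1). The class and member names are illustrative, not a framework type.
    using System.Collections.Generic;

    public class BoundedStack<T>
    {
        private readonly LinkedList<T> _items = new LinkedList<T>();
        private readonly int _capacity;

        public BoundedStack(int capacity) { _capacity = capacity; }

        public int Count => _items.Count;

        public void Push(T item)
        {
            _items.AddFirst(item);        // newest entry at the front
            if (_items.Count > _capacity)
                _items.RemoveLast();      // silently discard the oldest entry
        }

        public T Pop()
        {
            T value = _items.First.Value;
            _items.RemoveFirst();
            return value;
        }

        public T Peek() => _items.First.Value;
    }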

Related

DOORS creates Object IDs even though they are not saved

I am adding some 2000 new objects to a DOORS module. I do this by importing a spreadsheet with blank IDs; DOORS is supposed to create IDs for those blank rows.
The problem is that while I import the spreadsheet, DOORS hangs. When I then kill the DOORS process, it creates the IDs anyway, so the next time I add a new object, the ID numbering continues from IDs that were already created but no longer exist. For some reason I need to continue from my last saved ID. Is there any way I can do this?
Several remarks here:
Works as designed. As soon as an object is created in any DOORS session, the new absolute number is centrally marked as "used". I think the main reason for this behaviour is the possibility to work in shared mode; with a different design you would get into trouble as soon as two developers work on the module at the same time.
Are you sure that DOORS really hangs? Perhaps it is just not finished yet; at least you can see that the objects are really created. Note that, depending on how the import script is written, the number of imports per second might decrease significantly for bigger files.
You should NEVER give the absolute number any meaning other than uniqueness (perhaps QSS should have used timestamps or UUIDs instead of integers for their absolute numbers when they designed DOORS; that would make the situation clearer). You will have to rework your "for some reason": perhaps use a different mechanism to assign your own IDs, or evaluate whether the requirement "generate consecutive numbers without gaps" is really necessary.

Does cytoscape.js have a way of producing JSON to be sent to server for saving?

There is a note on the cytoscape.js website that says:
"Note that a collection is immutible by default, meaning that the set of elements within a collection can not be changed. The API returns a new collection with different elements when necessary, instead of mutating the existing collection. This allows the developer to safely use set theory operations on collections, use collections functionally, and so on."
Does this mean it is not really suitable for use in building an online 'network editor', i.e. one where the user can interactively add and delete nodes and edges in an existing graph?
If I understand the note above, it would mean that adding a new node requires reconstructing the whole graph from scratch (but with the new node) and then presumably performing a complete redraw. Is this correct?
A collection is a set of elements; the set merely points to all the individual elements. You can think of it like an array of elements: The array just holds the elements. Different arrays/sets can have different, similar, overlapping elements, etc.
Cytoscape.js is very suitable for the purpose you mention. There are already projects that have live, collaborative editors (similar to google docs, online office, etc but for graphs). For example, a simple one that I created is codenamed "Factoid" for biological processes. Though I really think it ought to have a better, more accurate name -- you can still look through the code for a live collaboration example with Cytoscape.js. Because you can listen to events easily, it's relatively straightforward to send diffs (or even just events) back and forth between the server and the client.
Adding an element is inexpensive: It just adds the single element and redraws if opportune. It's even cheaper with cy.batch() for modifying lots of elements in a row.

Can Parallel.ForEach be used safely with CloudTableQuery

I have a reasonable number of records in an Azure Table that I'm attempting to do some one-time data encryption on. I thought that I could speed things up by using a Parallel.ForEach. Also, because there are more than 1K records and I don't want to mess around with continuation tokens myself, I'm using a CloudTableQuery to get my enumerator.
My problem is that some of my records have been double encrypted and I realised that I'm not sure how thread safe the enumerator returned by CloudTableQuery.Execute() is. Has anyone else out there had any experience with this combination?
I would be willing to bet that Execute returning a thread-safe IEnumerator implementation is highly unlikely. That said, this sounds like yet another case for the producer-consumer pattern.
In your specific scenario I would have the original thread that called Execute read the results off sequentially and stuff them into a BlockingCollection<T>. Before you start doing that, though, you want to start a separate Task that will control the consumption of those items using Parallel.ForEach. You will probably also want to look into using the GetConsumingPartitioner method of the ParallelExtensions library in order to be most efficient, since the default partitioner will create more overhead than you want in this case. You can read more about this in this blog post.
An added bonus of using BlockingCollection<T> over a raw ConcurrentQueue<T> is that it offers the ability to set bounds, which can help keep the producer from adding more items to the collection than the consumers can keep up with. You will of course need to do some performance testing to find the sweet spot for your application.
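A rough sketch of that shape (the entity type, EncryptRecord and the query enumerable are placeholders; GetConsumingEnumerable is shown instead of GetConsumingPartitioner to keep the sketch dependency-free):

    // Producer/consumer sketch: only the single producer thread walks the
    // CloudTableQuery enumerator, while a consumer task parallelises the work.
    using System.Collections.Concurrent;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    class Worker
    {
        void EncryptRecord(object entity) { /* placeholder for the real per-record work */ }

        public void Run(IEnumerable<object> queryResults)   // e.g. the result of query.Execute()
        {
            // Bounded buffer: the producer blocks if the consumers fall behind.
            var buffer = new BlockingCollection<object>(boundedCapacity: 1000);

            // Consumer task: processes whatever lands in the buffer, in parallel.
            var consumer = Task.Factory.StartNew(() =>
                Parallel.ForEach(buffer.GetConsumingEnumerable(), EncryptRecord));

            // Producer: the only thread that touches the query's enumerator.
            foreach (var entity in queryResults)
                buffer.Add(entity);

            buffer.CompleteAdding();   // signal that no more items are coming
            consumer.Wait();
        }
    }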
Despite my best efforts I've been unable to replicate my original problem. My conclusion is therefore that it is perfectly OK to use Parallel.ForEach loops with CloudTableQuery.Execute().

Separating code logic from the actual data structures. Best practices? [closed]

I have an application that loads lots of data into memory (this is because it needs to perform some mathematical simulation on big data sets). This data comes from several database tables, that all refer to each other.
The consistency rules on the data are rather complex, and looking up all the relevant data requires quite a few hashes and other auxiliary data structures on top of the data.
Problem is that this data may also be changed interactively by the user in a dialog. When the user presses the OK button, I want to perform all the checks to see that he didn't introduce inconsistencies in the data. In practice all the data needs to be checked at once, so I cannot update my data set incrementally and perform the checks one by one.
However, all the checking code works on the actual data set loaded in memory and uses the hashing and other data structures. This means I have to do the following:
Take the user's changes from the dialog
Apply them to the big data set
Perform the checks on the big data set
Undo all the changes if the checks fail
I don't like this solution, since other threads are also continuously using the data set, and I don't want to halt them while performing the checks. The undo also means that the old situation needs to be kept aside, which is not feasible either.
An alternative is to separate the checking code from the data set (and let it work on explicitly supplied data, e.g. coming from the dialog), but this means the checking code cannot use the hashing and other additional data structures, because those only work on the big data set, making the checks much slower.
What is a good practice for checking a user's changes on complex data before applying them to the 'application's' data set?
This is probably not much help now, since your app is built, and you probably don't want to reimplement, but I'll mention it for reference.
Using an ORM framework would help you here. Not only does it handle getting the data from the database into an object-oriented representation, it also provides the tools to implement isolated temporary changes and views:
Using the ORM framework with transactions, you can allow the user to change the objects in the model without affecting other users, and without committing the data "for real" until it has been checked. The ACID guarantees of transactions ensure that your changes are not persisted to the database but held in your transaction, visible only to you. You can then run checks on the data and commit the transaction only if the data validates. If the data doesn't validate, you roll back the transaction and discard the changes. If it does validate, you commit the transaction and the changes are made permanent.
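A rough sketch of that flow, with a hypothetical ORM session API (ApplyUserChanges, ValidateAll and DialogResultData are placeholders; the exact calls depend on your framework, e.g. NHibernate or Entity Framework):

    // Validate inside the transaction; commit only if everything checks out,
    // otherwise roll back and leave the shared data untouched.
    void SaveDialogChanges(ISession session, DialogResultData dialogResult)
    {
        using (var transaction = session.BeginTransaction())
        {
            ApplyUserChanges(session, dialogResult);   // stage the edits inside the transaction only

            if (ValidateAll(session))
                transaction.Commit();                  // make the changes permanent
            else
                transaction.Rollback();                // discard everything
        }
    }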
Alternatively, you can create views which provide your data for validation. The views combine the base data and temporary tables (local to your current connection). This avoids locking tables, at the expense of having to write and maintain the views.
EDIT: If you already have a rich object model in memory, the hardest part of making it support incremental, local and isolated changes is the direct references between objects. When you want to replace object A with A', which contains a change, you don't want to do a deep copy with all references, since you mention that your object model is large. You also don't want to have to update every object that was pointing to A so that it references A'. As an example, consider a very large doubly linked list: it is not possible to create a new list that is the same as the old one with just one element changed, without duplicating the entire list.
You can achieve isolation by storing the identifier of related objects rather than the objects themselves. E.g. instead of referencing A explicitly, your collaborators store a reference to the unique key that identifies A, key(A). This key is used to fetch the actual object at the time it is needed (e.g. during verification). Your model then becomes a large map of keys to objects, which can be decorated for local changes. When looking up an object by key, first check the local map for a value, and if it is not found, check the universal map. To change A to A', you add an entry to the local map that maps key(A) to A'. (Note that A and A' have the same key, since logically they are the same item.) When you run your verification code, local changes are then incorporated, since objects referring to key(A) will get A', while other users using key(A) will get the original A.
This may sound complex written down, but removing explicit references and computing them on demand is the only way of supporting isolated updates without having to do a deep copy of the data.
An alternative, but equivalent, way is that your validator uses a map to look up objects' replacements before it uses them. E.g. your user modifies A, so you put A->A' into the map. The validator is iterating over the model and comes across A. Before using A, it checks the map and finds A', which it then uses. The difficulty of this approach is that you have to make sure you check the map every time before an object is used. If you miss one, then your view of the model will be inconsistent.
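A minimal sketch of that key-indirection/overlay idea (the type and member names are illustrative, not from your application):

    // Objects refer to each other by key; a small "local" overlay map shadows
    // entries in the shared base map, so pending changes never touch shared data.
    using System.Collections.Generic;

    public class OverlayModel<TKey, TValue>
    {
        private readonly IDictionary<TKey, TValue> _base;                                   // shared, unmodified data set
        private readonly Dictionary<TKey, TValue> _local = new Dictionary<TKey, TValue>();  // per-dialog pending changes

        public OverlayModel(IDictionary<TKey, TValue> baseMap) { _base = baseMap; }

        // Record "replace A with A'" without touching the shared data.
        public void Stage(TKey key, TValue replacement) => _local[key] = replacement;

        // Lookups see local changes first, then fall back to the shared data.
        public TValue Resolve(TKey key) =>
            _local.TryGetValue(key, out var changed) ? changed : _base[key];
    }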
I would try by any means to verify changes before applying them to the data set, as undoing the ripple effects of changes which later turn out to be invalid can easily become a nightmare.
If there is really a lot of data, I understand that creating a full copy of it may not be feasible - although in general "copy on write" would be the simplest and safest solution. If you really are only able to verify the changes by taking into account the whole set of data, you could try a "decorator"-like approach, i.e. somehow creating a "view" of the changes layered on top of the existing body of data, without actually modifying the latter. This could be used to validate the changes, and if the validation succeeds, you can actually apply the changes; otherwise you can simply throw away the "view" and the changes, without affecting the original data in any way.
Hmm, I would suggest copying the data in memory rather than working on the loaded data directly. This is expensive, but it allows you to work on all the data concurrently. When the changes to the data are valid, just apply them from the copy to the main data set using some locking strategy. This way you do not need any undo, as long as you can apply the changes atomically. You could even try some transaction system if your needs are more complex.
Also think about lazy-loading (copying) your data only as you really need it. Finally, what comes to my mind is that if you need to work on large data sets from databases using transactions, consider using Prolog. It might be reasonable to formulate your checks as predicates.
It sounds as if you should instead move the rules etc. to the database where they belong; by keeping the checks in your app you will always have issues. By placing as much of the logic as possible in, for instance, stored procedures that run when the user inserts the values, you could catch and roll back invalid input. But I guess you have your reasons for keeping it all in memory.

How to create a system-wide independent universal counter object, primarily for database keys?

I would like to create/use a system-wide independent universal 'counter object' that can be called via COM in a thread-safe manner.
The counter object will be passed an ID to identify which counter to return, handle the counting, 'persist' the count (occasionally), have reasonable performance (as fast as possible, perhaps capable of 1000 counts per second or better, i.e. 1 ms per count) and be accessible cross-process/out-of-process. The current count status must be persisted across object restarts/shutdowns.
The counter object is likely to be a 'singleton' type object implemented in some form of free-threaded dictionary, containing maybe 10 counters (perhaps 50 max). The count needs to be monotonic and consistent (i.e. guaranteed unique sequential values).
Each counter should have a few methods, like reset, inc, dec, set, clear, remove. As a luxury, I would like to have a variable increment (i.e. a 'step by' value). To support thread-safety, perhaps some form of critical-section or mutex call. It just needs to return a long/4-byte signed integer.
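To illustrate the kind of API surface I mean, here is a minimal in-process sketch (in C# for brevity - it is not the cross-process COM component I actually need, and all the names are just placeholders):

    // Illustrative counter store: one lock guards a dictionary of named counters.
    // A real solution would also need out-of-process access and persistence.
    using System.Collections.Generic;

    public class CounterStore
    {
        private readonly Dictionary<string, int> _counters = new Dictionary<string, int>();
        private readonly object _lock = new object();   // stand-in for a critical section / mutex

        public int Increment(string id, int step = 1)
        {
            lock (_lock)
            {
                int next = (_counters.TryGetValue(id, out var current) ? current : 0) + step;
                _counters[id] = next;
                return next;                             // monotonic, unique per counter id
            }
        }

        public void Set(string id, int value) { lock (_lock) _counters[id] = value; }
        public void Reset(string id)          { lock (_lock) _counters[id] = 0; }
        public void Remove(string id)         { lock (_lock) _counters.Remove(id); }
    }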
I really want something that can be called from anywhere, including VBScript, so I figure COM is my preferred solution.
The primary use of this is for database keys. I am unable to use autoinc or guid type keys and have ruled out database-generated counting systems at this point.
I've spent days researching this and I have really struggled to find a solution. The best I can find is a free-threaded dictionary object from Motobit that can be instantiated using COM+ - it seems to offer all the 'basics', and I guess I could create some form of wrapper around it.
So, here are my questions:
Does such a 'general purpose counter object' already exist? Can you direct me to it? (MS did do an IIS/ASP object called 'MSWC.Counter', but this isn't a 'cross-process'/out-of-process component and isn't thread-safe. But if it was, it would do!)
What is the best way of creating such a component? (I'd prefer VB6 right now, [don't ask!] but can do it in VB.NET 2005 if I had to.) I don't have the skills/knowledge/tools to use anything else.
I am desperate for a workable solution. I need specific guidance! If anybody can code something up for me I am prepared to pay for it.
Update:
What's wrong with GUIDs? a) 16 bytes if I'm lucky (binary storage), 32+ bytes if I'm not (ANSI without formatting), or even worse (64 bytes Unicode); b) I have a high-volume replicated app where the GUID is just too big compared to the actual row data; c) the overhead of indexing and inserts; d) I want a readable number! I only need a 4-byte integer, so why not try to get that? I know you will say that disk space is cheap, but for my application the cost is in slow inserts, and GUIDs don't help (I have tried/tested this), so I would prefer not to use them if I have a choice.
Autonumber/autoincs are evil: a) you don't get the value until after the insert, b) they are session-specific, c) they are easy to lose/screw up on a table alter, d) they are no good for multi-table inserts (it's not MS SQL Server), plus I have a need for counters outside my DB...
By the sound of it, what you're looking to create is an ActiveX EXE. They run in their own process but can be accessed from any other process by instantiating an object from it as though it were just another COM object. It handles all the marshalling necessary to sync its internal thread with the threads of any process calling it. Since all you are planning on using is integers, there's no need to worry about the thread safety of objects passed between the threads.
More than likely you can use the MSWC.Counter object within that ActiveX EXE and let it do the counter work.
A database engine is already very good at generating unique primary key values for a database table, either by marking the column auto-increment or by using a GUID. Trying to create your own is a grave mistake. System-wide is just not wide enough; it fails miserably when your app grows and more than one machine starts using the database.
Nevertheless, you can get what you want in VB6 by creating a COM server. It's been too long, I forgot the exact names of the project options - something resembling "single use".
I have implemented a similar solution as a REST web service - accessible from any technology that supports HTTP.
It is a simple C# backend implementation using a singleton pattern, and it will scale nicely under IIS.
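For illustration only, a minimal sketch of that idea using ASP.NET Core minimal APIs and an in-memory ConcurrentDictionary - the route and the lack of persistence are assumptions, not my actual implementation:

    // Process-wide counter store exposed over HTTP; each GET hands out the next value.
    using System.Collections.Concurrent;

    var builder = WebApplication.CreateBuilder(args);
    var app = builder.Build();

    // One shared, thread-safe map of counter name -> last issued value.
    var counters = new ConcurrentDictionary<string, long>();

    // GET /counter/orders  ->  next value for the "orders" counter
    app.MapGet("/counter/{id}", (string id) =>
        counters.AddOrUpdate(id, 1, (_, current) => current + 1));

    app.Run();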
The whole thing sounds like a twisted idea, so why should I not add another twisted one? :P
Host an old-skool ASP page.
You can use Application.Lock with a counter then, just like in the sample.
Added benefit: use it from any platform/language. (e.g. other HTML pages with XMLHttpRequest. :)
If you save the value to a file at, say, every 100th request, you do not even have to worry about IIS resets.
Just set the starting value to the last saved value + 100 in Application_OnStart. :P
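A generic sketch of that trick outside classic ASP (C#; the file name, interval and single-threaded use are my assumptions):

    // Persist only every Nth value; after a restart, resume at lastSaved + SaveInterval,
    // so an already issued value can never be handed out twice (at worst up to 99 values
    // are skipped). A real implementation would also need locking for concurrent callers.
    using System.IO;

    const int SaveInterval = 100;

    long counter = File.Exists("counter.txt")
        ? long.Parse(File.ReadAllText("counter.txt")) + SaveInterval
        : 0;

    long Next()
    {
        counter++;
        if (counter % SaveInterval == 0)
            File.WriteAllText("counter.txt", counter.ToString());   // cheap, infrequent persistence
        return counter;
    }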