Updating OpenFlow group table bucket list in OpenDaylight - openflow

I have a mininet (v2.2.2) network with openvswitch (v2.5.2), controlled by OpenDaylight Carbon. My application is an OpenDaylight karaf feature.
The application creates a flow (for multicasts) to a group table (type=all) and adds/removes buckets as needed.
To add/remove buckets, I first check if there is an existing group table:
InstanceIdentifier<Group> groupIid = InstanceIdentifier.builder(Nodes.class)
.child(Node.class, new NodeKey(NodId))
.augmentation(FlowCapableNode.class)
.child(Group.class, grpKey)
.build();
ReadOnlyTransaction roTx = dataBroker.newReadOnlyTransaction();
Future<Optional<Group>> futOptGrp = rwTx.read(LogicalDatastoreType.OPERATIONAL, groupIid);
If it doesn't find the group table, it is created (SalGroupService.addGroup()). If it does find the group table, it is updated (SalGroupService.updateGroup()).
The problem is that it takes some time after the RPC call add/updateGroup() to see the changes in the data model. Waiting for the Future<RPCResult<?>> doesn't guarantee that the data model has the same state as the device.
So, how do I read the group table and bucket list from the data model and make sure that I am indeed reading the same state as the current state of the device?
I know that
Add/UpdateGroupInputBuilder has setTransactionUri()
DataBroker gives transaction to read/write
you should use transaction chaining
But I cannot figure out how to combine these.
Thank you
EDIT: Or do I have to use write transactions in stead of RPC calls?

I dropped using RPC calls for writing flows and switched to using writes to the config datastore. It will still take some time to see the changes appear in the actual device and in the operational datastore but that is ok as long as I use the config datastore for both reads and writes.
However, I have to keep in mind that it is not guaranteed that changes to the config datastore will always make it to the actual device. My flows are not that complicated in the sense that conflicts are unlikely to happen. Still, I will probably check consistency between operational and configuration datastore.

Related

Avoid two-phase commits in a event sourced application saving BLOB data

Let's assume we have an Aggregate User which has a UserPortraitImage and a Contract as a PDF file. I want to store files in a dedicated document-based store and just hold process-relevant data in the event (with a link to the BLOB data).
But how do I avoid a two-phase commit when I have to store the files and store the new event?
At first I'd store the documents and then the event; if the first transaction fails it doesn't matter, the command failed. If the second transaction fails it also doesn't matter even if we generated some dead files in the store, the command fails; we could even apply a rollback.
But could there be an additional problem?
The next question is how to design the aggregate and the event. If the aggregate only holds a reference to the BLOB storage, what is the process after a SignUp command got called?
SignUpCommand ==> Store documents (UserPortraitImage and Contract) ==> Create new User aggregate with the given BLOB storage references and store it?
Is there a better design which unburdens the aggregate of knowing that BLOB data is saved in another store? And who is responsible for storing BLOB data and forwarding the reference to the aggregate?
Sounds like you are working with something analogous to an AtomPub media-entry/media-link-entry pair. The blob is going into your data store, the meta data gets copied into the aggregate history
But how do I avoid a two-phase commit when I have to store the files and store the new event?
In practice, you probably don't.
That is to say, if the blob store and the aggregate store happen to be the same database, then you can update both in the same transaction. That couples the two stores, and adds some pretty strong constraints to your choice of storage, but it is doable.
Another possibility is that you accept that the two changes that you are making are isolated from one another, and therefore that for some period of time the two stores are not consistent with each other.
In this second case, the saga pattern is what you are looking for, and it is exactly what you describe; you pair the first action with a compensating action to take if the second action fails. So "manual" rollback.
Or not - in a sense, the git object database uses a two phase commit; an object gets copied into the object store, and then the trees get updated, and then the commit... garbage collection comes along later to discard the objects that you don't need.
who is responsible for storing BLOB data and forwarding the reference to the aggregate?
Well, ultimately it is an infrastructure concern; does your model actually need to interact with the document, or is it just carrying a claim check that can be redeemed later?
At first I'd store the documents and then the event; if the first
transaction fails it doesn't matter, the command failed. If the second
transaction fails it also doesn't matter even if we generated some
dead files in the store, the command fails; we could even apply a
rollback. But could there be an additional problem?
Not that I can think of, aside from wasted disk space. That's what I typically do when I want to avoid distributed transactions or when they're not available across the two types of data stores. Oftentimes, one of the two operations is less important and you can afford to let it complete even if the master operation fails later.
Cleaning up botched attempts can be done during exception handling, as an out-of-band process or as part of a Saga as #VoiceOfUnreason explained.
SignUpCommand ==> Store documents (UserPortraitImage and Contract) ==>
Create new User aggregate with the given BLOB storage references and
store it?
Yes. Usually the Application layer component (Command handler in your case) acts as a coordinator betweeen the different data stores and gets back all it needs to know from one store before talking to the other or to the Domain.

configure parallel async event queue on replicated region in Gemfire

I'm trying to configure Gemfire/Geode in order to have an async event queue with parallel=true on a replicated region. However, I'm getting the following exception at startup:
com.gemstone.gemfire.internal.cache.wan.AsyncEventQueueConfigurationException: Parallel Async Event Queue myQueue can not be used with replicated region /myRegion
This (i.e. to prevent parallel queues on replicated regions) seems to be a design decision, but I can't understand why it is the case.
I have read all the documentation I've been able to find (primarily http://gemfire.docs.pivotal.io/docs-gemfire/latest/reference/book_intro.html and related docs),
and searched any kind of reference to this exception on the internet, but I didn't find any clear explanation on why I can't have an event listener on each member hosting a replicated region.
My conclusion is that I must be missing some fundamental concept about replicated regions and/or parallel queues, but since I can't find the appropriate documentation
on my own, I'm asking for an explanation and/or pointers to the right resources to read.
Thanks in advance.
EDIT : Let me put the question into context.
I have an external system sending data to my application using REST services, which are load balanced between nodes in order to maximize performance. Each of the nodes hosts the same regions (let's say 3, named A B and C). The data travels through all those regions (A to B to C) and is processed along the way. This means that region A hosts data that has just been received, region B data that has been partially processed and region C hosts data whose processing is complete.
I am using event listeners to process data and move it from region to region, and in case of the listener for region C, to export it to another external system.
All the listeners must (and I repeat, must) be transactional.
I also need horizontal scalability (i.e. adding nodes on the fly to increase throughput) and the maximum amount of data replication that can be possibily achieved.
Moreover, I want to run all of the nodes with the same gemfire configuration.
I have already tried to use partitioned regions, but they are not fit to my needs for a bunch of reasons that I won't explain here for the sake of brevity (just trust me, it is not currently possible).
So I thought that having all the nodes host the replicated regions could be the way, but I need all of them to be able to process events independently and perform region synchronization afterwards in an active/active scenario. It is my understanding that this requires event queues to be parallel, but it does not seem possible (by design).
So the (updated) question(s) are :
Is this scenario even possible? And if it is, how can I achieve it?
Any explanation and/or documentation, example, resource or anything else is more than welcome.
Again, thanks in advance.
An AsyncEventQueue is used to write data that arrives in GemFire to some other data store. You would ideally want to do this only once. Since the content of the replicated region is same on all the members of the system, you only need a Async event listener on one member, hence parallel=true is not supported.
For Partitioned regions, if you only had one member that hosts the AsyncQueue, then every single put to a partitioned region will also be routed through that member. This introduces a single point of contention in the system. The solution to this problem was introduction of parallel AsyncQueues, so that events on each member are only queued up locally in that member.
GemFire also supports CacheListeners, which are invoked on each member even for replicated regions, however, they are synchronous. You can introduce a thread pool in your CacheListener to get the same functionality.

Keeping Gemfire in Sync with a Database

We are developing an application that makes use of deep object models that are being stored in Gemfire. Clients will access a REST service to perform cache operations instead of accessing the cache via the Gemfire client. We are also planning to have a database underneath Gemfire as the system of record making use of asynchronous write-behind to do the DB updates and inserts.
In this scenario it seems to be a non-trivial problem to guarantee that an insert or update into Gemfire will result in a successfull insert or update in the DB without setting up elaborate server-side validation (essentially the constraints to the Gemfire operation would have to match the DB operation constraints). We cannot bubble the DB insert or update success/failure back to the client without making the DB call synchronous to the Gemfire operation. This would obviously defeat the purpose of using Gemfire for low latency client operations.
We are curious as to how other Gemfire adopters who are using write-behind have solved the problem of keeping the DB in sync with the Gemfire data fabric?
We generally recommend to implement validations in GemFire in a CacheWriter since by definition it's going to be called before any modification occurs on the cache.
http://gemfire.docs.gopivotal.com/javadocs/japi/com/gemstone/gemfire/cache/CacheWriter.html
That said, a pattern that I saw on a customer was to receive data on region "A" and implement basic validations there, accepting that data. This data will then be copied to the DB using write-behind as you describe, batched through AsyncEventListener where in the try/catch of the insertion, if any error occur they store that data in a another region that has another Listener for the write-behind, non-batched, so you can actually see which record failed and decide what to do accordingly.
Something like:
data -> cacheWriter basic checks -> region A -> AEL (batch 500~N events) -> DB
If error occurs copy to region B -> Listener persist individual records on DB ->
Do some action on the failed record.
In their case, they had an interface to manually clear pending records on region B.
Hope that helps, if not maybe you can provide more details about your use case...
Cheers.
You can register an AsyncEventListener and update your DB with batch updates. For more info:
http://pubs.vmware.com/vfabric53/topic/com.vmware.vfabric.gemfire.7.0/developing/events/implementing_write_behind_event_handler.html
Gemfire ver8 and above solves your problem of accessing Cache objects REST service.
Gemfire REST

Multiple iOS devices + centralized sqlite database

Can I share same sqlite database from 2 different iOS devices or update one from another?
Not easily. You can sync the db over iCloud but there is a high chance that data will be overwritten.
I recommend writing some sort of syncing tool that can detect and merge changes originating on either device.
A Common method is to use a timestamp column and get all rows modified since the last sync, then update the other database.
You can take a look at OpenMobster's Sync Service. It supports multi-device sync operations with conflict management.
Besides this it supports the following sync modes:
two-way
one-way device
one-way server
bootup
You can run in complete offline mode and the changes will be auto-tracked and synchronized with the network returns. At this time sync happens with the Cloud and other devices that hold the same data that you are holding locally.
The project is located at: http://openmobster.googlecode.com
You can look at the following iOS sync tutorial to get an idea of how this thing works:
http://code.google.com/p/openmobster/wiki/iPhoneSyncApp

Is this a good use-case for Redis on a ServiceStack REST API?

I'm creating a mobile app and it requires a API service backend to get/put information for each user. I'll be developing the web service on ServiceStack, but was wondering about the storage. I love the idea of a fast in-memory caching system like Redis, but I have a few questions:
I created a sample schema of what my data store should look like. Does this seems like it's a good case for using Redis as opposed to a MySQL DB or something like that?
schema http://www.miles3.com/uploads/redis.png
How difficult is the setup for persisting the Redis store to disk or is it kind of built-in when you do writes to the store? (I'm a newbie on this NoSQL stuff)
I currently have my setup on AWS using a Linux micro instance (because it's free for a year). I know many factors go into this answer, but in general will this be enough for my web service and Redis? Since Redis is in-memory will that be enough? I guess if my mobile app skyrockets (hey, we can dream right?) then I'll start hitting the ceiling of the instance.
What to think about when desigining a NoSQL Redis application
1) To develop correctly in Redis you should be thinking more about how you would structure the relationships in your C# program i.e. with the C# collection classes rather than a Relational Model meant for an RDBMS. The better mindset would be to think more about data storage like a Document database rather than RDBMS tables. Essentially everything gets blobbed in Redis via a key (index) so you just need to work out what your primary entities are (i.e. aggregate roots)
which would get kept in its own 'key namespace' or whether it's non-primary entity, i.e. simply metadata which should just get persisted with its parent entity.
Examples of Redis as a primary Data Store
Here is a good article that walks through creating a simple blogging application using Redis:
http://www.servicestack.net/docs/redis-client/designing-nosql-database
You can also look at the source code of RedisStackOverflow for another real world example using Redis.
Basically you would need to store and fetch the items of each type separately.
var redisUsers = redis.As<User>();
var user = redisUsers.GetById(1);
var userIsWatching = redisUsers.GetRelatedEntities<Watching>(user.Id);
The way you store relationship between entities is making use of Redis's Sets, e.g: you can store the Users/Watchers relationship conceptually with:
SET["ids:User>Watcher:{UserId}"] = [{watcherId1},{watcherId2},...]
Redis is schema-less and idempotent
Storing ids into redis sets is idempotent i.e. you can add watcherId1 to the same set multiple times and it will only ever have one occurrence of it. This is nice because it means you don't ever need to check the existence of the relationship and can freely keep adding related ids like they've never existed.
Related: writing or reading to a Redis collection (e.g. List) that does not exist is the same as writing to an empty collection, i.e. A list gets created on-the-fly when you add an item to a list whilst accessing a non-existent list will simply return 0 results. This is a friction-free and productivity win since you don't have to define your schemas up front in order to use them. Although should you need to Redis provides the EXISTS operation to determine whether a key exists or a TYPE operation so you can determine its type.
Create your relationships/indexes on your writes
One thing to remember is because there are no implicit indexes in Redis, you will generally need to setup your indexes/relationships needed for reading yourself during your writes. Basically you need to think about all your query requirements up front and ensure you set up the necessary relationships at write time. The above RedisStackOverflow source code is a good example that shows this.
Note: the ServiceStack.Redis C# provider assumes you have a unique field called Id that is its primary key. You can configure it to use a different field with the ModelConfig.Id() config mapping.
Redis Persistance
2) Redis supports 2 types persistence modes out-of-the-box RDB and Append Only File (AOF). RDB writes routine snapshots whilst the Append Only File acts like a transaction journal recording all the changes in-between snapshots - I recommend adding both until your comfortable with what each does and what your application needs. You can read all Redis persistence at http://redis.io/topics/persistence.
Note Redis also supports trivial replication you can read more about at: http://redis.io/topics/replication
Redis loves RAM
3) Since Redis operates predominantly in memory the most important resource is that you have enough RAM to hold your entire dataset in memory + a buffer for when it snapshots to disk. Redis is very efficient so even a small AWS instance will be able to handle a lot of load - what you want to look for is having enough RAM.
Visualizing your data with the Redis Admin UI
Finally if you're using the ServiceStack C# Redis Client I recommend installing the Redis Admin UI which provides a nice visual view of your entities. You can see a live demo of it at:
http://servicestack.net/RedisAdminUI/AjaxClient/