Currently I am learning Dapr, and I find it powerful for microservices development, but after reading more I found some limitations, the most important of which is state management.
How can we use tools like EF Core or Dapper to interact with databases in Dapr? Is the DaprClient class the only way to interact with databases?
You should continue to use those ORMs alongside Dapr, as it's not a replacement for them. ORMs already act as a translation layer (a repository layer) for relational data. Dapr state management is simple key-value pair (KVP) storage with some minimal query ability. Think of your state as a Dictionary<string, object>, where that object can be anything simple or complex; this is well suited to technologies like Redis, DynamoDB, MongoDB, Cassandra, etc. In short, what you would put in a cache is what you would put in state storage. So you can still (and should) have an ORM in your service for relational data, passing in runtime configuration for the provider, connection string, etc., while utilizing all the other features of Dapr via DaprClient, HttpClient, or gRPC clients.
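As a sketch of how the two coexist, here is a cache-aside style read that uses Dapr state in front of an ORM. It assumes a Dapr state store component named "statestore"; the Order entity, OrdersDbContext, and OrderService are illustrative names, not part of any SDK.

using System.Threading.Tasks;
using Dapr.Client;
using Microsoft.EntityFrameworkCore;

public class Order
{
    public int Id { get; set; }
    public string Status { get; set; } = "";
}

public class OrdersDbContext : DbContext
{
    public OrdersDbContext(DbContextOptions<OrdersDbContext> options) : base(options) { }
    public DbSet<Order> Orders => Set<Order>();
}

public class OrderService
{
    private readonly DaprClient _dapr;
    private readonly OrdersDbContext _db;

    public OrderService(DaprClient dapr, OrdersDbContext db)
    {
        _dapr = dapr;
        _db = db;
    }

    public async Task<Order?> GetOrderAsync(int id)
    {
        // KVP state store first, exactly as you would consult a cache.
        var cached = await _dapr.GetStateAsync<Order>("statestore", $"order-{id}");
        if (cached is not null)
            return cached;

        // Relational data stays behind the ORM, not behind Dapr.
        var order = await _db.Orders.FindAsync(id);
        if (order is not null)
            await _dapr.SaveStateAsync("statestore", $"order-{id}", order);

        return order;
    }
}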
Bonus: with many providers in EF Core/EF you can add tracing headers (https://github.com/opentracing-contrib/csharp-netcore) that will seamlessly plug into the tracing provided for free with Dapr. That way you will see a full end-to-end trace all the way through your state and relational data.
A simple question about scalability. I have been studying scalability and I think I understand the basic concept behind it. You use an orchestrator like Kubernetes to manage the automatic scaling of a system, so that as a particular microservice sees increased demand, the orchestrator creates new instances of it to handle that demand. Now, in our case, we are building a microservice structure similar to the example in Microsoft's eShopOnContainers:
Now, here each microservice has its own database to manage, just like in our application. My question is: when upscaling this system by creating new instances of a certain microservice, let's say the ordering microservice in the example above, wouldn't that create a new set of databases? In the case of our application, we are using SQLite, so each microservice has its own copy of the database. I would assume that being able to upscale such a system would require each microservice to connect to an external SQL Server. But if that were the case, wouldn't it be a bottleneck? I mean, having multiple instances of a microservice to handle more demand for a particular service, but with all those instances still accessing a single database server?
In the case of our application, we are using SQLite, so each microservice has its own copy of the database.
One of the most important aspects of services that scale out is that they are stateless; services on Kubernetes should be designed according to the twelve-factor principles. This means that service instances cannot have their own copy of the database, unless it is a cache.
I would assume that being able to upscale such a system would require each microservice to connect to an external SQL Server.
Yes. If you want to be able to scale out, you need to use a database that is outside the instances and shared between them.
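As a minimal sketch of what that looks like (assuming a .NET 6-style Program.cs; the ORDERS_DB environment variable and OrdersDbContext are hypothetical), each instance picks up the shared, external database from configuration instead of carrying its own file:

using Microsoft.EntityFrameworkCore;

var builder = WebApplication.CreateBuilder(args);

// Twelve-factor config: the connection string comes from the environment,
// so every scaled-out instance points at the same external server.
var connectionString = Environment.GetEnvironmentVariable("ORDERS_DB")
    ?? throw new InvalidOperationException("ORDERS_DB is not set");

builder.Services.AddDbContext<OrdersDbContext>(o => o.UseSqlServer(connectionString));

var app = builder.Build();
app.Run();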
But if that were the case, wouldn't it be a bottleneck?
This depends very much on how you design your system. Comparing microservices to monoliths: a monolith typically uses one big database for everything, but with microservices it is easier to use multiple different databases, so it should be much easier to scale out the database tier this way.
I mean, having multiple instances of a microservice to handle more demand for a particular service, but with all those instances still accessing a single database server?
There are many ways to scale the database system as well, e.g. caching read operations (but be careful with cache invalidation). This is a large topic in itself, though, and depends very much on what you do and how.
I am building a service to handle a large number of devices, for a large number of users.
We have a complex schema of access roles assigned to each entity. Some data entries can be written to by certain users, while some users can only read from some entities (but can write to others).
This is a cloud service: there are more devices and users than can be handled by a single server machine (we are using non-relational cloud databases for this).
I was wondering if there is an established cloud-scale user/role management backend which I could integrate to enforce the access rules, instead of writing my own. This tech should preferably be cloud agnostic, so I would prefer not to use a SaaS solution but to deploy my own.
I am looking for a system which can scale to millions of users and billions of data entities.
I think authentication is not going to be a big issue; there are very robust cloud-based solutions available for storing identities and authenticating millions of users. Authorization will be trickier and will depend a lot on how granular you want it to be. You could look at Apigee, for example, as a very scalable proxy that might help you implement this. So getting to the point where you have a token that can verify the user's identity, and that might contain some scopes, is not going to be hard IMO. If that is enough for you, then I would just look at Auth0, Okta, and the native IDM solution of whatever cloud platform you are using (Cognito, Cloud Identity, etc.).
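For example, once the provider has issued a token, enforcing a scope in an ASP.NET Core service can be as small as the following sketch. The "devices:write" scope name is invented, and this assumes scopes arrive space-delimited in a single "scope" claim, which is common but not universal:

using System.Linq;

services.AddAuthorization(options =>
{
    // Allow the operation only when the token carries the required scope.
    options.AddPolicy("CanWriteDevices", policy =>
        policy.RequireAssertion(ctx =>
            ctx.User.FindFirst("scope")?.Value
               .Split(' ')
               .Contains("devices:write") == true));
});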
I think you will find that more features come with a very hefty price tag. Auth0 is far superior to Cognito, but Cognito still has enough features for basic use cases and will end up costing a fraction of Auth0 in large deployments. So everything comes with pros and cons. If you have very complex requirements, such as a bunch of big legacy repositories that you need to integrate, then products like Auth0 rapidly start looking more attractive.
Personally I would look at Auth0, Cognito, and Apigee, and my decision would depend massively on parameters that you haven't mentioned in your question. Obviously these are all SaaS solutions, which I think you should definitely be using anyway. I would not host this myself unless I had no other choice; going that route will radically limit your choices and probably increase expenses. All the cool stuff is happening in the cloud.
Running ASP.NET Core 2.x, I'm trying to register two types of IDistributedCache services (both ship in the box).
I'd like to use the SQL distributed cache for sessions and the local distributed memory cache for everything else (eventually this will be a Redis cache, but that's irrelevant at the moment).
// Distributed cache used for sessions (SQL Server-backed IDistributedCache)
services.AddDistributedSqlServerCache(o =>
{
    o.ConnectionString = dbCs;
    o.SchemaName = "dbo";
    o.TableName = "Sessions";
});

// In-memory IDistributedCache for everything else
services.AddDistributedMemoryCache();
As it stands, both of these resolve to IDistributedCache. Is it possible to direct the Sessions middleware to use the SQL cache and everything else to resolve to DistributedMemoryCache?
Out of the box, no, this is not going to be possible. Microsoft.Extensions.DependencyInjection is very basic. You can register multiple implementations of an interface, but the only way to logically use them at that point is to inject all of the implementations, via something like IEnumerable<IDistributedCache> in this case. The distributed caching machinery doesn't support that, though.
It might, and I stress might, be possible if you use a more advanced DI container like Autofac, but I've never tried this scenario, so I cannot say for sure.
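One workaround that might do it without swapping containers, though I haven't verified it end to end, is to build the SQL-backed cache manually and hand it only to the session machinery through its ISessionStore seam, leaving the in-memory cache as the sole IDistributedCache registration. AddSession uses TryAdd semantics, so a prior ISessionStore registration should win:

using Microsoft.AspNetCore.Session;
using Microsoft.Extensions.Caching.SqlServer;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.Extensions.Options;

// The only IDistributedCache in the container: in-memory, for everything else.
services.AddDistributedMemoryCache();

// Sessions get their own, manually constructed SQL-backed cache.
services.AddSingleton<ISessionStore>(sp =>
{
    var sqlCache = new SqlServerCache(Options.Create(new SqlServerCacheOptions
    {
        ConnectionString = dbCs,
        SchemaName = "dbo",
        TableName = "Sessions"
    }));
    return new DistributedSessionStore(sqlCache, sp.GetRequiredService<ILoggerFactory>());
});

services.AddSession();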
That said, it should be noted that caching isn't designed to be segregated like this. Session storage in ASP.NET Core uses the cache because session data is transient by nature and lends itself more readily to ejecting expired or no-longer-relevant entries than a relational database does, notwithstanding the fact that you're actually backing the cache with a relational database in this circumstance. Regardless, you're intended to have just one cache, not multiple caches with different backing stores. Session storage will work just as well in Redis (if not better), so I don't see a strong need for this anyway.
I'm creating a microservice architecture with .NET Core, RabbitMQ, the strangler pattern... but I have to use an existing SQL database (a transaction requirement).
Doing research, I haven't found much information about how to implement the SQL database, and I think it's impossible to do a transactional operation across different services at the same time.
1- Must every service have access to the entire database?
2- Is it a good idea to have a service exclusively for transactional operations?
3- Is SQL too slow for microservices?
I don't know if a standard exists for this.
Thanks.
The whole point of microservices is to have small, independent services that are decoupled as much as possible.
Sharing a common database introduces very strong coupling, and is not recommended.
If two services need the same data, you could either (a) have a different database for each, and replicate the data, or (b) introduce a third service that is responsible for access to the database.
If you're looking for a bigger-scale distributed transaction across microservices, then you should look into things like sagas. Typically you'll have a coordinator ("process manager" in some literature) that tracks the various operations and can compensate for or cancel actions that have been performed if the transaction as a whole is bound to fail.
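As a rough illustration of the compensation idea only (every name here is made up, and a real saga would also need persistence so it can survive a coordinator crash):

using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public class SagaCoordinator
{
    private readonly List<(Func<Task> Action, Func<Task> Compensate)> _steps = new();

    public SagaCoordinator AddStep(Func<Task> action, Func<Task> compensate)
    {
        _steps.Add((action, compensate));
        return this;
    }

    public async Task ExecuteAsync()
    {
        var completed = new Stack<(Func<Task> Action, Func<Task> Compensate)>();
        try
        {
            foreach (var step in _steps)
            {
                await step.Action();
                completed.Push(step);
            }
        }
        catch
        {
            // Undo the steps that already succeeded, in reverse order.
            while (completed.Count > 0)
                await completed.Pop().Compensate();
            throw;
        }
    }
}

Each compensation is a business-level undo (cancel the reservation, refund the payment), not a database rollback; that is what lets it span service boundaries.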
3- Is SQL too slow for microservices?
What makes you think so?
There is nothing about SQL that makes it inadequate for microservices. Microservices may vary wildly in terms of what they do and what they require. SQL will be perfectly suitable for some microservices, and possibly not so suitable for others. It depends on the service.
It looks like you need distributed transactions in your system:
https://msdn.microsoft.com/en-us/library/windows/desktop/ms681205(v=vs.85).aspx
There is also a nice book devoted to microservices; it covers distributed transactions and other patterns used in microservice-based apps:
http://shop.oreilly.com/product/0636920033158.do
1- Must every service have access to the entire database?
No. A microservice has its own schema related to the aggregate root / service that it offers. If a service needs another entity's data, it invokes the APIs provided by the other microservice.
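For instance, rather than joining against another service's tables, the call goes through its API; the URL, DTO, and id below are purely illustrative:

using System;
using System.Net.Http;
using System.Net.Http.Json;

var customerId = 42; // illustrative
var http = new HttpClient { BaseAddress = new Uri("http://customers-service") };

// Reach the customers service through its API, not through its schema.
var customer = await http.GetFromJsonAsync<CustomerDto>($"/api/customers/{customerId}");

public record CustomerDto(int Id, string Name);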
2- Is it a good idea to have a service exclusively for transactional operations?
No. Each microservice is a transaction boundary in its own right. Distributed transactions, particularly those using two-phase commit (2PC), do not perform particularly well.
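So a transaction stays inside one service's own schema, sketched here with a hypothetical EF Core context; anything another service must do in response is triggered by an event afterwards rather than enlisted in the same transaction:

using System.Threading.Tasks;
using Microsoft.EntityFrameworkCore;

public static async Task PlaceOrderAsync(OrdersDbContext db)
{
    // Begins and commits entirely within this service's database.
    await using var tx = await db.Database.BeginTransactionAsync();

    db.Orders.Add(new Order { Status = "Placed" });
    await db.SaveChangesAsync();

    await tx.CommitAsync();
}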
3- Is SQL too slow for microservices?
I am not totally clear as to why you make such a statement.
We've got a smart client that talks to a SQL Server database via WCF, displaying the entities in the database, and allowing the user to edit those entities.
Some of the WCF calls return a large data set. Since this data set doesn't change very often, I'm considering some sort of write-through cache on the client, and only getting the deltas from the WCF service.
That is: the client both reads from the service and writes to the service.
I'm not looking for disconnected/offline operation, but since the majority of the data doesn't change very often, I'd probably implement this with a local data store.
I don't want the local store to get too stale, and I don't think I'm too concerned about conflict resolution, because updates will always go straight to the WCF service -- think of it as a write-through cache.
Would Microsoft's Sync Framework be good for this? Could I use a local SQL CE cache and perform the updates over WCF? The service end has a SQL Server 2005/2008 backend, but I don't want to talk to it directly. Does Sync Framework integrate well with WCF?
Are there other solutions out there? Should I roll something myself?
I don't think you have to couple it to WCF at all. FeedSync allows you to publish directly to an RSS feed.
The only thing I'm not too sure about is whether it would be suitable for a "large dataset", though. Since you don't need two-way replication, if your dataset is extremely large you might want to write your own WCF implementation to optimize it, especially for the initial population.
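A hand-rolled version could be as simple as a delta contract keyed on a high-water mark; everything below is invented for illustration:

using System;
using System.Collections.Generic;
using System.Runtime.Serialization;
using System.ServiceModel;

[DataContract]
public class Entity
{
    [DataMember] public Guid Id { get; set; }
    [DataMember] public string Payload { get; set; }
}

[DataContract]
public class EntityDelta
{
    [DataMember] public List<Entity> ChangedEntities { get; set; }
    [DataMember] public List<Guid> DeletedIds { get; set; }
    [DataMember] public DateTime ServerTimeUtc { get; set; } // the next high-water mark
}

[ServiceContract]
public interface IEntitySync
{
    // The client stores the last ServerTimeUtc it received and asks
    // only for what changed since then.
    [OperationContract]
    EntityDelta GetChangesSince(DateTime lastSyncUtc);
}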