How to handle multiple storage backends transparently

I'm working with an application right now that uses a third-party API for handling some batch email-related tasks, and in order for that to work, we need to store some information in this service. Unfortunately, this information (first/last name, email address) is also something we want to use from our application. My normal inclination is to pick one canonical data source and stick with it, but round-tripping to a web service every time I want to look up these fields isn't really a viable option (we use some of them quite a bit), and the service's API requires the records to be stored there, so the duplication is sadly necessary.
But I have no interest in peppering every method throughout our business classes with code to synchronize data to the web service any time they might be updated, and I also don't think my entity should be aware of the service to update itself in a property setter (or whatever else is updating the "truth").
We use NHibernate for all of our DAL needs, and to my mind, this data replication is really a persistence issue. So I've whipped up a PoC implementation using an EventListener (both PostInsert and PostUpdate) that checks whether the entity is of type X and any of fields [Y..Z] have changed, and if so updates the web service with the new state.
I feel like this strikes a good balance: our data stays the canonical source, it gets replicated transparently, and there is little chance for changes to fall through the cracks and leave us with a mismatch. (It's not the end of the world if, e.g., the service is unreachable, since we can just do a manual batch update later, but for everybody's sanity the goal is that we never have to think about it in the general case.) Still, my colleagues and I have some discomfort with this way forward.
Is this a horrid idea that will invite raptors into my database at inopportune times? Is it a totally reasonable thing to do with an EventListener? Is it a serviceable solution to a less-than-ideal situation that we can just make do with and move on forever tainted? If we soldier on down this road, are there any gotchas I should be wary of in the Events pipeline?
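For reference, the "have any of fields [Y..Z] changed" check can be sketched without touching NHibernate at all. As I understand it, NHibernate's post-update event hands the listener the old and new property values as parallel arrays (OldState and State) plus the property names from the persister; the helper below only assumes three such arrays. ChangeDetector and the watched set are hypothetical names for illustration, not part of NHibernate.

```csharp
using System.Collections.Generic;

public static class ChangeDetector
{
    // Returns true when any watched property differs between the two snapshots.
    // propertyNames, oldState and newState are parallel arrays (same index = same property).
    public static bool AnyWatchedPropertyChanged(
        string[] propertyNames,
        object[] oldState,
        object[] newState,
        ISet<string> watched)
    {
        // No previous snapshot (for example a fresh insert): treat it as changed.
        if (oldState == null)
        {
            return true;
        }

        for (int i = 0; i < propertyNames.Length; i++)
        {
            if (watched.Contains(propertyNames[i]) &&
                !object.Equals(oldState[i], newState[i]))
            {
                return true;
            }
        }
        return false;
    }
}
```

A listener would call this with the event's arrays and only talk to the web service when it returns true, which keeps unrelated updates from triggering a remote call.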

When one of the data stores is unreliable (the web service in your case), I would introduce the concept of operations (transactions): store them in the local database, then periodically pull them from the DB and execute them against the web service (the other data store).
Something like this:
public enum Operation { Insert, Update, Delete }

public class OperationContainer
{
    public Operation Operation; // whatever operations you need: CRUD, or something more specific
    public object Data;         // your entity, business object or whatever
}

public class MyMailService
{
    public void SendMail(MailBusinessObject data)
    {
        // Save the entity, then queue the operation that still has to reach the web service.
        DataAccessLayer<MailBusinessObject>.Persist(data);
        OperationContainer operation = new OperationContainer { Operation = Operation.Insert, Data = data };
        DataAccessLayer<OperationContainer>.Persist(operation);
    }
}

public class Updater
{
    Timer EverySec;

    public void OnEverySec()
    {
        var data = DataAccessLayer<OperationContainer>.GetFirstIn(); // FIFO
        if (data == null) return; // nothing queued
        var webServiceData = WebServiceData.Convert(data); // do the logic to prepare data for the web service
        try
        {
            new WebService().DoSomething(webServiceData);
            DataAccessLayer<OperationContainer>.Remove(data);
        }
        catch (Exception)
        {
            // Web service unreachable: leave the operation queued and retry on the next tick.
        }
    }
}
This is actually pretty close to the concept of a smart client (technically, if not logically). Take a look at the book .NET Domain-Driven Design with C#: Problem-Design-Solution, chapter 10, or at the source code from the book, which is pretty close to your situation: http://dddpds.codeplex.com/

Related

Is it OK to flush the EntityManager from within a domain Service?

I have a domain service called OrderService, with a saveOrder() method:
class OrderService
{
    // ...
    public function saveOrder(Order $order)
    {
        $this->orderRepository->add($order);
        // $this->entityManager->flush();
        $this->notificationService->notifyOrderPlaced($order);
    }
}
saveOrder() adds the order to the repository (which internally calls persist() on the EntityManager), then passes the Order to the NotificationService to send appropriate notifications (email, SMS).
The problem is, while NotificationService needs the order ID to include in the notifications, the Order has no ID yet as it's not been persisted to the DB (the ID is auto generated).
The obvious solution seems to pass the EntityManager as a dependency to the OrderService, and flush() right after the repository add() method, as in the example above. But I've always been reluctant to make the domain Services aware of the EntityManager, preferring to let them talk only to repositories, or other services.
What are the drawbacks, if any, of a domain Service having a dependency on the EntityManager?
Is there a better alternative?
Note: I'm using PHP and the Doctrine ORM, but I believe the same principles apply to Java & Hibernate as well.

You may want to consider one of these options (or both):
1. Make this service an Application layer service instead of a Domain service. It's perfectly OK to call your change tracker in an Application service, since it is supposed to know about the application context and progress in the current use case. Typical application services will commit the business transaction or ask the change tracker to save changes when they're done, so why not call it to generate IDs as well?
   If you're concerned about the database being involved in the middle of a use case, maybe you can find an equivalent to NHibernate's Guid.Comb strategy to make your ORM generate an ID without issuing an INSERT to the database right away.
2. Use a Domain event. Upon creation, an Order could inform the world that it has been newed up. The notification service would handle the event and send appropriate notifications. You'll find an example of that here (it also includes an Application layer service to take care of the business transaction).
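To make the two suggestions concrete, here is a minimal sketch in C# (the question uses PHP, but the shape carries over). It assumes the order ID is generated in memory with a plain Guid.NewGuid(), which is not the same thing as a comb GUID, and that events are collected on the entity and dispatched by an application service after the save. OrderPlaced, OrderApplicationService and the two delegates are hypothetical stand-ins for the repository/commit and the notification service.

```csharp
using System;
using System.Collections.Generic;

public sealed class OrderPlaced
{
    public Guid OrderId { get; }
    public OrderPlaced(Guid orderId) { OrderId = orderId; }
}

public class Order
{
    private readonly List<object> _events = new List<object>();

    // The identifier is assigned in memory, so it exists before any INSERT.
    public Guid Id { get; } = Guid.NewGuid();

    public IReadOnlyList<object> PendingEvents => _events;

    public Order()
    {
        // Record the event on creation; nothing is sent yet.
        _events.Add(new OrderPlaced(Id));
    }

    public void ClearEvents() { _events.Clear(); }
}

public class OrderApplicationService
{
    private readonly Action<Order> _save;          // stands in for repository add + commit
    private readonly Action<OrderPlaced> _notify;  // stands in for the notification service

    public OrderApplicationService(Action<Order> save, Action<OrderPlaced> notify)
    {
        _save = save;
        _notify = notify;
    }

    public void PlaceOrder(Order order)
    {
        _save(order);

        // Dispatch only after the save went through, so no notification
        // is sent for an order that was never stored.
        foreach (object e in order.PendingEvents)
        {
            if (e is OrderPlaced placed)
            {
                _notify(placed);
            }
        }
        order.ClearEvents();
    }
}
```

The domain object never sees the change tracker or the notification service; only the application service knows when the business transaction is done.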

Need some advice for a web service API?

My company has a product that I feel can benefit from a web service API. We are using MSMQ to route messages back and forth through the backend system. Currently we are building an ASP.Net application that communicates with a web service (WCF) that, in turn, talks to MSMQ for us. Later on down the road, we may have other client applications (not necessarily written in .Net). The message going into MSMQ is an object that has a property made up of an array of strings. There is also a property that contains the command (a string) that will be routed through the system. Personally, I am not a huge fan of this, but I was told it is for scalability and every system can use strings.
My thought, regarding the web services was to model some objects based on our data that can be passed into and out of the web services so they are easily consumed by the client. Initially, I was passing the message object, mentioned above, with the array of strings in it. I was finding that I was creating objects on the client to represent that data, making the client responsible for creating those objects. I feel the web service layer should really be handling this. That is how I have always worked with services. I did this so it was easier for me to move data around the client.
It was recommended to our group that we should maintain the “single entry point” into the system by offering an object that contains commands and have one web service to take care of everything. So, the web service would have one method in it, let’s call it MakeRequest, and it would return an object (either serialized XML or JSON). The suggestion was to have a base object that may contain some sort of list of commands that other objects can inherit from. Any other object may have its own command structure, but still inherit base commands. What is passed back from the service is not clear right now, but it could be that “message object” with an object attached to it representing the data. I don’t know.
My recommendation was to model our objects after our actual data and create services for the types of data we are working with. We would create a base service interface that would house any common methods used for all services. So for example, GetById, GetByName, GetAll, Save, etc. Anything specific to a given service would be implemented for that specific implementation. So a User service may have a method GetUserByUsernameAndPassword, but since it implements the base interface it would also contain the “base” methods. We would have several methods in a service that would return the type of object expected, based on the service being called. We could house everything in one service, but I still would like to get something back that is more usable. I feel this approach leaves the client out of making decisions about what commands to be passed. When I connect to a User service and call the method GetById(int id) I would expect to get back a User object.
I had the luxury of working with MS when I started developing WCF services. So, I have a good foundation and understanding of the technology, but I am not the one designing it this time.
So, I am not opposed to the “single entry point” idea, but any thoughts about why either approach is more scalable than the other would be appreciated. I have never worked with such a systematic approach to a service layer before. Maybe I need to get over that?

I think there are merits to both approaches.
Typically, if you are writing an API that is going to be consumed by a completely separate group of developers (perhaps in another company), then you want the API to be as self-explanatory and discoverable as possible. Having specific web service methods that return specific objects is much easier to work with from the consumer's perspective.
However, many companies use web services as one of many layers to their applications. In this case, it may reduce maintenance to have a generic API. I've seen some clever mechanisms that require no changes whatsoever to the service in order to add another column to a table that is returned from the database.
My personal preference is for the specific API. I think that the specific methods are much easier to work with - and are largely self-documenting. The specific operation needs to be executed at some point, so why not expose it for what it is? You'd get laughed at if you wrote:
public void MyApiMethod(string operationToPerform, params object[] args)
{
    switch(operationToPerform)
    {
        case "InsertCustomer":
            InsertCustomer(args);
            break;
        case "UpdateCustomer":
            UpdateCustomer(args);
            break;
        ...
        case "Juggle5BallsAtOnce":
            Juggle5BallsAtOnce(args);
            break;
    }
}
So why do that with a Web Service? It'd be much better to have:
public void InsertCustomer(Customer customer)
{
    ...
}
public void UpdateCustomer(Customer customer)
{
    ...
}
...
public void Juggle5BallsAtOnce(bool useApplesAndEatThemConcurrently)
{
    ...
}

WebService to have separate methods for populating properties in composite class?

I'm building a web service to retrieve a list of composite objects. Should a complex sub-property of each object in the list be populated at once, or is it OK to have the client request that info as needed?
Example:
class InvoiceLine
{
    string ServiceDescription;
    decimal ChargeTotal;
}
class Invoice
{
    string InvoiceNumber;
    string CustomerNumber;
    List<InvoiceLine> InvoiceLines;
}
// Returns all invoices sent to this customer
public List<Invoice> GetInvoices(string customerNumber);
Is it bad design to have another method in the WebService:
public List<InvoiceLine> GetInvoiceLines(string invoiceNumber)
and require the client to first get a list of all saved invoices (with an empty list of InvoiceLines in them), and then expect them to call:
invoices[0].InvoiceLines = webService.GetInvoiceLines(invoices[0].InvoiceNumber);
to simulate a "lazy-load"?
This seems like a good way to save on volume but at the expense of more calls to get data when needed. Is this worth it or is that an anti-pattern of some sort?
It just doesn't seem right to return a half-populated object...
Thanks in advance for any pointers / links.

Service interface granularity is always an important decision when designing your services. Since web service calls are usually network calls or, at the very least, out-of-process calls, they are relatively expensive. If your service interface is too fine-grained (or "chatty"), this can affect performance. Of course, you need to consider your application and your requirements when determining the correct level of granularity. The article "Software components: Coarse-grained versus fine-grained", although a little dry, has some good information.
"It just doesn't seem right to return a half-populated object..."
I agree.
In your case let's assume that InvoiceLines is expensive to retrieve and not required most of the time. If that is the case you can encapsulate the invoice summary or header and return that information. However, if the consumer needs the full invoice then offer them the ability to do that.
class InvoiceLine
{
    string ServiceDescription;
    decimal ChargeTotal;
}
class Invoice
{
    InvoiceSummary InvoiceSummary;
    List<InvoiceLine> InvoiceLines;
}
class InvoiceSummary
{
    string InvoiceNumber;
    string CustomerNumber;
}
public List<InvoiceSummary> GetInvoiceSummaries(string customerNumber);
public Invoice GetInvoice(string invoiceNumber);
In general, I prefer to avoid imposing a sequence of calls on the consumer of the service (i.e. first you have to get the Invoice and then you have to get the details). This can lead to performance issues and also tighter coupling between the applications.

Web services should not be "chatty": you're communicating over a network, and any request/response exchange costs time. You don't want to require the client to ask ten times when he could ask once.
Let the client request all of the data at once.
If that turns out to be a performance problem, then allow the client to specify how much of the data they want, but it should still all be returned in a single call.
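One way to read "let the client specify how much data they want" is a flag on the single call. The sketch below reuses the Invoice and InvoiceLine shapes from the question; the InvoiceService class, the in-memory list standing in for the real data source, and the includeLines parameter are assumptions for illustration.

```csharp
using System.Collections.Generic;
using System.Linq;

public class InvoiceLine
{
    public string ServiceDescription;
    public decimal ChargeTotal;
}

public class Invoice
{
    public string InvoiceNumber;
    public string CustomerNumber;
    public List<InvoiceLine> InvoiceLines = new List<InvoiceLine>();
}

public class InvoiceService
{
    private readonly List<Invoice> _store; // stands in for the real data source

    public InvoiceService(List<Invoice> store) { _store = store; }

    // One round trip either way; the caller decides how much detail comes back.
    public List<Invoice> GetInvoices(string customerNumber, bool includeLines)
    {
        return _store
            .Where(i => i.CustomerNumber == customerNumber)
            .Select(i => new Invoice
            {
                InvoiceNumber = i.InvoiceNumber,
                CustomerNumber = i.CustomerNumber,
                // null means "not requested", which is distinct from an empty invoice.
                InvoiceLines = includeLines ? new List<InvoiceLine>(i.InvoiceLines) : null
            })
            .ToList();
    }
}
```

Returning null rather than an empty list for the lines keeps "not requested" distinguishable from "this invoice has no lines", which addresses part of the unease about half-populated objects.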

I don't think it's bad design; I think it's a trade-off that you have to consider. Which is more important: returning a complete object graph, or the potential transmission savings?
If, for example, you're more likely to have customers with several hundred or thousands of InvoiceLines and you only need those in very specific circumstances, then it seems very logical to split them up in the way you mention.
However, if you almost always need the InvoiceLines, and you're only usually talking 10 instances, then it makes more sense to always send them together.

A WCF service which deals with large objects

Viewing WCF in its use as a way to do RPC between remote PCs you can nicely just send an object as a method parameter. This is easy to code but means whenever the object changes you send the whole thing, and also potentially means the receiver has to have extra logic to only act on changed fields. Or you can have a class which has one method per attribute on the object. This fine-grained approach is great for performance if you have a large class and normally only change one attribute. But it's a lot more code to write, and you have to maintain it every time the object gains another attribute.
Is there a better approach which can avoid having to write a load of copy-paste methods for each attribute, but also only sends attributes that actually change? Can we auto-generate the WCF service methods from a class/interface or something?
For example, say we have the following (pseudo) classes, and the aim is for two applications to keep in sync about people (I added a complex attribute, a List<Pet>, to make it a bit more like real life):
class Pet
{
    String name;
    AnimalType type;
}
class Person
{
    int age;
    float height;
    string name;
    List<Pet> pets;
}

WCF by itself does not do that. There are many approaches to figuring out changes, but in most cases it's the developer's duty.
The only predefined solution I could find is ADO.NET Data Services. This is actually a RESTful WCF service wrapper for the Entity Framework data context from Microsoft. To be honest, you can actually use it with more than just EF. On the client side you get a context that tracks changes. When you submit changes, the client sends only the concrete changes. But this limits you to HTTP transport and XML or JSON serialization, which does hurt performance on big objects.
There could also be some sort of event-driven solution, where you send a command to the server with some metadata.
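As a rough illustration of one hand-rolled approach to figuring out changes, the sketch below diffs two snapshots of an object by reflection and produces a name-to-value map that could be sent instead of the whole object. Delta, Compute and Apply are hypothetical names, the Person class is the question's example turned into properties, and nothing here is a WCF feature.

```csharp
using System.Collections.Generic;
using System.Reflection;

public static class Delta
{
    // Compares the public instance properties of two snapshots and returns
    // property name -> new value for every property whose value differs.
    public static Dictionary<string, object> Compute<T>(T before, T after)
    {
        var changes = new Dictionary<string, object>();
        foreach (PropertyInfo p in typeof(T).GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            if (!p.CanRead || p.GetIndexParameters().Length > 0)
            {
                continue; // skip write-only properties and indexers
            }
            object oldValue = p.GetValue(before, null);
            object newValue = p.GetValue(after, null);
            if (!object.Equals(oldValue, newValue))
            {
                changes[p.Name] = newValue;
            }
        }
        return changes;
    }

    // Receiver side: writes the changed values onto an existing object.
    public static void Apply<T>(T target, IDictionary<string, object> changes)
    {
        foreach (KeyValuePair<string, object> change in changes)
        {
            PropertyInfo p = typeof(T).GetProperty(change.Key);
            if (p != null && p.CanWrite)
            {
                p.SetValue(target, change.Value, null);
            }
        }
    }
}

public class Person
{
    public int Age { get; set; }
    public float Height { get; set; }
    public string Name { get; set; }
}
```

Collection properties such as a list of pets compare by reference here, so they would need their own diffing, and the sender has to keep the "before" snapshot around, which is its own kind of overhead.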

However you do it there is going to be overhead. It's up to you to decide what sort of overhead is most acceptable to you. Possible approaches:
1. Ignore the problem and always send the full entity. The overhead here is the sheer amount of data being sent.
2. Use ADO.NET Data Services. The overhead here is the data context, change tracking, and general "chattiness" of it all.
3. Re-design your contracts to reduce the amount of data being passed. The overhead here is the additional complexity of the service interface.
Example of option 3:
class Person {
    string Name;
    PersonalData PersonalData;
    MedicalData MedicalData;
    List<Pet> Pets;
}
class PersonalData {
    int Age;
    string SSN;
}
class MedicalData {
    float Weight;
    float Height;
}
class Pet {
    string Name;
    AnimalType Type;
}
interface IPerson {
    void Update(Person data, bool includePersonalData, bool includeMedicalData, bool includePets);
}
In the client code, if you don't want to update medical data, then you can pass false to the update method and not have to bother instantiating a MedicalData object in the data. This cuts down on network traffic since the corresponding element in the InfoSet will be missing.

The solution really depends on what your binding constraints are. If you are forced to basicHttp bindings then ADO.Net DataServices might be the best approach as stated by Pavel and Christian. However, if NetTcp and other more complex bindings (WS*) are available, you could look into Reliable Messaging with Ordered Delivery. You could break down your responses into smaller chunks and put them back together on the other end. Also look into Streamed vs. Buffered transfer. Of course this requires a lot more work than ADO.Net DataServices but that makes it more fun, non?
Also, keep in mind Contract first development. Using parameterized methods in a web service will constrain you down the road and any changes you want to make will force a new version, even for any little change (e.g., an additional field returned).

Best way to share data between .NET application instance?

I have created a WCF service (hosted in a Windows service) on load-balanced servers. Each service instance maintains a list of its current users, e.g. instance A has users A001, A002 and A005, instance B has users A003, A004 and A008, and so on.
Each service exposes an interface method for getting the user list, and I expect this method to return all users across all service instances. E.g. getting the user list from instance A or instance B should return A001, A002, A003, A004, A005 and A008.
Currently I plan to store the list of current users in a database, but this list seems to be updated very often.
Is there another way to share data between WCF service instances that suits my situation?

Personally, the database option sounds like overkill to me just based on the notion of storing current users. If you are actually storing more than that, then using a database may make sense. But assuming you simply want a list of current users from both instances of your WCF service, I would use an in-memory solution, something like a static generic dictionary. As long as the services can be uniquely identified, I'd use the unique service ID as the key into the dictionary and just pair each key with a generic list of user names (or some appropriate user data structure) for that service. Something like:
private static Dictionary<Guid, List<string>> _currentUsers;
Since this dictionary would be shared between two WCF services, you'll need to synchronize access to it. Here's an example.
using System;
using System.Collections;
using System.Collections.Generic;

public class MyWCFService : IMyWCFService
{
    private static Dictionary<Guid, List<string>> _currentUsers =
        new Dictionary<Guid, List<string>>();

    private void AddUser(Guid serviceID, string userName)
    {
        // Synchronize access to the collection via the SyncRoot property.
        lock (((ICollection)_currentUsers).SyncRoot)
        {
            // Check if the service's ID has already been added.
            if (!_currentUsers.ContainsKey(serviceID))
            {
                _currentUsers[serviceID] = new List<string>();
            }
            // Make sure to only store the user name once for each service.
            if (!_currentUsers[serviceID].Contains(userName))
            {
                _currentUsers[serviceID].Add(userName);
            }
        }
    }

    private void RemoveUser(Guid serviceID, string userName)
    {
        // Synchronize access to the collection via the SyncRoot property.
        lock (((ICollection)_currentUsers).SyncRoot)
        {
            // Check if the service's ID has already been added.
            if (_currentUsers.ContainsKey(serviceID))
            {
                // See if the user name exists.
                if (_currentUsers[serviceID].Contains(userName))
                {
                    _currentUsers[serviceID].Remove(userName);
                }
            }
        }
    }
}
Given that you don't want users listed twice for a specific service, it would probably make sense to replace the List<string> with HashSet<string>.

A database would seem to offer a persistent store which may be useful or important for your application. In addition it supports transactions etc which may be useful to you. Lots of updates could be a performance problem, but it depends on the exact numbers, what the query patterns are, database engine used, locality etc.
An alternative to this option might be some sort of in-memory caching server like memcached. Whilst this can be shared and accessed in a similar (sort of) way to a database server there are some caveats. Firstly, these platforms are generally not backed by some sort of permanent storage. What happens when the memcached server dies? Second they may not be ACID-compliant enough for your use. What happens under load in terms of additions and updates?

I like the in-memory way. Actually, I am designing the same mechanism for one of the projects I'm working on now. It is good for scenarios where you can't access a database, or where people are really reluctant to create a table to store simple info like a list of users against a machine name.
The only change I'd make is that a node returns just the list of its own available users to its peer, and the peer combines that with its existing list and then returns its combined list to the node that called. That's how all the peers stay in sync with the same list.
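A minimal in-process sketch of that exchange, assuming each node keeps a set of user names and that a direct method call stands in for the WCF call between peers. PeerNode, Exchange and SyncWith are hypothetical names.

```csharp
using System.Collections.Generic;

public class PeerNode
{
    private readonly object _gate = new object();
    private readonly HashSet<string> _known = new HashSet<string>();

    public void AddLocalUser(string userName)
    {
        lock (_gate) { _known.Add(userName); }
    }

    // Called by a peer: merge the peer's users into ours, then hand back the combined list.
    public List<string> Exchange(IEnumerable<string> peerUsers)
    {
        lock (_gate)
        {
            _known.UnionWith(peerUsers);
            return new List<string>(_known);
        }
    }

    // Caller side: send what we know, then merge whatever comes back.
    public void SyncWith(PeerNode peer)
    {
        List<string> mine;
        lock (_gate) { mine = new List<string>(_known); }

        // The lock is released before calling out, so two nodes syncing
        // with each other at the same time cannot deadlock.
        List<string> combined = peer.Exchange(mine);

        lock (_gate) { _known.UnionWith(combined); }
    }

    public List<string> GetAllUsers()
    {
        lock (_gate)
        {
            var users = new List<string>(_known);
            users.Sort(string.CompareOrdinal);
            return users;
        }
    }
}
```

A union-only merge never forgets a user, so sign-offs would need extra bookkeeping (for example tracking which node owns which user, or timestamps); that part is left out here.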

The DB option sounds good. If there are no performance issues, it is a simple design that should work. If you can afford to be semi-realtime and non-persistent, one way would be to maintain the list in memory in each service and have each service update the others when a new user joins. This can be done as some kind of broadcast via a centralised service, or using MSMQ, etc.

If you reconsider and host using IIS you will find that with a single line in a config file you can make the ASP Global, Application and Session objects available. This trick is also very handy because it means you can share session state between an ASP application and a WCF service.
