Pros/Cons of Binary Reference vs WCF

I am in the process of implementing an enhancement to an existing web application (A). The new solution will provide features (charts/images/data) to application A. The enhancement will be a new project and will generate new assemblies. I am trying to identify the most elegant way to read this information.
1) Do a binary reference and read the data directly. The new assemblies live with your application and the two are married together.
2) Write a WCF call and get the data. This will help to decouple the application.
The new enhancement will require me to buy some expensive licenses, so if I go with the 2nd option I can limit the license fee to a single server, or at most 2-3. My current application runs on a web farm of 8 servers.
Please share the pros/cons of both approaches.
Thanks.

If you decouple the two pieces sufficiently, you will also permit the use of clients running something other than .NET. Using the first option, you could only support .NET clients. This may turn out to be important, even if today you are absolutely certain that only .NET will ever be used - tomorrow, your company may be purchased by another which is a Java or PHP shop.
Even if you never need to support a non-.NET client, coupling to the assemblies will require you to maintain version compatibility between the client and server. If that coupling is not necessary, then use option #2.

The benefit of using WCF (the decoupled approach) is that you get the deployment option of moving it off the machine if it impacts the machine too much in terms of processing or storage.
The downside is that you'll likely pay some performance hit compared to linking directly.
I'm sure you can do some dynamic linking so you don't have to deploy to all 8 servers.
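To illustrate, a minimal sketch of what the decoupled option could look like as a WCF service contract (names hypothetical); application A would depend only on this contract, while the license-bearing assemblies stay on the one or two servers hosting the service:

    using System.ServiceModel;

    // Hypothetical contract for option 2: application A references only this
    // interface, not the charting assemblies themselves.
    [ServiceContract]
    public interface IChartDataService
    {
        [OperationContract]
        byte[] GetChartImage(string chartId);

        [OperationContract]
        string GetChartData(string chartId);
    }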


How to architect scheduled API to API integration

My organization moves data for customers between systems. These integrations are built in BizTalk and are mostly file-based, sometimes to/from APIs. More and more customers are switching to APIs, so we are facing more and more API-to-API integrations.
I'm mostly a backend developer but have been tasked with finding a more generic pattern or system for building these integrations; we are talking close to a thousand integrations.
But not thousands of different APIs; many customers use the same sort of systems.
What I want is a solution that:
Fetches data from the source API
Transforms the data to the format for the target API
Sends the data to the target API
Another requirement is that it should be possible to set a schedule for when these jobs should run.
This is easily done in BizTalk, but as mentioned there will be thousands of integrations, and if we need to change something in one of the steps it will be a lot of work.
My vision is something that holds interfaces to all the APIs we communicate with and also contains the scheduled jobs we want run between them, preferably with logging/tracking.
There must be something out there that does this?
Suggestions?
NOTE: No cloud-based solutions since they are not allowed in our organization.
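To make that concrete, a minimal C# sketch of the fetch/transform/send shape described above (all names hypothetical); each external API gets an adapter implementing one of the two interfaces, and a scheduler runs the jobs:

    using System;
    using System.Threading;
    using System.Threading.Tasks;

    // Hypothetical adapter interfaces: one implementation per external API.
    public interface ISourceApi
    {
        Task<string> FetchAsync(CancellationToken ct);
    }

    public interface ITargetApi
    {
        Task SendAsync(string payload, CancellationToken ct);
    }

    public class IntegrationJob
    {
        private readonly ISourceApi _source;
        private readonly ITargetApi _target;
        private readonly Func<string, string> _transform;

        public IntegrationJob(ISourceApi source, ITargetApi target, Func<string, string> transform)
        {
            _source = source;
            _target = target;
            _transform = transform;
        }

        // One fetch -> transform -> send cycle; a scheduler invokes this
        // on the configured schedule for each customer integration.
        public async Task RunAsync(CancellationToken ct)
        {
            string raw = await _source.FetchAsync(ct);
            string mapped = _transform(raw);
            await _target.SendAsync(mapped, ct);
        }
    }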
You can easily implement this using the temporal.io open source project. You can code your integrations in a general-purpose programming language. Temporal ensures that the integration runs to completion in the presence of all sorts of intermittent failures. Scheduling is also supported out of the box.
Disclaimer: I'm a founder of the Temporal project.

Any downside of running multiple hosted services within a .NET Core Windows Service?

Currently, we have a .NET Framework 4.7-based Windows service that we install through an MSI built using WiX. During install, we register multiple Windows services for the same exe, the difference being the arguments passed to each service: Myapp.exe -instance 1, Myapp.exe -instance 2, and so on. Each instance uses a different configuration based on the instance number and will poll a different IBM MQ queue and process messages. We install around 14 such instances.
Now that we are looking to migrate to .NET Core, we are wondering if it's worth changing this deployment model and instead moving to multiple instances of hosted services. With this, we would simply register the hosted service multiple times but with different constructor parameters. So I am trying to understand what the potential downsides of this approach could be. So far, I could think of a couple of them.
Since these run as independent processes, we currently have the ability to stop/start a specific instance of the Windows service. We would potentially lose that ability.
Since these run as independent processes, we can easily identify a memory spike in a specific instance of the Windows service, so for troubleshooting we can just focus on that instance. With a single executable, we lose this ability as well.
Apart from these, what other potential pitfalls might I come across with this approach?
Also, for the above 2 points, is there any workaround when using multiple hosted services?
I'm not sure specifically about Windows services, but I had the same question for microservices. I think in general there isn't much in it either way, but some things to consider:
All services go down if you need to deploy a new one (but if they are all the same, you are more likely to update all of them at the same time).
Coordinating between them (if necessary) might be easier (locks, transactions, etc.) if they are together, but likewise might allow you to do things that break encapsulation, because you can.
They would all start and stop at the same time in a single service; if you want to control them separately, you will need either an external enable/disable mechanism or separate Windows services.
If you ever need to separate them, e.g. onto separate machines, you will have to do the risky work of separating them later.
It sounds like they are largely identical, just targeting different data, so there isn't anything I can think of that would be a problem.
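As for the registration itself, a minimal sketch (names hypothetical) of hosting the same worker 14 times with different instance numbers. Note that AddHostedService<T>() deduplicates registrations by type, so the instances are registered as IHostedService directly:

    using System;
    using System.Threading;
    using System.Threading.Tasks;
    using Microsoft.Extensions.DependencyInjection;
    using Microsoft.Extensions.Hosting;

    // Hypothetical worker: each instance polls a different queue based on
    // the instance number it was constructed with.
    public class QueueWorker : BackgroundService
    {
        private readonly int _instance;

        public QueueWorker(int instance) => _instance = instance;

        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            while (!stoppingToken.IsCancellationRequested)
            {
                // Poll the IBM MQ queue configured for this instance here.
                await Task.Delay(TimeSpan.FromSeconds(5), stoppingToken);
            }
        }
    }

    public static class Program
    {
        public static void Main(string[] args) =>
            Host.CreateDefaultBuilder(args)
                .ConfigureServices(services =>
                {
                    // AddHostedService<T>() skips duplicates of the same type,
                    // so register IHostedService explicitly to get 14 workers.
                    for (int i = 1; i <= 14; i++)
                    {
                        int instance = i; // capture a copy per iteration
                        services.AddSingleton<IHostedService>(_ => new QueueWorker(instance));
                    }
                })
                .Build()
                .Run();
    }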

How to perform a stress test against a SharePoint site using threads

I want to analyze the performance (and hence the weak points) of a SharePoint site by doing stress testing. What needs to be done is to call some methods, exposed via a web service, that do the following things in the SharePoint site:
- create a new group
- add a content item to the group
- add an attachment to the content item
- delete the content item
- delete the previously created group
What is required is a simulation of a situation where 4500 users try to do these operations concurrently (at the same time or, more realistically, within a short timespan, for example within 5 seconds).
We also want to record the execution time of each operation (each web method, for example "create new group"). I thought I could simulate these operations via a console application using threads and stopwatches. Has anyone encountered a similar problem and can give me existing solutions or hints for doing it "the right way"? For example, how can I ensure that all threads start at the same instant? Thanks in advance.
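On the narrower question of starting all threads at the same instant: one sketch is to release all workers through a Barrier and time each call with a Stopwatch. The web-service call below is a hypothetical placeholder:

    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;

    public static class StressRunner
    {
        // Hypothetical wrapper around the "create new group" web method.
        static void CallCreateNewGroup() { /* invoke the SharePoint web service here */ }

        public static void Main()
        {
            const int users = 100; // scale up as far as the test machine allows
            var barrier = new Barrier(users);
            var tasks = new Task[users];

            for (int i = 0; i < users; i++)
            {
                tasks[i] = Task.Run(() =>
                {
                    barrier.SignalAndWait(); // all tasks are released together
                    var sw = Stopwatch.StartNew();
                    CallCreateNewGroup();
                    sw.Stop();
                    Console.WriteLine($"create new group: {sw.ElapsedMilliseconds} ms");
                });
            }

            Task.WaitAll(tasks);
        }
    }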
I have been a user of Visual Studio Load Testing for 2 years, and I find it very powerful and easy to use. You can run integration tests, navigation in a web site, simulated database load... in fact, everything. Because it is an MS application, it is also fully compatible with all MS products like SharePoint: it's easier to call a WCF service from a unit test than from another technology (how else would you test netTcpBinding?). You can also use the Visual Studio Profiler to instrument your code (and see which line of code is expensive, or even ADO.NET interactions). You can also easily extend the load testing via its many extensibility points.
One important thing is that VS load testing is "intrusive". It will not only collect response times, request lengths, etc., but also all performance counters, database queries, and so on. All these metrics are saved in a dedicated database such as SQL Express for reporting. There is an add-on for Excel.
Just one important note (applicable to all load testing solutions!):
You can run load tests from a developer machine or even a single dedicated machine, but you usually can't generate enough traffic to really see how the application responds (your machine cannot simulate 500 concurrent users because of limited CPU/memory/network). In order to simulate a lot of users, you'll set up what is known as a load test rig.
A test rig is made up of a Test Controller machine and one or more Test Agent machines as shown in Figure 1. The controller manages and coordinates the agent machines and the agents generate load against the application. The test controller is also responsible for collecting performance monitor data from the servers under test and optionally from the test rig machines.
Here are some links:
MSDN
Dave's introduction
I'm not saying Visual Studio Load Testing is not a great tool, but there are tools, like Tsung and Eventlet (and many others), that can support many thousands of concurrent users.
Good luck.

WCF: sharing cached data across multiple services

We are developing a project that involves about 10 different WCF services with several endpoints each. One of the services keeps a few big tables of data cached in memory.
We have found we need access to that data from another service. Rather than keeping 2 copies of the cache, I'd like to be able to share those tables across all services.
I have done some research and found some articles about using an IExtension attached to the service hosts to store the shared data.
Provided that all the services are running under the same web site, will that work? And is it the right approach? Or should I be looking elsewhere?
If the data that you're caching is required by more than one service, it sounds like, from a Service-Oriented Architecture perspective anyway, it doesn't belong in either of the services calling it.
If the data being cached isn't really related to either service, but is something that both services need, then perhaps it belongs in its own separate service. Have you considered encapsulating your cache in a third service, and performing a service-to-service call to retrieve the data you need? Benefits include...
It solves your original dilemma, avoiding the need to read the whole cache from the database several times;
It encapsulates the cache in one place for easy maintenance/change later.
It allows you to abstract the implementation of the cache away from the other services by putting another service interface in the way.
All in all, I'd suggest that's the best approach. The only downside is the extra overhead of making the service-to-service call, but that surely outperforms having to read the whole cache from the database.
Alternatively, if the data in your cache is very closely related to BOTH of the services that are calling the cache, i.e. both services add/change the data in the cache, etc. then perhaps the two existing services should be combined into a single service.
If what I'm saying makes some sense, then the principle of SOA I'm drawing on is Service Autonomy.
Provided all your services are part of the same application, there doesn't seem to be any reason why you can't share the cache directly via a shared object reference. The simplest way of doing this is via a static field.
If you choose this approach, one thing to be very careful about is thread safety. If your cache is concurrently accessed by two WCF sessions, you must ensure that the two sessions are not going to interfere with each other by both changing the cache at the same time. If the cache is read-only, this need is lessened, but you still might need to synchronise initialisation of the cache.
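As a sketch of that static-field approach (assuming all services run in the same AppDomain; names hypothetical), a Lazy<T> handles the initialisation race and a concurrent collection handles per-entry thread safety:

    using System;
    using System.Collections.Concurrent;
    using System.Data;

    // Lazy<T> gives thread-safe, one-time initialisation; ConcurrentDictionary
    // keeps individual reads/writes safe across concurrent WCF sessions.
    public static class SharedCache
    {
        private static readonly Lazy<ConcurrentDictionary<string, DataTable>> _tables =
            new Lazy<ConcurrentDictionary<string, DataTable>>(LoadFromDatabase);

        public static ConcurrentDictionary<string, DataTable> Tables => _tables.Value;

        private static ConcurrentDictionary<string, DataTable> LoadFromDatabase()
        {
            // Placeholder: read the big tables once and key them by name.
            return new ConcurrentDictionary<string, DataTable>();
        }
    }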

Index replication and Load balancing

I am using the Lucene API in my web portal, which is going to have thousands of concurrent users.
Our web server will call the Lucene API, which will be sitting on an app server. We plan to use 2 app servers for load balancing.
Given this, what should our strategy be for replicating the Lucene indexes to the 2nd app server? Any tips, please?
You could use Solr, which contains built-in replication. This is possibly the best and easiest solution, since it would probably take quite a lot of work to implement your own replication scheme.
That said, I'm about to do exactly that myself, for a project I'm working on. The difference is that since we're using PHP for the frontend, we've implemented Lucene in a socket server that accepts queries and returns a list of DB primary keys. My plan is to push changes to the server and store them in a queue, where I'll first store them in the in-memory index and then flush the in-memory index to disk when the load is low enough.
Still, it's a complex thing to do, and I'm prepared to do quite a lot of work before we have a stable final solution that's reliable enough.
From experience, Lucene should have no problem scaling to thousands of users. That said, if you're only using your second app server for load balancing and not for failover, you should be fine hosting Lucene on only one of those servers and accessing it via NFS (in a Unix environment) or a shared directory (in a Windows environment) from the second server.
Again, this is dependent on your specific situation. If you're talking about having millions (5 or more) of documents in your index and needing failover for your Lucene index, you may want to look into Solr or Katta.
We are working on a similar implementation to what you are describing as a proof of concept. What we see as an end-product for us consists of three separate servers to accomplish this.
There is a "publication" server that is responsible for generating the indices that will be used. There is a service implementation that handles the workflows used to build these indices, as well as being able to signal completion (a custom management API exposed via WCF web services).
There are two "site-facing" Lucene.NET servers. Access to the API is provided via WCF services to the site. They sit behind a physical load balancer and will periodically "ping" the publication server to see if there is a more current set of indices than what is currently running. If there is, the server requests a lock from the publication server and updates the local indices by initiating a transfer to a local "incoming" folder. Once there, it is just a matter of suspending the searcher while the index is attached. It then releases its lock and the other server is available to do the same.
Like I said, we are only approaching the proof of concept stage with this, as a replacement for our current solution, which is a load balanced Endeca cluster. The size of the indices and the amount of time it will take to actually complete the tasks required are the larger questions that have yet to be proved out.
Just some random things that we are considering:
The downtime of a given server could be reduced if two local folders are used on each machine receiving data to achieve a "round-robin" approach.
We are looking to see if the load balancer allows programmatic access to have a node remove and add itself from the cluster. This would lessen the chance that a user experiences a hang if he/she accesses during an update.
We are looking at "request forwarding" in the event that cluster manipulation is not possible.
We looked at Solr, too. While a lot of it just works out of the box, we have some bench time to explore this path as a learning exercise - learning things like Lucene.NET, improving our WF and WCF skills, and implementing ASP.NET MVC for a management front-end. Worst case scenario, we go with something like Solr, but we will have gained experience in some skills we are looking to improve.
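For the "suspending the searcher while the index is attached" step described above, a minimal sketch of the swap, assuming a Lucene.NET 3.x-style API (a production version would also need to reference-count in-flight searches, which later Lucene versions provide via SearcherManager):

    using System.IO;
    using Lucene.Net.Search;
    using Lucene.Net.Store;

    // Build a new searcher over the freshly transferred index folder, publish
    // it, then dispose the old one so queries never see a half-copied index.
    public class SearcherHolder
    {
        private volatile IndexSearcher _current;

        public IndexSearcher Current
        {
            get { return _current; }
        }

        public void SwapTo(string newIndexPath)
        {
            var dir = FSDirectory.Open(new DirectoryInfo(newIndexPath));
            var newSearcher = new IndexSearcher(dir, true); // opens its own read-only reader
            var old = _current;
            _current = newSearcher;
            if (old != null)
            {
                old.Dispose(); // closes its reader, releasing the old index files
            }
        }
    }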
I create the indices on the publishing back-end machines into the filesystem and replicate those over to the marketing machines.
That way every single load- and fail-balanced node has its own index without network latency.
The only drawback is that you shouldn't try to recreate the index within the replicated folder, as you'll have the lock file lying around on every node, blocking the IndexReader until your reindex has finished.