Design: Is it correct to push records to the database first and then pick them from the database and push to RabbitMQ?

We are an e-commerce website with several listings. The listings can be created, updated, or deleted from various clients, i.e. desktop site, mobile site, and apps.
We need to push all of this updated information to some third-party APIs: whenever a new listing is created or deleted, or an existing listing is updated, we need to push the complete listing information through a third-party API. We are using RabbitMQ for this since we expect a high volume of record updates.
We have two choices:
Push from all endpoints (desktop/msite/app) a message like (listingId, action on listing, i.e. CREATE/UPDATE/DELETE) to RabbitMQ. Consumers then dequeue these messages and hit the appropriate API.
Implement a trigger on the listings table, i.e. on create/update/delete, insert an entry into a staging table with columns (listingId, action to be executed, i.e. CREATE/UPDATE/DELETE). Then create a job to read from this table every 10 seconds and push the entries to RabbitMQ (a rough sketch of this job is below).
Which is the better approach?
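For concreteness, here is a minimal sketch of what the option-2 polling job could look like, assuming a staging table written by the trigger and the pika RabbitMQ client; the table, queue, and column names are only illustrative.

```python
# Option-2 sketch: poll the staging table filled by the trigger and publish to
# RabbitMQ. sqlite3 stands in for the real listings database; names are made up.
import json
import sqlite3
import time

import pika  # pip install pika


def poll_and_publish(db_path="listings.db", queue="listing-events"):
    db = sqlite3.connect(db_path)
    mq = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
    channel = mq.channel()
    channel.queue_declare(queue=queue, durable=True)

    while True:
        # Rows inserted by the trigger that have not been published yet.
        rows = db.execute(
            "SELECT id, listing_id, action FROM listing_events WHERE published = 0"
        ).fetchall()

        for event_id, listing_id, action in rows:
            channel.basic_publish(
                exchange="",
                routing_key=queue,
                body=json.dumps({"listingId": listing_id, "action": action}),
                properties=pika.BasicProperties(delivery_mode=2),  # persistent
            )
            # Mark the row as published only after the broker accepted it.
            db.execute("UPDATE listing_events SET published = 1 WHERE id = ?", (event_id,))
            db.commit()

        time.sleep(10)  # the 10-second interval from the question
```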

I think an HTTP-based API might be the best solution. You can implement a gateway which includes security (OAuth2/SAML), rate limiting, etc. Internally you can use RabbitMQ. The gateway can publish the updates to RabbitMQ and have subscribers which write the data to your master database and other subscribers which publish the data to your third-party APIs.
The added benefit of an HTTP gateway, beyond the extra security controls available, is that you could change your mind about RabbitMQ in the future without impacting your desktop and mobile apps, which would be difficult to update in their entirety.
Having worked with databases for most of my career, I tend to avoid triggers. Especially if you expect large volumes of updates. I have experienced performance problems and reliability problems with triggers in the past.
It is worth noting that RabbitMQ does have message duplication problems - there is no exactly-once delivery guarantee. So you will need to implement your own deduplication or, preferably, make all actions idempotent.
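As a minimal sketch of the deduplication idea, assuming the producer stamps each message with a unique message_id (the queue name and push_to_third_party_api helper are hypothetical):

```python
# Deduplicating consumer sketch using pika. The in-memory set is only for
# illustration; in practice you'd track seen IDs somewhere durable (Redis, a
# database table, etc.) or make push_to_third_party_api itself idempotent.
import pika  # pip install pika

seen_ids = set()


def push_to_third_party_api(body: bytes) -> None:
    """Placeholder for the real third-party API call."""
    print("pushing", body)


def on_message(channel, method, properties, body):
    msg_id = properties.message_id  # set by the producer; may be None otherwise
    if msg_id is not None and msg_id in seen_ids:
        # Duplicate delivery: ack and drop instead of hitting the API twice.
        channel.basic_ack(delivery_tag=method.delivery_tag)
        return

    push_to_third_party_api(body)
    if msg_id is not None:
        seen_ids.add(msg_id)
    channel.basic_ack(delivery_tag=method.delivery_tag)


connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()
channel.queue_declare(queue="listing-events", durable=True)
channel.basic_consume(queue="listing-events", on_message_callback=on_message)
channel.start_consuming()
```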

Related

Prevent repeat write API calls whilst traffic mirroring

I'm looking at using traffic mirroring with Istio to dark test releases.
The mirrored traffic will mean write APIs like order & payment, etc. are called multiple times, which I don't want, else I'll be charging the customer twice and sending them a duplicate product.
Is there a standard way to prevent this (stubbing seems an odd thing to do in production) or is mirroring only really applicable for read APIs?
Issue
There is a diagram of the mirroring setup with traffic flows.
Although these requests are mirrored as "fire and forget" and the reply from the mirror service is just dropped (by the Envoy proxy sidecar) to /dev/null and not returned to the caller, the mirrored traffic still hits this API.
Solution
As mentioned in the comments:
In my opinion you should add a path for your testing purposes, gated by some custom header, so that it can be tested only by you or your organization and the customer isn't involved in it (see the sketch at the end of this answer).
This topic is described in detail here by Christian Posta.
When we deploy a new version of our service and mirror traffic to the test cluster, we need to be mindful of impact on the rest of the environment. Our service will typically need to collaborate with other services (query for data, update data, etc). This may not be a problem if the collaboration with other services is simply reads or GET requests and those collaborators are able to take on additional load. But if our services mutate data in our collaborators, we need to make sure those calls get directed to test doubles and not the real production traffic.
There are a few approaches you may consider, all of them are described in the link above:
Stubbing out collaborating services for certain test profiles
Synthetic transactions
Virtualizing the test-cluster’s database
Materializing the test-cluster’s database
In practice, mirroring production traffic to our test cluster (whether that cluster exists in production or in non-production environments) is a very powerful way to reduce the risk of new deployments. Big webops companies like Twitter and Amazon have been doing this for years. There are some challenges that come along with this approach, but there exist decent solutions as discussed in the patterns above.
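As a rough illustration of the stubbing/custom-header idea above, here is a minimal Flask sketch of a write endpoint that short-circuits mirrored requests; the header name, route, and charge_customer helper are hypothetical, and the "-shadow" check relies on Istio/Envoy marking mirrored requests by appending -shadow to the Host/authority header:

```python
# Guarding a write endpoint against mirrored traffic (sketch, not production code).
from flask import Flask, request, jsonify  # pip install flask

app = Flask(__name__)


def charge_customer(order):
    """Placeholder for the real payment/ordering call."""
    return {"charged": True, "order": order}


@app.route("/payments", methods=["POST"])
def create_payment():
    # Mirrored requests carry either our own test header or the
    # "-shadow" suffix Istio/Envoy adds to the Host header.
    is_mirrored = (
        request.headers.get("X-Dark-Test") == "true"
        or "-shadow" in request.host
    )
    if is_mirrored:
        # Exercise the code path, but return a synthetic result instead of
        # charging the customer a second time.
        return jsonify({"charged": False, "mirrored": True}), 202

    return jsonify(charge_customer(request.get_json())), 201
```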

Mulesoft best practices for API-led connectivity: is it okay to invoke a System API directly from the client application (be it web/mobile)?

The main reason for this question is to understand the reasons behind the best practices around the usage of System APIs. If the System API itself is good enough to serve the purpose of my client application, do we still need to write an Experience API to invoke the System API indirectly, or can we break the rule and just invoke the System API directly from the client application? Sometimes the extra layer is just overhead and means numerous API calls over the network.
A System API is meant to unlock or expose a system asset (back-end data). Now, one could write the System API in such a way that it fetches the data from the system database, does the required processing, for instance converts the table rows to JSON format, then does some enrichment and trimming of fields and exposes it to customer A. This is a coarse-grained approach. Now, another customer B requires similar data but needs some fields that were already trimmed to serve customer A, who wanted only a few of the many fields that you picked from the system (database). You'll have to write a separate coarse-grained API for customer B.
Also, if in the future the backend system is replaced with a new system, one would have to rewrite/update both APIs, one for customer A and one for customer B.
This coarse-grained approach would solve your problem each time, but architecturally a fine-grained approach of breaking a large service down into multiple layers of Experience, Process and System APIs enables reuse, reduces work effort, shortens time to market, lowers total cost of ownership, and allows you to apply the required separate policies (security, SLAs, etc.) for each client through the Experience API layer. You can then scale your integration landscape better.
A fine-grained approach increases usage of resources such as network and disk space (more logging), but that's the trade-off for the many advantages you get. Again, the decision to go with either approach should align with the current circumstances of your ecosystem, so it all depends.
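As a toy sketch of the layering idea (all field names and the fetch_listing_row helper are made up for illustration), note how both Experience APIs reuse the same System API, so only one place changes if the backend system is replaced:

```python
# Layering sketch: one reusable System API over the back-end asset, plus thin
# Experience APIs that shape the data per client. Names are illustrative only.
from typing import Dict


def fetch_listing_row(listing_id: str) -> Dict:
    """Stand-in for the back-end/database lookup behind the System API."""
    return {
        "id": listing_id,
        "title": "Blue widget",
        "price": 9.99,
        "warehouse_code": "WH-7",
        "internal_margin": 0.4,
    }


def system_api_get_listing(listing_id: str) -> Dict:
    """System API: exposes the system asset without client-specific trimming."""
    return fetch_listing_row(listing_id)


def experience_api_customer_a(listing_id: str) -> Dict:
    """Experience API for customer A: only the fields A asked for."""
    row = system_api_get_listing(listing_id)
    return {"id": row["id"], "title": row["title"]}


def experience_api_customer_b(listing_id: str) -> Dict:
    """Experience API for customer B: also needs price, reuses the same System API."""
    row = system_api_get_listing(listing_id)
    return {"id": row["id"], "title": row["title"], "price": row["price"]}
```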

Callback or real-time communication for Social Tables API

I am evaluating the Social Tables API to see if there's any way to get notified when data on the Social Tables side changes so that we can sync the data in real time. I can't find anything on callbacks or long-running operations. Does that mean polling is the only option?
There is no real-time API for Social Tables. This means that you're correct in that polling is the only option for keeping data in sync.
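A minimal polling sketch (the endpoint URL and the updated_at field are placeholders, not the actual Social Tables API shape):

```python
# Poll on an interval and trigger a sync when something changes.
import time

import requests  # pip install requests

API_URL = "https://api.example.com/events/123"  # placeholder endpoint
POLL_INTERVAL_SECONDS = 60

last_seen = None

while True:
    payload = requests.get(API_URL, timeout=10).json()
    updated_at = payload.get("updated_at")  # hypothetical change marker

    if updated_at != last_seen:
        last_seen = updated_at
        print("change detected, syncing...")  # replace with your sync logic

    time.sleep(POLL_INTERVAL_SECONDS)
```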

How can I better design an inventory and order sync web app

I have a web application that displays inventory, orders, and tracking information from drop-shippers, including order and tracking updates. When a customer logs in, they see all of the above information on different pages.
I have a console-based application on the server that hosts 4 background workers to do each of the above tasks and update the database. Right now I have one console application per customer. I did this so that if the console application fails for any reason because of one customer's data, it does not affect the others.
Is there a better approach, or any existing tools, APIs, or frameworks in the Microsoft stack to support this? Or is what I am doing the correct and best approach? Are there any more stable technologies to support subscription-based membership, offline data sync, queuing user requests, and notifying users when they are completed?
I would take a look at Azure Queues and WebJobs (link below).
With a queue structure, you can simply decouple your application and make the application only do what is needed. Your main application can then just put relevant and needed information in the Queue and forget about it.
Next (and perhaps the most crucial part of this), you can write a simple console application that runs when a queue message is present and ready. The beauty of this is that not only can you have multiple WebJobs doing the same thing (I don't recommend it), but you also only need to have and maintain one console application. If the application crashes, it will simply be restarted (within a few seconds) and go back at it again.
Below is a link to a tutorial on how to make a sample Queue and WebJob, followed by a rough sketch of the queue interaction:
http://azure.microsoft.com/en-us/documentation/articles/websites-dotnet-webjobs-sdk-get-started/?rnd=1
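The tutorial above uses .NET WebJobs; purely to illustrate the producer/worker split, here is a rough Python sketch using the azure-storage-queue package. The connection string, queue name, and sync_orders_for helper are placeholders:

```python
# Producer/worker split over an Azure Storage queue (sketch only).
import time

from azure.storage.queue import QueueClient  # pip install azure-storage-queue

CONNECTION_STRING = "<your storage account connection string>"
queue = QueueClient.from_connection_string(CONNECTION_STRING, "customer-sync-jobs")


def enqueue_sync_job(customer_id: str) -> None:
    """Web app side: drop a message on the queue and forget about it."""
    queue.send_message(customer_id)


def sync_orders_for(customer_id: str) -> None:
    """Placeholder for the real inventory/order/tracking sync work."""
    print(f"syncing data for customer {customer_id}")


def worker_loop() -> None:
    """Single worker (WebJob-style) that drains the queue for all customers."""
    while True:
        for message in queue.receive_messages(visibility_timeout=300):
            sync_orders_for(message.content)
            queue.delete_message(message)  # delete only after successful processing
        time.sleep(5)
```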

How do I lock down a .NET 4.0 WCF Data Service

I've had a WCF Data Service published for about 2 months. It's already been 100% hacked. I even noticed the service posted on Twitter!
Luckily my site was under development and the user entity contained only about 80 beta testers.
Still, this is a pretty big problem. With the power of EF navigation properties, anyone can easily write a script to download all my user data and my valuable domain data that no one else has. I want to provide non-authenticated access and do things like:
Limit what columns get exposed (e.g. a users emails)
Limit number of requests possible per day (e.g. 10 per request host address)
Be notified when someone is misusing the service
Limit the results set and expand options on different entity sets
Stuff I haven't yet thought about
Does this make sense, or should I drop WCF Data Services? In theory they sounded great, but now that I've got experience with them I'm wondering if they are just good for development and not production (they're kind of fatter than I was expecting).
Thoughts that go beyond my knowledge and suggestions here will be greatly appreciated.
Also, any links to thorough blog post examples or video presentations that cover this ground would be excellent!
I think you need to implement some authentication. There is no other way I can think of to "lock down" a web service. This is one of the advantages of WCF -- it makes implementing complex authentication easy.
On my WCF service, I require a UserContext object, simply comprised of two strings, username and password.
Every method on the service requires that context, and if I haven't added the username/password to the database, it denies the request.
This also makes it simple to track who is abusing the service, as you will have their username/password tied to every request.
You should also run it over SSL so other users' credentials will not be easily compromised.
1 - WCF Data Services currently doesn't allow you to easily filter columns on per request basis. You could have two EF models (one "public", and one "private") and expose them as two services. The public one accessible to anybody, the private one behind full auth.
2 - This you will have to implement yourself. But for this to work you need some way to identify the user, so it's pretty close to authentication (even if it doesn't require a password or something like that). There's a series of posts about auth over WCF Data Services here: http://blogs.msdn.com/b/astoriateam/archive/tags/authentication/
3 - If you can identify the user as per #2, you can, for example, count the number or frequency of requests he/she makes and set up a notification based on that. Again, the techniques used for auth should provide you the right hooks.
4 - This is reasonably simple. WCF Data Services allows you to set a hard limit on the size of the response (DataServiceConfiguration.MaxResultsPerCollection) or a soft limit, which means paging. Paging is usually better, since it limits the size of a single response but still allows clients to get all the data with multiple requests. This can be done through DataServiceConfiguration.SetEntitySetPageSize. The expand behavior can be limited by use of the DataServiceConfiguration.MaxExpandCount and MaxExpandDepth properties.
Some other possible techniques to use
Query interceptors (http://msdn.microsoft.com/en-us/library/dd744842.aspx) - these allow you to filter rows on a per-request basis. Typically used to limit rows based on the user making the request (note that this only allows you to filter rows, not columns).
Service operations - if you define a service operation which returns IQueryable the client can still compose queries on top of it, but it gives you the ability to filter the data before the query is applied. Or you can make certain pieces of information accessible only through service operations (not as easy to use and not queryable, but it gives you full control). (http://msdn.microsoft.com/en-us/library/cc668788.aspx)