I'd like to know the best strategy for approaching this situation. Let's suppose I have this action:
class BetsController < ApplicationController
  def create
    @bet = Bet.new(params[:bet])
    # some other operation
  end
end
and sometimes I receive tons of calls to this action and could lose data. Should I spawn a thread each time? Any suggestions?
Scaling is quite a big topic, and providing an answer that will be exactly right for your needs is not really possible.
But just to get you started:
In case the action itself takes long to execute, you should consider doing some offline work: essentially, save the information you need and have an offline worker run over all the queued requests and process them one at a time. This approach may require some redesign of your application. You should also consider how users get feedback on their actions in case they fail or return some result.
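As a rough sketch of that save-and-process-later pattern (shown here in C# since the idea is language-agnostic; in Rails you would typically reach for a job-queue library plus a worker process), the request handler only records the work and a separate worker drains the queue. Note this in-memory version only shows the shape: in production you'd want a durable queue (a database table or message broker) so a crash doesn't lose requests.

    using System;
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    class BetRequest
    {
        public int UserId;
        public decimal Amount;
    }

    static class BetQueue
    {
        // The web action only enqueues; the worker processes items one at a time.
        static readonly BlockingCollection<BetRequest> Pending =
            new BlockingCollection<BetRequest>();

        // Called from the request handler: cheap, and the request is never dropped.
        public static void Enqueue(BetRequest bet) => Pending.Add(bet);

        // Started once at application startup; drains the queue on a background thread.
        public static Task StartWorker() => Task.Run(() =>
        {
            foreach (var bet in Pending.GetConsumingEnumerable())
            {
                // ... persist the bet, run validations, notify the user, etc.
                Console.WriteLine($"processed bet of {bet.Amount} for user {bet.UserId}");
            }
        });
    }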
If the actions are very short and this is just a matter of load, then you should consider other approaches to scaling - additional servers w/ load balancing and such.
I know this is far from a complete answer, but this is a really large topic...
Related
I need some best practice guidelines for a backend service in a scenario like this one:
UI sends multiple images for uploading to the backend service
Backend service receives all of the images and uploads them to storage one by one
There can be failures in one or multiple image uploads
My question is: how do I send the response to the UI if my backend service is unable to upload one or more files?
One way would be to send the failed and successful image links together in a JSON response body, so the UI knows about the failures and handles them in its own way.
Another way would be to send only the successfully uploaded images' links, which really only covers the best-case scenario.
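For the first option, the combined response might be shaped something like this (a sketch in C#; the type and field names are just illustrative):

    using System.Collections.Generic;

    // One entry per image the UI asked us to upload.
    record ImageResult(string FileName, bool Succeeded, string StorageUrl, string Error);

    // The body returned to the UI: successes and failures together.
    record UploadResponse(List<ImageResult> Succeeded, List<ImageResult> Failed);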
Any suggestions, ideally with some reference links, are welcome.
Use an Orchestrator - something specific that can coordinate multiple actions and provide a meaningful result back to the caller.
This might be as simple as a component sitting in the UI that orchestrates calls to the backend. The UI component and the backend service might be designed as parts of a cohesive solution, or the UI component might simply act as a type of client/proxy/facade to some random backend service.
UI calls the orchestrator with references to all the images it needs uploading.
The orchestrator works through the items, uploading each as you prefer (sequentially or in parallel, etc). For each file, handle errors however you prefer - e.g. try once and die gracefully on failure; put errors into a queue or some other mechanism for retry (how many times is up to you); etc.
Based on rules internal to the orchestrator, return status to the caller.
For potentially long-running processes (like file uploads) make sure the call to the orchestrator is asynchronous.
Rather than only returning a "complete" result at the end, the orchestrator might provide a simple status back, allowing callers to get some idea of where processing is at. For example, you might have a call-back (from the orchestrator to its caller) that simply emits very simple statuses like: processing, failed and complete. A more complex solution would be for the orchestrator to return more specific info like % complete and detailed error info.
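A minimal sketch of that idea in C# (all names are hypothetical, and the actual upload call and retry policy are up to you):

    using System;
    using System.Collections.Generic;
    using System.Threading.Tasks;

    enum UploadStatus { Processing, Failed, Complete }

    class UploadOrchestrator
    {
        // Hypothetical per-file upload delegate supplied by the backend client;
        // returns true on success.
        private readonly Func<string, Task<bool>> _uploadOne;

        public UploadOrchestrator(Func<string, Task<bool>> uploadOne) => _uploadOne = uploadOne;

        // Works through the items sequentially, emitting a coarse status to the caller.
        public async Task<UploadStatus> RunAsync(IEnumerable<string> imagePaths,
                                                 Action<UploadStatus> report)
        {
            bool anyFailed = false;
            foreach (var path in imagePaths)
            {
                report(UploadStatus.Processing);
                try
                {
                    if (!await _uploadOne(path)) anyFailed = true;
                }
                catch
                {
                    anyFailed = true; // try once and carry on; a retry queue could go here
                }
            }
            var final = anyFailed ? UploadStatus.Failed : UploadStatus.Complete;
            report(final);
            return final;
        }
    }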
Have a look at how the big cloud providers do complex file uploads by reading their documentation and studying their APIs.
"I need some best practice guidelines for a backend service"
In no particular order:
Keep it as simple as possible - generally, the fewer moving parts the better. E.g. pay attention to the Single Responsibility Principle (SRP).
Clean up after yourself. If the upload service generates any data, make sure you have a clean-up process so you don't end up with mountains of unneeded data lying around, especially stuff like image files. If you design an upload solution that maintains state (independent of what happens to the images once they are uploaded), then you'll be storing data which probably won't be needed once the images are all processed.
Think about support - not just developer debugging but also operational support. Getting your solution into production is not the end result, it's just the beginning.
If designing this solution across teams (e.g. frontend and backend teams) make sure both teams are involved in the design. If the backend team can't provide a solution that works for the frontend team then it's not going to end well.
Think about the likely error scenarios and how you can handle them.
This isn't really just a question of best practice, as there are multiple ways you could implement it, more than one of which could be valid. It is actually an architecture and design question, hence I don't think it fits as a Stack Overflow question and you will not get references to any one correct approach.
That said, by way of an answer I will outline what I think you need. At a very high level, and not necessarily in this order but taking these factors into account, I would:
Design the UI process flow. For example, you may decide that the user process will have several stages:
User selects first image for upload;
User selects each subsequent image for upload;
User presses some kind of "Go" button after selecting all images;
System now uploads the batch, and user receives a response confirming success or otherwise;
User has option to click through to detailed success/error details.
Design the required success/error reports
Design the data needed to support the overall functionality
Provide one or more APIs giving the upload function and the report function(s) the CRUD access they need to this data
If you hit any specific technical issues at any stage, then please post a new question accordingly as you go.
As to the point you mentioned, how to send the UI response: there is more than one valid way, but I would return a basic success/failure response initially, containing only minimal details such as the number of successes, and return more details in further messages in response to user actions (such as clicking through to detailed success/error details), at which point I would retrieve the requested error details from the database.
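For example (illustrative names only), the initial response could carry just the counts plus a batch id the UI can use to fetch details later:

    // Returned immediately after the batch completes: minimal details only.
    record UploadSummary(string BatchId, int Succeeded, int Failed);

    // Returned from a second endpoint when the user clicks through for details,
    // retrieved from the database by BatchId.
    record UploadFailureDetail(string FileName, string Error);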
As I said at the start of my answer, I don't think your question can be answered just in terms of best practices, as it's a whole architecture and design question, but I hope my answer helps you along this path.
I'm developing a small game where the player owns droids used to perform some automated actions. The easiest example is giving an order to a droid to send it to a specific position: the user gives it a position and the droid goes there. I'm already using Azure Functions a lot and I'd like to use them to make the droids move.
Off the top of my head, I thought about making one function that would trigger every minute, fetch all the droids that need to move, then make them move.
The issue with this approach is that if the game is popular, there could be hundreds of droids, and I have to ensure that the function's execution time stays below one minute.
I also thought about retrieving all the droids that need to move and then, for each of them, calling an Azure Function via its URL to handle that particular droid. In my head this would parallelize the execution a bit, but I'm not sure that's correct.
I also have to think about whether to use SQL transactions, in order to be sure not to create deadlocks.
The final question would be: how do I handle recurring processing of a potentially large amount of data and ensure that it stays below one minute?
Thanks for your advice
Typically, you handle such scenarios with queues. Each order becomes a queue message, and an Azure Function is triggered by it and processes the order. It can and will scale based on the number of messages in the queue.
If your logic still requires timer-based processing, the timer should be as lean as possible, e.g. it should just send messages to a queue, and a queue-triggered function would do the real work.
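A sketch of that split using the in-process C# programming model (the storage-queue bindings come from the Microsoft.Azure.WebJobs.Extensions.Storage package; the queue name and the droid lookup are made up):

    using Microsoft.Azure.WebJobs;
    using Microsoft.Extensions.Logging;

    public static class DroidFunctions
    {
        // Lean timer: fires every minute and only enqueues work, one message per droid.
        [FunctionName("EnqueueDroidMoves")]
        public static void Enqueue(
            [TimerTrigger("0 * * * * *")] TimerInfo timer,
            [Queue("droid-moves")] ICollector<string> moves,
            ILogger log)
        {
            foreach (var droidId in GetDroidsThatNeedToMove()) // hypothetical lookup
                moves.Add(droidId);
        }

        // Queue-triggered worker: scales out with queue length, one droid per invocation.
        [FunctionName("ProcessDroidMove")]
        public static void Process(
            [QueueTrigger("droid-moves")] string droidId,
            ILogger log)
        {
            log.LogInformation("Moving droid {DroidId}", droidId);
            // ... load the droid, advance its position, save it back ...
        }

        // Placeholder; in the real app this would query the database.
        private static string[] GetDroidsThatNeedToMove() => new string[0];
    }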
I'm working on an application that will process data submitted by the user and compare it with past logged data. I don't need to return or respond to the post straight away, just to process it. This "processing" involves logging the response (in this case a score from 1 to 10) that's submitted by the user every day, then comparing it against the previous scores they submitted. Then, if something is found, do something (not sure yet, maybe send an email).
I'm worried, though, about the effectiveness of doing this and how it could affect the site's performance. I'd like to keep it server-side so the script for the calculation isn't exposed. The site is only dealing with 500-1500 responses (users) per day, so it isn't a massive amount, but I'm interested to know whether this processing route will work. The server the site will be hosted on won't be anything special, probably one of the small(est) AWS instances.
Also, I'll be using Node.js and a SQL/PostgreSQL database.
It depends on how you implement this processing algorithm and how heavy it is on resources.
If your task is completely synchronous, it's obviously going to block any incoming requests to your application until it's finished.
You can run this "processing application" as a separate Node process and communicate to it only what it needs.
If this is a heavy task and you're worried about performance, it's a good idea to make it a separate Node process so it does not impact the serving of users.
I recommend googling "node js asynchronous" to better understand the subject.
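To illustrate the separation (a C# sketch for consistency with the other examples in this thread; in Node you would typically use child_process.fork, and the worker executable name below is made up):

    using System.Diagnostics;

    class WebSide
    {
        static void Main()
        {
            // Start the scoring worker as a separate OS process so the heavy
            // comparison work never blocks the process serving user requests.
            var worker = Process.Start(new ProcessStartInfo
            {
                FileName = "score-worker",     // hypothetical worker executable
                RedirectStandardInput = true,
                UseShellExecute = false,
            });

            // Communicate only what the worker needs: one line per submission.
            worker.StandardInput.WriteLine("user=42 score=7");
            worker.StandardInput.Close();
        }
    }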
I have been developing an app which processes many WSAPI and LBAPI requests which take an extended period of time to complete. In the event that certain parameters are changed, these requests become irrelevant and canceling them would be the best thing to do, in an effort to clear up the network queue for the new set of requests that need to take place.
I have searched the docs of both APIs and haven't been able to find any way included in the SDK to cancel these requests. I'm wondering if there might be a way to do this manually, or if there is a function I might be missing.
Thanks!
You haven't found it because it doesn't exist in Ext. :-) We have run into similar things in the past but haven't had a critical need to build this into the framework yet.
The best info I've found on this problem is this post which describes how to augment stores to support canceling outstanding loads:
http://www.mattgoldspink.co.uk/2013/02/03/ext-js-cancel-a-load-on-an-ext-data-store/
How to design a parallel processing workflow
I have a scenario involving data analysis.
Basically, there are four steps:
pick up a task: either read it from a queue or receive a message through an API (a web service, maybe) that triggers the service
submit a request to a remote service based on the parameters from step 1
wait for the remote service to finish, then download the result
process the data downloaded in step 3
The four steps above look like a sequential workflow.
My question is: how can I scale it out?
Every day I might need to perform hundreds to thousands of these tasks.
If I can do them in parallel, that will help a lot,
e.g. running 20 tasks at a time.
So can we configure Windows Workflow Foundation to run tasks in parallel?
Thanks.
You may want to use PFX (http://www.albahari.com/threading/part5.aspx); it lets you control how many threads to use for fetching, and I find PLINQ helpful.
So you loop over the list of URLs, perhaps read from a file or database, and then in your Select you can call a function to do the processing.
If you can go into more detail as to whether you want to have the fetching and processing be on different threads, for example, it may be easier to give a more complete answer.
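For example, a PLINQ version of that loop might look like this (the URL list and the Process function are stand-ins for your real data and logic):

    using System;
    using System.Linq;
    using System.Net;

    class Fetcher
    {
        static void Main()
        {
            // In practice these would come from a file or database.
            string[] urls = { "http://example.com/a", "http://example.com/b" };

            var results = urls
                .AsParallel()
                .WithDegreeOfParallelism(20)   // roughly "20 tasks at a time"
                .Select(url =>
                {
                    using (var client = new WebClient())
                    {
                        var data = client.DownloadString(url); // fetch
                        return Process(data);                  // process
                    }
                })
                .ToList();

            Console.WriteLine($"processed {results.Count} urls");
        }

        // Stand-in for the real per-item processing.
        static int Process(string data) => data.Length;
    }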
UPDATE:
This is how I would approach this, but I am also using ConcurrentQueue (http://www.codethinked.com/net-40-and-system_collections_concurrent_concurrentqueue) so I can be putting data into the queue while reading from it.
This way each thread can dequeue safely, without worrying about having to lock your collection.
Parallel.For(0, queue.Count, new ParallelOptions { MaxDegreeOfParallelism = 20 },
    (j) =>
    {
        // each worker dequeues safely; TryDequeue returns false if the queue is empty
        string url;
        if (queue.TryDequeue(out url))
        {
            // call out to URL
            // process the downloaded data
        }
    });
You may want to put the data into another concurrent collection and have that be processed separately, it depends on your application needs.
Depending on the way your tasks and workflow are modeled, you can use a Parallel activity and create different branches for the different tasks to be performed. Each branch has its own logic, and the WF runtime will start a second WCF request to retrieve data as soon as it is waiting for the first one to respond. This requires you to model the number of branches explicitly, but allows for different activities in each branch.
But from your description it sounds like you have the same steps for each task, and in that case you could model it using a ParallelForEach activity and have it iterate over a collection of tasks. Each task object would need to contain all the information used for the request. This requires each task to have the same steps, but you can put in as many tasks as you want.
What works best really depends on your scenario.