Double way API?

For the moment, all my customers are in the same db, same domain etc… on my majestic monolith on https://www.mystartup.com.
Let’s say I want to deploy an instance of my rails app for one big customer. And let’s say I may deploy other instances of this rails app in the future.
The thing is that I am fetching and computing some heavy data, and I want to do it once instead of in every instance. So I guess I should do it in https://secret-api.mystartup.com, and each instance should make requests to it with a secret access token.
But my issue is this: is there a way for https://secret-api.mystartup.com to trigger calls to each of the domains when needed? Is this what we call “webhooks”? Or is there some double-way API concept that I am missing?
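A central service pushing HTTP calls out to each deployed instance is commonly called a webhook. Below is a minimal sketch in Python (not Rails) of how such calls might be signed and verified. The instance URL, the `/hooks/data-ready` path, the `X-Signature` header name and the shared secrets are all hypothetical. The actual HTTP POST is left out so the sketch runs without a network.

```python
import hashlib
import hmac
import json

# Hypothetical registry kept by the central service: instance URL -> shared secret.
SUBSCRIBERS = {
    "https://bigcustomer.mystartup.com/hooks/data-ready": b"secret-for-bigcustomer",
}


def sign(secret: bytes, body: bytes) -> str:
    """Return a hex HMAC-SHA256 signature of the request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def build_webhook(url: str, event: dict) -> dict:
    """Central side: prepare the POST that would be sent to one instance."""
    body = json.dumps(event, sort_keys=True).encode()
    return {
        "url": url,
        "body": body,
        "headers": {"X-Signature": sign(SUBSCRIBERS[url], body)},
    }


def verify_webhook(secret: bytes, body: bytes, signature: str) -> bool:
    """Instance side: accept the call only if the signature matches the body."""
    return hmac.compare_digest(sign(secret, body), signature)
```

In this arrangement each instance still pulls heavy data from the central API with its access token. The webhook only tells it that something new is available.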

One question you need to answer: what if this secret-api server needs to be restarted? You lose all of that heavy computation.
Another problem with the above solution is that it goes against microservices architecture in a way, because you have a single server for secret-api. If it goes down, your whole system goes down. With microservices, for high availability you should always have multiple servers for the same API.
For such scenarios, where there is heavy lifting to be done, one solution is to put an in-memory layer in between, something like memcached or Redis. Keep your results in this in-memory server and NOT in a cache maintained inside your secret-api server. This solves both of the problems mentioned above.
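The idea of keeping results outside the API process can be sketched as a cache-aside lookup. This is only an illustration: `ExternalCache` is a hypothetical in-process stand-in for a memcached or Redis client, and `get_or_compute` is an invented helper name. In a real deployment the store would run as a separate service, which is the point of the suggestion.

```python
import json
from typing import Callable, Dict, Optional


class ExternalCache:
    """Stand-in for a memcached/Redis client (hypothetical, in-process only)."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)

    def set(self, key: str, value: str) -> None:
        self._data[key] = value


def get_or_compute(cache: ExternalCache, key: str, compute: Callable[[], dict]) -> dict:
    """Return the cached result; run the heavy computation only on a miss."""
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    result = compute()
    cache.set(key, json.dumps(result))
    return result
```

Because the results live in the cache rather than in the API server's memory, any of several API servers could serve them, and restarting one would not throw the work away.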

Related

Consideration before creating a single Redis instance

I currently have several projects that each work against a different Redis instance (consider an example where I have 3 different ASP.NET applications on different servers, each one with its own Redis server).
We've been asked to virtualize and to remove unnecessary instances, so I was wondering what happens if I have only one Redis server and all 3 ASP.NET applications point to the same Redis instance.
For the application keys I think there's no problem: I can prefix my own keys with the application name, for example "fi-agents", "ga-agents", and so on. But what happens with the auth session?
As far as I've read, the prefix is used internally and can't be used by the end user to separate them. Is it enough to just use a different DB?
Thanks
Generally, and unless there are truly compelling reasons, you don't want to mix different applications and their data in the same database. Yes, it does lower ops costs initially, but it can quickly deteriorate into a scaling and performance nightmare. This, I believe, is true for any database.
Specifically with Redis, technically yes: you could use a key prefix or the shared/numbered database approach. I'm not sure what you meant by "auth" sessions, but you can probably apply the same approach to them. But you really shouldn't. Since Redis is a single-threaded process, you can end up in a situation where one of the apps is blocking the other two. Since Redis by itself is so lightweight, just spin up dedicated servers, one per app, even in the same VM if you must. You can read more background information on why you don't want to opt for the shared approach here: https://redislabs.com/blog/benchmark-shared-vs-dedicated-redis-instances
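For illustration only, the key-prefix approach mentioned above might look like the following sketch. A plain `dict` stands in for the shared Redis server and `NamespacedStore` is a hypothetical wrapper, not part of any Redis client. The app names "fi" and "ga" are taken from the key examples in the question.

```python
class NamespacedStore:
    """Wraps a shared key-value store so each app only touches its own prefix."""

    def __init__(self, backend: dict, app_name: str) -> None:
        self._backend = backend          # stand-in for the shared Redis server
        self._prefix = f"{app_name}-"    # e.g. "fi-" gives keys like "fi-agents"

    def set(self, key: str, value: str) -> None:
        self._backend[self._prefix + key] = value

    def get(self, key: str, default=None):
        return self._backend.get(self._prefix + key, default)

    def keys(self) -> list:
        """List this app's keys with the prefix stripped."""
        return sorted(
            k[len(self._prefix):] for k in self._backend if k.startswith(self._prefix)
        )
```

This separates the keys but not the load: all apps would still queue behind the same server, which is the blocking problem described above.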

serverside vs client

First let me say I am only a novice programmer, and by no means an SQL guru. We have an app at work that has been under heavy development by the vendor for some time (2+ years). It runs on an MSSQL instance on one of our servers, and there is a client install for the desktops. The client software makes direct SQL calls to the database (it also has a local MySQL instance to handle the client settings). There are 6-12 ports that had to be opened for the communication. Looking at the SQL manager, I can see direct SQL calls from various clients.
Seems to me this is entirely the wrong approach. The closest thing I have done to this was a webpage + PHP + MySQL. The webpage would make requests, all the processing would happen server-side, and the page would simply display the results. The sluggishness my users feel is, I think, from the client-side request and processing of the SQL data.
PS: I realize that if they have not done it by now, switching to another paradigm seems out of the question. I just want to know if I am way off base.
You are way off base.
The client side has much more processing power.
Consider the case of one server and 5 clients. Even if the server has 3 times the power of a client, the clients as a whole are still 5:3 more powerful.
If the application is sluggish, it was probably poorly written. You need to investigate the root cause. Client/server is a leading practice in design, so I'm guessing it is not the root cause. It might be badly implemented or there might be other reasons. Your comment about having a local MySQL sounds very fishy to me; there should be no need for this.

Running the same web app on 2 or more physically separate servers?

I am not sure if I should be posting this question here or over at ServerFault so apologies if it is in the wrong place.
I have a small web app that is starting to get some more business.
Currently I have a single dedicated LAMP server for this, and this has worked well - the single server is able to handle all of our traffic.
However... Recently I have been approached by some potential customers who are interested in using the app, but only if their data can be stored on a server in the same province as they are (legal reasons).
I could migrate the server, but I am reluctant to do this. I like where it is now.
So, I am wondering what is involved in having multiple servers in physically separate datacentres far apart, running the same web app? Data between the servers would not need to stay synced, necessarily.
I have never done anything like this before, and am not sure how complicated a job it is. Any suggestions on how and where to start looking into this would be much appreciated.
Thanks (in advance) for your advice.
As long as each customer has their own set of data you can just install another copy of the application in the other datacenter. It will require you to get some structure to your source control and deployment process, but it works. This option will give you two separate databases.
If you have to have one common database for all the customers (e.g. some kind of booking/reservation system for common resources), then you're at a completely different level of complexity, with database replication etc. It's doable, but it's hard.
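The first option, one codebase installed in two datacentres with separate databases, mostly comes down to keeping per-deployment settings out of the code. A rough sketch, assuming a hypothetical `APP_DEPLOYMENT` environment variable and invented host names:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DeploymentConfig:
    region: str
    db_host: str
    db_name: str


# Hypothetical deployments; every name and host below is invented for illustration.
DEPLOYMENTS = {
    "main": DeploymentConfig("current-datacentre", "db.main.example", "webapp"),
    "province-b": DeploymentConfig("province-b", "db.province-b.example", "webapp"),
}


def load_config(env: Optional[Mapping[str, str]] = None) -> DeploymentConfig:
    """Pick the settings for this install from an environment variable."""
    env = os.environ if env is None else env
    name = env.get("APP_DEPLOYMENT", "main")
    try:
        return DEPLOYMENTS[name]
    except KeyError:
        raise ValueError(f"unknown deployment: {name!r}") from None
```

The same release can then be deployed to both servers, with each install pointing only at the database in its own datacentre.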

Sharing variables across multiple sessions

I know I cannot have a global variable in my backend code (java or php or something else) and have different users (and hence sessions) see the same value. If I need to share some values across these user sessions I need to write them to a DB and read it out every time. This seems awfully wasteful to me.
I understand that an apache process (or the app server) will fork and so having global values will not work but if I am looking at a specialized application is there a web server that lets me do this? This should be possible in a web server that uses threads instead of forking processes. But if I need to share global memory I will need to have some kind of locks to properly access them. I understand that it could (and mostly will) get really buggy but will it degrade performance compared to a DB?
Thoughts?
Pav
I'm not sure that's entirely true. Apache will handle each user connection individually - correct. However, I know that in Java it is possible to have a Singleton object that exists for the life of the application, in which you could potentially store values to be used across all user sessions.
When handling each user connection on the server side, each access to this Singleton will access the same object - therefore the same values.
You might want to do some more research into application scope objects as well. I'm not sure exactly what you're trying to achieve due to lack of a use case, but you may find that Java web apps can do more than you expect in this area.
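The singleton idea can be sketched as follows, in Python rather than Java, with threads standing in for concurrent user sessions. `SharedSettings` and `simulate_sessions` are hypothetical names. The lock is what keeps concurrent updates from losing writes.

```python
import threading


class SharedSettings:
    """Process-wide state shared by every request thread (singleton-style)."""

    _instance = None
    _instance_lock = threading.Lock()

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._values = {}

    @classmethod
    def instance(cls) -> "SharedSettings":
        # Create the single shared object on first use.
        with cls._instance_lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance

    def increment(self, key: str, amount: int = 1) -> int:
        with self._lock:  # serialize read-modify-write across threads
            self._values[key] = self._values.get(key, 0) + amount
            return self._values[key]

    def get(self, key: str, default=None):
        with self._lock:
            return self._values.get(key, default)


def simulate_sessions(n_threads: int, hits_each: int) -> int:
    """Run n_threads 'sessions' that all update the same shared counter."""
    shared = SharedSettings.instance()

    def worker() -> None:
        for _ in range(hits_each):
            shared.increment("page_views")

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared.get("page_views")
```

This only holds within a single process. With a forking server each worker process would get its own copy, which is the limitation the question describes.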

Index replication and Load balancing

I am using the Lucene API in my web portal, which is going to have thousands of concurrent users.
Our web server will call the Lucene API, which will sit on an app server. We plan to use 2 app servers for load balancing.
Given this, what should our strategy be for replicating the Lucene indexes to the 2nd app server? Any tips, please?
You could use Solr, which has built-in replication. This is possibly the best and easiest solution, since it would probably take quite a lot of work to implement your own replication scheme.
That said, I'm about to do exactly that myself, for a project I'm working on. The difference is that since we're using PHP for the frontend, we've implemented Lucene in a socket server that accepts queries and returns a list of DB primary keys. My plan is to push changes to the server and store them in a queue, where I'll first store them in the in-memory index, and then flush the in-memory index to disk when the load is low enough.
Still, it's a complex thing to do and I'm set on doing quite a lot of work before we have a stable final solution that's reliable enough.
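The queue, in-memory index and low-load flush described in this answer could be outlined roughly as below. This is not Lucene code: a JSON file stands in for the on-disk index, and `BufferedIndexWriter`, `load_threshold` and the load figure passed to `maybe_flush` are all assumptions made for the sketch.

```python
import json
from collections import deque
from pathlib import Path


class BufferedIndexWriter:
    """Queue changes, apply them to an in-memory index, flush to disk when idle."""

    def __init__(self, path: Path, load_threshold: float = 0.5) -> None:
        self.path = path                      # JSON file standing in for the disk index
        self.load_threshold = load_threshold  # flush only below this load (assumed 0..1)
        self.queue = deque()                  # pending changes pushed by the frontend
        self.memory_index = {}                # doc id -> document, not yet on disk

    def push(self, doc_id: str, doc: dict) -> None:
        self.queue.append((doc_id, doc))

    def drain_queue(self) -> None:
        """Move every queued change into the in-memory index."""
        while self.queue:
            doc_id, doc = self.queue.popleft()
            self.memory_index[doc_id] = doc

    def maybe_flush(self, current_load: float) -> bool:
        """Merge the in-memory index into the disk index if load is low enough."""
        if current_load >= self.load_threshold or not self.memory_index:
            return False
        on_disk = json.loads(self.path.read_text()) if self.path.exists() else {}
        on_disk.update(self.memory_index)
        tmp = self.path.with_suffix(".tmp")
        tmp.write_text(json.dumps(on_disk))
        tmp.replace(self.path)  # swap the new file in with a single rename
        self.memory_index.clear()
        return True
```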
From experience, Lucene should have no problem scaling to thousands of users. That said, if you're only using your second app server for load balancing and not for failover, you should be fine hosting Lucene on only one of those servers and accessing it via NFS (if you have a Unix environment) or a shared directory (in a Windows environment) from the second server.
Again, this is dependent on your specific situation. If you're talking about having 5 million or more documents in your index and needing your Lucene index to support failover, you may want to look into Solr or Katta.
We are working on a similar implementation to what you are describing as a proof of concept. What we see as an end-product for us consists of three separate servers to accomplish this.
There is a "publication" server, that is responsible for generating the indices that will be used. There is a service implementation that handles the workflows used to build these indices, as well as being able to signal completion (a custom management API exposed via WCF web services).
There are two "site-facing" Lucene.NET servers. Access to the API is provided to the site via WCF services. They sit behind a physical load balancer and periodically "ping" the publication server to see if there is a more current set of indices than the one currently running. If there is, the server requests a lock from the publication server and updates its local indices by initiating a transfer to a local "incoming" folder. Once the files are there, it is just a matter of suspending the searcher while the index is attached. It then releases its lock and the other server is free to do the same.
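The poll, lock, transfer and attach cycle described above can be outlined as follows. This is an in-process sketch only: the classes and method names are hypothetical, plain dictionaries stand in for index files, and a `threading.Lock` stands in for the lock that would be requested over WCF.

```python
import threading


class PublicationServer:
    """Builds indices and hands out a lock so one site server updates at a time."""

    def __init__(self) -> None:
        self.version = 0
        self.index = {}
        self._update_lock = threading.Lock()

    def publish(self, index: dict) -> None:
        self.index = dict(index)
        self.version += 1

    def try_acquire(self) -> bool:
        return self._update_lock.acquire(blocking=False)

    def release(self) -> None:
        self._update_lock.release()


class SiteServer:
    """Site-facing search server that periodically polls for a newer index."""

    def __init__(self, publisher: PublicationServer) -> None:
        self.publisher = publisher
        self.version = 0
        self.index = {}
        self.searching = True

    def poll(self) -> bool:
        """Return True if a newer index was fetched and attached."""
        if self.publisher.version <= self.version:
            return False              # nothing newer has been published
        if not self.publisher.try_acquire():
            return False              # another server is updating; retry next poll
        try:
            incoming = dict(self.publisher.index)  # "transfer" to the incoming folder
            new_version = self.publisher.version
            self.searching = False                 # suspend the searcher
            self.index, self.version = incoming, new_version
            self.searching = True                  # resume with the new index
            return True
        finally:
            self.publisher.release()
```

Because only one server holds the lock at a time, at least one of the two should stay available for searches during an update.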
Like I said, we are only approaching the proof of concept stage with this, as a replacement for our current solution, which is a load balanced Endeca cluster. The size of the indices and the amount of time it will take to actually complete the tasks required are the larger questions that have yet to be proved out.
Just some random things that we are considering:
The downtime of a given server could be reduced if two local folders are used on each machine receiving data to achieve a "round-robin" approach.
We are looking to see if the load balancer allows programmatic access so that a node can remove itself from the cluster and add itself back. This would lessen the chance that a user experiences a hang when accessing the site during an update.
We are looking at "request forwarding" in the event that cluster manipulation is not possible.
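The two-folder idea from the list above might look like this sketch: new data is written into the folder that is not being served, then the roles switch. `TwoSlotIndex`, the folder names and the assumption that an index is a flat set of files are all made up for illustration.

```python
from pathlib import Path
from typing import Dict


class TwoSlotIndex:
    """Alternate between two local folders so searches never read a half-copied index."""

    def __init__(self, root: Path) -> None:
        self.slots = [root / "index_a", root / "index_b"]  # hypothetical folder names
        for slot in self.slots:
            slot.mkdir(parents=True, exist_ok=True)
        self.active = 0  # position in self.slots of the folder currently served

    @property
    def active_dir(self) -> Path:
        return self.slots[self.active]

    @property
    def standby_dir(self) -> Path:
        return self.slots[1 - self.active]

    def receive(self, files: Dict[str, bytes]) -> None:
        """Write an incoming index into the standby folder, then switch to it."""
        for old in self.standby_dir.iterdir():
            old.unlink()  # assumes the folder holds flat files only
        for name, data in files.items():
            (self.standby_dir / name).write_bytes(data)
        self.active = 1 - self.active  # the searcher would reopen on active_dir here
```

The previous index stays intact in the standby folder until the next transfer, so it could also serve as a fallback.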
We looked at solr, too. While a lot of it just works out of the box, we have some bench time to explore this path as a learning exercise - learning things like Lucene.NET, improving our WF and WCF skills, and implementing ASP.NET MVC for a management front-end. Worst case scenario, we go with something like solr, but have gained experience in some skills we are looking to improve on.
I create the indices in the filesystem on the publishing backend machines and replicate them over to the marketing nodes.
That way every single load- and fail-balanced node has its own index, without network latency.
The only drawback is that you shouldn't try to recreate the index within the replicated folder, as you'll have the lockfile lying around on every node, blocking the IndexReader until your reindex has finished.