Here is the situation:
I have a number of web servers, say 10. I need a (software) load balancer, which can be implemented with a reverse proxy server such as HAProxy or Varnish. All the traffic we serve is over HTTPS, not HTTP, so Varnish is out of the question.
I want to divide the users' requests into a few categories based on one of the input (POST) parameters of the request. Depending on that parameter, I need to route the request to different servers, because different servers respond differently based on it (even if all other POST parameters are the same).
So I need to define a custom load balancing algorithm such that, for one value of that parameter, the load goes to a specific 3 servers (say); for another value, to a specific 2; and for the remaining value(s), to the other 5.
Since I cannot use Varnish, as it cannot be used to terminate SSL (defining a custom algorithm would have been easy in VCL), I am thinking of using HAProxy.
So, here is the question:
Can anyone help me with how to define a custom load balancing function using HAProxy?
I've researched a lot but could not find any documentation on how to do this. So, if it is not possible with HAProxy, can you point me to some other reverse proxy that can also be used as a load balancer and meets both of the above criteria (SSL termination and the ability to define custom load balancing)?
EDIT:
This question is a follow-up to one of my previous questions: Varnish to be used for https
I'm not sure what your goal is, but I'd suggest NOT doing custom routing based on the HTTP request body at all. This will perform very poorly, and the cost will likely outweigh any benefit you are trying to achieve.
Anything that has to parse values beyond the typical HTTP headers at your load balancer will slow things down. Even cookies are generally best avoided if you can.
If you can control the path/route values, that is likely a much better approach than parsing every POST body for certain values.
You can probably achieve what you want via NGINX with lua scripts (the Kong platform is based on them), but I can't say how hard that would be for you...
https://github.com/openresty/lua-nginx-module#readme
Here's an article with a specific example of setting different upstreams based on lua input.
http://sosedoff.com/2012/06/11/dynamic-nginx-upstreams-with-lua-and-redis.html
server {
    # ...BASE CONFIG HERE...
    port_in_redirect off;

    location /somepath {
        lua_need_request_body on;
        set $upstream "default.server.hostname";

        rewrite_by_lua '
            ngx.req.read_body() -- explicitly read the req body
            local data = ngx.req.get_body_data()
            if data then
                -- use data: see
                -- https://github.com/openresty/lua-nginx-module#ngxreqget_body_data
                ngx.var.upstream = some_deterministic_value
            end
        ';

        # ...OTHER PARAMS...
        proxy_pass http://$upstream;
    }
}
In an Express.js app, we would like to rate-limit users who hit a certain route too often, but only if they cause a certain exception. Is there a natural way to do this in Express?
Here's more or less what we have now, not rate-limited.
app.get(
"/api/method",
authenticationMiddleware,
handler
);
Rate-limiter middleware typically looks like this. It counts accesses, and errors out if the user accessed it too many times, before we even get to the handler.
app.get(
"/api/method",
authenticationMiddleware,
rateLimiterMiddleware, // <--- count, and tell them to go away if over limit
handler
);
However, we're fine with them accessing it as many times as they want - we just want to bar them if they have recently caused a lot of exceptions.
In Express, error handlers are supposed to go at the end of the handler chain.
So it seems we have to put the "guard" at the front, and an error-handling "counter" at the end.
app.get(
"/api/method",
authenticationMiddleware,
errorIfTooManyExceptionsByUser, // <--- tell them to go away if over the limit
handler,
countExceptionsForUser // <--- count
);
This seems inelegant, and also a little tricky since the two parts of rate-limiting middleware have to know a lot about each other. Is there a better way?
Perhaps we could get clever and modify the handler(s), to do the guarding and counting before and after they run?
app.get(
"/api/method",
authenticationMiddleware,
rateLimitErrors(handler) // <-- ???
);
Am I missing something or is there a better way to do this?
You could maybe look at how express-redis-cache handles its middleware (https://github.com/rv-kip/express-redis-cache/blob/df4ed8e057a5b7d41d894e6e468f975aa62206f6/lib/ExpressRedisCache/route.js#L184). They wrap Express's send() method with their own logic. Maybe with this you can have only one middleware, but I don't think it's the best solution.
Express Rate Limit
There is an existing middleware that handles rate limiting in Express: https://www.npmjs.com/package/express-rate-limit.
Nginx handling
Express is a lightweight framework; the official docs advise putting Nginx in front of your Express server to handle server-level concerns.
(https://expressjs.com/en/advanced/best-practice-performance.html)
Use a reverse proxy
A reverse proxy sits in front of a web app and performs supporting operations on the requests, apart from directing requests to the app. It can handle error pages, compression, caching, serving files, and load balancing among other things.
Handing over tasks that do not require knowledge of application state to a reverse proxy frees up Express to perform specialized application tasks. For this reason, it is recommended to run Express behind a reverse proxy like Nginx or HAProxy in production.
And Nginx has a rate-limiting system: https://www.nginx.com/blog/rate-limiting-nginx/. I don't know if you can customize it for your specific use case, but I think it's the best way to handle rate limiting.
I recently came across the concept of the ETag HTTP header (this). But I still have a question: for a particular HTTP resource, who is responsible for generating the ETags?
In other words, is it the actual application, the container (e.g. Tomcat), or the web server / load balancer (e.g. Apache/Nginx)?
Can anyone please help?
An overview of typical algorithms used in web servers.
Consider we have a file with:
Size 1047, i.e. 417 in hex.
MTime, i.e. last modification on Mon, 06 Jan 2020 12:54:56 GMT, which is 1578315296 seconds in Unix time or 1578315296666771000 nanoseconds.
Inode, which is the physical file number, 66, i.e. 42 in hex.
Different web servers return ETags like:
Nginx: "5e132e20-417" i.e. "hex(MTime)-hex(Size)". Not configurable.
BusyBox httpd the same as Nginx
monkey httpd the same as Nginx
Apache/2.2: "42-417-59b782a99f493" i.e. "hex(INode)-hex(Size)-hex(MTime in nanoseconds)". Can be configured but MTime anyway will be in nanos
Apache/2.4: "417-59b782a99f493" i.e. "hex(Size)-hex(MTime in nanoseconds)" i.e. without INode which is friendly for load balancing when identical file have different INode on different servers.
OpenWrt uhttpd: "42-417-5e132e20" i.e. "hex(INode)-hex(Size)-hex(MTime)". Not configurable.
Tomcat 9: W/"1047-1578315296666" i.e. weak "Size-MTime in milliseconds". This is an incorrect ETag because for a static file it should be strong, i.e. guarantee octet (byte-for-byte) compatibility.
Lighttpd: "hashcode(42-1047-1578315296666771000)" i.e. INode-Size-MTime, but then reduced to a simple integer by a hash code (dekhash). Can be configured, but you can only disable one part (etag.use-inode = "disabled").
MS IIS: it has the form Filetimestamp:ChangeNumber, e.g. "53dbd5819f62d61:0". Not documented and not configurable, but can be disabled.
Jetty: based on last mod, size and hashed. See Resource.getWeakETag()
Kitura (Swift): "W/hex(Size)-hex(MTime)" StaticFileServer.calculateETag
A few thoughts:
Hex numbers are used here so often because it's cheap to convert a decimal number to a shorter hex string.
Inode, while adding more guarantees, makes load balancing impossible and is very fragile if you simply copy the file during an application redeploy.
MTime in nanoseconds is not available on all platforms, and such granularity is not needed.
Apache has a bug about this: https://bz.apache.org/bugzilla/show_bug.cgi?id=55573
The order MTime-Size vs. Size-MTime also matters, because MTime is more likely to change, so comparing the ETag strings may be faster by a dozen CPU cycles.
Even if this is not a full checksum hash, it is definitely not a weak ETag; it is enough to signal that we expect octet (byte-for-byte) compatibility for Range requests.
Apache and Nginx handle almost all traffic on the Internet, but most static files are served via Nginx, and its scheme is not configurable.
It looks like Nginx uses the most reasonable scheme, so if you are implementing your own, try to make it the same.
The whole ETag is generated in C with one line:
printf("\"%" PRIx64 "-%" PRIx64 "\"", last_mod, file_size)
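If you generate ETags at the application level and want them to match what Nginx would produce for the same file, a minimal Python sketch of the same scheme (the helper name is mine) could look like this:

import os

def nginx_style_etag(path):
    # Nginx's default scheme for static files: "hex(MTime)-hex(Size)".
    st = os.stat(path)
    return '"%x-%x"' % (int(st.st_mtime), st.st_size)

# For the example file above this yields "5e132e20-417".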
My proposal is to take the Nginx scheme and make it the W3C-recommended ETag algorithm.
As with most aspects of the HTTP specification, the responsibility ultimately lies with whoever is providing the resource.
Of course, it's often the case that we use tools (servers, load balancers, application frameworks, etc.) that help us fulfill those responsibilities. But there isn't any specification defining what a "web server", as opposed to the application, is expected to provide; it's just a practical question of what features are available in the tools you're using.
Now, looking at ETags in particular, a common situation is that the framework or web server can be configured to automatically hash the response (either the body or something else) and put the result in the ETag. Then, on a conditional request, it will generate a response and hash it to see if it has changed, and automatically send the conditional response if it hasn't.
To take two examples that I'm familiar with, nginx can do this with static files at web server level, and Django can do this with dynamic responses at the application level.
That approach is common, easy to configure, and works pretty well. In some situations, though, it might not be the best fit for your use case. For example:
To compute a hash to compare to the incoming ETag you first have to have a response. So although the conditional response can save you the overhead of transmitting the response, it can't save you the cost of generating the response. So if generating your response is expensive, and you have an alternative source of ETags (for example, version numbers stored in the database), you can use that to achieve better performance.
If you're planning to use the ETags to prevent accidental overwrites with state-changing methods, you will probably need to add your own application code to make your compare-and-set logic atomic.
So in some situations you might want to create your ETags at the application level. To take Django as an example again, it provides an easy way for you to provide your own function to compute ETags.
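For illustration, here is a rough sketch of that Django mechanism using its etag view decorator; the Event model and its version field are assumptions of mine, not something from the question:

from django.views.decorators.http import etag

def event_etag(request, event_id):
    # Build the tag from data you already have (a hypothetical "version"
    # column on a hypothetical Event model), so a 304 can be sent without
    # rendering the full response.
    version = Event.objects.values_list("version", flat=True).get(pk=event_id)
    return '"%s-%s"' % (event_id, version)

@etag(event_etag)
def event_detail(request, event_id):
    # Only does the expensive work when the client's If-None-Match
    # does not match the ETag computed above.
    ...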
In sum, it's ultimately your responsibility to provide the ETags for the resources you control, but you may well be able to take advantage of the tools in your software stack to do it for you.
I'm developing a simple HTTPS proxy (written in Python) which receives POST/GET requests/responses, applies some transformation and finally forwards the result to the recipient.
I need to handle chunked-encoded requests/responses in a "streaming" fashion, meaning that as soon as a chunk is received the proxy transforms it and forwards it to the recipient.
Before deciding to support chunked-encoded requests, I had been using mitmproxy (http://mitmproxy.org/) and it worked perfectly. Unfortunately, I noticed that it waits until the entire body has been received before letting me handle the request/response.
How can I implement a proxy supporting chunked-encoded requests/responses? Has anyone of you ever done something like this?
Thanks
EDIT: MORE INFO ON MY USE CASE
I need to handle POST requests and GET responses.
In the POST request I receive a JSON object and I have to encrypt some of its values.
In the GET response I receive a JSON object and I have to decrypt some of its values.
Till now, the following code has worked perfectly:
def handle_request(self, r):
    if r.method == 'POST':
        # encryption of r.get_form_urlencoded()
        ...

def handle_response(self, r):
    if r.request.method == 'GET':
        # decryption of r.content
        ...
How can I do the same thing with single chunks?
EDIT: UPDATES
After evaluating different solutions, I decided to go for Squid (proxy) + ICAP (content adaptation).
I've successfully configured Squid and the performance is just great. Unfortunately, I can't find a suitable ICAP server (in Python, if possible) for doing content adaptation (modification). I thought this one https://github.com/netom/pyicap could do the job, but it looks like it doesn't read the body of my POST requests.
Do you guys know a Python ICAP server that I can use together with Squid?
Thanks
The answer below is outdated. You can now pass --stream to mitmproxy, whose behaviour is explained in the mitmproxy documentation.
mitmproxy developer here. This is definitely a feature we want for mitmproxy as well, but it's not that trivial and probably not coming very soon. If you really want to implement that yourself, I can recommend two things:
If you have a very specific use case, you can employ libmproxy.protocol.http.HTTPRequest.from_stream for parsing the header and do the body processing yourself.
If you do not want to modify the request/response body, you may find it sufficient to modify mitmproxy itself. In a nutshell, you would need to read the request/response without its content (see 1.), modify it to your needs, pass it to the server and then delegate control to libmproxy.protocol.tcp (see https://github.com/mitmproxy/mitmproxy/blob/master/libmproxy/proxy/server.py#L169).
If you have further questions, don't hesitate to ask here or on mitmproxy's IRC channel.
Re Comment #1:
You can't take too much out of mitmproxy, but at least you can delegate the header parsing & processing to it.
# ...accept request, socket.makefile() etc...
req = HTTPRequest.from_stream(client_conn.rfile, include_content=False)
# manually forward to the server (req._assemble_head())
# manually receive response body chunk by chunk and forward it to the server, see
# https://github.com/mitmproxy/netlib/blob/master/netlib/http.py#L98
resp = HTTPResponse.from_stream(server_conn.rfile, include_content=False)
# manually forward headers
# manually process body and forward
That being said, this is a fairly complex topic. Eventually, you're better off hacking that directly into libmproxy.protocol.http.HTTPHandler.
Another option, depending on your use case again: use mitmproxy, set the conntype to tcp, forward traffic as-is, and use regex replacements on the content in libmproxy.protocol.tcp. Probably the easiest way, but also the most hacky one.
If you can provide some context, I may guide you further in the right direction.
Re Comment #2:
Before we get to the main part: JSON is a really bad choice for streaming/chunking unless you want to encrypt the complete JSON object and treat it as a single string. You should definitely consider something like tnetstrings if you only want to encrypt parts.
Apart from that, hooking into read_chunk works, but first you need to get to the point where you can actually receive chunks over the wire. Then it's as simple as reading the individual chunks, encrypting them and forwarding them.
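To make that concrete, here is a minimal sketch (not mitmproxy-specific; the function and parameter names are mine) of reading a chunked body from one socket file object, transforming each chunk, and re-emitting it on the other side:

def relay_chunked_body(rfile, wfile, transform):
    # Minimal chunked-transfer relay: read each chunk, transform it, and
    # re-emit it with a recalculated length.
    while True:
        size_line = rfile.readline()                 # e.g. b"1a3f\r\n"
        chunk_size = int(size_line.split(b";")[0].strip(), 16)
        if chunk_size == 0:
            # last chunk: skip optional trailers up to the blank line
            while rfile.readline() not in (b"\r\n", b"\n", b""):
                pass
            wfile.write(b"0\r\n\r\n")
            return
        chunk = rfile.read(chunk_size)
        rfile.readline()                             # CRLF after the chunk data
        out = transform(chunk)                       # e.g. encrypt/decrypt here
        wfile.write(b"%x\r\n" % len(out) + out + b"\r\n")

Because each chunk's length is recalculated after the transformation, the two sides never have to agree on sizes in advance; the caveat is that transform() sees arbitrary byte ranges, not complete JSON values, which is exactly the JSON concern above.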
I'm thinking about the best way to create a cache layer in front of, or as the first layer of, my RESTful API (written in Ruby) for GET requests.
Not every request can be cached, because even for some GET requests the API has to validate the requesting user / application. That means I need to configure which requests are cacheable and how long each cached answer is valid. For a few cases I need a very short expiration time, e.g. 15 s and below. And the API application should be able to expire cache entries even if the expiration date has not been reached yet.
I already thought about many possible solutions, my two best ideas:
first layer of the API (even before the routing), with the cache logic written by myself (to have all configuration options in my hands), and answers and expiration dates stored in Memcached
a web server proxy (highly configurable), perhaps something like Squid, but I have never used a proxy for a case like this before and I'm absolutely not sure about it
I also thought about a cache solution like Varnish; I have used Varnish for "usual" web applications and it's impressive, but the configuration is kind of special. Still, I would use it if it turned out to be the fastest solution.
Another thought was to cache in the Solr index, which I'm already using in the data layer to avoid querying the database for most requests.
If someone has a hint or good sources to read about this topic, let me know.
Firstly, build your RESTful API to be RESTful. That means authenticated users can also get cached content: to keep all state in the URL, it needs to contain the auth details. Of course the hit rate will be lower here, but it is still cacheable.
With a good deal of logged-in users it will be very beneficial to have some sort of model cache behind a full-page cache, as many models are still shared even if some aren't (in a good OOP structure).
Then, for a full-page cache, you are best off keeping all the requests off the web server, and especially away from the dynamic processing in the next step (in your case Ruby). The fastest way to cache full pages from a normal web server is always a caching proxy in front of the web servers.
Varnish is in my opinion as good and easy as it gets, but some prefer Squid indeed.
memcached is a great option, and I see you mentioned this already as a possible option. Also Redis seems to be praised a lot as another option at this level.
On the application level, in terms of a more granular approach to caching on a file-by-file and/or module basis, local storage is always an option for common objects a user may request over and over again. This can be as simple as dropping response objects into the session so they can be reused instead of making another HTTP REST call, and coding appropriately.
People go back and forth debating Varnish vs. Squid, and both seem to have their pros and cons, so I can't comment on which one is better, but many people say Varnish with a tuned Apache server is great for dynamic websites.
Since REST is an HTTP thing, it could be that the best way of caching requests is to use HTTP caching.
Look into using ETags on your responses, checking the ETag in requests so you can reply with '304 Not Modified', and having Rack::Cache serve cached data if the ETags are the same. This works great for Cache-Control 'public' content.
Rack::Cache is best configured to use memcache for its storage needs.
I wrote a blog post last week about the interesting way that Rack::Cache uses ETags to detect and return cached content to new clients: http://blog.craz8.com/articles/2012/12/19/rack-cache-and-etags-for-even-faster-rails
Even if you're not using Rails, the Rack middleware tools are quite good for this stuff.
Redis Cache is the best option; check here.
It is open source: an advanced key-value cache and store.
I’ve used redis successfully this way in my REST view:
from django.conf import settings
from django.utils.encoding import force_bytes
from redis import StrictRedis
from rest_framework.generics import ListAPIView
from rest_framework.permissions import IsAdminUser
from rest_framework.renderers import JSONRenderer
from rest_framework.response import Response
import hashlib
import json
import logging

logger = logging.getLogger(__name__)
# Event and EventSerializer are this app's own model and serializer.

def get_redis():
    # get redis connection from RQ config in settings
    rc = settings.RQ_QUEUES['default']
    cache = StrictRedis(host=rc['HOST'], port=rc['PORT'], db=rc['DB'])
    return cache
class EventList(ListAPIView):
    queryset = Event.objects.all()
    serializer_class = EventSerializer
    renderer_classes = (JSONRenderer, )

    def get(self, request, format=None):
        if IsAdminUser not in self.permission_classes:  # dont cache requests from admins
            # make a key that represents the request results you want to cache
            # your requirements may vary
            key = get_key_from_request()
            # I find it useful to hash the key, when query parms are added
            # I also preface event cache key with a string, so I can clear the cache
            # when events are changed
            key = "todaysevents" + hashlib.md5(force_bytes(key)).hexdigest()
            # I dont want any cache issues (such as not being able to connect to redis)
            # to affect my end users, so I protect this section
            try:
                cache = get_redis()
                data = cache.get(key)
                if not data:
                    # not cached, so perform standard REST functions for this view
                    queryset = self.filter_queryset(self.get_queryset())
                    serializer = self.get_serializer(queryset, many=True)
                    data = serializer.data
                    # cache the data as a string
                    cache.set(key, json.dumps(data))
                    # manage the expiration of the cache
                    expire = 60 * 60 * 2
                    cache.expire(key, expire)
                else:
                    # this is the place where you save all the time
                    # just return the cached data
                    data = json.loads(data)
                return Response(data)
            except Exception as e:
                logger.exception("Error accessing event cache\n %s" % (e))
        # for Admins or exceptions, BAU
        return super(EventList, self).get(request, format)
In my Event model updates, I clear any event caches. This is hardly ever performed (only Admins create events, and not that often), so I always clear all event caches:
class Event(models.Model):
    ...

    def clear_cache(self):
        try:
            cache = get_redis()
            eventkey = "todaysevents"
            for key in cache.scan_iter("%s*" % eventkey):
                cache.delete(key)
        except Exception:
            pass

    def save(self, *args, **kwargs):
        self.clear_cache()
        return super(Event, self).save(*args, **kwargs)
I need to load balance incoming calls to Asterisk. To do this, I have set up an OpenSER server in front of it and loaded and configured the dispatcher module. What I want is for the OpenSER server to receive the calls and route them to the least "busy" Asterisk server, which then takes care of the rest (I have an IVR menu set up on each of the servers). I am using the X-Lite softphone for testing. The same users are registered in both Asterisk and OpenSER. When I initiate a call it just goes through the OpenSER server; it does not get forwarded to any of the Asterisk boxes. I am wondering if I am missing any configuration or step in my setup.
Thank you in advance
The dispatcher module cannot do any type of load balancing. It is a "stateless" module, which means that it does not keep track of how many calls are sent to each box.
You can choose between different types of routing logic; the available types are:
“0” - hash over callid
“1” - hash over from uri.
“2” - hash over to uri.
“3” - hash over request-uri.
“4” - round-robin (next destination).
“5” - hash over authorization-username
“6” - random (using rand()).
“7” - hash over the content of PVs string.
“X” - if the algorithm is not implemented, the first entry in set is chosen.
The one most likely to distribute the load fairly is round-robin (option 4).
To use it, call the following function in the route section of your openser.cfg:
ds_select_dst("1", "4");
The first parameter is your GW group, the second is the routing type.
For more info check this page
Hope this helps
The dispatcher module cannot do that. You'd have to use the (surprise!) load balancer module