How many active, simultaneous connections can a web server accept?

I know this is a difficult question but here it is, in context:
Our company has a request to build a WordPress website for a certain client. The caveat is that, on one day per year, for a period of about 20 minutes, 5,000 - 10,000 people will attempt to access the home page of this website. Their purpose: Only to acquire an outbound link to another site.
My concern is, no matter what kind of hosting we provide, the server may reject the connections after a certain number of connections are reached.
Any ideas on this?

This does not depend on WordPress. WordPress is basically software to render web pages: it helps you quickly modify the content of a page. Other software, such as Apache, accepts the connections and hands the requests over to, for instance, WordPress.
Apache can be configured to accept more connections; I think the default is around 200. Whether that is a problem really depends on the workload. If the purpose is only to hand out another URL, connections will be terminated quickly, so that's not really an issue. If, on the other hand, you want to generate an entire page using PHP and MySQL, it can take some time before each client is served, and in that case 200 connections are perhaps not sufficient.
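To give an idea of what that tuning looks like, here is a minimal sketch of the relevant httpd.conf directives, assuming Apache 2.4 with the prefork MPM; the exact directive names and sensible values depend on your Apache version, MPM and available RAM, so treat the numbers as illustrative only:

    # Sketch only: raise the worker/connection ceiling (prefork MPM assumed)
    <IfModule mpm_prefork_module>
        StartServers             10
        MinSpareServers          10
        MaxSpareServers          50
        ServerLimit              500
        MaxRequestWorkers        500    # named MaxClients before Apache 2.4
        MaxConnectionsPerChild   10000
    </IfModule>

    # Keep connections short-lived so slots free up quickly
    KeepAlive            On
    KeepAliveTimeout     2
    MaxKeepAliveRequests 50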
As B-Lat points out, you can use cloud computing platforms like Google App Engine or Microsoft Azure that provide a lot of server power but only bill you for the resources you actually consume. In other words, you can accept thousands of connections at once without paying for that capacity on the other days, when clients visit your website less often.

Related

What is a Ray ID (Cloudflare)?

Every time I visit a website that is using Cloudflare's Under Attack Mode, it shows me the usual text telling me to wait a few seconds until Cloudflare has verified I am not a bot. Every time I reload the page, my current Ray ID changes.
What is the purpose of a Ray ID? Is it some kind of session ID?
It is a unique ID which the website operator (and Cloudflare support) can use to debug issues. The Ray ID is actually returned in the headers of most requests that pass through Cloudflare, just not as visibly as what you see on the I'm Under Attack mode page.
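If you want to see this yourself, here is a small sketch, assuming Node 18+ (which has a built-in fetch) and a site that is proxied through Cloudflare; the Ray ID shows up in the cf-ray response header:

    // Sketch: print the Cloudflare Ray ID returned with an ordinary response.
    // Assumes Node 18+ (global fetch) and a site that sits behind Cloudflare.
    async function showRayId(url: string): Promise<void> {
        const res = await fetch(url, { method: "HEAD" });
        // Cloudflare adds a "cf-ray" header to responses it proxies.
        console.log(`${url} -> cf-ray: ${res.headers.get("cf-ray") ?? "(not behind Cloudflare?)"}`);
    }

    showRayId("https://www.cloudflare.com/").catch(console.error);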
I will try to explain both in simple words: Cloudflare and the Ray ID.
Cloudflare:
This is a service that sits in front of your servers, handling incoming requests and allowing or blocking responses depending on multiple security parameters. The service provides website security, performance optimization and DDoS protection, and it acts as a reverse proxy.
When a request is sent to Cloudflare, it initially hits one of the edge servers (the Content Delivery Network) strategically placed around the globe. This allows cached data to be served at a rapid pace without needing to traverse to the origin server to fetch the content.
Ray ID:
A Ray ID is a unique identifier generated by Cloudflare's edge servers. It is included in the response headers of each request processed by those edge servers and is used to help identify specific requests when troubleshooting issues. Beyond troubleshooting, it makes it easier to track a request, where it came from and its geographic location, and to audit traffic.
Hope that clears it up.

Is it a bad idea to have a web browser query another api instead of my site providing it?

Here's my issue. I have a site that provides some investing services. I pay for end-of-day data, which is all I really need for my service, but it feels a bit odd when people check in during the day and it only displays yesterday's closing price. End-of-day data is fine for my analytics, but I want to display delayed quotes on my site.
According to Yahoo's YQL FAQ, if you use IP-based authentication then you are limited to 1,000 calls/day/IP. If my site grows I may exceed that, but I was thinking of pushing this request to the people browsing my site themselves, since it's extremely unlikely that the same IP will visit my site 1,000 times a day (my site itself has no use for this info). I would call a URL from their browser, then parse the results so I can show them in the format of the site's template.
I'm new to web development, so I'm wondering: is it common practice, or a bad idea, to have the user's browser make the API call itself?
It is not a bad idea at all:
You stretch the rate limits this way;
Your server will respond faster (since it does not have to contact the API);
Your page will load faster because the initial response is smaller;
You can load the remaining data from the API asynchronously while your UI is already responsive.
Generally speaking, it is a great idea to talk to APIs from the client: it's more dynamic, you spread the traffic, you get more responsiveness, and so on.
The biggest downside I can think of is that you now depend on the availability of those other services. On the other hand, your server(s) will be stressed less because the traffic is spread out.
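As a concrete sketch of what the client-side call can look like in the browser, here is a TypeScript example; the quote endpoint and the JSON shape are invented for illustration (a real provider, and its CORS policy, will differ):

    // Hypothetical example: fetch a delayed quote straight from the visitor's browser,
    // so the API call counts against the visitor's IP rather than your server's.
    interface Quote {
        symbol: string;
        price: number;
        asOf: string;
    }

    async function loadQuote(symbol: string): Promise<void> {
        // Placeholder URL; substitute your real quote provider here.
        const res = await fetch(`https://quotes.example.com/v1/delayed?symbol=${encodeURIComponent(symbol)}`);
        if (!res.ok) {
            throw new Error(`Quote API returned ${res.status}`);
        }
        const quote: Quote = await res.json();
        // Render the result into the page using your own site's template/markup.
        const el = document.getElementById(`quote-${symbol}`);
        if (el) {
            el.textContent = `${quote.symbol}: ${quote.price} (delayed, as of ${quote.asOf})`;
        }
    }

    loadQuote("AAPL").catch(console.error);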
Hope this helped a bit! Cheers!

Being cost effective with bandwidth for a streaming service

Basically, I'm about to launch a music streaming app, and I'm trying to figure out cost.
Cloud services like S3 and Rackspace Cloud are expensive. As far as scalability is concerned: I'm assuming that an average user listens to music for an hour, and let's say our app scales to hundreds of thousands of users. That's about 90 MB/hour of bandwidth per user. Let's make another assumption and say we average 10,000 concurrent users streaming music over a 24-hour period (90 MB/hr * 10,000 * 24 = 21,600,000 MB = ~20.6 TB)... That's a huge amount of bandwidth! According to Rackspace's pricing, that's $3,780 USD per day... holy crap! Another thing: services like Rdio, Grooveshark, etc. have roughly 15 million (licensed) songs. If I throw that into the mix, that's 15,000,000 * 3 MB (avg song) = ~43,945 GB of storage = $4,300 a month.
So at these rates, there is no way companies like Rdio, Grooveshark, etc. pay this much.
So my question is simple... generally, what are some routes to take when creating a streaming service? Being specific would earn my vote! (AKA, links to well rated companies offering cheaper CDN services or unmetered colocation for a flat rate)
Thanks duders!
More:
Application servers will be hosted on Rackspace... but this is somewhat irrelevant considering the fact that I really just need a fast "cdn"
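For anyone who wants to redo the back-of-the-envelope math above with their own numbers, here is a quick sketch; the per-GB prices are placeholders derived from the question's assumptions, not quoted rates:

    // Back-of-the-envelope streaming cost estimate, using the question's assumptions.
    const mbPerUserHour = 90;        // ~90 MB of audio per listening hour
    const concurrentUsers = 10_000;  // average concurrent listeners
    const hoursPerDay = 24;

    const dailyEgressGB = (mbPerUserHour * concurrentUsers * hoursPerDay) / 1000; // ~21,600 GB/day
    const pricePerGBEgress = 0.18;   // assumed $/GB transferred (placeholder, not a quote)
    console.log(`Egress: ~${dailyEgressGB.toLocaleString()} GB/day, ~$${(dailyEgressGB * pricePerGBEgress).toFixed(0)}/day`);

    const catalogSongs = 15_000_000;
    const mbPerSong = 3;
    const storageGB = (catalogSongs * mbPerSong) / 1024; // ~43,945 GB, following the question's own conversion
    const pricePerGBStorageMonth = 0.10; // assumed $/GB-month (placeholder)
    console.log(`Storage: ~${Math.round(storageGB).toLocaleString()} GB, ~$${(storageGB * pricePerGBStorageMonth).toFixed(0)}/month`);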
Look at accelerating load balancers like jetNEXUS. They are very simple to set up and use techniques like static caching, HTML multiplexing and compression to dramatically reduce the amount of data hitting the actual servers. This can save you a ton of money in bandwidth costs.
I think Rackspace has some Zeus or Jetnexus offerings, and I know that it's available as an option on Amazon's Cloud.
There are plenty of ways to reduce that cost. I know Spotify does the following (among other things):
Cache the songs locally.
Use P2P to download from other clients (they mainly use the server to guarantee low latency).
Only allow high bit-rates for paying users.
I recommend you read the following: http://www.csc.kth.se/~gkreitz/spotify-p2p10/
If you're looking for cheap hosting then I suggest you check out http://www.hetzner.de/. I haven't used them, but I've heard lots of good things about them.
We've been working on reducing the costs for our high-volume email delivery service (http://elasticemail.com), which uses a lot of bandwidth and needs to scale. We found that by switching to OVH we could get much more bandwidth and much more hardware for a lot less money, and they have great APIs to automate a lot of the complexity you'd find in a large infrastructure.
So kudos to OVH (http://ovh.ie) for saving us a lot of money.
I know Rackspace Cloud Files uses Akamai for its CDN (which is included in the price). Akamai doesn't seem to have any pricing on the web, but after some googling they do seem to be expensive.
I'd try these things.
Tell Rackspace your plans and ask if they can work out some kind of a bulk deal.
Contact Akamai and tell them about your plans and see what they offer.
Google "cheapest content delivery network" and see what comes up.
I think a CDN is what you want; that'd give you the capacity you need. I don't think it'd be possible to serve that much from a simple VPS or cloud provider without a CDN behind it.
Basically, if you're serving a lot of static content from cloud servers (VPSs), it's going to clog up your pipes at some point. Even if you have a few servers, they'll eventually reach capacity; with a CDN, all the content is pushed out to the edge nodes, so it basically scales on and on :)
From my experience, the Akamai CDN is awesome. I've used it quite a bit (through Rackspace Cloud Files) and in roughly two years only hit two issues: one was the end user's fault for using some far-away DNS servers, and the other, where a user in Italy or thereabouts was getting their content served from some other country, was fixed in about 1.5 days.
Akamai uses a geo-IP database lookup on the DNS server that requests the URL to give you the IP of a nearby host. This works great for most people, as they'll use their ISP's DNS servers for lookups.
On the plus side, most users get ping times much lower than if they downloaded from America; for example, on the Gold Coast my ping time to Akamai is about 20-50 ms, while to the USA it's 250-400 ms.
Update: After doing some googling myself, this looked promising: http://24ways.org/2008/using-google-app-engine-as-your-own-cdn - they suggest using Google App Engine as a CDN. On the plus side, last time I checked you could do that for free, but on the downside, I wouldn't base a business on the assumption that it will stay free, going by Google's history of releasing free things and later charging for them or dropping them.

Do I need a Content Delivery Network if my audience is in one city?

So I asked a question earlier about having some sort of social network website with lots of images. The problem is that the more users there are, the more images the website will have, and I was afraid it would take a LONG time for the images to load on the client side.
How to handle A LOT of images in a webpage?
So the feedback I got was to get a content delivery network. Based on my limited knowledge, a content delivery network is a series of computers containing copies of data, and clients access certain servers/computers depending on where they are in the world. What if I'm planning to release my website only for a university, only for students? Would I need something like a CDN for my images to load instantly? Or would I need to rent a REALLY expensive server? Thanks.
The major hold-up with having lots of images is the number of requests the browser has to make to the server, and then, in turn, the number of requests the server has to queue up and send back.
While one benefit of a CDN is location (it will load assets from the nearest physical server) the other benefit is that it's another server. So instead of one server having to queue up and deliver all 20 file requests, it can maybe do 10 while the other server is simultaneously doing 10.
Would you see a huge benefit? Well, it really doesn't matter at this point. Having too much traffic is a really good problem to have. Wait until you actually have that problem, then you can figure out what your best option is at that point.
If your target audience will not be very large, you shouldn't have a big problem with images loading. A content delivery network is useful when you have a large application with a distributed user base and very high traffic. Below that, you shouldn't have a problem.
Hardware stress aside, another valuable reason for using a CDN is that browsers limit the number of simultaneous connections to a single host. Let's say the browser is limited to 6 connections and one page load includes 10 images, 3 CSS files and 3 JavaScript files. If all 10 of those images are coming from one host, then it will take a while to get through all 16 of those requests. If, however, the 10 images are loaded from a CDN that uses different hosts, that load time can be drastically reduced.
Even if all your users are geographically close, they may have very different network topologies to reach your hosting provider. If you choose a hosting provider that has peering agreements with all the major ISPs that provide service in your town, then a CDN may not provide you much benefit. If your hosting provider has only one peer who may also be poorly connected to the ISPs in your town, then a CDN may provide a huge benefit, if it can remove latency from some or all of your users.
If you can measure latency to your site from all the major ISPs in your area to your hosting provider, that will help you decide if you need a CDN to help shorten the hops between your content and your clients.

Apache to limit number of different IP addresses that can connect to a server

A group of friends and I are developing a server, and we want to limit the number of users accessing it. We first tried the KeepAlive and MaxClients directives with a relatively small timeout. That worked fine on a simple experimental web page.
But our real web page only loads partially. I think it's because we use the AJAX model, which makes multiple connections: one per part of the web interface.
Anyway, to overcome this we thought of limiting by the number of different IPs connected rather than by the number of connections. We tried to find an Apache module/directive that does this, but we keep finding modules that limit bandwidth/connections per IP (which is the reverse of what we're trying to find).
Does anyone know something that can help me with this?
Thanks in advance.
This sort of thing is usually done at the routing level, since that's where you want to handle DoS-related issues. There is, however, mod_evasive, and you can find it by Googling. It is not really maintained anymore and doesn't seem to have an official homepage either, but if you insist on using Apache for this sort of thing I would check it out in any case...
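For completeness, here is a minimal mod_evasive sketch, assuming the module is installed and loaded (the IfModule name varies by build: mod_evasive20.c or mod_evasive24.c). Note that it throttles per-IP request rates rather than capping the number of distinct IPs, so it guards against abuse rather than enforcing the "only N different visitors" goal from the question. Values are illustrative only:

    # Sketch only: basic mod_evasive thresholds
    <IfModule mod_evasive20.c>
        DOSHashTableSize    3097
        DOSPageCount        5     # max requests for the same page per page interval
        DOSPageInterval     1     # page interval, in seconds
        DOSSiteCount        50    # max requests to the whole site per site interval
        DOSSiteInterval     1
        DOSBlockingPeriod   60    # seconds an offending IP stays blocked
    </IfModule>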