What are the advantages and disadvantages of using nginx+Apache+mod_wsgi vs. nginx+uWSGI (virtualenv) in production?
The advantages I see in the first variant are that mod_wsgi has been in development since 2007, has more stable releases, and is easy to administer.
The advantages of the second variant are higher performance (see Benchmark of Python WSGI Servers) and the ability to run the uWSGI server in a virtualenv, which is more secure.
The disadvantages of the second variant are that there is still no major version, and that you need to write control scripts to start the uWSGI servers for each virtual host (or use supervisor).
What do you think about it?
When you load your typical large Python web application on top of the most popular WSGI servers, the performance difference isn't actually that large and is usually nothing to get excited about. Hello world benchmarks like the one you quote are very misleading, as they test a very narrow use case and the configurations used are rarely comparable. You should consider watching my PyCon talk, which covers bottlenecks in web servers and web applications.
http://pyvideo.org/video/703/web-server-bottlenecks-and-performance-tuning
Given that the WSGI server is not usually the problem, you should just choose that which you find easiest to manage and has the sorts of features you think you will require. Then use benchmarking and monitoring of that choice to work out how to set it up so as to perform best for your specific web application. Even then, any increase in performance or gains in user satisfaction are not usually going to come from such tuning.
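To illustrate why the choice of server is mostly a management question rather than a code question: a WSGI application is just a callable, and the same callable can be hosted by mod_wsgi, uWSGI, or any other WSGI server. Below is a minimal sketch using only the standard library. The helper `call_app` is a made-up name for invoking the app in-process, and the response text is purely illustrative.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A minimal WSGI app; any WSGI-compliant server can host this callable."""
    body = b"Hello from WSGI\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process, without a server (hypothetical helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], dict(captured["headers"]), body
```

Because nothing in `application` refers to a particular server, switching servers later should not require touching application code, only deployment configuration.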
Tools such as WebAii can be used to visit a website and, with a simple loop, to visit it repeatedly in succession.
If I code a lot of hits to a site in succession and/or with the ability for custom patterns, is this the same functionality as a load/stress testing tool?
Thanks
Essentially, yes. But to get a true test this would most definitely need multithreading or, preferably, be run on numerous clients against the site to reflect concurrent usage. This would make information gathering difficult (WCAT is very good for this but has a bit of a steep learning curve).
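As a rough sketch of the multithreaded variant, assuming plain HTTP GETs rather than a real browser driven by something like WebAii: the function names, default request counts, and the pluggable `fetch` parameter below are all illustrative, not part of any tool mentioned here.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def timed_get(url, timeout=10.0):
    """Fetch url once; return (HTTP status, or None on URL/socket errors, seconds)."""
    start = time.perf_counter()
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            resp.read()
            status = resp.status
    except OSError:  # URLError and socket timeouts are OSError subclasses
        status = None
    return status, time.perf_counter() - start


def run_load(url, total_requests=100, concurrency=10, fetch=timed_get):
    """Issue total_requests fetches using `concurrency` worker threads."""
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        return list(pool.map(fetch, [url] * total_requests))


def summarize(results):
    """Collapse (status, seconds) pairs into a few headline numbers."""
    times = [t for status, t in results if status == 200]
    return {
        "ok": len(times),
        "failed": len(results) - len(times),
        "mean_s": sum(times) / len(times) if times else None,
        "max_s": max(times) if times else None,
    }
```

A single process like this only approximates concurrent usage from one machine; running it from several clients at once is closer to the real thing, at the cost of having to merge the results.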
I had considered writing something myself when I needed to do some stress testing as neither WCAT nor WAST really fit the bill. Had I looked into WebAii I would have considered it.
I wouldn't say that this is load testing unless you have a number of instances running. When load testing web applications with real web browsers, a good rule of thumb is to have one browser per CPU/core.
There are services that you can use to generate realistic load for not much money.
We frequently use web automation tools combined with virtual machines to load test. Each virtual machine uses web drivers following a script, and the scripts are written so that they gate and wait for each other at certain checkpoints, making sure all machines and their browsers have caught up before continuing. That way key things (like clicking a link that kicks off intense calculation) are done simultaneously by all virtual machines.
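The gating idea can be sketched on a single machine with threads standing in for the virtual machines. This is only an analogy: across real VMs the checkpoint would need some shared coordinator, which is not shown. The function name `run_synchronized` and the shape of `steps` are assumptions made for the example.

```python
import threading


def run_synchronized(num_clients, steps):
    """Run each step on all clients, with a checkpoint before every step.

    `steps` is a list of callables that take the client index.
    Returns the (step_index, client_index) pairs in completion order.
    """
    barrier = threading.Barrier(num_clients)
    log = []
    lock = threading.Lock()

    def client(idx):
        for step_no, step in enumerate(steps):
            barrier.wait()  # nobody starts this step until every client arrives
            step(idx)
            with lock:
                log.append((step_no, idx))

    threads = [threading.Thread(target=client, args=(i,))
               for i in range(num_clients)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because each client records a step before waiting at the next checkpoint, every client finishes step N before any client begins step N+1, which is what makes the "intense" step hit the site from all clients at once.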
I've been using PyAMF to write a backend for a Flex app that will request different groups of hundreds of different images depending on what the client needs. I have been using the "simple_server" WSGI server that PyAMF supplies while developing the Flex code. Now I'm ready to write a robust backend that will be able to pull images from a MySQL database and send them as quickly and efficiently as possible to many concurrent clients.
The PyAMF documentation is great because they supply many examples to follow, however I am confused about what kind of backend I am trying to create.
Do I want a SocketServer or a WSGI server or something like Twisted or web2py or Tornado? Are these even all different? :) Should I be using Apache modules instead (mod_wsgi or modjy or mod_python)?
I realize that this probably touches on many open debates, so maybe you could just point me to any good summaries of these debates?
It's great to have so many options, but how do I choose?
The short answer is, of course, that it depends on the requirements of your project.
How many concurrent connections is "a lot"?
How much programmer time can you throw at the problem?
How much hardware can you throw at the problem?
...etc...
If you plan to have lots of concurrent clients, it's hard to beat Twisted in the Python world. However, you'll have to deal with your database asynchronously to avoid blocking, and depending on how complex your database interactions are, this can be a bit of a pain. You're basically limited to either using twisted.enterprise.adbapi or coming up with your own twisted-ORM integration.
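To show what "dealing with the database asynchronously" means without requiring Twisted, here is a sketch that swaps in the standard library's asyncio and sqlite3. The principle is the same one twisted.enterprise.adbapi relies on: the blocking DB-API call runs in a worker thread so the event loop stays free. The table name, function names, and in-memory database are all assumptions for illustration.

```python
import asyncio
import sqlite3


def blocking_query(n):
    """Ordinary blocking DB-API code; runs entirely inside one worker thread."""
    conn = sqlite3.connect(":memory:")  # throwaway database for the example
    try:
        conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, name TEXT)")
        conn.executemany("INSERT INTO images (name) VALUES (?)",
                         [(f"img{i}",) for i in range(n)])
        return conn.execute("SELECT COUNT(*) FROM images").fetchone()[0]
    finally:
        conn.close()


async def handle_request(n):
    """Hand the blocking call to the default thread pool and await the result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, blocking_query, n)


async def main():
    # Both "requests" are in flight at once; neither blocks the event loop.
    return await asyncio.gather(handle_request(3), handle_request(5))
```

The pain mentioned above shows up once queries depend on each other or an ORM wants to issue them implicitly: every such call has to be routed through the thread pool explicitly.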
If you'd rather have "easy" database code (i.e. you want to use an ORM), you're better off going with a (TurboGears/Pylons/plain WSGI) project, probably hosted using Apache and mod_wsgi. This can be a pretty scalable solution, and you get a lot of stuff for free using these frameworks, but it may be more than you need.
I would avoid using one of the many plain Python WSGI servers out there (wsgiref, paster, etc.) in production if you really want high performance.
Good Luck!
Imagine that a large player is undertaking the construction of a new operating system, where backward compatibility requirements are limited to:
Run existing applications written in (or compiled to) JavaScript which are presented in HTML5 and styled with CSS3
Plug and play support for printers, external storage, and optical drives
Degrade gracefully when disconnected from the internet
Sufficient process quotas to support safely permitting tasks to run in the background, including timers
What specific features from existing research operating systems (such as Plan 9) would you like to see enter the mainstream through this channel? Please limit your suggestions to things that have been implemented, and provide a link to the implementation (or at least search terms).
From the Plan 9 docs:
Plan 9 began in the late 1980’s as an attempt to have it both ways: to build a system that was centrally administered and cost-effective using cheap modern microcomputers as its computing elements.
Netbooks qualify as cheap modern microcomputers, and The Cloud qualifies as centrally administered. There is an opportunity to implement the features (in DDaviesBrackett's words) that we want netbooks to have other than by extending a 1970's time-sharing OS; the research operating systems may have proved the value of alternatives by example.
From the Plan 9 FAQ:
Subject: What are its key ideas?
Plan 9 exploits, as far as possible, three basic technical ideas: first, all the system objects present themselves as named files that are manipulated by read/write operations; second, all these files may exist either locally or remotely, and respond to a standard protocol; third, the file system name space - the set of objects visible to a program - is dynamically and individually adjustable for each of the programs running on a particular machine. The first two of these ideas were foreshadowed in Unix and to a lesser extent in other systems, while the third is new: it allows a new engineering solution to the problems of distributed computing and graphics. Plan 9's approach means that application programs don't need to know where they are running; where, and on what kind of machine, to run a Plan 9 program is an economic decision that doesn't affect the construction of the application itself.
Does that not appear to be an excellent fit for the netbook/Cloud domain?
What operating system features would I advocate for Chrome OS?
Here is my wish list as a Plan 9/Inferno fan:
Resources (IP stack, graphics, etc.) as file systems.
Network-transparent file system (i.e., 9P).
Private per-process namespaces.
Factotum-like auth system (i.e., no root user).
Pure UTF-8 everywhere.
Extremely lightweight processes.
Automatic snapshot and de-duplicating storage (à la venti+fossil).
And I guess many others, but this would be enough to make me quite happy.
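For the last item, the core trick behind venti-style storage is content addressing: a block is stored under the hash of its contents, so identical blocks are kept only once and a snapshot is little more than a list of hashes. A toy in-memory sketch of that idea follows. It is not venti's actual protocol or on-disk format, and the class name, function names, and tiny block size are made up for the example.

```python
import hashlib


class BlockStore:
    """Toy content-addressed store: blocks are keyed by the SHA-1 of their bytes."""

    def __init__(self):
        self._blocks = {}

    def put(self, data: bytes) -> str:
        score = hashlib.sha1(data).hexdigest()
        self._blocks.setdefault(score, data)  # identical data is stored once
        return score

    def get(self, score: str) -> bytes:
        return self._blocks[score]

    def __len__(self):
        return len(self._blocks)


def snapshot(store, content: bytes, block_size=4):
    """Split content into blocks; the snapshot is the list of block hashes."""
    return [store.put(content[i:i + block_size])
            for i in range(0, len(content), block_size)]


def restore(store, snap):
    """Rebuild the original bytes from a snapshot."""
    return b"".join(store.get(score) for score in snap)
```

Two snapshots that share most of their blocks therefore cost little more than one, which is what makes automatic, frequent snapshots affordable.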
This is not an 'OS feature' per se, but I would love to have a GUI with mouse-chording.
None.
I'd prefer for a new consumer OS, especially one targeted at Netbooks, to be very very good at doing the things that we already want OSes to be able to do rather than having time spent on features that are, by their nature, experimental.
(Of course, I'd be totally un-bothered by features I wasn't forced to use to develop on the platform; other people's toys are welcome as long as they don't make my job harder.)
I really think that Google might look into Plan9 for inspiration actually. Hearsay (the Internet) claims that several of those who initially developed UNIX and then later scrapped it for a better design (Plan9) are employed by Google. Google is also hosting its own version of Inferno, but I am not sure whether this is any central part of their plan. Further "evidence" could be that the Plan9 authorization system (p9auth) for Linux was published by a Google researcher. The third piece of "evidence" would be that Google claims that Chrome OS will have a novel security architecture.
The authorization seems to me to be one of the GREATEST parts of Plan9 that can be included right now (/net would also be nice but there is no working code for that yet). The idea that a program that needs root access only gets limited access to the parts that are determined by the authorization server is definitely a great step forward compared to the now prevalent user/superuser/root division in Linux, where "man in the middle" attacks can (theoretically) be done by gaining root access (full, as opposed to limited by the authorization server) via a bug in a program granted root.
So I was listening to the latest Stackoverflow podcast (episode 19), and Jeff and Joel talked a bit about scaling server hardware as a website grows. From what Joel was saying, the first few steps are pretty standard:
One server running both the webserver and the database (the current Stackoverflow setup)
One webserver and one database server
Two load-balanced webservers and one database server
They didn't talk much about what comes next though. Do you add more webservers? Another database server? Replicate this three-machine cluster in a different datacenter for redundancy? Where does a web startup go from here in the hardware department?
A reasonable setup supporting an "average" web application might evolve as follows:
Single combined application/database server
Separate database on a different machine
Second application server with DNS round-robin (poor man's load balancing) or, e.g. Perlbal
Second, replicated database server (for read loads, requires some application logic changes so eligible database reads go to a slave)
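The "application logic changes" in the last step usually amount to deciding, per query, which server to talk to. A minimal sketch of that routing, assuming connections are opaque objects and that a statement starting with SELECT is a read. The class name and the `needs_fresh_data` flag are hypothetical; the flag is there because a replica can lag slightly behind the master.

```python
import itertools


class RoutingPool:
    """Send eligible reads to replicas (round-robin), everything else to the master."""

    def __init__(self, primary, replicas):
        self.primary = primary
        self._replicas = itertools.cycle(replicas) if replicas else None

    def connection_for(self, sql, needs_fresh_data=False):
        is_read = sql.lstrip().upper().startswith("SELECT")
        if is_read and not needs_fresh_data and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads that must see the latest write, go to the master.
        return self.primary
```

For example, a page that shows a user their own just-saved edit would pass `needs_fresh_data=True`, while a public listing page could happily read from a slave.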
At this point, evaluating the current state of affairs would help to determine a better scaling path. For example, if read load is high and content doesn't change too often, it might be better to emphasise caching and introduce dedicated front-end caches, e.g. Squid to avoid un-needed database reads, although you will need to consider how to maintain cache coherency, typically in the application.
On the other hand, if content changes reasonably often, then you will probably prefer a more spread-out solution; introduce a few more application servers and database slaves to help mitigate the effects, and use object caching, such as memcached to avoid hitting the database for the less volatile content.
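The object-caching pattern looks roughly like this. The sketch uses an in-process dictionary with expiry times as a stand-in for memcached, so it shows the shape of the logic rather than a real memcached client. `TTLCache`, `cached_fetch`, and the 60-second default are assumptions made for the example.

```python
import time


class TTLCache:
    """In-process stand-in for an object cache such as memcached."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: treat as a miss
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


def cached_fetch(cache, key, load_from_db, ttl=60):
    """Return the cached value, hitting the database only on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)
        cache.set(key, value, ttl)
    return value
```

The expiry time is the knob that trades freshness for database load: less volatile content can tolerate a longer `ttl`. Note that in this sketch a value of `None` is never cached.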
For most sites, this is probably enough, although if you do become a global phenomenon, then you'll probably want to start considering having hardware in regional data centres, and using tricks such as geographic load balancing to direct visitors to the closest "cluster". By that point, you'll probably be in a position to hire engineers who can really fine-tune things.
Probably the most valuable scaling advice I can think of would be to avoid worrying about it all far too soon; concentrate on developing a service people are going to want to use, and making the application reasonably robust. Some easy early optimisations are to make sure your database design is fairly solid, and that indexes are set up so you're not doing anything painfully crazy; also, make sure the application emits cache-control headers that direct browsers on how to cache the data. Doing this sort of work early on in the design can yield benefits later, especially when you don't have to rework the entire thing to deal with cache coherency issues.
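As a sketch of the cache-control point, here is a small WSGI application that picks a Cache-Control header per path. The `/static/` prefix, the one-day lifetime, and both function names are assumptions for the example; the right policy depends entirely on how often your content changes.

```python
def cache_policy(path):
    """Choose a Cache-Control value for a path (illustrative rules only)."""
    if path.startswith("/static/"):
        # Rarely-changing assets: any cache may keep them for a day.
        return "public, max-age=86400"
    # Everything else: browser-only, and revalidate before reuse.
    return "private, no-cache"


def cacheable_app(environ, start_response):
    """Minimal WSGI app that tells browsers how to cache each response."""
    path = environ.get("PATH_INFO", "/")
    body = ("You asked for %s\n" % path).encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Cache-Control", cache_policy(path)),
    ])
    return [body]
```

Keeping the policy in one function makes it easy to revisit later, when front-end caches are introduced and coherency starts to matter.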
The second most valuable piece of advice I want to put across is that you shouldn't assume what works for some other web site will work for you; check your logs, run some analysis on your traffic and profile your application - see where your bottlenecks are and resolve them.
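A first pass at "check your logs" can be very simple. The sketch below assumes a made-up log format of one request per line, `<path> <milliseconds>`; real access logs will need their own parsing. It ranks paths by total time spent, which is often a better guide to bottlenecks than the single slowest request.

```python
from collections import defaultdict


def slowest_paths(log_lines, top=3):
    """Return the `top` paths by total time, as (path, total_ms, hits) tuples."""
    totals = defaultdict(lambda: [0.0, 0])
    for line in log_lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip lines that do not match the assumed format
        path, ms = parts
        try:
            ms_value = float(ms)
        except ValueError:
            continue  # skip lines whose duration is not a number
        totals[path][0] += ms_value
        totals[path][1] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1][0], reverse=True)
    return [(path, total, hits) for path, (total, hits) in ranked[:top]]
```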
Plenty of Fish architecture
Some interesting videos:
YouTube scalability
Interview with Dan Farino, System Architect at MySpace
Joel mentioned adding a second datacenter, with the same setup, and then assigning your users randomly to each. Changes to the data are logged and sent from one location to the other, so that both locations contain all the data.
The talk "Scalable Web Architectures: Common Patterns & Approaches" by Cal Henderson (Yahoo) at Web 2.0 Expo was quite interesting. I thought there was a video, but I could not find it. Here are the slides:
http://www.slideshare.net/techdude/scalable-web-architectures-common-patterns-and-approaches
A likely next step would be a cluster of webservers (a web farm) and a clustered system of database servers (replication, Oracle RAC, etc.).
If you're interested in caching and are using .NET, look into the Caching Application Block in Enterprise Library (of course, use this along with the other points above).
I'm trying to make the case for click-once and smart client development but my network support team wants to keep with web development for everything.
What is the best way to convince them that click-once and smart client development have a place in the business?
We use ClickOnce where I work; in terms of comparison to a web release I would base the case around the need for providing users with a rich client app, otherwise it might well actually be better to use web applications.
In terms of releasing a rich client app ClickOnce is fantastic; you can set it up to enforce updates on startup thus enforcing a version throughout the network. You can make the case that ClickOnce gives you the same benefit of having a single deployment point that web deployment possesses.
Personally I've found ClickOnce to be unbelievably useful. If you're developing rich client .net apps (in Windows, though let's face it the vast majority of real .net development is in Windows) and want to deploy it across a network nothing else compares.
Here are a couple of ideas that may help:
Long-running processes: they are not ASP.NET's best friend.
Scaling: using client-side processing rather than bigger or more servers reduces cost, etc.
They have a place in the Windows environment but not in any other environment, so if you intend to write applications for external clients, you're probably best sticking with web-based development.
I heard this "Write Once, Run Many" promise from Microsoft before, when ASP.NET 1.1 was released; it never happened in practice.
#Mark
scaling, using client side processing as compared to bigger or more servers reduces cost etc.
I'm not sure I would entirely agree with this. It would seem to cost less to buy one powerful server and thousands of "dumb terminals" than one averagely powerful server and thousands of powerful desktop computers.
#GateKiller
When I spoke of scaling, I was talking about the cost of buying more servers, not clients.
Most workstations in an organization barely use 50% of their computing power throughout the day. If I were to use a ClickOnce-deployed application, I would be using the grunt of existing workstations and therefore not adding any further cost to the organization.