Testing with 100's of devices


We are building software intended to communicate with hundreds or thousands of PCs. Unfortunately we do not have the means to set that many devices up. We don't have that many physical devices, and we don't have enough infrastructure to support that many virtual devices either. We are looking to test with high volumes of PCs, combined with other factors such as network latency. Are there services or other ways we can achieve this level of testing?

There are cloud load testing services out there that might do what you want. A few I know of off hand are LoadStorm and Load Impact. Quite a few others will turn up with a search for something like "cloud load testing". This could be an easy option. There would be some cost, but it wouldn't be too high.
If you want to roll your own solution for free, a lot of infrastructure-as-a-service providers offer a free tier for new users. Amazon EC2 and Microsoft Azure both offer 750 hours of their smallest instance a month for free. While this is usually used for a whole contiguous month of server time (24 hours * 31 days), you could also use it to spin up 750 servers for an hour, once a month. Spread them across all the different regions/data centers available to maximize variance in networks and latency.
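The trade-off between fleet size and run time is simple division. A rough sketch, assuming the free allowance is one shared pool of instance-hours (the 750 figure is from the paragraph above; the function name is made up, and you should check how your provider actually meters and rounds usage):

```python
def hours_per_instance(free_hours: int, instance_count: int) -> float:
    """Hours each instance can run if the monthly allowance is split evenly."""
    if instance_count <= 0:
        raise ValueError("instance_count must be positive")
    return free_hours / instance_count


# One server for the whole month, or 750 servers for a single hour.
print(hours_per_instance(750, 1))    # 750.0
print(hours_per_instance(750, 750))  # 1.0
```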
You could also consider writing a testing tool using a language with good concurrency support, or with a light footprint, so that you can fire up several hundred threads/processes at once, and then run your tests on relatively few servers. It wouldn't be quite the same as 1000s of different IP addresses all at once, but 4-5 servers each fielding a few hundred clients might be enough to satisfy your testing needs.
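As a minimal sketch of that last idea, here is what a few hundred simulated devices on one machine could look like with Python's asyncio. Everything in it is an assumption for illustration: the echo handler stands in for your real software, the random sleep is a crude substitute for network latency, and the names (simulated_device, run_load_test) are hypothetical.

```python
import asyncio
import random


async def handle(reader, writer):
    # Stand-in for the software under test: echo one line back.
    data = await reader.readline()
    writer.write(data)
    await writer.drain()
    writer.close()


async def simulated_device(device_id, host, port, max_latency):
    # Sleep first to imitate this device's network latency.
    await asyncio.sleep(random.uniform(0, max_latency))
    reader, writer = await asyncio.open_connection(host, port)
    message = f"device-{device_id}\n".encode()
    writer.write(message)
    await writer.drain()
    reply = await reader.readline()
    writer.close()
    await writer.wait_closed()
    return reply == message


async def run_load_test(device_count=200, max_latency=0.2):
    # Port 0 lets the OS pick a free port for the local test server.
    server = await asyncio.start_server(handle, "127.0.0.1", 0)
    port = server.sockets[0].getsockname()[1]
    async with server:
        results = await asyncio.gather(
            *(simulated_device(i, "127.0.0.1", port, max_latency)
              for i in range(device_count))
        )
    # Number of devices that received the expected reply.
    return sum(results)


if __name__ == "__main__":
    ok = asyncio.run(run_load_test())
    print(f"{ok} simulated devices got a correct reply")
```

In a real test you would point simulated_device at your actual server and replace the echo exchange with your own protocol. All the clients here share one IP address, which is the limitation mentioned above.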

Related

Forming a web application cluster with 3 VMs running in the same physical box

Are there any advantages whatsoever to forming a cluster if all the nodes are virtual machines running inside the same physical host? Our small company just purchased a server with 16GB of RAM. I proposed to just set up IIS on the box to handle outside requests, but our 'Network Engineer' argues that it would be better to create 3 VMs on the box and form a cluster with the VMs for load balancing. But since they are all in the same box, are there actual benefits to taking the VM approach rather than no VMs?
Thanks.

No, as the overhead of running four operating systems (the host plus three guests) would take a toll on performance. Besides, I believe all modern web servers (including IIS) are multithreaded, so they are already optimised for performance.

Maybe the Network Engineer knows something that you don't. Just ask. Use common sense to analyze the answer.
That said, running VMs always needs resources - but you might not notice. Doesn't make sense? Well, even if you attach the computer with a Gigabit link to the Internet, you still won't be able to process more data than the ISP gives you. If your uplink is 1MB/s, that's the best you can get. Any VM today is able to process that little trickle of data while being bored 99.999% of the time.
Running the servers in VMs does have other advantages, though. First of all, you can take them down individually for maintenance. If the load surges because your company is extremely successful, you can easily add more VMs on other physical boxes and move virtual servers around with a mouse click. If the main server dies, you can set up a replacement machine and migrate the VMs without having to reinstall everything.

I'd certainly question this decision myself as from a hardware perspective you obviously still have a single point of failure so there is no benefit.
From an application perspective it could be somewhat tenuously suggested that this would allow zero downtime deployments by taking VMs out of the "farm" one at a time but you won't get any additional application redundancy or performance from virtualisation in this instance. What you will get is considerably more management overhead in terms of infrastructure and deployment for little gain.
If there's a plan to deploy to a "proper" load balanced environment in the near future this might be a good starting point to ensure your application works correctly in a farm (sticky sessions etc). Although this makes your apparently live environment also a QA server, which is far from ideal.
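To make the "one VM at a time" idea concrete, here is a toy sketch of a rolling deployment. It is not a real deployment tool: the farm is just a dictionary, the node names and version strings are invented, and "deploying" is a single assignment.

```python
def rolling_deploy(farm, new_version):
    """Update nodes one at a time; return the fewest nodes left serving."""
    in_rotation = set(farm)
    min_serving = len(in_rotation)
    for node in list(farm):
        in_rotation.discard(node)      # take the node out of the farm
        min_serving = min(min_serving, len(in_rotation))
        farm[node] = new_version       # deploy while it receives no traffic
        in_rotation.add(node)          # put it back in the farm
    return min_serving


farm = {"vm1": "1.0", "vm2": "1.0", "vm3": "1.0"}
serving = rolling_deploy(farm, "1.1")
print(farm)
print(serving)  # 2
```

With three VMs, two are always serving during the rollout. It does nothing for the hardware problem: all three still sit on one physical box.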

From a performance perspective, 3 VMs on the same hardware are slower.
From an availability perspective, 2 VMs will give higher availability (better protection from app software failures and OS failures, and you can perform maintenance on one node while the other is up).

Services similar to S3/EC2

Does any other provider offer a cloud computing + storage layer like S3/EC2, with free data transfer between the two layers?
I have looked at:
Softlayer CloudLayer Storage -- no free transfer between the cloudlayer storage and cloudlayer computing instances.
Rackspace CloudFiles - Quite a bit of marketing mumbo-jumbo, and something about Cloud Connect, I gave up on the site once the Live Chat CSS Popup started following me around.
Does anyone know of any others?
I'm looking to store some large (non-random access) files for constant re-processing on a storage solution, and process it nearby, without paying transfer costs daily (looking to store in the 500-2000GB range, re-processing it all daily).
Re-processing requires a (Linux) server with a "decent" (weasel word alert) configuration.
Thanks!

'Cloud computing' is a bit of a myth.
They're all just, essentially, virtual private servers. 'Cloud' instances tend to have the flexibility to be billed by the hour, rather than monthly, but they're still just a VPS.
Persistent storage is a useful feature offered by a very limited number of VPS providers, but one that can easily be emulated by having two or more VPSes in the same data centre (Linode are an excellent VPS provider with free local data transfer; sadly they're rather limited by capacity). I don't know of any other VPS/Cloud providers who offer their own persistent storage solution.
It is something you can easily achieve yourself. VPS servers tend to be a little restrictive on hard drive capacity if you're looking for 500-2000GB. Perhaps you could consider a dedicated server and handle storage and processing on the same machine... you can't get data more local than the same machine!

First, the short version: stop looking for “free”.
Now, in more detail: you're looking to consume some somewhat-non-trivial computing, data storage and networking resources. Presumably you've got a good reason for doing this; if you truly have, you'll have the ability to also purchase the resources required for what you want to do. There are a few options on this front, none of which are free:
Buy and host your own hardware.
Buy the hardware and host it in a colocation facility.
Hire the hardware
Long term hire
Short term hire
All that Amazon are doing is short-term, easy-to-set-up hiring of resources. Their prices are quite keen (if some other option is cheaper, it's because it is missing something significant that Amazon do; maybe it's something you don't need but that's up to you to figure out). You can host the core of the Amazon API quite easily on whatever resources you've hired (see Eucalyptus) but be aware that going from having the software and the API to having everything work smoothly is a really big step; the more I work with Eucalyptus installations, the more impressed with Amazon I become. And that's despite being also pretty impressed with Eucalyptus itself.
But none of this is free. It takes real resources to provide – e.g., electricity to power the machines and keep them cool and a building to house them in – and ultimately, that's got to be paid for somewhere. To expect otherwise is to believe that others should have to pay for things for you; it's pretty rare that that happens, and the more you need to consume, the rarer it is (especially if the economy isn't doing too well). So stop thinking in terms of how you can get it for nothing (“freeload”) and instead take a good look at what it really costs to provide through various routes and seek to minimize your costs. If you can't afford even that, your #1 problem isn't hosting but funding; fix that first.
Rest assured you're not alone in this matter. This is what lots of other people worldwide have to do to make their projects into reality. Good luck!

GoGrid has external storage with free transfer and access over typical protocols like SMB, NFS, rsync and FTP. The first two allow for mounting it as a normal drive.
Note also that many providers will allow you to create cloud servers with 2 TB of instance storage. I can't name all of them, but you can find some with cloudorado.com.

Is it feasible to virtualize developer machines? [closed]

It's budgeting time and Corporate is balking at the cost of replacing a coworker's machine who is due for it, needs it, and deserves it.
Our group is a small ISV/SAAS that exists as a division of a larger media group. We are not a cost center, we make money, even this year. We are owned by a mid-size media group whose business model is quite different, and seems driven only by reducing costs.
Our software stack is Visual Studio 2008, SQL 2008, on Windows Server 2008 (so that multiple root websites can be hosted and debugged on each dev's machine). Our target hardware is 3GHz quad-core workstation, 4GB RAM, and RAID 1 mirrored hard drives so that we are protected against the productivity loss of losing a developer hard drive.
Corporate wants to give us a couple powerful, but hand-me-down, decommissioned servers, and then each developer would have a virtual workstation on that server. The computers sitting on our desktops would be dumb terminals at $400-500 each.
I'm trying to be neutral but I doubt it's hard to discern my bias. I'd like to see real developer reactions to this, and I figure this is the best place to get that.
Please include arguments for or against, evidence if you've seen this tried and how well (or not) it has gone.

This sounds like a well intentioned idea, but:
In my experience you need multiple cores, lots of memory, and fast disks to be productive in today's IDEs. I don't see that happening economically in a virtual environment. Individual boxes are still better.
It's also an issue of control. In a virtual environment I can imagine all kinds of restrictions. Will you still be able to install your own tools, for example?
Ultimately, it's misguided. If this idea increases build times by any substantial amount, any savings in hardware will quickly be erased by lost productivity. Conversely, money that is spent on decent individual machines for developers will quickly pay for itself over and over in reduced build times.
Good quality individual machines are an investment, not a cost.

Development is disk-bound, i.e. you spend your time waiting for builds which is a disk-bound process most of the time. If you're all sharing a machine build times will become much worse.

Aside from all of the givens (performance, disk space, etc.):
I would be OK with this as long as I still had multiple monitor support.
Without that, it is a no-go.

Basic failure to understand what a developer box is actually doing much of the time:
When building, it's chewing through processor and disk - especially disk.
When testing you're talking about having one or more instances of Visual Studio running (once you get past two things start to get interesting), database server, website/services plus all the other stuff (browsers with a lot of tabs open, notebook software, and heaven only knows what else) all spread across multiple monitors (at least two). Lots of cores, lots of memory please!
I can quite happily accept that there's an argument for virtualisation - a good dev box should be able to host multiple, concurrent VMs in order to isolate some of the above and to provide "clean" environments for testing. Note that that's the box for ONE developer hosting multiple VMs solely for the benefit of that one developer...

Our team has been developing on a remote server (no GUI stuff, plain old vim) for quite some time without problems. Granted, it requires a rather powerful server, and sometimes it starts to be a bit on the slow side if everyone starts to compile at the same time.
But as a bonus you are very mobile in terms of where you can develop from (we all have laptops), be it the office, home, or a sunny beach (the last one was probably an overstatement).
But yeah, that might not all work well for graphics-heavy apps of course.

It sounds like your group is not offering the solutions that you have considered in a well documented format, otherwise corporate would not be shoving decisions down your throat. If you have a documented process for development, corporate might want to discuss changing the process with you, but as soon as you say, "this change would break our process and we would have to retool our development workflow", they will see the pain of the $$ in reworking the process and most likely back off. That said, once your process is documented, you should internally be ruthless about trying to make it more efficient and cost effective, and have an open mind about corporate's suggestions.

I assume you have machines already for SVN / TRAC, your Continuous Integration server, product demos, testing, etc. and that the only possible use your team could make of these servers is for personal VMs.

I do many things that peg my processor at 100%. Compiles certainly achieve this. Now imagine having to share that processor with 10 other developers. The loss in productivity will become quite apparent. If you have a multi-core PC, this won't be as painful. Get an Intel i7 and you probably won't even notice it when 8 people are logged in. Most programs (including my compiler) can't use more than 1 processor anyway.
That said, it's a viable solution to reduce costs. I used to work at a company who has since switched to these dumb terminals. It works fine. My university had HP UNIX machines that were dumb terminals. They logged into a server that split up the processor ownership among however many people were logged in. What people would do is log into a server and check the number of people logged in. If there were too many, they'd search for the next one, because build times are noticeably slower. I'd never log into the easy to remember server names. =)
It definitely works, but also reduces productivity due to longer build times, especially when multiple people are building at the same time. Since productivity is such a difficult thing to quantify, it might be hard to argue your point.

Graphics acceleration might also be an issue if you need to do anything with animation, video, or image editing. You can't really test video playback through an RDP session since the framerate and/or color depth isn't high enough.

Regardless of performance, at my company we are moving to laptops as developer machines. The main advantage is that developers can bring their computers to meetings, conferences, etc. Also being able to sit next to a colleague when you're helping him with a problem, and having your own development environment available, is very valuable.

Working around development constraints in customer policy

As described before, I work in IT consultancy and move through various customer environments. It is natural to encounter a variety of security policies, and in most environments we have had to go through a security checklist before authorizing our laptops - our mobile development workstations - for connection to their network (most of the time just the development network).
There is this customer who does not allow external computers to connect to their network, so our laptops are... expensive communication computers with mobile GSM modems. We are forced to use their desktop PCs for development, and those workstations are pretty old models with low RAM, single-core Pentium 4 CPUs and cranky disks. Needless to say, development work is sub-optimal, especially when working with Visual Studio solutions that can range from 100 to 400 projects.
For small cases that can be isolated, we develop and test on our own laptops. But for the bigger cases, given that certain development servers like SeeBeyond and mainframe DB2 databases are only on the network, and the prospect of copying hundreds of projects to and fro machines is just ghastly, it does not seem like a technically sound idea.
I am not asking for tricks that violate the customer's policies (e.g. plugging in a laptop that masquerades as a desktop's MAC address). I would just like to know what others have tried to retain some of their advantage and efficiency with their own hardware when working in such environments. Whenever I can, I try to duplicate the environment with virtual servers on my own laptop, but it only goes so far with Microsoft-only server solutions. Virtualizing non-Microsoft servers and software is a challenge.

That's tough. The root cause here is management that doesn't understand that there are real cost implications to their choice of environments.
Your problem is that while you may be billing by the hour, you probably aren't getting paid that way, so your customers' wasted time goes into the pockets of your boss and not to you. A lot of times, this presents a mild conflict of interest. Your company has about zero incentive to speed up your work, and your client doesn't want to make an infrastructure investment in what they see as a temporary engagement.
All I can say is that you have to run this up the flagpole with management. You have to show them that this is taking real time from the projects which could put your deliverable dates at risk, or worse, the reliability of these machines is such that it puts the delivery of the end product at risk as well. The onus is on you to make your management into a believer.
A gig of RAM at Crucial is thirty bucks. If nobody is willing to shell out 90 big ones for 3GB of RAM for your box, you have management that's actively working against you or does not respect you. If it comes to that, you've got bigger problems and need to look for your next employer.

One of the things that I did when I upgraded my current development environment was find links to productivity studies that showed how much productivity increased when the development environment was enhanced. In my particular case it was going from 2 to 3 monitors on my desktop. I was able to find 3-4 articles that described how much was gained by having the extra monitor. It seems self-evident to me that you'd want a newer, well-configured system for developers, especially since the cost of the hardware relative to the cost of the people is so small these days, but the bean counters often think differently. If you can go in armed with some industry studies that show productivity gains, I think it will be harder to dismiss your concerns as just complaints about the environment.
FWIW, I was disappointed to have to do the research for an upgrade that cost less than what the department would spend on paper in a month, but sometimes you have to do things that make no sense to you because it makes sense to someone else.

Write a decent proposal to your manager; that's about all you can do to rectify the situation. If he is unwilling or unable to fix the problem, or unwilling/unable to pass the proposal up to someone who can, then I'd say the current situation is what they've decided to use.
In that case, either live with it, or don't, i.e. move on.
The proposal should contain:
A proposal for what you want done
Why it should be done
The consequences of doing it
And most importantly, the consequences of not doing it
List things like longer development time, or less testing, or less time to write quality code. Basically, a minor upgrade that doesn't cost much will improve the quality of the product tremendously.

I just went through this and found a pretty good solution : get a different job

Just synchronize incrementally. You're not typing so much code per second that a GSM connection cannot keep up with it, are you? Make sure your projects are set up to use mocks/stubs wherever possible.
Setting this up probably is beyond the capability of the systems administrators of your customer.
The dependency on the big databases should be reduced so you only need to run daily regression tests.
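For what the mocks/stubs suggestion might look like, here is a minimal sketch. It is in Python rather than the .NET stack from the question, purely for illustration, and every name in it (CustomerStore, StubCustomerStore, greeting) is hypothetical. The point is that code depending on an interface can be exercised on a laptop without the network-only database.

```python
from typing import Protocol


class CustomerStore(Protocol):
    """What the application needs from the real, network-only database."""

    def find_name(self, customer_id: int) -> str: ...


class StubCustomerStore:
    """In-memory stand-in used when the customer network is unreachable."""

    def __init__(self, rows):
        self._rows = dict(rows)

    def find_name(self, customer_id):
        return self._rows[customer_id]


def greeting(store: CustomerStore, customer_id: int) -> str:
    # Business logic only sees the interface, never the real connection.
    return f"Hello, {store.find_name(customer_id)}"


print(greeting(StubCustomerStore({1: "Ada"}), 1))  # Hello, Ada
```

The real implementation would be swapped in only for the regression runs against the customer's servers.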

Hardware requirements for a Virtual Server

We have decided to go with a virtualization solution for a few of our development servers. I have an idea of what the hardware specs would be like if we bought separate physical servers, but I have no idea how to consolidate that information into the specification for a generalized virtual server.
I know intuitively that the specs are not additive - I shouldn't just add up all the RAM requirements from each machine to get the RAM required for the virtual server. I can't really treat them as parallel systems either because no matter how good the virtualization software is, it can't abstract away two servers trying to peg the CPU at the same time.
So my question is - is there a standard method to estimating the hardware requirements for a virtualized system given hardware requirement estimations for the underlying virtual machines? Is there a +C constant for VMWare/MS Virtual Server overhead (and if so, what is C?)?
P.S. I promise to move this over to serverfault once it goes into beta (Promise kept)

Yes, add 25% additional resources to manage the VMs. So if I need 4 servers that are each equal to a single-core 2 GHz machine with 2 GB of RAM, I will need 10 GHz of processing power plus 10 GB of RAM (4 × 2 = 8, plus 25%). This will allow all systems to redline and still be OK.
In the real world this will never happen though; your servers will not all be running flat out all the time. You can get a feel for usage by profiling your current servers to determine their exact requirements and then adding an additional 25% in resources.
Check out this software for profiling utilization http://confluence.atlassian.com/display/JIRA/Profiling+Memory+and+CPU+usage+with+YourKit
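The 25% rule of thumb from this answer can be written down as a short calculation. This is only a sketch of that rule, not a vendor formula; the function name is made up, and treating GHz as simply additive is itself a simplification.

```python
def host_requirement(vm_specs, overhead=0.25):
    """Sum (GHz, GB RAM) per VM, then add a flat overhead fraction."""
    total_ghz = sum(ghz for ghz, _ in vm_specs)
    total_ram_gb = sum(ram_gb for _, ram_gb in vm_specs)
    return total_ghz * (1 + overhead), total_ram_gb * (1 + overhead)


# Four single-core 2 GHz servers with 2 GB of RAM each.
print(host_requirement([(2.0, 2.0)] * 4))  # (10.0, 10.0)
```

Feeding in profiled figures instead of nominal specs gives the "real world" estimate described above.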

The requirements are in fact additive. You should add up the memory requirements for each VM, and the disk requirements, and have at least one processor core per VM. Then add on whatever you need for the host system.
VMs can share a CPU, to some extent, if you have really low performance requirements, but they cannot share disk space or memory.
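As a sketch of the additive approach in this answer: sum RAM and disk across the guests, count one core per VM, then add the host's own needs. The class, function and example figures are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    ram_gb: float
    disk_gb: float


def additive_host_spec(vms, host_ram_gb, host_disk_gb):
    """Add up guest requirements plus what the host system itself needs."""
    return {
        "cores": len(vms),  # at least one core per VM
        "ram_gb": sum(vm.ram_gb for vm in vms) + host_ram_gb,
        "disk_gb": sum(vm.disk_gb for vm in vms) + host_disk_gb,
    }


# Two example guests plus 2 GB RAM and 20 GB disk reserved for the host.
print(additive_host_spec([Vm(2, 40), Vm(4, 80)], host_ram_gb=2, host_disk_gb=20))
```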

Answers above are far too high, second (1 core per VM) is closer. You can either 1) plan ahead and probably over-purchase 2) add just-in-time. Do you have some reason that you must know well ahead (yearly budget? your chosen host platform doesn't cluster hosts, so you can't add later?)
Unless you have an incredibly simple usage profile, it will be hard to predict beforehand and you'll over-purchase. The answer above (+25%) would be several times more than you need for modern server virtualization software (VMware, Xen, etc.) that manages resources smartly. It's accurate only for desktop products like VPC. I chose to rough it out on a napkin and profile my first environment (set of machines) on the host. I'm happy.
Examples of things that will confound your estimation
Disk space: Some systems (Lab Manager) use only the difference in space from the base template. 10 deployed machines with 10 GB drives using about 10 GB (template) + 200MB.
Disk space: You'll then find you don't like the deltas in specific scenarios.
CPU / Memory: This is a dev shop - so you'll have erratic load. Smart hosts don't reserve memory and CPU.
CPU / Memory: But then you'll want to do perf testing, and want to reserve CPU cycles (not all hosts can do that).
We all virtualize for different reasons. Many of the guests in our environment don't have much work. We want them there to see how something behaves with a cluster of 3 servers of type X. Or, we have a bundle of weird client desktops waiting around, being used one at a time by a tester. They rarely consume many host resources.
So, if you are using something that doesn't do delta disks, disk space might be somewhat calculable. With Lab Manager (delta disks), disk space is really hard to predict.
Memory and processor usage: You'll have to profile or over-purchase heavily. I have many more guest CPUs than host CPUs, and don't have perf problems - but that's because of the choppy usage in our QA environments.
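To show why delta disks make the disk estimate so different, here is a rough comparison of full clones against template-plus-delta storage. It assumes the 200MB in the example above is the delta per deployed machine, which the answer does not actually specify; the function names are invented.

```python
def full_clone_gb(machines, disk_gb):
    """Every machine carries its own complete copy of the disk."""
    return machines * disk_gb


def delta_clone_gb(machines, template_gb, delta_gb_per_machine):
    """One shared template plus a per-machine difference disk (assumed)."""
    return template_gb + machines * delta_gb_per_machine


# 10 machines with 10 GB drives, as in the example above.
print(full_clone_gb(10, 10))        # 100
print(delta_clone_gb(10, 10, 0.2))  # roughly 12
```

The delta figure is the unpredictable part: it grows with whatever each guest writes, which is why this case is hard to budget for.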
