Qemu using too much disk - virtual-machine

I am using CentOS 7 on the host and running Windows Server as a QEMU guest. I created a 15 GB disk, and after a week or so the disk image had grown to 30 GB. Is there a way to stop this? I am using snapshots. Is there a Windows service that may be creating lots of files, or is the disk perhaps being used as RAM to improve performance? Going from 15 GB to 30 GB is a lot, and the server's hard disk is no bigger, so it's a big issue for me. I reinstalled everything and turned a lot of things off, but the same thing happened again. Any help would be appreciated.

You don't mention what kind of disk image you're using or how it is configured. I'm guessing from your description that it's a qcow2 image, since you mention using snapshots. Snapshots stored inside the qcow2 image increase its size beyond what is visible to the guest OS. For example, if you create a qcow2 image with a virtual disk size of 15 GB, it can consume much more than 15 GB on the host if you have saved several snapshots in the image and the guest has written a lot of data between snapshots. Even when snapshots are deleted, the space consumed by qcow2 is normally not released back to the host OS. So while 30 GB of usage sounds quite large, it is not unreasonable if you use snapshots a lot.
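One way to check whether this is what is happening is to compare the size the guest sees with the space the image occupies on the host. The following is a minimal sketch, assuming the image really is qcow2 and that `qemu-img info --output=json` is available on the host. The helper names and the sample values are hypothetical.

```python
import json
import subprocess

GIB = 1024 ** 3

def summarize_image(info_json: str) -> dict:
    """Summarize the JSON printed by `qemu-img info --output=json`."""
    info = json.loads(info_json)
    virtual = info["virtual-size"]  # size the guest sees, in bytes
    actual = info["actual-size"]    # space used on the host, in bytes
    return {
        "virtual_gib": virtual / GIB,
        "actual_gib": actual / GIB,
        "snapshots": len(info.get("snapshots", [])),
        "exceeds_virtual": actual > virtual,
    }

def inspect_image(path: str) -> dict:
    """Run qemu-img against a real image; the caller supplies the path."""
    out = subprocess.run(
        ["qemu-img", "info", "--output=json", path],
        check=True, capture_output=True, text=True,
    ).stdout
    return summarize_image(out)

if __name__ == "__main__":
    # Illustrative values only, not taken from a real image.
    sample = json.dumps({
        "virtual-size": 15 * GIB,
        "actual-size": 30 * GIB,
        "format": "qcow2",
        "snapshots": [{"id": "1", "name": "before-update"}],
    })
    print(summarize_image(sample))
```

If `exceeds_virtual` is true and the snapshot count is non-zero, internal snapshots are the likely cause. Copying the image with `qemu-img convert` usually yields a compact copy, but internal snapshots are not carried over, so treat that as a cleanup step rather than a backup.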

Related

Azure VM disk attachment number is too low. Can this limit be increased?

Based on this blog post https://blogs.technet.microsoft.com/uspartner_ts2team/2015/08/26/azure-vm-drive-attachment-limits/ the number of disks that can be attached is limited to twice the number of CPUs. Is there a technical reason why this limit is in place? If you use Kubernetes, you may not be able to schedule a pod. The scheduler is not aware of this limit.
This was proposed as a workaround: https://github.com/khenidak/dysk but I'm wondering why this very low limit exists in the first place.
The number of data disks is directly tied to the size of the VM. For example, if you go here https://learn.microsoft.com/en-us/azure/virtual-machines/windows/sizes you will see that each larger VM size can handle more data disks.
This constraint is mainly about performance. If you had a virtual machine with only 2 CPU cores and, say, 10 data disks, you would likely run into performance issues, because the CPU power and RAM needed to service all those data disks at once could overwhelm your VM.
The simple solution is to use larger VM sizes if you need more disks. Alternatively, depending on how much space you need, Azure supports data disks of up to 4TB.

Can I use MRJob to process big files in local mode?

I have a relatively big file to process, around 10GB. I suspect it won't fit into my laptop's RAM if MRJob decides to sort it in RAM or something similar.
At the same time, I don't want to set up Hadoop or EMR. The job is not urgent, and I can simply start a worker before going to sleep and get the results the next morning. In other words, I'm quite happy with local mode. I know the performance won't be perfect, but that's OK for now.
So can it process such 'big' files on a single weak machine? If yes, what would you recommend doing (besides setting a custom tmp dir that points to the filesystem rather than to the ramdisk, which would be exhausted quickly)? Let's assume we use version 0.4.1.
I don't think RAM size will be an issue with the Python runner of mrjob. The output of each step should be written to a temporary file on disk, so I believe it should not fill up the RAM. Dumping output to disk is how Hadoop works as well (and the IO involved is the reason it is slow). So I would just run the job and see how it goes.
If the RAM size does turn out to be an issue, you can create enough swap space on your laptop to make it at least run, though it will be slow if the partition isn't on an SSD.
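To illustrate why spilling step output to disk keeps memory bounded, here is a minimal pure-Python sketch of a word count that sorts in fixed-size chunks on disk and then reduces while streaming the merged chunks. This is not mrjob's implementation. It is an external merge sort written only to show the idea, and all function names and the `chunk_size` parameter are hypothetical.

```python
import heapq
import os
import tempfile
from itertools import groupby

def spill_sorted_chunks(pairs, chunk_size, tmp_dir):
    """Sort (key, value) pairs in bounded chunks, writing each chunk to disk."""
    paths, chunk = [], []

    def flush():
        chunk.sort()
        fd, path = tempfile.mkstemp(dir=tmp_dir, suffix=".chunk")
        with os.fdopen(fd, "w") as f:
            for key, value in chunk:
                f.write(f"{key}\t{value}\n")
        paths.append(path)
        chunk.clear()

    for pair in pairs:
        chunk.append(pair)
        if len(chunk) >= chunk_size:
            flush()
    if chunk:
        flush()
    return paths

def read_chunk(path):
    """Stream one sorted chunk file back as (key, value) pairs."""
    with open(path) as f:
        for line in f:
            key, value = line.rstrip("\n").split("\t")
            yield key, int(value)

def word_count(lines, chunk_size=100_000, tmp_dir=None):
    """Map, sort on disk, then reduce while streaming the merged chunks."""
    mapped = ((word, 1) for line in lines for word in line.split())
    paths = spill_sorted_chunks(mapped, chunk_size, tmp_dir)
    try:
        merged = heapq.merge(*(read_chunk(p) for p in paths))
        for word, group in groupby(merged, key=lambda kv: kv[0]):
            yield word, sum(count for _, count in group)
    finally:
        for p in paths:
            os.remove(p)

if __name__ == "__main__":
    for word, count in word_count(["a b a", "b c", "a"], chunk_size=2):
        print(word, count)
```

At most `chunk_size` pairs are held in memory at a time. Pointing `tmp_dir` at a directory on a real filesystem rather than a ramdisk mirrors the custom tmp dir advice in the question.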

How to tackle large data in Redis

My PC has 3GB of memory, and I store data in Redis. I write the data to disk via dump.rdb. I found that when dump.rdb reaches about 1.5GB, Redis shuts down. Why is that?
When I run info in redis-cli, I see that all my memory is used by Redis.
How can I tackle that?
I use 32-bit Ubuntu 12.04.
If I add more memory, a 32-bit machine can only hold 4GB, so if I add 5GB more data, how can I handle that much data?
Thanks, guys.
You're hitting the 32-bit process memory limit. A 32-bit process can address at most 4GB, and part of that address space is reserved for the kernel (typically 1GB on 32-bit Linux, 2GB in some other configurations), leaving only 2 to 3GB for Redis. You then load 1.5GB, and presumably do some more work with it that involves allocating more memory, and somewhere in this process you run out.
If you want to be able to use more than that, you'll need a machine with more RAM and a 64-bit operating system.
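The arithmetic behind that limit can be sketched as follows. The amount reserved for the kernel is an assumption that depends on the OS configuration, so the sketch prints the result for both a 1 GiB and a 2 GiB reservation. The function name is hypothetical.

```python
GIB = 1024 ** 3

def user_address_space(pointer_bits: int, kernel_reserved_bytes: int) -> int:
    """Bytes of virtual address space left for a single process."""
    return 2 ** pointer_bits - kernel_reserved_bytes

if __name__ == "__main__":
    # The kernel reservation is an assumption; check your own system's split.
    for reserved_gib in (1, 2):
        usable = user_address_space(32, reserved_gib * GIB)
        print(f"kernel reserves {reserved_gib} GiB: "
              f"{usable // GIB} GiB left per process")
```

Whatever the exact split, a data set that needs several gigabytes plus working memory will not fit in one 32-bit process.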

Basic virtualization questions

Excuse me for my lack of knowledge but I am really new to the Virtual world and have a few questions.
I work for a small charity that specialises in providing basic IT training. We have recently acquired a few Dell PowerEdge 2650 servers and Dell desktops, and we wish to offer XP, Windows 7, Mac and Ubuntu training. I am looking at setting up a virtual environment so that we can have a standard image for each OS (I currently use image files, but it takes approximately 25 minutes to build each machine, and multi-boot is not an option as the new machines have 20GB disks).
The servers are all dual processor, and we can purchase more memory (I need to justify the cost).
What are the memory requirements for the host?
How many VMs can I run per server?
Can I run multiple instances of the same VM?
Thanks in advance for your knowledge.
Darryn
You might be able to get away with a multi-boot option with those 20 gig disks; each OS will probably take no more than ten gigs for minimal installs, two OSes per machine isn't terrible. (Incidentally, look around for a group like FreeGeek in your area -- larger hard drives ought to be cheap for small sizes like 120-500 gigs.)
That said, virtualization might be just what you need, if you have a handful of pretty powerful machines.
I think between one and two gigabytes of host memory for every guest VM that you want to run would be very useful. At least in my experience, an Ubuntu image I gave 1024 megabytes to ran very quickly, but I didn't press it very far. Running Firefox or OpenOffice inside the VM would probably dictate more memory very quickly. Chrome seemed snappy.
So, if you've got 12 gigabytes of RAM, you might be able to get between four and twenty virtual machines hosted on the machine simultaneously, depending upon what your guests are doing.
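As a rough way to reason about that estimate, the sketch below divides the RAM left after a host reservation by the memory given to each guest. The 2 GB host reservation is an assumption, not a measured figure, and the function name is hypothetical.

```python
def max_guests(host_ram_gb: float, per_guest_gb: float,
               host_reserve_gb: float = 2.0) -> int:
    """Estimate how many guests fit in RAM without overcommitting."""
    usable = host_ram_gb - host_reserve_gb  # RAM left after the host's share
    if usable <= 0 or per_guest_gb <= 0:
        return 0
    return int(usable // per_guest_gb)

if __name__ == "__main__":
    for per_guest in (1.0, 2.0):
        print(f"{per_guest} GB per guest: {max_guests(12, per_guest)} guests")
```

With these assumptions a 12 GB host fits 10 guests at 1 GB each or 5 guests at 2 GB each. Lighter guests or memory overcommit would raise the count, and heavier workloads would lower it.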
As for disk space, if you use QEMU's -snapshot option, you ought to be able to save disk space. Each user could boot the same underlying disk image, but their own modifications would go into the 'snapshot' file. (I have no experience trying to do long-term system maintenance with this option, so it could be that all twenty of your users need to store service pack 2 contents when they upgrade in the future; I'd be scared of trying to modify the shared disk image once you've got snapshots of it running. Perhaps having everyone store 'personal documents' and the like in CIFS shares would make a ton of sense.)
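As far as I know, `-snapshot` sends guest writes to temporary files that are discarded when QEMU exits, so it suits throwaway training sessions. If each user's changes should persist, a related technique (not the one named above) is a per-user qcow2 overlay backed by the shared image. The sketch below only builds the command lines. It assumes the base image is qcow2 and that `qemu-img` and `qemu-system-x86_64` are installed. The file names and helper names are hypothetical.

```python
from typing import List

def overlay_create_cmd(base: str, overlay: str) -> List[str]:
    """Command that creates a qcow2 overlay whose backing file is `base`."""
    return ["qemu-img", "create", "-f", "qcow2",
            "-b", base, "-F", "qcow2", overlay]

def guest_start_cmd(disk: str, memory_mb: int = 1024,
                    throwaway: bool = False) -> List[str]:
    """Command that boots a guest from `disk`.

    With throwaway=True the -snapshot flag is added, so guest writes go to
    temporary files instead of the image.
    """
    cmd = ["qemu-system-x86_64", "-m", str(memory_mb),
           "-drive", f"file={disk},format=qcow2"]
    if throwaway:
        cmd.append("-snapshot")
    return cmd

if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    print(" ".join(overlay_create_cmd("/images/ubuntu-base.qcow2",
                                      "/images/user01.qcow2")))
    print(" ".join(guest_start_cmd("/images/user01.qcow2")))
    print(" ".join(guest_start_cmd("/images/ubuntu-base.qcow2",
                                   throwaway=True)))
```

Either way the shared base image should be treated as read-only once guests depend on it, which matches the caution above about modifying it later.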
The biggest hurdle will probably be Mac; because the Apple terms of service forbid running OS X on non-Apple hardware, you'll have to have some Apple machines around to run VirtualBox.

Carrying and Working on an Entire Development Box from a USB Stick. Feasible?

Lately I have been thinking about investing in a worthy USB pen drive (something along the lines of this), installing operating systems on virtual machines, and developing on them.
What I have in mind is to be able to carry my development boxes, namely a Windows installation for .NET development and a Linux distribution for things like RoR, Perl and whatnot, so that I can take them wherever I need them, be it work, school, different computers at home, etc.
I am also thinking of doing this for backup purposes, i.e. backing up my almost-single VM file to an external HD instead of doing routine updates to my normal Windows box. I am even thinking about committing the VM boxes to source control (is that even feasible?).
So, am I on the right track with this? Do you suggest that I try this out?
How feasible is it to have your development box on a virtual machine that runs from a USB pen drive?
I absolutely agree with where you are heading. I wish to do this myself.
But in case you don't already know, it's not just about drive size. Believe it or not, USB flash drives can be much slower than spinning disk drives!
This can be a big problem if you plan to actually run the VMs directly from the USB drive!
I've tried running a 4GB Windows XP VM on a 32GB Corsair Survivor and the VM was virtually unusable! Copying my 4GB VM off and back onto the drive was also quite slow: about 10 minutes to copy it onto the drive.
If you have an eSATA port, I'd highly recommend looking at high-speed eSATA options like this Kanguru 32GB eSATA/USB flash drive or this 32GB one by OCZ.
The read and write speeds of these drives are much higher over eSATA than those of other USB drives, and you can still use them as USB drives if you don't have an eSATA port. If you don't have one, you can also buy PCI-to-eSATA cards online, and even eSATA ExpressCards for your laptop.
EDIT: A side note: you'll find that USB flash drives use FAT instead of NTFS. You don't want to use NTFS because it performs a lot more writes to the disk, and your drive will only survive a limited number of writes before it dies. But by using FAT you'll be limited in maximum file size (just under 4GB on FAT32, 2GB on FAT16), which might be a problem with your VM. If this is the case, you can split your VM disks into 2GB chunks. Also make sure you back up your VM daily in case your drive does reach its maximum number of writes. :)
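A quick way to see whether a VM disk needs splitting, and into how many pieces, is sketched below. It assumes the FAT32 per-file limit of 4 GiB minus one byte and the 2GB chunk size suggested above. The function names are hypothetical.

```python
GIB = 1024 ** 3
FAT32_MAX_FILE_BYTES = 4 * GIB - 1  # largest single file FAT32 can store

def fits_on_fat32(file_bytes: int) -> bool:
    """True if a single file of this size can be stored on FAT32."""
    return file_bytes <= FAT32_MAX_FILE_BYTES

def chunk_count(disk_bytes: int, chunk_bytes: int = 2 * GIB) -> int:
    """Number of chunk files needed to hold a disk of the given size."""
    return -(-disk_bytes // chunk_bytes)  # ceiling division on integers

if __name__ == "__main__":
    vm_disk = 4 * GIB  # e.g. the 4GB Windows XP VM mentioned earlier
    print("fits as one file:", fits_on_fat32(vm_disk))
    print("2 GiB chunks needed:", chunk_count(vm_disk))
```

A disk of exactly 4 GiB is one byte too large for a single FAT32 file, which is why splitting comes up even for small VMs.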
This article on USB thumbdrives states:
"Never run disk-intensive applications directly against files stored on the thumb drive."
USB thumbdrives use flash memory, which can sustain only a limited number of writes before it goes bad and corruption occurs. The author of the previously linked article found the limit to be in the range of 10,000 to 100,000 writes, so a disk-intensive application could make this an issue.
So if you do this, have an aggressive backup policy for your work. Similarly, it would be ideal if your development suite could write to the local hard drive as a temporary workspace when you run it.
Hopefully you are talking about interpreted-language projects. I couldn't imagine compiling a C/C++ project of any size on a VM, let alone a VM running off a USB drive.
I do it quite frequently with Xen, but also include a bare metal bootable kernel on the drive. This is particularly useful when working on something from which a live CD will be based.
The bad side is the bloat on the VM image to keep it bootable across many machines .. so where you would normally build a very lean and mean paravirtualized kernel only .. you have to also include one that has everything including the kitchen sink (up to what you want, i.e. do you need Audio, or token ring, etc?)
I usually carry two sticks, one has Xen + a patched Linux 2.6.26, the other has my various guest images which are ready to boot either way. A debootstrapped copy of Debian or Ubuntu makes a great starting point to create the former.
If nothing else, it's fun to tinker with. Sorry to be a bit GNU/Linux centric, but that's what I use exclusively :) I started messing around with this when I had to find an odd path to upgrading my distro, which was two years behind the current one. So, I strapped a guest, installed what I wanted and pointed GRUB at the new LV for my root file system. Inside, I just mounted my old /home LV and away I went.
Check out MojoPac:
http://www.mojopac.com/
Hard-core gamers use it to take World of Warcraft with them on the go, so it should work fine for your development needs, at least on Windows. Use Cygwin with it for your Unix development needs.
I used to do this, and found that compiling was so deathly slow, it wasn't worth it.
Keep in mind that USB flash drives are extremely slow (maybe 10 to 100 times slower) compared to hard drives at random write performance (writing lots of small files to a partition which already has lots of files).
A typical build process using GNU tools will create lots of small files - a simple configure script creates thousands of small files and deletes them again just to test the environment before you even start compiling. You could be waiting a long time.
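To get a feel for how a particular drive copes with this pattern before committing to it, a rough micro-benchmark that creates and deletes many small files can help. This is a minimal sketch and not a rigorous benchmark. The function name, file count, and file size are arbitrary assumptions, and each write is followed by `os.fsync` so that it reaches the device instead of staying in the page cache.

```python
import os
import tempfile
import time

def time_small_file_churn(directory=None, count=2000, size=512):
    """Time creating and then deleting `count` small files under `directory`."""
    payload = b"x" * size
    with tempfile.TemporaryDirectory(dir=directory) as work:
        start = time.perf_counter()
        for i in range(count):
            path = os.path.join(work, f"f{i}.tmp")
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # force the write out to the device
        for i in range(count):
            os.remove(os.path.join(work, f"f{i}.tmp"))
        return time.perf_counter() - start

if __name__ == "__main__":
    # Pass a directory on the USB drive to compare it with the internal disk.
    elapsed = time_small_file_churn()
    print(f"created and removed 2000 small files in {elapsed:.2f} s")
```

Running it once against the internal disk and once against a directory on the flash drive gives a rough ratio for how much slower a configure-and-build cycle is likely to feel.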