How to tackle large data in Redis

My PC has 3GB of memory, and I store data in Redis, which persists it to disk in dump.rdb. I found that when dump.rdb reaches about 1.5GB, Redis shuts down. Why is that?
When I run info in redis-cli, I see that all my memory is used by Redis.
How can I tackle that?
I use 32-bit Ubuntu 12.04.
If I add more memory, a 32-bit machine can only hold 4GB, so if I need to add 5GB more data, how do I handle that much data?
Thanks, guys.

You're hitting the 32-bit process memory limit. Part of the 4GB address space is reserved for the kernel (2GB on some configurations, 1GB by default on most 32-bit Linux kernels), leaving only about 2 to 3GB for the process. You then load 1.5GB of data, presumably do more work with it that allocates more memory, and somewhere in this process you run out.
If you want to use more than that, you'll need a machine with more RAM and a 64-bit operating system.
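To see how close an instance is to that ceiling, you can compare the used_memory field of the INFO output against the limit. The sketch below only parses INFO-style text; the 2 GiB ceiling is the conservative figure from the answer above, and the sample text is made up for illustration, so pass in your own numbers.

```python
def parse_info(text):
    """Parse the key:value lines of Redis INFO output into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and section headers such as "# Memory"
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


# Assumption: a 2 GiB ceiling for a 32-bit process; adjust for your kernel.
ADDRESS_SPACE_LIMIT = 2 * 1024**3


def headroom_bytes(info_text, limit=ADDRESS_SPACE_LIMIT):
    """Return how many bytes remain before used_memory reaches the limit."""
    used = int(parse_info(info_text)["used_memory"])
    return limit - used


# Illustrative INFO excerpt (not real measurements).
sample = "# Memory\r\nused_memory:1610612736\r\nused_memory_human:1.50G\r\n"
print(headroom_bytes(sample))
```

In practice you would feed it the output of `redis-cli info memory`. A small or negative result means the data set no longer fits comfortably in a 32-bit process.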

Related

Qemu using too much disk

I am using CentOS 7 on the host and emulating Windows Server with QEMU. I created a 15 GB disk, and after a week or so the disk image grows to 30 GB. Is there a way to stop this? I am using snapshots. Is there a Windows service that may be creating lots of files, or is the disk perhaps being used as RAM to improve performance? Going from 15 GB to 30 GB is a lot, and the server's disk is not any bigger, so it's a big issue for me. I reinstalled everything and turned a lot of things off, but the same thing happens again. Any help would be appreciated.

You don't mention what kind of disk image you're using or how it is configured. I'm guessing from your description that it's a qcow2 image, since you mention snapshots. Snapshots stored inside the qcow2 image increase its size beyond what is visible to the guest OS. For example, if you create a qcow2 image with a virtual disk size of 15 GB, it can consume much more than 15 GB on the host if you have saved several snapshots in the image and the guest has written a lot of data between each snapshot. Even when snapshots are deleted, the space consumed by qcow2 is not normally released back to the host OS. So while 30 GB of usage sounds quite large, it is not totally unreasonable if you use snapshots a lot.
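To check whether snapshots explain the growth, you can compare the virtual size with the size on the host as reported by `qemu-img info --output=json`. This is a rough sketch that only parses that JSON report; the sample below is invented to match the numbers in the question, and the file and snapshot names are hypothetical.

```python
import json


def image_overhead(info_json):
    """Summarise a `qemu-img info --output=json` report."""
    info = json.loads(info_json)
    virtual = info["virtual-size"]  # size the guest sees, in bytes
    actual = info["actual-size"]    # space consumed on the host, in bytes
    return {
        "virtual_gib": virtual / 1024**3,
        "actual_gib": actual / 1024**3,
        "ratio": actual / virtual,
        "snapshots": len(info.get("snapshots", [])),
    }


# Made-up sample shaped like the question: 15 GiB virtual, 30 GiB on the host.
sample = json.dumps({
    "filename": "windows.qcow2",
    "format": "qcow2",
    "virtual-size": 15 * 1024**3,
    "actual-size": 30 * 1024**3,
    "snapshots": [
        {"id": "1", "name": "before-update"},
        {"id": "2", "name": "weekly"},
    ],
})
print(image_overhead(sample))
```

A ratio well above 1 together with a non-empty snapshot list points at internal snapshots rather than at anything the guest is doing.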

What stops a virtual machine from starting by having "too much" ram

In VirtualBox, there is a slider for how much RAM to allocate to the machine. However, when I hover over it with my cursor, it says:
If you assign too much, the machine might not start
Since I have 40GB of RAM, what is too much, and why would a lot of RAM prevent it from starting?
In case it was not apparent, I am looking for a programming-related answer, not how to use the software.

Not sure about VirtualBox but, in VMware, if you allocated too much, the VM wouldn't start because the memory couldn't be taken from the host OS.
The memory was allocated away from the host in one big chunk, and held for as long as the VM was running.
So it would be similar to if you tried to pre-allocate a 60G virtual disk on a drive with only 40G free. Except that would fall over at VM creation time rather than run time.
Trying to grab more memory than the host could provide is something that will cause you an issue when you run the VM rather than when you create it, because, unlike the pre-allocated disk, it's not needed until then.
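As a rough illustration of that run-time check, the sketch below compares a requested guest RAM size with the MemAvailable figure from /proc/meminfo on a Linux host. This is not how VirtualBox or VMware actually decide; the 1024 MiB host reserve and the sample text are assumptions made up for the example.

```python
def mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def guest_ram_fits(requested_mib, meminfo_text, host_reserve_mib=1024):
    """True if the guest's RAM request still leaves host_reserve_mib for the host."""
    available_mib = mem_available_kib(meminfo_text) // 1024
    return requested_mib + host_reserve_mib <= available_mib


# Illustrative text for a 40 GiB host with 32 GiB currently available.
sample = (
    "MemTotal:       41943040 kB\n"
    "MemFree:         2097152 kB\n"
    "MemAvailable:   33554432 kB\n"
)
print(guest_ram_fits(16 * 1024, sample))  # 16 GiB guest
print(guest_ram_fits(36 * 1024, sample))  # 36 GiB guest
```

On a real host you would pass `open("/proc/meminfo").read()` instead of the sample text.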

Can I use MRJob to process big files in local mode?

I have a relatively big file to process, around 10GB. I suspect it won't fit into my laptop's RAM if MRJob decides to sort it in RAM or something similar.
At the same time, I don't want to set up Hadoop or EMR. The job is not urgent, and I can simply start a worker before going to sleep and get the results the next morning. In other words, I'm quite happy with local mode. I know the performance won't be great, but that's OK for now.
So can it process such 'big' files on a single weak machine? If yes, what would you recommend (besides setting a custom tmp dir that points to the filesystem rather than to the ramdisk, which would be exhausted quickly)? Let's assume version 0.4.1.

I think the RAM size won't be an issue with the Python runner of mrjob. The output of each step should be written to a temporary file on disk, so I believe it should not fill up the RAM. Dumping output to disk is how Hadoop works as well (and the IO is the reason it is slow). So I would just run the job and see how it goes.
If the RAM size does turn out to be an issue, you can create enough swap space on your laptop to make it at least run, though it will be slow if the swap partition isn't on an SSD.
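To show why spilling intermediate data to disk keeps RAM usage bounded, here is a minimal external merge sort. It is a sketch of the general technique, not mrjob's actual code; the function name, the chunk size, and the assumption that input lines contain no newline characters are all mine.

```python
import heapq
import os
import tempfile
from itertools import islice


def external_sort(lines, chunk_size=100_000, tmp_dir=None):
    """Yield lines in sorted order, holding at most chunk_size lines in RAM."""
    paths = []
    it = iter(lines)
    try:
        # Phase 1: sort fixed-size chunks in memory and spill each one to disk.
        while True:
            chunk = sorted(islice(it, chunk_size))
            if not chunk:
                break
            fd, path = tempfile.mkstemp(dir=tmp_dir, text=True)
            with os.fdopen(fd, "w") as f:
                f.writelines(line + "\n" for line in chunk)
            paths.append(path)
        # Phase 2: stream a k-way merge over the sorted chunk files.
        files = [open(p) for p in paths]
        try:
            merged = heapq.merge(*files, key=lambda l: l.rstrip("\n"))
            for line in merged:
                yield line.rstrip("\n")
        finally:
            for f in files:
                f.close()
    finally:
        for p in paths:
            os.remove(p)


print(list(external_sort(["b", "a", "c", "aa"], chunk_size=2)))
```

Pointing tmp_dir at a real disk rather than a ramdisk is the same precaution the question mentions for mrjob's temporary directory.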

Shrinking JVM memory and Swap

Virtual Machine:
4CPU
10GB RAM
10GB swap
Java 1.7
-Xms6144m -Xmx6144m
Tomcat 7
We observed some very strange behaviour with the JVM. The JVM's resident memory began to shrink and the swap usage shot up to over 50%.
Please see below stats from monitoring tools.
http://i44.tinypic.com/206n6sp.jpg
http://i44.tinypic.com/m99hl0.jpg
Any pointers to help understand this would be appreciated.
Thanks!

Or maybe your Java program was idle and didn't need that memory, and you have high swappiness? In that situation your OS would free RAM just in case and keep only the actively used part resident.
In my opinion, that is actually good behaviour: why waste RAM on a process that won't use it?
However, if this is the only process running on the VM, it would be a good idea to set swappiness to 0 or another small number. The memory was given to this single process, so there is little reason to swap it out.

Thanks for the response. Yes, this is closer to system troubleshooting than to Java, but I thought this was the right forum to raise the topic in case anybody has seen such a phenomenon with the JVM.
Anyway, I had already checked top, and there was no process other than Java that was hungry for memory. The second-largest process was using 72MB (RSS).
No, swappiness is not set aggressively on this system; it is at the default of 60. One additional piece of information I forgot to share is that we have 4 app servers in a cluster and all of them showed this behaviour at exactly the same time. AFAIK, the JVM does not swap itself out, but the OS would. All of this is what is confusing me.
All these app servers are in production and busy serving requests, so they are not idle. The used heap averaged 5GB of the 6GB.
The other interesting thing I found was some failure messages in the VMware logs at the same time, which is what I'm investigating now.
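For anyone chasing a similar symptom on Linux, the resident and swapped-out sizes of a process can be read from /proc/<pid>/status (the VmRSS and VmSwap fields), and the current swappiness from /proc/sys/vm/swappiness. The sketch below just parses that text; the sample values are invented for illustration, not taken from the servers in this thread.

```python
def rss_and_swap_kib(status_text):
    """Extract VmRSS and VmSwap (in KiB) from /proc/<pid>/status-style text."""
    wanted = {"VmRSS": None, "VmSwap": None}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in wanted:
            wanted[name] = int(rest.split()[0])
    return wanted["VmRSS"], wanted["VmSwap"]


def read_swappiness(path="/proc/sys/vm/swappiness"):
    """Return the kernel's current vm.swappiness setting as an int."""
    with open(path) as f:
        return int(f.read().strip())


# Illustrative excerpt: half of a 6 GiB heap resident, half swapped out.
sample = "Name:\tjava\nVmRSS:\t 3145728 kB\nVmSwap:\t 3145728 kB\n"
rss, swap = rss_and_swap_kib(sample)
print(rss, swap)
```

Sampling these two numbers over time for the JVM's pid shows whether resident memory is really moving into swap or is being released some other way.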

Carrying and Working on an Entire Development Box from a USB Stick. Feasible?

Lately I have been thinking about investing in a decent USB pen drive (something along the lines of this), installing operating systems on virtual machines, and developing on them.
What I have in mind is to carry my development boxes with me, a Windows installation for .NET development and a Linux distribution for things like RoR, Perl and whatnot, so that I can use them wherever I need to: at work, at school, on different computers at home, etc.
I am also thinking of doing this for backup purposes, i.e. backing up my almost-single VM file to an external HD instead of doing routine backups of my normal Windows box. I am even thinking about committing the VM boxes to source control (is that even feasible?).
So, am I on the right track with this? Do you suggest that I try it out?
How feasible is it to have your development box on a virtual machine that runs from a USB pen drive?

I absolutely agree with where you are heading. I wish to do this myself.
But in case you don't already know, it's not just about drive size. Believe it or not, USB flash drives can be much slower than spinning disk drives!
This can be a big problem if you plan to actually run the VMs directly from the USB drive!
I've tried running a 4GB Windows XP VM on a 32GB Corsair Survivor and the VM was virtually unusable! Copying my 4GB VM off and back onto the drive was also quite slow: about 10 minutes to copy it onto the drive.
If you have an eSATA port, I'd highly recommend looking at high-speed eSATA options like this Kanguru 32GB eSATA/USB flash drive OR this 32GB one by OCZ.
The read and write speeds of these drives are much higher over eSATA than those of other USB drives, and you can still use them over USB if you don't have an eSATA port. If you don't have one, you can also buy PCI eSATA cards online, and even eSATA ExpressCards for your laptop.
EDIT: A side note: you'll find that USB flash drives use FAT instead of NTFS. You don't want to use NTFS because it performs many more reads and writes on the disk, and your drive survives only a limited number of writes before it dies. But with FAT32 you'll be limited to a maximum file size of just under 4GB (2GB with FAT16), which might be a problem for your VM. If this is the case, you can split your VM disks into 2GB chunks. Also make sure you back up your VM daily in case your drive does reach its maximum number of writes. :)

This article on USB thumbdrives states:
"Never run disk-intensive applications directly against files stored on the thumb drive."
USB thumbdrives use flash memory, which survives only a limited number of writes before it goes bad and corruption occurs. The author of the previously linked article found the limit to be in the range of 10,000 - 100,000 writes, so a disk-intensive application could reach it.
So if you do this, have an aggressive backup policy for your work. Ideally, your development suite would also use the local hard drive as its temporary workspace while it runs.

Hopefully you are talking about interpreted-language projects. I couldn't imagine compiling a C/C++ project of any size on a VM, let alone a VM running off a USB drive.

I do it quite frequently with Xen, but also include a bare metal bootable kernel on the drive. This is particularly useful when working on something from which a live CD will be based.
The bad side is the bloat on the VM image to keep it bootable across many machines .. so where you would normally build a very lean and mean paravirtualized kernel only .. you have to also include one that has everything including the kitchen sink (up to what you want, i.e. do you need Audio, or token ring, etc?)
I usually carry two sticks, one has Xen + a patched Linux 2.6.26, the other has my various guest images which are ready to boot either way. A debootstrapped copy of Debian or Ubuntu makes a great starting point to create the former.
If nothing else, it's fun to tinker with. Sorry to be a bit GNU/Linux centric, but that's what I use exclusively :) I started messing around with this when I had to find an odd path to upgrading my distro, which was two years behind the current one. So, I strapped a guest, installed what I wanted and pointed GRUB at the new LV for my root file system. Inside, I just mounted my old /home LV and away I went.

Check out MojoPac:
http://www.mojopac.com/
Hard-core gamers use it to take World of Warcraft with them on the go, so it should work fine for your development needs, at least on Windows. Use Cygwin with it for your Unix development needs.

I used to do this, and found that compiling was so deathly slow, it wasn't worth it.
Keep in mind that USB flash drives are extremely slow (maybe 10 to 100 times slower) compared to hard drives at random write performance (writing lots of small files to a partition which already has lots of files).
A typical build process using GNU tools will create lots of small files - a simple configure script creates thousands of small files and deletes them again just to test the environment before you even start compiling. You could be waiting a long time.
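If you want to measure this on your own drive before committing to it, a rough micro-benchmark is to create, sync, and delete a batch of small files and time the whole thing. This is only a sketch: the file count, the 512-byte payload, and the function name are arbitrary choices, and results will vary with the filesystem and caching.

```python
import os
import tempfile
import time


def time_small_files(directory, count=1000, size=512):
    """Create, fsync, and delete `count` small files; return elapsed seconds."""
    payload = b"x" * size
    start = time.perf_counter()
    with tempfile.TemporaryDirectory(dir=directory) as work:
        for i in range(count):
            path = os.path.join(work, f"f{i:06d}.tmp")
            with open(path, "wb") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())  # force the write through to the device
        # Leaving the with-block deletes every file, which is also timed.
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = time_small_files(tempfile.gettempdir(), count=200)
    print(f"200 small files: {elapsed:.3f}s")
```

Run it once against a directory on the hard drive and once against a directory on the USB stick to compare the two.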
