Where to store a massive training dataset on a desktop (C drive, D drive, or external USB)?

I have a 2TB training dataset to train on my desktop PC with 4 NVIDIA RTX 2080 Ti graphics cards. The entire 2TB dataset is read from disk at each epoch, and there are 200 epochs to train (total training time is estimated at two months).
My desktop storage configuration is as follows:
C drive:
4TB Samsung 970 EVO NVMe SSD with 320 MB/sec R/W speed. The Windows 10 OS, the Anaconda environment, and all PyTorch program files reside on the C drive (there is still 2.5TB of free space available).
D drive:
4TB Western Digital HDD with approx. 30MB/sec R/W speed.
External:
4TB portable SSD memory card with USB 3.1 and 240MB/sec R/W speed.
Under this hardware environment, I am contemplating where to store my training dataset (read-only).
The C drive is the fastest. However, if I store the training dataset there, the Windows OS, Anaconda, all program scripts, and the dataset reads will compete for I/O on the same drive.
The HDD in the D drive is slow (approx. 30MB/sec), and I am afraid this slow I/O might become a bottleneck for training speed.
The external SSD might be a good choice, but I have seen many warnings on the internet that massive data transfers through a USB port become slower and slower over long periods (eventually dropping to 1 to 2MB/sec) and that the program might terminate without warning due to power and heat issues over time.
Has anyone experienced a similar situation and can suggest the right choice for this environment?
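One way to settle the bottleneck question empirically is to measure how fast each drive can actually feed data compared with how fast one epoch consumes it. A minimal sketch of such a read benchmark, assuming you place one large sample file from the dataset on each drive (the paths below are hypothetical):

import time

def sequential_read_speed(path, chunk_mb=64, max_bytes=1 << 30):
    # Read up to max_bytes from `path` in chunk_mb-sized chunks and report MB/s.
    chunk = chunk_mb * 1024 * 1024
    read = 0
    start = time.perf_counter()
    with open(path, "rb") as f:
        while read < max_bytes:
            data = f.read(chunk)
            if not data:
                break
            read += len(data)
    elapsed = time.perf_counter() - start
    return read / elapsed / 1e6  # MB/s

# Hypothetical sample files, one per drive.
for drive, sample in [("C", r"C:\data\sample.bin"),
                      ("D", r"D:\data\sample.bin"),
                      ("USB", r"E:\data\sample.bin")]:
    print(f"{drive}: {sequential_read_speed(sample):.0f} MB/s")

Use a freshly copied file for each run so the Windows file cache does not inflate the numbers; if even the slowest drive delivers an epoch's worth of data faster than the GPUs can consume it, the choice of drive hardly matters.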

Another option is to split your C drive into two partitions and keep your dataset on the second partition. That way, all OS read/writes happen on the first partition and the second partition remains untouched/undamaged. You will also be able to reinstall Windows easily and be sure that the files on the second partition are preserved.
To split the C drive, do the following:
Open Disk Management by pressing the Start button and typing diskmgmt.msc
Select the C drive, right-click on it, and choose "Shrink Volume"
Enter the amount of space you would like to shrink by
When it finishes, you will see a block of unallocated space
Right-click on that area, click "New Simple Volume", and create a new partition
Assign a letter to the partition, and it is ready for use.
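Whichever drive ends up holding the dataset, overlapping disk reads with GPU compute usually hides most of the I/O latency. A minimal PyTorch sketch of that idea (the dataset class and file paths are placeholders for whatever loading code the training script already uses):

import torch
from torch.utils.data import DataLoader, Dataset

class DiskDataset(Dataset):
    # Placeholder dataset that reads one sample file per index from disk.
    def __init__(self, sample_paths):
        self.sample_paths = sample_paths

    def __len__(self):
        return len(self.sample_paths)

    def __getitem__(self, idx):
        with open(self.sample_paths[idx], "rb") as f:
            raw = f.read()
        # Decode `raw` into a tensor here; a dummy tensor keeps the sketch runnable.
        return torch.zeros(3, 224, 224)

# num_workers background processes read upcoming batches from disk while the
# GPUs are busy with the current one; pin_memory speeds up host-to-GPU copies.
loader = DataLoader(DiskDataset(sample_paths=[]),  # fill with the real file paths
                    batch_size=64,
                    num_workers=8,
                    pin_memory=True,
                    prefetch_factor=4)

With enough workers prefetching in the background, a drive slower than the NVMe SSD can still keep the GPUs fed, as long as its sustained throughput exceeds what one training step consumes.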

Related

X-Plane 11 freezes during flight and loses cockpit settings

I use X-Plane 11 on Windows 10 Home for professional flight simulation, and every hour or so the software freezes and stops running while I am flying a freighter. Recovering takes at least 5 minutes, as does restarting the 3D world to restore the view and the HD camera in the cockpit. After recovery, regaining control of the aircraft on the ground is difficult and the remote view has the wrong camera angles. Is there an update, or a way in an older version, to restart quickly, avoid the freezes, and recover while keeping the cockpit equipment settings, without reloading the saved file every time in case the software crashes again? I use it daily and have known it for years.
One solution is to increase virtual memory and adjust the graphics preferences; see https://www.x-plane.com/kb/configuring-x-plane-to-use-less-virtual-memory/
Reducing X-Plane's Virtual Memory Use:
There are a few ways you can reduce the amount of virtual memory that X-Plane needs to operate.
The first thing you can do is remove add-ons. Plugins, custom airplanes, and custom scenery can all increase the amount of virtual memory that X-Plane uses. Try removing add-ons and see if the problem goes away. Add-ons can also increase the amount of memory that X-Plane itself uses. You may have to mix and match add-ons depending on your activities.
The second thing you can do is turn down X-Plane’s settings. Here are some settings that make a difference in virtual memory usage.
AI traffic: more planes use more memory, and more complex, higher-detail planes use more memory.
Texture resolution: turn texture compression on – it saves a lot of memory. Turn down texture res as needed. In particular, do not run with extreme res and uncompressed textures.
Trees – turn down the forest density to save memory.
4x SSAA – when in HDR mode, turn off 4x SSAA if you use a large monitor or a large window size.
If needed, turn down object density.

Is it possible to increase the RAM in Google Colab in another way?

When I run this code in Google Colab
n = 100000000
i = []
while True:
    i.append(n * 10**66)
it crashes every time. My data is huge. After it hits 12.72 GB of RAM, I don't get the crash prompt with the option to increase my RAM.
I just get this: "Your session crashed after using all available RAM. View runtime logs."
What is the solution? Is there another way?
You either need to upgrade to Colab Pro, or, if your computer itself has more RAM than the Colab VM, you can connect to your local runtime instead.
Colab Pro will give you about twice as much memory as you have now. If that's enough, and you're willing to pay $10 per month, that's probably the easiest way.
If instead you want to use a local runtime, you can hit the down arrow next to "Connect" in the top right and choose "Connect to local runtime".
The policy was changed. However, this workaround currently works for me:
Open and copy this notebook to your drive. Check whether you already have 25GB of RAM by hovering over the RAM indicator at the top right (this was the case for me). If not, follow the instructions in the Colab notebook.
Source: GitHub
To double the RAM of Google Colab, use this notebook; it gives 25GB of RAM! Note: set the runtime type to "None" to double the RAM, then change it back to GPU or TPU.
https://colab.research.google.com/drive/155S_bb3viIoL0wAwkIyr1r8XQu4ARwA9?usp=sharing
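As a quick sanity check, you can also confirm how much RAM the runtime actually has from inside the notebook instead of hovering over the indicator. A minimal sketch using psutil (pre-installed on Colab):

import psutil

total_gb = psutil.virtual_memory().total / 1e9
available_gb = psutil.virtual_memory().available / 1e9
print(f"Total RAM: {total_gb:.1f} GB, available: {available_gb:.1f} GB")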
As you said, 12GB is not enough; this needs a lot of RAM.
If you only need a small increase, you can use Colab Pro.
If you need a large increase and are using a deep learning framework, my advice is to use:
1- a university computer (academic & research computing)
2- a cloud platform like AWS, GCP, etc.
3- your own high-end computer with a GPU (I don't recommend this)

Azure VM disk attachment number is too low. Can this limit be increased?

Based on this blog post https://blogs.technet.microsoft.com/uspartner_ts2team/2015/08/26/azure-vm-drive-attachment-limits/, there is a limit on disk attachments that follows the model of number of CPUs x 2. Is there a technical reason why this limit is in place? If you use Kubernetes you may not be able to schedule a pod, because the scheduler is not aware of this limit.
This was proposed as a workaround https://github.com/khenidak/dysk, but I'm wondering why this very low limit exists in the first place.
The number of data disks is directly tied to the size of the VM. For example, if you go to https://learn.microsoft.com/en-us/azure/virtual-machines/windows/sizes you will see that as VM sizes increase in resources, they can handle more data disks.
This constraint is mainly about performance. If you had a virtual machine with only 2 CPU cores and, say, 10 data disks, you would likely run into performance issues, as the CPU power and RAM needed to drive all those data disks at once could max out your VM.
The simple solution is to use a larger VM size if you need more disks. Or, depending on how much space you need, Azure supports data disks of up to 4TB.
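If it helps, the per-size data disk limit can also be read programmatically rather than from the docs page. A minimal sketch using the azure-mgmt-compute Python SDK (the subscription ID and region are placeholders; authentication is whatever DefaultAzureCredential picks up):

from azure.identity import DefaultAzureCredential
from azure.mgmt.compute import ComputeManagementClient

# Placeholder subscription ID.
client = ComputeManagementClient(DefaultAzureCredential(), "<subscription-id>")

# List every VM size in a region with its core count and maximum data disk count.
for size in client.virtual_machine_sizes.list(location="eastus"):
    print(f"{size.name}: {size.number_of_cores} cores, "
          f"max data disks = {size.max_data_disk_count}")

This is also the number the Kubernetes scheduler would need to respect when deciding how many disk-backed volumes can still be attached to a node.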

How to test applications and websites with less computing power?

Currently I develop on an extremely powerful machine: Pentium i7, 32 GB Ram, SSD, 1 Gb 1028-bit graphics card, etc.
What I'm trying to figure out is the proper way to test my applications and web pages simulating a less powerful computer. Is there any way to simulate a slower processor, less ram, slower hard drive, and weaker graphics card? I'm not sure if I missed anything else in terms of what else to simulate...
The only thing I've figured out so far is resolution, but that was as easy as changing my monitor resolution. Though if there were a way to simulate a lower resolution without changing my actual screen resolution as well, that would be great.
Download Windows Virtual PC and test your application in it. You can customize the configuration: disk storage, RAM, and so on.
It is a good tool for developers who need to test in different environments.
I suppose you could create a virtual machine on your computer. You could then vary how much RAM and processing power it has access to. You could boot this machine off a USB drive if you wanted to simulate a slower drive.
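If a full VM feels like overkill for a quick test, a rough way to approximate a slower CPU is to pin the application's process to a single core and lower its scheduling priority. A minimal sketch using psutil, assuming a hypothetical target PID (the priority constant shown is the Windows-specific one):

import psutil

# Placeholder: PID of the application or browser process you want to slow down.
target = psutil.Process(12345)

# Restrict the process to one CPU core and lower its scheduling priority.
target.cpu_affinity([0])
target.nice(psutil.BELOW_NORMAL_PRIORITY_CLASS)  # Windows-specific constant

print(target.cpu_affinity(), target.nice())

This does not emulate less RAM or a weaker graphics card, but it is a quick way to see how the application behaves when CPU time is scarce.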

Carrying and Working on an Entire Development Box from a USB Stick. Feasible?

Lately I have been thinking about investing in a decent USB pen drive (something along the lines of this), installing operating systems on virtual machines, and developing on them.
What I have in mind is being able to carry my development boxes (a Windows installation for .NET development and a Linux distribution for things like RoR, Perl, and so on) wherever I need them: work, school, different computers at home, etc.
I am also thinking of doing this for backup purposes, i.e. backing up my almost-single VM file to an external HD instead of doing routine backups of my normal Windows box. I am even thinking about committing the VM images to source control (is that feasible?)
So, am I on the right track with this? Would you suggest I try to implement it?
How feasible is it to have your development box on a virtual machine that runs from a USB pen drive?
I absolutely agree with where you are heading. I wish to do this myself.
But in case you don't already know: it's not just about drive size. Believe it or not, USB flash drives can be much slower than your spinning disk drives!
This can be a big problem if you plan to actually run the VMs directly from the USB drive!
I've tried running a 4GB Windows XP VM on a 32GB Corsair Survivor and the VM was virtually unusable! Copying my 4GB VM off and back onto the drive was also quite slow; about 10 minutes to copy it onto the drive.
If you have an eSATA port, I'd highly recommend looking at high-speed eSATA options like this Kanguru 32GB eSATA/USB flash drive or this 32GB one by OCZ.
The read and write speeds of these drives are much higher over eSATA than other USB drives, and you can still use them as USB if you don't have an eSATA port. If you don't have an eSATA port, you can buy PCI to eSATA cards online and even eSATA ExpressCards for your laptop.
EDIT: A side note: you'll find that USB flash drives use FAT instead of NTFS. You don't want to use NTFS because it makes many more reads and writes on the disk, and your drive only has a limited number of writes before it dies. But by using FAT you'll be limited to a maximum file size of 2GB, which might be a problem with your VM. If this is the case, you can split your VM disks into 2GB chunks. Also make sure you back up your VM daily in case your drive does reach its maximum number of writes. :)
This article on USB thumb drives states, "Never run disk-intensive applications directly against files stored on the thumb drive."
USB thumb drives use flash memory, which has a maximum number of writes before it goes bad and corruption occurs. The author of the previously linked article found it to be in the range of 10,000 to 100,000 writes, but if you are running a disk-intensive application this could be an issue.
So if you do this, have an aggressive backup policy for your work. Similarly, if your development suite can write to the local hard drive as a temporary workspace when you run it, that would be ideal.
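For the backup side, even a small script run a few times a day goes a long way. A minimal sketch, assuming hypothetical paths for the workspace on the stick and a backup folder on the local hard drive:

import shutil
import time
from pathlib import Path

# Placeholder paths: workspace on the USB stick, backup root on the local disk.
SRC = Path(r"E:\dev\workspace")
DEST_ROOT = Path(r"C:\backups")

def backup():
    # Copy the whole workspace into a new timestamped folder on the local disk.
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = DEST_ROOT / f"workspace-{stamp}"
    shutil.copytree(SRC, dest)
    print(f"Backed up {SRC} -> {dest}")

if __name__ == "__main__":
    backup()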
Hopefully you are talking about interpreted-language projects. I couldn't imagine compiling a C/C++ project of any size on a VM, let alone a VM running off a USB drive.
I do it quite frequently with Xen, but also include a bare metal bootable kernel on the drive. This is particularly useful when working on something from which a live CD will be based.
The downside is the bloat in the VM image needed to keep it bootable across many machines: where you would normally build only a very lean, paravirtualized kernel, you also have to include one with everything and the kitchen sink (up to whatever you need, e.g. audio, Token Ring, etc.).
I usually carry two sticks, one has Xen + a patched Linux 2.6.26, the other has my various guest images which are ready to boot either way. A debootstrapped copy of Debian or Ubuntu makes a great starting point to create the former.
If nothing else, it's fun to tinker with. Sorry to be a bit GNU/Linux-centric, but that's what I use exclusively :) I started messing around with this when I had to find an odd path to upgrading my distro, which was two years behind the current one. So I strapped a guest, installed what I wanted, and pointed GRUB at the new LV for my root file system. Inside, I just mounted my old /home LV and away I went.
Check out MojoPac:
http://www.mojopac.com/
Hard-core gamers use it to take World of Warcraft with them on the go; it should work fine for your development needs, at least on Windows. Use Cygwin with it for your Unix development needs.
I used to do this, and found that compiling was so deathly slow, it wasn't worth it.
Keep in mind that USB flash drives are extremely slow (maybe 10 to 100 times slower) compared to hard drives at random write performance (writing lots of small files to a partition which already has lots of files).
A typical build process using GNU tools will create lots of small files - a simple configure script creates thousands of small files and deletes them again just to test the environment before you even start compiling. You could be waiting a long time.
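If you want to quantify that before committing to a drive, you can time the creation and deletion of a few thousand tiny files on the stick versus the hard drive, which roughly mimics what a configure script does. A minimal sketch with placeholder target directories:

import os
import time
import tempfile

def small_file_benchmark(target_dir, n_files=2000, size=512):
    # Create and delete n_files tiny files under target_dir; return elapsed seconds.
    payload = b"x" * size
    work_dir = tempfile.mkdtemp(dir=target_dir)
    start = time.perf_counter()
    for i in range(n_files):
        with open(os.path.join(work_dir, f"f{i}.tmp"), "wb") as f:
            f.write(payload)
    for i in range(n_files):
        os.remove(os.path.join(work_dir, f"f{i}.tmp"))
    os.rmdir(work_dir)
    return time.perf_counter() - start

# Placeholder directories: one on the local hard drive, one on the USB stick.
for label, target in [("hard drive", r"C:\temp"), ("usb stick", r"E:\temp")]:
    print(f"{label}: {small_file_benchmark(target):.1f} s for 2000 small files")

A difference of one or two orders of magnitude here is a good sign that builds on the stick will be painful.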