Better place for a sensitive temp file: RAM disk, HDD, or SSD?

My app saves a 1 MB file, and then another app reads it back. After that I want to securely delete it. I thought about a RAM drive because I know that even with a secure-delete application something would remain on an HDD or SSD. I can accept losing the content of that file on shutdown. However, I have read about file-corruption bugs in the bug lists of some RAM disk applications (e.g. ImDisk). Those bugs are solved, but I'm wondering whether RAM disk apps are safe from a file-integrity point of view. On the other hand, a normal disk is not 100% safe either. My temp file is extremely important to me. I also protect the file with SHA-1 or a similar hash, but let's suppose for a moment that there is no protection, just to understand which is the best solution.
Thanks
Pupillo

Which storage location is best certainly depends on the hardware involved: among other things its age, its MTTF, and any previous failures encountered.
I don't think it is possible to give a general answer.
Sounds to me like you are looking for an IPC mechanism, like shared memory.
This would also avoid using file systems and their -- imo very rare -- bugs.
If you are worried about file corruption, you should also think about what happens when one of the involved applications crashes.
So you might have multiple problems:
IPC
Persistence across crashes
Security concerns, e.g. others reading the involved sections of RAM/HD
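As a rough illustration of the shared-memory idea above, here is a minimal Python sketch. It assumes both applications can be written against Python's standard `multiprocessing.shared_memory` module; the function names `publish`, `consume`, and `wipe_and_release` are made up for this example. The payload is stored together with a SHA-256 digest so the reader can detect corruption, and the block is zeroed before it is released. Overwriting the buffer does not guarantee that no copy survives elsewhere (swap, crash dumps), so treat it as a sketch, not a security guarantee.

```python
import hashlib
from multiprocessing import shared_memory
from typing import Optional

DIGEST_LEN = 32  # length of a SHA-256 digest in bytes


def publish(payload: bytes, name: Optional[str] = None) -> shared_memory.SharedMemory:
    """Writer side: place digest + payload in a shared memory block."""
    digest = hashlib.sha256(payload).digest()
    shm = shared_memory.SharedMemory(
        name=name, create=True, size=DIGEST_LEN + len(payload)
    )
    shm.buf[:DIGEST_LEN] = digest
    shm.buf[DIGEST_LEN:DIGEST_LEN + len(payload)] = payload
    return shm


def consume(name: str, length: int) -> bytes:
    """Reader side: attach by name, copy the payload out, verify the digest."""
    shm = shared_memory.SharedMemory(name=name)
    try:
        digest = bytes(shm.buf[:DIGEST_LEN])
        payload = bytes(shm.buf[DIGEST_LEN:DIGEST_LEN + length])
    finally:
        shm.close()
    if hashlib.sha256(payload).digest() != digest:
        raise ValueError("payload failed the integrity check")
    return payload


def wipe_and_release(shm: shared_memory.SharedMemory) -> None:
    """Writer side: overwrite the block with zeros, then free it."""
    shm.buf[:] = bytes(len(shm.buf))
    shm.close()
    shm.unlink()
```

The reader needs the block name and the payload length, which the two applications would have to exchange by some other channel (a command-line argument, for instance). The length is passed explicitly because some platforms round the block size up.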

Related

Prevent Memory Corruption During Writes with Power Loss

I have a system that runs Windows from a USB stick (it's a proprietary machine). This type of machine is commonly powered off by 'pulling the plug'. There is no way around it; that is how it is operated.
We occasionally have drive corruption on the USB stick, or at least corruption in the directory that we write things into. Is there really any software solution to get around this problem other than 'write as little/infrequently as possible'?
It's a Windows machine, and the applications that write are typically written in Java/C#, if that is useful to anyone. The corruption typically shows up as a write directory, or the parent of a write directory, that can no longer be accessed. The only way to deal with it is to delete it via the command line and start over.
Is there any way to programmatically deal with such a scenario, to perhaps restore a previous state of the memory as opposed to deleting and starting anew?
I don't feel as though there is any way to prevent this type of thing from happening given our current design. If you do enough writes and keep pulling the plug, you are eventually going to get corruption; that is just a fact, especially in this design. Even if the backup batteries are charged, corruption can still occur if the software doesn't shut down gracefully within the battery's discharge time. Not to mention that, as gravitymixes said above, it's going to damage hardware eventually, which we have seen before.
A system redesign needs to be considered for this project as a whole. A networked solution comes to mind immediately: data is sent off the volatile machine over a reliable network connection and logged on a machine with a more reliable power source, with writing to the disk on the volatile machine itself as a last-ditch fallback when network comms are unreliable at a given point in time (backfill). I feel like this would increase hardware life as well. Of course, network reliability then becomes your problem.
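The network-first, disk-as-fallback idea can be sketched roughly as follows. This is only an illustration in Python (the applications in the question are Java/C#); `send` stands for whatever hypothetical function ships a record to the remote logger and is assumed to raise `OSError` (for example `ConnectionError`) when the network is down. The names `log_record` and `backfill` and the JSON-lines backlog format are assumptions of this sketch.

```python
import json
import os
from typing import Callable


def log_record(record: dict, send: Callable[[dict], None], backlog_path: str) -> bool:
    """Try the network first; on failure append the record to a local backlog.

    Returns True if the record was sent, False if it was written locally.
    """
    try:
        send(record)
        return True
    except OSError:
        with open(backlog_path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push the data to the device
        return False


def backfill(send: Callable[[dict], None], backlog_path: str) -> int:
    """Resend backlog records once the network is back; return how many were sent."""
    if not os.path.exists(backlog_path):
        return 0
    with open(backlog_path, encoding="utf-8") as f:
        lines = [line for line in f if line.strip()]
    sent = 0
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # a torn line, e.g. from a power cut mid-write; skip it
        send(record)  # if this raises, the backlog file is kept for a later retry
        sent += 1
    os.remove(backlog_path)
    return sent
```

An append-only file of one record per line keeps local writes small and means a power cut can at worst tear the last line. It does not protect against file-system-level corruption of the directory itself; it only reduces how often the stick is written.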

Virtual machine undoing

I am working with fairly sensitive information and I am also just a very paranoid person in general. I am using my work computer, but seeing as how I don't work for a company, I don't have any way of safely wiping everything completely whenever I decide to get rid of it. I mean, I can have somebody wipe it for me (I just don't know how secure it is) or just destroy the computer, but outside of that I am not sure what I can do.
So I was thinking about using a virtual machine, but I don't understand much about it. For example, I see this article about internet browsing, sandboxing, and an "undo" feature. I realize this is about internet browsing, but the idea of everything being deleted from the VM whenever I close the application is appealing. However, I've also read that you can use VMware Tools or something like that to recover data that you deleted on the VM.
Is it possible to have the VM delete the data and, at least, make it virtually impossible to recover the data? If not completely, at least make it very unlikely?
The VM's storage is an abstraction and compartmentalization of the storage on the host machine. You can delete the VM, but recovering its image, and therefore its contents, is no harder than using forensic tools to recover regular files on your device. If you're worried about security, use strong passwords and a VPN service. In terms of file destruction, you can simply encrypt your data before "destruction". This way, even if someone recovers it, it will be computationally infeasible* for them to undo the encryption just for the chance to maybe peek at your files.
* Computational infeasibility describes a computation that, although computable, would take far too many resources to actually carry out. Ideally, in cryptography one would like the cost of an infeasible computation to be greater than the reward obtained by computing it.
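To put a number on "far too many resources", here is a small back-of-the-envelope calculation in Python. The key size (128 bits) and the guessing rate (10^12 keys per second) are assumptions chosen for illustration, not figures from the answer above, and the function name `years_to_search` is made up.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def years_to_search(key_bits: int, guesses_per_second: float) -> float:
    """Expected years to find a key by brute force.

    On average the key is found after trying half of the 2**key_bits keys.
    """
    expected_guesses = 2 ** (key_bits - 1)
    return expected_guesses / guesses_per_second / SECONDS_PER_YEAR


# Assumed attacker speed: 10**12 guesses per second.
print(f"128-bit key: {years_to_search(128, 1e12):.2e} years")
print(f" 40-bit key: {years_to_search(40, 1e12):.2e} years")
```

Under these assumptions a 128-bit key takes on the order of 10^18 years on average, while a 40-bit key falls in under a second, which is why the choice of cipher and key length matters more than the deletion method once the data is encrypted.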

Accessing files which are currently being written

If a file is being written, for example a log file that is written every 10 milliseconds, and I try to access it at the same time, will I damage or disturb the writing process?
Specifically, I'm asking about video files: if I start a recording process (using Windows Media Encoder), I would like to monitor the file while it is recording to check whether it is blank (black pixels everywhere) or whether real content is being recorded.
Sorry if my question is a newbie one, but I really really need to be sure about that.
Thanks in advance
In general you can certainly read files as they are being written, without corrupting their content. However:
It is possible to face an issue if your recording medium cannot deal with the combined data rate of both reading and writing. This can be a problem especially with slow-ish USB flash drives.
It is possible to face an issue on hard drives too, if the combination of reading and writing exceeds the rate of random seeks that the hard drive can handle. This can happen more easily on older drives (e.g. IDE) when dealing with HD video.
The end result is that if you have a real-time writer process, such as a TV recorder, it may be forced to drop some of the data - in the case of video a few frames.
Modern systems have quite fast disk subsystems, reasonably good I/O schedulers and large enough RAM capacities to allow for extensive data caching, which makes it quite unlikely that a single writer/reader combination would saturate the disk subsystem, unless you are doing something unusual like recording several video streams at once.
Keep in mind however, that:
The disk subsystem can also be saturated by unrelated processes reading/writing other files from the same drive.
If you are encoding video, you might also lose frames if something draws enough CPU resources that the encoding process is no longer able to keep up with the real-time requirements. Depending on the video file, test-playing it might be just enough to do that; HD playback in particular can be quite demanding. So, watch your CPU load and experiment before relying on it to record your favourite show :-)
EDIT:
If you are among the lucky ones that have SSD drives, seeks and data rate should normally be a non-issue. That leaves the CPU - you'd be surprised how easy it is to push it to the limit.
Above all, you should experiment to find out the limits of your system for each particular application. That way you won't have any nasty surprises...
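For the monitoring side, a reader can simply poll the growing file in read-only mode and pick up whatever has been appended since the last poll. The sketch below is in Python, and the helper name `read_new_bytes` is made up for this example. It assumes the writer opened the file in a way that lets other processes read it; if the writer holds the file exclusively, the `open` call can fail.

```python
from typing import Tuple


def read_new_bytes(path: str, offset: int) -> Tuple[bytes, int]:
    """Return the bytes appended since `offset`, plus the new offset.

    The file is opened read-only, so the writer's data is never modified.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    return data, offset + len(data)
```

A monitor would call this in a loop with a short sleep between polls, carrying the returned offset forward each time. Deciding whether a video frame is black would still require decoding the data, which is outside the scope of this sketch.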

Determining failing sectors on portable flash memory

I'm trying to write a program that will detect signs of failure for portable flash memory devices (thumb drives, etc).
I have seen tools in the past that are able to detect failing sectors and other kinds of trouble on conventional mechanical hard drives, but I fear that flash memory does not have the same kind of predictable low-level access to the hardware due to the internal workings of the storage. Things like wear-leveling and other block-remapping techniques (to skip over 'dead' sectors?) lead me to believe that determining if a flash drive is failing will be difficult at best, if not impossible (short of having constant read failures and device unmounts).
Flash drives at their end-of-life should be easy to detect (constant CRC discrepancies during reads and all-out failure). But what about drives that might be failing early? Are there any tell-tale signs like slower throughput speeds that might indicate a flash drive is going to fail much sooner than normal?
Along the lines of detecting potentially bad blocks, I had considered attempting random reads/writes to a file close to or exactly the size of the entire volume, but even then is it possible that the drive might report sizes under its maximum capacity to account for 'dead' blocks?
In short, is there any way to circumvent or at least detect (algorithmically or otherwise) the use of block-remapping or other life extension techniques for flash memory?
Let me end this question by expressing my uncertainty as to whether or not this belongs on serverfault.com . This is definitely a hardware-related question, but I also desire a software solution - preferably one that I can program myself.
If this question is misplaced, I will be happy to migrate it to serverfault - but I do need a programming solution. Please let me know if you need clarification :)
Thanks!
It would be interesting to see whether badblocks can help in this case.
AFAIK, wear leveling happens at the firmware level. The hardware does not know about a bad block until the firmware detects one.
And there is no known way to find these bad sectors beforehand. BTW, I guess it is not bad sectors but bad blocks: once a sector is bad, the whole block is marked as bad ...
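The write-then-read-back idea from the question can be sketched in Python as below. It writes a deterministic pseudo-random pattern to a file on the drive and later checks each chunk against the expected pattern. The names `write_pattern` and `verify_pattern`, the 1 MiB chunk size, and the use of SHAKE-256 as a pattern generator are all choices made for this sketch. Two caveats: the operating system may serve the read-back from its cache rather than from the flash, so the drive should be unplugged and re-inserted between the two steps, and remapping done by the firmware stays invisible to this kind of test.

```python
import hashlib
import os
from typing import List

CHUNK = 1024 * 1024  # 1 MiB per chunk (arbitrary choice)


def _chunk(seed: bytes, index: int, size: int) -> bytes:
    """Deterministic pseudo-random bytes for chunk number `index`."""
    return hashlib.shake_256(seed + index.to_bytes(8, "big")).digest(size)


def write_pattern(path: str, total_bytes: int, seed: bytes = b"probe") -> None:
    """Fill a file with the test pattern and flush it to the device."""
    with open(path, "wb") as f:
        index, remaining = 0, total_bytes
        while remaining > 0:
            size = min(CHUNK, remaining)
            f.write(_chunk(seed, index, size))
            remaining -= size
            index += 1
        f.flush()
        os.fsync(f.fileno())


def verify_pattern(path: str, seed: bytes = b"probe") -> List[int]:
    """Return the indices of chunks that no longer match the pattern."""
    bad = []
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK)
            if not data:
                break
            if data != _chunk(seed, index, len(data)):
                bad.append(index)
            index += 1
    return bad
```

A non-empty result from `verify_pattern` means data came back different from what was written, which is a late symptom rather than an early warning.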

Best Dual HD Set up for Development

I've got a machine I'm going to be using for development, and it has two 7200 RPM 160 GB SATA HDs in it.
The information I've found on the net so far seems to be a bit conflicted about which things (OS, Swap files, Programs, Solution/Source code/Other data) I should be installing on how many partitions on which drives to get the most benefit from this situation.
Some people suggest having a separate partition for the OS and/or Swap, some don't bother. Some people say the programs should be on the same physical drive as the OS with the data on the other, some the other way around. Same with the Swap and the OS.
I'm going to be installing Vista 64-bit as my OS and regularly using Visual Studio 2008, VMware Workstation, SQL Server Management Studio, etc. (pretty standard dev tools).
So I'm asking you--how would you do it?
If the drives support RAID configurations in your BIOS, you should do one of the following:
RAID 1 (Mirror) - Since this is a dev machine this will give you the fault tolerance and peace of mind that your code is safe (and the environment since they are such a pain to put together). You get better performance on reads because it can read from both/either drive. You don't get any performance boost on writes though.
RAID 0 - No fault tolerance here, but this is the fastest configuration because you read and write off both drives. Great if you just want as fast as possible performance and you know your code is safe elsewhere (source control) anyway.
Don't worry about multiple partitions or OS/Data configs, because on a dev machine you sort of need it all anyway, and you shouldn't be running heavy multi-user databases or anything like that (as you would on a server).
If your BIOS doesn't support RAID configurations, however, then you might consider doing the OS/Data split over the two drives just to balance out their use (but as you mentioned, keep the programs on the system drive because it will help with caching). Up to you where to put the swap file (OS will give you dump files, but the data drive is probably less utilized).
If they're both going through the same disk controller, there's not going to be much difference performance-wise no matter which way you do it. If you're going to be running lots of VMs, I would split one drive into OS and swap / Programs and Data, then keep all the VMs on the other drive.
Having all the VMs on an independent drive would let you move that drive to another machine seamlessly if the host fails, or if you upgrade.
Mark one drive as being your warehouse, put all of your source code, data, assets, etc. on there and back it up regularly. You'll want this to be stable and easy to recover. You can even switch My Documents to live here if wanted.
The other drive should contain the OS, drivers, and all applications. This makes it easy and secure to wipe the drive and reinstall the OS every 18-24 months as you tend to have to do with Windows.
If you want to improve performance, some say put the swap on the warehouse drive. This will increase OS performance, but will decrease the life of the drive.
In reality it all depends on your goals. If you need more performance then you even out the activity level. If you need more security then you use RAID and mirror it. My mix provides for easy maintenance with a reasonable level of data security and minimal bit rot problems.
Your most active files will be the registry, page file, and running applications. If you're doing lots of data crunching then those files will be very active as well.
If 160 GB of total capacity will cover your needs (plenty of space for the OS, applications and source code; it just depends on what else you plan to put on it), then I would suggest mirroring the drives in RAID 1, unless you have a server that data is backed up to, an external hard drive, an online backup solution, or some other means of keeping a copy of the data on more than one physical drive.
If you need to use all of the drive capacity, I would suggest using the first drive for OS and Applications and second drive for data. Purely for the fact of, if you change computers at some point, the OS on the first drive doesn't do you much good and most Applications would have to be reinstalled, but you could take the entire data drive with you.
As for dividing off the OS, a big downside is not giving the partition enough space, so that eventually you need partitioning software to steal some space from the other partition on the drive. It never seems to fail: you allocate a certain amount of space for the OS partition, right after install you have several gigs of free space so you think you are fine, but as time goes by things build up on that partition and you run out of space.
With that in mind, I still typically use an OS partition, as it is useful when reloading a system: you can format that partition, blowing away the OS, but keep the rest of your data. Ways to slow the build-up include changing the location of your My Documents folder and changing environment variables such as TEMP and TMP. However, some things just refuse to put their data anywhere but the system partition. I used to use 10 GB; these days I go for 20 GB.
Dividing your swap space can be useful for keeping drive fragmentation down when letting your swap file grow and shrink as needed. Again this is an issue though of guessing how much swap you need. This will depend a lot on the amount of memory you have and how much stuff you will be running at one time.
For the posters suggesting RAID - it's probably OK at 160GB, but I'd hesitate for anything larger. Soft errors in the drives reduce the overall reliability of the RAID. See these articles for the details:
http://alumnit.ca/~apenwarr/log/?m=200809#08
http://permabit.wordpress.com/2008/08/20/are-fibre-channel-and-scsi-drives-more-reliable/
You can't believe everything you read on the internet, but the reasoning makes sense to me.
Sorry I wasn't actually able to answer your question.
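To make the capacity argument concrete, here is a rough calculation in Python of the chance of hitting at least one unrecoverable read error while reading an entire drive, as happens when a mirror is rebuilt. The error rate of one per 10^14 bits read is an assumed figure of the kind often quoted in consumer drive datasheets, not a number taken from the linked articles, and treating errors as independent is a simplification. The function name `p_read_error` is made up.

```python
import math


def p_read_error(capacity_bytes: float, ure_per_bit: float = 1e-14) -> float:
    """Probability of at least one unrecoverable read error over a full read.

    Computes 1 - (1 - p)**bits in a numerically stable way.
    """
    bits = capacity_bytes * 8
    return -math.expm1(bits * math.log1p(-ure_per_bit))


for label, size in [("160 GB", 160e9), ("2 TB", 2e12)]:
    print(f"{label}: {p_read_error(size):.1%}")
```

Under these assumptions the risk is a little over 1% for a 160 GB drive and roughly 15% for a 2 TB drive, which is the reasoning behind being comfortable with RAID at 160 GB but hesitant at larger sizes.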
I usually run a box with two drives. One for the OS, swap, typical programs and applications, and one for VMs, "big" apps (e.g., Adobe CS suite, anything that hits the disk a lot on startup, basically).
But I also run a cheap fileserver (just an old machine with a coupla hundred gigs of disk space in RAID1), that I use to store anything related to my various projects. I find this is a much nicer solution than storing everything on my main dev box, doesn't cost much, gives me somewhere to run a webserver, my personal version control, etc.
Although I admit, it really isn't doing much I couldn't do on my machine. I find it's a nice solution as it helps prevent me from spreading stuff around my workstation's filesystem at random by forcing me to keep all my work in one place where it can be easily backed up, copied elsewhere, etc. I can leave it on all night without huge power bills (it uses <50W under load) so it can back itself up to a remote site with a little script, I can connect to it from outside via SSH (so I can always SCP anything I need).
But really the most important benefit is that I store nothing of any value on my workstation box (at least nothing that isn't also on the server). That means if it breaks, or if I want to use my laptop, etc. everything is always accessible.
I would put the OS and all the applications on the first disk (1 partition). Then, put the data from the SQL server (and any other overflow data) on the second disk (1 partition). This is how I'd set up a machine without any other details about what you're building. Also make sure you have a backup so you don't lose work. It might even be worth it to mirror the two drives (if you have RAID capability) so you don't lose any progress if/when one of them fails. Also, backup to an external disk daily. The RAID won't save you when you accidentally delete the wrong thing.
In general I'd try to split up things that are going to be doing a lot of I/O (such as autosave in VS, if you have it going off fairly frequently). Think of it as a sort of I/O multithreading.
I've observed significant speedups by putting my virtual machines on a separate disk. Whenever Windows is doing something stupid in the VM (e.g., indexing yet again), it doesn't thrash my Mac's disk quite so badly.
Another issue is that many tools (Visual Studio comes to mind) break in frustrating ways when bits of them are on the non-primary disk.
Use your second disk for big random things.