How do you reduce the size of a folder's index file in NTFS?

I have a folder in NTFS that contains tens of thousands of files. I've deleted all files in that folder, save one. I ran contig.exe to defragment that folder, so it's now in a single fragment. However, the folder's index is still 8MB, which implies there's a lot of empty space in it. Why is that? If I delete that one remaining file, the size of the index automatically drops to zero; my guess is that it gets collapsed back into the MFT. Is there any way to get NTFS to truly shrink the index by rebuilding it from its contents? Any API that you're aware of? Contig.exe only defragments the physical file.

I guess this is one way in which NTFS is just like almost every other FS - none of them seem to like shrinking directories.
So you should apply a high-tech method that involves using that advanced language, "BAT" :)
collapse.bat
REM Invoke as "collapse dirname" (assumes a directory name with no spaces)
ren %1 %1.old
mkdir %1
cd %1.old
move * ..\%1
cd ..
rmdir %1.old

There is slack in the index, but not a gap. I make the distinction to imply that there is technically wasted space, but it's not like NTFS has to parse the 8MB in order to enumerate/query/whatever the index. It knows where the root of its tree is, and it just happens to have a lot of extra allocation leftover. Probably too detailed a response, given how unhelpful it is.
Fragmentation is likely a separate issue altogether.

Take a look at the accepted answer to this question: NTFS performance and large volumes of files and directories
The author provided some otherwise undocumented information about file index fragmentation, which he received from Microsoft Tech Support during an incident. The short version is, DEFRAG does not defragment the folder index, only the files in that folder. If you want to defragment the file index, you have to use SysInternals' CONTIG tool, which is now owned and distributed (free) by Microsoft. The answer gives a link to CONTIG.

Related

How to read and modify metadata of a folder in High Sierra (APFS)

I am not able to access .DS_Store files in High Sierra. Does APFS store a folder's metadata somewhere else? How can I read and modify a folder's metadata in APFS?
You should never be reading or writing .DS_Store files; they are private to the Finder. Depending on the metadata you are trying to read or write, you might use NSFileManager, NSURL, the BSD/POSIX functions covered in sections 2 and 3 of the Unix manual (the man command), AppleScripting the Finder, etc. Time for some reading!
If you don't find your answer, ask a new question: show what you've tried (or what worked pre-APFS), what you've read, what exact metadata you're trying to read or write, etc., and someone will undoubtedly help you out.

How to get the file back after using "unlink" in R?

I accidentally deleted some of my useful files. The files were deleted and I could not find them in the Recycle Bin. I want to know how I can get them back.
I am using Windows 8.1. All the files in My Documents were deleted using unlink in R. I tried using R-delete to recover them, but it can only recover files deleted via the Recycle Bin, not files removed with unlink in R.
Thank you.
Though I'm not an R expert, I assume that your files have been unlinked at the file-system level, so you can't expect to find them in your operating system's Recycle Bin. If they are very important, the only real solution is:
stop doing anything with your computer immediately;
take the time to read up and understand the recovery process from another computer;
access your hard drive (or whatever the storage is) from another mounted filesystem/operating system (boot from a USB stick, for instance);
use an undelete tool suited to your filesystem.
You don't give details about your operating system and filesystem; maybe there is a tool you can run from the mounted filesystem that makes this easier. In any case, use your computer as little as possible before doing this...

Objective-C - Finding directory size without iterating contents

I need to find the size of a directory (and its sub-directories). I can do this by iterating through the directory tree and summing up the file sizes etc. There are many examples on the internet but it's a somewhat tedious and slow process, particularly when looking at exceptionally large directory structures.
I notice that Apple's Finder application can instantly display a directory size for any given directory. This implies that the operating system is maintaining this information in real time. However, I've been unable to determine how to access this information. Does anyone know where this information is stored and if it can be retrieved by an Objective-C application?
IIRC Finder iterates too. In the old days, it used to use FSGetCatalogInfo (an old File Manager call) to do this quickly. I think there's a newer POSIX call for that these days that's the fastest, lowest-level API for this, especially if you're not interested in all the other info besides the size and really need blazing speed over easily maintainable code.
That said, if it is cached somewhere in a publicly accessible place, it is probably Spotlight. Have you checked whether the spotlight info for a folder includes its size?
PS - One important thing to remember when determining the size of a file: Mac files can have two "forks", the data fork and the resource fork (where, for example, the Finder keeps the info if you set a particular file to open with an application other than the default for its file type, and where custom icons assigned to files are stored). So make sure you add up both forks' sizes, or your measurements will be off.
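To make the walk-and-sum approach concrete, here is a small illustrative sketch (written in Python just to keep it short; the same loop maps straight onto an NSFileManager directory enumerator in Objective-C). It counts the data fork via stat and adds the resource fork through the "..namedfork/rsrc" pseudo-path that macOS exposes; the example path at the bottom is hypothetical.

import os

def directory_size(root):
    # Walk the tree and sum data-fork plus resource-fork sizes.
    total = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                total += os.lstat(path).st_size                        # data fork
            except OSError:
                continue                                               # unreadable entry; skip it
            try:
                total += os.stat(path + "/..namedfork/rsrc").st_size   # resource fork, if any
            except OSError:
                pass                                                   # no resource fork
    return total

print(directory_size("/Users/someone/Documents"))   # hypothetical path

This is still an O(number of files) traversal, which matches the point above that Finder itself iterates.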

How to remove the .efs file extension from 1000s of recovered files in one folder

I recently recovered a 1.5TB external HDD that crashed. The program I used to recover the files was Active Undelete Enterprise; it's excellent. When the files were successfully recovered, they were all saved with a .efs extension, so files looked like mydocument.docx.efs. At first I thought they were encrypted and needed to be decrypted; I spent 10 minutes on it and realized I just needed to remove the .efs from the end of the filename, and then mydocument.docx worked perfectly. The problem is that I now have over 55,000 files within hundreds of folders where I simply need to remove the .efs from the end of each filename. Does anyone know how to do this?
From a command prompt window, navigate to the top level directory where these files reside.
Type the command
DIR /S/B >>filelist.txt
This command will give you a bare format file listing of the current directory plus all nested subdirectories without any extraneous information. The list will be contained in the text file named "filelist.txt" or whatever else you choose to call it. I would then use this text file in a text editor to convert every line of text from, for example,
C:\Users\dlucas\.gimp-2.8\mathmap\file1.png.efs
to
rename c:\Users\dlucas\.gimp-2.8\mathmap\file1.png.efs file1.png
to give a simple example of a file that I just found on my system using this method.
You will need to use a text editor with a columnar editing capability since you have to modify so many lines. Old programmers' editors such as CodeWright made this really simple, while modern editors such as Eclipse or Notepad++ make this a little more difficult and may require a columnar editing plugin, depending on the version. You basically make a columnar copy of all of the text in the file, then paste the copy off to the far right, far enough that the second column of filenames and paths won't overwrite any of the existing file names and paths. You can then use the columnar editing features to select and delete the path names in the second column, since the rename command requires that the second argument be simply the base filename and extension without the path information. Finally, use the columnar editing features to prepend every line with "RENAME ". If you attempt to do this without columnar editing features, you will find it slow going!
An alternate way to do this is to use a command formed from a "regular expression" to create the rename command. If you are not familiar with "regular expressions", ask a programmer friend as this is not an easy topic to learn from scratch. If you are familiar with regular expressions, this is probably the simplest way to perform this task. I haven't used them in many years and no longer recall the exact syntax to use or I would tell you myself.
Regardless of what kind of editor you use, the goal is to turn this ASCII list of paths and filenames into a batch file (simply rename filelist.txt to filelist.bat when you are finished editing). You can then run the batch file by typing filelist.bat at a command prompt.
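If you would rather skip the editor work entirely, a short script can do the walk-and-rename in one pass. Here is a minimal sketch in Python; the top-level folder name is hypothetical, and it assumes every recovered file simply has an extra .efs appended to its real name:

import os

root = r"C:\Recovered"        # hypothetical top-level folder; point this at your recovery target

for dirpath, dirnames, filenames in os.walk(root):
    for name in filenames:
        if name.lower().endswith(".efs"):
            old_path = os.path.join(dirpath, name)
            new_path = os.path.join(dirpath, name[:-4])    # strip the trailing ".efs"
            if not os.path.exists(new_path):               # don't clobber an existing file
                os.rename(old_path, new_path)

The explicit existence check is the one thing the hand-built batch file approach above doesn't give you for free.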
I have just run into this same problem myself using the same really wonderful tool that you used. I am writing this while waiting for the undelete program to finish. That it restores files with this extra extension seems very counterintuitive, so I will look for an option to make it not do this when it finishes. If I find one, I will post a new answer here that is more specific to this tool. Otherwise, I am going to have to rename all kazillion files just as you had to.
You experienced this problem because the disk that you recovered your files to "does not support encryption", according to the Active@ UNDELETE documentation. The documentation offers no further explanation of what kind of disks support encryption, etc.
They offer a Decrypt command that restores the files' proper names as a post-processing step. Unfortunately, this requires that you "include" each and every file to be decrypted, with no support for wildcards or for recursing into subdirectories, so in my opinion that is a non-starter, given that both of us have hundreds of thousands of files to be renamed.
I did find that by selecting a normal fixed (non-removable) hard drive as the destination of the recovery effort, the resulting files do not end up "encrypted" (i.e., they are recovered with the proper file name and extension). I originally chose a large USB-based flash drive, and those files were stored in their "encrypted" state (not really encrypted, but potentially so, and thus given the .efs extension). Of course, this meant that I had to run the recovery all over again after switching to a regular hard drive (it takes about 16 hours to recover 80GB worth of files due to the presence of many sector CRC errors).

Software configuration management tool for hundreds of binary files, many are large

Note: I've tried searching, but Stack Overflow's search was nearly useless. I am not sure what kind of tool I need.
At my organization we need to keep track of the software configuration for many types of computers, including the binary installers and automation scripts. Change is infrequent, but the latest version of the configuration is several gigabytes in size.
We are trying to use Mercurial to store changes, but it is just too slow, even without many revisions at all. I ran an hg status but killed it after 10 minutes when it still hadn't finished.
We are looking for a way to store the current configuration as well as keep the old configurations around just in case. I have never done anything like this before and do not know what tools are available or even suitable for such tasks. Can someone point me in the right direction or tell me how they are solving this problem? Thanks
Since hard disk space is cheap and being able to view binary differences isn't very helpful, perhaps the best option you have is to store each configuration in a new directory that is indexed somehow. Example below:
/software/configs/2009-03-15
/software/configs/2009-09-28
/software/configs/2009-09-30
Given the size of your files and the infrequency of changes, this would allow you to pick a configuration from a given 'tag' without the overhead of revision control.
If you pack your files into a single tar file and generate a SHA-512 hash (stored somewhere separate from the archive itself), then you can be reasonably sure that no one has tampered with your files since they were archived.
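A minimal sketch of that pack-and-hash step, in Python; the paths are just placeholders taken from the example layout above:

import hashlib
import tarfile

def archive_and_hash(config_dir, archive_path):
    # Pack a configuration directory into a tar file and return its SHA-512 digest.
    with tarfile.open(archive_path, "w") as tar:
        tar.add(config_dir, arcname=".")                        # store paths relative to the config dir
    sha512 = hashlib.sha512()
    with open(archive_path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):    # hash in 1MB chunks
            sha512.update(chunk)
    return sha512.hexdigest()

print(archive_and_hash("/software/configs/2009-09-30", "/software/configs/2009-09-30.tar"))

Record the printed digest alongside your index of configurations, but keep a copy of it away from the archive so the two can be compared later.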
While I don't know the specific details of how to implement this strategy in Mercurial, I have been working with git and git-fat. They set up a general procedure that is likely to be feasible with Mercurial as well. Basically, the idea is that whenever you add a binary file to the repository, under the hood the repo stores a small stand-in (a pointer or symlink) to the file, while the actual data lives in another location as a checksummed object.
This allows large files to be tracked by the repo, without storing the actual data inside. It requires the data to be stored in some other location (perhaps in a binary management system).
It might take some configuration to do it in Mercurial, but I think it's an elegantly simple solution.
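To make the pointer-plus-object-store idea concrete, here is a deliberately simplified, hypothetical sketch; this is not git-fat's or Mercurial's actual mechanism or on-disk format, just an illustration of the pattern:

import hashlib
import os
import shutil

STORE = "/software/binstore"      # hypothetical object store living outside the repository

def stash_binary(path):
    # Move a large file into a checksum-named store and leave a tiny pointer file behind.
    sha = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            sha.update(chunk)
    digest = sha.hexdigest()
    os.makedirs(STORE, exist_ok=True)
    shutil.move(path, os.path.join(STORE, digest))         # the bulky data leaves the working tree
    with open(path, "w") as pointer:                       # the repo now only tracks this stub
        pointer.write(digest + "\n")
    return digest

Because the version-control system only ever sees the small pointer files, operations like hg status stay fast no matter how large the binaries are.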