Is there any function to retrieve the path associated with an inode? - objective-c

I am writing a utility that walks a directory tree on Mac OS X (10.6 and higher) and tries to detect changes that have occurred since the directory was last synchronized with a back-up location.
When I initially synchronize the files and folders I obtain the inode number and store it in the database record for that file or folder:
NSString *oldFilePath = /* ... */;
NSError *error = nil;
NSDictionary *attributes = [[NSFileManager defaultManager] attributesOfItemAtPath:oldFilePath error:&error];
/* set database record for oldFilePath to [attributes fileSystemFileNumber] */
When I encounter a new file or folder I first do a database lookup using the inode number to find the original file, if any.
But in the case where a file has moved from a parent directory to a sub-directory, and I am trying to detect changes to the parent directory I would like to be able to use the saved inode number to identify the new path so that I can distinguish between a move and a delete.

On Mac the GetFileInfo command performs a reverse lookup of inode numbers.
GetFileInfo /.vol/234881029/344711
Should produce:
file: "/path/to/file"
...
Martin R's answer only works on directories.

Inode numbers are only unique within a filesystem, so you need at least the device number and the inode number to identify a file.
On the HFS+ file system, the inode number is in fact identical to the "Macintosh File Id", and there is a special "/.vol" filesystem that allows you to find a directory by device and inode.
Example:
$ cd /.vol/234881029/342711
$ pwd
/Volumes/Data/tmpwork/test20/test20.xcodeproj
$ stat .
234881029 342711 drwxr-xr-x 5 martin staff 0 170 ......
As you can see, 234881029 is the device number of "/Volumes/Data", 342711 is the inode number of "tmpwork/test20/test20.xcodeproj" within that filesystem, and
cd /.vol/<deviceNo>/<inodeNo>
transferred you directly to that folder. You could now use getcwd() to determine the real path to that folder.
The "/.vol" filesystem is documented in the legacy Technical Q&A QA1113.
Disclaimer: I tried this only on OS X 10.7 and I am fairly sure that it works on older systems. I have no idea if you can rely on this feature in future versions of OS X. Also it is very HFS specific.
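As a minimal sketch of the idea above, the following Python helper builds the "/.vol/<deviceNo>/<inodeNo>" form of a path from its stat data. The function name vol_path is hypothetical and not part of any Apple API. The helper only constructs the string; whether that path can actually be opened or resolved depends on the OS X version and on the volume being HFS+, as noted in the disclaimer.

```python
import os


def vol_path(path):
    """Build the /.vol/<device>/<inode> form of `path` from its stat data.

    Hypothetical helper: it only formats the string and does not check
    that the /.vol filesystem is available on this machine.
    """
    # lstat describes the item itself rather than the target of a symlink.
    st = os.lstat(path)
    return "/.vol/{}/{}".format(st.st_dev, st.st_ino)
```

Storing both numbers (device and inode) in the database record, rather than the inode alone, is what makes this kind of reverse lookup possible later.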

On Unix-like systems, many filenames may reference the same inode, so you'd have to search the filesystem. I don't know if MacOS provides a shortcut.

Note that, as explained above, the /.vol/ 'magic' directory needs the device ID for the volume, and the inode of the directory or file. You can get the device ID of the volume as the first number returned by stat as explained in a different answer here.
# stat returns device ID as '234881026' and confirms inode is '32659974'
~$ stat /Volumes/Foo
234881026 32659974 lrwxr-xr-x 1 root admin 0 1 ... /Volumes/Foo
# access file using /.vol/<device ID>/<inode>
~$ cd /.vol/234881026/1017800
:../Prague 2011 March$
~$ GetFileInfo /.vol/234881026/1017800/IMG_3731.JPG
file: "/Users/roger/Pictures/Prague 2011 March/IMG_3731.JPG"

Related

GetAttr Returning Value of 8211 VBA

What does the value of 8211 returned from the GetAttr function in VBA mean?
Code used:
Private Sub this()
    Dim path As String
    path = "c:\"
    Dim firstdir As String
    ' List normal, hidden, and directory entries directly under path.
    firstdir = Dir(path, vbNormal + vbHidden + vbDirectory)
    Do Until firstdir = ""
        Debug.Print firstdir & " " & GetAttr(path & firstdir)
        firstdir = Dir()
    Loop
End Sub
Output in question:
MSOCache 8211
The GetAttr() function returns an Integer representing the attributes of a file, directory, or folder.
The result is a sum of the following constants, defined by VBA in Enum VbFileAttribute:
vbNormal 0 Normal.
vbReadOnly 1 Read-only.
vbHidden 2 Hidden.
vbSystem 4 System file. Not available on the Macintosh.
vbDirectory 16 Directory or folder.
vbArchive 32 File has changed since last backup. Not available on the Macintosh.
vbAlias 64 Specified file name is an alias. Available only on the Macintosh.
Also there are File Attribute Constants which are used by Windows File API:
FILE_ATTRIBUTE_READONLY 1 (0x1) A file that is read-only. Applications can read the file, but cannot write to it or delete it. This attribute is not honored on directories.
FILE_ATTRIBUTE_HIDDEN 2 (0x2) The file or directory is hidden. It is not included in an ordinary directory listing.
FILE_ATTRIBUTE_SYSTEM 4 (0x4) A file or directory that the operating system uses a part of, or uses exclusively.
FILE_ATTRIBUTE_DIRECTORY 16 (0x10) The handle that identifies a directory.
FILE_ATTRIBUTE_ARCHIVE 32 (0x20) A file or directory that is an archive file or directory. Applications typically use this attribute to mark files for backup or removal.
FILE_ATTRIBUTE_DEVICE 64 (0x40) This value is reserved for system use.
FILE_ATTRIBUTE_NORMAL 128 (0x80) A file that does not have other attributes set. This attribute is valid only when used alone.
FILE_ATTRIBUTE_TEMPORARY 256 (0x100) A file that is being used for temporary storage. File systems avoid writing data back to mass storage if sufficient cache memory is available, because typically, an application deletes a temporary file after the handle is closed. In that scenario, the system can entirely avoid writing the data. Otherwise, the data is written after the handle is closed.
FILE_ATTRIBUTE_SPARSE_FILE 512 (0x200) A file that is a sparse file.
FILE_ATTRIBUTE_REPARSE_POINT 1024 (0x400) A file or directory that has an associated reparse point, or a file that is a symbolic link.
FILE_ATTRIBUTE_COMPRESSED 2048 (0x800) A file or directory that is compressed. For a file, all of the data in the file is compressed. For a directory, compression is the default for newly created files and subdirectories.
FILE_ATTRIBUTE_OFFLINE 4096 (0x1000) The data of a file is not available immediately. This attribute indicates that the file data is physically moved to offline storage. This attribute is used by Remote Storage, which is the hierarchical storage management software. Applications should not arbitrarily change this attribute.
FILE_ATTRIBUTE_NOT_CONTENT_INDEXED 8192 (0x2000) The file or directory is not to be indexed by the content indexing service.
FILE_ATTRIBUTE_ENCRYPTED 16384 (0x4000) A file or directory that is encrypted. For a file, all data streams in the file are encrypted. For a directory, encryption is the default for newly created files and subdirectories.
FILE_ATTRIBUTE_INTEGRITY_STREAM 32768 (0x8000) The directory or user data stream is configured with integrity (only supported on ReFS volumes). It is not included in an ordinary directory listing. The integrity setting persists with the file if it's renamed. If a file is copied the destination file will have integrity set if either the source file or destination directory have integrity set.
FILE_ATTRIBUTE_VIRTUAL 65536 (0x10000) This value is reserved for system use.
FILE_ATTRIBUTE_NO_SCRUB_DATA 131072 (0x20000) The user data stream not to be read by the background data integrity scanner (AKA scrubber). When set on a directory it only provides inheritance. This flag is only supported on Storage Spaces and ReFS volumes. It is not included in an ordinary directory listing.
FILE_ATTRIBUTE_RECALL_ON_OPEN 262144 (0x40000) This attribute only appears in directory enumeration classes (FILE_DIRECTORY_INFORMATION, FILE_BOTH_DIR_INFORMATION, etc.). When this attribute is set, it means that the file or directory has no physical representation on the local system; the item is virtual. Opening the item will be more expensive than normal, e.g. it will cause at least some of it to be fetched from a remote store.
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS 4194304 (0x400000) When this attribute is set, it means that the file or directory is not fully present locally. For a file that means that not all of its data is on local storage (e.g. it may be sparse with some data still in remote storage). For a directory it means that some of the directory contents are being virtualized from another location. Reading the file / enumerating the directory will be more expensive than normal, e.g. it will cause at least some of the file/directory content to be fetched from a remote store. Only kernel-mode callers can set this bit.
In your case, 8211 = 8192 + 16 + 2 + 1, which means a read-only, hidden folder that is not to be content-indexed.
You can refer to this page and check which attributes (flags) are set. For C:\MSOCACHE:
8211 = &H2013 = 0010 0000 0001 0011
= 1 + 2 + 16 + 8192
= READONLY + HIDDEN + DIRECTORY + NOT_CONTENT_INDEXED
So the file C:\MSOCACHE is a readonly hidden directory which is not content-indexed.
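The decomposition above can be sketched in a few lines of Python. The table below copies only a subset of the Windows file attribute constants listed earlier, and the name decode_attributes is hypothetical; it is an illustration of the bit test, not a VBA or Windows API.

```python
# Subset of the Windows file attribute constants listed above.
FILE_ATTRIBUTES = {
    0x1: "READONLY",
    0x2: "HIDDEN",
    0x4: "SYSTEM",
    0x10: "DIRECTORY",
    0x20: "ARCHIVE",
    0x2000: "NOT_CONTENT_INDEXED",
}


def decode_attributes(value):
    """Return the names of the known flags set in `value`, lowest bit first."""
    return [name for bit, name in sorted(FILE_ATTRIBUTES.items()) if value & bit]


print(decode_attributes(8211))
# ['READONLY', 'HIDDEN', 'DIRECTORY', 'NOT_CONTENT_INDEXED']
```

The same test works in VBA with the And operator, for example (GetAttr(p) And vbDirectory) <> 0.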

d3 pick attempt to write into update protected file

I tried to compile a simple program I wrote, but I am getting the following error:
:compile chris_programs fileprinter
fileprinter
.
[235] attempt to write into update protected file!
The chris_programs file is a Q pointer to the directory /u/chris_programs.
# pwd
/u/chris_programs
# ls -al
total 16
drwxrwxrwx 2 root system 256 Jun 16 06:58 .
drwxrwxrwx 15 root system 4096 Jun 13 17:40 ..
-rw-rw-rw- 1 root system 72 Jun 16 07:03 fileprinter
Here is the md entry for the chris_programs file:
DICT md 'chris_programs' size = 45
01 Q
02
03 /u/chris_programs
Glad to see you're getting comfortable with those super q-pointers. The issue here is that the object module goes into the Dict of the file hosting the BASIC source. But when you're using a host OS path without specifying a dictionary, it doesn't know where to put the object code. For this I would recommend the following:
create-file dict chris_programs 3
(Copy your md q-pointer to a different name first or you won't be able to use the same name.)
There will be a default q-pointer put into that dict file, which points any references to the data file back upon the dict (so dict and data are the same space). You can then copy the q-pointer you already have (renamed per above) into the dict to replace that item:
copy md renamed_pointer (o
to: (dict chris_programs
So now your source will be in the host file system and the object will be in D3.
There is a way to have both dict and data in the host OS but I don't recall the syntax at this time. I'll try to update this later with that if I get the info.
I recommend against a follow-up of "but I really want everything in the host OS!" The object code serves no purpose outside of the DBMS so you might as well keep it there. As to the source, well, I put some source at the OS level too for source control (integration with Subversion), to use with other editors, and to share with other MV DBMS's. Unless you're doing something like this, I'd advise you to keep all source and object in the DBMS. If you want a better editor, AccuTerm wED (Windows Editor) is a GUI with syntax highlighting and many other features. We can discuss that separately if that's your goal.
EDIT : The following is intended to provide a solution to the desired problem, outside the limitations of the faulty steps already taken.
Let's go back to fundamentals: Source code is in the data file, object goes in the dictionary. Here's how you link OS-level source to DBMS-level object.
create-file dict bp1 3
There will be a default q-pointer put into that dict file, which points any references to the data file back upon the dict (so dict and data are the same space). You can replace that reflexive pointer with a new one to the host OS. Use ED or whatever editing tool you prefer but the idea is:
ed dict bp1 bp1
The pointer item in the dict has the same name as the dict. Replace that item with the following:
01 q
02
03 /path/foldername
The line numbers are only for reference, don't type those in. Substitute the path as required. Your D3 user (as specified in the pick0 OS file) must have r/w access to that path.
So now you should be able to do something like this:
ED BP1 TEST1
01 CRT "SUCCESS"
COMPILE BP1 TEST1
RUN BP1 TEST1
You'll find TEST1 in /path/foldername. If you LIST DICT BP1, you'll see the BP1 pointer to the data file as well as an item for the object module for TEST1.
Rather than retrofitting what you have, please just follow this and you should be successful within a few minutes.
See note above about "but I really want everything in the host OS!"
Another approach to source control (not the same but close): Keep everything in the DBMS. Periodically t-dump your source to an OS-level backup file, or copy to a folder. Then source-control that OS data. This eliminates the direct connection between the OS and the programs, which most D3 people don't understand anyway.

how to differentiate between folder and file with NTFS

I know that if a 1 is present at the 4th position of the binary representation of the attribute, then it is a directory. But if a 1 is not present at that position, should I consider it a file?
Or is there another attribute that determines whether it is a folder or a file?
Please help me.
Thanks.
Every file has a File Record in Master File Table (MFT) of the volume.
You can check the 2-byte flag field stored at offsets 0x16 and 0x17 of the file record (note that it is little-endian). The second bit (counting from the right) tells whether it's a folder (1) or a file (0).
if (flag & 0x02) {
    /* it's a folder */
} else {
    /* it's a file */
}
If you force this bit to 1 for a record that originally represented a file, for example with the help of WinHex, and then double-click it (a restart or a system cache refresh is probably needed first), the OS will report that the file is corrupted.
In addition, the first bit tells whether the record is in use or has been deleted.
if (flag & 0x01) {
    /* it's a normal file or folder, not deleted */
} else {
    /* it's a deleted file or folder */
}
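A rough Python sketch of the two bit tests, assuming you already have the raw bytes of one MFT file record in memory (reading the MFT from disk is outside its scope). The names classify_record, FLAG_IN_USE and FLAG_DIRECTORY are made up for this example, and the record used here is synthetic rather than read from a real volume.

```python
import struct

FLAG_IN_USE = 0x01      # first bit: record is in use (not deleted)
FLAG_DIRECTORY = 0x02   # second bit: record describes a folder


def classify_record(record):
    """Read the little-endian 2-byte flag field at offset 0x16 of a file record."""
    (flags,) = struct.unpack_from("<H", record, 0x16)
    return {
        "in_use": bool(flags & FLAG_IN_USE),
        "is_directory": bool(flags & FLAG_DIRECTORY),
    }


# Synthetic 1024-byte record with both bits set, for illustration only.
record = bytearray(1024)
struct.pack_into("<H", record, 0x16, 0x03)
print(classify_record(record))
# {'in_use': True, 'is_directory': True}
```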

What does f+++++++++ mean in rsync logs?

I'm using rsync to make a backup of my server files, and I have two questions:
1. In the middle of the process I need to stop and start rsync again. Will rsync start from the point where it stopped, or will it restart from the beginning?
2. In the log files I see "f+++++++++". What does it mean?
e.g.:
2010/12/21 08:28:37 [4537] >f.st...... iddd/logs/website-production-access_log
2010/12/21 08:29:11 [4537] >f.st...... iddd/web/website/production/shared/log/production.log
2010/12/21 08:29:14 [4537] .d..t...... iddd/web/website/production/shared/sessions/
2010/12/21 08:29:14 [4537] >f+++++++++ iddd/web/website/production/shared/sessions/ruby_sess.017a771cc19b18cd
2010/12/21 08:29:14 [4537] >f+++++++++ iddd/web/website/production/shared/sessions/ruby_sess.01eade9d317ca79a
Let's take a look at how rsync works and better understand the cryptic result lines:
1 - A huge advantage of rsync is that it picks up smoothly after an interruption.
The next rsync invocation will not re-transfer files that it has already transferred, provided they have not changed in the meantime. It will, however, check all the files again from the beginning, because it is not aware that it was interrupted.
2 - Each character is a code that can be translated if you read the section for -i, --itemize-changes in man rsync
Decoding your example log file from the question:
>f.st......
> - the item is received
f - it is a regular file
s - the file size is different
t - the time stamp is different
.d..t......
. - the item is not being updated (though it might have attributes
that are being modified)
d - it is a directory
t - the time stamp is different
>f+++++++++
> - the item is received
f - a regular file
+++++++++ - this is a newly created item
The relevant part of the rsync man page:
-i, --itemize-changes
Requests a simple itemized list of the changes that are being made to
each file, including attribute changes. This is exactly the same as
specifying --out-format='%i %n%L'. If you repeat the option, unchanged
files will also be output, but only if the receiving rsync is at least
version 2.6.7 (you can use -vv with older versions of rsync, but that
also turns on the output of other verbose messages).
The "%i" escape has a cryptic output that is 11 letters long. The
general format is like the string YXcstpoguax, where Y is replaced by
the type of update being done, X is replaced by the file-type, and the
other letters represent attributes that may be output if they are
being modified.
The update types that replace the Y are as follows:
A < means that a file is being transferred to the remote host (sent).
A > means that a file is being transferred to the local host (received).
A c means that a local change/creation is occurring for the item (such as the creation of a directory or the changing of a symlink,
etc.).
A h means that the item is a hard link to another item (requires --hard-links).
A . means that the item is not being updated (though it might have attributes that are being modified).
A * means that the rest of the itemized-output area contains a message (e.g. "deleting").
The file-types that replace the X are: f for a file, a d for a
directory, an L for a symlink, a D for a device, and a S for a
special file (e.g. named sockets and fifos).
The other letters in the string above are the actual letters that will
be output if the associated attribute for the item is being updated or
a "." for no change. Three exceptions to this are: (1) a newly created
item replaces each letter with a "+", (2) an identical item replaces
the dots with spaces, and (3) an unknown attribute replaces each
letter with a "?" (this can happen when talking to an older rsync).
The attribute that is associated with each letter is as follows:
A c means either that a regular file has a different checksum (requires --checksum) or that a symlink, device, or special file has a
changed value. Note that if you are sending files to an rsync prior to
3.0.1, this change flag will be present only for checksum-differing regular files.
A s means the size of a regular file is different and will be updated by the file transfer.
A t means the modification time is different and is being updated to the sender’s value (requires --times). An alternate value of T
means that the modification time will be set to the transfer time,
which happens when a file/symlink/device is updated without --times
and when a symlink is changed and the receiver can’t set its time.
(Note: when using an rsync 3.0.0 client, you might see the s flag
combined with t instead of the proper T flag for this time-setting
failure.)
A p means the permissions are different and are being updated to the sender’s value (requires --perms).
An o means the owner is different and is being updated to the sender’s value (requires --owner and super-user privileges).
A g means the group is different and is being updated to the sender’s value (requires --group and the authority to set the group).
The u slot is reserved for future use.
The a means that the ACL information changed.
The x means that the extended attribute information changed.
One other output is possible: when deleting files, the "%i" will
output the string "*deleting" for each item that is being removed
(assuming that you are talking to a recent enough rsync that it logs
deletions instead of outputting them as a verbose message).
Some time back, I needed to understand the rsync output for a script that I was writing. While writing that script I googled around and came across what @mit had written above. I used that information, as well as documentation from other sources, to create my own primer on the bit flags and how to get rsync to output bit flags for all actions (it does not do this by default).
I am posting that information here in hopes that it helps others who (like me) stumble upon this page via search and need a better explanation of rsync.
With the combination of the --itemize-changes flag and the -vvv flag, rsync gives us detailed output of all file system changes that were identified in the source directory when compared to the target directory. The bit flags produced by rsync can then be decoded to determine what changed. To decode each bit's meaning, use the following table.
Explanation of each bit position and value in rsync's output:
YXcstpoguax path/to/file
|||||||||||
||||||||||╰- x: The extended attribute information changed
|||||||||╰-- a: The ACL information changed
||||||||╰--- u: The u slot is reserved for future use
|||||||╰---- g: Group is different
||||||╰----- o: Owner is different
|||||╰------ p: Permissions are different
||||╰------- t: Modification time is different
|||╰-------- s: Size is different
||╰--------- c: Different checksum (for regular files), or
|| changed value (for symlinks, devices, and special files)
|╰---------- the file type:
| f: for a file,
| d: for a directory,
| L: for a symlink,
| D: for a device,
| S: for a special file (e.g. named sockets and fifos)
╰----------- the type of update being done:
<: file is being transferred to the remote host (sent)
>: file is being transferred to the local host (received)
c: local change/creation for the item, such as:
- the creation of a directory
- the changing of a symlink,
- etc.
h: the item is a hard link to another item (requires
--hard-links).
.: the item is not being updated (though it might have
attributes that are being modified)
*: means that the rest of the itemized-output area contains
a message (e.g. "deleting")
Some example output from rsync for various scenarios:
>f+++++++++ some/dir/new-file.txt
.f....og..x some/dir/existing-file-with-changed-owner-and-group.txt
.f........x some/dir/existing-file-with-changed-unnamed-attribute.txt
>f...p....x some/dir/existing-file-with-changed-permissions.txt
>f..t..g..x some/dir/existing-file-with-changed-time-and-group.txt
>f.s......x some/dir/existing-file-with-changed-size.txt
>f.st.....x some/dir/existing-file-with-changed-size-and-time-stamp.txt
cd+++++++++ some/dir/new-directory/
.d....og... some/dir/existing-directory-with-changed-owner-and-group/
.d..t...... some/dir/existing-directory-with-different-time-stamp/
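For scripted post-processing, the table above can be turned into a small decoder. This is a sketch under the assumption that each line starts with the 11-character field described in the man page, followed by one space and the path. The function name parse_itemized and the dictionary keys are hypothetical and not part of rsync.

```python
UPDATE_TYPES = {
    "<": "sent",
    ">": "received",
    "c": "local change/creation",
    "h": "hard link",
    ".": "not updated",
    "*": "message",
}
FILE_TYPES = {"f": "file", "d": "directory", "L": "symlink",
              "D": "device", "S": "special file"}
ATTRIBUTES = {"c": "checksum/value", "s": "size", "t": "mtime",
              "T": "mtime set to transfer time", "p": "permissions",
              "o": "owner", "g": "group", "a": "ACL", "x": "xattr"}


def parse_itemized(line):
    """Decode one line of rsync --itemize-changes output into a dict."""
    # Assumes an 11-character flag field, then one space, then the path.
    flags, path = line[:11], line[12:]
    if flags.startswith("*"):
        # For example "*deleting": the rest of the field is a message.
        return {"update": "message", "message": flags[1:].strip(), "path": path}
    attrs = flags[2:]
    return {
        "update": UPDATE_TYPES.get(flags[0], "unknown"),
        "file_type": FILE_TYPES.get(flags[1], "unknown"),
        "is_new": attrs.startswith("+"),
        "changed": [ATTRIBUTES[ch] for ch in attrs if ch in ATTRIBUTES],
        "path": path,
    }


print(parse_itemized(">f.st...... iddd/logs/website-production-access_log"))
```

Slicing by position instead of splitting on whitespace keeps paths containing spaces intact, and also copes with the case where an identical item has its dots replaced by spaces.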
Capturing rsync's output (focused on the bit flags):
In my experimentation, both the --itemize-changes flag and the -vvv flag are needed to get rsync to output an entry for all file system changes. Without the triple verbose (-vvv) flag, I was not seeing directory, link and device changes listed. It is worth experimenting with your version of rsync to make sure that it is observing and noting all that you expected.
One handy use of this technique is to add the --dry-run flag to the command and collect the change list, as determined by rsync, into a variable (without making any changes) so you can do some processing on the list yourself. Something like the following would capture the output in a variable:
file_system_changes=$(rsync --archive --acls --xattrs \
--checksum --dry-run \
--itemize-changes -vvv \
"/some/source-path/" \
"/some/destination-path/" \
| grep -E '^(\.|>|<|c|h|\*).......... .')
In the example above, the (stdout) output from rsync is piped to grep (via stdin) so we can isolate only the lines that contain bit flags.
Processing the captured output:
The contents of the variable can then be logged for later use or immediately iterated over for items of interest. I use this exact tactic in the script I wrote while researching rsync. You can look at the script (https://github.com/jmmitchell/movestough) for examples of post-processing the captured output to isolate new files, duplicate files (same name, same contents), file collisions (same name, different contents), as well as changes in subdirectory structures.
1.) It will "restart the sync", but it will not transfer files that are the same size and timestamp etc. It first builds up a list of files to transfer and during this stage it will see that it has already transferred some files and will skip them. You should tell rsync to preserve the timestamps etc. (e.g. using rsync -a ...)
While rsync is transferring a file, it will call it something like .filename.XYZABC instead of filename. Then when it has finished transferring that file it will rename it. So, if you kill rsync while it is transferring a large file, you will have to use the --partial option to continue the transfer instead of starting from scratch.
2.) I don't know what that is. Can you paste some examples?
EDIT: As per http://ubuntuforums.org/showthread.php?t=1342171 those codes are defined in the rsync man page in the section for the -i, --itemize-changes option.
Fixed part of my answer based on Joao's.

How do I extend this batch command?

I came across this piece of batch code. It should find the path to every single .exe file if you enter it.
@Set Which=%~$PATH:1
@if "%Which%"=="" ( echo %1 not found in path ) else ( echo %Which% )
For instance, if you save this code in the file which.bat and then go to its directory in DOS, you can write
which notepad.exe
The result will be: C:\WINDOWS\System32\notepad.exe
But it's a bit limited in that it can't find other executables. I've done a bit of batch, but I don't see how I can edit this code so that it can crawl the hard drive and return the exact path.
When you want to find an executable (or other file) anywhere on the drive, not just in PATH, then perhaps only the following will work reliably:
dir /s /b \*%~x1 | findstr "%1"
But still, it's horribly slow. And it doesn't work with cyclic directory structures. And it probably eats children.
You may be much better off using either Windows Search (depending on OS) or writing a program from scratch which does exactly what you want (the cyclic dir thing might happen on recent Windows versions pretty easily; afaik they have that already by default).
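As one possible shape for such a from-scratch program, here is a minimal Python sketch that walks a directory tree and yields every file with a given name. The function name find_files is hypothetical. os.walk does not descend into symlinked directories unless followlinks=True is passed, which avoids the most common cycles; how Windows directory junctions are treated may depend on the Python version, so treat this as a starting point rather than a guarantee.

```python
import os


def find_files(root, filename):
    """Yield the full path of every file named `filename` below `root`."""
    # Compare names case-insensitively, as Windows does.
    target = filename.lower()
    # followlinks=False (the default) keeps os.walk out of symlinked
    # directories, so a link pointing back up the tree cannot cause a loop.
    for dirpath, dirnames, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            if name.lower() == target:
                yield os.path.join(dirpath, name)
```

Usage would look like list(find_files("C:\\", "notepad.exe")). Expect it to be slow on a full drive, for the same reason the dir /s approach is.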
Here's the same thing written in python:
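import os

def which(program, additional_dirs=None):
    """Return the first path where `program` exists, or None."""
    # Split PATH with the platform separator (";" on Windows, ":" on Unix).
    path_components = os.environ["PATH"].split(os.pathsep)
    # Extra directories are searched after the PATH entries.
    path_components.extend(additional_dirs or [])
    for item in path_components:
        location = os.path.join(item, program)
        if os.path.exists(location):
            return location
    return None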
If called with just one argument, this searches only the directories in PATH. If called with two arguments (the second being a list), those additional directories are checked as well, but not their subdirectories. Here are some snippets:
# this will search for notepad.exe in the PATH variable
print(which("notepad.exe"))
# this will search for whatever.exe in PATH. If not found there,
# it will look directly inside D:\ and C:\Program Files
print(which("whatever.exe", ["D:/", "C:/Program Files"]))