How to use persistent heap images to make loading of theories faster in Isabelle/jEdit? - jedit

Let's assume I have a directory isabelle_afp where a lot of theories are stored. This directory is a library and I do not plan to change the files in it. I want to speed up the start-up time of Isabelle/jEdit (by default, all theories in isabelle_afp that my current theory depends on are processed anew).
How can I skip this step? The system manual tells me to build a persistent heap image. What is the easiest way to do so?
And how can I tell Isabelle/jEdit to load this heap image?

Isabelle/jEdit in Isabelle2013 already takes care of building your heap images, by a relatively basic mechanism that uses the isabelle build_dialog tool internally (which has a separate entry in the cited documentation).
You have two main ways of doing this without invoking isabelle build_dialog or the isabelle build power-tool manually:
The jEdit dialog "Utilities / Options / Plugin Options / Isabelle / General" provides a choice for "Logic", with a tiny tool tip saying that you have to restart the application after changing it. Doing that, the heap image will be produced on restart.
The command line option -l, e.g. isabelle jedit -l HOL-Word
For AFP sessions you need to tell the system separately about session directories. This can be done on the command line via isabelle jedit -d DIR1 -d DIR2 or in your $ISABELLE_HOME_USER/ROOTS file (list each directory on a separate line).
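The ROOTS variant can be scripted. The following is only a sketch: register_session_dir is a hypothetical helper name, and the commented usage assumes that an Isabelle installation is on PATH and that the AFP checkout lives in $HOME/isabelle_afp.

```shell
# Append a session directory to a ROOTS file unless it is already listed.
# $1: path to the ROOTS file, $2: session directory to register
register_session_dir() {
  roots_file=$1
  session_dir=$2
  mkdir -p "$(dirname "$roots_file")"
  touch "$roots_file"
  # -x matches whole lines, -F treats the directory name as a literal string
  if ! grep -qxF -- "$session_dir" "$roots_file"; then
    printf '%s\n' "$session_dir" >> "$roots_file"
  fi
}

# Hypothetical usage (requires Isabelle on PATH):
# register_session_dir "$(isabelle getenv -b ISABELLE_HOME_USER)/ROOTS" "$HOME/isabelle_afp"
```

Calling the helper twice with the same directory leaves a single entry, so it is safe to run from a setup script.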
A pure command-line solution would look like this:
isabelle jedit -d isabelle_afp -l Simpl
Note that in this example, isabelle_afp is a (relative or absolute) directory name, while Simpl is the logical session name.

First, you need to set up a "session" for your isabelle_afp directory. This is done by creating a file ROOT (inside isabelle_afp) which contains an entry of the following shape (see also isabelle doc system Chapter 3: Isabelle sessions and build management)
session session_name = HOL +
  theories
    Theory1
    Theory2
    Theory3
This roughly means that the heap image session_name should be based on the HOL heap image and additionally contain the theories Theory1, Theory2, ...
Now invoke isabelle jedit -d isabelle_afp -l session_name. When done for the first time, this builds the heap image of session session_name. As long as nothing in isabelle_afp changes, any further invocations will directly start Isabelle/jEdit on top of the prebuilt heap image session_name.
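The steps above could be scripted roughly as follows. This is a sketch under assumptions: write_root is a hypothetical helper, it overwrites any ROOT file already present in the directory, and the commented commands assume Isabelle is on PATH. Running isabelle build with -b beforehand is optional; it just produces the heap image without starting the editor.

```shell
# Write a ROOT file declaring one session built on top of a parent session.
# $1: session directory, $2: session name, $3: parent session, rest: theories
write_root() {
  dir=$1; session=$2; parent=$3
  shift 3
  {
    printf 'session %s = %s +\n' "$session" "$parent"
    printf '  theories\n'
    for thy in "$@"; do
      printf '    %s\n' "$thy"
    done
  } > "$dir/ROOT"
}

# Hypothetical usage, assuming Isabelle is on PATH:
# write_root isabelle_afp session_name HOL Theory1 Theory2 Theory3
# isabelle build -d isabelle_afp -b session_name   # prebuild the heap image
# isabelle jedit -d isabelle_afp -l session_name   # start on top of it
```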


Why are the files called .babelRC and .npmRC? [duplicate]

In my home folder in Linux I have several config files that have "rc" as a file name extension:
$ ls -a ~/|pcregrep 'rc$'
.bashrc
.octaverc
.perltidyrc
.screenrc
.vimrc
What does the "rc" in these names mean?
It looks like one of the following:
run commands
resource control
run control
runtime configuration
Also I've found a citation:
The ‘rc’ suffix goes back to Unix's grandparent, CTSS. It had a command-script feature called "runcom". Early Unixes used ‘rc’ for the name of the operating system's boot script, as a tribute to CTSS runcom.
Runtime Configuration, normally, if it's in the config directory. I think of them as resource files. If you see rc in a file name, it could also be a version marker, i.e. Release Candidate.
Edit: No, I take it back officially... "run commands"
[Unix: from runcom files on the CTSS system 1962-63, via the startup script /etc/rc]
Script file containing startup instructions for an application program (or an entire operating system), usually a text file containing commands of the sort that might have been invoked manually once the system was running but are to be executed automatically each time the system starts up.
Thus, it would seem that the "rc" part stands for "runcom", which I believe can be expanded to "run commands". In fact, this is exactly what the file contains, commands that bash should run.
Quoted from What does “rc” in .bashrc stand for?
I learnt something new! :)
In the context of Unix-like systems, the term rc stands for the phrase "run commands". It is used for any file that contains startup information for a command. It is believed to have originated somewhere in 1965 from a runcom facility from the MIT Compatible Time-Sharing System (CTSS).
Reference: https://en.wikipedia.org/wiki/Run_commands
In Unix world, RC stands for "Run Control".
http://www.catb.org/~esr/writings/taoup/html/ch10s03.html
To understand rc files, it helps to know that Ubuntu boots into several different runlevels. They are 0-6, 0 being "halt", 1 being "single-user", 2 being "multi-user" (the default runlevel), etc. This system has since been superseded by Upstart and systemd in most Linux distros, but it is still maintained for backwards compatibility.
Within the /etc directory are several folders named rc0.d, rc1.d, and so on through rc6.d. These are the directories that init consults to know which scripts it should run for a given runlevel. They contain symbolic links to the system service scripts residing in the /etc/init.d directory.
In the context you are using it, it would appear that you are listing any files with rc in the name. The code in these files will set the way the services/tasks startup and run when initialized.

Why does `singularity run/exec` automatically bind some specific directories? What is the use case?

I'm familiar with containers, but new to Singularity and I found myself fighting a broken Python installation in a Singularity container tonight. It turns out that this was because $HOME was being mounted into my container without my knowledge.
I guess that I've developed a liking for the idiom "Explicit is better than implicit" from Python. To me, automatically mounting specific directories is unexpected behavior.
Three questions:
Why does Singularity default to mounting $HOME, /tmp, /proc, etc?
So that I can become more comfortable with Singularity, what are some use cases for this behavior?
I see the --no-home flag, but is there a flag to disable all of the default mounts without needing to change the default Singularity configuration?
It's a mixture of design, convenience and technical necessity.
The biggest reason is that, unless you use certain params that say otherwise, Singularity images are read-only filesystems. You need somewhere to write output and any temporary files that get created along the way. Maybe you know to mount in your output dir, but there are all sorts of files that get created / modified / deleted in the background that we don't ever think about. Implicit automounts give reasonable defaults that work in most situations.
Simplistic example: you're doing a big sort and filter operation on some data, but you're printing the results to the console, so you don't bother to mount in anything but the raw data. Even after some manipulation and filtering, the size of the data exceeds available memory, so sort falls back to writing small temporary files in /tmp, which are deleted when the process finishes. And then it crashes because you can't write to /tmp.
You can require a user to manually specify what to mount to /tmp on every run, or you can use a sane default like /tmp and also allow that to be overridden by the user (SINGULARITY_TMPDIR, -B $PWD/fake_tmp:/tmp, --contain/--containall). These are all also configurable, so the admins can set sane defaults specific to the running environment.
There are also technical reasons for some of the mounts. e.g., /etc/passwd and /etc/group are needed to match permissions on the host OS. The docs on bind paths and mounts are actually pretty good and have more specifics on the whats and whys, and even the answer to your third question: --no-mount. The --contain/--containall flags will probably also be of interest. If you really want to deep dive, there are also the admin docs and the source code on github.
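For the "explicit is better than implicit" crowd, a wrapper along these lines keeps every bind visible in one place. It is only a sketch: run_contained and the SINGULARITY_BIN override are hypothetical names introduced here, the /data target is an arbitrary choice, and it relies only on the --containall and -B options mentioned above.

```shell
# Run a command in a container with only explicitly requested binds.
# SINGULARITY_BIN is a hypothetical override so the wrapper can be dry-run.
# $1: image, $2: host data directory, $3: host scratch directory, rest: command
run_contained() {
  image=$1; data_dir=$2; scratch_dir=$3
  shift 3
  "${SINGULARITY_BIN:-singularity}" exec \
    --containall \
    -B "$data_dir:/data" \
    -B "$scratch_dir:/tmp" \
    "$image" "$@"
}

# Hypothetical usage:
# run_contained multiqc.sif /some/local/data "$PWD/fake_tmp" multiqc /data
```

Setting SINGULARITY_BIN=echo prints the assembled command line instead of running it, which is handy for checking the binds before a long job.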
A simple but real singularity use case, with explanation:
singularity exec \
  --cleanenv \
  -H $PWD:/home \
  -B /some/local/data:/data \
  multiqc.sif \
  multiqc -i $SAMPLE_ID /data
--cleanenv / -e: You've already experienced the fun of unexpected mounts; there are also unexpected environment variables! --cleanenv/-e tells Singularity not to pass the host execution environment into the container. You can still use, e.g., SINGULARITYENV_SOMEVAR=23 to have SOMEVAR=23 inside the container though, as that is explicitly set.
-H $PWD:/home: This mounts the current directory into the container to /home and sets HOME=/home. While using --contain/--containall and explicit mounts is probably a better solution, I am lazy and this ensures several things:
the current directory is mounted into the container. The implicit mounting of the working directory is allowed to fail, and will do so quietly, if the base directory does not exist in the image. E.g., if you're running from /cluster/my-lab/some-project and there is no /cluster inside your image, it will not be mounted in. This is not an issue if using explicit binds directly (-B /cluster/my-lab/some-project) or if an explicit bind has a shared path (-B /cluster/data/experiment-123) with the current directory.
the command is executed from the context of the current directory. If $PWD fails to be mounted as described above, singularity uses $HOME as the working directory instead. If both $PWD and $HOME failed to mount, / is used. This can cause problems if you're using relative paths and you aren't where you expected to be. Since it is specific to the path on the host, it can be really annoying when trying to duplicate a problem locally.
the base path inside the container is always the same regardless of host OS file structure. Consistency is good.
The rest is just the command that's being run, which in this case summarizes the logs from other programs that work with genetic data.

How to synchronise jEdit settings between multiple computers

I use jEdit as a text editor, because it's cross-platform, and has all the features I need (Java regular expressions, keystroke macros, etc). However, it's a pain to set up on a new computer, and to synchronise settings (keyboard bindings, file save options, etc).
Can anyone suggest a good way of doing this? Ideally it should synchronise in the background, perhaps writing to a Dropbox folder. I've had a look in the jEdit plugins, and there doesn't appear to be anything.
Thanks!
I use the following macro to clean up and zip my jEdit settings directory to my Google Drive directory on my Mac:
void delete(String name) {
    path = jEdit.getSettingsDirectory()+"/"+name;
    VFS vfs = VFSManager.getVFSForPath(path);
    session = vfs.createVFSSession(path,view);
    vfs._delete(session, path, view);
    if (session != null) vfs._endVFSSession(session,view);
}
runInSystemShell(view, "cd " + jEdit.getSettingsDirectory());
// clean up files
delete("abbrevs"); // I use SuperAbbrevs
delete("killring.xml");
delete("recent.xml");
delete("perspective.xml");
delete("activity.log");
delete("history");
delete("printspec");
delete("registers.xml");
delete("pluginMgr-Cached.xml.gz");
delete("macros" + File.separator + ".macroManagerCache"); // File.separator = System.getProperty("file.separator")
delete("server");
delete("jedit_quicknote.txt"); // or qn.txt
delete("mirrorList.xml"); // mirrorList can be updated by Options -> Plugin Manager
// clean up directories
delete("jars-cache");
delete("settings-backup");
delete("cache");
delete("DockableWindowManager");
delete("PluginManager.download");
delete("printspec");
runInSystemShell(view, "rm -f ~/Google\\ Drive/doc/jedit.zip; zip -r ~/Google\\ Drive/doc/jedit.zip * -x '*.DS_Store'");
Then I can run the following alias to unzip the settings on other machines:
alias je_sync="rm -rf ~/.jedit/*; unzip ~/Google\ Drive/doc/jedit.zip -d ~/.jedit/"
There is no built-in feature or plugin that I'm aware of for synchronizing jEdit settings. But everything should be stored in your settings directory. ("Should" because some plugins might store data elsewhere, especially if they rely on external tools such as git or svn, which keep user credentials in ~/.subversion/ and so on. Where the settings directory lives depends on the OS you are running jEdit on, unless you start jEdit with the -settings switch.)
So to synchronize the settings, just synchronize the settings directory via some means like Google Drive, Box, Dropbox or anything else. You can even make jEdit directly use those directories with the -settings switch, e. g. if you are on an OS that does not properly support symlinks like Windows.
But be aware that there can arise serious problems or unexpected behaviour. E. g. you will also sync stuff like recent files, last window and dialog positions, last opened files, ...
And more importantly, jEdit currently does not behave too well if you run two instances on the same settings directory. This certainly also applies when you sync the settings folder between machines by some means.
Here is one scenario that will happen if you use two jEdit instances (not windows, real instances, like those opened with -noserver) on the same computer with the same settings directory, and that will certainly also happen with such a synced directory:
instance A starts running, reads the settings files and stores their last modification date
instance A writes configuration file Z and stores its last modification date
instance B starts running, reads the settings files and stores their last modification date
instance B writes configuration file Z and stores its last modification date
instance A wants to write configuration file Z, but sees that its last modification date is newer than what it remembered. It will give a warning to the log, but nothing more and will not save file Z anymore until restarted.
So if Z e. g. is the properties file, any settings changes done after this in instance A will just be lost and not saved. And this happens on a per-file basis, depending on which instance first writes a certain file after both instances were started, so some files may be locked by instance A, some by instance B which could further increase confusion.
So, if you are OK with syncing things like recent files, last opened files, and other data with paths in it, and you make sure that you never use two jEdit instances on the same settings directory at the same time, it could be fine to just use something like Google Drive or similar.
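A cautious variant of the zip approach could look like the sketch below. pack_settings is a hypothetical helper; the excluded names are a subset of the volatile files listed in the macro above, and treating an existing "server" file as a sign of a running instance is an assumption that may not hold for instances started with -noserver.

```shell
# Pack a jEdit settings directory into an archive, skipping volatile files.
# Refuses to run while a "server" file exists, taken here as a sign that a
# jEdit instance may still be using the directory (an assumption).
# $1: settings directory, $2: archive to create (outside the settings directory)
pack_settings() {
  settings_dir=$1; archive=$2
  if [ -e "$settings_dir/server" ]; then
    echo "jEdit seems to be running; close it first" >&2
    return 1
  fi
  tar -czf "$archive" \
    --exclude='recent.xml' \
    --exclude='perspective.xml' \
    --exclude='activity.log' \
    --exclude='jars-cache' \
    -C "$settings_dir" .
}

# Hypothetical usage:
# pack_settings ~/.jedit ~/Dropbox/jedit-settings.tar.gz
```

Packing only while jEdit is closed sidesteps the two-writers scenario described above, at the cost of syncing manually rather than in the background.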

Mock filesystem in ocaml

I am writing code that creates a folder/file structure in OCaml, and I want to write some tests for it. I'd like to avoid creating and deleting files each time the tests are run, since they can be run many times.
What would be the best way to mock the filesystem? I'd be open to having a filesystem in memory or just mocking the functions.
Maybe you could use a Makefile to help you.
For instance make test might start by compiling your program, then create the files and folders required for testing, launching your program, and then cleaning the test folder if need be (at that time, you might also want to check if the state of the test folder is as expected).
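The create, run, clean up cycle described above could be wrapped in a small helper that a make test recipe calls. This is a sketch: with_scratch_dir is a hypothetical name, and ./my_tests.exe in the usage line stands in for whatever test binary the build produces.

```shell
# Run a command inside a throwaway directory and remove it afterwards.
# The command's exit status is passed through to the caller.
with_scratch_dir() {
  scratch=$(mktemp -d) || return 1
  ( cd "$scratch" && "$@" )
  status=$?
  rm -rf "$scratch"
  return "$status"
}

# Hypothetical usage from a Makefile recipe or test script:
# with_scratch_dir /absolute/path/to/my_tests.exe
```

Because mktemp -d honours TMPDIR, pointing TMPDIR at a RAM-backed directory keeps the test files off the disk without changing the helper.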
On Linux:
mount -o size=50m -t tmpfs none ./ramdisk
will create a filesystem in RAM, 50 MB in size, mounted at ./ramdisk (the directory must already exist). Only root can mount it, but non-root users can use it once it is mounted. It will show up in df and du. You can remove it with umount ./ramdisk.
Creation, usage and removal work just fine, though the root requirement may be an obstacle.

/tmp filling up with surefire files

When Jenkins invokes a Maven build, /tmp fills with hundreds of files like surefire839014140451157473tmp. How can I explicitly redirect them to another directory during the build? For Clover builds it fills with hundreds of files like grover53460334580.jar. Any idea how to overcome this?
Also, does anybody know the exact steps to create a ramdisk, so that I could redirect the surefire files into it? Would it save write time compared to the hard drive?
Thanks
Many programs respect the TMPDIR (and sometimes TMP) environment variables. Maybe Jenkins uses APIs that respect them? Try:
TMPDIR=/path/to/bigger/filesystem jenkins
when launching Jenkins. (Or however you start it -- does it run as a daemon and have a shell-script to launch it?)
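If Jenkins is started from a shell script, the TMPDIR idea can be wrapped like this. It is a sketch: run_with_tmpdir is a hypothetical helper and the paths in the usage line are placeholders.

```shell
# Launch a command with TMPDIR pointing at a dedicated directory,
# creating that directory first if it does not exist yet.
# $1: directory to use for temporary files, rest: command to run
run_with_tmpdir() {
  tmp_root=$1
  shift
  mkdir -p "$tmp_root" || return 1
  TMPDIR=$tmp_root "$@"
}

# Hypothetical usage:
# run_with_tmpdir /path/to/bigger/filesystem/jenkins-tmp jenkins
```

JVM-based tools read their temporary directory from the java.io.tmpdir system property, so TMPDIR alone may not move the surefire files. Passing -Djava.io.tmpdir=... as well (for example through MAVEN_OPTS) may be needed; whether the forked test JVMs pick that up is an assumption worth verifying on your setup.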
There might be some performance benefit to using a RAM-based filesystem -- ext3, ext4, and similar journaled filesystems will order writes to disk, and even a quick open(path, O_CREAT); unlink(path); sequence will probably require both on-disk journal updates and directory updates. (Homework: test this.) A RAM-based filesystem won't perform the journaling, and might or might not write anything to disk (depending upon which one you pick).
There are two main choices: ramfs is a very simple window into the kernel's caching mechanism. There is no disk-based backing for your files at all, and no memory limits. You can fill all your memory with one of these very quickly, and suffer very horrible consequences. (Almost no programs handle out-of-disk well, and the OOM-killer can't free up any of this memory.) See the Linux kernel file Documentation/filesystems/ramfs-rootfs-initramfs.txt.
tmpfs is a slight modification of ramfs -- you can specify an upper limit on the space it can allocate (-o size) and the page cache can swap the data to the swap partitions or swap files -- which is an excellent bonus, as your memory might be significantly better used elsewhere, such as keeping your compiler, linker, source files, and object files in core. See the Linux kernel file Documentation/filesystems/tmpfs.txt.
Adding this line to your /etc/fstab will change /tmp globally:
tmpfs /tmp tmpfs defaults 0 0
(The default is to allow up to half your RAM to be used on the filesystem. Change the defaults if you need to.)
If you want to mount a tmpfs somewhere else, you can. You could combine that with the TMPDIR environment variable from above, or learn about the shared-subtree features described in Documentation/filesystems/sharedsubtree.txt (made easy via pam_namespace) to make it visible only to your Jenkins and its child processes.