What is the optimal way to store data files for testing using travis-ci + Docker?

I am trying to set up testing for the repository using travis-ci.org and Docker. However, I couldn't find any documentation on what the policy on memory/storage usage is.
To perform a set of tests (test.sh) I need a set of input files to run on, which are very big (up to 1 GB, averaging around 500 MB).
One idea is to wget the files directly in the test.sh script, but downloading the input files again for every test run would be inefficient.
The other idea is to create a separate Docker image containing the test files and mount it as a volume, but pushing such a big image to the public registry would not be nice.
Is there a general recommendation for such tests?

Have you considered using Travis File Cache?
You can write your test.sh script so that it only downloads a test file if it is not already present on the local file system.
In your .travis.yml file, you specify which directories should be cached after a successful build. Travis will automatically restore that directory and files in it at the beginning of the next build. As your test.sh script will then notice the file exists already, it will simply skip the download and your build should be a little faster.
Note that the Travis cache works by creating an archive file and putting it on some cloud storage, from which it is downloaded again at the start of later builds. However, the assumption is that this traffic will likely stay inside that "cloud", potentially within the same data center as well. This should still give you some benefit in build time and lower use of resources in your own infrastructure.
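A minimal sketch of this approach; the cache: directories: key is standard .travis.yml syntax, while the directory, file, and URL names below are placeholders:

# .travis.yml -- tell Travis to cache the directory holding the inputs
cache:
  directories:
    - test-data

# test.sh -- download only when the cached copy is missing
mkdir -p test-data
if [ ! -f test-data/input.dat ]; then
    wget -O test-data/input.dat https://example.com/input.dat
fi
# ...run the actual tests against test-data/input.dat...

On the first build the download happens as usual; on later builds Travis restores test-data from the cache and the wget is skipped.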

Related

Why does PyCharm ask me to set up a sync folder every time I add a remote interpreter?

Every time I try to configure a remote interpreter, PyCharm asks me to set a sync folder. In my routine I often hit the "Cannot find declaration to go to" error, which cannot be solved by invalidating caches, so I have to configure the interpreter again. This leaves redundant folders on my remote machine. Another situation is that I want to create other projects with the same interpreter, where I have to configure the folder mapping for each project to make the interpreter valid.
I do not understand this design. In my opinion, the sync folders should correspond to my local project, and the interpreter should be independent of the projects.
Every time I try to configure a remote interpreter, PyCharm asks me to set a sync folder.
To execute a script on the remote machine, it first has to exist there. This is by design, but if you already have a project folder deployed, you can change the suggested paths to the ones you need during interpreter configuration.
See step 7: https://www.jetbrains.com/help/pycharm/configuring-remote-interpreters-via-ssh.html#ssh
And another situation is that I want to create other projects with the same interpreter, where I have to configure the folder mapping for each project to make the interpreter valid.
Unfortunately, this setup does not work; please vote for
https://youtrack.jetbrains.com/issue/PY-40680/Allow-reusing-a-single-remote-interpreter-in-multiple-project
to increase its priority.

Methods to deploy an npm project to a remote server

I'm trying to find a good cross-platform way to deploy an npm project to a remote server over ssh (or another method). I'm specifically looking for something that copies over the files while respecting the .gitignore: not copying files that are in .gitignore, preserving files in .gitignore on the remote server, and pruning stale files.
Notably, as a consequence of this requirement, it should neither copy node_modules nor clobber the remote node_modules.
The idea is to get the source code to the server this way, and then execute commands over ssh to build it on the server, copy the dist into the appropriate location on the server, and run any other deploy steps.
I already have something that works fairly well. I set up a git repo on my server that I have a remote for locally, and I push my local changes to that remote. A post-receive hook then takes effect and copies the source to where I need it, similar to what this describes.
This works pretty nicely, but it kind of falls apart when I want to deploy without fully committing everything, and it also feels somewhat fragile. I use a fairly complex local script to check out a new branch, commit all working changes, and push it, but it fails in certain cases, such as when there are untracked files.
Pardon the lengthy context. tl;dr: I'm looking for other options to do this sort of deploy. rsync seems like a natural candidate, and I've looked into the npm rsync package, but its Windows support doesn't seem great, requiring Cygwin. I've also considered copying manually with scp and using a library to parse the .gitignore, but I'd like to preserve node_modules on the server (so it doesn't have to re-download everything), so I can't just overwrite the directory.
Any ideas?
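For what it's worth, a hedged sketch of what an rsync-based deploy meeting these requirements might look like (host, user, and destination path are placeholders):

# ':- .gitignore' merges each .gitignore as per-directory exclude rules
rsync -az --delete \
    --filter=':- .gitignore' \
    --exclude node_modules \
    ./ user@example.com:/srv/myapp/

--delete prunes remote files that no longer exist locally, while excluded paths (including node_modules) are protected from deletion unless --delete-excluded is added, so the remote node_modules survives.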

Where to store a git repo to run a SWA efficiently using WSL2?

I'm trying to run my static web app using Windows Subsystem for Linux (2), but I can't figure out where on my computer I should store the git repository to be able to run it decently quickly. I have tried storing it under /mnt/c/{workfolder}, but it takes several minutes to start up (using npm run start), and I have to rerun it to see any changes. This is useless when I'm trying to work...
I have also tried storing it in /mnt/wsl/{workfolder}; in that case it starts up quickly and I can see my changes without rerunning the app. However, it seems to disappear when I restart my computer.
Where should I store the git repository to be able to run the app quickly and see changes without rerunning it? I'm assuming there's something I'm not understanding; please help me out if you know what it is.
You'll want it somewhere on the ext4 partition of the WSL distribution. Typically, the best place is going to be under your WSL /home/<username> folder.
I would recommend:
mkdir ~/src
# or
mkdir ~/projects
# or something similar
Then create subdirectories for each project in that directory.
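For example (repository URL and folder names are placeholders), clone the repo onto WSL's own ext4 filesystem and run it from there:

cd ~/src
git clone https://github.com/you/your-app.git
cd your-app
npm install
npm run start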
Why the others don't work:
/mnt/c is the Windows C: drive. That drive is mounted into WSL2 using the 9P network file system, and yes, it's (a) slow, and (b) does not support inotify, so apps cannot register for notifications of changes to files.
/mnt/wsl is a tmpfs mount. It's really there for holding things that need to be shared between all running WSL instances. The auto-generated resolv.conf that you see there is one of those things. You can also use it for copying a file from one WSL distribution to another -- Simply copy the file to /mnt/wsl, start another WSL distribution, and copy or move the file out.
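For example (distribution and file names are illustrative):

# In the first distribution (e.g. Ubuntu):
cp ~/notes.txt /mnt/wsl/
# In the second distribution (e.g. Debian):
mv /mnt/wsl/notes.txt ~/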
But yes, all tmpfs mounts are ephemeral; their contents are lost when the last WSL2 distribution/instance terminates.

Mock filesystem in OCaml

I am writing code that creates a folder/file structure in OCaml, and I want to write some tests for it. I'd like to avoid creating and deleting files each time the tests are run, since they can be run many times.
What would be the best way to mock the filesystem? I'd be open to having a filesystem in memory or just mocking the functions.
Maybe you could use a Makefile to help you.
For instance, make test might start by compiling your program, then create the files and folders required for testing, launch your program, and finally clean the test folder if need be (at that point, you might also want to check that the state of the test folder is as expected).
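A minimal sketch of such a target, assuming a build target exists and using placeholder names for the program, the test folder, and the expected output:

test: build
	mkdir -p test_dir
	./my_program test_dir          # creates the folder/file structure
	diff -r test_dir expected_dir  # check the resulting state
	rm -rf test_dir                # clean up afterwards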
On Linux:
mount -o size=50m -t tmpfs none ./ramdisk
will create a filesystem in RAM, 50 MB in size, mounted at ./ramdisk. Only root can mount it, but non-root users can use it. It will show up in df and du. You can clean it up by running umount ./ramdisk.
Creation, usage, and removal work just fine; the root requirement may be the only obstacle.
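The full cycle might look like this (the folder name is an example):

mkdir ./ramdisk
sudo mount -o size=50m -t tmpfs none ./ramdisk
# ...run the tests against ./ramdisk...
sudo umount ./ramdisk
rmdir ./ramdisk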

/tmp filling up with surefire files

When Jenkins invokes a Maven build, /tmp fills with hundreds of surefire839014140451157473tmp files; how can I explicitly redirect them to another directory during the build? For a Clover build it fills with hundreds of grover53460334580.jar files. Any idea how to overcome this?
Also, does anybody know the exact steps to create a ramdisk so I could redirect the surefire files into it? Would that save write time to the hard drive?
Thanks
Many programs respect the TMPDIR (and sometimes TMP) environment variables. Maybe Jenkins uses APIs that respect them? Try:
TMPDIR=/path/to/bigger/filesystem jenkins
when launching Jenkins. (Or however you start it -- does it run as a daemon and have a shell-script to launch it?)
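If the temp files come from the JVM itself rather than from tools that honor TMPDIR, the java.io.tmpdir system property is the equivalent knob. A hedged example for a Maven build (the path is a placeholder; forked surefire JVMs may additionally need the property passed via the plugin's argLine):

# Point the Maven JVM's temp directory at a bigger filesystem
MAVEN_OPTS="-Djava.io.tmpdir=/path/to/bigger/filesystem" mvn test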
There might be some performance benefit to using a RAM-based filesystem -- ext3, ext4, and similar journaled filesystems order writes to disk, and even a quick fd = open(path, O_CREAT); unlink(path); sequence will probably require both on-disk journal updates and directory updates. (Homework: test this.) A RAM-based filesystem won't perform the journaling, and might or might not write anything to disk (depending upon which one you pick).
There are two main choices: ramfs is a very simple window into the kernel's caching mechanism. There is no disk-based backing for your files at all, and no memory limit. You can fill all your memory with one of these very quickly and suffer dire consequences. (Almost no programs handle out-of-disk conditions well, and the OOM killer can't free up any of this memory.) See the Linux kernel file Documentation/filesystems/ramfs-rootfs-initramfs.txt.
tmpfs is a slight modification of ramfs -- you can specify an upper limit on the space it can allocate (-o size) and the page cache can swap the data to the swap partitions or swap files -- which is an excellent bonus, as your memory might be significantly better used elsewhere, such as keeping your compiler, linker, source files, and object files in core. See the Linux kernel file Documentation/filesystems/tmpfs.txt.
Adding this line to your /etc/fstab will change /tmp globally:
tmpfs /tmp tmpfs defaults 0 0
(The default is to allow up to half your RAM to be used on the filesystem. Change the defaults if you need to.)
If you want to mount a tmpfs somewhere else, you can; maybe combine that with the TMPDIR environment variable from above, or learn about the shared-subtree features in Documentation/filesystems/sharedsubtree.txt (made easy via pam_namespace) to make it visible only to your Jenkins and its child processes.
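Putting the two together, a hedged sketch (mount point and size are placeholders):

# Dedicated tmpfs for Jenkins temp files
mkdir -p /mnt/jenkins-tmp
mount -t tmpfs -o size=2g tmpfs /mnt/jenkins-tmp
TMPDIR=/mnt/jenkins-tmp jenkins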