Where should a dockerized web application store uploaded files?

I'm building a web application that needs to allow users to upload profile pictures. I want the application to be self-contained, so that people don't need to have an S3 or other cloud storage service account.
It's best to keep Docker containers as disposable as possible, so I guess I should create a volume. I want the volume to be created automatically, so people don't have to specify one when running the container, but the documentation for the VOLUME instruction in Dockerfiles confuses me:
The VOLUME instruction creates a mount point with the specified name and marks it as holding externally mounted volumes from native host or other containers.
What does it mean to be marked as such? The data is to be written by the application; it's not coming from an external source.

When you mark a volume in the Dockerfile, say VOLUME /site/uploads, it makes it very easy to later run another container with --volumes-from <container-name> and have /site/uploads available in the new container, with all the data that has been written and that will be written (if the first container is still running).
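As a minimal sketch of how that plays out (the image name, app and paths here are just assumptions for the example):

    # Dockerfile
    FROM python:3.11-slim
    COPY . /site
    WORKDIR /site
    # declare the upload directory as a volume; Docker creates an anonymous
    # volume for it automatically when the container starts
    VOLUME /site/uploads
    CMD ["python", "app.py"]

    # build and run the app, then reuse its volume from a second container
    docker build -t myapp .
    docker run -d --name web myapp
    docker run --rm --volumes-from web alpine ls /site/uploads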
Also, you'll be able to see that volume with docker volume ls after you start the container the first time.
The only problem you might have if you delete the container is that you will lose the mapping provided by docker inspect <container-name> that tells you which volume your container created. To see that volume really clearly and quickly, try docker inspect <container-name> | jq '.[].Mounts' if you have jq installed. Otherwise, docker inspect <container-name> | grep Mounts -A 10 might be enough when you only have one volume. (You can also just wade through all the JSON yourself.)
Even if you remove the container that created the volume, the volume will remain on your system, viewable with docker volume ls, unless you run docker volume rm <volume-name>.
Note: I'm using Docker version 1.10.3.

You will not have any problems with that; the images will be written to the mounted filesystem just fine.
You may have to grant write permissions on the uploads folder so that the application can write to it.
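For example, something along these lines in the Dockerfile could do it (the www-data user and the /site/uploads path are only assumptions for the example; match them to whatever your image actually uses):

    # create the upload directory and make it writable by the app user
    RUN mkdir -p /site/uploads && chown -R www-data:www-data /site/uploads
    VOLUME /site/uploads
    # run the application as that user rather than root
    USER www-data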

Related

Copy / update the code in docker container without stopping container

I have a docker-compose setup in which I am uploading source code for a server, say a Flask API. Now when I change my Python code, I have to follow steps like these:
stop the running containers (docker-compose stop)
build and load the updated code in the container (docker-compose up --build)
This takes quite a long time. Is there a better way, like updating the code in the running container and then restarting the Apache server without stopping the whole container?
There are a few dirty ways you can modify the file system of a running container.
First you need to find the path of the directory which is used as the runtime root for the container. Run docker container inspect id/name and look for the key UpperDir in the JSON output. You can edit/copy/delete files in that directory.
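As a rough sketch (this assumes the overlay2 storage driver and a placeholder container name; the exact layout differs between storage drivers):

    # print the writable layer ("UpperDir") of a running container
    docker container inspect --format '{{ .GraphDriver.Data.UpperDir }}' mycontainer
    # any file you edit, copy or delete under that path shows up inside the container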
Another way is to get the process ID of the process running within the container and go to the /proc/process_id/root directory. This is the root directory for the process running inside Docker. You can edit it on the fly and the changes will appear in the container.
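Something like this, for example (mycontainer is a placeholder, and you need root on the host):

    # find the PID of the container's main process on the host
    PID=$(docker inspect --format '{{ .State.Pid }}' mycontainer)
    # the container's root filesystem as seen by that process
    ls /proc/$PID/root
    # edits made here appear inside the running container immediately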
You can run the docker build while the old container is still running, and your downtime is limited to the changeover period.
It can be helpful for a couple of reasons to put a load balancer in front of your container. Depending on your application this could be a "dumb" load balancer like HAProxy, a full web server like nginx, or something your cloud provider makes available. This allows you to have multiple copies of the application running at once, possibly on different hosts (which helps with scaling and reliability). In this case the sequence becomes (see the shell sketch after the list):
docker build the new image
docker run it
Attach it to the load balancer (now traffic goes to both old and new containers)
Test that the new container works correctly
Detach the old container from the load balancer
docker stop && docker rm the old container
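A rough shell sketch of that sequence, assuming an image called myapp and containers named myapp-old and myapp-new; the load-balancer steps depend entirely on what you run in front, so they are only comments here:

    docker build -t myapp:new .
    docker run -d --name myapp-new myapp:new
    # register myapp-new with the load balancer (HAProxy/nginx/cloud LB config)
    # ...test the new container through the load balancer...
    # deregister myapp-old from the load balancer, then retire it:
    docker stop myapp-old && docker rm myapp-old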
If you don't mind heavier-weight infrastructure, this sequence is essentially what happens when you change the image tag in a Kubernetes Deployment object, but adopting Kubernetes is a substantial commitment.

Proposal to Migrate OpenNebula Datastore from Local FS to NFS

I have an instance of OpenNebula with 2 nodes running KVM and a local file store. VM images are scp'd to each node, so there is no option of failover or live migration.
I would like to implement NFS shared storage and move the VMs from the local FS datastore to the NFS shared storage datastore. OpenNebula supports migrating VMs between datastores, but only datastores of the same type, i.e. 'ssh' to 'ssh', not 'ssh' to 'shared'.
I am working on a method of achieving this, and would love some feedback as to why this is a good or a bad idea.
Thanks
OpenNebula doesn't currently support migrating VMs from one type of datastore to a different type of datastore. I have been working on a method that works and want to document it here to get some feedback and opinions on it.
A datastore type is identified primarily by the Transfer Manager driver (TM_MAD) setting. This setting cannot be changed, either through Sunstone or through the CLI, so we need a method to do just that. This is what I did. I started with a fresh install of OpenNebula 5.4.13 in one VM and 2 VM nodes, all running Debian 9 inside VMware virtual machines (don't forget to enable virtualisation in the VM CPU options).
NOTE: This is an experimental process so make sure you Backup everything first!
Steps
To migrate to a different store, there are a few steps we need to do. They are as follows:
Setup the NFS share exports,
Move the VM images to the NFS share and mount the datastore,
Change the datastore types,
Configure the nodes for NFS share.
Setup NFS Server
First thing we want to do is setup the NFS shares that we want to use. I'm using a single share for the base datastore folder, but you could use separate shares for each datastore ID from different NFS servers.
On the NFS Server create the datastore folder i.e. mkdir /share/one_datastore,
Add the datastore path to exports and export the new share exportfs -rav,
Confirm the share is available showmount -e localhost
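For reference, the server side might look something like this (the subnet and export options are assumptions; tune them to your environment):

    # /etc/exports -- allow the frontend and the KVM nodes to mount the share
    /share/one_datastore  192.168.1.0/24(rw,sync,no_subtree_check,no_root_squash)

    # apply and verify
    exportfs -rav
    showmount -e localhost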
Prepare to Migrate
Before we modify the datastores there are a few things to do first:
Shut down any running VMs and undeploy them. This saves the machines' states and copies the images back to the image store,
Stop Sunstone and OpenNebula services systemctl stop opennebula && systemctl stop opennebula-sunstone.
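A possible CLI sequence for this step (the VM IDs are just examples):

    onevm list                    # note the IDs of any running VMs
    onevm undeploy 3 4 5          # shut them down; their disks are copied back to the frontend
    # then stop the services as above
    systemctl stop opennebula && systemctl stop opennebula-sunstone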
Migrate Data
Shared storage shares the VM disk images so all the nodes can access the same data, so copy the VM data to the NFS share ready for mounting.
From the Sunstone frontend server confirm the NFS shares showmount -e [nfs-server],
Create a temp folder to mount the share in mkdir /mnt/datastore,
Temporarily mount the NFS folder mount [nfs-server]:/share/one_datastore /mnt/datastore,
Move the datastore folders to the share mv /var/lib/one/datastores/* /mnt/datastore/
OpenNebula datastore folders now live on the NFS server: ls /mnt/datastore should list folders 0, 1 and 2,
Mount the NFS share to replace the OpenNebula datastore folder mount [nfs-server]:/share/one_datastore /var/lib/one/datastores,
Confirm the folders are available ls /var/lib/one/datastores should list our 3 folders 0, 1 and 2,
Add the mount into /etc/fstab to persist the mount on boot.
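For example, an /etc/fstab entry along these lines (the mount options are an assumption; substitute your NFS server):

    # /etc/fstab
    [nfs-server]:/share/one_datastore  /var/lib/one/datastores  nfs  defaults,_netdev  0  0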
The OpenNebula frontend is now configured to access the datastore folders from the NFS share. Next we want to change the datastore types from ssh to shared.
Change Datastore Types
The datastore configuration is stored in the OpenNebula database /var/lib/one/one.db. We can change the driver type by editing the datastore configuration data, which tells OpenNebula which drivers to use and how to handle the datastore data. By default OpenNebula uses an SQLite database, with MySQL as an option. I'm using SQLite, but the same works for MySQL.
Open the OpenNebula database sqlite3 /var/lib/one/one.db,
View all tables with .tables. datastore_pool is the table we want to modify,
List all the records in the table: select * from datastore_pool; will result in a screen-full of configuration data. Each record has an identifier oid which matches the datastore ID, like this (the first 0 is the datastore ID for the default SYSTEM datastore):
0|system|<DATASTORE><ID>0</ID><UID>0</UID><GID>0</GID><UNAME>oneadmin</UNAME><GNAME>oneadmin</GNAME><NAME>system</NAME><PERMISSIONS><OWNER_U>1</OWNER_U><OWNER_M>1</OWNER_M><OWNER_A>0</OWNER_A><GROUP_U>1</GROUP_U><GROUP_M>0</GROUP_M><GROUP_A>0</GROUP_A><OTHER_U>0</OTHER_U><OTHER_M>0</OTHER_M><OTHER_A>0</OTHER_A></PERMISSIONS><DS_MAD><![CDATA[-]]></DS_MAD><TM_MAD><![CDATA[ssh]]></TM_MAD><BASE_PATH><![CDATA[/var/lib/one//datastores/0]]></BASE_PATH><TYPE>1</TYPE><DISK_TYPE>0</DISK_TYPE><STATE>0</STATE><CLUSTERS><ID>0</ID></CLUSTERS><TOTAL_MB>0</TOTAL_MB><FREE_MB>0</FREE_MB><USED_MB>0</USED_MB><IMAGES></IMAGES><TEMPLATE><ALLOW_ORPHANS><![CDATA[NO]]></ALLOW_ORPHANS><DISK_TYPE><![CDATA[FILE]]></DISK_TYPE><DS_MIGRATE><![CDATA[YES]]></DS_MIGRATE><RESTRICTED_DIRS><![CDATA[/]]></RESTRICTED_DIRS><SAFE_DIRS><![CDATA[/var/tmp]]></SAFE_DIRS><SHARED><![CDATA[NO]]></SHARED><TM_MAD><![CDATA[ssh]]></TM_MAD><TYPE><![CDATA[SYSTEM_DS]]></TYPE></TEMPLATE></DATASTORE>|0|0|1|1|0
Now to change the datastore type. Grab the data from the third column, body (you can run select body from datastore_pool where oid=0;), and copy it into your favourite text editor (that's the chunk starting with <DATASTORE> and ending with </DATASTORE>). Find and replace:
Find: <TM_MAD><![CDATA[ssh]]></TM_MAD>
Replace with: <TM_MAD><![CDATA[shared]]></TM_MAD>
Find: <SHARED><![CDATA[NO]]></SHARED>
Replace with: <SHARED><![CDATA[YES]]></SHARED>
Now update the SYSTEM datastore record. Run the following command on the database, replacing [datastore-config] with the text block you just modified: update datastore_pool set body='[datastore-config]' where oid=0;
Updating the IMAGE datastore is a little different. There is no SHARED option, but we want to use either the shared or the qcow2 driver. I used qcow2. So run select body from datastore_pool where oid=1; and edit the body as before:
Find: <TM_MAD><![CDATA[ssh]]></TM_MAD>
Replace: <TM_MAD><![CDATA[qcow2]]></TM_MAD>
Update the record: update datastore_pool set body='[datastore-config]' where oid=1;,
Update the FILES datastore (oid=2) by replacing <TM_MAD><![CDATA[ssh]]></TM_MAD> with <TM_MAD><![CDATA[shared]]></TM_MAD> and update the record using the method above.
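If you prefer to skip the copy-and-paste editing, the same substitution can be scripted with SQLite's replace() function; a condensed sketch for the SYSTEM datastore (back up one.db first, and adapt the oid and driver names for the other datastores):

    sqlite3 /var/lib/one/one.db <<'SQL'
    UPDATE datastore_pool
       SET body = replace(replace(body,
             '<TM_MAD><![CDATA[ssh]]></TM_MAD>', '<TM_MAD><![CDATA[shared]]></TM_MAD>'),
             '<SHARED><![CDATA[NO]]></SHARED>', '<SHARED><![CDATA[YES]]></SHARED>')
     WHERE oid = 0;
    SQL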
Now that the datastores have been updated to use the shared driver, let's start Sunstone and check that the datastores show up.
systemctl start opennebula && systemctl start opennebula-sunstone
Jump into the Sunstone web UI and go to Datastores. Open each datastore to check that SHARED is enabled and the correct drivers show, i.e. shared or qcow2.
DON'T DO ANYTHING YET! We still need to configure the nodes.
Configure the Nodes
Because we stopped and undeployed the VMs, there shouldn't be any data in the node datastores, so we can just set up the NFS mounts for the datastores folder. Confirm the folders are empty first and make sure to take backups! This is an experimental process, so be warned. Right, let's get on with it:
Check the contents of /var/lib/one/datastores. If you are mounting each datastore-ID folder to its own NFS share, you can do that instead of mounting the entire datastores folder; in that case empty the 0, 1 and 2 folders individually. Otherwise, remove all folders from the datastores folder,
If not already installed: apt-get install nfs-common,
Check for NFS shares: showmount -e [nfs-server],
Mount the nfs share to the datastore folder: mount [nfs-server]:/share/one_datastore /var/lib/one/datastores,
Confirm the mount i.e. df,
Edit /etc/fstab, adding the mount so it's mounted on the next boot.
Restart the node to confirm the NFS datastore mount persists across a reboot.
Repeat with all host nodes.
Test it Out
In Sunstone go to the Hosts tab and check they are up and running. Next grab a VM and deploy it. It should deploy without any issues and start booting.
Once it's up and running, I like to constantly ping the VM while testing live migration. So start a ping (ping [vm-ip] -t on Windows) and then, in Sunstone, open the VM and do a 'Live Migrate' to another node. Watch the ping and check the logs to make sure it succeeded. I found I had to refresh the display and go to the Hosts tab to check the VM had migrated; after that it showed correctly, but I think it's a caching issue in my browser. After the live migration you should still see the ping rolling along, with maybe one failed ping in the results.
Conclusion
So that's the process I used to migrate from ssh local storage to shared storage. I've tested it and it is working without any issues. However, if you have any issues or an opinion on this process, please let me know. If there are any pitfalls I have overlooked, please also let me know.
OK, have fun with it. I'm off to try moving the shared storage over to some kind of storage cluster like Ceph or GlusterFS!

Dockerized Gitlab Container Backup

I am using a GitLab docker image for integration testing of a service I'm helping to develop. Ideally, the image would be a preconfigured snapshot of GitLab with different users and repos available to run tests against. So the problem ends up being, what is a good way to automate the creation of 'snapshots' of GitLab (that can then be versioned etc.)?
My current solution to this problem is to use GitLab's built in backup utility via gitlab-rake gitlab:backup:create after getting GitLab to a state that I want. This then lets me use GitLab's gitlab-rake gitlab:backup:restore in a hook when the container is starting up to get the container back to the state that I expect (the backup having been ADDed in the Dockerfile for the image). This has the advantage of being relatively lightweight (backups are on the order of MBs) and the backups can be checked in to version control.
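A rough sketch of how that could be wired into the image; the base image tag, backup file name and the restore script are assumptions for illustration only:

    FROM gitlab/gitlab-ce:latest
    # backup previously produced with: gitlab-rake gitlab:backup:create
    ADD 1493107454_gitlab_backup.tar /var/opt/gitlab/backups/
    # hypothetical startup hook that waits for GitLab to come up, runs
    # gitlab-rake gitlab:backup:restore, then execs the normal entrypoint
    COPY restore-then-start.sh /assets/restore-then-start.sh
    ENTRYPOINT ["/assets/restore-then-start.sh"]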
I have tried using docker export along with docker import to save the state of the container and then create an image based on that state. This has the advantage of being easy to automate since it is directly supported by Docker, but ends up being fairly expensive considering what the goal is (having users and repos available to test against). It also would require the images to be pushed to a registry of some kind in order to be easily distributed. Perhaps this is the best solution because it is well supported though.
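For comparison, the export/import route looks roughly like this (names are placeholders); note that docker export captures the container's filesystem but not the contents of any volumes:

    # snapshot the configured container's filesystem
    docker export gitlab-test > gitlab-snapshot.tar
    # turn the snapshot into a new (flattened) image; metadata such as CMD is not carried over
    docker import gitlab-snapshot.tar registry.example.com/gitlab-snapshot:v1
    # distribute via a registry
    docker push registry.example.com/gitlab-snapshot:v1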
I suppose my question is, what is the Docker way of approaching a problem like this?

Docker - Backup Containers in /var/lib/docker/vfs/dir

I have noticed that all the data for the docker containers is located in the folder "/var/lib/docker/vfs/dir"
I have researched different techniques for backing up Docker containers, but I haven't seen anyone back up the containers directly without using Docker. It seems like we are all missing a trick when backing up Docker containers.
My question is, has anyone else backed up their docker containers in the folder: "/var/lib/docker/vfs/dir"?
Example: I would create a TAR file of the directory "/var/lib/docker/vfs/dir" and move the TAR file onto another server.
EDIT: I have data-only containers which keep my persistent data separate from the worker containers. When browsing "/var/lib/docker/vfs/dir", I can see all the persistent data. I am trying to think of a reason why I wouldn't simply rsync all the folders and files onto another server for a backup.
There is not much sense in backing up the containers themselves. They are just instances of Docker images; every time you run a container it is created from its image.
There is data that we want to keep after the container is stopped or removed, and you can use external volumes for that.
To get the image onto the other server you can push it to a Docker registry, or just save the Dockerfile and rebuild the image on the new server.
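If the persistent data lives in volumes (as with the data-only containers described in the question), one common pattern is to archive the volume through a throwaway container; a sketch with placeholder names and a /data volume path assumed:

    # archive the data container's /data volume to the current host directory
    docker run --rm --volumes-from my-data-container -v $(pwd):/backup \
        alpine tar czf /backup/data-volume.tar.gz /data
    # copy data-volume.tar.gz to the other server, then restore it the same way
    docker run --rm --volumes-from my-new-data-container -v $(pwd):/backup \
        alpine tar xzf /backup/data-volume.tar.gz -C /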

Connecting to a running docker container - differences between using ssh and running a command with "-t -i" parameters

Could you please point out the difference between installing openssh-server and starting an SSH session with a given Docker container, versus running docker run -t -i ubuntu /bin/bash and then performing some operations? How does docker attach compare to those two methods?
Difference 1. If you want to use ssh, you need to have ssh installed on the Docker image and running on your container. You might not want to because of extra load or from a security perspective. One way to go is to keep your images as small as possible - avoids bugs like heartbleed ;). Whether you want ssh is a point of discussion, but mostly personal taste. I would say only use it for debugging, and not to actually change your image. If you would need the latter, you'd better make a new and better image. Personally, I have yet to install my first ssh server on a Docker image.
Difference 2. Using ssh you can start your container as specified by the CMD and maybe ENTRYPOINT in your Dockerfile. Ssh then allows you to inspect that container and run commands for whatever use case you might need. On the other hand, if you start your container with the bash command, you effectively overwrite your Dockerfile CMD. If you then want to test that CMD, you can still run it manually (probably as a background process). When debugging my images, I do that all the time. This is from a development point of view.
Difference 3. An extension of the 2nd, but from a different point of view. In production, ssh will always allow you to check out your running container. Docker has other options useful in this respect, like docker cp, docker logs and indeed docker attach.
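For reference, those production-side options look like this (the container name is a placeholder):

    docker logs -f mycontainer                 # stream the container's stdout/stderr
    docker cp mycontainer:/var/log/app.log .   # copy a file out of the running container
    docker attach mycontainer                  # attach your terminal to the container's main process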
According to the docs, "The attach command will allow you to view or interact with any running container, detached (-d) or interactive (-i). You can attach to the same container at the same time - screen sharing style, or quickly view the progress of your daemonized process." However, I am having trouble actually using this in a useful manner. Maybe someone who uses it could elaborate on that?
Those are the only essential differences. There is no difference for image layers, committing or anything like that.