Summary
When I create a backup of GitLab. I always have different checksums.
Steps to reproduce
sudo gitlab-rake gitlab:backup:create STRATEGY=copy
What is the current bug
I created backup-script, its backup GitLab hourly and send archive in cloud storage.
In two archives with identical contents, there is always a different check-sum.
Why do I have two folders with the same files and the same checksums, but when this folders is archived, I get different checksum? Content has not changed, but checksum has always changed. Why?
What is the expected correct behavior?
When the archive is not edited, it must have the same checksum
Relevant logs and/or screenshots
94e779cbe595eda6f79f15437d6059ec50c40de9efe01c7c8227b2c799556aac artifacts.tar.gz (first)
a15da160a4bc6d308f47bd0ebbbeaa09c549f07136d6f13203f05cf0374c77d2 569.log
709b40d737572628d282d5c5f97a62ea4681560d3300f5c126d34436a375618d 570.log
caf0c823c22213c63a86299c4100aec8e8913d3ef6209d36e893982d6fdf3510 571.log
dc77e18335dde4e2ba3ac38d4b2c8b9f59785057e871cceaea172596d3932a0c 572.log
709b40d737572628d282d5c5f97a62ea4681560d3300f5c126d34436a375618d 573.log
14ec475a0cbfc50408a010e14c7f5ab91ae4f675046b53b0e4a65d5dec7e2b79 574.log
67fbe4206bc4b2e5298472e155b81643fb8a30ab41b3c7971e2c9c9c0af1d9a7 artifacts.tar.gz (second)
a15da160a4bc6d308f47bd0ebbbeaa09c549f07136d6f13203f05cf0374c77d2 569.log
709b40d737572628d282d5c5f97a62ea4681560d3300f5c126d34436a375618d 570.log
caf0c823c22213c63a86299c4100aec8e8913d3ef6209d36e893982d6fdf3510 571.log
dc77e18335dde4e2ba3ac38d4b2c8b9f59785057e871cceaea172596d3932a0c 572.log
709b40d737572628d282d5c5f97a62ea4681560d3300f5c126d34436a375618d 573.log
14ec475a0cbfc50408a010e14c7f5ab91ae4f675046b53b0e4a65d5dec7e2b79 574.log
Output of checks
This bug happens on GitLab-CE Omnibus
Results of GitLab environment info
(For installations with omnibus-gitlab package run and paste the output of:
sudo gitlab-rake gitlab:env:info)
System information
System: Ubuntu 16.04
Current User: git
Using RVM: no
Ruby Version: 2.3.5p376
Gem Version: 2.6.13
Bundler Version:1.13.7
Rake Version: 12.0.0
Redis Version: 3.2.5
Git Version: 2.13.5
Sidekiq Version:5.0.4
Go Version: unknown
GitLab information
Version: 10.0.1
Revision: 2417795
Directory: /opt/gitlab/embedded/service/gitlab-rails
DB Adapter: postgresql
URL: https://git.site
HTTP Clone URL: https://git.site
SSH Clone URL: git#git.git.site
Using LDAP: no
Using Omniauth: no
GitLab Shell
Version: 5.9.0
Repository storage paths:
- default: /var/opt/gitlab/git-data/repositories
Hooks: /opt/gitlab/embedded/service/gitlab-shell/hooks
Git: /opt/gitlab/embedded/bin/git
My Issues on Gitlab.com
This seems expected considering the COPy strategy will copy first the files, before tar/gz them.
As explained in "Backup strategy option":
The default backup strategy is to essentially stream data from the respective data locations to the backup using the Linux command tar and gzip.
This works fine in most cases, but can cause problems when data is rapidly changing.
When data changes while tar is reading it, the error file changed as we read it may occur, and will cause the backup process to fail.
To combat this, 8.17 introduces a new backup strategy called copy. The strategy copies data files to a temporary location before calling tar and gzip, avoiding the error.
A side-effect is that the backup process with take up to an additional 1X disk space. The process does its best to clean up the temporary files at each stage so the problem doesn't compound, but it could be a considerable change for large installations. This is why the copy strategy is not the default in 8.17.
See lib/backup/files.rb:
# Copy files from public/files to backup/files
def dump
FileUtils.mkdir_p(Gitlab.config.backup.path)
FileUtils.rm_f(backup_tarball)
if ENV['STRATEGY'] == 'copy'
cmd = %W(cp -a #{app_files_dir} #{Gitlab.config.backup.path})
output, status = Gitlab::Popen.popen(cmd)
So the created timestamp for those copied files will change, making any checksum different.
Related
I'm running Sonatype Nexus 3 inside a docker container, with the following startup command:
docker run -d -p 80:8081 --ulimit nofile=65536:65536 --name nexus -v nexus-data:/nexus-data -e INSTALL4J_ADD_VM_PARAMS="-Xms4g -Xmx4g -XX:MaxDirectMemorySize=6717m -Djava.util.prefs.userRoot=${NEXUS_DATA}/javaprefs" sonatype/nexus3
After updating the docker image version from 3.30.0 to 3.40.1, I keep getting the following warnings regarding user prefs.
2022-07-18 13:14:45,860+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Couldn't flush user prefs: java.util.prefs.BackingStoreException: Couldn't get file lock.
2022-07-18 13:15:15,860+0000 WARN [Timer-0] *SYSTEM java.util.prefs - Could not lock User prefs. Unix error code 2.
As you can see from the startup command, the user prefs directory is inside the docker volume and at directory /nexus-data/javaprefs . I have tried looking for existing locks inside the directory, but found none. I've also tried completely deleting the directory and saw that the warning still came up and the folder itself wasn't being created by Nexus.
I honestly don't even know if this is an important issue or not, since there is little to no documentation about the user preferences folder.
Even a way to turn off the warning log which fires every 30s would be useful.
----UPDATE----
I've tried doing a clean installation of Nexus through Docker, following the simple instructions inside the github sonatype nexus3 docker repository, and still find these warnings.
I even tried on a different OS (Windwos instead of linux, through Docker Desktop) and with and without a volume for /nexus-data.
At this point I believe it to be a bug in a newer Nexus version.
TLDR: Adding -Djava.util.prefs.userRoot=/nexus-data/javaprefs should solve the problem, assuming the nexus data directory is at /nexus-data/.
Just had the same issue after upgrading from 3.38.1 to 3.42.0. After some investigation found that indeed the java.util.prefs.userRoot property got lost somewhere between those versions. The default value in the vanilla Nexus 3.38.1 is /nexus-data/javaprefs.
I am building WebRTC library using travis CI.
This is running well but takes lots of time and more and more often the build ends with the message :
The job exceeded the maximum time limit for jobs, and has been
terminated.
You can consult a log that failed travis log
During the gclient sync :
_______ running 'download_from_google_storage --directory --recursive --num_threads=10 --no_auth --quiet --bucket chromium-webrtc-resources src/resources' in '/home/travis/build/mpromonet/webrtc-streamer/webrtc'
...
Hook 'download_from_google_storage --directory --recursive --num_threads=10 --no_auth --quiet --bucket chromium-webrtc-resources src/resources' took 1255.11 secs
I disabled the tests, so I think this is useless and it takes lots of time.
Is there anyway to give some arguments or setting some variables to avoid this time costly task ?
A way to not download chromium-webrtc-resources defined in dependencies DEPS
{
# Download test resources, i.e. video and audio files from Google Storage.
'pattern': '.',
'action': ['download_from_google_storage',
'--directory',
'--recursive',
'--num_threads=10',
'--no_auth',
'--quiet',
'--bucket', 'chromium-webrtc-resources',
'src/resources'],
},
is to pached it removing this section or adding a condition that is false.
In order to patch I used the folowing command :
sed -i -e "s|'src/resources'],|'src/resources'],'condition':'rtc_include_tests==true',|" src/DEPS
This save about 20mn and allow the travis build to stay below the timeout.
You can bake the entire toolchain into a docker image and run your actual tests/builds in that. Delegate the docker image update into another automated process (travis-ci cronjob for example).
An additional benefit is that you now have full control over when parts of your toolchain change. I find that very important.
Edit:
Some resources to read.
The official travis docs for using docker
Building & deploying images on travis
Dockerhub automated builds
I'm starting with Ansible, and I found that there is a module called command which lets me execute any command in a remote node.
I saw a couple of example where initial setups are solved by using command instead of specific modules. For example, as far as I know, both of these do the same task:
- name: Install git using apt module
apt:
name: git
state: present
- name: Install git using command
command: apt-get install git
So, my question is: is there any difference or any reason to use a module instead of command?
The difference in short is that using a specific module will give you playbook's idempotence and provide better portability and readability.
What I mean by idempotence? When you run:
- name: Install git using apt module
apt:
name: git
state: present
It will install git package only if it is not yet installed on the target system and after playbook run this task will be reported in green colour (OK) if git had been already installed.
2nd approach with the command module:
- name: Install git using command
command: apt-get install git
Above command will always report status as changed (yellow colour) when in fact nothing changed (assuming git package had been already installed). There are ways to make tasks that use the command module idempotent as well but it costs you some more work.
Best practice is to always use a specific module before command in playbooks.
Ansible is all about describing and managing system state. When you run a playbook on a certain target system it can be very misleading to see a task reporting a changed state while in fact nothing has been changed.
Think declaratively about describing desired state, not about low level commands needed to get a system to this state.
Below article will also provide some explanation around differences and consequences of using command vs specific module:
Ansible Best Practices: The Essentials
There are probably numerous reasons but here are a few:
Intrinsic idempotence (does not execute task every time without extra effort)
Superior readability (much clearer what you are trying to do)
More concise tasks (much fewer words to describe the task)
Platform-agnostic execution (works on all OS instead of just one without extra effort)
Is it possible to delete old builds in Gitlab CI?
I tested a few things and have now about 20 builds that are useless (most are failed anyway).
It also shows stages that I don't have anymore which kinda clutters the Pipelines page and some of the uploaded artifacts are a bit big.
I wasn't able to find any documentation on this, only that disabling CI in the settings doesn't remove the builds.
Using Gitlab 8.10 Community (hosted by Gitlab.com)
There is currently no option in the GUI to completely get rid of a build other than expunge related data from the build. (The erase option in the build)
If you would have a local installation you could modify the database directly but I would advise caution. (I'll put the guide here for completeness sake)
Login to the GitLab database. If you use the default PostgreSQL :
sudo -u gitlab-psql /opt/gitlab/embedded/bin/psql -h /var/opt/gitlab/postgresql -d gitlabhq_production
Check if there is a table ci_builds. For pSQL: \dt
Delete the builds with normal SQL. For example: DELETE FROM ci_builds WHERE id = 2
(Optional) If you want to cleanup a list of commits which triggered a build you need to midify the table ci_commits.
I'm cloning an SVN repository to git as part of our migration plan. I've hit various snags along the way, forcing me to continue the clone with a git svn fetch command. The most recent failure I can't figure out how to solve:
$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t 8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
got: 4334926a01c905cdb7fce71265e370c1
I found this related answer, however that solution doesn't work because git svn log is not yet functional, as the repo is not fully in place:
$ git svn log dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t
fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions
log --no-color --first-parent --pretty=medium HEAD: command returned error: 128
How can I proceed?
Another answer to an old question but straight forward solutions are tough to find for this problem so hopefully this helps others.
I think this issue occurs due to a corrupted file during transfer. Not sure how or why it happens, but in my case, I get the same error at different revisions every time I do a new clone and sometimes not at all.
Using the questioners error message
$ git svn fetch
Checksum mismatch: dc/trunk-4632-jh/dc-smtpd/lib/Qpsmtpd/Address.pm.t
8ce3aea3f47dc115e8fe53bd62d0f074cfe93ec6
expected: 59de969022e46135fa6dc7599fc2f3b4
got: 4334926a01c905cdb7fce71265e370c1
The following steps allowed me to resume and progress :-
View all branches. These will all be remote branches. git branch -a
Checkout branch affected. git checkout remotes/origin/trunk-4632-jh
This will take some time to complete.
Find the last revision that the problematic file was changed. git svn log dc-smtpd/lib/Qpsmtpd/Address.pm.t
Note the highest revision #
Reset back to this rev. git svn reset -r (rev #) -p
Carry on. git svn fetch
Good luck.
I know this is old but maybe it will be helpful for future reference as all search results on this are not helpful.
I've hit similar issue on our huge repository which takes days to clone and unfortunately at one point I had to restart my machine. I am currently working out how to resolve the problem, so please keep in mind this is more a suggestion than tested solution.
I think you need to try creating a branch and checking out the commits you currently have from previous fetch:
git checkout -b master git-svn
After that is done you should have working tree up to that commit. Another fetches will probably fail due to object mismatch but at that point at least it should be possible to use "git svn reset" to revert faulty svn fetches (see OP's related answer link). If that's true find offending commit, reset before it and then continue fetching.
You might want to rebase and revert to state before that broken commit on your master branch or convert back to bare repository, if that's what you're after (in my case it is).
Hope this works. I'll post an update when my checkout is done (will take at least few hours... sigh).
Edit: That seemed to work. I successfully discarded some git-svn commits and am able to re-fetch them again. :)
Edit2: Make sure to reset until you don't get any object mismatch warnings on git svn fetch (otherwise you will run into the same issue soon).
Cheers,
Henryk
See also: Git svn rebase : checksum mismatch
In our case the additional treatment of the files (server-side includes in Apache) caused the checksum problem.
Disabling SSI in Apache's /etc/httpd.conf file for the period of migration by commenting out the
AddType text/html .shtml
AddOutputFilter INCLUDES .shtml
directives solved the problem, caused by the interpretation of .shtml files by the front-end Apache server, which produced a new content (and thus a new hash), other than the hash of the original file itself.
That means some files in the repository got corrupted. It can be caused by various reasons such as software bugs, bit rots in drives, etc. I was recently transitioning very old ~10GB svn repository to git, therefore some corruption was expected.
To fix the corruption, you basically need to dump the entire repository and import it while filtering the errors out. Note that our goal is to complete the import process no matter why or how the repository got corrupted. You cannot simply fix the corruption without having a backup and diffing through the revision files.
First basic one-off command you could use is:
svnadmin create repo2
svnadmin dump repo | sed '/^Text-content-md5/d' | svnadmin load repo2
This removes the checksum calculation from the dump so the new repo will have updated checksums.
If you encountered more errors during the dump and load (which is expected), try incremental approach so you can continue from the point you left. Below command will dump the revisions starting from 101 to 150 (inclusive).
svnadmin dump --incremental -r101:150 repo | sed '/^Text-content-md5/d' | svnadmin load repo2
Some common errors and solutions:
'Premature end of content data in dumpstream': That means Content-length of some file does not match the repository version, so some data is lost in the specified file. We must skip it. Add | svndumpfilter exclude path/to/file.jar command like this:
svnadmin dump --incremental -r101:150 repo | svndumpfilter exclude path/to/file.jar | sed '/^Text-content-md5/d' | svnadmin load repo2
Property errors: Add --bypass-prop-validation to svnadmin load command
After populating your second repo, you would simply svnserve -d -r repo2 and try git svn fetch again.
Good luck!