I need your help on a subject.
I need to perform a backup of our Rundeck system and send it to a server in GCP, but there is more than 90 GB of data and I don't know how to make this backup.
All my attempts to compress with gzip, bzip2 and xz and to transfer with rsync have failed; the error is basically that the file is too big.
What would be the best way to perform the backup?
Could you give me suggestions?
Thanks in advance.
You need to follow this guide. Also, as a tip: exclude the execution logs from your project export to save space.
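If the single huge archive is the problem, one workaround (a sketch, not Rundeck's official procedure; the data path, exclude pattern and bucket name below are assumptions for a typical install) is to tar the Rundeck data directory, split the stream so no single file is too large, and push the parts to a GCS bucket with gsutil:
# Create the archive as a stream, excluding execution logs, and split it into 4 GB parts
tar -czf - --exclude='*/logs/*' /var/lib/rundeck | split -b 4G - rundeck-backup.tgz.part-
# Upload the parts in parallel to the bucket
gsutil -m cp rundeck-backup.tgz.part-* gs://my-backup-bucket/rundeck/
# Restore later by concatenating the parts:
#   cat rundeck-backup.tgz.part-* | tar -xzf -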
Does anyone know a SQL query that will purge a MediaWiki database of old revisions? My database has grown out of control, and I need to prune it to make it possible to download and manage.
I don't have shell access, so I need to do this with a SQL query.
I have tried the solution suggested at http://www.mediawiki.org/wiki/Extension_talk:SpecialDeleteOldRevisions2#Deleting_only_archived_revisions, but it doesn't work.
Thanks for reading :)
Nicholas
Like you, I don't have shell access to my MediaWiki, so I can't do a lot of things like maintenance.
Here is my solution: host your MediaWiki web site on your own computer just to do your maintenance tasks:
Back up your database
Back up your MediaWiki folder
Set up Apache (the web server) on your computer
Set up MySQL on your computer
Restore your MediaWiki database on your computer
Put your MediaWiki folder in the Apache root folder
Finally, run the maintenance task you want from the shell. I suggest the deleteOldRevisions script (see the sketch after these steps).
After that, back up the folder and the database again and restore them on the remote host.
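For the last step, a minimal invocation of the deleteOldRevisions maintenance script might look like this (run from the root of your local MediaWiki copy; without --delete the script only reports what it would remove):
php maintenance/deleteOldRevisions.php --delete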
Use the Maintenance extension and run the relevant maintenance scripts with it. Direct database manipulation is pure madness, and using a local LAMP install as suggested by the other answer is quite cumbersome.
Shell access is really required to properly run a MediaWiki, but this is a common problem; please report your experience with the extension on the talk page, or file a bug if you find any.
Does anyone know of a script or program that can be used for backing up multiple websites?
Ideally, I would like to have it set up on the server where the backups will be stored.
I would like to be able to add the website login info and have it connect, create a zip file or similar, and send that back to the backup server to be saved.
It would also need to be able to run as a cron job so everything is backed up at least daily.
I can find PC-to-server backup tools that are similar, but no server-to-server remote backup scripts.
It would be heavily used, and it needs a GUI so the less techy can use it too.
Does anyone know of anything similar to what we need?
HTTrack website mirroring utility.
Wget and scripts
RSync and FTP login (or SFTP for security)
Git can be used for backup and has security features and networking ability.
7Zip can be called from the command line to create a zip file.
In any case you will need to implement either secure FTP (SSH secured) OR a password-secured upload form. If you feel clever you might use WebDAV.
Here's what I would do:
Put a backup generator script on each website (outputting a ZIP)
Protect its access with a .htpasswd file
On the backup server, have a cron script download all the backups and store them
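A minimal sketch of the backup-server side, assuming each site exposes a backup generator script protected by the .htpasswd (the URLs, paths and credentials below are placeholders); schedule it from cron, e.g. daily:
#!/bin/bash
# Pull each site's backup ZIP and store it with a date stamp
SITES="https://site1.example.com/backup.php https://site2.example.com/backup.php"
DEST=/var/backups/websites
mkdir -p "$DEST"
for url in $SITES; do
    host=$(echo "$url" | awk -F/ '{print $3}')
    wget -q --user=backup --password="$BACKUP_PASS" \
         -O "$DEST/${host}-$(date +%F).zip" "$url"
done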
I have a fairly large amount of data (~30G, split into ~100 files) I'd like to transfer between S3 and EC2: when I fire up the EC2 instances I'd like to copy the data from S3 to EC2 local disks as quickly as I can, and when I'm done processing I'd like to copy the results back to S3.
I'm looking for a tool that'll do a fast / parallel copy of the data back and forth. I have several scripts hacked up, including one that does a decent job, so I'm not looking for pointers to basic libraries; I'm looking for something fast and reliable.
Unfortunately, Adam's suggestion won't work, as his understanding of EBS is wrong (although I wish he were right and have often thought it should work that way). EBS has nothing to do with S3; it only gives you an "external drive" for EC2 instances that is separate from, but attachable to, the instances. You still have to copy between S3 and EC2, even though there are no data transfer costs between the two.
You didn't mention the operating system of your instance, so I cannot give tailored information. A popular command-line tool I use is http://s3tools.org/s3cmd ... it is based on Python and therefore, according to its website, should work on Windows as well as Linux, although I use it all the time on Linux. You could easily whip up a quick script that uses its built-in "sync" command, which works similarly to rsync, and have it triggered every time you're done processing your data. You could also use the recursive put and get commands to move data only when needed.
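For example, the round trip could be as simple as this (bucket and paths are placeholders):
# On instance startup: pull the input data down from S3
s3cmd sync s3://my-bucket/input/ /mnt/data/input/
# After processing: push the results back up
s3cmd sync /mnt/data/results/ s3://my-bucket/results/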
There are also graphical tools like CloudBerry Pro for Windows that have some command-line options you can use to set up scheduled commands. http://s3tools.org/s3cmd is probably the easiest.
By now there is a sync command in the AWS command-line tools that should do the trick: http://docs.aws.amazon.com/cli/latest/reference/s3/sync.html
On startup:
aws s3 sync s3://mybucket /mylocalfolder
before shutdown:
aws s3 sync /mylocalfolder s3://mybucket
Of course, the details are always fun to work out, e.g. how parallel it is (and whether you can make it more parallel, and whether that is any faster given the virtual nature of the whole setup).
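If the degree of parallelism matters, the AWS CLI does let you tune its S3 transfer settings; the values below are just examples, not recommendations:
# Raise the number of concurrent S3 requests and the multipart chunk size
aws configure set default.s3.max_concurrent_requests 20
aws configure set default.s3.multipart_chunksize 16MB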
Btw hope you're still working on this... or somebody is. ;)
I think you might be better off using an Elastic Block Store to store your files instead of S3. An EBS is akin to a 'drive' on S3 that can be mounted into your EC2 instance without having to copy the data each time, thereby allowing you to persist your data between EC2 instances without having to write to or read from S3 each time.
http://aws.amazon.com/ebs/
Install the s3cmd package with
yum install s3cmd
or
sudo apt-get install s3cmd
depending on your OS
Then copy data with this:
s3cmd get s3://tecadmin/file.txt
The ls command can also list the files.
For more details, see this.
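To move a whole prefix rather than a single file, the recursive forms look like this (the bucket and paths follow the example above and are placeholders):
s3cmd ls s3://tecadmin/
s3cmd get --recursive s3://tecadmin/data/ /mnt/local/data/
s3cmd put --recursive /mnt/local/results/ s3://tecadmin/results/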
For me the best way is:
wget http://s3.amazonaws.com/my_bucket/my_folder/my_file.ext
from PuTTY.
I have a bash file that contains wget commands to download over 100,000 files totaling around 20 GB of data.
The bash file looks something like:
wget http://something.com/path/to/file.data
wget http://something.com/path/to/file2.data
wget http://something.com/path/to/file3.data
wget http://something.com/path/to/file4.data
And there are exactly 114,770 rows of this. How reliable would it be to SSH into a server I have an account on and run this? Would my SSH session time out eventually? Would I have to stay SSH'd in the entire time? What if my local computer crashed or got shut down?
Also, does anyone know how many resources this would take? Am I crazy to want to do this on a shared server?
I know this is a weird question, just wondering if anyone has any ideas. Thanks!
Use
nohup ./scriptname &> logname.log &
This will ensure that:
The process will continue even if the SSH session is interrupted
You can monitor it while it is running
I would also recommend printing a progress message at regular intervals; it will be good for log analysis, e.g. echo "1000 files copied"
As far as resource utilisation is concerned, it depends entirely on the system, and mainly on the network characteristics. Theoretically you can calculate the time from just the data size and bandwidth, but in real life delays, latency and data loss come into the picture.
So make some assumptions, do some maths and you'll get the answer :)
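As a sketch of that kind of progress prompt, the one-wget-per-line script could be turned into a loop over a URL list (the file name urls.txt is a placeholder):
#!/bin/bash
# Download every URL in urls.txt and print a progress line every 1000 files
n=0
while read -r url; do
    wget -q "$url"
    n=$((n + 1))
    if [ $((n % 1000)) -eq 0 ]; then
        echo "$n files copied"
    fi
done < urls.txt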
It also depends on the reliability of the communication medium, hardware, ...!
You can use screen to keep it running while you disconnect from the remote computer.
You want to detach the script from your shell and have it run in the background (using nohup), so that it continues running when you log out.
You also want some kind of progress indicator, such as a log file that records every file that was downloaded as well as all the error messages. nohup redirects stderr and stdout into files.
With such a file, you can pick up broken downloads and aborted runs later on.
Give it a test-run first with a small set of files to see if you got the command down and like the output.
I suggest you detach it from your shell with nohup.
$ nohup myLongRunningScript.sh > script.stdout 2>script.stderr &
$ exit
The script will run to completion - you don't need to be logged in throughout.
Do check for any options you can give wget to make it retry on failure.
If it is possible, generate MD5 checksums for all of the files and use them to check that they were all transferred correctly.
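For the retry part, wget's standard options cover it; for example (the URL is from the question):
wget --tries=3 --continue http://something.com/path/to/file.data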
Start it with
nohup ./scriptname &
and you should be fine.
Also, I would recommend logging the progress so that you can find out where it stopped, if it does.
wget url >> logfile.log 2>&1
could be enough (wget writes its progress to stderr, hence the 2>&1).
To monitor progress live you could:
tail -f logfile.log
It may be worth looking at an alternative tool, like rsync. I've used it on many projects and it works very, very well.
I would like to make a complete backup of my whole Joomla 1.5 based site from time to time. How would this ideally be done? Are there any common pitfalls? Note that I only have FTP access to the hosting server. Is there a step-by-step tutorial somewhere? I am using the latest JoomGallery and Kunena 1.0.9 (Legacy mode).
Maybe there is a good way to automate this?
There are two parts of the backup you have to worry about: the database and the files.
The first part is the database. It can be backed up using something like phpMyAdmin. If you don't have this available on your server already, it's not too hard to upload and get it going yourself. From there, you can just export the entire database to a gzip file.
The second part is the code and uploaded files. The code base shouldn't change too often, so you could probably just make one backup of this. There are a number of ways; the simplest is to just download the entire folder via FTP, though if you're on Linux, I'm sure someone will know a single command to get all the changed files (rsync?).
The database is the main thing you have to worry about, though: everything else should be rebuildable just by reinstalling.
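If you do get SSH access at some point, that single command for the files part might look like this (host, user and paths are placeholders):
# Mirror the remote Joomla folder into a local backup directory
rsync -az user@myhost.example.com:/var/www/joomla/ /backups/joomla-files/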
I think this: http://www.joomlapack.net/ is what you need. I use it myself and it works like a charm, both for backups and for moving my Joomla installations from developer sites to the live site.
Get an FTP synchronisation tool and keep an up-to-date copy of your site locally. Then you could run the batch script
mysqldump -hhost -uuser -p%1 schema > C:\backup.sql
to create a backup of your mysql tables at various points in time.
Edit:
You would have to have MySQL Server installed on your local machine and the path to its bin directory in your PATH, in order to run the mysqldump command without much hassle. -p%1 takes the password provided on the command line, as you wouldn't want to store passwords in your batch script.
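A run from the command prompt would then look something like this (the script name backup.bat is hypothetical):
backup.bat MySecretPassword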
If you only have FTP access you have a bit of a problem, as besides all the files you'll also have to back up the database. Without access to the database, a full backup won't do you any good.
Whatever backup strategy you choose, be sure it can handle UTF-8 correctly. Joomla 1.5 stores all content as UTF-8, even when the database charset is set to 'iso-8859-1', so when the backup solution detects the database charset, characters like € or é will come out garbled (mojibake such as "Ã©" for é) - not really what you want.
I absolutely endorse using Joomlapack - it works great. The optional remote tools allow you to initiate the backup from a Windows desktop machine; it performs the backup and downloads it. The remote tool has a scheduler, and you can also set it off to back up and download a list of sites.
Joomlapack also provides a file "kickstart.php" which you copy to your empty server account along with the backup, which automates the restore procedure. You do have to create an empty database with PHPMyAdmin or similar, and you are given the opportunity to supply the database parameters (host, database, username, password) during the process.
One pitfall I did run into with this though is that some common components can have absolute URLs in their configuration - e.g. SOBI2, Virtuemart. It's then just a matter of finding the appropriate configuration file, editing it and re-uploading it.
Another problem was that one archive file (either ZIP or their JPA format) had a filename with a "?" character in it (it came from a Linux server), and this caused a bit of a problem when trying to install it locally on a Windows WAMP stack - extracting the ZIP file failed and the process could not complete cleanly.
I suggest using the automatic backup service at http://www.everlive.net
Update:
OK, here is some more information. EverLive.net is a website where you can create a free account. Enter your website details and you are ready to take your backups with just one click. Restore is also possible in the same way.
Furthermore, you can use the automatic backup option to take backups at defined intervals. Other than that, you can use the website health check service to be informed if your website is not available.