OpenLDAP BDB log file maintenance and auto removal - ldap

I have a question about the log files OpenLDAP/BDB creates in the data directory. These files have the form log.XXXXXXXXXX (X is a digit) and each has the same size (which is configurable in DB_CONFIG).
I read a lot about checkpointing and log file maintenance in the OpenLDAP and BDB documentatioon. It seems to be normal that these files grow very fast and need maintenance. Normally you should backup them regularly and delete them afterwards. But how to handle this during a long running data migration?
In my case running a test migration for 375 accounts which triggers 3 write requests per account to the LDAP server produces 6 log files with 5 MB each. The problem ist there are more than 37000 accounts on the live system that need to be migrated and the creation of several gigabytes of log files is not accepted.
Because of that I tried to configure auto removal of the log files but the suggested solution is not working for me. After reading through the documentation, my conclusion was that I have to enable checkpoints via slapd.conf and set the DB_LOG_AUTOREMOVE flag in the DB_CONFIG file like this:
My settings in slapd.conf:
checkpoint 128 15
My settings in DB_CONFIG:
set_flags DB_LOG_AUTOREMOVE
set_lg_regionmax 262144
set_lg_bsize 2097152
But the log files are still there - even if I decrease the checkpoint settings to checkpoint 1 1. If I run slapd_db_archive -d in the data directory all of these files but the very last get removed.
Does anyone have an idea how the get the auto removal working? I am close to giving up and add a cron job to run slapd_db_archive -d during the migration. But I am not sure if this may cause problems.
We are using OpenLDAP 2.3.43 with the BDB Backend (HDB to be precise) on centos.

In BDB (dunno HDB), DB_LOG_AUTOREMOVE removes log.* files that do not reference records currently in the database. This is not the same as removing all log files.

Related

Host Disk Usage: Warning message regarding disk usage

I've downloaded version HDF_3.0.2.0_vmware of the Hortonworks Sandbox. I am using VMWare Player version 6.0.7 on my laptop. Shortly after startup/logging into Ambari, I see this alert:
The message that is cut off reads: "Capacity Used: [60.11%, 32.3 GB], Capacity Total: [53.7 GB], path=/usr/hdp". I'd hoped that I would be able to focus on NiFi/Storm development rather than administering the sandbox itself, however it looks like the VM is undersized. Here are the VM settings I have for storage. How do I go about correcting the underlying issue prompting the alert?
I had similar issue, it's about node partitioning and directories mounted for data under HDFS -> Configs -> Settings -> DataNode
You can check your node partitioning using below command-
lsblk -o NAME,FSTYPE,SIZE,MOUNTPOINT,LABEL
Mostly hdfs namenode or datanode directories point to root partitions. We can change thresholds values for alerts temporary and to have permanent solution we can add additional data directories.
Below links can he helpful to do the same.
https://community.hortonworks.com/questions/21212/configure-storage-capacity-of-hadoop-cluster.html
Check from above link - I think your partitioning is wrong you are not using "/" for hdfs directory. If you want use full disk capacity, you can create any folder name under "/" example /data/1 on every data node using command "#mkdir -p /data/1" and add to it dfs.datanode.data.dir. restart the hdfs service.
https://hadooptips.wordpress.com/2015/10/16/fixing-ambari-agent-disk-usage-alert-critical/
https://community.hortonworks.com/questions/21687/how-to-increase-the-capacity-of-hdfs.html
I am not currently able to replicate this, but based on the screenshots the warning is just that there is less space available than recommended. If this is the case everything should still work.
Given that this is a sandbox that should never be used for production, feel free to ignore the warning.
If you want to get rid fo the warning sign, it may be possible to do a quick fix by changing the warning treshold via the alert definition.
If this is still not sufficient, or you want to leverage more storage, please follow the steps outlined by #manohar

Undo zfs create

I have a problem. I created a pool consisting of single volume of 1 file 2.5Tb just to fight with file duplicates. I copied a folder with photos. Some of the photos were not backed up. Just now I see my pool folder is empty. When I checked with 'sudo zfs list' it said 'No datasets available'.
I thought it was detached and to attach I started again all these commands.
sudo zpool create singlepool -f /home/john/zfsvolumes/zfs_single_volume.dat -m /home/share/zfssinglepool
sudo zfs set dedup=on singlepool
sudo zpool get dedupratio singlepool
sudo zfs set compression=lz4 singlepool
sudo chown -R writer:writer /home/share/zfssinglepool
I see now empty pool!
May I get my folders back which I copied to the pool before I started create pool again?
Unfortunately, use of zpool create -f will recreate the pool from scratch even if ZFS recognizes that a pool has already been created using that storage:
-f Forces use of vdevs, even if they appear in use or specify a
conflicting replication level. Not all devices can be over-
ridden in this manner.
This is similar to reformatting a partition with other file systems, which will leave whatever data is there written in place, but still erase the references the file system needs to find the data. You may be able to pay an expert to reconstruct your data, but otherwise I'm afraid the data will be very hard to get back from your pool. As in any data recovery mission, I'd advise making a copy of the data ASAP on some external media that you can use to do the recovery from, in case further attempts at recovery accidentally corrupt the data even worse.

what's the performance impact causing from the large size of Apache's access.log?

If the logs file like access.log or error.log gets very large, will the large-size problem impact the performance of Apache running or user accessing? From my understanding, Apache doesn't read entire logs into memory, but just make use of filehandle to write. Right? If so, I don't have to remove the logs manually every time when it's large enough except for the filesystem issue. Please help and correct me if I'm wrong. Or is there any Apache Log I/O issue I'm supposed to take care when running it?
Thx very much
Well, i totally agree with you. Per my understanding apache access the log files using handlers and just append the new message at the end of the file. That's way a huge log file will not make the difference when has to do with writing to the file. But may be if you want to access the file or open it with a kind of logging monitoring tool then the huge size will slowdown the process of reading the file.
So i would suggest you to use log rotation to have an overall better end result.
This suggestion is directly form the apche web site.
Log Rotation
On even a moderately busy server, the quantity of information stored in the log files is very large. The access log file typically grows 1 MB or more per 10,000 requests. It will consequently be necessary to periodically rotate the log files by moving or deleting the existing logs. This cannot be done while the server is running, because Apache will continue writing to the old log file as long as it holds the file open. Instead, the server must be restarted after the log files are moved or deleted so that it will open new log files.
From the Apache Software Foundation site

Use rsync without copying files that are in use

I have a server (Machine A) that receives uploads throughout the day from other machines. I have a script running on another internal server (running as cron - Machine B) that uses rsync to pull these files onto itself and remove the originals on Machine A. Some of these uploads last an hour or more.
How do I use rsync so that it won't attempt to copy files that are currently uploaded (being written to)? I don't want it to pull partial uploads and then attempt to process them.
I'm using Ubuntu 10.04 64-bit on both machine A & B.
In order to make incremental baclups in rsync, you should put the --update or -u option. The only situation in which the a file existing in the receiver will be updated is when the archive exists and has the same timestamp in both ends but the size differs.
About the partial updates, all the temporary uploads are stored in a temporary archive and then moved to the dest directory when uploaded. you can use the --partial in case of a rsync or network problem, this will resume the partial updates next time you execute the sync again.
You can check the whole options from this man page.

Joomla 1.5 Site Backup Strategy

I would like to make a complete backup of my whole joomla 1.5 based site from time to time. How would this ideally be done? Are there any common pitfalls? Not that I only have ftp access to the hosting server. Is there a step by step tutorial somewhere? I am using latest Joomgallery and Kunena 1.0.9 (Legacy mode).
Maybe there is a good way to automate this?
There's two parts of the backup you have to worry about, the database and the files.
The first part is the database. It can be backed up using something like phpMyAdmin. If you don't have this available on your server already, it's not too hard to upload and get it going yourself. From there, you can just Export the entire database to a gzip file.
The second part is the code and uploaded files. The code base shouldn't change too often, so you could probably just make one backup of this. There's a number of ways. The simplest is to just download the entire folder via FTP, though if you're Linux, I'm sure someone will know a single command line to get all the changed files (rsync?).
The database is the main thing you have to worry about though: everything else should be able to be rebuilt just by reinstalling.
I think this: http://www.joomlapack.net/ is what you need. I use it myself and it works like a charm. Both for backups and for moving my Joomla installations from developer sites and to the real site.
get an FTP synchronisation tool and keep an up-to-date copy of your site locally. Then you could run the batch script
mysqldump -hhost -uuser -p%1 schema > C:\backup.sql
to create a backup of your mysql tables at various points in time.
edit
you would have to have MySQL Server installed on your local machine and path to its bin directory in you PATH, in order to run the mysqldump command without much hassle. -p%1 would take the command-line provided password, as you wouldn't want to store passwords in your batch script.
If you only have FTP access you are in a bit of a problem, as beside all files you'll also have to backup the database. Without accessing the database, a full-backup won't do you any good.
Whatever backup strategy you choose - be sure it can handle UTF-8 correctly. Joomla 1.5 stores all content with UTF-8, even when the database charset is set on 'iso-5589-1' - so when the backup solution is detecting the database charset, some characters like € or é will result in "strange" ¬ / é - not really what you'll want.
I absolutely endorse using Joomlapack - it works great. The optional remote tools allow you to initiate the backup from a Windows desktop machine - it performs the backup and downloads it. The remote has a scheduler, and you can also set it off to backup and download a list of sites.
Joomlapack also provides a file "kickstart.php" which you copy to your empty server account along with the backup, which automates the restore procedure. You do have to create an empty database with PHPMyAdmin or similar, and you are given the opportunity to supply the database parameters (host, database, username, password) during the process.
One pitfall I did run into with this though is that some common components can have absolute URLs in their configuration - e.g. SOBI2, Virtuemart. It's then just a matter of finding the appropriate configuration file, editing it and re-uploading it.
Another problem was one archive file (either ZIP or their JPA format) got a filename with a "?" character in it (from a Linux server) and this caused a bit of a problem trying to install it locally on a Windows WAMP stack - the extract process on the ZIP file failed, and it stopped the process completing cleanly.
I suggest using automatic backup service by http://www.everlive.net
Update:
Ok, here is some more information. EverLive.net is a website where you can create a free account. Enter your website details and you are ready to take your backups withe just one click. Restore is also possible in the same way.
Further you can use automatic backup option to take automatic backups at defined intervals. Other than that, you can use the website health check service to inform you if your website is not available.