I have a running Aerospike server with about 36,000 records in a single set; each record holds only a few bins. I have configured my aerospike.conf file to persist data on disk as well:
namespace default {
    replication-factor 2
    memory-size 4G
    default-ttl 0
    storage-engine device {
        file /opt/aerospike/data/default.dat
        filesize 2T
        data-in-memory true
    }
}
The problem I'm having is that the /opt/aerospike/data/default.dat file is reported on my system as about 2TB:
/opt/aerospike/data# ls -lh
total 10M
-rw------- 1 root root 2.0T Jun 5 19:01 default.dat
My questions are:
Why does this .dat file have to be 2TB when the data I'm using in Aerospike is minimal for now?
My hard drive is only 78GB, so why isn't my Ubuntu system giving me out-of-disk-space errors?
System disk space looks fine:
df -h --total
Filesystem Size Used Avail Use% Mounted on
/dev/xvda1 79G 4.2G 72G 6% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 2.0G 8.0K 2.0G 1% /dev
tmpfs 395M 424K 395M 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 2.0G 0 2.0G 0% /run/shm
none 100M 0 100M 0% /run/user
total 83G 4.2G 76G 6% -
Anyone have any ideas?
Aerospike uses a sparse file (or something close to it) for its file-based storage.
So when you specify filesize 2T in the namespace config, Aerospike creates a sparse file of that size, which is what you are seeing.
http://en.wikipedia.org/wiki/Sparse_file
So it's just a file with some metadata, not 2TB of real data. Once the file's contents have actually filled your disk, you will see the usual disk-full errors both on your system and in Aerospike.
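You can see the difference yourself: ls reports the apparent size, while du reports the blocks actually allocated on disk (paths as in your setup):
ls -lh /opt/aerospike/data/default.dat                  # apparent size: 2.0T
du -h /opt/aerospike/data/default.dat                   # actual allocation: only a few MB
du -h --apparent-size /opt/aerospike/data/default.dat   # back to 2.0T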
When I run yum list java*, I get an error like the following:
[root@crucialer ~]# yum list java*
Repository extras is listed more than once in the configuration
Repository centosplus is listed more than once in the configuration
CentOS-8 - AppStream 17 kB/s | 2.3 kB 00:00
Errors during downloading metadata for repository 'AppStream':
- Status code: 404 for http://mirrors.cloud.aliyuncs.com/centos/8/AppStream/x86_64/os/repodata/repomd.xml (IP: 100.100.2.148)
Error: Failed to download metadata for repo 'AppStream': Cannot download repomd.xml: Cannot download repodata/repomd.xml: All mirrors were tried
[root@crucialer ~]# ping 100.100.2.148
PING 100.100.2.148 (100.100.2.148) 56(84) bytes of data.
64 bytes from 100.100.2.148: icmp_seq=1 ttl=102 time=1.94 ms
64 bytes from 100.100.2.148: icmp_seq=2 ttl=102 time=1.88 ms
64 bytes from 100.100.2.148: icmp_seq=3 ttl=102 time=2.08 ms
64 bytes from 100.100.2.148: icmp_seq=4 ttl=102 time=1.94 ms
64 bytes from 100.100.2.148: icmp_seq=5 ttl=102 time=1.93 ms
^C
--- 100.100.2.148 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 8ms
rtt min/avg/max/mdev = 1.883/1.953/2.078/0.076 ms
[root@crucialer ~]# ping www.baidu.com
PING www.a.shifen.com (180.101.49.12) 56(84) bytes of data.
64 bytes from 180.101.49.12 (180.101.49.12): icmp_seq=1 ttl=50 time=15.6 ms
64 bytes from 180.101.49.12 (180.101.49.12): icmp_seq=2 ttl=50 time=15.2 ms
64 bytes from 180.101.49.12 (180.101.49.12): icmp_seq=3 ttl=50 time=15.2 ms
64 bytes from 180.101.49.12 (180.101.49.12): icmp_seq=4 ttl=50 time=15.3 ms
^C
--- www.a.shifen.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 6ms
rtt min/avg/max/mdev = 15.223/15.331/15.581/0.210 ms
[root@crucialer yum.repos.d]# cat CentOS-Base.repo
# CentOS-Base.repo
#
# The mirror system uses the connecting IP address of the client and the
# update status of each mirror to pick mirrors that are updated to and
# geographically close to the client. You should use this for CentOS updates
# unless you are manually picking other mirrors.
#
# If the mirrorlist= does not work for you, as a fall back you can try the
# remarked out baseurl= line instead.
#
#
[base]
name=CentOS-$releasever - Base - 163.com
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=os
baseurl=http://mirrors.163.com/centos/$releasever/os/$basearch/
gpgcheck=1
gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7
#released updates
[updates]
name=CentOS-$releasever - Updates - 163.com
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=updates
baseurl=http://mirrors.163.com/centos/$releasever/updates/$basearch/
gpgcheck=1
gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7
#additional packages that may be useful
[extras]
name=CentOS-$releasever - Extras - 163.com
#mirrorlist=http://mirrorlist.centos.org/?release=$releasever&arch=$basearch&repo=extras
baseurl=http://mirrors.163.com/centos/$releasever/extras/$basearch/
gpgcheck=1
gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7
#additional packages that extend functionality of existing packages
[centosplus]
name=CentOS-$releasever - Plus - 163.com
baseurl=http://mirrors.163.com/centos/$releasever/centosplus/$basearch/
gpgcheck=1
enabled=0
gpgkey=http://mirrors.163.com/centos/RPM-GPG-KEY-CentOS-7
Any suggestions would be greatly appreciated.
Pierz has rightly pointed this out. I would like to add a few commands that switch the repos over to vault.centos.org:
# sed -i 's/mirrorlist/#mirrorlist/g' /etc/yum.repos.d/CentOS-*
# sed -i 's|#baseurl=http://mirror.centos.org|baseurl=http://vault.centos.org|g' /etc/yum.repos.d/CentOS-*
Now run the command: yum list java*
This is probably down to the fact that CentOS Linux 8 has reached End Of Life. The linked article explains that if you want to keep CentOS 8 you'll need to change the repos to use vault.centos.org, BUT there will be no further updates. If you want to keep receiving updates you should migrate to CentOS Stream - one way to do this:
sudo dnf --disablerepo '*' --enablerepo=extras swap centos-linux-repos centos-stream-repos
sudo dnf distro-sync
Also, looking at your config it seems you have some references to CentOS 7 which might interfere with things, though hopefully the update will deal with them. Note: CentOS 7 is supported until 2024-06-30.
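If you want to hunt down those CentOS 7 leftovers, a quick grep over the repo definitions should surface them (assuming the standard /etc/yum.repos.d/ location):
# grep -rn 'CentOS-7\|centos/7' /etc/yum.repos.d/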
I've been using Google Colab with the GPU backend. When I used it in December, the disk size for the GPU backend was more than 300 GB. Now running df -h on the virtual machine shows this:
Filesystem Size Used Avail Use% Mounted on
overlay 69G 33G 33G 50% /
tmpfs 64M 0 64M 0% /dev
tmpfs 6.4G 0 6.4G 0% /sys/fs/cgroup
/dev/sda1 75G 37G 39G 49% /opt/bin
tmpfs 6.4G 12K 6.4G 1% /var/colab
shm 5.9G 4.0K 5.9G 1% /dev/shm
tmpfs 6.4G 0 6.4G 0% /proc/acpi
tmpfs 6.4G 0 6.4G 0% /proc/scsi
tmpfs 6.4G 0 6.4G 0% /sys/firmware
Do you know if something has changed? I searched the web for news about this but couldn't find any. Previously, the overlay filesystem was 359 GB.
Thanks in advance for any clues.
It seems to be a new issue. I found this on GitHub: https://github.com/googlecolab/colabtools/issues/919.
What's ironic about this problem is that one proposed solution is to mount Google Drive, so I bought 200 GB of Google Drive storage. However, disk size is still an issue: once Google Drive is mounted, it starts caching files in /root/.config/Google/DriveFS/[uniqueid]/content_cache. The cache has no size limit; it never deletes or replaces anything, it just accumulates until it takes up the whole disk and makes the code crash. :(
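One crude workaround (a sketch only, assuming the cache path above; the unique id differs per session, hence the glob) is to clear the DriveFS content cache from a notebook cell whenever the disk fills up:
# Free the space taken by the DriveFS content cache, then verify
!rm -rf /root/.config/Google/DriveFS/*/content_cache/*
!df -h /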
I have a Debian VM deployed on Bluemix, and I want to increase its storage by mounting a Block Storage device.
I followed the instructions for the new beta Block Storage service and created a volume, then attached it to the VM as a new device, but it seems that although the volume is attached to the VM, it is not automatically mounted.
I tried several ways to mount it, but did not find the correct one. I even tried cloning the fstab line for the mounted root device (I suspected the additional volume should be similar), but it did not work (and even broke my machine's reboot)... So, can someone please advise me how to mount the Bluemix Block Storage volume on the VM?
Thanks!
By attaching a volume you've essentially done the equivalent of plugging a raw, physical hard disk into your system. Before you can mount it you'll have to format it with a filesystem known to your OS.
After attaching the device you should be able to see the raw block device, for example with the lsblk command:
[mysys]# lsblk
sr0 11:0 1 416K 0 rom
vda 252:0 0 20G 0 disk
└─vda1 252:1 0 20G 0 part /
vdb 252:16 0 25G 0 disk
Typically vda is your root device, so in this example the additional device is vdb with 25GB.
Now you can create a filesystem with the mkfs command, for example:
[mysys]# mkfs.ext4 /dev/vdb
mke2fs 1.41.12 (17-May-2010)
Filesystem label=
OS type: Linux
Block size=4096 (log=2)
Fragment size=4096 (log=2)
Stride=0 blocks, Stripe width=0 blocks
1638400 inodes, 6553600 blocks
...
mkfs supports different filesystems, so you might want to check the man pages on the system you're using (man mkfs).
Now all that's left is to create a mount point and mount the new filesystem:
[mysys]# mkdir /mnt/test
[mysys]# mount /dev/vdb /mnt/test
The additional space is now available:
[mysys]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/vda1 20G 946M 18G 5% /
tmpfs 1.9G 0 1.9G 0% /dev/shm
/dev/vdb 25G 172M 24G 1% /mnt/test
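To make the mount persistent across reboots (and to avoid the fstab breakage mentioned in the question), reference the filesystem by UUID and add nofail so boot continues even if the volume is detached. A sketch, with an illustrative UUID:
[mysys]# blkid /dev/vdb
/dev/vdb: UUID="f81d4fae-7dec-11d0-a765-00a0c91e6bf6" TYPE="ext4"
# /etc/fstab entry (use the UUID that blkid actually prints):
UUID=f81d4fae-7dec-11d0-a765-00a0c91e6bf6 /mnt/test ext4 defaults,nofail 0 2
[mysys]# mount -a    # mounts everything in fstab now; errors show up here instead of at boot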
I'd like to know the ideal Aerospike namespace configuration for a mini (staging) server on Ubuntu 12.04 with 1 GB RAM and a 1 GHz CPU.
Some requirements:
1. I'd like to persist data permanently on disk (not using it as a cache).
2. I'm only using a single node
3. I don't want a limit on the filesize of my data
Here's my current config snippet I'm using:
namespace default {
    replication-factor 1
    memory-size 1G
    default-ttl 0 # not sure if this is for cache or disk
    storage-engine device {
        file /opt/aerospike/data/default.dat
        filesize 2T
        data-in-memory true
    }
}
Thanks
Aerospike doesn't use data-in-memory as a cache. If data-in-memory is set to true, then all of your data must fit in RAM.
On a single node you will not be affected by the replication-factor parameter.
Aerospike has a limit of 2 TiB per file, but you can create multiple files of this size and Aerospike will distribute data across them. When going through a filesystem, having multiple files often helps performance. Also, if you are going to use a filesystem, you may want to look into disabling atime when mounting the disks.
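For example, an fstab line with atime updates disabled might look like this (device, mount point, and filesystem are illustrative):
# /etc/fstab - data disk mounted with noatime
/dev/xvdb  /opt/aerospike/data  ext4  defaults,noatime  0  2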
default-ttl is how long the server will keep a record after it is written, by default (it can be overridden by your application). A default-ttl of 0 means records never expire or get evicted.
Example config with multiple files:
namespace default {
    replication-factor 1
    memory-size 1G
    default-ttl 0 # (This applies to the primary index)
    storage-engine device {
        file /opt/aerospike/data/file0.dat
        file /opt/aerospike/data/file1.dat
        file /opt/aerospike/data/file2.dat
        file /opt/aerospike/data/file3.dat
        file /opt/aerospike/data/file4.dat
        file /opt/aerospike/data/file5.dat
        filesize 2T
        data-in-memory true
    }
}
My goal is to set up APC as an opcode cache for a Drupal 6 production site.
So far I have tested APC with several PHP files, with and without including other PHP files via include_once.
I have also tried tweaking the apc.ini values for shm_size, apc.include_once_override and apc.stat, restarting Apache after every change.
The result: apc.php never shows any change in any of the values (except, of course, the changed apc.ini values themselves, which show up as they should).
Every time I refresh the apc.php test page, the start time resets to the current time, showing an uptime of 0 minutes.
The apc.php test page shows:
General Cache Information
APC Version 3.1.9
PHP Version 5.2.10
APC Host xxxx.xx.xx
Server Software Apache/2.2.3 (CentOS)
Shared Memory 1 Segment(s) with 128.0 MBytes
(mmap memory, pthread mutex Locks locking)
Start Time 2011/07/26 11:53:56
Uptime 0 minutes
File Upload Support 1
Cached Files 0 ( 0.0 Bytes)
Hits 1
Misses 1
Request Rate (hits, misses) 2.00 cache requests/second
Hit Rate 1.00 cache requests/second
Miss Rate 1.00 cache requests/second
Insert Rate 0.00 cache requests/second
Cache full count 0
Cached Variables 0 ( 0.0 Bytes)
Hits 0
Misses 0
Request Rate (hits, misses) 0.00 cache requests/second
Hit Rate 0.00 cache requests/second
Miss Rate 0.00 cache requests/second
Insert Rate 0.00 cache requests/second
Cache full count 0
apc.cache_by_default 1
apc.canonicalize 1
apc.coredump_unmap 0
apc.enable_cli 0
apc.enabled 1
apc.file_md5 0
apc.file_update_protection 2
apc.filters
apc.gc_ttl 3600
apc.include_once_override 0
apc.lazy_classes 0
apc.lazy_functions 0
apc.max_file_size 16
apc.mmap_file_mask /tmp/apcphp5.095eRm
apc.num_files_hint 1024
apc.preload_path
apc.report_autofilter 0
apc.rfc1867 0
apc.rfc1867_freq 0
apc.rfc1867_name APC_UPLOAD_PROGRESS
apc.rfc1867_prefix upload_
apc.rfc1867_ttl 3600
apc.serializer default
apc.shm_segments 1
apc.shm_size 128M
apc.slam_defense 0
apc.stat 0
apc.stat_ctime 0
apc.ttl 7200
apc.use_request_time 1
apc.user_entries_hint 4096
apc.user_ttl 7200
apc.write_lock 1
Host Status Diagrams:
Free: 128.0 MBytes (100.0%) Hits: 1 (50.0%)
Used: 20.3 KBytes (0.0%) Misses: 1 (50.0%)
Detailed Memory Usage and Fragmentation:
Fragmentation: 0%
phpinfo shows:
Server API CGI/FastCGI
APC:
Version 3.1.9
APC Debugging Enabled
MMAP Support Enabled
MMAP File Mask /tmp/apcphp5.JkKDk7
Locking type pthread mutex Locks
Serialization Support php
Revision $Revision: 308812 $
Build Date Jul 21 2011 14:31:12
I followed these steps to find if suexec settings would prevent caching:
http://www.litespeedtech.com/support/forum/showthread.php?t=4189
[root@host /]# ps -ef | grep lsphp
root 20402 17833 0 11:21 pts/0 00:00:00 grep lsphp
[root@host /]# ps -waux
root 17833 0.0 0.1 5004 1484 pts/0 S 10:39 0:00 bash
...which indicates that there is no lsphp running on the host.
I also read the following article and its comments, and concluded that in my case the problem is not suexec, as the user apache is the httpd process owner:
http://www.brandonturner.net/blog/2009/07/fastcgi_with_php_opcode_cache/
Also, the suexec command is not recognized when logged in and launching it as root on the host.
I'm also almost certain that there is no cPanel running on the host, so no setting there should be resetting the running cache process at some interval.
This leaves me with few clues about where to head next.
I tried to set (with chown and chgrp) apache as the owner of apc.php and some test PHP files, which resulted in a 500 server error.
Is there a way to check whether file permissions are preventing APC from staying running?
I'm tremendously grateful for any suggestions or help.
Can you share your php.ini settings for APC?
You must restart httpd for setting changes to take effect.
Try increasing the max file size:
apc.max_file_size = 20M
You are allowing 128M of RAM, which is quite low for big PHP applications today (a single WordPress install uses around 32M):
apc.shm_size = 128M
Increase it as well.
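A minimal apc.ini sketch along those lines (values are examples to adjust for your workload; restart httpd afterwards):
; apc.ini - example values only
apc.enabled = 1
apc.shm_size = 256M       ; raised from 128M
apc.max_file_size = 20M   ; let larger scripts into the cache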