I currently have LibreNMS running on Proxmox and all was well until last week where LibreNMS can still monitor but no longer alert.
On logging into the console and running ./validate.php I see an error advising to free some space
"
The stream or file "/opt/librenms/logs/librenms.log" could not be opened in append mode: failed to open stream: Permission denied
The exception occurred while attempting to log: SQLSTATE[HY000]: General error: 1021 Disk full (/tmp/#sql_50db_2.MAI); waiting for someone to free some space... (errno: 28 "No space left on device")
(SQL: SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS WHERE TABLE_SCHEMA = 'librenms' AND
( CHARACTER_SET_NAME != 'utf8mb4' OR COLLATION_NAME != 'utf8mb4_unicode_ci' );)
Context: {"exception":{"errorInfo":["HY000",1021,"Disk full (\/tmp\/#sql_50db_2.MAI); waiting for someone to free some space... (errno: 28 \"No space left on device\")"]}}
"
On checking with df -h I see /dev/mapper/nms1--irl2--vg-tmp using 100%
"
guest-admin#nms1-irl2:/$ df -h
Filesystem Size Used Avail Use% Mounted on
udev 7.9G 0 7.9G 0% /dev
tmpfs 1.6G 165M 1.5G 11% /run
/dev/mapper/nms1--irl2--vg-root 22G 15G 6.1G 71% /
tmpfs 7.9G 12K 7.9G 1% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 7.9G 0 7.9G 0% /sys/fs/cgroup
/dev/sda1 920M 83M 774M 10% /boot
/dev/mapper/nms1--irl2--vg-tmp 1.1G 1019M 0 100% /tmp
/dev/mapper/nms1--irl2--vg-home 9.1G 2.8M 8.6G 1% /home
/dev/mapper/nms1--irl2--vg-var 84G 7.3G 72G 10% /var
"
I have allocated further space via proxmox to this vm and can see it has allocated the extra space when I use the command lsblk. Where I am struggling is to add this extra space to tmp which is in partition sda5. I have tried mounting and remounting but always get an error (Please note the below is from the test server which is a clone of the one in production)
sudo mount -o remount, size=2G /sda/sda5/dev/mapper/nms1--irl2--vg-tmp
mount: /sda/sda5/dev/mapper/nms1--irl2--vg-tmp: mount point does not exist.
I have tried multiple variations of the path but none have worked for me
I have checked mount and see
/dev/mapper/nms1--irl2--vg-tmp on /tmp type ext4 (rw,relatime)
I have also tried pfs and pfsresize which looks sucessful but doesnt actually change anything
If anyone can point me to where I can increase the partition size that would be great. When increasing/resizing in proxmox the size in sda changes without running an additional console command, I need this to go to sda5, ideally tmp, proxox only shows two harddrives, 130G and 5G
For reference lsblk shows me:
"
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sda 8:0 0 127G 0 disk
|-sda1 8:1 0 953M 0 part /boot
|-sda2 8:2 0 1K 0 part
-sda5 8:5 0 119.1G 0 part |-nms1--irl2--vg-root 254:0 0 22.5G 0 lvm / |-nms1--irl2--vg-swap 254:1 0 952M 0 lvm [SWAP] |-nms1--irl2--vg-home 254:2 0 9.3G 0 lvm /home |-nms1--irl2--vg-tmp 254:3 0 1.1G 0 lvm /tmp -nms1--irl2--vg-var 254:4 0 85.3G 0 lvm /var
sdb 8:16 0 5G 0 disk
`-sdb3 8:19 0 5G 0 part
sr0 11:0 1 1024M 0 rom
"
Thanks
Related
I have an intermittent lag on the web applications I am serving from Apache on a Debian box. Apache and MySQL check out. I am far from fully utilizing the box CPU/Memory. Still there is an intermittent lag. My theory is there is a network rate limit needing to be tweaked. Stats below.
Apache Server Status
Current Time: Tuesday, 02-Jun-2020 14:36:53 EDT
Restart Time: Monday, 01-Jun-2020 01:00:03 EDT
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 day 13 hours 36 minutes 50 seconds
Server load: 2.95 3.23 3.09
Total accesses: 1213060 - Total Traffic: 22.0 GB - Total Duration: 32311929295
CPU Usage: u396.94 s164.31 cu2065.15 cs789.27 - 2.52% CPU load
8.96 requests/sec - 170.5 kB/second - 19.0 kB/request - 26636.7 ms/request
296 requests currently being processed, 66 idle workers
WR.WWWW.KWW_W._W_KWWWWWWKWWWWW_WWWWK_WK_WWW_WW_RWWWWWKCWWWWWW._W
_WW_R_W_.__K_WWWW__WWWWWWKKWWWWWWKWWWW_W____WWWWWWWW_WWW_KWWWWWW
WWWWWWWW_.WWWWWK_WWW_WWKWWWWWWKWWKWK_WWWWWRKWWW.WW_KKWKWWWKW_WWW
WW.W_.K._WWWK_WW_K_K._WW..WWWWWWW_.W_WWWW_W_W.W_WWWW_.WWKWK_WKWW
_W_WWWW_W.WWWWWW.WWWW_K__..W.WW_WWWWWWWWKRW_WWW_C.W_KW_WWW_KW.._
..WWWWWWWCWWW.WWW_WKKWWWW_._WWW.....WWW.W_W.W._.KW...W...WWW.WWW
W..W..K..WW_.W._................W..._W.W.....K.W.K_...R..K...W.W
...W..W.............................................
top
top - 14:31:14 up 79 days, 21:39, 3 users, load average: 2.26, 2.57, 2.86
Tasks: 717 total, 1 running, 716 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.3 us, 0.7 sy, 0.2 ni, 95.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
MiB Mem : 64365.1 total, 539.8 free, 8847.0 used, 54978.4 buff/cache
MiB Swap: 65477.0 total, 63810.0 free, 1667.0 used. 54580.5 avail Mem
ss -s
Total: 1934
TCP: 2362 (estab 1233, closed 1105, orphaned 2, timewait 1104)
Transport Total IP IPv6
RAW 0 0 0
UDP 0 0 0
TCP 1257 430 827
INET 1257 430 827
FRAG 0 0 0
ulimit -n
1024
ss -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n
1 Local
6 192.XXX.XXX.XXX
100 127.0.0.1
340 10.0.0.XX
866 [
ss -ntu | awk '{print $6}' | cut -d: -f1 | sort | uniq -c | sort -n
..........
lists # of ip connections. Besides 127.0.0.1 and [ there are 2 ips over 50.
74 104.xxx.xxx.xxx
91 12.xxx.xxx.xxx
MySQL
No processes running more than a second. Number of processes well within limits.
I do not know what stats would be relevant beyond these in diagnosing network rate limiting issues. Any pointers would be appreciated.
EDITED
CPU
lscpu https://pastebin.com/Jha6F7J8
Apache Config
apachectl -t -D DUMP_RUN_CFG https://pastebin.com/i1L2hnjH
Mysql
SHOW GLOBAL STATUS https://pastebin.com/aQX4D01k
SHOW GLOBAL VARIABLES https://pastebin.com/L8EfmHfn
SHOW FULL PROCESSLIST https://pastebin.com/GtqK2tET
mysqltuner https://pastebin.com/GLhhKA9q
Optional Very Helpful Information
top -bn1 https://pastebin.com/r94vpXe6
iostat -xm 5 3 https://pastebin.com/R8YLK3QU
ulimit -a https://pastebin.com/KUC3wqxU
Dorothy, Your system is very busy with activity. Not knowing the frequency and duration of the intermittent hangs puts us at a disadvantage. One possible cause is com_drop_table had 3,318 uses in your 83 days of uptime. Another possible cause is volume of data read and written. It appears innodb_data_written was 484TB in 83 days and yet MySQLTuner reports only 800K of data in 10 tables. Our General Log Analysis could likely identify the cause of this high activity. These suggestions will be a starting effort, more analysis and changes should be accomplished.
From your OS command prompt,
ulimit -n 96000 would enable many more Open Files (handles) above today's 1024 limit.
This is a dynamic operation in Linux and does not require OS restart to be implemented.
For this change to persist across OS stop/start the following URL could be used as a guide.
Please use 96000, not 500000 - as in their example documentation.
https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/
Rate Per Second = RPS
Suggestions to consider for your my.cnf [mysqld] section
innodb_io_capacity=1900 # from 200 if you have SSD, 900 if you have magnetic storage to improve IOPS
net_buffer_length=32K # from 16K to reduce malloc operations
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function
key_cache_segments=16 # from 0 to reduce mutex contention with MyISAM opens
key_cache_division_limit=50 # from 100 for Hot/Warm storage to reduce key_page_reads RPS of 18
aria_pagecache_division_limit=50 # from 100 for Hot/Warm storage to reduce aria_pagecache_reads RPS of 5K
read_rnd_buffer_size=64K # from 256K to reduce handler_read_rnd_next RPS of 27,707
These changes should reduce elapsed time to complete most queries.
Additional areas to consider include the use of Slow Query Log analysis to find where an index could avoid a table scan. MySQLTuner reported more than 4 million joins performed without indexes. Our FAQ page includes information on how you could find the tables needing indexes to avoid scans. Let us know how these suggestions work for you.
Skype Talk works very well if you have the flexibility to use that form of communication.
I need to expand /tmp so I can install MATLAB.
GParted is no help, as all the mount points are logical volumes.
mount's output is hard to interpret, and I'm not sure if /tmp is mounted on LV home, or root.
Fedora 31 on VMWare Fusion 11.5.1 on macOS 10.15.2
[john#localhost ~]$ mount | grep tmp
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=4041568k,nr_inodes=1010392,mode=755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /tmp type tmpfs (rw,nosuid,nodev,seclabel)
tmpfs on /run/user/1000 type tmpfs (rw,nosuid,nodev,relatime,seclabel,size=811980k,mode=700,uid=1000,gid=1000)
--- Logical volume ---
LV Path /dev/fedora_localhost-live/swap
LV Name swap
VG Name fedora_localhost-live
LV UUID bNdAed-IcJC-I5aU-yLYG-91PP-XdSw-WDUrXz
LV Write Access read/write
LV Creation host, time localhost-live, 2020-01-19 21:36:51 -0500
LV Status available
# open 2
LV Size <7.88 GiB
Current LE 2016
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:1
--- Logical volume ---
LV Path /dev/fedora_localhost-live/home
LV Name home
VG Name fedora_localhost-live
LV UUID Vw3QsZ-4xFe-xIRk-UpJr-V3ZV-5NeU-ARHr30
LV Write Access read/write
LV Creation host, time localhost-live, 2020-01-19 21:36:51 -0500
LV Status available
# open 1
LV Size 721.12 GiB
Current LE 184607
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:2
--- Logical volume ---
LV Path /dev/fedora_localhost-live/root
LV Name root
VG Name fedora_localhost-live
LV UUID Uld1hs-Ir4q-uirB-P4Tp-P7se-qDpy-uNPT68
LV Write Access read/write
LV Creation host, time localhost-live, 2020-01-19 21:36:52 -0500
LV Status available
# open 1
LV Size 70.00 GiB
Current LE 17920
Segments 1
Allocation inherit
Read ahead sectors auto
- currently set to 256
Block device 253:0
I have a Centos7 VM with ZFS on linux installed.
The VM has a disk /dev/sdb, that I've added to a pool named 'backup', and in this pool created a dataset.
Now, I wanted to increase the size of the disk in VMware, and then expand the size of the pool, but I'm not getting this to work.
I've tried 'zpool online -e backup sdb', but nothing changes.
I've tried running 'partprobe /dev/sdb' before and after the live above, but nothing changes.
I've tried rebooting + the above, nothing changes.
I've tried "parted /dev/sdb",resizing the partition (it suggests the actual new size of the volume), and then all of the above. But nothing changes
I've tried 'zpool export backup' + 'zpool import backup' in various combinations with all of the above. No luck
And also: 'lsblk' and 'df -h' reports the old/wrong size of /dev/sdb, even if parted seems to understand that it has been increased.
PS: autoexpand=on
What to do?
I faced a similar issue today and had to try a lot before finding the solution.
When I tried the known solutions (using zpool) of setting autoexpand as on and also restarting the partprobe, system would not auto expand (even after a restart).
Finally, I could solve it using parted instead of getting into zpool at all.
We need to be careful here since wrong partition selections can cause data loss.
What worked for me in your situation
Step 1: Find which pool you are trying to expand. In my case, it is 5 as seen below (unallocated space is after this pool). Use parted -l
parted -l
Output
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sda: 69.8GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 2097kB 1049kB bios_grub
2 2097kB 540MB 538MB fat32 EFI System Partition boot, esp
3 540MB 2009MB 1469MB swap
4 2009MB 3592MB 1583MB zfs
5 3592MB 32.2GB 28.6GB zfs
Step 2: Instructing explictly to expany pool number 5 to 100% available. Note that '5' is not static. You need to use the pool id you wish to expand. Double-check this. Use parted /dev/XXX resizepart YY 100%
parted /dev/sda resizepart 5 100%
After this, I was able to use the entire space in VM.
For reference:
LSBSK Before
sda 8:0 0 65G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 513M 0 part /boot/grub
│ /boot/efi
├─sda3 8:3 0 1.4G 0 part
│ └─cryptoswap 253:1 0 1.4G 0 crypt [SWAP]
├─sda4 8:4 0 1.5G 0 part
└─sda5 8:5 0 29.5G 0 part
LSBSK After
sda 8:0 0 65G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 513M 0 part /boot/grub
│ /boot/efi
├─sda3 8:3 0 1.4G 0 part
│ └─cryptoswap 253:1 0 1.4G 0 crypt [SWAP]
├─sda4 8:4 0 1.5G 0 part
└─sda5 8:5 0 61.7G 0 part
I have ssh access to a list of ~20 machines. I need to find the load status for all of them in a list. The program 'top' does a good job giving info on the machine status in its header.
Example:
top - 13:29:53 up 107 days, 20:13, 47 users, load average: 3.80, 3.74, 3.62
Tasks: 794 total, 2 running, 787 sleeping, 3 stopped, 2 zombie
Cpu(s): 2.6%us, 0.8%sy, 0.0%ni, 84.7%id, 11.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 99055876k total, 47947572k used, 51108304k free, 697684k buffers
Swap: 26148860k total, 17145136k used, 9003724k free, 35844820k cached
Today I manually do ssh into each machine, do the 'top' copy the data and store it. I was wondering if this task can be automated. I found out that ssh has the option of giving a unix cmd as an argument to be executed on the remote machine. But how to capture the output from 'top'? Or is there a batch-too giving the same header output? It would be great to have just one script that does the table for me.
Thanks,
Gert
For Ubuntu:
[12:15 AM] borlaze#mac: /tmp $ ssh USER#HOST 'top -b -n 1 | head -n 5' >123.txt
[12:15 AM] borlaze#mac: /tmp $ cat 123.txt
top - 00:16:06 up 35 days, 10:58, 1 user, load average: 0,34, 0,36, 0,29
Tasks: 277 total, 1 running, 274 sleeping, 0 stopped, 2 zombie
%Cpu(s): 7,1 us, 5,7 sy, 0,0 ni, 87,0 id, 0,1 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 24671340 total, 1066056 free, 12822724 used, 10782560 buff/cache
KiB Swap: 16756732 total, 16094308 free, 662424 used. 11208916 avail Mem
So on one system, I have values that are pretty wide open:
$ ulimit -a | grep mem
max locked memory (kbytes, -l) 40000
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
Another system has much more limiting values, but I can't for the life of me find out where the 32MB upper limit (it is 32MB despite the mislabling) is being set:
# ulimit -a | grep mem
max locked memory (kbytes, -l) 32
max memory size (kbytes, -m) unlimited
virtual memory (kbytes, -v) unlimited
The second system is a RHEL 5.5 box. I am looking to increase this limit for at least one user- I need a bigger APC mmap memory allocation, but I can't go above 30 MB without running into the above limit, and I would rather not hack the provided apache init script. Where should I be trying to override the system default value so I can map a bigger segment of memory? Doing it in limits.conf for the apache user doesn't do a whole lot; probably because the init script doesn't do anything through PAM.
If the user granularity setting you tried isn't working, you should make sure that's you've correctly matched which user is hitting the limit.
You should also be able to add a line like this to limits.conf:
* hard memlock 40000
That'll change the default setting for all users.
From the limits.conf manpage:
The syntax of the lines is as follows:
<domain> <type> <item> <value>
The fields listed above should be filled as follows:
<domain>
[snip]
· the wildcard *, for default entry.