I am currently looking into the LXC container API. I am trying to figure out how to make the operating system know which container the currently running process belongs to, so that the OS can allocate resources to processes according to their container.
I am assuming your question is: given a PID, how do you find the container in which that process is running?
I will try to answer it based on my recent reading on Linux containers. Each container can be configured to start with its own user and group id mappings.
From https://linuxcontainers.org/lxc/manpages/man5/lxc.container.conf.5.html:
lxc.id_map
Four values must be provided. First a character, either 'u', or 'g', to specify whether user or group ids are being mapped. Next is the first userid as seen in the user namespace of the container. Next is the userid as seen on the host. Finally, a range indicating the number of consecutive ids to map.
So you would add something like this to the config file (e.g. ~/.config/lxc/default.conf). Note that LXC 2.1 and later name this key lxc.idmap:
lxc.id_map = u 0 100000 65536
lxc.id_map = g 0 100000 65536
The above means that uids/gids 0 through 65535 in the container are mapped to 100000 through 165535 on the host. So a uid of 0 (root) in the container is seen as 100000 on the host.
For Example, inside container it will look something like this:
root@unpriv_cont:/# ps -ef
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 02:18 ? 00:00:00 /sbin/init
root 157 1 0 02:18 ? 00:00:00 upstart-udev-bridge --daemon
But on host the same processes will look like this:
ps -ef | grep 100000
100000 2204 2077 0 Dec12 ? 00:00:00 /sbin/init
100000 3170 2204 0 Dec12 ? 00:00:00 upstart-udev-bridge --daemon
100000 1762 2204 0 Dec12 ? 00:00:00 /lib/systemd/systemd-udevd --daemon
Thus, you can find the container of a process by looking for its UID and relating it to the mapping defined in that container's config.
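The lookup described above can be sketched as a small shell function. This is only a minimal sketch: the name host_uid_to_container_uid is made up for illustration (it is not part of LXC), and the mapping values are passed in by hand rather than read from the container's config. It checks whether a host UID falls inside one "u" mapping and, if so, prints the UID as seen inside the container.

```shell
#!/bin/sh
# Hypothetical helper (not part of LXC): translate a host UID back to the
# UID seen inside a container, given one "u" line of lxc.id_map.
# Usage: host_uid_to_container_uid HOST_UID CONT_FIRST HOST_FIRST RANGE
host_uid_to_container_uid() {
    host_uid=$1 cont_first=$2 host_first=$3 range=$4
    # The mapping covers host UIDs host_first .. host_first + range - 1.
    if [ "$host_uid" -ge "$host_first" ] &&
       [ "$host_uid" -lt $((host_first + range)) ]; then
        echo $((host_uid - host_first + cont_first))
    else
        return 1   # UID is outside this container's mapping
    fi
}

# Example with the mapping "u 0 100000 65536" from the config above:
host_uid_to_container_uid 100000 0 100000 65536   # prints 0 (container root)
```

The host UID of a given PID can be read with ps -o uid= -p PID. The approach only identifies a container uniquely if each container was given its own, non-overlapping range; containers that share the default mapping cannot be told apart this way.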
I have an intermittent lag on the web applications I am serving from Apache on a Debian box. Apache and MySQL check out. I am far from fully utilizing the box CPU/Memory. Still there is an intermittent lag. My theory is there is a network rate limit needing to be tweaked. Stats below.
Apache Server Status
Current Time: Tuesday, 02-Jun-2020 14:36:53 EDT
Restart Time: Monday, 01-Jun-2020 01:00:03 EDT
Parent Server Config. Generation: 1
Parent Server MPM Generation: 0
Server uptime: 1 day 13 hours 36 minutes 50 seconds
Server load: 2.95 3.23 3.09
Total accesses: 1213060 - Total Traffic: 22.0 GB - Total Duration: 32311929295
CPU Usage: u396.94 s164.31 cu2065.15 cs789.27 - 2.52% CPU load
8.96 requests/sec - 170.5 kB/second - 19.0 kB/request - 26636.7 ms/request
296 requests currently being processed, 66 idle workers
WR.WWWW.KWW_W._W_KWWWWWWKWWWWW_WWWWK_WK_WWW_WW_RWWWWWKCWWWWWW._W
_WW_R_W_.__K_WWWW__WWWWWWKKWWWWWWKWWWW_W____WWWWWWWW_WWW_KWWWWWW
WWWWWWWW_.WWWWWK_WWW_WWKWWWWWWKWWKWK_WWWWWRKWWW.WW_KKWKWWWKW_WWW
WW.W_.K._WWWK_WW_K_K._WW..WWWWWWW_.W_WWWW_W_W.W_WWWW_.WWKWK_WKWW
_W_WWWW_W.WWWWWW.WWWW_K__..W.WW_WWWWWWWWKRW_WWW_C.W_KW_WWW_KW.._
..WWWWWWWCWWW.WWW_WKKWWWW_._WWW.....WWW.W_W.W._.KW...W...WWW.WWW
W..W..K..WW_.W._................W..._W.W.....K.W.K_...R..K...W.W
...W..W.............................................
top
top - 14:31:14 up 79 days, 21:39, 3 users, load average: 2.26, 2.57, 2.86
Tasks: 717 total, 1 running, 716 sleeping, 0 stopped, 0 zombie
%Cpu(s): 3.3 us, 0.7 sy, 0.2 ni, 95.7 id, 0.0 wa, 0.0 hi, 0.1 si, 0.0 st
MiB Mem : 64365.1 total, 539.8 free, 8847.0 used, 54978.4 buff/cache
MiB Swap: 65477.0 total, 63810.0 free, 1667.0 used. 54580.5 avail Mem
ss -s
Total: 1934
TCP: 2362 (estab 1233, closed 1105, orphaned 2, timewait 1104)
Transport Total IP IPv6
RAW 0 0 0
UDP 0 0 0
TCP 1257 430 827
INET 1257 430 827
FRAG 0 0 0
ulimit -n
1024
ss -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n
1 Local
6 192.XXX.XXX.XXX
100 127.0.0.1
340 10.0.0.XX
866 [
ss -ntu | awk '{print $6}' | cut -d: -f1 | sort | uniq -c | sort -n
..........
This lists the number of connections per IP. Besides 127.0.0.1 and [, two IPs have more than 50 connections:
74 104.xxx.xxx.xxx
91 12.xxx.xxx.xxx
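The "[" row in these counts is most likely what cut -d: -f1 leaves of a bracketed IPv6 address such as [::ffff:10.0.0.1]:443, so all IPv6 peers collapse into one bucket. A minimal sketch of a variant that strips only the trailing ":port" is shown below; the function name count_by_address is made up, and it assumes the usual ss -ntu column layout (local address in column 5, peer address in column 6).

```shell
# Sketch: count connections per address from `ss -ntu` output on stdin.
# Strips only the trailing ":port", so bracketed IPv6 addresses survive.
# The optional argument selects the column: 5 = local, 6 = peer (default).
count_by_address() {
    col=${1:-6}
    awk -v col="$col" 'NR > 1 { a = $col; sub(/:[^:]*$/, "", a); print a }' |
        sort | uniq -c | sort -n
}

# Usage: ss -ntu | count_by_address 6
```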
MySQL
No processes running more than a second. Number of processes well within limits.
I do not know what stats would be relevant beyond these in diagnosing network rate limiting issues. Any pointers would be appreciated.
EDITED
CPU
lscpu https://pastebin.com/Jha6F7J8
Apache Config
apachectl -t -D DUMP_RUN_CFG https://pastebin.com/i1L2hnjH
Mysql
SHOW GLOBAL STATUS https://pastebin.com/aQX4D01k
SHOW GLOBAL VARIABLES https://pastebin.com/L8EfmHfn
SHOW FULL PROCESSLIST https://pastebin.com/GtqK2tET
mysqltuner https://pastebin.com/GLhhKA9q
Optional Very Helpful Information
top -bn1 https://pastebin.com/r94vpXe6
iostat -xm 5 3 https://pastebin.com/R8YLK3QU
ulimit -a https://pastebin.com/KUC3wqxU
Dorothy, your system is very busy. Not knowing the frequency and duration of the intermittent hangs puts us at a disadvantage. One possible cause is com_drop_table, which had 3,318 uses in your 83 days of uptime. Another possible cause is the volume of data read and written: innodb_data_written appears to be 484TB in 83 days, yet MySQLTuner reports only 800K of data in 10 tables. Our General Log Analysis could likely identify the cause of this high activity. These suggestions are a starting point; more analysis and changes will be needed.
From your OS command prompt,
ulimit -n 96000
would enable many more open files (handles) than today's limit of 1024.
This takes effect immediately and does not require an OS restart, but it applies only to the current shell and the processes started from it; services that are already running keep their old limit.
To make the change persist across an OS stop/start, the following URL can be used as a guide.
Please use 96000, not 500000 as in their example.
https://glassonionblog.wordpress.com/2013/01/27/increase-ulimit-and-file-descriptors-limit/
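Since ulimit -n reports the limit of the shell you type it in, it can be worth checking what limit the running server process actually has. The sketch below assumes a Linux /proc filesystem; the helper name nofile_limits is made up for illustration. It prints the soft and hard "Max open files" values from a /proc/<pid>/limits file.

```shell
# Hypothetical helper: print the soft and hard "Max open files" limits
# from a /proc/<pid>/limits file (defaults to the current shell's own).
nofile_limits() {
    limits_file=${1:-/proc/$$/limits}
    awk '/^Max open files/ { print "soft=" $4, "hard=" $5 }' "$limits_file"
}

# Usage (the process name apache2 is an assumption for a Debian box):
#   nofile_limits "/proc/$(pgrep -o apache2)/limits"
```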
Rate Per Second = RPS
Suggestions to consider for your my.cnf [mysqld] section
innodb_io_capacity=1900 # from 200; use 1900 if you have SSD, 900 if you have magnetic storage, to improve IOPS
net_buffer_length=32K # from 16K to reduce malloc operations
innodb_lru_scan_depth=100 # from 1024 to conserve 90% of CPU cycles used for function
key_cache_segments=16 # from 0 to reduce mutex contention with MyISAM opens
key_cache_division_limit=50 # from 100 for Hot/Warm storage to reduce key_page_reads RPS of 18
aria_pagecache_division_limit=50 # from 100 for Hot/Warm storage to reduce aria_pagecache_reads RPS of 5K
read_rnd_buffer_size=64K # from 256K to reduce handler_read_rnd_next RPS of 27,707
These changes should reduce elapsed time to complete most queries.
Additional areas to consider include the use of Slow Query Log analysis to find where an index could avoid a table scan. MySQLTuner reported more than 4 million joins performed without indexes. Our FAQ page includes information on how you could find the tables needing indexes to avoid scans. Let us know how these suggestions work for you.
Skype Talk works very well if you have the flexibility to use that form of communication.
I have a Centos7 VM with ZFS on linux installed.
The VM has a disk /dev/sdb, that I've added to a pool named 'backup', and in this pool created a dataset.
Now, I wanted to increase the size of the disk in VMware, and then expand the size of the pool, but I'm not getting this to work.
I've tried 'zpool online -e backup sdb', but nothing changes.
I've tried running 'partprobe /dev/sdb' before and after the line above, but nothing changes.
I've tried rebooting + the above, nothing changes.
I've tried "parted /dev/sdb", resizing the partition (it suggests the actual new size of the volume), and then all of the above, but nothing changes.
I've tried 'zpool export backup' + 'zpool import backup' in various combinations with all of the above. No luck.
Also, 'lsblk' and 'df -h' report the old/wrong size of /dev/sdb, even though parted seems to understand that it has been increased.
PS: autoexpand=on
What to do?
I faced a similar issue today and had to try a lot before finding the solution.
When I tried the known solutions (using zpool), setting autoexpand to on and rerunning partprobe, the pool would not expand automatically (even after a restart).
In the end I solved it with parted, without touching zpool at all.
We need to be careful here since wrong partition selections can cause data loss.
What worked for me in your situation
Step 1: Find the partition that holds the pool you are trying to expand, using parted -l. In my case it is partition 5, as seen below (the unallocated space comes after this partition).
parted -l
Output
Model: VMware, VMware Virtual S (scsi)
Disk /dev/sda: 69.8GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:
Number Start End Size File system Name Flags
1 1049kB 2097kB 1049kB bios_grub
2 2097kB 540MB 538MB fat32 EFI System Partition boot, esp
3 540MB 2009MB 1469MB swap
4 2009MB 3592MB 1583MB zfs
5 3592MB 32.2GB 28.6GB zfs
Step 2: Explicitly resize partition 5 to use 100% of the available space. Note that '5' is not fixed; use the number of the partition you wish to expand, and double-check it. The command is parted /dev/XXX resizepart YY 100%
parted /dev/sda resizepart 5 100%
After this, I was able to use the entire space in VM.
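Putting the two steps together, and adding the zpool online -e call from the question in case the pool does not pick up the new space by itself, the sequence might look like the sketch below. It only prints the commands rather than running them, since a wrong partition number can destroy data. The helper name print_expand_plan is made up, and joining disk and partition number (/dev/sda plus 5) assumes sdX-style device names; the device passed to zpool has to match whatever zpool status shows for the vdev.

```shell
# Hypothetical helper: print (not run) the commands for growing a ZFS
# partition to the end of the disk and asking ZFS to use the new space.
# Usage: print_expand_plan DISK PARTITION_NUMBER POOL
print_expand_plan() {
    disk=$1 part=$2 pool=$3
    echo "parted $disk resizepart $part 100%"
    echo "zpool online -e $pool $disk$part"   # assumes sdX-style naming
    echo "zpool list $pool"                   # check the reported SIZE
}

# Example for the layout shown above; the pool name "backup" is taken
# from the question and is only a placeholder here.
print_expand_plan /dev/sda 5 backup
```

Review the printed lines against parted -l and zpool status before running any of them by hand.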
For reference:
lsblk before
sda 8:0 0 65G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 513M 0 part /boot/grub
│ /boot/efi
├─sda3 8:3 0 1.4G 0 part
│ └─cryptoswap 253:1 0 1.4G 0 crypt [SWAP]
├─sda4 8:4 0 1.5G 0 part
└─sda5 8:5 0 29.5G 0 part
lsblk after
sda 8:0 0 65G 0 disk
├─sda1 8:1 0 1M 0 part
├─sda2 8:2 0 513M 0 part /boot/grub
│ /boot/efi
├─sda3 8:3 0 1.4G 0 part
│ └─cryptoswap 253:1 0 1.4G 0 crypt [SWAP]
├─sda4 8:4 0 1.5G 0 part
└─sda5 8:5 0 61.7G 0 part
I have ssh access to a list of ~20 machines. I need to find the load status for all of them in a list. The program 'top' does a good job giving info on the machine status in its header.
Example:
top - 13:29:53 up 107 days, 20:13, 47 users, load average: 3.80, 3.74, 3.62
Tasks: 794 total, 2 running, 787 sleeping, 3 stopped, 2 zombie
Cpu(s): 2.6%us, 0.8%sy, 0.0%ni, 84.7%id, 11.9%wa, 0.0%hi, 0.0%si, 0.0%st
Mem: 99055876k total, 47947572k used, 51108304k free, 697684k buffers
Swap: 26148860k total, 17145136k used, 9003724k free, 35844820k cached
Today I manually ssh into each machine, run 'top', copy the data and store it. I was wondering whether this task can be automated. I found out that ssh can take a Unix command as an argument, to be executed on the remote machine. But how do I capture the output from 'top'? Or is there a batch tool giving the same header output? It would be great to have just one script that builds the table for me.
Thanks,
Gert
For Ubuntu:
[12:15 AM] borlaze@mac: /tmp $ ssh USER@HOST 'top -b -n 1 | head -n 5' >123.txt
[12:15 AM] borlaze@mac: /tmp $ cat 123.txt
top - 00:16:06 up 35 days, 10:58, 1 user, load average: 0,34, 0,36, 0,29
Tasks: 277 total, 1 running, 274 sleeping, 0 stopped, 2 zombie
%Cpu(s): 7,1 us, 5,7 sy, 0,0 ni, 87,0 id, 0,1 wa, 0,0 hi, 0,0 si, 0,0 st
KiB Mem : 24671340 total, 1066056 free, 12822724 used, 10782560 buff/cache
KiB Swap: 16756732 total, 16094308 free, 662424 used. 11208916 avail Mem
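To cover all ~20 machines, the same command can be wrapped in a loop. This is a minimal sketch, assuming key-based ssh logins and a file with one USER@HOST per line; the function name collect_top_headers, the SSH override variable and the file names are all made up for illustration.

```shell
# Sketch: print the five-line top header for every host read from stdin.
# SSH can be overridden (e.g. for a dry run); it defaults to plain ssh.
collect_top_headers() {
    while IFS= read -r host; do
        [ -n "$host" ] || continue          # skip blank lines
        echo "== $host =="
        # </dev/null stops ssh from swallowing the rest of the host list
        ${SSH:-ssh} "$host" 'top -b -n 1 | head -n 5' </dev/null
    done
}

# Usage (hosts.txt is an assumed file with one USER@HOST per line):
#   collect_top_headers < hosts.txt > load_report.txt
```

Redirecting ssh's stdin matters here: without it, the first ssh call would read the remaining host names itself and the loop would stop after one machine.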
I want to know which Oracle internal process is running for the session details below.
How can I check what "ora_j001" is doing?
Please provide a query to find out.
INST_ID  SID   SERIAL#  USERNAME  OSUSER  MACHINE       PROCESS  OS Process ID  VALUE      STATUS  LAST_CALL_ET  PROGRAM
1        1303  13000    APPS      orafin  ARG-FIN1A-DC  3842124  3842124        224905256  ACTIVE  57661         oracle@ARG-FIN1A-DC (J001)
$ ps -ef | grep 3842124
orafin 3842124 1 0 18:24:54 - 2:02 ora_j001_FINPROD1
argora 4395248 4784358 0 10:41:08 pts/6 0:00 grep 3842124
$ hostname
ARG-FIN1A-DC
For this kind of process, how can I check what Oracle internal work it is doing?
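You have listed your SID there. The query below finds the SQL currently being run by any SID, by looking up the session's SQL_ID. Tie this back to DBA_JOBS or DBA_SCHEDULER_JOBS (for example through DBA_JOBS_RUNNING or DBA_SCHEDULER_RUNNING_JOBS, which record the session ID) to see job-related activity.
select q.sql_text, q.piece
from v$session s
join v$sqltext_with_newlines q on q.sql_id = s.sql_id
where s.sid = <SID>
order by q.piece;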
I have a GPU resource called gpus. When I run qstat -F gpus I get strange output of the form "qc:gpus=-1", i.e. a negative number of available GPUs is reported. Yet qstat -g c says I have multiple GPUs available. Multiple jobs fail because of "unavailable gpus". It's as if the counting of GPUs starts from 1 instead of 8 on each node, so if I use more than 1 it goes negative. My queue is:
hostlist node-01 node-02 node-03 node-04 node-05
seq_no 0
load_thresholds NONE
suspend_thresholds NONE
nsuspend 1
suspend_interval 00:05:00
priority 0
min_cpu_interval 00:05:00
processors UNDEFINED
qtype BATCH INTERACTIVE
ckpt_list NONE
pe_list smp mpich2
rerun FALSE
slots 1,[node-01=8],[node-02=8],[node-03=8],[node-04=8],[node-05=8]
Does anyone have any idea why this is happening?
I believe you set the "gpus" complex in the host configuration. You can see it if you do
qconf -se node-01
And you can check the definition of the "gpus" complex with
qconf -sc
For instance, my UGE has this definition for the "ngpus" complex:
#name shortcut type relop requestable consumable default urgency
ngpus gpu INT <= YES YES 0 1000
And an example node "qconf -se gpu01":
hostname gpu01.cm.cluster
...
complex_values exclusive=true,m_mem_free=65490.000000M, \
m_mem_free_n0=32722.546875M,m_mem_free_n1=32768.000000M, \
ngpus=2,slots=16,vendor=intel
You can modify the value by "qconf -me node-01". See the man page complex(5) for details.
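If the value has to be changed on several nodes, editing each host interactively gets tedious; qconf -mattr can set a single attribute non-interactively. The sketch below only prints the commands so they can be reviewed first. The function name print_gpu_settings is made up, and the count of 8 GPUs per node is an assumption taken from the question, not something read from the cluster.

```shell
# Sketch: print (not run) one qconf command per node that would set the
# consumable "gpus" to the given count in that host's complex_values.
# Usage: print_gpu_settings COUNT NODE [NODE...]
print_gpu_settings() {
    count=$1; shift
    for node in "$@"; do
        echo "qconf -mattr exechost complex_values gpus=$count $node"
    done
}

# Example: assumes 8 GPUs on each of the first two nodes from the question.
print_gpu_settings 8 node-01 node-02
```

After applying the commands, qconf -se node-01 and qstat -F gpus should show whether the per-host value now matches the hardware.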