Too many rcuob and rcuos processes are shown

On my Linux server, too many rcuob and rcuos processes are shown.
I ran the following command:
ps auxwwf | grep rcu
It printed the output below.
root 9 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuob/0]
root 10 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuob/1]
:
:
root 151 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuob/142]
root 152 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuob/143]
and
root 154 0.0 0.0 0 0 ? S May30 0:11 \_ [rcuos/0]
root 155 0.0 0.0 0 0 ? S May30 0:04 \_ [rcuos/1]
:
:
root 296 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuos/142]
root 297 0.0 0.0 0 0 ? S May30 0:00 \_ [rcuos/143]
The server's CPU is "Intel(R) Xeon(R) CPU E5-2630 v3 @ 2.40GHz",
total memory is 32 GB,
and the OS is "CentOS Linux release 7.2.1511 (Core)".
I do not know what these processes are. If they indicate a problem, please let me know how to fix it.

I noticed the same behaviour on RHEL 7.2 (kernel 3.10.0-327.el7).
https://access.redhat.com/solutions/1404313, titled More "rcuob" and "rcuos" kernel threads running than there are CPUs online, describes how the number of RCU threads started matched the number of possible CPUs rather than the number of online CPUs. It explains that a fix was released with erratum RHSA-2016-2574 in kernel 3.10.0-514.el7. I imagine CentOS will also have a fix.
To view the number of online CPUs and the number of possible CPUs:
> cd /sys/devices/system/cpu ; grep '' {online,offline,possible}
online:0-55
offline:56-191
possible:0-191
Count the number of rcuob and rcuos kernel threads:
> ps aux | awk '/\[(ksoftirqd|migration|watchdog|rcuo)/{print $11}' | sed 's/[0-9]//g' | sort | uniq -c
56 [ksoftirqd/]
56 [migration/]
192 [rcuob/]
192 [rcuos/]
56 [watchdog/]
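As a rough cross-check of the numbers above, the CPU ranges printed from /sys/devices/system/cpu can be turned into counts and compared with the thread counts. This is a minimal sketch; the function name count_cpus is made up for illustration, and the input strings are copied from the output above rather than read from the live system.

```python
def count_cpus(cpu_list: str) -> int:
    """Count CPUs in a kernel CPU-list string such as "0-55" or "0,2-3"."""
    total = 0
    for part in cpu_list.strip().split(","):
        if not part:
            continue  # an empty file (e.g. no offline CPUs) yields an empty string
        if "-" in part:
            first, last = part.split("-")
            total += int(last) - int(first) + 1
        else:
            total += 1
    return total


# Strings copied from the sysfs output shown above.
online = count_cpus("0-55")
possible = count_cpus("0-191")

print(f"online={online} possible={possible}")
```

On the affected kernel the rcuob/rcuos thread count (192) lines up with the possible count, while ksoftirqd, migration and watchdog (56) line up with the online count.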


What is the most minimalistic way to run a program inside an OS?

Generally, when I start a program from bash, bash forks and the program inherits many things from it, such as stdin and stdout. Is there some other way to run a program, with no such setup? Could it explicitly open fd 1 itself, write something and close it?
I came across nohup and disown. Both of those detach a process from bash, but the process still initially inherits from bash. Is there a way to start a process that inherits from nothing?
I am asking just out of curiosity and have no practical purpose in mind. When a program runs on a microcontroller, only our program is running, with no additional setup (if setup is required, the user has to prepend it). Similarly, is there a way, even in the presence of an operating system, to run just what is programmed, without any setup?
I assume you are using Linux.
Running top -u root shows, for example, the following on my system (Ubuntu 20.04 x86_64):
PID PPID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1 0 root 20 0 170380 11368 6500 S 0.0 0.0 0:23.52 systemd
2 0 root 20 0 0 0 0 S 0.0 0.0 0:00.61 kthreadd
3 2 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 2 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
5 2 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
10 2 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
11 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_rude_
12 2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcu_tasks_trace
13 2 root 20 0 0 0 0 S 0.0 0.0 0:04.48 ksoftirqd/0
14 2 root 20 0 0 0 0 I 0.0 0.0 1:28.82 rcu_sched
You can see that all processes ultimately descend from PPID (parent process ID) zero.
PID 0 is not an ordinary process; it is the kernel's idle task, often described as representing the scheduler. The kernel then launches systemd (PID 1), from which every user-space process descends, and kthreadd (PID 2), which is the parent of the kernel threads listed above.
At the user level, top -u madfred shows:
3371 1 madfred 20 0 19928 7664 6136 S 0.0 0.0 0:11.62 systemd
3372 3371 madfred 20 0 170404 2460 0 S 0.0 0.0 0:00.00 (sd-pam)
3379 3371 madfred 39 19 659828 16348 12500 S 0.0 0.0 0:02.38 tracker-miner-f
3402 3371 madfred 20 0 8664 5112 3412 S 0.0 0.0 0:00.94 dbus-daemon
3407 3371 madfred 20 0 239712 6740 6064 S 0.0 0.0 0:00.03 gvfsd
There is one user systemd instance, launched by the root systemd, that runs as the user. This user systemd is in charge of launching that user's session services.
This process hierarchy is part of how Linux provides its guarantees, such as security, memory protection and management of file resources.
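The parent chain described here can be traced mechanically. The following is a small sketch that walks PPID links until it reaches a PID with no recorded parent; the helper name ancestry is hypothetical, and the PID to PPID pairs are copied from the two top listings above rather than read from /proc.

```python
def ancestry(pid, parent_of):
    """Return [pid, parent, grandparent, ...] following PPID links."""
    chain = [pid]
    while chain[-1] in parent_of:
        parent = parent_of[chain[-1]]
        if parent in chain:
            break  # guard against a malformed table with a cycle
        chain.append(parent)
    return chain


# PID -> PPID pairs taken from the top output above.
parent_of = {1: 0, 2: 0, 3: 2, 13: 2, 14: 2, 3371: 1, 3402: 3371}

print(ancestry(3402, parent_of))  # dbus-daemon -> user systemd -> systemd -> 0
print(ancestry(14, parent_of))    # rcu_sched -> kthreadd -> 0
```

Both a user process and a kernel thread end at PID 0, but they get there through different parents (PID 1 and PID 2 respectively).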
What you want would require replacing the kernel with something else, which is entirely possible. See for example:
https://wiki.osdev.org/Bare_bones
https://github.com/contiki-ng/contiki-ng
It is pretty easy to replace systemd (or the old /sbin/init) with your own custom initializer. Check this answer:
Writing my own init executable

Data is highly skewed and the value range is too large

I am trying to rescale and normalize my dataset.
My data is highly skewed and the range of values is very large, which is affecting my model's performance.
I have tried RobustScaler() and PowerTransformer(), with no improvement.
Below you can see the boxplot, the KDE plot and the skewness test of my data:
df_test.agg(['skew', 'kurtosis']).transpose()
The data is financial, so it can take a large range of values (they are not really outliers).
Depending on your data, there are several ways to handle this. There is, however, a function that helps with skewed data by applying a preliminary transformation before you normalize.
Go to this repo (https://github.com/datamadness/Automatic-skewness-transformation-for-Pandas-DataFrame) and download the files skew_autotransform.py and TEST_skew_autotransform.py. Put them in the same folder as your code and use the function as in this example:
import pandas as pd
import numpy as np
# Note: load_boston was removed in scikit-learn 1.2, so this example needs an older release.
from sklearn.datasets import load_boston
from skew_autotransform import skew_autotransform
boston = load_boston()  # load once and reuse
exampleDF = pd.DataFrame(boston['data'], columns = boston['feature_names'].tolist())
transformedDF = skew_autotransform(exampleDF.copy(deep=True), plot = True, exp = False, threshold = 0.5)
print('Original average skewness value was %2.2f' %(np.mean(abs(exampleDF.skew()))))
print('Average skewness after transformation is %2.2f' %(np.mean(abs(transformedDF.skew()))))
It will return several graphs and measures of skewness of each variable, but most importantly a transformed dataframe of the handled skewed data:
Original data:
CRIM ZN INDUS CHAS NOX RM AGE DIS RAD TAX \
0 0.00632 18.0 2.31 0.0 0.538 6.575 65.2 4.0900 1.0 296.0
1 0.02731 0.0 7.07 0.0 0.469 6.421 78.9 4.9671 2.0 242.0
2 0.02729 0.0 7.07 0.0 0.469 7.185 61.1 4.9671 2.0 242.0
3 0.03237 0.0 2.18 0.0 0.458 6.998 45.8 6.0622 3.0 222.0
4 0.06905 0.0 2.18 0.0 0.458 7.147 54.2 6.0622 3.0 222.0
.. ... ... ... ... ... ... ... ... ... ...
501 0.06263 0.0 11.93 0.0 0.573 6.593 69.1 2.4786 1.0 273.0
502 0.04527 0.0 11.93 0.0 0.573 6.120 76.7 2.2875 1.0 273.0
503 0.06076 0.0 11.93 0.0 0.573 6.976 91.0 2.1675 1.0 273.0
504 0.10959 0.0 11.93 0.0 0.573 6.794 89.3 2.3889 1.0 273.0
505 0.04741 0.0 11.93 0.0 0.573 6.030 80.8 2.5050 1.0 273.0
PTRATIO B LSTAT
0 15.3 396.90 4.98
1 17.8 396.90 9.14
2 17.8 392.83 4.03
3 18.7 394.63 2.94
4 18.7 396.90 5.33
.. ... ... ...
501 21.0 391.99 9.67
502 21.0 396.90 9.08
503 21.0 396.90 5.64
504 21.0 393.45 6.48
505 21.0 396.90 7.88
[506 rows x 13 columns]
and the transformed data:
CRIM ZN INDUS CHAS NOX RM AGE \
0 -6.843991 1.708418 2.31 -587728.314092 -0.834416 6.575 201.623543
1 -4.447833 -13.373080 7.07 -587728.314092 -1.092408 6.421 260.624267
2 -4.448936 -13.373080 7.07 -587728.314092 -1.092408 7.185 184.738608
3 -4.194470 -13.373080 2.18 -587728.314092 -1.140400 6.998 125.260171
4 -3.122838 -13.373080 2.18 -587728.314092 -1.140400 7.147 157.195622
.. ... ... ... ... ... ... ...
501 -3.255759 -13.373080 11.93 -587728.314092 -0.726384 6.593 218.025321
502 -3.708638 -13.373080 11.93 -587728.314092 -0.726384 6.120 250.894792
503 -3.297348 -13.373080 11.93 -587728.314092 -0.726384 6.976 315.757117
504 -2.513274 -13.373080 11.93 -587728.314092 -0.726384 6.794 307.850962
505 -3.643173 -13.373080 11.93 -587728.314092 -0.726384 6.030 269.101967
DIS RAD TAX PTRATIO B LSTAT
0 1.264870 0.000000 1.807258 32745.311816 9.053163e+08 1.938257
1 1.418585 0.660260 1.796577 63253.425063 9.053163e+08 2.876983
2 1.418585 0.660260 1.796577 63253.425063 8.717663e+08 1.640387
3 1.571460 1.017528 1.791645 78392.216639 8.864906e+08 1.222396
4 1.571460 1.017528 1.791645 78392.216639 9.053163e+08 2.036925
.. ... ... ... ... ... ...
501 0.846506 0.000000 1.803104 129845.602554 8.649562e+08 2.970889
502 0.776403 0.000000 1.803104 129845.602554 9.053163e+08 2.866089
503 0.728829 0.000000 1.803104 129845.602554 9.053163e+08 2.120221
504 0.814408 0.000000 1.803104 129845.602554 8.768178e+08 2.329393
505 0.855697 0.000000 1.803104 129845.602554 9.053163e+08 2.635552
[506 rows x 13 columns]
After having done this, normalize the data if you need to.
Update
Given the ranges of some of your data, you probably need to do this case by case, by trial and error. There are several normalizers you can use to test different approaches. I'll show a few of them on an example column:
exampleDF = pd.read_csv("test.csv", sep=",")
exampleDF = pd.DataFrame(exampleDF['LiabilitiesNoncurrent_total'])
LiabilitiesNoncurrent_total
count 6.000000e+02
mean 8.865754e+08
std 3.501445e+09
min -6.307000e+08
25% 6.179232e+05
50% 1.542650e+07
75% 3.036085e+08
max 5.231900e+10
Sigmoid
Define the following function:
def sigmoid(x):
    # Logistic function 1 / (1 + e^(-x))
    y = 1/(1+np.exp(-x))
    return y
and apply it:
df = sigmoid(exampleDF.LiabilitiesNoncurrent_total)
df = pd.DataFrame(df)
'LiabilitiesNoncurrent_total' had 'positive' skewness of 8.85
The transformed one has a skewness of -2.81
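One caveat worth checking with the sigmoid: on raw values of this magnitude it saturates, so every large positive value maps to exactly 1.0 and every large negative value to 0.0. A possible workaround, which is an assumption on my part and not something the answer above does, is to standardize the column first. The sketch below uses only the standard library, and the five sample values are the min, quartiles and max from the describe() output above, used as stand-in data.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function for a single float."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def standardize(values):
    """Z-score: subtract the mean and divide by the (population) std."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / std for v in values]


# Stand-in data: min, 25%, 50%, 75% and max from describe() above.
sample = [-6.307e8, 6.179232e5, 1.54265e7, 3.036085e8, 5.2319e10]

raw = [sigmoid(v) for v in sample]                  # saturates to 0.0 or 1.0
scaled = [sigmoid(z) for z in standardize(sample)]  # stays strictly inside (0, 1)

print(raw)
print(scaled)
```

With the raw values the transformed column carries almost no information beyond the sign, which is worth keeping in mind when comparing skewness numbers.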
Log+1 Normalization
Another approach is to apply a logarithm and then normalize.
def normalize(column):
    # Min-max scaling to the range [0, 1]
    upper = column.max()
    lower = column.min()
    y = (column - lower)/(upper-lower)
    return y
# Note: np.log is undefined (NaN or -inf) for values <= -1, and this column has negative values.
df = np.log(exampleDF['LiabilitiesNoncurrent_total'] + 1)
df_normalized = normalize(df)
The skewness is reduced by approximately the same amount.
I would opt for this last option rather than a sigmoidal approach. I also suspect that you can apply this solution to all your features.
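Because the example column has a negative minimum, a plain log(x + 1) is undefined for part of it. One common variant, offered here as an assumption rather than as the method used above, is a sign-preserving log, sign(x) * log(1 + |x|), followed by min-max scaling. The sketch uses only the standard library; the sample values are again the min, quartiles and max from the describe() output, used as stand-in data.

```python
import math


def signed_log1p(values):
    """sign(x) * log(1 + |x|): compresses magnitudes and is defined for negatives."""
    return [math.copysign(math.log1p(abs(v)), v) for v in values]


def min_max(values):
    """Scale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Stand-in data: min, 25%, 50%, 75% and max from describe() above.
sample = [-6.307e8, 6.179232e5, 1.54265e7, 3.036085e8, 5.2319e10]

transformed = signed_log1p(sample)
scaled = min_max(transformed)

print(transformed)
print(scaled)
```

With NumPy the same transform can be written as np.sign(x) * np.log1p(np.abs(x)). Note that it leaves a gap between negative and positive values, so whether it helps should be checked on the real column.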

Some confusion in creating pivot table

I am trying to create a pivot table, but I am not getting the result I want, and I can't understand why.
I have a dataframe like this:
data_channel_is_lifestyle data_channel_is_bus shares
0 0.0 0.0 593
1 0.0 1.0 711
2 0.0 1.0 1500
3 0.0 0.0 1200
4 0.0 0.0 505
The result I am looking for has the column names in the index and the sum of shares in the column, so I did this:
news_copy.pivot_table(index=['data_channel_is_lifestyle','data_channel_is_bus'], values='shares', aggfunc=sum)
but I am getting a result like this:
shares
data_channel_is_lifestyle data_channel_is_bus
0.0 0.0 107709305
1.0 19168370
1.0 0.0 7728777
I don't want these 0's and 1's; I just want the result to look like this:
shares
data_channel_is_lifestyle 107709305
data_channel_is_bus 19168370
How can I do this?
As you put it, it's just matrix multiplication:
df.filter(like='data').T @ df[['shares']]
Output (for sample data):
shares
data_channel_is_lifestyle 0.0
data_channel_is_bus 2211.0
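To see why a matrix product gives per-channel totals: each output entry is the dot product of one 0/1 indicator column with the shares column, so rows where the indicator is 0 contribute nothing. The sketch below reproduces that arithmetic in plain Python on the five sample rows from the question; the variable names are made up for illustration.

```python
# The five sample rows from the question.
rows = [
    {"data_channel_is_lifestyle": 0.0, "data_channel_is_bus": 0.0, "shares": 593},
    {"data_channel_is_lifestyle": 0.0, "data_channel_is_bus": 1.0, "shares": 711},
    {"data_channel_is_lifestyle": 0.0, "data_channel_is_bus": 1.0, "shares": 1500},
    {"data_channel_is_lifestyle": 0.0, "data_channel_is_bus": 0.0, "shares": 1200},
    {"data_channel_is_lifestyle": 0.0, "data_channel_is_bus": 0.0, "shares": 505},
]

# Same selection idea as filter(like='data'): columns whose name contains "data".
indicator_columns = [name for name in rows[0] if "data" in name]

# Dot product of each indicator column with the shares column.
totals = {
    name: sum(row[name] * row["shares"] for row in rows)
    for name in indicator_columns
}

for name, total in totals.items():
    print(name, total)
```

The totals match the output shown for the sample data.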

Predictive Maintenance - How to use Bayesian Optimization with objective function and Logistic Regression with Gradient Descent together?

I'm trying to reproduce the problem shown on arimo.com.
It is an example of how to build a preventive maintenance machine learning model for hard drive failures. The part I really don't understand is how to use Bayesian Optimization with a custom objective function together with Logistic Regression trained by Gradient Descent. What are the hyperparameters to be optimized? What is the flow of the problem?
As described in our previous post, Bayesian Optimization [6] is used
to find the best hyperparameter values. The objective function to be
optimized in the hyperparameter tuning is the following score measured
on the validation set:
S = alpha * fnr + (1 – alpha) * fpr
where fpr and fnr are the False Positive and False Negative rates
obtained on the validation set. Our goal is to keep False Positive
rate low, therefore we use alpha = 0.2. Since the validation set is
highly unbalanced, we found out that standard scores like Precision,
F1-score, etc… do not work well. In fact, using this custom score is
crucial for the model to obtain a good performance generally.
Note that we only use the above score when running Bayesian
Optimization. To train logistic regression models, we use Gradient
Descent with the usual ridge loss function.
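The score in the quoted passage can be computed directly from validation-set confusion counts. The sketch below is only an illustration of the formula: the function name custom_score and the example counts are made up, and the standard definitions fnr = FN / (FN + TP) and fpr = FP / (FP + TN) are assumed. As the passage describes it, this value is what the hyperparameter search minimizes, while each candidate model is still trained by gradient descent on the ridge loss.

```python
def custom_score(fp, fn, tp, tn, alpha=0.2):
    """S = alpha * fnr + (1 - alpha) * fpr, computed from confusion counts."""
    fnr = fn / (fn + tp) if (fn + tp) else 0.0  # missed failures among actual failures
    fpr = fp / (fp + tn) if (fp + tn) else 0.0  # false alarms among healthy drives
    return alpha * fnr + (1 - alpha) * fpr


# Hypothetical validation counts, for illustration only.
score = custom_score(fp=50, fn=2, tp=8, tn=950)
print(round(score, 4))
```

With alpha = 0.2 the false positive rate carries four times the weight of the false negative rate, which matches the stated goal of keeping false positives low.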
My dataframe before feature selection:
index date serial_number model capacity_bytes failure Read Error Rate Reallocated Sectors Count Power-On Hours (POH) Temperature Current Pending Sector Count age yesterday_temperature yesterday_age yesterday_reallocated_sectors_count yesterday_read_error_rate yesterday_current_pending_sector_count yesterday_power_on_hours tomorrow_failure
0 77947 2013-04-11 MJ0331YNG69A0A Hitachi HDS5C3030ALA630 3000592982016 0 0 0 4909 29 0 36348284.0 29.0 20799895.0 0.0 0.0 0.0 4885.0 0.0
1 79327 2013-04-11 MJ1311YNG7EWXA Hitachi HDS5C3030ALA630 3000592982016 0 0 0 8831 24 0 36829839.0 24.0 21280074.0 0.0 0.0 0.0 8807.0 0.0
2 79592 2013-04-11 MJ1311YNG2ZD9A Hitachi HDS5C3030ALA630 3000592982016 0 0 0 13732 26 0 36924206.0 26.0 21374176.0 0.0 0.0 0.0 13708.0 0.0
3 80715 2013-04-11 MJ1311YNG2ZDBA Hitachi HDS5C3030ALA630 3000592982016 0 0 0 12745 27 0 37313742.0 27.0 21762591.0 0.0 0.0 0.0 12721.0 0.0
4 79958 2013-04-11 MJ1323YNG1EK0C Hitachi HDS5C3030ALA630 3000592982016 0 524289 0 13922 27 0 37050016.0 27.0 21499620.0 0.0 0.0 0.0 13898.0 0.0

htop output to human readable file

I've tried piping htop to a text file (e.g. htop > text.txt) but it gives me text garbled by formatting strings (see below). Is there a way to get nicer, human readable output?
^[7^[[?47h^[[1;30r^[[m^[[4l^[[?1h^[=^[[m^[[?1000h^[[m^[[m^[[H^[[2J^[[1B ^[[36m1 ^[[m^[[1m[^[[m^[[32m||||||||||^[[31m||||||||||^[[30m^[[1m \
22.2%^[[m]^[[m ^[[36mTasks: ^[[1m159^[[m^[[36m total, ^[[32m^[[1m5^[[m^[[36m running^[[3;3H2 ^[[m^[[1m[^[[30m \
0.0%^[[m]^[[m ^[[36mLoad average: ^[[30m^[[1m1.11 ^[[m^[[m1.28 ^[[1m1.31 ^[[4;3H^[[m^[[36m3 ^[[m^[[1m[^[[m^[[32m||||||||||^[[30m^[[1m \
11.1%^[[m]^[[m ^[[36mUptime: ^[[1m9 days, 22:04:51^[[5;3H^[[m^[[36m4 ^[[m^[[1m[^[[30m 0.0\
%^[[m]^[[6;3H^[[m^[[36m5 ^[[m^[[1m[^[[m^[[31m||||||||||^[[30m^[[1m 11.1%^[[m]^[[7;3H^[[m^[[36m6 ^[[m^[[1m[^[[30m \
htop author here.
No, there's no "nice" way to get the output of htop piped into a file. It is an interactive application and uses terminal redraw routines to produce its interface (therefore, piping it makes as much sense as, for example, piping vim into a text file -- you'll get similar results).
To get the information about your processes in a text format, use "ps". For example, ps auxf > file.txt gives you lots of easy to parse information (or ps aux if you do not wish tree-formatting -- see man ps for more options).
htop outputs ANSI escape codes to use colors and move the cursor around the terminal. There is a great command-line program, aha, that can convert ANSI output into HTML.
Ubuntu/Debian installation
apt-get install aha
Save htop output as HTML file
echo q | htop | aha --black --line-fix > htop.html
I had the same need, and ended up using top instead of htop, as it provides a batch mode via the -b flag.
-b : Batch mode operation
Starts top in 'Batch mode', which could be useful for sending output from top to other programs or to a file. In this mode, top will not accept input and runs until the iterations limit you've set with the '-n' command-line option or until killed.
So for example:
top -b -n 1
Hope this helps even if this is not using htop.
This command outputs plain text. (It requires installing aha and html2text.)
echo q | htop -C | aha --line-fix | html2text -width 999 |
grep -v "F1Help\|xml version=" > file.txt
You can also use script prior to running htop in a mode that will redirect timings to a file for later playback. In the realm of 'yet another work around' and 'good for show and tell'.
script -t -a /var/tmp/script.htop.out 2> /var/tmp/script.htop.out.timings
htop
Then, to play it back:
scriptreplay /var/tmp/script.htop.out.timings /var/tmp/script.htop.out
Install recode first, then encode it to utf-8:
$ htop | recode utf-8 > test.txt
Then cat the file and you should be good.
Based on the previous answers, I suggest using Python for some post-processing. The code is as follows.
First, we get the text from htop:
echo q | htop -C > a.txt
Then, we use Python to make it human-readable:
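import re
htop = open("a.txt").read()  # one string; no join needed
# Cursor-positioning sequences mark new rows: turn them into newlines.
rows = re.sub(r"\x1b\[\d\d;\dH|\x1b\[\d;3H", "\n", htop)
# Strip the remaining ANSI escape sequences, then drop the 9 leading characters.
print(re.sub(r'\x1B(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])', "", rows)[9:])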
The results are as follows:
1 [ 0.0%] Tasks: 11, 38 thr; 1 running
2 [ 0.0%] Load average: 0.38 0.26 0.11
3 [ 0.0%] Uptime: 01:19:50
4 [ 0.0%]
Mem[|#**** 700M/25.5G]
Swp[ 0K/0K]
PID USER PRI NI VIRT RES SHR S CPU% MEM% TIME+ Command
51 root 20 0 359M 62880 33428 S 0.0 0.2 0:00.00 /tools/node/bin
52 root20 0 359M 62880 33428 S 0.0 0.2 0:00.21 /tools/node/bin
53 root20 0 359M 62880 33428 S 0.0 0.2 0:00.19 /tools/node/bin
54 root20 0 359M 62880 33428 S 0.0 0.2 0:00.16 /tools/node/bin
55 root20 0 359M 62880 33428 S 0.0 0.2 0:00.15 /tools/node/bin
56 root20 0 359M 62880 33428 S 0.0 0.2 0:00.00 /tools/node/bin
57 root20 0 359M 62880 33428 S 0.0 0.2 0:00.05 /tools/node/bin
58 root20 0 359M 62880 33428 S 0.0 0.2 0:00.04 /tools/node/bin
59 root20 0 359M 62880 33428 S 0.0 0.2 0:00.05 /tools/node/bin
60 root20 0 359M 62880 33428 S 0.0 0.2 0:00.04 /tools/node/bin
1 root20 0 359M 62880 33428 S 0.0 0.2 0:08.76 /tools/node/bin
16 root20 0 35892 4768 3660 S 0.0 0.0 0:00.62 tail -n +0 -F /
75 root20 0 190M 61096 13512 S 0.0 0.2 0:00.00 /usr/bin/python
76 root20 0 190M 61096 13512 S 0.0 0.2 0:00.56 /usr/bin/python
F1Help F2Setup F3SearchF4FilterF5Tree F6SortByF7Nice -F8Nice +F9Kill F10Quit
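The patterns in the snippet above are tied to particular row and column positions. A slightly more general sketch, under the assumption that every cursor-position sequence (ESC [ row ; col H) starts a new piece of text, is shown below. The function name ansi_to_text is made up for illustration, and the test string is a fragment of the garbled output from the question with ^[ written as \x1b. It strips most, but not necessarily all, escape sequences (for example, ESC 7 and ESC = are left alone).

```python
import re

# ESC [ row ; col H moves the cursor; treat it as a line break.
CURSOR_MOVE = re.compile(r"\x1b\[\d+;\d+H")
# Widely used pattern for two-character escapes and CSI sequences.
ANSI_ESCAPE = re.compile(r"\x1b(?:[@-Z\\-_]|\[[0-?]*[ -/]*[@-~])")


def ansi_to_text(raw: str) -> str:
    """Replace cursor moves with newlines, then drop other ANSI escapes."""
    with_breaks = CURSOR_MOVE.sub("\n", raw)
    plain = ANSI_ESCAPE.sub("", with_breaks)
    # Drop blank lines and trailing spaces left behind by the redraw.
    return "\n".join(line.rstrip() for line in plain.splitlines() if line.strip())


fragment = (
    "\x1b[36mTasks: \x1b[1m159\x1b[m\x1b[36m total, "
    "\x1b[32m\x1b[1m5\x1b[m\x1b[36m running\x1b[3;3H2 \x1b[m\x1b[1m[\x1b[30m"
)
print(ansi_to_text(fragment))
```

This keeps the text readable but does not restore column alignment, since the original layout depended on absolute cursor positions.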
This may sound really noobish, but if you have multiple monitors you could keep htop running while "record my desktop" captures that area. It's more of a video and may not help with searching and sorting, but it would look nice and pretty.