gke cant disable Transparent Huge Pages... permission denied - redis

I am trying to run a redis image in gke. It works except I get the dreaded "Transparent Huge Pages" warning:
WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.
Redis is currently too slow to be useful... So I tied turning off THP:
sheena#gke-projectwaxd-cluster-default-pool-23593a74-wxrv ~ $ cat /sys/kernel/mm/transparent_hugepage/enabled
always [madvise] never
sheena#gke-projectwaxd-cluster-default-pool-23593a74-wxrv ~ $ echo never > /sys/kernel/mm/transparent_hugepage/enabled
-bash: /sys/kernel/mm/transparent_hugepage/enabled: Permission denied
sheena#gke-projectwaxd-cluster-default-pool-23593a74-wxrv ~ $ sudo echo never > /sys/kernel/mm/transparent_hugepage/enabled
-bash: /sys/kernel/mm/transparent_hugepage/enabled: Permission denied
These permission errors are disconcerting. Redis wants THP off so it can work properly.
I did a little digging and found that google uses a special os image that makes /sys/ a read-only path. There's an alternative image that's based on Debian 7. It got me all excited but in the end I have exactly the same problem.
So how do I stop redis from being effected by THP on Google container engine?
It's not like I'm doing something unique here. Running databases in containers is pretty normal. And it's pretty normal for a database to malfunction when THP is enabled. So... what am I missing here?

Your command is slightly incorrect: echo runs as root but the redirection itself (>) runs as user so it can't write /sys/.
The following command works fine both on container-vm (debian based) and gci (chromeos based):
sudo sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
Persisting this setting on container-vm
Add this kernel command line parameter into /etc/default/grub (don't forget to run sudo update-grub and sudo reboot afterwards):
GRUB_CMDLINE_LINUX="... transparent_hugepage=never"
Persisting this setting on gci
First, using the cloud console copy the instance template that is in use by the node pool.
Second, under metadata change the value for userdata:
#cloud-config
write_files:
- path: /etc/systemd/system/hugepage.service
permissions: 0644
owner: root
content: |
[Unit]
Description=Disable THP
[Service]
Type=oneshot
ExecStart=/bin/sh -c "echo never > /sys/kernel/mm/transparent_hugepage/enabled"
[Install]
WantedBy=kubernetes.target
...
runcmd:
- ...
- systemctl enable hugepage.service
- systemctl start kubernetes.target
Third, change the instance template to the newly created one:
gcloud compute instance-groups managed set-instance-template \
gke-YOUCLUSTER-YOURPOOL-grp \
--template=YOURNEWTEMPLATENAME \
--zone=...
Forth, recreate the instace(s):
gcloud compute instance-groups managed recreate-instances \
gke-YOUCLUSTER-YOURPOOL-grp \
--zone=... \
--instances=...
The instances will loose all data and come up with THP disabled. All new instances will have THP disabled as well (in this node pool).

Related

How to use podman's ssh build flag?

I have been using the docker build --ssh flag to give builds access to my keys from ssh-agent.
When I try the same thing with podman it does not work. I am working on macOS Monterey 12.0.1. Intel chip. I have also reproduced this on Ubuntu and WSL2.
❯ podman --version
podman version 3.4.4
This is an example Dockerfile:
FROM python:3.10
RUN mkdir -p -m 0600 ~/.ssh \
&& ssh-keyscan github.com >> ~/.ssh/known_hosts
RUN --mount=type=ssh git clone git#github.com:ruarfff/a-private-repo-of-mine.git
When I run DOCKER_BUILDKIT=1 docker build --ssh default . it works i.e. the build succeeds, the repo is cloned and the ssh key is not baked into the image.
When I run podman build --ssh default . the build fails with:
git#github.com: Permission denied (publickey).
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
Error: error building at STEP "RUN --mount=type=ssh git clone git#github.com:ruarfff/a-private-repo-of-mine.git": error while running runtime: exit status 128
I have just begun playing around with podman. Looking at the docs, that flag does appear to be supported. I have tried playing around with the format a little, specifying the id directly for example but no variation of specifying the flag or the mount has worked so far. Is there something about how podman works that I may be missing that explains this?
Adding this line as suggested in the comments:
RUN --mount=type=ssh ssh-add -l
Results in this error:
STEP 4/5: RUN --mount=type=ssh ssh-add -l
Could not open a connection to your authentication agent.
Error: error building at STEP "RUN --mount=type=ssh ssh-add -l": error while running runtime: exit status 2
Edit:
I belive this may have something to do with this issue in buildah. A fix has been merged but has not been released yet as far as I can see.
The error while running runtime: exit status 2 does not to me appear to be necessarily related to SSH or --ssh for podman build. It's hard to say really, and I've successfully used --ssh like you are trying to do, with some minor differences that I can't relate to the error.
I am also not sure ssh-add being run as part of building the container is what you really meant to do -- if you want it to talk to an agent, you need to have two environment variables being exported from the environment in which you run ssh-add, these define where to find the agent to talk to and are as follows:
SSH_AUTH_SOCK, specifying the path to a socket file that a program uses to communicate with the agent
SSH_AGENT_PID, specifying the PID of the agent
Again, without these two variables present in the set of exported environment variables, the agent is not discoverable and might as well not exist at all so ssh-add will fail.
Since your agent is probably running as part of the set of processes to which your podman build also belongs to, at the minimum the PID denoted by SSH_AGENT_PID should be valid in that namespace (meaning it's normally invalid in the set of processes that container building is isolated to, so defining the variable as part of building the container would be a mistake). Similar story with SSH_AUTH_SOCK -- the path to the socket file dumped by starting the agent program, would not normally refer to a file that exists in the mount namespace of the container being built.
Now, you can run both the agent and ssh-add as part of building a container, but ssh-add reads keys from ~/.ssh and if you had key files there as part of the container image being built you wouldn't need --ssh in the first place, would you?
The value of --ssh lies in allowing you to transfer your authority to talk to remote services defined through your keys on the host, to the otherwise very isolated container building procedure, through use of nothing else but an SSH agent designed for this very purpose. That removes the need to do things like copying key files into the container. They (keys) should also normally not be part of the built container, especially if they were only to be used during building. The agent, on the other hand, runs on the host, securely encapsulates the keys you add to it, and since the host is where you'd have your keys that's where you're supposed to run ssh-add at to add them to the agent.

How can I raise the limit for open files in Ubuntu 20.04 on WSL2?

My setup looks as follows: Windows 10, Release 1909 (Build 18363.1082), using WSL2 with an Ubuntu 20.04 environment. Everything works nicely most of the time, but there are some issues I cannot manage to solve.
During development using parcel (React bundler), I run into the problem that the bundler apparently opens lots of files at the same time, and at a certain point, I run into the following problem:
EMFILE: too many open files, open '/home/myusername/Projects/some-project-path/node_modules/#material-ui/icons/esm/RoundedCornerRounded.js'
As parcel seemingly does not easily support using something like graceful-fs, I have tried to increase the limit for open files inside the Ubuntu environment. What I have tried so far:
A simple ulimit -n 4096 (which is the highest possible by default), but it's apparently (by far?) not enough
I tried increasing fs.files-max to something really high in /etc/sysctl.conf, but it doesn't seem to have an effect (neither after sysctl -p nor after a restart of wsl)
I also tried increasing fs.inotify.max_user_watches, but that did not seem to have an effect either
Also setting soft and hard limits in /etc/security/limits.conf did not seem to have an effect
I also found information that changing DefaultLimitNOFILE in /etc/systemd/system.conf can have an effect (so I did that as well)
Has anybody manage to solve a similar system on Ubuntu 20.04 on WSL2? This left me pretty stumped, and it prevents me from using parcel inside this environment. That's a real pity, as really everything else is working really fine.
UPDATE
So I have found out that my changes in various places (probably the one in /etc/security/limits.conf) has had some kind of effect. Just not when logging in directly. This illustrates this:
donmartin#SOMEMACHINE:~$ ulimit -Hn
4096
donmartin#SOMEMACHINE:~$ su donmartin
Password:
donmartin#SOMEMACHINE:~$ ulimit -Hn
65536
donmartin#SOMEMACHINE:~$
Which means: If I su to my own user, the ulimit has indeed been raised. But if I log in just as normal using Windows Terminal, this limit is not in effect. Even more puzzled now - BUT - I have a workaround for my problem. Having set my values to 65536, the parcel build now works, running as my own user. Go figure! I still don't quite know which setting was changing the behaviour now - perhaps somebody has more thorough information on how this works and/or how I can make this also the default without having to do a su to get the updated limits.
I had to add the following line to /etc/systemd/user.conf:
DefaultLimitNOFILE=65535
As written in the answer here:
https://superuser.com/questions/1200539/cannot-increase-open-file-limit-past-4096-ubuntu/1200818#1200818?s=1b927bb17396480da98a94cbacf8da62
Also you may need to run this (if working with applications that monitors changes in many files/folders):
sudo sh -c 'sysctl fs.inotify.max_user_watches=524288 && sysctl -p'
Try this:
$ visudo
ADD: user ALL=(ALL) NOPASSWD:ALL
$ vi ~/.profile
ADD: user ALL=(ALL) NOPASSWD:ALL
$ vi /etc/security/limits.conf
ADD: user soft nproc 10000
user hard nproc 10000
user soft nofile 10000
user hard nofile 10000
Temporarily increase the open files hard limit for the session
Run this 3 commands (the first one is optinal), to check current open files limit, switch to admin user, and increase the value.
$ ulimit -n
1024
$ su <user name>
<Enter password>
$ ulimit -n 65535
Check the new limit:
$ ulimit -n
65535
To check all values, run this:
$ ulimit -a

can't change the sshd's prot in CoreOS

System release: CoreOS 2135.5.0
Kernel: 4.19.50-coreos-r1
System install way: vmware
When I change the port in the sshd.service,it displays:
CoreOS-234 ssh # echo "Port 10000" >> /usr/share/ssh/sshd_config ;systemctl mask sshd.socket;systemctl enable sshd.service;systemctl restart sshd.service
-bash: /usr/share/ssh/sshd_config: Read-only file system
The file system that you are working in is currently in Read-only mode. Remounting the file system to read-write should resolve the issue. You will need to have root privilages:
$ mount -o remount,rw /
Occasionally the reason your file system will be running in read-only mode is due to Kernel issues, therefore there may be further problems with the system that will need to be debugged. Regarding the Kernel errors you may want to have a look at the following link: https://unix.stackexchange.com/questions/436483/is-remounting-from-read-only-to-read-write-potentially-dangerous?rq=1
In coreos /usr is designed to be a read-only file system, Remounting /usr is theoretically feasible, but is not officially recommended
You can refer to this
I use the following command to solve this problem
sudo sed -i '$a\Port=60022' /etc/ssh/sshd_config && \
sudo systemctl mask sshd.socket && \
sudo systemctl enable sshd.service && \
sudo systemctl start sshd.service

How to add EnvironmentFile directive to systemctl using Docker with centos7/httpd base image

I am not sure if this is possible without creating my own base image, but I use environment variables in /etc/environment on our servers and typically make them accessible to apache by doing the following:
$ printf "HTTP_VAR1=var1-value\n\
HTTP_VAR2=var2-value"\
>> /etc/environment
$ mkdir /usr/lib/systemd/system/httpd.service.d
$ printf "[Service]\n\
EnvironmentFile=/etc/environment"\
> /usr/lib/systemd/system/httpd.service.d/environment.conf
$ systemctl daemon-reload
$ systemctl restart httpd
$ reboot
The variables are then available in any PHP calls to getenv('HTTP_VAR1'); and etc. However, in running this from a docker file I get dbus errors on the systemctl commands. Without the systemctl commands it seems the variables are not available to apache as it seems the new EnvironmentFile directive doesn't take effect. My docker file snippet:
FROM centos/httpd:latest
RUN printf "HTTP_VAR1=var1-value\n\
HTTP_VAR2=var2-value"\
>> /etc/environment
RUN mkdir /usr/lib/systemd/system/httpd.service.d &&\
printf "[Service]\n\
EnvironmentFile=/etc/environment"\
> /usr/lib/systemd/system/httpd.service.d/environment.conf
RUN systemctl daemon-reload &&\
systemctl restart httpd
COPY entrypoint.sh /entrypoint.sh
So I happened upon the answer to the issue today. It seems that systemd drops backslashes inside single quotes, but it may effect double-quotes too from what I saw in testing. I found the systemd development mailing list thread from April 2014 where patching the issue was being discussed. It seems as though the fix never made it in. So we have to work around it.
In attempting to work around it I noticed some issues with actually reading the variables at all. It seemed as though either Apache or php-cli would get the correct variables, and sometimes not at all, it took a bit of sleuthing to figure out what was going on. Then I started reading into the systemd's EnvironmentFile directive to see if there was more to gain from the docs. It turns out it does not evaluate bash so export won't work. It expects a text file with variable assignments and herein lies one of the main issues that might keep this from being resolved.
I then devised a workable solution. Utilizing systemd's ExecStartPre directive I am able to run a script on startup of the httpd service. I then read in the environment file and write a new plain text one that will then be used by httpd's systemd unit. Here is the code:
Firstly, I moved my variables to /etc/profile.d/ directory rather than /etc/environment file.
file: /etc/profile.d/environment.sh
This is where we store all our environment variables, this gets easily sourced on all interactive shell logins. In the rarer cases where we need to have these variables available non-interactively we can either provide --login flag to /bin/bash or source it manually.
export HTTP_VAR1=var1-value-with-a-back\slash
export HTTP_VAR2=var2-value
file: /usr/lib/systemd/system/httpd.service.d/environment.conf
Our drop-in unit file to extend how the httpd service works. I add in a script that runs before httpd starts up. This gets ran on all httpd restarts and starts. The script that runs generates a plain text file at /etc/profile.d/environment.env which we subsequently tell systemd to load as an EnvironmentFile.
[Service]
ExecStartPre=/usr/bin/bash -c "/usr/local/bin/generate-plain-environment-file"
EnvironmentFile=/etc/profile.d/environment.env
file: /usr/local/bin/generate-plain-environment-file
Here is the script I am using, I whipped this together really fast, I really don't think it is that robust and it could be better. It just removes the export from the beginning of the lines and then escapes any backslashes since systemd drops single backslashes. A more proper solution might be to use bash to evaluate each line and obtain the variable value in case of usage of variables or other bash in the actual bash variables, then output them as plain text name=value assignments, however this is not part of my use-case so I didn't bother.
#!/bin/bash
cd /etc/profile.d/
rm -rf "./environment.env"
while IFS='' read -r line || [[ -n "$line" ]]; do
echo $(echo "${line}" | sed 's/^export //' | sed 's/\\/\\\\/g') >> "./environment.env";
done < "./environment.sh"
file: /etc/profile.d/environment.env
This is the resulting file generated by the described script.
HTTP_VAR1=var1-value-with-a-back\\slash
HTTP_VAR2=var2-value
Conclusion is that the I now have two files with the same thing in them but I only need to maintain the one, the other is generated each time we restart httpd. Also, we fix the backslash issue in the process. Hurray!

How to shorten an inittab process entry, a.k.a., where to put environment variables that will be seen by init?

I am setting up a Debian Etch server to host ruby and php applications with nginx. I have successfully configured inittab to start the php-cgi process on boot with the respawn action. After serving 1000 requests, the php-cgi worker processes die and are respawned by init. The inittab record looks like this:
50:23:respawn:/usr/local/bin/spawn-fcgi -n -a 127.0.0.1 -p 8000 -C 3 -u someuser -- /usr/bin/php-cgi
I initially wrote the process entry (everything after the 3rd colon) in a separate script (simply because it was long) and put that script name in the inittab record, but because the script would run its single line and die, the syslog was filled with errors like this:
May 7 20:20:50 sb init: Id "50" respawning too fast: disabled for 5 minutes
Thus, I got rid of the script file and just put the whole line in the inittab. Henceforth, no errors show up in the syslog.
Now I'm attempting the same with thin to serve a rails application. I can successfully start the thin server by running this command:
sudo thin -a 127.0.0.1 -e production -l /var/log/thin/thin.log -P /var/run/thin/thin.pid -c /path/to/rails/app -p 8010 -u someuser -g somegroup -s 2 -d start
It works apparently exactly the same whether I use the -d (daemonize) flag or not. Command line control comes immediately back (the processes have been daemonized) either way. If I put that whole command (minus the sudo and with absolute paths) into inittab, init complains (in syslog) that the process entry is too long, so I put the options into an exported environment variable in /etc/profile. Now I can successfully start the server with:
sudo thin $THIN_OPTIONS start
But when I put this in an inittab record with the respawn action
51:23:respawn:/usr/local/bin/thin $THIN_OPTIONS start
the logs clearly indicate that the environment variable is not visible to init; it's as though the command were simply "thin start."
How can I shorten the inittab process entry? Is there another file than /etc/profile where I could set the THIN_OPTIONS environment variable? My earlier experience with php-cgi tells me I can't just put the whole command in a separate script.
And why don't you call a wrapper who start thin whith your options?
start_thin.sh:
#!/bin/bash
/usr/local/bin/thin -a 127.0.0.1 -e production -l /var/log/thin/thin.log -P /var/run/thin/thin.pid -c /path/to/rails/app -p 8010 -u someuser -g somegroup -s 2 -d start
and then:
51:23:respawn:/usr/local/bin/start_thin
init.d script
Use a script in
/etc/rc.d/init.d
and set the runlevel
Here are some examples with thin, ruby, apache
http://articles.slicehost.com/2009/4/17/centos-apache-rails-and-thin
http://blog.fiveruns.com/2008/9/24/rails-automation-at-slicehost
http://elwoodicious.com/2008/07/15/nginx-haproxy-thin-fastcgi-php5-load-balanced-rails-with-php-support/
Which provide example initscripts to use.
edit:
Asker pointed out this will not allow respawning. I suggested forking in the init script and disowning the process so init doesn't hang (it might fork() the script itself, will check). And then creating an infinite loop that waits on the server process to die and restarts it.
edit2:
It seems init will fork the script. Just a loop should do it.