Unable to upload a file into OpenStack Swift 2.14.0 on a Raspberry Pi 3 because of "[Errno 13] Permission denied" - permissions

Creating and erasing buckets (containers) in my OpenStack Swift version 2.14.0 installation works well. It is a Swift-only installation. No further OpenStack services like Keystone have been deployed.
$ swift stat
Account: AUTH_test
Containers: 2
Objects: 0
Bytes: 0
Containers in policy "policy-0": 2
Objects in policy "policy-0": 0
Bytes in policy "policy-0": 0
Connection: keep-alive
...
$ swift post s3perf
$ swift list -A http://10.0.0.253:8080/auth/v1.0 -U test:tester -K testing
bucket
s3perf
These are the (positive) log messages regarding bucket creation in the file storage1.error:
$ tail -f /var/log/swift/storage1.error
...
May 9 13:58:50 raspberrypi container-server: STDERR: (1114) accepted ('127.0.0.1', 38118)
May 9 13:58:50 raspberrypi container-server: STDERR: 127.0.0.1 - - [09/May/2017 11:58:50] "POST /d1/122/AUTH_test/s3perf HTTP/1.1" 204 142 0.021630 (txn: tx982eb25d83624b37bd290-005911aefa)
But any attempt to upload a file fails with the error message [Errno 13] Permission denied.
$ swift upload s3perf s3perf-testfile1.txt
Object PUT failed: http://10.0.0.253:8080/v1/AUTH_test/s3perf/s3perf-testfile1.txt 503 Service Unavailable [first 60 chars of response] <html><h1>Service Unavailable</h1><p>The server is currently
$ tail -f /var/log/swift/storage1.error
...
May 18 20:55:44 raspberrypi object-server: STDERR: (927) accepted ('127.0.0.1', 45684)
May 18 20:55:44 raspberrypi object-server: ERROR __call__ error with PUT /d1/40/AUTH_test/s3perf/testfile :
Traceback (most recent call last):
  File "/home/pi/swift/swift/obj/server.py", line 1105, in __call__
    res = getattr(self, req.method)(req)
  File "/home/pi/swift/swift/common/utils.py", line 1626, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File "/home/pi/swift/swift/obj/server.py", line 814, in PUT
    writer.put(metadata)
  File "/home/pi/swift/swift/obj/diskfile.py", line 2561, in put
    super(DiskFileWriter, self)._put(metadata, True)
  File "/home/pi/swift/swift/obj/diskfile.py", line 1566, in _put
    tpool_reraise(self._finalize_put, metadata, target_path, cleanup)
  File "/home/pi/swift/swift/common/utils.py", line 3536, in tpool_reraise
    raise resp
IOError: [Errno 13] Permission denied
(txn: txfbf08bffde6d4657a72a5-00591dee30)
May 18 20:55:44 raspberrypi object-server: STDERR: 127.0.0.1 - - [18/May/2017 18:55:44] "PUT /d1/40/AUTH_test/s3perf/testfile HTTP/1.1" 500 875 0.015646 (txn: txfbf08bffde6d4657a72a5-00591dee30)
The proxy.error file also contains an error message: ERROR 500 Expect: 100-continue From Object Server.
May 18 20:55:44 raspberrypi proxy-server: ERROR 500 Expect: 100-continue From Object Server 127.0.0.1:6010/d1 (txn: txfbf08bffde6d4657a72a5-00591dee30) (client_ip: 10.0.0.220)
I have started Swift as user pi and assigned these folders to this user:
$ sudo chown pi:pi /etc/swift
$ sudo chown -R pi:pi /mnt/sdb1/*
$ sudo chown -R pi:pi /var/cache/swift
$ sudo chown -R pi:pi /var/run/swift
sdb1 is a loopback device with an XFS file system.
$ mount | grep sdb1
/srv/swift-disk on /mnt/sdb1 type xfs (rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,noquota)
$ ls -ld /mnt/sdb1/1/
drwxr-xr-x 3 pi pi 17 May 12 13:14 /mnt/sdb1/1/
I deployed Swift this way.
I wonder why creating buckets (containers) works but uploading a file fails with Permission denied.
Update 1
$ sudo swift-ring-builder /etc/swift/account.builder
/etc/swift/account.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/account.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6012 127.0.0.1:6012 d1 1.00 256 0.00
$ sudo swift-ring-builder /etc/swift/container.builder
/etc/swift/container.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/container.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6011 127.0.0.1:6011 d1 1.00 256 0.00
$ sudo swift-ring-builder /etc/swift/object.builder
/etc/swift/object.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/object.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6010 127.0.0.1:6010 d1 1.00 256 0.00
Update 2
The required ports are open.
$ nmap localhost -p 6010,6011,6012,8080,22
...
PORT STATE SERVICE
22/tcp open ssh
6010/tcp open x11
6011/tcp open unknown
6012/tcp open unknown
8080/tcp open http-proxy
Update 3
I can write as user pi inside the folder where Swift should store the objects.
$ whoami
pi
$ touch /srv/1/node/d1/objects/test
$ ls -l /srv/1/node/d1/objects/test
-rw-r--r-- 1 pi pi 0 May 13 22:59 /srv/1/node/d1/objects/test
Update 4
All Swift processes belong to user pi.
$ ps aux | grep swift
pi 944 3.2 2.0 24644 20100 ? Ss May12 65:14 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 945 3.1 2.0 25372 20228 ? Ss May12 64:30 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 946 3.1 1.9 24512 19416 ? Ss May12 64:03 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 947 3.1 1.9 23688 19320 ? Ss May12 64:04 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1000 0.0 1.7 195656 17844 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1001 0.0 1.8 195656 18056 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1002 0.0 1.6 23880 16772 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1003 0.0 1.7 195656 17848 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1004 0.0 1.7 24924 17504 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1005 0.0 1.6 24924 16912 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1006 0.0 1.8 24924 18368 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1007 0.0 1.8 24924 18208 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1008 0.0 1.8 25864 18824 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1009 0.0 1.8 25864 18652 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1010 0.0 1.7 25864 17340 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1011 0.0 1.8 25864 18772 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1012 0.0 1.8 24644 18276 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1013 0.0 1.8 24900 18588 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1014 0.0 1.8 24900 18588 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1015 0.0 1.8 24900 18568 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
Update 5
When I create a bucket, the Swift service creates a folder like this one:
/mnt/sdb1/1/node/d1/containers/122/9d5/7a23d9409f11da3062432c6faa75f9d5/
and this folder contains a db-file like 7a23d9409f11da3062432c6faa75f9d5.db. I think this is the correct behavior.
But when I try to upload a file into a bucket, Swift creates just an empty folder like this one:
/mnt/sdb1/1/node/d1/objects/139/eca/8b17958f984943fc97b6b937061d2eca
I can create files inside these empty folders via touch or echo as user pi, but for an unknown reason, Swift does not store files inside these folders.
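For comparison: after a successful upload, such a hash directory should contain a timestamped .data file rather than stay empty, for example (the timestamp below is hypothetical):
/mnt/sdb1/1/node/d1/objects/139/eca/8b17958f984943fc97b6b937061d2eca/1495120544.12345.data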
Update 6
In order to investigate this issue further, I installed Swift according to the SAIO - Swift All In One instructions, once inside a VMware ESXi virtual machine with Ubuntu 14.04 LTS and once inside Raspbian on a Raspberry Pi 3. The result: inside the Ubuntu 14.04 VM, Swift works perfectly, but when running on top of the Raspberry Pi, uploading files does not work.
Object PUT failed: http://10.0.0.253:8080/v1/AUTH_test/s3perf-testbucket/testfiles/s3perf-testfile1.txt 503 Service Unavailable [first 60 chars of response] <html><h1>Service Unavailable</h1><p>The server is currently
The storage1.error log file still says:
May 24 13:15:15 raspberrypi object-server: ERROR __call__ error with PUT /sdb1/484/AUTH_test/s3perf-testbucket/testfiles/s3perf-testfile1.txt :
Traceback (most recent call last):
  File "/home/pi/swift/swift/obj/server.py", line 1105, in __call__
    res = getattr(self, req.method)(req)
  File "/home/pi/swift/swift/common/utils.py", line 1626, in _timing_stats
    resp = func(ctrl, *args, **kwargs)
  File "/home/pi/swift/swift/obj/server.py", line 814, in PUT
    writer.put(metadata)
  File "/home/pi/swift/swift/obj/diskfile.py", line 2561, in put
    super(DiskFileWriter, self)._put(metadata, True)
  File "/home/pi/swift/swift/obj/diskfile.py", line 1566, in _put
    tpool_reraise(self._finalize_put, metadata, target_path, cleanup)
  File "/home/pi/swift/swift/common/utils.py", line 3536, in tpool_reraise
    raise resp
IOError: [Errno 13] Permission denied
(txn: txdfe3c7f704be4af8817b3-0059256b43)
Update 7
The issue is still not fixed, but I now have a working Swift service on the Raspberry Pi. I installed the (quite outdated) Swift revision 2.2.0, which is shipped with Raspbian, and it works well. The steps I took are explained here.

Based on the information that you provided, the errors occur while writing metadata.
The operations for writing metadata fall into two categories: manipulating inodes and manipulating extended attributes. So there are two possible sources for your error.
First, it may be an inode-related error. This can occur when the inode64 parameter is set while mounting the device. According to the XFS man page:
If applications are in use which do not handle inode numbers bigger than 32 bits, the inode32 option should be specified.
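As a quick check (a minimal sketch; the path below assumes your SAIO layout and may need adjusting), you can scan the mount for inode numbers above 32 bits:
import os

# Hypothetical path; adjust to the object directory on your XFS mount.
top = '/mnt/sdb1/1/node/d1/objects'
for root, dirs, files in os.walk(top):
    for name in dirs + files:
        path = os.path.join(root, name)
        ino = os.lstat(path).st_ino
        if ino >= 2 ** 32:
            print('inode above 32 bits: %d %s' % (ino, path))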
Second, it may be an error related to extended attributes. You can use the Python xattr package to write extended attributes and check whether an exception occurs.
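For example (a minimal sketch; the test file path is hypothetical, and user.swift.metadata is the attribute name Swift uses for object metadata):
import xattr

# Hypothetical test file on the XFS mount; adjust the path as needed.
path = '/mnt/sdb1/1/node/d1/objects/xattr-test'
open(path, 'w').close()
try:
    # Swift stores object metadata in the user.swift.metadata xattr.
    xattr.setxattr(path, 'user.swift.metadata', b'test')
    print(xattr.getxattr(path, 'user.swift.metadata'))
except IOError as e:
    print('xattr write failed: %s' % e)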

Related

spidev0.0 not showing with enc28j60 and raspberry pi 4

I am trying to set up SPI with the ETH Click (a board for the enc28j60) with the Raspberry Pi 4 Model B and the Pi 3 Click Shield.
When I set the overlay for the enc28j60 in my /boot/config.txt, only spidev0.1 appears when I run the command ls -l /dev/spi*:
crw-rw---- 1 root spi 153, 0 May 26 11:40 /dev/spidev0.1
My /boot/config.txt
#Uncomment some or all of these to enable the optional hardware interfaces
#dtparam=i2c_arm=on
#dtparam=i2s=on
dtparam=spi=on
dtoverlay=enc28j60,int_pin=6
I'm using:
CE0 - GPIO8
MISO - GPIO9
MOSI - GPIO10
SCLK - GPIO11
INT - GPIO6
If I remove the dtoverlay line, spidev0.0 appears.
My Raspbian version, from running cat /etc/os-release:
PRETTY_NAME="Raspbian GNU/Linux 10 (buster)"
NAME="Raspbian GNU/Linux"
VERSION_ID="10"
VERSION="10 (buster)"
VERSION_CODENAME=buster
ID=raspbian
ID_LIKE=debian
HOME_URL="http://www.raspbian.org/"
SUPPORT_URL="http://www.raspbian.org/RaspbianForums"
BUG_REPORT_URL="http://www.raspbian.org/RaspbianBugs"

Ambari unable to run custom hook for modifying user hive

Attempting to add a client node to the cluster via Ambari (v2.7.3.0, HDP 3.1.0.0-78) and seeing an odd error.
stderr:
Traceback (most recent call last):
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 38, in <module>
BeforeAnyHook().execute()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 352, in execute
method(env)
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py", line 31, in hook
setup_users()
File "/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/shared_initialization.py", line 51, in setup_users
fetch_nonlocal_groups = params.fetch_nonlocal_groups,
File "/usr/lib/ambari-agent/lib/resource_management/core/base.py", line 166, in __init__
self.env.run()
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 160, in run
self.run_action(resource, action)
File "/usr/lib/ambari-agent/lib/resource_management/core/environment.py", line 124, in run_action
provider_action()
File "/usr/lib/ambari-agent/lib/resource_management/core/providers/accounts.py", line 90, in action_create
shell.checked_call(command, sudo=True)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 72, in inner
result = function(command, **kwargs)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 102, in checked_call
tries=tries, try_sleep=try_sleep, timeout_kill_strategy=timeout_kill_strategy, returns=returns)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 150, in _call_wrapper
result = _call(command, **kwargs_copy)
File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 314, in _call
raise ExecutionFailed(err_msg, code, out, err)
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
stdout:
2019-11-25 13:07:57,644 - Stack Feature Version Info: Cluster Stack=3.1, Command Stack=None, Command Version=None -> 3.1
2019-11-25 13:07:57,651 - Using hadoop conf dir: /usr/hdp/current/hadoop-client/conf
2019-11-25 13:07:57,652 - Group['livy'] {}
2019-11-25 13:07:57,654 - Group['spark'] {}
2019-11-25 13:07:57,654 - Group['ranger'] {}
2019-11-25 13:07:57,654 - Group['hdfs'] {}
2019-11-25 13:07:57,654 - Group['zeppelin'] {}
2019-11-25 13:07:57,655 - Group['hadoop'] {}
2019-11-25 13:07:57,655 - Group['users'] {}
2019-11-25 13:07:57,656 - User['yarn-ats'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,658 - User['hive'] {'gid': 'hadoop', 'fetch_nonlocal_groups': True, 'groups': ['hadoop'], 'uid': None}
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-25 13:07:57,971 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
2019-11-25 13:07:58,000 - Reporting component version failed
Traceback (most recent call last):
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 363, in execute
self.save_component_version_to_structured_out(self.command_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/script/script.py", line 223, in save_component_version_to_structured_out
stack_select_package_name = stack_select.get_package_name()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 109, in get_package_name
package = get_packages(PACKAGE_SCOPE_STACK_SELECT, service_name, component_name)
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 223, in get_packages
supported_packages = get_supported_packages()
File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages
raise Fail("Unable to query for supported packages using {0}".format(stack_selector_path))
Fail: Unable to query for supported packages using /usr/bin/hdp-select
Command failed after 1 tries
The problem appears to be
resource_management.core.exceptions.ExecutionFailed: Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
caused by
2019-11-25 13:07:57,659 - Modifying user hive
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-632.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-632.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
This is further reinforced by the fact that manually adding the ambari-hdp-1.repo and yum-installing hdp-select before adding the host to the cluster produces the same error messages, just truncated relative to the parts of stdout/stderr shown here.
When running
[root@HW001 .ssh]# /usr/bin/hdp-select versions
3.1.0.0-78
from the Ambari server node, I can see the command runs.
Looking at what the hook script is trying to run/access, I see
[root@client001 ~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
-rw-r--r-- 1 root root 1.2K Nov 25 10:51 /var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py
[root@client001 ~]# ls -lha /var/lib/ambari-agent/data/command-632.json
-rw------- 1 root root 545K Nov 25 13:07 /var/lib/ambari-agent/data/command-632.json
[root@client001 ~]# ls -lha /var/lib/ambari-agent/cache/stack-hooks/before-ANY
total 0
drwxr-xr-x 4 root root 34 Nov 25 10:51 .
drwxr-xr-x 8 root root 147 Nov 25 10:51 ..
drwxr-xr-x 2 root root 34 Nov 25 10:51 files
drwxr-xr-x 2 root root 188 Nov 25 10:51 scripts
[root@client001 ~]# ls -lha /var/lib/ambari-agent/data/structured-out-632.json
ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory
[root@client001 ~]# ls -lha /var/lib/ambari-agent/tmp
total 96K
drwxrwxrwt 3 root root 4.0K Nov 25 13:06 .
drwxr-xr-x 10 root root 267 Nov 25 10:50 ..
drwxr-xr-x 6 root root 4.0K Nov 25 13:06 ambari_commons
-rwx------ 1 root root 1.4K Nov 25 13:06 ambari-sudo.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 create-python-wrap.sh
-rwxr-xr-x 1 root root 1.6K Nov 25 10:50 os_check_type1574715018.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:12 os_check_type1574716360.py
-rwxr-xr-x 1 root root 1.6K Nov 25 11:29 os_check_type1574717391.py
-rwxr-xr-x 1 root root 1.6K Nov 25 13:06 os_check_type1574723161.py
-rwxr-xr-x 1 root root 16K Nov 25 10:50 setupAgent1574715020.py
-rwxr-xr-x 1 root root 16K Nov 25 11:12 setupAgent1574716361.py
-rwxr-xr-x 1 root root 16K Nov 25 11:29 setupAgent1574717392.py
-rwxr-xr-x 1 root root 16K Nov 25 13:06 setupAgent1574723163.py
Notice the line ls: cannot access /var/lib/ambari-agent/data/structured-out-632.json: No such file or directory. Not sure if this is normal, though.
Anyone know what could be causing this or any debugging hints from this point?
UPDATE 01:
Adding some log-printing lines near the offending final line in the error trace, i.e. File "/usr/lib/ambari-agent/lib/resource_management/libraries/functions/stack_select.py", line 147, in get_supported_packages, I print the code and stdout:
2
ambari-python-wrap: can't open file '/usr/bin/hdp-select': [Errno 2] No such file or directory
So what the heck? It wants hdp-select to already be there, but the Ambari add-host UI complains if I manually install that binary myself beforehand. When I do manually install it (using the same repo file as on the rest of the existing cluster nodes), all I see is...
0
Packages:
accumulo-client
accumulo-gc
accumulo-master
accumulo-monitor
accumulo-tablet
accumulo-tracer
atlas-client
atlas-server
beacon
beacon-client
beacon-server
druid-broker
druid-coordinator
druid-historical
druid-middlemanager
druid-overlord
druid-router
druid-superset
falcon-client
falcon-server
flume-server
hadoop-client
hadoop-hdfs-client
hadoop-hdfs-datanode
hadoop-hdfs-journalnode
hadoop-hdfs-namenode
hadoop-hdfs-nfs3
hadoop-hdfs-portmap
hadoop-hdfs-secondarynamenode
hadoop-hdfs-zkfc
hadoop-httpfs
hadoop-mapreduce-client
hadoop-mapreduce-historyserver
hadoop-yarn-client
hadoop-yarn-nodemanager
hadoop-yarn-registrydns
hadoop-yarn-resourcemanager
hadoop-yarn-timelinereader
hadoop-yarn-timelineserver
hbase-client
hbase-master
hbase-regionserver
hive-client
hive-metastore
hive-server2
hive-server2-hive
hive-server2-hive2
hive-webhcat
hive_warehouse_connector
kafka-broker
knox-server
livy-client
livy-server
livy2-client
livy2-server
mahout-client
oozie-client
oozie-server
phoenix-client
phoenix-server
pig-client
ranger-admin
ranger-kms
ranger-tagsync
ranger-usersync
shc
slider-client
spark-atlas-connector
spark-client
spark-historyserver
spark-schema-registry
spark-thriftserver
spark2-client
spark2-historyserver
spark2-thriftserver
spark_llap
sqoop-client
sqoop-server
storm-client
storm-nimbus
storm-slider-client
storm-supervisor
superset
tez-client
zeppelin-server
zookeeper-client
zookeeper-server
Aliases:
accumulo-server
all
client
hadoop-hdfs-server
hadoop-mapreduce-server
hadoop-yarn-server
hive-server
Command failed after 1 tries
UPDATE 02:
Printing some custom logging from File "/usr/lib/ambari-agent/lib/resource_management/core/shell.py", line 322 (printing the values of err_msg, code, out, err), i.e.
....
312 if throw_on_failure and not code in returns:
313 err_msg = Logger.filter_text("Execution of '{0}' returned {1}. {2}".format(command_alias, c ode, all_output))
314
315 #TODO remove
316 print("\n----------\nMY LOGS\n----------\n")
317 print(err_msg)
318 print(code)
319 print(out)
320 print(err)
321
322 raise ExecutionFailed(err_msg, code, out, err)
323
324 # if separate stderr is enabled (by default it's redirected to out)
325 if stderr == subprocess32.PIPE:
326 return code, out, err
327
328 return code, out
....
I see
Execution of 'usermod -G hadoop -g hadoop hive' returned 6. usermod: user 'hive' does not exist in /etc/passwd
6
usermod: user 'hive' does not exist in /etc/passwd
Error: Error: Unable to run the custom hook script ['/usr/bin/python', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY/scripts/hook.py', 'ANY', '/var/lib/ambari-agent/data/command-816.json', '/var/lib/ambari-agent/cache/stack-hooks/before-ANY', '/var/lib/ambari-agent/data/structured-out-816.json', 'INFO', '/var/lib/ambari-agent/tmp', 'PROTOCOL_TLSv1_2', '']
2019-11-26 10:25:46,928 - The repository with version 3.1.0.0-78 for this command has been marked as resolved. It will be used to report the version of the component which was installed
So it seems like it is failing to create the hive user (even though it seems to have no problem creating the yarn-ats user before that).
After just giving in and trying to manually create the hive user myself, I see
[root@airflowetl ~]# useradd -g hadoop -s /bin/bash hive
useradd: user 'hive' already exists
[root@airflowetl ~]# cat /etc/passwd | grep hive
<nothing>
[root@airflowetl ~]# id hive
uid=379022825(hive) gid=379000513(domain users) groups=379000513(domain users)
The fact that this existing user's uid looks like this and is not in the /etc/passwd file made me think that there is some existing Active Directory user (which this client node syncs with via the installed SSSD) that already has the name hive. Checking our AD users, this turned out to be true.
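A minimal Python sketch of this kind of check (it merely contrasts name resolution through NSS, which includes SSSD/AD, with the local /etc/passwd file; the user name is taken from the error above):
import pwd

def is_local_user(name, passwd_file='/etc/passwd'):
    # Read the local passwd file directly, bypassing NSS/SSSD.
    with open(passwd_file) as f:
        return any(line.split(':', 1)[0] == name for line in f)

name = 'hive'
try:
    entry = pwd.getpwnam(name)  # resolves through NSS, which includes SSSD/AD
    print('%s resolves to uid %d' % (name, entry.pw_uid))
except KeyError:
    print('%s does not resolve at all' % name)
print('local user in /etc/passwd: %s' % is_local_user(name))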
Temporarily stopping the SSSD service to stop the sync with AD (service sssd stop) before rerunning the client host add in Ambari fixed the problem for me (I am not sure whether you can make a server ignore AD syncs for an individual user).

Ceph S3 / Swift bucket create failed / error 416

I am getting 416 errors while creating buckets using S3 or Swift. How can I solve this?
swift -A http://ceph-4:7480/auth/1.0 -U testuser:swift -K 'BKtVrq1...' upload testas testas
Warning: failed to create container 'testas': 416 Requested Range Not Satisfiable: InvalidRange
Object PUT failed: http://ceph-4:7480/swift/v1/testas/testas 404 Not Found b'NoSuchBucket'
Also an S3 Python test:
File "/usr/lib/python2.7/dist-packages/boto/s3/connection.py", line 621, in create_bucket
response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 416 Requested Range Not Satisfiable
<?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidRange</Code><BucketName>mybucket</BucketName><RequestId>tx00000000000000000002a-005a69b12d-1195-default</RequestId><HostId>1195-default-default</HostId></Error>
Here is my ceph status:
cluster:
id: 1e4bd42a-7032-4f70-8d0c-d6417da85aa6
health: HEALTH_OK
services:
mon: 3 daemons, quorum ceph-2,ceph-3,ceph-4
mgr: ceph-1(active), standbys: ceph-2, ceph-3, ceph-4
osd: 3 osds: 3 up, 3 in
rgw: 2 daemons active
data:
pools: 7 pools, 296 pgs
objects: 333 objects, 373 MB
usage: 4398 MB used, 26309 MB / 30708 MB avail
pgs: 296 active+clean
I am using the Ceph Luminous build with BlueStore:
ceph version 12.2.2 (cf0baeeeeba3b47f9427c6c97e2144b094b7e5ba) luminous (stable)
User created:
sudo radosgw-admin user create --uid="testuser" --display-name="First User"
sudo radosgw-admin subuser create --uid=testuser --subuser=testuser:swift --access=full
sudo radosgw-admin key create --subuser=testuser:swift --key-type=swift --gen-secret
Logs on osd:
2018-01-25 12:19:45.383298 7f03c77c4700 1 ====== starting new request req=0x7f03c77be1f0 =====
2018-01-25 12:19:47.711677 7f03c77c4700 1 ====== req done req=0x7f03c77be1f0 op status=-34 http_status=416 ======
2018-01-25 12:19:47.711937 7f03c77c4700 1 civetweb: 0x55bd9631d000: 192.168.109.47 - - [25/Jan/2018:12:19:45 +0200] "PUT /mybucket/ HTTP/1.1" 1 0 - Boto/2.38.0 Python/2.7.12 Linux/4.4.0-51-generic
Linux ubuntu, 4.4.0-51-generic
Set the default pg_num and pgp_num to a lower value (8, for example), or set mon_max_pg_per_osd to a high value in ceph.conf.
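For example, a sketch of possible ceph.conf settings (the numbers are illustrative and should be tuned to your cluster):
[global]
# Defaults apply to newly created pools (e.g. the RGW pools);
# small values suit a 3-OSD test cluster.
osd pool default pg num = 8
osd pool default pgp num = 8
# Alternatively, raise the per-OSD placement-group limit:
mon max pg per osd = 300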

Why does xrandr give me errors if I try and use commands on my computer, but not if I ssh into it?

When using xrandr on my device to select a resolution, I kept getting an error stating "configure crtc 0 failed:".
(Shortened) xrandr output after selecting the display and running $ xrandr:
Screen 0: minimum 8 x 8, current 1920 x 1080, maximum 32767 x 32767
DP1 disconnected (normal left inverted right x axis y axis)
DP2 connected 1920x1080+0+0 (normal left inverted right x axis y axis) 1439mm x 809mm
1920x1080 60.00*+ 50.00 59.94 30.00 24.00 29.97 23.98
4096x2160 24.00 23.98
3840x2160 30.00 25.00 24.00 29.97 23.98
1920x1080i 60.00 50.00 59.94
1680x1050 59.88
1280x720 60.00 50.00 30.00 59.94 29.97 24.00 23.98
1024x768 60.00
720x480 60.00 59.94
640x480 60.00 59.94
HDMI1 disconnected (normal left inverted right x axis y axis)
VIRTUAL1 disconnected (normal left inverted right x axis y axis)
Code I used to select a new resolution:
$ xrandr --output DP2 --mode 3840x2160
When that gave me the error, I also added the frame rate by trying both
$ xrandr --output DP2 --mode 3840x2160 30
AND
$ xrandr --output DP2 --mode 3840x2160_30
(because I wasn't sure of the proper format to add it). Both gave me the error "configure crtc 0 failed:".
This was done on the device itself. For ergonomic reasons I went back to my desk and used SSH to access the device.
I then used a custom resolution (that was the same as above) and tried to use that instead.
Steps I used for the custom resolution (minus long outputs):
$ cvt 3840 2160
$ xrandr --newmode "3840x2160_30.00" 338.75 3840 4080 4488 5136 2160 2163 2168 2200 -hsync +vsync
$ xrandr --addmode DP2 3840x2160_30.00
$ xrandr --output DP2 --mode 3840x2160_30.00
That seemed to work on my device. When my device restarts, I need to repeat the process again, though (it reverts to 1080p when I need 4K). I stuck $ xrandr --output DP2 --mode 3840x2160_30.00 into a .sh file, and now if I run it from my laptop (using SSH) it changes my screen's resolution, BUT if I try and run the .sh file from my device itself I get the "configure crtc 0 failed:" error.
You can reconfigure Xorg. I did this by creating a file in my /usr/share/X11/xorg.conf.d directory.
I made it using vim:
sudo vim /usr/share/X11/xorg.conf.d/5-monitor.conf
Here is an example of my file
Section "Monitor"
Identifier "Monitor0"
Modeline "1920x1080_60.00" 173.00 1920 2048 2248 2576 1080 1083 1088 1120 -hsync +vsync
Modeline "3840x2160_30.0" 297.00 3840 4016 4104 4400 2160 2168 2178 2250 +hsync +vsync
Modeline "4096x2160_24.0" 297.00 4096 5116 5204 5500 2160 2168 2178 2250 +hsync +vsync
EndSection
Section "Device"
Identifier "Device0"
Driver "intel"
EndSection
Section "Screen"
Identifier "Screen0"
Device "Device0"
Monitor "Monitor0"
DefaultDepth 24
SubSection "Display"
Depth 24
Modes "3840x2160" "1920x1080"
EndSubSection
EndSection
For directions on how to do this you can follow this tutorial: https://wiki.gentoo.org/wiki/Xorg/Multiple_monitors
I ran into this issue with Ubuntu 16.04.
You can try cvt -r 3840 2160 in place of cvt 3840 2160.

How to know which process (stat: T) is attached by gdb?

When a process is attached by gdb, the stat of the process is "T", like:
root 6507 0.0 0.0 67896 952 ? Ss 12:01 0:00 /mytest
root 6508 0.0 0.0 156472 7120 ? Sl 12:01 0:00 /mytest
root 26994 0.0 0.0 67896 956 ? Ss 19:59 0:00 /mytest
root 26995 0.0 0.0 156460 7116 ? Tl 19:59 0:00 /mytest
root 27833 0.0 0.0 97972 24564 pts/2 S+ 20:00 0:00 gdb /mytest
From the above, 26995 may be being debugged. How can I know whether 26995 is being debugged or not? Or can I find out which process is attached by gdb (27833)?
pstree -p 27833 only shows gdb(27833).
Another question: how can I know which gdb (PID) a process (stat: T) is attached by?
In most situations, I am not the person who is debugging the process.
The T in ps output stands for "being ptrace()d". So that process (26995) is being traced by something. That something is most often either GDB, or strace.
So yes, if you know that you are only running GDB and not strace, and if you see a single process in T state, then you know that you are debugging that process.
You could also ask GDB which process(es) it is debugging:
(gdb) info proc
(gdb) info inferiors
Update
As Matthew Slattery correctly noted, T just means the process is stopped, and not that it is being ptrace()d.
So a better solution is to do this:
grep '^TracerPid:' /proc/*/status | grep -v ':.0'
/proc/7657/status:TracerPid: 31069
From the above output you can tell that process 7657 is being traced by process 31069. This answers both "which process is being debugged" and "which debugger is debugging what".
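The same idea as a small Python sketch (scanning /proc and mapping every traced process to its tracer):
import glob

# Scan /proc/*/status and report processes with a non-zero TracerPid.
for status in glob.glob('/proc/[0-9]*/status'):
    fields = {}
    try:
        with open(status) as f:
            for line in f:
                key, _, value = line.partition(':')
                fields[key] = value.strip()
    except IOError:
        # The process may have exited while we were scanning.
        continue
    if fields.get('TracerPid', '0') != '0':
        print('%s (%s) is traced by pid %s' %
              (fields.get('Pid'), fields.get('Name'), fields['TracerPid']))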
The /proc file system is an excellent design in Linux. A lot of real-time process information can be found under /proc/{PID}/.
Another question: how can I know which gdb (PID) a process (stat: T) is attached by? In most situations, I am not the person who is debugging the process.
For this question, we can check /proc/{PID}/status file to get the answer.
root 14616 0.0 0.0 36152 908 ? Ss Jun28 0:00 /mytest
root 14617 0.5 0.0 106192 7648 ? Sl Jun28 112:45 /mytest
tachyon 2683 0.0 0.0 36132 1008 ? Ss 11:22 0:00 /mytest
tachyon 4276 0.0 0.0 76152 20728 pts/42 S+ 11:22 0:00 gdb /mytest
tachyon 2684 0.0 0.0 106136 7140 ? Tl 11:22 0:00 /mytest
host1-8>cat /proc/2684/status
Name: mytest
State: T (tracing stop)
SleepAVG: 88%
Tgid: 2684
Pid: 2684
PPid: 2683
TracerPid: 4276
.......
Thus we know 2684 is being debugged by process 4276.
You can find out this info from the ps axf output.
1357 ? Ss 0:00 /usr/sbin/sshd
1935 ? Ss 0:00 \_ sshd: root@pts/0
1994 pts/0 Ss 0:00 \_ -bash
2237 pts/0 T 0:00 \_ gdb /bin/ls
2242 pts/0 T 0:00 | \_ /bin/ls
2243 pts/0 R+ 0:00 \_ ps axf
Here process 2242 is being debugged by gdb process 2237.