Unable to upload a file into OpenStack Swift 2.14.0 on a Raspberry Pi 3 because of "[Errno 13] Permission denied"
Creating and erasing buckets (containers) in my OpenStack Swift 2.14.0 installation works well. It is a Swift-only installation; no further OpenStack services such as Keystone have been deployed.
$ swift stat
Account: AUTH_test
Containers: 2
Objects: 0
Bytes: 0
Containers in policy "policy-0": 2
Objects in policy "policy-0": 0
Bytes in policy "policy-0": 0
Connection: keep-alive
...
$ swift post s3perf
$ swift list -A http://10.0.0.253:8080/auth/v1.0 -U test:tester -K testing
bucket
s3perf
These are the (positive) log messages regarding bucket creation in the file storage1.error:
$ tail -f /var/log/swift/storage1.error
...
May 9 13:58:50 raspberrypi container-server: STDERR: (1114) accepted
('127.0.0.1', 38118)
May 9 13:58:50 raspberrypi container-server: STDERR: 127.0.0.1 - -
[09/May/2017 11:58:50] "POST /d1/122/AUTH_test/s3perf HTTP/1.1" 204 142
0.021630 (txn: tx982eb25d83624b37bd290-005911aefa)
But any attempt to upload a file fails with the error message [Errno 13] Permission denied.
$ swift upload s3perf s3perf-testfile1.txt
Object PUT failed: http://10.0.0.253:8080/v1/AUTH_test/s3perf/s3perf-testfile1.txt
503 Service Unavailable [first 60 chars of response] <html><h1>Service
Unavailable</h1><p>The server is currently
$ tail -f /var/log/swift/storage1.error
...
May 18 20:55:44 raspberrypi object-server: STDERR: (927) accepted
('127.0.0.1', 45684)
May 18 20:55:44 raspberrypi object-server: ERROR __call__ error with PUT
/d1/40/AUTH_test/s3perf/testfile : #012Traceback (most recent call
last):#012 File "/home/pi/swift/swift/obj/server.py", line 1105, in
__call__#012 res = getattr(self, req.method)(req)#012 File
"/home/pi/swift/swift/common/utils.py", line 1626, in _timing_stats#012
resp = func(ctrl, *args, **kwargs)#012 File
"/home/pi/swift/swift/obj/server.py", line 814, in PUT#012
writer.put(metadata)#012 File "/home/pi/swift/swift/obj/diskfile.py",
line 2561, in put#012 super(DiskFileWriter, self)._put(metadata,
True)#012 File "/home/pi/swift/swift/obj/diskfile.py", line 1566, in
_put#012 tpool_reraise(self._finalize_put, metadata, target_path,
cleanup)#012 File "/home/pi/swift/swift/common/utils.py", line 3536, in
tpool_reraise#012 raise resp#012IOError: [Errno 13] Permission denied
(txn: txfbf08bffde6d4657a72a5-00591dee30)
May 18 20:55:44 raspberrypi object-server: STDERR: 127.0.0.1 - -
[18/May/2017 18:55:44] "PUT /d1/40/AUTH_test/s3perf/testfile HTTP/1.1"
500 875 0.015646 (txn: txfbf08bffde6d4657a72a5-00591dee30)
The proxy.error file also contains an error message: ERROR 500 Expect: 100-continue From Object Server.
May 18 20:55:44 raspberrypi proxy-server: ERROR 500 Expect: 100-continue
From Object Server 127.0.0.1:6010/d1 (txn: txfbf08bffde6d4657a72a5-
00591dee30) (client_ip: 10.0.0.220)
I have started Swift as user pi and assigned these folders to this user:
$ sudo chown pi:pi /etc/swift
$ sudo chown -R pi:pi /mnt/sdb1/*
$ sudo chown -R pi:pi /var/cache/swift
$ sudo chown -R pi:pi /var/run/swift
sdb1 is a loopback device with an XFS file system.
$ mount | grep sdb1
/srv/swift-disk on /mnt/sdb1 type xfs (rw,noatime,nodiratime,attr2,nobarrier,inode64,logbufs=8,noquota)
$ ls -ld /mnt/sdb1/1/
drwxr-xr-x 3 pi pi 17 May 12 13:14 /mnt/sdb1/1/
I deployed Swift this way.
I wonder why creating buckets (containers) works but uploading a file fails with Permission denied.
Update 1:
$ sudo swift-ring-builder /etc/swift/account.builder
/etc/swift/account.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/account.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6012 127.0.0.1:6012 d1 1.00 256 0.00
$ sudo swift-ring-builder /etc/swift/container.builder
/etc/swift/container.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/container.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6011 127.0.0.1:6011 d1 1.00 256 0.00
$ sudo swift-ring-builder /etc/swift/object.builder
/etc/swift/object.builder, build version 2
256 partitions, 1.000000 replicas, 1 regions, 1 zones, 1 devices, 0.00 balance, 0.00 dispersion
The minimum number of hours before a partition can be reassigned is 0 (0:00:00 remaining)
The overload factor is 0.00% (0.000000)
Ring file /etc/swift/object.ring.gz is up-to-date
Devices: id region zone ip address:port replication ip:port name weight partitions balance flags meta
0 1 1 127.0.0.1:6010 127.0.0.1:6010 d1 1.00 256 0.00
Update 2
The required ports are open.
$ nmap localhost -p 6010,6011,6012,8080,22
...
PORT STATE SERVICE
22/tcp open ssh
6010/tcp open x11
6011/tcp open unknown
6012/tcp open unknown
8080/tcp open http-proxy
Update 3
I can write as user pi inside the folder where Swift should store the objects.
$ whoami
pi
$ touch /srv/1/node/d1/objects/test
$ ls -l /srv/1/node/d1/objects/test
-rw-r--r-- 1 pi pi 0 May 13 22:59 /srv/1/node/d1/objects/test
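Note that touch only tests plain file creation, while Swift's object server additionally writes metadata into an extended attribute before renaming the temporary file into place. A small diagnostic of my own (not Swift code; the metadata value is a dummy) that imitates that write path and reports which step fails:

```python
import os
import tempfile

def probe_object_write(directory):
    """Imitate Swift's object write path: write a temp file,
    set a user.* extended attribute on it, then rename it
    into place.  Returns "ok" or a message naming the step
    that failed."""
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    final_path = tmp_path + ".data"
    try:
        os.write(fd, b"test payload")
        try:
            # Swift stores pickled object metadata in the
            # user.swift.metadata extended attribute.
            os.setxattr(fd, "user.swift.metadata", b"dummy")
        except OSError as exc:
            return "xattr failed: %s" % exc
        os.rename(tmp_path, final_path)
        return "ok"
    finally:
        os.close(fd)
        for path in (tmp_path, final_path):
            if os.path.exists(path):
                os.remove(path)

# On the Pi, point this at the object device instead:
# probe_object_write("/mnt/sdb1/1/node/d1/objects")
print(probe_object_write(tempfile.gettempdir()))
```

If the xattr step fails here while touch succeeds, that would explain why Swift alone gets Permission denied.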
Update 4
All Swift processes run as user pi.
$ ps aux | grep swift
pi 944 3.2 2.0 24644 20100 ? Ss May12 65:14 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 945 3.1 2.0 25372 20228 ? Ss May12 64:30 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 946 3.1 1.9 24512 19416 ? Ss May12 64:03 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 947 3.1 1.9 23688 19320 ? Ss May12 64:04 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1000 0.0 1.7 195656 17844 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1001 0.0 1.8 195656 18056 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1002 0.0 1.6 23880 16772 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1003 0.0 1.7 195656 17848 ? Sl May12 0:01 /usr/bin/python /usr/local/bin/swift-object-server /etc/swift/object-server.conf
pi 1004 0.0 1.7 24924 17504 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1005 0.0 1.6 24924 16912 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1006 0.0 1.8 24924 18368 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1007 0.0 1.8 24924 18208 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-account-server /etc/swift/account-server.conf
pi 1008 0.0 1.8 25864 18824 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1009 0.0 1.8 25864 18652 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1010 0.0 1.7 25864 17340 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1011 0.0 1.8 25864 18772 ? S May12 0:01 /usr/bin/python /usr/local/bin/swift-container-server /etc/swift/container-server.conf
pi 1012 0.0 1.8 24644 18276 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1013 0.0 1.8 24900 18588 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1014 0.0 1.8 24900 18588 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
pi 1015 0.0 1.8 24900 18568 ? S May12 0:03 /usr/bin/python /usr/local/bin/swift-proxy-server /etc/swift/proxy-server.conf
Update 5
When I create a bucket, the Swift service creates a folder like this one:
/mnt/sdb1/1/node/d1/containers/122/9d5/7a23d9409f11da3062432c6faa75f9d5/
and this folder contains a db-file like 7a23d9409f11da3062432c6faa75f9d5.db. I think this is the correct behavior.
But when I try to upload a file into a bucket, Swift creates just an empty folder like this one:
/mnt/sdb1/1/node/d1/objects/139/eca/8b17958f984943fc97b6b937061d2eca
I can create files inside these empty folders via touch or echo as user pi, but for an unknown reason Swift does not store objects inside these folders.
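For context, the folder name is not arbitrary: Swift derives it from a salted MD5 over the object path, and the partition and "suffix" directories come from that same hash. Below is a rough sketch of the idea behind swift.common.utils.hash_path (the real prefix/suffix salts live in /etc/swift/swift.conf; the "changeme" value here is only a placeholder assumption):

```python
import hashlib

def hash_path(account, container=None, obj=None,
              prefix=b"", suffix=b"changeme"):
    """Rough sketch of Swift's hash_path(): a salted MD5 over
    /account[/container[/object]].  The hex digest names the
    final directory; its last three characters name the
    'suffix' directory above it."""
    parts = [account]
    if container:
        parts.append(container)
    if obj:
        parts.append(obj)
    raw = prefix + b"/" + b"/".join(p.encode("utf-8") for p in parts) + suffix
    return hashlib.md5(raw).hexdigest()

digest = hash_path("AUTH_test", "s3perf", "testfile")
print(digest, digest[-3:])
```

That is why the empty directory 8b17958f984943fc97b6b937061d2eca sits under a suffix directory named eca: Swift got far enough to create the hashed directory, but failed before writing the object file into it.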
Update 6
In order to investigate this issue further, I installed Swift according to the SAIO - Swift All In One instructions, one time inside a VMware ESXi virtual machine with Ubuntu 14.04 LTS and another time inside Raspbian on a Raspberry Pi 3. The result is that inside the Ubuntu 14.04 VM, Swift works perfectly, but when running on top of the Raspberry Pi, uploading files does not work.
Object PUT failed: http://10.0.0.253:8080/v1/AUTH_test/s3perf-testbucket/testfiles/s3perf-testfile1.txt
503 Service Unavailable [first 60 chars of response]
<html><h1>Service Unavailable</h1><p>The server is currently
The storage1.error log file still says:
May 24 13:15:15 raspberrypi object-server: ERROR __call__ error with PUT
/sdb1/484/AUTH_test/s3perf-testbucket/testfiles/s3perf-testfile1.txt :
#012Traceback (most recent call last):#012 File
"/home/pi/swift/swift/obj/server.py", line 1105, in __call__#012 res =
getattr(self, req.method)(req)#012 File
"/home/pi/swift/swift/common/utils.py", line 1626, in _timing_stats#012
resp = func(ctrl, *args, **kwargs)#012 File
"/home/pi/swift/swift/obj/server.py", line 814, in PUT#012
writer.put(metadata)#012 File "/home/pi/swift/swift/obj/diskfile.py",
line 2561, in put#012 super(DiskFileWriter, self)._put(metadata,
True)#012 File "/home/pi/swift/swift/obj/diskfile.py", line 1566, in
_put#012 tpool_reraise(self._finalize_put, metadata, target_path,
cleanup)#012 File "/home/pi/swift/swift/common/utils.py", line 3536, in
tpool_reraise#012 raise resp#012IOError: [Errno 13] Permission denied
(txn: txdfe3c7f704be4af8817b3-0059256b43)
Update 7
The issue is still not fixed, but I now have a working Swift service on the Raspberry Pi. I installed the (quite outdated) Swift release 2.2.0, which is shipped with Raspbian, and it works well. The steps I did are explained here.
Based on the information you provided, the error occurs while writing metadata.
The operations for writing metadata fall into two categories: manipulating the inode and manipulating extended attributes. So there are two possible sources for your error.
First, it may be an inode-related error. This can occur because the device is mounted with the inode64 option. According to the XFS man page:
If applications are in use which do not handle inode numbers bigger than 32 bits, the inode32 option should be specified.
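Whether the mount is already handing out inode numbers wider than 32 bits can be checked from Python (a small diagnostic of my own; the commented-out path is an assumption and should point at something on the XFS mount):

```python
import os

def has_64bit_inode(path):
    """True if the inode number of `path` does not fit in
    32 bits, i.e. the filesystem behaves as if mounted with
    inode64."""
    return os.stat(path).st_ino > 0xFFFFFFFF

# e.g. on the Pi:
# has_64bit_inode("/mnt/sdb1/1/node/d1")
print(has_64bit_inode("/"))
```

If this returns True for files on the object device, remounting with inode32 would be worth a try.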
Second, it may be an extended-attributes-related error. You can use the xattr package of Python to write extended attributes and check whether an exception occurs.
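A minimal probe along those lines, using the stdlib os.setxattr (the pyxattr package performs the same syscall), which also distinguishes a permission problem from missing user-xattr support by errno:

```python
import errno
import os
import tempfile

def xattr_probe(directory):
    """Set a user.* extended attribute on a fresh file in
    `directory` and classify the result."""
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        try:
            os.setxattr(path, "user.swift.test", b"x")
            return "xattr ok"
        except OSError as exc:
            if exc.errno == errno.EACCES:
                return "permission denied (EACCES)"
            if exc.errno == errno.EPERM:
                return "operation not permitted (EPERM)"
            if exc.errno == errno.ENOTSUP:
                return "no user xattr support (ENOTSUP)"
            raise
    finally:
        os.close(fd)
        os.remove(path)

# Run this as user pi against the object device, e.g.:
# xattr_probe("/mnt/sdb1/1/node/d1/objects")
print(xattr_probe(tempfile.gettempdir()))
```

Run against the directory from your traceback; the returned classification should tell you which of the two categories you are in.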