OpsCenter backup fails while trying to snapshot a non-existent SSTable

I'm running a DSE cluster in AWS: m2.4xlarge instances running DataStax Enterprise 4.6.1, with Cassandra 2.0.12.200 and OpsCenter 5.1.0.
When we try to do a backup of a keyspace, we get this:
Snapshot of keyspaces [XXXXXXX] on node XXX.XXX.XXX.XXX failed: javax.management.RuntimeMBeanException: java.lang.RuntimeException: Tried to hard link to file that does not exist /raid0/cassandra/data/XXXXXX/XXXXXX/XXXXXXXXXXXX-jb-1-Index.db
Any ideas?

This is likely the following known issue:
https://issues.apache.org/jira/browse/CASSANDRA-6433
The workaround for this is a rolling restart, and it is fixed in Cassandra 2.1. It seems to be triggered when you drop a keyspace and re-create it.
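If you go the rolling-restart route, the per-node step is roughly the following (a sketch, assuming a packaged DSE install where the service is named dse):
nodetool drain                # flush memtables and stop accepting new writes on this node
sudo service dse restart      # restart DSE
nodetool status               # wait for the node to report UN (Up/Normal) before moving to the next one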

I had a similar issue with a dropped keyspace and compaction.
Run a "nodetool repair" and check the "system" and "opscenter" keyspaces for traces of the deleted keyspace. You might need to manually remove stale rows.


s3distcp fails with "mapreduce_shuffle does not exist"

When I run the command below,
s3-dist-cp --src s3://test/9.19 --dest hdfs:///user/hadoop/test
I got an error about auxService.
20/02/03 07:52:13 INFO mapreduce.Job: Task Id : attempt_1580716305878_0001_m_000000_2, Status : FAILED
Container launch failed for container_1580716305878_0001_01_000004 : org.apache.hadoop.yarn.exceptions.InvalidAuxServiceException: The auxService:mapreduce_shuffle does not exist
In many Q&As, I found a solution like this link.
But there is no NodeManager process running.
[hadoop@ip-172-31-37-115 ~]$ initctl list | grep yarn
hadoop-yarn-timelineserver start/running, process 8149
hadoop-yarn-resourcemanager start/running, process 17331
hadoop-yarn-proxyserver start/running, process 8147
My EMR cluster was created from the quick-create menu with emr-5.28.0.
Does anyone know about this problem?
Thanks!
I'm sure there's some way to update the configs, but what I did was create a cluster using the 'advanced' setup and choose these software packages:
Ganglia
Hive
Hue
Mahout
Pig
Tez
Spark
Hadoop
(8 in total)
Most of those, except Spark, are installed with the default settings (the first radio button for software packages in quick setup). One of these software packages, or something related to it, is what causes s3-dist-cp to be installed, and with that setup I was able to use it with no problems.
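If you would rather adjust the existing configuration than recreate the cluster, the setting the linked answers point at is the YARN shuffle aux-service. A hedged sketch (the property names are the standard Hadoop ones; the instance types and counts below are placeholders):
# Check on a core/task node whether the shuffle aux-service is declared:
grep -A 1 'yarn.nodemanager.aux-services' /etc/hadoop/conf/yarn-site.xml

# At cluster-creation time the same properties can be supplied through an EMR
# configuration classification instead of editing files by hand:
aws emr create-cluster \
  --release-label emr-5.28.0 \
  --applications Name=Hadoop Name=Spark \
  --instance-type m5.xlarge --instance-count 3 \
  --use-default-roles \
  --configurations '[{
    "Classification": "yarn-site",
    "Properties": {
      "yarn.nodemanager.aux-services": "mapreduce_shuffle",
      "yarn.nodemanager.aux-services.mapreduce_shuffle.class": "org.apache.hadoop.mapred.ShuffleHandler"
    }
  }]'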

YARN not getting nodes

This is in an AWS EMR cluster with 2 task nodes and a master.
I'm trying the hello-samza project, which launches a YARN job. The job gets stuck in the ACCEPTED state. I looked in other posts, and it seems that my YARN is getting no nodes. Any help on why YARN is not getting the task nodes would be appreciated.
[hadoop@xxx hello-samza]$ deploy/yarn/bin/yarn node -list
17/04/18 23:30:45 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
Total Nodes:0
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
[hadoop@xxx hello-samza]$ deploy/yarn/bin/yarn application -list -appStates ALL
17/04/18 23:26:30 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
Total number of applications (application-types: [] and states: [NEW, NEW_SAVING, SUBMITTED, ACCEPTED, RUNNING, FINISHED, FAILED, KILLED]):1
Application-Id Application-Name Application-Type User Queue State Final-State Progress Tracking-URL
application_1492557889328_0001 wikipedia-parser_1 Samza hadoop default ACCEPTED UNDEFINED 0% N/A
I wrote a complete answer for a similar case I was experiencing: have a look at it, it might be this kind of configuration issue.
It seems like the NodeManagers are not running on either node (either not started at all or exited with an error). Use the jps command to check whether all the daemons associated with YARN are running on the two nodes. Additionally, check both NodeManager logs to see if any exception might have killed them.
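Something along these lines on each worker node helps narrow it down (a sketch; the log path is the usual EMR/Hadoop location and may differ on your cluster):
# NodeManager should show up in the jps output on every worker node:
jps | grep -i nodemanager

# If it is missing or keeps dying, look for the exception in its log:
tail -n 200 /var/log/hadoop-yarn/yarn-yarn-nodemanager-*.log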

Cannot connect Impala-Kudu to Apache Kudu (without Cloudera Manager): Get TTransportException Error

I have successfully installed Kudu on Ubuntu (Trusty) as per the official Kudu documentation (see http://kudu.apache.org/docs/installation.html ). The setup has one node running the master and a tablet server and another node running a tablet server only. I am having issues installing impala-kudu without Cloudera Manager on the node running the Kudu master. I have followed the CDH installation instructions on this page (see http://www.cloudera.com/documentation/enterprise/latest/topics/cdh_ig_cdh5_install.html ) until Step 3. I have avoided installing CDH with YARN and MRv1, as I don't need to run any MapReduce jobs and will not be using Hadoop. impala-kudu and impala-kudu-shell installed without errors. When I launch the impala-shell it returns:
Starting Impala Shell without Kerberos authentication
Error connecting: TTransportException, Could not connect to kudu_test:21000
***********************************************************************************
Welcome to the Impala shell. Copyright (c) 2015 Cloudera, Inc. All rights reserved.
(Impala Shell v2.7.0-cdh5-IMPALA_KUDU-cdh5 (48f1ad3) built on Thu Aug 18 12:15:44 PDT 2016)
Want to know what version of Impala you're connected to? Run the VERSION command to find out!
***********************************************************************************
[Not connected] >
I have tried to use the CONNECT option to connect to the kudu-master node without success. Both impala-kudu and Kudu are running on the same machine. Are there additional configuration settings which need to be changed, or are Hadoop and YARN a strict requirement to make impala-kudu work?
After running ps -ef | grep -i impalad I can confirm the Impala daemon is not running. After navigating to the Impala logs at ~/var/log/impala I find a few error and warning files. Here is the output of impalad.ERROR:
Log file created at: 2016/09/13 13:26:24
Running on machine: kudu_test
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
E0913 13:26:24.084389 3021 logging.cc:118] stderr will be logged to this file.
E0913 13:26:25.406966 3021 impala-server.cc:249] Currently configured default filesystem: LocalFileSystem. fs.defaultFS (file:///) is not supported.
ERROR: block location tracking is not properly enabled because
- dfs.datanode.hdfs-blocks-metadata.enabled is not enabled.
- dfs.client.file-block-storage-locations.timeout.millis is too low. It should be at least 10 seconds.
E0913 13:26:25.406990 3021 impala-server.cc:252] Aborting Impala Server startup due to improper configuration. Impalad exiting.
Maybe I need to revisit HDFS and the Hive Metastore to ensure I have these services configured properly?
According to the log, impalad quits because the default filesystem is configured to be LocalFileSystem, which is not supported. You have to set a distributed filesystem, such as HDFS, as the default.
Although Kudu is a separate storage system and does not rely on HDFS, Impala still seems to require a non-local default FS even when used with Kudu. The Impala_Kudu documentation explicitly lists the following requirement:
Before installing Impala_Kudu, you must have already installed and configured services for HDFS (though it is not used by Kudu), the Hive Metastore (where Impala stores its metadata), and Kudu.
I can even imagine that HDFS may not really be needed for any reason other than to make Impala happy, but this is just speculation on my side. Update: I found IMPALA-1850, which confirms my suspicion that HDFS should not be needed for Impala any more, but it's not just a single check that has to be removed.
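In practice that means pointing fs.defaultFS at an HDFS namenode and enabling the block-location settings the log complains about, then restarting impalad. A sketch, assuming CDH-style config and log locations (host, port, and paths are placeholders):
# /etc/hadoop/conf/core-site.xml should contain something like:
#   <property><name>fs.defaultFS</name><value>hdfs://kudu_test:8020</value></property>
# /etc/hadoop/conf/hdfs-site.xml needs the settings the error message names:
#   <property><name>dfs.datanode.hdfs-blocks-metadata.enabled</name><value>true</value></property>
#   <property><name>dfs.client.file-block-storage-locations.timeout.millis</name><value>10000</value></property>
# Then restart the Impala daemon and re-check the log:
sudo service impala-server restart
tail -n 50 /var/log/impala/impalad.ERROR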

YARN error: TaskAttempt killed because it ran on unusable node ... Container released on a *lost* node

I am using CDH 5.4 with Pig 0.12. I am getting a lot of this error from all nodes:
TaskAttempt killed because it ran on unusable nodename:portnumber Container released on a *lost* node
What does this mean? In particular, what does "lost" mean here? It doesn't look like the node is really lost from the cluster. Another (more important) question is how to resolve this issue. Any help would be appreciated.
This particular case turned out to be a data storage problem. I restarted the NodeManager on the nodes that were reported lost with the message "1/1 local-dirs are bad: /data/hadoop/yarn/local;".
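For anyone hitting the same message, the checks behind that fix look roughly like this (a sketch; the service name is the usual CDH package name and an assumption about your layout):
# Is the local dir YARN flagged as bad full or unwritable?
df -h /data/hadoop/yarn/local
ls -ld /data/hadoop/yarn/local

# After freeing space or fixing permissions, restart the NodeManager on that host:
sudo service hadoop-yarn-nodemanager restart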

Redis Server doesn't start or do anything - Redis-64 on Windows

I'm following the steps outlined in this link; however, when I try to start the server nothing happens, nor can I connect to anything from the client. Does anyone know how to run this?
When I try from a command prompt instead of double-clicking redis-server.exe, I get this message:
[11868] 23 Jul 11:58:26.325 # QForkMasterInit: system error caught. error code=0x000005af, message=VirtualAllocEx failed.: unknown error
http://bartwullems.blogspot.ca/2013/07/unofficial-redis-for-windows.html
The easiest way to install Redis is through NuGet:
Open Visual Studio
Create an empty solution so that NuGet knows where to put the packages
Go to the Package Manager Console: Tools -> Library Package Manager -> Package Manager Console
Type Install-Package Redis-64
Go to the Packages folder and browse to the Tools folder. Here you'll find redis-server.exe. Double-click it to start it.
Redis is ready to use and starts listening on a specific port (6379 in my case).
Let’s open up a client and try to put a value into Redis. Start Redis-cli.exe. It already connects to the same port by default.
Add a value by executing a SET command, then read the value back with GET.
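A minimal round trip from the client looks like this (key and value are arbitrary placeholders); SET replies OK and GET returns the stored value:
redis-cli SET greeting "hello"
redis-cli GET greeting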
Try to run with redis-server --maxheap 4000000
Miguel is correct, but it is not that simple. To start redis-server either as a service or from the command prompt, the amount of available RAM and disk space must be sufficient for Redis to run as configured.
Now, if no configuration file is specified when running Redis, it will use the default configuration values. All of this is documented in the redis.windows.conf file as well as in the document "Redis on Windows.docx" (both deployed with the redis installation).
In my experience, errors when starting Redis usually come from a lack of available resources (RAM or disk space) or some incorrect configuration of the maxheap or maxmemory parameters.
To troubleshoot this kind of behavior, check your system's available resources and try running redis-server from the command line, varying the maxmemory, maxheap, and/or heapdir parameters. Setting the loglevel parameter to verbose might also help diagnose the issue.
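For example, a hedged experiment from the command prompt, using the config file shipped with the package; the sizes and directory are arbitrary placeholders, so pick values that fit the RAM and disk you actually have free:
redis-server redis.windows.conf --maxheap 128mb --maxmemory 100mb --heapdir d:\rediscache --loglevel verbose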
Regards