knife status to list nodes where chef is failing - automation

I was looking for a way to list nodes where chef-client is failing (only failing, NOT stopped).
I checked the knife status documentation, but didn't find anything useful.
Does anyone know a better way to do this?
Thanks

The knife lastrun plugin provides a Chef report handler that records the time and status of the most recent chef-client run. It also stores the stack trace of any failed run. This is a handy plugin when running large numbers of Chef clients.
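For example, assuming the handler stores its results under a lastrun node attribute (the exact key names depend on the plugin version, so treat this as a sketch), you could list the failing nodes with a knife search:

knife search node "lastrun_status:fail*" -a lastrun.status

Nodes whose most recent run failed would then show up along with the stored status, and the recorded stack trace could be pulled the same way.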


Best way to write a Condition in a systemd unit file

I'm trying to figure out the best way to determine whether a Linux instance is Amazon Linux 2 or Red Hat Enterprise Linux 7.
I was looking at the ConditionArchitecture test; however, it does not seem to get granular enough. The other route would be to use ConditionPathExists and try to find a unique path that distinguishes AL2 from RHEL7.
[Unit]
Description=CloudPassage Halo Agent Configuration
After=network-online.target network.service
Before=cphalod.service
ConditionFileNotEmpty=!/opt/cloudpassage/data/store.db.vector
[Service]
Type=oneshot
ExecStart=/opt/cloudpassage/bin/configure --agent-key=XXXXXXXXXXXXXXXXXXXXXX --tag=XXX-XXX-XXX --proxy=proxy:3128 --dns=false
[Install]
WantedBy=multi-user.target
I basically want to add a conditional to the Service section of the unit file, saying: if it's AL2, use one agent-key and tag; if it's RHEL7, use a different agent-key and tag. Has anyone done anything similar? I've tried searching around SO, but I didn't see anything for a scenario similar to mine. If there is a better way to go about it instead of in the unit file, I'm open to suggestions.
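One possible approach, sketched here rather than taken from an actual answer: Condition directives can only gate whether the unit runs at all, not change its arguments, so point ExecStart at a small wrapper script that reads /etc/os-release and picks the right key. The wrapper path, keys, and tags below are placeholders.

#!/bin/sh
# /opt/cloudpassage/bin/configure-wrapper (hypothetical path)
# AL2 reports ID="amzn" and RHEL7 reports ID="rhel" in /etc/os-release.
. /etc/os-release
case "$ID" in
  amzn) AGENT_KEY="AL2-KEY-HERE"; TAG="al2-tag" ;;
  rhel) AGENT_KEY="RHEL7-KEY-HERE"; TAG="rhel7-tag" ;;
  *) echo "unsupported distribution: $ID" >&2; exit 1 ;;
esac
exec /opt/cloudpassage/bin/configure --agent-key="$AGENT_KEY" --tag="$TAG" --proxy=proxy:3128 --dns=false

With that in place, ExecStart=/opt/cloudpassage/bin/configure-wrapper replaces the current ExecStart line and the rest of the unit stays the same.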

Long-running chef-client executions

I'm using an open-source Chef server managing about 150 nodes.
The Analytics/Reporting module is not activated on the Chef server due to resource constraints.
chef-client runs on all the nodes every 30 minutes.
How can I find out how much time each chef-client run takes to complete?
I'm trying to find the nodes that are slowest to complete their chef-client runs.
Chef Server doesn't store this information. You'll need to manage it yourself, possibly using a handler as linked above in the comments. A simple option would be to make a handler which stores the duration of the last run as a node attribute, but the sky is the limit. If you want something to help debug long runs once you find them, check out my poise-profiler cookbook.
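A minimal sketch of such a handler, with illustrative attribute names (load it via the chef_handler cookbook or from client.rb):

require 'chef/handler'

class DurationHandler < Chef::Handler
  def report
    # elapsed_time and success? are delegated to the run_status object.
    node.normal['lastrun_duration'] = elapsed_time.round(2)
    node.normal['lastrun_success'] = success?
    # Report handlers fire after the normal end-of-run node save,
    # so persist the new attributes explicitly.
    node.save
  end
end

You could then hunt for the slow nodes with something like knife search node "lastrun_duration:*" -a lastrun_duration and sort the output.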

How to find whether a chef-client run was successful or not

I want to perform Chef operations using the API from my own (Java) program, and I would like to know whether my chef-client run was successful or not.
What is the best way to find this? Does Chef maintain any attribute or store the status of the most recent chef-client run?
You can get the timestamp of the last successful run from the node data (key is ohai_time), but that's about it for vanilla Chef. More likely what you want is the information for specific runs, which you could get from the Reporting system (part of the Premium add-ons) or by making a custom report/error handler to ship the data to your own system.
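For example, a quick way to inspect that timestamp from the command line (the node name is illustrative):

knife node show web1.example.com -a ohai_time

ohai_time is a Unix epoch timestamp, and since the node object is only saved at the end of a successful run, it effectively marks the last successful chef-client run.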

How to submit code to a remote Spark cluster from IntelliJ IDEA

I have two clusters, one in a local virtual machine and another in a remote cloud. Both clusters are in standalone mode.
My Environment:
Scala: 2.10.4
Spark: 1.5.1
JDK: 1.8.40
OS: CentOS Linux release 7.1.1503 (Core)
The local cluster:
Spark Master: spark://local1:7077
The remote cluster:
Spark Master: spark://remote1:7077
I want to finish this:
Write code (just a simple word count) in IntelliJ IDEA locally (on my laptop), set the Spark master URL to spark://local1:7077 or spark://remote1:7077, and then run my code in IntelliJ IDEA. That is, I don't want to use spark-submit to submit a job.
But I got some problem:
When I use the local cluster, everything goes well. Running the code in IntelliJ IDEA or submitting with spark-submit both get the job to the cluster and the job finishes.
But when I use the remote cluster, I get a warning in the log:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Note that it says sufficient resources, not sufficient memory!
This log line keeps printing and nothing further happens. Both spark-submit and running the code in IntelliJ IDEA give the same result.
I want to know:
Is it possible to submit code from IntelliJ IDEA to the remote cluster?
If so, what configuration is needed?
What are the possible causes of my problem?
How can I solve it?
Thanks a lot!
Update
There is a similar question here, but I think my situation is different. When I run my code in IntelliJ IDEA with the Spark master set to the local virtual machine cluster, it works. With the remote cluster, I get the Initial job has not accepted any resources;... warning instead.
I want to know whether a security policy or a firewall could cause this?
Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the least there is a variety of environment settings and considerations, handled by the spark-submit script, that are quite difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads within the Spark developer community on the topic.
My answer here is about a portion of your post, specifically:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The typical reason is a mismatch between the memory and/or number of cores requested by your job and what is available on the cluster. Possibly, when submitting from IntelliJ, the settings in
$SPARK_HOME/conf/spark-defaults.conf
do not match the parameters required for your task on the existing cluster. You may need to update:
spark.driver.memory 4g
spark.executor.memory 8g
spark.executor.cores 8
You can check the Spark UI on port 8080 to verify that the parameters you requested are actually available on the cluster.
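If you do get the job submitted from the IDE, one more thing to watch: the remote executors also need your application classes. A minimal word-count sketch (Spark 1.5 / Scala 2.10; the jar path and input path are illustrative) that sets the master and resource limits explicitly instead of relying on spark-defaults.conf:

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("WordCount")
      .setMaster("spark://remote1:7077")
      .set("spark.executor.memory", "1g") // keep within what the workers advertise
      .set("spark.cores.max", "2")
      // Ship the packaged jar so remote executors can load your classes;
      // this is one of the things spark-submit normally does for you.
      .setJars(Seq("target/scala-2.10/wordcount_2.10-1.0.jar"))
    val sc = new SparkContext(conf)
    val counts = sc.textFile("hdfs:///tmp/input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)
    counts.take(10).foreach(println)
    sc.stop()
  }
}

You would still need to run sbt package first so that the jar passed to setJars exists.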

OpenSplice DDS across a network

I am completely new to the DDS world. I understand basic concepts like publish and subscribe, and the stuff that can be gained from the documentation. I am attempting to use OpenSplice DDS, and am able to get through the tutorial without much difficulty. However, I want to get two different computers on the same network to talk to each other, which seems like a relatively simple task, but I can find no documentation on it.
For example, in the message chat room tutorial, how would I get the message board running on one machine and the chatter on another machine?
Thanks!
Found it! http://opensplice.org/pipermail/developer/2009-July/000094.html.
To summarize from the link:
Set up your environment on node 1 by running the release file in the OSPL_HOME directory (release.bat)
Start the OpenSplice daemon on node 1 (ospl start)
Run the message board application on node 1
Set up your environment on node 2 by running the release file in the OSPL_HOME directory (release.bat)
Start the OpenSplice daemon on node 2 (ospl start)
Run the chatter application on node 2
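Condensed, the same steps as commands (a sketch: the executable names follow the tutorial build, and on Unix installs you would source the release script instead of running release.bat):

node1> release.bat
node1> ospl start
node1> messageboard

node2> release.bat
node2> ospl start
node2> chatter

Both daemons must use the same DDS domain configuration for the two applications to see each other.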