With the same application code, I get the JVM metrics to appear on machine A but not on B.
On machine B I still get many of the metrics mentioned in the Finagle docs, but not the JVM ones.
Is there any JVM setting, or another environment setting, that would enable/disable JVM metrics?
TL;DR
JvmStats.register(statsReceiver)
It turns out I had a wrong assumption in "the same application code is running on A and B."
My app starts up differently on machine B because it runs in a web container: it skips the normal TwitterServer startup lifecycle, which you get when you extend TwitterServer and use its main() method.
Somewhere along that lifecycle JvmStats.register gets called; that's how machine A had JVM metrics. I finally managed to get them on B by adding that one line myself, as in the sketch below.
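For anyone in the same situation, here is a minimal sketch of doing that registration yourself from a servlet context listener. LoadedStatsReceiver is the global receiver that finagle-stats exports to; note that the package of JvmStats has moved between Finagle/TwitterServer versions, so adjust that import to match your dependency:

import javax.servlet.{ServletContextEvent, ServletContextListener}
import com.twitter.finagle.stats.LoadedStatsReceiver
// Assumption: adjust this import to wherever JvmStats lives in your
// Finagle/TwitterServer version.
import com.twitter.server.util.JvmStats

class MetricsBootstrapListener extends ServletContextListener {
  // Register the JVM gauges (heap, GC, threads, ...) once at startup,
  // mirroring what the TwitterServer lifecycle normally does for you.
  override def contextInitialized(sce: ServletContextEvent): Unit =
    JvmStats.register(LoadedStatsReceiver)

  override def contextDestroyed(sce: ServletContextEvent): Unit = ()
}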
I have two clusters: one in a local virtual machine and another in a remote cloud. Both clusters run in Standalone mode.
My Environment:
Scala: 2.10.4
Spark: 1.5.1
JDK: 1.8.40
OS: CentOS Linux release 7.1.1503 (Core)
The local cluster:
Spark Master: spark://local1:7077
The remote cluster:
Spark Master: spark://remote1:7077
I want to do the following:
Write code (just a simple word count, sketched below) locally in IntelliJ IDEA (on my laptop), set the Spark master URL to spark://local1:7077 or spark://remote1:7077, and then run the code from IntelliJ IDEA. That is, I don't want to use spark-submit to submit a job.
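The code is roughly the following (a minimal sketch; the app name and input path are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

object WordCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("word-count")            // placeholder app name
      .setMaster("spark://remote1:7077")   // or spark://local1:7077
    val sc = new SparkContext(conf)

    // "hdfs:///input.txt" is a placeholder input path
    val counts = sc.textFile("hdfs:///input.txt")
      .flatMap(_.split("\\s+"))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    counts.take(10).foreach(println)
    sc.stop()
  }
}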
But I ran into a problem:
When I use the local cluster, everything goes well: both running the code from IntelliJ IDEA and using spark-submit can submit the job to the cluster and finish it.
But when I use the remote cluster, I get a warning in the log:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
(Note that it says sufficient resources, not sufficient memory!)
And this log keeps printing, with no further progress. Both spark-submit and running the code from IntelliJ IDEA give the same result.
I want to know:
Is it possible to submit code from IntelliJ IDEA to the remote cluster?
If so, is any extra configuration needed?
What are the possible causes of my problem?
How can I fix it?
Thanks a lot!
Update
There is a similar question here, but I think my situation is different. When I run my code in IntelliJ IDEA with the Spark master set to the local virtual machine cluster, it works; against the remote cluster I only get the Initial job has not accepted any resources;... warning instead.
I want to know whether a security policy or firewall could cause this.
Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the least there is a variety of environment settings and considerations - handled by the spark-submit script - that are quite difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads on the topic within the Spark developer community.
My answer here addresses a portion of your post, specifically the
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
The reason is typically a mismatch between the memory and/or number of cores your job requests and what is available on the cluster. Possibly, when submitting from IntelliJ, the settings in
$SPARK_HOME/conf/spark-defaults.conf
do not match the parameters required for your task on the existing cluster. You may need to update:
spark.driver.memory 4g
spark.executor.memory 8g
spark.executor.cores 8
You can check the Spark UI on port 8080 to verify that the resources you requested are actually available on the cluster.
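Note that when the job is launched straight from IntelliJ rather than through spark-submit, the spark-defaults.conf on your laptop is not necessarily picked up, so another option is to set the equivalent values on the SparkConf itself. A sketch, with example values you would adjust to what your remote workers actually advertise:

import org.apache.spark.{SparkConf, SparkContext}

// Example values only: match them to the memory and cores shown
// for your workers in the master UI on port 8080.
val conf = new SparkConf()
  .setAppName("word-count")
  .setMaster("spark://remote1:7077")
  .set("spark.executor.memory", "1g")   // per-executor memory
  .set("spark.cores.max", "2")          // total cores for this app (standalone)

val sc = new SparkContext(conf)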
My TeamCity build server has the following JVM arguments:
-Xmx512m -XX:MaxPermSize=270m
Sometimes it shows a memory problem message like "TeamCity server memory usage for PS Old Gen pool exceeded 91% of 341 MB maximum available. 437 MB used of 506 MB total heap available. See the TeamCity documentation for possible solutions."
I read here https://confluence.jetbrains.com/display/TCD8/Installing+and+Configuring+the+TeamCity+Server#InstallingandConfiguringtheTeamCityServer-SettingUpMemorysettingsforTeamCityServer that the minimum recommended settings are: -Xmx750m -XX:MaxPermSize=270m.
How/where do I change this setting?
In TeamCity 9+ it is possible to set this variable in the TeamCity server GUI:
Administration -> Diagnostics -> Internal Properties -> Edit internal properties
For 64-bit JVM the recommended setting is:
TEAMCITY_SERVER_MEM_OPTS=-Xmx4g -XX:MaxPermSize=270m -XX:ReservedCodeCacheSize=350m
Just add this line to the Internal properties edit box
I would recommend adding the JVM memory options to the startup script (start.sh) for server-based startup, using the variable TEAMCITY_SERVER_MEM_OPTS. Please do not set it in the profile of the user ID that runs TeamCity.
This link should be helpful to you.
In case you want different memory settings for the server and the agent (usually that's the case), be selective in naming the variables so that the JVM options for server startup and agent startup can be distinguished.
As a rule of thumb for TeamCity setups, I normally give my TeamCity server about 20% more memory than my average usage, to account for increased load during peak usage periods.
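For example, in the startup script mentioned above (sizes here are just the documented minimums; adjust them to your own usage):

export TEAMCITY_SERVER_MEM_OPTS="-Xmx750m -XX:MaxPermSize=270m"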
Internal properties are read after the JVM has started, so the heap settings will not take effect if put where another answer suggests. I was looking into how to do this for a TeamCity container, and the best option seems to be to use the environment variable TEAMCITY_SERVER_MEM_OPTS. For a container, it can be set by passing -e TEAMCITY_SERVER_MEM_OPTS='...' when creating the container.
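For example, assuming the official jetbrains/teamcity-server image and the documented minimum sizes:

docker run -d --name teamcity-server -e TEAMCITY_SERVER_MEM_OPTS="-Xmx750m -XX:MaxPermSize=270m" jetbrains/teamcity-server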
I was recently modifying some of my server properties in Rational Application Developer, trying to increase the memory of my JVM on startup. I forgot to take a backup before doing this, and by adding an incorrect JVM argument I seem to have broken my server and left it unable to start. Whenever I try to start the server to make any configuration changes, the JVM refuses to start because of the invalid parameters being passed in.
Is there a way to reset any JVM changes for WebSphere Application Server v7.0 through the filesystem, or a way to do it without the server already running? I have been looking around in the wasProfile hoping to stumble onto the file where my settings ultimately live, but have had no luck.
It should be possible to write a wsadmin script to view/adjust the JVM options, but if you're on a non-z/OS platform, the fastest way to get back to working is probably to edit PROFILE_HOME/config/cells/CELL/nodes/NODE/servers/SERVER/server.xml; the JVM settings are typically written at the very end.
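For reference, the element to look for is jvmEntries; it looks roughly like this (attribute values here are purely illustrative). Removing or correcting the offending value in genericJvmArguments and saving the file should let the server start again:

<jvmEntries xmi:id="JavaVirtualMachine_1" verboseModeGarbageCollection="false"
            initialHeapSize="256" maximumHeapSize="512"
            genericJvmArguments="-Dthe.argument.you.added=..." debugMode="false"/>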
I'm experimenting with the Distributed Shell example in YARN 2.2 and am hoping that someone can clarify the difference between a managed and an unmanaged application master.
For example, the following lines appear in the client code:
// unmanaged AM
appContext.setUnmanagedAM(true);
But I am unable to find documentation explaining the difference this line makes to the execution behaviour.
Many thanks.
setUnmanagedAM(true) is used for debugging purposes, i.e. it runs the application master locally rather than having it launched in a container on the cluster, so it is easier to step into the code and debug.
You can see it in use in the hadoop-yarn-applications-unmanaged-am-launcher.jar that ships with YARN.
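For context, here is a minimal sketch (Scala, against the YARN client API) of where that flag sits when a client builds a submission context. The application name is a placeholder, and a real client may also need to fill in the resource request and a proper container launch context depending on the version:

import org.apache.hadoop.yarn.api.records.ContainerLaunchContext
import org.apache.hadoop.yarn.client.api.YarnClient
import org.apache.hadoop.yarn.conf.YarnConfiguration
import org.apache.hadoop.yarn.util.Records

object UnmanagedAmDemo {
  def main(args: Array[String]): Unit = {
    val yarnClient = YarnClient.createYarnClient()
    yarnClient.init(new YarnConfiguration())
    yarnClient.start()

    // Ask the RM for a new application id plus an empty submission context.
    val app = yarnClient.createApplication()
    val appContext = app.getApplicationSubmissionContext
    appContext.setApplicationName("unmanaged-am-demo") // placeholder name
    appContext.setAMContainerSpec(Records.newRecord(classOf[ContainerLaunchContext]))

    // Unmanaged: the RM will not allocate a container for the AM; the client
    // process itself runs the AM logic, which is what makes local debugging easy.
    appContext.setUnmanagedAM(true)

    yarnClient.submitApplication(appContext)
  }
}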
Check the respective JIRA tickets: JIRA-420 and JIRA-419 (client side)
Currently, the RM itself manages the AM by allocating a container for it, negotiating the launch on the NodeManager, and managing the AM lifecycle. Thereafter, the AM negotiates resources with the RM and launches tasks to do the real work.
It would be a useful improvement to enhance this model by allowing the AM to be launched independently by the client without requiring the RM. These AMs would be launched on a gateway machine that can talk to the cluster. This would open up new use cases such as the following:
1) Easy debugging of the AM, especially during initial development. Having the AM launched on an arbitrary cluster node makes it hard to look at logs or attach a debugger to the AM. If it can be launched locally, these tasks become easier.
2) Running AMs that need special privileges that may not be available on machines managed by the NodeManager.
Blog post with more implementation details on unmanaged AM: click-me
Example of how Impala manages its resources with the help of unmanaged applications: Llama
I am testing the tomcat7 clickstack for our application, which has some config parameters set using the built-in Config features of CloudBees. The tomcat7 clickstack does not find them, but the standard tomcat6 container does. I have double-checked them and reset them through the CloudBees SDK, and they are there and correct, but they come back as null on tomcat7.
The switch to clickstacks requires us to refactor how the servlet container gets configured so that the injection points, such as cloudbees-web.xml and JVM system properties, behave consistently across all the servlet container clickstacks.
Some of that refactoring has been committed but some of the work is still in my backlog... Assuming none of the other bees steal that task from my backlog before I get to it ;-)
If I recall correctly, the parameters should be available as environment variables (suboptimal, I know, but all containers should provide this as a consistent UX for all clickstacks, e.g. both non-Java-based and Java-based) and may already be available as system properties (again suboptimal, but the Java container refactoring should provide this as a consistent UX for all Java-based clickstacks). The consistent Java servlet UX has not been committed yet but should be available soon.
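In the meantime, a workaround on the application side is to look in both places yourself. A minimal Scala sketch, where "MY_PARAM" is a hypothetical parameter name (the actual key depends on how the clickstack exposes your Config values):

// Check the JVM system properties first, then fall back to the environment,
// since a clickstack may expose the parameter as either.
def configParam(name: String): Option[String] =
  sys.props.get(name).orElse(sys.env.get(name))

val myParam: Option[String] = configParam("MY_PARAM")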