I saw this setting: hive.mapred.mode=strict. Does it mean that it only takes effect when the execution engine is MapReduce? Does it work with Tez?
I have a simple Twisted application which I run using a systemd service that executes a script, which in turn runs a .tac file.
The application is structured as a JSON-RPC endpoint (fastjsonrpc), built into a t.w.r.Resource, which is in a t.w.s.Site, served by a t.a.i.TCPServer, and the whole thing packed into a t.a.Application. This works fine.
Where I do run into trouble is when I try to warm up caches at startup. This warm-up process is pretty slow (~300 seconds), and makes systemd time out and kill the process. Increasing the timeout is not really a viable option, since I wouldn't want this to block system boot.
Analogous code is used in a separate stack running on Flask from within Apache and WSGI. That server starts itself off and lets systemd move on while it takes its time building the caches. This behaviour is fine for me.
I've tried calling the warmup function using the following within the setup function of the t.w.r.Resource:
reactor.callLater(1, ep.warmup, None)
I've not yet tried using this from within systemd, and have been testing it with twistd directly on the command line. The server does work as expected; however, it no longer responds to SIGINT (^C). Removing the callLater is all that's needed to let the server respond to SIGINT again.
If the warmup function is called directly (not by callLater, i.e., the arrangement which makes systemd give up while waiting for warm up to complete), the resulting server also continues to respond to SIGINT.
Is there a better / good way to handle this sort of long-running warmup code?
Why would twistd / the reactor not respond to SIGINT? Am I missing something here?
Twisted is single-threaded by default: everything runs in the reactor's one thread. It sounds like your cache-warmup code is blocking the reactor for those 300 seconds, which is also why it stops responding to SIGINT while the warmup runs. One easy way to fix this would be using deferToThread to let it run without blocking the reactor.
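A minimal sketch of that approach, assuming the warmup is ordinary blocking Python code and that ep is the endpoint object from your snippet (warm_up_caches is just a wrapper name I've picked for illustration):

from twisted.internet import reactor
from twisted.internet.threads import deferToThread
from twisted.python import log

def warm_up_caches(ep):
    # deferToThread runs the blocking call in the reactor's thread pool,
    # so the reactor keeps handling requests and signals (including SIGINT).
    d = deferToThread(ep.warmup, None)
    d.addCallback(lambda _: log.msg("cache warmup finished"))
    d.addErrback(log.err)
    return d

# In the Resource's setup code, instead of reactor.callLater(1, ep.warmup, None):
# reactor.callWhenRunning(warm_up_caches, ep)

The thread only keeps the reactor responsive; if the warmup is pure Python CPU work, the GIL means it won't finish any faster, but the process will start serving (and responding to ^C) immediately.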
I have two clusters, one in a local virtual machine and another in a remote cloud. Both clusters are in standalone mode.
My Environment:
Scala: 2.10.4
Spark: 1.5.1
JDK: 1.8.40
OS: CentOS Linux release 7.1.1503 (Core)
The local cluster:
Spark Master: spark://local1:7077
The remote cluster:
Spark Master: spark://remote1:7077
I want to finish this:
Write code (just a simple word count) in IntelliJ IDEA locally (on my laptop), set the Spark master URL to spark://local1:7077 or spark://remote1:7077, and then run the code from IntelliJ IDEA. That is, I don't want to use spark-submit to submit a job.
But I ran into a problem:
When I use the local cluster, everything goes well: both running the code in IntelliJ IDEA and using spark-submit can submit the job to the cluster and finish it.
But when I use the remote cluster, I get this warning in the log:
TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
Note that it says sufficient resources, not sufficient memory!
This log line keeps printing and nothing further happens. Both spark-submit and running the code in IntelliJ IDEA give the same result.
I want to know:
Is it possible to submit code from IntelliJ IDEA to the remote cluster?
If so, is any extra configuration needed?
What are the possible causes of my problem?
How can I handle this problem?
Thanks a lot!
Update
There is a similar question here, but I think my situation is different. When I run my code in IntelliJ IDEA with the Spark master set to the local virtual machine cluster, it works; against the remote cluster I get the Initial job has not accepted any resources;... warning instead.
I want to know whether a security policy or a firewall could cause this?
Submitting code programmatically (e.g. via SparkSubmit) is quite tricky. At the least, there is a variety of environment settings and considerations - handled by the spark-submit script - that are quite difficult to replicate within a Scala program. I am still uncertain how to achieve it, and there have been a number of long-running threads on the topic within the Spark developer community.
My answer here is about one portion of your post, specifically this warning:
TaskSchedulerImpl: Initial job has not accepted any resources; check
your cluster UI to ensure that workers are registered and have
sufficient resources
The reason is typically a mismatch between the memory and/or number of cores requested by your job and what is available on the cluster. Possibly, when submitting from IntelliJ, the settings in
$SPARK_HOME/conf/spark-defaults.conf
did not properly match the parameters required for your task on the existing cluster. You may need to update:
spark.driver.memory 4g
spark.executor.memory 8g
spark.executor.cores 8
You can check the Spark UI on port 8080 to verify that the resources you requested are actually available on the cluster.
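If you want to stay inside the IDE, the same resource settings can also be set programmatically on the SparkConf before the context is created. A rough sketch (shown here in PySpark for brevity; the Scala SparkConf API takes the same keys, and the memory/core values and input path are placeholders to be matched against what the UI on port 8080 shows as available):

from pyspark import SparkConf, SparkContext

conf = (SparkConf()
        .setAppName("word-count")
        .setMaster("spark://remote1:7077")       # the remote standalone master from the question
        .set("spark.executor.memory", "1g")      # must fit inside a worker's free memory
        .set("spark.cores.max", "2"))            # must not exceed the cores the workers offer

sc = SparkContext(conf=conf)
counts = (sc.textFile("hdfs:///tmp/input.txt")   # hypothetical input path
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))
print(counts.take(10))
sc.stop()

If the resources look fine in the UI, also bear in mind that in standalone mode the workers must be able to connect back to the driver on your laptop, so a firewall or NAT in between can produce this same warning.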
I was recently modifying some of my server properties in Rational Application Developer to try to increase the memory of my JVM on startup. I forgot to take a backup before doing this, and by adding an incorrect JVM variable it seems I have broken my server and left it in an unworkable state. Whenever I try to start the server to make any configuration changes, the JVM refuses to start because of the invalid parameters being passed in.
Is there a way to reset any JVM changes for WebSphere Application Server v7.0 through the filesystem, or a way to do it without needing the server already running? I have been looking around in the wasProfile directory hoping to stumble onto a file where my settings ultimately live, but have had no luck.
It should be possible to write a wsadmin script to view/adjust the JVM options, but if you're on a non-z/OS platform, the fastest way to get back to working is probably to edit PROFILE_HOME/config/cells/CELL/nodes/NODE/servers/SERVER/server.xml; the JVM settings are typically written at the very end.
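For the scripted route, a rough wsadmin (Jython) sketch is below; it can be run with the server down via wsadmin -lang jython -conntype NONE -f fix_jvm.py (the script name is arbitrary), and the heap arguments in the commented-out modify call are placeholders:

# List every JavaVirtualMachine config object with its current generic JVM arguments.
for jvm in AdminConfig.list('JavaVirtualMachine').splitlines():
    print jvm, AdminConfig.showAttribute(jvm, 'genericJvmArguments')

# After identifying the entry that belongs to the broken server, replace the bad arguments:
# AdminConfig.modify(jvm, [['genericJvmArguments', '-Xms256m -Xmx1024m']])
# AdminConfig.save()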
Good evening all, does anyone know anything about this error:
JBAS010404: Deploying non-JDBC-compliant driver class com.mysql.jdbc.Driver (version 5.1)
It always appears when I deploy the MySQL jar, and my application then fails to start on the server (HTTP Status 404). I have suffered a lot from this and can't find any solution; please help me.
Note: I used mysql-connector-java-5.1.24.jar
That message gets printed because the MySQL driver is not JDBC compliant. That may seem a bit weird, but it's a long-standing known issue:
http://bugs.mysql.com/bug.php?id=62038
The problem is that to be fully JDBC compliant, the driver has to have SQL support conforming to the entry level of the SQL92 standard, but MySQL doesn't support features that are required by that. You read that right: MySQL doesn't support the most basic level of a twenty-year-old standard. Probably the most prominent example of a missing feature is check constraints. Therefore, the driver is non-compliant, and JBoss logs a message saying so.
However, this does not prevent the driver from deploying correctly. As the message says, JBoss deploys it.
If your app is not working, the problem lies somewhere else.
Try using these instructions to deploy the MySQL driver to JBoss AS as a module. With connector 5.1.22 as shipped in Fedora 18, I've never had a problem. The module.xml that registers the connector jar is the key piece.
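A typical one for AS7 looks roughly like this (the module name and directory layout follow the usual convention, and the jar name matches the connector version from the question, so adjust it to whatever you actually deploy):

<module xmlns="urn:jboss:module:1.1" name="com.mysql">
    <resources>
        <!-- the jar must sit next to this file, e.g. under modules/com/mysql/main/ -->
        <resource-root path="mysql-connector-java-5.1.24.jar"/>
    </resources>
    <dependencies>
        <module name="javax.api"/>
        <module name="javax.transaction.api"/>
    </dependencies>
</module>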
I have a Rails app that uses MongoDB on the back end. My logs contain messages that say MONGODB [WARNING] Please note that logging negatively impacts client-side performance. You should set your logging level no lower than :info in production. OK, I never worried about it, but decided to look it up just now.
This page on the mongo site doesn't really discuss logging levels, but it does discuss -v vs -vvvv for verbosity. Is that the same thing as log level? As in -vvvvv is the same as a debug log level and -v is the same as an error log level? The docs are very unclear on this topic.
I had problems with this in my tests, so I ended up doing the following in my spec_helper.rb:
Mongoid.logger.level = Logger::INFO
However, if you are inside Rails, you should probably (untested) use this to access the logger instead:
config.mongoid.logger
Logging levels refer to Rails logging levels, whereas the -v flag refers to the MongoDB server's verbosity.
Rails automatically sets the logging level higher in production than in development, so you shouldn't have anything to worry about.
If you're using mongoid 2.2 or higher, you can set it in mongoid.yml:
production:
  hosts:
    ...
  database: ...
  logger: false
Also, this does have a performance impact. When I turned off Mongo logging in production, I saw fewer garbage collections, and app instance memory footprints were about 15 megabytes smaller during 30-minute load tests with ApacheBench.