Error while running Flume agent - apache

I'm using Flume 1.4.0 with HBase 0.94.10 and Hadoop 1.1.2. I'm new to Flume. While running a Flume agent, I got the following error:
Error occurred during initialization of VM The size of the object heap + VM data exceeds the maximum representable size org.apache.flume.node.Application --conf-file=/usr/local/flume/conf/flume.conf --name agent
Exception in thread "main" java.lang.NoClassDefFoundError: occurred
Caused by: java.lang.ClassNotFoundException: occurred
at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
Could not find the main class: occurred. Program will exit.
My agent configuration is as follows:
agent1.sources = tail
agent1.channels = memoryChannel
agent1.sinks = sink1
agent1.sources.tail.type = exec
agent1.sources.tail.command = tail -f /usr/local/jarsfortest/LogsForTest/generatingLogs-app.logs
agent1.sources.tail.channels = memoryChannel
agent1.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink
agent1.sinks.sink1.channel = memoryChannel
agent1.sinks.sink1.table = testFlume
agent1.sinks.sink1.columnFamily = log
agent1.sinks.sink1.serializer = org.apache.flume.sink.hbase.RegexHbaseEventSerializer
agent1.sinks.sink1.serializer.regex = [a-zA-Zo-9]*[^C][a-zA-Z0-9]*[^C][a-zA-Z0-9]*
agent1.sinks.sink1.serializer.regexIgnoreCase = true
agent1.sinks.sink1.serializer.colNames = id, no_fill_reason, bid
agent1.sinks.sink1.channel = memoryChannel
agent1.sinks.sink1.type = logger
agent1.channels.memoryChannel.type = memory
agent1.channels.memoryChannel.capacity = 100
Please help me.
Thanks in advance.

This looks like a JVM problem rather than a Flume problem. I'd check that the JAR files are not corrupt and that the max heap size is set to a sensible value: this message usually means the requested heap is larger than the JVM can address (for example on a 32-bit JVM), not that it is too small.
Whatever is at the end of this truncated message is probably a clue: "The size of the object heap + VM data"
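To see what the JVM itself reports, a small check like the one below can help. It is a minimal sketch, assuming a HotSpot-style JVM that exposes the `sun.arch.data.model` property (other JVMs may not, in which case it prints "unknown"). `JvmHeapInfo` is a hypothetical class name, not part of Flume.

```java
public class JvmHeapInfo {
    // Builds a one-line summary of the running JVM: version, data model, max heap.
    static String describe() {
        long maxHeapMiB = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        // HotSpot-specific property; falls back to "unknown" when absent (assumption).
        String dataModel = System.getProperty("sun.arch.data.model", "unknown");
        return "java.version=" + System.getProperty("java.version")
                + ", data.model=" + dataModel
                + ", max heap=" + maxHeapMiB + " MiB";
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Run it with the same java binary and the same heap options (for example whatever JAVA_OPTS is set to in flume-env.sh) that the agent is started with. If the JVM refuses to start with those options, the heap setting is the likely culprit.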
Update
agent1.channels.memoryChannel.capacity = 100
That capacity looks very small. I'd remove that line and let it use the default.
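Two other details in the posted configuration may be worth a look: `agent1.sinks.sink1.type` is assigned twice, and the keys use the prefix `agent1` while the command line passes `--name agent`. The sketch below assumes the file is read with `java.util.Properties` semantics, where a later duplicate key overwrites an earlier one. `FlumeConfCheck` and its methods are hypothetical helpers, not Flume APIs.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

public class FlumeConfCheck {
    // Parses properties text; for duplicate keys the last value wins.
    static Properties load(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    // Collects the distinct first key segment, i.e. the agent name prefix.
    static Set<String> agentNames(Properties props) {
        Set<String> names = new TreeSet<>();
        for (String key : props.stringPropertyNames()) {
            int dot = key.indexOf('.');
            names.add(dot < 0 ? key : key.substring(0, dot));
        }
        return names;
    }

    public static void main(String[] args) {
        // Three lines copied from the configuration in the question.
        String conf = "agent1.sinks = sink1\n"
                + "agent1.sinks.sink1.type = org.apache.flume.sink.hbase.HBaseSink\n"
                + "agent1.sinks.sink1.type = logger\n";
        Properties props = load(conf);
        System.out.println("sink1.type = " + props.getProperty("agent1.sinks.sink1.type"));
        System.out.println("agent names = " + agentNames(props));
    }
}
```

With these lines it prints `sink1.type = logger` and `agent names = [agent1]`, so under that assumption the HBase sink type would be replaced by the logger sink, and the name given to `--name` would need to be `agent1` to match the keys.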

Related

Gatling feeder/parameter issue - Exception in thread "main" java.lang.UnsupportedOperationException

I just joined a new project that uses Gatling for API testing of our service. At this point, I want to run a search query. Below is the code:
def chnSendToRender(testData: FeederBuilderBase[String]): ChainBuilder = {
feed(testData)
exec(api.AdvanceSearch.searchAsset(s"{\"all\":[{\"all:aggregate:text\":{\"contains\":\"#{edlAssetId}_Rendered\"}}]}", "#{authToken}")
.check(status.is(200).saveAs("searchStatus"))
.check(jsonPath("$..asset:id").findAll.optional.saveAs("renderedAssetList"))
)
.doIf(session => session("searchStatus").as[Int] == 200) {
exec { session =>
printConsoleLog("Rendered Asset ID List: " + session("renderedAssetList").as[String], "INFO")
session
}
}
}
I have already declared the feeder in the simulation Scala file:
class GVRERenderEditor_new extends Simulation {
private val edlToRender = csv("data/render/edl_asset_ids.csv").queue
private val chnPostRender = components.notifications.notice.JobsPolling_new.chnSendToRender(edlToRender)
private val scnSendEDLForRender = scenario("Search Post Render")
.exitBlockOnFail(exec(preSimAuth))
.exec(chnPostRender)
setUp(
scnSendEDLForRender.inject(atOnceUsers(1)).protocols(httpProtocol)
)
.maxDuration(sessionDuration.seconds)
.assertions(global.successfulRequests.percent.is(100))
}
But the Gatling test failed to run, showing this error: Exception in thread "main" java.lang.UnsupportedOperationException: There were no requests sent during the simulation, reports won't be generated
If I hardcode #{edlAssetId} (put the real edlAssetId in the query), I get a result, so I think I am passing the parameter incorrectly. I've tried to print the output to the console log, but no luck. What's wrong with this code? I would appreciate your help. Thanks!
feed(testData)
exec(api.AdvanceSearch.searchAsset(s"{\"all\":[{\"all:aggregate:text\":{\"contains\":\"#{edlAssetId}_Rendered\"}}]}", "#{authToken}")
.check(status.is(200).saveAs("searchStatus"))
.check(jsonPath("$..asset:id").findAll.optional.saveAs("renderedAssetList"))
)
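You're missing a . (dot) before the exec to attach it to the feed, i.e. it should read feed(testData).exec(...).
As a result, your method returns only the last expression, i.e. the exec alone, so the feeder is never attached and #{edlAssetId} is never populated.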

I configured a traffic generator with the stdlib and got an error

processor = RandomGenerator()
board = TestBoard(
clk_freq="3GHz",
processor=processor,
memory=memory,
cache_hierarchy=cache_hierarchy,
)
root = Root(full_system=False, system=board)
m5.instantiate()
exit_event = (
m5.simulate()
)
I want to compare memory, so I wrote this configuration, but I got an error.
Command line: build/NULL/gem5.opt Proj/ver2/Proj_v2.py
ImportError: cannot import name 'BaseCPU' from 'm5.objects' (unknown location)
Why does this error happen?

NPE while deserializing avro messages in kafka streams

I wrote a small Java class to test consumption from an Avro-encoded Kafka topic.
Properties appProps = new Properties();
appProps.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "http://***kfk14bro1.lc:9092");
appProps.put(AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG, "http://***kfk14str1.lc:8081");
appProps.put(StreamsConfig.APPLICATION_ID_CONFIG, "consumer");
appProps.put(ConsumerConfig.AUTO_OFFSET_RESET_CONFIG, "latest");
appProps.put(StreamsConfig.DEFAULT_DESERIALIZATION_EXCEPTION_HANDLER_CLASS_CONFIG,LogAndContinueExceptionHandler.class);
StreamsBuilder streamsBuilder = new StreamsBuilder();
streamsBuilder.stream(
"coordinates", Consumed.with(Serdes.String(), new GenericAvroSerde()))
.peek((key, value) -> System.out.println("key=" + key + ", value=" + value));
new KafkaStreams(streamsBuilder.build(), appProps).start();
When I run this class, the serde configs are logged correctly, as can be seen in the log below:
[consumer-56b0e0ca-d336-45cc-b388-46a68dbfab8b-StreamThread-1] INFO io.confluent.kafka.serializers.KafkaAvroSerializerConfig - KafkaAvroSerializerConfig values:
schema.registry.url = [http://***kfk14str1.lc:8081]
basic.auth.user.info = [hidden]
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
schema.registry.basic.auth.user.info = [hidden]
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
[normal-consumer-56b0e0ca-d336-45cc-b388-46a68dbfab8b-StreamThread-1] INFO io.confluent.kafka.serializers.KafkaAvroDeserializerConfig - KafkaAvroDeserializerConfig values:
schema.registry.url = [http://***kfk14str1.lc:8081]
basic.auth.user.info = [hidden]
auto.register.schemas = true
max.schemas.per.subject = 1000
basic.auth.credentials.source = URL
schema.registry.basic.auth.user.info = [hidden]
specific.avro.reader = false
value.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
key.subject.name.strategy = class io.confluent.kafka.serializers.subject.TopicNameStrategy
But messages are not consumed, and the following is logged for every message:
[normal-consumer-56b0e0ca-d336-45cc-b388-46a68dbfab8b-StreamThread-1] WARN org.apache.kafka.streams.errors.LogAndContinueExceptionHandler - Exception caught during Deserialization, taskId: 0_0, topic: coordinates, partition: 0, offset: 782205986
org.apache.kafka.common.errors.SerializationException: Error deserializing Avro message for id 83
Caused by: java.lang.NullPointerException
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:116)
at io.confluent.kafka.serializers.AbstractKafkaAvroDeserializer.deserialize(AbstractKafkaAvroDeserializer.java:88)
at io.confluent.kafka.serializers.KafkaAvroDeserializer.deserialize(KafkaAvroDeserializer.java:55)
at io.confluent.kafka.streams.serdes.avro.GenericAvroDeserializer.deserialize(GenericAvroDeserializer.java:63)
at io.confluent.kafka.streams.serdes.avro.GenericAvroDeserializer.deserialize(GenericAvroDeserializer.java:39)
at org.apache.kafka.common.serialization.Deserializer.deserialize(Deserializer.java:58)
at org.apache.kafka.streams.processor.internals.SourceNode.deserializeValue(SourceNode.java:60)
But I am able to read just fine with the Avro console consumer, so I know there is nothing wrong with the data written to the topic. The command below prints the messages correctly:
~/kafka/confluent-5.1.2/bin/kafka-avro-console-consumer --bootstrap-server http://***kfk14bro1.lc:9092 --topic coordinates --property schema.registry.url=http://***kfk14str1.lc:8081 --property auto.offset.reset=latest
When you instantiate an Avro Serde yourself it is not configured automatically with the schema-registry URL.
So either you have to configure it yourself or you define default serdes by adding:
appProps.setProperty(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass().getName());
appProps.setProperty(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, GenericAvroSerde.class.getName());
And by removing
Consumed.with(Serdes.String(), new GenericAvroSerde())
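To configure the serde yourself, use the following code (adapt it to your situation):
// Create the serde and point it at the schema registry explicitly.
GenericAvroSerde genericAvroSerde = new GenericAvroSerde();
boolean isKeySerde = false; // this serde is used for record values, not keys
genericAvroSerde.configure(
Collections.singletonMap(
AbstractKafkaAvroSerDeConfig.SCHEMA_REGISTRY_URL_CONFIG,
"http://confluent-schema-registry-server:8081/"),
isKeySerde);
// Then pass the configured instance instead of a fresh one:
// Consumed.with(Serdes.String(), genericAvroSerde)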

Run-time error of Spark code in Intellij

Running Spark code in IntelliJ IDEA is painful for a new Spark/IntelliJ user.
I googled many pages, but didn't find a solution to this.
The code is very simple, as shown below.
I'm getting a run-time error on this line:
val conf = new SparkConf().setAppName("Spark Pi")
import org.apache.spark.SparkConf
import scala.math.random
object HelloWorld {
def main(args: Array[String]) {
println("Hello World")
val conf = new SparkConf().setAppName("Spark Pi")
// val spark = new SparkContext(conf)
// val slices = if (args.length > 0) args(0).toInt else 3
// val n = 100000 * slices
// val count = spark.parallelize(1 to n, slices).map { i =>
// val x = random * 2 - 1
// val y = random * 2 - 1
// if (x*x + y*y < 1) 1 else 0
// }.reduce(_ + _)
// println("Pi is roughly " + 4.0 * count / n)
// val pi = 4.0 * count / n
// val ppi = spark.parallelize(Seq(pi))
// ppi.saveAsTextFile("/tmp/bryan/spark/output.pi")
// spark.stop()
}
}
The error message is:
"C:\Program Files\Java\jdk8\bin\java" -Didea.launcher.port=7534 "-Didea.launcher.bin.path=C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 14.1.5\bin" -Dfile.encoding=UTF-8 -classpath "C:\Program Files\Java\jdk8\jre\lib\charsets.jar;C:\Program Files\Java\jdk8\jre\lib\deploy.jar;C:\Program Files\Java\jdk8\jre\lib\javaws.jar;C:\Program Files\Java\jdk8\jre\lib\jce.jar;C:\Program Files\Java\jdk8\jre\lib\jfr.jar;C:\Program Files\Java\jdk8\jre\lib\jfxswt.jar;C:\Program Files\Java\jdk8\jre\lib\jsse.jar;C:\Program Files\Java\jdk8\jre\lib\management-agent.jar;C:\Program Files\Java\jdk8\jre\lib\plugin.jar;C:\Program Files\Java\jdk8\jre\lib\resources.jar;C:\Program Files\Java\jdk8\jre\lib\rt.jar;C:\Program Files\Java\jdk8\jre\lib\ext\access-bridge-64.jar;C:\Program Files\Java\jdk8\jre\lib\ext\cldrdata.jar;C:\Program Files\Java\jdk8\jre\lib\ext\dnsns.jar;C:\Program Files\Java\jdk8\jre\lib\ext\jaccess.jar;C:\Program Files\Java\jdk8\jre\lib\ext\jfxrt.jar;C:\Program Files\Java\jdk8\jre\lib\ext\localedata.jar;C:\Program Files\Java\jdk8\jre\lib\ext\nashorn.jar;C:\Program Files\Java\jdk8\jre\lib\ext\sunec.jar;C:\Program Files\Java\jdk8\jre\lib\ext\sunjce_provider.jar;C:\Program Files\Java\jdk8\jre\lib\ext\sunmscapi.jar;C:\Program Files\Java\jdk8\jre\lib\ext\sunpkcs11.jar;C:\Program Files\Java\jdk8\jre\lib\ext\zipfs.jar;D:\code\spark-cdh\analysis-jobs\example\target\scala-2.10\classes;C:\Users\spark39\.ivy2\cache\org.scala-lang\scala-compiler\jars\scala-compiler-2.10.0.jar;C:\Users\spark39\.sbt\boot\scala-2.10.4\lib\scala-library.jar;C:\Program Files (x86)\JetBrains\IntelliJ IDEA Community Edition 14.1.5\lib\idea_rt.jar" com.intellij.rt.execution.application.AppMain com.spark.example.test.HelloWorld
Hello World
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/spark/SparkConf
at com.samsungaustin.yac.spark.example.test.HelloWorld$.main(Test.scala:15)
at com.samsungaustin.yac.spark.example.test.HelloWorld.main(Test.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:497)
at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.ClassNotFoundException: org.apache.spark.SparkConf
at java.net.URLClassLoader.findClass(URLClassLoader.java:381)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:331)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 7 more
Someone mentioned adding the classpath through the 'Edit Configurations' menu, but can anybody tell me exactly how to do this?
Thanks.
Starting with new technology can often be a bit painful, but it might lead to valuable new knowledge and experience. Do you use a Maven pom.xml file for building your project? I would advise you to use Maven or something similar to keep track of which libraries your code needs.
You can also try one of the existing examples that are available for Spark on the web site: http://spark.apache.org/examples.html. There is also a GitHub repository with even more examples: https://github.com/apache/spark/tree/master/examples. (This repository already has a Maven pom.xml file: https://github.com/apache/spark/blob/master/examples/pom.xml.)
The following page contains (hopefully) Useful Developer Tools: https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools
There are also several tutorials that use Spark, Scala, and IntelliJ IDEA:
https://medium.com/large-scale-data-processing/how-to-kick-start-spark-development-on-intellij-idea-in-4-steps-c7c8f5c2fe63#.y0uxk7gay
https://docs.sigmoidanalytics.com/index.php/Step_by_Step_instructions_on_how_to_build_Spark_App_with_IntelliJ_IDEA
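The -classpath in the question's launch command lists the JDK jars, the project's classes directory and Scala jars, but no Spark jar, which matches the NoClassDefFoundError for org/apache/spark/SparkConf. To confirm what a run configuration actually sees, a check along these lines can be run from the same configuration. It is a minimal sketch using only the JDK; `ClasspathCheck` is a hypothetical helper name.

```java
import java.io.File;

public class ClasspathCheck {
    // Returns true when the named class can be found by this class's loader.
    static boolean isLoadable(String className) {
        try {
            // initialize=false: only locate the class, do not run static initializers.
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.apache.spark.SparkConf";
        System.out.println(name + " loadable: " + isLoadable(name));
        // Print each classpath entry so a missing Spark jar is easy to spot.
        for (String entry : System.getProperty("java.class.path").split(File.pathSeparator)) {
            System.out.println("  " + entry);
        }
    }
}
```

If the class is reported as not loadable, the Spark dependency is most likely not declared in the build the IDE imported (the paths in the question suggest sbt), or it is declared in a scope that the run configuration does not include.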

Error while trying to run WordCount program using hadoop-yarn

When I try to run the WordCount program on a cluster node, I get the error shown below. Can anyone suggest a solution to this issue?
java.io.IOException: Inconsistent checkpoint fields.
LV = -60 namespaceID = 1055173391 cTime = 0 ; clusterId = CID-5d8c37db-fb42-416f-a179-3e7ff1690690 ; blockpoolId = BP-1781828547-192.168.104.173-1443620560920.
Expecting respectively: -60; 1665606821; 0; CID-6ec1fbb1-3515-4eca-9f55-2dbe8417b7e5; BP-555346680-192.168.104.173-1443500187475.
at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:134)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:531)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:395)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:361)
at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:357)
at java.lang.Thread.run(Thread.java:745)
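Comparing the reported values with the expected ones field by field narrows the problem down. The following is a minimal sketch that uses the values copied from the exception above. `CheckpointFieldDiff` and its method are hypothetical names, not part of Hadoop.

```java
import java.util.ArrayList;
import java.util.List;

public class CheckpointFieldDiff {
    // Field order as it appears in the exception message.
    static final String[] NAMES = {"LV", "namespaceID", "cTime", "clusterId", "blockpoolId"};

    // Returns the names of the fields whose reported and expected values differ.
    static List<String> mismatches(String[] reported, String[] expected) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < NAMES.length; i++) {
            if (!reported[i].equals(expected[i])) {
                out.add(NAMES[i]);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        String[] reported = {"-60", "1055173391", "0",
                "CID-5d8c37db-fb42-416f-a179-3e7ff1690690",
                "BP-1781828547-192.168.104.173-1443620560920"};
        String[] expected = {"-60", "1665606821", "0",
                "CID-6ec1fbb1-3515-4eca-9f55-2dbe8417b7e5",
                "BP-555346680-192.168.104.173-1443500187475"};
        System.out.println(mismatches(reported, expected));
    }
}
```

With these values it prints `[namespaceID, clusterId, blockpoolId]`: the layout version and cTime agree, while all three IDs differ. Going by the message alone, this suggests that the SecondaryNameNode's checkpoint data and the NameNode come from two different formats of the file system, but that reading is an assumption and should be checked against the storage directories on the cluster.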