Tika child processes keep dying - jvm

Tika child processes keep dying. I tried increasing the heap size to 2GB, but that doesn't seem to affect anything: after ~100 files the child process just dies and the Tika server restarts it. I have 8GB RAM and 4 CPUs assigned to it, and this is what my config.xml looks like:
<?xml version="1.0" encoding="UTF-8"?>
<properties>
<fetchers>
<fetcher class="org.apache.tika.pipes.fetcher.fs.FileSystemFetcher">
<params>
<name>fsf</name>
<basePath>/shared/input</basePath>
</params>
</fetcher>
</fetchers>
<server>
<params>
<forkedJvmArgs>
<arg>-Xms2g</arg>
<arg>-Xmx2g</arg>
</forkedJvmArgs>
<enableUnsecureFeatures>true</enableUnsecureFeatures>
</params>
</server>
</properties>
I'm running it inside a Docker container. I ran jstat to see the statistics for the child process, and this is what I get:
# jstat -gc 2132
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
0.0 0.0 0.0 0.0 110592.0 74752.0 1986560.0 5092.0 192.0 75.5 64.0 4.4 0 0.000 0 0.000 0.000
What am I doing wrong?
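To make the jstat line above easier to read, here is a minimal sketch that turns the columns into utilisation percentages. The helper names (parse_jstat_gc, percent_used) are made up for illustration; the only facts relied on are the column names shown above and that jstat -gc reports sizes in KB.

```python
def parse_jstat_gc(output: str) -> dict:
    """Map each jstat -gc column name to its numeric value (sizes are in KB)."""
    lines = [ln for ln in output.strip().splitlines() if ln.strip()]
    header, values = lines[0].split(), lines[1].split()
    return {name: float(value) for name, value in zip(header, values)}


def percent_used(stats: dict, used_col: str, capacity_col: str) -> float:
    """Return used/capacity as a percentage, or 0.0 for an empty region."""
    capacity = stats[capacity_col]
    return 100.0 * stats[used_col] / capacity if capacity else 0.0


# The exact output posted in the question.
JSTAT_OUTPUT = """\
S0C S1C S0U S1U EC EU OC OU MC MU CCSC CCSU YGC YGCT FGC FGCT GCT
0.0 0.0 0.0 0.0 110592.0 74752.0 1986560.0 5092.0 192.0 75.5 64.0 4.4 0 0.000 0 0.000 0.000
"""

stats = parse_jstat_gc(JSTAT_OUTPUT)
old_pct = percent_used(stats, "OU", "OC")
eden_pct = percent_used(stats, "EU", "EC")

print(f"old gen used: {old_pct:.1f}%")    # prints "old gen used: 0.3%"
print(f"eden used:    {eden_pct:.1f}%")   # prints "eden used:    67.6%"
print(f"GC runs: young={int(stats['YGC'])}, full={int(stats['FGC'])}")
```

In this snapshot EC + OC adds up to 2097152 KB (2GB), so the -Xmx2g setting did reach the child process. The old generation is almost empty and no collection has run yet, so this particular sample does not show heap pressure. It is a single sample, though, and says nothing about the state of the process right before it dies.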

Related

OVF deployment on VMware environment with guestinfo property

I am trying to deploy a VM using OVF configuration and my goal is to pass the key-value provided by the user in the OVF environment to the VM using VMware guestinfo.
The following are the settings/attributes I have added to my OVF file:
<ProductSection ovf:required="false">
<Info>Virtual Appliance</Info>
<Property ovf:userConfigurable="true" ovf:type="string"
ovf:key="guestinfo.hello" ovf:value="">
<Label>hello</Label>
<Description>enter some string</Description>
</Property>
</ProductSection>
....
<VirtualHardwareSection ovf:transport="com.vmware.guestInfo">
....
After deploying the VM, I am able to verify the property in the OVF environment under the VM's vApp Options. Here is what I see:
<?xml version="1.0" encoding="UTF-8"?>
<Environment
xmlns="http://schemas.dmtf.org/ovf/environment/1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
xmlns:ve="http://www.vmware.com/schema/ovfenv"
oe:id=""
ve:vCenterId="vm-xxxx">
<PlatformSection>
<Kind>VMware ESXi</Kind>
<Version>6.0.0</Version>
<Vendor>VMware, Inc.</Vendor>
<Locale>en</Locale>
</PlatformSection>
<PropertySection>
<Property oe:key="guestinfo.hello" oe:value="world"/>
</PropertySection>
<ve:EthernetAdapterSection>
<ve:Adapter ve:mac="00:50:56:b2:d2:8a" ve:network="VLAN1804-xxx.xxx.xxx.0/25" ve:unitNumber="7"/>
<ve:Adapter ve:mac="00:50:56:b2:83:ea" ve:network="VLAN1804-xxx.xxx.xxx.0/25" ve:unitNumber="8"/>
</ve:EthernetAdapterSection>
</Environment>
Finally, when I log in to the box and try to get the guestinfo property using the vmtoolsd command:
vmtoolsd --cmd "info-get hello"
I get:
No value found
I need help debugging this problem. I am not sure whether I am missing something in my OVF configuration. Thanks in advance. I appreciate your help!
I guess there is no direct way to get the key on its own, but I am able to run vmtoolsd --cmd "info-get guestinfo.ovfenv" on my box, and the output is:
<?xml version="1.0" encoding="UTF-8"?>
<Environment
xmlns="http://schemas.dmtf.org/ovf/environment/1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
xmlns:ve="http://www.vmware.com/schema/ovfenv"
oe:id=""
ve:vCenterId="vm-xxxx">
<PlatformSection>
<Kind>VMware ESXi</Kind>
<Version>6.0.0</Version>
<Vendor>VMware, Inc.</Vendor>
<Locale>en</Locale>
</PlatformSection>
<PropertySection>
<Property oe:key="guestinfo.hello" oe:value="world"/>
</PropertySection>
<ve:EthernetAdapterSection>
<ve:Adapter ve:mac="00:50:56:b2:d2:8a" ve:network="VLAN1804-xxx.xxx.xxx.0/25" ve:unitNumber="7"/>
<ve:Adapter ve:mac="00:50:56:b2:83:ea" ve:network="VLAN1804-xxx.xxx.xxx.0/25" ve:unitNumber="8"/>
</ve:EthernetAdapterSection>
</Environment>
This matches what I see in the OVF environment for the VM in vCenter, so I assume this is one way to get the key-value pair from inside the VM using vmtoolsd.
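Since the whole OVF environment comes back as one XML document, the individual property still has to be pulled out of it. A minimal sketch of that step, assuming Python is available in the guest; the function names are made up for illustration, and the vmtoolsd invocation is the one quoted above:

```python
import subprocess
import xml.etree.ElementTree as ET

# Namespace used for both the elements and the oe:key / oe:value attributes.
OVF_ENV_NS = "http://schemas.dmtf.org/ovf/environment/1"


def read_ovfenv() -> str:
    """Fetch the OVF environment XML from inside the guest via vmtoolsd."""
    result = subprocess.run(
        ["vmtoolsd", "--cmd", "info-get guestinfo.ovfenv"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


def ovf_properties(ovfenv_xml: str) -> dict:
    """Return every <Property oe:key=... oe:value=...> as a plain dict."""
    root = ET.fromstring(ovfenv_xml.encode("utf-8"))
    props = {}
    for prop in root.iter(f"{{{OVF_ENV_NS}}}Property"):
        key = prop.get(f"{{{OVF_ENV_NS}}}key")
        if key is not None:
            props[key] = prop.get(f"{{{OVF_ENV_NS}}}value")
    return props


# Trimmed copy of the environment shown above, so the parsing can be tried
# without a VMware guest.
SAMPLE_OVFENV = """\
<?xml version="1.0" encoding="UTF-8"?>
<Environment
    xmlns="http://schemas.dmtf.org/ovf/environment/1"
    xmlns:oe="http://schemas.dmtf.org/ovf/environment/1"
    oe:id="">
  <PropertySection>
    <Property oe:key="guestinfo.hello" oe:value="world"/>
  </PropertySection>
</Environment>
"""

print(ovf_properties(SAMPLE_OVFENV))  # prints {'guestinfo.hello': 'world'}
```

Inside the VM, ovf_properties(read_ovfenv())["guestinfo.hello"] would then give the user-supplied value.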

Configuring Nutch 2.3 with HSQL 2.3.3 - ClassNotFoundException : org/apache/avro/ipc/ByteBufferOutputStream

I'm getting ClassNotFoundException: org/apache/avro/ipc/ByteBufferOutputStream when I run Apache Nutch with HSQLDB, although I have all the Avro-related jar files under lib:
avro-1.7.6.jar
avro-compiler-1.7.6.jar
avro-ipc-1.7.6.jar
avro-mapred-1.7.6.jar
This is what I did:
Got HSQLDB up and running
root@elephant hsqldb# sudo java -cp /home/hsqldb/hsqldb-2.3.3/hsqldb/lib/hsqldb.jar org.hsqldb.server.Server --props /home/hsqldb/hsqldb-2.3.3/hsqldb/conf/server.properties
[Server@372f7a8d]: [Thread[main,5,main]]: checkRunning(false) entered
[Server@372f7a8d]: [Thread[main,5,main]]: checkRunning(false) exited
[Server@372f7a8d]: Startup sequence initiated from main() method
[Server@372f7a8d]: Loaded properties from [/home/hsqldb/hsqldb-2.3.3/hsqldb/conf/server.properties]
[Server@372f7a8d]: Initiating startup sequence...
[Server@372f7a8d]: Server socket opened successfully in 28 ms.
[Server@372f7a8d]: Database [index=0, id=0, db=file:/home/hsqldb/hsqldb-2.3.3/hsqldb/data/nutch, alias=nutchdb] opened sucessfully in 1406 ms.
[Server@372f7a8d]: Startup sequence completed in 1438 ms.
[Server@372f7a8d]: 2015-12-26 18:30:13.841 HSQLDB server 2.3.3 is online on port 9001
[Server@372f7a8d]: To close normally, connect and execute SHUTDOWN SQL
[Server@372f7a8d]: From command line, use [Ctrl]+[C] to abort abruptly
Configured ivy/ivy.xml
Uncommented the lines below in ivy.xml
<dependency org="org.apache.gora" name="gora-core" rev="0.5" conf="*->default"/>
and
<dependency org="org.apache.gora" name="gora-sql" rev="0.1.1-incubating"
conf="*->default" />
Uncommented the lines below in conf/gora.properties
###############################
# Default SqlStore properties #
###############################
gora.sqlstore.jdbc.driver=org.hsqldb.jdbc.JDBCDriver
gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://localhost/nutchdb
gora.sqlstore.jdbc.user=sa
gora.sqlstore.jdbc.password=
Ran ant build
ant runtime
Added configuration for nutch-site.xml
root@elephant conf# cat nutch-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>storage.data.store.class</name>
<value>org.apache.gora.sql.store.SqlStore</value>
</property>
<property>
<name>http.agent.name</name>
<value>NutchCrawler</value>
</property>
<property>
<name>http.robots.agents</name>
<value>NutchCrawler,*</value>
</property>
</configuration>
Created seed.txt under urls folder
Ran Nutch to inject the URLs
[root@elephant local]# bin/nutch inject urls/
InjectorJob: starting at 2015-12-26 19:11:24
InjectorJob: Injecting urlDir: urls
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/avro/ipc/ByteBufferOutputStream
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:259)
at org.apache.nutch.storage.StorageUtils.getDataStoreClass(StorageUtils.java:93)
at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:77)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:218)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:252)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:275)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:284)
Caused by: java.lang.ClassNotFoundException: org.apache.avro.ipc.ByteBufferOutputStream
at java.net.URLClassLoader$1.run(URLClassLoader.java:372)
at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:360)
at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
... 9 more
Gora-sql is not supported. Due to some licensing issues (if I am not wrong), it was disabled around Gora 0.2.
So I suggest you use another storage backend, for example HBase.
For how to get HBase up and running fast, read the answer at https://stackoverflow.com/a/39837926/582789
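Independently of the storage choice, having avro-*.jar files under lib does not by itself mean that the exact class named in the stack trace is inside one of them. A minimal sketch for checking this, assuming the jars sit in a single directory; the function name jars_containing is made up for illustration:

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir: str, class_name: str) -> list:
    """Return the names of the jars in lib_dir that contain class_name."""
    # A class a.b.C is stored in a jar as the entry a/b/C.class
    entry = class_name.replace(".", "/") + ".class"
    hits = []
    for jar in sorted(Path(lib_dir).glob("*.jar")):
        try:
            with zipfile.ZipFile(jar) as zf:
                if entry in zf.namelist():
                    hits.append(jar.name)
        except zipfile.BadZipFile:
            # Skip files that are not valid archives.
            continue
    return hits


# Example call, with the lib directory of the Nutch runtime as an assumption:
# print(jars_containing("runtime/local/lib",
#                       "org.apache.avro.ipc.ByteBufferOutputStream"))
```

An empty list would mean that none of the jars on that path provides the class, which is consistent with a NoClassDefFoundError even though Avro jars are present.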

Collectd Threshold plugin not sending warnings to rsyslog

I am attempting to properly configure the collectd (5.4.2) plugins installed on FreeBSD 10.1. Based on the configuration included below, I expect to see warning events in my /var/log/messages file, which is managed by rsyslog; rsyslog writes to this file for any facility reporting at warning level and above.
I do not receive any threshold warnings. I have used tools like "stress -c" to force down the idle "jiffies".
I appear to be collecting information successfully, judging by the output of rrdtool lastupdate for:
/var/db/collectd/rrd/localhost/cpu-average/cpu-idle.rrd
/var/db/collectd/rrd/localhost/tail-messages/counter-os.rrd
I suspect I am not getting something quite right in the configuration declarations of tail, aggregation, Chain or threshold with respect to the Plugin, Type and Instance keywords.
In tail, where I am looking for problems from various facilities, I think GaugeInc would be the more appropriate DSType, but that is not supported by my current collectd revision.
I would appreciate any insights on what is likely a setup matter.
cat /usr/local/etc/collectd.conf
Hostname "localhost"
FQDNLookup true
BaseDir "/var/db/collectd"
PIDFile "/var/run/collectd.pid"
TypesDB "/usr/local/share/collectd/types.db"
#ReadThreads 5
#WriteThreads 5
#https://collectd.org/wiki/index.php/Main_Page
#https://collectd.org/wiki/index.php/Naming_schema
#A value is identified by a unique name, which we usually call the "identifier". It consists of five parts, two of which are optional:
#host
#plugin
#plugin instance (optional)
#type
#type instance (optional)
# e.g. host "/" plugin ["-" plugin instance] "/" type ["-" type instance]
# localhost/cpu-0/cpu-idle
LoadPlugin syslog
<plugin syslog>
LogLevel warning
NotifyLevel "OKAY"
</plugin>
LoadPlugin cpu
LoadPlugin aggregation
<LoadPlugin df >
Interval 300
</LoadPlugin>
LoadPlugin interface
LoadPlugin load
LoadPlugin memory
LoadPlugin match_regex
LoadPlugin rrdtool
LoadPlugin threshold
<LoadPlugin tail >
Interval 60
</LoadPlugin>
<plugin "df">
FSType zfs
MountPoint "/"
#ReportInodes false
ValuesPercentage true
</plugin>
<plugin rrdtool>
DataDir "/var/db/collectd/rrd"
CacheTimeout 120
CacheFlush 900
</plugin>
<plugin aggregation>
<Aggregation>
Plugin "cpu"
Type "cpu"
SetPlugin "cpu"
SetPluginInstance "%{aggregation}"
GroupBy "Host"
GroupBy "TypeInstance"
CalculateAverage true
</Aggregation>
</plugin>
<Chain "PostCache">
<Rule> # send cpu values for aggregation
<Match regex>
Plugin "^cpu$"
PluginInstance "[0-9]+$"
</Match>
<Target write>
Plugin "aggregation"
</Target>
Target stop
</Rule>
<Target write> # Write everything else via rrdtool.
Plugin "rrdtool"
</Target>
</Chain>
<plugin "tail">
<File "/var/log/messages">
Instance "messages"
<Match>
# localhost/tail-messages/counter-ace
Regex "local1.(err|warn|alert|crit)"
DSType "CounterInc"
Type "counter"
Instance "ace"
</Match>
<Match>
Regex "local0.(err|warn|alert|crit)"
ExcludeRegex "smdr:"
DSType "CounterInc"
Type "counter"
Instance "postgres"
</Match>
<Match>
Regex "local4.(err|warn|alert|crit)"
DSType "CounterInc"
Type "counter"
Instance "mec"
</Match>
<Match>
Regex "local5.(err|warn|alert|crit)"
DSType "CounterInc"
Type "counter"
Instance "web"
</Match>
<Match>
Regex "(local6|local7).(err|warn|alert|crit)"
DSType "CounterInc"
Type "counter"
Instance "apache"
</Match>
<Match>
Regex "^.*$"
ExcludeRegex " local[0-7] "
DSType "CounterInc"
Type "counter"
Instance "os"
</Match>
</File>
</plugin>
#https://collectd.org/documentation/manpages/collectd-threshold.5.shtml
<Plugin "threshold">
<Plugin "interface">
Instance "eth0"
<Type "if_octets">
FailureMax 10000000
DataSource "rx"
</Type>
</Plugin>
<plugin "df">
<type "df">
Instance "/zroot/ROOT/default"
WarningMax 75
</type>
</plugin>
<Host "Hostname">
<plugin "aggregation">
<type "cpu-average">
Instance "idle"
WarningMin 17000
FailureMin 15000
Hits 1
</type>
</plugin>
<Plugin "memory">
<Type "memory">
Instance "free"
WarningMin 10000000
</Type>
</Plugin>
<plugin "load">
<type "load">
DataSource "midterm"
FailureMax 4
Hits 3
Hysteresis 3
</type>
</plugin>
<Plugin "tail">
Instance "messages"
<type "counter">
Instance "os"
WarningMax .001
</type>
<type"counter">
Instance "ace"
WarningMax .001
</type>
</Plugin>
</Host>
</Plugin>
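The most common error with thresholds is a filter that is too restrictive. Try removing sections like Host, Instance etc. until you can see the notifications. For example, <Host "Hostname"> in the configuration above only applies to a host literally named "Hostname", while the posted config sets Hostname "localhost". You can also use the unixsock plugin to PUTVAL fake values instead of trying to burn your system.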
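A minimal sketch of the PUTVAL approach. It assumes the unixsock plugin has been loaded (the posted config does not load it) and that its SocketFile is /var/run/collectd-unixsock; the socket path, the function names and the fake value are all assumptions for illustration. The identifier follows the host/plugin-instance/type-instance scheme quoted in the config comments.

```python
import socket


def putval_command(identifier: str, value, interval: int = 10) -> str:
    """Build one PUTVAL line; 'N' tells collectd to use the current time."""
    return f'PUTVAL "{identifier}" interval={interval} N:{value}\n'


def send_putval(socket_path: str, identifier: str, value, interval: int = 10) -> str:
    """Send a single fake value to collectd and return its status line."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path)
        sock.sendall(putval_command(identifier, value, interval).encode("ascii"))
        return sock.recv(4096).decode("ascii", "replace").strip()


# Example call (socket path and value are assumptions, see above):
# print(send_putval("/var/run/collectd-unixsock",
#                   "localhost/cpu-average/cpu-idle", 5))
```

The identifier localhost/cpu-average/cpu-idle matches the RRD path listed in the question, so a value pushed this way should pass through the same threshold rules as the real aggregated CPU data.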

How can I add some JVM options to an Arquillian test

Is it possible to add JVM options to embedded GlassFish using Arquillian?
I need to add these JVM options:
-Djavax.net.ssl.keyStorePassword=changeit
-Djavax.net.ssl.trustStorePassword=changeit
Java properties on GlassFish are configured in domain.xml. Since you are running an embedded GlassFish, you don't really have a domain.xml file you could modify. You can try to supply one through arquillian.xml:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<arquillian xmlns="http://www.jboss.org/arquillian-1.0"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.jboss.org/arquillian-1.0 http://jboss.org/schema/arquillian/arquillian-1.0.xsd">
<engine>
<property name="deploymentExportPath">target/</property>
</engine>
<container qualifier="glassfish" default="true">
<configuration>
<property name="configurationXml">file:src/test/resources/domain.xml</property>
...
</configuration>
</container>
</arquillian>
The configurationXml property is used to pass the configuration file to use for the embedded instance. See also https://docs.jboss.org/author/display/ARQ/GlassFish+3.1+-+Embedded. domain.xml itself has a section for JVM arguments.
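In a GlassFish domain.xml, JVM arguments are listed as jvm-options elements inside java-config. A minimal sketch for adding the two options from the question to a copy of domain.xml kept under src/test/resources; the function name is made up for illustration, and whether the embedded container actually applies these options is not confirmed here:

```python
import xml.etree.ElementTree as ET


def add_jvm_options(domain_xml_path: str, options: list) -> int:
    """Append missing <jvm-options> entries to every <java-config> element.

    Returns the number of elements added. Note that ElementTree drops XML
    comments when it rewrites the file.
    """
    tree = ET.parse(domain_xml_path)
    added = 0
    for java_config in tree.getroot().iter("java-config"):
        existing = {el.text for el in java_config.findall("jvm-options")}
        for option in options:
            if option not in existing:
                ET.SubElement(java_config, "jvm-options").text = option
                added += 1
    tree.write(domain_xml_path, encoding="UTF-8", xml_declaration=True)
    return added


# Example call, using the path from the arquillian.xml above:
# add_jvm_options("src/test/resources/domain.xml", [
#     "-Djavax.net.ssl.keyStorePassword=changeit",
#     "-Djavax.net.ssl.trustStorePassword=changeit",
# ])
```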

Migrate eclipseLink of Toplink

I want to upgrade EclipseLink 2.3.2 to EclipseLink 2.4.2. I tried this tutorial, but it doesn't work. I also tried this question, but I think its author uses WebLogic 12.1.2 (I use WebLogic 12.1.1), as well as this, and I read this. I don't know which library to change to make this work in the module. I also added the jar to domain/lib, and that doesn't work either. Any ideas?
My shared library:
application.xml:
<application>
<display-name>eclipselink-shared-lib</display-name>
<module>
<java>/lib/eclipselink.jar</java>
</module>
</application>
MANIFEST:
Manifest-Version: 1.0
Ant-Version: Apache Ant 1.8.2
Created-By: 1.7.0_04-b21 (Oracle Corporation)
Extension-Name: eclipselink-shared-lib
Specification-Version: 2.4.2
weblogic-application:
<?xml version="1.0" encoding="UTF-8"?>
<wls:weblogic-application xmlns:wls="http://xmlns.oracle.com/weblogic/weblogic-application" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/javaee_5.xsd http://xmlns.oracle.com/weblogic/weblogic-application http://xmlns.oracle.com/weblogic/weblogic-application/1.4/weblogic-application.xsd">
<prefer-application-packages>
<package-name>org.eclipse.persistence.*</package-name>
</prefer-application-packages>
</wls:weblogic-application>
And the weblogic-application descriptor used by the EAR:
<?xml version="1.0" encoding="UTF-8"?>
<wls:weblogic-application xmlns:wls="http://xmlns.oracle.com/weblogic/weblogic-application" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://java.sun.com/xml/ns/javaee http://java.sun.com/xml/ns/javaee/javaee_5.xsd http://xmlns.oracle.com/weblogic/weblogic-application http://xmlns.oracle.com/weblogic/weblogic-application/1.4/weblogic-application.xsd">
<library-ref>
<library-name>eclipselink-shared-lib</library-name>
</library-ref>
</wls:weblogic-application>
And now it gives me this error: [WARNING] weblogic.deploy.api.tools.deployer.DeployerException: Task 9 failed: [Deployer:149026]deploy application SIUCOM_EAR on AdminServer.
Target state: deploy failed on Server AdminServer
java.io.IOException: C:\oracle\Middleware12c\user_projects\domains\base_domain\servers\AdminServer\tmp_WL_user\SIUCOM_EAR\4q1ire\lib\eclipselink.jar (El sistema no puede hallar el archivo especificado) with : C:\oracle\Middleware12c\user_projects\domains\base_domain\servers\AdminServer\tmp_WL_user\PROJECT_EAR\4q1ire\lib\eclipselink.jar
It's true that this path doesn't exist; the correct path is: C:\oracle\Middleware12c\user_projects\domains\base_domain\servers\AdminServer\tmp_WL_user\eclipselink-shared-lib\276ipa\lib\eclipselink.jar