Related setup:
Hive 3.1.1 on-premises
Hadoop 3.1.1 on-premises
Google Cloud SDK installed
HADOOP_CLASSPATH includes the JAR file gcs-connector-hadoop3-2.2.5-shaded.jar
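For reference, a minimal sketch of how the connector ends up on the classpath (the JAR path below is an assumption; point it at wherever the shaded JAR actually lives):
# Sketch only: expose the GCS connector to Hadoop/Hive via HADOOP_CLASSPATH
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/opt/jars/gcs-connector-hadoop3-2.2.5-shaded.jar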
There is a storage handler that allows a table created in Hive to load data into a Google BigQuery table. The relevant GitHub link is here:
https://github.com/GoogleCloudDataproc/hive-bigquery-storage-handler
OK, so basically I create a table in Hive with the available JAR file as below:
ADD JAR hdfs://rhes75:9000/jars/hive-bigquery-storage-handler-1.0-shaded.jar;
DROP TABLE IF EXISTS test.testmebq;
CREATE TABLE test.testmebq(
ID BIGINT
)
STORED BY
'com.google.cloud.hadoop.io.bigquery.hive.HiveBigQueryStorageHandler'
TBLPROPERTIES (
'mapred.bq.input.dataset.id'='test',
'mapred.bq.input.table.id'='testme',
'mapred.bq.project.id'='xxx',
'mapred.bq.temp.gcs.path'='gs://tmp_storage_bucket/tmp',
'mapred.bq.gcs.bucket'='etcbucket/hive/test'
)
;
This is a very simple table with one column called ID.
In Google BigQuery I have also created a table called testme in the test dataset.
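For reference, a minimal way to create that table via the bq CLI would be something like this (a sketch; project/dataset defaults are assumed, and the column matches the Hive DDL above):
# Sketch only: create the matching BigQuery table with a single integer column
bq mk --table test.testme id:INTEGER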
Now if I start by adding a row to the BigQuery table test.testme, I can see the row in the Hive table test.testmebq:
insert into test.testme values(99)
and let us check that Hive table (running Hive in DEBUG mode to the console):
hive> select * from test.testmebq;
2022-03-08 15:49:16,846 INFO [main] conf.HiveConf: Using the default value passed in for log id: ec1f7b38-7286-481a-b48b-44f7ed54c20a
2022-03-08 15:49:16,846 INFO [main] session.SessionState: Updating thread name to ec1f7b38-7286-481a-b48b-44f7ed54c20a main
2022-03-08 15:49:16,846 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Compiling command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b): select * from test.testmebq
2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Starting Semantic Analysis
2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed phase 1 of Semantic Analysis
2022-03-08 15:49:16,892 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for source tables
2022-03-08 15:49:16,909 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for subqueries
2022-03-08 15:49:16,909 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for destination tables
2022-03-08 15:49:16,913 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed getting MetaData in Semantic Analysis
2022-03-08 15:49:16,914 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null
2022-03-08 15:49:16,969 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for source tables
2022-03-08 15:49:16,986 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for subqueries
2022-03-08 15:49:16,986 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Get metadata for destination tables
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] common.FileUtils: Creating directory if it doesn't exist: file:/tmp/hive/ec1f7b38-7286-481a-b48b-44f7ed54c20a/hive_2022-03-08_15-49-16_847_7187167996261551164-1/-mr-10001/.hive-staging_hive_2022-03-08_15-49-16_847_7187167996261551164-1
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: CBO Succeeded; optimized logical plan.
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for FS(2)
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for SEL(1)
2022-03-08 15:49:16,988 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ppd.OpProcFactory: Processing for TS(0)
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.HiveBigQueryStorageHandler: Configuring MapReduce Job Input Properties (if not provided) from Hive Table properties
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Completed plan generation
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] parse.CalcitePlanner: Not eligible for results caching - no mr/tez/spark jobs
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Semantic Analysis Completed (retrial = false)
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Returning Hive schema: Schema(fieldSchemas:[FieldSchema(name:testmebq.id, type:bigint, comment:null)], properties:null)
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == null
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.TableScanOperator: Initializing operator TS[0]
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: Initializing operator SEL[1]
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: SELECT struct<id:bigint>
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.ListSinkOperator: Initializing operator LIST_SINK[3]
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed compiling command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b); Time taken: 0.143 seconds
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] reexec.ReExecDriver: Execution #1 of query
2022-03-08 15:49:16,989 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Setting lock request transaction to txnid:18396 for queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b
2022-03-08 15:49:16,990 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbLockManager: Requesting: queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b LockRequest(component:[LockComponent(type:SHARED_READ, level:TABLE, dbname:test, tablename:testmebq, operationType:SELECT, isTransactional:false)], txnid:18396, user:hduser, hostname:rhes76, agentInfo:hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b)
2022-03-08 15:49:17,019 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbLockManager: Response to queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b LockResponse(lockid:116445, state:ACQUIRED)
2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Executing command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b): select * from test.testmebq
2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed executing command(queryId=hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b); Time taken: 0.0 seconds
OK
2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: OK
2022-03-08 15:49:17,046 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Stopped heartbeat for query: hduser_20220308154916_a573a00c-9de7-49d9-9e4d-689f3e81559b
2022-03-08 15:49:17,061 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: Column projection:id and filter text:null and BQ Filter text: {}
2022-03-08 15:49:19,026 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: Split[0] = (name='projects/axial-glow-224522/locations/us/streams/EgxfT1MwU0FJMGtvUUoaAmpkGgJpcigB', schema='{
"type": "record",
"name": "__root__",
"fields": [
{
"name": "id",
"type": [
"null",
"long"
]
}
]
}', limit='2')
2022-03-08 15:49:19,026 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] hive.WrappedBigQueryAvroInputFormat: mapreduceInputSplit is (name='projects/axial-glow-224522/locations/us/streams/EgxfT1MwU0FJMGtvUUoaAmpkGgJpcigB', schema='{
"type": "record",
"name": "__root__",
"fields": [
{
"name": "id",
"type": [
"null",
"long"
]
}
]
}', limit='2'), class is com.google.cloud.hadoop.io.bigquery.DirectBigQueryInputFormat$DirectBigQueryInputSplit
2022-03-08 15:49:19,434 WARN [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] avro.AvroDeserializer: Received different schemas. Have to re-encode: {"type":"record","name":"__root__","fields":[{"name":"id","type":["null","long"]}]}
SIZE{null=org.apache.hadoop.hive.serde2.avro.AvroDeserializer$SchemaReEncoder#3077e4aa} ID null
2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.TableScanOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_TS_0:1,
2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.SelectOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_SEL_1:1,
2022-03-08 15:49:19,692 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.ListSinkOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_LIST_SINK_3:1,
99
Time taken: 0.215 seconds, Fetched: 1 row(s)
OK, I can see that id = 99 there.
Now let me insert a row into the Hive table and see what happens:
> insert into test.testmebq values(11);
exec.MapOperator: Initializing operator MAP[0]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] mr.ExecMapper:
<MAP>Id =0
<Children>null
<\Children>
<Parent><\Parent>
<\MAP>
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: Initializing operator TS[0]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[1]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT null
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.UDTFOperator: Initializing operator UDTF[2]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: Initializing operator SEL[3]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: SELECT struct<col1:int>
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Initializing operator FS[5]
2022-03-08 15:51:56,348 INFO [LocalJobRunner Map Task Executor #0] avro.AvroSerDe: AvroSerde::initialize(): Preset value of avro.schema.literal == {"type":"record","name":"testmebq","namespace":"test","fields":[{"name":"id","type":["null","long"],"default":null}]}
2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: Using serializer : org.apache.hadoop.hive.serde2.avro.AvroSerDe#619f4105 and formatter : com.google.cloud.hadoop.io.bigquery.hive.WrappedBigQueryAvroOutputFormat#2274f95f
2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.Utilities: PLAN PATH = file:/tmp/hive/ec1f7b38-7286-481a-b48b-44f7ed54c20a/hive_2022-03-08_15-51-55_732_3190209878982735306-1/-mr-10002/160d3b10-e96a-4b60-bbf6-f2fbdef3936d/map.xml
2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: New Final Path: FS hdfs://rhes75:9000/user/hive/warehouse/test.db/testmebq
2022-03-08 15:51:56,349 INFO [LocalJobRunner Map Task Executor #0] hive.WrappedBigQueryAvroOutputFormat: Set temporary output file to test_testmebq_a85ef2ba8644
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[5]: records written - 1
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: MAP[0]: records read - 1
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.MapOperator: DESERIALIZE_ERRORS:0, RECORDS_OUT_INTERMEDIATE:0, RECORDS_IN:3, RECORDS_OUT_OPERATOR_MAP_0:0,
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.TableScanOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_TS_0:1,
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_SEL_1:1,
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.UDTFOperator: RECORDS_OUT_OPERATOR_UDTF_2:1, RECORDS_OUT_INTERMEDIATE:0,
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.SelectOperator: RECORDS_OUT_OPERATOR_SEL_3:1, RECORDS_OUT_INTERMEDIATE:0,
2022-03-08 15:51:56,919 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: FS[5]: records written - 1
2022-03-08 15:51:57,272 WARN [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] mapreduce.Counters: Group org.apache.hadoop.mapred.Task$Counter is deprecated. Use org.apache.hadoop.mapreduce.TaskCounter instead
2022-03-08 15:51:57,272 Stage-2 map = 0%, reduce = 0%
2022-03-08 15:51:57,272 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: 2022-03-08 15:51:57,272 Stage-2 map = 0%, reduce = 0%
2022-03-08 15:51:57,646 INFO [LocalJobRunner Map Task Executor #0] exec.FileSinkOperator: RECORDS_OUT_1_test.testmebq:1, RECORDS_OUT_INTERMEDIATE:0, RECORDS_OUT_OPERATOR_FS_5:1,
2022-03-08 15:51:57,647 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner:
2022-03-08 15:51:57,647 INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task:attempt_local1263701903_0004_m_000000_0 is done. And is in the process of committing
2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: map
2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.Task: Task 'attempt_local1263701903_0004_m_000000_0' done.
2022-03-08 15:51:57,649 INFO [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner: Finishing task: attempt_local1263701903_0004_m_000000_0
2022-03-08 15:51:57,649 INFO [Thread-385] mapred.LocalJobRunner: map task executor complete.
2022-03-08 15:51:58,276 Stage-2 map = 100%, reduce = 0%
2022-03-08 15:51:58,276 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: 2022-03-08 15:51:58,276 Stage-2 map = 100%, reduce = 0%
Ended Job = job_local1263701903_0004
2022-03-08 15:51:58,276 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] exec.Task: Ended Job = job_local1263701903_0004
MapReduce Jobs Launched:
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: MapReduce Jobs Launched:
Stage-Stage-2: HDFS Read: 1030549 HDFS Write: 0 SUCCESS
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Stage-Stage-2: HDFS Read: 1030549 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 0 msec
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Total MapReduce CPU Time Spent: 0 msec
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: Completed executing command(queryId=hduser_20220308155155_9c3bc173-6893-4a5b-a061-226e2ac9b42a); Time taken: 2.297 seconds
OK
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] ql.Driver: OK
2022-03-08 15:51:58,277 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] lockmgr.DbTxnManager: Stopped heartbeat for query: hduser_20220308155155_9c3bc173-6893-4a5b-a061-226e2ac9b42a
Time taken: 2.558 seconds
2022-03-08 15:51:58,296 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] CliDriver: Time taken: 2.558 seconds
2022-03-08 15:51:58,296 INFO [ec1f7b38-7286-481a-b48b-44f7ed54c20a main] conf.HiveConf: Using the default value passed in for log id: ec1f7b38-7286-481a-
So it says OK and reports no errors.
But the record is NOT added to the local Hive table when I query it, never mind the BigQuery table!
So, in summary, we can read rows added to the BigQuery table through Hive (on-premises) but cannot add any rows to the Hive table itself?
I wonder if this has ever worked?
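For what it's worth, one diagnostic is to list the temporary GCS path declared in TBLPROPERTIES and see whether the insert ever staged anything there (a sketch only, using the bucket path from the DDL above):
# Diagnostic sketch: check whether any staging files were written for the insert
gsutil ls -r gs://tmp_storage_bucket/tmp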
Thanks
I have a Docker network of containers running a namenode, datanode, jobhistory, yarnmaster, oozie, and mysql. My Oozie can successfully submit a job to my Hadoop cluster. The job succeeds, but the job history server gets a connection refused to the Oozie callback. After a little while the Oozie web interface and instance stop working, and any command like "oozie job -info" returns a connection refused, like this:
bash-4.2$ oozie job -info 0000000-180822162217556-oozie-W
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/lib/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.hadoop.security.authentication.client.KerberosAuthenticator).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
Job ID : 0000000-180822162217556-oozie-W
------------------------------------------------------------------------------------------------------------------------------------
Workflow Name : WorkflowRunnerTest
App Path : hdfs://namenode:8020/user/hadoop/oozie-jobs/WordCountTest
Status : RUNNING
Run : 0
User : hadoop
Group : -
Created : 2018-08-22 16:22 GMT
Started : 2018-08-22 16:22 GMT
Last Modified : 2018-08-22 16:23 GMT
Ended : -
CoordAction ID: -
Actions
------------------------------------------------------------------------------------------------------------------------------------
ID Status Ext ID Ext Status Err Code
------------------------------------------------------------------------------------------------------------------------------------
0000000-180822162217556-oozie-W#:start: OK - OK -
------------------------------------------------------------------------------------------------------------------------------------
0000000-180822162217556-oozie-W#intersection0 RUNNING job_1534954806897_0001 RUNNING -
------------------------------------------------------------------------------------------------------------------------------------
bash-4.2$ oozie job -info 0000000-180822162217556-oozie-W
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/lib/slf4j-log4j12-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/lib/slf4j-simple-1.6.6.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Connection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 1 sec. Retry count = 1
Connection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 2 sec. Retry count = 2
Connection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 4 sec. Retry count = 3
Connection exception has occurred [ java.net.ConnectException Connection refused (Connection refused) ]. Trying after 8 sec. Retry count = 4
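Since both the callback and the CLI get connection refused, it may be worth checking whether the Oozie web port is still reachable from inside the Docker network around the time the job finishes. A rough sketch (the container names oozie and jobhistory are assumptions based on the setup described above, and curl must be available inside them):
# Diagnostic sketch: is the Oozie web server still listening on port 11000?
docker exec oozie curl -sv http://localhost:11000/oozie/
# and from the jobhistory container, which is the one issuing the callback:
docker exec jobhistory curl -sv http://oozie:11000/oozie/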
The jobhistory logs for this job look like this:
Showing 4096 bytes of 69256 total.
eds:0 ContAlloc:4 ContRel:0 HostLocal:3 RackLocal:0
2018-08-22 16:25:36,630 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Deleting staging directory hdfs://namenode:8020 /tmp/hadoop-yarn/staging/hadoop/.staging/job_1534954806897_0002
2018-08-22 16:25:36,636 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 35021
2018-08-22 16:25:36,638 INFO [IPC Server listener on 35021] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 35021
2018-08-22 16:25:36,639 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
2018-08-22 16:25:36,639 INFO [Ping Checker] org.apache.hadoop.yarn.util.AbstractLivelinessMonitor: TaskAttemptFinishingMonitor thread interrupted
2018-08-22 16:25:36,641 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-08-22 16:25:36,653 INFO [Thread-73] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Job end notification started for jobID : job_1534954806897_0002
2018-08-22 16:25:36,654 INFO [Thread-73] org.mortbay.log: Job end notification attempts left 0
2018-08-22 16:25:36,654 INFO [Thread-73] org.mortbay.log: Job end notification trying http://oozie:11000/oozie/callback?id=0000000-180822162217556-oozie-W#intersection0&status=SUCCEEDED
2018-08-22 16:25:36,663 WARN [Thread-73] org.mortbay.log: Job end notification to http://oozie:11000/oozie/callback?id=0000000-180822162217556-oozie-W#intersection0&status=SUCCEEDED failed
java.net.ConnectException: Connection refused (Connection refused)
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
at java.net.Socket.connect(Socket.java:589)
at sun.net.NetworkClient.doConnect(NetworkClient.java:175)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:463)
at sun.net.www.http.HttpClient.openServer(HttpClient.java:558)
at sun.net.www.http.HttpClient.<init>(HttpClient.java:242)
at sun.net.www.http.HttpClient.New(HttpClient.java:339)
at sun.net.www.http.HttpClient.New(HttpClient.java:357)
at sun.net.www.protocol.http.HttpURLConnection.getNewHttpClient(HttpURLConnection.java:1220)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1199)
at sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1050)
at sun.net.www.protocol.http.HttpURLConnection.connect(HttpURLConnection.java:984)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1564)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
at java.net.HttpURLConnection.getResponseCode(HttpURLConnection.java:480)
at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notifyURLOnce(JobEndNotifier.java:130)
at org.apache.hadoop.mapreduce.v2.app.JobEndNotifier.notify(JobEndNotifier.java:174)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.sendJobEndNotify(MRAppMaster.java:686)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.shutDownJob(MRAppMaster.java:654)
at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$JobFinishEventHandler$1.run(MRAppMaster.java:728)
2018-08-22 16:25:37,666 WARN [Thread-73] org.mortbay.log: Job end notification failed to notify : http://oozie:11000/oozie/callback?id=0000000-180822162217556-oozie-W#intersection0&status=SUCCEEDED
2018-08-22 16:25:42,667 INFO [Thread-73] org.apache.hadoop.ipc.Server: Stopping server on 41027
2018-08-22 16:25:42,668 INFO [IPC Server listener on 41027] org.apache.hadoop.ipc.Server: Stopping IPC Server listener on 41027
2018-08-22 16:25:42,670 INFO [IPC Server Responder] org.apache.hadoop.ipc.Server: Stopping IPC Server Responder
2018-08-22 16:25:42,678 INFO [Thread-73] org.mortbay.log: Stopped HttpServer2$SelectChannelConnectorWithSafeStartup#0.0.0.0:0
Is there anything specific that could be causing this hiccup?
Here is the output of oozie.log:
2018-08-22 20:25:21,367 INFO Services:520 - SERVER[oozie] Initialized
2018-08-22 20:25:21,369 INFO Services:520 - SERVER[oozie] Running with JARs for Hadoop version [2.6.5]
2018-08-22 20:25:21,369 INFO Services:520 - SERVER[oozie] Oozie System ID [oozie] started!
2018-08-22 20:25:31,345 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:25:31,345 INFO PauseTransitService:520 - SERVER[oozie] Acquired lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:25:31,348 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running coordinator status service first instance
2018-08-22 20:25:31,609 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running bundle status service first instance
2018-08-22 20:25:31,637 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Released lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:25:31,641 INFO CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] CoordMaterializeTriggerService - Curr Date= 2018-08-22T20:30Z, Num jobs to materialize = 0
2018-08-22 20:25:31,648 INFO CoordMaterializeTriggerService$CoordMaterializeTriggerRunnable:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Released lock for [org.apache.oozie.service.CoordMaterializeTriggerService]
2018-08-22 20:25:31,723 INFO PurgeXCommand:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] STARTED Purge to purge Workflow Jobs older than [30] days, Coordinator Jobs older than [7] days, and Bundlejobs older than [7] days.
2018-08-22 20:25:31,723 INFO PurgeXCommand:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] ENDED Purge deleted [0] workflows, [0] coordinatorActions, [0] coordinators, [0] bundles
2018-08-22 20:25:31,746 INFO PauseTransitService:520 - SERVER[oozie] Released lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:26:31,571 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:26:31,572 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running coordinator status service from last instance time = 2018-08-22T20:25Z
2018-08-22 20:26:31,614 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running bundle status service from last instance time = 2018-08-22T20:25Z
2018-08-22 20:26:31,641 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Released lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:26:31,676 INFO PauseTransitService:520 - SERVER[oozie] Acquired lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:26:31,708 INFO PauseTransitService:520 - SERVER[oozie] Released lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:27:31,571 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Acquired lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:27:31,572 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running coordinator status service from last instance time = 2018-08-22T20:26Z
2018-08-22 20:27:31,584 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Running bundle status service from last instance time = 2018-08-22T20:26Z
2018-08-22 20:27:31,589 INFO StatusTransitService$StatusTransitRunnable:520 - SERVER[oozie] Released lock for [org.apache.oozie.service.StatusTransitService]
2018-08-22 20:27:31,639 INFO PauseTransitService:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Acquired lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:27:31,661 INFO PauseTransitService:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[-] ACTION[-] Released lock for [org.apache.oozie.service.PauseTransitService]
2018-08-22 20:27:47,241 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#:start:] Start action [0000000-180822202517586-oozie-W#:start:] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2018-08-22 20:27:47,242 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#:start:] [***0000000-180822202517586-oozie-W#:start:***]Action status=DONE
2018-08-22 20:27:47,242 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#:start:] [***0000000-180822202517586-oozie-W#:start:***]Action updated in DB!
2018-08-22 20:27:47,394 INFO WorkflowNotificationXCommand:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180822202517586-oozie-W] ACTION[] No Notification URL is defined. Therefore nothing to notify for job 0000000-180822202517586-oozie-W
2018-08-22 20:27:47,394 INFO WorkflowNotificationXCommand:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#:start:] No Notification URL is defined. Therefore nothing to notify for job 0000000-180822202517586-oozie-W#:start:
2018-08-22 20:27:47,432 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] Start action [0000000-180822202517586-oozie-W#intersection0] with user-retry state : userRetryCount [0], userRetryMax [0], userRetryInterval [10]
2018-08-22 20:27:47,507 INFO HadoopAccessorService:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] Processing configuration file [/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/conf/action-conf/default.xml] for action[default] and hostPort [*]
2018-08-22 20:27:47,508 INFO HadoopAccessorService:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] Processing configuration file [/opt/app-root/oozie-4.3.0/distro/target/oozie-4.3.0-distro/oozie-4.3.0/conf/action-conf/map-reduce.xml] for action [map-reduce] and hostPort [*]
2018-08-22 20:27:48,482 WARN JobResourceUploader:64 - SERVER[oozie] Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2018-08-22 20:27:48,493 WARN JobResourceUploader:171 - SERVER[oozie] No job jar file set. User classes may not be found. See Job or Job#setJar(String).
2018-08-22 20:27:50,173 INFO MapReduceActionExecutor:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] checking action, hadoop job ID [job_1534969405649_0001] status [RUNNING]
2018-08-22 20:27:50,175 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] [***0000000-180822202517586-oozie-W#intersection0***]Action status=RUNNING
2018-08-22 20:27:50,176 INFO ActionStartXCommand:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] [***0000000-180822202517586-oozie-W#intersection0***]Action updated in DB!
2018-08-22 20:27:50,208 INFO WorkflowNotificationXCommand:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] No Notification URL is defined. Therefore nothing to notify for job 0000000-180822202517586-oozie-W#intersection0
2018-08-22 20:28:02,437 INFO CallbackServlet:520 - SERVER[oozie] USER[-] GROUP[-] TOKEN[-] APP[-] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] callback for action [0000000-180822202517586-oozie-W#intersection0]
2018-08-22 20:28:06,269 INFO MapReduceActionExecutor:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] External ID swap, old ID [job_1534969405649_0001] new ID [job_1534969405649_0002]
2018-08-22 20:28:06,273 INFO MapReduceActionExecutor:520 - SERVER[oozie] USER[hadoop] GROUP[-] TOKEN[] APP[WorkflowRunnerTest] JOB[0000000-180822202517586-oozie-W] ACTION[0000000-180822202517586-oozie-W#intersection0] checking action, hadoop job ID [job_1534969405649_0002] status [RUNNING]
Did you try the oozie job -info command adding
-oozie $OOZIE_URL
where OOZIE_URL is a variable set to the actual Oozie URL?
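For example, something along these lines (the host/port are assumptions taken from the callback URL in the logs; adjust to the actual Oozie server):
# Assumed host/port; point the Oozie client explicitly at the Oozie web URL
export OOZIE_URL=http://oozie:11000/oozie
oozie job -oozie $OOZIE_URL -info 0000000-180822162217556-oozie-W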
EDIT:
There was an update to the plugin some weeks ago, and now I get this in the Jenkins log:
Aug 14, 2018 8:57:26 AM WARNING com.sonyericsson.hudson.plugins.gerrit.trigger.playback.GerritMissedEventsPlaybackManager performCheck
Missed Events Playback used to be NOT supported. now it IS!
Aug 14, 2018 8:57:26 AM INFO com.sonymobile.tools.gerrit.gerritevents.GerritConnection run
And in GERRIT_SITE/logs/error_log it says the plugin is loaded:
[2018-08-14 10:56:57,213] [ShutdownCallback] INFO com.google.gerrit.pgm.Daemon : caught shutdown, cleaning up
[2018-08-14 10:56:57,380] [ShutdownCallback] INFO org.eclipse.jetty.server.AbstractConnector : Stopped ServerConnector#500beb9f{HTTP/1.1,[http/1.1]}{127.0.0.1:8081}
[2018-08-14 10:56:57,403] [ShutdownCallback] INFO org.eclipse.jetty.server.handler.ContextHandler : Stopped o.e.j.s.ServletContextHandler#3437fc4f{/,null,UNAVAILABLE}
[2018-08-14 10:56:57,469] [ShutdownCallback] WARN org.apache.sshd.server.channel.ChannelSession : doCloseImmediately(ChannelSession[id=1, recipient=1]-ServerSessionIm$
[2018-08-14 10:56:57,508] [ShutdownCallback] INFO com.google.gerrit.sshd.SshDaemon : Stopped Gerrit SSHD
[2018-08-14 10:57:21,044] [main] WARN com.google.gerrit.sshd.SshDaemon : Cannot format SSHD host key [EdDSA]: invalid key type
[2018-08-14 10:57:21,061] [main] WARN com.google.gerrit.server.config.GitwebCgiConfig : gitweb not installed (no /usr/lib/cgi-bin/gitweb.cgi found)
[2018-08-14 10:57:22,289] [main] INFO org.eclipse.jetty.util.log : Logging initialized #15822ms
[2018-08-14 10:57:22,430] [main] INFO com.google.gerrit.server.git.LocalDiskRepositoryManager : Defaulting core.streamFileThreshold to 1339m
[2018-08-14 10:57:22,784] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loading plugins from /opt/gerrit/plugins
[2018-08-14 10:57:23,056] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loaded plugin delete-project, version v2.13-61-g8d6b23b122
[2018-08-14 10:57:23,500] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loaded plugin events-log, version v2.13-66-ge95af940c6
[2018-08-14 10:57:24,150] [main] INFO com.google.gerrit.server.git.GarbageCollectionRunner : Ignoring missing gc schedule configuration
[2018-08-14 10:57:24,151] [main] INFO com.google.gerrit.server.config.ScheduleConfig : accountDeactivation schedule parameter "accountDeactivation.interval" is not co$
[2018-08-14 10:57:24,151] [main] INFO com.google.gerrit.server.change.ChangeCleanupRunner : Ignoring missing changeCleanup schedule configuration
[2018-08-14 10:57:24,295] [main] INFO com.google.gerrit.sshd.SshDaemon : Started Gerrit SSHD-CORE-1.6.0 on *:29418
[2018-08-14 10:57:24,298] [main] INFO org.eclipse.jetty.server.Server : jetty-9.3.18.v20170406
[2018-08-14 10:57:25,454] [main] INFO org.eclipse.jetty.server.handler.ContextHandler : Started o.e.j.s.ServletContextHandler#73f0b216{/,null,AVAILABLE}
[2018-08-14 10:57:25,475] [main] INFO org.eclipse.jetty.server.AbstractConnector : Started ServerConnector#374013e8{HTTP/1.1,[http/1.1]}{127.0.0.1:8081}
[2018-08-14 10:57:25,476] [main] INFO org.eclipse.jetty.server.Server : Started #19011ms
[2018-08-14 10:57:25,478] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.15.1 ready
So now this is solved.
I am trying to solve the issue with the Missed Events Playback warning I get in Jenkins.
I've enabled the REST API in Jenkins with my generated HTTP password from the Gerrit web UI.
So my issue is with the events-log plugin.
I've installed the events-log.jar plugin under GERRIT_SITE/gerrit/plugins.
This directory has drwxr-xr-x as permission settings.
GERRIT_SITE/gerrit/logs/error_log gives me this when restarting:
[2018-06-21 13:40:34,678] [main] WARN com.google.gerrit.sshd.SshDaemon : Cannot format SSHD host key [EdDSA]: invalid key type
[2018-06-21 13:40:34,697] [main] WARN com.google.gerrit.server.config.GitwebCgiConfig : gitweb not installed (no /usr/lib/cgi-bin/gitweb.cgi found)
[2018-06-21 13:40:35,761] [main] INFO org.eclipse.jetty.util.log : Logging initialized #11099ms
[2018-06-21 13:40:35,925] [main] INFO com.google.gerrit.server.git.LocalDiskRepositoryManager : Defaulting core.streamFileThreshold to 1339m
[2018-06-21 13:40:36,410] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Removing stale plugin file: plugin_events-log_180621_1333_5163201567282630382.jar
[2018-06-21 13:40:36,410] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loading plugins from /opt/gerrit/plugins
[2018-06-21 13:40:36,528] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loaded plugin delete-project, version v2.13-61-g8d6b23b122
[2018-06-21 13:40:36,614] [main] WARN com.google.gerrit.server.plugins.PluginLoader : Cannot load plugin events-log
java.lang.NoSuchMethodError: com.google.gerrit.server.git.WorkQueue.createQueue(ILjava/lang/String;)Ljava/util/concurrent/ScheduledThreadPoolExecutor;
at com.ericsson.gerrit.plugins.eventslog.EventQueue.start(EventQueue.java:35)
at com.google.gerrit.lifecycle.LifecycleManager.start(LifecycleManager.java:92)
at com.google.gerrit.server.plugins.ServerPlugin.startPlugin(ServerPlugin.java:251)
at com.google.gerrit.server.plugins.ServerPlugin.start(ServerPlugin.java:192)
at com.google.gerrit.server.plugins.PluginLoader.runPlugin(PluginLoader.java:491)
at com.google.gerrit.server.plugins.PluginLoader.rescan(PluginLoader.java:419)
at com.google.gerrit.server.plugins.PluginLoader.start(PluginLoader.java:324)
at com.google.gerrit.lifecycle.LifecycleManager.start(LifecycleManager.java:92)
at com.google.gerrit.pgm.Daemon.start(Daemon.java:349)
at com.google.gerrit.pgm.Daemon.run(Daemon.java:256)
at com.google.gerrit.pgm.util.AbstractProgram.main(AbstractProgram.java:61)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at com.google.gerrit.launcher.GerritLauncher.invokeProgram(GerritLauncher.java:223)
at com.google.gerrit.launcher.GerritLauncher.mainImpl(GerritLauncher.java:119)
at com.google.gerrit.launcher.GerritLauncher.main(GerritLauncher.java:63)
at Main.main(Main.java:24)
[2018-06-21 13:40:36,687] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loaded plugin gitiles, version dd264dd2d4
[2018-06-21 13:40:36,728] [main] INFO com.google.gerrit.server.plugins.PluginLoader : Loaded plugin its-jira, version v2.15
[2018-06-21 13:40:37,034] [main] INFO com.google.gerrit.server.git.GarbageCollectionRunner : Ignoring missing gc schedule configuration
[2018-06-21 13:40:37,034] [main] INFO com.google.gerrit.server.config.ScheduleConfig : accountDeactivation schedule parameter "accountDeactivation.interval" is not configured
[2018-06-21 13:40:37,034] [main] INFO com.google.gerrit.server.change.ChangeCleanupRunner : Ignoring missing changeCleanup schedule configuration
[2018-06-21 13:40:37,060] [main] INFO com.google.gerrit.sshd.SshDaemon : Started Gerrit SSHD-CORE-1.6.0 on *:29418
[2018-06-21 13:40:37,074] [main] INFO org.eclipse.jetty.server.Server : jetty-9.3.18.v20170406
[2018-06-21 13:40:38,104] [main] INFO org.eclipse.jetty.server.handler.ContextHandler : Started o.e.j.s.ServletContextHandler#2c8469fe{/,null,AVAILABLE}
[2018-06-21 13:40:38,113] [main] INFO org.eclipse.jetty.server.AbstractConnector : Started ServerConnector#3803bc1a{HTTP/1.1,[http/1.1]}{127.0.0.1:8081}
[2018-06-21 13:40:38,115] [main] INFO org.eclipse.jetty.server.Server : Started #13456ms
[2018-06-21 13:40:38,118] [main] INFO com.google.gerrit.pgm.Daemon : Gerrit Code Review 2.15.1 ready
I would like some help on why this plugin is not loading/enabled when the other plugins are working.
Note 1: Jenkins v2.107.2 and Gerrit v2.15.1 are installed on different Linux-based servers, and I'm able to trigger a build from Gerrit.
Note 3: I tried both with the plugin-manager (uninstalled for now) and with the command wget https://gerrit-ci.gerritforge.com/view/Plugins-stable-2.15/job/plugin-events-log-bazel-stable-2.15/lastSuccessfulBuild/artifact/bazel-genfiles/plugins/events-log/events-log.jar, which is the way I'm doing it now.
Note 3: events-log in gerrit.config looks like this:
[plugin "events-log"]
maxAge = 20
returnLimit = 10000
storeDriver = org.postgresql.Driver
storeUsername = gerrit
storeUrl = jdbc:postgresql:/var/lib/postgresql/9.5/main
urlOptions = loglevel=INFO
urlOptions = logUnclosedConnections=true
copyLocal = true
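For what it's worth, once the JAR is in place, whether Gerrit actually picked the plugin up after a restart can be checked over the SSH admin interface (a sketch; host and user below are placeholders for this setup):
# Sketch only: confirm Gerrit loaded the events-log plugin after the restart
ssh -p 29418 admin@gerrit-host gerrit plugin ls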
I'm working on setting up a local test of Flink 1.4.0 that writes to S3, and I'm getting the following error:
java.lang.NoClassDefFoundError: Could not initialize class org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.internal.S3ErrorResponseHandler
at org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:363)
at org.apache.flink.fs.s3presto.shaded.com.amazonaws.services.s3.AmazonS3Client.<init>(AmazonS3Client.java:542)
at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.PrestoS3FileSystem.createAmazonS3Client(PrestoS3FileSystem.java:639)
at org.apache.flink.fs.s3presto.shaded.com.facebook.presto.hive.PrestoS3FileSystem.initialize(PrestoS3FileSystem.java:212)
at org.apache.flink.fs.s3presto.S3FileSystemFactory.create(S3FileSystemFactory.java:132)
at org.apache.flink.core.fs.FileSystem.getUnguardedFileSystem(FileSystem.java:397)
at org.apache.flink.core.fs.FileSystem.get(FileSystem.java:320)
at org.apache.flink.core.fs.Path.getFileSystem(Path.java:293)
at org.apache.flink.runtime.state.filesystem.FsCheckpointStreamFactory.<init>(FsCheckpointStreamFactory.java:99)
at org.apache.flink.runtime.state.filesystem.FsStateBackend.createStreamFactory(FsStateBackend.java:277)
at org.apache.flink.streaming.runtime.tasks.StreamTask.createCheckpointStreamFactory(StreamTask.java:787)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:247)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
Following the documentation here, I added flink-s3-fs-presto-1.4.0.jar from opt/ to lib/, so I'm not exactly sure why I'm getting this error. Any help would be appreciated; let me know if I can add additional information.
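Concretely, this is roughly what that setup step looks like (a sketch run from the Flink distribution directory; the credential values are placeholders):
# Copy the presto-based S3 filesystem from opt/ to lib/ so it is on the classpath
cp opt/flink-s3-fs-presto-1.4.0.jar lib/
# flink-conf.yaml entries for the S3 credentials (placeholder values)
#   s3.access-key: XXXXXXXXXXXXXXXXXXXX
#   s3.secret-key: YYYYYYYYYYYYYYYYYYYY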
Here is some more information about my system and process:
I start the local job manager:
[flink-1.4.0] ./bin/start-local.sh
Warning: this file is deprecated and will be removed in 1.5.
Starting cluster.
Starting jobmanager daemon on host MBP0535.local.
Starting taskmanager daemon on host MBP0535.local.
OS information:
[flink-1.4.0] system_profiler SPSoftwareDataType
Software:
System Software Overview:
System Version: macOS 10.13.2 (17C205)
Kernel Version: Darwin 17.3.0
Boot Volume: Macintosh HD
I try to run the jar:
[flink-1.4.0] ./bin/flink run streaming.jar
I'm actually having trouble reproducing the error. Here is the task manager log:
2018-01-18 10:17:07,668 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager (Version: 1.4.0, Rev:3a9d9f2, Date:06.12.2017 # 11:08:40 UTC)
2018-01-18 10:17:07,668 INFO org.apache.flink.runtime.taskmanager.TaskManager - OS current user: k
2018-01-18 10:17:08,002 WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-01-18 10:17:08,084 INFO org.apache.flink.runtime.taskmanager.TaskManager - Current Hadoop/Kerberos user: k
2018-01-18 10:17:08,084 INFO org.apache.flink.runtime.taskmanager.TaskManager - JVM: Java HotSpot(TM) 64-Bit Server VM - Oracle Corporation - 1.8/25.152-b16
2018-01-18 10:17:08,084 INFO org.apache.flink.runtime.taskmanager.TaskManager - Maximum heap size: 1024 MiBytes
2018-01-18 10:17:08,084 INFO org.apache.flink.runtime.taskmanager.TaskManager - JAVA_HOME: /Library/Java/JavaVirtualMachines/jdk1.8.0_152.jdk/Contents/Home
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - Hadoop version: 2.8.1
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - JVM Options:
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -XX:+UseG1GC
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Xms1024M
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Xmx1024M
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -XX:MaxDirectMemorySize=8388607T
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Dlog.file=/Users/k/flink-1.4.0/log/flink-k-taskmanager-0-MBP0535.local.log
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Dlog4j.configuration=file:/Users/k/flink-1.4.0/conf/log4j.properties
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - -Dlogback.configurationFile=file:/Users/k/flink-1.4.0/conf/logback.xml
2018-01-18 10:17:08,087 INFO org.apache.flink.runtime.taskmanager.TaskManager - Program Arguments:
2018-01-18 10:17:08,088 INFO org.apache.flink.runtime.taskmanager.TaskManager - --configDir
2018-01-18 10:17:08,088 INFO org.apache.flink.runtime.taskmanager.TaskManager - /Users/k/flink-1.4.0/conf
2018-01-18 10:17:08,088 INFO org.apache.flink.runtime.taskmanager.TaskManager - Classpath: /Users/k/flink-1.4.0/lib/flink-python_2.11-1.4.0.jar:/Users/k/flink-1.4.0/lib/flink-s3-fs-hadoop-1.4.0.jar:/Users/k/flink-1.4.0/lib/flink-shaded-hadoop2-uber-1.4.0.jar:/Users/k/flink-1.4.0/lib/log4j-1.2.17.jar:/Users/k/flink-1.4.0/lib/slf4j-log4j12-1.7.7.jar:/Users/k/flink-1.4.0/lib/flink-dist_2.11-1.4.0.jar:::
2018-01-18 10:17:08,089 INFO org.apache.flink.runtime.taskmanager.TaskManager - Registered UNIX signal handlers for [TERM, HUP, INT]
2018-01-18 10:17:08,094 INFO org.apache.flink.runtime.taskmanager.TaskManager - Maximum number of open file descriptors is 10240
2018-01-18 10:17:08,117 INFO org.apache.flink.runtime.taskmanager.TaskManager - Loading configuration from /Users/k/flink-1.4.0/conf
2018-01-18 10:17:08,119 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: classloader.resolve-order, parent-first
2018-01-18 10:17:08,119 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: classloader.parent-first-patterns, java.;org.apache.flink.;javax.annotation;org.slf4j;org.apache.log4j;org.apache.logging.log4j;ch.qos.logback;com.mapr.;org.apache.
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: s3.access-key, XXXXXXXXXXXXXXXXXXXX
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: s3.secret-key, YYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYYY
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.address, localhost
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.rpc.port, 6123
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: jobmanager.heap.mb, 1024
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.heap.mb, 1024
2018-01-18 10:17:08,120 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.numberOfTaskSlots, 1
2018-01-18 10:17:08,121 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: taskmanager.memory.preallocate, false
2018-01-18 10:17:08,121 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: parallelism.default, 1
2018-01-18 10:17:08,121 INFO org.apache.flink.configuration.GlobalConfiguration - Loading configuration property: web.port, 8082
2018-01-18 10:17:08,199 INFO org.apache.flink.runtime.security.modules.HadoopModule - Hadoop user set to k (auth:SIMPLE)
2018-01-18 10:17:08,289 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils - Trying to select the network interface and address to use by connecting to the leading JobManager.
2018-01-18 10:17:08,289 INFO org.apache.flink.runtime.util.LeaderRetrievalUtils - TaskManager will try to connect for 10000 milliseconds before falling back to heuristics
2018-01-18 10:17:08,291 INFO org.apache.flink.runtime.net.ConnectionUtils - Retrieved new target address localhost/127.0.0.1:6123.
2018-01-18 10:17:08,472 INFO org.apache.flink.runtime.taskmanager.TaskManager - TaskManager will use hostname/address 'MBP0535.local' (10.1.11.139) for communication.
2018-01-18 10:17:08,482 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager
2018-01-18 10:17:08,482 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor system at MBP0535.local:54024.
2018-01-18 10:17:08,484 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to start actor system at mbp0535.local:54024
2018-01-18 10:17:08,898 INFO akka.event.slf4j.Slf4jLogger - Slf4jLogger started
2018-01-18 10:17:08,960 INFO akka.remote.Remoting - Starting remoting
2018-01-18 10:17:09,087 INFO akka.remote.Remoting - Remoting started; listening on addresses :[akka.tcp://flink#mbp0535.local:54024]
2018-01-18 10:17:09,097 INFO org.apache.flink.runtime.taskmanager.TaskManager - Actor system started at akka.tcp://flink#mbp0535.local:54024
2018-01-18 10:17:09,105 INFO org.apache.flink.runtime.metrics.MetricRegistryImpl - No metrics reporter configured, no metrics will be exposed/reported.
2018-01-18 10:17:09,111 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor
2018-01-18 10:17:09,115 INFO org.apache.flink.runtime.io.network.netty.NettyConfig - NettyConfig [server address: MBP0535.local/10.1.11.139, server port: 0, ssl enabled: false, memory segment size (bytes): 32768, transport type: NIO, number of server threads: 1 (manual), number of client threads: 1 (manual), server connect backlog: 0 (use Netty's default), client connect timeout (sec): 120, send/receive buffer size (bytes): 0 (use Netty's default)]
2018-01-18 10:17:09,118 INFO org.apache.flink.runtime.taskexecutor.TaskManagerConfiguration - Messages have a max timeout of 10000 ms
2018-01-18 10:17:09,122 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices - Temporary file directory '/var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T': total 465 GB, usable 333 GB (71.61% usable)
2018-01-18 10:17:09,236 INFO org.apache.flink.runtime.io.network.buffer.NetworkBufferPool - Allocated 101 MB for network buffer pool (number of memory segments: 3255, bytes per segment: 32768).
2018-01-18 10:17:09,323 WARN org.apache.flink.runtime.query.QueryableStateUtils - Could not load Queryable State Client Proxy. Probable reason: flink-queryable-state-runtime is not in the classpath. Please put the corresponding jar from the opt to the lib folder.
2018-01-18 10:17:09,324 WARN org.apache.flink.runtime.query.QueryableStateUtils - Could not load Queryable State Server. Probable reason: flink-queryable-state-runtime is not in the classpath. Please put the corresponding jar from the opt to the lib folder.
2018-01-18 10:17:09,324 INFO org.apache.flink.runtime.io.network.NetworkEnvironment - Starting the network environment and its components.
2018-01-18 10:17:09,353 INFO org.apache.flink.runtime.io.network.netty.NettyClient - Successful initialization (took 23 ms).
2018-01-18 10:17:09,378 INFO org.apache.flink.runtime.io.network.netty.NettyServer - Successful initialization (took 25 ms). Listening on SocketAddress /10.1.11.139:54026.
2018-01-18 10:17:09,381 WARN org.apache.flink.runtime.taskmanager.TaskManagerLocation - No hostname could be resolved for the IP address 10.1.11.139, using IP address as host name. Local input split assignment (such as for HDFS files) may be impacted.
2018-01-18 10:17:09,431 INFO org.apache.flink.runtime.taskexecutor.TaskManagerServices - Limiting managed memory to 0.7 of the currently free heap space (640 MB), memory will be allocated lazily.
2018-01-18 10:17:09,437 INFO org.apache.flink.runtime.io.disk.iomanager.IOManager - I/O manager uses directory /var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T/flink-io-186cf8c8-5a0d-44cc-9d78-e81c943b0b9f for spill files.
2018-01-18 10:17:09,439 INFO org.apache.flink.runtime.filecache.FileCache - User file cache uses directory /var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T/flink-dist-cache-a9a568cd-c7cd-45c6-abbe-08912d051583
2018-01-18 10:17:09,509 INFO org.apache.flink.runtime.filecache.FileCache - User file cache uses directory /var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T/flink-dist-cache-bd3cc98c-cebb-4569-98d3-5357393d8c5b
2018-01-18 10:17:09,516 INFO org.apache.flink.runtime.taskmanager.TaskManager - Starting TaskManager actor at akka://flink/user/taskmanager#1044592356.
2018-01-18 10:17:09,516 INFO org.apache.flink.runtime.taskmanager.TaskManager - TaskManager data connection information: 97b3a934f84ba25e20aae8a91a40e336 # 10.1.11.139 (dataPort=54026)
2018-01-18 10:17:09,516 INFO org.apache.flink.runtime.taskmanager.TaskManager - TaskManager has 1 task slot(s).
2018-01-18 10:17:09,518 INFO org.apache.flink.runtime.taskmanager.TaskManager - Memory usage stats: [HEAP: 112/1024/1024 MB, NON HEAP: 35/36/-1 MB (used/committed/max)]
2018-01-18 10:17:09,522 INFO org.apache.flink.runtime.taskmanager.TaskManager - Trying to register at JobManager akka.tcp://flink#localhost:6123/user/jobmanager (attempt 1, timeout: 500 milliseconds)
2018-01-18 10:17:09,692 INFO org.apache.flink.runtime.taskmanager.TaskManager - Successful registration at JobManager (akka.tcp://flink#localhost:6123/user/jobmanager), starting network stack and library cache.
2018-01-18 10:17:09,696 INFO org.apache.flink.runtime.taskmanager.TaskManager - Determined BLOB server address to be localhost/127.0.0.1:54025. Starting BLOB cache.
2018-01-18 10:17:09,699 INFO org.apache.flink.runtime.blob.PermanentBlobCache - Created BLOB cache storage directory /var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T/blobStore-77287aab-5128-4363-842c-1a124114fd91
2018-01-18 10:17:09,702 INFO org.apache.flink.runtime.blob.TransientBlobCache - Created BLOB cache storage directory /var/folders/sw/jcdfbbc15td51f3635hvt77w0000gp/T/blobStore-c9f62e97-bf53-4fc4-9e4a-1958706e78ec
2018-01-18 10:26:25,993 INFO org.apache.flink.runtime.taskmanager.TaskManager - Received task Source: Kafka -> Sink: S3 (1/1)
2018-01-18 10:26:25,993 INFO org.apache.flink.runtime.taskmanager.Task - Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) switched from CREATED to DEPLOYING.
2018-01-18 10:26:25,994 INFO org.apache.flink.runtime.taskmanager.Task - Creating FileSystem stream leak safety net for task Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) [DEPLOYING]
2018-01-18 10:26:25,996 INFO org.apache.flink.runtime.taskmanager.Task - Loading JAR files for task Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) [DEPLOYING].
2018-01-18 10:26:25,998 INFO org.apache.flink.runtime.blob.BlobClient - Downloading 34e7c81bd4a0050e7809a1343af0c7cb/p-4eaec529eb247f30ef2d3ddc2308e029e625de33-93fe90509266a50ffadce2131cedc514 from localhost/127.0.0.1:54025
2018-01-18 10:26:26,238 INFO org.apache.flink.runtime.taskmanager.Task - Registering task at network: Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) [DEPLOYING].
2018-01-18 10:26:26,240 INFO org.apache.flink.runtime.taskmanager.Task - Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) switched from DEPLOYING to RUNNING.
2018-01-18 10:26:26,249 INFO org.apache.flink.streaming.runtime.tasks.StreamTask - Using user-defined state backend: File State Backend # s3://stream-data/checkpoints.
2018-01-18 10:26:26,522 INFO org.apache.flink.fs.s3hadoop.shaded.org.apache.hadoop.util.NativeCodeLoader - Skipping native-hadoop library for flink-s3-fs-hadoop's relocated Hadoop... using builtin-java classes where applicable
2018-01-18 10:26:29,041 ERROR org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink - Error while creating FileSystem when initializing the state of the BucketingSink.
java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1196)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
2018-01-18 10:26:29,048 INFO org.apache.flink.runtime.taskmanager.Task - Source: Kafka -> Sink: S3 (1/1) (95b54853308d69fbb84ee308508bf397) switched from RUNNING to FAILED.
java.lang.RuntimeException: Error while creating FileSystem when initializing the state of the BucketingSink.
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:358)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.tryRestoreFunction(StreamingFunctionUtils.java:178)
at org.apache.flink.streaming.util.functions.StreamingFunctionUtils.restoreFunctionState(StreamingFunctionUtils.java:160)
at org.apache.flink.streaming.api.operators.AbstractUdfStreamOperator.initializeState(AbstractUdfStreamOperator.java:96)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:259)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeOperators(StreamTask.java:694)
at org.apache.flink.streaming.runtime.tasks.StreamTask.initializeState(StreamTask.java:682)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:253)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:718)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.IOException: No FileSystem for scheme: s3
at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2660)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.createHadoopFileSystem(BucketingSink.java:1196)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initFileSystem(BucketingSink.java:411)
at org.apache.flink.streaming.connectors.fs.bucketing.BucketingSink.initializeState(BucketingSink.java:355)
... 9 more
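The "No FileSystem for scheme: s3" error above is thrown inside BucketingSink.createHadoopFileSystem, which falls back to a plain Hadoop FileSystem lookup, and that Hadoop configuration has no mapping for the s3 scheme. A commonly suggested workaround (a sketch only, not verified against this exact setup) is to put hadoop-aws and the AWS SDK jars on the classpath and map the s3 scheme to the S3A implementation in a core-site.xml that the sink can see, for example by pointing fs.hdfs.hadoopconf in flink-conf.yaml at the directory containing it. The property names below are standard Hadoop/hadoop-aws settings; the credential values are placeholders.
<!-- core-site.xml visible to the TaskManager's Hadoop configuration -->
<configuration>
  <!-- Map the bare "s3" scheme to the S3A filesystem from hadoop-aws -->
  <property>
    <name>fs.s3.impl</name>
    <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
  </property>
  <!-- Placeholder credentials; prefer an AWS credentials provider in practice -->
  <property>
    <name>fs.s3a.access.key</name>
    <value>YOUR_ACCESS_KEY</value>
  </property>
  <property>
    <name>fs.s3a.secret.key</name>
    <value>YOUR_SECRET_KEY</value>
  </property>
</configuration>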
I've spent a few days working on our SC7 upgrade (from Sitecore 6.6) and I'm running into an issue rebuilding the indexes (Sitecore desktop > control panel > indexing > indexing manager > web > rebuild).
I've stopped our various scheduled tasks and am using a private instance of SC, so there are no users or background processes getting in the way. Prior to rebuilding, I stop IIS/worker processes and delete any existing index files on the filesystem.
The indexing will begin, and the wizard will update on progress to about 15,000 (out of 20,000 items). At that point it will just start going slower and slower, before stopping. Sometimes the CPU maxes out and stays there. Sometimes the RAM fills up. Sometimes w3wp crashes with a heap corruption error. Usually (4/5 times) it just stops and goes no further.
The log and the Crawler log don't seem to show anything unusual. There are a couple of messages about large MP4 videos that can't be resized, and some binary files whose content can't be indexed. Sitecore just seems to stop and restart (a new log file is started and contains the standard SC 'boot up' messages).
Log:
ManagedPoolThread #9 17:23:44 INFO Job started: Index_Update_IndexName=sitecore_web_index
ManagedPoolThread #11 17:23:50 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #11 17:24:55 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #11 17:24:55 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #4 17:25:55 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #4 17:25:55 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #1 17:27:00 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #1 17:27:00 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #1 17:27:00 INFO Examining schedules (count: 0)
ManagedPoolThread #1 17:27:00 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
12088 17:27:54 ERROR Could not resize image as it was larger than the maximum size allowed for memory processing. Media item: {0} [/sitecore/media library/Files/[MP4 film]?sc_database=web]
ManagedPoolThread #16 17:28:05 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #16 17:28:05 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #16 17:28:05 INFO Examining schedules (count: 0)
ManagedPoolThread #16 17:28:05 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #16 17:29:05 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #16 17:29:05 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #16 17:29:05 INFO Examining schedules (count: 0)
ManagedPoolThread #16 17:29:05 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
1668 17:29:13 ERROR Could not resize image as it was larger than the maximum size allowed for memory processing. Media item: {0} [/sitecore/media library/Files/[MP4 film]?sc_database=web]
1668 17:29:33 ERROR Could not resize image as it was larger than the maximum size allowed for memory processing. Media item: {0} [/sitecore/media library/Files/[MP4 film]?sc_database=web]
1668 17:29:43 ERROR Could not resize image as it was larger than the maximum size allowed for memory processing. Media item: {0} [/sitecore/media library/Files/[MP4 film]?sc_database=web]
ManagedPoolThread #16 17:30:06 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #16 17:30:06 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #16 17:30:06 INFO Examining schedules (count: 0)
ManagedPoolThread #16 17:30:06 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
7780 17:30:50 ERROR Could not resize image as it was larger than the maximum size allowed for memory processing. Media item: {0} [/sitecore/media library/Files/[MP4 film]?sc_database=web]
ManagedPoolThread #7 17:31:06 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #7 17:31:06 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #7 17:31:06 INFO Examining schedules (count: 0)
ManagedPoolThread #7 17:31:06 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #2 17:32:06 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #2 17:32:06 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #2 17:32:06 INFO Examining schedules (count: 0)
ManagedPoolThread #2 17:32:06 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
Heartbeat 17:32:49 WARN Sitecore has no necessary permissions for reading/creating counters.
Heartbeat 17:32:49 INFO Health.PrivateBytes: 0
Heartbeat 17:32:49 INFO Health.CacheInstances: 95
Heartbeat 17:32:49 INFO Health.CacheTotalCount: 70,442
Heartbeat 17:32:49 INFO Health.CacheTotalSize: 78,936,852
Heartbeat 17:32:49 WARN Sitecore has no necessary permissions for reading/creating counters.
Heartbeat 17:32:51 WARN Sitecore has no necessary permissions for reading/creating counters.
ManagedPoolThread #19 17:32:51 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #19 17:32:51 INFO Scheduling.DatabaseAgent started. Database: core
ManagedPoolThread #19 17:32:51 INFO Examining schedules (count: 0)
ManagedPoolThread #19 17:32:51 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #3 17:32:51 INFO Job started: Sitecore.Tasks.TaskDatabaseAgent
ManagedPoolThread #3 17:32:51 INFO Processing tasks (count: 1)
ManagedPoolThread #3 17:32:51 INFO Executing email reminder task
ManagedPoolThread #3 17:32:56 INFO Job ended: Sitecore.Tasks.TaskDatabaseAgent (units processed: )
ManagedPoolThread #14 17:33:06 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #14 17:33:06 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #14 17:33:06 INFO Examining schedules (count: 0)
ManagedPoolThread #14 17:33:06 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #12 17:34:06 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #12 17:34:06 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #12 17:34:06 INFO Examining schedules (count: 0)
ManagedPoolThread #12 17:34:06 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #7 17:35:07 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #7 17:35:07 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #7 17:35:07 INFO Examining schedules (count: 0)
ManagedPoolThread #7 17:35:07 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #18 17:36:07 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #18 17:36:07 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #18 17:36:07 INFO Examining schedules (count: 0)
ManagedPoolThread #18 17:36:07 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #16 17:37:07 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #16 17:37:07 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #16 17:37:07 INFO Examining schedules (count: 0)
ManagedPoolThread #16 17:37:07 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #8 17:37:52 INFO Job started: Sitecore.Tasks.UrlAgent
ManagedPoolThread #8 17:37:52 INFO Scheduling.UrlAgent started. Url: http://local/sitecore/service/keepalive.aspx
Heartbeat 17:37:53 WARN Sitecore has no necessary permissions for reading/creating counters.
2080 17:38:25 INFO Cache created: 'GeoIp' (max size: 1MB, running total: 799MB)
ManagedPoolThread #8 17:38:26 INFO Scheduling.UrlAgent done (received: 426 bytes)
ManagedPoolThread #8 17:38:26 INFO Job ended: Sitecore.Tasks.UrlAgent (units processed: )
ManagedPoolThread #0 17:38:31 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #0 17:38:31 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #0 17:38:31 INFO Examining schedules (count: 0)
ManagedPoolThread #0 17:38:31 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #4 17:39:31 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #4 17:39:31 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #4 17:39:31 INFO Examining schedules (count: 0)
ManagedPoolThread #4 17:39:31 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #3 17:40:31 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #3 17:40:31 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #3 17:40:31 INFO Examining schedules (count: 0)
ManagedPoolThread #3 17:40:31 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
ManagedPoolThread #2 17:41:32 INFO Job started: Sitecore.Tasks.DatabaseAgent
ManagedPoolThread #2 17:41:32 INFO Scheduling.DatabaseAgent started. Database: web
ManagedPoolThread #2 17:41:32 INFO Examining schedules (count: 0)
ManagedPoolThread #2 17:41:32 INFO Job ended: Sitecore.Tasks.DatabaseAgent (units processed: )
Crawler:
12332 17:22:50 INFO [Index=sitecore_web_index] Initializing OnPublishEndAsynchronousStrategy.
12332 17:22:50 INFO [Index=sitecore_web_index] Initializing SitecoreItemCrawler. DB:web / Root:/sitecore
ManagedPoolThread #9 17:23:44 WARN [Index=sitecore_web_index] Reset Started
ManagedPoolThread #9 17:23:44 WARN [Index=sitecore_web_index] Reset Ended
ManagedPoolThread #9 17:23:44 WARN [Index=sitecore_web_index] Full Rebuild Started
9828 17:23:57 ERROR Could not compute value for ComputedIndexField: _content for indexable: sitecore://web/{2199D684-1240-4702-B1AD-98CA54A482CD}?lang=en&ver=1
Exception: System.Runtime.InteropServices.COMException
Message: Error HRESULT E_FAIL has been returned from a call to a COM component.
Source: mscorlib
at System.Runtime.InteropServices.ComTypes.IPersistFile.Load(String pszFileName, Int32 dwMode)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterLoader.LoadAndInitIFilter(String fileName, String extension)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterReader..ctor(String fileName)
at Sitecore.ContentSearch.ComputedFields.MediaItemIFilterTextExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.ComputedFields.MediaItemContentExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder.AddComputedIndexFields()
11652 17:24:18 ERROR Could not compute value for ComputedIndexField: _content for indexable: sitecore://web/{87B6F986-53D4-4D87-9531-6CE90F684DC8}?lang=en&ver=1
Exception: System.Runtime.InteropServices.COMException
Message: Exception from HRESULT: 0x8004170C
Source: mscorlib
at System.Runtime.InteropServices.ComTypes.IPersistFile.Load(String pszFileName, Int32 dwMode)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterLoader.LoadAndInitIFilter(String fileName, String extension)
at Sitecore.ContentSearch.Extracters.IFilterTextExtraction.FilterReader..ctor(String fileName)
at Sitecore.ContentSearch.ComputedFields.MediaItemIFilterTextExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.ComputedFields.MediaItemContentExtractor.ComputeFieldValue(IIndexable indexable)
at Sitecore.ContentSearch.LuceneProvider.LuceneDocumentBuilder.AddComputedIndexFields()
6316 17:24:53 INFO [Index=sitecore_core_index] IntervalAsynchronousUpdateStrategy executing.
6316 17:24:53 INFO [Index=sitecore_core_index] History engine is empty. Incremental rebuild returns
6268 17:25:55 INFO [Index=sitecore_core_index] IntervalAsynchronousUpdateStrategy executing.
6268 17:25:55 INFO [Index=sitecore_core_index] History engine is empty. Incremental rebuild returns
9736 17:26:55 INFO [Index=sitecore_core_index] IntervalAsynchronousUpdateStrategy executing.
9736 17:26:55 INFO [Index=sitecore_core_index] History engine is empty. Incremental rebuild returns
6316 17:27:50 INFO [Index=sitecore_web_index] TimeIntervalCommitPolicy.ShouldCommit - Time Limit Exceeded, lastCommit=20/02/2015 17:22:50, count=13635
6316 17:27:50 INFO [Index=sitecore_web_index] Committing: Add: 13634; Update:0; DeleteUnique: 0; DeleteGroup: 0
6316 17:27:50 INFO [Index=sitecore_web_index] Committed
1748 17:27:56 INFO [Index=sitecore_core_index] IntervalAsynchronousUpdateStrategy executing.
1748 17:27:56 INFO [Index=sitecore_core_index] History engine is empty. Incremental rebuild returns
9980 17:28:55 INFO [Index=sitecore_core_index] IntervalAsynchronousUpdateStrategy executing.
9980 17:28:55 INFO [Index=sitecore_core_index] History engine is empty. Incremental rebuild returns
My Sitecore.ContentSearch.Lucene.DefaultIndexConfiguration.config file:
<configuration xmlns:patch="http://www.sitecore.net/xmlconfig/">
<sitecore>
<contentSearch>
<indexUpdateStrategies>
<intervalAsyncCore type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">core</param>
<param desc="interval">00:01:00</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsyncCore>
<intervalAsyncMaster type="Sitecore.ContentSearch.Maintenance.Strategies.IntervalAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">master</param>
<param desc="interval">00:00:05</param>
<CheckForThreshold>true</CheckForThreshold>
</intervalAsyncMaster>
<manual type="Sitecore.ContentSearch.Maintenance.Strategies.ManualStrategy, Sitecore.ContentSearch" />
<onPublishEndAsync type="Sitecore.ContentSearch.Maintenance.Strategies.OnPublishEndAsynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">web</param>
<CheckForThreshold>true</CheckForThreshold>
</onPublishEndAsync>
<rebuildAfterFullPublish type="Sitecore.ContentSearch.Maintenance.Strategies.RebuildAfterFullPublishStrategy, Sitecore.ContentSearch" />
<remoteRebuild type="Sitecore.ContentSearch.Maintenance.Strategies.RemoteRebuildStrategy, Sitecore.ContentSearch" />
<syncMaster type="Sitecore.ContentSearch.Maintenance.Strategies.SynchronousStrategy, Sitecore.ContentSearch">
<param desc="database">master</param>
</syncMaster>
</indexUpdateStrategies>
<databasePropertyStore type="Sitecore.ContentSearch.Maintenance.IndexDatabasePropertyStore, Sitecore.ContentSearch">
<Key>$(1)</Key>
<Database>core</Database>
</databasePropertyStore>
<configuration type="Sitecore.ContentSearch.LuceneProvider.LuceneSearchConfiguration, Sitecore.ContentSearch.LuceneProvider">
<defaultIndexConfiguration type="Sitecore.ContentSearch.LuceneProvider.LuceneIndexConfiguration, Sitecore.ContentSearch.LuceneProvider">
<indexAllFields>true</indexAllFields>
<analyzer type="Sitecore.ContentSearch.LuceneProvider.Analyzers.PerExecutionContextAnalyzer, Sitecore.ContentSearch.LuceneProvider">
<param desc="defaultAnalyzer" type="Sitecore.ContentSearch.LuceneProvider.Analyzers.DefaultPerFieldAnalyzer, Sitecore.ContentSearch.LuceneProvider">
<param desc="defaultAnalyzer" type="Lucene.Net.Analysis.Standard.StandardAnalyzer, Lucene.Net">
<param hint="version">Lucene_30</param>
</param>
</param>
...snip...
</analyzer>
<fieldMap type="Sitecore.ContentSearch.FieldMap, Sitecore.ContentSearch">
...snip...
</fieldMap>
<virtualFieldProcessors hint="raw:AddVirtualFieldProcessor">
...snip...
</virtualFieldProcessors>
<exclude hint="list:ExcludeTemplate">
<BucketFolderTemplate>{ADB6CA4F-03EF-4F47-B9AC-9CE2BA53FF97}</BucketFolderTemplate>
</exclude>
<!--<include hint="list:IncludeField">
<fieldId>{8CDC337E-A112-42FB-BBB4-4143751E123F}</fieldId>
</include>-->
<exclude hint="list:ExcludeField">
...snip...
</exclude>
<fields hint="raw:AddCustomField">
<field luceneName="__smallCreatedDate" storageType="yes" indexType="tokenized" format="yyyyMMdd">__created</field>
<field luceneName="__smallUpdatedDate" storageType="yes" indexType="tokenized" format="yyyyMMdd">__updated</field>
</fields>
<fields hint="raw:RemoveSpecialFields">
<remove type="both">AllTemplates</remove>
<remove type="both">Created</remove>
<remove type="both">Editor</remove>
<remove type="both">Hidden</remove>
<remove type="both">Icon</remove>
<remove type="both">Links</remove>
<remove type="both">Updated</remove>
</fields>
<fields hint="raw:AddComputedIndexField">
...snip...
</fields>
<fieldReaders type="Sitecore.ContentSearch.FieldReaders.FieldReaderMap, Sitecore.ContentSearch">
...snip...
</fieldReaders>
<indexFieldStorageValueFormatter type="Sitecore.ContentSearch.LuceneProvider.Converters.LuceneIndexFieldStorageValueFormatter, Sitecore.ContentSearch.LuceneProvider">
...snip...
</indexFieldStorageValueFormatter>
<indexDocumentPropertyMapper type="Sitecore.ContentSearch.LuceneProvider.DefaultLuceneDocumentTypeMapper, Sitecore.ContentSearch.LuceneProvider" />
</defaultIndexConfiguration>
</configuration>
</contentSearch>
<settings>
<setting name="ContentSearch.CalibrateSizeByDeletes" value="true" />
<setting name="ContentSearch.ConcurrentMergeSchedulerThreads" value="25" />
<setting name="ContentSearch.IndexMergeFactor" value="10" />
<setting name="ContentSearch.LuceneQueryClauseCount" value="1024" />
<setting name="ContentSearch.MaxDocumentBufferSize" value="10000" />
<setting name="ContentSearch.MaxMergeDocs" value="10000" />
<setting name="ContentSearch.MaxMergeMB" value="512" />
<setting name="ContentSearch.MinMergeMB" value="10" />
<setting name="ContentSearch.RamBufferSize" value="512" />
<setting name="ContentSearch.TermIndexInterval" value="256" />
<setting name="ContentSearch.UseCompoundFile" value="false" />
<setting name="ContentSearch.WaitForMerges" value="true" />
</settings>
</sitecore>
</configuration>
Does anyone have any clue what might be causing the indexer to stall? I've tried:
increasing the RAM used for indexing in the config file (the machine has 16GB RAM)
setting WaitForMerges to false
using the compound file
reducing the number of threads
increasing the max pool limit in the DB connection strings to 100, 250, 500 and 1000 (these tweaks are sketched after this list)
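For reference, roughly what those changes look like. The setting names are the ones already shown in the config above; the values here are only examples rather than the exact values I used, and the connection-string details are placeholders:
<!-- example values for the ContentSearch settings tweaks -->
<setting name="ContentSearch.RamBufferSize" value="1024" />
<setting name="ContentSearch.WaitForMerges" value="false" />
<setting name="ContentSearch.UseCompoundFile" value="true" />
<setting name="ContentSearch.ConcurrentMergeSchedulerThreads" value="10" />
<!-- App_Config/ConnectionStrings.config entry with a raised Max Pool Size -->
<add name="web" connectionString="user id=PLACEHOLDER;password=PLACEHOLDER;Data Source=PLACEHOLDER;Database=PLACEHOLDER;Max Pool Size=500" />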
I'm absolutely lost as to why the indexing dies like this in SC7 vs SC6.6. There are no obvious config differences from a fresh SC7 install compared to our SC6.6 indexing settings.
Anyone who can help me get to the bottom of this deserves a medal! Thank you in advance :)
EDIT
My Caching.DisableCacheSizeLimits setting is:
<setting name="Caching.DisableCacheSizeLimits" value="false" />
Check the items {2199D684-1240-4702-B1AD-98CA54A482CD} and {87B6F986-53D4-4D87-9531-6CE90F684DC8} that the COM exceptions in your Crawler log point at.
Are these media items?
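If it helps, a minimal sketch of how to look them up (assuming you can drop this into a scratch .aspx page or similar; it only uses the standard Sitecore item API):
// Resolve the two item IDs from the crawler errors against the web database
Sitecore.Data.Database db = Sitecore.Configuration.Factory.GetDatabase("web");
foreach (string id in new[] { "{2199D684-1240-4702-B1AD-98CA54A482CD}", "{87B6F986-53D4-4D87-9531-6CE90F684DC8}" })
{
    Sitecore.Data.Items.Item item = db.GetItem(Sitecore.Data.ID.Parse(id));
    // The path and template name show whether these are (large) media items
    Response.Write(item == null ? id + " not found" : item.Paths.FullPath + " (" + item.TemplateName + ")<br/>");
}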
See this blog about indexing media
http://www.sitecore.net/learn/blogs/technical-blogs/john-west-sitecore-blog/posts/2013/04/sitecore-7-indexing-media-with-ifilters