Jackrabbit Oak: How to avoid background read operations during application startup - jcr

I am using the Jackrabbit Oak implementation for a CMS, but when I start my Spring Boot application I frequently get the stat logs below and the application takes a long time to start:
Startup logs ---
{"#timestamp":"2020-07-30T16:41:36.715+05:30","#version":1,"message":"RDBDocumentStore (1.22.3) instantiated for database PostgreSQL 10.4 (Ubuntu 10.4-2.pgdg16.04+1) (10.4), using driver: PostgreSQL JDBC Driver 42.2.2 (42.2), connecting to: jdbc:postgresql://host:1234/cms_db, properties: {datcollate=C, pg_encoding_to_char(encoding)=UTF8}, transaction isolation level: TRANSACTION_READ_COMMITTED (2), .table_nodes: id varchar(512), modified int8, hasbinary int2, deletedonce int2, modcount int8, cmodcount int8, dsize int8, data varchar(16384), bdata bytea(2147483647), version int2, sdtype int2, sdmaxrevtime int8 /* {bytea=-2, int2=5, int8=-5, varchar=12} */ /* index table_nodes_mod on public.table_nodes (modified ASC) other (#312616320, p2324089), unique index table_nodes_pkey on public.table_nodes (id ASC) other (#312616320, p8445036), index table_nodes_sdm on public.table_nodes (sdmaxrevtime ASC) other (#312616320, p2086629), index table_nodes_sdt on public.table_nodes (sdtype ASC) other (#312616320, p1877301), index table_nodes_vsn on public.table_nodes (version ASC) other (#312616320, p1901293) */","logger_name":"org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore","thread_name":"main","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:41:36.715+05:30","#version":1,"message":"Tables present upon startup: [table_CLUSTERNODES, table_NODES, table_SETTINGS, table_JOURNAL]","logger_name":"org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore","thread_name":"main","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:41:39.631+05:30","#version":1,"message":"Acquired (new) clusterId 773. MachineId mac:847xxb46xxxx, InstanceId D:\\cmsWorkspace\\test-cms\\cms-services","logger_name":"org.apache.jackrabbit.oak.plugins.document.ClusterNodeInfo","thread_name":"main","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:41:50.256+05:30","#version":1,"message":"ChangeSetBuilder enabled and size set to maxItems: 50, maxDepth: 9","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"main","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:41:50.256+05:30","#version":1,"message":"Initialized DocumentNodeStore with clusterNodeId: 773, updateLimit: 100000 (id: 773, startTime: 1596107498977, machineId: mac:847beb4693fb, instanceId: D:\\cmsWorkspace\\test-cms\\cms-services, pid: 11700, uuid: 4e5s7886-test-4a03-8dah-7a320f6ea72f, readWriteMode: null, leaseCheckMode: STRICT, state: ACTIVE, oakVersion: 1.22.3, formatVersion: 1.8.0, invisible: false)","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"main","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:43:03.387+05:30","#version":1,"message":"Background update operation failed (will be retried with next run): org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:03.388+05:30","#version":1,"message":"Background operation failed: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:25.362+05:30","#version":1,"message":"Background update operation failed (will be retried with next run): org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:25.363+05:30","#version":1,"message":"Background operation failed: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:26.124+05:30","#version":1,"message":"Background read operations stats (read:44551 ReadStats{cacheStats:NOP, head:23104, cache:20688, diff: 3, lock:0, dispatch:0, numExternalChanges:734, externalChangesLag:55245, totalReadTime:44551})","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background read thread (773)","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:43:46.433+05:30","#version":1,"message":"Background update operation failed (will be retried with next run): org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:46.433+05:30","#version":1,"message":"Background operation failed: org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"WARN","level_value":30000,"stack_trace":"org.apache.jackrabbit.oak.plugins.document.DocumentStoreException: failed update of 0:/ (race?) after 10 retries\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalUpdate(RDBDocumentStore.java:1728)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.internalCreateOrUpdate(RDBDocumentStore.java:1651)\r\n\tat org.apache.jackrabbit.oak.plugins.document.rdb.RDBDocumentStore.findAndUpdate(RDBDocumentStore.java:603)\r\n\tat org.apache.jackrabbit.oak.plugins.document.util.LeaseCheckDocumentStoreWrapper.findAndUpdate(LeaseCheckDocumentStoreWrapper.java:141)\r\n\tat org.apache.jackrabbit.oak.plugins.document.UnsavedModifications.persist(UnsavedModifications.java:214)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.backgroundWrite(DocumentNodeStore.java:2451)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.internalRunBackgroundUpdateOperations(DocumentNodeStore.java:2102)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.runBackgroundUpdateOperations(DocumentNodeStore.java:2075)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$BackgroundUpdateOperation.execute(DocumentNodeStore.java:3251)\r\n\tat org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$NodeStoreTask.run(DocumentNodeStore.java:3220)\r\n\tat java.lang.Thread.run(Thread.java:748)\r\n"}
{"#timestamp":"2020-07-30T16:43:53.827+05:30","#version":1,"message":"Background operation BackgroundUpdateOperation successful again","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background update thread (773)","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:43:56.066+05:30","#version":1,"message":"Background read operations stats (read:28941 ReadStats{cacheStats:NOP, head:27955, cache:1, diff: 4, lock:0, dispatch:0, numExternalChanges:959, externalChangesLag:54012, totalReadTime:28941})","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background read thread (773)","level":"INFO","level_value":20000}
{"#timestamp":"2020-07-30T16:44:52.230+05:30","#version":1,"message":"Slow Query Report SQL=select MODIFIED, MODCOUNT, CMODCOUNT, HASBINARY, DELETEDONCE, VERSION, SDTYPE, SDMAXREVTIME, case when (MODCOUNT = ? and MODIFIED = ?) then null else DATA end as DATA, case when (MODCOUNT = ? and MODIFIED = ?) then null else BDATA end as BDATA from table_NODES where ID = ?; time=44158 ms;","logger_name":"org.apache.tomcat.jdbc.pool.interceptor.SlowQueryReport","thread_name":"DocumentNodeStore background read thread (773)","level":"WARN","level_value":30000}
{"#timestamp":"2020-07-30T16:44:52.454+05:30","#version":1,"message":"Background read operations stats (read:44383 ReadStats{cacheStats:NOP, head:44383, cache:0, diff: 0, lock:0, dispatch:0, numExternalChanges:0, externalChangesLag:0, totalReadTime:44383})","logger_name":"org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore","thread_name":"DocumentNodeStore background read thread (773)","level":"INFO","level_value":20000}
I have millions of documents (nodes) in the DB. How can I avoid the background operations above so that the application starts quickly?
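For reference, the statement flagged by the Slow Query Report above can be profiled directly in Postgres. This is only a diagnostic sketch: the MODCOUNT/MODIFIED comparison values are placeholders, and 0:/ (the root document, as seen in the background-update warnings) stands in for the bound ID:
EXPLAIN (ANALYZE, BUFFERS)
SELECT MODIFIED, MODCOUNT, CMODCOUNT, HASBINARY, DELETEDONCE, VERSION, SDTYPE, SDMAXREVTIME,
  case when (MODCOUNT = 1 and MODIFIED = 1) then null else DATA end as DATA,
  case when (MODCOUNT = 1 and MODIFIED = 1) then null else BDATA end as BDATA
FROM table_NODES
WHERE ID = '0:/';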

Related

Data ingest issues in Hive: java.lang.OutOfMemoryError: unable to create new native thread

I'm a Hive newbie having an odyssey of problems getting a large (1 TB) HDFS file into a partitioned Hive managed table. Can you please help me get around this? I feel like I have a bad config somewhere, because I'm not able to complete reducer jobs.
Here is my query:
DROP TABLE IF EXISTS ts_managed;
SET hive.enforce.sorting = true;
CREATE TABLE IF NOT EXISTS ts_managed (
svcpt_id VARCHAR(20),
usage_value FLOAT,
read_time SMALLINT)
PARTITIONED BY (read_date INT)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy","orc.create.index"="true","orc.bloom.filter.columns"="svcpt_id");
SET hive.vectorized.execution.enabled = true;
SET hive.vectorized.execution.reduce.enabled = true;
SET hive.cbo.enable=true;
SET hive.tez.auto.reducer.parallelism=true;
SET hive.exec.reducers.max=20000;
SET yarn.nodemanager.pmem-check-enabled = true;
SET hive.optimize.sort.dynamic.partition=true;
SET hive.exec.max.dynamic.partitions=10000;
INSERT OVERWRITE TABLE ts_managed
PARTITION (read_date)
SELECT svcpt_id, usage, read_time, read_date
FROM ts_raw
DISTRIBUTE BY svcpt_id
SORT BY svcpt_id;
My cluster specs are:
VM cluster
4 total nodes
4 data nodes
32 cores
140 GB RAM
Hortonworks HDP 3.0
Apache Tez as default Hive engine
I am the only user of the cluster
My yarn configs are:
yarn.nodemanager.resource.memory-mb = 32GB
yarn.scheduler.minimum-allocation-mb = 512MB
yarn.scheduler.maximum-allocation-mb = 8192MB
yarn-heapsize = 1024MB
My Hive configs are:
hive.tez.container.size = 682MB
hive.heapsize = 4096MB
hive.metastore.heapsize = 1024MB
hive.exec.reducer.bytes.per.reducer = 1GB
hive.auto.convert.join.noconditionaltask.size = 2184.5MB
hive.tez.auto.reducer.parallelism = True
hive.tez.dynamic.partition.pruning = True
My tez configs:
tez.am.resource.memory.mb = 5120MB
tez.grouping.max-size = 1073741824 Bytes
tez.grouping.min-size = 16777216 Bytes
tez.grouping.split-waves = 1.7
tez.runtime.compress = True
tez.runtime.compress.codec = org.apache.hadoop.io.compress.SnappyCodec
I've tried countless configurations including:
Partition on date
Partition on date, cluster on svcpt_id with buckets
Partition on date, bloom filter on svcpt, sort by svcpt_id
Partition on date, bloom filter on svcpt, distribute by and sort by svcpt_id
I can get my mapping vertex to run, but I have not gotten my first reducer vertex to complete. Here is my most recent example from the above query:
----------------------------------------------------------------------------------------------
VERTICES          MODE       STATUS     TOTAL  COMPLETED  RUNNING  PENDING  FAILED  KILLED
----------------------------------------------------------------------------------------------
Map 1 ..........  container  SUCCEEDED   1043       1043        0        0       0       0
Reducer 2         container  RUNNING     9636          0        0     9636       1       0
Reducer 3         container  INITED      9636          0        0     9636       0       0
----------------------------------------------------------------------------------------------
VERTICES: 01/03  [=>>-------------------------] 4%  ELAPSED TIME: 6804.08 s
----------------------------------------------------------------------------------------------
The error was:
Error: Error while processing statement: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Reducer 2, vertexId=vertex_1537061583429_0010_2_01, diagnostics=[Task failed, taskId=task_1537061583429_0010_2_01_000070, diagnostics=[TaskAttempt 0 failed, info=[Error: Error while running task ( failure ) : java.lang.OutOfMemoryError: unable to create new native thread
I either get this OOM error, which I cannot seem to get around, or I get datanodes going offline and failing to meet my replication factor requirements.
At this point I've been troubleshooting for over 2 weeks. Any contacts for professional consultants I can pay to solve this problem would also be appreciated.
Thanks in advance!
I ended up solving this after speaking with a Hortonworks tech guy. It turns out I was over-partitioning my table. Instead of partitioning by day over about 4 years, I partitioned by month and it worked great.
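For illustration, a sketch of the coarser layout (assuming read_date is an INT in YYYYMMDD form; ts_managed_monthly and the derived read_month column are hypothetical names, and the two SET statements are the standard switches for dynamic partitioning):
SET hive.exec.dynamic.partition = true;
SET hive.exec.dynamic.partition.mode = nonstrict;
CREATE TABLE IF NOT EXISTS ts_managed_monthly (
svcpt_id VARCHAR(20),
usage_value FLOAT,
read_time SMALLINT,
read_date INT)
PARTITIONED BY (read_month INT)
STORED AS ORC
TBLPROPERTIES("orc.compress"="snappy");
INSERT OVERWRITE TABLE ts_managed_monthly
PARTITION (read_month)
SELECT svcpt_id, usage, read_time, read_date,
CAST(read_date / 100 AS INT) AS read_month -- 20180930 -> 201809
FROM ts_raw
DISTRIBUTE BY read_month;
This drops the partition count from roughly 1,500 (daily over about 4 years) to under 50, which keeps the reducer and file counts manageable.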

Hive query is returning no data

CREATE EXTERNAL TABLE invoiceitems (
InvoiceNo INT,
StockCode INT,
Description STRING,
Quantity INT,
InvoiceDate BIGINT,
UnitPrice DOUBLE,
CustomerID INT,
Country STRING,
LineNo INT,
InvoiceTime STRING,
StoreID INT,
TransactionID STRING
)
ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
LOCATION 's3a://streamingdata/data/*';
The data files were created by a spark structured streaming job:
...
data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json 7.1 KB 29/08/2018 10:27:32 PM
data/part-00000-0075634b-8513-47b3-b5f8-19df8269cf9d-c000.json 1.3 KB 30/08/2018 10:47:32 AM
data/part-00000-00b6b230-8bb3-49d1-a42e-ad768c1f9a94-c000.json 2.3 KB 30/08/2018 1:25:02 AM
...
Here are the first few rows of the first file:
{"InvoiceNo":5421462,"StockCode":22426,"Description":"ENAMEL WASH BOWL CREAM","Quantity":8,"InvoiceDate":1535578020000,"UnitPrice":3.75,"CustomerID":13405,"Country":"United Kingdom","LineNo":6,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"542146260180829"}
{"InvoiceNo":5501932,"StockCode":22170,"Description":"PICTURE FRAME WOOD TRIPLE PORTRAIT","Quantity":4,"InvoiceDate":1535578020000,"UnitPrice":6.75,"CustomerID":13952,"Country":"United Kingdom","LineNo":26,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"5501932260180829"}
However, if I run the query, no data is returned:
hive> select * from invoiceitems limit 5;
OK
Time taken: 24.127 seconds
The log files for hive are empty:
$ ls /var/log/hive*
/var/log/hive:
/var/log/hive-hcatalog:
/var/log/hive2:
How can I debug this further?
I received more of a hint about the error when I ran:
select count(*) from invoiceitems;
This returned the following error:
...
DAG did not succeed due to VERTEX_FAILURE. failedVertices:1
killedVertices:1 FAILED: Execution Error, return code 2 from
org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed,
vertexName=Map 1, vertexId=vertex_1535521291031_0011_1_00,
diagnostics=[Vertex vertex_1535521291031_0011_1_00 [Map 1]
killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input:
invoiceitems initializer failed, vertex=vertex_1535521291031_0011_1_00
[Map 1], java.io.IOException: cannot find dir =
s3a://streamingdata/data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json
in pathToPartitionInfo: [s3a://streamingdata/data/*]
I decided to change the create table definition from:
LOCATION 's3a://streamingdata/data/*';
to
LOCATION 's3a://streamingdata/data/';
and this fixed the issue.
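As an aside, recreating the table is not strictly necessary for this kind of fix; since LOCATION must be a directory rather than a glob, the location can also be repointed in place:
ALTER TABLE invoiceitems SET LOCATION 's3a://streamingdata/data/';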

ShareLock and ExclusiveLock with Postgres database

I was checking the locks in the logs of one application running on Heroku, and it shows many locks from delayed_jobs and increment_counter; around these times I also got many timeouts.
sql_error_code = 00000 LOG: process 129728 still waiting for ShareLock on
transaction 1296511670 after 1000.149 ms
2017-06-02T16:24:58+00:00 app
postgres.129728 - - [TEST] [7-2] sql_error_code = 00000 DETAIL: Process
holding the lock: 129457. Wait queue: 129728.
02 Jun 2017 20:24:58.338198 <134>1 2017-06-02T16:24:58+00:00 app
postgres.129728 - - [TEST] [7-3] sql_error_code = 00000 CONTEXT: while
locking tuple (75,2) in relation "delayed_jobs"
LOG: process 129429 acquired ExclusiveLock on tuple (878044,83) of relation
16953 of database 16385 after 3220.356 ms
02 Jun 2017 20:24:58.338591 <134>1 2017-06-02T16:24:58+00:00 app
postgres.129728 - - [TEST] [7-4] sql_error_code = 00000 STATEMENT: UPDATE
"delayed_jobs" SET locked_at = '2017-06-02 16:24:57.033870', locked_by =
'host:a96aff72dae123123e pid:4' WHERE id IN (SELECT id FROM
"delayed_jobs" WHERE ((run_at <= '2017-06-02 16:24:57.032776' AND (locked_at
IS NULL OR locked_at < '2017-06-02 12:24:57.032817') OR locked_by =
'host:a96aff72dae123123e pid:4') AND failed_at IS NULL)
ORDER BY priority ASC, run_at ASC LIMIT 1 FOR UPDATE) RETURNING *
sql_error_code = 00000 DETAIL: Process holding the lock: 129495. Wait queue:
3276.
02 Jun 2017 20:25:09.279197 <134>1 2017-06-02T16:25:08+00:00 app
postgres.3276
- - [TEST] [7-3] sql_error_code = 00000 CONTEXT: while updating tuple
(878034,120) in relation "messages"
02 Jun 2017 20:25:09.279248 <134>1 2017-06-02T16:25:08+00:00 app
postgres.3276
- - [TEST] [7-4] sql_error_code = 00000
STATEMENT: UPDATE "messages" SET
"item_no" = COALESCE("item_no", 0) + 1 WHERE "messages"."id" =
48290879
I think this is not a normal lock. Is there any way to fix this kind of lock?
I don't know what you consider a "normal" kind of lock to be. This is the normal kind of lock you get when multiple transactions try to update (or select for update) the same tuple at the same time.
But why are the transactions that are taking these locks holding on to them for at least a second? Are the transactions inherently slow, or are they getting distracted?
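If you want to see who is blocking whom while it is happening, querying the server is more direct than reading the Heroku log. A sketch using pg_blocking_pids(), available since PostgreSQL 9.6:
-- sessions that are currently blocked, and the sessions blocking them
SELECT blocked.pid AS blocked_pid,
  blocked.query AS blocked_query,
  blocker.pid AS blocking_pid,
  blocker.query AS blocking_query
FROM pg_stat_activity AS blocked
JOIN pg_stat_activity AS blocker
  ON blocker.pid = ANY (pg_blocking_pids(blocked.pid));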

Want to cause a deadlock using a SQL query

I want to demonstrate a deadlock situation:
In my first transaction:
UPDATE POSITION SET EXTRA = EXTRA || 'yes' WHERE NAME='JOHN';
UPDATE POSITION SET EXTRA = 'HI' WHERE EXTRA = 'EXTRA';
So second transaction:
UPDATE POSITION SET BONUS = BONUS * 1.05;
UPDATE POSITION SET BONUS = 0 WHERE BONUS IS NULL;
So, is it possible to cause a deadlock here? I just want to try it and understand it, for my own knowledge. As I understand it, a deadlock occurs when transactions update the same rows (regardless of column) and block each other, but with these 4 updates I don't know how to create a deadlock situation.
Deadlocks occur when two processes block each other by trying to obtain the same resources in a different order. I've seen Oracle deadlocks happen for three reasons, there are probably more:
Concurrent sessions update the same rows in different order because explain plans retrieve the rows differently. For example, one session uses an index and another uses a full table scan.
Un-indexed foreign keys cause table locks.
Bitmap indexes and any type of concurrent DML on a table.
The code below demonstrates the first case. It generates a deadlock by looping through two of your update statements. The index causes the first session to use an INDEX RANGE SCAN and the second session uses a FULL TABLE SCAN. The results are not deterministic but it only took about a second for this to fail on my PC.
Sample schema and data
create table position(name varchar2(100), extra varchar2(4000), bonus number);
insert into position select 'JOHN', null, 1 from dual connect by level <= 100;
insert into position select level , null, 1 from dual connect by level <= 100000;
create index position_index on position(name);
Session 1 (run at the same time as session 2)
begin
for i in 1 .. 1000 loop
UPDATE POSITION SET EXTRA = EXTRA || 'yes' WHERE NAME='JOHN';
commit;
end loop;
end;
/
Session 2 (run at the same time as session 1)
begin
for i in 1 .. 1000 loop
UPDATE POSITION SET BONUS = BONUS * 1.05;
commit;
end loop;
end;
/
Error message
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
ORA-06512: at line 3
Find the location of the trace file generated for each deadlock:
select value from v$parameter where name like 'background_dump_dest';
Example of a trace:
...
Deadlock graph:
---------Blocker(s)-------- ---------Waiter(s)---------
Resource Name process session holds waits process session holds waits
TX-0009000F-00004ACC-00000000-00000000 37 129 X 55 375 X
TX-0008001B-0000489C-00000000-00000000 55 375 X 37 129 X
session 129: DID 0001-0025-00000281 session 375: DID 0001-0037-00012A2C
session 375: DID 0001-0037-00012A2C session 129: DID 0001-0025-00000281
Rows waited on:
Session 129: obj - rowid = 0001AC1C - AAAawcAAGAAAudMAAQ
(dictionary objn - 109596, file - 6, block - 190284, slot - 16)
Session 375: obj - rowid = 0001AC1C - AAAawcAAGAAAudMAAL
(dictionary objn - 109596, file - 6, block - 190284, slot - 11)
----- Information for the OTHER waiting sessions -----
Session 375:
sid: 375 ser: 10033 audsid: 56764801 user: 111/xxxxxxxxxxxx
flags: (0x45) USR/- flags_idl: (0x1) BSY/-/-/-/-/-
flags2: (0x40009) -/-/INC
pid: 55 O/S info: user: oracle, term: xxxxxxxxxx, ospid: 7820
image: ORACLE.EXE (SHAD)
client details:
O/S info: user: xxxxxxxxxx\xxxxxxxxxx, term: xxxxxxxxxx, ospid: 11848:10888
machine: xxxxxxxxxx\xxxxxxxxxx program: sqlplus.exe
application name: SQL*Plus, hash value=3669949024
current SQL:
UPDATE POSITION SET BONUS = BONUS * 1.05
----- End of information for the OTHER waiting sessions -----
Information for THIS session:
----- Current SQL Statement for this session (sql_id=cp515bpfsjd07) -----
UPDATE POSITION SET EXTRA = EXTRA || 'yes' WHERE NAME='JOHN'
...
The locked object is not always the table directly modified. Check which object caused the problem:
select * from dba_objects where object_id = 109596;
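The second case from the list above (un-indexed foreign keys) can be reproduced with a much smaller sketch; the parent/child tables here are hypothetical:
create table parent(id number primary key);
create table child(id number primary key, parent_id number references parent(id));
-- While parent_id is un-indexed, deleting a parent row (or updating its key)
-- takes a table lock on child, which can deadlock against concurrent DML on
-- child. The fix is simply to index the foreign key column:
create index child_parent_idx on child(parent_id);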

Delayed Job job table locks

UPDATE 3
Hm... I was using an old, old version of delayed_job_active_record. All is well.
I have implemented a push notification service in an API I'm working on. I process the push notifications using Delayed Job. My problem is that sometimes the worker processes don't seem to obtain a lock on the job; that is, 2 workers sometimes pick up the same job. I can't reproduce the problem consistently, but I'm wondering if anyone else has experienced this. Here is the code for enqueueing jobs:
Device.where("platform = ? AND enabled = ?", 'ios', true ).find_in_batches( batch_size: 2000 ) do |batch|
Delayed::Job.enqueue APNWorker.new( params[:push_notification], batch )
end
devices is a table containing mobile device tokens. Testing is done locally with Foreman.
UPDATE 1
Here is some output from Foreman
13:10:41 worker.1 | started with pid 2489
13:10:41 worker.2 | started with pid 2492
13:10:41 worker.3 | started with pid 2495
Then, when I enqueue a job using the above code, sometimes, rather randomly, I get:
13:15:55 worker.1 | work
13:15:55 worker.3 | work
Here, 'work' indicates that the job is being executed, and I receive a duplicate push notification. If I check the delayed_jobs table I see only one, locked, job. Still, 2 workers are picking it up.
UPDATE 2
Here are some logs from Rails
Delayed::Backend::ActiveRecord::Job Load (1.1ms) SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE ((run_at <= '2014-02-02 17:42:37.813835' AND (locked_at IS NULL OR locked_at < '2014-02-02 13:42:37.813853') OR locked_by = 'host:positive-definite-fakta-vbox pid:4114') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1
Delayed::Backend::ActiveRecord::Job Load (4.8ms) SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE ((run_at <= '2014-02-02 17:42:37.772102' AND (locked_at IS NULL OR locked_at < '2014-02-02 13:42:37.772130') OR locked_by = 'host:positive-definite-fakta-vbox pid:4118') AND failed_at IS NULL) ORDER BY priority ASC, run_at ASC LIMIT 1
(0.1ms) BEGIN
Delayed::Backend::ActiveRecord::Job Load (0.7ms) SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE [["id", 537]]
(0.4ms) UPDATE "delayed_jobs" SET "locked_at" = '2014-02-02 17:42:37.844545', "locked_by" = 'host:positive-definite-fakta-vbox pid:4118', "updated_at" = '2014-02-02 17:42:37.954756' WHERE "delayed_jobs"."id" = 537
(0.6ms) COMMIT
(3.0ms) BEGIN
Delayed::Backend::ActiveRecord::Job Load (7.0ms) SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE [["id", 537]]
(0.4ms) UPDATE "delayed_jobs" SET "locked_at" = '2014-02-02 17:42:37.869191', "locked_by" = 'host:positive-definite-fakta-vbox pid:4114', "updated_at" = '2014-02-02 17:42:37.997562' WHERE "delayed_jobs"."id" = 537
(0.8ms) COMMIT
Device Load (0.6ms) SELECT "devices".* FROM "devices" WHERE "devices"."id" = $1 LIMIT 1 [["id", "18"]]
Device Load (0.6ms) SELECT "devices".* FROM "devices" WHERE "devices"."id" = $1 LIMIT 1 [["id", "18"]]
As can be seen, both workers get to do the job ('Device Load...' is the actual work).
In the delayed_jobs table there is a single entry, locked by:
host:positive-definite-fakta-vbox pid:4114
What I don't really get is that the above seems like a perfectly normal, highly likely scenario. The only thing happening is that two workers poll the job queue at just about the same time. Nothing strange about that, I guess... but of course the result is disastrous.
How come the SELECT ... FOR UPDATE statement:
Delayed::Backend::ActiveRecord::Job Load (0.7ms) SELECT "delayed_jobs".* FROM "delayed_jobs" WHERE "delayed_jobs"."id" = $1 LIMIT 1 FOR UPDATE [["id", 537]]
is not checking whether the job is already locked? The regular queue-polling query seems to do just that.
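For what it's worth: under READ COMMITTED, FOR UPDATE only serializes access to the row. In the log above, the second worker's SELECT ... FOR UPDATE blocks until the first worker's COMMIT, then re-reads the now-locked row and carries on; apparently nothing in that old delayed_job version re-checked locked_at/locked_by after acquiring the row lock, so it simply overwrote the lock. Modern queue-polling SQL sidesteps the race by claiming the job and checking the lock in a single atomic statement. A sketch of that pattern (not the literal SQL of any delayed_job release; the locked_by value is a placeholder):
-- Each worker locks its candidate row inside the subquery; SKIP LOCKED
-- makes other workers pass over rows that are already claimed, so no two
-- workers can pick the same job.
UPDATE delayed_jobs
SET locked_at = now(), locked_by = 'host:example pid:1'
WHERE id = (
  SELECT id FROM delayed_jobs
  WHERE failed_at IS NULL
    AND run_at <= now()
    AND (locked_at IS NULL OR locked_at < now() - interval '4 hours')
  ORDER BY priority ASC, run_at ASC
  LIMIT 1
  FOR UPDATE SKIP LOCKED -- PostgreSQL 9.5+
)
RETURNING *;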