Subversion checkout fails with ACCESS_VIOLATION in svn_wc_add_repos_file4() - apache

We reproducibly experience errors when attempting to check out or update working copies.
Environment
Our environment is as follows:-
svn 1.8.9 (r1591380) client and server running on the same server (also happens with client on another server, but less often)
Server runs Windows Server 2008 (64 bit)
Apache httpd server
We are running the svn checkout from QuickBuild.
Client Error
This error is reported from the checkout:-
svn checkout http://qvsvn101/PayWay/PayWay/Branches/2014.R1/ D:\quickbuild_workspace\PayWay\Application\PointRelease\Release --non-interactive --username SrvAcc --password ****** -r 11523
Command return code: -1073741819
Command error output: This application has halted due to an unexpected error.
A crash report and minidump file were saved to disk, you can find them here:
C:\Users\SrvAcc\AppData\Local\Temp\svn-crash-log20140527164109.log
C:\Users\SrvAcc\AppData\Local\Temp\svn-crash-log20140527164109.dmp
Please send the log file to users_at_subversion.apache.org to help us analyze
and solve this problem.
NOTE: The crash report and minidump files can contain some sensitive information
(filenames, partial file content, usernames and passwords etc.)
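The return code above is easier to recognise in hexadecimal. A minimal sketch, assuming the value is the signed 32-bit view of a Windows status code (the helper name to_ntstatus_hex is made up for illustration):

```python
def to_ntstatus_hex(return_code: int) -> str:
    """Reinterpret a signed 32-bit exit code as an unsigned hex value."""
    return f"0x{return_code & 0xFFFFFFFF:08X}"


# Return code reported by QuickBuild for the failed checkout.
print(to_ntstatus_hex(-1073741819))  # 0xC0000005
```

0xC0000005 is the Windows status code for an access violation, which agrees with the "Exception: ACCESS_VIOLATION" line in the crash log.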
Apache httpd error log
At the same time, the Apache error.log contains this:
[Tue May 27 16:41:12 2014] [error] [client 192.168.40.47] Provider encountered an error while streaming a REPORT response. [500, #0]
[Tue May 27 16:41:12 2014] [error] [client 192.168.40.47] A failure occurred while driving the update report editor [500, #106]
Subversion Crash Log File
Subversion writes out a log file as follows:
Process info:
Cmd line: svn checkout http://qvsvn101/PayWay/PayWay/Branches/2014.R1/ D:\quickbuild_workspace\PayWay\Application\PointRelease\Release --non-interactive --username SrvAcc --password ****** -r 11523
Working Dir: D:\quickbuild_workspace\PayWay\Application\PointRelease\Release
Version: 1.8.9 (r1591380), compiled May 8 2014, 04:25:41
Platform: Windows OS version 6.1 build 7601 Service Pack 1
Exception: ACCESS_VIOLATION
Registers:
eax=7259a7c0 ebx=00000000 ecx=01e5e138 edx=72746e65 esi=01e5e138 edi=61006469
eip=7259a7cf esp=003cf44c ebp=003cf458 efl=00010202
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b
Stacktrace:
#1 0x7259a7cf in svn_wc_add_repos_file4()
#2 0x732396b7 in svn_ra_svn_init()
#3 0x7323470b in svn_ra_svn_init()
#4 0x73234abe in svn_ra_svn_init()
#5 0x731f3110 in (unknown function)
Loaded modules:
0x00020000 D:\SubversionServer\svn.exe (1.8.9.18516, 241664 bytes)
0x77a20000 C:\Windows\SysWOW64\ntdll.dll (6.1.7601.18247, 1572864 bytes)
0x75f00000 C:\Windows\SysWOW64\kernel32.dll (6.1.7601.18229, 1114112 bytes)
0x75e00000 C:\Windows\SysWOW64\KERNELBASE.dll (6.1.7601.18229, 290816 bytes)
0x73600000 D:\SubversionServer\libapr-1.dll (1.5.1.0, 163840 bytes)
0x764a0000 C:\Windows\SysWOW64\ws2_32.dll (6.1.7601.17514, 217088 bytes)
0x75800000 C:\Windows\SysWOW64\msvcrt.dll (7.0.7601.17744, 704512 bytes)
0x75350000 C:\Windows\SysWOW64\rpcrt4.dll (6.1.7601.18205, 983040 bytes)
0x75100000 C:\Windows\SysWOW64\sspicli.dll (6.1.7601.18270, 393216 bytes)
0x750f0000 C:\Windows\SysWOW64\CRYPTBASE.dll (6.1.7600.16385, 49152 bytes)
0x75ee0000 C:\Windows\SysWOW64\sechost.dll (6.1.7600.16385, 102400 bytes)
0x75950000 C:\Windows\SysWOW64\nsi.dll (6.1.7600.16385, 24576 bytes)
0x74510000 C:\Windows\System32\mswsock.dll (6.1.7601.18254, 245760 bytes)
0x76010000 C:\Windows\SysWOW64\user32.dll (6.1.7601.17514, 1048576 bytes)
0x75960000 C:\Windows\SysWOW64\gdi32.dll (6.1.7601.18275, 589824 bytes)
0x779f0000 C:\Windows\SysWOW64\lpk.dll (6.1.7601.18177, 40960 bytes)
0x758b0000 C:\Windows\SysWOW64\usp10.dll (1.626.7601.18009, 643072 bytes)
0x759f0000 C:\Windows\SysWOW64\advapi32.dll (6.1.7601.18247, 655360 bytes)
0x76510000 C:\Windows\SysWOW64\shell32.dll (6.1.7601.18222, 12886016 bytes)
0x75e70000 C:\Windows\SysWOW64\shlwapi.dll (6.1.7601.17514, 356352 bytes)
0x73260000 D:\SubversionServer\MSVCR100.DLL (10.0.30319.1, 778240 bytes)
0x73580000 D:\SubversionServer\libsvn_client-1.dll (1.8.9.18516, 319488 bytes)
0x74010000 D:\SubversionServer\libsvn_delta-1.dll (1.8.9.18516, 122880 bytes)
0x733e0000 D:\SubversionServer\libaprutil-1.dll (1.5.3.0, 204800 bytes)
0x73ee0000 D:\SubversionServer\libapriconv-1.dll (1.2.1.0, 36864 bytes)
0x72a80000 D:\SubversionServer\libsvn_subr-1.dll (1.8.9.18516, 1077248 bytes)
0x73670000 C:\Windows\System32\shfolder.dll (6.1.7600.16385, 20480 bytes)
0x755c0000 C:\Windows\SysWOW64\ole32.dll (6.1.7601.17514, 1425408 bytes)
0x75230000 C:\Windows\SysWOW64\crypt32.dll (6.1.7601.18277, 1179648 bytes)
0x75ed0000 C:\Windows\SysWOW64\msasn1.dll (6.1.7601.17514, 49152 bytes)
0x74a30000 C:\Windows\System32\version.dll (6.1.7600.16385, 36864 bytes)
0x73cc0000 D:\SubversionServer\libsvn_diff-1.dll (1.8.9.18516, 86016 bytes)
0x731f0000 D:\SubversionServer\libsvn_ra-1.dll (1.8.9.18516, 454656 bytes)
0x738f0000 D:\SubversionServer\libsasl.dll (2.1.23.0, 81920 bytes)
0x73f10000 D:\SubversionServer\libsvn_fs-1.dll (1.8.9.18516, 225280 bytes)
0x73f60000 D:\SubversionServer\libsvn_repos-1.dll (1.8.9.18516, 180224 bytes)
0x735d0000 C:\Windows\System32\secur32.dll (6.1.7601.18270, 32768 bytes)
0x731a0000 D:\SubversionServer\ssleay32.dll (1.0.1.7, 286720 bytes)
0x72940000 D:\SubversionServer\libeay32.dll (1.0.1.7, 1306624 bytes)
0x72550000 D:\SubversionServer\libsvn_wc-1.dll (1.8.9.18516, 544768 bytes)
0x76340000 C:\Windows\System32\imm32.dll (6.1.7601.17514, 393216 bytes)
0x75160000 C:\Windows\SysWOW64\msctf.dll (6.1.7600.16385, 835584 bytes)
0x74960000 C:\Windows\System32\profapi.dll (6.1.7600.16385, 45056 bytes)
0x744f0000 C:\Windows\System32\nlaapi.dll (6.1.7601.17761, 65536 bytes)
0x744e0000 C:\Windows\System32\NapiNSP.dll (6.1.7600.16385, 65536 bytes)
0x742c0000 C:\Windows\System32\dnsapi.dll (6.1.7601.17570, 278528 bytes)
0x744d0000 C:\Windows\System32\winrnr.dll (6.1.7600.16385, 32768 bytes)
0x742a0000 C:\Windows\System32\IPHLPAPI.DLL (6.1.7601.17514, 114688 bytes)
0x74290000 C:\Windows\System32\winnsi.dll (6.1.7600.16385, 28672 bytes)
0x74250000 C:\Windows\System32\FWPUCLNT.DLL (6.1.7601.18283, 229376 bytes)
0x74240000 C:\Windows\System32\rasadhlp.dll (6.1.7600.16385, 24576 bytes)
0x74500000 C:\Windows\System32\WSHTCPIP.DLL (6.1.7600.16385, 20480 bytes)
0x74ae0000 C:\Windows\System32\dbghelp.dll (6.1.7601.17514, 962560 bytes)
0x73170000 C:\Windows\System32\powrprof.dll (6.1.7600.16385, 151552 bytes)
0x76110000 C:\Windows\SysWOW64\setupapi.dll (6.1.7601.17514, 1691648 bytes)
0x75470000 C:\Windows\SysWOW64\cfgmgr32.dll (6.1.7601.17621, 159744 bytes)
0x763a0000 C:\Windows\SysWOW64\oleaut32.dll (6.1.7601.17676, 585728 bytes)
0x75e50000 C:\Windows\SysWOW64\devobj.dll (6.1.7601.17621, 73728 bytes)
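The stack trace and the module list can be cross-checked by hand. The sketch below is only an illustration: it copies four base addresses and sizes from the list above and looks up which module each stack frame address falls into. The helper name module_for is hypothetical, and the idea that the function names in the trace may just be the nearest exported symbol is an assumption, not something the log states.

```python
from typing import Optional

# (base address, size in bytes, name), copied from "Loaded modules" above.
MODULES = [
    (0x72550000, 544768, "libsvn_wc-1.dll"),
    (0x72A80000, 1077248, "libsvn_subr-1.dll"),
    (0x731F0000, 454656, "libsvn_ra-1.dll"),
    (0x73580000, 319488, "libsvn_client-1.dll"),
]


def module_for(address: int) -> Optional[str]:
    """Return the module whose [base, base + size) range holds the address."""
    for base, size, name in MODULES:
        if base <= address < base + size:
            return name
    return None


# Frame addresses #1 to #5 from the stack trace above.
for frame in (0x7259A7CF, 0x732396B7, 0x7323470B, 0x73234ABE, 0x731F3110):
    print(hex(frame), module_for(frame))
```

By this lookup the faulting address is inside libsvn_wc-1.dll and the four calling frames are all inside libsvn_ra-1.dll, so the crash happens while the working copy library is being driven by the repository access layer.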

Here is the related thread on the Subversion dev@ mailing list.
It looks like your server interrupts the connection, and a Subversion 1.8 client built with the serf 1.3.5 library crashes on this failure. That is why an older Subversion client only reports an error, while a client built with serf 1.3.5 crashes.
Serf 1.3.5 fails to process the error, and so the client crashes. The crash is very likely caused by a bug in the serf library (on the client side) that is fixed in version 1.3.6:
Revert r2319 from serf 1.3.5: this change was making serf call
handle_response multiple times in case of an error response, leading
to unexpected behavior.
I suggest trying a Subversion command-line client that is built against serf 1.3.6. Subversion 1.8.x binaries built with serf 1.3.6 should be available soon.

I posted the same dump file to the users@subversion.apache.org mailing list but have had no reply in two weeks. To get around this issue, I switched QuickBuild to use jsvn.bat from the pure-Java distribution of SVNKit, and the issue seems to be resolved. jsvn has the same command-line interface as the Apache svn binary, so it is a simple drop-in replacement.
I initially had an authentication issue because we use NTLM to authenticate against our Active Directory. The error was "svn: E170001: Authentication required for repository". The solution was to add the following to svnkit\bin\jsvn.bat after the existing EXTRA_JVM_ARGUMENTS environment variable:
rem Adding this to resolve authentication issue as per http://subversion.1072662.n5.nabble.com/SVN-authentication-problem-td1560.html
set EXTRA_JVM_ARGUMENTS=%EXTRA_JVM_ARGUMENTS% "-Dsvnkit.http.methods=Basic,Digest,Negotiate,NTLM"

The only thing that worked for me was to install the previous version of TortoiseSVN (TortoiseSVN-1.8.6.25419-x64-svn-1.8.8), which ships the previous svn client, 1.8.8, and then use that older svn.exe. It worked against the newer server version (1.8.9) too.
(I had the same issue. Yesterday I upgraded my CollabNet Subversion to the latest version (svn version 1.8.9-3871.129), and both a command-prompt svn checkout and the latest TortoiseSVN (1.8.7) fail. I see the same ACCESS_VIOLATION error in the dump log. My computer runs 64-bit Windows 7.)

Related

How can I lower ksoftirqd usage when it hits 100% of a CPU?

I have a Linux server with 48 CPUs and, from time to time, some of them start hitting almost 100% usage. When that happens, the usage does not go back down, and that affects the performance of some services.
When I reboot the server, everything is fine for several days, until some CPUs start hitting almost 100% usage again. It is a cycle.
The problem is that once a CPU hits 100%, its usage does not change unless I reboot the server, and one service performs badly while that is happening.
With htop, I can see that the processes ksoftirqd/0, ksoftirqd/1, ksoftirqd/2 ... are responsible for this usage. I don't know what to do.
One approach I tried: I observed that the affected CPUs tend to be in the first half (0-23), so I tried to change the affinity of those ksoftirqd processes to CPUs in the second half (24-47), but without any success.
What I want to know is how to lower the usage of these ksoftirqd processes or, at least, how to distribute them so that per-CPU usage stays lower.
I'm not sure whether the solution is to change the affinity or some other attribute. I'm really clueless in this situation.
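Before changing affinities, it may help to see which kind of softirq is actually growing on the busy CPUs. The following is a rough sketch, not a fix: it parses two snapshots of /proc/softirqs (a table with one column per CPU and one row per softirq type) and lists the largest increases. The function names and the 5-second interval are arbitrary choices for illustration.

```python
import time
from pathlib import Path


def parse_softirqs(text: str) -> dict:
    """Map each softirq name to its list of per-CPU counters."""
    counts = {}
    for line in text.strip().splitlines()[1:]:  # first line is the CPU header
        name, _, rest = line.partition(":")
        counts[name.strip()] = [int(value) for value in rest.split()]
    return counts


def busiest(before: dict, after: dict, top: int = 5) -> list:
    """Return the (delta, softirq name, cpu) tuples with the largest growth."""
    deltas = []
    for name, new in after.items():
        old = before.get(name, [0] * len(new))
        for cpu, (a, b) in enumerate(zip(old, new)):
            deltas.append((b - a, name, cpu))
    return sorted(deltas, reverse=True)[:top]


def watch(seconds: float = 5.0) -> None:
    """Take two snapshots of /proc/softirqs and print the biggest movers."""
    path = Path("/proc/softirqs")
    first = parse_softirqs(path.read_text())
    time.sleep(seconds)
    second = parse_softirqs(path.read_text())
    for delta, name, cpu in busiest(first, second):
        print(f"{name:>10} CPU{cpu}: +{delta}")
```

Calling watch() on the server while a CPU is pinned at 100% would show whether the load is, for example, NET_RX or TIMER work, which narrows down where to look next.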
This is a physical server. Some info about it:
output of lsb_release -a:
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 11 (bullseye)
Release: 11
Codename: bullseye
output of uname -a:
Linux tiamat-dc 5.10.0-10-amd64 #1 SMP Debian 5.10.84-1 (2021-12-08) x86_64 GNU/Linux
output of lscpu:
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
Address sizes: 44 bits physical, 48 bits virtual
CPU(s): 48
On-line CPU(s) list: 0-47
Thread(s) per core: 2
Core(s) per socket: 6
Socket(s): 4
NUMA node(s): 4
Vendor ID: GenuineIntel
CPU family: 6
Model: 46
Model name: Intel(R) Xeon(R) CPU E7530 @ 1.87GHz
Stepping: 6
CPU MHz: 1239.565
BogoMIPS: 3723.93
Virtualization: VT-x
L1d cache: 768 KiB
L1i cache: 768 KiB
L2 cache: 6 MiB
L3 cache: 48 MiB
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28,32,36,40,44
NUMA node1 CPU(s): 1,5,9,13,17,21,25,29,33,37,41,45
NUMA node2 CPU(s): 2,6,10,14,18,22,26,30,34,38,42,46
NUMA node3 CPU(s): 3,7,11,15,19,23,27,31,35,39,43,47
Vulnerability Itlb multihit: KVM: Mitigation: VMX disabled
Vulnerability L1tf: Mitigation; PTE Inversion; VMX conditional cache flushes, SMT vulnerable
Vulnerability Mds: Vulnerable: Clear CPU buffers attempted, no microcode; SMT vulnerable
Vulnerability Meltdown: Mitigation; PTI
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl and seccomp
Vulnerability Spectre v1: Mitigation; usercopy/swapgs barriers and __user pointer sanitization
Vulnerability Spectre v2: Mitigation; Full generic retpoline, IBPB conditional, IBRS_FW, STIBP conditional, RSB filling
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm dca sse4_1 sse4_2 x2apic popcnt lahf_lm pti ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid dtherm ida flush_l1d
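One detail in the lscpu output is worth checking against the affinity experiment described in the question: CPU numbers are interleaved across the four NUMA nodes. A small sketch that parses the four "NUMA nodeN CPU(s)" lines copied from above (the function name cpu_to_node is made up for illustration):

```python
# The four NUMA lines copied from the lscpu output above.
LSCPU_NUMA = """\
NUMA node0 CPU(s): 0,4,8,12,16,20,24,28,32,36,40,44
NUMA node1 CPU(s): 1,5,9,13,17,21,25,29,33,37,41,45
NUMA node2 CPU(s): 2,6,10,14,18,22,26,30,34,38,42,46
NUMA node3 CPU(s): 3,7,11,15,19,23,27,31,35,39,43,47
"""


def cpu_to_node(lscpu_text: str) -> dict:
    """Build a {cpu number: NUMA node number} mapping from lscpu lines."""
    mapping = {}
    for line in lscpu_text.splitlines():
        label, _, cpus = line.partition(":")
        if not label.startswith("NUMA node") or "CPU(s)" not in label:
            continue
        node = int(label.split()[1][len("node"):])
        for part in cpus.split(","):
            part = part.strip()
            if "-" in part:  # lscpu may also print ranges such as 0-5
                low, high = part.split("-")
                for cpu in range(int(low), int(high) + 1):
                    mapping[cpu] = node
            elif part:
                mapping[int(part)] = node
    return mapping


nodes = cpu_to_node(LSCPU_NUMA)
print(sorted({nodes[cpu] for cpu in range(0, 24)}))   # nodes covered by CPUs 0-23
print(sorted({nodes[cpu] for cpu in range(24, 48)}))  # nodes covered by CPUs 24-47
```

Both halves cover all four nodes, so "first half" and "second half" of the CPU numbering do not correspond to separate NUMA nodes on this machine.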

wal-g restore issue on PG-11

I am running PostgreSQL 11 and trying to restore a database using wal-g. Since this is version 11, I created a recovery.conf, but for some reason when I bring up the database it still reads the recovery settings from postgresql.conf, and I get the error below.
$ cat recovery.conf
recovery_command='WALG_FILE_PREFIX="file://localhost/testbackup" /home/pgadmco/.local/bin/wal-g wal-fetch %f %p'
standby_mode=on
recovery_target_timeline='2022-05-31 21:30'
WALG_FILE_PREFIX="file://localhost/nas/pgbackup" wal-g backup-list
name last_modified wal_segment_backup_start
base_0000000100000001000000F7 2022-05-31T21:30:03-05:00 0000000100000001000000F7
WALG_FILE_PREFIX="file://localhost/testbackup" wal-g backup-fetch /u01/app/pgsql/11/data base_0000000100000001000000F7
INFO: 2022/06/01 10:21:29.284505 Finished extraction of part_003.tar.lz4
INFO: 2022/06/01 10:21:29.284990 Finished decompression of part_003.tar.lz4
INFO: 2022/06/01 10:21:30.765528 Finished extraction of part_001.tar.lz4
INFO: 2022/06/01 10:21:30.765971 Finished decompression of part_001.tar.lz4
INFO: 2022/06/01 10:21:30.788485 Finished extraction of pg_control.tar.lz4
INFO: 2022/06/01 10:21:30.788501
Backup extraction complete
$ pg_ctl -D /u01/app/pgsql/11/data start
waiting for server to start....2022-06-01 17:42:41.472 GMT [7645] LOG: unrecognized configuration parameter "restore_command" in file "/u01/app/pgsql/11/data/postgresql.conf" line 64
2022-06-01 17:42:41.472 GMT [7645] LOG: unrecognized configuration parameter "recovery_target_timeline" in file "/u01/app/pgsql/11/data/postgresql.conf" line 65
2022-06-01 17:42:41.472 GMT [7645] FATAL: configuration file "/u01/app/pgsql/11/data/postgresql.conf" contains errors
stopped waiting
pg_ctl: could not start server
Any suggestions?
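A quick sanity check on the parameter names may help here. The sketch below compares the names in a recovery.conf against the parameters that PostgreSQL 11 accepts in that file; the set is written from memory of the PostgreSQL 11 documentation and should be verified there, and the function name unknown_params is hypothetical.

```python
# recovery.conf parameters in PostgreSQL 11 (assumption: verify against the docs).
PG11_RECOVERY_PARAMS = {
    "restore_command", "archive_cleanup_command", "recovery_end_command",
    "recovery_target", "recovery_target_name", "recovery_target_time",
    "recovery_target_xid", "recovery_target_lsn", "recovery_target_inclusive",
    "recovery_target_timeline", "recovery_target_action",
    "standby_mode", "primary_conninfo", "primary_slot_name",
    "trigger_file", "recovery_min_apply_delay",
}


def unknown_params(conf_text: str) -> list:
    """Return parameter names in the text that are not valid in recovery.conf."""
    unknown = []
    for raw in conf_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        if name not in PG11_RECOVERY_PARAMS:
            unknown.append(name)
    return unknown


# Parameter names as they appear in the recovery.conf shown in the question.
conf = """\
recovery_command='...'
standby_mode=on
recovery_target_timeline='2022-05-31 21:30'
"""
print(unknown_params(conf))  # ['recovery_command']
```

Two further observations, offered tentatively: the startup log complains about restore_command and recovery_target_timeline on lines 64 and 65 of postgresql.conf, and PostgreSQL only accepts recovery settings in postgresql.conf from version 12 onward; and a timestamp value normally goes with recovery_target_time, whereas recovery_target_timeline expects a timeline ID or 'latest'.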

VS Code Remote-SSH fails to connect, but plain ssh works

Problem
I reinstalled the operating system on my server. Before that, I could use Remote-SSH normally. Now I can no longer connect to the server with Remote-SSH, although I can still connect with ssh directly.
I suppose it managed to log in to the system, but then something broke.
The error log is below:
Welcome to Ubuntu 20.04 LTS (GNU/Linux 5.4.0-77-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue 14 Sep 2021 09:56:58 PM CST
System load: 0.07 Processes: 117
Usage of /: 6.5% of 59.00GB Users logged in: 1
Memory usage: 10% IPv4 address for eth0: 10.0.12.2
Swap usage: 0%
* Super-optimized for small spaces - read how we shrank the memory
footprint of MicroK8s to make it the smallest full K8s around.
https://ubuntu.com/blog/microk8s-memory-optimisation
ready: 6425958cce28
Linux 5.4.0-77-generic #86-Ubuntu SMP Thu Jun 17 02:35:03 UTC 2021
6425958cce28: running
bash: line 1: _exitcode: command not found
bash: line 2: syntax error near unexpected token `elif'
bash: line 2: ` elif [[ $ALLOW_CLIENT_DOWNLOAD == "1" ]]; then'
-sh: 4: function: not found
-sh: 69: [[: not found
-sh: 90: [[: not found
-sh: 155: Syntax error: "(" unexpected (expecting "then")
Transferred: sent 17180, received 4016 bytes, in 0.5 seconds
Bytes per second: sent 35433.6, received 8283.0
local-server-1> ssh child died, shutting down
[21:56:58.587] Failed to parse remote port from server output
[21:56:58.588] Resolver error: Error:
at Function.Create (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:64659)
at Object.t.handleInstallOutput (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:63302)
at Object.e [as tryInstallWithLocalServer] (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:387573)
at processTicksAndRejections (internal/process/task_queues.js:93:5)
at async /Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:294473
at async Object.t.withShowDetailsEvent (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:406463)
at async /Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:386112
at async E (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:382710)
at async Object.t.resolveWithLocalServer (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:385728)
at async Object.t.resolve (/Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:1:295870)
at async /Users/luther/.vscode/extensions/ms-vscode-remote.remote-ssh-0.65.7/out/extension.js:127:110656
[21:56:58.592] ------
Tried
I tried deleting the known_hosts file from the host and reinstalling the Remote-SSH extension, but neither worked.
I am pretty new to Remote-SSH, so I would appreciate a detailed solution.
Thanks :)
I downgraded Remote-SSH, changed my default shell to zsh, and then upgraded Remote-SSH again. It began to install '.vscode-server' again and, magically, it worked.

gem5 x86 KVM fails with error "KVM: Failed to enter virtualized mode (hw reason: 0x80000021)"

I tried to run gem5 in FS (full-system) mode with KVM to fast-forward the Linux boot, and it failed with this error.
info: 0x4b564d04: 0x0
info: 0x3b: 0x0
info: 0x6e0: 0x0
info: 0x1a0: 0x0
info: 0x17a: 0x0
info: 0x17b: 0x0
info: 0x9e: 0x0
panic: KVM: Failed to enter virtualized mode (hw reason: 0x80000021)
Memory Usage: 33878524 KBytes
Program aborted at tick 186932115
--- BEGIN LIBC BACKTRACE ---
gem5/build/X86/gem5.opt(_Z15print_backtracev+0x28)[0x15e45d8]
gem5/build/X86/gem5.opt(_Z12abortHandleri+0x46)[0x15f5196]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x11390)[0x7fb3c9f7d390]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x38)[0x7fb3c8a72428]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x16a)[0x7fb3c8a7402a]
gem5/build/X86/gem5.opt[0x80f14f]
gem5/build/X86/gem5.opt[0x18cb151]
gem5/build/X86/gem5.opt(_ZN10BaseKvmCPU13handleKvmExitEv+0x1bc)[0x18cb8bc]
gem5/build/X86/gem5.opt(_ZN10BaseKvmCPU4tickEv+0x229)[0x18c8d69]
gem5/build/X86/gem5.opt(_ZN10EventQueue10serviceOneEv+0xd5)[0x15eb485]
gem5/build/X86/gem5.opt(_Z9doSimLoopP10EventQueue+0x48)[0x160a9c8]
gem5/build/X86/gem5.opt[0x160ad1f]
/usr/lib/x86_64-linux-gnu/libstdc++.so.6(+0xbd57f)[0x7fb3c93e557f]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x76ba)[0x7fb3c9f736ba]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7fb3c8b4441d]
--- END LIBC BACKTRACE ---
I used gem5art and slightly modified the run script so that it runs /bin/bash instead of the SPEC benchmarks. It seems this error was reported a while ago in an earlier issue and was supposedly fixed in gem5 v19, but I still get the same error code. Could anyone explain why this error happens and how to fix it?

Container is running beyond memory limits - RECEIVED SIGNAL 15: SIGTERM

I implemented model prediction in an Oozie workflow and got the error "Container is running beyond memory limits" at step 3, i.e. model1.predict_proba. table1 has 27 million records. It runs fine in a Jupyter notebook, but I get this error under Oozie. Can someone please help?
d1 = sqlContext.sql("SELECT * FROM table1").toPandas()
xyz= d1.drop(['abc'], axis = 1)
modelprob = model1.predict_proba(xyz)[:,1]
Error (YARN logs):
Application application_1547693435775_8741566 failed 2 times due to AM Container for appattempt_1547693435775_8741566_000002 exited with exitCode: -104
For more detailed output, check application tracking page:https://xyz
Diagnostics: Container [pid=224941,containerID=container_e167_1547693435775_8741566_02_000002] is running beyond physical memory limits. Current usage: 121.2 GB of 121 GB physical memory used; 226.9 GB of 254.1 GB virtual memory used. Killing container.
2019-04-15 22:43:36,231 [dispatcher-event-loop-10] INFO org.apache.spark.storage.BlockManagerInfo - Removed broadcast_5_piece0 on xyz.corp.intranet:34252 in memory (size: 5.6 KB, free: 6.2 GB)
2019-04-15 22:43:36,231 [dispatcher-event-loop-35] INFO org.apache.spark.storage.BlockManagerInfo - Removed broadcast_5_piece0 on xyz1.corp.intranet:38363 in memory (size: 5.6 KB, free: 6.2 GB)
2019-04-15 22:43:36,242 [Spark Context Cleaner] INFO org.apache.spark.ContextCleaner - Cleaned accumulator 4
2019-04-15 22:43:36,245 [dispatcher-event-loop-51] INFO org.apache.spark.storage.BlockManagerInfo - Removed broadcast_2_piece0 on xyz3 in memory (size: 53.5 KB, free: 52.8 GB)
2019-04-15 22:43:36,245 [dispatcher-event-loop-51] INFO org.apache.spark.storage.BlockManagerInfo - Removed broadcast_2_piece0 on xyz4.corp.intranet:46309 in memory (size: 53.5 KB, free: 6.2 GB)
2019-04-15 22:43:36,248 [dispatcher-event-loop-9] INFO org.apache.spark.storage.BlockManagerInfo - Removed broadcast_2_piece0 on xyz5.corp.intranet:44850 in memory (size: 53.5 KB, free: 6.2 GB)
2019-04-15 22:45:48,103 [SIGTERM handler] INFO org.apache.spark.deploy.yarn.ApplicationMaster - Final app status: FAILED, exitCode: 16
2019-04-15 22:45:48,106 [SIGTERM handler] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - RECEIVED SIGNAL 15: SIGTERM
2019-04-15 22:45:48,124 [Thread-5] INFO org.apache.spark.SparkContext - Invoking stop() from shutdown hook
Below are the SparkConf parameters:
from pyspark import SparkConf, SparkContext
sconf = SparkConf().setAppName("xyz model").set("spark.driver.memory", "8g").set('spark.executor.memory', '12g').set("spark.yarn.am.memory", "8g").set('spark.dynamicAllocation.enabled', 'true').set('spark.dynamicAllocation.minExecutors', '20').set('spark.dynamicAllocation.maxExecutors', '60').set("spark.shuffle.service.enabled", "true").set('spark.kryoserializer.buffer.max.mb', '2047').set("spark.shuffle.blockTransferService", "nio").set("spark.driver.maxResultSize", "4g").set('spark.rpc.message.maxSize', '330').setMaster("yarn-cluster")
sc = SparkContext(conf=sconf)
Below are the sparkopts parameters:
sparkopts=--executor-memory 115g --num-executors 60 --driver-memory 110g --executor-cores 16 --driver-cores 2 --conf "spark.dynamicAllocation.enabled=true" --conf "spark.kryoserializer.buffer.max=2047m" --conf "spark.driver.maxResultSize=4096m" --conf spark.yarn.executor.memoryOverhead=8000 --conf "spark.network.timeout=10000000" --conf "spark.executor.extraJavaOptions=-XX:+UseCompressedOops -XX:PermSize=2048M -XX:MaxPermSize=2048M -XX:+UseG1GC" --conf "spark.broadcast.compress=true" --conf "spark.broadcast.blockSize=128m" --conf "spark.serializer.objectStreamReset=2" --conf spark.executorEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python --conf spark.yarn.appMasterEnv.PYSPARK_PYTHON=/opt/cloudera/parcels/Anaconda/bin/python --files ${xyz}/hive-site.xml --files ${xyz}/yarn-site.xml
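The 121 GB limit in the YARN diagnostics can be related to the options above with a little arithmetic. This is a sketch under an assumption: that the memory overhead is left at Spark's default of 10% of the requested memory with a 384 MiB minimum (the sparkopts line only sets the executor overhead explicitly). The function name yarn_container_mib is made up for illustration.

```python
def yarn_container_mib(memory_gib: int,
                       overhead_percent: int = 10,
                       min_overhead_mib: int = 384) -> int:
    """Container size in MiB: requested memory plus the assumed overhead."""
    memory_mib = memory_gib * 1024
    overhead_mib = max(min_overhead_mib, memory_mib * overhead_percent // 100)
    return memory_mib + overhead_mib


# --driver-memory 110g from the sparkopts line above.
driver_container = yarn_container_mib(110)
print(driver_container / 1024)  # 121.0
```

Under that assumption, the container that was killed matches the driver (110 GB plus 11 GB overhead), not an executor. In yarn-cluster mode the driver runs inside the AM container, and toPandas() collects all rows of table1 into the driver process, so the driver side is the first place to look.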