Mono 100% CPU spin in mono_conc_hashtable_lookup - mono

I have an app that is an offline batch task processor: a task runner that accepts input on stdin, executes the given task, and stores the output via NHibernate in a MySQL database. It's not complex and runs single-threaded (think of the Apache prefork model).
Since deploying to a more modern version of Mono, processes have been hanging while using 100% of a CPU core. The app is normally deployed on Linux, but I've been able to reproduce it under the Visual Studio Community for Mac debugger on Mono 5.8.1 and 5.12.0.233. Under Visual Studio Community, when I break into the application I don't get any information about which statement is executing.
Dropping down to gdb on Linux I do get the following backtraces:
backtrace:
#0 0x00000000006fdff0 in mono_conc_hashtable_lookup (hash_table=0xf174e0, key=key@entry=0x963d4f0) at mono-conc-hashtable.c:175
#1 0x000000000042fd29 in mono_jit_runtime_invoke (method="System.Object:lambda_method ()", obj=0x0, params=0x7fffb1067c10, exc=0x0,
error=0x7fffb1067d00) at mini-runtime.c:2668
#2 0x0000000000605a8f in do_runtime_invoke (method="System.Object:lambda_method ()", obj=, params=,
exc=, error=0x7fffb1067d00) at object.c:2922
#3 0x0000000000611911 in mono_runtime_try_invoke_array (method=method@entry="System.Object:lambda_method ()", obj=obj@entry=0x0,
params=params@entry=0x7fdb5d61fe68, exc=exc@entry=0x0, error=error@entry=0x7fffb1067d00) at object.c:5261
#4 0x0000000000611bd0 in mono_runtime_invoke_array_checked (method=method@entry="System.Object:lambda_method ()", obj=obj@entry=0x0,
params=params@entry=0x7fdb5d61fe68, error=error@entry=0x7fffb1067d00) at object.c:5139
#5 0x00000000005c17fb in ves_icall_InternalInvoke (method=0x7fdb5d61fe40, this_arg=0x0, params=0x7fdb5d61fe68, exc=0x7fffb1067e70) at icall.c:3392
#6 0x0000000040c4081a in ?? ()
#7 0x00007fdb5d5ab2a0 in ?? ()
#8 0x00007fdb5d61e688 in ?? ()
#9 0x00007fdb5d61e828 in ?? ()
#10 0x0000000000000000 in ?? ()
mono_backtrace
#0 0x00000000006fdff0 in mono_conc_hashtable_lookup (hash_table=0xf174e0, key=key@entry=0x963d4f0) at mono-conc-hashtable.c:175
175 if (key == kvs [i].key) {
[New Thread 0x7fdaf9b1d700 (LWP 11880)]
#1 0x000000000042fd29 in mono_jit_runtime_invoke (method="System.Object:lambda_method ()", obj=0x0, params=0x7fffb1067c10, exc=0x0,
error=0x7fffb1067d00) at mini-runtime.c:2668
2668 info = (RuntimeInvokeInfo *)mono_conc_hashtable_lookup (domain_info->runtime_invoke_hash, method);
[New Thread 0x7fdb50baa700 (LWP 11881)]
#2 0x0000000000605a8f in do_runtime_invoke (method="System.Object:lambda_method ()", obj=, params=,
exc=, error=0x7fffb1067d00) at object.c:2922
2922 result = callbacks.runtime_invoke (method, obj, params, exc, error);
#3 0x0000000000611911 in mono_runtime_try_invoke_array (method=method@entry="System.Object:lambda_method ()", obj=obj@entry=0x0,
params=params@entry=0x7fdb5d61fe68, exc=exc@entry=0x0, error=error@entry=0x7fffb1067d00) at object.c:5261
5261 res = mono_runtime_invoke_checked (method, obj, pa, error);
#4 0x0000000000611bd0 in mono_runtime_invoke_array_checked (method=method@entry="System.Object:lambda_method ()", obj=obj@entry=0x0,
params=params@entry=0x7fdb5d61fe68, error=error@entry=0x7fffb1067d00) at object.c:5139
5139 return mono_runtime_try_invoke_array (method, obj, params, NULL, error);
[New Thread 0x7fdaf8d27700 (LWP 11882)]
#5 0x00000000005c17fb in ves_icall_InternalInvoke (method=0x7fdb5d61fe40, this_arg=0x0, params=0x7fdb5d61fe68, exc=0x7fffb1067e70) at icall.c:3392
3392 MonoObject *result = mono_runtime_invoke_array_checked (m, obj, params, error);
#6 0x40c4081a in (wrapper managed-to-native) System.Reflection.MonoMethod:InternalInvoke (System.Reflection.MonoMethod,object,object[],System.Exception&) {0x12a4498} + 0x6a (0x40c407b0 0x40c4088c) [0xf125d0 - MOAB.Task.TaskRunner.exe]
[New Thread 0x7fdb509a9700 (LWP 11884)]
#7 0x00007fdb5d5ab2a0 in ?? ()
#8 0x00007fdb5d61e688 in ?? ()
#9 0x00007fdb5d61e828 in ?? ()
#10 0x0000000000000000 in ?? ()
When this occurs, most threads are sleeping:
Id Target Id Frame
15 Thread 0x7fdb5d3ff700 (LWP 11515) "SGen worker" pthread_cond_wait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
14 Thread 0x7fdb5aee5700 (LWP 11516) "Finalizer" 0x00007fdb6458da0b in futex_abstimed_wait (cancel=true, private=, abstime=0x0,
expected=0, futex=0xa45c00 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:43
13 Thread 0x7fdb515df700 (LWP 11519) "Timer-Scheduler" pthread_cond_timedwait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
12 Thread 0x7fdb50dab700 (LWP 11520) "Thread Pool I/O" 0x00007fdb64094a3d in poll () at ../sysdeps/unix/syscall-template.S:81
11 Thread 0x7fdaf991c700 (LWP 11572) "Datastore Capac" pthread_cond_timedwait@@GLIBC_2.3.2 ()
at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:238
10 Thread 0x7fdaebfff700 (LWP 11822) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaebffeda0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
9 Thread 0x7fdaeaed3700 (LWP 11879) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaeaed2da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
8 Thread 0x7fdb509a9700 (LWP 11884) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdb509a8da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
7 Thread 0x7fdaf9d1e700 (LWP 11944) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaf9d1dda0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
6 Thread 0x7fdaeacd2700 (LWP 11993) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaeacd1da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
5 Thread 0x7fdaf9b1d700 (LWP 12002) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaf9b1cda0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
4 Thread 0x7fdaf8d27700 (LWP 12201) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdaf8d26da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
3 Thread 0x7fdb507a8700 (LWP 12631) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdb507a7da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
2 Thread 0x7fdae8899700 (LWP 12632) "Thread Pool Wor" 0x00007fdb6458dc21 in futex_abstimed_wait (cancel=true, private=0, abstime=0x7fdae8898da0,
expected=0, futex=0xa465c8 ) at ../nptl/sysdeps/unix/sysv/linux/sem_waitcommon.c:87
* 1 Thread 0x7fdb650b3780 (LWP 11511) "TaskRunner" 0x00000000006fdff0 in mono_conc_hashtable_lookup (hash_table=0xf174e0, key=key@entry=0x963d4f0)
at mono-conc-hashtable.c:175
The code for mono_conc_hashtable_lookup:
158 mono_conc_hashtable_lookup (MonoConcurrentHashTable *hash_table, gpointer key)
159 {
160 MonoThreadHazardPointers* hp;
161 conc_table *table;
162 int hash, i, table_mask;
163 key_value_pair *kvs;
164 hash = mix_hash (hash_table->hash_func (key));
165 hp = mono_hazard_pointer_get ();
166
167 retry:
168 table = (conc_table *)mono_get_hazardous_pointer ((gpointer volatile*)&hash_table->table, hp, 0);
169 table_mask = table->table_size - 1;
170 kvs = table->kvs;
171 i = hash & table_mask;
172
173 if (G_LIKELY (!hash_table->equal_func)) {
174 while (kvs [i].key) {
175 if (key == kvs [i].key) {
176 gpointer value;
177 /* The read of keys must happen before the read of values */
178 mono_memory_barrier ();
179 value = kvs [i].value;
180 /* FIXME check for NULL if we add suppport for removal */
181 mono_hazard_pointer_clear (hp, 0);
182 return value;
183 }
184 i = (i + 1) & table_mask;
185 }
186 } else {
187 GEqualFunc equal = hash_table->equal_func;
188
189 while (kvs [i].key) {
190 if (kvs [i].key != TOMBSTONE && equal (key, kvs [i].key)) {
191 gpointer value;
192 /* The read of keys must happen before the read of values */
193 mono_memory_barrier ();
194 value = kvs [i].value;
195
196 /* We just read a value been deleted, try again. */
197 if (G_UNLIKELY (!value))
198 goto retry;
199
200 mono_hazard_pointer_clear (hp, 0);
201 return value;
202 }
203 i = (i + 1) & table_mask;
204 }
205 }
206
207 /* The table might have expanded and the value is now on the newer table */
208 mono_memory_barrier ();
209 if (hash_table->table != table)
210 goto retry;
211
212 mono_hazard_pointer_clear (hp, 0);
213 return NULL;
214 }
Setting breakpoints at the returns, 182,201,213:
(gdb) break 182 thread 1
Breakpoint 4 at 0x6fdff5: file mono-conc-hashtable.c, line 182.
(gdb) break 201 thread 1
Breakpoint 5 at 0x6fe0e0: file mono-conc-hashtable.c, line 201.
(gdb) break 213 thread 1
Breakpoint 6 at 0x6fe024: file mono-conc-hashtable.c, line 213.
Those breakpoints never get hit. Infinite loop? Let's try the gotos:
(gdb) break 198 thread 1
Breakpoint 5 at 0x6fe0d5: file mono-conc-hashtable.c, line 198.
(gdb) break 210 thread 1
Breakpoint 6 at 0x6fe019: file mono-conc-hashtable.c, line 210.
These don't appear to get hit either.
I focused on i = (i + 1) & table_mask; at line 184:
Breakpoint 12, mono_conc_hashtable_lookup (hash_table=0xf174e0, key=key@entry=0x963d4f0) at mono-conc-hashtable.c:184
184 i = (i + 1) & table_mask;
(gdb) p i
$15 = 4095
(gdb) p table_mask
$16 = 4095
(gdb) s
174 while (kvs [i].key) {
(gdb) p i
$17 = 0
It appears to be scanning the kvs keys using i as an index; when i reaches the table_mask value, the mask resets it to 0 and the scan continues from the start of the table.
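One way to read this observation: the loop at lines 174-185 only ends when it finds the key or reaches a slot whose key is NULL. If every slot were occupied and the key absent, the index would wrap from table_mask to 0 forever, which would match the return and goto breakpoints never firing. Whether the real table can get into that state is not established by the trace. The following is a minimal C sketch of the same probing scheme with a probe limit added; kv_pair and bounded_lookup are hypothetical names for illustration, not Mono's code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical miniature of an open-addressing table:
 * a NULL key marks an empty slot. */
typedef struct {
    const void *key;
    void *value;
} kv_pair;

/*
 * Linear-probe lookup that gives up after visiting every slot once.
 * table_size must be a non-zero power of two so that (table_size - 1)
 * works as a mask. Returns the stored value, or NULL if the key is
 * absent. *probes_out (if not NULL) receives the number of occupied
 * slots that were examined.
 */
static void *bounded_lookup(const kv_pair *kvs, size_t table_size,
                            size_t hash, const void *key,
                            size_t *probes_out)
{
    size_t mask, i, probes = 0;
    void *found = NULL;

    assert(table_size != 0 && (table_size & (table_size - 1)) == 0);
    mask = table_size - 1;
    i = hash & mask;

    /* Without the "probes < table_size" guard this loop can only stop
     * on a match or on an empty slot. */
    while (probes < table_size && kvs[i].key != NULL) {
        probes++;
        if (kvs[i].key == key) {
            found = kvs[i].value;
            break;
        }
        i = (i + 1) & mask;   /* wraps from table_size - 1 back to 0 */
    }

    if (probes_out)
        *probes_out = probes;
    return found;
}
```

With a completely full four-slot table and a key that is not stored, the sketch stops after four probes; the unguarded form of the same loop would keep cycling. In gdb, a comparable check on the live table would be to see whether any kvs[i].key is NULL at all.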
At this point I'm stuck and would like to know how to debug this further, as I'm at the limit of my knowledge of Mono internals. Is this really an endless loop? If so, under what circumstances can a call to mono_conc_hashtable_lookup trigger it?
Other Notes:
The issue occurs at random times and is not consistent in its pattern.

Related

Nanomsg gives signal 6 abort during fetching from couchbase

I am getting a signal 6 abort when fetching data from Couchbase; it occurs at erratic intervals. I am using nanomsg version 1.1, and from the code I can see that if poll returns a value less than 0, errno_assert is triggered, which crashes the application with signal 6.
Below is the backtrace of nanomsg thread:
Program terminated with signal 6, Aborted.
#0 0x00007ffff4c74a33 in select () from /usr/lib64/libc.so.6
Missing separate debuginfos, use: debuginfo-install boost-system-1.53.0-28.el7.x86_64 cyrus-sasl-lib-2.1.26-23.el7.x86_64 keyutils-libs-1.5.8-3.el7.x86_64 krb5-libs-1.15.1-50.el7.x86_64 libcom_err-1.42.9-19.el7.x86_64 libcurl-7.29.0-59.el7_9.1.x86_64 libidn-1.28-4.el7.x86_64 libselinux-2.5-15.el7.x86_64 libssh2-1.8.0-4.el7.x86_64 nspr-4.32.0-1.el7_9.x86_64 nss-3.53.1-3.el7_9.x86_64 nss-util-3.67.0-1.el7_9.x86_64 openldap-2.4.44-22.el7.x86_64 openssl-libs-1.0.2k-21.el7_9.x86_64 pcre-8.32-17.el7.x86_64 zlib-1.2.7-19.el7_9.x86_64
(gdb) thread 18
[Switching to thread 18 (Thread 0x7fffcef74700 (LWP 40630))]
#0 0x00007ffff4bfce00 in _IO_cleanup () from /usr/lib64/libc.so.6
(gdb) bt full
#0 0x00007ffff4bfce00 in _IO_cleanup () from /usr/lib64/libc.so.6
No symbol table info available.
#1 0x00007ffff4bb6be5 in abort () from /usr/lib64/libc.so.6
No symbol table info available.
#2 0x00000000009ff371 in nn_err_abort ()
No symbol table info available.
#3 0x00000000009ff2cd in nn_efd_wait ()
No symbol table info available.
#4 0x00000000009fbb13 in nn_sock_recv ()
No symbol table info available.
#5 0x00000000009f95fa in nn_recvmsg ()
No symbol table info available.
#6 0x00000000009f9015 in nn_recv ()
No symbol table info available.
#7 0x000000000099b8c9 in vcmNpsIcmMsgRecv ()
No symbol table info available.
#8 0x0000000000975a57 in __vcmNpsIcmRecv ()
No symbol table info available.
#9 0x00000000007261f5 in vcmDpeEmaIcmStatsCb(void*) ()
No symbol table info available.
#10 0x000000000099a54b in vcmNpsIcmInterfaceCreate ()
No symbol table info available.
#11 0x000000000095b1c0 in ?? ()
No symbol table info available.
#12 0x00007ffff7250ea5 in start_thread (arg=0x7fffcef74700) at pthread_create.c:307
__res =
pd = 0x7fffcef74700
now =
unwind_buf = {cancel_jmp_buf = {{jmp_buf = {140736665700096, -1738215837302118578, 0, 33558528, 0, 140736665700096, 1738108024908031822,
1738235171653297998}, mask_was_saved = 0}}, priv = {pad = {0x0, 0x0, 0x0, 0x0}, data = {prev = 0x0, cleanup = 0x0, canceltype = 0}}}
#13 0x00007ffff4c7d9fd in clone () from /usr/lib64/libc.so.6
This is present in one of our stdout files:
Invalid argument [22] (home/3rd-party/nanomsg/src/utils/efd.c:91)
Code:
88 rc = poll (&pfd, 1, timeout);
89 if (nn_slow (rc < 0 && errno == EINTR))
90 return -EINTR;
91 errno_assert (rc >= 0);
I found that in version 1.2 this errno_assert is no longer present in the code. Can it be safely removed from our copy without updating the nanomsg version? Please help.
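For context, the log line shows poll failing with errno 22 (EINVAL), and errno_assert turning that failure into abort(). Deleting the assert alone would let whatever follows line 91 run with rc < 0; an alternative is to hand the error back to the caller. Below is a minimal sketch of that pattern in plain C. It is not nanomsg's actual code: wait_readable, wait_readable_demo and the return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of nanomsg): wait for an event on fd.
 * Returns 1 when poll reports an event on the descriptor, 0 on
 * timeout, or a negative errno value (for example -EINTR or -EINVAL)
 * instead of aborting the process.
 */
static int wait_readable(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -errno;   /* caller decides: retry, log, or tear down */
    if (rc == 0)
        return 0;        /* timed out, nothing pending */
    return 1;
}

/* Usage sketch on a local pipe. */
static void wait_readable_demo(void)
{
    int fds[2];
    int rc;
    ssize_t n;

    rc = pipe(fds);
    assert(rc == 0);

    rc = wait_readable(fds[0], 0);   /* nothing written yet: timeout */
    assert(rc == 0);

    n = write(fds[1], "x", 1);
    assert(n == 1);

    rc = wait_readable(fds[0], 0);   /* one byte pending: event reported */
    assert(rc == 1);

    close(fds[0]);
    close(fds[1]);
}
```

Returning the error keeps the failure visible, so the caller can log which errno occurred and how often before deciding whether it is safe to carry on.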

Realm crash on Results.first.getter

Realm 5.0.2
I'm trying to understand how this crash is possible. It is crashing when calling first on a Realm Results object. I'm guessing this is a concurrency issue, but Realm is instantiated on same thread as query, and the Realm objects are immediately converted to app objects. But if it's not a concurrency issue, then ... what?!
Thread 1 Queue : com.apple.main-thread (serial)
#0 0x00000001037d8f7c in realm::IndexArray::index_string_all(realm::StringData, std::__1::vector<realm::ObjKey, std::__1::allocator<realm::ObjKey> >&, realm::ClusterColumn const&) const ()
#1 0x00000001036c8fdc in realm::IntegerNode<realm::ArrayIntNull, realm::Equal>::init() ()
#2 0x00000001036a740c in realm::Query::init() const ()
#3 0x00000001036a74dc in realm::Query::find_all(realm::ConstTableView&, unsigned long, unsigned long, unsigned long) const ()
#4 0x0000000103866e7c in realm::ConstTableView::do_sync() ()
#5 0x00000001036a78d0 in realm::Query::find_all(unsigned long, unsigned long, unsigned long) ()
#6 0x00000001036a7d50 in realm::Query::find_all(realm::DescriptorOrdering const&) ()
#7 0x00000001033af2c4 in realm::Results::do_evaluate_query_if_needed(bool)
#8 0x00000001033aeca8 in realm::util::Optional<realm::Obj> realm::Results::try_get<realm::Obj>(unsigned long)
#9 0x00000001033af01c in realm::util::Optional<realm::Obj> realm::Results::first<realm::Obj>()
#10 0x0000000103542b9c in _ZZN5realm7Results5firstI18RLMAccessorContextEEDaRT_ENKUlTyS4_E_clIPNS_3ObjEEES3_S4_
#11 0x0000000103541d38 in _ZN5realmL14switch_on_typeINS_3ObjEZNS_7Results5firstI18RLMAccessorContextEEDaRT_EUlTyS6_E_EES5_NS_12PropertyTypeEOT0_
#12 0x00000001035418c0 in _ZNK5realm7Results8dispatchIZNS0_5firstI18RLMAccessorContextEEDaRT_EUlTyS5_E_EES4_OS5_
#13 0x0000000103541870 in auto realm::Results::first<RLMAccessorContext>(RLMAccessorContext&)
#14 0x0000000103541834 in -[RLMResults firstObject]::$_8::operator()() const
#15 0x000000010353d5c4 in auto translateRLMResultsErrors<-[RLMResults firstObject]::$_8>(-[RLMResults firstObject]::$_8&&, NSString*)
#16 0x000000010353d544 in -[RLMResults firstObject]
#17 0x0000000104968cf4 in Results.first.getter
#18 0x000000010228e97c in static LibraryManager.LibraryPersistence.findPersistentTag(permanentId:temporaryId:)
#19 0x000000010228e500 in LibraryManager.getTag(forAssignment:)
#20 0x000000010220c284 in closure #1 in AudiobookUserSettings.tags.getter
#21 0x000000019eb5e038 in Sequence.compactMap<A>(_:) ()
#22 0x000000010220c1d0 in AudiobookUserSettings.tags.getter
#23 0x0000000102195e70 in Audiobook.tags.getter
#24 0x000000010226407c in FolderManager.updateFromLibrary()
#25 0x00000001022ca78c in LibraryTableViewController.refreshFolders()
#26 0x00000001022c6930 in closure #1 in LibraryTableViewController.refreshView()
#27 0x0000000102150700 in thunk for @escaping @callee_guaranteed () -> () ()
#28 0x0000000104b7605c in _dispatch_call_block_and_release ()
#29 0x0000000104b774d8 in _dispatch_client_callout ()
#30 0x0000000104b85f64 in _dispatch_main_queue_callback_4CF ()
#31 0x00000001911cc8d4 in __CFRUNLOOP_IS_SERVICING_THE_MAIN_DISPATCH_QUEUE__ ()
#32 0x00000001911c758c in __CFRunLoopRun ()
#33 0x00000001911c6bc8 in CFRunLoopRunSpecific ()
#34 0x000000019b5a85cc in GSEventRunModal ()
#35 0x0000000195379744 in UIApplicationMain ()
#36 0x00000001023d838c in main
#37 0x0000000191043384 in start ()
Enqueued from audiobookSettings-persistence (Thread 33) Queue
#0 0x0000000104b7bdd0 in dispatch_async ()
#1 0x00000001c807004c in OS_dispatch_queue.async(group:qos:flags:execute:) ()
#2 0x00000001022c6808 in LibraryTableViewController.refreshView()
#3 0x00000001022c6288 in closure #1 in LibraryTableViewController.viewWillAppear(_:)
The relevant code from the app:
func getTag(forAssignment assignment: TagAssignment) -> Tag? {
let tag = LibraryPersistence.findPersistentTag(permanentId: assignment.permanentTagId,
temporaryId: assignment.temporaryTagId)
return tag?.toTag()
}
static func findPersistentTag(permanentId: Int? = nil, temporaryId: String? = nil) -> PersistentTag? {
guard let userId = getIdForCurrentUser() else {
return nil
}
let realm = try! Realm()
// First try lookup by permanent tag id
if let permanentId = permanentId {
let permanentIdNumber = NSNumber(value: permanentId)
if let stored = realm.objects(PersistentTag.self).filter("permanentId == %@ AND userId == %@", permanentIdNumber, userId).first {
return stored
}
}
// Then try temporary tag id
if let temporaryId = temporaryId {
return realm.objects(PersistentTag.self).filter("temporaryId == %@ AND userId == %@", temporaryId, userId).first
}
return nil
}
UPDATE: More crash details:
* thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT)
#0: 0x0000000103bfc2fd Realm`realm::util::EncryptedFileMapping::read_barrier(void const*, unsigned long, unsigned long (*)(char const*)) + 29
* frame #1: 0x00000001035ce000 Realm`realm::util::do_encryption_read_barrier(addr=0x00007fff87a95540, size=8, header_to_size=(Realm`realm::NodeHeader::get_byte_size_from_header(char const*) at node_header.hpp:201), mapping=0xca4c51df08c18be5)(char const*), realm::util::EncryptedFileMapping*) at file_mapper.hpp:133:14
#2: 0x0000000103b496ee Realm`realm::IndexArray::index_string_all(realm::StringData, std::__1::vector<realm::ObjKey, std::__1::allocator<realm::ObjKey> >&, realm::ClusterColumn const&) const + 398
...
It looks like you're not the only one with this problem.
https://github.com/realm/realm-cocoa/issues/6556
This looks like a problem with 5.0.x, some have downgraded to 4.x to remove the problem. It references another issue (bottom of the report) which has been fixed and merged, but not yet released.
I'd read the bug report and see if this follows your experience, then either (a) follow the workaround, or (b) downgrade to 4.4.x. Watch the fix issue (https://github.com/realm/realm-core/pull/3828) for being included in a new release.

Why doesn't @try...@catch work with -[NSFileHandle writeData]?

I have a method that is similar to the tee utility. It receives a notification that data has been read on a pipe, and then writes that data to one or more pipes (connected to subordinate applications). If a subordinate app crashes, then that pipe is broken, and I naturally get an exception, which is then handled in a @try...@catch block.
This works most of the time. What puzzles me is that occasionally the exception crashes the app entirely as an uncaught exception, pointing to the writeData: line. I haven't been able to figure out the pattern of when it crashes, but why should it ever NOT be caught? (Note this is not executing inside the debugger.)
Here's the code:
//in setup:
[[NSNotificationCenter defaultCenter] addObserver:self selector:@selector(tee:) name:NSFileHandleReadCompletionNotification object:fileHandle];
-(void)tee:(NSNotification *)notification
{
// NSLog(@"Got read for tee ");
NSData *readData = notification.userInfo[NSFileHandleNotificationDataItem];
totalDataRead += readData.length;
// NSLog(@"Total Data Read %ld",totalDataRead);
NSArray *pipes = [teeBranches objectForKey:notification.object];
if (readData.length) {
for (NSPipe *pipe in pipes) {
@try {
[[pipe fileHandleForWriting] writeData:readData];
}
@catch (NSException *exception) {
NSLog(@"download write fileHandleForWriting fail: %@", exception.reason);
if (!_download.isCanceled) {
[_download rescheduleOnMain];
NSLog(@"Rescheduling");
}
return;
}
@finally {
}
}
}
}
I should mention that I have set a signal handler in my AppDelegate>appDidFinishLaunching:
signal(SIGPIPE, &signalHandler);
signal(SIGABRT, &signalHandler );
void signalHandler(int signal)
{
NSLog(@"Got signal %d",signal);
}
And that does execute whether the app crashes or the signal is caught.
Here's a sample crash backtrace:
Crashed Thread: 0 Dispatch queue: com.apple.main-thread
Exception Type: EXC_CRASH (SIGABRT)
Exception Codes: 0x0000000000000000, 0x0000000000000000
Application Specific Information:
*** Terminating app due to uncaught exception 'NSFileHandleOperationException', reason: '*** -[NSConcreteFileHandle writeData:]: Broken pipe'
abort() called
terminating with uncaught exception of type NSException
Application Specific Backtrace 1:
0 CoreFoundation 0x00007fff838cbbec __exceptionPreprocess + 172
1 libobjc.A.dylib 0x00007fff90e046de objc_exception_throw + 43
2 CoreFoundation 0x00007fff838cba9d +[NSException raise:format:] + 205
3 Foundation 0x00007fff90a2be3c __34-[NSConcreteFileHandle writeData:]_block_invoke + 81
4 Foundation 0x00007fff90c53c17 __49-[_NSDispatchData enumerateByteRangesUsingBlock:]_block_invoke + 32
5 libdispatch.dylib 0x00007fff90fdfb76 _dispatch_client_callout3 + 9
6 libdispatch.dylib 0x00007fff90fdfafa _dispatch_data_apply + 110
7 libdispatch.dylib 0x00007fff90fe9e73 dispatch_data_apply + 31
8 Foundation 0x00007fff90c53bf0 -[_NSDispatchData enumerateByteRangesUsingBlock:] + 83
9 Foundation 0x00007fff90a2bde0 -[NSConcreteFileHandle writeData:] + 150
10 myApp 0x000000010926473e -[MTTaskChain tee:] + 2030
11 CoreFoundation 0x00007fff838880dc __CFNOTIFICATIONCENTER_IS_CALLING_OUT_TO_AN_OBSERVER__ + 12
12 CoreFoundation 0x00007fff83779634 _CFXNotificationPost + 3140
13 Foundation 0x00007fff909bb9b1 -[NSNotificationCenter postNotificationName:object:userInfo:] + 66
14 Foundation 0x00007fff90aaf8e6 _performFileHandleSource + 1622
15 CoreFoundation 0x00007fff837e9ae1 __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ + 17
16 CoreFoundation 0x00007fff837dbd3c __CFRunLoopDoSources0 + 476
17 CoreFoundation 0x00007fff837db29f __CFRunLoopRun + 927
18 CoreFoundation 0x00007fff837dacb8 CFRunLoopRunSpecific + 296
19 HIToolbox 0x00007fff90664dbf RunCurrentEventLoopInMode + 235
20 HIToolbox 0x00007fff90664b3a ReceiveNextEventCommon + 431
21 HIToolbox 0x00007fff9066497b _BlockUntilNextEventMatchingListInModeWithFilter + 71
22 AppKit 0x00007fff8acf5cf5 _DPSNextEvent + 1000
23 AppKit 0x00007fff8acf5480 -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] + 194
24 AppKit 0x00007fff8ace9433 -[NSApplication run] + 594
25 AppKit 0x00007fff8acd4834 NSApplicationMain + 1832
26 myApp 0x00000001091b16a2 main + 34
27 myApp 0x00000001091ab864 start + 52
So, the nice folks at Crashlytics were able to help me here. To quote them:
Here's the story:
The pipe dies because the child process crashes. The very next read/write will cause a fault.
That write occurs, which results in a SIGPIPE (not a runtime exception).
If that SIGPIPE is masked/ignored, NSFileHandle checks errno and creates a runtime exception which it throws.
A function deeper than your tee: method has wrapped this write in a @try/@catch (proved by setting a breakpoint on __cxa_begin_catch)
That function, which turns out to be "_dispatch_client_callout", makes a call to objc_terminate, which effectively kills the process.
Why does _dispatch_client_callout do this? I'm not sure, but you can
see the code here:
http://www.opensource.apple.com/source/libdispatch/libdispatch-228.23/src/object.m
Unfortunately, AppKit has a really poor track record of being a good
citizen in the face of runtime exceptions.
So, you are right that NSFileHandle raises a runtime exception about
the pipe dying, but not before a signal is raised that kills the
process. Others have encountered this exact issue (on iOS, which has
much better semantics about runtime exceptions).
How can I catch EPIPE in my NSFileHandle handling?
In short, I don't believe it is possible for you to catch this
exception. But, by ignoring SIGPIPE and using lower-level APIs to
read/write to this file handle, I believe you can work around this. As
a general rule, I'd recommend against ignoring signals, but in this
case, it seems reasonable.
Thus the revised code is now:
-(void)tee:(NSNotification *)notification {
NSData *readData = notification.userInfo[NSFileHandleNotificationDataItem];
totalDataRead += readData.length;
// NSLog(@"Total Data Read %ld",totalDataRead);
NSArray *pipes = [teeBranches objectForKey:notification.object];
if (readData.length) {
for (NSPipe *pipe in pipes ) {
NSInteger numTries = 3;
size_t bytesLeft = readData.length;
while (bytesLeft > 0 && numTries > 0 ) {
ssize_t amountSent = write([[pipe fileHandleForWriting] fileDescriptor], (const char *)[readData bytes] + readData.length - bytesLeft, bytesLeft);
if (amountSent < 0) {
// write() returns -1 on failure; the actual cause is in errno
NSLog(@"write fail; tried %lu bytes; error: %s", (unsigned long)bytesLeft, strerror(errno));
break;
} else {
bytesLeft = bytesLeft- amountSent;
if (bytesLeft > 0) {
NSLog(@"pipe full, retrying; tried %lu bytes; wrote %zd", (unsigned long)[readData length], amountSent);
sleep(1); //probably too long, but this is quite rare
numTries--;
}
}
}
if (bytesLeft >0) {
if (numTries == 0) {
NSLog(@"Write Fail4: couldn't write to pipe after three tries; giving up");
}
[self rescheduleOnMain];
}
}
}
}
I know this doesn't do much to answer why the exception catching seems broken, but I hope it is a helpful way to work around the issue.
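The workaround relies on write() reporting a broken pipe as an ordinary error once SIGPIPE no longer terminates the process. A minimal plain-C sketch of that behaviour follows; write_all and write_all_demo are hypothetical helper names, and the sketch assumes a blocking descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: write the whole buffer to fd, retrying on EINTR
 * and continuing after short writes. Returns 0 on success or a
 * negative errno value (for example -EPIPE when the reader has gone).
 * With the default SIGPIPE disposition the process would be terminated
 * by the signal before write() could return that error.
 */
static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing: retry */
            return -errno;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Usage sketch: a pipe whose read end is closed reports EPIPE. */
static void write_all_demo(void)
{
    int fds[2];
    int rc;

    signal(SIGPIPE, SIG_IGN);   /* turn the signal into an EPIPE error */

    rc = pipe(fds);
    assert(rc == 0);

    rc = write_all(fds[1], "hello", 5);   /* reader still open */
    assert(rc == 0);

    close(fds[0]);                        /* simulate the reader dying */
    rc = write_all(fds[1], "hello", 5);
    assert(rc == -EPIPE);

    close(fds[1]);
}
```

Because the failure arrives as a return value, the caller can reschedule or give up without depending on an exception propagating through libdispatch. A non-blocking descriptor would additionally need EAGAIN handling, which this sketch leaves out.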
I faced a similar issue trying to read/write to a socket wrapped in an NSFileHandle. I worked around it by testing the pipe availability directly with the fileDescriptor like so
- (BOOL)socketIsValid
{
return (write([fh fileDescriptor], NULL, 0) == 0);
}
then I tested with that method before attempting to call writeData:.

CallCenter crashes the app

I have a huge issue with CTCallCenter: the app crashes if you get a call but never answer it. The music should then resume, but instead everything just dies.
This is what the code looks like:
_callCenter = [[CTCallCenter alloc] init];
_callCenter.callEventHandler = ^(CTCall* call){
if (call.callState == CTCallStateDialing || call.callState==CTCallStateIncoming) {
_shouldResumeSongIfConnectionIsAlive=NO;
if([[TFAudioPlayer sharedAudioPlayer] status]==TFAudioPlayerStatusPlaying){
[[TFAudioPlayer sharedAudioPlayer] pause];
isAppWasPlaying=YES;
}else isAppWasPlaying=NO;
}else if(call.callState==CTCallStateDisconnected){
if(isAppWasPlaying){
[[TFAudioPlayer sharedAudioPlayer] playForcedFromWhereItStopped];
_shouldResumeSongIfConnectionIsAlive=YES;
}
}
};
I cannot find any way to handle the case where the user doesn't pick up the phone. From what I could see, CTCallCenter only has the Incoming, Disconnected, Connected and Dialing states.
Does anyone have a clue?
Edit:
This issue only appears when I run the app on the phone while it's NOT connected to the debugger. The app dies immediately when the other person hangs up (missed call).
Stacktrace:
Date/Time: 2012-11-19 14:20:47.470 +0100
OS Version: iPhone OS 5.1 (9B179)
Report Version: 104
Exception Type: EXC_BAD_ACCESS (SIGSEGV)
Exception Codes: KERN_INVALID_ADDRESS at 0xbbadbeef
Crashed Thread: 10
Thread 0 name: Dispatch queue: com.apple.main-thread
Thread 0:
0 libsystem_kernel.dylib 0x356bb004 mach_msg_trap + 20
1 libsystem_kernel.dylib 0x356bb1fa mach_msg + 50
2 CoreFoundation 0x372203ec __CFRunLoopServiceMachPort + 120
3 CoreFoundation 0x3721f0ea __CFRunLoopRun + 818
4 CoreFoundation 0x371a249e CFRunLoopRunSpecific + 294
5 CoreFoundation 0x371a2366 CFRunLoopRunInMode + 98
6 GraphicsServices 0x320af432 GSEventRunModal + 130
7 UIKit 0x33926e76 UIApplicationMain + 1074
8 My App 0x0001e63c 0x1a000 + 17980
9 My App 0x0001c268 0x1a000 + 8808
Also, if I run the app with DEPLOY POSTPROCESSING = YES it doesn't crash. I don't get it.

App crashes on tab clicks message is like Collection <CALayerArray: > was mutated while being enumerated

Can you please help me find out why my app crashes? When I tap any tab in my tab bar, the app crashes. It's a random issue, but it occurs very frequently. The error message is as follows:
2012-01-18 14:48:50.029 MyApp[2823:f803] *** Terminating app due to uncaught exception 'NSGenericException', reason: '*** Collection <CALayerArray: 0x6b46bd0> was mutated while being enumerated.'
*** First throw call stack:
(0x17a8052 0x1d5cd0a 0x17a7c21 0x66b65f 0x66b80d 0x66b80d 0x66b80d 0x66b80d 0x66bb90 0x66bcb6 0x670a4f 0x66a72b 0x6d8116 0x6d7b0e 0x714dc6 0x7149bd 0x712f8a 0x712e2f 0x7148f4 0x17a9ec9 0x6365c2 0x63655a 0x85b569 0x17a9ec9 0x6365c2 0x63655a 0x6dbb76 0x6dc03f 0x6dbbab 0x85dd1f 0x17a9ec9 0x6365c2 0x63655a 0x6dbb76 0x6dc03f 0x6db2fe 0x65ba30 0x65bc56 0x642384 0x635aa9 0x226afa9 0x177c1c5 0x16e1022 0x16df90a 0x16dedb4 0x16deccb 0x2269879 0x226993e 0x633a9b 0x2a7d 0x29f5)
terminate called throwing an exception(gdb) bt
#0 0x9a09e9c6 in __pthread_kill ()
#1 0x90b50f78 in pthread_kill ()
#2 0x90b41bdd in abort ()
#3 0x01f00e78 in abort_message ()
#4 0x01efe89e in default_terminate ()
#5 0x01d5cf4b in _objc_terminate ()
#6 0x01efe8de in safe_handler_caller ()
#7 0x01efe946 in std::terminate ()
#8 0x01effb3e in __cxa_rethrow ()
#9 0x01d5ce49 in objc_exception_rethrow ()
#10 0x016dee10 in CFRunLoopRunSpecific ()
#11 0x016deccb in CFRunLoopRunInMode ()
#12 0x02269879 in GSEventRunModal ()
#13 0x0226993e in GSEventRun ()
#14 0x00633a9b in UIApplicationMain ()
#15 0x00002a7d in main (argc=1, argv=0xbffff5d4) at /Users/Bob/Desktop/MyApp/MyApp/main.m:14
Current language: auto; currently objective-c
(gdb)
Usually this is caused by adding or removing objects from an NSMutableArray while it is being enumerated. For example:
[array enumerateObjectsUsingBlock:^(id obj, NSUInteger idx, BOOL *stop) {
[array removeObject:obj];
}];
Check whether your code does something like this.