When are issue velocity alerts sent - crashlytics

I have configured Fabric and Crashlytics in my application. I have added the call to test crashes:
Crashlytics.sharedInstance().crash()
I am seeing those crashes reported in the Dashboard with the stack traces and everything.
In the Settings menu under Notifications, I have all the alerts set to On, including Issue Velocity Alert.
According to this answer the Issue Velocity Alert:
If an issue is causing a crash in 1% of all user sessions within the past hour, you'll be notifiied.
I have received a New Fatal Issue Alert for the calls to crash() which shows the I am receiving alerts correctly.
But I haven't received any Issue Velocity Alert. Since 100% of my sessions have crashed to the same error, I should be receiving it right? The first crash happened 3 hours ago.
Note that I have tested it with 1 user on 1 device.
Why am I not receiving any alerts?

Paul from Fabric here. Crashlytics has a minimum threshold of unique users of an app before Velocity Alerts will be sent. I can't say what the exact numbers are, but they're designed to prevent apps with few users from getting spammed by our issue reporting.

This is what I found out looking into it:
An issue in an app exceeds the defined threshold for that app.
The app has 250 sessions in that time period.
There was no alert previously raised for the issue in the app.
Source: https://firebase.google.com/docs/crashlytics/velocity-alerts
Personally, I find the name velocity alert quite misleading.

Related

Error Domain=NSPOSIXErrorDomain Code=28 “No space left on device” UserInfo={_kCFStreamErrorCodeKey=28, _kCFStreamErrorDomainKey=1}

Lately numerous network requests with Alamofire made from our iOS device fail with the following error:
Error Domain=NSPOSIXErrorDomain Code=28 "No space left on device"
UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalDataTask
.<3>,
_kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=( "LocalDataTask .<3>" ),
_kCFStreamErrorCodeKey=28}
Our app has a mechanism to send a network request if the user has moved +- 10 meters. This is checked every 5 seconds, so in theory every five seconds a call can be made. The network request fails occasionally with this message, returning no status code and the above error.
The message implies the error has to do with available disk/memory space on the device. However, after checking both there is no link to be found since there is plenty of space available. Also, the error occurs on multiple devices, all running iOS 14.4 or higher.
Is there information available regarding error code 28 and what could be the culprit on iOS devices? Even better; how can this error be prevented?
To answer the occurrence of the error itself:
NSPOSIXErrorDomain Code=28 "No space left on device"
With logs in the Xcode terminal:
2021-05-07 15:56:50.873428+0200 MYAPP[21757:7406020] [] nw_path_evaluator_create_flow_inner NECP_CLIENT_ACTION_ADD_FLOW 05CD829A-810D-412F-B86E-7524369359E8 [28: No space left on device]
2021-05-07 15:56:50.877243+0200 MYAPP[21757:7400322] Task <5504BCDF-7DFE-4045-BD4B-E75054636D5B>.<1> finished with error [28] Error Domain=NSPOSIXErrorDomain Code=28 "No space left on device" UserInfo={_NSURLErrorFailingURLSessionTaskErrorKey=LocalUploadTask <5504BCDF-7DFE-4045-BD4B-E75054636D5B>.<1>, _kCFStreamErrorDomainKey=1, _NSURLErrorRelatedURLSessionTaskErrorKey=(
"LocalUploadTask <5504BCDF-7DFE-4045-BD4B-E75054636D5B>.<1>"
), _kCFStreamErrorCodeKey=28}
It appears to get called when there are too many NSURLSessions created, reaching a limit of (in our tests) 600-700 sessions, which are not maintained or closed properly. The error started to get thrown since iOS 14, so it is interesting to see if there was a limit introduced.
Linked is a github issue raised stating the same issues on the ktor microservices framework by JetBrains, pointing in the same direction, mentioning the invalidation of sessions to prevent this issue:
https://github.com/ktorio/ktor/issues/1341
In our own project the origin of the problem turned out to be our implementation of the StarScream websocket library. This might not be relevant for the issues others are having, but explained anyways to create a complete picture of the problem. It is the cause and fix of our specific situation.
At first we assumed it had something to do with the URLSession created by Alamofire (networking library used) since POST requests started to get cancelled, and a kill of the app seemed the only solution to do requests again.
However, we also make use of websocket connections using the StarScream library, which attempts to connect to an socket, and if failed retry to connect every two seconds for a max time of two hours. This would mean for two hours, every two seconds, we connect to the socket -> receive a failure to connect -> disconnect the socket -> connect again. Using a singleton of the socket it was thought there was no possibility of creating multiple URLSessions, since the socket was only initiated once. However calling the connect to the socket again would create a new nw_connection object every single time, since the library did not handle the disconnect properly.
image of NWConcrete_nw_connection objects generated in socket connection
The way this was validated was using the instruments app to check for the creation of new nw_connection objects. Logged as a "memory leak" there, the creation of the nw_connection objects was getting logged and the solution was to make sure we disconnect the socket (invalidate the session) properly before connecting again.
I hope to answer a big part of the issue here, and I will mark my own question answered since this was the solution to the problem at hand. I think Apple should consider giving accurate reports on the number of objects created being limited, instead of giving an error "No space left on device".
Just wanted to chime in with more info, since we're experiencing the same issue.
Based on our analytics, this issue only started happening since iOS 14. We've verified it happening on 14.2, 14.4 and 14.5. Naturally the most straightforward cause for this error would be low memory or disk storage. We've excluded this option with additional logging, as you seem to have done as well.
A possibly related SO post has attributed the issue to a network inspecting framework that was enabled in their release build. It's worth checking if you use a similar tool.
Another report of this issue, this time on the Github of AFNetworking (predecessor to the Alamofire library you use), says they were able to fix it by limiting the creation of URLSession objects.
For us personally, neither of these did the trick. We created a support ticket with Apple, but this hasn't lead to a solution. They requested a small sample project that reproduces the issue, but the error only manifested after 7 days of continuous use in our app. If you have a faster way to reproduce this, it may be worth it to submit your own support ticket.
Hopefully this helps you find a solution, if you do please add this to your post to help others!

no result response from FCM push notification

We launched a new mobile app for iOS and Android devices first week of January and use FCM to send push notification to users.
Thus far we've sent (based on the firebase console report) ~60k notifications out to our users and overall its a very solid and reliable platform. We split our 'sends' in groups of 1000 push tokens / devices.
Question: ~15 times since we've launched we've received 'No Result' back from the CURL that sends the notifications upstream to FCM... and on one occasion we received an error 500.
To work around this and not just assume success we are detecting when the result isn't what we expect it to be upon success, and we log the response (i.e. "no result")... then wait 5 seconds and retry, up to 3 times. (our log message denotes the 'try number' as well).
We have, maybe twice a week, received the 'first try' message (meaning the first attempt failed and 5 secs later the 2nd attempt kicks off)... and only ONCE (this week) have we received the 'second try' message...
We're wondering if this is normal behavior for FCM? Is there some paid level of support or access that would alleviate these re-try instances for us? I don't think there is an SLA for FCM, but generally speaking are others seeing this same behavior and is the rate I've described here what you'd' consider 'normal'?
Thx!
Answer received from Google today:
Hello!
If I've correctly understood this, you have sent 60k messages and received 16 failures? That comes out to around 99.9997% success. Three nines is pretty much industry gold. So looking stellar so far.
There is no paid FCM version, but all clients, regardless of payment plan, run on the best hardware available so you're already in the premium servers. : )

Crashlytics velocity alerts for non-fatal issues

I use Crashlytics to log a non-fatal issue whenever the backend returns unexpected data. Is there a way to get notified when the frequency of this issue suddenly increases? Basically something like the "Issue Velocity Alert" that works for non-fatals too.
Right now we only show notifications for velocity alerts for fatal crashes. However, I'll let the team know you are interested so we can think about this going forward. -Todd from Fabric :)

Quickblox chat channel stopped working after 2 days

I have a quickblox account that we're using internally for testing. Very low throughput (Total of around 600 messages across 2 days and never more than a 3 or 4 per second at the very peak.)
Today the messages stopped sending in the chatroom. There doesn't appear to be any errors coming through the network panel of chrome and no errors popping up in the admin panel.
As a test, without changing any client code, I created a new room and simply updated my config so my client pointed there. This worked with absolutely no problems.
Are there any things I may be missing here? Is this possibly a free tier thing where only a few hundred messages may be sent at any one time or is this more likely something client side?
It were some maintenance periods, you should receive emails about it.
So maybe you were trying to use chat during that period.
600 messages across 2 days it's very small value, so no problems here with limits.

Application getting crash while taking picture continuesly 2 to 3 times

In my application I am uploading photos to Face Book. So, While taking picture from iPhone device 3G continuously two or three times my application getting crash and getting message as below
Program received signal: “0”.
Data Formatters temporarily unavailable, will re-try after a 'continue'. (Unknown error loading shared library "/Developer/usr/lib/libXcodeDebuggerSupport.dylib")
(gdb)
I am not getting how to solve this can any one help me to solve this issue.
Thanks in advance.
There really should be a backtrace and that error sounds a lot like your install is hosed; you might try re-installing the dev tools and iOS SDK.
In any case, the description of your problem sounds like you might potentially be using all available memory and your app might be being jettisoned by the system.
If you rate limit the photos such that you can't take another photo until the first photo is uploaded, does the problem go away?
Do you have a memory warning hook in your app? Is it getting fired?