Intermittent HTTP error when loading files from ADLS Gen2 in Azure Databricks - azure-data-lake

I am getting an intermittent HTTP error when I try to load the contents of files in Azure Databricks from ADLS Gen2. The storage account has been mounted using a service principal associated with Databricks, and that service principal has been granted Storage Blob Data Contributor access through RBAC on the data lake storage account. A sample load statement is:
df = spark.read.format("orc").load("dbfs:/mnt/{storageaccount}/{filesystem}/{filename}")
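For reference, the mount was created with the usual service-principal OAuth configuration, roughly like this (the client ID, secret scope, and tenant below are placeholders, not the real values):
configs = {
    "fs.azure.account.auth.type": "OAuth",
    "fs.azure.account.oauth.provider.type": "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id": "<application-id>",
    "fs.azure.account.oauth2.client.secret": dbutils.secrets.get(scope="<scope>", key="<secret-key>"),
    "fs.azure.account.oauth2.client.endpoint": "https://login.microsoftonline.com/<tenant-id>/oauth2/token",
}

# Mount the ADLS Gen2 filesystem so it can be read via dbfs:/mnt/... paths
dbutils.fs.mount(
    source="abfss://{filesystem}@{storageaccount}.dfs.core.windows.net/",
    mount_point="/mnt/{storageaccount}/{filesystem}",
    extra_configs=configs,
)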
The error message I get is:
Py4JJavaError: An error occurred while calling o214.load. : java.io.IOException: GET https://{storageaccount}.dfs.core.windows.net/{filesystem}/{filename}?timeout=90 StatusCode=412 StatusDescription=The condition specified using HTTP conditional header(s) is not met.
ErrorCode=ConditionNotMet ErrorMessage=The condition specified using HTTP conditional header(s) is not met.
RequestId:51fbfff7-d01f-002b-49aa-4c89d5000000
Time:2019-08-06T22:55:14.5585584Z
The error does not occur for all files in the filesystem; I can load most of them, but some fail with this error. I am not sure what the issue is here.

This has been resolved now. The underlying issue was due to a change on Microsoft's end. This is the RCA I got from Microsoft Support:
There was a storage configuration that was turned on incorrectly during the latest storage tenant upgrade. This type of error would only show up for namespace-enabled accounts on the latest upgraded tenant. The mitigation for this issue was to turn off the configuration on the specific tenant, and we have kicked off the super sonic configuration rollout for all the tenants. We have since added additional storage upgrade validation for ADLS Gen2 to help cover this type of scenario.

I had the same problem on one file today. Downloading the file, deleting it from storage, and putting it back solved the problem.
Renaming the file did not work.
Edit: we are now seeing it on more files, seemingly at random.
We worked around the problem by copying the entire folder to a new folder and renaming it back to the original. Jobs run without problems again.
Still, the question remains: why did the files end up in this situation?
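In case it helps anyone, the copy-and-rename workaround can be scripted in a notebook, roughly like this (the folder path is a placeholder):
src = "dbfs:/mnt/{storageaccount}/{filesystem}/affected_folder"   # placeholder path
tmp = src + "_copy"

dbutils.fs.cp(src, tmp, recurse=True)   # copy the whole folder
dbutils.fs.rm(src, recurse=True)        # remove the original
dbutils.fs.mv(tmp, src, recurse=True)   # rename the copy back to the original name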

Same issue here. After some research, it seems it was probably an If-Match ETag condition failure in the HTTP GET request. Microsoft talks about how they return error 412 when this happens in this post: https://azure.microsoft.com/de-de/blog/managing-concurrency-in-microsoft-azure-storage-2/
Regardless, Databricks seems to have resolved the issue on their end now.
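For anyone curious what that ETag condition looks like at the HTTP level, here is a rough illustration with Python requests (the URL is a placeholder and a real call would also need authentication; the actual 412 came from the ABFS driver inside Spark, not from user code):
import requests

url = "https://{storageaccount}.dfs.core.windows.net/{filesystem}/{filename}"  # placeholder

# First read: capture the file's current ETag
first = requests.get(url)
etag = first.headers["ETag"]

# If the file changes afterwards, a read pinned to the old ETag fails with
# 412 Precondition Failed / ErrorCode=ConditionNotMet
second = requests.get(url, headers={"If-Match": etag})
print(second.status_code)  # 200 while the ETag still matches, 412 once it does not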

Related

BigQuery to Redshift data transfer using AWS SCT failing?

I am trying to migrate data from BigQuery to Redshift using this article. I followed along and successfully got to "Start the Local Data Migration Task". I had to set up an AWS profile to access "Data Migration View (Other)"; the profile was set up using the access key and secret of an admin user account in AWS.
However, upon starting the task I keep getting the following error:
class com.amazon.dmt.model.FileCredentials cannot be cast to class com.amazon.dmt.model.UserCredentials (com.amazon.dmt.model.FileCredentials and com.amazon.dmt.model.UserCredentials are in unnamed module of loader 'app')
I checked the AWS documentation and looked around, but this error is not listed anywhere. I cannot understand why a cast from FileCredentials to UserCredentials is being attempted. What am I missing?
Has anyone faced a similar issue, or can anyone point me in the right direction?
Based on my testing, I have determined that this is an issue in version 1.0.670 of SCT. A request has been submitted to correct the issue. In the meantime, to allow you to continue with your project, please revert to AWS SCT version 1.0.666 using this link: https://d211wdu1froga6.cloudfront.net/builds/1.0/666/Windows/aws-schema-conversion-tool-1.0.zip
You will have to uninstall SCT and the extractor agent, then reinstall and configure the previous version(s) as you did before.

error when ingesting with Watson Discovery Service

I am trying to ingest a 7 MB JSON file with Watson Discovery Service. When using the WDS tooling interface to ingest, the interface indicates successful ingestion, but the document then appears to have failed. The error returned when using the API was: "Your request could not be processed because of a problem on the server". The error is not really helping troubleshoot the problem. Any ideas? How do we troubleshoot these problems?
Thank you
The interface indicates successful ingestion because the process is asynchronous; it actually means only that the document was uploaded and submitted for ingestion.
Check whether your input JSON file has a top-level array. This type of JSON file is currently not supported, so it might be the cause of your issue.
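If the file does have a top-level array, one option is to split it into one document per element before uploading, for example (a rough sketch; the file names are placeholders):
import json

with open("input.json") as f:          # placeholder input file
    data = json.load(f)

# Split a top-level JSON array into one document per element,
# since a bare array as the document root is rejected.
if isinstance(data, list):
    for i, record in enumerate(data):
        with open(f"doc_{i:05d}.json", "w") as out:
            json.dump(record, out)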

re-configuring a worklight application with analytics

After redeploying a Worklight application, some of the analytics configuration got lost, and I'm trying to configure Worklight with analytics again.
The dashboard shows "No data available" for the time after the deployment, although old records from before the deployment are still displayed, so the DB was not affected.
I set the wl.analytics.logs.forward property to "true" in worklight.properties;
I also set wl.analytics.url (the data upload URL) to something like:
https://myserver:port/analytics/data
The dashboard is on
https://myserver:port/analytics/console
That is the URL of the analytics server.
However, if I open the data URL in a browser I get something like:
Error 404: java.io.FileNotFoundException: SRVE0190E: File not found: /data
Checked SystemOut.log and SystemErr.log (WAS logs) and I did not see errors there.
Does anybody know which XML file I need to check to validate that the analytics configuration is OK? How could I troubleshoot this problem? Are there other logs I could check?
In the list of properties you gave, I do not see any for the username and password. Try setting:
wl.analytics.password=admin
wl.analytics.username=admin
It would be useful to see a Wireshark trace, to see whether or not you are getting 403s. The analytics data uploader generally has a small amount of protection on it, and you have the option to keep or remove it.
#patbarron is correct about the multiple WAR files, though. You need to send your analytics data to the /analytics-service context. The analytics-service WAR is the one that handles all the data processing, querying, etc.; the analytics WAR just serves the console UI.
When testing, it might be beneficial to lower wl.analytics.queue and wl.analytics.queue.size; those values control how data is collected on the MobileFirst runtime server. Data is collected on the runtime server and then sent to the analytics server. Generally, the larger these values are, the longer it will take to send. Larger values are good for production.
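Putting the properties mentioned above together, the relevant section of worklight.properties would then look roughly like this (the /analytics-service data URL follows the point above; the credential and queue values are only illustrative, adjust them for your installation):
wl.analytics.logs.forward=true
wl.analytics.url=https://myserver:port/analytics-service/data
wl.analytics.username=admin
wl.analytics.password=admin
# illustrative low values for testing; use larger values in production
wl.analytics.queue=5
wl.analytics.queue.size=10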

AppHarbor + RavenDB - EsentFileAccessDeniedException

I'm trying to get my application running on AppHarbor using a RavenDB instance. I've updated my Raven libs to build 888 and I'm still getting the error below. I've also allowed file system write access, but I'm still getting the same error. Any ideas how to resolve this issue?
[EsentFileAccessDeniedException: Cannot access file, the file is locked or in use]
at Microsoft.Isam.Esent.Interop.Api.Check(Int32 err) in C:\Work\ravendb\SharedLibs\Sources\managedesent-61618\EsentInterop\Api.cs:line 2739
at Microsoft.Isam.Esent.Interop.Api.JetInit(JET_INSTANCE& instance) in C:\Work\ravendb\SharedLibs\Sources\managedesent-61618\EsentInterop\Api.cs:line 131
at Raven.Storage.Esent.TransactionalStorage.Initialize(IUuidGenerator uuidGenerator) in c:\Builds\RavenDB-Stable\Raven.Storage.Esent\TransactionalStorage.cs:line 205
It appears that during app startup, my app was creating a data directory in the application root. I discovered that I had an old reference to RavenDB Embedded which was no longer being used in the project and which was creating the unneeded data directory. Once I removed that reference and pushed, everything began working correctly.
Are you trying to use the AppHarbor web worker instance's file storage for the RavenDB backing store? That's a bad idea; use the hosted RavenHQ add-on instead.
Self-hosted RavenDB not working on AppHarbor is a known problem.
Even if you get it to work, note that worker filesystems are not persisted when you deploy new versions of your code or in case of worker instance failure.

Deploy sync error: maximum number of sync passes '5' has been exceeded

When running a web deploy to a specific IIS site I get the following error:
Error: The synchronization is being stopped because the maximum number of sync passes '5' has been exceeded even though all the changes could not be applied. This could occur if there are external changes being made to the destination.
At C:\Code\.....\deploy.ps1:185 char:10
+ & <<<< ($appDeployCmd) $type /M:$url /U:$user /P:$pass /A:Basic -allowUntrusted -useCheckSum
+ CategoryInfo : NotSpecified: (Error: The sync...he destination.:String) [], RemoteException
+ FullyQualifiedErrorId : NativeCommandError
Web Deploy is working fine in this environment against other IIS sites, and file syncs are also working. I have previously been able to use Web Deploy to deploy this specific site without issue. All of a sudden, this issue started happening and I can no longer deploy this site.
I'm doing a basic site deploy with a package built by MSBuild. I don't think the specifics are that important because, as I said, this was all working before and it currently works against other sites on the same server farm without issues.
The error message says:
"This could occur if there are external changes being made to the destination."
but I'm not sure how to track this down or whether it is even the issue to begin with. I've made sure all Explorer windows are closed in all remote sessions. I've tried restarting the site and the app pool. The only thing I have not tried is rebooting the server, which is not possible at the moment.
Any idea what might be causing this web deploy to fail?
I had the same error, and the problem was my Dropbox.
I was working directly in my Dropbox folder, and publishing causes Dropbox to synchronize at the same time, which caused the error.
Disabling Dropbox sync while working solved the problem.
I reckon the problem could also happen with OneDrive, Google Drive and so on.
We had this problem when converting a previously ad hoc deployment of a service to MSDeploy, and found that if there were files that were either marked as read-only via the DOS/Windows read-only file attribute or inaccessible due to ACLs, then we would get the "maximum number of sync passes" error on deploying.
Once we fixed the attributes/ACLs, we were able to sync.
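If it helps, clearing the read-only attribute across the whole tree can be scripted, for example with a small Python snippet like this (the path is a placeholder; ACL problems still need to be fixed separately):
import os
import stat

package_root = r"C:\Code\site\package"   # placeholder path

# Clear the DOS/Windows read-only attribute on every file under the folder
for dirpath, _dirnames, filenames in os.walk(package_root):
    for name in filenames:
        path = os.path.join(dirpath, name)
        os.chmod(path, os.stat(path).st_mode | stat.S_IWRITE)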
A quick and easy way to resolve this issue is to delete the files in the destination and re-run the web deploy.
The issue seems to revolve around the ACL step of the web deploy, which attempts to change the permissions of your website's files as a safety measure intended to ensure they are not changed during a deployment.
By default, Web Deploy sets the ACL of the site's anonymous user to read-only, while also overwriting Control Panel access to your website (source).
You can turn off the ACL step in future to avoid this if you wish, but it's not really worth it. Doing so will also speed up web deploys, but that is a separate issue.
Not really an answer, but one workaround you can try if you are using the Web Deploy dirPath, filePath, or contentPath providers is the ignoreErrors provider setting. If you know that you are consistently hitting a certain error number, you can specify that that error be ignored when it's hit. See the dirPath provider article for full details (and caveats).
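For illustration, the setting is appended to the source or dest provider on the msdeploy command line, roughly like this (the server, site, credentials, and the error code to ignore are placeholders, not values from the question):
msdeploy.exe -verb:sync -source:contentPath="C:\site\package" -dest:contentPath="MySite",computerName="https://server:8172/msdeploy.axd?site=MySite",userName=deployUser,password=*****,authType=Basic,ignoreErrors="<error-code-to-ignore>"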
In my case I couldn't fix it, but I realised the deployment had worked regardless.
If you are reading this, I wouldn't suggest simply assuming that it worked, or that it deployed fully if it did, but consider that the error may be a false alarm!