ADF: Copy activity suddenly stopped working with additional columns

This is really weird: I have a copy activity that worked fine for a long time, but it has now stopped working and I can't understand why.
I tried many things to understand the issue, and I found that the additional column (a classic $$FILEPATH) is the problem.
If I remove the additional column, everything works fine (but I can't do that, I need the column).
Basically, copying several CSV files from an SFTP server to Azure Data Lake Storage Gen2, while adding a column with the file path, now results in a very strange error:
{
    "errorCode": "2200",
    "message": "Failure happened on 'Sink' side. ErrorCode=UserErrorFailedFileOperation,'Type=Microsoft.DataTransfer.Common.Shared.HybridDeliveryException,Message=Upload file failed at path ingestion/cegid/dev\\.,Source=mscorlib,''Type=System.ArgumentException,Message=An item with the same key has already been added.,Source=mscorlib,'",
    "failureType": "UserError",
    "target": "CopyFromSftp",
    "details": []
}
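For context, the additional column is configured on the copy activity source. A rough sketch of the relevant part of the pipeline JSON (the column name and store settings here are illustrative, not my exact definition):
"source": {
    "type": "DelimitedTextSource",
    "storeSettings": {
        "type": "SftpReadSettings",
        "recursive": true,
        "wildcardFileName": "*.csv"
    },
    "additionalColumns": [
        {
            "name": "file_path",
            "value": "$$FILEPATH"
        }
    ]
}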
I really don't know why it stopped working, or why it seems to be adding something that has already been added.
The only thing I can figure is that something has changed in the default integration runtime (maybe a bug?).

Related

Google Batch Translate Documents - Processing Progress

I've followed the instructions in the Google Translate Multiple Documents guide, set up a batch of Office documents to be translated, and successfully submitted a request using PowerShell. I get the expected response as follows, apparently indicating that the request is successful:
{
    "name": "projects/<My-Project-Number-Here>/locations/us-central1/operations/20220525-16311653521501-<Generated-Job-ID-GUID>",
    "metadata": {
        "#type": "type.googleapis.com/google.cloud.translation.v3.BatchTranslateDocumentMetadata",
        "state": "RUNNING"
    }
}
All good so far.
However, the problem is that I don't see any translated documents appearing in the storage bucket that I specified in the output_config of the request JSON file, and I can't seem to find a way to view the status of the long-running job that it has created.
If I re-submit the same job, it tells me that the output bucket is in use by another batch process, which indicates that something is happening.
But, for the life of me, I can't see where in the Google Cloud Dashboard, or via the gcloud command line, to query the status of the job it has created. It just says that it has created a batch job and that's it. No further feedback. I am assuming that it fails somehow, as there are no resulting translated files in the output storage bucket location.
Does anyone have experience with this and could they share how to query the job?
Thanks in advance.
Regards,
Glenn
EDIT: OK - the job hasn't failed. I just needed to be patient: I checked the output storage bucket and there are a bunch of files in there, so it is working (phew - I have a lot of documents to translate). But I would still like to know if there is a way to see the status of the job.
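In case it is useful, the operation name returned above can apparently be polled directly through the standard long-running operations endpoint. A minimal sketch in Node 18+ (run as an ES module; the environment variable and token handling are my assumptions, e.g. using the output of gcloud auth print-access-token):
// Poll the batch translate operation returned by the request above.
// Assumes Node 18+ (built-in fetch) and an OAuth access token in an env var.
const operationName =
  'projects/<My-Project-Number-Here>/locations/us-central1/operations/20220525-16311653521501-<Generated-Job-ID-GUID>';
const token = process.env.GCLOUD_ACCESS_TOKEN; // e.g. from `gcloud auth print-access-token`

const res = await fetch('https://translation.googleapis.com/v3/' + operationName, {
  headers: { Authorization: 'Bearer ' + token },
});
const op = await res.json();
// metadata.state mirrors the "state" field in the submit response; op.done is set once the job finishes.
console.log(op.metadata && op.metadata.state, op.done);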

Getting error when retrieving Spatial Data from my database

I have spent 2 days chasing this one round and round, and I have tried several solutions (detailed below).
Problem: when retrieving geographical data from a Microsoft SQL database, I get an error:
DBServer routine OpenDataSet has failed with error DataReader.GetFieldType(3) returned null.
From what I have read, this is typically because the project cannot load or access Microsoft.SqlServer.Types, so it can't interpret the returned data effectively.
What I have tried:
Removing and re-adding the reference.
Setting the assembly to Copy Local.
Removing and reinstalling via NuGet (v14.0).
Referencing said assembly in the web.config.
Adding a utility class in Global.asax, then calling it on Application_Start to load the other dependent files:
LoadNativeAssembly(nativeBinaryPath, "msvcr120.dll")
LoadNativeAssembly(nativeBinaryPath, "SqlServerSpatial140.dll")
The error happens whether I am running locally (not such a key issue) or on an Azure VPS (SQL Server Web Edition).
The stored procedure I am calling to return the data works fine. (In fact, this code is a lift-and-shift project. The old VPS works fine if we fire it up, so it is most likely a configuration issue and all of the above was wasted effort. But the original developer is not contactable, nor are there any notes on how this was made to work.)

ResourceNotFoundException with full deploy to prod

I have a fully developed set of functions which work fine in the "dev" stage, and it's now time for me to deploy to production. Unfortunately, every time I try to deploy it runs for a long time, but after printing "Checking Stack update progress" it fails with a 404 error:
An error occurred: SentinelLambdaFunction - Function not found: arn:aws:lambda:us-east-1:837955377040:function:xyz-services-prod-sentinel (Service: AWSLambda; Status Code: 404; Error Code: ResourceNotFoundException; Request ID: 38f86b7a-99cd-11e8-af06-fffd92e40dc5).
This error is nonsensical to me, as this function does exist and executing precisely the same full deployment to "dev" results in no error. Note that in both environments/stages, we are deploying 10 functions with a full deployment.
I tried removing the function it was complaining about, hoping I could re-include it on a second deployment, but then it simply complained about a different function not existing.
I also thought maybe the "--force" parameter might push this deployment into place, but it has had no impact on the error I get.
The cycle time for each attempt is very long, so I'd be very grateful if anyone could point me in the right direction on this.
Below is a screenshot of the output when run in "verbose" mode:
In an attempt to get around the error, I thought maybe I'd have a better chance if I went into CloudFormation and explicitly deleted the stack for prod. I attempted to do this from the GUI and got the following:
This has further convinced me that this removal is important, but I'm not sure what to do next.
For me, the solution was:
serverless remove
and then try deploying again.
So the solution to this problem was to ensure that all previous traces of the CloudFormation stack were removed. In my case I had manually removed a few functions from Lambda, and the 404 errors I was getting were likely occurring in the removal attempts rather than, as I had assumed, being related to adding these functions.
Bear in mind that, like me, you may find that the first attempt to delete fails. In that case, try again and make sure to check any checkboxes exposed by the UI that indicate what caused the issues on the prior attempt.
Once I'd done that, I was able to deploy as normal from the Serverless Framework.
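For reference, the sequence I mean is roughly the following (the stage name here is an assumption; adjust it to your own setup):
serverless remove --stage prod
serverless deploy --stage prod --verbose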

Azure Data Factory Pipeline Intermittent Error 2906

I have four ADF pipelines running on different schedules (1Hr, 2Hr, 6Hr and 1Day). Since yesterday they have been failing intermittently, reporting error 2906, as follows:
{
    "errorCode": "2906",
    "message": "Package execution failed.",
    "failureType": "UserError",
    "target": "%package name%"
}
I'm unclear on the error; however, given that everything was working just fine up until yesterday, and since then it sometimes succeeds and sometimes fails, is there any advice on how or where to troubleshoot this?
I can't say this is the definitive answer, but I suspect the database was simply under performance strain. The problem appears to have gone away now.

PouchDB corruption detection

I am building a web app with offline functionality. I am using a combination of web cache and PouchDB to achieve it.
Currently I am testing recovery mechanisms against DB corruption. My premise is that since PouchDB runs on the client, it is exposed to anyone who, by mistake or on purpose, could corrupt the DB. The DB could also get corrupted by bugs or similar issues. So, if the DB gets corrupted, then unless the web app detects and cleans it, the app will never work correctly.
The test is quite simple:
- Create the PouchDB:
var dbOptions = {
    auto_compaction : false,
    cache : false
};
var db = new PouchDB('myDB', dbOptions);
- With Developer Tools, delete part of the database.
- On loading, the application tries to read all documents:
db.allDocs({include_docs : true}, function(_err, _response){
    (certain code here)
});
It is at this point that "Uncaught TypeError: Cannot set property '_rev' of undefined" is thrown. I tried catching the exception, and also using the promise provided by PouchDB, but neither worked.
Have any of you had a similar problem? How did you solve it?
EDIT:
When PouchDB returns a 500 Internal error, how is the application supposed to recover from it? I tried to destroy the database
db.destroy(function(err, info){ console.log(err || info); });
but it does not work. It returns a 500 Internal error as well.
It indeed sounds like your database got corrupted. Sorry about that; we try to write bulletproof code, but since we're working against the WebSQL/IndexedDB APIs, there's always the possibility that something goes wrong at that interface, the browser crashes, lightning strikes your computer, etc.
500 errors indicate an internal PouchDB error, so you're not supposed to recover from them. Probably the best way to protect against corruption like that is just to set up continual sync with a CouchDB server (kind of the point of PouchDB anyway). CouchDB is a full database implemented from top to bottom and is very robust – since it uses append-only database files, your database can never get corrupted. So if you use continuous sync, you can always delete the PouchDB database and recover from CouchDB.
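As a rough sketch of that pattern (the CouchDB URL is an example, not something from the question):
// Keep the local PouchDB continuously synced with a remote CouchDB,
// and rebuild the local copy from the remote if it ever gets corrupted.
var remoteUrl = 'http://localhost:5984/mydb'; // example CouchDB database URL
var db = new PouchDB('myDB');

// Live, retrying, two-way replication.
db.sync(remoteUrl, { live: true, retry: true })
  .on('error', function (err) { console.log('sync error', err); });

// On detected corruption: throw the local database away and pull a fresh copy.
function recoverFromRemote() {
  return db.destroy().then(function () {
    db = new PouchDB('myDB');
    return db.replicate.from(remoteUrl);
  });
}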
That being said, if you could let us know which version of PouchDB you're running, which browser you saw this on, or even a code snippet to reproduce, that would be really helpful. If you're using Firefox, you can also send us the storage files themselves for IDB by following the instructions here to find the Profile folder and then sending us the contents of the storage/persistent/<my_site>/idb folder. Thanks!
I got this error while adding a new schema to my RxDB database. It turned out I had included the primary key, and the wrong property names, in the encrypted fields. I removed the primary key, used the proper names, and it worked fine after that.