Create a PDF file using PDFTron and then rename or delete it - pdf

I am trying to create a PDF file using PDFTron in an application that runs in the UWP environment. I am able to create a file successfully. Depending on user input, that newly created file might need to be renamed or deleted from the system entirely. However, when I try to access the file that was just created, the system throws the following exception:
Exception thrown: 'System.IO.IOException' in System.IO.FileSystem.dll The process cannot access the file (filename) because it is being used by another process.
The following shows how the file is created:
await sdfDoc.SaveAsync(filePath, SDFDocSaveOptions.e_linearized, "%PDF-1.5");
sdfDoc.Dispose();
And this is my delete implementation:
var filedelete = Task.Run(() => File.Delete(filePath));
The creation of the file runs on a separate Task, and the deletion takes place upon a button press.
I understand the nature of the exception, but I was wondering whether PDFTron returns the file's resources to the system after the file has been created?
Any help or direction would be much appreciated.
Thank you.

PDFNet uses reference counting internally to know when to release the filesystem handles and memory.
For example, the following would trigger the issue where the file is still locked.
PDFDoc doc = new PDFDoc(input_filename);
doc.InitSecurityHandler();
SDFDoc sdfdoc = doc.GetSDFDoc();
await sdfdoc.SaveAsync(output_file_path, SDFDocSaveOptions.e_linearized, "%PDF-1.5");
sdfdoc.Dispose();
await Task.Run(() => File.Delete(output_file_path)); // fails, as the PDFDoc still holds a reference.
But this would work as expected.
using (PDFDoc doc = new PDFDoc(input_filename))
{
    doc.InitSecurityHandler();
    SDFDoc sdfdoc = doc.GetSDFDoc();
    await sdfdoc.SaveAsync(output_file_path, SDFDocSaveOptions.e_linearized, "%PDF-1.5");
    sdfdoc.Dispose();
}
await Task.Run(() => File.Delete(output_file_path)); // works
Note the using statement for the PDFDoc instance, and the manual dispose of the SDFDoc instance, though you could use a using statement on that also.
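For completeness, here is a minimal sketch of the same flow with both instances wrapped in using statements (same variables and file paths as above); the exact structure may vary in your application:
using (PDFDoc doc = new PDFDoc(input_filename))
{
    doc.InitSecurityHandler();
    using (SDFDoc sdfdoc = doc.GetSDFDoc())
    {
        await sdfdoc.SaveAsync(output_file_path, SDFDocSaveOptions.e_linearized, "%PDF-1.5");
    } // SDFDoc disposed here
} // PDFDoc disposed here, releasing the remaining reference to the file
await Task.Run(() => File.Delete(output_file_path)); // should now succeed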

Related

Rename azure blob file and Download it

I have a few files in Azure blob storage that are stored with unique file names, and when the client wants to download one, I want to rename it to a friendly name.
I'm still using the 2014 Azure storage DLLs in my project and I'm not planning to update them anytime soon, so I can't use the built-in ContentDisposition to rename it.
I tried using the following code in my controller:
var blob = blobStorage.GetBlobRef("https://mysite.blob.core.windows.net/my-container/WERF3234435FFF_ERFas23E.doc");
MemoryStream memStream = new MemoryStream();
blob.DownloadToStream(memStream);
Response.ContentType = blob.Properties.ContentType;
Response.AddHeader("Content-Disposition", "Attachment; filename=abcd_New.doc");
Response.AddHeader("Content-Length", blob.Properties.Length.ToString());
Response.BinaryWrite(memStream.ToArray());
but it's not downloading the file.
I also tried using this:
MemoryStream memStream = new MemoryStream();
blob.DownloadToStream(memStream);
System.Web.HttpContext.Current.Response.Clear();
System.Web.HttpContext.Current.Response.ContentType = blob.Properties.ContentType;
System.Web.HttpContext.Current.Response.AddHeader("Content-Disposition", "Attachment; filename=" + friendlyName + ".doc");
System.Web.HttpContext.Current.Response.AddHeader("Content-Length", blob.Properties.Length.ToString());
System.Web.HttpContext.Current.Response.BinaryWrite(memStream.ToArray());
System.Web.HttpContext.Current.Response.End();
My business logic is in a separate solution, and I get the blob reference from there in my main solution.
Am I missing something?
Since we're talking about ASP.NET MVC, I'm missing the controller/action in your code. You're not supposed to write to the HttpContext yourself when doing ASP.NET MVC; you have ActionResults for that.
public ActionResult Download()
{
    // ...
    var bytes = memStream.ToArray();
    return File(bytes, System.Net.Mime.MediaTypeNames.Application.Octet, "abcd_New.doc");
}
The browser will decide whether to save the file as a download or open it directly within the browser window. If you want to control that, you will need the following piece of code before you call the return File(...) method:
var contentDisposition = new System.Net.Mime.ContentDisposition
{
    FileName = "abcd_New.doc",
    Inline = false // true will try to open in Browser, false will download
};
Response.AppendHeader("Content-Disposition", contentDisposition.ToString());
We need to flush the response after writing the file to it. I used the code you provided, and after adding the following lines I can see the file being downloaded from the server:
Response.BinaryWrite(memStream.ToArray());
Response.Flush();
Response.End();

An exception "The Content-MD5 you specified did not match what we received"

I got an exception I have never seen before when testing my application, which uploads a file from EC2 to S3. The content is:
Exception in thread "Thread-1" com.amazonaws.services.s3.model.AmazonS3Exception: The Content-MD5 you specified did not match what we received. (Service: Amazon S3; Status Code: 400; Error Code: BadDigest; Request ID: 972CB8E04388AB20), S3 Extended Request ID: T7bmFnQ2RlGWlJD+aGYfTy97XZw88pbQrwNB8YCezSjyq6O2joxHRP/6ko+Q2zZeGewkw4x/90k=
at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1383)
at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:902)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:607)
at com.amazonaws.http.AmazonHttpClient.doExecute(AmazonHttpClient.java:376)
at com.amazonaws.http.AmazonHttpClient.executeWithTimer(AmazonHttpClient.java:338)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:287)
at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3676)
at com.amazonaws.services.s3.AmazonS3Client.putObject(AmazonS3Client.java:1439)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInOneChunk(UploadCallable.java:131)
at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:123)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:139)
at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:47)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:745)
What can I do to fix this bug? I used the same code as before in my application.
I think I have solved my problem. I finally found that some of my files were actually changing during the upload. The files are generated by another thread, so generating and uploading happen at the same time: a file cannot be generated instantly, and while it is still being generated it may already be uploading, so its contents change mid-upload.
The AmazonS3Client computes the MD5 of the file at the beginning of the upload, and then the whole file is uploaded to S3; by that point the file differs from what it was at the beginning, so the MD5 no longer matches. I modified my program to be single-threaded, and the problem never turned up again.
Another way to run into this issue is to run code such as this (Python):
with open(filename, 'r') as fd:
    self._bucket1.put_object(Key=key, Body=fd)
    self._bucket2.put_object(Key=key, Body=fd)
In this case the file object (fd) is pointing to the end of the file when it reaches line 3 (the second put_object call), so we will get the "Content-MD5" error. To avoid it, we need to seek the file reader back to the start of the file:
with open(filename, 'r') as fd:
    bucket1.put_object(Key=key, Body=fd)
    fd.seek(0)
    bucket2.put_object(Key=key, Body=fd)
This way we won't get the aforementioned Boto error.
I also ran into this error when I was doing something like this:
InputStream productInputStream = convertImageFileToInputStream(file);
InputStream thumbnailInputStream = generateThumbnail(productInputStream);
String uploadedFileUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productFilename, productInputStream);
String uploadedThumbnailUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productThumbnailFilename, thumbnailInputStream);
The generateThumbnail method was manipulating the productInputStream using a third party library. Because I couldn't modify the third party library, I simply performed the upload first:
InputStream productInputStream = convertImageFileToInputStream(file);
// do this first...
String uploadedFileUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productFilename, productInputStream);
/// and then this...
InputStream thumbnailInputStream = generateThumbnail(productInputStream);
String uploadedThumbnailUrl = amazonS3Uploader.uploadToS3(BUCKET_PRODUCTS_IMAGES, productThumbnailFilename, thumbnailInputStream);
... and added this line inside my generateThumbnail method:
productInputStream.reset();
FWIW, I've managed to find a completely different way of triggering this problem, which requires a different solution.
It turns out that if you decide to assign ObjectMetadata to a PutObjectRequest explicitly, for example to specify a cacheControl setting, or a contentType, then the AWS SDK mutates the ObjectMetadata instance to stash the MD5 that it computes for the put request. This means that if you are putting multiple objects, all of which you think should have the same metadata assigned to them, you still need to create a new ObjectMetadata instance for each and every PutObjectRequest. If you don't do this, then it reuses the MD5 computed from the previous put request and you get the MD5 mismatch error on the second object you try to put.
So, to be explicit, doing something like this will fail on the second iteration:
ObjectMetadata metadata = new ObjectMetadata();
metadata.setContentType("text/html");
for(Put obj: thingsToPut)
{
    PutObjectRequest por =
        new PutObjectRequest(bucketName, obj.s3Key, obj.file);
    por = por.withMetadata(metadata);
    PutObjectResult res = s3.putObject(por);
}
You need to do it like this:
for(Put obj: thingsToPut)
{
    ObjectMetadata metadata = new ObjectMetadata(); // <<-- New ObjectMetadata every time!
    metadata.setContentType("text/html");
    PutObjectRequest por =
        new PutObjectRequest(bucketName, obj.s3Key, obj.file);
    por = por.withMetadata(metadata);
    PutObjectResult res = s3.putObject(por);
}
I too ran into this problem. How I solved this:
I have a microservice that processes AWS SQS Messages. Each message would create multiple temporary files that would have to be uploaded to S3.
The issue was that the temporary files were named with fixed names without any salt added to them.
So between two messages, it was possible for the original file that was to be uploaded to be overwritten.
I fixed it by adding a random salt (this can be a UUID or the current time in millis depending on what you want) to the file names, after which the files were not being over-written and were successfully uploaded to S3.
For me, the cause was that I used ContentLength in the params while executing the upload. When it is commented out, it works just fine.
const params = {
    Bucket: "",
    ContentType: "application/json",
    Key: "filename.json",
    // ContentLength: body.length, <--- what I have commented out
    Body: body
};
await s3.upload(params).promise();

Windows Azure Storage Blobs to zip file with Express

I am trying to use this plugin (express-zip). On the Azure Storage side we have getBlobToStream, which gives us the file in a specific stream. What I do now is get the image from the blob, save it on the server, and then res.zip it. Is it somehow possible to create a write stream which will write into a read stream?
Edit: The question has been edited to ask about doing this in express from Node.js. I'm leaving the original answer below in case anyone was interested in a C# solution.
For Node, you could use a strategy similar to what express-zip uses, but instead of passing a file read stream in this line, pass in a blob read stream obtained using createReadStream.
Solution using C#:
If you don't mind caching everything locally while you build the zip, the way you are doing it is fine. You can use a tool such as AzCopy to rapidly download an entire container from storage.
To avoid caching locally, you could use the ZipArchive class, as in the following C# code:
internal static void ArchiveBlobs(CloudBlockBlob destinationBlob, IEnumerable<CloudBlob> sourceBlobs)
{
    using (Stream blobWriteStream = destinationBlob.OpenWrite())
    {
        using (ZipArchive archive = new ZipArchive(blobWriteStream, ZipArchiveMode.Create))
        {
            foreach (CloudBlob sourceBlob in sourceBlobs)
            {
                ZipArchiveEntry archiveEntry = archive.CreateEntry(sourceBlob.Name);
                using (Stream archiveWriteStream = archiveEntry.Open())
                {
                    sourceBlob.DownloadToStream(archiveWriteStream);
                }
            }
        }
    }
}
This creates a zip archive in Azure storage that contains multiple blobs without writing anything to disk locally.
I'm the author of express-zip. What you are trying to do should be possible. If you look under the covers, you'll see I am in fact adding streams into the zip:
https://github.com/thrackle/express-zip/blob/master/lib/express-zip.js#L55
So something like this should work for you (prior to me adding support for this in the interface of the package itself):
var zip = zipstream(exports.options);
zip.pipe(express.response || http.ServerResponse.prototype); // res is a writable stream
var addFile = function(file, cb) {
    zip.entry(getBlobToStream(), { name: file.name }, cb);
};
async.forEachSeries(files, addFile, function(err) {
    if (err) return cb(err);
    zip.finalize(function(bytesZipped) {
        cb(null, bytesZipped);
    });
});
Apologies if I've made horrible errors above; I haven't worked on this for a bit.

Winrt StreamWriter & StorageFile does not completely Overwrite File

A quick search here yielded nothing, so I have started using some rather roundabout ways to use StreamWriter in my WinRT application. Reading works well; writing works differently. What I'm seeing is that when I select my file to write, if I choose a new file there is no problem: the file is created as I expect. If I choose to overwrite a file, the file is overwritten only up to the point where the stream stops writing; if the original file was larger, the old contents remain past where my new stream wrote.
The code is as such:
public async void WriteFile(StorageFile selectedFileToSave)
{
    // At this point, selectedFileToSave is from the Save File picker, so it can be a new or existing file
    Encoding enc = new UTF8Encoding();
    Stream dotNetStream = await selectedFileToSave.OpenStreamForWriteAsync();
    StreamWriter writeStream = new StreamWriter(dotNetStream, enc);
    // Do writing here
    // Close
    writeStream.Write(Environment.NewLine);
    await writeStream.FlushAsync();
    await dotNetStream.FlushAsync();
}
Can anyone offer clues on what I could be missing? There are lots of functions missing in WinRT, so I'm not really seeing a way to get around this.
Alternatively, you can set the length of the stream to 0 with the SetLength method before using the StreamWriter:
var stream = await file.OpenStreamForWriteAsync();
stream.SetLength(0);
using (var writer = new StreamWriter(stream))
{
    writer.Write(text);
}
Why not just use the helper methods in the FileIO class? You could call:
FileIO.WriteTextAsync(selectedFileToSave, newTextContents);
If you really need a StreamWriter, first truncate the file by calling
FileIO.WriteBytesAsync(selectedFileToSave, new byte[0]);
And then continue with your existing code.
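Putting the two steps together, a minimal sketch (reusing selectedFileToSave and the StreamWriter pattern from the question) could look like this:
// Truncate the existing file first so no old contents remain past the new data.
await FileIO.WriteBytesAsync(selectedFileToSave, new byte[0]);
// Then open the stream and write as before.
using (var dotNetStream = await selectedFileToSave.OpenStreamForWriteAsync())
using (var writeStream = new StreamWriter(dotNetStream, new UTF8Encoding()))
{
    writeStream.Write(Environment.NewLine);
    await writeStream.FlushAsync();
}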

Do I need to call CachedFileManager.DeferUpdates in Windows 8 app

In the file picker Windows 8 sample a file is saved like this:
CachedFileManager.DeferUpdates(file);
await FileIO.WriteTextAsync(file, stringContent);
FileUpdateStatus status = await CachedFileManager.CompleteUpdatesAsync(file);
I'm serialising an object as XML, so I'm doing it slightly differently:
// CachedFileManager.DeferUpdates(file);
var ras = await file.OpenAsync(FileAccessMode.ReadWrite);
var outStream = ras.GetOutputStreamAt(0);
var serializer = new XMLSerializer();
serializer.Write(myObject, outStream);
// FileUpdateStatus status = await CachedFileManager.CompleteUpdatesAsync(file);
It works with or without the CachedFileManager (commented out above).
So, should I include the CachedFileManager, and if I do use it, am I saving the file in the right way?
This code works and saves the file fine, but I don't like including code that I don't understand.
Yes, this code will work without CachedFileManager. But when you use CachedFileManager, you inform the file provider that the file is in the process of being changed. If your file is located on SkyDrive, it is faster to create a file and upload it at once instead of updating it multiple times.
You can have the full story there : http://www.jonathanantoine.com/2013/03/25/win8-the-cached-file-updater-contract-or-how-to-make-more-useful-the-file-save-picker-contract/
It simply tells the "repository" app to upload the file.
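As an illustration, a minimal sketch of the question's serialisation code wrapped in the deferral calls (using the same hypothetical XMLSerializer helper from the question) might look like this:
// Tell the file provider that updates are coming, so it can defer syncing.
CachedFileManager.DeferUpdates(file);
using (var ras = await file.OpenAsync(FileAccessMode.ReadWrite))
{
    var outStream = ras.GetOutputStreamAt(0);
    var serializer = new XMLSerializer(); // hypothetical serializer from the question
    serializer.Write(myObject, outStream);
}
// Tell the file provider we are done, so it can upload/sync the file once.
FileUpdateStatus status = await CachedFileManager.CompleteUpdatesAsync(file);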