Requesting an image file (local, NOT a URL) for machine learning processing in a Python Azure Function app - azure-function-app

I am in the middle of developing an Azure Function app that receives an input image file through a POST request. The image should be passed directly as input (without storing it in an Azure Storage container) to a machine learning algorithm for processing. What is the snippet for getting an image file and reading its content in a Python Azure Function?

import logging
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    try:
        image = req.files.get('image')   # multipart/form-data field named 'image'
        uploadedImage = image.read()     # uploadedImage contains the raw image bytes
        # .......
    except Exception as e:
        logging.info(e)
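If the machine learning step expects a decoded image rather than raw bytes, the uploaded bytes can be wrapped in an in-memory buffer and opened with Pillow, so nothing is ever written to disk or to a storage container. A minimal sketch, assuming Pillow is installed and that model stands in for your own algorithm:

import io
from PIL import Image

# Hypothetical continuation of the handler above: decode the uploaded bytes
# entirely in memory and hand the image to the ML algorithm.
img = Image.open(io.BytesIO(uploadedImage))
result = model.predict(img)  # 'model' is a placeholder for your ML pipeline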

Related

How can I save a Telegram audio file directly to S3 from Telegram?

I am trying to save a user-sent Telegram voice message directly to S3. This happens inside AWS Lambda, so saving to disk and using s3.upload_file(filename, ...) will not work. This fails:
def audio_handler(update, context):
    message = update.effective_message
    file = message.voice.get_file()
    s3 = boto3.client('s3')
    s3.upload_file(file, Bucket='mybucket', Key='onelove.ogg')
ValueError: Filename must be a string
If I attempt to use
s3.upload_fileobj(BytesIO(file).getbuffer(), Bucket='mybucket', Key='onelove.ogg')
I get
TypeError: a bytes-like object is required, not 'File'
Voice.get_file returns an object of type File. To download the voice message to memory, you can, for example, pass an empty BytesIO object to the out argument of File.download. Please also have a look at the wiki section on working with files and media.
Disclaimer: I'm currently the maintainer of python-telegram-bot.
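Putting that suggestion together, a minimal sketch of the handler, assuming the python-telegram-bot v13-style synchronous API (where File.download accepts an out file object) and the bucket/key names from the question:

from io import BytesIO
import boto3

def audio_handler(update, context):
    message = update.effective_message
    tg_file = message.voice.get_file()

    # Download the voice message into an in-memory buffer instead of onto disk.
    buf = BytesIO()
    tg_file.download(out=buf)
    buf.seek(0)

    # upload_fileobj accepts any file-like object, so no temporary file is needed.
    s3 = boto3.client('s3')
    s3.upload_fileobj(buf, Bucket='mybucket', Key='onelove.ogg')

In python-telegram-bot v20 and later the handler is async and the corresponding call is await tg_file.download_to_memory(buf), so check which version you are on.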

Size of PDF breaks FastAPI using python-multipart?

I am trying to upload a PDF to FastAPI. After turning the PDF into a base64 blob and storing it in a .txt file, I POST this file to FastAPI using Postman.
This is my server-side code:
from fastapi import FastAPI, File, UploadFile
import base64

app = FastAPI()

@app.post("/uploadfile/")
async def create_upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    blob = base64.b64decode(contents)
    pdf = open('result.pdf', 'wb')
    pdf.write(blob)
    pdf.close()
    return {"filename": file.filename}
This procedure works fine for a single-page PDF document of size 279KB (blob-size: 372KB), but it doesn't for a multi-page document of size 1.8MB (blob-size: 2.4MB).
When I try, I get the following WARNING and a 400 Bad Request response (along with the response "detail": "There was an error parsing the body"):
"Did not find boundary character 55 at index 2"
There must be an explanation for this behavior. Maybe it has something to do with async?
This is most likely an issue with saving the file using open().
For large files, pdf.close() will execute before pdf.write() has finished saving all of the contents of the file.
In order to ensure the whole file is written before it is closed, use with, such as this:
with open('failed.pdf', 'wb') as outfile:
    outfile.write(blob)
Using with, you will not need to call close() after writing; with should also be considered best practice over saving the file object into a local variable.
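For reference, here is how the endpoint from the question might look with the context manager applied (same names and filename as in the question; the base64 handling is unchanged):

from fastapi import FastAPI, File, UploadFile
import base64

app = FastAPI()

@app.post("/uploadfile/")
async def create_upload_file(file: UploadFile = File(...)):
    contents = await file.read()
    blob = base64.b64decode(contents)
    # The context manager flushes and closes the file reliably.
    with open('result.pdf', 'wb') as pdf:
        pdf.write(blob)
    return {"filename": file.filename}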

ArcGIS offline map layer changes synchronization

In my WPF application I'm trying to use offline map functionality. Right now my feature service is configured for data sync, and I'm able to create a data replica on the server and download a local copy of the geodatabase.
_gdbSyncTask = await GeodatabaseSyncTask.CreateAsync(_featureServiceUri);
Envelope extent = new Envelope(xmin, ymin, xmax, ymax, new SpatialReference(wkidStart));
GenerateGeodatabaseParameters generateParams = await _gdbSyncTask.CreateDefaultGenerateGeodatabaseParametersAsync(extent);
_generateGdbJob = _gdbSyncTask.GenerateGeodatabase(generateParams, _gdbPath);
_generateGdbJob.JobChanged += GenerateGdbJobChanged;
_generateGdbJob.ProgressChanged += ((object sender, EventArgs e) =>
{
    UpdateProgressBar();
});
_generateGdbJob.Start();
After the initial synchronization, I'm able to work with the map in offline mode without problems. This includes operations like adding new geometries or editing existing polygons inside the local DB.
However, when I try to synchronize the changes back to the server, I get no results.
To synchronize the local database, I use the following code:
SyncGeodatabaseParameters parameters = new SyncGeodatabaseParameters()
{
    GeodatabaseSyncDirection = SyncDirection.Bidirectional,
    RollbackOnFailure = false
};
Geodatabase gdb = await Geodatabase.OpenAsync(this.GetGdbPath());
foreach (GeodatabaseFeatureTable table in gdb.GeodatabaseFeatureTables)
{
    long id = table.ServiceLayerId;
    SyncLayerOption option = new SyncLayerOption(id);
    option.SyncDirection = SyncDirection.Bidirectional;
    parameters.LayerOptions.Add(option);
}
_gdbSyncTask = await GeodatabaseSyncTask.CreateAsync(_featureServiceUri);
SyncGeodatabaseJob job = _gdbSyncTask.SyncGeodatabase(parameters, gdb);
job.JobChanged += SyncJob_JobChanged;
job.ProgressChanged += SyncJob_ProgressChanged;
job.Start();
Everything goes well and the synchronization ends with the status "Succeeded".
However, when I open the edited feature layer from the server inside the map web client, I cannot find any of my local changes. In the server database I can also see that no new records were created during the synchronization.
The interesting thing is that when I open the "Replica" data inside the web client I can see the following information:
Replica Server Gen: 2
Creation Date: 2018/02/07 10:49:54 UTC
Last Sync Date: 2018/02/07 10:49:54 UTC
The "Last Sync Date" is equal to the replica "Creation Date".
Can anyone tell me how I should interpret the situation described above? Am I missing some steps in my code? Or is some configuration missing on the server? It looks like the data modifications are successfully pushed back to the replica on the server, but after that the replica is not synchronized with the server database (should that happen automatically?).
I'm new to ArcGIS development, so any help will be appreciated.
Thanks for all the answers. It turned out that versioning was enabled on the server database, and the offline, versioned edits were not reconciled to the server.
After running a reconcile/post script (http://desktop.arcgis.com/en/arcmap/10.3/manage-data/geodatabases/automate-reconcile-post-after-sync.htm), the offline changes became visible to other system users.
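The linked article automates this with arcpy. A rough sketch of that kind of reconcile/post script, assuming an .sde connection file for the enterprise geodatabase; the path, version name, and conflict options below are placeholders to adapt to your own setup:

import arcpy

# Hypothetical connection file for the enterprise geodatabase that holds the replica data.
sde_connection = r"C:\connections\gis_server.sde"

# Reconcile all versions against the default version and post the edits,
# so that offline (replica) edits become visible to other users.
arcpy.ReconcileVersions_management(
    sde_connection,            # input_database
    "ALL_VERSIONS",            # reconcile_mode
    "sde.DEFAULT",             # target_version
    "",                        # edit_versions (empty = all edit versions)
    "LOCK_ACQUIRED",           # acquire_locks
    "NO_ABORT",                # abort_if_conflicts
    "BY_OBJECT",               # conflict_definition
    "FAVOR_TARGET_VERSION",    # conflict_resolution
    "POST",                    # with_post
    "KEEP_VERSION",            # with_delete
)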
The code looks OK at a quick glance, so I would assume that something is off in the setup.
What do you get back from the sync operation after it has completed? Note that you can just use await syncJob.GetResultsAsync to start the job and await the results.
How is the feature service set up on the server? Please refer to https://enterprise.arcgis.com/en/server/latest/publish-services/linux/prepare-data-for-offline-use.htm for the different ways to set these things up.

create a temp image file for upload

I have created an HTML5 image uploader using canvas.
I have the image data from
Canvas.toDataURL();
which is in the form
data:image/png;base64,<base64image string>
I send the above data to PHP, which is used to upload the image to an Amazon server.
I normally pass the return value of
file_get_contents(path_to_file_to_upload);
to the Amazon SDK and the work gets done.
Now, how do I convert the base64 image data into the kind of data file_get_contents returns, so I can upload the file?
I am not allowed to create a file on the server. Is there any way of creating a temp image and getting the file_get_contents data from that temp file?
Pass the return value of base64_decode() to the AWS SDK instead of the return value of file_get_contents(). file_get_contents() loads a file into a string; base64_decode() takes a base64 string and returns the decoded string. Since you have a base64 string and not a file, you would call base64_decode() (after stripping the data:image/png;base64, prefix from the data URL).

Locally calculate Dropbox hash of files

The Dropbox REST API's metadata function has a parameter named "hash": https://www.dropbox.com/developers/reference/api#metadata
Can I calculate this hash locally, without calling any remote REST API function?
I need to know this value to reduce upload bandwidth.
https://www.dropbox.com/developers/reference/content-hash explains how Dropbox computes its file hashes. A Python implementation of this is below:
import hashlib

DROPBOX_HASH_CHUNK_SIZE = 4 * 1024 * 1024  # Dropbox hashes files in 4 MB blocks

def compute_dropbox_hash(filename):
    block_hashes = b''
    with open(filename, 'rb') as f:
        while True:
            chunk = f.read(DROPBOX_HASH_CHUNK_SIZE)
            if not chunk:
                break
            # Hash each 4 MB block individually...
            block_hashes += hashlib.sha256(chunk).digest()
    # ...then return the hex digest of the concatenated block hashes.
    return hashlib.sha256(block_hashes).hexdigest()
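As a hypothetical usage example (the token and paths are placeholders), the local hash can be compared against the content_hash reported by the v2 Dropbox Python SDK before deciding whether to upload:

import dropbox

dbx = dropbox.Dropbox('ACCESS_TOKEN')  # placeholder token
md = dbx.files_get_metadata('/photos/photo.jpg')
if isinstance(md, dropbox.files.FileMetadata) and md.content_hash == compute_dropbox_hash('photo.jpg'):
    print('Local file matches the Dropbox copy; skipping upload.')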
The "hash" parameter on the metadata call isn't actually the hash of the file, but a hash of the metadata. It's purpose is to save you having to re-download the metadata in your request if it hasn't changed by supplying it during the metadata request. It is not intended to be used as a file hash.
Unfortunately I don't see any way via the Dropbox API to get a hash of the file itself. I think your best bet for reducing your upload bandwidth would be to keep track of the hash's of your files locally and detect if they have changed when determining whether to upload them. Depending on your system you also likely want to keep track of the "rev" (revision) value returned on the metadata request so you can tell whether the version on Dropbox itself has changed.
This won't directly answer your question, but is meant more as a workaround: the Dropbox SDK ships a simple updown.py example that uses file size and modification time to check whether a file is up to date.
An abbreviated example taken from updown.py:
dbx = dropbox.Dropbox(api_token)
...
# returns a dictionary of name: FileMetadata
listing = list_folder(dbx, folder, subfolder)
# name is the name of the file
md = listing[name]
# fullname is the path of the local file
mtime = os.path.getmtime(fullname)
mtime_dt = datetime.datetime(*time.gmtime(mtime)[:6])
size = os.path.getsize(fullname)
if (isinstance(md, dropbox.files.FileMetadata) and
        mtime_dt == md.client_modified and size == md.size):
    print(name, 'is already synced [stats match]')
As far as I know, no, you can't.
The only way is to use the Dropbox API, which is explained here.
The rclone Go program from https://rclone.org has exactly what you want:
rclone hashsum dropbox localfile
rclone hashsum dropbox localdir
It can't take more than one path argument, but I suspect that's something you can work with...
t0|todd#tlaptop/p8 ~/tmp|295$ echo "Hello, World!" > dropbox-hash-demo/hello.txt
t0|todd#tlaptop/p8 ~/tmp|296$ rclone copy dropbox-hash-demo/hello.txt dropbox-ttf:demo
t0|todd#tlaptop/p8 ~/tmp|297$ rclone hashsum dropbox dropbox-hash-demo
aa4aeabf82d0f32ed81807b2ddbb48e6d3bf58c7598a835651895e5ecb282e77 hello.txt
t0|todd#tlaptop/p8 ~/tmp|298$ rclone hashsum dropbox dropbox-ttf:demo
aa4aeabf82d0f32ed81807b2ddbb48e6d3bf58c7598a835651895e5ecb282e77 hello.txt