Stream Analytics 'Invalid Avro Format, drop invalid record' - serialization

I am trying to serialize my C# classes to Avro using the Microsoft Avro Library and send them to Event Hubs. However, when I try to read the data through Stream Analytics, it logs this error: 'Invalid Avro Format, drop invalid record'.
More details:
I am using the reflection method shown in https://azure.microsoft.com/en-in/documentation/articles/hdinsight-dotnet-avro-serialization/ to serialize to Avro format and send it to Event Hubs:
// Create a new AvroSerializer instance and specify a custom serialization strategy, AvroDataContractResolver,
// for serializing only properties attributed with DataContract/DataMember
var avroSerializer = AvroSerializer.Create<SensorData>();

// Create a memory stream buffer
using (var buffer = new MemoryStream())
{
    // Create a data set using the sample class and struct
    var expected = new SensorData { Value = new byte[] { 1, 2, 3, 4, 5 }, Position = new Location { Room = 243, Floor = 1 } };

    // Serialize the data to the stream
    avroSerializer.Serialize(buffer, expected);

    var bytes = buffer.ToArray();
    var data = new EventData(bytes) { PartitionKey = "deviceId" };

    // Send to the Event Hub client
    eventHubClient.Send(data);
}
Events are published to Event Hubs fine. I have created a worker role that consumes these events and is able to deserialize them.
However, when I set this event hub as the input to my Stream Analytics job and set the event serialization format to 'Avro', it gives the errors below:
Message: Invalid Avro Format, drop invalid record.
Message: IncorrectSerializationFormat errors are occurring too rapidly. They are being suppressed temporarily
I guess I have to include the Avro schema as well. Can anyone guide me on the correct way to serialize a C# class to Avro so that Stream Analytics can understand it?
Thanks for your time.

You will have to include the schema. Below is an example of how you can send events along with the schema; it uses an AvroContainer.
var eventHubClient = EventHubClient.CreateFromConnectionString("ReplaceConnectionString", "ReplaceEventHubPath");
int numberOfEvents = 10;

using (var memoryStream = new MemoryStream())
using (var avroWriter = AvroContainer.CreateWriter<SensorData>(memoryStream, Codec.Null))
using (var sqWriter = new SequentialWriter<SensorData>(avroWriter, numberOfEvents))
{
    Enumerable.Range(0, numberOfEvents)
        .Select(i => new SensorData() { Id = "DeviceId", Value = i })
        .ToList()
        .ForEach(data => sqWriter.Write(data));

    memoryStream.Seek(0, SeekOrigin.Begin);
    var eventData = new EventData(memoryStream.ToArray());
    eventHubClient.Send(eventData);
}
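For reference, a minimal sketch of the SensorData class the sample above assumes (the property names and types are inferred from the snippet; the DataContract/DataMember attributes are what the Microsoft Avro Library's contract resolver uses to build the schema):

using System.Runtime.Serialization;

// Assumed shape of SensorData for the sample above; the DataContract/DataMember
// attributes drive the schema that the Avro container embeds in its output.
[DataContract]
public class SensorData
{
    [DataMember]
    public string Id { get; set; }

    [DataMember]
    public int Value { get; set; }
}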

Related

Push files from an Azure Function triggered by a blob storage trigger to GitHub using Octokit.net

I use the following code in an Azure Function to push files to a GitHub repository when a new file is uploaded to blob storage, which triggers the function.
But it doesn't work if multiple files are uploaded to blob storage within a short time interval: only one random file is pushed to GitHub and then the function throws an exception. In the log:
Description: The process was terminated due to an unhandled exception.
Exception Info: Octokit.ApiValidationException: Reference cannot be updated
{"message":"Reference cannot be updated","documentation_url":"https://docs.github.com/rest/reference/git..."}
This is the code:
public static async void PushToGithub(string fileName, Stream myBlob)
{
    // GitHub variables
    var owner = GITHUB_USER;
    var repo = GITHUB_REPO;
    var token = GITHUB_TOKEN;

    // Create our client
    var github = new GitHubClient(new ProductHeaderValue("GithubCommit"));
    var tokenAuth = new Credentials(token);
    github.Credentials = tokenAuth;

    var headMasterRef = "heads/master";
    // Get the reference of the master branch
    var masterReference = await github.Git.Reference.Get(owner, repo, headMasterRef);
    // Get the latest commit of this branch
    var latestCommit = await github.Git.Commit.Get(owner, repo, masterReference.Object.Sha);

    // For the file, get its content and convert it to base64
    byte[] bytes;
    using (var memoryStream = new MemoryStream())
    {
        myBlob.Position = 0;
        myBlob.CopyTo(memoryStream);
        bytes = memoryStream.ToArray();
    }
    var pdfBase64 = Convert.ToBase64String(bytes);

    // Create blob
    var pdfBlob = new NewBlob { Encoding = EncodingType.Base64, Content = pdfBase64 };
    var pdfBlobRef = await github.Git.Blob.Create(owner, repo, pdfBlob);

    // Create new tree
    var nt = new NewTree { BaseTree = latestCommit.Tree.Sha };
    // Add items based on blobs
    nt.Tree.Add(new NewTreeItem { Path = fileName, Mode = "100644", Type = TreeType.Blob, Sha = pdfBlobRef.Sha });
    var newTree = await github.Git.Tree.Create(owner, repo, nt);

    // Create commit
    var newCommit = new NewCommit("File update " + DateTime.UtcNow, newTree.Sha, masterReference.Object.Sha);
    var commit = await github.Git.Commit.Create(owner, repo, newCommit);

    // Update HEAD with the commit
    await github.Git.Reference.Update(owner, repo, headMasterRef, new ReferenceUpdate(commit.Sha, true));
}
How can I fix this so that all the files uploaded to blob storage are pushed correctly to GitHub?
Thanks in advance,
Marco
Have a look at this official doc:
https://learn.microsoft.com/en-us/azure/azure-functions/functions-bindings-storage-blob-trigger?tabs=csharp
In addition, storage logs are created on a "best effort" basis. There's no guarantee that all events are captured. Under some conditions, logs may be missed.
If you require faster or more reliable blob processing, consider creating a queue message when you create the blob. Then use a queue trigger instead of a blob trigger to process the blob. Another option is to use Event Grid; see the tutorial Automate resizing uploaded images using Event Grid.
If your focus is on processing the blobs and you don't mind occasionally missing an event, you can use a queue trigger to make sure every blob gets processed; if you cannot afford to miss events, use Event Grid.
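As a rough sketch of the queue-trigger approach (the "file-uploads" queue and "files" container names are made up for illustration): whatever uploads the file also enqueues the blob name, and a queue-triggered function loads the blob and calls the existing PushToGithub method.

using System.IO;
using System.Threading.Tasks;
using Microsoft.Azure.WebJobs;
using Microsoft.Extensions.Logging;

// Sketch only: the queue message carries the blob name, and the Blob binding loads
// that blob from the (assumed) "files" container.
[FunctionName("ProcessUploadedFile")]
public static async Task Run(
    [QueueTrigger("file-uploads")] string fileName,
    [Blob("files/{queueTrigger}", FileAccess.Read)] Stream myBlob,
    ILogger log)
{
    log.LogInformation($"Processing {fileName} from the queue");
    // PushToGithub is the method from the question, assumed to be changed from
    // async void to async Task so it can be awaited here.
    await PushToGithub(fileName, myBlob);
}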

Azure Service Bus Message deserialization broken in Core conversion

So, I've created a new Azure Functions v3 project and am porting over a subset of functions from a v1 project that ran on .NET 4.6.2, while retiring the rest as obsolete. Unfortunately, with the change from BrokeredMessage to Message caused by moving from Microsoft.ServiceBus.Messaging to Microsoft.Azure.ServiceBus, the following deserialization method now fails with:
There was an error deserializing the object of type stream. The input source is not correctly formatted.
The problem is right there in the error, but I'm not sure what the correct new approach is; it's a bit unclear.
Serialize
public static Message CreateBrokeredMessage(object messageObject)
{
    var message = new Message(Encoding.UTF8.GetBytes(JsonConvert.SerializeObject(messageObject)))
    {
        ContentType = "application/json",
        Label = messageObject.GetType().Name
    };
    return message;
}
Deserialize
public static T ParseBrokeredMessage<T>(Message msg)
{
    var body = msg.GetBody<Stream>();
    var jsonContent = new StreamReader(body, true).ReadToEnd();
    T updateMessage = JsonConvert.DeserializeObject<T>(jsonContent);
    return updateMessage;
}
Object
var fileuploadmessage = new PlanFileUploadMessage()
{
    PlanId = file.Plan_Id.Value,
    UploadedAt = uploadTimeStamp,
    UploadedBy = uploadUser,
    FileHash = uploadedFileName,
    FileName = file.Name,
    BusinessUnitName = businessUnitName,
    UploadedFileId = uploadedFile.Id
};
Message.GetBody<T>() is an extension method for messages sent with the legacy Service Bus SDK (the WindowsAzure.ServiceBus package), where BrokeredMessage was populated with anything other than a Stream. If your sender sends an array of bytes, as you've shown, you should access it via the Message.Body property.
If your messages are sent as BrokeredMessage, the receiving code will need to pick one of the two approaches based on some information that indicates how the message was originally sent.
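In the byte-array case, a minimal sketch of the receiving side (matching the CreateBrokeredMessage sender above, which writes UTF-8 JSON bytes straight into the body) could look like this:

using System.Text;
using Microsoft.Azure.ServiceBus;
using Newtonsoft.Json;

// Sketch: read Message.Body (a byte[]) directly instead of calling GetBody<Stream>().
public static T ParseBrokeredMessage<T>(Message msg)
{
    var jsonContent = Encoding.UTF8.GetString(msg.Body);
    return JsonConvert.DeserializeObject<T>(jsonContent);
}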

Send Microsoft.Azure.ServiceBus Message to BizTalk 2013 WCF-Custom

I need to send messages from a .NET Core app via Azure Service Bus to BizTalk 2013. I have configured a WCF-Custom receive port in BizTalk, but on receiving a message I get the following error:
The adapter "WCF-Custom" raised an error message. Details "System.Xml.XmlException: The input source is not correctly formatted.
I've found examples using the WindowsAzure.ServiceBus package and BrokeredMessage, but that is deprecated. I need to use Microsoft.Azure.ServiceBus and the Message object.
I've tried many ways of serializing the XML but nothing seems to work.
In short I'm creating the message like this:
var message = new Message(Encoding.UTF8.GetBytes("<message>Hello world</message>"));
Is there a way to serialize the message correctly to be received by WCF in BizTalk 2013?
I figured it out.
For anyone who needs to send messages via Azure Service Bus, using a Microsoft.Azure.ServiceBus Message, to a BizTalk 2013 WCF-Custom receive port:
var toAddress = "sb://yourbusname.servicebus.windows.net/yourqueuename";
var bodyXml = SerializeToString(yourSerializableObject); // serialization method shown below

var soapXmlString = string.Format(
    @"<s:Envelope xmlns:s=""http://www.w3.org/2003/05/soap-envelope"" xmlns:a=""http://www.w3.org/2005/08/addressing""><s:Header><a:Action s:mustUnderstand=""1"">*</a:Action><a:To s:mustUnderstand=""1"">{0}</a:To></s:Header><s:Body>{1}</s:Body></s:Envelope>",
    toAddress, bodyXml);

var content = Encoding.UTF8.GetBytes(soapXmlString);
var message = new Message { Body = content };
message.ContentType = "application/soap+msbin1";
This wraps the XML in a proper SOAP envelope. Note that the "To" embedded in the SOAP envelope is necessary (I found it didn't work using message.To).
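To actually send the message with Microsoft.Azure.ServiceBus, something along these lines should work (the connection string and queue name are placeholders, and this assumes the sending code is async):

// Sketch of sending the SOAP-wrapped message; connection string and queue name are placeholders.
var queueClient = new QueueClient("ReplaceConnectionString", "yourqueuename");
await queueClient.SendAsync(message);
await queueClient.CloseAsync();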
For completeness, this is the serialization method (for clean XML):
public string SerializeToString<T>(T value)
{
    var emptyNamespaces = new XmlSerializerNamespaces(new[] { XmlQualifiedName.Empty });
    var serializer = new XmlSerializer(value.GetType());
    var settings = new XmlWriterSettings
    {
        Indent = false,
        OmitXmlDeclaration = true
    };

    using (var stream = new StringWriter())
    using (var writer = XmlWriter.Create(stream, settings))
    {
        serializer.Serialize(writer, value, emptyNamespaces);
        return stream.ToString();
    }
}

Power App - generate PDF

I got an assignment to see if I can make a Power App that generates a PDF file for the end user to see.
After thorough research on this topic I found out that this is not easy to achieve :)
In order to make the Power App generate and download/show the generated PDF, I took these steps:
Created a Power App with just one button :) to call the Azure function from step 2
Created an Azure function that generates and returns the PDF as StreamContent
Due to Power App limitations (or I just could not find the way), there was no way for me to get the PDF out of the response inside the Power App.
After this, I changed my Azure function to create a new blob entry, but now I have a problem getting the URL for that new entry inside the Azure function so I can return it to the Power App and use it in the Power App's Download function.
My Azure function code is below
using System;
using System.Net;
using System.Net.Http.Headers;
using System.Runtime.InteropServices;
using Aspose.Words;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log, Stream outputBlob)
{
    log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");

    var dataDir = @"D:/home";
    var docFile = $"{dataDir}/word-templates/WordAutomationTest.docx";
    var uid = Guid.NewGuid().ToString().Replace("-", "");
    var pdfFile = $"{dataDir}/pdf-export/WordAutomationTest_{uid}.pdf";

    var doc = new Document(docFile);
    doc.Save(pdfFile);

    var result = new HttpResponseMessage(HttpStatusCode.OK);
    var stream = new FileStream(pdfFile, FileMode.Open);
    stream.CopyTo(outputBlob);

    // result.Content = new StreamContent(stream);
    // result.Content.Headers.ContentDisposition = new ContentDispositionHeaderValue("attachment");
    // result.Content.Headers.ContentDisposition.FileName = Path.GetFileName(pdfFile);
    // result.Content.Headers.ContentType = new MediaTypeHeaderValue("application/octet-stream");
    // result.Content.Headers.ContentLength = stream.Length;

    return result;
}
I left the old code (the one that streams the PDF back) in comments, just as a reference of what I tried.
Is there any way to get the download URL for the newly generated blob entry inside the Azure function?
Is there any better way to make a Power App generate and download/show a generated PDF?
P.S. I tried to use the PDFViewer control inside the Power App, but this control is of no use here because you cannot set its Document value via a function.
EDIT: The response from @mathewc helped me a lot to finally wrap this up. All details are below.
New Azure function that works as expected
#r "Microsoft.WindowsAzure.Storage"
using System;
using System.Net;
using System.Net.Http.Headers;
using System.Runtime.InteropServices;
using Aspose.Words;
using Microsoft.WindowsAzure.Storage.Blob;
public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log, CloudBlockBlob outputBlob)
{
log.Info($"C# HTTP trigger function processed a request. RequestUri={req.RequestUri}");
var dataDir = #"D:/home";
var docFile = $"{dataDir}/word-templates/WordAutomationTest.docx";
var uid = Guid.NewGuid().ToString().Replace("-", "");
var pdfFile = $"{dataDir}/pdf-export/WordAutomationTest_{uid}.pdf";
var doc = new Document(docFile);
doc.Save(pdfFile);
var result = new HttpResponseMessage(HttpStatusCode.OK);
var stream = new FileStream(pdfFile, FileMode.Open);
outputBlob.UploadFromStream(stream);
return req.CreateResponse(HttpStatusCode.OK, outputBlob.Uri);
}
REMARKS:
We need to add "WindowsAzure.Storage": "7.2.1" to project.json. This package MUST be the same version as the one with the same name in %USERPROFILE%\AppData\Local\Azure.Functions.Cli.
If you change your blob output binding type from Stream to CloudBlockBlob you will have access to CloudBlockBlob.Uri which is the blob path you require (documentation here). You can then return that Uri back to your Power App. You can use CloudBlockBlob.UploadFromStreamAsync to upload your PDF Stream to the blob.

Returning Azure BLOB from WCF service as a Stream - Do we need to close it?

I have a simple WCF service that exposes a REST endpoint and fetches files from a blob container. The service returns the file as a stream. I stumbled upon this post about closing the stream after the response has been made:
http://devdump.wordpress.com/2008/12/07/disposing-return-values/
This is my code:
public class FileService
{
    [OperationContract]
    [WebGet(UriTemplate = "{*url}")]
    public Stream ServeHttpRequest(string url)
    {
        var fileDir = Path.GetDirectoryName(url);
        var fileName = Path.GetFileName(url);
        var blobName = Path.Combine(fileDir, fileName);
        return getBlob(blobName);
    }

    private Stream getBlob(string blobName)
    {
        var account = CloudStorageAccount.FromConfigurationSetting("ConnectingString");
        var client = account.CreateCloudBlobClient();
        var container = client.GetContainerReference("data");
        var blob = container.GetBlobReference(blobName);

        MemoryStream ms = new MemoryStream();
        blob.DownloadToStream(ms);
        ms.Seek(0, SeekOrigin.Begin);
        return ms;
    }
}
So I have two questions:
Should I follow the pattern mentioned in the post?
If I change my return type to byte[], what are the pros/cons?
(My client is Silverlight 4.0, just in case it has any effect.)
I'd consider changing your return type to byte[]. It's tidier.
Stream implements IDisposable, so in theory the consumer of your method will need to call your code in a using block:
using (var receivedStream = new FileService().ServeHttpRequest(someUrl))
{
    // do something with the stream
}
If your client definitely needs access to something that Stream provides, then by all means go ahead and return that, but by returning a byte[] you keep control of any unmanaged resources that are hidden under the covers.
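For example, a byte[] variant of the question's getBlob could look like this (same storage calls as the original code, just returning the buffer and keeping the MemoryStream's lifetime inside the method):

// Sketch of a byte[] version of getBlob; uses the same (legacy) storage client calls
// as the question and disposes the MemoryStream before returning.
private byte[] GetBlobBytes(string blobName)
{
    var account = CloudStorageAccount.FromConfigurationSetting("ConnectingString");
    var client = account.CreateCloudBlobClient();
    var container = client.GetContainerReference("data");
    var blob = container.GetBlobReference(blobName);

    using (var ms = new MemoryStream())
    {
        blob.DownloadToStream(ms);
        return ms.ToArray();
    }
}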
OperationBehaviorAttribute.AutoDisposeParameters is set to true by default, which calls Dispose on all inputs/outputs that are disposable. So everything just works.
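Purely for illustration, the default can be spelled out on the operation like this (AutoDisposeParameters is already true, so this changes nothing):

// Illustrative only: with AutoDisposeParameters left at its default of true, WCF
// disposes the returned Stream after the reply has been sent.
[OperationContract]
[WebGet(UriTemplate = "{*url}")]
[OperationBehavior(AutoDisposeParameters = true)]
public Stream ServeHttpRequest(string url)
{
    var blobName = Path.Combine(Path.GetDirectoryName(url), Path.GetFileName(url));
    return getBlob(blobName);
}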
This link: http://devdump.wordpress.com/2008/12/07/disposing-return-values/ explains how to manually control the process.