Need to transfer large file content from WCF service to Java client (Java web app)

Basically I need to transfer a large file between a WCF service and a Java client.
Can someone give me directions, please?
I need to create a WCF service that reads blob content (actually file content stored in a database column) and passes it to a Java web application (the client of the WCF service).
The file size may vary from 1 KB to 20 MB.
I have already researched the options below, but I am still not able to decide which one to go with and which are feasible.
Could someone guide me, please?
Pass file content as byte[]:
I understand this increases the amount of data sent to the client, because the content is Base64-encoded and embedded in the SOAP message itself, which makes communication slower and hurts performance.
It definitely works, but I am not sure whether this approach is advisable.
Share a network drive/FTP folder accessible to both the client and the WCF service:
The WCF service would first store the file there, and the client would then read it using Java I/O or FTP.
This looks good from a data size/bandwidth point of view, but it adds extra processing on both the service and the client side (storing and reading via the shared folder).
Streaming:
I am not sure this is feasible with a Java client. My understanding is that streaming is supported for non-.NET clients, but I am not sure how to go about it.
I understand that for streaming I need to use basicHttpBinding, but do I need a DataContract or a MessageContract, or will either work? I am also unsure what needs to be done on the Java client side.
Using MTOM for passing large data in SOAP requests:
MTOM appears to be designed specifically for transferring large data in web service calls, but I still have to investigate it further; as of now I don't know much about it. Does anyone have suggestions on this?
I understand the question is a bit lengthy, but I had to describe all four options I have considered, along with my concerns and findings for each, so that you can recommend one of them (or a new option) knowing what I have already tried.

I am in the same position as yourself and I can state from experience that option 1 is a poor choice for anything more than a couple of MB.
In my own system, upload times increase exponentially, with 25 MB files taking in excess of 30 minutes to upload.
I've run some timings and the bulk of this is in the transfer of the file from the .NET client to the Java web service. Our web service is a facade for a set of 3rd-party services; using the built-in client provided by the 3rd party (not viable in the business context) is significantly faster: less than 5 minutes for a 25 MB file. Upload to our client application is also quick.
We have tried MTOM and, unless we implemented it incorrectly, didn't see huge improvements (under a 10% speed increase).
Next port of call will be option 2 - file transfers are relatively quick so by uploading the file directly to one of the web service hosts I'm hoping this will speed things up dramatically - if I get some meaningful results I will add them to my post.
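On option 3, for what it's worth: from the Java side, a WCF service exposed over basicHttpBinding with transferMode="StreamedResponse" just looks like an ordinary HTTP response whose body can be read incrementally. A minimal sketch of the client-side loop (the service URL is hypothetical; only the chunked-copy loop matters):

```java
import java.io.*;
import java.net.HttpURLConnection;
import java.net.URL;

public class StreamedDownload {
    // Copies the response body in fixed-size chunks, so the full file
    // is never held in memory at once.
    public static long copy(InputStream in, OutputStream out) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long total = 0;
        int read;
        while ((read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Hypothetical endpoint; a WCF service using basicHttpBinding with a
    // streamed transfer mode would be consumed the same way.
    public static void download(String serviceUrl, File target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(serviceUrl).openConnection();
        try (InputStream in = conn.getInputStream();
             OutputStream out = new BufferedOutputStream(new FileOutputStream(target))) {
            copy(in, out);
        }
    }
}
```

The key property is that memory use is bounded by the buffer size, not the file size, which is exactly what byte[]-in-SOAP cannot give you.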


Why mock WCF services with SoapUI?

I am trying to understand mocking WCF services using SoapUI.
Quoting from SmartBear's blog, this can be handy for:
Rapid Web Services Prototyping
Generate a complete static mock implementation from a WSDL in seconds
and add dynamic functionality using Groovy. This allows you to
implement and test clients much faster than if you had needed to wait
for the actual solution to get built.
Client testing or development
Clients can be developed against the MockService and
tested without access to the live services.
So from this and some other blogs I went through, I understand that the primary purpose is to keep testing moving along before the service is available. (I must say that I didn't actually get this: the service has to be up before you set up mock requests and responses.) Does it imply that the service should be available at the time we set up these mocks, so that we can play with them later when it is actually not available?
Also, can we say there would be no difference between saving multiple test cases for a given service and mocking it, given that the service is up and running (after all, it is a service; it is supposed to be running)?
I was on a large project where 3 different systems all traded data with services. It was the same WSDL (an "industry standard", very complicated beast that didn't fit either of our systems particularly well), and we all had clients for sending data to the servers of the other systems. Each dev/test team had to develop both a client and a service, and we didn't really understand mocks.
As you would expect, we all had our clients done before any of the services were ready. And testing couldn't do anything for months. And when they could finally test (the day after the devs were able to get data to finally flow), things were a mess.
So I can't time-travel back to 2010 and save myself, but I CAN save YOU.
Here's where you still don't get it:
The service has to be up before you set up mock requests and
responses
You don't need the service to be up, built, coded, funded, or even approved. The SoapUI mock service IS the service. Rather, it's a very capable stand-in. So once you have a WSDL, you can build the mock service, create some sample responses, and hit it with a client (possibly another instance of SoapUI).
So why do this? Lots of reasons.
Multiple dev teams, on different timelines.
Testing can proceed, yes.
Avoid surprises down the road (after code is written). Just because we agreed to use this WSDL doesn't mean that my data is what your system expects, and vice versa. Let's find out now!
Examples:
For the Armor field, the WSDL just says "string" but our system allows 25 chars and yours allows 45.
We need you to send UserHighScore if you change LifetimeAchievements. Otherwise it gets reset.
I thought we agreed to put UserRank in the User Attributes tag, not the Power tag?
UserRank needs Effective Date, otherwise it causes our side to delete all of the UserRank history.
That's how it used to work.
Stop sending us the same data that we just sent you. When you UPPERCASE our data that you just received, that ISN'T A CHANGE that you need to tell us about.
Ideally, the system would be developed first with mock services and SoapUI. Once the WSDL is developed, stand up the mock service, then submit sample requests via SoapUI. Both test and dev should be involved. Look at the data being sent from the SoapUI clients and build/script responses. Spend a few days developing test cases. Audit for invalid data, return realistic responses, make sure that you return failures as well as successes, and try to consider (and document) all expected failure scenarios, including time-outs (these can be scripted with the sleep() function in the mock service). The time-out scenario can be used to simulate load, so that you can see the impact on clients and infrastructure (we were able to tip over Layer7 gateways by sending transactions at a higher rate than the service could handle, if we kept at it for 30 minutes).
So use the mock services as a joint workshop to hammer out the details of what your service-oriented solution will look like, THEN code it up. You'll be glad you did.
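To make the "the mock IS the service" point concrete, here is a self-contained Java sketch of the principle (this is not SoapUI itself, just an illustration): a tiny in-process HTTP server returns a canned SOAP response, and a client can be developed and tested against it with no real service anywhere. The operation and element names (GetUserRankResponse, UserRank) are made up for illustration:

```java
import com.sun.net.httpserver.HttpServer;
import java.io.*;
import java.net.*;
import java.nio.charset.StandardCharsets;

public class MockServiceDemo {
    // Canned SOAP-style response, standing in for a SoapUI MockService reply.
    static final String CANNED =
        "<soap:Envelope xmlns:soap=\"http://schemas.xmlsoap.org/soap/envelope/\">"
      + "<soap:Body><GetUserRankResponse><UserRank>42</UserRank>"
      + "</GetUserRankResponse></soap:Body></soap:Envelope>";

    // Starts a throwaway "service" on an ephemeral port.
    public static HttpServer start() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress(0), 0);
        server.createContext("/mock", exchange -> {
            byte[] body = CANNED.getBytes(StandardCharsets.UTF_8);
            exchange.getResponseHeaders().set("Content-Type", "text/xml");
            exchange.sendResponseHeaders(200, body.length);
            exchange.getResponseBody().write(body);
            exchange.close();
        });
        server.start();
        return server;
    }

    // A client built against the mock; it neither knows nor cares that
    // the real service doesn't exist yet.
    public static String call(int port) throws IOException {
        HttpURLConnection conn = (HttpURLConnection)
            new URL("http://localhost:" + port + "/mock").openConnection();
        try (InputStream in = conn.getInputStream();
             ByteArrayOutputStream out = new ByteArrayOutputStream()) {
            in.transferTo(out);
            return out.toString(StandardCharsets.UTF_8);
        }
    }
}
```

SoapUI does the same job from a WSDL, with Groovy scripting on top for dynamic responses and simulated delays.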

Async WCF and Protocol Behaviors

FYI: This will be my first real foray into Async/Await; for too long I've been settling for the familiar territory of BackgroundWorker. It's time to move on.
I wish to build a WCF service, self-hosted in a Windows service running on a remote machine in the same LAN, that does this:
Accepts a request for a single .ZIP archive
Creates the archive and packages several files
Returns the archive as its response to the request
I have to support archives as large as 10GB. Needless to say, this scenario isn't covered by basic WCF designs; we must take additional steps to meet the requirement. We must eliminate timeouts while the archive is building and memory errors while it's being sent. Both of these occur under basic WCF designs, depending on the size of the file returned.
My plan is to proceed using task-based asynchronous WCF calls and streaming mode.
I have two concerns:
Is this the proper approach to the problem?
Microsoft has done a nice job of abstracting all of this, but what about the underlying protocols? What goes on under the hood? Does the server keep the connection alive while the archive is building (which could take several minutes), or does it close the connection and initiate a new one once the operation is complete, thereby requiring me to route the request properly through the client machine's firewall?
For #2, clearly I'm hoping for the former (keep-alive). But after some searching I'm not easily finding an answer. Perhaps you know.
You need streaming for big payloads. That is the right approach. This has nothing at all to do with asynchronous IO. The two are independent. The client cannot even tell that the server is async internally.
I'll add my standard answers for whether to use async IO or not:
https://stackoverflow.com/a/25087273/122718 Why does the EF 6 tutorial use asychronous calls?
https://stackoverflow.com/a/12796711/122718 Should we switch to use async I/O by default?
Each request runs over a single connection that is kept alive. This holds both while streaming large amounts of data and during a long initial delay. I'm not sure why you are concerned about routing; does your router kill such connections? If so, that's a problem.
Regarding keep alive, there is nothing going over the wire to do that. TCP sessions can stay open indefinitely without any kind of wire traffic.
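For reference, the streaming and long-build-time aspects are usually handled in the binding configuration rather than in code. A sketch of the relevant basicHttpBinding settings (the binding name and exact limits here are placeholders to adjust for your 10 GB case; sendTimeout must cover the several minutes the archive takes to build and send):

```xml
<basicHttpBinding>
  <binding name="streamedArchive"
           transferMode="StreamedResponse"
           maxReceivedMessageSize="10737418240"
           sendTimeout="00:30:00"
           receiveTimeout="00:30:00" />
</basicHttpBinding>
```

The client-side binding needs matching transferMode and maxReceivedMessageSize values, since quotas are enforced on the receiving end.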

Cancelling WCF calls with large data?

I'm about to implement a FileService using WCF. It should be able to upload files, given the file content itself and the file name. The current ServiceContract looks like the following:
[ServiceContract]
public interface IFileService
{
    [OperationContract]
    [FaultContract(typeof(FaultException))]
    byte[] LoadFile(string relativeFileNamePath);

    [OperationContract]
    [FaultContract(typeof(FaultException))]
    void SaveFile(byte[] content, string relativeFileNamePath);
}
It works fine at the moment, but I want to reduce the network payload of my application when using this FileService. I need to provide many files as soon as the user opens a specific section of my application, but I might be able to cancel some of the transfers as the user navigates further through the application. As many of my files are somewhere between 50 and 300 MB, it takes quite a few seconds to transfer them (the application might run on very slow networks, where it might take up to a minute).
To clarify, and to outline the difference from all those other WCF questions: the specific problem is that moving the data between client and server is the bottleneck, not the performance of the service itself. Is changing the interface to a streamed WCF service reasonable?
It is good practice to use a stream if the file size is above a certain amount. In the enterprise application we are writing at my work, anything bigger than 16 KB is streamed; anything smaller is buffered. Our file service is specifically designed to handle this logic.
When the transfer mode of your service is set to buffered, data is buffered on the client as well as on the service while it is being transmitted. This means that if you are sending a 300 MB file, all 300 MB are buffered on both ends before the call completes. This will definitely create bottlenecks. For performance reasons, buffering should be used only for small files that buffer quickly. Otherwise, a stream is the way to go.
If the majority or all of your files are large, I'd switch to using a stream.
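One consequence worth spelling out for the cancellation requirement: with the byte[] contract above, the whole payload is buffered before the call completes, so there is no point at which the client can abandon a transfer it no longer needs. With a streamed contract, the client owns the read loop and can simply stop pulling. A language-neutral sketch of that client-side idea (shown in Java; the same loop works for any consumer of the stream):

```java
import java.io.*;
import java.util.concurrent.atomic.AtomicBoolean;

public class CancellableTransfer {
    // Reads the stream chunk by chunk and checks a cancellation flag
    // between chunks, so navigation away can abort a transfer mid-flight.
    public static long copyUntilCancelled(InputStream in, OutputStream out,
                                          AtomicBoolean cancelled) throws IOException {
        byte[] buffer = new byte[16 * 1024];
        long total = 0;
        int read;
        while (!cancelled.get() && (read = in.read(buffer)) != -1) {
            out.write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

Setting the flag (for example, from a UI navigation handler) stops the copy at the next chunk boundary, which is exactly the behavior a buffered byte[] call cannot offer.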

WCF Data Service whose data source is another WCF Data Service

Does someone know whether it is possible to use one WCF Data Service as the data source of another WCF Data Service? If so, how?
So the short answer is yes. I have actually consumed one WCF service from another (an HttpBinding endpoint on one machine, with that service then using a NamedPipesBinding endpoint to communicate with multiple desktop apps, doing some data transformation in the middle). That would not be an issue at all: you would set up a proxy/client just as you would in a desktop client and handle everything in your new service as if it were just passing information along. You could even create a shared library for the DataContracts and such.
HOWEVER, I would not suggest the leapfrog method in your implementation. Depending on how many customers you are potentially opening the door to, you may be introducing a bottleneck (if you have a singleton service) or overloading your existing service with the many connections coming from the new one. Since you have a SQL Server, why not have a public WCF service on your web/app server that connects to it directly and provides the data you need? I suggest this because your situation can become exponentially complicated once you start passing credentials for authentication and authorization between the two services, depending on your security settings. Another thing to consider is the complexity of debugging the new service, the old one, and a client at the same time, as if it weren't painful enough with just a server and a client. Since you are opening it on a public-facing port, there are different things to set up, and debugging everything on the same machine is not the same as debugging a public-facing application server.
Sorry if this goes against what you were hoping to hear. I'm just saying that it is possible, but not suggested (at least by me) in your particular case.

Are there any libraries or samples for non-duplex WCF chunking?

I'm looking for a way of implementing a file transfer service over HTTPS which uses chunking to cope with intermittent connectivity loss and to reduce the large timeouts required by using Streaming. Because the client may be behind firewalls, the Chunking Channel sample on MSDN isn't suitable.
There is an old discussion about this on the Microsoft Forums but not a complete answer, or at least not one that I have the know-how to implement.
There is a sample of a resumable download service here: http://cid-8d29fb569d8d732f.skydrive.live.com/self.aspx/.Public/WCF/Resume%5E_Download%5E_WCF%5E_1%20%5E52%5E6.zip
This sample uses a custom WCF binding. It looks like it works by getting one segment of the file at a time, with the ability to fetch any remaining segments once the system is back online.
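The segment-at-a-time idea in that sample can be sketched independently of any particular WCF binding: the client records how many bytes it already has and asks only for the rest. In the sketch below the "remote" file is just a byte array standing in for the service; a real implementation would turn the offset into an HTTP Range request or a call on the sample's custom chunking channel:

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

public class ResumableDownload {
    // Returns one segment of the remote file. In a real chunking service
    // this would become a request such as "bytes=offset-(offset+size-1)".
    public static byte[] fetchSegment(byte[] remote, long offset, int size) {
        int end = (int) Math.min(offset + size, remote.length);
        return Arrays.copyOfRange(remote, (int) offset, end);
    }

    // Resumes from however many bytes the local copy already holds, so a
    // dropped connection costs at most the current segment.
    public static void resume(byte[] remote, ByteArrayOutputStream local,
                              int segmentSize) {
        while (local.size() < remote.length) {
            byte[] segment = fetchSegment(remote, local.size(), segmentSize);
            local.write(segment, 0, segment.length);
        }
    }
}
```

Because progress is tracked as a simple byte offset, the same loop handles both a fresh download (offset 0) and a resumed one.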