Using WCF to provide large report datasets - wcf

I have a need for an application to access reporting data from a remote database. We currently have a WCF service that handles the I/O for this database. Normally the application just sends small messages back and forth between the WCF service and itself, but now we need to run some historical reports on that activity. The result could be several hundred to a few thousand records. I came across http://msdn.microsoft.com/en-us/library/ms733742.aspx which talks about streaming, but it also mentions segmenting messages, which I didn't find any more information on. What is the best way to send large amounts of data such as this from a WCF service?

It seems my options are streaming or chunking. Streaming restricts other WCF features, message security being one (http://msdn.microsoft.com/en-us/library/ms733742.aspx). Chunking is breaking up a message into pieces then putting those pieces back together at the client. This can be done by implementing a custom Channel which MS has provided an example of here: http://msdn.microsoft.com/en-us/library/aa717050.aspx. This is implemented below the security layer so security can still be used.
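To make the chunking idea concrete, here is a minimal, transport-agnostic sketch in Python. It is not the custom WCF channel from the MSDN sample; the `Chunk` type and the `split_message` and `reassemble` names are hypothetical and only illustrate splitting a payload into numbered pieces and putting them back together on the client.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class Chunk:
    message_id: str  # identifies which message this piece belongs to
    index: int       # position of this piece within the message
    total: int       # number of pieces the receiver should expect
    payload: bytes


def split_message(message_id: str, body: bytes, chunk_size: int) -> Iterator[Chunk]:
    """Yield the body as numbered pieces of at most chunk_size bytes."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    total = max(1, -(-len(body) // chunk_size))  # ceiling division
    for index in range(total):
        start = index * chunk_size
        yield Chunk(message_id, index, total, body[start:start + chunk_size])


def reassemble(chunks: List[Chunk]) -> bytes:
    """Rebuild the original body; pieces may arrive in any order."""
    if not chunks:
        raise ValueError("no chunks received")
    total = chunks[0].total
    by_index = {chunk.index: chunk.payload for chunk in chunks}
    missing = [i for i in range(total) if i not in by_index]
    if missing:
        raise ValueError(f"missing chunks: {missing}")
    return b"".join(by_index[i] for i in range(total))
```

The receiver only needs the index and the total count to detect missing pieces and restore the order, which is why each piece carries both.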

Related

confusion over aggregator service in microservices pattern

I need to create a service that collects and consolidates events from other services. From what I have found on the internet, an aggregator service helps you find out what is going on in the application flow, but I am confused and need your help. Does an aggregator microservice mean that every input or output of a service should be sent to the aggregator service with a date and time? Cloud platforms also offer services such as Application Insights; doesn't that do the same thing? And even if we store every event, it is going to be a huge amount of data in the database. Is that really the best solution?
To answer your first question:
Does an aggregator microservice mean that every input or output of a service should be sent to the aggregator service with a date and time?
Not really. Aggregator Microservice is a pattern: it is another service that receives a request, makes requests to multiple other services, combines the results, and responds to the initiating request.
So I guess you are looking for a log aggregator, a software function that consolidates log data from throughout the IT infrastructure into a single centralized platform where it can be reviewed and analyzed.
Cloud platforms also offer services such as Application Insights; doesn't that do the same thing? Yes, you can say that it is a similar service.
And even if we store every event, it is going to be a huge amount of data in the database. Is that really the best solution? Leave this to your log aggregator tool; it will have a proper mechanism for keeping your data. Most of these tools store the data compactly and properly indexed.
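For contrast with log aggregation, the Aggregator Microservice pattern described above can be sketched in a few lines of Python. The downstream services `fetch_profile` and `fetch_orders` are hypothetical stand-ins written as plain functions; in a real system they would be network calls.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict


# Hypothetical downstream services, stood in for by plain functions.
def fetch_profile(user_id: int) -> dict:
    return {"user_id": user_id, "name": "example"}


def fetch_orders(user_id: int) -> dict:
    return {"orders": [101, 102]}


def aggregate(user_id: int, services: Dict[str, Callable[[int], dict]]) -> dict:
    """Call every downstream service concurrently and combine the replies."""
    with ThreadPoolExecutor(max_workers=max(1, len(services))) as pool:
        futures = {name: pool.submit(call, user_id) for name, call in services.items()}
        # result() blocks until that service has replied (or re-raises its error).
        return {name: future.result() for name, future in futures.items()}
```

The aggregator answers one incoming request by fanning out to several services and merging their responses; it does not record every event that passes through the system.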

how would I expose 200k+ records via an API?

What would be the best option for exposing 220k records to third-party applications?
SF style 'bulk API' - independent of the standard API to maintain availability
server-side pagination
a callback to an FTP-generated file?
webhooks?
This bulk transfer will have to happen once a day or so. Any other suggestions are welcome!
How are the 220k records being used?
Must serve it all at once
Not ideal for human consumers of this endpoint without special GUI considerations and communication.
A. I think that using a 'bulk API' would be marginally better than reading a file of the same data. (Not 100% sure on this.) Opening and interpreting a file might take a little bit more time than directly accessing data provided in an endpoint's response body.
Can send it in pieces
B. If only a small amount of data is needed at once, then server-side pagination should be used and allows the consumer to request new batches of data as desired. This reduces unnecessary server load by not sending data without it being specifically requested.
C. If all of it needs to be received during a user session, then find a way to send the consumer partial information along the way. Users can often be temporarily satisfied with partial data while the rest loads, so update the client periodically with information as it arrives. Consider AJAX long-polling, HTML5 Server-Sent Events (SSE), or HTML5 WebSockets, as described in "What are Long-Polling, Websockets, Server-Sent Events (SSE) and Comet?". Tech stack details and third-party requirements will likely limit your options. Make sure to communicate to users that the application is still working on the request until it is finished.
Can send less data
D. If the third party applications only need to show updated records, could a different endpoint be created for exposing this more manageable (hopefully) subset of records?
E. If the end-result is displaying this data in a user-centric application, then maybe a manageable amount of summary data could be sent instead? Are there user-centric applications that show 220k records at once, instead of fetching individual ones (or small batches)?
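Option B (server-side pagination) can be sketched as follows. This is an illustrative Python example, not a specific framework's API: `get_page`, `fetch_all`, and the `next_cursor` field are hypothetical names, and an in-memory list stands in for the database.

```python
from typing import Dict, List, Sequence


def get_page(records: Sequence[dict], cursor: int = 0, limit: int = 1000) -> Dict[str, object]:
    """Server side: return one page plus the cursor for the next page, if any."""
    if cursor < 0 or limit <= 0:
        raise ValueError("cursor must be >= 0 and limit must be > 0")
    items = list(records[cursor:cursor + limit])
    next_cursor = cursor + limit if cursor + limit < len(records) else None
    return {"items": items, "next_cursor": next_cursor}


def fetch_all(records: Sequence[dict], limit: int = 1000) -> List[dict]:
    """Consumer side: keep requesting pages until the server reports no more."""
    collected: List[dict] = []
    cursor = 0
    while cursor is not None:
        page = get_page(records, cursor, limit)
        collected.extend(page["items"])
        cursor = page["next_cursor"]
    return collected
```

The consumer decides how many pages it actually needs, so the server never sends data that was not requested.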
I would use a streaming API. This is an API that runs a "select * from table" and then streams the results to the consumer. You do this using a for loop to fetch and output the records. This way you never use much memory, and as long as you flush the output frequently, the web server will not close the connection and you can support a result set of any size.
I know this works as I (shameless plug) wrote the mysql-crud-api that actually does this.
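A minimal sketch of that streaming approach, assuming Python with the standard `sqlite3` module rather than the mysql-crud-api mentioned above. The `stream_rows` name and the newline-delimited JSON output format are assumptions made for illustration; a web framework would write and flush each yielded line to the response.

```python
import json
import sqlite3
from typing import Iterator


def stream_rows(conn: sqlite3.Connection, table: str, batch_size: int = 500) -> Iterator[str]:
    """Yield one JSON line per row, holding at most batch_size rows in memory."""
    # Assumption: `table` is a trusted identifier, never raw user input.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        for row in rows:
            yield json.dumps(dict(zip(columns, row))) + "\n"
```

Because the function is a generator, memory use depends on `batch_size` rather than on the size of the table.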

what is/are the right WCF messaging function to use in my project?

I am a novice in WCF, and I have a project that needs to be migrated to WCF-based communication with a client/server and server-to-server architecture.
My question is: which messaging function do I need for this project to ensure the security of data across the network, a reliable connection, and fast exchange of data?
I found that WCF has numerous messaging functions.
Below is the architecture of my project:
Note: The clients should be updated simultaneously by both the data processing and feed source servers. Clients also send simultaneous requests to the servers while feeds are still being supplied by the feed source server.
I would appreciate any suggestions or comments.
My first question is why are you putting the Connection Manager (CM) component in-between your clients and the services which they want to use? What is the job it does which means it needs to be right in the middle of everything?
This ultimately means that your CM component will have to handle potentially high volumes of bi-directional traffic across potentially different transport bindings and introduces a single failure point.
What if client A wants only to receive messages from the Feed Source (FS) component? Why should client A have to deal with an intermediary when it just wants to send a subscription notification to receive updates from the FS?
Equally, what if client B wants to send a message to the Data Processing (DP) component? Surely it should just be able to fire off a message to DP?
I think the majority of what you want to do with this architecture can be achieved with one-way messaging, in which case you should use netMsmqBinding (assuming you are in a pure WCF environment).

Do I really need reliable sessions for my services? (description inside)

Our company leases a music service to its clients. The product consists of an automated mp3 player and daily renewals/updates of the customers' music library (mp3 songs) downloaded to their machines. So far we have used an ugly solution for the mp3 updates: synchronizing server and client folders using GBridge. This is obviously a disadvantage, as we force our clients to download our whole music library (currently 25,000 songs) while most of them will never play songs from all of our music categories (pop, rock, etc.). Most importantly, we can only offer one subscription package (our whole music library), while our competitors offer packages by category at lower prices. For those reasons we decided to turn to WCF.
The service uses PerCall instancing mode and implements two operations, invoked from a winform client application with the classic request-reply pattern.
The first operation retrieves from a database the categories a client is allowed to download from (request) and sends back to the client a list of these categories (reply).
The second operation is used for downloading. The client first downloads an XML version of the server's database. A similar XML file lies on the client side. The client app checks which songs, in each of the categories returned from the first operation, are missing from its own XML file compared to the server's. If any files (elements in the XML) are missing, it downloads them one file at a time. After each download, the client updates its XML file and runs the same comparison again until all files (elements) match in the two XML files.
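The comparison step could look roughly like the following Python sketch. The XML layout (`<library>` containing `<song category="..." file="..."/>` elements) and the `missing_songs` name are assumptions made for illustration; the question does not show the real schema.

```python
import xml.etree.ElementTree as ET
from typing import List, Set, Tuple


def _songs(xml_text: str) -> Set[Tuple[str, str]]:
    """Parse (category, file) pairs from the assumed <song> elements."""
    root = ET.fromstring(xml_text)
    return {(song.get("category"), song.get("file")) for song in root.iter("song")}


def missing_songs(server_xml: str, client_xml: str, allowed_categories: Set[str]) -> List[str]:
    """Return files present on the server, absent on the client, and allowed for this client."""
    missing = _songs(server_xml) - _songs(client_xml)
    return sorted(file for category, file in missing if category in allowed_categories)
```

Computing the full difference once and then downloading each listed file would avoid re-running the comparison after every single download.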
Long story short: the instancing mode on the service is PerCall, for throughput reasons and to keep memory consumption low. Both of my operations use the request-reply pattern, which means that an acknowledgement is sent back to the client with each response from the service. So if something goes wrong with the connection, or if the client can't reach the service, I can catch the CommunicationObjectFaultedException on the client, reconstruct the proxy, and retry. Given that, do you think there is a need for reliable sessions in my service implementation? What problems could arise if I don't have reliable sessions in the operations just described?
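The catch, rebuild, and retry approach described above can be sketched generically. This is Python rather than WCF code: `make_proxy` and `operation` are hypothetical callables, and Python's built-in `ConnectionError` stands in for CommunicationObjectFaultedException.

```python
import time
from typing import Callable, Optional, TypeVar

P = TypeVar("P")
R = TypeVar("R")


def call_with_retry(
    make_proxy: Callable[[], P],
    operation: Callable[[P], R],
    attempts: int = 3,
    delay_seconds: float = 0.0,
) -> R:
    """Run operation on a fresh proxy, rebuilding the proxy after each failure."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error: Optional[ConnectionError] = None
    for attempt in range(attempts):
        proxy = make_proxy()  # a faulted proxy is discarded, never reused
        try:
            return operation(proxy)
        except ConnectionError as error:  # stand-in for a faulted channel
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    assert last_error is not None
    raise last_error
```

Retrying like this is only safe when the operation is idempotent, which holds for reading the category list or re-downloading a file.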
What problems could arise if I don't have reliable sessions in the operations just described?
I am aware of only a few problems that reliable sessions solve, and they put a lot of stress on the server.
I would personally go for BasicHttpBinding (for better interoperability) without reliable session.
UPDATE
In order to understand Reliable Sessions, have a read of this and this.
If you are a bank sending money to and from other banks, it makes sense to use Reliable Sessions. This will ensure the message is received by the final party involved. But in most cases, you would not need it.

MSMQ between WCF services in a load balanced environment

I'm thinking of adding a queue function to a product based on a bunch of WCF services. I've read a bit about MSMQ. At first I thought it was what I needed, but I'm not sure, and I am considering just putting the queue in a database table. I wonder if someone here has some feedback on which way to go.
Basically, I'm planning to have a facade WCF service called over http. The facade service should only write all incoming messages to a queue, to give a fast response to the calling system. The messages in the queue should then be processed by another component, either a WCF service or a Windows service depending on my choice of queue.
The product is running in a load balanced environment with 2 to n web servers.
The options I'm considering and the questions I have are:
1. To let the facade WCF service write to an MSMQ queue and then have another WCF service read from this queue to process the messages. From what I've read, what I don't feel confident about with this alternative is how it will work in a load balanced environment.
1A. Where should the MSMQ queue(s) be placed? One on each web server? One on a separate server? Multiple on a separate server? (Not considering the need for redundancy, and assuming that data could be lost in rare cases and re-sent.)
1B. How is the design affected if I want the system to be redundant? I'd like to be able to lose a server holding the MSMQ queue (it never comes online again) without losing the data in that queue. From what I've read about MSMQ, that leaves me with the only option of placing MSMQ on a Windows cluster. Is that correct? (I'd like to avoid using a Windows cluster for this.)
2. The second design alternative is to let the facade WCF service write the queue to a database, and then have two or more Windows services process the queue. I don't have any questions about this alternative. If you wonder why I don't pick this one, as it seems simpler to me, it is because I'd like to build this without introducing any Windows services to the solution, because I believe MSMQ has functionality I don't want to code myself, and because I'm curious about MSMQ as I've never used it before.
Best Regards
Håkan
OK, so you're not using WCF with MSMQ integration, you're using WCF to create MSMQ messages as an end-product. That simplifies things to "how do I load balance MSMQ?"
The arrangement you use is based on what works best for you.
You could have multiple webservers sending messages to a remote queue on a central machine.
Alternatively, you could have the web servers put messages in local queues, with a central machine polling the queues for new arrivals.
You don't need to cluster MSMQ to make it resilient. You can instead make your code resilient so that it copes with lost messages using dead letter queues, transactional queues, journaling, and so on. Hardware clustering is the easy option :-)
Load-balancing MSMQ - a brief discussion
Oil and water - MSMQ transactional messages and load balancing
After reading some more on the subject, I have decided not to use MSMQ. It seems I really have no reason to go down this road. I need this to be non-transactional, and as I understand it, none of the journaling or dead letter techniques will help me with my redundancy requirement.
All my components will be online most of the time (they may have access problems for a couple of hours per year).
MSMQ will only add complexity to the existing solution: another technology and maybe another server to keep track of.
To get full redundancy and prevent data loss in MSMQ, I would need a Windows cluster or would have to implement send/receive to multiple identical queues. I don't want to do either of those.
All this led me to front my receiving application with a WCF facade that accepts http calls and writes to a database queue. This database is already protected from data loss. The queue will be polled by multiple active instances of a Windows service containing all the heavy business logic. With low process priority, these services could be hosted on the existing nodes used by the load balanced web application. If I had time to learn MSMQ, or if I needed it for another reason in my application, I might change my decision.
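A database-backed queue polled by several workers can be sketched as below. This is an illustrative Python example using the standard `sqlite3` module, not the actual implementation; the `message_queue` table, its columns, and the function names are all hypothetical. The important part is the conditional UPDATE, which lets only one worker claim a given message.

```python
import sqlite3
from typing import Optional, Tuple


def create_queue(conn: sqlite3.Connection) -> None:
    conn.execute(
        "CREATE TABLE IF NOT EXISTS message_queue ("
        "id INTEGER PRIMARY KEY, body TEXT NOT NULL, "
        "status TEXT NOT NULL DEFAULT 'pending', claimed_by TEXT)"
    )


def enqueue(conn: sqlite3.Connection, body: str) -> None:
    """Facade side: store the message and return to the caller quickly."""
    with conn:
        conn.execute("INSERT INTO message_queue (body) VALUES (?)", (body,))


def claim_next(conn: sqlite3.Connection, worker: str) -> Optional[Tuple[int, str]]:
    """Worker side: claim the oldest pending message, or return None."""
    with conn:
        row = conn.execute(
            "SELECT id, body FROM message_queue "
            "WHERE status = 'pending' ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The status check in WHERE means a message claimed by another
        # worker in the meantime is not claimed twice.
        updated = conn.execute(
            "UPDATE message_queue SET status = 'processing', claimed_by = ? "
            "WHERE id = ? AND status = 'pending'",
            (worker, row[0]),
        )
        return row if updated.rowcount == 1 else None


def complete(conn: sqlite3.Connection, message_id: int) -> None:
    with conn:
        conn.execute("UPDATE message_queue SET status = 'done' WHERE id = ?", (message_id,))
```

A worker that gets `None` back simply waits and polls again; messages left in the 'processing' state by a crashed worker would need a separate timeout rule, which this sketch does not cover.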