Is WAMP publish/subscribe battery efficient? - api

I am writing a client-side desktop app that needs to receive updates from a server. These updates will be few and far between (possibly one a week), but I would like them to be received as quickly as possible.
Is it hard on the battery to "subscribe" to the topic that will provide the updates through WAMP and let the app run in the background continuously? Would it be more efficient to periodically poll the server using a REST-based API?

WAMP requires a persistent connection, so you have to deal with the battery drain that comes with it. The only way to find out how much that costs is to test it on the system the app will run on. Then you can weigh the actual trade-offs against a polling solution.

Subscribing in itself has no implications for energy consumption. Keeping a connection open for such a long time for so few updates does, however. I think you should reconsider using WAMP as your communication protocol.
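To make the trade-off between the two answers concrete, here is a rough back-of-the-envelope sketch in Python. The keepalive and polling intervals are purely hypothetical assumptions, and the number of network wake-ups is only a crude proxy for battery cost; actual drain still has to be measured on the target system.

```python
WEEK_SECONDS = 7 * 24 * 60 * 60


def wakeups_per_week(interval_seconds: int) -> int:
    """How many times per week the network is used if it is touched once per interval."""
    return WEEK_SECONDS // interval_seconds


# Hypothetical settings, not taken from WAMP or any particular client library.
KEEPALIVE_INTERVAL = 60      # persistent connection: one keepalive ping per minute
POLL_INTERVAL = 15 * 60      # REST polling: one request every 15 minutes

persistent_wakeups = wakeups_per_week(KEEPALIVE_INTERVAL)
polling_wakeups = wakeups_per_week(POLL_INTERVAL)

print("persistent connection wake-ups per week:", persistent_wakeups)
print("polling wake-ups per week:", polling_wakeups)
# With polling, an update can sit on the server for up to one full interval.
print("worst-case polling delay in seconds:", POLL_INTERVAL)
```

Under these assumed numbers polling touches the network far less often, at the price of a delay of up to one polling interval before an update is seen.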

Related

Deploy REST API over multiple servers world-wide, but stay in sync

I've built a REST API with a pretty decent latency. Each request is answered in ~100 ms with a thousand requests per second. This is however with a relatively low physical distance to the data center. The users of this API would, however, be spread all over the globe. From the US for example (to a data center in Germany), the response time for a single request is ~400 ms under no load.
What would be the best approach to deploying this API? I suspect multiple servers at different locations, with a load balancer in front. How would I ensure that the MySQL database would stay in sync between the servers in that case?
With multiple servers and a load balancer, the costs rise exponentially, which is something I can hopefully afford in the future, but not at the moment.
I'd love to hear your ideas!
AFAIK, for big projects people use event sourcing with an event store, plus microservices with message queues between them. A more basic solution is to poll the event store through a simple REST API, with a request along the lines of "send me the events since the last one I received". If you can accept eventual consistency, I think this approach can work pretty well. It makes writing somewhat slower, but reading can be very fast. There is no need to sync the MySQL databases directly: you just pull the latest events and use a projection to update the local MySQL database. The event store is then the single source of truth.
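A minimal in-memory sketch of that polling idea, in Python. The class names, the event kinds and the dict standing in for the local MySQL table are all hypothetical; in a real deployment events_since would be a REST call to the central event store.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Event:
    seq: int        # monotonically increasing sequence number
    kind: str       # hypothetical event kinds: "item_upserted", "item_deleted"
    payload: dict


class EventStore:
    """Stand-in for the central event store (the single source of truth)."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, kind: str, payload: dict) -> Event:
        event = Event(len(self._events) + 1, kind, payload)
        self._events.append(event)
        return event

    def events_since(self, last_seq: int) -> List[Event]:
        # "Send me the events since the last one I received."
        return [e for e in self._events if e.seq > last_seq]


class Replica:
    """A regional server that projects events into its local database."""

    def __init__(self) -> None:
        self.last_seq = 0
        self.rows: Dict[int, str] = {}   # stand-in for a local MySQL table

    def apply(self, event: Event) -> None:
        if event.kind == "item_upserted":
            self.rows[event.payload["id"]] = event.payload["name"]
        elif event.kind == "item_deleted":
            self.rows.pop(event.payload["id"], None)
        self.last_seq = event.seq

    def sync(self, store: EventStore) -> int:
        new_events = store.events_since(self.last_seq)
        for event in new_events:
            self.apply(event)
        return len(new_events)
```

Each replica only remembers the last sequence number it applied, so a replica that was offline for a while simply catches up on its next poll.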

scalability of azure cloud queue

In our current project we use 8 worker role machines side by side, which work a little differently from what Azure may expect.
Short outline of the system:
each worker starts up to 8 processes that connect to the cloud queue and process messages
each process accesses three different cloud queues for collecting messages for different purposes (delta recognition, backup, metadata)
each message leads to a WCF call to an ERP system to gather information, and the retrieved response is finally added to a Redis cache
this approach was chosen over many smaller machines for cost and performance reasons. While 24 one-core machines would achieve 400 calls/s to the ERP system, 8 four-core machines with 8 processes each achieve over 800 calls/s.
Now to the question: when we increased the number of machines further to raise performance to 1200 calls/s, we experienced outages of Cloud Queue. At the same moment, 80% of the machines' processes stopped processing messages.
Here we have two problems:
Remote debugging is not possible for these processes, but it was possible to use dile to get some information out.
We use the GetMessages method of Cloud Queue to get up to 4 messages from the queue. Cloud Queue always answers with 0 messages. Reconnecting to the cloud queue does not help.
Restarting the workers does help, but shortly afterwards leads to the same problem.
Are we hitting the natural end of scalability of Cloud Queue and should switch to Service Bus?
Update:
I have not been able to fully understand the problem. I described it in the natural borders of Cloud Queue.
To summarize:
The count of TCP connections was impressive. Actually too impressive (multiple hundreds).
Going back to the original memory size let the system operate normally again.
In my experience I have been able to get better raw performance out of Azure Cloud Queues than Service Bus, but Service Bus has better enterprise features (reliability, topics, etc.). Azure Cloud Queue should process up to 2K messages/second per queue.
https://azure.microsoft.com/en-us/documentation/articles/storage-scalability-targets/
You can also try partitioning to multiple queues if there is some natural partition key.
Make sure that your processes don't have some sort of thread deadlock that is the real culprit. You can test this by connecting to the queue when it appears hung and trying to pull messages from it. If that works, the problem is your process, not the queue.
Also take a look at this to set up some other monitors:
https://azure.microsoft.com/en-us/documentation/articles/storage-monitor-storage-account/
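For the partitioning suggestion above, a minimal sketch of routing by a natural partition key, in Python. The queue names and the customer-style keys are hypothetical; the only point is that a stable hash of the key always selects the same queue, so load spreads across queues while messages for one key stay together.

```python
import hashlib
from typing import Sequence


def queue_for_key(partition_key: str, queue_names: Sequence[str]) -> str:
    """Pick a queue deterministically from a stable hash of the partition key."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(queue_names)
    return queue_names[index]


# Hypothetical queue names, one per partition.
QUEUES = ["delta-0", "delta-1", "delta-2"]

for key in ("customer-17", "customer-42", "customer-99"):
    print(key, "->", queue_for_key(key, QUEUES))
```

A built-in hash() is deliberately avoided here because Python randomizes string hashes per process, which would send the same key to different queues on different workers.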
It took some time to solve this issue:
First, a summary of how the storage account was used:
We used the blob storage once a day pretty heavily.
The "normal" diagnostics that Azure provides out of the box also used the same storage account.
Some controlling processes used small tables to store and read information once an hour for ca. 20 minutes.
There may be up to 800 calls/s that try to increase a number to count calls to an ERP system.
Once we recognized that the storage account was under heavy load, we split it up.
Now there are three physical storage accounts having 2 queues.
The original one still keeps up to 800/s calls for increasing counters
Diagnostics are still on the original one
Controlling information has also been moved
The system has now been running for 2 weeks, working like a charm. There are several things we learned from that:
No, the infrastructure is "not just there" and it doesn't scale endlessly.
Even though we thought we didn't use it "that much", in sum we used it quite heavily and in an uncontrolled way.
There are no "best practices" anywhere on the net that tell the complete story. Especially when starting to work with the storage account, a guide from MS would be quite helpful.
Exception handling in storage is quite bad. Even if the storage account is overused, I would expect some kind of exception and not just zero messages returned without any surrounding information.
Read complete story here: natural borders of cloud storage scalability
UPDATE:
Scalability is influenced by a lot of factors. You may be interested in Azure Service Bus: Massive count of listeners and senders to be aware of some more pitfalls.

Optimizing long polling on dedicated server

Right now, I am hosting a site on a dedicated server with 8 GB RAM and an Intel Xeon E3 1230 V3. I am using long polling techniques in order to display information which gets added into a database consistently.
The problem is that once around 20 users are on the site, it starts lagging and slows down dramatically. I'm pretty sure the server is strong enough to handle far more people than that, so I am not sure what exactly the problem is. Can long polling with Apache handle that many users? If not, how should I implement real-time display of information? And if it can, how should I configure Apache (or anything else) to handle around 500-1000 concurrent users?
Any help is appreciated.

Gathering distributed data into central database

I was assigned to update an existing system that gathers data from points of sale and inserts it into a central database. The current system is based on FTP/SFTP transmission, where the information is sent once a day, usually at night. Unfortunately, because of unstable connection links (low-quality 2G/3G modems), some of the files arrive broken. With just a few shops connected that way everything worked smoothly, but as the number of shops increased, errors became more frequent. What is worse, inserting the data into the central database takes up to 12-14 h (including waiting for the data to be downloaded from all of the shops), and that cannot happen during the working day as it would block the creation of sales reports and other database activities, so we are really tight on processing time here.
The idea my manager suggested is to send the data continuously during the day. Data packages would be significantly smaller, so their transmission and insertion would be much faster; the central server would contain current (almost real-time) data; and the night could be used for long-running database activities like creating backups, rebuilding indexes etc.
After going through many websites, I found that:
using ASMX web service is now obsolete and WCF should be used instead
WCF with MSMQ or System Messaging could be used to safely transmit data, where I don't have to care that much about acknowledging delivery of data, consistency, nodes going offline etc.
according to http://blogs.msdn.com/b/motleyqueue/archive/2007/09/22/system-messaging-versus-wcf-queuing.aspx WCF queuing is better
there are also other technologies for implementing message queue, like RabbitMQ, ZeroMQ etc.
And that is where I become confused. With so many options, do you have any pros and cons of these technologies?
We have been using .NET with Windows Forms and SQL Server, but if necessary we could change to something more suitable. I am also a bit worried about server capacity. By my calculations, the server would receive about 15 packages of data per second (peak). Is that a lot? I know there are many websites without serious server infrastructure that handle hundreds of visitors online and still run smoothly, but a website mainly sends data to the client, whereas here we would be receiving it from the client.
I also found somewhat similar SO question: Middleware to build data-gathering and monitoring for a distributed system
where DDS was mentioned. What do you think about introducing some middleware servers that would cope with low quality links to points of sale, so the main server would not be clogged with 1KB/s transmission?
I'd be grateful for any help. Thank you in advance!
RabbitMQ can easily cope with thousands of 1 KB messages per second.
As your use case is not about processing real-time data, I'd say you should combine a few messages and send them as a batch. That would be good enough to spread the load over the day.
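A minimal sketch of such batching, in Python. The limits max_count and max_bytes are hypothetical tuning knobs, and the messages are assumed to be already serialized to bytes.

```python
from typing import Iterable, Iterator, List


def batches(messages: Iterable[bytes], max_count: int, max_bytes: int) -> Iterator[List[bytes]]:
    """Group messages into batches limited by message count and total size."""
    current: List[bytes] = []
    size = 0
    for message in messages:
        # Close the current batch if adding this message would exceed a limit.
        if current and (len(current) >= max_count or size + len(message) > max_bytes):
            yield current
            current, size = [], 0
        current.append(message)
        size += len(message)
    if current:
        yield current   # flush the final, possibly partial batch


# Hypothetical sales records of 400 bytes each.
records = [b"x" * 400 for _ in range(5)]
for batch in batches(records, max_count=10, max_bytes=1000):
    print(len(batch), "messages in this batch")
```

A single message larger than max_bytes still goes out, as a batch of its own, so nothing is ever dropped by the grouping.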
As the motivation here is not to process the data in real time, any transport layer would do the job, even FTP/SFTP. While RabbitMQ will work fine here, this is not its typical use case.
As you mentioned that one of your concerns is a slow/unreliable network, I'd suggest compressing the files before sending them and, on the receiving end, immediately verifying their integrity. Rsync or similar will probably do a great job of that.
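If you end up doing the compression and integrity check yourself rather than relying on rsync, the idea can be sketched in Python with only the standard library. The function names pack and unpack are hypothetical, and the digest is assumed to travel alongside the file (for example in a small sidecar file or a message header).

```python
import gzip
import hashlib
from typing import Tuple


def pack(payload: bytes) -> Tuple[bytes, str]:
    """Sender side: compress the data and compute a checksum of what is sent."""
    compressed = gzip.compress(payload)
    return compressed, hashlib.sha256(compressed).hexdigest()


def unpack(compressed: bytes, expected_digest: str) -> bytes:
    """Receiver side: verify the checksum before touching the contents."""
    if hashlib.sha256(compressed).hexdigest() != expected_digest:
        raise ValueError("checksum mismatch: ask the shop to retransmit")
    return gzip.decompress(compressed)


# Hypothetical daily sales file.
data = b"sale;1;9.99\n" * 100
blob, digest = pack(data)
assert unpack(blob, digest) == data
```

A file damaged in transit fails the check immediately on arrival, so the receiver can request a resend instead of discovering the problem during the database import.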
From what I understand, you have basically two problems:
Potential for loss/corruption of call data
Database write performance
The potential for loss/corruption of call data is being caused by a lack of reliability in the transmission of data from client to service.
And it's not clear what is causing the database contention/performance issues, beyond a vague reference to high volumes, so this answer will be more geared towards solving the first problem.
You have correctly identified the need for reliable asynchronous communication transport as a way to address the reliability issues in your current setup.
Looking at MSMQ to deliver this is a valid first step. MSMQ provides reliable communication via a store and forward messaging semantic which comes out of the box and requires very little in the way of configuration.
Unfortunately, while suitable for your needs, MSMQ relies on 2 things:
A reliable network protocol, and
A client service running on both sending and receiving machine.
From your description above, I don't believe 1 exists (the internet is not a reliable network), and you might well struggle with 2: MSMQ only ships with Windows Server or the business/enterprise versions of Windows on the desktop. (* see below)
As a possible solution to the network reliability problem, you could use WCF or a RESTful endpoint (using Nancy or WebApi) to expose one or more service operations over HTTP, which would accept the incoming calls from the client machines. These technologies are quite different, so you'll need to make sure you're making the correct choice early on.
WCF supports the WS-ReliableMessaging specification (a WS-* standard layered on top of SOAP) out of the box, which allows for reliable web service calls over HTTP. However, it's very config-heavy and not generally a nice framework to work with.
REST is much simpler than WCF in .NET, very lightweight and easy to use. However, for reliable delivery you would have to expose some kind of GET operation (in addition to a POST that lets the client send data) to be called (within a reasonable time-frame) to verify the data was committed. The client would have to implement some kind of retry semantic if the result of the GET "acknowledgement" was negative.
Despite requiring two operations rather than one for the WCF route, I would favour the REST approach. I've done plenty of both and find REST services way nicer to work with.
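The POST plus GET "acknowledgement" pattern with client-side retry could look roughly like this. It is a language-neutral sketch written in Python; FakeService is a hypothetical in-memory stand-in for the HTTP endpoints, and the client-generated message ID used to make retries idempotent is an assumption added for the example.

```python
import uuid
from typing import Dict


class FakeService:
    """In-memory stand-in for the HTTP service; drops the first N POSTs."""

    def __init__(self, drop_first_n_posts: int = 0) -> None:
        self.committed: Dict[str, bytes] = {}
        self._drops_left = drop_first_n_posts

    def post(self, message_id: str, body: bytes) -> None:
        if self._drops_left > 0:
            self._drops_left -= 1
            return   # simulate a request lost on the unreliable link
        # Idempotent: a retried message with the same ID is stored only once.
        self.committed.setdefault(message_id, body)

    def get_ack(self, message_id: str) -> bool:
        return message_id in self.committed


def send_reliably(service: FakeService, body: bytes, max_attempts: int = 5) -> str:
    """POST the data, then GET the acknowledgement; retry while it is negative."""
    message_id = str(uuid.uuid4())
    for _ in range(max_attempts):
        service.post(message_id, body)
        if service.get_ack(message_id):
            return message_id
        # A real client would wait (ideally with backoff) before retrying.
    raise RuntimeError("delivery not acknowledged after %d attempts" % max_attempts)
```

Because the same ID is reused on every retry, a POST that did arrive but whose acknowledgement was lost does not produce a duplicate row on the server.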
(*) That's not to say that MSMQ wouldn't work in your ultimate solution, just that it would not be used to address the transmission reliability issue. However it could still be used to address another of your problems, that of database write contention. If you were to queue incoming requests once they came into the server, then these could be processed by an "offline" process, which could then perform the required database operations in a reliable manner. This could be done by using MSMQ transactional queues.
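The shape of that "queue on arrival, write offline" idea, sketched in Python with a plain in-process queue. This is only an illustration of the decoupling: unlike an MSMQ transactional queue, queue.Queue is not durable and loses its contents if the process dies. The write_row callback is a hypothetical stand-in for the database insert.

```python
import queue
import threading
from typing import Callable


def run_writer(inbox: "queue.Queue", write_row: Callable[[object], None]) -> None:
    """Drain the queue and perform the (slow) database writes one by one."""
    while True:
        item = inbox.get()
        if item is None:          # sentinel: no more work
            inbox.task_done()
            break
        write_row(item)           # stand-in for the real database insert
        inbox.task_done()


rows = []                         # stand-in for the central database table
inbox: "queue.Queue" = queue.Queue()
writer = threading.Thread(target=run_writer, args=(inbox, rows.append))
writer.start()

# The request handler only enqueues and returns immediately.
for sale in ("sale-1", "sale-2", "sale-3"):
    inbox.put(sale)

inbox.put(None)                   # tell the writer to stop
writer.join()
print(rows)
```

The request-handling side never waits on the database, so bursts from the shops are absorbed by the queue and written at whatever pace the database can sustain.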
In response to comments:
99% messages are passed from shop to main server, but if some change
is needed (price correction, discounts etc.), that data has to be sent
to shop.
This kind of changes things. Had I understood from the beginning that you had a bidirectional requirement, and seeing as you have managed to establish MSMQ communication, I would have nudged you towards NServiceBus, which is a really, really cool wrapper around MSMQ. The reason I would have done this is that you appear to have both a one-way and a publish-subscribe requirement, which is supported really nicely by NServiceBus.

How can I handle 200K request per sec in wcf

I need to design a system that can handle 200K requests per second on each machine over HTTP.
The WCF service needs to be hosted in a Windows service.
I wonder if WCF can handle such a requirement?
What is the best system setup / best configuration?
The machine itself is pretty heavy, with 32 GB RAM and 8 cores (or more), and can be upgraded if needed.
Can I handle that many requests on a single machine with WCF over HTTP?
Doing this on a single machine is likely to be pretty tough (if indeed it's possible). It would be better to make your system scale horizontally, so you can add lots of machines as required. How you do that will depend on what your system actually needs to do. If it's some simple calculation which requires no persisted state, it shouldn't be too hard. If you've got some interaction with storage of some form which really needs to be read/written on each request, it'll be a lot harder - and choosing your persistence technology is likely to be pretty key to making it all hang together.
Note that there are other benefits to scaling horizontally too - in particular, the ability to upgrade the system without any downtime (if you're careful) and removing a huge single point of failure.
You need to give some more info on this.
Do you get the request and have to process it immediately?
Can you store the request data and delegate the processing to some other thread/process? Is there any way to scale the system out instead of up?
Is this in fact the only piece of infrastructure you can deploy stuff to?
I would start by asking what it is that I want to do during request handling, and then what the bottlenecks are going to be.