monitor process uptime on Windows - process

We want to monitor some processes on a Windows machine. If a process is down for 30 minutes, an alarm will be raised. Is it possible, or even necessary, to monitor process downtime precisely, so that the alarm is raised when a process has been down for EXACTLY 30 minutes? Normally we can check every minute, but technically, most of the time, you could be off by some seconds.

If you're looking for something much lighter and simpler than Nagios, see AlertGrid. It is very easy to use; the only downside is that it requires a bit of integration: AlertGrid only LISTENS for heartbeat signals, so you have to send them yourself (the API is extremely simple).
The other cool thing is that if the process you want to monitor runs YOUR code, you can send heartbeat events directly "from inside", and these events can carry your own custom parameters. Then, in AlertGrid, you can easily manage custom rules around these parameters. So if the executable you monitor is, for instance, an order processing application, you can send a parameter called 'number_of_orders_processed' and create a rule "if number_of_orders_processed > 100, send SMS message / make phone call to... " and it will work immediately.
I am in the AlertGrid dev team, if you have any questions - feel free to ask.

Look into Nagios and NSClient++ if you can monitor your Windows host from a Linux host. If that is an option, you can do this and many other things to monitor your processes and more.
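On the timing concern in the question: one way to avoid depending on the exact polling interval is to record when the process was first seen down and compare against that timestamp on every poll, instead of counting polls. A minimal Python sketch, assuming a hypothetical `is_running` probe supplied by the caller; the class and parameter names are illustrative and not part of Nagios, NSClient++ or AlertGrid:

```python
import time
from typing import Callable, Optional


class DowntimeTracker:
    """Signals an alarm once a process has been observed down for `threshold_s` seconds."""

    def __init__(self, is_running: Callable[[], bool], threshold_s: float = 30 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.is_running = is_running    # hypothetical probe, supplied by the caller
        self.threshold_s = threshold_s
        self.clock = clock              # monotonic, so wall-clock changes don't matter
        self.down_since: Optional[float] = None

    def poll(self) -> bool:
        """Check once; return True if the alarm should be raised now."""
        now = self.clock()
        if self.is_running():
            self.down_since = None      # process is back, reset the timer
            return False
        if self.down_since is None:
            self.down_since = now       # first poll that saw the process down
        return now - self.down_since >= self.threshold_s
```

With a one-minute poll, the alarm fires on the first poll at or after the 30-minute mark, so it is late by at most one polling interval. The start of the downtime is itself only known to within one interval, which is probably precise enough for a 30-minute threshold.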

Related

Philips Hue command limitation

First of all I'm developing my own C# library for controlling Philips Hue, which means I'm not using the official SDK. (I'm guessing that the SDK will make sure you won't have any problems)
I'm a little confused about the limitation in the Core concepts page in the API, which states:
We can’t send commands to the lights too fast. If you stick to around 10 commands per second to the /lights resource as maximum you should be fine. For /groups commands you should keep to a maximum of 1 per second.
I intend to respect this limitation, but does the limitation still apply when you are performing GET requests on the /lights resource, or is it only for sending actual commands with PUT requests to /lights/<id>/state that change the state of the light? Same question goes for the /groups resource.
Also is it even possible to damage anything by sending too many requests, or will it just take longer to get all responses?
Edit:
My overall question is: How should I understand the API limitation?
A more specific sub-question is: Should I wait 100 ms before sending another /lights command, relative to when I received a response, or relative to when I sent the previous command?
Another sub-question is: Should I consider this limitation only when using PUT requests on e.g. /lights/<id>/state, or on all request types (GET/PUT/POST/DELETE)?
I don't know if anything was changed in firmware updates, but I have discovered that the bridge might not be so simple as you would think, and that the API description isn't very clear.
I've done a little testing while running firmware 01009914.
The bridge seems to have some kind of queue of incoming commands. I sent {"bri":254} to a group 9 times, followed by 1 final command of {"bri":1}. It takes roughly 3-4 seconds from the first command until the light is actually dimmed. Each time I sent a command, the bridge replied almost instantly with a success token.
I did the same small tests sending other commands, 10 of each JSON object:
{"bri":254} 3-4 seconds
{"on":true, "bri":254} 6-7 seconds
{"on":true, "bri":254, "alert":"none", "effect":"none"} 12-13 seconds
This actually shows that each change of attributes takes roughly 0.3 seconds for the bridge to handle.
I will claim that for each attribute we change, the bridge takes about 300 ms to finish, and the limitation of commands should be understood as: As long as you stick with changing one attribute of a group each second, you should be fine.
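The arithmetic behind the 300 ms estimate can be checked with a short Python sketch. It uses only the timings reported above (10 commands per test) and assumes the cost scales linearly with the number of attributes per command, which the measurements suggest but do not prove:

```python
# Timings reported above: (attributes per command, (min_s, max_s)), 10 commands per test.
measurements = [
    (1, (3.0, 4.0)),
    (2, (6.0, 7.0)),
    (4, (12.0, 13.0)),
]
COMMANDS_PER_TEST = 10


def per_attribute_seconds(attrs: int, total_s: float) -> float:
    """Average time per attribute change, assuming the cost scales linearly."""
    return total_s / (COMMANDS_PER_TEST * attrs)


ranges = [
    (per_attribute_seconds(attrs, lo), per_attribute_seconds(attrs, hi))
    for attrs, (lo, hi) in measurements
]
for (attrs, _), (lo, hi) in zip(measurements, ranges):
    print(f"{attrs} attribute(s): {lo:.3f}-{hi:.3f} s per attribute change")
```

All three tests land between 0.3 and 0.4 seconds per attribute change, which is where the "roughly 0.3 seconds" figure comes from.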
Note: I only tried with one group consisting of three lights, and I don't know if the bridge actually does have a queue of incoming commands, and in case it does have a queue, I don't know what the limit of items in it is.
Edit:
Now we have some official clarification of the Hue System Performance.
I'm fairly certain that the 10 commands per second is a guideline to prevent failure of the Bridge, and is a technical limitation of the hardware. Any more than that and you're apt to overload the bridge. I believe this applies to commands as well as requests.
Both approaches are reasonable. For laziness' sake, you could wait for 100ms to send a response, but I would only rely on this method if you don't plan on any other interactions with the Bridge.
I consider this limitation on all request types.
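For the fixed-delay approach, a throttle measured from the time the previous command was sent might look like the Python sketch below. The 100 ms default follows from the 10-commands-per-second guideline; the class name and the injectable `clock` and `sleep` parameters are illustrative assumptions, not anything from the Hue API:

```python
import time
from typing import Callable, Optional


class MinIntervalThrottle:
    """Blocks so that consecutive calls to wait() are at least `interval_s` apart."""

    def __init__(self, interval_s: float = 0.1,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.interval_s = interval_s
        self.clock = clock
        self.sleep = sleep
        self.last_sent: Optional[float] = None

    def wait(self) -> None:
        """Call immediately before sending a command."""
        now = self.clock()
        if self.last_sent is not None:
            remaining = self.interval_s - (now - self.last_sent)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_sent = now            # measured from send time, not response time
```

A single throttle instance would have to be shared by everything in the program that talks to the bridge; otherwise the combined rate can still exceed the guideline.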
You won't damage anything if you send commands too fast. However, if you send commands too fast the bridge might become unresponsive and/or some messages can be ignored.
When it comes to the bridge, the way I think of it is that the bridge is more or less single threaded, so it works best if you make sure you don't send the next command before the previous one has returned.
In practice we've found that this works much better than waiting a fixed time between each request. In fact, you can pretty much send commands as fast as you want as long as you wait for the previous one to finish.
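A rough Python sketch of "don't send the next command before the previous one has returned", assuming a hypothetical blocking `send(path, body)` function that performs the HTTP request and returns the bridge's response (the wrapper class is illustrative only):

```python
import threading
from typing import Any, Callable


class SerializedSender:
    """Allows only one in-flight command: the next send starts after the previous returns."""

    def __init__(self, send: Callable[[str, dict], Any]) -> None:
        self._send = send               # hypothetical blocking function: (path, body) -> response
        self._lock = threading.Lock()

    def send(self, path: str, body: dict) -> Any:
        # The lock is held for the whole request/response round trip, so
        # callers on other threads queue up here instead of at the bridge.
        with self._lock:
            return self._send(path, body)
```

In a single-threaded program a plain blocking call already gives this behaviour; the lock only matters when several threads may send commands at once.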
When you send a command to the bridge, the bridge has to then send it to the lamps through Zigbee. Since it's a mesh network in some cases the message has to make a couple of hops from lamp to lamp before it reaches the target. Depending on how many lamps you have and how many hops the signal needs to take, this can take a while. Also, it's possible that some messages randomly take much longer than others.
In general the system is not designed to handle very fast changes, but if you keep the above in mind you can make many cool effects :)

How can I tell a WAS service polling an MSMQ to wait when busy?

I'm working on a system which amongst other things, runs payroll, a heavy load process. It is likely that soon, there may be so many requests to run payroll at peak times that the batch servers will be overwhelmed.
I'm looking to put together a proof of concept to cope with this by using MSMQ (probably replacing this with a commercial solution like NServiceBus later). I'm using this example as a basis. I can see how to set up the bindings and stick it together, but I still need a way to tell the subscribers hosted by WAS to only process the 'run heavy payroll process' message if they are not busy. Otherwise the messages on the queue will get picked up straight away and we have the same problem as before.
Can I set up the subscribing service to say, "I'm busy, I can't take the message, leave it on the queue"? Does the queue need to be transactional?
If you're using WCF then there's no way to conditionally activate the channel thereby leaving the messages on the queue for later.
A better solution is to host the message receiver in a completely different process, for example as a windows service. These can then be enabled/disabled according to your service window requirement.
You also get the additional benefit of being able to very easily scale out the message receivers to handle greater loads (by hosting more instances of your receiver).
One way to do this is to have 2 queues. Your polling always checks the high-priority queue first, and only takes an item from the other queue if the high-priority queue is empty.
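The two-queue polling order can be sketched in Python as follows. In-memory deques stand in for the real MSMQ queues here, and the function and variable names are made up for illustration:

```python
from collections import deque
from typing import Deque, Optional


def take_next(high: Deque[str], normal: Deque[str]) -> Optional[str]:
    """Return the next message, draining the high-priority queue first."""
    if high:
        return high.popleft()
    if normal:
        return normal.popleft()
    return None                         # both queues are empty


high_priority: Deque[str] = deque(["urgent-1"])
normal_priority: Deque[str] = deque(["payroll-1", "payroll-2"])
```

Because the receiver only calls `take_next` when it is ready for more work, heavy messages stay on the queue while it is busy.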

queue on WCF web service - can you implement something like a background task

I have a service that does image processing on an image supplied by the client.
Each processing run is CPU-intensive (approx. 3 min runtime per image), so I will not allow more than 1 image to be processed at a time.
What I did is that when the service is called, the image is saved on the server and an entry is added to the database with the status Queued.
Now I would like to create a background task or something similar that takes every entry from the database that has the status Queued, processes that image, updates the entry status to Done, and then takes the next entry with the status Queued, and so on.
There may be times when no image is queued.
How do you suggest I implement this?
It sounds like what you want is a queued service.
http://msdn.microsoft.com/en-us/library/ms731089.aspx
It allows you to focus on your core algorithm and not worry much about the mechanics of queuing the messages (e.g. making custom DB tables for queues etc.). Queuing sounds easy, but to get it to work reliably is harder than it sounds - better to leave it to the experts at MS :o)
It also provides some good features like durability, poison message handling etc.
You could use Windows Server AppFabric to host a workflow-backed WCF service. Instead of service.svc, the extension is service.xamlx. AppFabric is designed to run long running processes like this and will scale to your needs.
Perhaps you can develop a windows service which polls the database every minute for any images to process.
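A rough Python sketch of that polling worker, using an in-memory SQLite table in place of the real database. The table layout, the intermediate 'Processing' status and the function names are assumptions made for the example:

```python
import sqlite3
from typing import Callable, Optional


def process_next(conn: sqlite3.Connection, process: Callable[[str], None]) -> Optional[int]:
    """Process the oldest Queued entry, if any; return its id or None when idle."""
    row = conn.execute(
        "SELECT id, path FROM images WHERE status = 'Queued' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return None                     # nothing queued; the caller sleeps and retries
    image_id, path = row
    conn.execute("UPDATE images SET status = 'Processing' WHERE id = ?", (image_id,))
    conn.commit()
    process(path)                       # the long-running image work, one image at a time
    conn.execute("UPDATE images SET status = 'Done' WHERE id = ?", (image_id,))
    conn.commit()
    return image_id


def make_demo_db() -> sqlite3.Connection:
    """In-memory table standing in for the real database (schema is an assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE images (id INTEGER PRIMARY KEY, path TEXT, status TEXT)")
    conn.executemany("INSERT INTO images (path, status) VALUES (?, 'Queued')",
                     [("a.png",), ("b.png",)])
    conn.commit()
    return conn
```

The service would call `process_next` in a loop and sleep for a minute whenever it returns None. Since a single worker handles one entry at a time, at most one image is ever being processed.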

Tools to monitor and debug SaaS Services

What tools will come in handy to debug and monitor SaaS services built on WCF in a production environment?
FYI - No access to the actual server whatsoever. No remoting in, and no access to the file system.
There are dozens of 'dotcom-monitors' (eg site24x7.com) but they can only monitor parameters that are publicly available, like site uptime, response times etc.
If you want to monitor memory usage and other parameters known only from 'inside', then you have two choices. You can install some monitoring agent on the server (in most cases this would be a pain).
Or you can send 'signals' from your code to some external event handling and notification service. I recommend AlertGrid (http://alert-grid.com) for the latter purpose; it is very flexible and extremely easy to integrate.
AlertGrid doesn't require installation, access to the file system etc. It just gathers the data you send and lets you build notification rules. Examples:
You can send a parameter like memory usage and build a rule 'if memory_usage > threshold -> send SMS to admin'.
You can send data related to your application. If you have an application processing orders, you can send the number of processed orders in the signal and build notification rules around that.
If you have some logic triggered periodically (cron, Windows service), you can send a signal each time your logic is executed to check that it is running on schedule.
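The idea behind the last example, independent of any particular service, is a check for signals that are overdue. A minimal Python sketch; the job names, intervals and grace period are invented for illustration and this is not AlertGrid's API:

```python
from typing import Dict, List


def overdue_jobs(last_signal: Dict[str, float], expected_interval_s: Dict[str, float],
                 now: float, grace_s: float = 30.0) -> List[str]:
    """Return names of jobs whose latest signal is older than interval + grace."""
    return sorted(
        name for name, seen_at in last_signal.items()
        if now - seen_at > expected_interval_s[name] + grace_s
    )
```

Each job reports a timestamp whenever it runs, and whatever receives the signals calls `overdue_jobs` periodically to decide who to notify.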
(I am a developer in AlertGrid's team, in case of any question, please feel free to ask.)
What exactly do you want to monitor? If you only care about availability then good old ping might be enough :)

Work managers threads constraint and page cannot be displayed

We have a memory intensive processing for certain functionality and we would like to limit the number of parallel requests to this processing. We are able to configure by using "Work Managers" in WebLogic and putting a limit on the number of threads for that servlet.
For example, if we set the maximum thread limit to 3 and there are 10 parallel requests, 7 requests are queued. There could be situations where the requests waiting in the queue take up to 30-40 minutes to be processed. We did some simple testing and received "page cannot be displayed" due to a timeout after 15 minutes, and received the message after 1 hour.
Does anyone know if there is a setting in WebLogic to increase/decrease the timeout and avoid "page cannot be displayed"?
I'd appreciate any thoughts on this.
Does anyone know if there is a setting in WebLogic to increase/decrease the timeout and avoid "page cannot be displayed"?
There might be something, but I actually didn't check, as it would be bad advice anyway. By looking for this, you are trying to solve the wrong problem. A browser is just not made for long-running processes like the one you are describing (> 30 min), even if you don't mind the user waiting (not to mention that he could refresh the page and queue more and more jobs).
So, the right answer here is in my opinion: use asynchronous processing, this is the perfect use case. When the user clicks on the button, send a JMS message to a queue (or create a Quartz job) and send the user a page with a request ID telling him to come back later. When the processing is done, update the status somewhere and make the status/result available to the user. Really, the user experience will be better doing this and you'll face fewer problems than with a browser.
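The submit-then-check-back pattern can be sketched in a few lines of Python. An in-process thread pool stands in for the JMS queue or Quartz job here, and the class and method names are made up for illustration:

```python
import uuid
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, Dict


class JobRunner:
    """Accepts work immediately, returns a request ID, and reports status on demand."""

    def __init__(self, max_workers: int = 3) -> None:
        # max_workers caps parallel processing, like the thread limit in the question
        self._pool = ThreadPoolExecutor(max_workers=max_workers)
        self._jobs: Dict[str, Future] = {}

    def submit(self, work: Callable[[], object]) -> str:
        request_id = uuid.uuid4().hex
        self._jobs[request_id] = self._pool.submit(work)
        return request_id               # sent back to the user right away

    def status(self, request_id: str) -> str:
        job = self._jobs[request_id]
        if not job.done():
            return "pending"
        return "failed" if job.exception() is not None else "done"

    def result(self, request_id: str) -> object:
        return self._jobs[request_id].result()   # blocks until the job has finished
```

The HTTP request that starts the job returns as soon as `submit` does, so no browser timeout is involved. A separate status page calls `status` with the request ID.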
1) Use some other tool (not browser) like WGET where you can control timeout parameter (--timeout).
2) Why do you use HTTP? Use message-driven beans, send them a JMS message, and don't worry about timeouts.
Perhaps quartz can do what you need? Start a job and check in on it as you need to?