Can one program have multiple processes? - process

after reading and searching about OS and process and threads, I checked on wiki and it said,
A computer program is a passive
collection of instructions, a process
is the actual execution of those
instructions. Several processes may be
associated with the same program; for
example, opening up several instances
of the same program often means more
than one process is being executed.
Now is it possible for a program to have more than one process and I am not including the possibility of running more than one instance of the same program. I mean one instance of one program is running, is it possible for a program to have more than one process?
If yes, how? If no, why not?
I am a newbie in this, but damn curious :)
Thanks for all your help..

Yes, fairly obviously - you can run two or more copies of most programs - I routinely have about 5 copies of vim running, and each of those is a separate process. As to how, the OS loads the executable file, creates a process and then tells that process to start executing the file contents.

It is most definitely possible but a desktop application might not be a good example and I think this is the source of your confusion.
Consider a webserver instead (NginX or Apache). There is one master process and multiple worker processes at work. The master process "accpets" the work , so to speak, and delegates it to the workers. Both NginX and Apache could be configured to any number of worker processes.
At our company we are in the business of delivering a SaaS that helps businesses have an online chat with their visitors via their websites. The back-end part of our system has multiple "service"es communicating with each other to accomplish the task. Each service has multiple instances running.

Related

Running multiple Kettle transformation on single JVM

We want to use pan.sh to execute multiple kettle transformations. After exploring the script I found that it internally calls spoon.sh script which runs in PDI. Now the problem is every time a new transformation starts it create a separate JVM for its executions(invoked via a .bat file), however I want to group them to use single JVM to overcome memory constraints that the multiple JVM are putting on the batch server.
Could somebody guide me on how can I achieve this or share the documentation/resources with me.
Thanks for the good work.
Use Carte. This is exactly what this is for. You can startup a server (on the local box if you like) and then submit your jobs to it. One JVM, one heap, shared resource.
Benefit of that is then scalability, so when your box becomes too busy just add another one, also using carte and start sending some of the jobs to that other server.
There's an old but still current blog here:
http://diethardsteiner.blogspot.co.uk/2011/01/pentaho-data-integration-remote.html
As well as doco on the pentaho website.
Starting the server is as simple as:
carte.sh <hostname> <port>
There is also a status page, which you can use to query your carte servers, so if you have a cluster of servers, you can pick a quiet one to send your job to.

Application calls another Application. Does it create another process?

I was reading about Processes. I wan't to know what really happens. My situation :
"I opened an Application. That creates a process say process1. I have other applications interfaced with this one and all these open up when i click a button inside my running application. I want to know Does my process1 create new processes and IPC happens OR processes for all the linked applications are created at once and then IPC happens?"
Obviously,a running application is a bunch of processes,or maybe a single process which has internally multiple threads acting within these processes.
So,your activity decides the creation and deletion of processes.say,if you are running an application such as media player and you suddenly start searching related info about the album---so here,totally a new process is created which helps interaction through web and after returning the output,it may die,may not,but the process was created on your request.Also,mostly ipc happens within processes,exactly as per your thinking,but shared memory communication is also one of the option,which is complicated and is less common.
One more thing to point out is that there are several 'daemon processes' which are running in the background and don't die before shutdown instruction!So,these processes are also sometimes related to the running application and serves its request.But,mostly,newer processes are created when we switch our task or perform certain action in the application.

BIND9.7. When several named processes are running, how to judge which process is providing the service?

For example, I execute "sudo named" several times, so there are several named processes running. When I use "pidof named", I get several pids.
I want to calculate the CPU usage rate of the BIND process,so I need to get some parameters from "/proc/pid/stat", so I need the pid of the named process which is really providing the domain resolution service.
What's the difference between the named process which is providing the service and the others? Could you give me a detailed explanation?
thanks very much~
(It's my first time to use stackoverflow , to use English to ask quetions , please ignore those syntax errors.)
There should be just one named running, the scripts that manage the service ensure that. You shouldn't start it like that, you should use what your distribution uses to start it, probably something along the lines of service bind start (that is probably a RedHat-ism), or /etc/rc.d/bind start (for bog-standard SysVinit).
I was responsible for DNS for quite some time here. Some tips:
DNS is a very critical service, configure and monitor with extreme care. Do read up on setting up and managing this, don't go ahead until you are absolutely clear.
Get somebody as a backup for the case that you aren't available, and make sure they understand the previous point.
DNS isn't CPU intensive (OK, with signed domains and that newfangled stuff that might have changed), it is memory intensive (and network intensive, or at least sensitive to delays). Our main DNS server was running for months at a time, and clocked up some half hour of CPU time during that kind of period IIRC.
Separate your master server (responsible for the domain(s) from the servers queried by clients (caching servers). There have been vulnerabilities where malformed questions or "answers" to questions that hadn't been asked soiled the database
The master server will have all the domain information in RAM, make sure you have got enough of it
Make sure all machines under your jurisdiction use the same caching server. It makes no sense for more than one, that destroys the idea of cache.
The caching servers collect immense amounts of data over time. This data rarely is performance critical, so configure them with plenty of swap space to accommodate overflows.
Bind issues as many named processes as many CPUs you have:
man named:
-n #cpus
Create #cpus worker threads to take advantage of multiple CPUs. If not specified, named will try to determine the number of CPUs present and create one thread per CPU. If it is unable to determine the number of CPUs, a single worker thread will be created.
External source:
https://unix.stackexchange.com/questions/140986/multiple-named-processes-for-bind9-in-debian

long running agents in f#

I use agents in different ways, one of which consists of 100 agents monitoring website changes, and reporting back to a supervisor which I can call to spawns new monitor off, or listen to the merged changes.
This is only part of my program, and I am happy with it.
I now would like to spin it off and that it runs truly independently of my main program.
(Yet I would like this independent spinoff to stay as much as possible inside the langage, and use the least amount of glue code possible)
What strategies do I have here / would you recommend ?
One option for executing long-running agents is to write a Windows Service that starts with the operating system (possibly even before login) and runs in the background. Your main application can then connect to the service and communicate with it.
Here is a basic example of F# Windows Service on MSDN.
Running the agent in a service is quite easy. The communication between service and main application is more tricky, because they are two separate processes. The sample uses .NET Remoting, which has now been replaced with WCF, so I think that would be a thing to look at (especially if you want asynchronous communication). Alternatively, there are some F# projects that implement simple socket-based communication, which might be easier to use.

Deploying on EC2

This question is for anyone who has actually used Amazon EC2. I'm looking into what it would take to deploy a server there.
It looks like I can start in VirtualBox, setup my server and then export the image using the provided ec2-tools.
What gets tricky is if I actually want to make configuration changes to my running server, they will not be persistent.
I have some PHP code that I need to be able to deploy (and redeploy) to the system, so I was thinking that EBS would be a good choice there.
I have a massive amount of data that I need stored, but it just so happens that latency is not an issue, so I was thinking something like s3fs might work.
So my question is... What would you do? What does your configuration look like? What have been particular challenges that perhaps you didn't see coming?
We have deployed a large-scale commercial app in the AWS environment.
There are three basic approaches to keeping your changes under control once the server is running, all of which we use in different situations:
Keep the changes in source control. Have a script that is part of your original image that can pull down the latest and greatest. You can pull down PHP code, Apache settings, whatever you need. If you need to restart your instance from your AMI (Amazon Machine Image), just run your script to get the latest code and configuration, and you're good to go.
Use EBS (Elastic Block Storage). EBS is like a big external hard drive that you can attach to your instance. Even if your instance goes away, EBS survives. If you later need two (or more) identical instances, you can give each one of them access to what you save in EBS. See https://stackoverflow.com/a/3630707/141172
Burn a new AMI after each change. There's a tool to create a new AMI from a running instance. If EBS is like having an external hard drive, creating a new AMI is like having a DVD-R. You can save the current state of your machine to it. Next time you have to start a new instance, base it on that new AMI. Good to go.
I recommend storing your PHP code in a repository such as SVN, and writing a script that checks the latest code out of the repository and redeploys it when you want to upgrade. You could also have this script run on instance startup so that you get the latest code whenever you spin up a new instance; saves on having to create a new AMI every time.
The main challenge that I didn't see coming with EC2 is instance startup time - especially with Windows. Linux instances take 5 to 10 minutes to launch, but I've seen Windows instances take up to 40 minutes; this can be an issue if you want to do dynamic load balancing and start up new instances when your load increases.
I'd suggest the best bet is to simply 'try it'. The charges to run a small instance are not high and data transfer rates are very low - I have moved quite a few GB and my data fees are still less than a dollar(!) in my first month. You will likely end up paying mostly for system time rather than data I suspect.
I haven't deployed yet but have run up an instance, migrated it from Ubuntu 8.04 to 8.10, tried different port security settings, seen what sort of access attempts unknown people have tried (mostly looking for phpadmin), run some testing against it and generally experimented with the config and restart of the components I'm deploying. It has been a good prelude to my end deployment. I won't be starting with a big DB so will be initially sticking with the standard EC2 instance space.
The only negativity I have heard it that some spammers have made some of the IP ranges subject to spam-blocking - but have not yet confirmed that.
Your virtual box approach I will suggest you take after you are more familiar with the EC2 infrastructure. I suggest that you go to EC2, open an account and follow Amazon's EC2 getting-started guide. This guide will give you enough overview on all things (EBS, IP, CONNECTIONS, and otherS) to get you started. We are currently using EC2 for production and the way we started was like I am explaining here.
I hope you become a Cloud Expert Soon.
Per timbo's concern, I was able to nab an IP that, so far hasn't legitimately shown up on any spam lists. You will have a few hiccups since many blacklists are technically whitelists and will have every IP on their list until otherwise notified that a Mail Server is running on that IP. It's really easy to remove, most of them have automated removal request forms and every one that doesn't has been very cooperative in removing me from their lists. Just be professional, ask if they can give a time and reason for the block and what steps you should take to remove your IP. All the services I have emailed never asked me to jump through any hoops, within two or three business days they all informed me my IP had been removed.
Still, if you plan on running a mail server I would recommend reserving IPs now. They're 1 cent per every hour they are not bound to an instance so it works out to being about $7 a month. I went ahead and reserved an extra one as I plan on starting up another instance soon.
I have deployed some simple stuff to EC2 Win2k3 instances. Here's my advice:
Find a tutorial. Sign up for the service. Just spend an afternoon setting up your first server. It's pretty darned easy, though there will be obstacles to overcome. It's not too tough.
When I was fooling with EC2 I think I spent like $2.00 setting up a server and playing with it for a while.
Some of your data will be persistent, but you can connect S3 to EC2 as well.
Just go for it!
With regards to the concerns about blacklisting of mail servers, you can also use Amazon's Simple Email Service (SES), which obviates the need to run the mail server on the EC2 instances.
I had trouble with this as well, but posted a note here in their forums - https://forums.aws.amazon.com/thread.jspa?threadID=80158&tstart=0