Gawk "system" call: run an external program in several instances

I am using Gawk on a Mac-based UNIX. I have a "system" call that runs a program. As I understand it, Gawk waits for the external program to finish before going on to the next line. I would like to start several instances of the external program, wait for them all to finish, "cat" the results, and then continue with the Gawk program. The point is to use all the processors on a Mac Pro machine. The job takes days to run now, and running the external program in several instances would help greatly. Thank you in advance for any help on this.

I used a system call to a bash script that ran the various programs in the background. The script used the bash "wait" command to make sure the programs had finished before returning to the Gawk program. That worked.
Got the idea from the son of a friend. Very helpful.
Ellis
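For anyone who wants to try the same approach, here is a minimal sketch of such a script. It is not the script from the post: it assumes a POSIX shell with mktemp available, the names worker and run_parallel are hypothetical, and worker is only a placeholder for the real external program.

```shell
#!/bin/sh
# Placeholder for the real external program (an assumption for this sketch):
# it just upper-cases its argument.
worker() {
    printf '%s\n' "$1" | tr '[:lower:]' '[:upper:]'
}

# Start one background job per argument, wait for all of them,
# then print the results in the order the jobs were started.
run_parallel() {
    tmpdir=$(mktemp -d) || return 1
    n=0
    for item in "$@"; do
        n=$((n + 1))
        worker "$item" > "$tmpdir/part.$n" &   # runs in the background
    done
    wait                                       # returns once every job has exited
    i=1
    while [ "$i" -le "$n" ]; do
        cat "$tmpdir/part.$i"                  # "cat" the results together
        i=$((i + 1))
    done
    rm -rf "$tmpdir"
}

run_parallel alpha beta gamma
```

Because Gawk's system() does not return until the command it started has exited, calling a script like this from Gawk should resume the Gawk program only after all the background jobs are done.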

Related

Key binding to interactively execute commands from Python interpreter history in order?

I sometimes test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality. Of course, since my code is in development there are bugs, and frequent restarts of the interpreter are required. This isn't too painful when I've only executed a couple of interpreter lines before restarting: my key sequence when the interpreter restarts looks like Up Up Enter Up Up Enter... but extrapolate it to 5 or more statements to be repeated and it gets seriously painful!
Of course I could put my test code into a script which I execute with python -i, but this is such a scratch activity that it doesn't seem quite "above threshold" for opening a text editor :) What I'm really pining for is the Ctrl-o behaviour from the bash shell: executing a sequence of 10 commands in bash involves finding the first command in history (repeated Up or Ctrl-r for a search -- both work in the Python interpreter shell) and then just pressing Ctrl-o ten times. One of my favourite bash shell features.
The problem is that while lots of other readline bindings like Ctrl-a, Ctrl-e, Ctrl-r, and Ctrl-s work in the Python interpreter, Ctrl-o does not. I've not been able to find any references to this online, although perhaps the readline module can be used to add this functionality to the python prompt. Any suggestions?
Edit: Yes, I know that using the interactive interpreter is not a development methodology that scales beyond a few lines! But it is convenient for small tests, and IMO the interactiveness can help to work out whether a developing API is natural and convenient, or too heavy. So please confine the answers to the technical question of whether readline history-stepping can be made to work in python, rather than the side-opinion of whether one should or shouldn't choose to (sometimes) work this way!
Edit: Since posting I realised that I am already using the readline module to make some Python interpreter history functions work. But the Ctrl-o binding to the operate-and-get-next readline command doesn't seem to be supported, even if I put readline.parse_and_bind("Control-o: operate-and-get-next") in my PYTHONSTARTUP file.
I often test Python modules as I develop them by running a Python interactive prompt in a terminal, importing my new module and testing out the functionality.
Stop using this pattern and start writing your test code in a file and your life will be much easier.
No matter what, running that file will be less trouble.
If you make the checks automatic rather than reading the results, it will be quicker and less error-prone to check your code.
You can save that file when you're done and run it whenever you change your code or environment.
You can perform metrics on the tests, like making sure you don't have parts of your code you didn't test.
Are you familiar with the unittest module?
Answering my own question, after some discussion on the python-ideas list: despite contradictory information in some readline documentation it seems that the operate-and-get-next function is in fact defined as a bash extension to readline, not by core readline.
So that's why Ctrl-o does not behave as hoped, either by default when the readline module is imported in a Python interpreter session or when the binding is forced manually: the function doesn't exist in the readline library to be bound.
A Google search reveals https://bugs.launchpad.net/ipython/+bug/382638, on which the GNU readline maintainer gives reasons for not adding this functionality to core readline and says that it should be implemented by the calling application. He also says "its implementation is not complicated", although it's not obvious to me how (or whether it's even possible) to do this as a pure Python extension to the readline module behaviour.
So no, this is not possible at the moment, unless the operate-and-get-next function from bash is explicitly implemented in the Python readline module or in the interpreter itself.
This isn't exactly an answer to your question, but if that is your development style you might want to look at DreamPie. It is a GUI wrapper for the Python terminal that provides various handy shortcuts. One of these is the ability to drag-select across the interpreter display and copy only the code (not the output). You can then paste this code in and run it again. I find this handy for the type of workflow you describe.
Your best bet is to check out the IPython project: http://ipython.org
This is an example with a history search with Ctrl+R:
EDIT
If you are running Debian or a derivative:
sudo apt-get install ipython

profile an awk command?

Probably a silly question, since awk commands are usually pretty compact and do just one or two operations...
Is there a way to profile an awk command? I.e., if it uses gsub, split, or sorting of associative arrays, is there an easy way to find out which part is bogging down the whole operation?
EDIT: Specifically, I am looking for the execution time of each subcommand, not how many times it was called. Is this possible?
From the gawk man page:
pgawk is the profiling version of gawk. It is identical in every way
to gawk, except that programs run more slowly, and it automatically
produces an execution profile in the file awkprof.out when done. See
the --profile option, below.
So the answer is yes, if you are using the GNU implementation.
And to forestall your next question, the man page goes on to say:
dgawk is an awk debugger. Instead of running the program directly, it
loads the AWK source code and then prompts for debugging commands.
Unlike gawk and pgawk, dgawk only processes AWK program source provided
with the -f option. The debugger is documented in GAWK: Effective AWK
Programming.
There's an awk implementation with a debugger similar to gdb, called dgawk.
You say you want execution time for each subcommand.
Here's how I do it, regardless of language:
Give it enough workload so it runs long enough, and time it with a watch (N seconds).
Then do it again, and while it's running, hit Ctrl-C.
Do backtrace to examine the stack, and copy that into a text editor.
Do that several times, like 10.
Any subcommand will appear on the stack for the fraction of time it spends.
So if sort is taking 50% of the time (N/2 seconds), it will appear on about 5 of those samples.
This tells you about big time-takers, not little ones. I assume you are looking for the big ones.
(Some people say this isn't accurate, which is baloney. Sure the amount of time isn't very accurate - it doesn't need to be. The accuracy you need is in location - pinpointing where the problem is, and that's what it does.)
ADDED: You can almost do this with pgawk. If you run your program in profiling mode, each time you hit Ctrl-C (or whatever) it prints the call stack to the output file. The only problem is, it prints the function names but not what lines they are called from, which you might actually need.
Here is the fine documentation about profiling gawk.
Build a profiling version of gawk for gprof, or use the kernel-based oprofile. You can then see in a lot of detail how much time is spent in various internal functions in gawk in response to your script and its data. Functions like gsub and split map to functions inside gawk.
For instance gsub and other functions are handled by the do_sub function in this source file:
http://git.savannah.gnu.org/cgit/gawk.git/tree/builtin.c
So you would look for how much time is spent in do_sub.
You want to compile and link gawk with the -pg GCC option. Successful runs of the program will then dump a profiling file gmon.out from which gprof will produce a report.
I highly recommend oprofile also, but going into it is a little out of scope for this answer.

See when an executable opens and closes

How can I get a callback within my application when an executable (such as pbs, cp, etc) launches and then exits? This would need to work only knowing the path to the executable.
You could move the original executable aside, and replace it with a wrapper that runs the original, reporting when it runs and exits.
You could look at the accton and lastcomm commands, which record the start and exit of every process on the system.
You could look into using dtrace, which can definitely do what you're asking but it's rather complicated to use. You'd probably have to do a fair amount of learning to do this. I don't know much about writing dtrace scripts, but I'd probably start with execsnoop as my model.
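The wrapper suggestion above could look roughly like the following. This is only a sketch under stated assumptions: run_logged, the log format, and the paths in the comments are hypothetical, and the application would still have to watch the log file (or be notified some other way) to get its callback.

```shell
#!/bin/sh
# Sketch of a logging wrapper. run_logged and the log format are
# illustrative assumptions, not part of the original answer.
run_logged() {
    log=$1
    shift
    # Record the launch with a timestamp and the command name.
    printf '%s start %s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$1" >> "$log"
    status=0
    "$@" || status=$?          # run the real program and keep its exit status
    # Record the exit together with the status the program returned.
    printf '%s exit %s status=%s\n' "$(date '+%Y-%m-%dT%H:%M:%S')" "$1" "$status" >> "$log"
    return "$status"
}

# In an installed wrapper the last line would be something like:
#   run_logged /var/log/cp-events.log /bin/cp.real "$@"
# where /bin/cp.real is the renamed original (a hypothetical path).
```

Passing the exit status back through the wrapper matters, because callers of the original executable may depend on it.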

Objective-C terminal application

Is there such a thing? I need to make an application in Xcode that basically does what the Terminal app does, with just an NSTextField as the input, a label for the terminal output, and a send button. All this needs to be done without Terminal actually being open.
Is this possible? If so, can someone post a website or sample code?
It's certainly possible. The Terminal basically just runs a shell (bash by default). You could just launch an app that forwards entered text on to bash, and let bash do the work. Or you could interpret the input yourself. Bash is pretty simple, for the most part: you type in a program and arguments, it finds the program in the $PATH, and launches it with the given arguments. (Of course, bash gets a bit fancier, with pipes, input/output redirection, scripting, background tasks, etc., but if you don't need that in your application, you could ignore those features.) You can use NSTask, system(3), or the exec family of functions to launch processes (NSTask is probably your best bet).

strange behavior

I wrote a simple script named test:
echo hello #<-- inside test
If I press Enter once after hello, my script runs. If I don't press Enter, it does not run. If I press Enter twice, I get my hello plus a "command not found" error. Can somebody please explain this behavior? Thanks in advance.
This is not a part of the code; this is the actual code.
I run it in the C shell, and I edit it with a Windows editor.
command:
source ./test
Some points:
You should not ask questions tagged with both the [csh] and [bash] tags - these are completely different programs and implement completely different script programming languages
You should never name a script (or any other program) test, as this is the name of a built-in feature of bash
Post the actual code you are asking about, without annotations and show how you run it.
I have tried a similar case. I wrote a script like yours, saved it using Windows Notepad (with CRLF line terminators) and ran it in bash, with the same effect as yours in csh. The problem is that bash (and so csh as well) does not understand Windows' 2-byte line terminators: the extra carriage return is treated as part of the command line, so a line containing only a carriage return is interpreted as a command, which obviously does not exist.
The solution is: change your editor or configure your current editor to use unix line terminators.
You can try for example Notepad++. Remember to change the line terminators to LF.
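If changing editors is not convenient, the carriage returns can also be stripped on the Unix side. A minimal sketch, assuming a POSIX shell with tr available; strip_cr is a hypothetical helper name, not a standard command:

```shell
#!/bin/sh
# Print the given file with every carriage return removed,
# so the shell sees plain LF line endings.
strip_cr() {
    tr -d '\r' < "$1"
}

# Possible usage (file names are only examples):
#   strip_cr test > test.unix && mv test.unix test
```

Writing to a second file and renaming it afterwards avoids truncating the script while it is still being read.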