I'm pretty new to both Taverna and Abaqus but I am trying to run an Abaqus model using a "Tool" in Taverna remotely on a HPC. This works fine if I already have my model file and inputs on the HPC but I need a way of uploading the files dynamically in Taverna (trying to generically wrap Abaqus models).
I've tried adding a input port that takes a file list but I don't know how I can copy it to the "location" that I've set for the tool. Could a beanshell service be the answer or can I iterate through the file list and copy them up before executing the abaqus model?
Thanks
When you say that you created an input port that takes a file list, I guess you mean an input to the tool service.
Assuming the input port is called my_file_list, when the tool service is run, it will take a list of data values on port my_file_list. As an example, say it has "hello", "hi" and "hola" is the three values in the list.
On the location where the tool service is run, it executes in a temporary directory - a different directory for each execution of the service. It is normally something like /tmp/usecase-2029778474741087696
Three files will be created in the temporary directory; those files contain the (in this example) three values the tool service received on port my_file_list. The files could be called
/tmp/usecase-2029778474741087696/tempfile.0.tmp containing hello
/tmp/usecase-2029778474741087696/tempfile.1.tmp containing hi
/tmp/usecase-2029778474741087696/tempfile.2.tmp containing hola
There will also be a file called my_input_list. That file will contain
/tmp/usecase-2029778474741087696/tempfile.0.tmp
/tmp/usecase-2029778474741087696/tempfile.1.tmp
/tmp/usecase-2029778474741087696/tempfile.2.tmp
The script of your tool service would normally read the contents of my_input_list line by line and do something with the contents of the listed file(s).
I have also seen some scripts that 'cheat' and iterate directly over tempfile*.tmp but that would be "a bad thing". The problem with that trick, is that if you want to add a second list of files to the tool service then the file my_input_list could contain
/tmp/usecase7932018053449784034/tempfile.4.tmp
/tmp/usecase7932018053449784034/tempfile.5.tmp
/tmp/usecase7932018053449784034/tempfile.6.tmp
as other temporary files were used for the other file list port.
I hope that helps
The tool service allows you to upload files - but if you are using the HPC through a job submission node, then you would have to modify your command line tool to then use the job file staging command to further push the files as part of the job. The files would be available in the current (temporary) directory of the specified tool script.
I would try to do it through the Tool service and not involve the beanshell - then you can keep your workflow simpler.
A good thing to remember is that you can write multiple shell commands in the box.
Similarly you would probably want to retrieve back the results so that you can process them further in the workflow (unless they are massive - in which case you should just output their remote filenames and send them in again to the next HPC job)
The exact commands to use for staging files and retrieving them depends on the HPC job submission system. Which one are you using?
Thanks for the input guys.
It was my misunderstanding of how Taverna uses the File list. All the files in the list are copied to the temp "sandbox" and are therefore available for use.
Another nice easy way is to zip the directory and pass the zipped files into an input port for the service. Then just unzip the files inside the command.
Thanks again
Related
I have a requirement use a flowgear workflow to process files (via a droppoint, targeting an windows file share [SMB]), but targeting only the files that have been modified after a certain time of day.
How can one tell the the "Last Modified" date/time of a file using a flowgear node?
I have been searching the Flowgear help center, and have been experimenting with file-related nodes - File, File Enumerator, File Watcher and File Manage, but I haven't seen any property that exposes this piece of metadata.
Here's an example of how you can do it.
https://flowgear.me/#s/cCp8kGQ
In this sample, you simply use the script to get a list of files, after that you can once again use the normal File nodes to do the rest. It can be modified to return the file directly via the Script node, however that would require additional hand-coding and is needlessly complex.
Are there guidelines regarding how to share a Snakemake workflow among multiple users on the same data under Linux, or is the whole thing considered bad practice?
Let me explain in case it's not clear:
Suppose user A executes a workflow in directory dir/. Assume the workflow terminates successfully, and he/she then properly sets file/directory permissions recursively on all output and intermediate files and the .snakemake/ subdirectory for other users to read/write, of course.
User B subsequently navigates to dir/, adds input files to the workflow, then executes it. Can anything go wrong?
TL;DR: I'm asking about non-concurrent execution of the same workflow by distinct users on the same system, and on the same data on disk. Is Snakemake designed for such use cases?
It's possible to run snakemake --nolock which will prevent locking of the directory, so multiple runs can be made from inside the same directory. However, without lock, there's now an opening for errors due to concurrent runs trying to modify the same files. It's probably OK, if you are certain that this will be avoided, e.g. if you are in constant communication with another user about which files will be modified.
An alternative option is to create a third directory/path, and put all the data there. This way you can work from separate directories/path and avoid costly recomputes.
I would say that from the point of view of snakemake, and workflow management in general, it's ok for user B to add or update input files and re-run the pipeline. After all, one of the advantages of a workflow management system is to update results according to new input. The problem is that user A could find her results updated without being aware of it.
From the top of my head and without more detail this is what I would suggest. Make snakemake read the list of input files from a table (pandas comes in handy for this) or from some configuration file. Keep this sample sheet under version control (with git/github) together with the Snakefile and other source code.
When users update the working directory with new files, they will also need to update the sample sheet in order for snakemake to "see" the new input and other users will know about it via version control. I prefer this setup over dumping files in a directory and letting snakemake process whatever is in there.
I have a single command line windows executable that has many options built into this exe file.
Eg:
(It can take screenshot)
ToolGo.exe printscreen c:\temp\filename.jpg yyyymmdd
(It can show up)
ToolGo.exe showIP machineA
I want to write another command line application, possibly in .net , where it can embed/build a wrapper around this ToolGo.exe file into my application without the user be able to use the ToolGo.exe, and also users can only access one function of this main exe file.
In the example I want this other tool to access only the print screen function in this new exe file.
The new application will have this:
Tool2go.exe printscreen c:\temp\filename.jpg yyyymmdd
But if someone types the following, it will not work:
Tool2go.exe showIP machineA
Or
ToolGo.exe showIP machineA
Any ideas how I can write this code to do this in a .net command line application?
This is a multi-part question, so I'll just give the main part of the issue as the answer with suggestions on handling the rest.
You can embed a .exe into your program by clicking on Properties and navigating the the Resources section, and adding that .exe to it.
After that, it's just a matter of extracting it locally so you can pass your commands to it, and handle it's responses. (I'm not really aware of any way to do so w/out first extracting the. exe; the .exe itself needs to run somehow after all).
To extract the embedded .exe, you do this:
' Extract the MyProgram resource (i.e. your .exe)
Dim b() As Byte = My.Resources.MyProgram
' Write it to the user's Temp folder
File.WriteAllBytes(Environment.ExpandEnvironmentVariables("%TEMP%\MyProgram.exe"), b)
By extracting it to the user's Temp folder, you can pass it your commands, and since it's 'out of sight' the user probably won't even know it's there to directly use it themselves, unless they're a bit more advanced and visit their Temp folder often. You can slightly help to avoid this, but extracting the .exe when your program starts, and then deleting it when it exits, so it only exists on the user's system while your program is running.
As far as what the user can and cannot type in order to pass to the program, you can simply handle the filtering with your program; since your program is the one passing the commands to the .exe, just don't pass any commands that you don't allowed, and pass the ones you do want allowed.
Situation now:
I have a data warehouse job profile that publishes .txt file in Data folder every day in the morning. I open Tableau workbook which automatically updates data visualisations because of union I made. I save this workbook as extract and collages without Tableau Desktop can view it via Tableau Reader.
What I need:
This reporting format is heavily dependent on me and I need to automate this.
Is this even possible without Tableau Server?
Since Tableau Viewer can only use packaged workbooks with extracted data, you may not directly achieve this.
However, you may automate the packaging process using Tableau's command line parameters and the process will not be dependent on anyone anymore.
You may check the .PDF file on below link. Using that help document, you may create a .BAT file and get that .BAT file periodically started using Task Scheduler on your computer. The users then may open the packaged file from the network location you have saved. Or else (If all user computers have Tableau Desktop installed) you may put the file opening line at the end of the .BAT file, so the user can run the .BAT when they want to see the report.
https://community.tableau.com/docs/DOC-5209
Bernardo was correct in saying the Extract API can be used to programatically create extracts, and thus "refresh" an extract by simply recreating it (the point about Tableau Server is only relevant if you want to publish the extract that you create with the Extract API).
Where you might have trouble is that there is no currently supported way to programatically replace an extract within a .twbx file. That said, it should be possible to do this by simply renaming the .twbx to .zip (it is after all just an archive) and then using something like Python's zip module to manipulate the archive to replace the extract with your new extract.
NB: The Extract API can only be used to create .hyper files. If you want to work with .tde files, then you'll need to use the Tableau SDK instead
So I would like to create an action that will take a folder, and display/store its contents filenames (as I would later want to manipulate the files on this folder via their filenames). I believe this is achieved via javascript.
I did the ff:
create new action
-> starts with -> A Folder on my Computer
I have no idea what to do next.
Any help would be awesome.
Thanks
A document's path can be accessed via this.path, the file name can be accessed via this.documentFileName. If all you want is a list of paths, you could print them to the console via console.println(this.path);.
There are likely better methods to achieve your objective than via Acrobat's batch processes, but without more detail, it's hard to point you in a direction.