Is there a way to import backups in NiFi?

Using NiFi v0.6.1, is there a way to import backups/archives?
By backups I mean the files that are generated when you call
POST /controller/archive using the REST API, or "Controller Settings" (toolbar button) and then "Back-up flow" (link).
I tried unzipping the backup and importing it as a template, but that didn't work. Comparing it to an exported template file, the formats are quite different. Perhaps there is a way to transform it into a template?
At the moment my workaround is to select no components on the top-level flow and then choose "create template", which adds a template containing all my components; I then export that. My issue with this is that it's trickier to automate via the REST API. I used Fiddler to see what the UI is doing: it first generates a snippet that includes all the components (labels, processors, connections, etc.), then calls create template (POST /nifi-api/controller/templates) using the snippet ID. The template call is easy enough, but generating the definition for the snippet is going to take some work.
Note: once the following feature request is implemented, I'm assuming I would just use that instead:
https://cwiki.apache.org/confluence/display/NIFI/Configuration+Management+of+Flows
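For reference, triggering the back-up call itself is easy to script. A minimal sketch in Python, assuming a standalone NiFi 0.6.x instance at http://localhost:8080 with no authentication (both assumptions), using the archive endpoint mentioned above:

# Minimal sketch: trigger the same back-up the UI performs, via the REST API.
# Assumptions: standalone instance at http://localhost:8080, no authentication.
import requests

NIFI_API = "http://localhost:8080/nifi-api"  # assumed base URL

def archive_flow():
    # POST /controller/archive snapshots the current flow into conf/archive
    response = requests.post(f"{NIFI_API}/controller/archive")
    response.raise_for_status()
    print("Back-up requested:", response.status_code)

if __name__ == "__main__":
    archive_flow()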

The entire flow for a NiFi instance is stored in a file called flow.xml.gz in the conf directory (flow.xml.tar in a cluster). The back-up functionality essentially takes a snapshot of that file at that point in time and saves it to the conf/archive directory. At a later point you could stop NiFi and replace conf/flow.xml.gz with one of those back-ups to restore the flow to that state.
Templates are a different format from the flow.xml.gz. Templates are more public-facing and shareable, and can be used to represent portions of a flow, or the entire flow if no components are selected. Some people have used templates as a model to deploy their flows, essentially organizing their flow into process groups and making a template for each group. This project provides some automation for working with templates: https://github.com/aperepel/nifi-api-deploy
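To illustrate the restore path described above, here is a minimal sketch in Python. It assumes the default standalone layout (conf/flow.xml.gz and conf/archive under an assumed install path) and that NiFi has already been stopped:

# Minimal sketch: restore the newest archived flow snapshot.
# Assumptions: default standalone layout (conf/flow.xml.gz, conf/archive/),
# NiFi is stopped before this runs, install path is a placeholder.
import glob
import os
import shutil

NIFI_CONF = "/opt/nifi/conf"  # assumed install location

def restore_latest_archive():
    archives = glob.glob(os.path.join(NIFI_CONF, "archive", "*"))
    if not archives:
        raise RuntimeError("no archived flows found in conf/archive")
    latest = max(archives, key=os.path.getmtime)   # newest snapshot by mtime
    target = os.path.join(NIFI_CONF, "flow.xml.gz")
    if os.path.exists(target):
        shutil.copy2(target, target + ".bak")      # keep the current flow, just in case
    shutil.copy2(latest, target)                   # replace it with the snapshot
    print(f"Restored {latest} over {target}")

if __name__ == "__main__":
    restore_latest_archive()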

You just need to stop NiFi, replace the NiFi flow configuration file (for example, flow.xml.gz in the conf directory), and start NiFi back up.
If you have trouble finding it, check your nifi.properties file for the string nifi.flow.configuration.file= to find out what you've set this to.
If you are using clustered mode you need only do this on the NCM.
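If you want to script that lookup, a small sketch in Python (the conf path is an assumption):

# Minimal sketch: read nifi.properties to find the configured flow file path.
import os

NIFI_CONF = "/opt/nifi/conf"  # assumed install location

def flow_configuration_file():
    with open(os.path.join(NIFI_CONF, "nifi.properties")) as props:
        for line in props:
            line = line.strip()
            if line.startswith("nifi.flow.configuration.file="):
                return line.split("=", 1)[1]
    raise KeyError("nifi.flow.configuration.file not set")

print(flow_configuration_file())  # e.g. ./conf/flow.xml.gz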

Related

Is it tenable to access file metadata using Flowgear resources?

I have a requirement to use a Flowgear workflow to process files (via a drop point targeting a Windows file share [SMB]), but only the files that have been modified after a certain time of day.
How can one tell the "Last Modified" date/time of a file using a Flowgear node?
I have been searching the Flowgear help center and experimenting with the file-related nodes - File, File Enumerator, File Watcher and File Manage - but I haven't seen any property that exposes this piece of metadata.
Here's an example of how you can do it.
https://flowgear.me/#s/cCp8kGQ
In this sample you use the Script node to get a list of files; after that you can once again use the normal File nodes to do the rest. It could be modified to return the file directly via the Script node, but that would require additional hand-coding and is needlessly complex.
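The linked sample's script isn't reproduced here, so as a rough illustration only, this is the underlying "modified after a certain time of day" filter sketched in Python rather than in a Flowgear Script node; the share path and cutoff time are placeholders:

# Illustrative only: list files whose last-modified time is after a cutoff today.
import os
from datetime import datetime, time

SHARE_PATH = r"\\fileserver\share"   # assumed UNC path to the SMB share
CUTOFF = time(14, 0)                 # assumed cutoff: modified after 14:00 today

def files_modified_after_cutoff(folder, cutoff):
    cutoff_dt = datetime.combine(datetime.now().date(), cutoff)
    for name in os.listdir(folder):
        path = os.path.join(folder, name)
        if os.path.isfile(path):
            modified = datetime.fromtimestamp(os.path.getmtime(path))
            if modified > cutoff_dt:   # "Last Modified" comes from the file's mtime
                yield name, modified

for name, modified in files_modified_after_cutoff(SHARE_PATH, CUTOFF):
    print(name, modified)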

TIBCO Global variables, reverse engineering

I'm currently working on a project where I am at the stage of figuring out what the current implementation is doing. I have been putting in a lot of time (a lot) searching for connections between queues declared as global variables.
Is there a way to get a listing of where a specific global variable is being used, or do I actually need to go through all processes, as I'm doing at the moment?
Thank you :)
In TIBCO Designer 5.8 you can find where a global variable is used via the "Tools -> Find Global Variable usages" menu item.
Please note that all TIBCO process source files are plain text, so you can also search inside the project folder with any utility that allows searching within text files. On Windows I prefer Far Manager.
In Far Manager you can navigate to the project folder, press Alt+F7, and search for
%%GLOBAL_VARIABLE_NAME%%
Please also note that even if you don't have the TIBCO project source code, you can get it from the TIBCO BW server, for example under a path like
tibco\tra\domain\tibco\datafiles\YOUR_PROJECT_NAME
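If you don't have Far Manager handy, the same text search is easy to script; a minimal sketch in Python, where the project path and variable name are placeholders:

# Minimal sketch: list every project file (and line) that references a global variable.
import os

PROJECT_DIR = r"C:\tibco\tra\domain\tibco\datafiles\YOUR_PROJECT_NAME"  # placeholder
VARIABLE = "%%GLOBAL_VARIABLE_NAME%%"                                   # placeholder

def find_usages(root, needle):
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            try:
                with open(path, encoding="utf-8", errors="ignore") as handle:
                    for lineno, line in enumerate(handle, start=1):
                        if needle in line:
                            print(f"{path}:{lineno}: {line.strip()}")
            except OSError:
                pass  # skip unreadable files

find_usages(PROJECT_DIR, VARIABLE)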

Bulk edit UrbanCode configuration?

I want to do some bulk search/edit operations on the scripts embedded in our UrbanCode components and applications, and possibly on the flowcharts and blueprints. Unfortunately a lot of this is stored in UrbanCode's own repository, where it can only be accessed through the browser GUI, and I can't do things like grep for common patterns across the whole set.
Is there any documented way to check out/check in, or at least download, a copy of an entire UCD environment as text files that I could analyze?
Thanks.
I think the closest documented way to get some of what you are looking for is to export the application and search through the JSON file. Component processes, with all their steps, are included in the application export.
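Building on that, once you have an application export, a generic recursive search over the JSON is enough to grep for patterns. A minimal sketch in Python; it makes no assumptions about the export's field names and simply scans every string value (the file name and pattern are placeholders):

# Minimal sketch: search every string value in an exported application JSON for a pattern.
import json
import re

EXPORT_FILE = "application-export.json"   # placeholder name for the downloaded export
PATTERN = re.compile(r"rm -rf")           # placeholder pattern to hunt for in scripts

def search(node, pattern, path="$"):
    if isinstance(node, dict):
        for key, value in node.items():
            search(value, pattern, f"{path}.{key}")
    elif isinstance(node, list):
        for index, value in enumerate(node):
            search(value, pattern, f"{path}[{index}]")
    elif isinstance(node, str) and pattern.search(node):
        print(f"{path}: {node[:120]}")

with open(EXPORT_FILE) as handle:
    search(json.load(handle), PATTERN)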

How to separate the latest file from multiple files in Mule

I have 5000 files in a folder, and new files keep getting loaded into the same folder on a daily basis. I need to pick up only the latest file each day from among all the files.
Is it possible to achieve this scenario in Mule out of the box?
I tried putting the File component inside a Poll component (to make use of watermark), but it isn't working.
Is there any way we can achieve this? If not, please suggest the best approach (any relevant links).
Mule Studio 5.3, Runtime 3.7.2.
Thanks in advance
Short answer: there isn't really an extremely quick out-of-the-box solution, but there are other ways. I'm not saying this is the right or only way of solving it, but I've implemented a similar scenario like this before:
A normal File inbound endpoint with a database table as a file log. Each time a new file is processed, a component checks whether its name appears in the table. Via a choice or filter I only continue if it isn't in there already, and after processing I add the filename to the table.
This is a quite "heavy" solution, though. A simpler approach would be to use an idempotent filter with an object store, for example a Redis server: https://github.com/mulesoft/redis-connector/blob/master/src/test/resources/redis-objectstore-tests-config.xml
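To make the file-log idea concrete: the core of it is "skip names already processed, then take the newest remaining file by modification time". A sketch of that logic in Python, outside Mule, with a plain set standing in for the database table or object store (folder path is a placeholder):

# Illustrative only: the "file-log" / idempotent-filter idea, outside Mule.
import os

INBOX = "/data/inbox"        # assumed folder the Mule flow polls
processed = set()            # would be a DB table or object store in the real flow

def next_latest_file(folder):
    candidates = [
        os.path.join(folder, name)
        for name in os.listdir(folder)
        if name not in processed and os.path.isfile(os.path.join(folder, name))
    ]
    if not candidates:
        return None
    latest = max(candidates, key=os.path.getmtime)   # newest unprocessed file
    processed.add(os.path.basename(latest))          # record it as handled
    return latest

print(next_latest_file(INBOX))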
It is actually very simple if your incoming file name contains a timestamp: you can configure the file inbound connector by setting file:filename-regex-filter pattern="myfilename_#[function:timestamp].csv". I hope this helps.
Maybe you can use a Quartz scheduler (specify the time in a cron expression), followed by a Groovy script in which you start the file connector. Keep the file connector in another flow.

Merge two Endeca Servers (Endeca 3.1) into one, including their current data

Let me explain in more detail:
1st: I'm running Endeca 3.1, so "Endeca Server" here refers to 3.0's Data Domain.
I'm required to use an Endeca Server currently present on Endeca (downloaded a demo VM). All the info on it, including groups, attributes and data, must be merged into our Endeca Server. (It could also be the other way around; I could merge my Endeca Server into this one.)
So far, I've tried to do the following:
1) Clone the Endeca Server
2) Use the putCollection sconfig operation to create a collection on it with the same name I have on mine.
3) Load configurations using the LoadCollection & LoadAttributes graphs from the OEID POC Template 3.1. I point to the new collection in the Configuration.xls file.
This is where I encounter an issue. The LoadAttributes graph gets a timeout (T/O) message from the server's web service. Then the config WSDL becomes inaccessible for a while. I can't get beyond this point.
I've been able to load data into the collection, but I need to load the attributes first.
Thanks in advance for your replies.
Regards
There are a few techniques.
Have you tried exporting the data domain and then importing it?
You can use the endeca-cmd tools to export to a file, and then import from that file. This would enable you to add two data stores into one server.
If you want to combine two data stores, then that is a different question.
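A rough sketch of scripting that export/import with endeca-cmd follows. The subcommand names and arguments shown (export-dd / import-dd with an export file) are assumptions and should be checked against the endeca-cmd help for your release:

# Rough sketch only: drive endeca-cmd from Python to export one data domain and
# import it elsewhere. The subcommands and argument names below are assumptions;
# verify them with `endeca-cmd --help` for your release before use.
import subprocess

ENDECA_CMD = "/opt/endeca/endeca-cmd/endeca-cmd"   # assumed install path
SOURCE_DD = "demo_dd"                              # placeholder data domain name
EXPORT_FILE = "/tmp/demo_dd_export"                # placeholder export file

subprocess.run([ENDECA_CMD, "export-dd", SOURCE_DD,
                "--offline-export-file", EXPORT_FILE], check=True)
subprocess.run([ENDECA_CMD, "import-dd", SOURCE_DD,
                "--offline-export-file", EXPORT_FILE], check=True)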
The simplest approach in 3.1, if the data collections are small: extract them as CSV (via a data table), convert to XLS, and add them via self-provisioning into separate collections within a single data store. If you are running in the VM this is potentially the easiest approach.
This can also be done using Integrator.
You don't need to load the attributes unless you are using multi-value types. You can call against the conversation web service to extract data and then load it using 'bulk-load'. I would not worry too much about creating the attributes unless this becomes essential due to their type or complexity. If you cannot call against the conversation web service, then again extract as CSV and load using Integrator.