How to save API output data to a Dataset in Azure Data Factory

I'm currently working on a project in Azure Data Factory that involves collecting data from a dataset, using that data to make API calls, and then posting the output of those calls to another dataset.
This way I want to end up with a dataset containing the various pieces of data that the API calls return to me.
My current difficulty is that I do not know how to make the "Web" activity (which I use to make the API call) save its output to my dataset.
I have tried numerous solutions found online, but none of them seem to work. I am not sure whether the official documentation is outdated or whether I'm misunderstanding parts of it. Below are links to the solutions I've tried without success:
Copy data from a REST source
Copy data from an HTTP source
(among others, including similar posts to mine.)
The current flow in my pipeline is that a "Lookup" activity collects a list of values named "User_ID". These user IDs are fed into a ForEach loop, which makes an API call with the "Web" activity for each User_ID. This is the point in the pipeline where I would like to add an activity (or something else) that can post each of these Web activity outputs into my new dataset.
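To make the intent concrete, here is roughly the flow I'm after, sketched in plain Python (the endpoint is only a placeholder for my real API):

```python
# Roughly the flow I'm after, in plain Python. The endpoint is a placeholder;
# in the pipeline this is a Lookup, a ForEach, and a Web activity.
import requests

user_ids = ["u001", "u002"]  # what the Lookup activity returns

results = []
for user_id in user_ids:  # what the ForEach loop iterates over
    # the Web activity's API call for one User_ID
    resp = requests.get(f"https://api.example.com/users/{user_id}", timeout=30)
    results.append(resp.json())

# This last step is what I'm missing in ADF: writing `results` to my sink dataset.
```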
I've tried to use the "Copy data" activity, but all it seems to do is copy data straight from one dataset to another, without being able to manipulate the data (which I want to do with my API call).
Does anyone have a solution to how this is done?
Thanks a lot in advance.

Not sure why you could not achieve this by following Copy data from a REST endpoint. I tested the below and it works fine; I used the schema mapping feature of the 'Copy data' activity.
For example, I used a sample API, http://dummy.restapiexample.com/api/v1/employees, as the source, and for my testing I used CosmosDB as the sink. Of course, you can choose any other dataset as per your requirement.
1) Create a 'Linked Service' for the REST API. For simplicity, I do not have authentication for this API; of course, you have that option if required.
2) Create a 'Linked Service' for the target data store. In my case, it is CosmosDB.
3) Create a Dataset for the REST API and link it to the linked service created in #1.
4) Create a Dataset for the data store (in my case CosmosDB) and link it to the linked service created in #2.
5) In the pipeline, add a 'Copy data' activity with the REST dataset created in #3 as the source and the dataset created in #4 as the sink. In my case I also had to add a schema mapping to select the employees array from the API output and map its fields to my data store.
And voila, that's it. When I run the pipeline, it calls the REST API and saves the output in my DB with my desired mapping.
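If it helps to see what the schema mapping is doing, here is a rough Python sketch of the same idea outside ADF; the field names ("data", "employee_name", "employee_salary") are assumptions about the sample API's response shape rather than something taken from the pipeline:

```python
# Illustration only: what the Copy activity's schema mapping does, in Python.
# The "data", "employee_name" and "employee_salary" field names are assumptions
# about the sample API's response shape.
import requests

resp = requests.get("http://dummy.restapiexample.com/api/v1/employees", timeout=30)
resp.raise_for_status()
payload = resp.json()

# The collection reference points at the array of records; assumed to be "data" here.
records = payload.get("data", [])

# Each mapping entry pairs a source field with a sink column.
documents = [
    {
        "name": r.get("employee_name"),      # source field -> sink column "name"
        "salary": r.get("employee_salary"),  # source field -> sink column "salary"
    }
    for r in records
]
print(f"Would write {len(documents)} documents to the sink dataset")
```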

Related

Google Data Fusion Salesforce to BigQuery Pipeline, automatic way of managing schema updates in Salesforce

Hey, I am trying to create some batch jobs that read from a couple of Salesforce objects and push them to BQ. Every time the batch process runs, it truncates the table in BQ and pushes all the data in the SF object back into BQ. Is it possible for Google Data Fusion to automatically detect changes to an object in Salesforce (like adding a new column or changing the data type of a column) so that they are registered and pushed to BQ via Google Data Fusion?
For the SF side of the puzzle, you could look into https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_describeGlobal.htm and the If-Modified-Since header, which tells you whether the definition of the table(s) changed. That URL covers all tables in the org; alternatively, you can run table-specific metadata describe calls with https://developer.salesforce.com/docs/atlas.en-us.api_rest.meta/api_rest/resources_sobject_describe.htm
But I can't tell you how to use it in your job.
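If it helps, here is a rough Python sketch of the kind of call I mean. The instance URL, API version, and token are placeholders from your own org, and this is not Data Fusion-specific:

```python
# Rough sketch of polling describeGlobal with an If-Modified-Since header.
# INSTANCE_URL, API_VERSION and ACCESS_TOKEN are placeholders from your own org.
import requests

INSTANCE_URL = "https://yourInstance.my.salesforce.com"  # placeholder
API_VERSION = "v57.0"                                    # placeholder
ACCESS_TOKEN = "<token from your OAuth flow>"            # placeholder

resp = requests.get(
    f"{INSTANCE_URL}/services/data/{API_VERSION}/sobjects/",
    headers={
        "Authorization": f"Bearer {ACCESS_TOKEN}",
        # RFC 1123 date; the API answers 304 if no object definitions changed since then.
        "If-Modified-Since": "Mon, 01 Jan 2024 00:00:00 GMT",
    },
    timeout=30,
)

if resp.status_code == 304:
    print("No sObject definitions changed - skip the schema refresh")
else:
    print("Definitions changed - re-read the describe results")
```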
You can use #eyescream's answer as the condition or trigger for the update to BigQuery. You can push changes to BigQuery using Data Fusion's pre-built Salesforce Streaming Source plugin, which, as mentioned in this documentation,
tracks updates in Salesforce sObjects. Examples of sObjects are
opportunities, contacts, accounts, leads, any custom object, etc.
You can use this approach to automatically track changes and push them to BigQuery. You can also find the whole Salesforce Streaming Source configuration reference in this documentation, which Google's official documentation also points to.
However, if you want a more dynamic approach for your overall use case, you could instead use the integration of BigQuery with Salesforce. In that approach you will need to build your own code, in which you can use #eyescream's answer as the primary condition/trigger and then automatically push the update to your BigQuery schema.

Is there any command line client to do data entry in DHIS2?

I want to know whether there is any command-line client for doing data entry in DHIS2.
I found one, named dish (https://github.com/baosystems/dish2/), but it is only meant to simplify common tasks and is suited to batch metadata operations and system maintenance operations.
I want to enter data into data elements directly. Is that possible? If not, is there an alternative way to do it?
As far as I know, there are no command-line clients for doing data entry in DHIS2. There are, however, options to import data into DHIS2 using XML, JSON, or CSV formats. So one option is to create the data in one of these formats first, then use the API to import it.
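For example, here is a rough Python sketch of a JSON import through the Web API's dataValueSets endpoint. The server URL, credentials, and all UIDs are placeholders, and note that the values are submitted through a data set that contains the data element:

```python
# Rough sketch of importing aggregate data values via the DHIS2 Web API
# (/api/dataValueSets). The server URL, credentials and all UIDs are placeholders.
import requests

BASE_URL = "https://your-dhis2-server.org"  # placeholder
AUTH = ("admin", "password")                # placeholder credentials

payload = {
    "dataSet": "dataSetUID",      # the data set that contains the data element
    "completeDate": "2024-01-31",
    "period": "202401",
    "orgUnit": "orgUnitUID",
    "dataValues": [
        {"dataElement": "dataElementUID", "value": "10"},
    ],
}

resp = requests.post(f"{BASE_URL}/api/dataValueSets", json=payload, auth=AUTH, timeout=30)
resp.raise_for_status()
print("Import summary:", resp.text[:200])  # DHIS2 responds with an import summary
```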
When you say you want to enter data into data elements directly, I assume you are referring to actual data and not metadata.
There is no way to interact with the DHIS2 API to add data directly to a data element. The reason for this is that data elements are connected either to a data set or, if you are using the tracker models, to a program stage. A single data element can be connected to multiple data sets or program stages, so adding data directly to a data element wouldn't make sense.
You can however do data entry for a data element, but you need to go through either a data set or program stage that uses the data element.
What is your use-case for needing a command line client for this? Maybe I know of another solution that would help you.

How to add more data to be stored in the Jenkins REST API

To keep the question simple: I know that I can get some build information with https://jenkins_server/...///api/json|xml|python, and I get a lot of information for that build record.
However, I want to add more information to that build record, for example the Docker image created, or the tickets or files changed since the last build (to create a release note), etc. How do I do that?
For now, I use a script to create a JSON file as an artifact and read that JSON file to get this information, but it seems redundant if I could add more data to the Jenkins build object directly.
The Jenkins remote access API is designed to provide access to generic Jenkins-internal information, like build numbers, timestamps, fingerprints etc.
If you want to add your own data there, then you must extend Jenkins accordingly, e.g., by designing a plugin that advertises your (custom) information items as standard Jenkins-"internal" data. If you want to do that, you may want to have a look at the way fingerprint information is handled (I found that quite instructive).
However, I'd recommend that you stick with your current approach, and keep generic Jenkins-internal information separated from Job-specific data. It is less effort and clearly separates your own data from Jenkins' data.
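As a rough sketch of that separation (the job name, build number, artifact file name, and the "docker_image" key are hypothetical placeholders), a consumer could combine the two sources like this:

```python
# Sketch of the recommended split: generic build info from the Jenkins remote API,
# job-specific data from a JSON artifact the build archived. The server URL, job
# name, build number, artifact file name and the "docker_image" key are placeholders.
import requests

JENKINS = "https://jenkins_server"  # placeholder
JOB, BUILD = "my-job", 123          # placeholders
AUTH = ("user", "api-token")        # placeholder credentials

# Generic, Jenkins-internal data: number, timestamp, result, artifacts, ...
build = requests.get(f"{JENKINS}/job/{JOB}/{BUILD}/api/json", auth=AUTH, timeout=30).json()

# Your own job-specific data, archived by the build as e.g. build-info.json
extra = requests.get(
    f"{JENKINS}/job/{JOB}/{BUILD}/artifact/build-info.json", auth=AUTH, timeout=30
).json()

print(build["result"], extra.get("docker_image"))
```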

Merge two Endeca Servers (Endeca 3.1) into one, including their current data

Let me explain in more detail:
First: I'm running Endeca 3.1, so "Endeca Server" here refers to 3.0's Data Domain.
I'm required to use an Endeca Server currently present on Endeca (downloaded as a demo VM). All the info on it, including groups, attributes, and data, must be merged into our Endeca Server. (It can also be the other way around; I could merge my Endeca Server into this one.)
So far, I've tried to do the following:
1) Clone the Endeca Server.
2) Use the putCollection sconfig operation to create a collection on it with the same name I have on mine.
3) Load configurations using the LoadCollection & LoadAttributes graphs from the OEID POC Template 3.1, pointing to the new collection in the Configuration.xls file.
This is where I encounter an issue. The LoadAttributes graph gets a timeout message from the server's web service, and then the config WSDL becomes inaccessible for a while. I can't go beyond this point.
I've been able to load data into the collection, but I need to load the attributes first.
Thanks in advance for your replies.
Regards
There are a few techniques.
Have you tried exporting the data domain and then importing it?
You can use the endeca-cmd tools to export to a file and then import from that file. This would enable you to add two data stores into one server.
If you want to combine two data stores, then that is a different question.
The simplest approach in 3.1, if the data collections are small: extract them as CSV (via a data-table), convert to XLS, and add them via self-provisioning into separate collections within a single data store. If you are running in the VM, this is potentially the easiest approach.
This can also be done using Integrator.
You don't need to load the attributes unless you are using multi-value types. You can call against the conversation web service to extract data and then load it using 'bulk-load'. I would not worry too much about creating the attributes unless this becomes essential due to their type or complexity. If you cannot call against the conversation web service, then again extract as CSV and load using Integrator.
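If it helps, the CSV-to-XLS step can be as simple as the following Python/pandas sketch; the file names are placeholders, and the Endeca export and self-provisioning steps themselves are outside this snippet:

```python
# Sketch of the CSV-to-XLS conversion with pandas (requires openpyxl installed).
# File names are placeholders; the Endeca export/provisioning steps are not shown.
import pandas as pd

df = pd.read_csv("collection_export.csv")            # CSV extracted from the collection
df.to_excel("collection_export.xlsx", index=False)   # spreadsheet ready for self-provisioning
print(f"Converted {len(df)} rows")
```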

How to merge two content sources in SharePoint 2010?

In my SharePoint 2010 website, I added two content sources:
file system (shared folder)
BDC data (Line of Business Data)
I added the managed properties to map the metadata of the BDC data.
My search results are coming like this.
I would like to link the two content sources. My second content source has the file-related information (tab, category, fileno, case name).
I added the column and also altered the XSLT in the search results web part. The results are coming like below.
From the results, the third one (120) is coming from the database, so all the properties are mapped (caseid, casename, fileno, doctab, description).
But it's not mapped for the file system. The file system has a relationship with the table via the file name, and the path of the files also carries some information:
file://192.168.25.231/FolderName/CaseID/documenttab/filename
CaseId is the primary key for the table, which I added as the second content source.
How can I achieve this?
Hmm, it's difficult to add much more without seeing the environment. But here's plan B
Given you're using the BCS and want to display both unstructured content (the files) and application data that shares metadata with the files, you could try the following. It will require some coding knowledge: you can make connections between web parts in SharePoint Designer, but this will need Visual Studio.
create a custom search results web page, and use the standard core search results web part along with a separate data web part for displaying the application data
create a custom query box for entering the search query, probably best done with separate fields for the metadata - case ID, case name, etc. (You'd normally use a data filter web part, but that won't pass results through to the normal search results - you need code to run two queries)
format and pass the query to both the core search results web part, and the BCS data web part, to display items that match the query
That's probably as much as I can help with. The SharePoint section on MSDN should be the next port of call. Good luck!
This may be an overly simplistic explanation to keep the response as short as possible.
For your search results page, the best approach when also retrieving application data is to not present that information in the core search results web part. Exclude it from the default scope. Instead, use a federated search results web part added to the results page. You'll also need to create the corresponding federation location for the scope (easy to do), and you can then use XSLT to style the display of the results - application data needs to be presented differently to links to files and web pages.
Then a search for, say, the case ID will display all files containing that information in the core search results web part, and will display any matching application data in the federated results web part, with the different formatting applied. Note - there will be no connection between the two; the only relationship is that they both match the search query. It is possible to connect web parts to filter one based on the selected value in another, but that is an entirely different approach and not easily done using search results.