Bulk extract with Authenticated Connector (import.io) - API

I am new to import.io and this forum.
I am trying to extract information from a target database where I have to run a query with an input. With help from support, I successfully created the authenticated connector. When I manually enter the inputs in the UI one at a time, it fetches the data properly.
The problem is that I have more than 10,000 inputs to run, so it has to be done as a bulk extraction. import.io support told me that they do not have this feature within their UI and suggested using their API, documented here: http://api.docs.import.io/#!/Query_Methods/queryPost.
Could anyone walk me through how to make use of this? I just need a working script that takes multiple string lines as inputs, runs the connector that I built, and posts the results. I am not very familiar with this kind of technology, but I am very willing to learn.
Thanks all in advance!

I would be happy to walk you through a bit of an intro. It will be a bit basic, though, since I don't know your specific use case.
Yes, support was correct. You will need to use the POST query in order to pass your authentication credentials as inputs.
I will break this query down into steps. Essentially, our API docs are just a simple UI: you pass in your credentials, and then you can generate an API query.
ID - This is the GUID of your connector. This information can be found at the end of the URL, like this: https://import.io/data/mine/?tag=CONNECTOR&id=33f4e828-25ce-40c4-948c-9b734c70d1ab
Query - This is where you put the inputs for your connector to execute. Be sure to keep this in structured JSON, or it will return errors when you query.
Once you have successfully entered that information you will query the API.
This will give you the request URL that you need to query the API.
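If you want to run this for all 10,000+ inputs, you can script that same POST call. Here is a rough Python sketch of the idea; the endpoint URL, API key parameter, and the "searchterm" input field name below are placeholders I made up, so copy the exact request URL and JSON body that the API docs page generates for your connector and substitute them in:

import json
import requests

# Placeholders - take the real values from the request URL the API docs page generates for you.
API_KEY = "YOUR_API_KEY"
CONNECTOR_ID = "33f4e828-25ce-40c4-948c-9b734c70d1ab"  # the GUID from your connector URL
QUERY_URL = "https://api.import.io/store/connector/{0}/_query".format(CONNECTOR_ID)

def run_connector(value):
    # POST one input to the connector and return the JSON result.
    body = {"input": {"searchterm": value}}  # the input field name depends on your connector
    resp = requests.post(QUERY_URL, params={"_apikey": API_KEY}, json=body)
    resp.raise_for_status()
    return resp.json()

# Read the inputs one per line and query the connector for each of them,
# writing one JSON result per line to the output file.
with open("inputs.txt") as inputs, open("results.jsonl", "w") as out:
    for line in inputs:
        value = line.strip()
        if value:
            out.write(json.dumps(run_connector(value)) + "\n")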
If you have any more questions, just let me know.
Thanks,
Meg

Related

Informatica using URI-based REST API

I'm having real trouble getting Informatica PowerCenter or Developer to call a URI-based REST API, even for something simple (JIRA's API). Basically I want to call JIRA's worklog REST API, which is a different URL for each issue id in a list, and write the results to our DB.
https://docs.atlassian.com/jira/REST/6.2/
/rest/api/2/issue/{issueIdOrKey}/worklog
Informatica PowerCenter supports only the HTTP transformation, which does only a simple GET. Unfortunately, the latest version is still stuck in the 'old' query-type URL building, where inputs are appended as search strings. E.g. if I have a "key" input field with value "ABC-1" and the URL is jira/rest/api/2/search, it actually builds the URL on the fly into jira/rest/api/2/search?key=ABC-1. While some of JIRA's API works this way, some of it uses the URI style, e.g. jira/rest/api/2/ABC-1/worklog, which requires embedding the value into the URI. There's no way I can get this to work:
if I do jira/rest/api/$key/worklog, it still converts the URI into jira/rest/api/$key/worklog/?key=ABC-1, so $key does not get replaced
even if I pre-build the URI outside the mapping, it's not feasible, as the URI needs to be dynamic across the list of JIRA keys; and anyway, because it appends ? at the end, JIRA throws an error (? is reserved for this API)
the HTTP transformation does not support NTLMv2 authentication, which our company's JIRA instance may upgrade to shortly
the last resort is to use a Java transformation, in which case Informatica adds little value. This also means I need to somehow pass in the JIRA user password for authentication, which is a separate challenge (versus just storing it as an HTTP connection)
Informatica Developer supports the REST Web Consumer Transformation, but it has similar limitations, building only query-type URLs. Even worse, I can't dynamically build the URL at all, since it's fixed to the HTTP connection object's URL.
Am I straight outta luck?
I saw this question and would like to answer it here. I can only give a few points, and that might not explain things properly, so I am posting a link to a blog where the task of reading a REST API from Informatica is described step by step, with a video tutorial. Some examples are there as well. Feel free to visit:
https://zappysys.com/blog/read-json-informatica-import-rest-api-json-file/
Hope it helps.
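As a rough illustration of the URI-embedding pattern from the question (a sketch only, not Informatica-specific; the base URL, credentials, and database step are placeholders), in Python it would look something like this, building the per-issue worklog URL instead of appending the key as a query string:

import requests
from requests.auth import HTTPBasicAuth

# Placeholder values - use your own JIRA base URL and credentials
# (an instance on NTLMv2 would need a different auth mechanism).
JIRA_BASE = "https://jira.example.com"
AUTH = HTTPBasicAuth("user", "password")

issue_keys = ["ABC-1", "ABC-2", "ABC-3"]

for key in issue_keys:
    # Embed the key in the path (URI style), not as ?key=... (query-string style).
    url = "{0}/rest/api/2/issue/{1}/worklog".format(JIRA_BASE, key)
    resp = requests.get(url, auth=AUTH)
    resp.raise_for_status()
    worklogs = resp.json().get("worklogs", [])
    # ... write the worklogs to the DB here ...
    print(key, len(worklogs))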

How to write filter {key} for a 'get items as xlsx' request via Podio API

I'll get this out of the way first: I'm an amateur programmer (at best)... I have some knowledge of how APIs work, but little to no experience manipulating the Podio API directly (i.e. I use Zapier/GlobiFlow a lot and don't write any PHP/Ruby). I'm sure other people can figure this out via the API documentation, but I can't. So I'm really hoping someone can help clarify and give some more detailed instruction.
My overall objective:
I frequently export Podio files as xlsx from the Podio front end. My team and I use these for regular data analysis tasks in Excel. I would like to make this process easier by automating the step of getting an updated Podio export into Excel. My plan is to do this via Excel VBA. I understand from other searching that it is possible to send an HTTP request using VBA, so I want to make sure I understand what I need to send to the Podio API to get what I need. How to write the HTTP request in Excel VBA is outside the intended scope of this question (though I'd accept any help on this!).
What I've tried so far:
I know that 'get items as xlsx' is part of the podio API: https://developers.podio.com/doc/items/get-items-as-xlsx-63233
However, I cannot seem to get this to work in the sandbox environment on that page so that I can figure out a valid request URL. I get the message 'Invalid filtering key' because I have no idea how to fill in that field. The information on that page is not clear on this, nor is it evident on the referenced 'views' page. There are no examples to follow!
I don't even want to do any filtering. I just want to get ALL items in the app. Or I can give it a pre-existing view_id, but that doesn't seem to work either without a {key}.
I realise this is probably dead simple. Please help a noob? :)
Unfortunately, the interactive API sandbox does not behave appropriately for this particular endpoint. For filtering, this endpoint expects query-string parameters whose field-value pairs consist of integer field IDs and the allowed values for each field. Filtering by fields is entirely optional. The sandbox page isn't built for this kind of operation with dynamic query-string field names; the {key} field you see on that page is just a placeholder for whatever field IDs you would use for filtering.
If you want to experiment with this endpoint, I would encourage you to try another dedicated HTTP client first. I was able to get this simple example working with the command-line program wget:
wget --header="Authorization:OAuth2 $MY_SECRET_TOKEN" \
--content-disposition \
"https://api.podio.com/item/app/16476850/xlsx/"
In this case, wget downloaded an Excel file containing all the items in my app without any filtering applied. The additional --content-disposition argument tells wget to save the output as a file with a name using the information in the server's Content-Disposition response header.
With a filter applied:
wget --header="Authorization:OAuth2 $MY_SECRET_TOKEN" \
--content-disposition \
"https://api.podio.com/item/app/16476850/xlsx/?130654431=galaxy"
In this case, the downloaded file filtered the results to items where field ID 130654431 (which is a category field) contains the value galaxy.
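If you eventually move this out of the command line and into a script, the same call is just an HTTP GET with the OAuth2 header. Here is a rough Python equivalent of the wget examples above (the app ID and token are placeholders, and this is only a sketch of the idea, not the VBA code itself):

import requests

TOKEN = "MY_SECRET_TOKEN"  # your OAuth2 access token
APP_ID = 16476850          # the app ID used in the examples above

resp = requests.get(
    "https://api.podio.com/item/app/{0}/xlsx/".format(APP_ID),
    headers={"Authorization": "OAuth2 " + TOKEN},
    # Optional filtering: integer field IDs as query-string keys, e.g.
    # params={"130654431": "galaxy"},
)
resp.raise_for_status()

# Save the returned spreadsheet to disk.
with open("podio_items.xlsx", "wb") as f:
    f.write(resp.content)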

Backend database used in the API

By going through this API documentation page, is it possible to tell which database is being used in the backend?
Zomato API
MySQL would require a PHP file on the server to handle the requests, run the queries, pack the data in JSON format, and then send it back to the device. But in this case the parameters are passed to .json files. Please advise.
There is no way to "see through" to whatever the backend service actually uses to provide you with the information you query for. Are you sure you want to continue using this product? The site notes that Zomato will no longer be available to individuals, and that your API key will be disabled if you don't use it monthly.
I haven't read the specs for that particular API. But in general, is it possible to tell what database is being used on the back end by studying an API? No. That's the whole point of an API: It's supposed to shield the API-user from implementation details.
It's probably true that in many cases you could make reasonable guesses about what tools are being used on the back end. Like if you see that the API gives you a syntax for doing comparisons that looks exactly like the proprietary compare function used in Foobar SQL and not found in any other database product, that would be a strong clue. But even something like that wouldn't be proof. Maybe originally they were using Foobar SQL, then they switched to another database, but to maintain compatibility they wrote code to translate the Foobar SQL compare to standard SQL syntax.

List of all companies on AngelList via API

https://angel.co/api/spec/startups
What would be the best approach for hitting every company listed on AngelList? My first guess would be to query every ID up to 250k, the number of companies on AngelList, using this endpoint: https://api.angel.co/1/startups/45435
There surely has to be a better way of doing this, though.
Yes, it is possible via their API, and the endpoint you mentioned in your question is the correct one. I have written a PHP component to achieve this. You can use this exporter application to download the startup data for each country into a CSV file: AngelList Data Exporter
I hope this helps you.
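If you would rather script the brute-force approach described in the question yourself, it would look roughly like this (a sketch only; it assumes the endpoint is still reachable and ignores rate limiting, which you would certainly need to handle over 250k requests):

import requests

startups = []
for startup_id in range(1, 250001):  # IDs up to ~250k, as described in the question
    url = "https://api.angel.co/1/startups/{0}".format(startup_id)
    resp = requests.get(url)
    if resp.status_code == 200:
        startups.append(resp.json())
    # Non-200 responses (missing or deleted IDs) are simply skipped.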
Angel.co does not expose its API anymore, so you have to parse the website to get any data.
Also, a quick Google search will turn up a few websites that offer various datasets taken from the angel.co website.

Web Scraping through Excel VBA

I need to fetch company addresses (cím) from the site http://www.ceginfo.hu/
Example Company Name: AB-KONTÍR Szolgáltató Bt.
I know how to do this using the WinHttp.WinHttpRequest object and Firebug.
But I am not able to work out which URL I should send the request to.
When I analyse the requests/responses using Firebug, I get the following URL:
http://www.ceginfo.hu/company/search/4221638
I think 4221638 is a company ID here. But in my case I will only have the company name, and that is my problem.
So can anybody please tell me how, using Firebug or any other tool, I can find a URL that takes the company name as a parameter, which I can then use in my VBA code?
Thanks in advance!
So can anybody please tell me how, using Firebug or any other tool, I can find a URL that takes the company name as a parameter, which I can then use in my VBA code?
No. Unless there is a publicly available database (I would suggest calling them, if you can) or an API that allows for programmatic access, the only way to arrive at this link slug is by executing the search.
Further, the post slug is not as relevant as you think. If you search for simply "Kontir", this is the resulting page -- with many results:
http://www.ceginfo.hu/company/search/4222407
You're going to have to automate the "search": pass the criteria to the web page, execute the button click and/or HTTP POST, and then parse the result(s). For the example company name there is only one result, but as my example above shows, there may be multiple matches for some queries, and you will need a way of dealing with those or ignoring them.
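As a rough illustration of that approach in script form (a sketch only - the search URL and form field name below are hypothetical, so inspect the real form action and parameter name with Firebug before using them):

import re
import requests

company_name = "AB-KONTÍR Szolgáltató Bt."

# Hypothetical form action and field name - watch the request the search page
# actually sends in Firebug and substitute the real URL and parameter here.
resp = requests.post(
    "http://www.ceginfo.hu/company/search",
    data={"q": company_name},
)
resp.raise_for_status()

# The response is an HTML results page; pull out the company links (and later
# the addresses), remembering there may be zero, one, or many matches.
matches = re.findall(r'href="(/company/[^"]+)"', resp.text)
print(matches)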