Web Scraping through Excel VBA

I need to fetch company addresses (cim) from the site http://www.ceginfo.hu/
Example Company Name: AB-KONTÍR Szolgáltató Bt.
I know how to do it using the WinHttp.WinHttpRequest object and Firebug.
But I cannot work out which URL I should send the request to.
When I analyse the requests and responses using Firebug, I get the following URL:
http://www.ceginfo.hu/company/search/4221638
I think 4221638 is the company ID here. But in my case I will only have the company name, and that is my problem.
So can anybody please tell me how I can find, using Firebug or any other tool, a URL that takes the company name as a parameter and that I can use in my VBA code?
Thanks in advance!

"So can anybody please tell me where can I get URL using firebug or any other tool using which I can track the URL with Company Name as parameter which I can use in my VBA code."
No. Unless there is a publicly available database (I would suggest calling them, if you can) or an API that allows programmatic access, the only way to arrive at this URL slug is by executing the search.
Further, the slug is not as relevant as you think. If you search for simply "Kontir", this is the resulting page, with many results:
http://www.ceginfo.hu/company/search/4222407
You're going to have to automate the search: pass the criteria to the web page, execute the button click and/or HTTP POST, and then parse the result(s). For the example company name there is only one result. But, as in my example above, some queries may return multiple matches, and you will need a way of dealing with these or ignoring them.
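As a rough illustration of the "dealing with multiple matches" step, here is a minimal JavaScript sketch. It assumes the search results have already been parsed into a list of objects; the { name, url } shape, the helper names and the sample data are all hypothetical and not the site's real format. It only accepts a result when exactly one name matches the query after ignoring case and accents.

```javascript
// Hypothetical helper: normalise a company name so that case, accents
// and repeated spaces do not prevent a match.
function normalizeName(name) {
  return name
    .normalize("NFD")                 // split accented letters into base + mark
    .replace(/[\u0300-\u036f]/g, "")  // drop the combining marks
    .toLowerCase()
    .replace(/\s+/g, " ")
    .trim();
}

// Hypothetical helper: `results` is assumed to be an array of { name, url }
// objects parsed from the search results page.
function pickCompany(query, results) {
  const wanted = normalizeName(query);
  const exact = results.filter((r) => normalizeName(r.name) === wanted);
  if (exact.length === 1) return exact[0]; // unambiguous match
  return null;                             // none or several: caller decides
}

// Made-up sample data for illustration only.
const sampleResults = [
  { name: "AB-KONTIR SZOLGALTATO BT.", url: "/result-a" },
  { name: "Kontir Kft.", url: "/result-b" },
];
console.log(pickCompany("AB-KONTÍR Szolgáltató Bt.", sampleResults));
```

Returning null for ambiguous queries keeps the decision (skip, log, or ask the user) in the calling code.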

How to write filter {key} for a 'get items as xlsx' request via Podio API

I'll get this out of the way first: I'm an amateur programmer (at best). I have some knowledge of how APIs work, but little to no experience working with the Podio API directly (i.e. I use Zapier/GlobiFlow a lot and don't write any PHP/Ruby). I'm sure other people can figure this out from the API documentation, but I can't. So I'm really hoping someone can clarify and give some more detailed instructions.
My Overall objective:
I frequently export Podio files as xlsx from the Podio front end. My team and I use these for regular data analysis tasks in Excel. I would like to make this process easier by automating the step of getting an updated Podio export into Excel. My plan is to do this via Excel VBA. I understand from other searching that it is possible to send an HTTP request using VBA, so I want to make sure I understand what I need to send to the Podio API to get what I need. How to write the HTTP request in Excel VBA is outside the intended scope of this question (though I'd accept any help on this!)
What I've tried so far:
I know that 'get items as xlsx' is part of the podio API: https://developers.podio.com/doc/items/get-items-as-xlsx-63233
However, I cannot get this to work in the sandbox environment on that page, so I cannot work out a valid request URL. I get this message: 'Invalid filtering key', because I have no idea how to fill in that field. The information on that page is not clear on this, nor is it evident on the referenced 'views' page. There are no examples to follow!
I don't even want to do any filtering. I just want to get ALL items in the app. Alternatively, I can give it a pre-existing view_id, but that doesn't seem to work either without a {key}.
I realise this is probably dead simple. Please help a noob? :)
Unfortunately, the interactive API sandbox does not behave appropriately for this particular endpoint. For filtering, this API endpoint expects query string parameters where the field-value pairs consist of integer field IDs and the allowed values for each field. Filtering by fields is totally optional. It looks like the sandbox page isn't built for this kind of operation with dynamic query string field names; the {key} field on that page is a placeholder for whichever field IDs you would use for filtering.
If you want to experiment with this endpoint, I would encourage you to try another dedicated HTTP client first. I was able to get this simple example working with the command-line program wget:
wget --header="Authorization:OAuth2 $MY_SECRET_TOKEN" \
--content-disposition \
"https://api.podio.com/item/app/16476850/xlsx/"
In this case, wget downloaded an Excel file containing all the items in my app without any filtering applied. The additional --content-disposition argument tells wget to save the output as a file with a name using the information in the server's Content-Disposition response header.
With a filter applied:
wget --header="Authorization:OAuth2 $MY_SECRET_TOKEN" \
--content-disposition \
"https://api.podio.com/item/app/16476850/xlsx/?130654431=galaxy"
In this case, the downloaded file contained only the items where field ID 130654431 (which is a category field) contains the value galaxy.
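To make the query string format concrete, here is a small JavaScript sketch that builds the same two URLs as the wget examples above. The function name podioXlsxUrl is made up for illustration; the app ID and field ID are the ones from my examples, so substitute your own. It only builds the URL. You would still need to send the request with the Authorization header as shown above.

```javascript
// Hypothetical helper: build the "get items as xlsx" URL for an app.
// `filters` maps integer field IDs to the value to filter on (optional).
function podioXlsxUrl(appId, filters = {}) {
  const url = new URL(`https://api.podio.com/item/app/${appId}/xlsx/`);
  for (const [fieldId, value] of Object.entries(filters)) {
    // Each filter becomes one query string pair: <field id>=<value>
    url.searchParams.append(fieldId, value);
  }
  return url.toString();
}

// No filtering: all items in the app.
console.log(podioXlsxUrl(16476850));
// Filter on category field 130654431.
console.log(podioXlsxUrl(16476850, { 130654431: "galaxy" }));
```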

Generate a URL to Rally objects using FormattedID?

I would like to write web pages that have links to Rally issues (Test Cases, Defects, etc). I would like to be able to generate a URL with the FormattedID. Is this possible? Or do I really need the objectID? For example:
http://rally1.rallydev.com/363953481d/detail/testcase/TC1665
(or something like that, instead of the cryptic object id)
The following allows users to go directly to the detail page of a work product without having to know the Object ID:
https://rally1.rallydev.com/#/search?keywords=US1234
This relies on a feature of Rally's search functionality and isn't officially supported, so the above URL isn't guaranteed to work forever. However, it's a decent way to use Formatted IDs as a workaround.
Searching by just keyword will give you the item with that ID and related items (e.g. other items that mention the desired ID in their name). Sometimes this is fine, sometimes not.
To search for DE75700 and DE72760 only, and no others, use:
https://rally1.rallydev.com/#/search?keywords=FormattedID%3ADE75700%20FormattedID%3ADE72760
This is equivalent to manually typing
FormattedID:DE75700 FormattedID:DE72760
in the Rally search box.
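If you are generating these links from a web page, the encoding can be done for you. A minimal JavaScript sketch, assuming the unofficial search URL above keeps working (rallySearchUrl is just a name I made up):

```javascript
// Hypothetical helper: build a Rally search URL that matches only the
// given Formatted IDs, using the FormattedID: search prefix shown above.
function rallySearchUrl(formattedIds) {
  const query = formattedIds.map((id) => `FormattedID:${id}`).join(" ");
  // encodeURIComponent turns ":" into %3A and spaces into %20.
  return "https://rally1.rallydev.com/#/search?keywords=" + encodeURIComponent(query);
}

console.log(rallySearchUrl(["DE75700", "DE72760"]));
```

For the two IDs above this produces the same URL as the hand-written one.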
As a corollary to the main answer, I have defined a Chrome browser bookmark which takes me right to any Rally item by its ID.
The URL for this bookmark in full is:
javascript:(function(){var id = prompt("Enter ID:"); if (id) { window.location = "https://rally1.rallydev.com/#/search?keywords=" + encodeURIComponent(id); }})();
When this bookmark is activated, it prompts you to enter an ID.
I find this to be a huge time saver.
Thanks to "Displaying a prompt from javascript Chrome bookmark".

how to correct spelling mistakes in Google custom API

I am using Google's Custom Search API. I make an HTTP request to a URL that looks like this:
https://www.googleapis.com/customsearch/v1?key=<my-key>&cref=&num=10&q=how+can+i+do+htis
If you search for "how can i do htis" on Google, you are told "Showing results for how can i do this" and given some results (call them result set A).
But if you use the API to search for the misspelled string, you get different results from those in A. Searching with the correctly spelled string gives you result set A, which matches the ordinary Google search service.
Is there a way to search directly using the suggested string? I want to use the API because I can't afford to implement a spell checker myself that can also correct people's names and everything else.
I think what you want is possible using Google's spelling suggestions. These are part of the XML results returned for your query.
See API here.
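A minimal JavaScript sketch of the idea, assuming you use the JSON variant of the API and that the parsed response carries the suggestion in a spelling.correctedQuery field (check this against the response you actually get). The function name effectiveQuery is made up. You would run the search once, pick the corrected string if there is one, and search again with it.

```javascript
// Hypothetical helper: choose the query to re-run the search with.
// `response` is assumed to be the parsed JSON body of the first search;
// the spelling.correctedQuery field is an assumption to verify.
function effectiveQuery(originalQuery, response) {
  const corrected =
    response && response.spelling && response.spelling.correctedQuery;
  // Fall back to what the user typed when no suggestion is present.
  return corrected || originalQuery;
}

// Made-up response fragment for illustration only.
const firstResponse = { spelling: { correctedQuery: "how can i do this" } };
console.log(effectiveQuery("how can i do htis", firstResponse));
```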

Pass a string to various websites

I have a product code which I need to enter into 6 different websites in order to pull different information about the product from each of them. Is there a way to save this product code in some sort of variable, pass it into each website's input box, and have all the information returned from each one automatically? I really have no idea where to start with this, so if anyone can brainstorm a few ideas to get me moving, that would be great.
To get what you are planning:
You need a script that visits the specified website,
then, on that website, gets the input element by its tag name.
For instance, in JavaScript:
var textBox = document.getElementsByTagName("input")[0];
This gives you a reference to the first text field on the page. You can then enter the text as follows:
textBox.value = "any string";
Once you have done this, you will have to retrieve the results from the page, based on the website's layout.
If you can describe your task in more detail, you will get a better response.
Assuming you're talking about using an ordinary GUI browser, the best you can do is copy it to your system clipboard, and paste it into each page on the browser.
If you're talking about a programmatic web-access like wget or curl, it depends on what language you are writing your script in.
You have to create a web request for each website and find a way to parse the response, which will be HTML.
Have a look at HttpWebRequest; you can find lots of examples on the internet that show how to create an HTTP POST to a website:
http://www.terminally-incoherent.com/blog/2008/05/05/send-a-https-post-request-with-c/
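As a starting point for the "one request per website" idea, here is a minimal JavaScript sketch that keeps the product code in one variable and produces one lookup URL per site. The site addresses and the {code} placeholder convention are made up for illustration. Real sites may need a POST or a form submission instead, and you still have to fetch and parse each response.

```javascript
// Hypothetical URL templates; replace with the real search URLs of your sites.
const SITE_TEMPLATES = [
  "https://site-one.example/search?q={code}",
  "https://site-two.example/products/{code}",
];

// Build one lookup URL per site for the given product code.
function buildLookupUrls(productCode, templates) {
  const encoded = encodeURIComponent(productCode); // make the code URL-safe
  return templates.map((template) => template.replace("{code}", () => encoded));
}

console.log(buildLookupUrls("AB 12/3", SITE_TEMPLATES));
```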

How to make a Google Maps address - like lookup

You've probably all seen the maps.google.com.au address lookup. Start typing into the text box and your address auto completes in the list before you've finished. It also bolds the matching sections of the text that link to what you are typing.
I've used both the JavaScript API of Maps and the HTTP API. The geocoding seems to do something decent with the matches, but I'm not entirely sure how one would go about getting this to work.
Does anyone have a tutorial or a quick five-step process that they would recommend I follow to get this feature going?
The feature you are looking for is "find as you type" or "suggest as you type" or AJAX live search.
Getting this functionality via the Maps API is possible, as with any other find-as-you-type solution. For each key typed into your search box, you send a request to the server and see what matches the text entered so far. The problem is that you can only send so many requests to Google before you get a 620 (too many requests) error. A find-as-you-type mechanism is usually easier when you have your own small DB to query: it is faster and you won't have problems with too many requests.
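To illustrate the "own small DB" variant, here is a minimal JavaScript sketch that matches the typed text against a local list of addresses and bolds the matching part, as in the question. The function names and the sample addresses are made up. In a real page you would call suggest on each keystroke and render the returned HTML strings in a dropdown.

```javascript
// Escape text before inserting it into HTML.
function escapeHtml(text) {
  return text.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Hypothetical helper: return up to `limit` addresses containing the typed
// text (case-insensitive), with the matching part wrapped in <b>...</b>.
function suggest(typed, addresses, limit = 5) {
  const needle = typed.trim().toLowerCase();
  if (needle === "") return [];
  const hits = [];
  for (const address of addresses) {
    const at = address.toLowerCase().indexOf(needle);
    if (at === -1) continue; // no match in this address
    const end = at + needle.length;
    hits.push(
      escapeHtml(address.slice(0, at)) +
        "<b>" + escapeHtml(address.slice(at, end)) + "</b>" +
        escapeHtml(address.slice(end))
    );
    if (hits.length === limit) break;
  }
  return hits;
}

// Made-up sample data for illustration only.
const addresses = ["12 George St, Sydney", "5 King St, Melbourne"];
console.log(suggest("geo", addresses));
```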
Some links with tutorials:
Javascript Autocomplete Combobox - find as you type
Suggest as you type
AJAX Live Search