Question on Google Maps integration - API

I am working on a web application where I need to deliver products to houses in my country.
All the street names and neighborhoods are present in Google Maps. I want to know if there is any way to get all the street data (street name, region) from Google Maps into one single file to load into my database.
That way people could easily find their street with a JavaScript auto-suggest option, and I could then calculate the cost people have to pay from the delivery distance.
Or is there another way to use the Google Maps data in my web application?
PS: Sorry for this not being a programming question. If someone knows another place on Stack Exchange where this question would be better answered, the post can be relocated.

Getting everything into a database is not something Google is going to give you. It has taken them a lot of effort to build it, and they want some return.
You do have the option of working with GeoNames, where you can either download the database or use a web service.
Alternatively, you could access Google's database using the Google Geocoding API.
I would recommend working with the Google version, as it is much more likely to be up to date.
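For what it's worth, the geocoding call itself is simple; here is a minimal sketch in Python (using the requests library; YOUR_API_KEY is a placeholder, and the endpoint and fields follow Google's documented Geocoding API JSON output):

```python
import requests

def geocode(address, api_key):
    """Resolve a street address to (formatted address, lat, lng)."""
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/geocode/json",
        params={"address": address, "key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    data = resp.json()
    if data["status"] != "OK":
        return None  # e.g. ZERO_RESULTS or OVER_QUERY_LIMIT
    result = data["results"][0]
    loc = result["geometry"]["location"]
    return result["formatted_address"], loc["lat"], loc["lng"]

print(geocode("1600 Amphitheatre Parkway, Mountain View, CA", "YOUR_API_KEY"))
```

Note that Google's terms generally require you to use geocoding results in conjunction with a Google map, rather than bulk-loading them into your own database.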

If you are looking for a way to calculate the driving distance from one address to another so you can calculate costs, then I recommend you check out this article. Actually, I'm going to do exactly that on my current project. The problem is that we do not have a specific addressing system in my country, so I'll have to use approximate estimates (within an area around the address, that is...).
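If you do go down that road, Google's Distance Matrix API is one way to get a driving distance between two addresses. A rough sketch in Python (the addresses, API key, and per-km rate are placeholders):

```python
import requests

def driving_distance_km(origin, destination, api_key):
    """Driving distance between two addresses, in kilometres."""
    resp = requests.get(
        "https://maps.googleapis.com/maps/api/distancematrix/json",
        params={"origins": origin, "destinations": destination,
                "mode": "driving", "key": api_key},
        timeout=10,
    )
    resp.raise_for_status()
    element = resp.json()["rows"][0]["elements"][0]
    if element["status"] != "OK":
        return None  # address could not be matched or routed
    return element["distance"]["value"] / 1000.0  # metres -> km

km = driving_distance_km("Warehouse Street 1, Mytown",
                         "Customer Avenue 9, Mytown", "YOUR_API_KEY")
if km is not None:
    print(f"{km:.1f} km, delivery cost: {km * 0.50:.2f}")  # assumed rate/km
```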

Related

How do APIs work and get their data?

First, I'm sorry if this is not the right place to ask this question, but I think it's related to programming.
I'm going to create a website, and this website needs soccer data like scores, players, etc. I searched the net and found a lot of APIs that you have to pay for.
But I'm really confused about how these APIs work. Are there companies with employees who watch the games and add the data manually, or do they use some other method? And could I create my own API? If so, how would I get the data?

Find the DBpedia URI for a given place

I need your help with the following situation.
I have a local relational database that contains information about several places in a city. These places could be any kind of attraction: a museum, a cathedral, or even a square.
As an example, I have information about "Square Victoria" (https://en.wikipedia.org/wiki/Victoria_Square,_Montreal).
A simple Google search gave me the Wikipedia URL above, but I want to be able to find it programmatically.
For each place in the database I also have its category (square, museum, church, ...). These categories are local only and do not match any standardized categorization.
My goal is to improve this database by associating each place with its DBpedia URI.
My question is: what is the best way to do that? I have some theoretical background in Semantic Web technologies, but I don't yet have the practical skills to work out how.
More specific questions:
Is it possible to determine the DBpedia URI using SPARQL only?
If it is not possible with SPARQL only, what other technologies would I need to accomplish this?
Thank you
First of all I would recommend, if you have not done so yet, having a look at Wikidata. This project is a semantic extension of Wikipedia, but contrary to DBpedia the data is not extracted from Wikipedia; it is created by contributors, and therefore tends (or will tend, as the project is still growing) to be more reliable.
The service offers many ways to access the data (including a SPARQL endpoint), and its main advantage is that the underlying software is MediaWiki, the same software used for Wikipedia and other Wikimedia Foundation projects. The MediaWiki API offers an OpenSearch option that should allow you to search more efficiently than SPARQL queries.
Putting everything together, I think it is worth having a look at the Wikidata + Wikipedia APIs to get pivot data with which to align your local database; a sketch of that approach follows.
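As a rough illustration (Python with requests; the two calls are standard MediaWiki API actions on wikidata.org, and the last step relies on the convention that DBpedia resource URIs mirror English Wikipedia article titles):

```python
import requests

API = "https://www.wikidata.org/w/api.php"

def dbpedia_uri(place_name):
    """Best-effort lookup: place name -> DBpedia resource URI."""
    # 1. Search Wikidata for the place name.
    hits = requests.get(API, params={
        "action": "wbsearchentities", "search": place_name,
        "language": "en", "format": "json",
    }, timeout=10).json()["search"]
    if not hits:
        return None
    # 2. Fetch the English Wikipedia sitelink of the top hit.
    entity_id = hits[0]["id"]
    entities = requests.get(API, params={
        "action": "wbgetentities", "ids": entity_id,
        "props": "sitelinks", "format": "json",
    }, timeout=10).json()["entities"]
    sitelinks = entities[entity_id].get("sitelinks", {})
    if "enwiki" not in sitelinks:
        return None
    # 3. DBpedia URIs follow the Wikipedia title, with underscores.
    title = sitelinks["enwiki"]["title"]
    return "http://dbpedia.org/resource/" + title.replace(" ", "_")

print(dbpedia_uri("Victoria Square, Montreal"))
```

You would still want to sanity-check the top hit (for instance against your local category), since the first search result is not guaranteed to be the right place.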
Not a direct answer, but I hope it helps.

What free/paid search APIs allow programmatic querying and caching/storage of the resulting data?

If you've done any serious research into search APIs, you know that most of them come with a huge slew of TOS/TOU restrictions that make them nearly impossible to use in anything but the most inane applications.
Bing's 2.0 API, Yahoo Search BOSS, Google Places, Google AJAX Search (dead), et al. are far too restrictive for us. I need to run a finite and relatively small number of queries (perhaps 500k), one time only, storing specific data from the results for use within our application.
For example, we need to match up business names with their target websites (we have written the algorithm to make a 'best guess' from a set of results if necessary; we just need a vanilla result set). Also, we need to match an address to the company in question.
Unfortunately, I can find ZERO search APIs that will allow us to fire off queries in a programmatic, non-user-initiated manner.
We're even quite eager to give someone cold, hard cash for access to this kind of data; Google, Bing, Yahoo, and the others simply don't seem to want our money (as evidenced by their TOSes)...
Any thoughts?
A freely accessible index of 5 billion web pages, their page rank, their link graphs and other metadata, hosted on Amazon EC2.
http://commoncrawl.org/
Their Terms of Service (or TOU) are pretty reasonable and unrestricted too:
http://commoncrawl.org/about/terms-of-use/
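For a quick illustration, the Common Crawl index can be queried over HTTP through its CDX index server; a minimal sketch in Python (the crawl label CC-MAIN-2023-50 is just an example -- pick a current one from the crawl list on commoncrawl.org):

```python
import json
import requests

INDEX = "https://index.commoncrawl.org/CC-MAIN-2023-50-index"

resp = requests.get(INDEX,
                    params={"url": "example.com/*", "output": "json"},
                    timeout=30)
resp.raise_for_status()
for line in resp.text.splitlines():
    record = json.loads(line)  # one JSON record per capture
    # filename/offset/length point into a WARC archive for the full page.
    print(record["url"], record["timestamp"], record["filename"])
```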
If you know some Visual Basic, I'd suggest playing around with Bing Ad Intelligence. It's a free Excel plugin, and all you need to use it is a free Microsoft account.
The query limit is 20,000 words per query. You can get information on Clicks, Impressions, CTR, CPC, Average Bid and Total Cost. The query limit is a little lower if you use the more advanced keyword research features.

Constructing Intersections from Google Maps API

Problem:
I am trying to reverse geocode a lat/long into the closest street intersection using the Google Maps API V3. Also, for now, this doesn't have to be super accurate -- I am just trying to anonymize an address, as opposed to providing directions.
I have seen that the geocoding results data contains an address component type of "intersection", but this doesn't seem to be consistent at all in the returned results -- and is more often than not blank.
I have also done some looking on SO for the best way to construct this barring getting it from Google directly, and the closest I have found is: How can I find the nearest intersection via the Google Maps API?, which doesn't really resolve my issue. In light of this I have come up with my own solution, and would like some opinions, optimizations, constructive criticism, or other options entirely.
My Tentative Solution:
After playing around with the API, I decided to give the following algorithm a shot (just for context, this is written in C# within a console app):
1. I take an address and resolve it into a lat/long.
2. I then add or subtract a certain amount of latitude or longitude from the coordinate -- on the order of a city block (a distance which is adjusted for your latitude) -- and get walking directions between the two points. I do this for up to all four directions: the first modification keeps the latitude the same but subtracts some longitude, the next keeps the latitude and adds some longitude, and so on.
3. After getting the directions, I parse the results and check the start and end addresses. If they are different, I pull out the street names and treat them as an "intersection" (even though this sometimes yields parallel streets -- again, I'm just trying to get a ballpark).
4. If I don't find two different streets, I widen the distance to the end destination and repeat the process.
So far this is working well enough, but obviously it is an expensive process both in terms of time and in using up my allotted query limit. Also, I checked the API terms of service, and as long as I include their disclaimer and display the results on a Google Map I think that I am OK.
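To make the discussion concrete, here is a stripped-down sketch of the above in Python rather than the C# I'm actually using (the block-size constant and the naive street-name parsing are placeholder simplifications):

```python
import requests

DIRECTIONS_URL = "https://maps.googleapis.com/maps/api/directions/json"
BLOCK_DEG = 0.001  # ~100 m of latitude; the longitude step should be scaled by cos(lat)

def candidate_intersections(lat, lng, api_key):
    # Try a point roughly one block away in each of the four directions.
    for dlat, dlng in [(0, -BLOCK_DEG), (0, BLOCK_DEG),
                       (-BLOCK_DEG, 0), (BLOCK_DEG, 0)]:
        data = requests.get(DIRECTIONS_URL, params={
            "origin": f"{lat},{lng}",
            "destination": f"{lat + dlat},{lng + dlng}",
            "mode": "walking",
            "key": api_key,
        }, timeout=10).json()
        if data["status"] != "OK":
            continue
        leg = data["routes"][0]["legs"][0]
        # Naive parse: keep the street part before the first comma.
        start = leg["start_address"].split(",")[0]
        end = leg["end_address"].split(",")[0]
        if start != end:
            yield start, end  # candidate "intersection" (may be parallel streets)
```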
My questions for the community are:
1. How can I improve the efficiency of the algorithm? Specifically, the number of times I call the API (the implementation code is not a problem).
2. Is there another way entirely to do this using the Google Maps API? In the SO question referred to above, the solution was to loop over building numbers. I am not sure exactly what that means, so any clarification would be great.
3. As mentioned above, I do not believe this is breaking the terms of service -- but am I mistaken?
4. Is there another web-based API that may meet my needs better? Perhaps Bing, or some other provider?
Thanks a lot for any help.
UPDATE:
I have run into my query limit for the day, so I won't be able to test any suggestions against Google today, but I am also still open to using a different API. Thanks.
Old question, but since the original poster stated they were open to solutions other than Google: GeoNames has a web API for this for the U.S. See the GeoNames WebServices overview and http://www.geonames.org/maps/us-reverse-geocoder.html#findNearestIntersection
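A minimal sketch of calling that service in Python (U.S. only; "demo" is GeoNames' shared demo account, so register your own username for real use, and the field names follow the JSON variant of the service):

```python
import requests

resp = requests.get(
    "http://api.geonames.org/findNearestIntersectionJSON",
    params={"lat": 40.7486, "lng": -73.9864, "username": "demo"},
    timeout=10,
)
resp.raise_for_status()
intersection = resp.json().get("intersection", {})
print(intersection.get("street1"), "&", intersection.get("street2"))
```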

How does a site like kayak.com aggregate content? [closed]

Greetings,
I've been toying with an idea for a new project and was wondering if anyone has any idea on how a service like Kayak.com is able to aggregate data from so many sources so quickly and accurately. More specifically, do you think Kayak.com is interacting with APIs or are they crawling/scraping airline and hotel websites in order to fulfill user requests? I know there isn't one right answer for this sort of thing but I'm curious to know what others think would be a good way to go about this. If it helps, pretend you are going to create kayak.com tomorrow ... where is your data coming from?
I'm working in the travel industry as a software architect / project lead on precisely the kind of project you describe -- in our region we work with suppliers directly, but for outgoing travel we connect to several aggregators.
To answer your question... some data you have, some you get in various ways, and some you have to torture and twist until it confesses.
What's your angle?
The questions you have to ask are... Do you want to sell advertising like Kayak or do you take a cut like Expedia? Are you into search or into selling travel services? Do you target niche (for example, just air travel) or everything (accommodation, airlines, rent-a-car, additional services like transport/sightseeing/conferences etc)? Do you target region (US or part of US) or the world? How deep do you go - do you just show several sites on a single screen, or do you bundle different services together and package them dynamically?
Getting the data
If you're going with the Kayak business model, you technically don't need a site's permission... but a lot of sites have affiliate programs with IFrames or other simple ways to direct the customer to their site. On the plus side, you don't have to deal with payments, complaints, or the travelers themselves. As for the cons... if you want to compare prices yourself and present the cheapest option to the user, you'll have to integrate on a deeper level, and that means APIs and web scraping.
As for web scraping... avoid it. It sucks. Really. Just don't do it. Trust me on this one. That said, some things, like low-cost carriers, you can't get without web scraping. Low-cost airlines live off value-added services: if the user doesn't see their website, they don't sell extra stuff and they don't earn anything. Therefore they don't have affiliates, they don't offer APIs, and they change their site layout almost constantly. However, there are companies which earn a living by scraping the low-cost carriers' sites and wrapping them in nice APIs. If you can afford them, you can give your users a cost comparison of low-cost flights, and that's huge.
On the other hand, there are "normal" carriers which offer APIs. It's not that big of a problem to get to airlines, since they're all united under IATA; basically, you buy from IATA, and IATA distributes the money to the carriers. However, you probably don't want to connect directly to a carrier network. They have web services and SOAP these days, but believe me when I say that there are SOAP protocols which are just insanely thin wrappers around a text prompt through which you interact with a mainframe using an '80s-style protocol (think of a Unix prompt where you're billed per command, and it takes about 20 commands to do one search). That's why you probably want to connect to somebody a bit further down the food chain, with a better API.
Airlines are thus at both extremes of the Gaussian curve: on one side are individual suppliers, and on the other highly centralized systems where you implement one API and are able to fly anywhere in the world. Accommodation and the rest of the travel products sit in between. There are several big players which aggregate hotels, and a ton of small suppliers with a lot of aggregators which each cover only part of the spectrum. For example, you can rent a lighthouse, and it's not even that expensive -- but you won't be able to compare the prices of different lighthouses in one place.
If you're into the Kayak business model, you'll probably end up scraping websites. If you're into integrating different providers, you'll mostly work with APIs, some of which are pretty good, and most of which are tolerable. I haven't worked with RSS, but there's not a lot of difference between RSS and web scraping. There is also a fourth option not mentioned in Jeff's answer: the one where you get your data nightly, for example as .CSV files over FTP and the like.
Life sucks (mini-rant)
And then there's complexity. The more value you want to add, the more complexity you'll have to handle. Can you search for accommodations which allow pets? For a hostel located less than 5 km from the town center? Are you combining flights, and are you able to guarantee that the traveler will have enough time to get from one airport to another... can you sell the transport in advance? A famous cellist doesn't want to part with his precious 18th-century cello; can you sell him another seat for the cello (yep, not making this one up)?
Want to compare prices? Sure, the room is EUR 30 per night. But you can either get one double for 30 and one single for 20, or you can get an extra bed in a double and get 70% off for the third person. But only if it's a child under 12 years of age; our extra beds are not for adults. And you don't get the price for the extra bed in the search results -- only when you calculate the final price.
And don't even get me started on dynamic packaging. Want to sell accommodation + rent-a-car? No problem; integrate with two different providers, and off you go... manually updating the list of locations in the city (from the rent-a-car provider) to match them with hotels (from the accommodation provider, who gives you only the city for each hotel). Provided, of course, that you've already matched the lists of cities from the two, since there is no international standard for city codes.
Unlike a lot of other industries with many products, the travel industry has many very complex products. Amazon has it easy: selling books and selling potatoes is the same thing; you can even ship them in the same box. They combine easily and aren't assembled from many parts. :)
P.S. Linking to an interesting recent thread on Hacker News with some insider info regarding flights.
P.P.S. I recently stumbled on a great, albeit rather old, blog post on IATA's NDC protocol, with an overview of how the travel industry is connected and a history lesson on how it came to be.
They use a software package from a company like ITA Software, which is one of the companies Google is in the process of acquiring.
There are only 3 ways I know of to get data from websites:
RSS feeds - We use RSS feeds a lot at my company to integrate existing sites' data with our apps. It's fast, and most sites already have an RSS feed available. The problem is that not all sites implement the RSS standard properly, so if you're pulling data from many feeds across many sites, make sure you write your code so that you can add exceptions and filters easily (see the sketch after this list).
APIs - These are nice if they are designed well and have all the information you need, but that's not always the case; and if the sites are not using a standard API format, you'll have to support multiple APIs.
Web scraping - This method is the most unreliable as well as the most expensive to maintain, but if you're left with nothing else it can be done.
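As an illustration of the RSS approach, here's a minimal sketch in Python using the third-party feedparser library (the feed URLs are placeholders); the point is the per-feed error handling, so one malformed feed doesn't take down the whole pull:

```python
import feedparser

FEEDS = [
    "https://example.com/deals.rss",
    "https://example.org/flights.xml",
]

def pull_all(feeds):
    items = []
    for url in feeds:
        parsed = feedparser.parse(url)
        if parsed.bozo:  # malformed feed: log and skip instead of crashing
            print(f"skipping malformed feed {url}: {parsed.bozo_exception}")
            continue
        for entry in parsed.entries:
            items.append({"source": url,
                          "title": entry.get("title", ""),
                          "link": entry.get("link", "")})
    return items

print(len(pull_all(FEEDS)), "items aggregated")
```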
This article says that Kayak was asked to stop scraping a certain airline's pages. That leads me to believe they probably do scrape sites they don't have a relationship with (and the data feed that comes with such a relationship).
Travelport offer a product called "Universal API" which connects to flights, hotels and car rental companies, and copes with package deals and all the various complexities to do with taxes and exchange rates:
https://developer.travelport.com/app/developer-network/resource-centre-uapi
I've just started using it and it seems fine so far. The queries are a little slow, but then so is every query on every OTA (online travel agent) site.
There are two good APIs I've found from flight comparison websites recently: one from Wego, and one from Skyscanner. Both seem to have a good range and breadth of data from a number of airlines, and good documentation too.
Wego pays each time a user clicks through from your app to a booking website, and Skyscanner pays affiliates 50% of 'revenue' (I assume that means the commission they make from airlines).
This is an old post, but I thought I'd add to it. I'm a data architect working for a company that feeds these travel sites with content. The company enters into contracts with many hotel brands, individual hotels and other content providers. We aggregate this information and then pass it on to the different channels, which then aggregate it again into their own systems.
The large GDS systems are also content providers.
Aggregation is done by many methods: matching algorithms (in-house) and keys. Being an aggregation service, we need to communicate at the client level.
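To give a flavor of what a (much simplified) name-matching step can look like -- this is just an illustrative sketch in Python with made-up field names and thresholds, not our in-house algorithm:

```python
from difflib import SequenceMatcher

def normalize(name):
    # Lowercase, drop a noise word, collapse whitespace.
    return " ".join(name.lower().replace("hotel", "").split())

def match(ours, theirs, threshold=0.85):
    """Pair our records with a supplier's by city key + name similarity."""
    for a in ours:
        best, best_score = None, 0.0
        for b in theirs:
            if a["city"] != b["city"]:  # hard key: never match across cities
                continue
            score = SequenceMatcher(None, normalize(a["name"]),
                                    normalize(b["name"])).ratio()
            if score > best_score:
                best, best_score = b, score
        if best is not None and best_score >= threshold:
            yield a, best, best_score

ours = [{"name": "Grand Hotel Central", "city": "Barcelona"}]
theirs = [{"name": "Hotel Grand Central", "city": "Barcelona"}]
for a, b, score in match(ours, theirs):
    print(a["name"], "<->", b["name"], f"{score:.2f}")
```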
Hope this helps! Cheers!