Recently I've been occasionally receiving odd results from the Google Reverse Geocode API.
For example when looking up the address for coordinates 43.2379396 -72.44746565 the result I get is:
6HQ3+52 Springfield, VT, USA
In another case looking up 43.703563 -72.209753 results with:
PQ3R+C3 Hanover, NH, USA
Does anyone know what the initial 7 characters of the returned address symbolize? When I receive this type of result it's always 4 alphanumeric characters followed by a plus sign, then 2 more alphanumeric characters.
After some additional research I found that these are Plus Codes (based on the Open Location Code system), a relatively new feature in Google Maps. They are used for places that don't have a street address, and they seem to have some similarities to what3words addresses.
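If you need to handle these results in code, one option is to detect a leading Plus Code in the formatted address. This is only a minimal sketch: it assumes the short code is always the first token of the string, as in the two examples above, and the function name `split_plus_code` is hypothetical. The character set is the 20-character Open Location Code alphabet.

```python
import re

# Open Location Code alphabet (chosen to avoid look-alike characters).
OLC_CHARS = "23456789CFGHJMPQRVWX"

# Heuristic: 2-8 code characters, a plus sign, 2 or more code characters,
# then whitespace and the rest of the address (assumed layout).
PLUS_CODE_RE = re.compile(
    rf"^([{OLC_CHARS}]{{2,8}}\+[{OLC_CHARS}]{{2,}})\s+(.*)$"
)


def split_plus_code(formatted_address):
    """Return (plus_code, locality) if the address starts with a Plus Code, else None."""
    match = PLUS_CODE_RE.match(formatted_address.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(split_plus_code("6HQ3+52 Springfield, VT, USA"))
print(split_plus_code("143 Garrison Cir, Red Deer, AB, Canada"))
```

A result that splits this way has no street address in it, so you may want to fall back to the locality part or to the raw coordinates.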
I am using the Google Cloud Vision API to extract text from Aadhaar and PAN cards. How can I get exact user details like name, father's name, and address?
Raw Data
ଭାରତ ସରକାର
Government of India
ଜିତ୍ୟାନନ୍ଦ ଖେମୁକୁ
NITYANANDA KHEMUDU
ପିତା : ସୀତାରାମ ଖେମୁକୁ
Father: Sitaram Khemudu
ଜନ୍ମ ତାରିଖ / DOB : 01.07.1999
ପୁରୁଷ / Male
ମୋ ଆଧାର, ମୋ ପରିଚୟ
I have built 5-6 OCR pipelines to date (Aadhaar, PAN, ITR, driving licence, etc.) using the Google Cloud Vision API. I think you are looking for a response like
{"pan_card_no":"ECXXXXXX123",
"name":"fshksj"
}
To get such a response you need to build your own logic. Here is some logic I can share with you:
Perform OCR on your document using the Google Cloud Vision API and store the response in a list (Google returns the text line by line).
In the case above, if you want to grab the DOB first, you can build logic like: if "DOB" is in a list item, grab the numeric values from it.
To get the name, drop the unnecessary items from the list with conditions such as (if "India" in i) or (if i.isdigit()); once those items are dropped from the main list, what remains is the name.
To grab the address: about 95% of the time the address ends with the pincode, so treat the pincode as the last index of the address, look for an "Address"-type keyword, and join all the elements from the "Address" keyword index to the pincode index (this is easily done on a list). To validate the pincode you can use a library like Pyzipin.
There are many more conditions you can use; the ones above are only the most basic. If you need any specific logic, you can ask me.
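The rules above can be sketched roughly as follows. This is an illustration, not a complete parser: it assumes the OCR text has already been split into a list of lines, the helper names are hypothetical, and the address lines in the usage note are made up because the sample above has no address block. The extra filters (non-ASCII lines, labelled fields) are my own additions to the "drop what is not a name" idea.

```python
import re

DOB_RE = re.compile(r"\d{2}[./-]\d{2}[./-]\d{4}")
PIN_RE = re.compile(r"\b\d{6}\b")  # Indian pincodes have six digits


def extract_dob(lines):
    """Return the first date found on a line that mentions DOB."""
    for line in lines:
        if "DOB" in line.upper():
            match = DOB_RE.search(line)
            if match:
                return match.group(0)
    return None


def extract_name(lines):
    """Drop lines that cannot be the name and return the first survivor."""
    for line in lines:
        if not line.isascii():                   # regional-language duplicate
            continue
        if "India" in line or ":" in line:       # header or labelled field
            continue
        if any(ch.isdigit() for ch in line):     # dates, numbers
            continue
        if line.strip().lower() in {"male", "female"}:
            continue
        return line.strip()
    return None


def extract_address(lines):
    """Join everything from the 'Address' keyword up to the line with a pincode."""
    start = next((i for i, l in enumerate(lines) if "address" in l.lower()), None)
    if start is None:
        return None
    for end in range(start, len(lines)):
        if PIN_RE.search(lines[end]):
            text = " ".join(lines[start:end + 1])
            return re.sub(r"(?i)^\s*address\s*:?\s*", "", text).strip()
    return None


ocr_lines = [
    "ଭାରତ ସରକାର",
    "Government of India",
    "ଜିତ୍ୟାନନ୍ଦ ଖେମୁକୁ",
    "NITYANANDA KHEMUDU",
    "ପିତା : ସୀତାରାମ ଖେମୁକୁ",
    "Father: Sitaram Khemudu",
    "ଜନ୍ମ ତାରିଖ / DOB : 01.07.1999",
    "ପୁରୁଷ / Male",
]
print({"name": extract_name(ocr_lines), "dob": extract_dob(ocr_lines)})
```

For the address, a call such as `extract_address(["Address:", "Example Street,", "Example Town - 764001"])` joins the lines between the keyword and the pincode; real cards will need more conditions than this.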
I am using premium account (not sandbox) for data collection.
I want to collect:
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to US and not geolocated at tweet level, excluding all retweets
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to ‘Minnesota’ and not geolocated at tweet level, excluding all retweets
The code is as follows:
from searchtweets import load_credentials, gen_rule_payload, ResultStream

premium_search_args = load_credentials('twitter_API.yaml',
                                       yaml_key='search_tweets_premium_api',
                                       env_overwrite=False)
# keywords for the search (run one query at a time; the second assignment overwrites the first)
# key word 1
keywords = '(China OR Chinese) lang:en profile_country:US -place_country:US -is:retweet'
# key word 2
keywords = '(China OR Chinese) lang:en -place_country:US profile_region:"Minnesota" -is:retweet'
# define search rule
rule = gen_rule_payload(keywords, from_date='2019-12-01',
                        to_date='2019-12-10', results_per_call=500)
# create result stream and print before start
rs = ResultStream(rule_payload=rule, max_results=1250000,
                  **premium_search_args)
My problems are that:
For the first one, a large portion of the results I get don’t satisfy the query. First, some don’t have the Profile Geo enrichment, i.e. the user.derived.locations attribute is not in the user object. Second, where it is present, a lot don’t have country code US, i.e. they are assigned to other countries.
For the second one, the result I get from this method is a smaller subset of the results I can get from 1). That is, when I filter all tweets user geolocated to Minnesota (by user.derived.locations.region) from profile_country:US, it gives a larger sample than using profile_region:“Minnesota”. A considerable amount of data is missing using this method.
I have tried several times, but it seems that the user geolocation operators don’t work exactly as I want. Does anyone have any idea why this is the case? I would very much appreciate any answers/suggestions/comments.
Thank you!
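For reference, the client-side check described above (filtering on user.derived.locations after download) can be sketched like this. It works on plain tweet dictionaries; the function names are hypothetical, and treating a non-empty `place`, `coordinates` or `geo` field as "geolocated at tweet level" is my assumption.

```python
def user_locations(tweet):
    """Return the Profile Geo locations of a tweet, or an empty list if absent."""
    user = tweet.get("user") or {}
    derived = user.get("derived") or {}
    return derived.get("locations") or []


def is_tweet_geotagged(tweet):
    """Assumed definition: any tweet-level place or coordinate information."""
    return bool(tweet.get("place") or tweet.get("coordinates") or tweet.get("geo"))


def matches_profile_geo(tweet, country_code="US", region=None):
    """True if the user is located in country_code (and region, if given)
    and the tweet itself carries no geotag."""
    if is_tweet_geotagged(tweet):
        return False
    for loc in user_locations(tweet):
        if loc.get("country_code") != country_code:
            continue
        if region is None or loc.get("region") == region:
            return True
    return False


# Made-up tweets, trimmed to the fields used above.
sample = [
    {"id": 1, "user": {"derived": {"locations": [{"country_code": "US", "region": "Minnesota"}]}}},
    {"id": 2, "user": {}},
    {"id": 3, "user": {"derived": {"locations": [{"country_code": "CA", "region": "Ontario"}]}}},
]
print([t["id"] for t in sample if matches_profile_geo(t, region="Minnesota")])
```

Counting how many downloaded tweets fail this check for each query makes the mismatch easier to report.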
using https://www.google.ca/maps and the geocoding api gives the same results:
using https://www.google.ca/maps and searching for:
143 GARRISON CIR , RED DEER, AB , Canada
returns two results:
143 Garrison PL
143 Garrison Cir
using the API reveals that it considers the first one '... Pl' more accurate than '... Cir', when clearly the second one is truer to the original address used in the search, since it contains 'Cir'...
using:
https://maps.googleapis.com/maps/api/geocode/xml?address=143%20GARRISON%20CIR%20%2C%20RED%20DEER%2C%20AB%20%2C%20Canada
reveals the first result's accuracy is:
ROOFTOP
and the second result's accuracy is:
RANGE_INTERPOLATED {not as accurate}
WHY???
Interestingly... if I use the postal code in the full address {which I verified with Canada Post as being correct}:
'143 GARRISON CIR , RED DEER, AB T4P0P5, Canada'
I get no results from either method!
again... WHY???
The RANGE_INTERPOLATED result means that there is no exact street address feature in the Google database and the service tries to guess where the address is located. Maybe due to this reason the exact ROOFTOP result is scored higher than an interpolation. Especially taking into account that the coordinates of both results are very close to each other:
https://google-developers.appspot.com/maps/documentation/utils/geocoder/#q%3D143%2520GARRISON%2520CIR%2520%252C%2520RED%2520DEER%252C%2520AB%2520%252C%2520Canada
In order to solve this, you should report the missing address to Google using the Send feedback mechanism:
https://support.google.com/maps/answer/3094045
Also note that the interpolated result for the address has a different postal code, T4N 3M4. Moreover, if I search for the postal code T4P 0P5, I get back only the postal code prefix T4P:
https://google-developers.appspot.com/maps/documentation/utils/geocoder/#q%3D%26options%3Dtrue%26in_country%3DCA%26in_postal_code%3DT4P%25200P5
That means the postal code T4P 0P5 is also missing from the Google database, and you should report it as well.
As the postal code is missing, you are getting ZERO_RESULTS for the complete string 143 GARRISON CIR , RED DEER, AB T4P0P5, Canada
https://google-developers.appspot.com/maps/documentation/utils/geocoder/#q%3D143%2520GARRISON%2520CIR%2520%252C%2520RED%2520DEER%252C%2520AB%2520T4P0P5%252C%2520Canada%26options%3Dtrue
As you mentioned, the same behavior is reproducible on maps.google.com. There are two options for the address: Garrison Pl is the first item and Garrison Cir is the second. That confirms that this is a data issue rather than an API issue.
I hope this clears up your doubt.
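Until the data is corrected, one possible workaround is to choose among the returned results yourself rather than taking the first one. The sketch below works on an already parsed JSON response and picks the result whose `route` component matches the street you asked for. The function names are hypothetical, and the sample data is a hand-trimmed illustration, not real API output.

```python
def route_names(result):
    """Collect the long and short names of the 'route' component, lower-cased."""
    names = set()
    for comp in result.get("address_components", []):
        if "route" in comp.get("types", []):
            names.add(comp.get("long_name", "").lower())
            names.add(comp.get("short_name", "").lower())
    return names


def pick_by_route(results, wanted_route):
    """Prefer the result whose street name matches; otherwise keep Google's order."""
    wanted = wanted_route.lower()
    for result in results:
        if wanted in route_names(result):
            return result
    return results[0] if results else None


# Hand-trimmed, illustrative results in the order described in the question.
sample_results = [
    {
        "address_components": [
            {"long_name": "Garrison Place", "short_name": "Garrison Pl", "types": ["route"]}
        ],
        "geometry": {"location_type": "ROOFTOP"},
    },
    {
        "address_components": [
            {"long_name": "Garrison Circle", "short_name": "Garrison Cir", "types": ["route"]}
        ],
        "geometry": {"location_type": "RANGE_INTERPOLATED"},
    },
]
chosen = pick_by_route(sample_results, "Garrison Cir")
print(chosen["geometry"]["location_type"])
```

Keep in mind that a result chosen this way may still be an interpolated position, so its coordinates are an estimate.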
I have a set of queries and I am trying to get web_urls using the NYT article search API. But I am seeing that it works for q2 below but not for q1.
q1: Seattle+Jacob Vigdor+the University of Washington
q2: Seattle+Jacob Vigdor+University of Washington
If you paste the url below with your API key in the web browser, you get an empty result.
Search request for q1
api.nytimes.com/svc/search/v2/articlesearch.json?q=Seattle+Jacob%20Vigdor+the%20University%20of%20Washington&begin_date=20170626&api-key=XXXX
Empty results for q1
{"response":{"meta":{"hits":0,"time":27,"offset":0},"docs":[]},"status":"OK","copyright":"Copyright (c) 2013 The New York Times Company. All Rights Reserved."}
Instead if you paste the following in your web browser (without the article 'the' in the query) you get non-empty results
Search request for q2
api.nytimes.com/svc/search/v2/articlesearch.json?q=Seattle+Jacob%20Vigdor+University%20of%20Washington&begin_date=20170626&api-key=XXXX
Non-empty results for q2
{"response":{"meta":{"hits":1,"time":22,"offset":0},"docs":[{"web_url":"https://www.nytimes.com/aponline/2017/06/26/us/ap-us-seattle-minimum-wage.html","snippet":"Seattle's $15-an-hour minimum wage law has cost the city jobs, according to a study released Monday that contradicted another new study published last week....","lead_paragraph":"Seattle's $15-an-hour minimum wage law has cost the city jobs, according to a study released Monday that contradicted another new study published last week.","abstract":null,"print_page":null,"blog":[],"source":"AP","multimedia":[],"headline":{"main":"New Study of Seattle's $15 Minimum Wage Says It Costs Jobs","print_headline":"New Study of Seattle's $15 Minimum Wage Says It Costs Jobs"},"keywords":[],"pub_date":"2017-06-26T15:16:28+0000","document_type":"article","news_desk":"None","section_name":"U.S.","subsection_name":null,"byline":{"person":[],"original":"By THE ASSOCIATED PRESS","organization":"THE ASSOCIATED PRESS"},"type_of_material":"News","_id":"5951255195d0e02550996fb3","word_count":643,"slideshow_credits":null}]},"status":"OK","copyright":"Copyright (c) 2013 The New York Times Company. All Rights Reserved."}
Interestingly, both queries work fine on the api test page
http://developer.nytimes.com/article_search_v2.json#/Console/
Also, if you look at the article below returned by q2, you see that the query term in q1, 'the University of Washington' does occur in it and it should have returned this article.
https://www.nytimes.com/aponline/2017/06/26/us/ap-us-seattle-minimum-wage.html
I am confused about this behaviour of the API. Any ideas what's going on? Am I missing something?
Thank you for all the answers. Below I am pasting the answer I received from NYT developers.
NYT's Article Search API uses Elasticsearch. There are lots of docs online about the query syntax of Elasticsearch (it is based on Lucene).
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-query-string-query.html#query-string-syntax
If you want articles that contain "Seattle", "Jacob Vigdor" and "University of Washington", do
"Seattle" AND "Jacob Vigdor" AND "University of Washington"
or
+"Seattle" +"Jacob Vigdor" +"University of Washington"
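As a rough illustration of the first form, the query string can be assembled from a list of phrases and then URL-encoded in one step. The helper names are hypothetical, the `https://` scheme is an assumption (the URLs in the question omit it), and `XXXX` stands in for a real API key.

```python
from urllib.parse import urlencode

# Endpoint path taken from the question; the scheme is assumed.
BASE_URL = "https://api.nytimes.com/svc/search/v2/articlesearch.json"


def build_query(phrases):
    """Quote each phrase and require all of them with AND."""
    return " AND ".join(f'"{phrase}"' for phrase in phrases)


def build_url(phrases, begin_date, api_key):
    """Return the full request URL with every parameter percent-encoded."""
    params = {
        "q": build_query(phrases),
        "begin_date": begin_date,
        "api-key": api_key,
    }
    return f"{BASE_URL}?{urlencode(params)}"


phrases = ["Seattle", "Jacob Vigdor", "University of Washington"]
print(build_query(phrases))
print(build_url(phrases, "20170626", "XXXX"))
```

Letting `urlencode` handle the quotes and spaces avoids hand-written `%20` and `+` characters in the URL.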
I think you need to change the encoding: send each literal plus sign as %2B and each space as + (a bare + in a query string is decoded as a space):
In your example,
q=Seattle+Jacob%20Vigdor+the%20University%20of%20Washington
When I submit from the page on the site, it uses %2B:
q=Seattle%2BJacob+Vigdor%2Bthe+University+of+Washington
How are you URL encoding? One way to fix it would be to replace your spaces with + before URL encoding.
Also, you may need to replace %20 with +. There are various schemes for URL encoding, so the best way would depend on how you are doing it.
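To see the difference concretely, here is a small check using Python's standard `urllib.parse`. It decodes the `q` value from the question the way a form-style decoder would, then encodes the intended string. Whether the NYT server decodes the value exactly this way is an assumption.

```python
from urllib.parse import quote_plus, unquote_plus

# The q value as sent in the question: bare plus signs, spaces as %20.
sent = "Seattle+Jacob%20Vigdor+the%20University%20of%20Washington"

# A form-style decoder turns both '+' and '%20' into spaces,
# so the plus operators are lost before the search engine sees them.
print(unquote_plus(sent))

# The string that was actually intended, with literal plus signs.
intended = "Seattle+Jacob Vigdor+the University of Washington"

# quote_plus encodes '+' as %2B and spaces as '+'.
encoded = quote_plus(intended)
print(encoded)

# Round trip: decoding the encoded form gives back the intended string.
print(unquote_plus(encoded) == intended)
```

The encoded form matches what the test page on the site submits.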
We are using the Google Maps API to find the distance between two postal codes, and in general we can get the distance between any two postal codes this way. For some postal codes, however, the API returns a NotFound error when the code is sent without a space, whereas if we insert a space after the first three characters of the same postal code, it works and returns the distance correctly.
API URL
http://maps.googleapis.com/maps/api/distancematrix/json?origins=H3W3C4&destinations=H1C0A6&mode=driving&language=en-EN&sensor=false
Example 1:
url:http://maps.googleapis.com/maps/api/distancematrix/json?origins=H3W3C4&destinations=H1A0C2&mode=driving&language=en-EN&sensor=false
postal-code: H1A0C2,
api result : NotFound
Example 2:
url:http://maps.googleapis.com/maps/api/distancematrix/json?origins=H3W3C4&destinations=H1A 0C2&mode=driving&language=en-EN&sensor=false
postal-code: H1A 0C2,
api result : returns correct distance
Below is a list of postal codes for which the API returns NotFound if we send the postal code without a space (if we add the space, as in 'H1A 0C2', it returns results):
H1A0C2
H1A0C3
H1A0C4
H1B0B7
H1C0E3
H1C0E4
H1C0E5
H1C0E6
H1C0E7
H1C0E8
H1C0E9
H1C0G1
H1C0G2
Below is a list of postal codes for which the API returns the distance correctly with or without the space (for example, both 'H1C 0A7' and 'H1C0A7' work):
H1C0A7
H1C0A8
H1C0A9
Though it works with or without the space for most postal codes, for a few it does not return values without the space. What could be the reason?
I would suggest checking if the postal code exists in the Google database.
For example the postal code H1A0C2 seems to be missing
https://developers-dot-devsite-v2-prod.appspot.com/maps/documentation/utils/geocoder/#q%3D%26options%3Dtrue%26in_country%3DCA%26in_postal_code%3DH1A0C2
As you can see, the geocoder tool returns only the postal code prefix H1A, but not the postal code itself.
For the postal code H1C0A9 geocoder returns a complete postal code:
https://developers-dot-devsite-v2-prod.appspot.com/maps/documentation/utils/geocoder/#q%3D%26in_country%3DCA%26in_postal_code%3DH1C0A9
I think the distance matrix cannot find a distance for missing postal codes. However, when you add a space it can find the postal code prefix, and it calculates the distance based on the coordinates of that prefix. So the result might not be precise enough in this case.
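If you decide to always send the spaced form, the postal codes can be normalized before building the request. A minimal sketch, assuming all inputs are Canadian postal codes in the letter-digit-letter digit-letter-digit pattern; the function name is hypothetical.

```python
import re

# Canadian postal code: letter digit letter, optional space, digit letter digit.
CA_POSTAL_RE = re.compile(r"^([A-Za-z]\d[A-Za-z])\s*(\d[A-Za-z]\d)$")


def normalize_ca_postal_code(code):
    """Return the code in upper case with a single space after the third character."""
    match = CA_POSTAL_RE.match(code.strip())
    if match is None:
        raise ValueError(f"not a Canadian postal code: {code!r}")
    return f"{match.group(1)} {match.group(2)}".upper()


for raw in ["H1A0C2", "H1C0A9", "h1c 0e3"]:
    print(raw, "->", normalize_ca_postal_code(raw))
```

This only changes the format of the input. For codes missing from the database, the distance may still be computed from the prefix, as noted above.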
You can report missing postal codes to Google as described in the following support doc:
https://support.google.com/maps/answer/3094088
Hope this helps!