options for questions in Watson conversation api - npm

I need to get the available options for a certain question in Watson conversation api?
For example I have a conversation app and in some cases Y need to give the users a list to select an option from it.
So I am searching for a way to get the available reply options for a certain question.

I can't answer to the NPM part, but you can get a list of the top 10 possible answers by setting alternate_intents to true. For example.
{
"context":{
"conversation_id":"cbbea7b5-6971-4437-99e0-a82927607079",
"system":{
"dialog_stack":["root"
],
"dialog_turn_counter":1,
"dialog_request_counter":1
}
},
"alternate_intents":true,
"input":{
"text":"Is it hot outside?"
}
}
This will return at most the top ten answers. If there is a limited number of intents it will only show them.
Part of your JSON response will have something like this:
"intents":[{
"intent":"temperature",
"confidence":0.9822100598134365
},
{
"intent":"conditions",
"confidence":0.017789940186563623
}
This won't get you the output text though from the node. So you will need to have your answer store elsewhere to cross reference.
Also be aware that just because it is in the list, doesn't mean it's a valid answer to give the end user. The confidence level needs to be taken into account.
The confidence level also does not work like a normal confidence. You need to determine your upper and lower bounds. I detail this briefly here.
Unlike earlier versions of WEA, the confidence is relative to the
number of intents you have. So the quickest way to find the lowest
confidence is to send a really ambiguous word.
These are the results I get for determining temperature or conditions.
treehouse = conditions / 0.5940327076534431
goldfish = conditions / 0.5940327076534431
music = conditions / 0.5940327076534431
See a pattern?🙂 So the low confidence level I will set at 0.6. Next
is to determine the higher confidence range. You can do this by mixing
intents within the same question text. It may take a few goes to get a
reasonable result.
These are results from trying this (C = Conditions, T = Temperature).
hot rain = T/0.7710267712183176, C/0.22897322878168241
windy desert = C/0.8597747113239446, T/0.14022528867605547
ice wind = C/0.5940327076534431, T/0.405967292346557
I purposely left out high confidence ones. In this I am going to go
with 0.8 as the high confidence level.

Related

Can this list be sorted by date in Velocity?

I've found this code for getting articles by tag and display them as a list with links in xWiki, but I want it sorted by date.
Has anyone a suggestion for me?
{{velocity}}
#set ($list = $xwiki.tag.getDocumentsWithTag('myTag'))
#foreach($reference in $list)
#set ($document = $xwiki.getDocument($reference))
#set ($label = $document.getTitle())
[[$label>>$reference]]
#end
{{/velocity}}
Thanks in advance!
Sorting in velocity can hit one of 2 performance penalties:
Actually sorting in velocity, either with a sorting algorithm -> unnecesarrily compicated
Loading all the document results into memory (a collection) and
sorting that collection with the sort/collection tool -> you risk quickly running out of memory if the result is larger than you expected.
The easiest alternative, given that there is XWiki running behind it, would be to do an XWQL query for the XWiki.TagClass objects stored inside the documents and do the sorting at the database level. At this point, in velocity, you only need to display the results:
{{velocity}}
#foreach ($docStringRef in $services.query.xwql("from doc.object(XWiki.TagClass) tagsObj where 'conference' member of tagsObj.tags order by doc.creationDate DESC").setLimit(10).execute())
#set ($document = $xwiki.getDocument($docStringRef))
[[$document.title>>$docStringRef]]
#end
{{/velocity}}
For future use/reference, the list of available Velocity tools in XWiki might also be useful https://extensions.xwiki.org/xwiki/bin/view/Extension/Velocity%20Module#HVelocityTools since they can help with common operations, including sorting (that I mentioned at point 2. above)

Sentinel 1 data gaps in swath overlap (not sequential scenes) in Google Earth Engine

I am working on a project using the Sentinel 1 GRD product in Google Earth Engine and I have found a couple examples of missing data, apparently in swath overlaps in the descending orbit. This is not the issue discussed here and explained on the GEE developers forum. It is a much larger gap and does not appear to be the product of the terrain correction as explained for that other issue.
This gap seems to persist regardless of year changes in the date range or polarization. The gap is resolved by changing the orbit filter param from 'DESCENDING' to 'ASCENDING', presumably because of the different swaths or by increasing the date range. I get that increasing the date range increases revisits and thus coverage but is this then just a byproduct of the orbital geometry? ie it takes more than the standard temporal repeat to image that area? I am just trying to understand where this data gap is coming from.
Code example:
var geometry = ee.Geometry.Polygon(
[[[-123.79472413785096, 46.20720039434629],
[-123.79472413785096, 42.40398120362418],
[-117.19194093472596, 42.40398120362418],
[-117.19194093472596, 46.20720039434629]]], null, false)
var filtered = ee.ImageCollection('COPERNICUS/S1_GRD').filterDate('2019-01-01','2019-04-30')
.filterBounds(geometry)
.filter(ee.Filter.eq('orbitProperties_pass', 'DESCENDING'))
.filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VH'))
.filter(ee.Filter.listContains('transmitterReceiverPolarisation', 'VV'))
.filter(ee.Filter.eq('instrumentMode', 'IW'))
.select(["VV","VH"])
print(filtered)
var filtered_mean = filtered.mean()
print(filtered_mean)
Map.addLayer(filtered_mean.select('VH'),{min:-25,max:1},'filtered')
You can view an example here: https://code.earthengine.google.com/26556660c352fb25b98ac80667298959

Twitter Premium API Profile location operators profile_country: and profile_region: not working

I am using premium account (not sandbox) for data collection.
I want to collect:
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to US and not geolocated at tweet level, excluding all retweets
All tweets in English that contain ‘china’ or ‘chinese’ that are user geolocated to ‘Minnesota’ and not geolocated at tweet level, excluding all retweets
The code is as follows:
premium_search_args = load_credentials('twitter_API.yaml',
yaml_key ='search_tweets_premium_api', env_overwrite=False)
# keywords for the search
# key word 1
keywords = '(China OR Chinese) lang:en profile_country:US -place_country:US -is:retweet'
# key word 2
keywords = '(China OR Chinese) lang:en -place_country:US profile_region:"Minnesota" -is:retweet'
# define search rule
rule = gen_rule_payload(keywords,from_date='2019-12-01',
to_date='2019-12-10',results_per_call=500)
# create result stream and print before start
rs = ResultStream(rule_payload=rule, max_results=1250000,
**premium_search_args)
My problems are that:
For the first one, a large portion of the results I get didn’t satisfy the query. First, some don’t have Profile Geo enrichment, i.e. user.derived.locations attribute is not in the user object. Second, if it is, a lot don’t have country code US, i.e. they are identified to other countries.
For the second one, the result I get from this method is a smaller subset of the results I can get from 1). That is, when I filter all tweets user geolocated to Minnesota (by user.derived.locations.region) from profile_country:US, it gives a larger sample than using profile_region:“Minnesota”. A considerable amount of data is missing using this method.
I have tried several times but it seems that user geolocation operators don’t work exactly what I want. Does anyone has any idea why this is the case? I would very much appreciate any answers/suggestions/comments.
Thank you!

offline GTFS connection from A to B

The project I am working on involves static offline GTFS data in a mobile app. All the GTFS data is available inside realm-objects (or SQLite if needed).
Now, I would like to establish all train- or bus-connections from A to B (starting after a certain departure-time).
How do I query the GTFS-data in order to get a connection from A to B ???
I reealized to get all trips leaving from A.
I realized to get all station-names along that trip including times.
But I find it very hard to get the connection information between two locations A and B. What SQL queries do i have to set up in order to get that information ?
Any help appreciated !
If you just want to dynamically calculate shortest travel routes between static hubs, in an offline application, determined like the following image, you can use the following formula:
   (Source)
Here's the pseudo code:
1 function Dijkstra(Graph, source):
2
3 create vertex set Q
4
5 for each vertex v in Graph: // Initialization
6 dist[v] ← INFINITY // Unknown distance from source to v
7 prev[v] ← UNDEFINED // Previous node in optimal path from source
8 add v to Q // All nodes initially in Q (unvisited nodes)
9
10 dist[source] ← 0 // Distance from source to source
11
12 while Q is not empty:
13 u ← vertex in Q with min dist[u] // Node with the least distance
14 // will be selected first
15 remove u from Q
16
17 for each neighbor v of u: // where v is still in Q.
18 alt ← dist[u] + length(u, v)
19 if alt < dist[v]: // A shorter path to v has been found
20 dist[v] ← alt
21 prev[v] ← u
22
23 return dist[], prev[]
(Source)
Alright, I'll admit so far my answer was a tad facetious...
I suspect you don't realize just how complicated it is to do what you're asking, especially in an offline environment. Companies like Google and Microsoft have spent millions on research with huge teams of data scientists.
If this is something you are serious about, I'd encourage you to start with a 10×10 grid and work on the logic of getting from "Point A → Point B" when you start adding barriers in random places (this simulating roads beginning & ending). Recently I was surprised how complicated a seemingly-simple, somewhat-related Stack Overflow "pipe sizes conversion" question that I answered had become.
If you didn't have the "offline" condition, I would've suggested looking into getting a [free] API Key for Google Web Services Directions API. Basically you tell it a where Points A & B are, and it gives you detailed route information, via transit or other methods.
Another of Google's many API's that could be helpful is the Snap to Roads API, which turns partial or error-ridden paths into driveable ones.
The study Route Logic and Shortest Path Logic is actually really fascinating stuff, and there are some amazing resources to learn about the related theories (see below), including a video from Google explaining how they went about it.
...but unfortunately this isn't something that's going to be accomplished with a simple SQL Query.
Actual Resources:
YouTube : Google Tech Talk: Fast Route Planning
Wikipedia : Shortest path problem
Wikipedia : Dijkstra's algorithm
...and some slightly Lighter Reading:
Wikipedia : Travelling Salesman Problem
Why UPS drivers don’t turn left and you probably shouldn’t either
Google Web Services : Directions API Developer's Guide
Stack Overflow : Shortest Route of converters between two different pipe sizes?
Wikipedia : Seven Bridges of Königsberg

Foursquare Venue API & Number of Results, in a more efficient way?

I'd like to ask if there is a more efficient way to get more than 50 results besides these options?
How do I get more locations?
Foursquare Venue API & Number of Results
and this, which is for the old API Foursquare API nearByVenue service issue
I'm using the current foursquare api for the venue search https://developer.foursquare.com/docs/venues/search .
What I'd like is something like an offset option, in order to get more results, but it seems that there isn't such an option.
Is there an alternative solution?
Thank you in advance.
you should use venues explore with offset and limit as paramters,
venues explore gives you totalResults and you can use this response to calculate number of pages you need in paginate
for example assume totalResults is 90(pay attention at offset and limit parametr value )
in first request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=0&limit=30
in second request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=30&limit=30
in third request:
https://api.foursquare.com/v2/venues/explore?client_id=client_id&client_secret=client_secret&v=20150825&near=city_name&categoryId=category_id&intent=browse&offset=60&limit=30
for 90 results you can get all records with above three request
There is actually another option not mentioned here (not pagination though)
Using the (experimental?) categoryId filter.
You can search for a single point (ll) a few times with different category ids, giving you many results (some duplicates as venues can have more than one category).
So you can search for 'Food' venues and 'Nightlife' venues at the same place, getting 100 results in stand of 50.. as said it is 100 results, but not unique results, could be duplicates. I think that is more efficient then trying to play around with the browse radius thing.
Not pagination, but will give a lot more results than a normal search - usually enough even in urban areas.
But yea, having some sort of way to extract more than 50 on a single point is not possible, but could be nice :)
Afraid not. Currently there is no pagination, in order to find more venues you need to move your search area around as in the answers you highlighted. I agree, pagination would be handy though!
For the explorer endpoint this worked for me: If the maximum number of results that is returned for instance is 100, just use offset=100 in the next call which gives you the next 100 results starting from 100 (the offset). Iterate (e.g. using while loop) and keep increasing offset by 100 until you reach the total number or results (which is returned in in the api for totalResults).
My first stack overflow post, tried to answer as clearly as possible
def getNearbyVenues(neighborhoods, latitudes, longitudes, radius=500,ven_num=300):
venues_list=[]
for name, lat, lng in zip(neighborhoods, latitudes, longitudes):
i=0
while (i < ven_num+50):
url = 'https://api.foursquare.com/v2/venues/explore?&client_id={}&client_secret={}&v={}&ll={},{}&radius={}&offset={}&limit={}'.format(
CLIENT_ID,
CLIENT_SECRET,
VERSION,
lat,
lng,
radius,
i,
LIMIT)
# make the GET request
results = requests.get(url).json()['response']['groups'][0]['items']
# return only relevant information for each nearby venue
venues_list.append([(
name,
lat,
lng,
v['venue']['name'],
v['venue']['location']['lat'],
v['venue']['location']['lng'],
v['venue']['categories'][0]['name']) for v in results])
i=i+50
nearby_venues = pd.DataFrame([item for venue_list in venues_list for item in venue_list])
nearby_venues.columns = ['Neighborhood',
'Neighborhood Latitude',
'Neighborhood Longitude',
'Venue',
'Venue Latitude',
'Venue Longitude',
'Venue Category']
print('Ok')
return(nearby_venues)
the code above has worked perfectly with me, where ven_num variable is the desired limit for calling venues in a certain neighborhood