Best way to deal with IDs only containing numbers - wit.ai

We're trying to display some booking information to our users, and we ask them for the booking ID, which is a 10-digit number like this one: 1553296942.
In the stories, we try to identify the user input with an intent called bookingStatus and an entity called uid.
The thing is, these IDs are recognized as a wit/location type (they look like coordinates to it, I guess), so they aren't recognized properly most of the time.
What would be the best approach to handle this situation?
For now, in the Understanding tab we're feeding the bot lots of these IDs, adding the bookingStatus intent and marking them as the uid entity as well. Is this the right approach, and should we continue training it this way?

You can feed it the 10-digit numbers. There are actually only 10^10 possibilities for the uid entity, so you could in principle feed all of them with a simple cURL command wrapped in a loop.
How do you feed your NLP without the Understanding tab? Well...
Check the HTTP API Docs here
https://wit.ai/docs/http/20160526#post--entities-:entity-id-values-link
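For example, here is a minimal Python sketch of that loop, assuming the 20160526 API version from the link above and a placeholder server access token (and looping over a few known IDs rather than literally all 10^10):
```
import requests

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # placeholder, not a real token

# A few known booking IDs -- in practice, loop over your own list.
for booking_id in ["1553296942", "1553296943", "1553296944"]:
    resp = requests.post(
        "https://api.wit.ai/entities/uid/values",
        params={"v": "20160526"},
        headers={"Authorization": "Bearer " + WIT_TOKEN},
        json={"value": booking_id},  # adds booking_id as a value of uid
    )
    resp.raise_for_status()
```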
Have a nice day!

Related

How to add a list of keywords to a wit.ai entity?

I'm new to wit.ai. I'm building a bot that does voice input for an order processing pipeline. In one field I need to input the client location, and this client_location entity has a keyword search strategy attached to it.
Now I want to add all cities, towns, and villages to this entity as keywords, because only one of these will be considered a valid value for the client_location entity.
But there are a couple of thousand of them, and adding them by hand, one by one, in the wit.ai UI doesn't make much sense.
I want to use a CLI tool or a Node package or something, to do it programmatically.
How can I do this? And is it OK to have so many keywords for one entity?
For now, the Node.js library can't do training.
But you can still do it using the HTTP REST API, by posting to the /samples path:
POST /samples
The docs are here.
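For instance, a rough sketch of that request in Python, assuming the /samples payload takes a list of validated samples that tag each city as a client_location keyword (the token and city names are placeholders; verify the payload shape against the docs):
```
import requests

WIT_TOKEN = "YOUR_SERVER_ACCESS_TOKEN"  # placeholder
cities = ["Springfield", "Rivertown", "Hillview"]  # thousands in practice

samples = [
    {"text": city,
     "entities": [{"entity": "client_location", "value": city}]}
    for city in cities
]

resp = requests.post(
    "https://api.wit.ai/samples",
    headers={"Authorization": "Bearer " + WIT_TOKEN},
    json=samples,
)
resp.raise_for_status()
```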

Creating a SOLR index for activity stream or newsfeed

I am trying to index the activity feed of a social portal I am building. The portal allows users to follow each other and get updates from the people they follow, as an activity feed sorted by date.
For example, user A will be following users B, C, D, E & F. So user A should see all the posts from B, C, D, E & F on his/her activity feed.
Let's assume each post consists of just two fields:
1. The text of the post. (text_field)
2. The name/UID of the user who posted it. (user_field)
Currently, I am creating an index for all the posts and indexing the text_field & user_field. At scale, there can be 1,000,000+ posts, and a user may follow hundreds if not thousands of users. What would be the best way to create an index for this scenario?
Should I also index a person's followers, so they can be looked up quickly and then passed to a second query that gets the posts of all those users sorted by date?
What is the best way to query the index of all these posts by passing the UIDs of all the followed users, considering there may be hundreds of them or more?
Update:
The motivation for using Solr for the news feed was mainly inspired by this detailed slide and my brief discussion with OpenSocial team.
When starting off with a social portal, fan-out-on-write seems like overkill and more expensive; fan-out-on-read is better. Both the slide and the OpenSocial team suggested using a search backend for fan-out-on-read, and the slide mentioned above also has data on how it helped them.
At present, the feed is going to be flat, and the only sort criterion will be the date (recency). We won't be considering relevance or posts from closer groups.
It's kind of abstract, but I will do my best here. Based on what you mentioned, I am not sure Solr is really the right tool for the job. You can still use Solr for full-text search, but I am not sure about generating a news feed from it in this scenario. Remember that although Solr is pretty impressive, it is a search engine. I will pretend that you will stick with Solr for the rest of the post, but keep in mind that we are trying to put a square peg through a round hole here.
Here are a few additional questions you should think about.
You will probably want to add a timestamp of the post to the data element
You need to figure out how to properly sort the results. Is it in order of recency? Or based on posts that the user is more likely to interact with?
If a user has 1000+ connections, would he want to see an update from every one of them in the main feed? Or should posts from a closer group of friends show up higher?
Here are some comments about your questions:
1) If you index a person's followers, it may be hard to keep up. I assume followers change often, so re-indexing in this scenario would not really be practical.
2) That sounds more on par, but again, you need to figure out the sorting. You can get a list of connections for the user, then run a search for the top posts from all of them, as sketched below.
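A minimal sketch of that two-step, fan-out-on-read query in Python, assuming a posts core with the question's text_field/user_field plus an added posted_at timestamp field (all names are illustrative):
```
import requests

# Step 1: fetch the followed user IDs from your primary datastore.
followed_ids = ["B", "C", "D", "E", "F"]  # user A's connections

# Step 2: ask Solr for their posts, newest first. The {!terms} query
# parser avoids building a huge OR clause for hundreds of IDs.
params = {
    "q": "*:*",
    "fq": "{!terms f=user_field}" + ",".join(followed_ids),
    "sort": "posted_at desc",  # assumed timestamp field, see above
    "rows": 20,
    "wt": "json",
}
resp = requests.get("http://localhost:8983/solr/posts/select", params=params)
docs = resp.json()["response"]["docs"]
```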

Instagram: sort photos with a specific tag with most likes

I'm running a contest on the web where the image with the most likes wins. It's tiresome having to go through 900 images manually, so what I want to do is sort all images with a given tag, let's say #computer, by the number of likes, with the most-liked pics on top. I have searched the net like crazy for some program or site that does this (ExtraGram, gramhoot, statigram, webstagram) but none offer sorting by number of likes, and it drives me INSANE! It's a really relevant request.
I've tried instafeed.js, but it doesn't include all images; it actually leaves out the ones with the most likes, which defeats the purpose.
There's nothing I know of in the Instagram API that sends back media sorted by likes in advance. I don't think there's a tool to do this either, but writing one is relatively simple IMO and I've done it before for a contest specifically.
The simplest thing to do is the following:
1. Use the Instagram API (via a library or pure REST) to query by tag. For instance, if you only care about the most recently tagged media or you want to process by date, you can use the /tags/tag-name/media/recent endpoint.
2. Page through each result page by processing next_max_id/next_max_tag_id.
3. Collect the results locally into a database. You will receive the "like" count for each media item. You will have to update the data if you want to track likes over time.
4. Sort the results using your database, or if it's a small result set, you could skip #3 and just sort in memory.
5. If you need to refresh the results, you need to subscribe to the tag via the API. You can give Instagram a URL to push updates to, and then you'll have to retrieve one or more media items and update them in your database accordingly.
You will of course need to register your application with Instagram to get an API key if you want to do this. Then you can either send them your client_id or use OAuth.
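Here is a condensed Python sketch of steps 1, 2, and 4, written against the v1 endpoints described above (the access token is a placeholder, and the field names follow the old API's JSON shape):
```
import requests

ACCESS_TOKEN = "YOUR_ACCESS_TOKEN"  # placeholder, obtained via OAuth
url = "https://api.instagram.com/v1/tags/computer/media/recent"
params = {"access_token": ACCESS_TOKEN}

media = []
while url:
    payload = requests.get(url, params=params).json()
    media.extend(payload.get("data", []))
    # The pagination block carries the URL of the next page, if any.
    url = payload.get("pagination", {}).get("next_url")
    params = {}  # next_url already embeds the token and max_tag_id

# Small result set: skip the database and sort in memory (step 4).
media.sort(key=lambda m: m["likes"]["count"], reverse=True)
for m in media[:10]:
    print(m["likes"]["count"], m["link"])
```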
The best way to achieve this is to pull the photos in and then sort them programmatically based on the numeric like count. I've designed a plugin that does this automatically, for anyone interested:
Instagram Journal

The REST-way to check/uncheck like/unlike favorite/unfavorite a resource

Currently I am developing an API, and within that API I want signed-in users to be able to like/unlike or favorite/unfavorite two resources.
My "Like" model (it's a Ruby on Rails 3 application) is polymorphic and belongs to two different resources:
/api/v1/resource-a/:id/likes
and
/api/v1/resource-a/:resource_a_id/resource-b/:id/likes
The thing is: I am in doubt about which way to go to make my resources as RESTful as possible. I have already tried the following two ways of implementing a like/unlike structure in my URLs:
Case A: (like/unlike being the member of the "resource")
PUT /api/v1/resource/:id/like maps to Api::V1::ResourceController#like
PUT /api/v1/resource/:id/unlike maps to Api::V1::ResourceController#unlike
and case B: ("likes" is a resource in its own right)
POST /api/v1/resource/:id/likes maps to Api::V1::LikesController#create
DELETE /api/v1/resource/:id/likes maps to Api::V1::LikesController#destroy
In both cases I already have a user session, so I don't have to mention the id of the corresponding "like"-record when deleting/"unliking".
I would like to know how you guys have implemented such cases!
Update April 15th, 2011: By "session" I mean an HTTP Basic Authentication header sent with each request, providing an encoded username:password combination.
I think the fact that you're maintaining application state on the server (a user session that contains the user id) is one of the problems here. It's making this a lot more difficult than it needs to be, and it's breaking REST's statelessness constraint.
In Case A, you've given URIs to operations, which again is not RESTful. URIs identify resources and state transitions should be performed using a uniform interface that is common to all resources. I think Case B is a lot better in this respect.
So, with these two things in mind, I'd propose something like:
PUT /api/v1/resource/:id/likes/:userid
DELETE /api/v1/resource/:id/likes/:userid
We also have the added benefit that a user can only register one 'Like' (they can repeat that 'Like' as many times as they like, and since the PUT is idempotent it has the same result no matter how many times it's performed). DELETE is also idempotent, so if an 'Unlike' operation is repeated many times for some reason then the system remains in a consistent state. Of course you can implement POST in this way, but if we use PUT and DELETE we can see that the rules associated with these verbs seem to fit our use-case really well.
I can also imagine another useful request:
GET /api/v1/resource/:id/likes/:userid
That would return details of a 'Like', such as the date it was made or the ordinal (i.e. 'This was the 50th like!').
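A minimal sketch of this scheme (in Flask rather than the asker's Rails 3 app, with an in-memory store standing in for the database) to show the idempotency argument in practice:
```
from flask import Flask, jsonify

app = Flask(__name__)
likes = {}  # (resource_id, user_id) -> like record; a real app uses a DB

@app.route("/api/v1/resource/<rid>/likes/<uid>", methods=["PUT"])
def like(rid, uid):
    # PUT is idempotent: repeating the request stores the same single record.
    likes.setdefault((rid, uid), {"ordinal": len(likes) + 1})
    return "", 204

@app.route("/api/v1/resource/<rid>/likes/<uid>", methods=["DELETE"])
def unlike(rid, uid):
    # DELETE is idempotent too: removing a missing like is not an error.
    likes.pop((rid, uid), None)
    return "", 204

@app.route("/api/v1/resource/<rid>/likes/<uid>", methods=["GET"])
def show_like(rid, uid):
    rec = likes.get((rid, uid))
    return (jsonify(rec), 200) if rec else ("", 404)
```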
Case B is better, and here's a good example from the GitHub API:
Star a repo
PUT /user/starred/:owner/:repo
Unstar a repo
DELETE /user/starred/:owner/:repo
You are in effect defining a "like" resource, a fact that a user resource likes some other resource in your system. So in REST, you'll need to pick a resource name scheme that uniquely identifies this fact. I'd suggest (using songs as the example):
/like/user/{user-id}/song/{song-id}
Then PUT establishes a liking, and DELETE removes it. GET of course finds out if someone likes a particular song. And you could define GET /like/user/{user-id} to see a list of the songs a particular user likes, and GET /like/song/{song-id} to see a list of the users who like a particular song.
If you assume the user name is established by the existing session, as @joelittlejohn points out, and is not part of the like resource name, then you're violating REST's statelessness constraint and you lose some very important advantages. For instance, a user can only get their own likes, not their friends' likes. It also breaks HTTP caching, because one user's likes are indistinguishable from another's.

Can you get the exact date a user started following another using the twitter API?

Let's say user A follows user B, and B follows A. I want to know the exact date A started following B, and vice versa.
Is this information stored on twitter? Can I retrieve it using the API?
To be clear: the point of this question is finding a way to know who followed whom first.
(I'm assuming both A and B deleted the notification e-mails)
No Ignacio, you can't. You can only know who follows whom, not the date the follow started.
Looking at the API, there is no way; there are two calls to get the followers:
User Methods/statuses/followers
and
Social Graph Methods/followers/ids
Neither of them returns dates, or even a serial number that would let you see who started following first. Really, there's no indication that Twitter is storing this information internally, either in the API or in Twitter's web interface.
This is a very old question, but perhaps some might be interested to know that while you cannot get the date at which someone started following, you can at least infer an "earliest possible following date" from two facts: the list of followers is ordered by date, and follower objects come with a created_at timestamp.
Here's a Python function for calculating an "earliest possible following date": https://github.com/BernhardClemm/twitter-follow-dates
Of course Twitter stores it, because Twitter sorts followers and following lists by the date ;)
It is possible to do this, but impractical. When you call the followers API you can page the results. Each returned object contains next_cursor and prev_cursor items, which refer to the first and last records in the next and previous pages. These values are time-based and can be used to calculate the time at which the respective users followed you.
It follows that, if you set the page size to 1, you can walk through the list of follower IDs one at a time, and the next_cursor value will let you derive the follow time for the next record.
This is reasonably simple to implement, however, in practice, you'll very quickly hit Twitter's API rate limit.
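A rough Python sketch of that walk against the v1.1 followers/ids endpoint. Note that deriving a timestamp from next_cursor is this answer's inference, not documented API behaviour, and the bearer token and screen name are placeholders:
```
import requests

BEARER = "YOUR_BEARER_TOKEN"  # placeholder app-auth token
url = "https://api.twitter.com/1.1/followers/ids.json"
headers = {"Authorization": "Bearer " + BEARER}

cursor = -1  # -1 requests the first page in Twitter's cursoring scheme
while cursor != 0:
    page = requests.get(
        url,
        headers=headers,
        params={"screen_name": "some_user", "count": 1, "cursor": cursor},
    ).json()
    for follower_id in page.get("ids", []):
        # next_cursor is the value the answer proposes deriving a time from.
        print(follower_id, page.get("next_cursor"))
    cursor = page.get("next_cursor", 0)
    # At 15 requests per 15-minute window, a large follower list will
    # hit the rate limit almost immediately, as noted above.
```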