Setting options in Apache Stanbol request - semantic-web

I am trying to learn about the semantic web using a service to which I can pass sample text, which is normally in English.
I am using the IKS project to learn:
http://dev.iks-project.eu:8080/engines
Is there a way to request that enhancements be returned only in English?
Also, if a word such as London is repeated 10 times in the sample text, London is also returned 10 times in the RDF content. Can I get unique suggestions back?

Related

Is iTunes search API working correctly?

I’m trying to search for all films containing the word “Seven”. The search finds 50 films, but some of the films and their descriptions do not contain the word “Seven”. For example, the film with trackId = 421072264 does not contain the word “Seven”.
Is iTunes search API working correctly?
https://itunes.apple.com/search?term=Seven&entity=movie
I have attached the response in the file Search%3DSeven.txt:
https://dl.dropboxusercontent.com/u/55328092/Search%3DSeven.txt
Yes, this is working correctly. The search defaults to a broader set of data to search on when no attribute parameter is specified in the query.
You can narrow the search by setting a value for attribute in the query. However, this will still not stop it from returning films with 7 in their title; this seems to just be a quirk of the search algorithm.
https://itunes.apple.com/search?term=Seven&entity=movie&attribute=movieTerm
Weirdly this returns "Fast & Furious 5 - 7 Collection" before "Seven" which doesn't make much sense.
If it needs to be more precise, an ID lookup can be done, which returns only the specified movie:
https://itunes.apple.com/lookup?id=534912090
See this link for more details on how to modify the search query and lookup searches:
https://affiliate.itunes.apple.com/resources/documentation/itunes-store-web-service-search-api/#searching
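Since the attribute parameter narrows the search but does not guarantee that the term appears in the title, one option is to filter the results on the client side. The following is a minimal Python sketch, not an official client. It assumes the decoded JSON results are dicts with a trackName field, and the helper names build_search_url and titles_containing are made up for illustration.

```python
from urllib.parse import urlencode

SEARCH_ENDPOINT = "https://itunes.apple.com/search"


def build_search_url(term, entity="movie", attribute=None):
    """Build a search URL; `attribute` narrows which field is searched."""
    params = {"term": term, "entity": entity}
    if attribute:
        params["attribute"] = attribute
    return SEARCH_ENDPOINT + "?" + urlencode(params)


def titles_containing(results, term):
    """Keep only results whose trackName contains `term` (case-insensitive)."""
    needle = term.lower()
    return [r for r in results if needle in r.get("trackName", "").lower()]


if __name__ == "__main__":
    print(build_search_url("Seven", attribute="movieTerm"))
    # Hypothetical sample of decoded results (track IDs are placeholders).
    sample = [
        {"trackId": 1, "trackName": "Seven"},
        {"trackId": 2, "trackName": "Fast & Furious 5 - 7 Collection"},
    ]
    print(titles_containing(sample, "Seven"))
```

The filter is deliberately strict: it drops the "7" matches that the search algorithm treats as equivalent to "Seven".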

How to get with Mediawiki API all images in a category which are not in another one?

I am entirely new to APIs, so sorry if the question is silly.
I would like to get all images in a category on Commons, let's say X, but exclude those which are also in another one (Y). I do not understand whether I can actually do this.
https://commons.wikimedia.org/w/api.php?action=query&list=categorymembers&cmtype=file&cmtitle=Category:X
will get all of them; how can I exclude some?
Moreover, I would like the result to include the description of the images, not just the file names. Is that possible?
MediaWiki has, by default, no built-in support for building and querying category intersections. To accomplish this task, extensions, external tools, or multiple API queries with result processing are required.
CirrusSearch API
On Wikimedia Commons, as on the whole Wikimedia wiki farm, CirrusSearch powers filtered search, including search for category intersections, and it is also available through the API (action=query&list=search&srsearch=incategory:A+-incategory:B; this is Category:A minus Category:B).
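The CirrusSearch query above can be assembled programmatically. This is only a sketch: the function name category_difference_url is made up, the category names are quoted so that names containing spaces work, and restricting the search to namespace 6 (files) is my own addition to match the question.

```python
from urllib.parse import urlencode

API = "https://commons.wikimedia.org/w/api.php"


def category_difference_url(include, exclude, limit=50):
    """Build a search URL for pages in `include` that are not in `exclude`."""
    query = f'incategory:"{include}" -incategory:"{exclude}"'
    params = {
        "action": "query",
        "list": "search",
        "srsearch": query,
        "srnamespace": 6,  # assumption: only files are wanted
        "srlimit": limit,
        "format": "json",
    }
    return API + "?" + urlencode(params)


if __name__ == "__main__":
    print(category_difference_url("X", "Y"))
```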
FastCCI
One of the tools I can recommend (because it's a dedicated high-performance solution and actually running) is fastcci, developed by Daniel Schwen. For Wikimedia Commons specifically, a database is already maintained and a web service is running, but it can be set up for any wiki, provided the tool set has a host to run on and database access.
Query
Consider the following query URL:
https://fastcci.wmflabs.org/?c1=3302993&c2=15516712&d1=0&d2=0&s=200&a=not&t=js
https://fastcci.wmflabs.org/ - Host Wikimedia Commons fastcci runs on
c1 - ID of category 1
c2 - ID of category 2
d1 - depth of category 1 to search in (fastcci by default considers sub-categories)
d2 - depth of category 2 to search in (fastcci by default considers sub-categories)
s - Number of results to return
o - Offset
a - conjunction, i.e. how the two categories are combined (a=not in the URL above)
t - connection type (t=js for a JSONP response; otherwise assumes being used as websocket)
Response
fastcciCallback( [ 'RESULT 27572680,0,0|1675043,0,0|27577015,0,0|27577043,0,0|27577106,0,0|27576896,0,0|27576790,0,0|23481936,0,0|17560964,0,0|11009066,0,0', 'OUTOF 10', 'DBAGE 378310', 'DONE'] );
RESULT is followed by a |-separated list of up to 50 integer triplets of the form pageId,depth,tag. Each triplet stands for one image or category.
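The response format described above is easy to parse once the JSONP wrapper has been removed. The sketch below is not part of fastcci; it assumes the array of strings has already been extracted from the fastcciCallback(...) call, and it stores the OUTOF and DBAGE values as plain integers without interpreting them.

```python
def parse_fastcci(lines):
    """Parse fastcci response lines into result triplets plus metadata."""
    parsed = {"results": [], "outof": None, "dbage": None, "done": False}
    for line in lines:
        keyword, _, rest = line.partition(" ")
        if keyword == "RESULT":
            # Each triplet has the form pageId,depth,tag.
            for triplet in filter(None, rest.split("|")):
                page_id, depth, tag = (int(x) for x in triplet.split(","))
                parsed["results"].append((page_id, depth, tag))
        elif keyword == "OUTOF":
            parsed["outof"] = int(rest)
        elif keyword == "DBAGE":
            parsed["dbage"] = int(rest)
        elif keyword == "DONE":
            parsed["done"] = True
    return parsed


if __name__ == "__main__":
    sample = [
        "RESULT 27572680,0,0|1675043,0,0|27577015,0,0|27577043,0,0|27577106,0,0"
        "|27576896,0,0|27576790,0,0|23481936,0,0|17560964,0,0|11009066,0,0",
        "OUTOF 10",
        "DBAGE 378310",
        "DONE",
    ]
    response = parse_fastcci(sample)
    # Page IDs joined with a pipe can be passed to action=query&pageids=...
    print("|".join(str(page_id) for page_id, _, _ in response["results"]))
```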
Resources
Sample client-side implementation - to see it in action, visit any category page and look next to the Good pictures button.
Example is FilesOf('Category:Saaleck') - FilesOf('Category:Rapeseed fields in Saxony-Anhalt')
Server application
Presentation on YouTube
Slides
A note on pageIDs
page IDs → page titles: GET /w/api.php?action=query&pageids=page_IDs_separated_by_pipe
page titles → page IDs: GET /w/api.php?action=query&titles=Titles_separated_by_pipe
AFAIK, there is no way to get that directly using the API. But, assuming both categories are reasonably small, you could get all images from both of them and then compute the complement in your code.
To retrieve the description, you can use prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription.
In the context of your example query, it would look like this:
https://commons.wikimedia.org/w/api.php?action=query&generator=categorymembers&gcmtype=file&gcmtitle=Category:X&prop=imageinfo&iiprop=extmetadata&iiextmetadatafilter=ImageDescription
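The "fetch both categories and subtract" approach could look roughly like the sketch below. It is an illustration under assumptions: get_json is a hypothetical callable that takes a dict of query parameters, performs the HTTP request against api.php, and returns the decoded JSON. Pagination relies on merging the "continue" object of each response into the next request.

```python
def category_files(category, get_json):
    """Collect {pageid: title} for every file in `category`."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtype": "file",
        "cmtitle": category,
        "cmlimit": "max",
        "format": "json",
    }
    files = {}
    while True:
        data = get_json(params)
        for member in data["query"]["categorymembers"]:
            files[member["pageid"]] = member["title"]
        if "continue" not in data:
            return files
        # Carry the continuation values (e.g. cmcontinue) into the next request.
        params = {**params, **data["continue"]}


def files_only_in(first, second, get_json):
    """Return files that are in `first` but not in `second`, keyed by page ID."""
    in_first = category_files(first, get_json)
    in_second = category_files(second, get_json)
    return {pid: title for pid, title in in_first.items() if pid not in in_second}
```

Comparing by page ID rather than by title avoids surprises from title normalization. For large categories this makes many requests, which is why it only suits reasonably small ones.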

Automatic test data generation

I need to prepare sample test data with 5 million rows for different employees.
It should contain relevant information such as:
First Name
Last Name
Address-1
Address-2
Zip code
st
county
country
...etc
Is there any tool that I can use to generate it?
I have found the site http://www.generatedata.com/ to be good for this kind of thing - it has a bunch of different formats you can generate data in and outputs in a number of different formats that can be either read in by your code (e.g. from CSV) or easily translated into code using your favorite Unix text manipulation tools.
Either try a webservice, like:
http://www.generatedata.com/
http://www.mockaroo.com/
or try one of the following utils for fake data generation:
PHP "Faker" - https://github.com/fzaninotto/Faker
Perl's Data::Faker - http://metacpan.org/pod/Data::Faker
ruby "faker" - http://faker.rubyforge.org/
http://paulthedutchman.nl/datagenerator/
I would like to suggest a modern PHP fake data generator that can also fake an entire entity.
Fakerino: https://github.com/niklongstone/Fakerino
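If none of the tools above fit, a few lines of script can also stream millions of rows to CSV without holding them in memory. The Python sketch below uses only the standard library. The name pools, field names, and value ranges are placeholders I made up, not realistic data, and the output is random rather than guaranteed unique.

```python
import csv
import io
import random

# Hypothetical value pools; replace them with larger lists for more variety.
FIRST_NAMES = ["Alice", "Bob", "Carol", "David"]
LAST_NAMES = ["Smith", "Jones", "Brown", "Taylor"]
STREETS = ["Main St", "Oak Ave", "Elm Rd"]
STATES = ["CA", "NY", "TX"]

FIELDS = ["first_name", "last_name", "address_1", "address_2",
          "zip_code", "state", "country"]


def employee_rows(count, seed=0):
    """Lazily yield `count` random employee rows (reproducible via `seed`)."""
    rng = random.Random(seed)
    for _ in range(count):
        yield {
            "first_name": rng.choice(FIRST_NAMES),
            "last_name": rng.choice(LAST_NAMES),
            "address_1": f"{rng.randint(1, 9999)} {rng.choice(STREETS)}",
            "address_2": f"Apt {rng.randint(1, 500)}",
            "zip_code": f"{rng.randint(0, 99999):05d}",
            "state": rng.choice(STATES),
            "country": "US",
        }


def write_employees(out, count, seed=0):
    """Write a header plus `count` rows to the file-like object `out`."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(employee_rows(count, seed))


if __name__ == "__main__":
    buffer = io.StringIO()
    write_employees(buffer, 3)
    print(buffer.getvalue())
```

For the full data set, pass an open file and count=5_000_000. Because the rows come from a generator, memory use stays flat.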

What's wrong with my Neo4j 2.0 query?

I am trying to understand why the data is not showing up in my query. I was wondering if there is any way to troubleshoot what's going on.
Here is the current issue:
I have populated some data from an existing test database to check performance, with a relationship like this: (e:Event)-[:FOR_USER]->(u:User). When I get all the users and look at the property, I can see the data, but when I query the users using the same data, it says 0 records found.
The image below shows the two queries:
Can someone please help me understand how to debug such an issue in Neo4j?
EDIT
The issue is that the browser is somehow collapsing multiple spaces in the result. In this case "User-May<space>1 2013 1:18AM" was displayed in both webadmin and the new browser, but in reality it should have been "User-May<space><space>1 2013<space><space>1:18AM".
So no matter what I do, I can't query the value, as it looks like the duplicate space is truncated somewhere.
The tabular data, as Micheal suggested, is below:
{"id":"75307","labels":["User"],"properties":{"Name":"User-May 1 2013 1:18AM"}}
and what we are seeing is User-May 1 2013 1:18AM
Regards
Kiran
Use the following Cypher syntax in the browser:
MATCH (user:User { Name: "User-May  1 2013  1:18AM" })
RETURN user.Name as Name
As for multiple spaces being trimmed in the rendering, that is browser-specific behavior. See the screenshot below for an example:
The text itself is preserved as it is returned from the Neo4j server. As you can see when I analyze the HTML element of the browser using Firebug, the redundant spaces are indeed there.
So again, this doesn't seem to be a bug in Neo4j; it's how the browser you are using renders the text. The browser expects redundant spaces to be encoded like so: "Testing  testing", which is HTML-encoded as Testing&nbsp;&nbsp;testing
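When debugging this kind of mismatch, it helps to make the whitespace visible instead of trusting what a browser renders. The following Python sketch is independent of Neo4j and its drivers. The sample value with double spaces is taken from the question's edit, and both helper names are made up.

```python
import re


def show_spaces(value):
    """Replace every space with a middle dot so runs of spaces are visible."""
    return value.replace(" ", "\u00b7")


def html_preserve_spaces(value):
    """Encode runs of two or more spaces as &nbsp; so HTML keeps them."""
    return re.sub(r" {2,}", lambda m: "&nbsp;" * len(m.group()), value)


if __name__ == "__main__":
    stored = "User-May  1 2013  1:18AM"      # value as stored (two double spaces)
    rendered = re.sub(r" +", " ", stored)    # what HTML rendering shows
    print(show_spaces(stored))
    print(show_spaces(rendered))
    print(stored == rendered)
    print(html_preserve_spaces("Testing  testing"))
```

Printing the stored and rendered values side by side shows why a string copied from the rendered page does not match the property in the database.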

Google Custom Search API, Howto return country specific results only

I am writing some PHP code which takes a given search phrase and URL and searches through the Google search results until it finds the URL (first 100 results only). My problem is that this only works for the US. I have tried adding the "&cr=" option, but it still only returns US results.
The full URL I am using for the request is:
https://www.googleapis.com/customsearch/v1?key=API_KEY&cx=CX_VALUE&q=KEYWORD&cr=COUNTRY&alt=JSON
Does anyone have any experience with this? I want to be able to see UK results. I tried inserting &cr=countryUK, but it still only returns US results.
Thanks :)
Regards,
Stian
Use the gl=<country code> parameter to limit results to your country of choice (so gl=gb for the UK).
More info here:
http://googleajaxsearchapi.blogspot.com/2009/10/web-search-in-your-country.html
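A request URL with the gl parameter can be built as in the sketch below. The question uses PHP; this illustration is in Python and only constructs URLs. API_KEY, CX_VALUE and KEYWORD are the placeholders from the question. Paging with start in steps of 10 to cover the first 100 results is my assumption about how the request would be repeated, and the function names are made up.

```python
from urllib.parse import urlencode

ENDPOINT = "https://www.googleapis.com/customsearch/v1"


def build_cse_url(api_key, cx, query, country_code=None, start=1):
    """Build a Custom Search request URL, optionally with gl=<country code>."""
    params = {"key": api_key, "cx": cx, "q": query, "start": start}
    if country_code:
        params["gl"] = country_code
    return ENDPOINT + "?" + urlencode(params)


def first_100_urls(api_key, cx, query, country_code=None):
    """One URL per page of 10 results: start=1, 11, ..., 91 (assumed paging)."""
    return [
        build_cse_url(api_key, cx, query, country_code, start)
        for start in range(1, 101, 10)
    ]


if __name__ == "__main__":
    print(build_cse_url("API_KEY", "CX_VALUE", "KEYWORD", country_code="gb"))
```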