How to search entities by instance in Wikidata API


I tried, without success, to restrict this query:
https://www.wikidata.org/w/api.php?action=wbsearchentities&search=arturo&format=json&language=en&uselang=en&type=item
so that it only returns items that are an instance of "human", that is, P31: Q5.
Any help will be appreciated, because I can't find a way to set those properties.
This kind of query explodes in SPARQL, but if someone has a related answer that gets the same result (search for all people whose name starts with ...), it will be appreciated too.

I'm really not sure that's possible with the MediaWiki API. It was not possible a few years ago, and I don't think the feature has been implemented since then.
But OpenRefine has a reconciliation service for Wikidata, based on an API that can filter by type ("instance of") or property. Its developer has not yet advertised it explicitly as a standalone API, but this is planned. Here is an example of use: find in Wikidata the people called "arturo" who are instances of human (Q5) and whose occupation (P106) is actor (Q33999).
https://tools.wmflabs.org/openrefine-wikidata/en/api?query={
"query":"arturo",
"limit":6,
"type" : "Q5",
"properties" : [
{ "pid" : "P106" , "v" : "Q33999"}
]
}
The main problem with this query is that it will only return Arturos who are explicitly an instance of Q5. This will be the case for this one, but not for those who are, for example, only an instance of "film maker" (and it does not matter that film maker is itself a subclass of "human"). SPARQL can handle class transitivity using property paths (e.g. wdt:P31/wdt:P279*), but this API has not (yet) implemented them.
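As a rough sketch of how the reconciliation request above could be assembled programmatically: the endpoint and the JSON keys (query, limit, type, properties, pid, v) are taken from the example, while the helper name build_reconcile_url is only an illustration. The service may have moved or changed since this answer was written, so treat the URL as an assumption.

```python
import json
from urllib.parse import quote

# Endpoint copied from the example above; it may no longer be live.
RECONCILE_ENDPOINT = "https://tools.wmflabs.org/openrefine-wikidata/en/api"


def build_reconcile_url(name, type_qid, properties=None, limit=6):
    """Build a reconciliation URL (hypothetical helper, not part of any library)."""
    query = {"query": name, "limit": limit, "type": type_qid}
    if properties:
        # Each constraint is a (property id, expected value) pair, e.g. ("P106", "Q33999").
        query["properties"] = [{"pid": pid, "v": value} for pid, value in properties]
    # The JSON document is passed URL-encoded in the "query" parameter.
    return RECONCILE_ENDPOINT + "?query=" + quote(json.dumps(query))


url = build_reconcile_url("arturo", "Q5", [("P106", "Q33999")])
print(url)
```

Only the URL is built here; no request is sent.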

Related

Representing complex data types in XACML using Authzforce

I am new to XACML and I would be grateful if you can help me with one problem I encountered.
I use AuthzForce Core PDP (version 17.1.2).
I am wondering what the correct approach to representing complex data types in XACML is.
Example
Access should be granted if the PIP response contains any person whose name is present in the names array from the request and whose salary is higher than the salary provided in the request.
Request
names = ["Eric", "Kyle"]
salary = 1500
PIP response
[
{
"name": "Kyle",
"salary": 1000
},
{
"name": "Kenny",
"salary": 2000
},
{
"name": "Eric",
"salary": 4000
},
{
"name": "Stan",
"salary": 3000
}
]
Access will be granted because the PIP response contains a person named Eric whose salary is higher than 1500.
My implementation
To represent the PIP response I ended up creating a custom type by extending the StringParseableValue class from AuthzForce. For the logic mentioned above I use an attribute designator in XML and have a corresponding attribute provider (a class extending BaseNamedAttributeProvider) in Java performing the PIP call.
I also wrote two custom functions:
Find people with higher salary than provided in one param (returns filtered list)
Get person name (returns string)
Using those functions and standard functions I wrote a policy, and it works.
However, my solution seems to be overcomplicated. I suppose what I did can be achieved by using only standard functions.
Additionally, if I wanted to define a hardcoded bag of people inside another policy, a single element would look like this:
<AttributeValue DataType="person">name=Eric###salary=4000</AttributeValue>
There is always a possibility that parsing such strings might fail.
So my question is: what is a good practice for representing complex types like my PIP response in XACML using AuthzForce? Sometimes I might need to pass more complex data in the request, and I saw an example in the XACML specification that passes such data inside the <Content> element.

Creating a new XACML data-type - and consequently new XACML function(s) to handle that new data-type - seems a bit overkill indeed. Instead, you may improve your PIP (Attribute Provider) a little bit, so that it returns only the results for the employees named in the Request, and only their salaries (extracting them from the JSON using JSON path) returned as a bag of integers.
Then, assuming this PIP result is set to the attribute employee_salaries in your policy (bag of integers) for instance, and min_salary is the salary in the Request, it is just a matter of applying any-of(integer-less-than, min_salary, employee_salaries) in a Condition. (I'm using short names for the functions for convenience; please refer to the XACML 3.0 standard for the full identifiers.)
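To make the semantics of that Condition concrete, here is a small Python sketch of what any-of does. It is not AuthzForce code; any_of is a hypothetical stand-in that mimics the XACML higher-order function, and the salaries are the ones Kyle and Eric have in the question.

```python
from operator import lt


def any_of(predicate, value, bag):
    """Mimic XACML any-of: true if predicate(value, element) holds for at least one element."""
    return any(predicate(value, element) for element in bag)


min_salary = 1500                  # salary from the Request
employee_salaries = [1000, 4000]   # bag returned by the PIP for Kyle and Eric

# integer-less-than(min_salary, salary) is true when min_salary < salary.
permit = any_of(lt, min_salary, employee_salaries)
print(permit)
```

Note the argument order: the request value comes first, so the comparison reads "min_salary is less than some employee salary".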
Tips to improve the PIP:
One issue here is performance (scalability, response time / size...): if you have hundreds or even thousands of employees, it is overkill to get the whole list from the REST service over and over, all the more so as you need only a small subset (the names in the Request). Instead, you may have some way to ask the REST service to return only specific employees, using query parameters; an example using RSQL (but this depends on the REST service API):
HTTP GET http://rest-service.example.com/employees?search=names=in=($employee_names)
... where you set the $employee_names variable to (a comma-separated list of) the employee names from the Request (e.g. Eric,Kyle). You can get these in your AttributeProvider implementation, from the EvaluationContext argument of the overridden get(...) method (EvaluationContext#getNamedAttributeValue(...)).
Then you can use a JSON path library (as you did) to extract the salaries from the JSON response (so you have only the salaries of the employees named in the Request), using this JSON path for instance (tested with Jayway):
$[*].salary
If the previous option is not possible, i.e. you have no way of filtering employees on the REST API, you can always do this filtering in your AttributeProvider implementation with the JSON path library, using this JSON path for instance (tested with Jayway against your PIP response):
$[?(@.name in [$employee_names])].salary
... where you set the $employee_names variable as in the previous option, getting the names from the EvaluationContext. So the actual JSONPath after variable replacement would be something like:
$[?(@.name in [Eric,Kyle])].salary
(You may add quotes to each name to be safe.)
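For readers without a JSONPath library at hand, the effect of that filter can be sketched in plain Python. This only illustrates what the expression selects against the PIP response from the question; it is not the Java AttributeProvider code, and salaries_for is a hypothetical helper name.

```python
import json

# PIP response from the question.
pip_response = json.loads("""
[
  {"name": "Kyle",  "salary": 1000},
  {"name": "Kenny", "salary": 2000},
  {"name": "Eric",  "salary": 4000},
  {"name": "Stan",  "salary": 3000}
]
""")


def salaries_for(names, people):
    """Keep only the people named in the request and return their salaries."""
    wanted = set(names)
    return [person["salary"] for person in people if person["name"] in wanted]


employee_salaries = salaries_for(["Eric", "Kyle"], pip_response)
print(employee_salaries)
```

The resulting list plays the role of the bag of integers that the attribute provider would return to the policy.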
All things considered, if you still prefer to go for a new XACML data-type (and functions), and since you seem to have done most of the work (impressive btw), I have a suggestion, if doable without too much extra work: generalize the Person data-type to a more generic JSON object data-type that could be reused in any use case dealing with JSON. Then see whether the extra functions could be replaced with a generic JSONPath evaluation function applied to the new JSON object data-type. This would provide a JSON equivalent to the standard XML/XPath data-type and functions we already have in XACML, and this kind of contribution would benefit the AuthzForce community greatly.
For the JSON object data-type, actually you can use the one in the testutils module as an example: CustomJsonObjectBasedAttributeValue which has been used to test support of JSON objects for the GeoXACML extension.

REST GET method: Can it return a list of enriched resources?

I have a question about designing a REST API.
Suppose I have a "Customer" resource with two elements on my server, like this:
[
{
name : "Mary",
description : "An imaginary woman very tall."
},
{
name : "John",
description : "Just a guy."
}
]
I want to make an endpoint that accepts a GET request with a query. The query provides a parameter whose value is a piece of text, and an algorithm counts how many times that text occurs across all of the resource's properties.
So if we send this request:
GET {baseURL}/customers?letters=ry
I should get something like
[
{
name : "Mary",
description : "An imaginary woman very tall.",
count : 3
},
{
name : "John",
description : "Just a guy.",
count : 0
}
]
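The counting described above can be sketched as follows. This is only an illustration of the expected response, assuming the count is the number of non-overlapping occurrences of the text in every property value; enrich_with_count is a hypothetical name.

```python
customers = [
    {"name": "Mary", "description": "An imaginary woman very tall."},
    {"name": "John", "description": "Just a guy."},
]


def enrich_with_count(resources, letters):
    """Return copies of the resources with an extra 'count' of occurrences of `letters`."""
    enriched = []
    for resource in resources:
        # Count occurrences of the text in every property value of the resource.
        count = sum(str(value).count(letters) for value in resource.values())
        # Build a new dict so the stored resource itself is left unchanged.
        enriched.append({**resource, "count": count})
    return enriched


response = enrich_with_count(customers, "ry")
print(response)
```

The stored customers are not modified; the count exists only in the representation built for this request.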
The count property cannot be included in the resource schema because it depends on the value provided in the query, so the response objects have to be enriched.
I'm not getting a list of my resource, but a list of modified resources.
Although this keeps GET idempotent, I feel it departs from the REST architecture concept (even REST beyond CRUD).
Is it still a valid endpoint in a RESTful API? Or should I create something like a new resource called "ratedCustomer"?

REST GET method: Can it return a list of enriched resources?
TL;DR: yes.
Longer answer...
A successful GET request returns a representation of a single resource, identified by the request-target.
The fact that the information used to create the representation of the resource comes from multiple entities in your domain model, or multiple rows in your database, or from reports produced by other services... these are all implementation details. The HTTP transfer of documents over a network application doesn't care.
That also means that we can have multiple resources that include the same information in their representations. Think "pages in wikipedia" that duplicate each others' information.
Resource identifiers on the web are semantically opaque. All three of these identifiers are understood to be different resources:
/A
/A?enriched
/B
We human beings looking at these identifiers might expect /A?enriched to be semantically closer to /A than /B, but the machines don't make that assumption.
It's perfectly reasonable for /A?enriched to produce representations using a different schema, or even a different content-type (as far as the HTTP application is concerned, it's perfectly reasonable that /A be an HTML document and /A?enriched be an image).
Because the machines don't care, you've got additional degrees of freedom in how you design both your resources and your resource identifiers, which you can use to enjoy additional benefits, including designing a model that's easy to implement, or easy to document, or easy to interface with, or easy to monitor, or ....
Design is what we do to get more of what we want than we would get by just doing it.

SPARQL query to get all individuals of a specific class using a data value

I have an ontology that contains two classes (course, lesson). The course class has a data property called code, of type string.
How can I get all individuals of a specific class that have a specific data property value?

The general pattern is something like this:
SELECT ?individual
WHERE { ?individual a <uri-of-specific-class> ;
<uri-of-property> ?propertyValue .
FILTER(STR(?propertyValue) = "expected value")
}
You will need to adapt this to the details of your specific ontology (the URIs of your classes and properties), but it shows the general approach. I would also suggest that you try out a SPARQL tutorial; there are several good ones online.
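If the query is assembled from code, the pattern above can be filled in with a small helper. This is a sketch only: the example.org URIs and the value CS101 are placeholders, not taken from the asker's ontology, and build_class_value_query is a hypothetical name.

```python
def build_class_value_query(class_uri, property_uri, expected_value):
    """Fill the general pattern with a class URI, a property URI and a string value."""
    # Escape backslashes and double quotes so the value stays a valid SPARQL string literal.
    escaped = expected_value.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?individual\n"
        f"WHERE {{ ?individual a <{class_uri}> ;\n"
        f"                     <{property_uri}> ?propertyValue .\n"
        f'  FILTER(STR(?propertyValue) = "{escaped}")\n'
        "}"
    )


# Placeholder URIs: replace them with the ones from your ontology.
query = build_class_value_query(
    "http://example.org/onto#course",
    "http://example.org/onto#code",
    "CS101",
)
print(query)
```

The resulting string can be passed to whatever SPARQL engine or endpoint hosts the ontology.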

How to construct intersection in REST Hypermedia API?

This question is language independent. Let's not worry about frameworks or implementation, let's just say everything can be implemented and let's look at REST API in an abstract way. In other words: I'm building a framework right now and I didn't see any solution to this problem anywhere.
Question
How can one construct a REST URL endpoint for the intersection of two independent REST paths that return collections? Short example: how to intersect /users/1/comments and /companies/6/comments?
Constraint
All endpoints should return single data model entity or collection of entities.
Imho this is a very reasonable constraint and all examples of Hypermedia APIs look like this, even in draft-kelly-json-hal-07.
If you think this is an invalid constraint or you know a better way please let me know.
Example
So let's say we have an application which has three data types: products, categories and companies. Each company can add some products to their profile page. While adding the product they must attach a category to the product. For example we can access this kind of data like this:
GET /categories will return collection of all categories
GET /categories/9 will return category of id 9
GET /categories/9/products will return all products inside category of id 9
GET /companies/7/products will return all products added to profile page of company of id 7
I've omitted _links hypermedia part on purpose because it is straightforward, for example / gives _links to /categories and /companies etc. We just need to remember that by using hypermedia we are traversing relations graph.
How to write a URL that will return all products that are from company(7) and are of category(9)? In other words, how to intersect /categories/9/products and /companies/7/products?
Assuming that all endpoints should represent a data model resource or a collection of them, I believe this is a fundamental problem of REST Hypermedia APIs. When traversing a hypermedia API we traverse a relational graph along one path, so it is impossible to describe such an intersection, because it is a cross-section of two independent graph paths.
In other words, I think we cannot represent two independent paths with only one path. Normally we traverse one path like A->B->C, but if we have X->Y and Z->Y and we want all Ys that come from both X and Z, then we have a problem.
So far my proposal is to use query strings: /categories/9/products?intersect=/companies/9 but can we do better?
Why do I want this?
Because I'm building a framework which will auto-generate a REST Hypermedia API based on SQL database relations. You could think of it as a transcompiler of URLs to SELECT ... JOIN ... WHERE queries, but the client of the API only sees hypermedia, and the client would like to have a nice way of doing intersections, like in the example.

I don't think you should always look at REST as a database representation; this case looks more like a specific piece of functionality to me. I think I'd go with something like this:
/intersection/comments?company=9&product=5
I did some digging after I wrote this, and here is what I found (http://www.vinaysahni.com/best-practices-for-a-pragmatic-restful-api):
Sometimes you really have no way to map the action to a sensible RESTful structure. For example, a multi-resource search doesn't really make sense to be applied to a specific resource's endpoint. In this case, /search would make the most sense even though it isn't a resource. This is OK - just do what's right from the perspective of the API consumer and make sure it's documented clearly to avoid confusion.

What you want to do is filter the products in one of the categories. Following your example, if we have:
GET /categories/9/products
The above will return all products in category 9, so to keep only the products of company 7 I would use something like this:
GET /categories/9/products?company=7
You should treat the URI as a link that fetches all the data (just like a simple SELECT query in SQL) and the query parameters as WHERE, LIMIT, DESC etc.
Using this approach you can build complex and readable queries, e.g.
GET /categories/9/products?company=7&order=name,asc&offset=10&limit=20
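A minimal sketch of how a server might interpret those query parameters, using an in-memory list in place of a database. The product rows and the function name list_category_products are made up for illustration; the parameter names (company, order, offset, limit) follow the URL above.

```python
# Made-up sample data standing in for a products table.
products = [
    {"id": 1, "name": "Anvil", "category": 9, "company": 7},
    {"id": 2, "name": "Bolt", "category": 9, "company": 6},
    {"id": 3, "name": "Crate", "category": 9, "company": 7},
    {"id": 4, "name": "Drill", "category": 3, "company": 7},
]


def list_category_products(category_id, company=None, order=None, offset=0, limit=None):
    """Path selects the collection; query parameters filter, sort and page it."""
    rows = [p for p in products if p["category"] == category_id]
    if company is not None:
        # Equivalent of an extra WHERE clause.
        rows = [p for p in rows if p["company"] == company]
    if order:
        # "name,asc" or "name,desc", as in the example URL.
        field, _, direction = order.partition(",")
        rows.sort(key=lambda p: p[field], reverse=(direction == "desc"))
    end = None if limit is None else offset + limit
    return rows[offset:end]


# GET /categories/9/products?company=7&order=name,desc
result = list_category_products(9, company=7, order="name,desc")
print(result)
```

The intersection asked about in the question falls out as a plain filter: category 9 comes from the path, company 7 from the query string.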

All endpoints should return single data model entity or collection of entities.
This is NOT a REST constraint. If you want to read about REST constraints, then read the Fielding dissertation.
Because I'm building a framework which will auto-generate REST Hypermedia API based on SQL database relations.
This is a wrong approach and has nothing to do with REST.
With REST you describe possible resource state transitions (or operation call templates) by sending hyperlinks in the response. These hyperlinks consist of an HTTP method and a URI (and other data which is not relevant now) if you build the uniform interface using the HTTP and URI standards, and we usually do so. The URIs are not (necessarily) database entity and collection identifiers, and if you apply such a constraint you will end up with a CRUD API, not a REST API.
If you cannot describe an operation with the combination of HTTP methods and already existing resources, then you need a new resource.
In your case you want to aggregate the GET /users/1/comments and GET /companies/6/comments responses, so you need to define a link with GET and a third resource:
GET /comments/?users=1&companies=6
GET /intersection/users:1/companies:6/comments
GET /intersection/users/1/companies/6/comments
etc...

RESTful architecture is about returning resources that contain hypermedia controls that offer state transitions. What I see here is a multistep process of state transitions. Let's assume you have a root resource and somehow navigate over to /categories/9/products using the available hypermedia controls. I'd bet the results would look something like this in HAL:
{
_links : {
self : { href : "/categories/9/products"}
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
If you want your client to be able to intersect this with another collection, you need to provide them with the mechanism to do so. You have to give them a hypermedia control. HAL only has links, templated links, and embedded resources as control types. Let's go with links and change the response to:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
Now the client just picks the right hypermedia control (aka link) based on the title field of the link.
That's the simplest solution. But you'll probably say there are thousands of companies and you don't want thousands of links. Well, OK, if that's REALLY the case, you just offer a state transition in the middle of the two we have:
{
_links : {
self : { href : "/categories/9/products"},
x:intersect-options : { href : "URL to a Paged collection of all intersect options"},
x:intersect-with : [
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 1",
title : "Company 6 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 2",
title : "Company 5 products"
},
{
href : "URL IS ABSOLUTELY IRRELEVANT!!! but unique 3",
title : "Company 7 products"
}
]
},
_embedded : {
item : [
{json of prod 1},
{json of prod 2}
]
}
}
See what I did there? An extra control for an extra state transition. JUST LIKE YOU WOULD DO IF YOU HAD A WEBPAGE. You'd probably put it in a pop-up; well, that's what the client of your app can do too with the result of that control.
It's really that simple: just think how you'd do it in HTML and do the same.
The big benefit here is that the client NEVER EVER needed to know a company or category id, or to plug one into some template. The ids are implementation details; the client never knows they exist, it just executes hypermedia controls, and that is RESTful.
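From the client's side, "picking the right control by title" could look roughly like the sketch below. The x:intersect-with relation and the titles come from the response above; the href values and the helper name find_link are placeholders, since the whole point is that the client treats the URLs as opaque.

```python
def find_link(resource, rel, title):
    """Return the href of the link with the given relation and title, or None."""
    links = resource["_links"].get(rel, [])
    # HAL allows a relation to hold either a single link object or a list of them.
    if isinstance(links, dict):
        links = [links]
    for link in links:
        if link.get("title") == title:
            return link["href"]
    return None


# Parsed response; the hrefs are arbitrary placeholders the client never inspects.
category_products = {
    "_links": {
        "self": {"href": "/categories/9/products"},
        "x:intersect-with": [
            {"href": "/opaque/1", "title": "Company 6 products"},
            {"href": "/opaque/3", "title": "Company 7 products"},
        ],
    }
}

next_url = find_link(category_products, "x:intersect-with", "Company 7 products")
print(next_url)
```

The client never builds a URL from ids; it only follows whatever href the server put behind the chosen title.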

The RESTful way to include or not include children of a resource?

Say I have a team object that has a name property, a city property and a players property, where the players property is an array of possibly many players. This is represented in an SQL database with a teams table and a players table, where each player has a name and a team_id.
Building a RESTful API based on this simple data structure, I'm in doubt whether there is a clear rule on whether the returned object should/could include a list of players when hitting /teams/:id.
I have a view, that needs to show a team, and its players with their names, so:
1: Should /teams/:id join the two tables behind the scene and return the full team object, with a players property, that is an array of names and id's?
2: Should /teams/:id join the two tables behind the scene and return the team object, with a players property, that is an array of just id's that will then have to be queried one-by-one to /players/:id ?
3: Should two calls be made, one to /teams/:id and one to /teams/:id/players ?
4: Should a query string be used like this /teams/:id?fields=name,city,players ?
If either 2 or 3 is the way to go, how would one approach the situation where a team could also have multiple cities, resulting in another cities table in the DB to keep it normalized? Should a new endpoint then be created at /teams/:id/cities?
When creating RESTful APIs, is it the normalized data structure in the DB that dictates the endpoints in the API?

Usually with a RESTful API, it is best that the use-cases dictate the endpoints of the API, not necessarily the data structure.
If you sometimes need just the teams, sometimes need just the players of a team, and sometimes need both together, I would have 3 distinct calls, probably something like /teams/:id, /players/:teamid and player-teams/:teamid (or something similar).
The reason you want to do it this way is because it minimizes the number of HTTP requests that need to be made for any given page. Of all of the typical performance issues, an inflated number of HTTP requests is usually one of the most common performance hits, and usually one of the easiest to avoid.
That being said, you also don't want to go so crazy that you create an over-inflated API. Think through the typical use cases and make calls for those. Don't just implement every possible combination you can think of just for the sake of it. Remember "You Aren't Gonna Need It" (YAGNI).

I'd suggest something like:
GET /teams
{
"id" : 12,
"name" : "MyTeam",
"players" :
{
"self" : "http://my.server/players?teamName=MyTeam"
},
"city" :
{
"self" : "http://my.server/cities/MyCity"
}
}
GET /cities
GET /cities/{cityId}
GET /players
GET /players/{playerId}
You can then use URIs to call out to get whatever other related resources you need. If you want the flexibility to embed values, you can use ?expand, such as:
GET /teams?expand=players
{
"id" : 12,
"name" : "MyTeam",
"players" :
{
"self" : "http://my.server/players?teamName=MyTeam",
"items" : [
{
"name" : "Mary",
"number" : "12"
},
{
"name" : "Sally",
"number" : "15"
}
]
},
"city" :
{
"self" : "http://my.server/cities/MyCity"
}
}
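A sketch of how the ?expand idea could be handled on the server, using hard-coded data in place of the teams and players tables. The function name team_representation and the "items" key that holds the embedded players are assumptions made for this illustration; the URLs and sample values come from the example above.

```python
# Hard-coded stand-ins for the teams and players tables.
TEAM = {"id": 12, "name": "MyTeam"}
PLAYERS = [
    {"name": "Mary", "number": "12"},
    {"name": "Sally", "number": "15"},
]


def team_representation(expand=()):
    """Build the team document; embed players only when 'players' is in `expand`."""
    players = {"self": "http://my.server/players?teamName=MyTeam"}
    if "players" in expand:
        # "items" is a hypothetical key name for the embedded list.
        players["items"] = PLAYERS
    return {
        "id": TEAM["id"],
        "name": TEAM["name"],
        "players": players,
        "city": {"self": "http://my.server/cities/MyCity"},
    }


compact = team_representation()                      # GET /teams
expanded = team_representation(expand=("players",))  # GET /teams?expand=players
print(expanded)
```

Clients that only need the team pay for one small document, while a view that shows the roster can ask for it in the same request.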
