API modelling: Two resources with the same name - api

I have an API modelling question I need help with.
Let's assume I have Product representation that consists of:
Attribute
Type
id
string
names
MultilingualName List
price
decimal
MultilingualName is represented by:
Attribute
Type
locale
string
name
string
This is to support multiple languages, so that a store owner user is able to create products (and maintain/CRUD) within a product shelf that supports multiple languages.
Our POST API (/products) may look like this:
{
"id": "123abcxyz"
"names": [
{
"locale": "en-CA",
"name": "Pencil"
},
{
"locale": "fr-CA",
"name": "Crayon"
}
],
"price": 1.99
}
The problem I am having trouble with is with the consumer user based GET end point for Product.
I would like to model the resource without having the MultilingualName language complexity. So for example, the GET endpoint would look like this:
Attribute
Type
id
string
name
string
price
decimal
The end point would return the product name based on the user's preferred language, which is already known.
But clearly I have a problem. I have two Products resources now: CRUD operations (GET, POST, PUT, DELETE) geared to the store owner that maintains their product shelf (with the language complexity), and one for the consumer user (GET only, without the language complexity).
How should I approach this model? Is this a naming problem or am I approaching the modelling incorrectly?
Thanks for your help.
Edit: adding a UML as suggest by #qwerty_so.

Remark: answer provided when question had an UML tag (last part still valid though)
Multiple problems in your diagram:
An actor is a classifier that is external to the system. A class can therefore not be associated with an actor.
A «use» dependency should be noted with a dashed line arrow, and not a plain line arrow.
It is not clear what the links between your «Representation» classes and you »«DB table» represents: is it again a dependency? Is it a navigable association?
You have multiple definitions of a name without any possibility to disambiguate.
How to do it in UML?
In UML, having two classes with the same name in the same namespace is not valid, since it would not be clear which class is meant.
Nevertheless, you could adjust your model and separate the different family of classes, enclosing them in distinct packages, e.g. Database, CRUD, Consumer.
A package defines a namespaces. In consequence all references to a Product in the CRUD package would refer to the relevant CRUD class.
For relations between classes of different packages, you may disambiguate with a qualified name (i.e. including the package names), or, in case there would be no conflict, by importing a package into another.
Is it the best way to design your API?
Managing conflicting names in endpoints is not ideal. It’s n easy source of confusion. You can of course find technical solutions to address this, but still, why bother ?
Why not leave the client side tailoring/simplifying data as needed (e.g. using the user's language by default, unless something else is required)?
Why not let the end-point query parameters provide an optional language code to filter the set of returned languages ?
Or why not just use different names to call differently different things? I was thinging about ProductLocalized or ProductShort?

Is this a naming problem or am I approaching the modelling incorrectly?
It sounds to me like a naming problem.
It's perfectly reasonable to have more than one resource whose representations are derived from the same underlying data.
But each resource does need to have an identifier of its own. The machines don't particularly care what the identifier is, so you could use...
/b6d5cc6a-e4b3-4bc9-90d0-723f1d8ee22a
/5952d730-d447-4537-9bf0-74cdc2f9f79a
But human beings generally copy better with identifiers that are human readable.
Any spelling consistent with your local conventions is fine.
One possibility would be to just encode the name of the audience into the identifier itself
/storeOwner/products/123
/shopper/products/123
or you might instead choose names that are more closely aligned with the business capability
/inventory/products/123
/sales/products/123
You've got the freedom to choose which human beings you want to optimize for (domain experts? operators reading access logs? tech writers documenting the api? remote developers consuming it? and so on).

Related

Query String vs Resource Path for Filtering Criteria

Background
I have 2 resources: courses and professors.
A course has the following attributes:
id
topic
semester_id
year
section
professor_id
A professor has the the following attributes:
id
faculty
super_user
first_name
last_name
So, you can say that a course has one professor and a professor may have many courses.
If I want to get all courses or all professors I can: GET /api/courses or GET /api/professors respectively.
Quandary
My quandary comes when I want to get all courses that a certain professor teaches.
I could use either of the following:
GET /api/professors/:prof_id/courses
GET /api/courses?professor_id=:prof_id
I'm not sure which to use though.
Current solution
Currently, I'm using an augmented form of the latter. My reasoning is that it is more scale-able if I want to add in filtering/sorting criteria.
I'm actually encoding/embedding JSON strings into the query parameters. So, a (decoded) example might be:
GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}
The request above would retrieve all courses that were (or are currently being) taught by the professor with the provided professor_id in the year 2016, sorted according to topic title in ascending ASCII order.
I've never seen anyone do it this way though, so I wonder if I'm doing something stupid.
Closing Questions
Is there a standard practice for using the query string vs the resource path for filtering criteria? What have some larger API's done in the past? Is it acceptable, or encouraged to use use both paradigms at the same time (make both endpoints available)? If I should indeed be using the second paradigm, is there a better organization method I could use besides encoding JSON? Has anyone seen another public API using JSON in their query strings?
Edited to be less opinion based. (See comments)
As already explained in a previous comment, REST doesn't care much about the actual form of the link that identifies a unique resource unless either the RESTful constraints or the hypertext transfer protocol (HTTP) itself is violated.
Regarding the use of query or path (or even matrix) parameters is completely up to you. There is no fixed rule when to use what but just individual preferences.
I like to use query parameters especially when the value is optional and not required as plenty of frameworks like JAX-RS i.e. allow to define default values therefore. Query parameters are often said to avoid caching of responses which however is more an urban legend then the truth, though certain implementations might still omit responses from being cached for an URI containing query strings.
If the parameter defines something like a specific flavor property (i.e. car color) I prefer to put them into a matrix parameter. They can also appear within the middle of the URI i.e. /api/professors;hair=grey/courses could return all cources which are held by professors whose hair color is grey.
Path parameters are compulsory arguments that the application requires to fulfill the request in my sense of understanding otherwise the respective method handler will not be invoked on the service side in first place. Usually this are some resource identifiers like table-row IDs ore UUIDs assigned to a specific entity.
In regards to depicting relationships I usually start with the 1 part of a 1:n relationship. If I face a m:n relationship, like in your case with professors - cources, I usually start with the entity that may exist without the other more easily. A professor is still a professor even though he does not hold any lectures (in a specific term). As a course wont be a course if no professor is available I'd put professors before cources, though in regards to REST cources are fine top-level resources nonetheless.
I therefore would change your query
GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}
to something like:
GET /api/professors/teacher45/courses;year=2016?sort=asc&onField=topic
I changed the semantics of your fields slightly as the year property is probably better suited on the courses rather then the professors resource as the professor is already reduced to a single resource via the professors id. The courses however should be limited to only include those that where held in 2016. As the sorting is rather optional and may have a default value specified, this is a perfect candidate for me to put into the query parameter section. The field to sort on is related to the sorting itself and therefore also belongs to the query parameters. I've put the year into a matrix parameter as this is a certain property of the course itself, like the color of a car or the year the car was manufactured.
But as already explained previously, this is rather opinionated and may not match with your or an other folks perspective.
I could use either of the following:
GET /api/professors/:prof_id/courses
GET /api/courses?professor_id=:prof_id
You could. Here are some things to consider:
Machines (in particular, REST clients) should be treating the URI as an opaque thing; about the closest they ever come to considering its value is during resolution.
But human beings, staring that a log of HTTP traffic, do not treat the URI opaquely -- we are actually trying to figure out the context of what is going on. Staying out of the way of the poor bastard that is trying to track down a bug is a good property for a URI Design to have.
It's also a useful property for your URI design to be guessable. A URI designed from a few simple consistent principles will be a lot easier to work with than one which is arbitrary.
There is a great overview of path segment vs query over at Programmers
https://softwareengineering.stackexchange.com/questions/270898/designing-a-rest-api-by-uri-vs-query-string/285724#285724
Of course, if you have two different URI, that both "follow the rules", then the rules aren't much help in making a choice.
Supporting multiple identifiers is a valid option. It's completely reasonable that there can be more than one way to obtain a specific representation. For instance, these resources
/questions/38470258/answers/first
/questions/38470258/answers/accepted
/questions/38470258/answers/top
could all return representations of the same "answer".
On the /other hand, choice adds complexity. It may or may not be a good idea to offer your clients more than one way to do a thing. "Don't make me think!"
On the /other/other hand, an api with a bunch of "general" principles that carry with them a bunch of arbitrary exceptions is not nearly as easy to use as one with consistent principles and some duplication (citation needed).
The notion of a "canonical" URI, which is important in SEO, has an analog in the API world. Mark Seemann has an article about self links that covers the basics.
You may also want to consider which methods a resource supports, and whether or not the design suggests those affordances. For example, POST to modify a collection is a commonly understood idiom. So if your URI looks like a collection
POST /api/professors/:prof_id/courses
Then clients are more likely to make the associate between the resource and its supported methods.
POST /api/courses?professor_id=:prof_id
There's nothing "wrong" with this, but it isn't nearly so common a convention.
GET /api/courses?where={professor_id: "teacher45", year: 2016}&order={attr: "topic", sort: "asc"}
I've never seen anyone do it this way though, so I wonder if I'm doing something stupid.
I haven't either, but syntactically it looks a little bit like GraphQL. I don't see any reason why you couldn't represent a query that way. It would make more sense to me as a single query description, rather than breaking it into multiple parts. And of course it would need to be URL encoded, etc.
But I would not want to crazy with that right unless you really need to give to your clients that sort of flexibility. There are simpler designs (see Roman's answer)

Using Demand in Schema.org

I'm working on a website where companies create posts to describe services they need.
For me, this action is basically making a demand, in the sense of https://schema.org/Demand, and the demand is a for a https://schema.org/Service.
In addition to that, the Demand page on schema.org tell us :
For describing demand using this type, the very same properties used
for Offer apply.
Given the last quote, the property named "itemOffered" in the "Demand" entity is very possibly what I want, but the name is very confusing (I rather use something called like "itemNeeded")...
Should I use the "Demand" entity with "itemOffered" for my use case ? Or something better exists ?
While the property name and the definition of itemOffered doesn’t seem to match for using it in Demand, I guess the sentence you quote was added to assure users that it’s fine to use these for that purpose, and that it’s not an oversight.
And there isn’t any other Schema.org type suitable for representing demands; and for Demand, itemOffered is the only property that expects a Service. So this seems to be the only option.
So if an organization demands a certain service, this seems to be the intended way to represent it in Schema.org:
Organization seeks Demand
Demand itemOffered Service

Does my API design violate RESTful principles?

I'm currently (I try to) designing a RESTful API for a social network. But I'm not sure if my current approach does still accord to the RESTful principles. I'd be glad if some brighter heads could give me some tips.
Suppose the following URI represents the name field of a user account:
people/{UserID}/profile/fields/name
But there are almost hundred possible fields. So I want the client to create its own field views or use predefined ones. Let's suppose that the following URI represents a predefined field view that includes the fields "name", "age", "gender":
utils/views/field-views/myFieldView
And because field views are kind of higher logic I don't want to mix support for field views into the "people/{UserID}/profile/fields" resource. Instead I want to do the following:
utils/views/field-views/myFieldView/{UserID}
Another example
Suppose we want to perform some quantity operations (hope that this is the right name for it in English). We have the following URIs whereas each of them points to a list of persons -- the friends of them:
GET people/exampleUID-1/relationships/friends
GET people/exampleUID-2/relationships/friends
And now we want to find out which of their friends are also friends of mine. So we do this:
GET people/myUID/relationships/intersections/{Value-1};{Value-2}
Whereas "{Value-1/2}" are the url encoded values of "people/exampleUID-1/friends" and "people/exampleUID-2/friends". And then we get back a representation of all people which are friends of all three persons.
Though Leonard Richardson & Sam Ruby state in their book "RESTful Web Services" that a RESTful design is somehow like an "extreme object oriented" approach, I think that my approach is object oriented and therefore accords to RESTful principles. Or am I wrong?
When not: Are such "object oriented" approaches generally encouraged when used with care and in order to avoid query-based REST-RPC hybrids?
Thanks for your feedback in advance,
peta
I've never worked with REST, but I'd have assumed that GETting a profile resource at '''/people/{UserId}/profile''' would yield a document, in XML or JSON or something, that includes all the fields. Client-side I'd then ignore the fields I'm not interested in. Isn't that much nicer than having to (a) configure a personalised view on the server or (b) make lots of requests to fetch each field?
Hi peta,
I'm still reading through RESTful Web Services myself, but I'd suggest a slightly different approach than the proposed one.
Regarding the first part of your post:
utils/views/field-views/myFieldView/{UserID}
I don't think that this is RESTful, as utils is not a resource. Defining custom views is OK, however these views should be (imho) a natural part of your API's URI scheme. To incorporate the above into your first URI example, I would propose one of the following examples instead of creating a special view for it:
people/{UserID}/profile/fields/name,age,gender/
people/{UserID}/profile/?fields=name,age,gender
The latter example considers fields as an input value for your algorithm. This might be a better approach than having fields in the URI as it is not a resource itself - it just puts constraints on the existing view of people/{UserID}/profile/. Technically, it's very similar as pagination, where you would limit a view by default and allow clients to browse through resources by using ?page=1, ?page=2 and so on.
Regarding the second part of your post:
This is a more difficult one to crack.
First:
Having intersection in the URI breaks your URI scheme a bit. It's not a resource by itself and also it sits on the same level as friends, whereas it would be more suitable one level below or as an input value for your algorithm, i.e.
GET people/{UserID}/relationships/friends/intersections/{Value-1};{Value-2}
GET people/{UserID}/relationships/friends/?intersections={Value-1};{Value-2}
I'm again personally inclined to the latter, because similarly as in the first case, you are just constraining the existing view of people/{UserID}/relationships/friends/
Secondly, regarding:
Whereas "{Value-1/2}" are the url
encoded values of
"people/exampleUID-1/friends" and
"people/exampleUID-2/friends"
If you meant that {Value-1/2} contain the whole encoded response of the mentioned GET requests, then I would avoid that - I don't think that the RESTful way. Since friends is a resource by itself, you may want to expose it and access it directly, i.e.:
GET friends/{UserID-1};{UserID-2};{UserID-3}
One important thing to note here - I've used ; between user IDs in the previous example, whereas I used , in the fields example above. The reasoning is that both represent a different operator. In the first case we needed OR (,) in order to get all three fields, while in the last example above we had to use AND (;) in order to get an intersection.
Usage of two types of operators can over-complicate the API design, but it should provide more flexibility in the end.
thanks for your clarifying answers. They are exactly what I was asking for. Unfortunately I hadn't the time to read "RESTful Web Services" from cover to cover; but I will catch it up as soon as possible. :-)
Regarding the first part of my post:
You're right. I incline to your first example, and without fields. I think that the I don't need it at all. (At the moment) Why do you suggest the use of OR (,) instead of AND (;)? Intuitively I'd use the AND operator because I want all three of them and not just the first one existing. (Like on page 121 the colorpairs example)
Regarding the second part:
With {Value-1/2} I meant only the url-encoded value of the URIs -- not their response data. :) Here I incline with you second example. Here it should be obvious that under the hood an algorithm is involed when calculating intersecting friends. And beside that I'm probably going to add some further operations to it.
peta

Service Design (WCF, ASMX, SOA)

Soliciting feedback/thoughts on a pattern or best practice to address a situation that I have seen a few times over the years, yet I haven't found any one solution that addresses it the way I'd like.
Here is the background.
Company has 3 applications supporting 3 separate "lines of business" that are very much related to each other. Two of the applications are literally copy/paste from the original. The applications need to be able to grow at different rates and have slightly different functionality. The main differences in functionality come from the data entry fields. The differences essentially fall into one of the following categories:
One instance has a few fields
that the other does not.
String field has a max length of 200 in one
instance, but 50 in another.
Lookup/Reference fields have
different underlying values (i.e.
same table structures, but coming
from different databases).
A field is defined as a user supplied,
free text, value in one instance,
but a lookup/reference in another.
The problem is that there are other applications within the company that need to consume data from these three separate applications, but ideally, talk to them in a core/centralized manner (i.e. through a central service rather than 3 separate services). My question is how to handle, in particular, item D above. I am thinking a "lowest common denominator" approach might be the only way. For example:
<SomeFieldName>
<Code></Code> <!-- would store a FK ref value if instance used lookup, otherwise would be empty or nonexistent-->
<Text></Text> <!-- would store the text from the lookup if instance used lookup, would store user supplied text if not-->
</SomeFieldName>
Other thoughts/ideas on this?
TIA!
So are the differences strictly from a Datamodel view or are there functional business / behavioral differences at the application level.
If the later is the case then I would definetly go down the path you appear to be heading down with SOA. Now how you impliment your SOA just depends upon your architecture needs. What I would look at for design would be into some various patterns. Its hard to say for sure which one(s) would meet the needs with out more information / example on how the behavioral / functional differences are being used. From off of the top of my head tho with what you have described I would probably start off looking at a Strategy pattern in my initial design.
Definetly prototype this using TDD so that you can determine if your heading down the right path.
How about: extend your LCD approach, put a facade in front of these systems. devise a normalised form of the data which (if populated with enough data) can be transformed to any of the specific instances. [Heading towards an ESB here.]
Then you have the problem, how does a client know what "enough" is? Some kind of meta-data may be needed so that you can present a suiatble UI. So extend the services to provide an operation to deliver the meta data.

Should we use prefixes in our database table naming conventions?

We are deciding the naming convention for tables, columns, procedures, etc. at our development team at work. The singular-plural table naming has already been decided, we are using singular. We are discussing whether to use a prefix for each table name or not. I would like to read suggestions about using a prefix or not, and why.
Does it provide any security at all (at least one more obstacle for a possible intruder)? I think it's generally more comfortable to name them with a prefix, in case we are using a table's name in the code, so to not confuse them with variables, attributes, etc. But I would like to read opinions from more experienced developers.
I find hungarian DB object prefixes to indicate their types rather annoying.
I've worked in places where every table name had to start with "tbl". In every case, the naming convention ended up eventually causing much pain when someone needed to make an otherwise minor change.
For example, if your convention is that tables start with "tbl" and views start with "v", thn what's the right thing to do when you decide to replace a table with some other things on the backend and provide a view for compatibility or even as the preferred interface? We ended up having views that started with "tbl".
I prefer prefixing tables and other database objects with a short name of the application or solution.
This helps in two potential situations which spring to mind:
You are less likely to get naming conflicts if you opt to use any third-party framework components which require tables in your application database (e.g. asp net membership provider).
If you are developing solutions for customers, they may be limited to a single database (especially if they are paying for external hosting), requiring them to store the database objects for multiple applications in a single database.
I don't see how any naming convention can improve security...
If an intruder have access to the database (with harmful permissions), they will certainly have permissions to list table names and select to see what they're used for.
But I think that truly confusing table names might indirectly worsen security.
It would make further development hard, thus reducing the chance security issues will be fixed, or it could even hide potential issues:
If a table named (for instance) 'sro235onsg43oij5' is full of randomly named coloumns with random strings and numbers, a new developer might just think it's random test data (unless he touches the code that interact with it), but if it was named 'userpasswords' or similar any developer who looks at the table would perhaps be shocked that the passwords is stored in plaintext.
Why not name the tables according to the guidelines you have in place for coding? Consider the table name a "class" and the columns a "property" or "field". This assists when using an ORM that can automatically infer table/column naming from class/member naming.
For instance, Castle ActiveRecord, declared like below assumes the names are the same as the member they are on.
[ActiveRecord]
public class Person
{
[PrimaryKey]
public Int32 Id { get; set; }
[Property]
public String Name { get; set; }
}
If you use SqlServer the good start would be to look at the sample databases provided for some guidance.
In the past, I've been opposed to using prefixes in table names and column names. However, when faced with the task of redesigning a system, having prefixes is invaluable for doing search and replace. For example, grepping for "tbl_product" will probably give you much more relevant results than grepping for "product".
If you're worried about mixing up your table names, employ a hungarian notation style system in your code. Perhaps "s" for string + "tn" for table name:
stnUsers = 'users';
stnPosts = 'posts';
Of course, the prefix is up to you, depending on how verbose you like your code... strtblUsers, strtblnmeUsers, thisisthenameofatableyouguysUsers...
Appending a prefix to table names does have some benefits, especially if you don't hardcode that prefix into the system, and allow it to change per installation. For one, you run less risk of conflicts with other components, as Ian said, and secondly, should you wish, you could have two or instances of your program running off the same database.