Let's assume I have a POST /orders operation that takes as input a collection of order items. An order can't contain more than 50 items, but where do I perform this validation?
Validating the order size in both the client and the server would be redundant, and would increase the maintenance cost if I decide to change the order size limit.
Validating it only on the server would prevent clients from "failing fast" (i.e., you add a thousand items to the order and are only informed of the limit when completing it).
I'm assuming client-only validation is not an option, as the API may have other clients.
The problem gets more complicated if I have dynamic validation rules. Suppose retail customers can have orders of up to 50 items, but wholesale customers can have up to 500 items in their orders. Should the API expose an operation so clients can fetch the current validation rules?
You have to do both, but differently.
To guarantee valid operations, all critical validation must happen on the server/web service side. The client-side UI is just that - a user interface to make interacting with the web service convenient for a person. Once the web service is stable and secure, create a default method to pass web service errors through the client to the user. After that, features in the UI layer are usability issues and should be based on testing (even if that is informal testing by watching over a user's shoulder or listening to feedback).
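For illustration only, here is a minimal sketch of that server-side enforcement with an error the client can pass straight through to the user. It assumes a Flask application; the endpoint shape and the MAX_ORDER_ITEMS constant are placeholders, not anything from the original question.

# Minimal sketch, assuming Flask; names like MAX_ORDER_ITEMS are illustrative.
from flask import Flask, jsonify, request

app = Flask(__name__)
MAX_ORDER_ITEMS = 50  # single source of truth for the limit

@app.route("/orders", methods=["POST"])
def create_order():
    items = request.get_json().get("items", [])
    if len(items) > MAX_ORDER_ITEMS:
        # Structured error the client can display as-is.
        return jsonify({
            "error": "too_many_items",
            "message": f"An order cannot contain more than {MAX_ORDER_ITEMS} items.",
            "limit": MAX_ORDER_ITEMS,
        }), 422
    # ... create the order here ...
    return jsonify({"status": "created"}), 201

Because the limit lives in one place on the server, changing it does not require touching every client.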
I agree with what was said before.
That said, I think if you can predict almost every situation a user may run into, you could also add client-side validation.
As per your example about wholesale/retail, you could first create a drop-down that asks the client to choose whether they're wholesale or retail, and then apply the 500/50 rule to the input box based on that selection.
The obvious problem is that if your API is released to other developers, they may not be aware of the 50/500 rule, and that is where I agree with the previous answer about critical validation happening on the server. If you're building the API for your own use, then you could go either way because you're aware of the input restrictions. It will also save quite a bit on server costs if the app is very big (validation on the server will be taxing).
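On the dynamic-rules question, one option, sketched below, is to expose the current limits through the API so clients can fetch them and fail fast locally while the server remains authoritative. This is only an illustration; the endpoint name and response fields are assumptions, not part of either answer above.

# Hypothetical sketch: the client fetches the current limits once and
# validates locally before submitting, so it can fail fast.
import requests

API = "https://api.example.com"  # placeholder base URL

def fetch_max_order_items(session):
    # e.g. {"customer_type": "wholesale", "max_order_items": 500}
    rules = session.get(f"{API}/order-validation-rules").json()
    return rules["max_order_items"]

def add_item(order, item, limit):
    if len(order["items"]) >= limit:
        raise ValueError(f"This order is limited to {limit} items")
    order["items"].append(item)

# Usage:
# limit = fetch_max_order_items(requests.Session())
# add_item({"items": []}, {"sku": "ABC"}, limit)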
I am working on a chatroom (all-to-all) application in Elixir using an OTP GenServer, and I'm receiving messages from a JS client as users register with their names in the first phase. I'm not sure what the best approach would be to store these names on my Elixir server and send regular updates to the client with the list of users online, or whether I should use database storage. Please suggest the best approach.
I agree with bitwalker that ETS is a good fit.
Here's a short summary of what I did in production. It wasn't a chat server, but a server push with a couple of thousand users connecting via long polling. Pushed data was divided into some 50 categories, and users were able to choose which ones they wanted. At peak times the server pushed new messages every 2 seconds, and processed > 2000 reqs/sec.
Essentially, I kept a gen_server for each user, where I held pending messages and the user's configuration (basically a list of selected channels). This was beneficial with long polling, since the user's data is decoupled from the user's request, so the data remains while requests are transient. However, I think this approach is also good for permanent connections, such as websockets, since there might still be occasional disconnections, and keeping the user's data somewhere more stable gives you a chance of resuming after a reconnect.
Obviously, when a request arrives, you need to find the user-specific process, and for this, ETS is a good fit, since you don't have a single-process bottleneck. Instead of manually working with ETS, I'd recommend using gproc in conjunction with via tuples. Basically, when starting a user's gen_server, you can provide name: {:via, :gproc, {:n, :l, key}} where key is some custom key (an arbitrary term) you build from your internal user id (:n and :l indicate a unique name on the local node). You can then use that same via tuple when issuing calls/casts, and gen_server will use gproc to find the corresponding process.
Finally, you need to have some timeout/disconnect logic to clean up user processes. In my case, I simply terminated a user's process if there was no activity from the web layer (no end user came for data in some time). Gproc will automatically remove entries for terminated processes from its internal ETS table. It's probably best to supervise user processes with a temporary restart strategy.
I realize all of this is still a bit vague, but I hope it makes some sense. Keep in mind that this is not the ultimate pattern (there's no such thing of course), but I think it's a reasonable first attempt.
You may also want to take a look at the Phoenix web framework, which has an interesting pub-sub facility in the form of Topics. I haven't tried this out myself yet, but it seems interesting, and it may even simplify some of the stuff I discussed above, or at least help with pushing notifications from the chatroom to all users.
Sounds like a good use case for ETS.
A simpler approach might be to use an Agent to store the online users information, but it depends quite a lot on what you need from the storage mechanism you choose.
There appear to be two lines of APIs for adding, authenticating and aggregating sites. Which one you start with depends upon which version of the Documentation/SDK set your rep started you off on, or where in the SDK Guide you started implementing from.
Path #1 starts at
ContentServiceTraversal, which allows for the retrieval of all ContentServiceInfo by container type (such as BANK)
ItemManagementService is used to add these items
Refresh is done through RefreshService (most of the APIs on this path do not contain the word Site)
Path #2 starts at
SiteTraversalService, which allows for the retrieval of all SiteInfo (no apparent support for a Container Type filter)
SiteAccountManagementService is used to add these items
Refresh is done through RefreshService (all of the APIs on this path contain the word Site).
From the best that I can tell, the aforementioned APIs have a lot of duplicated functionality. I have noticed certain APIs that exist in one branch and not the other, but usually the differences are minor (e.g. things you are able to filter by).
I started off with ContentServiceInfo because the documentation and samples that our rep initially gave us started there. Additionally, this API started off by providing greater granularity, e.g. simply being able to filter by Container type, since we were pretty much only interested in Banks and Processor sites (which I do not believe you guys support).
My questions are:
Do the two branches of API do the exact same thing?
Do they mostly behave the same way?
Do they back-end to the exact same
System
Data store
Scraper?
Is one line of API supposed to be deprecated sooner than the other?
Does one line of API have more of a future in terms of actually adding new or augmenting existing functionality?
Site-level addition was introduced through the Yodlee APIs to overcome the fact that, even though a user had bank, credit card, loan, and rewards accounts at the same end site, the user had to provide credentials for each of these containers. Site-level addition APIs try to add all these containers with only one set of credentials. That's the only difference between container-based addition and site-based addition.
To answer your questions:
Do the two branches of API do the exact same thing?
Do they mostly behave the same way?
If you mean the aggregation functionality, yes. Except for the fact that site-level addition adds/refreshes all the containers (bank, credit card, loan, rewards) and container-level addition can add/refresh only one container per API call, all the other behavior remains the same.
Do they back-end to the exact same
System
Data store
Scraper?
If you are referring to the Yodlee data-gathering components, yes.
Is one line of API supposed to be deprecated sooner in the future than another?
No. Both these sets of APIs cater to different needs. If you are a company that relies solely on credit card data, using site-level addition would be overkill, as the aggregation will take longer, and it makes more sense to use container-based addition. There is also the factor of backward compatibility, which rules out deprecating the APIs.
I'm trying to wrap my head around how to design a RESTful API for creating object graphs. For example, think of an eCommerce API, where resources have the following relationships:
Order (the main object)
Has-many Addresses
Has-many Order Line items (what does the order consist of)
Has-many Payments
Has-many Contact Info
The Order resource usually makes sense along with its associations. In isolation, it's just a dumb container with no business significance. However, each of the associated objects has a life of its own and may need to be manipulated independently, e.g. editing the shipping address of an order, changing the contact info against an order, removing a line item from an order after it has been placed, etc.
There are two options for designing the API:
The Order API endpoint intelligently creates itself AND its associated resources by processing "nested resources" in the content sent to POST /orders
The Order resource only creates itself and the client has to make follow-up POST requests to newly created endpoints, like POST /orders/123/addresses, PUT /orders/123/line-items/987, etc.
While the second option is simpler to implement at the server-side, it makes the client do extra work for 80% of the use-cases.
The first option has the following open questions:
How does one communicate the URL for the newly created resource? The Location header can communicate only one URL, however the server would've potentially created multiple resources.
How does one deal with errors? What if one of the associations has an error? Do we reject the entire object graph? How is that error communicated to the client?
What's the RESTful + pragmatic way of dealing with this?
How I handle this is the first way. You should not assume that a client will make all the requests it needs to. Create all the entities in the one request.
Depending on your use case, you may also want to enforce an 'all-or-nothing' approach to creating the entities; i.e., if something fails, everything rolls back. You can do this by using a transaction on your database (which you can't do if everything is done through separate requests). Determining whether this is the behavior you want is very specific to your situation. For instance, if you are creating an order statement you may wish to employ this (you don't want to create an order that's missing items); however, if you are uploading photos it may be fine.
For returning the links to the client, I always return a JSON object. You could easily populate this object with links to each of the resources created. This way the client can determine how to behave after a successful POST.
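As an illustration of that approach, here is a rough sketch of option 1 with an all-or-nothing transaction and a links object in the response. It assumes Flask and Flask-SQLAlchemy, and every name (models, fields, routes) is a placeholder rather than anything from the original question.

# Hypothetical sketch: create the order and its nested resources atomically,
# then tell the client where everything it created now lives.
from flask import Flask, jsonify, request
from flask_sqlalchemy import SQLAlchemy

app = Flask(__name__)
app.config["SQLALCHEMY_DATABASE_URI"] = "sqlite:///orders.db"
db = SQLAlchemy(app)

class Order(db.Model):
    id = db.Column(db.Integer, primary_key=True)

class LineItem(db.Model):
    id = db.Column(db.Integer, primary_key=True)
    order_id = db.Column(db.Integer, db.ForeignKey("order.id"))
    sku = db.Column(db.String)

with app.app_context():
    db.create_all()

@app.route("/orders", methods=["POST"])
def create_order():
    payload = request.get_json()
    try:
        order = Order()
        db.session.add(order)
        db.session.flush()  # assigns order.id without committing yet
        items = [LineItem(order_id=order.id, sku=i["sku"])
                 for i in payload["line_items"]]
        db.session.add_all(items)
        db.session.commit()  # all-or-nothing: any failure above rolls back
    except Exception:
        db.session.rollback()
        return jsonify({"error": "order_rejected"}), 422
    links = {
        "self": f"/orders/{order.id}",
        "line_items": [f"/orders/{order.id}/line-items/{i.id}" for i in items],
    }
    return jsonify({"links": links}), 201, {"Location": f"/orders/{order.id}"}

A 422 on any failure keeps the graph consistent, and on success the client gets both a Location header for the order and links to the nested resources it may want to manipulate later.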
Both options can be implemented RESTfully. You ask:
How does one communicate the URL for the newly created resource? The Location header can communicate only one URL, however the server would've potentially created multiple resources.
This would be done the same way you communicate links to other Resources in the GET case. Use link elements, or whatever your method is, to embed the URL of a Resource into a Representation.
We're using the Shopify API to grab data from orders, but we're having some trouble with data validation on the fulfillment side. Is there any way we can add data validation to our checkout page? Even just JavaScript validation would be a huge improvement. By the time we see an error, the customer is out of the loop, so we're having to make assumptions about our users' data, which is potentially dangerous.
One example is that a user typed in a phone number that began with a 1, e.g. (xxx)-1xx-xxxx, which is invalid. Another typed an address that was too long for the shipping API we send it to. We don't want to truncate arbitrary addresses, so is there a way to present an error to the customer?
The checkout server is a black box as far as the API is concerned. This is mainly for security reasons.
Unfortunately, this prevents you from doing the kind of extra validation you're asking about during the checkout process.
What would be best practice for the following situation? I have an ecommerce store that pulls down inventory levels from a distributor. Should the site call the third-party API every time a user loads a product detail page, so it always has the most up-to-date data? Or should the site call the third-party API, store that data for a certain amount of time in its own system, and update it periodically?
To me it seems obvious that it should be updated every time the product detail page is loaded, but what about high-traffic ecommerce stores? Are completely different solutions used for that case?
In this case I would definitely cache the results from the distributor's site for some period of time, rather than hitting them every time you get a request. However, I would not simply use a blanket 5-minute or 30-minute timeout for all cache entries. Instead, I would use some heuristics. If possible, for instance if your application is written in a language like Python, you could attach a simple script to every product which implements the timeout.
This way, if it is an item that is requested infrequently, or one that has a large amount in stock, you could cache for a longer time.
# Expire the cached entry early for popular or low-stock products,
# then re-check the live stock level with the distributor.
if product.popularityrating > 8 or product.lastqtyinstock < 20:
    cache.expire(productnum)
    distributor.checkstock(productnum)
This gives you flexibility that you can call on if you need it. Initially, you can set all the rules to something like:
cache.expireover("3m",productnum)
distributor.checkstock(productnum)
In actual fact, the script would probably not include the checkstock function call, because that would be in the main app, but it is included here for context. If Python seems too heavyweight to include just for this small amount of flexibility, then have a look at Tcl, which was specifically designed for this type of job. Both can be embedded easily in C, C++, C# and Java applications.
Actually, there is another solution. Your distributor keeps the product catalog on their servers and gives you access to it via an Open Catalog Interface. When a user wants to place an order, they get redirected in place to the distributor's catalog, choose items, and then transfer the selection back to your shop.
It is widely used in the SRM (Supplier Relationship Management) field.
It depends on many factors: the traffic to your site, how often the inventory levels change, the business impact of displaying outdated data, how often the suppliers allow you to call their API, their API's SLA in terms of availability and performance, and so on.
Once you have these answers, there are of course many possibilities here. For example, for a low-traffic site where getting the inventory right is important, you may want to call the 3rd-party API on every call, but revert to some alternative behavior (such as using cached data) if the API does not respond within a certain timeout.
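A minimal sketch of that fallback behavior, assuming the requests library; the distributor URL and the simple in-process cache are illustrative placeholders, not part of the original answer.

# Hypothetical sketch: prefer live data, but fall back to the last cached
# value if the distributor does not answer within the timeout.
import requests

_last_known = {}  # product_id -> last inventory payload we saw

def get_inventory(product_id, timeout=0.5):
    url = f"https://distributor.example.com/inventory/{product_id}"
    try:
        resp = requests.get(url, timeout=timeout)
        resp.raise_for_status()
        data = resp.json()
        _last_known[product_id] = data
        return data
    except requests.RequestException:
        # API slow or down: serve stale data rather than failing the page.
        return _last_known.get(product_id)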
Sometimes, well-designed APIs will include hints as to the validity period of the data. For example, some REST-over-HTTP APIs support various HTTP cache-control headers that can be used to specify a validity period, or to only retrieve data if it has changed since the last request.
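For instance, if the API in question returns ETags (an assumption; not every API does), a conditional GET lets you revalidate cheaply instead of re-downloading unchanged inventory data. A rough sketch with the requests library:

# Hypothetical sketch: conditional GET using ETag / If-None-Match.
import requests

def fetch_if_changed(url, etag=None, cached_body=None):
    headers = {"If-None-Match": etag} if etag else {}
    resp = requests.get(url, headers=headers, timeout=2)
    if resp.status_code == 304:
        # Not modified: keep using what we already have.
        return etag, cached_body
    resp.raise_for_status()
    return resp.headers.get("ETag"), resp.json()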