How can I follow a clean architecture style if there's logic local only to the database layer which cannot be put into entity methods? - entity

for a use-case that allows an actor to perform an account creation.
As such I have created these methods on User
async createUser({ email, username, password }: Pick<UserObject, 'password' | 'email' | 'username'>) {
  return await this.#dbAccessor.createNewUser({ email, username, password });
}
//and...
checkEmailExists(email: string): boolean;
checkUsernameExists(username: string): boolean;
In the use-case layer, after getting past utilities such as password hashing and validation, I check the conditions with checkEmailExists and checkUsernameExists; if they pass, I finally run createUser.
However, my question is: what if those email/username checks are already stale by the time the insert begins? To prevent that, would I have to move this check logic into the db query? And if so, wouldn't that mean my logic is leaking out of the application-logic/enterprise-logic layers into these adapters?
Is a method such as 'checkAndInsertNewUser', implemented in the database layer, more suitable in this scenario?

It is generally not recommended to "merge" business logic with the database layer. What if you want or need to replace the whole database layer at some point in time? What about testing the business logic (decision making) independently of the database? Even if your business logic involves querying the database, the actual decision making (what to check and how) should be separate from the database. What might be useful or even necessary in your case are transactions, to make the query of the DB and the creation/insertion of a new item atomic.
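As a rough illustration of that last point, here is a minimal sketch in Python with the standard-library sqlite3 module (not the TypeScript stack from the question). All names (SqliteUserRepository, create_account, DuplicateUserError, the users table) are hypothetical. The adapter only runs queries and exposes a transaction boundary, while the decision about what to check stays in the use case. It assumes SQLite, where BEGIN IMMEDIATE takes the write lock before the check runs; other databases would need their own locking or isolation settings.

```python
import sqlite3
from contextlib import contextmanager


class DuplicateUserError(Exception):
    """Raised by the use case; carries no database details."""


class SqliteUserRepository:
    """Adapter (hypothetical): runs queries and exposes a transaction boundary."""

    def __init__(self, conn):
        self._conn = conn
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS users "
            "(email TEXT, username TEXT, password_hash TEXT)"
        )

    @contextmanager
    def transaction(self):
        # BEGIN IMMEDIATE takes SQLite's write lock up front, so no other
        # writer can insert between the check and the insert below.
        self._conn.execute("BEGIN IMMEDIATE")
        try:
            yield
        except BaseException:
            self._conn.execute("ROLLBACK")
            raise
        else:
            self._conn.execute("COMMIT")

    def email_exists(self, email):
        cur = self._conn.execute("SELECT 1 FROM users WHERE email = ?", (email,))
        return cur.fetchone() is not None

    def username_exists(self, username):
        cur = self._conn.execute("SELECT 1 FROM users WHERE username = ?", (username,))
        return cur.fetchone() is not None

    def insert_user(self, email, username, password_hash):
        self._conn.execute(
            "INSERT INTO users (email, username, password_hash) VALUES (?, ?, ?)",
            (email, username, password_hash),
        )


def create_account(repo, email, username, password_hash):
    """Use case: decides what to check; the adapter only makes it atomic."""
    with repo.transaction():
        if repo.email_exists(email) or repo.username_exists(username):
            raise DuplicateUserError("email or username already taken")
        repo.insert_user(email, username, password_hash)


# isolation_level=None: autocommit mode, so we issue BEGIN/COMMIT ourselves.
conn = sqlite3.connect(":memory:", isolation_level=None)
repo = SqliteUserRepository(conn)
create_account(repo, "a@example.com", "alice", "hash-1")
try:
    create_account(repo, "a@example.com", "bob", "hash-2")
except DuplicateUserError as exc:
    print("rejected:", exc)
```

Because the use case depends only on the four repository methods, it can be tested against an in-memory fake, and the database can be swapped without touching the decision logic. A UNIQUE constraint on the columns is a common extra safety net, but it is not shown here.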

Related

API design for machine learning models

We are using Django REST framework to build an API web service that serves many already-trained machine learning models. Some models predict on a batch of size 1, or one image at a time. Others need a history of data (timelines) to be able to predict/forecast. These timelines are usually too large to be passed as a parameter. Given that, we want to give the requester the ability to make a request by either:
sending the data (small batches) to predict on as a parameter, or
passing a database id/reference as a parameter, in which case the API queries the database and runs the predictions.
So the question is: what would be the best API design for identifying which approach the requester chose? Some approaches we have considered:
1. Add /db to the path of the endpoint, e.g. POST models/<X>/db. The problem with this approach is that twice as many (2x) endpoints are generated for each model.
2. Add a boolean parameter db to each request. The problem with this approach is that it adds overhead to each request just to check which approach was chosen. It also makes the code less readable.
3. Set a global variable for each requester when they sign up for the API token. The problem is that this restricts the requester to one mode, which is not convenient.
What would be the best approach for this case?
The fact that you currently have more than one source would cause me to seriously consider attempting to abstract the "source" component as much as possible, to allow all manner of sources. For example, suppose that future users would like to pull data out of a mongodb, instead of whatever db you are currently using? Or from some other storage structure? Or pull from a third party? Or, or, or....
In any case the question is now "how much do they all have in common, and what should they all implement?"
class Source(object):
    def __get_batch__(self, batch_size=1):
        raise NotImplementedError()  # each source needs to implement this on its own

# http_library.POST_endpoint("/db")
class DBSource(Source):
    def __init__(self, post_data):
        # Only allow known table names, since the name is interpolated into the SQL string.
        if post_data["table"] in ["data1", "data2"]:
            self.table = post_data["table"]
        else:
            raise Exception("Must use predefined table to prevent SQL injection")

    def __get_batch__(self, batch_size=1):
        return sql_library.query("SELECT * FROM {} LIMIT ?".format(self.table), batch_size)

# http_library.POST_endpoint("/local")
class LocalSource(Source):
    def __init__(self, post_data):
        self.data = post_data["data"]
        self.i = 0  # read position within self.data

    def __get_batch__(self, batch_size=1):
        data = self.data[self.i:self.i + batch_size]
        self.i += batch_size
        return data
This is just an example. However, if a fixed part of your path designates "the source", then you have left yourself open to scale this indefinitely.
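To make the "fixed part of the path designates the source" idea a bit more concrete, here is a small self-contained sketch of how the path segment could be mapped to a source class. The registry (SOURCES, register_source) and the handle_request function are hypothetical names for illustration. They are not part of Django REST framework, and the Source classes are simplified stand-ins for the ones above.

```python
class Source:
    """Base class: every source turns request data into batches."""

    def get_batch(self, batch_size=1):
        raise NotImplementedError


# Hypothetical registry: path segment -> Source subclass.
SOURCES = {}


def register_source(segment):
    """Class decorator that files a Source under a URL path segment."""
    def decorator(cls):
        SOURCES[segment] = cls
        return cls
    return decorator


@register_source("local")
class LocalSource(Source):
    """Data is sent directly in the request body."""

    def __init__(self, post_data):
        self.data = list(post_data["data"])
        self.i = 0  # read position within self.data

    def get_batch(self, batch_size=1):
        batch = self.data[self.i:self.i + batch_size]
        self.i += batch_size
        return batch


def handle_request(source_segment, post_data, batch_size=1):
    """What a view for POST models/<X>/<source> might do with its path segment."""
    try:
        source_cls = SOURCES[source_segment]
    except KeyError:
        raise LookupError("unknown source: {!r}".format(source_segment))
    return source_cls(post_data).get_batch(batch_size)


print(handle_request("local", {"data": [1, 2, 3]}, batch_size=2))
```

Adding a new source (a database, a third party, ...) then means registering one more class, and the prediction code that consumes the batches does not change.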
Add /db to the path of the endpoint ex: POST models/<X>/db. The problem with this approach is that (2x) endpoints are generated for each model.
Inevitable. DRY out common code to sub-methods.
Add parameter db as boolean to each request. The problem with such approach is that it adds additional overhead for each request just to check which approach. Also, make the code less readable.
There won't be any additional overhead (that's what your underlying framework does to match a URL to a function/method anyway). However, these are 2 separate functionalities, I would keep them separate, so I would prefer the first approach.
Global variable set for each requester when signed for the API token. The problem is that you restricted the requester for 1 mode which is not convenient.
Yikes! Not unless you provide a UI that lets a user select their preference and apply it globally (I don't think anyone in UX will agree to that).
That being said, the API design should be driven by asking who masters (or owns) the data. If it's the application, and the user already knows the ID of the entity, then you shouldn't be asking the user for the data.
If it's the user, and the data won't fit in a POST body, then I would say a real-time API may not be the right solution; think about message queues or pub-sub based systems.
If you need a hybrid solution, as you asked in the question, then I would prefer the 1st approach.

API interface design - toggle or 2 different interfaces

I am studying interface design.
Here is what I am curious about.
Some open APIs expose two separate interfaces where a single toggle would do, e.g. Instagram's like interface, which is split into two calls (like, cancel like).
What is the advantage of separating the two? (In my view, separating them into two interfaces makes things more complicated for the end user.)
I ask because it could be implemented as a toggle,
i.e. the user sends item_id and user_id, and the server checks the database (is this item already liked or not?) and updates it.
Thanks for answering!
The real benefit to having two interfaces for toggling is that it doesn't require the user to know the current state of the thing they are attempting to change (i.e. it doesn't require me to first query for the state).
If I am a consumer of an API, typically I will want to perform actions such as liking something. Very rarely can I think of a case where I would want to perform the action "do the opposite of what I did previously" (unless I'm feeling like flip-flopping). If you didn't have two endpoints for like and unlike, then you'd first have to poll the API to get the current status, and then perform the toggle that you're talking about if needed.
This situation introduces more logic into your code, requires that you make 1-2 calls to the API, and assumes that the state didn't change between calls; whereas having two endpoints reduces the logic, limits your API calls to 1 per action, and you don't have to worry about the state changing unexpectedly.
In the case where you try to like something that the user has already liked, then the API would simply return a successful result and not alter the underlying data.
One reason to prefer an interface where you specify the desired state explicitly is that it will be idempotent. That is, the resulting state is the same even if the request is made multiple times.
This is a pretty contrived example, but if two different people sharing the same account tried to like the same thing within a small enough window, you could end up with it being un-liked instead.
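The difference is easy to see in a few lines. The following is a minimal in-memory sketch (the Likes class and its method names are made up for illustration, not any real API): a duplicated like request leaves the state unchanged, while a duplicated toggle request flips it back.

```python
class Likes:
    """In-memory stand-in for the server-side like state (hypothetical)."""

    def __init__(self):
        self._liked = set()  # set of (user_id, item_id) pairs

    def like(self, user_id, item_id):
        # Idempotent: adding an existing pair changes nothing.
        self._liked.add((user_id, item_id))

    def unlike(self, user_id, item_id):
        # Idempotent: removing a missing pair changes nothing.
        self._liked.discard((user_id, item_id))

    def toggle(self, user_id, item_id):
        # Not idempotent: the outcome depends on the current state.
        key = (user_id, item_id)
        if key in self._liked:
            self._liked.remove(key)
        else:
            self._liked.add(key)

    def is_liked(self, user_id, item_id):
        return (user_id, item_id) in self._liked


explicit = Likes()
explicit.like("u1", "post-7")
explicit.like("u1", "post-7")  # retried or duplicated request
print(explicit.is_liked("u1", "post-7"))  # still liked

toggled = Likes()
toggled.toggle("u1", "post-7")
toggled.toggle("u1", "post-7")  # the same duplicated request
print(toggled.is_liked("u1", "post-7"))  # un-liked again
```

This is why a client can safely retry like or unlike after a timeout without first asking for the current state, while a retried toggle needs that extra query.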

Porting PHP API over to Parse

I am a PHP dev looking to port my API over to the Parse platform.
Am I right in thinking that you only need cloud code for complex operations? For example, consider the following methods:
// Simple function to fetch a user by id
function getUser($userid) {
    return (SELECT * FROM users WHERE userid=$userid LIMIT 1)
}
// another simple function, fetches all of a user's allergies (by their user id)
function getAllergies($userid) {
    return (SELECT * FROM allergies WHERE userid=$userid)
}
// Creates a script (story?) about the user using their user id
// Uses their name and allergies to create the story
function getScript($userid) {
    $user = getUser($userid);
    $allergies = getAllergies($userid);
    return "My name is {$user->getName()}. I am allergic to {$allergies}";
}
Would I need to implement getUser()/getAllergies() endpoints in Cloud Code? Or can I simply use Parse.Query("User")... thus leaving me with only the getScript() endpoint to implement in cloud code?
Cloud code is for computation heavy operations that should not be performed on the client, i.e. handling a large dataset.
It is also for performing beforeSave/afterSave and similar hooks.
In your example, providing you have set up a reasonable data model, none of the operations require cloud code.
Your approach sounds reasonable. I tend to put simple queries that will most likely not change on the client side, but it all depends on your scenario. When developing mobile apps I tend to put a lot of code in cloud code. I've found that it speeds up my development cycle. For example, if someone finds a bug and it's in cloud code, make the fix, run parse deploy, done! The change is available to all mobile environments instantly!!! If that same code is in my mobile app, it really sucks, because now I have to fix the bug, rebuild, push it to the app store/google play, wait x number of days for it to be approved, have the users download it... you see where I'm going here.
Take for example your SELECT * FROM allergies WHERE userid=$userid query.
Even though this is a simple query, what if you want to sort it? maybe add some additional filtering?
These are the kinds of things I think of when deciding where to put the code. Hope this helps!
As a side note, I have also found cloud code very handy when needing to add extra security to my apps.

Deep level access control in DataMapper ORM

Introduction
I'm currently building an access control system in my DataMapper ORM installation (with CodeIgniter 2.*). I have the initial injection of the User's rights (Root/Anonymous layers too) working perfectly. When a User logs in the DataMapper calls that are done in the system will automatically be marked with the Userrights the User has.
So until this point it works perfectly, but now I'm a bit in a bind. The problem is that I need some way to catch and filter each method-call on the Object that is instantiated.
I have two special calls so I can disable the Userrights-checks too. This is particularly handy at the exact moment I want to login a User and need to do initial checks;
DataMapper::disable_userrights();
$this->_user = new User($this->session->userdata('_user_id'));
$this->_userrights = ($this->_user ? $this->_user->userrights(TRUE) : NULL);
DataMapper::enable_userrights();
The above makes sure I can do the initial User (and its Userrights) injection. Inside the DataMapper library I use the $CI =& get_instance(); to access the _ globals I use. The general rule in this installment I'm building is that $this->_ is reserved for a "globals" system that always gets loaded (or can sometimes be NULL/FALSE) so I can easily access information that's almost always required on each page/call.
Details
OK, so imagine that my logged-in User from above has the Userrights Create/Read/Update on the User Entity. So now if I call a simple:
$test = new User();
$test->get_where('name', 'Allendar');
The $_rights Array inside the DataMapper instance will know that the current logged-in User is allowed to perform certain tasks on "this" instance;
protected $_rights = array(
'Create' => TRUE,
'Read' => TRUE,
'Update' => TRUE,
'Delete' => FALSE,
);
The issue
Now comes my problem. I want to control these Userrights by validating them over each action that is performed. I have the following ideas;
1.) Super redundant; make a global validation method that is executed at the start of each other method in the DataMapper Class.
    Problem 1: I have to spam the whole DataMapper Class with the same calls
    Problem 2: I have no control over DataMapper extension methods
    Problem 3: How to detect relational includes? They should be validated too
2.) Low level binding on certain Core DataMapper calls where I can clearly detect what kind of action is executed on the database (C/R/U/D).
So I'm aiming for Option 2 (and 1.) Problem 2), as it will also solve 1.) Problem 2.
The problem is that DataMapper is so massive, and it's pretty complex to discern what actually happens, and when, at its deepest calling level. Furthermore, it looks like all methods are very scattered and hardly ever use each other ($this->get() is often not used to do an eventual call to get a dataset).
So my goal is:
1.) User (logged-in, Anonymous, Root) makes a DataMapper instance
    $user_test = new User;
2.) User wants to get $user_test (Read)
    $user_test->get(1);
3.) DataMapper will validate the actual call that is done at the database
    IF it is only SELECT; OK
    IF something other than SELECT (or JOINs to other Models that the User doesn't have access to); fail with a clear error message
    IF JOINed Models also validate; OK
4.) Return the actual instance;
    IF OK: continue DataMapper's normal workflow
    IF not OK: inform the User and return the normal empty DataMapper instance of that Model
Furthermore, for this system I think I will need to add some customization for the raw_sql (etc.) SQL calls so that I have to inject the rights manually related to that SQL statement or only allow the Root User to do those things.
Recap
I'm curious if someone ever attempted something like this in DataMapper or has some hints how I can use/intercept those lowest level calls in DataMapper.
If I can get some clearance on the deepest level of DataMapper's actual final query-call I can probably get a long way myself too.
I would like to suggest not to do this in Datamapper itself (mainly due to the complexity of the code, as you have already noticed yourself).
Instead, use a base model, and have that extend Datamapper. Then add the code to the base model required for your ACL checks, and then overload every Datamapper method that needs an ACL check. Have it call your ACL, deal with an access denied, and if access is granted, simply return the result of parent::method();.
Instead of extending Datamapper, your application models should then extend this base model, so they will inherit the ACL features.
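The shape of that suggestion, sketched in Python rather than PHP to show the pattern only: Mapper below is a made-up stand-in for the ORM base class and does not reflect DataMapper's real API, and AclModel, AccessDenied and the rights dictionary are likewise hypothetical. Each overridden method asks the ACL first and then simply delegates to the parent implementation.

```python
class AccessDenied(Exception):
    """Raised when the current user lacks the right for an action."""


class Mapper:
    """Stand-in for the ORM base class (hypothetical, not DataMapper's API)."""

    _rows = {}  # shared fake table: row_id -> data

    def get(self, row_id):
        return self._rows.get(row_id)

    def save(self, row_id, data):
        self._rows[row_id] = data

    def delete(self, row_id):
        self._rows.pop(row_id, None)


class AclModel(Mapper):
    """Base model: overrides each method that needs an ACL check."""

    def __init__(self, rights):
        self._rights = dict(rights)  # e.g. {"Create": True, "Read": True, ...}

    def _require(self, action):
        if not self._rights.get(action, False):
            raise AccessDenied("{} not allowed".format(action))

    def get(self, row_id):
        self._require("Read")
        return super().get(row_id)

    def save(self, row_id, data):
        # An existing row means Update, a new row means Create.
        self._require("Update" if row_id in self._rows else "Create")
        super().save(row_id, data)

    def delete(self, row_id):
        self._require("Delete")
        super().delete(row_id)


class User(AclModel):
    """Application model: extends the base model, not the ORM directly."""


user = User({"Create": True, "Read": True, "Update": True, "Delete": False})
user.save(1, {"name": "Allendar"})
print(user.get(1))
try:
    user.delete(1)
except AccessDenied as exc:
    print("denied:", exc)
```

The trade-off is the one named in the question: every method that touches the database has to be overridden explicitly, so raw SQL helpers and extension methods still need their own handling.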

Adding to a property in a restful API

I am in the process of designing an HTTP API.
I have a Card resource which has a Balance property, which clients can add to or subtract from.
At first I thought this should be implemented as PUT, because it's a form of Update to the resource, but then I read that PUT is idempotent, but adding to an amount isn't idempotent.
As it's not a creation of an object, I think I'm left with referring to it as a controller, something like:
POST example.org/card/{card-Id}/AddToBalance
data: value=10
will add 10 to the balance.
Is there a better way?
Yeah, use cases like these are not where REST excels (expressing operations, particularly when they only affect a small subset of an entity's data). Your particular case is pretty simple though; you can handle it with a slight change to your verb and endpoint:
PUT example.org/card/{card-Id}/balance
{"value" : 100}
Basically read as "Update the balance of card {id} to 100". On the server side you will still need to validate the transaction and determine whether it's a valid add based on the existing value of the balance.
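A tiny sketch of why the two shapes behave differently when a request is retried. The Card class and the two handler functions are hypothetical stand-ins for the server side; no HTTP framework is involved.

```python
class Card:
    """Hypothetical server-side resource."""

    def __init__(self, balance=0):
        self.balance = balance


def put_balance(card, value):
    """PUT card/{card-Id}/balance: replace the balance with the given value."""
    card.balance = value


def post_add_to_balance(card, value):
    """POST card/{card-Id}/AddToBalance: add the given value to the balance."""
    card.balance += value


# The client wants to go from 90 to 100 and, after a timeout, sends the request twice.
card_a = Card(balance=90)
put_balance(card_a, 100)
put_balance(card_a, 100)  # retry: same end state
print(card_a.balance)

card_b = Card(balance=90)
post_add_to_balance(card_b, 10)
post_add_to_balance(card_b, 10)  # retry: the amount is applied twice
print(card_b.balance)
```

The PUT form stays idempotent, but the client must know the current balance to compute the new total, which is why the server-side validation mentioned above still matters.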
The design looks good as far as REST principles are concerned.
A PUT action should be idempotent, but it depends on your requirements.
Alternatively, you can use PATCH, since you are making a partial update rather than completely replacing the resource.