For context, I'm working on a RNG-based motorsport simulator, for lack of a better term. Users can create universes (think FIA in real life terms), in a universe they can create series (think F1, F2, F3 etc) and in each series they can create seasons. In each of these, users can create additional models;
In a universe, a user can create teams and drivers.
In a season, a user can create a calendar using circuits they've added outside any universe, entrants (based on teams they've created in the parent universe) and to these entrants they can add drivers (based on drivers created), which I called "lineups".
The deeper I go with testing these relationships, the more models I need to create through factories to be able to test properly, and the longer it takes for a test to run. I've got a pretty simple test to verify a universe owner can add a driver to an entrant belonging to that universe;
test('a universe owner can add drivers to entrants', function () {
$user = User::factory()->create();
$season = createSeasonForUser($user);
$driver = Driver::factory()->for($season->universe)->create();
$entrant = Entrant::factory()->for($season)->create();
$this->actingAs($user)
->post(route('seasons.lineups.store', [$season]), [
'driver_id' => $driver->id,
'entrant_id' => $entrant->id,
'number' => 2,
])
->assertRedirect(route('seasons.lineups.index', [$season]));
$this->assertDatabaseCount('lineups', 1);
$this->assertCount(1, $entrant->drivers);
$this->assertCount(1, $season->drivers);
});
I've got two helper functions to quickly create a series and/or season belonging to a specific user;
function createSeriesForUser(User $user): Series
{
return Series::factory()->for(Universe::factory()->for($user)->create())->create();
}
function createSeasonForUser(User $user): Season
{
$series = createSeriesForUser($user);
return Season::factory()->for($series)->create();
}
As you can see, to test one part of the process, I need to create six models through factories (with some of these factories sometimes calling more factories). I ran the tests five times, timing each part of the test (factories, the actual request, and the assertions), and the factories take up 1,9 seconds on average, with the rest of the tests taking up 0,015 seconds, which doesn't seem right to me.
Ideally I'd create all required database entries before each test file is run, but I've understood this is bad practice. Is there another way to speed up the tests? Sadly I can't make the relationships less nested or complicated, since these are simply the requirement of the website I'm building.
Alternatively, is this approach in general even the right way to test my controller's functionality, or can it be done differently?
To not clutter up the question too much, here's a pastebin with all current factories
Turns out Faker's image() is really slow. I replaced the 'avatar' => $this->faker->image(), in my UserFactory with 'avatar' => null,, and my entire test suit now runs in barely over three seconds, or a second if I run them in parallel.
Related
We are creating a service for an app using tornado and sqlalchemy. The application is written in django and uses a "soft delete mechanism". What that means is that there was no deletion in the underlying mysql tables. To mark a row as deleted we simply set the attributed "delete" as True. However, in the service we are using sqlalchemy. Initially, we started to add check for delete in the queries made through sqlalchemy itself like:
customers = db.query(Customer).filter(not_(Customer.deleted)).all()
However this leads to a lot of potential bugs because developers tend to miss the check for deleted in there queries. Hence we decided to override the default querying with our query class that does a "pre-filter":
class SafeDeleteMixin(Query):
def __iter__(self):
return Query.__iter__(self.deleted_filter())
def from_self(self, *ent):
# override from_self() to automatically apply
# the criterion too. this works with count() and
# others.
return Query.from_self(self.deleted_filter(), *ent)
def deleted_filter(self):
mzero = self._mapper_zero()
if mzero is not None:
crit = mzero.class_.deleted == False
return self.enable_assertions(False).filter(crit)
else:
return self
This inspired from a solution on sqlalchemy docs here:
https://bitbucket.org/zzzeek/sqlalchemy/wiki/UsageRecipes/PreFilteredQuery
However, we are still facing issues, like in cases where we are doing filter and update together and using this query class as defined above the update does not respect the criterion of delete=False when applying the filter for update.
db = CustomSession(with_deleted=False)()
result = db.query(Customer).filter(Customer.id == customer_id).update({Customer.last_active_time: last_active_time })
How can I implement the "soft-delete" feature in sqlalchemy
I've done something similar here. We did it a bit differently, we made a service layer that all database access goes through, kind of like a controller, but only for db access, we called it a ResourceManager, and it's heavily inspired by "Domain Driven Design" (great book, invaluable for using SQLAlchemy well). A derived ResourceManager exists for each aggregate root, ie. each resource class you want to get at things through. (Though sometimes for really simple ResourceManagers, the derived manager class itself is generated dynamically) It has a method that gives out your base query, and that base query gets filtered for your soft delete before it's handed out. From then on, you can add to that query generatively for filtering, and finally call it with query.one() or first() or all() or count(). Note, there is one gotcha I encountered for this kind of generative query handling, you can hang yourself if you join a table too many times. In some cases for filtering we had to keep track of which tables had already been joined. If your delete filter is off the primary table, just filter that first, and you can join willy nilly after that.
so something like this:
class ResourceManager(object):
# these will get filled in by the derived class
# you could use ABC tools if you want, we don't bother
model_class = None
serializer_class = None
# the resource manager gets instantiated once per request
# and passed the current requests SQAlchemy session
def __init__(self, dbsession):
self.dbs = dbsession
# hand out base query, assumes we have a boolean 'deleted' column
#property
def query(self):
return self.dbs(self.model_class).filter(
getattr(self.model_class, 'deleted')==False)
class UserManager(ResourceManager):
model_class = User
# some client code might look this
dbs = SomeSessionFactoryIHave()
user_manager = UserManager(dbs)
users = user_manager.query.filter_by(name_last="Duncan").first()
Now as long as I always start off by going through a ResourceManager, which has other benefits too (see aforementioned book), I know my query is pre-filtered. This has worked very well for us on a current project that has soft-delete and quite an extensive and thorny db schema.
hth!
I would create a function
def customer_query():
return db.session.query(Customer).filter(Customer.deleted == False)
I used query functions to not forget default flags, to set flags based on user permission, filter using joins etc, so that these things wont be copy-pasted and forgotten at various places.
While creating an app in Laravel 4 after reading T. Otwell's book on good design patterns in Laravel I found myself creating repositories for every table on the application.
I ended up with the following table structure:
Students: id, name
Courses: id, name, teacher_id
Teachers: id, name
Assignments: id, name, course_id
Scores (acts as a pivot between students and assignments): student_id, assignment_id, scores
I have repository classes with find, create, update and delete methods for all of these tables. Each repository has an Eloquent model which interacts with the database. Relationships are defined in the model per Laravel's documentation: http://laravel.com/docs/eloquent#relationships.
When creating a new course, all I do is calling the create method on the Course Repository. That course has assignments, so when creating one, I also want to create an entry in the score's table for each student in the course. I do this through the Assignment Repository. This implies the assignment repository communicates with two Eloquent models, with the Assignment and Student model.
My question is: as this app will probably grow in size and more relationships will be introduced, is it good practice to communicate with different Eloquent models in repositories or should this be done using other repositories instead (I mean calling other repositories from the Assignment repository) or should it be done in the Eloquent models all together?
Also, is it good practice to use the scores table as a pivot between assignments and students or should it be done somewhere else?
I am finishing up a large project using Laravel 4 and had to answer all of the questions you are asking right now. After reading all of the available Laravel books over at Leanpub, and tons of Googling, I came up with the following structure.
One Eloquent Model class per datable table
One Repository class per Eloquent Model
A Service class that may communicate between multiple Repository classes.
So let's say I'm building a movie database. I would have at least the following following Eloquent Model classes:
Movie
Studio
Director
Actor
Review
A repository class would encapsulate each Eloquent Model class and be responsible for CRUD operations on the database. The repository classes might look like this:
MovieRepository
StudioRepository
DirectorRepository
ActorRepository
ReviewRepository
Each repository class would extend a BaseRepository class which implements the following interface:
interface BaseRepositoryInterface
{
public function errors();
public function all(array $related = null);
public function get($id, array $related = null);
public function getWhere($column, $value, array $related = null);
public function getRecent($limit, array $related = null);
public function create(array $data);
public function update(array $data);
public function delete($id);
public function deleteWhere($column, $value);
}
A Service class is used to glue multiple repositories together and contains the real "business logic" of the application. Controllers only communicate with Service classes for Create, Update and Delete actions.
So when I want to create a new Movie record in the database, my MovieController class might have the following methods:
public function __construct(MovieRepositoryInterface $movieRepository, MovieServiceInterface $movieService)
{
$this->movieRepository = $movieRepository;
$this->movieService = $movieService;
}
public function postCreate()
{
if( ! $this->movieService->create(Input::all()))
{
return Redirect::back()->withErrors($this->movieService->errors())->withInput();
}
// New movie was saved successfully. Do whatever you need to do here.
}
It's up to you to determine how you POST data to your controllers, but let's say the data returned by Input::all() in the postCreate() method looks something like this:
$data = array(
'movie' => array(
'title' => 'Iron Eagle',
'year' => '1986',
'synopsis' => 'When Doug\'s father, an Air Force Pilot, is shot down by MiGs belonging to a radical Middle Eastern state, no one seems able to get him out. Doug finds Chappy, an Air Force Colonel who is intrigued by the idea of sending in two fighters piloted by himself and Doug to rescue Doug\'s father after bombing the MiG base.'
),
'actors' => array(
0 => 'Louis Gossett Jr.',
1 => 'Jason Gedrick',
2 => 'Larry B. Scott'
),
'director' => 'Sidney J. Furie',
'studio' => 'TriStar Pictures'
)
Since the MovieRepository shouldn't know how to create Actor, Director or Studio records in the database, we'll use our MovieService class, which might look something like this:
public function __construct(MovieRepositoryInterface $movieRepository, ActorRepositoryInterface $actorRepository, DirectorRepositoryInterface $directorRepository, StudioRepositoryInterface $studioRepository)
{
$this->movieRepository = $movieRepository;
$this->actorRepository = $actorRepository;
$this->directorRepository = $directorRepository;
$this->studioRepository = $studioRepository;
}
public function create(array $input)
{
$movieData = $input['movie'];
$actorsData = $input['actors'];
$directorData = $input['director'];
$studioData = $input['studio'];
// In a more complete example you would probably want to implement database transactions and perform input validation using the Laravel Validator class here.
// Create the new movie record
$movie = $this->movieRepository->create($movieData);
// Create the new actor records and associate them with the movie record
foreach($actors as $actor)
{
$actorModel = $this->actorRepository->create($actor);
$movie->actors()->save($actorModel);
}
// Create the director record and associate it with the movie record
$director = $this->directorRepository->create($directorData);
$director->movies()->associate($movie);
// Create the studio record and associate it with the movie record
$studio = $this->studioRepository->create($studioData);
$studio->movies()->associate($movie);
// Assume everything worked. In the real world you'll need to implement checks.
return true;
}
So what we're left with is a nice, sensible separation of concerns. Repositories are only aware of the Eloquent model they insert and retrieve from the database. Controllers don't care about repositories, they just hand off the data they collect from the user and pass it to the appropriate service. The service doesn't care how the data it receives is saved to the database, it just hands off the relevant data it was given by the controller to the appropriate repositories.
Keep in mind you're asking for opinions :D
Here's mine:
TL;DR: Yes, that's fine.
You're doing fine!
I do exactly what you are doing often and find it works great.
I often, however, organize repositories around business logic instead of having a repo-per-table. This is useful as it's a point of view centered around how your application should solve your "business problem".
A Course is a "entity", with attributes (title, id, etc) and even other entities (Assignments, which have their own attributes and possibly entities).
Your "Course" repository should be able to return a Course and the Courses' attributes/Assignments (including Assignment).
You can accomplish that with Eloquent, luckily.
(I often end up with a repository per table, but some repositories are used much more than others, and so have many more methods. Your "courses" repository may be much more full-featured than your Assignments repository, for instance, if your application centers more around Courses and less about a Courses' collection of Assignments).
The tricky part
I often use repositories inside of my repositories in order to do some database actions.
Any repository which implements Eloquent in order to handle data will likely return Eloquent models. In that light, it's fine if your Course model uses built-in relationships in order to retrieve or save Assignments (or any other use case). Our "implementation" is built around Eloquent.
From a practical point of view, this makes sense. We're unlikely to change data sources to something Eloquent can't handle (to a non-sql data source).
ORMS
The trickiest part of this setup, for me at least, is determing if Eloquent is actually helping or harming us. ORMs are a tricky subject, because while they help us greatly from a practical point of view, they also couple your "business logic entities" code with the code doing the data retrieval.
This sort of muddles up whether your repository's responsibility is actually for handling data or handling the retrieval / update of entities (business domain entities).
Furthermore, they act as the very objects you pass to your views. If you later have to get away from using Eloquent models in a repository, you'll need to make sure the variables passed to your views behave in the same way or have the same methods available, otherwise changing your data sources will roll into changing your views, and you've (partially) lost the purpose of abstracting your logic out to repositories in the first place - the maintainability of your project goes down as.
Anyway, these are somewhat incomplete thoughts. They are, as stated, merely my opinion, which happens to be the result of reading Domain Driven Design and watching videos like "uncle bob's" keynote at Ruby Midwest within the last year.
I like to think of it in terms of what my code is doing and what it is responsible for, rather than "right or wrong". This is how I break apart my responsibilities:
Controllers are the HTTP layer and route requests through to the underlying apis (aka, it controls the flow)
Models represent the database schema, and tell the application what the data looks like, what relationships it may have, as well as any global attributes that may be necessary (such as a name method for returning a concatenated first and last name)
Repositories represent the more complex queries and interactions with the models (I don't do any queries on model methods).
Search engines - classes that help me build complex search queries.
With this in mind, it makes sense every time to use a repository (whether you create interfaces.etc. is a whole other topic). I like this approach, because it means I know exactly where to go when I'm needing to do certain work.
I also tend to build a base repository, usually an abstract class which defines the main defaults - basically CRUD operations, and then each child can just extend and add methods as necessary, or overload the defaults. Injecting your model also helps this pattern to be quite robust.
Think of Repositories as a consistent filing cabinet of your data (not just your ORMs). The idea is that you want to grab data in a consistent simple to use API.
If you find yourself just doing Model::all(), Model::find(), Model::create() you probably won't benefit much from abstracting away a repository. On the other hand, if you want to do a bit more business logic to your queries or actions, you may want to create a repository to make an easier to use API for dealing with data.
I think you were asking if a repository would be the best way to deal with some of the more verbose syntax required to connect related models. Depending on the situation, there are a few things I may do:
Hanging a new child model off of a parent model (one-one or one-many), I would add a method to the child repository something like createWithParent($attributes, $parentModelInstance) and this would just add the $parentModelInstance->id into the parent_id field of the attributes and call create.
Attaching a many-many relationship, I actually create functions on the models so that I can run $instance->attachChild($childInstance). Note that this requires existing elements on both side.
Creating related models in one run, I create something that I call a Gateway (it may be a bit off from Fowler's definitions). Way I can call $gateway->createParentAndChild($parentAttributes, $childAttributes) instead of a bunch of logic that may change or that would complicate the logic that I have in a controller or command.
I have been building a rest api (using Phil Sturgeons Codeigniter-Restserver) and Ive been sticking closely to the tutorial at:
http://net.tutsplus.com/tutorials/php/working-with-restful-services-in-codeigniter-2/
in particular Ive been paying attention to this part of the tutorial:
function user_get()
{
// respond with information about a user
}
function user_put()
{
// create a new user and respond with a status/errors
}
function user_post()
{
// update an existing user and respond with a status/errors
}
function user_delete()
{
// delete a user and respond with a status/errors
}
and Ive been writing the above functions for each database object that is accessible by the api, and also:
function users_get() // <-- Note the "S" at the end of "user"
{
// respond with information about all users
}
I currently have approximately 30 database objects (users, products, clients, transactions etc), all of which have the above functions written for them, and all functions are dumped into /controllers/api/api.php, and this file has now grown to be quite large (over 2000 lines of code).
QUESTION 1:
Is there a way to split this api file up, into 30 files for example, and keep all api functions relating to a single database object in a single place, rather than just dumping all api functions into a single file?
QUESTION 2:
I would also like to keep a separation between my current model functions (non-api related functions) and the functions that are used by the api.
Should I be doing this?
Is there a recommended approach that I should use here? For example should I write separate models that are used by the api, or is ok to keep all model functions (both non-api functions, and api functions) for a given database object in the same file?
Any feedback or advice would be great..
You can create api controllers the same way you do regular controllers; you can do the same with models.
application/controllers/api/users.php
class Users extends REST_Controller{
function user_post(){
$this->users_model->new_user()
...
POST index.php/api/user
--
application/controllers/api/transactions.php
class Transactions extends REST_Controller{
function transaction_get(){
$this->transactions_model->get()
...
GET index.php/api/transaction
I would also like to keep a separation between my current model functions (non-api related functions) and the functions that are used by the api.
I don't see why you couldn't use the same methods so long as they return what you need.
My SQL and Entity Framework knowledge is a somewhat limited. In one Entity Framework (4) application, I notice it takes forever (about 2 minutes) to complete one of my method calls. The first queries do not take much time, but when I loop through the Entity Framework objects returned by the queries, even though I am only reading (not modifying) the data I supposedly got, it takes forever to complete the nested loops, even though there are only dozens of entries in each list and a few levels of looping.
I expect the example below could be re-written with a fancier query that could probably include all of the filtering I am doing in my loops with some SQL words I don't really know how to use, so if someone could show me what the equivalent SQL expression would be, that would be extremely educational to me and probably solve my current performance problem.
Moreover, since other parts of this and other applications I develop often want to do more complex computations on SQL data, I would also like to know a good way to retrieve data from Entity Framework to local memory objects that do not have huge delays in reading them. In my LINQ-to-SQL project there was a similar performance problem, and I solved it by refactoring the whole application to load all SQL data into parallel objects in RAM, which I had to write myself, and I wonder if there isn't a better way to either tell Entity Framework to not keep doing whatever high-latency communication it is doing, or to load into local RAM objects.
In the example below, the code gets a list of food menu items for a member (i.e. a person) on a certain date via a SQL query, and then I use other queries and loops to filter out the menu items on two criteria: 1) If the member has a rating of zero for any group id which the recipe is a member of (a many-to-many relationship) and 2) If the member has a rating of zero for the recipe itself.
Example:
List<PFW_Member_MenuItem> MemberMenuForCookDate =
(from item in _myPfwEntities.PFW_Member_MenuItem
where item.MemberID == forMemberId
where item.CookDate == onCookDate
select item).ToList();
// Now filter out recipes in recipe groups rated zero by the member:
List<PFW_Member_Rating_RecipeGroup> ExcludedGroups =
(from grpRating in _myPfwEntities.PFW_Member_Rating_RecipeGroup
where grpRating.MemberID == forMemberId
where grpRating.Rating == 0
select grpRating).ToList();
foreach (PFW_Member_Rating_RecipeGroup grpToExclude in ExcludedGroups)
{
List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
{
PFW_Recipe rcp = GetRecipeById(rcpOnMenu.RecipeID);
foreach (PFW_RecipeGroup group in rcp.PFW_RecipeGroup)
{
if (group.RecipeGroupID == grpToExclude.RecipeGroupID)
{
rcpsToRemove.Add(rcpOnMenu);
break;
}
}
}
foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
MemberMenuForCookDate.Remove(rcpToRemove);
}
// Now filter out recipes rated zero by the member:
List<PFW_Member_Rating_Recipe> ExcludedRecipes =
(from rcpRating in _myPfwEntities.PFW_Member_Rating_Recipe
where rcpRating.MemberID == forMemberId
where rcpRating.Rating == 0
select rcpRating).ToList();
foreach (PFW_Member_Rating_Recipe rcpToExclude in ExcludedRecipes)
{
List<PFW_Member_MenuItem> rcpsToRemove = new List<PFW_Member_MenuItem>();
foreach (PFW_Member_MenuItem rcpOnMenu in MemberMenuForCookDate)
{
if (rcpOnMenu.RecipeID == rcpToExclude.RecipeID)
rcpsToRemove.Add(rcpOnMenu);
}
foreach (PFW_Member_MenuItem rcpToRemove in rcpsToRemove)
MemberMenuForCookDate.Remove(rcpToRemove);
}
You can use EFProf http://www.hibernatingrhinos.com/products/EFProf to track see exactly what EF is sending to SQL. It can also show you how many queries you are sending and how many unique queries. It also provides you some analysis of each query (e.g. is it unbound etc). Entity Framework with its navigation properties, it is quite easy to not realize you are making a db request. When you are in a loop, and have a navigation property, you get in to the N + 1 problem.
You could use the Keyword Virtual on your List parts of your model if you are using code first to enable proxying, that way you will not have to get all the data back at once, only as you need it.
Also consider NoTracking for read only data
context.bigTable.MergeOption = MergeOption.NoTracking;
In Orchard, how is a module developer able to learn how "joins" work, particularly when joining to core parts and records? One of the better helps I've seen was in Orchard documentation, but none of those examples show how to form relations with existing or core parts. As an example of something I'm looking for, here is a snippet of module service code taken from a working example:
_contentManager
.Query<TaxonomyPart>()
.Join<RoutePartRecord>()
.Where(r => r.Title == name)
.List()
In this case, a custom TaxonomyPart is joining with a core RoutePartRecord. I've investigated the code, and I can't see how that a TaxononmyPart is "joinable" to a RoutePartRecord. Likewise, from working code, here is another snippet driver code which relates a custom TagsPart with a core CommonPartRecord:
List<string> tags = new List<string> { "hello", "there" };
IContentQuery<TagsPart, TagsPartRecord> query = _cms.Query<TagsPart, TagsPartRecord>();
query.Where(tpr => tpr.Tags.Any(t => tags.Contains(t.TagRecord.TagName)));
IEnumerable<TagsPart> parts =
query.Join<CommonPartRecord>()
.Where(cpr => cpr.Id != currentItemId)
.OrderByDescending(cpr => cpr.PublishedUtc)
.Slice(part.MaxItems);
I thought I could learn from either of the prior examples of how to form my own query. I did this:
List<string> tags = new List<string> { "hello", "there" };
IContentQuery<TagsPart, TagsPartRecord> query = _cms.Query<TagsPart, TagsPartRecord>();
query.Where(tpr => tpr.Tags.Any(t => tags.Contains(t.TagRecord.TagName)));
var stuff =
query.Join<ContainerPartRecord>()
.Where(ctrPartRecord => ctrPartRecord.ContentItemRecord.ContentType.Name == "Primary")
.List();
The intent of my code is to limit the content items found to only those of a particular container (or blog). When the code ran, it threw an exception on my join query saying {"could not resolve property: ContentType of: Orchard.Core.Containers.Models.ContainerPartRecord"}. This leads to a variety of questions:
Why in the driver's Display() method of the second example is the CommonPartRecord populated, but not the ContainerPartRecord? In general how would I know what part records are populated, and when?
In the working code snippets, how exactly is the join working since no join key/condition is specified (and no implicit join keys are apparent)? For example, I checked the data migration file and models classes, and found no inherent relation between a TagsPart and a CommonPartRecord. Thus, besides looking at that sample code, how would anyone have known in the first place that such a join was legal or possible?
Is the join I tried with TagsPart and ContainerPartRecord legal in any context? Which?
Is the query syntax of these examples primarily a reflection of Orchard, of NHibernate, or LINQ to NHibernate? If it is primarily a reflection of NHibernate, then which NHibernate book or article is recommended reading so that I can dig deeper into Orchard?
It seems there is a hole in the documentation regarding these kinds of thoughts and questions, which makes it hard to write a module. Whatever answers can be found for this topic, I'd be glad to compile into an article or community Orchard documentation.
The join is only there to enable the where that follows it. It doesn't mean that the part being joined will be actually brought down from the DB. That will happen no matter what with the latest 1.x source, and will happen lazily with 1.3.
You don't need a condition as you can only join parts this way. The join condition is implicit: parts are joined by the item id.
Yes. What is not legal is that the condition in the where is using data that is not available from the joined part records.
Those examples are all Orchard Content Manager queries, so they are fairly constrained, but also fairly easy to build as long as you don't step outside of their boundaries because so much can be assumed and will happen implicitly. If you need more control, you could use the new HQL capabilities that were added in the latest 1.x drops.
As for holes in the documentation, well, but of course. The documentation that we have today is only covering a very small part of the platform. Your best reference today is the source code. Any contribution you could make to this is highly appreciated by us and by the rest of the community. Let me know if you need help with this.