Related
#Private attribute example
class C {
has $!w; #private attribute
multi method w { $!w } #getter method
multi method w ( $_ ) { #setter method
warn “Don’t go changing my w!”; #some side action
$!w = $_
}
}
my $c = C.new
$c.w( 42 )
say $c.w #prints 42
$c.w: 43
say $c.w #prints 43
#but not
$c.w = 44
Cannot modify an immutable Int (43)
so far, so reasonable, and then
#Public attribute example
class C {
has $.v is rw #public attribute with automatic accessors
}
my $c = C.new
$c.v = 42
say $c.v #prints 42
#but not
$c.v( 43 ) #or $c.v: 43
Too many positionals passed; expected 1 argument but got 2
I like the immediacy of the ‘=‘ assignment, but I need the ease of bunging in side actions that multi methods provide. I understand that these are two different worlds, and that they do not mix.
BUT - I do not understand why I can’t just go
$c.v( 43 )
To set a public attribute
I feel that raku is guiding me to not mix these two modes - some attributes private and some public and that the pressure is towards the method method (with some : sugar from the colon) - is this the intent of Raku's design?
Am I missing something?
is this the intent of Raku's design?
It's fair to say that Raku isn't entirely unopinionated in this area. Your question touches on two themes in Raku's design, which are both worth a little discussion.
Raku has first-class l-values
Raku makes plentiful use of l-values being a first-class thing. When we write:
has $.x is rw;
The method that is generated is:
method x() is rw { $!x }
The is rw here indicates that the method is returning an l-value - that is, something that can be assigned to. Thus when we write:
$obj.x = 42;
This is not syntactic sugar: it really is a method call, and then the assignment operator being applied to the result of it. This works out, because the method call returns the Scalar container of the attribute, which can then be assigned into. One can use binding to split this into two steps, to see it's not a trivial syntactic transform. For example, this:
my $target := $obj.x;
$target = 42;
Would be assigning to the object attribute. This same mechanism is behind numerous other features, including list assignment. For example, this:
($x, $y) = "foo", "bar";
Works by constructing a List containing the containers $x and $y, and then the assignment operator in this case iterates each side pairwise to do the assignment. This means we can use rw object accessors there:
($obj.x, $obj.y) = "foo", "bar";
And it all just naturally works. This is also the mechanism behind assigning to slices of arrays and hashes.
One can also use Proxy in order to create an l-value container where the behavior of reading and writing it are under your control. Thus, you could put the side-actions into STORE. However...
Raku encourages semantic methods over "setters"
When we describe OO, terms like "encapsulation" and "data hiding" often come up. The key idea here is that the state model inside the object - that is, the way it chooses to represent the data it needs in order to implement its behaviors (the methods) - is free to evolve, for example to handle new requirements. The more complex the object, the more liberating this becomes.
However, getters and setters are methods that have an implicit connection with the state. While we might claim we're achieving data hiding because we're calling a method, not accessing state directly, my experience is that we quickly end up at a place where outside code is making sequences of setter calls to achieve an operation - which is a form of the feature envy anti-pattern. And if we're doing that, it's pretty certain we'll end up with logic outside of the object that does a mix of getter and setter operations to achieve an operation. Really, these operations should have been exposed as methods with a names that describes what is being achieved. This becomes even more important if we're in a concurrent setting; a well-designed object is often fairly easy to protect at the method boundary.
That said, many uses of class are really record/product types: they exist to simply group together a bunch of data items. It's no accident that the . sigil doesn't just generate an accessor, but also:
Opts the attribute into being set by the default object initialization logic (that is, a class Point { has $.x; has $.y; } can be instantiated as Point.new(x => 1, y => 2)), and also renders that in the .raku dumping method.
Opts the attribute into the default .Capture object, meaning we can use it in destructuring (e.g. sub translated(Point (:$x, :$y)) { ... }).
Which are the things you'd want if you were writing in a more procedural or functional style and using class as a means to define a record type.
The Raku design is not optimized for doing clever things in setters, because that is considered a poor thing to optimize for. It's beyond what's needed for a record type; in some languages we could argue we want to do validation of what's being assigned, but in Raku we can turn to subset types for that. At the same time, if we're really doing an OO design, then we want an API of meaningful behaviors that hides the state model, rather than to be thinking in terms of getters/setters, which tend to lead to a failure to colocate data and behavior, which is much of the point of doing OO anyway.
BUT - I do not understand why I can’t just go $c.v( 43 ) To set a public attribute
Well, that's really up to the architect. But seriously, no, that's simply not the standard way Raku works.
Now, it would be entirely possible to create an Attribute trait in module space, something like is settable, that would create an alternate accessor method that would accept a single value to set the value. The problem with doing this in core is, is that I think there are basically 2 camps in the world about the return value of such a mutator: would it return the new value, or the old value?
Please contact me if you're interested in implementing such a trait in module space.
I currently suspect you just got confused.1 Before I touch on that, let's start over with what you're not confused about:
I like the immediacy of the = assignment, but I need the ease of bunging in side actions that multi methods provide. ... I do not understand why I can’t just go $c.v( 43 ) To set a public attribute
You can do all of these things. That is to say you use = assignment, and multi methods, and "just go $c.v( 43 )", all at the same time if you want to:
class C {
has $!v;
multi method v is rw { $!v }
multi method v ( :$trace! ) is rw { say 'trace'; $!v }
multi method v ( $new-value ) { say 'new-value'; $!v = $new-value }
}
my $c = C.new;
$c.v = 41;
say $c.v; # 41
$c.v(:trace) = 42; # trace
say $c.v; # 42
$c.v(43); # new-value
say $c.v; # 43
A possible source of confusion1
Behind the scenes, has $.foo is rw generates an attribute and a single method along the lines of:
has $!foo;
method foo () is rw { $!foo }
The above isn't quite right though. Given the behavior we're seeing, the compiler's autogenerated foo method is somehow being declared in such a way that any new method of the same name silently shadows it.2
So if you want one or more custom methods with the same name as an attribute you must manually replicate the automatically generated method if you wish to retain the behavior it would normally be responsible for.
Footnotes
1 See jnthn's answer for a clear, thorough, authoritative accounting of Raku's opinion about private vs public getters/setters and what it does behind the scenes when you declare public getters/setters (i.e. write has $.foo).
2 If an autogenerated accessor method for an attribute was declared only, then Raku would, I presume, throw an exception if a method with the same name was declared. If it were declared multi, then it should not be shadowed if the new method was also declared multi, and should throw an exception if not. So the autogenerated accessor is being declared with neither only nor multi but instead in some way that allows silent shadowing.
If I add public method to subclass and a client program calls added method, client programs can't use parent object instead of subclass.
import unittest
class BaseClass(object):
def doSomething(self):
pass
class SubClass(BaseClass):
def doStuff(self):
pass
class Client(object):
def __init__(self, obj):
self.obj = obj
def do(self):
self.obj.doStuff()
class LSPTestCase(unittest.TestCase):
def test_call_subclass_method(self):
client = Client(SubClass())
client.do()
def test_call_baseclass_method(self):
client = Client(BaseClass())
with self.assertRaises(AttributeError):
client.do()
if __name__ == '__main__':
unittest.main()
This case violates LSP?
No as long as all the methods inherrited from the parent class follow the same contract as the parent then LSP is preserved.
the whole point of LSP is to be able to pass around a subclass as the parent class without any problems. It says nothing about not being able to downcast for additional functionality.
Adding a method to a subclass does not violate LSP. However, it does violate if you invoke that method by downcasting from a parent class.
Robert C. Martin states right in the beginning of his LSP / SOLID paper that:
select a function based upon the type of an object
(...) is an example of a simple violation of the Liskov Substitution Principle.
Generally, if you end up in situations where you need to downcast or use the instanceof operator, it's better to revisit the use of inheritance in favor of other approaches like composition.
TL;DR
How do you test a value object in isolation from its dependencies without stubbing or injecting them?
In Misko Hevery's blog post To “new” or not to “new”… he advocates the following (quoted from the blog post):
An Injectable class can ask for other Injectables in its constructor.(Sometimes I refer to Injectables as Service Objects, but
that term is overloaded.). Injectable can never ask for a non-Injectable (Newable) in its constructor.
Newables can ask for other Newables in their constructor, but not for Injectables (Sometimes I refer to Newables as Value Object, but
again, the term is overloaded)
Now if I have a Quantity value object like this:
class Quantity{
$quantity=0;
public function __construct($quantity){
$intValidator = new Zend_Validate_Int();
if(!$intValidator->isValid($quantity)){
throw new Exception("Quantity must be an integer.");
}
$gtValidator = new Zend_Validate_GreaterThan(0);
if(!$gtvalidator->isValid($quantity)){
throw new Exception("Quantity must be greater than zero.");
}
$this->quantity=$quantity;
}
}
My Quantity value object depends on at least 2 validators for its proper construction. Normally I would have injected those validators through the constructor, so that I can stub them during testing.
However, according to Misko a newable shouldn't ask for injectables in its constructor. Frankly a Quantity object that looks like this
$quantity=new Quantity(1,$intValidator,$gtValidator); looks really awkward.
Using a dependency injection framework to create a value object is even more awkward. However now my dependencies are hard coded in the Quantity constructor and I have no way to alter them if the business logic changes.
How do you design the value object properly for testing and adherence to the separation between injectables and newables?
Notes:
This is just a very very simplified example. My real object my have serious logic in it that may use other dependencies as well.
I used a PHP example just for illustration. Answers in other languages are appreciated.
A Value Object should only contain primitive values (integers, strings, boolean flags, other Value Objects, etc.).
Often, it would be best to let the Value Object itself protect its invariants. In the Quantity example you supply, it could easily do that by checking the incoming value without relying on external dependencies. However, I realize that you write
This is just a very very simplified example. My real object my have serious logic in it that may use other dependencies as well.
So, while I'm going to outline a solution based on the Quantity example, keep in mind that it looks overly complex because the validation logic is so simple here.
Since you also write
I used a PHP example just for illustration. Answers in other languages are appreciated.
I'm going to answer in F#.
If you have external validation dependencies, but still want to retain Quantity as a Value Object, you'll need to decouple the validation logic from the Value Object.
One way to do that is to define an interface for validation:
type IQuantityValidator =
abstract Validate : decimal -> unit
In this case, I patterned the Validate method on the OP example, which throws exceptions upon validation failures. This means that if the Validate method doesn't throw an exception, all is good. This is the reason the method returns unit.
(If I hadn't decided to pattern this interface on the OP, I'd have preferred using the Specification pattern instead; if so, I'd instead have declared the Validate method as decimal -> bool.)
The IQuantityValidator interface enables you to introduce a Composite:
type CompositeQuantityValidator(validators : IQuantityValidator list) =
interface IQuantityValidator with
member this.Validate value =
validators
|> List.iter (fun validator -> validator.Validate value)
This Composite simply iterates through other IQuantityValidator instances and invokes their Validate method. This enables you to compose arbitrarily complex validator graphs.
One leaf validator could be:
type IntegerValidator() =
interface IQuantityValidator with
member this.Validate value =
if value % 1m <> 0m
then
raise(
ArgumentOutOfRangeException(
"value",
"Quantity must be an integer."))
Another one could be:
type GreaterThanValidator(boundary) =
interface IQuantityValidator with
member this.Validate value =
if value <= boundary
then
raise(
ArgumentOutOfRangeException(
"value",
"Quantity must be greater than zero."))
Notice that the GreaterThanValidator class takes a dependency via its constructor. In this case, boundary is just a decimal, so it's a Primitive Dependency, but it could just as well have been a polymorphic dependency (A.K.A a Service).
You can now compose your own validator from these building blocks:
let myValidator =
CompositeQuantityValidator([IntegerValidator(); GreaterThanValidator(0m)])
When you invoke myValidator with e.g. 9m or 42m, it returns without errors, but if you invoke it with e.g. 9.8m, 0m or -1m it throws the appropriate exception.
If you want to build something a bit more complicated than a decimal, you can introduce a Factory, and compose the Factory with the appropriate validator.
Since Quantity is very simple here, we can just define it as a type alias on decimal:
type Quantity = decimal
A Factory might look like this:
type QuantityFactory(validator : IQuantityValidator) =
member this.Create value : Quantity =
validator.Validate value
value
You can now compose a QuantityFactory instance with your validator of choice:
let factory = QuantityFactory(myValidator)
which will let you supply decimal values as input, and get (validated) Quantity values as output.
These calls succeed:
let x = factory.Create 9m
let y = factory.Create 42m
while these throw appropriate exceptions:
let a = factory.Create 9.8m
let b = factory.Create 0m
let c = factory.Create -1m
Now, all of this is very complex given the simple nature of the example domain, but as the problem domain grows more complex, complex is better than complicated.
Avoid value types with dependencies on non-value types. Also avoid constructors that perform validations and throw exceptions. In your example I'd have a factory type that validates and creates quantities.
Your scenario can also be applied to entities. There are cases where an entity requires some dependency in order to perform some behaviour. As far as I can tell the most popular mechanism to use is double-dispatch.
I'll use C# for my examples.
In your case you could have something like this:
public void Validate(IQuantityValidator validator)
As other answers have noted a value object is typically simple enough to perform its invariant checking in the constructor. An e-mail value object would be a good example as an e-mail has a very specific structure.
Something a bit more complex could be an OrderLine where we need to determine, totally hypothetical, whether it is, say, taxable:
public bool IsTaxable(ITaxableService service)
In the article you reference I would assert that the 'newable' relates quite a bit to the 'transient' type of life cycle that we find in DI containers as we are interested in specific instances. However, when we need to inject specific values the transient business does not really help. This is the case for entities where each is a new instance but has very different state. A repository would hydrate the object but it could just as well use a factory.
The 'true' dependencies typically have a 'singleton' life-cycle.
So for the 'newable' instances a factory could be used if you would like to perform validation upon construction by having the factory call the relevant validation method on your value object using the injected validator dependency as Mark Seemann has mentioned.
This gives you the freedom to still test in isolation without coupling to a specific implementation in your constructor.
Just a slightly different angle on what has already been answered. Hope it helps :)
I am building a package to handle data that arrives with up to 4 different types. Each of these types is a legitimate class in the form of a matrix, data.frame or tree. Depending on the way the data is processed and other experimental factors, some of these data components may be missing, but it is still extremely useful to be able to store this information as an instance of a special class and have methods that recognize the different component data.
Approach 1:
I have experimented with an incremental inheritance structure that looks like a nested tree, where each combination of data types has its own class explicitly defined. This seems difficult to extend for additional data types in the future, and is also challenging for new developers to learn all the class names, however well-organized those names might be.
Approach 2:
A second approach is to create a single "master-class" that includes a slot for all 4 data types. In order to allow the slots to be NULL for the instances of missing data, it appears necessary to first define a virtual class union between the NULL class and the new data type class, and then use the virtual class union as the expected class for the relevant slot in the master-class. Here is an example (assuming each data type class is already defined):
################################################################################
# Use setClassUnion to define the unholy NULL-data union as a virtual class.
################################################################################
setClassUnion("dataClass1OrNULL", c("dataClass1", "NULL"))
setClassUnion("dataClass2OrNULL", c("dataClass2", "NULL"))
setClassUnion("dataClass3OrNULL", c("dataClass3", "NULL"))
setClassUnion("dataClass4OrNULL", c("dataClass4", "NULL"))
################################################################################
# Now define the master class with all 4 slots, and
# also the possibility of empty (NULL) slots and an explicity prototype for
# slots to be set to NULL if they are not provided at instantiation.
################################################################################
setClass(Class="theMasterClass",
representation=representation(
slot1="dataClass1OrNULL",
slot2="dataClass2OrNULL",
slot3="dataClass3OrNULL",
slot4="dataClass4OrNULL"),
prototype=prototype(slot1=NULL, slot2=NULL, slot3=NULL, slot4=NULL)
)
################################################################################
So the question might be rephrased as:
Are there more efficient and/or flexible alternatives to either of these approaches?
This example is modified from an answer to a SO question about setting the default value of slot to NULL. This question differs in that I am interested in knowing the best options in R for creating classes with slots that can be empty if needed, despite requiring a specific complex class in all other non-empty cases.
In my opinion...
Approach 2
It sort of defeats the purpose to adopt a formal class system, and then to create a class that contains ill-defined slots ('A' or NULL). At a minimum I would try to make DataClass1 have a 'NULL'-like default. As a simple example, the default here is a zero-length numeric vector.
setClass("DataClass1", representation=representation(x="numeric"))
DataClass1 <- function(x=numeric(), ...) {
new("DataClass1", x=x, ...)
}
Then
setClass("MasterClass1", representation=representation(dataClass1="DataClass1"))
MasterClass1 <- function(dataClass1=DataClass1(), ...) {
new("MasterClass1", dataClass1=dataClass1, ...)
}
One benefit of this is that methods don't have to test whether the instance in the slot is NULL or 'DataClass1'
setMethod(length, "DataClass1", function(x) length(x#x))
setMethod(length, "MasterClass1", function(x) length(x#dataClass1))
> length(MasterClass1())
[1] 0
> length(MasterClass1(DataClass1(1:5)))
[1] 5
In response to your comment about warning users when they access 'empty' slots, and remembering that users usually want functions to do something rather than tell them they're doing something wrong, I'd probably return the empty object DataClass1() which accurately reflects the state of the object. Maybe a show method would provide an overview that reinforced the status of the slot -- DataClass1: none. This seems particularly appropriate if MasterClass1 represents a way of coordinating several different analyses, of which the user may do only some.
A limitation of this approach (or your Approach 2) is that you don't get method dispatch -- you can't write methods that are appropriate only for an instance with DataClass1 instances that have non-zero length, and are forced to do some sort of manual dispatch (e.g., with if or switch). This might seem like a limitation for the developer, but it also applies to the user -- the user doesn't get a sense of which operations are uniquely appropriate to instances of MasterClass1 that have non-zero length DataClass1 instances.
Approach 1
When you say that the names of the classes in the hierarchy are going to be confusing to your user, it seems like this is maybe pointing to a more fundamental issue -- you're trying too hard to make a comprehensive representation of data types; a user will never be able to keep track of ClassWithMatrixDataFrameAndTree because it doesn't represent the way they view the data. This is maybe an opportunity to scale back your ambitions to really tackle only the most prominent parts of the area you're investigating. Or perhaps an opportunity to re-think how the user might think of and interact with the data they've collected, and to use the separation of interface (what the user sees) from implementation (how you've chosen to represent the data in classes) provided by class systems to more effectively encapsulate what the user is likely to do.
Putting the naming and number of classes aside, when you say "difficult to extend for additional data types in the future" it makes me wonder if perhaps some of the nuances of S4 classes are tripping you up? The short solution is to avoid writing your own initialize methods, and rely on the constructors to do the tricky work, along the lines of
setClass("A", representation(x="numeric"))
setClass("B", representation(y="numeric"), contains="A")
A <- function(x = numeric(), ...) new("A", x=x, ...)
B <- function(a = A(), y = numeric(), ...) new("B", a, y=y, ...)
and then
> B(A(1:5), 10)
An object of class "B"
Slot "y":
[1] 10
Slot "x":
[1] 1 2 3 4 5
I'm trying to confirm now, what I believe about the Data Mapper pattern. So here we go:
Section A:
A Data Mapper is a class which is used to create, update and delete objects of another Class. Example: A class called Cat, and a Data Mapper called CatDataMapper. And a database table called cats. But it could also be an xml file called cats.xml, or an hard-coded array called cats. The whole point of that Data Mapper is to free the Business Logic which uses the Cat class from thinking about "how to get an exisiting cat", or "how to save a cat", "where to save a cat". As a user of the Data Mapper it looks like a blackbox with well-defined methods like getCat(int id), saveCat(Cat catObject), deleteCat(Cat catObject), and so on.
Section B:
First I thought it would be clever if Cat inherits from CatDataMapper, because calling such functions then is a bit more convenient. For example, methods like catWithId(int id) could be static (class method) and return an instance of a Cat, initialized with data from anywhere. And when I work with a cat object in my code, I could simply call myCat->save(); to store it whereever the Data Mapper will store it (don't care where and how, the Data Mapper hides this complexity from the user).
In conclusion, I'm a little bit confused now ;)
Do you think that Section A is valid for the Data Mapper pattern? And if I would do it additionaly as described in Section B, would that be bad? Why?
I think your Section A corresponds to the definiton of the Data Mapper pattern as given by Martin Fowler
Be careful with the details of your implementation language. In Section B having catWithId() be a static member of the cat class may interfere with polymorphic behavior of the method.
In java, the JVM will dispatch a static method based on the declared type of the reference.
Try this out:
1. create a class CatDataMapper with the static method catWithId(int id)
2. create a class Cat extending CatDataMapper that has the desired Business Logic behavior
3. subclass Cat with LoggedCat that logs all activity, including the activity from CatDataMapper
4. do Cat foo = new LoggedCat()
5. do Cat bar = foo.catWithId(5)
note which method is called, it should be the static method of CatDataMapper not the static method of LoggedCat
http://faq.javaranch.com/view?OverridingVsHiding gives a more in-depth discussion of this.
I think this is an OK approach. Aside from the naming conventions used, you're following a well known Data Access pattern here and you're allowing the users of the Cat objects to perform CRUD operations without having to talk to the CatDataMapper which is always a plus in my book.
I would usggest looking at Spring Container technology for this if you're in the java world.