Reference Semantics in Google Protocol Buffers - serialization

I have slightly peculiar program which deals with cases very similar to this
(in C#-like pseudo code):
class CDataSet
{
int m_nID;
string m_sTag;
float m_fValue;
void PrintData()
{
//Blah Blah
}
};
class CDataItem
{
int m_nID;
string m_sTag;
CDataSet m_refData;
CDataSet m_refParent;
void Print()
{
if(null == m_refData)
{
m_refParent.PrintData();
}
else
{
m_refData.PrintData();
}
}
};
Members m_refData and m_refParent are initialized to null and used as follows:
m_refData -> Used when a new data set is added
m_refParent -> Used to point to an existing data set.
A new data set is added only if the field m_nID doesn't match an existing one.
Currently this code is managing around 500 objects with around 21 fields per object and the format of choice as of now is XML, which at 100k+ lines and 5MB+ is very unwieldy.
I am planning to modify the whole shebang to use ProtoBuf, but currently I'm not sure as to how I can handle the reference semantics. Any thoughts would be much appreciated

Out of the box, protocol buffers does not have any reference semantics. You would need to cross-reference them manually, typically using an artificial key. Essentially on the DTO layer you would a key to CDataSet (that you simply invent, perhaps just an increasing integer), storing the key instead of the item in m_refData/m_refParent, and running fixup manually during serialization/deserialization. You can also just store the index into the set of CDataSet, but that may make insertion etc more difficult. Up to you; since this is serialization you could argue that you won't insert (etc) outside of initial population and hence the raw index is fine and reliable.
This is, however, a very common scenario - so as an implementation-specific feature I've added optional (opt-in) reference tracking to my implementation (protobuf-net), which essentially automates the above under the covers (so you don't need to change your objects or expose the key outside of the binary stream).

Related

Common return type for all ANTLR visitor methods

I'm writing a parser for an old proprietary report specification with ANTLR and I'm currently trying to implement a visitor of the generated parse tree extending the autogenerated abstract visito class.
I have little experience both with ANTLR (which I learned only recently) and with the visitor pattern in general, but if I understood it correctly, the visitor should encapsulate one single operation on the whole data structure (in this case the parse tree), thus sharing the same return type between each Visit*() method.
Taking an example from The Definitive ANTLR 4 Reference book by Terence Parr, to visit a parse tree generated by a grammar that parses a sequence of arithmetic expressions, it feels natural to choose the int return type, as each node of the tree is actually part of the the arithmetic operation that contributes to the final result by the calculator.
Considering my current situation, I don't have a common type: my grammar parses the whole document, which is actually split in different sections with different responsibilities (variable declarations, print options, actual text for the rows, etc...), and I can't find a common type between the result of the visit of so much different nodes, besides object of course.
I tried to think to some possible solutions:
I firstly tried implementing a stateless visitor using object as
the common type, but the amount of type casts needed sounds like a
big red flag to me. I was considering the usage of JSON, but I think
the problem remains, potentially adding some extra overhead in the
serialization process.
I was also thinking about splitting the visitor in more smaller
visitors with a specific purpose (get all the variables, get all the
rows, etc.), but with this solution for each visitor I would
implement only a small subset of the method of the autogenerated
interface (as it is meant to support the visit of the whole tree),
because each visiting operation would probably focus only on a
specific subtree. Is it normal?
Another possibility could be to redesign the data structure so that
it could be used at every level of the tree or, better, define a generic
specification of the nodes that can be used later to build the data
structure. This solution sounds good, but I think it is difficult to
apply in this domain.
A final option could be to switch to a stateful visitor, which
incapsulates one or more builders for the different sections that
each Visit*() method could use to build the data structure
step-by-step. This solution seems to be clean and doable, but I have
difficulties to think about how to scope the result of each visit
operation in the parent scope when needed.
What solution is generally used to visit complex ANTLR parse trees?
ANTLR4 parse trees are often complex because of recursion, e.g.
I would define the class ParsedDocumentModel whose properties would added or modified as your project evolves (which is normal, no program is set in stone).
Assuming your grammar be called Parser in the file Parser.g4, here is sample C# code:
public class ParsedDocumentModel {
public string Title { get; set; }
//other properties ...
}
public class ParserVisitor : ParserBaseVisitor<ParsedDocumentModel>
{
public override ParsedDocumentModel VisitNounz(NounzContext context)
{
var res = "unknown";
var s = context.GetText();
if (s == "products")
res = "<<products>>"; //for example
var model = new ParsedDocumentModel();
model.Title = res; //add more info...
return model;
}
}

Should my object be responsible for randomizing its own content?

I'm building an app that generates random sequences of musical notes and displays them to the user as musical notation. These sequences can be generated according to several parameters, including density and maximum consecutive notes of the same pitch.
Musical sequences are captured by a sequence object whose notes property is a simple string of notes such as "abcdaba".
My early attempts to generate random sequences involved a SequenceGenerator class that compiled random sequences using several private methods. This looks like a service to me. But I'm trying to honour the principle expressed in Domain-Driven Design (Evans 2003) to only use services where necessary and to prefer associating behaviour with domain objects.
So my question is:
Should the job of producing random sequences be taken care of by a public method on sequence itself (such as generateRandom()) or should it be kept separate?
I considered the possibility that my original design is more along the lines of a builder or factory pattern than a service, but the the code is very different for creating a random sequence than for creating one with a supplied string of notes.
One concern I have with the method route is that generateRandom() as a method on sequence changes the content of sequence but isn't actually generating a new sequence object. This just feels wrong, but I can't express why.
I'm still getting my head around some the core OO design principles, so any help is greatly appreciated.
Should the job of producing random sequences be taken care of by a public method on sequence itself (such as generateRandom()) or should it be kept separate?
I usually find that I get cleaner designs if I treat "random" the same way that I treat "time", or "I/O" -- as an input to the model, rather than as an aspect of the model itself.
If you don't consider time an input value, think about it until you do -- it is an important concept (John Carmack, 1998).
Within the constraints of DDD, that could either mean passing a "domain service" as an argument to your method, allowing your aggregate to invoke the service as needed, or it could mean having a method on the aggregate, so that the application can pass in random numbers when needed.
So any creation of a sequence would involve passing in some pattern or seed, but whether that is random or not is decided outside of the sequence itself?
Yes, exactly.
The creation of an object is not usually considered part of the logic for the object.
How you do that technically is a different matter. You could potentially use delegation. For example:
public interface NoteSequence {
void play();
}
public final class LettersNoteSequence implements NoteSequence {
public LettersNoteSequence(String letters) {
...
}
...
}
public final class RandomNoteSequence implements NoteSequence {
...
#Override
public void play() {
new LetterNoteSequence(generateRandomLetters()).play();
}
}
This way you don't have to have a "service" or a "factory", but this is only one alternative, may or may not fit your use-case.

Value object in event sourcing

Is there a place for value objects in an event sourced domain model?
Lets define a value object as an object with immutable state that guards its invariants and has no particular identifier.
An event sourced domain model in this context is a domain that is entirely or partially event sourced, meaning that its current state can be derived from applying all events that have occurred in the past. Events themselves are considered immutable, even over time.
Debate has taken place about the validity of using value objects within events - this question goes slightly further: Do value objects have a place in event sourced domains at all?
The (potential) problem with using value objects is that it becomes rather tricky to alter the domain in such a way that invariants are tightened.
An example of this scenario would be to have a Username value object, with the sole constraint that the name must be anywhere between 2 and 16 characters.
While this has been working well for some time, the business decides to only allow usernames of at least 5 characters.
A migration period begins and users with names of less than 5 characters are asked to update their names.
Lets say the process was successful, correction events are applied and everyone is happy.
We tighten the constraints on our Username value object to require at least 5 characters.
For a while everyone is happy, but then we discover a problem with the snapshots and replay all events.
We now face an exception from our Username object: by loading the historic data, we're breaking an invariant of our domain.
The rules of a value objects apply retroactively - does this make them inherently unsuitable for event sourcing? Would it be worth applying versioning of value objects? Is there a simpler way of avoiding such problems?
I would say, that at the moment you redefined what Username means, and you don't migrate historical data somehow, you've essentially created 2 different Username meanings.
Because there are 2 different meanings of the word, you have to make it explicit in the code somehow. "Versioning" is one way, although I wouldn't use such a generic solution, there are different modeling options.
You could make it explicit that the history of a "username" is just that, a history. So for example create a HistoricUsername, which is the event-sourced object, even a value object if you want. And create a Username which is at all times the username with the most current rules, which is not persisted at all, but created from a HistoricUsername if it can.
Some people suggest sometimes to extract the "rules" from the object, and re-apply it later. That way the object itself is valid at all times and you can ask it to validate itself against rules that might change. I don't really prefer these kinds of solutions, but it's an option, and the Username would still be a value-object.
So the problem is not really that value-objects don't fit into event-sourcing, it's just that the modeling has to be more accurate.
Do value objects have a place in event sourced domains at all?
Yes.
Is there a simpler way of avoiding such problems?
"Don't do that."
The problem you are describing is really one about messaging - if we make backwards incompatible changes to our messages, then things break.
(More precisely, you have a "Username" message, and you are trying to re-use that message with a new set of constraints that reject some previously valid uses of the message).
The answer is that you don't introduce backwards incompatible changes - instead, introduce new names that match the new requirements, and deprecated the old ones.
Which is to say, adding support for new messages, and removing support for the old messages, become two separately managed options.
Greg Young's book Versioning in an Event Sourced System dedicates some chapters to this idea. Also, Rich Hickey ends up touching on these important ideas in most of his talks -- I'd suggest starting from Spec-ulation.
The "value object", meaning that the type that the current implementation of the domain model uses to move the information around, is a separate concern from the messages. The data structures we use in memory don't need to be coupled to our serialization formats.
The representation of the information on the wire is distinct from the representation of information in memory, and that in turn is distinct from the abstractions that manipulate the information in memory.
The challenging thing is that, at the beginning of a project, you have the least amount of information about when the different representations are going to diverge.
We've solved this in a slightly different way. By separating the public API of our value objects from the internal (domain only) API, we are able to evolve one without affecting the other.
For example:
public class Username
{
private readonly string value;
// Domain-only (internal) constructor.
// Does not enforce constriants and can only be called within the domain.
internal Username(string value)
{
this.value = value;
}
// Public factory method.
// Enforces business constraints. Used by consumers of the domain (application layer etc.)
// to create new instances of the value object.
public static Username Create(string value)
{
// Business constraints. These will evolve and grow over time.
if (value == null)
{
// throw exception etc.
}
if (value.Length < 2)
{
// throw exception etc.
}
return new Username(value);
}
}
Consumers of the domain must use the static Create method to create a new instance of the value object. This factory method contains all of our business constraints and prevents an instance being created in an invalid state.
Inside the domain, classes have access to the internal (constraint-less) constructor. Since this does not enforce any business constraints, an instance of the value object can always be created in this way (regardless of its value). By using this constructor when replaying events we can ensure that historical data will always succeed.
The benefits of this design are:
A single class is used to represent the domain concept (no need for multiple classes, versioning etc.).
Business rules are free to evolve over time.
Historical data always works. A Username from a year ago is still a user name, even if our rules have changed.
Although already answered I do find this an interesting situation.
I agree with others that the event data should be record-based and, therefore, nothing more than a data container that may be used to reconstitute the aggregate.
That being said when the rules change so does the domain. A major portion of domain-driven design is to capture as much of the domain (rules/structure) as is required. If this is the case should the changes in the rules not also be kept?
For instance, if we have a Username Value Object and it starts out with the 2 to 16 characters rules then that is coded as such:
public class Username
{
public string Value { get; }
public Username(string value)
{
if (value.Length < 2 || value.Length > 16)
{
throw new DomainException("Username must be between 2 and 16 characters");
}
Value = value;
}
}
Now we get to 1 March 2018 and the rule changes. We can keep the rule around:
public class Username
{
public string Value { get; }
public Username(string value, DateTime registrationDate)
{
if (registrationDate < new Date(2018, 3, 1) &&
(value.Length < 2 || value.Length > 16))
{
throw new DomainException("Username must be between 2 and 16 characters");
}
if (registrationDate >= new Date(2018, 3, 1) &&
(value.Length < 5 || value.Length > 16))
{
throw new DomainException("Username must be between 5 and 16 characters");
}
Value = value;
}
}
That is the basic idea. In this way we keep our "old" rules around as well. This may become quite a hassle but I don't have enough experience to say. Changing our rules retroactively may introduce some pretty tricky situation so I guess one would need to evaluate this on a case-by-case basis.
Just a thought.

Does CF ORM have an Active Record type Update()?

Currently I am working partly with cfwheels and its Active Record ORM (which is great), and partly raw cfml with its Hibernate ORM (which is also great).
Both work well for applicable situations, but the thing I do miss most when using CF ORM is the model.update() method that is available in cfwheels, where you can just pass a form struct to the method, and it will map up the struct elements with the model properties and update the records.. really good for updating and maintaining large tables. In CF ORM, it seems the only way to to update a record is to set each column individually, then do a save. Is this the case?
Does cf9 ORM have an Active Record type update() (or equivalent) method which can just receive a struct with values to update and update the object without having to specify each one?
For example, instead of current:
member = entityLoadByPK('member',arguments.id);
member.setName(arguments.name);
member.setEmail(arguments.email);
is there a way to do something like this in CF ORM?
member = entityLoadByPK('member',arguments.id);
member.update(arguments);
Many thanks in advance
In my apps I usually create two helper functions for models which handle the task:
/*
* Get properties as key-value structure
* #limit Limit output to listed properties
*/
public struct function getMemento(string limit = "") {
local.props = {};
for (local.key in variables) {
if (isSimpleValue(variables[local.key]) AND (arguments.limit EQ "" OR ListFind(arguments.limit, local.key))) {
local.props[local.key] = variables[local.key];
}
}
return local.props;
}
/*
* Populate the model with given properties collection
* #props Properties collection
*/
public void function setMemento(required struct props) {
for (local.key in arguments.props) {
variables[local.key] = arguments.props[local.key];
}
}
For better security of setMemento it is possible to check existence of local.key in variables scope, but this will skip nullable properties.
So you can make myObject.setMemento(dataAsStruct); and then save it.
There's not a method exactly like the one you want, but EntityNew() does take an optional struct as a second argument, which will set the object's properties, although depending on how your code currently works, it may be clunky to use this method and I don;t know whether it'll have any bearing on whether a create/update is executed when you flush the ORM session.
If your ORM entities inherit form a master CFC, then you could add a method there. Alternatively, you could write one as a function and mix it into your objects.
I'm sure you're aware, but that update() feature can be a source of security problems (known as the mass assignment problem) if used with unsanitized user input (such as the raw FORM scope).

Is there a commonly used OO Pattern for holding "constant variables"?

I am working on a little pinball-game project for a hobby and am looking for a pattern to encapsulate constant variables.
I have a model, within which there are values which will be constant over the life of that model e.g. maximum speed/maximum gravity etc. Throughout the GUI and other areas these values are required in order to correctly validate input. Currently they are included either as references to a public static final, or just plain hard-coded. I'd like to encapsulate these "constant variables" in an object which can be injected into the model, and retrieved by the view/controller.
To clarify, the value of the "constant variables" may not necessarily be defined at compile-time, they could come from reading in a file; user input etc. What is known at compile time is which ones are needed. A way which may be easier to explain it is that whatever this encapsulation is, the values it provides are immutable.
I'm looking for a way to achieve this which:
has compile time type-safety (i.e. not mapping a string to variable at runtime)
avoids anything static (including enums, which can't be extended)
I know I could define an interface which has the methods such as:
public int getMaximumSpeed();
public int getMaximumGravity();
... and inject an instance of that into the model, and make it accessible in some way. However, this results in a lot of boilerplate code, which is pretty tedious to write/test etc (I am doing this for funsies :-)).
I am looking for a better way to do this, preferably something which has the benefits of being part of a shared vocabulary, as with design patterns.
Is there a better way to do this?
P.S. I've thought some more about this, and the best trade-off I could find would be to have something like:
public class Variables {
enum Variable {
MaxSpeed(100),
MaxGravity(10)
Variable(Object variableValue) {
// assign value to field, provide getter etc.
}
}
public Object getVariable(Variable v) { // look up enum and get member }
} // end of MyVariables
I could then do something like:
Model m = new Model(new Variables());
Advantages: the lookup of a variable is protected by having to be a member of the enum in order to compile, variables can be added with little extra code
Disadvantages: enums cannot be extended, brittleness (a recompile is needed to add a variable), variable values would have to be cast from Object (to Integer in this example), which again isn't type safe, though generics may be an option for that... somehow
Are you looking for the Singleton or, a variant, the Monostate? If not, how does that pattern fail your needs?
Of course, here's the mandatory disclaimer that Anything Global Is Evil.
UPDATE: I did some looking, because I've been having similar debates/issues. I stumbled across a list of "alternatives" to classic global/scope solutions. Thought I'd share.
Thanks for all the time spent by you guys trying to decipher what is a pretty weird question.
I think, in terms of design patterns, the closest that comes to what I'm describing is the factory pattern, where I have a factory of pseudo-constants. Technically it's not creating an instance each call, but rather always providing the same instance (in the sense of a Guice provider). But I can create several factories, which each can provide different psuedo-constants, and inject each into a different model, so the model's UI can validate input a lot more flexibly.
If anyone's interested I've came to the conclusion that an interface providing a method for each psuedo-constant is the way to go:
public interface IVariableProvider {
public int maxGravity();
public int maxSpeed();
// and everything else...
}
public class VariableProvider {
private final int maxGravity, maxSpeed...;
public VariableProvider(int maxGravity, int maxSpeed) {
// assign final fields
}
}
Then I can do:
Model firstModel = new Model(new VariableProvider(2, 10));
Model secondModel = new Model(new VariableProvider(10, 100));
I think as long as the interface doesn't provide a prohibitively large number of variable getters, it wins over some parameterised lookup (which will either be vulnerable at run-time, or will prohibit extension/polymorphism).
P.S. I realise some have been questioning what my problem is with static final values. I made the statement (with tongue in cheek) to a colleague that anything static is an inherently not object-oriented. So in my hobby I used that as the basis for a thought exercise where I try to remove anything static from the project (next I'll be trying to remove all 'if' statements ;-D). If I was on a deadline and I was satisfied public static final values wouldn't hamstring testing, I would have used them pretty quickly.
If you're just using java/IOC, why not just dependency-inject the values?
e.g. Spring inject the values via a map, specify the object as a singleton -
<property name="values">
<map>
<entry> <key><value>a1</value></key><value>b1</value></entry>
<entry> <key><value>a2</value></key><value>b3</value></entry>
</map>
</property>
your class is a singleton that holds an immutable copy of the map set in spring -
private Map<String, String> m;
public String getValue(String s)
{
return m.containsKey(s)?m.get(s):null;
}
public void setValues(Map m)
{
this.m=Collections.unmodifiableMap(m):
}
From what I can tell, you probably don't need to implement a pattern here -- you just need access to a set of constants, and it seems to me that's handled pretty well through the use of a publicly accessible static interface to them. Unless I'm missing something. :)
If you simply want to "objectify" the constants though, for some reason, than the Singleton pattern would probably be called for, if any; I know you mentioned in a comment that you don't mind creating multiple instances of this wrapper object, but in response I'd ask, then why even introduce the sort of confusion that could arise from having multiple instances at all? What practical benefit are you looking for that'd be satisfied with having the data in object form?
Now, if the values aren't constants, then that's different -- in that case, you probably do want a Singleton or Monostate. But if they really are constants, just wrap a set of enums or static constants in a class and be done! Keep-it-simple is as good a "pattern" as any.