Dependency Injection/config object - oop

I have an object that is responsible for exporting a file to csv.
It works well but i am looking at ways to refactor it.
This question pertains to the constructor which takes a number of arguments, relating to how the csv is to be exported:
For example, file name, delimiter, etc. etc.
Also, lately I have been reading about dependency injection but cant decide if this is a case where I should:
A. Leave the constructor as is.
B. Create a new class that gets passed to the constructor that simple holds config values for the file name etc
C. Something else altogether?
Here is the existing constructor (in PHP)
public function __construct($file,$overwriteExistingFile, $enclosure, $delim, $headerRow)
{
//set all properties here
}

Each of those values represents data that is an input to some process. $enclosure, $delimiter, and $headerRow pertain to generating the CSV content, while $file and $overwriteExistingFile pertain to persisting the content to disk.
A hallmark of a DI-style refactoring is to identify the various responsibilities (generate, persist) and encapsulate each of them in its own type. This shifts the refactoring from "how do I best get the values to this class?" to "how do I remove knowledge of these values from this class?"
To answer that, we would define two new concepts, each of which takes one of the responsibilities, and pass those into the existing constructor:
public function __construct($csvGenerator, $csvFileWriter)
{
...save dependencies...
}
...at some point, generate the CSV content and pass it to the file writer...
In this way, the original class becomes the orchestrator of the interaction between the generation and file writing, without having intimate knowledge of either activity. We have elevated the class to a higher level of abstraction, simplifying it as well as isolating its responsibilities into its collaborators.
Now, you would define two new classes, constructing them with the relevant parameters:
Generator
public function __construct($enclosure, $delimiter, $headerRow)
File Writer
public function __construct($file, $overwriteExistingFile)
With these elements in place, you can compose them together by creating the generator, then the file writer, then passing both to the orchestrator.

I would create a CSVFormatter that you can setup the deliminator on and unit test the formatting independently with.
Inject the formatter into a CSVWriter which writes the formatted output to a file.
The reason you would do this is to unit test the formatting logic or if you need to do multiple kinds of formatting or write to different kinds of output streams. If the code is very small and simple then you don't need to break it into multiple classes.

Related

Where should the code processing the state of the object be?

I have a class as follows:
data class ProductState(
val id: Int,
val products: MutableMap<Int, MutableSet<Int>> = mutableMapOf(),
val customerTopics: Topic = Topic()
)
It is basically a data class.
Now I have a function that among other things, processes the products and customerTopics and creates some output based on the processing.
But it seems to me that it is not a good idea to have the logic in the function.
My question is:
Do we create methods inside the data class for the processing of the object'state? If so would it be some companion object? Or is there some other design pattern that deals with this better?
In kotlin you have several options:
Additional method on the data class (if it should be called in multiple other places);
Public extension method in the same file as the data class (if it should be called in multiple other places but you want to keep your data class simple and have the handling methods separated);
Private extension method in the same file as the only place where it is called (this only applies if this processing is to be called in a single place in your code);
Private method in the only place where this processing is needed.
The best place really depends on what the processing is (is it very specific to ProductState or does it need additional data? If it is very specific then it may make sense to keep it as a ProductState method or an extension function) and in how many places it will be triggered (if in a single very specific place, then maybe keeping it along-side that piece of code as a private extension function or private method might be the best option).

Common return type for all ANTLR visitor methods

I'm writing a parser for an old proprietary report specification with ANTLR and I'm currently trying to implement a visitor of the generated parse tree extending the autogenerated abstract visito class.
I have little experience both with ANTLR (which I learned only recently) and with the visitor pattern in general, but if I understood it correctly, the visitor should encapsulate one single operation on the whole data structure (in this case the parse tree), thus sharing the same return type between each Visit*() method.
Taking an example from The Definitive ANTLR 4 Reference book by Terence Parr, to visit a parse tree generated by a grammar that parses a sequence of arithmetic expressions, it feels natural to choose the int return type, as each node of the tree is actually part of the the arithmetic operation that contributes to the final result by the calculator.
Considering my current situation, I don't have a common type: my grammar parses the whole document, which is actually split in different sections with different responsibilities (variable declarations, print options, actual text for the rows, etc...), and I can't find a common type between the result of the visit of so much different nodes, besides object of course.
I tried to think to some possible solutions:
I firstly tried implementing a stateless visitor using object as
the common type, but the amount of type casts needed sounds like a
big red flag to me. I was considering the usage of JSON, but I think
the problem remains, potentially adding some extra overhead in the
serialization process.
I was also thinking about splitting the visitor in more smaller
visitors with a specific purpose (get all the variables, get all the
rows, etc.), but with this solution for each visitor I would
implement only a small subset of the method of the autogenerated
interface (as it is meant to support the visit of the whole tree),
because each visiting operation would probably focus only on a
specific subtree. Is it normal?
Another possibility could be to redesign the data structure so that
it could be used at every level of the tree or, better, define a generic
specification of the nodes that can be used later to build the data
structure. This solution sounds good, but I think it is difficult to
apply in this domain.
A final option could be to switch to a stateful visitor, which
incapsulates one or more builders for the different sections that
each Visit*() method could use to build the data structure
step-by-step. This solution seems to be clean and doable, but I have
difficulties to think about how to scope the result of each visit
operation in the parent scope when needed.
What solution is generally used to visit complex ANTLR parse trees?
ANTLR4 parse trees are often complex because of recursion, e.g.
I would define the class ParsedDocumentModel whose properties would added or modified as your project evolves (which is normal, no program is set in stone).
Assuming your grammar be called Parser in the file Parser.g4, here is sample C# code:
public class ParsedDocumentModel {
public string Title { get; set; }
//other properties ...
}
public class ParserVisitor : ParserBaseVisitor<ParsedDocumentModel>
{
public override ParsedDocumentModel VisitNounz(NounzContext context)
{
var res = "unknown";
var s = context.GetText();
if (s == "products")
res = "<<products>>"; //for example
var model = new ParsedDocumentModel();
model.Title = res; //add more info...
return model;
}
}

Should my object be responsible for randomizing its own content?

I'm building an app that generates random sequences of musical notes and displays them to the user as musical notation. These sequences can be generated according to several parameters, including density and maximum consecutive notes of the same pitch.
Musical sequences are captured by a sequence object whose notes property is a simple string of notes such as "abcdaba".
My early attempts to generate random sequences involved a SequenceGenerator class that compiled random sequences using several private methods. This looks like a service to me. But I'm trying to honour the principle expressed in Domain-Driven Design (Evans 2003) to only use services where necessary and to prefer associating behaviour with domain objects.
So my question is:
Should the job of producing random sequences be taken care of by a public method on sequence itself (such as generateRandom()) or should it be kept separate?
I considered the possibility that my original design is more along the lines of a builder or factory pattern than a service, but the the code is very different for creating a random sequence than for creating one with a supplied string of notes.
One concern I have with the method route is that generateRandom() as a method on sequence changes the content of sequence but isn't actually generating a new sequence object. This just feels wrong, but I can't express why.
I'm still getting my head around some the core OO design principles, so any help is greatly appreciated.
Should the job of producing random sequences be taken care of by a public method on sequence itself (such as generateRandom()) or should it be kept separate?
I usually find that I get cleaner designs if I treat "random" the same way that I treat "time", or "I/O" -- as an input to the model, rather than as an aspect of the model itself.
If you don't consider time an input value, think about it until you do -- it is an important concept (John Carmack, 1998).
Within the constraints of DDD, that could either mean passing a "domain service" as an argument to your method, allowing your aggregate to invoke the service as needed, or it could mean having a method on the aggregate, so that the application can pass in random numbers when needed.
So any creation of a sequence would involve passing in some pattern or seed, but whether that is random or not is decided outside of the sequence itself?
Yes, exactly.
The creation of an object is not usually considered part of the logic for the object.
How you do that technically is a different matter. You could potentially use delegation. For example:
public interface NoteSequence {
void play();
}
public final class LettersNoteSequence implements NoteSequence {
public LettersNoteSequence(String letters) {
...
}
...
}
public final class RandomNoteSequence implements NoteSequence {
...
#Override
public void play() {
new LetterNoteSequence(generateRandomLetters()).play();
}
}
This way you don't have to have a "service" or a "factory", but this is only one alternative, may or may not fit your use-case.

Proto-buf serialization with Obfuscation

I am looking for some guidance as to what is going on when using proto-buf net with obfuscation (Dotfuscator). One half of the project is a DLL and the other is an EXE elsewhere and using proto-buf NET they exchange data flawlessly. Until I obfuscate the DLL.
At that point P-BN fails without raising an exception, returning variously a 0 length byte array or a foreshortened one depending on what I have fiddled with. The class is fairly simple (VB):
<ProtoContract(Name:="DMailer")> _
Friend Class DMailer
Private _Lic As Cert
Private _Sys As Sys
Private _LList As List(Of LItem)
..
..
End Class
There are 3 props all decorated with ProtoMember to get/set the constituent class objects. Snipped for brevity.
Again, it works GREAT until I obfuscate the DLL. Then, Dotfuscator renames each of these to null, apparently since they are all Friend, and that seems to choke proto-buff. If I exempt the class from renaming (just the class name, not props/members), it seems to work again. It makes sense that P-BN would only be able to act on objects with a proper name, though when asked to serialize a null named object, it seems like an exception might be in order.
On the other hand, much of the charm of PB-N is supposed to be serialization independent of .NET names working from attributes - at least as I understand it. Yet in this case it only seems to work with classes with names. I tried using the Name qualifier or argument as shown above, to no avail - it apparently doesnt do what I thought it might.
So, I am curious if:
a) ...I have basically surmised the problem correctly
b) ...There is some other attribute or flag that might facilitate serializing
a null named object
c) ...if there are any other insights that would help.
If I exempt all 3 or 4 classes from Dotfuscator renaming (LList is not actually implemented yet, leaving DMailer, Cert and Sys), the DLL seems to work again - at least the output is the correct size. I can live with that, though obscured names would be better: Dotfuscator (CE) either exempts them or sets the names to Null - I cant seem to find a way to force them to be renamed.
Rather than exempt 3 or 4 classes from renaming, one alternative I am considering is to simply store the Serializer output for Cert and Sys as byte arrays or Base64 strings in DMailer instead of classes. Then have the receiver Deserialize each object individually. It is kind of nice to be able to unpack just one thing and have your toys right there as if by magic though.
(many)TIA
Interesting. I confess I have never tried this scenario, but if you can walk me through your process (or better: maybe provide a basic repro example with "run this, then this, then this: boom") I'll happily investigate.
Note: the Name on ProtoContract is mainly intended for GetProto() usage; it is not needed by the core serializer, and can be omitted to reduce your exposure. Also, protobuf-net isn't interested in fields unless those fields are decorated with the attributes, so that shouldn't be an issue.
However! there's probably a workaround here that should work now; you can pre-generate a static serialization dll; for example in a separate console exe (just as a tool; I really need to wrap this in a standalone utility!)
So if you create a console exe that references your unobfuscated library and protobuf-net.dll:
var model = RuntimeTypeModel.Create();
model.Add(typeof(DMailer), true); // true means "use the attributes etc"
// and other types needed, etc
model.Compile("MailSerializer", "MailSerializer.dll");
this should write MailSerializer.dll, which you can then reference from your main code (in addition to protobuf-net), and use:
var ser = new MailSerializer(); // our pre-genereated serializer
ser.Serialize(...); // etc
Then include MailSerializer.dll in your obfuscation payload.
(this is all v2 specific, btw)
If this doesn't work, I'll need to investigate the main issue, but I'm not an obfuscation expert so could do with your repro steps.
Since there were a few upticks of interest, here is what looks like will work:
a) No form of reflection will be able to get the list of properties for an obfuscated type.
I tried walking thru all the types to find the ones with ProtoContract on it, I could find them
but the property names are all changed to a,m, b, j, g.
I also tried Me.GetType.GetProperties with the same result.
You could implement a map from the output to indicate that Employee.FirstName is now a0.j, but distributing this defeats the purpose of obfuscation.
b) What does work to a degree is to exempt the class NAME from obfuscation. Since PB-N looks for the ProtoMember attributes to get the data, you CAN obfuscate the Property/Member names, just not the CLASS/type name. If the name is something like FederalReserveLogIn, your class/type has a bullseye on it.
I have had initial success doing the following:
1) Build a simple class to store a Property Token and value. Store everything as string using ConvertFromInvariantString. Taking a tip from PBN, I used an integer for the token:
<ProtoMember(propIndex.Foo)>
Property Foo As String
An enum helps tie everything together later. Store these in a Dictionary(Of T, NameValuePair)
2) add some accessors. these can perform the type conversions for you:
Public Sub Add(ByVal Key As T, ByVal value As Object)
If _col.ContainsKey(Key) Then
_col.Remove(Key)
End If
_col.Add(Key, New TValue(value))
End Sub
Public Function GetTItem(Of TT)(key As T) As TT
If _col.ContainsKey(key) Then
Return CType(_col(key).TValue, TT)
Else
Return Nothing
End If
End Function
T is whatever key type you wish to use. Integer results in the smallest output and still allows the subscribing code to use an Enum. But it could be String.
TT is the original type:
myFoo = props.GetTItem(Of Long)(propsEnum.Foo)
3) Expose the innerlist (dictionary) to PBN and bingo, all done.
Its also very easy to add converters for Point, Rectangle, Font, Size, Color and even bitmap.
HTH

God object - decrease coupling to a 'master' object

I have an object called Parameters that gets tossed from method to method down and up the call tree, across package boundaries. It has about fifty state variables. Each method might use one or two variables to control its output.
I think this is a bad idea, beacuse I can't easily see what a method needs to function, or even what might happen if with a certain combination of parameters for module Y which is totally unrelated to my current module.
What are some good techniques for decreasing coupling to this god object, or ideally eliminating it ?
public void ExporterExcelParFonds(ParametresExecution parametres)
{
ApplicationExcel appExcel = null;
LogTool.Instance.ExceptionSoulevee = false;
bool inclureReferences = parametres.inclureReferences;
bool inclureBornes = parametres.inclureBornes;
DateTime dateDebut = parametres.date;
DateTime dateFin = parametres.dateFin;
try
{
LogTool.Instance.AfficherMessage(Variables.msg_GenerationRapportPortefeuilleReference);
bool fichiersPreparesAvecSucces = PreparerFichiers(parametres, Sections.exportExcelParFonds);
if (!fichiersPreparesAvecSucces)
{
parametres.afficherRapportApresGeneration = false;
LogTool.Instance.ExceptionSoulevee = true;
}
else
{
The caller would do :
PortefeuillesReference pr = new PortefeuillesReference();
pr.ExporterExcelParFonds(parametres);
First, at the risk of stating the obvious: pass the parameters which are used by the methods, rather than the god object.
This, however, might lead to some methods needing huge amounts of parameters because they call other methods, which call other methods in turn, etcetera. That was probably the inspiration for putting everything in a god object. I'll give a simplified example of such a method with too many parameters; you'll have to imagine that "too many" == 3 here :-)
public void PrintFilteredReport(
Data data, FilterCriteria criteria, ReportFormat format)
{
var filteredData = Filter(data, criteria);
PrintReport(filteredData, format);
}
So the question is, how can we reduce the amount of parameters without resorting to a god object? The answer is to get rid of procedural programming and make good use of object oriented design. Objects can use each other without needing to know the parameters that were used to initialize their collaborators:
// dataFilter service object only needs to know the criteria
var dataFilter = new DataFilter(criteria);
// report printer service object only needs to know the format
var reportPrinter = new ReportPrinter(format);
// filteredReportPrinter service object is initialized with a
// dataFilter and a reportPrinter service, but it doesn't need
// to know which parameters those are using to do their job
var filteredReportPrinter = new FilteredReportPrinter(dataFilter, reportPrinter);
Now the FilteredReportPrinter.Print method can be implemented with only one parameter:
public void Print(data)
{
var filteredData = this.dataFilter.Filter(data);
this.reportPrinter.Print(filteredData);
}
Incidentally, this sort of separation of concerns and dependency injection is good for more than just eliminating parameters. If you access collaborator objects through interfaces, then that makes your class
very flexible: you can set up FilteredReportPrinter with any filter/printer implementation you can imagine
very testable: you can pass in mock collaborators with canned responses and verify that they were used correctly in a unit test
If all your methods are using the same Parameters class then maybe it should be a member variable of a class with the relevant methods in it, then you can pass Parameters into the constructor of this class, assign it to a member variable and all your methods can use it with having to pass it as a parameter.
A good way to start refactoring this god class is by splitting it up into smaller pieces. Find groups of properties that are related and break them out into their own class.
You can then revisit the methods that depend on Parameters and see if you can replace it with one of the smaller classes you created.
Hard to give a good solution without some code samples and real world situations.
It sounds like you are not applying object-oriented (OO) principles in your design. Since you mention the word "object" I presume you are working within some sort of OO paradigm. I recommend you convert your "call tree" into objects that model the problem you are solving. A "god object" is definitely something to avoid. I think you may be missing something fundamental... If you post some code examples I may be able to answer in more detail.
Query each client for their required parameters and inject them?
Example: each "object" that requires "parameters" is a "Client". Each "Client" exposes an interface through which a "Configuration Agent" queries the Client for its required parameters. The Configuration Agent then "injects" the parameters (and only those required by a Client).
For the parameters that dictate behavior, one can instantiate an object that exhibits the configured behavior. Then client classes simply use the instantiated object - neither the client nor the service have to know what the value of the parameter is. For instance for a parameter that tells where to read data from, have the FlatFileReader, XMLFileReader and DatabaseReader all inherit the same base class (or implement the same interface). Instantiate one of them base on the value of the parameter, then clients of the reader class just ask for data to the instantiated reader object without knowing if the data come from a file or from the DB.
To start you can break your big ParametresExecution class into several classes, one per package, which only hold the parameters for the package.
Another direction could be to pass the ParametresExecution object at construction time. You won't have to pass it around at every function call.
(I am assuming this is within a Java or .NET environment) Convert the class into a singleton. Add a static method called "getInstance()" or something similar to call to get the name-value bundle (and stop "tramping" it around -- see Ch. 10 of "Code Complete" book).
Now the hard part. Presumably, this is within a web app or some other non batch/single-thread environment. So, to get access to the right instance when the object is not really a true singleton, you have to hide selection logic inside of the static accessor.
In java, you can set up a "thread local" reference, and initialize it when each request or sub-task starts. Then, code the accessor in terms of that thread-local. I don't know if something analogous exists in .NET, but you can always fake it with a Dictionary (Hash, Map) which uses the current thread instance as the key.
It's a start... (there's always decomposition of the blob itself, but I built a framework that has a very similar semi-global value-store within it)