Why do we sometimes prefer having data structures over objects?

Why do we sometimes prefer having data structures over objects? - oop

As Robert Martin's Clean Code said:
Objects hide
their data behind abstractions and expose functions that operate on that data. Data structure
expose their data and have no meaningful functions.
But he also mentioned that:
Procedural code (code using data structures) makes it easy to add new functions without
changing the existing data structures. OO code, on the other hand, makes it easy to add
new classes without changing existing functions.
and:
...
Such hybrids make it hard to add new functions but also make it hard to add new data
structures. They are the worst of both worlds. Avoid creating them.
Hybrid above means the hybrid between data structure and objects.
But it seems there is a conflict for data structure: the pros of using data structure is for adding functions easier, but for data structure we'd better never add functions in it. Then what's the point of having data structure? E.g. DTO(data transfer object) are one of the examples of using data structure instead of using objects. And it's always a good practice to not add a lot of logic out of getters in it. -- But why?

You are misunderstanding what Uncle Bob said.
the pros of using data structure is for adding functions easier
No, it should be: the pros of separating data structures and functions is for adding functions easier.
When data structures and functions are separated, you can add a new function without breaking the old functions. In contrast, when data structures and functions are bundled into units like classes in OOP, adding a new function (method) will break all the old units/classes.

Related

Is using ES6 classes in vue/vuex(/flux?) an anti-pattern?

I've been using Vuex, and it's adherence to only altering state through it's mutators or actions makes me think your store should only include as flat an object as you can, with only primitives types.
Some threads even prescribe normalising your data (so instead of nested object trees you have objects with arrays of id's to indicate tree relationships). This probably matches closely to your JSON api.
This makes me think that storing classes (that may have methods to alter themselves) in your flux store is an anti-pattern. Indeed even hydrating your store's data into a class seems like you're moving against the tide unless your class performs no modifications to its internal data.
Which then got me thinking, is using any class in a Vue/Vuex/Reactive/Flux an anti-pattern?
The libraries seem explicitly designed to work with plain JS objects and the general interactions you have with the API (data in, data out) makes me feel like a more functional approach (sans immutability) is what the original designers were thinking about.
It also seems be easier to write code that runs from function => test => state mutator wrapper around function.
I understand that JS objects and JS classes behave very similarly (and are basically the same thing), but my logic is if you don't design with classes in mind, then you're more likely to not pollute your state with non-flux state changes.
Is there a general consensus in the community that your flux code should be more functional and less object orientated?

Yes. You are absolutely right in what you are thinking. State containers like Redux, Vuex are supposed to hold your data constructs and not functions. It is true that functions in JavaScript are simply objects which are callable. You can store static data on functions too. But that still doesn't qualify as pure data. It is also for this same reason that we don't put Symbols in our state containers.
Coming back to the ES classes, as long as you are using classes as POJO i.e. only to store data then you are free to use those. But why have classes if you can have simple plain objects.
Separating data from UI components and moving it into state containers has fundamental roots in functional programming. Most of the strict functional languages like Haskell, Elm, OCaml or even Elixir/Erlang work this way. This provides strong reasoning about your data flows in your application. Additionally, this is complemented by the fact that, in these languages, data is immutable. And, thus there is no place for stateful Class like construct.
With JavaScript since things are inherently mutable, the boundaries are a bit blur and it is hard to define good practices.
Finally, as a community, there is no definite consensus about using the functional way, but it seems that the community is heading towards more functional, stateless component approaches. Some of the great examples are:
Elm
ReasonML
Hybrids
swiss-element
Cycle.js
And now, even we have functional components in both Vue and React.

Class with a list of materials: best practice

I've created the custom class ZMaterial that can be instantiated passing an ID to the constructor which sets the properties for a single material using SELECTs and BAPIs. This class is basically used to READ and UPDATE a single material.
Now I need to create a service to return a list of materials. I already have the procedural code for it in a static method (for now actually a function module), but I would like to keep using a full OOP approach and instantiate a list of my custom material object. The first approach I found is to enhance the static method to instantiate a list of my single material object after the selects are executed and I have the data in internal tables, but it does not seem the most OOP.
The second option in my mind is to create a new class ZMaterialList with one property being a list of objects ZMaterial and then a constructor with the necessary input parameters for the database select. The problem I see with this option is that I create a full class just for the constructor.
What do you think is the best way to proceed?

Create a separate class to produce the list of materials. The single responsibility principle says each class should do exactly one thing. In all but the most simple cases, using a thing is a different responsibility than producing it.
Don’t make a ZMaterialList class. A list’s focus would be managing the list items, i.e. adding, removing, iterating, sorting etc. But you should be fine with a regular STANDARD TABLE OF REF TO ZMaterial.
Make a ZMaterialReader, -Repository, -Query or -Factory class or the like, depending on the precise way you want to produce the ZMaterials. Readers read by keys, repositories read and write, queries use varying sets of selection criteria, factories instantiate with possibly different sets of inputs.
You can well let that class use the original FUNCTION underneath. It’s good style to exploit what’s already there. Just make sure you trust that code, put it in a test harness, and keep it afar from the rest of your oo code.
Extract all public interaction of ZMaterial to an interface and use only that interface. That allows you to offer alternative implementations of ZMaterial, ones that differ in the way they are produced or how they store their data.
Split single production from mass production. Reading MARA to retrieve a single material is okay. But you don’t want thousands of ZMaterials reading MARA individually - that wrecks performance.
Now you’ve got the interface, you could offer a second implementation of ZMaterial whose constructor receives all relevant data and relies on it already having been validated to avoid additional SELECTs.
You could also offer an implementation that doesn’t store its data at all but only stores pointers to rows in internal tables somewhere else. See the flyweight pattern for ideas.
If you expect mass updates on the materials, such as “reclassify all of these as B”, consider extracting these list-oriented operations to separate classes as well.

What design pattern is used by IProject.setDescription in Eclipse

I'm designing an API with a specific pattern in mind, but don't know if this pattern has a name. It's similar to the Command pattern in GoF (Gang of Four) but not exactly.
One simple example of it I can find is in Eclipse where you manipulate a project (IProject), not by calling methods on the project that change its state, but by this 3 step process:
extracting its state into a descriptor object (IProjectDescription) with getDescription
setting properties on the descriptor. E.g. setName
applying the descriptor back to the original project with setDescription
The general principle seems to be that you have a complex object as part of a framework with many potentially interdependent properties, and rather than working directly on that object, one property at a time, you extract the properties into a simple data object, manipulate that, and apply it back.
It has some of the attributes of the Command pattern, in that the data object encapsulates all of the changes like a Command would - but it's not really a Command, because you don't execute it on the object, it's simply a representation of the state of the object.
It also has some attributes of a Transactional API, in that, by making the changes all in one hit with the set... call, you allow for the entire modification to effectively "roll back" if any one property changes fails. But while that's an advantage of the approach, it's not really the main purpose of it. And what's more, you can achieve the transactional nature without this approach, by simply adding transactional methods to the API (like commit and rollback)
There are two advantages in this pattern that I do want to exploit - although I don't see them being exploited by the eclipse example above:
You can represent the meaningful state of the underlying object while its implementation changes. This is useful for upgrading, or copying state from different types of representations. Say I release a new version of my API where I create an object Foo2 which is a totally new form of my old Foo1, but both have the same basic properties. To upgrade a Foo1 to a Foo2, I can extract those properties as a FooState. foo2.setFooState(foo1.getFooState) as simply as that. The way in which the properties are interpreted and represented is encapsulated in the Foos and can be totally different.
I can persist and transmit the state of the underlying object with my simple data object, where persisting the object itself would be much more complex. So I can extract the state of Foo as a FooState, and persist it as a simple XML document then later apply it to some new object by "loading" it and applying it. Or I can transmit the FooState simply to a webservice as a JSON object whereas the Foo itself is too big and complex to transmit. (Or the objects on each end of the service call are entirely different, like Foo1 and Foo2)
Anyway, I can't find an name or example of this pattern anywhere, neither in the Gang of Four design patterns, nor even in Martin Fowler's comprehensive "bliki"

Data Transfer Object(DTO) that Martin Fowler describes in his book Principles of Enterprise Application Architecture seems to be for the purpose you describe in point 2.
A DTO is a fairly simple extraction of the more complex Domain Model that it represents.
Fowler describes that the usage of a DTO in combination with an assembler can be used to keep the DTO independent from the actual Domain Object(or Objects) that it is supposed to represent. The assembler knows how to create a DTO from the Domain Object and vice versa. Also he mentions that the DTO needs to be serializable to persist/transmit its state. What you describe in point 2 seems to match this description.
What you've described in point 1 though does not seem to be an intended purpose, but definitely seems achievable using this pattern.
I'm not sure if you went through the Pattern catalog of his book or the book itself. The book itself describes this in much greater detail.
You may also want to have a look at Transfer Object definition from Oracle which Fowler says here is what he describes as DTO.

Not every design is documented as a single Design Pattern, in fact most system designs are combinations of multiple patterns.
However one part of what you're doing, with IProjectDescription is using a Memento, however yours seems to be a Polymorphic variation. Consider Patterns as they appear in Pattern Catalogues to be the pared down to the essential starting point not the end result. Patterns are by there very nature supposed to be extended and combined.
The Command pattern can give you Commit and RollBack (Do/Undo) and combining it with Memento in that way is a quite common approach. The same thing is seen in the Java Servlet API with HttpRequest & HttpResponse.

How do I refactor a class with lots of operations which all require its internal data?

My Problem
I have a class with just a few fields but which represents a relatively complicated data structure. This class is central in my program and over time I found myself adding more and more functionality into it, making things a mess. Since (almost) all of its methods rely on its internal fields, I could not think of a way to move some of the methods elsewhere, even though most methods are independent of each other. How can I refactor this class to make it simpler and reduce the number of methods which are directly implemented in it?
More Information
The class in question represents a sort of automaton. It supports a ton of operations such as retrieving information about it, performing various binary operations between it and other automata, querying for specific information stored inside it, saving it to file, etc. Almost all of these operations depend on the precise implementation of the class - in my specific case I maintain an edge-set-based implementation, but other implementations were also used in the past and might be used again in the future.
Except for a narrow set of basic helper methods which are commonly used, most methods are independent of each other.
The language I am using is Java, but I'm hoping for general answers which could be applied to any statically-typed, object-oriented language.
What I've Tried
I tried refactoring it somehow to multiple types, but each of its operations require access to most of its fields, and I'm hesitant about migrating these operations elsewhere because I can't think of a way to do that without exposing the class's implementation.
I'm also not sure where I should migrate the operations to, assuming they are indeed independent of the implementation. An external utility class? An abstract base type? Will appreciate any input about this.

Perhaps you could remodel the data that your class holds, so that instead of holding the data directly, it holds objects that hold the data? Then you could move the methods that manipulate that data into the new classes, leaving the original class as a sort of container / dispatcher class.

Separating code - modularising

When developing how useful is it to create small classes to represent little data structures? For example say as a simplified example, a program is using an array of strings to represent names of something, e.g. cars. Instead of just keeping this array inside a method or class, how useful is it to separate this and make it it own class? This way I am thinking that it can be responsible for itself and more actions can be performed on it - validation, etc. which can all be kept separate. Also, it can be reused easily throughout the system. But then where does it stop, i.e. in the car example, you could then go on to create a car object etc. It really can be never ending can't it?

There are several guidelines I use to determine when I need to refactor a data structure into its own class:
Am I storing a lot of interrelated data? If you find yourself storing a couple of arrays, and manipulating them as a unit, it's probably best to store a single array containing objects.
Are these data structures exposed to other classes? If other classes are directly exposed to the data, it's probably best to encapsulate the data in its own class, which makes it easy to keep the conceptual and actual models separate.
Do I find myself frequently performing operations on the data? It might be fine to store an array of names, but if you start adding methods like validateName and checkName to the wrapping class, it might be a good idea to refactor and place those methods on a Name class itself.
Keep in mind: it's often a lot easier and cleaner to put a decent object model in place up front than to try and graft one on after the fact. You shouldn't do it arbitrarily, but as you're working through your program you should pay attention to when it becomes difficult to control the data structures you have--that's a good sign that you should refactor them, as needed.

It makes sense to do this as soon as you are repeating code to operate on the data structure.
Chris B. makes a great point about interrelated data. See the Extract Class refactoring example.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas