When is a class a data class? - kotlin

I know what classes are about, but for better understanding I need a use case. Recently I discovered the construct of data classes. I get the idea behind normal classes, but I cannot imagine a real use case for data classes.
When should I use a data class, and when should I use a "normal" class? For all I know, all classes keep data.
Can you provide a good example that distinguishes data classes from non-data classes?

A data class is used to store data. It needs less boilerplate than a normal class, and can be compared to a key/value collection (dictionary, hash, etc.), but represented as an object with fixed attributes. In Kotlin, according to the documentation, declaring a class as a data class makes the compiler generate the following members:
equals()/hashCode() pair
toString() of the form "User(name=John, age=42)"
componentN() functions corresponding to the properties in their order of declaration.
copy() function
It also behaves differently with respect to inheritance:
If there are explicit implementations of equals(), hashCode(), or toString() in the data class body or final implementations in a
superclass, then these functions are not generated, and the existing
implementations are used.
If a supertype has componentN() functions that are open and return compatible types, the corresponding functions are generated for the
data class and override those of the supertype. If the functions of
the supertype cannot be overridden due to incompatible signatures or
due to their being final, an error is reported.
Providing explicit implementations for the componentN() and copy() functions is not allowed.
So in Kotlin, if you want to describe a plain piece of data, you may use a data class; but if you're building a complex application and your class needs special behavior in its constructor, inheritance, or abstraction, then you should use a normal class.
I do not know Kotlin, but in Python a dataclass can be seen as a structured dict. When you want to use a dict to store an object that always has the same attributes, you should use a dataclass instead of a dict.
The advantage over a normal class is that you don't need to write the __init__ method yourself: the @dataclass decorator generates it from the declared fields.
Example:
This is a normal class:
class Apple:
    def __init__(self, size: int, color: str, sweet: bool = True):
        self.size = size
        self.color = color
        self.sweet = sweet
The same class as a dataclass:
from dataclasses import dataclass

@dataclass
class Apple:
    size: int
    color: str
    sweet: bool = True
The advantage compared to a dict is that you are sure which attributes it has. It can also contain methods.
The advantage over a normal class is that it is simpler to declare and makes the code lighter. The attribute names (e.g. size) are repeated three times in a normal class, but appear only once in a dataclass.
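To make the generated members concrete, here is a minimal sketch using the same Apple dataclass. It only relies on the standard dataclasses module; replace() is used here as a rough Python counterpart of Kotlin's copy(), which is my own comparison rather than something stated above.

```python
from dataclasses import dataclass, replace


@dataclass
class Apple:
    size: int
    color: str
    sweet: bool = True


a = Apple(size=3, color="red")
b = Apple(size=3, color="red")

print(a)        # generated __repr__: Apple(size=3, color='red', sweet=True)
print(a == b)   # generated __eq__ compares field by field: True

# Build a modified copy; the original instance is left untouched.
green = replace(a, color="green")
print(green)    # Apple(size=3, color='green', sweet=True)
```

None of __init__, __repr__ or __eq__ had to be written by hand here.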
A normal class also has the advantage that you can customize the __init__ method (you can in a dataclass too, but then you lose its main advantage, I think). Example:
# Hypothetical price constant, defined only so the example runs
PRICE_PER_GRAM = 0.01

# You need only 2 variables to initialize your class
class Apple:
    def __init__(self, size: int, color: str):
        self.size = size
        self.color = color
        # But you get much more info from those two.
        self.sweet = color == 'red'
        self.weight = self.__compute_weight()
        self.price = self.weight * PRICE_PER_GRAM

    def __compute_weight(self):
        # ...
        return (self.size**2)*10 # That's a random example

Abstractly, a data class is a pure, inert information record that doesn’t require any special handling when copied or passed around, and it represents nothing more than what is contained in its fields; it has no identity of its own. A typical example is a point in 3D space:
data class Point3D(
    val x: Double,
    val y: Double,
    val z: Double
)
As long as the values are valid, an instance of a data class is entirely interchangeable with its fields, and it can be taken apart or rematerialized at will. Often there is even little use for encapsulation: users of the data class can just access the instance’s fields directly. The Kotlin language provides a number of convenience features when data classes are declared as such in your code, which are described in the documentation. Those are useful, for example, when building more complex data structures out of data classes: you can have a hash map assign values to particular points in space, and then look up a value using a newly constructed Point3D.
val map = HashMap<Point3D, String>()
map[Point3D(3.0, 4.0, 5.0)] = "point of interest"
println(map[Point3D(3.0, 4.0, 5.0)]) // prints "point of interest"
For an example of a class that is not a data class, take FileReader. Underneath, this class probably keeps some kind of file handle in a private field, which you can assume to be an integer (as it actually is on at least some platforms). But you cannot expect to store this integer in a database, have another process read that same integer from the database, reconstruct a FileReader from it and expect it to work. Passing file handles between processes requires more ceremony than that, if it is even possible on a given platform. That property makes FileReader not a data class. Many examples of non-data classes will be of this kind: any class whose instances represent transient, local resources like a network connection, a position within a file or a running process, cannot be a data class. Likewise, any class where different instances should not be considered equal even if they contain the same information is not a data class either.
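The contrast between value equality and identity can be sketched in Python as well. This is only an illustration under my own assumptions: Connection is a hypothetical stand-in for a resource-like class such as FileReader, and frozen=True is what makes the Python dataclass usable as a dictionary key.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen=True makes instances immutable and hashable
class Point3D:
    x: float
    y: float
    z: float


class Connection:
    """Hypothetical resource-like class; it keeps the default identity equality."""

    def __init__(self, handle: int):
        self.handle = handle


labels = {Point3D(3.0, 4.0, 5.0): "point of interest"}

# A newly constructed but equal point finds the stored entry.
found = labels.get(Point3D(3.0, 4.0, 5.0))

# Two connections holding the same integer are still different objects.
same = Connection(7) == Connection(7)

print(found)  # point of interest
print(same)   # False
```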

From the comments, it sounds like your question is really about why non-data classes exist in Kotlin and why you would ever choose not to make a data class. Here are some reasons.
Data classes are a lot more restrictive than a regular class:
They have to have a primary constructor, and every parameter of the primary constructor has to be a property.
They cannot have an empty primary constructor.
They cannot be open so they cannot be subclassed.
Here are other reasons:
Sometimes you don't want a class to have a copy function. If a class holds onto some heavy state that is expensive to copy, maybe it shouldn't advertise that it should be copied by presenting a copy function.
Sometimes you want to use an instance of a class in a Set or as Map keys without two different instances being considered as equivalent just because their properties have the same values.
The features of data classes are useful specifically for simple data holders; for other kinds of classes, those same features are often drawbacks you want to avoid.

Related

Recursively building a data class in Kotlin

I am trying to create a recursive data class like so:
data class AttributeId (
    val name: String,
    val id: Int,
    val children: List<AttributeId>?
)
The thing I'm struggling with now is building the data class by iterating over a source object.
How do I recursively build this object? Is a data class the wrong solution here?
EDIT: Some more information about the Source object from which I want to construct my data class instance
The source object is a Java Stream that essentially* has the following shape:
public Category(final String value,
                final Integer id,
                final List<Category> children) {
    this.value = value;
    this.id = id;
    this.children = children;
}
(For brevity the fields I don't care about have been removed from example)
I think I need to map over this stream and call a recursive function in order to construct the AttributeId data class, but my attempts seem to end in a stack overflow and a lot of confusion!
I don't think there's anything necessarily wrong with a data class that contains references to others.
There are certainly some gotchas.  For example:
If the list were mutable, or if its field was mutable (i.e. var rather than val), then you'd have to take care because its hash code, etc., could change.
And if the chain of links could form a loop (i.e. you could follow the links and end up back at the original instance), that could be very dangerous.  (E.g. calling a method such as toString() or hashCode() might either get stuck in an endless loop or crash the thread with a StackOverflowError.  You'd have to prevent that by overriding those methods so that they don't recurse.)  But that couldn't happen if the list and field were both immutable.
None of these issues are specific to data classes, though; a normal class could suffer the same issues (especially if you overrode methods like toString() or hashCode() without taking care).  So whether you make this a data class comes down to whether it feels like one: whether its primary purpose is to hold data, and/or whether the automatically-generated methods match how you want it to behave.
As Tenfour04 says, it depends what you're constructing these from.  If it naturally forms a tree structure, then this could be a good representation for it.
Obviously, you wouldn't be able to construct a parent before any of its children.  (In particular, the first instance you create would have to have either null or an empty list for its children.)  This would probably mean traversing the source in post-order.  The rest should fall out naturally from that.
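The post-order construction can be sketched as follows. This is a Python illustration rather than Kotlin, and the Category class here is a hypothetical stand-in that only mirrors the three fields shown in the question; to_attribute_id is a name I made up.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Category:
    """Hypothetical stand-in for the Java source object from the question."""
    value: str
    id: int
    children: Optional[List["Category"]] = None


@dataclass
class AttributeId:
    name: str
    id: int
    children: Optional[List["AttributeId"]] = None


def to_attribute_id(category: Category) -> AttributeId:
    # Post-order: convert all children first, then build the parent from them.
    children = None
    if category.children is not None:
        children = [to_attribute_id(child) for child in category.children]
    return AttributeId(category.value, category.id, children)


root = Category("root", 1, [
    Category("leaf", 2),
    Category("branch", 3, [Category("twig", 4)]),
])
tree = to_attribute_id(root)
print(tree.children[1].children[0])  # AttributeId(name='twig', id=4, children=None)
```

The recursion depth equals the depth of the tree, so it only runs away if the source data contains a cycle.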

When can a reference's type differ from the type of its object?

Yesterday I was asked a question in an interview:
Suppose class A is a base class, and class B is derived class.
Is it possible to create object of:
class B = new class A?
class A = new class B?
If yes, then what happen?
Objects of type B are guaranteed to also be objects of type A. This type of relationship is called "Is-a," or inheritance, and in OOP it's a standard way of getting polymorphism. For example, if objects of type A have a method foo(), objects of type B must also provide it, but its behavior is allowed to differ.
The reverse is not necessarily true: an object of type A (the base class) won't always be an object of type B (the derived class). Even if it is, this can't be guaranteed at compile-time, so what happens for your first line is that the code will fail to compile.
What the second line does depends on the language, but generally:
Using a reference with the base type will restrict you to accessing only members which the base type is guaranteed to have.
In Java, if member names are "hidden" (A.x exists and so does B.x, but they have different values), when you try to access the member you will get the value which corresponds to the type of the reference rather than the type of the object.
The code in your second example is standard practice when you are more interested in an API than its implementation, and want to make your code as generic as possible. For instance, often in Java one writes things like List<Integer> list = new ArrayList<Integer>(). If you decide to use a linked list implementation later, you will not have to change any code which uses list.
Take a look at this related question: What does Base b2 = new Child(); signify?
Normally, automatic conversions are allowed from a derived class to its base class, but not in the other direction. So only your second example is possible: class A = new class B should be OK, since the derived class B can be converted to the base class A. class B = new class A will not work automatically, but may be made to work by supplying an explicit conversion (for example, an overloaded constructor of B that takes an A).
A is the superclass and B is the subclass (derived class).
The statement
class A = new class B is always possible. It is called upcasting, because you are going up from the more specific type to the more general one.
Example:
Fruit is a base class and Apple is derived from it.
We can say that an Apple is more specific and must possess all the qualities of a Fruit,
so you can always upcast.
Downcasting, as in Apple a = new Fruit();, is not always possible:
a fruit can be an apple, or it may not be.
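The Fruit/Apple relation can be sketched in Python. Note the assumption: Python checks types at run time, so the compile error described above cannot be reproduced here, and the Fruit annotation is only a hint. The describe() and peel() methods are hypothetical names added for illustration.

```python
class Fruit:
    def describe(self) -> str:
        return "some fruit"


class Apple(Fruit):
    def describe(self) -> str:   # overrides the base behavior
        return "an apple"

    def peel(self) -> str:       # exists only on the derived class
        return "peeled apple"


def show(item: Fruit) -> str:
    # Accepts any Fruit, so an Apple can be passed where a Fruit is expected.
    return item.describe()


apple_as_fruit: Fruit = Apple()  # derived object behind a base-typed name
plain_fruit = Fruit()

print(show(apple_as_fruit))            # an apple
print(isinstance(plain_fruit, Apple))  # False: not every Fruit is an Apple
```

The object keeps its own behavior (describe() returns "an apple"), while code written against Fruit never needs to know about peel().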

OOP , object concept

According to the standard definition, an object is an entity that contains both data and behaviour.
According to my understanding, the data is sent from outside. For example, we have a class that computes the square of a number. We create an instance and send a message, along with the number, to the object to compute the square.
Are we not sending the data from outside?
Why do all the definitions state that the object contains the data?
Thanks
Data, in this context, is state of the object. The definition says that the state/data of object should be internally stored. For example, consider the following class:
class Math {
    double square(double x) {
        return x * x;
    }
    // other similar functions
}
As a language construct, it is a class. But it is not a true class in the object-oriented sense, because it does not have state or data. It is just a function wrapped in a class construct. This is not necessarily wrong: in this case, you simply have operations that don't need state.
What the definition is trying to emphasize is that you have a real object when it (or its class) has both data and behavior. Not every usage of the class construct represents a true object.
Therefore, you have an object if the class representing it satisfies the following three conditions.
The class has state/data. If not, then it is just a bunch of functions. It is not object-oriented, it is procedural.
The class has behavior. If not, then it is just a container, a bunch of variables (like structures in C).
Not only does the class have state/data and behavior/methods, but there is an intrinsic relation between the data and the behavior. This means that just throwing some variables and functions together does not make a true object. For example, if the class has state/data and also some method, but that method does not need to operate on any of the state, then it is questionable whether that method really belongs to that class.
Below is a simple example of what I think is a proper class (representation of object).
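class Patient {
    // blood pressure
    double systolic;
    double diastolic;
    double weight;
    int age;

    public Patient(double systolic, double diastolic, double weight, int age) {
        this.systolic = systolic;
        this.diastolic = diastolic;
        this.weight = weight;
        this.age = age;
    }

    public boolean isHealthy() {
        // do some calculations and algorithms on age, weight and blood pressure indicators.
        // return result as true or false
        throw new UnsupportedOperationException("calculation omitted in this example");
    }
}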
Here, we see that class has both state and behavior. We also see that both state and behavior really belong to this class. They are properties of the concept of patient. We further see that operation has an intrinsic relation to data. You can’t decide whether the patient is healthy or not, without consulting/using its state.
I think the problem is that your example fits badly with an object-oriented design. Computing the square of a number is a memoryless function, so there is obviously no reason to store data in the object's properties. However, once you have to manage stateful entities, you will more easily see the importance of classes and object orientation in general.
Your example is a special case where the object doesn't need to hold data (i.e. state). In this case it can be replaced with a function (just the behavior). Most objects need to store data. E.g., a Person object should contain the qualities describing the person, not just possible behavior.
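A small sketch of that difference, in Python. RunningAverage is a hypothetical example of my own, chosen only because its result depends on values it has stored earlier; square is the memoryless case from the question.

```python
def square(x: float) -> float:
    # Stateless: the result depends only on the argument passed in.
    return x * x


class RunningAverage:
    """Hypothetical object whose behavior depends on the state it holds."""

    def __init__(self) -> None:
        self.total = 0.0
        self.count = 0

    def add(self, value: float) -> None:
        # Data arrives from outside, but it is kept inside the object.
        self.total += value
        self.count += 1

    def average(self) -> float:
        return self.total / self.count if self.count else 0.0


avg = RunningAverage()
avg.add(2.0)
avg.add(4.0)
print(square(3.0))    # 9.0
print(avg.average())  # 3.0
```

Calling average() takes no arguments at all: everything it needs is already part of the object, which is what "the object contains the data" means.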
An object is an instance of a class.
A class like (a, a*a) describes squares in general, while (2, 4) is an instance of it (an object). Yes, data is sent to the class from outside, and it is used to create the new object.

How to call appropriate subclass constructor inside base class constructor in MATLAB

I'm trying to use single inheritance in Matlab, and to write a base class constructor that allows the creation of arrays of objects, including empty arrays, and which is inherited by subclasses. I can't work out how to do it without using some incredibly clunky code. There must be a better way.
In this toy example, my base class is called MyBaseClass, and my subclass is called MySubClass. Each can be constructed with a single numeric argument, or no arguments (in which case NaN is assumed). In the toy example my SubClass is trivial and doesn't extend the behavior of MyBaseClass in any way, but obviously in practice it would do more stuff.
I want to be able to call the constructor of each as follows:
obj = MyBaseClass; % default constructor of 'NaN-like' object
obj = MyBaseClass([]); % create an empty 0x0 array of type MyBaseClass
obj = MyBaseClass(1); % create a 1x1 array of MyBaseClass with value 1
obj = MyBaseClass([1 2; 3 4]) % create a 2x2 array of MyBaseClass with values 1, 2, 3, 4.
And the same four calls for MySubClass.
The solution I have found needs to call eval(class(obj)) in order to recover the subclass name and construct code in strings to call while in the base class constructor. This seems clunky and bad. (And it's somewhat surprising to me that it's possible, but it is.) I guess I could duplicate more logic between the MyBaseClass and MySubClass constructors, but that also seems clunky and bad, and misses the point of inheritance. Is there a better way?
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% MyBaseClass.m
classdef MyBaseClass
    properties
        data = NaN
    end
    methods
        % constructor
        function obj = MyBaseClass(varargin)
            if nargin == 0
                % Handle the no-argument case
                return
            end
            arg = varargin{1};
            % assume arg is a numeric array
            if isempty(arg)
                % Handle the case ClassName([])
                % Can't write this, because of subclasses:
                % obj = MyBaseClass.empty(size(arg));
                obj = eval([class(obj) '.empty(size(arg))']);
                return
            end
            % arg is an array
            % Make obj an array of the correct size by allocating the nth
            % element. Need to recurse for the no-argument case of the
            % relevant class constructor, which might not be this one.
            % Can't write this, because of subclasses
            % obj(numel(arg)) = MyBaseClass;
            obj(numel(arg)) = eval(class(obj));
            % Rest of the constructor - obviously in this toy example,
            % could be simplified.
            wh = ~isnan(arg);
            for i = find(wh(:))'
                obj(i).data = arg(i);
            end
            % And reshape to the size of the original
            obj = reshape(obj, size(arg));
        end
    end
end
% end of MyBaseClass.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% MySubClass.m
classdef MySubClass < MyBaseClass
    methods
        function obj = MySubClass(varargin)
            % Call the superclass constructor with the same arguments
            obj = obj@MyBaseClass(varargin{:});
        end
    end
end
% end of MySubClass.m
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
Your solution is functional and embraces some loose MATLAB typing to achieve what you want. However, getting clean and structured OOP is probably going to require losing some of the functionality you want. At the same time, the best option for avoiding code duplication is templated/generic container classes but these are not supported in MATLAB at this time.
Your code mirrors the MATLAB documentation on Building Arrays in the Constructor and relies on MATLAB being a loosely typed language that lets you convert an object into an array of objects without problem. Exploiting this powerful and flexible feature of MATLAB does introduce some organizational issues and may undermine your efforts at clean, object-oriented code.
Problems begin because the MyBaseClass constructor is not a true constructor for MyBaseClass.
Wikipedia says:
"In object-oriented programming, a constructor (sometimes shortened to ctor) in a class is a special type of subroutine called at the creation of an object. It prepares the new object for use, often accepting parameters which the constructor uses to set any member variables required when the object is first created. It is called a constructor because it constructs the values of data members of the class."
Notice that the MyBaseClass constructor is not constructing values for the object members. Instead, it is a function that sets the object equal to an array of objects of type MyBaseClass and tries to set their data members to some value. You can see where obj is destroyed and set to an array here:
obj(numel(arg)) = eval(class(obj));
This behavior is especially unhelpful when you derive MySubClass from MyBaseClass, because MyBaseClass isn’t supposed to assign a new object to the variable obj: MySubClass has already created the new object in obj and is simply asking MyBaseClass to construct the portion of the existing object in obj that MyBaseClass knows the details for.
Some clarity might be gained by noting that when you enter the constructor for both MyBaseClass and MySubClass, the variable obj is already populated with a perfectly good instance of the class. Good OOP practice would have you keep this original instance, use it in the base class constructor, and only act to populate its members in the constructor, not overwrite the object entirely with something new.
My conclusion would be to not assign obj to be an array inside of MyBaseClass. Instead, I would recommend creating a class MyBaseClassArray that creates an array of MyBaseClass objects.
Unfortunately, you would also need to create a duplicate class MySubClassArray that creates an array of MySubClass objects. Languages like C++ and Java get around this code duplication issue with templates and generics, respectively but MATLAB does not currently support any form of templates (http://www.mathworks.com/help/techdoc/matlab_oop/brqzfut-1.html). Without templates there is no good way to avoid code duplication.
You could try to avoid some duplication by creating a generic CreateClassArray function that takes the string name of a class to create and the constructor arguments to use for each object, but now we are coming back to code that looks like your original. The only difference is that now we have a clear division between the array class and the individual objects. The truth is that although MATLAB does not support templates, its flexible classes and typing system allow you to use eval() as you have, to change code and overwrite obj at will, and to create code that acts generically across classes. The cost? Readability, speed, and the uncomfortable feeling you got when you saw your base class constructing the subclass.
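The division between "a constructor builds one object" and "a separate factory builds the array" can be sketched outside MATLAB. The following is Python, not MATLAB, and array_from is a hypothetical name; it only illustrates the design, using a class method that receives the calling class instead of looking its name up through eval.

```python
import math
from typing import List


class MyBaseClass:
    def __init__(self, data: float = math.nan) -> None:
        # The constructor builds exactly one object and never replaces it.
        self.data = data

    @classmethod
    def array_from(cls, values: List[float]) -> List["MyBaseClass"]:
        # cls is whichever class the method was called on, so subclasses
        # get collections of their own type without any string evaluation.
        return [cls(v) for v in values]


class MySubClass(MyBaseClass):
    pass


subs = MySubClass.array_from([1.0, 2.0])
empty = MySubClass.array_from([])
print([type(s).__name__ for s in subs])  # ['MySubClass', 'MySubClass']
print(empty)                             # []
```

The array-building logic is written once in the base class, and the individual constructors stay ordinary constructors.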
In short, you used MATLAB’s flexibility to overwrite obj in the constructor with an array, to avoid creating a separate container class for MyBaseClass. You then used eval to make up for not having a template feature in MATLAB that would allow you to reuse your array creation code across all types. In the end, your solution is functional and reduces code duplication, but it does require some unnatural behavior from your classes. It’s just a trade you have to make.

VB.NET - I'm Refactoring and Could Use Some Help

I'm working with vb.net, wcf, wpf and I'm refactoring working code with the hope of being able to reduce some amount of redundancy. I have a bunch of methods that get called in several places throughout the code that only have a slight variation from each other and I would like to replace them with a single method instead.
Specifically, each of the redundant methods processes a 1-d array containing different objects I have created. There are several of these object types, each with a different signature, but they all have a "name" and "Id" property. (Also, these objects don't have a shared base class, but I could add that if needed.) Each of the redundant methods deals with a different one of the object types.
To refactor the code I would like to pass any of the different object arrays to a single new method that could access the "name" and "id" properties. I'm trying to write this new method in a fashion that wouldn't require me to update it if I created more objects down the road.
I've done some reading on Delegates and Generic Classes but I can't really figure out how this fits in. It would almost be as if I wanted to create a generic class that could handle each of my object types but then somehow also access the "name" and "id" properties of the different object types.
Any help you can provide would be appreciated. Also, please keep in mind this project is written in VB.net.
Thanks
Mike
It sounds like having your objects implement a common interface or share a base class would be best. Interfaces give you the most flexibility down the road if you ever need to pass a class to this method that must derive from some other class that does not implement the interface. However, a base class that implements the interface may also be useful just to reduce the duplicate declarations of these properties.
Public Interface IThingThatHasNameAndId 'good name not included
    ReadOnly Property Name As String
    ReadOnly Property Id As Integer
End Interface
Once you have the interface, you can then pass arrays of types implementing the interface as IEnumerable(Of IThingThatHasNameAndId) or make a generic method taking T() and constrain T to the interface.
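The shape of that solution, one method written against the shared members only, can be sketched in Python (not VB.NET) with typing.Protocol. Customer, Product, HasNameAndId and describe_all are all hypothetical names introduced for this illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List, Protocol


class HasNameAndId(Protocol):
    """The only members the shared method is allowed to rely on."""
    name: str
    id: int


@dataclass
class Customer:  # hypothetical type
    name: str
    id: int
    email: str


@dataclass
class Product:  # hypothetical type
    name: str
    id: int
    price: float


def describe_all(items: Iterable[HasNameAndId]) -> List[str]:
    # Uses only name and id, so adding new types later needs no change here.
    return [f"{item.id}: {item.name}" for item in items]


labels = describe_all([
    Customer("Ann", 1, "ann@example.com"),
    Product("Desk", 2, 120.0),
])
print(labels)  # ['1: Ann', '2: Desk']
```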
Make a base class with the Name and ID properties; then you can write a method that takes any class that derives from that class.
Public Function TestFunction(Of T As YourBaseClass)(ByVal obj As T) As Boolean
    If obj.Name = "Some Name" AndAlso obj.ID = 1 Then
        Return True
    Else
        Return False
    End If
End Function