Abstract Data Types vs. Non-Abstract Data Types (in Java) - oop

I have read a lot about abstract data types (ADTs) and I'm asking myself if there are non-abstract/concrete data types.
There is already a question on SO about ADTs, but it doesn't cover "non-abstract" data types.
The definition of ADT only mentions what operations are to be
performed but not how these operations will be implemented
reference
So an ADT hides the concrete implementation from the user and "only" offers a set of permissible operations/methods; e.g., the Stack in Java (reference). Only methods like pop(), push(), empty() are visible, and the concrete implementation is hidden.
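To illustrate, a minimal sketch (the class name StackDemo is my own) of what "only the operations are visible" means: client code goes through push()/pop()/empty() and never touches the backing array inside Vector:
import java.util.Stack;

public class StackDemo {
    public static void main(String[] args) {
        Stack<String> s = new Stack<>();
        s.push("a");
        s.push("b");
        System.out.println(s.pop());   // "b" (LIFO order)
        System.out.println(s.empty()); // false, "a" is still stored
    }
}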
Following this argumentation leads me to the question: is there such a thing as a "non-abstract" data type?
Even a primitive data type like java.lang.Integer has well-defined operations, like +, -, ..., and according to Wikipedia it is an ADT.
For example, integers are an ADT, defined as the values …, −2, −1, 0, 1, 2, …, and by the operations of addition, subtraction, multiplication, and division, together with greater than, less than, etc.,
reference

The java.lang.Integer is not a primitive type. It is an ADT that wraps the primitive Java type int. The same holds for the other Java primitive types and their corresponding wrappers.
You don't need OOP support in a language to have ADTs. If you don't have support, you establish conventions for the ADT in the code you write (i.e. you only use it as previously defined by the operations and possible values of the ADT).
That's why ADTs predate the class and object concepts present in OOP languages. They existed before; statements like class just introduced direct support into the languages, allowing compilers to check what you are doing with the ADTs.
Primitive types are just values that can be stored in memory, without any other associated code. They don't know about themselves or their operations. Their internal representation is known to external actors, unlike with ADTs, and so are the possible operations: these are manipulations of the values performed externally, from the outside.
Primitive types carry with them, although you don't necessarily see it, implementation details of the CPU or virtual machine architecture, because they map to the register sizes and instructions that the CPU executes directly. Hence the maximum integer value limits, for example.
If I am allowed to say this, the hardware knows your primitive types.
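A small sketch of that point (the class name is mine): int is a fixed-width machine word, so its limits and overflow behaviour come straight from the hardware representation:
public class PrimitiveLimits {
    public static void main(String[] args) {
        // int is a 32-bit two's-complement machine word, so arithmetic
        // silently wraps around at the hardware-imposed limit.
        System.out.println(Integer.MAX_VALUE);     // 2147483647
        System.out.println(Integer.MAX_VALUE + 1); // -2147483648
    }
}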
So your non-abstract data types are the primitive types of a language, if those types aren't themselves ADTs too. If they happen to be ADTs, you probably have to create them (not just declare them; there will be code setting things up in memory, beyond just reserving storage at a certain address), so they have an identity, and they usually offer methods invoked through that identity; that is, they know about themselves.
Because in some languages everything is an object, as in Python, the builtin types (the ones that are readily available with no need to define classes) are sometimes called primitive too, despite not being primitive at all by the above definition.
Edit:
As mentioned by jaco0646, there is more to the words concrete/abstract in OOP.
An ADT is already an abstraction: it represents a category of similar objects you can instantiate from.
But an ADT can be even more abstract, and is referred to as such (as opposed to concrete data types) if you declare it with no intention of instantiating objects from it. Usually you do this because other "concrete" ADTs (the ones you instantiate) inherit from the "abstract" ADT. This allows behaviour to be shared and extended among several different ADTs.
For example, you can define an API like that, and make one or more different ADTs offer (and respect) that API to their users, just by inheritance.
Abstract ADTs may be defined by you, or be available among language types or in libraries.
For example, a Python builtin list object is also a collections.abc.Iterable.
In Python you can use multiple inheritance to add functionality like that, although there are other ways.
In Java you can't, but you have interfaces instead, and can declare a class to implement one or more interfaces, besides possibly extending another class.
So an ADT definition whose purpose is to be directly instantiated is a concrete ADT; otherwise it is an abstract ADT.
A closely related notion is that of an abstract method in a class. It is a method you don't fill with code, because it is meant to be filled in by child classes that implement it, respecting its signature (name and parameters).
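A minimal Java sketch of the distinction, with invented names (Shape, Circle): Shape is the "abstract" ADT that shares behaviour, Circle the concrete ADT you actually instantiate:
abstract class Shape {
    abstract double area(); // abstract method: signature only, no body

    void describe() {       // shared behaviour, inherited by all subclasses
        System.out.println("area = " + area());
    }
}

class Circle extends Shape {
    private final double radius;
    Circle(double radius) { this.radius = radius; }

    @Override
    double area() { return Math.PI * radius * radius; }
}
Here new Shape() is a compile error, while new Circle(2.0).describe() works, because only Circle is meant to be instantiated.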
So depending on your language, you will find different (or similar) ways of implementing these concepts.

I agree with the answer from @progmatico, but I would add that concrete (non-abstract) data types include more than primitives.
In Java, Stack happens to be a concrete data type, which extends another concrete data type Vector, which extends an ADT AbstractList.
The interfaces implemented by AbstractList are also ADTs: Iterable, Collection, List.
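To make that hierarchy tangible, a short sketch of my own (the class name is invented) showing the same Stack instance viewed through each of those types:
import java.util.List;
import java.util.Stack;
import java.util.Vector;

public class HierarchyDemo {
    public static void main(String[] args) {
        Stack<Integer> stack = new Stack<>(); // concrete type
        stack.push(42);

        Vector<Integer> vector = stack;       // concrete supertype
        List<Integer> list = stack;           // ADT: interface view
        Iterable<Integer> iterable = stack;   // ADT: another interface view

        System.out.println(list.get(0));      // 42, through the List view
        System.out.println(vector.size());    // 1, through the Vector view
        for (int x : iterable) System.out.println(x); // 42
    }
}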

How does Scheme abstract data?

In statically typed languages, people can use algebraic data types to abstract data and also generate constructors, or use classes, traits and mixins to deal with data abstraction.
Dynamically typed languages, like Python and Ruby, all provide a class system to their users.
But what about Scheme, the simplest functional language, the one closest to the λ-calculus: how does it abstract data?
Do Scheme programmers usually just put data in a list or a lambda abstraction and write some accessor functions to make it look like a tree or something else? Like EOPL says: specifying data via interfaces.
And then how does this abstraction technique relate to abstract data types (ADTs) and objects, with regard to On understanding data abstraction, revisited?
What SICP (and, I guess, EOPL) advocates is just using functions to access data. Then you can always switch one set of functions for another, implementing the same named set of functions to work with another concrete implementation. The sets of such functions form the "interfaces", and that's what you put in different source files; by loading the appropriate one, you can switch the concrete implementation while all the other code is none the wiser. That's what makes it an "abstract" data type.
As for algebraic data types, the old bare-bones Scheme way is to create closures (which hold and hide the data) that respond to "messages" and thus become "objects" (search for "Scheme mailboxes"). This gives us products, i.e. records; functions we get for free from Scheme itself. For sum types, just as in C/C++, we can use tagged unions in a disciplined manner (or, again, hide the specifics behind a set of "interface" functions).
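A rough Java transliteration of that closure/message-passing style, as a sketch of the idea rather than idiomatic Java (names are my own; switch expressions need Java 14+):
import java.util.function.Function;

public class MailboxPair {
    // cons returns a closure that answers the messages "car" and "cdr";
    // the two values are hidden in the captured variables.
    static Function<String, Object> cons(Object car, Object cdr) {
        return msg -> switch (msg) {
            case "car" -> car;
            case "cdr" -> cdr;
            default -> throw new IllegalArgumentException("unknown message: " + msg);
        };
    }

    public static void main(String[] args) {
        Function<String, Object> p = cons(1, 2);
        System.out.println(p.apply("car")); // 1
        System.out.println(p.apply("cdr")); // 2
    }
}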
EOPL has something called variant-case, which handles such sum types in a manner similar to pattern matching. Searching brings up e.g. this link, which says
I'm using DrScheme w/ the EOPL textbook, which uses define-record and variant-​case. I've got the macro definitions from the PLT site, but am now dealing with ...
so seems relevant, as one example.

Class or interface for declaring global types?

What is the recommended (modern) approach for reusing types between different classes?
SAP does not recommend collecting constants in an interface, calling it a lousy way of declaring them:
" anti-pattern
INTERFACE /dirty/common_constants.
  CONSTANTS:
    warning      TYPE symsgty VALUE 'W',
    transitional TYPE i       VALUE 1,
    error        TYPE symsgty VALUE 'E',
    persisted    TYPE i       VALUE 2.
ENDINTERFACE.
Does the same apply to types? What drawbacks or advantages am I not aware of?
Should I use classes or interfaces for types, or maybe type pools?
As the document you linked to says, you should not use an interface for constants because that way you are:
misleading people to the conclusion that constants collections could be "implemented".
Which is why they recommend putting constants into an ABSTRACT FINAL class.
ABSTRACT means that the class itself cannot be instantiated (like an interface), and FINAL means that it cannot be inherited from (unlike an interface). This clearly communicates to the user that this class exists only as a container for static things like CONSTANTS, CLASS-DATA, CLASS-METHODS and, of course, TYPES.
The same argument can be used against interfaces that do nothing but serve as a container for types: when you create an INTERFACE for that sole purpose, people might wonder how they are supposed to implement it. So you would put the types into a CLASS instead.
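For comparison, a rough Java analogue of that recommendation (my own sketch, invented names): Java has no ABSTRACT FINAL combination, so a final class with a private constructor plays the same role of an uninstantiable, uninheritable container:
public final class CommonConstants {
    private CommonConstants() {} // no instances: the class is only a namespace

    public static final char WARNING      = 'W';
    public static final char ERROR        = 'E';
    public static final int  TRANSITIONAL = 1;
    public static final int  PERSISTED    = 2;
}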
Another alternative still worth considering IMO (especially if you consider using any of those types in database tables) is the good old ABAP Dictionary. It was invented for exactly that purpose: to serve as a directory for global types.
Unfortunately, it doesn't allow you to organize types into namespaces (not unless you work for SAP or an SAP partner that is able and willing to go through the bureaucracy of registering a worldwide-unique namespace for every product). So you have to make sure each type you create has a proper prefix, so people know which application it belongs to. But there is a character limit on dictionary type names, so you have to keep them short.
Using a class as a container for the types of each application solves that problem. The class serves as a namespace, so your type names no longer need to be system-unique.

What is the difference between subtyping and inheritance in OO programming?

I could not find the main difference, and I am confused about when we should use inheritance and when we should use subtyping. I found some definitions, but they are not very clear.
What is the difference between subtyping and inheritance in object-oriented programming?
In addition to the answers already given, here's a link to an article I think is relevant.
Excerpts:
In the object-oriented framework, inheritance is usually presented as a feature that goes hand in hand with subtyping when one organizes abstract datatypes in a hierarchy of classes. However, the two are orthogonal ideas.
Subtyping refers to compatibility of interfaces. A type B is a subtype of A if every function that can be invoked on an object of type A can also be invoked on an object of type B.
Inheritance refers to reuse of implementations. A type B inherits from another type A if some functions for B are written in terms of functions of A.
However, subtyping and inheritance need not go hand in hand. Consider the data structure deque, a double-ended queue. A deque supports insertion and deletion at both ends, so it has four functions insert-front, delete-front, insert-rear and delete-rear. If we use just insert-rear and delete-front we get a normal queue. On the other hand, if we use just insert-front and delete-front, we get a stack. In other words, we can implement queues and stacks in terms of deques, so as datatypes, Stack and Queue inherit from Deque. On the other hand, neither Stack nor Queue are subtypes of Deque since they do not support all the functions provided by Deque. In fact, in this case, Deque is a subtype of both Stack and Queue!
I think that Java, C++, C# and their ilk have contributed to the confusion, as already noted, by the fact that they consolidate both ideas into a single class hierarchy. However, I think the example given above does justice to the ideas in a rather language-agnostic way. I'm sure others can give more examples.
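Since Java fuses the two hierarchies, the reuse-without-subtyping half of the deque example can only be sketched there via composition (MyStack is an invented name): MyStack is written in terms of Deque, yet is deliberately not a subtype of it.
import java.util.ArrayDeque;
import java.util.Deque;

// Implementation reuse without subtyping: MyStack reuses Deque's
// implementation, but cannot be passed where a Deque is expected.
class MyStack<T> {
    private final Deque<T> deque = new ArrayDeque<>();

    void push(T x) { deque.addFirst(x); }          // insert-front
    T pop()        { return deque.removeFirst(); } // delete-front
}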
A relative unfortunately died and left you his bookstore.
You can now read all the books there, sell them, you can look at his accounts, his customer list, etc. This is inheritance - you have everything the relative had. Inheritance is a form of code reuse.
You can also re-open the book store yourself, taking on all of the relative's roles and responsibilities, even though you add some changes of your own - this is subtyping - you are now a bookstore owner, just like your relative used to be.
Subtyping is a key component of OOP - you have an object of one type but which fulfills the interface of another type, so it can be used anywhere the other object could have been used.
In the languages you listed in your question - C++, Java and C# - the two are (almost) always used together, and thus the only way to inherit from something is to subtype it and vice versa. But other languages don't necessarily fuse the two concepts.
Inheritance is about gaining attributes (and/or functionality) of supertypes. For example:
class Base {
    // interface with included definitions
};

// Private inheritance (C++): Derived reuses Base's implementation
// without becoming a subtype of Base.
class Derived : private Base {
    // Add some additional functionality.
    // Reuse Base without having to explicitly forward
    // the functions in Base.
};
Here, a Derived cannot be used where a Base is expected, but it is able to act similarly to a Base, while adding behaviour or changing some aspects of Base's behaviour. Typically, Base would be a small helper class that provides both an interface and an implementation for some commonly desired functionality.
Subtype-polymorphism is about implementing an interface, and so being able to substitute different implementations of that interface at run-time:
class Interface {
public:
    // some abstract interface, no definitions included
    virtual void operation() = 0;
    virtual ~Interface() = default;
};

class Implementation : public Interface {
public:
    // provide all the operations required by the interface
    void operation() override { /* ... */ }
};
Here, an Implementation can be used wherever an Interface is required, and different implementations can be substituted at run-time. The purpose is to allow code that uses Interface to be more widely useful.
Your confusion is justified. Java, C#, and C++ all conflate these two ideas into a single class hierarchy. However, the two concepts are not identical, and there do exist languages which separate the two.
If you inherit privately in C++, you get inheritance without subtyping. That is, given:
class Derived : Base // note the missing public before Base
You cannot write:
Base * p = new Derived(); // type error
Because Derived is not a subtype of Base. You merely inherited the implementation, not the type.
Subtyping doesn't have to be implemented via inheritance. Some examples of subtyping that are not inheritance:
OCaml's polymorphic variants
Rust's lifetime annotations
Clean's uniqueness types
Go's interfaces
In a word: subtyping and inheritance are both forms of polymorphism (inheritance being dynamic polymorphism, via overriding). Inheritance is really subclassing: it gives no guarantee that the subclass remains compatible with the superclass (the subclass may discard superclass behaviour), whereas subtyping (such as implementing an interface) ensures the class does not discard the expected behaviour.

How can type classes be used to implement persistence, introspection, identity, printing,

In the discussion on The Myths of Object-Orientation, Tim Sweeney describes what he thinks is a good alternative to the all-encompassing frameworks that we all use today.
He seems most interested in typeclasses:
we can use constructs like typeclasses to define features (like persistence, introspection,
identity, printing) orthogonally to type constructs like classes and
interfaces
I am passingly familiar with type classes as "types of types", but I am not sure exactly how they would be applied to the aforementioned problems: persistence, printing, ...
Any ideas?
My best guess would be code reuse through default methods, and orthogonal definition through detaching the type class implementation from the type itself.
Basically, when you define a type class, you can define default implementations for its methods. For example, the Eq (equality) class in Haskell defines /= (not equal) as not (x == y), and this method works by default for all implementations of the type class. In a similar way, in another language you could define a type class with all the persistence code written (Save, Load) except for one or two methods; or, in a language with good reflection capabilities, you could define all persistence methods in advance. In practice, it is somewhat similar to multiple inheritance.
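In Java terms, default methods on interfaces give the same "one method for free" effect; here is a sketch mirroring Haskell's Eq (MyEq is an invented name):
interface MyEq<T> {
    boolean eq(T other);

    // Default implementation, analogous to Haskell defining /= as
    // not (x == y): implementors only have to supply eq.
    default boolean notEq(T other) {
        return !eq(other);
    }
}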
Now, the other thing is that you do not have to attach the type class to your type in the same place where you define the type; you can actually do it later, and in a different place. This allows the persistence logic to be nicely separated from the original type.
Some good examples of how that looks in an OOP language are in my favorite paper ever: http://www.stefanwehr.de/publications/Wehr_JavaGI_generalized_interfaces_for_Java.pdf. Their default implementations and retroactive interface implementations are essentially the same language features I have just described.
Disclaimer: I do not really know Haskell so I might be wrong in places

Why avoid subtyping?

I have seen many people in the Scala community advise on avoiding subtyping "like a plague". What are the various reasons against the use of subtyping? What are the alternatives?
Types determine the granularity of composition, i.e. of extensibility.
For example, an interface such as Comparable combines (and thus conflates) equality and relational operators, so it is impossible to compose on just the equality interface or just the relational interface.
In general, the substitution principle of inheritance is undecidable. Russell's paradox implies that any set that is extensible (i.e. does not enumerate the type of every possible member or subtype) can include itself, i.e. is a subtype of itself. But in order to identify (decide) what is a subtype and what is not, the invariants of the type must be completely enumerated, and then it is no longer extensible. This is the paradox: subtyped extensibility makes inheritance undecidable. The paradox must exist, else knowledge would be static and knowledge formation wouldn't exist.
Function composition is the surjective substitution of subtyping, because the input of a function can be substituted for its output, i.e. anywhere the output type is expected, the input type can be substituted by wrapping it in the function call. But composition does not make the bijective contract of subtyping: accessing the interface of the output of a function does not access the input instance of the function.
Thus composition does not have to maintain the future (i.e. unbounded) invariants, and so can be both extensible and decidable. Subtyping can be MUCH more powerful where it is provably decidable, because it maintains this bijective contract; e.g. a function that sorts an immutable list of the supertype can operate on an immutable list of the subtype.
So the conclusion is to enumerate all the invariants of each type (i.e. of its interfaces), make these types orthogonal (maximize granularity of composition), and then use function composition to accomplish extension where those invariants would not be orthogonal. Thus a subtype is appropriate only where it provably models the invariants of the supertype interface, and the additional interface(s) of the subtype are provably orthogonal to the invariants of the supertype interface. Thus the invariants of interfaces should be orthogonal.
Category theory provides rules for the model of the invariants of each subtype, i.e. of Functor, Applicative, and Monad, which preserve function composition on lifted types; see the aforementioned example of the power of subtyping for lists.
One reason is that equals() is very hard to get right when subtyping is involved. See How to Write an Equality Method in Java, specifically "Pitfall #4: Failing to define equals as an equivalence relation". In essence: to get equality right under subtyping, you need double dispatch.
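A condensed sketch of that pitfall, with Point/ColoredPoint as in the article's running example (reconstructed from memory, not copied): a naive instanceof-based equals breaks symmetry as soon as a subclass adds state.
class Point {
    final int x, y;
    Point(int x, int y) { this.x = x; this.y = y; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof Point)) return false;
        Point p = (Point) o;
        return x == p.x && y == p.y;
    }
    @Override public int hashCode() { return 31 * x + y; }
}

class ColoredPoint extends Point {
    final String color;
    ColoredPoint(int x, int y, String color) { super(x, y); this.color = color; }

    @Override public boolean equals(Object o) {
        if (!(o instanceof ColoredPoint)) return false;
        ColoredPoint c = (ColoredPoint) o;
        return super.equals(c) && color.equals(c.color);
    }
    @Override public int hashCode() { return 31 * super.hashCode() + color.hashCode(); }
}
Here new Point(1, 2).equals(new ColoredPoint(1, 2, "red")) is true, but the reverse comparison is false, so equals is no longer an equivalence relation; the article's fix is the double-dispatch canEqual technique.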
I think the general context is for the language to be as "pure" as possible (i.e. using pure functions as much as possible), and the advice comes from the comparison with Haskell.
From "Ruminations of a Programmer"
Scala, being a hybrid OO-FP language has to take care of issues like subtyping (which Haskell does not have).
As mentioned in this PSE answer:
no way to restrict a subtype so that it can't do more than the type it inherits from.
For example, if the base class is immutable and defines a pure method foo(...), derived classes must not be mutable or override foo() with a function that is not pure
But the actual recommendation would be to use the best solution adapted to the program you are currently developing.
Focusing on subtyping, and ignoring the issues related to classes, inheritance, OOP, etc.: the idea is that subtyping represents an is-a relation between types. For example, types A and B have different operations, but if A is-a B, then we can use any of B's operations on an A.
OTOH, using another traditional relation: if C has-a B, then we can reuse any of B's operations on a C. Usually languages give the former the nicer syntax, a.opOnB, while with composition it would be c.b.opOnB.
The problem is that in many cases there's more than one way to relate two types. For example, Real can be embedded in Complex by assuming 0 for the imaginary part, but Complex can be embedded in Real by ignoring the imaginary part, so each can be seen as a subtype of the other, and subtyping forces one relation to be the preferred view. And there are yet more possible relations (e.g. viewing a Complex as a Real via the theta component of its polar representation).
In formal terminology, we usually say morphism for such relations between types, and there are special kinds of morphisms for relations with different properties (e.g. isomorphism, homomorphism).
In a language with subtyping, there's usually much more sugar on is-a relations, and given many possible embeddings, we tend to see unnecessary friction whenever we use an unpreferred relation. If we bring inheritance, classes and OOP into the mix, the problem becomes much more visible and messy.
My answer does not say why subtyping is avoided, but it tries to give another hint at why it can be avoided.
Using "type classes" you can add an abstraction over existing types/classes without modifying them. Inheritance is used to express that some classes are specializations of a more abstract class, but with type classes you can take any existing classes and express that they all share a common property, for example that they are Comparable. And as long as you are not concerned with their being Comparable, you don't even notice it. The classes don't inherit any methods from some abstract Comparable type as long as you don't use them. It's a bit like programming in dynamic languages.
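Java's Comparator gives a taste of that retroactiveness (my own sketch, invented names): an ordering is attached to an existing class from the outside, without modifying or re-inheriting it:
import java.util.Arrays;
import java.util.Comparator;

public class RetroactiveOrdering {
    // An existing class we do not modify; it knows nothing about ordering.
    static class Point {
        final int x, y;
        Point(int x, int y) { this.x = x; this.y = y; }
    }

    public static void main(String[] args) {
        // The Comparator plays the role of a retroactive type-class
        // instance: comparability is declared at the use site.
        Comparator<Point> byX = Comparator.comparingInt(p -> p.x);

        Point[] points = { new Point(3, 1), new Point(1, 2) };
        Arrays.sort(points, byX);
        System.out.println(points[0].x); // 1
    }
}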
Further reading:
http://blog.tmorris.net/the-power-of-type-classes-with-scala-implicit-defs/
http://debasishg.blogspot.com/2010/07/refactoring-into-scala-type-classes.html
I don't know Scala, but I think the mantra 'prefer composition over inheritance' applies for Scala exactly the way it does for every other OO programming language (and subtyping is often used with the same meaning as 'inheritance'). Here
Prefer composition over inheritance?
you will find some more information.
I think lots of Scala programmers are former Java programmers. They are used to thinking in terms of object-oriented subtyping, and they can usually find an OO-like solution to most problems easily. But functional programming is a new paradigm to discover, so people ask for different kinds of solutions.
This is the best paper I have found on the subject. A motivating quote from the paper –
We argue that while some of the simpler aspects of object-oriented languages are compatible with ML, adding a full-fledged class-based object system to ML leads to an excessively complex type system and relatively little expressive gain.