Language without type-casting - OOP

My question is pretty much what the title says: Is it possible to have a programming language which does not allow explicit type casting?
To clarify what I mean, assume we're working in some C#-like language with a parent Base class and a child Derived class. Clearly, such code would be safe:
Base a = new Derived();
Since going up the inheritance hierarchy is safe, but
Derived b = (Derived)a;
is not guaranteed safe, since going down is not.
But, regardless of the safety, such downcasts are valid in many languages (like Java or C#) - the code will compile, and will simply fail at runtime if the types aren't right. So technically, the code is still safe, but via runtime checks and not compile-time checks (btw, I'm not a fan of runtime checks).
I would personally find complete compile-time type safety to be very important, at least from a theoretical perspective, and at most from the perspective of reliable code. A consequence of compile-time type safety is that casts are no longer needed (which I think is great, 'cause they're ugly anyways). Any cast-like behaviour can be implemented by an implicit conversion operator or by a constructor.
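For example, here's a rough sketch of what I mean (the Meters/Kilometers types are made up purely for illustration):
public class Meters
{
    public readonly double Value;
    public Meters(double value) { Value = value; }
    // The conversion is written once, by the type's author,
    // instead of being cast at every call site.
    public static implicit operator Kilometers(Meters m)
    {
        return new Kilometers(m.Value / 1000.0);
    }
}
public class Kilometers
{
    public readonly double Value;
    public Kilometers(double value) { Value = value; }
}
// Usage - no cast syntax anywhere:
// Kilometers k = new Meters(1500);   // k.Value == 1.5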
So I'm wondering: are there currently any OO languages which provide such rigorous type safety at compile time that casts are obsolete? I.e., they don't allow any unsafe conversion operations whatsoever? Or is there a reason this wouldn't work?
Thanks for any input.
Edit
If I can clarify by example, here's the big reason I hate downcasts so much.
Let's say I have the following (loosely based on C#'s collections):
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
IEnumerable<T> Filter( Func<T, bool> predicate );
}
public class List<T> : IEnumerable<T>
{
// All of list's implementation here
}
Now suppose someone decides to write code like this:
List<int> list = new List<int>( new int[]{1, 2, 3, 4, 5, 6} );
// Let's filter out the odd numbers
List<int> result = (List<int>)list.Filter( x => x % 2 != 0 );
Notice how the cast is necessary on that last line. But is it valid? Not in general. Sure, it makes sense that the implementation of List<T>.Filter will return another List<T>, but this is not guaranteed (it could be any subtype of IEnumerable<T>). Even if this works at one point in time, a later version may change this, exposing how brittle the code is.
Pretty much all of the situations I can think of that require downcasts boil down to something like this example - a method has a return type of some class or interface, but since we know some implementation details, we're confident in downcasting the result. But this is anti-OOP, since OOP actually encourages abstracting away from implementation details. So why do we do it anyway, even in purely OOP languages?

Downcasts can be gradually eliminated by improving the power of the type system.
One proposed solution to the example you gave is to add the ability to declare the return type of a method as "the same as this". This allows a subclass to return a subclass without requiring a cast. Thus you get something like this:
public interface IEnumerable<T>
{
IEnumerator<T> GetEnumerator();
This<T> Filter( Func<T, bool> predicate );
}
public class List<T> : IEnumerable<T>
{
// All of list's implementation here
}
Now the cast is unnecessary:
List<int> list = new List<int>( new int[]{1, 2, 3, 4, 5, 6} );
// Compiler "knows" that Filter returns the same type as its receiver
List<int> result = list.Filter( x => x % 2 != 0 );
Other cases of downcasting also have proposed solutions by improving the type system, but these improvements have not yet been made to C#, Java, or C++.
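C# doesn't have a "This<T>" type today, but you can approximate the idea with an extra generic parameter (F-bounded polymorphism). A rough sketch - the IFilterable/MyList names are made up for illustration:
using System;
using System.Collections.Generic;

// TSelf is constrained to be the implementing type, so Filter can be
// declared to return "the same type as the receiver".
public interface IFilterable<T, TSelf> where TSelf : IFilterable<T, TSelf>
{
    TSelf Filter(Func<T, bool> predicate);
}

public class MyList<T> : IFilterable<T, MyList<T>>
{
    private readonly List<T> items = new List<T>();
    public MyList() { }
    public MyList(IEnumerable<T> source) { items.AddRange(source); }

    public MyList<T> Filter(Func<T, bool> predicate)
    {
        var result = new MyList<T>();
        foreach (var item in items)
            if (predicate(item)) result.items.Add(item);
        return result;
    }
}

// Usage - the static type of the result is MyList<int>, no cast required:
// MyList<int> evens = new MyList<int>(new[] { 1, 2, 3, 4, 5, 6 }).Filter(x => x % 2 == 0);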

Well, it's certainly possible to have programming languages that don't have subtyping at all, and then naturally there's no need for downcasts there. Most non-OO languages fall into that category.
Even in a class-based OO language like Java, most downcasts could formally be replaced simply by letting the base class have a method
Foo meAsFoo() {
return null;
}
which the subclass would then override to return itself. However, that would still just be another way to express a run-time test, with the added downside of being more complicated to use. And it would be hard to forbid the pattern without losing all other advantages of inheritance-based subtyping.
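For concreteness, a minimal sketch of that pattern (shown here in C#-style syntax; the names are illustrative):
class Base
{
    // By default, "I am not a Foo".
    public virtual Foo MeAsFoo() { return null; }
}

class Foo : Base
{
    // The subclass overrides the method to return itself.
    public override Foo MeAsFoo() { return this; }
}

// Usage - still a run-time test, just spelled as a method call:
// Foo f = someBase.MeAsFoo();
// if (f != null) { /* use f as a Foo */ }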
Of course, this is only possible if you're able to modify the parent class. I suspect you might consider that a plus, but given how often one can modify the parent class and so use the workaround, I'm not sure how much that would be worth in terms of encouraging "good" design (for some more or less arbitrary value of "good").
A case could be made that it would encourage safe programming more if the language offered a case-matching construct instead of a downcast expression:
Shape x = .... ;
switch( x ) {
case Rectangle r:
return 5*r.diagonal();
case Circle c:
return c.radius();
case Point:
return 0 ;
default:
throw new RuntimeException("This can't happen, and I, "+
"the programmer, take full responsibility");
}
However, it might then be a problem in practice that without a closed-world assumption (which modern programming languages seem to be reluctant to make) many of those switches would need default: cases that the programmer knows can never happen, which might well desensitize the programmer to the resultant throws.
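For what it's worth, C# has since gained essentially this construct (type patterns in switch, C# 7.0 and later), and recent Java versions have pattern matching for switch as well. A sketch of the example above in current C#, assuming the Shape classes expose the members used here:
static double Measure(Shape x)
{
    switch (x)
    {
        case Rectangle r:
            return 5 * r.Diagonal();
        case Circle c:
            return c.Radius();
        case Point _:
            return 0;
        default:
            // Still required without a closed-world assumption.
            throw new InvalidOperationException("Unhandled shape: " + x.GetType());
    }
}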

There are many languages with duck typing and/or implicit type conversion. Perl certainly comes to mind; the intricacies of how subtypes of the scalar type are converted internally are a frequent source of criticism, but also receive praise because when they do work like you expect, they contribute to the DWIM feel of the language.
Traditional Lisp is another good example - all you have is atoms and lists, and nil which is both at the same time. Otherwise, the twain never meet ...
(You seem to come from a universe where programming languages are necessarily object-oriented, strongly typed, and compiled, though.)


What is the antonym of encapsulation?

Using online dictionary tools doesn't really help. I think the way encapsulate is used in computer science doesn't exactly match its meaning in plain English.
What is the antonym of computer science's version of encapsulate? More specifically, what is an antonym for encapsulate that would work as a function name.
Why should I care? Here's my motivation:
// A class with a private member variable;
class Private
{
public:
// Test will be able to access Private's private members;
class Test;
private:
int i;
};
// Make Test exactly like Private
class Private::Test : public Private
{
public:
// Make Private's copy of i available publicly in Test
using Private::i;
};
// A convenience function to quickly break encapsulation on a class to be tested.
// I don't have good name for what it does
Private::Test& foo( Private& p )
{ return *reinterpret_cast<Private::Test*>(&p); } // power cast
void unit_test()
{
Private p;
// using the function quickly grab access to p's internals.
// obviously it would be evil to use this anywhere except in unit tests.
assert( foo(p).i == 42 );
}
The antonym is "C".
Ok, just kidding. (Sort of.)
The best terms I can come up with are "expose" and "violate".
The purpose behind encapsulation is to hide/cover/protect. The antonym would be reveal/expose/make public.
How about decapsulation?
It isn't a computer science term, but in medicine it means the surgical removal of a capsule or enveloping membrane.
"Removing/Breaking encapsulation" is about the closest thing I've seen, honestly.
If you think of the word in the English sense, to encapsulate means to enclose within something. But in the CS sense, there's this concept of protection levels and it looks like you want to imply circumventing the access levels as well, so something like "extraction" doesn't really convey the meaning you're looking for.
But if you just think of it in terms of what the access levels are, it looks like you're making something public so, how about "publicizing"?
This is not such a simple question - Scott Meyers had an interesting article to demonstrate some of the nuances around encapsulation here.
I'll start with the punchline: If you're writing a function that can be implemented as either a member or as a non-friend non-member, you should prefer to implement it as a non-member function. That decision increases class encapsulation. When you think encapsulation, you should think non-member functions.
How about "Bad Idea"?
The true antonym of "Encapsulation" is "Global State".
The general opposite of encapsulation is coupling and we often talk about systems that are tightly coupled or loosely coupled.
The reason you'd want components to be encapsulated is because it makes it easier to reason about how they work.
Take the analogy of trains: the consequence of coupling the railcars is that the driver must consider the characteristics (inertia, length) of the entire train.
Obviously, though, we couple systems because we need them to work together.
Inverted encapsulation and data structures
There's another term that I've been digging for, which is how I came across this question, that refers to a non-standard style of data structures.
The standard style of encapsulation is exemplified by Java's LinkedList; the actual nodes of the list are designed to be inaccessible to the consumer. The theory is that this is an implementation detail and can change to improve performance, while existing code will continue to run.
Another style is the classic functional cons-list. This is a singly linked list, and the idea is that it's so simple that there's nothing to improve about the data structure, e.g.
data [a] = [] | a : [a] deriving (Eq, Ord)
-- Haskellers then work directly with the list
-- There's nothing to hide because it's so simple
typicalHaskell :: [a] -> b
typicalHaskell [] = emptyValue
typicalHaskell (h:t) = h `doAThing` typicalHaskell t
That's the definition from Haskell's standard Prelude, though the report notes that it isn't valid Haskell syntax, and in practice [a] is defined in the guts of the compiler.
Then there's what I'm calling an "inverted" data structure, but I'm still looking for the correct term. This is, I think, really the opposite of encapsulation.
A good example of this is Python's heapq module. The data structure here is a binary heap, but there isn't a Heap class. Rather, you get a collection of functions that operate on generic Python lists and you're responsible for using those methods correctly to ensure the heap invariants are maintained.
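As a rough C# analogue of that style (purely illustrative, not Python's actual API): free functions that maintain a min-heap invariant over a plain List<int>, with no Heap class anywhere - the caller is trusted to touch the list only through these helpers.
using System.Collections.Generic;

static class HeapOps
{
    public static void Push(List<int> heap, int item)
    {
        heap.Add(item);
        int i = heap.Count - 1;
        while (i > 0) // sift the new item up
        {
            int parent = (i - 1) / 2;
            if (heap[parent] <= heap[i]) break;
            int tmp = heap[parent]; heap[parent] = heap[i]; heap[i] = tmp;
            i = parent;
        }
    }

    public static int Pop(List<int> heap)
    {
        int top = heap[0];
        heap[0] = heap[heap.Count - 1];
        heap.RemoveAt(heap.Count - 1);
        int i = 0;
        while (true) // sift the moved item down
        {
            int left = 2 * i + 1, right = left + 1, smallest = i;
            if (left < heap.Count && heap[left] < heap[smallest]) smallest = left;
            if (right < heap.Count && heap[right] < heap[smallest]) smallest = right;
            if (smallest == i) break;
            int tmp = heap[i]; heap[i] = heap[smallest]; heap[smallest] = tmp;
            i = smallest;
        }
        return top;
    }
}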
How about "spaghetti"?

What is the difference between the concept of 'class' and 'type'?

I know this question has already been asked, but I didn't quite get it. I would like to know which is the more fundamental one: class or type. I have a few questions, please clear these up for me:
Is type the basis of a programming language's data types?
Is type hard-coded into the language itself, while a class is something we can define ourselves?
What are untyped languages? Please give some examples.
Is type not something that falls under OOP concepts, i.e. is it not restricted to the OOP world?
Please clear this up for me, thanks.
I haven't worked with many languages. My questions are probably framed in terms of Java, C#, and Objective-C.
1/ I think "type" is essentially what people mean by "data type" when they talk about it.
2/ No. We can define both types and classes. An object of class A has type A. For example, if we define String s = "123"; then s has type String and belongs to class String. But the reverse is not always true.
For example:
class B {}
class A extends B {}
B b = new A();
then you can say b has type B and belongs to both class A and class B. But b doesn't have type A.
3/ An untyped language is a language that allows you to change the type of a variable, as in JavaScript:
var s = "123"; // type string
s = 123; // then type integer
4/ I don't know much about this, but I think it is not restricted to OOP. It can apply to procedural programming as well.
It may well depend on the language. I treat types and classes as the same thing in OO, only making a distinction between class (the definition of a family of objects) and instance (or object), specific concrete occurrences of a class.
I come originally from a C world where there was no real difference between language-defined types like int and types that you made yourself with typedef or struct.
Likewise, in C++, there's little difference (probably none) between std::string and any class you put together yourself, other than the fact that std::string will almost certainly be bug-free by now. The same isn't always true of our own code :-)
I've heard people suggest that types are classes without methods but I don't believe that distinction (again because of my C/C++ background).
There is a fundamental difference in some languages between integral (in the sense of integrated rather than integer) types and class types. Classes can be extended but int and float (examples for C++) cannot.
In OOP languages, a class specifies the definition of an object. In many cases, that object can serve as a type for things like parameter matching in a function.
So, for an example, when you define a function, you specify the type of data that should be passed to the function and the type of data that is returned:
int AddOne(int value) { return value+1; } uses int types for the return value and the parameter being passed in.
In languages that have both, the concepts of type and class/object can almost become interchangeable. However, there are many languages that do not have both. For instance, I believe that standard C has no support for custom-defined objects, but it certainly does still have types. On the other hand, both PHP and JavaScript are examples of languages where type is very loosely defined (basically, types are either single item, collection/array/object, or undefined [js only]), but they have full support for classes/objects.
Another key difference: you can have methods and custom-functions associated with a class/object, but not with a standard data-type.
Hopefully that clarified some. To answer your specific questions:
In some ways, type could be considered a base concept of programming, yes.
Yes, with the exception that classes can be treated as types in functions, as in the example above.
An untyped language is one that lets you use any type of variable interchangeably. Meaning that you can handle a string with the same code that handles an int, for instance. In practice, most 'untyped' languages actually implement a concept called duck typing, so named because they say that 'if it acts like a duck, it should be treated like a duck' and attempt to use any variable as the type that makes sense for the code encountered. Again, PHP and JavaScript are two languages which do this.
Very true, type is applicable outside of the OOP world.

Is there any disadvantage of writing a long constructor?

Does it affect the time it takes to load the application?
Or are there any other issues in doing so?
The question is vague on what "long" means. Here are some possible interpretations:
Interpretation #1: The constructor has many parameters
Constructors with many parameters can lead to poor readability, and better alternatives exist.
Here's a quote from Effective Java 2nd Edition, Item 2: Consider a builder pattern when faced with many constructor parameters:
Traditionally, programmers have used the telescoping constructor pattern, in which you provide a constructor with only the required parameters, another with a single optional parameter, a third with two optional parameters, and so on...
The telescoping constructor pattern is essentially something like this:
public class Telescope {
final String name;
final int levels;
final boolean isAdjustable;
public Telescope(String name) {
this(name, 5);
}
public Telescope(String name, int levels) {
this(name, levels, false);
}
public Telescope(String name, int levels, boolean isAdjustable) {
this.name = name;
this.levels = levels;
this.isAdjustable = isAdjustable;
}
}
And now you can do any of the following:
new Telescope("X/1999");
new Telescope("X/1999", 13);
new Telescope("X/1999", 13, true);
You can't, however, currently set only the name and isAdjustable, and leave levels at its default. You can provide more constructor overloads, but obviously the number would explode as the number of parameters grows, and you may even have multiple boolean and int arguments, which would really make a mess out of things.
As you can see, this isn't a pleasant pattern to write, and even less pleasant to use (What does "true" mean here? What's 13?).
Bloch recommends using a builder pattern, which would allow you to write something like this instead:
Telescope telly = new Telescope.Builder("X/1999").setAdjustable(true).build();
Note that now the parameters are named, and you can set them in any order you want, and you can skip the ones that you want to keep at default values. This is certainly much better than telescoping constructors, especially when there's a huge number of parameters that belong to many of the same types.
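For reference, here's a hedged sketch of what such a Builder might look like (written in C# flavor here; Bloch's original is Java, and the member names are just illustrative):
public class Telescope
{
    public readonly string Name;
    public readonly int Levels;
    public readonly bool IsAdjustable;

    // Only the Builder can create a Telescope.
    private Telescope(Builder b)
    {
        Name = b.Name;
        Levels = b.Levels;
        IsAdjustable = b.IsAdjustable;
    }

    public class Builder
    {
        internal string Name;
        internal int Levels = 5;            // defaults live in one place
        internal bool IsAdjustable = false;

        public Builder(string name) { Name = name; }   // required parameter
        public Builder SetLevels(int levels) { Levels = levels; return this; }
        public Builder SetAdjustable(bool adjustable) { IsAdjustable = adjustable; return this; }
        public Telescope Build() { return new Telescope(this); }
    }
}

// Telescope telly = new Telescope.Builder("X/1999").SetAdjustable(true).Build();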
See also
Wikipedia/Builder pattern
Effective Java 2nd Edition, Item 2: Consider a builder pattern when faced with many constructor parameters (excerpt online)
Related questions
When would you use the Builder Pattern?
Is this a well known design pattern? What is its name?
Interpretation #2: The constructor does a lot of work that costs time
If the work must be done at construction time, then doing it in the constructor or in a helper method doesn't really make too much of a difference. When a constructor delegates work to a helper method, however, make sure that it's not overridable, because that could lead to a lot of problems.
Here's some quote from Effective Java 2nd Edition, Item 17: Design and document for inheritance, or else prohibit it:
There are a few more restrictions that a class must obey to allow inheritance. Constructors must not invoke overridable methods, directly or indirectly. If you violate this rule, program failure will result. The superclass constructor runs before the subclass constructor, so the overriding method in the subclass will be invoked before the subclass constructor has run. If the overriding method depends on any initialization performed by the subclass constructor, the method will not behave as expected.
Here's an example to illustrate:
public class ConstructorCallsOverride {
public static void main(String[] args) {
abstract class Base {
Base() { overrideMe(); }
abstract void overrideMe();
}
class Child extends Base {
final int x;
Child(int x) { this.x = x; }
@Override void overrideMe() {
System.out.println(x);
}
}
new Child(42); // prints "0"
}
}
Here, when Base constructor calls overrideMe, Child has not finished initializing the final int x, and the method gets the wrong value. This will almost certainly lead to bugs and errors.
Interpretation #3: The constructor does a lot of work that can be deferred
The construction of an object can be made faster when some work is deferred to when it's actually needed; this is called lazy initialization. As an example, when a String is constructed, it does not actually compute its hash code. It only does it when the hash code is first required, and then it will cache it (since strings are immutable, this value will not change).
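A minimal sketch of the idea (a hypothetical immutable class, not the real String implementation; note this naive version is not thread-safe):
public sealed class Document
{
    private readonly string text;
    private int? cachedHash;                 // null until first requested

    public Document(string text) { this.text = text; }

    public override int GetHashCode()
    {
        if (cachedHash == null)
            cachedHash = text.GetHashCode(); // deferred until needed, then cached
        return cachedHash.Value;
    }
}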
However, consider Effective Java 2nd Edition, Item 71: Use lazy initialization judiciously. Lazy initialization can lead to subtle bugs, and doesn't always yield improved performance that justifies the added complexity. Do not prematurely optimize.
Constructors are a little special in that an unhandled exception in a constructor may have weird side effects. Without seeing your code I would assume that a long constructor increases the risk of exceptions. I would make the constructor as simple as needed and utilize other methods to do the rest in order to provide better error handling.
The biggest disadvantage is probably the same as writing any other long function -- that it can get complex and difficult to understand.
The rest is going to vary. First of all, length and execution time don't necessarily correlate -- you could have a single line (e.g., function call) that took several seconds to complete (e.g., connect to a server) or lots of code that executed entirely within the CPU and finished quickly.
Startup time would (obviously) only be affected by constructors that were/are invoked during startup. I haven't had an issue with this in any code I've written (at all recently anyway), but I've seen code that did. On some types of embedded systems (for one example) you really want to avoid creating and destroying objects during normal use, so you create almost everything statically during bootup. Once it's running, you can devote all the processor time to getting the real work done.
A constructor is just another function. You need very long functions called many times to make a program slow. So if it's only called once, it usually won't matter how much code is inside.
It affects the time it takes to construct that object, naturally, but no more than having an empty constructor and calling methods to do that work instead. It has no effect on the application load time.
In the case of a copy constructor, if we do not pass the argument by reference, then passing it requires creating a copy, which itself calls the copy constructor. Each such call creates another object and calls the copy constructor again, so the recursion never ends, memory fills up, and eventually an error is reported. If we pass by reference, no new object is created to hold the value and no recursion takes place. (This is why C++ requires a copy constructor to take its parameter by reference.)
I would avoid doing anything in your constructor that isn't absolutely necessary. Initialize your variables in there, and try not to do much else. Additional functionality should reside in separate functions that you call only if you need to.

C# 4.0 'dynamic' and foreach statement

Not long ago I discovered that the new dynamic keyword doesn't work well with C#'s foreach statement:
using System;
sealed class Foo {
public struct FooEnumerator {
int value;
public bool MoveNext() { return true; }
public int Current { get { return value++; } }
}
public FooEnumerator GetEnumerator() {
return new FooEnumerator();
}
static void Main() {
foreach (int x in new Foo()) {
Console.WriteLine(x);
if (x >= 100) break;
}
foreach (int x in (dynamic)new Foo()) { // :)
Console.WriteLine(x);
if (x >= 100) break;
}
}
}
I expected that iterating over a dynamic variable would work exactly as if the type of the collection variable were known at compile time. Instead, I discovered that the second loop actually looks like this when compiled:
foreach (object x in (IEnumerable) /* dynamic cast */ (object) new Foo()) {
...
}
and every access to the x variable results in a dynamic lookup/cast, so C# ignores that I've specified the correct type for x in the foreach statement - that was a bit surprising for me... And also, the C# compiler completely ignores that the collection from the dynamically typed variable may implement the IEnumerable<T> interface!
The full foreach statement behavior is described in the C# 4.0 specification, section 8.8.4 (The foreach statement).
But... it's perfectly possible to implement the same behavior at runtime! It's possible to add an extra CSharpBinderFlags.ForEachCast flag and correct the emitted code to look like:
foreach (int x in (IEnumerable<int>) /* dynamic cast with the CSharpBinderFlags.ForEachCast flag */ (object) new Foo()) {
...
}
And add some extra logic to CSharpConvertBinder:
Wrap IEnumerable collections and IEnumerators in IEnumerable<T>/IEnumerator<T>.
Wrap collections that don't implement IEnumerable<T>/IEnumerator<T> so that they implement these interfaces.
So today the foreach statement iterates over dynamic completely differently from how it iterates over a statically known collection variable, and completely ignores the type information specified by the user. All of that results in different iteration behavior (IEnumerable<T>-implementing collections are iterated as if they only implemented IEnumerable) and a more than 150x slowdown when iterating over dynamic. A simple fix would give much better performance:
foreach (int x in (IEnumerable<int>) dynamicVariable) {
But why should I write code like this?
It's very nice to see that C# 4.0's dynamic sometimes works exactly the same as if the type were known at compile time, but it's very sad to see that dynamic works completely differently in a place where it could work the same as statically typed code.
So my question is: why does foreach over dynamic work differently from foreach over anything else?
First off, to explain some background to readers who are confused by the question: the C# language actually does not require that the collection of a "foreach" implement IEnumerable. Rather, it requires either that it implement IEnumerable, or that it implement IEnumerable<T>, or simply that it have a GetEnumerator method (and that the GetEnumerator method returns something with a Current and MoveNext that matches the pattern expected, and so on.)
That might seem like an odd feature for a statically typed language like C# to have. Why should we "match the pattern"? Why not require that collections implement IEnumerable?
Think about the world before generics. If you wanted to make a collection of ints, you'd have to use IEnumerable. And therefore, every call to Current would box an int, and then of course the caller would immediately unbox it back to int. Which is slow and creates pressure on the GC. By going with a pattern-based approach you can make strongly typed collections in C# 1.0!
Nowadays of course no one implements that pattern; if you want a strongly typed collection, you implement IEnumerable<T> and you're done. Had a generic type system been available to C# 1.0, it is unlikely that the "match the pattern" feature would have been implemented in the first place.
As you've noted, instead of looking for the pattern, the code generated for a dynamic collection in a foreach looks for a dynamic conversion to IEnumerable (and then does a conversion from the object returned by Current to the type of the loop variable of course.) So your question basically is "why does the code generated by use of the dynamic type as a collection type of foreach fail to look for the pattern at runtime?"
Because it isn't 1999 anymore, and even when it was back in the C# 1.0 days, collections that used the pattern also almost always implemented IEnumerable too. The probability that a real user is going to be writing production-quality C# 4.0 code which does a foreach over a collection that implements the pattern but not IEnumerable is extremely low. Now, if you're in that situation, well, that's unexpected, and I'm sorry that our design failed to anticipate your needs. If you feel that your scenario is in fact common, and that we've misjudged how rare it is, please post more details about your scenario and we'll consider changing this for hypothetical future versions.
Note that the conversion we generate to IEnumerable is a dynamic conversion, not simply a type test. That way, the dynamic object may participate; if it does not implement IEnumerable but wishes to proffer up a proxy object which does, it is free to do so.
In short, the design of "dynamic foreach" is "dynamically ask the object for an IEnumerable sequence", rather than "dynamically do every type-testing operation we would have done at compile time". This does in theory subtly violate the design principle that dynamic analysis gives the same result as static analysis would have, but in practice it's how we expect the vast majority of dynamically accessed collections to work.
But why should I write code like this?
Indeed. And why would the compiler write code like that? You've removed any chance it might have had to guess that the loop could be optimized. Btw, you seem to be interpreting the IL incorrectly: it is rebinding to obtain IEnumerable.Current, the MoveNext() call is direct, and GetEnumerator() is called only once. Which I think is appropriate; the next element might or might not cast to an int without problems. It could be a collection of various types, each with their own binder.

Non-nullable reference types

I'm designing a language, and I'm wondering if it's reasonable to make reference types non-nullable by default, and use "?" for nullable value and reference types. Are there any problems with this? What would you do about this:
class Foo {
Bar? b;
Bar b2;
Foo() {
b.DoSomething(); //valid, but will cause exception
b2.DoSomething(); //?
}
}
My current language design philosophy is that nullability should be something a programmer is forced to ask for, not given by default on reference types (in this, I agree with Tony Hoare - Google for his recent QCon talk).
On this specific example, with the non-nullable b2, it wouldn't even pass static checks: conservative analysis cannot guarantee that b2 isn't NULL, so the program is not semantically meaningful.
My ethos is simple enough. References are an indirection handle to some resource, which we can traverse to obtain access to that resource. Nullable references are either an indirection handle to a resource, or a notification that the resource is not available, and one is never sure up front which semantics are being used. This gives either a multitude of checks up front (Is it null? No? Yay!), or the inevitable NPE (or equivalent). Most programming resources are, these days, not massively resource constrained or bound to some finite underlying model - null references are, simplistically, one of...
Laziness: "I'll just bung a null in here". Which frankly, I don't have too much sympathy with
Confusion: "I don't know what to put in here yet". Typically also a legacy of older languages, where you had to declare your resource names before you knew what your resources were.
Errors: "It went wrong, here's a NULL". Better error reporting mechanisms are thus essential in a language
A hole: "I know I'll have something soon, give me a placeholder". This has more merit, and we can think of ways to combat this.
Of course, solving each of the cases that NULL currently caters for with a better linguistic choice is no small feat, and may add more confusion than it helps. We can always move to immutable resources, so NULL in its only useful states (error, and hole) isn't of much real use. Imperative techniques are here to stay, though, and I'm frankly glad - this makes the search for better solutions in this space worthwhile.
Having reference types be non-nullable by default is the only reasonable choice. We are plagued by languages and runtimes that have screwed this up; you should do the Right Thing.
This feature was in Spec#. They defaulted to nullable references and used ! to indicate non-nullables. This was because they wanted backward compatibility.
In my dream language (of which I'd probably be the only user!) I'd make the same choice as you, non-nullable by default.
I would also make it illegal to use the . operator on a nullable reference (or anything else that would dereference it). How would you use them? You'd have to convert them to non-nullables first. How would you do this? By testing them for null.
In Java and C#, the if statement can only accept a bool test expression. I'd extend it to accept the name of a nullable reference variable:
if (myObj)
{
// in this scope, myObj is non-nullable, so can be used
}
This special syntax would be unsurprising to C/C++ programmers. I'd prefer a special syntax like this to make it clear that we are doing a check that modifies the type of the name myObj within the truth-branch.
I'd add a further bit of sugar:
if (SomeMethodReturningANullable() into anotherObj)
{
// anotherObj is non-nullable, so can be used
}
This just gives the name anotherObj to the result of the expression on the left of the into, so it can be used in the scope where it is valid.
I'd do the same kind of thing for the ?: operator.
string message = GetMessage() into m ? m : "No message available";
Note that string message is non-nullable, but so are the two possible results of the test above, so the assignment is valid.
And then maybe a bit of sugar for the presumably common case of substituting a value for null:
string message = GetMessage() or "No message available";
Obviously or would only be validly applied to a nullable type on the left side, and a non-nullable on the right side.
(I'd also have a built-in notion of ownership for instance fields; the compiler would generate the IDisposable.Dispose method automatically, and the ~Destructor syntax would be used to augment Dispose, exactly as in C++/CLI.)
Spec# had another syntactic extension related to non-nullables, due to the problem of ensuring that non-nullables had been initialized correctly during construction:
class SpecSharpExampleClass
{
private string! _nonNullableExampleField;
public SpecSharpExampleClass(string s)
: _nonNullableExampleField(s)
{
}
}
In other words, you have to initialize fields in the same way as you'd call other constructors with base or this - unless of course you initialize them directly next to the field declaration.
Have a look at the Elvis operator proposal for Java 7. This does something similar, in that it encapsulates a null check and method dispatch in one operator, with a specified return value if the object is null. Hence:
String s = mayBeNull?.toString() ?: "null";
checks whether mayBeNull is null; if so it returns the string "null", and otherwise it returns the value of mayBeNull.toString(). Food for thought, perhaps.
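For comparison, C# later got essentially the same thing via the null-conditional (?.) and null-coalescing (??) operators:
// C# 6.0+ equivalent of the proposed Elvis operator:
string s = mayBeNull?.ToString() ?? "null";
// mayBeNull?.ToString() evaluates to null when mayBeNull is null,
// and ?? then substitutes the fallback value.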
A couple of examples of similar features in other languages:
boost::optional (C++)
Maybe (Haskell)
There's also Nullable<T> (from C#) but that is not such a good example because of the different treatment of reference vs. value types.
In your example you could add a conditional message send operator, e.g.
b?->DoSomething();
To send a message to b only if it is non-null.
Have nullability be a configuration setting, enforceable in the author's source code. That way, you will allow people who like nullable objects by default to enjoy them in their source code, while allowing those who would like all their objects to be non-nullable by default to have exactly that. Additionally, provide keywords or some other facility to explicitly mark which of your declarations of objects and types can be nullable and which cannot, with something like nullable and not-nullable, to override the global defaults.
For instance
/// "translation unit 1"
#set nullable
{ /// Scope of default override, making all declarations within the scope nullable implicitly
Bar bar; /// Can be null
non-null Foo foo; /// Overriden, cannot be null
nullable FooBar foobar; /// Overriden, can be null, even without the scope definition above
}
/// Same style for opposite
/// ...
/// Top-bottom, until reset by scoped-setting or simply reset to another value
#set nullable;
/// Nullable types implicitly
#clear nullable;
/// Can also use '#set nullable = false' or '#set not-nullable = true'. Ugly, but human mind is a very original, mhm, thing.
Many people argue that giving everyone what they want is impossible, but if you are designing a new language, try new things. Tony Hoare introduced the concept of null in 1965 because he could not resist (his own words), and we have been paying for it ever since (also his own words; the man regrets it). The point is, smart, experienced people make mistakes that cost the rest of us; don't take anyone's advice on this page as if it were the only truth, including mine. Evaluate and think about it.
I've read many many rants on how it's us poor inexperienced programmers who really don't understand where to really use null and where not, showing us patterns and antipatterns that are meant to prevent shooting ourselves in the foot. All the while, millions of still inexperienced programmers produce more code in languages that allow null. I may be inexperienced, but I know which of my objects don't benefit from being nullable.
Here we are, 13 years later, and C# did it.
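A minimal sketch of how that looks with C# 8's nullable reference types (compiler diagnostics paraphrased in the comments):
#nullable enable   // reference types are non-nullable unless marked with ?

class Bar { public void DoSomething() { } }

class Foo
{
    Bar? b;                  // may be null; must be checked before use
    Bar b2 = new Bar();      // non-nullable; the compiler warns if it might be left null

    void Use()
    {
        b.DoSomething();     // warning: possible dereference of a null reference
        if (b != null)
            b.DoSomething(); // fine - flow analysis narrows b to non-null here
        b2.DoSomething();    // fine
    }
}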
And, yes, this is the biggest improvement in languages since Barbara and Stephen invented types in 1974:
Programming With Abstract Data Types
Barbara Liskov
Massachusetts Institute of Technology
Project MAC
Cambridge, Massachusetts
Stephen Zilles
Cambridge Systems Group
IBM Systems Development Division
Cambridge, Massachusetts
Abstract
The motivation behind the work in very-high-level languages is to ease the programming task by providing the programmer with a language containing primitives or abstractions suitable to his problem area. The programmer is then able to spend his effort in the right place; he concentrates on solving his problem, and the resulting program will be more reliable as a result. Clearly, this is a worthwhile goal.
Unfortunately, it is very difficult for a designer to select in advance all the abstractions which the users of his language might need. If a language is to be used at all, it is likely to be used to solve problems which its designer did not envision, and for which the abstractions embedded in the language are not sufficient. This paper presents an approach which allows the set of built-in abstractions to be augmented when the need for a new data abstraction is discovered. This approach to the handling of abstraction is an outgrowth of work on designing a language for structured programming. Relevant aspects of this language are described, and examples of the use and definitions of abstractions are given.
I think null values are good: They are a clear indication that you did something wrong. If you fail to initialize a reference somewhere, you'll get an immediate notice.
The alternative would be that values are sometimes initialized to a default value. Logical errors are then a lot more difficult to detect, unless you put detection logic in those default values. This would be the same as just getting a null pointer exception.