Raku: return Type

I want to write a function returning an array all of whose subarrays must have a length of two.
For example, the return value will be [[1, 2], [3, 4]].
I define:
(1) subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
and
sub fcn(Int $n) of TArray is export(:fcn) {
[[1, 2], [3, 4]];
}
I find (1) over-complicated. Is there something simpler?

Stepping back first
subset TArray of Array where { .all ~~ subset :: where [Int, Int] };
Is there something simpler?
Before we go there, let's step back. Even ignoring your code's "overly-complicated" nature based on just looking at it, it's also potentially problematic and complicated for various reasons that may not be so obvious. I'll highlight three:
This subset will accept an Array containing Arrays, with each of those arrays containing two Ints. But it doesn't mandate an Array[Array[Int]]. The outer Array's type may be just a generic Array, rather than being an Array[Array], let alone an Array[Array[Int]]. Indeed it will be unless you deliberately introduce strongly typed values. I will cover strong typing in the last section of this answer.
What about an empty Array? Your subset will accept that. Is that your intent? If not, what about requiring at least one pair of Ints?
The outer where clause uses a common Raku idiom of the form .all ~~ ..., with a junction on the left hand side of the ~~ smart match operator. Astonishingly, per an issue I just filed, this may be a problem. What alternatives are there?
Starting simple
Raku does a decent job of keeping simple things simple. If we put aside any artificial desire for strong typing, and focus on simple tools for tightening code up, a simple subset I would have suggested in the past would be:
subset TArray where .all == 2; # BAD despite being idiomatic???
This has all of the problems your original code has, and in addition it accepts data that has non-integers where integers belong.
But it does have the redeeming qualities that it does a useful check (that the inner arrays each have two elements) and it's significantly simpler than your code.
Now I've reminded myself that I need to view .all on the left hand side of ~~ as possibly a problem, I'll instead write it as:
subset TArray where 2 == .all; # Potentially the new idiomatic.
This version reads more poorly, but, while readability is important, basic correctness is more important.
Still fairly simple, and fewer problems
Here are two variants I came up with:
subset TArray where all .map: * ~~ (Int,Int);
subset TArray where .elems == .grep: (Int,Int);
These both avoid the junction/smartmatch problem. (The first where expression does have a junction to the left of a smart match, but it's not an example of the problem.)
The second version isn't so obviously correct (think of it as checking that the count of subarrays is the same as the count of subarrays that match (Int,Int)) but it nicely lends itself to fixing the problem of matching if there are zero subarrays, if that were to need fixing:
subset TArray where 0 < .elems == .grep: (Int,Int);
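For readers who don't know Raku, here is a rough Python sketch of what that last where clause checks. It is an analogy, not a translation: the function name is_pair_array is made up for illustration, and Python lists stand in for Raku Arrays.

```python
def is_pair_array(value):
    """Return True for a non-empty list whose items are all two-integer lists."""

    def is_int_pair(item):
        return (
            isinstance(item, list)
            and len(item) == 2
            # bool is a subclass of int in Python, so exclude it explicitly
            and all(isinstance(x, int) and not isinstance(x, bool) for x in item)
        )

    if not isinstance(value, list):
        return False
    # Same shape as `0 < .elems == .grep: (Int,Int)`:
    # at least one item, and every item counted is a matching pair.
    matching = [item for item in value if is_int_pair(item)]
    return 0 < len(value) == len(matching)
```

Comparing the total count with the count of matching items is what makes the "reject an empty array" requirement a one-token change, just as in the Raku version.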
Strong typing solutions
The solutions thus far don't deal with strong typing. Perhaps that's desirable. Perhaps not.
To understand what I mean by this, let's first look at literals:
say WHAT 1; # (Int)
say WHAT [1,2]; # (Array)
say WHAT [[1,2],[3,4]]; # (Array)
These values have types determined by their literal constructors.
The last two are just Arrays, generic over their elements.
(The second is not an Array[Int], which might be expected. Similarly the last one is not an Array[Array[Int]].)
Current built in Raku literal forms for composite types (arrays and hashes) all construct generic Arrays which do not constrain the types of their elements.
See the PR Introduce [1,2,3]:Int syntax #4406 for a proposal/PR regarding element typed composite literals and a related issue I just posted in response to your Q here about an alternative and/or complementary approach to that PR. (There have been discussions over the years about this aspect of the type system but it seems like it's time for Rakoons to look at addressing it.)
What if you wanted to build a strongly typed data structure as the value to return from your routine, and to have the return type check that?
Here's one way one might build such a strongly typed value:
my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
Super verbose! But now you could write the following for your sub's return type check and it'll work:
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
sub fcn(Int $n) of TArray is export(:fcn) {
my Array[Array[Int]] $result .= new: Array[Int].new(1,2), Array[Int].new(3,4);
}
Another way to build a strongly typed value is to specify not only the strong typing in a variable's type constraint, but also coercion typing to bridge from a loosely typed value to a strongly typed target.
We keep the exact same subset (that establishes the strongly typed target data structure and adds "refinement typing" checks):
subset TArray of Array[Array[Int]] where 0 < .elems == .grep: (Int,Int);
But instead of using a verbose correct-by-construction initialization value, using full type names and news, we introduce additional coercion typing and then just use ordinary literal syntax:
constant TArrayInitialization = TArray(Array[Array[Int]()]());
sub fcn(Int $n) of TArray is export(:fcn) {
my TArrayInitialization $result = [[1,2],[3,4]];
}
(I could have written the TArrayInitialization declaration as another subset, but it would be a slight overkill to have done so. A constant does the job with less fuss.)

I gather that the aim is to restrict the type of the inner Array to [Int,Int] ... the closest I can get to this is to declare two subsets, one based on the other...
subset IArray where * ~~ [Int, Int];
subset TArray where .all ~~ IArray;
Otherwise, the anonymous subset form you use seems to be the briefest, although as @raiph points out you can drop the 'of Array' piece.

If you wanted to impose this sort of constraint on a function's parameter (rather than its return type) you could do so with something like:
sub fcn(@a where {all .map: * ~~ [Int, Int]}) {...}
As the other answers have mentioned, there currently isn't great syntax for similarly constraining the return type, but there's a proposal to add support for similar syntax for return types. In fact, as mentioned in that issue, someone has volunteered to work on an implementation but hasn't yet made any progress as far as I know. (And I guess I should know, since I was that volunteer… oops)
So, for now, a subset is the best option – but hopefully the future will have even better ways to write that.

Related

Raku: how do I make an argument optional, have a default, with a where test?

Can't find a way to get this to work:
sub triple(Str:D $mod where * ~~ any @modifiers = 'command' ) { }
If I don't pass in an argument, I get an error:
Too few positionals passed; expected 1 argument but got 0
With a question mark after $mod:
sub triple(Str:D $mod? where * ~~ any @modifiers = 'command' ) { }
I get:
Constraint type check failed in binding to parameter '$mod'; expected anonymous constraint to be met but got Str (Str)
Looks like it may have been a precedence problem. This works:
sub triple(Str:D $mod? where (* ~~ any @modifiers) = 'command' ) {}
TL;DR You've identified the problem in your answer -- precedence -- and provided a solution. This answer covers what happened; why the precedence issue arises; why Raku's grammar/parser didn't just get it right; and lists some solutions, a couple of which I'll start with.
Instead of:
sub triple(Str:D $mod? where * ~~ any @modifiers = 'command' ) { }
I suggest moving the any and writing one of these:
sub triple(Str:D $mod? where * ~~ @modifiers.any = 'command' ) { }
sub triple(Str:D $mod? where @modifiers.any = 'command' ) { }
What happened
The = ... at the end of the where clause is parsed as an assignment (to @modifiers) instead of as a default value (for $mod):
@modifiers = 'command' is evaluated, overwriting whatever values @modifiers had.
The any creates a junction with one element ('command').
Now the only argument that triple will accept is 'command'.
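To make the two readings concrete, here is a small Python sketch that mimics their effects. It is only an analogy: Python has no where clauses or junctions, and the modifier values and function names are invented for illustration.

```python
MODIFIERS = ["shift", "command", "option"]  # hypothetical values


def triple_intended(mod="command"):
    """Intended reading: match against the full list, default to 'command'."""
    return mod in MODIFIERS


def triple_as_parsed(mod, modifiers):
    """Reading the parser chose: assign first, then match against the result.

    Note there is no default for `mod` here, which is why calling the
    original sub without an argument failed.
    """
    modifiers[:] = ["command"]  # the assignment overwrites the list in place
    return mod in modifiers
```

With the second reading, "shift" is rejected even though it was in the list, and the caller's list is left holding only "command".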
Why the precedence issue arises
Raku's grammar is designed to have nice ergonomics. This includes design details that reduce the need for parens and braces. Overall these design details yield a big net win. But there are wrinkles, and you've encountered one.
Raku lets one write where ... to specify a where clause without requiring that one uses an explicit braced lambda ({...}) for the ... bit. One can even create a lambda using just a *. Nice! But where does the lambda end? If you use explicit braces, it's clear. If not, what determines the end of the lambda?
More generally, shouldn't the parser just know that the = in = 'command' is not part of any lambda? That it should instead just automatically finish a where clause if there is one before parsing the = ... part? That the = ... should always be parsed as a default value for a parameter?
One can easily see the ambiguity (once one's attention is drawn to it), and so does/could Raku's grammar/parser. It just needs to resolve that ambiguity either by rejecting such syntax, demanding the coder explicitly disambiguates (eg with parens, as you've done), or by choosing which way to parse.
What Raku's grammar/parser does in the face of the ambiguity is choose. And it chooses wrong. (Unless of course one wanted it to be an assignment of some value on the ='s left, not a default for a parameter, though that's gotta be pretty unlikely.)
Why Raku's grammar/parser didn't just get it right
Why doesn't the parser reject this code as being too ambiguous, or be smart enough to choose the "it's a default" interpretation? It certainly could -- the Raku grammar/parser feature is Turing complete, equivalent in power to the unrestricted grammars category in the Chomsky parsing hierarchy -- so why doesn't it just get it right?
In a nutshell, it gets it right the right amount, at least imo. But that's subjective, oddly worded, and vague, so it's probably not a satisfactory summary. So I'll try to provide a bit more detail in the hope it's more informative.
Every Raku design decision is discussed openly, and there are searchable public records of essentially all of it. To dig into these discussions I recommend starting out with Liz++'s awesome IRC log service, and within the numerous channels listed, focusing on the #perl6 logs that ran from 2005 thru 2019 or so.
Although I've been around for many of the Raku design discussions of the last 20 years, I don't have a good recollection of all the discussions surrounding this decision about the ambiguity of meaning of a = ... at the end of a where clause, and what to do about it. And I haven't myself recently done the digging I suggest; for now I'll leave that for any interested readers. Instead I will outline what I think will have been some contributing factors:
Single pass parsing
Raku's "braided" approach to language design requires single pass parsing.
Longest parsing orientation
Longest Token Matching is all but essential for user definable braiding (see link in previous point) to be truly viable. LTM reflects a general principle that humans naturally tend toward recognizing the longest "token" (within reason of course). That is to say, if one sees $100, it strikes one cognitively as a hundred dollars, not as a dollar sign, a 1, a 0, and then another 0.
A similar deal applies to parsing of a string of tokens (again, within reason); if it weren't for the fact one learns to think of = ... as specifying a parameter's default, the @modifiers = 'command' would naturally be read as being an assignment into @modifiers.
Limited backtracking
Backtracking is slow, pathological backtracking utterly evil. So Raku's grammar/parser avoids potentially backtracking in all but three cases for which it really is the right solution, and entirely avoids any risk of pathological backtracking.
Handling ambiguity
While artificial languages can aim to expunge all ambiguity, the closer one gets to eliminating all ambiguity, the greater the amount of extraneous and distracting syntax one requires, such as frequent required use of delimiters (parens, braces, square brackets, etc) to ensure disambiguation. That makes a language increasingly unfriendly and verbose for that reason instead. Raku culture avoids ideological "foolish consistency" extremes.
Raku's designers (principally Larry Wall) considered all these factors and many more and arrived at Raku's solution:
Be rationally predictable
A sufficiently simple and predictable approach to parsing, and requisite sensitivity to the likelihood and costs of any surprises a user may encounter, goes a long way, and the design relative to where clauses is a case in point.
While the precedence issue may have been a surprise, and the error message unhelpful, I, er, predict you'll find your ERN signal regarding this will tune up quite nicely in fairly short order, just as it will for most of the things that might trip you up as you learn Raku.
Use predictive parsing
While there are several ways to accommodate all of the above, predictive parsing (see footnote 1) is a great choice, and -- not coincidentally! -- the one most naturally written using Raku grammars, and the one used for Raku's own grammar/parser.
Some other solutions
Here's what does not work as expected:
sub triple(Str:D $mod? where * ~~ any @modifiers = 'command' ) { }
                                                ^ Needs to be end of `where` clause
You've suggested a solution, and I suggested a couple at the start. Some more follow.
You used parens. Here are some other ways to use parens:
sub triple(Str:D $mod? where * ~~ any(@modifiers) = 'command' ) { }
sub triple(Str:D $mod? where * ~~ (any @modifiers) = 'command' ) { }
Or, switch to use of $_ (aka "it") instead of * (aka "whatever") inside braces:
sub triple(Str:D $mod? where { $_ ~~ any @modifiers } = 'command' ) { }
Footnotes
1 The Wikipedia page discusses "grammars" and "ambiguity" in a manner that may be confusing given that they are not used in the same way those words are used in the context of Raku and of this answer. But discussing that would be a rabbit hole inappropriate for this SO.

Does Baggy add (+) work on MixHash weights?

I am using a MixHash to combine two Hashes with the Bag add (+) operator. This seems to work - but ... I am a bit surprised that the result of the union needs to be re-coerced back to a MixHash.
My guess is that the Bag add (+) infix operator coerces everything to a Bag first and returns the result as a Bag. This may be risky for me as some of my weights are negative (thus the Mix in the first place). Will this properly add negative weights?
Alternatively, is there a Mix add (+) operator?
my MixHash $dim-mix;
for ... {
my $add-mix = $!dims.MixHash;
$dim-mix = $dim-mix ?? ( $dim-mix (+) $add-mix ).MixHash !! $add-mix;
}
dd $dim-mix;
Now I look at this paraphrased code, perhaps there is some formulation of ternary ?? !! that can avoid spelling out $dim-mix in the test since already on the left?
Many thanks for any advice!
my $add-mix = (foo => 0.22, bar => -0.1).Mix;
my $dim-mix;
for ^5 {
$dim-mix (+)= $add-mix;
}
dd $dim-mix; # Mix $dim-mix = ("foo"=>1.1,"bar"=>-0.5).Mix
Obviously I've not used a MixHash, but you can sort that out if you need to after the loop.
(And of course you might be thinking "but isn't a Mix immutable?" It is -- but you have to distinguish variables and values. $dim-mix is a variable, a Scalar variable. Even if you type it -- my Mix $dim-mix; it's still a Scalar variable holding a Mix value. You can always assign to a Scalar.)
I'm starting to get a routine for questions like this where I don't know what's going on but I think I ought to be able to figure it out. Here was my process:
I got your code to run to see what it did. I tried to simplify the ternary. Hmm.
I turned to the doc. There was the doc page for (+). That called it "Baggy addition". That was worrisome given that a Bag only holds (positive) integers.
I turned to the source. I fired off a search of the rakudo sources for "Baggy addition". One result. I focused on the multi with (Mixy:D $a, QuantHash:D $b) signature. This showed me that the result should stay Mixy, i.e. the doc's implication it would or could go Baggy is a red herring.
I returned to the code and started wondering what I could do. When I initially tried to use (+)= to simplify the main assignment the compiler complained expected MixHash but got Mix. I tried a half dozen things that didn't work then just changed the MixHash constraint on $dim-mix to Mixy and it worked.
Then I thought through what was going on and realized that almost all the types were getting in the way of P6 just doing the right thing.
You can add some types back in if you really need them.
(But do you really need them? When types are absolutely necessary they're great. Otherwise, imo, think twice, and then twice again, before introducing them. They can easily make code harder to read, reason about, compose, and slower.)
(Of course there are occasions on which they're not strictly necessary but do really help overall. Imo, as with all things, keep it simple at first and only complexify if you see clear benefits for a particular line of code.)
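The worry in the question, that a Bag-style addition might lose negative weights, can be illustrated outside Raku. The Python sketch below contrasts collections.Counter, whose + operator keeps only positive totals, with a plain per-key sum that keeps negatives. It is an analogy for the Bag/Mix distinction, not a description of Rakudo's implementation, and the helper mix_add is a made-up name.

```python
from collections import Counter


def mix_add(a, b):
    """Add weights key by key, keeping negative totals and dropping exact zeros."""
    total = dict(a)
    for key, weight in b.items():
        total[key] = total.get(key, 0) + weight
    return {key: weight for key, weight in total.items() if weight != 0}


# Counter's `+` behaves like a Bag: totals that are not positive are discarded.
bag_like = Counter({"foo": 2, "bar": -1}) + Counter({"foo": 1, "bar": -1})

# A Mix-style addition keeps the negative total for "bar".
mix_like = mix_add({"foo": 2, "bar": -1}, {"foo": 1, "bar": -1})
```

Here bag_like ends up with only foo => 3, while mix_like also keeps bar => -2, which is the behaviour the Mixy candidate of (+) preserves.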

Why does this predicate leave behind a choicepoint?

I've written the following predicate:
list_withoutlast([_Last], []). % forget the last element
list_withoutlast([First, Second|List], [First|WithoutLast]) :-
list_withoutlast([Second|List], WithoutLast).
Queries like list_withoutlast(X, [1, 2]). succeed deterministically, but queries like list_withoutlast([1, 2, 3], X) leave behind a choicepoint, even though there's only one answer.
When I trace it seems to be that SWI attempts to match list_withoutlast([3], Var) against both clauses, even though definitely only the first one will ever match!
Is there something else I can do to tell SWI that I want a list with more than one element? Or, if I want to take advantage of first-argument indexing, are my only options "zero-length list" and "non-zero length list"?
Do other Prologs handle this situation any differently?
You can rewrite your predicate to avoid the spurious choice point:
list_withoutlast([Head| Tail], List) :-
list_withoutlast(Tail, Head, List).
list_withoutlast([], _, []).
list_withoutlast([Head| Tail], Previous, [Previous| List]) :-
list_withoutlast(Tail, Head, List).
This definition takes advantage of first-argument indexing, which in the list_withoutlast/3 predicate distinguishes the first clause, which has an atom (the empty list) in the first argument, from the second clause, which has a (non-empty) list in the first argument.
Passing the head and tail of an input list argument as separate arguments to an auxiliary predicate is a common Prolog programming idiom to take advantage of first-argument indexing and avoid spurious choice-points.
Note that most Prolog systems don't apply deep term indexing. In particular, for compound terms, indexing usually only takes into account the name and arity and doesn't take into account the compound term arguments (a list with one element and a list with two or more elements share the same functor).
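The data flow of that rewrite, where an auxiliary routine carries the previous element and decides only on "empty" versus "non-empty", can be sketched in Python. Python has no clause indexing or choice points, so this shows the shape of the algorithm only; the function names are invented for illustration.

```python
def list_without_last(items):
    """Return all elements except the last; an empty input is an error,
    mirroring the Prolog predicate, which simply fails on []."""
    if not items:
        raise ValueError("list must have at least one element")
    head, *tail = items
    return _without_last(tail, head)


def _without_last(rest, previous):
    # First clause: nothing left, so `previous` was the last element; drop it.
    if not rest:
        return []
    # Second clause: emit `previous`, carry the new head forward.
    head, *tail = rest
    return [previous] + _without_last(tail, head)
```

Each step of the helper inspects only whether its first argument is empty, which is the kind of test first-argument indexing can resolve without leaving an alternative open.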

check if 2 linked list have the same elements regardless of order

Is there any way to check if 2 linked lists have the same elements regardless of order.
edit question:
I have fixed the code and given some more details:
this is the method that compares 2 lists
compare: object2
^ ((myList asBag) = ((object2 getList) asBag)).
the method belongs to the class myClass that has a field: myList. myList is a linkedList of type element.
I have compiled it in the workspace:
a := element new id: 1.
b := element new id: 2.
c := element new id: 3.
d := element new id: 1.
e := element new id: 2.
f := element new id: 3.
elements1 := myClass new.
elements1 addFirst: a.
elements1 addFirst: b.
elements1 addFirst: c.
elements2 := myClass new.
elements2 addFirst: d.
elements2 addFirst: e.
elements2 addFirst: f.
Transcript show: (elements1 compare: elements2) printString.
so I am getting false.. seems like it checks for equality by reference rather than equality by value..
So I think the correct question to ask would be: how can I compare 2 Bags by value? I have tried the '=='..but it also returned false.
EDIT:
The question changed too much - I think it deserves a new question for itself.
The whole problem here is that (element new id: 1) = (element new id: 1) is giving you false. Unless its particular class (or a superclass) redefines it, the = message is resolved by comparing identity (==) by default. That's why your code only works when a collection is compared with itself.
Test it with, for example, lists of numbers (which have the = method redefined to reflect what humans understand by numeric equality), and it will work.
You should redefine your element class's = (and hash) methods for this to work.
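The same effect is easy to reproduce in another language. In the Python sketch below, `__eq__` and `__hash__` play the role of Smalltalk's = and hash, and collections.Counter plays the role of a Bag; the class and function names are made up for illustration.

```python
from collections import Counter


class IdentityElement:
    """No equality override: instances compare (and hash) by identity."""

    def __init__(self, id):
        self.id = id


class ValueElement:
    """Equality and hash are both based on the id field."""

    def __init__(self, id):
        self.id = id

    def __eq__(self, other):
        return isinstance(other, ValueElement) and self.id == other.id

    def __hash__(self):
        return hash(self.id)


def same_elements(first, second):
    """Compare two sequences as bags: same elements, any order."""
    return Counter(first) == Counter(second)
```

Two lists of IdentityElement objects with the same ids compare as different bags, while the equivalent ValueElement lists compare as equal regardless of order. Equality and hash have to be redefined together, because hashed collections look elements up by hash first.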
Smalltalk handles everything by reference: all that exists are objects, which know (reference) other objects.
It would be wrong to say that two lists are equivalent if they are in different order, as the order is part of what a list means. A list without an order is what we call a bag.
The asBag message (as all of the other as<anotherCollectionType> messages) return a new collection of the named type with all the elements of the receiver. So, #(1 2 3 2) is an Array of four elements, and #(1 2 3 2) asBag is a bag containing those four elements. As it's a Bag, it doesn't have any particular order.
When you do bagA := Bag new. you are creating a new Bag instance, and reference it with bagA variable. But then you do bagA := myList asBag, so you lose the reference to the previous bag - the first assignment doesn't do anything useful in your code, as you don't use that bag.
Saying aBool ifTrue: [^true] ifFalse: [^false] has exactly the same meaning as saying ^aBool - so we prefer just to say that. And, as you only create those two new bags to compare them, you could simplify your whole method like this:
compareTo: anotherList
^ myList asBag = anotherList asBag
Read it out loud: this object (whatever it is) compares to another list if its list, without considering order, is the same as the other list without order.
The name compareTo: is kind of weird for returning a boolean (containsSameElements: would be more descriptive), but you get the point much faster with this code.
Just to be precise about your questions:
1) It doesn't work because you're comparing bag1 and bag2, but just defined bagA and bagB.
2) It's not efficient to create those two extra bags for no reason, and to send the pointless ifTrue: message, but otherwise it's OK. You may implement a better way to compare the lists, but it's way better to rely on the implementation of asBag and the Bag's = message being performant.
3) I think you could see the asBag source code, but, yes, you can assume it to be something like:
Collection>>asBag
|instance|
instance := Bag new.
instance addAll: self.
^instance
And, of course, the addAll: method could be:
Collection>>addAll: anotherCollection
anotherCollection do: [ :element | self add: element ]
So, yes - it creates a new Bag with all the receiver's elements.
mgarciaisaia's answer was good... maybe too good! This may sound harsh, but I want you to succeed if you're serious about learning, so I reiterate my suggestion from another question that you pick up a good Smalltalk fundamentals textbook immediately. Depending on indulgent do-gooders to rework your nonsensical snippets into workable code is a very inefficient way to learn to program ;)
EDIT: The question has changed dramatically. The following spoke to the original three-part question, so I paraphrased the original questions inline.
Q: What is the problem? A: The problem is lack of fundamental Smalltalk understanding.
Q: Is converting to bags an efficient way to make the comparison? A: Although it's probably not efficient, don't worry about that now. In general, and especially at the beginning when you don't have a good intuition about it, avoid premature optimization - "make it work", and then only "make it fast" if justified by real-world profiling.
Q: How does #asBag work? A: The implementation of #asBag is available in the same living world as your own code. The best way to learn is to view the implementation directly (perhaps by "browsing implementors" if you aren't sure where it's defined) and answer your own question!! If you can't understand that implementation, see #1.

How to write additional methods in Smalltalk Collections which work only for Numeric Inputs?

I want to add a method "average" to array class.
But average doesn't make any sense if input array contains characters/strings/objects.
So I need to check if array contains only integers/floats.
Smalltalk says datatype check [checking if variable belongs to a particular datatype like int string array etc... or not] is a bad way of programming.
So what is best way to implement this?
The specification is somewhat incomplete. You'd need to specify what behavior the collection should show when you use it with non-numeric input.
There are a huge number of possibly desirable behaviors. Smalltalk supports most of them, except for the static typing solution (throw a compile-time error when you add a non-numeric thing to a numeric collection).
If you want to catch non-numeric objects as late as possible, you might just do nothing - objects without arithmetic methods will signal their own exceptions when you try arithmetic on them.
If you want to catch non-numeric elements early, implement a collection class which ensures that only numeric objects can be added (probably by signaling an exception when a non-numeric object is added).
You might also want to implement "forgiving" methods for sum or average that treat non-numeric objects as either zero-valued or non-existing (does not make a difference for #sum, but for #average you would only count the numeric objects).
In pharo at least there is
Collection >> average
^ self sum / self size
in the Collections-arithmetic category. When you work with a statically typed language, the language stops you when you add non-number values to the collection. In a dynamically typed language, the same happens when you try to calculate the average of inappropriate elements, i.e. when you try to send +, - or / to an object that does not understand it.
Don't think about where you put the data; think about what you are doing with it.
It's reasonable to check type if you want to do different things, e.g.:
(obj isKindOf: Number) ifTrue: [obj doItForNum].
(obj isKindOf: Array) ifTrue: [obj doItForArr].
But in this case you want to move the logic of type checking into the object-side.
So in the end it will be just:
obj doIt.
and then you'll have also something like:
Number >> doIt
"do something for number"
Array >> doIt
"do something for array"
(a bright example of this is the printOn: method)
I would have thought the Smalltalk answer would be to implement it for numbers, then be mindful not to send a collection of pets #sum or #average. Of course, if there later becomes a useful implementation for a pet to add itself to another pet or even an answer to #average, then that would be up to the implementer of Pet or PetCollection.
I did a similar thing when I implemented trivial algebra into my image. It allowed me to mix numbers, strings, and symbols in simple math equations. 2 * #x resulted in 2x. x + y resulted in x + y. It's a fun way to experiment with currencies by imagining algebra happening in your wallet. Into my wallet I deposit (5 * #USD) + (15 * #CAN) for 5USD + 15CAN. Given an object that converts between currencies I can then answer what the total is in either CAN or USD.
We actually used it for supply-chain software for solving simple weights and measures. If a purchase order says it will pay XUSD/1TON of something, but the supplier sends foot-lbs of that same thing, then to verify the shipment value we need a conversion between ton and foot-lbs. Letting the library reduce the equation, we're able to produce a result without altering the input data, and without having to come up with new objects representing tons and foot-pounds or anything else.
I had high ambitions for the library (it was pretty simple) but alas, 2008 erased the whole thing...
"I want to add a method "average" to array class. But average doesn't make any sense if input array contains characters/strings/objects. So I need to check if array contains only integers/floats."
There are many ways to accomplish the averaging of the summation of numbers in an Array while filtering out non-numeric objects.
First I'd make it a more generic method by lifting it up to the Collection class so it can find more cases of reuse. Second I'd have it be generic for numbers rather than just floats and integers, oh it'll work for those but also for fractions. The result will be a float average if there are numbers in the collection array list.
(1) When adding objects to the array test them to ensure they are numbers and only add them if they are numbers. This is my preferred solution.
(2) Use the Collection #select: instance method to filter out the non-numbers leaving only the numbers in a separate collection. This makes life easy at the cost of a new collection (which is fine unless you're concerned with large lists and memory issues). This is highly effective, easy to do and a common solution for filtering collections before performing some operation on them. Open up a Smalltalk and find all the senders of #select: to see other examples.
| list numberList sum average |
list := { 100. 50. 'string'. Object new. 1. 90. 2/3. 88. -74. 'yup' }.
numberList := list select: [ :each | each isNumber ].
sum := numberList sum.
average := sum / (numberList size) asFloat.
Executing the above code with "print it" will produce the following for the example array list:
36.523809523809526
However if the list of numbers is of size zero, empty in other words then you'll get a divide by zero exception with the above code. Also this version isn't on the Collection class as an instance method.
(3) Write an instance method for the Collection class to do your work of averaging for you. This solution doesn't use the select since that creates intermediate collections and if your list is very large that's a lot of extra garbage to collect. This version merely loops over the existing collection tallying the results. Simple, effective. It also addresses the case where there are no numbers to tally in which case it returns the nil object rather than a numeric average.
Collection method: #computeAverage
"Compute the average of all the numbers in the collection. If no numbers are present return the nil object to indicate so, otherwise return the average as a floating point number."
| sum count average |
sum := 0.
count := 0.
self do: [ :each |
each isNumber ifTrue: [
count := count +1.
sum := sum + each.
]
].
count > 0 ifTrue: [
^average := sum / count asFloat
] ifFalse: [
^nil
]
Note the variable "average" is just used to show the math, it's not actually needed.
You then use the above method as follows:
| list averageOrNil |
list := { 100. 50. 'string'. Object new. 1. 90. 2/3. 88. -74. 'yup' }.
averageOrNil := list computeAverage.
averageOrNil ifNotNil: [ "got the average" ] ifNil: [ "there were no numbers in the list" ].
Or you can use it like so:
{
100. 50. 'string'. Object new. 1. 90. 2/3. 88. -74. 'yup'
} computeAverage
ifNotNil: [:average |
Transcript show: 'Average of list is: ', average printString
]
ifNil: [Transcript show: 'No numbers to average' ].
Of course if you know for sure that there are numbers in the list then you won't ever get the exceptional case of the nil object and you won't need to use an if message to branch accordingly.
Data Type/Class Checking At Runtime
As for the issue you raise, "Smalltalk says datatype check [checking if variable belongs to a particular datatype like int string array etc... or not] is a bad way of programming", there are ways to do things that are better than others.
For example, while one can use #isKindOf: Number to ask each element whether it's a number, it's not the best way to determine the "type" or "class" at runtime, since it locks in a predetermined type or class as a parameter to the #isKindOf: message.
It's way better to use an "is" "class" method such as #isNumber so that any class that is a number replies true and all other objects that are not numeric reply false.
A main point of style in Smalltalk when it comes to ascertaining the types or classes of things is that it's best to use message sending with a message that the various types/classes comprehend but behave differently rather than using explicit type/class checking if at all possible.
The method #isNumber is an instance method on the Number class in Pharo Smalltalk and it returns true while on the Object instance version it returns false.
Using polymorphic message sends in this way enables more flexibility and eliminates code that is often too procedural or too specific. Of course it's best to avoid doing this but reality sets in in various applications and you have to do the best that you can.
This is not the kind of thing you do in Smalltalk. You could take suggestions from the above comments and "make it work" but the idea is misguided (from a Smalltalk point of view).
The "Smalltalk" thing to do would be to make a class that could perform all such operations for you -- computing the average, mean, mode, etc. The class could then do the proper checking for numerical inputs, and you could write how it would respond to bad input. The class would use a plain old array, or list or something. The name of the class would make it clear what its usage would be. The class could then be part of your deployment and could be exported/imported to different images as needed.
Make a new collection class; perhaps a subclass of Array, or perhaps of OrderedCollection, depending on what collection related behaviour you want.
In the new class' at:put: and/or add: methods test the new item for #isNumber and return an error if it fails.
Now you have a collection you can guarantee will have just numeric objects and nils. Implement your required functions in the knowledge that you won't need to deal with trying to add a Sealion to a Kumquat. Take care with details though; for example if you create a WonderNumericArray of size 10 and insert two values into it, when you average the array do you want to sum the two items and divide by two or by ten?
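As a rough sketch of that design, here is the idea in Python rather than Smalltalk. The class name NumericList and its methods are invented for illustration, numbers.Number stands in for the #isNumber test, and the average divides by the number of items actually added, which is only one of the two choices raised above.

```python
from numbers import Number


class NumericList:
    """A collection that refuses non-numeric items at insertion time."""

    def __init__(self, items=()):
        self._items = []
        for item in items:
            self.add(item)

    def add(self, item):
        # bool is a Number in Python, so reject it explicitly (an assumption
        # about what "numeric" should mean here).
        if isinstance(item, bool) or not isinstance(item, Number):
            raise TypeError(f"NumericList only accepts numbers, got {item!r}")
        self._items.append(item)

    def average(self):
        """Return the mean of the stored numbers, or None when empty."""
        if not self._items:
            return None
        return sum(self._items) / len(self._items)
```

Because the check happens in add, average never has to filter or guard against a string or arbitrary object sneaking into the arithmetic.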