Smalltalk Pharo String concatenation instead of streams - smalltalk

why would the following code:
| list types |
list := Heap new.
types := #('a' 'b' 'c').
types do:[ :t |
1 to:9 do:[ :i |
list add:(t, i asString).
].
].
^ list
issue the String concatenation instead of streams warning in a method in Pharo?
Clicking the [?] button shows:
String concatenation instead of streams
Check for code using string concatenation inside some iteration message.
Am I doing something that can be done easier with streams? What I want to achieve is to create a list of all values a1 to a9, b1 to b9 and c1 to c9.

It complains because of the part t, i asString that is inside a collection loop (you can look at the actual implementation of the rule in the class RBStringConcatenationRule.
Normally string concatenation is discouraged because it's slower and more memory intense (IIRC about the memory).
So if you are doing some heavy concatenation (connecting a lots of parts into a single string), stream is preferable: you can look at most printOn: methods in the system to see it in action.
However in trivial cases concatenation with , is just fine, the warning rule is just too broad. Warnings are just that... warnings that something might be wrong, or that something might be written better.
Speaking of better writing, in Smalltalk it is preferable to use specialized collection methods (select:,collect:,...) over the overly-generic do:, e.g.
| list types |
types := #('a' 'b' 'c').
list := types flatCollect: [ :t | (1 to: 9) collect: [ :i | t , i asString ].
^ Heap withAll: list
(and if you don't need Heap you can just return the third line directly and not have the list tempvar.

To mininize the number of created objects, you could do like this:
| list types digits |
list := Heap new.
types := #($a $b $c).
digits := (1 to: 9) collect: #asCharacterDigit.
types do: [ :t |
digits do: [ :d |
list
add: ((String new: 2)
at: 1 put: t;
at: 2 put: d;
yourself)
] ].
^ list
So you are not creating interim strings and intervals. Nor conversions from integer to string.

Related

Can one create a standalone method/function (without any class)

I am trying to understand smalltalk. Is it possible to have a standalone method/function, which is not part of any particular class, and which can be called later:
amethod ['amethod called' printNl].
amethod.
Above code gives following error:
simpleclass.st:1: expected Eval, Namespace or class definition
How can I use Eval or Namespace as being suggested by error message?
I tried following but none work:
Eval amethod [...
amethod Eval [...
Eval amethod Eval[... "!"
Eval [... works but I want to give a name to the block so that I can call it later.
Following also works but gets executed immediately and does not execute when called later.
Namespace current: amethod ['amethod called' printNl].
Thanks for your insight.
In Smalltalk the equivalent to a standalone method is a Block (a.k.a. BlockClosure). You create them by enclosing Smalltalk expressions between square brackets. For example
[3 + 4]
To evaluate a block, you send it the message value:
[3 + 4] value
which will answer with 7.
Blocks may also have arguments:
[:s | 3 + s]
you evaluate them with value:
[:s | 3 + s] value: 4 "answers with 7"
If the block has several sentences, you separate them with a dot, as you would do in the body of a method.
Addendum
Blocks in Smalltalk are first class objects. In particular, one can reference them with variables, the same one does with any other objects:
three := 3.
threePlus := [:s | three + s].
for later use
threePlus value: 4 "7"
Blocks can be nested:
random := Random new.
compare := [:p :u | u <= p]
bernoulli60 := [compare value: 0.6 value: random next].
Then the sequence:
bernoulli60 value. "true"
bernoulli60 value. "false"
...
bernoulli60 value. "true"
will answer with true about 60% of the times.
Leandro's answer, altough being correct and with deep smalltalk understanding, is answering what you asked for, but I think, not 100% sure thou, you are actually asking how to "play" around with a code without the need to create a class.
In my eyes want you want is called a Workspace (Smalltalk/X and Dolphin) (it can have different names like Playground in Pharo Smalltalk).
If you want to play around you need to create a local variable.
| result |
result := 0. "Init otherwise nil"
"Adding results of a simple integer factorial"
1 to: 10 do: [ :integer |
result := result + integer factorial
].
Transcript show: result.
Explanation:
I'm using a do: block for 1-10 iterration. (:integer is a block local variable). Next I'm, showing the result on Transcript.

menhir - associate AST nodes with token locations in source file

I am using Menhir to parse a DSL. My parser builds an AST using an elaborate collection of nested types. During later typecheck and other passes in error reports generated for a user, I would like to refer to source file position where it occurred. These are not parsing errors, and they generated after parsing is completed.
A naive solution would be to equip all AST types with additional location information, but that would make working with them (e.g. constructing or matching) unnecessary clumsy. What are the established practices to do that?
I don't know if it's a best practice, but I like the approach taken in the abstract syntax tree of the Frama-C system; see https://github.com/Frama-C/Frama-C-snapshot/blob/master/src/kernel_services/ast_data/cil_types.mli
This approach uses "layers" of records and algebraic types nested in each other. The records hold meta-information like source locations, as well as the algebraic "node" you can match on.
For example, here is a part of the representation of expressions:
type ...
and exp = {
eid: int; (** unique identifier *)
enode: exp_node; (** the expression itself *)
eloc: location; (** location of the expression. *)
}
and exp_node =
| Const of constant (** Constant *)
| Lval of lval (** Lvalue *)
| UnOp of unop * exp * typ
| BinOp of binop * exp * exp * typ
...
So given a variable e of type exp, you can access its source location with e.eloc, and pattern match on its abstract syntax tree in e.enode.
So simple, "top-level" matches on syntax are very easy:
let rec is_const_expr e =
match e.enode with
| Const _ -> true
| Lval _ -> false
| UnOp (_op, e', _typ) -> is_const_expr e'
| BinOp (_op, l, r, _typ) -> is_const_expr l && is_const_expr r
To match deeper in an expression, you have to go through a record at each level. This adds some syntactic clutter, but not too much, as you can pattern match on only the one record field that interests you:
let optimize_double_negation e =
match e.enode with
| UnOp (Neg, { enode = UnOp (Neg, e', _) }, _) -> e'
| _ -> e
For comparison, on a pure AST without metadata, this would be something like:
let optimize_double_negation e =
match e.enode with
| UnOp (Neg, UnOp (Neg, e', _), _) -> e'
| _ -> e
I find that Frama-C's approach works well in practice.
You need somehow to attach the location information to your nodes. The usual solution is to encode your AST node as a record, e.g.,
type node =
| Typedef of typdef
| Typeexp of typeexp
| Literal of string
| Constant of int
| ...
type annotated_node = { node : node; loc : loc}
Since you're using records, you can still pattern match without too much syntactic overhead, e.g.,
match node with
| {node=Typedef t} -> pp_typedef t
| ...
Depending on your representation, you may choose between wrapping each branch of your type individually, wrapping the whole type, or recursively, like in Frama-C example by #Isabelle Newbie.
A similar but more general approach is to extend a node not with the location, but just with a unique identifier and to use a final map to add arbitrary data to nodes. The benefit of this approach is that you can extend your nodes with arbitrary data as you actually externalize node attributes. The drawback is that you can't actually guarantee the totality of an attribute since finite maps are no total. Thus it is harder to preserve an invariant that, for example, all nodes have a location.
Since every heap allocated object already has an implicit unique identifier, the address, it is possible to attach data to the heap allocated objects without actually wrapping it in another type. For example, we can still use type node as it is and use finite maps to attach arbitrary pieces of information to them, as long as each node is a heap object, i.e., the node definition doesn't contain constant constructors (in case if it has, you can work around it by adding a bogus unit value, e.g., | End can be represented as | End of unit.
Of course, by saying an address, I do not literally mean the physical or virtual address of an object. OCaml uses a moving GC so an actual address of an OCaml object may change during a program execution. Moreover, an address, in general, is not unique, as once an object is deallocated its address can be grabbed by a completely different entity.
Fortunately, after ephemera were added to the recent version of OCaml it is no longer a problem. Moreover, an ephemeron will play nicely with the GC, so that if a node is no longer reachable its attributes (like file locations) will be collected by the GC. So, let's ground this with a concrete example. Suppose we have two nodes c1 and c2:
let c1 = Literal "hello"
let c2 = Constant 42
Now we can create a location mapping from nodes to locations (we will represent the latter as just strings)
module Locations = Ephemeron.K1.Make(struct
type t = node
let hash = Hashtbl.hash (* or your own hash if you have one *)
let equal = (=) (* or a specilized equal operator *)
end)
The Locations module provides an interface of a typical imperative hash table. So let's use it. In the parser, whenever you create a new node you should register its locations in the global locations value, e.g.,
let locations = Locations.create 1337
(* somewhere in the semantics actions, where c1 and c2 are created *)
Locations.add c1 "hello.ml:12:32"
Locations.add c2 "hello.ml:13:56"
And later, you can extract the location:
# Locations.find locs c1;;
- : string = "hello.ml:12:32"
As you see, although the solution is nice in the sense, that it doesn't touch the node data type, so the rest of your code can pattern match on it nice and easy, it is still a little bit dirty, as it requires global mutable state, that is hard to maintain. Also, since we are using an object address as a key, every newly created object, even if it was logically derived from the original object, will have a different identity. For example, suppose you have a function, that normalizes all literals:
let normalize = function
| Literal str -> Literal (normalize_literal str)
| node -> node
It will create a new Literal node from the original nodes, so all the location information will be lost. That means, that you need to update the location information, every time you derive one node from another.
Another issue with ephemera is that they can't survive the marshaling or serialization. I.e., if you store your AST somewhere in a file, and then you restore it, all nodes will loose their identity, and the location table will become empty.
Speaking of the "monadic approach" that you mentioned in comments. Though monads are magic, they still can't magically solve all the problems. They are not silver bullets :) In order to attach something to a node we still need to extend it with an extra attribute - either a location information directly or an identity through which we can attach properties indirectly. The monad can be useful for the latter though, as instead of having a global reference to the last assigned identifier, we can use a state monad, to encapsulate our id generator. And for the sake of completeness, instead of using a state monad or a global reference to generate unique identifiers, you can use UUID and get identifiers that are not only unique in a program run, but are also universally unique, in the sense that there are no other objects in the world with the same identifier, no matter how often you run your program (in the sane world). And although it looks like that generating the UUID doesn't use any state, underneath the hood it still uses an imperative random number generator, so it is sort of cheating, but still can seen as pure functional, as it doesn't contain observable effects.

For loop for array in Pharo Smalltalk

I'm trying to make an array with random numbers (just 0 or 1), but when I run it, it just prints this: End of statement list encountered ->
This is my code:
GenList
| lista |
lista := Array new: 31.
1 to: 30 do: [ :i | lista at: i put: 2 atRandom - 1]
^lista
What can I do?
Some interesting things to consider:
1. The method selector doesn't start with a lowercase letter
It is a tradition for selectors to start with a lowercase letter. In this sense, genLista would be more correct than GenLista.
2. The method selector includes the abbreviated word 'gen'
For instance, genLista could be renamed to genereLista o listaAlAzar (if you decide to use Spanish)
3. The Array named lista has 31 elements, not 30
The result of Array new: 31 is an array of 31 elements. However, the code below it only fills 30 of them, leaving the last one uninitialized (i.e., nil). Possible solution: lista := Array new: 30.
4. A dot is missing causing a compilation error
The code
1 to: 30 do: [ :i | lista at: i put: 2 atRandom - 1]
^lista
does not compile because there is no dot indicating the separation between the two sentences. Note that the error happens at compilation time (i.e., when you save the method) because the return token ^ must start a statement (i.e., it cannot be inlined inside a statement).
There are other cases where a missing dot will not prevent the code from compiling. Instead, an error will happen at runtime. Here is a (typical) example:
1 to: 10 do: [:i | self somethingWith: i] "<- missing dot here"
self somethingElse
the missing dot will generate the runtime error self not understood by block.
5. There is a more expressive way of generating 0s and 1s at random
The calculation 2 atRandom - 1 is ok. However, it forces the reader to mentally do the math. A better way to reveal your intention would have been
#(0 1) atRandom
6. When playing with random numbers don't forget to save the seed
While it is ok to use atRandom, such a practice should only be used with "toy" code. If you are developing a system or a library, the recommended practice is to save the seed somewhere before generating any random data. This will allow you to reproduce the generation of random quantities later on for the sake of debugging or confirmation. (Note however, that this will not suffice for making your program deterministically reproducible because unordered (e.g. hashed) collections could form differently in successive executions.)

Does Perl 6 have an equivalent to Python's dir()?

I'm trying to become comfortable in Perl 6. One of the things I found handy in Python when I was at a REPL prompt was that I could do a dir(object) and find out the attributes of an object, which in Python includes the object's methods.
This often served as a helpful reminder of what I wanted to do; 'Oh, that's right, trim in Python is called strip', that sort of thing.
In Perl 6, I know about the introspection methods .WHO, .WHAT, .WHICH, .HOW and .WHY, but these are at a class or object level. How do I find out what's inside an object, and what I can do to it?
How do I find out what's inside an object, and what I can do to it?
You mentioned that you already know about the introspection methods - but are you aware of what you can find out by querying an objects metaobject (available from .HOW)?
$ perl6
>
> class Article {
* has Str $.title;
* has Str $.content;
* has Int $.view-count;
* }
>
> my Str $greeting = "Hello World"
Hello World
>
> say Article.^methods
(title content view-count)
>
> say Article.^attributes
(Str $!title Str $!content Int $!view-count)
>
> say $greeting.^methods
(BUILD Int Num chomp chop pred succ simplematch match ords samecase samemark
samespace word-by-word trim-leading trim-trailing trim encode NFC NFD NFKC NFKD
wordcase trans indent codes chars uc lc tc fc tclc flip ord WHY WHICH Bool Str
Stringy DUMP ACCEPTS Numeric gist perl comb subst-mutate subst lines split words)
>
> say $greeting.^attributes
Method 'gist' not found for invocant of class 'BOOTSTRAPATTR'
>
There is a shortcut for querying an object's metaobject;
a.^b translates to a.HOW.b(a). The methods and attributes for Article are themselves objects - instances of Method and Attribute. Whenever you call .say on an object, you implicitly call its .gist method which is meant to give you a summarized string representation of the object - i.e. the 'gist' of it.
The attributes of the builtin Str type appear to be of type BOOTSTRAPATTR type - which doesn't implement the .gist method. As an alternative, we can just ask for the attributes to spit out their names instead;
> say sort $greeting.^methods.map: *.name ;
(ACCEPTS BUILD Bool DUMP Int NFC NFD NFKC NFKD Num Numeric Str Stringy WHICH WHY
chars chomp chop codes comb encode fc flip gist indent lc lines match ord ords perl
pred samecase samemark samespace simplematch split subst subst-mutate succ tc tclc
trans trim trim-leading trim-trailing uc word-by-word wordcase words)
>
> say sort $greeting.^attributes.map: *.name
($!value)
>
You can find our more here (which is where just about everything for this answer came from).

Conditional swapping of items in an array

I have a collection of items of type A, B, and C.
I would like to process the collection and swap all A and B pairs, but if there is C (which is also a collection), I want to process that recursively.
So
#(A1 A2 B1 B2 A3 #(A4 A5 B3) )
would be translated into
#(A1 B1 A2 B2 A3 #(A4 B3 A5) )
The swap isn't transitive so #(A1 B1 B2) will be translated into #(B1 A1 B2) and not #(B1 B2 A1).
I wanted to use overlappingPairsDo: but the problem is that the second element is always processed twice.
Can this be achieved somehow with Collection API without resorting to primitive forloops?
I am looking for readable, not performant solution.
I think my solution below should do what you're after, but a few notes up front:
The "requirements" seem a bit artificial - as in, I'm having a hard time imagining a use case where you'd want this sort of swapping. But that could just be my lack of imagination, of course, or due to your attempt to simplify the problem.
A proper solution should, in my opinion, create the objects required so that code can be moved where it belongs. My solution just plunks it (mostly) in a class-side method for demonstration purposes.
You're asking for this to be "achieved somehow with [the] Collection API without resorting to primitive for[-]loops" - I wouldn't be so quick to dismiss going down to the basics. After all, if you look at the implementation of, say, #overlappingPairsDo:, that's exactly what they do, and since you're asking your question within the pharo tag, you're more than welcome to contribute your new way of doing something useful to the "Collections API" so that we can all benefit from it.
To help out, I've added a class SwapPairsDemo with two class-side methods. The first one is just a helper, since, for demonstration purposes, we're using the Array objects from your example, and they contain ByteSymbol instances as your A and B types which we want to distinguish from the C collection type - only, ByteSymbols are of course themselves collections, so let's pretend they're not just for the sake of this exercise.
isRealCollection: anObject
^anObject isCollection
and: [anObject isString not
and: [anObject isSymbol not]]
The second method holds the code to show swapping and to allow recursion:
swapPairsIn: aCollection ifTypeA: isTypeABlock andTypeB: isTypeBBlock
| shouldSwapValues wasJustSwapped |
shouldSwapValues := OrderedCollection new: aCollection size - 1 withAll: false.
aCollection overlappingPairsWithIndexDo: [:firstItem :secondItem :eachIndex |
(self isRealCollection: firstItem)
ifTrue: [self swapPairsIn: firstItem ifTypeA: isTypeABlock andTypeB: isTypeBBlock]
ifFalse: [
shouldSwapValues at: eachIndex put: ((self isRealCollection: secondItem) not
and: [(isTypeABlock value: firstItem)
and: [isTypeBBlock value: secondItem]])
]
].
(self isRealCollection: aCollection last)
ifTrue: [self swapPairsIn: aCollection last ifTypeA: isTypeABlock andTypeB: isTypeBBlock].
wasJustSwapped := false.
shouldSwapValues withIndexDo: [:eachBoolean :eachIndex |
(eachBoolean and: [wasJustSwapped not])
ifTrue: [
aCollection swap: eachIndex with: eachIndex + 1.
wasJustSwapped := true
]
ifFalse: [wasJustSwapped := false]
]
That's a bit of a handful, and I'd usually refactor a method this big, plus you might want to take care of nil, empty lists, etc., but hopefully you get the idea for an approach to your problem. The code consists of three steps:
Build a collection (size one less than the size of the main collection) of booleans to determine whether two items should be swapped by iterating with overlappingPairsWithIndexDo:.
This iteration doesn't handle the last element by itself, so we need to take care of this element possibly being a collection in a separate step.
Finally, we use our collection of booleans to perform the swapping, but we don't swap again if we've just swapped the previous time (I think this is what you meant by the swap not being transitive).
To run the code, you need to supply your collection and a way to tell whether things are type "A" or "B" - I've just used your example, so I just ask them whether they start with those letters - obviously that could be substituted by whatever fits your use case.
| collection target isTypeA isTypeB |
collection := #(A1 A2 B1 B2 A3 #(A4 A5 B3) ).
target := #(A1 B1 A2 B2 A3 #(A4 B3 A5) ).
isTypeA := [:anItem | anItem beginsWith: 'A'].
isTypeB := [:anItem | anItem beginsWith: 'B'].
SwapPairsDemo swapPairsIn: collection ifTypeA: isTypeA andTypeB: isTypeB.
^collection = target
Inspecting this in a workspace returns true, i.e. the swaps on the collection have been performed so that it is now the same as the target.
Here's a solution without recursion that uses a two step approach:
result := OrderedCollection new.
#(1 3 4 6 7)
piecesCutWhere: [ :a :b | a even = b even ]
do: [ :run |
result addAll: ((result isEmpty or: [ result last even ~= run first even ])
ifTrue: [ run ]
ifFalse: [ run reverse ]) ].
result asArray = #(1 4 3 6 7) "--> true"
So first we split the collection wherever we see the possibility for swapping. Then, in the second step, we only swap if the last element of the result collection still allows for swapping.
Adding recursion to this should be straight forward.