What languages support writing things once?

As an example, we expect to be able to define sum on entities of type OrderedCollection[Monoid], where Monoid is a trait/interface with an associative operation and a zero. Then we shouldn't need to cut and paste the code for sum to reuse it. But a type can be a monoid in more than one way: e.g. the integers with + and 0, or with * and 1. I can't work out a nice way to handle this.

Haskell has a nice trick for handling types that are monoids in more than one way, using the newtype language feature:
http://blog.sigfpe.com/2009/01/haskell-monoids-and-their-uses.html (read from "The same type can give rise to a monoid in different ways.")
http://www.haskell.org/ghc/docs/6.12.2/html/libraries/base-4.2.0.1/Data-Monoid.html#3 (the official documentation on the library's additive and multiplicative monoids for numbers).

Surely you need to be explicit about which operation you wish to use? I don't see any sane way of avoiding this.
In Clojure I'd just use a higher-order function:
(defn sum-with [op]
  (fn [coll]
    (reduce op coll)))
Then you could do:
(def sum1 (sum-with +))
(sum1 [1 2 3 4])
=> 10
(def sum2 (sum-with *))
(sum2 [1 2 3 4])
=> 24
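Another option, loosely mirroring Haskell's newtype trick, is to reify each monoid as a first-class value. A minimal Clojure sketch, assuming only core functions (additive, multiplicative and msum are names introduced here, not standard ones):
;; a monoid is just a map holding its operation and its identity element
(def additive       {:op + :zero 0})
(def multiplicative {:op * :zero 1})

(defn msum [{:keys [op zero]} coll]
  (reduce op zero coll))

(msum additive [1 2 3 4])        ;=> 10
(msum multiplicative [1 2 3 4])  ;=> 24
The choice of monoid is explicit at the call site, just as wrapping values in Sum or Product makes it explicit in Haskell.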

Related

Arrays with attributes in Julia

I am taking my first steps in Julia, and I would like to reproduce something I achieved with numpy.
I would like to write a new array-like type which is essentially a vector of elements of arbitrary type and, to keep the example simple, a scalar attribute such as the sampling frequency fs.
I started with something like
type TimeSeries{T} <: DenseVector{T}
    data::Vector{T}
    fs::Float64
end
Ideally, I would like:
1) all methods that take a Vector{T} argument to also work on a TimeSeries{T},
e.g.:
ts = TimeSeries([1,2,3,1,543,1,24,5], 12.01)
median(ts)
2) that indexing a TimeSeries always returns a TimeSeries:
ts[1:3]
3) built-in functions that return a Vector to return a TimeSeries:
ts * 2
ts + [1,2,3,1,543,1,24,5]
I have started by implementing size, getindex and so on, but I do not see how points 2 and 3 could be achieved.
numpy has a quite comprehensive way of doing this: http://docs.scipy.org/doc/numpy/user/basics.subclassing.html. R also seems to allow attaching attributes to arrays via attr()<-.
Do you have any idea about the best strategy for implementing this sort of "array with attributes"?
Maybe I'm not understanding, but for, say, point 3, why is it not sufficient to do
import Base: *, +   # bring the operators into scope so we can extend them
(*)(ts::TimeSeries, n) = TimeSeries(ts.data*n, ts.fs)
(+)(ts::TimeSeries, n) = TimeSeries(ts.data+n, ts.fs)
As for point 2
Base.getindex(ts::TimeSeries, r::Range) = TimeSeries(ts.data[r], ts.fs)
Or are you asking for some easier way to delegate all these operations to the internal vector? You can do clever things like
for op in (:(+), :(*))
    @eval $(op)(ts::TimeSeries, x) = TimeSeries($(op)(ts.data, x), ts.fs)
end

How to generate random elements depending on previous elements using Quickcheck?

I'm using QuickCheck to do generative testing in Clojure.
However I don't know it well and often end up doing convoluted things. One thing that I need to do quite often is something like this:
generate a first prime number from a list of prime numbers (so far so good)
generate a second prime number which is smaller than the first one
generate a third prime number which is also smaller than the first one
However I have no idea as to how to do that cleanly using QuickCheck.
Here's an even simpler, silly, example which doesn't work:
(prop/for-all [a (gen/choose 1 10)
               b (gen/such-that #(= a %) (gen/choose 1 10))]
  (= a b))
It doesn't work because a cannot be resolved (prop/for-all isn't like a let statement).
So how can I generate the three primes, with the condition that the latter two are smaller than the first one?
In test.check we can use gen/bind, the bind operator of the generator monad, to build generators that depend on the values produced by other generators.
For example, to generate pairs [a b] where we must have (>= a b) we can use this generator:
(def pair-gen
  (gen/bind (gen/choose 1 10)
            (fn [a]
              (gen/bind (gen/choose 1 a)
                        (fn [b]
                          (gen/return [a b]))))))
To satisfy ourselves:
(c/quick-check 10000
               (prop/for-all [[a b] pair-gen]
                 (>= a b)))
gen/bind takes a generator g and a function f. It generates a value from g, let's call it x; (f x) must then return a new generator, which gen/bind generates from. gen/return is a generator which only ever generates its argument (so above I used it to return the pairs).
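Applying the same idea to the original question, here is a minimal sketch that generates the three primes (it assumes test.check's gen/elements and gen/tuple; the primes vector is a stand-in for your own list):
(def primes [2 3 5 7 11 13 17 19 23 29])

(def three-primes-gen
  ;; pick the first prime from the tail so at least one smaller prime exists
  (gen/bind (gen/elements (rest primes))
            (fn [a]
              (let [smaller-gen (gen/elements (filter #(< % a) primes))]
                ;; the second and third primes are both drawn from below a
                (gen/bind (gen/tuple smaller-gen smaller-gen)
                          (fn [[b c]]
                            (gen/return [a b c])))))))

(c/quick-check 10000
               (prop/for-all [[a b c] three-primes-gen]
                 (and (< b a) (< c a))))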

Transition from infix to prefix notation

I started learning Clojure recently. Generally it looks interesting, but I can't get used to some syntactic inconveniences (compared to my previous Ruby/C# experience).
Prefix notation for nested expressions. In Ruby I got used to writing complex expressions by chaining/piping them left-to-right: some_object.map { some_expression }.select { another_expression }. It's really convenient: you move from input value to result step by step, you can focus on a single transformation, and you don't need to move the cursor as you type. By contrast, when I write nested expressions in Clojure, I write the code from the inner expression to the outer one and have to move the cursor constantly. It slows me down and distracts me. I know about the -> and ->> macros, but I noticed that they're not idiomatic. Did you have the same problem when you started coding in Clojure/Haskell etc.? How did you solve it?
I felt the same about Lisps initially so I feel your pain :-)
However the good news is that you'll find that with a bit of time and regular usage you will probably start to like prefix notation. In fact with the exception of mathematical expressions I now prefer it to infix style.
Reasons to like prefix notation:
Consistency with functions - most languages use a mix of infix (mathematical operators) and prefix (function call) notation. In Lisps it is all consistent, which has a certain elegance if you consider mathematical operators to be functions
Macros - become much more sane if the function call is always in the first position (see the sketch after this list)
Varargs - it's nice to be able to have a variable number of parameters for pretty much all of your operators. (+ 1 2 3 4 5) is nicer IMHO than 1 + 2 + 3 + 4 + 5
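On the macros point, here is a minimal sketch of why the fixed operator position helps: one macro can wrap any call, because every expression is a list whose first element is the operator (log-call is a name made up for this example):
;; wraps an arbitrary call, printing the form and its result
(defmacro log-call [expr]
  `(let [result# ~expr]
     (println '~expr "=>" result#)
     result#))

(log-call (+ 1 2 3))   ; prints (+ 1 2 3) => 6, returns 6
(log-call (* 2 3 4))   ; prints (* 2 3 4) => 24, returns 24
The same trick in infix syntax would require the macro to understand precedence and the shape of every operator.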
A trick then is to use -> and ->> liberally when it makes logical sense to structure your code this way. This is typically useful when dealing with subsequent operations on objects or collections, e.g.
(->> "Hello World"
     distinct
     sort
     (take 3))
==> (\space \H \W)
The final trick I found very useful when working in prefix style is to make good use of indentation when building more complex expressions. If you indent properly, then you'll find that prefix notation is actually quite clear to read:
(defn add-foobars [x y]
  (+
    (bar x y)
    (foo y)
    (foo x)))
To my knowledge -> and ->> are idiomatic in Clojure. I use them all the time, and in my opinion they usually lead to much more readable code.
Here are some examples of these macros being used in popular projects from around the Clojure "ecosystem":
Ring cookie parsing
Leiningen internals
ClojureScript compiler
Proof by example :)
If you have a long expression chain, use let. Long runaway expressions or deeply nested expressions are not especially readable in any language. This is bad:
(do-something (map :id (filter #(> (:age %) 19) (fetch-data :people))))
This is marginally better:
(do-something (map :id
                   (filter #(> (:age %) 19)
                           (fetch-data :people))))
But this is also bad:
fetch_data(:people).select{|x| x.age > 19}.map{|x| x.id}.do_something
If we're reading this, what do we need to know? We're calling do_something on some attributes of some subset of people. This code is hard to read because there's so much distance between the first and last parts that we forget what we're looking at by the time we travel between them.
In the case of Ruby, do_something (or whatever is producing our final result) is lost way at the end of the line, so it's hard to tell what we're doing to our people. In the case of Clojure, it's immediately obvious that do-something is what we're doing, but it's hard to tell what we're doing it to without reading through the whole thing to the inside.
Any code more complex than this simple example is going to become pretty painful. If all of your code looks like this, your neck is going to get tired scanning back and forth across all of these lines of spaghetti.
I'd prefer something like this:
(let [people (fetch-data :people)
      adults (filter #(> (:age %) 19) people)
      ids    (map :id adults)]
  (do-something ids))
Now it's obvious: I start with people, I goof around, and then I do-something to them.
And you might get away with this:
fetch_data(:people).select{|x|
  x.age > 19
}.map{|x|
  x.id
}.do_something
But I'd probably rather do this, at the very least:
adults = fetch_data(:people).select{|x| x.age > 19}
do_something( adults.map{|x| x.id} )
It's also not unheard of to use let even when your intermediary expressions don't have good names. (This style is occasionally used in Clojure's own source code, e.g. the source code for defmacro)
(let [x (complex-expr-1 x)
      x (complex-expr-2 x)
      x (complex-expr-3 x)
      ...
      x (complex-expr-n x)]
  (do-something x))
This can be a big help in debugging, because you can inspect things at any point by doing:
(let [x (complex-expr-1 x)
      x (complex-expr-2 x)
      _ (prn x)
      x (complex-expr-3 x)
      ...
      x (complex-expr-n x)]
  (do-something x))
I did indeed face the same hurdle when I first started with a Lisp, and it was really annoying until I saw the ways it makes code simpler and clearer. Once I understood the upside, the annoyance faded.
initial + scale + offset
became
(+ initial scale offset)
And then try (+): prefix notation allows functions to specify their own identity values when called with no arguments:
user> (*)
1
user> (+)
0
There are lots more examples, and my point is NOT to defend prefix notation. I just hope to convey that the learning curve flattens (emotionally) as the positive sides become apparent.
Of course, once you start writing macros, prefix notation becomes a must-have instead of a convenience.
To address the second part of your question: the thread-first (->) and thread-last (->>) macros are idiomatic anytime they make the code clearer :) They are more often used with function calls than with pure arithmetic, though nobody will fault you for using them when they make an equation more palatable.
PS: (.. object object2 object3) translates to object.object2().object3(); and
(doto my-object
  (.setX 4)
  (.setY 5))
calls both setters on my-object and returns it.

Equivalent of define-fun in Z3 API

Using Z3 with the textual format, I can use define-fun to define functions for reuse later on. For example:
(define-fun mydiv ((x Real) (y Real)) Real
  (ite (not (= y 0.0))
       (/ x y)
       0.0))
I wonder how to create define-fun with Z3 API (I use F#) instead of repeating the body of the function everywhere. I want to use it to avoid duplication and debug formulas easier. I tried with Context.MkFuncDecl, but it seems to generate uninterpreted functions only.
The define-fun command is just creating a macro. Note that the SMT 2.0 standard doesn't allow recursive definitions.
Z3 will expand every occurrence of mydiv at parse time.
The define-fun command may be used to make the input file simpler and easier to read, but internally it does not really help Z3.
In the current API, there is no support for creating macros.
This is not a real limitation, since we can define a C or F# function that creates instances of a macro.
However, it seems you want to display (and manually inspect) formulas created using the Z3 API. In this case, macros will not help you.
One alternative is to use quantifiers. You can declare an uninterpreted function mydiv and assert the universally quantified formula:
(declare-fun mydiv (Real Real) Real)
(assert (forall ((x Real) (y Real))
          (= (mydiv x y)
             (ite (not (= y 0.0))
                  (/ x y)
                  0.0))))
Now, you can create your formula using the uninterpreted function mydiv.
This kind of quantified formula can be handled by Z3. Actually, there are two options to handle this kind of quantifier:
Use the macro finder: this preprocessing step identifies quantifiers that are essentially defining macros and expands them. However, the expansion only happens during preprocessing, not during parsing (i.e., formula construction time). To enable the macro finder, you have to set MACRO_FINDER=true
The other option is to use MBQI (model based quantifier instantiation). This module can also handle this kind of quantifier. However, the quantifiers will be expanded on demand.
Of course, the solving time may heavily depend on which approach you use. For example, if your formula is unsatisfiable independently of the “meaning” of mydiv, then approach 2 is probably better.

(x86) Assembler Optimization

I'm building a compiler/assembler/linker in Java for the x86-32 (IA32) processor targeting Windows.
High-level concepts (I do not have any "source code": there is no syntax or lexical translation, and all languages are regular) are translated into opcodes, which are then wrapped and output to a file. The translation process has several phases, one being translation between regular languages: the highest-level code is translated into medium-level code, which is then translated into the lowest-level code (there will probably be more than 3 levels).
My problem is the following: if I have higher-level code (X and Y) translated to lower-level code (x, y, U and V), then an example of such a translation is, in pseudo-code:
x + U(f) // generated by X
+
V(f) + y // generated by Y
(An easy example.) Here V is the inverse of U (think of U as a stack push and V as a pop). This needs to be 'optimized' into:
x + y
(essentially removing the "useless" code)
My idea was to use regular expressions. For the above case, it would be a regular expression looking like this: x:(U(x)+V(x)):null, meaning: for all x, find U(x) followed by V(x) and replace the pair with nothing. Imagine more complex regular expressions for more complex optimizations. This should work on all levels.
What do you suggest? What would be a good approach to optimize and produce fast x86 assembly?
What you should actually do is build an Abstract Syntax Tree (AST).
It is a representation of the source code in the form of a tree, that is much easier to work with, especially to make transformations and optimizations.
That code, represented as a tree, would be something like:
(+
  (+
    x
    (U f))
  (+
    (V f)
    y))
You could then try to make some transformations: a sum of sums is a sum of all the terms:
(+
  x
  (U f)
  (V f)
  y)
Then you could scan the tree, applying rules like the following:
(+ (U x) (V x)) = 0, for all x
(+ 0 x1 x2 ...) = (+ x1 x2 ...), for all x1, x2, ...
Then you would obtain what you are looking for:
(+ x y)
Any good book on compiler writing will discuss ASTs at length. Functional programming languages are especially suited for this task, since in general it is easy to represent trees and to use pattern matching to parse and transform them.
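As a rough illustration (in Clojure, whose code is itself made of s-expression trees), here is a minimal sketch of the two rules above. U and V stand for the push/pop-like pair, and the pairing logic is deliberately simplified: it cancels every U/V application that shares an argument, ignoring multiplicity.
(require '[clojure.set :as set])

;; (+ a (+ b c) d) => (+ a b c d)
(defn flatten-sums [expr]
  (if (and (seq? expr) (= '+ (first expr)))
    (cons '+ (mapcat (fn [t]
                       (let [t (flatten-sums t)]
                         (if (and (seq? t) (= '+ (first t)))
                           (rest t)
                           [t])))
                     (rest expr)))
    expr))

;; drop every (U x) whose matching (V x) also occurs, and vice versa
(defn cancel-inverses [terms]
  (let [args-of (fn [sym] (set (for [t terms
                                     :when (and (seq? t) (= sym (first t)))]
                                 (second t))))
        dead    (set/intersection (args-of 'U) (args-of 'V))]
    (remove #(and (seq? %) ('#{U V} (first %)) (dead (second %))) terms)))

(let [[op & terms] (flatten-sums '(+ (+ x (U f)) (+ (V f) y)))]
  (cons op (cancel-inverses terms)))
;=> (+ x y)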
Usually, for this task, you should avoid using regular expressions. Regular expressions define what mathematicians call regular languages. Any regular language can be parsed by a set of regular expressions. However, I think your language is not regular, so it cannot be properly parsed by regexps.
People try, and try, and try to parse languages such as HTML using regular expressions. This has been extensively discussed here on SO, and you cannot parse HTML with regular expressions. There will always be an exceptional case in which your regular expressions fail, and you will have to adapt them.
It might be the same with your language: if it is not regular, you will avoid lots of headaches by not trying to parse it (and especially "transform" it) using regular expressions.
I'm having a lot of trouble understanding this question, but I think you will find it useful to learn something about term-rewriting systems, which seems to be what you are proposing. Whether the mechanism is tree rewriting (always works) or regular expressions (will work for some languages some of the time and other languages all of the time) is of secondary importance.
It is definitely possible to optimize object code by term rewriting. You probably also will benefit from learning something about peephole optimization; a good place to start, because it is very strong on the fundamentals, is a paper by Davidson and Fraser on a retargetable peephole optimizer. There's also excellent later work by Benitez and Davidson.