Generate a tree of structs with testing/quick, respecting invariants - testing

I have a tree of structs which I'd like to test using testing/quick, but constraining it to within my invariants.
This example code works:
var rnd = rand.New(rand.NewSource(time.Now().UnixNano()))
type X struct {
HasChildren bool
Children []*X
}
func TestSomething(t *testing.T) {
x, _ := quick.Value(reflect.TypeOf(X{}), rnd)
_ = x
// test some stuff here
}
But we hold HasChildren = true whenever len(Children) > 0 as an invariant, so it'd be better to ensure that whatever quick.Value() generates respects that (rather than finding "bugs" that don't actually exist).
I figured I could define a Generate function which uses quick.Value() to populate all the variable members:
func (X) Generate(rand *rand.Rand, size int) reflect.Value {
x := X{}
throwaway, _ := quick.Value(reflect.TypeOf([]*X{}), rand)
x.Children = throwaway.Interface().([]*X)
if len(x.Children) > 0 {
x.HasChildren = true
} else {
x.HasChildren = false
}
return reflect.ValueOf(x)
}
But this is panicking:
panic: value method main.X.Generate called using nil *X pointer [recovered]
And when I change Children from []*X to []X, it dies with a stack overflow.
The documentation is very thin on examples, and I'm finding almost nothing in web searches either.
How can this be done?

Looking at the testing/quick source code it seems that you can't create recursive custom generators and at the same time reuse the quick library facilities to generate the array part of the struct, because the size parameter, that is designed to limit the number of recursive calls, cannot be passed back into quick.Value(...)
https://golang.org/src/testing/quick/quick.go (see around line 50)
in your case this lead to an infinite tree that quickly "explodes" with 1..50 leafs at each level (that's the reason for the stack overflow).
If the function quick.sizedValue() had been public we could have used it to accomplish your task, but unfortunately this is not the case.
BTW since HasChildren is an invariant, can't you simply make it a struct method?
type X struct {
Children []*X
}
func (me *X) HasChildren() bool {
return len(me.Children) > 0
}
func main() {
.... generate X ....
if x.HasChildren() {
.....
}
}

Related

Go validator with sql null types?

I am having problems getting the golang validator to work with SQL null types. Here's an example of what I tried:
package main
import (
"database/sql"
"database/sql/driver"
"log"
"gopkg.in/go-playground/validator.v9"
)
// NullInt64
type NullInt64 struct {
sql.NullInt64
Set bool
}
func MakeNullInt64(valid bool, val int64) NullInt64 {
n := NullInt64{}
n.Set = true
n.Valid = valid
if valid {
n.Int64 = val
}
return n
}
func (n *NullInt64) Value() (driver.Value, error) {
if !n.NullInt64.Valid {
return nil, nil
}
return n.NullInt64.Int64, nil
}
type Thing struct {
N2 NullInt64 `validate:"min=10"`
N3 int64 `validate:"min=10"`
N4 *int64 `validate:"min=10"`
}
func main() {
validate := validator.New()
n := int64(6)
number := MakeNullInt64(true, n)
thing := Thing{number, n, &n}
e := validate.Struct(thing)
log.Println(e)
}
When I run this code, I only get this output:
Key: 'Thing.N3' Error:Field validation for 'N3' failed on the 'min'
tag
Key: 'Thing.N4' Error:Field validation for 'N4' failed on the
'min' tag
The problem is that I want it to also show that Thing.N2 failed for the same reasons as Thing.N3 and Thing.N4.
I tried introducing the func (n *NullInt64) Value() method because it was mentioned in the documentation. But I think I misunderstood something. Can anyone tell me what I did wrong?
UPDATE
There is an Example specifically for that. You may check it out. My other proposed solution should still work though.
Since the value you are trying to validate is Int64 inside sql.NullInt64, the easiest way would be to remove the validate tag and just register a Struct Level validation using:
validate.RegisterStructValidation(NullInt64StructLevelValidation, NullInt64{})
while NullInt64StructLevelValidation is a StructLevelFunc that looks like this:
func NullInt64StructLevelValidation(sl validator.StructLevel) {
ni := sl.Current().Interface().(NullInt64)
if ni.NullInt64.Int64 < 10 {
sl.ReportError(ni.NullInt64.Int64, "Int64", "", "min", "")
}
}
Note #1: this line thing := Thing{number,&number,n,&n} has one argument too many. I assume you meant thing := Thing{number, n, &n}
Note #2: Go tooling including gofmt is considered to be one of the most powerful features of the language. Please consider using it/them.
EDIT #1:
I don't think implementing Valuer interface is of any value in this context.

GoLang, REST, PATCH and building an UPDATE query

since few days I was struggling on how to proceed with PATCH request in Go REST API until I have found an article about using pointers and omitempty tag which I have populated and is working fine. Fine until I have realized I still have to build an UPDATE SQL query.
My struct looks like this:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
I am expecting a PATCH /resources/{resource-id} request containing such a request body:
{"description":"Some new description"}
In my handler I will build the Resource object this way (ignoring imports, ignoring error handling):
var resource Resource
resourceID, _ := mux.Vars(r)["resource-id"]
d := json.NewDecoder(r.Body)
d.Decode(&resource)
// at this point our resource object should only contain
// the Description field with the value from JSON in request body
Now, for normal UPDATE (PUT request) I would do this (simplified):
stmt, _ := db.Prepare(`UPDATE resources SET description = ?, name = ? WHERE resource_id = ?`)
res, _ := stmt.Exec(resource.Description, resource.Name, resourceID)
The problem with PATCH and omitempty tag is that the object might be missing multiple properties, thus I cannot just prepare a statement with hardcoded fields and placeholders... I will have to build it dynamically.
And here comes my question: how can I build such UPDATE query dynamically? In the best case I'd need some solution with identifying the set properties, getting their SQL field names (probably from the tags) and then I should be able to build the UPDATE query. I know I can use reflection to get the object properties but have no idea hot to get their sql tag name and of course I'd like to avoid using reflection here if possible... Or I could simply check for each property it is not nil, but in real life the structs are much bigger than provided example here...
Can somebody help me with this one? Did somebody already have to solve the same/similar situation?
SOLUTION:
Based on the answers here I was able to come up with this abstract solution. The SQLPatches method builds the SQLPatch struct from the given struct (so no concrete struct specific):
import (
"fmt"
"encoding/json"
"reflect"
"strings"
)
const tagname = "sql"
type SQLPatch struct {
Fields []string
Args []interface{}
}
func SQLPatches(resource interface{}) SQLPatch {
var sqlPatch SQLPatch
rType := reflect.TypeOf(resource)
rVal := reflect.ValueOf(resource)
n := rType.NumField()
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
for i := 0; i < n; i++ {
fType := rType.Field(i)
fVal := rVal.Field(i)
tag := fType.Tag.Get(tagname)
// skip nil properties (not going to be patched), skip unexported fields, skip fields to be skipped for SQL
if fVal.IsNil() || fType.PkgPath != "" || tag == "-" {
continue
}
// if no tag is set, use the field name
if tag == "" {
tag = fType.Name
}
// and make the tag lowercase in the end
tag = strings.ToLower(tag)
sqlPatch.Fields = append(sqlPatch.Fields, tag+" = ?")
var val reflect.Value
if fVal.Kind() == reflect.Ptr {
val = fVal.Elem()
} else {
val = fVal
}
switch val.Kind() {
case reflect.Int, reflect.Int8, reflect.Int16, reflect.Int32, reflect.Int64:
sqlPatch.Args = append(sqlPatch.Args, val.Int())
case reflect.String:
sqlPatch.Args = append(sqlPatch.Args, val.String())
case reflect.Bool:
if val.Bool() {
sqlPatch.Args = append(sqlPatch.Args, 1)
} else {
sqlPatch.Args = append(sqlPatch.Args, 0)
}
}
}
return sqlPatch
}
Then I can simply call it like this:
type Resource struct {
Description *string `json:"description,omitempty"`
Name *string `json:"name,omitempty"`
}
func main() {
var r Resource
json.Unmarshal([]byte(`{"description": "new description"}`), &r)
sqlPatch := SQLPatches(r)
data, _ := json.Marshal(sqlPatch)
fmt.Printf("%s\n", data)
}
You can check it at Go Playground. The only problem here I see is that I allocate both the slices with the amount of fields in the passed struct, which may be 10, even though I might only want to patch one property in the end resulting in allocating more memory than needed... Any idea how to avoid this?
I recently had same problem. about PATCH and looking around found this article. It also makes references to the RFC 5789 where it says:
The difference between the PUT and PATCH requests is reflected in the way the server processes the enclosed entity to modify the resource identified by the Request-URI. In a PUT request, the enclosed entity is considered to be a modified version of the resource stored on the origin server, and the client is requesting that the stored version be replaced. With PATCH, however, the enclosed entity contains a set of instructions describing how a resource currently residing on the origin server should be modified to produce a new version. The PATCH method affects the resource identified by the Request-URI, and it also MAY have side effects on other resources; i.e., new resources may be created, or existing ones modified, by the application of a PATCH.
e.g:
[
{ "op": "test", "path": "/a/b/c", "value": "foo" },
{ "op": "remove", "path": "/a/b/c" },
{ "op": "add", "path": "/a/b/c", "value": [ "foo", "bar" ] },
{ "op": "replace", "path": "/a/b/c", "value": 42 },
{ "op": "move", "from": "/a/b/c", "path": "/a/b/d" },
{ "op": "copy", "from": "/a/b/d", "path": "/a/b/e" }
]
This set of instructions should make it easier to build the update query.
EDIT
This is how you would obtain sql tags but you will have to use reflection:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
sp := "sort of string"
r := Resource{Description: &sp}
rt := reflect.TypeOf(r) // reflect.Type
rv := reflect.ValueOf(r) // reflect.Value
for i := 0; i < rv.NumField(); i++ { // Iterate over all the fields
if !rv.Field(i).IsNil() { // Check it is not nil
// Here you would do what you want to having the sql tag.
// Creating the query would be easy, however
// not sure you would execute the statement
fmt.Println(rt.Field(i).Tag.Get("sql")) // Output: description
}
}
I understand you don't want to use reflection, but still this may be a better answer than the previous one as you comment state.
EDIT 2:
About the allocation - read this guide lines of Effective Go about Data structures and Allocation:
// Here you are allocating an slice of 0 length with a capacity of n
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
With make(Type, Length, Capacity (optional))
Consider the following example:
// newly allocated zeroed value with Composite Literal
// length: 0
// capacity: 0
testSlice := []int{}
fmt.Println(len(testSlice), cap(testSlice)) // 0 0
fmt.Println(testSlice) // []
// newly allocated non zeroed value with make
// length: 0
// capacity: 10
testSlice = make([]int, 0, 10)
fmt.Println(len(testSlice), cap(testSlice)) // 0 10
fmt.Println(testSlice) // []
// newly allocated non zeroed value with make
// length: 2
// capacity: 4
testSlice = make([]int, 2, 4)
fmt.Println(len(testSlice), cap(testSlice)) // 2 4
fmt.Println(testSlice) // [0 0]
In your case, may want to the following:
// Replace this
sqlPatch.Fields = make([]string, 0, n)
sqlPatch.Args = make([]interface{}, 0, n)
// With this or simple omit the capacity in make above
sqlPatch.Fields = []string{}
sqlPatch.Args = []interface{}{}
// The allocation will go as follow: length - capacity
testSlice := []int{} // 0 - 0
testSlice = append(testSlice, 1) // 1 - 2
testSlice = append(testSlice, 1) // 2 - 2
testSlice = append(testSlice, 1) // 3 - 4
testSlice = append(testSlice, 1) // 4 - 4
testSlice = append(testSlice, 1) // 5 - 8
Alright, I think the solution I used back in 2016 was quite over-engineered for even more over-engineered problem and was completely unnecessary. The question asked here was very generalized, however we were building a solution that was able to build its SQL query on its own and based on the JSON object or query parameters and/or Headers sent in the request. And that to be as generic as possible.
Nowadays I think the best solution is to avoid PATCH unless truly necessary. And even then you still can use PUT and replace the whole resource with patched property/ies coming already from the client - i.e. not giving the client the option/possibility to send any PATCH request to your server and to deal with partial updates on their own.
However this is not always recommended, especially in cases of bigger objects to save some C02 by reducing the amount of redundant transmitted data. Whenever today if I need to enable a PATCH for the client I simply define what can be patched - this gives me clarity and the final struct.
Note that I am using a IETF documented JSON Merge Patch implementation. I consider that of JSON Patch (also documented by IETF) redundant as hypothetically we could replace the whole REST API by having one single JSON Patch endpoint and let clients control the resources via allowed operations. I also think the implementation of such JSON Patch on the server side is way more complicated. The only use-case I could think of using such implementation is if I was implementing a REST API over a file system...
So the struct may be defined as in my OP:
type ResourcePatch struct {
ResourceID some.UUID `json:"resource_id"`
Description *string `json:"description,omitempty"`
Name *string `json:"name,omitempty"`
}
In the handler func I'd decode the ID from the path into the ResourcePatch instance and unmarshall JSON from the request body into it, too.
Sending only this
{"description":"Some new description"}
to PATCH /resources/<UUID>
I should end up with with this object:
ResourcePatch
* ResourceID {"UUID"}
* Description {"Some new description"}
And now the magic: use a simple logic to build the query and exec parameters. For some it may seem tedious or repetitive or unclean for bigger PATCH objects, but my reply to this would be: if your PATCH object consists of more than 50% of the original resource' properties (or simply too many for your liking) use PUT and expect the clients to send (and replace) the whole resource instead.
It could look like this:
func (s Store) patchMyResource(r models.ResourcePatch) error {
q := `UPDATE resources SET `
qParts := make([]string, 0, 2)
args := make([]interface{}, 0, 2)
if r.Description != nil {
qParts = append(qParts, `description = ?`)
args = append(args, r.Description)
}
if r.Name != nil {
qParts = append(qParts, `name = ?`)
args = append(args, r.Name)
}
q += strings.Join(qParts, ',') + ` WHERE resource_id = ?`
args = append(args, r.ResourceID)
_, err := s.db.Exec(q, args...)
return err
}
I think there's nothing simpler and more effective. No reflection, no over-kills, reads quite good.
Struct tags are only visible through reflection, sorry.
If you don't want to use reflection (or, I think, even if you do), I think it is Go-like to define a function or method that "marshals" your struct into something that can easily be turned into a comma-separated list of SQL updates, and then use that. Build small things to help solve your problems.
For example given:
type Resource struct {
Name *string `json:"name,omitempty" sql:"resource_id"`
Description *string `json:"description,omitempty" sql:"description"`
}
You might define:
func (r Resource) SQLUpdates() SQLUpdates {
var s SQLUpdates
if (r.Name != nil) {
s.add("resource_id", *r.Name)
}
if (r.Description != nil) {
s.add("description", *r.Description)
}
}
where the type SQLUpdates looks something like this:
type SQLUpdates struct {
assignments []string
values []interface{}
}
func (s *SQLUpdates) add(key string, value interface{}) {
if (s.assignments == nil) {
s.assignments = make([]string, 0, 1)
}
if (s.values == nil) {
s.values = make([]interface{}, 0, 1)
}
s.assignments = append(s.assignments, fmt.Sprintf("%s = ?", key))
s.values = append(s.values, value)
}
func (s SQLUpdates) Assignments() string {
return strings.Join(s.assignments, ", ")
}
func (s SQLUpdates) Values() []interface{} {
return s.values
}
See it working (sorta) here: https://play.golang.org/p/IQAHgqfBRh
If you have deep structs-within-structs, it should be fairly easy to build on this. And if you change to an SQL engine that allows or encourages positional arguments like $1 instead of ?, it's easy to add that behavior to just the SQLUpdates struct without changing any code that used it.
For the purpose of getting arguments to pass to Exec, you would just expand the output of Values() with the ... operator.

Conflicting lifetime requirement for iterator returned from function

This may be a duplicate. I don't know. I couldn't understand the other answers well enough to know that. :)
Rust version: rustc 1.0.0-nightly (b47aebe3f 2015-02-26) (built 2015-02-27)
Basically, I'm passing a bool to this function that's supposed to build an iterator that filters one way for true and another way for false. Then it kind of craps itself because it doesn't know how to keep that boolean value handy, I guess. I don't know. There are actually multiple lifetime problems here, which is discouraging because this is a really common pattern for me, since I come from a .NET background.
fn main() {
for n in values(true) {
println!("{}", n);
}
}
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new([3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| if even {
n % 2 == 0
} else {
true
}))
}
Is there a way to make this work?
You have two conflicting issues, so let break down a few representative pieces:
[3usize, 4, 2, 1].iter()
.map(|n| n * 2)
.filter(|n| n % 2 == 0))
Here, we create an array in the stack frame of the method, then get an iterator to it. Since we aren't allowed to consume the array, the iterator item is &usize. We then map from the &usize to a usize. Then we filter against a &usize - we aren't allowed to consume the filtered item, otherwise the iterator wouldn't have it to return!
The problem here is that we are ultimately rooted to the stack frame of the function. We can't return this iterator, because the array won't exist after the call returns!
To work around this for now, let's just make it static. Now we can focus on the issue with even.
filter takes a closure. Closures capture any variable used that isn't provided as an argument to the closure. By default, these variables are captured by reference. However, even is again a variable located on the stack frame. This time however, we can give it to the closure by using the move keyword. Here's everything put together:
fn main() {
for n in values(true) {
println!("{}", n);
}
}
static ITEMS: [usize; 4] = [3, 4, 2, 1];
fn values(even: bool) -> Box<Iterator<Item=usize>> {
Box::new(ITEMS.iter()
.map(|n| n * 2)
.filter(move |n| if even {
n % 2 == 0
} else {
true
}))
}

The best way to implement piece-wise periodic function in Objective-C?

I need to implement kind of illustrated function ( ratio = somefunction(time)) in Objective-c. Language doesn't actually matters, because task seems purely algorithmic. Is there any common way to do things like this?
Next things should be easily adjustable in process of design:
1) Number of small intervals per period (now it 4, but it can be 3 or 20 for example).
2) f1 can be changed. It's simple function like f(x) = sin(x).
3) if we say
f_resulting =
f1 for a time t1
f2 for a time t2
then again
f1 for a time t1
etc..
ratio when f1 "works" vs f2 "works" (t1 to t2) should be adjustable.
Standard C itself has no closures so if you use C function pointers you need to build your own closure - you just use a class instance to store the two function pointers and duration and an invoke method to call the composed periodic function.
However Apple has introduced blocks, which are just inline functions implemented with closures, so in Apple C you can easily compose functions. This is more interesting, but "best" you will have to decide...
An example which composes double -> double functions. First the header, Periodic.h:
typedef double (^Monadic)(double);
Monadic makePeriodic(Monadic firstFunction, double firstPeriod,
Monadic secondFunction, double secondPeriod);
And the implementation, Periodic.c:
#include "Periodic.h"
Monadic makePeriodic(Monadic firstFunction, double firstPeriod,
Monadic secondFunction, double secondPeriod)
{
return Block_copy(^(double time)
{
return fmod(time, firstPeriod+secondPeriod) < firstPeriod
? firstFunction(time)
: secondFunction(time);
}
);
}
Blocks are constructed on the stack and can reference local variables and parameters in the enclosing scope. As such you cannot simply return a block from a function as the scope it refers to disappears on return. The function Block_copy() moves a block, and any blocks it refers to, onto the heap allowing them to outlive their creating scope. A heap block must be released with Block_release() when no longer needed.
And a simple demo:
Monadic queer = makePeriodic(^(double t) { return t * t; }, 5,
^(double t) { return sqrt(t); }, 3);
for(double ix = 0; ix <= 16; ix++)
printf("%f -> %f\n", ix, queer(ix));
Block_release(queer); // clean up
Now you can wrap all this up in a class if you wish so it fits Obj-C style. When you do this you can also send copy and release messages to the block, just as if it was an Obj-C object. In a garbage collected environment you still need the copy to get thge block onto the heap, but the release is not needed.
Periodic.h:
typedef double (^Monadic)(double);
#interface ComposeOne : NSObject
{
}
+ (Monadic) makePeriodicWithFunction:(Monadic)firstFunction
forPeriod:(double)firstPeriod
andFunction:(Monadic)secondFunction
forPeriod:(double)secondPeriod;
#end
Periodic.M:
#implementation ComposeOne
+ (Monadic) makePeriodicWithFunction:(Monadic)firstFunction
forPeriod:(double)firstPeriod
andFunction:(Monadic)secondFunction
forPeriod:(double)secondPeriod
{
Monadic result = (^(double time)
{
return fmod(time, firstPeriod+secondPeriod) < firstPeriod
? firstFunction(time)
: secondFunction(time);
}
);
return [result copy];
}
#end
And the simple demo (still using printf though for convenience):
Monadic queer = [ComposeOne makePeriodicWithFunction:^(double t) { return t * t; }
forPeriod:5
andFunction:^(double t) { return sqrt(t); }
forPeriod:3];
for(double ix = 0; ix <= 16; ix++)
printf("%f -> %f\n", ix, queer(ix));
[queer release];

How do I write a generic memoize function?

I'm writing a function to find triangle numbers and the natural way to write it is recursively:
function triangle (x)
if x == 0 then return 0 end
return x+triangle(x-1)
end
But attempting to calculate the first 100,000 triangle numbers fails with a stack overflow after a while. This is an ideal function to memoize, but I want a solution that will memoize any function I pass to it.
Mathematica has a particularly slick way to do memoization, relying on the fact that hashes and function calls use the same syntax:
triangle[0] = 0;
triangle[x_] := triangle[x] = x + triangle[x-1]
That's it. It works because the rules for pattern-matching function calls are such that it always uses a more specific definition before a more general definition.
Of course, as has been pointed out, this example has a closed-form solution: triangle[x_] := x*(x+1)/2. Fibonacci numbers are the classic example of how adding memoization gives a drastic speedup:
fib[0] = 1;
fib[1] = 1;
fib[n_] := fib[n] = fib[n-1] + fib[n-2]
Although that too has a closed-form equivalent, albeit messier: http://mathworld.wolfram.com/FibonacciNumber.html
I disagree with the person who suggested this was inappropriate for memoization because you could "just use a loop". The point of memoization is that any repeat function calls are O(1) time. That's a lot better than O(n). In fact, you could even concoct a scenario where the memoized implementation has better performance than the closed-form implementation!
You're also asking the wrong question for your original problem ;)
This is a better way for that case:
triangle(n) = n * (n - 1) / 2
Furthermore, supposing the formula didn't have such a neat solution, memoisation would still be a poor approach here. You'd be better off just writing a simple loop in this case. See this answer for a fuller discussion.
I bet something like this should work with variable argument lists in Lua:
local function varg_tostring(...)
local s = select(1, ...)
for n = 2, select('#', ...) do
s = s..","..select(n,...)
end
return s
end
local function memoize(f)
local cache = {}
return function (...)
local al = varg_tostring(...)
if cache[al] then
return cache[al]
else
local y = f(...)
cache[al] = y
return y
end
end
end
You could probably also do something clever with a metatables with __tostring so that the argument list could just be converted with a tostring(). Oh the possibilities.
In C# 3.0 - for recursive functions, you can do something like:
public static class Helpers
{
public static Func<A, R> Memoize<A, R>(this Func<A, Func<A,R>, R> f)
{
var map = new Dictionary<A, R>();
Func<A, R> self = null;
self = (a) =>
{
R value;
if (map.TryGetValue(a, out value))
return value;
value = f(a, self);
map.Add(a, value);
return value;
};
return self;
}
}
Then you can create a memoized Fibonacci function like this:
var memoized_fib = Helpers.Memoize<int, int>((n,fib) => n > 1 ? fib(n - 1) + fib(n - 2) : n);
Console.WriteLine(memoized_fib(40));
In Scala (untested):
def memoize[A, B](f: (A)=>B) = {
var cache = Map[A, B]()
{ x: A =>
if (cache contains x) cache(x) else {
val back = f(x)
cache += (x -> back)
back
}
}
}
Note that this only works for functions of arity 1, but with currying you could make it work. The more subtle problem is that memoize(f) != memoize(f) for any function f. One very sneaky way to fix this would be something like the following:
val correctMem = memoize(memoize _)
I don't think that this will compile, but it does illustrate the idea.
Update: Commenters have pointed out that memoization is a good way to optimize recursion. Admittedly, I hadn't considered this before, since I generally work in a language (C#) where generalized memoization isn't so trivial to build. Take the post below with that grain of salt in mind.
I think Luke likely has the most appropriate solution to this problem, but memoization is not generally the solution to any issue of stack overflow.
Stack overflow usually is caused by recursion going deeper than the platform can handle. Languages sometimes support "tail recursion", which re-uses the context of the current call, rather than creating a new context for the recursive call. But a lot of mainstream languages/platforms don't support this. C# has no inherent support for tail-recursion, for example. The 64-bit version of the .NET JITter can apply it as an optimization at the IL level, which is all but useless if you need to support 32-bit platforms.
If your language doesn't support tail recursion, your best option for avoiding stack overflows is either to convert to an explicit loop (much less elegant, but sometimes necessary), or find a non-iterative algorithm such as Luke provided for this problem.
function memoize (f)
local cache = {}
return function (x)
if cache[x] then
return cache[x]
else
local y = f(x)
cache[x] = y
return y
end
end
end
triangle = memoize(triangle);
Note that to avoid a stack overflow, triangle would still need to be seeded.
Here's something that works without converting the arguments to strings.
The only caveat is that it can't handle a nil argument. But the accepted solution can't distinguish the value nil from the string "nil", so that's probably OK.
local function m(f)
local t = { }
local function mf(x, ...) -- memoized f
assert(x ~= nil, 'nil passed to memoized function')
if select('#', ...) > 0 then
t[x] = t[x] or m(function(...) return f(x, ...) end)
return t[x](...)
else
t[x] = t[x] or f(x)
assert(t[x] ~= nil, 'memoized function returns nil')
return t[x]
end
end
return mf
end
I've been inspired by this question to implement (yet another) flexible memoize function in Lua.
https://github.com/kikito/memoize.lua
Main advantages:
Accepts a variable number of arguments
Doesn't use tostring; instead, it organizes the cache in a tree structure, using the parameters to traverse it.
Works just fine with functions that return multiple values.
Pasting the code here as reference:
local globalCache = {}
local function getFromCache(cache, args)
local node = cache
for i=1, #args do
if not node.children then return {} end
node = node.children[args[i]]
if not node then return {} end
end
return node.results
end
local function insertInCache(cache, args, results)
local arg
local node = cache
for i=1, #args do
arg = args[i]
node.children = node.children or {}
node.children[arg] = node.children[arg] or {}
node = node.children[arg]
end
node.results = results
end
-- public function
local function memoize(f)
globalCache[f] = { results = {} }
return function (...)
local results = getFromCache( globalCache[f], {...} )
if #results == 0 then
results = { f(...) }
insertInCache(globalCache[f], {...}, results)
end
return unpack(results)
end
end
return memoize
Here is a generic C# 3.0 implementation, if it could help :
public static class Memoization
{
public static Func<T, TResult> Memoize<T, TResult>(this Func<T, TResult> function)
{
var cache = new Dictionary<T, TResult>();
var nullCache = default(TResult);
var isNullCacheSet = false;
return parameter =>
{
TResult value;
if (parameter == null && isNullCacheSet)
{
return nullCache;
}
if (parameter == null)
{
nullCache = function(parameter);
isNullCacheSet = true;
return nullCache;
}
if (cache.TryGetValue(parameter, out value))
{
return value;
}
value = function(parameter);
cache.Add(parameter, value);
return value;
};
}
}
(Quoted from a french blog article)
In the vein of posting memoization in different languages, i'd like to respond to #onebyone.livejournal.com with a non-language-changing C++ example.
First, a memoizer for single arg functions:
template <class Result, class Arg, class ResultStore = std::map<Arg, Result> >
class memoizer1{
public:
template <class F>
const Result& operator()(F f, const Arg& a){
typename ResultStore::const_iterator it = memo_.find(a);
if(it == memo_.end()) {
it = memo_.insert(make_pair(a, f(a))).first;
}
return it->second;
}
private:
ResultStore memo_;
};
Just create an instance of the memoizer, feed it your function and argument. Just make sure not to share the same memo between two different functions (but you can share it between different implementations of the same function).
Next, a driver functon, and an implementation. only the driver function need be public
int fib(int); // driver
int fib_(int); // implementation
Implemented:
int fib_(int n){
++total_ops;
if(n == 0 || n == 1)
return 1;
else
return fib(n-1) + fib(n-2);
}
And the driver, to memoize
int fib(int n) {
static memoizer1<int,int> memo;
return memo(fib_, n);
}
Permalink showing output on codepad.org. Number of calls is measured to verify correctness. (insert unit test here...)
This only memoizes one input functions. Generalizing for multiple args or varying arguments left as an exercise for the reader.
In Perl generic memoization is easy to get. The Memoize module is part of the perl core and is highly reliable, flexible, and easy-to-use.
The example from it's manpage:
# This is the documentation for Memoize 1.01
use Memoize;
memoize('slow_function');
slow_function(arguments); # Is faster than it was before
You can add, remove, and customize memoization of functions at run time! You can provide callbacks for custom memento computation.
Memoize.pm even has facilities for making the memento cache persistent, so it does not need to be re-filled on each invocation of your program!
Here's the documentation: http://perldoc.perl.org/5.8.8/Memoize.html
Extending the idea, it's also possible to memoize functions with two input parameters:
function memoize2 (f)
local cache = {}
return function (x, y)
if cache[x..','..y] then
return cache[x..','..y]
else
local z = f(x,y)
cache[x..','..y] = z
return z
end
end
end
Notice that parameter order matters in the caching algorithm, so if parameter order doesn't matter in the functions to be memoized the odds of getting a cache hit would be increased by sorting the parameters before checking the cache.
But it's important to note that some functions can't be profitably memoized. I wrote memoize2 to see if the recursive Euclidean algorithm for finding the greatest common divisor could be sped up.
function gcd (a, b)
if b == 0 then return a end
return gcd(b, a%b)
end
As it turns out, gcd doesn't respond well to memoization. The calculation it does is far less expensive than the caching algorithm. Ever for large numbers, it terminates fairly quickly. After a while, the cache grows very large. This algorithm is probably as fast as it can be.
Recursion isn't necessary. The nth triangle number is n(n-1)/2, so...
public int triangle(final int n){
return n * (n - 1) / 2;
}
Please don't recurse this. Either use the x*(x+1)/2 formula or simply iterate the values and memoize as you go.
int[] memo = new int[n+1];
int sum = 0;
for(int i = 0; i <= n; ++i)
{
sum+=i;
memo[i] = sum;
}
return memo[n];