Data.Vector.Binary overlaps Binary [a] instance - serialization

In my application I need to serialize a vector containing an arbitrary datatype, in this case a list of Doubles. To serialize the vector I'm importing Data.Vector.Binary.
When loading the module in GHCi the following error arises:
Overlapping instances for Binary [Double]
arising from a use of `decode' at Statistics.hs:57:33-42
Matching instances:
instance (Data.Vector.Generic.Base.Vector v a, Binary a) =>
Binary (v a)
-- Defined in Data.Vector.Binary
instance (Binary a) => Binary [a] -- Defined in Data.Binary
Is the list an instance of Vector?
I looked through the documentation but could not find such instance.
What can I do to be able to serialize this structure?
Edit:
I'm using the following package versions:
vector-0.6.0.2
vector-binary-instances-0.1.2
binary-0.5.0.2
Also here is a snippet that shows the issue, this time with a list of chars:
import Data.Binary
import Data.Vector.Binary
import qualified Data.ByteString.Lazy as L
main = L.writeFile "/tmp/aaa" $ encode "hello"

Ok, I think I see the problem here. The vector-binary-instances package defines:
instance (Data.Vector.Generic.Base.Vector v a, Binary a) => Binary (v a)
which is very bad. Instance resolution matches only on the instance head (v a) and ignores the context, so this definition means "any type of the form 'v a' has a Binary instance". The instance therefore overlaps with every other instance for a type that matches v a. That includes (but is not limited to) all lists, all functors, and all monads. As a demonstration, ghci reports the following:
Prelude Data.Binary Data.Vector.Binary Data.ByteString.Lazy> :t getChar
getChar :: IO Char
Prelude Data.Binary Data.Vector.Binary Data.ByteString.Lazy> encode getChar
<interactive>:1:0:
No instance for (Data.Vector.Generic.Base.Vector IO Char)
arising from a use of `encode' at <interactive>:1:0-13
Possible fix:
add an instance declaration for
(Data.Vector.Generic.Base.Vector IO Char)
In the expression: encode getChar
In the definition of `it': it = encode getChar
Here the interpreter is attempting to use this instance for getChar :: IO Char, which is obviously wrong. It selects the instance first and only afterwards finds that the Vector IO Char constraint cannot be satisfied.
Short answer: don't use vector-binary-instances for now. This instance is broken, and given how instances propagate through Haskell code it will cause problems. Until this is fixed, you should write your own binary instances for vectors. You should be able to copy the code from vector-binary-instances and restrict it to a monomorphic vector type:
instance (Binary a) => Binary (Vector a) where
I believe this will work with any Vector which is an instance of Data.Vector.Generic.Vector.
You also may want to contact the vector-binary-instances maintainer about this.


Can an Elm function have a docstring?

Python has them and I find them very useful:
def awesome_fn(x, y):
    """
    Calculates some awesome function of x and y.
    """
    ...
Then in the IPython REPL you can query it with
In [1]: awesome_fn?
Signature: awesome_fn(x, y)
Docstring: Calculates some awesome function of x and y.
File: ...
Type: function
It's possible to specify documentation for a module using the following documentation format:
module Maybe exposing (Maybe(Just,Nothing), andThen, map, withDefault, oneOf)
{-| This library fills a bunch of important niches in Elm. A `Maybe` can help
you with optional arguments, error handling, and records with optional fields.
# Definition
@docs Maybe
# Common Helpers
@docs map, withDefault, oneOf
# Chaining Maybes
@docs andThen
-}
and for a function:
{-| Convert a list of characters into a String. Can be useful if you
want to create a string primarily by consing, perhaps for decoding
something.
fromList ['e','l','m'] == "elm"
-}
fromList : List Char -> String
fromList = ...
But so far it's not possible to view these docs from the REPL. There's even an issue related to this.
On the other hand, there's the elm-oracle library, which lets you integrate documentation hints into an editor (it's already integrated into the popular ones), or even run it from the command line as:
elm-oracle FILE query

How does one deal with multiple pointer levels (like char**) in Squeak FFI

I want to deal with a structure like this: struct foo {char *name; char **fields ; size_t nfields};
If I define corresponding structure in Squeak
ExternalStructure subclass: #Foo
instanceVariableNames: ''
classVariableNames: ''
poolDictionaries: ''
category: 'FFI-Tests'.
and define the fields naively with
Foo class>>fields
^#(
(name 'char*')
(fields 'char**')
(nfields 'unsigned long')
)
then generate the accessors with Foo defineFields, I get those undifferentiated types for name and fields:
Foo>>name
^ExternalData fromHandle: (handle pointerAt: 1) type: ExternalType char asPointerType
Foo>>fields
^ExternalData fromHandle: (handle pointerAt: 5) type: ExternalType char asPointerType
That is troubling: the second indirection is missing for the fields accessor.
How should I specify fields accessor in the spec?
If not possible, how do I define it manually?
And I have the same problem for this HDF5 function prototype: int H5Tget_array_dims(hid_t tid, hsize_t *dims[])
The following syntax is not accepted:
H5Tget_array_dims: tid with: dims
<cdecl: long 'H5Tget_array_dims'(Hid_t Hsize_t * * )>
The compiler barks "argument expected ->" before the second *...
I had to resort to void * instead, which totally bypasses typechecking. Less than ideal...
Any idea how to deal correctly with such prototype?
Since Compiler-mt.435, the parser no longer complains but calls back to ExternalType>>asPointerToPointerType. See source.squeak.org/trunk/Compiler-mt.435.diff and source.squeak.org/FFI/FFI-Kernel-mt.96.diff
At the time of writing, such a pointer-to-pointer type is treated as a regular pointer type. So you lose the information that the external type actually points to an array of pointers.
When would one need that information?
When coercing arguments in the FFI plugin during the call
When constructing the returned object in the FFI plugin during the call
When interpreting instances of ExternalData from struct fields and FFI call return values
In tools such as the object explorer
There are already several kinds of RawBitsArray in Squeak. Adding String and ExternalStructure (incl. packed or union) to the mix, we have all kinds of objects in Squeak to map the inner-most dimension (i.e., int*, char*, void*). ExternalData can represent the other levels of the multi-dimensional array (i.e., int**, char**, void** and so on).
So, there are remaining tasks here:
Store that pointer dimension information maybe in a new external type to be found via ExternalType>>referencedType. We may want to put new information into compiledSpec. See http://forum.world.st/FFI-Plugin-Question-about-multi-dimensional-arrays-e-g-char-int-void-td5118484.html
Update value reading in ExternalArray to unwrap one pointer after the other; and let the code generator for struct-field accessors generate code in a similar fashion.
Extend argument coercing in the plugin to accept arrays of the already supported arrays (i.e. String etc.)

Object-oriented programming in Go -- use "new" keyword or nah?

I am learning Go, and I have a question based on the following code:
package main
import (
"fmt"
)
type Vector struct {
x, y, z int
}
func VectorFactory(x,y,z int) *Vector {
return &Vector{x, y, z}
}
func main() {
vect := VectorFactory(1, 2, 3)
fmt.Printf("%d\n", (vect.x * vect.y * vect.z))
}
Here I've defined a type Vector with x, y, and z, and I've defined function VectorFactory which declares a pointer to a Vector and returns that pointer. I use this function to create a new Vector named vect.
Is this bad code? Should I be using the new keyword rather than building a Factory?
Do I need to delete the Vector after using it, like in C++? If so, how?
Thanks. I'm still waiting for my Go book to be delivered.
Prefer NewThing to ThingFactory.
Don't make a NewThing function unless you have complex initialisation, or you're intentionally not exporting parts of a struct. Selectively setting only parts of a struct is not complex; that can be accomplished by using labels (field names in the struct literal). Complex would be things like "the value of slot Q depends on what the value of slot Zorb is". Unexported struct fields can be useful for information hiding, but should be used with care.
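A minimal sketch of the options mentioned above, assuming the Vector type from the question. NewVector is a hypothetical name chosen to follow the NewThing convention; it is not something the question defines.

```go
package main

import "fmt"

type Vector struct {
	x, y, z int
}

// NewVector is a hypothetical constructor named after the NewThing convention.
// For a struct this simple it adds little over a plain literal.
func NewVector(x, y, z int) *Vector {
	return &Vector{x, y, z}
}

func main() {
	a := NewVector(1, 2, 3)  // constructor function
	b := &Vector{x: 1, z: 3} // labelled fields; y keeps its zero value
	c := new(Vector)         // built-in new: pointer to a zeroed Vector

	fmt.Println(*a, *b, *c) // prints {1 2 3} {1 0 3} {0 0 0}
}
```

All three forms yield a *Vector, so the choice is mostly about readability and about whether initialisation needs any logic.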
Go is garbage-collected; any piece of data that is not referenced is eligible to be collected. Start out by not worrying about it, then get to a point where you ensure you clean up any reference to data you're no longer interested in, so as to avoid accidental liveness ("accidental liveness" is essentially the GC equivalent of "memory leak").
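One common way accidental liveness shows up in Go is a small slice that still points into a large backing array. The sketch below illustrates that single case; headView and headCopy are hypothetical helper names, not standard library functions.

```go
package main

import "fmt"

// headView returns a view onto big. The result shares big's backing array,
// so the whole array stays reachable for as long as the view is referenced.
func headView(big []byte, n int) []byte {
	return big[:n]
}

// headCopy copies the first n bytes into a fresh slice, so the large
// backing array can be collected once nothing else refers to it.
func headCopy(big []byte, n int) []byte {
	out := make([]byte, n)
	copy(out, big[:n])
	return out
}

func main() {
	big := make([]byte, 1<<20)

	v := headView(big, 4)
	c := headCopy(big, 4)

	fmt.Println(len(v), cap(v)) // prints 4 1048576
	fmt.Println(len(c), cap(c)) // prints 4 4
}
```

The capacity of the view gives it away: it still spans the full megabyte, while the copy holds only the four bytes it needs.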
If you expect to print your data structures frequently, consider making a String method for them (this is not exactly corresponding to the print you do, but might be generally more useful for a vector):
func (v Vector) String() string {
return fmt.Sprintf("V<%d, %d, %d>", v.x, v.y, v.z)
}
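A small runnable sketch of how such a String method is picked up, assuming the Vector type from the question. Because the method has a value receiver, both Vector and *Vector satisfy fmt.Stringer.

```go
package main

import "fmt"

type Vector struct {
	x, y, z int
}

// String makes Vector satisfy fmt.Stringer, so the fmt print functions
// use this representation instead of the default struct formatting.
func (v Vector) String() string {
	return fmt.Sprintf("V<%d, %d, %d>", v.x, v.y, v.z)
}

func main() {
	vect := &Vector{1, 2, 3}
	fmt.Println(vect)  // prints V<1, 2, 3>
	fmt.Println(*vect) // prints V<1, 2, 3>
}
```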
Unless "vect" really means something to you, prefer either "v" or "vector" as a name.

Why does smallCheck's `Series` class have two types in the constructor?

This question is related to my other question about smallCheck's Test.SmallCheck.Series class. When I try to define an instance of the class Serial in the following natural way (suggested to me by an answer by @tel to the above question), I get compiler errors:
data Person = SnowWhite | Dwarf Int
instance Serial Person where ...
It turns out that Serial wants to have two arguments. This, in turn, necessitates some compiler flags. The following works:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
import Test.SmallCheck
import Test.SmallCheck.Series
import Control.Monad.Identity
data Person = SnowWhite | Dwarf Int
instance Serial Identity Person where
series = generate (\d -> SnowWhite : take (d-1) (map Dwarf [1..7]))
My question is:
Was putting that Identity there the "right thing to do"? I was inspired by the type of the Test.SmallCheck.Series.list function (which I also found extremely bizarre when I first saw it):
list :: Depth -> Series Identity a -> [a]
What is the right thing to do? Will I be OK if I just blindly put Identity in whenever I see it? Should I have put something like Serial m Integer => Serial m Person instead (that necessitates some more scary-looking compiler flags: FlexibleContexts and UndecidableInstances at least)?
What is that first parameter (the m in Serial m n) for?
Thank you!
I'm just a user of smallcheck and not a developer, but I think the answer is
1) Not really. You should leave it polymorphic, which you can do without the said extensions:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}
import Test.SmallCheck
import Test.SmallCheck.Series
import Control.Monad.Identity
data Person = SnowWhite | Dwarf Int deriving (Show)
instance (Monad m) => Serial m Person where
series = generate (\d -> SnowWhite : take (d-1) (map Dwarf [1..7]))
2) Series is currently defined as
newtype Series m a = Series (ReaderT Depth (LogicT m) a)
which means that m is the base monad for LogicT, which is used to generate the values in the series. For example, writing IO in place of m would allow IO actions to happen while generating the series.
In SmallCheck, m appears also in the Testable instance declarations, such as instance (Serial m a, Show a, Testable m b) => Testable m (a->b). This has the concrete effect that the pre-existing driver functions such as smallCheck :: Testable IO a => Depth -> a -> IO () cannot be used if you only have instances for Identity.
In practice, you could make use of this fact by writing a custom driver function which interleaves some monadic effect like logging of the generated values (or some such) inside the said driver.
It might also be useful for other things which I'm not aware of.

What is a programming language with dynamic scope and static typing?

I know the language exists, but I can't put my finger on it.
Dynamic scope and static typing?
We can try to reason about what such a language might look like. Obviously something like this (using a C-like syntax for demonstration purposes) cannot be allowed, or at least not with the obvious meaning:
int x_plus_(int y) {
return x + y; // requires that x have type int
}
int three_plus_(int y) {
double x = 3.0;
return x_plus_(y); // calls x_plus_ when x has type double
}
So, how to avoid this?
I can think of a few approaches offhand:
1. Commenters above mention that Fortran pre-'77 had this behavior. That worked because a variable's name determined its type; a function like x_plus_ above would be illegal, because x could never have an integer type. (And likewise one like three_plus_, for that matter, because y would have the same restriction.) Integer variables had to have names beginning with i, j, k, l, m, or n.
2. Perl uses syntax to distinguish a few broad categories of variables, namely scalars vs. arrays (regular arrays) vs. hashes (associative arrays). Variables belonging to the different categories can have the exact same name, because the syntax distinguishes which one is meant. For example, the expression foo $foo, $foo[0], $foo{'foo'} involves the function foo, the scalar $foo, the array @foo ($foo[0] being the first element of @foo), and the hash %foo ($foo{'foo'} being the value in %foo corresponding to the key 'foo'). Now, to be quite clear, Perl is not statically typed, because there are many different scalar types, and these types are not distinguished syntactically. (In particular: all references are scalars, even references to functions or arrays or hashes. So if you use the syntax to dereference a reference to an array, Perl has to check at runtime to see if the value really is an array-reference.) But this same approach could be used for a bona fide type system, especially if the type system were a very simple one. With that approach, the x_plus_ method would be using an x of type int, and would completely ignore the x declared by three_plus_. (Instead, it would use an x of type int that had to be provided from whatever scope called three_plus_.) This could either require some type annotations not included above, or it could use some form of type inference.
3. A function's signature could indicate the non-local variables it uses, and their expected types. In the above example, x_plus_ would have the signature "takes one argument of type int; uses a calling-scope x of type int; returns a value of type int". Then, just like how a function that calls x_plus_ would have to pass in an argument of type int, it would also have to provide a variable named x of type int, either by declaring it itself, or by inheriting that part of the type-signature (since calling x_plus_ is equivalent to using an x of type int) and propagating this requirement up to its callers. With this approach, the three_plus_ function above would be illegal, because it would violate the signature of the x_plus_ method it invokes, just the same as if it tried to pass a double as its argument.
4. The above could just have "undefined behavior"; the compiler wouldn't have to explicitly detect and reject it, but the spec wouldn't impose any particular requirements on how it had to handle it. It would be the responsibility of programmers to ensure that they never invoke a function with incorrectly-typed non-local variables.
Your professor was presumably thinking of #1, since pre-'77 Fortran was an actual real-world language with this property. But the other approaches are interesting to think about. :-)
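To make the idea concrete, here is a rough sketch in Go (1.18 or later, for generics) in the spirit of the approaches where a name has one fixed type: the dynamically scoped x is declared once as an int, and every binding of it is type-checked at compile time. Go itself is lexically scoped, so the dynamic scoping is simulated with a binding stack; DynVar, With and Get are hypothetical names invented for this illustration.

```go
package main

import "fmt"

// DynVar is a hypothetical dynamically scoped variable whose type T is fixed
// at compile time. Bindings live on a stack; the innermost one wins.
type DynVar[T any] struct {
	bindings []T
}

// With binds v for the duration of body, then restores the previous binding.
func (d *DynVar[T]) With(v T, body func()) {
	d.bindings = append(d.bindings, v)
	defer func() { d.bindings = d.bindings[:len(d.bindings)-1] }()
	body()
}

// Get returns the innermost binding; it panics if nothing is bound.
func (d *DynVar[T]) Get() T {
	return d.bindings[len(d.bindings)-1]
}

// x is dynamically scoped, but every binding of it must be an int.
var x DynVar[int]

// xPlus reads whichever x is bound by the code that is currently calling it.
func xPlus(y int) int {
	return x.Get() + y
}

func threePlus(y int) int {
	result := 0
	// Passing a float64 variable here would be rejected by the compiler.
	x.With(3, func() { result = xPlus(y) })
	return result
}

func main() {
	fmt.Println(threePlus(4)) // prints 7
}
```

What stays dynamic is which binding of x is visible; what stays static is its type. An unbound x is still only detected at run time in this sketch.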
I haven't found this written down elsewhere, but the AXIOM CAS (and its various forks, including FriCAS, which is still being actively developed) uses a scripting language called SPAD that has both a very novel, strong, static, dependent type system and dynamic scoping (although the latter is possibly an unintended implementation bug).
Most of the time the user won't notice, but when they try to build closures as in other functional languages, the dynamic scoping reveals itself:
FriCAS Computer Algebra System
Version: FriCAS 2021-03-06
Timestamp: Mon May 17 10:43:08 CST 2021
-----------------------------------------------------------------------------
Issue )copyright to view copyright notices.
Issue )summary for a summary of useful system commands.
Issue )quit to leave FriCAS and return to shell.
-----------------------------------------------------------------------------
(1) -> foo (x,y) == x + y
Type: Void
(2) -> foo (1,2)
Compiling function foo with type (PositiveInteger, PositiveInteger)
-> PositiveInteger
(2) 3
Type: PositiveInteger
(3) -> foo
(3) foo (x, y) == x + y
Type: FunctionCalled(foo)
(4) -> bar x y == x + y
Type: Void
(5) -> bar
(5) bar x == y +-> x + y
Type: FunctionCalled(bar)
(6) -> (bar 1)
Compiling function bar with type PositiveInteger ->
AnonymousFunction
(6) y +-> #1 + y
Type: AnonymousFunction
(7) -> ((bar 1) 2)
(7) #1 + 2
Type: Polynomial(Integer)
This behavior is similar to what happens when you try to build a closure with (lambda (x) (lambda (y) (+ x y))) in a dynamically scoped Lisp, such as Emacs Lisp. In fact, the underlying representation of functions is essentially the same as in the Lisps of the early days, since AXIOM was first developed on top of an early Lisp implementation on IBM mainframes.
I believe it is a defect, however (much like what happened to JMC when implementing the first version of LISP), because the implementor made the parser curry definitions, as in the function definition of bar, but that is unlikely to be useful without the ability to build closures in the language.
It is also worth noting that SPAD automatically renames the variables in anonymous functions to avoid captures so its dynamic scoping could be used as a feature as in other Lisps.
Dynamic scope means that the variable, and therefore its type, on a specific line of your code depends on which functions were called before. This means you cannot know the type on a specific line of your code, because you cannot know which code has been executed before.
Static typing means that you have to know the type on every line of your code before the code starts to run.
This is irreconcilable.