How can I simplify my deserialization framework?

I have a Serialization interface which is designed to encapsulate the differences between XML/JSON/binary serialization for my application. It looks something like this:
interface Serialization {
    bool isObject();
    int opApply(int delegate(string member, Serialization value) del); // iterate object
    ...
    int toInt();   // this part is ugly, but without template member overloading I can't
    long toLong(); // figure out any way to apply generics here, so all basic types
    ...            // have a toType primitive
    string toString();
}
class JSONSerialization : Serialization {
    private JSON json;
    ...
    long toLong() {
        enforce(json.type == JSON_TYPE.NUMBER, SerializationException.IncorrectType);
        return cast(long)json.toNumber();
    }
    ...
}
So, what I then set up is a set of templates for registering type deserializers and calling them:
...
registerTypeDeserializer!Vec3(delegate Vec3(Serialization s) {
    return Vec3(s[0].toFloat, s[1].toFloat, s[2].toFloat);
});
...
auto v = parseJSON("some file").deserialize!Vec3;
...
registerTypeDeserializer!Light(delegate Light(Serialization s) {
    return new Light(s["intensity"].toFloat, s["position"].deserialize!Vec3);
});
This works well for structs and simple classes, and with the new parameter identifier tuple and parameter default value tuple I should even be able to add automatic deserializer generation. However, I don't really like the inconsistency between basic and user defined types, and more importantly, complex types have to rely on global state to acquire references:
static MaterialLibrary materials;
registerTypeDeserializer!Model(delegate Model(Serialization s) {
    return new Model(materials.borrow(s["material"].toString), ...);
});
That's where it really falls apart. Because I can't (without a proliferation of register deserializer functions) pass other parameters to the deserializer, I'm having difficulty avoiding ugly global factories. I've thought about eliminating the deserialize template, and requiring a deserialize function (which could accept multiple parameters) for each user defined type, but that seems like a lot of work for e.g. POD structs.
So, how can I simplify this design, and hopefully avoid tons of boilerplate deserializers, while still allowing me to inject object factories appropriately, instead of assigning them globally?

Basic types can be read using readf / formattedRead, so you can create a wrapper function that uses formattedRead when possible and otherwise falls back to a static function on the desired type to read the value. Something like this:
import std.format : formattedRead;
import std.stdio : readf;

auto _readFrom(T)(string s) {
    // basic types: formattedRead knows how to parse these
    static if (__traits(compiles, (readf("", cast(T*)(null))))) {
        T result;
        formattedRead(s, "%s", &result);
        return result;
    } else {
        // user-defined types: delegate to a static readFrom on the type itself
        return T.readFrom(s);
    }
}

OOP: Inheriting from immutable objects

Background
Suppose I have a set of fields which are related to each other, so I make a class to gather them. Let us call this class Base. There are also certain methods that operate on these fields and will be common to all derived classes. Additionally, let us suppose we want Base and all its derived classes to be immutable.
In different contexts, these fields support additional operations, so I have different derived classes which inherit the fields and provide additional methods, depending on their context. Let us call these Derived1, Derived2, etc.
In certain scenarios, the program needs instances of a derived class, but the state of the fields must satisfy some condition. So I made a class RestrictedDerived1 which makes sure that the condition is satisfied (or changes the parameters to conform if it can) in the constructor before calling its base constructor, or throws an error otherwise.
Further, there are situations where I need even more conditions to be met, so I have SuperRestrictedDerived1. (Side note: given that some conditions are met, this class can more efficiently compute certain things, so it overrides some methods of Derived1.)
Problem
So far so good. The problem is that most of the methods of all these classes involve making another instance of some class in this hierarchy (not always the same as the one that the method was called on, but usually the same one) based on itself, but with some modifications which may involve somewhat complex computation (i.e. not just changing one field). For example one of the methods of Derived1 might look like:
public Derived1 foo(Base b) {
    TypeA fieldA = // calculations using this and b
    TypeB fieldB = // more calculations
    // ... calculate all fields in this way
    return new Derived1(fieldA, fieldB, /* ... */);
}
But then down the hierarchy RestrictedDerived1 needs this same function to return an instance of itself (obviously throwing an error if it can't be instantiated), so I'd need to override it like so:
@Override
public RestrictedDerived1 foo(Base b) {
    return new RestrictedDerived1(super.foo(b));
}
This requires a copy constructor and unnecessarily allocates an intermediate object which is immediately destroyed.
Possible solution
An alternative solution I thought of was to pass a function to each of these methods which constructs some type of Base, and then the functions would look like this:
// In Derived1
public Derived1 foo(Base b, BaseCreator creator) {
    TypeA fieldA = // calculations using this and b
    TypeB fieldB = // more calculations
    // ... calculate all fields in this way
    return creator.create(fieldA, fieldB, /* ... */);
}

public Derived1 foo(Base b) {
    return foo(b, Derived1::create);
}

public static Derived1 create(TypeA fieldA, TypeB fieldB, /* ... */) {
    return new Derived1(fieldA, fieldB, /* ... */);
}
// In RestrictedDerived1
@Override
public RestrictedDerived1 foo(Base b) {
    return (RestrictedDerived1) foo(b, RestrictedDerived1::create);
}

public static RestrictedDerived1 create(TypeA fieldA, TypeB fieldB, /* ... */) {
    return new RestrictedDerived1(fieldA, fieldB, /* ... */);
}
My question
This works, however it feels "clunky" to me. My question is, is there some design pattern or concept or alternative design that would facilitate my situation?
I tried to use generics, but that got messy quickly and didn't work well for more than one level of inheritance.
By the way, the actual classes these refer to are 3D points and vectors. I have a base called Triple with doubles x, y, and z (and some functions which take a lambda and apply it to each coordinate, constructing a new Triple from the result). Then I have a derived class Point with some point-related functions, and another derived class Vector with its functions. Then I have NonZeroVector (extends Vector), which is a vector that cannot be the zero vector (since other objects that need a vector sometimes need a guarantee that it's not the zero vector, and I don't want to have to check that everywhere). Further, I have NormalizedVector (extends NonZeroVector), which is guaranteed to have a length of 1 and will normalize itself upon construction.
MyType
This can be solved using a concept variously known as MyType, this type, or self type. The basic idea is that the MyType is the most-derived type at runtime. You can think of it as the dynamic type of this, but referred to statically (at "compile time").
Unfortunately, not many mainstream programming languages have MyTypes, but e.g. TypeScript does, and I was told Raku does as well.
In TypeScript, you could solve your problem by making the return type of foo the MyType (spelled this in TypeScript). It would look something like this:
class Base {
    constructor(public readonly fieldA: number, public readonly fieldB: string) {}

    foo(b: Base): this {
        return new this.constructor(this.fieldA + b.fieldA, this.fieldB + b.fieldB);
    }
}

class Derived1 extends Base {
    constructor(fieldA: number, fieldB: string, protected readonly repeat: number) {
        super(fieldA * repeat, fieldB.repeat(repeat));
    }

    override foo(b: Base): this {
        return new this.constructor(
            this.fieldA + b.fieldA, this.fieldB + b.fieldB, this.repeat
        );
    }
}

class RestrictedDerived1 extends Derived1 {
    constructor(fieldA: number, fieldB: string, repeat: number) {
        super(fieldA * repeat, fieldB.repeat(repeat), repeat);
        if (repeat >= 3) {
            throw new RangeError(`repeat must be less than 3 but is ${repeat}`);
        }
    }
}
const a = new RestrictedDerived1(23, 'Hello', 2);
const b = new Base(42, ' World');
const restrictedDerived = a.foo(b); // Inferred type is RestrictedDerived1
Slightly b0rken Playground link
Implicit factories
In a language with type classes or implicits (like Scala), you could solve your problem with implicit Factory objects. This would be similar to your second example with the Creators, but without the need to explicitly pass the creators around everywhere. Instead, they would be implicitly summoned by the language.
In fact, your requirement is very similar to one of the core requirements of the Scala Collections Framework, namely that you want operations like map, filter, and reduce to only be implemented once, but still preserve the type of the collection.
Most other Collections Frameworks are only able to achieve one of those goals: Java, C#, and Ruby, for example, only have one implementation for each operation, but they always return the same, most-generic type (Stream in Java, IEnumerable in C#, Array in Ruby). Smalltalk's Collections Framework is type-preserving, but has duplicated implementations for every operation. A non-duplicated, type-preserving Collections Framework is one of the holy grails of abstraction designers and language designers. (It's no coincidence that so many papers that present novel approaches to OO use a refactoring of the Smalltalk Collection Framework as their working example.)
F-bounded Polymorphism
If you have neither MyType nor implicit builders available, you can use F-bounded Polymorphism.
The classic example is how Java's clone method should have been designed:
interface Cloneable<T extends Cloneable<T>> {
    public T clone();
}

class Foo implements Cloneable<Foo> {
    @Override
    public Foo clone() {
        return new Foo();
    }
}
JDoodle example
However, this gets tedious very quickly for deeply-nested inheritance hierarchies. I tried to model it in Scala, but I gave up.

How to handle conflicting function names when implementing multiple interfaces?

I have an interface defined in C# that implements IEnumerable. The implementation of the interface will be done in C++/WinRT as it needs direct access to native code. When I attempt to implement this interface using C++/WinRT, the generated header/implementation contains two 'First()' functions (one from IIterable, and one from IBindableIterable) with different return types. Obviously this isn't going to compile.
Is there some way to "rename" one (or both) of the conflicting functions in the IDL file? C++/CX had a workaround that allowed you to use a different function name and then 'bind' it back to the interface name.
Simplified example code below:
Interface:
public interface IUInt32Array : IEnumerable<uint> {}
IDL:
[default_interface]
runtimeclass UInt32Array : IUInt32Array
{
    UInt32Array(UInt32 size);
}
IDL Generated Header:
struct UInt32Array : UInt32ArrayT<UInt32Array>
{
    UInt32Array(uint32_t size);

    Windows::Foundation::Collections::IIterator<uint32_t> First(); // <-- Problem
    Windows::UI::Xaml::Interop::IBindableIterator First();         // <-- Problem
};
A solution for this specific problem is to use a combination of 'auto' as the declared return type for the First() function implementation, and to return a type with conversion operators for the two different return types.
Here is an example showing how this was solved in the CppWinRT source code. The linked source code is for the base_collections_vector.h header, specifically see the convertible_observable_vector::First() function (replicated below).
auto First() {
    struct result {
        container_type* container;

        operator wfc::IIterator<T>() {
            return static_cast<base_type*>(container)->First();
        }

        operator wfc::IIterator<Windows::Foundation::IInspectable>() {
            return make<iterator>(container);
        }
    };
    return result{ this };
}
Notice here that the function itself is declared as returning auto, which allows it to return an intermediate type. This intermediate type implements conversion operators for converting to the types expected by the caller. This works for this particular problem because the generated C++/WinRT code immediately assigns the result of the call to a value of the expected type, which invokes the appropriate conversion operator and returns the correct iterator type.
Thanks to Kenny Kerr who pointed me at both the example and a write-up explaining the above.

Serializing Models in Golang

I'm trying to separate my code into models and serializers, with the idea that defined serializers handle all JSON responsibilities, i.e. separation of concerns. I also want to be able to call obj.Serialize() on a model object to get the serializer struct, which I can then marshal. Therefore, I've come up with the following design. To avoid a circular import I had to use interfaces in my serializers, which leads to using getters in my models. I've read that getters/setters aren't idiomatic Go code, and I would prefer not to have "boilerplate" getter code all over my models. Is there a better solution to what I want to accomplish, keeping in mind I want separation of concerns and obj.Serialize()?
src/
    models/
        a.go
    serializers/
        a.go
models/a.go
import "../serializers"
type A struct {
name string
age int // do not marshal me
}
func (a *A) Name() string {
return a.name
}
// Serialize converts A to ASerializer
func (a *A) Serialize() interface{} {
s := serializers.ASerializer{}
s.SetAttrs(a)
return s
}
serializers/a.go
// AInterface is used to get A's attributes
type AInterface interface {
    Name() string
}

// ASerializer holds json fields and values
type ASerializer struct {
    Name string `json:"full_name"`
}

// SetAttrs sets attributes for ASerializer
func (s *ASerializer) SetAttrs(a AInterface) {
    s.Name = a.Name()
}
It looks like you are actually trying to translate between your internal structs and json. We can start by taking advantage of the json library.
If you want certain libraries to handle your struct fields in certain ways, there are tags. This example shows how json tags tell the json package to never marshal the field Age into json, to only add the field JobTitle if it is not empty, and that JobTitle is actually called title in json. This renaming feature is very useful when structs in Go contain capitalized (exported) fields but the json api you're connecting to uses lowercase keys.
type A struct {
    Name     string
    Age      int    `json:"-"` // do not marshal me
    location string // unexported (private) fields are not included in the json marshal output
    JobTitle string `json:"title,omitempty"` // in our json this field is called "title", and we only write the key if the field is not empty
}
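To see what those tags actually produce, here is a quick runnable check (the sample values are made up):
package main

import (
    "encoding/json"
    "fmt"
)

type A struct {
    Name     string
    Age      int    `json:"-"`
    location string
    JobTitle string `json:"title,omitempty"`
}

func main() {
    withTitle := A{Name: "Ada", Age: 36, location: "London", JobTitle: "Engineer"}
    noTitle := A{Name: "Ada", Age: 36}

    out1, _ := json.Marshal(withTitle)
    out2, _ := json.Marshal(noTitle)
    fmt.Println(string(out1)) // {"Name":"Ada","title":"Engineer"} (Age and location are dropped)
    fmt.Println(string(out2)) // {"Name":"Ada"} (omitempty also drops the empty title)
}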
If you need to precompute a field, or simply add a field in your json output of a struct that isn't a member of that struct, we can do that with some magic. When json objects are decoded again into golang structs, fields that don't fit (after checking renamed fields and capitalization differences) are simply ignored.
// AntiRecursionMyStruct avoids infinite recursion in MarshalJSON. Only intended for the json package to use.
type AntiRecursionMyStruct MyStruct

// MarshalJSON implements the json.Marshaler interface. This lets us marshal this struct into json however we
// want. In this case, we add a field and then cast the struct to another type that doesn't implement
// json.Marshaler, thereby letting the json library marshal it for us.
func (t MyStruct) MarshalJSON() ([]byte, error) {
    return json.Marshal(struct {
        AntiRecursionMyStruct
        Kind string // the field we want to add; here, a text representation of the golang type used to generate the struct
    }{
        AntiRecursionMyStruct: AntiRecursionMyStruct(t),
        Kind:                  fmt.Sprintf("%T", MyStruct{}),
    })
}
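To see the added field in action, here is a self-contained version of the same technique; the one-field MyStruct is an assumption, since the thread doesn't show the real struct:
package main

import (
    "encoding/json"
    "fmt"
)

// MyStruct stands in for the real struct, which isn't shown in the thread.
type MyStruct struct {
    Name string
}

// AntiRecursionMyStruct strips the MarshalJSON method so the json package
// can do the default marshaling without recursing back into our override.
type AntiRecursionMyStruct MyStruct

func (t MyStruct) MarshalJSON() ([]byte, error) {
    return json.Marshal(struct {
        AntiRecursionMyStruct
        Kind string
    }{
        AntiRecursionMyStruct: AntiRecursionMyStruct(t),
        Kind:                  fmt.Sprintf("%T", MyStruct{}),
    })
}

func main() {
    out, err := json.Marshal(MyStruct{Name: "example"})
    if err != nil {
        panic(err)
    }
    fmt.Println(string(out)) // {"Name":"example","Kind":"main.MyStruct"}
}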
Keep in mind that json will only include your exported (capitalized) struct members. I've made this mistake multiple times.
As a general rule, if something seems too complicated, there's probably a better way to do it.

Type hinting v duck typing

Using the following simple example (coded in PHP):
public function doSomething(Registry $registry)
{
    $object = $registry->getData('object_key');
    if ($object) {
        //use the object to do something
    }
}

public function doSomething($registry)
{
    $object = $registry->getData('object_key');
    if ($object) {
        //use the object to do something
    }
}
What are the benefits of either approach?
Both will ultimately fail, just at different points:
The first example will fail if an object not of type Registry is passed, and the second will fail if the object passed does not implement a getData method.
How do you choose when to use either approach?
Those are two different design approaches. Either way, the responsibility falls on the developer(s) to make sure the method won't fail.
Type hinting is a more robust approach while duck typing gives you more flexibility.
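For what it's worth, the same trade-off can be made concrete in a statically typed language. Here is a sketch in Go (not from the original thread), where a concrete parameter type plays the role of the type hint and a small interface plays the role of duck typing:
package main

import "fmt"

// Registry is a concrete type; requiring it is the analogue of the type hint.
type Registry struct{ data map[string]any }

func (r *Registry) GetData(key string) any { return r.data[key] }

// Typed variant: only a *Registry can be passed, checked at compile time.
func doSomethingTyped(r *Registry) {
    if obj := r.GetData("object_key"); obj != nil {
        fmt.Println("typed:", obj)
    }
}

// DataSource is satisfied by anything with a GetData method; accepting it is
// the analogue of duck typing, though the check still happens at compile time.
type DataSource interface{ GetData(key string) any }

func doSomethingDuck(r DataSource) {
    if obj := r.GetData("object_key"); obj != nil {
        fmt.Println("duck:", obj)
    }
}

func main() {
    r := &Registry{data: map[string]any{"object_key": 42}}
    doSomethingTyped(r)
    doSomethingDuck(r) // any other type with a GetData method would work here too
}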

Does PetaPoco handle enums?

I'm experimenting with PetaPoco to convert a table into POCOs.
In my table, I've got a column named TheEnum. The values in this column are strings that represent the following enum:
public enum MyEnum
{
    Fred,
    Wilma
}
PetaPoco chokes when it tries to convert the string "Fred" into a MyEnum value.
It does this in the GetConverter method, in the line:
Convert.ChangeType( src, dstType, null );
Here, src is "Fred" (a string), and dstType is typeof(MyEnum).
The exception is an InvalidCastException, saying Invalid cast from 'System.String' to 'MyEnum'
Am I missing something? Is there something I need to register first?
I've got around the problem by adding the following into the GetConverter method:
if (dstType.IsEnum && srcType == typeof(string))
{
    converter = delegate(object src)
    {
        return Enum.Parse(dstType, (string)src);
    };
}
Obviously, I don't want to run this delegate on every row as it'll slow things down tremendously. I could register this enum and its values into a dictionary to speed things up, but it seems to me that something like this would likely already be in the product.
So, my question is, do I need to do anything special to register my enums with PetaPoco?
Update 23rd February 2012
I submitted a patch a while ago but it hasn't been pulled in yet. If you want to use it, look at the patch and merge into your own code, or get just the code from here.
I'm using 4.0.3 and PetaPoco automatically converts enums to integers and back. However, I wanted to convert my enums to strings and back. Taking advantage of Steve Dunn's EnumMapper and PetaPoco's IMapper, I came up with this. Thanks guys.
Note that it does not handle Nullable<TEnum> or null values in the DB. To use it, set PetaPoco.Database.Mapper = new MyMapper();
class MyMapper : PetaPoco.IMapper
{
    static EnumMapper enumMapper = new EnumMapper();

    public void GetTableInfo(Type t, PetaPoco.TableInfo ti)
    {
        // pass-through implementation
    }

    public bool MapPropertyToColumn(System.Reflection.PropertyInfo pi, ref string columnName, ref bool resultColumn)
    {
        // pass-through implementation
        return true;
    }

    public Func<object, object> GetFromDbConverter(System.Reflection.PropertyInfo pi, Type SourceType)
    {
        if (pi.PropertyType.IsEnum)
        {
            return dbObj =>
            {
                string dbString = dbObj.ToString();
                return enumMapper.EnumFromString(pi.PropertyType, dbString);
            };
        }
        return null;
    }

    public Func<object, object> GetToDbConverter(Type SourceType)
    {
        if (SourceType.IsEnum)
        {
            return enumVal =>
            {
                string enumString = enumMapper.StringFromEnum(enumVal);
                return enumString;
            };
        }
        return null;
    }
}
You're right, handling enums is not built into PetaPoco and usually I just suggest doing exactly what you've done.
Note that this won't slow things down for requests that don't use the enum type. PetaPoco generates code to map responses to pocos, so the delegate will only be called when really needed. In other words, GetConverter will only be called the first time a particular poco type is used, and the delegate will only be called when an enum actually needs conversion. I'm not sure about the speed of Enum.Parse, but you could cache the results in a dictionary if it's too slow.
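The dictionary-cache idea itself is language-agnostic. As a minimal illustration (a sketch in Go rather than PetaPoco's C#, reusing the MyEnum from the question), the point is to build the name-to-value table once so every later conversion is a plain map lookup instead of a reflective parse:
package main

import "fmt"

// MyEnum mirrors the enum from the question (Fred, Wilma).
type MyEnum int

const (
    Fred MyEnum = iota
    Wilma
)

// myEnumByName is built once; afterwards each conversion is a single map
// lookup instead of a reflective parse.
var myEnumByName = map[string]MyEnum{
    "Fred":  Fred,
    "Wilma": Wilma,
}

func parseMyEnum(s string) (MyEnum, error) {
    v, ok := myEnumByName[s]
    if !ok {
        return 0, fmt.Errorf("unknown MyEnum value %q", s)
    }
    return v, nil
}

func main() {
    v, err := parseMyEnum("Fred")
    fmt.Println(v, err) // 0 <nil>
}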
If you are using PetaPoco's T4 generation and you want enums in your generated type, you can use the PropertyType override in Database.tt:
tables["App"]["Type"].PropertyType = "Full.Namespace.To.AppType";
If you want to store the name of the enum value instead of its index number (1, 2, 4, for example), you can locate the update function in the PetaPoco class and edit it; the code is "managed" in your project, since adding PetaPoco as a NuGet package stores the .cs file in your project. Suppose we have the enum Color = {red, yellow, blue}.
Instead of:
// Store the parameter in the command
AddParam(cmd, pc.GetValue(poco), pc.PropertyInfo);
change to:
// enum?
if (i.Value.PropertyInfo.PropertyType.IsEnum)
{
    AddParam(cmd, i.Value.GetValue(poco).ToString(), i.Value.PropertyInfo);
}
else
{
    // Store the parameter in the command
    AddParam(cmd, i.Value.GetValue(poco), i.Value.PropertyInfo);
}
It would store "yellow" instead of 2