Protobuf-net skip deserialization of specific fields - serialization

I have serialized this class:
[ProtoContract]
public class TestClass
{
[ProtoMember(1)] public int[] hugeArray;
[ProtoMember(2)] public int x;
[ProtoMember(3)] public int y;
//lot more fields and properties to serialize here...
}
How do I skip the [ProtoMember(1)] hugeArray during deserialization, so that only x, y, and other fields get deserialized?
My problem is that sometimes I quickly need only to get the 'metadata', which is what other fields and properties describe, but sometimes I need an entire object.

Two options:
Use two RuntimeTypeModel instances, one of them built manually with only the desired fields specified.
Use two types: create a simpler version of TestClass that omits the big fields (for example, TestClassMetadata) and deserialize into that; protobuf-net won't mind at all.
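The second option works because a protobuf reader skips any field number the target type does not declare. The following is only a sketch of that mechanism in plain Java, not protobuf-net code: the class SkipFieldSketch, its Metadata type, and the sample message are hypothetical, and the sample assumes the array is written as one varint entry per element. A packed array would arrive as a single length-delimited field, which the same skip logic also covers.

```java
import java.io.ByteArrayOutputStream;

final class SkipFieldSketch {
    static final int VARINT = 0;            // wire type of int32 values
    static final int LENGTH_DELIMITED = 2;  // wire type of strings, bytes, packed arrays

    // Hypothetical counterpart of a "TestClassMetadata" type: fields 2 and 3 only.
    static final class Metadata {
        int x;
        int y;
    }

    private final byte[] buf;
    private int pos;

    SkipFieldSketch(byte[] buf) { this.buf = buf; }

    static void writeVarint(ByteArrayOutputStream out, long value) {
        while ((value & ~0x7FL) != 0) {
            out.write((int) ((value & 0x7F) | 0x80)); // 7 payload bits + continuation bit
            value >>>= 7;
        }
        out.write((int) value);
    }

    // Writes one varint field: tag = (fieldNumber << 3) | wireType, then the value.
    static void writeIntField(ByteArrayOutputStream out, int fieldNumber, int value) {
        writeVarint(out, ((long) fieldNumber << 3) | VARINT);
        writeVarint(out, value);
    }

    long readVarint() {
        long result = 0;
        int shift = 0;
        while (true) {
            byte b = buf[pos++];
            result |= (long) (b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
            shift += 7;
        }
    }

    // Advances past a field's payload without materializing it.
    void skip(int wireType) {
        switch (wireType) {
            case VARINT:
                readVarint();
                break;
            case LENGTH_DELIMITED:
                int length = (int) readVarint();
                pos += length;
                break;
            default:
                throw new IllegalStateException("wire type not handled in this sketch: " + wireType);
        }
    }

    // Reads only fields 2 and 3; every other field number is skipped.
    static Metadata readMetadata(byte[] data) {
        SkipFieldSketch in = new SkipFieldSketch(data);
        Metadata m = new Metadata();
        while (in.pos < data.length) {
            int tag = (int) in.readVarint();
            int fieldNumber = tag >>> 3;
            int wireType = tag & 7;
            if (fieldNumber == 2 && wireType == VARINT) m.x = (int) in.readVarint();
            else if (fieldNumber == 3 && wireType == VARINT) m.y = (int) in.readVarint();
            else in.skip(wireType);
        }
        return m;
    }

    // Sample message: field 1 repeated 1000 times (the "huge array"), then x = 7, y = 42.
    static byte[] sampleMessage() {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (int i = 0; i < 1000; i++) writeIntField(out, 1, i);
        writeIntField(out, 2, 7);
        writeIntField(out, 3, 42);
        return out.toByteArray();
    }
}
```

Note that skipping still walks over the bytes of the array; what is saved is allocating and filling the array itself.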

Related

How to easily access widely different subsets of fields of related objects/DB tables?

Imagine we have a number of related objects (equivalently DB tables), for example:
public class Person {
private String name;
private Date birthday;
private int height;
private Job job;
private House house;
..
}
public class Job {
private String company;
private int salary;
..
}
public class House {
private Address address;
private int age;
private int numRooms;
..
}
public class Address {
private String town;
private String street;
..
}
How to best design a system for easily defining and accessing widely varying subsets of data on these objects/tables? Design patterns, pros and cons, are very welcome. I'm using Java, but this is a more general problem.
For example, I want to easily say:
I'd like some object with (Person.name, Person.height, Job.company, Address.street)
I'd like some object with (Job.company, House.numRooms, Address.town)
Etc.
Other assumptions:
We can assume that we're always getting a known structure of objects on the input, e.g. a Person with its Job, House, and Address.
The resulting object doesn't necessarily need to know the names of the fields it was constructed from, i.e. for subset defined as (Person.name, Person.height, Job.company, Address.street) it can be the array of Objects {"Joe Doe", 180, "ACompany Inc.", "Main Street"}.
The object/table hierarchy is complex, so there are hundreds of data fields.
There may be hundreds of subsets that need to be defined.
A minority of fields to obtain may be computed from actual fields, e.g. I may want to get a person's age, computed as (now().getYear() - Person.birthday.getYear()).
Here are some options I see:
A SQL view for each subset.
Minuses:
They will be almost the same for similar subsets. This is OK just for field names, but not great for the joins part, which could ideally be refactored out to a common place.
Less testable than a solution in code.
Using a DTO assembler, e.g. http://www.genericdtoassembler.org/
This could be used to flatten the complex structure of input objects into a single DTO.
Minuses:
I'm not sure how I'd then proceed to easily define subsets of fields on this DTO. Perhaps if I could somehow set the ones irrelevant to the current subset to null? Not sure how.
Not sure if I can do computed fields easily in this way.
A custom mapper I came up with.
Relevant code:
// The enum has a value for each field in the Person objects hierarchy
// that we may be interested in.
public enum DataField {
PERSON_NAME(new PersonNameExtractor()),
..
PERSON_AGE(new PersonAgeExtractor()),
..
COMPANY(new CompanyExtractor()),
..
}
// This is the container for field-value pairs from a given instance of
// the object hierarchy.
public class Vector {
private Map<DataField, Object> fields;
..
}
// Extractors know how to get the value for a given DataField
// from the object hierarchy. There's one extractor per each field.
public interface Extractor<T> {
public T extract(Person person);
}
public class PersonNameExtractor implements Extractor<String> {
public String extract(Person person) {
return person.getName();
}
}
public class PersonAgeExtractor implements Extractor<Integer> {
public Integer extract(Person person) {
return now().getYear() - person.getBirthday().getYear();
}
}
public class CompanyExtractor implements Extractor<String> {
public String extract(Person person) {
return person.getJob().getCompany();
}
}
// Building the Vector using all the fields from the DataField enum
// and the extractors.
public class FullVectorBuilder {
public Vector buildVector(Person person) {
Vector vector = new Vector();
for (DataField field : DataField.values()) {
vector.addField(field, field.getExtractor().extract(person));
}
return vector;
}
}
// Definition of a subset of fields on the Vector.
public interface Selector {
public List<DataField> getFields();
}
public class SampleSubsetSelector implements Selector {
private List<DataField> fields = ImmutableList.of(PERSON_NAME, COMPANY);
...
}
// Finally, a builder for the subset Vector, choosing only
// fields pointed to by the selector.
public class SubsetVectorBuilder {
public Vector buildSubsetVector(Vector fullVector, Selector selector) {
Vector subsetVector = new Vector();
for (DataField field : selector.getFields()) {
subsetVector.addField(field, fullVector.getValue(field));
}
return subsetVector;
}
}
Minuses:
Need to create a tiny Extractor class for each of hundreds of data fields.
This is a custom solution that I came up with, seems to work and I like it, but I feel this problem must have been encountered and solved before, likely in a better way.. Has it?
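One way to shrink the "tiny Extractor class per field" cost, sketched here under the assumption that Java 8 lambdas are available: each enum constant carries its extractor as a lambda, and a subset is built by running only the selected extractors. The names FieldSubsetSketch and select, and the trimmed Person and Job classes, are illustrative stand-ins rather than part of the original design.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class FieldSubsetSketch {
    // Trimmed-down stand-ins for the classes in the question.
    static final class Job {
        final String company;
        Job(String company) { this.company = company; }
    }

    static final class Person {
        final String name;
        final int height;
        final Job job;
        Person(String name, int height, Job job) {
            this.name = name;
            this.height = height;
            this.job = job;
        }
    }

    // One enum constant per field; the lambda replaces a dedicated Extractor class.
    enum DataField {
        PERSON_NAME(p -> p.name),
        PERSON_HEIGHT(p -> p.height),
        COMPANY(p -> p.job.company);

        private final Function<Person, Object> extractor;

        DataField(Function<Person, Object> extractor) { this.extractor = extractor; }

        Object extract(Person person) { return extractor.apply(person); }
    }

    // Extracts only the requested fields, in the order the selector lists them.
    static Map<DataField, Object> select(Person person, List<DataField> fields) {
        Map<DataField, Object> row = new LinkedHashMap<>();
        for (DataField field : fields) {
            row.put(field, field.extract(person));
        }
        return row;
    }

    public static void main(String[] args) {
        Person joe = new Person("Joe Doe", 180, new Job("ACompany Inc."));
        // prints {PERSON_NAME=Joe Doe, COMPANY=ACompany Inc.}
        System.out.println(select(joe, Arrays.asList(DataField.PERSON_NAME, DataField.COMPANY)));
    }
}
```

Because extraction runs per selected field, this variant never builds the full Vector first; computed fields such as an age are just another lambda.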
Edit
Each object knows how to turn itself into a Map of fields, keyed on an enum of all fields.
E.g.
public enum DataField {
PERSON_NAME,
..
PERSON_AGE,
..
COMPANY,
..
}
public class Person {
private String name;
private Date birthday;
private int height;
private Job job;
private House house;
..
public Map<DataField, Object> toMap() {
return ImmutableMap.<DataField, Object>builder()
.put(DataField.PERSON_NAME, name)
.put(DataField.BIRTHDAY, birthday)
.put(DataField.HEIGHT, height)
.put(DataField.PERSON_AGE, now().getYear() - birthday.getYear())
.build();
}
}
Then, I could build a Vector combining all the Maps, and select subsets from it like in 3.
Minuses:
Enum name clashes, e.g. if Job has an Address and House has an Address, then I want to be able to specify a subset taking street name of both. But how do I then define the toMap() method in the Address class?
No obvious place to put code doing computed fields requiring data from more than one object, e.g. physical distance from Address of House to Address of Company.
Many thanks!
Over in-memory object mapping in the application, I would favor database processing of the data for better performance. Views, or more elaborate OLAP/datawarehouse tooling could do the trick. If the calculated fields remain basic, as in "age = now - birth", I see nothing wrong with having that logic in the DB.
On the code side, given the large number of DTOs you have to deal with, you could use classless dynamic objects (available in some JVM languages) or JSON objects. The idea is that when a data structure changes, you only need to modify the DB and the UI, saving you the cost of changing a whole bunch of classes in between.

What is the job of 'Interface' in OO programming?

From what I understand, it means writing methods that build up the different components of a program. For example, if I were to make a program that adds and subtracts numbers, I would have something like:
public static void addnum(int addnum){
addnum = addnum + 1;
System.out.println(addnum);
}
public static void subtractnum(int subtractnum){
subtractnum = subtractnum - 1;
System.out.println(subtractnum);
}
public static void main(String args[]){
int num = 21;
addnum(num);
subtractnum(num);
}
Am I correct, or does it mean something else?
In the Java and .NET frameworks, among others, having a class X inherit from Y has two benefits:
Instances of class X encapsulate the values of all of Y's fields, and can use any of Y's protected members on themselves as if those members belonged to X; additionally, the definition of class X may use Y's static members as though they were its own.
Variables of type Y may hold references to instances of type X.
Allowing a class object to regard as its own the contents of multiple other classes makes it impossible to have upcasts and downcasts preserve identity; since identity-preserving upcasts and downcasts are useful, Java and .NET allow each class to regard members of only one parent as its own (members of the parent's parent are also members of the parent, and get incorporated as such). The limitation of incorporating members from only one parent class is generally not overly restrictive.
On the other hand, if each type could only be stored in references of its own type or its ancestors' types, that would be restrictive. To allow for the possibility that it may be helpful to store references to an object in multiple independent types, Java and .NET both make it possible to define interface types. A reference to an object which implements an interface may be stored in a variable of that interface type (achieving the second benefit of inheritance) but unlike class inheritance which is restricted to a single parent, interface implementation is relatively unrestricted. A class may implement an arbitrary number of independent interfaces, and references to such a class may be stored in variables of any of those interfaces' types.
In short, interfaces provide the most important benefit of inheritance (reference substitutability), but give up some features in exchange for giving up a significant restriction (the inability to inherit from multiple classes).
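A small sketch of the distinction described above, with hypothetical names (InterfaceSketch, Shape, Printable, Scalable, Square): Square takes its field from a single parent class, yet the same instance can be stored in variables of either interface type as well.

```java
final class InterfaceSketch {
    interface Printable {
        String print();
    }

    interface Scalable {
        double scaledSide(double factor);
    }

    // The single parent class: Square incorporates its field as its own.
    static class Shape {
        protected final double side;
        Shape(double side) { this.side = side; }
    }

    // One parent class, but any number of independent interfaces.
    static final class Square extends Shape implements Printable, Scalable {
        Square(double side) { super(side); }

        @Override
        public String print() { return "Square(" + side + ")"; }

        @Override
        public double scaledSide(double factor) { return side * factor; }
    }

    public static void main(String[] args) {
        Square square = new Square(2.0);
        Shape asShape = square;          // substitutable for its parent class
        Printable asPrintable = square;  // ...and for each interface it implements
        Scalable asScalable = square;
        System.out.println(asPrintable.print());         // Square(2.0)
        System.out.println(asScalable.scaledSide(3.0));  // 6.0
        System.out.println(asShape == asPrintable);      // true: same object, identity preserved
    }
}
```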
You're confusing different methods with different parameter types.
Maybe this example will help:
public interface GeometricalObject {
double getArea();
double getPerimeter();
}
...
public class Circle implements GeometricalObject {
public double r;
public double getArea() {
return 3.14 * r * r;
}
public double getPerimeter()
{
return 3.14 * 2 * r;
}
}
...
public class Square implements GeometricalObject {
public double s;
public double getArea() {
return s * s;
}
public double getPerimeter()
{
return 4 * s;
}
}
...
public void printGeomObject(GeometricalObject g) {
System.out.println("Area is " + g.getArea());
System.out.println("Perimeter is " + g.getPerimeter());
}
Interfaces give us a form of multiple inheritance: a class can extend only one class, but it can implement many interfaces.
An interface can be implemented by any class.
Behavior common to many classes can be declared in an interface and implemented by each of those classes.

Protobuf-net. Deserialize based on

Let's say we have the following three classes:
[ProtoContract]
[ProtoInclude(10, typeof(FirstClass))]
[ProtoInclude(20, typeof(SecondClass))]
public class Base
{
[ProtoMember(1)]
public int ClassId {get;set;}
}
public class FirstClass : Base
{
...
}
public class SecondClass : Base
{
...
}
And there's relationship between the class Id (in the base class) and the type of a matching child class. For example,
var obj1 = new FirstClass() {ClassId = 1};
var obj2 = new SecondClass() {ClassId = 2};
Now let's suppose we have serialized those objects. The question is: is there a good way to deserialize the serialized protobuf based on the class Id value, by looking at the ClassId field first? That is, if the value of ClassId in the serialized protobuf is 1, then use FirstClass to deserialize the remaining stream bytes.
thanks!
If you are using ProtoInclude, then protobuf-net is already taking care of which subclass to use: that is the entire point of ProtoInclude. In some cases it is not possible to use inheritance, in which case there are ways to read the proto-stream via either ProtoReader, or by using a second model which only reads that property, then resetting the source and reading again. There is an example of that here: https://stackoverflow.com/a/14572685/23354

Pattern name/Convention -> Class that merge different attributes from other classes

I wanted to know if there is a known pattern or convention for the following scenario:
I have two classes: MAT (name:String, address:String) & MATversion(type:String, version:int)
Now I have a DataGrid (DataTable) which will take a generic List of objects for the column mapping and data filling.
The columns should be name, type, version. (Which are distributed in MAT and MATversion)
So I create a class to make this work. This class will merge the needed properties from each class (MAT, MATversion).
-> MAT_MATversion (name:String, type:String, version:int).
Does a naming convention exist for a class like MAT_MATversion? Is there any pattern that mirrors this?
Thanks!
Is there any specific reason why the merged result has to be a unique class?
Assuming every MAT object has a single MATversion, you can add a couple of custom properties that return the type and version of the underlying MATversion object.
In C# this would look something like this:
public class MAT{
public String name { get; set; }
public String address { get; set; }
public MATversion myVersion;
public String type {
get{
return myVersion.type;
}
set{
myVersion.type = value;
}
}
public int version {
get{
return myVersion.version;
}
set{
myVersion.version = value;
}
}
}
I'm aware that this doesn't answer the question about design patterns, but I couldn't ask/suggest another approach in a comment since I don't have that right yet.

The Object-Oriented way to separate the model from its representation

Suppose we have an object that represents the configuration of a piece of hardware. For the sake of argument, a temperature controller (TempController). It contains one property, the setpoint temperature.
I need to save this configuration to a file for use in some other device. The file format (FormatA) is set in stone. I don't want the TempController object to know about the file format... it's just not relevant to that object. So I make another object, "FormatAExporter", that transforms the TempController into the desired output.
A year later we make a new temperature controller, let's call it "AdvancedTempController", that not only has a setpoint but also has rate control, meaning one or two more properties. A new file format is also invented to store those properties... let's call it FormatB.
Both file formats are capable of representing both devices ( assume AdvancedTempController has reasonable defaults if it lacks settings ).
So here is the problem: Without using 'isa' or some other "cheating" way to figure out what type of object I have, how can FormatBExporter handle both cases?
My first instinct is to have a method in each temperature controller that can provide a custom exporter for that class, e.g., TempController.getExporter() and AdvancedTempController.getExporter(). This doesn't support multiple file formats well.
The only other approach that springs to mind is to have a method in each temperature controller that returns a list of properties and their values, and then the formatter can decide how to output those. It'd work, but that seems convoluted.
UPDATE: Upon further work, that latter approach doesn't really work well. If all your types are simple it might, but if your properties are Objects then you end up just pushing the problem down a level... you are forced to return a pair of String,Object values, and the exporter will have to know what the Objects actually are to make use of them. So it just pushes the problem to another level.
Are there any suggestions for how I might keep this flexible?
What you can do is let each TempController be responsible for persisting itself using a generic archiver.
class TempController
{
public Temperature SetPoint { get; set; }
public void ImportFrom(Archive archive)
{
// Read returns object, so cast to the property type.
SetPoint = (Temperature)archive.Read("SetPoint");
}
public void ExportTo(Archive archive)
{
archive.Write("SetPoint", SetPoint);
}
}
class AdvancedTempController
{
public Temperature SetPoint { get; set; }
public Rate RateControl { get; set; }
public void ImportFrom(Archive archive)
{
SetPoint = (Temperature)archive.Read("SetPoint");
// Older archives have no rate setting, so fall back to a default.
RateControl = (Rate)archive.ReadWithDefault("RateControl", Rate.Zero);
}
public void ExportTo(Archive archive)
{
archive.Write("SetPoint", SetPoint);
archive.Write("RateControl", RateControl);
}
}
By keeping it this way, the controllers do not care how the actual values are stored but you are still keeping the internals of the object well encapsulated.
Now you can define an abstract Archive class that all archive classes extend.
abstract class Archive
{
public abstract object Read(string key);
public abstract object ReadWithDefault(string key, object defaultValue);
public abstract void Write(string key, object value);
}
The FormatA archive can do it one way, and the FormatB archive can do it another.
class FormatAArchive : Archive
{
public override object Read(string key)
{
// read stuff
throw new System.NotImplementedException();
}
public override object ReadWithDefault(string key, object defaultValue)
{
// if store contains key, read stuff
// else return default value
throw new System.NotImplementedException();
}
public override void Write(string key, object value)
{
// write stuff
}
}
class FormatBArchive : Archive
{
public override object Read(string key)
{
// read stuff
throw new System.NotImplementedException();
}
public override object ReadWithDefault(string key, object defaultValue)
{
// if store contains key, read stuff
// else return default value
throw new System.NotImplementedException();
}
public override void Write(string key, object value)
{
// write stuff
}
}
You can add in another Controller type and pass it whatever formatter. You can also create another formatter and pass it to whichever controller.
In C# or other languages that support this you can do this:
// Write(...) stands for each exporter's own output routine (not shown).
class TempController {
public int SetPoint;
}
class AdvancedTempController : TempController {
public int Rate;
}
class FormatAExporter {
public void Export(TempController tc) {
Write(tc.SetPoint);
}
}
class FormatBExporter {
public void Export(TempController tc) {
if (tc is AdvancedTempController advanced) {
Write(advanced.Rate);
}
Write(tc.SetPoint);
}
}
I'd have the "temp controller", through a getState method, return a map (e.g. in Python a dict, in JavaScript an object, in C++ a std::map or std::unordered_map, etc.) of its properties and current values. What's convoluted about it?! It could hardly be simpler, it's totally extensible, and it's totally decoupled from the use it's put to (displaying, serializing, etc.).
Well, a lot of that depends on the file formats you're talking about.
If they're based on key/value combinations (including nested ones, like xml), then having some kind of intermediate memory object that's loosely typed that can be thrown at the appropriate file format writer is a good way to do it.
If not, then you've got a scenario where you've got four combinations of objects and file formats, with custom logic for each scenario. In that case, it may not be possible to have a single representation for each file format that can deal with either controller. In other words, if you can't generalize the file format writer, you can't generalize it.
I don't really like the idea of the controllers having exporters - I'm just not a fan of objects knowing about storage mechanisms and whatnot (they may know about the concept of storage, and have a specific instance given to them via some DI mechanism). But I think you're in agreement with that, and for pretty much the same reasons.
If FormatBExporter takes an AdvancedTempController, then you can make a bridge class that makes TempController conform to AdvancedTempController. You may need to add some sort of getFormat() function to AdvancedTempController though.
For example:
FormatBExporter exporterB;
TempController tempController;
AdvancedTempController bridged = TempToAdvancedTempBridge(tempController);
exporterB.export(bridged);
There is also the option of using a key-to-value mapping scheme. FormatAExporter exports/imports a value for the key "setpoint". FormatBExporter exports/imports values for the keys "setpoint" and "ratecontrol". This way, the old FormatAExporter can still read the new file format (it just ignores "ratecontrol") and FormatBExporter can read the old file format (if "ratecontrol" is missing, it uses a default).
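A minimal sketch of that key-to-value scheme in Java, combined with the getState idea from the earlier answer. Everything here is illustrative: the class KeyValueExportSketch, the text output format, and the default rate of 0.0 are assumptions, not the real FormatA or FormatB.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class KeyValueExportSketch {
    static final double DEFAULT_RATE = 0.0; // assumed default when "ratecontrol" is absent

    static class TempController {
        final double setPoint;

        TempController(double setPoint) { this.setPoint = setPoint; }

        // Exposes state as key/value pairs, with no knowledge of any file format.
        Map<String, Object> getState() {
            Map<String, Object> state = new LinkedHashMap<>();
            state.put("setpoint", setPoint);
            return state;
        }
    }

    static class AdvancedTempController extends TempController {
        final double rateControl;

        AdvancedTempController(double setPoint, double rateControl) {
            super(setPoint);
            this.rateControl = rateControl;
        }

        @Override
        Map<String, Object> getState() {
            Map<String, Object> state = super.getState();
            state.put("ratecontrol", rateControl);
            return state;
        }
    }

    // Format A only knows "setpoint"; any extra keys are simply ignored.
    static String exportFormatA(Map<String, Object> state) {
        return "setpoint=" + state.get("setpoint");
    }

    // Format B wants both keys and falls back to a default for the missing one.
    static String exportFormatB(Map<String, Object> state) {
        Object rate = state.getOrDefault("ratecontrol", DEFAULT_RATE);
        return "setpoint=" + state.get("setpoint") + ";ratecontrol=" + rate;
    }
}
```

Neither exporter inspects the controller's type: exportFormatB(new TempController(21.5).getState()) yields "setpoint=21.5;ratecontrol=0.0", and exportFormatA applied to an AdvancedTempController's state yields just the setpoint.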
In the OO model, an object's methods collectively act as the controller. If you're programming in an OO style, it's more useful to separate your program into the M and the V, and not so much the C.
I guess this is where the Factory Method pattern would apply.