design pattern for parsing object to specific model and vice versa - oop

I have file-upload logic with some very specific business rules. According to them I have to turn my file model into a row that looks like "Header:{processed field1},{processed field2},{processed field3},{processed field4},{processed field5},{processed field6},{processed field7},{processed field8} and so on for 19 params". It is essentially custom serialization.
I also need to be able to parse this row back into an object. So the question is: what is a common approach to coding this kind of thing?
Right now, to turn the model into a row I just use string.Format with many arguments, and to parse a row back into a model I split it on ',' and then assign the processed parts to the model's fields. But this implementation involves a lot of low-level work, some hard-coded positions, and many other things that do not look pretty to me.

There's not going to be any magic involved here, particularly since you are serializing the object to a non standard format. You're probably going to have to live with the 'ugly' code.

You should put your serialization / deserialization inside a custom serializer. You can follow the same pattern as the other serializers in the .NET library and implement the IFormatter interface. This would provide you with a common interface that you can use to stream to and from a file (or any stream):
// Write the object to a file
using (var fileStream = new FileStream(fileName, FileMode.Create))
{
    var formatter = new CustomFormatter();
    formatter.Serialize(fileStream, objectToSerialize);
}
// Read it back
using (var fileStream = new FileStream(fileName, FileMode.Open))
{
    var formatter = new CustomFormatter();
    return (CustomType)formatter.Deserialize(fileStream);
}
You can see an example of a custom formatter in this download
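To cut down on the hard-coded positions mentioned in the question, one option is to describe each column once, as a pair of "write" and "read" delegates, and let the order of that list define the order of the row. This is only a minimal sketch: FileModel with three fields (instead of 19), the FieldMap and FileModelRow names, and the date format are all invented for illustration, and it assumes that field values never contain a comma.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical model; the real one would have 19 fields.
public class FileModel
{
    public string Name { get; set; }
    public int Size { get; set; }
    public DateTime Uploaded { get; set; }
}

// One entry per column: how to write the field and how to read it back.
public class FieldMap<T>
{
    public Func<T, string> Write { get; private set; }
    public Action<T, string> Read { get; private set; }

    public FieldMap(Func<T, string> write, Action<T, string> read)
    {
        Write = write;
        Read = read;
    }
}

public static class FileModelRow
{
    private const string Header = "Header:";
    private static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

    // The order of this list is the order of the columns in the row.
    private static readonly List<FieldMap<FileModel>> Fields = new List<FieldMap<FileModel>>
    {
        new FieldMap<FileModel>(
            m => m.Name,
            (m, s) => m.Name = s),
        new FieldMap<FileModel>(
            m => m.Size.ToString(Inv),
            (m, s) => m.Size = int.Parse(s, Inv)),
        new FieldMap<FileModel>(
            m => m.Uploaded.ToString("yyyyMMdd", Inv),
            (m, s) => m.Uploaded = DateTime.ParseExact(s, "yyyyMMdd", Inv)),
    };

    public static string ToRow(FileModel model)
    {
        return Header + string.Join(",", Fields.Select(f => f.Write(model)));
    }

    public static FileModel FromRow(string row)
    {
        if (!row.StartsWith(Header, StringComparison.Ordinal))
            throw new FormatException("Row does not start with the expected header.");

        // Assumption: values never contain ',' so a plain split is enough.
        var parts = row.Substring(Header.Length).Split(',');
        if (parts.Length != Fields.Count)
            throw new FormatException("Unexpected number of fields.");

        var model = new FileModel();
        for (var i = 0; i < parts.Length; i++)
            Fields[i].Read(model, parts[i]);
        return model;
    }
}
```

The per-field processing is still "ugly" code, but it lives in one table, and ToRow and FromRow cannot drift out of sync on column order. The two methods could equally sit behind the Serialize / Deserialize methods of a custom formatter.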

Common return type for all ANTLR visitor methods

I'm writing a parser for an old proprietary report specification with ANTLR, and I'm currently trying to implement a visitor for the generated parse tree by extending the autogenerated abstract visitor class.
I have little experience both with ANTLR (which I learned only recently) and with the visitor pattern in general, but if I understood it correctly, the visitor should encapsulate one single operation on the whole data structure (in this case the parse tree), thus sharing the same return type between each Visit*() method.
Taking an example from The Definitive ANTLR 4 Reference book by Terence Parr: to visit a parse tree generated by a grammar that parses a sequence of arithmetic expressions, it feels natural to choose the int return type, as each node of the tree is part of the arithmetic operation and contributes to the final result computed by the calculator.
In my current situation I don't have a common type: my grammar parses the whole document, which is split into different sections with different responsibilities (variable declarations, print options, actual text for the rows, etc.), and I can't find a common type for the results of visiting so many different nodes, besides object of course.
I tried to think of some possible solutions:
1. I first tried implementing a stateless visitor using object as the common type, but the number of type casts needed sounds like a big red flag to me. I was considering using JSON, but I think the problem remains, potentially with some extra overhead in the serialization process.
2. I was also thinking about splitting the visitor into several smaller visitors, each with a specific purpose (get all the variables, get all the rows, etc.), but with this solution each visitor would implement only a small subset of the methods of the autogenerated interface (as it is meant to support visiting the whole tree), because each visiting operation would probably focus only on a specific subtree. Is that normal?
3. Another possibility could be to redesign the data structure so that it could be used at every level of the tree or, better, to define a generic specification of the nodes that can be used later to build the data structure. This solution sounds good, but I think it is difficult to apply in this domain.
4. A final option could be to switch to a stateful visitor, which encapsulates one or more builders for the different sections that each Visit*() method could use to build the data structure step by step. This solution seems clean and doable, but I have difficulty working out how to make the result of each visit operation available in the parent scope when needed.
What solution is generally used to visit complex ANTLR parse trees?
ANTLR4 parse trees are often complex because of recursion, e.g.
I would define the class ParsedDocumentModel, whose properties would be added or modified as your project evolves (which is normal, no program is set in stone).
Assuming your grammar is called Parser in the file Parser.g4, here is sample C# code:
public class ParsedDocumentModel
{
    public string Title { get; set; }
    //other properties ...
}

public class ParserVisitor : ParserBaseVisitor<ParsedDocumentModel>
{
    public override ParsedDocumentModel VisitNounz(NounzContext context)
    {
        var res = "unknown";
        var s = context.GetText();
        if (s == "products")
            res = "<<products>>"; //for example
        var model = new ParsedDocumentModel();
        model.Title = res; //add more info...
        return model;
    }
}

F#, Json.NET 6.0 and WebApi - serialization of record types

Json.NET 6.0.1 adds F# support for records and discriminated unions. When serializing a F# record type using Json.NET I now get nicely formatted JSON.
The serialization is done as follows:
let converters = [| (new StringEnumConverter() :> JsonConverter) |]
JsonConvert.SerializeObject(questionSet, Formatting.Indented, converters)
However, when I try to expose my F# types through an ASP.NET WebApi 5.0 service, written in C#, the serialized JSON includes an @-sign in front of all properties. The @-sign comes from the internal backing field for the record type (this used to be a known problem with Json.NET and F#).
But since I'm using the updated version of Json.NET, shouldn't the result be the same as when calling JsonConvert? Or does JsonConvert behave differently from JsonTextWriter and JsonTextReader?
As far as I can tell from reading the JsonMediaTypeFormatter in the WebApi source, JsonTextWriter and JsonTextReader are what WebApi uses.
You can adorn your records with the [<CLIMutable>] attribute:
[<CLIMutable>]
type MyDtr = {
    Message : string
    Time : string }
That's what I do.
For nice XML formatting, you can use:
GlobalConfiguration.Configuration.Formatters.XmlFormatter.UseXmlSerializer <- true
For nice JSON formatting, you can use:
config.Formatters.JsonFormatter.SerializerSettings.ContractResolver <-
Newtonsoft.Json.Serialization.CamelCasePropertyNamesContractResolver()
I believe it's because the backing fields that are emitted by F# records don't follow the same naming convention as C# property backing fields.
The easiest way I've found to get around this is to change the ContractResolver at the startup of your web application from the System.Net.Http.Formatting.JsonContractResolver to the Newtonsoft.Json.Serialization.DefaultContractResolver instead:
Formatters.JsonFormatter.SerializerSettings.ContractResolver <- DefaultContractResolver()
You'll then get all JSON formatting done via Newtonsoft's JSON formatter rather than the .NET one.
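Since the question says the Web API host itself is written in C#, the same setting can be applied from the C# startup code. A short sketch, assuming the usual WebApiConfig.Register convention from the project templates (the class and method names are just that convention, not something the thread confirms) and references to the Web API and Json.NET packages:

```csharp
using System.Web.Http;
using Newtonsoft.Json.Serialization;

public static class WebApiConfig
{
    // Called once at application start, e.g. via
    // GlobalConfiguration.Configure(WebApiConfig.Register) in Global.asax.
    public static void Register(HttpConfiguration config)
    {
        // Replace Web API's own contract resolver with Json.NET's default one.
        config.Formatters.JsonFormatter.SerializerSettings.ContractResolver =
            new DefaultContractResolver();
    }
}
```

If you prefer camel-cased property names, CamelCasePropertyNamesContractResolver from the same namespace can be assigned in the same place.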

Where should the responsibility for parsing the input stream be in this scenario?

Say I am parsing readings from a handheld device of some sort via an input stream. There are readings of different types, and each type needs to be parsed differently.
Currently I have a class "handheld" that handles all parsing and creates reading objects of the appropriate type as required. It parses the reading and populates each reading via their "set" methods.
I'm wondering though if the readings themselves should know how to parse the input stream. For instance, when the next reading comes along, should I instantiate the appropriate reading object and call a "parse" method on it, passing it in the input stream?
The main thing I don't like about this is the parsing code is all over the place rather than kept neatly in one place. It does however get rid of the need for all those set methods and the reading can just apply itself to the server/database/whatever when required via the "apply" method I have.
So which would be considered the "nicer" (or more OO) way?
I would go with the Factory design pattern.
Create a base class to represent GeneralParser and make a child class for each parser and if there was something common in the parsing method, let it be in the base GeneralParser's Parse method and call base.parse method in child.parse method.
I am sure you have a way to determine which parser to use, and I think currently you're using control statements (if, switch...) and do the parsing. Well now instead of that let the specialized (child) parser class handle it for you.
Pseudo class diagram:
GeneralParser
|
|
->XMLParser
->JsonParser
Here is a possible implementation in C#:
public static class ParserFactory
{
    public static GeneralParser CreateXMLParser()
    {
        return new XMLParser();
    }

    public static GeneralParser CreateJsonParser()
    {
        return new JsonParser();
    }
}
In your program code, you may write something like this (pseudo-code) because it depends on the way that you're deciding which parser to use.
// ...
GeneralParser parser;
if (to_be_parsed_as_xml)
{
    parser = ParserFactory.CreateXMLParser();
    parser.Parse(stream);
}
else if (to_be_parsed_as_json)
{
    parser = ParserFactory.CreateJsonParser();
    parser.Parse(stream);
}
// ...
You can create a parser on the fly (without keeping its reference) if you only need parsers to parse and nothing more.
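If the if/else chain itself becomes the part you dislike, the choice of parser can be moved into a lookup table keyed by whatever identifies the reading type. The following is only a sketch under assumptions: the "TEMP"/"BATT" type codes, the line-based wire format and all class names are hypothetical, since the question does not show the device's actual protocol.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

public abstract class Reading { }
public class TemperatureReading : Reading { public double Celsius { get; set; } }
public class BatteryReading : Reading { public int Percent { get; set; } }

public interface IReadingParser
{
    Reading Parse(TextReader input);
}

public class TemperatureParser : IReadingParser
{
    public Reading Parse(TextReader input)
    {
        var value = double.Parse(input.ReadLine(), CultureInfo.InvariantCulture);
        return new TemperatureReading { Celsius = value };
    }
}

public class BatteryParser : IReadingParser
{
    public Reading Parse(TextReader input)
    {
        var value = int.Parse(input.ReadLine(), CultureInfo.InvariantCulture);
        return new BatteryReading { Percent = value };
    }
}

public static class ReadingParsers
{
    // Hypothetical type codes; one registration per reading type.
    private static readonly Dictionary<string, Func<IReadingParser>> Registry =
        new Dictionary<string, Func<IReadingParser>>
        {
            { "TEMP", () => new TemperatureParser() },
            { "BATT", () => new BatteryParser() },
        };

    public static IReadingParser For(string typeCode)
    {
        Func<IReadingParser> create;
        if (!Registry.TryGetValue(typeCode, out create))
            throw new NotSupportedException("Unknown reading type: " + typeCode);
        return create();
    }
}

public static class Demo
{
    public static void Main()
    {
        // Assumed format: a type code line followed by a value line.
        var input = new StringReader("TEMP\n21.5\nBATT\n80\n");
        string typeCode;
        while ((typeCode = input.ReadLine()) != null)
        {
            Reading reading = ReadingParsers.For(typeCode).Parse(input);
            Console.WriteLine(reading.GetType().Name);
        }
    }
}
```

This keeps the parsing code out of the reading classes, so they no longer need to know about the stream, while each parser stays small and sits next to its siblings rather than being spread across the codebase.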

deserializing XML with dynamic types / converting string to System.Type

Hmmm, I'm not sure if I titled this question properly or am asking it properly, but here goes.
I've got serialized objects (in XML) stored in a database, along with a string/varchar indicating the type.
Right now I am doing this (because I have a finite number of different types):
Dim deserializer As XmlSerializer
If datatable("type") = "widget1" Then
    deserializer = New XmlSerializer(GetType(Widget1))
ElseIf datatable("type") = "widget2" Then
    deserializer = New XmlSerializer(GetType(Widget2))
...
I'd like to do something like
Dim deserializer As XmlSerializer
deserializer = New XmlSerializer(MagicallyConvertToSystemDotType(datatable("type")))
Am I barking up the wrong tree here?
Have you tried using Type.GetType? It takes a string parameter and returns the type with that name. You may have to give it more than the simple name "widget", something closer to a full name. But from your sample it appears the types all share the same namespace, so that shouldn't be a big hurdle.
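Put together with XmlSerializer, that suggestion could look roughly like this in C#. It is a minimal sketch: the DynamicXml class and its method name are made up for illustration, and it assumes the stored type string is something Type.GetType can resolve (a namespace-qualified name for types in the calling assembly, otherwise an assembly-qualified name).

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

public static class DynamicXml
{
    // typeName: the string stored alongside the XML in the database.
    public static object Deserialize(string typeName, string xml)
    {
        // throwOnError: true gives an exception with details instead of
        // a null Type when the name cannot be resolved.
        Type type = Type.GetType(typeName, throwOnError: true);

        var serializer = new XmlSerializer(type);
        using (var reader = new StringReader(xml))
        {
            return serializer.Deserialize(reader);
        }
    }
}
```

Because the type name comes from data rather than from code, it may be worth checking it against the finite list of widget types you already know about before resolving it.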
The other option, if you want an actual Type keyword to work with rather than a Type variable, is something like the following (sorry, I'm using C# and am too tired to do the VB conversion):
// requires: using System; using System.Reflection;
// a method in an XmlSerializer-like class such as Deserialize(typestring, object);
// and a generic method in it such as Deserialize<T>(object);
public void Deserialize(string typestring, object obj)
{
    // Look up the generic Deserialize<T> and close it over the runtime type.
    MethodInfo deserialize = typeof(XmlSerializer)
        .GetMethod("Deserialize", BindingFlags.Instance | BindingFlags.Public)
        .MakeGenericMethod(new Type[] { Type.GetType(typestring) });
    deserialize.Invoke(this, new[] { obj });
}
Specifically, I think you're looking for this code here (NOTE: I don't work much in VB.Net, so I hope everything there is syntactically correct):
VB.Net:
' Get the type of object being deserialized.
Dim t As Type = Type.GetType(typeNameString)
' Make a new instance of the object.
Dim o As Object = Activator.CreateInstance(t)
C#:
// Get the type of object being deserialized.
Type t = Type.GetType(typeNameString);
// Make a new instance of the object.
object o = Activator.CreateInstance(t);
Edit (26 Oct, 2009, 15:10 GMT-0600): The Type.GetType(string typeNameString) method does not always recognize a type from its fully qualified name alone. It is in your best interest to include as much information as you can in the parameter string, namely the full type name followed by the assembly name, as follows:
VB.Net/C#:
typeNameString = objectSerialized.GetType().FullName + ", " + objectSerialized.GetType().Assembly.FullName
Less specifically, I just had the same problem, and after a lot of research, I finally came up with a nice solution for handling most of this dynamically. I've posted the entire source code to a class capable of serializing and deserializing objects of any type not containing generics or arrays, using Reflection. Feel free to take it and use it as your own. If anyone decides to add handling for generics and arrays, please send me an updated copy so I can post it back on my blog (and you'll get an honorable mention ;-)...). It serializes everything recursively, and has some special handling for enums as well.
Take a look and see if that covers everything you're looking for at:
http://maxaffinity.blogspot.com/2009/10/serialize-objects-manually.html
~md5sum~
Edit (27 Oct, 2009 14:38 GMT-0600): Corrected some misinformation about the class available from my blog.

vb.net object persisted in database

How can I go about storing a VB.NET user-defined object in a SQL database? I am not trying to replicate the properties with columns. I mean something along the lines of converting or encoding my object to a byte array and then storing that in a field in the db. Like when you store an instance of an object in session, but I need the info to persist past the current session.
@Orion Edwards
It's not a matter of stances. It's because one day, you will change your code. Then you will try de-serialize the old object, and YOUR PROGRAM WILL CRASH.
My Program will not "CRASH", it will throw an exception. Lucky for me .net has a whole set of classes dedicated for such an occasion. At which time I will refresh my stale data and put it back in the db. That is the point of this one field (or stance, as the case may be).
You can use serialization - it allows you to store your object at least in 3 forms: binary (suitable for BLOBs), XML (take advantage of MSSQL's XML data type) or just plain text (store in varchar or text column)
Before you head down this road towards your own eventual insanity, you should take a look at this (or one day repeat it):
http://thedailywtf.com/Articles/The-Mythical-Business-Layer.aspx
Persisting objects in a database is not a good idea. It kills all the good things that a database is designed to do.
You could use the BinaryFormatter class to serialize your object to a binary format, then save the resulting bytes in your database.
The XmlSerializer or the DataContractSerializer in .net 3.x will do the job for you.
@aku, lomaxx and bdukes - your solutions are what I was looking for.
@1800 INFORMATION - while I appreciate your stance on the matter, this is a special case of data that I get from a webservice that gets refreshed only about once a month. I don't need the data persisted in db form because that's what the webservice is for. Below is the code I finally got to work.
Serialize
' res is my object to serialize
Dim xml_serializer As System.Xml.Serialization.XmlSerializer
Dim string_writer As New System.IO.StringWriter()
xml_serializer = New System.Xml.Serialization.XmlSerializer(res.GetType)
xml_serializer.Serialize(string_writer, res)
Deserialize
' string_writer and xml_serializer from above
Dim serialization As String = string_writer.ToString
Dim string_reader As System.IO.StringReader
string_reader = New System.IO.StringReader(serialization)
Dim res2 As testsedie.EligibilityResponse
res2 = xml_serializer.Deserialize(string_reader)
What you want to do is called "Serializing" your object, and .Net has a few different ways to go about it. One is the XmlSerializer class in the System.Xml.Serialization namespace.
Another is in the System.Runtime.Serialization namespace. This has support for a SOAP formatter, a binary formatter, and a base class you can inherit from that all implement a common interface.
For what you are talking about, the BinaryFormatter suggested earlier will probably have the best performance.
I'm backing @1800 Information on this one.
Serializing objects for long-term storage is never a good idea.
while i appreciate your stance on the matter, this is a special case of data that I get from a webservice that gets refreshed only about once a month.
It's not a matter of stances. It's because one day, you will change your code. Then you will try de-serialize the old object, and YOUR PROGRAM WILL CRASH.
If it crashes (or throws an exception) all you are left with is a bunch of binary data to try and sift through to recreate your objects.
If you are only persisting binary data, why not just save straight to disk? You might also want to look at using something like XML because, as has been mentioned, if you alter your object definition you may not be able to deserialize the old data without some hard work.
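For the "stale data" case discussed above, here is a C# sketch of the XML round trip that reports failure instead of letting the exception escape, so the caller can refresh from the web service. The EligibilityResponse shape and the XmlBlob helper are hypothetical; only the XmlSerializer, StringWriter and StringReader calls are standard .NET APIs.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hypothetical shape of the object fetched from the web service.
public class EligibilityResponse
{
    public string MemberId { get; set; }
    public bool Eligible { get; set; }
}

public static class XmlBlob
{
    // Produces the string to store in a text or XML column.
    public static string ToXml<T>(T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        using (var writer = new StringWriter())
        {
            serializer.Serialize(writer, value);
            return writer.ToString();
        }
    }

    // Returns false when the stored XML can no longer be read as T.
    public static bool TryFromXml<T>(string xml, out T value)
    {
        var serializer = new XmlSerializer(typeof(T));
        try
        {
            using (var reader = new StringReader(xml))
            {
                value = (T)serializer.Deserialize(reader);
                return true;
            }
        }
        catch (InvalidOperationException)
        {
            // XmlSerializer reports unreadable documents this way.
            value = default(T);
            return false;
        }
    }
}
```

A caller would try TryFromXml on the stored value first and, if it returns false, fetch fresh data from the web service and store ToXml of the new object. This only covers data that can be regenerated; it does nothing for the long-term storage concern raised in the other answers.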