How do I set up Protobuf for my VB.net application? - vb.net

So, this may seem very elementary for you guys but I am officially stumped. I am trying to save some data in my application to a file using protobuf (suggested to me by some peers) but I can't seem to find any documentation for it and what I can find always gives me some weird error. I have an array declared as follows:
Private Terrain(,,) As TiledTerrain
The TiledTerrain class looks like this:
Public Class TiledTerrain
Public X As Integer
Public Y As Integer
Public Texture_X As Integer
Public Texture_Y As Integer
End Class
Pretty dog-on simple right? Well, I can't seem to figure out how to save my Terrain array to a file using Protobuf?
The Terrain array is just a simple 3 dimensional array (about 100x100x2). Each cell of the array may or may not actually contain a value (TiledTerrain) and if it doesn't it will contain "Nothing".
Can anybody explain to me in full on how I should go about doing this? I've currently referenced protobuf-net.dll and protobuf-net.Extensions.dll because I don't really know which to use...
Thanks for any help!
-A Moron Among Geniuses :)

first read Getting Started which describes the simplest scenario, using attributes. VB has slightly different syntax for attributes, which you are probably more familiar with than me - but the concept is the same.
There are alternatives, note:
in v2 the model can be configured entirey at runtime if you want, without the need for any attributes
if the type looks like an obvious "tuple" (including, importantly, a constructor that takes a parameter that matches every public member), it will use the constructor order to infer a contract
There is a problem though; protobuf-net does not currently support multi-dimensional arrays. It can of course be added, but as with all features: it doesn't exist until it gets written. The reason this isn't supported directly is that the underlying protobuf specification (by Google) does not support this. It would work if flattened into a vector (1-dimensional zero-based array). If you want help with an example, let me know.

Related

Proto-buf serialization with Obfuscation

I am looking for some guidance as to what is going on when using proto-buf net with obfuscation (Dotfuscator). One half of the project is a DLL and the other is an EXE elsewhere and using proto-buf NET they exchange data flawlessly. Until I obfuscate the DLL.
At that point P-BN fails without raising an exception, returning variously a 0 length byte array or a foreshortened one depending on what I have fiddled with. The class is fairly simple (VB):
<ProtoContract(Name:="DMailer")> _
Friend Class DMailer
Private _Lic As Cert
Private _Sys As Sys
Private _LList As List(Of LItem)
..
..
End Class
There are 3 props all decorated with ProtoMember to get/set the constituent class objects. Snipped for brevity.
Again, it works GREAT until I obfuscate the DLL. Then, Dotfuscator renames each of these to null, apparently since they are all Friend, and that seems to choke proto-buff. If I exempt the class from renaming (just the class name, not props/members), it seems to work again. It makes sense that P-BN would only be able to act on objects with a proper name, though when asked to serialize a null named object, it seems like an exception might be in order.
On the other hand, much of the charm of PB-N is supposed to be serialization independent of .NET names working from attributes - at least as I understand it. Yet in this case it only seems to work with classes with names. I tried using the Name qualifier or argument as shown above, to no avail - it apparently doesnt do what I thought it might.
So, I am curious if:
a) ...I have basically surmised the problem correctly
b) ...There is some other attribute or flag that might facilitate serializing
a null named object
c) ...if there are any other insights that would help.
If I exempt all 3 or 4 classes from Dotfuscator renaming (LList is not actually implemented yet, leaving DMailer, Cert and Sys), the DLL seems to work again - at least the output is the correct size. I can live with that, though obscured names would be better: Dotfuscator (CE) either exempts them or sets the names to Null - I cant seem to find a way to force them to be renamed.
Rather than exempt 3 or 4 classes from renaming, one alternative I am considering is to simply store the Serializer output for Cert and Sys as byte arrays or Base64 strings in DMailer instead of classes. Then have the receiver Deserialize each object individually. It is kind of nice to be able to unpack just one thing and have your toys right there as if by magic though.
(many)TIA
Interesting. I confess I have never tried this scenario, but if you can walk me through your process (or better: maybe provide a basic repro example with "run this, then this, then this: boom") I'll happily investigate.
Note: the Name on ProtoContract is mainly intended for GetProto() usage; it is not needed by the core serializer, and can be omitted to reduce your exposure. Also, protobuf-net isn't interested in fields unless those fields are decorated with the attributes, so that shouldn't be an issue.
However! there's probably a workaround here that should work now; you can pre-generate a static serialization dll; for example in a separate console exe (just as a tool; I really need to wrap this in a standalone utility!)
So if you create a console exe that references your unobfuscated library and protobuf-net.dll:
var model = RuntimeTypeModel.Create();
model.Add(typeof(DMailer), true); // true means "use the attributes etc"
// and other types needed, etc
model.Compile("MailSerializer", "MailSerializer.dll");
this should write MailSerializer.dll, which you can then reference from your main code (in addition to protobuf-net), and use:
var ser = new MailSerializer(); // our pre-genereated serializer
ser.Serialize(...); // etc
Then include MailSerializer.dll in your obfuscation payload.
(this is all v2 specific, btw)
If this doesn't work, I'll need to investigate the main issue, but I'm not an obfuscation expert so could do with your repro steps.
Since there were a few upticks of interest, here is what looks like will work:
a) No form of reflection will be able to get the list of properties for an obfuscated type.
I tried walking thru all the types to find the ones with ProtoContract on it, I could find them
but the property names are all changed to a,m, b, j, g.
I also tried Me.GetType.GetProperties with the same result.
You could implement a map from the output to indicate that Employee.FirstName is now a0.j, but distributing this defeats the purpose of obfuscation.
b) What does work to a degree is to exempt the class NAME from obfuscation. Since PB-N looks for the ProtoMember attributes to get the data, you CAN obfuscate the Property/Member names, just not the CLASS/type name. If the name is something like FederalReserveLogIn, your class/type has a bullseye on it.
I have had initial success doing the following:
1) Build a simple class to store a Property Token and value. Store everything as string using ConvertFromInvariantString. Taking a tip from PBN, I used an integer for the token:
<ProtoMember(propIndex.Foo)>
Property Foo As String
An enum helps tie everything together later. Store these in a Dictionary(Of T, NameValuePair)
2) add some accessors. these can perform the type conversions for you:
Public Sub Add(ByVal Key As T, ByVal value As Object)
If _col.ContainsKey(Key) Then
_col.Remove(Key)
End If
_col.Add(Key, New TValue(value))
End Sub
Public Function GetTItem(Of TT)(key As T) As TT
If _col.ContainsKey(key) Then
Return CType(_col(key).TValue, TT)
Else
Return Nothing
End If
End Function
T is whatever key type you wish to use. Integer results in the smallest output and still allows the subscribing code to use an Enum. But it could be String.
TT is the original type:
myFoo = props.GetTItem(Of Long)(propsEnum.Foo)
3) Expose the innerlist (dictionary) to PBN and bingo, all done.
Its also very easy to add converters for Point, Rectangle, Font, Size, Color and even bitmap.
HTH

How to get differing value type out of an concrete implementation if only Interface / abstract class is known?

what I am using:
VB.NET, NET 3.5, OpenXML SDK 2.0
what I want to do:
I am creating an xlsx reader / writer for my application (based on OpenXML SDK 2.0). I want to read xlsx files and store the data contained in each row in a DTO/PONO. Further I want to read the xlsx file and then modify it and save it.
my thoughts:
Now my problem is not with the OpenXML SDK, I can do what I need to do.
My problem is on how to structure my components. Specifically I have problems with the polymorphism at the lowest level of a Spreadsheet, the cell.
A cell in Excel/OpenXML can have different types of data associated with it. Like a Time, Date, Number, Text or Formula. These different type need to be handled differently when read/written from/to a spreadsheet.
I decided to have a common interface for all subtypes like TextCell, NumberCell, DateCell etc.
Now when I read the cell from the spreadsheet the Method/Factory can decide which type of cell to create.
Now because the cell is an abstract from the real implementation it does not know / does not need to know of what type it is. For writing / modifying the cell I solve this problem by calling .write(ICellWriter) on the cell I want to persist. As the cell itself knows what type of data it contains, it knows which method of ICellWriter it needs to call (static polymorpism).
My problem:
Writing to the xlsx file is no problem. My problem is, how do I get the data out of my cell into my DTO/PONO without resorting to type checking -> If TypeOf variable is ClassX then doesomething End If. As Methods / Properties have to have different Signatures and differentiating by only using a different return type is not allowed.
Edit:
The holder (collection, in this case a row of a table/spreadsheet) of the objects (refering to the cells) does not know the concrete implementations. So for writing a cell I pass it a Cellwriter. This Cellwriter has overloaded methods like Write(num as Integer), Write(text as String), Write(datum as Date). The cell object that gets this passed to it then calls the Write() method with the data type it holds. This works, as no return value is passed back.
But what do I do when I need the concrete data type returned?
Any ideas, advice, insight?
Thanks
Edit:
Glossary:
DTO: Data Transfere Object
PONO: Plain Old .Net Object
xlsx: referring to file ending of excel workbook files
Edit:
The Cell "subtypes" implement a common interface and do not inherit from a common superclass.
Edit:
After some thinking about the problem I came to realize that it’s not possible without reflection or knowledge of what type of cell I am expecting. Basically I was trying to recreate a spreadsheet or something with similar functionality and way too abstract/configurable for my needs. Thanks for your time & effort put in to writing the answer. I accepted the answer that was closest to what I realized.
I don't think you can.
If I'm understanding correctly, you have a different types of cells (StringCell, IntCell) and each of those concrete classes returns an object of type 'Object'. When you are using the base class 'Cell' and getting it's value - it's of type Object.
To work with it as a String, or Integer, Or Date, etc...etc... I think you need to inspect the type of that object, one way or another. You can use TypeOf like you demonstrated; I've also seen things like '.GetValueAsString()/.GetValueAsInteger()' on the base class. But you still need knowledge enough to say 'Dim myInt as Integer = myCell.GetValueAsInteger()'
Generally speaking, at least if you subscribe to the SOLID principals, you shouldn't care.
It states that, in a computer program if S is a subtype of T, then objects of type T may be replaced with objects of type S (i.e., objects of type S may be substitutes for objects of type T), without altering any of the desirable properties of that program (correctness, task performed, etc.)
http://en.wikipedia.org/wiki/Liskov_substitution_principle
If you have subtypes of cells, but you can't use them interchangeably, it's a good candidate for not using inheritance.
I don't know what you intending to do with the values in the cells that would require you to have the concrete class instead of using the base; but it might be possible to expose that functionality in the base itself. IE - if you need to add two cells, you can accomplish that treating them as generic cells (perhaps. At least provided they are of compatible types) without knowing what subtype they are. You should be able to return the base class in your DTO, regardless.
At least, I that's my understanding. I'd certainly wait for more people to chime in before listening to me.

A function that will convert input string to the actual object

I am unsure how to describe what I am looking for, so hopefully the situation will make it somewhat clear.
I have an object with a number of properties (let's say object.one, object.two, object.three). There are about 30 of these properties and they all hold a string ("Pass" or "Fail").
Right now the existing code checks whether the property has value "Pass" or "Fail" and then runs some code that prints stuff out. That is, the same snippet of code is duplicated 30 times, one for each of these properties.
The code looks something like this
If (object.one = ... )
...
End if
If (object.two = ... )
...
End if
If (object.three = ... )
...
End if
I want to use a loop to clean this mess up (each block is huge), but am not sure how to do it. I was thinking perhaps there was a way such that I might be able to construct a string like "object.one" and run some function that will tell the compiler that this is actually an object's property?
That way I could create an array containing the object's name like my array = {"object.one", "object.two", "object.three"} and then do something like, in pseudocode
For each string in my array
If (some_function(string) = ...)
...
End If
Essentially, it would take those massive blocks of duplicated code and reduce it to just one block. Is there such a some_function that I am looking for?
This is in VB.net.
I'm not sure i've understand but you can use reflection and get object properties runtime ?
With reflection you can access public object properties, use PropertyFiled.GetValue() to get one, two, etc. and build an array (i suppose that one, two, etc are object Properties, true?)
Here you can find more information: http://msdn.microsoft.com/en-us/library/system.reflection.aspx
Sorry for my bad english, i'm italian.
You seem to be describing serialization, which is the act of converting object state to a format that can be stored/transmitted and deserialization, which is the opposite.
The .NET framework has several different serializers that can work with text - either XML or JSON - the DataContractSerializer for XML and the DataContractJsonSerizlizer for JSON amongst them.

Newbie question: how do I create a class to hold data in Visual Basic Studio?

I'm really sorry. This must seem like an incredibly stupid question, but unless I ask I'll never figure it out. I'm trying to write a program to read in a csv file in Visual Basic (tried and gave up on C#) and I asked a friend of mine who is much better at programming than I am. He said I should create a class to hold the data that I read in.
The problem is, I've never created a class before, not in VB, Java, or anything. I know all the terms associated with classes, I understand at a high level how classes work no problem. But I suck at the actual details of making one.
So here's what I did:
Public Class TsvData
Property fullDataSet() As Array
Get
Return ?????
End Get
Set(ByVal value As Array)
End Set
End Property
End Class
I got as far as the question marks and I'm stuck.
The class is going to hold a lot of data, so I made it an array. That could be a bad move. I don't know. All i know is that it can't be a String and it certainly can't be an Integer or a Float.
As for the Getter and Setter, the reason I put the question marks in is because I want to return the whole array there. The class will eventually have other properties which are basically permutations of the same data, but this is the full set that I will use when I want to save it out or something. Now I want to return the whole Array, but typing "Return fullDataSet()" doesn't seem like a good idea. I mean, the name of the property is "fullDataSet()." It will just make some kind of loop. But there is no other data to return.
Should I Dim yet another array inside the property, which already is an array, and return that instead?
Instead of writing your own class, you could get yourself familiar with the pre-defined class System.Data.DataTable and then use that for holding CSV data.
In the last few years that I've been programming, I've never actually used a multi-dimensional array, and I'd advise you not to use them, either. There's usually ways of achieving the same with a better data structure. For example, consider creating a class (let's call it CsvRecord) that holds only one record; that is, only one line from the CSV file. Then use any of the standard collection types from the System.Collections.Generic namespace (e.g. List(Of CsvRecord)) to hold the entire data (ie. all lines) in the CSV file. This effectively reduces the problem to, "How do I read in one line of CSV data?"
If you want to take suggestion #2 even further, do as cHao says and don't simply lay out the information you've read as a CsvRecord; instead, create an object that reflects the actual content. For example, if your CSV file contains product–price information, call your CSV record class ProductInfo or something more fitting.
If, however, you want to go on with your current approach, you will need a backing field for the property, as demonstrated by Philipp's answer. Your property then becomes a "façade" that only delegates to this backing field. This is not absolutely necessary: You could simply make the backing field Public and let the user of your class access it directly, though that is not considered a good practice.
Ideally, you ought to have a class representing the specific data you want to read in. Setting an entire array at once is asking for trouble; some programs that read {C,T}SV files will freak out if all rows don't have the same number of columns, which is exceedingly easy to do if you can set the data to be an array of arbitrary length.
If you're trying to represent arbitrary data, frankly, you'd do just as well to use a List(Of String). If it's meant to be a table, you could instead read in the first line and make it a list as above (let's call it "headers"), and then make each row a Dictionary(Of String, String). (Let's call each row "row", and the collection (a list of these dictionary objects) "rows".) Just read in the line, split it like you did the first, and say something like row(headers(column number)) = value for each column, and then stuff it into 'rows'.
Or, you could use the data classes (System.Data.DataTable and System.Data.DataSet would do wonders here).
Usually you use a private member to store the actual data:
Public Class TsvData
Private _fullDataSet As String()
Public Property FullDataSet() As String()
Get
Return _fullDataSet
End Get
Set(ByVal value As String())
_fullDataSet = value
End Set
End Property
Note that this is an instance of bad design since it couples a concept to a concrete representation and allows the clients of the class to modify the internals without any error checking. Returning a ReadOnlyCollection or some dedicated container would be better.

Converting Value Type of a Dictionary in VB.net

Public Class A
Public Class B : Inherits A
Dim DictA As Dictionary(Of Integer, A)
Dim DictB As New Dictionary(Of Integer, B)
DictA = DictB
This doesn't work, as the type can't be converted. Is this somehow possible?
No. You're running into the problem of generic variance, which isn't supported in .NET except in a few very particular ways.
Here's the reason: after your last line, you've got a single dictionary with two "views" onto it, effectively. Now imagine you wrote:
DictA.Add(1, New A())
(Apologies if my VB is slightly off.) That's inserted a "non-B" into the dictionary, but DictB thinks that all the values will be instances of B.
One of the main purposes of generics is to catch potential type failures at compile time rather than execution time - which means this code should fail to compile, which indeed it does precisely because of the lack of variance.
It's a bit counter-intuitive when you're used to normal inheritance, but it does make sense. A bunch of bananas isn't just a collection of fruit - it's a collection of bananas.
Java takes a somewhat different stance on this using wildcarding from the caller's side - it would let you see just the "adding" side of the dictionary for DictB, for instance, as anything you add to DictB would be fine in DictA. Java's generics are very different to .NET generics though...
This is a interesting, and quite complex aspect of Static type systems.
The terms you are looking for is Covariance and Contravariance.
The c# language will get this for certain sorts of operations on Generic types in 4.0 though in a somewhat limited form.
You have it to a certain degree already in the form of
object[] x = new string[1];
Which is allowed because arrays are treated covariantly, which means that you could then do the following:
x[0] = new object();
which would throw an Exception at runtime. Do you see how the same would apply to you Dictionaries...
The c# 4.0 spec is still non final so is subject to change but for some discussions and explanations see this question
Unfortunately not. This is a known absence of covariance/contravariance support. The CLR supports it, but the compilers for C# and VB.Net do not.
The next version of C# (4.0) actually does support it now, but not sure about vb.net.
Also, with System.Linq, one of the extension methods is Cast() which would allow you to cast from a collection of one type to another. However it will still return IEnumerable so you need to manipulate a little bit, but I think key-value types will be harder than collections with only one generic type.