.NET Obfuscation [SmartAssembly] - vb.net

Just a quick question in relation to SmartAssembly and .NET applications. - I am experimenting with the software at the moment and it seems to be obfuscating code but My.Settings is still visible in plain text?
So previous to obfucating my code (using .NET reflector) I could literally see almost everything. Including the My.Settings class containing lots of info such as passwords, ip's, MySQL connection strings etc..
So I obfuscated the code using RedGate's SmartAssembly and sure enough all the classes/functions etc appeared with random symbols, however several items (again including My.Settings) remained untouched?
SmartAssembly Screenshot
Obfuscated result in .NET reflector

There are limitations to what most obfuscation tools can do, and this is one of them. Settings values are not stored as string literals or in backing fields, but as an attribute value:
Global.System.Configuration.DefaultSettingValueAttribute("bar")> _
Public Property Foo() As String
Get
Return CType(Me("Foo"), String)
End Get
Set(value As String)
Me("Foo") = value
End Set
End Property
VB/VS generates the Property getter/setter, but as you can see it uses an attribute to store the initial value as opposed to:
Private _foo As String = "bar"
In most cases there is no reason to hide the string content used in Attributes because they are usually instructions to the compiler about the class or property:
<Description("Bar String")>
<DefaultValue("bar")>
<DesignerSerializationVisibility(DesignerSerializationVisibility.Visible)>
Property BarString As String
None of these Attribute literals needs to be hidden because most Attributes contains neither runtime data nor sensitive information. As a result, My.Settings is a fringe case and is the result of how it is implemented. Note that this only applies to the default, initial values you enter in the IDE. If you update them at runtime, they are not written back to the Attributes, but saved to a file.
Since you have a trivial number of settings there, just write a small class to manage them yourself and save to a file in Environment.GetFolderPath(Environment.SpecialFolder.LocalApplicationData)

Related

Do I understand not using getters and setters correctly

After reading this piece by Yegor about not using getters and setters, it sounds like something that makes sense to me.
Please note this question is not about whether doing it is better/worst, only if I am implementing it correctly
I was wondering in the following two examples in VBA, if I understand the concept correctly, and if I am applying it correctly.
The standard way would be:
Private userName As String
Public Property Get Name() As String
Name = userName
End Property
Public Property Let Name(rData As String)
userName = rData
End Property
It looks to me his way would be something like this:
Private userName As String
Public Function returnName() As String
returnName = userName
End Function
Public Function giveNewName(newName As String) As String
userName = newName
End Function
From what I understand from the two examples above is that if I wanted to change the format of userName (lets say return it in all-caps), then I can do this with the second method without changing the name of the method that gives the name through - I can just let returnName point to a userNameCaps property. The rest of my code in my program can still stay the same and point to the method userName.
But if I want to do this with the first example, I can make a new property, but then have to change my code everywhere in the program as well to point to the new property... is that correct?
In other words, in the first example the API gets info from a property, and in the second example the API gets info from a method.
Your 2nd snippet is neither idiomatic nor equivalent. That article you link to, is about Java, a language which has no concept whatsoever of object properties - getFoo/setFoo is a mere convention in Java.
In VBA this:
Private userName As String
Public Property Get Name() As String
Name = userName
End Property
Public Property Let Name(rData As String)
userName = rData
End Property
Is ultimately equivalent to this:
Public UserName As String
Not convinced? Add such a public field to a class module, say, Class1. Then add a new class module and add this:
Implements Class1
The compiler will force you to implement a Property Get and a Property Let member, so that the Class1 interface contract can be fulfilled.
So why bother with properties then? Properties are a tool, to help with encapsulation.
Option Explicit
Private Type TSomething
Foo As Long
End Type
Private this As TSomething
Public Property Get Foo() As Long
Foo = this.Foo
End Property
Public Property Let Foo(ByVal value As Long)
If value <= 0 Then Err.Raise 5
this.Foo = value
End Property
Now if you try to assign Foo with a negative value, you'll get a runtime error: the property is encapsulating an internal state that only the class knows and is able to mutate: calling code doesn't see or know about the encapsulated value - all it knows is that Foo is a read/write property. The validation logic in the "setter" ensures the object is in a consistent state at all times.
If you want to break down a property into methods, then you need a Function for the getter, and assignment would be a Sub not a Function. In fact, Rubberduck would tell you that there's a problem with the return value of giveNewName being never assigned: that's a much worse code smell than "OMG you're using properties!".
Functions return a value. Subs/methods do something - in the case of an object/class, that something might imply mutating internal state.
But by avoiding Property Let just because some Java guy said getters & setters are evil, you're just making your VBA API more cluttered than it needs to be - because VBA understands properties, and Java does not. C# and VB.NET do however, so if anything the principles of these languages would be much more readily applicable to VBA than Java's, at least with regards to properties. See Property vs Method.
FWIW public member names in VB would be PascalCase by convention. camelCase public member names are a Java thing. Notice how everything in the standard libraries starts with a Capital first letter?
It seems to me that you've just given the property accessors new names. They are functionally identical.
I think the idea of not using getters/setters implies that you don't try to externally modify an object's state - because if you do, the object is not much more than a user-defined type, a simple collection of data. Objects/Classes should be defined by their behavior. The data they contain should only be there to enable/support that behavior.
That means you don't tell the object how it has to be or what data you want it to hold. You tell it what you want it to do or what is happening to it. The object itself then decides how to modify its state.
To me it seems your example class is a little too simple to work as an example. It's not clear what the intended purpose is: Currently you'd probably better off just using a variable UserName instead.
Have a look at this answer to a related question - I think it provides a good example.
Regarding your edit:
From what I understand from the two examples above is that if I wanted
to change the format of userName (lets say return it in all-caps),
then I can do this with the second method without changing the name of
the method that gives the name through - I can just let returnName
point to a userNameCaps property. The rest of my code in my program
can still stay the same and point to the method iserName.
But if I want to do this with the first example, I can make a new
property, but then have to change my code everywhere in the program as
well to point to the new property... is that correct?
Actually, what you're describing here, is possible in both approaches. You can have a property
Public Property Get Name() As String
' possibly more code here...
Name = UCase(UserName)
End Property
or an equivalent function
Public Function Name() As String
' possibly more code here...
Name = UCase(UserName)
End Function
As long as you only change the property/function body, no external code needs to be adapted. Keep the property's/function's signature (the first line, including the Public statement, its name, its type and the order and type of its parameters) unchanged and you should not need to change anything outside the class to accommodate.
The Java article is making some sort of philosophic design stance that is not limited to Java: The general advise is to severely limit any details on how a class is implemented to avoid making one's code harder to maintain. Putting such advice into VBA terms isn't irrelevant.
Microsoft popularized the idea of a Property that is in fact a method (or two) which masquerade as a field (i.e. any garden-variety variable). It is a neat-and-tidy way to package up a getter and setter together. Beyond that, really, behind the scenes it's still just a set of functions or subroutines that perform as accessors for your class.
Understand that VBA does not do classes, but it does do interfaces. That's what a "Class Module" is: An interface to an (anonymous) class. When you say Dim o As New MyClassModule, VBA calls some factory function which returns an instance of the class that goes with MyClassModule. From that point, o references the interface (which in turn is wired into the instance). As #Mathieu Guindon has demonstrated, Public UserName As String inside a class module really becomes a Property behind the scenes anyway. Why? Because a Class Module is an interface, and an interface is a set of (pointers to) functions and subroutines.
As for the philosophic design stance, the really big idea here is not to make too many promises. If UserName is a String, it must always remain a String. Furthermore, it must always be available - you cannot remove it from future versions of your class! UserName might not be the best example here (afterall, why wouldn't a String cover all needs? for what reason might UserName become superfluous?). But it does happen that what seemed like a good idea at the time the class was being made turns into a big goof. Imagine a Public TwiddlePuff As Integer (or instead getTwiddlePuff() As Integer and setTwiddlePuff(value As Integer)) only to find out (much later on!) that Integer isn't sufficient anymore, maybe it should have been Long. Or maybe a Double. If you try to change TwiddlePuff now, anything compiled back when it was Integer will likely break. So maybe people making new code will be fine, and maybe it's mostly the folks who still need to use some of the old code who are now stuck with a problem.
And what if TwiddlePuff turned out to be a really big design mistake, that it should not have been there in the first place? Well, removing it brings its own set of headaches. If TwiddlePuff was used at all elsewhere, that means some folks may have a big refactoring job on their hands. And that might not be the worst of it - if your code compiles to native binaries especially, that makes for a really big mess, since an interface is about a set of function pointers layed out and ordered in a very specific way.
Too reiterate, do not make too many promises. Think through on what you will share with others. Properties-getters-setters-accessors are okay, but must be used thoughtfully and sparingly. All of that above is important if what you are making is code that you are going to share with others, and others will take it and use it as part of a larger system of code, and it may be that these others intend to share their larger systems of code with yet even more people who will use that in their even larger systems of code.
That right there is probably why hiding implementation details to the greatest extent possible is regarded as fundamental to object oriented programming.

How can I create an "enum" type property value dynamically

I need to create a property in a class that will give the programmer a list of values to choose from. I have done this in the past using the enums type.
Public Enum FileType
Sales
SalesOldType
End Enum
Public Property ServiceID() As enFileType
Get
Return m_enFileType
End Get
Set(ByVal value As enenFileType)
m_enFileType = value
End Set
End Property
The problem is I would like to populate the list of values on init of the class based on SQL table values. From what I have read it is not possible to create the enums on the fly since they are basically constants.
Is there a way I can accomplish my goal possibly using list or dictionary types?
OR any other method that may work.
I don't know if this will answer your question, but its just my opinion on the matter. I like enums, mostly because they are convenient for me, the programmer. This is just because when I am writing code, using and enum over a constant value gives me not only auto-complete when typing, but also the the compile time error checking that makes sure I can only give valid enum values. However, enums just don't work for run-time defined values, since, like you say, there are compile time defined constants.
Typically, when I use flexible values that are load from an SQL Table like in your example, I'll just use string values. So I would just store Sales and SalesOldType in the table and use them for the value of FileType. I usually use strings and not integers, just because strings are human readable if I'm looking at data tables while debugging something.
Now, you can do a bit of a hybrid, allowing the values to be stored and come from the table, but defining commonly used values as constants in code, sorta like this:
Public Class FileTypeConstants
public const Sales = "Sales"
public const SalesOldType = "SalesOldType"
End Class
That way you can make sure when coding with common values, a small string typo in one spot doesn't cause a bug in your program.
Also, as a side note, I write code for and application that is deployed internally to our company via click-once deployment, so for seldom added values, I will still use an enum because its extremely easy to add a value and push out an update. So the question of using and enum versus database values can be one of how easy it is to get updates to your users. If you seldom update, database values are probably best, if you update often and the updates are not done by users, then enums can still work just as well.
Hope some of that helps you!
If the values aren't going to change after the code is compiled, then it sounds like the best option is to simply auto-generate the code. For instance, you could write a simple application that does something like this:
Public Shared Sub Main()
Dim builder As New StringBuilder()
builder.AppendLine("' Auto-generated code. Don't touch!! Any changes will be automatically overwritten.")
builder.AppendLine("Public Enum FileType")
For Each pair As KeyValuePair(Of String, Integer) In GetFileTypesFromDb()
builder.AppendLine(String.Format(" {0} = {1}", pair.Key, pair.Value))
End For
builder.AppendLine("End Enum")
File.WriteAllText("FileTypes.vb", builder.ToString())
End Sub
Public Function GetFileTypesFromDb() As Dictionary(Of String, Integer)
'...
End Function
Then, you could add that application as a pre-build step in your project so that it automatically runs each time you compile your main application.

Proto-buf serialization with Obfuscation

I am looking for some guidance as to what is going on when using proto-buf net with obfuscation (Dotfuscator). One half of the project is a DLL and the other is an EXE elsewhere and using proto-buf NET they exchange data flawlessly. Until I obfuscate the DLL.
At that point P-BN fails without raising an exception, returning variously a 0 length byte array or a foreshortened one depending on what I have fiddled with. The class is fairly simple (VB):
<ProtoContract(Name:="DMailer")> _
Friend Class DMailer
Private _Lic As Cert
Private _Sys As Sys
Private _LList As List(Of LItem)
..
..
End Class
There are 3 props all decorated with ProtoMember to get/set the constituent class objects. Snipped for brevity.
Again, it works GREAT until I obfuscate the DLL. Then, Dotfuscator renames each of these to null, apparently since they are all Friend, and that seems to choke proto-buff. If I exempt the class from renaming (just the class name, not props/members), it seems to work again. It makes sense that P-BN would only be able to act on objects with a proper name, though when asked to serialize a null named object, it seems like an exception might be in order.
On the other hand, much of the charm of PB-N is supposed to be serialization independent of .NET names working from attributes - at least as I understand it. Yet in this case it only seems to work with classes with names. I tried using the Name qualifier or argument as shown above, to no avail - it apparently doesnt do what I thought it might.
So, I am curious if:
a) ...I have basically surmised the problem correctly
b) ...There is some other attribute or flag that might facilitate serializing
a null named object
c) ...if there are any other insights that would help.
If I exempt all 3 or 4 classes from Dotfuscator renaming (LList is not actually implemented yet, leaving DMailer, Cert and Sys), the DLL seems to work again - at least the output is the correct size. I can live with that, though obscured names would be better: Dotfuscator (CE) either exempts them or sets the names to Null - I cant seem to find a way to force them to be renamed.
Rather than exempt 3 or 4 classes from renaming, one alternative I am considering is to simply store the Serializer output for Cert and Sys as byte arrays or Base64 strings in DMailer instead of classes. Then have the receiver Deserialize each object individually. It is kind of nice to be able to unpack just one thing and have your toys right there as if by magic though.
(many)TIA
Interesting. I confess I have never tried this scenario, but if you can walk me through your process (or better: maybe provide a basic repro example with "run this, then this, then this: boom") I'll happily investigate.
Note: the Name on ProtoContract is mainly intended for GetProto() usage; it is not needed by the core serializer, and can be omitted to reduce your exposure. Also, protobuf-net isn't interested in fields unless those fields are decorated with the attributes, so that shouldn't be an issue.
However! there's probably a workaround here that should work now; you can pre-generate a static serialization dll; for example in a separate console exe (just as a tool; I really need to wrap this in a standalone utility!)
So if you create a console exe that references your unobfuscated library and protobuf-net.dll:
var model = RuntimeTypeModel.Create();
model.Add(typeof(DMailer), true); // true means "use the attributes etc"
// and other types needed, etc
model.Compile("MailSerializer", "MailSerializer.dll");
this should write MailSerializer.dll, which you can then reference from your main code (in addition to protobuf-net), and use:
var ser = new MailSerializer(); // our pre-genereated serializer
ser.Serialize(...); // etc
Then include MailSerializer.dll in your obfuscation payload.
(this is all v2 specific, btw)
If this doesn't work, I'll need to investigate the main issue, but I'm not an obfuscation expert so could do with your repro steps.
Since there were a few upticks of interest, here is what looks like will work:
a) No form of reflection will be able to get the list of properties for an obfuscated type.
I tried walking thru all the types to find the ones with ProtoContract on it, I could find them
but the property names are all changed to a,m, b, j, g.
I also tried Me.GetType.GetProperties with the same result.
You could implement a map from the output to indicate that Employee.FirstName is now a0.j, but distributing this defeats the purpose of obfuscation.
b) What does work to a degree is to exempt the class NAME from obfuscation. Since PB-N looks for the ProtoMember attributes to get the data, you CAN obfuscate the Property/Member names, just not the CLASS/type name. If the name is something like FederalReserveLogIn, your class/type has a bullseye on it.
I have had initial success doing the following:
1) Build a simple class to store a Property Token and value. Store everything as string using ConvertFromInvariantString. Taking a tip from PBN, I used an integer for the token:
<ProtoMember(propIndex.Foo)>
Property Foo As String
An enum helps tie everything together later. Store these in a Dictionary(Of T, NameValuePair)
2) add some accessors. these can perform the type conversions for you:
Public Sub Add(ByVal Key As T, ByVal value As Object)
If _col.ContainsKey(Key) Then
_col.Remove(Key)
End If
_col.Add(Key, New TValue(value))
End Sub
Public Function GetTItem(Of TT)(key As T) As TT
If _col.ContainsKey(key) Then
Return CType(_col(key).TValue, TT)
Else
Return Nothing
End If
End Function
T is whatever key type you wish to use. Integer results in the smallest output and still allows the subscribing code to use an Enum. But it could be String.
TT is the original type:
myFoo = props.GetTItem(Of Long)(propsEnum.Foo)
3) Expose the innerlist (dictionary) to PBN and bingo, all done.
Its also very easy to add converters for Point, Rectangle, Font, Size, Color and even bitmap.
HTH

Can I autogenerate/compile code on-the-fly, at runtime, based upon values (like key/value pairs) parsed out of a configuration file?

This might be a doozy for some. I'm not sure if it's even 100% implementable, but I wanted to throw the idea out there to see if I'm really off of my rocker yet.
I have a set of classes that mimics enums (see my other questions for specific details/examples). For 90% of my project, I can compile everything in at design time. But the remaining 10% is going to need to be editable w/o re-compiling the project in VS 2010. This remaining 10% will be based on a templated version of my Enums class, but will generate code at runtime, based upon data values sourced in from external configuration files.
To keep this question small, see this SO question for an idea of what my Enums class looks like. The templated fields, per that question, will be the MaxEnums Int32, Names String() array, and Values array, plus each shared implementation of the Enums sub-class (which themselves, represent the Enums that I use elsewhere in my code).
I'd ideally like to parse values from a simple text file (INI-style) of key/value pairs:
[Section1]
Enum1=enum_one
Enum2=enum_two
Enum3=enum_three
So that the following code would be generated (and compiled) at runtime (comments/supporting code stripped to reduce question size):
Friend Shared ReadOnly MaxEnums As Int32 = 3
Private Shared ReadOnly _Names As String() = New String() _
{"enum_one", "enum_two", "enum_three"}
Friend Shared ReadOnly Enum1 As New Enums(_Names(0), 1)
Friend Shared ReadOnly Enum2 As New Enums(_Names(1), 2)
Friend Shared ReadOnly Enum3 As New Enums(_Names(2), 4)
Friend Shared ReadOnly Values As Enums() = New Enums() _
{Enum1, Enum2, Enum3}
I'm certain this would need to be generated in MSIL code, and I know from reading that the two components to look at are CodeDom and Reflection.Emit, but I was wondering if anyone had working examples (or pointers to working examples) versus really long articles. I'm a hands-on learner, so I have to have example code to play with.
Thanks!
You can use the Compiler Class to compile the code into a dll. Then just dynamically load the assembly at run-time.

Newbie question: how do I create a class to hold data in Visual Basic Studio?

I'm really sorry. This must seem like an incredibly stupid question, but unless I ask I'll never figure it out. I'm trying to write a program to read in a csv file in Visual Basic (tried and gave up on C#) and I asked a friend of mine who is much better at programming than I am. He said I should create a class to hold the data that I read in.
The problem is, I've never created a class before, not in VB, Java, or anything. I know all the terms associated with classes, I understand at a high level how classes work no problem. But I suck at the actual details of making one.
So here's what I did:
Public Class TsvData
Property fullDataSet() As Array
Get
Return ?????
End Get
Set(ByVal value As Array)
End Set
End Property
End Class
I got as far as the question marks and I'm stuck.
The class is going to hold a lot of data, so I made it an array. That could be a bad move. I don't know. All i know is that it can't be a String and it certainly can't be an Integer or a Float.
As for the Getter and Setter, the reason I put the question marks in is because I want to return the whole array there. The class will eventually have other properties which are basically permutations of the same data, but this is the full set that I will use when I want to save it out or something. Now I want to return the whole Array, but typing "Return fullDataSet()" doesn't seem like a good idea. I mean, the name of the property is "fullDataSet()." It will just make some kind of loop. But there is no other data to return.
Should I Dim yet another array inside the property, which already is an array, and return that instead?
Instead of writing your own class, you could get yourself familiar with the pre-defined class System.Data.DataTable and then use that for holding CSV data.
In the last few years that I've been programming, I've never actually used a multi-dimensional array, and I'd advise you not to use them, either. There's usually ways of achieving the same with a better data structure. For example, consider creating a class (let's call it CsvRecord) that holds only one record; that is, only one line from the CSV file. Then use any of the standard collection types from the System.Collections.Generic namespace (e.g. List(Of CsvRecord)) to hold the entire data (ie. all lines) in the CSV file. This effectively reduces the problem to, "How do I read in one line of CSV data?"
If you want to take suggestion #2 even further, do as cHao says and don't simply lay out the information you've read as a CsvRecord; instead, create an object that reflects the actual content. For example, if your CSV file contains product–price information, call your CSV record class ProductInfo or something more fitting.
If, however, you want to go on with your current approach, you will need a backing field for the property, as demonstrated by Philipp's answer. Your property then becomes a "façade" that only delegates to this backing field. This is not absolutely necessary: You could simply make the backing field Public and let the user of your class access it directly, though that is not considered a good practice.
Ideally, you ought to have a class representing the specific data you want to read in. Setting an entire array at once is asking for trouble; some programs that read {C,T}SV files will freak out if all rows don't have the same number of columns, which is exceedingly easy to do if you can set the data to be an array of arbitrary length.
If you're trying to represent arbitrary data, frankly, you'd do just as well to use a List(Of String). If it's meant to be a table, you could instead read in the first line and make it a list as above (let's call it "headers"), and then make each row a Dictionary(Of String, String). (Let's call each row "row", and the collection (a list of these dictionary objects) "rows".) Just read in the line, split it like you did the first, and say something like row(headers(column number)) = value for each column, and then stuff it into 'rows'.
Or, you could use the data classes (System.Data.DataTable and System.Data.DataSet would do wonders here).
Usually you use a private member to store the actual data:
Public Class TsvData
Private _fullDataSet As String()
Public Property FullDataSet() As String()
Get
Return _fullDataSet
End Get
Set(ByVal value As String())
_fullDataSet = value
End Set
End Property
Note that this is an instance of bad design since it couples a concept to a concrete representation and allows the clients of the class to modify the internals without any error checking. Returning a ReadOnlyCollection or some dedicated container would be better.