Occasional bad data error when decrypting - vb.net

I have a very strange situation.
Basically I have code that uses a decryptor created by:
Dim des3 As New TripleDESCryptoServiceProvider
des3.Mode = CipherMode.CBC
Return des3.CreateDecryptor(_encKey, _initVec)
The _encKey and _initVec are hardcoded.
I use it by calling:
Dim res() As Byte = decrypt(Convert.FromBase64String(_data))
m_transformDec.TransformFinalBlock(res, 0, res.Length)
Here _data is a string containing the encrypted value. m_transformDec is the Decryptor created previously.
Usually this works. Occasionally, I get a "bad data" error. I print out the value of _data, and it is always the same.
The code is multithreaded, which I suspect is the reason for both the problem, and it being hard to reproduce. The decryptor is created in the creation of the class, and the decryption is done in a Shared function, but I don't see anything there which is not thread-safe.
Any ideas?

You should not assume anything is safe for concurrent calls unless you have reason to believe it is. In the docs, you have the boilerplate text that instance members are not guaranteed to be thread-safe, so you should defensively lock the des3 object when you're using it.
You should not be hard coding the initialization vector; it should be randomly chosen when encrypting data, then stored in some way with the encrypted data (many people choose to tack it onto the beginning of the data, then remove it and use it for decryption; use whatever storage scheme you prefer, though). Using the same IV defeats the purpose of the IV, which serves to make plaintext attacks more difficult.

Related

Accord.net Codification can't handle non-strings

I am trying to use the Accord.net library to build test method of several of the machine learning algorithms that library supports.
One of the issues I have run into is that when I am trying to codify my string data, the Codification class does not seem capable of dealing with any datatable columns that are not strings, despite the documentation saying otherwise.
Codification codebook = new Codification(fulldata, AllAttributeNames);
I call that line where fulldata is a datatable, and I have tried including columns of both Int32 type and Double type, and the Codification class has thrown an error saying it is unable to convert them to type String.
"System.InvalidCastException: 'Unable to cast object of type 'System.Double' to type 'System.String'.'"
EDIT: It turns out this error is because the Codification system can only handle alternate data types if it is encoding the entire table. I suppose I can see the logic here, although I would prefer a better error, or that the method was a little smarter.
I now have another issue that has cropped up related to this. After changing my code to this:
Codification codebook = new Codification(fulldata);
I then learning.Learn(inputs, outputs) my algorithm and want to use the newly trained algorithm. So the next step would be to take a bunch of test data, make sure it matches the codebooks encoding, and send it through the algorithm. Unfortunately, when I try and use the
int[][] testinput = codebook.Transform(testData, inputColumnNameArray);
It blows up claiming it could not find a mapping to transform. It does this in reference to an Integer column that the codebook correctly did not map to new values. So now it seems this Transform method is not capable of handling non-string columns, and I have not found an overload of it that can, even though the documentation indicates it should be able to handle this.
Does anyone know how to get around this issue without manually building the entire int[][] testinput array one value at a time?
Turns out I was able to answer my own question eventually.
The Codification class has two methods of using it as near as I can tell. The constructor that takes a list of column names, as well as the Transform methods both lack intelligence in dealing with non-string data types, perhaps these methods are going away in the future.
The constructor that just takes a datatable by itself, as well as the Apply method, are both capable of handling data types other than strings. Once I switched to using these two methods my errors went away.
Codification codebook = new Codification(fulldata);
int[][] testinput = codebook.Apply(testData, inputColumnNameArray);
The confusion for me lay in all the example code seemingly randomly using these two methods, but using the Apply method only when processing the training data, and using the Transform method when encoding test data.
I am not sure why they chose to do this in the documentation example code, but it definitely took me a long time to figure out what was going on enough to stop having this particular issue.

With... End With vs Using in VB.NET

I just found out that like C#, VB.NET also has the using keyword.
Until now I thought it didn't have it (stupid of me, I know...) and did stuff like this instead:
With New OleDbConnection(MyConnectionString)
' Do stuff
End With
What are the implications of this compared to doing it with a using statement like this
Using cn as New OleDBConnection(MyConnectionString)
With cn
' Do stuff with cn
End With
End using
UPDATE:
I should add that I am familiar with what the using statement does in that it disposes of the object when the construct is exited.
However, as far as I understand the With New ... construct will have the object mark the object as ready for garbage collection when it goes out of scope.
So my question really is, is the only difference the fact that with using I will release the memory right away, whereas with the With construct it will be released whenever the GC feels like it? Or am I missing something bigger here?
Are there any best practice implications? Should I go and rewrite all the code with I used With New MyDisposableObject() ... End With as Using o as New MyDisposableObject()?
With Statements/Blocks
However, as far as I understand the With New ... construct will have the object mark the object as ready for garbage collection when it goes out of scope.
This is both true and not true. It is true in the sense that all objects are "marked" (purists might quibble with this terminology, but the details are not relevant) as ready for garbage collection when they go out of scope. But then in that sense, it is also not entirely true, as there's nothing special about the With keyword with respect to this behavior. When an object goes out of scope, it is eligible for garbage collection. Period. That's true for method-level scope and it's true for block-level scope (e.g., With, For, Using, etc.).
But that's not why you use With. The reason is that it allows you to set multiple properties in sequence on a deeply-nested object. In other words, assume you have an object on which you want to set a bunch of properties, and you access it this way: MyClass.MemberClass.AnotherMemberClass.Items(0). See all those dots? It can (theoretically) become inefficient to write code that has to work through that series of dots over and over to access the exact same object each time you set a property on it. If you know anything about C or C++ (or any other language that has pointers), you can think of each of those dots as implying a pointer dereference. The With statement basically goes through all of that indirection only once, storing the resulting object in a temporary variable, and allowing you to set properties directly on that object stored in the temporary variable.
Perhaps some code would help make things clearer. Whenever you see a dot, think might be slow!
Assume that you start out with the following code, retrieving object 1 from a deeply-nested Items collection and setting multiple properties on it. See how many times we have to retrieve the object, even though it's exactly the same object every time?
MyClass.MemberClass.AnotherMemberClass.Items(0).Color = Blue
MyClass.MemberClass.AnotherMemberClass.Items(0).Width = 10
MyClass.MemberClass.AnotherMemberClass.Items(0).Height = 5
MyClass.MemberClass.AnotherMemberClass.Items(0).Shape = Circle
MyClass.MemberClass.AnotherMemberClass.Items(0).Texture = Shiny
MyClass.MemberClass.AnotherMemberClass.Items(0).Volume = Loud
Now we modify that code to use a With block:
With MyClass.MemberClass.AnotherMemberClass.Items(0)
.Color = Blue
.Width = 10
.Height = 5
.Shape = Circle
.Texture = Shiny
.Volume = Loud
End With
The effect here, however, is identical to the following code:
Dim tempObj As MyObject = MyClass.MemberClass.AnotherMemberClass.Items(0)
tempObj.Color = Blue
tempObj.Width = 10
tempObj.Height = 5
tempObj.Shape = Circle
tempObj.Texture = Shiny
tempObj.Volume = Loud
Granted, you don't introduce a new scope, so tempObj won't go out of scope (and therefore be eligible for garbage collection) until the higher level scope ends, but that's hardly ever a relevant concern. The performance gain (if any) attaches to both of the latter two code snippets.
The real win of using With blocks nowadays is not performance, but readability. For more thoughts on With, possible performance improvements, stylistic suggestions, and so on, see the answers to this question.
With New?
Adding the New keyword to a With statement has exactly the same effect that we just discussed (creating a local temporary variable to hold the object), except that it's almost an entirely pointless one. If you need to create an object with New, you might as well declare a variable to hold it. You're probably going to need to do something with that object later, like pass it to another method, and you can't do that in a With block.
It seems that the only purpose of With New is that it allows you to avoid an explicit variable declaration, instead causing the compiler to do it implicitly. Call me crazy, but I see no advantage in this.
In fact, I can say that I have honestly never seen any actual code that uses this syntax. The only thing I can find on Google is nonsense like this (and Call is a much better alternative there anyway).
Using Statements/Blocks
Unlike With, Using is incredibly useful and should appear frequently in typical VB.NET code. However, it is very limited in its applicability: it works only with objects whose type implements the IDisposable interface pattern. You can tell this by checking to see whether the object has a Dispose method that you're supposed to call whenever you're finished with it in order to release any unmanaged resources.
That is, by the way, a rule you should always follow when an object has a Dispose method: you should always call it whenever you're finished using the object. If you don't, it's not necessarily the end of the world—the garbage collector might save your bacon—but it's part of the documented contract and always good practice on your part to call Dispose for each object that provides it.
If you try to wrap the creation of an object that doesn't implement IDisposable in a Using block, the compiler will bark at you and generate an error. It doesn't make sense for any other types because its function is essentially equivalent to a Try/Finally block:
Try
' [1: Create/acquire the object]
Dim g As Graphics = myForm.CreateGraphics()
' [2: Use the object]
g.DrawLine(Pens.Blue, 10, 10, 100, 100)
' ... etc.
End Try
Finally
' [3: Ensure that the object gets disposed, no matter what!]
g.Dispose()
End Finally
But that's ugly and becomes rather unwieldy when you start nesting (like if we'd created a Pen object that needed to be disposed as well). Instead, we use Using, which has the same effect:
' [1: Create/acquire the object]
Using g As Graphics = myForm.CreateGraphics()
' [2: Use the object]
g.DrawLine(Pens.Blue, 10, 10, 100, 100)
' ...etc.
End Using ' [3: Ensure that the object gets disposed, no matter what!]
The Using statement works both with objects you are first acquiring (using the New keyword or by calling a method like CreateGraphics), and with objects that you have already created. In both cases, it ensures that the Dispose method gets called, even if an exception gets thrown somewhere inside of the block, which ensures that the object's unmanaged resources get correctly disposed.
It scares me a little bit that you have written code in VB.NET without knowing about the Using statement. You don't use it for the creation of all objects, but it is very important when you're dealing with objects that implement IDisposable. You definitely should go back and re-check your code to ensure that you're using it where appropriate!
By using With...End With, you can perform a series of statements on a specified object without specifying the name of the object multiple times.
A Using block behaves like a Try...Finally construction in which the Try block uses the resources and the Finally block disposes of them.
Managed resources are disposed by the garbage collector without any extra coding on your part.
You do not need Using or With statements.
Sometimes your code requires unmanaged resources. You are responsible for their disposal. A Using block guarantees that the Dispose method on the object is called when your code is finished with them.
The difference is the Using With...End End
Using cn as New OleDBConnection(MyConnectionString)
With cn
' Do stuff with cn
End With
End using
Calls cn.Dispose() automatically when going out of scope (End Using). But in the With New...End
With New OleDbConnection(MyConnectionString)
' Do stuff
End With
.Dispose() is not explicitly called.
Also with the named object, you can create watches and ?cn in the immediate window. With the unnamed object, you cannot.
Using connection... End Using : Be careful !!!
This statement will close your database connection !
In the middle of a module or form(s), i.e.adding or updating records, this will close the connection. When you try to do another operation in that module you will receive a database error. For connections, I no longer use it. You can use the Try... End Try wihthouth the Using statement.
I open the connection upon entry to the module and close it it upon exit. That solves the problem.

Generating a key

I am writing an encryption application that requires a 64 bit key. I am currently using the following code to automatically generate a key.
Function GenerateKey() As String
' Create an instance of a symmetric algorithm. The key and the IV are generated automatically.
Dim desCrypto As DESCryptoServiceProvider = DESCryptoServiceProvider.Create()
' Use the automatically generated key for encryption.
Return ASCIIEncoding.ASCII.GetString(desCrypto.Key)
End Function
I am wanting the user to create their own key. Can I convert a user defined password (a string) into a 64 bit key that can be used?
The answer depends on how secure you want it to be, I'm no security expert so I wouldn't give advice on it.
I did see this though: http://msdn.microsoft.com/en-us/library/system.security.cryptography.rfc2898derivebytes.aspx It can be used to derives bytes from a string key and salt in the way Jodrell eluded to, and would be far better than rolling yor own.
The other constructor that might be suited after that stage is detailed here: http://msdn.microsoft.com/en-us/library/51cy2e75.aspx
I'm sure if you searched for that on the web you could find examples of how to use it.

Most appropriate data structure for dynamic languages field access

I'm implementing a dynamic language that will compile to C#, and it's implementing its own reflection API (.NET's is too slow, and the DLR is limited only to more recent and resourceful implementations).
For this, I've implemented a simple .GetField(string f) and .SetField(string f, object val) interface. Until recently, the implementation just switches over all possible field string values and makes the corresponding action.
Also, this dynamic language has the possibility to define anonymous objects. For those anonymous objects, at first, I had implemented a simple hash algorithm.
By now, I am looking for ways to optimize the dynamic parts of the language, and I have come across the fact that a hash algorithm for anonymous objects would be overkill. This is because the objects are usually small. I'd say the objects contain 2 or 3 fields, normally. Very rarely, they would contain more than 15 fields. It would take more time to actually hash the string and perform the lookup than if I would test for equality between them all. (This is not tested, just theoretical).
The first thing I did was to -- at compile-time -- create a red-black tree for each anonymous object declaration and have it laid onto an array so that the object can look for it in a very optimized way.
I am still divided, though, if that's the best way to do this. I could go for a perfect hashing function. Even more radically, I'm thinking about dropping the need for strings and actually work with a struct of 2 longs.
Those two longs will be encoded to support 10 chars (A-za-z0-9_) each, which is mostly a good prediction of the size of the fields. For fields larger than this, a special function (slower) receiving a string will also be provided.
The result will be that strings will be inlined (not references), and their comparisons will be as cheap as a long comparison.
Anyway, it's a little hard to find good information about this kind of optimization, since this is normally thought on a vm-level, not a static language compilation implementation.
Does anyone have any thoughts or tips about the best data structure to handle dynamic calls?
Edit:
For now, I'm really going with the string as long representation and a linear binary tree lookup.
I don't know if this is helpful, but I'll chuck it out in case;
If this is compiling to C#, do you know the complete list of fields at compile time? So as an idea, if your code reads
// dynamic
myObject.foo = "some value";
myObject.bar = 32;
then during the parse, your symbol table can build an int for each field name;
// parsing code
symbols[0] == "foo"
symbols[1] == "bar"
then generate code using arrays or lists;
// generated c#
runtimeObject[0] = "some value"; // assign myobject.foo
runtimeObject[1] = 32; // assign myobject.bar
and build up reflection as a separate array;
runtimeObject.FieldNames[0] == "foo"; // Dictionary<int, string>
runtimeObject.FieldIds["foo"] === 0; // Dictionary<string, int>
As I say, thrown out in the hope it'll be useful. No idea if it will!
Since you are likely to be using the same field and method names repeatedly, something like string interning would work well to quickly generate keys for your hash tables. It would also make string equality comparisons constant-time.
For such a small data set (expected upper bounds of 15) I think almost any hashing will be more expensive then a tree or even a list lookup, but that is really dependent on your hashing algorithm.
If you want to use a dictionary/hash then you'll need to make sure the objects you use for the key return a hash code quickly (perhaps a single constant hash code that's built once). If you can prevent collisions inside of an object (sounds pretty doable) then you'll gain the speed and scalability (well for any realistic object/class size) of a hash table.
Something that comes to mind is Ruby's symbols and message passing. I believe Ruby's symbols act as a constant to just a memory reference. So comparison is constant, they are very lite, and you can use symbols like variables (I'm a little hazy on this and don't have a Ruby interpreter on this machine). Ruby's method "calling" really turns into message passing. Something like: obj.func(arg) turns into obj.send(:func, arg) (":func" is the symbol). I would imagine that symbol makes looking up the message handler (as I'll call it) inside the object pretty efficient since it's hash code most likely doesn't need to be calculated like most objects.
Perhaps something similar could be done in .NET.

Copy bytes in memory to an Array in VB.NET

unfortunately I cannot resort to C# in my current project, so I'll have to solve this without the unsafe keyword.
I've got a bitmap, and I need to access the pixels and channel values directly. I'd like to go beyond Marshal.ReadByte() and Marshal.WriteByte() (and definitely beyond GetPixel and SetPixel).
Is there a way to put all the pixel data of the bitmap into a Byte array that works on both 32 and 64 bit systems? I want the exact same layout as the original bitmap, so the padding for each row (if it exists) also needs to be included.
Marshal doesn't seem to have something akin to:
byte[] ReadBytes(IntPtr start, int offset, int count)
Unless I totally missed it...
Any help greatly appreciated,
David
ps. So far all my images are in 32BppPArgb pixelformat.
Marshal does have a Method that does exactly what you are asking. See Marshall.Copy()
public static void Copy(
IntPtr source,
byte[] destination,
int startIndex,
int length
)
Copies data from an unmanaged memory
pointer to a managed 8-bit unsigned
integer array.
And there are overloads to go the other direction as well
Would something like this do? (untested):
Public Shared Function BytesFromBitmap(ByVal Image As Drawing.Bitmap) As Byte()
Using buffer As New IO.MemoryStream()
image.Save(result, Drawing.Imaging.ImageFormat.Bmp)
Using rdr As New IO.BinaryReader(buffer)
Return rdr.ReadBytes(buffer.Length)
End Using
End Using
End Function
It won't let you manipulate the pixels in a Drawing.Bitmap object directly, but it will let you copy that bitmap to a byte array, as per the question title.
Another option is serialization via the BinaryFormatter, but I think that will still require you to pass it through a MemoryStream.
VB does not offer methods for direct memory access. You have two choices:
Use the Marshal class
Write a small unsafe C# (or C++/CLI) library that handles only these operations and reference it from your VB code.
Alright, there is a third option. VB.Net does not inherently support direct memory access, but it can be accomplished. It's just ugly and prone to errors. Nonetheless, if you're willing to put in the effort, you can try building a bitmap access library using these techniques combined with the approach referenced previously.
shf301 is right on the money, but I'd like to add a link to a comprehensive explanation/tutorial on fast pixel data access. Rather than saving the image to a stream and accessing a file-in-memory, it would be better to lock the bitmap, copy pixel data out, access it, and copy it back in. The performance of this technique is pretty good.
Code is in c#, but the approach is language-neutral and easy to read.
http://ilab.ahemm.org/tutBitmap.html