How to extract class IL code from loaded assembly and save to disk? - serialization

How would I go about extracting the IL code for classes that are generated at runtime by reflection so I can save it to disk? If at all possible. I don't have control of the piece of code that generates these classes.
Eventually, I would like to load this IL code from disk into another assembly.
I know I could serialise/deserialise classes but I wish to use purely IL code. I'm not fussed about the security implications.
Running Mono 2.10.1

Or better yet, use Mono.Cecil.
It will allow you to get at the individual instructions, and even to manipulate and decompile them (with the Mono decompiler add-on).
Note that the decompiler is a work in progress (last time I checked it did not fully support lambda expressions and Visual Basic exception blocks), but you can get nicely decompiled C# output fairly easily as long as you don't hit those boundary cases. Work has also progressed since then.
Mono.Cecil in general also lets you write the IL out to a new assembly, which you can then load into your AppDomain if you like playing with the bleeding edge.
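For instance, dumping the instructions of every method in an existing assembly can be sketched roughly like this with the Cecil 0.9-style reading API (the file name is a placeholder; the older AssemblyFactory API used in the snippet further down differs):
var module = Mono.Cecil.ModuleDefinition.ReadModule("SomeAssembly.dll");
foreach (var type in module.Types)
    foreach (var method in type.Methods)
        if (method.HasBody)
            foreach (var instruction in method.Body.Instructions)
                Console.WriteLine(instruction); // e.g. "IL_0000: ldarg.0"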
Update: I came round to trying this. Unfortunately I think I found the problem you run into: there seems to be no way to get at the IL bytes for a generated type unless the assembly happened to be written out somewhere you can load it from.
I assumed you could just get the bits via reflection (since the classes expose the required methods), but those methods simply raise an exception, "The invoked member is not supported in a dynamic module.", when invoked. You can try this with the code below, but in short I suppose it means it isn't going to happen unless you want to mess with Marshal::GetFunctionPointerForDelegate(): you'd have to binary-dump the instructions and manually disassemble them as IL opcodes. There be dragons.
Code snippet:
using System;
using System.Linq;
using Mono.Cecil;
using Mono.Cecil.Cil;
using System.Reflection.Emit;
using System.Reflection;

namespace REFLECT
{
    class Program
    {
        private static Type EmitType()
        {
            var dyn = AppDomain.CurrentDomain.DefineDynamicAssembly(new AssemblyName("Emitted"), AssemblyBuilderAccess.RunAndSave);
            var mod = dyn.DefineDynamicModule("Emitted", "Emitted.dll");
            var typ = mod.DefineType("EmittedNS.EmittedType", System.Reflection.TypeAttributes.Public);
            var mth = typ.DefineMethod("SuperSecretEncryption", System.Reflection.MethodAttributes.Public | System.Reflection.MethodAttributes.Static, typeof(String), new [] {typeof(String)});
            var il = mth.GetILGenerator();
            il.EmitWriteLine("Emit was here");
            il.Emit(System.Reflection.Emit.OpCodes.Ldarg_0);
            il.Emit(System.Reflection.Emit.OpCodes.Ret);
            var result = typ.CreateType();
            dyn.Save("Emitted.dll");
            return result;
        }

        private static Type TestEmit()
        {
            var result = EmitType();
            var instance = Activator.CreateInstance(result);
            var encrypted = instance.GetType().GetMethod("SuperSecretEncryption").Invoke(null, new [] { "Hello world" });
            Console.WriteLine(encrypted); // This works happily, printing "Emit was here" first
            return result;
        }

        public static void Main (string[] args)
        {
            Type emitted = TestEmit();

            // CRASH HERE: even if the assembly was actually RunAndSave _and_ it
            // has actually been saved, there seems to be no way to get at the image
            // directly:
            var ass = AssemblyFactory.GetAssembly(emitted.Assembly.GetFiles(false)[0]);

            // the rest was intended as a mockup of how to isolate the interesting bits,
            // but I didn't get much chance to test that :)
            var types = ass.Modules.Cast<ModuleDefinition>().SelectMany(m => m.Types.Cast<TypeDefinition>()).ToList();
            var typ = types.FirstOrDefault(t => t.Name == emitted.Name);

            var operands = typ.Methods.Cast<MethodDefinition>()
                .SelectMany(m => m.Body.Instructions.Cast<Instruction>())
                .Select(i => i.Operand);
            var requiredTypes = operands.OfType<TypeReference>()
                .Concat(operands.OfType<MethodReference>().Select(mr => mr.DeclaringType))
                .Select(tr => tr.Resolve()).OfType<TypeDefinition>()
                .Distinct();
            var requiredAssemblies = requiredTypes
                .Select(tr => tr.Module).OfType<ModuleDefinition>()
                .Select(md => md.Assembly.Name as AssemblyNameReference);

            foreach (var t in types.Except(requiredTypes))
                ass.MainModule.Types.Remove(t);
            foreach (var unused in ass.MainModule
                     .AssemblyReferences.Cast<AssemblyNameReference>().ToList()
                     .Except(requiredAssemblies))
                ass.MainModule.AssemblyReferences.Remove(unused);

            AssemblyFactory.SaveAssembly(ass, "/tmp/TestCecil.dll");
        }
    }
}
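For contrast, when a type lives in a normal, file-backed assembly rather than a dynamic module, the raw IL bytes are reachable through plain reflection. A minimal sketch (the method chosen here is arbitrary):
// Works for methods in ordinary, disk-backed assemblies; methods in dynamic modules throw instead.
MethodInfo method = typeof(string).GetMethod("IsNullOrEmpty");
byte[] ilBytes = method.GetMethodBody().GetILAsByteArray();
Console.WriteLine("{0} IL bytes", ilBytes.Length);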

If all you want is the IL for your User class, you already have it. It's in the dll that you compiled it to.
From your other assembly, you can load the dll with the User class dynamically and use it through reflection.
UPDATE:
If what you have is a dynamic class created with Reflection.Emit, you have an AssemblyBuilder that you can use to save it to disk.
If your dynamic type was instead created with Mono.Cecil, you have an AssemblyDefinition that you can save to disk with myAssemblyDefinition.Write("MyAssembly.dll") (in Mono.Cecil 0.9).
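For example (the names are placeholders for whichever builder or definition you already hold):
// Reflection.Emit path: the AssemblyBuilder must have been created with AssemblyBuilderAccess.RunAndSave (or Save)
myAssemblyBuilder.Save("MyAssembly.dll");
// Mono.Cecil 0.9 path
myAssemblyDefinition.Write("MyAssembly.dll");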

Related

Google diff-match-patch : How to unpatch to get Original String?

I am using the Google diff-match-patch Java library to create a patch between two JSON strings and store the patch in a database.
diff_match_patch dmp = new diff_match_patch();
LinkedList<Patch> diffs = dmp.patch_make(latestString, originalString);
String patch = dmp.patch_toText(diffs); // Store patch to DB
Now is there any way to use this patch to re-create the originalString by passing the latestString?
I googled this and found a very old comment on the Google diff-match-patch wiki saying,
Unpatching can be done by just looping through the diff, swapping
DIFF_INSERT with DIFF_DELETE, then applying the patch.
But I did not find any useful code that demonstrates this. How could I achieve this with my existing code? Any pointers or code references would be appreciated.
Edit:
The problem I am facing is that, in the front end, I am showing a revisions module that lists all the transactions of a particular fragment (take, for example, an employee's details), like which user updated which details, etc. I am recreating the fragment JSON by reverse-applying each patch to get the data for each transaction and showing it as a table (using http://marianoguerra.github.io/json.human.js/). But some of the JSON is not valid JSON and I am getting a JSON.parse error.
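For reference: since the patch in the question was built as patch_make(latestString, originalString), applying it to latestString with patch_apply should already yield originalString. The swap the wiki comment describes is only needed to invert a patch that was built the other way round. A rough, untested sketch against the standard C# port (the Patch/Diff field names below are assumed from that port):
// Turn a patch that transforms older -> newer into one that transforms newer -> older.
List<Patch> patches = dmp.patch_fromText(patchText);
foreach (Patch patch in patches)
{
    foreach (Diff diff in patch.diffs)
    {
        if (diff.operation == Operation.INSERT) diff.operation = Operation.DELETE;
        else if (diff.operation == Operation.DELETE) diff.operation = Operation.INSERT;
    }
    int start = patch.start1; patch.start1 = patch.start2; patch.start2 = start;
    int length = patch.length1; patch.length1 = patch.length2; patch.length2 = length;
}
string older = (string)dmp.patch_apply(patches, newerText)[0];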
I was looking to do something similar (in C#) and what is working for me with a relatively simple object is the patch_apply method. This use case seems somewhat missing from the documentation, so I'm answering here. Code is C# but the API is cross language:
static void Main(string[] args)
{
    var dmp = new diff_match_patch();
    string v1 = "My Json Object";
    string v2 = "My Mutated Json Object";
    var v2ToV1Patch = dmp.patch_make(v2, v1);
    var v2ToV1PatchText = dmp.patch_toText(v2ToV1Patch); // Persist text to db
    string v3 = "Latest version of JSON object";
    var v3ToV2Patch = dmp.patch_make(v3, v2);
    var v3ToV2PatchTxt = dmp.patch_toText(v3ToV2Patch); // Persist text to db

    // Time to re-hydrate the objects
    var altV3ToV2Patch = dmp.patch_fromText(v3ToV2PatchTxt);
    var altV2 = dmp.patch_apply(altV3ToV2Patch, v3)[0].ToString(); // .get(0) in Java I think
    var altV2ToV1Patch = dmp.patch_fromText(v2ToV1PatchText);
    var altV1 = dmp.patch_apply(altV2ToV1Patch, altV2)[0].ToString();
}
I am attempting to retrofit this as an audit log, where previously the entire JSON object was saved. As the audited objects have become more complex the storage requirements have increased dramatically. I haven't yet applied this to the complex large objects, but it is possible to check if the patch was successful by checking the second object in the array returned by the patch_apply method. This is an array of boolean values, all of which should be true if the patch worked correctly. You could write some code to check this, which would help check if the object can be successfully re-hydrated from the JSON rather than just getting a parsing error. My prototype C# method looks like this:
private static bool ValidatePatch(object[] patchResult, out string patchedString)
{
    patchedString = patchResult[0] as string;
    var successArray = patchResult[1] as bool[];
    foreach (var b in successArray)
    {
        if (!b)
            return false;
    }
    return true;
}
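A hypothetical call site for that helper (storedPatchText and latestString are placeholder variables):
// Re-hydrate one revision and confirm every hunk applied cleanly before trusting it.
var patches = dmp.patch_fromText(storedPatchText);
object[] patchResult = dmp.patch_apply(patches, latestString);
string previousVersion;
if (ValidatePatch(patchResult, out previousVersion))
{
    // previousVersion holds the reconstructed earlier JSON
}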

List of DisposableLazy`2 does not have 'Add' method when called using dynamic variable

Problem
I am facing a problem using a dynamically created list of items when the Add method is called on a dynamic variable. Consider the following code.
IEnumerable<dynamic> plugins = (IEnumerable<dynamic>)field.GetValue(instance);
if (plugins == null)
    continue;

dynamic filteredPlugins = null;
foreach (var plugin in plugins)
{
    if (filteredPlugins == null)
        filteredPlugins = Activator
            .CreateInstance(typeof(List<>)
                .MakeGenericType(plugin.GetType()));

    if (/* this condition does not matter*/)
        //filteredPlugins.Add(plugin);
        filteredPlugins.GetType().GetMethod("Add")
            .Invoke(filteredPlugins, new object[] { plugin });
}
And now, the commented-out line filteredPlugins.Add(plugin) will throw a System.Reflection.TargetInvocationException with the message 'object' does not contain a definition for 'Add' when plugin is of type
System.ComponentModel.Composition.ExportServices.DisposableLazy<IPlugin,IMetadata>
but it works perfectly when plugin is of type
System.Lazy<IPlugin, IMetadata>
When reflection is used to call the Add method on the filteredPlugins instance, as is done on the next line, everything works fine for any type.
My question is: WHY is the Add method not found in the case of the DisposableLazy type?
Background
This code is part of a method that I call from the OnImportsSatisfied() method. I am using two kinds of import, which differ only in RequiredCreationPolicy: one has CreationPolicy.NonShared and the other the default value, CreationPolicy.Any.
[ImportMany(RequiredCreationPolicy = CreationPolicy.NonShared)]
private IEnumerable<Lazy<IPlugin, IMetadata>> plugins = null;
For CreationPolicy.NonShared fields the underlying type in plugins is DisposableLazy, and for CreationPolicy.Any the underlying type in plugins is Lazy.
Edit: As asked in the answer: I am using a dynamic variable because the IPlugin interface can be different every time this method is called, and the interfaces do not have to have anything in common.
Edit2: I just found the similar question C# dynamic type gotcha, so this can probably be closed as a duplicate.
Because System.ComponentModel.Composition.ExportServices.DisposableLazy is a private class, the runtime binder refuses to believe you have permission to use the type, whereas reflection doesn't care.
Which raises the question of why you are using dynamic at all in this case. Since the public surface of DisposableLazy<IPlugin,IMetadata> is just its base class Lazy<IPlugin, IMetadata> plus IDisposable, shouldn't you just be using a List<Lazy<IPlugin, IMetadata>> in either case?
var plugins = (IEnumerable<Lazy<IPlugin, IMetadata>>)field.GetValue(instance);
if (plugins == null)
    continue;

var filteredPlugins = new List<Lazy<IPlugin, IMetadata>>();
foreach (var plugin in plugins)
{
    if (/* this condition does not matter*/)
        filteredPlugins.Add(plugin);
}
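If the element type genuinely is only known at runtime, a hedged alternative to dynamic is to go through the non-generic IList interface that List<T> already implements, which keeps the runtime binder (and its accessibility check on the private DisposableLazy<,> type) out of the picture entirely. A sketch mirroring the original loop:
System.Collections.IList filteredPlugins = null;
foreach (var plugin in plugins)
{
    if (filteredPlugins == null)
        filteredPlugins = (System.Collections.IList)Activator.CreateInstance(
            typeof(List<>).MakeGenericType(plugin.GetType()));

    filteredPlugins.Add(plugin); // IList.Add(object), bound at compile time, no binder involved
}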

Creating WCF service by determining type at runtime

I am trying to create a WCF service whose type/interface is not known until runtime. To do this, I use ChannelFactory. ChannelFactory is a generic class, so I need to use Type.MakeGenericType. The type I pass to MakeGenericType comes from a list of interfaces I previously gathered with reflection by searching some assemblies.
Ultimately, I call MethodInfo.Invoke to create the object. The object is created just fine, but I cannot cast it to the proper interface. Upon casting, I receive the following error:
"Unable to cast transparent proxy to type 'Tssc.Services.MyType.IMyType'"
After some experimenting, I have found that the interface/type passed to MakeGenericType seems to be the problem. If I substitute the interface in my list with the actual interface, everything works fine. I have combed through the two objects and cannot see a difference. When I modify the code to produce both types, comparing them with Equals returns false. It is unclear to me whether Equals is just checking that they refer to the same object (they do not) or whether it compares all properties, etc.
Could this have something to do with how I gathered my interfaces (reflection, saving them in a list...)? A comparison of the objects seems to indicate they are equivalent. I printed all properties of both objects and they are the same. Do I need to dig deeper? If so, where?
// createService() method

//*** tried both of these interfaces, only 2nd works - but they seem to be identical
//Type t = interfaces[i]; // get type from list created above - doesn't work
Type t = typeof(Tssc.Services.MyType.IMyType); // actual type - works OK

// create ChannelFactory type with my type parameter (t)
Type factoryType = typeof(ChannelFactory<>);
factoryType = factoryType.MakeGenericType(new Type[] { t });

// create ChannelFactory<> object with two-param ctor
BasicHttpBinding binding = new BasicHttpBinding();
string address = "blah blah blah";
var factory = Activator.CreateInstance(factoryType, new object[] { binding, address });

// get overload of ChannelFactory<>.CreateChannel with no parameters
MethodInfo method = factoryType.GetMethod("CreateChannel", new Type[] { });
return method.Invoke(factory, null);

//--------------- code that calls code above and uses its return
object ob = createService();

//*** this cast fails
Tssc.Services.MyType.IMyType service = (Tssc.Services.MyType.IMyType)ob;
OK, I understand what's happening here: the problem is that the same assembly is effectively being loaded twice, once via a reference and once via the assembly load call. What you need to do is change the place where you load your assembly, and check whether it already exists in the current AppDomain, maybe like this:
Assembly assembly = AppDomain.CurrentDomain.GetAssemblies().FirstOrDefault(a => a.GetName().Name.Equals("ClassLibrary1Name"));
if (assembly == null)
{
assembly = System.Reflection.Assembly.LoadFile("path to your assembly");
}
//do your work here
This way if the assembly is already loaded into memory, it'll use that one.
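A quick way to confirm that double loading is the culprit is to compare the identities of the two Type objects (sketched here assuming t is the interface taken from the separately loaded assembly):
// Two Type objects are equal only if they come from the same loaded assembly instance.
Console.WriteLine(t.AssemblyQualifiedName);
Console.WriteLine(typeof(Tssc.Services.MyType.IMyType).AssemblyQualifiedName);
Console.WriteLine(t.Assembly.Location);
Console.WriteLine(typeof(Tssc.Services.MyType.IMyType).Assembly.Location);
Console.WriteLine(ReferenceEquals(t.Assembly, typeof(Tssc.Services.MyType.IMyType).Assembly)); // false when loaded twice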

RavenDB IsOperationAllowedOnDocument not supported in Embedded Mode

RavenDB throws InvalidOperationException when IsOperationAllowedOnDocument is called using embedded mode.
I can see in the IsOperationAllowedOnDocument implementation a clause checking for calls in embedded mode.
namespace Raven.Client.Authorization
{
    public static class AuthorizationClientExtensions
    {
        public static OperationAllowedResult[] IsOperationAllowedOnDocument(this ISyncAdvancedSessionOperation session, string userId, string operation, params string[] documentIds)
        {
            var serverClient = session.DatabaseCommands as ServerClient;
            if (serverClient == null)
                throw new InvalidOperationException("Cannot get whatever operation is allowed on document in embedded mode.");
Is there a workaround for this other than not using embedded mode?
Thanks for your time.
I encountered the same situation while writing some unit tests. The solution James provided worked; however, it resulted in having one code path for the unit test and another path for the production code, which defeated the purpose of the unit test. We were able to create a second document store and connect it to the first document store which allowed us to then access the authorization extension methods successfully. While this solution would probably not be good for production code (because creating Document Stores is expensive) it works nicely for unit tests. Here is a code sample:
using (var documentStore = new EmbeddableDocumentStore
    {
        RunInMemory = true,
        UseEmbeddedHttpServer = true,
        Configuration = { Port = EmbeddedModePort }
    })
{
    documentStore.Initialize();
    var url = documentStore.Configuration.ServerUrl;
    using (var docStoreHttp = new DocumentStore { Url = url })
    {
        docStoreHttp.Initialize();
        using (var session = docStoreHttp.OpenSession())
        {
            // now you can run code like:
            // session.GetAuthorizationFor(),
            // session.SetAuthorizationFor(),
            // session.Advanced.IsOperationAllowedOnDocument(),
            // etc...
        }
    }
}
There are a couple of other items that should be mentioned:
The first document store needs to be run with UseEmbeddedHttpServer set to true so that the second one can access it.
I created a constant for the port so it would be used consistently and to ensure use of a non-reserved port.
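The constant itself is nothing special; something along these lines (the value is arbitrary, any free port works):
// Hypothetical constant referenced above.
private const int EmbeddedModePort = 8079;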
I encountered this as well. Looking at the source, there's no way to do that operation as written. I'm not sure if there's some intrinsic reason why, since I could easily replicate the functionality in my app by making an HTTP request directly for the same info:
HttpClient http = new HttpClient();
http.BaseAddress = new Uri("http://localhost:8080");

var url = new StringBuilder("/authorization/IsAllowed/")
    .Append(Uri.EscapeUriString(userid))
    .Append("?operation=")
    .Append(Uri.EscapeUriString(operation))
    .Append("&id=").Append(Uri.EscapeUriString(entityid));

http.GetStringAsync(url.ToString()).ContinueWith((response) =>
{
    var results = _session.Advanced.DocumentStore.Conventions.CreateSerializer()
        .Deserialize<OperationAllowedResult[]>(
            new RavenJTokenReader(RavenJToken.Parse(response.Result)));
}).Wait();

Very slow performance deserializing using datacontractserializer in a Silverlight Application

Here is the situation:
A Silverlight 3 application hits an ASP.NET-hosted WCF service to get a list of items to display in a grid. Once the list is brought down to the client it is cached in IsolatedStorage. This is done by using the DataContractSerializer to serialize all of these objects to a stream, which is then zipped and then encrypted. When the application is relaunched, it first loads from the cache (reversing the process above) and then deserializes the objects using the DataContractSerializer.ReadObject() method. All of this was working wonderfully under all scenarios until recently, with the entire "load from cache" path (decrypt/unzip/deserialize) taking hundreds of milliseconds at most.
On some development machines, but not all (all machines Windows 7), the deserialize process - that is, the call to ReadObject(stream) - takes several minutes and seems to lock up the entire machine, BUT ONLY WHEN RUNNING IN THE DEBUGGER in VS2008. Running the Debug configuration code outside the debugger has no problem.
One thing that looks suspicious is that when you turn on breaking on exceptions, you can see that ReadObject() throws many, many System.FormatExceptions indicating that a number was not in the correct format. When I turn off "Just My Code", thousands of these get dumped to the screen. None go unhandled. These occur both on the read back from the cache AND on deserialization at the conclusion of a web service call to get the data from the WCF service. HOWEVER, these same exceptions occur on my laptop development machine, which does not experience the slowness at all. And FWIW, my laptop is really old and my desktop is a 4-core, 6GB RAM beast.
Again, no problems unless running under the debugger in VS2008. Anyone else seen this? Any thoughts?
Here is the bug report link: https://connect.microsoft.com/VisualStudio/feedback/details/539609/very-slow-performance-deserializing-using-datacontractserializer-in-a-silverlight-application-only-in-debugger
EDIT: I now know where the FormatExceptions are coming from. It seems they are "by design": they occur when I have doubles being serialized that are double.NaN, so the xml contains NaN... It seems that the DCS tries to parse the value as a number, that fails with an exception, and it then looks for "NaN" et al. and handles them. My problem is not that this does not work... it does... it is just that it completely cripples the debugger. Does anyone know how to configure the debugger/VS2008 SP1 to handle this more efficiently?
cartden,
You may want to consider switching over to XMLSerializer instead. Here is what I have determined over time:
The XMLSerializer and DataContractSerializer classes provide a simple means of serializing and deserializing object graphs to and from XML.
The key differences are:
1.
XMLSerializer has much smaller payload than DCS if you use [XmlAttribute] instead of [XmlElement]
DCS always stores values as elements
2.
DCS is "opt-in" rather than "opt-out"
With DCS you explicitly mark what you want to serialize with [DataMember]
With DCS you can serialize any field or property, even if they are marked protected or private
With DCS you can use [IgnoreDataMember] to have the serializer ignore certain properties
With XMLSerializer public properties are serialized, and need setters to be deserialized
With XmlSerializer you can use [XmlIgnore] to have the serializer ignore public properties
3.
BE AWARE! DCS.ReadObject DOES NOT call constructors during deserialization
If you need to perform initialization, DCS supports the following callback hooks:
[OnDeserializing], [OnDeserialized], [OnSerializing], [OnSerialized]
(also useful for handling versioning issues)
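As an illustration of those hooks, a small sketch (the field name mirrors the ProfilePerson example below):
[OnDeserializing]
private void OnDeserializing(StreamingContext context)
{
    // DCS.ReadObject never ran the constructor, so restore any defaults it would have set.
    m_Location = new PersonLocation();
}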
If you want the ability to switch between the two serializers, you can use both sets of attributes simultaneously, as in:
[DataContract]
[XmlRoot]
public class ProfilePerson : NotifyPropertyChanges
{
    [XmlAttribute]
    [DataMember]
    public string FirstName { get { return m_FirstName; } set { SetProperty(ref m_FirstName, value); } }
    private string m_FirstName;

    [XmlElement]
    [DataMember]
    public PersonLocation Location { get { return m_Location; } set { SetProperty(ref m_Location, value); } }
    private PersonLocation m_Location = new PersonLocation(); // Should change over time

    [XmlIgnore]
    [IgnoreDataMember]
    public Profile ParentProfile { get { return m_ParentProfile; } set { SetProperty(ref m_ParentProfile, value); } }
    private Profile m_ParentProfile = null;

    public ProfilePerson()
    {
    }
}
Also, check out my Serializer class that can switch between the two:
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;
using System.Xml.Serialization;

namespace ClassLibrary
{
    // Instantiate this class to serialize objects using either XmlSerializer or DataContractSerializer
    internal class Serializer
    {
        private readonly bool m_bDCS;

        internal Serializer(bool bDCS)
        {
            m_bDCS = bDCS;
        }

        internal TT Deserialize<TT>(string input)
        {
            MemoryStream stream = new MemoryStream(input.ToByteArray());
            if (m_bDCS)
            {
                DataContractSerializer dc = new DataContractSerializer(typeof(TT));
                return (TT)dc.ReadObject(stream);
            }
            else
            {
                XmlSerializer xs = new XmlSerializer(typeof(TT));
                return (TT)xs.Deserialize(stream);
            }
        }

        internal string Serialize<TT>(object obj)
        {
            MemoryStream stream = new MemoryStream();
            if (m_bDCS)
            {
                DataContractSerializer dc = new DataContractSerializer(typeof(TT));
                dc.WriteObject(stream, obj);
            }
            else
            {
                XmlSerializer xs = new XmlSerializer(typeof(TT));
                xs.Serialize(stream, obj);
            }

            // be aware that the Unicode Byte-Order Mark will be at the front of the string
            return stream.ToArray().ToUtfString();
        }

        internal string SerializeToString<TT>(object obj)
        {
            StringBuilder builder = new StringBuilder();
            XmlWriter xmlWriter = XmlWriter.Create(builder);
            if (m_bDCS)
            {
                DataContractSerializer dc = new DataContractSerializer(typeof(TT));
                dc.WriteObject(xmlWriter, obj);
            }
            else
            {
                XmlSerializer xs = new XmlSerializer(typeof(TT));
                xs.Serialize(xmlWriter, obj);
            }

            string xml = builder.ToString();
            xml = RegexHelper.ReplacePattern(xml, RegexHelper.WildcardToPattern("<?xml*>", WildcardSearch.Anywhere), string.Empty);
            xml = RegexHelper.ReplacePattern(xml, RegexHelper.WildcardToPattern(" xmlns:*\"*\"", WildcardSearch.Anywhere), string.Empty);
            xml = xml.Replace(Environment.NewLine + " ", string.Empty);
            xml = xml.Replace(Environment.NewLine, string.Empty);
            return xml;
        }
    }
}
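A hypothetical usage of that wrapper (somePerson is a placeholder instance):
// true = DataContractSerializer, false = XmlSerializer
var serializer = new Serializer(true);
string xml = serializer.SerializeToString<ProfilePerson>(somePerson);
ProfilePerson copy = serializer.Deserialize<ProfilePerson>(serializer.Serialize<ProfilePerson>(somePerson));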
This is a guess, but I think it is running slow in debug mode because for every exception, it is performing some actions to show the exception in the debug window, etc. If you are running in release mode, these extra steps are not taken.
I've never done this, so I really don't know if it would work, but have you tried just setting that one assembly to run in release mode while all others are set to debug? If I'm right, it may solve your problem. If I'm wrong, then you only waste a minute or two.
About your debugging problem, have you tried disabling the exception assistant? (Tools > Options > Debugging > Enable the exception assistant.)
Another thing to look at is the exception handling in Debug > Exceptions: you can disable the user-unhandled stuff for the CLR, or uncheck only the System.FormatException exception.
OK, I figured out the root issue. It was what I alluded to in the EDIT to the main question. The problem was that the xml was correctly serializing doubles that had a value of double.NaN. I was using these values to indicate "n/a" for when the denominator was 0D. Example: ROE (Return on Equity = Net Income / Average Equity) when Average Equity is 0D would be serialized as:
<ROE>NaN</ROE>
When the DCS tries to deserialize it, evidently it first tries to read the number, catches the exception when that fails, and then handles the NaN. The problem is that this seems to generate a lot of overhead when in DEBUG mode.
Solution: I changed the property to double? and set it to null instead of NaN. Everything now happens instantly in DEBUG mode. Thanks to all for your help.
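The change itself is tiny; roughly (using the ROE example from above):
[DataMember]
public double? ROE { get; set; } // set to null instead of double.NaN when Average Equity is 0D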
Try disabling some IE addons. In my case, the LastPass toolbar killed my Silverlight debugging. My computer would freeze for minutes each time I interacted with Visual Studio after a breakpoint.