Lucene 6 Payloads

I am trying to work with payloads in Lucene 6 but I am having trouble. The idea is to index payloads and use them in a CustomScoreQuery to check whether the payload of a query term matches the payload of the document term.
Here is my payload filter:
@Override
public final boolean incrementToken() throws IOException {
    if (!this.input.incrementToken()) {
        return false;
    }
    // get the current token
    final char[] token = Arrays.copyOfRange(this.termAtt.buffer(), 0, this.termAtt.length());
    String stoken = String.valueOf(token);
    String[] parts = stoken.split(Constants.PAYLOAD_DELIMITER);
    if (parts.length == 2) {
        // keep only the term itself in the term attribute
        termAtt.setLength(parts[0].length());
        // the rest is the payload
        BytesRef br = new BytesRef(parts[1]);
        System.out.println(br);
        payloadAtt.setPayload(br);
    } else if (parts.length > 2) {
        // skip: more than one delimiter is not a valid term|payload pair
    } else {
        // no payload here
        payloadAtt.setPayload(null);
    }
    return true;
}
It seems to be adding the payload; however, when I try to access the payload in my CustomScoreQuery it just keeps returning null.
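To check the analysis side, a quick test like the following (a minimal sketch; the analyzer instance, field name, and '|' delimiter are placeholders) prints the PayloadAttribute of each token:

// Hedged sketch: verify the filter attaches payloads at analysis time.
// "field", the sample text, and the '|' delimiter are placeholders.
try (TokenStream ts = analyzer.tokenStream("field", "apple|FRUIT banana|FRUIT")) {
    CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
    PayloadAttribute payload = ts.addAttribute(PayloadAttribute.class);
    ts.reset();
    while (ts.incrementToken()) {
        System.out.println(term + " -> " + payload.getPayload());
    }
    ts.end();
}

On the query side, here is how I try to read the payload back: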
public float determineBoost(int doc) throws IOException {
    float boost = 1f;
    LeafReader reader = this.context.reader();
    System.out.println("Has payloads: " + reader.getFieldInfos().hasPayloads());
    // loop through each position of the term and boost if the position matches the payload
    if (reader != null) {
        PostingsEnum posting = reader.postings(new Term(this.field, term.getTerm()), PostingsEnum.POSITIONS);
        System.out.println("Term: " + term.getTerm());
        if (posting != null) {
            // move to the document currently being looked at
            posting.advance(doc);
            int count = 0;
            while (count < posting.freq()) {
                BytesRef load = posting.getPayload();
                System.out.println(posting);
                System.out.println(posting.getClass());
                System.out.println(posting.attributes());
                System.out.println("Load: " + load);
                // if the location matches the term location, then boost by the boost factor
                try {
                    if (load != null && term.containLocation(new Payload(load))) {
                        boost = boost * this.boost;
                    }
                } catch (PayloadException e) {
                    // the payload is unrecognized, which does not change the boost factor
                }
                posting.nextPosition();
                count += 1;
            }
        }
    }
    return boost;
}
For my two tests it keeps stating the load is null. Any suggestions or help?
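One thing worth checking (a hedged sketch based on the PostingsEnum javadoc, not a verified fix): postings(...) only exposes payloads when they are requested via the PAYLOADS flag, and nextPosition() must be called before getPayload() is read for that position:

// Hedged sketch: request payloads explicitly and advance the position
// before reading the payload; this.field and term are as in the code above.
PostingsEnum posting = reader.postings(new Term(this.field, term.getTerm()), PostingsEnum.PAYLOADS);
if (posting != null && posting.advance(doc) == doc) {
    for (int i = 0; i < posting.freq(); i++) {
        posting.nextPosition();               // must be called first
        BytesRef load = posting.getPayload(); // now valid for this position
        System.out.println("Load: " + load);
    }
}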

Related

Using Lucene's highlighting, getting too much highlighted, is there a workaround for this?

I am using the highlighting feature of Lucene to isolate matching terms for my query, but some of the matched terms are excessive.
I have some simple test cases which are delivered in an Ant project (download details below).
Materials
You can download the test case here: mydemo_with_libs.zip
That archive includes the Lucene 8.6.3 libraries which my test uses; if you prefer a copy without the JAR files you can download that from here: mydemo_without_libs.zip
The necessary libraries are: core, analyzers, queries, queryparser, highlighter, and memory.
You can run the test case by unzipping the archive into an empty directory and running the Ant command ant synsearch
Input
I have provided a short synonym list which is used for indexing and analysing in the highlighting methods:
cope,manage
jobs,tasks
simultaneously,at once
and there is one document being indexed:
Queues are a useful way of grouping jobs together in order to manage a number of them at once. You can:
hold or release multiple jobs at the same time;
group multiple tasks (for the same event);
control the priority of jobs in the queue;
Eventually log all events that take place in a queue.
Use either job.queue or task.queue in specifications.
Process
When building the index I am storing the text field, and using a custom analyzer. This is because (in the real world) the content I am indexing is technical documentation, so stripping out punctuation is inappropriate because so much of it may be significant in technical expressions. My analyzer uses a TechTokenFilter which breaks the stream up into tokens consisting of strings of words or digits, or individual characters which don't match the previous pattern.
Here's the relevant code for the analyzer:
public class MyAnalyzer extends Analyzer {
    private String synlist = "";
    private boolean useSynonyms;

    public MyAnalyzer(String synlist) {
        if (!synlist.isEmpty()) {
            this.synlist = synlist;
            this.useSynonyms = true;
        }
    }

    public MyAnalyzer() {
        this.useSynonyms = false;
    }

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        WhitespaceTokenizer src = new WhitespaceTokenizer();
        TokenStream result = new TechTokenFilter(new LowerCaseFilter(src));
        if (useSynonyms) {
            result = new SynonymGraphFilter(result, getSynonyms(synlist), Boolean.TRUE);
            result = new FlattenGraphFilter(result);
        }
        return new TokenStreamComponents(src, result);
    }
}
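The getSynonyms helper is not shown here; a minimal sketch of how such a method might build a SynonymMap from the comma-separated list above (a hypothetical implementation, with file handling and error handling simplified):

// Hedged sketch of a getSynonyms(String synlist) helper; the real method is
// not shown in the post. Each line is "word,synonym", e.g. "jobs,tasks".
private SynonymMap getSynonyms(String synlist) {
    try {
        SynonymMap.Builder builder = new SynonymMap.Builder(true);
        for (String line : Files.readAllLines(Paths.get(synlist))) {
            String[] pair = line.split(",");
            // multi-word synonyms such as "at once" must be joined with the
            // SynonymMap word separator; map both directions, keeping the original
            CharsRef a = SynonymMap.Builder.join(pair[0].split(" "), new CharsRefBuilder());
            CharsRef b = SynonymMap.Builder.join(pair[1].split(" "), new CharsRefBuilder());
            builder.add(a, b, true);
            builder.add(b, a, true);
        }
        return builder.build();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}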
and here's my filter:
public class TechTokenFilter extends TokenFilter {
    private final CharTermAttribute termAttr;
    private final PositionIncrementAttribute posIncAttr;
    private final ArrayList<String> termStack;
    private AttributeSource.State current;
    private final TypeAttribute typeAttr;

    public TechTokenFilter(TokenStream tokenStream) {
        super(tokenStream);
        termStack = new ArrayList<>();
        termAttr = addAttribute(CharTermAttribute.class);
        posIncAttr = addAttribute(PositionIncrementAttribute.class);
        typeAttr = addAttribute(TypeAttribute.class);
    }

    @Override
    public boolean incrementToken() throws IOException {
        if (this.termStack.isEmpty() && input.incrementToken()) {
            final String currentTerm = termAttr.toString();
            final int bufferLen = termAttr.length();
            if (bufferLen > 0) {
                if (termStack.isEmpty()) {
                    termStack.addAll(Arrays.asList(techTokens(currentTerm)));
                    current = captureState();
                }
            }
        }
        if (!this.termStack.isEmpty()) {
            String part = termStack.remove(0);
            restoreState(current);
            termAttr.setEmpty().append(part);
            posIncAttr.setPositionIncrement(1);
            return true;
        } else {
            return false;
        }
    }

    public static String[] techTokens(String t) {
        List<String> tokenlist = new ArrayList<String>();
        String[] tokens;
        StringBuilder next = new StringBuilder();
        String token;
        char minus = '-';
        char underscore = '_';
        char c, prec, subc;
        for (int i = 0; i < t.length(); i++) {
            prec = i > 0 ? t.charAt(i - 1) : 0;
            c = t.charAt(i);
            subc = i < (t.length() - 1) ? t.charAt(i + 1) : 0;
            if (Character.isLetterOrDigit(c) || c == underscore) {
                next.append(c);
            } else if (c == minus && Character.isLetterOrDigit(prec) && Character.isLetterOrDigit(subc)) {
                next.append(c);
            } else {
                if (next.length() > 0) {
                    token = next.toString();
                    tokenlist.add(token);
                    next.setLength(0);
                }
                if (Character.isWhitespace(c)) {
                    // shouldn't be possible because the input stream has been tokenized on whitespace
                } else {
                    tokenlist.add(String.valueOf(c));
                }
            }
        }
        if (next.length() > 0) {
            token = next.toString();
            tokenlist.add(token);
        }
        tokens = tokenlist.toArray(new String[0]);
        return tokens;
    }
}
Examining the index, I can see that it contains the separate terms I expect, including the synonym values. For example, the text at the end of the first line has produced the terms:
of
them
at, simultaneously
once
.
You
can
:
and the text at the end of the third line has produced the terms:
same
event
)
;
When the application performs a search it analyzes the query without using the synonym list (because the synonyms are already in the index), but I have discovered that I need to include the synonym list when analyzing the stored text to identify the matching fragments.
Searches match the correct documents, but the code I have added to identify the matching terms over-performs. I won't show the whole search method here, but will focus on the code which lists matched terms:
public static void doSearch(IndexReader reader, IndexSearcher searcher,
        Query query, int max, String synList) throws IOException {
    SimpleHTMLFormatter htmlFormatter = new SimpleHTMLFormatter("\001", "\002");
    Highlighter highlighter = new Highlighter(htmlFormatter, new QueryScorer(query));
    Analyzer analyzer;
    if (synList != null) {
        analyzer = new MyAnalyzer(synList);
    } else {
        analyzer = new MyAnalyzer();
    }
    // Collect all the docs
    TopDocs results = searcher.search(query, max);
    ScoreDoc[] hits = results.scoreDocs;
    int numTotalHits = Math.toIntExact(results.totalHits.value);
    System.out.println("\nQuery: " + query.toString());
    System.out.println("Matches: " + numTotalHits);
    // Collect matching terms
    HashSet<String> matchedWords = new HashSet<String>();
    int start = 0;
    int end = Math.min(numTotalHits, max);
    for (int i = start; i < end; i++) {
        int id = hits[i].doc;
        float score = hits[i].score;
        Document doc = searcher.doc(id);
        String docpath = doc.get("path");
        String doctext = doc.get("text");
        try {
            TokenStream tokens = TokenSources.getTokenStream("text", null, doctext, analyzer, -1);
            TextFragment[] frag = highlighter.getBestTextFragments(tokens, doctext, false, 100);
            for (int j = 0; j < frag.length; j++) {
                if ((frag[j] != null) && (frag[j].getScore() > 0)) {
                    String match = frag[j].toString();
                    addMatchedWord(matchedWords, match);
                }
            }
        } catch (InvalidTokenOffsetsException e) {
            System.err.println(e.getMessage());
        }
        System.out.println("matched file: " + docpath);
    }
    if (matchedWords.size() > 0) {
        System.out.println("matched terms:");
        for (String word : matchedWords) {
            System.out.println(word);
        }
    }
}
Problem
While the correct documents are selected by these queries, and the fragments chosen for highlighting do contain the query terms, the highlighted pieces in some of the selected fragments extend over too much of the input.
For example, if the query is
+text:event +text:manage
(the first example in the test case) then I would expect to see 'event' and 'manage' in the highlighted list. But what I actually see is
event);
manage
Despite the highlighting process using an analyzer which breaks terms apart and treats punctuation characters as single terms, the highlight code is "hungry" and breaks on whitespace alone.
Similarly if the query is
+text:queeu~1
(my final test case) I would expect to only see 'queue' in the list. But I get
queue.
job.queue
task.queue
queue;
It is so nearly there... but I don't understand why the highlighted pieces are inconsistent with the index, and I don't think I should have to pass the list of matches through yet another filter to produce the correct list of matches.
I would really appreciate any pointers to what I am doing wrong or how I could improve my code to deliver exactly what I need.
Thanks for reading this far!
I managed to get this working by replacing the WhitespaceTokenizer and TechTokenFilter in my analyzer with a PatternTokenizer; the regular expression took a bit of work, but once I had it, all the matching terms were extracted with pinpoint accuracy.
The replacement analyzer:
public class MyAnalyzer extends Analyzer {
    private String synlist = "";
    private boolean useSynonyms;

    public MyAnalyzer(String synlist) {
        if (!synlist.isEmpty()) {
            this.synlist = synlist;
            this.useSynonyms = true;
        }
    }

    public MyAnalyzer() {
        this.useSynonyms = false;
    }

    private static final String tokenRegex = "(([\\w]+-)*[\\w]+)|[^\\w\\s]";

    @Override
    protected TokenStreamComponents createComponents(String fieldName) {
        PatternTokenizer src = new PatternTokenizer(Pattern.compile(tokenRegex), 0);
        TokenStream result = new LowerCaseFilter(src);
        if (useSynonyms) {
            result = new SynonymGraphFilter(result, getSynonyms(synlist), Boolean.TRUE);
            result = new FlattenGraphFilter(result);
        }
        return new TokenStreamComponents(src, result);
    }
}
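As a quick illustration (not part of my analyzer, just the same regex applied directly), punctuation now becomes a standalone token instead of sticking to the words around it:

// Hedged sketch: the token regex applied to samples from the test document.
Pattern p = Pattern.compile("(([\\w]+-)*[\\w]+)|[^\\w\\s]");
Matcher m = p.matcher("job.queue event);");
while (m.find()) {
    System.out.println(m.group()); // prints: job . queue event ) ;
}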

Does SQL batch query execution involve multiple exchanges of data between server and client?

From what I have read from multiple sources online, batch query execution groups multiple statements together and executes them in one go, thereby eliminating multiple back-and-forth communications.
Some sources that claim this are:
https://www.tutorialspoint.com/jdbc/jdbc-batch-processing.htm#:~:text=Batch%20Processing%20allows%20you%20to,communication%20overhead%2C%20thereby%20improving%20performance.
http://tutorials.jenkov.com/jdbc/batchupdate.html
https://www.baeldung.com/jdbc-batch-processing
All of these talk about a single network trip, etc. However, going through the source code of H2 or SQLite, it looks like each statement is executed one by one, albeit with autocommit disabled.
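For context, this is the JDBC-level batch API those tutorials describe (a minimal sketch; the connection URL and table are placeholders):

// Hedged sketch of the client-side JDBC batch API; URL and table are made up.
try (Connection conn = DriverManager.getConnection("jdbc:sqlite:sample.db");
     PreparedStatement ps = conn.prepareStatement("INSERT INTO t(a) VALUES (?)")) {
    conn.setAutoCommit(false);
    for (int i = 0; i < 100; i++) {
        ps.setInt(1, i);
        ps.addBatch();   // queued client-side until executeBatch()
    }
    int[] updateCounts = ps.executeBatch();
    conn.commit();
}

Inside the drivers themselves, though, the loop over the batch entries is explicit.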
E.g., SQLite:
final synchronized int[] executeBatch(long stmt, int count, Object[] vals, boolean autoCommit) throws SQLException {
    if (count < 1) {
        throw new SQLException("count (" + count + ") < 1");
    }
    final int params = bind_parameter_count(stmt);
    int rc;
    int[] changes = new int[count];
    try {
        for (int i = 0; i < count; i++) {
            reset(stmt);
            for (int j = 0; j < params; j++) {
                rc = sqlbind(stmt, j, vals[(i * params) + j]);
                if (rc != SQLITE_OK) {
                    throwex(rc);
                }
            }
            rc = step(stmt);
            if (rc != SQLITE_DONE) {
                reset(stmt);
                if (rc == SQLITE_ROW) {
                    throw new BatchUpdateException("batch entry " + i + ": query returns results", changes);
                }
                throwex(rc);
            }
            changes[i] = changes();
        }
    } finally {
        ensureAutoCommit(autoCommit);
    }
    reset(stmt);
    return changes;
}
E.g., H2:
public int[] executeBatch() throws SQLException {
    try {
        debugCodeCall("executeBatch");
        if (batchParameters == null) {
            // Empty batch is allowed, see JDK-4639504 and other issues
            batchParameters = Utils.newSmallArrayList();
        }
        batchIdentities = new MergedResult();
        int size = batchParameters.size();
        int[] result = new int[size];
        SQLException first = null;
        SQLException last = null;
        checkClosedForWrite();
        for (int i = 0; i < size; i++) {
            Value[] set = batchParameters.get(i);
            ArrayList<? extends ParameterInterface> parameters = command.getParameters();
            for (int j = 0; j < set.length; j++) {
                Value value = set[j];
                ParameterInterface param = parameters.get(j);
                param.setValue(value, false);
            }
            try {
                result[i] = executeUpdateInternal();
                // Cannot use own implementation, it returns batch identities
                ResultSet rs = super.getGeneratedKeys();
                batchIdentities.add(((JdbcResultSet) rs).result);
            } catch (Exception re) {
                SQLException e = logAndConvert(re);
                if (last == null) {
                    first = last = e;
                } else {
                    last.setNextException(e);
                }
                result[i] = Statement.EXECUTE_FAILED;
            }
        }
        batchParameters = null;
        if (first != null) {
            throw new JdbcBatchUpdateException(first, result);
        }
        return result;
    } catch (Exception e) {
        throw logAndConvert(e);
    }
}
From the above code, I see that there are multiple calls to the database, each with its own result set. How does batch execution actually work?

Rate limiting with Redis

I'm using ElastiCache Redis for rate limiting and Redisson as the client. The related code is:
public CompletableFuture<List<Long>> incrementSingleKeys(
        List<String> keys, List<Long> increments, List<Long> ttls) {
    RBatch batch = redissonClient.createBatch(BatchOptions.defaults());
    for (int i = 0; i < keys.size(); i++) {
        batch.getAtomicLong(keys.get(i)).addAndGetAsync(increments.get(i));
    }
    return batch
        .executeAsync()
        .thenCompose(
            (counters) -> {
                List<String> keysToSet = Lists.newArrayList();
                List<Long> TTLsToSet = Lists.newArrayList();
                for (int i = 0; i < counters.size(); i++) {
                    if (increments.get(i).equals(counters.get(i))) { // only set ttl for new keys
                        keysToSet.add(keys.get(i));
                        TTLsToSet.add(ttls.get(i));
                    }
                }
                if (!keysToSet.isEmpty()) { // call setTTLs
                    return setTTLs(keysToSet, TTLsToSet).thenApply((r) -> counters);
                } else {
                    return CompletableFuture.completedFuture(counters);
                }
            });
}

public CompletableFuture<List<Boolean>> setTTLs(List<String> keys, List<Long> TTLs) {
    CompletableFuture<List<Boolean>> future = new CompletableFuture<>();
    Stopwatch timer = Stopwatch.createStarted();
    RBatch batch = redissonClient.createBatch(BatchOptions.defaults());
    for (int i = 0; i < keys.size(); i++) {
        batch.getBucket(keys.get(i)).expireAsync(TTLs.get(i), TimeUnit.MILLISECONDS);
    }
    batch
        .executeAsync()
        .whenComplete(
            (list, ex) -> {
                if (ex != null) {
                    future.complete(Collections.nCopies(keys.size(), false)); // fail open
                } else {
                    future.complete(
                        list.stream()
                            .map(entry -> (entry instanceof Boolean ? (Boolean) entry : false))
                            .collect(Collectors.toList()));
                }
            });
    return future;
}
Basically, I set a TTL only for new keys. The issue is that sometimes the increment batch call succeeds but the setTTLs call times out, which results in a permanent key and could lead to incorrect rate limiting. One workaround is to always get and set the TTL whenever an increment happens, but this would hurt performance. Is there any other solution?
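One common pattern (a hedged sketch, assuming Redisson's RScript API; key, increment, and ttlMillis are per-key placeholders) is to make the increment and the expiry atomic on the server with a Lua script, so a client-side timeout can no longer separate the two calls:

// Hedged sketch: INCRBY and PEXPIRE run atomically inside Redis, so a new
// key can never be left without a TTL by a client-side timeout.
String lua =
    "local v = redis.call('INCRBY', KEYS[1], ARGV[1]) " +
    "if v == tonumber(ARGV[1]) then redis.call('PEXPIRE', KEYS[1], ARGV[2]) end " +
    "return v";
RFuture<Long> counter = redissonClient.getScript(LongCodec.INSTANCE).evalAsync(
    RScript.Mode.READ_WRITE, lua, RScript.ReturnType.INTEGER,
    Collections.singletonList((Object) key), increment, ttlMillis);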

Throwing an Exception in an XSS Attack

This is a Web API which receives JSON payloads (so, no Razor).
I'm using ASP.NET Core 2.1.
First up, I should mention that I am sanitizing the relevant inputs with HtmlEncoder. However, that is just in case anything gets past my validator, which is what I want to ask about here.
I want to write a validator which will return an error code when a user tries to include an HTML string in an input (from a mobile app; the input would be a property in the JSON payload).
I've seen some naive implementation suggestions here on SO, usually just checking whether the string contains '<' or '>' (and maybe one or two other chars).
I would like to know if that is sufficient for the task at hand. There's no reason for a user to post any kind of HTML/XML in this domain.
A lot of the libraries around will sanitize input, but none of them seem to have a method which tells you whether a string contains potentially harmful input.
As I said, I'm already sanitizing (as a last line of defence), but ideally I would return an error code before it gets to that.
Use this class from Microsoft ASP.NET Core 1:
// <copyright file="CrossSiteScriptingValidation.cs" company="Microsoft">
//     Copyright (c) Microsoft Corporation. All rights reserved.
// </copyright>
public static class CrossSiteScriptingValidation
{
    private static readonly char[] StartingChars = { '<', '&' };

    #region Public methods

    // Only accepts http: and https: protocols, and protocolless urls.
    // Used by web parts to validate import and editor input on Url properties.
    // Review: is there a way to escape colon that will still be recognized by IE?
    // %3a does not work with IE.
    public static bool IsDangerousUrl(string s)
    {
        if (string.IsNullOrEmpty(s))
        {
            return false;
        }
        // Trim the string inside this method, since a Url starting with whitespace
        // is not necessarily dangerous. This saves the caller from having to pre-trim
        // the argument as well.
        s = s.Trim();
        var len = s.Length;
        if ((len > 4) &&
            ((s[0] == 'h') || (s[0] == 'H')) &&
            ((s[1] == 't') || (s[1] == 'T')) &&
            ((s[2] == 't') || (s[2] == 'T')) &&
            ((s[3] == 'p') || (s[3] == 'P')))
        {
            if ((s[4] == ':') || ((len > 5) && ((s[4] == 's') || (s[4] == 'S')) && (s[5] == ':')))
            {
                return false;
            }
        }
        var colonPosition = s.IndexOf(':');
        return colonPosition != -1;
    }

    public static bool IsValidJavascriptId(string id)
    {
        return (string.IsNullOrEmpty(id) || System.CodeDom.Compiler.CodeGenerator.IsValidLanguageIndependentIdentifier(id));
    }

    public static bool IsDangerousString(string s, out int matchIndex)
    {
        //bool inComment = false;
        matchIndex = 0;
        for (var i = 0; ;)
        {
            // Look for the start of one of our patterns
            var n = s.IndexOfAny(StartingChars, i);
            // If not found, the string is safe
            if (n < 0) return false;
            // If it's the last char, it's safe
            if (n == s.Length - 1) return false;
            matchIndex = n;
            switch (s[n])
            {
                case '<':
                    // If the < is followed by a letter or '!', it's unsafe (looks like a tag or HTML comment)
                    if (IsAtoZ(s[n + 1]) || s[n + 1] == '!' || s[n + 1] == '/' || s[n + 1] == '?') return true;
                    break;
                case '&':
                    // If the & is followed by a #, it's unsafe (e.g. &#83; is 'S')
                    if (s[n + 1] == '#') return true;
                    break;
            }
            // Continue searching
            i = n + 1;
        }
    }

    #endregion

    #region Private methods

    private static bool IsAtoZ(char c)
    {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
    }

    #endregion
}
Then use this middleware to check the URL, query parameters, and request content:
public class XssMiddleware
{
    private readonly RequestDelegate _next;

    public XssMiddleware(RequestDelegate next)
    {
        if (next == null)
        {
            throw new ArgumentNullException(nameof(next));
        }
        _next = next;
    }

    public async Task Invoke(HttpContext context)
    {
        // Check XSS in URL
        if (!string.IsNullOrWhiteSpace(context.Request.Path.Value))
        {
            var url = context.Request.Path.Value;
            int matchIndex;
            if (CrossSiteScriptingValidation.IsDangerousString(url, out matchIndex))
            {
                throw new CrossSiteScriptingException("YOUR_ERROR_MESSAGE");
            }
        }
        // Check XSS in query string
        if (!string.IsNullOrWhiteSpace(context.Request.QueryString.Value))
        {
            var queryString = WebUtility.UrlDecode(context.Request.QueryString.Value);
            int matchIndex;
            if (CrossSiteScriptingValidation.IsDangerousString(queryString, out matchIndex))
            {
                throw new CrossSiteScriptingException("YOUR_ERROR_MESSAGE");
            }
        }
        // Check XSS in request content
        var originalBody = context.Request.Body;
        try
        {
            var content = await ReadRequestBody(context);
            int matchIndex;
            if (CrossSiteScriptingValidation.IsDangerousString(content, out matchIndex))
            {
                throw new CrossSiteScriptingException("YOUR_ERROR_MESSAGE");
            }
            await _next(context);
        }
        finally
        {
            context.Request.Body = originalBody;
        }
    }

    private static async Task<string> ReadRequestBody(HttpContext context)
    {
        var buffer = new MemoryStream();
        await context.Request.Body.CopyToAsync(buffer);
        context.Request.Body = buffer;
        buffer.Position = 0;
        var encoding = Encoding.UTF8;
        var contentType = context.Request.GetTypedHeaders().ContentType;
        if (contentType?.Charset != null) encoding = Encoding.GetEncoding(contentType.Charset);
        var requestContent = await new StreamReader(buffer, encoding).ReadToEndAsync();
        context.Request.Body.Position = 0;
        return requestContent;
    }
}

Use Cecil to insert begin/end block around functions

This simple code works fine and allows me to add a BeginSample/EndSample call around each Update/LateUpdate/FixedUpdate function. However, it doesn't take early return instructions into consideration, for example as the result of a condition. Do you know how to write a similar function that takes early returns into account, so that the EndSample call is executed under every circumstance?
Note that I am not a Cecil expert; I am just learning. It appears to me that Cecil automatically updates the operations that return early after calling InsertBefore and similar functions. So if a BR opcode was previously jumping to a specific instruction address, the address is updated after the insertions so that it still jumps to the original instruction. This is fine in most cases, but here it means that an if statement would skip the last inserted operation, as the BR operation would still point directly to the final Ret instruction. Note that Update, LateUpdate and FixedUpdate are all void functions.
foreach (var method in type.Methods)
{
    if ((method.Name == "Update" || method.Name == "LateUpdate" || method.Name == "FixedUpdate") &&
        method.HasParameters == false)
    {
        var beginMethod =
            module.ImportReference(typeof(Profiler).GetMethod("BeginSample",
                new[] { typeof(string) }));
        var endMethod =
            module.ImportReference(typeof(Profiler).GetMethod("EndSample",
                BindingFlags.Static | BindingFlags.Public));
        Debug.Log(method.Name + " method found in class: " + type.Name);
        var ilProcessor = method.Body.GetILProcessor();
        var first = method.Body.Instructions[0];
        ilProcessor.InsertBefore(first,
            Instruction.Create(OpCodes.Ldstr, type.FullName + "." + method.Name));
        ilProcessor.InsertBefore(first, Instruction.Create(OpCodes.Call, beginMethod));
        var lastRet = method.Body.Instructions[method.Body.Instructions.Count - 1];
        ilProcessor.InsertBefore(lastRet, Instruction.Create(OpCodes.Call, endMethod));
        changed = true;
    }
}
As a bonus, can you explain the difference between Emit and Append for a newly created instruction with the same operand? Does Append execute an Emit under the hood, or does it do something more?
I may have found the solution; at least it appears to work. I followed the code used to solve a similar problem here:
https://groups.google.com/forum/#!msg/mono-cecil/nE6JBjvEFCQ/MqV6tgDCB4AJ
I adapted it for my purposes and it seemed to work, although I may still find other issues. This is the complete code:
static bool ProcessAssembly(AssemblyDefinition assembly)
{
    var changed = false;
    var moduleG = assembly.MainModule;
    var attributeConstructor =
        moduleG.ImportReference(
            typeof(RamjetProfilerPostProcessedAssemblyAttribute).GetConstructor(Type.EmptyTypes));
    var attribute = new CustomAttribute(attributeConstructor);
    var ramjet = moduleG.ImportReference(typeof(RamjetProfilerPostProcessedAssemblyAttribute));
    if (assembly.HasCustomAttributes)
    {
        var attributes = assembly.CustomAttributes;
        foreach (var attr in attributes)
        {
            if (attr.AttributeType.FullName == ramjet.FullName)
            {
                Debug.LogWarning("<color=yellow>Skipping already-patched assembly:</color> " + assembly.Name);
                return false;
            }
        }
    }
    assembly.CustomAttributes.Add(attribute);
    foreach (var module in assembly.Modules)
    {
        foreach (var type in module.Types)
        {
            // Skip any classes related to the RamjetProfiler
            if (type.Name.Contains("AssemblyPostProcessor") || type.Name.Contains("RamjetProfiler"))
            {
                // Todo: use actual type equals, not string matching
                Debug.Log("Skipping self class : " + type.Name);
                continue;
            }
            if (type.BaseType != null && type.BaseType.FullName.Contains("UnityEngine.MonoBehaviour"))
            {
                foreach (var method in type.Methods)
                {
                    if ((method.Name == "Update" || method.Name == "LateUpdate" || method.Name == "FixedUpdate") &&
                        method.HasParameters == false)
                    {
                        var beginMethod =
                            module.ImportReference(typeof(Profiler).GetMethod("BeginSample",
                                new[] { typeof(string) }));
                        var endMethod =
                            module.ImportReference(typeof(Profiler).GetMethod("EndSample",
                                BindingFlags.Static | BindingFlags.Public));
                        Debug.Log(method.Name + " method found in class: " + type.Name);
                        var ilProcessor = method.Body.GetILProcessor();
                        var first = method.Body.Instructions[0];
                        ilProcessor.InsertBefore(first,
                            Instruction.Create(OpCodes.Ldstr, type.FullName + "." + method.Name));
                        ilProcessor.InsertBefore(first, Instruction.Create(OpCodes.Call, beginMethod));
                        var lastcall = Instruction.Create(OpCodes.Call, endMethod);
                        FixReturns(method, lastcall);
                        changed = true;
                    }
                }
            }
        }
    }
    return changed;
}

static void FixReturns(MethodDefinition med, Instruction lastcall)
{
    MethodBody body = med.Body;
    var instructions = body.Instructions;
    Instruction formallyLastInstruction = instructions[instructions.Count - 1];
    Instruction lastLeaveInstruction = null;
    var lastRet = Instruction.Create(OpCodes.Ret);
    // Append the EndSample call and a single new Ret at the end of the method
    instructions.Add(lastcall);
    instructions.Add(lastRet);
    // Replace every existing Ret with a Leave that targets the EndSample call
    for (var index = 0; index < instructions.Count - 1; index++)
    {
        var instruction = instructions[index];
        if (instruction.OpCode == OpCodes.Ret)
        {
            Instruction leaveInstruction = Instruction.Create(OpCodes.Leave, lastcall);
            if (instruction == formallyLastInstruction)
            {
                lastLeaveInstruction = leaveInstruction;
            }
            instructions[index] = leaveInstruction;
        }
    }
    FixBranchTargets(lastLeaveInstruction, formallyLastInstruction, body);
}

private static void FixBranchTargets(
    Instruction lastLeaveInstruction,
    Instruction formallyLastRetInstruction,
    MethodBody body)
{
    // Branches that used to target the old final Ret must now target the Leave that replaced it
    for (var index = 0; index < body.Instructions.Count - 2; index++)
    {
        var instruction = body.Instructions[index];
        if (instruction.Operand != null && instruction.Operand == formallyLastRetInstruction)
        {
            instruction.Operand = lastLeaveInstruction;
        }
    }
}
Basically, what it does is add a new Ret instruction at the end, but then replace all the previous Ret instructions (usually one; why should there be more than one?) with a Leave instruction (I don't even know what it means :) ), so that all the previous jumps remain valid. Unlike the original code, I make the Leave instruction point to the EndSample call before the last Ret.