Same script, different behavior - scripting

I just stumbled upon an interesting bug... Still trying to figure out what is exactly happening. Maybe you can help.
First, the context. I'm currently building yet another man to html converter (for some reasons I won't motivate here, but I need it).
So, have a look at the screenshot below (see the link), more precisely at the outlined spots. See? On the upper shell, I have &lt ; and &gt ;, that is, escaped html.
While on the shell below I have < and > directly.
But as you can see (or do I seriously need looking glass ?), the command man 2 semget | webmanneris the same on both sides, as is the which webmanner. The two are executed roughly at the same moment, with no modification made to the script between.
[Oops, cannot post pictures just yet... Here comes the link]
http://aspyct.org/media/webmanner-bug.png
But the shell below is older (open about 1 hour ago). Newer shells all print out &lt ;. So my first guess was that it somehow had a cached reference to the old inode of the file, or old blocks or whatever.
So I modified parts of the script, at the start and then at the end, to print different messages. And, surprise, the message shown up on both terminals. But still, same difference between &lt ; and <.
I'm confused... How to explain that behavior? I'm working on a OSX 10.8 (Mountain Lion)
EDIT: OK, there is one big difference: the shell below uses ruby 1.9.3, while above is 1.8.7. Is there any known difference in string handling between the two versions ?

Are you using the htmlentities library? If so then this bug fix is probably what you are seeing
Ruby 1.9.3 has slightly different behaviour to 1.9.2: the result of
encode was not ASCII even when it only contained ASCII characters.
This may not be important, but this change makes both versions produce
the same result.
https://github.com/threedaymonk/htmlentities/commit/46dafc959de03a02d0c1705bef7f1b157b350025

Related

Getting reference error from appledoc when embedding code in comments

I have some code comments like this:
/**
How to use this method.
#discussion To use it, do something like the following
id hook = [[STDeallocHook alloc] initWithBlock:^{
// Do something when 'hook' is dealloced
}];
*/
So the code example is indented with 4 spaces. When I compile the docset with appledoc, it compiles correctly and shows the code as code in the API reference I generate. However back in XCode (Where I have appledoc creating warnings for issues in the doco) I get the warning:
Invalid [[STDeallocHook alloc] reference found hear STDeallocHook.h#16, unknown object: [STDeallocHook !
I think what's happening is that appledoc is looking for markdown links inside the code block.
How can I stop this warning from appearing?
I've been unable to stop it as well. It looks like it's been a known bug since 2011, but it's still broken.
Interestingly enough, I don't get it for everything. In a large code example, I'll only get a few of them... still haven't figured out how it determines to cause me grief or not...
Workarounds
This works around the warning, and looks fine in the generated documentation, but looks like crap in plain text: substitute the leading [ with the HTML escape code [
Future Fix
Supposedly, the mythical version 3 has addressed it, but I can't find any mention of an ETA for it. There is a "3.0exp1" branch from March 2012, and a "3.0dev" branch from October 2014.
If you have both the time and inclination, maybe you can see how it was fixed and patch it yourself (though the codebase has apparently changed a ton since then).
My Attempt
I felt unsatisfied with that answer, so, I went back and looked at the source code. First time in that code. It's not exactly easy to navigate... and none of the classes are documented, which I find quite strange, especially for a documentation tool.
Anyway, I think I know why I only get the warning sometimes. The parser treats all underscores as formatting markers. Thus, if it finds two of them in the same "block" of text, it splits them up. Since the code I tested on had category documentation, only the last one encountered in each "block" caused a warning... because all the others were treated as italics... and then ignored.
Also, it seems that I may be able to coerce it into skipping source-code blocks if they are marked as either...
#code
[self wjh_doSomething];
#endcode
or
```
[self wjh_doSomething];
```
or
~~~
[self wjh_doSomething];
~~~
The first is common in documentation blocks, the latter two in markdown.
It's a hack, but it seems to work. I sent a PR, which can be found here. Who knows if it will get accepted, but feel free to try it out yourself if you are so inclined.
I think I'll at least use it locally, as it cleans up a ton of warnings for me... and I may just go try to regenerate all my documented stuff to boot.
Edit
Well, I guess I should have gone and looked at the open PRs first. There seems to be a PR already sitting there that deals with the same issue, that has been there since May. It would have saved me time... but it was a little fun experimenting with it a bit ;-)
You may want to use that one... it seems to be simpler. Simpler is better, but I have not used that one and I'm not sure it completely ignores the blocks, but he seems to have quieted the warnings with his patch.
That one does not support #code/#endcode, which I'm glad to have.

Portable Executable DLL file and binary date format

I have got a PE executable file *.exe (32-bit), which is an small application (2.6Mb) to update firmware software of TV device. However, the update mechanism was only available up to 2013-03-12. I want to hack this executable just for pleasure. I'm trying to find this expiration date in file hexdump using PE Explorer, and replace it by some date in future to make this program work.
I found this article about binary date format:
binary date format
I am trying to find something like this value:
2013-03-xx: 0x713xxxxx
Is this a good approach to solve my task? Any suggestions? Do you know any others tools for hexdump that may be useful?
Best regard,
WP
There are likely a lot of values of the form 0x713xxxxx -- 2.6 MB might be larger than you've thought when you start looking through it more or less at random (you don't actually know that the application uses this date format internally).
The conventional approach to deal with this sort of problem is to use a tool to step through the program, examining the code that is executing, until you find the point where the check occurs. Then simply disable the check so that it always fails -- by altering the date, or simply by altering the code.
A popular tool for stepping through code that you do not control is the Interactive Dissassembler, IDA. You can download a freeware version of it here: https://www.hex-rays.com/products/ida/support/download_freeware.shtml
It might be harder than you think to do what you want, but you'll almost certainly learn a lot by trying.
Be aware of the legal issues you may be getting yourself into by making modifications to someone else's binaries, particularly if you distribute them afterwards.
dumpbin is a good PE parser (but if I were you, I won't do such kind of time stamp hacks :))

Localsolr wt=json and fl compatible?

We've got Localsolr (2.9.1 lucene-spatial library) running on Solr 1.4 with Tomcat 1.6. Everything's looking good, except for a couple little issues.
If we specify fl=id (or fl= anything) and wt=json it seems that the fl parameter is ignored (thus we get a lot more detail in our results than we'd like).
If we specify fl=id and leave out wt=json (which defaults to returning xml results), we get the expected fields back. We'd really prefer to use wt=json because the results are easier for us to deal with (also, the same issue also arises with wt=python and wt=ruby).
Ideas? Known issue? Workarounds?

Print complete control flow through gdb including values of variables

The idea is that given a specific input to the program, somehow I want to automatically step-in through the complete program and dump its control flow along with all the data being used like classes and their variables. Is their a straightforward way to do this? Or can this be done by some scripting over gdb or does it require modification in gdb?
Ok the reason for this question is because of an idea regarding a debugging tool. What it does is this. Given two different inputs to a program, one causing an incorrect output and the other a correct one, it will tell what part of the control flow differ for them.
So What I think will be needed is a complete dump of these 2 control flows going into a diff engine. And if the two inputs are following similar control flows then their diff would (in many cases) give a good idea about why the bug exist.
This can be made into a very engaging tool with many features build on top of this.
Tell us a little more about the environment. dtrace, for example, will do a marvelous job of this in Solaris or Leopard. gprof is another possibility.
A bumpo version of this could be done with yes(1), or expect(1).
If you want to get fancy, GDB can be scripted with Python in some versions.
What you are describing sounds a bit like gdb's "tracepoint debugging".
See gdb's internal help "help tracepoint". You can also see a whitepaper
here: http://sourceware.org/gdb/talks/esc-west-1999/
Unfortunately, this functionality is not currently implemented for
native debugging, but I believe that CodeSourcery is doing some work
on it.
Check this out, unlike Coverity, Fenris is free and widly used..
How to print the next N executed lines automatically in GDB?

How would one go about testing an interpreter or a compiler?

I've been experimenting with creating an interpreter for Brainfuck, and while quite simple to make and get up and running, part of me wants to be able to run tests against it. I can't seem to fathom how many tests one might have to write to test all the possible instruction combinations to ensure that the implementation is proper.
Obviously, with Brainfuck, the instruction set is small, but I can't help but think that as more instructions are added, your test code would grow exponentially. More so than your typical tests at any rate.
Now, I'm about as newbie as you can get in terms of writing compilers and interpreters, so my assumptions could very well be way off base.
Basically, where do you even begin with testing on something like this?
Testing a compiler is a little different from testing some other kinds of apps, because it's OK for the compiler to produce different assembly-code versions of a program as long as they all do the right thing. However, if you're just testing an interpreter, it's pretty much the same as any other text-based application. Here is a Unix-centric view:
You will want to build up a regression test suite. Each test should have
Source code you will interpret, say test001.bf
Standard input to the program you will interpret, say test001.0
What you expect the interpreter to produce on standard output, say test001.1
What you expect the interpreter to produce on standard error, say test001.2 (you care about standard error because you want to test your interpreter's error messages)
You will need a "run test" script that does something like the following
function fail {
echo "Unexpected differences on $1:"
diff $2 $3
exit 1
}
for testname
do
tmp1=$(tempfile)
tmp2=$(tempfile)
brainfuck $testname.bf < $testname.0 > $tmp1 2> $tmp2
[ cmp -s $testname.1 $tmp1 ] || fail "stdout" $testname.1 $tmp1
[ cmp -s $testname.2 $tmp2 ] || fail "stderr" $testname.2 $tmp2
done
You will find it helpful to have a "create test" script that does something like
brainfuck $testname.bf < $testname.0 > $testname.1 2> $testname.2
You run this only when you're totally confident that the interpreter works for that case.
You keep your test suite under source control.
It's convenient to embellish your test script so you can leave out files that are expected to be empty.
Any time anything changes, you re-run all the tests. You probably also re-run them all nightly via a cron job.
Finally, you want to add enough tests to get good test coverage of your compiler's source code. The quality of coverage tools varies widely, but GNU Gcov is an adequate coverage tool.
Good luck with your interpreter! If you want to see a lovingly crafted but not very well documented testing infrastructure, go look at the test2 directory for the Quick C-- compiler.
I don't think there's anything 'special' about testing a compiler; in a sense it's almost easier than testing some programs, since a compiler has such a basic high-level summary - you hand in source, it gives you back (possibly) compiled code and (possibly) a set of diagnostic messages.
Like any complex software entity, there will be many code paths, but since it's all very data-oriented (text in, text and bytes out) it's straightforward to author tests.
I’ve written an article on compiler testing, the original conclusion of which (slightly toned down for publication) was: It’s morally wrong to reinvent the wheel. Unless you already know all about the preexisting solutions and have a very good reason for ignoring them, you should start by looking at the tools that already exist. The easiest place to start is Gnu C Torture, but bear in mind that it’s based on Deja Gnu, which has, shall we say, issues. (It took me six attempts even to get the maintainer to allow a critical bug report about the Hello World example onto the mailing list.)
I’ll immodestly suggest that you look at the following as a starting place for tools to investigate:
Software: Practice and Experience April 2007. (Payware, not available to the general public---free preprint at http://pobox.com/~flash/Practical_Testing_of_C99.pdf.
http://en.wikipedia.org/wiki/Compiler_correctness#Testing (Largely written by me.)
Compiler testing bibliography (Please let me know of any updates I’ve missed.)
In the case of brainfuck, I think testing it should be done with brainfuck scripts. I would test the following, though:
1: Are all the cells initialized to 0
2: What happens when you decrement the data pointer when it's currently pointing to the first cell? Does it wrap? Does it point to invalid memory?
3: What happens when you increment the data pointer when it's pointing at the last cell? Does it wrap? Does it point to invalid memory
4: Does output function correctly
5: Does input function correctly
6: Does the [ ] stuff work correctly
7: What happens when you increment a byte more than 255 times, does it wrap to 0 properly, or is it incorrectly treated as an integer or other value.
More tests are possible too, but this is probably where i'd start. I wrote a BF compiler a few years ago, and that had a few extra tests. Particularly I tested the [ ] stuff heavily, by having a lot of code inside the block, since an early version of my code generator had issues there (on x86 using a jxx I had issues when the block produced more than 128 bytes or so of code, resulting in invalid x86 asm).
You can test with some already written apps.
The secret is to:
Separate the concerns
Observe the law of Demeter
Inject your dependencies
Well, software that is hard to test is a sign that the developer wrote it like it's 1985. Sorry to say that, but utilizing the three principles I presented here, even line numbered BASIC would be unit testable (it IS possible to inject dependencies into BASIC, because you can do "goto variable".