Fix bugs in library code, or abandon them?

I've found a bug in a code library:
class Physics
{
public static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time)
{
//d = d0 + v0t + 1/2*at^2
return initialDistance + (initialSpeed*time)+ (acceleration*Power(time, 2));
}
}
Note: The example, and the language, are hypothetical
I cannot guarantee that fixing this code will not break someone.
It is conceivable that there are people who depend on the bug in this code, and that fixing it might cause them to experience an error. (I cannot think of a practical way that could happen; perhaps it has something to do with them building a lookup table of distances, or maybe they simply throw an exception if the distance is not what they expect.)
Should I create a 2nd function:
class Physics
{
public static Float CalculateDistance2(float initialDistance, float initialSpeed, float acceleration, float time) { ... }
//Deprecated - do not use. Use CalculateDistance2
public static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time) { ... }
}
In a language without a way to formally deprecate code, do I just trust everyone to switch over to CalculateDistance2?
It's also sucky, because now the ideally named function (CalculateDistance) is forever lost to a legacy function that probably nobody needs and nobody wants to be using.
Should I fix bugs, or abandon them?
See also
How to work in untestable legacy code- in bug fixing
Should we fix that bug?
Should this bug be fixed?
Working Effectively With Legacy Code

You'll never succeed at catering to every existing project using your library. Attempting to do so may create a welcomed sense of predictability, but it will also lead to it being bloated and stagnant. Eventually, this will leave it prone to replacement by a much more concise library.
Like any other project, it should be expected to go through iterations of change and rerelease. As most of your current user base are programmers that should be familiar with that, change really shouldn't come as a surprise to them. As long as you identify releases with versioning and document the changes made between, they should know what to expect when they go to update, even if it means they decide to stay with the version they already have.
Also, as a possible new user, finding your library to have an ever-growing number of lines of legacy code due to the blatant unwillingness to fix known bugs tells me that the maintainability and sustainability of the project are both potentially poor.
So, I would honestly just say to fix it.

Good question. I'm looking forward to some other answers. Here's my 2 cents on the issue:
Generally, if you suspect that many people indeed rely on the bug, then that's an argument for not fixing the bug and instead creating a new function CalculateDistance2.
On the other hand, and I think that's the better option, don't forget that people who rely on the bug can always continue using a specific, older version of your library. You can still document the removal of the bug (and therefore the modified behaviour of your library function) in the release notes.
(If your class happens to be a COM component, the conventional wisdom would be to create a new interface ICalculateDistance2 that makes the original interface obsolete, but preserves it for backwards compatibility.)

Another option is to fix the bug, but leave the old code around as a LegacyCalculateDistance method that's available if anybody really needs it.
You could even implement the ability to select the "legacy" implementation based on (for example) a configuration file or an environment variable setting, if you're concerned about offering a compatibility solution to users who may not be able to make code-level changes.
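A minimal sketch of the environment-variable switch described above, written in Python for illustration. The variable name PHYSICS_LEGACY_DISTANCE and the function names are hypothetical, and the sketch assumes the bug in the question is the missing 1/2 factor on the acceleration term.

```python
import os


def _calculate_distance_legacy(d0, v0, a, t):
    # Original behaviour kept as-is (assumed bug: the 1/2 factor is missing).
    return d0 + v0 * t + a * t ** 2


def _calculate_distance_fixed(d0, v0, a, t):
    # d = d0 + v0*t + 1/2*a*t^2
    return d0 + v0 * t + 0.5 * a * t ** 2


def calculate_distance(d0, v0, a, t):
    """Use the fixed formula unless the (hypothetical) opt-in flag is set."""
    if os.environ.get("PHYSICS_LEGACY_DISTANCE") == "1":
        return _calculate_distance_legacy(d0, v0, a, t)
    return _calculate_distance_fixed(d0, v0, a, t)
```

Users who cannot change their code would set PHYSICS_LEGACY_DISTANCE=1 in their environment; everyone else gets the corrected result by default.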

I once fought for a number of days with some MFC code that was behaving in an entirely unexpected way. When I finally figured out it was an error in the Microsoft supplied library, I checked the knowledge base. It was documented as (approximately) "this is a known bug we found 2 OS versions ago. We aren't fixing it because someone is probably depending on it".
I was a little mad...
I'd say that you should deprecate it. If you're upgrading the library your code depends on, you should test the code with the new library. If it's legacy code, then there's a known configuration it works on. Advise your users and move forward...
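If deprecation here means keeping the old function while steering callers to a replacement, it might look like the following Python sketch. The function names mirror the question's hypothetical CalculateDistance/CalculateDistance2, and the assumption that the bug is the missing 1/2 factor is mine.

```python
import warnings


def calculate_distance2(d0, v0, a, t):
    # Corrected formula: d = d0 + v0*t + 1/2*a*t^2
    return d0 + v0 * t + 0.5 * a * t ** 2


def calculate_distance(d0, v0, a, t):
    # Old behaviour is preserved, but every call emits a DeprecationWarning
    # that points at the caller's line (stacklevel=2).
    warnings.warn(
        "calculate_distance is deprecated; use calculate_distance2",
        DeprecationWarning,
        stacklevel=2,
    )
    return d0 + v0 * t + a * t ** 2
```

Callers who test against the new release with warnings enabled will see exactly which call sites still rely on the old behaviour.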

As you already described, this is a tradeoff between satisfying two different user groups:
Existing users who have built their software based on a bug in your library
New users who will be using your library in the future
There is no ideal solution, nor do I think that there is a universal answer. I think this depends entirely on the bug and function in question.
I think you have to ask yourself
"Does the function make any sense with the currently existing bug?"
If it does, then I'd leave it in the library. Otherwise, I'd probably toss it out.

Just for the sake of argument, let's assume your hypothetical language was C++ (even though it looks a lot more like Java). In that case, I'd use namespaces to create a fork that was (reasonably) easy to deal with from a viewpoint of both new and legacy code:
namespace legacy {
class physics {
public:
    static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time)
    {
        // original code here
    }
};
}
namespace current {
class physics {
public:
    static Float CalculateDistance(float initialDistance, float initialSpeed, float acceleration, float time)
    {
        // corrected code here
    }
};
}
From there you have a couple of options for choosing between the two. For example, existing code could add a using declaration: using legacy::physics;, and they'd continue to use the existing code without any further modification. New code could add a using current::physics; instead, to get the current code. When you do this, you'd (probably) deprecate the legacy::physics class, and schedule it for removal after some given period of time, number of revisions, or whatever. This gives your customers a chance to check their code and switch over to the new code in an orderly fashion, while keeping the legacy namespace from getting too polluted with old junk.
If you really want to get elaborate with this, you can even add a version numbering scheme to your namespaces, so instead of just legacy::physics, it might be v2_7::physics. This allows for the possibility that even when you "fix" code, it's remotely possible that there might still be a bug or two left, so you might end up revising it again, and somebody might end up depending on some arbitrary version of it, not necessarily just the original or the current one.
At the same time, this restricts "awareness" of the version to be used to one (fairly) small part of the code (or at least a small part of each module) instead of it being spread throughout all the code. It also gives a fairly painless way for somebody to compile a module using the new code, check for errors, switch back to the old code if needed, etc., without having to deal directly with every individual invocation of the function in question.

Related

Debugging an obfuscated .NET core application with DotPeek

I am hunting for a possible logic bomb in the code deployed to production by our vendor software factory.
For the curious, here is a brief recap. The application stopped working with an infinite wait at some point. Decompiling the obfuscated code, I found an odd Thread.Sleep that should never be in an MVC API, where the duration is computed from the difference between the current ticks and a value computed somehow. I.e.
private long SomeFunction(long param) {
if (param > 0)
Thread.Sleep(param);
return param;
}
private long GetSomeLongValue() {
//Simplified. There is a lot of long to string and back
return SomeFunction(Manipulate(DateTime.Now.Ticks - GetMysteryNumber()));
}
private long Manipulate(long param){
if (param < 0)
return param;
else
// Compute a random number of days between 0 and param / 86400000,
// and return its milliseconds value, always positive
}
By running experiments with the system clock, I found a magic DateTime.Now value: the application works right up to it and stops one second after it. The experiment was consistent and repeatable.
Back to the question
I have done all this work using JetBrains DotPeek. This was done by looking at the code: human static analysis.
The problem is that what I have called GetMysteryNumber is so well obfuscated that I really can't get any clue about what it does. I have the full code, but I would like to take another approach.
I'd like to exercise that function and try to see if it returns consistent values that may be equal to the guilty timestamp. The function depends on the result of GetCallingAssembly method, so that will be a pain in the back.
I thought about running some sort of Program.cs or unit test that exploits the obfuscated function by reflection, but I'd like to debug using DotPeek. Why?
Disassembly can be a mess
I tried Telerik, but I had a lot more success with DotPeek decompiling async methods not in their StateMachine representation
I have never done this in my work experience. I just need to be sure about this being intentional or not.
How do I set up a test bed environment so that I can debug into a linked DLL decompiled by DotPeek?
This post, In the Jungle of .NET Decompilers, covers the .NET decompilers that are worth using.
The free, open-source tool dnSpy is definitely the one you want for that sort of "I'd like to exercise that function" hacking scenario.

can i make an equivalent to /* and */ for comment blocks?

It's driving me crazy. I spend so much time getting it wrong, and then fixing it wrong.
I'm thinking of using -= and =- as delimiters, but it probably means a lot of hours learning how to fool the compiler into a substitution. Is this a quixotic quest? Can it be done? Has it been done already, albeit with different keystrokes?
I work alone. I don't collaborate.
So I don't mind a non-standard work environment.
If I need to in the future, I could devise a scheme whereby both could work.
Not without building your own custom version of the preprocessor. Comment syntax is an inherent part of the language and is not designed to be configurable.
(Incidentally, -= is already a token in Objective-C — it means "assign to LHS the result of subtracting RHS from LHS.")
It should be possible to extend clang, then modify your Xcode builds to include your clang extensions. I have no personal experience with writing compiler extensions for clang, but I did work on a tool that extended cl.exe. Warning: this would be a very deep dive into the internals of the build system.
Extending Clang
Good Luck

Why decompilers can't produce the original code, theoretically

I searched the internet but did not find a concrete answer as to why decompilers are unable to produce the original source code. Somewhere it was written that the problem is similar to the halting problem, but it didn't explain how. So what are the theoretical and technical limitations on creating a perfect decompiler?
It is, quite simply, a many-to-one problem. For example, in C:
b++;
and
b+=1;
and
b = b + 1;
may all get compiled to the same set of operations once the compiler and optimizer are done. It reorders things, drops ineffective operations, and rewrites entire sections of code. By the time it is done, it has no idea what you wrote, just a pretty good idea what you intended to happen, at a raw-CPU (or vCPU) level.
It is even smart enough to remove variables that aren't needed:
{
a=5;
b=func();
c=a+b;
d=func2(c);
}
## gets rewritten as:
REGISTERA=func()
REGISTERA+=5
return(func2(REGISTERA))
For starters, the variable names are generally not preserved when your program is compiled (absent debug information), so the best a decompiler could possibly do would be to use meaningless variable names throughout your reconstituted program. Compiling is generally a one-way transformation, like a one-way hashing function. Like the hash, it may be possible to generate something else that could hash to the same value, but it's highly unlikely the decompiled program will be the exact same as your original.
Compilers throw out information; not all the information that is in the source code is in the compiled code. For example in compiled Java, you can't tell the difference between a parameterized and unparameterized generic type because the information is only used by the compiler; some annotations are only used at compile time and are not included in the compiled output. That doesn't mean you couldn't get some sort of source code by decompiling; it just wouldn't match the actual source code, nor would it be as informative.
There is usually not a 1-to-1 correspondence between source code and compiled code. If an essentially infinite number of possible sources could result in the same object code (given unbounded variable name lengths, etc.), how is a decompiler to guess which one to spit out?
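The many-to-one point can be observed directly with Python's built-in compile(), as a small sketch. This is only a weak form of the problem: Python bytecode keeps variable names, so the example shows just the loss of spacing, comments and redundant parentheses, not the heavier rewriting a C optimizer does. The helper name bytecode is made up for illustration.

```python
def bytecode(source: str) -> bytes:
    """Compile a snippet and return only its raw bytecode bytes."""
    return compile(source, "<example>", "exec").co_code


# Three different source texts for the same statement.
variants = [
    "y = a + b",
    "y = (a + b)",
    "y   =   a+b   # comments and spacing never reach the bytecode",
]

# If every variant yields the same bytes, nothing in the compiled output
# can tell a decompiler which of the three the programmer actually typed.
all_identical = len({bytecode(v) for v in variants}) == 1
print(all_identical)
```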

Why use Intellij, or what to do during the evaluation

I downloaded IntelliJ IDEA and started with the 30 day evaluation.
Now I'm just wondering, why should I use IntelliJ for plain old Java development (so no Hibernate, JSP, etc.)? It doesn't look that different from Eclipse or NetBeans, so I hope some IntelliJ guru can give some examples of things IntelliJ can do to justify investing the money in it. I'm trying to use the evaluation to its fullest and don't want to miss an important feature.
A list of things possible in IntelliJ but not in eclipse is already available, but I'm more interested in the daily workflow than some obscure features that will be used twice a month.
I've been using IntelliJ for about 5 years now (since version 4.5) and I also read through most of the Manning book "IntelliJ in Action" and I still wouldn't consider myself a guru on it. In fact, I also wanted to do "plain old Java development" with it, and honestly I have to say that it's quite good at that. Like the other answers, I can only say that there's a definite edge in its helpfulness that really puts it over the top. We use Eclipse here at work also, and while I don't have as much experience, I can tell you that there are definitely a lot of basic things lacking in it. I put in some serious time and effort to learn Eclipse, looking up how to do the everyday sorts of things I take for granted in IntelliJ, and they're mostly not there or very poorly implemented. The refactoring stuff is definitely the thing that helps a lot.
Aside from the refactoring, I think there are just a ton of small touches that really make this helpful. I think an example might help clarify...
Try this:
Create a new, empty class. Move the cursor inside the braces and do psvm and hit Ctrl-J - this expands the "psvm" into "public static void main(String[] args)". There's a whole list of commonly-used idioms that this shortcut will handle (and it's configurable, too). Inside the main code block, enter this code:
public static void main(String[] args) {
int x = 1000;
sout
}
At the end of "sout", do Ctrl-J again - you'll see another popup that lets you choose from some different expansions, but in general this expands to "System.out.println("")" and helpfully puts the cursor between the double-quotes (it's small touches like this that really make it shine, I think. Compare with Visual Studio's "IntelliSense" - a total crock if you ask me).
Anyway, backspace over the first double-quote - notice it deletes the matching double-quote? It does the same thing with braces and brackets, too. I think there are a few corner cases where I prefer it doesn't do this, but the large majority of the time it helps a lot. Back to the code editing: just type x so the code now looks like this:
public static void main(String[] args) {
int x = 1000;
// add a few blank lines here too - the need for
// this will be obvious a little later
System.out.println(x);
}
Now, move the cursor over to the declaration of x, and do Shift-F6 - this is the refactoring in-place dialog (I dunno what to call it, so I just made that up). The name "x" gets a colored box around it, and you can start typing a new name for it - as you type, all uses of that name get dynamically updated too. Another neat touch I really like.
Try this: put a really long line comment somewhere, like so:
// this is a really long comment blah blah blah i love to hear myself talking hahaha
Now say you decide the comment is too long, so you move the cursor to the middle of it somewhere and hit Enter. IntelliJ will put the remaining portion of the comment with a "// " prepended - it "knows" this is a continuation of the previous comment, so it comments it for you. Another neat touch.
// this is a really long comment blah
// blah blah i love to hear myself talking hahaha
Another big bonus I like about IntelliJ compared to Eclipse is that it's much less intrusive - I really hated how Eclipse would manage to get popups on top of popups, and mouse focus would be somewhere but keyboard focus is stuck on something underneath, etc. I think it's possible to work in such a way that these sorts of things don't happen, but it annoyed me immensely in the first place. That reminds me, in IntelliJ if you move the mouse cursor over the package or file navigator in the left pane, that panel gets the mouse focus automatically, so I got accustomed to using the mouse wheel immediately to look around. In Eclipse? You mouse over, but focus stayed on the editing pane, so you have to CLICK with the mouse to transfer focus, and then be able to use the mouse wheel to look around. Like I said, it's a lot of small touches like that which help with productivity.
As you code around, pay attention to the left gutter bar for red "light bulb" type symbols on the current line - this is IntelliJ telling you there are possible things it can do. Use Alt-Enter to bring up a small in-place dialog box, and it will tell you what it can take care of automatically. Say you type in a method definition named "getFoo()" and there's no foo member - it will offer to create it for you. Or if you're using a class and call a non-existing method on it like getFoo() - it will offer to create a getter and a member, or a regular method. It's just plain helpful.
Overall, I'd say small touches are not what IntelliJ gurus will really want to talk about, but I really appreciate how these sorts of things are just "well done". They take care of small details so you don't have to spend so much mental runtime checking your own syntax. I think of it as a butler helping me out with my coding - it takes care of small chores for me so I don't have to. Batman has his Alfred, and I have my IntelliJ. All the settings are excruciatingly laid out for you to modify if you like, but it just seems like the defaults are all geared toward improving your productivity, instead of bothering you all the time with really mundane decisions (especially those which you would almost always make the same choice anyway).
There are some definite drawbacks to IntelliJ - the price is a bit high, and it's large enough that it can take a while to load up on a large project. I'm lucky in that my company pays for the license as well as a very nice workstation so it loads up reasonably quickly, but your mileage will vary.
I have a day job where I use Eclipse (because that's all that is permitted). I also have my own company where I "moonlight" doing contract work and use IntelliJ.
I would say their feature sets are about the same. Pretty much the same refactorings, same code assists, same style of use, etc. Eclipse arguably has better plug-ins, and probably many more, than IntelliJ. IntelliJ just "knows" things right out of the box like Spring and Hibernate, and these things are better integrated than Eclipse plug-ins of similar functionality.
The reason I choose to use IntelliJ when allowed (personally purchased and upgraded several times) is that everything it does just feels cleaner. It's hard to put my finger on it, but the exact same functionality in IntelliJ feels more streamlined and easier to use than in Eclipse - enough that I would pay for it even though there is a full-featured free IDE available.
So, the activities you should do to decide are this: use IntelliJ everyday for all your development tasks for 30 days. Push through the curve of learning new shortcuts and ways of searching, refactoring, etc. and I suspect you will prefer it. If not, Eclipse is still there for you.
I've just moved back to Eclipse after 2 years with Intellij (due to a client's preferences).
I'm finding Eclipse to be less helpful. I know that's a nebulous term, but Intellij's feedback was clearer, the UI gave me better information on what was going on, the automatic building seemed more seamless. The project setup/configuration seems more intuitive.
This is perhaps subjective, but sometimes that's why you prefer one over the other. It just feels that little bit slicker, that little bit friendlier...
(I don't think the VIM plugin for Intellij is better, but that's another story!)
2weiji: use Tab instead of Ctrl+J :-) It's much handier :-))
sout then Tab
psvm then Tab
iter then Tab
etc... :o)
Why use IntelliJ? Because I find it more consistent in its user interface and more polished. It allows you to keep your hands on the keyboard longer, rather than switching back and forth between keyboard and mouse. As other people have said, it's the little things that add up.
I found the user-intentions functionality a huge time-saver, along with the way it reviews the code, suggests optimizations and corrects bad practices. I can hit Alt-Enter in a lot of contexts and the IDE is able to figure out and insert what should be there: pesky things like type declarations, converting an old-style for-loop to the JDK 5 for-each loop, removing redundant value assignments or unused variables.
Being able to first type in the usage of a method that doesn't exist, and then hitting a key combination and having the IDE write out the bare structure of the method - huge timesaver. It makes for a better workflow for me, because it allows you to first think about how you'll be using a method, what it looks like when you read it in context.
Refactoring support - this was the big selling point when I first started using it in 2003, and I think it still leads the way (but I hear Netbeans is also pretty good now)
Highly recommend that you have a look at the IntelliJ KeyMap Reference.
Have a look at this discussion for often-used shortcuts: What are the most useful Intellij IDEA keyboard shortcuts?
I'm not an IntelliJ IDEA guru but what the fanboys usually like about IntelliJ are the great refactoring and code assist features. They often claim this product is far superior to its competitor from this point of view.
Personally, I'm a partisan of the following principle: use the IDE with which you feel the most productive, not the one somebody else prefers.

What is soft coding? (Anti-pattern)

I found the Wikipedia entry on the soft coding anti-pattern terse and confusing. So what is soft coding? In what settings is it a bad practice (anti-pattern)? Also, when could it be considered beneficial, and if so, how should it be implemented?
Short answer: Going to extremes to avoid Hard Coding and ending up with some monster convoluted abstraction layer to maintain that is worse than if the hard coded values had been there from the start. i.e. over engineering.
Like:
SpecialFileClass file = new SpecialFileClass( 200 ); // hard coded
SpecialFileClass file = new SpecialFileClass( DBConfig.Start().GetConnection().LookupValue("MaxBufferSizeOfSpecialFile").GetValue());
The main point of the Daily WTF article on soft coding is that, out of premature optimization and fear, a system that is well defined and contains no duplicated knowledge gets altered and becomes more complex without any need.
The main thing to keep in mind is whether your changes actually improve your system; don't lightly label something an anti-pattern and then avoid it at all costs. Configuring your system and avoiding hardcoding is a simple cure for duplicated knowledge in your system (see point 11: "DRY - Don't Repeat Yourself" in The Pragmatic Programmer Quick Reference Guide). This is the driving need behind the suggestion of avoiding hardcoding, i.e. there should ideally be only one place in your system (code or configuration) that has to be altered if you need to change something as simple as an error message.
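As a rough illustration of that middle ground, a value can live in exactly one place in the code and still be overridable without a configuration framework. This is only a sketch: the constant, the function and the MAX_BUFFER_SIZE environment variable are made-up names, with 200 borrowed from the hard-coded example above.

```python
import os

# The single place where the default is written down.
DEFAULT_MAX_BUFFER_SIZE = 200


def max_buffer_size() -> int:
    """Return the buffer size; a (hypothetical) env var may override the default."""
    return int(os.environ.get("MAX_BUFFER_SIZE", DEFAULT_MAX_BUFFER_SIZE))
```

Every caller asks max_buffer_size(), so changing the value means touching one line or one environment variable, with no database lookup chain to maintain.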
Ola, a good example of a real project that has the concept of softcoding built in to it is the Django project. Their settings.py file abstracts certain data settings so that you can make the changes there instead of embedding them within your code. You can also add values to that file if necessary and use them where necessary.
http://docs.djangoproject.com/en/dev/topics/settings/
Example:
This could be a snippet from the settings.py file:
NUM_ROWS = 20  # Django only loads settings whose names are uppercase
Then within one of your files you could access that value:
from django.conf import settings
...
for x in range(settings.NUM_ROWS):
...
Soft coding is the process of supplying values to a program from an external source, such as keyboard input or a command-line interface. It is considered good programming practice because developers can easily modify the program's behaviour.
Hard coding assigns values in the source code itself, before the program is built into an executable. Changing those values afterwards is a difficult process. For example, in blockchain technology the genesis block is hard-coded and cannot be changed or modified.
The ultimate in softcoding:
const float pi = 3.1415; // Don't want to hardcode this everywhere in case we ever need to ship to Indiana.