Good Use Cases of Comments - formatting

I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
However, I do read comments that describe why something is done and how it's done (usually the single line comments in the code); those come in really handy when trying to understand someone else's code.
But when it comes to writing comments, I don't write that, rather, I use comments only when writing algorithms in programming contests, I'd think of how the algorithm will do what it does then I'd write each one in a comment, then write the code that corresponds to that comment.
An example would be:
//loop though all the names from n to j - 1
Other than that I can't imagine why anyone would waste valuable time writing comments when he could be writing code.
Am I right or wrong? Am I missing something? What other good use cases of comments am I not aware of?

Comments should express why you are doing something not what you are doing

It's an old adage, but a good metric to use is:
Comment why you're doing something, not how you're doing it.
Saying "loop through all the names from n to j-1" should be immediately clear to even a novice programmer from the code alone. Giving the reason why you're doing that can help with readability.

If you use something like Doxygen, you can fully document your return types, arguments, etc. and generate a nice "source code manual." I often do this for clients so that the team that inherits my code isn't entirely lost (or forced to review every header).
Documentation blocks are often overdone, especially is strongly typed languages. It makes a lot more sense to be verbose with something like Python or PHP than C++ or Java. That said, it's still nice to do for methods & members that aren't self explanatory (not named update, for instance).
I've been saved many hours of thinking, simply by commenting what I'd want to tell myself if I were reading my code for the first time. More narrative and less observation. Comments should not only help others, but yourself as well... especially if you haven't touched it in five years. I have some ten year old Perl that I wrote and I still don't know what it does anymore.
Something very dirty, that I've done in PHP & Python, is use reflection to retrieve comment blocks and label elements in the user interface. It's a use case, albeit nasty.
If using a bug tracker, I'll also drop the bug ID near my changes, so that I have a reference back to the tracker. This is in addition to a brief description of the change (inline change logs).
I also violate the "only comment why not what" rule when I'm doing something that my colleagues rarely see... or when subtlety is important. For instance:
for (int i = 50; i--; ) cout << i; // looping from 49..0 in reverse
for (int i = 50; --i; ) cout << i; // looping from 49..1 in reverse

I use comments in the following situations:
High-level API documentation comments, i.e. what is this class or function for?
Commenting the "why".
A short, high-level summary of what a much longer block of code does. The key word here is summary. If someone wants more detail, the code should be clear enough that they can get it from the code. The point here is to make it easy for someone browsing the code to figure out where some piece of logic is without having to wade through the details of how it's performed. Ideally these cases should be factored out into separate functions instead, but sometimes it's just not do-able because the function would have 15 parameters and/or not be nameable.
Pointing out subtleties that are visible from reading the code if you're really paying attention, but don't stand out as much as they should given their importance.
When I have a good reason why I need to do something in a hackish way (performance, etc.) and can't write the code more clearly instead of using a comment.

Comment everything that you think is not straightforward and you won't be able to understand the next time you see your code.

It's not a bad idea to record what you think your code should be achieving (especially if the code is non-intuitive, if you want to keep comments down to a minimum) so that someone reading it a later date, has an easier time when debugging/bugfixing. Although one of the most frustrating things to encounter in reading someone else's code is cases where the code has been updated, but not the comments....

I've always hated comments that fill half the screen with asterisks just to tell you that the function returns a string, I never read those comments.
Some comments in that vein, not usually with formatting that extreme, actually exist to help tools like JavaDoc and Doxygen generate documentation for your code. This, I think, is a good form of comment, because it has both a human- and machine-readable format for documentation (so the machine can translate it to other, more useful formats like HTML), puts the documentation close to the code that it documents (so that if the code changes, the documentation is more likely to be updated to reflect these changes), and generally gives a good (and immediate) explanation to someone new to a large codebase of why a particular function exists.
Otherwise, I agree with everything else that's been stated. Comment why, and only comment when it's not obvious. Other than Doxygen comments, my code generally has very few comments.

Another type of comment that is generally useless is:
// Commented out by Lumpy Cheetosian on 1/17/2009
...uh, OK, the source control system would have told me that. What it won't tell me is WHY Lumpy commented out this seemingly necessary piece of code. Since Lumpy is located in Elbonia, I won't be able to find out until Monday when they all return from the Snerkrumph holiday festival.
Consider your audience, and keep the noise level down. If your comments include too much irrelevant crap, developers will just ignore them in practice.
BTW: Javadoc (or Doxygen, or equiv.) is a Good Thing(tm), IMHO.

I also use comments to document where a specific requirement came from. That way the developer later can look at the requirement that caused the code to be like it was and, if the new requirement conflicts with the other requirment get that resolved before breaking an existing process. Where I work requirments can often come from different groups of people who may not be aware of other requirements then system must meet. We also get frequently asked why we are doing a certain thing a certain way for a particular client and it helps to be able to research to know what requests in our tracking system caused the code to be the way it is. This can also be done on saving the code in the source contol system, but I consider those notes to be comments as well.

Reminds me of
Real programmers don't write documentation

I wrote this comment a while ago, and it's saved me hours since:
// NOTE: the close-bracket above is NOT the class Items.
// There are multiple classes in this file.
// I've already wasted lots of time wondering,
// "why does this new method I added at the end of the class not exist?".

Related

Download OEIS sequences with known algorithm to produce them

I was reading some interesting questions about the topic "Can we make a program that, given a particular sequence, produces the next terms", like this one, and I really like the detailed answer of this one. I understand that the answer is "That's impossible without more restrictions", and that given some restrictions (polynomials, rational function or boolean map) we know some good algorithms, as the second answer I linked explains.
Now, a natural question is how much can we solve, trying our best even if we can't always solve it, to answer the original, general question. What I usually do when facing a hard sequence is trying to see if it's in OEIS, and if it seems to be there, seeing if there is any formula or algorithm to produce it in there. You can download a small version of OEIS with the first terms of each sequence, and you can make queries to find formulas or maple algorithms for a particular sequence. My question is, do you think it's feasible to download a small version of OEIS that includes, with the first terms, a little algorithm to produce it?
The natural problem here is that I haven't seen any link to download the entire database of OEIS with all the details, which maybe deserves its own question. Even if we had this, you need to read the formulas/algorithms (that can be written in different languages, from what I've seen) and interpret them correctly. But I thought maybe someone here knows how to solve this, in any case thanks in advance.
You could, as you note, download the sequences and their A-numbers from the link mentioned here: https://oeis.org/wiki/Welcome#Compressed_Versions
After searching that and finding one sequence (or a small number of sequences) of interest, you could scrape the respective page(s) for formulas. There are specific fields for Maple and Mathematica, which may be helpful, and otherwise, an entry in the PROGRAM field should include identifying information when it is not one of the standard languages with its own field in the database. See: http://oeis.org/wiki/Style_Sheet
Unofficially, but with the interests of the OEIS in mind, I would not recommend trying to download or scrape the OEIS in its entirety. Whether it's one person, or a whole host of people, we would certainly recommend using the compressed version of the database to identify sequences of interest by A-number first, then pulling their entire entry by scraping the site or querying the OEIS using methods that you have already mentioned: Programmatic access to On-Line Encyclopedia of Integer Sequences
If this sounds laborious, perhaps an alternative is the Wolfram Cloud, which actives this through other means. For example, you can navigate to the cloud (you may have to register just to get access) at: https://www.wolframcloud.com/
Typing in something like FindSequenceFunction[{1, 2, 3, 5, 17, 305, 34865}] will give you a formula, if Wolfram/Mathematica can find one. The documentation for FindSequenceFunction can be found here: https://reference.wolfram.com/language/ref/FindSequenceFunction.html
Wolfram/Mathematica can also invoke the OEIS using packages like the one described here: https://mathematica.stackexchange.com/questions/40/is-it-possible-to-invoke-the-oeis-from-mathematica

Is SQL Injection/XSS attack possible with preg_replace?

I have done some research on how injection/XSS attacks work. it seems like hackers simply make use of the USER INPUT fields to input codes.
However, suppose I restrict every USER INPUT fields with only alphanumerics(a-zA-Z0-9) with preg_replace, and lets assume that I use the soon-to-be-deprecated my_sql instead of PDO or my_sqli.
Would hackers still be able to inject/hack my website?
Thanks!
Short version: Don't do it.
Long version:
Suppose you have
SELECT * FROM my_table WHERE id = $user_input
If this happens, then some inputs (such as CURRENT_TIMESTAMP) are still possible, though the "attack" would be limited to the point of probably being harmless. The solution here could be to restrict the input to [0-9].
In Strings ("$user_input"), the problem shouldn't even exist.
However:
You have to make sure you implement your escape function correctly.
It is incredibly annoying for the end user. For instance, if this was a text field, why aren't white spaces allowed? What about á? What if I want to quote someone with ""? Write a math expression with < (or even write something apparently harmless such as i <3 u)?
So now you have:
A homebrew solution, which has to be checked for correctness (and may have bugs, as any other function). Bugs in this function are potential security issues;
A solution which is unfamiliar to other programmers, who have to get used to it. Code without the usual escape functions is usually wrong code, so it's masssively surprising;
A solution that's fragile. What if someone else modifies your code and forgets to add the validation? What if you forget the validation?
You are focusing on solving a problem that's already been solved. Why waste time doing something that takes time to develop and is hard to maintain when others have already developed proper solutions that take close to no effort to use.
Finally, don't use deprecated APIs. Things are deprecated for a reason. Deprecated can mean stuff like "we'll drop support at any minute" or "this is has severe issues but we can't fix it for some reason".
Deprecated APIs are supposed to be used by legacy applications of developers that did not have enough time or resources to migrate. When starting from scratch, use the supported APIs.

Modelica - how to implement a constructor for a record

What is the best way to implement a constructor for a record? It seems like a function should be able to return a record object in the instantiation of the record in some later model higher up the tree, but I can't get that to work. For now I just use a bunch of parameters at the top of the record that populate the variables stored in the record, but it seems like that will only work in simple cases.
Can anyone shed a little light? Perhaps I shouldn't be using a record but a model. Also does anyone know how the PDE functionality is coming? The book only says that it is coming, but I have seen some other things around.
I don't seem to have the clout to add tags (which makes sense, since my "reputation" is lower than yours) so sorry about that. I thought I had actually added one at one point, but perhaps I am mistaken.
I think you need to be clear what you mean by constructor since it has a very specific meaning in Modelica. If I understand your question correctly, it sounds like what you want to do is create an instance of a record that has some fields that are specified in the constructor arguments and from those arguments a bunch of other fields in the record are computed. Is that correct?
If so, there is a mechanism to do this. You mention "the book" but it isn't clear which one you mean. If it is mine, it definitely has no mention of these so called "record constructors" because it is too old. I do not know if Peter Fritzson's book mentions them either. However, they do exist and are documented in Section 12.6 of the Modelica 3.2 specification.
As for PDEs, there has been work into this kind of thing but nothing has really been done within the design group on this topic. I would add that if you want to solve either elliptical or parabolic PDEs on regular grids, this isn't too hard even with the current language. The only real drawback is that most tools probably don't handle sparsity very efficiently. Irregular grids would also be possible, but then you get into complicated basis functions. Finally, hyperbolic PDEs are, in my opinion, quite tricky (in any environment) due to the implicit physical constraints between time and space which are difficult to express (i.e. the CFL condition).
I hope that answers your questions so far.
I can only comment on your question regarding the book of Peter Fritzson. He confirmed that he's working on an update and he hopes to get it ready 'in the course of 2011'.
Original post here:
http://openmodelica.org/index.php/forum/topic?id=50
And thanks for initiating the modelica tag, I might be useful in the near future for me too... :-)
regards,
Roel

When to join name and when not to?

Most languages give guidelines to separate different words of a name by underscores (python, C etc.) or by camel-casing (Java). However the problem is when to consider the names as separate. The options are:
1) Do it at every instance when separate words from the English dictionary occur e.g. create_gui(), recv_msg(), createGui(), recvMsg() etc.
2) Use some intuition to decide when to do this and when not to do this e.g. recvmsg() is OK, but its better to have create_gui() .
What is this intuition?
The question looks trivial. But it presents a problem which is common and takes at least 5 seconds for each instance whenever it appears.
I always do your option 1, and as far as I can tell, all modern frameworks do.
One thing that comes to mind that just sticks names together is the standard C library. But its function names are often pretty cryptic anyway.
I'm probably biased as an Objective-C programmer, where things tend to be quite spelled out, but I'd never have a method like recvMsg. It would be receiveMessage (and the first parameter should be of type Message; if it's a string, then it should be receiveString or possibly receiveMessageString depending on context). When you spell things out this way, I think the question tends to go away. You would never say receivemessage.
The only time I abbreviate is when the abbreviation is more clear than the full version. createGUI is good because "GUI" (gooey) is the common way we say it in English. createGraphicalUserInterface is actually more confusing, so should be avoided.
So to the original question, I believe #1 is best, but coupled with an opposition to unclear abbreviations.
One of the most foolish naming choices ever made in Unix was creat(), making a nonsense word to save one keystroke. Code is written once and read many times, so it should be biased towards ease of reading rather than writing.
For me, and this is just me, I prefer to follow whatever is conventional for the language, thus camelCase for Java and C++, underscore for C and SQL.
But whatever you do, be consistent within any source file or project. The reader of your code will thank you; seeing an identifier that is inconsistent with most others makes the reader pause and ask "is something different going on with this identifier? Is there something here I should be noticing?"
Or in other words, follow the Principal of Least Surprise.
Edit: This got downmodded why??
Just follow coding style, such moments usually well described.
For example:
ClassNamesInCamelNotaionWithFirstLetterCapitalized
classMethod()
classMember
CONSTANTS_IN_UPPERCASE_WITH_UNDERSCORE
local_variables_in_lowercase_with_underscores

What techniques are available for memory optimizing in 8051 assembly language?

I need to optimize code to get room for some new code. I do not have the space for all the changes. I can not use code bank switching (80c31 with 64k).
You haven't really given a lot to go on here, but there are two main levels of optimizations you can consider:
Micro-Optimizations:
eg. XOR A instead of MOV A,0
Adam has covered some of these nicely earlier.
Macro-Optimizations:
Look at the structure of your program, the data structures and algorithms used, the tasks performed, and think VERY hard about how these could be rearranged or even removed. Are there whole chunks of code that actually aren't used? Is your code full of debug output statements that the user never sees? Are there functions specific to a single customer that you could leave out of a general release?
To get a good handle on that, you'll need to work out WHERE your memory is being used up. The Linker map is a good place to start with this. Macro-optimizations are where the BIG wins can be made.
As an aside, you could - seriously- try rewriting parts of your code with a good optimizing C compiler. You may be amazed at how tight the code can be. A true assembler hotshot may be able to improve on it, but it can easily be better than most coders. I used the IAR one about 20 years ago, and it blew my socks off.
With assembly language, you'll have to optimize by hand. Here are a few techniques:
Note: IANA8051P (I am not an 8501 programmer but I have done lots of assembly on other 8 bit chips).
Go through the code looking for any duplicated bits, no matter how small and make them functions.
Learn some of the more unusual instructions and see if you can use them to optimize, eg. A nice trick is to use XOR A to clear the accumulator instead of MOV A,0 - it saves a byte.
Another neat trick is if you call a function before returning, just jump to it eg, instead of:
CALL otherfunc
RET
Just do:
JMP otherfunc
Always make sure you are doing relative jumps and branches wherever possible, they use less memory than absolute jumps.
That's all I can think of off the top of my head for the moment.
Sorry I am coming to this late, but I once had exactly the same problem, and it became a repeated problem that kept coming back to me. In my case the project was a telephone, on an 8051 family processor, and I had totally maxed out the ROM (code) memory. It kept coming back to me because management kept requesting new features, so each new feature became a two step process. 1) Optimize old stuff to make room 2) Implement the new feature, using up the room I just made.
There are two approaches to optimization. Tactical and Strategical. Tactical optimizations save a few bytes at a time with a micro optimization idea. I think you need strategic optimizations which involve a more radical rethinking about how you are doing things.
Something I remember worked for me and could work for you;
Look at the essence of what your code has to do and try to distill out some really strong flexible primitive operations. Then rebuild your top level code so that it does nothing low level at all except call on the primitives. Ideally use a table based approach, your table contains stuff like; Input state, event, output state, primitives.... In other words when an event happens, look up a cell in the table for that event in the current state. That cell tells you what new state to change to (optionally) and what primitive(s) (if any) to execute. You might need multiple sets of states/events/tables/primitives for different layers/subsystems.
One of the many benefits of this approach is that you can think of it as building a custom language for your particular problem, in which you can very efficiently (i.e. with minimal extra code) create new functionality simply by modifying the table.
Sorry I am months late and you probably didn't have time to do something this radical anyway. For all I know you were already using a similar approach! But my answer might help someone else someday who knows.
In the whacked-out department, you could also consider compressing part of your code and only keeping some part that is actively used decompressed at any particular point in time. I have a hard time believing that the code required for the compress/decompress system would be small enough a portion of the tiny memory of the 8051 to make this worthwhile, but has worked wonders on slightly larger systems.
Yet another approach is to turn to a byte-code format or the kind of table-driven code that some state machine tools output -- having a machine understand what your app is doing and generating a completely incomprehensible implementation can be a great way to save room :)
Finally, if the code is indeed compiled in C, I would suggest compiling with a range of different options to see what happens. Also, I wrote a piece on compact C coding for the ESC back in 2001 that is still pretty current. See that text for other tricks for small machines.
1) Where possible save your variables in Idata not in xdata
2) Look at your Jmp statements – make use of SJmp and AJmp
I assume you know it won't fit because you wrote/complied and got the "out of memory" error. :) It appears the answers address your question pretty accurately; short of getting code examples.
I would, however, recommend a few additional thoughts;
Make sure all the code is really
being used -- code coverage test? An
unused sub is a big win -- this is a
tough step -- if you're the original
author, it may be easier -- (well, maybe) :)
Ensure the level of "verification"
and initialization -- sometimes we
have a tendency to be over zealous
in insuring we have initialized
variables/memory and sure enough
rightly so, how many times have we
been bitten by it. Not saying don't
initialize (duh), but if we're doing
a memory move, the destination
doesn't need to be zero'd first --
this dovetails with
1 --
Eval the new features -- can an
existing sub be be enhanced to cover
both functions or perhaps an
existing feature replaced?
Break up big code if a piece of the
big code can save creating a new
little code.
or perhaps there's an argument for hardware version 2.0 on the table now ... :)
regards
Besides the already mentioned (more or less) obvious optimizations, here is a really weird (and almost impossible to achieve) one: Code reuse. And with Code reuse I dont mean the normal reuse, but to a) reuse your code as data or b) to reuse your code as other code. Maybe you can create a lut (or whatever static data) that it can represented by the asm hex opcodes (here you have to look harvard vs von neumann architecture).
The other would reuse code by giving code a different meaning when you address it different. Here an example to make clear what I mean. If the bytes for your code look like this: AABCCCDDEEFFGGHH at address X where each letter stands for one opcode, imagine you would now jump to X+1. Maybe you get a complete different functionality where the now by space seperated bytes form the new opcodes: ABC CCD DE EF GH.
But beware: This is not only tricky to achieve (maybe its impossible), but its a horror to maintain. So if you are not a demo code (or something similiar exotic), I would recommend to use the already other mentioned ways to save mem.