Source code repositories and sticky notes - process

An interesting problem occurred recently, and I've been thinking about the "best" way (for a given value of "best") to implement this.
In essence, it's a problem of tracking notes against source code. The example that flagged this was getting a problem fixed in the live environment within SLAs, and how best to achieve this. Without going into all the details, it came down to finding a function, used in a number of places, that may or may not be buggy, yet the problem was being reported in only a single location.
The fix to meet the SLAs was simply to add a check at the location where the problem was reported, rather than tweaking the common code and having to test everything that touches that function.
The interesting issue is then upstreaming. The "correct" method would be to go back and check the original function, validate that it's correct everywhere it's called, and then make the change "properly" if it's determined that the library function is wrong.
The problem is that this takes time, so the upstreamed fix may simply be the workaround, etc. However, if the problem occurs again (say, six months later) in another location calling the same library function, there isn't an easy way to link the two problems together. You can search the bug-tracking database, but this isn't guaranteed to help - it depends on whether a note's been added saying something along the lines of "this library function needs more thorough checking, but no time to investigate now".
So the question is this: within a large team of developers (30-plus, split into both support and ongoing-development teams), what methods do you use to manage (what are effectively) "sticky notes" against source code, short of adding a comment to the suspicious function's source saying "this might be a bit dodgy"?
The problem with committing a comment is one of process: a change is a change, so committing a zero-change change (i.e., one where only comments are added) is not ideal; developers can make mistakes even when adding a comment (hit a stray key or something), so it's always (IMO) better to commit only where actual code changes are made.
Now, a wiki could be used to track per-file notes, but we've got a minimum of four branches and several hundred files (SQL objects, source code, XML files, etc.), so a wiki would get unmanageable quite quickly.
This is the sort of thing it would be nice for SCMs to support: bits of metadata against files that are simply notes and don't add to the SCM's version history, but that can be displayed when doing (say) an svn update, or viewed manually.
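Some SCMs get part of the way there already. Subversion has arbitrary versioned properties (so attaching one still creates a revision, which only approximates the "no version history" wish), and Git can attach free-form notes to commits outside the normal history. A rough sketch, with the property name and note text purely illustrative:

    # Subversion: attach a free-form property to a file (note: committing it creates a revision)
    svn propset review:note "library function needs more thorough checking" src/common/calc.c
    svn propget review:note src/common/calc.c

    # Git: attach a note to the commit that introduced the workaround, without rewriting it
    git notes add -m "workaround only; validate all callers before a proper fix" <commit>
    git notes show <commit>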
There may already be solutions out there -- so how do you manage this type of knowledge sharing?

Well, we're now using this method: in each folder checked into SVN, we've created a .url shortcut (we're developing on Windows) that links to a page on our development wiki about that folder. Thus we can update the wiki info freely, and on checkout/update everyone gets a link that will take them to the appropriate wiki page for that folder/module.
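For reference, a .url shortcut is just a tiny INI-style text file, so it's trivial to generate one per folder; ours look roughly like this (the wiki URL is illustrative):

    [InternetShortcut]
    URL=http://devwiki.example.com/wiki/Trunk/CommonLib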
We've only recently instigated it, so we'll have to see how well it works long term -- but it's better than what we had before (i.e., nothing :-) ).


Is there a way to save all feasible scores found?

I'm building a student schedule generator and I need a way of producing more than one solution. Is there some way to save off feasible scores or scores of Xhard/Ysoft?
I need to be able to output more than one potential schedule, so that the student has a choice of one schedule over another if, for whatever reason, they don't want the "best" schedule (maybe they don't like one of the professors, maybe they don't want an 8 am class, whatever).
My original idea was to save off all feasible solutions using the bestSolutionChanged event listener. The problem with this is that once it finds a 0hard/0soft score, it ignores all scores after that, including scores that are equal.
Ideally I'd like to save off all scores of 0hard/-3soft or better, but just being able to save any feasible scores or force OptaPlanner to look for a new best score would be useful as well.
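For reference, the listener approach I tried looks roughly like this (Schedule is my planning solution class; the names and the exact generics are illustrative, as they vary a bit between OptaPlanner versions):

    import java.util.ArrayList;
    import java.util.List;
    import org.optaplanner.core.api.score.buildin.hardsoft.HardSoftScore;
    import org.optaplanner.core.api.solver.Solver;

    public class FeasibleSolutionCollector {

        // Schedule is a hypothetical @PlanningSolution class whose getScore()
        // returns a HardSoftScore.
        private final List<Schedule> feasibleSolutions = new ArrayList<>();

        public void attach(Solver<Schedule> solver) {
            solver.addEventListener(event -> {
                Schedule newBest = event.getNewBestSolution();
                HardSoftScore score = newBest.getScore();
                if (score.isFeasible()) {            // 0hard or better
                    feasibleSolutions.add(newBest);  // the event carries a planning clone, safe to keep
                }
            });
            // Caveat (my actual problem): bestSolutionChanged only fires on a strictly
            // better score, so solutions that merely tie the current best never arrive.
        }

        public List<Schedule> getFeasibleSolutions() {
            return feasibleSolutions;
        }
    }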
This is not a solution, but an analysis of the problem:
Hacking the BestSolutionRecaller is obviously not just a big pain, it's also behaviour we don't want to encourage, as it makes upgrading to a newer version an even bigger pain. So don't expect us to solve this by adding an easy way to configure it in the solver config any time soon. That being said, a solution for this common problem is clearly needed.
When a new best solution is found, it is planning cloned (see the docs for a definition) from the working solution (the internal solution in OptaPlanner). This allows us to remember that new best solution as the working solution changes. It also means each BestSolutionChangedEvent gets a planning clone and can safely ship it to another thread, for example to marshal it to a client (presuming any ProblemFactChanges you create do copies instead of alterations), without it being corrupted by the solver thread that modifies the working solution.
A new best solution implies that workingScore > bestScore. If we instead used workingScore >= bestScore, we'd need far more planning clones (which are a bit CPU expensive), but we could then also send out BestSolutionChangedEvents for ties - if and only if a flag is enabled, of course, because most users (unlike yourself) don't want this behaviour.
One proposal is to create a separate BestSolutionChangedOrSameEvent, next to the BestSolutionChangedEvent. This might not be ideal, because we need to be able to detect whether or not someone needs those extra planning clones.
Another proposal is to just have a flag in the <solver> config that switches from > to >= behavior for BestSolutionChangedEvent.
Please create a jira (see "get help" on the webpage) and link it here, or create a support ticket (also see "get help" on the webpage).

How to quickly analyse the impact of a program change?

Lately I need to do an impact analysis on changing a DB column definition of a widely used table (like PRODUCT, USER, etc.). I find it a very time-consuming, boring, and difficult task. I would like to ask if there is any known methodology for doing so.
The question also applies to changes to the application, file system, search engine, etc. At first, I thought this kind of functional relationship should be pre-documented or somehow kept tracked, but then I realized that since everything can change, it would be impossible to do so.
I don't even know what tags this question should have; please help.
Sorry for my poor English.
Sure. One can at least technically know what code touches the DB column (reads or writes it), by determining program slices.
Methodology:
Find all SQL code elements in your sources, and determine which ones touch the column in question. (Careful: SELECT * may touch your column, so you need to know the schema.)
Determine which variables read or write that column. Follow those variables wherever they go, and determine the code and variables they affect; follow all those variables too. (This amounts to computing a forward slice.)
Likewise, find the sources of the variables used to fill the column; follow them back to their code and sources, and follow those variables too. (This amounts to computing a backward slice.)
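To make the slicing idea concrete, here is a tiny hypothetical example (all names invented) of chasing a column's value forward through the code:

    import java.sql.*;

    class ForwardSliceExample {
        // Forward slice starting from reads of PRODUCT.UNIT_PRICE.
        static void report(Connection conn) throws SQLException {
            try (Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT unit_price FROM PRODUCT")) {
                while (rs.next()) {
                    double price = rs.getDouble("unit_price"); // in the slice: reads the column
                    double discounted = price * 0.9;           // in the slice: depends on price
                    print(discounted);                         // in the slice: follow into print()
                    int users = countUsers();                  // NOT in the slice: no data flow from the column
                }
            }
        }
        static void print(double v) { System.out.println(v); }
        static int countUsers() { return 0; } // stand-in; unrelated to the column
    }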
All the elements of the slice potentially affect, or are affected by, a change. There may be conditions in the slice-selected code that are clearly outside the conditions expected by your new use case, and you can eliminate that code from consideration. Everything else in the slices you may have to inspect/modify to make your change.
Now, your change may affect some other code (e.g., a new place that uses the DB column, or code that combines the value from the DB column with some other value). You'll want to inspect upstream and downstream slices on the code you change, too.
You can apply this process for any change you might make to the code base, not just DB columns.
Done manually, this is not easy in a big code base, and it certainly isn't quick. There is some automation for doing this for C and C++ code, but not much for other languages.
You can get a rough approximation by running test cases that involve your desired variable or action and inspecting the test coverage. (Your approximation gets better if you also run test cases you are sure do NOT cover your desired variable or action, and eliminate all the code they cover.)
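For example, in a Java project that already has the JaCoCo Maven plugin configured (an assumption on my part), running the suite with coverage shows exactly which lines each run executed:

    mvn test jacoco:report
    # then open target/site/jacoco/index.html and inspect the covered lines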
Ultimately this task cannot be fully automated or reduced to an algorithm; otherwise there would be a tool to preview refactored changes. The better the code was written in the beginning, the easier the task.
Let me explain how to reach the answer: isolation is the key. Mapping everything to object properties can help you automate your review.
I can give you an example. If you can manage to map your specific case to the below, it will save your life.
The OR/M change pattern
Like Hibernate or Entity Framework...
A change to a database column can be previewed simply by analysing what code uses the corresponding object property. Since all DB columns are mapped to object properties, and assuming no code uses raw SQL, you are good to go with your estimations.
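As a sketch (entity and column names invented), a JPA/Hibernate mapping makes the column reachable only through one property, so your IDE's "find usages" on that property bounds the impact analysis:

    import java.math.BigDecimal;
    import javax.persistence.Column;
    import javax.persistence.Entity;
    import javax.persistence.Id;

    @Entity
    public class Product {
        @Id
        private Long id;

        // The only place in the code base that knows the DB column's name.
        @Column(name = "UNIT_PRICE")
        private BigDecimal unitPrice;

        public BigDecimal getUnitPrice() { return unitPrice; }
        public void setUnitPrice(BigDecimal unitPrice) { this.unitPrice = unitPrice; }
    }

Renaming or retyping unitPrice then lets the compiler and the IDE enumerate every affected call site.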
This is a very simple pattern for change management.
In order to reduce a file system, network, or data file issue to the above pattern, you need other software patterns implemented. I mean, if you can reduce a complex scenario to a change in your objects' properties, you can leverage your IDE to detect the changes for you, including code that needs a slight modification to compile or needs to be rewritten entirely.
If you want to manage a change in a remote service when you initially write your software, wrap that service in an interface, so you will only ever have to modify its implementation (see the sketch after this list).
If you want to manage a possible change in a data file format (e.g., a field length change in a positional format, or column reordering), write a service that maps that file to an object (e.g., using the BeanIO parser).
If you want to manage a possible change in file system paths, design your application to take those paths from runtime configuration.
If you want to manage a possible change in cryptography algorithms, wrap them in services (e.g. HashService, CryptoService, SignService)
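Here is a minimal sketch of the service-wrapping idea (the exchange-rate service is invented for illustration):

    import java.math.BigDecimal;

    // Callers depend only on this interface.
    public interface ExchangeRateService {
        BigDecimal getRate(String fromCurrency, String toCurrency);
    }

    // A provider or protocol change stays confined to this one class.
    class RemoteExchangeRateService implements ExchangeRateService {
        @Override
        public BigDecimal getRate(String fromCurrency, String toCurrency) {
            // call the remote API here
            throw new UnsupportedOperationException("omitted from the sketch");
        }
    }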
If you do the above, your manual requirements review will be easier: the overall task is still manual, but it can be aided by automated tools. For instance, you can change the name of a class's property and see the side effects flagged by the compiler.
Worst case
Obviously, if you need to change the name, type, and length of a specific database column in software with plain SQL hardcoded and scattered in multiple places around the code (and, worse, many tables with similar column names), with no project documentation, across 10,000+ classes (I did say worst case, right?), you have no option but to explore the project manually, using find tools but not relying on them.
And if you don't have a test plan (the document from which you can hope to derive a software test suite), it will be time to make one.
Just adding my 2 cents. I'm assuming you're working in a production environment so there's got to be some form of unit tests, integration tests and system tests already written.
If yes, then a good way to validate your changes is to run all these tests again and create any new tests which might be necessary.
And to state the obvious, do not integrate your code changes into the main production code base without running these tests.
Then again, changes which worked fine in a test environment may not work in a production environment.
Have some form of source code management system like Subversion, Git (e.g., GitHub), CVS, etc.
This enables you to roll back your changes.
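For example (revision number and commit hash illustrative):

    # Subversion: reverse-merge the change committed in r1234, then commit
    svn merge -c -1234 .
    svn commit -m "Roll back r1234"

    # Git: create a new commit that undoes an earlier one
    git revert <commit>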

Using VBA to manage multiple styles definitions in the same Word document

TL;DR: I have a game plan on how to do this below; however, I am wondering if my plan is going to prove too complicated, and what additional considerations I need to take into account before diving into building this project. Although I am not an experienced programmer, I am NOT asking for code; I am asking for feedback from experienced Word VBA programmers as to whether my entire idea/approach is one huge mistake.
I have a document "template" (not yet a template file type - I hope to create that as described below) for a report. The report is broken up into different sections:
Letter to the Client
Table of Contents
Section I
Title Page
Body
1.0
2.0
Section II
Title Page
Body
1.0
2.0
Appendix A
Title Page
Body
Appendix B
Title Page
Body
I want each major "metasection" (such as Letter, Section I, Section II, Appendices) to have different styling and formatting. This could be accomplished by having multiple styles for each metasection, e.g.:
Normal-Letter
Normal-SectionI
Normal-Appendices
Heading1-Letter
Heading1-SectionI
Heading1-Appendices
This would quickly become unmanageable.
In order to avoid users having to wade through a huge number of styles to find the correct one (and it is worth noting that if users of this report have to do this, they will likely not use styles AT ALL), it would be nice if I could have the same style name (e.g., Normal) mean something different depending on which section of the document it is found in. Or, said another way, I would like a document to have multiple style sets, varying by section.
The goals for the user experience are:
The user simply applies the Normal style, Heading1 style, etc., as necessary.
Registered section-specific style definitions are updated when styles are edited via the Modify Style dialog box, or other ways.
The styles are applied automatically and transparently when styles are changed, or when the document is opened, saved, or printed.
ALTERNATIVE: If automatic/transparent style application proves too difficult, execute the style-application routine with a simple command button.
My initial idea on how I might do this in VBA is:
Write VBA code (probably a class) such that there is a style registry of Normals and Heading1s, etc., for each document section.
Write a style-application subroutine which iterates through the registered document sections, selects all the parts with each registered style, and applies the section-specific style from the style registry (preserving any styling that deviates from the style definition).
Write a style-update subroutine that automatically and transparently updates the registered style definitions.
The style-application subroutine executes any time styling is applied anywhere in one of the registered sections (so I'll need to tie into Events here).
The style-update subroutine executes any time a style definition in a registered section is changed (so here's another Event I'll need to monitor).
I previously asked a similar question about this topic on Super User. The feedback I received has led me to believe that I can only accomplish the behavior I want using VBA, so I am now asking a follow-up question here on Stack Overflow.
My question is: am I making a mistake here? I have a feeling there is a better way to solve this problem (perhaps using VBA, perhaps not) than this.
Yes, in my opinion, you are making a mistake.
I have just recently finished a project where I have created a document template for a company. My experiences:
Users vary in knowledge level (obviously)
High-level users don't like over-engineered files: they can't use their own macros, as these might conflict with the file's own macros; and they can't use their own doc properties or building blocks, etc., as these likely won't be compatible with the macros (or at least they think they won't work, and they fiddle around until they actually manage to break them)
Low-level users are intimidated by the automation and avoid it as long as they can (i.e., as long as their bosses don't order them to use the file), after which point they start hating the file and the work
Complex solutions like this one usually get abandoned after a few years. E.g., the original developer changes jobs or moves to another department, and nobody understands the code well enough to keep managing it (especially if it is not well-documented, well-written code, which it won't be, as you are not an experienced VBA programmer).
The developer (you) will be inundated with (sometimes false) bug reports and questions and minor change requests, which gets really annoying after a few weeks (trust me on that :) ). They won't dare change even a font size without consulting you, and in the end, they will ask you to do it. Or, even worse, they try to change something, break it, and then tell you to fix your bug.
Your users would have to remember to use section breaks or other kinds of indicators to mark the next section. This will seem like too much for some, too complex, and if they accidentally remove a section indicator (which they inevitably will), all hell breaks loose. And worst of all:
The Undo function will be disabled after each macro run. This, to most users, is a disaster. You don't do that to your users.
So I would say don't go down the macro route. Don't use doc properties either; that didn't work at the company I was working with (actually an IT company, with mostly high-level users :) ). The high-level users will create and use their own doc properties; for others, it is just a hassle. Bookmarks get deleted constantly, so they are a no-go too.
My advice:
Use styles. Users will learn to use them quickly.
Get a decent document design. Having 4 different sets of title, heading, and normal styles in one document is really unprofessional. Consistency is important, especially as this seems to be a letter to your clients. (Yes, I know, your company is different and your bosses are dumb and this is a special case and and and... Just saying: talk to a designer, and get a professional look for your template.)
You can manage the Style gallery (Home tab, centre) drop-down list on a per-template basis - so your template will load the used styles into the drop-down at the top and remove everything else. This works really well, and even as many as 20 styles are manageable if they are well named.
Use building blocks: title pages, tables, pre-written and formatted Quick parts (legal mumbo-jumbo, company introduction, contacts, etc.), headers and footers...
And, if you want happy-happy and cooperative users:
After creating a blank template, create a full template:
Fill the document template with text: pre-written paragraphs and pre-written titles, so users will only have to click and rewrite, without the need to format or bother with styles and cover pages and the lot
Educate the users: two sessions of a 1.5-hour Word class can go a long way. It is a must.
Long post. One last thing: creating a complex Word template, you will be sailing a sea of Word bugs and annoyances. Even without writing macros, this won't be a walk in the park. (I, for example, gave up on making my TOC work in Office 2013: after 3 days and 10 versions, it still kept creating a maximum-sized extra paragraph whenever it was inserted. Only in Word 2013. Still no idea why, but I let it go.)
Whatever you decide to do, best of luck, and have a lot of patience! :)

Does opengrok really require a separate staging directory?

In the sample installation and configuration instructions, it is suggested that OpenGrok requires two staging areas, the rationale being that one area is an index-regeneration work area and the other is a production area, and the two are rotated on every index regeneration.
Is that really necessary? Can I have only one area instead of two?
I'm looking for an answer that is specific to OpenGrok, not a general list of race conditions one might encounter.
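For reference, my reading of the suggested two-area scheme is roughly the following (paths abridged and illustrative; the indexer's -s and -d options take the source root and data root):

    # regenerate the index into a fresh data root ...
    java -jar opengrok.jar -s /opengrok/src -d /opengrok/data.new ...
    # ... then swap a symlink to publish it atomically
    ln -sfn /opengrok/data.new /opengrok/data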
Strictly speaking, this is not necessary. In fact, I am pretty sure the overwhelming majority of deployments run without a staging area.
That said, you need to decide whether you are comfortable with a window of inconsistency that could result in some failed/imprecise searches. Let's assume that the source was updated (e.g., via git pull in the case of Git) and the indexer has not finished processing the new changes yet. The index thus still reflects the old state of the source. Say the changes applied to the source removed a file: if someone now initiates a search that matches the contents of the removed file, the search will probably end with an error. That is actually the better alternative - consider the case when a more subtle change is made to a file, such as the removal/addition of a couple of lines of code. In that case the symbol definitions will be off, so the search results will take you to the wrong line of code. Or, for a not-so-subtle change such as a function definition being removed from a file, the search results for references to that function will point to invalid places.
The length of the inconsistency window stems from the indexing time, which is largely dependent on two things, at least currently:
size of the changes applied to the source
size of the source directory tree
The first is relevant because of history processing. The more incoming history changes (e.g., changesets in Git), the more work the indexer has to do to generate the history cache and/or history fields for the index (assuming history handling is on).
The second is relevant because the indexer traverses the whole source directory tree to find out which files have changed, which can incur lots of syscalls and potentially lots of I/O. At least until https://github.com/oracle/opengrok/issues/3077 is implemented - and that will only help source code management systems based on changesets.

Programmatically check for Access database corruption?

Is there a way to programmatically check for database object corruption in Access 2003?
My development project has gotten complex enough that it's hard to manually check all the objects after a day of programming to see if some small control, form, report, query, or code object has been corrupted somehow. I already have the data split off into a separate SQL Database stored on another machine, and this project is merely a front-end application to work with the data.
Mostly an academic musing, as I just don't want to get that far and then have corruption put me back several weeks because some seldom-used object got corrupted way back when.
Any ideas out there? Thanks in advance for any pointers!
EDITED 12/03/2009 @ 11:51
Sadly, I can only accept one answer - though I got a few very good ones, thank you for all the pointers!
You might like to look at: Is it possible to programmatically detect corrupt Access 2007 database tables?
I am inclined to keep a copy of important databases at each compact & repair and to compare the new database against the previous one. You can also check for non-standard characters.
Neither Compact/Repair nor Decompile/Recompile catches all corruption problems, although you should be doing this anyway.
I use a function to export all container documents (and QueryDefs) using SaveAsText into a date/time-stamped folder, and I use it regularly throughout the day. If I suspect any corruption, I create a new .mdb and use LoadFromText to recreate the objects.
Proper compilation practices will prevent corruption of the VBA project (which is what you're talking about here).
That entails:
use OPTION EXPLICIT in all modules.
turn off COMPILE ON DEMAND in the VBE options.
compile your code regularly, while working.
periodically (e.g., once a day after a full day of coding) decompile and recompile the code.
If you do this, you'll never encounter corruption in the first place, so you won't need to test for it (which is impossible anyway).
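For reference, the decompile step is done by launching Access with the long-standing (though unsupported and undocumented) /decompile switch; paths are illustrative:

    "C:\Program Files\Microsoft Office\OFFICE11\MSACCESS.EXE" /decompile "C:\dev\frontend.mdb"

Afterwards, recompile the code and compact the database again.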