Database documentation and reverse engineering [closed]

Database documentation and reverse engineering [closed] - sql

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
I know this question has been asked in various forms over and over again but I cannot find a single complete answer and I believe it is a general problem in the RDBMS/data area and industry. To explain the problem I will tell you a short (and maybe boring) story!
The Story
"I have a friend" who works in A company that uses 100+ systems. The scale and size of these systems vary from full-blown ITIL to custom/in-house, single purpose, LAMP/SQLite/CSV-based solutions. The vast majority of these systems, at one point or another use a database/data-store... Big-data has now become a trend, and A company though it is a very good idea to keep (or log) historical data from all systems forever! For that reason they have built a "warehouse". My friend is responsible for writing software that will do the analysis on some of this data ... however, he is kinda confused. There are thousands of tables in that warehouse containing data from the beginning of time (1970s I think :)).
The Problem
[Since I started telling you about this guy, I should probably continue]
My friend is very upset because of the lack of documentation in that warehouse. It seems that no-one knows what is what?! Few of the problems he faces (and I quote):
Man, some fields are constants... they have a special meaning to the application but I have no way of knowing? But that is OK... cause some other fields are bit-masks! Different bit values in the field have different meaning!
and he continues...
That's not all... these are the easy cases you know... Since we have data from multiple systems, we end up with a situation where different systems refer to the same thing in a different way... how can I explain it to you... for example, a network device has an FQDN, however some systems treat it as the primary key, some others don't and instead they allocate an auto-increment integer value, which in turn they use for foreign keys (you know... referencing this device).
and he can go on forever!
The Question
[Yes, it is one question]
He says:
We have come a long way regarding documentation in the software world... we have started with documents, moved to wikis and concluded to inline docblocks serving both as parameter/signature documentation and as wiki! We can auto-generate documentation, clear enough that a person in the other hemisphere and side of this world can easily follow!
and he continues:
... in the data side of things, we also had major achievements! Storage methods, serialization, transmission and data analysis techniques have evolved tremendously... We have also managed to map database tables into objects and in some cases we can even represent relationships!
So why the frell don't we have a standard method/technique of documenting our data structures in an RDBMS?
... he concluded :)
Enough with my friend, so my comments:
I know about comments on fields in various systems, but that is usually enough for a "deprecated" and not for an explanation
Updating a wiki or even worse a document, every time you release a database patch is not a solution... that patch should contain the relevant documentation!
ER diagrams can be easily generated based on the schema information, however this is not the easiest form of documentation to read... for anything more than 10 tables!
There is the saying (please comment if you know who said this! - respect)
Documentation is like sex: when it is good, it is very, very good; and when it is bad, it is better than nothing
Why SQL doesn't provide the means to any?

Any kind of documentation would get stalled if not maintained.
Also, the SQL world provides all kind possibilities to document things:
comments in SQL files
comments in columns/tables metadata
as you said - E/R diagrams
the classic way of documenting stuff - docs and wikis
good discipline in adhering to an intuitive naming scheme for the things in DB - I think this should be the standard
We have all the tools we need, we just have to convince our managers to let us write the docs (lol)

Related

do you rely on your memory or consult references and use a lot of intellisense? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
I have noticed I do not code as much as I use to. Today I dedicate more time to analysis and design, then I communicate that design to programmers. Then they do the coding. This has affected my coding productivity, because I must consult references and rely on intellisense. Things are becoming more complex everyday
Now, here is the irony. If I were to hire a programmer and ask him/her to sit in front of a computer, I may ask to do some coding and I would check abilities. I would evaluate them based on their use of memory vs. consulting references. Maybe I will prefer that programmer who did not consult too much, but who knows what they are doing.
What is your opinion and experience?

I would say that a developer who knows how to find the answers is better than one who has an overall good knowledge already. I find that intellisense is a good tool for finding answers, besides it is too much to remember all method names, arguments, overloads, etc.

I use memory to get me into the right general area (e.g. knowing which classes to use or at least which namespace they'll be in) and then often Intellisense/MSDN for the exact method name or arguments to use.
Having said that, Stack Overflow is improving my ability to code without any references (or even compilation) - I'm sure code will just work out of the box for me more often now than it used to. (I tend to post and then check the code works, add links to MSDN etc - assuming I'm reasonably confident in the approach.)

Someone knowing what resources are available, and how to find the answers, and how to effectively debug - these are qualities I look for now in prospective employees.
I used to consult my memory only, but two things have happened:
Class libraries have gotten larger, so has the number of languages available
The ratio of programming-related memory to personal-life-related memory has shifted away from code
Programming today is also eight times harder than it was when I started. I used to work on 8-bit machines, now I'm working on 64-bit ones. :)

I once was at a job interviewed with the CTO of a company. He asked a question based on a real life problem the company had a while back and solved. It was a multi step problem.
I was standing in front of a whiteboard working through my solution and struggling through a particular part, a part I would use google for before even attempting it, had I been tasked with solving this problem for real instead of for an interview. He asked me at that point, "would you do anything different if this wasn't an interview question." I responded, "Yes. I would exhaust all possibilities of using a third party component for this part of the task and look up the solution, because it is a well defined problem thats been solved several times." There was a bit more discussion where I justified my answer, explained exactly what I would research, and I solved some other parts of the question. In the end I was offered and accepted the job, partly because of knowing how to find out what I didn't know.
Being able to use references is as important as being able to code from memory. Obviously, if you are a one language shop, and want people proficient in that language,the person should be able to write a complete hello world app in notepad. Interview problems should focus on small problems, and one should not worry about small syntax errors. This is why a whiteboard is the best IDE for interview questions.
Unless you demand all your coders use notepad and don't give them internet access, don't be as concerned by the syntax. If you do sit them down in front of a computer, worry about the finished product as well as the technique used to get there.

I'm a PHP programmer in my early 30's. I rely on PHP's excellent documentation, for several reasons:
Programming concepts don't change. If I know what my object models are and how I want to manipulate data, then there's dozens of ways to implement the details. The details are important, but a better grasp of the design and structure is more important
PHP has notoriously inconsistent functions. One string function might use ($needle,$haystack) as parameters, and another might use ($haystack,$needle). Trying to keep them straight isn't worth the hassle when you can just type php.net/function_name and get the reference.
I don't rely on intellisense, simply because I haven't found a decent IDE for PHP that does it well. Eclipse is ok, but it's not fantastic. Netbeans gives me 'PHPDoc not found' for all the built-in PHP functions whenever I install it. There's nothing that I've found so far that beats out the documentation.
The bottom line is that the ability to memorize functions isn't indicative of coding ability. Obviously there's a key set of basic functions that a good programmer will know just from extensive usage over time, but I wouldn't base a hiring decision on whether someone knows substr_replace vs. str_replace from memory.

Because I've read either the documentation, or articles, or a book on a subject, the things I learn on a topic are organized. The result is that, if I can't bring something up from memory, I can probably find it quickly through IntelliSense or the Object Browser.
Worse come to worst, I can pick up the book again; something these youngsters are not being taught to do.
John Saunders
Age 51

Pretty much Google + Old Projects + my memory (of course)
References will not solve your problems though, its only for the nuts and bolts, the higher level of problem solving is the actual "programming" part IMHO.

I tend to use Intellisense and Resharper much more than I used to before, but this has helped my overall productivity. If I can get the idea of how I want to solve something and then use tools to get the more boring parts like class names and function signatures, why shouldn't I use the tools I have? I feel relieved that Jon Skeet has a similar approach it seems.

I rely on my bookmarks and books... and my ability to use them effectively. I have multiple books above my desk, including a copy of the ISO C90 standard. Moreover, I use Xmarks to have access to my bookmarks wherever I go. Sometimes, I make a pdf out of a particular page and upload it to my web-site if it is important enough.
Sometimes the information provided by the resources I use makes its way into my terrible memory... maybe.

How to hand over a project systematically? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 3 years ago.
Improve this question
We have a project hand over from on shore team to our team (off shore) not long ago. However we were having difficulties for the hand-over process.
We couldn't think of any questions to ask during their design walk-through, because we were overwhelmed by the sheer amount of information. We wanted to ask, but we didn't know what to ask. Since they got no question from us, the management think that the hand-over process was been done successfully.
We had tried to go through all the documentation from our company wiki page before attending the handover presentation, but there are too many documents, we don't even know where to start with.
I wonder, are there any rules or best practices that we can follow, to ensure a successful project hand-over, either from us, or to us.
Thanks.

In terms of reading the documentation, personally I'd go for this order:
Get a short overview of the basic function of the application - what is it meant to achieve. The business case is probably the best document which will already exist.
Then the functional specification. At this point you're not trying to understand any sort of how or technology, just what the app is meant to do. If it's massive, ask them what they key business processes are and focus on those.
Then the high level technical overview. This should include an architecture diagram, required platforms, versions, config and so on. List any questions you have.
Then skim any other useful looking technical documents - certainly a FAQ if there is one, test scripts can be good too as they outline detailed "how to" type scenarios. Maybe it's just me but I find reading technical documents before I've seen the system a waste - it's too academic and they're normally shockingly written. It's certainly an area I'd limit the time I spent on if I didn't feel I was getting a reasonable return for the time I was spending.
If there are several of you arrage structured reviews between you and discuss the documents you've read, making sure you've got what you need to out of it. If the system is big then each take an area and present to the others on it - give yourselves a reason to learn as much as possible and knowing you're going to be quizzed is a good motivator. Make a list of questions where you don't understand something. Having structured reviews between you will focus your minds and make it more of an interactive task, rather than just trawling through page after page of tedious document.
Once you get face to face with them:
Start with a full system demo. Ask questions as they come up, don't let them fob you off with unclear answers - if they can't answer something have it written down and task them with getting the answer.
Now get the code checked out and running on your machines. Do this on at least two machines - one they lead, one you lead. Document the whole process - this is the most important step. If you can't get the code running you're screwed.
Go through the build process. Ensure that you can build the app (including any automated build and unit tests they may have). Note that all unit tests should pass - if they don't or if they say "oh, that one always fails" then they need to fix that before final acceptance.
Go through the install process. Do this at least twice, one they lead, once you lead. Make sure that it's documented.
Now come up with a set of common business functions carried out with the application. Use this to walk the code with them. The code base will be too big to cover the whole thing but make sure you cover a representative sample.
If there is a database or an API do a similar exercise. Come up with some standard data you might need to extract or some basic tasks you might need to carry out using the API and spend some time working through these with them.
Ask them if there's anything they think you should know.
Make sure that any questions you've written down anywhere else are answered.
You may consider it worth going through the bug list (open and closed) - start with the high priority ones and talk through anything particularly worrying looking. Even if they've fixed it it may point at a bit of code which is troublesome.
And finally if the opportunity exists - if there are any outstanding bugs or changes, see if you can pair program a couple.
Do not finally accept the app unless you are 100% sure you can:
Get the code to compile
Get the code to build (including the database)
Get the application installed
Do not accept handover is complete until they have:
Documented anything you picked up on that wasn't covered to your satisfaction
Answered ALL of your questions - a question they won't answer after being asked repeatedly screams of something they're hiding
And grab their e-mail addresses and phone numbers. Even if it's only informal they'll probably be willing to help out if the shit really hits the fan...
Good luck.

My basic process for receiving a handover would be:
Get a general overview of the app, document it
Get a list of all future work that the client expects
... all known issues
... any implementation specifics
As much up-to-date documentation they have
If possible, have them write some tests for critical components of the system (or at least get them thoroughly documented)
If there is too much documentation (possible) just confirm that it is all up to date, and make sure you find out from them where to start, if it is not clear.
Ask as many question as possible; anything that comes to mind, because you may not have the chance again.

Most handovers, perhaps all of them, will cause a lot of information to be lost. The only effective way to perform a handover that I have seen is to do it gradually. One way to do it is to allow a few key people from phase One to stay on the project well into Phase Two.
The extreme solution is to get rid of all handovers, and start using an Agile mindset.

As a start, define the exit criteria for the handover. This should be discussed, negotiated and agreed with both parties and make sure higher management knows this. Then write up a checklist of all things needed to achieve the exit criteria and chase it.

Check out "Software Requirements" and Software Requirement Patterns for ideas on questions to ask when gathering information about a project. I think that just as they would work for new development, they would also help you to come to terms with an existing project.

How to document applications and how they integrate with other applications? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 5 years ago.
Improve this question
As the years go by we get more and more applications. Figuring out if one application is using a feature from another application can be hard. If we change something in application A, will something in application B break?
We have been using MediaWiki for documentation, but it's hard to keep the data up-to-date.
I think what we need is some kind of visual map of everything. And the possibility to create some sort of reference integrity? Any ideas?

I'm in the same boat and still trying to sell my peers on Enterprise Architect, a CASE tool. It's a round trip tool - code to diagrams to code is possible. It's a UML centric too - although it also supports other methods of notation that I'm unfamiliar with...
Here are some things to consider when selecting a tool for documenting designs (be they inter-system communication, or just designing the internals of a single app):
Usability of the tool. That is, how easy is it to not only create, but also maintain the data you're interested in.
Familiarity with the notation.
A. The notation, such as UML, must be one your staff understands. If you try using a UML tool with a few people understanding how to use it properly you will get a big ball of confusion as some people document things incorrectly, and someone who understands what the UML says to implement either spots the error, or goes ahead and implements the erroneously documented item. Conversely more sophisticated notations used by the adept will confound the uninitiated.
B. Documentation isn't/shouldn't be created only for the documenters exclusive use. So those who will be reading the documentation must understand what they're reading. So getting a tool with flexible output options is always a good choice.
Cost. There are far more advanced tools than Enterprise Architect. My reasoning for using this one tool is that due to lack of UML familiarity and high pressure schedules, leaves little room to educate myself or my peers beyond using basic structure diagrams. This tool easily facilitates such a use and is more stable than say StarUML. (I tried both, StarUML died on the reverse engineering of masses of code -- millions of lines) For small projects I found StarUML adequate for home use, up until I got vista installed. Being opensource, it's also free.
With all that said, you will always have to document what uses what, that means maintaining the documentation! That task is one few companies see the value in despite its obvious value to those who get to do it. . .

What does the term legacy database mean? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 months ago.
Locked. This question and its answers are locked because the question is off-topic but has historical significance. It is not currently accepting new answers or interactions.
I read this term a lot. What exactly is a legacy database? I ask because I had thought it meant an old database like dbase or rdb, but I don't think I'm right.
When looking at RoR or Django and "legacy database" integration, what does legacy database really mean? Is it different than a generic term "legacy database"?

In the general context, it can refer to any of the older database technologies.
In a more specific context, it can refer to a database system that was inherited by a team from previous project owners.

legacy: anything from the past that keeps coming around to haunt you.

A legacy database is generally something that you will have to inherit and base some of your design decisions around. Most companies that put out work may already have some other (usually horrible) solution and you need to give them a bigger and better product...
BUT
It has to work with all of their old legacy data. The company is not going to want to manage two different applications just so they can keep all their old records. You will need to develop your solution to be able to migrate the data from the legacy system over into your system. This can have a massive impact on the overall design of the new database, because it cannot stray too far from the previous without introducing a lot of problems in terms of data integrity.

It's usually derogatory in my experience:
Something no-one wants to touch in case it breaks
Databases that can't be maintained (say that SQL 6.5 box lying around)
Someone else's badly designed and implemented database
Something that someone is trying to replace
Supported by the 93 year old wierdo
If it's in-use but still has maintenance or development activities, it can't be legacy...
Edit:
Given the age of the SQL language and the RDBMS, everything is legacy (including my new system due next year) compared to the software listed. At what point does Ruby turn legacy from the database perspective..?

We mostly use the term 'legacy database' as a db schema we can not 'easily' modify without breaking other software/systems using this schema.

this sums it up pretty well.
[edit] Broken link. Here's the quote from FOLDOC:
Legacy System -- A computer system or application program which continues to be used because of the cost of replacing or redesigning it and often despite its poor competitiveness and compatibility with modern equivalents. The implication is that the system is large, monolithic and difficult to modify.
If legacy software only runs on antiquated hardware the cost of maintaining this may eventually outweigh the cost of replacing both the software and hardware unless some form of emulation or backward compatibility allows the software to run on new hardware.

Flat file, hierarchy, and network databases are usually referred as legacy databases. They represent the ways people used to organize information in prehistoric times — about 30 years ago.

Legacy is used to denote the old thing. legacy database is something which continues to be used because of it cost of replacing and redesigning it.

In general context refers to old code inherited. Tipycally cobol code.
It is used for code which it is still used for historcal reasons.
It applies also for DB schemas

Allocating resources for project documentation [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Want to improve this question? Update the question so it can be answered with facts and citations by editing this post.
Closed 4 years ago.
Improve this question
What would you suggest for the following scenario:
A dozen of developers need to build and design a complex system. This design needs to be documented for future developers and the design decisions must be noted. These reports need to be made about every two months. My question is how this project should be documented.
I see two possibilities. Each developer writes about the things they helped design and integrate and then one person combines each of these documents together. The final document will probably be incoherent or redundant at times since the person tasked of assembling everything won't have much time to adjust every part.
Assume that the documentation parts from each developer arrive just a few days before deadline. A collaborative system (i.e. wiki) wouldn’t work properly since there wouldn’t be anything to read until a few days before deadline.
Or should a few people (2-3) be tasked with writing the documentation while the rest of the team works on actually developing the system? The developers would need a way to transfer their design choices and conclusions to the technical writers. How could this be done efficiently?

We approach this from 2 sides, using a RUP style approach. In the first case, you'll have a domain expert who is responsible for roughing out the design of what you're going to deliver - with developers chipping in as necessary. In the second case, we use a technical author - they document the application, so they should have a good idea of how it hangs together, and you involve them right through the design and development process. In this case, they can help to polish the design, and to make sure that it matches what they thought was being developed.

We use confluence (atlassian's wiki-like-thing) and document all kinds of different "things". The developers do it continiously, and we push each other for docs - we let peer pressure decide what is necessary. Whenever someone new comes along he/she is tasked with reading through everything and to find out what still is correct. The incorrect stuff is either deleted or updated as a consequence of this. We're happy when we can delete stuff ;)
The nice thing about this process is that the relevant stuff stays and the irrelevant stuff is deleted. We always "got away" from the more formalized demands by claiming that we could always construct the word documents they wanted if "they" needed them. "They" never needed them.

I think alternative 2 is the less agile, because it means a new stage to the project (although it may be in parallel with tests).
If you are in an agile model, then just add documentation (following a guideline) as a story.
If you are in a staged approach, then I would nevertheless ask developers to work on documentation, following some guidelines, and review that documentation along the design and the code. Eventually, you may have a technical writer reviewing everything for proper English, but that would be a kind of "release" activity.

I think you can use Sand Castle to document your project.
Check it out
Sand Castle from Microsoft

It's not a complete documentation, but making sure that interfaces etc. are commented using Doxygen-style comments means writing code and documenting it are closer together.
That way, developers should document what they do. I still think a review by the architect(s) is needed to ensure consistent quality, but ensuring people document what they do is the best way to ensure they follow the architecture.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas