Visual modelling of permissions - permissions

I have come into the habit of hand-sketching various diagrams for software I create. My software is mostly for the web. I use E-R diagramming for the data logic (model of MVC) , and a personally invented diagram style for the interactions -- what pages lead to which other ones and what do they do, i.e. the views & controllers of MVC. This allows me to simplify the important concepts, eliminate the inconsistencies and highlights problem areas that need further investigation.
Now, I've been starting to look at an application that requires a fairly complicated system of permissions. Not really "big" -- just complicated -- with several permission "dimensions" where some permissions need to be created on the fly, and some be static.
I find myself wishing there was some simple way to diagram the permissions system, so that I can get the ideas out of my head in a clear form, and make sure there are no inconsistencies. Hence my question:
Has anybody seen/used any method of modeling permissions in a visual diagram?

The 'ls' command uses color to denote permissions.
Also I would think on a whiteboard or powerpoint, that since permissions link groups of users to files, that lines between the two or spatial grouping becomes a possibility.

I would model:
my users with their attributes on the one hand
my resources with their attributes on the other
Identify the key attributes (building blocks). Then start writing bullet-point rules e.g.:
managers in finance can approve up to X
managers in HR can edit employee records...
employees in HR can approve new accounts
What you're doing is building your authorization policies. Then you can consider factoring out parameters e.g. the department attribute / value.
From there, you want to build a tree/flow where the root would be the entry point, the level below would be the departments, the level below other attributes...
Example:
If the user is in purchasing{}
Else if the user in finance{}
Else if...
(but in a graphical tree-based way).
I use XACML for authorization which is policy/tree-based. You can then apply CSS to it (or XSLT) to get a graphical sense of the authorization. Check out my blog for samples: http://www.webfarmr.eu/2010/11/xacml-102-pimp-my-xacml-css/

Related

Simple OO question: Which of these classes should own this operation?

I was working on a UML class model and (basic as it is) I feel I need to see if my assumption is right. I have two types: Administrator and Location. (A relationship exists between the two where an Administrator can be at one or more locations).
An Administrator needs to assign themselves to a Location. Should this assignment operation be in the Location type or the Administrator type? What is the reasoning for one over the other?
As I see it, the Administrator would be the class having a state, and thus knowing where he is. The location is a data-entry, and should be stateless.
Hence, it's the administrator where you would add or remove locations.
Ie. typical C#:
Administrator.Locations.Add
Administrator.Locations.Remove
Also, I don't see how it could in the Location type, without you using two-way relations, which doesn't seem appropriate here.
Both are possible and with argumentation, both could also be just fine.
If the administrator is logged in at the main server, he could invoke AdminUtils.Locations(That).AddAdministrator(Me). On the other hand, if the administrator is at a specific location he could, prior to logging in at that location, invoke AdminUtils.Administrators(Me).AddLocation(This).
These kind of questions are for specific situations generally answered by considering responsibilities. Which entity is responsible for which entities?
Also note that these Add methods could perfectly be elements of both entities, but one of them is mapped to the actual storage which should be just one.
It depends on what is your software about. If it's a database about people, then Administrator will contain list of assigned places(addLocation / removeLocation), if it's about places, then Location should reference its Administrator(setAdministrator). If you track the history of assignments then obviously Assignment will be a separate entity.
Such things are very well described in DDD Book by Eric Evans

Custom user-driven reports on a known schema

There's an upcoming project at work to fill a requirement that end-users be able to generate custom reports off their data in within our fixed/known-schema relational database.
The interface needs to be very user friendly and so transposing all of t-sql's language concepts into a graphical paradigm is far too complex for both the project team and the end user.
What research or products, open-source or otherwise, exist around satisfying this of business need? I'm aware of general Business Analytic tools but this is more specific and I'm trying to understand the problem domain better rather than trying to reverse engineer it from vendor marketing materials.
I assume the research would be in the form of a some encoding of the schema that specifies which joins and tables are allowed, which fields are available, then then a method for allowing the user to select one particular valid combination among the possible many, generate the query, and display the results.
Brainstorming - feature support in order of complexity: SELECT, WHERE filters, FULL JOIN, LEFT JOIN, sorting, paging, grouping, aggregation, HAVING filter.
My backup plan is to just dumb it down to pre-written SQL Views (with JOINs built-in) with the ability to display available columns with custom row-wise filtering. Paging and sorting is doable. By itself, this doesn't allow for grouping, aggregate functions, HAVING filters, or other inter-row analysis.
As a follow-up to #Dems post (comment box wasn't bit enough :) )..
Agreed on most counts.. If your data is mostly analytic, then you might want to look into a tool like PowerPivot. In this case, you can write a general query then allow the users to derive reports based on the result set in a familiar tool (Excel).
At the core of every ad hoc reporting engine, you will find a few common themes:
Metadata
There will be some way of describing the schema such that the model may be easily consumed by the user. Sql Server Reporting Services (SSRS) requires you to build a metadata model in order to use the report builder. When using PowerPivot, you can alias column names to make them more readable, but in the end, you are simply providing a flat dataset and allowing the user to build the joins/relationships.
Query Builder
Once the metadata has been manipulated by the user, an intermediary system must be in place to convert the conceptual report into an actual query. Many tools are measured based on the complexity of the Sql that they produce as this can greatly affect performance. One way to get around this is to create views that the reporting engine may build queries against. One of the best open source examples of this that I have seen is the engine that backs Hibernate/NHibernate (look into how the various Dialects are used when building queries).
Rendering Engine
In my experience building a rendering engine is not a road you want to go down. There are many device-specific concerns as well as look & feel problems (i.e. how do you plan on representing cascading joins/relationships?). Every rendering engine has it's own quirks (PowerPivot uses Excel, SSRS has a service that builds the raw result and return it to the consuming application) that must be accounted for, so be careful how you choose.
Earlier I mentioned that I agreed on most counts. I would not recommend encouraging your users to learn Sql or allowing them to pass-through Sql to the underlying data-store. This opens the door to malicious code being written and can become a security nightmare. Not to mention that most business users think in terms of flat tables, not hierarchical sets.
Figure out what your users are comfortable with and try to fit your solution to that domain. I have often found that for sophisticated business users something like PowerPivot is perfect. For more day-to-day end users, having "canned" reports that might be modified by the end user via a simple user interface that allows them to modify restrictions/groupings/sorting is more useful.
There are many options out there, but the best of them cost money.
I really like QlikView as an easy to use report designed for semi-technical people. If your user base is more technically minded it may be a bit restrictive, but if your user base have no logical thought capabilities, it's too complicated. That's the biggest trap I see you falling in to...
- No, I want more than that!
- No, that's too complicated for me!
- At the same time...
If you were to build your own tool-set internally, you'd probably be best sticking with OLAP cubes. Let people slice and dice the data as they like, but with all the relationships pre-defined. Do it right and you can just point an Excel Pivot Table at the OLAP Cube and let them play...
The next up, as Bobby D says, could be SQL Server Reporting Services, or something similar.
But if your users end up wanting absolute flexibility, the tool they need is SQL itself. Unfortunately, all tools follow the same trend: The more flexible and powerful, the more time you need to spend learning/training.

Database Modification or start over?

So I'm currently working on rebuilding an existing website that is used internally at my company for project management, at heart it is a bug tracking utility that has some customer support and accounting operations linked into it.
Currently the database model is very repetitive, a good example of this is, currently a UserId is linked into a record (FK relationship into a user table that contains all the information about the user) and then all the information about the user also exists in the table.
I've been tasked with improving the website and the functionality of the model; however, I want to reduce the repetition of data in the website (is this normalization or is that the breaking apart of unlinked items into separate tables?). I'm not sure what the best method of doing this would be. I'm thinking of generating the creation scripts for the database and creating a new database project in VS to then modify the database, then generating some scripts to populate the new database model from the old database.
I plan on using the Entity Framework and ASP. NET MVC 2 to build the website as I think it provides the most flexible model moving forward for the modification and maintenance of the website.
The reason I ask all of this is because I'm very familiar with using databases and modifying existing ones to be used in applications and websites but I'm trying to discover the best way to build one.
I'm curious if there is any material on the best way to do this or if I should be using a different tool to do this with?
Edit: Providing more information on the model
There are 4 major areas that we have that are used:
Cases (Bugs, Features, Working Tasks, Etc)
2 .Tickets (Tech Support Events)
Errors (Errors Generated from our logging Library, Basically a stack trace with customer information)
License (Keeps track of each customers License allows modification to those licenses)
These are the Objects that are intermixed and used throughout the above 4 major areas.
Users (People who use the system)
Customers (People who use our software)
Stores (Places where our customers use our software)
Products (Our Software)
Relationships
Cases:
A Cases has to have a User, can have a Customer, Store, Error, Ticket and/or Product
Tickets
A Ticket has to have a User and a Customer, can have a Store, Error and/or Product
Errors:
A Error has to have a Product, Can Have a Case, Ticket, Store, and/or Product
Licenses:
A Licenses has to have a Product and Customer, can have a Store
Like I said very basic website, with a not super complex database, if done correctly.
Currently the database has no FK constraints, replication of lots of information across each table and lots of extra tables that are duplicates with different names.
E.g.
Each Case type has a separate table so there is a FeatureRequest, Bug, Tasks, Completed, etc table that all contain the same information.
Normalization is about storing data without redundancy or anomalies.
One example of an anomaly could be when attributes about a user in your main table are not in sync with the users table. Someone changes information about that user in one table without reflecting the changes in the redundant copy. The problem is that it's hard to know which change is the correct one.
Some people think that normalization is just about breaking apart tables into littler tables, because that's what they see as the most common type of change. But that's not the goal of normalization. It's just by coincidence that most mistakes of non-normalization involve stuffing too much data into one table where multiple tables would be correct.
It's hard to answer your question about whether to modify your database in-place or whether to create a whole new database and migrate to it.
What I would do in your case is to design a properly normalized database, and then examine the differences between that and your existing database. Imagine what you would have to do for each difference, to change your old database to the new one, versus a data migration. It could be that only a few changes are needed, only dropping the redundant columns. Or it could be that some major rework is needed. It's impossible to tell until you do the work of creating a normalized data model so you can compare.
The bigger task might be to adapt your application code that uses the database. One way to ease this transition is to create database views on top of the normalized database, which mimic your old non-normalized database. That way hopefully you don't have to rewrite every bit of code in your app all at once, you can keep some of it the same at least until you can refactor the code.
Also having a good set of regression tests in place is ideal, so you can be sure your app still does all the tasks it is supposed to do, as you refactor the database and the code that uses the database.
Re your comment: You mention that you're adding new functionality to the user model at the same time. I would find it too confusing to try to do this simultaneously with refactoring. Refactoring typically does not change functionality, it only changes implementation. But refactoring adds value because it makes the code easier to maintain or debug, improves efficiency, or prepares you to make future functionality changes more easily.
I would recommend that you bit the bullet and add your new user model features to the old non-normalized database. It's good to get the benefit of new features in the short term, and also you need to develop those features first to understand them well enough to account for them in your big refactoring project.
Here are some suggestions for resources to help you truly understand what normalization means:
SQL and Relational Theory by C. J. Date
A Simple Guide to Five Normal Forms in Relational Database Theory by William Kent
Database Normalization at Wikipedia and its sub-pages for each respective normal form
SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming by me, Bill Karwin. I wrote a chapter about database normalization that I hope explains it in plain English and with good examples.
Here are a couple of resources for managing changes to a database:
Refactoring Databases by Scott W. Ambler and Pramodkumar J. Sadalage
Agile Database Techniques: Effective Strategies for the Agile Software Developer by Scott W. Ambler
How long do you have, and how big is the database?
It's very difficult to answer this question black and white without being immersed in your environment and business case. It really doesn't seem like your limitation is technology wise, just to choose between solutions.
Re-creating is what programmers instinctively go for. However, in the "real world", sometimes we spend a lot of effort into something that isn't that used or wont last that long.
So food for thought. How long will it take you to re-do the database, how much will it cost. Will working with what's existent be sufficient for the functionality asked?

What are the principles behind, and benefits of, the "party model"?

The "party model" is a "pattern" for relational database design. At least part of it involves finding commonality between many entities, such as Customer, Employee, Partner, etc., and factoring that into some more "abstract" database tables.
I'd like to find out your thoughts on the following:
What are the core principles and motivating forces behind the party model?
What does it prescribe you do to your data model? (My bit above is pretty high level and quite possibly incorrect in some ways. I've been on a project that used it, but I was working with a separate team focused on other issues).
What has your experience led you to feel about it? Did you use it, and if so, would you do so again? What were the pros and cons?
Did the party model limit your choice of ORMs? For example, did you have to eliminate certain ORMs because they didn't allow for enough of an "abstraction layer" between your domain objects and your physical data model?
I'm sure every response won't address every one of those questions ... but anything touching on one or more of them is going to help me make some decisions I'm facing.
Thanks.
What are the core principles and motivating forces behind the party
model?
To the extent that I've used it, it's mostly about code reuse and flexibility. We've used it before in the guest / user / admin model and it certainly proves its value when you need to move a user from one group to another. Extend this to having organizations and companies represented with users under them, and it's really providing a form of abstraction that isn't particularly inherent in SQL.
What does it prescribe you do to your data model? (My bit above is
pretty high level and quite possibly
incorrect in some ways. I've been on a
project that used it, but I was
working with a separate team focused
on other issues).
You're pretty correct in your bit above, though it needs some more detail. You can imagine a situation where an entity in the database (call it a Party) contracts out to another Party, which may in turn subcontract work out. A party might be an Employee, a Contractor, or a Company, all subclasses of Party. From my understanding, you would have a Party table and then more specific tables for each subclass, which could then be further subclassed (Party -> Person -> Contractor).
What has your experience led you to feel about it? Did you use it, and if
so, would you do so again? What were
the pros and cons?
It has its benefits if you need flexibly to add new types to your system and create relationships between types that you didn't expect at the beginning and architect in (users moving to a new level, companies hiring other companies, etc). It also gives you the benefit of running a single query and retrieving data for multiple types of parties (Companies,Employees,Contractors). On the flip side, you're adding additional layers of abstraction to get to the data you actually need and are increasing load (or at least the number of joins) on the database when you're querying for a specific type. If your abstraction goes too far, you'll likely need to run multiple queries to retrieve the data as the complexity would start to become detrimental to readability and database load.
Did the party model limit your choice of ORMs? For example, did you
have to eliminate certain ORMs because
they didn't allow for enough of an
"abstraction layer" between your
domain objects and your physical data
model?
This is an area that I'm admittedly a bit weak in, but I've found that using views and mirrored abstraction in the application layer haven't made this too much of a problem. The real problem for me has always been a "where is piece of data X living" when I want to read the data source directly (it's not always intuitive for new developers on the system either).
The idea behind the party models (aka entity schema) is to define a database that leverages some of the scalability benefits of schema-free databases. The party model does that by defining its entities as party type records, as opposed to one table per entity. The result is an extremely normalized database with very few tables and very little knowledge about the semantic meaning of the data it stores. All that knowledge is pushed to the data access in code. Database upgrades using the party model are minimal to none, since the schema never changes. It’s essentially a glorified key-value pair data model structure with some fancy names and a couple of extra attributes.
Pros:
Kick-ass horizontal scalability. Once your 5-6 tables are defined in your entity model, you can go to the beach and sip margaritas. You can virtually scale this database out as much as you want with minimum efforts.
The database supports any data structure you throw at it. You can also change data structures and party/entities definitions on the fly without affecting your application. This is very very powerful.
You can model any arbitrary data entity by adding records, not changing the schema. Meaning you can say goodbye to schema migration scripts.
This is programmers’ paradise, since the code they write will define the actual entities they use in code, and there are no mappings from Objects to Tables or anything like that. You can think of the Party table as the base object of your framework of choice (System.Object for .NET)
Cons:
Party/Entity models never play well with ORMs, so forget about using EF or NHibernate to get semantically meaningful entities out of your entity database.
Lots of joins. Performance tuning challenges. This ‘con’ is relative to the practices you use to define your entities, but is safe to say that you’ll be doing a lot more of those mind-bending queries that will bring you nightmares at night.
Harder to consume. Developers and DB pros unfamiliar with your business will have a harder time to get used to the entities exposed by these models. Since everything is abstract, there no diagram or visualization you can build on top of your database to explain what is stored to someone else.
Heavy data access models or business rules engines will be needed. Basically you have to do the work of understanding what the heck you want out of your database at some point, and your database model is not going to help you this time around.
If you are considering a party or entity schema in a relational database, you should probably take a look at other solutions like a NoSql data store, BigTable or KV Stores. There are some great products out there with massive deployments and traction such as MongoDB, DynamoDB, and Cassandra that pioneered this movement.
This is a vast topic, I would recommend reading The Data Model Resource Book Volume 3 - Universal Patterns for Data Modeling by Len Silverston and Paul Agnew.
I've just received my copy and it's pretty good - It provides you with an overlook for many approaches to data modeling, including hybrid contextual role patterns and so on. It has detailed PROs and CONs for every approach.
There is a pletheora of ways to model party relationships and roles all with their benefits and disadvantages. The question that was accepted as an answer covers just one instance of a 'party model'.
For instance, in many approaches, notions like "Employee", "Project Manager" etc. are roles that a party can play within a certain context. I will try to give you a better breakdown once I get home.
When I was part of a team implementing these ideas in the early 1980's, it did not limit our choice of ORM's because those hadn't been invented yet.
I'd fall back on those ideas any time, as that particular project was one of the most convincing proofs-of-concept I have ever seen of a "revolutionary" idea (which it certainly was at the time).
It forces you to nothing. And it doesn't stop you from anything (from any mistake, I mean). The one defining your own information model is you.
All parties have lots of properties in common. The fact that they have a name and such (we called those "signaletics"). The fact that they have principal/primary locations called "addresses". The fact that they all are involved, in some sense, in the business' contracts.
as a simple talk from my understanding: Party modeling gives the flexibility and needs more effort (like T-sql join and ...) to be implemented.
I also wanna point that, "using Party modeling (serialization/generalization) gives you the ability to have FK-Relation to other tables". for example: think of different types of users (admin, user, ...) which generalized into User table, and you can have UserID in your Authorization table.
I'm not sure, but the party model sounds like a particular case of the generalization-specialization pattern. A search on "generalization specialization relational modeling" finds some interesting articles.

How to document a database [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
This post was edited and submitted for review 12 months ago and failed to reopen the post:
Original close reason(s) were not resolved
Improve this question
(Note: I realize this is close to How do you document your database structure? , but I don't think it's identical.)
I've started work at a place with a database with literally hundreds of tables and views, all with cryptic names with very few vowels, and no documentation. They also don't allow gratuitous changes to the database schema, nor can I touch any database except the test one on my own machine (which gets blown away and recreated regularly), so I can't add comments that would help anybody.
I tried using "Toad" to create an ER diagram, but after leaving it running for 48 hours straight it still hadn't produced anything visible and I needed my computer back. I was talking to some other recent hires and we all suggested that whenever we've puzzled out what a particular table or what some of its columns means, we should update it in the developers wiki.
So what's a good way to do this? Just list tables/views and their columns and fill them in as we go? The basic tools I've got to hand are Toad, Oracle's "SQL Developer", MS Office, and Visio.
In my experience, ER (or UML) diagrams aren't the most useful artifact - with a large number of tables, diagrams (especially reverse engineered ones) are often a big convoluted mess that nobody learns anything from.
For my money, some good human-readable documentation (perhaps supplemented with diagrams of smaller portions of the system) will give you the most mileage. This will include, for each table:
Descriptions of what the table means and how it's functionally used (in the UI, etc.)
Descriptions of what each attribute means, if it isn't obvious
Explanations of the relationships (foreign keys) from this table to others, and vice-versa
Explanations of additional constraints and / or triggers
Additional explanation of major views & procs that touch the table, if they're not well documented already
With all of the above, don't document for the sake of documenting - documentation that restates the obvious just gets in people's way. Instead, focus on the stuff that confused you at first, and spend a few minutes writing really clear, concise explanations. That'll help you think it through, and it'll massively help other developers who run into these tables for the first time.
As others have mentioned, there are a wide variety of tools to help you manage this, like Enterprise Architect, Red Gate SQL Doc, and the built-in tools from various vendors. But while tool support is helpful (and even critical, in bigger databases), doing the hard work of understanding and explaining the conceptual model of the database is the real win. From that perspective, you can even do it in a text file (though doing it in Wiki form would allow several people to collaborate on adding to that documentation incrementally - so, every time someone figures out something, they can add it to the growing body of documentation instantly).
One thing to consider is the COMMENT facility built into the DBMS. If you put comments on all of the tables and all of the columns in the DBMS itself, then your documentation will be inside the database system.
Using the COMMENT facility does not make any changes to the schema itself, it only adds data to the USER_TAB_COMMENTS catalog table.
In our team we came to useful approach to documenting legacy large Oracle and SQL Server databases. We use Dataedo for documenting database schema elements (data dictionary) and creating ERD diagrams. Dataedo comes with documentation repository so all your team can work on documenting and reading recent documentation online. And you don’t need to interfere with database (Oracle comments or SQL Server MS_Description).
First you import schema (all tables, views, stored procedures and functions – with triggers, foreign keys etc.). Then you define logical domains/modules and group all objects (drag & drop) into them to be able to analyze and work on smaller chunks of database. For each module you create an ERD diagram and write top level description. Then, as you discover meaning of tables and views write a short description for each. Do the same for each column. Dataedo enables you to add meaningful title for each object and column – it’s useful if object names are vague or invalid. Pro version enables you to describe foreign keys, unique keys/constraints and triggers – which is useful but not essential to understand a database.
You can access documentation through UI or you can export it to PDF or interactive HTML (the latter is available only in Pro version).
Described here is a continuous process rather than one time job. If your database changes (eg. new columns, views) you should sync your documentation on regular basis (couple clicks with Dataedo).
See sample documentation:
http://dataedo.com/download/Dataedo%20repository.pdf
Some guidelines on documentation process:
Diagrams:
Keep your diagrams small and readable – just include important tables, relations and columns – only the one that have any meaning to understand big picture – primary/business keys, important attributes and relations,
Use different color for key tables in a diagram,
You can have more than one diagram per module,
You can add diagram to description of most important tables/with most relations.
Descriptions:
Don’t document the obvious – don’t write description “Document date” for document.date column. If there’s nothing meaningful to add just leave it blank,
If objects stored in tables have types or statuses it’s good to list them in general description of a table,
Define format that is expected, eg. “mm/dd/yy” for a date that is stored in text field,
List all known/important values an it’s meaning, e.g. for status column could be something like this: “Document status: A – Active, C – Cancelled, D – Deleted”,
If there’s any API to a table – a view that should be used to read data and function/procedures to insert/update data – list it in the description of table,
Describe where does rows/columns’ values come from (procedure, form, interface etc.) ,
Use “[deprecated]” mark (or similar) for columns that should not be used (title column is useful for this, explain which field should be used instead in description field).
We use Enterprise Architect for our DB definitions. We include stored procedures, triggers, and all table definitions defined in UML. The three brilliant features of the program are:
Import UML Diagrams from an ODBC Connection.
Generate SQL Scripts (DDL) for the entire DB at once
Generate Custom Templated Documentation of your DB.
You can edit your class / table definitions within the UML tool, and generate a fully descriptive with pictures included document. The autogenerated document can be in multiple formats including MSWord. We have just less than 100 tables in our schema, and it's quite managable.
I've never been more impressed with any other tool in my 10+ years as a developer. EA supports Oracle, MySQL, SQL Server (multiple versions), PostGreSQL, Interbase, DB2, and Access in one fell swoop. Any time I've had problems, their forums have answered my problems promptly. Highly recommended!!
When DB changes come in, we make then in EA, generate the SQL, and check it into our version control (svn). We use Hudson for building, and it auto-builds the database from scripts when it sees you've modified the checked-in sql.
(Mostly stolen from another answer of mine)
This answer extends Kieveli's above, which I upvoted. If your version of EA supports Object Role Modeling (conceptual design, vs. logical design = ERD), reverse engineer to that and then fill out the model with the expressive richness it gives you.
The cheap and lighter-weight option is to download Visiomodeler for free from MS, and do the same with that.
The ORM (call it ORMDB) is the only tool I've ever found that supports and encourages database design conversations with non-IS stakeholders about BL objects and relationships.
Reality check - on the way to generating your DDL, it passes through a full-stop ERD phase where you can satisfy your questions about whether it does anything screwy. It doesn't. It will probably show you weaknesses in the ERD you designed yourself.
ORMDB is a classic case of the principle that the more conceptual the tool, the smaller the market. Girls just want to have fun, and programmers just want to code.
A wiki solution supports hyperlinks and collaborative editing, but a wiki is only as good as the people who keep it organized and up to date. You need someone to take ownership of the document project, regardless of what tool you use. That person may involve other knowledgeable people to fill in the details, but one person should be responsible for organizing the information.
If you can't use a tool to generate an ERD by reverse engineering, you'll have to design one by hand using TOAD or VISIO.
Any ERD with hundreds of objects is probably useless as a guide for developers, because it'll be unreadable with so many boxes and lines. In a database with so many objects, it's likely that there are "sub-systems" of a few dozen tables and views each. So you should make custom diagrams of these sub-systems, instead of expecting a tool to do it for you.
You can also design a pseudo-ERD, where groups of tables are represented by a single object in one diagram, and that group is expanded in another diagram.
A single ERD or set of ERD's are not sufficient to document a system of this complexity, any more than a class diagram would be adequate to document an OO system. You'll have to write a document, using the ERD's as illustrations. You need text descriptions of the meaning and use of each table, each column, and the relationships between tables (especially where such relationships are implicit instead of represented by referential integrity constraints).
All of this is a lot of work, but it will be worth it. If there's a clear and up-to-date place where the schema is documented, the whole team will benefit from it.
Since you have the luxury of working with fellow developers that are in the same boat, I would suggest asking them what they feel would convey the needed information, most easily. My company has over 100 tables, and my boss gave me an ERD for a specific set tables that all connect. So also, you might want to try breaking 1 massive ERD into a bunch of smaller, manageable, ERDs.
Well, a picture tells a thousand words so I would recommend creating ER diagrams where you can view the relationship between tables at a glance, something that is hard to do with a text-only description.
You don't have to do the whole database in one diagram, break it up into sections. We use Visual Paradigm at work but EA is a good alternative as is ERWIN, and no doubt there are lots of others that are just as good.
If you have the patience, then using html to document the tables and columns makes your documentation easier to access.
If describing your databases to your end users is your primary goal Ooluk Data Dictionary Manager can prove useful. It is a web-based multi-user software that allows you to attach descriptions to tables and columns and allows full text searches on those descriptions. It also allows you to logically group tables using labels and browse tables using those labels. Tables as well as columns can be tagged to find similar data items across your database/databases.
The software allows you to import metadata information such as table name, column name, column data type, foreign keys into its internal repository using an API. Support for JDBC data sources comes built-in and can be extended further as the API source is distributed under ASL 2.0. It is coded to read the COMMENTS/REMARKS from many RDBMSs.You can always manually override the imported information. The information you can store about tables and columns can be extended using custom fields.
The Data Dictionary Manager uses the "data object" and "attribute" terminology instead of table and column because it isn't designed specifically for relational databases.
Notes
If describing technical aspects of your database such as triggers,
indexes, statistics is important this software isn't the best option.
It is however possible to combine a technical solution with this
software using hyperlink custom fields.
The software doesn't produce an ERD
Disclosure: I work at the company that develops this product.