What is a database closure? - sql

I came across this term called database closure.
I tried to look for it and what exactly it means but I have not found any simple explanation.
Can someone please explain what the concept of closure is and specifically what a database closure is, whether it is good or bad, and how it can be used or avoided?
Also, there seems to be a general closure term: http://en.wikipedia.org/wiki/Closure_%28computer_science%29 which relates to the binding of variables to a function. Is a database closure related to this?
Thanks!

Closure is actually a relatively simple concept. When designing databases we want our tables to have as little redundancy as possible. This means making sure that we have as few relationships between sets (or tables) as possible.
An example:
If we have two sets X and Y (which you can think of as two tables called X and Y) and they have a relationship with each other as so:
X -> Y (Read this as Y is dependent on X)
And we have another set Z which is dependent on Y:
Y -> Z (also read as Y determines Z)
To find the closure we find the minimum number of tables from which we can reach all relationships. In this case all we need is X.
So now, when we design our database we know that we only need a relationship from X, because both Y and Z can be derived from X (X -> Z follows from X -> Y and Y -> Z). We can therefore make sure there are no extra relationships in our database which cause redundancy.
If you want to read more, closure is a part of a topic called normalisation.

Closure is mentioned in database theory / set theory discussions -- as in, Dr. Codd / design & normalization kind of stuff. It has to do with finding the minimally representational elements of sets (i.e., without redundancy, etc.). I tried reading-up on it a long time ago, but my eyes went crossed, and I got a really bad headache.
If you want to read a decent summary of closure, here is one: http://www.cs.sfu.ca/CC/354/jpei/slides/ClosureDecomposition.pdf

All operations are performed on an entire relation and result in an entire relation, a concept known as closure. That is one of the characteristics of relational database systems.

The closure is essentially the full set of attributes that can be determined from a set of known attributes, for a given database, using its functional dependencies.
Formal math definition:
Given a set of functional dependencies F and a set of attributes X, the closure of X is defined to be the set of all attributes Y such that X -> Y follows from F.
Algorithm definition:
Closure(X, F)
  INITIALIZE V := X
  WHILE there is a Y -> Z in F such that Y is contained in V and Z is not contained in V
    DO add Z to V
  RETURN V
It can be shown that the two definitions coincide.
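The algorithm above can be sketched in Python roughly as follows. This is only an illustration: the function name `closure`, the representation of F as a list of (left side, right side) pairs of sets, and the attributes A to E are assumptions made for the example, not part of the original answer.

```python
def closure(attrs, fds):
    """Return the closure of `attrs` under the functional dependencies `fds`.

    `fds` is an iterable of (lhs, rhs) pairs; each side is a set of attribute names.
    """
    result = set(attrs)  # V := X
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            # Y -> Z applies when Y is contained in V and Z is not yet contained in V.
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


# Hypothetical dependencies: A -> B, B -> C, and {C, D} -> E.
fds = [({"A"}, {"B"}), ({"B"}, {"C"}), ({"C", "D"}, {"E"})]
print(sorted(closure({"A"}, fds)))       # ['A', 'B', 'C']
print(sorted(closure({"A", "D"}, fds)))  # ['A', 'B', 'C', 'D', 'E']
```

If the closure of a set of attributes contains every attribute of a table, that set is a superkey of the table, which is one common reason to compute closures by hand.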
A database closure might refer to the closure of all of the database attributes. According to the definitions above, this closure would be the set of all attributes of the database itself.
The closure (computer science) term that you linked to is not related to closure in databases but the mathematical closure is.
For a better understanding of functional dependencies and a simple example for closure in databases I suggest reading this.

If we are referring to Closure in the Functional Dependency sense (relating to database design),
The closure of a set F of functional dependencies is the set of all functional dependencies logically implied by F.
The minimal representation of sets is referred to as the canonical cover: the irreducible set of FDs that describes the closure.
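As a rough sketch of what "logically implied" means in practice: a dependency X -> Y is implied by F exactly when Y is contained in the attribute closure of X under F. The helper names below (`attribute_closure`, `implies`, `is_redundant`) and the example set F are made up for illustration. Dropping redundant dependencies is only one step towards a canonical cover, not the whole procedure.

```python
def attribute_closure(attrs, fds):
    """All attributes determined by `attrs` under `fds` (pairs of sets)."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return result


def implies(fds, lhs, rhs):
    """True if lhs -> rhs is logically implied by `fds`."""
    return rhs <= attribute_closure(lhs, fds)


def is_redundant(fds, fd):
    """True if `fd` is already implied by the other dependencies in `fds`."""
    others = [other for other in fds if other != fd]
    return implies(others, *fd)


# Hypothetical set F: X -> Y, Y -> Z, and X -> Z.
F = [({"X"}, {"Y"}), ({"Y"}, {"Z"}), ({"X"}, {"Z"})]
print(implies(F, {"X"}, {"Z"}))          # True
print(implies(F, {"Z"}, {"X"}))          # False
print(is_redundant(F, ({"X"}, {"Z"})))   # True: follows from the other two
print(is_redundant(F, ({"X"}, {"Y"})))   # False
```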

Get number of attached constraints on a variable in MiniZinc

I have two sets of variables in my Minizinc program. Each variable from the first set necessarily has several constraints placed on it, but the variables in the second set are only implicitly constrained via their interactions with variables in the first set. This means that each of the variables in the second set may have anywhere from 0 to ~8 constraints placed on it, depending on the values taken by the variables in the first set.
I see that there is a way to reference the number of constraints placed on a variable at search time via the dom_w_deg search annotation, but I was wondering if there is any way to access this information at runtime. I want to do this because I would like to specify additional constraints related to the number of constraints already placed on the variables.
I realize this is a weird question, and I may be approaching this whole thing the wrong way, but I've been banging my head against this problem for a while now, so figured I'd ask.
As a general rule, I think that you are approaching your problem erroneously. I can identify several misconceptions in the approach that lead to this:
Different solver back-ends might do very different things with the model and how it is solved.
"A constraint" is not a meaningful concept for the solver. A single constraint might be multiple propagators in the back-end solver, a single propagator, or even just part of a propagator covering several constraints (assuming that it is a propagator-based back-end).
Constraint models have monotonic behavior, so you cannot change the model in a well-defined and meaningful way based on the number of constraints connected to a variable.
Even when a constraint maps to a single propagator, it may still have very different propagation strength, meaning that it might be done early or very late in the solving process.
Without knowing what you are actually trying to achieve, as a general technique you might be interested in using reification, where the truth of a constraint is reflected onto a binary Boolean variable. In general, it is good practice to have as little reification as possible, since it does not propagate much, but sometimes it is needed.
As a very simple example of using reification, this is a (probably not very good) model that tries to maximize the number of constraints satisfied.
set of int: Domain = 1..10;
var Domain: x;
var Domain: y;
var Domain: z;
array[1..3] of var bool: holds;
constraint holds[1] <-> x < y;
constraint holds[2] <-> y < z;
constraint holds[3] <-> z < x;
var int: goal;
constraint goal = sum(holds);
solve maximize goal;
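To see what this model asks for without running a solver, here is a brute-force Python sketch of the same idea. It is not how a constraint solver works (there is no propagation, only enumeration), and the names `domain` and `holds` simply mirror the MiniZinc model above.

```python
from itertools import product

domain = range(1, 11)  # mirrors Domain = 1..10


def holds(x, y, z):
    # One Boolean per constraint, like the reified `holds` array in the model.
    return [x < y, y < z, z < x]


# Enumerate every assignment and keep one that satisfies the most constraints.
best = max(product(domain, repeat=3), key=lambda xyz: sum(holds(*xyz)))
goal = sum(holds(*best))
print(goal)  # 2
```

All three constraints can never hold together, since x < y < z < x is a cycle, so the best achievable value of goal is 2.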

How to Store Graph-Like Data in SQL Server?

This is a bit of a complex one, and even trying to think it over is somewhat confusing.
Basically I'm having to design a series of tables that will house information about many different pieces of electrical equipment. The arrangement of this equipment is quite complex, and can vary fairly drastically.
The different types of equipment are as follows:
RDC - Remote Distribution Center
EBD - Electrical Bus Duct
UPB - Upright Panel Board
PDU - Power Distribution Unit
Now the way these units work together is slightly confusing as well.
PDU - Powers RDC's, EBD's, and UPB's. They are often redundant, and have a secondary
unit that powers the same equipment in the event of a power failure.
Can also contain breakers and power equipment directly.
RDC - Powers nearly all the equipment on the data center floors, are usually redundant.
They have two units side by side, being powered by a PDU. In the event of a
failure, the second RDC is activated and resumes operations.
EBD - Nearly identical to the RDC, being phased out, but still needs to be tracked in a
similar fashion.
UPB - Similar to an RDC, however, they are not redundant.
Now what I'm trying to do is figure out the simplest method of tracking this crazy relationship between all the different items.
I need to track the redundant sources for all possible hardware, but also what powers each unit. This can be quite complex because if two PDUs power a set of two RDCs, we need to be able to track exactly what goes where.
Any idea on exactly where to start?
EDIT Here is a visual representation of what I'm after. The objects that are touching are redundant, and must be documented as such. Also, the different hardware that is connected to each device must be cataloged.
Set up one table for equipment, one table for power supplies, then a third table that matches a piece of equipment with its power supply.
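A minimal sketch of that three-table layout, using Python's built-in sqlite3 module as a stand-in for SQL Server. The table and column names (`equipment`, `power_supply`, `equipment_power`, `is_backup`) and the sample rows are assumptions for illustration. The `is_backup` flag in particular is just one possible way to record redundancy.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE equipment (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    kind TEXT NOT NULL            -- e.g. 'RDC', 'EBD', 'UPB'
);
CREATE TABLE power_supply (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- Matches a piece of equipment with each supply that powers it.
CREATE TABLE equipment_power (
    equipment_id    INTEGER NOT NULL REFERENCES equipment(id),
    power_supply_id INTEGER NOT NULL REFERENCES power_supply(id),
    is_backup       INTEGER NOT NULL DEFAULT 0,   -- hypothetical redundancy flag
    PRIMARY KEY (equipment_id, power_supply_id)
);
INSERT INTO equipment VALUES (1, 'RDC-1', 'RDC'), (2, 'UPB-1', 'UPB');
INSERT INTO power_supply VALUES (1, 'PDU-A'), (2, 'PDU-B');
INSERT INTO equipment_power VALUES (1, 1, 0), (1, 2, 1), (2, 1, 0);
""")

# Which supplies power RDC-1, primary first?
rows = conn.execute("""
    SELECT p.name, ep.is_backup
    FROM equipment_power AS ep
    JOIN equipment AS e ON e.id = ep.equipment_id
    JOIN power_supply AS p ON p.id = ep.power_supply_id
    WHERE e.name = 'RDC-1'
    ORDER BY ep.is_backup
""").fetchall()
print(rows)  # [('PDU-A', 0), ('PDU-B', 1)]
```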
This sounds like a job for an entity-relationship model.
But, in the interest of answering your question, here's how I would set it up. I believe I understand the relationships between entities. My shorthand follows this pattern: Table [TableName] ([columns]). I tried to name them so they make the relationships obvious.
Table RDC (id)
Table PDU (id)
Table UPB (id, PduId) // Many-to-one relationship between UPBs and Pdus
Table EBD (id)
Table PDU_RDC (PduId, RdcId) // represents many-to-many relationship between PDUs and RDCs
Table PDU_EBD (PduId, EbdId) // represents many-to-many relationship between PDUs and EBDs
Good luck!
Instead of focusing on "entities" focus on basic facts. Each gives a table or view.
Some of the basic facts just involve entities; others are about (ids of) entities:
RDC(id) // id identifies a remote distribution center
powers(pid,rid) // PDU pid powers RDC rid
backup(rid1,rid2) // RDC rid1 is backed up by RDC rid2
active(rid) // RDC is active
Until you supply adequate statements that you want to make or use, we can only answer you with guesses or principles; given statements and business rules, we can suggest alternatives and rearrangements.
When you get AND between two statements you already have, the table with that statement is expressible as a JOIN of the two statements' tables.
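For instance, the statement "PDU pid powers RDC rid AND RDC rid is active" corresponds to joining the tables for the two simpler statements. Here is a small sketch using Python's sqlite3 module; the sample rows are invented, and SQLite stands in for SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE powers (pid TEXT, rid TEXT, PRIMARY KEY (pid, rid)); -- PDU pid powers RDC rid
CREATE TABLE active (rid TEXT PRIMARY KEY);                       -- RDC rid is active
INSERT INTO powers VALUES ('PDU-A', 'RDC-1'), ('PDU-A', 'RDC-2'), ('PDU-B', 'RDC-2');
INSERT INTO active VALUES ('RDC-2');
""")

# Rows (pid, rid) such that "pid powers rid AND rid is active".
rows = conn.execute("""
    SELECT powers.pid, powers.rid
    FROM powers
    JOIN active ON active.rid = powers.rid
    ORDER BY powers.pid
""").fetchall()
print(rows)  # [('PDU-A', 'RDC-2'), ('PDU-B', 'RDC-2')]
```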
You can introduce notions like hardware type but the tables/statements for that way will involve simpler statements (for which you may have defined tables). The former tables/statements are joins of the latter, and the latter are projections of the former. This means you can write views of either way in terms of the other. Neither is more complex; you have fewer things with more parts or more simpler things. Queries involving given statement will be simpler--but using the appropriate view neither is more complex. However, each way has corresponding versions of constraints and SQL might make certain constraints hard to express declaratively. Investigate join performance later as a non-premature optimization.
When a column is a function of a set of columns there is an FD from the set to the column. A column set forms a key when all other columns are functions of it but of no subset. FDs and keys are kinds of constraint.
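A quick way to get a feel for this is to test a candidate FD against sample rows. The sketch below uses a hypothetical helper `fd_holds` and made-up columns (`rid`, `floor`, `pid`). Sample data can only refute an FD, never prove it, because an FD is a constraint on every state the table may ever be in.

```python
def fd_holds(rows, lhs, rhs):
    """Check whether the columns `rhs` are a function of the columns `lhs` in `rows`."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in lhs)
        val = tuple(row[c] for c in rhs)
        # The same left-hand values must always map to the same right-hand values.
        if seen.setdefault(key, val) != val:
            return False
    return True


# Hypothetical rows: each RDC sits on one floor but may be powered by several PDUs.
rows = [
    {"rid": "RDC-1", "floor": 1, "pid": "PDU-A"},
    {"rid": "RDC-1", "floor": 1, "pid": "PDU-B"},
    {"rid": "RDC-2", "floor": 2, "pid": "PDU-A"},
]
print(fd_holds(rows, ["rid"], ["floor"]))  # True: rid -> floor is not refuted
print(fd_holds(rows, ["rid"], ["pid"]))    # False: RDC-1 has two different PDUs
```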
There will be certain constraints that a projection of a source table is always a subset of a projection of a target table (maybe the same one). That's an inclusion dependency (IND). Informally it means something(c1,...) IMPLIES otherthing(c1,...). Formally, EXISTS x1,... t1(c1,...,x1,...) IMPLIES EXISTS y1,... t2(c1,...,y1,...). If the target projection's columns form a key in its table, there's also a foreign key (FK). SQL FK [sic] declarations actually declare INDs.
There will be other constraints.
Supplying whatever-to-whateverness for a table is just one property about it. Not being 0-or-more-to-0-or-more means a corresponding FD or IND holds. People talk about "a" "1-to-n" "relationship" between entity types or tables but that's just sloppy unclear expression of some constraint. Make sure you know exactly the table(s) and constraint(s) that means.
Read about ORM2 (or NIAM or FCO-IM) because it is based on relational principles (although it could be more so).

Rails - Common fields/report data among multiple models - STI, hstore, or split tables?

I have a rails app in which I have a group of models (let's call them Events) that have some fields in common (date, title, user_id), but then I need some "subtypes". A SalesEvent might have a article_id and an amount. An InterviewEvent might have a comments field. And so on.
I know 3 business requirements I need to meet:
in some occasions I'll want to frame the Events as a whole (i.e. "get all the Events for this user, and sort them chronologically, grouped in months")
in other occasions I will need only the "subtypes" ("get all the articles sold by this user").
the number of subtypes can be moderately high (still TBD, but we estimate around 20, depending on user feedback)
I'm pondering about how to structure the tables to support this model. I came out with 5 possible ways to model this, but each one has its own drawbacks.
Option A: Separate tables - sales_events and interview_events. This would make 2) very simple, and 3) feasible, but 1) would be very cumbersome to implement.
Option B: Single table inheritance. This would solve 1) and 2) more or less easily, but it has the issue of requiring more and more nullable fields, which doesn't play well with 3)
Option C: Using hstore - Since we're using Postgres in production, we could use hstore - we would have a "data" field governed by a "type" string field. This would solve 1), 2) and 3), but ties us to postgresql, and we would implement a key business object in a technology we are not very familiar with. I'd rather avoid that if possible.
Option D: events table with polymorphic link to ***_event_data. We would basically have an events table with a type and event_data_id, and then we would have sale_event_data, interview_event_data, etc. This satisfies 1) and 3) well, but 2) is a bit weaker than in other approaches, since there will be lots of joins involved in linking the events with their data.
Option E: Sale has_one :event. This does the same as Option D, except that the "link to the other" is on the "data" part. It also solves 1) and 3), and also involves some joins in 2), but it seems a bit more "clean"; there are no polymorphic associations here, just "regular" sql ones.
Right now I'm inclined to use Option E. But I'd like to know if anyone sees an obvious disadvantage on it, or a greater benefit in one of the other options, or a better option that I didn't think of.
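For concreteness, here is a rough sketch of what Option E's tables might look like, written with Python's sqlite3 module rather than Rails migrations. It assumes the foreign key lives on the subtype tables (`sales.event_id`, `interviews.event_id`), which is only one reading of "the link is on the data part". All table names, columns, and sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE events (
    id      INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL,
    date    TEXT NOT NULL,
    title   TEXT NOT NULL
);
-- Assumption: each subtype row points at exactly one shared event row.
CREATE TABLE sales (
    id         INTEGER PRIMARY KEY,
    event_id   INTEGER NOT NULL UNIQUE REFERENCES events(id),
    article_id INTEGER NOT NULL,
    amount     INTEGER NOT NULL
);
CREATE TABLE interviews (
    id       INTEGER PRIMARY KEY,
    event_id INTEGER NOT NULL UNIQUE REFERENCES events(id),
    comments TEXT
);
INSERT INTO events VALUES
    (1, 7, '2013-01-15', 'Sold a lamp'),
    (2, 7, '2013-02-03', 'Interview'),
    (3, 7, '2013-02-20', 'Sold a desk');
INSERT INTO sales VALUES (1, 1, 100, 25), (2, 3, 101, 140);
INSERT INTO interviews VALUES (1, 2, 'Went well');
""")

# Requirement 1: all events for a user, chronologically, with their month.
by_month = conn.execute("""
    SELECT strftime('%Y-%m', events.date) AS month, events.title
    FROM events
    WHERE events.user_id = 7
    ORDER BY events.date
""").fetchall()
print(by_month)

# Requirement 2: only one subtype, here the articles sold by the user (one join).
articles = conn.execute("""
    SELECT sales.article_id
    FROM sales
    JOIN events ON events.id = sales.event_id
    WHERE events.user_id = 7
    ORDER BY events.date
""").fetchall()
print(articles)
```

Under this assumption, requirement 1 touches only the events table, and requirement 2 costs a single join per subtype.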
I have used almost all your suggested options. While I would eliminate options A, B and D for the following reasons, I can't talk about C because I don't know hstore and don't use Postgres:
Option A: Separate tables, as you said, would be very difficult to maintain. Each time you would want to change the structure of events, you'd have to do it on all the sub_events tables.
Option B: Single table inheritance. I have used it a lot and dropped it. I felt there was a big design gap between what you see in the database and what your models look like. It also leaves lots of nil fields.
Option D: events table with polymorphic link to *_event_data. Polymorphic tables are not meant for that purpose. They are a way to have different type fields in a model so you could reference it without specifying the type explicitly.
Option E seems OK, but where should the foreign key be stored? It is hard to tell, and it may lead to situations that are difficult to maintain.
Personally, I would go with the code I want to write, what would make using it and reading it later easier. I like things when they are more specific. And I would simply change the way I name my models so that it satisfies my needs. You have to be creative!
I would rather write something like that:
conference.event_information.users OR
sales_event.settings.title OR
interview.shared_information.comments OR
event.interview_details.starting_at
With all those examples, I'd use classical has_many and belongs_to relationships.
I think that the whole concept of data types and inheritance can put you in situations where it does not solve problems or make things clearer. Sometimes you just need to see things a little differently.
I hope it helps.
Rails doesn't support Multiple Table Inheritance by default, but it turns out it's possible to model it pretty closely.
See this article:
http://mediumexposure.com/multiple-table-inheritance-active-record/
Basically, it uses a module to "modify" Option D. I'm still pondering about Wawa Loo's answer, but this one is also worth considering.
EDIT: more on multiple-table inheritance: a gem called "citier" http://peterhamilton.github.com/citier/index.html
EDIT2: I ended up using multiple_table_inheritance:
https://github.com/mhuggins/multiple_table_inheritance
But I'm not very satisfied with the results. This is probably one of those places where having the business data tightly coupled with the persistence policies (as ActiveRecord does) doesn't help very much. It does the job sufficiently well, but it is not perfect (notably, instance methods can be "inherited", but not class methods. Things like scopes have to be repeated/mixed in separately on each subclass).

Table Naming Dilemma: Singular vs. Plural Names [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
Academia has it that table names should be the singular of the entity that they store attributes of.
I dislike any T-SQL that requires square brackets around names, but I have renamed a Users table to the singular, forever sentencing those using the table to sometimes have to use brackets.
My gut feel is that it is more correct to stay with the singular, but my gut feel is also that brackets indicate undesirables like column names with spaces in them etc.
Should I stay, or should I go?
I had the same question, and after reading all the answers here I am definitely staying with SINGULAR. My reasons:
Reason 1 (Concept). You can think of a bag containing apples as an "AppleBag"; it doesn't matter whether it contains 0, 1 or a million apples, it is always the same bag. Tables are just that, containers: the table name must describe what it contains, not how much data it contains. Additionally, the plural concept belongs more to spoken language (where it signals whether there is one or more).
Reason 2 (Convenience). It is easier to come up with singular names than with plural ones. Objects can have irregular plurals or no plural at all, but will always have a singular one (with few exceptions like News).
Customer
Order
User
Status
News
Reason 3 (Aesthetic and Order). Especially in master-detail scenarios, this reads better, aligns better by name, and has a more logical order (Master first, Detail second):
1.Order
2.OrderDetail
Compared to:
1.OrderDetails
2.Orders
Reason 4 (Simplicity). Taken all together (table names, primary keys, relationships, entity classes...), it is better to be aware of only one name (singular) instead of two (singular class, plural table, singular field, singular-plural master-detail...).
Customer
Customer.CustomerID
CustomerAddress
public class Customer {...}
SELECT * FROM Customer WHERE CustomerID = 100
Once you know you are dealing with "Customer", you can be sure you will use the same word for all of your database interaction needs.
Reason 5. (Globalization). The world is getting smaller, you may have a team of different nationalities, not everybody has English as a native language. It would be easier for a non-native English language programmer to think of "Repository" than of "Repositories", or "Status" instead of "Statuses". Having singular names can lead to fewer errors caused by typos, save time by not having to think "is it Child or Children?", hence improving productivity.
Reason 6. (Why not?). It can even save you writing time, save you disk space, and even make your computer keyboard last longer!
SELECT Customer.CustomerName FROM Customer WHERE Customer.CustomerID = 100
SELECT Customers.CustomerName FROM Customers WHERE Customers.CustomerID = 100
You have saved 3 letters, 3 bytes, 3 extra keyboard hits :)
And finally, you can rename the tables that clash with reserved words, for example:
User > LoginUser, AppUser, SystemUser, CMSUser,...
Or use the infamous square brackets [User]
I prefer to use the uninflected noun, which in English happens to be singular.
Inflecting the number of the table name causes orthographic problems (as many of the other answers show), but choosing to do so because tables usually contain multiple rows is also semantically full of holes. This is more obvious if we consider a language that inflects nouns based on case (as most do):
Since we're usually doing something with the rows, why not put the name in the accusative case? If we have a table that we write to more than we read, why not put the name in dative? It's a table of something, why not use the genitive? We wouldn't do this, because the table is defined as an abstract container that exists regardless of its state or usage. Inflecting the noun without a precise and absolute semantic reason is babbling.
Using the uninflected noun is simple, logical, regular and language-independent.
If you use Object Relational Mapping tools or will in the future I suggest Singular.
Some tools like LLBLGen can automatically correct plural names like Users to User without changing the table name itself. Why does this matter? Because when it's mapped you want it to look like User.Name instead of Users.Name or, worse, like tblUsers.strName from the naming in some of my old databases, which is just confusing in code.
My new rule of thumb is to judge how it will look once it's been converted into an object.
One table I've found that does not fit the new naming I use is UsersInRoles. But there will always be those few exceptions and even in this case it looks fine as UsersInRoles.Username.
Others have given pretty good answers as far as "standards" go, but I just wanted to add this... Is it possible that "User" (or "Users") is not actually a full description of the data held in the table? Not that you should get too crazy with table names and specificity, but perhaps something like "Widget_Users" (where "Widget" is the name of your application or website) would be more appropriate.
What convention requires that tables have singular names? I always thought it was plural names.
A user is added to the Users table.
This site agrees:
http://vyaskn.tripod.com/object_naming.htm#Tables
This site disagrees (but I disagree with it):
http://justinsomnia.org/writings/naming_conventions.html
As others have mentioned: these are just guidelines. Pick a convention that works for you and your company/project and stick with it. Switching between singular and plural or sometimes abbreviating words and sometimes not is much more aggravating.
How about this as a simple example:
SELECT Customer.Name, Customer.Address FROM Customer WHERE Customer.Name > 'def'
vs.
SELECT Customers.Name, Customers.Address FROM Customers WHERE Customers.Name > 'def'
The SQL in the latter is stranger sounding than the former.
I vote for singular.
I am of the firm belief that in an Entity Relation Diagram, the entity should be reflected with a singular name, similar to a class name being singular. Once instantiated, the name reflects its instance. So with databases, the entity when made into a table (a collection of entities or records) is plural. Entity, User is made into table Users. I would agree with others who suggested maybe the name User could be improved to Employee or something more applicable to your scenario.
This then makes more sense in a SQL statement because you are selecting from a group of records and if the table name is singular, it doesn't read well.
I stick with singular for table names and any programming entity.
The reason? The fact that there are irregular plurals in English like mouse ⇒ mice and sheep ⇒ sheep. Then, if I need a collection, I just use mouses or sheeps, and move on.
It really helps the plurality stand out, and I can easily and programmatically determine what the collection of things would look like.
So, my rule is: everything is singular, every collection of things is singular with an s appended. Helps with ORMs too.
IMHO, table names should be plural like Customers.
Class names should be singular like Customer if it maps to a row in the Customers table.
Singular. I don't buy any argument involving which is most logical - every person thinks his own preference is most logical. No matter what you do it is a mess, just pick a convention and stick to it. We are trying to map a language with highly irregular grammar and semantics (normal spoken and written language) to a highly regular (SQL) grammar with very specific semantics.
My main argument is that I don't think of the tables as a set but as relations.
So, the AppUser relation tells which entities are AppUsers.
The AppUserGroup relation tells me which entities are AppUserGroups.
The AppUser_AppUserGroup relation tells me how the AppUsers and AppUserGroups are related.
The AppUserGroup_AppUserGroup relation tells me how AppUserGroups and AppUserGroups are related (i.e. groups that are members of groups).
In other words, when I think about entities and how they are related I think of relations in singular, but of course, when I think of the entities in collections or sets, the collections or sets are plural.
In my code, then, and in the database schema, I use singular. In textual descriptions, I end up using plural for increased readability - then use fonts etc. to distinguish the table/relation name from the plural s.
I like to think of it as messy, but systematic - and this way there is always a systematically generated name for the relation I wish to express, which to me is very important.
I also would go with plurals, and with the aforementioned Users dilemma, we do take the square bracketing approach.
We do this to provide uniformity between both database architecture and application architecture, with the underlying understanding that the Users table is a collection of User values as much as a Users collection in a code artifact is a collection of User objects.
Having our data team and our developers speaking the same conceptual language (although not always the same object names) makes it easier to convey ideas between them.
I personally prefer to use plural names to represent a set; it just "sounds" better to my relational mind.
At this exact moment I am using singular names to define a data model for my company, because most of the people at work feel more comfortable with it.
Sometimes you just have to make life easier for everyone instead of imposing your personal preferences.
(That's how I ended up in this thread, to get a confirmation on what should be the "best practice" for naming tables.)
After reading all the arguing in this thread, I reached one conclusion:
I like my pancakes with honey, no matter what everybody's favorite flavour is. But if I am cooking for other people, I will try to serve them something they like.
Singular. I'd call an array containing a bunch of user row representation objects 'users', but the table is 'the user table'. Thinking of the table as being nothing but the set of the rows it contains is wrong, IMO; the table is the metadata, and the set of rows is hierarchically attached to the table, it is not the table itself.
I use ORMs all the time, of course, and it helps that ORM code written with plural table names looks stupid.
I've actually always thought it was popular convention to use plural table names. Up until this point I've always used plural.
I can understand the argument for singular table names, but to me plural makes more sense. A table name usually describes what the table contains. In a normalized database, each table contains specific sets of data. Each row is an entity and the table contains many entities. Thus the plural form for the table name.
A table of cars would have the name cars and each row is a car. I'll admit that specifying the table along with the field in a table.field manner is the best practice and that having singular table names is more readable. However in the following two examples, the former makes more sense:
SELECT * FROM cars WHERE color='blue'
SELECT * FROM car WHERE color='blue'
Honestly, I will be rethinking my position on the matter, and I would rely on the actual conventions used by the organization I'm developing for. However, I think for my personal conventions, I'll stick with plural table names. To me it makes more sense.
I don't like plural table names because some nouns in English are not countable (water, soup, cash) or the meaning changes when you make it countable (chicken vs a chicken; meat vs bird).
I also dislike using abbreviations for table name or column name because doing so adds extra slope to the already steep learning curve.
Ironically, I might make User an exception and call it Users because of USER (Transact-SQL), because I too don't like using brackets around tables if I don't have to.
I also like to name all the ID columns as Id, not ChickenId or ChickensId (what do plural guys do about this?).
All this is because I don't have proper respect for the database systems, I am just reapplying one-trick-pony knowledge from OO naming conventions like Java's out of habit and laziness. I wish there were better IDE support for complicated SQL.
We run similar standards: when scripting we demand [ ] around names and, where appropriate, schema qualifiers. Primarily this hedges your bets against future name grabs by the SQL syntax.
SELECT [Name] FROM [dbo].[Customer] WHERE [Location] = 'WA'
This has saved our souls in the past - some of our database systems have run 10+ years from SQL 6.0 through SQL 2005 - way past their intended lifespans.
If we look at MS SQL Server's system tables, their names as assigned by Microsoft are in plural.
Oracle's system tables are mostly named in the singular, although a few of them are plural.
Oracle recommends plural for user-defined table names.
That doesn't make much sense that they recommend one thing and follow another.
That the architects at these two software giants have named their tables using different conventions, doesn't make much sense either... After all, what are these guys ... PhD's?
I do remember in academia, the recommendation was singular.
For example, when we say:
select OrderHeader.ID FROM OrderHeader WHERE OrderHeader.Reference = 'ABC123'
maybe b/c each ID is selected from a particular single row ...?
The system tables/views of the server itself (SYSCAT.TABLES, dbo.sysindexes, ALL_TABLES, information_schema.columns, etc.) are almost always plural. I guess for the sake of consistency I'd follow their lead.
I am a fan of singular table names as they make my ER diagrams using CASE syntax easier to read, but by reading these responses I'm getting the feeling it never caught on very well? I personally love it. There is a good overview with examples of how readable your models can be when you use singular table names, add action verbs to your relationships and form good sentences for every relationships. It's all a bit of overkill for a 20 table database but if you have a DB with hundreds of tables and a complex design how will your developers ever understand it without a good readable diagram?
http://www.aisintl.com/case/method.html
As for prefixing tables and views I absolutely hate that practice. Give a person no information at all before giving them possibly bad information. Anyone browsing a db for objects can quite easily tell a table from a view, but if I have a table named tblUsers that for some reason I decide to restructure in the future into two tables, with a view unifying them to keep from breaking old code I now have a view named tblUsers. At this point I am left with two unappealing options, leave a view named with a tbl prefix which may confuse some developers, or force another layer, either middle tier or application to be rewritten to reference my new structure or name viewUsers. That negates a large part of the value of views IMHO.
Tables: plural
Multiple users are listed in the users table.
Models: singular
A singular user can be selected from the users table.
Controllers: plural
http://myapp.com/users would list multiple users.
That's my take on it anyway.
I once used "Dude" for the User table - same short number of characters, no conflict with keywords, still a reference to a generic human. If I weren't concerned about the stuffy heads that might see the code, I would have kept it that way.
I've always used singular simply because that's what I was taught. However, while creating a new schema recently, for the first time in a long time, I actively decided to maintain this convention simply because... it's shorter. Adding an 's' to the end of every table name seems as useless to me as adding 'tbl_' in front of every one.
This may be a bit redundant, but I would suggest being cautious. Not necessarily that it's a bad thing to rename tables, but standardization is just that; a standard -- this database may already be "standardized", however badly :) -- I would suggest consistency to be a better goal given that this database already exists and presumably it consists of more than just 2 tables.
Unless you can standardize the entire database, or at least are planning to work towards that end, I suspect that table names are just the tip of the iceberg and concentrating on the task at hand, enduring the pain of poorly named objects, may be in your best interest --
Practical consistency sometimes is the best standard... :)
my2cents ---
As others have mentioned here, conventions should be a tool for adding to the ease of use and readability. Not as a shackle or a club to torture developers.
That said, my personal preference is to use singular names for both tables and columns. This probably comes from my programming background. Class names are generally singular unless they are some sort of collection. In my mind I am storing or reading individual records in the table in question, so singular makes sense to me.
This practice also allows me to reserve plural table names for those that store many-to-many relationships between my objects.
I try to avoid reserved words in my table and column names, as well. In the case in question here it makes more sense to go counter to the singular convention for Users to avoid the need to encapsulate a table that uses the reserved word of User.
I like using prefixes in a limited manner (tbl for table names, sp_ for proc names, etc), though many believe this adds clutter. I also prefer CamelBack names to underscores because I always end up hitting the + instead of _ when typing the name. Many others disagree.
Here is another good link for naming convention guidelines: http://www.xaprb.com/blog/2008/10/26/the-power-of-a-good-sql-naming-convention/
Remember that the most important factor in your convention is that it make sense to the people interacting with the database in question. There is no "One Ring to Rule Them All" when it comes to naming conventions.
Possible alternatives:
Rename the table SystemUser
Use brackets
Keep the plural table names.
IMO using brackets is technically the safest approach, though it is a bit cumbersome. IMO it's 6 of one, half-a-dozen of the other, and your solution really just boils down to personal/team preference.
My take is in semantics depending on how you define your container. For example, A "bag of apples" or simply "apples" or an "apple bag" or "apple".
Example:
a "college" table can contain 0 or more colleges
a table of "colleges" can contain 0 or more colleges
a "student" table can contain 0 or more students
a table of "students" can contain 0 or more students.
My conclusion is that either is fine, but you have to decide how you (or the people interacting with it) are going to refer to the tables: "an x table" or a "table of xs".
I think using the singular is what we were taught in university. But at the same time you could argue that unlike in object oriented programming, a table is not an instance of its records.
I think I'm tipping in favour of the singular at the moment because of plural irregularities in English. In German it's even worse due to no consistent plural forms - sometimes you cannot tell if a word is plural or not without the specifying article in front of it (der/die/das). And in Chinese languages there are no plural forms anyway.
I only use nouns for my table names that are spelled the same, whether singular or plural:
moose
fish
deer
aircraft
you
pants
shorts
eyeglasses
scissors
species
offspring
I did not see this clearly articulated in any of the previous answers. Many programmers have no formal definition in mind when working with tables. We often communicate intuitively in terms of "records" or "rows". However, with some exceptions for denormalized relations, tables are usually designed so that the relation between the non-key attributes and the key constitutes a set theoretic function.
A function can be defined as a subset of a cross-product between two sets, in which each element of the set of keys occurs at most once in the mapping. Hence the terminology arising from that perspective tends to be singular. One sees the same singular (or at least, non-plural) convention across other mathematical and computational theories involving functions (algebra and lambda calculus for instance).
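As a small illustration of that definition, here is a sketch in Python that treats a table as a set of (key, non-key attributes) pairs and checks that each key occurs at most once. The helper name is_function and the student data are made up for the example.

```python
def is_function(pairs):
    """Return True if each key occurs at most once in a set of (key, value) pairs."""
    keys = [key for key, _ in pairs]
    return len(keys) == len(set(keys))


# A hypothetical "student" table: key -> (name, subject).
student = {
    (1, ("Ann", "Physics")),
    (2, ("Bob", "Maths")),
}

# Adding a second row for key 1 breaks the functional property.
not_a_function = student | {(1, ("Ann", "Chemistry"))}

print(is_function(student), is_function(not_a_function))
```

Seen this way, each row is one application of the function, which is the intuition behind naming the table after a single "student".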
I always thought that was a dumb convention. I use plural table names.
(I believe the rationale behind that policy is that it makes it easier for ORM code generators to produce object & collection classes, since it is easier to produce a plural name from a singular name than vice-versa)

Language features to implement relational algebra

I've been trying to encode a relational algebra in Scala (which to my knowledge has one of the most advanced type systems) and just don't seem to find a way to get where I want.
As I'm not that experienced with the academic field of programming language design I don't really know what feature to look for.
So what language features would be needed, and what language has those features, to implement a statically verified relational algebra?
Some of the requirements:
A Tuple is a function mapping names from a statically defined set of valid names for the tuple in question to values of the type specified by the name. Let's call this name-type set the domain.
A Relation is a Set of Tuples with the same domain such that the range of any tuple is unique in the Set.
So far this can easily be modeled in Scala simply by
trait Tuple
trait Relation[T <: Tuple] extends Set[T]
The vals, vars and defs in Tuple are the name-type set defined above. But there shouldn't be two defs in Tuple with the same name. Also, vars and impure defs should probably be restricted too.
Now for the tricky part:
A join of two relations is a relation where the domain of the tuples is the union of the domains of the operands' tuples, such that only tuples having the same ranges for the intersection of their domains are kept.
def join[T1 <: Tuple, T2 <: Tuple](r1: Relation[T1], r2: Relation[T2]): Relation[T1 with T2]
should do the trick.
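To pin down what join is supposed to compute, here is a dynamically checked sketch in Python rather than Scala. It does nothing for the static typing problem; it only shows the runtime behaviour that the signature above is meant to describe. The helper row and the sample relations are assumptions made for the example.

```python
def row(**fields):
    """A relational tuple: an immutable, hashable mapping from names to values."""
    return frozenset(fields.items())


def join(r1, r2):
    """Natural join: merge every pair of tuples that agree on all shared names."""
    result = set()
    for a in r1:
        for b in r2:
            da, db = dict(a), dict(b)
            shared = da.keys() & db.keys()  # intersection of the two domains
            if all(da[name] == db[name] for name in shared):
                # The merged tuple's domain is the union of both domains.
                result.add(frozenset({**da, **db}.items()))
    return result


# Hypothetical sample relations.
employees = {row(name="Ann", dept=1), row(name="Bob", dept=2)}
departments = {row(dept=1, title="Sales"), row(dept=3, title="Ops")}

joined = join(employees, departments)
print(joined == {row(name="Ann", dept=1, title="Sales")})
```

When the two domains share no names, the condition is trivially true and the result is the full cross product, which matches the definition given above.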
A projection of a Relation is a Relation where the domain of the tuples is a subset of the domain of the operand's tuples.
def project[T <: Tuple, T2 >: T](r: Relation[T], ?1): Relation[T2]
This is where I'm not sure if it's even possible to find a solution. What do you think? What language features are needed to define project?
Implied above, of course, is that the API has to be usable. Layers and layers of boilerplate are not acceptable.
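For reference, the runtime meaning of projection is easy to state even though its static type is not. The following Python sketch (again not Scala, and with hypothetical helper names and sample data) keeps only the requested names and checks at run time that they belong to the tuple's domain, which is exactly the check one would like the compiler to perform instead.

```python
def row(**fields):
    """A relational tuple: an immutable, hashable mapping from names to values."""
    return frozenset(fields.items())


def project(relation, names):
    """Keep only the given names in every tuple; duplicates collapse in the set."""
    wanted = set(names)
    result = set()
    for t in relation:
        fields = dict(t)
        missing = wanted - fields.keys()
        if missing:
            # In a statically verified design this would be a compile-time error.
            raise KeyError(f"not in the tuple's domain: {sorted(missing)}")
        result.add(frozenset((name, fields[name]) for name in wanted))
    return result


# Hypothetical sample relation.
people = {row(name="Ann", dept=1), row(name="Bob", dept=1)}

depts = project(people, ["dept"])
print(depts == {row(dept=1)})
```

Because the result is a set, the two input tuples collapse into a single tuple once name is dropped.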
What you're asking for is to be able to structurally define a type as the difference of two other types (the original relation and the projection definition). I honestly can't think of any language which would allow you to do that. Types can be structurally cumulative (A with B) since A with B is a structural sub-type of both A and B. However, if you think about it, a type operation A less B would actually be a supertype of A, rather than a sub-type. You're asking for an arbitrary, contravariant typing relation on naturally covariant types. It hasn't even been proven that sort of thing is sound with nominal existential types, much less structural declaration-point types.
I've worked on this sort of modeling before, and the route I took was to constrain projections to one of three domains: P == T, P == {F} where F in T, P == {$_1} where $_1 anonymous. The first is where the projection is equivalent to the input type, meaning it is a no-op (SELECT *). The second is saying that the projection is a single field contained within the input type. The third is the tricky one. It is saying that you are allowing the declaration of some anonymous type $_1 which has no static relationship to the input type. Presumably it will consist of fields which delegate to the input type, but we can't enforce that. This is roughly the strategy that LINQ takes.
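The three cases can be sketched in Python as three separate operations. This is only an illustration of their shape, not of the original implementation and not of LINQ itself; Person, NameOnly and the function names are hypothetical, and Python enforces none of this statically.

```python
from typing import NamedTuple


class Person(NamedTuple):
    name: str
    dept: int


class NameOnly(NamedTuple):
    # Stands in for the "anonymous" result type; nothing ties it to Person.
    label: str


def project_all(relation):
    """P == T: the projection is the input type, so this is a no-op."""
    return set(relation)


def project_field(relation, field):
    """P == {F}: the projection is a single field of the input type."""
    return {getattr(t, field) for t in relation}


def project_with(relation, mapper):
    """P == {$_1}: the caller supplies an arbitrary mapping to a new type."""
    return {mapper(t) for t in relation}


# Hypothetical sample data.
people = {Person("Ann", 1), Person("Bob", 1)}

same = project_all(people)
depts = project_field(people, "dept")
labels = project_with(people, lambda p: NameOnly(label=p.name.upper()))
print(same == people, depts, len(labels))
```

In the third case the mapper could just as well ignore its input, which is the "we can't enforce that" caveat from the paragraph above.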
Sorry I couldn't be more helpful. I wish it were possible to do what you're asking, it would open up a lot of very neat possibilities.
I think I have settled on just using the normal facilities for mapping collections for the project part. The client just specifies a function [T<:Tuple](t:T) => P
With some Java trickery to get to the class of P I should be able to use reflection to implement the query logic.
For the join I'll probably use DynamicProxy to implement the mapping function.
As a bonus I might be able to get the API to be usable with Scala's special for-syntax.