I have an entity Company with different attributes.
One of the attributes is logo, but logo itself is characterized by other attributes, like filename and type.
The relation between company and logo is one to one, an is mandatory to have a logo.
Should I add everything as attributes/columns of company or should I create a separate table for Logo and his characteristics ?
Using same table I avoid a join or another query in the database, plus more codding, what are the advantages to use a separate logo table ?
The advantage of a separate logo table is that you can scan the companies table without having to read the logo information. If it is just a handful of columns, this might not make a different. Under some circumstances, it could.
One argument in favor of a separate table would be whether other entities want to connect specifically to the logo and not to the company. If so, a separate id would allow such foreign key relationships.
I might challenge the notion that a company has exactly one logo. First, companies can have multiple brands with their own logos or they may have different logos in different geographies/languages. Second, the logo might change over time. If you want to keep a history, then another table would be a no-brainer.
Related
[Assuming there is a one to many relationship between an individual and an address, and assuming there is a one to many relationship between an agency and an address.]
Given the following table structure:
Wouldn't you want to merge the two address tables together and instead of using a foreign key within each one use a tie table?
Like this:
Are they both valid for normalization or only one?
Depends what you want to do.
In your second example with the tie tables, if I want to do a mailshot to my customers then my query has to go out to the agency tie table to exclude any agency addresses.
Of course you could have an address type column to differentiate but then you have a more complex query for your insert statement.
So although "address" is a global idea, sometimes it is easier to have it segregated by context.
Secondly, your customer data would usually be changing much more than your agency data. There may also be organisational and legal requirements around storage of personal data that make it better to separate the two.
e.g. in a health records system I want to be able to easily extract / restrict client data and to keep my configuration or commissioning data separate.
Thus in all the client systems I have used, the model tends to be the first one you describe rather than the second.
I'm completely redesigning my company's databases, back-end and front-end. One thing I've seen discussed online quite often is defining the relationships between tables using a primary key, clustered indexes, non-clustered indexes, etc.
In terms of performance, I've been debating how to best setup the structure for a couple of my tables. Within several of the tables, there is a location field. In the current setup, users manually enter the location, something similar to C1.H39.3. Where C1 denotes the building, .H is the letter, 39 is the number, and .3 is the tile.
My question is how should I link my location table to my other tables. The location table is setup with ID (identity), building, letter, number, and tile columns. When creating my relationships, would it be better to merely have a column for the id, or should I composite the columns together? What would be more beneficial to performance?
I wouldn't store concatenated values in a single field. You should split these out into what they actually represent and have 4 fields(Building, Letter, Number, Tile). This will save you headaches down the road. If you want to show the exact location, you can display those values concatenated back together.
If you want to grab all the records in building C for example, this would be more difficult if the data is all in a single field.
Also, this has the added benefit of allowing you to create lookup tables. You should have a table for bulidings, with the BuildingID, and building details, a table for Letters with all the letters etc. One of the problems it sounds like you may face is that when users MANUALLY enter in this complex location human errors are more likely. If you use a look up table the user will be less likely to make a mistake.
EDIT:
Would it be a good idea to just keep it all under 1 big table and have a flag that differentiates the different forms?
I have to build a site with 5 forms, maybe more. so far the fields for the forms are the following:
What would be the best approach to normalize this design?
I was thinking about splitting "Personal Details" into 3 different tables:
and then reference them from the others with an ID...
Would that make sense? It looks like I'll end up with lots of relationships...
Normalized data essentially means that the same data is not stored multiple times in multiple places. For example, instead of storing the customer contact info with an order, the customer ID is stored with the order and the customer's contact information is 'related' to the order. When the customer's phone number is updated, there is only one place the phone number needs to be updated (the customer table) and all the orders will have the correct information without being updated. Each piece of data exists in one, and only one, place. This is normalized data.
So, to answer your question: no, you will not make your database structure more normalized by breaking up a large table as you described.
The reason to break up a single table into multiple tables is usually to create a one to many relationship. For example, one person might have multiple e-mail addresses. Or multiple physical addresses. Another common reason for breaking up tables is to make systems modular, so that tables can be created that join to existing tables without modifying the existing tables.
Breaking one big table into multiple little tables, with a one to one relationship between them, doesn't make the data any more normalized, it just makes your queries more of a pain to write.* And you don't want to structure your database design around interfaces (forms) unless there is a good reason. There usually isn't.
*Although there are sometimes good reasons to break up big tables and create one to one relationships, normalization isn't one of them.
So I know the convention for naming M-M relationship tables in SQL is to have something like so:
For tables User and Data the relationship table would be called
UserData
User_Data
or something similar (from here)
What happens then if you need to have multiple relationships between User and Data, representing each in its own table? I have a site I'm working on where I have two primary items and multiple independent M-M relationships between them. I know I could just use a single relationship table and have a field which determines the relationship type, but I'm not sure whether this is a good solution. Assuming I don't go that route, what naming convention should I follow to work around my original problem?
To make it more clear, say my site is an auction site (it isn't but the principle is similar). I have registered users and I have items, a user does not have to be registered to post an item but they do need to be to do anything else. I have table User which has info on registered users and Items which has info on posted items. Now a user can bid on an item, but they can also report a item (spam, etc.), both of these are M-M relationships. All that happens when either event occurs is that an email is generated, in my scenario I have no reason to keep track of the actual "report" or "bid" other than to know who bid/reported on what.
I think you should name tables after their function. Lets say we have Cars and People tables. Car has owners and car has assigned drivers. Driver can have more than one car. One of the tables you could call CarsDrivers, second CarsOwners.
EDIT
In your situation I think you should have two tables: AuctionsBids and AuctionsReports. I believe that report requires additional dictinary (spam, illegal item,...) and bid requires other parameters like price, bid date. So having two tables is justified. You will propably be more often accessing bids than reports. Sending email will be slightly more complicated then when this data is stored in one table, but it is not really a big problem.
I don't really see this as a true M-M mapping table. Those usually are JUST a mapping. From your example most of these will have additional information as well. For example, a table of bids, which would have a User and an Item, will probably have info on what the bid was, when it was placed, etc. I would call this table... wait for it... Bids.
For reporting items you might want what was offensive about it, when it was placed, etc. Call this table OffenseReports or something.
You can name tables whatever you want. I would just name them something that makes sense. I think the convention of naming them Table1Table2 is just because sometimes the relationships don't make alot of sense to an outside observer.
There's no official or unofficial convention on relations or tables names. You can name them as you want, the way you like.
If you have multiple user_data relationships with the same keys that makes absolutely no sense. If you have different keys, name the relation in a descriptive way like: stores_products_manufacturers or stores_products_paymentMethods
I think you're only confused because the join tables are currently simple. Once you add more information, I think it will be obvious that you should append a functional suffix. For example:
Table User
UserID
EmailAddress
Table Item
ItemID
ItemDescription
Table UserItem_SpamReport
UserID
ItemID
ReportDate
Table UserItem_Post
UserID -- can be (NULL, -1, '', ...)
ItemID
PostDate
Table UserItem_Bid
UserId
ItemId
BidDate
BidAmount
Then the relation will have a Role. For instance a stock has 2 companies associated: an issuer and a buyer. The relationship is defined by the role the parent and child play to each other.
You could either put each role in a separate table that you name with the role (IE Stock_Issuer, Stock_Buyer etc, both have a relationship one - many to company - stock)
The stock example is pretty fixed, so two tables would be fine. When there are multiple types of relations possible and you can't foresee them now, normalizing it into a relationtype column would seem the better option.
This also depends on the quality of the developers having to work with your model. The column approach is a bit more abstract... but if they don't get it maybe they'd better stay away from databases altogether..
Both will work fine I guess.
Good luck, GJ
GJ
I'm trying to finalize my design of the data model for my project, and am having difficulty figuring out which way to go with it.
I have a table of users, and an undetermined number of attributes that apply to that user. The attributes are in almost every case optional, so null values are allowed. Each of these attributes are one to one for the user. Should I put them on the same table, and keep adding columns when attributes are added (making the user table quite wide), or should I put each attribute on a separate table with a foreign key to the user table.
I have decided against using the EAV model.
Thanks!
Edit
Properties include thing like marital status, gender, age, first and last name, occupation, etc. All are optional.
Tables:
USERS
USER_PREFERENCE_TYPE_CODES
USER_PREFERENCES
USER_PREFERENCES is a many-to-many table, connecting the USERS and USER_PREFERENCE_TYPE_CODES tables. This will allow you to normalize the preference type attribute, while still being flexible to add preferences without needing an ALTER TABLE statement.
Could you give some examples of what kind of properties you'd want to add to the user table? As long as you stay below roughly 50 columns, it shouldn't be a big deal.
How ever, one way would be to split the data:
One table (users) for username, hashed_password, last_login, last_ip, current_ip etc, another table (profiles) for display_name, birth_day etc.
You'd link them either via the same id property or you'd add an user_id column to the other tables.
It depends.
You need to Look at what percentage of users will have that attribute. If the attribute is 'WalkedOnTheMoon' then split it out, if it is 'Sex' include it on the user's table. Also consider the number of columns on the base table, a few, 10-20, won't hurt that much.
If you have several related attributes you could group them into a common table: 'MedicalSchoolId', 'MedicalSpeciality', 'ResidencyHospitalId', etc. could be combined in UserMedical table.
Personally I would decide on whether there are natural groupings of attributes. You might put the most commonly queried in the user table and the others in a separate table with a one-to-one relationship to keep the table from being too wide (we usually call that something like User_Extended). If some of the attributes fall into natural groupings, they may call for a separate table because those attributes will usually be queried together.
In looking at the attributes, examine if some can be combined into one column (for instance if a user cannot simlutaneoulsy be three differnt things (say intern, resident, attending) but only one of them at a time, it is better to have one field and put the data into it rather than three bit fields that have to be transalted. This is especially true if you will need to use a case statement with all three fileds to get the information (say title) that you want in reporting. IN other words look over your attributes and see if they are truly separate or if they can be abstracted into a more general one.