How to make Custom attributes for SQL table - sql

What is the proper way to let users define custom attributes?
As an example, consider the following.
There is a web application in which an admin user needs to be able to add custom attributes to Employee later through the frontend user interface. The new attributes apply to all employees, not just to a specific employee record.
Initially, there is an employee table with the following fields:
| Employee |
|----------|
| ID |
| Name |
| Email |
Later, the system admin wants to add a few custom fields (attributes) to the Employee table, such as Nationality, Mobile Number, and Address.
Is it a good idea to alter the table and add new columns to it, or is there a better way to do this?
I am currently working on the ER diagram of the database and hope to use PostgreSQL or MySQL to implement it.
Thanks!

There are multiple approaches to do what you want:
Add new columns to the given table.
Create a new table with additional attributes.
Use a JSON column in the table to handle flexible new attributes.
Create an entity-attribute-value table, with one row per entity attribute, for the flexible attributes.
How do you choose among these? It depends on factors that you have not discussed in the question. These include:
Do all entities have the same attributes?
Can you take the table offline to change its structure?
How large are the tables and how wide are the rows? Are there issues with query performance?
How often will new columns be added?
There may be more considerations, and there may be more possible solutions. The point is that there is no generic right answer. Different solutions have different strengths and weaknesses.
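As one illustration of the fourth approach listed above (an entity-attribute-value table), here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database purely so that it runs anywhere; the question targets PostgreSQL or MySQL, where the DDL would differ slightly. The table names employee_attribute and employee_attribute_value and the helper functions are hypothetical, not something from the question.

```python
import sqlite3

# In-memory SQLite stands in for PostgreSQL/MySQL (an assumption for portability).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    email TEXT NOT NULL
);
-- One row per custom attribute the admin defines (hypothetical table name).
CREATE TABLE employee_attribute (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
-- One row per (employee, attribute) pair that actually has a value.
CREATE TABLE employee_attribute_value (
    employee_id  INTEGER NOT NULL REFERENCES employee(id),
    attribute_id INTEGER NOT NULL REFERENCES employee_attribute(id),
    value        TEXT,
    PRIMARY KEY (employee_id, attribute_id)
);
""")
conn.execute(
    "INSERT INTO employee (id, name, email) VALUES (1, 'Alice', 'alice@example.com')"
)


def add_attribute(name):
    """Admin action: define an attribute for all employees (a row, not a column)."""
    cur = conn.execute("INSERT INTO employee_attribute (name) VALUES (?)", (name,))
    return cur.lastrowid


def set_value(employee_id, attribute_id, value):
    """Insert or overwrite one attribute value for one employee."""
    conn.execute(
        "INSERT OR REPLACE INTO employee_attribute_value VALUES (?, ?, ?)",
        (employee_id, attribute_id, value),
    )


def get_attributes(employee_id):
    """Return every defined attribute for the employee, None where unset."""
    rows = conn.execute(
        """
        SELECT a.name, v.value
        FROM employee_attribute AS a
        LEFT JOIN employee_attribute_value AS v
               ON v.attribute_id = a.id AND v.employee_id = ?
        ORDER BY a.id
        """,
        (employee_id,),
    ).fetchall()
    return dict(rows)


nationality = add_attribute("Nationality")
mobile = add_attribute("Mobile Number")
set_value(1, nationality, "Canadian")
print(get_attributes(1))  # {'Nationality': 'Canadian', 'Mobile Number': None}
```

The trade-off is visible even in this small sketch: adding an attribute needs no ALTER TABLE, but every value is stored as text and reading a full employee record requires a join.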

Related

how to avoid creating new columns in database

I have a table of dentists and services, and I want it to be dynamic, so I have added an "add services" function. But how can I normalize it if the services are not defined yet, since I haven't added them?
My current solution is a new table, specialty, which gets a new column every time I add a new service. I don't know whether it is improper to create a new column from within the add-services function itself, but it is the only way I can think of. Are there other ways to solve this?
Dynamic columns are not good at all!
Tables should be static to be reliable.
Use relation-tables instead.
Example
You have a dentists table. It has an id column as primary-key.
Create a services table. Of course it should have a primary-key. Besides that, add a dentist_id column. This will contain the id of the dentist in the dentists table.
If you provide more information, such as code or the database schema, we may be able to help more.
EDIT
As ADyson mentioned, if it is possible that multiple dentists work on one service, do a many-to-many relation.
Example
You have a dentists table and a services table. They both have an id column as primary-key.
Create an r_dentist_service table. Of course it should have a primary-key. Besides that, add a dentist_id column and a service_id column. These should contain the respective IDs. This table will relate dentists to services.
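The many-to-many layout described above could look roughly like the following. This is a sketch using Python's sqlite3 module and an in-memory database, since the asker's platform is not stated; the name columns, the sample dentists and services, and the helper functions are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE dentists (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE services (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- Relation table: one row per (dentist, service) pair.
CREATE TABLE r_dentist_service (
    id         INTEGER PRIMARY KEY,
    dentist_id INTEGER NOT NULL REFERENCES dentists(id),
    service_id INTEGER NOT NULL REFERENCES services(id),
    UNIQUE (dentist_id, service_id)
);
""")
conn.executemany(
    "INSERT INTO dentists (id, name) VALUES (?, ?)",
    [(1, "Dr. Ames"), (2, "Dr. Brook")],
)


def add_service(name):
    """The 'add services' function: a new service is a new row, not a new column."""
    cur = conn.execute("INSERT INTO services (name) VALUES (?)", (name,))
    return cur.lastrowid


def services_of(dentist_id):
    """List the names of the services a dentist offers, alphabetically."""
    rows = conn.execute(
        """
        SELECT s.name
        FROM services AS s
        JOIN r_dentist_service AS r ON r.service_id = s.id
        WHERE r.dentist_id = ?
        ORDER BY s.name
        """,
        (dentist_id,),
    ).fetchall()
    return [name for (name,) in rows]


cleaning = add_service("Cleaning")
whitening = add_service("Whitening")
conn.executemany(
    "INSERT INTO r_dentist_service (dentist_id, service_id) VALUES (?, ?)",
    [(1, cleaning), (1, whitening), (2, cleaning)],
)
print(services_of(1))  # ['Cleaning', 'Whitening']
print(services_of(2))  # ['Cleaning']
```

The table structure never changes after creation; services that are "not defined yet" simply have no rows yet.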

When to link tables or keep it as one table with more fields

I have a table with lots of fields which holds "totals" like so:
UserID | total_classA | total_classB | total_classC // and so on
I could have a second table however with:
ClassType | Total | UserID
But I don't really see how a second table would be beneficial here for a many-to-one relationship. Firstly, I would have to store more rows of data, and I would have to use a join to select the data.
But a lot of what I have read suggests that two tables are better than one table with lots of fields. Why is this? I do not see the advantage in the above situation.
Store your data cleanly, as you propose with your 'second table'.
You can always get the summarized column total display with a PIVOT (depending on your platform) or a specialized query if and when you need it.
The biggest benefit of doing so will be the elimination of having to change your table structure with every additional class type you decide to introduce. You will be able to extend your data tracking capabilities simply by adding rows (DML rather than DDL).
Take a look at second normal form for more of a technical explanation for going this route.
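To make the "specialized query" option concrete, here is a sketch that stores one row per user and class type and rebuilds the wide total_classA, total_classB, ... layout on demand. PIVOT syntax is platform-specific, so this swaps in conditional aggregation (SUM over CASE), which works on most SQL engines. It runs against an in-memory SQLite database through Python's sqlite3 module; the table name class_totals and the sample figures are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE class_totals (
        user_id    INTEGER NOT NULL,
        class_type TEXT    NOT NULL,
        total      INTEGER NOT NULL,
        PRIMARY KEY (user_id, class_type)
    )
    """
)
conn.executemany(
    "INSERT INTO class_totals VALUES (?, ?, ?)",
    [(1, "A", 10), (1, "B", 4), (2, "A", 7), (2, "C", 3)],
)


def pivot_totals():
    """Return (header, rows) with one column per class type currently stored."""
    class_types = [
        c
        for (c,) in conn.execute(
            "SELECT DISTINCT class_type FROM class_totals ORDER BY class_type"
        )
    ]
    # One SUM(CASE ...) expression per class type; values are bound as parameters.
    columns = ", ".join(
        "SUM(CASE WHEN class_type = ? THEN total ELSE 0 END)" for _ in class_types
    )
    sql = (
        f"SELECT user_id, {columns} FROM class_totals "
        "GROUP BY user_id ORDER BY user_id"
    )
    header = ["user_id"] + [f"total_class{c}" for c in class_types]
    return header, conn.execute(sql, class_types).fetchall()


header, rows = pivot_totals()
print(header)  # ['user_id', 'total_classA', 'total_classB', 'total_classC']
print(rows)    # [(1, 10, 4, 0), (2, 7, 0, 3)]

# A new class type is just a new row (DML); the pivot picks it up automatically.
conn.execute("INSERT INTO class_totals VALUES (1, 'D', 5)")
print(pivot_totals()[0][-1])  # total_classD
```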

Best way to add content (large list) to relational database

I apologize if this may seem like somewhat of a novice question (which it probably is), but I'm just introducing myself to the idea of relational databases and I'm struggling with this concept.
I have a database with roughly 75 fields which represent different characteristics of a 'user'. One of those fields represents the locations that a user has been to, and I'm wondering what the best way is to store the data so that it is easily retrievable and can be used later on (e.g. tracking a route on Google Maps, identifying whether two users shared the same location, etc.).
The problem is that some users may have 5 locations in total while others may be well over 100.
Is it best to store these locations in a text file named using the unique id of each user(one location on each line, or in a csv)?
Or to create a separate table for each individual user connected to their unique id (that seems like overkill to me)?
Or, is there a way to store all of the locations directly in the single field in the original table?
I'm hoping that I'm missing a concept, or there is a link to a tutorial that will help my understanding.
If it helps, you can assume that the locations will be stored in order and will not be changed once stored. Also, these locations are static (I don't need to add any more locations once they are stored, as they can't be updated).
Thank you for your time in helping me. I appreciate it!
Store the location data for the user in a separate table. The location table would link back to the user table by a common user_id.
Keeping multiple locations for a particular user in the user table itself is not a good idea - you'll end up with denormalized data.
You may want to read up on:
Referential Integrity
Relational denormalization
The most common way would be to have a separate table, something like
USER_LOCATION
+------------+------------------+
| USER_ID | LOCATION_ID |
+------------+------------------+
| | |
If user 3 has 5 locations, there will be five rows containing user_id 3.
However, if the order of locations matters, an additional field specifying the ordinal position of the location within a user's list can be used.
The separate table approach is what we call normalized.
If you store a location list as a comma-separated string of location ids, for example, it is trivial to maintain the order, but you lose the ability for the database to quickly answer the question "which users have been at location x?". Your data would be what we call denormalized.
You do have options, of course, but relational databases are pretty good with joining tables, and they are not overkill. They do look a little funny when you have ordering requirements, like the one you mention. But people use them all the time.
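The normalized layout with an ordinal position column, as described above, might be sketched like this. It uses Python's sqlite3 module with an in-memory database; the table name app_user (chosen to avoid a reserved word), the position column, the sample data, and the helper functions are all assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE app_user (user_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE location (location_id INTEGER PRIMARY KEY, label TEXT NOT NULL);
-- One row per visit; position preserves the order within each user's route.
CREATE TABLE user_location (
    user_id     INTEGER NOT NULL REFERENCES app_user(user_id),
    position    INTEGER NOT NULL,
    location_id INTEGER NOT NULL REFERENCES location(location_id),
    PRIMARY KEY (user_id, position)
);
-- Lets the database answer "who was at location x?" without scanning every row.
CREATE INDEX idx_user_location_location ON user_location (location_id);
""")
conn.executemany("INSERT INTO app_user VALUES (?, ?)", [(1, "Ann"), (2, "Ben"), (3, "Cy")])
conn.executemany(
    "INSERT INTO location VALUES (?, ?)", [(10, "Park"), (20, "Cafe"), (30, "Pier")]
)
conn.executemany(
    "INSERT INTO user_location (user_id, position, location_id) VALUES (?, ?, ?)",
    [(1, 1, 10), (1, 2, 20), (1, 3, 10), (2, 1, 20), (2, 2, 30), (3, 1, 30)],
)


def route(user_id):
    """Location ids for one user in visiting order (repeat visits allowed)."""
    rows = conn.execute(
        "SELECT location_id FROM user_location WHERE user_id = ? ORDER BY position",
        (user_id,),
    )
    return [loc for (loc,) in rows]


def users_at(location_id):
    """Ids of all users who have been at the given location."""
    rows = conn.execute(
        "SELECT DISTINCT user_id FROM user_location "
        "WHERE location_id = ? ORDER BY user_id",
        (location_id,),
    )
    return [uid for (uid,) in rows]


def shared_locations(user_a, user_b):
    """Location ids that both users have visited, via a self-join."""
    rows = conn.execute(
        """
        SELECT DISTINCT a.location_id
        FROM user_location AS a
        JOIN user_location AS b ON b.location_id = a.location_id
        WHERE a.user_id = ? AND b.user_id = ?
        ORDER BY a.location_id
        """,
        (user_a, user_b),
    )
    return [loc for (loc,) in rows]


print(route(1))                # [10, 20, 10]
print(users_at(20))            # [1, 2]
print(shared_locations(1, 2))  # [20]
```

A user with 5 locations has 5 rows and a user with 100 has 100 rows; the schema is the same either way.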
In a relational database you would use a mapping table. So you would have user, location and userlocation tables (user is a reserved word so you may wish to use a different name). This allows you to have a many-to-many relationship, i.e. many users can visit many locations. If you want to model a route as an ordered collection of locations then you will need to do more work. This site gives an example

What is a pickup table?

I'm developing a small application using the Clarizen API. The documentation references "pickup tables" numerous times, but the description
"Pick up tables are similar to the Regular entity types. Pick up tables usually contain limited set of fields and limited number of entities and are referenced from the other types of classes."
is meaningless to me. I've been trying to figure out what a pickup table is from context, but I'm stuck there too.
Some example fields from the documentation:
Country | Entity | Represents country. | Reference to the pickup table “Countries”.
State | Entity | Represents lifecycle State of the entity. For example, possible states of the Work Item objects can be Draft, Active, Cancelled, Completed, On Hold. Value is a reference to State pickup table.
These examples make me think it's just a static list (which doesn't fit their given definition of pick up table), but if so the list/table is not provided to the user so I'm not sure how I would make use of it. If it does in fact refer to a static list, I'm going to have to try to coax them into giving me the tables.
I can't find a definition of pickup table online, so if anyone here knows it would help greatly.
My guess is that it's another name for a junction table, or association table. There are only foreign keys that refer to primary keys in other tables.
Pickup tables are picklists. The values of the picklist are stored in mini-tables (not really sure why but doesn't seem to be that important).
The work item state example is an example of a picklist.

Dynamic creation of new lookup tables based on values in main data table

I am working on an application which accepts any uploaded CSV data, stores it alongside other datasets which have been uploaded previously, and then produces output (CSV or HTML) based on the user selecting which columns/values they want returned. The database will be automatically expanded to handle new/different columns and datatypes as required. This is in preference to an entity-attribute-value model.
Example - uploading these 2 sets to a blank database:
dataset A:
name | dept | age
------+-------+------
Bob | Sales | 24
Tim | IT | 32
dataset B:
name | dept | age | salary
------+-------+------+--------
Bob | Sales | 24 | £20,000
Tim | IT | 32 | £20,000
The application will programmatically change the 'data' table so that importing dataset A results in 3 newly created columns (name, dept, age). Importing dataset B results in 1 newly created column (salary). For the moment, set aside whether the recordsets should be combined and the fact that there is no normalisation.
The issue I have is that some columns will also have lookup values - let's say that the Dept column will at some point in the future have associated values which give the address and phone numbers of that department. The same could be true for the Salary column, looking up tax groupings etc.
The number of columns in this big table should not become too high (a few hundred) but will be high enough to want the user to administer the lookup table structure and values through an admin panel rather than have to involve developers each time.
The question is whether to use individual lookup tables for each column (value, description), or a combined lookup table which references the column (column, value, description). Normally I would opt for individual lookup tables, but here the application will need to create them automatically (e.g. lookup_dept, lookup_salary) and then add a new join into the master SQL statement. This would be done at the request of the user rather than when the column's added (to avoid hundreds of empty tables).
The combined lookup table on the other hand would need to be joined multiple times onto the data table, selecting on the column name each time.
Individual lookups seem to make sense to me, but I may be barking up completely the wrong tree.
I would agree that individual tables are preferable. They are more scalable and better for query optimisation. Also, if in future the users want more columns on a particular lookup, you can add them.
Yes, the application will have to create tables and constraints automatically: I wouldn't normally do this, but then this application is already altering existing tables and adding columns to them, which I wouldn't normally do either!
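As a rough sketch of what creating individual lookup tables on demand could involve, the following builds a lookup_<column> table when the user asks for one and adds the matching join to the master SELECT. It uses Python's sqlite3 module with an in-memory database. The function names, the description text, and the rule of checking requested columns against the real columns of the data table are assumptions; table and column names cannot be bound as SQL parameters, so some check of this kind is needed before interpolating them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE data (name TEXT, dept TEXT, salary TEXT);
INSERT INTO data VALUES ('Bob', 'Sales', '20000'), ('Tim', 'IT', '20000');
""")


def data_columns():
    """Names of the columns that currently exist on the data table."""
    return [row[1] for row in conn.execute("PRAGMA table_info(data)")]


def create_lookup(column):
    """Create lookup_<column> (value, description) at the user's request."""
    if column not in data_columns():
        raise ValueError(f"unknown column: {column}")
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "lookup_{column}" '
        "(value TEXT PRIMARY KEY, description TEXT)"
    )


def select_with_lookups(lookup_columns):
    """Run the master SELECT with one LEFT JOIN per requested lookup table."""
    cols = data_columns()
    select = [f'd."{c}"' for c in cols]
    joins = []
    for c in lookup_columns:
        if c not in cols:
            raise ValueError(f"unknown column: {c}")
        select.append(f'"lookup_{c}".description AS "{c}_description"')
        joins.append(f'LEFT JOIN "lookup_{c}" ON "lookup_{c}".value = d."{c}"')
    sql = (
        f'SELECT {", ".join(select)} FROM data AS d '
        f'{" ".join(joins)} ORDER BY d.name'
    )
    return conn.execute(sql).fetchall()


create_lookup("dept")
conn.execute(
    'INSERT INTO "lookup_dept" VALUES (?, ?)', ("Sales", "2nd floor, ext. 100")
)
for row in select_with_lookups(["dept"]):
    print(row)
# ('Bob', 'Sales', '20000', '2nd floor, ext. 100')
# ('Tim', 'IT', '20000', None)
```

With a combined lookup table, each of those joins would target the same table with an extra condition on the column name, which is the repeated self-join the question describes.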
Ah, the "One true lookup table" idea. One of the rare times I agree with Mr Celko.
Google search too
Individual tables every time. It's "correct" in the database sense.
My reason (no normalisation pedants please): each row in a table stores one entity only.
eg Fruit names, car makes, phone brands. To mix them is nonsense. I could have a phone brand called "Apple". Er... wait a minute...
You said,
This is in preference to an entity-attribute-value model.
But it looks to me like that is exactly what you need.
Consider using an RDF triplestore, and query it with SPARQL.
Forget SQL, this is a job for RDF.