One table with a lot of rows, or a lot of tables with a view, on SQL Server?

My question is about what is more efficient for queries and inserts, since the number of records in my table will grow a lot.
I would like to know which is more efficient: keeping all the data in a single table, or partitioning it across several tables and using a view and triggers to read and insert records.

As already mentioned, take a look at database normalization.
SQL is a way to work with relational databases and is built on the idea that we should have many tables linked to each other through relationships. So I recommend multiple tables, because you will be able to reuse data (for example a user's name and surname) through specific IDs rather than copying that data every time a user performs some action on your platform and you need to insert or update some information.
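For instance, a minimal sketch of that idea in SQL Server syntax (all table and column names here are made up): user details live in one table, and other tables reference them by ID instead of copying the name and surname into every row.

```sql
-- Hypothetical normalized layout: user details are stored once and
-- referenced by ID from the tables that record user activity.
CREATE TABLE Users (
    UserID  int IDENTITY(1,1) PRIMARY KEY,
    Name    nvarchar(50) NOT NULL,
    Surname nvarchar(50) NOT NULL
);

CREATE TABLE UserActions (
    ActionID   int IDENTITY(1,1) PRIMARY KEY,
    UserID     int NOT NULL REFERENCES Users (UserID),  -- reuse, don't copy
    ActionType nvarchar(50) NOT NULL,
    CreatedAt  datetime2 NOT NULL DEFAULT SYSUTCDATETIME()
);
```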
Hope this helps!

Related

Having all contact information in one table vs. using key-value tables

(NB. The question is not a duplicate for this, since I am dealing with an ORM system)
I have a table in my database to store all Contacts information. Some of the columns for each contact are fixed (e.g. Id, InsertDate and UpdateDate). In my program I would like to give the user the option to add or remove properties for each contact.
Now there are of course two alternatives here:
1. Save it all in one table, and add and remove entire columns when the user needs to;
2. Create a key-value table to save each property alongside its type, and connect each record to the user's id.
These alternatives are both doable. But I am wondering which one is better in terms of speed. In the program it will be a very common thing for the user to view the entire contact list to check for updates. Also, I am using an ORM framework (Microsoft's Entity Framework) to deal with database queries, so if the user is adding and removing columns from a table all the time, it will be difficult to map them to my program. But again, if alternative (1) is significantly better than (2), then I can reconsider the key-value option.
I have actually done both of these.
Example #1
A large, wide table with columns holding names, phones, addresses and lots of small integer values that tracked details of the clients.
Example #2
Many different tables, separating out all of the character varying (text) fields, the small integer values, etc.
Example #1 was a lot faster to code for but in terms of performance, it got pretty slow once the table filled with records. 5000 wasn't a problem. When it reached 50,000 there was a noticeable performance degradation.
Example #2 was built later in my coding experience and was built to resolve the issues found in Example #1. While it took more work to get the records I was after (LEFT JOIN this and UNION that), it was MUCH faster because you could ultimately pick and choose EXACTLY what the client was after without having to search a massive, wide table full of data that was not all being requested.
I would recommend Example #2 to fit your #2 in the question.
And the USER-specified columns for their data set could be stored in a table just for them (depending on how many users you have, I suppose), which would allow you to draw on the table specific to that USER and give you unlimited ability to remove and add columns to suit that particular setup.
You could then also have another table which keeps track of the custom columns in the custom-column table, which would give you the ability to "recover" columns later, as in "Do you want to add this to your current column choices, or to one of these columns you have deleted in the past?".
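One possible sketch of that setup, just to make the idea concrete (the names and types below are invented, not a prescription): a definition table tracking each user's custom properties, plus a value table holding the data, so a property can be hidden and later "recovered" without dropping real columns.

```sql
-- Hypothetical key-value layout for user-defined contact properties.
CREATE TABLE ContactPropertyDefs (
    PropertyID int IDENTITY(1,1) PRIMARY KEY,
    UserID     int NOT NULL,              -- which user owns this custom column
    Name       nvarchar(100) NOT NULL,
    DataType   nvarchar(20) NOT NULL,     -- e.g. 'string', 'int', 'date'
    IsDeleted  bit NOT NULL DEFAULT 0     -- lets you "recover" a column later
);

CREATE TABLE ContactPropertyValues (
    ContactID  int NOT NULL,
    PropertyID int NOT NULL REFERENCES ContactPropertyDefs (PropertyID),
    Value      nvarchar(max) NULL,
    PRIMARY KEY (ContactID, PropertyID)
);
```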

How can I divide a single table into multiple tables in access?

Here is the problem: I am currently working with a Microsoft Access database that the previous employee created by just adding all the data into one table (yes, all the data into one table). There are about 186 columns in that one table.
I am now responsible for dividing each category of data into its own table. Everything is going fine, although progress is too slow. Is there perhaps an SQL command that will somehow divide each category of data into its proper table? As of now I am manually looking at the main table and carefully transferring groups of data into each respective table along with their proper IDs, making sure data is not corrupted. Here is the layout I have so far:
Note: I am probably one of the very few at my campus with database experience.
I would approach this as a classic normalisation process. Your single, hugely wide table should contain all of the entities within your domain, so as long as you understand the domain you should be able to normalise the structure until you're happy with it.
To create your foreign key lookups, run DISTINCT queries against the columns you're going to remove and then add the key values back in.
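For example, a rough sketch of that extraction step in SQL Server syntax (Access syntax differs slightly, and the table and column names here are invented):

```sql
-- 1. Build a lookup table from the distinct values of a column you want
--    to pull out of the wide table.
SELECT DISTINCT Department
INTO DepartmentLookup
FROM WideTable;

ALTER TABLE DepartmentLookup
    ADD DepartmentID int IDENTITY(1,1) PRIMARY KEY;

-- 2. Add the key column back to the wide table and populate it.
ALTER TABLE WideTable ADD DepartmentID int NULL;

UPDATE w
SET w.DepartmentID = d.DepartmentID
FROM WideTable AS w
JOIN DepartmentLookup AS d ON d.Department = w.Department;

-- 3. The original text column can then be dropped.
ALTER TABLE WideTable DROP COLUMN Department;
```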
It sounds like you know what you're doing already? Are you just looking for reassurance that you're on the right track? (Which it looks like you are.)
Good luck though, and enjoy it - it sounds like a good little piece of work.

Dynamically creating tables as a means of partitioning: OK or bad practice?

Is it reasonable for an application to create database tables dynamically as a means of partitioning?
For example, say I have a large table "widgets" with a "userID" column identifying the owner of each row. If this table tended to grow extremely large, would it make sense to instead have the application create a new table called "widgets_{username}" for each new user? Assume that the application will only ever have to query for widgets belonging to a single user at a time (i.e. no need to try and join any of these user widget tables together).
Doing this would break up the one large table into more easily-managed chunks, but this doesn't seem like an elegant solution. In my mind, the database schema should be defined when the application is written, and any runtime data is stored as rows, not as additional tables.
As a more general question, is modifying the database schema at runtime ever ok?
Edit: This question is mostly hypothetical; I had a pretty good feeling that creating tables at runtime didn't make sense. That being said, we do have a table with millions of rows in our application. SELECTs perform fine, but things like deleting all rows owned by a particular user can take a while. Basically I'm looking for some solid reasoning why just dynamically creating a table for each user doesn't make sense for when I'm asked.
NO, NO, NO!! Now repeat after me: I will not do this, because it will create many headaches and problems in the future! Databases are made to handle large amounts of information. They use indexes to quickly find what you are after. Think of a phone book: how effective is its index? Would it be better to have a different book for each last name?
This will not give you anything performance-wise. Keep a single table, but be sure to index on UserID and you'll be able to get the data fast. However, if you split the table up, it becomes impossible (or really, really hard) to get any information that spans multiple users, like searching all users for a certain widget or counting all widgets of a certain type, and every query would need to be built dynamically.
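A minimal sketch of that single-table approach (the table, column names and the literal 42 are placeholders):

```sql
CREATE TABLE dbo.Widgets (
    WidgetID int IDENTITY(1,1) PRIMARY KEY,
    UserID   int NOT NULL,
    Name     nvarchar(100) NOT NULL
);

-- Index on the owning user so per-user queries seek instead of scanning.
CREATE NONCLUSTERED INDEX IX_Widgets_UserID ON dbo.Widgets (UserID);

-- One widgets table still answers both per-user and cross-user questions:
SELECT WidgetID, Name FROM dbo.Widgets WHERE UserID = 42;
SELECT UserID, COUNT(*) AS WidgetCount FROM dbo.Widgets GROUP BY UserID;
```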
If deleting rows is slow, look into that. How many rows at one time are we talking about: 10, 1,000, 100,000? What is your clustered index on this table? Could you use a "soft delete", where you have a status column that you UPDATE to 'D' to mark the row as deleted? Can you delete the rows at a later time, with less database activity? Is the delete slow because it is being blocked by other activity? Look into those before you break up the table.
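And a rough sketch of the "soft delete" idea, continuing the hypothetical Widgets table above (the status values are only an example):

```sql
-- Add a status flag instead of deleting rows immediately.
ALTER TABLE dbo.Widgets ADD Status char(1) NOT NULL DEFAULT 'A';  -- 'A' = active

-- Marking a user's rows as deleted is a quick UPDATE...
UPDATE dbo.Widgets SET Status = 'D' WHERE UserID = 42;

-- ...and the physical delete can run later, in small batches, off-peak.
DELETE TOP (5000) FROM dbo.Widgets WHERE Status = 'D';
```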
No, that would be a bad idea. However some DBMSs (e.g. Oracle) allow a single table to be partitioned on values of a column, which would achieve the objective without creating new tables at run time. Having said that, it is not "the norm" to partition tables like this: it is only usually done in very large databases.
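SQL Server has a comparable feature (partition functions and schemes). Purely as a rough sketch, with invented boundary values and names, partitioning the widgets table by UserID could look like this:

```sql
-- Hypothetical partitioning of a single widgets table by UserID ranges.
CREATE PARTITION FUNCTION pf_WidgetsByUser (int)
    AS RANGE RIGHT FOR VALUES (10000, 20000, 30000);

CREATE PARTITION SCHEME ps_WidgetsByUser
    AS PARTITION pf_WidgetsByUser ALL TO ([PRIMARY]);

CREATE TABLE dbo.PartitionedWidgets (
    WidgetID int IDENTITY(1,1) NOT NULL,
    UserID   int NOT NULL,
    Name     nvarchar(100) NOT NULL,
    CONSTRAINT PK_PartitionedWidgets PRIMARY KEY (UserID, WidgetID)
)
ON ps_WidgetsByUser (UserID);  -- rows are split by UserID behind one table name
```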
Using an index on userID should result in nearly the same performance.
In my opinion, changing the database schema at runtime is bad practice.
Consider, for example, security issues...
Is it reasonable for an application to create database tables dynamically as a means of partitioning?
No. (smile)

When to Create, When to Modify a Table?

I wanted to know what I should consider when deciding whether to create a new table or modify an existing table in a SQL db. I use both MySQL and SQLite.
-Edit- I always thought that if I can put a column into a table where it makes sense and it can be used by every row, then I would always modify the table. However, at work, if it's a different 'release' we put it in a different table.
You can modify existing tables, as long as:
1. you are keeping the database normalized;
2. you are not breaking code that uses the table.
You can create new tables even if 1. and 2. are true, for the following reasons:
- performance;
- clarity in your schema logic.
Not sure if I'm understanding your question correctly, but one thing I always try to consider is the impact on existing data.
Taking the case of an application which relies on a database...
When you update the application (including database schema updates), it is important to ensure that any existing, in-use databases will be either backwards compatible with the application, or there is way to migrate and update the existing database.
Generally, if the data is in a one-to-one relationship with the existing data in the table, the table row size is not too large already, and there aren't too many records in the table, then I usually alter the table to accept the new column.
However, suppose I want to add a column with a default value to a table where it doesn't exist. Adding it to the table with 50 million records might not be so speedy a process and it might lock up the table on production when we move the change up. In this case, putting it into a separate table and adding the records to it may work out better. In general, I wouldn't do this unless my testing has shown that adding and populating the column will take an unacceptably long time. I would prefer to keep the record together where possible.
Same thing with the overall record size. SQL Server has a limit on the number of bytes that can be in a record; it will allow you to create a structure that is potentially larger than that, but it will not allow you to put more than the byte limit into a specific record. Further, less wide tables tend to be faster to access due to how they are stored. Frequently, people will create a table that has a one-to-one relationship (we call them extended tables in our structure) for additional columns that are not as frequently used. If the fields from both tables will be frequently used, they often still create two tables but have a view that picks out all the columns needed.
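A minimal sketch of that "extended table" pattern, with a view that picks out columns from both tables (every name below is invented):

```sql
CREATE TABLE dbo.Clients (
    ClientID int IDENTITY(1,1) PRIMARY KEY,
    Name     nvarchar(100) NOT NULL,
    Phone    varchar(20) NULL
);

-- One-to-one "extended" table for the less frequently used columns.
CREATE TABLE dbo.ClientsExtended (
    ClientID int NOT NULL PRIMARY KEY REFERENCES dbo.Clients (ClientID),
    Notes    nvarchar(max) NULL,
    Website  nvarchar(200) NULL
);
GO

-- View that presents the columns from both tables when all are needed.
CREATE VIEW dbo.ClientsFull AS
SELECT c.ClientID, c.Name, c.Phone, e.Notes, e.Website
FROM dbo.Clients AS c
LEFT JOIN dbo.ClientsExtended AS e ON e.ClientID = c.ClientID;
```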
And of course, if the data is in a one-to-many relationship, you need a related table, not just a new column.
Incidentally, you should always do an alter table through a script rather than the SSMS GUI, as it is more efficient and easier to move to prod.

What is the best way to query data from multiple tables and databases?

I have 5 databases which represent different regions of the country. In each database, there are a few hundred tables, each with 10,000-2,000,000 transaction records. Each table is a representation of a customer in the respective region. Each of these tables has the same schema.
I want to query all tables as if they were one table. The only way I can think of doing it is creating a view that unions all tables, and then just running my queries against that. However, the customer tables will change all the time (as we gain and lose customers), so I'd have to change the query for my view to include new tables (or remove ones that are no longer used).
Is there a better way?
EDIT
In response to the comments (I also posted this as a response to an answer):
In most cases I won't be removing any tables; they will remain for historical purposes. As I posted in a comment to one response, the idea was to reduce the time it takes a smaller customer (one with only 10,000 records) to query their own history. There are about 1000 customers with an average of 1,000,000 rows (and growing) apiece. If I were to add all records to one table, I'd have nearly a billion records in that table. I also thought I was planning for the future, in that when we get, say, 5000 customers, we don't have one giant table holding all transaction records (this may be an error in my thinking). So then, is it better not to divide the records as I have done? Should I mash it all into one table? Will indexing on customer IDs prevent delays in querying data for smaller customers?
I think your design may be broken. Why not use one single table with a region and a customer column?
If I were you, I would consider refactoring to one single table, and if necessary (for reverse compatibility for example), I would use views to provide the same info as in the previous tables.
Edit to answer OP comments to this post:
One table with 10,000,000,000 rows in it will do just fine, provided you use proper indexing. Database servers are built to cope with this kind of volume.
Performance is definitely not a valid reason to split one such table into thousands of smaller ones!
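For illustration only, here is one way "proper indexing" could look for per-customer history queries against a single large table (the names, types and clustering choice are assumptions, not something the original answer specifies):

```sql
CREATE TABLE dbo.Transactions (
    TransactionID   bigint IDENTITY(1,1) NOT NULL,
    Region          char(2) NOT NULL,
    CustomerID      int NOT NULL,
    TransactionDate datetime2 NOT NULL,
    Amount          decimal(18,2) NOT NULL,
    CONSTRAINT PK_Transactions PRIMARY KEY NONCLUSTERED (TransactionID)
);

-- Clustering on (CustomerID, TransactionDate) keeps each customer's rows
-- together, so a small customer's history is a narrow range scan even if
-- the table holds a billion rows overall.
CREATE CLUSTERED INDEX CIX_Transactions_CustomerDate
    ON dbo.Transactions (CustomerID, TransactionDate);
```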
The architecture of this system smells like it needs a vastly different approach if there are a few hundred tables and each has the same schema.
Why are you adding or removing tables at all? This should not be happening under any normal circumstances.
Agree with Brann,
That's an insane DB schema design. Why didn't you go with (or is it an option to change to) a single normalised structure with columns to filter by region and whatever condition separates each table within a region database?
With the current structure you're stuck with some horribly large (~500 tables) unioned view that you would have to dynamically regenerate as regularly as new tables appear in the system.
Two solutions:
1. Write a stored procedure that builds the view for you by parsing all table names in the 5 databases and building the view with UNIONs, as you would do by hand.
2. Create a new database with one table and import into it, each night for example, all the records from all the tables.
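As a rough sketch of solution 1 (shown for a single database; covering all 5 regional databases would mean looping over them with three-part names), a stored procedure could rebuild the UNION ALL view from whatever customer tables currently exist. The view name, procedure name and the LIKE pattern below are assumptions:

```sql
CREATE PROCEDURE dbo.RebuildAllTransactionsView
AS
BEGIN
    DECLARE @body nvarchar(max) = N'';

    -- Collect a SELECT for every customer table; the naming pattern is assumed.
    SELECT @body += CASE WHEN @body = N'' THEN N'' ELSE N' UNION ALL ' END
                  + N'SELECT * FROM ' + QUOTENAME(TABLE_SCHEMA)
                  + N'.' + QUOTENAME(TABLE_NAME)
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_TYPE = 'BASE TABLE'
      AND TABLE_NAME LIKE 'CustomerTx%';

    -- CREATE OR ALTER needs SQL Server 2016 SP1+; on older versions you would
    -- DROP VIEW and CREATE VIEW instead.
    DECLARE @sql nvarchar(max) =
        N'CREATE OR ALTER VIEW dbo.AllTransactions AS ' + @body;

    EXEC sys.sp_executesql @sql;
END
```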
Sounds like you're stuck somewhere between a multi-tenant and single-tenant database schema. Specifically, you're storing it as "light" multi-tenant (separate tables vs. separate databases) but querying it as single-tenant, with one query to rule them all.
In the short term, have your data access layer dynamically pick the table to query rather than unioning everything together for one uber query.
In the long term, pick one approach and stick to it: one database and one table, or many databases.
Here are some posts on the subject.
What are the advantages of using a single database for EACH client?
http://msdn.microsoft.com/en-us/library/aa479086.aspx