SQL large table VS. multiple smaller tables [closed] - sql

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
I have the option to use a single table that will expand upwards of 1,000,000 records per year.
That said, I could use a foreign key to break this table up into multiple smaller tables, which would reduce the growth of each smaller table to about 100,000 records per year.
Let's say that 50% of the time users will query all of the records, and the other 50% of the time they will query one of the smaller, segmented data sets (think all geographic areas vs. specific geographic areas).
Using a database managed by a shared hosting account (think site5, godaddy, etc.), is it faster to use a single larger table or several smaller segmented tables in this situation?
If the two access patterns are split 10%/90%, 20%/80%, 30%/70%, etc., at what point would a single table vs. multiple smaller tables be the most or least efficient?

In general, design so as to reduce the amount of duplicated information. If the smaller tables would have many redundant columns, then a single table is likely to be more efficient. Even without that redundancy, I would still default to one table.
It also depends on what percent of the row is being used per query, and how your queries are structured. If you are adding lots of joins or subqueries, then it'll most likely be slower.
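A minimal sketch of the single-table option, using Python's built-in sqlite3 as a stand-in for the hosted database. The table name `records`, the column `region_id`, and the index name are assumptions made for illustration; they do not come from the question. With an index on the segmenting column, the same table serves both the "all areas" query and the "one area" query, and no joins or unions are needed.

```python
import sqlite3

# In-memory database; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records ("
    " id INTEGER PRIMARY KEY,"
    " region_id INTEGER NOT NULL,"
    " payload TEXT)"
)
# Index on the column that would otherwise decide which small table a row goes to.
conn.execute("CREATE INDEX idx_records_region ON records (region_id)")
conn.executemany(
    "INSERT INTO records (region_id, payload) VALUES (?, ?)",
    [(i % 10, f"row {i}") for i in range(1000)],
)


def fetch_all_regions(conn):
    """Query every record (the 'all geographic areas' case)."""
    return conn.execute("SELECT id, region_id, payload FROM records").fetchall()


def fetch_region(conn, region_id):
    """Query one segment (the 'specific geographic area' case)."""
    return conn.execute(
        "SELECT id, region_id, payload FROM records WHERE region_id = ?",
        (region_id,),
    ).fetchall()


def plan_for_region(conn, region_id):
    """Return the planner's description of the single-region query."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT id, region_id, payload FROM records WHERE region_id = ?",
        (region_id,),
    )
    return [row[3] for row in rows]
```

Whether this beats separate tables on a given shared host still depends on the engine and the data, so it is worth measuring both layouts with realistic volumes.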

Related

Can converting a SQL query to PL/SQL improve performance in Oracle 12c? [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 5 years ago.
I have been given an 800-line SQL query which takes around 20 hours to fetch around 400 million records.
There are 13 tables which are partitioned by month.
The tables have records ranging from 10k to 400 million in each partition.
The tables are indexed on primary keys.
The query uses many inline views and outer joins and a few group by functions.
DBAs say we cannot add more indexes as it would slow down the performance since it is an OLTP system.
I have been asked to convert the query logic to PL/SQL and populate a table in chunks, then do a select * from that table.
My end result should be a query which can be fed to my application.
So even after I use PL/SQL to populate a table in chunks, ultimately I need to fetch the data from that table with a query.
My question is: since PL/SQL would require both a select and an insert, is there any chance that PL/SQL can be faster than SQL?
Are there any cases where PL/SQL is faster for a result which is achievable with SQL?
I will be happy to provide more information if the given info doesn't suffice.
Implementing it as a stored procedure could be faster because the SQL will already be parsed and compiled when the procedure is created. However, given the volume of data you are describing, it's unclear if this will make a significant difference. All you can do is try it and see.
I think you really need to identify where the performance problem is, that is, where the time is being spent. For example (and I have seen examples of this many times), the majority of the time might be spent fetching the 400M rows to whatever the "client" is. In that case, rewriting the query, whether in SQL or as PL/SQL, will make no difference.
Anyway, once you can enumerate the problem, you have a better chance of getting sound answers, rather than guesses...
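One rough way to see where the time goes is to measure the time to the first row separately from the time to fetch everything. The sketch below uses Python's sqlite3 only as a stand-in for the Oracle client; `time_query` and the table `big` are hypothetical names, and real numbers would have to come from the actual system and its own tracing tools.

```python
import sqlite3
import time


def time_query(conn, sql, params=()):
    """Time a query in two parts: until the first row, and until all rows are fetched."""
    start = time.perf_counter()
    cursor = conn.execute(sql, params)
    first = cursor.fetchone()
    first_row_s = time.perf_counter() - start

    rows = 0 if first is None else 1
    for _ in cursor:  # drain the rest of the result set
        rows += 1
    total_s = time.perf_counter() - start
    return {"rows": rows, "first_row_s": first_row_s, "total_s": total_s}


# Small demo table (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE big (id INTEGER PRIMARY KEY, val TEXT)")
conn.executemany(
    "INSERT INTO big (val) VALUES (?)", [(f"v{i}",) for i in range(5000)]
)

result = time_query(conn, "SELECT id, val FROM big")
print(result)
```

If most of the total is spent after the first row arrives, the bottleneck is likely the transfer to the client rather than the query logic.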

how to improve sql performance in below case [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 7 years ago.
Resources: 128 GB RAM, 6 TB disk space, SQL Server 2014 Enterprise.
Problem: our DB has a few tables, including 5 read-only tables of approximately 200 GB each.
Each of these 5 tables has 5 columns, all varchar(100).
Each column contains repeated strings or values. Two columns (col3 and col4) have approximately 3,500 unique strings or values that repeat.
Our only query is: select col1......(all columns) from table x where col1 like 'searchstring%' or col2 like 'searchstring%'. Only these two columns are queried, in an OR clause, as they are the only match criteria.
Presently the query takes 1 to 1.5 hours to return results with the present index.
I am wondering if there is an efficient way to get results within 5 to 10 minutes.
Thanks
Open SQL Server Management Studio and look at the execution plan for the query. This will show you any recommended indexes as well as the rate-limiting steps (say, I/O). Make sure you have no cardinality mismatches, and on large tables make sure that you keep your statistics up to date.
It all rather depends on your database, but I suggest trying to split the query in two to remove the OR.
I also recommend investigating functional indexes; you might be able to index on the first 10 characters, for instance, and then the query may be faster.
Try everything, you never know what will work.
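The "split the query in two" suggestion can be sketched as a UNION of two single-column searches. The example below runs on Python's sqlite3 rather than SQL Server, and the `id` key column, the index names, and the sample rows are assumptions added for illustration. It only demonstrates that the two forms return the same rows; whether the split form is faster depends on the plan the real engine chooses.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'id' is a hypothetical key column; with it, UNION's duplicate removal
# cannot collapse two distinct rows that happen to hold equal values.
conn.execute(
    "CREATE TABLE x (id INTEGER PRIMARY KEY, col1 TEXT, col2 TEXT, col3 TEXT)"
)
conn.execute("CREATE INDEX idx_x_col1 ON x (col1)")
conn.execute("CREATE INDEX idx_x_col2 ON x (col2)")
conn.executemany(
    "INSERT INTO x (col1, col2, col3) VALUES (?, ?, ?)",
    [
        ("apple pie", "banana", "a"),
        ("cherry", "apple tart", "b"),
        ("apple", "apple", "c"),
        ("grape", "melon", "d"),
    ],
)

OR_QUERY = (
    "SELECT id, col1, col2, col3 FROM x "
    "WHERE col1 LIKE ? OR col2 LIKE ? ORDER BY id"
)

# Each branch filters on one column, so each can use its own index.
UNION_QUERY = (
    "SELECT id, col1, col2, col3 FROM x WHERE col1 LIKE ? "
    "UNION "
    "SELECT id, col1, col2, col3 FROM x WHERE col2 LIKE ? "
    "ORDER BY id"
)


def search_or(conn, prefix):
    pattern = prefix + "%"
    return conn.execute(OR_QUERY, (pattern, pattern)).fetchall()


def search_union(conn, prefix):
    pattern = prefix + "%"
    return conn.execute(UNION_QUERY, (pattern, pattern)).fetchall()
```

Note that without a unique key in the select list, UNION would merge fully identical rows, so the results could differ from the OR form on tables that contain duplicates.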

Inputting data to database by many users [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 8 years ago.
Each of the salesmen should make a forecast of his sales. I know how he can input data directly from an Excel sheet into a SQL table. Do I need to create different tables, one table per salesman? At the end I need to aggregate all the forecasts. Is it possible to do this with just one table?
The condition is that one salesman is not allowed to see the other salesmen's forecasts.
This seems to be a common problem: many different users inputting data into a database, with restrictions on access.
Update: each salesman is in a different town. Say we have 500 salesmen, so gathering data from 500 Excel files into one big Excel file and then loading it into SQL is not practical.
Actually, you don't need to create a different table for each salesman. One table is enough to hold all of your salesmen's Excel data. A simple query will return each salesman's forecast sales.
You need at least two tables. You need a staging table to receive the Excel data and perform the necessary validation, transformation, etc. You need at least one table for data storage. Given that you are talking about people and sales, you probably want a normalized database. If you don't know what that means, I've heard good things about the book, Database Design for Mere Mortals.
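The staging-plus-storage idea might look roughly like the sketch below, again using Python's sqlite3 for a runnable illustration. All table names, column names, and sample rows are hypothetical. The per-salesman restriction is shown only as a filtered query; in a real system it would be enforced by database permissions, views, or the application layer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE salesmen (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    -- Staging keeps everything as text, exactly as it arrives from Excel.
    CREATE TABLE forecast_staging (salesman_id TEXT, period TEXT, amount TEXT);
    -- One storage table for all salesmen, keyed by salesman and period.
    CREATE TABLE forecasts (
        salesman_id INTEGER NOT NULL REFERENCES salesmen(id),
        period TEXT NOT NULL,
        amount REAL NOT NULL,
        PRIMARY KEY (salesman_id, period)
    );
    """
)
conn.executemany("INSERT INTO salesmen VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany(
    "INSERT INTO forecast_staging VALUES (?, ?, ?)",
    [
        ("1", "2024-01", "100.5"),
        ("2", "2024-01", "200"),
        ("2", "2024-02", "not a number"),  # fails validation
        ("9", "2024-01", "50"),            # unknown salesman
    ],
)


def load_from_staging(conn):
    """Validate staged rows and copy the good ones; return the rejected rows."""
    known = {row[0] for row in conn.execute("SELECT id FROM salesmen")}
    rejected = []
    for raw_id, period, raw_amount in conn.execute(
        "SELECT salesman_id, period, amount FROM forecast_staging"
    ).fetchall():
        try:
            salesman_id, amount = int(raw_id), float(raw_amount)
        except (TypeError, ValueError):
            rejected.append((raw_id, period, raw_amount))
            continue
        if salesman_id not in known:
            rejected.append((raw_id, period, raw_amount))
            continue
        conn.execute(
            "INSERT INTO forecasts VALUES (?, ?, ?)", (salesman_id, period, amount)
        )
    return rejected


def forecasts_for(conn, salesman_id):
    """Rows one salesman is allowed to see: only his own."""
    return conn.execute(
        "SELECT period, amount FROM forecasts WHERE salesman_id = ? ORDER BY period",
        (salesman_id,),
    ).fetchall()


def total_by_period(conn):
    """Aggregate across all salesmen."""
    return conn.execute(
        "SELECT period, SUM(amount) FROM forecasts GROUP BY period ORDER BY period"
    ).fetchall()


rejected_rows = load_from_staging(conn)
```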

Database Schema SQL Rows vs Columns [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 9 years ago.
I have a lot of databases with relatively large numbers of columns, ranging from 5 to 300. Each table has at least 50,000 rows in it.
What is the most effective way to store this data? Presently the data has just been dumped into an indexed SQL database.
It was suggested to me to create 4 columns as follows.
Column Name, Column category, Row ID, Row Data.
Example data would be:
Male, 25-40, 145897, 365
Would this be faster? Would this be slower? Are there better ways to store such large and bulky databases?
I will almost never be updating or changing the data. It will simply be output to a 'datatables' dynamic table where it will be sorted, limited, etc. The category column will be used to break up the columns on the table.
Normalize your db!
I have struggled with this "theory" for a long time and experience has proven that if you can normalize data across multiple tables it is better and performance will not suffer.
Do not try to put all the data in one row with hundreds of columns. Not because of performance, but for ease of development.
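To make the layout proposed in the question concrete, here is a small sketch in Python's sqlite3. The table name `cells`, the snake_case column names, and the second sample row are assumptions; only the first row comes from the question's example. Each original cell becomes one row, so reading back a single logical record means collecting several rows that share a row ID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One row per original cell: (column name, column category, row id, value).
conn.execute(
    "CREATE TABLE cells ("
    " column_name TEXT NOT NULL,"
    " column_category TEXT NOT NULL,"
    " row_id INTEGER NOT NULL,"
    " row_data INTEGER,"
    " PRIMARY KEY (row_id, column_name, column_category))"
)
conn.executemany(
    "INSERT INTO cells VALUES (?, ?, ?, ?)",
    [
        ("Male", "25-40", 145897, 365),    # example row from the question
        ("Female", "25-40", 145897, 410),  # hypothetical second cell
    ],
)


def cells_for_row(conn, row_id):
    """Collect every cell belonging to one logical row of the original table."""
    return conn.execute(
        "SELECT column_name, column_category, row_data FROM cells "
        "WHERE row_id = ? ORDER BY column_name",
        (row_id,),
    ).fetchall()
```

A table that was 300 columns wide turns into up to 300 rows per record in this layout, which is the trade-off to weigh against the wide table.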

Which type of database structure design is better for performance? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 9 years ago.
MSSQL database. I have a problem creating a new database from the data of an old database. The old database structure is thousands of tables connected to each other by ID. In these tables, data is duplicated many times. The old database tables have more than 50,000 rows (users). The structure looks like this:
Users (id, login, pass, register-date, update-date),
Users-detail (id, users_id, some data)
Users-some-data (id, users_id, some data)
and there are hundreds of tables of this kind.
The question is which db structure to choose: one table with all of this data, or hundreds of tables separated by theme.
Which type of db structure would give better performance?
Select id, login, pass from ONE_BIG_TABLE
or
Select * from SMALL_ONLY_LOGINS_TABLE.
Answer really depends on the use. No one can optimize your database for you if they don't know the usage statistics.
Correct DB design dictates that an entity is stored inside a single table, that is, the client with their details for example.
However, this rule can change when you only access or write some of the entity data multiple times, and/or if there is optional info you store about a client (e.g. some long texts, biography, history, extra addresses, etc.), in which case it would be optimal to store it in a child table.
If you find yourself with a bunch of columns holding all-null values, you should strongly consider a child table.
If you only need to try login credentials against the DB table, a stored procedure that returns a bool value depending on if the username/password are correct, will save you the round-trip of the data.
Without indexes the select on the smaller tables will be faster. But you can create the same covering index (id, login, pass) on both tables, so if you need only those 3 columns performance will probably be the same on both tables.
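The covering-index point can be checked with a query plan. The sketch below uses Python's sqlite3 instead of MSSQL; the extra columns `bio` and `history`, the index name, and the choice to lead the index with `login` (so a login lookup can seek into it) are assumptions for illustration. When the index contains every column the query needs, the engine can answer from the index alone, regardless of how wide the table is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE one_big_table ("
    " id INTEGER PRIMARY KEY,"
    " login TEXT NOT NULL,"
    " pass TEXT NOT NULL,"
    " bio TEXT,"       # hypothetical wide columns
    " history TEXT)"
)
# The index holds all three columns the login query reads.
conn.execute(
    "CREATE INDEX idx_login_cover ON one_big_table (login, id, pass)"
)
conn.executemany(
    "INSERT INTO one_big_table (login, pass, bio, history) VALUES (?, ?, ?, ?)",
    [(f"user{i}", f"hash{i}", "long text " * 20, "more text " * 20)
     for i in range(500)],
)

LOGIN_SQL = "SELECT id, login, pass FROM one_big_table WHERE login = ?"


def query_plan(conn, sql, params=()):
    """Return the human-readable detail column of SQLite's query plan."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql, params)]


plan = query_plan(conn, LOGIN_SQL, ("user42",))
row = conn.execute(LOGIN_SQL, ("user42",)).fetchone()
print(plan)
```

In SQLite the plan text mentions a covering index when the table itself is not touched; other engines report the same situation in their own plan formats.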
The general question of which database structure is better cannot be answered without knowing how your database is used.