For example, a website offers the ability to create mobile surveys. Each survey ID is a FK in the survey response table, which contains ALL of the survey responses.
What is the size limitation of this table in a SQL Server 2008 database if the table contains, say, 20 varchar(255) fields, including the bigint PK and FK?
I realize this would depend on the file size limitation as well, but I would like a more educated answer than my own guess on this.
In terms of searchability, some fields that contain geo-related details, such as the survey ID, city, state, and two comments fields, would have to be searchable and thus indexed. Should I index only these fields?
Also, aged responses would expire after a given amount of time and thus be deleted from the table. Would the table, being very large at that point, need to be re-indexed/cleaned up after the deletions (which would be an automated process)?
Thanks.
Maximum Capacity Specifications for SQL Server
Bytes per row: 8,060
Rows per table: Limited by available storage
Note
SQL Server supports row-overflow storage, which enables variable-length columns to be pushed off-row. Only a 24-byte root is stored in the main record for variable-length columns pushed out of row; because of this, the effective row limit is higher than in previous releases of SQL Server. For more information, see the "Row-Overflow Data Exceeding 8 KB" topic in SQL Server Books Online.
You mention 'table size' -- does this mean number of rows?
Maximum Capacity Specifications for SQL Server
Rows per table: Limited by available storage
As per this Reference, the max size of a table is limited by the available storage.
It sounds like you are going to have a high-traffic, high-content table. You should consider performance and storage enhancements like table partitioning. Also, because this table will see frequent INSERTs/UPDATEs/DELETEs, carefully plan out your indexing, as indexes add overhead for DML statements on the table.
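For the aging-out requirement, partitioning on a response date lets the periodic purge switch an old partition out to a staging table and drop it, instead of running a huge DELETE. A rough sketch follows; every name and the CreatedDate column are assumptions on my part, and table partitioning in SQL Server 2008 requires Enterprise edition:

CREATE PARTITION FUNCTION pfResponseMonth (datetime)
AS RANGE RIGHT FOR VALUES ('2012-01-01', '2012-02-01', '2012-03-01');

CREATE PARTITION SCHEME psResponseMonth
AS PARTITION pfResponseMonth ALL TO ([PRIMARY]);

CREATE TABLE dbo.SurveyResponsePartitioned (
    ResponseId  bigint NOT NULL,
    SurveyId    bigint NOT NULL,
    CreatedDate datetime NOT NULL,
    -- ... the varchar(255) answer columns ...
    CONSTRAINT PK_SurveyResponsePartitioned
        PRIMARY KEY CLUSTERED (CreatedDate, ResponseId)
) ON psResponseMonth (CreatedDate);

If you stay with a single table and bulk deletes, scheduling ALTER INDEX ... REORGANIZE (or REBUILD when fragmentation is heavy) after the purge covers the clean-up the question asks about.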
My question is very simple: does a SQL query with fewer attributes cost less?
Example:
Let's say our users table has 10 columns like userId, name, phone, email, ...
SELECT name, phone FROM users WHERE userId='id'
is cheaper than this
SELECT * FROM users WHERE userId='id'
Is this true from the perspective of resource utilization?
It depends.
It is certainly possible that limiting the number of columns in the projection improves performance, but it depends on what indexes are available. If we assume that userId is either the primary key or at least an indexed column, you'd expect the database's optimizer to determine which row(s) to fetch by doing a lookup using an index that has userId as the leading column.
If there is an index on (userId, phone), or if phone is an included column on the index (if your database supports that concept), the database can get the phone from the index it used to find the row(s) to return. In this way, the database never has to visit the actual table to fetch the phone. An index that has all the information the database needs to process the query without visiting the table is known as a "covering index". Roughly speaking, it is probably about as costly to search the index for the rows to return as it is to visit the table to fetch additional columns for the projection. If you can limit the number of columns in the projection in order to use a covering index, that too may significantly reduce the cost of the query, and even more significantly if visiting the table to fetch every column involves doing multiple reads because of chained rows or out-of-line LOB columns in Oracle, TOAST-able data types in PostgreSQL, etc.
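For instance, in SQL Server or PostgreSQL you could build a covering index for the example query with an INCLUDE clause; the index name here is mine, and you would adjust it to the columns you actually query:

CREATE INDEX ix_users_userId ON users (userId) INCLUDE (name, phone);

-- With this index in place,
--   SELECT name, phone FROM users WHERE userId = 'id'
-- can be satisfied entirely from the index, whereas
--   SELECT * FROM users WHERE userId = 'id'
-- still has to visit the base table for the remaining columns.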
Reducing the number of columns in the projection will also decrease the amount of data that needs to be sent over the network and the amount of memory required on the client to process that data. This tends to be most significant when you have larger fields. For example, if one of the columns in the users table happened to be an LDAP path for the user's record, that could easily be hundreds of characters in length and account for half the network bandwidth consumed and half the memory used on the middle tier. Those things probably aren't critical if you're building a relatively low-traffic internal line-of-business application that needs to serve a few hundred users. They are probably very critical if you're building a high-volume SaaS application that needs to serve millions of users.
In the grand scheme of things, both are negligible.
If the data is stored by rows, there isn't much of a difference, as retrieving a row doesn't cost much. Perhaps if one of the columns were particularly large, then avoiding its retrieval would be beneficial.
But if the data is stored by columns, then the first query is cheaper, as each column is stored in a different location.
What is the maximum number of results possible from the following SQL query for DB2 on z/OS?
SELECT NAME FROM SYSIBM.SYSTABLES WHERE TYPE='T' AND CREATOR=? ORDER BY NAME ASC
This query is intended to fetch a list of all table names under a specific schema/creator in a DB2 subsystem.
I am having trouble finding a definitive answer. According to IBM's "Limits in DB2 for z/OS" article, the maximum number of internal objects for a DB2 database is 32767. Objects include views, indexes, etc.
I would prefer a more specific answer for maximum number of table names under one schema. For instance, here is an excerpt from an IDUG thread for a related question:
Based on the limit of 32767 objects in one database, where each tablespace takes two entries, and tables and indexes take one entry each, then the theoretical max would seem to be, with one tablespace per database,
32767 - 2 (for the single tablespace) = 32765 / 2 = 16382 tables, assuming you need at least one index per table.
Are these assumptions valid (each tablespace takes two entries, at least one index per table)?
assuming you need at least one index per table.
That assumption doesn't seem valid. Tables don't always have indexes. And you are thinking about edge cases where someone is already doing something weird, so I definitely wouldn't presume there will be indexes on each table.*
If you really want to handle all possible cases, I think you need to assume that you can have up to 32765 tables (two object identifiers are needed for a table space, as mentioned in the quote).
*Also, the footnote in the documentation you linked indicates that an index takes up two internal object descriptors. So the math is also incorrect in that quote. It would actually be 10921 tables if they each had an index. But I don't think that is relevant anyway.
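For what it's worth, the corrected arithmetic under the quote's own assumptions (one table space, one index per table, an index costing two OBDs) works out to:
32767 total OBDs - 2 (the single table space) = 32765 OBDs available
32765 / (1 OBD per table + 2 OBDs per index) = 10921 tables, rounding down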
I'm not sure your assumptions are appropriate, because there are just too many possibilities to consider, and in the grand scheme of things it probably doesn't make much difference to the answer from your point of view.
I'll rephrase your question to make sure I understand you correctly: you are after the maximum number of rows, i.e. the worst-case scenario, that could possibly be returned by your SQL query?
DB2 System Limits
Maximum databases: Limited by system storage and EDM pool size
Maximum number of databases: 65217
Maximum number of internal objects for each database: 32767
The number of internal object descriptors (OBDs) for external objects is as follows:
Table space: 2 (minimum required)
Table: 1
Therefore, the maximum number of rows from your SQL query is:
65217 * (32767 - 2) = 2,136,835,005
N.B. DB2 for z/OS does not have a 1:1 ratio between schemas and databases
N.N.B. This figure assumes 32,765 tables per table space per database, i.e. a 32765:1:1 ratio
I'm sure ±2 billion rows is NOT a "reasonable" expectation for the maximum number of table names that might show up under a schema, but it is possible.
Hi, I would like to know the maximum number of columns allowed per table for the different storage engines, and the max row size. I searched the MariaDB website documentation and could not find the information. Thank you.
MariaDB in its current form is still close enough to MySQL that the same limits apply. The MariaDB fork may diverge further as time goes on.
The actual answer for the maximum number of columns per table in MySQL is complex, because of differences in data types, storage engines, and metadata storage. Sorry there's not just a simple answer.
As @Manquer cites, there's an absolute limit of 64KB per row, but BLOB/TEXT columns don't count toward this limit.
InnoDB pages must fit at least two rows per page, and a page is 16KB minus some header information. So regardless of the number of columns, the row must fit in about 8000 bytes. But VARCHAR/BLOB/TEXT columns can overflow to additional pages in interesting ways. See http://www.mysqlperformanceblog.com/2010/02/09/blob-storage-in-innodb/ for the details.
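As a hedged illustration of that in-page limit (table and column names invented here), a definition along these lines has a maximum possible row size well over the ~8000-byte threshold once InnoDB reserves its 768-byte in-page prefix per long VARCHAR under the older COMPACT row format:

CREATE TABLE wide_demo (
  id INT UNSIGNED NOT NULL PRIMARY KEY,
  c01 VARCHAR(3000), c02 VARCHAR(3000), c03 VARCHAR(3000),
  c04 VARCHAR(3000), c05 VARCHAR(3000), c06 VARCHAR(3000),
  c07 VARCHAR(3000), c08 VARCHAR(3000), c09 VARCHAR(3000),
  c10 VARCHAR(3000), c11 VARCHAR(3000), c12 VARCHAR(3000)
) ENGINE=InnoDB ROW_FORMAT=COMPACT;
-- Depending on innodb_strict_mode, this is either rejected at CREATE time
-- or accepted with a warning and fails later with "Row size too large
-- (> 8126)". ROW_FORMAT=DYNAMIC, which stores long columns fully
-- off-page, sidesteps the in-page limit (the 65,535-byte row limit and
-- the .FRM restrictions still apply).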
But there are even further restrictions, based on the .FRM metadata file that applies to all storage engines. It gets really complex; read http://www.mysqlperformanceblog.com/2013/04/08/understanding-the-maximum-number-of-columns-in-a-mysql-table/ for the details and examples about that. There's too much to copy down here.
Given the latter blog article, I was able to design a table that failed at 59 columns.
MariaDB, being originally a fork of and drop-in replacement for MySQL, mostly follows the same design constraints as MySQL, although the MariaDB documentation does not explicitly say how many columns are allowed.
This number is highly dependent on a number of factors, including the storage engine used and the way the columns are structured. For InnoDB this is 1,000.
See the explanation below from the official documentation (Ref: Column Count Limit):
There is a hard limit of 4096 columns per table, but the effective maximum may be less for a given table. The exact limit depends on several interacting factors.
Every table (regardless of storage engine) has a maximum row size of 65,535 bytes. Storage engines may place additional constraints on this limit, reducing the effective maximum row size.
The maximum row size constrains the number (and possibly size) of columns because the total length of all columns cannot exceed this size.
...
Individual storage engines might impose additional restrictions that limit table column count.
InnoDB permits up to 1000 columns.
This applies to MariaDB as well.
I am looking for an up to date tool to accurately calculate the total row size and page-density of any SQL table definition for SQL Server 2005+.
Please note that there are plenty of resources concerning calculating the sizes of rows in existing tables, estimating techniques for sizing, etc. However, I am designing tables and have some options about column size, which I am trying to balance with efficient data access, meaning that I can relocate less frequently accessed long text into dedicated tables so that the most frequent access to these new tables operates at optimum speed.
Ideally there would be an online facility where a create statement can be cut and pasted, or a sproc I can run on a dev db.
The answer is a simple one until you start doing proper table design and balancing that against joins, FK data, and disk access.
I'd have a look and see how many data pages you are using, and remember that one reads an extent (8 data pages) from disk, not only the data page you are looking for. Then there are options such as data compression on your table, sparse columns, out-of-row data storage, and variable-length characters.
It's not about how much data is in a column; it's really about how many data reads and how much CPU you need to get it. This you can test by executing a query and looking at the actual execution plan.
As for space used, you can use a stored procedure called sp_spaceused. Here is a source you can use to see how one could use it in dbforms.
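A quick sketch of both of those checks, with dbo.MyWideTable as a placeholder name for whatever table you are sizing:

EXEC sp_spaceused N'dbo.MyWideTable';

-- Page count and page density (avg_page_space_used_in_percent) per index:
SELECT index_id, page_count, avg_page_space_used_in_percent
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'dbo.MyWideTable'), NULL, NULL, 'DETAILED');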
Hope it helps
Walter
Need to exceed the 8k record limit in SQL Server 2008 when using wide columns/sparse tables.
Long story short: new client, old system built on a survey system, pivoting the data so each answer is a column.
I have 1500 columns and now I am getting:
Cannot create a row that has sparse data of size 9652 which is greater than the allowable maximum sparse data size of 8019.
I need to exceed the 8k record limit if possible.
Not possible, because SQL Server stores rows on 8K pages. The only way to do so would be to store some of the data off-row (e.g. using MAX or other LOB types for some of your columns). To your application, this will still look like it's in the same row, even though physically it is stored in a completely different area of disk.
If your sparse column set alone exceeds the limit, sorry, you'll need to look at a different way to store the data (either not pivoted, EAV, or simply use two tables joined by a key, each containing half of the column set). For the latter you can make this relatively transparent to users by using views and/or enforcing all data access / DML through stored procedures that understand the division.
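A minimal sketch of that last option, with every name invented for illustration: two tables split the pivoted answer columns and share the key, and a view stitches them back into one wide "row":

CREATE TABLE dbo.SurveyAnswersA (
    ResponseId bigint NOT NULL PRIMARY KEY,
    Answer0001 varchar(100) SPARSE NULL,
    Answer0002 varchar(100) SPARSE NULL
    -- ... first half of the pivoted answer columns ...
);

CREATE TABLE dbo.SurveyAnswersB (
    ResponseId bigint NOT NULL PRIMARY KEY
        REFERENCES dbo.SurveyAnswersA (ResponseId),
    Answer0751 varchar(100) SPARSE NULL,
    Answer0752 varchar(100) SPARSE NULL
    -- ... second half of the pivoted answer columns ...
);
GO
-- Presents the two halves as a single wide row to readers:
CREATE VIEW dbo.SurveyAnswers
AS
SELECT a.ResponseId,
       a.Answer0001, a.Answer0002,
       b.Answer0751, b.Answer0752
FROM dbo.SurveyAnswersA AS a
JOIN dbo.SurveyAnswersB AS b ON b.ResponseId = a.ResponseId;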