Count(*) differs from rows in sys.partitions - sql

I'm using the following query to get information about all tables in a DB:
SELECT
t.NAME AS TableName,
i.name as indexName,
sum(p.rows) as RowCounts,
sum(a.total_pages) as TotalPages,
sum(a.used_pages) as UsedPages,
sum(a.data_pages) as DataPages,
(sum(a.total_pages) * 8) / 1024 as TotalSpaceMB,
(sum(a.used_pages) * 8) / 1024 as UsedSpaceMB,
(sum(a.data_pages) * 8) / 1024 as DataSpaceMB
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%' AND
i.OBJECT_ID > 255 AND
i.index_id <= 1
GROUP BY
t.NAME, i.object_id, i.index_id, i.name
ORDER BY
object_name(i.object_id)
The problem is that for some tables it reports a different row count than if I do:
select count(*) FROM someTable
Why is that?
Edit:
The first query returns a higher count:
First: 1 240 464
Second: 413 496

The problem is that there's more than one allocation_unit per partition, so the same partition can appear more than once and therefore the sum(p.rows) ends up counting the same partition more than once, so you get a multiple of the right number of rows.
Here's how I solved the problem:
(note that my query isn't identical to yours, I have slightly different columns and used Kb rather than Mb, but the idea is the same)
SELECT
s.Name + '.' + t.name AS table_name,
(select sum(p2.rows)
from sys.indexes i2 inner join sys.partitions p2 ON i2.object_id = p2.OBJECT_ID AND i2.index_id = p2.index_id
where i2.object_id = t.object_id and i2.object_id > 255 and (i2.index_id = 0 or i2.index_id = 1)
) as total_rows,
SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN a.total_pages * 8 ELSE 0 END) AS data_size_kb,
SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN a.used_pages * 8 ELSE 0 END) AS data_used_kb,
SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN 0 ELSE a.total_pages * 8 END) AS index_size_kb,
SUM(CASE WHEN (i.index_id=0) OR (i.index_id=1) THEN 0 ELSE a.used_pages * 8 END) AS index_used_kb,
SUM(a.total_pages) * 8 AS total_size_kb,
SUM(a.used_pages) * 8 AS total_used_kb,
SUM(a.used_pages) * 100 / CASE WHEN SUM(a.total_pages) = 0 THEN 1 ELSE SUM(a.total_pages) END AS percent_full
FROM
sys.tables t
INNER JOIN
sys.schemas s ON s.schema_id = t.schema_id
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.is_ms_shipped = 0 AND i.OBJECT_ID > 255
GROUP BY
t.object_id, t.Name, s.Name
ORDER BY SUM(a.total_pages) DESC

From the sys.partitions documentation
rows bigint Approximate number of rows in this partition.
(emphasis mine). The system views aren't going to keep a spot-on number of rows in the table. Think of what that would entail and how much overhead it would add to all insert/delete statements. If I were a betting man, I'd say that it's doing something with the count of the number of pages in the clustered index or heap which is a far cheaper operation. That's purely speculative, though.

Have you looked into the help article regarding the sys.allocation_units view? Apparently, the container_id field is a little more than it seems. Try to add this to the where section:
and a.type = 2

In SQL Server 2016, To fix the count(*) and sys.partitions mismatch I performed an Index Rebuild on the Primary Key. Luckily the TABLE only had 2.4 million rows so it didn't take that long as I have standard edition so couldn't do the rebuild ONLINE.

Inner joins will cause unmatched rows to be filtered out. Groups will also affect your row counts as they can combine rows. These two conditions were cause a lower row count for the aggregate query than the simple count(*).
I see specifically you're asking about the sys.partitions table. The likely explanation is that there is not a match for every row in the sys.indexes table given the match condition of i.object_id = p.OBJECT_ID AND i.index_id = p.index_id. Try running this:
Select
count(*)
from
sys.partitions p
LEFT JOIN
sys.indexes i ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
You will then likely see the count you expect. Remove the count function to simply Select * ... to find the unmatched rows.

Related

SQL multiple tables reach

I have this chunk of code here, thing is it query selected "TestLog" and shows:
name,
schema,
rowcount,
totalspace,
usedspace,....
Thing is I need this information to be queried for more than one table, lets say a 5 specific ones... can anyone help out?
USE myDB
GO
SELECT
t.name AS TableName,
s.name AS SchemaName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 / 1024 AS TotalSpaceMB,
SUM(a.used_pages) * 8 / 1024 AS UsedSpaceMB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 / 1024 AS UnusedSpaceMB
FROM
sys.tables t
INNER JOIN sys.indexes i ON t.object_id = i.object_id
INNER JOIN sys.partitions p ON i.object_id = p.object_id AND i.index_id = p.index_id
INNER JOIN sys.allocation_units a ON p.partition_id = a.container_id
LEFT OUTER JOIN sys.schemas s ON t.schema_id = s.schema_id
WHERE
t.name = ('TestLog')
AND t.is_ms_shipped = 0
AND i.object_id > 255
GROUP BY
t.name, s.name, p.rows
ORDER BY
t.name;
GO
It works as below:
-- specific tables
WHERE
t.name in ('Test1' , 'Test2' , 'Test3')
AND t.is_ms_shipped = 0
AND i.object_id > 255
-- one specific table
WHERE
t.name = 'TestLog'
AND t.is_ms_shipped = 0
AND i.object_id > 255
-- specific tables with test in front
WHERE
t.name like 'Test%'
AND t.is_ms_shipped = 0
AND i.object_id > 255

SQL SERVER much faster than COUNT(*) with JOINS DB consists more than 30 million rows with 30 columns

SELECT COUNT(*) FROM my_Table
making the system slow when we search for more than 30 million rows with 30 columns
so we interacted with the sys.partitions, sys.tables
Query:
SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND p.index_id IN (0,1);
However we want it for a JOIN. Please suggest
Execution plan for the Search Query:
https://www.brentozar.com/pastetheplan/?id=HyzzW-Oyf
Execution plan for the Count:
https://www.brentozar.com/pastetheplan/?id=rkkKfxO1M

Records not shown while selecting from table in mssql

I'm Getting total tables in whole database and its row-count from following query:
SELECT SCHEMA_NAME(A.schema_id) + '.' +
A.Name, SUM(B.rows) AS 'RowCount'
FROM sys.objects A
INNER JOIN sys.partitions B ON A.object_id = B.object_id
WHERE A.type = 'U'
GROUP BY A.schema_id, A.Name
Order By 'RowCount' desc
After that i'm getting below result:
And then finally when i'm trying to fetch records from one of this table
select * from dbo.[xxx_$Retail ICT Header]
It gives 0 rows as output......Any clue?
To get the Row count in each table try below query and after that cross check by running select query.
SELECT s.name+'.'+o.name,
ddps.row_count
FROM sys.indexes AS i
INNER JOIN sys.objects AS o ON i.OBJECT_ID = o.OBJECT_ID
INNER JOIN sys.schemas s on s.schema_id= o.schema_id
INNER JOIN sys.dm_db_partition_stats AS ddps ON i.OBJECT_ID = ddps.OBJECT_ID
AND i.index_id = ddps.index_id
WHERE i.index_id < 2 AND o.is_ms_shipped = 0 ORDER BY o.NAME

Select all tables with at least one row of data inserted

I want return all tables that have at least one row of data.
I was using this:
SELECT DISTINCT(OBJECT_NAME(OBJECT_ID))
FROM SYS.DM_DB_PARTITION_STATS ST
WHERE ST.ROW_COUNT > 0 AND OBJECT_ID > 100
But I don't want use the table SYS.DM_DB_PARTITION_STATS.
I want to know another way to find these tables?
Any clues?
thanks.
You could use something like this:
SELECT
t.NAME AS TableName,
p.rows AS RowCounts
FROM
sys.tables t
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
WHERE
t.is_ms_shipped = 0
AND p.rows > 0
GROUP BY
t.Name, p.Rows
ORDER BY
t.Name
That would list all tables with the name and number of rows (> 0) in them.
And it doesn't use sys.dm_db_partition_stats ....
You should not use sys.dm_db_partition_stats

Get Table and Index storage size in sql server

I want to get table data and index space for every table in my database:
Table Name Data Space Index Space
-------------------------------------------------------
How can I achieve this result?
This query here will list the total size that a table takes up - clustered index, heap and all nonclustered indices:
SELECT
s.Name AS SchemaName,
t.NAME AS TableName,
p.rows AS RowCounts,
SUM(a.total_pages) * 8 AS TotalSpaceKB,
SUM(a.used_pages) * 8 AS UsedSpaceKB,
(SUM(a.total_pages) - SUM(a.used_pages)) * 8 AS UnusedSpaceKB
FROM
sys.tables t
INNER JOIN
sys.schemas s ON s.schema_id = t.schema_id
INNER JOIN
sys.indexes i ON t.OBJECT_ID = i.object_id
INNER JOIN
sys.partitions p ON i.object_id = p.OBJECT_ID AND i.index_id = p.index_id
INNER JOIN
sys.allocation_units a ON p.partition_id = a.container_id
WHERE
t.NAME NOT LIKE 'dt%' -- filter out system tables for diagramming
AND t.is_ms_shipped = 0
AND i.OBJECT_ID > 255
GROUP BY
t.Name, s.Name, p.Rows
ORDER BY
s.Name, t.Name
If you want to separate table space from index space, you need to use AND i.index_id IN (0,1) for the table space (index_id = 0 is the heap space, index_id = 1 is the size of the clustered index = data pages) and AND i.index_id > 1 for the index-only space
with pages as (
SELECT object_id, SUM (reserved_page_count) as reserved_pages, SUM (used_page_count) as used_pages,
SUM (case
when (index_id < 2) then (in_row_data_page_count + lob_used_page_count + row_overflow_used_page_count)
else lob_used_page_count + row_overflow_used_page_count
end) as pages
FROM sys.dm_db_partition_stats
group by object_id
), extra as (
SELECT p.object_id, sum(reserved_page_count) as reserved_pages, sum(used_page_count) as used_pages
FROM sys.dm_db_partition_stats p, sys.internal_tables it
WHERE it.internal_type IN (202,204,211,212,213,214,215,216) AND p.object_id = it.object_id
group by p.object_id
)
SELECT object_schema_name(p.object_id) + '.' + object_name(p.object_id) as TableName, (p.reserved_pages + isnull(e.reserved_pages, 0)) * 8 as reserved_kb,
pages * 8 as data_kb,
(CASE WHEN p.used_pages + isnull(e.used_pages, 0) > pages THEN (p.used_pages + isnull(e.used_pages, 0) - pages) ELSE 0 END) * 8 as index_kb,
(CASE WHEN p.reserved_pages + isnull(e.reserved_pages, 0) > p.used_pages + isnull(e.used_pages, 0) THEN (p.reserved_pages + isnull(e.reserved_pages, 0) - p.used_pages + isnull(e.used_pages, 0)) else 0 end) * 8 as unused_kb
from pages p
left outer join extra e on p.object_id = e.object_id
Takes into account internal tables, such as those used for XML storage.
Edit: If you divide the data_kb and index_kb values by 1024.0, you will get the numbers you see in the GUI.