While testing a website by adding records via the UI, I cannot always tell which tables are being updated. I would like a query - one for MSSQL and one for PostgreSQL - that returns the last entries added or modified in the database, without knowing the table in advance, so I can figure out which tables are related to the feature I am looking at.
In this case I cannot provide an example because I cannot tell which table is being updated and how.
If you are just trying to track "what table(s) is this UI writing to" without wanting to use Extended Events or Query Store to see what commands are actually running, and the service hasn't been restarted since the UI did its thing, and nobody else is using the database, you can do something like this:
SELECT TOP (10) -- or some other arbitrary number, or no TOP at all
[Schema] = s.name,
[Table] = t.name,
LastWrite = MAX(ius.last_user_update)
FROM sys.schemas AS s
INNER JOIN sys.objects AS t
ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.dm_db_index_usage_stats AS ius
ON ius.[object_id] = t.[object_id]
WHERE ius.database_id = DB_ID() -- usage stats are server-wide; restrict to this database
GROUP BY s.name, t.name
ORDER BY LastWrite DESC;
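The question also asked for a PostgreSQL version. PostgreSQL does not track a per-table last-write timestamp, but pg_stat_user_tables exposes per-table write counters, so a rough counterpart (a sketch only, assuming statistics collection is enabled, which it is by default) is to snapshot the counters before and after the UI does its thing and compare:
-- Run once before and once after exercising the UI, then diff the results;
-- tables whose n_tup_ins / n_tup_upd / n_tup_del changed are the ones being written to.
SELECT schemaname,
       relname,
       n_tup_ins,
       n_tup_upd,
       n_tup_del
FROM pg_stat_user_tables
ORDER BY n_tup_ins + n_tup_upd + n_tup_del DESC;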
But it seems like a narrow use case and can be invalidated by a lot of variables. If you want to know what your UI is doing, look at the code, or use Extended Events to monitor.
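If you do go the Extended Events route, a minimal session sketch could look something like the following (the session name and database name are placeholders; read the ring_buffer target or use Watch Live Data in SSMS while exercising the UI):
CREATE EVENT SESSION TraceUIWrites ON SERVER
ADD EVENT sqlserver.rpc_completed(
    ACTION(sqlserver.client_app_name, sqlserver.sql_text)
    WHERE sqlserver.database_name = N'YourDatabase'),
ADD EVENT sqlserver.sql_batch_completed(
    ACTION(sqlserver.client_app_name, sqlserver.sql_text)
    WHERE sqlserver.database_name = N'YourDatabase')
ADD TARGET package0.ring_buffer;
GO
ALTER EVENT SESSION TraceUIWrites ON SERVER STATE = START;
-- ... exercise the UI, inspect the captured statements, then stop and drop the session.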
I am writing the following query (execution plan attached). It takes 30 seconds to load just 80 rows.
Is there anything we can do to reduce the time of running this query?
select
CO.ContributorsName [ContributorsName]
, D.DocumentLastPublished DocumentLastPublished
, CO.ContributorsImage [AuthorImage]
, T.NodeAliasPath
, D.DocumentID
, BD.*
from CMS_Tree T
inner join Cms_Class CC
on T.NodeClassID = CC.ClassID
and CC.ClassName = 'wv.blogdata'
inner join Cms_Document D
on T.NodeID = D.DocumentNodeID
inner join WV_BlogData BD
on D.DocumentForeignKeyValue = BD.BlogDataID
and COALESCE(BD.IsDeleted, 0) = 0
inner join WV_Contributors CO
on BD.AuthorID = CO.ContributorsID
where (
'ALL' = 'ALL'
or category = 'All'
)
and DocumentCulture = 'en-US'
Don't use * to select every column from your tables; specify only the columns you actually need. Check your WHERE clause as well.
Covering indexes
(Looking at your execution plan, it looks like you've already got the appropriate covering indexes, but this is good general advice, and still worth a try)
If this is a frequently used query, make sure you've got the appropriate covering indexes on the tables involved. See this MSDN page for how to identify potential missing indexes. Note that adding indexes will improve query performance, at the cost of degrading your insert performance. You will also need to make sure you've got the appropriate maintenance plans in place to ensure your indexes don't get fragmented or unbalanced.
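As a starting point, the missing-index DMVs show indexes the optimizer would have liked to use; here is a rough sketch (not the exact query from the linked page, and the suggestions should be reviewed rather than applied blindly):
SELECT TOP (20)
    d.statement AS table_name,
    d.equality_columns,
    d.inequality_columns,
    d.included_columns,
    s.user_seeks,
    s.avg_user_impact
FROM sys.dm_db_missing_index_details AS d
INNER JOIN sys.dm_db_missing_index_groups AS g
    ON g.index_handle = d.index_handle
INNER JOIN sys.dm_db_missing_index_group_stats AS s
    ON s.group_handle = g.index_group_handle
WHERE d.database_id = DB_ID()
ORDER BY s.user_seeks * s.avg_user_impact DESC;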
Query changes
I'd also recommend trying some changes to your query and comparing the execution plans.
It's difficult to make any meaningful suggestions without looking at your database and being able to try a few things.
From a cursory look at your query, the most obvious thing I can see is that you're performing an inner join on Cms_Class, but not selecting any data from it or joining it to any other tables (apart from CMS_Tree). I'd suggest removing this join and using an EXISTS clause instead, like so:
select
CO.ContributorsName [ContributorsName]
, D.DocumentLastPublished DocumentLastPublished
, CO.ContributorsImage [AuthorImage]
, T.NodeAliasPath
, D.DocumentID
, BD.*
from CMS_Tree T
inner join Cms_Document D
on T.NodeID = D.DocumentNodeID
inner join WV_BlogData BD
on D.DocumentForeignKeyValue = BD.BlogDataID
and COALESCE(BD.IsDeleted, 0) = 0
inner join WV_Contributors CO
on BD.AuthorID = CO.ContributorsID
where (
'ALL' = 'ALL'
or category = 'All'
)
and DocumentCulture = 'en-US'
and exists
(
select null
from Cms_Class CC
where T.NodeClassID = CC.ClassID
and CC.ClassName = 'wv.blogdata'
)
Give it a try, look at the execution plans, and see if it makes a difference for you.
If you create new covering indexes, re-run your queries and look at the execution plans again, because the most efficient query with missing indexes might not be the most efficient query once you've added indexes.
Document caching (SQL isn't always the best solution for accessing data)
Assuming you've done both of these, and the query performance is still too poor, you may want to ask yourself if you really need to query live data. Looking at your query, it looks like you're querying data from a CMS. The data in a CMS is only going to change when a content author actually makes a change. Most of the time, the data will stay the same from request to request. This means that doing a direct query from SQL every time you want to access content might be overkill for your needs.
A good use-case example is to look at how Umbraco CMS accesses its data. It keeps an XML document cache of all of the published documents on a given site. When a content author publishes changes, it then updates the XML document cache.
Accessing the cache is much more efficient than talking to SQL directly, and they even warn users not to use their SQL API for serving up CMS content, because it is too slow.
I'm struggling to understand how the following two queries could be blocking each other.
Running query (could be almost anything though):
insert bulk [Import].[WorkTable] ...
I'm trying to run the following SELECT query at the same time:
SELECT *
FROM ( SELECT * FROM #indexPart ip
JOIN sys.indexes i (NOLOCK)
ON i.object_id = ip.ObjectId
and i.name = ip.IndexName) i
CROSS APPLY sys.dm_db_index_physical_stats(db_id(), i.object_id, i.index_id, NULL, 'LIMITED') ps
WHERE i.is_disabled = 0
The second query is blocked by the first query and shows LCK_M_IS as the wait type. Important information: the temporary table #indexPart contains a single record, for an index on a completely different table. My expectation is that the cross apply runs the stats only for that one index, which has nothing to do with the table the other query is writing to.
Thanks
EDIT (NEW):
After several more tests I think I found the culprit but again can't explain it.
The bulk insert session has an X lock on table [Import].[WorkTable].
The query above is checking for an index on table [Import].[AnyOtherTable], BUT it is requesting an IS lock on [Import].[WorkTable]. I've verified again and again that the query above (when run without the cross apply) only returns an index on table [Import].[AnyOtherTable].
Now here comes the magic: changing the CROSS APPLY to an OUTER APPLY makes it run through just fine, without any locking issues.
I hope someone can explain this to me ...
The problem could be the placement of your WHERE clause; it should be inside the derived table. The following change could make a difference:
FROM ( SELECT * FROM #indexPart ip
JOIN sys.indexes i (NOLOCK)
ON i.object_id = ip.ObjectId
and i.name = ip.IndexName
WHERE i.is_disabled = 0) i
Filtering inside the derived table like this may reduce the number of rows passed on to the CROSS APPLY.
Counting rows in tables with a large amount of data can be very slow, sometimes taking minutes; it can also cause deadlocks on a busy server. I want to display real values, so NOLOCK is not an option.
The servers I use are SQL Server 2005 or 2008, Standard or Enterprise - if it matters.
I imagine that SQL Server maintains a row count for every table, so if there is no WHERE clause I should be able to get that number pretty quickly, right?
For example:
SELECT COUNT(*) FROM myTable
should return immediately with the correct value. Do I need to rely on statistics being up to date?
A very close approximation (ignoring any in-flight transactions) would be:
SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND p.index_id IN (0,1);
This will return much, much quicker than COUNT(*), and if your table is changing quickly enough, it's not really any less accurate - if your table has changed between when you started your COUNT (and locks were taken) and when it was returned (when locks were released and all the waiting write transactions were now allowed to write to the table), is it that much more valuable? I don't think so.
If you have some subset of the table you want to count (say, WHERE some_column IS NULL), you could create a filtered index on that column, and structure the where clause one way or the other, depending on whether it was the exception or the rule (so create the filtered index on the smaller set). So one of these two indexes:
CREATE INDEX IAmTheException ON dbo.myTable(some_column)
WHERE some_column IS NULL;
CREATE INDEX IAmTheRule ON dbo.myTable(some_column)
WHERE some_column IS NOT NULL;
Then you could get the count in a similar way using:
SELECT SUM(p.rows) FROM sys.partitions AS p
INNER JOIN sys.tables AS t
ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s
ON s.[schema_id] = t.[schema_id]
INNER JOIN sys.indexes AS i
ON p.[object_id] = i.[object_id]
AND p.index_id = i.index_id
WHERE t.name = N'myTable'
AND s.name = N'dbo'
AND i.name = N'IAmTheException'; -- or N'IAmTheRule'
And if you want to know the opposite, you just subtract from the first query above.
(How large is "large amount of data"? - should have commented this first, but maybe the exec below helps you out already)
If I run a query on a static (means no one else is annoying with read/write/updates in quite a while so contention is not an issue) table with 200 million rows and COUNT(*) in 15 seconds on my dev machine (oracle).
Considering the pure amount of data, this is still quite fast (at least to me)
Since you said NOLOCK is not an option, you could consider
exec sp_spaceused 'myTable'
as well. But this boils down to nearly the same thing as NOLOCK (it ignores contention and in-flight deletes/updates, as far as I know).
I've been working with SSMS for well over a decade and only in the past year found out that it can give you this information quickly and easily, thanks to this answer.
Select the "Tables" folder from the database tree (Object Explorer)
Press F7 or select View > Object Explorer Details to open Object Explorer Details view
In this view you can right-click on the column header to select the columns you want to see, including table space used, index space used, and row count.
Note that support for this in Azure SQL Database seems spotty at best - my guess is that the queries from SSMS are timing out, so it only returns a handful of tables on each refresh; however, the highlighted table always seems to be returned.
COUNT will do either a table scan or an index scan, so for a high number of rows it will be slow. If you do this operation frequently, the best approach is to keep the count in a separate table.
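If you do go the separate-count-table route, a minimal sketch might look like the following (the table, trigger, and column names here are illustrative, not something the answer prescribes, and the trigger adds a little write overhead plus a hot spot on the counter row):
-- Side table holding the running count, kept in sync by a trigger.
CREATE TABLE dbo.TableRowCounts
(
    TableName sysname NOT NULL PRIMARY KEY,
    RowCnt    bigint  NOT NULL
);

-- Seed it once with the real count.
INSERT dbo.TableRowCounts (TableName, RowCnt)
SELECT N'myTable', COUNT_BIG(*) FROM dbo.myTable;
GO

CREATE TRIGGER dbo.trg_myTable_RowCount
ON dbo.myTable
AFTER INSERT, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    UPDATE dbo.TableRowCounts
       SET RowCnt = RowCnt + (SELECT COUNT_BIG(*) FROM inserted)
                           - (SELECT COUNT_BIG(*) FROM deleted)
     WHERE TableName = N'myTable';
END;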
If, however, you do not want to do that, you can create a dummy index (one that will not be used by your queries) and query its number of items, something like:
select
row_count
from sys.dm_db_partition_stats as p
inner join sys.indexes as i
on p.index_id = i.index_id
and p.object_id = i.object_id
where i.name = 'your index'
I am suggesting creating a new index because this one (since it will not be used by queries) will not get locked during other operations.
As Aaron Bertrand said, maintaining an extra index might be more costly than using an already existing one. So the choice is yours.
If you just need a rough count of the number of rows, e.g. to make sure a table loaded properly or that the data was not deleted, do the following:
MySQL> connect information_schema;
MySQL> select table_name,table_rows from tables;
I have an Audit database (created by someone else).
Something is populating it with table-size data (which makes sense, as it is an audit database).
The SQL Server has a lot of jobs.
I want to know what is populating the audit tables.
Is there anything like syscomments, etc., which can tell me what is populating the tables, or do I have to check the code inside each job?
Regards
Manjot
You could try running something like this:
SELECT DISTINCT
o.name,o.type_desc
FROM sys.sql_modules m
INNER JOIN sys.objects o ON m.object_id=o.object_id
WHERE m.definition Like '%YourTableName%'
ORDER BY 2,1
EDIT after OP mentioned SQL Server 2000
This should work on SQL Server 2000:
--remove comments to see the actual text too
SELECT DISTINCT
o.name --,c1.colid,c1.text
FROM sysobjects o
INNER JOIN syscomments c1 ON o.id = c1.id
--join to next section of code in case search value is split over two rows
LEFT OUTER JOIN syscomments c2 ON o.id = c2.id AND c2.colid=c1.colid+1
WHERE c1.text Like '%YourTableName%'
OR RIGHT(c1.text,100)+LEFT(c2.text,100) Like '%YourTableName%'
ORDER BY 1--,2
Try looking at the command column in msdb..sysjobsteps for the destination table names; this will only work if the jobs use T-SQL to populate the tables. If they use an SSIS (or DTS) package, this won't work.
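For example, something along these lines (the audit table name is a placeholder):
SELECT j.name AS job_name,
       s.step_id,
       s.step_name,
       s.command
FROM msdb.dbo.sysjobs AS j
INNER JOIN msdb.dbo.sysjobsteps AS s
    ON s.job_id = j.job_id
WHERE s.command LIKE N'%YourAuditTable%';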
Most likely it is being populated by triggers on the audited tables.
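To check that theory, you can list the triggers and the tables they sit on; a quick sketch using the old sysobjects catalog (since SQL Server 2000 was mentioned - on 2005 and later sys.triggers works as well):
SELECT name AS trigger_name,
       OBJECT_NAME(parent_obj) AS table_name
FROM sysobjects
WHERE type = 'TR'
ORDER BY table_name, trigger_name;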
If you know what causes data to go into the audit table, you can run a (very) brief Profiler session against the database, filtering specifically on that table, while triggering the action. That will give you further steps to back-trace the root action.
I've been trying to come up with a good design pattern for mapping data contained in relational databases to the business objects I've created but I keep hitting a wall.
Consider the following tables:
TYPE: typeid, description
USER: userid, username, usertypeid->TYPE.typeid, imageid->IMAGE.imageid
IMAGE: imageid, location, imagetypeid->TYPE.typeid
I would like to gather all the information regarding a specific user. Creating a query for this isn't too difficult.
SELECT u.*, ut.*, i.*, it.* FROM user u
INNER JOIN type ut ON ut.typeid = u.usertypeid
INNER JOIN image i ON i.imageid = u.imageid
INNER JOIN type it ON it.typeid = i.imagetypeid
WHERE u.userid = #userid
The problem is that the field names collide and then I'm forced to alias every single field which gets out of hand very quickly.
Does anyone have a decent design pattern for this kind of thing?
I've thought about retrieving multiple results from a single stored procedure and then using a dataset to iterate through each one but I'm worried that some performance issues might bite me in the butt later. For example instead of the above query something like:
SELECT u.*, t.* FROM user u
INNER JOIN type t ON t.typeid = u.usertypeid
WHERE u.userid = #userid;
SELECT i.*, t.* FROM image i
INNER JOIN type t ON t.typeid = i.imagetypeid
INNER JOIN user u ON u.imageid = i.imageid
WHERE u.userid = #userid;
Does that seem like a decent solution? Can anyone foresee any issues with this approach?
Never use the SQL * wildcard in production code. Always spell out all the columns you want to retrieve.
Then aliasing some of them doesn't seem like such a huge amount of extra work.
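For example, with the tables above, the explicit list only needs aliases where names actually collide (which columns you keep is up to you; the ones below are just illustrative):
SELECT
    u.userid,
    u.username,
    ut.description AS user_type_description,
    i.imageid,
    i.location     AS image_location,
    it.description AS image_type_description
FROM user u
INNER JOIN type ut ON ut.typeid = u.usertypeid
INNER JOIN image i ON i.imageid = u.imageid
INNER JOIN type it ON it.typeid = i.imagetypeid
WHERE u.userid = #userid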
Re your comment asking for background and reasoning:
Sometimes you don't really need every column from all tables, and fetching them can be needlessly costly (especially for large strings and blobs). There is no SQL syntax for "all columns except the following exceptions."
You can't alias columns that you fetch using the wildcard. Once you need to alias any of the columns, you need to expand the wildcard to list all the columns explicitly.
If the table structure changes, e.g. columns are renamed, reordered, dropped, or added, then the wildcard fetches them all, by position as defined in the tables. This may seem like a convenience, but not when your application depends on columns being in the result set by a given name or in a given position. You can get mysterious bugs where your application displays columns in the wrong order (if referencing columns by position), or shows them as blank (if referencing columns by name).
However, if the SQL query names columns explicitly, you can employ the "Fail Early" principle. This helps debugging, because it leads you directly to the SQL query that needs to be edited to account for the schema change.