How can you identify the PK columns in a View - sql-server-2005

I used to use 'GetSchemaTable' to read schema information, but it was missing some 'stuff', so I wrote a big query, referencing, among other columns, sys.columns,sys.index_columns, and sys.indexes (and other tables) to return the same information I used to get from GetSchemaTable and also return the other pieces of information I want.
Problem is that GetSchemaTable will tell me if a column returned from a view is a Key column from the underlying tables but my new query does not. It'll give me the right answer all day long for tables, but not for views.
Does anyone have a solution to this? I'd hate to have to go back to GetSchemaTable just for that one bit of information when I'm examining a view. (Plus, I really just want a SQL-based solution, ideally.)
Thanks!

Unfortunately, this is not very easy in SQL Server 2005. I have played with this a bit and gotten very close, but the approach relies on the columns in your view being named exactly the same as they are in the base table. This is because the view sys.sql_dependencies (deprecated as of SQL Server 2008) does not properly store the referencing column_id, so there is no way to match a dependency up with the actual columns in the view. I think SQL Server 2008 will have better options for you, as it introduces yet another new set of dependency objects. I also didn't chase down any paths with INFORMATION_SCHEMA.KEY_COLUMN_USAGE, but since those views rely solely on names and not IDs of any kind, you are likely in the same pickle there. So maybe this can be a start for you, but like I said, it will only cover the simple cases: if you alias your columns, you will be out of luck. Maybe someone else with some insight into the intricacies of how these things are referenced will pull a rabbit out of a hat and figure out how to reference mismatched columns...
-- very simple; one-column key:
CREATE TABLE dbo.boo
(
far INT PRIMARY KEY
);
GO
CREATE VIEW dbo.view_boo
AS
SELECT far FROM dbo.boo;
GO
-- slightly more complex. Two-column key,
-- not all columns are in key, view columns
-- are in different order:
CREATE TABLE dbo.foo
(
splunge INT,
a INT,
mort INT,
PRIMARY KEY(splunge, mort)
);
GO
CREATE VIEW dbo.view_foo
AS
SELECT
splunge,
mort,
a
FROM
dbo.foo;
GO
SELECT
QUOTENAME(OBJECT_SCHEMA_NAME(v.[object_id])) + '.'
+ QUOTENAME(v.name) + '.' + QUOTENAME(vc.name)
+ ' references '
+ QUOTENAME(OBJECT_SCHEMA_NAME(t.[object_id]))
+ '.' + QUOTENAME(t.name) + '.' + QUOTENAME(tc.name)
FROM
sys.views AS v
INNER JOIN
sys.sql_dependencies AS d
ON v.[object_id] = d.[object_id]
INNER JOIN
sys.tables AS t
ON d.referenced_major_id = t.[object_id]
INNER JOIN
sys.columns AS tc
ON tc.[object_id] = t.[object_id]
INNER JOIN
sys.index_columns AS ic
ON tc.[object_id] = ic.[object_id]
AND tc.column_id = ic.column_id
AND tc.column_id = d.referenced_minor_id
INNER JOIN
sys.columns AS vc
ON vc.[object_id] = v.[object_id]
AND vc.name = tc.name -- the part I don't like
INNER JOIN
sys.indexes AS i
ON ic.[object_id] = i.[object_id]
AND ic.index_id = i.index_id -- match the specific index, not just the table
AND i.is_primary_key = 1
ORDER BY
t.name,
ic.key_ordinal;
GO
DROP VIEW dbo.view_boo, dbo.view_foo;
DROP TABLE dbo.foo, dbo.boo;
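For comparison, here is a rough sketch of the INFORMATION_SCHEMA route mentioned above. It is an untested assumption on my part rather than something from the original answer: it joins INFORMATION_SCHEMA.VIEW_COLUMN_USAGE to KEY_COLUMN_USAGE and TABLE_CONSTRAINTS purely by name, so it reports which base-table primary key columns a view uses, not what the view calls them. It therefore has the same aliasing limitation as the query above. Run it before the DROP statements if you want to try it against the sample objects.

```sql
-- Sketch: list base-table PRIMARY KEY columns that each view references.
-- Matching is by name only, so aliased view columns are not resolved.
SELECT
    vcu.VIEW_SCHEMA,
    vcu.VIEW_NAME,
    vcu.TABLE_SCHEMA,
    vcu.TABLE_NAME,
    vcu.COLUMN_NAME,          -- the base table's column name
    kcu.ORDINAL_POSITION      -- position of the column within the key
FROM INFORMATION_SCHEMA.VIEW_COLUMN_USAGE AS vcu
INNER JOIN INFORMATION_SCHEMA.KEY_COLUMN_USAGE AS kcu
    ON  kcu.TABLE_SCHEMA = vcu.TABLE_SCHEMA
    AND kcu.TABLE_NAME   = vcu.TABLE_NAME
    AND kcu.COLUMN_NAME  = vcu.COLUMN_NAME
INNER JOIN INFORMATION_SCHEMA.TABLE_CONSTRAINTS AS tc
    ON  tc.CONSTRAINT_SCHEMA = kcu.CONSTRAINT_SCHEMA
    AND tc.CONSTRAINT_NAME   = kcu.CONSTRAINT_NAME
    AND tc.CONSTRAINT_TYPE   = 'PRIMARY KEY'
ORDER BY
    vcu.VIEW_NAME,
    kcu.ORDINAL_POSITION;
```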

Improving Query performance - T-SQL

I have a table that is already ordered by a 'datetime' column: each row stores the UTC date at insert time, so the rows arrive in order. It's a very heavily populated table, so I am trying to improve the query performance, if that is possible.
When I use something like WHERE columnDateTime > dateToSearch, it takes too long to return the rows. Since my table is already ordered by columnDateTime, what could I do to improve this query's performance? For example, when a table is ordered by a column cod and you search for cod > 40, the optimizer will stop searching when it finds cod = 41 and return the rest of the table, because it knows the table is ordered by that index. Is there a way to tell SQL Server that my table is already ordered by columnDateTime too?
Inserting the data in order doesn't mean it is saved in order. Without getting too technical and for faster performance:
Create a CLUSTERED INDEX on that column. This requires that there is no other clustered index on your table, which means it either has no PRIMARY KEY or has a NONCLUSTERED one (which is not the default). With a clustered index, the engine can seek into the index (instead of scanning the whole table) when filtering with > datetimeValue and doesn't need to access additional pages for the data, since the leaves of a clustered index are the data.
Create a NONCLUSTERED INDEX on that column. There are no restrictions on this option (at least for this case), but for each row matching your date filter, the engine will need to access another page to fetch the requested columns, unless you INCLUDE them when creating your index. Keep in mind that included columns increase the size of the index and add maintenance work, for example whenever an included column is modified.
That aside, you should check your query plan; if you have joins, function calls or additional conditions, the SQL engine might not use the indexes even if they exist. There are many things that could make a query run slowly, so you will have to post the full query execution plan (for a start) for anyone to check the details.
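As a minimal sketch of the two index options, assuming a table called dbo.YourTableName with the columnDateTime column from the question (the index names and the included columns SomeColumn and OtherColumn are placeholders I made up):

```sql
-- Option 1: cluster the table on the date column.
-- Only works if the table has no clustered index yet.
CREATE CLUSTERED INDEX CIX_YourTableName_columnDateTime
    ON dbo.YourTableName (columnDateTime);

-- Option 2: a nonclustered index that also carries the columns the query
-- selects, so matching rows can be answered from the index alone.
-- SomeColumn and OtherColumn are hypothetical; list the columns you actually select.
CREATE NONCLUSTERED INDEX IX_YourTableName_columnDateTime
    ON dbo.YourTableName (columnDateTime)
    INCLUDE (SomeColumn, OtherColumn);
```

You would normally pick one of the two, not both, and building either on a very large table can take a while.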
You can use this query to check if your table already has indexes:
DECLARE @table_name VARCHAR(200) = 'YourTableName'
SELECT
SchemaName = SCHEMA_NAME(t.schema_id),
TableName = t.name,
IndexName = ind.name,
IndexType = CASE ind.index_id WHEN 0 THEN 'Heap' WHEN 1 THEN 'Clustered' ELSE 'Nonclustered' END,
Disabled = ind.is_disabled,
ColumnOrder = ic.index_column_id,
ColumnName = col.name,
ColumnType = y.name,
ColumnLength = y.max_length,
ColumnIncluded = ic.is_included_column
FROM
sys.indexes ind
INNER JOIN sys.index_columns ic ON ind.object_id = ic.object_id and ind.index_id = ic.index_id
INNER JOIN sys.columns col ON ic.object_id = col.object_id and ic.column_id = col.column_id
INNER JOIN sys.tables t ON ind.object_id = t.object_id
INNER JOIN sys.types y ON y.user_type_id = col.user_type_id
WHERE
t.is_ms_shipped = 0 AND
t.name = @table_name
ORDER BY
SchemaName,
t.name,
ind.name,
ic.index_column_id
You need to make sure that there is at least one index that has your datetimeColumn with ColumnOrder = 1 and it's not disabled. If it already exists then your problem lies elsewhere and we won't be able to help much without more detail.

Find related columns among hundreds of tables for future relational identification

I am using SQL Server 2016 to pull information out of our ERP system that is stored in a DB2 database. This has thousands of tables with no keys inside of them. When pulling tables from the system, I want to be able to identify matching column names in tables so I can start creating relationships and keys when building dimensions.
Is there a way to create a query that will search my database for column names and list every table that uses that column name? I have been using OPENQUERY and INFORMATION_SCHEMA.TABLES to determine the tables I want to pull over but now I want to start determining relationships between those tables.
Any help would be much appreciated!
You can look in the old yet gold system tables.
A few examples
find all tables with a column named like ID
select so.name, sc.name
from sys.sysobjects so
join sys.syscolumns sc on sc.id = so.id
where so.xtype = N'U'
and sc.name like 'ID%'
Find the FKs from a table
select so2.name
from sys.sysobjects so
join sys.sysforeignkeys fk on so.id = fk.rkeyid
join sys.sysobjects so2 on fk.fkeyid = so2.id
where so.name = 'MyTable'
Check MSDN documentation for further reference and if you want any specific combination just post a new question.
I had to do something similar once, and ended up using something similar to this:
SELECT
T.name
,C1.name
,C2.Name
FROM sys.Tables T
INNER JOIN sys.Columns C1
ON C1.object_id = T.object_id
CROSS APPLY
(
SELECT OBJECT_NAME(CX.object_id) + '.' + CX.Name AS Name
FROM sys.Tables TX
INNER JOIN sys.Columns CX
ON CX.object_id = TX.object_id
AND TX.is_ms_shipped = 0
WHERE CX.object_id <> T.object_id
AND CX.name = C1.name
AND CX.user_type_id = C1.user_type_id
) C2
;
Of course, the problem with any query that we can post here is that it will be extremely generalized, because we aren't familiar with your schema. It's entirely possible, for example, that you will have tables like these:
T_Customers:
ID | Name
1 | George
2 | Jane
3 | John
T_Shipments:
ID | Customer_ID
1 | 1
2 | 1
3 | 3
In a case such as that, T_Shipments.Customer_ID should be linked to T_Customers.ID, but won't be in this query, because the name is different.
To search for cases like that, I modified the query later to do a second comparison with concatenations and pattern searches. Not the speediest, but certainly the most thorough - we found all sorts of things we didn't know before. Unfortunately, I can't even begin to guess what your tables/attributes might look like without a lot of further details.
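The modified query itself isn't shown here, but a rough sketch of the idea could look like the following. It assumes the naming convention from the example above (tables prefixed with T_, parent key columns called ID, child columns ending in _ID); those conventions are assumptions for illustration and will almost certainly need adjusting for a real schema.

```sql
-- Sketch: pair columns like T_Shipments.Customer_ID with T_Customers.ID
-- by stripping the '_ID' suffix and pattern-matching on the table name.
SELECT
    ChildTable   = ct.name,
    ChildColumn  = cc.name,
    ParentTable  = pt.name,
    ParentColumn = pc.name
FROM sys.tables AS ct
INNER JOIN sys.columns AS cc
    ON cc.object_id = ct.object_id
INNER JOIN sys.tables AS pt
    ON pt.object_id <> ct.object_id
INNER JOIN sys.columns AS pc
    ON pc.object_id = pt.object_id
   AND pc.name = 'ID'                 -- assumed name of the parent key column
WHERE cc.name LIKE '%[_]ID'           -- child column ends in _ID
  AND LEN(cc.name) > 3                -- something must precede the suffix
  AND pt.name LIKE 'T[_]' + LEFT(cc.name, LEN(cc.name) - 3) + '%';
```

Because the prefix is fed straight into LIKE, any underscore inside it acts as a single-character wildcard, so expect some false positives to review by hand.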
Edit:
Please note that the CROSS APPLY includes a reference to user_type_id, because I wasn't interested at the time in finding columns that had the same name but were a different data type. That might not be the case for you, so you can remove that reference if it isn't relevant.

How to list all tables which references a certain row of another table in SQL Server?

I'm currently working on a project that needs to clear out some unused fields and normalize the tables in its database in order to tidy it up.
One of these tables has a field that is not a foreign key (but it should be), so I can't use sp_help to find out which tables are related.
My current task is to delete the rows with IdTipoEspecialidad = 3, but only if no other table uses these rows (because I need to delete both).
Is there some shortcut or query that makes this task easier?
Assuming all these non-foreign-keyed columns at least follow some naming convention, you can execute the following query:
SELECT
'SELECT * FROM [' + schemas.name + '].[' + tables.name + ']'
+ ' WHERE [' + columns.name + '] = 3'
FROM
sys.schemas
INNER JOIN sys.tables
ON schemas.schema_id = tables.schema_id
INNER JOIN sys.columns
ON tables.object_id = columns.object_id
WHERE
columns.name LIKE '%IdTipoEspecialidad%'
The output of that query will give you a bunch of other queries, which you can run to see if there is any column referencing that specific row.
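If there are many candidate tables, a variation on the same idea is to generate IF EXISTS checks instead, so that running the generated batch prints only the columns that actually contain the value. This is a sketch under the same naming-convention assumption, with the value 3 hard-coded as in the question:

```sql
-- Sketch: generate one IF EXISTS ... PRINT statement per candidate column.
-- Running the generated statements prints 'schema.table.column' for each hit.
SELECT
    'IF EXISTS (SELECT 1 FROM ' + QUOTENAME(s.name) + '.' + QUOTENAME(t.name)
  + ' WHERE ' + QUOTENAME(c.name) + ' = 3) PRINT '
  + '''' + REPLACE(s.name + '.' + t.name + '.' + c.name, '''', '''''') + ''';'
FROM sys.schemas AS s
INNER JOIN sys.tables AS t
    ON s.schema_id = t.schema_id
INNER JOIN sys.columns AS c
    ON t.object_id = c.object_id
WHERE c.name LIKE '%IdTipoEspecialidad%';
```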

Searching for text over multiple columns?

Our company is using SQL Server 2008 to manage the website's database. We have recently had a name change, so I need to change all occurrences of our old company name.
The database has multiple tables and A LOT of columns in each table; I am only interested in finding and updating the text in all columns of one table.
Essentially, what I need to be able to do is find the string "CSQTC" across all columns in a table named "Practices". Note that some columns may not contain strings, and some values in the columns may be null.
I think I know how to search for text over multiple columns, but it is a lot of code and I am sure there is a better way. This is how I think I'd do it:
WHERE columnName LIKE '%CSQTC%'
AND columnName2 LIKE '%CSQTC%'
AND columnName3 LIKE '%CSQTC%'
AND ....
Surely there is a better way?
Thanks heaps!
EDIT: I forgot to ask how I can replace each occurrence of 'CSQTC' with 'GPTQ' instead. Thanks again.
You could probably write a stored procedure that would
look for all columns containing text in your table (I don't know sql server especially, but with MySQL you'd look in the information_schema)
for each matching column, do an update request to replace the string you want.
And then obviously call that procedure (and maybe discard it unless you think you'll need it later on).
For the replacement part, it will be a simple use of REPLACE(columnName, 'CSQTC', 'GPTQ') (see REPLACE documentation at Microsoft's Technet)
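For a single column, the update could look roughly like this. The table name dbo.Practices comes from the question, while PracticeName is a made-up column name used only for illustration:

```sql
-- Preview which rows would change (PracticeName is a hypothetical column).
SELECT PracticeName
FROM dbo.Practices
WHERE PracticeName LIKE '%CSQTC%';

-- Replace the old name with the new one in that column.
-- NULL values never match the LIKE filter, so they are left untouched.
UPDATE dbo.Practices
SET PracticeName = REPLACE(PracticeName, 'CSQTC', 'GPTQ')
WHERE PracticeName LIKE '%CSQTC%';
```

As far as I know, REPLACE does not accept text or ntext columns directly, so those would need to be cast (for example to VARCHAR(MAX) or NVARCHAR(MAX)) first.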
P.S. see SQL server query to get the list of columns in a table along with Data types, NOT NULL, and PRIMARY KEY constraints for how to get columns of a table, and SQL Server stored procedure beginner's guide [closed] for stored procedures on sql server.
You can start with this:
SELECT c.name ,
t.name
FROM sysobjects o
INNER JOIN syscolumns c ON c.id = o.id
INNER JOIN systypes t ON t.xusertype = c.xusertype
WHERE o.name = 'YourTableName'
AND t.name IN ( 'varchar', 'nvarchar', 'text', 'ntext', 'char', 'nchar' )
To get all the columns that have text from the table.
Also you can do this:
SELECT 'Update ' + QUOTENAME(o.name) + ' Set ' + QUOTENAME(c.name) + ' = Replace(' + QUOTENAME(c.name) + ', ''CSQTC'', ''GPTQ'')'
+ ' Where ' + QUOTENAME(c.name) + ' LIKE ''%CSQTC%'''
FROM sysobjects o
INNER JOIN syscolumns c ON c.id = o.id
INNER JOIN systypes t ON t.xusertype = c.xusertype
WHERE o.name = 'YourTableName'
AND t.name IN ( 'varchar', 'nvarchar', 'text', 'ntext', 'char', 'nchar' )
To get the Update instructions for each column of the table.

Temp tables on the current connection

If I do:
select * from tempdb.sys.tables
I will see all the temporary tables in the system, however that view does not have information about which connection/user each table belongs to. I'm interested in finding only the tables I've created on my current connection. Is there a way to do this?
thanks - e
p.s. yes, I could try reading each table listed with the notion that those that succeed should prove to be mine (on recent versions one can't read other connections' tables) but that is too costly an approach since there may be thousands of tables on the system
p.p.s. I did read Is there a way to get a list of all current temporary tables in SQL Server? which asks the right question but did not get a good answer
Assuming you don't name your #temp tables with three consecutive underscores, this should only pick up your #temp tables. It won't, however, pick up your table variables, nor can you change this code somehow to pick up the tables on someone else's connection. This only works because OBJECT_ID('tempdb..#foo') only returns a non-NULL value for a table in your own session.
SELECT
name = SUBSTRING(t.name, 1, CHARINDEX('___', t.name)-1),
t.[object_id]
FROM tempdb.sys.tables AS t
WHERE t.name LIKE '#%[_][_][_]%'
AND t.[object_id] =
OBJECT_ID('tempdb..' + SUBSTRING(t.name, 1, CHARINDEX('___', t.name)-1));
You might also be interested in space used by each of these tables (at least for the heap or clustered index), e.g.:
SELECT
name = SUBSTRING(t.name, 1, CHARINDEX('___', t.name)-1),
t.[object_id],
p.used_page_count,
p.row_count
FROM tempdb.sys.tables AS t
INNER JOIN tempdb.sys.dm_db_partition_stats AS p
ON t.[object_id] = p.[object_id]
WHERE t.name LIKE '#%[_][_][_]%'
AND p.index_id IN (0,1)
AND t.[object_id] =
OBJECT_ID('tempdb..' + SUBSTRING(t.name, 1, CHARINDEX('___', t.name)-1));
You could extend that to show total space for all indexes. I didn't bother aggregating per partition since these are #temp tables.
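To see why the name has to be trimmed before calling OBJECT_ID, here is a small sketch you can run in a scratch session (#foo is just an example name):

```sql
-- Create a temp table in the current session.
CREATE TABLE #foo (id INT);

-- Resolves to a non-NULL object_id only in the session that owns #foo.
SELECT OBJECT_ID('tempdb..#foo') AS my_object_id;

-- The name stored in tempdb is '#foo' padded with underscores plus a
-- unique suffix, which is why the queries above cut it at the first '___'.
SELECT t.name
FROM tempdb.sys.tables AS t
WHERE t.[object_id] = OBJECT_ID('tempdb..#foo');

DROP TABLE #foo;
```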
select *
from tempdb.sys.objects
where object_id('tempdb.dbo.' + name, 'U') is not null
AND name LIKE '#%'
Would tell you all the tables in tempdb beginning with # that you can access, but Aaron's script just blew me out of the water haha
To find out the name of the user who created the object, you just need to take the schema ID and cross-reference it with the sys.schemas view:
Select sch.name as 'User Owner' from tempdb.sys.tables TBL
join tempdb.sys.schemas SCH on TBL.schema_id = SCH.schema_id
where TBL.name like '#tmp_Foo%'