Removing several tables from a SQL Server 2005 database - make a new db or use the old one? - sql-server-2005

I'm killing a bunch of tables in a SQL Server db to move to an archive db. The current db has a couple of filegroups and has been working okay growing the tables that are still there. I'll be removing multiple gigabytes, though.
Should I make a new db and move the current tables in there? I'm paranoid about not setting growth right.
There is really only one table that sees a lot of activity - and that goes up to about 14,000 rows in a four-month period.

It depends on what you really want to do. If you use a new db, your existing applications may have to change their connection strings to reflect the new db name.
If you are worried about the growth setting, pick a number close to your projected growth, then watch for autogrow events on the data and log files using the default trace; a query for this is given below. No one gets the size exactly right - you make your best guess from the data available to you and monitor growth from there. If the query returns any rows, bump the numbers accordingly. Also, 14,000 rows in a 4-month period is NOT considered active at all compared to what SQL Server can handle.
DECLARE @filename VARCHAR(255)
SELECT @filename = SUBSTRING(path, 0, LEN(path) - CHARINDEX('\', REVERSE(path)) + 1) + '\Log.trc'
FROM sys.traces
WHERE is_default = 1;
--Check whether the data and log files auto-grew.
SELECT
gt.ServerName
, gt.DatabaseName
, gt.TextData
, gt.StartTime
, gt.Success
, gt.HostName
, gt.NTUserName
, gt.NTDomainName
, gt.ApplicationName
, gt.LoginName
FROM fn_trace_gettable(@filename, DEFAULT) gt
JOIN sys.trace_events te ON gt.EventClass = te.trace_event_id
JOIN sys.databases d ON gt.DatabaseName = d.name
WHERE EventClass in ( 92, 93 ) --'Data File Auto Grow', 'Log File Auto Grow'
ORDER BY StartTime;
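For reference, the SUBSTRING/REVERSE/CHARINDEX expression above only strips the file name off the default trace path and appends \Log.trc. Here is the same transformation sketched in Python (the sample path is made up for illustration; the real one comes from sys.traces):

```python
# a sketch of the same "strip file name, append Log.trc" logic, assuming a
# Windows-style path like the one sys.traces returns
def trace_log_path(default_trace_path):
    # drop everything after the last backslash (the rolling trace file name)
    directory = default_trace_path[:default_trace_path.rfind("\\")]
    return directory + "\\Log.trc"

# hypothetical example path, not a real instance's path
print(trace_log_path("C:\\MSSQL\\LOG\\log_123.trc"))  # C:\MSSQL\LOG\Log.trc
```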

Related

How to Find the Database Views Which are not executed or Accessed for more than 6 Months in SQL Server 2005

I would like to clean up my database by identifying and removing the views and stored procedures that have not been used or accessed for a long period (say, the last 6 months or 1 year) in SQL Server 2005.
Please help.
You can't do this with 100% certainty unless you're running a trace on your system 24/7 and keeping the data, or using the auditing mechanisms of SQL Server 2008.
Bear in mind that the usage statistics are lost whenever the instance restarts; within those limits, you can find the last time a specific object was used with the query below:
select
DB_NAME(us.[database_id]) as [db],
OBJECT_NAME(us.[object_id],us.[database_id]) as [object],
MAX(us.[last_user_lookup]) as [last_user_lookup],
MAX(us.[last_user_scan]) as [last_user_scan],
MAX(us.[last_user_seek]) as [last_user_seek]
from sys.dm_db_index_usage_stats us
where us.[database_id] = DB_ID()
AND us.[object_id] = OBJECT_ID('tblname')
group by us.[database_id], us.[object_id];
Based on @Koushick's answer, I resolved it by using this:
SELECT DB_NAME(us.[database_id]) AS [db],
OBJECT_NAME(us.[object_id], us.[database_id]) AS [object],
MAX(us.[last_user_lookup]) AS [last_user_lookup],
MAX(us.[last_user_scan]) AS [last_user_scan],
MAX(us.[last_user_seek]) AS [last_user_seek]
FROM sys.dm_db_index_usage_stats AS us
WHERE DB_NAME(us.[database_id]) = 'your database name'
AND OBJECT_NAME(us.[object_id], us.[database_id]) = 'your object'
GROUP BY us.[database_id], us.[object_id];
You can then quickly sort and play with the dates to see when objects were last used, as opposed to when they were last modified.
To find views or stored procedures older than a particular date, you can use the following query:
SELECT [name], create_date, modify_date
FROM sys.views -- or sys.procedures
WHERE modify_date <= 'date_older_than_you_want'
To find views that have never been modified since they were created (not the same thing as unused, but a possible starting point), you can use the following query:
SELECT [name], create_date, modify_date
FROM sys.views
WHERE create_date = modify_date

Query execution time MATLAB to MS Access

For some time now I've been testing connectivity between an existing MS Access database and MATLAB. Currently I have the following local configuration (both MATLAB and the DB on the same local drive):
MATLAB 2013a (32 bits) and MS Access 2007.
After resolving connection problems with 64-bit MATLAB, I moved to 32-bit and the connection now works fine. The connection is made via the Database Toolbox:
conn = database('Test_DB', '', '');
What is very annoying is the execution time.
I have compared execution times within MS Access (executing the query directly in the database with the Run! button) with the time MATLAB takes to execute the query and bring the data back with fetch. The difference is almost an order of magnitude.
Typically, I have two big tables (Table1 - 20 columns x 1'000'000 rows and Table2 - 10 columns x 10'000'000 rows). The query is quite simple, combining several fields from both tables based on a selected date. Inside Access (depending on version, 2003 or 2007) it takes roughly 7 to 10 seconds. Executed from MATLAB (the SQL command is exactly the same), it takes between 70 and 75 seconds.
I have tried many things to understand what the issue is here, but with no success. If somebody knows about similar issues, I would be glad to hear some opinions.
To be more specific: Matlab 32 bits ver. 2013a on 64 bits Win 7, i7-3770 with 8GB RAM. For Database Toolbox I use ODBC Microsoft Access Driver 6.01.7601.17632, ODBCJT32.DLL dated 23.12.2011.
The query uses two tables T1 and T2 and looks as follows:
strSQL = [ 'SELECT DISTINCT T1.TF1, T1.SI1, T1.SI2, T2.TF2, T1.DATE1 ' ...
'FROM T2 INNER JOIN T1 ' ...
'ON T2.TF1 = T1.TF1 ' ...
'WHERE (((T1.DATE1)=#', date1, '#));'];
TF1, TF2 are textual fields
SI1, SI2 are numeric (simple) fields
DATE1 is date field
T1 has 7,000,000 rows, 2 text fields, 3 numeric fields, 1 date field
T2 has 13,000 rows, 39 text fields, 12 numeric fields, 1 date field
The additional time spent running it from Matlab is probably in transferring the data out of the Access engine and converting it to Matlab datatypes. There's quite an impedance mismatch there, and Matlab doesn't necessarily use the most efficient types for it.
This is slow enough it sounds like you might be using the default cellarray data return format. This is an inefficient format unsuitable for larger data sets. (Or, imho, much of anything.) It stores all columns, including numerics, in a 2-D cell array.
Switch to the structure or new table data return formats using setdbprefs(). That should give you some speedup and help with the memory fragmentation.
setdbprefs('DataReturnFormat', 'table');
conn = database( ... )
(I'm not sure if table is available yet in R2013a; it's new. Try it and see if it works; it's not well documented even in R2014a where it's definitely available.)
At that point, string and date columns will be your major data transfer cost. If you can alter your query or the schema to return numeric identifiers instead, that could speed things up a lot. And if you have low cardinality string columns, you can convert them to categorical variables to save space once they're inside Matlab.
Retrieving dates as strings is expensive, and you want them to end up as datenums. You may be able to speed this up further by pushing the conversion from SQL DATE to Matlab datenum into the Access layer, using a conversion expression written in SQL. In fact, in this query, since you're already fixing T1.DATE1 to a known value in the WHERE clause, don't retrieve it as a column at all. Just set the column to the known value on the Matlab side. That saves you the expense of transferring and converting the date values. Something like this instead:
setdbprefs('DataReturnFormat', 'table');
conn = database('Test_DB', '', '');
myDates = []; % fill in your list of dates as datenums, not strings
for date = myDates
sql = [ 'SELECT DISTINCT T1.TF1, T1.SI1, T1.SI2, T2.TF2 ' ...
'FROM T2 INNER JOIN T1 ' ...
'ON T2.TF1 = T1.TF1 ' ...
'WHERE (((T1.DATE1)=#', datestr(date, 'yyyy-mm-dd'), '#));'];
curs = fetch(exec(conn, sql));
t = curs.Data;
t.DATE1 = repmat(date, [height(t) 1]);
% ... do stuff with t ...
end
And try using the Native ODBC Connection method. That will save you the added expense of the JDBC-ODBC bridge driver, which is what the plain database('DSN', '', '') connection uses.
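The "skip the constant column" advice is language-agnostic. Here is the same pattern sketched in Python with sqlite3 (table, column names, and data are invented for illustration): the date is pinned by the WHERE clause, so it isn't selected, and the known value is re-attached client-side.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (tf1 TEXT, si1 REAL, date1 TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)",
                 [("a", 1.0, "2014-01-02"),
                  ("b", 2.0, "2014-01-02"),
                  ("c", 3.0, "2014-01-03")])

query_date = date(2014, 1, 2)
# date1 is fixed by the WHERE clause, so don't SELECT it...
rows = conn.execute("SELECT tf1, si1 FROM t1 WHERE date1 = ?",
                    (query_date.isoformat(),)).fetchall()
# ...re-attach the known value client-side instead of transferring
# and converting it once per row
rows = [(tf1, si1, query_date) for tf1, si1 in rows]
print(rows)
```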

Conditional Distinct

Sorry for the kinda longish introduction, but I want to make clear what I'm trying to do. If someone is able to come up with a more appropriate title, please feel free to edit.
I wrote an SNMP collector that queries every switch in our data center once per hour, checks which ports are online, and stores the results in an MS SQL 2k12 DB. The motivation is that admins often don't report a discontinued server or some other device, and we are running out of switch ports.
The DB schema looks like this (simplified screenshot):
The Interfaces table is a child table to the Crawls (Crawl = Run of the SNMP collector) table as the number of interfaces is not constant for every switch but changes between Crawls, as line cards are inserted or removed.
Now I want to write a query that returns every Interface on every Switch that ALWAYS had an ifOperStatus value of 2 and NEVER had an ifOperStatus of 1.
I wrote a query that has three nested sub-queries, is ugly to read and slow as hell. There sure has to be an easier way.
My approach was to filter the ports that NEVER changed by using
HAVING (COUNT(DISTINCT dbo.Interfaces.ifOperStatus) = 1)
and then inner-joining against a list of ports that had an ifOperStatus of 2 during the last crawl. Ugly, as I said.
So, a sample output from the DB would look like this:
And I'm looking for a query that returns rows 5-7 because ifOperStatus never changed, but DOES NOT return rows 3-4 because ifOperStatus flapped.
How about
HAVING (MIN(dbo.Interfaces.ifOperStatus) = 2 AND MAX(dbo.Interfaces.ifOperStatus) = 2)
MIN and MAX don't require SQL Server to maintain a set of all values seen so far, just the highest/lowest. This may also avoid the need to join "against a list of ports that had an ifOperStatus of 2 during the last crawl".
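As an illustration of the MIN/MAX trick, here is a minimal runnable sketch using Python's sqlite3 against a toy interfaces table (the schema is invented for the example, not the poster's actual one):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interfaces (switchId INTEGER, ifIndex INTEGER, ifOperStatus INTEGER);
-- port 1 flapped between up (1) and down (2); port 2 was always down (2)
INSERT INTO interfaces VALUES (1, 1, 1), (1, 1, 2), (1, 2, 2), (1, 2, 2);
""")

rows = conn.execute("""
    SELECT switchId, ifIndex
    FROM interfaces
    GROUP BY switchId, ifIndex
    HAVING MIN(ifOperStatus) = 2 AND MAX(ifOperStatus) = 2
""").fetchall()
print(rows)  # only the port whose status was always 2 survives
```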
select
s.Hostname,
s.sysDescr,
i.ifOperStatus,
i.ifAllias,
i.ifIndex,
i.ifDescr
from
interfaces i
join crawl c on c.id = i.crawlId
join switches s on s.id = c.switchId
where
i.ifOperStatus = 2
and not exists
(
select 'x'
from
interfaces ii
join crawl cc on cc.id = ii.crawlId
join switches ss on ss.id = cc.switchId
where
ss.id = s.id
and ii.ifIndex = i.ifIndex -- correlate on the interface, not just the switch
and ii.ifOperStatus = 1
)
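A runnable sketch of the NOT EXISTS approach, again with an invented toy schema in Python's sqlite3. Note that the subquery here is correlated on the individual interface, not just the switch, so one flapping port doesn't exclude every port on the same switch:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE interfaces (switchId INTEGER, ifIndex INTEGER, ifOperStatus INTEGER);
-- port 1 was up (1) at least once; port 2 was always down (2)
INSERT INTO interfaces VALUES (1, 1, 1), (1, 1, 2), (1, 2, 2), (1, 2, 2);
""")

rows = conn.execute("""
    SELECT DISTINCT i.switchId, i.ifIndex
    FROM interfaces i
    WHERE i.ifOperStatus = 2
      AND NOT EXISTS (
          SELECT 1 FROM interfaces ii
          WHERE ii.switchId = i.switchId
            AND ii.ifIndex = i.ifIndex  -- per interface, not per switch
            AND ii.ifOperStatus = 1
      )
""").fetchall()
print(rows)  # only the always-down port remains
```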

SQL add up rows in a column

I'm running SQL queries in Orion Report Writer for Solarwinds Netflow Traffic Analyzer and am trying to add up data usage for specific conversations coming from the same general sources. In this case it is netflix. I've made some progress with my query.
SELECT TOP 10000 FlowCorrelation_Source_FlowCorrelation.FullHostname AS Full_Hostname_A,
SUM(NetflowConversationSummary.TotalBytes) AS SUM_of_Bytes_Transferred,
SUM(NetflowConversationSummary.TotalBytes) AS Total_Bytes
FROM
((NetflowConversationSummary LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Source_FlowCorrelation ON (NetflowConversationSummary.SourceIPSort = FlowCorrelation_Source_FlowCorrelation.IPAddressSort)) LEFT OUTER JOIN FlowCorrelation FlowCorrelation_Dest_FlowCorrelation ON (NetflowConversationSummary.DestIPSort = FlowCorrelation_Dest_FlowCorrelation.IPAddressSort)) INNER JOIN Nodes ON (NetflowConversationSummary.NodeID = Nodes.NodeID)
WHERE
( DateTime BETWEEN 41539 AND 41570 )
AND
(
(FlowCorrelation_Source_FlowCorrelation.FullHostname LIKE 'ipv4_1.lagg0%')
)
GROUP BY FlowCorrelation_Source_FlowCorrelation.FullHostname, FlowCorrelation_Dest_FlowCorrelation.FullHostname, Nodes.Caption, Nodes.NodeID, FlowCorrelation_Source_FlowCorrelation.IPAddress
So I've got output that filters everything but Netflix sessions (Full_Hostname_A) and their total usage for each session (SUM_of_Bytes_Transferred).
I want to add up SUM_of_Bytes_Transferred to get the total usage for all Netflix sessions listed, which should go into Total_Bytes. I created the Total_Bytes column, but don't know how to output a total to it.
For some asked clarification, here is the output from the above query:
I want the Total_Bytes Column to be all added up into one number.
I have no familiarity with the reporting tool you are using.
From reading your post, I think you want the first two columns of data you've already got, plus, at a later point in the report, a single figure: the sum of the Total_Bytes column you're already producing.
Your reporting tool probably has some means of totalling a column, but you may need to get the support people for the reporting tool to tell you how to do that.
Aside from this, if you can find a way of calling a separate query in a latter section of the report, or if you embed a new report inside your existing report, after the detail section, and use that to run a separate query then you should be able to get the data you want with this:
SELECT Sum(Total_Bytes) as [Total Total Bytes]
FROM ( yourExistingQuery ) x
yourExistingQuery means the query you've already got, in full (it doesn't have to be put on one line); the parentheses are required, and so is the "x" (the latter provides a syntax-required name for the derived table your query defines).
Hope this helps.
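For what it's worth, the derived-table pattern is easy to try outside the reporting tool. A minimal sketch in Python with sqlite3 (table name and data invented), where the inner SELECT stands in for the existing report query:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sessions (hostname TEXT, total_bytes INTEGER)")
conn.executemany("INSERT INTO sessions VALUES (?, ?)",
                 [("a.netflix.com", 100), ("b.netflix.com", 250), ("c.netflix.com", 50)])

# the inner SELECT plays the role of yourExistingQuery;
# "x" is the required name for the derived table
grand_total = conn.execute("""
    SELECT SUM(total_bytes)
    FROM (SELECT hostname, total_bytes FROM sessions) x
""").fetchone()[0]
print(grand_total)  # 400
```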

Detect loops in hierarchy using SQL Server query [duplicate]

I have parent-child data in Excel which gets loaded into a 3rd-party system running MS SQL Server. The data represents a directed (hopefully) acyclic graph. 3rd party means I don't have a completely free hand in the schema. The Excel data is a concatenation of other files, and the possibility exists that in the cross-references between the various files someone has caused a loop - i.e. X is a child of Y (X->Y), then elsewhere (Y->A->B->X). I can write VB, VBA, etc. on the Excel side or on the SQL Server db. The Excel file is almost 30k rows, so I'm worried about a combinatorial explosion as the data grows. Some of the techniques, like creating a table with all the paths, might therefore be pretty unwieldy. I'm thinking of simply writing a program that, for each root, does a tree traversal to each leaf and flags it if the depth exceeds some nominal value.
Better suggestions or pointers to previous discussion welcomed.
You can use a recursive CTE to detect loops:
with prev as (
select RowId, 1 AS GenerationsRemoved
from YourTable
union all
select YourTable.RowId, prev.GenerationsRemoved + 1
from prev
inner join YourTable on prev.RowId = YourTable.ParentRowId
and prev.GenerationsRemoved < 55
)
select *
from prev
where GenerationsRemoved > 50
This does require you to specify a maximum recursion level: in this case the CTE recurses down to 55 levels, and the final select flags as erroneous any rows that sit more than 50 generations deep.
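The same depth-capped recursive CTE can be tried outside SQL Server. Here's a minimal sketch using Python's sqlite3 (which supports WITH RECURSIVE) against a toy parent/child table with a deliberate 1 -> 2 -> 3 -> 1 loop; only rows on a cycle can still be reached at a depth beyond any legitimate hierarchy depth:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE edges (RowId INTEGER, ParentRowId INTEGER);
-- deliberate loop: 1 -> 2 -> 3 -> 1; row 10 is a loop-free root
INSERT INTO edges VALUES (2, 1), (3, 2), (1, 3), (10, NULL);
""")

# walk the hierarchy with a depth cap of 10 and flag rows reached
# deeper than 5 generations -- only rows on a cycle can get there
rows = conn.execute("""
    WITH RECURSIVE prev(RowId, GenerationsRemoved) AS (
        SELECT RowId, 1 FROM edges
        UNION ALL
        SELECT e.RowId, p.GenerationsRemoved + 1
        FROM prev p
        JOIN edges e ON p.RowId = e.ParentRowId
        WHERE p.GenerationsRemoved < 10
    )
    SELECT DISTINCT RowId FROM prev
    WHERE GenerationsRemoved > 5
    ORDER BY RowId
""").fetchall()
print(rows)  # the three rows on the 1 -> 2 -> 3 -> 1 cycle
```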