How to forward demo data dates using a stored procedure?

I am looking for a clean way to shift some demo data forward in time using a stored procedure. The data I want to shift are date values. Due to the nature of my app, some of the data will only appear when certain dates in it are in the future. I hope this makes sense.
Since my database is ever expanding, I was thinking of writing a stored procedure that moves forward all dates in all tables that belong to a demo user account. I will also keep track of the date the demo data was last forwarded. The stored procedure would run when a demo user logs in, and only when the difference between the date the demo data was last forwarded and the current date has reached a certain threshold (e.g. 30 days). This way I do not have to keep altering the script as much.
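As a rough sketch of the "only forward when enough days have passed" check described above: the table variable @ForwardLog, its ForwardedOn column, the sample date and the 30-day threshold are all illustrative assumptions, not part of the real schema.

```sql
-- Hypothetical log of when the demo data was last shifted; a real system
-- would presumably keep this in a permanent table keyed by demo account.
DECLARE @ForwardLog TABLE (ForwardedOn date NOT NULL);
INSERT INTO @ForwardLog (ForwardedOn) VALUES ('2000-01-01');  -- sample "last run"

DECLARE @ThresholdDays int;
DECLARE @LastForwarded date;
DECLARE @DaysBehind int;

SET @ThresholdDays = 30;
SELECT @LastForwarded = MAX(ForwardedOn) FROM @ForwardLog;
SET @DaysBehind = DATEDIFF(DAY, @LastForwarded, GETDATE());

IF @DaysBehind >= @ThresholdDays
BEGIN
    -- The per-table date updates would go here, each adding @DaysBehind days.
    PRINT 'Demo data is ' + CAST(@DaysBehind AS varchar(10)) + ' days behind; forwarding.';
    INSERT INTO @ForwardLog (ForwardedOn) VALUES (CAST(GETDATE() AS date));
END
ELSE
    PRINT 'Demo data is recent enough; nothing to do.';
```

Adding exactly the number of days elapsed since the last run keeps the demo dates in the same position relative to "today" each time the check fires.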
Now to the technical part:
I am using this to retrieve all the tables in the db:
Select
table_name
from
Information_Schema.Tables
Where
TABLE_TYPE like 'BASE TABLE'
and table_name not like 'Report_%'
and table_name not in ('Accounts', 'Manifest', 'System', 'Users')
What I need is a way to iterate through the table names, find the column names and column types. Then I wish to update all columns in each table that is of type datetime. I have read looping in SQL is not ideal, but I would like to minimise the number of database calls rather than putting this on the serverside code.
Am I going down the wrong path to solve this issue?
Thanks in advance.

I agree with the comment that it might not be a good idea to do this automatically and in a hidden manner, but if you want to you can use this.
(Note this assumes SQL Server)
select T.Name, C.Name
from sys.tables T
join sys.columns C
on T.object_id = C.object_id
and C.system_type_id = 61 -- I would do a little research to make sure 61 is all you need to return here
This will get you a list of all datetime columns, along with the table it is in by name.
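If the magic number is a concern, one possible variant (a sketch, not the only way) is to join to sys.types and filter by type name. The list of type names below is an assumption about which date-like types matter for the demo data.

```sql
-- List date-like columns by type name instead of a hard-coded system_type_id.
SELECT s.name  AS SchemaName,
       t.name  AS TableName,
       c.name  AS ColumnName,
       ty.name AS TypeName
FROM sys.tables  AS t
JOIN sys.schemas AS s  ON s.schema_id = t.schema_id
JOIN sys.columns AS c  ON c.object_id = t.object_id
-- Joining on system_type_id resolves alias types to their base type name.
JOIN sys.types   AS ty ON ty.user_type_id = c.system_type_id
WHERE ty.name IN (N'date', N'datetime', N'datetime2', N'smalldatetime', N'datetimeoffset')
ORDER BY s.name, t.name, c.column_id;
```

Returning the schema name as well makes it possible to build two-part names later, which matters as soon as a table lives outside dbo.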
Then the way I would accomplish it is to have a cursor which builds the update strings on the fly, and exec them kinda like:
DECLARE @UpdateString varchar(500)
DECLARE @DaysToAdd int
DECLARE @TableName VARCHAR(100)
DECLARE @ColumnName VARCHAR(100)
set @DaysToAdd = 10
DECLARE db_cursor CURSOR FOR
select T.Name, C.Name
from sys.tables T
join sys.columns C
on T.object_id = C.object_id
and C.system_type_id = 61
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @TableName, @ColumnName
WHILE @@FETCH_STATUS = 0
BEGIN
set @UpdateString = 'Update ' + @TableName + ' set ' + @ColumnName + ' = dateadd(dd, ' + cast(@DaysToAdd as varchar) + ', ' + @ColumnName + ') where ...'
exec(@UpdateString)
FETCH NEXT FROM db_cursor INTO @TableName, @ColumnName
END
CLOSE db_cursor
DEALLOCATE db_cursor
There are many things I don't like about this: the cursor, the fact that it's behind the scenes, and the exec call. I'm also unsure how you will "update only the test data", since it will be very hard to write the where clause for a generic table in your database. But I think that will get you started.
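To make the where-clause problem concrete, here is a minimal sketch of one generated statement. It assumes every demo table carries an AccountId column identifying the owning account; that column, the dbo.Orders table, the DueDate column and the sample values are all hypothetical. Object names go through QUOTENAME and the values are passed as parameters to sp_executesql rather than concatenated.

```sql
-- Hypothetical names; in the cursor loop these would come from the FETCH.
DECLARE @SchemaName sysname, @TableName sysname, @ColumnName sysname;
DECLARE @DaysToAdd int, @DemoAccountId int;
DECLARE @Sql nvarchar(max);

SET @SchemaName = N'dbo';
SET @TableName = N'Orders';
SET @ColumnName = N'DueDate';
SET @DaysToAdd = 30;
SET @DemoAccountId = 1;

SET @Sql = N'UPDATE ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName)
         + N' SET ' + QUOTENAME(@ColumnName)
         + N' = DATEADD(DAY, @Days, ' + QUOTENAME(@ColumnName) + N')'
         + N' WHERE AccountId = @AccountId;';  -- assumed filter column

PRINT @Sql;  -- inspect the statement before running it
-- EXEC sys.sp_executesql @Sql,
--      N'@Days int, @AccountId int',
--      @Days = @DaysToAdd, @AccountId = @DemoAccountId;
```

If some tables have no such column, the generator would need a per-table rule for finding demo rows, which is exactly the difficulty raised above.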
On the side maybe you should think about having some test data population script which you can run to insert new data which satisfies your date requirements.

Related

How to pick a table_name value from one table and delete records from the table_name table based on a condition?

We have a table. Lets call it Table_A.
Table_A holds bunch of table_names and numeric value associated to each table_name. Refer to the picture below
Can someone help me write a query to:
Select table_names from TABLE_A one by one; go to that table, Check the Date_inserted of each record against NO_OF_DAYS in Table_A and if the record is older than NO_OF_DAYS in Table_A, then DELETE THAT RECORD from that specific table.
I'm guessing we have to create dynamic values for this query but I'm having a hard time.
So, in the above picture, the query should:
Select the first table_name (T_Table1) from Table_A
Go to that Table (T_Table1)
Check the date inserted of each record in (T_Table1) against the condition
If the condition (IF record was inserted prior to NO_OF_DAYS, which is 90 in this case THEN delete the record; ELSE move to next
record)
Move on to the next table (T_Table2) in Table_A
Continue till all the table_names in Table_A have been executed
What you posted as your attempt (in a comment), quite simply isn't going to work. Let's actually format that first, shall we:
SET SQL = '
DELETE [' + dbo + '].[' + TABLE_NAME + ']
where [Date_inserted ] < '
SET SQL = SQL + ' convert(varchar, DATEADD(day, ' + CONVERT(VARCHAR, NO_OF_DAYS) + ',' + '''' + CONVERT(VARCHAR, GETDATE(), 102) + '''' + '))'
PRINT SQL
EXEC (SQL)
Firstly, I actually have no idea what you're even trying to do here. You have things like [' + dbo + '], which means that you're referencing the column dbo; as you're using a SET, no column dbo can exist. Also, variables are prefixed with an @ in SQL Server; you have none.
Anyway, the solution. Some might not like this one, as I'm using a CURSOR, rather than doing it all in one go. I, however, do have my reasons. A CURSOR isn't actually a "bad" thing, like many believe; the problem is that people constantly use them incorrectly. Using a CURSOR to loop through records and create a hierarchy for example is a terrible idea; there are far better dataset approaches.
So, what are my reasons? Firstly, I can parametrise the dynamic SQL; this would be harder outside a CURSOR, as I'd need to declare a different parameter for every DELETE. Also, with a CURSOR, if the DELETE fails on one table, it won't on the others; with one long piece of dynamic SQL, if one of the transactions fails, they would all be rolled back. Also, depending on the size of the deletes, that could be a very big DELETE.
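One way to make the "a failure on one table doesn't stop the others" behaviour explicit is to wrap each execution in TRY...CATCH. The following is only a sketch; the table name is deliberately one that is assumed not to exist, to show the failure path.

```sql
DECLARE @SQL nvarchar(max);
-- Assumed not to exist, so the statement fails when it is executed.
SET @SQL = N'DELETE FROM dbo.TableThatDoesNotExist WHERE 1 = 0;';

BEGIN TRY
    EXEC sys.sp_executesql @SQL;
END TRY
BEGIN CATCH
    -- Inside a cursor loop, execution would carry on with the next FETCH.
    PRINT N'Skipped statement: ' + ERROR_MESSAGE();
END CATCH;
```

Because the failing statement runs at a lower execution level (inside sp_executesql), the name-resolution error is caught by the outer CATCH block instead of aborting the batch.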
It's important, however, that you understand what I've done here; if you don't, that's a problem unto itself. What happens if you need to troubleshoot it in the future? SO isn't a website for support like that; you need to support your own code. If you can't understand the code you're given, don't use it, or learn what it's doing first (otherwise you're doing the wrong thing).
Note I use my own objects, in the absence of consumable sample data:
CREATE TABLE TableOfTables (TableName sysname,
NoOfDays int);
GO
INSERT INTO TableOfTables
VALUES ('T1',10),
('T2',15),
('T3',5);
GO
DECLARE Deletes CURSOR FOR
SELECT TableName, NoOfDays
FROM TableOfTables;
DECLARE @SQL nvarchar(MAX), @TableName sysname, @Days int;
OPEN Deletes;
FETCH NEXT FROM Deletes
INTO @TableName, @Days;
WHILE @@FETCH_STATUS = 0 BEGIN
SET @SQL = N'DELETE FROM ' + QUOTENAME(@TableName) + NCHAR(10) +
N'WHERE DATEDIFF(DAY, InsertedDate, GETDATE()) >= @dDays;'
PRINT @SQL; --Say hello to your best friend. o/
--EXEC sp_executeSQL @SQL, N'@dDays int', @dDays = @Days; --Uncomment to run
FETCH NEXT FROM Deletes
INTO @TableName, @Days;
END
END
CLOSE Deletes;
DEALLOCATE Deletes;
GO
DROP TABLE TableOfTables;
GO
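A possible variant of the DELETE predicate, shown here against a throwaway temp table (the table #T1 and its sample rows are made up for illustration): comparing the column directly to a computed cutoff leaves the column unwrapped, so an index on InsertedDate could be used. Note the semantics differ slightly from DATEDIFF(DAY, ...), which counts calendar-day boundaries rather than elapsed time.

```sql
CREATE TABLE #T1 (Id int IDENTITY(1, 1) PRIMARY KEY,
                  InsertedDate datetime NOT NULL);

INSERT INTO #T1 (InsertedDate)
VALUES (DATEADD(DAY, -20, GETDATE())),  -- older than the cutoff
       (DATEADD(DAY, -3, GETDATE()));   -- newer than the cutoff

DECLARE @dDays int;
SET @dDays = 10;

-- Rows inserted more than @dDays days ago are removed.
DELETE FROM #T1
WHERE InsertedDate < DATEADD(DAY, -@dDays, GETDATE());

SELECT Id, InsertedDate FROM #T1;  -- only the 3-day-old row should remain

DROP TABLE #T1;
```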

Is there a way add auto increment in all tables from a specific database at once?

I am trying to add auto increment in all existing tables in a specific database, and I can do that going through the table design, flagging the identity option, but in this case I have to do it table per table, and there is a lot of tables. Is there a way to do that automatically?
Copied from my comments per request:
I don't believe you're going to find an automated option for doing this for multiple tables. The change script that SSMS creates when you do this in table designer is already doing a ton of work you'd have to recreate for any other solution. Frankly, I wouldn't trust myself to do it as correctly as SSMS.
However, if it were a large enough number of tables, I would create a completely new database with the corrected schema. Ensure that everything in the new database is present and correct. Then, for each table in the new db in turn, set IDENTITY_INSERT on, copy the data over, and set it off again (only one table per session can have IDENTITY_INSERT on at a time), and then move the new db to the old db with DETACH/ATTACH or BACKUP/RESTORE. In other words, I'd literally rebuild the database from the ground up because the old schema had been completely trashed. It would take a lot for me to decide to do that in a production system, however.
I'd only do the DETACH/ATTACH or BACKUP/RESTORE if I absolutely needed to change the database file names or database names. I'd actually prefer to just use the new database as a new database for the application. That would also mean I could swap back to the old database pretty quickly if I ran into trouble.
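For a single table, the copy step described above might look roughly like this. It is a sketch using temp tables with made-up names (#OldCustomer, #NewCustomer) and sample rows; in practice the source and target would be the same table in the old and new databases.

```sql
CREATE TABLE #OldCustomer (Id int NOT NULL, Name nvarchar(50) NOT NULL);
CREATE TABLE #NewCustomer (Id int IDENTITY(1, 1) PRIMARY KEY,
                           Name nvarchar(50) NOT NULL);

INSERT INTO #OldCustomer (Id, Name) VALUES (7, N'Alpha'), (12, N'Beta');

-- Allow explicit values in the identity column while copying.
SET IDENTITY_INSERT #NewCustomer ON;

INSERT INTO #NewCustomer (Id, Name)   -- an explicit column list is required here
SELECT Id, Name FROM #OldCustomer;

SET IDENTITY_INSERT #NewCustomer OFF;

-- New rows now get generated values that continue after the highest copied Id.
INSERT INTO #NewCustomer (Name) VALUES (N'Gamma');

SELECT Id, Name FROM #NewCustomer ORDER BY Id;

DROP TABLE #OldCustomer;
DROP TABLE #NewCustomer;
```

The existing key values are preserved, so foreign keys in other copied tables keep pointing at the right rows.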
It can be done using a cursor, but the auto-increment column will have the same name (ID here) in every table.
Declare @Table nvarchar(50), @script nvarchar(100)
DECLARE cur CURSOR FORWARD_ONLY READ_ONLY LOCAL FOR
SELECT TABLE_SCHEMA + '.' + TABLE_NAME as 'Table' FROM INFORMATION_SCHEMA.TABLES where TABLE_NAME not in ('sysdiagrams') -- You can exclude any table from this process by adding it to the where clause
OPEN cur
FETCH NEXT FROM cur INTO @Table
WHILE @@FETCH_STATUS = 0 BEGIN
-- The SQL command to alter a table and add an identity column to it; you can replace ID with any column name
set @script = 'Alter Table '+ @Table +' ADD ID INT IDENTITY'
EXEC sp_executesql @script
FETCH NEXT FROM cur INTO @Table
END
CLOSE cur
DEALLOCATE cur
Edit 1 : According to what you asked for in the comment
Declare @Table nvarchar(50), @script nvarchar(100), @primarykey_name nvarchar(20)
DECLARE cur CURSOR FORWARD_ONLY READ_ONLY LOCAL FOR
SELECT TABLE_SCHEMA + '.' + TABLE_NAME as 'Table' FROM INFORMATION_SCHEMA.TABLES where TABLE_NAME not in ('sysdiagrams') -- You can exclude any table from this process by adding it here
OPEN cur
FETCH NEXT FROM cur INTO @Table
WHILE @@FETCH_STATUS = 0 BEGIN
-- Find the primary key for the current table and assign it to @primarykey_name
Set @primarykey_name = (SELECT c.NAME FROM sys.key_constraints kc INNER JOIN sys.index_columns ic ON kc.parent_object_id = ic.object_id and kc.unique_index_id = ic.index_id
INNER JOIN sys.columns c ON ic.object_id = c.object_id AND ic.column_id = c.column_id
WHERE kc.name='PK_'+ substring(@Table, 5, LEN(@Table)-4) and kc.type = 'PK')
-- The SQL command to alter a table and add identity to the primary key of each table
set @script = 'Alter Table '+ @Table +' ADD ' + @primarykey_name + ' INT IDENTITY'
print @script
--EXEC sp_executesql @script
FETCH NEXT FROM cur INTO @Table
END
CLOSE cur
DEALLOCATE cur

SQL Server -- updating the `sys.*` tables and not just reading them

When attempting to run the query
UPDATE sys.columns
SET user_type_id = 106
WHERE object_id in (select object_id from sys.objects where type = 'U') and user_type_id = 108
I'm getting the error:
Msg 259, Level 16, State 1, Line 1
Ad hoc updates to system catalogs are not allowed.
Is there a way to get around this? In this case, I'm looking to change the types of all decimal fields of all the tables in the database.
Can do this "externally"-- without direct tampering with sys.* tables (haven't yet pinned down how-to though), but I'm looking to know whether I can update the sys.* tables -- and if so, which ones, when/how?
// =========================
EDIT:
would i be able to get any "deeper" than alter table... if i had full privileges for db access?
not sure what kind of privileges i have now, but would look into it.
These tables are informational only. I want to make this clear: the sys.* and INFORMATION_SCHEMA.* views exist to provide schema information from the database engine in a useful format. They do not represent the actual schema of the database*, and modifying them is thus impossible. The only way to change your schema is to use DDL (Data Definition Language) statements, such as ALTER TABLE.
In your case, you can use a cursor to iterate through all columns with the wrong type, generate SQL statements to correct that, and execute them dynamically. Here's a skeleton of how that would look:
DECLARE column_cursor CURSOR FOR
SELECT schemas.name AS schema_name,
objects.name AS table_name,
columns.name AS column_name
FROM sys.columns
JOIN sys.objects
ON objects.object_id = columns.object_id
JOIN sys.schemas
ON schemas.schema_id = objects.schema_id
WHERE objects.type = 'U'
AND columns.user_type_id = 108
DECLARE @schema_name VARCHAR(255)
DECLARE @table_name VARCHAR(255)
DECLARE @column_name VARCHAR(255)
OPEN column_cursor
FETCH NEXT FROM column_cursor INTO @schema_name, @table_name, @column_name
WHILE @@FETCH_STATUS = 0
BEGIN
DECLARE @sql VARCHAR(MAX)
-- TODO: modify to change to the actual type, scale and precision you want; also you may need to adjust for NOT NULL constraints, default constraints and foreign keys (all exercises for the reader)
-- Note: T-SQL uses ALTER COLUMN here (CHANGE COLUMN is not valid in SQL Server)
SET @sql = 'ALTER TABLE ' + QUOTENAME(@schema_name) + '.' + QUOTENAME(@table_name) + ' ALTER COLUMN ' + QUOTENAME(@column_name) + ' DECIMAL(12, 2)'
EXEC(@sql)
FETCH NEXT FROM column_cursor INTO @schema_name, @table_name, @column_name
END
CLOSE column_cursor
DEALLOCATE column_cursor
Because of the potential increase in complexity for dealing with constraints and keys, I'd recommend either updating the columns manually, building the ALTER TABLE statements manually, dumping your schema to script, updating that and recreating the tables and objects, or looking for a 3rd party tool that does this kind of thing (I don't know of any).
*For the sys.* views, at least, it's possible that they closely represent the underlying data structures, though I think there's still some abstraction. INFORMATION_SCHEMA is ANSI-defined, so it is unlikely to match the internal structures of any database system out there.
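For a single column, the DDL route looks roughly like the sketch below. The temp table #Price and its column are made up for illustration, and the target precision and scale are arbitrary.

```sql
CREATE TABLE #Price (Id int NOT NULL,
                     Amount numeric(10, 2) NOT NULL);

-- Change the column's type through DDL rather than touching the catalog.
-- Restating NOT NULL keeps the column's nullability explicit.
ALTER TABLE #Price ALTER COLUMN Amount decimal(12, 2) NOT NULL;

-- Read the result back from the (read-only) catalog views.
SELECT c.name AS ColumnName,
       t.name AS TypeName,
       c.precision,
       c.scale
FROM tempdb.sys.columns AS c
JOIN tempdb.sys.types   AS t ON t.user_type_id = c.user_type_id
WHERE c.object_id = OBJECT_ID(N'tempdb..#Price');

DROP TABLE #Price;
```

The catalog views reflect the change afterwards, which is the intended direction of flow: DDL changes the schema, and sys.* reports it.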

SSIS Multiple Unknown Column Updates

I wonder if anyone has come across a similar situation before that could point me in the right direction..? I'll add that it's a bit frustrating as someone has replaced the NULL value with a text string containing the word 'NULL' - which I need to remove.
I have 6 quite large tables, over 250+ columns and in excess of 1 million records in each and I need to update the columns where the word NULL appears in a row and replace it with a proper NULL value - the problem is that I have no idea in which column this appears.
As a start, I've got some code that will list every column with a count of the values and anything that looks to have a lower count than expected, I'll run a SQL query to ascertain if the column contains the string 'NULL' and using the following code, replace it with NULL.
declare @tablename sysname
declare @ColName nvarchar(500)
declare @sql nvarchar(1000)
declare @sqlUpdate nvarchar(1000)
declare @ParmDefinition nvarchar(1000)
set @tablename = N'Table_Name'
Set @ColName = N'Column_Name'
set @ParmDefinition = N'@ColName nvarchar OUTPUT';
set @sql= 'Select ' + @ColName + ', Count(' + @ColName + ') from ' + @tablename + ' group by ' + @ColName + ''
Set @sqlUpdate = 'Update ' + @tablename + ' SET ' + @ColName + ' = NULL WHERE '+ @ColName + ' = ''NULL'''
print @sql
print @sqlUpdate
EXECUTE sp_executesql @sql, @ParmDefinition, @ColName=@ColName OUTPUT;
EXECUTE sp_executesql @sqlUpdate, @ParmDefinition, @ColName=@ColName OUTPUT;
What I'm trying to do with SSIS is to iterate through each column,
Select Column_Name from Table_Name where Column_Name = 'NULL'
run the appropriate query, and perform the update.
So far I can extract the column names from INFORMATION_SCHEMA and get a record count from the appropriate table, but when it comes to running the actual UPDATE statement (as above, sqlUpdate) - there doesn't seem to be a component that's happy with the dynamic phrasing of the query.
I'm using a Conditional Split to determine where to go if there are records (which may be incorrect) and I've tried OLE DB Command for the update.
In short, I'm wondering whether SSIS is the best tool for this job or whether I'm looking in the wrong place!
I'm using SSIS 2005, which may well have limitations that I'm not yet aware of!
Any guidance would be appreciated.
Thanks,
Jon
The principle is basically sound, but I would leave SSIS out, and do it with SSMS directly against the SQL Server and build the looping logic there, probably with a cursor.
I'm not sure whether you need to check the count of potential values first - you might just as well apply the update and accept that sometimes it will update no rows - the filtering will then not be duplicated.
Something like
declare columns cursor local read_only for
select
c.TABLE_CATALOG,
c.TABLE_SCHEMA,
c.TABLE_NAME,
c.COLUMN_NAME
from INFORMATION_SCHEMA.COLUMNS c
inner join INFORMATION_SCHEMA.TABLES t
on c.TABLE_CATALOG = t.TABLE_CATALOG
and c.TABLE_SCHEMA = t.TABLE_SCHEMA
and c.TABLE_NAME = t.TABLE_NAME
where c.DATA_TYPE like '%varchar%'
open columns
declare @catalog varchar(100), @schema varchar(100), @table varchar(100), @column varchar(100)
fetch from columns into @catalog, @schema, @table, @column
while @@FETCH_STATUS= 0
begin
-- construct update here and execute it.
select @catalog, @schema, @table, @column
fetch next from columns into @catalog, @schema, @table, @column
end
close columns
deallocate columns
You might also consider applying all the updates to the table in one hit, removing the filter and using nullif dependent on the density of the bad data.
eg:
update table
set
col1 = nullif(col1, 'null'),
col2 = nullif(col2, 'null'),
...
SSIS won't be the best option for you. Conceptually, you are performing updates, lots of updates. SSIS can do really fast inserts. Updates are fired off on a row-by-agonizing-row basis.
In a SQL based approach, you'd be firing off 1000 update statements to fix everything. In an SSIS based scenario, using a data flow with OLE DB Command, you're looking at 1000 * 1000000.
I would skip the cursor myself. It is an acceptable time to use a cursor but if your tables are as littered with 'NULL' as it sounds, just assume you're updating every row and fix all the fields in a given record instead of coming back to the same row for each thing needing fixed.
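The "fix every field in one pass" idea could be generated rather than typed out by hand for 250+ columns. The following is a sketch under assumptions: the schema and table names are placeholders, only nullable character columns are included (NULLIF would fail on a NOT NULL column that contains 'NULL'), and FOR XML PATH is used for concatenation so that it also works on older versions.

```sql
DECLARE @SchemaName sysname, @TableName sysname;
DECLARE @SetList nvarchar(max), @Sql nvarchar(max);

SET @SchemaName = N'dbo';         -- placeholder
SET @TableName  = N'Table_Name';  -- placeholder

-- Build "col = NULLIF(col, 'NULL')" for every nullable character column.
SELECT @SetList = STUFF((
    SELECT N',' + NCHAR(10) + N'    ' + QUOTENAME(c.COLUMN_NAME)
         + N' = NULLIF(' + QUOTENAME(c.COLUMN_NAME) + N', ''NULL'')'
    FROM INFORMATION_SCHEMA.COLUMNS AS c
    WHERE c.TABLE_SCHEMA = @SchemaName
      AND c.TABLE_NAME   = @TableName
      AND c.DATA_TYPE IN (N'varchar', N'nvarchar', N'char', N'nchar')
      AND c.IS_NULLABLE  = 'YES'
    ORDER BY c.ORDINAL_POSITION
    FOR XML PATH(''), TYPE).value('.', 'nvarchar(max)'), 1, 1, N'');

IF @SetList IS NULL
    PRINT N'No matching columns found.';
ELSE
BEGIN
    SET @Sql = N'UPDATE ' + QUOTENAME(@SchemaName) + N'.' + QUOTENAME(@TableName)
             + N' SET' + @SetList + N';';
    PRINT @Sql;  -- PRINT truncates long strings; SELECT @Sql to see all of it
    -- EXEC sys.sp_executesql @Sql;
END
```

This produces one UPDATE per table, so each row is written once instead of once per affected column.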

Running the same SQL code against a number of tables sequentially

I have a number of tables (around 40) containing snapshot data about 40 million plus vehicles. Each snapshot table is at a specific point in time (the end of the quarter) and is identical in terms of structure.
Whilst most of our analysis is against single snapshots, on occasion we need to run some analysis against all the snapshots at once. For instance, we may need to build a new table containing all the Ford Focus cars from every single snapshot.
To achieve this we currently have two options:
a) write a long, long, long batch file repeating the same code over and over again, just changing the FROM clause
[drawbacks - it takes a long time to write and changing a single line of code in one of blocks requires fiddly changes in all the other blocks]
b) use a view to union all the tables together and query that instead
[drawbacks - our tables are stored in separate database instances and cannot be indexed, plus the resulting view is something like 600 million records long by 125 columns wide, so is incredibly slow]
So, what I would like to find out is whether I can either use dynamic sql or put the SQL into a loop to spool through all tables. This would be something like:
for each *table* in TableList
INSERT INTO output_table
SELECT *table* as OriginTableName, Make, Model
FROM *table*
next *table* in TableList
Is this possible? This would mean that updating the original SQL when our client changes what they need (a very regular occurrence!) would be very simple and we would benefit from all the indexes we already have on the original tables.
Any pointers, suggestions or help will be much appreciated.
If you can identify your tables (e.g. a naming pattern), you could simply say:
DECLARE @sql NVARCHAR(MAX);
SELECT @sql = N'';
SELECT @sql = @sql + 'INSERT output_table SELECT ''' + name + ''', Make, Model
FROM dbo.' + QUOTENAME(name) + ';'
FROM sys.tables
WHERE name LIKE 'pattern%';
-- or WHERE name IN ('t1', 't2', ... , 't40');
EXEC sp_executesql @sql;
This assumes they're all in the dbo schema. If they're not, the adjustment is easy... just replace dbo with ' + QUOTENAME(SCHEMA_NAME([schema_id])) + '...
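Put together, the schema-aware version might look like the sketch below. The output_table column list (OriginTableName, Make, Model) and the name pattern are assumptions carried over from the question, and the statement is only printed, not executed.

```sql
DECLARE @sql nvarchar(max);
SET @sql = N'';

SELECT @sql = @sql
    + N'INSERT INTO dbo.output_table (OriginTableName, Make, Model) SELECT '
    -- The origin name is embedded as a string literal, with quotes doubled.
    + N'''' + REPLACE(SCHEMA_NAME(t.schema_id) + N'.' + t.name, N'''', N'''''') + N''''
    + N', Make, Model FROM '
    + QUOTENAME(SCHEMA_NAME(t.schema_id)) + N'.' + QUOTENAME(t.name)
    + N';' + NCHAR(10)
FROM sys.tables AS t
WHERE t.name LIKE N'pattern%';

PRINT @sql;  -- review the generated batch first
-- EXEC sys.sp_executesql @sql;
```

Each snapshot table contributes one INSERT ... SELECT line, so changing the query means editing a single template.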
In the end I used two methods:
Someone on another forum suggested making use of sp_msforeachtable and a table which contains all the table names. Their suggestion was:
create table dbo.OutputTable (OriginTableName nvarchar(500), RecordCount INT)
create table dbo.TableList (Name nvarchar (500))
insert dbo.TableList
select '[dbo].[swap]'
union select '[dbo].[products]'
union select '[dbo].[structures]'
union select '[dbo].[stagingdata]'
exec sp_msforeachtable @command1 = 'INSERT INTO dbo.OutputTable SELECT ''?'', COUNT(*) from ?'
,@whereand = 'and syso.object_id in (select object_id(Name) from dbo.TableList)'
select * from dbo.OutputTable
This works perfectly well for some queries, but seems to suffer from the fact that one cannot use a GROUP BY clause within the query (or, at least, I could not find a way to do this).
The final solution I used was to use Dynamic SQL with a lookup table containing the table names. In a very simple form, this looks like:
DECLARE @TableName varchar(500)
DECLARE @curTable CURSOR
DECLARE @sql NVARCHAR(1000)
SET @curTable = CURSOR FOR
SELECT [Name] FROM Vehicles_LookupTables.dbo.AllStockTableList
OPEN @curTable
FETCH NEXT
FROM @curTable INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
SET @sql = 'SELECT ''' +@TableName + ''', Make, sum(1) as Total FROM ' + @TableName + ' GROUP BY Make'
EXEC sp_executesql @sql
FETCH NEXT
FROM @curTable INTO @TableName
END
CLOSE @curTable
DEALLOCATE @curTable