A better way to get table statistics - SQL

I'm developing a solution on SQL Server 2012 Express and Developer edition (with the latest Service Pack).
In my database I have a table, CODES. It has a FLAG column indicating whether a code has been printed, read or dropped, and codes are grouped by another column, LEVEL. The CODES table has (CODE, LEVEL) as its primary key.
I'm updating the CODES table very quickly, and if I run SELECT COUNT(code) FROM CODES WHERE FLAG = 1 to get all codes read, I sometimes block that table, and when we have many, many rows the SELECT COUNT drives the CPU to 100%.
So I have another table, STADISTICS, to store how many codes have been printed, read or dropped. When I update a row in the CODES table, I add 1 in the STADISTICS table. I have tried these two ways:
With an UPDATE statement after updating the CODES table:
declare @printed bigint;
set @printed = (select CODES_PRINTED from STADISTICS where LEVEL = @level)
if (@printed is null)
begin
    insert dbo.STADISTICS(LEVEL, CODES_PRINTED) values (@level, 1)
end
else
begin
    update dbo.STADISTICS set CODES_PRINTED = (@printed + 1) where LEVEL = @level;
end
With a TRIGGER on the CODES table:
ALTER trigger [dbo].[UpdateCodesStatistics] on [dbo].[CODES]
after update
as
SET NOCOUNT ON;
if UPDATE(FLAG)
BEGIN
    declare @flag as tinyint;
    declare @level as tinyint;
    set @flag = (SELECT FLAG FROM inserted);
    set @level = (SELECT LEVEL FROM inserted);
    -- If we have printed a new code
    if (@flag = 1)
    begin
        declare @printed bigint;
        set @printed = (select CODES_PRINTED from STADISTICS where LEVEL = @level)
        if (@printed is null)
        begin
            insert dbo.STADISTICS(LEVEL, CODES_PRINTED) values (@level, 1)
        end
        else
        begin
            update dbo.STADISTICS set CODES_PRINTED = (@printed + 1) where LEVEL = @level;
        end
    end
END
But in both cases I lost data. After running my program I check the CODES and STADISTICS tables, and the statistics don't match: STADISTICS shows fewer printed and read codes than the CODES table does.
This is the statistics table that I'm using now:
CREATE TABLE [dbo].[BATCH_STATISTICS](
[CODE_LEVEL] [tinyint] NOT NULL,
[CODES_REQUESTED] [bigint] NOT NULL CONSTRAINT [DF_BATCH_STATISTICS_CODES_REQUESTED] DEFAULT ((0)),
[CODES_PRINTED] [bigint] NOT NULL CONSTRAINT [DF_BATCH_STATISTICS_CODES_PRINTED] DEFAULT ((0)),
[CODES_READ] [bigint] NOT NULL CONSTRAINT [DF_BATCH_STATISTICS_CODES_READ] DEFAULT ((0)),
[CODES_DROPPED] [bigint] NOT NULL CONSTRAINT [DF_BATCH_STATISTICS_CODES_DROPPED] DEFAULT ((0)),
[CODES_NOREAD] [bigint] NOT NULL CONSTRAINT [DF_BATCH_STATISTICS_CODES_NOREAD] DEFAULT ((0)),
CONSTRAINT [PK_BATCH_STATISTICS] PRIMARY KEY CLUSTERED
(
[CODE_LEVEL] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
By the way, I'm updating and inserting very quickly (more than 1,200 rows per minute).
Any idea what's happening, or how I can do this better?

inserted and deleted can contain multiple (or no) rows, so idioms like set @flag = (SELECT FLAG FROM inserted) are fundamentally broken. From your description, it sounds like an indexed view could work for you instead, something like this:
CREATE VIEW dbo.[Statistics]
WITH SCHEMABINDING
AS
SELECT LEVEL, COUNT_BIG(*) as CODES_PRINTED
FROM dbo.Codes
WHERE Flag = 1
GROUP BY LEVEL
and:
CREATE UNIQUE CLUSTERED INDEX IX_Statistics ON dbo.[Statistics] (LEVEL)
And now SQL Server will (behind the scenes) maintain this data automatically, and you don't have to write any triggers or explicitly maintain a separate table.
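One caveat, since you mention Express edition: outside Enterprise edition the optimizer won't automatically rewrite queries against the base table to use the indexed view, so you would read the pre-aggregated counts from the view directly with the NOEXPAND hint, roughly like this:
-- Read the maintained count for one level; NOEXPAND forces the indexed
-- view's clustered index to be used (required on non-Enterprise editions)
SELECT CODES_PRINTED
FROM dbo.[Statistics] WITH (NOEXPAND)
WHERE LEVEL = 1;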

Data gets changed when copying data in chunks between two identical tables

In short, I am trying to copy data from one table to another, nearly identical, table (minus constraints, indices, and a precision change to a decimal column) in batches using Insert [NewTable] Select Top X * from [Table], but some data is getting changed during the copy. Read on for more details.
Why we are copying in the first place
We are altering the precision of a couple of columns in our largest table and do not have the time in our deployment window to do a simple alter statement. As an alternative, we decided to create a table with the new schema and copy the data in batches in the days leading up to the deploy, allowing us to simply drop the old table and rename the new one during the deployment window.
Creation scripts for new and old tables
These are not the exact tables we have in our DB, but they've been trimmed down for this question. The actual table has ~100 columns.
CREATE TABLE [dbo].[Table]
(
[Id] BIGINT NOT NULL PRIMARY KEY NONCLUSTERED IDENTITY,
[ForeignKey1] INT NOT NULL,
[ForeignKey2] INT NOT NULL,
[ForeignKey3] INT NOT NULL,
[Name] VARCHAR(MAX) NOT NULL,
[SomeValue] DECIMAL(14, 5) NULL,
CONSTRAINT [FK_Table_ForeignKeyTable1] FOREIGN KEY ([ForeignKey1]) REFERENCES [ForeignKeyTable1]([ForeignKey1]),
CONSTRAINT [FK_Table_ForeignKeyTable2] FOREIGN KEY ([ForeignKey2]) REFERENCES [ForeignKeyTable2]([ForeignKey2]),
CONSTRAINT [FK_Table_ForeignKeyTable3] FOREIGN KEY ([ForeignKey3]) REFERENCES [ForeignKeyTable3]([ForeignKey3]),
)
GO
CREATE INDEX [IX_Table_ForeignKey2] ON [dbo].[Table] ([ForeignKey2])
GO
CREATE TABLE [dbo].[NewTable]
(
[Id] BIGINT NOT NULL PRIMARY KEY NONCLUSTERED IDENTITY,
[ForeignKey1] INT NOT NULL,
[ForeignKey2] INT NOT NULL,
[ForeignKey3] INT NOT NULL,
[Name] VARCHAR(MAX) NOT NULL,
[SomeValue] DECIMAL(16, 5) NULL
)
SQL I wrote to copy data
DECLARE @BatchSize INT
DECLARE @Count INT

-- Leave these the same --
SET @Count = 1

-- Update these to modify run behavior --
SET @BatchSize = 5000

WHILE @Count > 0
BEGIN
    SET IDENTITY_INSERT [dbo].[NewTable] ON;

    INSERT INTO [dbo].[NewTable]
        ([Id],
         [ForeignKey1],
         [ForeignKey2],
         [ForeignKey3],
         [Name],
         [SomeValue])
    SELECT TOP (@BatchSize)
         [Id],
         [ForeignKey1],
         [ForeignKey2],
         [ForeignKey3],
         [Name],
         [SomeValue]
    FROM [dbo].[Table]
    WHERE NOT EXISTS (SELECT 1 FROM [dbo].[NewTable] WHERE [dbo].[NewTable].Id = [dbo].[Table].Id)
    ORDER BY Id

    SET @Count = @@ROWCOUNT

    SET IDENTITY_INSERT [dbo].[NewTable] OFF;
END
The Problem
Somehow data is getting garbled or modified in a seemingly random pattern during the copy. Most (maybe all) of the modified data we've seen has been in the ForeignKey2 column, and the value we end up with in the new table is seemingly random as well: it didn't exist at all in the old table. There doesn't seem to be any rhyme or reason to which records are affected, either.
For example, here is one row for the original table and the corresponding row in the new table:
Old Table
ID: 204663
FK1: 452
FK2: 522413
FK3: 11190
Name: Masked
Some Value: 0.0
New Table
ID: 204663
FK1: 452
FK2: 120848
FK3: 11190
Name: Masked but matches Old Table
Some Value: 0.0
Environment
The SQL was run in SSMS. The database is an Azure SQL Database.

How to create the trigger for the table in SQL Server 2008

I have the following table:
CREATE TABLE [RTS].[MFB]
(
[record_id] [int] IDENTITY(1,1) NOT NULL,
[marker_id] [nvarchar](50) NULL,
[lat] [numeric](38, 8) NULL,
[lng] [numeric](38, 8) NULL,
[address] [nvarchar](512) NULL,
[hash] [smallint] NULL,
[updated] [datetime] NULL,
[first_created_date] [datetime] NULL,
CONSTRAINT [PK_MFB_1]
PRIMARY KEY CLUSTERED ([record_id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
where the "record_id" is the primary key.
I need to create a trigger after the INSERT operation.
The conditions are:
If the marker_id column is new, INSERT the record to the table and set the hash column to 0;
If the marker_id already exists, UPDATE the existing record by setting the new updated column;
If both the marker_id already exists and any of the "lat", "lng" and "address" has been changed, UPDATE the existing record by setting the new "lat", "lng" and/or "address" and also setting "hash" to "1".
Basically, the MFB table should not have duplicated marker_id.
How can I achieve this by setting up a trigger? Thanks!
Rafal is right, but you can use a cursor to handle multi-row inserts and updates, though I can't promise anything about performance. It should look like this:
CREATE TRIGGER DBO.MFBTRG
ON DBO.MFB
INSTEAD OF INSERT,UPDATE
AS
BEGIN
    DECLARE @marker_id NVARCHAR(50)
    DECLARE @lat NUMERIC(38,8)
    DECLARE @lng NUMERIC(38,8)
    DECLARE @address NVARCHAR(512)
    DECLARE @hash SMALLINT
    DECLARE @updated DATETIME
    DECLARE @first_created_date DATETIME

    -- SET NOCOUNT ON added to prevent extra result sets from
    -- interfering with SELECT statements.
    SET NOCOUNT ON;

    DECLARE MFBINS CURSOR FAST_FORWARD FOR
        SELECT [marker_id],[lat],[lng],[address],[hash],[updated],[first_created_date] FROM INSERTED
    OPEN MFBINS
    FETCH NEXT FROM MFBINS INTO @marker_id,@lat,@lng,@address,@hash,@updated,@first_created_date
    WHILE (@@FETCH_STATUS = 0)
    BEGIN
        IF NOT EXISTS (SELECT [marker_id] FROM MFB WHERE [marker_id] = @marker_id)
        BEGIN
            INSERT INTO [dbo].[MFB] ([marker_id],[lat],[lng],[address],[hash],[updated],[first_created_date])
            VALUES (@marker_id,@lat,@lng,@address,@hash,@updated,@first_created_date)
        END
        ELSE
        BEGIN
            UPDATE MFB SET [updated] = @updated WHERE [marker_id] = @marker_id
        END
        FETCH NEXT FROM MFBINS INTO @marker_id,@lat,@lng,@address,@hash,@updated,@first_created_date
    END
    CLOSE MFBINS
    DEALLOCATE MFBINS
END
GO
and in an UPDATE trigger you can detect which column was updated with:
IF UPDATE(COLUMN_NAME)
BEGIN
    -- update logic here
END
If you really want to do it this way you would have to create an INSTEAD OF INSERT trigger, but beware: it is going to be slow, as you wouldn't be able to benefit from bulk insert.
Alternatively, you could use a MERGE statement and perform your INSERT/UPDATE scenario there.
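For what it's worth, a rough sketch of the MERGE route inside an INSTEAD OF INSERT trigger could look like this (untested; the trigger name is made up, and it assumes marker_ids are unique within each inserted batch):
CREATE TRIGGER RTS.MFB_Merge
ON RTS.MFB
INSTEAD OF INSERT
AS
BEGIN
    SET NOCOUNT ON;
    MERGE RTS.MFB AS target
    USING inserted AS src
        ON target.marker_id = src.marker_id
    WHEN MATCHED THEN
        UPDATE SET
            updated = src.updated,
            lat = src.lat,
            lng = src.lng,
            [address] = src.[address],
            -- set hash to 1 only if position or address actually changed
            -- (note: these <> tests ignore NULLs; extend them if needed)
            [hash] = CASE WHEN target.lat <> src.lat
                            OR target.lng <> src.lng
                            OR target.[address] <> src.[address]
                          THEN 1 ELSE target.[hash] END
    WHEN NOT MATCHED THEN
        -- brand new marker_id: insert it with hash = 0
        INSERT (marker_id, lat, lng, [address], [hash], updated, first_created_date)
        VALUES (src.marker_id, src.lat, src.lng, src.[address], 0, src.updated, src.first_created_date);
END
GO
Being set-based, it handles multi-row inserts without a cursor.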

Understanding creating simple stored procedures

I'm in the process of learning to create stored procedures in Microsoft SQL Server Management Studio. I need to create a stored procedure that adds a single new record to my table. Also, I need to create two extra output parameters along with the stored procedure (I chose @@ERROR and SCOPE_IDENTITY()).
This is the code I use to create my stored procedure:
use bieren
go
if exists
(select name from sysobjects
where name = 'spBierInsert' and xtype = 'p')
drop procedure spBierInsert
go
create procedure spBierInsert
    @Biernr int = 0,
    @Naam nvarchar(100) = '',
    @BrouwerNr int = 0,
    @SoortNr int = 0,
    @Alcohol real,
    @gelukt nvarchar(10) output,
    @id int output
as
begin
    declare @fout int

    insert into bieren
    values (@Biernr, @Naam, @BrouwerNr, @SoortNr, @Alcohol)

    set @fout = @@error
    print 'Foutnummer:' + cast(@fout as varchar(4))

    if @fout > 0
        set @gelukt = 'Neen: ' + cast(@fout as varchar(4))
    else
        set @gelukt = 'Ja'

    set @id = SCOPE_IDENTITY()
end
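For reference, a call along these lines (all values made up) exercises the procedure and its output parameters; passing a BrouwerNr that doesn't exist in dbo.Brouwers reproduces the error below:
declare @gelukt nvarchar(10), @id int
exec spBierInsert @Biernr = 1, @Naam = N'Testbier', @BrouwerNr = 999,
    @SoortNr = 1, @Alcohol = 5.2, @gelukt = @gelukt output, @id = @id output
select @gelukt as gelukt, @id as id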
I must be doing something wrong, because the result is the following:
Msg 547, Level 16, State 0, Procedure spBierInsert, Line 92
The INSERT statement conflicted with the FOREIGN KEY constraint
"FK_Bieren_Brouwers". The conflict occurred in database "Bieren", table
"dbo.Brouwers", column 'BrouwerNr'.
The statement has been terminated.
Foutnummer:547
(1 row(s) affected)
What have I done incorrectly?
EDIT 30/12/2015: I have updated this question with new information. I originally just used terms like "exampletable" because I had no idea that the search for the answer to my question would be more involved than a single answer, so I've gone ahead and changed the entire code above (as well as the text of the error), and I've added the script for my table underneath. The point of this question is that I come out with code that works, or that I at least understand what's wrong with it.
USE [Bieren]
GO
/****** Object: Table [dbo].[Bieren] Script Date: 30/12/2015 0:19:56 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Bieren](
[BierNr] [int] NOT NULL,
[Naam] [nvarchar](100) NULL,
[BrouwerNr] [int] NULL,
[SoortNr] [int] NULL,
[Alcohol] [real] NULL,
CONSTRAINT [PK_Bieren] PRIMARY KEY CLUSTERED
(
[BierNr] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Bieren] WITH CHECK ADD CONSTRAINT [FK_Bieren_Brouwers] FOREIGN KEY([BrouwerNr])
REFERENCES [dbo].[Brouwers] ([BrouwerNr])
GO
ALTER TABLE [dbo].[Bieren] CHECK CONSTRAINT [FK_Bieren_Brouwers]
GO
ALTER TABLE [dbo].[Bieren] WITH CHECK ADD CONSTRAINT [FK_Bieren_Soorten] FOREIGN KEY([SoortNr])
REFERENCES [dbo].[Soorten] ([SoortNr])
GO
ALTER TABLE [dbo].[Bieren] CHECK CONSTRAINT [FK_Bieren_Soorten]
GO
Your procedure is created fine. The problem is that you are inserting a value into column 'BrouwerNr' of table "dbo.Bieren" which doesn't exist in the 'BrouwerNr' column of table "dbo.Brouwers". There is a foreign key on the table, named "FK_Bieren_Brouwers", which is causing this restriction. I suggest you look into this article to learn more about foreign keys.
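A quick way to check (assuming the schema above) is to list the valid parent values before calling the procedure:
-- Valid brewer numbers; the @BrouwerNr passed to spBierInsert must be
-- one of these (or NULL, since the column is nullable)
select BrouwerNr from dbo.Brouwers order by BrouwerNr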
The error is because you are inserting 1600 in @ColumnNr, which is a foreign key to another table that does not contain 1600.
You can do the following:
Right click on the "exampleTable" table and select 'Script table as' -> 'CREATE To' -> 'New Query Editor Window'.
Now find "ColumnNr" in it. It will be something like this:
ALTER TABLE [dbo].[exampleTable] WITH CHECK ADD CONSTRAINT [FK_exampleTable_OtherTableName_ColumnNr] FOREIGN KEY([ColumnNr])
REFERENCES [dbo].[OtherTableName] ([ColumnNr])
GO
Now open the referenced table "OtherTableName" and look at the column "ColumnNr": it will not contain the value 1600.
Try to insert a value in
@ColumnNr = {any value from OtherTableName},
i.e. one which exists in table "OtherTableName".

Convert a primary key column from int to bigint, as the column is approaching 2 billion rows, in the shortest time

I have a table with 1.87 billion rows; as it's approaching the int limit, we are looking to convert the key to bigint.
We are trying to add a new column with the bigint datatype, remove the primary key constraint, copy the data over in batches, and later remove the original column and rename the new one. But it's taking a long time, 4 to 5 hours, in a test environment with only 60 million rows.
Table Definition
CREATE TABLE [dbo].[TableRecordsHistory](
[PrimaryKeyColumnID] [int] NOT NULL,
[RecordType] [char](1) NOT NULL,
[DisplayOrder] [int] NOT NULL,
[DisplayText] [varchar](50) NOT NULL,
[Hours] [decimal](8, 2) NULL,
[Amount] [decimal](10, 2) NULL,
[Created] [datetime] NOT NULL,
CONSTRAINT [PK_TableRecordsHistory] PRIMARY KEY CLUSTERED
(
[Created] ASC,
[PrimaryKeyColumnID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
SET ANSI_PADDING OFF
GO
ALTER TABLE [dbo].[TableRecordsHistory] WITH NOCHECK ADD CONSTRAINT [FK_TableRecordsHistory_TableRecords] FOREIGN KEY([Created], [PrimaryKeyColumnID])
REFERENCES [dbo].[TableRecords] ([Created], [PrimaryKeyColumnID])
GO
ALTER TABLE [dbo].[TableRecordsHistory] NOCHECK CONSTRAINT [FK_TableRecordsHistory_TableRecords]
GO
I am afraid of how long it will take in production. This is the batching loop I'm using, and it's not working correctly:
SET @NumProcessedRows = 0
SET @BatchNum = 0
WHILE @NumProcessedRows < @RowCount
BEGIN
BEGIN TRANSACTION tx
BEGIN TRY
    SET @BatchNum = @BatchNum + 1
    SET @BatchID = @@IDENTITY

    IF (@NumProcessedRows + @BatchSize) < @RowCount
        SET @UpperLimitRow = @NumProcessedRows + @BatchSize
    ELSE
        SET @UpperLimitRow = @RowCount

    UPDATE TableName
    SET NewColumn = OldColumn
    WHERE OldColumn BETWEEN @NumProcessedRows AND @UpperLimitRow

    IF @@TRANCOUNT > 0
    BEGIN
        COMMIT TRANSACTION tx
    END
You could create a new table with the column to be changed as bigint, write all data from the existing table into the new table, and then rename the tables so that they are swapped.
If you do not have time to rename the tables, you could create a synonym pointing to your current table, create a new table as described above, and then just change the synonym.
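A minimal sketch of the synonym swap (all object names here are made up):
-- One-time setup: have callers reference a synonym instead of the table
CREATE SYNONYM dbo.RecordsHistory FOR dbo.TableRecordsHistory;
GO
-- Once the bigint copy has finished, repoint the synonym in one quick step
DROP SYNONYM dbo.RecordsHistory;
CREATE SYNONYM dbo.RecordsHistory FOR dbo.TableRecordsHistory_New;
GO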
It's taking a long time, 4 to 5 hours, in a test environment with 60 million rows.
You say in the comments that you are splitting this up into 60 batches of 1 million rows.
Your index on (Created, PrimaryKeyColumnID) does not permit a seek on PrimaryKeyColumnID, so your query needs to perform 60 full table scans when identifying the rows to update.
This adds up to 3.6 billion rows scanned.
If you were to attempt this strategy on the live database with 1.87 billion rows and the same batch size, you would end up with 1,870 full scans of a 1.87 billion row table, roughly 3.5 trillion rows scanned.
I suggest that you either add a useful index that allows a range seek, or simply alter the query to something like
WHERE Created >= @StartDateTime AND Created < @EndDateTime
and increment those datetime variables by some amount that gives roughly the desired batch size.
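A minimal sketch of that loop (the start value, the one-day step, and the NewColumn name are assumptions; tune the step so each batch lands near your target size):
DECLARE @StartDateTime datetime = '20000101'; -- earliest Created value (assumed)
DECLARE @EndDateTime datetime;

WHILE @StartDateTime < GETDATE()
BEGIN
    SET @EndDateTime = DATEADD(DAY, 1, @StartDateTime);

    -- Range seek on the leading clustered index column, no full scans
    UPDATE dbo.TableRecordsHistory
    SET NewColumn = PrimaryKeyColumnID
    WHERE Created >= @StartDateTime AND Created < @EndDateTime;

    SET @StartDateTime = @EndDateTime;
END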

Need very simple 'sequence' for GetNextOrderNumber for SQL Server

I'm trying to make an even simpler function than the one described here to get the next value of an order number for a shopping cart.
I don't care if there are gaps
Only completed orders get an ID (i.e. I'm deliberately not using IDENTITY)
Obviously there must not be duplicates
I don't care about performance and locking. If we have so many new orders that I care about locking then I'll have other problems first
I've found quite a few other similar questions, but not the exact solution i'm looking for.
What I have so far is this:
USE [ShoppingCart]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Sequence_CompletedOrderID]
([val] [int] NOT NULL
CONSTRAINT [DF_Sequence_CompletedOrderID_NextValue] DEFAULT ((520000))
) ON [PRIMARY]
then for the stored proc:
CREATE PROC dbo.GetNextCompletedOrderId
@nextval AS INT OUTPUT
AS
UPDATE dbo.sequence_completedorderid SET @nextval = val += 1;
GO
Like I said, I'm trying to base it on the article I linked to above, so perhaps this is just a clumsy way of doing it. My SQL isn't quite up to much for even simple things like this, and it's past my bedtime. Thanks!
OK, so you already have an IDENTITY column in your main table, but how about just having an additional table which again has an IDENTITY column? This would save you so much trouble and hassle...
USE [ShoppingCart]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Sequence_CompletedOrderID]
([val] [int] NOT NULL IDENTITY(520000, 1)
) ON [PRIMARY]
CREATE PROC dbo.GetNextCompletedOrderId
@nextval AS INT OUTPUT
AS
INSERT INTO dbo.Sequence_CompletedOrderID DEFAULT VALUES
SELECT @nextval = SCOPE_IDENTITY()
GO
That way, you can leave all the hassle of making sure things are unique etc. to SQL Server, and it will also make sure you won't ever get back the same value twice from the IDENTITY column!
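A quick usage sketch:
DECLARE @orderId INT;
EXEC dbo.GetNextCompletedOrderId @nextval = @orderId OUTPUT;
SELECT @orderId AS NewCompletedOrderId; -- 520000 on the first call, then 520001, ...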
If you use the Sequence_CompletedOrderID table as a one-row table of order IDs then you should use UPDATE and rely on the OUTPUT clause to capture the new value (note that OUTPUT cannot assign to a scalar variable directly, so it goes through a table variable):
CREATE PROC dbo.GetNextCompletedOrderId
@nextval AS INT OUTPUT
AS
SET NOCOUNT ON;
DECLARE @new TABLE (val int);
UPDATE dbo.Sequence_CompletedOrderID
SET val = val + 1
OUTPUT INSERTED.val INTO @new;
SELECT @nextval = val FROM @new;
GO
The solution from @marc_s creates a new row for each number generated. At first I didn't think I liked this, but I realized I could use it to my advantage.
What I did was add a datetime audit column and an @orderid parameter to the stored proc. For a particular orderid it is guaranteed to return the same completedorderid, which is the number from the sequence generator.
If for some reason my application layer requests the next id but then crashes before it can commit the transaction, the id will still be linked to that order, so when it is requested again the same number is returned.
This is what I ended up with:
USE [ShoppingCart]
GO
/****** Object: Table [dbo].[Sequence_CompletedOrderID] Script Date: 11/29/2009 03:36:40 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[Sequence_CompletedOrderID](
[val] [int] IDENTITY(520000,1) NOT NULL,
[CreateDt] [datetime] NOT NULL CONSTRAINT [DF_Sequence_CompletedOrderID_CreateDt] DEFAULT (getdate()),
[Orderid] [int] NOT NULL CONSTRAINT [DF_Sequence_CompletedOrderID_Orderid] DEFAULT ((0)),
CONSTRAINT [PK_Sequence_CompletedOrderID] PRIMARY KEY CLUSTERED
(
[Orderid] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
USE [ShoppingCart]
GO
/****** Object: StoredProcedure [dbo].[GetCompletedOrderId] Script Date: 11/29/2009 03:34:08 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
ALTER PROC [dbo].[GetCompletedOrderId]
@orderid AS INT,
@completedorderid AS INT OUTPUT
AS
IF EXISTS (SELECT * FROM dbo.Sequence_CompletedOrderID WHERE orderid = @orderid)
BEGIN
    SET @completedorderid = (SELECT val FROM dbo.Sequence_CompletedOrderID WHERE orderid = @orderid)
END
ELSE
BEGIN
    INSERT INTO dbo.Sequence_CompletedOrderID (orderid) VALUES (@orderid)
    SET @completedorderid = (SELECT SCOPE_IDENTITY())
END
How about using the following statement after inserting data into your table?
UPDATE dbo.sequence_completedorderid
SET @nextval = (SELECT MAX(val) + 1 FROM dbo.sequence_completedorderid)
You don't need a new id column; all you need is to add a new OrderCompleted column (bit) and combine that with the id you already have.
SELECT Id FROM T_Order WHERE OrderCompleted = 1