T-SQL split table vertically (moving column) with (almost) same performance - sql

In T-SQL (MS SQL Server 2016) I want to vertically split a big table (220 GB, 500 million rows), because some of its columns hold descriptive data and others hold daily data.
So from
CREATE TABLE [BigTable](
[OptionID] [int] NOT NULL,
[Date] [datetime] NOT NULL,
[ParentID] [bigint] NOT NULL,
[Description] [char](255) NOT NULL,
[Price] [real] NULL,
[PriceTheo] [real] NULL,
CONSTRAINT [PK_BigTable] PRIMARY KEY CLUSTERED
(
[ParentID] ASC,
[Date] ASC,
[OptionID] ASC
) ON [PRIMARY]
) ON [PRIMARY]
GO
I would move to:
CREATE TABLE [DescriptionTable](
[OptionVersionID] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,
[OptionID] [int] NOT NULL,
[ParentID] [bigint] NOT NULL,
[Description] [char](255) NOT NULL,
CONSTRAINT [PK_DescriptionTable] PRIMARY KEY CLUSTERED
([OptionVersionID] ASC) ON [PRIMARY]) ON [PRIMARY]
CREATE TABLE [DailyTable](
[OptionVersionID] [int] NOT NULL,
[Date] [datetime] NOT NULL,
[Price] [real] NULL,
[PriceTheo] [real] NULL,
CONSTRAINT [PK_DailyTable] PRIMARY KEY CLUSTERED
([OptionVersionID] ASC,[Date] ASC) ON [PRIMARY]) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_DailyTable_Date] ON [DailyTable]
([Date] ASC) ON [PRIMARY]
GO
CREATE NONCLUSTERED INDEX [IX_DailyTable_OptionVersionID] ON [DailyTable]
([OptionVersionID] ASC) ON [PRIMARY]
GO
ALTER TABLE [DailyTable] WITH CHECK ADD CONSTRAINT [FK_DailyTable_DescriptionTable] FOREIGN KEY([OptionVersionID])
REFERENCES [DescriptionTable] ([OptionVersionID])
GO
ALTER TABLE [DailyTable] CHECK CONSTRAINT [FK_DailyTable_DescriptionTable]
GO
I then create a view
CREATE VIEW [vBigTable]
AS
SELECT
[OptionID],
[Date],
[ParentID],
[Description],
[Price],
[PriceTheo]
FROM DailyTable da INNER JOIN
DescriptionTable de ON da.OptionVersionID = de.OptionVersionID
I expected (almost) the same performance when querying the view vBigTable, but that is not what I get: some queries are up to 10x slower. Am I missing something needed to get nearly the same performance when I select, join, or group by (read-only queries) against vBigTable, or even when I write the INNER JOIN between the Description and Daily tables explicitly?
PS: I have more non clustered indices and columns in real life.

Related

How to track deleted rows from database system versioned table in Azure SQL/SQL Server?

I have a system-versioned Customers table and need to track whether an admin user deleted a row ('Tesla') from it.
The CustomersHistory table does get a new row for 'Tesla', but that row does not explicitly say whether the original was updated or deleted.
I need advice on writing a SELECT/INSERT query that compares Customers with CustomersHistory, finds the rows that have been deleted, and inserts them (here 'Tesla') into the CustomersDeleted table.
I have a system-versioned table:
CREATE TABLE [sales].[Customers](
[Customer_PK] [int] IDENTITY(1,1) NOT NULL,
[Customer_Id] [smallint] NULL,
[Customer_Name] [nvarchar](150) NULL,
) ON [PRIMARY]
GO
It holds the rows ('1','100','Tesla') and ('2','200','Ford').
An admin user will delete the 'Tesla' row.
I have a history table:
CREATE TABLE [sales].[CustomersHistory](
[Customer_PK] [int] NOT NULL,
[Customer_Id] [smallint] NULL,
[Customer_Name] [nvarchar](150) NULL,
) ON [PRIMARY]
GO
I have a third table into which I would like to insert the rows ('Tesla') that have been removed from the Customers table.
CREATE TABLE [sales].[CustomersDeleted](
[Customer_PK] [int] NOT NULL,
[Customer_Id] [smallint] NULL,
[Customer_Name] [nvarchar](150) NULL,
[Deleted_time] [datetime2](7) NULL
) ON [PRIMARY]
GO
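One possible way to approach this, as a minimal sketch: a row counts as deleted when its Customer_PK still appears in the history table but no longer exists in the current table. The table definitions above do not show the period columns that a system-versioned table must have, so the column name ValidTo below is an assumption; replace it with the actual period end column. The latest history version of a deleted row ends at the moment of the delete, which is why that value is used as Deleted_time.

```sql
-- Sketch only. [ValidTo] is a hypothetical name for the period end column
-- of the history table; it is not shown in the definitions above.
INSERT INTO [sales].[CustomersDeleted]
    ([Customer_PK], [Customer_Id], [Customer_Name], [Deleted_time])
SELECT h.[Customer_PK], h.[Customer_Id], h.[Customer_Name], h.[ValidTo]
FROM [sales].[CustomersHistory] AS h
WHERE NOT EXISTS (            -- the row is gone from the current table
          SELECT 1
          FROM [sales].[Customers] AS c
          WHERE c.[Customer_PK] = h.[Customer_PK])
  AND h.[ValidTo] = (         -- keep only the last version of each row
          SELECT MAX(h2.[ValidTo])
          FROM [sales].[CustomersHistory] AS h2
          WHERE h2.[Customer_PK] = h.[Customer_PK])
  AND NOT EXISTS (            -- do not record the same delete twice
          SELECT 1
          FROM [sales].[CustomersDeleted] AS d
          WHERE d.[Customer_PK] = h.[Customer_PK]);
```

This relies on Customer_PK values never being reused, which holds for an IDENTITY column unless values are inserted explicitly.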

Audit history of sql child table

I'm recording all inserts and updates on the TaskDetail table using a trigger. Now I want to assign multiple staff members to a task. If the staff IDs are stored in a separate child table, how can I track the audit history? I have considered storing the staff IDs as comma-separated values, but a child table seems like the better option.
In the TaskStaff table, multiple staff rows will share the same TaskId.
CREATE TRIGGER [dbo].[TaskDetail_History_Trigger]
ON [dbo].[TaskDetail]
FOR Insert,UPDATE
AS
INSERT INTO TaskHistory SELECT * FROM inserted
GO
ALTER TABLE [dbo].[TaskDetail] ENABLE TRIGGER [TaskDetail_History_Trigger]
GO
CREATE TABLE [dbo].[TaskDetail](
[Id] [int] IDENTITY(1,1) NOT NULL,
[StaffId] [int] NULL,
CONSTRAINT [PK_ProductionDetail_1] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
CREATE TABLE [dbo].[TaskHistory](
[HistoryId] [int] IDENTITY(1,1) NOT NULL,
[Id] [int] NOT NULL,
[StaffId] [int] NULL,
CONSTRAINT [PK_TaskHistory] PRIMARY KEY CLUSTERED
(
[HistoryId] ASC
)
)
CREATE TABLE [dbo].[TaskStaff](
[Id] [int] IDENTITY(1,1) NOT NULL,
[TaskId] [int] NOT NULL,
[StaffId] [int] NOT NULL,
CONSTRAINT [PK_ProductionDetailStaff] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
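One way the child table could be audited, sketched under assumptions: a second history table and a second trigger, this time on TaskStaff, that logs every assignment and removal. The names TaskStaffHistory, TaskStaff_History_Trigger, Action and ChangedAt are hypothetical and not part of the original schema.

```sql
-- Hypothetical audit table for the TaskStaff child table.
CREATE TABLE [dbo].[TaskStaffHistory](
    [HistoryId] [int] IDENTITY(1,1) NOT NULL,
    [TaskId]    [int] NOT NULL,
    [StaffId]   [int] NOT NULL,
    [Action]    [char](1) NOT NULL,   -- 'I' = assigned, 'D' = removed
    [ChangedAt] [datetime2](0) NOT NULL
        CONSTRAINT [DF_TaskStaffHistory_ChangedAt] DEFAULT (SYSUTCDATETIME()),
    CONSTRAINT [PK_TaskStaffHistory] PRIMARY KEY CLUSTERED ([HistoryId] ASC)
)
GO
CREATE TRIGGER [dbo].[TaskStaff_History_Trigger]
ON [dbo].[TaskStaff]
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- An UPDATE shows up as a 'D' row (old values) plus an 'I' row (new values).
    INSERT INTO [dbo].[TaskStaffHistory] ([TaskId], [StaffId], [Action])
    SELECT [TaskId], [StaffId], 'D' FROM deleted
    UNION ALL
    SELECT [TaskId], [StaffId], 'I' FROM inserted;
END
GO
```

With this shape, the staff assigned to a task at any point in time can be reconstructed by replaying the 'I' and 'D' rows for that TaskId in ChangedAt order.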

SQL Azure unique nonclustered constraint

Does SQL Azure support unique non-clustered constraints? Something like this:
CREATE TABLE [dbo].[MyTable] (
[Id] [int] IDENTITY(1,1) NOT NULL,
[FieldA] [nvarchar](50) NULL,
[FieldB] [nvarchar](50) NULL,
[FieldC] [int] NULL,
CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC),
CONSTRAINT [UQ_ABC] UNIQUE NONCLUSTERED ([FieldA], [FieldB], [FieldC])
It is supported. Note that the closing ) is missing from the statement above, in case that is what is troubling you.
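For reference, the statement from the question with the closing parenthesis added would look like this (a sketch; nothing else is changed):

```sql
CREATE TABLE [dbo].[MyTable] (
    [Id]     [int] IDENTITY(1,1) NOT NULL,
    [FieldA] [nvarchar](50) NULL,
    [FieldB] [nvarchar](50) NULL,
    [FieldC] [int] NULL,
    CONSTRAINT [PK_Id] PRIMARY KEY CLUSTERED ([Id] ASC),
    CONSTRAINT [UQ_ABC] UNIQUE NONCLUSTERED ([FieldA], [FieldB], [FieldC])
);  -- this closing parenthesis was missing in the question
```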

Inserting into many-to-many table in SQL Server

This is my Tag table:
CREATE TABLE [dbo].[Tag](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Name] [nvarchar](max) NULL,
[CreationDate] [datetime] NOT NULL,
[TagSlug] [nvarchar](max) NOT NULL,
PRIMARY KEY CLUSTERED ([Id] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
and this is my Post table:
CREATE TABLE [dbo].[Post](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Title] [nvarchar](400) NOT NULL,
[Body] [nvarchar](max) NOT NULL,
[Summary] [nvarchar](max) NOT NULL,
[CreationDate] [datetime] NOT NULL,
[UrlSlug] [nvarchar](max) NOT NULL,
[Picture] [nvarchar](max) NULL,
[TagId] [int] NOT NULL,
PRIMARY KEY CLUSTERED ([Id] ASC)
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [dbo].[Post] WITH CHECK ADD CONSTRAINT [Post_Tag] FOREIGN KEY([TagId])
REFERENCES [dbo].[Tag] ([Id])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[Post] CHECK CONSTRAINT [Post_Tag]
GO
I just want to insert the Id from Tag and the Id from Post into a new table named Post_Tag, which models a many-to-many relation. This is the script of my Post_Tag table:
CREATE TABLE [dbo].[Post_Tag](
[PostId] [int] NOT NULL,
[TagId] [int] NOT NULL,
CONSTRAINT [PK_dbo.Post_Tag] PRIMARY KEY CLUSTERED ([PostId] ASC, [TagId] ASC)
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[Post_Tag] WITH CHECK
ADD CONSTRAINT [FK_dbo.Post_Tag_dbo.Post_PostId]
FOREIGN KEY([PostId]) REFERENCES [dbo].[Post] ([Id])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[Post_Tag] CHECK CONSTRAINT [FK_dbo.Post_Tag_dbo.Post_PostId]
GO
ALTER TABLE [dbo].[Post_Tag] WITH CHECK
ADD CONSTRAINT [FK_dbo.Post_Tag_dbo.Tag_TagId]
FOREIGN KEY([TagId]) REFERENCES [dbo].[Tag] ([Id])
ON DELETE CASCADE
GO
ALTER TABLE [dbo].[Post_Tag] CHECK CONSTRAINT [FK_dbo.Post_Tag_dbo.Tag_TagId]
GO
Now, to do that I've tried the below query:
insert into [Blog].[dbo].[Post_Tag] (PostId,TagId)
select [Id] as [PostId] from [OldBlog].[dbo].[Tag]
select [TagId] from [OldBlog].[dbo].[Post]
but this error appears when I run the script:
The select list for the INSERT statement contains fewer items than the insert list. The number of SELECT values must match the number of INSERT columns.
What's wrong with my query? Thanks.
The two SELECT statements are processed separately: the INSERT consumes only the first one, which returns a single column, hence the error. You will have to come up with a way to join [OldBlog].[dbo].[Tag] to [OldBlog].[dbo].[Post] so you can insert the fields PostId and TagId into [Blog].[dbo].[Post_Tag] from the resulting table expression.
For this, you can use the row number of each row from the two select statements as a link so you can join them and select what you need from both of them.
SELECT POST.[PostId], TAG.[TagId]
FROM (
select ROW_NUMBER() OVER (ORDER BY [Id]) AS Link, [Id] as [PostId] from [OldBlog].[dbo].[Tag]) AS POST
JOIN (
select ROW_NUMBER() OVER (ORDER BY [TagId]) AS Link, [TagId] from [OldBlog].[dbo].[Post]) AS TAG ON POST.Link = TAG.Link
IMPORTANT NOTE:
This is just a means of "forcing" a relationship between tables that have no relationship to each other whatsoever. It is a dangerous thing to do, because the rows are paired by row number and not by an actual key. It should only be used when no specific output is expected, or as a last resort when there is no other way to link two or more unrelated tables and the pairing of the selected columns does not matter.
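As an alternative to pairing by row number, note that the Post table shown in the question already stores a TagId for every post, so the real (PostId, TagId) pairs can be read from that one table. The following is a sketch, assuming the posts and tags were copied to the new Blog database with their original Id values; the EXISTS checks only guard against foreign key violations if some rows were not copied.

```sql
-- Sketch: build Post_Tag from the one-to-many link already stored in the old Post table.
-- Assumes [Blog].[dbo].[Post] and [Blog].[dbo].[Tag] kept the same Id values as OldBlog.
INSERT INTO [Blog].[dbo].[Post_Tag] ([PostId], [TagId])
SELECT p.[Id], p.[TagId]
FROM [OldBlog].[dbo].[Post] AS p
WHERE EXISTS (SELECT 1 FROM [Blog].[dbo].[Post] AS np WHERE np.[Id] = p.[Id])
  AND EXISTS (SELECT 1 FROM [Blog].[dbo].[Tag]  AS nt WHERE nt.[Id] = p.[TagId]);
```

Here the single SELECT returns two columns, matching the two columns in the INSERT list.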

Select all rows from table having two given values anywhere in rows

I have a table of bus routes. It has fields such as bus no., route code, starting point, end point, and up to 10 halts (halt1, halt2 ... halt10), and I have filled it with data. Now I want to select all rows that contain two given values, for example jaipur and vasai. In my table there are two such rows: in one, jaipur is in column halt2 and vasai in halt9; in the other, jaipur is in halt4 and vasai in halt10.
Please help me find the SQL query. I am using MS SQL Server.
Script:
USE [JaipuBus]
GO
/****** Object: Table [dbo].[MyRoutes] Script Date: 02/24/2014 13:28:54 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[MyRoutes](
[id] [int] IDENTITY(1,1) NOT NULL,
[Route_No] [nvarchar](50) NULL,
[Route_Code] [nvarchar](50) NULL,
[Color] [nvarchar](50) NULL,
[Start_Point] [nvarchar](200) NULL,
[End_Point] [nvarchar](200) NULL,
[halt1] [nvarchar](50) NULL,
[halt2] [nvarchar](50) NULL,
[halt3] [nvarchar](50) NULL,
[halt4] [nvarchar](50) NULL,
[halt5] [nvarchar](50) NULL,
[halt6] [nvarchar](50) NULL,
[halt7] [nvarchar](50) NULL,
[halt8] [nvarchar](50) NULL,
[halt9] [nvarchar](50) NULL,
[halt10] [nvarchar](50) NULL,
CONSTRAINT [PK_MyRoutes] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
Use CONTAINS (this requires a full-text index on the listed columns):
SELECT *
FROM [dbo].[MyRoutes]
WHERE CONTAINS((Start_Point,End_Point,halt1,halt2,halt3,halt4,halt5,halt6,halt7,halt8,halt9,halt10), 'jaipur')
AND CONTAINS((Start_Point,End_Point,halt1,halt2,halt3,halt4,halt5,halt6,halt7,halt8,halt9,halt10), 'vasai');
SELECT * FROM bus_rout WHERE (halt1='aaa' OR halt2='aaa' OR .... halt10='aaa') AND (halt1='bbb' OR halt2='bbb' OR ....... halt10='bbb')
The where clause could be generated by code.
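The same condition can also be written more compactly with IN, since T-SQL allows a list of columns on the right-hand side. This is a sketch against the MyRoutes table from the question, checking only the ten halt columns:

```sql
-- Sketch: rows where both values appear in any of the halt columns.
SELECT *
FROM [dbo].[MyRoutes]
WHERE N'jaipur' IN ([halt1], [halt2], [halt3], [halt4], [halt5],
                    [halt6], [halt7], [halt8], [halt9], [halt10])
  AND N'vasai'  IN ([halt1], [halt2], [halt3], [halt4], [halt5],
                    [halt6], [halt7], [halt8], [halt9], [halt10]);
```

Add [Start_Point] and [End_Point] to both lists if the endpoints should be searched as well.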
Based on your input, you really need a normalized table structure, with one row per halt:
CREATE TABLE [dbo].[MyRoutes](
[id] [int] NOT NULL, -- route identifier, repeated on every halt row of the same route
[Route_No] [nvarchar](50) NULL,
[Route_Code] [nvarchar](50) NULL,
[Color] [nvarchar](50) NULL,
[Start_Point] [nvarchar](200) NULL,
[End_Point] [nvarchar](200) NULL,
[HaltNum] INT,
[Halt] [nvarchar](50) NULL
)
Then a query to solve the routes problem can be written as below:
SELECT a.Route_No, a.Route_Code, a.Color, a.Start_Point, a.End_Point,
a.HaltNum StartNum, b.HaltNum StopNum
FROM MyRoutes a
INNER JOIN MyRoutes b
ON a.id = b.id
WHERE a.Halt = 'jaipur' AND b.Halt = 'vasai'
AND a.HaltNum < b.HaltNum
An even better design would be a separate master table for all stops, holding only StopId and StopName. The MyRoutes table can then carry a HaltId as a foreign key referencing the StopId column of that master table. The query above would then need to inner join with the stops table twice to put its conditions on StopName.
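A minimal sketch of that design follows. The table names Stops and RouteHalts and the column RouteId are hypothetical, chosen only for illustration; StopId, StopName, HaltId and HaltNum follow the description above.

```sql
-- Hypothetical master table of stops.
CREATE TABLE [dbo].[Stops](
    [StopId]   [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    [StopName] [nvarchar](50) NOT NULL
);

-- Hypothetical halt table: one row per halt of a route.
CREATE TABLE [dbo].[RouteHalts](
    [RouteId] [int] NOT NULL,
    [HaltNum] [int] NOT NULL,
    [HaltId]  [int] NOT NULL REFERENCES [dbo].[Stops] ([StopId]),
    PRIMARY KEY ([RouteId], [HaltNum])
);

-- Routes that pass through jaipur and later through vasai.
SELECT a.[RouteId], a.[HaltNum] AS StartNum, b.[HaltNum] AS StopNum
FROM [dbo].[RouteHalts] AS a
INNER JOIN [dbo].[RouteHalts] AS b
    ON a.[RouteId] = b.[RouteId] AND a.[HaltNum] < b.[HaltNum]
INNER JOIN [dbo].[Stops] AS sa ON sa.[StopId] = a.[HaltId]   -- first join: start stop
INNER JOIN [dbo].[Stops] AS sb ON sb.[StopId] = b.[HaltId]   -- second join: end stop
WHERE sa.[StopName] = N'jaipur'
  AND sb.[StopName] = N'vasai';
```

Drop the a.[HaltNum] < b.[HaltNum] condition if the order of the two stops along the route does not matter.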