Creating columnstore Index on existing partitioned table with 800+ million rows on SQL Server 2017

I have a table partitioned on date, with a B-tree clustered index, that contains more than 800 million rows.
I would like to create a clustered columnstore index on this table in place of the existing clustered index. What would be the most efficient way to do this?
Does this affect my existing primary key, which was created with the B-tree clustered index?
Is there anything else I need to do to align the columnstore index with the table's existing partitions?
Please advise.
CREATE TABLE [dbo].[ORDHDR](
[DATE_DWID] [bigint] NOT NULL,
[VERSION] [bigint] NOT NULL,
[LOCATION_DWID] [bigint] NOT NULL,
[START_LOC_DWID] [bigint] NOT NULL,
[DESTINATION_LOC_DWID] [bigint] NOT NULL,
[XFY_ID] [bigint] NOT NULL,
[START_DWID] [bigint] NOT NULL,
[END_DWID] [bigint] NOT NULL,
[START_REQ_DWID] [bigint] NOT NULL,
[END_IYF_DWID] [bigint] NOT NULL,
[CREATED_AT_DWID] [bigint] NOT NULL,
[TIME_OF_IPB_DWID] [bigint] NOT NULL,
[DATAREC_NUM] [int] NOT NULL,
[REQUEST_FOR_DATA_TRANSFER] [varchar](30) NULL,
[DATAPCKT_NUM] [varchar](6) NOT NULL,
[INTERNAL_NUM_FOR_SUPPLY] [varchar](30) NULL,
[SOURCE_SUPPLY] [varchar](60) NULL,
[RECORD_MODE] [varchar](1) NULL,
[ORD_TYPE] [varchar](3) NULL,
[APO_ORD] [varchar](12) NULL,
[APO_APPLICATION] [int] NULL,
[SUPPLY_CATEGORY] [varchar](12) NULL,
[CONVERTABLE_ORD] [varchar](1) NULL,
[ORDSTATUS_OUTPUT] [varchar](1) NULL,
[ORDSTATUS_INPUT] [varchar](1) NULL,
[PARTIAL_DELIVERY_STATUS] [varchar](1) NULL,
[FINAL_DELIVERY_INDICATOR] [varchar](1) NULL,
[STATUS_DEALLOCATED] [varchar](1) NULL,
[STATUS_RELEASED] [varchar](1) NULL,
[STATUS_FIXED] [varchar](1) NULL,
[STATUS_STARTED] [varchar](1) NULL,
[ORD_COMPONENT_ISSUED] [int] NULL,
[PARTIALLY_CONFIRMED] [varchar](1) NULL,
[FINAL_CONFORMATION] [varchar](1) NULL,
[ORD_PLNG_TYPE] [int] NULL,
[ORD_STATUS] [int] NULL,
[START_TIME_OF_ACTIVITY] [varchar](15) NULL,
[END_DATE_OF_LATEST_ACTIVITY] [varchar](15) NULL,
[FLAG] [varchar](1) NULL,
[EDW_CREATE_DATE] [datetime] NULL,
[EDW_UPDATE_DATE] [datetime] NULL
) ON [ORD_PS]([DATE_DWID])
GO
CREATE UNIQUE CLUSTERED INDEX [ORD_HDR_PK] ON [dbo].[ORDHDR]
(
[DATE_DWID] ASC,
[VERSION] ASC,
[LOCATION_DWID] ASC,
[START_LOC_DWID] ASC,
[DESTINATION_LOC_DWID] ASC,
[XFY_ID] ASC,
[START_DWID] ASC,
[END_DWID] ASC,
[START_REQ_DWID] ASC,
[END_IYF_DWID] ASC,
[CREATED_AT_DWID] ASC,
[TIME_OF_IPB_DWID] ASC,
[DATAREC_NUM] ASC,
[DATAPCKT_NUM] ASC
)

A clustered columnstore index has only columns and no keys, so it cannot enforce uniqueness. Convert the existing unique clustered index to a clustered columnstore index, which turns the rowstore table into a columnstore, and then create a new nonclustered b-tree index to enforce uniqueness.
This can be accomplished with the DROP_EXISTING=ON clause of CREATE CLUSTERED COLUMNSTORE INDEX, followed by creation of the new index. Creating both indexes on the same partition scheme ([ORD_PS] on [DATE_DWID]) keeps them aligned with the table's existing partitions.
--change existing clustered index to clustered columnstore
CREATE CLUSTERED COLUMNSTORE INDEX ORD_HDR_PK ON [dbo].[ORDHDR]
WITH(DROP_EXISTING=ON) ON [ORD_PS]([DATE_DWID]);
--rename columnstore index to a more meaningful name
EXEC sp_rename 'dbo.ORDHDR.ORD_HDR_PK','ccidx_ORDHDR', 'INDEX';
--create new non-clustered unique index
CREATE UNIQUE NONCLUSTERED INDEX [ORD_HDR_PK] ON [dbo].[ORDHDR]
(
[DATE_DWID] ASC,
[VERSION] ASC,
[LOCATION_DWID] ASC,
[START_LOC_DWID] ASC,
[DESTINATION_LOC_DWID] ASC,
[XFY_ID] ASC,
[START_DWID] ASC,
[END_DWID] ASC,
[START_REQ_DWID] ASC,
[END_IYF_DWID] ASC,
[CREATED_AT_DWID] ASC,
[TIME_OF_IPB_DWID] ASC,
[DATAREC_NUM] ASC,
[DATAPCKT_NUM] ASC
) ON [ORD_PS]([DATE_DWID]);
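
After the conversion it may be worth confirming that both indexes sit on the partition scheme and that the rowgroups compressed as expected. The queries below are a minimal sketch, assuming the object names used above (dbo.ORDHDR); they read only the system catalog and the rowgroup DMV and change nothing.

```sql
-- Which data space (partition scheme or filegroup) does each index on dbo.ORDHDR use?
-- Aligned indexes should all report the same partition scheme (ORD_PS in this example).
SELECT i.name       AS index_name,
       i.type_desc  AS index_type,
       ds.name      AS data_space_name,
       ds.type_desc AS data_space_type
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds
    ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'dbo.ORDHDR');

-- Rowgroup summary per partition for the columnstore index.
SELECT rg.partition_number,
       rg.state_desc,
       COUNT(*)             AS row_groups,
       SUM(rg.total_rows)   AS total_rows,
       SUM(rg.deleted_rows) AS deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
WHERE rg.object_id = OBJECT_ID(N'dbo.ORDHDR')
GROUP BY rg.partition_number, rg.state_desc
ORDER BY rg.partition_number, rg.state_desc;
```

Partitions with many small COMPRESSED rowgroups, or with rowgroups still in the OPEN state, are the ones to look at more closely.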

Related

SQL query not using created index

I have a table with millions of rows and need to fetch data that meets certain conditions, so I created a nonclustered index. However, the query still performs an index scan on the primary key instead of using the index I created.
Below is the table:
CREATE TABLE [que].[cbsQue](
[requestId] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[msg] [nvarchar](max) NULL,
[meta] [nvarchar](max) NULL,
[actionName] [nvarchar](50) NULL,
[recordId] [nvarchar](50) NULL,
[branchId] [nvarchar](20) NULL,
[kmId] [nvarchar](20) NULL,
[queStatus] [nvarchar](20) NULL,
[error] [nvarchar](1000) NULL,
[response] [nvarchar](1000) NULL,
[createdDate] [datetime2](0) NULL,
[updatedDate] [datetime2](0) NULL,
[customerId] [nvarchar](50) NULL)
Below is my query:
SELECT TOP 50 Que.recordId, msg, meta, actionName
FROM que.cbsQue Que
WHERE (queStatus = 'TODO' OR
(queStatus = 'FAILED' AND
(error LIKE 'someString1%'
OR error LIKE '%someString2%'
OR error LIKE '%someString3%'
OR error = 'someString4'
OR error = 'someString5'
OR error = 'someString6'))
)
ORDER BY que.createdDate DESC
Below is the index I created:
CREATE NONCLUSTERED INDEX [CI_queStatus] ON [que].[cbsQue]
(
[queStatus] ASC
)
INCLUDE([error])
How can I get the query to use this index?
Is there a way to rewrite the WHERE clause more effectively?
Could the OR conditions in the WHERE clause be the reason the index is not used?

SQL Server partition and index

I need to design a table that will hold around 80 million records. I created a partition for every month using a persisted computed column (if this is wrong, please suggest a better way). Below are the scripts I used to create the tables and partitions, along with the queries that will be run most often. Only inserts and deletes will be performed on this table.
-- Create the Partition Function
CREATE PARTITION FUNCTION PF_Invoice_item (int)
AS RANGE LEFT FOR VALUES (1,2,3,4,5,6,7,8,9,10,11,12);
-- Create the Partition Scheme
CREATE PARTITION SCHEME PS_Invoice_item
AS PARTITION PF_Invoice_item ALL TO ([Primary]);
CREATE TABLE [Invoice]
(
[invoice_id] [bigint] NOT NULL,
[Invoice_Number] [varchar](255) NULL,
[Invoice_Date] [date] NULL,
[Invoice_Total] [numeric](18, 2) NULL,
[Outstanding_Balance] [decimal](18, 2) NULL,
CONSTRAINT [PK_Invoice_id] PRIMARY KEY CLUSTERED([invoice_id] ASC)
)
CREATE TABLE [InvoiceItem](
[invoice_item_id] [bigint] NOT NULL,
[invoice_id] [bigint] NOT NULL,
[invoice_Date] [date] NULL,
[make] [varchar](255) NULL,
[serial_number] [varchar](255) NULL,
[asset_id] [varchar](100) NULL,
[application] [varchar](255) NULL,
[customer] [varchar](255) NULL,
[ucid] [varchar](255) NULL,
[dcn] [varchar](255) NULL,
[dcn_name] [varchar](255) NULL,
[device_serial_number] [varchar](255) NULL,
[subscription_name] [varchar](255) NULL,
[product_name] [varchar](255) NULL,
[subscription_start_date] [date] NULL,
[subscription_end_date] [date] NULL,
[duration] [varchar](50) NULL,
[promo_name] [varchar](255) NULL,
[promo_end_date] [date] NULL,
[discount] [decimal](18, 2) NULL,
[tax] [decimal](18, 2) NULL,
[line_item_total] [decimal](18, 2) NULL,
[mth] AS (datepart(month,[invoice_date])) PERSISTED NOT NULL,
[RELATED_PRODUCT_RATEPLAN_NAME] [varchar](250) NULL,
[SUB_TOTAL] [decimal](18, 2) NULL,
[BILLING_START_DATE] [date] NULL,
[BILLING_END_DATE] [date] NULL,
[SUBSCRIPTION_ID] [varchar](200) NULL,
[DEVICE_TYPE] [varchar](200) NULL,
[BASE_OR_PROMO] [varchar](200) NULL,
CONSTRAINT [PK_InvoiceItem_ID] PRIMARY KEY CLUSTERED ([invoice_item_id]
ASC,[mth] ASC))
ON PS_Invoice_item(mth);
GO
ALTER TABLE [InvoiceItem] WITH CHECK ADD CONSTRAINT [FK_Invoice_ID]
FOREIGN KEY([invoice_id])
REFERENCES [Invoice] ([invoice_id])
GO
I will be using the queries below:
select subscription_name,duration,start_date,end_date,promotion_name,
promotion_end_date,sub_total,discount,tax,line_item_total from InvoiceItem
lt inner join Invoice on lt.invoice_id=invoice.invoice_id where
invoice.invoice_number='' and lt.customer='' and lt.ucid='' and lt.make='' and
lt.SERIAL_NUMBER='' and lt.dcn='' and lt.application=''
select customer,make,application from billing.AssetApplicationTotals
lineItem inner join billing.Invoice invoice on
lineItem.invoice_id=invoice.invoice_id where invoice.invoice_number='';
SELECT [invoice_Date],[make],[serial_number],[application],[customer],
[ucid],[dcn],[dcn_name],[device_serial_number]
,[subscription_name],[product_name],[subscription_start_date],
[subscription_end_date],[duration],[promo_name],[promo_end_date]
FROM [InvoiceItem] where [application]=''
SELECT [invoice_Date],[make],[serial_number],[application],[customer],
[ucid],[dcn],[dcn_name],[device_serial_number]
,[subscription_name],[product_name],[subscription_start_date],
[subscription_end_date],[duration],[promo_name],[promo_end_date]
FROM [InvoiceItem] where [customer]=''
What is the best way to index this table? Should I create a separate nonclustered index for each filter, or a composite index? And should I use a covering index to avoid key lookups?

Add a new column to Table with existing primary key

I am trying to add a new column (field) to a table with 4 columns and an extra id column.
But there seems to be a primary key restriction.
Can someone help with this?
CREATE TABLE [dbo].[table1]( [id] [int] IDENTITY(1,1) NOT NULL, [a] [int] NOT NULL, [b]
[int] NOT NULL, [c] [int] NOT NULL, [d] [int] NOT NULL, [SCD_Date] [date] NOT NULL, [EndDate]
[date] NULL, CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED

You need to specify the key column in the PRIMARY KEY constraint.
CREATE TABLE [dbo].[table1](
[id] [int] IDENTITY(1,1) NOT NULL,
[a] [int] NOT NULL,
[b] [int] NOT NULL,
[c] [int] NOT NULL,
[d] [int] NOT NULL,
[SCD_Date] [date] NOT NULL,
[EndDate] [date] NULL,
CONSTRAINT [PK_table1] PRIMARY KEY CLUSTERED (id asc)
)
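
Once the table exists with a complete primary key definition, adding a column does not involve the constraint at all. A minimal sketch follows; the column name [NewColumn] and its type are hypothetical, chosen only for illustration.

```sql
-- [NewColumn] is a hypothetical name; substitute the real column and type.
-- A nullable column can be added without supplying a default.
ALTER TABLE [dbo].[table1]
    ADD [NewColumn] [int] NULL;
```

If the new column must be NOT NULL and the table already holds rows, it also needs a DEFAULT so the existing rows receive a value.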

SQL query taking 1 minute and 35 seconds

I am running this query on a hosted SQL Server and it takes 1 minute and 35 seconds, even though it returns only 18,000 rows. The query is:
select ID,
FirstName,
LastName,
Branch,
EnquiryID,
Course,
College,
Mobile,
ExamID,
EntranceID,
Entrance,
Venue,
RegNo,
VenueID,
Exam,
Gender,
row_number() over (partition by EnquiryID order by ID asc) as AttemptNO
from AGAM_View_AOPList
order by EnquiryID
TABLE SCHEMAS
CREATE TABLE [dbo].[AGAM_AceOFPace](
[ID] [int] IDENTITY(1,1) NOT NULL,
[EnquiryID] [int] NULL,
[FirstName] [nvarchar](100) NULL,
[MiddleName] [nvarchar](100) NULL,
[LastName] [nvarchar](100) NULL,
[BranchID] [int] NULL,
[Branch] [nvarchar](100) NULL,
[CourseID] [int] NULL,
[ExamID] [int] NULL,
[Exam] [nvarchar](200) NULL,
[EntranceID] [int] NULL,
[Entrance] [nvarchar](200) NULL,
[RegNo] [nvarchar](200) NULL,
[EntranceCode] [nvarchar](100) NULL,
[ExamDate] [nvarchar](50) NULL,
[UserID] [nvarchar](100) NULL,
[EntranceFees] [numeric](18, 2) NULL,
[VenueID] [int] NULL,
[Venue] [nvarchar](max) NULL,
[ChequeNumber] [nvarchar](50) NULL,
[Bank] [nvarchar](100) NULL,
[CreatedDate] [datetime] NULL,
CONSTRAINT [PK_AGAM_AceOFPace] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[AGAM_AceOFPace] WITH CHECK ADD CONSTRAINT [FK_AGAM_AceOFPace_AGAM_Inquiry] FOREIGN KEY([EnquiryID])
REFERENCES [dbo].[AGAM_Inquiry] ([ID])
GO
ALTER TABLE [dbo].[AGAM_AceOFPace] CHECK CONSTRAINT [FK_AGAM_AceOFPace_AGAM_Inquiry]
GO
SECOND TABLE
CREATE TABLE [dbo].[AGAM_Inquiry](
[ID] [int] IDENTITY(1,1) NOT NULL,
[RegNo] [nvarchar](200) NULL,
[BranchID] [int] NULL,
[Category] [nvarchar](100) NULL,
[CourseID] [int] NULL,
[EntranceFees] [numeric](18, 2) NULL,
[EntranceID] [int] NULL,
[UserID] [nvarchar](50) NULL,
[Status] [nvarchar](50) NULL,
[ReminderDate] [datetime] NULL,
[Reminder] [nvarchar](150) NULL,
[Mobile] [nvarchar](50) NULL,
[Email] [nvarchar](50) NULL,
[FirstName] [nvarchar](50) NULL,
[MiddleName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[Landline] [nvarchar](50) NULL,
[Address] [nvarchar](100) NULL,
[DOB] [datetime] NULL,
[Gender] [nvarchar](50) NULL,
[PfBatchTime] [nvarchar](50) NULL,
[SourceOfInquiry] [nvarchar](50) NULL,
[ExStudentID] [int] NULL,
[InquiryDate] [datetime] NULL,
[ReceiptNumber] [nvarchar](50) NULL,
[RawID] [int] NULL,
[Deleted] [int] NULL,
[CreatedBy] [nvarchar](50) NULL,
[CreatedDate] [datetime] NULL,
[LastModifiedBy] [nvarchar](50) NULL,
[LastModifiedDate] [datetime] NULL,
[College] [nvarchar](150) NULL,
[Qualification] [nvarchar](150) NULL,
[RptNo] [nvarchar](100) NULL,
CONSTRAINT [PK_AGAM_Inquiry] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
ALTER TABLE [dbo].[AGAM_Inquiry] WITH CHECK ADD CONSTRAINT [FK_AGAM_Inquiry_AGAM_Branch] FOREIGN KEY([BranchID])
REFERENCES [dbo].[AGAM_Branch] ([ID])
GO
ALTER TABLE [dbo].[AGAM_Inquiry] CHECK CONSTRAINT [FK_AGAM_Inquiry_AGAM_Branch]
GO
ALTER TABLE [dbo].[AGAM_Inquiry] WITH CHECK ADD CONSTRAINT [FK_AGAM_Inquiry_AGAM_Course] FOREIGN KEY([CourseID])
REFERENCES [dbo].[AGAM_Course] ([ID])
GO
ALTER TABLE [dbo].[AGAM_Inquiry] CHECK CONSTRAINT [FK_AGAM_Inquiry_AGAM_Course]
GO
ALTER TABLE [dbo].[AGAM_Inquiry] WITH CHECK ADD CONSTRAINT [FK_AGAM_Inquiry_AGAM_Users] FOREIGN KEY([UserID])
REFERENCES [dbo].[AGAM_Users] ([UserID])
GO
ALTER TABLE [dbo].[AGAM_Inquiry] CHECK CONSTRAINT [FK_AGAM_Inquiry_AGAM_Users]
GO

Can you try with changing the view to this?
SELECT TOP (100) PERCENT
AP.ID,
AP.FirstName,
AP.LastName,
AP.Branch,
AP.EnquiryID,
AC.Name,
AI.College,
AI.Mobile,
AP.ExamID,
AP.EntranceID,
AP.RegNo,
AP.VenueID,
AP.Exam,
AI.Gender,
AP.BranchID,
AP.CourseID,
AP.CreatedDate,
AI.Status,
AP.Entrance,
AP.Venue
FROM dbo.AGAM_AceOFPace AS AP
INNER JOIN dbo.AGAM_Inquiry AS AI ON AI.ID = AP.EnquiryID
INNER JOIN dbo.AGAM_Course as AC on AC.ID = AP.CourseId
ORDER BY AP.EnquiryID

Do you have an index on EnquiryId and CourseID?
Seeing as you are joining, ordering and partitioning by it you really should.
CREATE INDEX IDX_AGAM_AceOFPace_EnquiryID
ON AGAM_AceOFPace (EnquiryID)
CREATE INDEX IDX_AGAM_AceOFPace_CourseID
ON AGAM_AceOFPace (CourseID)

Speeding up this query

This is my query which takes about 1.5 seconds. Can I lower this?
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY NAME asc) peta_rn,
peta_query.*
FROM
(SELECT
BOOK, PAGETRIMMED, NAME, TYPE, PDF
FROM
CCWiseDocumentNames2 cdn
INNER JOIN
CCWiseInstr2 cwi ON cwi.ID = cdn.ID) as peta_query) peta_paged
WHERE
peta_rn > 1331900 AND peta_rn <= 1331950
These are my table structures:
CREATE TABLE [dbo].[CCWiseDocumentNames2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[NAME] [varchar](100) NULL,
[OTHERNAM] [varchar](100) NULL,
[TYPE] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[CCWiseInstr2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[DATE] [datetime] NULL,
[ITYPE] [varchar](50) NULL,
[BOOK] [int] NULL,
[PAGE] [varchar](50) NULL,
[NOBP] [varchar](50) NULL,
[DESC] [varchar](240) NULL,
[TIF] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL,
[CONFIRM] [varchar](50) NULL,
[PDF] [varchar](50) NULL,
[PAGETRIMMED] [varchar](10) NULL,
[PageINT] [int] NULL,
[PageCHAR] [varchar](2) NULL,
[IdAuto] [int] NOT NULL
) ON [PRIMARY]
This is my execution plan:
As you can see it is 97% clustered index seek and 3% index scan. Any way to improve this query further?

You can't add a row number on the fly to more than a million rows and expect the WHERE clause to find the rows with those newly generated numbers instantly.

Because I don't have that volume of data, I can only offer some options for your consideration:
Dedicate the clustered index to the NAME column (rather than ID).
Make the join after you compute ROW_NUMBER over NAME.
Include the three columns from CCWiseInstr2 in a nonclustered index on the ID column. This could reduce disk I/O, although the performance gain would only be noticeable with a large volume of data.
CREATE NONCLUSTERED INDEX [idx2_ID_include] ON [dbo].[CCWiseInstr2] ([ID] ASC) INCLUDE ( [BOOK], [PDF], [PAGETRIMMED])
GO
With a as (
Select *
from ( SELECT ROW_NUMBER() OVER (ORDER BY NAME asc) as peta_rn, ID,
type
from CCWiseDocumentNames2) as Temp
where peta_rn > 1331900 AND peta_rn <= 1331950
)
select a.peta_rn,
a.type,
b.book,
b.PAGETRIMMED,
b.PDF
from a
join CCWiseInstr2 as b on a.id = b.id
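
As an alternative to filtering on a generated row number, SQL Server 2012 and later can request the same page with OFFSET ... FETCH. This is only a sketch, not part of the approach above: it assumes that version is available and, like the CTE, it pages over CCWiseDocumentNames2 before joining. The server still has to step over the skipped rows, so an index on NAME remains important.

```sql
-- Rows 1331901 through 1331950 in NAME order, then joined to CCWiseInstr2.
-- Equivalent to peta_rn > 1331900 AND peta_rn <= 1331950 applied before the join.
SELECT b.BOOK,
       b.PAGETRIMMED,
       a.NAME,
       a.[TYPE],
       b.PDF
FROM (
    SELECT ID, NAME, [TYPE]
    FROM dbo.CCWiseDocumentNames2
    ORDER BY NAME ASC
    OFFSET 1331900 ROWS FETCH NEXT 50 ROWS ONLY
) AS a
JOIN dbo.CCWiseInstr2 AS b
    ON a.ID = b.ID
ORDER BY a.NAME ASC;
```

If an ID in the page matches more than one row in CCWiseInstr2, the result can contain more than 50 rows, which is also true of the CTE version.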
