Speeding up this query - sql

This is my query which takes about 1.5 seconds. Can I lower this?
SELECT *
FROM
(SELECT
ROW_NUMBER() OVER (ORDER BY NAME asc) peta_rn,
peta_query.*
FROM
(SELECT
BOOK, PAGETRIMMED, NAME, TYPE, PDF
FROM
CCWiseDocumentNames2 cdn
INNER JOIN
CCWiseInstr2 cwi ON cwi.ID = cdn.ID) as peta_query) peta_paged
WHERE
peta_rn > 1331900 AND peta_rn <= 1331950
These are my table structures:
CREATE TABLE [dbo].[CCWiseDocumentNames2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[NAME] [varchar](100) NULL,
[OTHERNAM] [varchar](100) NULL,
[TYPE] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL
) ON [PRIMARY]
CREATE TABLE [dbo].[CCWiseInstr2](
[ID] [int] NULL,
[BK_PG] [varchar](50) NULL,
[DATE] [datetime] NULL,
[ITYPE] [varchar](50) NULL,
[BOOK] [int] NULL,
[PAGE] [varchar](50) NULL,
[NOBP] [varchar](50) NULL,
[DESC] [varchar](240) NULL,
[TIF] [varchar](50) NULL,
[INDEXNAME] [varchar](50) NULL,
[CONFIRM] [varchar](50) NULL,
[PDF] [varchar](50) NULL,
[PAGETRIMMED] [varchar](10) NULL,
[PageINT] [int] NULL,
[PageCHAR] [varchar](2) NULL,
[IdAuto] [int] NOT NULL
) ON [PRIMARY]
This is my execution plan:
As you can see it is 97% clustered index seek and 3% index scan. Any way to improve this query further?

You can't add rownumber on the fly to more than a million rows and expect a where clause will instantly recognize those rows with the newly generated rownumbers.

Because I don't have that volume of data, can only provide some options for your consideration:
Dedicate the clustered index for Name column (other than ID)
Make the join after you get row_number over name.
Include the three columns from CCWiseInstr2 into a non-clustered index on ID column. This could save some hard disk spindle's movement. Perfomance gain could only be observed with large volume of data.
CREATE NONCLUSTERED INDEX [idx2_ID_include] ON [dbo].[CCWiseInstr2] ([ID] ASC) INCLUDE ( [BOOK], [PDF], [PAGETRIMMED])
GO
With a as (
Select *
from ( SELECT ROW_NUMBER() OVER (ORDER BY NAME asc) as peta_rn, ID,
type
from CCWiseDocumentNames2) as Temp
where peta_rn > 1331900 AND peta_rn <= 1331950
)
select a.peta_rn,
a.type,
b.book,
b.PAGETRIMMED,
b.PDF
from a
join CCWiseInstr2 as b on a.id = b.id

Related

SQL Query to update a colums from a table base on a datetime field

I have 2 table called tblSetting and tblPaquets.
I need to update 3 fields of tblPaquets from tblSetting base on a where clause that use a datetime field of tblPaquest and tblSetting.
The sql below is to represent what I am trying to do and I know it make no sense right now.
My Goal is to have One query to achieve this goal.
I need to extract the data from tblSettings like this
SELECT TOP(1) [SupplierID],[MillID],[GradeFamilyID] FROM [tblSettings]
WHERE [DateHeure] <= [tblPaquets].[DateHeure]
ORDER BY [DateHeure] DESC
And Update tblPaquets with this data
UPDATE [tblPaquets]
SET( [SupplierID] = PREVIOUS_SELECT.[SupplierID]
[MillID] = PREVIOUS_SELECT.[MillID]
[GradeFamilly] = PREVIOUS_SELECT.[GradeFamilyID] )
Here the table design
CREATE TABLE [tblSettings](
[ID] [int] NOT NULL,
[SupplierID] [int] NOT NULL,
[MillID] [int] NOT NULL,
[GradeID] [int] NOT NULL,
[TypeID] [int] NOT NULL,
[GradeFamilyID] [int] NOT NULL,
[DateHeure] [datetime] NOT NULL,
[PeakWetEnable] [tinyint] NULL)
CREATE TABLE [tblPaquets](
[ID] [int] IDENTITY(1,1) NOT NULL,
[PaquetID] [int] NOT NULL,
[DateHeure] [datetime] NULL,
[BarreCode] [int] NULL,
[Grade] [tinyint] NULL,
[SupplierID] [int] NULL,
[MillID] [int] NULL,
[AutologSort] [tinyint] NULL,
[GradeFamilly] [int] NULL)
You can do this using CROSS APPLY:
UPDATE p
SET SupplierID = s.SupplierID,
MillID = s.MillID
GradeFamilly = s.GradeFamilyID
FROM tblPaquets p CROSS APPLY
(SELECT TOP (1) s.*
FROM tblSettings s
WHERE s.DateHeure <= p.DateHeure
ORDER BY p.DateHeure DESC
) s;
Notes:
There are no parentheses before SET.
I don't recommend using [ and ] to escape identifiers, unless they need to be escaped.
I presume the query on tblSettings should have an ORDER BY to get the most recent rows.

SQL Server partition and index

I have a requirement to design a table that is going to have around 80 million records. I created a partition for every month using persisted column (if its wrong suggest me the best way). Please find below scripts that I used to create tables and partition and the query that's going to be used often. Only Insertion and deletion will be done on this table.
-- Create the Partition Function
CREATE PARTITION FUNCTION PF_Invoice_item (int)
AS RANGE LEFT FOR VALUES (1,2,3,4,5,6,7,8,9,10,11,12);
-- Create the Partition Scheme
CREATE PARTITION SCHEME PS_Invoice_item
AS PARTITION PF_Invoice_item ALL TO ([Primary]);
CREATE TABLE [Invoice]
(
[invoice_id] [bigint] NOT NULL,
[Invoice_Number] [varchar](255) NULL,
[Invoice_Date] [date] NULL,
[Invoice_Total] [numeric](18, 2) NULL,
[Outstanding_Balance] [decimal](18, 2) NULL,
CONSTRAINT [PK_Invoice_id] PRIMARY KEY CLUSTERED([invoice_id] ASC)
)
CREATE TABLE [InvoiceItem](
[invoice_item_id] [bigint] NOT NULL,
[invoice_id] [bigint] NOT NULL,
[invoice_Date] [date] NULL,
[make] [varchar](255) NULL,
[serial_number] [varchar](255) NULL,
[asset_id] [varchar](100) NULL,
[application] [varchar](255) NULL,
[customer] [varchar](255) NULL,
[ucid] [varchar](255) NULL,
[dcn] [varchar](255) NULL,
[dcn_name] [varchar](255) NULL,
[device_serial_number] [varchar](255) NULL,
[subscription_name] [varchar](255) NULL,
[product_name] [varchar](255) NULL,
[subscription_start_date] [date] NULL,
[subscription_end_date] [date] NULL,
[duration] [varchar](50) NULL,
[promo_name] [varchar](255) NULL,
[promo_end_date] [date] NULL,
[discount] [decimal](18, 2) NULL,
[tax] [decimal](18, 2) NULL,
[line_item_total] [decimal](18, 2) NULL,
[mth] AS (datepart(month,[invoice_date])) PERSISTED NOT NULL,**
[RELATED_PRODUCT_RATEPLAN_NAME] [varchar](250) NULL,
[SUB_TOTAL] [decimal](18, 2) NULL,
[BILLING_START_DATE] [date] NULL,`enter code here`
[BILLING_END_DATE] [date] NULL,
[SUBSCRIPTION_ID] [varchar](200) NULL,
[DEVICE_TYPE] [varchar](200) NULL,
[BASE_OR_PROMO] [varchar](200) NULL,
CONSTRAINT [PK_InvoiceItem_ID] PRIMARY KEY CLUSTERED ([invoice_item_id]
ASC,[mth] ASC))
ON PS_Invoice_item(mth);
GO
ALTER TABLE [InvoiceItem] WITH CHECK ADD CONSTRAINT [FK_Invoice_ID]
FOREIGN KEY([invoice_id])
REFERENCES [Invoice] ([invoice_id])
GO
I will be using below queries
select subscription_name,duration,start_date,end_date,promotion_name,
promotion_end_date,sub_total,discount,tax,line_item_total from InvoiceItem
lt inner join Invoice on lt.invoice_id=invoice.invoice_id where
invoice.invoice_number='' and lt.customer='' and lt.ucid='' lt.make='' and
lt.SERIAL_NUMBER='' and lt.dcn='' and lt.application=''
select customer,make,application from billing.AssetApplicationTotals
lineItem inner join billing.Invoice invoice on
lineItem.invoice_id=invoice.invoice_id where invoice.invoice_number='';
SELECT [invoice_Date],[make],[serial_number],[application],[customer],
[ucid],[dcn],[dcn_name],[device_serial_number]
,[subscription_name],[product_name],[subscription_start_date],
[subscription_end_date],[duration],[promo_name],[promo_end_date]
FROM [InvoiceItem] where [application]=''
SELECT [invoice_Date],[make],[serial_number],[application],[customer],
[ucid],[dcn],[dcn_name],[device_serial_number]
,[subscription_name],[product_name],[subscription_start_date],
[subscription_end_date],[duration],[promo_name],[promo_end_date]
FROM [InvoiceItem] where [customer]=''
What is the best way to create index? Shall I create separate non clustered index for each filter, or shall I have Composite index and shall I have covering index to avoid key lookup?

Recursive query to speed up system

My application reads a table with parent child relation. The application queries every level of the tree on his own and it really slow (must do multiple levels deep). I has searching for another solution and came to the recursive queries. With the examples that I have found I cannot map it to my data structure.
My structure looks like:
CREATE TABLE [products].[BillOfMaterial](
[id] [bigint] IDENTITY(1,1) NOT NULL,
[parentNumber] [nvarchar](50) NOT NULL,
[warehouse] [nvarchar](50) NOT NULL,
[sequenceNumber] [int] NOT NULL,
[childNumber] [nvarchar](50) NOT NULL,
[childDescription] [nvarchar](50) NULL,
[qtyRequired] [numeric](18, 3) NOT NULL,
[childItemClass] [nvarchar](50) NULL,
[childItemType] [nvarchar](50) NULL,
[scrapFactor] [numeric](18, 3) NULL,
[bubbleNumber] [int] NOT NULL,
[operationNumber] [int] NOT NULL,
[effectivityDate] [date] NULL,
[discontinuityDate] [date] NULL,
[companyID] [bigint] NOT NULL,
CONSTRAINT [PK_BillOfMaterial] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Some example data:
When I query for parentNumber 1 it should give me all those rows:
And for parentNumber 3 the output must be:
Now I need all the children of a parent in a recursive way. How can I achieve this using only sql? (I'm using SQL server)
I tried already using the with statement in sql, but it will now give a lot of results.
WITH bom ( [id]
,[parentNumber]
,[warehouse]
,[sequenceNumber]
,[childNumber]
,[childDescription]
,[qtyRequired]
,[childItemClass]
,[childItemType]
,[scrapFactor]
,[bubbleNumber]
,[operationNumber]
,[effectivityDate]
,[discontinuityDate]
,[companyID] )
AS
(
select * from [CR_ApplicationSuite].[products].[BillOfMaterial] where parentNumber IN ('F611882261', '2912435206')
UNION ALL
select b.* from [CR_ApplicationSuite].[products].[BillOfMaterial] b
INNER JOIN [CR_ApplicationSuite].[products].[BillOfMaterial] c on c.childNumber = b.parentNumber
)
SELECT *
FROM bom
Here's an example that uses a table variable for demonstration purposes.
For a recursive query you need to use the CTE within itself.
I included a rootParentNumber, so it's more obvious what the base parent was.
The WHERE clauses are commented in the example.
Because you can either put a WHERE clause within the Recursive Query, or at the outer query. The former should be faster.
declare #BillOfMaterial TABLE (
[id] [bigint] IDENTITY(1,1) NOT NULL,
[parentNumber] [nvarchar](50) NOT NULL,
[warehouse] [nvarchar](50) NOT NULL,
[sequenceNumber] [int] NOT NULL,
[childNumber] [nvarchar](50) NOT NULL,
[childDescription] [nvarchar](50) NULL,
[qtyRequired] [numeric](18, 3) NOT NULL,
[childItemClass] [nvarchar](50) NULL,
[childItemType] [nvarchar](50) NULL,
[scrapFactor] [numeric](18, 3) NULL,
[bubbleNumber] [int] NOT NULL,
[operationNumber] [int] NOT NULL,
[effectivityDate] [date] NULL,
[discontinuityDate] [date] NULL,
[companyID] [bigint] NOT NULL
);
insert into #BillOfMaterial (parentNumber, childNumber, warehouse, sequenceNumber, qtyRequired, bubbleNumber, operationNumber, companyID) values
('1','2','WH1',1,0,0,0,1),
('2','4','WH1',2,0,0,0,1),
('3','4','WH1',3,0,0,0,1),
('4','5','WH1',4,0,0,0,1),
('5','0','WH1',5,0,0,0,1);
WITH BOM
AS
(
select parentNumber as rootParentNumber, *
from #BillOfMaterial
--where parentNumber IN ('1','3')
union all
select bom.rootParentNumber, b.*
from BOM
INNER JOIN #BillOfMaterial b
on (BOM.childNumber = b.parentNumber and b.childNumber <> '0')
)
SELECT
[rootParentNumber]
,[parentNumber]
,[childNumber]
,[id]
,[warehouse]
,[sequenceNumber]
,[childDescription]
,[qtyRequired]
,[childItemClass]
,[childItemType]
,[scrapFactor]
,[bubbleNumber]
,[operationNumber]
,[effectivityDate]
,[discontinuityDate]
,[companyID]
FROM BOM
--WHERE rootParentNumber IN ('1','3')
ORDER BY [rootParentNumber], [parentNumber], [childNumber]
;

How do I aggregate 3 columns that are different with MIN(DATE)?

I'm facing a simple problem here that I can't solve, I have this query:
SELECT
MIN(TEA_InicioTarefa),
PFJ_Id_Analista,
ATC_Id,
SRV_Id
FROM
dbo.TarefaEtapaAreaTecnica
INNER JOIN Tarefa t ON t.TRF_Id = TarefaEtapaAreaTecnica.TRF_Id
WHERE SRV_Id = 88
GROUP BY SRV_Id, ATC_Id, PFJ_Id_Analista
ORDER BY ATC_Id ASC
It returns me this:
I was able to group it a little with GROUP BY SRV_Id, ATC_Id, PFJ_Id_Analista that gave me these 8 records, but as you can see some PFJ_Id_Analista are different.
What I want is to select only the early date of each SRV_Id and ATC_Id, the PFJ_Id_Analista don't need to grup, if I remove PFJ_Id_Analista from the grouping the query works, but I need the column.
For eg.: between row number 2 and 3 I want only the early date, so it will be row 2. The same goes for rows 5 to 8, I want only row 6.
DDL for TarefaEtapaAreaTecnica (important key: TRF_Id)
CREATE TABLE [dbo].[TarefaEtapaAreaTecnica](
[TEA_Id] [int] IDENTITY(1,1) NOT NULL,
**[TRF_Id] [int] NOT NULL,**
[ETS_Id] [int] NOT NULL,
[ATC_Id] [int] NOT NULL,
[TEA_Revisao] [int] NOT NULL,
[PFJ_Id_Projetista] [int] NULL,
[TEA_DoctosQtd] [int] NULL,
[TEA_InicioTarefa] [datetime2](7) NULL,
[PFJ_Id_Analista] [int] NULL,
[TEA_FimTarefa] [datetime2](7) NULL,
[TEA_HorasQtd] [numeric](18, 1) NULL,
[TEA_NcfQtd] [int] NULL,
[PAT_Id] [int] NULL
DDL for Tarefa (important keys TRF_Id and SRV_Id (which I need it)):
CREATE TABLE [dbo].[Tarefa](
**[TRF_Id] [int] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL,**
**[SRV_Id] [int] NOT NULL,**
[TRT_Id] [int] NOT NULL,
[TRF_Descr] [varchar](255) NULL,
[TRF_Entrada] [datetime] NOT NULL,
[TRF_DoctosQtd] [int] NOT NULL,
[TRF_Devolucao] [datetime] NULL,
[TRF_NcfQtd] [int] NULL,
[TRF_EhDocInsuf] [bit] NULL,
[TRF_Observ] [varchar](255) NULL,
[TRF_AreasTrfQtd] [int] NULL,
[TRF_AreasTrfLiqQtd] [int] NULL
Thanks a lot.
EDIT:
CORRECT QUERY
Based on #Gordon Linoff post:
select t.TEA_InicioTarefa, t.PFJ_Id_Analista, t.ATC_Id, t.SRV_Id
from (select t.*,
row_number() over (partition by ATC_Id, SRV_Id
order by TEA_InicioTarefa) as seqnum, ta.SRV_Id
from dbo.TarefaEtapaAreaTecnica t
inner join dbo.Tarefa ta on t.TRF_Id = ta.TRF_Id
) t
where seqnum = 1 AND t.SRV_Id = 88
Just use window functions:
select t.*
from (select t.*,
row_number() over (partition by ATC_Id, SRV_Id
order by ini) as seqnum
from dbo.TarefaEtapaAreaTecnica t
) t
where seqnum = 1;
This is really an example of filtering, not aggregation. The problem is getting the right value to filter on.
Then get the grouping first and then do a JOIN with it like
SELECT
x.Min_TEA_InicioTarefa,
t.PFJ_Id_Analista,
t.ATC_Id,
t.SRV_Id
FROM
dbo.TarefaEtapaAreaTecnica t
INNER JOIN Tarefa ta ON ta.TRF_Id = t.TRF_Id
INNER JOIN (
select SRV_Id, MIN(TEA_InicioTarefa) as Min_TEA_InicioTarefa
from dbo.TarefaEtapaAreaTecnica
GROUP BY SRV_Id
) x ON t.SRV_Id = x.SRV_Id
WHERE t.SRV_Id = 88
ORDER BY t.ATC_Id ASC;

SQL fastest 'GROUP BY' script

Is there any difference in how I edit the GROUP BY command?
my code:
SELECT Number, Id
FROM Table
WHERE(....)
GROUP BY Id, Number
is it faster if i edit it like this:
SELECT Number, Id
FROM Table
WHERE(....)
GROUP BY Number , Id
it's better to use DISTINCT if you don't want to aggregate data. Otherwise, there is no difference between the two queries you provided, it'll produce the same query plan
This examples are equal.
DDL:
CREATE TABLE dbo.[WorkOut]
(
[WorkOutID] [bigint] IDENTITY(1,1) NOT NULL PRIMARY KEY,
[TimeSheetDate] [datetime] NOT NULL,
[DateOut] [datetime] NOT NULL,
[EmployeeID] [int] NOT NULL,
[IsMainWorkPlace] [bit] NOT NULL,
[DepartmentUID] [uniqueidentifier] NOT NULL,
[WorkPlaceUID] [uniqueidentifier] NULL,
[TeamUID] [uniqueidentifier] NULL,
[WorkShiftCD] [nvarchar](10) NULL,
[WorkHours] [real] NULL,
[AbsenceCode] [varchar](25) NULL,
[PaymentType] [char](2) NULL,
[CategoryID] [int] NULL
)
Query:
SELECT wo.WorkOutID, wo.TimeSheetDate
FROM dbo.WorkOut wo
GROUP BY wo.WorkOutID, wo.TimeSheetDate
SELECT DISTINCT wo.WorkOutID, wo.TimeSheetDate
FROM dbo.WorkOut wo
SELECT wo.DateOut, wo.EmployeeID
FROM dbo.WorkOut wo
GROUP BY wo.DateOut, wo.EmployeeID
SELECT DISTINCT wo.DateOut, wo.EmployeeID
FROM dbo.WorkOut wo
Execution plan: