Sometimes a SQL Server SELECT query is too slow

I have a table like this which has more than 7 million records:
CREATE TABLE [dbo].[Test]
(
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[UUID] [nvarchar](100) NOT NULL,
[FirstName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[AddrLine1] [nvarchar](100) NULL,
[AddrLine2] [nvarchar](100) NULL,
[City] [nvarchar](50) NULL,
[Prov] [nvarchar](10) NULL,
[Postal] [nvarchar](10) NULL,
[DateAdded] [datetime] NULL,
CONSTRAINT [PK_Test]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
The system runs the following SELECT query every afternoon. The odd thing is that the same query is sometimes very slow, taking about 4 minutes, while on the second and later runs it finishes in under a second.
The query:
WITH testquery AS
(
SELECT TOP 1
'Matched' as location,Firstname, LastName,
AddrLine1, AddrLine2, City, Prov, Postal
FROM
[Test]
WHERE
UUID = 'BLABLABLABLABLABLABLABLABLA'
ORDER BY
DateAdded DESC
),
defaults AS
(
SELECT
'Rejected' AS location, NULL AS Firstname, NULL AS LastName,
NULL AS AddrLine1, NULL AS AddrLine2, NULL AS City, NULL AS Prov,
NULL AS Postal
)
SELECT *
FROM testquery
UNION ALL
SELECT *
FROM defaults
WHERE NOT EXISTS (SELECT * FROM testquery);
Can somebody help please?
Notes:
I have a service that adds around 1000 new records to the table every morning.
[avg_fragmentation_in_percent] is 0.01
UUID can be duplicated if the same person has different addresses.
The table is not used anywhere else at the same time.
The database is not busy with other queries at the same time. I checked using "sys.dm_exec_requests".

You need a good index to service this query efficiently.
You say that you can't create one because of duplicate key errors: there is no need for an index to be unique.
So the one you're looking for will depend on what other queries you are running, but the following will suffice for this query:
CREATE NONCLUSTERED INDEX IX_Test_UuidDate ON
Test (UUID ASC, DateAdded DESC)
INCLUDE (Firstname, LastName, AddrLine1, AddrLine2, City, Prov, Postal)
GO
Furthermore, there is no need to query the table twice.
Start with a dummy VALUES table constructor so you always have a row, then LEFT JOIN the table and use CASE to deal with not having a row.
WITH testquery AS
(
SELECT TOP 1
*
FROM
[Test]
WHERE
UUID = 'BLABLABLABLABLABLABLABLABLA'
ORDER BY
DateAdded DESC
)
SELECT
CASE WHEN t.UUID IS NULL THEN 'Rejected' ELSE 'Matched' END AS location,
t.Firstname,
t.LastName,
t.AddrLine1,
t.AddrLine2,
t.City,
t.Prov,
t.Postal
FROM (VALUES(0)) AS v(dummy)
LEFT JOIN testquery AS t ON 1=1;

The usual explanation for this is a cold cache. In your case, I think the issue would be the ORDER BY in the first CTE.
To fix this problem, you want an index on test(UUID, DateAdded desc).
I'm not sure why this would speed up after the first execution. Perhaps the server's caches are working particularly well.
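One way to test the cold-cache theory is to compare I/O statistics for a slow run and a fast run. The following is a minimal sketch, assuming you can run the inner SELECT by hand in SSMS. The query is a simplified version of the first CTE from the question.

```sql
-- Report page reads and timings in the Messages tab.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

SELECT TOP (1)
    FirstName, LastName, AddrLine1, AddrLine2, City, Prov, Postal
FROM dbo.Test
WHERE UUID = N'BLABLABLABLABLABLABLABLABLA'
ORDER BY DateAdded DESC;

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

If the slow run reports many physical reads or read-ahead reads and the fast run reports almost none, the pages were being pulled from disk the first time and served from the buffer pool afterwards. The suggested index should reduce the number of pages touched in both cases.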

Related

How do I tell the query to only use distinct on only one field [duplicate]

This question already has answers here:
How do I select the first row per group in an SQL Query?
I'm attempting to pull a report from this table:
CREATE TABLE [Miscellaneous].[BackupStatus](
[BackupStatusId] [int] IDENTITY(1,1) NOT NULL,
[CompanyCode] [varchar](3) NOT NULL,
[ComputerName] [varchar](50) NOT NULL,
[DateTimeOfBackup] [datetime] NOT NULL,
[Description] [varchar](50) NOT NULL,
[IsSuccess] [bit] NOT NULL,
[Message] [nvarchar](max) NOT NULL,
[Exception] [nvarchar](max) NULL,
[ProgramDate] [datetime] NULL,
CONSTRAINT [PK_BackupStatus] PRIMARY KEY CLUSTERED
(
[BackupStatusId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [Miscellaneous].[BackupStatus] ADD CONSTRAINT [DF_BackupStatus_IsSuccess] DEFAULT ((0)) FOR [IsSuccess]
GO
I need to get a list of the most recent backup by CompanyCode. I only want the most recent record for each company (i.e. CompanyCode). I believe that I need to sort the table descending and use DISTINCT, but I cannot figure it out.
SELECT Distinct CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM Miscellaneous.BackupStatus
ORDER BY DateTimeOfBackup DESC
I'm getting back all records still. How do I tell the query to use distinct on only one field?
UPDATE
I hope to add some clarity to what I need.
In this case, it should only return records 4262 and 4266, since they are the most recent unique for that CompanyCode.
DISTINCT won't help here. You can use a windowed ROW_NUMBER() function to assign ordinals and then filter by ordinal = 1.
Something like:
SELECT CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM (
SELECT
*,
ordinal = ROW_NUMBER() OVER(PARTITION BY CompanyCode ORDER BY DateTimeOfBackup DESC)
FROM Miscellaneous.BackupStatus
) A
WHERE A.Ordinal = 1 -- Just the latest
ORDER BY CompanyCode
This can also be coded using a CTE (common table expression):
WITH A AS (
SELECT
*,
ordinal = ROW_NUMBER() OVER(PARTITION BY CompanyCode ORDER BY DateTimeOfBackup DESC)
FROM Miscellaneous.BackupStatus
)
SELECT CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM A
WHERE A.Ordinal = 1 -- Just the latest
ORDER BY CompanyCode
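To see how the ordinal filter behaves, here is a small self-contained sketch that uses a table variable with made-up rows. The IDs, company codes and dates are hypothetical sample data, not the asker's real records.

```sql
-- Hypothetical sample data: two backups each for two companies.
DECLARE @BackupStatus TABLE (
    BackupStatusId   int        NOT NULL,
    CompanyCode      varchar(3) NOT NULL,
    DateTimeOfBackup datetime   NOT NULL
);

INSERT INTO @BackupStatus (BackupStatusId, CompanyCode, DateTimeOfBackup)
VALUES (1, 'AAA', '20240101 01:00'),
       (2, 'AAA', '20240102 01:00'),
       (3, 'BBB', '20240101 02:00'),
       (4, 'BBB', '20240103 02:00');

-- Number the rows per company, newest first, and keep only the newest one.
SELECT BackupStatusId, CompanyCode, DateTimeOfBackup
FROM (
    SELECT *,
           ordinal = ROW_NUMBER() OVER (PARTITION BY CompanyCode
                                        ORDER BY DateTimeOfBackup DESC)
    FROM @BackupStatus
) AS A
WHERE A.ordinal = 1
ORDER BY CompanyCode;
-- Returns BackupStatusId 2 (AAA) and 4 (BBB).
```

If two backups for the same company can share an identical DateTimeOfBackup, ROW_NUMBER() picks one of them arbitrarily. Adding BackupStatusId DESC as a second ORDER BY column inside the OVER clause would make the choice deterministic.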

RANK() SQL Server execution plan issue

What drives SQL Server to use a less optimal execution plan for queries that return 6000+ rows? I need to improve query performance for the scenario where all rows are returned.
I select all fields and add a rank over the same three columns that are in the index. Depending on the number of rows returned, the query gets one of two different execution plans, so execution takes 0.2 s or 3 s respectively.
From 1 row up to ca. 5000 rows returned, the query runs fast. From 6000 rows up to all rows, the query runs slowly.
Table1 has ca. 38000 rows. Database runs on Azure SQL v12.
Table:
CREATE TABLE [dbo].[Table1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[KOD_ID] [int] NULL,
[SYM] [nvarchar](20) NULL,
[AN] [nvarchar](35) NULL,
[A] [nvarchar](10) NULL,
[B] [nvarchar](2) NULL,
[C] [datetime] NULL,
[D] [datetime] NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
CREATE NONCLUSTERED INDEX [IX_Table1] ON [dbo].[Table1]
(
[KOD_ID] ASC,
[SYM] ASC,
[AN] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Queries:
SELECT TOP 6000 *, RANK() OVER(ORDER BY KOD_ID ASC, SYM ASC, AN ASC) AS Rank#
FROM [dbo].[Table1]
SELECT TOP 7000 *, RANK() OVER(ORDER BY KOD_ID ASC, SYM ASC, AN ASC) AS Rank#
FROM [dbo].[Table1]
Execution plans for both queries
CREATE NONCLUSTERED INDEX [IX_Table1] ON [dbo].[Table1]
(
[KOD_ID] ASC,
[SYM] ASC,
[AN] ASC
) INCLUDE ([A], [B], [C], [D]);
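Create a covering index like this one. The query can then scan just this index, and the sort will most likely not be needed because the index already stores the rows in the required order. Since an index named IX_Table1 already exists, drop it first or add WITH (DROP_EXISTING = ON) to the statement.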
The key points in your queries are:
The first plan has a key lookup. Avoid these where you can: a key lookup is an extra seek into the clustered index for every row, needed because the nonclustered index does not contain all the selected columns. A covering index with INCLUDE columns removes it.
Avoid sort operations too; they are costly for SQL Server.
If you're all right with index rebuilds and favor reads over inserts, these could be alternative DDLs for your table, assuming that KOD_ID, SYM and AN can be made NOT NULL:
If ID is needed to ensure uniqueness:
CREATE TABLE [dbo].[Table1] (
[KOD_ID] [int] NOT NULL
, [SYM] [nvarchar](20) NOT NULL
, [AN] [nvarchar](35) NOT NULL
, [ID] [int] IDENTITY(1, 1) NOT NULL
, [A] [nvarchar](10) NULL
, [B] [nvarchar](2) NULL
, [C] [datetime2] NULL
, [D] [datetime2] NULL
, CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([KOD_ID], [SYM], [AN], [ID])
);
GO
If ID is not needed to ensure uniqueness:
CREATE TABLE [dbo].[Table1] (
[KOD_ID] [int] NOT NULL
, [SYM] [nvarchar](20) NOT NULL
, [AN] [nvarchar](35) NOT NULL
, [A] [nvarchar](10) NULL
, [B] [nvarchar](2) NULL
, [C] [datetime2] NULL
, [D] [datetime2] NULL
, CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([KOD_ID], [SYM], [AN])
);
GO
Also, note that I use datetime2 instead of datetime, that's what Microsoft recommends: https://learn.microsoft.com/en-us/sql/t-sql/data-types/datetime-transact-sql
Use the time, date, datetime2 and datetimeoffset data
types for new work. These types align with the SQL Standard. They are
more portable. time, datetime2 and datetimeoffset provide
more seconds precision. datetimeoffset provides time zone support
for globally deployed applications.
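The precision difference mentioned in the quote can be seen with a short sketch. The variable name and the literal value are arbitrary and chosen only for illustration.

```sql
-- datetime2(7) stores seven fractional-second digits.
DECLARE @precise datetime2(7) = '2024-01-01 12:00:00.1234567';

SELECT
    @precise                   AS AsDatetime2, -- keeps all seven fractional digits
    CAST(@precise AS datetime) AS AsDatetime;  -- rounded to datetime's coarser precision
```

datetime rounds fractional seconds to increments of .000, .003 or .007, so the second column loses detail that the first one keeps.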

Insert into ( Select ) without auto increment

Have two tables
CREATE TABLE [dbo].[TABELAA]
(
[ID] [bigint] NOT NULL,
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL,
CONSTRAINT [PK_TABELAA]
PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TABELAB]
(
[PodatakX] [nvarchar](50) NULL,
[PodatakY] [nvarchar](50) NULL
) ON [PRIMARY]
GO
I need to insert the values from TABELAB into TABELAA while auto-generating the ID in TABELAA, so I need something like the query below. That query would only work if there were a single row, though. I'm talking about thousands of rows, and each should get a generated ID exactly like an auto-increment with a step of 1.
My failed attempt, where I think I should be using OVER:
INSERT INTO TABELAA
SELECT
(SELECT MAX(id) + 1 FROM TabelaA) AS Id, *
FROM
tabelaB
You are looking for IDENTITY:
CREATE TABLE [dbo].[TABELAA](
[ID] [bigint] IDENTITY(1, 1) PRIMARY KEY, -- NOT NULL is handled by PRIMARY KEY
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL
);
INSERT INTO TABELAA (PodatakA, PodatakB)
SELECT PodatakX, PodatakY
FROM TABELAB;
I agree with Rahul's comment and Gordon that if you can modify your schema it would make the most sense to add an Identity Column. However if you cannot you can still accomplish what you want using a couple of methods.
One method is to get the MAX ID of TABELAA and then add a ROW_NUMBER() to it, like so:
INSERT INTO TABELAA (ID, PodatakA, PodatakB)
SELECT
m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,b.PodatakX
,b.PodatakY
FROM
TABELAB b
CROSS APPLY (
SELECT ISNULL(MAX(ID),0) as CurrentMaxId
FROM
TABELAA) m
Again, this is a workaround; the ideal solution is to specify IDENTITY.
This approach is also susceptible to problems from simultaneous writes and other scenarios in a heavy-traffic database.
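If IDENTITY really is not an option, one way to reduce the concurrency risk is to hold an exclusive table lock while the maximum is read and the rows are inserted. This is only a sketch, assuming that blocking other sessions on TABELAA for the duration of the insert is acceptable. The hints are a choice made here for illustration, not something from the answers above.

```sql
SET XACT_ABORT ON;  -- roll the whole transaction back on any run-time error

BEGIN TRANSACTION;

INSERT INTO dbo.TABELAA (ID, PodatakA, PodatakB)
SELECT
    m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    b.PodatakX,
    b.PodatakY
FROM dbo.TABELAB AS b
CROSS JOIN (
    -- TABLOCKX takes an exclusive table lock that is held until COMMIT,
    -- so no other session can read MAX(ID) or insert in the meantime.
    SELECT ISNULL(MAX(ID), 0) AS CurrentMaxId
    FROM dbo.TABELAA WITH (TABLOCKX, HOLDLOCK)
) AS m;

COMMIT TRANSACTION;
```

The trade-off is throughput: every writer to TABELAA is serialized while this runs, which is why IDENTITY remains the better fix.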

Bypass duplicate record when one table data is inserted from another in SQL Server

I am inserting data from a table in one database into a table in another database. This works well, but I need to make sure duplicate rows are not inserted. My query is below. How can I check for duplicate records?
;WITH ABC AS (
SELECT
5 AS DeviceID
, nUserID AS CardNo
, CONVERT(DATE, dbo.fn_ConvertToDateTime(nDateTime)) AS InOutDate
, CONVERT(VARCHAR(8) ,CONVERT(TIME,dbo.fn_ConvertToDateTime(nDateTime))) AS InOutTime
FROM [BioStar].[dbo].[TB_EVENT_LOG]
)
SELECT * INTO #tempAtten FROM ABC
INSERT [HR].[dbo].[HR_DeviceInOut](DeviceID, CardNo, InOutDate, InOutTime, ShiftprofileID, ExecutedBy)
SELECT DeviceID, CardNo, InOutDate, InOutTime, NULL, NULL
FROM #tempAtten
WHERE #tempAtten.InOutDate = CONVERT(DATE, GETDATE()) AND #tempAtten.CardNo <> 0
DROP TABLE #tempAtten
--HR_DeviceInOut
CREATE TABLE [dbo].[HR_DeviceInOut](
[id] [bigint] IDENTITY(100000000000001,1) NOT NULL,
[DeviceID] [nvarchar](20) NULL,
[CardNo] [nvarchar](20) NOT NULL,
[InOutDate] [date] NOT NULL,
[InOutTime] [nvarchar](10) NOT NULL,
[ShiftprofileID] [tinyint] NULL,
[ExecutedBy] [int] NULL,
CONSTRAINT [PK_HR_AttenHistory] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
--Function
ALTER FUNCTION [dbo].[fn_ConvertToDateTime] (@Datetime BIGINT)
RETURNS DATETIME
AS
BEGIN
DECLARE @LocalTimeOffset BIGINT
,@AdjustedLocalDatetime BIGINT;
SET @LocalTimeOffset = DATEDIFF(second,GETDATE(),GETUTCDATE())
SET @AdjustedLocalDatetime = @Datetime - @LocalTimeOffset
RETURN (SELECT DATEADD(second,@AdjustedLocalDatetime, CAST('1970-01-01 00:00:00' AS datetime)))
END;
Assuming I'm understanding correctly, here's one option using not exists:
INSERT [HR].[dbo].[HR_DeviceInOut] (DeviceID, CardNo, InOutDate,
InOutTime, ShiftprofileID, ExecutedBy)
SELECT DeviceID, CardNo, InOutDate, InOutTime, NULL, NULL
FROM #tempAtten t
WHERE t.InOutDate = CONVERT(DATE, GETDATE()) AND
t.CardNo <> 0 AND
NOT EXISTS (
SELECT 1
FROM [HR].[dbo].[HR_DeviceInOut] d
WHERE t.DeviceID = d.DeviceId AND
t.CardNo = d.CardNo AND
t.InOutDate = d.InOutDate AND
t.InOutTime = d.InOutTime
)
Consider adding a unique index on the fields that cannot be duplicated.
Which set of columns makes a record unique? I see that some columns are hard-coded, e.g. 5 AS DeviceID.
Create a unique key on the remaining columns in both the temp table and the destination table to avoid duplicates.
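A unique index along those lines might look like the sketch below. The index name is hypothetical, and the column list assumes that the four columns compared in the NOT EXISTS query together identify an event. Run it in the HR database.

```sql
-- Hypothetical index name; the key columns mirror the NOT EXISTS comparison.
-- Creation fails if the table already contains duplicate combinations.
CREATE UNIQUE NONCLUSTERED INDEX UX_HR_DeviceInOut_Event
ON dbo.HR_DeviceInOut (DeviceID, CardNo, InOutDate, InOutTime);
```

With the index in place, an INSERT that would create a duplicate fails with an error instead of silently adding the row, so the NOT EXISTS filter is still worth keeping. Creating the index WITH (IGNORE_DUP_KEY = ON) is another option: SQL Server then discards the duplicate rows with a warning and inserts the rest.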

get rows and rows from one table SQL server

I have a comments table in SQL Server structured as follows:
CREATE TABLE [dbo].[LS_Commentes](
[CommentId] [int] IDENTITY(1,1) NOT NULL,
[OwnerId] [uniqueidentifier] NULL,
[OwnerName] [nvarchar](50) NULL,
[Email] [nvarchar](250) NULL,
[Date] [nvarchar](15) NULL,
[ParentId] [int] NULL,
[CommentText] [nvarchar](400) NULL,
[ItemId] [int] NULL,
[upVotes] [int] NULL,
[downVotes] [int] NULL,
[isApproved] [bit] NULL,
CONSTRAINT [PK_LS_MsgCommentes] PRIMARY KEY CLUSTERED
(
[CommentId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and I have sample data like this:
CommentId | OwnerId | OwnerName      | Email | Date     | ParentId | CommentText | ItemId | upVotes | downVotes | isApproved
1         | NULL    | Test Commneter | NULL  | 1/4/2013 | NULL     | test        | 9      | 0       | 0         | NULL
2         | NULL    | Test Commneter | NULL  | 1/4/2013 | 1        | test        | NULL   | 0       | 0         | NULL
3         | NULL    | Test Commneter | NULL  | 1/4/2013 | 1        | test        | NULL   | 0       | 0         | NULL
I want to write one query that returns all rows with ItemId = 9 plus the rows whose ParentId equals the CommentId of one of those selected rows.
I could solve this by also setting ItemId to 9 on the sub-comments, but I want to know whether it can be done without adding the ItemId to both comments and sub-comments.
I think the following query does what you want:
select *
from LS_Commentes c
where c.itemID = 9 or
c.parentID in (select c2.commentId from LS_Commentes c2 where c2.itemId = 9)
Would a recursive Common Table Expression give you the results you're after?
;with cte as
(
--Anchor
select
commentid,
ParentId
from
LS_Commentes
where
ItemId = 9
union all
--Recursive member
select
c.commentId,
c.ParentId
from
LS_Commentes c join cte on c.ParentId = cte.CommentId
)
select * from cte
If you want to include more columns in the results ensure that both parts (the Anchor and recursive member) have identical columns.
Explanation:
The anchor part (the first select) of a recursive query selects all rows where ItemId = 9. The second part uses the rows already in the result to pull in further rows that satisfy its criterion (ParentId = cte.CommentId), and this keeps going until nothing more is selected. The entire result must then be selected at the end (after the CTE's definition).
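As a sketch of the "more columns" point, the version below carries CommentText through both parts and adds a Depth column. Depth is a name introduced here for illustration and is not part of the table.

```sql
;WITH cte AS
(
    -- Anchor: top-level comments for the item
    SELECT CommentId, ParentId, CommentText, 0 AS Depth
    FROM dbo.LS_Commentes
    WHERE ItemId = 9

    UNION ALL

    -- Recursive member: same column list, same order, same types
    SELECT c.CommentId, c.ParentId, c.CommentText, cte.Depth + 1
    FROM dbo.LS_Commentes AS c
    JOIN cte ON c.ParentId = cte.CommentId
)
SELECT CommentId, ParentId, CommentText, Depth
FROM cte
ORDER BY Depth, CommentId;
```

SQL Server stops a recursive CTE after 100 levels by default. For deeper comment trees, OPTION (MAXRECURSION n) on the final SELECT raises that limit.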
I think a subquery would work well here:
SELECT *
FROM [LS_Commentes]
WHERE [ItemId] = 9
OR [ParentId] IN (SELECT [CommentId] FROM [LS_Commentes] WHERE [ItemId] = 9);