SQL Server View with TOP

I have a view that is driving me absolutely crazy.
Table AlarmMsg looks like this:
[TypeID] [smallint] NULL,
[SEFNum] [int] NULL,
[ServerName] [nvarchar](20) NOT NULL,
[DBName] [varchar](20) NOT NULL,
[PointName] [varchar](50) NOT NULL,
[AppName] [varchar](50) NOT NULL,
[Description] [varchar](100) NOT NULL,
[Priority] [tinyint] NOT NULL,
[Value] [float] NOT NULL,
[Limit] [float] NOT NULL,
[Msg] [nvarchar](255) NULL,
[DateStamp] [datetime2](7) NULL,
[UID] [uniqueidentifier] NOT NULL
On top of the AlarmMsg table there is a view defined like this:
CREATE VIEW AlarmMsgView
AS
SELECT TOP (2000) WITH TIES
SEFNum, ServerName, DBName,
PointName, AppName, Description,
Priority, Value, Limit, Msg,
DateStamp, UID
FROM dbo.AlarmMsg WITH (NOLOCK)
ORDER BY DateStamp DESC
This query straight against the table returns the expected ten (10) rows:
SELECT TOP(10) [SEFNum]
FROM [RTIME_Logs].[dbo].[AlarmMsg]
where [Priority] = 1
The same query against the view returns... nothing (!):
SELECT TOP(10) [SEFNum]
FROM [RTIME_Logs].[dbo].[AlarmMsgView]
where [Priority] = 1
The table AlarmMsg contains some 11M+ rows and has a full-text index declared on the Msg column.
Can someone please tell me what's going on here? I think I'm losing my wits.

NOLOCK causes this issue.
Read this and this.
Basically, NOLOCK came from the SQL Server 2000 era. It needs to be forgotten. You have upgraded your SQL Server (I hope), so you need to upgrade your queries. Consider switching to READ_COMMITTED_SNAPSHOT to read data in an "unblocked" manner. Read here to decide which isolation level is best for your situation.
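If you do go the READ_COMMITTED_SNAPSHOT route, a minimal sketch of enabling it at the database level (the database name is taken from the question; the ALTER needs a moment with no other active connections in the database):
-- Minimal sketch: once enabled, readers see the last committed version of a row
-- instead of blocking on writers, so the NOLOCK hints can be dropped.
ALTER DATABASE [RTIME_Logs] SET READ_COMMITTED_SNAPSHOT ON;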
EDIT:
After reading the comments from the author, I think this is the reason:
SQL Server is not doing anything wrong. Treat the view as a subquery of the outer query, something like this:
SELECT * FROM (SELECT TOP (2000) col1, col2, priority FROM aTable ORDER BY Date1 DESC) AS t WHERE priority = 1;
The query in the parentheses is executed first (it keeps only the newest 2,000 rows), and the WHERE clause is then applied to that result set. If none of those 2,000 rows happens to have priority = 1, the outer query returns nothing.
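As a quick illustration using the table and columns from the question, filtering against the base table applies the Priority filter before any TOP, so the expected rows come back:
-- Sketch: push the filter to the base table instead of the view,
-- so the WHERE is not restricted to the 2,000 newest rows kept by the view.
SELECT TOP (10) [SEFNum]
FROM [RTIME_Logs].[dbo].[AlarmMsg]
WHERE [Priority] = 1
ORDER BY [DateStamp] DESC;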

Related

Query Optimization using 'except'

I've lately been working on some performance optimization and have been a bit stuck with the query below. Breaking it down, the individual steps don't seem to take very long, but when I run the query as a whole, it takes about 30 minutes to complete.
The TABLE has around 100k rows and the VIEW has around 400k rows, so they're not terribly large. I wasn't sure whether I'm simply not understanding the EXCEPT logic accurately and whether that's the likely culprit. Would there perhaps be an alternative to EXCEPT?
EDIT - The view itself has about 4 joins and a UNION, so it does have some logic to it.
CREATE TABLE [SCHEMA].[TABLE](
ColumnA [int] IDENTITY(1,1) NOT NULL,
ColumnB [tinyint] NOT NULL,
ColumnC [tinyint] NOT NULL,
ColumnD [int] NULL,
ColumnE [nvarchar](50) NOT NULL,
ColumnF [int] NOT NULL,
ColumnG [nvarchar](250) NULL,
ColumnH [nvarchar](250) NULL,
ColumnI [nvarchar](250) NULL,
ColumnJ [nvarchar](50) NULL,
columnK [nvarchar](400) NULL,
ColumnL [nvarchar](2) NULL,
ColumnM [nvarchar](250) NULL,
ColumnN [nvarchar](3) NULL,
----
DELETE FROM [DB].[SCHEMA].[TABLE] WHERE ColumnB NOT IN (4,6)
AND ColumnG NOT IN
    (SELECT ColumnG
     FROM
     (
         SELECT ColumnG, ColumnH, ColumnI FROM [DB].[SCHEMA].[TABLE]
         EXCEPT
         SELECT ColumnG, ColumnH, ColumnI FROM [DB].[SCHEMA].[VIEW]
         WHERE [VIEW].ColumnB = 'Active' AND YEAR(LastChgDateTime) = 9999
     ) AAA )
Thanks for any help!
Without knowing your schema and indexing, it's hard to say. We also haven't seen a query plan. And you haven't provided the view definition, so we don't know what's involved with that.
But for a start, you can simplify this query in the following way.
Note the use of a sargable predicate on LastChgDateTime.
DELETE FROM [DB].[SCHEMA].[TABLE]
WHERE ColumnB NOT IN (4,6)
AND NOT EXISTS (
SELECT ColumnG, ColumnH, ColumnI
FROM [DB].[SCHEMA].[TABLE] AAA
WHERE AAA.ColumnG = [TABLE].ColumnG
EXCEPT
SELECT ColumnG,ColumnH,ColumnI
FROM [DB].[SCHEMA].[VIEW]
WHERE [VIEW].ColumnB = 'Active' and LastChgDateTime >= '9999-01-01'
);
For the above, the following indexes would make sense.
The view would need indexing on its base tables.
[TABLE] (ColumnG, ColumnH, ColumnI) INCLUDE (ColumnB)
[VIEW] (ColumnB, ColumnG, ColumnH, ColumnI, LastChgDateTime)
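For illustration, a hedged sketch of the table-side suggestion as an actual CREATE INDEX statement (the index name is made up, and the view itself can only be indexed directly if it is schema-bound, which is why the recommendation is to index its base tables instead):
-- Illustrative index for the table side of the EXCEPT; adjust name/schema as needed.
CREATE NONCLUSTERED INDEX IX_TABLE_ColumnG_H_I
    ON [SCHEMA].[TABLE] (ColumnG, ColumnH, ColumnI)
    INCLUDE (ColumnB);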
We can optimize this further by using an updatable CTE with a window function.
I'm not entirely sure of the logic you are trying to achieve, but it appears to be something like this:
WITH cte AS (
SELECT
t.*,
IsGNotInView = COUNT(v.IsNotInView) OVER (PARTITION BY t.ColumnG)
FROM [DB].[SCHEMA].[TABLE] t
CROSS APPLY (
SELECT
CASE WHEN NOT EXISTS (SELECT 1
FROM [DB].[SCHEMA].[VIEW] v
WHERE v.ColumnB = 'Active'
AND v.LastChgDateTime >= '9999-01-01'
AND v.ColumnG = t.ColumnG
AND v.ColumnH = t.ColumnH
AND t.ColumnI = v.ColumnI
)
THEN 1 END
) v(IsNotInView)
)
DELETE FROM cte
WHERE ColumnB NOT IN (4,6)
AND IsGNotInView = 0;

SQL Server - Operand type clash: numeric is incompatible with datetimeoffset

I am having an issue passing data from one table to another because of the data types.
I tried converting datetimeoffset into date and inserting it into a table where I have the column as the date type, and I'm still getting this error.
This is the format of the date/time I have:
2018-12-12 13:00:00 -05:00 in one table, and I have to just parse the time and insert it into the new table. I tried casting with
CAST([from] AS date) DATE_FROM
I can run the query as a SELECT and it works, but the moment I try to insert the data into the other table, even though the other table is formatted and prepared as the date type, I still get the error.
Here is the table that stored data with datetimeoffset:
[dbo].[tmp_count](
[elements_Id] [numeric](20, 0) NULL,
[content_Id] [numeric](20, 0) NULL,
[element_Id] [numeric](20, 0) NULL,
[element-name] [nvarchar](255) NULL,
[sensor-type] [nvarchar](255) NULL,
[data-type] [nvarchar](255) NULL,
[from] [datetimeoffset](0) NULL,
[to] [datetimeoffset](0) NULL,
[measurements_Id] [numeric](20, 0) NULL,
[measurement_Id] [numeric](20, 0) NULL,
[from (1)] [datetimeoffset](0) NULL,
[to (1)] [datetimeoffset](0) NULL,
[values_Id] [numeric](20, 0) NULL,
[label] [nvarchar](255) NULL,
[text] [tinyint] NULL
And I am trying to cast the datetimeoffset columns to date and time and push them into the #tmp1 table with:
SELECT [elements_Id]
,[content_Id]
,[element_Id]
,[element-name]
,[sensor-type]
,[data-type]
,CAST([from] AS date) DATE_FROM
,[to]
,[measurements_Id]
,[measurement_Id]
,CAST([from (1)] AS time (0)) TIME_FROM
,CAST([to (1)] AS TIME(0)) TIME_TO
,[values_Id]
,[label]
,[text]
INTO #Tmp1
FROM [VHA].[dbo].[tmp_count]
SELECT *
FROM #tmp1
which gives me DATE_FROM in the format 2018-12-12 and TIME_FROM and TIME_TO as 13:00:00, which is exactly what I need.
Now I am trying to splice this table together with another table and push it into the final table, which looks like this:
[dbo].[tbl_ALL_DATA_N](
[serial-number] [nvarchar](255) NULL,
[ip-address] [nvarchar](255) NULL,
[name] [nvarchar](255) NULL,
[group] [nvarchar](255) NULL,
[device-type] [nvarchar](255) NULL,
[elements_Id] [numeric](20, 0) NULL,
[content_Id] [numeric](20, 0) NULL,
[element_Id] [numeric](20, 0) NULL,
[element-name] [nvarchar](255) NULL,
[sensor-type] [nvarchar](255) NULL,
[data-type] [nvarchar](255) NULL,
[DATE_FROM] [date] NULL,
[to] [datetimeoffset](0) NULL,
[measurements_Id] [numeric](20, 0) NULL,
[measurement_Id] [numeric](20, 0) NULL,
[TIME_FROM] [time](0) NULL,
[TIME_TO] [time](0) NULL,
[values_Id] [numeric](20, 0) NULL,
[label] [nvarchar](255) NULL,
[text] [tinyint] NULL
using the query below:
INSERT INTO [dbo].[tbl_ALL_DATA_N]
([serial-number],
[ip-address],
[name],
[group],
[device-type],
[measurement_id],
TIME_FROM,
TIME_TO,
[content_id],
[elements_id],
[element-name],
[sensor-type],
[data-type],
DATE_FROM,
[to],
[element_id],
[measurements_id],
[values_id],
[label],
[text])
SELECT *
FROM [VHA].[dbo].[tmp_sensor_info] A
FULL OUTER JOIN #tmp1 B
ON 1 = 1
And here is another message I'm getting: Msg 206, Level 16, State 2, Line 25
Operand type clash: numeric is incompatible with time
Any ideas?
The solution, which @PanagiotisKanavos alluded to in the comments, is to explicitly list the columns in your final SELECT * FROM.... The order of the columns in that SELECT statement isn't lining up with the columns you're INSERTing into in the destination table.
You may need to run an ad hoc instance of the query to sort out the column order. And then do yourself a favor for future maintenance and include a table alias on all of the listed columns, so you (or whoever has to look at the code next) can easily tell whether data is coming from [VHA].[dbo].[tmp_sensor_info] or #tmp1.
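A hedged sketch of what that explicit list might look like; which columns come from tmp_sensor_info versus #tmp1 is an assumption pieced together from the table definitions in the question, so verify the mapping before using it:
-- Sketch only: the split of columns between A and B is assumed from the question's DDL.
INSERT INTO [dbo].[tbl_ALL_DATA_N]
    ([serial-number], [ip-address], [name], [group], [device-type],
     [elements_Id], [content_Id], [element_Id], [element-name], [sensor-type], [data-type],
     DATE_FROM, [to], [measurements_Id], [measurement_Id],
     TIME_FROM, TIME_TO, [values_Id], [label], [text])
SELECT A.[serial-number], A.[ip-address], A.[name], A.[group], A.[device-type],
       B.[elements_Id], B.[content_Id], B.[element_Id], B.[element-name], B.[sensor-type], B.[data-type],
       B.DATE_FROM, B.[to], B.[measurements_Id], B.[measurement_Id],
       B.TIME_FROM, B.TIME_TO, B.[values_Id], B.[label], B.[text]
FROM [VHA].[dbo].[tmp_sensor_info] A
FULL OUTER JOIN #tmp1 B ON 1 = 1;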
This is just one of many dangers in using SELECT * in production code. There's a ton of discussion on the issue in this question: Why is SELECT * considered harmful?
Also, as long as you're in there fixing up the query, consider meaningful table aliases. See: Bad habits to kick : using table aliases like (a, b, c) or (t1, t2, t3).

How to delete duplicates and update a "one to many" related table?

I have searched, and perhaps I am not asking the question correctly.
I have inherited a nasty database and am trying to "normalize" it.
I have broken one table into two: Owners and Buildings.
And now I have two one-to-one tables.
I know how to delete duplicate records (in the Owners table), but I do not know how to then update the "one to many" related table.
I have one table "Owners" and one table "Owners(one) to Buildings(many)".
"Owners" Table schema:
CREATE TABLE
[dbo].[tbl_BuildingOwners]
(
[OwnerID] [int] IDENTITY(1,1) NOT NULL,
[OwnerName] [nvarchar](255) NULL,
[OwnerAddress1] [nvarchar](255) NULL,
[OwnerAddress2] [nvarchar](255) NULL,
[OwnerAddress3] [nvarchar](255) NULL,
[OwnerCity] [nvarchar](255) NULL,
[OwnerState] [nvarchar](255) NULL,
[OwnerZip] [float] NULL,
[OwnerZipExt] [float] NULL,
[OwnerPhone] [nvarchar](255) NULL,
[OwnerFax] [nvarchar](255) NULL
)
"Owners(one) to Buildings(many)" Relational Table schema:
CREATE TABLE
[dbo].[BuildingOwnerID]
(
[OwnerRelationshipID] [int] IDENTITY(1,1) NOT NULL,
[OwnerID] [int] NOT NULL,
[FileNumber] [nvarchar](255) NOT NULL
)
I need to delete the duplicates in the BuildingOwners table and update the OwnerID in the BuildingOwnerID table to the DISTINCT OwnerID that is left in the BuildingOwners table.
I hope this made sense.
I have already tried this but could not make it work for me. Lastly, I can use either SQL Server or MS Access, whichever is easier.
To remove duplicates you can use the query below (a sample query that removes duplicate state entries, where a duplicate is defined by Country and State):
WITH dupDel AS
(
    SELECT ROW_NUMBER() OVER (PARTITION BY country, state ORDER BY country) AS RowNum
    FROM tblTest
)
DELETE FROM dupDel
WHERE RowNum > 1;
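The question also asks how to repoint the child table before removing duplicates. A hedged sketch adapted to the schemas in the question: it assumes a "duplicate" means the same OwnerName, OwnerAddress1, and OwnerZip, so change the PARTITION BY to whatever actually defines a duplicate in your data, and test on a copy first.
-- Step 1: repoint BuildingOwnerID rows that reference a duplicate owner
-- at the surviving (lowest) OwnerID for that owner.
WITH ranked AS (
    SELECT OwnerID,
           MIN(OwnerID) OVER (PARTITION BY OwnerName, OwnerAddress1, OwnerZip) AS KeepID,
           ROW_NUMBER() OVER (PARTITION BY OwnerName, OwnerAddress1, OwnerZip ORDER BY OwnerID) AS RowNum
    FROM dbo.tbl_BuildingOwners
)
UPDATE b
SET b.OwnerID = r.KeepID
FROM dbo.BuildingOwnerID AS b
INNER JOIN ranked AS r ON r.OwnerID = b.OwnerID
WHERE r.RowNum > 1;

-- Step 2: delete the now-unreferenced duplicate owners.
WITH ranked AS (
    SELECT ROW_NUMBER() OVER (PARTITION BY OwnerName, OwnerAddress1, OwnerZip ORDER BY OwnerID) AS RowNum
    FROM dbo.tbl_BuildingOwners
)
DELETE FROM ranked
WHERE RowNum > 1;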

Any way to speed up the archiving of a large table

At midnight, I archive a SQL Server 2008 table by doing this in a stored procedure:
INSERT INTO archive (col1, col2....)
SELECT col1, col2...
FROM tablename
WHERE date <= @endDate

DELETE FROM tablename WHERE date <= @endDate
Here is the table schema. I've changed the column names, obviously. The archive table has exactly the same structure.
[col1] [uniqueidentifier] NOT NULL,
[col1] [bigint] NOT NULL,
[col1] [nvarchar](255) NOT NULL,
[col1] [nvarchar](255) NOT NULL,
[col1] [datetime] NOT NULL,
[col1] [nvarchar](75) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](50) NULL,
[col1] [nvarchar](50) NULL,
[col1] [nvarchar](1000) NULL,
[col1] [nvarchar](2) NULL,
[col1] [nvarchar](255) NULL,
[col1] [nvarchar](255) NULL,
The table typically has about 100,000 - 150,000 rows with several indexes, and information is still being written to it while I'm trying to perform this archive.
This process takes six minutes at the fastest and 13 minutes at the slowest.
Is there a faster way of doing this?
Partitioning is the fastest technique, but adds complexity and requires Enterprise Edition.
An alternate approach is to combine the DELETE and the INSERT into one statement by using the OUTPUT clause. http://msdn.microsoft.com/en-us/library/ms177564.aspx. A DELETE with an OUTPUT clause is faster than individual INSERT/DELETE statements.
DELETE FROM tablename
OUTPUT DELETED.col1, DELETED.col2, DELETED.col3, DELETED.col4 -- etc
INTO archive ( col1, col2, col3, col4 )
WHERE date <= @enddate;
If you have issues with blocking due to the concurrent inserts, then you can batch the above statement by doing a loop:
DECLARE @i int
SET @i = 1
WHILE @i > 0
BEGIN
    DELETE TOP (1000) FROM tablename
    OUTPUT DELETED.col1, DELETED.col2, DELETED.col3, DELETED.col4 -- etc
    INTO archive ( col1, col2, col3, col4 )
    WHERE date <= @enddate
    SET @i = @@ROWCOUNT
END
Additional note: There are a few restrictions for the output table. It can't have triggers, be involved in foreign keys or have check constraints.
A more appropriate way to handle archiving would be by creating and managing partitions.
There are several guides and tutorials available, such as:
http://blogs.msdn.com/b/felixmar/archive/2011/08/29/partitioning-amp-archiving-tables-in-sql-server-part-2-split-merge-and-switch-partitions.aspx
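For completeness, a hedged sketch of what the switch itself looks like once a partition function and scheme on the date column are in place and the archive table has an identical, aligned structure (the partition numbers here are purely illustrative, and the target partition must be empty):
-- Illustrative only: assumes both tables share the same partition function/scheme.
-- The switch is a metadata-only operation, so it completes almost instantly.
ALTER TABLE dbo.tablename
    SWITCH PARTITION 1 TO dbo.archive PARTITION 1;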

Query is very, very slow when processing 200,000-plus records

I have 200,000 rows in the Patient and Person tables, and the query shown below takes 30 seconds to execute.
I have defined the primary key (and clustered index) on PersonId in the Person table and on PatientId in the Patient table. What else can I do here to improve the performance of my procedure?
I am new to the database development side and know only basic SQL. I am also not sure whether SQL Server can handle 200,000 rows quickly.
The whole dynamic procedure can be seen at https://github.com/Padayappa/SQLProblem/blob/master/Performance
Has anyone faced handling row counts like this? How do I improve performance here?
DECLARE @return_value int,
        @unitRows bigint,
        @unitPages int,
        @TenantId int,
        @unitItems int,
        @page int
SET @TenantId = 1
SET @unitItems = 20
SET @page = 1
DECLARE #PatientSearch TABLE(
[PatientId] [bigint] NOT NULL,
[PatientIdentifier] [nvarchar](50) NULL,
[PersonNumber] [nvarchar](20) NULL,
[FirstName] [nvarchar](100) NOT NULL,
[LastName] [nvarchar](100) NOT NULL,
[ResFirstName] [nvarchar](100) NOT NULL,
[ResLastName] [nvarchar](100) NOT NULL,
[AddFirstName] [nvarchar](100) NOT NULL,
[AddLastName] [nvarchar](100) NOT NULL,
[Address] [nvarchar](255) NULL,
[City] [nvarchar](50) NULL,
[State] [nvarchar](50) NULL,
[ZipCode] [nvarchar](20) NULL,
[Country] [nvarchar](50) NULL,
[RowNumber] [bigint] NULL
)
INSERT INTO #PatientSearch SELECT PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName AS ResFirstName
,RES_PER.LastName AS ResLastName
,ADD_PER.FirstName AS AddFirstName
,ADD_PER.LastName AS AddLastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
,ROW_NUMBER() OVER (ORDER BY PAT.PatientId DESC) AS RowNumber
FROM dbo.Patient AS PAT
INNER JOIN dbo.Person AS PER
ON PAT.PersonId = PER.PersonId
INNER JOIN dbo.Person AS RES_PER
ON PAT.ResponsiblePersonId = RES_PER.PersonId
INNER JOIN dbo.Person AS ADD_PER
ON PAT.AddedBy = ADD_PER.PersonId
INNER JOIN dbo.Booking AS B
ON PAT.PatientId = B.PatientId
WHERE PAT.TenantId = @TenantId AND B.CategoryId = @CategoryId
GROUP BY PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName
,RES_PER.LastName
,ADD_PER.FirstName
,ADD_PER.LastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
;
SELECT @unitRows = @@ROWCOUNT
      ,@unitPages = (@unitRows / @unitItems) + 1;
SELECT *
FROM #PatientSearch AS IT
WHERE RowNumber BETWEEN (@page - 1) * @unitItems + 1 AND @unitItems * @page
Well, unless I am missing something (like duplicate rows?), you should be able to remove the GROUP BY
GROUP BY PAT.PatientId
,PAT.PatientIdentifier
,PER.PersonNumber
,PER.FirstName
,PER.LastName
,RES_PER.FirstName
,RES_PER.LastName
,ADD_PER.FirstName
,ADD_PER.LastName
,PER.Address
,PER.City
,PER.State
,PER.ZipCode
,PER.Country
since you are grouping by every column in the select list, and the ROW_NUMBER() is simply ordered by PAT.PatientId.
Further to that, you should create indexes on the tables, with each index containing the columns that you join or filter on.
So, for instance, I would create an index on the Patient table with key columns (TenantId, PersonId, ResponsiblePersonId, AddedBy) and included columns (PatientId, PatientIdentifier).
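As a concrete sketch of that suggestion (the index name is made up; verify the choice of key and included columns against the actual execution plan):
-- Illustrative covering index for the TenantId filter and the three Person joins.
CREATE NONCLUSTERED INDEX IX_Patient_TenantId_Persons
    ON dbo.Patient (TenantId, PersonId, ResponsiblePersonId, AddedBy)
    INCLUDE (PatientId, PatientIdentifier);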
Frankly speaking, 200,000 rows is nothing to SQL Server. First remove the logical redundancy: you already have a primary key, so why group by so many columns, and do you really need to join the same table (Person) three times? After removing the redundancy, you need to create at least some composite and/or INCLUDE indexes. Get the execution plan (Ctrl+M for the actual plan, or Ctrl+L for the estimated plan) to see which indexes you are missing. If you need further help, please post your table schema with a few rows of sample data.