How do I tell the query to use DISTINCT on only one field?

I'm attempting to pull a report from this table:
CREATE TABLE [Miscellaneous].[BackupStatus](
[BackupStatusId] [int] IDENTITY(1,1) NOT NULL,
[CompanyCode] [varchar](3) NOT NULL,
[ComputerName] [varchar](50) NOT NULL,
[DateTimeOfBackup] [datetime] NOT NULL,
[Description] [varchar](50) NOT NULL,
[IsSuccess] [bit] NOT NULL,
[Message] [nvarchar](max) NOT NULL,
[Exception] [nvarchar](max) NULL,
[ProgramDate] [datetime] NULL,
CONSTRAINT [PK_BackupStatus] PRIMARY KEY CLUSTERED
(
[BackupStatusId] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
ALTER TABLE [Miscellaneous].[BackupStatus] ADD CONSTRAINT [DF_BackupStatus_IsSuccess] DEFAULT ((0)) FOR [IsSuccess]
GO
I need to get a list of the most recent backup for each CompanyCode. I only want the most recent record for each company (i.e. CompanyCode). I believe that I need to sort the table descending and use DISTINCT, but I cannot figure it out.
SELECT Distinct CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM Miscellaneous.BackupStatus
ORDER BY DateTimeOfBackup DESC
I'm getting back all records still. How do I tell the query to use distinct on only one field?
UPDATE
I hope to add some clarity to what I need.
In this case, it should only return records 4262 and 4266, since they are the most recent unique for that CompanyCode.

DISTINCT won't help here. You can use a windowed ROW_NUMBER() function to assign ordinals and then filter by ordinal = 1.
Something like:
SELECT CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM (
SELECT
*,
ordinal = ROW_NUMBER() OVER(PARTITION BY CompanyCode ORDER BY DateTimeOfBackup DESC)
FROM Miscellaneous.BackupStatus
) A
WHERE A.Ordinal = 1 -- Just the latest
ORDER BY CompanyCode
This can also be coded using a CTE (common table expression):
WITH A AS (
SELECT
*,
ordinal = ROW_NUMBER() OVER(PARTITION BY CompanyCode ORDER BY DateTimeOfBackup DESC)
FROM Miscellaneous.BackupStatus
)
SELECT CompanyCode, ComputerName, DateTimeOfBackup,
[Description], IsSuccess, [Message],
[Exception], ProgramDate, BackupStatusId
FROM A
WHERE A.Ordinal = 1 -- Just the latest
ORDER BY CompanyCode
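To see the ROW_NUMBER() filter in action, here is a minimal sketch that runs the same derived-table query against an in-memory SQLite database from Python. SQLite (3.25 or later, for window functions) is only a stand-in for SQL Server. The table is cut down to three columns, and apart from the ids 4262 and 4266 mentioned in the question, the sample rows are made up.

```python
import sqlite3

# Hypothetical sample rows: (BackupStatusId, CompanyCode, DateTimeOfBackup).
ROWS = [
    (4260, "ABC", "2023-01-01 02:00:00"),
    (4261, "XYZ", "2023-01-01 03:00:00"),
    (4262, "ABC", "2023-01-02 02:00:00"),
    (4266, "XYZ", "2023-01-03 03:00:00"),
]


def latest_per_company(rows):
    """Return (CompanyCode, BackupStatusId) of the newest backup per company."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE BackupStatus ("
        "BackupStatusId INTEGER PRIMARY KEY, "
        "CompanyCode TEXT NOT NULL, "
        "DateTimeOfBackup TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO BackupStatus VALUES (?, ?, ?)", rows)
    result = conn.execute(
        """
        SELECT CompanyCode, BackupStatusId
        FROM (
            SELECT *,
                   ROW_NUMBER() OVER (
                       PARTITION BY CompanyCode
                       ORDER BY DateTimeOfBackup DESC
                   ) AS ordinal
            FROM BackupStatus
        ) AS A
        WHERE ordinal = 1          -- just the latest row in each partition
        ORDER BY CompanyCode
        """
    ).fetchall()
    conn.close()
    return result
```

If two rows for the same company share the same DateTimeOfBackup, which one gets ordinal 1 is arbitrary; adding BackupStatusId DESC as a second sort key would make the choice deterministic.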

Related

Sometimes a SQL Server SELECT query is too slow

I have a table like this which has more than 7 million records:
CREATE TABLE [dbo].[Test]
(
[Id] [bigint] IDENTITY(1,1) NOT NULL,
[UUID] [nvarchar](100) NOT NULL,
[FirstName] [nvarchar](50) NULL,
[LastName] [nvarchar](50) NULL,
[AddrLine1] [nvarchar](100) NULL,
[AddrLine2] [nvarchar](100) NULL,
[City] [nvarchar](50) NULL,
[Prov] [nvarchar](10) NULL,
[Postal] [nvarchar](10) NULL,
[DateAdded] [datetime] NULL,
CONSTRAINT [PK_Test]
PRIMARY KEY CLUSTERED ([Id] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
The system runs the following SELECT query every afternoon. The odd thing is that the same query is sometimes very slow, taking about 4 minutes, while on the second and later runs it is fast, taking less than a second.
The query:
WITH testquery AS
(
SELECT TOP 1
'Matched' as location,Firstname, LastName,
AddrLine1, AddrLine2, City, Prov, Postal
FROM
[Test]
WHERE
UUID = 'BLABLABLABLABLABLABLABLABLA'
ORDER BY
DateAdded DESC
),
defaults AS
(
SELECT
'Rejected' AS location, NULL AS Firstname, NULL AS LastName,
NULL AS AddrLine1, NULL AS AddrLine2, NULL AS City, NULL AS Prov,
NULL AS Postal
)
SELECT *
FROM testquery
UNION ALL
SELECT *
FROM defaults
WHERE NOT EXISTS (SELECT * FROM testquery);
Can somebody help please?
Notes:
I have a service which adds around 1000 new records to the table everyday in the mornings.
[avg_fragmentation_in_percent] is 0.01
UUID can be duplicated if I have the same person with different addresses.
The table is not used somewhere else at the same time.
Database is not busy with other queries at the same time. I checked using "sys.dm_exec_requests"
You need a good index to service this query efficiently.
You say that you can't create one because of duplicate key errors: there is no need for an index to be unique.
So the one you're looking for will depend on what other queries you are running, but the following will suffice for this query:
CREATE NONCLUSTERED INDEX IX_Test_UuidDate ON
Test (UUID ASC, DateAdded DESC)
INCLUDE (Firstname, LastName, AddrLine1, AddrLine2, City, Prov, Postal)
GO
Furthermore, there is no need to query the table twice.
Start with a dummy VALUES table constructor so you always have a row, then LEFT JOIN the table and use CASE to deal with not having a row.
WITH testquery AS
(
SELECT TOP 1
*
FROM
[Test]
WHERE
UUID = 'BLABLABLABLABLABLABLABLABLA'
ORDER BY
DateAdded DESC
)
SELECT
CASE WHEN t.UUID IS NULL THEN 'Rejected' ELSE 'Matched' END AS location,
t.Firstname,
t.LastName,
t.AddrLine1,
t.AddrLine2,
t.City,
t.Prov,
t.Postal
FROM (VALUES(0)) AS v(dummy)
LEFT JOIN testquery AS t ON 1=1;
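The "always one row" pattern can be tried out with a small sketch like the one below. It uses Python's sqlite3 module rather than SQL Server, so `(SELECT 0 AS dummy)` replaces the VALUES constructor and `LIMIT 1` replaces `TOP 1`. The table is reduced to three columns, and the helper names `make_db` and `lookup` are hypothetical.

```python
import sqlite3


def make_db():
    """Create an in-memory table with two hypothetical rows for one UUID."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE Test (UUID TEXT, FirstName TEXT, DateAdded TEXT)")
    conn.executemany(
        "INSERT INTO Test VALUES (?, ?, ?)",
        [("u1", "Old", "2023-01-01"), ("u1", "New", "2023-02-01")],
    )
    return conn


def lookup(conn, uuid):
    """Return exactly one row: the newest match, or a 'Rejected' placeholder."""
    return conn.execute(
        """
        SELECT CASE WHEN t.UUID IS NULL THEN 'Rejected' ELSE 'Matched' END AS location,
               t.FirstName
        FROM (SELECT 0 AS dummy) AS v            -- guarantees one outer row
        LEFT JOIN (
            SELECT UUID, FirstName
            FROM Test
            WHERE UUID = ?
            ORDER BY DateAdded DESC
            LIMIT 1
        ) AS t ON 1 = 1
        """,
        (uuid,),
    ).fetchone()
```

A known UUID yields a 'Matched' row with the newest name, and an unknown one yields a single 'Rejected' row with NULL columns, while the table is read only once.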
The usual explanation for this is a cold cache. In your case, I think the issue would be the ORDER BY in the first CTE.
To fix this problem, you want an index on test(UUID, DateAdded desc).
I'm not sure why this would speed up after the first execution. Perhaps the server's caches are working particularly well.

Insert into (Select) without auto increment

I have two tables:
CREATE TABLE [dbo].[TABELAA]
(
[ID] [bigint] NOT NULL,
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL,
CONSTRAINT [PK_TABELAA]
PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TABELAB]
(
[PodatakX] [nvarchar](50) NULL,
[PodatakY] [nvarchar](50) NULL
) ON [PRIMARY]
GO
I need to insert the values from TABELAB into TABELAA while auto-generating the ID in TABELAA, so I need something like the query below. It would work if there were only one row, but I'm talking about thousands of rows, where the ID should be generated exactly like AutoIncrement (1).
Here is my failed attempt; I think I should use OVER:
INSERT INTO TABELAA
SELECT
(SELECT MAX(id) + 1 FROM TabelaA) AS Id, *
FROM
tabelaB
You are looking for an IDENTITY column:
CREATE TABLE [dbo].[TABELAA](
[ID] [bigint] IDENTITY(1, 1) PRIMARY KEY, -- NOT NULL is handled by PRIMARY KEY
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL
);
INSERT INTO TABELAA (PodatakA, PodatakB)
SELECT PodatakX, PodatakY
FROM TABELAB;
I agree with Rahul's comment and with Gordon: if you can modify your schema, adding an IDENTITY column makes the most sense. If you cannot, you can still accomplish what you want in a couple of ways.
One method is to get the MAX ID of TABELAA and then add a ROW_NUMBER() to it, like so:
INSERT INTO TABELAA (ID, PodatakA, PodatakB)
SELECT
m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,b.PodatakX
,b.PodatakY
FROM
TABELAB b
CROSS APPLY (
SELECT ISNULL(MAX(ID),0) as CurrentMaxId
FROM
TABELAA) m
Again, this is a workaround; the ideal solution is to specify IDENTITY.
Also this is susceptible to problems due to simultaneous writes and other scenarios in a heavy traffic DB.
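As a rough illustration of the MAX + ROW_NUMBER() workaround, the sketch below runs an equivalent statement on an in-memory SQLite database from Python. SQLite is only a stand-in for SQL Server here: `IFNULL` replaces `ISNULL`, `CROSS JOIN` replaces `CROSS APPLY`, and the window is ordered by PodatakX (an assumption) so that the numbering is deterministic. The helper names and sample rows are hypothetical.

```python
import sqlite3


def make_db():
    """TABELAA already holds ID 10; TABELAB holds two rows without ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE TABELAA (ID INTEGER PRIMARY KEY, PodatakA TEXT, PodatakB TEXT);
        CREATE TABLE TABELAB (PodatakX TEXT, PodatakY TEXT);
        INSERT INTO TABELAA VALUES (10, 'a', 'b');
        INSERT INTO TABELAB VALUES ('x1', 'y1'), ('x2', 'y2');
        """
    )
    return conn


def copy_with_generated_ids(conn):
    """Copy TABELAB into TABELAA, numbering new rows from the current MAX(ID) + 1."""
    conn.execute(
        """
        INSERT INTO TABELAA (ID, PodatakA, PodatakB)
        SELECT m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY b.PodatakX),
               b.PodatakX,
               b.PodatakY
        FROM TABELAB AS b
        CROSS JOIN (SELECT IFNULL(MAX(ID), 0) AS CurrentMaxId FROM TABELAA) AS m
        """
    )
    conn.commit()
    return conn.execute("SELECT ID, PodatakA, PodatakB FROM TABELAA ORDER BY ID").fetchall()
```

The maximum is read once for the whole statement, which is why two sessions running this at the same time could compute the same ids; on SQL Server that race is what an IDENTITY column avoids.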

Bypass duplicate record when one table data is inserted from another in SQL Server

I am trying to insert data from a table in one database into a table in another. This works well, but I need to make sure duplicate rows are not inserted. My query is below. How can I check for duplicate records?
;WITH ABC AS (
SELECT
5 AS DeviceID
, nUserID AS CardNo
, CONVERT(DATE, dbo.fn_ConvertToDateTime(nDateTime)) AS InOutDate
, CONVERT(VARCHAR(8) ,CONVERT(TIME,dbo.fn_ConvertToDateTime(nDateTime))) AS InOutTime
FROM [BioStar].[dbo].[TB_EVENT_LOG]
)
SELECT * INTO #tempAtten FROM ABC
INSERT [HR].[dbo].[HR_DeviceInOut](DeviceID, CardNo, InOutDate, InOutTime, ShiftprofileID, ExecutedBy)
SELECT DeviceID, CardNo, InOutDate, InOutTime, NULL, NULL
FROM #tempAtten
WHERE #tempAtten.InOutDate = CONVERT(DATE, GETDATE()) AND #tempAtten.CardNo <> 0
DROP TABLE #tempAtten
--HR_DeviceInOut
CREATE TABLE [dbo].[HR_DeviceInOut](
[id] [bigint] IDENTITY(100000000000001,1) NOT NULL,
[DeviceID] [nvarchar](20) NULL,
[CardNo] [nvarchar](20) NOT NULL,
[InOutDate] [date] NOT NULL,
[InOutTime] [nvarchar](10) NOT NULL,
[ShiftprofileID] [tinyint] NULL,
[ExecutedBy] [int] NULL,
CONSTRAINT [PK_HR_AttenHistory] PRIMARY KEY CLUSTERED
(
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
GO
--Function
ALTER FUNCTION [dbo].[fn_ConvertToDateTime] (@Datetime BIGINT)
RETURNS DATETIME
AS
BEGIN
DECLARE @LocalTimeOffset BIGINT
,@AdjustedLocalDatetime BIGINT;
SET @LocalTimeOffset = DATEDIFF(second,GETDATE(),GETUTCDATE())
SET @AdjustedLocalDatetime = @Datetime - @LocalTimeOffset
RETURN (SELECT DATEADD(second,@AdjustedLocalDatetime, CAST('1970-01-01 00:00:00' AS datetime)))
END;
Assuming I'm understanding correctly, here's one option using not exists:
INSERT [HR].[dbo].[HR_DeviceInOut] (DeviceID, CardNo, InOutDate,
InOutTime, ShiftprofileID, ExecutedBy)
SELECT DeviceID, CardNo, InOutDate, InOutTime, NULL, NULL
FROM #tempAtten t
WHERE t.InOutDate = CONVERT(DATE, GETDATE()) AND
t.CardNo <> 0 AND
NOT EXISTS (
SELECT 1
FROM [HR].[dbo].[HR_DeviceInOut] d
WHERE t.DeviceID = d.DeviceId AND
t.CardNo = d.CardNo AND
t.InOutDate = d.InOutDate AND
t.InOutTime = d.InOutTime
)
Consider adding a unique index on the fields that cannot be duplicated.
Which set of columns makes a record unique? I see that some columns are hard-coded, e.g. 5 AS DeviceID.
Create a unique key on the remaining columns in both the temp table and the destination table to avoid duplicates.
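The NOT EXISTS approach can be checked with a small sketch such as the following, which uses Python's sqlite3 module instead of SQL Server. The staging table is an ordinary table named tempAtten, the date filter is left out, and the sample rows and helper names are hypothetical. One assumption goes beyond the answers above: `SELECT DISTINCT` is added because NOT EXISTS only compares against rows already in the destination and would not catch duplicates inside the same batch.

```python
import sqlite3


def make_db():
    """Create a cut-down destination table and a staging table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE HR_DeviceInOut (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            DeviceID TEXT,
            CardNo TEXT NOT NULL,
            InOutDate TEXT NOT NULL,
            InOutTime TEXT NOT NULL
        );
        CREATE TABLE tempAtten (DeviceID TEXT, CardNo TEXT, InOutDate TEXT, InOutTime TEXT);
        """
    )
    return conn


def load_new_events(conn):
    """Insert staged events that are not yet in the destination; return its row count."""
    conn.execute(
        """
        INSERT INTO HR_DeviceInOut (DeviceID, CardNo, InOutDate, InOutTime)
        SELECT DISTINCT t.DeviceID, t.CardNo, t.InOutDate, t.InOutTime
        FROM tempAtten AS t
        WHERE t.CardNo <> '0'
          AND NOT EXISTS (
              SELECT 1
              FROM HR_DeviceInOut AS d
              WHERE d.DeviceID = t.DeviceID
                AND d.CardNo = t.CardNo
                AND d.InOutDate = t.InOutDate
                AND d.InOutTime = t.InOutTime
          )
        """
    )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM HR_DeviceInOut").fetchone()[0]
```

Running the load twice over the same staging data leaves the destination unchanged the second time, which is the behaviour the question asks for. A unique index on the four columns would enforce the same rule at the schema level.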

Conversion failed when converting the nvarchar value 'newValue' to data type int

I'm pretty sure that I am inserting the correct data. I have dropped all indexes, triggers, and constraints to make sure nothing interferes with my insert.
Here's the catch: the field that I'm trying to insert into previously had a datatype of int but was later changed to nvarchar(100). Does anybody know where I should look?
This is the table structure
CREATE TABLE [dbo].[myTable.Values](
[myTableId] [bigint] NOT NULL,
[myTableCode] [nvarchar](100) NOT NULL,
[Code] [nvarchar](100) NOT NULL,
[value] [nvarchar](250) NOT NULL,
[id] [bigint] IDENTITY(1,1) NOT NULL,
[UpdateById] [bigint] NOT NULL,
[UpdateDate] [datetime] NOT NULL,
CONSTRAINT [PK_myTable.Values] PRIMARY KEY CLUSTERED
(
[myTableCode] ASC,
[Code] ASC,
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and this is the script for insert
INSERT INTO [dbo].[myTable.Values] (
[myTableId]
,[myTableCode]
,[Code]
,[value]
,[UpdateById]
,[UpdateDate]
)
VALUES (
581
,'myParentTableCODE'
,'myTableNewCode'
,'myTableNewCode'
,5197
,getdate()
)
Thanks for looking into this. I found that one of our views uses the WITH SCHEMABINDING clause, and that view casts the Code field to int. Here is the view:
CREATE VIEW [dbo].[vSomeView] WITH SCHEMABINDING
AS
SELECT
id,
CAST(Code AS INT) AS [Code],
myTableCode
FROM dbo.[myTable.Values]
WHERE myTableCode LIKE '%ABC'
AND ISNUMERIC(Code) = 1
It seems your insert statement is not complete: you should add another field, IsDefaultObsolete, and assign a value to it. Then try this insert:
INSERT INTO [dbo].[myTable.Values] (
[myTableId]
,[myTableCode]
,[Code]
,[value]
,[IsDefaultObsolete]
,[UpdateById]
,[UpdateDate]
)
VALUES (
581
,'myParentTableCODE'
,'myTableNewCode'
,'myTableNewCode'
,0 -- this may be 0 or 1, depending on your needs
,5197
,getdate()
)

How to determine size of continuous range for given criteria?

I have a positions table in SQL Server 2008R2 (definition below).
In the system, boxes contain positions.
I have a requirement to find a box, which has X free positions remaining. However, the X positions must be continuous (left to right, top to bottom i.e. ascending PositionID).
It has been simple to construct a query that finds a box with X positions free. I now have the problem of determining if the positions are continuous.
Any suggestions on a TSQL based solution?
Table Definition
CREATE TABLE [dbo].[Position](
[PositionID] [int] IDENTITY(1,1) NOT NULL,
[BoxID] [int] NOT NULL,
[pRow] [int] NOT NULL,
[pColumn] [int] NOT NULL,
[pRowLetter] [char](1) NOT NULL,
[pColumnLetter] [char](1) NOT NULL,
[SampleID] [int] NULL,
[ChangeReason] [nvarchar](4000) NOT NULL,
[LastUserID] [int] NOT NULL,
[TTSID] [bigint] NULL,
CONSTRAINT [PK_Position] PRIMARY KEY CLUSTERED
(
[PositionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Edit
http://pastebin.com/V8DLiucN - pastebin link with sample positions for 1 box (all positions empty in sample data)
Edit 2
A 'free' position is one with SampleID = null
DECLARE @AvailableSlots INT
SET @AvailableSlots = 25
;WITH OrderedSet AS (
SELECT
BoxID,
PositionID,
Row_Number() OVER (PARTITION BY BoxID ORDER BY PositionID) AS rn
FROM
Position
WHERE
SampleID IS NULL
)
SELECT
BoxID,
COUNT(*) AS AvailableSlots,
MIN(PositionID) AS StartingPosition,
MAX(PositionID) AS EndingPosition
FROM
OrderedSet
GROUP BY
PositionID - rn,
BoxID
HAVING
COUNT(*) >= @AvailableSlots
The trick is the PositionID - rn (row number) in the GROUP BY statement. This works to group together continuous sets... and from there it's easy to just do a HAVING to limit the results to the BoxIDs that have the required amount of free slots.
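To make the PositionID - rn trick concrete, here is a minimal sketch that runs the same grouping on an in-memory SQLite database from Python (SQLite 3.25 or later, as a stand-in for SQL Server). The table is cut down to three columns, and the sample box, with positions 1 to 8 of which 3 and 7 are occupied, is hypothetical.

```python
import sqlite3


def make_db():
    """One box with positions 1..8; positions 3 and 7 hold a sample."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Position (PositionID INTEGER PRIMARY KEY, BoxID INTEGER, SampleID INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Position VALUES (?, ?, ?)",
        [(p, 1, 99 if p in (3, 7) else None) for p in range(1, 9)],
    )
    return conn


def free_runs(conn, needed):
    """Return (BoxID, run length, first PositionID, last PositionID) per free run."""
    return conn.execute(
        """
        WITH OrderedSet AS (
            SELECT BoxID,
                   PositionID,
                   ROW_NUMBER() OVER (PARTITION BY BoxID ORDER BY PositionID) AS rn
            FROM Position
            WHERE SampleID IS NULL
        )
        SELECT BoxID, COUNT(*), MIN(PositionID), MAX(PositionID)
        FROM OrderedSet
        GROUP BY BoxID, PositionID - rn   -- constant within a gap-free run
        HAVING COUNT(*) >= ?
        ORDER BY BoxID, MIN(PositionID)
        """,
        (needed,),
    ).fetchall()
```

The free positions 1, 2, 4, 5, 6, 8 get row numbers 1 to 6, so PositionID - rn is 0, 0, 1, 1, 1, 2: each unbroken run shares one value, and every occupied position in between bumps the difference by one.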