How to determine size of continuous range for given criteria? - sql

I have a positions table in SQL Server 2008R2 (definition below).
In the system, boxes contain positions.
I have a requirement to find a box that has X free positions remaining. However, the X positions must be continuous (left to right, top to bottom, i.e. ascending PositionID).
It has been simple to construct a query that finds a box with X positions free. I now have the problem of determining whether the positions are continuous.
Any suggestions on a T-SQL based solution?
Table Definition
CREATE TABLE [dbo].[Position](
[PositionID] [int] IDENTITY(1,1) NOT NULL,
[BoxID] [int] NOT NULL,
[pRow] [int] NOT NULL,
[pColumn] [int] NOT NULL,
[pRowLetter] [char](1) NOT NULL,
[pColumnLetter] [char](1) NOT NULL,
[SampleID] [int] NULL,
[ChangeReason] [nvarchar](4000) NOT NULL,
[LastUserID] [int] NOT NULL,
[TTSID] [bigint] NULL,
CONSTRAINT [PK_Position] PRIMARY KEY CLUSTERED
(
[PositionID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
Edit
http://pastebin.com/V8DLiucN - pastebin link with sample positions for 1 box (all positions empty in sample data)
Edit 2
A 'free' position is one where SampleID IS NULL.

DECLARE @AvailableSlots INT
SET @AvailableSlots = 25
;WITH OrderedSet AS (
SELECT
BoxID,
PositionID,
-- Number the free positions within each box in PositionID order
Row_Number() OVER (PARTITION BY BoxID ORDER BY PositionID) AS rn
FROM
Position
WHERE
SampleID IS NULL
)
SELECT
BoxID,
COUNT(*) AS AvailableSlots,
MIN(PositionID) AS StartingPosition,
MAX(PositionID) AS EndingPosition
FROM
OrderedSet
GROUP BY
-- Constant within each run of consecutive free positions
PositionID - rn,
BoxID
HAVING
COUNT(*) >= @AvailableSlots
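The trick is the PositionID - rn (row number) expression in the GROUP BY clause. Within a run of consecutive free positions, PositionID and rn both increase by 1 from row to row, so their difference stays constant; after a gap the difference changes. Grouping on that difference therefore groups together each continuous set, and from there it's easy to use HAVING to limit the results to the BoxIDs that have the required number of free slots. Note that this relies on the PositionID values within a box having no gaps of their own.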
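To see why grouping on PositionID - rn isolates each run, it can help to look at the intermediate values. The following is a minimal sketch on made-up data: the table variable @Position and its ten rows are assumptions for illustration, not the real table. It should run on SQL Server 2008 or later.

```sql
-- Hypothetical sample: box 1 has ten positions; 3, 4 and 8 are occupied.
DECLARE @Position TABLE (
    PositionID int PRIMARY KEY,
    BoxID      int NOT NULL,
    SampleID   int NULL
);

INSERT INTO @Position (PositionID, BoxID, SampleID)
VALUES (1, 1, NULL), (2, 1, NULL), (3, 1, 100), (4, 1, 101), (5, 1, NULL),
       (6, 1, NULL), (7, 1, NULL), (8, 1, 102), (9, 1, NULL), (10, 1, NULL);

-- Show the row number and the grouping key for every free position.
SELECT
    PositionID,
    ROW_NUMBER() OVER (PARTITION BY BoxID ORDER BY PositionID) AS rn,
    PositionID - ROW_NUMBER() OVER (PARTITION BY BoxID ORDER BY PositionID) AS grp
FROM @Position
WHERE SampleID IS NULL
ORDER BY PositionID;

-- PositionID  rn  grp
-- 1           1   0
-- 2           2   0
-- 5           3   2
-- 6           4   2
-- 7           5   2
-- 9           6   3
-- 10          7   3
```

Grouping these rows by BoxID and grp gives runs of 2, 3 and 2 positions, so a request for 3 continuous slots would match only positions 5 to 7.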

Related

RANK() SQL Server execution plan issue

What is driving SQL Server to use a less optimal execution plan for queries where 6000+ rows are returned? I need to improve query performance for the scenario where all rows are returned.
I select all fields and add a rank over the same three columns that are included in the index. Depending on the number of returned rows, the query gets one of two different execution plans, and execution takes 0.2s or 3s respectively.
From 1 row returned up to ca. 5000, the query runs fast. From 6000 rows returned up to all rows, the query runs slow.
Table1 has ca. 38000 rows. Database runs on Azure SQL v12.
Table:
CREATE TABLE [dbo].[Table1](
[ID] [int] IDENTITY(1,1) NOT NULL,
[KOD_ID] [int] NULL,
[SYM] [nvarchar](20) NULL,
[AN] [nvarchar](35) NULL,
[A] [nvarchar](10) NULL,
[B] [nvarchar](2) NULL,
[C] [datetime] NULL,
[D] [datetime] NULL,
CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
)
GO
CREATE NONCLUSTERED INDEX [IX_Table1] ON [dbo].[Table1]
(
[KOD_ID] ASC,
[SYM] ASC,
[AN] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)
GO
Queries:
SELECT TOP 6000 *, RANK() OVER(ORDER BY KOD_ID ASC, SYM ASC, AN ASC) AS Rank#
FROM [dbo].[Table1]
SELECT TOP 7000 *, RANK() OVER(ORDER BY KOD_ID ASC, SYM ASC, AN ASC) AS Rank#
FROM [dbo].[Table1]
Execution plans for both queries
CREATE NONCLUSTERED INDEX [IX_Table1] ON [dbo].[Table1]
(
[KOD_ID] ASC,
[SYM] ASC,
[AN] ASC
) INCLUDE ([A], [B], [C], [D])
WITH (DROP_EXISTING = ON); -- replaces the existing index of the same name
Create a covering index like this one. The query should then scan this index, and most likely a sort won't even be needed because the data is already sorted in the index.
The key points in your queries are:
The first plan has a key lookup. Avoid them as much as possible (a key lookup is an additional lookup into the clustered index for each row, because the nonclustered index does not contain the requested columns) by creating covering indexes with INCLUDE columns.
Avoid sort operations too; they're costly for SQL Server.
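One way to check whether an index change actually removes the key lookups and the sort is to compare the I/O and timing figures before and after the change. A minimal sketch, assuming the table and query from the question; the figures themselves depend on your data, so none are shown here:

```sql
-- Report logical reads and CPU/elapsed time in the Messages output.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- The slow variant from the question; run it before and after changing the index.
SELECT TOP 7000 *, RANK() OVER (ORDER BY KOD_ID ASC, SYM ASC, AN ASC) AS Rank#
FROM [dbo].[Table1];

SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```

A clear drop in logical reads on Table1, together with the Key Lookup and Sort operators disappearing from the actual execution plan, would suggest that the covering index is being used as intended.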
If you're alright with index rebuilds and favor reads over inserts, these could be alternative DDLs for your table, assuming that KOD_ID, SYM and AN are not nullable:
If ID is needed to ensure uniqueness:
CREATE TABLE [dbo].[Table1] (
[KOD_ID] [int] NOT NULL
, [SYM] [nvarchar](20) NOT NULL
, [AN] [nvarchar](35) NOT NULL
, [ID] [int] IDENTITY(1, 1) NOT NULL
, [A] [nvarchar](10) NULL
, [B] [nvarchar](2) NULL
, [C] [datetime2] NULL
, [D] [datetime2] NULL
, CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([KOD_ID], [SYM], [AN], [ID])
);
GO
If ID is not needed to ensure uniqueness:
CREATE TABLE [dbo].[Table1] (
[KOD_ID] [int] NOT NULL
, [SYM] [nvarchar](20) NOT NULL
, [AN] [nvarchar](35) NOT NULL
, [A] [nvarchar](10) NULL
, [B] [nvarchar](2) NULL
, [C] [datetime2] NULL
, [D] [datetime2] NULL
, CONSTRAINT [PK_Table1] PRIMARY KEY CLUSTERED ([KOD_ID], [SYM], [AN])
);
GO
Also, note that I use datetime2 instead of datetime, that's what Microsoft recommends: https://learn.microsoft.com/en-us/sql/t-sql/data-types/datetime-transact-sql
Use the time, date, datetime2 and datetimeoffset data
types for new work. These types align with the SQL Standard. They are
more portable. time, datetime2 and datetimeoffset provide
more seconds precision. datetimeoffset provides time zone support
for globally deployed applications.

Insert into (Select) without auto increment

I have two tables:
CREATE TABLE [dbo].[TABELAA]
(
[ID] [bigint] NOT NULL,
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL,
CONSTRAINT [PK_TABELAA]
PRIMARY KEY CLUSTERED ([ID] ASC)
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[TABELAB]
(
[PodatakX] [nvarchar](50) NULL,
[PodatakY] [nvarchar](50) NULL
) ON [PRIMARY]
GO
I need to insert values from TABELAB into TABELAA while auto-generating the ID in TABELAA, so I need something like the query below. It would work if there were only one row, but I'm talking about thousands of rows, where the ID should be generated exactly like an auto-increment with a step of 1.
A useless attempt, where I think I should be using OVER:
INSERT INTO TABELAA
SELECT
(SELECT MAX(id) + 1 FROM TabelaA) AS Id, *
FROM
tabelaB
You are looking for IDENTITY:
CREATE TABLE [dbo].[TABELAA](
[ID] [bigint] IDENTITY(1, 1) PRIMARY KEY, -- NOT NULL is handled by PRIMARY KEY
[PodatakA] [nvarchar](50) NULL,
[PodatakB] [nvarchar](50) NULL
);
-- ID is omitted from the column list so that IDENTITY generates it
INSERT INTO TABELAA (PodatakA, PodatakB)
SELECT PodatakX, PodatakY
FROM TABELAB;
I agree with Rahul's comment and with Gordon that, if you can modify your schema, it would make the most sense to add an identity column. However, if you cannot, you can still accomplish what you want using a couple of methods.
One method is to get the MAX ID of TABELAA and then add a ROW_NUMBER() to it, like so:
INSERT INTO TABELAA (ID, PodatakA, PodatakB)
SELECT
m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
,b.PodatakX
,b.PodatakY
FROM
TABELAB b
CROSS APPLY (
SELECT ISNULL(MAX(ID),0) as CurrentMaxId
FROM
TABELAA) m
Again, this is a workaround; the ideal solution is to specify IDENTITY.
Also, this is susceptible to problems due to simultaneous writes and other scenarios in a heavy-traffic DB.
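If adding IDENTITY is not an option, one way to reduce the risk from simultaneous writes is to take an exclusive table lock while the maximum ID is read and the new rows are inserted. A minimal sketch, assuming the TABELAA and TABELAB tables from the question already exist and that blocking other sessions for the duration of the insert is acceptable:

```sql
BEGIN TRANSACTION;

INSERT INTO dbo.TABELAA (ID, PodatakA, PodatakB)
SELECT
    m.CurrentMaxId + ROW_NUMBER() OVER (ORDER BY (SELECT NULL)),
    b.PodatakX,
    b.PodatakY
FROM dbo.TABELAB AS b
CROSS JOIN (
    -- TABLOCKX takes an exclusive table lock and HOLDLOCK keeps it until the
    -- transaction ends, so no other session can insert a competing ID meanwhile.
    SELECT ISNULL(MAX(ID), 0) AS CurrentMaxId
    FROM dbo.TABELAA WITH (TABLOCKX, HOLDLOCK)
) AS m;

COMMIT TRANSACTION;
```

The trade-off is throughput: every other reader and writer of TABELAA waits until the transaction commits, so this is only a reasonable fit for occasional bulk loads.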

Finding Nearest adjacent point Recursively

I have a dataset of all the locations my phone has been. (I got it via Google Takeout if you are interested.) The problem with the data is that at a certain point, I got a second phone. The dataset I have doesn't have any information that allows me to track data by a specific phone. So if I leave a phone at home, it shows me at two places at once. I decided to write a query that tries to find adjacent points by determining which of the previous 5 points is closest, eliminating any point that I would have had to travel faster than 150mph to reach.
The table definition for the data is here:
CREATE TABLE [dbo].[locationdata](
[ID] [bigint] IDENTITY(1,1) NOT NULL,
[t] [datetime] NULL,
[lat] [float] NULL,
[long] [float] NULL,
[accuracy] [smallint] NULL,
[activity] [varchar](14) NULL,
[confidence] [int] NULL,
[velocity] [varchar](2) NULL,
[altitude] [smallint] NULL,
[heading] [smallint] NULL,
[point] [geography] NULL,
[tag] [varchar](50) NULL,
CONSTRAINT [PK_locationdata] PRIMARY KEY CLUSTERED
(
[ID] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]
GO
The rows were inserted in time order, so the IDs line up in the correct order, with the exception that the same time can exist for multiple points.
So here is my attempt at writing this without a CURSOR. The issue is that you can't use TOP in the recursive part of a common table expression.
WITH tripdata(originid, endid, startid, speed, distance, startpoint, startt)
AS
(
SELECT originid, endid, startid, speed, distance, startpoint, startt
FROM
(
SELECT
origin.id as originid
, NULL as endid
, origin.id as startid
, NULL as distance
, NULL as speed
, point as startpoint
, t as startt
FROM locationdata origin
) a
UNION ALL
SELECT
originid as originid
, startid as endid
, l.id as startid
, origin.startpoint.STDistance(l.point) as distance
, (origin.startpoint.STDistance(l.point)/(datediff(S, origin.startt, l.t))) * -2.23694 as speed
, l.point as startpoint
, l.t as startt
FROM tripdata origin
CROSS APPLY
(
SELECT top 1
z.id
,z.point
,z.t
FROM locationdata z
where origin.startid > z.ID and origin.startid -5 < z.ID
and z.t <> origin.startt
and (origin.startpoint.STDistance(z.point)/(datediff(S, origin.startt, z.t))) * -2.23694 < 150
order by origin.startpoint.STDistance(z.point)
) l
)
SELECT *
FROM tripdata
WHERE originid = 218255
;
I am open to suggestions on how this query might be fixed or if it is even possible.

Conversion failed when converting the nvarchar value 'newValue' to data type int

I'm pretty sure that I am inserting the correct data. I have dropped all indexes, triggers and constraints to make sure nothing is interfering with my insert.
Here's the catch: the field that I'm trying to insert into previously had a datatype of int, but was later changed to nvarchar(100). Does anybody know where I should look?
This is the table structure:
CREATE TABLE [dbo].[myTable.Values](
[myTableId] [bigint] NOT NULL,
[myTableCode] [nvarchar](100) NOT NULL,
[Code] [nvarchar](100) NOT NULL,
[value] [nvarchar](250) NOT NULL,
[id] [bigint] IDENTITY(1,1) NOT NULL,
[UpdateById] [bigint] NOT NULL,
[UpdateDate] [datetime] NOT NULL,
CONSTRAINT [PK_myTable.Values] PRIMARY KEY CLUSTERED
(
[myTableCode] ASC,
[Code] ASC,
[id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
) ON [PRIMARY]
and this is the script for insert
INSERT INTO [dbo].[myTable.Values] (
[myTableId]
,[myTableCode]
,[Code]
,[value]
,[UpdateById]
,[UpdateDate]
)
VALUES (
581
,'myParentTableCODE'
,'myTableNewCode'
,'myTableNewCode'
,5197
,getdate()
)
Thanks for looking into this. I have found out that one of our views uses the WITH SCHEMABINDING clause, and that view casts the field "Code" to int. Here is the view:
CREATE VIEW [dbo].[vSomeView] WITH SCHEMABINDING
AS
SELECT
id,
CAST(Code AS INT) AS [Code],
myTableCode
FROM dbo.[myTable.Values]
WHERE myTableCode LIKE '%ABC'
AND ISNUMERIC(Code) = 1
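If the view is indeed the source of the error, one possible way to make the cast safe is to guard it with a CASE expression, because SQL Server does not promise to apply the WHERE filter before it evaluates the CAST. A minimal sketch on made-up data: the table variable @Values and its rows are assumptions for illustration. On SQL Server 2012 or later, TRY_CAST(Code AS int) would be a shorter alternative.

```sql
-- Hypothetical stand-in for dbo.[myTable.Values] with only the relevant columns.
DECLARE @Values TABLE (
    id   int PRIMARY KEY,
    Code nvarchar(100) NOT NULL
);

INSERT INTO @Values (id, Code)
VALUES (1, N'123'), (2, N'myTableNewCode'), (3, N'45');

SELECT
    id,
    -- Cast only strings of 1 to 9 digits, which always fit in an int;
    -- anything else yields NULL instead of a conversion error.
    CASE WHEN Code NOT LIKE N'%[^0-9]%' AND LEN(Code) BETWEEN 1 AND 9
         THEN CAST(Code AS int)
    END AS Code
FROM @Values
ORDER BY id;

-- id  Code
-- 1   123
-- 2   NULL
-- 3   45
```

The digits-only test is stricter than ISNUMERIC, which also accepts values such as '1e5' or '$' that cannot be cast to int.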
It seems your insert code is not complete: you need to add another field, IsDefaultObsolete, and assign a value to it. Then try this code for the insert:
INSERT INTO [dbo].[myTable.Values] (
[myTableId]
,[myTableCode]
,[Code]
,[value]
,[IsDefaultObsolete]
,[UpdateById]
,[UpdateDate]
)
VALUES (
581
,'myParentTableCODE'
,'myTableNewCode'
,'myTableNewCode'
,0 -- this may be 0 or 1, depending on your needs
,5197
,getdate()
)

How to update a column via Row_Number with a different value for each row?

I have this table right now
CREATE TABLE [dbo].[DatosLegales](
[IdCliente] [int] NOT NULL,
[IdDatoLegal] [int] NULL,
[Nombre] [varchar](max) NULL,
[RFC] [varchar](13) NULL,
[CURP] [varchar](20) NULL,
[IMSS] [varchar](20) NULL,
[Calle] [varchar](100) NULL,
[Numero] [varchar](10) NULL,
[Colonia] [varchar](100) NULL,
[Pais] [varchar](50) NULL,
[Estado] [varchar](50) NULL,
[Ciudad] [varchar](50) NULL,
[CodigoPostal] [varchar](10) NULL,
[Telefono] [varchar](13) NULL,
[TipoEmpresa] [varchar](20) NULL,
[Tipo] [varchar](20) NULL,
CONSTRAINT [PK_DatosLegales] PRIMARY KEY CLUSTERED
(
[IdCliente] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]
)
I need to update the IdDatoLegal column. Right now I have 80 rows in that table, so I need to update each row with the numbers 1, 2, 3... 79, 80.
I have tried everything from simple queries to stored procedures, with no success at all.
I have this stored procedure right now:
ALTER PROCEDURE dbo.ActualizarDatosLegales
@RowCount int
AS
DECLARE @Inicio int
SET @Inicio = 0
WHILE @Inicio < @@RowCount
SET @Inicio += 1;
BEGIN
UPDATE DatosLegales SET IdDatoLegal = @Inicio WHERE (SELECT ROW_NUMBER() OVER (ORDER BY IdCliente) AS RowNum FROM DatosLegales) = @Inicio;
END
It returns this message when I run it
Subquery returned more than 1 value. This is not permitted when the subquery follows =, !=, <, <= , >, >= or when the subquery is used as an expression.
I guess that's because the subquery (SELECT ROW_NUMBER() OVER (ORDER BY IdCliente) AS RowNum FROM DatosLegales) returns 80 rows where it should only return one (but each time it should be a different number).
Do you know what I have to add to the subquery to make it work? And, above all, are the loop and the rest of the procedure right?
Thanks in advance.
You can update all the rows in one statement using a CTE as below.
;WITH T
AS (SELECT IdDatoLegal,
Row_number() OVER (ORDER BY IdCliente ) AS RN
FROM dbo.DatosLegales)
UPDATE T
SET IdDatoLegal = RN
Another option is to join the table to a derived table that carries the row numbers:
UPDATE D
SET IdDatoLegal = RN
FROM DatosLegales D JOIN
(
SELECT IdCliente, Row_number() OVER (ORDER BY IdCliente) AS RN
FROM DatosLegales
) Temp
ON D.IdCliente = Temp.IdCliente
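To check the single-statement approach without touching the real table, it can be tried on a small stand-in first. A minimal sketch: the table variable @DatosLegales and its four IdCliente values are made up for illustration, and only the two relevant columns are included.

```sql
-- Hypothetical stand-in for dbo.DatosLegales.
DECLARE @DatosLegales TABLE (
    IdCliente   int PRIMARY KEY,
    IdDatoLegal int NULL
);

INSERT INTO @DatosLegales (IdCliente) VALUES (17), (4), (42), (8);

-- Number the rows by IdCliente and write that number back through the CTE.
;WITH T AS (
    SELECT IdDatoLegal,
           ROW_NUMBER() OVER (ORDER BY IdCliente) AS RN
    FROM @DatosLegales
)
UPDATE T
SET IdDatoLegal = RN;

SELECT IdCliente, IdDatoLegal
FROM @DatosLegales
ORDER BY IdCliente;

-- IdCliente  IdDatoLegal
-- 4          1
-- 8          2
-- 17         3
-- 42         4
```

Because the whole table is numbered in one set-based statement, no loop or row count parameter is needed.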