select the value from the nth position above - sql

I need to calculate something by taking the corresponding value from a field (let's say abc256), subtracting from it the value in the same column 249 rows above (let's say abc7), and then multiplying the result by 100 into a new column in a temporary table (only for displaying the output).
How do I reach the value 249 rows above the current one?
I have already ordered the list as it should be, by 2 columns ascending.
So the query which orders my list looks like:
select [rN] ,[rD],[r],[rId]
from [someName].[dbo].[some_table]
where rN like '%bla%'
and rD >= 'yyyy-mm-dd'
EDIT: order by rD asc, rId asc
Pseudocode of what I need:
(case when rN like 'something' then (r.value - [value of r from 249 rows above]) * 100 end) as newSomething
FROM [someName].[dbo].[some_table]
Then I tried:
select [rN] ,[rD],[r],[rId]
from select (ROW_NUMBER() over (order by key ASC) AS rownumber,
r)
from
[someName].[dbo].[some_table]
where rownumber = r -249
where rN like '%bla%'
and rD >= 'yyy-mm-dd'
I should mention that I need to do this repetitively (after each 249 rows I am calculating using the current value minus the value from 249 rows above), and I will have 12 cases for rN like 'something1' ... 'something12'.
How can I get this to work?
Thanks

You will have to adapt this to your specific needs, but whenever I am posed with the "compare a column to the nth one before it" problem, I revert to CTE expressions and then do a self-join. I've thrown together a quick example to get you started. I hope you find it useful.
-- DROP TABLE IF EXISTS
IF OBJECT_ID('Table_1') IS NOT NULL
DROP TABLE Table_1
GO
-- CREATE A TABLE FOR TESTING
CREATE TABLE [dbo].[Table_1](
[Id] [int] IDENTITY(1,1) NOT NULL,
[Num1] [int] NULL,
[Num2] [int] NULL
) ON [PRIMARY]
GO
-- FILL THE TABLE WITH VALUES
DECLARE @cnt INT; SET @cnt = 0
WHILE @cnt <= 10000
BEGIN
SET @cnt = @cnt + 1
INSERT INTO Table_1 (Num1, Num2) VALUES (@cnt, @cnt * 1000)
END
GO
-- DO THE SELECT
; With RowedTables AS (
SELECT
ROW_NUMBER() OVER (ORDER BY Id ASC) As R,
T1.*
FROM Table_1 T1)
SELECT
RT1.Num1 - RT2.Num1 AS SomeMath,
RT1.*,
RT2.*
FROM RowedTables RT1 JOIN RowedTables RT2 ON RT1.R = RT2.R - 10
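As a side note, if you are on SQL Server 2012 or newer, LAG can look back a fixed number of rows directly, without the self-join. A minimal sketch using the column names from the question (the PARTITION BY rN assumes each of the twelve 'something' groups should be offset independently; adjust as needed):
;WITH Ordered AS (
SELECT [rN], [rD], [r], [rId],
-- value of [r] from 249 rows above, within the same rN group
LAG([r], 249) OVER (PARTITION BY [rN] ORDER BY [rD] ASC, [rId] ASC) AS r_249_above
FROM [someName].[dbo].[some_table]
WHERE rN LIKE '%bla%'
AND rD >= 'yyyy-mm-dd'
)
SELECT *,
([r] - r_249_above) * 100 AS newSomething
FROM Ordered;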

Related

Adding Random Id for each unique value in table

I have a table like:
ID RANDOM_ID
1 123
10 456
25 789
1 1112
55 1314
10 1516
I want the result to be like:
ID RANDOM_ID
1 123
10 456
25 789
1 123
55 1314
10 456
The same ID should have the same RANDOM_ID. I'm using an update statement to generate the RANDOM_IDs after creating the table.
CREATE TABLE [RANDOMID_TABLE]([ID] [int] NULL, [RANDOM_ID] [int] NULL)
GO
INSERT INTO [RANDOMID_TABLE] ([ID])
select distinct ABC_ID from RANDOMID_ABC
GO
This is the update statement for the RANDOM_ID column in the [RANDOMID_TABLE] table:
UPDATE [RANDOMID_TABLE]
SET RANDOM_ID = abs(checksum(NewId()) % 1000000)
Is there something else that I need to add to the update statement?
Please advise.
Why would you use update for this? Just generate the values when you insert them:
insert into [RANDOMID_TABLE] (ID, RANDOM_ID)
select ABC_ID, abs(checksum(NewId()) % 1000000)
from RANDOMID_ABC
group by ABC_ID;
EDIT:
If your problem is collisions, then fix how you do the assignment. Just assign a number . . . randomly:
insert into [RANDOMID_TABLE] (ID, RANDOM_ID)
select ABC_ID, row_number() over (order by newid())
from RANDOMID_ABC
group by ABC_ID;
This is guaranteed to not return duplicates.
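If you want to double-check afterwards, a quick sanity query along these lines (a sketch against the table name from the question) should come back empty:
select RANDOM_ID, count(*) as cnt
from [RANDOMID_TABLE]
group by RANDOM_ID
having count(*) > 1;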
At a total guess, are you simply wanting to UPDATE the table so that all rows with a specific ID have the same value for Random_ID? Like this?
CREATE TABLE YourTable (ID int, Random_ID int);
INSERT INTO YourTable
VALUES(1 ,123),
(10,456),
(25,789),
(1 ,1112),
(55,1314),
(10,1516);
GO
WITH CTE AS(
SELECT ID,
Random_ID,
MIN(Random_ID) OVER (PARTITION BY ID) AS Min_Random_ID
FROM YourTable)
UPDATE CTE
SET Random_ID = Min_Random_ID;
GO
SELECT *
FROM YourTable;
GO
DROP TABLE YourTable;
Here is the script you need, using a temporary table (you need it to persist your random results for each unique ID):
DECLARE @Tbl TABLE (ID INT, RANDOM_ID INT)
INSERT @Tbl (Id) VALUES(1), (10), (25), (1), (55), (10)
SELECT Id, abs(checksum(NewId()) % 1000000) AS Random_Id INTO #distinctData FROM @Tbl GROUP BY Id
SELECT D.* FROM @Tbl T JOIN #distinctData D ON D.ID = T.ID
DROP TABLE #distinctData
Obviously, you don't need the first two lines, where I create and initialize the data table.
Result:
Id Random_Id
1 354317
1 62026
10 532304
10 604768
25 874209
55 718643
You want one random value per ID. So one should think that the following would work:
with ids as
(
select distinct id
from randomid_table
)
, ids_with_rnd as
(
select id, abs(checksum(NewId()) % 1000000) as rnd
from ids
)
update randomid_table
set random_id =
(
select rnd
from ids_with_rnd
where ids_with_rnd.id = randomid_table.id
);
It doesn't, however. SQL Server is somewhat buggy here and still creates different numbers for the same ID.
So, your best bet may be: do your update that does create different values (your original update statement). Then correct the data as follows:
update randomid_table
set random_id =
(
select min(random_id)
from randomid_table rt2
where rt2.id = randomid_table.id
);
Demo: https://dbfiddle.uk/?rdbms=sqlserver_2017&fiddle=504236db66fba0f12dc7e407a51451f8

Delete old entries from table in MSSQL

A table in my MSSQL db is getting over 500 MB. I would like to delete the oldest x entries so the table size is only 100 MB. This should be a task which runs once a week. How could I do this?
Example:
Table before deleting the old entries:
Table after deleting the old entries:
You can use DATALENGTH to get the size of the data in a particular column. With a window function, you can sum up a running total of DATALENGTH values. Then you can delete all records in a table that push you past a desired max table size. Here's an example:
-- Sample table with a VARBINARY(MAX) column
CREATE TABLE tmp (id INT IDENTITY(1,1) PRIMARY KEY, col VARBINARY(MAX))
-- Insert sample data - 20 bytes per row.
;WITH cte AS
(
SELECT 1 AS rn, HASHBYTES('sha1', 'somerandomtext') t
UNION all
SELECT rn + 1, HASHBYTES('sha1', 'somerandomtext')
FROM cte
WHERE rn < 5000
)
INSERT INTO tmp (col)
SELECT t FROM cte
OPTION (maxrecursion 0)
-- @total_bytes is the desired size of the table after the delete statement
DECLARE @total_bytes int = 200
-- Use the SUM window function to get a running total of the DATALENGTH
-- of the VARBINARY field, and delete when the running total exceeds
-- the desired table size.
-- You can order the window function however you want to delete rows
-- in the correct sequence.
DELETE t
FROM tmp t
INNER JOIN
(
SELECT id, SUM(DATALENGTH(col)) OVER (ORDER BY id) total_size
FROM tmp
)sq ON t.id = sq.id AND sq.total_size > @total_bytes
Now check what's left in tmp: 10 rows, and the total size of the "col" column matches the 200-byte size specified in the @total_bytes variable.
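For instance, one quick way to verify:
SELECT COUNT(*) AS rows_left,
SUM(DATALENGTH(col)) AS total_bytes
FROM tmp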
UPDATE: Here's an example using your sample data:
CREATE TABLE tmp (id INT PRIMARY KEY, contact VARCHAR(100), country VARCHAR(25))
GO
INSERT INTO tmp VALUES
(1, 'Maria Anders', 'Germany'),
(2, 'Francisco Chang', 'Mexico'),
(3, 'Roland Mendel', 'Austria'),
(4, 'Helen Bennett', 'UK'),
(5, 'Yoshi Tannamuri', 'Canada'),
(6, 'Giovanni Rovelli', 'Italy')
GO
-- @total_bytes is the desired size of the table after the delete statement
DECLARE @total_bytes INT = 40
-- Use the SUM window function to get a running total of the DATALENGTH
-- of the VARBINARY field, and delete when the running total exceeds
-- the desired table size.
-- You can order the window function however you want to delete rows
-- in the correct sequence.
DELETE t
FROM tmp t
INNER JOIN
(
SELECT id, SUM(DATALENGTH(contact)) OVER (ORDER BY id)
+ SUM(DATALENGTH(country)) OVER (ORDER BY id) total_size
FROM tmp
)sq ON t.id = sq.id AND sq.total_size > @total_bytes
SELECT * FROM tmp -- 2 rows left!
DELETE FROM TABLE_NAME WHERE date_column < '2018-01-01';
This will delete all data entered before January 2018.
If you want to delete the data from the last 7 days:
delete from table_name WHERE date_column >= DATEADD(day, -7, GETDATE())

with clause execution order

In a simplified scenario I have a table T that looks something like:
Key Value
1 NULL
1 NULL
1 NULL
2 NULL
2 NULL
3 NULL
3 NULL
I also have a very time-consuming function Foo(Key) which must be considered as a black box (I must use it, I can't change it).
I want to update table T but in a more efficient way than
UPDATE T SET Value = dbo.Foo(Key)
Basically, I want to execute Foo only once for each Key.
I tried something like
WITH Tmp1 AS
(
SELECT DISTINCT Key FROM T
)
, Tmp2 AS
(
SELECT Key, Foo(Key) Value FROM Tmp1
)
UPDATE T
SET T.Value = Tmp2.Value
FROM T JOIN Tmp2 ON T.Key = Tmp2.Key
but unexpectedly the computing time doesn't change at all, because SQL Server seems to run Foo again for every row.
Any idea to solve this without other temporary tables?
One method is to use a temporary table. You don't have much control over how SQL Server decides to optimize its queries.
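For reference, a sketch of that temporary-table route, assuming the Key/Value column names from the question ([Key] is bracketed because KEY is a reserved word, and int is assumed for the Value column):
-- materialize one row per distinct key, then run Foo once per key
select distinct [Key], cast(null as int) as FooValue
into #keys
from t;
update #keys
set FooValue = dbo.foo([Key]);
-- copy the per-key results back to the base table
update t
set t.Value = k.FooValue
from t
join #keys k on k.[Key] = t.[Key];
drop table #keys;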
If you don't want a temporary table, you could do two updates:
with toupdate as (
select t.*, row_number() over (partition by [key] order by [key]) as seqnum
from t
)
update toupdate
set value = dbo.foo([key])
where seqnum = 1;
Then you can run a similar update again:
with toupdate as (
select t.*, max(value) over (partition by [key]) as keyvalue
from t
)
update toupdate
set value = keyvalue
where value is null;
You might try it like this:
CREATE FUNCTION dbo.Foo(@TheKey INT)
RETURNS INT
AS
BEGIN
RETURN (SELECT @TheKey*2);
END
GO
CREATE TABLE #tbl(MyKey INT,MyValue INT);
INSERT INTO #tbl(MyKey) VALUES(1),(1),(1),(2),(2),(3),(3),(3);
SELECT * FROM #tbl;
DECLARE @tbl2 TABLE(MyKey INT, TheFooValue INT);
WITH DistinctKeys AS
(
SELECT DISTINCT MyKey FROM #tbl
)
INSERT INTO @tbl2
SELECT MyKey, dbo.Foo(MyKey) TheFooValue
FROM DistinctKeys;
UPDATE #tbl SET MyValue=TheFooValue
FROM #tbl
INNER JOIN @tbl2 AS tbl2 ON #tbl.MyKey=tbl2.MyKey;
SELECT * FROM @tbl2;
SELECT * FROM #tbl;
GO
DROP TABLE #tbl;
DROP FUNCTION dbo.Foo;

Sql Select 0 1 0 1 sequence from database table

I need to fetch output from the table given below in a
0
1
0
1
sequence, with all the other data at the end of the result.
create table #Temp
(
EventID INT IDENTITY(1, 1) primary key ,
Value bit
)
INSERT INTO #Temp(Value) Values
(0),(1)
,(0),(0)
,(0),(0)
,(0),(0)
,(1),(1)
,(1),(1)
,(1),(0)
,(0),(0)
,(1),(1)
,(0),(1)
This is probably what you want:
select *
from #Temp
order by row_number() over (partition by Value order by EventID), Value

Left join with nearest value without duplicates

I want to achieve in MS SQL something like below, using 2 tables and through join instead of iteration.
From table A, I want each row to identify which value in table B is its nearest match, and once a value has been selected, that value cannot be re-used. Please help if you've done something like this before. Thank you in advance! #SOreadyToAsk
Below is a set-based solution using CTEs and windowing functions.
The ranked_matches CTE assigns a closest match rank for each row in TableA along with a closest match rank for each row in TableB, using the index value as a tie breaker.
The best_matches CTE returns rows from ranked_matches that have the best rank (rank value 1) for both rankings.
Finally, the outer query uses a LEFT JOIN from TableA to the best_matches CTE to include the TableA rows that were not assigned a best match because the closest match was already assigned.
Note that this does not return a match for the index 3 TableA row indicated in your sample results. The closest match for this row is TableB index 3, a difference of 83. However, that TableB row is a closer match to the TableA index 2 row (a difference of 14), so it was already assigned. Please clarify your question if this isn't what you want. I think this technique can be tweaked accordingly.
CREATE TABLE dbo.TableA(
[index] int NOT NULL
CONSTRAINT PK_TableA PRIMARY KEY
, value int
);
CREATE TABLE dbo.TableB(
[index] int NOT NULL
CONSTRAINT PK_TableB PRIMARY KEY
, value int
);
INSERT INTO dbo.TableA
( [index], value )
VALUES ( 1, 123 ),
( 2, 245 ),
( 3, 342 ),
( 4, 456 ),
( 5, 608 );
INSERT INTO dbo.TableB
( [index], value )
VALUES ( 1, 152 ),
( 2, 159 ),
( 3, 259 );
WITH
ranked_matches AS (
SELECT
a.[index] AS a_index
, a.value AS a_value
, b.[index] b_index
, b.value AS b_value
, RANK() OVER(PARTITION BY a.[index] ORDER BY ABS(a.Value - b.value), b.[index]) AS a_match_rank
, RANK() OVER(PARTITION BY b.[index] ORDER BY ABS(a.Value - b.value), a.[index]) AS b_match_rank
FROM dbo.TableA AS a
CROSS JOIN dbo.TableB AS b
)
, best_matches AS (
SELECT
a_index
, a_value
, b_index
, b_value
FROM ranked_matches
WHERE
a_match_rank = 1
AND b_match_rank= 1
)
SELECT
TableA.[index] AS a_index
, TableA.value AS a_value
, best_matches.b_index
, best_matches.b_value
FROM dbo.TableA
LEFT JOIN best_matches ON
best_matches.a_index = TableA.[index]
ORDER BY
TableA.[index];
EDIT:
Although this method uses CTEs, recursion is not used and is therefore not limited to 32K recursions. There may be room for improvement here from a performance perspective, though.
I don't think it is possible without a cursor.
Even if it is possible to do it without a cursor, it would definitely require self-joins, maybe more than once. As a result, performance is likely to be poor, probably worse than a straightforward cursor. And it is likely that the logic would be hard to understand and maintain later. Sometimes cursors are useful.
The main difficulty is this part of the question:
when value has been selected, that value cannot re-used.
There was a similar question just few days ago.
The logic is straightforward. The cursor loops through all rows of table A and with each iteration adds one row to the destination table. To determine the value to add, I use the EXCEPT operator, which takes all values from table B and removes from them all values that have been used before. My solution assumes that there are no duplicate values in table B; the EXCEPT operator removes duplicates. If values in table B are not unique, then the destination table would hold the unique indexB instead of valueB, but the main logic remains the same.
Here is SQL Fiddle.
Sample data
DECLARE @TA TABLE (idx int, value int);
INSERT INTO @TA (idx, value) VALUES
(1, 123),
(2, 245),
(3, 342),
(4, 456),
(5, 608);
DECLARE @TB TABLE (idx int, value int);
INSERT INTO @TB (idx, value) VALUES
(1, 152),
(2, 159),
(3, 259);
The main query inserts the result into the table variable @TDst. It is possible to write that INSERT without using the explicit variable @CurrValueB, but it looks a bit cleaner with the variable.
DECLARE @TDst TABLE (idx int, valueA int, valueB int);
DECLARE @CurrIdx int;
DECLARE @CurrValueA int;
DECLARE @CurrValueB int;
DECLARE @iFS int;
DECLARE @VarCursor CURSOR;
SET @VarCursor = CURSOR FAST_FORWARD
FOR
SELECT idx, value
FROM @TA
ORDER BY idx;
OPEN @VarCursor;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
WHILE @iFS = 0
BEGIN
SET @CurrValueB =
(
SELECT TOP(1) Diff.valueB
FROM
(
SELECT B.value AS valueB
FROM @TB AS B
EXCEPT -- remove values that have been selected before
SELECT Dst.valueB
FROM @TDst AS Dst
) AS Diff
ORDER BY ABS(Diff.valueB - @CurrValueA)
);
INSERT INTO @TDst (idx, valueA, valueB)
VALUES (@CurrIdx, @CurrValueA, @CurrValueB);
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
END;
CLOSE @VarCursor;
DEALLOCATE @VarCursor;
SELECT * FROM @TDst ORDER BY idx;
Result
idx valueA valueB
1 123 152
2 245 259
3 342 159
4 456 NULL
5 608 NULL
It would help to have the following indexes (see the DDL sketch below):
TableA - (idx) INCLUDE (value), because we SELECT idx, value ... ORDER BY idx;
TableB - (value) unique;
the destination table - (valueB) unique, filtered NOT NULL, to help the EXCEPT.
So, it may be better to have a temporary #table (or a permanent table) for the result instead of a table variable, because table variables can't have indexes.
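Translated into DDL, those suggestions would look roughly like this (a sketch that assumes the data lives in real tables dbo.TableA / dbo.TableB with idx/value columns, and that the result goes into a temp table #TDst instead of the table variable used above):
CREATE INDEX IX_TableA_idx ON dbo.TableA (idx) INCLUDE (value);
CREATE UNIQUE INDEX IX_TableB_value ON dbo.TableB (value);
-- filtered unique index on the destination, to speed up the EXCEPT
CREATE UNIQUE INDEX IX_TDst_valueB ON #TDst (valueB) WHERE valueB IS NOT NULL;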
Another possible method would be to delete a row from table B (from the original or from a copy) as its value is inserted into the result. In this method we avoid performing the EXCEPT again and again, and it could be faster overall, especially if it is OK to leave table B empty in the end. Still, I don't see how to avoid a cursor and processing individual rows in sequence.
SQL Fiddle
DECLARE @TDst TABLE (idx int, valueA int, valueB int);
DECLARE @CurrIdx int;
DECLARE @CurrValueA int;
DECLARE @iFS int;
DECLARE @VarCursor CURSOR;
SET @VarCursor = CURSOR FAST_FORWARD
FOR
SELECT idx, value
FROM @TA
ORDER BY idx;
OPEN @VarCursor;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
WHILE @iFS = 0
BEGIN
WITH
CTE
AS
(
SELECT TOP(1) B.idx, B.value
FROM @TB AS B
ORDER BY ABS(B.value - @CurrValueA)
)
DELETE FROM CTE
OUTPUT @CurrIdx, @CurrValueA, deleted.value INTO @TDst;
FETCH NEXT FROM @VarCursor INTO @CurrIdx, @CurrValueA;
SET @iFS = @@FETCH_STATUS;
END;
CLOSE @VarCursor;
DEALLOCATE @VarCursor;
SELECT
A.idx
,A.value AS valueA
,Dst.valueB
FROM
@TA AS A
LEFT JOIN @TDst AS Dst ON Dst.idx = A.idx
ORDER BY idx;
I firmly believe THIS IS NOT GOOD PRACTICE, because I am bypassing SQL Server's own rule that side-effecting statements (INSERT, UPDATE, DELETE) are not allowed inside functions, but since I wanted to solve this without resorting to iteration, I came up with this and it gave me a better view of things.
create table tablea
(
num INT,
val MONEY
)
create table tableb
(
num INT,
val MONEY
)
I created a hard temp table which I shall drop from time to time.
if((select 1 from sys.tables where name = 'temp_tableb') is not null) begin drop table temp_tableb end
select * into temp_tableb from tableb
I created a function that executes xp_cmdshell (this is where the side-effect bypassing happens):
CREATE FUNCTION [dbo].[GetNearestMatch]
(
@ParamValue MONEY
)
RETURNS MONEY
AS
BEGIN
DECLARE @ReturnNum MONEY
, @ID INT
SELECT TOP 1
@ID = num
, @ReturnNum = val
FROM temp_tableb ORDER BY ABS(val - @ParamValue)
DECLARE @SQL varchar(500)
SELECT @SQL = 'osql -S' + @@servername + ' -E -q "delete from test..temp_tableb where num = ' + CONVERT(NVARCHAR(150), @ID) + ' "'
EXEC master..xp_cmdshell @SQL
RETURN @ReturnNum
END
And my usage in my query simply looks like this:
-- initialize temp
if((select 1 from sys.tables where name = 'temp_tableb') is not null) begin drop table temp_tableb end
select * into temp_tableb from tableb
-- query nearest match
select
*
, dbo.GetNearestMatch(a.val) AS [NearestValue]
from tablea a
and gave me this..