Example table:
Sample
------
id (primary key)
secondaryId (optional secondary key)
crtdate (created date)
... (other fields)
Some users use secondaryId for identifying rows (i.e., it should be unique).
When the rows were created, secondaryId did not get a value and was defaulted to 0.
Subsequently, rows were given a secondaryId value as they were used.
I need to update all rows with value 0 to be the next available number.
Desired result (with simplified values):

From:              To:
id  secondaryId    id  secondaryId
1   0              1   7    // this is the max(secondaryId)+1
2   0              2   8    // incrementing as we fill in the zeroes
3   5              3   5    // already assigned
4   0              4   9    // keep incrementing...
5   6              5   6    // already assigned
This query would accomplish what I want to do; but alas, CTE + UPDATE is not supported:
with temp(id, rownumber) as (
    select
        id,
        row_number() over (partition by secondaryId order by crtdate) + 6  -- 6 is max(secondaryId)
    from Sample
    where secondaryId = 0
)
update Sample set secondaryId = temp.rownumber where Sample.id = temp.id
Does anyone have suggestions for a different way to approach this problem? I'm now suffering from tunnel vision...
You can use a MERGE statement, since id is the primary key and there won't be any duplicates.
MERGE INTO Sample AS trgt
USING (
    SELECT id,
           ROW_NUMBER() OVER (PARTITION BY secondaryId ORDER BY crtdate) + 6 AS secondaryId
           -- 6 is the current max(secondaryId)
    FROM Sample
    WHERE secondaryId = 0
) AS src
ON (src.id = trgt.id)
WHEN MATCHED THEN UPDATE SET trgt.secondaryId = src.secondaryId;
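If you would rather not hard-code the current maximum (6), the same MERGE could compute it inline, borrowing the MAX(secondaryId) lookup used in the temporary-table answer further down. A sketch against the table above, not tested:

MERGE INTO Sample AS trgt
USING (
    SELECT id,
           ROW_NUMBER() OVER (ORDER BY crtdate)
             + (SELECT COALESCE(MAX(secondaryId), 0) FROM Sample) AS secondaryId
    FROM Sample
    WHERE secondaryId = 0
) AS src
ON (src.id = trgt.id)
WHEN MATCHED THEN UPDATE SET trgt.secondaryId = src.secondaryId;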
You can also UPDATE a SELECT (well "fullselect"), which is probably the neatest solution here
create table sample (
id int not null primary key
, secondaryId int not null
, crtdate date not null
)
;
INSERT INTO sample VALUES
( 1 , 0 , current_date)
,( 2 , 0 , current_date)
,( 3 , 5 , current_date)
,( 4 , 0 , current_date)
,( 5 , 6 , current_date)
;
UPDATE (
    SELECT id, secondaryId
         , ROW_NUMBER() OVER (ORDER BY id)
           + (SELECT MAX(secondaryId) FROM sample) AS new_secondaryId
    FROM
        sample s
    WHERE secondaryId = 0
)
SET secondaryId = new_secondaryId
;
https://www.ibm.com/support/knowledgecenter/en/SSEPGG_11.1.0/com.ibm.db2.luw.sql.ref.doc/doc/r0001022.html
For a temporary-table method (no need to know the max value in advance), you can try this:
create table tmptable as (
select f1.id, row_number() over(order by f1.crtdate) + ifnull((select max(f2.secondaryId) from Sample f2), 0) newsecondary
from Sample f1 where f1.secondaryId='0'
) with data;
update Sample f1
set f1.secondaryId=(
select f2.newsecondary
from tmptable f2
where f2.id=f1.id
)
where exists
(
select * from tmptable f2
where f2.id=f1.id
);
drop table tmptable;
I have a table named "ROSTER" and in this table I have 22 columns.
I want to query and compare any 2 rows of that table to check whether each column's values in those 2 rows are exactly the same. The ID column always has different values in each row, so I will not include it in the comparison; I will just use it to refer to which rows are being compared.
If all column values are the same: either display nothing (I prefer this one) or just return the 2 rows as they are.
If some column values are not the same: either display those column names only, or display both the column name and its value (I prefer this one).
Example:
ROSTER Table:

ID  NAME  TIME
1   N1    0900
2   N1    0801

Output:

ID  TIME
1   0900
2   0801
OR
Display "TIME"
Note: Actually I'm okay with whatever result or way of output as long as I can know in any way that the 2 rows are not the same.
What are the possible ways to do this in SQL Server?
I am using Microsoft SQL Server Management Studio 18, Microsoft SQL Server 2019-15.0.2080.9
Please try the following solution based on the ideas of John Cappelletti. All credit goes to him.
SQL
-- DDL and sample data population, start
DECLARE @roster TABLE (ID INT PRIMARY KEY, NAME VARCHAR(10), TIME CHAR(4));
INSERT INTO @roster (ID, NAME, TIME) VALUES
(1,'N1','0900'),
(2,'N1','0801')
-- DDL and sample data population, end
DECLARE @source INT = 1
    , @target INT = 2;
SELECT id AS source_id, @target AS target_id
,[key] AS [column]
,source_Value = MAX( CASE WHEN Src=1 THEN Value END)
,target_Value = MAX( CASE WHEN Src=2 THEN Value END)
FROM (
SELECT Src=1
,id
,B.*
FROM @roster AS A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
WHERE id=@source
UNION ALL
SELECT Src=2
,id = @source
,B.*
FROM @roster AS A
CROSS APPLY ( SELECT [Key]
,Value
FROM OpenJson( (SELECT A.* For JSON Path,Without_Array_Wrapper,INCLUDE_NULL_VALUES))
) AS B
WHERE id=@target
) AS A
GROUP BY id, [key]
HAVING MAX(CASE WHEN Src=1 THEN Value END)
<> MAX(CASE WHEN Src=2 THEN Value END)
AND [key] <> 'ID' -- exclude this PK column
ORDER BY id, [key];
Output
+-----------+-----------+--------+--------------+--------------+
| source_id | target_id | column | source_Value | target_Value |
+-----------+-----------+--------+--------------+--------------+
| 1 | 2 | TIME | 0900 | 0801 |
+-----------+-----------+--------+--------------+--------------+
A general approach here might be to just aggregate over the entire table and report the state of the counts:
SELECT
    CASE WHEN COUNT(DISTINCT ID)   = 1 THEN 'Yes' ELSE 'No' END AS [ID same],
    CASE WHEN COUNT(DISTINCT NAME) = 1 THEN 'Yes' ELSE 'No' END AS [NAME same],
    CASE WHEN COUNT(DISTINCT TIME) = 1 THEN 'Yes' ELSE 'No' END AS [TIME same]
FROM yourTable;
I have a table as shown in the screenshot (first two columns) and I need to create a column like the last one. I'm trying to calculate the length of each sequence of consecutive values for each id.
For this, the last column is required. I played around with
row_number() over (partition by id, value)
but did not have much success, since the circled number was (quite predictably) computed as 2 instead of 1.
Please help!
First of all, we need a way to define how the rows are ordered. For example, in your sample data there is no way to be sure that the 'first' row (1, 1) will always be displayed before the 'second' row (1, 0).
That's why in my sample data I have added an identity column. In your real case, the rows can be ordered by a row ID, a date column or something else, but you need to ensure they can be sorted by some unique criterion.
So, the task is pretty simple:
calculate a trigger switch - a flag that is set when the value changes
calculate groups - a running sum of those switches
calculate row numbers within each group
That's it. I have used common table expressions and left all the columns in so the logic is easy to follow. You are free to break this into separate statements and remove some of the columns.
DECLARE @DataSource TABLE
(
    [RowID] INT IDENTITY(1, 1)
   ,[ID] INT
   ,[value] INT
);
INSERT INTO @DataSource ([ID], [value])
VALUES (1, 1)
,(1, 0)
,(1, 0)
,(1, 1)
,(1, 1)
,(1, 1)
--
,(2, 0)
,(2, 1)
,(2, 0)
,(2, 0);
WITH DataSourceWithSwitch AS
(
SELECT *
,IIF(LAG([value]) OVER (PARTITION BY [ID] ORDER BY [RowID]) = [value], 0, 1) AS [Switch]
FROM @DataSource
), DataSourceWithGroup AS
(
SELECT *
,SUM([Switch]) OVER (PARTITION BY [ID] ORDER BY [RowID]) AS [Group]
FROM DataSourceWithSwitch
)
SELECT *
,ROW_NUMBER() OVER (PARTITION BY [ID], [Group] ORDER BY [RowID]) AS [GroupRowID]
FROM DataSourceWithGroup
ORDER BY [RowID];
You want results that depend on the actual ordering of the data in the data source. In SQL you operate on relations, sometimes on ordered sets of rows. Your desired end result is not well-defined in terms of SQL unless you introduce an additional column in your source table over which your data is ordered (e.g. an auto-increment or a timestamp column).
Note: this answers the original question and doesn't take into account the additional timestamp column mentioned in the comment. I'm not updating my answer since there is already an accepted answer.
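As a rough illustration of what such an ordering column could look like (T-SQL; the table and column names here are placeholders, not from the question):

-- Hypothetical sketch: add an auto-increment column so that row order
-- becomes an explicit, queryable attribute of the data.
ALTER TABLE dbo.YourSource
    ADD RowID INT IDENTITY(1, 1) NOT NULL;

-- Note: values assigned to rows that already exist are not guaranteed to
-- reflect any earlier insert order; only rows inserted from now on are.
-- With the column in place, window functions can order deterministically:
--   ROW_NUMBER() OVER (PARTITION BY id ORDER BY RowID)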
One way to solve it could be through a recursive CTE:
create table #tmp (i int identity,id int, value int, rn int);
insert into #tmp (id,value) VALUES
(1,1),(1,0),(1,0),(1,1),(1,1),(1,1),
(2,0),(2,1),(2,0),(2,0);
WITH numbered AS (
SELECT i,id,value, 1 seq FROM #tmp WHERE i=1 UNION ALL
SELECT a.i,a.id,a.value, CASE WHEN a.id=b.id AND a.value=b.value THEN b.seq+1 ELSE 1 END
FROM #tmp a INNER JOIN numbered b ON a.i=b.i+1
)
SELECT * FROM numbered -- OPTION (MAXRECURSION 1000)
This will return the following:
i id value seq
1 1 1 1
2 1 0 1
3 1 0 2
4 1 1 1
5 1 1 2
6 1 1 3
7 2 0 1
8 2 1 1
9 2 0 1
10 2 0 2
See my little demo here: https://rextester.com/ZZEIU93657
A prerequisite for the CTE to work is a sequenced table (e.g. a table with an identity column in it) as the source. In my example I introduced the column i for this. As a starting point I need to find the first entry of the source table; in my case this is the entry with i=1.
For a longer source table you might run into a recursion-limit error, as the default for MAXRECURSION is 100. In that case you should uncomment the OPTION setting after my SELECT clause above. You can either set it to a higher value (as shown) or switch the limit off completely by setting it to 0.
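For example, to switch the limit off entirely, the last line of the statement above becomes:

SELECT * FROM numbered OPTION (MAXRECURSION 0)  -- 0 = no recursion limit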
IMHO, this is easier to do with a cursor and a loop. Maybe there is a way to do the job with a self-join:
declare #t table (id int, val int)
insert into #t (id, val)
select 1 as id, 1 as val
union all select 1, 0
union all select 1, 0
union all select 1, 1
union all select 1, 1
union all select 1, 1
;with cte1 (id , val , num ) as
(
select id, val, row_number() over (ORDER BY (SELECT 1)) as num from #t
)
, cte2 (id, val, num, N) as
(
select id, val, num, 1 from cte1 where num = 1
union all
select t1.id, t1.val, t1.num,
case when t1.id=t2.id and t1.val=t2.val then t2.N + 1 else 1 end
from cte1 t1 inner join cte2 t2 on t1.num = t2.num + 1 where t1.num > 1
)
select * from cte2
I have two tables: table 1 has a package_id and a timestamp column but no weight information; table 2 has a package_id, a timestamp and a weight column where the information is available.
What I'm trying to do is fill in the table 1 weight information based on table 2, using the following restrictions:
use the closest package_id available, i.e. for package_id 1 use 2 if available, if not 3, etc.
if there is only one weight available, use it for all the missing package_ids
if two weights are available, use the higher one, i.e. for package_id 5, if 4 and 6 are available use 6
The code:
IF OBJECT_ID('tempdb..#TIMEGAPS') IS NOT NULL DROP TABLE #TIMEGAPS
CREATE TABLE #TIMEGAPS (PACK_ID INT, Local_Time DATETIME)
IF OBJECT_ID('tempdb..#REALVALUES') IS NOT NULL DROP TABLE #REALVALUES
CREATE TABLE #REALVALUES (PACK_ID INT, Local_Time DATETIME, WEIGHT INT)
INSERT INTO #TIMEGAPS VALUES
(1,'2018-01-20 18:40:00.000'),
(1,'2018-01-20 18:50:00.000'),
(1,'2018-01-20 19:00:00.000'),
-----------------------------
(7,'2018-01-20 18:40:00.000'),
(7,'2018-01-20 18:50:00.000'),
(7,'2018-01-20 19:00:00.000'),
------------------------------
(12,'2018-01-20 18:40:00.000'),
(12,'2018-01-20 18:50:00.000'),
(12,'2018-01-20 19:00:00.000'),
(12,'2018-01-20 20:00:00.000')
INSERT INTO #REALVALUES VALUES
(2,'2018-01-20 18:40:00.000',50),
(3,'2018-01-20 18:40:00.000',70),
(4,'2018-01-20 18:40:00.000',150),
(5,'2018-01-20 18:40:00.000',60),
(6,'2018-01-20 18:40:00.000',45),
(8,'2018-01-20 18:40:00.000',55),
(9,'2018-01-20 18:40:00.000',25),
---------------------------------
(2,'2018-01-20 18:50:00.000',75),
(3,'2018-01-20 18:50:00.000',80),
(4,'2018-01-20 18:50:00.000',120),
(5,'2018-01-20 18:50:00.000',110),
(11,'2018-01-20 18:50:00.000',30),
---------------------------------
(8,'2018-01-20 19:00:00.000',70)
EDIT:
I've adapted the solution from here which I believe is what I need.
SELECT tg.PACK_ID, tg.Local_Time, p.WEIGHT
FROM #TIMEGAPS tg
OUTER APPLY
(
SELECT TOP 1 *, ABS(tg.PACK_ID - rv.PACK_ID) AS diff
FROM #REALVALUES rv
WHERE (tg.Local_Time = rv.Local_time OR rv.Local_time is null)
ORDER BY CASE WHEN rv.Local_time IS NULL THEN 2 ELSE 1 END,
ABS(rv.PACK_ID- tg.PACK_ID) ASC
) p
EDIT 2:
3. if two weights are available, use the highest PACK_ID ie. for package_id 5, if PACK_ID 4 and PACK_ID 6 are available use 6
Something like this?
It uses a row_number by distance.
SELECT
PACK_ID, Local_Time, WEIGHT
FROM (
SELECT g.PACK_ID, g.Local_Time, v.WEIGHT,
ROW_NUMBER() OVER (PARTITION BY g.PACK_ID, g.Local_Time
ORDER BY ABS(v.PACK_ID - g.PACK_ID), v.PACK_ID DESC) AS RN
FROM #TIMEGAPS AS g
JOIN #REALVALUES AS v ON v.Local_Time = g.Local_Time
) AS q
WHERE RN = 1
ORDER BY PACK_ID, Local_Time
I suppose we can start here. Make sure to initialize your tables before you run this. I assumed that #TIMEGAPS has a weight column based on your output.
DECLARE
    @Pack_id INT
    , @Weight_id INT
    , @mloop INT = 0
DECLARE @possible TABLE (
    id INT IDENTITY (1,1)
    , pack_id INT
    , weight INT )
BEGIN_LABEL:
SELECT TOP 1 @Pack_id = PACK_ID
FROM #TIMEGAPS AS g
WHERE g.WEIGHT IS NULL
ORDER BY PACK_ID ASC
IF @Pack_id IS NULL
BEGIN
    PRINT 'Done'
    GOTO END_LABEL
END
INSERT INTO @possible (pack_id , weight)
SELECT r.PACK_ID , r.WEIGHT
FROM #REALVALUES AS r
LEFT JOIN #TIMEGAPS AS g
    ON g.WEIGHT = r.PACK_ID
WHERE g.WEIGHT IS NULL
ORDER BY ABS(@Pack_id - r.PACK_ID) ASC , r.WEIGHT DESC
SELECT TOP 1 @Weight_id = weight
FROM @possible
ORDER BY id ASC
IF (@Weight_id IS NULL)
BEGIN
    RAISERROR('No Weights available' , 18 , 1)
    RETURN
END
UPDATE #TIMEGAPS
SET WEIGHT = @Weight_id
WHERE PACK_ID = @Pack_id
SET @mloop = @mloop + 1
IF @mloop > 99
BEGIN
    PRINT 'Hit Safety'
    GOTO END_LABEL
END
SELECT @Pack_id = NULL , @Weight_id = NULL;
DELETE @possible;
GOTO BEGIN_LABEL
END_LABEL:
SELECT g.PACK_ID , g.Local_Time , r.WEIGHT
FROM #TIMEGAPS AS g
INNER JOIN #REALVALUES AS r
    ON r.PACK_ID = g.WEIGHT;
I'm pretty sure RAISERROR() works in SQL Server 2008, but you can just replace it with PRINT statements if it doesn't.
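For example, the error branch would then read:

IF (@Weight_id IS NULL)
BEGIN
    PRINT 'No Weights available'
    RETURN
END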
I have a table like this. As times increases I expect the Rating column to increase, but sometimes it decreases. I want to find how many times it decreases in this table. In this example the Rating column decreases 2 times (4 --> 2 and 3 --> 1), and I want to get this number (2) from a query. The times column always increases. How can I write the SQL query for this situation? (Note: I am using the DB2 DBMS.)
Rating times
1 20.09.2016
2 21.09.2016
3 22.09.2016
4 23.09.2016
2 24.09.2016
3 25.09.2016
1 26.09.2016
Thanks,
SELECT COUNT(1) AS COUNT_OF_TIMES_RATING_GOT_DECREASED
FROM
(
    SELECT rating,
           times,
           rating - LEAD( rating, 1 ) OVER ( ORDER BY times ) AS diff_rating
    FROM yourtable   -- substitute your table name
) AS t
WHERE diff_rating > 0;
Without those functions being available, e.g. on DB2 for i, the following should suffice. You can easily omit the fluff I added to make a pretty report, and/or show the "times" values that exhibit the decreasing transition instead of the row numbers I included to reveal the from/to rows:
create table ratings
( "rating" for r dec
, "times" for t date
)
;
insert into ratings values
/* Rating times */
( 1 , '20.09.2016' )
, ( 2 , '21.09.2016' )
, ( 3 , '22.09.2016' )
, ( 4 , '23.09.2016' )
, ( 2 , '24.09.2016' )
, ( 3 , '25.09.2016' )
, ( 1 , '26.09.2016' )
;
with
ord_ratings as
( select row_number() over(order by "times") as rn
, "rating"
from ratings
)
select 'From' as "From"
, dec(a.rn, 6) as rn_a
, 'to' as "to"
, dec(b.rn, 6) as rn_b
, a."rating" as rating_a
, '-->' as "-->"
, b."rating" as rating_b
from ord_ratings a
join ord_ratings b
on a.rn = ( b.rn - 1 )
and b."rating" < a."rating"
; -- a likeness of the report from the above query:
"From" RN_A "to" RN_B RATING_A "-->" RATING_B
From 4 to 5 4 --> 2
From 6 to 7 3 --> 1
******** End of data ********
I have a table like below
DECLARE #ProductTotals TABLE
(
id int,
value nvarchar(50)
)
which has the following values:
1, 'abc'
2, 'abc'
1, 'abc'
3, 'abc'
I want to update this table so that it has the following values
1, 'abc'
2, 'abc_1'
1, 'abc'
3, 'abc_2'
Could someone help me out with this?
Use a cursor to move over the table and try to insert every row into a second temporary table. If you get a collision (technically detected with a SELECT), you can run a second query to get the maximum number (if any) already appended to your item.
Once you know what the maximum number used is (use ISNULL to cover the case of the first duplicate), just run an UPDATE against your original table and keep going with your scan. A rough sketch of this idea is shown below.
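A rough sketch of that idea in T-SQL, assuming the table layout from the question; the #work/#seen tables and the suffix parsing are illustrative only, not a tested implementation:

-- Copy the data into a work table with a row key, then scan it with a cursor,
-- tracking values already handed out in #seen (the "second temporary table").
CREATE TABLE #work (rowid INT IDENTITY(1,1), id INT, value NVARCHAR(50));
INSERT INTO #work (id, value) VALUES (1, 'abc'), (2, 'abc'), (1, 'abc'), (3, 'abc');
CREATE TABLE #seen (id INT, value NVARCHAR(50));

DECLARE @rowid INT, @id INT, @value NVARCHAR(50), @suffix INT;

DECLARE cur CURSOR LOCAL STATIC FOR
    SELECT rowid, id, value FROM #work ORDER BY rowid;
OPEN cur;
FETCH NEXT FROM cur INTO @rowid, @id, @value;
WHILE @@FETCH_STATUS = 0
BEGIN
    -- Collision: this value was already used by a different id
    -- (repeats of the same id keep their value, as in the expected output).
    IF EXISTS (SELECT 1 FROM #seen WHERE value = @value AND id <> @id)
    BEGIN
        -- Highest suffix already appended to this value, 0 if none yet.
        SELECT @suffix = ISNULL(MAX(CAST(SUBSTRING(value, LEN(@value) + 2, 10) AS INT)), 0)
        FROM #seen
        WHERE value LIKE @value + N'\_%' ESCAPE N'\';

        UPDATE #work
        SET value = @value + N'_' + CAST(@suffix + 1 AS NVARCHAR(10))
        WHERE rowid = @rowid;
    END

    INSERT INTO #seen (id, value)
    SELECT id, value FROM #work WHERE rowid = @rowid;

    FETCH NEXT FROM cur INTO @rowid, @id, @value;
END
CLOSE cur;
DEALLOCATE cur;

SELECT id, value FROM #work ORDER BY rowid;   -- 1 abc, 2 abc_1, 1 abc, 3 abc_2
DROP TABLE #work; DROP TABLE #seen;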
Are you looking to remove duplicates, or just change the values so they aren't duplicates?
To change the values, use:
update producttotals
set value = 'abc_1'
where id =2;
update producttotals
set value = 'abc_2'
where id =3;
To find duplicate rows, do a
select id, value
from producttotals
group by id, value
having count(*) > 1;
Assuming SQL Server 2005 or greater
DECLARE #ProductTotals TABLE
(
id int,
value nvarchar(50)
)
INSERT INTO @ProductTotals
VALUES (1, 'abc'),
(2, 'abc'),
(1, 'abc'),
(3, 'abc')
;WITH CTE as
(SELECT
ROW_NUMBER() OVER (Partition by value order by id) rn,
id,
value
FROM
@ProductTotals),
new_values as (
SELECT
pt.id,
pt.value,
pt.value + '_' + CAST( ROW_NUMBER() OVER (partition by pt.value order by pt.id) as varchar) new_value
FROM
@ProductTotals pt
INNER JOIN CTE
ON pt.id = CTE.id
and pt.value = CTE.value
WHERE
pt.id NOT IN (SELECT id FROM CTE WHERE rn = 1)) --remove any with the lowest ID for the value
UPDATE
pt
SET
pt.value = nv.new_value
FROM
@ProductTotals pt
inner join new_values nv
ON pt.id = nv.id and pt.value = nv.value
SELECT * FROM @ProductTotals
Will produce the following
id value
----------- --------------------------------------------------
1 abc
2 abc_1
1 abc
3 abc_2
Explanation of the SQL
The first CTE creates a row number partitioned by value, so the numbering restarts whenever it sees a new value:
rn id value
-------------------- ----------- --------
1 1 abc
2 1 abc
3 2 abc
4 3 abc
The second CTE, called new_values, ignores any IDs that are associated with an RN of 1. So rn 1 and rn 2 get removed because they share the same ID. It also uses ROW_NUMBER() again to determine the number for the new_value.
id value new_value
----------- ------ -------------
2 abc abc_1
3 abc abc_2
The final statement just updates the old value with the new value.