Increment value in column in an SQL table - sql

So i have the following case to resolve.
I have a table with the following structure and i need to do the column Comment3 incremented by 1 but not for whole the records, but only for the records that are matched based on the Tid column.
So my table looks like this
------------------------
ID | TID | COMMENT3
------------------------
101 | 715 | 1
102 | 715 | 2
103 | 715 | NULL
104 | 715 | NULL
So i need every null value in column Comment3 to get updated with the last value plus 1, based on the TID which is the reference column.
Thanks in advance.

Use a ROW_NUMBER() with the proper OVER clause:
;WITH Comment3NewRanking AS
(
SELECT
T.ID,
T.TID,
T.Comment3,
Ranking = ROW_NUMBER() OVER (PARTITION BY T.TID ORDER BY T.ID ASC)
FROM
YourTable AS T
)
UPDATE C SET
Comment3 = C.Ranking
FROM
Comment3NewRanking AS C
WHERE
C.Comment3 IS NULL
This solution assumes that whenever the IDs are set for a particular TID, they are in ascending ID order. Also assuming you want values from 1 onwards on each tier of TID (if not please supply another representative example).

Related

Delete duplicates, keeping one of them, grouping by [duplicate]

This question already has answers here:
How to delete duplicate rows in SQL Server?
(26 answers)
Closed 4 years ago.
I need to delete all the duplicates, all but one, for each of the table ids. Like the following. I need to delete all the duplicates on valueid for 01,02,03...
Original:
id | valueid | data
____________________________
01 | 1001 | datadata1
01 | 1002 | datadata2
01 | 1001 | datadata1
02 | 1323 | datamoredata123
02 | 1323 | datamoredata123
03 | 22123 | evenmoredata
03 | 24444 | andalsomore
Should end like:
id | valueid | data
____________________________
01 | 1001 | datadata1
01 | 1002 | datadata2
02 | 1323 | datamoredata123
03 | 22123 | evenmoredata
03 | 24444 | andalsomore
Was trying to do it with something like this, but I don´t get how can I group that delete on the id
WITH CTE AS(
SELECT valueid,
RN = ROW_NUMBER()OVER(PARTITION BY valueid ORDER BY valueid)
FROM tblvalues
)
DELETE FROM CTE WHERE RN > 1
Any suggestions?
Thanks in advance
You need to add id column to the PARTITION:
WITH CTE AS(
SELECT valueid,
RN = ROW_NUMBER()OVER( PARTITION BY id, valueid ORDER BY data)
FROM tblvalues
)
DELETE FROM CTE WHERE RN > 1
This way you delete duplicate valueid values separately for each id. Column data determines which duplicates are deleted.
You are pretty close. You need to change the partition by clause. You want one row per id/valueid pair, so these should both be in the partitioning clause:
WITH todelete AS (
SELECT valueid,
RN = ROW_NUMBER() OVER (PARTITION BY id, valueid ORDER BY data)
FROM tblvalues
)
DELETE FROM todelete WHERE RN > 1;
A very simple way to do this is to add the UNIQUE index in column (valueid). When you write an ALTER statement, specify the IGNORE keyword.
ALTER IGNORE TABLE tblvalues
ADD UNIQUE INDEX idx_name (valueid);
This will remove all duplicate rows. As an added advantage, future INSERTs that are duplicates will be erroneous. As always, you can take a backup before running something like this.

sybase - update column to be identity, ordered by certain values

I want to update the sequence column below to be an IDENTITY column in future, and the current rows must be updated to be ordered by update_time ascending.
How do I do this in Sybase? Simplified example of what I have below.
Current table:
SEQUENCE | UPDATE_TIME | DATA
null | 2016-01-01 | x
null | 2013-01-01 | y
null | 2015-01-01 | z
Desired table:
SEQUENCE | UPDATE_TIME | DATA
3 | 2016-01-01 | x
1 | 2013-01-01 | y
2 | 2015-01-01 | z
I did this by joining the table onto itself but with 1 additional ID row. This additional row is created using the ROW_NUMBER function by ordering the update_time ascending. Something like...
UPDATE myTable SET update_seq = tmp.ID
FROM myTable a INNER JOIN (
SELECT update_time, data, ROW_NUMBER() OVER (ORDER BY update_seq ASC) as ID from myTable tmp
on a.update_time = tmp.update_time
and a.data = tmp.data
On Sybase ASA you can numerate column using that update:
update [table_name]
set [SEQUENCE]=number(*)
order by [UPDATE_TIME]

Partitioning function for continuous sequences

There is a table of the following structure:
CREATE TABLE history
(
pk serial NOT NULL,
"from" integer NOT NULL,
"to" integer NOT NULL,
entity_key text NOT NULL,
data text NOT NULL,
CONSTRAINT history_pkey PRIMARY KEY (pk)
);
The pk is a primary key, from and to define a position in the sequence and the sequence itself for a given entity identified by entity_key. So the entity has one sequence of 2 rows in case if the first row has the from = 1; to = 2 and the second one has from = 2; to = 3. So the point here is that the to of the previous row matches the from of the next one.
The order to determine "next"/"previous" row is defined by pk which grows monotonously (since it's a SERIAL).
The sequence does not have to start with 1 and the to - from does not necessary 1 always. So it can be from = 1; to = 10. What matters is that the "next" row in the sequence matches the to exactly.
Sample dataset:
pk | from | to | entity_key | data
----+--------+------+--------------+-------
1 | 1 | 2 | 42 | foo
2 | 2 | 3 | 42 | bar
3 | 3 | 4 | 42 | baz
4 | 10 | 11 | 42 | another foo
5 | 11 | 12 | 42 | another baz
6 | 1 | 2 | 111 | one one one
7 | 2 | 3 | 111 | one one one two
8 | 3 | 4 | 111 | one one one three
And what I cannot realize is how to partition by "sequences" here so that I could apply window functions to the group that represents a single "sequence".
Let's say I want to use the row_number() function and would like to get the following result:
pk | row_number | entity_key
----+-------------+------------
1 | 1 | 42
2 | 2 | 42
3 | 3 | 42
4 | 1 | 42
5 | 2 | 42
6 | 1 | 111
7 | 2 | 111
8 | 3 | 111
For convenience I created an SQLFiddle with initial seed: http://sqlfiddle.com/#!15/e7c1c
PS: It's not the "give me the codez" question, I made my own research and I just out of ideas how to partition.
It's obvious that I need to LEFT JOIN with the next.from = curr.to, but then it's still not clear how to reset the partition on next.from IS NULL.
PS: It will be a 100 points bounty for the most elegant query that provides the requested result
PPS: the desired solution should be an SQL query not pgsql due to some other limitations that are out of scope of this question.
I don’t know if it counts as “elegant,” but I think this will do what you want:
with Lagged as (
select
pk,
case when lag("to",1) over (order by pk) is distinct from "from" then 1 else 0 end as starts,
entity_key
from history
), LaggedGroups as (
select
pk,
sum(starts) over (order by pk) as groups,
entity_key
from Lagged
)
select
pk,
row_number() over (
partition by groups
order by pk
) as "row_number",
entity_key
from LaggedGroups
Just for fun & completeness: a recursive solution to reconstruct the (doubly) linked lists of records. [ this will not be the fastest solution ]
NOTE: I commented out the ascending pk condition(s) since they are not needed for the connection logic.
WITH RECURSIVE zzz AS (
SELECT h0.pk
, h0."to" AS next
, h0.entity_key AS ek
, 1::integer AS rnk
FROM history h0
WHERE NOT EXISTS (
SELECT * FROM history nx
WHERE nx.entity_key = h0.entity_key
AND nx."to" = h0."from"
-- AND nx.pk > h0.pk
)
UNION ALL
SELECT h1.pk
, h1."to" AS next
, h1.entity_key AS ek
, 1+zzz.rnk AS rnk
FROM zzz
JOIN history h1
ON h1.entity_key = zzz.ek
AND h1."from" = zzz.next
-- AND h1.pk > zzz.pk
)
SELECT * FROM zzz
ORDER BY ek,pk
;
You can use generate_series() to generate all the rows between the two values. Then you can use the difference of row numbers on that:
select pk, "from", "to",
row_number() over (partition by entity_key, min(grp) order by pk) as row_number
from (select h.*,
(row_number() over (partition by entity_key order by ind) -
ind) as grp
from (select h.*, generate_series("from", "to" - 1) as ind
from history h
) h
) h
group by pk, "from", "to", entity_key
Because you specify that the difference is between 1 and 10, this might actually not have such bad performance.
Unfortunately, your SQL Fiddle isn't working right now, so I can't test it.
Well,
this not exactly one SQL query but:
select a.pk as PK, a.entity_key as ENTITY_KEY, b.pk as BPK, 0 as Seq into #tmp
from history a left join history b on a."to" = b."from" and a.pk = b.pk-1
declare #seq int
select #seq = 1
update #tmp set Seq = case when (BPK is null) then #seq-1 else #seq end,
#seq = case when (BPK is null) then #seq+1 else #seq end
select pk, entity_key, ROW_NUMBER() over (PARTITION by entity_key, seq order by pk asc)
from #tmp order by pk
This is in SQL Server 2008

Delete rows except for one for every id

I have a dataset with multiple ids. For every id there are multiple entries. Like this:
--------------
| ID | Value |
--------------
| 1 | 3 |
| 1 | 4 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 3 | 3 |
| 3 | 5 |
--------------
Is there a SQL DELETE query to delete (random) rows for every id, except for one (random rows would be nice but is not essential)? The resulting table should look like this:
--------------
| ID | Value |
--------------
| 1 | 2 |
| 2 | 1 |
| 3 | 5 |
--------------
Thanks!
It doesn't look like hsqldb fully supports olap functions (in this case row_number() over (partition by ...), so you'll need to use a derived table to identify the one value you want to keep for each ID. It certainly won't be random, but I don't think anything else will be either. Something like so
This query will give you the first part:
select
id,
min(value) as minval
from
group by id
Then you can delete from your table where you don't match:
delete from
<your table> t1
inner join
(
select
id,
min(value) as minval
from
<your table>
group by id
) t2
on t1.id = t2.id
and t1.value <> t2.value
Try this:
alter ignore table a add unique(id);
Here a is the table name
This should do what you want:
SELECT ID, Value
FROM (SELECT ID, Value, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY NEWID()) AS RN
FROM #Table) AS A
WHERE A.RN = 1
I tried the given answers with HSQLDB but it refused to execute those queries for different reasons (join is not allowed in delete query, ignore statement is not allowed in alter query). Thanks to Andrew I came up with this solution (which is a little bit more circumstantial, but allows it to delete random rows):
Add a new column for random values:
ALTER TABLE <table> ADD COLUMN rand INT
Fill this column with random data:
UPDATE <table> SET rand = RAND() * 1000000
Delete all rows which don't have the minimum random value for their id:
DELETE FROM <table> WHERE rand NOT IN (SELECT MIN(rand) FROM <table> GROUP BY id)
Drop the random column:
ALTER TABLE <table> DROP rand
For larger tables you probably should ensure that the random values are unique, but this worked perfectly for me.

Updating Single Row per Group

The Background
I have a temporary table containing information including a unique rowID, OrderNumber, and guestCount. RowID and OrderNumber already exist in this table, and I am running a new query to fill in the missing guestCount for each orderNumber. I would like to then update the temp table with this information.
Example
What I currently have looks something like this, with only RowID being unique, meaning that there can be multiple items having the same OrderNumber.
RowID | OrderNumber | guestCount
1 | 30001 | 0
2 | 30002 | 0
3 | 30002 | 0
4 | 30003 | 0
My query returns the following table, only returning one total number of guests per orderNumber:
OrderNumber | guestCount
30001 | 3
30002 | 10
30003 | 5
The final table should look like:
RowID | OrderNumber | guestCount
1 | 30001 | 3
2 | 30002 | 10
3 | 30002 | 0
4 | 30003 | 5
I'm only interested in updating one (doesn't matter which) entry per orderNumber, but my current logic is resulting in errors:
UPDATE temp
SET temp.guestCount = cc.guestCount
FROM( SELECT OrderNumber, guestCount
FROM (SELECT OrderNumber, guestCount, RowID = MIN(RowID)
FROM #tempTable
GROUP BY RowID, OrderNumber, guestCount) t)temp
INNER JOIN queryTable q ON temp.OrderNumber = q.OrderNumber
I'm not sure if this logic is even a valid way of doing this, but I do know that I'm getting errors in my update due to the fact that I'm using an aggregate function, as well as a GROUP function. Is there any way to go about this operation differently?
You can define the row to update by using row_number() in a CTE. This identifies the first row in the group for the update:
with toupdate as (
select tt.*, row_number() over (partition by OrderNumber order by id) as seqnum
from #tempTable tt
)
UPDATE toupdate
SET toupdate.guestCount = q.guestCount
FROM toupdate
INNER JOIN queryTable q
ON temp.OrderNumber = q.OrderNumber
where toupdate.seqnum = 1;
The problem with you query is that temp is based on an aggregation subquery. Such a subquery is not updatable, because it does not have a 1-1 relationship with the rows of the original query. Using the CTE with row_number() is updatable. In addition, your set statement uses the table alias cc which is not defined in the query.