I have a table in PostgreSQL that has an ID column that is supposed to be unique. However, a large number of the rows (around 3 million) currently have an ID of "1".
What I know:
The total number of rows
The current maximum value for the ID column
The number of rows with an (incorrect) ID of "1"
What I need is a query that will pull all the rows with an ID of "1" and assign them a new ID that increments automatically so that every row in the table will have a unique ID. I'd like it to start at the currentMaxId + 1 and assign each row the subsequent ID.
This is the closest I've gotten with a query:
UPDATE table_name
SET id = (
SELECT max(id) FROM table_name
) + 1
WHERE id = '1'
The problem with this is that the inner SELECT only runs the first time, thus setting the ID of the rows in question to the original max(id) + 1, not the new max(id) + 1 every time, giving me the same problem I'm trying to solve.
Any suggestions on how to tweak this query to achieve my desired result or an alternative method would be greatly appreciated!
You may do it step by step with a temporary sequence.
1) creation
create temporary sequence seq_upd;
2) set it to the proper initial value
select setval('seq_upd', (select max(id) from table_name));
3) update
update table_name set id=nextval('seq_upd') where id=1;
If you are going to work with a SEQUENCE, consider the serial pseudo data type for your id. Then you can just draw nextval() from the "owned" (not temporary) sequence, which will then be up to date automatically.
If you don't want that, you can fall back to using the ctid and row_number() for a one-time numbering:
UPDATE tbl t
SET id = x.max_id + u.rn
FROM (SELECT max(id) AS max_id FROM tbl) x
, (SELECT ctid, row_number() OVER (ORDER BY ctid) AS rn
FROM tbl WHERE id = 1) u
WHERE t.ctid = u.ctid;
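The one-time numbering idea above can be tried end to end with a small script. This is only a sketch under assumptions: it uses Python's sqlite3 module rather than PostgreSQL, SQLite's rowid stands in for ctid, and the running number is computed in Python instead of with row_number(). The table layout and the payload column are hypothetical.

```python
import sqlite3


def renumber_duplicates(conn, dup_id=1):
    """Give every row whose id equals dup_id a fresh id above the current maximum."""
    # Read the maximum once, before any row is changed.
    max_id = conn.execute("SELECT max(id) FROM tbl").fetchone()[0]
    # rowid is SQLite's physical row identifier; it stands in for ctid here.
    rowids = [r[0] for r in conn.execute(
        "SELECT rowid FROM tbl WHERE id = ? ORDER BY rowid", (dup_id,))]
    # The n-th duplicate (counting from 1) receives max_id + n.
    conn.executemany(
        "UPDATE tbl SET id = ? WHERE rowid = ?",
        [(max_id + n, rid) for n, rid in enumerate(rowids, start=1)])
    conn.commit()
    return len(rowids)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER, payload TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?)",
                 [(1, "a"), (1, "b"), (2, "c"), (5, "d"), (1, "e")])
changed = renumber_duplicates(conn)
ids = sorted(r[0] for r in conn.execute("SELECT id FROM tbl"))
print(changed, ids)  # 3 [2, 5, 6, 7, 8]
```

Like the UPDATE above, this renumbers every row that currently has the duplicated id, including a row for which that id might have been legitimate.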
Related answer on dba.SE:
numbering rows consecutively for a number of tables
I have a table in which I would like to update one column on every nth row that meets a condition.
My table has many columns, but the key is Object_Id (in case this is useful for creating a temp table).
The column I'm trying to update is online_status. It looks like the picture below, but on a bigger scale: I usually have 10 rows with the same time that all contain %Online%, and in total about 2000 rows with Online and about another 2000 with Offline. I just need to update every 2-4 rows of those 10 repeating rows.
Table picture here: (for some reason table formatting doesn't come up good)
Table
So what I tried is: This pulls a list of every 3rd record that matches criteria Online, I just need a way to update it but can't get through this.
SELECT * FROM (SELECT *, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE online_status LIKE '%Online%' AND foo.rn % 3 =0
I also tried the following, but it updated every single row, not just the ones I needed:
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE people.Object_id IN
(SELECT *
FROM
(SELECT people.Object_id, row_number() over() rn FROM people
WHERE online_status LIKE '%Online%') foo WHERE people LIKE '%Online%' AND foo.rn % 3 =0);
Is there a way to take the list from the SELECT above and simply update those rows? Alternatively, could one script store the object ids in something like a temp table, and a second script update the main table wherever the object id matches the temp table?
Thank you for any help :)
Select only Object_id in the subquery used by WHERE people.Object_id IN (...); an IN subquery must return a single column.
UPDATE people
SET online_status = 'Offline 00:00-24:00'
WHERE Object_id IN
( SELECT Object_id
FROM
( SELECT p.Object_id, row_number() over() rn
FROM people p
WHERE p.online_status LIKE '%Online%') foo
WHERE foo.rn % 3 = 0
);
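To check that only every third matching row changes, the statement can be run against a tiny data set. The following is a sketch using Python's sqlite3 module, which is an assumption (the question does not name its DBMS); the sample statuses are made up, and the ORDER BY inside OVER() is an addition so that the numbering is repeatable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (object_id INTEGER PRIMARY KEY, online_status TEXT)")
rows = [(i, "Online 08:00-16:00") for i in range(1, 10)]
rows += [(i, "Offline 00:00-24:00") for i in range(10, 13)]
conn.executemany("INSERT INTO people VALUES (?, ?)", rows)

# The inner query numbers only the Online rows; the IN list keeps every third one.
# ORDER BY object_id inside OVER() is an addition that makes the numbering repeatable.
conn.execute("""
    UPDATE people
    SET online_status = 'Offline 00:00-24:00'
    WHERE object_id IN (
        SELECT object_id
        FROM (SELECT object_id,
                     row_number() OVER (ORDER BY object_id) AS rn
              FROM people
              WHERE online_status LIKE '%Online%') AS numbered
        WHERE rn % 3 = 0)
""")
conn.commit()

# Of the nine rows that started as Online, list the ones that were switched.
switched = [r[0] for r in conn.execute(
    "SELECT object_id FROM people "
    "WHERE online_status LIKE 'Offline%' AND object_id < 10 ORDER BY object_id")]
print(switched)  # [3, 6, 9]
```

Without an ORDER BY in the window, the database is free to number the rows in any order, so which rows count as "every third" may differ between runs.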
I have a table in which the newest data starts at id=1 and older data has higher ids. New data will now be added every day with increasing ids, and I want to show results from newest to oldest, so the existing order is a problem. I want to reverse the order of the table so that, even after new data is added, I can display rows from newest to oldest.
Has anyone any clue what should I do?
You could hard-code the arbitrary number where the switch-over happens. Say you have 1,000 rows in there right now (so #1 is the newest, but #1001 will be the newest when it's added):
SELECT (CASE WHEN id <= 1000 THEN id * -1 ELSE id END) AS sort_order
FROM [old_table]
ORDER BY sort_order DESC
Probably the better solution would be to add a timestamp column, as it's not nice to rely on auto-generated columns for real data.
Edit
To create a new table, use the above with an insert.
INSERT INTO [new_table] (col1,col2,col3, ...)
SELECT col1, col2, col3, ...
FROM [old_table]
ORDER BY (CASE WHEN id <= 1000 THEN id * -1 ELSE id END) DESC
It would be better to add a datetime field to your table.
However, your ID is not directly usable as an ordering column: until now ID=1 was the newest record, but from now on every new record will get a higher ID than X. So you need to identify X, which is the ID value of the oldest record so far. Then you can order by two cases:
does the row belong to the low-id=new group or to the high-id=new group?
Here's an example where X=10:
SELECT ID
FROM dbo.tbl
ORDER BY CASE WHEN ID <=10 THEN ID ELSE -ID END ASC
Here's the fiddle: http://sqlfiddle.com/#!6/95c9a/3/0
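In case the fiddle is unavailable, the same ordering can be reproduced locally. This sketch assumes Python's sqlite3 module instead of SQL Server; the twelve sample rows and the CUTOVER_ID name are hypothetical, with X=10 as in the example.

```python
import sqlite3

CUTOVER_ID = 10  # X: the highest id among the legacy rows (hypothetical value)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO tbl VALUES (?)", [(i,) for i in range(1, 13)])

# Legacy rows keep their id as the sort key; newer rows get a negative key,
# so ascending order lists the highest new id first and the legacy rows after.
newest_first = [r[0] for r in conn.execute(
    "SELECT id FROM tbl "
    "ORDER BY CASE WHEN id <= ? THEN id ELSE -id END ASC", (CUTOVER_ID,))]
print(newest_first)  # [12, 11, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```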
but i want to copy the content to a new table so that i won't need to
treat them differently
Then you can use the sql above to insert into the new table in that order. I would use a new ID column with IDENTITY(1,1):
INSERT INTO dbo.tblCopy
SELECT ID
FROM dbo.tbl
ORDER BY CASE WHEN ID <=10 THEN ID ELSE -ID END ASC
If you don't want to add another column and you want to reuse the old id, you can use ROW_NUMBER with above CASE:
INSERT INTO dbo.tblCopy
SELECT ID =
ROW_NUMBER()OVER(ORDER BY CASE WHEN ID <=10 THEN ID ELSE -ID END ASC)
FROM dbo.tbl
Demo: http://sqlfiddle.com/#!6/00213/1/0
SQL Server table with custom sort has columns: ID (PK, auto-increment), OrderNumber, Col1, Col2..
By default an insert trigger copies value from ID to OrderNumber as suggested here.
Using some visual interface, user can sort records by incrementing or decrementing OrderNumber values.
However, how to deal with records being deleted in the meantime?
Example:
Say you add records with PK ID: 1,2,3,4,5 - OrderNumber receives the same values. Then you delete the records with ID=4 and ID=5. The next record will have ID=6, and OrderNumber will receive the same value. A span of 2 missing OrderNumbers would force the user to decrement the record with ID=6 three times to change its order (i.e. 3 button presses).
Alternatively, one could insert select count(*) from table into OrderNumber, but that would allow duplicate values in the table once some old rows are deleted.
If one doesn't delete records but only "deactivates" them, they're still included in the sort order, just invisible to the user. At the moment a solution in Java is needed, but I think the issue is language-independent.
Is there a better approach at this?
I would simply modify the script that switches the OrderNumber values so it does it correctly without relying on their being without gaps.
I don't know what arguments your script accepts and how it uses them, but the one that I've eventually come up with accepts the ID of the item to move and the number of positions to move by (a negative value means "toward the lower OrderNumber values", and a positive one implies the opposite direction).
The idea is as follows:
Look up the specified item's OrderNumber.
Rank all the items starting from OrderNumber in the direction determined by the second argument. The specified item thus receives the ranking of 1.
Pick the items with rankings from 1 to the one that is the absolute value of the second argument plus one. (I.e. the last item is the one where the specified item is being moved to.)
Join the resulting set with itself so that every row is joined with the next one and the last row is joined with the first one and thus use one set of rows to update the other.
This is the query that implements the above, with comments explaining some tricky parts:
Edited: fixed an issue with incorrect reordering
/* these are the arguments of the query */
DECLARE @ID int, @JumpBy int;
SET @ID = ...
SET @JumpBy = ...
DECLARE @OrderNumber int;
/* Step #1: Get OrderNumber of the specified item */
SELECT @OrderNumber = OrderNumber FROM atable WHERE ID = @ID;
WITH ranked AS (
  /* Step #2: rank rows including the specified item and those that are sorted
     either before or after it (depending on the value of @JumpBy) */
  SELECT
    *,
    rnk = ROW_NUMBER() OVER (
      ORDER BY OrderNumber * SIGN(@JumpBy)
      /* this little "* SIGN(@JumpBy)" trick ensures that the
         top-ranked item will always be the one specified by @ID:
         * if we are selecting rows where OrderNumber >= @OrderNumber,
           the order will be by OrderNumber and @OrderNumber will be
           the smallest item (thus #1);
         * if we are selecting rows where OrderNumber <= @OrderNumber,
           the order becomes by -OrderNumber and @OrderNumber again
           becomes the top ranked item, because its negative counterpart,
           -@OrderNumber, will again be the smallest one
      */
    )
  FROM atable
  WHERE (OrderNumber >= @OrderNumber AND @JumpBy > 0)
     OR (OrderNumber <= @OrderNumber AND @JumpBy < 0)
),
affected AS (
  /* Step #3: select only rows that need be affected */
  SELECT *
  FROM ranked
  WHERE rnk BETWEEN 1 AND ABS(@JumpBy) + 1
)
/* Step #4: self-join and update */
UPDATE old
SET OrderNumber = new.OrderNumber
FROM affected old
INNER JOIN affected new ON old.rnk = new.rnk % (ABS(@JumpBy) + 1) + 1
/* if old.rnk = 1, the corresponding new.rnk is N,
   because 1 = N MOD N + 1 (N is ABS(@JumpBy)+1),
   for old.rnk = 2 the matching new.rnk is 1: 2 = 1 MOD N + 1,
   for 3, it's 2 etc.
   this condition could alternatively be written like this:
   new.rnk = (old.rnk + ABS(@JumpBy) - 1) % (ABS(@JumpBy) + 1) + 1
*/
Note: this assumes SQL Server 2005 or later version.
One known issue with this solution is that it will not "move" rows correctly if the specified ID cannot be moved exactly by the specified number of positions (for instance, if you want to move the topmost row up by any number of positions, or the second row by two or more positions etc.).
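The rotation in steps 1 to 4 may be easier to follow in procedural form. Below is a sketch, not the query above: it assumes Python's sqlite3 module, replaces the ROW_NUMBER ranking with ORDER BY plus LIMIT, and does the rotation in Python. The move_item name and the sample rows are hypothetical. If fewer rows are available than requested, this version simply moves the item to the end of the range, which differs from the SQL.

```python
import sqlite3


def move_item(conn, item_id, jump_by):
    """Move item_id by jump_by positions by rotating OrderNumber values."""
    if jump_by == 0:
        return
    (start,) = conn.execute(
        "SELECT OrderNumber FROM atable WHERE ID = ?", (item_id,)).fetchone()
    if jump_by > 0:
        query = ("SELECT ID, OrderNumber FROM atable WHERE OrderNumber >= ? "
                 "ORDER BY OrderNumber ASC LIMIT ?")
    else:
        query = ("SELECT ID, OrderNumber FROM atable WHERE OrderNumber <= ? "
                 "ORDER BY OrderNumber DESC LIMIT ?")
    # affected[0] is the item being moved, affected[-1] is its destination.
    affected = conn.execute(query, (start, abs(jump_by) + 1)).fetchall()
    ids = [row[0] for row in affected]
    numbers = [row[1] for row in affected]
    # Rotate: the moved item takes the last number, every other row takes
    # the number of the row ranked just before it.
    rotated = [numbers[-1]] + numbers[:-1]
    conn.executemany("UPDATE atable SET OrderNumber = ? WHERE ID = ?",
                     list(zip(rotated, ids)))
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE atable (ID INTEGER PRIMARY KEY, OrderNumber INTEGER)")
conn.executemany("INSERT INTO atable VALUES (?, ?)", [(1, 1), (2, 2), (3, 3), (6, 6)])
move_item(conn, 6, -1)  # one step up, despite the gap between 3 and 6
order = [r[0] for r in conn.execute("SELECT ID FROM atable ORDER BY OrderNumber")]
print(order)  # [1, 2, 6, 3]
```

Because the rows only exchange the OrderNumber values they already hold, gaps in the numbering do not matter: one call moves the item exactly one visible position.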
Ok - if I'm not mistaken, you want to defragment your OrderNumber.
What if you use ROW_NUMBER() for this ?
Example:
;WITH calc_cte AS (
SELECT
ID
, OrderNumber
, RowNo = ROW_NUMBER() OVER (ORDER BY ID)
FROM
dbo.[Order]
)
UPDATE
c
SET
OrderNumber = c.RowNo
FROM
calc_cte c
WHERE EXISTS (SELECT * FROM inserted i WHERE c.ID = i.ID)
Didn't want to reply my own question, but I believe I have found a solution.
Insert query:
INSERT INTO table (OrderNumber, col1, col2)
VALUES ((select count(*)+1 from table),val1,val2)
Delete trigger:
CREATE TRIGGER Cleanup_After_Delete ON table
AFTER DELETE AS
BEGIN
WITH rowtable AS (SELECT [ID], OrderNumber, rownum = ROW_NUMBER()
OVER (ORDER BY OrderNumber ASC) FROM table)
UPDATE rt SET OrderNumber = rt.rownum FROM rowtable rt
WHERE OrderNumber >= (SELECT MIN(OrderNumber) FROM deleted)
END
The trigger fires up after every delete and corrects all OrderNumbers above the deleted one (no gaps). This means that I can simply change the order of 2 records by switching their OrderNumbers.
This is a working solution for my problem; however, this one is also very good and perhaps more useful for others.
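The effect of the cleanup, consecutive OrderNumber values after a delete, can be illustrated outside a trigger. This is a sketch assuming Python's sqlite3 module; the items table and the close_gaps helper are hypothetical names, and the renumbering is done in Python rather than with ROW_NUMBER.

```python
import sqlite3


def close_gaps(conn):
    """Renumber OrderNumber as 1..n while keeping the current relative order."""
    ids = [r[0] for r in conn.execute(
        "SELECT ID FROM items ORDER BY OrderNumber, ID")]
    conn.executemany("UPDATE items SET OrderNumber = ? WHERE ID = ?",
                     [(pos, item_id) for pos, item_id in enumerate(ids, start=1)])
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (ID INTEGER PRIMARY KEY, OrderNumber INTEGER)")
conn.executemany("INSERT INTO items VALUES (?, ?)", [(i, i) for i in range(1, 6)])
conn.execute("DELETE FROM items WHERE ID IN (2, 4)")
close_gaps(conn)
numbering = conn.execute("SELECT ID, OrderNumber FROM items ORDER BY ID").fetchall()
print(numbering)  # [(1, 1), (3, 2), (5, 3)]
```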
The problem:
I have a table that records data rows in foo. Each time the row is updated, a new row is inserted along with a revision number. The table looks like:
id  rev  field
1   1    test1
2   1    fsdfs
3   1    jfds
1   2    test2
Note: the last record is a newer version of the first row.
Is there an efficient way to query for the latest version of a record and for a specific version of a record?
For instance, a query for rev=2 would return the 2, 3 and 4th row (not the replaced 1st row though) while a query for rev=1 yields those rows with rev <= 1 and in case of duplicated ids, the one with the higher revision number is chosen (record: 1, 2, 3).
I would prefer not to build the result iteratively.
To get only latest revisions:
SELECT * from foo t1
WHERE t1.rev =
(SELECT max(rev) FROM foo t2 WHERE t2.id = t1.id)
To get a specific revision, in this case 1 (and if an item doesn't have the revision yet the next smallest revision):
SELECT * from foo t1
WHERE t1.rev =
(SELECT max(rev)
FROM foo t2
WHERE t2.id = t1.id
AND t2.rev <= 1)
It might not be the most efficient way to do this, but right now I cannot figure a better way to do this.
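Both queries can be checked against the sample rows from the question. The sketch below assumes Python's sqlite3 module as a stand-in database; rows_as_of is a hypothetical helper name, and the ORDER BY is added only to make the output stable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE foo (id INTEGER, rev INTEGER, field TEXT)")
conn.executemany("INSERT INTO foo VALUES (?, ?, ?)",
                 [(1, 1, "test1"), (2, 1, "fsdfs"), (3, 1, "jfds"), (1, 2, "test2")])


def rows_as_of(conn, rev):
    """Return, for each id, the row with the highest revision not above rev."""
    return conn.execute(
        """
        SELECT id, rev, field FROM foo AS t1
        WHERE t1.rev = (SELECT max(t2.rev) FROM foo AS t2
                        WHERE t2.id = t1.id AND t2.rev <= ?)
        ORDER BY id
        """, (rev,)).fetchall()


print(rows_as_of(conn, 2))  # [(1, 2, 'test2'), (2, 1, 'fsdfs'), (3, 1, 'jfds')]
print(rows_as_of(conn, 1))  # [(1, 1, 'test1'), (2, 1, 'fsdfs'), (3, 1, 'jfds')]
```

Passing a revision at least as high as the largest one in the table gives the "latest version" result, so one function covers both cases.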
Here's an alternative solution that incurs an update cost but is much more efficient for reading the latest data rows as it avoids computing MAX(rev). It also works when you're doing bulk updates of subsets of the table. I needed this pattern to ensure I could efficiently switch to a new data set that was updated via a long running batch update without any windows of time where we had partially updated data visible.
Aging
Replace the rev column with an age column
Create a view of the current latest data with filter: age = 0
To create a new version of your data ...
INSERT: new rows with age = -1 - This was my slow long running batch process.
UPDATE: UPDATE table-name SET age = age + 1 for all rows in the subset. This switches the view to the new latest data (age = 0) and also ages older data in a single transaction.
DELETE: rows having age > N in the subset - Optionally purge old data
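The INSERT / UPDATE / DELETE cycle above might look like this in practice. It is a sketch under assumptions: Python's sqlite3 module as the database, hypothetical names prices_all (the renamed table) and prices (the view), and a publish helper that ages the whole table rather than a subset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE prices_all (id INTEGER, age INTEGER, field TEXT)")
# Non-unique composite index on (age, id), as recommended for this pattern.
conn.execute("CREATE INDEX idx_age_id ON prices_all (age, id)")
# Readers only ever see the current version through the view.
conn.execute("CREATE VIEW prices AS SELECT id, field FROM prices_all WHERE age = 0")
conn.executemany("INSERT INTO prices_all VALUES (?, 0, ?)", [(1, "old1"), (2, "old2")])
conn.commit()


def publish(conn, new_rows, keep=1):
    """Stage new_rows, then switch the view to them in a single transaction."""
    # Staged rows have age = -1, so the view does not show them yet.
    conn.executemany("INSERT INTO prices_all VALUES (?, -1, ?)", new_rows)
    conn.commit()
    with conn:  # commits on success, rolls back if an exception is raised
        conn.execute("UPDATE prices_all SET age = age + 1")
        conn.execute("DELETE FROM prices_all WHERE age > ?", (keep,))


before = conn.execute("SELECT id, field FROM prices ORDER BY id").fetchall()
publish(conn, [(1, "new1"), (2, "new2")])
after = conn.execute("SELECT id, field FROM prices ORDER BY id").fetchall()
print(before, after)
```

The slow part (the staging insert) happens while readers still see the old rows; only the short UPDATE flips which version the view exposes.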
Indexing
Create a composite index with age and then id so the view will be nice and fast and can also be used to look up by id. Although this key is effectively unique, it's temporarily non-unique while you're aging the rows (during UPDATE SET age=age+1), so you'll need to make it non-unique and ideally the clustered index. If you need to find all versions of a given id ordered by age, you may need an additional non-unique index on id then age.
Rollback
Finally ... Let's say you're having a bad day and the batch processing breaks. You can quickly revert to a previous data set version by running:
UPDATE table-name SET age = age - 1 -- Roll back a version
DELETE table-name WHERE age < 0 -- Clean up bad stuff
Existing Table
Suppose you have an existing table that now needs to support aging. You can use this pattern by first renaming the existing table, then add the age column and indexing and then create the view that includes the age = 0 condition with the same name as the original table name.
This strategy may or may not work depending on the nature of technology layers that depended on the original table but in many cases swapping a view for a table should drop in just fine.
Notes
I recommend naming the age column RowAge in order to indicate this pattern is being used, since it's clearer that it's a database-related value and it complements SQL Server's RowVersion naming convention. It also won't conflict with a column or view that needs to return a person's age.
Unlike other solutions, this pattern works for non SQL Server databases.
If the subsets you're updating are very large, then this might not be a good solution: your final transaction will update not just the current records but all past versions of the records in this subset (which could even be the entire table!), so you may end up locking the table.
This is how I would do it. ROW_NUMBER() requires SQL Server 2005 or later
Sample data:
DECLARE @foo TABLE (
  id int,
  rev int,
  field nvarchar(10)
)
INSERT @foo VALUES
  ( 1, 1, 'test1' ),
  ( 2, 1, 'fdsfs' ),
  ( 3, 1, 'jfds' ),
  ( 1, 2, 'test2' )
The query:
DECLARE @desiredRev int
SET @desiredRev = 2
SELECT * FROM (
  SELECT
    id,
    rev,
    field,
    ROW_NUMBER() OVER (PARTITION BY id ORDER BY rev DESC) rn
  FROM @foo WHERE rev <= @desiredRev
) numbered
WHERE rn = 1
The inner SELECT returns all relevant records, and within each id group (that's the PARTITION BY), computes the row number when ordered by descending rev.
The outer SELECT just selects the first member (so, the one with highest rev) from each id group.
Output when @desiredRev = 2 :
id          rev         field      rn
----------- ----------- ---------- --------------------
1           2           test2      1
2           1           fdsfs      1
3           1           jfds       1
Output when @desiredRev = 1 :
id          rev         field      rn
----------- ----------- ---------- --------------------
1           1           test1      1
2           1           fdsfs      1
3           1           jfds       1
If you want all the latest revisions of each field, you can use
SELECT C.rev, C.field FROM (
SELECT MAX(A.rev) AS rev, A.id
FROM yourtable A
GROUP BY A.id)
AS B
INNER JOIN yourtable C
ON B.id = C.id AND B.rev = C.rev
In the case of your example, that would return
rev field
1 fsdfs
1 jfds
2 test2
SELECT
MaxRevs.id,
revision.field
FROM
(SELECT
id,
MAX(rev) AS MaxRev
FROM revision
GROUP BY id
) MaxRevs
INNER JOIN revision
ON MaxRevs.id = revision.id AND MaxRevs.MaxRev = revision.rev
SELECT foo.* from foo
left join foo as later
on foo.id=later.id and later.rev>foo.rev
where later.id is null;
How about this?
select id, max(rev), field from foo group by id
For querying specific revision e.g. revision 1,
select id, max(rev), field from foo where rev <= 1 group by id
I have a loanTable that contains two fields, loan_id and status:
loan_id status
==============
1 0
2 9
1 6
5 3
4 5
1 4 <-- How do I select this??
4 6
In this situation I need to show the last status of loan_id 1, i.e. status 4. Can you please help me with this query?
Since the 'last' row for ID 1 is neither the minimum nor the maximum, you are living in a state of mild confusion. Rows in a table have no order. So, you should be providing another column, possibly the date/time when each row is inserted, to provide the sequencing of the data. Another option could be a separate, automatically incremented column which records the sequence in which the rows are inserted. Then the query can be written.
If the extra column is called status_id, then you could write:
SELECT L1.*
FROM LoanTable AS L1
WHERE L1.Status_ID = (SELECT MAX(Status_ID)
FROM LoanTable AS L2
WHERE L2.Loan_ID = 1);
(The table aliases L1 and L2 could be omitted without confusing the DBMS or experienced SQL programmers.)
As it stands, there is no reliable way of knowing which is the last row, so your query is unanswerable.
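Once such a sequencing column exists, the lookup is straightforward. The sketch below assumes Python's sqlite3 module and an auto-incremented status_id column filled in insertion order; the last_status helper is a hypothetical name, and the rows are the ones from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# status_id is the extra, automatically incremented column that records
# the order in which rows were inserted.
conn.execute("""CREATE TABLE LoanTable (
    status_id INTEGER PRIMARY KEY AUTOINCREMENT,
    loan_id   INTEGER,
    status    INTEGER)""")
conn.executemany("INSERT INTO LoanTable (loan_id, status) VALUES (?, ?)",
                 [(1, 0), (2, 9), (1, 6), (5, 3), (4, 5), (1, 4), (4, 6)])


def last_status(conn, loan_id):
    """Return the most recently inserted status for loan_id, or None."""
    row = conn.execute(
        """
        SELECT status FROM LoanTable
        WHERE status_id = (SELECT max(status_id) FROM LoanTable
                           WHERE loan_id = ?)
        """, (loan_id,)).fetchone()
    return row[0] if row else None


print(last_status(conn, 1))  # 4
```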
Does your table happen to have a primary id or a timestamp? If not then what you want is not really possible.
If yes then:
SELECT TOP 1 status
FROM loanTable
WHERE loan_id = 1
ORDER BY primaryId DESC
-- or
-- ORDER BY yourTimestamp DESC
I assume that with "last status" you mean the record that was inserted most recently? AFAIK there is no way to make such a query unless you add timestamp into your table where you store the date and time when the record was added. RDBMS don't keep any internal order of the records.
If "last" means last inserted, that is not possible with the current schema until a primary key is added; once it is:
select top 1 status, loan_id
from loanTable
where loan_id = 1
order by id desc -- PK
Use a data reader and remember the value from each row as you loop; when the while loop exits, the variable holds the value from the last row read. As the other posters stated, unless you put a sort on the query, the row order could change. Even if there is a clustered index on the table, it might not return the rows in that order (without a sort on the clustered index).
SqlDataReader rdr = SQLcmd.ExecuteReader();
string lastVal = null;
while (rdr.Read())
{
    // Overwritten on every row, so after the loop it holds the last row's value.
    lastVal = rdr[0].ToString();
}
rdr.Close();
You could also use a ROW_NUMBER() but that requires a sort and you cannot use ROW_NUMBER() directly in the Where. But you can fool it by creating a derived table. The rdr solution above is faster.
In oracle database this is very simple.
select * from (select * from loanTable order by rownum desc) where rownum=1
Hi if this has not been solved yet.
To get the last record for any field from a table, the easiest way is to add an ID to each record, say pID. Say you would also like to get the last record for each 'Name'; then run this simple query:
SELECT Name, MAX(pID) as LastID
INTO [TableName]
FROM [YourTableName]
GROUP BY [Name]/[Any other field you would like your last records to appear by]
You should now have a table containing the Names in one column and the last available ID for that Name.
Now you can use a join to get the other details from your primary table, say this is some price or date then run the following:
SELECT a.*,b.Price/b.date/b.[Whatever other field you want]
FROM [TableName] a LEFT JOIN [YourTableName] b
ON a.Name = b.Name and a.LastID = b.pID
This should then give you the last records for each Name, for the first record run the same queries as above just replace the Max by Min above.
This should be easy to follow and should run quicker as well
If you don't have any identifying columns you could use to get the insert order, you can always do it like this. But it's hacky, and not very pretty.
select
t.row1,
t.row2,
ROW_NUMBER() OVER (ORDER BY t.[count]) AS rownum from (
select
tab.row1,
tab.row2,
1 as [count]
from table tab) t
So basically you get the 'natural order' if you can call it that, and add some column with all the same data. This can be used to sort by the 'natural order', giving you an opportunity to place a row number column on the next query.
Personally, if the system you are using hasn't got a time stamp/identity column, and the current users are using the 'natural order', I would quickly add a column and use this query to create some sort of time stamp/incremental key. Rather than risking having some automation mechanism change the 'natural order', breaking the data needed.
I think this code may help you:
WITH cte_Loans
AS
(
SELECT LoanID
,[Status]
,ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS RN
FROM LoanTable
)
SELECT LoanID
,[Status]
FROM cte_Loans L1
WHERE RN = ( SELECT max(RN)
FROM cte_Loans L2
WHERE L2.LoanID = L1.LoanID)