Rails way to reset seed on id field - sql

I have found the "pure SQL" answers to this question. Is there a way, in Rails, to reset the id field for a specific table?
Why do I want to do this? Because I have tables with constantly changing data: rarely more than 100 rows, but always different ones. The id is up to 25k now, and there's just no point in that. I intend to use a scheduler internal to the Rails app (rufus-scheduler) to run the id field reset monthly or so.

You never mentioned what DBMS you're using. If this is PostgreSQL, the ActiveRecord Postgres adapter has a reset_pk_sequence! method that you could use:
ActiveRecord::Base.connection.reset_pk_sequence!('table_name')

I came up with a solution based on hgimenez's answer and this other one.
Since I usually work with either SQLite or PostgreSQL, I've only implemented it for those; extending it to, say, MySQL shouldn't be too troublesome.
Put this inside lib/ and require it in an initializer:
# lib/active_record/add_reset_pk_sequence_to_base.rb
module ActiveRecord
  class Base
    def self.reset_pk_sequence
      case ActiveRecord::Base.connection.adapter_name
      when 'SQLite'
        new_max = maximum(primary_key) || 0
        update_seq_sql = "update sqlite_sequence set seq = #{new_max} where name = '#{table_name}';"
        ActiveRecord::Base.connection.execute(update_seq_sql)
      when 'PostgreSQL'
        ActiveRecord::Base.connection.reset_pk_sequence!(table_name)
      else
        raise "Task not implemented for this DB adapter"
      end
    end
  end
end
Usage:
Client.count # 10
Client.destroy_all
Client.reset_pk_sequence
Client.create(:name => 'Peter') # this client will have id=1
EDIT: Since the most common reason to do this is having just cleared a database table, I recommend taking a look at database_cleaner. It handles the id resetting automatically. You can tell it to truncate just selected tables like this:
DatabaseCleaner.clean_with(:truncation, :only => %w[clients employees])
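For anyone who wants to see what the SQLite branch above does at the database level, here is a minimal sketch using Python's stdlib sqlite3 module instead of Rails. The clients table and the reset_pk_sequence helper are made up for illustration, and it assumes the table was created with AUTOINCREMENT, since only those tables get a row in sqlite_sequence.

```python
import sqlite3


def reset_pk_sequence(conn, table):
    """Set the AUTOINCREMENT counter for `table` to its current MAX(id)."""
    # Identifiers cannot be bound as parameters, so quote the table name by hand.
    quoted = '"' + table.replace('"', '""') + '"'
    new_max = conn.execute(
        f"SELECT COALESCE(MAX(id), 0) FROM {quoted}"
    ).fetchone()[0]
    conn.execute(
        "UPDATE sqlite_sequence SET seq = ? WHERE name = ?", (new_max, table)
    )
    conn.commit()
    return new_max


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clients (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT)"
)
conn.executemany("INSERT INTO clients (name) VALUES (?)", [("a",), ("b",), ("c",)])
conn.execute("DELETE FROM clients")  # the counter stays at 3 after this
reset_pk_sequence(conn, "clients")
cur = conn.execute("INSERT INTO clients (name) VALUES ('Peter')")
next_id = cur.lastrowid
print(next_id)
```

Without the reset call, the new row would get id 4, because DELETE leaves sqlite_sequence untouched.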

I assume you don't care about the data:
def self.truncate!
  connection.execute("truncate table #{quoted_table_name}")
end
Or if you do, but not too much (there is a slice of time where the data only exists in memory):
def self.truncate_preserving_data!
  # NOTE: clone only yields an unsaved copy without an id on older Rails; on Rails 3.1+ use dup instead.
  data = all.map(&:clone).each { |r| raise "Record would not be able to be saved" unless r.valid? }
  connection.execute("truncate table #{quoted_table_name}")
  data.each(&:save)
end
This will give you new records with the same attributes, but with ids starting at 1.
Anything that belongs_to this table could end up pointing at the wrong rows, since the ids change.

Based on @hgmnz's answer, I made this method that will set the sequence to any value you like... (Only tested with the Postgres adapter.)
# change the database sequence to force the next record to have a given id
def set_next_id table_name, next_id
  connection = ActiveRecord::Base.connection
  def connection.set_next_id table, next_id
    pk, sequence = pk_and_sequence_for(table)
    quoted_sequence = quote_table_name(sequence)
    select_value <<-end_sql, 'SCHEMA'
      SELECT setval('#{quoted_sequence}', #{next_id}, false)
    end_sql
  end
  connection.set_next_id(table_name, next_id)
end

One problem is that these kinds of fields are implemented differently in different databases: sequences, auto-increments, etc.
You can always drop and re-add the table.

No, there is no such thing in Rails. If you need nice ids to show to users, store them in a separate table and reuse them.

You could only do this in rails if the _ids are being set by rails. As long as the _ids are being set by your database, you won't be able to control them without using SQL.
Side note: I guess using rails to regularly call a SQL procedure that resets or drops and recreates a sequence wouldn't be a purely SQL solution, but I don't think that is what you're asking...
EDIT:
Disclaimer: I don't know much about rails.
From the SQL perspective, if you have a table with columns id first_name last_name and you usually insert into table (first_name, last_name) values ('bob', 'smith') you can just change your queries to insert into table (id, first_name, last_name) values ([variable set by rails], 'bob', 'smith') This way, the _id is set by a variable, instead of being automatically set by SQL. At that point, rails has entire control over what the _ids are (although if it is a PK you need to make sure you don't use the same value while it's still in there).
If you are going to leave the assignment up to the database, you have to have rails run (on whatever time schedule) something like:
DROP SEQUENCE MY_SEQ;
CREATE SEQUENCE MY_SEQ START WITH 1 INCREMENT BY 1 MINVALUE 1;
against whatever sequence controls the ids for your table. This gets rid of the current sequence and creates a new one. It is the simplest way I know of to 'reset' a sequence.

The Rails way for, e.g., MySQL, at the cost of losing all data in the users table:
ActiveRecord::Base.connection.execute('TRUNCATE TABLE users;')
Maybe helps someone ;)

There are CounterCache methods:
https://www.rubydoc.info/docs/rails/4.1.7/ActiveRecord/CounterCache/ClassMethods
I used Article.reset_counters Article.all.length - 1 and it seemed to work.

Related

Django - SQL bulk get_or_create possible?

I am using get_or_create to insert objects into the database, but the problem is that doing 1000 at once takes too long.
I tried bulk_create but it doesn't provide functionality I need (creates duplicates, ignores unique value, doesn't trigger post_save signals I need).
Is it even possible to do get_or_create in bulk via customized sql query?
Here is my example code:
related_data = json.loads(urllib2.urlopen(final_url).read())
for item in related_data:
    kw = item['keyword']
    e, c = KW.objects.get_or_create(KWuser=kw, author=author)
    e.project.add(id)
    # Add m2m to parent project
related_data contains 1000 rows looking like this:
[{"cmp":0,"ams":3350000,"cpc":0.71,"keyword":"apple."},
{"cmp":0.01,"ams":3350000,"cpc":1.54,"keyword":"apple -10810"}......]
The KW model also sends a signal that I use to create another parent model:
@receiver(post_save, sender=KW)
def grepw(sender, **kwargs):
    if kwargs.get('created', False):
        id = kwargs['instance'].id
        kww = kwargs['instance'].KWuser
        # KeyO
        a, b = KeyO.objects.get_or_create(defaults={'keyword': kww}, keyword__iexact=kww)
        KW.objects.filter(id=id).update(KWF=a.id)
This works, but as you can imagine, doing thousands of rows at once takes a long time and even crashes my tiny server. What bulk options do I have?
As of Django 2.2, bulk_create has an ignore_conflicts flag. Per the docs:
On databases that support it (all but Oracle), setting the ignore_conflicts parameter to True tells the database to ignore failure to insert any rows that fail constraints such as duplicate unique values
This post may be of use to you:
stackoverflow.com/questions/3395236/aggregating-saves-in-django
Note that the answer recommends using the commit_on_success decorator, which is deprecated. It has been replaced by the transaction.atomic decorator; see the Django documentation on transactions.
from django.db import transaction
@transaction.atomic
def lot_of_saves(queryset):
    for item in queryset:
        modify_item(item)
        item.save()
If I understand correctly, "get_or_create" means SELECT or INSERT on the Postgres side.
You have a table with a UNIQUE constraint or index and a large number of rows to either INSERT (if not there yet), getting the newly created ID, or otherwise SELECT the ID of the existing row. This is not as simple as it may seem from the outside. With concurrent write load, the matter is even more complicated.
And there are various parameters that need to be defined (how to handle conflicts exactly):
How to use RETURNING with ON CONFLICT in PostgreSQL?
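To make the "SELECT or INSERT" idea concrete, here is a rough sketch of a bulk get-or-create in plain SQL, run through Python's stdlib sqlite3 rather than Django or Postgres. The kw table, its columns, and the bulk_get_or_create helper are invented for the example; SQLite's INSERT OR IGNORE stands in for whatever conflict handling your database offers. Note that, like bulk_create, nothing here fires post_save signals, and it does not address concurrent writers.

```python
import sqlite3


def bulk_get_or_create(conn, keywords):
    """Return {keyword: id}, inserting only the keywords that are missing."""
    unique = list(dict.fromkeys(keywords))  # drop duplicates, keep order
    if not unique:
        return {}
    # Rows that would violate the UNIQUE constraint are silently skipped.
    conn.executemany(
        "INSERT OR IGNORE INTO kw (keyword) VALUES (?)", [(k,) for k in unique]
    )
    # One SELECT fetches the ids of both pre-existing and new rows.
    # Very large batches may need chunking: SQLite caps bound parameters
    # per statement (999 in older builds).
    placeholders = ",".join("?" * len(unique))
    rows = conn.execute(
        f"SELECT keyword, id FROM kw WHERE keyword IN ({placeholders})", unique
    ).fetchall()
    conn.commit()
    return dict(rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kw (id INTEGER PRIMARY KEY, keyword TEXT UNIQUE)")
conn.execute("INSERT INTO kw (keyword) VALUES ('apple')")  # already present
ids = bulk_get_or_create(conn, ["apple", "pear", "apple"])
print(ids)
```

The point of the design is two round trips per batch instead of two per row.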

Generating serial numbers in Rails ActiveRecord

In a Rails app I need a source of unique, sequential (no gaps) integers to use as serial numbers. It must be persistent and allow concurrent access.
Database auto-increment isn't adequate because most don't guarantee the "no gaps" property.
In straight SQL I would just create a one-line table and say (in PostgreSQL) something like:
update sequence set value = value + 1 returning value
This is apparently standard practice in the SQL world. References exist.
In ActiveRecord I easily created the model and found .increment! and .increment_counter in the documentation. But I can't figure out how to atomically retrieve the incremented value. Locks and transactions don't seem to help.
Since update ... returning acts like a select for output purposes, it turns out you can use find_by_sql to both update and get the updated value in one operation.
class SequenceNumber < ActiveRecord::Base
  attr_accessible :tag, :value
  validates :tag, :uniqueness => true
  def self.get_next(tag)
    # Pass tag as a bind value instead of interpolating it into the SQL string.
    find_by_sql(["update sequence_numbers
                  set value = value + 1
                  where tag = ?
                  returning value", tag]).first.value
  end
end
The remaining problem is that this is totally non-portable because returning is a pgsql extension. Maybe the ActiveRecord developers will notice this.
If you want to use Redis (and maybe you already are because of Resque or Sidekiq), you can do an INCR on a key; this is atomic and returns the new value.
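Where RETURNING isn't available, one possible workaround is to do the UPDATE and the read-back inside a single write transaction. Below is a small sketch of that idea using Python's stdlib sqlite3; the sequence_numbers table mirrors the one above, while the next_serial helper and the 'invoice' tag are hypothetical. BEGIN IMMEDIATE is SQLite-specific; other databases would need their own way of taking the lock up front.

```python
import sqlite3


def next_serial(conn, tag):
    """Bump the counter for `tag` and return the new value in one transaction."""
    conn.execute("BEGIN IMMEDIATE")  # take the write lock before touching the row
    try:
        cur = conn.execute(
            "UPDATE sequence_numbers SET value = value + 1 WHERE tag = ?", (tag,)
        )
        if cur.rowcount != 1:
            raise KeyError(tag)  # unknown tag: nothing was incremented
        value = conn.execute(
            "SELECT value FROM sequence_numbers WHERE tag = ?", (tag,)
        ).fetchone()[0]
        conn.execute("COMMIT")
        return value
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None puts the connection in autocommit mode,
# so the explicit BEGIN/COMMIT above are the only transaction boundaries.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE sequence_numbers (tag TEXT PRIMARY KEY, value INTEGER NOT NULL)"
)
conn.execute("INSERT INTO sequence_numbers VALUES ('invoice', 0)")
serials = [next_serial(conn, "invoice") for _ in range(3)]
print(serials)
```

To keep the "no gaps" property, the serial should be consumed in the same transaction that uses it; otherwise a later failure still leaves a hole.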

Rails/Active Record .save! efficiency question

New to rails/ruby (using rails 3 and ruby 1.9.2), and am trying to get rid of some unnecessary queries being executed.
When I'm running an each do:
apples.to_a.each do |apple|
  new_apple = apple.clone
  new_apple.save!
end
and I check the sql LOG, I see three select statements followed by one insert statement. The select statements seem completely unnecessary. For example, they're something like:
SELECT Fruit.* from Fruits where Fruit.ID = 5 LIMIT 1;
SELECT Color.* from Colors where Color.ID = 6 LIMIT 1;
SELECT TreeType.* from TreeTypes where TreeType.ID = 7 LIMIT 1;
INSERT into Apples (Fruit_id, color_id, treetype_id) values (6, 7, 8) RETURNING "id";
Seemingly, this wouldn't take much time, but when I've got 70k inserts to run, I'm betting those three selects for each insert will take up a decent amount of time.
So I'm wondering the following:
Is this typical of ActiveRecord/Rails .save! method, or did the previous developer add some sort of custom code?
Would those three select statements, being executed for each item, cause a noticeable amount of extra time?
If it is built into rails/active record, would it be easily bypassed, if that would make it run more efficiently?
You must be validating your associations on save for such a thing to occur, something like this:
class Apple < ActiveRecord::Base
  validates :fruit,
    :presence => true
end
In order to validate the relationship, the associated record must be loaded, and this needs to happen for each validation individually, for each record in turn. That's the standard behavior of save!
You could save without validations if you feel like living dangerously:
apples.to_a.each do |apple|
  new_apple = apple.clone
  new_apple.save(:validate => false)
end
The better approach is to manipulate the records directly in SQL by doing a mass insert if your RDBMS supports it. For instance, MySQL will let you insert thousands of rows with one INSERT call. You can usually do this through the Apple.connection access layer, which allows you to make arbitrary SQL calls with things like execute.
I'm guessing that there is a before_save EDIT: (or a validation as suggested above) method being called that is looking up the color and type of the fruit and storing that with the rest of the attributes when the fruit is saved - in which case these lookups are necessary ...
Normally I wouldn't expect activerecord to do unnecessary lookups - though that does not mean it is always efficient ...
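As an illustration of the mass-insert suggestion, here is a sketch that packs many rows into each INSERT statement, written against Python's stdlib sqlite3 rather than ActiveRecord. The apples table layout is guessed from the log above, and bulk_insert and its chunk_size default are made-up names and values; the chunk size is kept small only to stay under SQLite's limit on bound parameters.

```python
import sqlite3


def bulk_insert(conn, rows, chunk_size=300):
    """Insert (fruit_id, color_id, treetype_id) tuples, many rows per INSERT."""
    for start in range(0, len(rows), chunk_size):
        chunk = rows[start:start + chunk_size]
        # One "(?, ?, ?)" group per row, all sent in a single statement.
        values = ",".join(["(?, ?, ?)"] * len(chunk))
        flat = [value for row in chunk for value in row]
        conn.execute(
            "INSERT INTO apples (fruit_id, color_id, treetype_id) "
            f"VALUES {values}",
            flat,
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE apples ("
    "id INTEGER PRIMARY KEY, fruit_id INTEGER, color_id INTEGER, treetype_id INTEGER)"
)
rows = [(i, i + 1, i + 2) for i in range(1000)]
bulk_insert(conn, rows)
count = conn.execute("SELECT COUNT(*) FROM apples").fetchone()[0]
print(count)
```

This skips validations and callbacks entirely, so it only makes sense when the source rows are already known to be valid.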

How to reuse deleted model id number in Rails?

Say I have a Post model. When I delete last post 'Post 24', I want the next post to take id of Post 24 and not Post 25.
I want to show id in views and I don't want missing numbers. How do I do that?
Thanks for your help.
The purpose of an id is to be nothing more than an internal identifier. It shouldn't be used publicly at all. This isn't a Rails thing, but a database issue. MySQL won't reclaim id's because it can lead to very serious complications in your app. If a record is deleted, its id is laid to rest forevermore, so that no future record will be mistaken for it.
However, there is a way to do what you want. I believe you want a position integer column instead. Add that to your model/table, and then install the acts_as_list plugin.
Install it the usual way:
script/plugin install git://github.com/rails/acts_as_list.git
Then add the "hook" to your model:
class Post < ActiveRecord::Base
  acts_as_list
end
Now the position column of your post model will automatically track itself, with no sequence gaps. It'll even give you some handy methods for re-ordering if you so choose.
Conversely, you could let the SQL do this itself:
SELECT rownum AS id, [whatever other columns you want]
FROM posts_table
WHERE [conditions]
ORDER BY [ordering conditions]
This will add numbers to each row without skipping any like you said.
NOTE: I use Oracle. I don't know if this exact code will work in other flavors.
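If your database has no rownum, the same display numbering can be done in application code when rendering. A tiny sketch with Python's stdlib sqlite3 follows; the posts table and its contents are invented for the example. The stored ids keep their gap, and only the numbers shown to the user are consecutive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO posts (title) VALUES (?)", [("first",), ("second",), ("third",)]
)
conn.execute("DELETE FROM posts WHERE id = 2")  # leaves a gap in the ids

rows = conn.execute("SELECT id, title FROM posts ORDER BY id").fetchall()
# Number the rows at display time instead of exposing the primary key.
numbered = [(number, title) for number, (_id, title) in enumerate(rows, start=1)]
print(numbered)
```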

SQL to search and replace in mySQL

I'm in the process of fixing a poorly imported database, with issues caused by using the wrong database encoding or something like that.
Anyway, coming back to my question: in order to fix these issues I'm using a query of this form:
UPDATE table_name SET field_name =
replace(field_name,'search_text','replace_text');
Thus, if the table I'm working on has multiple columns, I have to run this query for each of the columns. And since there is more than one pair of things to find and replace, I have to run the query for each of these pairs as well.
So as you can imagine, I end up running tens of queries just to fix one table.
What I was wondering is whether there is a way to combine multiple find-and-replaces in one query: say, look for this set of things and, if found, replace each with the corresponding item from this other set of things.
Or whether there is a way to make a query of the form shown above run somehow recursively for each column of a table, regardless of their names or number.
Thank you in advance for your support,
titel
Let's try and tackle each of these separately:
If the set of replacements is the same for every column in every table that you need to do this on (or there are only a couple of patterns), consider creating a user-defined function that takes a varchar and returns a varchar that just calls replace(replace(@input,'search1','replace1'),'search2','replace2') nested as appropriate.
To update multiple columns at the same time you should be able to do UPDATE table_name SET field_name1 = replace(field_name1,...), field_name2 = replace(field_name2,...) or something similar.
As for running something like that for every column in every table, I'd think it would be easiest to write some code which fetches a list of columns and generates the queries to execute from that.
I don't know of a way to automatically run a search-and-replace on each column; however, the problem of multiple pairs of search and replace terms in a single UPDATE query is easily solved by nesting calls to replace():
UPDATE table_name SET field_name =
  replace(
    replace(
      replace(
        field_name,
        'foo',
        'bar'
      ),
      'see',
      'what'
    ),
    'I',
    'mean?'
  )
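Writing those nested calls by hand gets tedious, so the query can be generated instead. Here is a sketch in Python that builds the nested replace() expression from a list of pairs and applies it to several columns in one UPDATE. The helper names quote, nested_replace, and build_update are made up, table and column names are assumed to be trusted (they are not escaped), and the string quoting assumes MySQL's default backslash-escape behavior.

```python
def quote(value):
    """Quote a string literal for MySQL (backslash and single quote escaped)."""
    return "'" + value.replace("\\", "\\\\").replace("'", "''") + "'"


def nested_replace(column, pairs):
    """Wrap `column` in one replace() call per (search, replacement) pair."""
    expr = column
    for search, replacement in pairs:
        expr = f"replace({expr}, {quote(search)}, {quote(replacement)})"
    return expr


def build_update(table, columns, pairs):
    """Build a single UPDATE that applies every pair to every listed column."""
    assignments = ", ".join(
        f"{col} = {nested_replace(col, pairs)}" for col in columns
    )
    return f"UPDATE {table} SET {assignments};"


pairs = [("foo", "bar"), ("see", "what")]
sql = build_update("table_name", ["field_name1", "field_name2"], pairs)
print(sql)
```

The pairs are applied in list order, innermost first, so an earlier replacement can be picked up by a later one; order the list accordingly.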
If you have multiple replaces of different text in the same field, I recommend that you create a table with the current values and what you want them replaced with. (Could be a temp table of some kind if this is a one-time deal; if not, make it a permanent table.) Then join to that table and do the update.
Something like:
update t1
set field1 = t2.newvalue
from table1 t1
join mycrossreferncetable t2 on t1.field1 = t2.oldvalue
Sorry, I didn't notice this is MySQL. The code is what I would use in SQL Server; the MySQL syntax may be different, but the technique would be similar.
I wrote a stored procedure that does this. I use this on a per database level, although it would be easy to abstract it to operate globally across a server.
I would just paste this inline, but it would seem that I'm too dense to figure out how to use the markdown deal, so the code is here:
http://www.anovasolutions.com/content/mysql-search-and-replace-stored-procedure