ThinkingSphinx3: How to prevent searchd threads from freezing? - ruby-on-rails-3

I am using Rails 3.2.12, RSpec-rails 2.13.0 and ThinkingSphinx 3.0.10
The problem:
When I run bundle exec rspec spec/controllers/ads_controller_spec.rb, Thinking Sphinx spawns 3 searchd processes which become frozen; my tests lock up until I manually kill the searchd processes, after which the tests continue running.
The setup:
Here is my sphinx_env.rb file, in which I set up TS for testing:
require 'thinking_sphinx/test'
def sphinx_environment(*tables, &block)
  obj = self
  begin
    before(:all) do
      obj.use_transactional_fixtures = false
      ThinkingSphinx::Test.init
      ThinkingSphinx::Test.start
      sleep(0.5)
    end
    yield
  ensure
    after(:all) do
      ThinkingSphinx::Test.stop
      sleep(0.5)
      obj.use_transactional_fixtures = true
    end
  end
end
Here is my test script:
describe "GET index" do
  before(:each) do
    @web_origin = FactoryGirl.create(:origin)
    @api_origin = FactoryGirl.create(:api_origin)
    @first_ad = FactoryGirl.create(:ad, :origin_id => @web_origin.id)
    ThinkingSphinx::Test.index # index ads created above
    sleep 0.5
  end

  sphinx_environment :ads do
    it 'should return a collection of all live ads' do
      get :index, {:format => 'json'}
      response.code.should == '200'
    end
  end
...
UPDATE
No progress made; however, here are some additional details:
When I run my tests, Thinking Sphinx always starts 3 searchd processes.
My test.sphinx.pid file always contains just one of the searchd PIDs; it's always the PID of the second searchd process.
Here is the output from my test.searchd.log file:
[ 568] binlog: finished replaying total 49 in 0.006 sec
[ 568] accepting connections
[ 568] caught SIGHUP (seamless=1, in queue=1)
[ 568] rotating index 'ad_core': started
[ 568] caught SIGHUP (seamless=1, in queue=2)
[ 568] caught SIGTERM, shutting down
Any help is appreciated; I have been trying to sort out this issue for over a day and am a bit lost.
Thanks.

Sphinx 2.0.x releases prior to 2.0.6 are buggy on OS X when running with threaded Sphinx workers (which is what Thinking Sphinx v3 uses, hence the multiple searchd processes). This was fixed in Sphinx 2.0.6, and it was one of the main things holding back TS v3 development: my own tests wouldn't run due to problems like the ones you've been seeing.
I'd recommend upgrading Sphinx to 2.0.6; I'm pretty sure that should resolve these issues.

Related

Rails test fails requests did not finish in 60 seconds

After upgrading Rails from 4.2 to 5.2, my test gets stuck on a request that works in the development server. I'm getting the following failure when running the test suite.
Failures:
1) cold end overview shows cold end stats
Failure/Error: example.run
RuntimeError:
Requests did not finish in 60 seconds
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara/server.rb:94:in `rescue in wait_for_pending_requests'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara/server.rb:91:in `wait_for_pending_requests'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara/session.rb:130:in `reset!'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara.rb:314:in `block in reset_sessions!'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara.rb:314:in `reverse_each'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara.rb:314:in `reset_sessions!'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara/rspec.rb:22:in `block (2 levels) in <top (required)>'
# ./spec/spec_helper.rb:43:in `block (3 levels) in <top (required)>'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/database_cleaner-1.6.2/lib/database_cleaner/generic/base.rb:16:in `cleaning'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/database_cleaner-1.6.2/lib/database_cleaner/base.rb:98:in `cleaning'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/database_cleaner-1.6.2/lib/database_cleaner/configuration.rb:86:in `block (2 levels) in cleaning'
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/database_cleaner-1.6.2/lib/database_cleaner/configuration.rb:87:in `cleaning'
# ./spec/spec_helper.rb:37:in `block (2 levels) in <top (required)>'
# ------------------
# --- Caused by: ---
# Timeout::Error:
# execution expired
# /home/asnad/.rvm/gems/ruby-2.5.0/gems/capybara-2.18.0/lib/capybara/server.rb:92:in `sleep'
Top 1 slowest examples (62.59 seconds, 97.0% of total time):
cold end overview shows cold end stats
62.59 seconds ./spec/features/cold_end_overview_spec.rb:13
Finished in 1 minute 4.51 seconds (files took 4.15 seconds to load)
1 example, 1 failure
My spec_helper.rb has these configurations:
RSpec.configure do |config|
  config.include FactoryBot::Syntax::Methods

  config.around(:each) do |example|
    DatabaseCleaner[:active_record].clean_with(:truncation)
    DatabaseCleaner.cleaning do
      if example.metadata.key?(:js) || example.metadata[:type] == :feature
        # VCR.configure { |c| c.ignore_localhost = true }
        WebMock.allow_net_connect!
        VCR.turn_off!
        VCR.eject_cassette
        example.run
      else
        # WebMock.disable_net_connect!
        VCR.turn_on!
        cassette_name = example.metadata[:full_description]
                               .split(/\s+/, 2)
                               .join('/')
                               .underscore.gsub(/[^\w\/]+/, '_')
        # VCR.configure { |c| c.ignore_localhost = false }
        VCR.use_cassette(cassette_name) { example.run }
        VCR.turn_off!
        WebMock.allow_net_connect!
      end
    end
  end

  config.expect_with :rspec do |expectations|
    expectations.include_chain_clauses_in_custom_matcher_descriptions = true
  end

  config.mock_with :rspec do |mocks|
    mocks.verify_partial_doubles = true
  end

  config.filter_run :focus
  config.run_all_when_everything_filtered = true
  config.example_status_persistence_file_path = "spec/examples.txt"

  if config.files_to_run.one?
    config.default_formatter = 'doc'
  end

  # Print the 10 slowest examples and example groups at the
  # end of the spec run, to help surface which specs are running
  # particularly slow.
  config.profile_examples = 10

  # Run specs in random order to surface order dependencies. If you find an
  # order dependency and want to debug it, you can fix the order by providing
  # the seed, which is printed after each run.
  #     --seed 1234
  config.order = :random

  # Seed global randomization in this process using the `--seed` CLI option.
  # Setting this allows you to use `--seed` to deterministically reproduce
  # test failures related to randomization by passing the same `--seed` value
  # as the one that triggered the failure.
  Kernel.srand config.seed
end

# Selenium::WebDriver.logger.level = :debug
# Selenium::WebDriver.logger.output = 'selenium.log'
Capybara.register_driver :selenium_chrome_headless do |app|
  capabilities = Selenium::WebDriver::Remote::Capabilities.chrome(
    chromeOptions: { args: %w[headless no-sandbox disable-dev-shm-usage disable-gpu window-size=1200,1500] },
    loggingPrefs: { browser: 'ALL' }
  )
  Capybara::Selenium::Driver.new(app, browser: :chrome, desired_capabilities: capabilities)
end
Chromedriver.set_version '2.39'
Capybara.javascript_driver = :selenium_chrome_headless
Capybara::Screenshot.prune_strategy = :keep_last_run
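For reference, the cassette_name derivation in the around(:each) hook above can be sketched standalone. This is an illustration only: ActiveSupport's String#underscore is approximated here with a plain downcase, so it is not the exact Rails behavior.

```ruby
# Rough standalone equivalent of the cassette_name derivation above;
# String#underscore is approximated with downcase.
def cassette_name_for(full_description)
  full_description.split(/\s+/, 2)  # e.g. ["Widget", "overview shows stats"]
                  .join('/')
                  .downcase
                  .gsub(%r{[^\w/]+}, '_')
end
```

So a description like "Widget overview shows stats" maps to the cassette path "widget/overview_shows_stats".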
In my spec, the line sign_in current_user takes too much time: it redirects to a page and doesn't get a response even after a long time, while it works in the development environment.
What could be the reason? If you need anything else, please comment.
I've just arrived here myself after upgrading from 4.2 to 5.1 and now 5.2, and I'm seeing the same thing in my testing: when I have frozen a test mid-request with binding.pry, I get the message Requests did not finish in 60 seconds. It's a bit of a story, so skip to the end for the tl;dr (I may have figured it out).
Now, I have upgraded all of the gems incrementally, so I can preserve the capacity to bisect and observe the source of interesting changes like this one. I only noticed this new 60-second timeout after changing over from chromedriver-helper (which reported it had been deprecated) to the new webdrivers gem that's taking over. But that seems unrelated: I searched webdrivers for any timeout or 60-second value and only found references to an unrelated Pull Request #60 (which fixes Issue #59).
I checked my gem source directory for this message, Requests did not finish in 60 seconds, and found it was not in fact in an older version of Capybara; it has been raised from versions dating back to at least 3.9.0, and in the most current version, 3.24.0, in lib/capybara/server.rb.
The object used there is a Timer; you can find its interface here, in the helper:
https://github.com/teamcapybara/capybara/blob/320ee96bb8f63ac9055f7522961a1e1cf8078a8a/lib/capybara/helpers.rb#L79
This particular message is raised from the method wait_for_pending_requests, which passes a hard-coded 60 into the :expire_in named parameter; afterwards, any errors that were encountered in the server thread are raised. This means the time is not configurable. Sixty seconds is probably a reasonable length of time to wait for an in-progress web request to complete, although it's a bit inconvenient for my test.
That method is only called in one place, reset!, which you can find defined here in capybara/session.rb: https://github.com/teamcapybara/capybara/blob/320ee96bb8f63ac9055f7522961a1e1cf8078a8a/lib/capybara/session.rb#L126
The reset! method is an interesting one that comes with some documentation about how it's used. The line @server&.wait_for_pending_requests calls wait_for_pending_requests if there is an active server thread in a request, and then raise_server_error!, which similarly acts only if @server&.error is truthy.
Now we find that reset! comes with two aliases: the reset! message is received whenever Capybara calls cleanup! or reset_session!. At this point we can probably understand what happened, but it's still a little mysterious, since I've been using chromedriver-helper and Selenium testing for several years and never recall seeing this 60-second timeout before. I'm hesitant to point the finger at webdrivers, but I don't have any other answer for why this timeout is new; I haven't really done anything that could account for it besides upgrading to this gem (and the other gems) and clearing out deprecation warnings.
It seems possible that in Rails 5.1+, Capybara calls reset! a lot more, maybe more than just between test examples. Especially when you read the documentation of the method, think about the single-page focus there is now, and consider all of the things the reset! documentation tells you it doesn't reset: browser cache, HTML5 local storage, IndexedDB, Web SQL database, etc. Or maybe I'm imagining it, and this isn't new. I'd guess there are a lot of driver-dependent ways it can call reset! without landing in this timeout code.
Did you change to webdrivers gem by any chance when you did your Rails upgrade?
Edit: I reverted to chromedriver-helper just to be sure, and that wasn't it. What's actually happening is that my test is failing in one thread, but the server has left a binding.pry session open. Capybara has moved on to the next test and, to get a fresh session, has called reset!; but 60 seconds later I am still in my pry session, and the server is still not ready to serve a root request. I have a feeling the threading behavior of Capybara has changed: in my memory, a pry session opened during a server request would block the test from failing until it had returned. But that's apparently not what's happening anymore.
How did you arrive here? I have no idea, unfortunately, but this is a fair description of what's happening when that message is received.

Rails 3.2.x: How to change logging levels without restarting the application

I would like to change the logging levels of a running Rails 3.2.x application without restarting the application. My intent is to use it to do short-time debugging and information gathering before reverting it to the usual logging level.
I also understand that the levels in ascending order are debug, info, warn, error, and fatal, and that production servers log info and higher, while development logs debug and higher.
If I run the following, will it change the logging level immediately?
Rails.logger.level = :debug # or :info, :warn, :error, :fatal
If so, can I do this by writing a Rake task to adjust the logging level, or do I need to support this by adding a route? For example in config/routes.rb:
match "/set_logging_level/:level/:secret" => "logcontroller#setlevel"
and then setting the levels in the LogController. (:level is the logging level, and :secret, which is shared between client and server, is something to prevent random users from tweaking the log levels.)
Which is more appropriate, rake task or /set_logging_level?
Why don't you use operating system signals for that? For example, on UNIX the USR1 and USR2 signals are free for your application to use:
config/initializers/signals.rb:
trap('USR1') do
  Rails.logger.level = Logger::DEBUG
end

trap('USR2') do
  Rails.logger.level = Logger::WARN
end
Then just do this:
kill -SIGUSR1 pid
kill -SIGUSR2 pid
Just make sure you don't override signals your server already uses; each server leverages various signals for things like log rotation, child process termination, and so on.
In the Rails console, you can simply do:
Rails.logger.level = :debug
Now all executed code will run with this log level.
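One caveat worth hedging: whether the symbol form works depends on your logger. Ruby's stdlib Logger accepts symbols on modern Rubies, while very old Rails 3.x-era loggers may want the integer constants. A quick standalone check:

```ruby
require 'logger'

logger = Logger.new($stdout)

# Integer-constant form: works on every version
logger.level = Logger::DEBUG

# Symbol form: accepted by modern stdlib Logger (and recent Rails);
# on old Rails 3.x-era loggers, prefer the constants above
logger.level = :warn
```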
As you have to change the level in a running Rails instance, a simple rake task will not work; I would go with the dedicated route.
Instead of a shared secret, I would use the app's standard user authentication (if your app has users) and restrict access to admin/super users.
In your LogController, try this:
def setlevel
  begin
    Rails.logger.level = Logger.const_get(params[:level].upcase)
  rescue NameError
    logger.info("Logging level #{params[:level]} not supported")
  end
end
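A slightly more defensive variant (a hypothetical helper, not from the answer above) whitelists the level names before calling const_get, so arbitrary params can't resolve unexpected constants:

```ruby
require 'logger'

# Hypothetical whitelist guard for a setlevel-style action: map a
# user-supplied name onto a Logger constant, falling back to INFO.
VALID_LEVELS = %w[DEBUG INFO WARN ERROR FATAL].freeze

def level_from_param(name)
  name = name.to_s.upcase
  VALID_LEVELS.include?(name) ? Logger.const_get(name) : Logger::INFO
end
```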
You can also use gdb to attach to the running process, set the Rails.logger to debug level, and then detach. I created the following one-liner to do this for my puma process:
gdb attach $(pidof puma) -ex 'call(rb_eval_string("Rails.logger.level = Logger::DEBUG"))' -ex detach -ex quit
NOTE: pidof may return multiple pids, in descending order, so if you have multiple processes with the same name this will only run on the first one returned. The others are discarded by the gdb attach command with the message "Excess command line arguments ignored. (26762)"; you can safely ignore that if you only care about the first process returned by pidof.
Using rufus-scheduler, I created this schedule:
scheduler.every 1.second do
  file_path = "#{Rails.root}/tmp/change_log_level.#{Process.pid}"
  if File.exists? file_path
    log_level = File.open(file_path).read.strip
    case log_level
    when "INFO"
      Rails.logger.level = Logger::INFO
      Rails.logger.info "Changed log_level to INFO"
    when "DEBUG"
      Rails.logger.level = Logger::DEBUG
      Rails.logger.info "Changed log_level to DEBUG"
    end
    File.delete file_path
  end
end
Then the log level can be changed by creating a file under tmp/change_log_level.PID, where PID is the process id of the Rails process. You can create a rake/capistrano task to create these files, allowing you to quickly switch the log level of your running production server.
Just remember to start rufus in the worker threads, if you are using unicorn or similar.
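The other half of that setup, writing the trigger file, could look something like this. This is a hypothetical helper (e.g. for a rake or capistrano task); the tmp/change_log_level.PID convention matches the schedule above:

```ruby
require 'fileutils'

# Hypothetical helper: write the trigger file the scheduler above polls for.
# Only the two levels the scheduler understands are accepted.
def request_log_level(pid, level, root = Dir.pwd)
  raise ArgumentError, "unsupported level #{level}" unless %w[INFO DEBUG].include?(level)
  path = File.join(root, 'tmp', "change_log_level.#{pid}")
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, level)
  path
end
```

The scheduler deletes the file after acting on it, so the helper can be called repeatedly.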

Delayed job not working

I want to scrape some websites, and I want this to be done by a separate worker process. I came across delayed_job for running jobs in the background. I am using collectiveidea/delayed_job in my Rails application, and I followed the installation steps for Rails 3.0 and Active Record.
After that I created dj.rb in the lib folder and wrote the following code:
require 'nokogiri'
require 'open-uri'
class Dj_testing
  def perform
    # code for scraping the site
    # code to add entry into database
  end
end
After that I use the following commands to start a worker:
script/delayed_job start
rake jobs:work
My worker started, and on my terminal I can see:
[Worker(host:user1234-desktop pid:9487)] Starting job worker
Now my problem: when I call the perform method directly, it works fine. I mean, the following code works perfectly, scrapes the site, and populates the database.
ruby-1.9.2-p0 > Dj_testing.new.perform
But when I delay that same job, it adds the job to the delayed_jobs table and does nothing :(
ruby-1.9.2-p0 > Dj_testing.delay.new
or
ruby-1.9.2-p0 > Delayed::Job.enqueue Dj_testing.new
#<Delayed::Backend::ActiveRecord::Job id: 150, priority: 0, attempts: 0, handler:
"---!ruby/object:Delayed::PerformableMethod \nargs: ...", last_error: nil,
run_at: "2012-04-27 05:25:29", locked_at: nil, failed_at: nil, locked_by: nil,
queue: nil,created_at: "2012-04-27 05:25:29", updated_at: "2012-04-27 05:25:29">
Why is the job not working as desired?
You need to call 'delay' on the object on which you are invoking the method. So, in your case, it should be:
Dj_testing.new.delay.perform
and not
Dj_testing.delay.new
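To see why the receiver matters, here is a toy sketch of the proxy pattern delayed_job uses. This is illustrative only, not delayed_job's actual implementation: delay returns a proxy object, and whatever method you then call on the proxy is what gets recorded for later execution. Calling Dj_testing.delay.new would therefore record new on the class, never perform.

```ruby
# Toy proxy, illustrating the .delay pattern (not delayed_job's real code).
class ToyDelayProxy
  def initialize(target, queue)
    @target = target
    @queue = queue
  end

  # Intercept any call and record it instead of running it.
  def method_missing(name, *args)
    @queue << [@target.class, name, args]
    nil
  end

  def respond_to_missing?(*_args)
    true
  end
end

class ToyJob
  def perform; end
end

queue = []
ToyDelayProxy.new(ToyJob.new, queue).perform
# queue now holds [[ToyJob, :perform, []]] -- perform was recorded, not run
```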
@salil's way of calling the method is correct; however, you might be suffering from other issues.
Follow the guidelines in my answer to this post (I did not post here since there are many :-))
https://stackoverflow.com/a/15000180/226255
I hope this helps

how do I get all my rake tasks to write to the same log file?

I have two rake tasks that I'd like to run nightly. I'd like them to log to one file. I thought this would do the trick (got it here: Rake: logging any task executing):
application.rb
module Rake
  class Task
    alias_method :origin_invoke, :invoke if method_defined?(:invoke)

    def invoke(*args)
      @logger = Logger.new('rake_tasks_log.log')
      @logger.info "#{Time.now} -- #{name} -- #{args.inspect}"
      origin_invoke(args)
    end
  end
end
and then in the rakefile:
task :hello do
  @logger.warn "Starting Hello task"
  puts "Hello World!"
  puts "checking connection "
  checkConnection
  puts "done checking"
  @logger.debug "End hello rake task"
end
But when I run the task I get:
private method 'warn' called for nil:NilClass
I've tried a couple of flavors of that call to logging (@, @@, no sigil) to no avail, and read several threads on here about it. The rubyonrails.org site doesn't mention logging in rake tasks. The tasks I'm invoking are fairly complex (about 20-40 minutes to complete), so I'll really want to know what went wrong if they fail. For DRY reasons I'd prefer to create the logger object only once.
Unless you're wrapping everything in giant begin/rescues and catching errors that way, the best way to log errors is to capture all output from stderr and stdout with something like:
rake your:job >> /var/log/rake.log 2>&1
(Note the order: the file redirection must come before 2>&1, or stderr will still go to the terminal.)
You could also set your Rails environment to use the system logger as well.
I ended up solving this (or at least well enough) by making a "log" task and depending on that in other tasks. Not really ideal, since it means including that dependency in any new task, but I have only a few tasks so this will do fine. I'm aware there is a "file" task, but it didn't seem to work on Windows, so I chose this approach because it's more cross-platform and more explicit.
I need a logger object because I am passing that object into some method calls in the [...] sections. There's enough begin/rescue/end in there that writing to the output stream wouldn't work (I think).
@log_file = "log/tasks.log"

directory "log"

task :check_log => ["log"] do
  log = @log_file
  puts 'checking log existence'
  if not FileTest.exists?("./#{log}")
    puts 'creating log file'
    File.open(log, 'w')
  end
end
task :check_connection => [:check_log] do
  begin
    conn = Mongo::Connection.new
    [...]
  end
end
task :nightly_tasks => [:check_connection, :environment] do
  2.times do
    logger.warn "#########################"
  end
  [...]
  logger.warn "nightly tasks complete"
end

def logger
  @@logger ||= Logger.new(File.join(Rails.root, @log_file))
end

Is it possible to terminate an already running delayed job using Ruby Threading?

Let's say I have delayed_job running in the background. Tasks can be scheduled or run immediately (some are long tasks, some are not).
If a task is too long, a user should be able to cancel it. Is that possible in delayed_job? I checked the docs and can't seem to find a terminate method or anything similar. They only provide a way to stop delayed_job itself (thus cancelling all tasks; I need to cancel just one running task).
UPDATE
My boss (who's a great programmer, btw) suggested using Ruby threading for this feature of ours. Is this possible? Like creating a new thread per task and killing that thread while it's running?
Something like:
t1 = Thread.new(task.run)
self.delay.t1.join (?) -- still reading up on threads, so correct me if I'm wrong
Then to stop it I'll just use t1.stop (?) -- again, don't know yet.
Is this possible? Thanks!
It seems that my boss hit the spot, so here's what we did (please tell us if there's any chance this is bad practice so I can bring it up):
First, we have a Job model with a def execute! method (which runs what it's supposed to do).
Next, we have a delayed_job worker in the background, listening for new jobs. When you create a job, you can schedule it to run immediately or on certain days (we use rufus for that).
When a job is created, it checks whether it's supposed to run immediately. If so, it adds itself to the delayed_job queue. The execute function creates a Thread, so each job has its own thread.
A user in the UI can see whether a job is running (there's a started_at and no finished_at). If it IS running, there's a button to cancel it. Cancelling just sets the job's canceled_at to Time.now.
While the job is running, it also checks whether it has a canceled_at or whether Time.now > finished_at. If so, it kills the thread.
Voila! We've tested it for one job and it seems to work. Now the only problem is scaling...
If you see any problems with this, please say so in the comments, or give more suggestions :) I hope this helps someone too!
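For what it's worth, a cooperative check between units of work (as described above) is generally safer than Thread#kill, which can stop a thread mid-write. A minimal standalone sketch, with cancel_check standing in for the canceled_at database lookup:

```ruby
# Cooperative cancellation: the job checks a flag between steps instead of
# being killed from outside. cancel_check stands in for the DB lookup.
def run_cancellable(steps, cancel_check)
  Thread.new do
    steps.each do |step|
      break if cancel_check.call # bail out between steps, never mid-step
      step.call
    end
  end
end

done = []
cancelled = false
t = run_cancellable(
  [-> { done << 1 }, -> { cancelled = true }, -> { done << 3 }],
  -> { cancelled }
)
t.join
# done == [1]: the third step never ran once cancellation was requested
```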
Delayed::Job is an ActiveRecord::Base model, so you can query it just like you normally would, e.g. Delayed::Job.all(:conditions => {:last_error => nil}).
Delayed::Job objects have a payload field which contains a serialized version of the method or job you're attempting to run. It is accessed via the #payload_object method, which loads the object in question.
You can combine these two capabilities to make queryable job workers. For instance, if you have a User model with a paperclip'ed :avatar, you can write a method to delete unprocessed jobs like so:
class User < ActiveRecord::Base
  has_attached_file :avatar, PaperclipOptions.new(:avatar)
  before_create :process_avatar_later

  def process_avatar_later
    filename = Rails.root.join('tmp/avatars_for_processing/', self.id)
    open(filename, 'w') { |file| file << self.avatar.to_file }
    Delayed::Job.enqueue(WorkAvatar.new(self.id, filename))
    self.avatar = nil
  end

  def cancel_future_avatar_processing
    WorkAvatar.future_jobs_for_user(self.id).each(&:destroy)
    # ummm... tell them to reupload their avatar, I guess?
  end

  class WorkAvatar < Struct.new(:user_id, :path)
    def user
      @user ||= User.find(self.user_id)
    end

    def self.all_jobs
      Delayed::Job.scoped(:conditions => 'payload like "%WorkAvatar%"')
    end

    def self.future_jobs_for_user(user_id)
      all_jobs.scoped(:conditions => {:locked_at => nil}).select do |job|
        job.payload_object.user_id == user_id
      end
    end

    def perform
      user.avatar = File.open(path, 'rb')
      user.save
    end
  end
end
It's possible someone has made a plugin for queryable job objects like this; searching GitHub might be fruitful.
Note also that if you want to cancel a job that has locked_at and locked_by set, you'd have to work with whatever process-monitoring tools you use to stop the worker process that is executing it.
You can wrap the task into a Timeout statement.
require 'timeout'

class TaskWithTimeout < Struct.new(:parameter)
  def perform
    Timeout.timeout(10) do
      # ...
    end
  rescue Timeout::Error => e
    # the task took longer than 10 seconds
  end
end
No, there's no way to do this out of the box. If you're concerned about a runaway job, you should definitely wrap it in a timeout as Simone suggests. However, it sounds like you're in search of something more, and I'm unclear on your end goal.
There will never be a simple way for a user to have a "cancel" button, since that would involve directly communicating with the worker process running the job. It would be possible to add a signal handler to the worker so that you could do something like kill -USR1 pid to have it abort the job it's currently working on and move on. Would this accomplish your goal?
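The signal-handler idea can be sketched like this. It assumes the worker's job loop checks a flag between units of work; the flag name is made up for illustration, and this is not part of delayed_job itself:

```ruby
# USR1 sets a flag; the worker loop would check it between units of work,
# so nothing gets killed mid-write.
$abort_requested = false
trap('USR1') { $abort_requested = true }

# Simulate an operator running `kill -USR1 <pid>` against this process:
Process.kill('USR1', Process.pid)
sleep 0.1 # give the trap handler a chance to run
```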