How can I load data into a database using ActiveRecord?

I have a ruby script that extracts information from a file (genbank) and I would like to load this data into the database. I have created the model and the schema and a connection script:
require 'active_record'
def establish_connection(db_location= "protein.db.sqlite3")
ActiveRecord::Base.establish_connection(
:adapter => "sqlite3",
:database => db_location,
:pool => 5,
:timeout => 5000
)
end
This is my script that outputs the data:
require 'rubygems'
require 'bio'
require 'snp_db_models'
establish_connection
snp_positions_file = File.open("snp_position.txt")
outfile = File.open("output.txt", "w")
genome_sequence = Bio::FlatFile.open(Bio::EMBL, "ref.embl").next_entry
snp_positions = Array.new
snp_positions_file.gets # header line
while line = snp_positions_file.gets
snp_details = line.chomp.split("\t")
snp_seq = snp_details[1]
snp_positions << snp_details[1].to_i
end
mean_snp_per_base = snp_positions.size/genome_sequence.sequence_length.to_f
puts "Mean snps per base: #{mean_snp_per_base}"
#outfile = File.open("/Volumes/DataRAID/Projects/GAS/fastq_files/bowtie_results/snp_annotation/genes_with_higher_snps.tsv", "w")
outfile.puts("CDS start\tCDS end\tStrand\tGene\tLocus_tag\tnote\tsnp_ID\ttranslation_seq\tProduct\tNo_of_snps_per_gene\tsnp_rate_vs_mean")
genome_sequence.features do |feature|
if feature.feature !~ /gene/i && feature.feature !~ /source/i
start_pos = feature.locations.locations.first.from
end_pos = feature.locations.locations.first.to
number_of_snps_in_gene = (snp_positions & (start_pos..end_pos).to_a).size # intersect finds number of times snp occurs within cds location
mean_snp_per_base_in_gene = number_of_snps_in_gene.to_f/(end_pos - start_pos)
outfile.print "#{start_pos}\t"
outfile.print "#{end_pos}\t"
if feature.locations.locations.first.strand == 1
outfile.print "forward\t"
else
outfile.print "reverse\t"
end
qualifiers = feature.to_hash
["gene", "locus_tag", "note", "snp_id", "translation", "product"].each do |qualifier|
if qualifiers.has_key?(qualifier) # if there is gene and product in the file
# puts "#{qualifier}: #{qualifiers[qualifier]}"
outfile.print "#{qualifiers[qualifier].join(",")}\t"
else
outfile.print " \t"
end
end
outfile.print "#{number_of_snps_in_gene}\t"
outfile.print "%.2f" % (mean_snp_per_base_in_gene/mean_snp_per_base)
outfile.puts
end
end
outfile.close
How can I load the data in output.txt into the database? Do I have to do something like a Marshal dump?
Thanks in advance
Mark

You can write a rake task to do this. Save it in lib/tasks and give it a .rake extension.
desc "rake task to load data into db"
task :load_data_db => :environment do
...
end
Since the rails environment is loaded, you can access your Model directly as you would in any Rails model/controller. Of course, it'll connect to the database depending on the environment variable defined when you execute your rake task.

In a standalone script, your models are not defined.
You have to declare at least a minimal class for each one before you can use it as you would in a Rails app:
class Foo < ActiveRecord::Base
end
Otherwise, in a Rails context, use rake tasks which are aware of the Rails app details.
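Either way, once a model class is available, loading the report comes down to reading it back row by row and creating one record per row; no Marshal dump is needed. Below is a minimal sketch of that step. The column names are assumptions derived from the header line the script writes, and the model passed to load_rows is hypothetical: anything that responds to create! will do.

```ruby
require 'stringio'

# Hypothetical column names; they mirror the header line the script writes.
COLUMNS = %i[cds_start cds_end strand gene locus_tag note snp_id
             translation_seq product snp_count snp_rate_vs_mean].freeze

# Read the tab-separated report and return one attribute hash per data row.
def rows_from_tsv(io)
  io.gets # skip the header line
  io.each_line.map do |line|
    values = line.chomp.split("\t", -1).map do |value|
      value.strip.empty? ? nil : value.strip # the script prints " " for missing qualifiers
    end
    COLUMNS.zip(values).to_h
  end
end

# Pass each hash to anything that responds to create!, such as an ActiveRecord model.
def load_rows(rows, model)
  rows.each { |attrs| model.create!(attrs) }
  rows.size
end

# Small in-memory sample so the parsing step can be tried without a database.
sample = StringIO.new("CDS start\tCDS end\n190\t255\tforward\tthrL\t \t \t \t \tleader peptide\t2\t1.37\n")
rows_from_tsv(sample).each { |row| puts row.inspect }
```

With the connection script from the question, the call would be something like load_rows(rows_from_tsv(File.open("output.txt")), Feature), where Feature is a hypothetical model whose table has these columns. It may be simpler still to skip the intermediate file and call create! directly inside the features loop.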

Related

Loading ISSUE in rails

I'm facing a problem with loading a constant in the Rails console (rails console). Here is what my structure looks like:
- app
- controllers
- models
- earning
- daily_earning.rb
- monthly_earning.rb
- weekly_earning.rb
- yearly_earning.rb
- views
Some more information:
I also have a rake task which looks like this:
namespace :past_days do
desc "Past 7 Days Earning"
task :earning => :environment do
puts $:.select { |i| i=~ /models/ }.to_yaml
7.downto(1).each do |i|
start_date = i.days.ago.beginning_of_day
puts "====== Dumping past #{start_date.strftime('%A')} earning ====="
end_date = start_date.end_of_day
Performer.top_daily_earners(start_date,end_date)
puts "====== Dumped #{start_date.strftime('%A')} earning !!! ======="
puts
end
end
end
And the top_daily_earners method looks like this (note the line @klass = DailyEarning):
def top_daily_earners(start_date=nil,end_date=nil)
unless start_date or end_date
date = 1.day.ago
@start_date,@end_date = date.beginning_of_day,date.end_of_day
end
if start_date and end_date
@start_date,@end_date = start_date,end_date
end
@klass = DailyEarning
@earning_performers = retrieve_earnings
puts "COUNT -----"
puts @earning_performers.count
puts ""
store_earning
end
Question:
When I run the rake task with bundle exec rake past_days:earning, it runs without any error and everything works. But when I run the same code in
rails console (see the attached screenshot)
I get NameError: uninitialized constant DailyEarning, even though I have manually required the file, as can be seen in the screenshot.
So the point of this question is: why does the error (NameError: uninitialized constant DailyEarning) occur in rails console but not in the
rake task?
Attaching the DailyEarning code, based on @dtt's comment:
puts 'DailyEarning'
class DailyEarning
include Mongoid::Document
store_in session: "writeable"
field :performer_id, :type => Integer
field :user_id,:type => Integer
field :stage_name,:type => String
field :full_name,:type => String
field :start_date,:type => DateTime
field :end_date,:type => DateTime
field :amount,:type => BigDecimal
before_create :other_details
## Avoid using default_scope here because, AFAIK, it makes the date parameter static
class << self
def default_scoping
where(:start_date.gte => 1.day.ago.beginning_of_day).and(:end_date.lte => 1.day.ago.end_of_day)
end
end
private
def other_details
## Fetch from Mongo Instead of Mysql to avoid the slow sql query
performer_source = PerformerSource.where(performer_id: performer_id).only([:stage_name,:user_id]).first
self.user_id = performer_source.user_id
self.stage_name = self.stage_name
#self.full_name = self.full_name
end
end
My understanding is that to autoload a model that lives in a subfolder, you need to namespace it.
To autoload the model in app/models/earning/daily_earning.rb:
class Earning::DailyEarning
end
Alternatively, you may be able to use:
module Earning
class DailyEarning
end
end
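To make the naming rule concrete, here is a small sketch of how a file path under app/models maps to the constant the autoloader expects to find in it. The helper expected_constant is hypothetical and only imitates the convention in plain Ruby; it is not the Rails implementation.

```ruby
# Simplified imitation of the convention, not Rails' own code:
# each path segment becomes one CamelCase constant, joined with "::".
def expected_constant(relative_path)
  segments = relative_path.sub(/\.rb\z/, "").split("/")
  segments.map { |segment| segment.split("_").map(&:capitalize).join }.join("::")
end

# The namespaced definition that matches app/models/earning/daily_earning.rb.
module Earning
  class DailyEarning
  end
end

name = expected_constant("earning/daily_earning.rb")
puts name
puts Object.const_get(name).equal?(Earning::DailyEarning)
```

Under this convention a bare DailyEarning would be looked up in app/models/daily_earning.rb, which does not exist in the layout shown in the question.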

Custom error message processor without monkey patch

I've implemented a custom error message processor for Korean language. In Korean, postpositions take different forms depending on the sound of the preceding noun or pronoun.
For example, when marking a subject, ka (가) is used following a vowel and i (이) is used following a consonant.
Examples (hyphens are to denote morpheme boundaries):
Romanization: sakwa-ka ppalkah-ta.
Gloss: Apple-SUBJECT red-PRESENT.
Translation: Apple is red.
Romanization: phainayphul-i tal-ta.
Gloss: Pineapple-SUBJECT sweet-PRESENT.
Translation: Pineapple is sweet.
Therefore, the standard error message system implemented in ActiveModel::Errors is not adequate for Korean. You should either include the attribute in the message making a lot of duplicates ("A is blank", "B is blank", "C is blank", ...), or avoid postpositions after the attribute which is often difficult or makes awkward sentences.
I monkey patched ActiveModel::Errors and altered generate_message to solve this problem. Following is the code (Gist) which is currently in config/initializers in my Rails app.
# encoding: UTF-8
# Original article: http://dailyupgrade.me/post/6806676778/rubyonrails-full-messages-for-korean
# Modified to support more postpositions and client_side_validations gem.
#
# Add Korean translations like this:
# ko:
# errors:
# format: "%{message}"
# messages:
# blank: "%{attribute}((이)) 입력되지 않았습니다"
# taken: "이미 사용 중인 %{attribute}입니다"
# invalid: "잘못된 %{attribute}입니다"
# too_short: "%{attribute}((이)) 너무 짧습니다"
#
class Korean
POSTPOSITIONS = {"은" => "는", "이" => "가", "을" => "를", "과" => "와", "으로" => "로"}
def self.select_postposition(last_letter, postposition)
return postposition unless last_letter >= "가" && last_letter <= "힣"
final = last_letter.mb_chars.last.decompose[2]
if final.nil?
# 받침 없음
POSTPOSITIONS[postposition] || postposition
elsif final == "ㄹ" && (postposition == "으로" || postposition == "로")
# 'ㄹ 받침 + (으)로'를 위한 특별 규칙
"로"
else
# 받침 있음
POSTPOSITIONS.invert[postposition] || postposition
end
end
end
module ActiveModel
class Errors
old_generate_message = instance_method("generate_message")
define_method("generate_message") do |attribute, type = :invalid, options = {}|
msg = old_generate_message.bind(self).(attribute, type, options)
return msg unless I18n.locale == :ko
msg.gsub(/(?<=(.{1}))\(\((은|는|이|가|을|를|과|와|으로|로)\)\)/) do
Korean.select_postposition $1, $2
end
end
end
end
My first question is whether it is possible to achieve the same goal without monkey patching. I'm new to Rails (and Ruby too) so couldn't come up with a better solution.
The second question is about extracting this code from my Rails app and making it into a separate gem. I'm cutting my teeth on developing gems and recently made my first gem. In what place should I put this code in a gem? config/initializers doesn't seem right.
I'm not good at Ruby, but the following JavaScript code does the same thing. I hope it helps.
var hasJongsung = function(word) {
return !!(word && word[word.length -1].charCodeAt()>=0xAC00 && word[word.length-1].charCodeAt()<=0xD7A3 && (word[word.length -1].charCodeAt()-0xAC00)%28);
};
source:http://spectrumdig.blogspot.kr/2012/11/unicode-20.html
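For comparison, the same check can be written in Ruby without mb_chars or decompose. This is only a sketch of the arithmetic used in the JavaScript above: precomposed Hangul syllables run from U+AC00 to U+D7A3, and each block of 28 code points cycles through "no final consonant" plus the 27 possible finals. The method names final_consonant? and subject_marker are made up for illustration.

```ruby
# encoding: UTF-8

HANGUL_SYLLABLES = (0xAC00..0xD7A3) # 가 .. 힣
FINALS_PER_BLOCK = 28               # "no final" plus 27 final consonants

# True when the last character of word is a Hangul syllable with a final consonant.
def final_consonant?(word)
  return false if word.nil? || word.empty?
  code = word[-1].ord
  return false unless HANGUL_SYLLABLES.cover?(code)
  (code - HANGUL_SYLLABLES.first) % FINALS_PER_BLOCK != 0
end

# Pick the subject marker: 이 after a consonant, 가 after a vowel.
def subject_marker(word)
  final_consonant?(word) ? "이" : "가"
end

puts "사과" + subject_marker("사과")
puts "파인애플" + subject_marker("파인애플")
```

Note that this only answers the yes/no question; the special case for ㄹ with (으)로 in the original code still needs the actual final consonant, which is the remainder itself (8 for ㄹ).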

Am I extending this inbuilt ruby class correctly?

In my rails app in lib/matrix.rb I have entered the following code to extend the inbuilt Matrix class:
module Matrix
require 'matrix'
class Matrix
def symmetric?
return false if not square?
(0 ... row_size).each do |i|
(0 .. i).each do |j|
return false if self[i,j] != self[j,i]
end
end
true
end
def cholesky_factor
raise ArgumentError, "must provide symmetric matrix" unless symmetric?
l = Array.new(row_size) {Array.new(row_size, 0)}
(0 ... row_size).each do |k|
(0 ... row_size).each do |i|
if i == k
sum = (0 .. k-1).inject(0.0) {|sum, j| sum + l[k][j] ** 2}
val = Math.sqrt(self[k,k] - sum)
l[k][k] = val
elsif i > k
sum = (0 .. k-1).inject(0.0) {|sum, j| sum + l[i][j] * l[k][j]}
val = (self[k,i] - sum) / l[k][k]
l[i][k] = val
end
end
end
Matrix[*l]
end
end
end
Is this the correct way to add methods to an existing class within the rails app? Should I have the require matrix line there?
EDIT 1: Additional info provided
I have now removed the require 'matrix' line.
If I type the following test code in a view page, it only works if I delete my lib/matrix.rb file:
<% require 'matrix' %>
<%
m = Matrix[
[0,0],
[1,1]
]
%>
<%= m.column(0) %>
Otherwise it gives the error:
undefined method `[]' for Matrix:Module
It appears that I am eliminating the default methods of the built in Matrix class when I try to extend the class. Is there a way around this error?
No, you should not have to require 'matrix' here. Whoever uses your code (the Rails app, in your case) should require 'matrix'.
To extend a core class in Rails, you simply open it, add your methods, and close it. For example, to extend the Matrix class:
class Matrix
def my_method
"New method"
end
end
You should not need to require 'matrix' in your code either. As long as the file holding your extension is in one of the autoload paths, you should have direct access to the new methods.
If you need to add a directory to your Rails autoload path, simply update /config/application.rb with the following line:
config.autoload_paths += %W(#{config.root}/app/extras) # Autoload /app/extras/*.rb
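Here is a slightly fuller sketch of the same reopen-and-add pattern, using the Cholesky method from the question. It assumes the standard library's matrix has been loaded before the extension file is read, which the require at the top takes care of in this standalone form. It also uses row_count and Matrix's built-in symmetric? in place of the question's row_size and hand-written check; treat those substitutions as my choices, not part of the original code.

```ruby
require 'matrix' # load the real class first, so the next line reopens it

class Matrix
  # Lower-triangular factor L such that L * L.transpose equals the matrix.
  def cholesky_factor
    raise ArgumentError, "must provide symmetric matrix" unless symmetric?
    l = Array.new(row_count) { Array.new(row_count, 0.0) }
    row_count.times do |k|
      # Diagonal entry for column k.
      sum = (0...k).sum { |j| l[k][j]**2 }
      l[k][k] = Math.sqrt(self[k, k] - sum)
      # Entries below the diagonal in column k.
      ((k + 1)...row_count).each do |i|
        sum = (0...k).sum { |j| l[i][j] * l[k][j] }
        l[i][k] = (self[i, k] - sum) / l[k][k]
      end
    end
    Matrix[*l]
  end
end

factor = Matrix[[25, 15, -5], [15, 18, 0], [-5, 0, 11]].cholesky_factor
puts factor
```

Because the file is not wrapped in module Matrix, the constant Matrix stays a class and Matrix[...] keeps working.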

Two extended classes - one works and the other doesn't

I have had some difficulties in using extended classes in rails, in particular extending the Matrix class - and I have also asked some questions related to this:
Am I extending this inbuilt ruby class correctly
Using custom functions in rails app
Where do I put this matrix class in my rails app
In general, the answers have been about autoloading in Rails 3. However, when I extend 'Math' the new functions work, and when I extend 'Matrix' the new functions don't work, even though I treat them in the same way.
I've tried many iterations and change module names, requires, autoloads but here are my latest key files:
application.rb:
require File.expand_path('../boot', __FILE__)
require 'rails/all'
require 'matrix'
# If you have a Gemfile, require the gems listed there, including any gems
# you've limited to :test, :development, or :production.
Bundler.require(:default, Rails.env) if defined?(Bundler)
module SimpleFixed
class Application < Rails::Application
# Settings in config/environments/* take precedence over those specified here.
# Application configuration should go into files in config/initializers
# -- all .rb files in that directory are automatically loaded.
# Custom directories with classes and modules you want to be autoloadable.
# **I have tried all these autoload variants**
# config.autoload_paths += %W(#{config.root}/extras)
# config.autoload_paths += %W(#{config.root}/lib)
# config.autoload_paths += Dir["#{config.root}/lib/**/"]
# config.autoload_paths << "#{Rails.root}/lib"
config.autoload_paths << "#{::Rails.root.to_s}/lib" # <-- set path
require "extend_matrix" # <-- forcibly load your matrix extension
*other standard code here*
lib/extend_math.rb
module Math
class << self
def cube_it(num)
num**3
end
def add_five(num)
num+5
end
end
end
lib/extend_matrix.rb
module Extend_matrix **An error is raised if I call this Matrix**
class Matrix
def symmetric?
return false if not square?
(0 ... row_size).each do |i|
(0 .. i).each do |j|
return false if self[i,j] != self[j,i]
end
end
true
end
def cholesky_factor
raise ArgumentError, "must provide symmetric matrix" unless symmetric?
l = Array.new(row_size) {Array.new(row_size, 0)}
(0 ... row_size).each do |k|
(0 ... row_size).each do |i|
if i == k
sum = (0 .. k-1).inject(0.0) {|sum, j| sum + l[k][j] ** 2}
val = Math.sqrt(self[k,k] - sum)
l[k][k] = val
elsif i > k
sum = (0 .. k-1).inject(0.0) {|sum, j| sum + l[i][j] * l[k][j]}
val = (self[k,i] - sum) / l[k][k]
l[i][k] = val
end
end
end
Matrix[*l]
end
end
end
my view page:
<%= Math.add_five(6) %> **works*
<%= Matrix[[25,15,-5],[15,18,0],[-5,0,11]].cholesky_factor %> **doesn't work**
Could it be because Math is a Module in ruby but Matrix is a class? If so, how do I correct for this?
If you have a look at the implementation of Matrix, you will see the reason. For me, it is located in .../ruby192/lib/ruby/1.9.1/matrix.rb. The definition is (omitted all methods):
class Matrix
include Enumerable
include ExceptionForMatrix
...
end
That means that Matrix is not contained in a module. Your source for your additions should begin:
class Matrix
def symmetric?
...
end
def cholesky_factor
...
end
end
So your addition to a class or module has to match the current definition. Matrix is known as Matrix in the Ruby constants, and not as Extend_matrix::Matrix which is what you have defined.
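The effect is easy to reproduce with any core class. In the sketch below String stands in for Matrix so that nothing needs to be required; ExtendString and shout are made-up names. Wrapping the class in a module creates a new, unrelated class, while the top-level form reopens the existing one. That is also why the Math extension worked: module Math was written at the top level, so it reopened the real Math.

```ruby
# String stands in for Matrix here so the snippet needs no require.
module ExtendString
  class String # defines a brand-new class, ExtendString::String
    def shout
      "never reached from a normal string"
    end
  end
end

puts ExtendString::String.equal?(::String) # the wrapped class is not the core class
puts "abc".respond_to?(:shout)             # so ordinary strings did not gain the method

class String # top level: reopens the core class
  def shout
    upcase + "!"
  end
end

puts "abc".shout
```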

Override Rack method

My setup: Rails 3.0.9, Ruby 1.9.2
Due to a bug in Rack 1.2.3, I'm attempting to override Rack::Utils::Multipart.parse_multipart by creating a new file
rack_parse_multipart.rb
module Rack
module Utils
module Multipart
def self.parse_multipart(env)
...my changes...
end
end
end
end
Now I just need to figure out where I require this file, can someone point me in the right direction? Thanks in advance for your help.
For others having problems with this Rack 1.2.3 bug, there is a nice copy-paste solution here https://github.com/rack/rack/issues/186
It goes in config/initializers:
# -*- encoding: binary -*-
require 'rack/utils'
module Rack
module Utils
module Multipart
def self.parse_multipart(env)
unless env['CONTENT_TYPE'] =~
%r|\Amultipart/.*boundary=\"?([^\";,]+)\"?|n
nil
else
boundary = "--#{$1}"
params = {}
buf = ""
content_length = env['CONTENT_LENGTH'].to_i
input = env['rack.input']
input.rewind
boundary_size = Utils.bytesize(boundary) + EOL.size
bufsize = 16384
content_length -= boundary_size
read_buffer = ''
status = input.read(boundary_size, read_buffer)
raise EOFError, "bad content body" unless status == boundary + EOL
rx = /(?:#{EOL})?#{Regexp.quote boundary}(#{EOL}|--)/n
loop {
head = nil
body = ''
filename = content_type = name = nil
until head && buf =~ rx
if !head && i = buf.index(EOL+EOL)
head = buf.slice!(0, i+2) # First \r\n
buf.slice!(0, 2) # Second \r\n
token = /[^\s()<>,;:\\"\/\[\]?=]+/
condisp = /Content-Disposition:\s*#{token}\s*/i
dispparm = /;\s*(#{token})=("(?:\\"|[^"])*"|#{token})*/
rfc2183 = /^#{condisp}(#{dispparm})+$/i
broken_quoted = /^#{condisp}.*;\sfilename="(.*?)"(?:\s*$|\s*;\s*#{token}=)/i
broken_unquoted = /^#{condisp}.*;\sfilename=(#{token})/i
if head =~ rfc2183
filename = Hash[head.scan(dispparm)]['filename']
filename = $1 if filename and filename =~ /^"(.*)"$/
elsif head =~ broken_quoted
filename = $1
elsif head =~ broken_unquoted
filename = $1
end
if filename && filename !~ /\\[^\\"]/
filename = Utils.unescape(filename).gsub(/\\(.)/, '\1')
end
content_type = head[/Content-Type: (.*)#{EOL}/ni, 1]
name = head[/Content-Disposition:.*\s+name="?([^\";]*)"?/ni, 1] || head[/Content-ID:\s*([^#{EOL}]*)/ni, 1]
if filename
body = Tempfile.new("RackMultipart")
body.binmode if body.respond_to?(:binmode)
end
next
end
# Save the read body part.
if head && (boundary_size+4 < buf.size)
body << buf.slice!(0, buf.size - (boundary_size+4))
end
c = input.read(bufsize < content_length ? bufsize : content_length, read_buffer)
raise EOFError, "bad content body" if c.nil? || c.empty?
buf << c
content_length -= c.size
end
# Save the rest.
if i = buf.index(rx)
body << buf.slice!(0, i)
buf.slice!(0, boundary_size+2)
content_length = -1 if $1 == "--"
end
if filename == ""
# filename is blank which means no file has been selected
data = nil
elsif filename
body.rewind
# Take the basename of the upload's original filename.
# This handles the full Windows paths given by Internet Explorer
# (and perhaps other broken user agents) without affecting
# those which give the lone filename.
filename = filename.split(/[\/\\]/).last
data = {:filename => filename, :type => content_type,
:name => name, :tempfile => body, :head => head}
# elsif !filename && content_type
# body.rewind
#
# # Generic multipart cases, not coming from a form
# data = {:type => content_type,
# :name => name, :tempfile => body, :head => head}
else
data = body
end
Utils.normalize_params(params, name, data) unless data.nil?
# break if we're at the end of a buffer, but not if it is the end of a field
break if (buf.empty? && $1 != EOL) || content_length == -1
}
input.rewind
params
end
end
end
end
end
Don't do it like this. Your file should look like this:
Rack::Utils::Multipart.class_eval do
def self.parse_multipart( env )
# add your code here
end
end
This file can be placed in your config/initializers folder.
The difference between your approach and this one is that reopening with module/class might break the autoload mechanism: Rails could conclude that you are defining the class yourself, so the original class would never be loaded.
Whenever you monkey patch like this, use the class_eval approach so that the original class is loaded first and your code runs afterwards.
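A small stand-alone sketch of that difference follows. Upstream::Parser is a hypothetical stand-in for a library module such as the Rack one, not a real API. Because class_eval is called on the constant, Ruby has to resolve that constant before the block runs, and a wrong name raises NameError instead of silently creating an empty module the way the module keyword would.

```ruby
# Hypothetical stand-in for a library module that we want to patch.
module Upstream
  module Parser
    def self.parse(input)
      input.split("&")
    end
  end
end

# class_eval works on modules too; the constant is resolved before the block runs.
Upstream::Parser.class_eval do
  def self.parse(input)
    input.split(/[&;]/) # patched behaviour: also split on ";"
  end
end

puts Upstream::Parser.parse("a=1;b=2&c=3").inspect

# A misspelled target fails loudly rather than defining something new.
begin
  Upstream::Parsr.class_eval {}
rescue NameError => e
  puts "NameError: #{e.message}"
end
```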