Rails 3: Sum a field from several documents

I have several hundred documents with this information:
=> User(id: integer, email: string, amount: string)
How can I sum the amount variable of all the documents?
I am using Rails + Mongoid.
thanks

Normally with an integer field, you could do this:
User.sum(:amount)
# => 142.0
However, since your amount field is a string, it'll just concatenate strings instead. For example, if you had 3 users with amounts of 12, 30, and 100, you would get "01230100" as a result. In this case, you may need to use something like Ruby's inject instead:
User.all.inject(0) { |sum, user| sum + user.amount.to_f }
# => 142.0
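As a minimal sketch of the difference, the snippet below uses a plain Struct in place of Mongoid documents. UserRecord and total_amount are hypothetical names, not part of Mongoid. It only illustrates why the string values need converting before they are added.

```ruby
# Stand-in for a Mongoid document: a plain Struct whose amount is a String.
UserRecord = Struct.new(:email, :amount)

# Sum string amounts by converting each one to a Float first.
def total_amount(users)
  users.inject(0) { |sum, user| sum + user.amount.to_f }
end

users = [
  UserRecord.new("a@example.com", "12"),
  UserRecord.new("b@example.com", "30"),
  UserRecord.new("c@example.com", "100")
]

puts users.map(&:amount).join  # plain string concatenation: 1230100
puts total_amount(users)       # numeric sum: 142.0
```

If the amounts are money, Float rounding may matter, so storing the field as a numeric type is probably the cleaner long-term fix.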

Related

Can Redisgraph return numbers and booleans instead of their string representation?

I'm working with Redisgraph.
I have a node Person with three properties: name (string), age (number), isAlive (boolean).
If I store the age as a number, without quotes, it is correctly stored as a number. So, if I query:
MATCH (p:Person) RETURN p
what I have is:
{ name: 'John', age: 30, isAlive: 'true' }
Is there a way to query and get real booleans?
What I want is:
{ name: 'John', age: 30, isAlive: true }
Thank you!
It sounds like you're querying RedisGraph using redis-cli. The RESP protocol that processes module replies only allows strings and integers as primitive data types that can be passed, so your request can't be accomplished through redis-cli.
All of the client libraries, however, will decode replies to their correct type. I'd recommend using one as an intermediary to interact with RedisGraph - https://oss.redis.com/redisgraph/clients/.
RedisGraph can return a compact format in which the type of each value is included. To use it, pass the --compact flag (which also works in redis-cli):
GRAPH.QUERY demo "MATCH (a) RETURN a" --compact
Some client libraries take advantage of this compact format to return the correct type. The type of each value is returned as an integer:
typedef enum {
    PROPERTY_UNKNOWN = 0,
    PROPERTY_NULL = 1,
    PROPERTY_STRING = 2,
    PROPERTY_INTEGER = 3,
    PROPERTY_BOOLEAN = 4,
    PROPERTY_DOUBLE = 5,
} PropertyTypeUser;
You can read more about the compact format here.
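To illustrate what a client library might do with those type codes, here is a rough sketch. It assumes each property arrives as a [type_code, raw_value] pair with string raw values, which is a simplification of the real compact reply layout. decode_property is a hypothetical helper, not part of any RedisGraph client.

```ruby
# Type codes copied from the enum above.
PROPERTY_NULL    = 1
PROPERTY_STRING  = 2
PROPERTY_INTEGER = 3
PROPERTY_BOOLEAN = 4
PROPERTY_DOUBLE  = 5

# Convert one [type_code, raw_value] pair into a native Ruby value.
# Assumption: raw values arrive as strings (or integers for PROPERTY_INTEGER).
def decode_property(type_code, raw)
  case type_code
  when PROPERTY_NULL    then nil
  when PROPERTY_STRING  then raw.to_s
  when PROPERTY_INTEGER then Integer(raw)
  when PROPERTY_BOOLEAN then raw.to_s == "true"
  when PROPERTY_DOUBLE  then Float(raw)
  else raise ArgumentError, "unknown property type #{type_code}"
  end
end

# Hypothetical decoded row for the Person node from the question.
row = { name: [2, "John"], age: [3, 30], isAlive: [4, "true"] }
decoded = row.transform_values { |(type_code, raw)| decode_property(type_code, raw) }
p decoded
```

With this kind of mapping, isAlive comes back as a real boolean rather than the string 'true'.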

Setting group_by in specialized query

I need to perform data smoothing using averaging, with a non-standard group_by variable that is created on-the-fly. My model consists of two tables:
class WthrStn(models.Model):
    name = models.CharField(max_length=64, error_messages=MOD_ERR_MSGS)
    owner_email = models.EmailField('Contact email')
    location_city = models.CharField(max_length=32, blank=True)
    location_state = models.CharField(max_length=32, blank=True)
    ...

class WthrData(models.Model):
    stn = models.ForeignKey(WthrStn)
    date = models.DateField()
    time = models.TimeField()
    temptr_out = models.DecimalField(max_digits=5, decimal_places=2)
    temptr_in = models.DecimalField(max_digits=5, decimal_places=2)

    class Meta:
        ordering = ['-date', '-time']
        unique_together = (("date", "time", "stn"),)
The data in the WthrData table are entered from an XML file in variable time increments, currently 15 or 30 minutes, but that could vary and change over time. There are more than 20000 records in that table. I want to provide an option to display the data smoothed to variable time units, e.g. 30 minutes or 1, 2 or N hours (60, 120, 180, etc. minutes).
I am using SQLite3 as the DB engine. I tested the following SQL, which proved quite adequate to perform the smoothing in 'bins' of N minutes' duration:
select id, date, time,
       24*60*julianday(datetime(date || time))/N jsec,
       avg(temptr_out) as temptr_out, avg(temptr_in) as temptr_in,
       avg(barom_mmhg) as barom_mmhg, avg(wind_mph) as wind_mph,
       avg(wind_dir) as wind_dir, avg(humid_pct) as humid_pct,
       avg(rain_in) as rain_in, avg(rain_rate) as rain_rate,
       datetime(avg(julianday(datetime(date || time)))) as avg_date
from wthr_wthrdata
where stn_id=19
group by round(jsec,0)
order by stn_id, date, time;
Note that I create an output variable 'jsec' using the SQLite3 function 'julianday', which returns the number of days in the integer part and the fraction of a day in the decimal part. Multiplying by 24*60 therefore gives the number of minutes, and dividing by the N-minute resolution gives a convenient 'group by' variable that compensates for the varying time increments of the raw data.
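The binning arithmetic itself can be sketched outside the database. The snippet below is only an illustration, written in Ruby to keep it self-contained. It assumes readings are [Time, value] pairs and counts minutes from the Unix epoch rather than Julian days, so bin boundaries may fall at a different offset than in the SQL. smooth is a hypothetical helper name.

```ruby
# Average [Time, value] readings in bins of bin_minutes, similar in spirit to
# "group by round(jsec, 0)" in the SQL above. The data layout is hypothetical.
def smooth(readings, bin_minutes)
  bins = readings.group_by do |time, _value|
    # Minutes since the Unix epoch, scaled by the bin width and rounded.
    (time.to_i / 60.0 / bin_minutes).round
  end

  bins.sort_by { |bin, _rows| bin }.map do |_bin, rows|
    avg_time  = Time.at(rows.sum { |t, _v| t.to_i } / rows.size).utc
    avg_value = rows.sum { |_t, v| v } / rows.size.to_f
    [avg_time, avg_value]
  end
end

base = Time.utc(2014, 1, 1, 0, 0, 0)
readings = [[base, 10.0], [base + 900, 12.0], [base + 2700, 20.0], [base + 3600, 22.0]]
smooth(readings, 60).each { |time, value| puts "#{time} #{value}" }
```

Because the bin key is rounded rather than truncated, each bin is centred on a multiple of the bin width, which is also how round(jsec,0) behaves.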
How can I implement this in Django? I have tried objects.raw(), but that returns a RawQuerySet rather than a QuerySet to the view, so I get error messages from template code such as:
<p>
Number of data entries: {{ valid_form|length }}
</p>
I have tried using a standard Query, with code like this:
wthrdta=WthrData.objects.all()
wthrdta.extra(select={'jsec':'24*60*julianday(datetime(date || time))/{}'.format(n)})
wthrdta.extra(select = {'temptr_out':'avg(temptr_out)',
'temptr_in':'avg(temptr_in)',
'barom_mmhg':'avg(barom_mmhg)',
'wind_mph':'avg(wind_mph)',
'wind_dir':'avg(wind_dir)',
'humid_pct':'avg(humid_pct)',
'rain_in':'avg(rain_in)',
'rain_sum_in':'sum(rain_in)',
'rain_rate':'avg(rain_rate)',
'avg_date':'datetime(avg(julianday(datetime(date || time))))'})
Note that here I use the SQL avg functions instead of Django's aggregate() or annotate(). This seems to generate correct SQL, but I can't seem to get the group_by set properly to the jsec value that is created at the top.
Any suggestions for how to approach this? All I really need is for the raw() method to return a QuerySet, or something that can be converted to a QuerySet, instead of a RawQuerySet. I cannot find an easy way to do that.
The answer to this turns out to be really simple, using a hint I found at https://gist.github.com/carymrobbins/8477219, though I modified the code slightly. To return a QuerySet from a RawQuerySet, all I did was add the following to my models.py file, right above the WthrData class definition:
from django.db import connection, models

class MyManager(models.Manager):
    def raw_as_qs(self, raw_query, params=()):
        """Execute a raw query and return a QuerySet. The first column in the
        result set must be the id field for the model.
        :type raw_query: str | unicode
        :type params: tuple[T] | dict[str | unicode, T]
        :rtype: django.db.models.query.QuerySet
        """
        cursor = connection.cursor()
        try:
            cursor.execute(raw_query, params)
            # Collect the ids into a list while the cursor is still open.
            return self.filter(id__in=[x[0] for x in cursor])
        finally:
            cursor.close()
Then in my class definition for WthrData:
class WthrData(models.Model):
    objects = MyManager()
    ......
and later in the WthrData class:
    @staticmethod
    def get_smoothWthrData(stn_id, n):
        # Both %s placeholders are filled from the params list, in order.
        sqlcode = (
            'select id, date, time, 24*60*julianday(datetime(date || time))/%s jsec, '
            'avg(temptr_out) as temptr_out, avg(temptr_in) as temptr_in, '
            'avg(barom_mmhg) as barom_mmhg, avg(wind_mph) as wind_mph, '
            'avg(wind_dir) as wind_dir, avg(humid_pct) as humid_pct, '
            'avg(rain_in) as rain_in, avg(rain_rate) as rain_rate, '
            'datetime(avg(julianday(datetime(date || time)))) as avg_date '
            'from wthr_wthrdata where stn_id=%s '
            'group by round(jsec,0) order by stn_id,date,time;'
        )
        return WthrData.objects.raw_as_qs(sqlcode, [n, stn_id])
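This allows me to grab results from the heavily populated WthrData table at the chosen time increments, and the results come back as a QuerySet instead of a RawQuerySet. Keep in mind that filter(id__in=...) loads the stored rows for the ids the raw query returns (one per bin), so the averaged column values computed in the SQL are not carried into the QuerySet.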

Ruby dbi select statement returning BigDecimal?

I'm having trouble using Ruby with DBI. I'm trying to do a select and put the results in an array, but with no luck.
require 'dbi'
db = DBI.connect('DBI:OCI8:database', XXXX, XXXX)
#Gets Consumer Id Number you want to create accounts for
numberOfAccounts = []
puts("Please enter a CID")
NewCID = gets.chomp()
numberOfAccounts << db.execute("select T_NBR from T_CBA where C_ID='#{NewCID}'").fetch
My array ends up like this:
[[#<BigDecimal:fc115f8,'0.8000169202 2E11',12(16)>]]
where I would like to have several different numbers like [222, 3232, 2323] etc.
I've searched online but to no avail.
DBI has probably determined that the underlying column can contain integers too large to fit in a regular int type, based on the data field. Or it may just use BigDecimal for all integer types to avoid worrying about it.
If you know that your values are all small enough to fit into a regular integer, you can convert the array to integers after you've populated it, like so:
1.9.3-p194 :014 > numberOfAccounts
=> [[#<BigDecimal:119cd90,'0.123E3',9(36)>], [#<BigDecimal:119cd18,'0.456E3',9(36)>]]
1.9.3-p194 :015 > numberOfAccounts.flatten!.collect!(&:to_i)
=> [123, 456]
1.9.3-p194 :016 > numberOfAccounts
=> [123, 456]

NHibernate Like with integer

I have a NHibernate search function where I receive integers and want to return results where at least the beginning coincides with the integers, e.g.
received integer: 729
returns: 729445, 7291 etc.
The database column is of type int, as is the property "Id" of Foo.
But
int id = 729;
var criteria = session.CreateCriteria(typeof(Foo));
criteria.Add(NHibernate.Criterion.Expression.InsensitiveLike("Id", id.ToString() + "%"));
return criteria.List<Foo>();
results in an error (Could not convert parameter string to int32). Is there something wrong in the code, or is there a workaround or another solution?
How about this:
int id = 729;
var criteria = session.CreateCriteria(typeof(Foo));
// MatchMode.Start matches only values that begin with the given digits.
criteria.Add(Expression.Like(Projections.Cast(NHibernateUtil.String, Projections.Property("Id")), id.ToString(), MatchMode.Start));
return criteria.List<Foo>();
Have you tried something like this:
int id = 729;
var criteria = session.CreateCriteria(typeof(Foo));
criteria.Add(NHibernate.Criterion.Expression.Like(Projections.SqlFunction("to_char", NHibernate.NHibernateUtil.String, Projections.Property("Id")), id.ToString() + "%"));
return criteria.List<Foo>();
The idea is to convert the column to a string first, using a to_char function. Some databases do this conversion automatically.
AFAIK, you'll need to store your integer as a string in the database if you want to use the built-in NHibernate functionality for this. I would recommend this approach even without NHibernate: the minute you start doing 'like' searches you are dealing with a string, not a number (think US Zip Codes, etc.).
You could also do it mathematically in a database-specific function (or convert to a string as described in Thiago Azevedo's answer), but I imagine these options would be significantly slower, and also have potential to tie you to a specific database.
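To make the "mathematical" option concrete, here is a small sketch of the arithmetic, written in Ruby to keep it self-contained rather than as database-specific SQL. starts_with_digits? is a hypothetical helper. In a database the same idea would be an integer division by a power of ten, with syntax that varies by vendor.

```ruby
# True when the decimal digits of value begin with the digits of prefix.
# Only non-negative integers are handled.
def starts_with_digits?(value, prefix)
  return false if value.negative? || prefix.negative?

  # Drop trailing digits until value is no longer than prefix.
  value /= 10 while value > prefix && value >= 10
  value == prefix
end

[729445, 7291, 729, 1729, 730].each do |id|
  puts "#{id}: #{starts_with_digits?(id, 729)}"
end
```

Doing this per row in SQL generally prevents the use of an ordinary index on the column, which is one reason it tends to be slower than a string column with a prefix search.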

searching for and ranking results

I'm trying to write a relatively simple algorithm to search for a string across several attributes.
Given some data:
1: name: 'Josh', location: 'los angeles'
2: name: 'Josh', location: 'york'
search string: "josh york"
The results should be [2, 1] because that query string hits the 2nd record twice, and the 1st record once.
It's safe to assume case-insensitivity here.
So here's what I have so far, in ruby/active record:
query_string = "josh new york"
some_attributes = [:name, :location]
results = {}
query_string.downcase.split.each do |query_part|
  some_attributes.each do |attribute|
    find(:all, :conditions => ["#{attribute} like ?", "%#{query_part}%"]).each do |result|
      if results[result]
        results[result] += 1
      else
        results[result] = 1
      end
    end
  end
end
results.sort{|a,b| b[1]<=>a[1]}
The issue I have with this method is that it produces a large number of queries (query_string.split.length * some_attributes.length).
Can I make this more efficient somehow by reducing the number of queries?
I'm okay with sorting within ruby, although if that can somehow be jammed into the SQL that'd be nice too.
Why aren't you using something like Ferret? Ferret is a Ruby + C extension to make a full text index. Since you seem to be using ActiveRecord, there's also acts_as_ferret.
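If a full-text index is more than you need, another option is to load the candidate rows once (for example with a single query that ORs all the like conditions together) and do the counting in Ruby. The sketch below shows only the scoring step and uses plain hashes in place of ActiveRecord objects. rank and the record layout are assumptions for illustration.

```ruby
# Count how many (term, attribute) pairs match each record, then rank by count.
# Records are plain hashes here; ties are broken by id.
def rank(records, query_string, attributes)
  terms = query_string.downcase.split
  scores = Hash.new(0)

  records.each do |record|
    terms.each do |term|
      attributes.each do |attribute|
        scores[record[:id]] += 1 if record[attribute].to_s.downcase.include?(term)
      end
    end
  end

  scores.sort_by { |id, hits| [-hits, id] }.map(&:first)
end

records = [
  { id: 1, name: "Josh", location: "los angeles" },
  { id: 2, name: "Josh", location: "york" }
]
p rank(records, "josh york", [:name, :location])  # [2, 1]
```

This trades many small queries for one larger fetch, so it only makes sense while the candidate set stays small enough to hold in memory.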