Efficient query for distinct count and group with two columns - sql

Given a simple model that consists of descriptions, tags, and some other fields
The results should be:
a list of all tags in Entry.all without duplicates (e.g. Entry.select("DISTINCT(tag)") )
the number of duplicates for each tag, also used to sort tags
all descriptions for each tag sorted alphabetically, again without duplicates (however, the exactly same description can exist with a different tag)
Is it possible to combine this in one (efficient) query?
Edit:
def change
create_table :entries do |t|
t.datetime :datum, :null => false
t.string :description
t.string :tag
(and some others)
end
add_index :entries, :user_id
end

It's better to create an additional table:
rails g model Tag name:string description:string
rails g model Entry tag:references ...
And then just call them:
@entries = Entry.select('tag_id, count(tag_id) as total').group(:tag_id).includes(:tag)
After that, you will have all descriptions in your object:
@entries.first.tag.description # description of the entry's tag
@entries.first.total # number of entries with this tag (total is selected on the entry row, not on the tag)
P.S.: Why just one tag per entry?
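To make the requested result concrete, here is a minimal plain-Ruby sketch that computes the three pieces (distinct tags, a count per tag used for sorting, and de-duplicated sorted descriptions) from in-memory rows. The hash layout and the tag_summary helper are hypothetical stand-ins for the entries table; this illustrates the target shape and is not the SQL query itself.

```ruby
# Hypothetical helper: entries is an array of hashes with :tag and :description,
# standing in for rows of the entries table.
# Returns one hash per distinct tag, most frequent tag first (ties broken by
# tag name), with that tag's descriptions de-duplicated and sorted.
def tag_summary(entries)
  rows = entries.group_by { |entry| entry[:tag] }.map do |tag, group|
    {
      tag: tag,
      count: group.size,
      descriptions: group.map { |entry| entry[:description] }.uniq.sort
    }
  end
  rows.sort_by { |row| [-row[:count], row[:tag]] }
end

sample_entries = [
  { tag: "food",   description: "pizza" },
  { tag: "food",   description: "apple" },
  { tag: "food",   description: "pizza" },
  { tag: "travel", description: "train" }
]

tag_summary(sample_entries).each do |row|
  puts "#{row[:tag]} (#{row[:count]}): #{row[:descriptions].join(', ')}"
end
```

The same description may still appear under a different tag, since de-duplication happens per tag.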


database design (joining 3 tables together)

My goal is to create a web app that shows election results from my country.
The data is the results for every candidate in every city for every election.
An election has many candidates and many cities.
A candidate has many elections and many cities.
A city has many elections and many candidates.
For the 2nd round of the last presidential election:
City  | inscrits (registered) | votants (voted) | exprime (valid votes) | candidate1 | score C1 | candidate2 | score C2
Dijon | 129000                | 100000          | 80000                 | Macron     | 50000    | Le Pen     | 30000
Lyon  | 1000000               | 900000          | 750000                | Macron     | 450000   | Le Pen     | 300000
How can I join those 3 tables together?
Is it possible to create a join table between the three, like this?
create_table "results", force: :cascade do |t|
t.integer "election_id", null: false
t.integer "candidate_id", null: false
t.integer "city_id", null: false
t.datetime "created_at", null: false
t.datetime "updated_at", null: false
t.index ["city_id"], name: "index_results_on_city_id"
t.index ["candidate_id"], name: "index_results_on_candidate_id"
t.index ["election_id"], name: "index_results_on_election_id"
end
But in this case, where can I add the per-city information for an election? (Columns 2, 3, and 4 of my example data, i.e. in this city, for this election, XXX people voted and XXX didn't vote.)
I came up with this database schema:
[image: my database schema]
This will not work because I will not be able to access the result of a candidate in a specific city for a specific election. It looks like there is no connection between cities and candidates.
To actually tie these models together and record the required data, you need a series of tables that record the election results at each level you're interested in:
# rails g model national_result candidate:belongs_to election:belongs_to votes:integer percentage:decimal
class NationalResult < ApplicationRecord
belongs_to :candidate
belongs_to :election
delegate :name, to: :candidate,
prefix: true
end
# rails g model city_result city:belongs_to candidate:belongs_to election:belongs_to votes:integer percentage:decimal
class CityResult < ApplicationRecord
belongs_to :city
belongs_to :candidate
belongs_to :election
delegate :name, to: :candidate,
prefix: true
end
Instead of having C1 and C2 columns, use one row per candidate to record their result. That lets you use the same table layout even if there are more than two candidates (as in a primary) and avoids the problem of figuring out which column a candidate is in. Use foreign keys and record the primary key instead of filling your table with duplicates of the candidates' names, which can easily become inconsistent.
While you might naively think "But I don't need NationalResult, I can just sum up all the CityResults!", that process would actually expose any problems in your data set and would very likely be quite expensive. Get the data from a reputable source instead.
You can then create the has_many associations on the other side:
class Candidate < ApplicationRecord
has_many :city_results
has_many :national_results
end
class Election < ApplicationRecord
has_many :city_results
has_many :national_results
end
class City < ApplicationRecord
has_many :city_results
end
Keeping track of the number of eligible voters per election/city will most likely require another table.
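As a rough illustration of the one-row-per-candidate layout, the sketch below stores the question's sample figures as plain Ruby hashes. These are hypothetical stand-ins for city_results rows, with names used in place of foreign keys for readability, and the helper names and the "presidential round 2" label are assumptions. It shows that a candidate's result in a given city and election becomes a simple lookup, and that per-candidate sums can be used to cross-check a separately sourced national result.

```ruby
# Hypothetical in-memory stand-ins for rows of a city_results table.
# Real rows would hold election_id, city_id and candidate_id, not names.
def sample_city_results
  [
    { election: "presidential round 2", city: "Dijon", candidate: "Macron", votes: 50_000 },
    { election: "presidential round 2", city: "Dijon", candidate: "Le Pen", votes: 30_000 },
    { election: "presidential round 2", city: "Lyon",  candidate: "Macron", votes: 450_000 },
    { election: "presidential round 2", city: "Lyon",  candidate: "Le Pen", votes: 300_000 }
  ]
end

# Votes for one candidate in one city for one election, or nil if no row exists.
def votes_for(results, election:, city:, candidate:)
  row = results.find do |r|
    r[:election] == election && r[:city] == city && r[:candidate] == candidate
  end
  row && row[:votes]
end

# Sum of city-level votes per candidate for one election.
def totals_by_candidate(results, election:)
  results
    .select { |r| r[:election] == election }
    .group_by { |r| r[:candidate] }
    .transform_values { |rows| rows.sum { |r| r[:votes] } }
end

results = sample_city_results
puts votes_for(results, election: "presidential round 2", city: "Lyon", candidate: "Le Pen")
puts totals_by_candidate(results, election: "presidential round 2").inspect
```

Adding a third candidate only adds rows; no column has to change.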

Rails 4 Generating Invalid Column Names

I have a fairly simple query to return the first record in a many-to-many relation or create one if it doesn't exist.
UserCategorization.where(category_id: 3, user_id: 5).first_or_create
My model looks like:
class UserCategorization < ActiveRecord::Base
belongs_to :user
belongs_to :category
self.primary_key = [:user_id, :category_id]
end
However it generates an invalid column name in the SQL:
SQLite3::SQLException: no such column: user_categorizations.[:user_id, :category_id]:
SELECT "user_categorizations".* FROM "user_categorizations" WHERE
"user_categorizations"."category_id" = 3 AND "user_categorizations"."user_id" = 5
ORDER BY "user_categorizations"."[:user_id, :category_id]" ASC LIMIT 1
If I remove self.primary_key = [:user_id, :category_id] from the model, it can retrieve the record correctly but cannot save because it doesn't know what to use in the WHERE clause:
SQLite3::SQLException: no such column: user_categorizations.:
UPDATE "user_categorizations" SET "score" = ?
WHERE "user_categorizations"."" IS NULL
Has anyone seen this before?
I think one of these two suggestions will work:
First, try adding the following migration:
add_index :user_categorizations, [:user_id, :category_id]
Make sure to keep self.primary_key = [:user_id, :category_id] in your UserCategorization model.
If that doesn't work, destroy the UserCategorization table and run this migration:
def change
create_table :user_categorizations do |t|
t.references :user
t.references :category
t.timestamps
end
end
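t.references is shorthand for adding the user_id and category_id foreign key columns; pass index: true if you also want an index on each. Because this table keeps its default id primary key, the composite primary_key line is no longer needed.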
Good Luck!
So it looks like Rails 4 ActiveRecord doesn't do composite keys very well so many-to-many models create the issues above. I fixed it by using this extension to ActiveRecord: http://compositekeys.rubyforge.org/

Order Players on the SUM of their association model

I have a database with 6500 players and each player has an average of 15 game results.
Use case
I want to generate a list of players, ordered by the sum of their prize money (a field in the results table).
I prefer this to be in some sort of scope, so I can also filter the list on the player's country, etc.
Performance
I have seen posts that mention a counter_cache field for performance. In my case I have thousands of result records (75.000+), so I don't want the calculations to be done every time someone visits the generated listings.
Question
What is the best pattern to solve this? And how do I implement it?
Models
class Player < ActiveRecord::Base
has_many :results
end
class Result < ActiveRecord::Base
belongs_to :player
end
Schemas
create_table "players", :force => true do |t|
t.string "name"
t.string "nationality"
end
create_table "results", :force => true do |t|
t.integer "player_id"
t.date "event_date"
t.integer "place"
t.integer "prize"
end
Update
What I am trying to accomplish is getting to a point where I can use:
@players = Player.order_by_prize
and
@players = Player.filter_by_country('USA').order_by_prize('desc')
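You should be able to use something like this:
class Player < ActiveRecord::Base
# Sum each player's prizes in SQL, one row per player (hence the GROUP BY).
# The lambda form of scope is required from Rails 4 on.
scope :order_by_prize, -> { joins(:results).select('players.*, SUM(results.prize) AS total_prize').group('players.id').order('total_prize DESC') }
end
Refer to the Rails guide on Active Record querying for details.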
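For intuition, the join, SUM and GROUP BY that such a scope asks the database to perform can be mirrored in plain Ruby. The sketch below uses hypothetical in-memory hashes in place of the players and results tables, and players_by_prize is an illustrative name; it only shows the computation and is not a substitute for doing the aggregation in SQL.

```ruby
# Hypothetical helper mirroring "sum prizes per player, highest first".
# players: array of hashes with :id, :name, :nationality
# results: array of hashes with :player_id, :prize
# Players without results get a total of 0 here; an SQL inner join would drop them.
def players_by_prize(players, results, nationality: nil)
  totals = Hash.new(0)
  results.each { |result| totals[result[:player_id]] += result[:prize] }

  selected = nationality ? players.select { |pl| pl[:nationality] == nationality } : players
  selected
    .map { |pl| pl.merge(total_prize: totals[pl[:id]]) }
    .sort_by { |pl| -pl[:total_prize] }
end

players = [
  { id: 1, name: "Ann", nationality: "USA" },
  { id: 2, name: "Bob", nationality: "NLD" },
  { id: 3, name: "Cy",  nationality: "USA" }
]
results = [
  { player_id: 1, prize: 100 },
  { player_id: 2, prize: 500 },
  { player_id: 1, prize: 50 },
  { player_id: 3, prize: 300 }
]

players_by_prize(players, results, nationality: "USA").each do |pl|
  puts "#{pl[:name]}: #{pl[:total_prize]}"
end
```

If recomputing the sum on every request turns out to be too slow, one option is to store the total in a column on players and update it whenever a result changes; that trade-off is not covered by the sketch.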

Rails 3 one-to-many relationship question - how to return column name value

I have a one-to-many relationship between 2 tables as follows:
Models:
class MediaType < ActiveRecord::Base
belongs_to :media
end
class Media < ActiveRecord::Base
has_many :media_types
end
The schema, simplified, is:
create_table :media do |t|
t.string "name", :limit => 255
t.integer "media_type_id"
end
create_table :media_types do |t|
t.string "name", :limit => 255
end
Once I insert a Media record relating to a media_type_id, how do I pull back the media_type.name value related to the media record?
I blindly tried:
media = Media.find(1)
media.media_type_id.name
But that didn't work of course. Is my schema perhaps not following Rails conventions?
Appreciate any help.
If your idea is that a media_type has many medias, but every media has only one media_type, then you need different models:
class MediaType < ActiveRecord::Base
has_many :medias
end
class Media < ActiveRecord::Base
belongs_to :media_type
end
And then
media = Media.find(1)
media.media_type.name
gives you the name.
It seems that media has_many media_types.
In that case you would create a media_id column in the media_types table, but you did it the other way around.
You would then access each relation on an instance:
types = media.media_types
to get the media_types that the media record has, and
media = media_type.media
to get the media that the media_type belongs to.

Rails 3 HABTM find by record associated model attribute

I think this is really basic but I'm horrible with SQL so I have no idea how to do it...
I have a standard HABTM relationship between two models, LandUse and Photo. So I have a land_uses_photos join table, and each model has the standard macro, like:
Photo
has_and_belongs_to_many :land_uses
The land use table has: ID, and Name(string).
I want to find Photos where Land Use Name = 'foo','bar',or 'baz'. How do I do that?
I looked at this question and tried:
Photo.includes(:land_uses).where('land_use.id'=>[6,7])
... but that gave me:
ActiveRecord::StatementInvalid: No attribute named `id` exists for table `land_use`
That's bogus, here's the schema.rb for both LandUse and the join table:
create_table "land_uses", :force => true do |t|
t.string "name"
t.datetime "created_at"
t.datetime "updated_at"
t.integer "display_order"
end
create_table "land_uses_photos", :id => false, :force => true do |t|
t.integer "land_use_id"
t.integer "photo_id"
end
So, how do I do this kind of find? And to make it only one question instead of two, how could I find with an "and" condition instead of only an "or" condition?
Thanks!
Photo.joins(:land_uses).where('land_uses.name' => ['foo', 'bar', 'baz'])
You get the 'id' error because the table name is land_uses, not land_use.
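The second part of the question (an "and" condition) is not covered above. A common SQL approach is to keep the same IN filter, group by photo, and require that the number of distinct matching land uses equals the number of requested names. The plain-Ruby sketch below illustrates the difference between the two conditions on hypothetical in-memory data; the hash layout and helper names are assumptions for illustration, not the ActiveRecord API.

```ruby
require "set"

# "or": photos tagged with at least one of the requested land use names.
def photos_with_any(photos, names)
  wanted = names.to_set
  photos.select { |photo| photo[:land_uses].any? { |name| wanted.include?(name) } }
end

# "and": photos tagged with every one of the requested land use names.
def photos_with_all(photos, names)
  wanted = names.to_set
  photos.select { |photo| wanted.subset?(photo[:land_uses].to_set) }
end

# Hypothetical stand-ins for photos and their joined land use names.
photos = [
  { id: 1, land_uses: ["foo"] },
  { id: 2, land_uses: ["foo", "bar", "baz", "qux"] },
  { id: 3, land_uses: ["qux"] }
]

names = ["foo", "bar", "baz"]
puts photos_with_any(photos, names).map { |photo| photo[:id] }.inspect
puts photos_with_all(photos, names).map { |photo| photo[:id] }.inspect
```

In ActiveRecord terms, the "and" variant would roughly correspond to adding group('photos.id') and a having clause on COUNT(DISTINCT land_uses.id) to the joins/where query above; treat that as an untested outline.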