Search Ruby on Rails Database for Key Value Pair - ruby-on-rails-3

Hey all, I'm getting hung up on how to search a database for a specific key pair. What I'm doing is sending a combined array of objects that represent data not sent for that user for the day. Here is the basic logic:
Unless the database contains the object_id and user_id
insert it into the array
The association is many objects to many users.
Is there a way to search the table cases_users with a query like this?
unless cases_users.contains? {object_id, user_id}
objectarray.push(object_id)

I think you want
objectarray.push(object_id) unless User.find_by_object_id_and_user_id(object_id, user_id)
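To make the underlying lookup concrete, here is a minimal sketch in Python with SQLite rather than Rails. The cases_users table and its object_id / user_id columns come from the question; the helper name unsent_object_ids and the sample rows are made up for illustration. It fetches the user's existing pairs once and filters the candidates in memory instead of running one query per object.

```python
import sqlite3


def unsent_object_ids(conn, user_id, candidate_ids):
    """Return the candidates that have no (object_id, user_id) row yet."""
    rows = conn.execute(
        "SELECT object_id FROM cases_users WHERE user_id = ?", (user_id,)
    ).fetchall()
    already_sent = {row[0] for row in rows}
    return [oid for oid in candidate_ids if oid not in already_sent]


# Hypothetical sample data: user 7 already has objects 1 and 2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cases_users (object_id INTEGER, user_id INTEGER)")
conn.executemany(
    "INSERT INTO cases_users VALUES (?, ?)", [(1, 7), (2, 7), (3, 8)]
)

objectarray = unsent_object_ids(conn, 7, [1, 2, 3, 4])
print(objectarray)  # [3, 4]
```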

Related

EF: Inserting already present record in many to many relationship

From what I found, there are two ways to add an already existing record to an ICollection list:
group.Users.Add(db.Users.FirstOrDefault(x => x.Id == 1));
var to_add = new User { Id = 1 }; db.Users.Attach(to_add); group.Users.Add(to_add);
The problem with both of the above approaches is that they make a db call every time we want to add a record, even though we already know the user's Id and the group's Id, and that is all that is needed to create a relationship.
Imagine a long list to be added: both of the above methods would make multiple calls to the db.
So you have Groups and Users. Every Group has zero or more Users; every User has zero or more Groups. A traditional many-to-many relationship.
Normally one would add a User to a Group, or a Group to a User. However, you don't have a Group or a User; you only have a GroupId and a UserId, and because of the large number of insertions you don't want to fetch the Users and the Groups between which you want to create relations.
The problem is, if you could add the GroupId-UserId combination directly to your junction table, how would you know that you weren't adding a Group-User relation that already exists? If you didn't care, you'd end up with the same relation twice. This would lead to problems: would you want them to be shown twice if you asked for the Users of a Group? Which one should be removed if the relation ends, or should they all be removed?
If you really want to allow double relations, then you'd need to implement a custom junction table as described here. The extra field would be the number of relations.
This would not help you with your large batch, because you would still need to fetch the field from the custom junction table to increment the NrOfRelations value.
On the other hand, if you don't want double relations, you'd have to check whether the value already exists, which is exactly the fetch before inserting that you wanted to avoid.
Usually the number of additions to a database is far smaller than the number of queries. If you have a large batch of data to insert, it is usually only during the initialization phase of the database. I wouldn't bother optimizing initialization too much.
Consider remembering already fetched Groups and Users in a dictionary, so that they are not fetched twice. However, if your list is really huge, this is not a practical solution.
If you really need this functionality for a prolonged period of time, consider creating a stored procedure that checks whether the GroupId / UserId combination already exists in the junction table and, if not, adds it.
See here for SQL code showing how to do an add-or-update.
Entity Framework call stored procedure
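As a rough illustration of the "add only if the pair is not there yet" idea, here is a sketch in Python with SQLite. SQLite has no stored procedures, so the check lives in a plain INSERT ... WHERE NOT EXISTS statement. The table name GroupUsers and the helper add_relations are assumptions made for this example, not names from Entity Framework.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed junction table; the composite primary key forbids duplicate pairs.
conn.execute(
    """
    CREATE TABLE GroupUsers (
        GroupId INTEGER NOT NULL,
        UserId  INTEGER NOT NULL,
        PRIMARY KEY (GroupId, UserId)
    )
    """
)


def add_relations(conn, pairs):
    """Insert each (group_id, user_id) pair unless that row already exists."""
    conn.executemany(
        """
        INSERT INTO GroupUsers (GroupId, UserId)
        SELECT ?, ?
        WHERE NOT EXISTS (
            SELECT 1 FROM GroupUsers WHERE GroupId = ? AND UserId = ?
        )
        """,
        [(g, u, g, u) for g, u in pairs],
    )
    conn.commit()


# Only ids are passed in; no Group or User rows are fetched first.
add_relations(conn, [(1, 10), (1, 11), (1, 10)])
add_relations(conn, [(1, 11), (2, 10)])

relations = conn.execute(
    "SELECT GroupId, UserId FROM GroupUsers ORDER BY GroupId, UserId"
).fetchall()
print(relations)  # [(1, 10), (1, 11), (2, 10)]
```

The existence check still happens, but inside the database, so the application never has to load the entities.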

Exchangeable fields in SQL index

I'm designing a table for a list of chats. Every chat references two users and must be unique per given user pair. Evidently, a user pair is symmetric under permutation: it does not matter which user in a pair comes first and which one comes second.
Suppose the user table has an integer UserID as PK. Then the chat table will have a pair of FK fields: UserID1, UserID2. I want every pair of users to have a unique record in the chats table. I can create a unique INDEX(UserID1, UserID2). However, this index will not be unique in terms of a user pair, because it also admits the permutation where UserID2 comes first: (UserID1, UserID2) and (UserID2, UserID1) will be two distinct pairs, while by the required logic they should be treated as one record.
Is there a way to implement this construct in pure SQL without external coding, such as DB triggers or scripting? I'm using MS SQL Server for prototyping, but I want the design to be as universal and neutral as possible, so that it is compatible with most SQL-compliant databases. It is more a question of optimal architecture than of specific SQL implementation code as such.
Possible ideas:
Make an index on a hash of ordered user pairs
Check user order and put the user with the smaller ID first in all queries
But all of these put restrictions on external queries outside of SQL.
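One way to picture the second idea is to let the table itself enforce the ordering with a CHECK constraint, so that a plain UNIQUE constraint covers both permutations. The sketch below uses Python with SQLite purely for illustration. The table name Chats, the ChatID column and the helper create_chat are assumptions, and callers still have to supply the pair in sorted order, which is the restriction the question mentions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE Chats (
        ChatID  INTEGER PRIMARY KEY,
        UserID1 INTEGER NOT NULL,
        UserID2 INTEGER NOT NULL,
        CHECK (UserID1 < UserID2),   -- smaller id must come first
        UNIQUE (UserID1, UserID2)    -- so each pair can appear only once
    )
    """
)


def create_chat(conn, user_a, user_b):
    """Insert a chat for the pair, always storing the smaller id first."""
    low, high = sorted((user_a, user_b))
    conn.execute(
        "INSERT INTO Chats (UserID1, UserID2) VALUES (?, ?)", (low, high)
    )


create_chat(conn, 5, 3)  # stored as (3, 5)

try:
    create_chat(conn, 3, 5)  # same pair again, violates UNIQUE
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

try:
    # Bypassing the helper with the larger id first violates the CHECK.
    conn.execute("INSERT INTO Chats (UserID1, UserID2) VALUES (?, ?)", (5, 3))
    unordered_rejected = False
except sqlite3.IntegrityError:
    unordered_rejected = True

print(duplicate_rejected, unordered_rejected)  # True True
```

Note that the strict comparison also rules out a chat between a user and themselves.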

What is this form of database called?

I'm new to databases and I'm thinking of creating one for a website. I started with SQL, but I really am not sure if I'm using the right kind of database.
Here's the problem:
What I have right now is the first option. That means my query looks something like this:
user_id  photo_id  photo_url
0        0         abc.jpg
0        1         123.jpg
0        2         lol.png
etc. But to me that seems a little inefficient when the database becomes big. So what I want is the second option shown in the picture, something like this:
user_id  photos
0        {abc.jpg, 123.jpg, lol.png}
Or something like this:
user_id  photo_ids
0        {0, 1, 2}
I couldn't find anything like that; I only find ordinary SQL. Is there any way to do something like that (even if it isn't considered a "database")? If not, why is SQL more efficient for those kinds of situations? How can I make it more efficient?
Thanks in advance.
Your initial approach to having a user_id, photo_id, photo_url is correct. This is the normalized relationship that most database management systems use.
This relationship is called "one to many," as a user can have many photos.
You may want to go as far as separating the photo details and just providing a reference table between the users and photos.
The reason your second approach is inefficient is that databases are not designed to search or store multiple values in a single column. While it's possible to store data in this fashion, you shouldn't.
If you wanted to locate a particular photo for a user using your second approach, you would have to search using LIKE, which will most likely not make use of any indexes. The process of extracting or listing those photos would also be inefficient.
You can read more about basic database principles here.
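To illustrate the point about LIKE, here is a small sketch in Python with SQLite that stores the sample photos from the question both ways. The table names photos and users_packed, the index name and the comma-separated packing format are assumptions made for this example. Besides being hard to index, the packed column also matches partial file names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- First option: one row per photo.
    CREATE TABLE photos (
        user_id   INTEGER NOT NULL,
        photo_id  INTEGER NOT NULL,
        photo_url TEXT    NOT NULL,
        PRIMARY KEY (user_id, photo_id)
    );
    CREATE INDEX photos_by_url ON photos (photo_url);

    -- Second option: all photos packed into one column.
    CREATE TABLE users_packed (
        user_id INTEGER PRIMARY KEY,
        photos  TEXT NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO photos VALUES (?, ?, ?)",
    [(0, 0, "abc.jpg"), (0, 1, "123.jpg"), (0, 2, "lol.png")],
)
conn.execute("INSERT INTO users_packed VALUES (0, 'abc.jpg,123.jpg,lol.png')")

# One row per photo: an exact comparison that an index can serve.
normalized_hit = conn.execute(
    "SELECT user_id, photo_id FROM photos WHERE photo_url = ?", ("123.jpg",)
).fetchall()
normalized_miss = conn.execute(
    "SELECT user_id, photo_id FROM photos WHERE photo_url = ?", ("23.jpg",)
).fetchall()

# Packed column: only a substring search is possible, and it also
# matches "23.jpg" because that text occurs inside "123.jpg".
packed_false_positive = conn.execute(
    "SELECT user_id FROM users_packed WHERE photos LIKE ?", ("%23.jpg%",)
).fetchall()

print(normalized_hit)         # [(0, 1)]
print(normalized_miss)        # []
print(packed_false_positive)  # [(0,)]
```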
Your first example looks like a traditional relational database, where a table stores a single record per row in a standard 1:1 key-value attribute set. This is how data is stored in RDBMSs like Oracle, MySQL and SQL Server. Your second example looks more like a document database or NoSQL database, where data is stored in nested data objects (like hashes and arrays). This is how data is stored in database systems like MongoDB.
There are benefits and costs to storing data in either model. With relational databases, where data is spread across multiple tables and linked by keys, it is easy to get at data from multiple angles and aggregate it for multiple purposes. With document databases, data is typically more difficult to join in single queries, but much faster to retrieve, and also typically formatted for quicker application use.
For your application, the latter (document database model) might be best if you only care about referencing a user's images when you have a user ID. This would not be ideal for say, querying for all images of category 'profile pic' or for all images uploaded after a certain date. You could probably accomplish your task with either database type, and choosing the right database will always depend on the application(s) that it will be used for, but as a general rule-of-thumb, relational databases are more flexible and hard to go wrong with.
What you want (having user -> (photo1, photo2, ...)) is essentially what an INDEX gives you:
When you execute your query, the database goes to the index on the "user" column of the photos table and gets the list of photos to fetch from there. It does not scan the whole table, so the lookup is optimised.
I would do something like this:
Users_Table (One User - One Photo)
This holds all the columns that every user will have. If one user will have only one photo, then just add a photo_url column to this table.
One User, Many Photos
If one user can have multiple photos, then create a separate table for photos which contains only the UserID from Users_Table, the Photo_ID and the Photo_File.
Many Users, Many Photos
If one photo can be assigned to multiple users, then create a separate Photos table containing Photo_ID and Photo_File, plus a third table, User_Photos, which holds the UserID from Users_Table and the Photo_ID from the Photos table.

Storing Allowed Websites Per User in Postgres

I have a User table in my Postgres database. In my application, the User can have various allowed websites. My question is: which is more disk space efficient, having a many-to-many relationship between a user and a url, or storing the array as JSON in a column in the User table? Essentially, how much space does Postgres use to store table headers?
Thanks.
which is more disk space efficient, having a many-to-many relationship between a user and a url, or storing the array as JSON in a column in the User table?
Updating a many-to-many relationship means a single INSERT (and/or DELETE) statement.
Updating a JSON array stored in a database table means:
SELECTing the data to get it out of the database, to the application
Manipulating the data in the application
UPDATE statement to write the updated JSON array back to the table
Which is simpler/more efficient to you?
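The difference between the two update paths can be sketched as follows, using Python with SQLite instead of Postgres. The table names users_json and user_sites, the column names and both helper functions are made up for this example. INSERT OR IGNORE is SQLite syntax; in Postgres the closest equivalent would be INSERT ... ON CONFLICT DO NOTHING.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users_json (
        user_id       INTEGER PRIMARY KEY,
        allowed_sites TEXT NOT NULL      -- JSON array stored as text
    );
    CREATE TABLE user_sites (
        user_id INTEGER NOT NULL,
        url     TEXT    NOT NULL,
        PRIMARY KEY (user_id, url)
    );
    """
)
conn.execute("INSERT INTO users_json VALUES (1, ?)", (json.dumps(["a.example"]),))
conn.execute("INSERT INTO user_sites VALUES (1, 'a.example')")


def allow_site_json(conn, user_id, url):
    """JSON column: read the array, change it in the app, write it back."""
    (raw,) = conn.execute(
        "SELECT allowed_sites FROM users_json WHERE user_id = ?", (user_id,)
    ).fetchone()
    sites = json.loads(raw)
    if url not in sites:
        sites.append(url)
    conn.execute(
        "UPDATE users_json SET allowed_sites = ? WHERE user_id = ?",
        (json.dumps(sites), user_id),
    )


def allow_site_rows(conn, user_id, url):
    """Separate table: one statement, no round trip through the app."""
    conn.execute(
        "INSERT OR IGNORE INTO user_sites (user_id, url) VALUES (?, ?)",
        (user_id, url),
    )


allow_site_json(conn, 1, "b.example")
allow_site_rows(conn, 1, "b.example")

json_sites = json.loads(
    conn.execute("SELECT allowed_sites FROM users_json WHERE user_id = 1").fetchone()[0]
)
row_sites = [
    r[0]
    for r in conn.execute("SELECT url FROM user_sites WHERE user_id = 1 ORDER BY url")
]
print(json_sites)  # ['a.example', 'b.example']
print(row_sites)   # ['a.example', 'b.example']
```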

What is the best way to add users to multiple groups in a database?

In an application where users can belong to multiple groups, I'm currently storing their groups in a column called groups as a binary. Every four bytes is a 32-bit integer, which is the GroupID. However, this means that to enumerate all the users in a group I have to programmatically select all users and manually find out whether they contain that group.
Another method was to use a Unicode string, where each character is the integer denoting a group. This makes searching easy, but is a bit of a fudge.
Another method is to create a separate table, linking users to groups. One column called UserID and another called GroupID.
Which of these ways would be the best to do it? Or is there a better way?
You have a many-to-many relationship between users and groups. This calls for a separate table to combine users with groups:
User: (UserId[PrimaryKey], UserName etc.)
Group: (GroupId[PrimaryKey], GroupName etc.)
UserInGroup: (UserId[ForeignKey], GroupId[ForeignKey])
To find all users in a given group, you just say:
select * from User join UserInGroup on User.UserId = UserInGroup.UserId where GroupId = <the GroupId you want>
Rule of thumb: If you feel like you need to encode multiple values in the same field, you probably need a foreign key to a separate table. Your tricks with byte-blocks or Unicode chars are just clever tricks to encode multiple values in one field. Database design should not use clever tricks - save that for application code ;-)
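For reference, here is the three-table layout as a runnable sketch in Python with SQLite. The sample rows, the index name and the two helper functions are assumptions for illustration. The table names are quoted because GROUP is a reserved word in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE "User"  (UserId  INTEGER PRIMARY KEY, UserName  TEXT NOT NULL);
    CREATE TABLE "Group" (GroupId INTEGER PRIMARY KEY, GroupName TEXT NOT NULL);
    CREATE TABLE UserInGroup (
        UserId  INTEGER NOT NULL REFERENCES "User"(UserId),
        GroupId INTEGER NOT NULL REFERENCES "Group"(GroupId),
        PRIMARY KEY (UserId, GroupId)
    );
    -- Second index so that lookups by group are as cheap as lookups by user.
    CREATE INDEX UserInGroup_by_group ON UserInGroup (GroupId, UserId);

    INSERT INTO "User"  VALUES (1, 'alice'), (2, 'bob'), (3, 'carol');
    INSERT INTO "Group" VALUES (10, 'admins'), (20, 'editors');
    INSERT INTO UserInGroup VALUES (1, 10), (2, 10), (2, 20), (3, 20);
    """
)


def users_in_group(conn, group_id):
    """Names of all users that belong to the given group."""
    rows = conn.execute(
        """
        SELECT u.UserName
        FROM "User" AS u
        JOIN UserInGroup AS ug ON ug.UserId = u.UserId
        WHERE ug.GroupId = ?
        ORDER BY u.UserName
        """,
        (group_id,),
    )
    return [r[0] for r in rows]


def groups_of_user(conn, user_id):
    """Names of all groups the given user belongs to."""
    rows = conn.execute(
        """
        SELECT g.GroupName
        FROM "Group" AS g
        JOIN UserInGroup AS ug ON ug.GroupId = g.GroupId
        WHERE ug.UserId = ?
        ORDER BY g.GroupName
        """,
        (user_id,),
    )
    return [r[0] for r in rows]


print(users_in_group(conn, 10))  # ['alice', 'bob']
print(groups_of_user(conn, 2))   # ['admins', 'editors']
```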
I'd definitely go for the separate table - certainly the best relational view of data. If you have indexes on both UserID and GroupID you have a quick way of getting users per group and groups per user.
The more standard, usable and comprehensible way is the join table. It's easily supported by many ORMs, in addition to being reasonably performant for most cases. Only resort to "clever" approaches if you have a reason to, say a million users and having to answer that question every half a second.
I would make 3 tables: users, groups and usersgroups, which is used as a cross-reference table to link users and groups. In the usersgroups table I would add userId and groupId columns and make them the composite primary key. BTW, what naming conventions are there for those xref tables?
It depends what you're trying to do, but if your database supports it, you might consider using roles. The advantage of this is that the database provides security around roles, and you don't have to create any tables.