I have a table "friends" with "user1" and "user2" columns.
How can I remove duplicate relationships?
For example, if I already have
user1=1 and user2=3,
how can I remove user1=3 and user2=1 values?
Could I enforce that in MySQL with a unique index?
With this query you can find duplicate relationships
SELECT *
FROM friends d
WHERE EXISTS (
SELECT 1
FROM friends f
WHERE f.user1 = d.user2
AND f.user2 = d.user1
AND f.user2 < f.user1
)
I recommend put the lower userId for user1 and create a unique index
For example, don't allow rows, where user1 > user2.
Depending on the other columns in your table you could create a composite primary key over the 2 columns to enforce the uniqueness.
To find the duplicates, just join the table to itself. It is not difficult.
Related
I used a query to find a list of Primary Keys. One Primary key per each ForiegnKey in a table by using below query.
select foreignKey, min(primaryKey)
from t
group by foreignKey;
Let us say this is the result : 1,4,5
NOw I have another table - Table B that has list of all Primary keys. It has 1,2,3,6,7,8,9
I want a write a query using the above query So that I get a subset of the original query(above) that does not exist in Table B. I want 4 and 5 back with the new query.
Use a having clause:
select foreignKey, min(primaryKey)
from t
group by foreignKey
having min(primarykey) not in (select pk from b);
You should also be able to express this as not exists:
having not exists (select 1
from b
where b.pk = min(t.primaryKey)
)
I have table A with fields user1 and user2 and table B with user3 and user4 pairs.
I want to be able to find any rows from table A where the user1/user2 combination is also in table B.
The order doesn't matter. For example table A with user1=Mike, user2=Joe would match table B with user3=Joe and user4=Mike.
I'd probably just use an explicit or in a join:
select user1, user2
from tableA join tableB on
(user1=user3 and user2=user4) or
(user1=user4 and user2=user3)
...but take it with a grain of salt. I've been accused of over-using joins, probably with at least a little reason.
select a.user1, a.user2 from a, b
where (a.user1 == b.user3 and a.user2 == b.user4)
or (a.user1 = b.user4 and a.user2 = b.user3);
I'm posting this question which is somewhat a summary of my other question.
I have two databases:
1) db_users.
2) db_friends.
I stress that they're stored in separate databases on different servers and therefore no foreign keys can be used.
In 'db_friends' I have the table 'tbl_friends' which has the following columns:
- id_user
- id_friend
Now how do I make sure that each pair is unique at this table ('tbl_friends')?
I'd like to enfore that at the table level, and not through a query.
For example these are invalid rows:
1 - 2
2 - 1
I'd like this to be impossible to add.
Additionally - how would I seach for all of the friends of user 713 while he could be mentioned, on some friendship rows, at the second column ('id_friend')?
You're probably not going to be able to do this at the database level -- your application code is going to have to do this. If you make sure that your tbl_friends records always go in with (lowId, highId), then a typical PK/Unique Index will solve the duplicate problem. In fact, I'd go so far to rename the columns in your tbl_friends to (id_low, id_high) just to reinforce this.
Your query to find anything with user 713 would then be something like
SELECT id_low AS friend FROM tbl_friends WHERE (id_high = ?)
UNION ALL
SELECT id_high AS friend FROM tbl_friends WHERE (id_low = ?)
For efficiency, you'd probably want to index it forward and backward -- that is by (id_user, id_friend) and (id_friend, id_user).
If you must do this at a DB level, then a stored procedure to swap arguments to (low,high) before inserting would work.
You'd have to use a trigger to enforce that business rule.
Making the two columns in tbl_friends the primary key (unique constraint failing that) would only ensure there can't be duplicates of the same set: 1, 2 can only appear once but 2, 1 would be valid.
how would I seach for all of the friends of user 713 while he could be mentioned, on some friendship rows, at the second column ('id_friend')?
You could use an IN:
WHERE 713 IN (id_user, id_friend)
..or a UNION:
JOIN (SELECT id_user AS user
FROM TBL_FRIENDS
UNION ALL
SELECT id_friend
FROM TBL_FRIENDS) x ON x.user = u.user
Well, a unique constraint on the pair of columns will get you half way there. I think the easiest way to ensure you don't get the reversed version would be to add a constraint ensuring that id_user < id_friend. You will need to compensate for this ordering at insertion time, but it will get you the database Level constraint you desire without duplicating data or relying on foreign keys.
As for the second question, to find all friends for id=1 you could select id_user, id_friend from tbl_friend where id_user = 1 or id_friend = 1 and then in your client code throw out all the 1's regardless of column.
One way you could do it is to store the two friends on two rows:
CREATE TABLE FriendPairs (
pair_id INT NOT NULL,
friend_id INT NOT NULL,
PRIMARY KEY (pair_id, friend_id)
);
INSERT INTO FriendPairs (pair_id, friend_id)
VALUES (1234, 317), (1234, 713);
See? It doesn't matter which order you insert them, because both friends go in the friend_id column. So you can enforce uniqueness easily.
You can also query easily for friends of 713:
SELECT f2.friend_id
FROM FriendPairs AS f1
JOIN FriendPairs AS f2 ON (f1.pair_id = f2.pair_id)
WHERE f1.friend_id = 713
I have a table with some rows in. Every row has a date-field. Right now, it may be duplicates of a date. I need to delete all the duplicates and only store the row with the highest id. How is this possible using a SQL query?
Now:
date id
'07/07' 1
'07/07' 2
'07/07' 3
'07/05' 4
'07/05' 5
What I want:
date id
'07/07' 3
'07/05' 5
DELETE FROM table WHERE id NOT IN
(SELECT MAX(id) FROM table GROUP BY date);
I don't have comment rights, so here's my comment as an answer in case anyone comes across the same problem:
In SQLite3, there is an implicit numerical primary key called "rowid", so the same query would look like this:
DELETE FROM table WHERE rowid NOT IN
(SELECT MAX(rowid) FROM table GROUP BY date);
this will work with any table even if it does not contain a primary key column called "id".
For mysql,postgresql,oracle better way is SELF JOIN.
Postgresql:
DELETE FROM table t1 USING table t2 WHERE t1.date=t2.date AND t1.id<t2.id;
MySQL
DELETE FROM table
USING table, table as vtable
WHERE (table.id < vtable.id)
AND (table.date=vtable.date)
SQL aggregate (max,group by) functions almost always are very slow.
Take for example an application which has users, each of which can be in exactly one group. If we want to SELECT the list of groups which have no members, what would be the correct SQL? I keep feeling like I'm just about to grasp the query, and then it disappears again.
Bonus points - given the alternative senario, where it's a many to many pairing, what is the SQL to identify unused groups?
(if you want concrete field names:)
One-To-Many:
Table 'users': | user_id | group_id |
Table 'groups': | group_id |
Many-To-Many:
Table 'users': | user_id |
Table 'groups': | group_id |
Table 'user-group': | user_id | group_id |
Groups that have no members (for the many-many pairing):
SELECT *
FROM groups g
WHERE NOT EXISTS
(
SELECT 1
FROM users_groups ug
WHERE g.groupid = ug.groupid
);
This Sql will also work in your "first" example as you can substitute "users" for "users_groups" in the sub-query =)
As far as performance is concerned, I know that this query can be quite performant on Sql Server, but I'm not so sure how well MySql likes it..
For the first one, try this:
SELECT * FROM groups
LEFT JOIN users ON (groups.group_id=users.group_id)
WHERE users.user_id IS NULL;
For the second one, try this:
SELECT * FROM groups
LEFT JOIN user-group ON (groups.group_id=user-group.group_id)
WHERE user-group.user_id IS NULL;
SELECT *
FROM groups
WHERE groups.id NOT IN (
SELECT user.group_id
FROM user
)
It will return all group id which not present in user