How to match/compare values in two resultsets in SQL Server 2008?

I'm working on an employee booking application. I've got two different entities, Projects and Users, that are both assigned a variable number of Skills.
I've got a Skills table with the various skills (columns: id, name)
I register the user skills in a table called UserSkills (with two foreign key columns: fk_user and fk_skill)
I register the project skills in another table called ProjectSkills (with two foreign key columns: fk_project and fk_skill).
A project can require maybe 6 different skills, and users set up their skills as well when registering.
The tricky part is when I have to find users for my Projects based on their skills. I'm only interested in users that have ALL the skills required by the project. Users are of course allowed to have more skills than required.
The following code will not work (and even if it did, it would not be very performance friendly), but it illustrates my idea:
SELECT * FROM Users u WHERE
( SELECT us.fk_skill FROM UserSkills us WHERE us.fk_user = u.id )
>=
( SELECT ps.fk_skill FROM ProjectSkills ps WHERE ps.fk_project = [some_id] )
I'm thinking about making my own function that takes two TABLE variables and then working out the comparison in that (kind of a modified IN function), but I'd rather find a solution that's more performance friendly.
I'm developing on SQL Server 2008.
I really appreciate any ideas or suggestions on this. Thanks!

SELECT *
FROM Users u
WHERE NOT EXISTS
(
    -- a skill the project requires ...
    SELECT NULL
    FROM ProjectSkills ps
    WHERE ps.fk_project = @someid
    AND NOT EXISTS
    (
        -- ... that this user does not have
        SELECT NULL
        FROM UserSkills us
        WHERE us.fk_user = u.id
        AND us.fk_skill = ps.fk_skill
    )
)

-- Assumes existence of variable @ProjectId, specifying
-- which project to analyze.
-- Column names are placeholders; in the question's schema they
-- would be fk_user, fk_skill and fk_project.
SELECT us.UserId
from UserSkills us
inner join ProjectSkills ps
    on ps.SkillId = us.SkillId
    and ps.ProjectId = @ProjectId
group by us.UserId
having count(*) = (select count(*)
                   from ProjectSkills
                   where ProjectId = @ProjectId)
You'd want to test and debug this, as I have no test data to run it through. Ditto for indexing to optimize it.
(Now to post, and see if someone's come up with a better way--there should be something more subtle and effective than this.)
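Both answers above are forms of relational division. A minimal sketch for trying them side by side, assuming Python's built-in sqlite3 module as a stand-in for SQL Server: the table and column names follow the question, while the sample rows, the function names and the `?` placeholders (instead of T-SQL variables) are made up for illustration.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database with hypothetical sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Users (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE UserSkills (fk_user INTEGER, fk_skill INTEGER,
                                 PRIMARY KEY (fk_user, fk_skill));
        CREATE TABLE ProjectSkills (fk_project INTEGER, fk_skill INTEGER,
                                    PRIMARY KEY (fk_project, fk_skill));
        INSERT INTO Users VALUES (1, 'ann'), (2, 'bob'), (3, 'cy');
        INSERT INTO UserSkills VALUES (1, 10), (1, 20), (1, 30), (2, 10), (3, 10), (3, 20);
        INSERT INTO ProjectSkills VALUES (100, 10), (100, 20), (200, 30);
    """)
    return conn

def users_with_all_skills(conn, project_id):
    """Double NOT EXISTS: no required skill exists that the user lacks."""
    sql = """
        SELECT u.id
        FROM Users u
        WHERE NOT EXISTS (
            SELECT 1
            FROM ProjectSkills ps
            WHERE ps.fk_project = ?
              AND NOT EXISTS (
                  SELECT 1
                  FROM UserSkills us
                  WHERE us.fk_user = u.id
                    AND us.fk_skill = ps.fk_skill
              )
        )
        ORDER BY u.id
    """
    return [row[0] for row in conn.execute(sql, (project_id,))]

def users_with_all_skills_by_count(conn, project_id):
    """GROUP BY / HAVING: matched skill count equals required skill count.
    Relies on (fk_user, fk_skill) and (fk_project, fk_skill) being unique."""
    sql = """
        SELECT us.fk_user
        FROM UserSkills us
        JOIN ProjectSkills ps
          ON ps.fk_skill = us.fk_skill
         AND ps.fk_project = ?
        GROUP BY us.fk_user
        HAVING COUNT(*) = (SELECT COUNT(*)
                           FROM ProjectSkills
                           WHERE fk_project = ?)
        ORDER BY us.fk_user
    """
    return [row[0] for row in conn.execute(sql, (project_id, project_id))]

conn = build_demo_db()
print(users_with_all_skills(conn, 100))
print(users_with_all_skills_by_count(conn, 100))
```

One difference worth knowing: for a project with no required skills, the NOT EXISTS form returns every user, while the counting form returns nobody, because the join produces no rows to group.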

Related

SQL query to exclude records that are part of a group

I can't believe this hasn't been answered elsewhere, but I don't seem to know the right words to convey what I'm trying to do. I'm using Ruby/Rails and PostgreSQL.
I have a bunch of Users in the DB that I'm trying to add to a Group based on a name search. I need to return Users that do not belong to a particular Group, but there is a join table as well (UserGroups, with the appropriate FKs).
Is there a simple way to use this configuration to perform this query without having to resort to grabbing all the Users which belong to the group and doing something like .where.not(id: users_in_group.pluck(:id))? (These groups can be pretty huge, so I don't want to send that query to the DB on a text search as the user types.)

I need to return Users that do not belong to a particular Group
SELECT *
FROM users u
WHERE username ~ 'some pattern' -- ?
AND NOT EXISTS (
SELECT FROM user_groups ug
WHERE ug.group_id = 123 -- your group_id to exclude here
AND ug.user_id = u.id
);
See:
Select rows which are not present in other table
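The NOT EXISTS anti-join can be tried on a small data set. This is only a sketch, assuming Python's sqlite3 instead of PostgreSQL, so the regex operator `~` is swapped for `LIKE` and the empty select list for `SELECT 1`. The sample rows and the function name are hypothetical.

```python
import sqlite3

def build_demo_db():
    """In-memory tables with made-up users and memberships."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE user_groups (user_id INTEGER, group_id INTEGER);
        INSERT INTO users VALUES (1, 'alice'), (2, 'alan'), (3, 'bob'), (4, 'albert');
        INSERT INTO user_groups VALUES (1, 123), (4, 456);
    """)
    return conn

def users_not_in_group(conn, group_id, name_pattern):
    """Users matching the name pattern that have no membership row for group_id."""
    sql = """
        SELECT u.id, u.username
        FROM users u
        WHERE u.username LIKE ?
          AND NOT EXISTS (
              SELECT 1
              FROM user_groups ug
              WHERE ug.group_id = ?
                AND ug.user_id = u.id
          )
        ORDER BY u.id
    """
    return conn.execute(sql, (name_pattern, group_id)).fetchall()

conn = build_demo_db()
print(users_not_in_group(conn, 123, "al%"))
```

The membership check runs inside the database per candidate user, so the list of group members never has to be pulled into the application.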

Select a user by their username and then select data from another table using their UID

Sorry if that title is a bit convoluted... I'm spoiled by an ORM usually and my raw SQL skills are really poor, apparently.
I'm writing an application that links to a vBulletin forum. Users authenticate with their forum username, and the query for that is simple (selecting by username from the users table). The next half of it is more complex. There's also a subscriptions table that has a timestamp in it, but the primary key for these is a user id, not a username.
This is what I've worked out so far:
SELECT
forum.user.userid,
forum.user.usergroupid,
forum.user.password,
forum.user.salt,
forum.user.pmunread,
forum.subscriptionlog.expirydate
FROM
forum.user
JOIN forum.subscriptionlog
WHERE
forum.user.username LIKE 'SomeUSER'
Unfortunately this returns the entirety of the subscriptionlog table, which makes sense because there's no username field in it. Is it possible to grab the subscriptionlog row using the userid I get from forum.user.userid, or does this need to be split into two queries?
Thanks!

The issue is that you are blindly joining the two tables. You need to specify what column they are related by.
I think you want something like:
SELECT * FROM user u
INNER JOIN subscriptionlog sl ON u.userid = sl.userid
WHERE u.username LIKE 'SomeUSER'

select * from user u JOIN subscriptionlog s ON u.userid = s.userid where u.username = 'someuser'
The ON clause is the bit you want to add; it ties each user row to its matching subscription rows instead of pairing it with every row of the other table.

Try this:
SELECT
forum.user.userid,
forum.user.usergroupid,
forum.user.password,
forum.user.salt,
forum.user.pmunread,
forum.subscriptionlog.expirydate
FROM
forum.user
INNER JOIN forum.subscriptionlog
ON forum.subscriptionlog.userid = forum.user.userid
WHERE
forum.user.username LIKE 'SomeUSER'
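The effect of the missing join condition is easy to reproduce. A rough sketch, assuming Python's sqlite3 (which, like MySQL, accepts a JOIN without ON and treats it as a cross join); the table name `forum_user`, the sample rows and the function names are invented for the demo.

```python
import sqlite3

def build_demo_db():
    """Two users and three subscription rows, all hypothetical."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE forum_user (userid INTEGER PRIMARY KEY, username TEXT);
        CREATE TABLE subscriptionlog (userid INTEGER, expirydate INTEGER);
        INSERT INTO forum_user VALUES (1, 'SomeUSER'), (2, 'other');
        INSERT INTO subscriptionlog VALUES (1, 1700000000), (2, 1800000000), (2, 1900000000);
    """)
    return conn

def rows_without_on(conn, username):
    """JOIN with no ON clause: the user is paired with every subscription row."""
    sql = """
        SELECT u.userid, sl.expirydate
        FROM forum_user u
        JOIN subscriptionlog sl
        WHERE u.username = ?
        ORDER BY sl.expirydate
    """
    return conn.execute(sql, (username,)).fetchall()

def rows_with_on(conn, username):
    """JOIN ... ON userid: only the user's own subscription rows come back."""
    sql = """
        SELECT u.userid, sl.expirydate
        FROM forum_user u
        JOIN subscriptionlog sl ON sl.userid = u.userid
        WHERE u.username = ?
        ORDER BY sl.expirydate
    """
    return conn.execute(sql, (username,)).fetchall()

conn = build_demo_db()
print(len(rows_without_on(conn, "SomeUSER")), "rows without ON")
print(rows_with_on(conn, "SomeUSER"))
```

So a single query is enough; there is no need to split the lookup into two round trips.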

How can I improve a mostly "degenerate" inner join?

This is Oracle 11g.
I have two tables whose relevant columns are shown below (I have to take the tables as given -- I cannot change the column datatypes):
CREATE TABLE USERS
(
UUID VARCHAR2(36),
DATA VARCHAR2(128),
ENABLED NUMBER(1)
);
CREATE TABLE FEATURES
(
USER_UUID VARCHAR2(36),
FEATURE_TYPE NUMBER(4)
);
The tables express the concept that a user can be assigned a number of features. The (USER_UUID, FEATURE_TYPE) combination is unique.
I have two very similar queries I am interested in. The first one, expressed in English, is "return the UUIDs of enabled users who are assigned feature X". The second one is "return the UUIDs and DATA of enabled users who are assigned feature X". The USERS table has about 5,000 records and the FEATURES table has about 40,000 records.
I originally wrote the first query naively as:
SELECT u.UUID FROM USERS u
JOIN FEATURES f ON f.USER_UUID=u.UUID
WHERE f.FEATURE_TYPE=X and u.ENABLED=1
and that had lousy performance. As an experiment I tried to see what would happen if I didn't care about whether or not a user was enabled and that inspired me to try:
SELECT USER_UUID FROM FEATURES WHERE FEATURE_TYPE=X
and that ran very quickly. That in turn inspired me to try
(SELECT USER_UUID FROM FEATURES WHERE FEATURE_TYPE=X)
INTERSECT
(SELECT UUID FROM USERS WHERE ENABLED=1)
That didn't run as quickly as the second query, but ran much more quickly than the first.
After more thinking I realized that in the case at hand every user or almost every user was assigned at least one feature, which meant that the join condition was always or almost always true, which meant that the inner join completely or mostly degenerated into a cross join. And since 5,000 x 40,000 = 200,000,000 that is not a good thing. Obviously the INTERSECT version would be dealing with many fewer rows which presumably is why it is significantly faster.
Question: Is INTERSECT really the way to go in this case or should I be looking at some other type of join?
I wrote the query for the one that also needs to return DATA similarly to the very first one:
SELECT u.UUID, u.DATA FROM USERS u
JOIN FEATURES f ON f.USER_UUID=u.UUID
WHERE f.FEATURE_TYPE=X and u.ENABLED=1
But it would seem I can't do the INTERSECT trick here because there's no column in FEATURES that matches the DATA column.
Question: How can I rewrite this to avoid the degenerate join problem and perform like the query that doesn't return DATA?

I would intuitively use the EXISTS clause:
SELECT u.UUID
FROM USERS u
WHERE u.ENABLED=1
AND EXISTS (SELECT 1 FROM FEATURES f where f.FEATURE_TYPE=X and f.USER_UUID=u.UUID)
or similarly:
SELECT u.UUID, u.DATA
FROM USERS u
WHERE u.ENABLED=1
AND EXISTS (SELECT 1 FROM FEATURES f where f.FEATURE_TYPE=X and f.USER_UUID=u.UUID)
This way you can select every field from USERS since there is no need for INTERSECT anymore (which was a rather good choice for the 1st case, IMHO).
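To check that the EXISTS form returns the same rows as the original join while still exposing DATA, here is a small sketch. It assumes Python's sqlite3 rather than Oracle 11g, so it says nothing about Oracle's execution plans or timings; the sample rows and function names are hypothetical.

```python
import sqlite3

def build_demo_db():
    """Tiny USERS/FEATURES tables with invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE USERS (UUID TEXT, DATA TEXT, ENABLED INTEGER);
        CREATE TABLE FEATURES (USER_UUID TEXT, FEATURE_TYPE INTEGER,
                               UNIQUE (USER_UUID, FEATURE_TYPE));
        INSERT INTO USERS VALUES ('u1', 'a', 1), ('u2', 'b', 0), ('u3', 'c', 1);
        INSERT INTO FEATURES VALUES ('u1', 7), ('u1', 8), ('u2', 7), ('u3', 8);
    """)
    return conn

def enabled_users_with_feature(conn, feature_type):
    """Semi-join via EXISTS: each enabled user appears at most once."""
    sql = """
        SELECT u.UUID, u.DATA
        FROM USERS u
        WHERE u.ENABLED = 1
          AND EXISTS (
              SELECT 1
              FROM FEATURES f
              WHERE f.FEATURE_TYPE = ?
                AND f.USER_UUID = u.UUID
          )
        ORDER BY u.UUID
    """
    return conn.execute(sql, (feature_type,)).fetchall()

def enabled_users_with_feature_join(conn, feature_type):
    """The original inner join, kept for comparison."""
    sql = """
        SELECT u.UUID, u.DATA
        FROM USERS u
        JOIN FEATURES f ON f.USER_UUID = u.UUID
        WHERE f.FEATURE_TYPE = ? AND u.ENABLED = 1
        ORDER BY u.UUID
    """
    return conn.execute(sql, (feature_type,)).fetchall()

conn = build_demo_db()
print(enabled_users_with_feature(conn, 7))
print(enabled_users_with_feature_join(conn, 7))
```

Because (USER_UUID, FEATURE_TYPE) is unique, the two forms agree row for row; which one is faster on a real Oracle instance is something to confirm from the execution plan.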

Returning customized results with SQL Server

This is a little complicated so I'm going to break it down. I'm trying to get results but couldn't figure out what the query is going to look like. The premise is this: a user has purchased a specific set of items, their gear. When they go to my site, they see kits or setups that users have submitted. I want to show them only setups that consist entirely of gear they've purchased. They don't need to see setups with gear that they do not have. I hope this makes sense. Here's what my tables look like:
[Gear]
gearID is the unique key
has a list of all the gear (mics, heads, effects) with unique id's for each
[Kits]
kitID is the unique key
has a list of all the user submitted kits
[KitGearLink]
This table connects the [Kits] and [Gear] with each other. So this is the table that lists out what gear the user submitted kit has.
[Users]
userID is the unique key
list of all the users
[UserGear]
Links the [users] and [gear] table together. This is what stores what the user's personal gear consists of.
So how do I pull up records for each user that will show them all the kits that will work with the gear they have? If a kit has something the user doesn't own, then it shouldn't be shown to them. Any ideas?
Thanks guys!

Perhaps something like this:
SELECT *
FROM Kits k
WHERE k.KitID NOT IN (
    SELECT DISTINCT kg.KitID
    FROM KitGearLink kg
    LEFT JOIN (
        SELECT ug0.GearID
        FROM UserGear ug0
        WHERE ug0.UserID = @userParam
    ) ug ON kg.GearID = ug.GearID
    WHERE ug.GearID IS NULL  -- unmatched rows; a test with "= null" is never true
)
For a given user id, the subquery returns kits which fail the gear id join between user and kit (kits with a bit of gear the user doesn't have). This is used to filter the list of kits in the system.
Edit: Introduced second sub-query to filter user gear by user id parameter before the left join occurs, see comments.
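The "exclude kits with any missing gear" idea can be sketched as follows, assuming Python's sqlite3 in place of SQL Server. The sample rows and the function name are made up, and the user filter is placed in the join condition rather than in a derived table, which has the same effect. Note that the unmatched rows have to be detected with IS NULL, since a comparison with `= NULL` never evaluates to true.

```python
import sqlite3

def build_demo_db():
    """Three kits and two users with invented gear ids."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Kits (KitID INTEGER PRIMARY KEY);
        CREATE TABLE KitGearLink (KitID INTEGER, GearID INTEGER);
        CREATE TABLE UserGear (UserID INTEGER, GearID INTEGER);
        INSERT INTO Kits VALUES (1), (2), (3);
        INSERT INTO KitGearLink VALUES (1, 10), (1, 11), (2, 10), (2, 12), (3, 11);
        INSERT INTO UserGear VALUES (1, 10), (1, 11), (2, 12), (2, 10);
    """)
    return conn

def kits_user_can_build(conn, user_id):
    """Kits for which no linked gear item is missing from the user's gear."""
    sql = """
        SELECT k.KitID
        FROM Kits k
        WHERE k.KitID NOT IN (
            SELECT kg.KitID
            FROM KitGearLink kg
            LEFT JOIN UserGear ug
              ON ug.GearID = kg.GearID
             AND ug.UserID = ?
            WHERE ug.GearID IS NULL   -- gear in the kit that the user lacks
        )
        ORDER BY k.KitID
    """
    return [row[0] for row in conn.execute(sql, (user_id,))]

conn = build_demo_db()
# NULL = NULL yields NULL (None in Python), not true.
null_equals_null = conn.execute("SELECT NULL = NULL").fetchone()[0]
print(null_equals_null)
print(kits_user_can_build(conn, 1))
```

A kit with no gear rows at all would be returned for every user under this query; whether that is wanted depends on the data.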

Ok, you want a list of kits where all the gear in the kit has at some point been associated with a specific user. Best bet is to go through an SP and use a memory table. This assumes you are asking for kits for a specific user. You could get a complete list for all users, but that is more costly and complex.
DECLARE @UserId as int
SELECT @UserId = YOURUSER
DECLARE @tblUserKits TABLE (
    UserId int null,
    KitId int null,
    GearId int null
)
-- GET ALL KITS THAT A USER HAS AT LEAST ONE PART OF
INSERT INTO @tblUserKits
select distinct u.userid, k.kitid, k.gearid
from usergear u
inner join Kitgearlink k on u.gearid = k.gearid
where u.Userid = @UserId
-- DELETE ALL KITS THAT THE USER IS MISSING A PART FOR
DELETE FROM @tblUserKits WHERE KitId in
    (select distinct t.kitid
     from @tblUserKits t
     inner join kitgearlink k on t.kitid = k.kitid
     -- match only this user's gear; otherwise gear owned by any user would count
     left outer join usergear u on k.gearid = u.gearid and u.userid = @UserId
     WHERE u.gearid is null)
-- Finally return a list of kits
Select distinct KitId from @tblUserKits

SQL Nearest Neighbor Query (Movie Recommendation Algorithm)

Need help making this (sort of) working query more dynamic.
I have three tables myShows, TVShows and Users
myShows
ID (PK)
User (FK to Users)
Show (FK to TVShows)
Would like to take this query and change it to a stored procedure that I can send a User ID into and have it do the rest...
SELECT showId, name, Count(1) AS no_users
FROM
myShows LEFT OUTER JOIN
tvshows ON myShows.Show = tvshows.ShowId
WHERE
[user] IN (
SELECT [user]
FROM
myShows
WHERE
show ='1' or show='4'
)
AND
show <> '1' and show <> '4'
GROUP BY
showId, name
ORDER BY
no_users DESC
This works right now. But as you can see, the problem lies in the WHERE (show ='1' or show='4') and the AND (show <> '1' and show <> '4') conditions, which currently use hard-coded values. That's what I need to be dynamic, since I have no idea whether the user has 3 or 30 shows I need to check against.
Also, how inefficient is this process? This will be used for an iPad application that might get a lot of users. I currently run a movie API (IMDbAPI.com) that gets about 130k hits an hour and had to do a lot of database/code optimization to make it run fast. Thanks again!
If you want the database schema for testing let me know.

This will meet your requirements
select name, count(distinct [user]) from myshows recommend
inner join tvshows on recommend.show = tvshows.showid
where [user] in
(
    -- other users who share at least one show with @user
    select other.[user] from
    ( select show from myshows where [User] = @user ) my,
    ( select show, [user] from myshows where [user] <> @user ) other
    where my.show = other.show
)
-- leave out shows @user already has
and show not in ( select show from myshows where [User] = @user )
group by name
order by count(distinct [user]) desc
If your SQL platform supports WITH Common Table Expressions, the above can be optimized to use them.
Will it be efficient as the data sizes increase? No.
Will it be effective? No. If just one user shares a show with your selected user, and they watch a popular show, then that popular show will rise to the top of the ranking.
I'd recommend
a) reviewing your thinking of what recommends a show
b) periodically calculating the results rather than performing it on demand.
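For experimenting with the ranking behaviour described above, here is a parameterised sketch of the same query shape. It assumes Python's sqlite3 instead of SQL Server, uses the hypothetical column names `user_id` and `show_id` in place of `[user]` and `show`, and runs on invented sample rows.

```python
import sqlite3

def build_demo_db():
    """Four shows and four users; user 4 shares nothing with user 1."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE tvshows (showId INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE myShows (user_id INTEGER, show_id INTEGER);
        INSERT INTO tvshows VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D');
        INSERT INTO myShows VALUES (1, 1), (1, 4),
                                   (2, 1), (2, 2), (2, 3),
                                   (3, 4), (3, 2),
                                   (4, 3);
    """)
    return conn

def recommend_shows(conn, user_id):
    """Rank shows the user lacks by how many overlapping users have them."""
    sql = """
        SELECT t.name, COUNT(DISTINCT rec.user_id) AS no_users
        FROM myShows rec
        JOIN tvshows t ON t.showId = rec.show_id
        WHERE rec.user_id IN (
                -- users sharing at least one show with the given user
                SELECT other.user_id
                FROM myShows mine
                JOIN myShows other ON other.show_id = mine.show_id
                WHERE mine.user_id = ? AND other.user_id <> ?
              )
          AND rec.show_id NOT IN (
                SELECT show_id FROM myShows WHERE user_id = ?
              )
        GROUP BY t.name
        ORDER BY no_users DESC, t.name
    """
    return conn.execute(sql, (user_id, user_id, user_id)).fetchall()

conn = build_demo_db()
print(recommend_shows(conn, 1))
```

Since the user id is the only parameter, the number of shows a user has no longer matters to the caller. The weighting concern still applies: every overlapping user counts equally, however small the overlap.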
