Finding entries with all relations in a relational database table - sql

I am making a relational database using tags. The database has three tables:
object
match
tag
where match is a simple relation between an object and a tag (i.e. each entry consists of a primary key and two foreign keys). I want to structure a query where I can find all objects with all given tags, but am uncertain how to do it.
For instance, these are the three tables:
Object
Death becomes her
Billy Madison
Tag
Comedy
Horror
Match
1 | 1
1 | 2
2 | 1
Given that someone wants a horror-comedy, how do I structure the query to find only the objects that match all of the given tags? I realize this is elementary, but I genuinely haven't found any answers. If the whole schema is off, feel free to point that out.
For the record I'm using Python, SQLAlchemy, and SQLite. Currently I've made a list of all tag IDs to find in Match.
Edit: For any future reference, I used astentx' solution with a slight modification to the query in order to access data from object right away:
select object.Length, object.title
from object
join match
on object.id = match.object
join tag
on match.tag = tag.id
join filter_tags
on tag.name = filter_tags.word

You can pass all your tags as an array and use the carray() function, or pass them as a comma-separated string and transform it into a table.
Then, for an AND condition, select the objects whose number of matching tags equals the number of tags you passed in:
select relation.obj_id
from relation
join tags
on relation.tag_id = tags.id
join <generated table>
on tags.value = <generated table>.value
group by relation.obj_id
having count(1) = (select count(1) from <generated table>)
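A minimal sketch of the count-based approach above, using Python's built-in sqlite3 module and the sample rows from the question. It swaps the generated table for a parameterised IN list, which is an assumption made for brevity; the function name objects_with_all_tags and the UNIQUE constraints are hypothetical additions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE object (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
    CREATE TABLE "match" (
        id INTEGER PRIMARY KEY,
        object INTEGER REFERENCES object(id),
        tag INTEGER REFERENCES tag(id),
        UNIQUE (object, tag)
    );
    INSERT INTO object (id, title) VALUES (1, 'Death becomes her'), (2, 'Billy Madison');
    INSERT INTO tag (id, name) VALUES (1, 'Comedy'), (2, 'Horror');
    INSERT INTO "match" (object, tag) VALUES (1, 1), (1, 2), (2, 1);
""")


def objects_with_all_tags(conn, tag_names):
    """Return titles of objects that carry every tag in tag_names."""
    wanted = sorted(set(tag_names))  # de-duplicate so the count is meaningful
    if not wanted:
        return []
    placeholders = ", ".join("?" for _ in wanted)
    sql = f"""
        SELECT object.title
        FROM object
        JOIN "match" ON object.id = "match".object
        JOIN tag ON "match".tag = tag.id
        WHERE tag.name IN ({placeholders})
        GROUP BY object.id
        HAVING COUNT(DISTINCT tag.id) = ?
        ORDER BY object.id
    """
    return [row[0] for row in conn.execute(sql, (*wanted, len(wanted)))]


horror_comedies = objects_with_all_tags(conn, ["Comedy", "Horror"])
print(horror_comedies)
```

The HAVING clause is what turns the IN filter (any of the tags) into an AND condition (all of the tags): an object survives only if it matched as many distinct tags as were requested.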

Related

Updating column values in a table based on join with another table?

I have two tables called resource and resource_owners.
The resource_owners table contains two columns called resource_id and owner_id.
resource_id | owner_id |
-------------+-----------
The resource table contains two relevant columns: parentresource_id and id.
parentresource_id | id |
-------------------+------
resource_owners.resource_id, resource.id and resource.parentresource_id are all join columns between the two tables. Now what I want to do is the following:
For every row in the resource table, take the value in id, match it with a corresponding resource_owners.resource_id, retrieve the corresponding resource_owners.owner_id value (call it $owner_value), then set resource_owners.owner_id to $owner_value where resource_owners.resource_id equals resource.parentresource_id.
In conversational terms, this is what I want to do: For each resource, I want to re-assign the parent-resource's owner_id to be the resource's owner_id.
I've tried to wrap my head around this problem and it looks like I'll need two different table joins (resource.id with resource_owners.resource_id and resource.parentresource_id with resource_owners.resource_id).
Can someone point me in the right direction? Is what I want even possible with a single query? I'm okay with a PostgreSQL script as well if that works better for my use case.
I'm not sure what database you are using, but you should be able to accomplish this using the logic below, if I understood your question correctly:
UPDATE RESOURCE_OWNERS SET
    OWNER_ID = UP.OWNER_ID
FROM (SELECT rc.ID, TMP.OWNER_ID
      FROM (SELECT RSC.ID, ROWRS.OWNER_ID, ROWRS.RESOURCE_ID
            FROM RESOURCE RSC
            JOIN RESOURCE_OWNERS ROWRS ON RSC.ID = ROWRS.RESOURCE_ID) TMP
      JOIN RESOURCE rc ON rc.PARENTRESOURCE_ID = TMP.RESOURCE_ID) UP
WHERE RESOURCE_OWNERS.RESOURCE_ID = UP.ID;
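The query above appears to copy the parent's owner onto each child, whereas the question's step-by-step description reads as the opposite direction (copy a resource's owner onto its parent's row). A minimal sketch of that second reading, assuming SQLite, a single level of parent/child nesting, and made-up sample rows; when a parent has several children the ORDER BY child.id LIMIT 1 tie-break is an arbitrary assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE resource (id INTEGER PRIMARY KEY, parentresource_id INTEGER);
    CREATE TABLE resource_owners (resource_id INTEGER PRIMARY KEY, owner_id INTEGER);
    -- Resource 1 is the parent of resource 2; resource 3 has no parent or child.
    INSERT INTO resource (id, parentresource_id) VALUES (1, NULL), (2, 1), (3, NULL);
    INSERT INTO resource_owners (resource_id, owner_id) VALUES (1, 10), (2, 20), (3, 30);
""")

# For every owner row whose resource has at least one child with an owner,
# overwrite owner_id with that child's owner_id.
conn.execute("""
    UPDATE resource_owners
    SET owner_id = (
        SELECT child_owner.owner_id
        FROM resource AS child
        JOIN resource_owners AS child_owner ON child_owner.resource_id = child.id
        WHERE child.parentresource_id = resource_owners.resource_id
        ORDER BY child.id
        LIMIT 1
    )
    WHERE EXISTS (
        SELECT 1
        FROM resource AS child
        JOIN resource_owners AS child_owner ON child_owner.resource_id = child.id
        WHERE child.parentresource_id = resource_owners.resource_id
    )
""")

owners = dict(conn.execute("SELECT resource_id, owner_id FROM resource_owners"))
print(owners)
```

The correlated subquery form is used here because it does not depend on UPDATE ... FROM support; in PostgreSQL the same idea can be written with UPDATE ... FROM as in the answer.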

SQL difference between IN and JOIN

First I need to say that it is safe to assume that I have no formal education in SQL although I have education in relational algebra.
I am investigating what would be the best approach to the following problem.
Our database is holding texts and keywords for every text.
Articles
id | text
Keywords
id | word
Articles_keywords
id_article | id_keyword
For the sake of this question the provider of answer can assume that tables are indexed however one wants.
So the problem is getting all articles that have a specific keyword.
I have talked with 2 groups of people that solve this in 2 ways, and they both claim that the approach of other group is wrong.
First solution using the IN operator:
SELECT * FROM Articles AS a WHERE a.id IN
(SELECT id_article FROM Articles_Keywords AS ak WHERE ak.id_keyword IN
(SELECT id FROM keywords AS k WHERE k.word = 'xyz'));
Other solution is using JOIN operator of course:
SELECT * FROM Articles as a
JOIN Articles_Keywords as ak
ON a.id = ak.id_article
JOIN Keywords as k
ON k.id = ak.id_keyword
WHERE k.word = 'xyz';
Which approach is better and, above all, why?
Edit
In articles table we have an id column being unique and, just for the sake of this question we could assume that there are no duplicate texts.
The same thing goes for the keywords table.
In article_keywords table the ordered pair (id_article,id_keyword) is unique
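Under the uniqueness constraints stated in the edit, both queries should return the same set of articles. A minimal sketch for checking that on a toy dataset and for looking at the plan the engine picks, assuming SQLite and invented sample rows (the plan output will differ on other engines and versions):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Articles (id INTEGER PRIMARY KEY, text TEXT);
    CREATE TABLE Keywords (id INTEGER PRIMARY KEY, word TEXT UNIQUE);
    CREATE TABLE Articles_Keywords (
        id_article INTEGER REFERENCES Articles(id),
        id_keyword INTEGER REFERENCES Keywords(id),
        PRIMARY KEY (id_article, id_keyword)
    );
    INSERT INTO Articles VALUES (1, 'first'), (2, 'second'), (3, 'third');
    INSERT INTO Keywords VALUES (1, 'xyz'), (2, 'abc');
    INSERT INTO Articles_Keywords VALUES (1, 1), (1, 2), (2, 2), (3, 1);
""")

in_sql = """
    SELECT a.id FROM Articles AS a WHERE a.id IN
        (SELECT id_article FROM Articles_Keywords AS ak WHERE ak.id_keyword IN
            (SELECT id FROM Keywords AS k WHERE k.word = ?))
    ORDER BY a.id
"""
join_sql = """
    SELECT a.id FROM Articles AS a
    JOIN Articles_Keywords AS ak ON a.id = ak.id_article
    JOIN Keywords AS k ON k.id = ak.id_keyword
    WHERE k.word = ?
    ORDER BY a.id
"""

in_ids = [row[0] for row in conn.execute(in_sql, ("xyz",))]
join_ids = [row[0] for row in conn.execute(join_sql, ("xyz",))]
print(in_ids, join_ids)

# The last column of each EXPLAIN QUERY PLAN row is a human-readable step.
for label, sql in (("IN", in_sql), ("JOIN", join_sql)):
    for step in conn.execute("EXPLAIN QUERY PLAN " + sql, ("xyz",)):
        print(label, step[-1])
```

Note that the original JOIN query selects * across all three tables, so it returns extra columns; the sketch compares article ids only. If the (id_article, id_keyword) pair or the keyword were not unique, the JOIN form could return duplicate articles where the IN form would not.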

SQL Cross-Table Referencing

Okay, so I've got two tables. One table (table 1) contains a column Books_Owned_ID which stores a series of numbers in the form of 1,3,7. I have another table (table 2) which stores the Book names in one column and the book ID in another column.
What I want to do is write an SQL query which will take the numbers from Books_Owned_ID and display the names of those books in a new column. Like so:
|New Column |
Book 1 Name
Book 2 Name
Book 3 Name
I can't wrap my head around this, it's simple enough but all the threads I look on get really confusing.
Table1 contains the following columns:
|First_Name| Last_Name| Books_Owned_ID |
Table2 contains the following columns:
|Book_Name|Book_ID|
You need to do an inner join:
SELECT Book_Name FROM Table2
INNER JOIN Table1
ON Table1.Books_Owned_ID = Table2.Book_ID
EDIT SQL Fiddle
I will work on getting the comma-separated column split working. It won't take a lot extra for this.
EDIT 2 See this answer to build a function to split your string. Then you can do this:
SELECT Book_Name FROM Table2
WHERE Book_ID IN(SELECT FN_ListToTable(',',Table1.Books_Owned_ID) FROM Table1 s)
The core of this centers around data normalisation... Each fact is stored only once (and so is "authoritative"). You should also get into the habit of only storing a single fact in any field.
So, imagine the following table layouts...
Books
Id, Name, Description
Users
Id, Username, EmailAddress, PasswordHash, etc....
BooksOwned
UserId, BookId
So if a single user owns multiple books, there will be multiple entries in the BooksOwned table...
UserId, BookID
1, 1
1, 2
1, 3
Indicates that User 1 owns books 1 through 3.
The reason to do it this way is that it makes it much easier to query in future. You also treat BookId as an Integer instead of a string containing a list - so you don't need to worry about string manipulation to do your query.
The following would return the name of all books owned by the user with Id = 1
SELECT Books.Name
FROM BooksOwned
INNER JOIN Books
ON BooksOwned.BookId = Books.Id
WHERE BooksOwned.UserId = 1
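A minimal sketch of this layout, assuming SQLite and invented sample data. It also shows one way the legacy comma-separated values might be split once, in application code, to populate BooksOwned; the legacy dictionary and the usernames are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Books (Id INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Users (Id INTEGER PRIMARY KEY, Username TEXT);
    CREATE TABLE BooksOwned (
        UserId INTEGER REFERENCES Users(Id),
        BookId INTEGER REFERENCES Books(Id),
        PRIMARY KEY (UserId, BookId)
    );
    INSERT INTO Books VALUES
        (1, 'Book 1 Name'), (2, 'Book 2 Name'), (3, 'Book 3 Name'), (4, 'Book 4 Name');
    INSERT INTO Users VALUES (1, 'alice'), (2, 'bob');
""")

# Hypothetical legacy data: user id -> comma-separated Books_Owned_ID string.
legacy = {1: "1,2,3", 2: "4"}

# One row per (user, book) pair, with BookId stored as an integer.
rows = [
    (user_id, int(book_id))
    for user_id, id_list in legacy.items()
    for book_id in id_list.split(",")
]
conn.executemany("INSERT INTO BooksOwned (UserId, BookId) VALUES (?, ?)", rows)

names = [
    row[0]
    for row in conn.execute(
        """
        SELECT Books.Name
        FROM BooksOwned
        INNER JOIN Books ON BooksOwned.BookId = Books.Id
        WHERE BooksOwned.UserId = ?
        ORDER BY Books.Id
        """,
        (1,),
    )
]
print(names)
```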
You need a function which takes a comma separated list and returns a table. This is slow and fundamentally a bad idea. Really all this does is convert this way of doing it to be like the data model I describe below. (see ProfessionalAmateur's answer for an example of this).
If you are just starting change your data model. Make a linking table. Like this:
Person Table
|First_Name| Last_Name| Person_ID |
Book Table
|Book_Name|Book_ID|
PersonBook Table
|PersonID|BookID|
This table can have more than one row for each person.

"Tag" searching/exclusion query design issue

Background: I'm working on a homebrew project for managing a collection of my own images, and have been trying to implement a tag-based search so I can easily sift through them.
Right now, I'm working with RedBean's tagging API for applying tags to each image's database entry. However, I'm stuck on a specific detail of my implementation: currently, to allow searches where multiple tags refine the result (when searching for "ABC XYZ", a tagged image must have both tags "ABC" and "XYZ"), I have to handle some of the processing in the server-side language rather than in SQL, and then run an (optional) second query to verify that no returned image has a tag that was explicitly excluded from the results (when searching for "ABC -XYZ", a tagged image must have tag "ABC" and not "XYZ").
The problem here is that my current method requires that I run all results by my server-side code, and makes any attempts at sensible pagination/result offsets inaccurate.
My goal is to just grab the rows of the post table that contain the requested tags (and not contain any excluded tags) with one query, and still be able to use LIMIT/OFFSET arguments for my query to obtain reasonably paginated results.
The table schemas are as follows:
Table "post"
Columns:
id (PRIMARY KEY for post table)
(image metadata, not relevant to tag search)
Table "tag"
Columns:
id (PRIMARY KEY for tag table)
title (string of varying length - assume varchar(255))
Table "post_tag"
Columns:
id (PRIMARY KEY for post_tag table)
post_id (associated with column "post.id")
tag_id (associated with column "tag.id")
If possible, I'd like to also be able to have WHERE conditions specific to post table columns as well.
What should I be using for query structure? I've been playing with left joins but haven't been able to get the exact structure I need to solve this.
Here is the basic idea:
The LEFT OUTER JOIN is the set of posts that match tags you want to exclude. The final WHERE clause in the query keeps only the posts from the first post table that have no row in this set.
The INNER JOIN is the set of posts that match all of the tags. Note that the number 2 in the HAVING clause must equal the number of unique tag names that you provide in the IN clause.
select p.*
from post p
left outer join (
    select pt.post_id
    from post_tag pt
    inner join tag t on pt.tag_id = t.id
    where t.title in ('UVW', 'XYZ')
) notag on p.id = notag.post_id
inner join (
    select pt.post_id
    from post_tag pt
    inner join tag t on pt.tag_id = t.id
    where t.title in ('ABC', 'DEF')
    group by pt.post_id
    having count(distinct t.title) = 2
) yestag on p.id = yestag.post_id
where notag.post_id is null
--add additional WHERE filters here as needed
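A minimal sketch of the query above with the tag lists and LIMIT/OFFSET passed as parameters, assuming SQLite and invented sample rows. The function name search_posts is hypothetical, and the exclusion join is only added when there is something to exclude.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE post (id INTEGER PRIMARY KEY);
    CREATE TABLE tag (id INTEGER PRIMARY KEY, title VARCHAR(255) UNIQUE);
    CREATE TABLE post_tag (
        id INTEGER PRIMARY KEY,
        post_id INTEGER REFERENCES post(id),
        tag_id INTEGER REFERENCES tag(id)
    );
    INSERT INTO post (id) VALUES (1), (2), (3), (4);
    INSERT INTO tag (id, title) VALUES (1, 'ABC'), (2, 'DEF'), (3, 'XYZ');
    INSERT INTO post_tag (post_id, tag_id) VALUES
        (1, 1), (1, 2),          -- post 1: ABC, DEF
        (2, 1), (2, 2), (2, 3),  -- post 2: ABC, DEF, XYZ
        (3, 1),                  -- post 3: ABC
        (4, 1), (4, 2);          -- post 4: ABC, DEF
""")


def search_posts(conn, include, exclude=(), limit=20, offset=0):
    """Ids of posts having every tag in include and no tag in exclude."""
    include = sorted(set(include))
    exclude = sorted(set(exclude))
    if not include:
        return []
    inc_marks = ", ".join("?" for _ in include)
    sql = f"""
        SELECT p.id
        FROM post p
        INNER JOIN (
            SELECT pt.post_id
            FROM post_tag pt
            INNER JOIN tag t ON pt.tag_id = t.id
            WHERE t.title IN ({inc_marks})
            GROUP BY pt.post_id
            HAVING COUNT(DISTINCT t.title) = ?
        ) yestag ON p.id = yestag.post_id
    """
    params = [*include, len(include)]
    if exclude:
        exc_marks = ", ".join("?" for _ in exclude)
        sql += f"""
        LEFT OUTER JOIN (
            SELECT DISTINCT pt.post_id
            FROM post_tag pt
            INNER JOIN tag t ON pt.tag_id = t.id
            WHERE t.title IN ({exc_marks})
        ) notag ON p.id = notag.post_id
        WHERE notag.post_id IS NULL
        """
        params += exclude
    # A stable ORDER BY keeps LIMIT/OFFSET pagination deterministic.
    sql += " ORDER BY p.id LIMIT ? OFFSET ?"
    params += [limit, offset]
    return [row[0] for row in conn.execute(sql, params)]


first_page = search_posts(conn, ["ABC", "DEF"], ["XYZ"], limit=1, offset=0)
second_page = search_posts(conn, ["ABC", "DEF"], ["XYZ"], limit=1, offset=1)
print(first_page, second_page)
```

Because the filtering happens entirely in SQL, LIMIT and OFFSET apply to the final result set, which is what the question needs for accurate pagination.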

How do I display all the tags related to all the feedbacks in one query

I am trying to write a sql query which fetches all the tags related to every topic being displayed on the page.
like this
TITLE: feedback1
POSTED BY: User1
CATEGORY: category1
TAGS: tag1, tag2, tag3
TITLE: feedback2
POSTED BY: User2
CATEGORY: category2
TAGS: tag2, tag5, tag7,tag8
TITLE: feedback3
POSTED BY: User3
CATEGORY: category3
TAGS: tag1, tag5, tag6, tag3
The relationship of tags to topics is many to many.
Right now I am first fetching all the topics from the "topics" table and to fetch the related tags of every topic I loop over the returned topics array for fetching tags.
But this method is very expensive in terms of speed and not efficient too.
Please help me write this sql query.
Query for fetching all the topics and its information is as follows:
SELECT
tbl_feedbacks.pk_feedbackid as feedbackId,
tbl_feedbacks.type as feedbackType,
DATE_FORMAT(tbl_feedbacks.createdon,'%M %D, %Y') as postedOn,
tbl_feedbacks.description as description,
tbl_feedbacks.upvotecount as upvotecount,
tbl_feedbacks.downvotecount as downvotecount,
(tbl_feedbacks.upvotecount)-(tbl_feedbacks.downvotecount) as totalvotecount,
tbl_feedbacks.viewcount as viewcount,
tbl_feedbacks.title as feedbackTitle,
tbl_users.email as userEmail,
tbl_users.name as postedBy,
tbl_categories.pk_categoryid as categoryId,
tbl_clients.pk_clientid as clientId
FROM
tbl_feedbacks
LEFT JOIN tbl_users
ON ( tbl_users.pk_userid = tbl_feedbacks.fk_tbl_users_userid )
LEFT JOIN tbl_categories
ON ( tbl_categories.pk_categoryid = tbl_feedbacks.fk_tbl_categories_categoryid )
LEFT JOIN tbl_clients
ON ( tbl_clients.pk_clientid = tbl_feedbacks.fk_tbl_clients_clientid )
WHERE
tbl_clients.pk_clientid = '1'
What is the best practice that should be followed in such cases when you need to display all the tags related to every topic being displayed on a single page.
How do I alter the above sql query, so that all the tags plus related information of topics is fetched using a single query.
What I am trying to achieve is similar to the 'questions' page of Stack Overflow.
All the information (tags + information of every topic being displayed) is properly displayed.
Thanks
To do this, I would have three tables:
Topics
topic_id
[whatever else you need to know for a topic]
Tags
tag_id
[etc]
Map
topic_id
tag_id
select t.[whatever], tag.[whatever]
from topics t
join map m on t.topic_id = m.topic_id
join tags tag on tag.tag_id = m.tag_id
where [conditionals]
Set up partitions and/or indexes on the map table to maximize the speed of your query. For example, if you have many more topics than tags, partition the table on topics. Then, each time you grab all the tags for a topic, it will be 1 read from 1 area, no seeking needed. Make sure to have both topics and tags indexed on their _id.
Use your 'explain plan' tool. (I am not familiar with mysql, but I assume there is some tool that can tell you how a query will be run, so you can optimize it)
EDIT:
So you have the following tables:
tbl_feedbacks
tbl_users
tbl_categories
tbl_clients
tbl_tags
tbl_topics
tbl_topics_tags
The query you provide as a starting point shows how feedback, users, categories and clients relate to each other.
I assume that tbl_topics_tags contains FKs to tags and topics, showing which topic has which tag. Is this correct?
What of (feedbacks, users, categories, and clients) has a FK to topics or tags? Or, do either topics or tags have a FK to any of the initial 4?
Once I know this, I'll be able to show how to modify the query.
EDIT #2
There are two different ways to go about this:
The easy way is to just join on your FK. This will give you one row for each tag. It is much easier and more flexible to put together the SQL to do it this way. If you are using some other language to take the results of the query and translate them to present them to the user, this method is better. If nothing else, it will be far more obvious what is going on, and will be easier to debug and maintain.
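A minimal sketch of this first approach: one joined query that returns a row per (feedback, tag) pair, folded into one entry per feedback in application code. It assumes SQLite, a cut-down version of the tables, and the tbl_tag / tbl_feedbacks_tags names used later in this answer; the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_feedbacks (pk_feedbackid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tbl_tag (tag_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE tbl_feedbacks_tags (
        fk_feedbackid INTEGER REFERENCES tbl_feedbacks(pk_feedbackid),
        tag_id INTEGER REFERENCES tbl_tag(tag_id),
        PRIMARY KEY (fk_feedbackid, tag_id)
    );
    INSERT INTO tbl_feedbacks VALUES (1, 'feedback1'), (2, 'feedback2'), (3, 'feedback3');
    INSERT INTO tbl_tag VALUES (1, 'tag1'), (2, 'tag2'), (3, 'tag3'), (5, 'tag5');
    INSERT INTO tbl_feedbacks_tags VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 5);
""")

# LEFT JOINs keep feedbacks that have no tags at all (tag comes back as NULL).
rows = conn.execute("""
    SELECT f.pk_feedbackid, f.title, t.tag
    FROM tbl_feedbacks f
    LEFT JOIN tbl_feedbacks_tags ft ON ft.fk_feedbackid = f.pk_feedbackid
    LEFT JOIN tbl_tag t ON t.tag_id = ft.tag_id
    ORDER BY f.pk_feedbackid, t.tag_id
""")

# Fold the one-row-per-tag result into one entry per feedback.
feedbacks = {}
for feedback_id, title, tag in rows:
    entry = feedbacks.setdefault(feedback_id, {"title": title, "tags": []})
    if tag is not None:
        entry["tags"].append(tag)

for entry in feedbacks.values():
    print("TITLE:", entry["title"], "TAGS:", ", ".join(entry["tags"]))
```

This replaces the per-topic loop of queries described in the question with a single round trip to the database.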
However, you may want each row of the query results to contain one feedback (and the tags that go with it).
SQL joining question <- this is a question I posted on how to do this. The answer I accepted is an oracle-only answer AFAIK, but there are other non-oracle answers.
Adapting Kevin's answer (which is supposed to work in SQL92 compliant systems):
select
    [other stuff: same as in your post],
    (select tag
     from tbl_tag tt
     join tbl_feedbacks_tags tft on tft.tag_id = tt.tag_id
     where tft.fk_feedbackid = tbl_feedbacks.pk_feedbackid
     order by tt.tag_id
     limit 1
     offset 0) as tag1,
    (select tag
     from tbl_tag tt
     join tbl_feedbacks_tags tft on tft.tag_id = tt.tag_id
     where tft.fk_feedbackid = tbl_feedbacks.pk_feedbackid
     order by tt.tag_id
     limit 1
     offset 1) as tag2,
    (select tag
     from tbl_tag tt
     join tbl_feedbacks_tags tft on tft.tag_id = tt.tag_id
     where tft.fk_feedbackid = tbl_feedbacks.pk_feedbackid
     order by tt.tag_id
     limit 1
     offset 2) as tag3
from [same as in the OP]
This should do the trick.
Notes:
This will pull the first three tags. AFAIK, there isn't a way to have an arbitrary number of tags. You can expand the number of tags shown by copying and pasting more of those parts of the query. Make sure to increase the offset setting.
If this does not work, you'll probably have to write up another question, focusing on how to do the pivot in mysql. I've never used mysql, so I'm only guessing that this will work based on what others have told me.
One tip: you'll usually get more attention to your question if you strip away all the extra details. In the question I linked to above, I was really joining between 4 or 5 different tables, with many different fields. But I stripped it down to just the part I didn't know (how to get oracle to aggregate my results into one row). I know some stuff, but you can usually do far better than just one person if you trim your question down to the essentials.
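For an arbitrary number of tags per row, one option not covered in the answer above is a string aggregate. GROUP_CONCAT exists in both MySQL and SQLite, although the separator syntax differs: SQLite takes it as a second argument, MySQL uses GROUP_CONCAT(tag SEPARATOR ', '). A minimal sketch, assuming SQLite, the same cut-down hypothetical tables (tbl_tag, tbl_feedbacks_tags) and invented rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE tbl_feedbacks (pk_feedbackid INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tbl_tag (tag_id INTEGER PRIMARY KEY, tag TEXT);
    CREATE TABLE tbl_feedbacks_tags (
        fk_feedbackid INTEGER REFERENCES tbl_feedbacks(pk_feedbackid),
        tag_id INTEGER REFERENCES tbl_tag(tag_id),
        PRIMARY KEY (fk_feedbackid, tag_id)
    );
    INSERT INTO tbl_feedbacks VALUES (1, 'feedback1'), (2, 'feedback2'), (3, 'feedback3');
    INSERT INTO tbl_tag VALUES (1, 'tag1'), (2, 'tag2'), (3, 'tag3'), (5, 'tag5');
    INSERT INTO tbl_feedbacks_tags VALUES (1, 1), (1, 2), (1, 3), (2, 2), (2, 5);
""")

# One row per feedback; all of its tags are collapsed into a single string.
sql = """
    SELECT f.title, GROUP_CONCAT(t.tag, ', ') AS tags
    FROM tbl_feedbacks f
    LEFT JOIN tbl_feedbacks_tags ft ON ft.fk_feedbackid = f.pk_feedbackid
    LEFT JOIN tbl_tag t ON t.tag_id = ft.tag_id
    GROUP BY f.pk_feedbackid
    ORDER BY f.pk_feedbackid
"""

# GROUP_CONCAT yields NULL for a feedback with no tags; the order of the
# concatenated values is not guaranteed here, so compare them as sets.
tags_by_title = {
    title: set(tags.split(", ")) if tags else set()
    for title, tags in conn.execute(sql)
}
print(tags_by_title)
```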