Eliminate subquery in the FROM clause - sql

The tagging table has 3 columns: id (the primary key), tag, and resource.
I want to select the tags that are associated with at least 3 resources. A resource can be associated several times with the same tag, so a single GROUP BY is not enough.
My current SQL query is the following:
SELECT tag FROM
(SELECT resource, tag FROM tagging GROUP BY resource, tag) AS tagging
GROUP BY tag HAVING count(*) > 2;
I need to convert this query to HQL, and HQL does not accept subqueries inside the FROM clause.
Is there a (fast) way to do the same thing without using a subquery, or with a subquery in the WHERE clause?
Thank you

To find tags that are associated with more than 2 different resources you can use
SELECT tag
FROM tagging
GROUP BY tag
HAVING count(DISTINCT resource) > 2;
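As a quick check of why COUNT(DISTINCT resource) handles duplicate (resource, tag) rows, here is a minimal sketch that runs the query against an in-memory SQLite database. The sample rows and the helper name tags_with_min_resources are made up for illustration; only the tagging table and its columns come from the question.

```python
import sqlite3

def tags_with_min_resources(rows, min_resources=3):
    """Return tags linked to at least `min_resources` distinct resources."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tagging (id INTEGER PRIMARY KEY, tag TEXT, resource TEXT)"
    )
    conn.executemany("INSERT INTO tagging (tag, resource) VALUES (?, ?)", rows)
    result = conn.execute(
        "SELECT tag FROM tagging GROUP BY tag "
        "HAVING COUNT(DISTINCT resource) >= ? ORDER BY tag",
        (min_resources,),
    ).fetchall()
    conn.close()
    return [tag for (tag,) in result]

# Hypothetical data: 'sql' is on three different resources,
# 'hql' appears three times but always on the same resource.
sample_rows = [
    ("sql", "r1"), ("sql", "r1"), ("sql", "r2"), ("sql", "r3"),
    ("hql", "r1"), ("hql", "r1"), ("hql", "r1"),
]
print(tags_with_min_resources(sample_rows))  # ['sql']
```

A plain count(*) would have returned both tags here, since 'hql' also has three rows.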

Related

How do I select one row in SQL when another row has same id but a different value in a column?

I am creating a website where you can share posts with multiple tags. I have run into the problem that a post is shown multiple times, once for each of its tags. In my database I have a posts table and a tags table, which references the post through post_id. How can I get each post only once, together with all of its tags?
This should work, edit column names if needed:
SELECT *,
(SELECT GROUP_CONCAT(DISTINCT tag) FROM tags WHERE post_id = posts.id)
FROM posts
You can use GROUP_CONCAT() in MySQL and string_agg() in MS SQL Server
SELECT posts.id,GROUP_CONCAT(tags.tag)
FROM posts
LEFT JOIN tags ON tags.post_id = posts.id
GROUP BY posts.id
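To see what the LEFT JOIN version returns, here is a small sketch using SQLite, which also provides GROUP_CONCAT. The title column and the sample rows are assumptions made for the demo. The order of the concatenated values is not guaranteed, so the sketch sorts them in Python before printing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE posts (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE tags  (post_id INTEGER, tag TEXT);
    INSERT INTO posts VALUES (1, 'first'), (2, 'second');
    INSERT INTO tags  VALUES (1, 'sql'), (1, 'mysql');
""")

rows = conn.execute("""
    SELECT posts.id, GROUP_CONCAT(tags.tag)
    FROM posts
    LEFT JOIN tags ON tags.post_id = posts.id
    GROUP BY posts.id
    ORDER BY posts.id
""").fetchall()

# A post without tags comes back with NULL (None) instead of disappearing.
tags_by_post = {
    post_id: sorted(tags.split(",")) if tags else []
    for post_id, tags in rows
}
print(tags_by_post)  # {1: ['mysql', 'sql'], 2: []}
```

Each post shows up exactly once, and post 2 is kept because of the LEFT JOIN even though it has no tags.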

SQL question: how do I find the count of IDs that are always mapped to a 'true' field in another table

I have a database that collects a list of document packages in one table and each individual page in another table
Each page has a PackageID connecting the two tables.
I'm trying to find the count of all packages where ALL pages connected to it have a boolean field (stored on the page table) of true. Even if 1/20 of the pages connected to the packageID is false, I don't want that packageID counted
Right now all I have is:
SELECT COUNT(DISTINCT pages.package_id)
FROM pages
WHERE boolean_field = true
But I'm not sure how to express that if any page with that package_id has boolean_field != true, then I don't want that package counted. I also want to know the count of packages that have any page that is false.
I'm not sure if I need a subquery, if statement, having clause, or what.
Any direction even if it's what operators I should study on would be super helpful. Thanks :).
select count(*)
from
(
select package_id
from pages
group by package_id
having min(boolean_field) = 1
) tmp
Another way to express this is:
select count(*)
from packages p
where not exists (select 1
from pages pp
where pp.package_id = p.package_id and
not pp.boolean_field
);
The advantage of this approach is that it avoids aggregation, which can be a big win performance-wise. It can also take advantage of an index on pages(package_id, boolean_field).
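The two queries are not quite interchangeable: the aggregation reads only from pages, while the NOT EXISTS version starts from packages, so a package with no pages at all is counted by the second but not by the first. A minimal sketch in SQLite, assuming booleans are stored as 0/1 and that packages has a package_id column as in the answer; the sample data is made up. It also covers the "any page is false" count from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE packages (package_id INTEGER PRIMARY KEY);
    CREATE TABLE pages (
        id INTEGER PRIMARY KEY,
        package_id INTEGER,
        boolean_field INTEGER  -- 1 = true, 0 = false
    );
    -- Package 1: all pages true. Package 2: one false page. Package 3: no pages.
    INSERT INTO packages VALUES (1), (2), (3);
    INSERT INTO pages (package_id, boolean_field)
    VALUES (1, 1), (1, 1), (2, 1), (2, 0);
""")

def scalar(sql):
    """Run a query that returns a single value and return that value."""
    return conn.execute(sql).fetchone()[0]

all_true_agg = scalar("""
    SELECT COUNT(*) FROM (
        SELECT package_id FROM pages
        GROUP BY package_id HAVING MIN(boolean_field) = 1
    ) tmp
""")

all_true_not_exists = scalar("""
    SELECT COUNT(*) FROM packages p
    WHERE NOT EXISTS (SELECT 1 FROM pages pp
                      WHERE pp.package_id = p.package_id
                        AND NOT pp.boolean_field)
""")

any_false = scalar("""
    SELECT COUNT(*) FROM (
        SELECT package_id FROM pages
        GROUP BY package_id HAVING MIN(boolean_field) = 0
    ) tmp
""")

print(all_true_agg, all_true_not_exists, any_false)  # 1 2 1
```

If empty packages should not count as "all true", the NOT EXISTS query would need an extra EXISTS condition on pages.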

How to select records that have multiple values in a related table?

I have the following three tables for tagging content where each content can have one-to-many tags. For example, a content record could have a tag of California and Variable.
Table w/ the content
Content
-ContentID
-ContentName
Table w/ the tags
Tag
-TagID
-TagName
Table that links the content and the tags
ContentTag
-ContentID
-TagID
With the following SELECT statement I want to get records that have both TagID 21 and TagID 54; however, no rows are returned.
SELECT * FROM ContentTag
INNER JOIN Content On ContentTag.ContentID=Content.ContentID
INNER JOIN Tag ON ContentTag.TagID=Tag.TagID
Where (Tag.TagID=21 And Tag.TagID=54)
How do I create a SQL SELECT statement to retrieve content that has one-to-many tags?
I like to approach this question using aggregation and a having clause:
SELECT c.ContentId, c.ContentName
FROM ContentTag ct INNER JOIN
Content c
On ct.ContentID = c.ContentID
WHERE ct.TagID IN (21, 54)
GROUP BY c.ContentId, c.ContentName
HAVING COUNT(Distinct ct.TagId) = 2;
Some notes:
You don't need the join to the tags table. You are filtering on the tag ID, which is already in ContentTag.
You don't need *. I presume you are looking for content that has the two tags.
The WHERE clause limits the tags to the two tags in question.
The HAVING clause makes sure both are there.
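The original WHERE clause returns nothing because a single row can never have TagID equal to both 21 and 54. The sketch below runs the aggregation query in SQLite on made-up data (the content names and the extra tag 99 are assumptions) to show that only content carrying both tags survives the HAVING filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Content (ContentID INTEGER PRIMARY KEY, ContentName TEXT);
    CREATE TABLE ContentTag (ContentID INTEGER, TagID INTEGER);
    INSERT INTO Content VALUES (1, 'A'), (2, 'B'), (3, 'C');
    -- A has both tags, B has only 21, C has 54 plus an unrelated tag.
    INSERT INTO ContentTag VALUES (1, 21), (1, 54), (2, 21), (3, 54), (3, 99);
""")

both_tags = conn.execute("""
    SELECT c.ContentID, c.ContentName
    FROM ContentTag ct
    INNER JOIN Content c ON ct.ContentID = c.ContentID
    WHERE ct.TagID IN (21, 54)
    GROUP BY c.ContentID, c.ContentName
    HAVING COUNT(DISTINCT ct.TagID) = 2
""").fetchall()

# The question's predicate: no single row satisfies both equalities.
original = conn.execute("""
    SELECT * FROM ContentTag
    WHERE TagID = 21 AND TagID = 54
""").fetchall()

print(both_tags)  # [(1, 'A')]
print(original)   # []
```

To require N tags instead of two, extend the IN list and compare the distinct count to N.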

Rails subquery without SQL?

I have a User model that has many Post.
I want to get, in a single query, a list of user IDs, ordered by name, and include the ID of each user's last post.
Is there a way to do this using the ActiveRecord API instead of a SQL query like the following?
SELECT users.id,
(SELECT id FROM posts
WHERE user_id = users.id
ORDER BY id DESC LIMIT 1) AS last_post_id
FROM users
ORDER BY id ASC;
You should be able to do this with the query generator:
User.joins(:posts).group('users.id').order('users.id').pluck(:id, 'MAX(posts.id)')
There are a lot of options on the relation you can use to get data out of it. pluck is handy for getting values independent of models.
Update: To get models instead:
User.joins(:posts).group('users.id').order('users.id').select('users.*', 'MAX(posts.id) AS max_post_id')
That will create a field called max_post_id which works as any other attribute.
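One difference from the original SQL is worth checking: joins(:posts) generates an INNER JOIN, so users without any posts are dropped, whereas the correlated subquery in the question returns them with a NULL last_post_id. The sketch below reproduces the two shapes in plain SQL through SQLite, not through ActiveRecord, and the sample users are made up. In Rails 5 or later, left_outer_joins(:posts) in place of joins(:posts) should give the LEFT JOIN variant.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, user_id INTEGER);
    INSERT INTO users VALUES (1, 'ann'), (2, 'bob');  -- bob has no posts
    INSERT INTO posts VALUES (10, 1), (11, 1);
""")

# Roughly what joins(:posts).group('users.id') produces.
inner = conn.execute("""
    SELECT users.id, MAX(posts.id)
    FROM users INNER JOIN posts ON posts.user_id = users.id
    GROUP BY users.id ORDER BY users.id
""").fetchall()

# Same query with a LEFT JOIN keeps users that have no posts.
left = conn.execute("""
    SELECT users.id, MAX(posts.id)
    FROM users LEFT JOIN posts ON posts.user_id = users.id
    GROUP BY users.id ORDER BY users.id
""").fetchall()

# The correlated subquery from the question.
subquery = conn.execute("""
    SELECT users.id,
           (SELECT id FROM posts
            WHERE user_id = users.id
            ORDER BY id DESC LIMIT 1) AS last_post_id
    FROM users ORDER BY id ASC
""").fetchall()

print(inner)     # [(1, 11)]
print(left)      # [(1, 11), (2, None)]
print(subquery)  # [(1, 11), (2, None)]
```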

SELECT USING COUNT in mysql

I have a very large database with about 120 million records in one table. I have to clean up the data in this table before I divide it into several tables (possibly normalizing it). The columns of this table are as follows: id (primary key), userId, url, tag. This is basically a subset of the dataset from the Delicious website. As I said, each row has an id, a userId, a url, and only one tag. So a bookmark on Delicious that has several tags for a single url corresponds to several rows in my table. For example:
"id"; "user" ;"url" ;"tag"
"38";"12c2763095ec44e498f870ed67ee948d";"http://forkjavascript.org/";"ajax"
"39";"12c2763095ec44e498f870ed67ee948d";"http://forkjavascript.org/";"api"
"40";"12c2763095ec44e498f870ed67ee948d";"http://forkjavascript.org/";"javascript"
"41";"12c2763095ec44e498f870ed67ee948d";"http://forkjavascript.org/";"library"
"42";"12c2763095ec44e498f870ed67ee948d";"http://forkjavascript.org/";"rails"
I need a query to count the number of times that a tag is used for a url.
Thank you for your help
This query should work for you:
SELECT tag, url, count(tag) FROM TABLENAME GROUP BY tag, url
Haven't tested it for you though.
Is this what you are looking for?
SELECT COUNT(tag) FROM TABLENAME
WHERE tag='sometag'
I think it's actually more like SELECT tag, COUNT(tag) FROM TABLENAME WHERE URL='someurl' GROUP BY tag
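To compare the suggestions, here is a sketch that loads a few rows shaped like the sample data into SQLite and counts how often each tag is used per url. The table name bookmarks and the second user "u2" are assumptions added for the demo, so that one tag is used more than once.

```python
import sqlite3

URL = "http://forkjavascript.org/"

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE bookmarks "
    "(id INTEGER PRIMARY KEY, userId TEXT, url TEXT, tag TEXT)"
)
conn.executemany(
    "INSERT INTO bookmarks (userId, url, tag) VALUES (?, ?, ?)",
    [
        ("12c2763095ec44e498f870ed67ee948d", URL, "ajax"),
        ("12c2763095ec44e498f870ed67ee948d", URL, "api"),
        ("12c2763095ec44e498f870ed67ee948d", URL, "javascript"),
        ("u2", URL, "ajax"),  # hypothetical second user tagging the same url
    ],
)

# Count for every (url, tag) pair in one pass.
per_url_tag = conn.execute(
    "SELECT url, tag, COUNT(*) FROM bookmarks "
    "GROUP BY url, tag ORDER BY url, tag"
).fetchall()

# Counts for one specific url only.
for_one_url = conn.execute(
    "SELECT tag, COUNT(tag) FROM bookmarks "
    "WHERE url = ? GROUP BY tag ORDER BY tag",
    (URL,),
).fetchall()

print(per_url_tag)
print(for_one_url)  # [('ajax', 2), ('api', 1), ('javascript', 1)]
```

On a table with 120 million rows, an index on (url, tag) would likely help either form, though that is untested here.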