non-repeating column in postgresql - sql

My query looks like this:
SELECT o.id, o.title, oc.category_id,
(SELECT name from categories c where c.id = oc.category_id)
FROM objects o
LEFT JOIN object_categories oc ON oc.object_id = o.id
WHERE type_id = 17
It returns a table like the one in the image.
I want each category name to be returned only once. Can anyone help me?

To do that you can use the DISTINCT ON expression:
https://www.postgresql.org/docs/9.0/sql-select.html#SQL-DISTINCT

In Postgres, you can use distinct on:
SELECT DISTINCT ON(c.id)
o.id,
o.title,
oc.category_id,
c.name,
count(*) over(partition by o.id) cnt
FROM objects o
LEFT JOIN object_categories oc ON oc.object_id = o.id
LEFT JOIN categories c ON c.id = oc.category_id
WHERE type_id = 17
ORDER BY c.id, o.id
When a category appears in more than one record, this selects only the one that has the smallest object id.
I used the category id rather than the name to identify duplicates - you can use the category name instead, if that matters to you.
Note that I converted the inline subquery on categories to a regular join, since I find it more readable.
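DISTINCT ON is specific to PostgreSQL. As a rough, portable sketch of the same selection (one row per category, keeping the smallest object id), a correlated subquery can be used instead. The example below runs on SQLite through Python's sqlite3 module purely for illustration; the sample rows are invented, type_id is assumed to be a column of objects, and inner joins are used, so objects without a category are not returned.

```python
import sqlite3

# In-memory SQLite database with invented sample data (assumption: type_id lives on objects).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE objects (id INTEGER PRIMARY KEY, title TEXT, type_id INTEGER);
CREATE TABLE categories (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE object_categories (object_id INTEGER, category_id INTEGER);
INSERT INTO objects VALUES (1, 'chair', 17), (2, 'table', 17), (3, 'lamp', 17), (4, 'sofa', 5);
INSERT INTO categories VALUES (10, 'furniture'), (20, 'lighting');
INSERT INTO object_categories VALUES (1, 10), (2, 10), (3, 20), (4, 10);
""")

# For each category, keep only the row whose object id is the smallest
# among the objects of type 17 in that category.
distinct_rows = conn.execute("""
SELECT o.id, o.title, oc.category_id, c.name
FROM objects o
JOIN object_categories oc ON oc.object_id = o.id
JOIN categories c ON c.id = oc.category_id
WHERE o.type_id = 17
  AND o.id = (SELECT MIN(o2.id)
              FROM objects o2
              JOIN object_categories oc2 ON oc2.object_id = o2.id
              WHERE oc2.category_id = oc.category_id
                AND o2.type_id = 17)
ORDER BY c.id
""").fetchall()

for row in distinct_rows:
    print(row)
```

With this sample data, 'furniture' is listed once (for the chair, object id 1) even though the table belongs to the same category.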

Related

how to count two values from three dataset

I have 3 datasets: company, post, postedited,
I want to count each company's posts and edited posts (postedited). Some companies have posts that were never edited.
here is my query :
SELECT company.name, company.id, count(*),
( select count(*)
from post, postedited
where post.id=postedited.post_id)
from company, post as p
where company.id=p.company_id
group by company_id
The post count is right, but the postedited column shows the same value in every row. What's wrong with my query?
Your subquery is completely unrelated to the main query. It selects post and postedited and counts. You are showing this result for every row of the main query.
You want the subquery to relate to the main query's post. So remove the post table from the subquery's from clause:
(select count(*) from postedited where postedited.post_id = p.id)
Now this subquery counts the edits for the post_id of each record in the main query. Finally, you must sum these counts per company:
select
c.name, c.id, count(*) as posts,
sum((select count(*) from postedited pe where pe.post_id = p.id)) as edits
from company c
join post p on p.company_id = c.id
group by c.id, c.name;
You can achieve the same thus:
select
c.name, c.id, count(distinct p.id) as posts, count(pe.post_id) as edits
from company c
join post p on p.company_id = c.id
left join postedited pe on pe.post_id = p.id
group by c.id;
SELECT c.name AS companyName
, c.id AS companyID
, COUNT(DISTINCT p.id) AS postCount
, COUNT(DISTINCT pe.post_id) AS postEditCount
FROM company c
LEFT OUTER JOIN post p ON p.company_id = c.id
LEFT OUTER JOIN postedited pe ON pe.post_id = p.id
GROUP BY c.id, c.name
That will give you a list of all companies in your company table with a count of each of their posts and edited posts. If you need to further query against that dataset, you can. Or you can add a WHERE clause to the above query to filter it.
And I agree, please don't use comma syntax. It makes it very easy to produce unintended results, such as an accidental cross join when a WHERE condition is forgotten, and it doesn't give a good representation of what you're actually querying against. Explicit JOIN syntax has been part of the SQL standard since SQL-92 and is the preferred style. Good JOIN syntax will make your life much easier.
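To see why count(distinct p.id) is needed once postedited is joined in, here is a small sketch run on SQLite through Python's sqlite3 module. The rows are invented: Acme has two posts with three edits in total, Beta has one post that was never edited. The naive_posts column is only there to show how count(*) is inflated by the join.

```python
import sqlite3

# In-memory SQLite database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE post (id INTEGER PRIMARY KEY, company_id INTEGER);
CREATE TABLE postedited (id INTEGER PRIMARY KEY, post_id INTEGER);
INSERT INTO company VALUES (1, 'Acme'), (2, 'Beta');
INSERT INTO post VALUES (1, 1), (2, 1), (3, 2);
INSERT INTO postedited VALUES (1, 1), (2, 1), (3, 2);
""")

# The left join yields one row per (post, edit) pair, so count(*) counts
# posts more than once; count(distinct p.id) does not. count(pe.post_id)
# ignores the NULLs produced for posts without edits.
counts = conn.execute("""
SELECT c.name,
       COUNT(*)             AS naive_posts,
       COUNT(DISTINCT p.id) AS posts,
       COUNT(pe.post_id)    AS edits
FROM company c
JOIN post p ON p.company_id = c.id
LEFT JOIN postedited pe ON pe.post_id = p.id
GROUP BY c.id, c.name
ORDER BY c.name
""").fetchall()

for row in counts:
    print(row)
```

For Acme the naive count reports 3 posts because post 1 appears twice (once per edit), while the distinct count correctly reports 2.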

Why is a random String/varchar after the table name valid?

Why is the following query a valid select ?
SELECT * from arelation somerandomtext;
The content of arelation does not matter; it just has to be an existing view/table.
It returns the correct result, that is, the same output as the select without the somerandomtext.
Why does this query not throw an error/exception? Is there no check for keywords (GROUP BY, LIMIT, ...)?
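It's an alias. The AS keyword is optional, so somerandomtext is parsed as an alias for arelation.
For example,
select c.id, c.*
from products c
is valid syntax. Aliases are what make it possible to join a table to itself,
for example: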
select c.id, p.id
from products c inner join products p on p.id = c.id
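A quick way to convince yourself is to run the statement with and without the trailing word and then use that word as a qualifier. The sketch below does this on SQLite through Python's sqlite3 module; the table contents are invented and the column name x is an assumption made for the example.

```python
import sqlite3

# In-memory SQLite database; the column x and its values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE arelation (x INTEGER)")
conn.executemany("INSERT INTO arelation VALUES (?)", [(1,), (2,)])

# Without an alias.
plain = conn.execute("SELECT * FROM arelation ORDER BY x").fetchall()

# The trailing identifier is taken as a table alias (AS is optional).
aliased = conn.execute("SELECT * FROM arelation somerandomtext ORDER BY x").fetchall()

# The alias can then be used to qualify columns.
via_alias = conn.execute(
    "SELECT somerandomtext.x FROM arelation AS somerandomtext ORDER BY somerandomtext.x"
).fetchall()

print(plain, aliased, via_alias)
```

All three statements return the same rows, which is why the trailing word goes unnoticed.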

Getting SQL tuples that contain the largest value of an aggregate attribute after grouping

Here is the schema:
ACTOR (id, name)
PLAY (id, name, year)
CASTS (pid, aid, character)
The question is:
Find the plays with the largest cast (actors distinct) and return the titles and cast size of those plays.
This is SQL query that I have so far:
select mm.id, mm.name, count(distinct a.id) as numOfActors
from actor a
join casts c on c.aid = a.id
join play mm on mm.id = c.pid
group by mm.id, mm.name;
Every tuple returned from that query contains a different play, displaying its id, name, and the number of casts it has. But from there I'm having difficulty trying to fit it as a subquery within an outer query that would allow me to extract only the tuples that have the largest numofActors value (so like if the largest value was 100, then the only tuples that would be returned all have 100 actors).
Yeah this is one of those "homework"-type of problems, but I'm also looking for a conceptual understanding too (essentially, extracting the tuples that contain the largest value of a certain aggregated attribute after grouping has been done). Ordering by descending and selecting the top tuple doesn't work since there may be more than one tuple with the largest value.
Here is the approach in SQL Server:
select acp.*
from (select p.id, p.name, count(distinct a.id) as numOfActors,
max(count(distinct a.id)) over () as maxcnt
from actor a join
casts c
on c.aid = a.id join
play p
on p.id = c.pid
group by p.id, p.name
) acp
where numOfActors = maxcnt;
The expression max(count(distinct a.id)) over () is an example of a window function. It is calculating the maximum value of a field over a group of rows. Because the () are empty (there is no partition by clause), this assigns the same maximum value to a new column in all rows.
What value is that? It is the maximum of the calculated value count(distinct a.id). You want to find all plays that have this number of actors, so the outer query just selects these.
A subquery is needed because you cannot use window functions in the where clause.
EDIT: The same idea without window functions, using a common table expression:
with acp as (
select p.id, p.name, count(distinct a.id) as numOfActors
from actor a join
casts c
on c.aid = a.id join
play p
on p.id = c.pid
group by p.id, p.name
)
select acp.*
from acp join
(select max(numOfActors) as maxnoa
from acp
) acpm
on acp.numOfActors = acpm.maxnoa;
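As a small runnable sketch of the general pattern (aggregate per group, then keep every group that reaches the overall maximum), here is a version using a common table expression and a scalar subquery, run on SQLite through Python's sqlite3 module. The rows are invented, and it assumes that casts.pid references play.id and casts.aid references actor.id. The actor table is not joined because the distinct actor ids are already available in casts.

```python
import sqlite3

# In-memory SQLite database with invented sample data.
# Assumption: casts.pid -> play.id and casts.aid -> actor.id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE play (id INTEGER PRIMARY KEY, name TEXT, year INTEGER);
CREATE TABLE casts (pid INTEGER, aid INTEGER, "character" TEXT);
INSERT INTO actor VALUES (1, 'Ann'), (2, 'Ben'), (3, 'Cy'), (4, 'Di');
INSERT INTO play VALUES (1, 'Hamlet', 1601), (2, 'Macbeth', 1606), (3, 'Othello', 1604);
INSERT INTO casts VALUES
  (1, 1, 'Hamlet'), (1, 1, 'Ghost'), (1, 2, 'Horatio'), (1, 3, 'Ophelia'),
  (2, 2, 'Macbeth'), (2, 3, 'Lady Macbeth'), (2, 4, 'Banquo'),
  (3, 1, 'Othello');
""")

# Step 1 (CTE): distinct cast size per play.
# Step 2: keep every play whose cast size equals the overall maximum,
# so ties are all returned.
largest_casts = conn.execute("""
WITH acp AS (
  SELECT p.id, p.name, COUNT(DISTINCT c.aid) AS numOfActors
  FROM play p
  JOIN casts c ON c.pid = p.id
  GROUP BY p.id, p.name
)
SELECT name, numOfActors
FROM acp
WHERE numOfActors = (SELECT MAX(numOfActors) FROM acp)
ORDER BY name
""").fetchall()

for row in largest_casts:
    print(row)
```

Hamlet and Macbeth both have three distinct actors (Ann plays two characters in Hamlet but is counted once), so both rows are returned, which is exactly the tie case that ordering and taking the top row would miss.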

Error in query: aggregate function or the GROUP BY clause

Hi all, I have a problem with an SQL query: if I add GROUP BY, the database engine reports this error:
Column 'dbo.classes.class_name' is invalid in the select list because
it is not contained in either an aggregate function or the GROUP BY clause.
My query is:
string query = "SELECT p.*
FROM dbo.classes AS p INNER JOIN teacher_classes AS a
ON a.class_id = p.class_id
and teach_id = #id
GROUP BY p.class_id";
Can anyone help with that, please?
Note that without GROUP BY the query works fine, but the result is not grouped.
Your query is:
SELECT p.*
FROM dbo.classes AS p INNER JOIN
teacher_classes AS a
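ON a.class_id = p.class_id and teach_id = #id
GROUP BY p.class_id;
You are trying to select all the columns from p, yet you are grouping only by class_id. Most databases do not allow this: every selected column must either appear in the GROUP BY clause or be wrapped in an aggregate function. Otherwise, if two rows share a class_id but differ in their other columns, the database cannot tell which values to return.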
One option is to use distinct rather than group by to remove duplicates:
SELECT distinct c.*
FROM dbo.classes c INNER JOIN
teacher_classes tc
ON tc.class_id = c.class_id and tc.teach_id = #id;
Another option is to use something like in to find the matching classes for the teacher:
select c.*
from classes c
where c.class_id in (select tc.class_id from teacher_classes tc where tc.teach_id = #id);
Notice I also changed your aliases so they have some relationship to the table names. This makes the query much easier to read.
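The difference between the plain join and the two options can be checked on a tiny data set. The sketch below runs on SQLite through Python's sqlite3 module, with invented rows in which teacher 7 is linked to the Math class twice. The teacher id is passed as a bound parameter (?), which is the sqlite3 way of supplying the value and is an assumption about how the query would be called.

```python
import sqlite3

# In-memory SQLite database with invented sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE teacher_classes (teach_id INTEGER, class_id INTEGER);
INSERT INTO classes VALUES (1, 'Math'), (2, 'Physics'), (3, 'Art');
INSERT INTO teacher_classes VALUES (7, 1), (7, 1), (7, 2), (8, 3);
""")
teacher_id = 7

# Plain join: the duplicated link row makes Math appear twice.
joined = conn.execute("""
SELECT c.*
FROM classes c
JOIN teacher_classes tc ON tc.class_id = c.class_id AND tc.teach_id = ?
ORDER BY c.class_id
""", (teacher_id,)).fetchall()

# Option 1: DISTINCT removes the duplicate rows after the join.
with_distinct = conn.execute("""
SELECT DISTINCT c.*
FROM classes c
JOIN teacher_classes tc ON tc.class_id = c.class_id AND tc.teach_id = ?
ORDER BY c.class_id
""", (teacher_id,)).fetchall()

# Option 2: IN never multiplies rows of classes in the first place.
with_in = conn.execute("""
SELECT c.*
FROM classes c
WHERE c.class_id IN (SELECT tc.class_id FROM teacher_classes tc WHERE tc.teach_id = ?)
ORDER BY c.class_id
""", (teacher_id,)).fetchall()

print(joined, with_distinct, with_in)
```

Both options return each class once; the IN form avoids creating the duplicates at all, while DISTINCT removes them afterwards.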

Join two tables where all child records of first table match all child records of second table

I have four tables: Customer, CustomerCategory, Limit, and LimitCategory. A customer can be in multiple categories and a limit can also have multiple categories. I need to write a query that will return the customer name and limit amount where ALL of the customer's categories match ALL of the limit's categories.
I'm guessing it would be similar to the answer here, but I can't seem to get it right. Thanks!
Edit - Here's what the tables look like:
tblCustomer
customerId
name
tblCustomerCategory
customerId
categoryId
tblLimit
limitId
limit
tblLimitCategory
limitId
categoryId
I THINK you're looking for:
SELECT *
FROM CustomerCategory
LEFT OUTER JOIN Customer
ON CustomerCategory.CustomerId = Customer.Id
INNER JOIN LimitCategory
ON CustomerCategory.CategoryId = LimitCategory.CategoryId
LEFT OUTER JOIN Limit
ON Limit.Id = LimitCategory.LimitId
Updated!
Thanks to Felix for pointing out a flaw in my existing solution (3 years after I originally posted it, hehe). After looking at it again, I think this might be correct. Here I'm getting (1) the customers and limits with matching categories, plus the number of matching categories, (2) the number of categories per customer, (3) the number of categories per limit, (4) I then ensure the number of categories for customer and limits is the same as the number of the matches between the customers and limits:
UNTESTED!
select distinct
matches.name,
matches.limit
from (
select
c.name,
c.customerId,
l.limit,
l.limitId,
count(*) over(partition by cc.customerId, lc.limitId) as matchCount
from tblCustomer c
join tblCustomerCategory cc on c.customerId = cc.customerId
join tblLimitCategory lc on cc.categoryId = lc.categoryId
join tblLimit l on lc.limitId = l.limitId
) as matches
join (
select
cc.customerId,
count(*) as categoryCount
from tblCustomerCategory cc
group by cc.customerId
) as customerCategories
on matches.customerId = customerCategories.customerId
join (
select
lc.limitId,
count(*) as categoryCount
from tblLimitCategory lc
group by lc.limitId
) as limitCategories
on matches.limitId = limitCategories.limitId
where matches.matchCount = customerCategories.categoryCount
and matches.matchCount = limitCategories.categoryCount
I don't know if this will work or not; it's just a thought I had and I can't test it. I'm sure there's a nicer way! Don't be too harsh :)
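SELECT
c.customerId
, l.limitId
FROM
tblCustomer c
CROSS JOIN
tblLimit l
WHERE NOT EXISTS
(
SELECT
lc.categoryId
FROM
tblLimitCategory lc
WHERE
lc.limitId = l.limitId
EXCEPT
SELECT
cc.categoryId
FROM
tblCustomerCategory cc
WHERE
cc.customerId = c.customerId
)
AND NOT EXISTS
(
SELECT
cc.categoryId
FROM
tblCustomerCategory cc
WHERE
cc.customerId = c.customerId
EXCEPT
SELECT
lc.categoryId
FROM
tblLimitCategory lc
WHERE
lc.limitId = l.limitId
)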
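The "no category on one side is missing on the other" idea can be tried out on a small data set. The sketch below is one possible formulation using NOT EXISTS with NOT IN instead of EXCEPT, run on SQLite through Python's sqlite3 module. The rows are invented, and the amount column is named limitAmount here (an assumption) because limit is a reserved word in many databases. Note that a customer and a limit that both have no categories at all would also match under this formulation.

```python
import sqlite3

# In-memory SQLite database with invented sample data.
# Assumption: the limit amount column is called limitAmount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tblCustomer (customerId INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE tblCustomerCategory (customerId INTEGER, categoryId INTEGER);
CREATE TABLE tblLimit (limitId INTEGER PRIMARY KEY, limitAmount INTEGER);
CREATE TABLE tblLimitCategory (limitId INTEGER, categoryId INTEGER);
INSERT INTO tblCustomer VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO tblCustomerCategory VALUES (1, 1), (1, 2), (2, 1);
INSERT INTO tblLimit VALUES (1, 100), (2, 50), (3, 75);
INSERT INTO tblLimitCategory VALUES (1, 1), (1, 2), (2, 1), (3, 1), (3, 2), (3, 3);
""")

# A customer/limit pair matches when neither side has a category
# that the other side lacks, i.e. the two category sets are equal.
exact_matches = conn.execute("""
SELECT c.name, l.limitAmount
FROM tblCustomer c
CROSS JOIN tblLimit l
WHERE NOT EXISTS (
        SELECT 1
        FROM tblLimitCategory lc
        WHERE lc.limitId = l.limitId
          AND lc.categoryId NOT IN (SELECT cc.categoryId
                                    FROM tblCustomerCategory cc
                                    WHERE cc.customerId = c.customerId))
  AND NOT EXISTS (
        SELECT 1
        FROM tblCustomerCategory cc
        WHERE cc.customerId = c.customerId
          AND cc.categoryId NOT IN (SELECT lc.categoryId
                                    FROM tblLimitCategory lc
                                    WHERE lc.limitId = l.limitId))
ORDER BY c.name, l.limitId
""").fetchall()

for row in exact_matches:
    print(row)
```

Alice (categories 1 and 2) matches only the limit of 100, and Bob (category 1) matches only the limit of 50; the limit of 75 has an extra category 3 and therefore matches nobody.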