PostgreSQL - SELECT DISTINCT, ORDER BY expressions must appear in select list - sql

I'm new to SQL.
I think I've misunderstood how the DISTINCT keyword works.
Here's my code:
SELECT DISTINCT(e.id), e.text, e.priority, CAST(e.order_number AS integer), s.name AS source, e.modified_time, e.creation_time, (SELECT string_agg(DISTINCT text, '|') FROM definitions WHERE entry_id = d.entry_id) AS definitions
FROM entries AS e
LEFT JOIN definitions d ON d.entry_id = e.id
INNER JOIN sources s ON e.source_id = s.id
WHERE vocabulary_id = 22
ORDER BY e.order_number
The error is as follows:
ERROR: for SELECT DISTINCT, ORDER BY expressions must appear in select list
LINE 6: ORDER BY e.order_number
Just trying to understand what my SELECT statement should look like.

It appears that you are trying to apply DISTINCT to a single column and not to the others, which is bound to fail.
For example, select distinct a, b, c from x returns the unique combinations of a, b and c, not unique values of a alongside unrestricted b and c.

If you want one row per distinct e.id, then you are looking for distinct on. It is very important that the order by be consistent with the distinct on keys:
SELECT DISTINCT ON (e.id) e.id, e.text, e.priority, CAST(e.order_number AS integer),
s.name AS source, e.modified_time, e.creation_time,
(SELECT string_agg(DISTINCT d2.text, '|') FROM definitions d2 WHERE d2.entry_id = d.entry_id) AS definitions
FROM entries e LEFT JOIN
definitions d
ON d.entry_id = e.id INNER JOIN
sources s
ON e.source_id = s.id
WHERE vocabulary_id = 22
ORDER BY e.id, e.order_number;
Given the subquery, I suspect that there are better ways to write the query. If that is of interest, ask another question, provide sample data, desired results, and a description of the logic.
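To make the DISTINCT ON semantics concrete, here is a minimal Python sketch (not PostgreSQL itself) of what the query above does: sort by the ORDER BY keys, then keep the first row of each e.id group. The sample rows and the helper name distinct_on_first are made up for illustration.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical rows after the joins: (id, text, order_number).
# Entry 1 appears twice because it matched two definitions.
rows = [
    (2, "beta", 10),
    (1, "alpha", 30),
    (1, "alpha", 20),
]

def distinct_on_first(rows, key, order):
    """Sort by `order`, then keep the first row of each `key` group."""
    ordered = sorted(rows, key=order)
    return [next(group) for _, group in groupby(ordered, key=key)]

# Mirrors: SELECT DISTINCT ON (id) ... ORDER BY id, order_number
result = distinct_on_first(rows, key=itemgetter(0), order=itemgetter(0, 2))
print(result)
```

The ORDER BY has to start with the DISTINCT ON key because the grouping only works on rows that are already sorted by that key; the remaining sort columns decide which row of each group survives.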

Related

SUM a column count from two tables

I have this simple unioned query in SQL Server 2014 where I am getting counts of rows from each table, and then trying to add a TOTAL row at the bottom that will SUM the counts from both tables. I believe the problem is that the LEFT OUTER JOIN in the last SELECT of the union seems to count only the rows from the first table:
SELECT A.TEST_CODE, B.DIVISION, COUNT(*)
FROM ALL_USERS B, SIGMA_TEST A
WHERE B.DOMID = A.DOMID
GROUP BY A.TEST_CODE, B.DIVISION
UNION
SELECT E.TEST_CODE, F.DIVISION, COUNT(*)
FROM BETA_TEST E, ALL_USERS F
WHERE E.DOMID = F.DOMID
GROUP BY E.TEST_CODE, F.DIVISION
UNION
SELECT 'TOTAL', '', COUNT(*)
FROM (SIGMA_TEST A LEFT OUTER JOIN BETA_TEST E ON A.DOMID
= E.DOMID )
Here is a sample of the results I am getting:
I would expect the TOTAL row to display a result of 6 (2+1+3=6)
I would like to avoid using a Common Table Expression (CTE) if possible. Thanks in advance!
Since you are counting users with matching DOMIDs in the first two statements, the final statement also needs to include the ALL_USERS table. The final statement should be:
SELECT 'TOTAL', '', COUNT(*)
FROM ALL_USERS G LEFT OUTER JOIN
SIGMA_TEST H ON G.DOMID = H.DOMID
LEFT OUTER JOIN BETA_TEST I ON I.DOMID = G.DOMID
WHERE (H.TEST_CODE IS NOT NULL OR I.TEST_CODE IS NOT NULL)
I would consider doing a UNION ALL first then COUNT:
SELECT COALESCE(TEST_CODE, 'TOTAL'),
DIVISION,
COUNT(*)
FROM (
SELECT A.TEST_CODE, B.DIVISION
FROM ALL_USERS B
INNER JOIN SIGMA_TEST A ON B.DOMID = A.DOMID
UNION ALL
SELECT E.TEST_CODE, F.DIVISION
FROM BETA_TEST E
INNER JOIN ALL_USERS F ON E.DOMID = F.DOMID ) AS T
GROUP BY GROUPING SETS ((TEST_CODE, DIVISION ), ())
Using GROUPING SETS you can easily get the total, so there is no need to add a third subquery.
Note: I assume you want just one count per (TEST_CODE, DIVISION). Otherwise you also have to group on the source table, as in @Gareth's answer.
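The "UNION ALL first, then COUNT" idea can be tried out with a small sketch using Python's sqlite3 module. SQLite has no GROUPING SETS, so this sketch swaps in a second UNION ALL over the same derived table to produce the TOTAL row; the sample rows are invented so that the per-group counts are 2, 1 and 3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ALL_USERS  (DOMID TEXT, DIVISION TEXT);
CREATE TABLE SIGMA_TEST (DOMID TEXT, TEST_CODE TEXT);
CREATE TABLE BETA_TEST  (DOMID TEXT, TEST_CODE TEXT);
INSERT INTO ALL_USERS  VALUES ('u1','East'),('u2','East'),('u3','West'),('u4','West'),('u5','West');
INSERT INTO SIGMA_TEST VALUES ('u1','S1'),('u2','S1'),('u3','S1');
INSERT INTO BETA_TEST  VALUES ('u3','B1'),('u4','B1'),('u5','B1');
""")

# Derived table: one row per test result, stacked with UNION ALL.
combined = """
    SELECT A.TEST_CODE AS TEST_CODE, B.DIVISION AS DIVISION
    FROM ALL_USERS B INNER JOIN SIGMA_TEST A ON B.DOMID = A.DOMID
    UNION ALL
    SELECT E.TEST_CODE, F.DIVISION
    FROM BETA_TEST E INNER JOIN ALL_USERS F ON E.DOMID = F.DOMID
"""

# Per-group counts plus a TOTAL row computed from the same derived table.
query = f"""
    SELECT TEST_CODE, DIVISION, COUNT(*) FROM ({combined}) AS T
    GROUP BY TEST_CODE, DIVISION
    UNION ALL
    SELECT 'TOTAL', '', COUNT(*) FROM ({combined}) AS T
"""
counts = {(code, division): n for code, division, n in conn.execute(query)}

# The TOTAL query from the question pairs the two test tables with each other,
# so it counts joined SIGMA_TEST rows rather than adding the two counts.
naive_total = conn.execute(
    "SELECT COUNT(*) FROM SIGMA_TEST A LEFT OUTER JOIN BETA_TEST E ON A.DOMID = E.DOMID"
).fetchone()[0]

print(counts[("TOTAL", "")], naive_total)
```

With this sample data the derived-table total is the sum of the group counts, while the LEFT OUTER JOIN version only reflects the rows of SIGMA_TEST.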
I think you can achieve this with a single query. It seems your test tables have similar structures, so you can union them together and join the result to ALL_USERS. Finally, you can use GROUPING SETS to get the total:
SELECT ISNULL(T.TEST_CODE, 'TOTAL') AS TEST_CODE,
ISNULL(U.DIVISION, '') AS DIVISION,
COUNT(*)
FROM ALL_USERS AS U
INNER JOIN
( SELECT DOMID, TEST_CODE, 'SIGMA' AS SOURCETABLE
FROM SIGMA_TEST
UNION ALL
SELECT DOMID, TEST_CODE, 'BETA' AS SOURCETABLE
FROM BETA_TEST
) AS T
ON T.DOMID = U.DOMID
GROUP BY GROUPING SETS ((T.TEST_CODE, U.DIVISION, T.SOURCETABLE), ());
As an aside, the implicit join syntax you are using was replaced over a quarter of a century ago in ANSI 92. It is not wrong, but there seems to be little reason to continue to use it, especially when you are mixing and matching with explicit outer joins and implicit inner joins. Anyone else that might read your SQL will certainly appreciate consistency.

SQL Beginner - How to get items from Table1 that are associated with at least 10 items in Table2

This is probably very easy, but I'm confused after looking at things online. Each item in my Contract table has multiple Envelopes. I want to find Contracts that have at least 10 envelopes. How do I go about this?
I've tried the following
select c.*, COUNT(e.ID)
from [Contract] c
INNER JOIN Envelope e ON e.ContractID = c.ID
Group By c.ID
HAVING Count(e.ID) > 10
And I get
Column 'Contract.PresenterUserID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I haven't dealt with aggregate or group by clauses before, so I'm not sure what it means.
By default, MySQL will accept your query, so I am assuming that you are not using MySQL, or that the system has ONLY_FULL_GROUP_BY turned on.
Here is another approach that will work in any database:
select c.*, e.cnt
from [Contract] c inner join
(select e.ContractId, count(*) as cnt
from Envelope e
group by e.ContractId
having count(*) >= 10
) e
on e.ContractID = c.ID;
This moves the aggregation to a subquery, before the join. You can then take all the columns from the contract table.
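As a rough check of the aggregate-then-join approach, the following sketch runs the same query shape against an in-memory SQLite database through Python's sqlite3 module. The table contents are invented: contract 1 has 12 envelopes, contract 2 has exactly 10, and contract 3 has 9.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Contract (ID INTEGER PRIMARY KEY, PresenterUserID INTEGER);
CREATE TABLE Envelope (ID INTEGER PRIMARY KEY, ContractID INTEGER);
INSERT INTO Contract VALUES (1, 100), (2, 200), (3, 300);
""")

# Hypothetical envelope counts per contract.
envelopes_per_contract = {1: 12, 2: 10, 3: 9}
for contract_id, n in envelopes_per_contract.items():
    conn.executemany(
        "INSERT INTO Envelope (ContractID) VALUES (?)", [(contract_id,)] * n
    )

# Aggregate inside the subquery, then join, so c.* needs no GROUP BY.
rows = conn.execute("""
    SELECT c.*, e.cnt
    FROM Contract c
    INNER JOIN (
        SELECT ContractID, COUNT(*) AS cnt
        FROM Envelope
        GROUP BY ContractID
        HAVING COUNT(*) >= 10
    ) e ON e.ContractID = c.ID
    ORDER BY c.ID
""").fetchall()
print(rows)
```

Note that "at least 10" needs >= 10. With > 10, as in the question's attempt, the contract with exactly 10 envelopes would be dropped.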
You are pretty close. Every field in the select list must appear either in an aggregate function or in the GROUP BY clause. Try this:
select c.id, c.PresenterUserID, COUNT(e.ID)
from [Contract] c
INNER JOIN Envelope e ON e.ContractID = c.ID
Group By c.ID, c.PresenterUserID
HAVING Count(e.ID) >= 10

Eliminate duplicate rows from query output

I have a large SELECT query with multiple JOINS and WHERE clauses. Despite specifying DISTINCT (also have tried GROUP BY) - there are duplicate rows returned. I am assuming this is because the query selects several IDs from several tables. At any rate, I would like to know if there is a way to remove duplicate rows from a result set, based on a condition.
I am looking to remove duplicates from results if x.ID appears more than once. The duplicate rows all appear grouped together with the same IDs.
Query:
SELECT e.Employee_ID, ce.CC_ID as CCID, e.Manager_ID, e.First_Name, e.Last_Name, e.Last_Login,
e.Date_Created AS Date_Created, e.Employee_Password AS Password, e.EmpLogin,
ISNULL((SELECT TOP 1 1 FROM Gift g
JOIN Type t ON g.TypeID = t.TypeID AND t.Code = 'Reb'
WHERE g.Manager_ID = e.Manager_ID),0) RebGift,
i.DateCreated as ImportDate
FROM #EmployeeTemp ct
JOIN dbo.Employee e ON ct.Employee_ID = e.Employee_ID
INNER JOIN dbo.Manager m ON e.Manager_ID = m.Manager_ID
LEFT JOIN EmployeeImp i ON e.Employee_ID = i.Employee_ID AND i.Active = 1
INNER JOIN CreditCard_Updates ce ON m.Manager_ID = ce.Manager_ID
LEFT JOIN Manager m2 ON m2.Manager_ID = ce.Modified_By
WHERE ce.CCType ='R' AND m.isT4L = 1
AND CHARINDEX(e.first_name, Selected_Emp) > 0
AND ce.Processed_Flag = @isProcessed
I don't have enough reputation to add a comment, so I'll just try to help you in an answer proper (even though this is more of a comment).
It seems like what you want to do is select distinctly on just one column.
Here are some answers which look like that:
SELECT DISTINCT on one column
How can I SELECT rows with MAX(Column value), DISTINCT by another column in SQL?

Recursive SQL command and duplicate results?

I was wondering if how I am handling removing duplicate results using DISTINCT is the best way to approach my recursive call. Here is my code sample:
with cte as(
SELECT
dbo.Users.Username,
dbo.Contacts.FirstName,
dbo.Contacts.LastName,
tenant.Name,
tenant.Id,
tenant.ParentTenantId
FROM
dbo.Tenants AS tenant
INNER JOIN dbo.Users ON tenant.Id = dbo.Users.TenantId
INNER JOIN dbo.Contacts ON dbo.Users.ContactId = dbo.Contacts.Id
where tenant.Id = '6CD4C969-C794-4C95-9CA2-5984AEC0E32C'
union all
SELECT
dbo.Users.Username,
dbo.Contacts.FirstName,
dbo.Contacts.LastName,
childTenant.Name,
childTenant.Id,
childTenant.ParentTenantId
FROM
dbo.Tenants AS childTenant
INNER JOIN dbo.Users ON childTenant.Id = dbo.Users.TenantId
INNER JOIN dbo.Contacts ON dbo.Users.ContactId = dbo.Contacts.Id
INNER JOIN cte on childTenant.ParentTenantId = cte.Id)
select DISTINCT UserName, FirstName, LastName, Name, Id, ParentTenantId from cte ORDER BY Id
here are the results:
Here are the results without using the DISTINCT key word
While DISTINCT works I am wondering if it is the best way to handle the duplicate results or if I should rework my query somehow.
I'm not sure whether that is the case here, but it is a common misunderstanding that distinct is a function that applies to a certain column. Distinct applies to the row, that is:
select distinct(x), y from t
is the same as:
select distinct x, y from t or select distinct x, (y) from t
Furthermore:
select x, distinct(y) from t
is an invalid construction
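A quick way to see this is to run the three forms against a throwaway table. The sketch below uses Python's sqlite3 module with an invented table t; other engines may word the error differently, but the behavior of DISTINCT as a row-level keyword is the same.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE t (x INTEGER, y INTEGER);
INSERT INTO t VALUES (1, 1), (1, 2), (1, 2), (2, 1);
""")

# The parentheses only wrap the expression x; DISTINCT still covers the row.
with_parens = conn.execute(
    "SELECT DISTINCT(x), y FROM t ORDER BY x, y"
).fetchall()
without_parens = conn.execute(
    "SELECT DISTINCT x, y FROM t ORDER BY x, y"
).fetchall()

# DISTINCT is not a function, so it cannot appear mid-list.
try:
    conn.execute("SELECT x, DISTINCT(y) FROM t")
    misplaced_distinct_accepted = True
except sqlite3.OperationalError:
    misplaced_distinct_accepted = False

print(with_parens, without_parens, misplaced_distinct_accepted)
```

Both valid forms return the same rows, and x = 1 still appears twice because the pairs (1, 1) and (1, 2) are different rows.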

Error in query: aggregate function or the GROUP BY clause

Hi all, I have a problem with an SQL query: if I add GROUP BY, the database engine outputs this error:
Column 'dbo.classes.class_name' is invalid in the select list because
it is not contained in either an aggregate function or the GROUP BY clause.
My query is:
string query = "SELECT p.*
FROM dbo.classes AS p INNER JOIN teacher_classes AS a
ON a.class_id = p.class_id
and teach_id = @id
GROUP BY p.class_id";
Can anyone help with this?
Note that without GROUP BY the query works fine, but the result is not grouped.
Your query is:
SELECT p.*
FROM dbo.classes AS p INNER JOIN
teacher_classes AS a
ON a.class_id = p.class_id and teach_id = @id
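GROUP BY p.class_id;
You are trying to select all the columns from p, yet you are grouping only by class_id. This is not allowed in most databases: every selected column must either appear in the GROUP BY clause or be wrapped in an aggregate function. What should the database return if two rows share a class_id but differ in their other columns?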
One option is to use distinct rather than group by to remove duplicates:
SELECT distinct c.*
FROM dbo.classes c INNER JOIN
teacher_classes tc
ON tc.class_id = c.class_id and tc.teach_id = @id;
Another option is to use something like in to find the matching classes for the teacher:
select c.*
from classes c
where c.class_id in (select tc.class_id from teacher_classes tc where tc.teach_id = @id)
Notice I also changed your aliases so they have some relationship to the table names. This makes the query much easier to read.
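The two options can be compared side by side with a small sqlite3 sketch in Python. The rows are invented, with one duplicated assignment for teacher 7 to provoke the duplicate output, and the named parameter :id stands in for @id.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE classes (class_id INTEGER PRIMARY KEY, class_name TEXT);
CREATE TABLE teacher_classes (teach_id INTEGER, class_id INTEGER);
INSERT INTO classes VALUES (1, 'Math'), (2, 'Physics'), (3, 'Art');
INSERT INTO teacher_classes VALUES (7, 1), (7, 1), (7, 2), (8, 3);
""")
params = {"id": 7}

# Plain join: the duplicated assignment row duplicates the class row.
joined = conn.execute("""
    SELECT c.* FROM classes c
    INNER JOIN teacher_classes tc
        ON tc.class_id = c.class_id AND tc.teach_id = :id
    ORDER BY c.class_id
""", params).fetchall()

# Option 1: DISTINCT removes duplicate rows after the join.
with_distinct = conn.execute("""
    SELECT DISTINCT c.* FROM classes c
    INNER JOIN teacher_classes tc
        ON tc.class_id = c.class_id AND tc.teach_id = :id
    ORDER BY c.class_id
""", params).fetchall()

# Option 2: IN only filters classes, so no duplicates are created.
with_in = conn.execute("""
    SELECT c.* FROM classes c
    WHERE c.class_id IN (
        SELECT tc.class_id FROM teacher_classes tc WHERE tc.teach_id = :id
    )
    ORDER BY c.class_id
""", params).fetchall()

print(joined, with_distinct, with_in)
```

The IN form never multiplies rows in the first place, whereas DISTINCT removes the duplicates after the join has produced them.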