SQL Server: GROUP BY causes "column invalid" error, how to solve that?

I am trying to filter cg_group names (see the query below) and return each group once (using GROUP BY), ordered by its most recently updated opportunity (using ORDER BY opportunities.date_modified DESC).
Without GROUP BY, the query returns the following results:
SELECT cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
ORDER BY opportunities.date_modified DESC
Results:
ABC Group
ABC Group
CBC Group
ABC Group
XYZ Group
But I want each group to appear only once, in the following order:
ABC Group
CBC Group
XYZ Group
To do that I added GROUP BY cg_groups.name
SELECT cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
GROUP BY cg_groups.name
ORDER BY opportunities.date_modified DESC
But now I get this error:
Msg 8127, Level 16, State 1, Line 10
Column "opportunities.date_modified" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.
Someone please help me to solve this issue, thank you.

Use ROW_NUMBER to find the most recently updated record for each group:
WITH cte AS (
SELECT cg.name, o.date_modified,
ROW_NUMBER() OVER (PARTITION BY cg.name ORDER BY o.date_modified DESC) rn
FROM cg_groups cg
INNER JOIN cg_groups_cstm cgc
ON cgc.id_c = cg.id
INNER JOIN accounts_cstm ac
ON cg.name = ac.client_group_c
INNER JOIN accounts a
ON a.id = ac.id_c
INNER JOIN accounts_opportunities ao
ON a.id = ao.account_id
INNER JOIN opportunities o
ON ao.opportunity_id = o.id
WHERE cg.deleted = '0' AND cgc.status_c = '1' AND o.deleted = '0'
)
SELECT name
FROM cte
WHERE rn = 1
ORDER BY date_modified DESC;
Note that this may not be exactly what you want. This answer returns a single record per name group which is the most recently updated for that group. It then orders all results descending, but maybe you want ascending.
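The error message itself hints at a shorter route: wrap the column in an aggregate inside ORDER BY. Here is a minimal sketch of that idea, run through Python's sqlite3 module as a stand-in for SQL Server. The flattened table opps and its sample dates are hypothetical and replace the chain of joins from the question; GROUP BY with ORDER BY MAX(...) is standard SQL, so the same query shape should carry over.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opps (group_name TEXT, date_modified TEXT);
    INSERT INTO opps VALUES
        ('ABC Group', '2019-01-05'),
        ('ABC Group', '2019-01-04'),
        ('CBC Group', '2019-01-03'),
        ('ABC Group', '2019-01-02'),
        ('XYZ Group', '2019-01-01');
""")

# One row per group, ordered by each group's latest modification date.
rows = conn.execute("""
    SELECT group_name
    FROM opps
    GROUP BY group_name
    ORDER BY MAX(date_modified) DESC
""").fetchall()

names = [r[0] for r in rows]
print(names)  # ['ABC Group', 'CBC Group', 'XYZ Group']
```

Because MAX(date_modified) is an aggregate, it is well defined for every group, which is exactly what the engine complains is missing in the original ORDER BY.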

Put opportunities.date_modified in the SELECT list and in the GROUP BY; then you can use it in the ORDER BY. Note that this returns one row per name and date, not one row per name:
SELECT opportunities.date_modified,cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
GROUP BY cg_groups.name,opportunities.date_modified
ORDER BY opportunities.date_modified DESC
But to get your desired result, you can simply use DISTINCT, as below:
SELECT distinct cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
order by cg_groups.name
No GROUP BY is needed, as you have not used any aggregate function.
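One caveat worth checking on your own data: DISTINCT with ORDER BY name sorts alphabetically, which matches the sample output in the question only because the names happen to sort the same way as the dates. A small sketch of the difference, again using Python's sqlite3 as a stand-in and a hypothetical flattened table opps:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE opps (group_name TEXT, date_modified TEXT);
    INSERT INTO opps VALUES
        ('XYZ Group', '2019-01-05'),
        ('ABC Group', '2019-01-04'),
        ('XYZ Group', '2019-01-02');
""")

# DISTINCT removes the duplicates but can only sort by the selected column.
by_name = [r[0] for r in conn.execute(
    "SELECT DISTINCT group_name FROM opps ORDER BY group_name")]

# GROUP BY plus an aggregate in ORDER BY sorts by the latest update instead.
by_recency = [r[0] for r in conn.execute(
    "SELECT group_name FROM opps GROUP BY group_name "
    "ORDER BY MAX(date_modified) DESC")]

print(by_name)     # ['ABC Group', 'XYZ Group']
print(by_recency)  # ['XYZ Group', 'ABC Group']
```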

How about just adding DISTINCT right after SELECT?
SELECT DISTINCT ... FROM ...

Related

How to create distinct count from queries with several tables

I am trying to create one single query that will give me a distinct count for both the ActivityID and the CommentID. My query in MS Access looks like this:
SELECT
tbl_Category.Category, Count(tbl_Activity.ActivityID) AS CountOfActivityID,
Count(tbl_Comments.CommentID) AS CountOfCommentID
FROM tbl_Category LEFT JOIN
(tbl_Activity LEFT JOIN tbl_Comments ON
tbl_Activity.ActivityID = tbl_Comments.ActivityID) ON
tbl_Category.CategoryID = tbl_Activity.CategoryID
WHERE
(((tbl_Activity.UnitID)=5) AND ((tbl_Comments.PeriodID)=1))
GROUP BY
tbl_Category.Category;
I know the answer must somehow include SELECT DISTINCT but am not able to get it to work. Do I need to create multiple subqueries?
This is really painful in MS Access. I think the following does what you want to do:
SELECT ac.Category, ac.num_activities, aco.num_comments
FROM (SELECT ca.category, COUNT(*) as num_activities
FROM (SELECT DISTINCT c.Category, a.ActivityID
FROM (tbl_Category as c INNER JOIN
tbl_Activity as a
ON c.CategoryID = a.CategoryID
) INNER JOIN
tbl_Comments as co
ON a.ActivityID = co.ActivityID
WHERE a.UnitID = 5 AND co.PeriodID = 1
) as ca
GROUP BY ca.Category
) as ac LEFT JOIN
(SELECT cc.Category, COUNT(*) as num_comments
FROM (SELECT DISTINCT c.Category, co.CommentId
FROM (tbl_Category as c INNER JOIN
tbl_Activity as a
ON c.CategoryID = a.CategoryID
) INNER JOIN
tbl_Comments as co
ON a.ActivityID = co.ActivityID
WHERE a.UnitID = 5 AND co.PeriodID = 1
) as cc
GROUP BY cc.Category
) as aco
ON aco.Category = ac.Category
Note that your LEFT JOINs are superfluous because the WHERE clause turns them into INNER JOINs. This adjusts the logic for that purpose. The filtering is also very tricky, because it uses both tables, requiring that both subqueries have both JOINs.
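To see why a WHERE condition on the right-hand table undoes a LEFT JOIN, here is a minimal sketch using Python's sqlite3 as a stand-in for Access. The two small tables are hypothetical, simplified versions of tbl_Category and tbl_Activity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE category (id INTEGER, name TEXT);
    CREATE TABLE activity (id INTEGER, category_id INTEGER, unit_id INTEGER);
    INSERT INTO category VALUES (1, 'A'), (2, 'B');
    INSERT INTO activity VALUES (10, 1, 5);   -- category B has no activity
""")

# Filter in WHERE: the NULL-extended row for 'B' fails unit_id = 5 and is dropped.
where_rows = conn.execute("""
    SELECT c.name, a.id
    FROM category c
    LEFT JOIN activity a ON a.category_id = c.id
    WHERE a.unit_id = 5
    ORDER BY c.name
""").fetchall()

# Filter in ON: 'B' survives with a NULL activity id.
on_rows = conn.execute("""
    SELECT c.name, a.id
    FROM category c
    LEFT JOIN activity a ON a.category_id = c.id AND a.unit_id = 5
    ORDER BY c.name
""").fetchall()

print(where_rows)  # [('A', 10)]
print(on_rows)     # [('A', 10), ('B', None)]
```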
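In databases that support COUNT(DISTINCT ...), you can count distinct values directly (MS Access SQL does not support this syntax, so there you need subqueries as above):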
SELECT
tbl_Category.Category, Count(DISTINCT tbl_Activity.ActivityID) AS CountOfActivityID,
Count(DISTINCT tbl_Comments.CommentID) AS CountOfCommentID
FROM tbl_Category LEFT JOIN
(tbl_Activity LEFT JOIN tbl_Comments ON
tbl_Activity.ActivityID = tbl_Comments.ActivityID) ON
tbl_Category.CategoryID = tbl_Activity.CategoryID
WHERE
(((tbl_Activity.UnitID)=5) AND ((tbl_Comments.PeriodID)=1))
GROUP BY
tbl_Category.Category;
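The reason the plain COUNT overcounts is join fan-out: every comment repeats its activity's row. A minimal sketch of the effect, using Python's sqlite3 (which does support COUNT(DISTINCT ...)) and hypothetical, simplified tables:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE activity (activity_id INTEGER, category TEXT);
    CREATE TABLE comments (comment_id INTEGER, activity_id INTEGER);
    INSERT INTO activity VALUES (1, 'A'), (2, 'A');
    INSERT INTO comments VALUES (100, 1), (101, 1), (102, 2);
""")

# Activity 1 has two comments, so it appears twice in the joined rows.
row = conn.execute("""
    SELECT a.category,
           COUNT(a.activity_id)          AS plain_count,
           COUNT(DISTINCT a.activity_id) AS distinct_activities,
           COUNT(DISTINCT c.comment_id)  AS distinct_comments
    FROM activity a
    JOIN comments c ON c.activity_id = a.activity_id
    GROUP BY a.category
""").fetchone()

print(row)  # ('A', 3, 2, 3)
```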

How to display distinct values based on MAX date in report builder?

I'm quite new to SQL and I hope you can help me.
I'm trying to retrieve unique values from my table based on the latest date where specific users are selected.
This is the data:
Raw Data
And this is what I'm looking to achieve:
Desired Data
I tried to write 2 queries but unfortunately:
My 1st query would display duplicated rows for each company:
SELECT DISTINCT FilteredAppointment.regardingobjectidname
,FilteredAppointment.owneridname
,FilteredAppointment.subject
,MAX(FilteredAppointment.scheduledstart) as Date
,FilteredAppointment.location
,FilteredCcx_member.ccx_mnemonic
FROM FilteredAppointment
INNER JOIN FilteredAccount ON FilteredAppointment.regardingobjectid = FilteredAccount.accountid
INNER JOIN FilteredCcx_member ON FilteredAccount.accountid = FilteredCcx_member.ccx_accountid
WHERE FilteredAppointment.statecodename != N'Canceled'
AND FilteredAppointment.owneridname IN (N'User1', N'User2', N'User3')
GROUP BY FilteredAppointment.regardingobjectidname
,FilteredAppointment.owneridname
,FilteredAppointment.subject
,FilteredAppointment.scheduledstart
,FilteredAppointment.location
,FilteredCcx_member.ccx_mnemonic
ORDER BY FilteredAppointment.regardingobjectidname
And my 2nd query would display one row only:
SELECT DISTINCT FilteredAppointment.regardingobjectidname
,FilteredAppointment.owneridname
,FilteredAppointment.subject
,FilteredAppointment.scheduledstart
,FilteredAppointment.location
,FilteredCcx_member.ccx_mnemonic
FROM FilteredAppointment
INNER JOIN FilteredAccount ON FilteredAppointment.regardingobjectid = FilteredAccount.accountid
INNER JOIN FilteredCcx_member ON FilteredAccount.accountid = FilteredCcx_member.ccx_accountid
WHERE FilteredAppointment.scheduledstart = (SELECT MAX(FilteredAppointment.scheduledstart) FROM FilteredAppointment WHERE FilteredAppointment.regardingobjectidname = FilteredAppointment.regardingobjectidname)
AND FilteredAppointment.statecodename != N'Canceled'
AND FilteredAppointment.owneridname IN (N'User1', N'User2', N'User3')
GROUP BY FilteredAppointment.regardingobjectidname
,FilteredAppointment.owneridname
,FilteredAppointment.subject
,FilteredAppointment.scheduledstart
,FilteredAppointment.location
,FilteredCcx_member.ccx_mnemonic
ORDER BY FilteredAppointment.regardingobjectidname
Try this:-
SELECT distinct a.date, a.company, a.companyID, a.User, a.Location, a.topic
FROM tablename a
inner join
(
Select company, companyID, User, max(date) as recent_date
from
tablename
group by company, companyID, User
) b
on a.date=b.recent_date and a.company=b.company and a.companyID=b.companyID
and a.User=b.User;
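As a runnable illustration of this join-to-the-maximum pattern, here is a minimal sketch using Python's sqlite3 as a stand-in for SQL Server. The table appts and its rows are hypothetical. It groups the subquery by company only, on the assumption that one row per company is wanted; grouping by user as well, as in the query above, would instead return the latest row per company and user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE appts (company TEXT, owner TEXT, subject TEXT, start TEXT);
    INSERT INTO appts VALUES
        ('Acme', 'User1', 'Kickoff', '2019-01-10'),
        ('Acme', 'User2', 'Review',  '2019-03-01'),
        ('Beta', 'User1', 'Intro',   '2019-02-15');
""")

# The subquery finds each company's latest start date; the join keeps only
# the full rows that carry that date.
rows = conn.execute("""
    SELECT a.company, a.owner, a.subject, a.start
    FROM appts a
    JOIN (SELECT company, MAX(start) AS recent
          FROM appts
          GROUP BY company) b
      ON a.company = b.company AND a.start = b.recent
    ORDER BY a.company
""").fetchall()

print(rows)
# [('Acme', 'User2', 'Review', '2019-03-01'), ('Beta', 'User1', 'Intro', '2019-02-15')]
```

If two appointments of one company share the same latest date, both rows are returned, much like TOP 1 WITH TIES would do.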
I managed to solve the issue - Thank you for the help again!
WITH apptmts AS (
SELECT TOP 1 WITH TIES fa.scheduledstart, fa.location, fa.regardingobjectidname, mem.ccx_mnemonic, fa.owneridname, fa.subject
FROM FilteredAppointment fa
JOIN FilteredAccount acc ON fa.regardingobjectid = acc.accountid
JOIN FilteredCcx_member mem ON acc.accountid = mem.ccx_accountid
WHERE fa.statecodename != N'Canceled'
AND fa.owneridname IN (N'User1', N'User2', N'User3')
ORDER BY ROW_NUMBER() OVER (PARTITION BY fa.regardingobjectidname ORDER BY fa.scheduledstart DESC)
)
SELECT * FROM apptmts ORDER BY scheduledstart DESC

Limiting results in association

I want to limit the results in a lateral join, so that it only returns the N most recent matches.
This is my query, but the limit inside the join does not seem to work, as it returns all visitors:
select am.id, am.title, ame.event, array_agg(row_to_json(visitors))
from auto_messages am
left join apps a on am.app_id = a.id
left join app_users au on a.id = au.app_id
left join auto_message_events ame on ame.auto_message_id = am.id
left join lateral (
select
id,
name,
avatar,
ame.inserted_at
from visitors v
where v.id = ame.visitor_id
order by ame.inserted_at desc
limit 1
) as visitors on visitors.id = ame.visitor_id
where am.id = '100'
group by am.id, ame.event
I am pretty sure the problem is with ame. That is where the rows are generated. The join to visitors is only picking up additional information.
So, this might solve your problem:
select am.id, am.title, visitors.event, array_agg(row_to_json(visitors))
from auto_messages am left join
apps a
on am.app_id = a.id left join
app_users au
on a.id = au.app_id left join lateral
(select v.id, v.name, v.avatar,
ame.event, ame.inserted_at, ame.auto_message_id
from auto_message_events ame join
visitors v
on v.id = ame.visitor_id
where ame.auto_message_id = am.id -- correlate with the outer row so the limit applies per message
order by ame.inserted_at desc
limit 1
) visitors
on visitors.auto_message_id = am.id
where am.id = '100'
group by am.id, visitors.event;
You also might want to change your select clause, if you only want a subset of columns.
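The general point is that a per-group limit only works when the limited subquery refers to the outer row. Here is a rough sketch of that idea in Python's sqlite3. SQLite has no LATERAL, so this swaps in a correlated IN subquery, which is a different technique with the same per-row effect; the events table and its data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE events (id INTEGER PRIMARY KEY, message_id INTEGER,
                         visitor TEXT, inserted_at TEXT);
    INSERT INTO events (message_id, visitor, inserted_at) VALUES
        (100, 'ann', '2019-01-01'),
        (100, 'bob', '2019-01-02'),
        (100, 'cat', '2019-01-03'),
        (200, 'dan', '2019-01-04');
""")

# The inner query is re-evaluated for each outer row's message_id,
# so LIMIT 2 means "two most recent per message", not two overall.
rows = conn.execute("""
    SELECT e.message_id, e.visitor
    FROM events e
    WHERE e.id IN (SELECT e2.id
                   FROM events e2
                   WHERE e2.message_id = e.message_id
                   ORDER BY e2.inserted_at DESC
                   LIMIT 2)
    ORDER BY e.message_id, e.inserted_at DESC
""").fetchall()

print(rows)  # [(100, 'cat'), (100, 'bob'), (200, 'dan')]
```

In PostgreSQL the equivalent is a lateral subquery whose WHERE clause references the outer table, with the LIMIT inside it.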

Distinct on id with ordering by possible duplicate names

I have the following requirements for a query:
It needs to be ordered by a column of an inner-joined table (see from_products_products below).
It must allow duplicate names in from_products_products.
It cannot return duplicate records from the origin table (distinct on products.id).
The following query eliminates the duplicate names, which is not desired; I had to include from_products_products.name in the DISTINCT ON because it is used in the ORDER BY:
SELECT DISTINCT ON (from_products_products.name, products.id) "products".* FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY from_products_products.name ASC, products.id
Using GROUP BY has the same effect and also doesn't remove duplicates.
The original query that gives duplicate products when the INNER JOIN doesn't match any product:
SELECT "products".* FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY from_products_products.name ASC
So, how to overcome this on PostgreSQL?
PS: This is part of open-source software Noosfero-ecosol
Does this do what you want?
with t as (
SELECT DISTINCT ON (products.id) "products".*,
from_products_products.name as from_products_name
FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY products.id
)
select t.*
from t
order by from_products_name
It seems to meet your requirements.
EDIT:
If the above does what you want, I can think of five options:
The above using a CTE.
Basically the same logic, using a subquery.
Using window functions, which is structurally very similar.
Using group by.
Using a where clause for the filtering logic.
Here is the group by method:
SELECT "products".*,
MIN(from_products_products.name) as from_products_name
FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
GROUP BY products.id
ORDER BY from_products_name;
This form depends on products.id being declared as a primary key. Alternatively, you can put all the columns from that table in the group by.
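A reduced version of the GROUP BY method can be tried out as follows. This is only a sketch, using Python's sqlite3 as a stand-in for PostgreSQL; the two tables are hypothetical, cut-down versions of products and suppliers_plugin_source_products, and only the id is selected rather than products.*.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE source_products (from_product_id INTEGER, to_product_id INTEGER);
    INSERT INTO products VALUES (1, 'dist-1'), (2, 'dist-2'),
                                (10, 'Banana'), (11, 'Apple'), (12, 'Apple');
    -- product 1 has two sources; product 2 has one source with a repeated name
    INSERT INTO source_products VALUES (10, 1), (11, 1), (12, 2);
""")

# One row per target product, ordered by the smallest source name.
rows = conn.execute("""
    SELECT p.id, MIN(fp.name) AS from_name
    FROM products p
    JOIN source_products sp ON sp.to_product_id = p.id
    JOIN products fp ON fp.id = sp.from_product_id
    GROUP BY p.id
    ORDER BY from_name, p.id
""").fetchall()

print(rows)  # [(1, 'Apple'), (2, 'Apple')]
```

Each product appears once even though product 1 has two sources, and both products keep the same source name, so duplicate names are preserved.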
Rewriting (simplifying the aliases) yields:
SELECT p1.*
FROM products p1
INNER JOIN suppliers_plugin_source_products spsp
ON spsp.to_product_id = p1.id
INNER JOIN products p2
ON p2.id = spsp.from_product_id
INNER JOIN suppliers_plugin_source_products spsp2
ON spsp2.to_product_id = p1.id -- <<-- Huh?
INNER JOIN suppliers_plugin_suppliers sps
ON sps.id = spsp2.supplier_id
WHERE p1.profile_id = 45781
AND (p1."type" IN ('SuppliersPlugin::DistributedProduct') OR p1."type" IS NULL)
AND p1.archived <> true
ORDER BY p2.name ASC -- <<-- Huh?
;
The outer query only refers to the product tables p1 and p2.
Assuming that JOINing the "suppliers_plugin_source_products" table twice was unintentional, this can be reduced to:
SELECT p1.*
FROM products p1
JOIN products p2
ON EXISTS (
SELECT * FROM suppliers_plugin_source_products spsp
-- the next line might not be necessary ...
INNER JOIN suppliers_plugin_suppliers sps ON sps.id = spsp.supplier_id
WHERE spsp.to_product_id = p1.id
AND spsp.from_product_id = p2.id
)
WHERE p1.profile_id = 45781
AND (p1."type" IN ('SuppliersPlugin::DistributedProduct') OR p1."type" IS NULL)
AND p1.archived <> true
ORDER BY p2.name ASC
;

Postgresql distinct issue

I need to retrieve unique profiles ordered by creation_date. I have the following query:
SELECT DISTINCT profiles.id, COALESCE(occured_at, users_visitors.created_at, visitors.created_at) creation_date FROM "profiles"
JOIN "visitors" ON "visitors"."profile_id" = "profiles"."id"
LEFT JOIN events ON profiles.id = events.profile_id
LEFT JOIN event_kinds ON event_kinds.id = events.event_kind_id
LEFT JOIN users_visitors ON visitors.id = users_visitors.visitor_id
WHERE (event_kinds.name = 'enter') AND "users_visitors"."user_id" = 2
ORDER BY creation_date asc
DISTINCT ON (profiles.id) won't help, because profiles.id would then have to be used for ordering as well. GROUP BY profiles.id, ... doesn't work either.
Could you help me, please?
Does this GROUP BY work? Or which creation_date do you want - if not the max one?
SELECT profiles.id,
MAX(COALESCE(occured_at,
users_visitors.created_at,
visitors.created_at)) creation_date
FROM "profiles"
JOIN "visitors" ON "visitors"."profile_id" = "profiles"."id"
LEFT JOIN events ON profiles.id = events.profile_id
LEFT JOIN event_kinds ON event_kinds.id = events.event_kind_id
AND event_kinds.name = 'enter'
LEFT JOIN users_visitors ON visitors.id = users_visitors.visitor_id
AND "users_visitors"."user_id" = 2
GROUP BY profiles.id
ORDER BY creation_date asc
Note how I've moved the WHERE clause conditions into the join conditions so that the LEFT JOINs still behave as LEFT JOINs.
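To illustrate how MAX over the COALESCE collapses each profile to a single date, here is a minimal sketch in Python's sqlite3 as a stand-in for PostgreSQL. The tables are hypothetical and reduced to two; the column name occured_at is kept as spelled in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visitors (profile_id INTEGER, created_at TEXT);
    CREATE TABLE events (profile_id INTEGER, occured_at TEXT);
    INSERT INTO visitors VALUES (1, '2019-01-01'), (2, '2019-01-02');
    -- profile 1 has two events, profile 2 has none
    INSERT INTO events VALUES (1, '2019-03-01'), (1, '2019-02-01');
""")

# COALESCE falls back to the visitor date when there is no event;
# MAX then picks one date per profile so each profile appears once.
rows = conn.execute("""
    SELECT v.profile_id,
           MAX(COALESCE(e.occured_at, v.created_at)) AS creation_date
    FROM visitors v
    LEFT JOIN events e ON e.profile_id = v.profile_id
    GROUP BY v.profile_id
    ORDER BY creation_date ASC
""").fetchall()

print(rows)  # [(2, '2019-01-02'), (1, '2019-03-01')]
```

Swapping MAX for MIN would pick the earliest date per profile instead, which may be closer to what "creation_date" suggests.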