Distinct on id with ordering by possible duplicate names - sql

I have the following requisites for a query:
Needs to ordered on a inner joined table (see from_products_products below),
Allow duplicates names on from_products_products
It cannot return duplicates records on the origin table (distinct on products.id).
The following query will eliminate the duplicate names, which is not desired, as I had to put a distinct on from_products_products.name because of the use in order by:
SELECT DISTINCT ON (from_products_products.name, products.id) "products".* FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY from_products_products.name ASC, products.id
Using GROUP BY has the same effect and also don't remove duplicates;
The original query that gives duplicate products when the INNER JOIN doesn't match any product:
SELECT "products".* FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY from_products_products.name ASC
So, how to overcome this on PostgreSQL?
PS: This is part of open-source software Noosfero-ecosol

Does this do what you want?
with t as (
SELECT DISTINCT ON (products.id) "products".*,
from_products_products.name as from_products_name
FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
ORDER BY products.id
)
select t.*
from t
order by from_products_name
It seems to meet your requirements.
EDIT:
If the above does what you want, I can think of five options:
The above using a CTE.
Basically the same logic, using a subquery.
Using window functions, which is structurally very similar.
Using group by.
Using a where clause for the filtering logic.
Here is the group by method:
SELECT "products".*,
MIN(from_products_products.name) as from_products_name
FROM "products"
INNER JOIN "suppliers_plugin_source_products" ON "suppliers_plugin_source_products"."to_product_id" = "products"."id"
INNER JOIN "products" "from_products_products" ON "from_products_products"."id" = "suppliers_plugin_source_products"."from_product_id"
INNER JOIN "suppliers_plugin_source_products" "sources_from_products_products_join" ON "sources_from_products_products_join"."to_product_id" = "products"."id"
INNER JOIN "suppliers_plugin_suppliers" ON "suppliers_plugin_suppliers"."id" = "sources_from_products_products_join"."supplier_id"
WHERE "products"."profile_id" = 45781 AND (("products"."type" IN ('SuppliersPlugin::DistributedProduct') OR "products"."type" IS NULL)) AND (products.archived <> true)
GROUP BY products.id
ORDER BY from_products_name;
This form depends on products.id being declared as a primary key. Alternatively, you can put all the columns from that table in the group by.

Rewriting (simplifying the aliases) yields:
SELECT p1.*
FROM products p1
INNER JOIN suppliers_plugin_source_products spsp
ON spsp.to_product_id = p1.id
INNER JOIN products p2
ON p2.id = spsp.from_product_id
INNER JOIN suppliers_plugin_source_products spsp2
ON spsp2.to_product_id = p1.id -- <<-- Huh?
INNER JOIN suppliers_plugin_suppliers sps
ON sps.id = spsp2.supplier_id
WHERE p1.profile_id = 45781
AND (p1."type" IN ('SuppliersPlugin::DistributedProduct') OR p1."type" IS NULL)
AND p1.archived <> true
ORDER BY p2.name ASC -- <<-- Huh?
;
The outer query only refers to the product tables p1 and p2.
Assuming that JOINing the "suppliers_plugin_source_products" table twice was unintentional, this can be reduced to:
SELECT p1.*
FROM products p1
JOIN products p2
ON EXISTS (
SELECT * FROM suppliers_plugin_source_products spsp
-- the next line might not be necessary ...
INNER JOIN suppliers_plugin_suppliers sps ON sps.id = spsp.supplier_id
WHERE spsp.to_product_id = p1.id
AND spsp.from_product_id = p2.id
)
WHERE p1.profile_id = 45781
AND (p1."type" IN ('SuppliersPlugin::DistributedProduct') OR p1."type" IS NULL)
AND p1.archived <> true
ORDER BY p2.name ASC
;

Related

Access Subquery On mulitple conditions

This SQL query needs to be done in ACCESS.
I am trying to do a subquery on the total sales, but I want to link the sale to the province AND to product. The below query will work with one or the other: (po.product_name = allp.all_products) AND (p.province = allp.all_province); -- but it will no take both.
I will be including every month into this query, once I can figure out the subquery on with two criteria.
Select
p.province as [Province],
po.product_name as [Product],
all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id)
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province
)
as allp
on (po.product_name = allp.all_products) AND (p.province = allp.all_province);
Make the first select sql into a table by giving it an alias and join table 1 to table 2. I don't have your table structure or data to test it but I think this will lead you down the right path:
select table1.*, table2.*
from
(Select
p.province as [Province],
po.product_name as [Product]
--removed this ,all_price
FROM
(purchase_order po
INNER JOIN person p
on p.person_id = po.person_id) table1
left join
(
select
po1.product_name AS [all_products],
sum(pp1.price) AS [all_price],
p1.province AS [all_province]
from (purchase_order po1
INNER JOIN product pp1
on po1.product_name = pp1.product_name)
INNER JOIN person p1
on po1.person_id = p1.person_id
group by po1.product_name, pp1.price, p1.province --check your group by, I dont think you want pp1.price here if you want to aggregate
) as table2 --changed from allp
on (table1.product = table2.all_products) AND (table1.province = table2.all_province);

SQL Server : Group By causes "column invalid" error, how to solve that?

I am trying to filter cg_group names (please check the query) and group (using: GROUP BY) the results according to last updated opportunity (using: ORDER BY opportunities.date_modified DESC).
When I used query without use group by it returns the following results:
SELECT cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
ORDER BY opportunities.date_modified DESC
Results:
ABC Group
ABC Group
CBC Group
ABC Group
XYZ Group
But I want to group this to following order:
ABC Group
CBC Group
XYZ Group
To do that I added GROUP BY cg_groups.name
SELECT cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
GROUP BY cg_groups.name
ORDER BY opportunities.date_modified DESC
But now I get this error:
Msg 8127, Level 16, State 1, Line 10
Column "opportunities.date_modified" is invalid in the ORDER BY clause because it is not contained in either an aggregate function or the GROUP BY clause.
Someone please help me to solve this issue, thank you.
Use ROW_NUMBER to find the most recently updated record for each group:
WITH cte AS (
SELECT cg_groups.name, o.date_modified,
ROW_NUMBER() OVER (PARTITION BY o.date_modified DESC) rn
FROM cg_groups cg
INNER JOIN cg_groups_cstm cgc
ON cgc.id_c = cg.id
INNER JOIN accounts_cstm ac
ON cg.name = ac.client_group_c
INNER JOIN accounts a
ON a.id = ac.id_c
INNER JOIN accounts_opportunities ao
ON a.id = ao.account_id
INNER JOIN opportunities o
ON ao.opportunity_id = o.id
WHERE cg.deleted = '0' AND cgc.status_c = '1' AND o.deleted = '0'
)
SELECT name
FROM cte
WHERE rn = 1
ORDER BY date_modified DESC;
Note that this may not be exactly what you want. This answer returns a single record per name group which is the most recently updated for that group. It then orders all results descending, but maybe you want ascending.
put opportunities.date_modified in selection and group by then you can use that in order by
SELECT opportunities.date_modified,cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
GROUP BY cg_groups.name,opportunities.date_modified
ORDER BY opportunities.date_modified DESC
but for your result you can try like below just use distinct
SELECT distinct cg_groups.name
FROM cg_groups
JOIN cg_groups_cstm ON cg_groups_cstm.id_c = cg_groups.id
JOIN accounts_cstm ON cg_groups.name = accounts_cstm.client_group_c
JOIN accounts ON accounts.id = accounts_cstm.id_c
JOIN accounts_opportunities ON accounts.id = accounts_opportunities.account_id
JOIN opportunities ON accounts_opportunities.opportunity_id = opportunities.id
WHERE cg_groups.deleted='0' AND cg_groups_cstm.status_c='1' AND opportunities.deleted='0'
order by cg_groups.name
no group by need as you have not used any aggregate function
how about just adding distinct right after your SELECT statement .
Select distinct ... from ...

SQL remove duplicates with INNER JOIN

I have tried multiple things to get rid of all the duplicates in my query result, none of them worked. I tried DISTINCT and GROUP BY.
DISTINCT won't do anything at all and with GROUP BY I keep getting errors.
My Query:
SELECT
categorie.categorie_id AS categorie,
categorie.categorie_nummer,
categorie.naam,
product.product_id, product.product_naam,
foto.foto_id, foto.foto1,
item.prijs, item.item_id
FROM
((((((categorie
INNER JOIN
behoort_tot ON categorie.categorie_id = behoort_tot.categorie_id)
INNER JOIN
product ON behoort_tot.product_id = product.product_id)
INNER JOIN
heeft ON product.product_id = heeft.product_id)
INNER JOIN
foto ON heeft.foto_id = foto.foto_id)
INNER JOIN
is_een_1 ON product.product_id = is_een_1.product_id)
INNER JOIN
item ON is_een_1.item_id = item.item_id)
WHERE
(categorie.categorie_id = ?)
Thanks in advance
WITH TEMP AS
(
SELECT categorie.categorie_id AS categorie, categorie.categorie_nummer, categorie.naam,
product.product_id, product.product_naam, foto.foto_id, foto.foto1, item.prijs, item.item_id,
ROW_NUMBER()
OVER (PARTITION BY categorie.categorie_id ORDER BY categorie.categorie_id) As ROW_NO
FROM ((((((categorie INNER JOIN
behoort_tot ON categorie.categorie_id = behoort_tot.categorie_id)
INNER JOIN
product ON behoort_tot.product_id = product.product_id) INNER JOIN
heeft ON product.product_id = heeft.product_id) INNER JOIN
foto ON heeft.foto_id = foto.foto_id) INNER JOIN
is_een_1 ON product.product_id = is_een_1.product_id) INNER JOIN
item ON is_een_1.item_id = item.item_id)
)
SELECT * FROM TEMP WHERE ROW_NO = 1;
I do see that you are trying to sort out using the categories.Try this.I`m sorting it using the row_number function and assuming that you are using sql server

Sum record data into one

I have this query which returns qty in each of my branch. now the branch has two WH_subType as you see in the attached diagram i have attached. I want to sum the 2 subtype and show its available qty. how can i do it.
my select query is like this
SELECT
dbo.WarehouseType.name AS Section,
dbo.WarehouseSubType.name AS WH_Type,
dbo.WarehouseSubType1.name AS WH_SubType,
dbo.Branch.name AS Branch,
(dbo.WarehouseProductQuantity.actualQuantity - dbo.WarehouseProductQuantity.reservedQuantity) AS AvailQty,
dbo.WarehouseProductQuantity.tafsilId AS Tafsil,
dbo.Tafsil.description AS Product_Name
FROM
dbo.WarehouseSubType
INNER JOIN
dbo.WarehouseType
ON
(
dbo.WarehouseSubType.warehouseTypeId = dbo.WarehouseType.id)
INNER JOIN
dbo.WarehouseSubType1
ON
(
dbo.WarehouseSubType.id = dbo.WarehouseSubType1.warehouseSubTypeId)
INNER JOIN
dbo.Warehouse
ON
(
dbo.WarehouseSubType1.id = dbo.Warehouse.warehouseSubType1Id)
INNER JOIN
dbo.Branch
ON
(
dbo.Warehouse.branchId = dbo.Branch.id)
INNER JOIN
dbo.WarehouseProductQuantity
ON
(
dbo.Warehouse.id = dbo.WarehouseProductQuantity.warehouseId)
INNER JOIN
dbo.TafsilLink
ON
(
dbo.WarehouseProductQuantity.tafsilId = dbo.TafsilLink.sourceId)
INNER JOIN
dbo.Tafsil
ON
(
dbo.TafsilLink.targetId = dbo.Tafsil.id)
INNER JOIN
dbo.FinishProduct
ON
(
dbo.Tafsil.id = dbo.FinishProduct.tafsilId)
INNER JOIN
dbo.Supplier
ON
(
dbo.FinishProduct.supplierId = dbo.Supplier.tafsilId)
WHERE
WarehouseSubType1.warehouseSubTypeId IN (1,4)
group by dbo.WarehouseProductQuantity.tafsilId
Have you tried a group by
SELECT SubType, SUM(qty) AS QtySum
GROUP BY SubType
Every grouped by column should be in your select. Note: for every column you group by it further sub divides the data
Update based on OP comment:
If you want other columns you need to do something like
SELECT s.WH_SubType,s.AvailQty, t.other_cols
from
(SELECT
dbo.WarehouseSubType1.name AS WH_SubType,
sum(dbo.WarehouseProductQuantity.actualQuantity - dbo.WarehouseProductQuantity.reservedQuantity) AS AvailQty
FROM
table
GROUP BY
dbo.WarehouseSubType1.name) s
left join table t on t.dbo.WarehouseSubType1.name = s.WH_SubType;
For reference see this question: How do I use "group by" with three columns of data?
UPDATE 2:
SELECT
dbo.WarehouseType.name AS Section,
dbo.WarehouseSubType.name AS WH_Type,
dbo.WarehouseSubType1.name AS WH_SubType,
dbo.Branch.name AS Branch,
SumTable.AvailQty,
SumTable.Tafsil,
dbo.Tafsil.description AS Product_Name
FROM
dbo.WarehouseSubType
INNER JOIN
dbo.WarehouseType
ON
(
dbo.WarehouseSubType.warehouseTypeId = dbo.WarehouseType.id)
INNER JOIN
dbo.WarehouseSubType1
ON
(
dbo.WarehouseSubType.id = dbo.WarehouseSubType1.warehouseSubTypeId)
INNER JOIN
dbo.Warehouse
ON
(
dbo.WarehouseSubType1.id = dbo.Warehouse.warehouseSubType1Id)
INNER JOIN
dbo.Branch
ON
(
dbo.Warehouse.branchId = dbo.Branch.id)
INNER JOIN
dbo.WarehouseProductQuantity
ON
(
dbo.Warehouse.id = dbo.WarehouseProductQuantity.warehouseId)
INNER JOIN
dbo.TafsilLink
ON
(
dbo.WarehouseProductQuantity.tafsilId = dbo.TafsilLink.sourceId)
INNER JOIN
dbo.Tafsil
ON
(
dbo.TafsilLink.targetId = dbo.Tafsil.id)
INNER JOIN
dbo.FinishProduct
ON
(
dbo.Tafsil.id = dbo.FinishProduct.tafsilId)
LEFT JOIN (SELECT
sum(dbo.WarehouseProductQuantity.actualQuantity - dbo.WarehouseProductQuantity.reservedQuantity) AS AvailQty,
dbo.WarehouseProductQuantity.tafsilId AS Tafsil
FROM
dbo.WarehouseProductQuantity
group by dbo.WarehouseProductQuantity.tafsilId) SumTable on dbo.Tafsil.id = SumTable.Tafsil
WHERE
WarehouseSubType1.warehouseSubTypeId IN (1,4)
You need to do something like
SELECT SUM(AvailQty), ... FROM ... WHERE ... GROUP BY WH_SubType
http://www.w3schools.com/sql/sql_func_sum.asp
http://www.w3schools.com/sql/sql_groupby.asp

SQL Server, insert value to variable and sort

I need to sort the results of a query after insert a value to a variable.
I am trying to sort according to 'RowId' but its not valid in my case.
Below is my query, how can I make it work?
Thanks.
SELECT TOP 1 #NumOfProducts = ROW_NUMBER() OVER(ORDER BY Products.Id) AS RowId
FROM Cities INNER JOIN
CitiesInLanguages ON Cities.Id = CitiesInLanguages.CityId INNER JOIN
ShopsInCities ON Cities.Id = ShopsInCities.CityId INNER JOIN
Categories INNER JOIN
ProductstInCategories ON Categories.Id = ProductstInCategories.CategoryId INNER JOIN
Products ON ProductstInCategories.ProductId = Products.Id INNER JOIN
ProductsInProdutGroup ON Products.Id = ProductsInProdutGroup.ProductId INNER JOIN
ProductsGroups ON ProductsInProdutGroup.ProductGroupId = ProductsGroups.Id INNER JOIN
ShopsInProductsGroup ON ProductsGroups.Id = ShopsInProductsGroup.ProductGroupId INNER JOIN
aspnet_Users ON ShopsInProductsGroup.ShopId = aspnet_Users.UserId ON ShopsInCities.ShopId = aspnet_Users.UserId INNER JOIN
ProductsNamesInLanguages ON Products.Id = ProductsNamesInLanguages.ProductId INNER JOIN
UsersInfo ON aspnet_Users.UserId = UsersInfo.UserId INNER JOIN
ProductOptions ON Products.Id = ProductOptions.ProductId INNER JOIN
ProductOptionsInLanguages ON ProductOptions.Id = ProductOptionsInLanguages.ProductOptionId INNER JOIN
ProductFiles ON Products.Id = ProductFiles.ProductId INNER JOIN
ProductsInOccasions ON Products.Id = ProductsInOccasions.ProductId INNER JOIN
Occasions ON ProductsInOccasions.OccasionId = Occasions.Id INNER JOIN
OccasionsInLanguages ON Occasions.Id = OccasionsInLanguages.OccasionId
WHERE (Products.IsAddition = 0) AND (Categories.IsEnable = 1) AND (Products.IsEnable = 1) AND (ProductsGroups.IsEnable = 1) AND (Cities.IsEnable = 1) AND
(ShopsInProductsGroup.IsEnable = 1) AND (CitiesInLanguages.CityName = #CityName) AND (ProductsNamesInLanguages.LanguageId = #languageId) AND
(Categories.Id = #CategoryId) AND (ProductOptions.IsEnable = 1) AND (ProductFiles.IsEnable = 1)
group by Products.Id, ProductsNamesInLanguages.ProductName, UsersInfo.Name
Order By RowId
With edit try this:
SELECT TOP 1 #NumOfProducts = ROW_NUMBER() OVER(ORDER BY Products.Id),
ROW_NUMBER() OVER(ORDER BY Products.Id) AS RowId
or try
ORDER BY ROW_NUMBER() OVER(ORDER BY Products.Id)
I'd have to test but I thik both will work.
The problem is that rowid is not in any of the group by items.
You could order by Products.id. If rowid is going to be the same for each one you could order by max(rowid) or min(rowid) or add rowid to the group by statement.
Are you trying to find the ID of the most recently inserted row? You want
SELECT Scope_Identity()
Edit
*I am trying to get the max row id of ROW_NUMBER()*
Wrap your query in
SELECT #NumOfProducts = Max(RowID) FROM
( [your query here] ) v
Alternately, a SELECT COUNT... query may provide the answer