SQL Select Distinct - sql

Working on a query here, but I'm getting multiple Sub Items in my Sub Item column. I want to adjust my query to show only a single SubItem. The PO table that I have may or may not have multiple subitems, and that's why it sometimes shows many subitems.
SELECT
dbo.MasterTable.StartItem,
dbo.MasterTable.SubItem,
dbo.MasterTable.STDCOST,
ISNULL(dbo.PO_2_months.[Purchase Price], dbo.MasterTable.STDCOST) AS NewCost,
dbo.PO_2_months.[Purchase Price]
FROM dbo.MasterTable LEFT OUTER JOIN
dbo.PO_2_months ON dbo.MasterTable.SubItem = dbo.PO_2_months.Item
GROUP BY
dbo.MasterTable.SubItem,
dbo.MasterTable.STDCOST,
ISNULL(dbo.PO_2_months.[Purchase Price], dbo.MasterTable.STDCOST),
dbo.MasterTable.StartItem,
dbo.PO_2_months.[Purchase Price]
HAVING (dbo.MasterTable.StartItem = 'FO6534')

Each result in that list is distinct. There may be duplicates of some fields, but there are no two exactly identical records in that result set.
So, if you only want one result for sub item 0S331500G0, for instance, you need to specify what record you want to keep, and which ones you want to discard. Do you want the 0S331500G0 item in your result set to report a STDCOST of 0.6295 or 0.6470? Or the average thereof?
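As long as you specify that, you will be able to get one row per sub item. Note that once aggregate functions are involved, it is a GROUP BY on the non-aggregated columns (rather than DISTINCT) that collapses the rows:
SELECT
dbo.MasterTable.StartItem,
dbo.MasterTable.SubItem,
AVG(dbo.MasterTable.STDCOST) AS STDCOST,
AVG(ISNULL(dbo.PO_2_months.[Purchase Price], dbo.MasterTable.STDCOST)) AS NewCost,
AVG(dbo.PO_2_months.[Purchase Price]) AS [Purchase Price]
...
GROUP BY
dbo.MasterTable.StartItem,
dbo.MasterTable.SubItem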
MIN(), MAX() and SUM() are other examples of Aggregate Functions you may want to use to specify how the data is to be handled where items are not distinct.

SELECT DISTINCT won't have any effect on what you already have. The reason is that your subitems, despite having the same ID, aren't distinct: their other columns differ. You need to make a call on how you choose which subitem row is the one to keep. For example, an easy option would be to take the maximum stdcost, newcost and Purchase Price (using the MAX aggregate function); but "easy" doesn't necessarily mean "correct"!
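To see the difference concretely, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The two STDCOST values for 0S331500G0 come from the question; the second sub item (0S331600G0) and its cost are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MasterTable (StartItem TEXT, SubItem TEXT, STDCOST REAL)")
conn.executemany(
    "INSERT INTO MasterTable VALUES (?, ?, ?)",
    [
        ("FO6534", "0S331500G0", 0.6295),
        ("FO6534", "0S331500G0", 0.6470),
        ("FO6534", "0S331600G0", 1.25),  # hypothetical extra sub item
    ],
)

# DISTINCT compares whole rows, so both 0S331500G0 rows survive.
distinct_rows = conn.execute(
    "SELECT DISTINCT StartItem, SubItem, STDCOST FROM MasterTable "
    "ORDER BY SubItem, STDCOST"
).fetchall()

# GROUP BY plus an aggregate yields exactly one row per SubItem.
grouped_rows = conn.execute(
    "SELECT StartItem, SubItem, MAX(STDCOST) FROM MasterTable "
    "GROUP BY StartItem, SubItem ORDER BY SubItem"
).fetchall()

print(len(distinct_rows), len(grouped_rows))
```

Swapping MAX for MIN or AVG changes which cost is reported, which is exactly the decision you have to make.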

It depends on how you want to determine which of the many sub-items to select.
In your case, your MasterTable already has the 1:many relationship between StartItem and SubItem. This means that you could simply pre-process that table and work from there...
SELECT
dbo.MasterTable.StartItem,
dbo.MasterTable.SubItem,
dbo.MasterTable.STDCOST,
ISNULL(dbo.PO_2_months.[Purchase Price], dbo.MasterTable.STDCOST) AS NewCost,
dbo.PO_2_months.[Purchase Price]
FROM
(
SELECT StartItem, MIN(SubItem) AS SubItem FROM dbo.MasterTable WHERE StartItem = 'FO6534' GROUP BY StartItem
)
AS reduced_MasterTable
INNER JOIN
dbo.MasterTable
ON MasterTable.StartItem = reduced_MasterTable.StartItem
AND MasterTable.SubItem = reduced_MasterTable.SubItem
LEFT OUTER JOIN
dbo.PO_2_months
ON dbo.MasterTable.SubItem = dbo.PO_2_months.Item
This assumes a 1:1 relationship between the records in MasterTable and PO_2_Months. You'll want to put the GROUP BY back in if it's 1:many, and probably use SUM() or MAX() or something on dbo.PO_2_months.[Purchase Price].
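A rough, runnable sketch of the same pre-processing idea, using Python's sqlite3 as a stand-in (so COALESCE replaces ISNULL and the dbo. prefix is dropped). The sub items A100, B200, C300, the second start item GX0001, and all the prices are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MasterTable (StartItem TEXT, SubItem TEXT, STDCOST REAL);
CREATE TABLE PO_2_months (Item TEXT, [Purchase Price] REAL);
INSERT INTO MasterTable VALUES
    ('FO6534', 'B200', 2.0),
    ('FO6534', 'A100', 1.0),
    ('GX0001', 'C300', 3.0);
INSERT INTO PO_2_months VALUES ('A100', 0.9);
""")

# The derived table keeps one SubItem per StartItem (the lowest one here),
# then joins back to MasterTable to recover that row's other columns.
QUERY = """
SELECT m.StartItem, m.SubItem, m.STDCOST,
       COALESCE(p.[Purchase Price], m.STDCOST) AS NewCost,
       p.[Purchase Price]
FROM (
    SELECT StartItem, MIN(SubItem) AS SubItem
    FROM MasterTable
    WHERE StartItem = ?
    GROUP BY StartItem
) AS reduced
INNER JOIN MasterTable AS m
    ON m.StartItem = reduced.StartItem AND m.SubItem = reduced.SubItem
LEFT OUTER JOIN PO_2_months AS p
    ON m.SubItem = p.Item
"""

with_po = conn.execute(QUERY, ("FO6534",)).fetchall()
without_po = conn.execute(QUERY, ("GX0001",)).fetchall()
print(with_po, without_po)
```

When no purchase order matches, NewCost falls back to STDCOST and the purchase price comes back as NULL (None in Python).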

Related

MS Access 2013, How to add totals row within SQL

I'm in need of some assistance. I have searched and not found what I'm looking for. I have an assignment for school that requires me to use SQL. I have a query that pulls some columns from two tables:
SELECT Course.CourseNo, Course.CrHrs, Sections.Yr, Sections.Term, Sections.Location
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring";
I need to add a Totals row at the bottom to count the CourseNo and Sum the CrHrs. It has to be done through SQL query design as I need to paste the code. I know it can be done with the datasheet view but she will not accept that. Any advice?
To accomplish this, you can union your query together with an aggregation query. It's not clear from your question which columns you are trying to get "Totals" from, but here's an example of what I mean, using your query and getting counts of each column (a kind of useless example, but you should be able to apply it to what you are doing):
SELECT
[Course].[CourseNo]
, [Course].[CrHrs]
, [Sections].[Yr]
, [Sections].[Term]
, [Sections].[Location]
FROM
[Course]
INNER JOIN [Sections] ON [Course].[CourseNo] = [Sections].[CourseNo]
WHERE [Sections].[Term] = "spring"
UNION ALL
SELECT
"TOTALS"
, SUM([Course].[CrHrs])
, count([Sections].[Yr])
, Count([Sections].[Term])
, Count([Sections].[Location])
FROM
[Course]
INNER JOIN [Sections] ON [Course].[CourseNo] = [Sections].[CourseNo]
WHERE [Sections].[Term] = "spring"
You can prepare your "total" query separately, and then output both query results together with "UNION".
It might look like:
SELECT Course.CourseNo, Course.CrHrs, Sections.Yr, Sections.Term, Sections.Location
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring"
UNION ALL
SELECT "Total", SUM(Course.CrHrs), COUNT(Sections.Yr), COUNT(Sections.Term), COUNT(Sections.Location)
FROM Course
INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term="spring";
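For anyone who wants to try the union approach outside Access, here is a small sketch with Python's sqlite3. The course numbers, credit hours and locations are invented, and the is_total column is my own addition (not part of the original query) so that an ORDER BY can pin the totals row to the bottom instead of relying on the order the engine happens to return.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Course (CourseNo TEXT, CrHrs INTEGER);
CREATE TABLE Sections (CourseNo TEXT, Yr INTEGER, Term TEXT, Location TEXT);
INSERT INTO Course VALUES ('CS101', 3), ('MA201', 4);
INSERT INTO Sections VALUES
    ('CS101', 2013, 'spring', 'Main'),
    ('MA201', 2013, 'spring', 'North'),
    ('CS101', 2013, 'fall', 'Main');
""")

# Detail rows first (is_total = 0), then one aggregated row (is_total = 1).
rows = conn.execute("""
SELECT c.CourseNo AS CourseNo, c.CrHrs AS CrHrs, 0 AS is_total
FROM Course AS c
INNER JOIN Sections AS s ON c.CourseNo = s.CourseNo
WHERE s.Term = 'spring'
UNION ALL
SELECT 'TOTALS', SUM(c.CrHrs), 1
FROM Course AS c
INNER JOIN Sections AS s ON c.CourseNo = s.CourseNo
WHERE s.Term = 'spring'
ORDER BY is_total, CourseNo
""").fetchall()

for row in rows:
    print(row)
```

UNION ALL is used rather than UNION because plain UNION removes duplicate rows, which could silently drop legitimate detail rows.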
Whilst you can certainly union the aggregated totals query to the end of your original query, in my opinion this would be really bad practice and would be undesirable for any real-world application.
Consider that the resulting query could no longer be used for any meaningful analysis of the data: if displayed in a datagrid, the user would not be able to sort the data without the totals row being interspersed amongst the rest of the data; the user could no longer use the built-in Totals option to perform their own aggregate operation, and the insertion of a row only identifiable by the term totals could even conflict with other data within the set.
Instead, I would suggest displaying the totals within an entirely separate form control, using a separate query such as the following (based on your own example):
SELECT Count(Course.CourseNo) as Courses, Sum(Course.CrHrs) as Hours
FROM Course INNER JOIN Sections ON Course.CourseNo = Sections.CourseNo
WHERE Sections.Term = "spring";
However, since CrHrs is a field in your Course table and not in your Sections table, the above may yield a multiple of the desired result, with each course's hours counted once for every corresponding record in the Sections table.
If this is the case, the following may be more suitable:
SELECT Count(Course.CourseNo) as Courses, Sum(Course.CrHrs) as Hours
FROM
Course INNER JOIN
(SELECT DISTINCT s.CourseNo FROM Sections s WHERE s.Term = "spring") q
ON Course.CourseNo = q.CourseNo
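The double counting is easy to reproduce. Below is a minimal sketch in Python's sqlite3 with made-up data in which CS101 has two spring sections; the table layout follows the question, but every value is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Course (CourseNo TEXT, CrHrs INTEGER);
CREATE TABLE Sections (CourseNo TEXT, Yr INTEGER, Term TEXT, Location TEXT);
INSERT INTO Course VALUES ('CS101', 3), ('MA201', 4);
INSERT INTO Sections VALUES
    ('CS101', 2013, 'spring', 'Main'),
    ('CS101', 2013, 'spring', 'North'),
    ('MA201', 2013, 'spring', 'Main');
""")

# Joining straight to Sections counts CS101 once per section.
naive = conn.execute("""
SELECT COUNT(c.CourseNo), SUM(c.CrHrs)
FROM Course AS c
INNER JOIN Sections AS s ON c.CourseNo = s.CourseNo
WHERE s.Term = 'spring'
""").fetchone()

# Reducing Sections to distinct course numbers first avoids the fan-out.
deduplicated = conn.execute("""
SELECT COUNT(c.CourseNo), SUM(c.CrHrs)
FROM Course AS c
INNER JOIN (
    SELECT DISTINCT s.CourseNo FROM Sections AS s WHERE s.Term = 'spring'
) AS q ON c.CourseNo = q.CourseNo
""").fetchone()

print(naive, deduplicated)
```

Which of the two is "right" depends on whether the assignment wants totals per section offered or per distinct course.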

MS Access - Summing up a field to be used in another query is "duplicating" data

I am trying to sum up one field and use it in another query, but when I use the Totals button and then call that sum from the other query it considers that field as multiple instances but with the sum value in each one. How can I sum two fields in two different queries and then use those sums in another query? Note - I only separated them into 3 queries because I felt it would help me avoid "is not part of an aggregate function" errors.
Example Data
Inventory Query: This query groups by item and sums the qty_on_hand field
Item SumOfqty_on_hand
A 300
Job Material query: This query groups on the job's materials and sums up the qty_req field (quantity required to complete the job)
Item SumOfqty_req
A 500
When I make a third query to do the calculation [SumOfqty_req]-[SumOfqty_on_hand] the query does the calculation but for each record in the Job Material query.
Job Material Query
SELECT dbo_jobmatl.item,
IIf(([qty_released]-[qty_complete])<0,0,([qty_released]-[qty_complete]))*[matl_qty] AS qty_req
FROM new_BENInventory
INNER JOIN (dbo_jobmatl
INNER JOIN new_BENJobs
ON (new_BENJobs.suffix = dbo_jobmatl.suffix)
AND (dbo_jobmatl.job = new_BENJobs.job)
) ON new_BENInventory.item = dbo_jobmatl.item
GROUP BY dbo_jobmatl.item,
IIf(([qty_released]-[qty_complete])<0,0,([qty_released]-[qty_complete]))*[matl_qty];
Inventory Query
SELECT dbo_ISW_LPItem.item,
Sum(dbo_ISW_LPItem.qty_on_hand) AS SumOfqty_on_hand,
dbo_ISW_LP.whse,
dbo_ISW_LPItem.hold_flag
FROM (dbo_ISW_LP INNER JOIN dbo_ISW_LPItem
ON dbo_ISW_LP.lp_num = dbo_ISW_LPItem.lp_num)
INNER JOIN dbo_ISW_LPLot
ON (dbo_ISW_LPItem.lp_num = dbo_ISW_LPLot.lp_num)
AND (dbo_ISW_LPItem.item = dbo_ISW_LPLot.item)
AND (dbo_ISW_LPItem.qty_on_hand = dbo_ISW_LPLot.qty_on_hand)
GROUP BY dbo_ISW_LPItem.item,
dbo_ISW_LP.whse,
dbo_ISW_LPItem.hold_flag
HAVING (((Sum(dbo_ISW_LPItem.qty_on_hand))>0)
AND ((dbo_ISW_LP.whse) Like "BEN")
AND ((dbo_ISW_LPItem.hold_flag) Like 0));
Third Query
SELECT new_BENJobItems.item,
[qty_req]-[SumOfqty_on_hand] AS [Transfer QTY]
FROM new_BENInventory
INNER JOIN new_BENJobItems
ON new_BENInventory.item = new_BENJobItems.item;
Please note that anything that starts with dbo_ is a prefix for a table that sources the original data.
If any more clarification is needed I would be more than happy to provide it.
Looks like you need a GROUP BY new_BENJobItems.item on your final query along with a SUM() on the quantity. Or to remove the IIf(([qty_released]-[qty_complete])<0,0,([qty_released]-[qty_complete]))*[matl_qty] from your Job Material query. Or both. As written, the Job Material Query is going to return a record for every different key value in the joined input tables that has a distinct quantity, which doesn't seem like the granularity you want for that.
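A small sketch of the granularity problem, using Python's sqlite3. The totals (500 required, 300 on hand for item A) come from the question; splitting the 500 into two job rows of 200 and 300 is an assumption, and the table names job_req and inventory are simplified stand-ins for the saved queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE job_req (item TEXT, qty_req INTEGER);
CREATE TABLE inventory (item TEXT, SumOfqty_on_hand INTEGER);
INSERT INTO job_req VALUES ('A', 200), ('A', 300);
INSERT INTO inventory VALUES ('A', 300);
""")

# Without aggregation, the subtraction runs once per job material row.
per_row = conn.execute("""
SELECT j.item, j.qty_req - i.SumOfqty_on_hand AS transfer_qty
FROM inventory AS i
INNER JOIN job_req AS j ON i.item = j.item
ORDER BY transfer_qty
""").fetchall()

# Summing the requirement per item first gives one row per item.
per_item = conn.execute("""
SELECT j.item, SUM(j.qty_req) - i.SumOfqty_on_hand AS transfer_qty
FROM inventory AS i
INNER JOIN job_req AS j ON i.item = j.item
GROUP BY j.item, i.SumOfqty_on_hand
""").fetchall()

print(per_row, per_item)
```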

MS access query aggregation

I am trying to get query like this
SELECT sales.action_date, sales.item_id, items.item_name,
sales.item_quantity, sales.item_price, sales.net
FROM sales INNER JOIN items ON sales.item_id = items.ID
GROUP BY sales.item_id
HAVING (((sales.action_date)=[Forms]![rep_frm]![Text13].[value]));
Every time I try to show data this message show
your query does not include the specified expression ' action date '
as part of aggregate function.
The same message appears for every field in the query, but I just want the aggregation to be by item_id.
What should I do?
You don't have any aggregations like SUM in your SELECT statement. I also don't understand why sales.action_date is in the HAVING clause. HAVING is for filtering on aggregates, such as SUM(sales.item_price) <> 0. It should be possible to put this condition in the WHERE clause, before the GROUP BY, instead of in the HAVING clause.
This example should work:
SELECT sales.item_id, items.item_name, SUM(sales.item_quantity),
SUM(sales.item_price), SUM(sales.net)
FROM sales INNER JOIN items ON sales.item_id = items.ID
WHERE sales.action_date=[Forms]![rep_frm]![Text13].[value]
GROUP BY sales.item_id, items.item_name;
When you are grouping your data, every field in the SELECT list should either be included in the GROUP BY clause or have an aggregate function applied to it; otherwise it doesn't make sense.
By the way, as far as I can see, you should use WHERE(((sales.action_date)=[Forms]![rep_frm]![Text13].[value])) before the GROUP BY, not HAVING after it.
If you want to aggregate by date you have to put the date in the GROUP BY clause
SELECT sales.action_date,
SUM(sales.item_quantity),
SUM(sales.item_quantity * sales.item_price) as Total,
SUM(sales.net)
FROM sales
INNER JOIN items ON sales.item_id = items.ID
WHERE (((sales.action_date)=[Forms]![rep_frm]![Text13].[value]))
GROUP BY sales.action_date;
Only the columns you want to group by should appear in the GROUP BY clause, and only those columns can appear in the SELECT clause outside of aggregate functions.
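Here is a rough sketch of the WHERE-then-GROUP BY pattern in Python's sqlite3. The form reference [Forms]![rep_frm]![Text13] is replaced by a query parameter, and all dates, item names and amounts are invented. One caveat: SQLite tolerates bare columns in grouped queries, so it will not reproduce the Access error message; the point is only the shape of the working query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE items (ID INTEGER, item_name TEXT);
CREATE TABLE sales (action_date TEXT, item_id INTEGER,
                    item_quantity INTEGER, item_price REAL, net REAL);
INSERT INTO items VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO sales VALUES
    ('2024-01-05', 1, 2, 10.0, 20.0),
    ('2024-01-05', 1, 3, 10.0, 30.0),
    ('2024-01-05', 2, 1, 5.0, 5.0),
    ('2024-01-06', 1, 4, 10.0, 40.0);
""")

# WHERE filters individual rows before grouping; every selected column is
# either in the GROUP BY list or wrapped in an aggregate.
rows = conn.execute("""
SELECT sales.item_id, items.item_name,
       SUM(sales.item_quantity), SUM(sales.net)
FROM sales
INNER JOIN items ON sales.item_id = items.ID
WHERE sales.action_date = ?
GROUP BY sales.item_id, items.item_name
ORDER BY sales.item_id
""", ("2024-01-05",)).fetchall()

print(rows)
```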

sql SUM value incorrect when using joins and group by

I'm writing a query that sums order values broken down by product groups. The problem is that when I add joins, the aggregated SUM gets greatly inflated. I assume it's because it's adding in duplicate rows. I'm kind of new to SQL, but I think I need to construct the query with sub-selects or nested joins?
All data returns as expected, and my joins pull out the needed data, but the SUM(inv.item_total) AS Value returned is much higher than it should be. SQL below:
SELECT so.Company_id, SUM(inv.item_total) AS Value, co.company_name,
agents.short_desc, stock_type.short_desc AS Type
FROM SORDER as so
JOIN company AS co ON co.company_id = so.company_id
JOIN invoice AS inv ON inv.Sorder_id = so.Sorder_id
JOIN sorder_item AS soitem ON soitem.sorder_id = so.Sorder_id
JOIN STOCK AS stock ON stock.stock_id = soitem.stock_id
JOIN stock_type AS stock_type ON stock_type.stype_id = stock.stype_id
JOIN AGENTS AS AGENTS ON agents.agent_id = co.agent_id
WHERE
co.last_ordered >'01-JAN-2012' and so.Sotype_id='1'
GROUP BY so.Company_id,co.company_name,agents.short_desc, stock_type.short_desc
Any guidance on how I should structure this query to pull out an "un-duplicated" SUM(inv.item_total) AS Value would be much appreciated.
To get an accurate sum, you want only the joins that are needed. So, this version should work:
SELECT so.Company_id, SUM(inv.item_total) AS Value, co.company_name
FROM SORDER so JOIN
company co
ON co.company_id = so.company_id JOIN
invoice inv
ON inv.Sorder_id = so.Sorder_id
group by so.Company_id, co.company_name
You can then add in one join at a time to see where the multiplication is taking place. I'm guessing it has to do with the agents.
It sounds like the joins are not accurate.
First suspect join
For example, would an agent be per company, or per invoice?
If it is per order, then should the join be something along the lines of
JOIN AGENTS AS AGENTS ON agents.agent_id = inv.agent_id
Second suspect join
Can one order have many items, and many invoices at the same time? That can cause problems as well. Say an order has 3 items and 3 invoices were sent out. According to your joins, each item will show up 3 times (once per invoice), which means a total of 9 rows where there should be only 3. You may need to eliminate the invoices table.
Possible way to solve this on your own:
I would remove all the grouping and sums, and see whether filtering by one invoice produces a unique set of rows for all the data.
Start with an invoice that has just one item and inspect your result set for accuracy. If that works, then add another invoice that has multiple and check the rows to see if you get your perfect dataset back. If not, then the columns that have repeating values (Company Name, Item Name, Agent Name, etc) are usually a good starting point for checking up on why the duplicates are showing up.
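The fan-out described above can be reproduced with a tiny data set. This is only a sketch in Python's sqlite3: the table and column names follow the question, but the single order with one invoice of 100.0 and three order items is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sorder (sorder_id INTEGER, company_id INTEGER);
CREATE TABLE invoice (sorder_id INTEGER, item_total REAL);
CREATE TABLE sorder_item (sorder_id INTEGER, stock_id INTEGER);
INSERT INTO sorder VALUES (1, 10);
INSERT INTO invoice VALUES (1, 100.0);
INSERT INTO sorder_item VALUES (1, 501), (1, 502), (1, 503);
""")

# Joining order items repeats the invoice row once per item.
inflated = conn.execute("""
SELECT so.company_id, COUNT(*), SUM(inv.item_total)
FROM sorder AS so
JOIN invoice AS inv ON inv.sorder_id = so.sorder_id
JOIN sorder_item AS soitem ON soitem.sorder_id = so.sorder_id
GROUP BY so.company_id
""").fetchall()

# With only the joins the sum needs, each invoice is counted once.
correct = conn.execute("""
SELECT so.company_id, COUNT(*), SUM(inv.item_total)
FROM sorder AS so
JOIN invoice AS inv ON inv.sorder_id = so.sorder_id
GROUP BY so.company_id
""").fetchall()

print(inflated, correct)
```

Adding COUNT(*) next to the SUM, as done here, is a quick way to spot which join multiplies the rows.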

Use of the HAVING clause when using multiple sums

I was having a problem getting multiple sums from multiple tables. Short story: my question was answered in the "sql sum data from multiple tables" thread on this site. Where it came up short is that I'd now like to show only sums that are greater than a certain amount. So while I have sub-selects in my select, I think I need to use a HAVING clause to filter out the summed amounts that are too low.
Example, using the code specified in the link above (more specifically the answer that the owner has chosen as correct), I would only like to see a query result if SUM(AP2.Value) > 1500. Any thoughts?
If you need to filter on the results of ANY aggregate function, you MUST use a HAVING clause (or an outer query, see below). WHERE is applied at the row level as the DB scans the tables for matching rows. HAVING is applied after grouping, basically just before the result set is sent out to the client. At the time WHERE operates, the aggregate function results are not (and cannot be) available, so you have to use a HAVING clause, which is applied after the main query is complete and all aggregate results are available.
So... long story short, yes, you'll need to do
SELECT ...
FROM ...
WHERE ...
HAVING (SUM_AP > 1500)
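Note that some databases (MySQL, for example) let you use column aliases in the HAVING clause, while others (SQL Server, for example) require you to repeat the aggregate expression there. In technical terms, HAVING on a query as above works basically the same as wrapping the initial query in another query and applying another WHERE clause on the wrapper: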
SELECT *
FROM (
SELECT ...
) AS child
WHERE (SUM_AP > 1500)
You could wrap that query as a subselect and then specify your criteria in the WHERE clause:
SELECT
PROJECT,
SUM_AP,
SUM_INV
FROM (
SELECT
AP1.[PROJECT],
(SELECT SUM(AP2.Value) FROM AP AS AP2 WHERE AP2.PROJECT = AP1.PROJECT) AS SUM_AP,
(SELECT SUM(INV2.Value) FROM INV AS INV2 WHERE INV2.PROJECT = AP1.PROJECT) AS SUM_INV
FROM AP AS AP1
INNER JOIN INV AS INV1 ON
AP1.[PROJECT] = INV1.[PROJECT]
WHERE
AP1.[PROJECT] = 'XXXXX'
GROUP BY
AP1.[PROJECT]
) SQ
WHERE
SQ.SUM_AP > 1500
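To check that the two forms really are interchangeable, here is a short sketch in Python's sqlite3. The AP table and its PROJECT and Value columns come from the question; the project names P1 and P2 and the amounts are invented, and the query is simplified to a single table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE AP (PROJECT TEXT, Value INTEGER);
INSERT INTO AP VALUES ('P1', 1000), ('P1', 800), ('P2', 500), ('P2', 400);
""")

# Filter on the aggregate directly with HAVING (the aggregate expression is
# repeated rather than referenced by alias, which is the portable form).
with_having = conn.execute("""
SELECT PROJECT, SUM(Value) AS SUM_AP
FROM AP
GROUP BY PROJECT
HAVING SUM(Value) > 1500
""").fetchall()

# Same filter expressed as a WHERE on a wrapping query, where the alias
# SUM_AP is an ordinary column.
with_wrapper = conn.execute("""
SELECT PROJECT, SUM_AP
FROM (
    SELECT PROJECT, SUM(Value) AS SUM_AP
    FROM AP
    GROUP BY PROJECT
) AS child
WHERE SUM_AP > 1500
""").fetchall()

print(with_having, with_wrapper)
```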