GROUP BY/SELECT DISTINCT issues - sql

I have gone through dozens of questions on this site and others, but am still having issues adding a GROUP BY to my code. One came close, but not sure I fully understand it:
SQL/mysql - Select distinct/UNIQUE but return all columns?
I'm working on a program that tracks tracking numbers for orders shipped, but need each sales order only listed once as I need to assign a status to each order. PackageReferenceNo1 contains the sales order numbers, but I only need one tracking number for each order. The following code pulls all tracking numbers and their sales orders. My thought was if I could SELECT DISTINCT, or GROUP BY, I could resolve this issue but haven't had any luck with either option.
My working code is:
SELECT tblImportUPS.ManifestDate,
tblImportUPS.TrackingNumber,
tblImportUPS.PackageReferenceNo1,
tblImportUPS.PackageReferenceNo2,
tblImportUPS.STATUS,
tblTrackingLinks.Notes,
tblImportUPS.ShipToName,
tblTrackingLinks.PGId,
tblTrackingLinks.BBReleased,
tblTrackingLinks.BBReleaseDate,
tblTrackingLinks.SRFNumber,
tblImportUPS.ShipToCity,
tblImportUPS.[ShipToState/Province],
tblImportUPS.ShipToCountry,
tblImportUPS.ShipperName,
tblImportUPS.NumberofPieces,
tblImportUPS.Weight,
tblImportUPS.ScheduledDelivery,
tblImportUPS.DateDelivered
FROM tblTrackingLinks
INNER JOIN tblImportUPS ON (tblTrackingLinks.TrackingNumber = tblImportUPS.TrackingNumber)
AND (tblTrackingLinks.TrackingNumber = tblImportUPS.TrackingNumber)
WHERE (((tblImportUPS.PackageReferenceNo1) LIKE "*56*")
AND ((tblImportUPS.STATUS) <> "Void"))
ORDER BY tblImportUPS.ManifestDate DESC;
This works fine, but repeats multiple tracking numbers for each order. I try to add GROUP BY tblImportUPS.PackageReferenceNo1 before the ORDER BY line and receive an aggregate error.
Can anyone advise the correct way to proceed and why? I'd prefer to understand and not just receive the solution.

Related

Access 2013 SQL to perform linear interpolation where necessary

I have a database in which there are 13 different products, sold in 6 different countries.
Prices increase once a year.
Prices need to be calculated using a linear interpolation method.  I have 21 different price and quantity increments for each product for each country for each year.
The user needs to be able to see how much an order would cost for any given value (as you would expect).
What the database needs to do (in English!) is to:
If there is a matching quantity from TblOrderDetail in the TblPrices,
use the price for the current product, country and year
if there isn't a matching quantity but the quantity required is greater than 1000 for one product (GT) and greater than 100 for every other product:
Find the highest quantity for the product, country and year (so, 1000 or 100, depending on the product), and calculate a pro-rated price.  eg.  If someone wanted 1500 of product GT for the UK for 2015, we'd look at the price for 1000 GT in the UK for 2015 and multiply it by 1.5.  If 1800 were required, we'd multiply it by 1.8.  I haven't been able to get this working yet as I'm looking at it alongside the formula for the next possibility...
If there isn't a matching quantity and the quantity required is less than 1000 for the product GT but 100 for the other products (this is the norm)...
Find the quantity and price for the increment directly below the quantity required by the user for the required product, country and year (let's call these quantitybelow and pricebelow)
Find the quantity and price for the increment directly above the quantity required by the user for the required product, country and year (let's call these quantityabove and priceabove)
Calculate the price for the required number of products for an account holder in a particular country for a given year using this formula.
ActualPrice: PriceBelow + ((PriceAbove - PriceBelow) * (The quantity required in the order detail - QuantityBelow) / (QuantityAbove - QuantityBelow))
I have spent days on this and have sought advice about this before but I am still getting very stuck.
The tables I've been working with to try and make this work are as follows:
TblAccount (primary key is AccountID, it also has a Country field which joins to the TblCountry.Code (primary key)
TblOrders (primary key is Order ID) which joins to TblAccount via the AccountID field; TblOrderDetail via the OrderID.  This table also holds the OrderDate and Recipient ID which links to a person in TblContact - I don't need that here but will need it later to generate an invoice 
TblOrderDetail (primary key is DetailID) which joins to TblOrders via OrderID field; TblProducts via ProductID field, and holds the Quantity required as well as the product
TblProducts (primary key is ProductCode) which as well as joining to TblOrderDetail, also joins to TblPrice via the Product field
TblPrices links to the TblProducts (as you have just read).  I've also created an Alias for the TblCountry (CountryAliasForProductCode) so I can link it to the TblPrices to show the country link. I'm not sure if I needed to do this - it doesn't work if I do or I don't do it, so I seek guidance again here.
This is the code I've been trying to use (and failing) to get my price and quantity steps above and I hope to replicate it, making a couple of tweaks to get the steps below:
SELECT MIN(TblPrices.stepquantity) AS QuantityAbove, MIN(TblPrices.StepPrice) AS PriceAbove, TblOrders.OrderID, TblOrders.OldOrderID, TblOrders.AccountID, TblOrders.OrderDate, TblOrders.RecipientID, TblOrders.OrderStatus, TblOrderDetail.DetailID, TblOrderDetail.Product, TblOrderDetail.Quantity
FROM (TblCountry INNER JOIN ((TblAccount INNER JOIN TblOrders ON TblAccount.AccountID = TblOrders.AccountID) INNER JOIN (TblOrderDetail INNER JOIN TblProducts ON TblOrderDetail.Product = TblProducts.ProductCode) ON TblOrders.OrderID = TblOrderDetail.OrderID) ON TblCountry.Code = TblAccount.Country) INNER JOIN (TblCountry AS CountryAliasForProduct INNER JOIN TblPrices ON CountryAliasForProduct.Code = TblPrices.CountryCode) ON TblProducts.ProductCode = TblPrices.Product
WHERE (StepQuantity >= TblOrderDetails.Quantity)
AND (TblPrices.CountryCode = TblAccount.Country)
AND (TblOrderDetail.Product = TblPrices.Product)
AND (DATEPART('yyyy', TblPrices.DateEffective) = DATEPART('yyyy', TblOrders.OrderDate));
I've also tried...
I've even tried going back to basics and trying again to generate the steps below in 1 query, then try the steps above in another and finally, create the final calculation in another query.
This is what I have been trying to get my prices and quantities below:
SELECT Max(StepQuantity) AS quantity_below, Max(StepPrice) AS price_below, TblOrderDetails.Quantity, TblAccounts.Country
FROM 
(TblProducts INNER JOIN TblPrices ON TblProducts.ProductCode = TblPrices.Product)
(TblOrderDetail INNER JOIN TblProducts ON TblOrderDetail.Product = TblProducts.ProductCode)
(TblOrders INNER JOIN TblOrderDetail ON TblOrders.OrderID = TblOrderDetail.OrderID)
(TblAccount INNER JOIN TblOrders ON TblAccount.AccountID = TblOrders.AccountID),
WHERE (((TblPrices.StepQuantity)<=(TblOrderDetail.Quantity)) AND ((TblPrices.CountryCode)=([TblAccounts].[country])) AND ((TblPrices.Product)=([TblOrderDetail].[product])) AND ((DatePart('yyyy',[TblPrices].[DateApplicable]))=(DatePart('yyyy',[TblOrders].[OrderDate]))));
You may be able to see glaring errors in this but I'm afraid I can't.  I've tried re-jigging it and I'm getting nowhere.
I need to be able to tie the information in to the OrderDetail records as the price generated will need to be added to a financial transactions table as a debit amount and will show as an amount owing on statements.
I'm really not very good at SQL.  I've read and worked though several self-study books and I have asked part of this question before; but I really am struggling with it.  If anyone has any ideas on how to proceed, or even where I've gone wrong with my code, I'd be delighted, even if you tell me I shouldn't be using SQL. For the record, I originally posted this question on a different forum under Visual Basic. Responses from that forum brought me to SQL - however, anything that works would be good!
I've even tried, using Excel, concatenating the Year&Product&Country&Quantity to get a unique product code, interpolating the prices for every quantity between 1 and 1000 for each product, country and year and bringing them into a TblProductsAndPrices table. In Access, I created a query to concatenate the Year(of order date from tblOrders)&Product(of tblorderdetails)&Country(of tblAccount) in order to get the required product code for the order. Another query would find a price for me. However, any product code that doesn't appear on the list (such as where a quantity isn't listed in the tblProductsAndPrices as it is larger than the highest price increment) doesn't have a price.
If there was a workable solution to what I've just described that would generate a price for everything, then I'd be so pleased.
I'd really like to be able to generate an order for any quantity of any product for any account based in any country on any date and retrieve a price which will be used to "debit" a financial account in the database, who in a transaction history for an account and appear on statements. I'd also like to be able to do an ad-hoc price check on the spot.
Thank you very much for taking the time to read this.  I really appreciate it. If you could offer any help or words of encouragement, I'd be very grateful.
Many thanks
Karen
Maybe no one thinks on an easy solution to the problem, since not all minds work in database thinking.
Easy solution: Create one view that gives all calculated values, not only the final one you need, each one as a column. Then you can use such view in a relation view and use on some rows one of the values and on other rows other values, etc.
How to think is simple, think in reverse order, instead of thinking "if that then I need to calculate such else I need this other", think as "I need "such" and I need "this other", both are columns of an intermediate view, then think on top level "if" that would be another view, such view will select the correct value ignoring the rest.
Never ever try to solve all in one step, that can be a really big headache.
Pros: You can isolate calculated values (needed or not), sql is much more easy to write and maintain.
Cons: Resources use is bigger than minimal, but most of times that extra calculated values does not represent a really big impact.
In terms of tutorial out there: Instead of a Top-Down method, use a Down-Top method.
Sometimes it is better (with your example) to calculate all three values (you write sentences on bold) ignoring the if part, and have all three possible values for your order and after that discard the ones not wanted, than trying to only calculate one.
Trying to calculate only one is thinking as a procedural programming, when working with databases most times one must get rid of such thinking and think as reverse, first do the most internal part of such procedural programming to have all data collected, then do the external selection of the procedural programing.
Note: If one of the values can not be calculated, just generate a Null.
I know it is hard to think on First in, last out (Down-Top) model, but it is great for things as the one you want.
Step1 (on specific view, or a join from one view per calculation):
Calculate column 1 as price for the current product, country and
year
Calculate column 2 as calculate a pro-rated price as if 1000
Calculate column 3 as calculate a pro-rated price as if 100
Calculate column 4 as etc
Calculate column N as etc
Step 2 (Another view, the one you want):
Calculate the if part, so you can choose adequate column from previous view (you can use immediately if or a calculated auxiliary field).
Hope you can follow theese way of thinking, I have solved a lot of things like that one (and more complex) thinking in that way, but it is not easy to think as that, needs an extra effort.

Trouble with pulling distinct data

Ok this is hard to explain partially because I'm bad at sql but this code isn't doing exactly what I want it to do. I'll try to explain what it is supposed to do as best I can and hopefully someone can spot a glaring mistake. I'm sorry about the long winded explanation but there is a lot going on here and I really could use the help.
The point of this script is to search for parts which need to be obsoleted. in other words they haven't been used in three years and are still active.
When we obsolete part, "part.status" is set to 'O'. It is normally null. Also, the word 'OBSOLETE' is usually written in to "part.description"
The "WORK_ORDER" contains every scheduled work order. These are defined by base,lot, and sub ID's. It also contains many dates such as the date when the work order was closed.
the "REQUIREMENT" table contains all the parts require for each job. many jobs may require multiple parts, some at different legs of the job. The way this is handled is that for a given "REQUIREMENT.WORKORDER_BASE_ID" and "REQUIREMENT.WORKORDER_LOT_ID", they may be listed on a dozen or so subsequent rows. Each line specifies a different "REQUIREMENT.PART_ID". The sub id separates what leg of the job that the part is needed. All of the parts I care about start with 'PCH'
When I run this code it returns 14 lines, I happen to know it should be returning about 39 right now. I believe the screwy part starts at line 17. I found that code on another form hoping that it would help solve the original problem. Without that code, I get like 27K lines because the DB is pulling every criteria matching requirement from every criteria matching work order. Many of these parts are used on multiple jobs. I've also tried using DISTINCT on REQUIREMENT.PART_ID which seems like it should solve the problem. Alas it doesn't.
So I know despite all the information I probably still didn't give nearly enough. Does anyone have any suggestions?
SELECT
PART.ID [Engr Master]
,PART.STATUS [Master Status]
,WO.CLOSE_DATE
,PT.ID [Die]
,PT.STATUS [Die Status]
FROM PART
CROSS APPLY(
SELECT
WORK_ORDER.BASE_ID
,WORK_ORDER.LOT_ID
,WORK_ORDER.SUB_ID
,WORK_ORDER.PART_ID
,WORK_ORDER.CLOSE_DATE
FROM WORK_ORDER
WHERE
GETDATE() - (360*3) > WORK_ORDER.CLOSE_DATE
AND PART.ID = WORK_ORDER.PART_ID
AND PART.STATUS ='O'
)WO
CROSS APPLY(
SELECT
REQUIREMENT.WORKORDER_BASE_ID
,REQUIREMENT.WORKORDER_LOT_ID
,REQUIREMENT.WORKORDER_SUB_ID
,REQUIREMENT.PART_ID
FROM REQUIREMENT
WHERE
WO.BASE_ID = REQUIREMENT.WORKORDER_BASE_ID
AND WO.LOT_ID = REQUIREMENT.WORKORDER_LOT_ID
AND WO.SUB_ID = REQUIREMENT.WORKORDER_SUB_ID
AND REQUIREMENT.PART_ID LIKE 'PCH%'
)REQ
CROSS APPLY(
SELECT
PART.ID
,PART.STATUS
FROM PART
WHERE
REQ.PART_ID = PART.ID
AND PART.STATUS IS NULL
)PT
ORDER BY PT.ID
This is difficult to understand without any sample data, but I took a stab at it anyway. I removed the second JOIN to PART (that had alias PART1) as it seemed unecessary. I also removed the subquery that was looking for parts HAVING COUNT(PART_ID) = 1
The first JOIN to PART should be done on REQUIREMENT.PART_ID = PART.PART_ID as the relationship as already been defined from WORK_ORDER to REQUIREMENT, hence you can JOIN PART directly to REQUIREMENT at this point.
EDIT 03/23/2015
If I understand this correctly, you just need a distinct list of PCH parts, and their respective last (read: MAX) CLOSE_DATE. If that is the case, here is what I propose.
I broke the query up into a couple of CTE's. The first CTE is simply going through the PART table and pulling out a DISTINCT list of PCH parts, grouping by PART_ID and DESCRIPTION.
The second CTE, is going through the REQUIREMENT table, joining to the WORK_ORDER table and, for each PART_ID (handled by the PARTITION) assigning the CLOSE_DATE a ROW_NUMBER in descending order. This will ensure that each ROW_NUMBER with a value of "1" will be the Max CLOSE_DATE for each PART_ID.
The final SELECT statement simply JOINS the two Cte's on PART_ID, filtering where LastCloseDate = 1 (the ROW_NUMBER assigned in the second CTE).
If I understand the requirements correctly, this should give you the desired results.
Additionally, I removed the filter WHERE PART.DESCRIPTION NOT LIKE 'OB%' because we're already filtering by PART.STATUS IS NULL and you stated above that an 'O' is placed in this field for Obsolete parts. Also, [DIE] and [ENGR MASTER] have the same value in the 27 rows being pulled before, so I just used the same field and labeled them differently.
; WITH Parts AS(
SELECT prt.PART_ID AS [ENGR MASTER]
, prt.DESCRIPTION
FROM PART prt
WHERE prt.STATUS IS NULL
AND prt.PART_ID LIKE 'PCH%'
GROUP BY prt.ID, prt.DESCRIPTION
)
, LastCloseDate AS(
SELECT req.PART_ID
, wrd.CLOSE_DATE
, ROW_NUMBER() OVER(PARTITION BY req.PART_ID ORDER BY wrd.CLOSE_DATE DESC) AS LastCloseDate
FROM REQUIREMENT req
INNER JOIN WORK_ORDER wrd
ON wrd.BASE_ID = req.WORKORDER_BASE_ID
AND wrd.LOT_ID = req.WORKORDER_LOT_ID
AND wrd.SUB_ID = req.WORKORDER_SUB_ID
WHERE wrd.CLOSE_DATE IS NOT NULL
AND GETDATE() - (365 * 3) > wrd.CLOSE_DATE
)
SELECT prt.PART_ID AS [DIE]
, prt.PART_ID AS [ENGR MASTER]
, prt.DESCRIPTION
, lst.CLOSE_DATE
FROM Parts prt
INNER JOIN LastCloseDate lst
ON prt.PART_ID = lst.PART_ID
WHERE LastCloseDate = 1

Calculate First Time Buyer and Repeating Buyers using MAQL Queries in GOOD-DATA platform

I have been recently working with GOOD-DATA platform. I don't have that much experience in MAQL, but I am working on it. I did some metric and reports in GOOD-DATA platform. Recently I tried to create a metric for calculating Total Buyers,First Time Buyers and Repeating Buyers. I created these three reports and working perfect.But when i try to add a order date parent filter the first time buyers and repeating buyers value getting wrong. please have Look at Following queries.
I can find out the correct values using sql queries.
MAQL Queries:
TOTAL ORDERS-
SELECT COUNT(NexternalOrderNo) BY CustomerNo WITHOUT PF
TOTAL FIRSTTIMEBUYERS-
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF=1) WITHOUT PF
TOTAL REPEATINGBUYERS-
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1) WITHOUT PF
Can any one suggest a logic for finding these values using MAQL
It's not clear what you want to do. If you could provide more details about the report you need to get, it would be great.
It's not necessary to put "without pf" into the metrics. This clause bans filter application, so when you remove it, the parent filter will be used there. And you will probably get what you want. Specifically, modify this:
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1) WITHOUT PF
to:
SELECT COUNT(CustomerNo) WHERE (TOTAL ORDER WO PF>1)
The only thing you miss here is "ALL IN ALL OTHER DIMENSIONS" aka "ALL OTHER".
This keyword locks and overrides all attributes in all other dimensions—keeping them from having any affect on the metric. You can read about it more in MAQL Reference Guide.
FIRSTTIMEBUYERS:
SELECT COUNT(CustomerNo)
WHERE (SELECT IFNULL(COUNT(NexternalOrderNo), 0) BY Customer ID, ALL OTHER) = 1
REPEATINGBUYERS:
SELECT COUNT(CustomerNo)
WHERE (SELECT IFNULL(COUNT(NexternalOrderNo), 0) BY Customer ID, ALL OTHER) > 1

SQL grouping with "Invalid use of group" error

I'll be upfront, this is a homework question, but I've been stuck on this one for hours and I just was looking for a push in the right direction. First I'll give you the relations and the hw question for background, then I'll explain my question:
Branch (BookCode, BranchNum, OnHand)
HW problem: List the BranchNum for all branches that have at least one book that has at least 10 copies on hand.
My question: I understand that I must take the SUM(OnHand) grouped by BookCode, but how do I then take that and group it by BranchNum? This is logically what I come up with and various versions:
select distinct BranchNum
from Inventory
where sum(OnHand) >= 10
group by BookCode;
but I keep getting an error that says "Invalid use of group function."
Could someone please explain what is wrong here?
UPDATE:
I understand now, I had to use the HAVING statement, the basic form is this:
select distinct (what you want to display)
from (table)
group by
having
Try this one.
SELECT BranchNum
FROM Inventory
GROUP BY BranchNum
HAVING SUM(OnHand) >= 10
You can also find Group By Clause with example here.
Although all comments in the question seem to be valid and add information they all seem to be missing why your query is not working. The reason is simple and is strictly related by the state/phase at which the sum is calculated.
The where clause is the first thing that will get executed. This means it will filter all rows at the beginning. Then the group by will come in effect and will merge all rows that are not specified in the clause and apply the aggregated functions (if any).
So if you try to add an aggregated function to the where clause you're trying to aggregate before data is being grouped by and even filtered. The having clause gets executed after the group by and allows you to filter the aggregated functions, as they have already been calculated.
That's why you can write HAVING SUM(OnHand) >= 10 and you can't write WHERE SUM(OnHand) >= 10.
Hope this helps!

Grouping results within SQL

I was thinking about doing this in crystal, but am thinking about doing it another way now. This is the way I was thinking about doing it before. Count by group in crystal reports
I am pulling by transactions and need a count of unique people, but in the report they want to show the count of individual people and then show their services below as shown in the image. What I'd like to do is somehow get a count of unique users that I could just throw in at the bottom. Crystal won't allow me to do a count by group and users will have duplicates in the format they want it displayed. I am hopping that I can group it in the code then just add it to the bottom of the report. I hope I'm getting across what I'm trying to accomplish. If I can somehow just add the total of unique users at the bottom of the report it will finish it for me. Thanks in advance.
select
distinct p.patient_id,
pa.fname as 'Patient',
p.clinic_id,
p.service_id,
p.program_id,
p.protocol_id,
p.discharge_reason,
p.date_discharged
from patient_assignment p
join patient pa
on p.patient_id = pa.patient_id
where p.program_id not in ('TEST', 'SA', 'INTAKE' ) and (p.date_discharged between '2013-01-01 00:00:00.000' and '2013-06-01 00:00:00.000')
and p.patient_id not in ('00000004', '00001667', '00020354')
This is too long for a comment.
SQL Queries have fixed columns. You can't "just throw" a different type of row at the bottom. Although there are SQL solutions (such as concatenating all the fields to make a single row), these are not very palatable.
Instead, here are some other ways.
You can run another query to get the count.
You can query the "result" set to get the count (perhaps after fetching all the rows).
You can do some minor coding work in the application.