AdventureWorks - SQL Joins are Grouping Data

So I'm trying to produce a little table from the AdventureWorks database, with the sales of all Silver Mountain Bikes sold in October 2013. I need the table to be ordered by Territory, and have a column of total sales and a column of sales that were the bikes only.
Where is my query going wrong?
Note, I haven't done any aliasing yet, just to keep things sensible in my own head.
select distinct sales.salesorderheader.SalesOrderID, sales.salesorderheader.OrderDate, Color, sales.SalesTerritory.Name, Quantity
from sales.SalesOrderHeader
right join sales.salesTerritory
on sales.salesOrderHeader.TerritoryID = sales.salesTerritory.TerritoryID
right join sales.SalesOrderDetail
on sales.salesOrderHeader.SalesOrderID = sales.SalesOrderDetail.SalesOrderID
right join production.product
on sales.SalesOrderDetail.ProductID = production.Product.ProductID
right join production.transactionHistory
on production.Product.ProductID = production.transactionHistory.ProductID
where production.product.ProductSubcategoryID = '1'
and color = 'Silver'
and sales.SalesOrderDetail.ModifiedDate between '2013-10-01' and '2013-10-31'
and production.TransactionHistory.TransactionType = 'S'
and sales.SalesOrderHeader.OnlineOrderFlag = 'False'
Due to the schema of the database, I've had to daisy-chain a few different tables together through a series of joins, but I've found these joins are grouping data up and seem to be adding a bunch of stuff together. I figure I can run a count of the Quantity column to get my total orders, and a sum of the Quantity column to see how many bikes were sold (some orders will have more than one bike!)
I hoped throwing in a 'distinct' at the top would narrow things down for me, but that's not getting me the count I need either, as I'm trying to replicate the results of a table I've been given.

Related

I think I'm calculating the total revenue from each warehouse wrong

I've got a question I'm working through and I'd like someone to double check my code because I'm trying to calculate the revenue for each warehouse in a database and the returning values seem high - it's returning tens of millions for most warehouses. Not impossibly high but high enough for me to take a second look.
Here is the ER diagram
And here is my code:
SELECT WAREHOUSES.WAREHOUSE_ID, WAREHOUSES.WAREHOUSE_NAME, SUM(ORDER_ITEMS.QUANTITY *
PRODUCTS.LIST_PRICE) AS TOTAL
FROM WAREHOUSES
JOIN INVENTORIES ON INVENTORIES.WAREHOUSE_ID = WAREHOUSES.WAREHOUSE_ID
JOIN PRODUCTS ON PRODUCTS.PRODUCT_ID = INVENTORIES.PRODUCT_ID
JOIN ORDER_ITEMS ON PRODUCTS.PRODUCT_ID = ORDER_ITEMS.PRODUCT_ID
GROUP BY WAREHOUSES.WAREHOUSE_NAME, WAREHOUSES.WAREHOUSE_ID
ORDER BY TOTAL DESC;
Is there something wrong with my SUM? I'm new to SQL so I'm sure you'll find plenty of tweaks to this code - and I'd love to hear about it!
Thank you!
You are calculating the total sales over all time for each product inside a warehouse. And then adding them up at the warehouse level.
I do not see that this is useful. And it is definitely overcounting sales.
I don't know what "revenue" means at the warehouse level. The data model doesn't seem to have any indication of which warehouse provided the products for a given order.
My best guess is that "revenue" means "potential revenue". That is, for all the products in inventory in the warehouse, how much revenue could be generated if they were sold at full price?
As a hint: This calculation has nothing to do with orders or orderlines. It only requires calculations between products and inventory.
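A minimal sketch of the "potential revenue" reading, run from Python against an in-memory SQLite database. The table names and `list_price` come from the question; the `inventories.quantity` column and all sample rows are assumptions, since the ER diagram is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE warehouses  (warehouse_id INTEGER PRIMARY KEY, warehouse_name TEXT);
    CREATE TABLE products    (product_id INTEGER PRIMARY KEY, list_price REAL);
    -- quantity is an assumed column name for units on hand
    CREATE TABLE inventories (warehouse_id INTEGER, product_id INTEGER, quantity INTEGER);

    -- made-up sample data
    INSERT INTO warehouses  VALUES (1, 'North'), (2, 'South');
    INSERT INTO products    VALUES (10, 100.0), (20, 50.0);
    INSERT INTO inventories VALUES (1, 10, 3), (1, 20, 2), (2, 10, 1);
""")

# Only inventories and products are involved: no orders, no order lines.
potential_revenue = conn.execute("""
    SELECT w.warehouse_id,
           w.warehouse_name,
           SUM(i.quantity * p.list_price) AS total
    FROM warehouses w
    JOIN inventories i ON i.warehouse_id = w.warehouse_id
    JOIN products p    ON p.product_id = i.product_id
    GROUP BY w.warehouse_id, w.warehouse_name
    ORDER BY total DESC
""").fetchall()

for row in potential_revenue:
    print(row)
```

Because each inventory row matches exactly one product, no row is multiplied by the join, which is what goes wrong once `order_items` is joined on `product_id` alone.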
The query looks OK, except I would think order_items.unit_price reflects the actual price paid and hence the sales revenue.
Here it is tidied up a little. I use aliases for the table names rather than repeating the whole name. Opinions vary on how to lay out SQL, and join clauses in particular, but this works better for me than a monolithic block of uppercase text.
select w.warehouse_id
, w.warehouse_name
, sum(oi.quantity * oi.unit_price) as total
from warehouses w
join inventories i on i.warehouse_id = w.warehouse_id
join products p on p.product_id = i.product_id
join order_items oi on p.product_id = oi.product_id
group by w.warehouse_name, w.warehouse_id
order by total desc;
You could check the results for one order, or one product, or one warehouse by including their ID in the select list and group by clauses and confirming that it gives the expected total. Sample data would help here.

Find item name along with the seller id and buyer id such that the seller has sold the item to the buyer

Hi, I'm trying to build a SQL query for the following requirement:
For each seller and each item sold by the seller, find the total amount sold.
I have 3 tables, but I'm not sure if I have to use them all. I have a feeling I need all three to get this query right, but I keep getting an error when I try the following:
select selleruserid, itemid, sum(price) total
from sales_fact s
join items_dim i on i.itemid = s.itemid
join sellers_dim d on d.userid = s.selleruserid
group by selleruserid, itemid
I added a picture below of my tables.
All the information you want is in the fact table, so the other tables do not seem necessary:
select sf.selleruserid, sf.itemid, sum(sf.price) as total
from sales_fact sf
group by sf.selleruserid, sf.itemid;
You would need to join in the other tables if you needed other information from the dimension, such as the name.
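As a rough sketch of that last point, here is the same aggregation with the item name pulled in from the dimension table, run against in-memory SQLite from Python. The `itemname` column and the sample rows are assumptions; the picture of the tables is not available here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- itemname is an assumed column; the real dimension may name it differently
    CREATE TABLE items_dim  (itemid INTEGER PRIMARY KEY, itemname TEXT);
    CREATE TABLE sales_fact (selleruserid INTEGER, itemid INTEGER, price REAL);

    -- made-up sample data
    INSERT INTO items_dim  VALUES (1, 'Lamp'), (2, 'Chair');
    INSERT INTO sales_fact VALUES (7, 1, 20.0), (7, 1, 5.0), (7, 2, 40.0), (8, 1, 22.0);
""")

# Every column is qualified with its table alias, so itemid is never ambiguous.
totals = conn.execute("""
    SELECT sf.selleruserid, sf.itemid, i.itemname, SUM(sf.price) AS total
    FROM sales_fact sf
    JOIN items_dim i ON i.itemid = sf.itemid
    GROUP BY sf.selleruserid, sf.itemid, i.itemname
    ORDER BY sf.selleruserid, sf.itemid
""").fetchall()

for row in totals:
    print(row)
```

Qualifying the columns matters here: `itemid` exists in both `sales_fact` and `items_dim`, so an unqualified `itemid` in the select list or group by is ambiguous once the tables are joined.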

My question is about SQL, using a TOP function inside a sub-query in MS Access

Overall what I'm trying to achieve is a query that shows the most ordered item from a customer in a database. To achieve this I've tried making a query showing how many times a customer has ordered an item, and now I am trying to create a sub-query in it using TOP1 to discern the most bought items.
With the SQL from the first query (looking weird because I made it with the Access automatic creator):
SELECT
Customers.CustomerFirstName,
Customers.CustomerLastName,
Products.ProductName,
COUNT(SalesQuantity.ProductCode) AS CountOfProductCode
FROM (Employees
INNER JOIN (Customers
INNER JOIN Sales
ON Customers.CustomerCode = Sales.CustomerCode)
ON Employees.EmployeeCode = Sales.EmployeeCode)
INNER JOIN (Products
INNER JOIN SalesQuantity
ON Products.ProductCode = SalesQuantity.ProductCode)
ON Sales.SalesCode = SalesQuantity.SalesCode
GROUP BY
Customers.CustomerFirstName,
Customers.CustomerLastName,
Products.ProductName
ORDER BY
COUNT(SalesQuantity.ProductCode) DESC;
I have tried putting in a subquery after FROM line:
(SELECT TOP1 CountOfProduct(s)
FROM (.....)
ORDER by Count(SalesQuantity.ProductCode) DESC)
I'm just not sure what to put in for the "FROM". Every other tutorial takes its data from an already created table, whereas here it comes from a query that is being built at the same time. Just messing around, I've put "FROM" and then listed every table, as well as
FROM Count(SalesQuantity.ProductCode)
just because I've seen that in the order by from the other code, and assume that the query is discerning from this count. Both tries have ended with an error in the syntax of the "FROM" line.
I'm new to SQL so sorry if it's blatantly obvious, but any help would be greatly appreciated.
Thanks
As I understand, you want the most purchased product for each customer.
So, begin by building an aggregate query that counts product purchases by customer (this appears to be done in the posted image). Including the customer ID in that query would simplify the next step, which is to build another query with a nested TOP N query.
What complicates this is that the unique record identifier is lost in the aggregation, so other fields from the aggregate query have to be combined to provide one. Consider:
SELECT * FROM Query1 WHERE CustomerID & ProductName IN
(SELECT TOP 1 CustomerID & ProductName FROM Query1 AS Dupe
WHERE Dupe.CustomerID = Query1.CustomerID
ORDER BY Dupe.CustomerID, Dupe.CountOfProductCode DESC);
"Overall what I'm trying to achieve is a query that shows the most ordered item from a customer in a database."
This answers your question. It does not modify your query which is only tangentially related.
SELECT s.CustomerCode, sq.ProductCode, SUM(sq.quantity) as qty
FROM Sales as s INNER JOIN
SalesQuantity as sq
ON s.SalesCode = sq.SalesCode
GROUP BY s.CustomerCode, sq.ProductCode;
To get the most ordered items, you can use this twice:
SELECT s.CustomerCode, sq.ProductCode, SUM(sq.quantity) as qty
FROM Sales as s INNER JOIN
SalesQuantity as sq
ON s.SalesCode = sq.SalesCode
GROUP BY s.CustomerCode, sq.ProductCode
HAVING sq.ProductCode IN (SELECT TOP 1 sq2.ProductCode
FROM Sales as s2 INNER JOIN
SalesQuantity as sq2
ON s2.SalesCode = sq2.SalesCode
WHERE s2.CustomerCode = s.CustomerCode
GROUP BY sq2.ProductCode
ORDER BY SUM(sq2.quantity) DESC
);
In almost any other database, this would be simpler, because you would be able to use window functions.
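To illustrate the window-function route, here is a sketch using SQLite (3.25 or later) from Python rather than Access. The table and column names follow the query above; the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Sales         (SalesCode INTEGER PRIMARY KEY, CustomerCode INTEGER);
    CREATE TABLE SalesQuantity (SalesCode INTEGER, ProductCode TEXT, Quantity INTEGER);

    -- made-up sample data
    INSERT INTO Sales VALUES (1, 100), (2, 100), (3, 200);
    INSERT INTO SalesQuantity VALUES
        (1, 'A', 2), (1, 'B', 1), (2, 'A', 3), (3, 'B', 4), (3, 'C', 1);
""")

top_products = conn.execute("""
    WITH totals AS (
        -- quantity ordered per customer and product
        SELECT s.CustomerCode, sq.ProductCode, SUM(sq.Quantity) AS qty
        FROM Sales AS s
        JOIN SalesQuantity AS sq ON s.SalesCode = sq.SalesCode
        GROUP BY s.CustomerCode, sq.ProductCode
    ),
    ranked AS (
        -- rank each customer's products by quantity, highest first
        SELECT CustomerCode, ProductCode, qty,
               RANK() OVER (PARTITION BY CustomerCode ORDER BY qty DESC) AS rnk
        FROM totals
    )
    SELECT CustomerCode, ProductCode, qty
    FROM ranked
    WHERE rnk = 1
    ORDER BY CustomerCode, ProductCode
""").fetchall()

for row in top_products:
    print(row)
```

`RANK()` keeps every product tied for first place; swapping in `ROW_NUMBER()` would return exactly one row per customer, chosen arbitrarily among ties.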

SQL: If there is No match on condition (row in table) return a default value and Rows which match from different tables

I have three tables: Clinic, Stock and StockLog.
I need to get all rows where Stock.stock < 5. I also need to show whether an order has been placed and for what amount, which is found in the StockLog table.
The issue is that a user can set his stock level in the Stock table without placing an order (orders are what get written to StockLog).
I need a query that returns the rows in the Stock table together with the related order amounts from the StockLog table. If no order has been placed in StockLog, the order amount should be zero.
I have tried :
SELECT
Clinic.Name,
Stock.NameOfMedication, Stock.Stock,
StockLog.OrderAmount
FROM
Clinic
JOIN
Stock ON Stock.ClinicID = Clinic.ClinicID
JOIN
StockLog ON StockLog.StockID = Stock.StockID
WHERE
Stock.Stock <= 5
The issue with my query is that I lose rows which are not found in StockLog.
Any help on how to write this would be appreciated.
Thank you.
I am thinking the query should look like this:
SELECT c.Name, s.NameOfMedication, s.Stock,
COALESCE(sl.OrderAmount, 0) as OrderAmount
FROM Stock s LEFT JOIN
Clinic c
ON s.ClinicID = c.ClinicID LEFT JOIN
StockLog sl
ON sl.StockID = s.StockID
WHERE s.Stock <= 5 ;
You want to keep all rows in Stock (subject to the WHERE condition). So think: "make Stock the first table in the FROM and use LEFT JOIN for all the other tables."
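A small sketch of the difference, using in-memory SQLite from Python. The schema is inferred from the queries above and the sample rows are made up, so treat both as assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Clinic   (ClinicID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Stock    (StockID INTEGER PRIMARY KEY, ClinicID INTEGER,
                           NameOfMedication TEXT, Stock INTEGER);
    CREATE TABLE StockLog (StockID INTEGER, OrderAmount INTEGER);

    -- made-up sample data: only Aspirin has an order logged
    INSERT INTO Clinic   VALUES (1, 'Central');
    INSERT INTO Stock    VALUES (1, 1, 'Aspirin', 3), (2, 1, 'Ibuprofen', 2),
                                (3, 1, 'Paracetamol', 50);
    INSERT INTO StockLog VALUES (1, 20);
""")

# Inner join: low-stock rows without a StockLog entry disappear.
inner_rows = conn.execute("""
    SELECT s.NameOfMedication, sl.OrderAmount
    FROM Stock s
    JOIN StockLog sl ON sl.StockID = s.StockID
    WHERE s.Stock <= 5
    ORDER BY s.StockID
""").fetchall()

# Left join from Stock: every low-stock row survives, missing orders become 0.
left_rows = conn.execute("""
    SELECT c.Name, s.NameOfMedication, s.Stock,
           COALESCE(sl.OrderAmount, 0) AS OrderAmount
    FROM Stock s
    LEFT JOIN Clinic c    ON s.ClinicID = c.ClinicID
    LEFT JOIN StockLog sl ON sl.StockID = s.StockID
    WHERE s.Stock <= 5
    ORDER BY s.StockID
""").fetchall()

print(inner_rows)
print(left_rows)
```

The `WHERE` clause filters only on `Stock` columns. A condition on `StockLog` columns placed there would discard the unmatched rows again, so such a condition belongs in the `ON` clause instead.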
If you want to keep all the rows that result from joining Clinic and Stock, then use a LEFT OUTER JOIN with StockLog. I don't know which SQL you're using (SQL Server, MySQL, PostgreSQL, Oracle), so I can't give you a precise example, but searching for "left outer join" in the relevant documentation should work.
See this Stack Overflow post for an explanation of the various kinds of joins.

Multiple records joined Access SQL

I'm not sure if what I want to do is possible but if it is possible, it's probably a really easy solution that I just can't figure out. Once things get to a certain complexity though, my head starts spinning. Please forgive my ignorance.
I have a database running in MS Access 2007 for a school which has a plethora of tables joined to each other. I'm trying to create a query in which I get information from several tables. I'm looking up sales and payment information for different customers, pulling info from several different linked tables. Each sale is broken down into one of 4 categories, Course Fee, Registration Fee, Book Fee and Others. Because each customer will have multiple purchases, each one is a separate entry in the Sales table. The payment information is also in its own table.
My SQL currently looks like this:
SELECT StudentContracts.CustomerID, (Customers.CFirstName & " " & Customers.CLastName) AS Name, Customers.Nationality, Courses.CourseTitle, (StudentContracts.ClassesBought + StudentContracts.GiftClasses) AS Weeks, StudentContracts.StartDate, Sales.SaleAmount, SaleType.SaleType, Sales.DueDate, Payments.PaymentAmount
FROM (
(
(Customers INNER JOIN StudentContracts ON Customers.CustomerID = StudentContracts.CustomerID)
INNER JOIN Payments ON Customers.CustomerID = Payments.CustomerID)
INNER JOIN
(SaleType INNER JOIN Sales ON SaleType.SalesForID = Sales.SalesForID)
ON Customers.CustomerID = Sales.CustomerID)
INNER JOIN
(
(Courses INNER JOIN Classes ON Courses.CourseID = Classes.CourseID)
INNER JOIN StudentsClasses ON Classes.ClassID = StudentsClasses.ClassID)
ON Customers.CustomerID = StudentsClasses.CustomerID;
This works and brings up the information I need. However, I am getting one record for each sale as in:
CustomerID  Name  ...  SaleAmount  SaleType  PaymentAmount
1           Bob        $600        Course    $1000
1           Bob        $300        RgnFee    $1000
1           Bob        $100        Book      $1000
What I need is one line for each customer, with each sale type in its own column in the row and the sale amount listed as its value. Like so:
CustomerID  Name  ...  Course  RgnFee  Book  Others  PaymentAmount
1           Bob        $600    $300    $100          $1000
Can anyone help and possibly explain what I should/need to be doing?
Thanks in advance!
You can create a cross tab from the query you have already created. Add the query to the Query Design Grid, choose Crosstab from query types, and select a Row or rows, Column and Value.
Say:
TRANSFORM Sum(t.SaleAmount) AS SumOfSaleAmount
SELECT t.ID, t.Name, Sum(t.SaleAmount) AS Total
FROM TableQuery t
GROUP BY t.ID, t.Name
PIVOT t.SaleType
If you want a certain order, you can edit the property sheet to include column headings, or you can add an In clause to the SQL. Note that if you specify column headings, a column will be included for each listed heading, whether or not data is available, and, more importantly, a column that has data will not be included if it is not listed.
TRANSFORM Sum(t.SaleAmount) AS SumOfSaleAmount
SELECT t.ID, t.Name, Sum(t.SaleAmount) AS Total
FROM TableQuery t
GROUP BY t.ID, t.Name
PIVOT t.SaleType In ("Course","RgnFee","Book","Others");
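TRANSFORM ... PIVOT is specific to Access. For comparison, the sketch below gets the same fixed-column layout with a different technique, conditional aggregation, run against in-memory SQLite from Python. `TableQuery` and its columns follow the example above; the rows are Bob's sales from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- stands in for the saved query used as the crosstab source
    CREATE TABLE TableQuery (ID INTEGER, Name TEXT, SaleType TEXT, SaleAmount INTEGER);
    INSERT INTO TableQuery VALUES
        (1, 'Bob', 'Course', 600),
        (1, 'Bob', 'RgnFee', 300),
        (1, 'Bob', 'Book',   100);
""")

# One SUM(CASE ...) per sale type plays the role of the PIVOT ... In (...) list.
crosstab = conn.execute("""
    SELECT t.ID, t.Name,
           SUM(t.SaleAmount) AS Total,
           SUM(CASE WHEN t.SaleType = 'Course' THEN t.SaleAmount END) AS Course,
           SUM(CASE WHEN t.SaleType = 'RgnFee' THEN t.SaleAmount END) AS RgnFee,
           SUM(CASE WHEN t.SaleType = 'Book'   THEN t.SaleAmount END) AS Book,
           SUM(CASE WHEN t.SaleType = 'Others' THEN t.SaleAmount END) AS Others
    FROM TableQuery t
    GROUP BY t.ID, t.Name
""").fetchall()

for row in crosstab:
    print(row)
```

As with the In list, the `Others` column appears even though Bob has no such sale (its value is NULL), and a sale type that is not listed gets no column of its own.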