Manipulating Data in Joins - sql

Diagram:
Query for join:
SELECT DISTINCT c.CustomerID, c.FirstName , sh.DueDate, p.ProductID,p.ListPrice
FROM SalesLT.Customer c
INNER JOIN SalesLT.SalesOrderHeader sh
ON c.CustomerID = sh.CustomerID
INNER JOIN SalesLT.SalesOrderDetail sd
ON sh.SalesOrderID = sd.SalesOrderID
INNER JOIN SalesLT.Product p
ON sd.ProductID = p.ProductID
Order BY ListPrice Desc
Output:
Desired Result:
For the desired output:
What could be added to the existing query?
What would be the optimized way of writing this query?
What would be the time and space complexity of a subquery versus a join?

I think you want:
SELECT c.CustomerID
, c.FirstName
, sh.DueDate
, MAX(p.ProductID) ProductID
,p.ListPrice
FROM SalesLT.Customer c
INNER JOIN SalesLT.SalesOrderHeader sh
ON c.CustomerID = sh.CustomerID
INNER JOIN SalesLT.SalesOrderDetail sd
ON sh.SalesOrderID = sd.SalesOrderID
INNER JOIN SalesLT.Product p
ON sd.ProductID = p.ProductID
GROUP BY c.CustomerID
, c.FirstName
, sh.DueDate
, p.ListPrice
Order BY ListPrice Desc
Not that it makes much sense as a query, but I guess you wanted to know the approach. You probably have a good answer by now, but ticking an appropriate response encourages us to put time into answering. Thanks.
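To see what the GROUP BY with MAX actually does, here is a minimal sketch on made-up data. The table `order_lines` and its rows are hypothetical (a flattened stand-in for the four joined SalesLT tables), so treat it as an illustration of the approach rather than the asker's real result.

```sql
-- Hypothetical, flattened stand-in for the four joined SalesLT tables.
CREATE TABLE order_lines (
    CustomerID INT,
    FirstName  VARCHAR(50),
    DueDate    DATE,
    ProductID  INT,
    ListPrice  DECIMAL(10, 2)
);

INSERT INTO order_lines VALUES (1, 'Ann', '2008-06-13', 700, 34.99);
INSERT INTO order_lines VALUES (1, 'Ann', '2008-06-13', 710, 34.99);
INSERT INTO order_lines VALUES (1, 'Ann', '2008-06-13', 720, 99.00);
INSERT INTO order_lines VALUES (2, 'Bob', '2008-06-13', 700, 34.99);

-- Every non-aggregated column in the SELECT list must appear in GROUP BY.
-- Rows that share customer, due date and price collapse into one row,
-- and MAX picks the highest ProductID among them.
SELECT CustomerID,
       FirstName,
       DueDate,
       MAX(ProductID) AS ProductID,
       ListPrice
FROM order_lines
GROUP BY CustomerID, FirstName, DueDate, ListPrice
ORDER BY ListPrice DESC;
```

With these rows, the two 34.99 lines for customer 1 collapse into a single row with ProductID 710, so the query returns three rows instead of four.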

Related

Gross sales of top ten customers that made at least 5 purchases of some Category in a specified year

I need to get (sales.customers) | year | gross_sales of the top ten customers that made at least 5 purchases of Category "Beverages" in the year 2014. I have already written these SELECT queries, but since I am new to SQL, I think my code is very inefficient. It does not work properly, and there is probably a simpler way of doing it. I have also attached a picture of an ER diagram.
SELECT
T6.COMPANYNAME, YEAR, GROSS_SALES
FROM
(SELECT T1.CUSTID
FROM
(SELECT R.CUSTID, COUNT(R.CUSTID) AS NUMBEROFSALES
FROM SALES.ORDERDETAILS O
RIGHT JOIN PRODUCTION.PRODUCTS P ON P.PRODUCTID = O.PRODUCTID
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
INNER JOIN PRODUCTION.CATEGORIES C2 ON P.CATEGORYID = C2.CATEGORYID
WHERE C2.CATEGORYNAME = 'Beverages' AND YEAR(R.ORDERDATE) = 2014
GROUP BY R.CUSTID
ORDER BY SUM(R.CUSTID) DESC) T1
--HAVING COUNT(R.CUSTID) > 5
RIGHT JOIN
(SELECT R.CUSTID, SUM(O.UNITPRICE) AS MONEYSPENT
FROM SALES.ORDERDETAILS O
RIGHT JOIN PRODUCTION.PRODUCTS P ON P.PRODUCTID = O.PRODUCTID
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
INNER JOIN PRODUCTION.CATEGORIES C2 ON P.CATEGORYID = C2.CATEGORYID
WHERE C2.CATEGORYNAME = 'Beverages' AND YEAR(R.ORDERDATE) = 2014
GROUP BY R.CUSTID
ORDER BY SUM(O.UNITPRICE) DESC) T2 ON T1.CUSTID = T2.CUSTID
ORDER BY T1.NUMBEROFSALES DESC
LIMIT 10) T5
INNER JOIN
(SELECT DISTINCT(T4.COMPANYNAME), T4.CUSTID, YEAR, GROSS_SALES
FROM
(SELECT R.CUSTID AS CUSTID, YEAR(R.ORDERDATE) AS YEAR, SUM(O.UNITPRICE * O.QTY * (1 - O.DISCOUNT)) AS GROSS_SALES
FROM SALES.ORDERDETAILS O
RIGHT JOIN SALES.ORDERS R ON R.ORDERID = O.ORDERID
INNER JOIN SALES.CUSTOMERS C1 ON R.CUSTID = C1.CUSTID
GROUP BY R.CUSTID, YEAR(R.ORDERDATE)
ORDER BY YEAR(R.ORDERDATE)) T3
INNER JOIN
(SELECT C.COMPANYNAME, C.CUSTID
FROM SALES.ORDERS R
INNER JOIN SALES.CUSTOMERS C ON R.CUSTID = C.CUSTID) T4 ON T3.CUSTID = T4.CUSTID) T6 ON T5.CUSTID = T6.CUSTID
I'm not going to try to fix your code or explain the mistakes in it, as there are many of them. Instead, I wrote a query based on your requirements that solves the problem. I show it in steps below, which should make clear the process I used.
First, how do we find the customers that made at least 5 purchases of Beverages?
Take the customer table and join it to the orders with beverages (the inner joins will exclude customers that don't meet the criteria):
SELECT C.CUSTOMERID
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN PRODUCTS P ON OD.PRODUCTID = P.PRODUCTID
JOIN CATEGORIES CAT ON P.CATEGORYID = CAT.CATEGORYID AND CAT.CATEGORY_NAME = 'Beverages'
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID
HAVING COUNT(*) >= 5
Now we need the sum of orders (for 2014) per customer, which looks like this:
SELECT C.CUSTOMERID, YEAR(O.ORDER_DATE) AS YEAR, SUM(OD.UNIT_PRICE*OD.QUANTITY) AS TOTAL_SPEND
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID, YEAR(O.ORDER_DATE)
Now we just combine these two queries like this:
SELECT C.CUSTOMERID, YEAR(O.ORDER_DATE) AS YEAR, SUM(OD.UNIT_PRICE*OD.QUANTITY) AS TOTAL_SPEND
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN (
SELECT C.CUSTOMERID
FROM CUSTOMERS C
JOIN ORDERS O ON C.CUSTOMERID = O.CUSTOMERID
JOIN ORDER_DETAILS OD ON O.ORDERID = OD.ORDERID
JOIN PRODUCTS P ON OD.PRODUCTID = P.PRODUCTID
JOIN CATEGORIES CAT ON P.CATEGORYID = CAT.CATEGORYID AND CAT.CATEGORY_NAME = 'Beverages'
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID
HAVING COUNT(*) >= 5
) as SUB ON SUB.CUSTOMERID = C.CUSTOMERID
WHERE YEAR(O.ORDER_DATE) = 2014
GROUP BY C.CUSTOMERID, YEAR(O.ORDER_DATE)
-- DESC so that LIMIT keeps the ten biggest spenders, not the ten smallest
ORDER BY SUM(OD.UNIT_PRICE*OD.QUANTITY) DESC
LIMIT 10
Note: I did not test this, since I don't have a database to test against, so there might be typos.
Also note: I expect it is possible to remove the subquery, since it repeats many of the joins in the outer query, but we want to make sure we get the correct result, and it is easier to see that it is correct this way. You can also run the subquery by itself to check that it returns the expected results.
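One way the subquery might be folded into the outer query is conditional aggregation in the HAVING clause. The sketch below is untested against the asker's schema: it uses a hypothetical flattened table `order_lines` with made-up rows in place of the five-table join, a half-open date range instead of YEAR(), and the same MySQL-style LIMIT as above.

```sql
-- Hypothetical flattened stand-in for CUSTOMERS/ORDERS/ORDER_DETAILS/PRODUCTS/CATEGORIES.
CREATE TABLE order_lines (
    customerid    INT,
    order_date    DATE,
    category_name VARCHAR(30),
    unit_price    DECIMAL(10, 2),
    quantity      INT
);

INSERT INTO order_lines VALUES
    (1, '2014-01-10', 'Beverages', 10.00, 1),
    (1, '2014-02-10', 'Beverages', 10.00, 1),
    (1, '2014-03-10', 'Beverages', 10.00, 1),
    (1, '2014-04-10', 'Beverages', 10.00, 1),
    (1, '2014-05-10', 'Beverages', 10.00, 1),
    (1, '2014-06-10', 'Seafood',   50.00, 2),
    (2, '2014-01-15', 'Beverages', 20.00, 1),
    (2, '2014-02-15', 'Beverages', 20.00, 1),
    (2, '2013-12-31', 'Beverages', 20.00, 9);

SELECT customerid,
       2014 AS order_year,
       SUM(unit_price * quantity) AS total_spend   -- all categories count toward spend
FROM order_lines
-- Half-open range selects 2014 without wrapping the column in a function.
WHERE order_date >= '2014-01-01' AND order_date < '2015-01-01'
GROUP BY customerid
-- Count only the Beverages lines when checking the "at least 5" rule.
HAVING SUM(CASE WHEN category_name = 'Beverages' THEN 1 ELSE 0 END) >= 5
ORDER BY total_spend DESC
LIMIT 10;
```

On this sample data only customer 1 qualifies, with a total spend of 150.00; customer 2 has just two Beverages lines in 2014. In the real schema the FROM clause would be the full join, and the approach assumes every order line has a product and category, since inner joins would otherwise drop lines from the total.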

How to obtain two values from a sub-query to be used in the where clause of sql

I have to use two values in the where clause to be tested for equality of two values obtained from a subquery. Since I am working on an existing application, I want to keep it as a subquery. The following is my query.
SELECT
o.EMAIL_ADDRESS, c.FIRST_NAME, p.PARTY_ID
FROM
ORDER o WITH (NOLOCK)
INNER JOIN
PARTY p WITH (NOLOCK) ON o.ORDER_ID = p.PARTY_ID
INNER JOIN
CUSTOMER c WITH (NOLOCK) ON p.PARTY_ID = c.CUSTOMER_ID
WHERE
(o.EMAIL_ADDRESS, c.CUSTOMER_ID) IN (SELECT EMAIL_ADDRESS, CUSTOMER_ID
FROM CUSTOMER_MASTER
WHERE insert_date > '01/02/2019')
The problem I am facing is that the first value within the where clause, o.EMAIL_ADDRESS, throws the following error:
An expression of non-boolean type specified in a context where a condition is expected
When I use a single value within the where clause it works fine.
One method is with EXISTS and a correlated subquery.
SELECT o.EMAIL_ADDRESS, c.FIRST_NAME, p.PARTY_ID
FROM ORDER_HEADER o WITH (NOLOCK)
INNER JOIN PARTY p WITH (NOLOCK) ON o.ORDER_ID = p.PARTY_ID
INNER JOIN CUSTOMER c WITH (NOLOCK) ON p.PARTY_ID = c.CUSTOMER_ID
WHERE EXISTS(
SELECT 1
FROM CUSTOMER_MASTER AS cm
WHERE cm.insert_date > '01/02/2019'
AND o.EMAIL_ADDRESS = cm.EMAIL_ADDRESS
AND c.CUSTOMER_ID = cm.CUSTOMER_ID
);
In SQL Server, IN accepts only a single-column list, so combine the two values into one column (with a delimiter, so that different pairs cannot concatenate to the same string).
SELECT o.EMAIL_ADDRESS, c.FIRST_NAME, p.PARTY_ID
FROM ORDER o WITH (NOLOCK)
INNER JOIN PARTY p WITH (NOLOCK) ON o.ORDER_ID = p.PARTY_ID
INNER JOIN CUSTOMER c WITH (NOLOCK) ON p.PARTY_ID = c.CUSTOMER_ID
WHERE o.EMAIL_ADDRESS + '|' + cast(c.CUSTOMER_ID as varchar(20))
IN (select EMAIL_ADDRESS + '|' + cast(CUSTOMER_ID as varchar(20))
from CUSTOMER_MASTER where insert_date > '01/02/2019')
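One caveat with the concatenation approach: without a delimiter, two different (email, id) pairs can produce the same string and match by accident. A small sketch with made-up values, assuming SQL Server 2012+ or PostgreSQL (both accept a VALUES derived table and CONCAT):

```sql
-- Two different (email, id) pairs, both invented for illustration.
SELECT t.email,
       t.id,
       CONCAT(t.email, CAST(t.id AS VARCHAR(10)))      AS no_delimiter,
       CONCAT(t.email, '|', CAST(t.id AS VARCHAR(10))) AS with_delimiter
FROM (VALUES ('a@x.com1', 2),
             ('a@x.com', 12)) AS t(email, id);
```

Both rows give `a@x.com12` in `no_delimiter`, while `with_delimiter` keeps them apart (`a@x.com1|2` and `a@x.com|12`). The delimiter should be a character that cannot appear in either value.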
Use EXISTS. I would also advise fixing the date format:
SELECT o.EMAIL_ADDRESS, c.FIRST_NAME, p.PARTY_ID
FROM ORDER o JOIN
PARTY p
ON o.ORDER_ID = p.PARTY_ID JOIN
CUSTOMER c
ON p.PARTY_ID = c.CUSTOMER_ID
WHERE EXISTS (SELECT 1
FROM CUSTOMER_MASTER cm
WHERE cm.EMAIL_ADDRESS = o.EMAIL_ADDRESS AND
cm.CUSTOMER_ID = c.CUSTOMER_ID AND
cm.insert_date > '2019-02-01'
);
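Your date constants should be in an unambiguous format such as YYYYMMDD (or the ISO-style YYYY-MM-DD used above), rather than '01/02/2019', whose meaning depends on the session's date settings.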
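To illustrate why the literal format matters, here is a short T-SQL-only sketch. It assumes SQL Server, where SET DATEFORMAT controls how ambiguous date strings are parsed; the column aliases are just labels for this example.

```sql
-- With day-month-year parsing, the slash literal means 1 February 2019.
SET DATEFORMAT dmy;
SELECT CAST('01/02/2019' AS DATE) AS slash_dmy,
       CAST('20190201'   AS DATE) AS compact_dmy;

-- With month-day-year parsing, the same slash literal means 2 January 2019,
-- while the YYYYMMDD literal is still 1 February 2019.
SET DATEFORMAT mdy;
SELECT CAST('01/02/2019' AS DATE) AS slash_mdy,
       CAST('20190201'   AS DATE) AS compact_mdy;
```

The same query text can therefore filter on different dates depending on session or login language settings, which is why the compact form is the safer habit.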
Based on your logic, though, I don't think you need all the JOINs:
SELECT o.EMAIL_ADDRESS, c.FIRST_NAME, o.ORDER_ID
FROM ORDER o JOIN
CUSTOMER c
ON o.ORDER_ID = c.CUSTOMER_ID
WHERE EXISTS (SELECT 1
FROM CUSTOMER_MASTER cm
WHERE cm.EMAIL_ADDRESS = o.EMAIL_ADDRESS AND
cm.CUSTOMER_ID = c.CUSTOMER_ID AND
cm.insert_date > '2019-02-01'
);
The table PARTY doesn't seem necessary, because the orders can align directly to the customers.

Is it possible to get one row by grouping more than one column

I have a query as below. DB from http://www.w3schools.com/sql/default.asp
SELECT count(distinct C.CustomerID),C.Country
FROM Customers C
inner join Orders O
on C.CustomerID = O.CustomerID
inner join OrderDetails D
on O.OrderID = D.OrderID
inner join Products P
on D.ProductID = P.ProductID
group by C.Country,P.CategoryID
order by C.Country
Here is the result from above.
But I want to get one row per country (as in the picture below), counting CustomerIDs that are in the same country and also share the same CategoryID. So I have to group by two columns. Is there any way to do it? Could you please advise?
Thank you.
That's quite simple. Just remove P.CategoryID from your GROUP BY clause.
SELECT COUNT(DISTINCT C.CustomerID), C.Country
FROM Customers C
INNER JOIN Orders O
ON C.CustomerID = O.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
GROUP BY C.Country
ORDER BY C.Country;
Update
Following your comment, this should be the correct approach then:
SELECT T.Country, SUM(T.Cnt)
FROM (
SELECT COUNT(DISTINCT C.CustomerID) AS Cnt, C.Country
FROM Customers C
INNER JOIN Orders O
ON C.CustomerID = O.CustomerID
INNER JOIN OrderDetails D
ON O.OrderID = D.OrderID
INNER JOIN Products P
ON D.ProductID = P.ProductID
GROUP BY C.Country, P.CategoryID
) AS T
GROUP BY T.Country
ORDER BY T.Country;
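The two queries in this answer count different things, which a tiny example may make clearer. The table `purchases` and its three rows are hypothetical, standing in for the joined Customers/Orders/OrderDetails/Products result.

```sql
-- Hypothetical flattened join result: customer 1 bought from two categories.
CREATE TABLE purchases (CustomerID INT, Country VARCHAR(20), CategoryID INT);

INSERT INTO purchases VALUES
    (1, 'Austria', 1),
    (1, 'Austria', 2),
    (2, 'Austria', 1);

-- Grouping by country only: each customer is counted once per country.
SELECT Country, COUNT(DISTINCT CustomerID) AS Customers
FROM purchases
GROUP BY Country;

-- Summing per-category counts: a customer is counted once per category bought.
SELECT T.Country, SUM(T.Cnt) AS CustomerCategoryPairs
FROM (
    SELECT Country, CategoryID, COUNT(DISTINCT CustomerID) AS Cnt
    FROM purchases
    GROUP BY Country, CategoryID
) AS T
GROUP BY T.Country;
```

On this data the first query returns 2 for Austria and the second returns 3, because customer 1 appears under both categories. Which number is wanted depends on whether the goal is distinct customers or distinct customer/category pairs.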

LEFT OUTER JOIN EQUIVALENT

I have tables that contain null values. In the ORDER table there are 2 nulls in the PART_ID column and 2 nulls in the CUST_ID column.
And I have this query:
SELECT O.ORDER_ID , O.ORDER_DATE , O.CUST_ID, O.QUANTITY ,O.PART_ID ,
C.CUST_NAME, C.CUST_CODE, P.PART_NAME, P.PART_CODE
FROM [ORDER] O
LEFT OUTER JOIN PART P ON P.PART_ID = O.PART_ID
LEFT OUTER JOIN CUSTOMER C ON C.CUST_ID = O.CUST_ID
So here is my question: how can I do this without using an outer join?
I have tried many things, including WHERE NOT EXISTS and this:
SELECT *
FROM [ORDER] O ,CUSTOMER C, PART P
WHERE C.CUST_ID = (
SELECT CUST_ID FROM CUSTOMER C WHERE O.CUST_ID = C.CUST_ID
) AND P.PART_ID = (SELECT PART_ID FROM PART P WHERE O.PART_ID = P.PART_ID)
but I couldn't find a solution. If there is one, what would it look like?
(Note: this is homework.)
My table looks like this:
and the left outer join gives this:
The homework said to do it without using an outer join and get the same table that the left outer join gives. But like I said, I couldn't. I'm also using MSSQL.
An outer join produces a superset of the inner join's result. Indeed, from Wikipedia: a left outer join returns all the values from an inner join plus all values in the left table that do not match the right table.
So to model a left outer join using an inner join, one can take the UNION of two SELECTs: an inner join between the same tables with the same join condition, and a SELECT from the first table that returns every row without a match in the right table (I reduced your case to a single left join):
SELECT O.ORDER_ID , O.ORDER_DATE , O.CUST_ID, O.QUANTITY ,O.PART_ID ,
P.PART_NAME, P.PART_CODE
FROM [ORDER] O JOIN PART P ON P.PART_ID = O.PART_ID
UNION ALL -- the two branches never overlap, and ALL keeps duplicate rows as a LEFT JOIN would
SELECT O.ORDER_ID , O.ORDER_DATE , O.CUST_ID, O.QUANTITY ,O.PART_ID ,
NULL, NULL
FROM [ORDER] O
WHERE NOT EXISTS (SELECT 'found' FROM PART P WHERE P.PART_ID = O.PART_ID)
Presumably, you want to get matches to the columns with NULL values, instead of having them fail. If so, just modify the join conditions:
FROM [ORDER] O
LEFT OUTER JOIN PART P
ON P.PART_ID = O.PART_ID or (p.Part_id is NULL and o.Part_id is null)
LEFT OUTER JOIN CUSTOMER C
ON C.CUST_ID = O.CUST_ID or (c.cust_id is null and o.cust_id is null)
The major issue with this approach is that many (most?) SQL engines will not use indexes for the join.
Is there a specific reason why you don't want to use an outer join? Isn't this the result you want?
FROM [ORDER] O
LEFT JOIN PART P
ON P.PART_ID = O.PART_ID and P.PART_ID is not null
LEFT JOIN CUSTOMER C
ON C.CUST_ID = O.CUST_ID and C.CUST_ID is not null
So the complete answer should look like this (after my teacher gave the results):
SELECT O.ORDER_ID,O.ORDER_DATE,O.CUST_ID,
(SELECT C.CUST_CODE FROM CUSTOMER C WHERE C.CUST_ID=O.CUST_ID) AS CUST_CODE,
(SELECT C.CUST_NAME FROM CUSTOMER C WHERE C.CUST_ID=O.CUST_ID) AS CUST_NAME,
O.PART_ID,
(SELECT P.PART_CODE FROM PART P WHERE P.PART_ID = O.PART_ID ) AS PART_CODE,
(SELECT P.PART_NAME FROM PART P WHERE P.PART_ID = O.PART_ID ) AS PART_NAME,
O.QUANTITY
FROM [ORDER] O
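The scalar-subquery approach can be checked on a small example. The tables `PART_DEMO` and `ORDER_DEMO` and their rows are hypothetical; the sketch assumes PART_ID is unique in the lookup table, since a scalar subquery that finds more than one row raises an error in SQL Server.

```sql
-- Hypothetical miniature versions of PART and ORDER.
CREATE TABLE PART_DEMO  (PART_ID INT PRIMARY KEY, PART_NAME VARCHAR(20));
CREATE TABLE ORDER_DEMO (ORDER_ID INT PRIMARY KEY, PART_ID INT);

INSERT INTO PART_DEMO  VALUES (10, 'Bolt');
INSERT INTO ORDER_DEMO VALUES (1, 10), (2, NULL), (3, 99);

-- A correlated scalar subquery yields NULL when nothing matches,
-- so every order row is kept, just as with a LEFT OUTER JOIN.
SELECT O.ORDER_ID,
       O.PART_ID,
       (SELECT P.PART_NAME
        FROM PART_DEMO P
        WHERE P.PART_ID = O.PART_ID) AS PART_NAME
FROM ORDER_DEMO O
ORDER BY O.ORDER_ID;
```

Order 1 gets 'Bolt', while order 2 (NULL PART_ID) and order 3 (no such part) both get a NULL PART_NAME. The trade-off is one subquery per looked-up column, which is why the answer above repeats the lookup for the code and the name.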

order sql query by name

I have a query which I would like to tweak little bit to display different info.
Currently my query gets all the orders with products ranked by the one with most conversions at the top.
Here is the query:
SELECT nopv.ProductVariantID, COUNT(nopv.ProductVariantID), p.ProductId, c.CategoryID, c.Name FROM Nop_OrderProductVariant nopv
INNER JOIN Nop_ProductVariant npv
ON nopv.ProductVariantID = npv.ProductVariantId
INNER JOIN Nop_Product p
ON npv.ProductID = p.ProductId
INNER JOIN Nop_Product_Category_Mapping npcm
ON p.ProductId = npcm.ProductID
INNER JOIN Nop_Category c
ON npcm.CategoryID = c.CategoryID
GROUP BY nopv.ProductVariantID, p.ProductId, c.CategoryID, c.Name
HAVING COUNT(*) > 0
ORDER BY COUNT(nopv.ProductVariantID) DESC
What I have as a result is:
I want each category to appear only once. For example, the "programmers & modules" category should have only one record, containing the total across all the ProductVariantIDs in that category. The first field can be dropped as well, because if there are multiple product variants, the query only needs to show one row. What I really need is the count for each category and the CategoryID.
Thanks in advance, Laziale
Simply remove the Variant and ProductID from both the select and Group By.
SELECT
COUNT(nopv.ProductVariantID) ,
c.CategoryID ,
c.Name
FROM
Nop_OrderProductVariant nopv
INNER JOIN Nop_ProductVariant npv
ON
nopv.ProductVariantID = npv.ProductVariantId
INNER JOIN Nop_Product p
ON
npv.ProductID = p.ProductId
INNER JOIN Nop_Product_Category_Mapping npcm
ON
p.ProductId = npcm.ProductID
INNER JOIN Nop_Category c
ON
npcm.CategoryID = c.CategoryID
GROUP BY
c.CategoryID ,
c.Name
HAVING
COUNT(*) > 0
ORDER BY
COUNT(nopv.ProductVariantID) DESC
I think the issue is your group by:
GROUP BY nopv.ProductVariantID, p.ProductId, c.CategoryID, c.Name
Try:
GROUP BY c.CategoryID, c.Name -- c.Name is here since you probably can't select it otherwise
Then make whatever changes you need to your SELECT so it will work.
So something like this:
SELECT COUNT(nopv.ProductVariantID), c.CategoryID, c.Name
FROM Nop_OrderProductVariant nopv
INNER JOIN Nop_ProductVariant npv
ON nopv.ProductVariantID = npv.ProductVariantId
INNER JOIN Nop_Product p
ON npv.ProductID = p.ProductId
INNER JOIN Nop_Product_Category_Mapping npcm
ON p.ProductId = npcm.ProductID
INNER JOIN Nop_Category c
ON npcm.CategoryID = c.CategoryID
GROUP BY c.CategoryID, c.Name
HAVING COUNT(*) > 0
ORDER BY COUNT(nopv.ProductVariantID) DESC
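One thing to keep in mind with a per-category count through a mapping table: a product mapped to several categories is counted once in each of them. A minimal sketch, using hypothetical tables `demo_orders` and `demo_mapping` rather than the real Nop_ tables:

```sql
-- Hypothetical stand-ins for the order lines and the product/category mapping.
CREATE TABLE demo_orders  (OrderLineID INT, ProductID INT);
CREATE TABLE demo_mapping (ProductID INT, CategoryID INT);

INSERT INTO demo_orders  VALUES (1, 100), (2, 100), (3, 200);
INSERT INTO demo_mapping VALUES (100, 1), (100, 2), (200, 1);

-- Product 100 belongs to categories 1 and 2, so its two order lines
-- appear in both groups.
SELECT m.CategoryID, COUNT(o.OrderLineID) AS OrderLines
FROM demo_orders o
INNER JOIN demo_mapping m ON o.ProductID = m.ProductID
GROUP BY m.CategoryID
ORDER BY OrderLines DESC;
```

Here category 1 shows 3 order lines and category 2 shows 2, for a total of 5 even though only 3 order lines exist. That is usually the intended per-category figure, but the category counts should not be added together to get an overall total.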