SQL INNER JOIN Without Repeats

SQL INNER JOIN Without Repeats - sql

Getting the next table:
Column1 - OrderID - Earliest orders of customers from Column2
Column2 - CustomerID - Customers from orders in Column1
Column3 - OrderID - All *Other* orders of customers from Column2
which do not appear in Column1
This is my query and I'm looking for a way to apply the rules mentioned above:
SELECT O1.orderid, C1.customerid, O2.Orderid
FROM orders AS O1
INNER JOIN customers AS C1 ON O1.customerid = C1.customerid
RIGHT JOIN orders AS O2 ON C1.customerid = O2.customerid
WHERE O1.orderdate >= '2014-01-01'
AND O1.orderdate <= '2014-03-31'
ORDER BY O1.orderid
Thanks in advance

Not entirely sure why you want to get a result out like this as the earliest order will repeat for each order for the given customer.
SELECT earliestOrders.orderid, C1.customerid, O1.Orderid
FROM orders AS O1
INNER JOIN customers AS C1 ON O1.customerid = C1.customerid
INNER JOIN (
select o.customerid, min(o.OrderId) as OrderId
from orders o
Group by o.customerid
) earliestOrders
ON earliestOrders.CustomerId = C1.CustomerId
AND earliestOrders.orderid <> O1.Orderid

To find the first order per customer, look for first order dates per customer and then pick the one or one of the orders made by the customer then. (If orderdate really is just a date one customer can have placed more than one order that day, so we pick one of them. With MIN(orderid) we are likely to get the first one of that bunch :-)
Outer join the other orders and you are done.
If your dbms supports IN clauses on tuples, you get a quite readable statement:
select first_order.orderid, first_order.customerid, later_order.orderid
from
(
select customerid, min(first_order.orderid) as first_orderid
from orders
where (customerid, orderdate) in
(
select customerid, min(orderdate)
from orders
group by cutomerid
)
) first_order
left join orders later_order
on later_order.customerid = first_order.customerid
and later_order.orderid <> first_order.orderid
;
If your dbms doesn't support IN clauses on tuples, the statement looks a bit more clumsy:
select first_order.orderid, first_order.customerid, later_order.orderid
from
(
select first_orders.customerid, min(first_orders.orderid) as orderid
from orders first_orders
inner join
(
select customerid, min(orderdate)
from orders
group by cutomerid
) first_order_dates
on first_order_dates.customerid = first_orders.customerid
and first_order_dates.orderdate = first_orders.orderdate
group by first_orders.customerid
) first_order
left join orders later_order
on later_order.customerid = first_order.customerid
and later_order.orderid <> first_order.orderid
;

Related

How can i get all the MAX values from a certain column in a dataset in PostgreSQL

I'm asked to find the top user for different countries, however, one of the countries has 2 users with the same amount spent so they should both be the top users, but I can't get the max value for 2 values in this country.
Here is the code:
WITH t1 AS (
SELECT c.customerid,SUM(i.total) tot
FROM invoice i
JOIN customer c ON c.customerid = i.customerid
GROUP BY 1
ORDER BY 2 DESC
),
t2 AS (
SELECT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,MAX(t1.tot) as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
GROUP BY 4
ORDER BY 4
)
SELECT *
FROM t2
BILLINGCOUNTRY is in Invoice, and it has the name of all the countries.
TOTAL is also in invoice and it shows how much is spent for each purchase by Customer (so there are different fees and taxes for each purchase and total shows the final price payed by the user at each time)
Customer has id,name,last name and from its' ID I'm extracting the total of each of his purchases
MAX was used after finding the sum for each Customer and it was GROUPED BY country so that i could find the max for each country, however I can't seem to find the max of the last country that had 2 max values

Use rank() or dense_rank():
SELECT c.*, i.tot
FROM (SELECT i.customerid, i.billingCountry, SUM(i.total) as tot,
RANK() OVER (PARTITION BY i.billingCountry ORDER BY SUM(i.total) DESC) as seqnum
FROM invoice i
GROUP BY 1, 2
) i JOIN
customer c
ON c.customerid = i.customerid
WHERE seqnum = 1;
The subquery finds the amount per customer in each country -- and importantly calculates a ranking for the combination with ties having the same rank. The outer query just brings in the additional customer information that you seem to want.

here is how it worked for me since i was restricted from using many Commands such RIGHT JOIN and RANK() (As what Gordon Linoff suggessted) so i had to create a 3rd case for the anamoly and join it using union. this solution works only on this case, the good solution is the one posted by Gordon Linoff:
WITH t1 AS (
SELECT c.customerid,SUM(i.total) tot
FROM invoice i
JOIN customer c ON c.customerid = i.customerid
GROUP BY 1
ORDER BY 2 DESC
),
t2 AS (
SELECT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,MAX(t1.tot) as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
GROUP BY 4
ORDER BY 4
) ,
t3 AS (
SELECT DISTINCT c.customerid as CustomerId ,c.firstname as FirstName,c.lastname as LastName, i.billingcountry as Country,t1.tot as TotalSpent
FROM t1
JOIN customer c
ON c.customerid = t1.customerid
JOIN invoice i ON i.customerid = c.customerid
WHERE i.billingcountry = 'United Kingdom'
ORDER BY t1.tot DESC
LIMIT 2
)
SELECT *
FROM t2
UNION
SELECT * FROM t3
ORDER BY t2.country

Only return rows after the exists condtion

I have an SQL server Query
This returns orders from customers of products that are related / added to their OrderID (Final Invoice)
This uses an exists condition
Select * from Orders o1
where DepartmentSpecialty = 'LivingRoom'
and Exists (SELECT o2.Department FROM Orders o2 WHERE o2.Department = 'Kitchen'
and o1.ID = o2.ID
and o1.OrderID = o2.OrderID
)
I only wish to bring back rows for the order dates AFTER they have ordered from the Kitchen Department in relation to their OrderID. This whom ordered from the Living Room department.
Any ideas team that I can amend the SQL to do this please

You can do this without a join or subquery, by using window functions:
select o.*
from (select o.*,
min(case when o.Department = 'Kitchen'
then date
end) over (partition by id, orderid) as kitchen_date
from orders o
) o
where o.DepartmentSpecialty = 'LivingRoom' and
o.date > o.kitchen_date;
I suspect that your original join conditions (and hence the partition by columns) are too restrictive. I would expect these to be some sort of customer id.

something a bit like this perhaps
Select * from Orders o1
where DepartmentSpecialty = 'LivingRoom'
and o1.orderdate >= (SELECT MIN(o2.orderdate) FROM Orders o2 WHERE o2.Department = 'Kitchen'
and o1.ID = o2.ID
and o1.OrderID = o2.OrderID)

This should do the trick
SELECT o1.*
FROM Orders AS o1
INNER JOIN Orders AS o2
ON o1.OrderId = o2.OrderId
AND o1.[SomeDateField] > o2.[SomeDateField] -- You will need to specify a date field
WHERE o1.DepartmentSpecialty = 'LivingRoom' -- Department Your interested in
AND o2.DepartmentSpeciality = 'Kitchen' -- Inital Department

SQL EXISTS returns all rows, more than two tables

I know similar questions like this have been asked before, but I have not seen one for more than 2 tables. And there seems to be a difference.
I have three tables from which I need fields, customers where I need customerID and orderID from, orders from which I get customerID and orderID and lineitems from which I get orderID and quantity (= quantity ordered).
I want to find out how many customers bought more than 2 of the same item, so basically quantity > 2 with:
SELECT COUNT(DISTINCT custID)
FROM customers
WHERE EXISTS(
SELECT *
FROM customers C, orders O, lineitems L
WHERE C.custID = O.custID AND O.orderID = L.orderID AND L.quantity > 2
);
I do not understand why it is returning me the count of all rows. I am correlating the subqueries before checking the >2 condition, am I not?
I am a beginner at SQL, so I'd be thankful if you could explain it to me fundamentally, if necessary. Thanks.

You don't have to repeat customers table in the EXISTS subquery. This is the idea of correlation: use the table of the outer query in order to correlate.
SELECT COUNT(DISTINCT custID)
FROM customers c
WHERE EXISTS(
SELECT *
FROM orders O
JOIN lineitems L ON O.orderID = L.orderID
WHERE C.custID = O.custID AND L.quantity > 2
);

I would approach this as two aggregations:
select count(distinct customerid)
from (select o.customerid, l.itemid, count(*) as cnt
from lineitems li join
orders o
on o.orderID = l.orderId
group by o.customerid, l.itemid
) ol
where cnt >= 2;
The inner query counts the number of items that each customer has purchased. The outer counts the number of customers.
EDIT:
I may have misunderstood the question for the above answer. If you just want where quantity >= 2, then that is much easier:
select count(distinct o.customerid)
from lineitems li join
orders o
on o.orderID = l.orderId
where l.quantity >= 2;
This is probably the simplest way to express the query.

I suggest you to use "joins" ,
Try this
select
count(*)
From
orders o
inner join
lineitems l
on
l.orderID = o.orderID
where
l.quantity > 2

Getting max value before given date

I am pretty new to using MS SQL 2012 and I am trying to create a query that will:
Report the order id, the order date and the employee id that processed the order
report the maximum shipping cost among the orders processed by the same employee prior to that order
This is the code that I've come up with, but it returns the freight of the particular order date. Whereas I am trying to get the maximum freight from all the orders before the particular order.
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o
inner join
(
select employeeid, orderdate, max(freight) as maxfreight
from orders
group by EmployeeID, OrderDate
) t2
on o.EmployeeID = t2.EmployeeID
inner join
(
select employeeid, max(orderdate) as mostRecentOrderDate
from Orders
group by EmployeeID
) t3
on t2.EmployeeID = t3.EmployeeID
where o.freight = t2.maxfreight and t2.orderdate < t3.mostRecentOrderDate

Step one is to read the order:
select o.employeeid, o.orderid, o.orderdate
from orders o
where o.orderid = #ParticularOrder;
That gives you everything you need to go out and get the previous orders from the same employee and join each one to the row you get from above.
select o.employeeid, o.orderid, o.orderdate, o2.freight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder;
Now you have a whole bunch of rows with the first three values the same and the fourth is the freight cost of each previous order. So just group by the first three fields and select the maximum of the previous orders.
select o.employeeid, o.orderid, o.orderdate, max( o2.freight ) as maxfreight
from orders o
join orders o2
on o2.employeeid = o.employeeid
and o2.orderdate < o.orderdate
where o.orderid = #ParticularOrder
group by o.employeeid, o.orderid, o.orderdate;
Done. Build your query in stages and many times it will turn out to be much simpler than you at first thought.

It is unclear why you are using t3. From the question it doesn't sound like the employee's most recent order date is relevant at all, unless I am misunderstanding (which is absolutely possible).
I believe the issue lies in t2. You are grouping by orderdate, which will return the max freight for that date and employeeid, as you describe. You need to calculate a maximum total from all orders that occurred before the date that the order occurred on, for that employee, for every row you are returning.
It probably makes more sense to use a subquery for this.
SELECT o.employeeid, o.orderid, o.orderdate, m.maxfreight
FROM
orders o LEFT OUTER JOIN
(SELECT max(freight) as maxfreight
FROM orders AS f
WHERE f.orderdate <= o.orderdate AND f.employeeid = o.employeeid
) AS m
Hoping this is syntactically correct as I'm not in front of SSMS right now. I also included a left outer join as your previous query with an inner join would have excluded any rows where an employee had no previous orders (i.e. first order ever).

You can do what you want with a correlated subquery or apply. Here is one way:
select o.employeeid, o.orderid, o.orderdate, t2.maxfreight
from orders o outer apply
(select max(freight) as maxfreight
from orders o2
where o2.employeeid = o.employeid and
o2.orderdate < o.orderdate
) t2;
In SQL Server 2012+, you can also do this with a cumulative maximum:
select o.employeeid, o.orderid, o.orderdate,
max(freight) over (partition by employeeid
order by o.orderdate rows between unbounded preceding and 1 preceding
) as maxfreight
from orders o;

SQL Query for counting number of orders per customer and Total Dollar amount

I have two tables
Order with columns:
OrderID,OrderDate,CID,EmployeeID
And OrderItem with columns:
OrderID,ItemID,Quantity,SalePrice
I need to return the CustomerID(CID), number of orders per customer, and each customers total amount for all orders.
So far I have two separate queries. One gives me the count of customer orders....
SELECT CID, Count(Order.OrderID) AS TotalOrders
FROM [Order]
Where CID = CID
GROUP BY CID
Order BY Count(Order.OrderID) DESC;
And the other gives me the total sales. I'm having trouble combining them...
SELECT CID, Sum(OrderItem.Quantity*OrderItem.SalePrice) AS TotalDollarAmount
FROM OrderItem, [Order]
WHERE OrderItem.OrderID = [Order].OrderID
GROUP BY CID
I'm doing this in Access 2010.

You would use COUNT(DISTINCT ...) in other SQL engines:
SELECT CID,
Count(DISTINCT O.OrderID) AS TotalOrders,
Sum(OI.Quantity*OI.SalePrice) AS TotalDollarAmount
FROM [Order] O
INNER JOIN [OrderItem] OI
ON O.OrderID = OI.OrderID
GROUP BY CID
Order BY Count(DISTINCT O.OrderID) DESC
Which Access unfortunately does not support. Instead you can first get the Order dollar amounts and then join them before figuring the order counts:
SELECT CID,
COUNT(Orders.OrderID) AS TotalOrders,
SUM(OrderAmounts.DollarAmount) AS TotalDollarAmount
FROM [Orders]
INNER JOIN (SELECT OrderID, Sum(Quantity*SalePrice) AS DollarAmount
FROM OrderItems GROUP BY OrderID) AS OrderAmounts
ON Orders.OrderID = OrderAmounts.OrderID
GROUP BY CID
ORDER BY Count(Orders.OrderID) DESC
If you need to include Customers that have orders with no items (unusual but possible), change INNER JOIN to LEFT OUTER JOIN.

Create a query which uses your 2 existing queries as subqueriers, and join the 2 subqueries on CID. Define your ORDER BY in the parent query instead of in a subquery.
SELECT
sub1.CID,
sub1.TotalOrders,
sub2.TotalDollarAmount
FROM
(
SELECT
CID,
Count(Order.OrderID) AS TotalOrders
FROM [Order]
GROUP BY CID
) AS sub1
INNER JOIN
(
SELECT
CID,
Sum(OrderItem.Quantity*OrderItem.SalePrice)
AS TotalDollarAmount
FROM OrderItem INNER JOIN [Order]
ON OrderItem.OrderID = [Order].OrderID
GROUP BY CID
) AS sub2
ON sub1.CID = sub2.CID
ORDER BY sub1.TotalOrders DESC;

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

SQL INNER JOIN Without Repeats - sql

Related

How can i get all the MAX values from a certain column in a dataset in PostgreSQL

Only return rows after the exists condtion

SQL EXISTS returns all rows, more than two tables

Getting max value before given date

SQL Query for counting number of orders per customer and Total Dollar amount

Categories

Resources