select orders from first time customers - sql

I need help building a SQL query that returns orders from customers who have only ordered once.
The tables and relevant fields are as follows:
Order          Customer
-------        -----------
orderId        customerId
orderDate
customerId
etc.
I'm looking for a result set of Order records where there is only one occurrence of the customer id. For the following data set...
[orderId]    [customerId]   [orderDate]    [etc.]
----------   ------------   ------------   ------------
o1           c1             1/1/14         foo
o2           c2             1/1/14         baz
o3           c3             1/3/14         bar
o4           c2             1/3/14         wibble
I would like the results to be
[orderId]    [orderDate]    [etc.]
---------    -----------    ------
o1           1/1/14         foo
o3           1/3/14         bar
Orders o2 and o4 are omitted because c2 has ordered twice.
Any help would be greatly appreciated.
Sorry, didn't put my failed attempt. This is what I tried...
SELECT customerId,
orderId,
orderDate,
Count(*)
FROM Orders
GROUP BY orderId,
orderDate,
customerID
HAVING Count(*) = 1
ORDER BY orderId
It appears to return all the orders.
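A likely reason the attempt returns every order: orderId is unique, so grouping by it puts each row into its own group and COUNT(*) is always 1. The sketch below checks this with Python's built-in sqlite3 module and the sample rows from the question; the in-memory database and the helper name make_orders_db are only for illustration.

```python
import sqlite3


def make_orders_db():
    """Build an in-memory Orders table holding the question's sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (orderId TEXT, customerId TEXT, orderDate TEXT)"
    )
    conn.executemany(
        "INSERT INTO Orders VALUES (?, ?, ?)",
        [
            ("o1", "c1", "1/1/14"),
            ("o2", "c2", "1/1/14"),
            ("o3", "c3", "1/3/14"),
            ("o4", "c2", "1/3/14"),
        ],
    )
    return conn


conn = make_orders_db()

# Grouping by orderId: every group holds exactly one row, so HAVING COUNT(*) = 1
# cannot filter anything out.
per_order = conn.execute(
    "SELECT orderId, COUNT(*) FROM Orders GROUP BY orderId ORDER BY orderId"
).fetchall()

# Grouping by customerId only: the count now reflects orders per customer.
per_customer = conn.execute(
    "SELECT customerId, COUNT(*) FROM Orders GROUP BY customerId ORDER BY customerId"
).fetchall()

print(per_order)
print(per_customer)
```

Only the second grouping separates c2 (two orders) from c1 and c3 (one order each).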

Try the following (assuming SQL Server 2005+):
;WITH CTE AS
(
SELECT *,
N = COUNT(*) OVER(PARTITION BY customerId)
FROM Orders
)
SELECT *
FROM CTE
WHERE N = 1
Since sometimes a pedestrian approach is preferred over complex CTEs, you can use a derived table if you want (but since it's using the OVER clause, you'll still need SQL Server 2005+):
SELECT *
FROM ( SELECT *,
N = COUNT(*) OVER(PARTITION BY customerId)
FROM Orders) T
WHERE N = 1
Alternatively (for example, if you are on a version of SQL Server older than 2005), you can use the GROUP BY / HAVING COUNT(*) = 1 method to find customers with only one order and then join back to the Orders table (no need to wrap every column in an aggregate function):
SELECT o.*
FROM Orders o
JOIN
( SELECT customerId
FROM Orders
GROUP BY customerId
HAVING COUNT(*) = 1
) c
ON c.customerId = o.customerId ;
or use NOT EXISTS (no COUNT() needed and it works even in MySQL):
SELECT o.*
FROM Orders o
WHERE NOT EXISTS
( SELECT 1
FROM Orders c
WHERE c.customerId = o.customerId
AND c.orderId <> o.orderId
) ;
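As a quick check, the join-back and NOT EXISTS variants can be run against the question's sample data. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server; the column name etc stands in for the question's "[etc.]" column, and explicit column lists plus ORDER BY are added only to make the output predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (orderId TEXT, customerId TEXT, orderDate TEXT, etc TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?, ?)",
    [
        ("o1", "c1", "1/1/14", "foo"),
        ("o2", "c2", "1/1/14", "baz"),
        ("o3", "c3", "1/3/14", "bar"),
        ("o4", "c2", "1/3/14", "wibble"),
    ],
)

# Find customers with exactly one order, then join back to get the full rows.
JOIN_BACK = """
SELECT o.orderId, o.orderDate, o.etc
FROM Orders o
JOIN (SELECT customerId
      FROM Orders
      GROUP BY customerId
      HAVING COUNT(*) = 1) c
  ON c.customerId = o.customerId
ORDER BY o.orderId
"""

# Keep an order only if the same customer has no other order.
NOT_EXISTS = """
SELECT o.orderId, o.orderDate, o.etc
FROM Orders o
WHERE NOT EXISTS (SELECT 1
                  FROM Orders c
                  WHERE c.customerId = o.customerId
                    AND c.orderId <> o.orderId)
ORDER BY o.orderId
"""

join_back_rows = conn.execute(JOIN_BACK).fetchall()
not_exists_rows = conn.execute(NOT_EXISTS).fetchall()
print(join_back_rows)
print(not_exists_rows)
```

Both queries should return only o1 and o3, matching the result set requested in the question.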

This will list all first-time customers in your ORDERS table.
SELECT [customerID],
MIN([orderId]) AS [orderId],
MIN([orderDate]) AS [orderDate],
MIN([etc.]) AS [etc.]
FROM [Orders]
GROUP BY [customerID]
HAVING Count(*) = 1
ORDER BY [customerID]
To bring back the additional columns, you need to wrap each of them in an aggregate such as MIN or MAX.
Which one you use is arbitrary, since there will only be one row per group anyway. This does assume, however, that all columns in the table are of datatypes valid for such aggregation (examples of datatypes that aren't are BIT and XML).

Related

SQL query to get three most recent records by customer

I have a table of orders and am looking to get the three most recent orders by customer id
customer orderID orderDate
1 234 2018-01-01
1 236 2017-02-01
3 256 20157-03-01
I was able to use row_number() to number each row in the table, but is there a way to get the three most recent orders by customer id? Some customers have fewer than 3 orders while others have more than 10, so I wasn't able to filter on a fixed row number.
Does anyone have recommendations for a different option?
Here is an interesting approach using apply (and assuming you have a customers table):
select o.*
from customers c cross apply
(select top 3 o.*
from orders o
where o.customerid = c.customerid
order by orderdate desc
) o;
You could use partition by:
select customerid, orderid, orderdate from (
select t.customerid, t.orderid, t.orderdate
, row_number() over (partition by t.customerid order by t.orderDate desc) as mostRecently
from samplecustomers t) as Records
where mostRecently < 4
Use this query:
SELECT result.customer
, result.orderID
, result.orderDate
FROM
(
SELECT Temp.customer
, Temp.orderID
, Temp.orderDate
, ROW_NUMBER() OVER(PARTITION BY Temp.customer
ORDER BY Temp.orderDate DESC) AS MR
FROM YourTable AS Temp
) AS result
WHERE result.MR <= 3;
Try this:
SELECT *
FROM orders
WHERE customer = 1
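ORDER BY orderDate DESC limit 3
This returns the three most recent orders for a single customer (here customer 1). Note that LIMIT is MySQL/PostgreSQL syntax; SQL Server uses TOP instead.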
How about this:
select a.Customer, a.orderID, a.orderDate
from orders a
where a.orderID in
(
select top 3 b.orderID
from orders b
where b.Customer = a.Customer
order by b.orderDate desc
)
order by a.Customer, a.orderID, a.orderDate
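The correlated-subquery idea from the last answer can be tried out without SQL Server. The sketch below uses Python's built-in sqlite3 module, so LIMIT 3 replaces TOP 3; the sample rows are made up for illustration (customer 1 has four orders, customer 3 has one) and the final ORDER BY is changed to newest-first to make the result easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer INTEGER, orderID INTEGER, orderDate TEXT)"
)
# Hypothetical data: ISO-formatted dates sort correctly as plain text.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 234, "2018-01-01"),
        (1, 236, "2017-02-01"),
        (1, 240, "2018-03-01"),
        (1, 241, "2016-05-01"),
        (3, 256, "2015-03-01"),
    ],
)

# For each row, check whether its orderID is among the three newest
# orders of the same customer.
TOP3_PER_CUSTOMER = """
SELECT a.customer, a.orderID, a.orderDate
FROM orders a
WHERE a.orderID IN (
    SELECT b.orderID
    FROM orders b
    WHERE b.customer = a.customer
    ORDER BY b.orderDate DESC
    LIMIT 3
)
ORDER BY a.customer, a.orderDate DESC
"""

top3 = conn.execute(TOP3_PER_CUSTOMER).fetchall()
for row in top3:
    print(row)
```

Customers with fewer than three orders simply contribute fewer rows, so no special handling is needed for them.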

Query two similar tables and combine sorted results

I have three tables
orders.orderid (and other non-pertinent stuff)
payment.orderid
payment.transactiondate
payment.amount
projectedpayment.orderid
projectedpayment.projecteddate
projectedpayment.projectedamount
Essentially, payment represents when actual payments are received; projectedpayment represents when the system thinks they should be received. I need to build a query to compare projected vs actual.
I'd like to query them such that each row in the query has the orderid, payment.transactiondate, payment.amount, projectedpayment.projecteddate, projectedpayment.projectedamount, with the rows from payment and projectedpayment sorted by their respective dates. e.g.,
orderid transactiondate amount projecteddate projectedamount
1 2015-01-01 12.34 2015-01-03 12.34
1 2015-01-15 12.34 2015-01-15 12.44
1 null null 2015-02-01 12.34
2 2014-12-31 50.00 null null
So broken down by order, what are the actual and projected payments, where there may be more projected payments than actual, or more actual payments than projected, aligned by date (simply by sorting the two, nothing more complex than that).
It seems like I should be able to achieve this with a left join from orders to some kind of union of the other two tables sorted with an order by, but I haven't been able to make it work, so it may be something completely different. I know I cannot join all three of order, payment, and projectedpayment or I get the cross-product of the latter two tables.
I happen to be using postgresql 9.4, but hopefully we don't need to get too database-specific.
I don't know Postgres, sorry :( but if you know how to do partitioned row numbers, something like this should work.
select
coalesce(a.orderid,b.orderid) as orderid
,transactiondate
,amount
,projecteddate
,projectedamount
FROM
(select
orderid
,transactiondate
,amount
,row_number() over (partition by orderid order by orderid,transactiondate) as rn
from payment) as a
full join
(select
orderid
,projecteddate
,projectedamount
,row_number() over (partition by orderid order by orderid,projecteddate) as rn
from projectedpayment) as b
on a.orderid= b.orderid
and a.rn = b.rn
*This is SQL Server syntax (2005+, as far as I know).
The logic here is that you need to assign a unique number to each projected and actual payment so that you can join the two tables together while matching each row to only a single row from the other table.
If you have only one payment per day, then you could do the full join on the order ID and date without worrying about row numbers.
The full join allows nulls on either side, so you will need to coalesce orderid.
*Also, this doesn't show orders with no payments or projections. Comment if this is an issue.
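The pairing logic (number the rows on each side per order, then match equal numbers and let the shorter side run out into nulls) can be sketched outside SQL as well. The snippet below is only an illustration in plain Python using the question's sample rows; the function names by_order and align are made up, and zip_longest plays the role of the full join on the row number.

```python
from collections import defaultdict
from itertools import zip_longest

# (orderid, date, amount) rows taken from the question's example.
payments = [
    (1, "2015-01-01", 12.34),
    (1, "2015-01-15", 12.34),
    (2, "2014-12-31", 50.00),
]
projected = [
    (1, "2015-01-03", 12.34),
    (1, "2015-01-15", 12.44),
    (1, "2015-02-01", 12.34),
]


def by_order(rows):
    """Group rows by orderid, each group sorted by date (like PARTITION BY ... ORDER BY)."""
    grouped = defaultdict(list)
    for orderid, date, amount in sorted(rows):
        grouped[orderid].append((date, amount))
    return grouped


def align(payments, projected):
    """Pair the n-th actual payment with the n-th projected payment per order."""
    actual, planned = by_order(payments), by_order(projected)
    result = []
    for orderid in sorted(set(actual) | set(planned)):
        pairs = zip_longest(
            actual.get(orderid, []),
            planned.get(orderid, []),
            fillvalue=(None, None),  # the side that runs out is padded with nulls
        )
        for (tdate, amount), (pdate, pamount) in pairs:
            result.append((orderid, tdate, amount, pdate, pamount))
    return result


aligned = align(payments, projected)
for row in aligned:
    print(row)
```

For the sample data this reproduces the four rows shown in the question, including the null-padded rows for order 1 and order 2.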
This should work
Select * from Orders o
Left Join Payments p on o.ID = p.OrderID
Left Join ProjectedPayment pp on o.ID = pp.OrderID
Order By o.ID
If I understand correctly, the following query should help:
select o.orderid, ap.transactiondate, ap.amount, pp.projecteddate, pp.projectedamount
from orders o
left join
(
select p.orderid, p.transactiondate, p.amount,
row_number() over (partition by p.orderid order by p.transactiondate) n
from payment p
) ap on o.orderid = ap.orderid
left join
(
select p.orderid, p.projecteddate, p.projectedamount,
row_number() over (partition by p.orderid order by p.projecteddate) n
from projectedpayment p
) pp on o.orderid = pp.orderid and (ap.n is null or ap.n = pp.n)
order by o.orderid, ap.n, pp.n
UPD
Another option (it works in a slightly different way: NULL values can appear anywhere within an orderid, not only in its last rows, but the result is completely sorted by date in one timeline):
select o.orderid, ap.transactiondate, ap.amount, pp.projecteddate, pp.projectedamount
from orders o
inner join
(
select ap.orderid, ap.transactiondate d from payment ap
union
select pp.orderid, pp.projecteddate d from projectedpayment pp
) d on d.orderid = o.orderid
left join payment ap on ap.orderid = o.orderid and ap.transactiondate = d.d
left join projectedpayment pp on pp.orderid = o.orderid and pp.projecteddate = d.d
order by o.orderid, d.d
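To see what the single-timeline variant produces, here is a small sketch that runs it on the question's sample rows using Python's built-in sqlite3 module (not PostgreSQL, though the query uses only standard joins and UNION). The table definitions are assumptions reduced to the columns the question lists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (orderid INTEGER);
    CREATE TABLE payment (orderid INTEGER, transactiondate TEXT, amount REAL);
    CREATE TABLE projectedpayment (orderid INTEGER, projecteddate TEXT, projectedamount REAL);
    INSERT INTO orders VALUES (1), (2);
    INSERT INTO payment VALUES
        (1, '2015-01-01', 12.34), (1, '2015-01-15', 12.34), (2, '2014-12-31', 50.00);
    INSERT INTO projectedpayment VALUES
        (1, '2015-01-03', 12.34), (1, '2015-01-15', 12.44), (1, '2015-02-01', 12.34);
    """
)

# Build one list of dates per order (actual and projected combined), then
# attach whichever payment and/or projection falls on each date.
TIMELINE = """
SELECT o.orderid, ap.transactiondate, ap.amount, pp.projecteddate, pp.projectedamount
FROM orders o
JOIN (
    SELECT orderid, transactiondate AS d FROM payment
    UNION
    SELECT orderid, projecteddate AS d FROM projectedpayment
) d ON d.orderid = o.orderid
LEFT JOIN payment ap
       ON ap.orderid = o.orderid AND ap.transactiondate = d.d
LEFT JOIN projectedpayment pp
       ON pp.orderid = o.orderid AND pp.projecteddate = d.d
ORDER BY o.orderid, d.d
"""

timeline = conn.execute(TIMELINE).fetchall()
for row in timeline:
    print(row)
```

Unlike the row-number pairing, an actual and a projected payment share a row here only when their dates are equal, so order 1 produces four rows instead of three.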

SQL INNER JOIN Without Repeats

I'm trying to get the following table:
Column1 - OrderID - Earliest orders of customers from Column2
Column2 - CustomerID - Customers from orders in Column1
Column3 - OrderID - All *Other* orders of customers from Column2
which do not appear in Column1
This is my query and I'm looking for a way to apply the rules mentioned above:
SELECT O1.orderid, C1.customerid, O2.Orderid
FROM orders AS O1
INNER JOIN customers AS C1 ON O1.customerid = C1.customerid
RIGHT JOIN orders AS O2 ON C1.customerid = O2.customerid
WHERE O1.orderdate >= '2014-01-01'
AND O1.orderdate <= '2014-03-31'
ORDER BY O1.orderid
Thanks in advance
Not entirely sure why you want to get a result out like this as the earliest order will repeat for each order for the given customer.
SELECT earliestOrders.orderid, C1.customerid, O1.Orderid
FROM orders AS O1
INNER JOIN customers AS C1 ON O1.customerid = C1.customerid
INNER JOIN (
select o.customerid, min(o.OrderId) as OrderId
from orders o
Group by o.customerid
) earliestOrders
ON earliestOrders.CustomerId = C1.CustomerId
AND earliestOrders.orderid <> O1.Orderid
To find the first order per customer, look for the first order date per customer and then pick the order (or one of the orders) the customer made on that date. (If orderdate really is just a date, a customer can have placed more than one order that day, so we pick one of them. With MIN(orderid) we are likely to get the first one of that bunch.)
Outer join the other orders and you are done.
If your dbms supports IN clauses on tuples, you get a quite readable statement:
select first_order.orderid, first_order.customerid, later_order.orderid
from
(
select customerid, min(orderid) as orderid
from orders
where (customerid, orderdate) in
(
select customerid, min(orderdate)
from orders
group by customerid
)
group by customerid
) first_order
left join orders later_order
on later_order.customerid = first_order.customerid
and later_order.orderid <> first_order.orderid
;
If your dbms doesn't support IN clauses on tuples, the statement looks a bit more clumsy:
select first_order.orderid, first_order.customerid, later_order.orderid
from
(
select first_orders.customerid, min(first_orders.orderid) as orderid
from orders first_orders
inner join
(
select customerid, min(orderdate) as orderdate
from orders
group by customerid
) first_order_dates
on first_order_dates.customerid = first_orders.customerid
and first_order_dates.orderdate = first_orders.orderdate
group by first_orders.customerid
) first_order
left join orders later_order
on later_order.customerid = first_order.customerid
and later_order.orderid <> first_order.orderid
;
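A small sketch of the join-based variant, run with Python's built-in sqlite3 module on made-up data. Customer 10's lowest orderid (1) is deliberately not on the earliest date, to show that the first order is chosen by date first and by MIN(orderid) only among orders on that date. Table contents and the shortened aliases fo and fd are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (orderid INTEGER, customerid INTEGER, orderdate TEXT)"
)
# Hypothetical rows: customer 10 has three orders, customer 20 has one.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, 10, "2014-02-01"),
        (2, 10, "2014-01-05"),
        (3, 10, "2014-01-05"),
        (4, 20, "2014-01-10"),
    ],
)

FIRST_AND_OTHERS = """
SELECT first_order.orderid, first_order.customerid, later_order.orderid
FROM (
    -- lowest orderid among the orders placed on the customer's earliest date
    SELECT fo.customerid, MIN(fo.orderid) AS orderid
    FROM orders fo
    JOIN (SELECT customerid, MIN(orderdate) AS orderdate
          FROM orders
          GROUP BY customerid) fd
      ON fd.customerid = fo.customerid
     AND fd.orderdate = fo.orderdate
    GROUP BY fo.customerid
) first_order
LEFT JOIN orders later_order
       ON later_order.customerid = first_order.customerid
      AND later_order.orderid <> first_order.orderid
ORDER BY first_order.customerid, later_order.orderid
"""

first_and_others = conn.execute(FIRST_AND_OTHERS).fetchall()
for row in first_and_others:
    print(row)
```

Because of the outer join, a customer with a single order still appears, with NULL in the third column.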

Query That Returns First Results from Subquery

I'm trying to create a query that returns the results from a subquery in the result set.
Here are the tables I'm using:
Orders OrderDetails
------- -----------
orderId orderDetailId
(other data) orderId
productName
I'd like to get the first two order details for each order (Most orders have only one or two details). Here's an example of the desired result set:
orderId (other order data) productName1 productName2
------- ------------------ ------------ ------------
1 (other order data) apple grape
2 (other order data) orange banana
3 (other order data) apple orange
This is what I tried so far:
SELECT o.orderid,
Max(CASE WHEN detail = 1 THEN oi.productname END) AS ProductName1,
Max(CASE WHEN detail = 2 THEN oi.productname END) AS ProductName2
FROM orders AS o
OUTER apply (SELECT TOP 2 oi.*,
Row_number() OVER (ORDER BY orderdetailid) AS detail
FROM orderdetails AS oi
WHERE oi.orderid = o.orderid) AS oi
GROUP BY o.orderid
I'm doing this in the custom reporting module of a hosted ecommerce solution and getting the following unhelpful syntax error: SQL Error: Incorrect syntax near '('.
Unfortunately I don't know what version of SQL Server I'm using. Customer support knows nothing and select @@VERSION doesn't work.
Note, it appears the row_number() function is not properly supported even though error messages reference the function by name.
Thanks for the help!
Here is an alternative that does not use cross apply. Your ranking was correct but I added a partition by the order.
SELECT
*
FROM
(
SELECT
o.orderid,
ProductName=oi.productcode,
RowNumber=ROW_NUMBER() OVER (PARTITION BY o.orderid ORDER BY oi.orderdetailid)
FROM
orders as o
INNER JOIN orderdetails oi ON oi.orderid=o.orderid
)AS X
WHERE
RowNumber=1
Without using row_number:
SELECT
oi.*,
Q1.*
FROM
(
SELECT
o.*,
FirstOrderDetailID=(SELECT MIN(od.orderdetailid) FROM orderdetails od WHERE od.orderid=o.orderid)
FROM
orders o
)AS Q1
LEFT OUTER JOIN orderdetails oi ON oi.orderdetailid=Q1.FirstOrderDetailID
If you are just selecting orderid and the product, you don't need the join at all:
select orderid, productcode
from (SELECT oi.orderid, oi.productcode,
row_number() over (partition by oi.orderid order by oi.orderdetailid) as seqnum
FROM orderdetails oi
) oi
where seqnum = 1;
This may not fix the problem if row_number() is not working, but it simplifies the query. You can do this with the min() method as well:
select orderid, productcode
from orderdetails oi
where oi.orderdetailid in (select min(orderdetailid) from orderdetails group by orderid);
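The answers above return only the first detail per order. One way the min() idea might be extended to cover the second detail as well, without row_number(), is to look up the smallest orderdetailid and then the smallest orderdetailid greater than it. The following is a sketch under assumptions, run with Python's built-in sqlite3 module: it uses the question's productName column, and order 4 (with a single detail) is added only to show the NULL case. It has not been tried on the hosted reporting module from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (orderid INTEGER);
    CREATE TABLE orderdetails (orderdetailid INTEGER, orderid INTEGER, productname TEXT);
    INSERT INTO orders VALUES (1), (2), (3), (4);
    INSERT INTO orderdetails VALUES
        (1, 1, 'apple'),  (2, 1, 'grape'),
        (3, 2, 'orange'), (4, 2, 'banana'),
        (5, 3, 'apple'),  (6, 3, 'orange'),
        (7, 4, 'pear');
    """
)

FIRST_TWO = """
SELECT o.orderid,
       d1.productname AS productname1,
       d2.productname AS productname2
FROM orders o
-- first detail: smallest orderdetailid of the order
LEFT JOIN orderdetails d1
       ON d1.orderdetailid = (SELECT MIN(x.orderdetailid)
                              FROM orderdetails x
                              WHERE x.orderid = o.orderid)
-- second detail: smallest orderdetailid after the first one
LEFT JOIN orderdetails d2
       ON d2.orderdetailid = (SELECT MIN(y.orderdetailid)
                              FROM orderdetails y
                              WHERE y.orderid = o.orderid
                                AND y.orderdetailid > d1.orderdetailid)
ORDER BY o.orderid
"""

first_two = conn.execute(FIRST_TWO).fetchall()
for row in first_two:
    print(row)
```

Orders with only one detail get NULL in the second product column, which matches the shape of the desired result set.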

Segment purchases based on new vs returning

I'm trying to write a query that takes a particular date and counts how many of the customers ordering on that date have placed orders previously and how many are new. For simplicity, here is the table layout:
id (auto) | cust_id | purchase_date
-----------------------------------
1 | 1 | 2010-11-15
2 | 2 | 2010-11-15
3 | 3 | 2010-11-14
4 | 1 | 2010-11-13
5 | 3 | 2010-11-12
I was trying to select orders by date and then join any previous orders for the same user_id from earlier dates, then count how many had previous orders vs. how many didn't. This was my failed attempt:
SELECT SUM(
CASE WHEN id IS NULL
THEN 1
ELSE 0
END ) AS new, SUM(
CASE WHEN id IS NOT NULL
THEN 1
ELSE 0
END ) AS returning
FROM (
SELECT o1 . *
FROM orders AS o
LEFT JOIN orders AS o1 ON ( o1.user_id = o.user_id
AND DATE( o1.created ) = "2010-11-15" )
WHERE DATE( o.created ) < "2010-11-15"
GROUP BY o.user_id
) AS t
Given a reference date (2010-11-15), we are interested in the number of distinct customers who placed an order on that date (A), and we are interested in how many of those have placed an order previously (B), and how many did not (C). And clearly, A = B + C.
Q1: Count of customers who placed an order on the reference date
SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q2: List of customers placing order on reference date
SELECT DISTINCT Cust_ID
FROM Orders
WHERE Purchase_Date = '2010-11-15';
Q3: List of customers who placed an order on reference date who had ordered before
SELECT DISTINCT o1.Cust_ID
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q4: Count of customers who placed an order on reference date who had ordered before
SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15';
Q5: Combining Q1 and Q4
There are several ways to do the combining. One is to use Q1 and Q4 as (complicated) expressions in the select-list; another is to use them as tables in the FROM clause which don't need a join between them because each is a single-row, single-column table that can be joined in a Cartesian product. Another would be a UNION, where each row is tagged with what it calculates.
SELECT (SELECT COUNT(DISTINCT Cust_ID)
FROM Orders
WHERE Purchase_Date = '2010-11-15') AS Total_Customers,
(SELECT COUNT(DISTINCT o1.Cust_ID)
FROM Orders AS o1
JOIN (SELECT DISTINCT o2.Cust_ID
FROM Orders AS o2
WHERE o2.Purchase_Date = '2010-11-15') AS c1
ON o1.Cust_ID = c1.Cust_ID
WHERE o1.Purchase_Date < '2010-11-15') AS Returning_Customers
FROM Dual;
(I'm blithely assuming MySQL has a DUAL table - similar to Oracle's. If not, it is trivial to create a table with a single column containing a single row of data. Update 2: bashing the MySQL 5.5 Manual shows that 'FROM Dual' is supported but not needed; MySQL is happy without a FROM clause.)
Update 1: added qualifier 'o1.Cust_ID' in key locations to avoid 'ambiguous column name' as indicated in the comment.
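As a sanity check of the combined query, here is a minimal sketch that runs it on the five sample rows from the question using Python's built-in sqlite3 module (SQLite rather than MySQL, and without FROM Dual, which SQLite does not need). The new-customer count is derived in Python as A - B.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Orders (id INTEGER PRIMARY KEY, Cust_ID INTEGER, Purchase_Date TEXT)"
)
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [
        (1, 1, "2010-11-15"),
        (2, 2, "2010-11-15"),
        (3, 3, "2010-11-14"),
        (4, 1, "2010-11-13"),
        (5, 3, "2010-11-12"),
    ],
)

# Q1 and Q4 used as scalar subqueries in the select-list.
COMBINED = """
SELECT (SELECT COUNT(DISTINCT Cust_ID)
        FROM Orders
        WHERE Purchase_Date = '2010-11-15') AS Total_Customers,
       (SELECT COUNT(DISTINCT o1.Cust_ID)
        FROM Orders AS o1
        JOIN (SELECT DISTINCT o2.Cust_ID
              FROM Orders AS o2
              WHERE o2.Purchase_Date = '2010-11-15') AS c1
          ON o1.Cust_ID = c1.Cust_ID
        WHERE o1.Purchase_Date < '2010-11-15') AS Returning_Customers
"""

total_customers, returning_customers = conn.execute(COMBINED).fetchone()
new_customers = total_customers - returning_customers  # C = A - B
print(total_customers, returning_customers, new_customers)
```

On the sample data, customers 1 and 2 ordered on 2010-11-15; customer 1 had ordered before (2010-11-13) and customer 2 had not.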
How about
SELECT * FROM
(SELECT * FROM
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 1 AS OLD_CUSTOMER, 0 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT > 1)
UNION ALL
(SELECT CUST_ID, COUNT(*) AS ORDER_COUNT, 0 AS OLD_CUSTOMER, 1 AS NEW_CUSTOMER
FROM ORDERS
GROUP BY CUST_ID
HAVING ORDER_COUNT = 1)) G
INNER JOIN
(SELECT CUST_ID, ORDER_DATE
FROM ORDERS) O
USING (CUST_ID)
WHERE ORDER_DATE = [date of interest] AND
OLD_CUSTOMER = [0 or 1, depending on what you want] AND
NEW_CUSTOMER = [0 or 1, depending on what you want]
Not sure if that'll do the whole thing, but it might provide a starting point.
Share and enjoy.
select count(distinct o1.cust_id) as repeat_count,
count(distinct o.cust_id)-count(distinct o1.cust_id) as new_count
from orders o
left join (select cust_id
from orders
where purchase_date < "2010-11-15"
group by cust_id) o1
on o.cust_id = o1.cust_id
where o.purchase_date = "2010-11-15"