Select first purchase for each customer - sql

We are trying to select the first purchase for each customer in a table similar to this:
transaction_no customer_id operator_id purchase_date
20503 1 5 2012-08-24
20504 1 7 2013-10-15
20505 2 5 2013-09-05
20506 3 7 2010-09-06
20507 3 7 2012-07-30
The expected result from the query that we are trying to achieve is:
transaction_no customer_id operator_id first_occurence
20503 1 5 2012-08-24
20505 2 5 2013-09-05
20506 3 7 2010-09-06
The closest we've got is the following query:
SELECT customer_id, MIN(purchase_date) As first_occurence
FROM Sales_Transactions_Header
GROUP BY customer_id;
With the following result:
customer_id first_occurence
1 2012-08-24
2 2013-09-05
3 2010-09-06
But when we select the rest of the needed fields, we have to add them to the GROUP BY clause, which changes the result of MIN. We have also tried joining the table to itself, but haven't made any progress.
How do we get the rest of the correlated values without changing what the aggregate function returns?

You can simply treat the query you have come up with as an inner query and join back to it. This works on older versions of SQL Server as well (you didn't specify which version you are using).
SELECT H.transaction_no, H.customer_id, H.operator_id, H.purchase_date
FROM Sales_Transactions_Header H
INNER JOIN
(SELECT customer_id, MIN(purchase_date) As first_occurence
FROM Sales_Transactions_Header
GROUP BY customer_id) X
ON H.customer_id = X.customer_id AND H.purchase_date = X.first_occurence
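To try the join-back approach without a SQL Server instance, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database from Python and runs the same query. SQLite is an assumption made only so the example runs anywhere; the ORDER BY is added so the output is deterministic.

```python
import sqlite3

# In-memory SQLite stands in for SQL Server here (an assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Transactions_Header ("
    "transaction_no INTEGER, customer_id INTEGER, "
    "operator_id INTEGER, purchase_date TEXT)"
)
# Sample rows copied from the question; ISO date strings sort correctly as text.
conn.executemany(
    "INSERT INTO Sales_Transactions_Header VALUES (?, ?, ?, ?)",
    [
        (20503, 1, 5, "2012-08-24"),
        (20504, 1, 7, "2013-10-15"),
        (20505, 2, 5, "2013-09-05"),
        (20506, 3, 7, "2010-09-06"),
        (20507, 3, 7, "2012-07-30"),
    ],
)

# Join every row to its customer's earliest purchase date.
first_purchases = conn.execute(
    """
    SELECT H.transaction_no, H.customer_id, H.operator_id, H.purchase_date
    FROM Sales_Transactions_Header H
    INNER JOIN (SELECT customer_id, MIN(purchase_date) AS first_occurence
                FROM Sales_Transactions_Header
                GROUP BY customer_id) X
        ON H.customer_id = X.customer_id
       AND H.purchase_date = X.first_occurence
    ORDER BY H.customer_id
    """
).fetchall()

for row in first_purchases:
    print(row)
```

Note that if a customer has two transactions on the same first date, this join returns both of them.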

You can use the ROW_NUMBER function to help you with that.
This is how to do it for your case.
WITH Occurences AS
(
SELECT
*,
ROW_NUMBER () OVER (PARTITION BY customer_id order by purchase_date ) AS "Occurence"
FROM Sales_Transactions_Header
)
SELECT
transaction_no,
customer_id,
operator_id,
purchase_date
FROM Occurences
WHERE Occurence = 1
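As a rough check of the ROW_NUMBER approach, the sketch below runs it in an in-memory SQLite database from Python (SQLite 3.25 or later is assumed, since older builds lack window functions). Row 20508 is a hypothetical extra row, not part of the question, added to show what happens when a customer has two purchases on the same first date. The transaction_no tie-breaker in the ORDER BY is also an addition of this sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Sales_Transactions_Header ("
    "transaction_no INTEGER, customer_id INTEGER, "
    "operator_id INTEGER, purchase_date TEXT)"
)
conn.executemany(
    "INSERT INTO Sales_Transactions_Header VALUES (?, ?, ?, ?)",
    [
        (20503, 1, 5, "2012-08-24"),
        (20504, 1, 7, "2013-10-15"),
        (20505, 2, 5, "2013-09-05"),
        (20506, 3, 7, "2010-09-06"),
        (20507, 3, 7, "2012-07-30"),
        (20508, 2, 7, "2013-09-05"),  # hypothetical tie on customer 2's first date
    ],
)

# Number each customer's purchases by date; transaction_no breaks ties.
first_rows = conn.execute(
    """
    WITH Occurences AS (
        SELECT *,
               ROW_NUMBER() OVER (PARTITION BY customer_id
                                  ORDER BY purchase_date, transaction_no) AS Occurence
        FROM Sales_Transactions_Header
    )
    SELECT transaction_no, customer_id, operator_id, purchase_date
    FROM Occurences
    WHERE Occurence = 1
    ORDER BY customer_id
    """
).fetchall()

for row in first_rows:
    print(row)
```

Because ROW_NUMBER assigns exactly one row the number 1 in each partition, this returns a single row per customer even when dates tie. Without a tie-breaker column, which of the tied rows is returned is not defined.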

Sounds like a job for a CTE!
The CTE will allow you to get the earliest purchase date for each customer. Then you join that back to your original table on customer_id and the date, getting the rest of the information for that transaction.
Like so:
with first_date as(
select customer_id,
min(purchase_date) as first_purchase
from
table1
group by
customer_id
)
select
t1.transaction_no,
t1.customer_id,
t1.operator_id,
t1.purchase_date
from
table1 t1
inner join first_date
on
purchase_date = first_purchase
and t1.customer_id = first_date.customer_id

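The query below also works. The subquery is correlated on customer_id, so each row is compared only against the first purchase date of its own customer:
select * from customer_sale_details c
where c.purchase_date = (select min(c1.purchase_date)
from customer_sale_details c1 where c1.customer_id = c.customer_id);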

Related

sql - select all rows that have all multiple same cols

I have a table with 4 columns.
date
store_id
product_id
label_id
and I need to find all store_ids that have every product_id with the same label_id (for example 4) on one day.
for example:
store_id | label_id | product_id | date |
4 4 5 9/2
5 4 7 9/2
4 3 12 9/2
4 4 7 9/2
so it should return 4, because it's the only store that contains all possible products with label 4 on one day.
I have tried something like this:
(select store_id, date
from table
where label_id = 4
group by store_id, date
order by date)
I don't know how to write the outer query. I tried:
select * from table
where product_id = all(Inner query)
but it didn't work.
Thanks
It is unclear from your question whether the labels are specific to a given day or through the entire period. But a variation of Tim's answer seems appropriate. For any label:
SELECT t.date, t.label_id, t.store_id
FROM t
GROUP BY t.date, t.label_id, t.store_id
HAVING COUNT(DISTINCT t.product_id) = (SELECT COUNT(DISTINCT t2.product_id)
FROM t t2
WHERE t2.label_id = t.label_id
);
For a particular label:
SELECT t.date, t.store_id
FROM t
WHERE t.label_id = 4
GROUP BY t.date, t.store_id
HAVING COUNT(DISTINCT t.product_id) = (SELECT COUNT(DISTINCT t2.product_id)
FROM t t2
WHERE t2.label_id = 4
);
If the labels are specific to the date, then you need that comparison in the outer queries as well.
Here is one way:
SELECT t1.date, t1.store_id
FROM yourTable t1
GROUP BY t1.date, t1.store_id
HAVING COUNT(DISTINCT t1.product_id) = (SELECT COUNT(DISTINCT t2.product_id)
FROM yourTable t2
WHERE t2.date = t1.date)
ORDER BY t1.date, t1.store_id;
This query reads in a pretty straightforward way: find every store, on some date, whose distinct product count is the same as the distinct product count on that day across all stores.
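The counting idea can be tried against the question's four sample rows. The sketch below is an illustration only: it uses an in-memory SQLite database from Python, a hypothetical table name store_products, and it restricts both the outer query and the subquery to label 4 and to the same date, which is one reading of what the asker wants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# store_products is a hypothetical table name; columns follow the question.
conn.execute(
    "CREATE TABLE store_products ("
    "store_id INTEGER, label_id INTEGER, product_id INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO store_products VALUES (?, ?, ?, ?)",
    [
        (4, 4, 5, "9/2"),
        (5, 4, 7, "9/2"),
        (4, 3, 12, "9/2"),
        (4, 4, 7, "9/2"),
    ],
)

# A store qualifies when its distinct label-4 products on a date match
# the distinct label-4 products seen across all stores on that date.
complete_stores = conn.execute(
    """
    SELECT t.date, t.store_id
    FROM store_products t
    WHERE t.label_id = 4
    GROUP BY t.date, t.store_id
    HAVING COUNT(DISTINCT t.product_id) =
           (SELECT COUNT(DISTINCT t2.product_id)
            FROM store_products t2
            WHERE t2.label_id = 4 AND t2.date = t.date)
    ORDER BY t.date, t.store_id
    """
).fetchall()

print(complete_stores)
```

On the sample data, products 5 and 7 carry label 4 on 9/2. Store 4 has both and store 5 has only product 7, so only store 4 is returned.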
I'd probably aggregate to lists of products in a string or array:
with products_per_day_and_store as
(
select
store_id,
date,
string_agg(distinct product_id::text, ',' order by product_id::text) as products
from mytable
where label_id = 4
group by store_id, date
)
, products_per_day as
(
select
date,
string_agg(distinct product_id::text, ',' order by product_id::text) as products
from mytable
where label_id = 4
group by date
)
select distinct ppdas.store_id
from products_per_day_and_store ppdas
join products_per_day ppd using (date, products);

How to choose max of one column per other column

I am using SQL Server and I have a table "a"
month segment_id price
-----------------------------
1 1 100
1 2 200
2 3 50
2 4 80
3 5 10
I want a query that returns the original columns for the row with the maximum price in each month.
The result should be:
month segment_id price
----------------------------
1 2 200
2 4 80
3 5 10
I tried to write SQL code:
Select
month, segment_id, max(price) as MaxPrice
from
a
but I got an error:
Column segment_id is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
I tried to fix it in many ways but couldn't find how.
You need a GROUP BY clause that does not include segment_id:
Select month, max(price) as MaxPrice
from a
Group By month
This is because you want one result per month, and segment_id is neither aggregated nor grouped in your original SELECT statement.
If you want segment_id in the output, with the monthly maximum price repeated on every row, use max() as a window function instead of GROUP BY. Leave ORDER BY out of the window, otherwise the default frame turns it into a running maximum:
Select month, segment_id,
max(price) over ( partition by month ) as MaxPrice
from a
Edit (following your latest edit to the desired results): you need one more window function, row_number(), as @Gordon already mentioned:
Select month, segment_id, price From
(
Select a.*,
row_number() over ( partition by month order by price desc ) as Rn
from a
) q
Where rn = 1
I would recommend a correlated subquery:
select t.*
from t
where t.price = (select max(t2.price) from t t2 where t2.month = t.month);
The "canonical" solution is to use row_number():
select t.*
from (select t.*,
row_number() over (partition by month order by price desc) as seqnum
from t
) t
where seqnum = 1;
With the right indexes, the correlated subquery often performs better.
Only because it was not mentioned.
Yet another option is the WITH TIES clause.
To be clear, the approach by Gordon and Barbaros would be a nudge more performant, but this technique does not require or generate an extra column.
Select Top 1 with ties *
From YourTable
Order By row_number() over (partition by month order by price desc)
With not exists:
select t.*
from tablename t
where not exists (
select 1 from tablename
where month = t.month and price > t.price
)
or:
select t.*
from tablename t inner join (
select month, max(price) as price
from tablename
group By month
) g on g.month = t.month and g.price = t.price
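For comparison, here is a minimal sketch that runs two of the approaches from this thread (the correlated subquery and NOT EXISTS) against the question's table "a". It uses an in-memory SQLite database from Python rather than SQL Server, which is an assumption made only so the example is easy to run.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE a (month INTEGER, segment_id INTEGER, price INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO a VALUES (?, ?, ?)",
    [(1, 1, 100), (1, 2, 200), (2, 3, 50), (2, 4, 80), (3, 5, 10)],
)

# Keep rows whose price equals the maximum price of their month.
via_subquery = conn.execute(
    """
    SELECT t.month, t.segment_id, t.price
    FROM a t
    WHERE t.price = (SELECT MAX(t2.price) FROM a t2 WHERE t2.month = t.month)
    ORDER BY t.month
    """
).fetchall()

# Keep rows for which no row in the same month has a higher price.
via_not_exists = conn.execute(
    """
    SELECT t.month, t.segment_id, t.price
    FROM a t
    WHERE NOT EXISTS (SELECT 1 FROM a t2
                      WHERE t2.month = t.month AND t2.price > t.price)
    ORDER BY t.month
    """
).fetchall()

print(via_subquery)
print(via_not_exists)
```

Both forms return every row that ties for the monthly maximum, whereas a row_number() filter returns exactly one row per month.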

SQL - Group by group property?

I've got a table of consignments (simplified of course)
CONSIGNMENT_NR CUSTOMER
1 1
2 1
3 2
4 2
5 2
I can easily select, for each customer, how many consignments they have:
SELECT CUSTOMER, COUNT(*) AS 'Count'
FROM CONSIGNMENT
GROUP BY CUSTOMER
Which will give me (with that example data):
CUSTOMER Count
1 2
2 3
But what I want is to get how many customers made x amount of consignments.
The data I want would look like this:
Amount No of Customers
2 1
3 1
I can't quite figure out how.
Wrap your query up as a derived table. GROUP BY its result:
select Amount, count(*) as No_of_Customers
from
(
SELECT COUNT(*) AS Amount
FROM CONSIGNMENT
GROUP BY CUSTOMER
) dt
group by Amount
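The derived-table query can be checked against the question's sample data. This is a small sketch using an in-memory SQLite database from Python (an assumption for portability); the ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE CONSIGNMENT (CONSIGNMENT_NR INTEGER, CUSTOMER INTEGER)")
# Sample rows copied from the question.
conn.executemany(
    "INSERT INTO CONSIGNMENT VALUES (?, ?)",
    [(1, 1), (2, 1), (3, 2), (4, 2), (5, 2)],
)

# Inner query: consignments per customer. Outer query: customers per count.
histogram = conn.execute(
    """
    SELECT Amount, COUNT(*) AS No_of_Customers
    FROM (SELECT COUNT(*) AS Amount
          FROM CONSIGNMENT
          GROUP BY CUSTOMER) dt
    GROUP BY Amount
    ORDER BY Amount
    """
).fetchall()

print(histogram)
```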
You can also try the following, using a subquery:
select [Count] as Amount, count(distinct customer) as noofcustomer
from
(
SELECT CUSTOMER, COUNT(*) AS [Count]
FROM CONSIGNMENT
GROUP BY CUSTOMER
)A group by [Count]

SQL Server Query for distinct rows

How do I query for distinct customers? Here's the table I have..
CustID DATE PRODUCT
=======================
1 Aug-31 Orange
1 Aug-31 Orange
3 Aug-31 Apple
1 Sept-24 Apple
4 Sept-25 Orange
This is what I want.
# of New Customers DATE
========================================
2 Aug-31
1 Sept-25
Thanks!
This is a bit tricky. You want to count the first date a customer appears and then do the aggregation:
select mindate, count(*) as NumNew
from (select CustId, min(Date) as mindate
from table t
group by CustId
) c
group by mindate
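Here is a small sketch of that two-step aggregation, run in an in-memory SQLite database from Python. The table name purchases and the year 2014 are assumptions: the question gives only "Aug-31" style dates, and ISO strings are used so that MIN compares them correctly as text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# purchases is a hypothetical table name; the year in each date is assumed.
conn.execute("CREATE TABLE purchases (CustId INTEGER, Date TEXT, Product TEXT)")
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?)",
    [
        (1, "2014-08-31", "Orange"),
        (1, "2014-08-31", "Orange"),
        (3, "2014-08-31", "Apple"),
        (1, "2014-09-24", "Apple"),
        (4, "2014-09-25", "Orange"),
    ],
)

# Step 1: first date per customer. Step 2: count customers per first date.
new_customers = conn.execute(
    """
    SELECT mindate, COUNT(*) AS NumNew
    FROM (SELECT CustId, MIN(Date) AS mindate
          FROM purchases
          GROUP BY CustId) c
    GROUP BY mindate
    ORDER BY mindate
    """
).fetchall()

print(new_customers)
```

Customer 1's purchase on Sept-24 is not counted because that customer first appeared on Aug-31, which matches the expected output in the question.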
You could use a simple common table expression to find the first time each customer id appears:
WITH cte AS (
SELECT date, ROW_NUMBER() OVER (PARTITION BY custid ORDER BY date) rn
FROM customers
)
SELECT COUNT(*)[# of New Customers], date FROM cte
WHERE rn=1
GROUP BY date
ORDER BY date

SQL Server 2008 - Select last 3 Booking Ids for each supplier

I'd like to return the three most recent BOOKING_IDs, by CREATION_DATE, for each active SUPPLIER_ID within a table of bookings.
The active SUPPLIER_ID are gathered with the following query.
select SUPPLIER_ID
into #active
from BookingTable
where BOOKING_ID in (select BOOKING_ID
from BookingTable
where CREATION_DATE > getdate() - 90)
group by SUPPLIER_ID
Is it possible to do this as one query? My current approach is to insert the active SUPPLIER_IDs into a temporary table and then use an outer join to somehow return three records for each supplier.
My expected output is:
SUPPLIER BOOKING
1 12345
1 54656
1 34546
2 54965
2 05650
2 90565
You could use a ranking function, partitioning by SUPPLIER_ID:
WITH cte
AS (SELECT *,
Row_number()
OVER(
partition BY supplier_id
ORDER BY creation_date DESC) AS rn
FROM BookingTable)
SELECT *
FROM cte
WHERE rn <= 3
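To see the top-three-per-group filter in action, the sketch below runs it in an in-memory SQLite database from Python (SQLite 3.25 or later is assumed for window function support). All booking rows are made up for illustration; the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE BookingTable ("
    "BOOKING_ID INTEGER, SUPPLIER_ID INTEGER, CREATION_DATE TEXT)"
)
# Hypothetical rows: supplier 1 has four bookings, supplier 2 has two.
conn.executemany(
    "INSERT INTO BookingTable VALUES (?, ?, ?)",
    [
        (101, 1, "2024-01-01"),
        (102, 1, "2024-01-05"),
        (103, 1, "2024-01-09"),
        (104, 1, "2024-01-12"),
        (201, 2, "2024-01-03"),
        (202, 2, "2024-01-08"),
    ],
)

# Rank each supplier's bookings from newest to oldest and keep ranks 1 to 3.
latest_three = conn.execute(
    """
    WITH cte AS (
        SELECT SUPPLIER_ID, BOOKING_ID,
               ROW_NUMBER() OVER (PARTITION BY SUPPLIER_ID
                                  ORDER BY CREATION_DATE DESC) AS rn
        FROM BookingTable
    )
    SELECT SUPPLIER_ID, BOOKING_ID
    FROM cte
    WHERE rn <= 3
    ORDER BY SUPPLIER_ID, rn
    """
).fetchall()

for row in latest_three:
    print(row)
```

Supplier 1's oldest booking (101) is dropped, while supplier 2 keeps both of its bookings because it has fewer than three. The sketch does not apply the 90-day "active supplier" filter from the question.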
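TOP on its own will not do it, because it limits the whole result set rather than each supplier. It can work when applied per supplier with CROSS APPLY:
select s.SUPPLIER_ID, b.BOOKING_ID
from (select distinct SUPPLIER_ID from BookingTable) s
cross apply (select top 3 BOOKING_ID from BookingTable where SUPPLIER_ID = s.SUPPLIER_ID order by CREATION_DATE desc) b
Or, like the other answer by @Vijaykumar Hadalgi, use a CTE with ROW_NUMBER()/RANK() and PARTITION BY.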