SQL Server Query for distinct rows - sql

How do I query for distinct customers? Here's the table I have:
CustID DATE PRODUCT
=======================
1 Aug-31 Orange
1 Aug-31 Orange
3 Aug-31 Apple
1 Sept-24 Apple
4 Sept-25 Orange
This is what I want.
# of New Customers DATE
========================================
2 Aug-31
1 Sept-25
Thanks!

This is a bit tricky. You want to find the first date each customer appears and then aggregate by that date:
select mindate, count(*) as NumNew
from (select CustId, min(Date) as mindate
      from table t
      group by CustId
     ) c
group by mindate
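A minimal sketch for trying the query above on the question's sample rows. It assumes SQLite (through Python's sqlite3 module) as a stand-in for SQL Server. The table name purchases, the column names, and the year 2023 in the ISO dates are all made up for illustration.

```python
import sqlite3

# Hypothetical sample rows mirroring the question's table; the year is assumed.
SAMPLE_PURCHASES = [
    (1, "2023-08-31", "Orange"),
    (1, "2023-08-31", "Orange"),
    (3, "2023-08-31", "Apple"),
    (1, "2023-09-24", "Apple"),
    (4, "2023-09-25", "Orange"),
]


def count_new_customers(rows):
    """Return (first_date, number_of_new_customers) pairs, oldest date first."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE purchases (cust_id INTEGER, purchase_date TEXT, product TEXT)"
        )
        conn.executemany("INSERT INTO purchases VALUES (?, ?, ?)", rows)
        # Inner query: first date per customer. Outer query: count customers per first date.
        return conn.execute(
            """
            SELECT first_date, COUNT(*) AS num_new
            FROM (SELECT cust_id, MIN(purchase_date) AS first_date
                  FROM purchases
                  GROUP BY cust_id) AS firsts
            GROUP BY first_date
            ORDER BY first_date
            """
        ).fetchall()
    finally:
        conn.close()


print(count_new_customers(SAMPLE_PURCHASES))
# [('2023-08-31', 2), ('2023-09-25', 1)]
```

Customer 1 is counted only on its first date, so the later Sept-24 row adds nothing.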

You could use a simple common table expression to find the first time each customer ID appears:
WITH cte AS (
  SELECT date, ROW_NUMBER() OVER (PARTITION BY custid ORDER BY date) rn
  FROM customers
)
SELECT COUNT(*) AS [# of New Customers], date
FROM cte
WHERE rn = 1
GROUP BY date
ORDER BY date

Related

How to choose max of one column per other column

I am using SQL Server and I have a table "a"
month segment_id price
-----------------------------
1 1 100
1 2 200
2 3 50
2 4 80
3 5 10
I want to write a query that returns the original columns for the row with the maximum price in each month.
The result should be:
month segment_id price
----------------------------
1 2 200
2 4 80
3 5 10
I tried to write SQL code:
Select
month, segment_id, max(price) as MaxPrice
from
a
but I got an error:
Column segment_id is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
I tried several fixes but couldn't get it to work.
That is because you need a GROUP BY clause that does not include segment_id:
Select month, max(price) as MaxPrice
from a
Group By month
You want one result per month, and segment_id is not aggregated in your original select statement.
If you want every row to show its segment_id together with the maximum price for its month, use max() as a window function instead of a GROUP BY clause. Leave ORDER BY out of the window; with it, the default frame turns the result into a running maximum:
Select month, segment_id,
max(price) over ( partition by month ) as MaxPrice
from a
Edit (following your most recently edited desired results): you need one more window function, row_number(), as @Gordon already mentioned:
Select month, segment_id, price From
(
Select a.*,
row_number() over ( partition by month order by price desc ) as Rn
from a
) q
Where rn = 1
I would recommend a correlated subquery:
select t.*
from t
where t.price = (select max(t2.price) from t t2 where t2.month = t.month);
The "canonical" solution is to use row_number():
select t.*
from (select t.*,
row_number() over (partition by month order by price desc) as seqnum
from t
) t
where seqnum = 1;
With the right indexes, the correlated subquery often performs better.
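The two queries above differ when prices tie, which is easy to check with a small sketch. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module as a stand-in for SQL Server. The last sample row, a second segment priced at 200 in month 1, is a made-up addition to the question's data.

```python
import sqlite3

# The question's rows plus one hypothetical tie: segment 6 also costs 200 in month 1.
SAMPLE_PRICES = [
    (1, 1, 100),
    (1, 2, 200),
    (2, 3, 50),
    (2, 4, 80),
    (3, 5, 10),
    (1, 6, 200),
]


def top_price_rows(rows):
    """Return (correlated_subquery_rows, row_number_rows) for the same data."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE t (month INTEGER, segment_id INTEGER, price INTEGER)")
        conn.executemany("INSERT INTO t VALUES (?, ?, ?)", rows)
        # Correlated subquery: keeps every row whose price equals its month's maximum.
        correlated = conn.execute(
            """
            SELECT t.month, t.segment_id, t.price
            FROM t
            WHERE t.price = (SELECT MAX(t2.price) FROM t AS t2 WHERE t2.month = t.month)
            ORDER BY t.month, t.segment_id
            """
        ).fetchall()
        # row_number(): keeps exactly one row per month, picked arbitrarily among ties.
        numbered = conn.execute(
            """
            SELECT month, segment_id, price
            FROM (SELECT t.*,
                         ROW_NUMBER() OVER (PARTITION BY month ORDER BY price DESC) AS seqnum
                  FROM t) AS ranked
            WHERE seqnum = 1
            ORDER BY month
            """
        ).fetchall()
        return correlated, numbered
    finally:
        conn.close()


correlated_rows, numbered_rows = top_price_rows(SAMPLE_PRICES)
print(len(correlated_rows), len(numbered_rows))
```

On this data the correlated subquery returns both tied rows for month 1, while row_number() returns a single row per month.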
Only because it was not mentioned.
Yet another option is the WITH TIES clause.
To be clear, the approach by Gordon and Barbaros would be a nudge more performant, but this technique does not require or generate an extra column.
Select Top 1 with ties *
From YourTable
Order By row_number() over (partition by month order by price desc)
With not exists:
select t.*
from tablename t
where not exists (
select 1 from tablename
where month = t.month and price > t.price
)
or:
select t.*
from tablename t inner join (
select month, max(price) as price
from tablename
group By month
) g on g.month = t.month and g.price = t.price
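Both forms can be run against the question's sample rows. This is only a sketch: it assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server, and the alias other in the NOT EXISTS subquery is added for readability.

```python
import sqlite3

# Sample rows from the question: (month, segment_id, price).
SAMPLE_SEGMENTS = [(1, 1, 100), (1, 2, 200), (2, 3, 50), (2, 4, 80), (3, 5, 10)]

NOT_EXISTS_SQL = """
    SELECT t.month, t.segment_id, t.price
    FROM tablename AS t
    WHERE NOT EXISTS (SELECT 1 FROM tablename AS other
                      WHERE other.month = t.month AND other.price > t.price)
    ORDER BY t.month
"""

JOIN_SQL = """
    SELECT t.month, t.segment_id, t.price
    FROM tablename AS t
    INNER JOIN (SELECT month, MAX(price) AS price
                FROM tablename
                GROUP BY month) AS g
        ON g.month = t.month AND g.price = t.price
    ORDER BY t.month
"""


def max_price_per_month(rows, sql):
    """Load rows into an in-memory table and run one of the queries above."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE tablename (month INTEGER, segment_id INTEGER, price INTEGER)"
        )
        conn.executemany("INSERT INTO tablename VALUES (?, ?, ?)", rows)
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


print(max_price_per_month(SAMPLE_SEGMENTS, NOT_EXISTS_SQL))
print(max_price_per_month(SAMPLE_SEGMENTS, JOIN_SQL))
# Both print [(1, 2, 200), (2, 4, 80), (3, 5, 10)]
```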

SQL query to get recent items

I have a sql table
id item date
A apple 2017-09-17
A banana 2017-08-10
A orange 2017-10-01
B banana 2015-06-17
B apple 2014-06-18
How do I write a SQL query so that for each id I get the two most recent items based on date? For example:
id recent second_recent
a orange apple
b banana apple
You can use row_number() and conditional aggregation:
select id,
max(case when seqnum = 1 then item end) as most_recent,
max(case when seqnum = 2 then item end) as most_recent_but_one
from (select t.*,
row_number() over (partition by id order by date desc) as seqnum
from t
) t
group by id;
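Here is a small sketch of the row_number() plus conditional aggregation approach, run on the question's data. It assumes SQLite 3.25 or later (for window functions) through Python's sqlite3 module. The table name items and the column name item_date are hypothetical.

```python
import sqlite3

# Sample rows from the question: (id, item, date).
SAMPLE_ITEMS = [
    ("A", "apple", "2017-09-17"),
    ("A", "banana", "2017-08-10"),
    ("A", "orange", "2017-10-01"),
    ("B", "banana", "2015-06-17"),
    ("B", "apple", "2014-06-18"),
]


def two_most_recent(rows):
    """Return (id, most_recent_item, second_most_recent_item) per id."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE items (id TEXT, item TEXT, item_date TEXT)")
        conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
        # Number each id's rows from newest to oldest, then pivot ranks 1 and 2 into columns.
        return conn.execute(
            """
            SELECT id,
                   MAX(CASE WHEN seqnum = 1 THEN item END) AS most_recent,
                   MAX(CASE WHEN seqnum = 2 THEN item END) AS second_recent
            FROM (SELECT items.*,
                         ROW_NUMBER() OVER (PARTITION BY id ORDER BY item_date DESC) AS seqnum
                  FROM items) AS ranked
            GROUP BY id
            ORDER BY id
            """
        ).fetchall()
    finally:
        conn.close()


print(two_most_recent(SAMPLE_ITEMS))
# [('A', 'orange', 'apple'), ('B', 'banana', 'apple')]
```

An id with only one row would get NULL (None in Python) in the second column, because no row has seqnum = 2.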
As explained in "SQL: Group by minimum value in one field while selecting distinct rows", you can use a GROUP BY to find the extreme date for each ID and join back to the table. For the most recent item, use MAX rather than MIN:
SELECT mt.*
FROM MyTable mt INNER JOIN
(
  SELECT ID, MAX(date) AS MaxDate
  FROM MyTable
  GROUP BY ID
) t ON mt.ID = t.ID AND mt.date = t.MaxDate
I think you can do the same with an ORDER BY to get two values instead of one.
You can use Pivot table
SELECT first_column AS <first_column_alias>,
[pivot_value1], [pivot_value2], ... [pivot_value_n]
FROM
(<source_table>) AS <source_table_alias>
PIVOT
(
aggregate_function(<aggregate_column>)
FOR <pivot_column> IN ([pivot_value1], [pivot_value2], ... [pivot_value_n])
) AS <pivot_table_alias>;

Count customers the first time they appear

Hi, I need to count the number of customers with subcategory=E, grouped by seller (createdby). Once a customer has been counted by a seller, no other seller should be able to count that customer, even though an observation might exist.
Example
id customerID CreatedBy createdate subcategory
1 1111111111 EVAJEN 2014-03-14 E
2 1111111111 MICMAD 2014-04-15 E
3 9999999999 MICMAD 2014-02-10 E
Here MICMAD shouldn't get a count for id=2 since EVAJEN already made a sale to that customer. Right now my code looks like this, but I'm not able to check if a customer already has been counted.
sel createdby, cast(createdate as date) as date1, count(distinct customerID)
from MyDatabase
where subcategory='E'
group by 1,2
Thank you
You can use ROW_NUMBER to get one row per customer:
select createdby, cast(createdate as date) as date1, count(*)
from
(
select *
from tab
where subcategory = 'E'
qualify row_number() -- 1st row per customer
over (partition by customerId
order by createdate) = 1
) t
group by 1,2;
Use a subquery to get the first date and count that. In most databases (including Teradata), you can use window functions to get the first row for each customer:
select createdby, cast(createdate as date) as date1, count(*)
from (select t.*,
row_number() over (partition by customerId order by createdate asc) as seqnum
from MyDatabase t
where subcategory = 'E'
) t
where seqnum = 1
group by createdby, cast(createdate as date) ;
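To see the first-row-per-customer idea on the question's three rows, here is a rough sketch. It assumes SQLite 3.25 or later through Python's sqlite3 module rather than Teradata, so QUALIFY and `sel` are not used. The table name sales and the snake_case column names are made up.

```python
import sqlite3

# Sample rows from the question: (id, customerID, CreatedBy, createdate, subcategory).
SAMPLE_SALES = [
    (1, "1111111111", "EVAJEN", "2014-03-14", "E"),
    (2, "1111111111", "MICMAD", "2014-04-15", "E"),
    (3, "9999999999", "MICMAD", "2014-02-10", "E"),
]


def first_sales_per_seller(rows):
    """Count each customer once, for the seller who made the earliest sale."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE sales (id INTEGER, customer_id TEXT, created_by TEXT, "
            "create_date TEXT, subcategory TEXT)"
        )
        conn.executemany("INSERT INTO sales VALUES (?, ?, ?, ?, ?)", rows)
        # seqnum = 1 marks the earliest subcategory-E row of each customer.
        return conn.execute(
            """
            SELECT created_by, create_date, COUNT(*) AS new_customers
            FROM (SELECT s.*,
                         ROW_NUMBER() OVER (PARTITION BY customer_id
                                            ORDER BY create_date) AS seqnum
                  FROM sales AS s
                  WHERE subcategory = 'E') AS firsts
            WHERE seqnum = 1
            GROUP BY created_by, create_date
            ORDER BY created_by
            """
        ).fetchall()
    finally:
        conn.close()


print(first_sales_per_seller(SAMPLE_SALES))
# [('EVAJEN', '2014-03-14', 1), ('MICMAD', '2014-02-10', 1)]
```

MICMAD gets no count for id=2 because EVAJEN's earlier sale is the row numbered 1 for that customer.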

Select first purchase for each customer

We are trying to select the first purchase for each customer in a table similar to this:
transaction_no customer_id operator_id purchase_date
20503 1 5 2012-08-24
20504 1 7 2013-10-15
20505 2 5 2013-09-05
20506 3 7 2010-09-06
20507 3 7 2012-07-30
The expected result from the query that we are trying to achieve is:
transaction_no customer_id operator_id first_occurence
20503 1 5 2012-08-24
20505 2 5 2013-09-05
20506 3 7 2010-09-06
The closest we've got is the following query:
SELECT customer_id, MIN(purchase_date) As first_occurence
FROM Sales_Transactions_Header
GROUP BY customer_id;
With the following result:
customer_id first_occurence
1 2012-08-24
2 2013-09-05
3 2010-09-06
But when we select the rest of the needed fields, we have to add them to the GROUP BY clause, which changes the result from MIN. We have also tried joining the table to itself, but haven't made any progress.
How do we get the rest of the correlated values without making the aggregate function confused?
You can simply treat the query you have come up with as an inner query. This will work on older versions of SQL Server as well (you didn't specify which version you use).
SELECT H.transaction_no, H.customer_id, H.operator_id, H.purchase_date
FROM Sales_Transactions_Header H
INNER JOIN
(SELECT customer_id, MIN(purchase_date) As first_occurence
FROM Sales_Transactions_Header
GROUP BY customer_id) X
ON H.customer_id = X.customer_id AND H.purchase_date = X.first_occurence
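The join above can be tried on the question's sample rows with a short sketch. It assumes SQLite through Python's sqlite3 module as a stand-in for SQL Server. The ORDER BY is added only to make the output order predictable.

```python
import sqlite3

# Sample rows from the question: (transaction_no, customer_id, operator_id, purchase_date).
SAMPLE_TRANSACTIONS = [
    (20503, 1, 5, "2012-08-24"),
    (20504, 1, 7, "2013-10-15"),
    (20505, 2, 5, "2013-09-05"),
    (20506, 3, 7, "2010-09-06"),
    (20507, 3, 7, "2012-07-30"),
]


def first_purchases(rows):
    """Return the full row of each customer's earliest purchase."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE Sales_Transactions_Header (transaction_no INTEGER, "
            "customer_id INTEGER, operator_id INTEGER, purchase_date TEXT)"
        )
        conn.executemany("INSERT INTO Sales_Transactions_Header VALUES (?, ?, ?, ?)", rows)
        # X holds one (customer, first date) pair per customer; joining back recovers the other columns.
        return conn.execute(
            """
            SELECT H.transaction_no, H.customer_id, H.operator_id, H.purchase_date
            FROM Sales_Transactions_Header AS H
            INNER JOIN (SELECT customer_id, MIN(purchase_date) AS first_occurence
                        FROM Sales_Transactions_Header
                        GROUP BY customer_id) AS X
                ON H.customer_id = X.customer_id
               AND H.purchase_date = X.first_occurence
            ORDER BY H.transaction_no
            """
        ).fetchall()
    finally:
        conn.close()


print(first_purchases(SAMPLE_TRANSACTIONS))
# [(20503, 1, 5, '2012-08-24'), (20505, 2, 5, '2013-09-05'), (20506, 3, 7, '2010-09-06')]
```

If a customer has two transactions on the same first date, this join returns both of them.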
You can use the ROW_NUMBER function to help you with that.
This is how to do it for your case.
WITH Occurences AS
(
SELECT
*,
ROW_NUMBER () OVER (PARTITION BY customer_id order by purchase_date ) AS "Occurence"
FROM Sales_Transactions_Header
)
SELECT
transaction_no,
customer_id,
operator_id,
purchase_date
FROM Occurences
WHERE Occurence = 1
Sounds like a job for a CTE!
The CTE will allow you to get the earliest purchase date for each customer. Then you join that back to your original table on customer_id and the date, getting the rest of the information for that transaction.
Like so:
with first_date as(
select customer_id,
min(purchase_date) as first_purchase
from
table1
group by
customer_id
)
select
t1.transaction_no,
t1.customer_id,
t1.operator_id,
t1.purchase_date
from
table1 t1
inner join first_date
on
purchase_date = first_purchase
and t1.customer_id = first_date.customer_id
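The query below also works, provided the subquery is correlated on customer_id so that each row is compared only with its own customer's first date:
select * from customer_sale_details c
where c.purchase_date = (select min(c1.purchase_date)
                         from customer_sale_details c1
                         where c1.customer_id = c.customer_id);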

Select newest records that have distinct Name column

I did search around and I found this
SQL selecting rows by most recent date with two unique columns
Which is so close to what I want but I can't seem to make it work.
I get an error Column 'ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I want the newest row by date for each Distinct Name
Select ID,Name,Price,Date
From table
Group By Name
Order By Date ASC
Here is an example of what I want
Table
ID  Name  Price  Date
0   A     10     2012-05-03
1   B     9      2012-05-02
2   A     8      2012-05-04
3   C     10     2012-05-03
4   B     8      2012-05-01
Desired result
ID  Name  Price  Date
2   A     8      2012-05-04
3   C     10     2012-05-03
1   B     9      2012-05-02
I am using Microsoft SQL Server 2008
Select ID,Name, Price,Date
From temp t1
where date = (select max(date) from temp where t1.name =temp.name)
order by date desc
Or, as Conrad points out, you can use an INNER JOIN:
SELECT t1.ID, t1.Name, t1.Price, t1.Date
FROM temp t1
INNER JOIN
(
SELECT Max(date) date, name
FROM temp
GROUP BY name
) AS t2
ON t1.name = t2.name
AND t1.date = t2.date
ORDER BY date DESC
There are a couple of ways to do this. This one uses ROW_NUMBER. Just partition by Name, then order so that the row you want comes first in each partition.
WITH cte
AS (SELECT Row_number() OVER (partition BY NAME ORDER BY date DESC) RN,
id,
name,
price,
date
FROM table1)
SELECT id,
name,
price,
date
FROM cte
WHERE rn = 1
Note: in your actual query you should probably add ID as a tie-breaker for date, i.e. (partition BY NAME ORDER BY date DESC, ID DESC).
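To illustrate why the tie-breaker matters, here is a sketch with made-up rows in which name A has two rows on the same date. It assumes SQLite 3.25 or later through Python's sqlite3 module as a stand-in for SQL Server. The table name table1 follows the query above, but the column name row_date and the data are hypothetical.

```python
import sqlite3

# Hypothetical rows: ids 0 and 1 share both name and date, so date alone cannot order them.
TIED_ROWS = [
    (0, "A", 10, "2012-05-04"),
    (1, "A", 8, "2012-05-04"),
    (2, "B", 9, "2012-05-02"),
]


def newest_per_name(rows):
    """Return (row_number_result, rank_result) as lists of (id, name)."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE table1 (id INTEGER, name TEXT, price INTEGER, row_date TEXT)"
        )
        conn.executemany("INSERT INTO table1 VALUES (?, ?, ?, ?)", rows)
        # ROW_NUMBER with id DESC as tie-breaker: exactly one, predictable row per name.
        with_tiebreak = conn.execute(
            """
            SELECT id, name
            FROM (SELECT id, name,
                         ROW_NUMBER() OVER (PARTITION BY name
                                            ORDER BY row_date DESC, id DESC) AS rn
                  FROM table1) AS numbered
            WHERE rn = 1
            ORDER BY name
            """
        ).fetchall()
        # RANK without a tie-breaker: every row tied on the newest date gets rank 1.
        with_rank = conn.execute(
            """
            SELECT id, name
            FROM (SELECT id, name,
                         RANK() OVER (PARTITION BY name ORDER BY row_date DESC) AS rk
                  FROM table1) AS ranked
            WHERE rk = 1
            ORDER BY name, id
            """
        ).fetchall()
        return with_tiebreak, with_rank
    finally:
        conn.close()


print(newest_per_name(TIED_ROWS))
# ([(1, 'A'), (2, 'B')], [(0, 'A'), (1, 'A'), (2, 'B')])
```

Without the extra ID DESC, ROW_NUMBER would still return one row for A, but which of the two tied rows it picks is not guaranteed.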
select * from (
Select
ID, Name, Price, Date,
Rank() over (partition by Name order by Date desc) RankOrder
From table
) T
where RankOrder = 1
I have found another memory-efficient (though probably crude) way that has worked for me in PostgreSQL: order the query by date descending, then select the first record for each distinct Name. Note that DISTINCT ON requires the ORDER BY to start with the same expression.
SELECT distinct on (Name) ID, Name, Price, Date
from table
order by Name, Date desc
Use Distinct instead of Group By
Select Distinct ID,Name,Price,Date
From table
Order By Date ASC
http://technet.microsoft.com/en-us/library/ms187831.aspx