Is there a better way to do this join? - sql

I have a table of my sales agents' sales, by quarter:
Agent Quarter Sales
----------------------------
Alex Andersen 2011Q1 358
Alex Andersen 2011Q2 289
Alex Andersen 2011Q3 27
Alex Andersen 2011Q4 2965
Brian Blogg 2010Q3 277
Brian Blogg 2010Q4 123
Brian Blogg 2011Q1 783
Brian Blogg 2011Q2 0
Christy Cliff 2011Q2 777
Christy Cliff 2011Q3 273
Christy Cliff 2011Q4 111
Christy Cliff 2012Q1 901
What's the simplest, most efficient query for getting each agent's earliest quarter and the sales for that quarter?
It's easy to find out "What is each agent's first quarter?":
SELECT agent, min(quarter) FROM salestable GROUP BY agent
But this doesn't include the sales figures, so I thought I'd do a join:
SELECT agent, sales
FROM salestable s1
JOIN
(
SELECT agent AS e, MIN(quarter) AS q
FROM salestable
GROUP by employee
) AS q1 ON q1.e=s1.agent AND q1.mq=s1.quarter
But this is unacceptably slow on my data set. If I could use a cursor, it would only take one pass through the table, but using a query it seems to require a join. Is that right?

Try this variation and see if it's any better:
WITH cteRowNum AS (
SELECT agent, quarter, sales,
ROW_NUMBER() OVER (PARTITION BY agent ORDER BY quarter) AS RowNum
FROM salestable
)
SELECT agent, quarter, sales
FROM cteRowNum
WHERE RowNum = 1;

Related

SQL COUNT the number purchase between his first purchase and the follow 10 months

every customer has different first-time purchase date, I want to COUNT the number of purchases they have between the following 10 months after the first purchase?
sample table
TransactionID Client_name PurchaseDate Revenue
11 John Lee 10/13/2014 327
12 John Lee 9/15/2015 873
13 John Lee 11/29/2015 1,938
14 Rebort Jo 8/18/2013 722
15 Rebort Jo 5/21/2014 525
16 Rebort Jo 2/4/2015 455
17 Rebort Jo 3/20/2016 599
18 Tina Pe 10/8/2014 213
19 Tina Pe 6/10/2016 3,494
20 Tina Pe 8/9/2016 411
my code below just use ROW_NUM function to identify the first purchase, but I don't know how to do the calculations or there's a better way to do it?
SELECT client_name,
purchasedate,
Dateadd(month, 10, purchasedate) TenMonth,
Row_number()
OVER (
partition BY client_name
ORDER BY client_name) RM
FROM mytable
You might try something like this - I assume you're using SQL Server from the presence of DATEADD() and the fact that you're using a window function (ROW_NUMBER()):
WITH myCTE AS (
SELECT TransactionID, Client_name, PurchaseDate, Revenue
, MIN(PurchaseDate) OVER ( PARTITION BY Client_name ) AS min_PurchaseDate
FROM myTable
)
SELECT Client_name, COUNT(*)
FROM myCTE
WHERE PurchaseDate <= DATEADD(month, 10, min_PurchaseDate)
GROUP BY Client_name
Here I'm creating a common table expression (CTE) with all the data, including the date of first purchase, then I grab a count of all the purchases within a 10-month timeframe.
Hope this helps.
Give this a whirl ... Subquery to get the min purchase date, then LEFT JOIN to the main table to have a WHERE clause for the ten month date range, then count.
SELECT Client_name, COUNT(mt.PurchaseDate) as PurchaseCountFirstTenMonths
FROM myTable mt
LEFT JOIN (
SELECT Client_name, MIN(PurchaseDate) as MinPurchaseDate GROUP BY Client_name) mtmin
ON mt.Client_name = mtmin.Client_name AND mt.PurchaseDate = mtmin.MinPurchaseDate
WHERE mt.PurchaseDate >= mtmin.MinPurchaseDate AND mt.PurchaseDate <= DATEADD(month, 10, mtmin.MinPurchaseDate)
GROUP BY Client_name
ORDER BY Client_name
btw I'm guessing there's some kind of ClientID involved, as nine character full name runs the risk of duplicates.

SQL Retrieve distinct data by latest date

I have a table as follows:
Name Customer Date Amount
Joe Aaron 2012-01-03 14:12:00.0 150
Joe Aaron 2012-02-03 14:12:00.0 150
Joe Danny 2012-03-03 14:12:00.0 150
Joe Karen 2012-07-03 14:12:00.0 150
Ronald Blake 2012-05-03 14:12:00.0 1501
I would like to query to retrieve data by specifying the Name and if there are duplicates for Customer column, the records for the latest Date is
For example, if I want to query Joe, I will get the following result:
Name Customer Date Amount
Joe Aaron 2012-02-03 14:12:00.0 150
Joe Danny 2012-03-03 14:12:00.0 150
Joe Karen 2012-07-03 14:12:00.0 150
How should I do this? Tried distinct but it doesnt work that way.
EDIT
I'm using SQL Server. Sorry I re-edit my question and this should be the correct question that I am asking.
Did you Try this?
SELECT Name, Customer, MAX(Date) as CurrentDate, Amount
FROM data
Group By Name, Customer, Amount
HAVING Name = 'Joe'
Another option is using the WITH TIES clause in concert with Row_Number()
Example
Select top 1 with ties *
From YourTable
Where Name='Joe'
Order By Row_Number() over (Partition By Customer Order By Date Desc)
Returns
Name Customer Date Amount
Joe Aaron 2012-02-03 14:12:00.000 150
Joe Danny 2012-03-03 14:12:00.000 150
Joe Karen 2012-07-03 14:12:00.000 150
One option:
SELECT * FROM Table WHERE (NAME, CUSTOMER, DATE) IN (SELECT NAME, CUSTOMER, MAX(DATE) WHERE NAME = 'Joe' GROUP BY NAME, CUSTOMER)

Advanced Sql query solution required

player team start_date end_date points
John Jacob SportsBallers 2015-01-01 2015-03-31 100
John Jacob SportsKings 2015-04-01 2015-12-01 115
Joe Smith PointScorers 2014-01-01 2016-12-31 125
Bill Johnson SportsKings 2015-01-01 2015-06-31 175
Bill Johnson AllStarTeam 2015-07-01 2016-12-31 200
The above table has many more rows. I was asked the below questions in an interview.
1.)For each player, which team were they play for on 2015-01-01?
I could not answer this one.
2.)For each player, how can we get the team for whom they scored the most points?
select team from Players
where points in (select max(points) from players group by player).
Please, solutions for both.
1
select *
from PlayerTeams
where startdate <='2015-01-01' and enddate >= '2015-01-01'
2
Select player, team, points
from(
Select *, row_number() over (partition by player order by points desc) as rank
From PlayerTeams) as player
where rank = 1
For #1:
Select Player
,Team
From table
Where '2015-01-01' between start_date and end_date
For #2:
select t.Player
,t.Team
from table t
inner join (select Player
,Max(points)
from table
group by Player) m
on t.Player = m.Player
and t.points = m.points

Using distinct and sum in sql server 2008

I'm trying to get the SUM(Values) for each Acct, but my issue is trying to get at least one entire row for a DISTINCT Acct with the SUM(Values).
I have some sample data for example:
Acct Values Name Street
123456789 100.20 John 66 Main Street
123456789 200.80 John 22 Main Avenue
222222222 50.25 Jane 1 Blvd
333333333 25.00 Joe 55 Test Ave
333333333 50.00 Joe 8 Douglas Road
555555555 75.00 Tim 12 Clark Ave
666666666 500.00 Tim 12 Clark Street
666666666 500.00 Tim 3 Main Rd.
My query consisted of:
SELECT DISTINCT Acct, SUM(Value) AS [TOTAL]
FROM TABLE_NAME
GROUP BY Acct
The above query gets me close to what I need, but I need the entire row.
Example below of what I am looking for:
Acct Total Name Addr1
123456789 301.00 John 66 Main Street
222222222 50.25 Jane 1 Blvd
333333333 75.00 Joe 55 Test Ave
555555555 75.00 Tim 12 Clark Ave
666666666 1000.00 Tim 12 Clark Street
Thanks.
If it does not matter what address you return, then you can apply and aggregate to the other columns:
SELECT Acct,
SUM(Value) AS [TOTAL],
max(name) name,
max(Street) addr1
FROM TABLE_NAME
GROUP BY Acct;
See SQL Fiddle with Demo
You can do this using window functions such as row_number() in most databases:
select acct, total, name, addr1
from (select t.*, row_number() over (partition by acct order by acct) as seqnum,
sum(value) over (partition by acct) as Total
from table_name
) t
where seqnum = 1;
I would use Windowing Functions (the OVER clause) to solve this.
SELECT DISTINCT
Acct
,SUM([Values]) OVER (PARTITION BY Acct) AS 'Total'
,Name
,FIRST_VALUE(Street) OVER (PARTITION BY Acct ORDER BY Street DESC) AS 'Addr1'
FROM TABLE_NAME
;
The nice thing about Windowing Functions is that you do not add things to a grouping that you do not need in your functions (e.g. SUM), instead you can focus on describing what you are looking for.
In the SQL above, we are saying we want the SUM of Values grouped by (or PARTITION BY as it is called in the OVER clause) Acct. The FIRST_VALUE allows use to return the first value of the street address. The same did not have a DATETIME column so it is hard to say what the order should be for the first value. There is also a LAST_VALUE windowing function. Assuming you do have a DATETIME column you would want to ORDER BY that column value, if not you can just pick some value like I did with Street (MAX might also be a good option then too, but having some type of DATETIME value would be the best way to do it).
Check out this SQL Fiddle: http://sqlfiddle.com/#!6/a474c/8
Here is the BOL about SUM using the OVER clause: http://msdn.microsoft.com/en-us/library/ms187810.aspx
Here is more info on FIRST_VALUE: http://blog.sqlauthority.com/2011/11/09/sql-server-introduction-to-first-_value-and-last_value-analytic-functions-introduced-in-sql-server-2012/
Here is a blog post I've done on the Windowing Functions: http://comp-phil.blogspot.com/2013/03/higher-order-functions.html

How to produce detail, not summary, report sorted by count(*)?

Oracle 11g:
I want results to list by highest count, then ch_id. When I use group by to get the count then I loose the granularity of the detail. Is there an analytic function I could use?
SALES
ch_id desc customer
=========================
ANAR Anari BOB
SWIS Swiss JOE
SWIS Swiss AMY
BRUN Brunost SAM
BRUN Brunost ANN
BRUN Brunost ROB
Desired Results
count ch_id customer
===========================================
3 BRUN ANN
3 BRUN ROB
3 BRUN SAM
2 SWIS AMY
2 SWIS JOE
1 ANAR BOB
Use the analytic count(*):
select * from
(
select count(*) over (partition by ch_id) cnt,
ch_id, customer
from sales
)
order by cnt desc
select total, ch_id, customer
from sales s
inner join (select count(*) total, ch_id from sales group by ch_id) b
on b.ch_id = s.chi_id
order by total, ch_id
ok - the other post that happened at the same time, using partition, is the better solution for Oracle. But this one works regardless of DB.