SQL Retrieve distinct data by latest date - sql

I have a table as follows:
Name    Customer  Date                   Amount
Joe     Aaron     2012-01-03 14:12:00.0  150
Joe     Aaron     2012-02-03 14:12:00.0  150
Joe     Danny     2012-03-03 14:12:00.0  150
Joe     Karen     2012-07-03 14:12:00.0  150
Ronald  Blake     2012-05-03 14:12:00.0  1501
I would like to retrieve data by specifying the Name; if there are duplicates in the Customer column, only the record with the latest Date should be returned.
For example, if I want to query Joe, I will get the following result:
Name  Customer  Date                   Amount
Joe   Aaron     2012-02-03 14:12:00.0  150
Joe   Danny     2012-03-03 14:12:00.0  150
Joe   Karen     2012-07-03 14:12:00.0  150
How should I do this? I tried DISTINCT, but it doesn't work that way.
EDIT
I'm using SQL Server. Sorry, I have re-edited my question; this should be the correct version of what I am asking.

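Did you try this?
SELECT Name, Customer, MAX(Date) AS CurrentDate, Amount
FROM data
WHERE Name = 'Joe'
GROUP BY Name, Customer, Amount
-- Amount is part of the grouping, so a customer with several different amounts gets one row per amount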

Another option is using the WITH TIES clause in concert with Row_Number()
Example
Select top 1 with ties *
From YourTable
Where Name='Joe'
Order By Row_Number() over (Partition By Customer Order By Date Desc)
Returns
Name  Customer  Date                     Amount
Joe   Aaron     2012-02-03 14:12:00.000  150
Joe   Danny     2012-03-03 14:12:00.000  150
Joe   Karen     2012-07-03 14:12:00.000  150

One option, for databases that support row-value comparisons in IN (SQL Server does not):
SELECT *
FROM Table
WHERE (NAME, CUSTOMER, DATE) IN (
    SELECT NAME, CUSTOMER, MAX(DATE)
    FROM Table
    WHERE NAME = 'Joe'
    GROUP BY NAME, CUSTOMER
)
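SQL Server does not accept a row-value IN list, so the same idea is usually written as a join against the grouped subquery. Below is a minimal sketch of that join, run through Python's sqlite3 module as a stand-in for SQL Server. The table name data and the variable names are assumptions for illustration; the rows are the ones from the question.

```python
import sqlite3

# SQLite (in memory) stands in for SQL Server here; "data" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (Name TEXT, Customer TEXT, Date TEXT, Amount INTEGER)")
conn.executemany(
    "INSERT INTO data VALUES (?, ?, ?, ?)",
    [
        ("Joe", "Aaron", "2012-01-03 14:12:00.0", 150),
        ("Joe", "Aaron", "2012-02-03 14:12:00.0", 150),
        ("Joe", "Danny", "2012-03-03 14:12:00.0", 150),
        ("Joe", "Karen", "2012-07-03 14:12:00.0", 150),
        ("Ronald", "Blake", "2012-05-03 14:12:00.0", 1501),
    ],
)

# The derived table finds the latest Date per (Name, Customer);
# joining back on all three columns recovers the full row, including Amount.
query = """
SELECT d.Name, d.Customer, d.Date, d.Amount
FROM data AS d
JOIN (
    SELECT Name, Customer, MAX(Date) AS MaxDate
    FROM data
    WHERE Name = ?
    GROUP BY Name, Customer
) AS latest
  ON latest.Name = d.Name
 AND latest.Customer = d.Customer
 AND latest.MaxDate = d.Date
ORDER BY d.Customer
"""
latest_rows = conn.execute(query, ("Joe",)).fetchall()
for row in latest_rows:
    print(row)
```

Unlike grouping by Amount, this keeps exactly one row per customer even when the amounts differ between dates (two rows would still come back if a customer had two records with the identical latest timestamp).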

Related

How do I select a max date by person in a table

I am not too advanced with SSRS/SQL queries, and I need to write a report that pulls out % allocations by person, which are then compared to a wage table to allocate the wages. These allocations change quarterly, but all allocations continue to be stored in the table. If a person's allocation did not change, they do NOT get a new entry in the table. Here is a sample table called Allocations.
First Name  Last Name  Date      Area  Percent
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       01/01/20  A     25.00
Doe         Jane       01/01/20  B     25.00
Doe         Jane       01/01/20  C     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      01/01/20  A     100.00
Wayne       Bruce      04/01/20  B     100.00
The results that I would want to have from this sample table when querying it are:
First Name  Last Name  Date      Area  Percent
Smith       Bob        01/01/20  A     50.00
Smith       Bob        01/01/20  B     50.00
Doe         Jane       04/01/20  A     35.00
Doe         Jane       04/01/20  C     65.00
Wayne       Bruce      04/01/20  B     100.00
However, I would also like to pull this by comparing it to a date that the user inputs, so that they could run this report at any point in time and get the correct "max" dates. So, for example, if there were also 7/1/20 dates in here, but the user input date was 6/30/20, I would NOT want to pull the 7/1/20 data. In other words, I would like to pull the rows with the maximum date by name w/o going over the user's input date.
Any idea on the best way to accomplish this?
Thanks in advance for any advice you can provide.
In SQL, ROW_NUMBER can be used to number the records within each group, ordered by a particular field.
SELECT * FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Last_Name, First_Name ORDER BY DATE DESC) AS ROW_NUM
    FROM TABLE
) AS T
WHERE ROW_NUM = 1
Then you filter for ROW_NUM = 1.
However, I noticed that there are a couple of rows with the same date and you want both. In this case you'd want to use RANK, which gives tied rows the same rank, so every record that shares the latest date is captured.
SELECT * FROM (
    SELECT *, RANK() OVER (PARTITION BY Last_Name, First_Name ORDER BY DATE DESC) AS ROW_NUM
    FROM TABLE
) AS T
WHERE ROW_NUM = 1
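The question also asks for the latest date that does not go past a user-supplied date. One way to handle that, sketched here under assumptions, is to filter out later rows before ranking. The example runs the SQL through Python's sqlite3 module (SQLite 3.25 or newer for window functions) as a stand-in for SQL Server. The column names Alloc_Date and Pct, the helper allocations_as_of, and the extra 2020-07-01 row are made up for illustration; in a real report the cutoff would presumably come from a report parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical column names; dates are stored as ISO text so they sort correctly.
conn.execute(
    "CREATE TABLE Allocations ("
    "First_Name TEXT, Last_Name TEXT, Alloc_Date TEXT, Area TEXT, Pct REAL)"
)
conn.executemany(
    "INSERT INTO Allocations VALUES (?, ?, ?, ?, ?)",
    [
        ("Smith", "Bob", "2020-01-01", "A", 50.0),
        ("Smith", "Bob", "2020-01-01", "B", 50.0),
        ("Doe", "Jane", "2020-01-01", "A", 25.0),
        ("Doe", "Jane", "2020-01-01", "B", 25.0),
        ("Doe", "Jane", "2020-01-01", "C", 50.0),
        ("Doe", "Jane", "2020-04-01", "A", 35.0),
        ("Doe", "Jane", "2020-04-01", "C", 65.0),
        ("Wayne", "Bruce", "2020-01-01", "A", 100.0),
        ("Wayne", "Bruce", "2020-04-01", "B", 100.0),
        # Made-up future row, added only to show that the cutoff excludes it.
        ("Doe", "Jane", "2020-07-01", "A", 100.0),
    ],
)

AS_OF_QUERY = """
SELECT First_Name, Last_Name, Alloc_Date, Area, Pct
FROM (
    SELECT a.*,
           RANK() OVER (
               PARTITION BY Last_Name, First_Name
               ORDER BY Alloc_Date DESC
           ) AS rnk
    FROM Allocations AS a
    WHERE Alloc_Date <= :as_of   -- drop rows after the user's date before ranking
) AS ranked
WHERE rnk = 1
ORDER BY First_Name, Last_Name, Area
"""


def allocations_as_of(as_of):
    """Return each person's rows for their latest allocation date <= as_of."""
    return conn.execute(AS_OF_QUERY, {"as_of": as_of}).fetchall()


for row in allocations_as_of("2020-06-30"):
    print(row)
```

The filter has to sit inside the subquery: applied after ranking, a person whose newest row is past the cutoff would have rank 1 on a row that then gets removed, and would drop out of the report entirely.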

SQL Server query to 'flatten' data for reporting

Say I have a table with the following data, in the following structure. I'm trying to query the data to find the date ranges that someone (employee) worked.
NAME  WORKED  DATE
Bob   YES     1/1/2019
Bob   YES     1/2/2019
Bob   YES     1/3/2019
Bob   NO      1/4/2019
Bob   YES     1/5/2019
Bob   YES     1/6/2019
Bob   NO      1/7/2019
Jane  Yes     1/1/2019
Jane  Yes     1/2/2019
Jane  No      1/3/2019
Expected Result: (The Result I need)
Bob 1/1/2019 - 1/3/2019
Bob 1/5/2019 - 1/6/2019
Jane 1/1/2019 - 1/2/2019
What's the SQL syntax (SQL Server 2008+) of the query to return this result set?
thx in advance
This is a gaps-and-islands problem. You can identify the rows using row_number() and some date arithmetic.
So, assuming you have a row for every date:
select name, min(date), max(date)
from (select t.*,
             row_number() over (partition by name order by date) as seqnum
      from t
      where worked = 'YES'
     ) t
group by name,
         dateadd(day, - seqnum, date);
Why does this work? You are looking for adjacent dates. If you subtract a sequence from the dates, then the result is constant -- when the dates are sequential. This observation is used in the group by to get the groups you want.
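To see the subtraction trick on the question's rows, here is a small sketch run through Python's sqlite3 module (SQLite 3.25 or newer) rather than SQL Server. SQLite has no dateadd, so date(work_date, '-N days') plays that role. The table name attendance and the column name work_date are assumptions, the dates are rewritten as ISO text, and the WORKED values are normalised to upper case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table/column names; ISO dates so that text ordering matches date ordering.
conn.execute("CREATE TABLE attendance (name TEXT, worked TEXT, work_date TEXT)")
conn.executemany(
    "INSERT INTO attendance VALUES (?, ?, ?)",
    [
        ("Bob", "YES", "2019-01-01"),
        ("Bob", "YES", "2019-01-02"),
        ("Bob", "YES", "2019-01-03"),
        ("Bob", "NO", "2019-01-04"),
        ("Bob", "YES", "2019-01-05"),
        ("Bob", "YES", "2019-01-06"),
        ("Bob", "NO", "2019-01-07"),
        ("Jane", "YES", "2019-01-01"),
        ("Jane", "YES", "2019-01-02"),
        ("Jane", "NO", "2019-01-03"),
    ],
)

# For consecutive worked days, (date - sequence number) is constant,
# so grouping on that difference yields one row per unbroken run.
query = """
SELECT name, MIN(work_date) AS start_date, MAX(work_date) AS end_date
FROM (
    SELECT name, work_date,
           ROW_NUMBER() OVER (PARTITION BY name ORDER BY work_date) AS seqnum
    FROM attendance
    WHERE worked = 'YES'
) AS s
GROUP BY name, date(work_date, '-' || seqnum || ' days')
ORDER BY name, start_date
"""
islands = conn.execute(query).fetchall()
for name, start_date, end_date in islands:
    print(name, start_date, "-", end_date)
```

For Bob, the worked days 1 to 3 January all map to 2018-12-31 after the subtraction, while 5 and 6 January map to 2019-01-01, which is why they fall into two separate groups.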

SQL - group on occurrence in x or y

I'm having a hard time making the following to work:
I have a list of transactions consisting of Sender,Recipient, Amount and Date.
Table: Transactions
Sender Recipient Amount Date
--------------------------------------------------
Jack Bob 52 2019-04-21 11:06:32
Bob Jack 12 2019-03-29 12:08:11
Bob Jill 50 2019-04-19 24:50:26
Jill Bob 90 2019-03-20 16:34:35
Jill Jack 81 2019-03-25 12:26:54
Bob Jenny 53 2019-04-20 09:07:02
Jack Jenny 5 2019-03-29 06:15:35
Now I want to list the people who have participated in transactions, how many transactions they have participated in and the dates of the first and last transaction they participated in :
Result
Person NUM_TX First_active last_active
------------------------------------------------------------------
Jack 4 2019-03-25 12:26:54 2019-04-21 11:06:32
Bob 5 xxxx-xx-xx xx:xx:xx xxxx-xx-xx xx:xx:xx
Jill 3 xxxx-xx-xx xx:xx:xx xxxx-xx-xx xx:xx:xx
Jenny 2 xxxx-xx-xx xx:xx:xx xxxx-xx-xx xx:xx:xx
Using a GROUP BY statement doesn't seem right. What is the right way to achieve my goal? I'm running on Postgres, btw.
You need a UNION ALL to combine the two columns into a single person column of a result set, and then group by person:
select
  t.person Person,
  count(*) NUM_TX,
  min(t.date) First_active,
  max(t.date) Last_active
from (
  select sender person, date from transactions
  union all
  select recipient person, date from transactions
) t
group by t.person
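As a quick check of the UNION ALL approach, here is a sketch that loads the question's rows into an in-memory SQLite database through Python's sqlite3 module (a stand-in for Postgres, chosen only so the example is self-contained). The timestamps are copied verbatim from the question and stored as text, so MIN and MAX compare them as strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE transactions (sender TEXT, recipient TEXT, amount INTEGER, date TEXT)"
)
conn.executemany(
    "INSERT INTO transactions VALUES (?, ?, ?, ?)",
    [
        # Rows copied verbatim from the question (including the odd 24:50:26 time).
        ("Jack", "Bob", 52, "2019-04-21 11:06:32"),
        ("Bob", "Jack", 12, "2019-03-29 12:08:11"),
        ("Bob", "Jill", 50, "2019-04-19 24:50:26"),
        ("Jill", "Bob", 90, "2019-03-20 16:34:35"),
        ("Jill", "Jack", 81, "2019-03-25 12:26:54"),
        ("Bob", "Jenny", 53, "2019-04-20 09:07:02"),
        ("Jack", "Jenny", 5, "2019-03-29 06:15:35"),
    ],
)

# Each transaction contributes two rows to the inner query: one for the sender
# and one for the recipient. UNION ALL keeps both (plain UNION would de-duplicate).
query = """
SELECT t.person,
       COUNT(*)    AS num_tx,
       MIN(t.date) AS first_active,
       MAX(t.date) AS last_active
FROM (
    SELECT sender AS person, date FROM transactions
    UNION ALL
    SELECT recipient AS person, date FROM transactions
) AS t
GROUP BY t.person
ORDER BY t.person
"""
activity = conn.execute(query).fetchall()
for row in activity:
    print(row)
```

The counts match the expected result in the question: Bob 5, Jack 4, Jenny 2, Jill 3.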
This is a good place to use a lateral join:
select v.person, count(*) as num_transactions,
min(t.date) as first_date,
max(t.date) as last_date
from transactions t cross join lateral
(values (sender), (recipient)) v(person)
group by v.person;

Using distinct and sum in sql server 2008

I'm trying to get the SUM(Values) for each Acct, but my issue is trying to get at least one entire row for a DISTINCT Acct with the SUM(Values).
I have some sample data for example:
Acct Values Name Street
123456789 100.20 John 66 Main Street
123456789 200.80 John 22 Main Avenue
222222222 50.25 Jane 1 Blvd
333333333 25.00 Joe 55 Test Ave
333333333 50.00 Joe 8 Douglas Road
555555555 75.00 Tim 12 Clark Ave
666666666 500.00 Tim 12 Clark Street
666666666 500.00 Tim 3 Main Rd.
My query consisted of:
SELECT DISTINCT Acct, SUM(Value) AS [TOTAL]
FROM TABLE_NAME
GROUP BY Acct
The above query gets me close to what I need, but I need the entire row.
Example below of what I am looking for:
Acct Total Name Addr1
123456789 301.00 John 66 Main Street
222222222 50.25 Jane 1 Blvd
333333333 75.00 Joe 55 Test Ave
555555555 75.00 Tim 12 Clark Ave
666666666 1000.00 Tim 12 Clark Street
Thanks.
If it does not matter which address you return, then you can apply an aggregate to the other columns:
SELECT Acct,
SUM(Value) AS [TOTAL],
max(name) name,
max(Street) addr1
FROM TABLE_NAME
GROUP BY Acct;
See SQL Fiddle with Demo
You can do this using window functions such as row_number() in most databases:
select acct, total, name, addr1
from (select t.*, row_number() over (partition by acct order by acct) as seqnum,
sum(value) over (partition by acct) as Total
from table_name
) t
where seqnum = 1;
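Here is a sketch of the row_number() plus windowed SUM approach on the question's sample data, run through Python's sqlite3 module (SQLite 3.25 or newer) instead of SQL Server. The table name accounts and the lower-case column names are assumptions. One deliberate change: the rows are numbered by street rather than by acct, so that the row kept for each account is deterministic in this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table name; "value" avoids the reserved word VALUES.
conn.execute("CREATE TABLE accounts (acct TEXT, value REAL, name TEXT, street TEXT)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?, ?)",
    [
        ("123456789", 100.20, "John", "66 Main Street"),
        ("123456789", 200.80, "John", "22 Main Avenue"),
        ("222222222", 50.25, "Jane", "1 Blvd"),
        ("333333333", 25.00, "Joe", "55 Test Ave"),
        ("333333333", 50.00, "Joe", "8 Douglas Road"),
        ("555555555", 75.00, "Tim", "12 Clark Ave"),
        ("666666666", 500.00, "Tim", "12 Clark Street"),
        ("666666666", 500.00, "Tim", "3 Main Rd."),
    ],
)

# SUM(...) OVER attaches the account total to every row of the account;
# ROW_NUMBER() then lets us keep a single representative row per account.
query = """
SELECT acct, total, name, street
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY acct ORDER BY street) AS seqnum,
           ROUND(SUM(value) OVER (PARTITION BY acct), 2) AS total
    FROM accounts AS t
) AS ranked
WHERE seqnum = 1
ORDER BY acct
"""
totals = conn.execute(query).fetchall()
for row in totals:
    print(row)
```

Because the ordering is by street text, account 123456789 keeps "22 Main Avenue" here rather than the "66 Main Street" shown in the question; picking a specific address would need a column that defines which row counts as first.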
I would use Windowing Functions (the OVER clause) to solve this.
SELECT DISTINCT
Acct
,SUM([Values]) OVER (PARTITION BY Acct) AS 'Total'
,Name
,FIRST_VALUE(Street) OVER (PARTITION BY Acct ORDER BY Street DESC) AS 'Addr1'
FROM TABLE_NAME
;
The nice thing about Windowing Functions is that you do not add things to a grouping that you do not need in your functions (e.g. SUM), instead you can focus on describing what you are looking for.
In the SQL above, we are saying we want the SUM of Values grouped by (or PARTITION BY, as it is called in the OVER clause) Acct. FIRST_VALUE allows us to return the first value of the street address; note that it requires SQL Server 2012 or later. The sample did not have a DATETIME column, so it is hard to say what the order should be for the first value. There is also a LAST_VALUE windowing function. Assuming you do have a DATETIME column, you would want to ORDER BY that column; if not, you can just pick some value, as I did with Street (MAX might also be a good option, but having some type of DATETIME value would be the best way to do it).
Check out this SQL Fiddle: http://sqlfiddle.com/#!6/a474c/8
Here is the BOL about SUM using the OVER clause: http://msdn.microsoft.com/en-us/library/ms187810.aspx
Here is more info on FIRST_VALUE: http://blog.sqlauthority.com/2011/11/09/sql-server-introduction-to-first-_value-and-last_value-analytic-functions-introduced-in-sql-server-2012/
Here is a blog post I've done on the Windowing Functions: http://comp-phil.blogspot.com/2013/03/higher-order-functions.html

Is there a better way to do this join?

I have a table of my sales agents' sales, by quarter:
Agent Quarter Sales
----------------------------
Alex Andersen 2011Q1 358
Alex Andersen 2011Q2 289
Alex Andersen 2011Q3 27
Alex Andersen 2011Q4 2965
Brian Blogg 2010Q3 277
Brian Blogg 2010Q4 123
Brian Blogg 2011Q1 783
Brian Blogg 2011Q2 0
Christy Cliff 2011Q2 777
Christy Cliff 2011Q3 273
Christy Cliff 2011Q4 111
Christy Cliff 2012Q1 901
What's the simplest, most efficient query for getting each agent's earliest quarter and the sales for that quarter?
It's easy to find out "What is each agent's first quarter?":
SELECT agent, min(quarter) FROM salestable GROUP BY agent
But this doesn't include the sales figures, so I thought I'd do a join:
SELECT agent, sales
FROM salestable s1
JOIN
(
SELECT agent AS e, MIN(quarter) AS mq
FROM salestable
GROUP BY agent
) AS q1 ON q1.e = s1.agent AND q1.mq = s1.quarter
But this is unacceptably slow on my data set. If I could use a cursor, it would only take one pass through the table, but using a query it seems to require a join. Is that right?
Try this variation and see if it's any better:
WITH cteRowNum AS (
SELECT agent, quarter, sales,
ROW_NUMBER() OVER (PARTITION BY agent ORDER BY quarter) AS RowNum
FROM salestable
)
SELECT agent, quarter, sales
FROM cteRowNum
WHERE RowNum = 1;
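To check the CTE against the sample rows from the question, here is a sketch using Python's sqlite3 module (SQLite 3.25 or newer) as a stand-in for the asker's database, which the question does not name. It says nothing about speed on a large table; it only confirms the result shape.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salestable (agent TEXT, quarter TEXT, sales INTEGER)")
conn.executemany(
    "INSERT INTO salestable VALUES (?, ?, ?)",
    [
        ("Alex Andersen", "2011Q1", 358),
        ("Alex Andersen", "2011Q2", 289),
        ("Alex Andersen", "2011Q3", 27),
        ("Alex Andersen", "2011Q4", 2965),
        ("Brian Blogg", "2010Q3", 277),
        ("Brian Blogg", "2010Q4", 123),
        ("Brian Blogg", "2011Q1", 783),
        ("Brian Blogg", "2011Q2", 0),
        ("Christy Cliff", "2011Q2", 777),
        ("Christy Cliff", "2011Q3", 273),
        ("Christy Cliff", "2011Q4", 111),
        ("Christy Cliff", "2012Q1", 901),
    ],
)

# Quarters such as '2010Q3' sort correctly as text, so ordering by quarter
# puts each agent's earliest quarter at RowNum = 1.
query = """
WITH cteRowNum AS (
    SELECT agent, quarter, sales,
           ROW_NUMBER() OVER (PARTITION BY agent ORDER BY quarter) AS RowNum
    FROM salestable
)
SELECT agent, quarter, sales
FROM cteRowNum
WHERE RowNum = 1
ORDER BY agent
"""
first_quarters = conn.execute(query).fetchall()
for row in first_quarters:
    print(row)
```

On a real table, an index on (agent, quarter) may let the engine avoid a separate sort for the window, though whether it does depends on the database and the data.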