Group-by statement doesn't work - sql

So currently, I have the following table:
ID, Name, Code, Date
1 AB x1 01/03/2014
1 AB x2 01/04/2014
1 AB x3 01/05/2014
2 BC x3 01/05/2014
2 BC x5 01/06/2014
3 CD x1 01/06/2014
I want the following output:
ID, Name, Code, Date
1 AB x3 01/05/2014
2 BC x5 01/06/2014
3 CD x1 01/06/2014
So basically, I just want the latest date, without caring for the code.
In my code, I have
select id, name, code, max(date)
from the_table
group by id, name, code
But the group by does not work, because it also takes the code into consideration, so I don't get just the latest date per ID. I can't leave code out of the group by either, as that gives me an error.
How do I use a group by without including code?
I'm using PL/SQL developer as IDE.

select id, name, code, date
from (
select id, name, code,
date,
max(date) over (partition by id) as max_date
from the_table
)
where date = max_date;
If you want to pick exactly one row when several rows share the max date, you can use row_number() instead:
select id, name, code, date
from (
select id, name, code,
date,
row_number() over (partition by id order by date desc) as rn
from the_table
)
where rn = 1;
Btw: date is a horrible name for a column. For one because it's also the name of a data type but more importantly because it does not document at all what the column contains. An "end date"? A "start date"? A "due date"? ...

What you want is the latest updated record, right?
select t1.*
from table t1
inner join (select id, name, max(date) as latest_date
from table
group by id, name) t2 on t1.date = t2.latest_date
and t1.id = t2.id and t1.name = t2.name
It would be good to have an index on the date column.
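To check the join-to-aggregate approach against the sample data from the question, here is a minimal sketch using Python's built-in sqlite3 module. The table name the_table and the ISO date strings (assuming the original dates are month/day/year) are my assumptions, not part of the original post.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
# Dates are stored as ISO strings (an assumption) so that MAX() orders them correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE the_table (id INTEGER, name TEXT, code TEXT, date TEXT)")
conn.executemany(
    "INSERT INTO the_table VALUES (?, ?, ?, ?)",
    [
        (1, "AB", "x1", "2014-01-03"),
        (1, "AB", "x2", "2014-01-04"),
        (1, "AB", "x3", "2014-01-05"),
        (2, "BC", "x3", "2014-01-05"),
        (2, "BC", "x5", "2014-01-06"),
        (3, "CD", "x1", "2014-01-06"),
    ],
)

# Group by id and name only, then join back to pick up the code of the latest row.
latest = conn.execute(
    """
    SELECT t1.id, t1.name, t1.code, t1.date
    FROM the_table t1
    INNER JOIN (
        SELECT id, name, MAX(date) AS latest_date
        FROM the_table
        GROUP BY id, name
    ) t2
      ON t1.id = t2.id AND t1.name = t2.name AND t1.date = t2.latest_date
    ORDER BY t1.id
    """
).fetchall()

for row in latest:
    print(row)
```

The code column never appears in the GROUP BY; it comes along only through the join on the max date.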

I assume you want to get whatever the code is that is on the row that has the max date. If you truly don't care what code gets returned, just use an aggregate function on it like max(code).
Otherwise, you can do this:
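SELECT t1.id, t1.name, t2.code, t2.date
FROM (SELECT DISTINCT id, name FROM MyTable) t1
CROSS APPLY (
SELECT TOP 1 code, date
FROM MyTable t3
WHERE t3.id=t1.id
AND t3.name=t1.name
ORDER BY t3.date DESC
) t2
This is SQL Server syntax. A plain CROSS JOIN cannot reference t1 inside the derived table, so it has to be CROSS APPLY. Oracle 12c and later also accept CROSS APPLY, with FETCH FIRST 1 ROW ONLY after the ORDER BY in place of TOP 1.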

Related

SQL GroupBy to show one row with max value in one row

I have following table
ID Date quantity storename
id1 01-01 1 A1
id2 01-03 3 A2
id1 01-03 40 A2
I want to see
ID Date quantity storename
id1 01-03 40 A2
id2 01-03 3 A2
So basically I would like to groupby ID and find the max(newest) date. Then get the entire row data from that max(newest) date.
I tried the following code and it's not working out.
SELECT ID, max(Date), quantity, storename FROM table
GROUPBY ID
Also, is it possible to get all the columns(Like using *) instead of specifying one by one?
You can use a correlated subquery with an aggregate:
select t.*
from t
where t.date = (select max(t2.date) from t t2 where t2.id = t.id);
If the date has a time component and you want every row from the latest day, compare truncated values:
select t.*
from t
where trunc(t.date) = (select trunc(max(t2.date)) from t t2 where t2.id = t.id);
You would want to do something like this:
select *
from (
select
ROW_NUMBER() OVER
(
PARTITION BY
tbl.ID
ORDER BY
tbl.date desc
) AS RowNumber,
tbl.*
from
tablename tbl
) tmp
where tmp.RowNumber = 1
PARTITION BY groups the rows (here, one group per ID) and ORDER BY sorts the rows within each group, so the row numbered 1 is the newest one. You can list several columns in either clause, separated by commas.
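Here is a small sketch of the ROW_NUMBER approach run against the question's sample rows, using Python's sqlite3 module. The table name sales is hypothetical, and window functions need SQLite 3.25 or later.

```python
import sqlite3

# Sample rows from the question; the table name "sales" is made up for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id TEXT, date TEXT, quantity INTEGER, storename TEXT)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?, ?)",
    [
        ("id1", "01-01", 1, "A1"),
        ("id2", "01-03", 3, "A2"),
        ("id1", "01-03", 40, "A2"),
    ],
)

# Number the rows inside each id from newest to oldest, then keep row 1 of each id.
newest = conn.execute(
    """
    SELECT id, date, quantity, storename
    FROM (
        SELECT s.*,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY date DESC) AS rn
        FROM sales s
    )
    WHERE rn = 1
    ORDER BY id
    """
).fetchall()

for row in newest:
    print(row)
```

Selecting s.* in the inner query is what lets you keep all columns without naming them one by one; the outer query only has to drop the helper column rn.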

Select multiple max values after GROUP BY query

Suppose I have a table look like this:
date ID income
0 9/1 C 10.40
1 9/3 A 33.90
2 9/3 B 29.10
3 9/4 C 19.30
4 9/4 B 17.80
5 9/5 B 9.55
6 9/5 C 11.10
7 9/5 A 13.10
8 9/7 A 29.10
9 9/7 B 29.10
I want to find out the ID who made the most income for each date. The most intuitive approach would be writing
SELECT ID, MAX(income) FROM table GROUP BY date
But two IDs made the same MAX income on 9/7, and I want to retain all ties on the same date; that query would drop one of the IDs on 9/7. Note also that 29.1 appears on both 9/3 and 9/7. Is there another approach?
A join based approach doesn't have this problem, and would retain all records tied for the max income on a given date.
SELECT t1.*
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS max_income
FROM yourTable
GROUP BY date
) t2
ON t1.date = t2.date AND t1.income = t2.max_income
ORDER BY
t1.date;
The way the above query works is to join the complete original table to a subquery which finds, for each date, the maximum income value. This has the effect of filtering off any record which did not have the max income on a given date. Pay close attention to the join condition, which has two components, the date, and the income.
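To see that the join keeps both tied IDs on 9/7, here is a minimal sketch with Python's sqlite3 module and the data from the question. The table name yourTable comes from the answer; storing the dates as the short strings shown in the question is my assumption.

```python
import sqlite3

# Sample data from the question (the leading index column is omitted).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourTable (date TEXT, ID TEXT, income REAL)")
conn.executemany(
    "INSERT INTO yourTable VALUES (?, ?, ?)",
    [
        ("9/1", "C", 10.40), ("9/3", "A", 33.90), ("9/3", "B", 29.10),
        ("9/4", "C", 19.30), ("9/4", "B", 17.80), ("9/5", "B", 9.55),
        ("9/5", "C", 11.10), ("9/5", "A", 13.10), ("9/7", "A", 29.10),
        ("9/7", "B", 29.10),
    ],
)

# Join every row to the per-date maximum; rows tied for the maximum all survive.
top_earners = conn.execute(
    """
    SELECT t1.date, t1.ID, t1.income
    FROM yourTable t1
    INNER JOIN (
        SELECT date, MAX(income) AS max_income
        FROM yourTable
        GROUP BY date
    ) t2
      ON t1.date = t2.date AND t1.income = t2.max_income
    ORDER BY t1.date, t1.ID
    """
).fetchall()

for row in top_earners:
    print(row)
```

Because the join condition includes the date, the 29.10 earned by B on 9/3 does not match the 29.10 maximum of 9/7.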
If your database supports analytic functions, we can also use RANK here:
SELECT date, ID, income
FROM
(
SELECT t.*, RANK() OVER (PARTITION BY date ORDER BY income DESC) rnk
FROM yourTable t
) t
WHERE rnk = 1
ORDER BY date;
One approach is shown below:
with cte1 as
(
Select t1.*
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS max_income
FROM yourTable
GROUP BY date
) t2
ON t1.date = t2.date AND t1.income = t2.max_income
) select min(ID) as ID, date, income from cte1
group by date, income
As you did not mention which ID you need when two IDs have the same income on a particular date, I took the minimum ID among them. You could use max() instead.
Try the query below, which uses a subquery. Since there is a tie on one date, it takes the minimum ID, which gives you a single ID for 9/7.
select date, min(ID) as ID, income
from
(SELECT t1.date, t1.ID, t1.income
FROM yourTable t1
INNER JOIN
(
SELECT date, MAX(income) AS mincome
FROM yourTable
GROUP BY date
) t2 ON t1.date = t2.date AND t1.income = t2.mincome
) X group by date, income

Custom GROUP BY clause

SAMPLE DATA
Suppose I have table like this:
No Company Vendor Code Date
1 C1 V1 C1 2016-03-08
1 C1 V1 C1 2016-03-07
1 C1 V1 C2 2016-03-06
DESIRED OUTPUT
Desired output should be:
No Company Vendor Code Date
1 C1 V1 C1 2016-03-08
It should take the max Date for each No, Company, Vendor (group by these columns). It shouldn't group by Code, though; Code has to be taken from the row with that Date.
QUERY
SQL query like:
.....
LEFT JOIN (
SELECT No_, Company, Vendor, Code, MAX(Date)
FROM tbl
GROUP BY No_, Company, Vendor, Code
) t2 ON t1.Company = t2.Company and t1.No_ = t2.No_
.....
OUTPUT FOR NOW
But the output I get now is:
No Company Vendor Code Date
1 C1 V1 C1 2016-03-08
1 C1 V1 C2 2016-03-06
That's because the Code values are different, but it should take code C1 in this case (because No, Company, Vendor match).
WHAT I'VE TRIED
I've tried removing Code from the GROUP BY clause and using SELECT MAX(Code)..., but this is wrong because it takes the alphabetically highest Code.
Do you have any ideas how I can achieve this? If something is not clear, I can explain more.
If you don't have an identity column in your table, each row is identified by the combination of all its column values. That leads to a somewhat unwieldy ON clause: it includes all the columns we group by, plus the date column, which holds the max for the given tuple (No_, Company, Vendor).
select t1.No_, t1.Company, t1.Vendor, t1.Code, t1.Date
from tbl t1
join (select No_, Company, Vendor, MAX(Date) as Date
from tbl
group by No_, Company, Vendor) t2
on t1.No_ = t2.No_ and
t1.Company = t2.Company and
t1.Vendor = t2.Vendor and
t1.Date = t2.Date
Take a look at this similar question.
Edit
Thank you for the answer, but this returns duplicates. Suppose there can be rows with equal No, Company, Vendor and Date; some other columns differ, but I don't care about them. The inner SELECT is fine and returns distinct values, but the problem occurs when joining t1, because it has multiple matching rows.
Then you might be interested in T-SQL constructs such as rank or row_number. Take a look at Ullas' answer. Try rank as well, since it can give slightly different output, which might fit your needs.
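To illustrate how rank and row_number differ when two rows share the max date, here is a hedged sketch using Python's sqlite3 module (window functions need SQLite 3.25 or later). The fourth row, with Code C3 on 2016-03-08, is a hypothetical tie I added; the tie-break on Code is also just an assumption for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (No_ INTEGER, Company TEXT, Vendor TEXT, Code TEXT, Date TEXT)")
conn.executemany(
    "INSERT INTO tbl VALUES (?, ?, ?, ?, ?)",
    [
        (1, "C1", "V1", "C1", "2016-03-08"),
        (1, "C1", "V1", "C1", "2016-03-07"),
        (1, "C1", "V1", "C2", "2016-03-06"),
        (1, "C1", "V1", "C3", "2016-03-08"),  # hypothetical tie on the max date
    ],
)

# RANK gives every row tied on the max date the value 1, so ties are all returned.
rank_rows = conn.execute(
    """
    SELECT No_, Company, Vendor, Code, Date
    FROM (
        SELECT t.*,
               RANK() OVER (PARTITION BY No_, Company, Vendor ORDER BY Date DESC) AS rk
        FROM tbl t
    )
    WHERE rk = 1
    ORDER BY Code
    """
).fetchall()

# ROW_NUMBER gives exactly one row per group; Code is used here as an arbitrary tie-breaker.
row_number_rows = conn.execute(
    """
    SELECT No_, Company, Vendor, Code, Date
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY No_, Company, Vendor ORDER BY Date DESC, Code) AS rn
        FROM tbl t
    )
    WHERE rn = 1
    """
).fetchall()

print("rank:", rank_rows)
print("row_number:", row_number_rows)
```

If duplicates are the problem, row_number with an explicit tie-breaker is the one that guarantees a single row per group.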
You could assign a row_number partitioned by No, Company and Vendor, ordered by Date descending.
Query
;with cte as (
select rn = row_number() over(
partition by [No], Company, Vendor
order by [Date] desc
), *
from tbl
)
select [No], Company, Vendor, Code, [Date] from cte
where rn = 1;
If each date can belong to only one record, you can find the max dates first and then filter on them.
select No_, Company, Vendor, Code, Date
FROM tbl
where Date in
(select MAX(Date) from tbl GROUP BY No_, Company, Vendor)
If more than one row can have the same date, you could use a partition:
with cte as
(
select *, ROW_NUMBER() over(partition by No_, Company, Vendor order by Date DESC) as rn
from tbl
)
select No_, Company, Vendor, Code, Date
from cte
where rn=1
A Common Table Expression will do it for you.
WITH cte(N,C,V,D)
AS
(
SELECT t1.[No]
,t1.[Company]
,t1.[Vendor]
,MAX(t1.[Date])
FROM [MyTest] t1
GROUP BY t1.[No]
,t1.[Company]
,t1.[Vendor]
)
SELECT N,C,V,t2.Code,D
FROM cte c
INNER JOIN MyTest t2 ON c.N = t2.No AND c.C = t2.Company AND c.V = t2.Vendor AND c.D = t2.Date

Select newest records that have distinct Name column

I did search around and I found this
SQL selecting rows by most recent date with two unique columns
Which is so close to what I want but I can't seem to make it work.
I get an error Column 'ID' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
I want the newest row by date for each Distinct Name
Select ID,Name,Price,Date
From table
Group By Name
Order By Date ASC
Here is an example of what I want
Table
ID Name Price Date
0 A 10 2012-05-03
1 B 9 2012-05-02
2 A 8 2012-05-04
3 C 10 2012-05-03
4 B 8 2012-05-01
desired result
ID Name Price Date
2 A 8 2012-05-04
3 C 10 2012-05-03
1 B 9 2012-05-02
I am using Microsoft SQL Server 2008
Select ID,Name, Price,Date
From temp t1
where date = (select max(date) from temp where t1.name =temp.name)
order by date desc
Here is a SQL Fiddle with a demo of the above
Or as Conrad points out you can use an INNER JOIN (another SQL Fiddle with a demo) :
SELECT t1.ID, t1.Name, t1.Price, t1.Date
FROM temp t1
INNER JOIN
(
SELECT Max(date) date, name
FROM temp
GROUP BY name
) AS t2
ON t1.name = t2.name
AND t1.date = t2.date
ORDER BY date DESC
There are a couple of ways to do this. This one uses ROW_NUMBER. Just partition by Name and then order so that the row you want ends up in the first position.
WITH cte
AS (SELECT Row_number() OVER (partition BY NAME ORDER BY date DESC) RN,
id,
name,
price,
date
FROM table1)
SELECT id,
name,
price,
date
FROM cte
WHERE rn = 1
Note: in your actual query you should probably add ID as a tie-breaker for equal dates, i.e. (partition BY NAME ORDER BY date DESC, ID DESC).
select * from (
Select
ID, Name, Price, Date,
Rank() over (partition by Name order by Date desc) RankOrder
From table
) T
where RankOrder = 1
I have found another memory-efficient (but probably crude) way that has worked for me in Postgres. Order the query by the distinct field and then by date descending, and select the first record for each distinct value. Note that the DISTINCT ON expression has to lead the ORDER BY.
SELECT DISTINCT ON (Name) ID, Name, Price, Date
FROM table
ORDER BY Name, Date DESC
Using Distinct instead of Group By avoids the error, but note that it returns every row here, because ID differs on each one:
Select Distinct ID,Name,Price,Date
From table
Order By Date ASC
http://technet.microsoft.com/en-us/library/ms187831.aspx

SQL Group by one column

so I have this table:
ID INITIAL_DATE TAX
A 18-02-2012 105
A 19-02-2012 95
A 20-02-2012 105
A 21-02-2012 100
B 18-02-2012 135
B 19-02-2012 150
B 20-02-2012 130
B 21-02-2012 140
and what I need is, for each distinct ID, the highest TAX ever. And if that TAX occurs twice I want the record with the highest INITIAL_DATE.
This is the query I have:
SELECT ID, MAX (initial_date) initial_date, tax
FROM t t0
WHERE t0.tax = (SELECT MAX (t1.tax)
FROM t t1
WHERE t1.ID = t0.ID
GROUP BY ID)
GROUP BY ID, tax
ORDER BY id, initial_date, tax
but I want to believe there is a better way of grouping these records.
Is there any way of NOT grouping by all the columns in the SELECT?
Have you tried analytic functions?
SELECT t0.ID, t0.INITIAL_DATE, t0.TAX
FROM ( SELECT t.*, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY TAX DESC, INITIAL_DATE DESC) Corr
FROM t) t0
WHERE t0.Corr = 1
As far as I know, every column in a SELECT that is not aggregated in one way or another must be part of the GROUP BY clause.
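For what it's worth, here is a minimal sketch of the ROW_NUMBER version run on the question's data with Python's sqlite3 module (SQLite 3.25 or later). I store INITIAL_DATE as ISO strings, which is my assumption, so that the descending sort picks the latest date when the tax is tied.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (ID TEXT, INITIAL_DATE TEXT, TAX INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        ("A", "2012-02-18", 105), ("A", "2012-02-19", 95),
        ("A", "2012-02-20", 105), ("A", "2012-02-21", 100),
        ("B", "2012-02-18", 135), ("B", "2012-02-19", 150),
        ("B", "2012-02-20", 130), ("B", "2012-02-21", 140),
    ],
)

# Highest TAX first; among equal TAX values the latest INITIAL_DATE wins.
best = conn.execute(
    """
    SELECT t0.ID, t0.INITIAL_DATE, t0.TAX
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY TAX DESC, INITIAL_DATE DESC) AS Corr
        FROM t
    ) t0
    WHERE t0.Corr = 1
    ORDER BY t0.ID
    """
).fetchall()

for row in best:
    print(row)
```

ID A has a tax of 105 on two dates, and the second sort key is what selects the later one; no GROUP BY is needed at all.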
I have tested this, and it is a solution that works:
SELECT t1.ID,
MAX(t1.initial_date) AS initial_date,
t1.tax
FROM test t1
INNER JOIN (SELECT ID,
MAX(tax) AS maxtax
FROM test
GROUP BY ID
) t2 ON t1.ID = t2.ID
AND t1.tax = t2.maxtax
GROUP BY t1.ID, t1.tax