Using aggregate function to return minimum value - sql

Please help me create a query to determine the minimum Date_Time from the table below:
ID | Name | Date_Time | Location
---------------------------------------
001 | John | 01/01/2015 | 901
001 | john | 02/01/2015 | 903
001 | john | 05/01/2015 | 905
001 | john | 06/01/2015 | 904
002 | Jack | 01/01/2015 | 903
002 | Jack | 03/01/2015 | 904
002 | Jack | 04/01/2015 | 905
003 | Sam | 01/01/2015 | 904
003 | Sam | 03/01/2015 | 903
003 | Sam | 04/01/2015 | 901
003 | Sam | 06/01/2015 | 903
I tried this query:
SELECT ID, NAME, MIN(DATE_TIME), LOCATION
FROM TABLE
GROUP BY (ID)
but I got this error message:
ORA-00979: not a GROUP BY expression

If you use an aggregate function, you have to specify the fields over which the aggregation is applied, which is what the GROUP BY clause is for. In this case you probably want the minimum date_time for each id, name combination.
select id, name, min(date_time)
from my_table group by id, name
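To try the grouped query without an Oracle instance, here is a minimal sketch using Python's built-in sqlite3 module. The in-memory database, the ISO-formatted dates, and the trimmed sample rows are assumptions made for illustration; only the GROUP BY query itself comes from the answer above.

```python
import sqlite3

# Assumption: an in-memory SQLite table stands in for the Oracle table, and
# dates are stored as ISO strings so that MIN() orders them chronologically.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE my_table (id TEXT, name TEXT, date_time TEXT, location INTEGER)"
)
conn.executemany(
    "INSERT INTO my_table VALUES (?, ?, ?, ?)",
    [
        ("001", "John", "2015-01-02", 903),
        ("001", "John", "2015-01-01", 901),
        ("002", "Jack", "2015-01-03", 904),
        ("002", "Jack", "2015-01-01", 903),
        ("003", "Sam", "2015-01-04", 901),
        ("003", "Sam", "2015-01-01", 904),
    ],
)

# id and name are listed in GROUP BY, date_time is aggregated,
# and location is left out because it is neither grouped nor aggregated.
min_per_person = conn.execute(
    """
    SELECT id, name, MIN(date_time) AS min_date_time
    FROM my_table
    GROUP BY id, name
    ORDER BY id
    """
).fetchall()

for row in min_per_person:
    print(row)
```

Adding location to the SELECT list without grouping or aggregating it is what triggers ORA-00979 in Oracle.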

When you group by a key, all rows sharing that key are clustered together, and for each key the SELECT can return only one row.
The rule of thumb: any column listed in GROUP BY can appear in SELECT freely; every other column has to be enclosed in an aggregate function.
When you group by id,
key 001 has 4 rows clustered under it. Consider what would happen if you put a non-grouped column in SELECT: there would be four candidate values and no rule for picking one. With MIN(date_time), by contrast, the minimum of the 4 dates is taken.
So, your query has to be
SELECT ID,MIN(NAME),MIN(LOCATION),MIN(DATE_TIME)
FROM TABLE
GROUP BY ID
OR
SELECT ID,LOCATION,NAME,MIN(DATE_TIME)
FROM TABLE
GROUP BY ID,LOCATION,NAME
OR
Analytical approach.
SELECT ID,LOCATION,DATE_TIME,MIN(DATE_TIME) OVER(PARTITION BY ID ORDER BY NULL) AS MIN_DATE
FROM TABLE
Still, how the query should be rewritten depends on the requirements.
EDIT: To fetch the rows corresponding to the minimum date, we can join the table to its own aggregate, like the one below.
SELECT T.ID,T.NAME,T.LOCATION,MIN_DATE
FROM
(
SELECT ID,MIN(DATE_TIME) AS MIN_DATE
FROM TABLE T1
GROUP BY ID
) AGG, TABLE T
WHERE T.ID = AGG.ID
AND T.DATE_TIME = AGG.MIN_DATE
OR
SELECT ID,NAME,LOCATION,MIN_DATE
FROM
(
SELECT ID,
NAME,
LOCATION,
MIN(DATE_TIME) OVER(PARTITION BY ID ORDER BY NULL) MIN_DATE,
ROW_NUMBER() OVER(PARTITION BY ID ORDER BY DATE_TIME) RNK
FROM TABLE
)
WHERE RNK = 1;
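As a rough illustration of fetching the whole row that carries the minimum date, here is a sketch of the join-to-aggregate variant run through Python's sqlite3 module. The table name visits, the ISO date strings, and the reduced sample rows are assumptions for the demo, not part of the original answer.

```python
import sqlite3

# Assumption: SQLite stands in for Oracle; "visits" is a hypothetical table name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visits (id TEXT, name TEXT, date_time TEXT, location INTEGER)"
)
conn.executemany(
    "INSERT INTO visits VALUES (?, ?, ?, ?)",
    [
        ("001", "John", "2015-01-01", 901),
        ("001", "John", "2015-01-02", 903),
        ("001", "John", "2015-01-05", 905),
        ("002", "Jack", "2015-01-01", 903),
        ("002", "Jack", "2015-01-03", 904),
    ],
)

# Step 1 (inner query): one minimum date per id.
# Step 2 (join): pick the original rows whose date equals that minimum,
# so name and location come from the same row as the minimum date.
earliest_rows = conn.execute(
    """
    SELECT t.id, t.name, t.location, agg.min_date
    FROM (
        SELECT id, MIN(date_time) AS min_date
        FROM visits
        GROUP BY id
    ) agg
    JOIN visits t
      ON t.id = agg.id AND t.date_time = agg.min_date
    ORDER BY t.id
    """
).fetchall()

for row in earliest_rows:
    print(row)
```

If two rows for the same id share the minimum date, this join returns both of them, whereas a ROW_NUMBER filter keeps exactly one.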

Try grouping by all the other columns, and if the table is actually named 'table', try renaming it to something else in the schema, since TABLE is a reserved word.
SELECT ID , NAME , MIN(DATE_TIME) , LOCATION FROM TABLE GROUP BY ID, Name, Location

select t1.name, t1.id, t1.location, t1.date
from (select id, MIN(Date) as min_date
from table
group by id) t2
inner join TABLE t1
on t1.date = t2.min_date and t1.id = t2.id;

Related

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
Putting this in words, I want to return per person the maximum date. I tried something like this
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This only returns one row (of course, because MAX returns the single maximum date overall), and I really don't know how to approach this issue. Any kind of help would be appreciated.
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
Given that your ids are increasing, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
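For anyone who wants to check the ROW_NUMBER approach against the sample data, here is a small sketch using Python's sqlite3 module. It assumes SQLite 3.25 or newer (for window functions), renames the date column to prov_date, and stores dates as ISO strings so that ORDER BY sorts them chronologically; none of that is from the original question.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Oracle; prov_date is a renamed
# column holding ISO dates, since dd/mm/yyyy strings would not sort correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE p_prov (id INTEGER, prov_date TEXT, person_id INTEGER)")
conn.executemany(
    "INSERT INTO p_prov VALUES (?, ?, ?)",
    [
        (1, "2019-06-19", 1),
        (2, "2010-07-18", 2),
        (3, "2020-06-19", 1),
        (4, "2020-06-17", 2),
        (5, "2020-06-28", 3),
    ],
)

# Number the rows of each person from newest to oldest, then keep row 1.
latest_per_person = conn.execute(
    """
    SELECT id, prov_date, person_id
    FROM (
        SELECT p.*,
               ROW_NUMBER() OVER (
                   PARTITION BY person_id ORDER BY prov_date DESC
               ) AS rn
        FROM p_prov p
    )
    WHERE rn = 1
    ORDER BY person_id
    """
).fetchall()

for row in latest_per_person:
    print(row)
```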

SQL - SELECT duplicates between IDs, but not show records if duplicates occur for same ID

I have the following table (simplified from the real table) at the moment:
+----+-------+-------+
| ID | Name | Phone |
+----+-------+-------+
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 2 | Mark | 321 |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+-------+-------+
My desired output in the SELECT statement is:
+----+------+-------+
| ID | Name | Phone |
+----+------+-------+
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+------+-------+
I want to select duplicates only when they occur between two different IDs (like Mark and Kate sharing the same phone number), but not to show any records for IDs that share the same phone number with themselves only (like Tom).
Could someone advise how this can be achieved?
You can use an EXISTS condition with a correlated subquery to ensure that another record exists that has the same phone and a different id. We also need DISTINCT to remove the duplicates in the resultset.
SELECT DISTINCT id, name, phone
FROM mytable t
WHERE EXISTS (
SELECT 1
FROM mytable t1
WHERE t1.phone = t.phone AND t1.id <> t.id
)
Demo on DB Fiddle:
| id | name | phone |
| --- | ---- | ----- |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
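The EXISTS query can be reproduced locally with a short sketch like the following, which uses Python's sqlite3 module. The in-memory database is an assumption; the table name mytable and the rows follow the question and the answer above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the real one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (id INTEGER, name TEXT, phone INTEGER)")
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?)",
    [
        (1, "Tom", 123),
        (1, "Tom", 123),
        (1, "Tom", 123),
        (2, "Mark", 321),
        (2, "Mark", 321),
        (3, "Kate", 321),
    ],
)

# Keep a row only if some other row has the same phone but a different id;
# DISTINCT then collapses the repeated (id, name, phone) rows.
shared_phones = conn.execute(
    """
    SELECT DISTINCT id, name, phone
    FROM mytable t
    WHERE EXISTS (
        SELECT 1
        FROM mytable t1
        WHERE t1.phone = t.phone AND t1.id <> t.id
    )
    ORDER BY id
    """
).fetchall()

for row in shared_phones:
    print(row)
```

Tom is excluded because every row with phone 123 has the same id, so the correlated subquery finds nothing for him.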
You can use window functions for this:
select t.*
from (select t.*,
row_number() over (partition by phone, name order by id) as seqnum,
min(id) over (partition by phone) as min_id,
max(id) over (partition by phone) as max_id
from t
) t
where seqnum = 1 and min_id <> max_id;
Another method uses aggregation and a window function:
select phone, name, id
from (select phone, name, id,
count(*) over (partition by phone) as num_ids
from t
group by phone, name, id
) pn
where num_ids > 1;
Both of these have the advantage over the exists solution (GMB's) that they refer to the "table" only once. That can be a big advantage if the table is a complex view or query. If performance is an issue, I would encourage you to test several variants to see which works best.
You can also use a correlated subquery with GROUP BY and HAVING, as below:
Select ID, NAME, max(PHONE) From
(Select * From Table) t group by id,
name having
1= max(
case
When phone in (select phone from
table where t.id<>Id) then 1 else 0
end)

Update in the same column from the same table

I'm trying to update a column in my table that was ignored at the initial insert based on a key and not null values in the same column.
My table is a history table in a data warehouse : it consists of (to simplify):
id which is its primary key
employee_id
date_of_birth
project_id
The rows help the company keep track of projects that an employee had worked on.
The problem is that when updating this table, the date_of_birth column is ignored, which is a problem for me since I'm working on a project that needs the age of the employee at the time he changed projects.
Actual:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | NULL | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | NULL | 1 |
+----+-------------+---------------+------------+
And this is what I want:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | 1980 | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | 1990 | 1 |
+----+-------------+---------------+------------+
We could try using COALESCE to conditionally replace a NULL year of birth with a non NULL value:
SELECT
ID,
EMPLOYEE_ID,
COALESCE(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM yourTable;
The following query should do what you want:
UPDATE yourTable
SET YEAR_OF_BIRTH = (SELECT MIN(a.YEAR_OF_BIRTH) FROM yourTable a WHERE a.EMPLOYEE_ID = yourTable.EMPLOYEE_ID)
WHERE YEAR_OF_BIRTH IS NULL
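A quick way to sanity-check a correlated UPDATE like this is to run it against the sample rows. The sketch below uses Python's sqlite3 module; the table name history and the in-memory database are assumptions for the demo. Note that the outer column is qualified with the table name, because an unqualified EMPLOYEE_ID inside the subquery would resolve to the inner alias and compare the column with itself.

```python
import sqlite3

# Assumption: SQLite stands in for the warehouse; "history" is a hypothetical name.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE history ("
    "id INTEGER PRIMARY KEY, employee_id INTEGER, "
    "year_of_birth INTEGER, project_id INTEGER)"
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [(1, 1, 1980, 1), (2, 1, None, 2), (3, 2, 1990, 2), (4, 2, None, 1)],
)

# For each row with a missing year, copy the known year of the same employee.
# MIN() ignores NULLs, so it returns the single non-NULL value per employee.
conn.execute(
    """
    UPDATE history
    SET year_of_birth = (
        SELECT MIN(a.year_of_birth)
        FROM history a
        WHERE a.employee_id = history.employee_id
    )
    WHERE year_of_birth IS NULL
    """
)

filled_rows = conn.execute(
    "SELECT id, employee_id, year_of_birth, project_id FROM history ORDER BY id"
).fetchall()

for row in filled_rows:
    print(row)
```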
According to your sample data, you can also use a correlated subquery as
SELECT T1.ID,
T1.EMPLOYEE_ID,
ISNULL(YEAR_OF_BIRTH,(
SELECT MAX(T2.YEAR_OF_BIRTH)
FROM T T2
WHERE T2.EMPLOYEE_ID = T1.EMPLOYEE_ID
)),
T1.PROJECT_ID
FROM T T1 ;
OR
SELECT ID,
EMPLOYEE_ID,
ISNULL(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM T;
I would use an updatable CTE for this purpose:
with toupdate as (
select a.*, min(year_of_birth) over (partition by employee_id) as min_year_of_birth
from actual a
)
update toupdate
set year_of_birth = min_year_of_birth
where year_of_birth is null or year_of_birth <> min_year_of_birth;
The where clause reduces the number of rows being updated.
That said, FIX YOUR DATA MODEL. Sorry for raising my voice. The date-of-birth information should not be stored in this table. It should be in the employee table, because an employee has only one of them.
You can get your desired output with this query:
SELECT ID, EMPLOYEE_ID,
MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM Table1

Takes 1 row from list in each ID

Please help me write an Oracle SQL query that shows only 1 row per ID group, ordered by DATE DESC, as below:
From this list
Table: Employees
____________________________
ID | Employees | Date
___+___________+____________
1 | A | 2017-08-01
1 | A | 2017-08-08
2 | B | 2017-07-01
2 | B | 2017-07-10
2 | B | 2017-07-05
Result
____________________________
ID | Employees | Date
___+___________+____________
1 | A | 2017-08-08
2 | B | 2017-07-10
There are several different ways of achieving this. It really depends on what business rules you want to use to choose the one date per employee. You haven't specified anything, but your desired output suggests you need the most recent date. If so you can use the max() aggregate function with a GROUP BY clause.
select id,
employees,
max(date) as date
from employees
group by id,
employees
You can use an analytic function:
https://docs.oracle.com/cloud/latest/db112/SQLRF/functions004.htm#SQLRF06174
select e.*, RANK() OVER (PARTITION BY Employees ORDER BY date desc) as rnk from Employees e
then wrap it in an outer select that keeps only the rows where the rank is 1
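The outer select mentioned above might look like the following sketch, run here through Python's sqlite3 module. It assumes SQLite 3.25 or newer, partitions by id (since the question asks for one row per ID), and uses the hypothetical column names emp_name and event_date in place of Employees and Date.

```python
import sqlite3

# Assumption: SQLite (3.25+) stands in for Oracle; emp_name and event_date
# are hypothetical column names chosen to avoid clashing with keywords.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (id INTEGER, emp_name TEXT, event_date TEXT)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [
        (1, "A", "2017-08-01"),
        (1, "A", "2017-08-08"),
        (2, "B", "2017-07-01"),
        (2, "B", "2017-07-10"),
        (2, "B", "2017-07-05"),
    ],
)

# Inner query ranks each id's rows from newest to oldest;
# the outer query keeps only the rows ranked first.
newest_per_id = conn.execute(
    """
    SELECT id, emp_name, event_date
    FROM (
        SELECT e.*,
               RANK() OVER (PARTITION BY id ORDER BY event_date DESC) AS rnk
        FROM employees e
    )
    WHERE rnk = 1
    ORDER BY id
    """
).fetchall()

for row in newest_per_id:
    print(row)
```

Because RANK gives tied rows the same rank, an id with two rows on its latest date would return both; ROW_NUMBER would return exactly one.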
select id, employees, max(date) as date1
from employees
group by id, employees;

SQL query for finding records where count > 1

I have a table named PAYMENT. Within this table I have a user ID, an account number, a ZIP code and a date. I would like to find all records for all users that have more than one payment per day with the same account number.
UPDATE: Additionally, there should be a filter than only counts the records whose ZIP code is different.
This is what the table looks like:
| user_id | account_no | zip | date |
| 1 | 123 | 55555 | 12-DEC-09 |
| 1 | 123 | 66666 | 12-DEC-09 |
| 1 | 123 | 55555 | 13-DEC-09 |
| 2 | 456 | 77777 | 14-DEC-09 |
| 2 | 456 | 77777 | 14-DEC-09 |
| 2 | 789 | 77777 | 14-DEC-09 |
| 2 | 789 | 77777 | 14-DEC-09 |
The result should look similar to this:
| user_id | count |
| 1 | 2 |
How would you express this in a SQL query? I was thinking self join but for some reason my count is wrong.
Use the HAVING clause and GROUP By the fields that make the row unique
The below will find
all users that have more than one payment per day with the same account number
SELECT
user_id,
COUNT(*) count
FROM
PAYMENT
GROUP BY
account_no,
user_id,
date
HAVING COUNT(*) > 1
Update
If you want to include only those that have a distinct ZIP, you can get a distinct set first and then perform your HAVING/GROUP BY
SELECT
user_id,
account_no,
date,
COUNT(*)
FROM
(SELECT DISTINCT
user_id,
account_no,
zip,
date
FROM
payment
) payment
GROUP BY
user_id,
account_no,
date
HAVING COUNT(*) > 1
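To see what the DISTINCT-then-HAVING query returns on the question's sample rows, here is a minimal sketch using Python's sqlite3 module. The in-memory database, the column name pay_date (instead of date), and the ISO date strings are assumptions made for the demo.

```python
import sqlite3

# Assumption: SQLite stands in for the real database; pay_date replaces "date".
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE payment (user_id INTEGER, account_no INTEGER, "
    "zip INTEGER, pay_date TEXT)"
)
conn.executemany(
    "INSERT INTO payment VALUES (?, ?, ?, ?)",
    [
        (1, 123, 55555, "2009-12-12"),
        (1, 123, 66666, "2009-12-12"),
        (1, 123, 55555, "2009-12-13"),
        (2, 456, 77777, "2009-12-14"),
        (2, 456, 77777, "2009-12-14"),
        (2, 789, 77777, "2009-12-14"),
        (2, 789, 77777, "2009-12-14"),
    ],
)

# The inner DISTINCT collapses payments that share the same ZIP, so the
# outer COUNT(*) counts different ZIPs per user, account and day.
multi_zip = conn.execute(
    """
    SELECT user_id, account_no, pay_date, COUNT(*) AS zip_count
    FROM (
        SELECT DISTINCT user_id, account_no, zip, pay_date
        FROM payment
    )
    GROUP BY user_id, account_no, pay_date
    HAVING COUNT(*) > 1
    ORDER BY user_id
    """
).fetchall()

for row in multi_zip:
    print(row)
```

User 2 drops out because both of that user's accounts only ever appear with a single ZIP on the day in question.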
Try this query:
SELECT column_name
FROM table_name
GROUP BY column_name
HAVING COUNT(column_name) = 1;
I wouldn't recommend the HAVING keyword for newbies, it is essentially for legacy purposes.
I am not clear on what is the key for this table (is it fully normalized, I wonder?), consequently I find it difficult to follow your specification:
I would like to find all records for all users that have more than one
payment per day with the same account number... Additionally, there
should be a filter than only counts the records whose ZIP code is
different.
So I've taken a literal interpretation.
The following is more verbose but could be easier to understand and therefore maintain (I've used a CTE for the table PAYMENT_TALLIES but it could be a VIEW):
WITH PAYMENT_TALLIES (user_id, zip, tally)
AS
(
SELECT user_id, zip, COUNT(*) AS tally
FROM PAYMENT
GROUP
BY user_id, zip
)
SELECT DISTINCT *
FROM PAYMENT AS P
WHERE EXISTS (
SELECT *
FROM PAYMENT_TALLIES AS PT
WHERE P.user_id = PT.user_id
AND PT.tally > 1
);
create table payment(
user_id int(11),
account int(11) not null,
zip int(11) not null,
dt date not null
);
insert into payment values
(1,123,55555,'2009-12-12'),
(1,123,66666,'2009-12-12'),
(1,123,77777,'2009-12-13'),
(2,456,77777,'2009-12-14'),
(2,456,77777,'2009-12-14'),
(2,789,77777,'2009-12-14'),
(2,789,77777,'2009-12-14');
select foo.user_id, foo.cnt from
(select user_id, count(account) as cnt, dt from payment group by user_id, account, dt) foo
where foo.cnt > 1;