Update in the same column from the same table - sql

I'm trying to fill in a column that was skipped during the initial insert, using the non-NULL values that other rows with the same key already hold in that column.
My table is a history table in a data warehouse. Simplified, it consists of:
id which is its primary key
employee_id
date_of_birth
project_id
The rows help the company keep track of the projects an employee has worked on.
The problem is that when this table is updated, the date_of_birth column (shown as YEAR_OF_BIRTH in the sample below) is not populated. That is a problem for me, since I'm working on a project that needs the age of the employee at the time he changed projects.
Actual:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | NULL | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | NULL | 1 |
+----+-------------+---------------+------------+
And this is what I want:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | 1980 | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | 1990 | 1 |
+----+-------------+---------------+------------+

We could try using COALESCE to conditionally replace a NULL year of birth with a non-NULL value from the same employee:
SELECT
ID,
EMPLOYEE_ID,
COALESCE(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM yourTable;

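The following query should do what you want. Note that the outer column must be qualified with the table name; an unqualified EMPLOYEE_ID inside the subquery would resolve to the inner alias and compare the column with itself:
UPDATE yourTable
SET YEAR_OF_BIRTH = (SELECT MIN(a.YEAR_OF_BIRTH) FROM yourTable a WHERE a.EMPLOYEE_ID = yourTable.EMPLOYEE_ID)
WHERE YEAR_OF_BIRTH IS NULL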
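A minimal sketch of this correlated update on the question's sample rows. It runs against SQLite through Python's sqlite3 module purely so it can be executed as-is; the engine choice and the table name `history` are assumptions, since the question names neither.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE history (
        id INTEGER PRIMARY KEY,
        employee_id INTEGER,
        year_of_birth INTEGER,
        project_id INTEGER
    );
    INSERT INTO history VALUES (1, 1, 1980, 1), (2, 1, NULL, 2),
                               (3, 2, 1990, 2), (4, 2, NULL, 1);
""")

# The outer column is qualified with the table name so that the subquery
# is correlated with the row being updated, not with its own alias.
conn.execute("""
    UPDATE history
    SET year_of_birth = (
        SELECT MIN(a.year_of_birth)
        FROM history a
        WHERE a.employee_id = history.employee_id
    )
    WHERE year_of_birth IS NULL
""")

filled_rows = conn.execute(
    "SELECT id, employee_id, year_of_birth, project_id FROM history ORDER BY id"
).fetchall()
print(filled_rows)
```

MIN ignores NULLs, so each missing value is taken from the rows of the same employee that do have one.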

According to your sample data, you can also use a correlated subquery:
SELECT T1.ID,
       T1.EMPLOYEE_ID,
       ISNULL(T1.YEAR_OF_BIRTH, (
           SELECT MAX(T2.YEAR_OF_BIRTH)
           FROM T T2
           WHERE T2.EMPLOYEE_ID = T1.EMPLOYEE_ID
       )) AS YEAR_OF_BIRTH,
       T1.PROJECT_ID
FROM T T1;
OR
SELECT ID,
EMPLOYEE_ID,
ISNULL(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM T;

I would use an updatable CTE for this purpose:
with toupdate as (
      select a.*, min(year_of_birth) over (partition by employee_id) as min_year_of_birth
      from actual a
     )
update toupdate
    set year_of_birth = min_year_of_birth
    where year_of_birth is null or year_of_birth <> min_year_of_birth;
The where clause reduces the number of rows being updated.
That said, FIX YOUR DATA MODEL. Sorry for raising my voice. The date-of-birth information should not be stored in this table. It should be in the employee table, because an employee has only one of them.
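To illustrate the data-model point, here is a hedged sketch of what a normalized layout could look like. The table names `employee` and `project_history` are hypothetical, and SQLite (via Python's sqlite3) is used only so the example runs on its own.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized layout: the birth year is stored once per employee.
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        year_of_birth INTEGER NOT NULL
    );
    CREATE TABLE project_history (
        id INTEGER PRIMARY KEY,
        employee_id INTEGER NOT NULL REFERENCES employee(employee_id),
        project_id INTEGER NOT NULL
    );
    INSERT INTO employee VALUES (1, 1980), (2, 1990);
    INSERT INTO project_history VALUES (1, 1, 1), (2, 1, 2), (3, 2, 2), (4, 2, 1);
""")

# The birth year is looked up with a join instead of being copied per row.
joined_rows = conn.execute("""
    SELECT h.id, h.employee_id, e.year_of_birth, h.project_id
    FROM project_history h
    JOIN employee e ON e.employee_id = h.employee_id
    ORDER BY h.id
""").fetchall()
print(joined_rows)
```

With this layout there is nothing to backfill: a history row cannot have a missing or conflicting birth year.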

You can get your desired output with this query:
SELECT ID, EMPLOYEE_ID,
MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM Table1

Related

Extract employee record based on certain criteria

I have a database of employees with their employment history in organization.
Sample Data -
+----+----------+------------+
| ID | Date | Event |
+----+----------+------------+
| 1 | 20190807 | Hired |
| 1 | 20191209 | Promoted |
| 1 | 20200415 | Terminated |
| 2 | 20180901 | Hired |
| 2 | 20191231 | Terminated |
| 3 | 20180505 | Hired |
| 3 | 20190630 | Promoted |
+----+----------+------------+
I want to extract the list of employees who were terminated after promotion. In above example, the query should return ID 1.
I am using SSMS 17 if that helps.
You can try using lag()
select distinct ID from
(
select *,lag(event) over(partition by id order by dateval) as prevval
from t
)A where prevval='Promoted' and event='Terminated'
If you want immediately after, then you would use lag(). If you want any time after, then you can use aggregation:
select id
from t
group by id
having max(case when event = 'Promoted' then dateval end) < max(case when event = 'Terminated' then dateval end);
Using lag(), the code looks like:
select id
from (select t.*, lag(event) over (partition by id order by dateval) as prev_event
from t
) t
where prev_event = 'Promoted' and event = 'Terminated';
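The difference between "immediately after" and "any time after" only shows up when another event sits between the promotion and the termination. The sketch below adds a hypothetical employee 4 with a made-up 'Transferred' event to the sample data and runs both queries. SQLite via Python's sqlite3 is an assumption made for runnability (window functions need SQLite 3.25 or later); the question itself targets SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, dateval INTEGER, event TEXT);
    INSERT INTO t VALUES
        (1, 20190807, 'Hired'), (1, 20191209, 'Promoted'), (1, 20200415, 'Terminated'),
        (2, 20180901, 'Hired'), (2, 20191231, 'Terminated'),
        (3, 20180505, 'Hired'), (3, 20190630, 'Promoted'),
        -- hypothetical employee with an event between promotion and termination
        (4, 20170101, 'Hired'), (4, 20180101, 'Promoted'),
        (4, 20190101, 'Transferred'), (4, 20200101, 'Terminated');
""")

# LAG: the termination must directly follow the promotion.
immediately_after = [row[0] for row in conn.execute("""
    SELECT id
    FROM (SELECT t.*,
                 LAG(event) OVER (PARTITION BY id ORDER BY dateval) AS prev_event
          FROM t)
    WHERE prev_event = 'Promoted' AND event = 'Terminated'
    ORDER BY id
""")]

# Aggregation: the last termination only has to be later than the last promotion.
any_time_after = [row[0] for row in conn.execute("""
    SELECT id
    FROM t
    GROUP BY id
    HAVING MAX(CASE WHEN event = 'Promoted' THEN dateval END)
         < MAX(CASE WHEN event = 'Terminated' THEN dateval END)
    ORDER BY id
""")]

print(immediately_after, any_time_after)
```

Employees without a promotion or without a termination drop out of the aggregation query because a comparison with NULL is never true.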
A simple EXISTS check could also solve this requirement.
select * from table1 a
where event='Terminated'
and exists(select 1 from table1 b where a.ID = b.ID and event='Promoted');
output:
ID date1 event
1 20200415 Terminated
As written, this only checks that a promotion exists for the same ID. We can also compare the event dates in the correlated subquery to make sure the promotion came before the termination.
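A sketch of the EXISTS check with and without a date comparison. Employee 4 is hypothetical (terminated first, then rehired and promoted) and is there only to show a case where the two versions disagree; the column name `dateval` and the use of SQLite through Python's sqlite3 are likewise assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (id INTEGER, dateval INTEGER, event TEXT);
    INSERT INTO t VALUES
        (1, 20190807, 'Hired'), (1, 20191209, 'Promoted'), (1, 20200415, 'Terminated'),
        (2, 20180901, 'Hired'), (2, 20191231, 'Terminated'),
        (3, 20180505, 'Hired'), (3, 20190630, 'Promoted'),
        -- hypothetical employee terminated BEFORE a later promotion
        (4, 20180101, 'Hired'), (4, 20180601, 'Terminated'),
        (4, 20190101, 'Hired'), (4, 20190701, 'Promoted');
""")

# Without a date comparison, any promotion for the same id qualifies.
plain_exists = [row[0] for row in conn.execute("""
    SELECT a.id
    FROM t a
    WHERE a.event = 'Terminated'
      AND EXISTS (SELECT 1 FROM t b
                  WHERE b.id = a.id AND b.event = 'Promoted')
    ORDER BY a.id
""")]

# With the date comparison, the promotion must precede the termination.
dated_exists = conn.execute("""
    SELECT a.id, a.dateval, a.event
    FROM t a
    WHERE a.event = 'Terminated'
      AND EXISTS (SELECT 1 FROM t b
                  WHERE b.id = a.id
                    AND b.event = 'Promoted'
                    AND b.dateval < a.dateval)
    ORDER BY a.id
""").fetchall()

print(plain_exists, dated_exists)
```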

Select the highest value of column 2 per column 1

Given the following table P_PROV
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 1 |19/06/2019 | 1 |
| 2 |18/07/2010 | 2 |
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
I want this output
+----+-----------+-----------+
| id | date | person_id |
+----+-----------+-----------+
| 3 |19/06/2020 | 1 |
| 4 |17/06/2020 | 2 |
| 5 |28/06/2020 | 3 |
+----+-----------+-----------+
Putting this in words: for each person, I want to return the row with the maximum date. I tried something like this:
SELECT DISTINCT pp.date, pp.id FROM P_PROV pp
WHERE (SELECT MAX(aa.date)
FROM P_PROV aa) = pp.date;
This returns only one row (of course, because the subquery returns the single maximum date across the whole table), and I don't know how to approach the problem. Any help would be appreciated.
ROW_NUMBER provides one way to handle this:
SELECT id, date, person_id
FROM
(
SELECT t.*, ROW_NUMBER() OVER (PARTITION BY person_id ORDER BY date DESC) rn
FROM yourTable t
) t
WHERE rn = 1;
Oracle has a fun way to do this using aggregation:
select max(id) keep (dense_rank first order by date desc) as id,
max(date) as date, person_id
from P_PROV
group by person_id;
Given that your ids are increasing, this probably also does what you want:
select max(id) as id, max(date) as date, person_id
from P_PROV
group by person_id;
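The max(id) shortcut depends on ids growing together with the dates. The sketch below shows what happens when that does not hold and compares it with a join against the per-person maximum date, a different technique from the ones above that needs no window functions. The assumptions are loud ones: row 6 is hypothetical, the dates are stored as ISO-format text so that they sort correctly, the column is renamed to `prov_date`, and SQLite is driven through Python's sqlite3 just to keep the example runnable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE p_prov (id INTEGER PRIMARY KEY, prov_date TEXT, person_id INTEGER);
    INSERT INTO p_prov VALUES
        (1, '2019-06-19', 1), (2, '2010-07-18', 2), (3, '2020-06-19', 1),
        (4, '2020-06-17', 2), (5, '2020-06-28', 3),
        (6, '2018-01-01', 3);  -- hypothetical: higher id, older date
""")

# Join every row to its person's maximum date and keep the matching row.
latest_rows = conn.execute("""
    SELECT p.id, p.prov_date, p.person_id
    FROM p_prov p
    JOIN (SELECT person_id, MAX(prov_date) AS max_date
          FROM p_prov
          GROUP BY person_id) m
      ON m.person_id = p.person_id AND m.max_date = p.prov_date
    ORDER BY p.person_id
""").fetchall()

# The shortcut takes each maximum independently of the other column.
shortcut_rows = conn.execute("""
    SELECT MAX(id), MAX(prov_date), person_id
    FROM p_prov
    GROUP BY person_id
    ORDER BY person_id
""").fetchall()

print(latest_rows)
print(shortcut_rows)
```

For person 3 the shortcut pairs id 6 with the date of row 5, a combination that exists in no actual row. The join can return more than one row per person if two rows share the maximum date.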

SQL - SELECT duplicates between IDs, but not show records if duplicates occur for same ID

I have the following table (simplified from the real table) at the moment:
+----+-------+-------+
| ID | Name | Phone |
+----+-------+-------+
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 1 | Tom | 123 |
| 2 | Mark | 321 |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+-------+-------+
My desired output in the SELECT statement is:
+----+------+-------+
| ID | Name | Phone |
+----+------+-------+
| 2 | Mark | 321 |
| 3 | Kate | 321 |
+----+------+-------+
I want to select duplicates only when they occur between two different IDs (like Mark and Kate sharing the same phone number), but not to show any records for IDs that share the same phone number with themselves only (like Tom).
Could someone advise how this can be achieved?
You can use an EXISTS condition with a correlated subquery to ensure that another record exists that has the same phone and a different id. We also need DISTINCT to remove the duplicates in the resultset.
SELECT DISTINCT id, name, phone
FROM mytable t
WHERE EXISTS (
SELECT 1
FROM mytable t1
WHERE t1.phone = t.phone AND t1.id <> t.id
)
Demo on DB Fiddle:
| id | name | phone |
| --- | ---- | ----- |
| 2 | Mark | 321 |
| 3 | Kate | 321 |
You can use window functions for this:
select t.*
from (select t.*,
row_number() over (partition by phone, name order by id) as seqnum,
min(id) over (partition by phone) as min_id,
max(id) over (partition by phone) as max_id
from t
) t
where seqnum = 1 and min_id <> max_id;
Another method uses aggregation and a window function:
select phone, name, id
from (select phone, name, id,
count(*) over (partition by phone) as num_ids
from t
group by phone, name, id
) pn
where num_ids > 1;
Both of these have the advantage over the exists solution (GMB's) that they refer to the "table" only once. That can be a big advantage if the table is a complex view or query. If performance is an issue, I would encourage you to test several variants to see which works best.
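When testing variants against each other, it helps to first confirm that they return the same rows. A minimal sketch on the sample data, assuming SQLite 3.25 or later through Python's sqlite3; the window variant here is a slight restatement that de-duplicates in a subquery before counting ids per phone.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mytable (id INTEGER, name TEXT, phone INTEGER);
    INSERT INTO mytable VALUES
        (1, 'Tom', 123), (1, 'Tom', 123), (1, 'Tom', 123),
        (2, 'Mark', 321), (2, 'Mark', 321), (3, 'Kate', 321);
""")

# Variant 1: EXISTS with a correlated subquery (reads the table twice).
exists_rows = conn.execute("""
    SELECT DISTINCT id, name, phone
    FROM mytable t
    WHERE EXISTS (SELECT 1 FROM mytable t1
                  WHERE t1.phone = t.phone AND t1.id <> t.id)
    ORDER BY id
""").fetchall()

# Variant 2: de-duplicate once, then count the distinct rows per phone.
window_rows = conn.execute("""
    SELECT id, name, phone
    FROM (SELECT id, name, phone,
                 COUNT(*) OVER (PARTITION BY phone) AS num_ids
          FROM (SELECT DISTINCT id, name, phone FROM mytable))
    WHERE num_ids > 1
    ORDER BY id
""").fetchall()

print(exists_rows)
print(window_rows)
```

Agreement on a small sample says nothing about speed; timing has to be done on the real table and engine.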
You can also use a correlated subquery together with GROUP BY and HAVING, as below (mytable stands for your table name):
Select ID, NAME, max(PHONE) as PHONE
From mytable t
group by id, name
having 1 = max(
    case
        when phone in (select phone from mytable t2 where t2.id <> t.id) then 1
        else 0
    end)

Sql two table query most duplicated foreign key

I got those two tables sport and student:
First table sport:
+---------+------------+
| idsport | name |
+---------+------------+
| 1 | bobsled |
| 2 | skating |
| 3 | boarding |
| 4 | iceskating |
| 5 | skiing |
+---------+------------+
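Second table student (sport_idsport is a foreign key to sport):
+-----------+-------+---------------+
| idstudent | name | sport_idsport |
+-----------+-------+---------------+
| 1 | john | 3 |
| 2 | pauly | 2 |
| 3 | max | 1 |
| 4 | jane | 2 |
| 5 | nico | 5 |
+-----------+-------+---------------+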
So far I have the query below. It outputs the sport id that occurs most often, but I can't get it to work with two tables:
SELECT sport_idsport
FROM (SELECT sport_idsport FROM student GROUP BY sport_idsport ORDER BY COUNT(*) desc)
WHERE ROWNUM<=1;
I need to output the name of the most popular sport; in this case it would be skating.
I use Oracle SQL.
with counter as (
Select sport_idsport,
count(*) as cnt,
dense_rank() over (order by count(*) desc) as rn
from student
group by sport_idsport
)
select s.*, c.cnt
from sport s
join counter c on c.sport_idsport = s.idsport and c.rn = 1;
SQLFiddle example: http://sqlfiddle.com/#!4/b76e21/1
select cnt, sport_idsport from (
select count(*) cnt, sport_idsport
from student
group by sport_idsport
order by count(*) desc
)
where rownum = 1
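The question asks for the sport's name, so the count has to be joined back to the sport table. A hedged sketch of that join on the sample data follows. It uses SQLite through Python's sqlite3 so it can run standalone, and SQLite's LIMIT 1 stands in for Oracle's ROWNUM filter; both are assumptions of this example, not Oracle syntax.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sport (idsport INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE student (
        idstudent INTEGER PRIMARY KEY,
        name TEXT,
        sport_idsport INTEGER REFERENCES sport(idsport)
    );
    INSERT INTO sport VALUES (1, 'bobsled'), (2, 'skating'), (3, 'boarding'),
                             (4, 'iceskating'), (5, 'skiing');
    INSERT INTO student VALUES (1, 'john', 3), (2, 'pauly', 2), (3, 'max', 1),
                               (4, 'jane', 2), (5, 'nico', 5);
""")

# Count students per sport, join to get the name, keep the top row.
most_popular = conn.execute("""
    SELECT s.name, c.cnt
    FROM sport s
    JOIN (SELECT sport_idsport, COUNT(*) AS cnt
          FROM student
          GROUP BY sport_idsport) c
      ON c.sport_idsport = s.idsport
    ORDER BY c.cnt DESC
    LIMIT 1
""").fetchone()
print(most_popular)
```

Like the ROWNUM version, this returns a single row even when several sports tie for first place; the dense_rank() answer returns all of them.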

Can we replace all values of column with Row numbers?

I have a query
select name,name_order from name_table where dept_id=XXX;
and the result set is
+------------+--------+
| name_order | name |
+------------+--------+
| 0 | One |
| 1 | Two |
| 2 | Three |
| 3 | four |
| 6 | five |
| 9 | six |
+------------+--------+
I have to update name_order for that dept_id so that the values start from 0 and increase by 1 without gaps (for that dept_id only).
Note: name_order is not an index.
The outcome should look like this:
+------------+--------+
| name_order | name |
+------------+--------+
| 0 | One |
| 1 | Two |
| 2 | Three |
| 3 | four |
| 4 | five |
| 5 | six |
+------------+--------+
I tried the analytic function ROW_NUMBER(), but it did not help:
update name_table set name_order = (
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY name_order)-1
)
where dept_id=XXX order by name_order
Thanks in advance
-R
You can do it with a merge command
MERGE INTO name_table dst
USING (SELECT t.*, row_number() over (partition BY dept_id ORDER BY name_order) -1 n
FROM name_table t) src
ON (dst.dept_id = src.dept_id AND dst.name = src.name)
WHEN MATCHED THEN UPDATE SET Dst.name_order = src.n;
But why would you want to store a column whose values you can compute in a query?
UPDATE NAME_TABLE A
SET NAME_ORDER=(
SELECT R
FROM (SELECT NAME,ROW_NUMBER() OVER(ORDER BY NAME_ORDER) R
FROM NAME_TABLE ) B
WHERE A.NAME=B.NAME);
http://www.sqlfiddle.com/#!4/6804a/1
UPDATE NAME_TABLE A
SET NAME_ORDER=(
SELECT R
FROM (SELECT NAME,DEPT_ID,ROW_NUMBER() OVER(PARTITION BY DEPT_ID ORDER BY NAME_ORDER)-1 R
FROM NAME_TABLE ) B
WHERE A.NAME=B.NAME AND A.DEPT_ID=B.DEPT_ID /*AND A.DEPT_ID=XXX*/ );
Add the condition about dept_id. Thanks Passerby.
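If the engine at hand does not make a ROW_NUMBER-based update convenient, the renumbering can also be done in application code: read the names in their current order, then write back 0, 1, 2, and so on. This is a different technique from the answers above, sketched here with Python's sqlite3. The department ids 10 and 20 and the extra row are hypothetical, and the sketch assumes names are unique within a department, as the answers above do.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE name_table (dept_id INTEGER, name TEXT, name_order INTEGER);
    INSERT INTO name_table VALUES
        (10, 'One', 0), (10, 'Two', 1), (10, 'Three', 2),
        (10, 'four', 3), (10, 'five', 6), (10, 'six', 9),
        (20, 'other', 7);  -- hypothetical row in another department
""")

dept_id = 10  # stands in for the XXX placeholder in the question

# Read the names in their current order before changing anything.
names = [row[0] for row in conn.execute(
    "SELECT name FROM name_table WHERE dept_id = ? ORDER BY name_order",
    (dept_id,),
)]

# Assign consecutive positions starting at 0, for this department only.
conn.executemany(
    "UPDATE name_table SET name_order = ? WHERE dept_id = ? AND name = ?",
    [(position, dept_id, name) for position, name in enumerate(names)],
)

renumbered = conn.execute(
    "SELECT dept_id, name, name_order FROM name_table ORDER BY dept_id, name_order"
).fetchall()
print(renumbered)
```

Reading the full order first keeps the result independent of the order in which rows are updated.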
SET @rownum := -1; SELECT @rownum := @rownum + 1 AS name_order, name FROM name_table WHERE dept_id = XXX;
Starting the variable at -1 makes the first row 0. This works fine on MySQL.