Can we replace all values of column with Row numbers? - sql

I have a query:
select name,name_order from name_table where dept_id=XXX;
and the result set is:
+------------+--------+
| name_order | name |
+------------+--------+
| 0 | One |
| 1 | Two |
| 2 | Three |
| 3 | four |
| 6 | five |
| 9 | six |
+------------+--------+
I have to update name_order for that dept_id so that the values start from 0 and
increment by 1 (for that dept_id only).
Note: name_order is not an index.
The outcome should look like this:
+------------+--------+
| name_order | name |
+------------+--------+
| 0 | One |
| 1 | Two |
| 2 | Three |
| 3 | four |
| 4 | five |
| 5 | six |
+------------+--------+
I tried the analytic function ROW_NUMBER(), but it did not help:
update name_table set name_order = (
ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY name_order)-1
)
where dept_id=XXX order by name_order
Thanks in advance
-R

You can do it with a merge command
MERGE INTO name_table dst
USING (SELECT t.*, row_number() over (partition BY dept_id ORDER BY name_order) -1 n
FROM name_table t) src
ON (dst.dept_id = src.dept_id AND dst.name = src.name)
WHEN MATCHED THEN UPDATE SET Dst.name_order = src.n;
Here is a sqlfiddle demo
But why would you want to store a column whose values you can compute in a query?

UPDATE NAME_TABLE A
SET NAME_ORDER=(
SELECT R
FROM (SELECT NAME,ROW_NUMBER() OVER(ORDER BY NAME_ORDER) R
FROM NAME_TABLE ) B
WHERE A.NAME=B.NAME);
http://www.sqlfiddle.com/#!4/6804a/1
UPDATE NAME_TABLE A
SET NAME_ORDER=(
SELECT R
FROM (SELECT NAME,DEPT_ID,ROW_NUMBER() OVER(PARTITION BY DEPT_ID ORDER BY NAME_ORDER)-1 R
FROM NAME_TABLE ) B
WHERE A.NAME=B.NAME AND A.DEPT_ID=B.DEPT_ID /*AND A.DEPT_ID=XXX*/ );
Add the condition about dept_id. Thanks Passerby.
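For anyone who wants to try the renumbering end to end, here is a minimal sketch using Python's built-in sqlite3 module rather than Oracle. The sample rows, the dept_id values 10 and 20, and the temp table name ranked are assumptions made for illustration. It also assumes that (dept_id, name) identifies a row and that the SQLite build supports window functions (3.25 or later). Staging the ranks in a temp table first is my own choice, so the update never reads rows it has already changed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE name_table (dept_id INTEGER, name TEXT, name_order INTEGER)"
)
conn.executemany(
    "INSERT INTO name_table VALUES (?, ?, ?)",
    [
        (10, "One", 0), (10, "Two", 1), (10, "Three", 2),
        (10, "four", 3), (10, "five", 6), (10, "six", 9),
        (20, "other", 7),  # different department, must stay untouched
    ],
)

# Step 1: compute the zero-based rank of every row within its department.
conn.execute(
    """
    CREATE TEMP TABLE ranked AS
    SELECT dept_id, name,
           ROW_NUMBER() OVER (PARTITION BY dept_id ORDER BY name_order) - 1 AS r
    FROM name_table
    """
)

# Step 2: copy the rank back, restricted to the one department.
conn.execute(
    """
    UPDATE name_table
    SET name_order = (
        SELECT r FROM ranked
        WHERE ranked.dept_id = name_table.dept_id
          AND ranked.name = name_table.name
    )
    WHERE dept_id = ?
    """,
    (10,),
)

renumbered = conn.execute(
    "SELECT name_order, name FROM name_table WHERE dept_id = 10 ORDER BY name_order"
).fetchall()
untouched = conn.execute(
    "SELECT name_order FROM name_table WHERE dept_id = 20"
).fetchall()
print(renumbered)
print(untouched)
```

The department filter lives on the outer UPDATE, so rows of other departments keep their existing name_order.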

SET @rownum:=0; SELECT @rownum:=@rownum+1 AS name_order, name FROM name_table WHERE dept_id=XXX;
This works fine on MySQL, using a user variable as the counter.

Related

Update in the same column from the same table

I'm trying to fill in a column that was skipped at the initial insert, using a key and the non-null values already present in that same column.
My table is a history table in a data warehouse : it consists of (to simplify):
id which is its primary key
employee_id
date_of_birth
project_id
The rows help the company keep track of projects that an employee had worked on.
The problem is that when updating this table, the date_of_birth column is ignored, which is a problem for me since I'm working on a project that needs the age of the employee at the time he changed projects.
Actual:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | NULL | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | NULL | 1 |
+----+-------------+---------------+------------+
And this is what I want:
+----+-------------+---------------+------------+
| ID | EMPLOYEE_ID | YEAR_OF_BIRTH | PROJECT_ID |
+----+-------------+---------------+------------+
| 1 | 1 | 1980 | 1 |
| 2 | 1 | 1980 | 2 |
| 3 | 2 | 1990 | 2 |
| 4 | 2 | 1990 | 1 |
+----+-------------+---------------+------------+
We could try using COALESCE to conditionally replace a NULL year of birth with a non NULL value:
SELECT
ID,
EMPLOYEE_ID,
COALESCE(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM yourTable;
The following query should do what you want:
UPDATE yourTable
SET YEAR_OF_BIRTH = (SELECT MIN(a.YEAR_OF_BIRTH) FROM yourTable a WHERE a.EMPLOYEE_ID = yourTable.EMPLOYEE_ID)
WHERE YEAR_OF_BIRTH IS NULL
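A correlated update of this kind only works if the subquery compares against the outer row's EMPLOYEE_ID, so the outer column has to be qualified. Here is a small sketch with Python's sqlite3 module and the sample rows from the question. The table name history is an assumption, and SQLite stands in for whatever engine the warehouse actually runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE history (
        id INTEGER PRIMARY KEY,
        employee_id INTEGER,
        year_of_birth INTEGER,
        project_id INTEGER
    )
    """
)
conn.executemany(
    "INSERT INTO history VALUES (?, ?, ?, ?)",
    [(1, 1, 1980, 1), (2, 1, None, 2), (3, 2, 1990, 2), (4, 2, None, 1)],
)

# Fill each NULL with the known year of birth of the same employee.
# history.employee_id refers to the row being updated, h.employee_id
# to the rows scanned by the subquery.
conn.execute(
    """
    UPDATE history
    SET year_of_birth = (
        SELECT MIN(h.year_of_birth)
        FROM history AS h
        WHERE h.employee_id = history.employee_id
    )
    WHERE year_of_birth IS NULL
    """
)

rows = conn.execute(
    "SELECT id, employee_id, year_of_birth, project_id FROM history ORDER BY id"
).fetchall()
print(rows)
```

MIN ignores NULLs, so an employee with a single known year gets that year on every row.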
According to your sample data, you can also use a correlated subquery as
SELECT T1.ID,
T1.EMPLOYEE_ID,
ISNULL(T1.YEAR_OF_BIRTH,(
SELECT MAX(T2.YEAR_OF_BIRTH)
FROM T T2
WHERE T2.EMPLOYEE_ID = T1.EMPLOYEE_ID
)) AS YEAR_OF_BIRTH,
T1.PROJECT_ID
FROM T T1 ;
OR
SELECT ID,
EMPLOYEE_ID,
ISNULL(YEAR_OF_BIRTH, MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID)) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM T;
I would use an updatable CTE for this purpose:
with toupdate as (
select a.*, min(year_of_birth) over (partition by employee_id) as min_year_of_birth
from actual a
)
update toupdate
set year_of_birth = min_year_of_birth
where year_of_birth is null or year_of_birth <> min_year_of_birth;
The where clause reduces the number of rows being updated.
That said, FIX YOUR DATA MODEL. Sorry for raising my voice. The date-of-birth information should not be stored in this table. It should be in the employee table, because an employee has only one of them.
Your desired output can be obtained with this query:
SELECT ID, EMPLOYEE_ID,
MAX(YEAR_OF_BIRTH) OVER (PARTITION BY EMPLOYEE_ID) AS YEAR_OF_BIRTH,
PROJECT_ID
FROM Table1

How to delete the rows with three same data columns and one different data column

I have a table "MARK_TABLE" as below.
How can I delete the rows with same "STUDENT", "COURSE" and "SCORE" values?
| ID | STUDENT | COURSE | SCORE |
|----|---------|--------|-------|
| 1 | 1 | 1 | 60 |
| 3 | 1 | 2 | 81 |
| 4 | 1 | 3 | 81 |
| 9 | 2 | 1 | 80 |
| 10 | 1 | 1 | 60 |
| 11 | 2 | 1 | 80 |
Now I already filtered the data I want to KEEP, but without the "ID"...
SELECT student, course, score FROM mark_table
INTERSECT
SELECT student, course, score FROM mark_table
The output:
| STUDENT | COURSE | SCORE |
|---------|--------|-------|
| 1 | 1 | 60 |
| 1 | 2 | 81 |
| 1 | 3 | 81 |
| 2 | 1 | 80 |
Use the following query to delete the desired rows:
DELETE FROM MARK_TABLE M
WHERE
EXISTS (
SELECT
1
FROM
MARK_TABLE M_IN
WHERE
M.STUDENT = M_IN.STUDENT
AND M.COURSE = M_IN.COURSE
AND M.SCORE = M_IN.SCORE
AND M.ID < M_IN.ID
)
Cheers!!
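To see what the EXISTS-based delete leaves behind, here is a minimal sketch using Python's sqlite3 module with the rows from the question. SQLite is only a stand-in for the real engine. Note that the comparison on ID means the row with the highest ID in each duplicate group is the one that survives.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mark_table (id INTEGER PRIMARY KEY, student INTEGER, course INTEGER, score INTEGER)"
)
conn.executemany(
    "INSERT INTO mark_table VALUES (?, ?, ?, ?)",
    [
        (1, 1, 1, 60), (3, 1, 2, 81), (4, 1, 3, 81),
        (9, 2, 1, 80), (10, 1, 1, 60), (11, 2, 1, 80),
    ],
)

# Delete every row for which a later row (higher id) carries the same
# student, course and score.
conn.execute(
    """
    DELETE FROM mark_table
    WHERE EXISTS (
        SELECT 1
        FROM mark_table AS m_in
        WHERE m_in.student = mark_table.student
          AND m_in.course = mark_table.course
          AND m_in.score = mark_table.score
          AND mark_table.id < m_in.id
    )
    """
)

remaining = conn.execute(
    "SELECT id, student, course, score FROM mark_table ORDER BY id"
).fetchall()
print(remaining)
```

Flipping the comparison to mark_table.id > m_in.id would keep the lowest ID of each group instead.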
use distinct
SELECT distinct student, course, score FROM mark_table
Assuming you don't just want to select the unique data you want to keep (you mention you've already done this), you can proceed as follows:
1. Create a temporary table to hold the data you want to keep.
2. Insert the data you want to keep into the temporary table.
3. Empty the source table.
4. Re-insert the data you want to keep into the source table.
select * from
(
select row_number() over (partition by student,course,score order by score)
rn,student,course,score from mark_table
) t
where rn=1
Use CTE with RowNumber
create table #MARK_TABLE (ID int, STUDENT int, COURSE int, SCORE int)
insert into #MARK_TABLE
values
(1,1,1,60),
(3,1,2,81),
(4,1,3,81),
(9,2,1,80),
(10,1,1,60),
(11,2,1,80)
;with cteDeleteID as(
Select id, row_number() over (partition by student,course,score order by id) [row_number] from #MARK_TABLE
)
delete from #MARK_TABLE where id in
(
select id from cteDeleteID where [row_number] != 1
)
select * from #MARK_TABLE
drop table #MARK_TABLE

SQL : Getting duplicate rows along with other variables

I am working with Teradata SQL. I would like to get the duplicate rows with their count and the other variables as well. I can only find ways to get the count, but not the other variables along with it.
Available input
+---------+----------+----------------------+
| id | name | Date |
+---------+----------+----------------------+
| 1 | abc | 21.03.2015 |
| 1 | def | 22.04.2015 |
| 2 | ajk | 22.03.2015 |
| 3 | ghi | 23.03.2015 |
| 3 | ghi | 23.03.2015 |
Expected output :
+---------+----------+----------------------+
| id | name | count | // Other fields
+---------+----------+----------------------+
| 1 | abc | 2 |
| 1 | def | 2 |
| 2 | ajk | 1 |
| 3 | ghi | 2 |
| 3 | ghi | 2 |
What am I looking for :
I am looking for all duplicate rows, where duplication is decided by ID and to retrieve the duplicate rows as well.
All I have till now is :
SELECT
id, name, other-variables, COUNT(*)
FROM
Table_NAME
GROUP BY
id, name
HAVING
COUNT(*) > 1
This is not showing correct data. Thank you.
You could use a window aggregate function, like this:
SELECT *
FROM (
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
) AS sub
WHERE duplicates > 1
Using a Teradata extension to ISO SQL syntax, you can simplify the above to:
SELECT id, name, other-variables,
COUNT(*) OVER (PARTITION BY id) AS duplicates
FROM users
QUALIFY duplicates > 1
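QUALIFY is Teradata-specific, but the subquery form of the window count can be tried anywhere that has window functions. Below is a minimal sketch with Python's sqlite3 module and the rows from the question. The table name users follows the answer, the column name dt for the date is an assumption, and a SQLite build with window function support (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT, dt TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?, ?)",
    [
        (1, "abc", "21.03.2015"),
        (1, "def", "22.04.2015"),
        (2, "ajk", "22.03.2015"),
        (3, "ghi", "23.03.2015"),
        (3, "ghi", "23.03.2015"),
    ],
)

# COUNT(*) OVER (PARTITION BY id) attaches the group size to every row,
# so the other columns survive; the outer WHERE keeps only ids seen twice or more.
dupes = conn.execute(
    """
    SELECT id, name, dt, duplicates
    FROM (
        SELECT id, name, dt,
               COUNT(*) OVER (PARTITION BY id) AS duplicates
        FROM users
    ) AS sub
    WHERE duplicates > 1
    ORDER BY id, name
    """
).fetchall()
print(dupes)
```

Unlike GROUP BY, the window count does not collapse rows, which is why both rows of each duplicated id come back.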
As an alternative to the accepted and perfectly correct answer, you can use:
SELECT {all your required 'variables' (they are not variables, but attributes)}
, cnt.Count_Dups
FROM Table_NAME TN
INNER JOIN (
SELECT id
, COUNT(1) Count_Dups
FROM Table_NAME
GROUP BY id
HAVING COUNT(1) > 1 -- If you want only duplicates
) cnt
ON cnt.id = TN.id
edit: According to your edit, duplicates are on id only. Edited my query accordingly.
try this,
SELECT
id, COUNT(id)
FROM
Table_NAME
GROUP BY
id
HAVING
COUNT(id) > 1

Sql two table query most duplicated foreign key

I have these two tables, sport and student:
First table sport:
|idsport | name |
_______________________
| 1 | bobsled |
| 2 | skating |
| 3 | boarding |
| 4 | iceskating |
| 5 | skiing |
Second table student:
foreign key
|idstudent | name | sport_idsport
__________________________________________
| 1 | john | 3 |
| 2 | pauly | 2 |
| 3 | max | 1 |
| 4 | jane | 2 |
| 5 | nico | 5 |
So far I have this. It outputs the id that occurs most often, but I can't get it to work with two tables:
SELECT sport_idsport
FROM (SELECT sport_idsport FROM student GROUP BY sport_idsport ORDER BY COUNT(*) desc)
WHERE ROWNUM<=1;
I need to output the name of the most popular sport; in this case it would be skating.
I use Oracle SQL.
with counter as (
Select sport_idsport,
count(*) as cnt,
dense_rank() over (order by count(*) desc) as rn
from student
group by sport_idsport
)
select s.*, c.cnt
from sport s
join counter c on c.sport_idsport = s.idsport and c.rn = 1;
SQLFiddle example: http://sqlfiddle.com/#!4/b76e21/1
select cnt, sport_idsport from (
select count(*) cnt, sport_idsport
from student
group by sport_idsport
order by count(*) desc
)
where rownum = 1
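To get from the winning id to the sport's name, the counts have to be joined back to sport. Here is a minimal sketch of the count, rank and join steps using Python's sqlite3 module instead of Oracle, with the rows from the question. The CTE names counts and ranked are assumptions, and a SQLite build with window function support (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sport (idsport INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE student (idstudent INTEGER PRIMARY KEY, name TEXT, sport_idsport INTEGER)"
)
conn.executemany(
    "INSERT INTO sport VALUES (?, ?)",
    [(1, "bobsled"), (2, "skating"), (3, "boarding"), (4, "iceskating"), (5, "skiing")],
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "john", 3), (2, "pauly", 2), (3, "max", 1), (4, "jane", 2), (5, "nico", 5)],
)

# 1) count students per sport, 2) rank the counts, 3) join rank 1 to sport.
most_popular = conn.execute(
    """
    WITH counts AS (
        SELECT sport_idsport, COUNT(*) AS cnt
        FROM student
        GROUP BY sport_idsport
    ),
    ranked AS (
        SELECT sport_idsport, cnt,
               DENSE_RANK() OVER (ORDER BY cnt DESC) AS rn
        FROM counts
    )
    SELECT s.name, r.cnt
    FROM sport AS s
    JOIN ranked AS r ON r.sport_idsport = s.idsport
    WHERE r.rn = 1
    """
).fetchall()
print(most_popular)
```

DENSE_RANK returns every sport tied for first place, whereas a ROWNUM-style cutoff returns only one of them.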

SELECT only latest record of an ID from given rows

I have this table shown below...How do I select only the latest data of the id based on changeno?
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Yes   | 2        |
| 2  | Maybe | 3        |
| 3  | Yes   | 4        |
| 3  | Yes   | 5        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Maybe | 8        |
| 5  | Yes   | 9        |
+----+-------+----------+
I would want this result...
+----+-------+----------+
| id | data  | changeno |
+----+-------+----------+
| 1  | Yes   | 1        |
| 2  | Maybe | 3        |
| 3  | No    | 6        |
| 4  | No    | 7        |
| 5  | Yes   | 9        |
+----+-------+----------+
I currently have this SQL statement...
SELECT id, data, MAX(changeno) as changeno FROM Table1 GROUP BY id;
and clearly it doesn't return what I want. It should raise an error because of the aggregate function. If I add the other fields to the GROUP BY clause it runs, but it still doesn't return what I want. This SQL statement is by far the closest I could think of. I'd appreciate it if anybody could help me with this. Thank you in advance :)
This is typically referred to as the "greatest-n-per-group" problem. One way to solve this in SQL Server 2005 and higher is to use a CTE with a calculated ROW_NUMBER() based on the grouping of the id column, and sorting those by largest changeno first:
;WITH cte AS
(
SELECT id, data, changeno,
rn = ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC)
FROM dbo.Table1
)
SELECT id, data, changeno
FROM cte
WHERE rn = 1
ORDER BY id;
You want to use row_number() for this:
select id, data, changeno
from (SELECT t.*,
row_number() over (partition by id order by changeno desc) as seqnum
FROM Table1 t
) t
where seqnum = 1;
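The ROW_NUMBER() pattern is easy to check against the sample data. Here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server; the rows are the ones from the question, and a SQLite build with window function support (3.25 or later) is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Table1 (id INTEGER, data TEXT, changeno INTEGER)")
conn.executemany(
    "INSERT INTO Table1 VALUES (?, ?, ?)",
    [
        (1, "Yes", 1), (2, "Yes", 2), (2, "Maybe", 3),
        (3, "Yes", 4), (3, "Yes", 5), (3, "No", 6),
        (4, "No", 7), (5, "Maybe", 8), (5, "Yes", 9),
    ],
)

# Number the rows of each id from the highest changeno down,
# then keep only the first row of every group.
latest = conn.execute(
    """
    SELECT id, data, changeno
    FROM (
        SELECT id, data, changeno,
               ROW_NUMBER() OVER (PARTITION BY id ORDER BY changeno DESC) AS seqnum
        FROM Table1
    ) AS t
    WHERE seqnum = 1
    ORDER BY id
    """
).fetchall()
print(latest)
```

Because the whole row is numbered, data comes from the same row as the highest changeno, which is what the plain GROUP BY attempt could not guarantee.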
Not a well-formed or performance-optimized query, but for small tasks it works fine (it assumes changeno values are unique across ids).
SELECT * FROM TEST
WHERE changeno IN (SELECT MAX(changeno)
FROM TEST
GROUP BY id)
Another alternative:
DECLARE @Table1 TABLE
(
id INT, data VARCHAR(5), changeno INT
);
INSERT INTO @Table1
SELECT 1,'Yes',1
UNION ALL
SELECT 2,'Yes',2
UNION ALL
SELECT 2,'Maybe',3
UNION ALL
SELECT 3,'Yes',4
UNION ALL
SELECT 3,'Yes',5
UNION ALL
SELECT 3,'No',6
UNION ALL
SELECT 4,'No',7
UNION ALL
SELECT 5,'Maybe',8
UNION ALL
SELECT 5,'Yes',9
SELECT Y.id, Y.data, Y.changeno
FROM @Table1 Y
INNER JOIN (
SELECT id, changeno = MAX(changeno)
FROM @Table1
GROUP BY id
) X ON X.id = Y.id
WHERE X.changeno = Y.changeno
ORDER BY Y.id
ORDER BY Y.id