Data Matching with SQL and assigning Identity ID's

Data Matching with SQL and assigning Identity ID's - sql

How to write a query that will match data and produce and identity for it.
For Example:
RecordID | Name
1 | John
2 | John
3 | Smith
4 | Smith
5 | Smith
6 | Carl
I want a query which will assign an identity after matching exactly on Name.
Expected Output:
RecordID | Name | ID
1 | John | 1X
2 | John | 1X
3 | Smith | 1Y
4 | Smith | 1Y
5 | Smith | 1Y
6 | Carl | 1Z
Note: The ID should be unique for every match. Also, it can be numbers or varchar.
Can somebody help me with this? The main thing is to assign the ID's.
Thanks.

How about this:
with temp as
(
select 1 as id,'John' as name
union
select 2,'John'
union
select 3,'Smith'
union
select 4,'Smith'
union
select 5,'Smith'
union
select 6,'Carl'
)
SELECT *, DENSE_RANK() OVER
(ORDER BY Name) as NewId
FROM TEMP
Order by id
The first part is for testing purposes only.

Please try:
SELECT *,
Rank() over (order by Name ASC)
FROM table

This structure seems to work:
CREATE TABLE #Table
(
Department VARCHAR(100),
Name VARCHAR(100)
);
INSERT INTO #Table VALUES
('Sales','michaeljackson'),
('Sales','michaeljackson'),
('Sales','jim'),
('Sales','jim'),
('Sales','jill'),
('Sales','jill'),
('Sales','jill'),
('Sales','j');
WITH Cte_Rank AS
(
SELECT [Name],
rw = ROW_NUMBER() OVER (ORDER BY [Name])
FROM #Table
GROUP BY [Name]
)
SELECT a.Department,
a.Name,
b.rw
FROM #Table a
INNER JOIN Cte_Rank b
ON a.Name = b.Name;

Related

Insert blank row in every row

I have a result like this:
name
Sally
Dan
Andy
Jackson
I used this query:
SELECT name from NAMEINFO where nNameIndex<5
and I want to get result like this:
name
Sally
(blank)
Dan
(blank)
Andy
(blank)
Jackson
(blank)
I tried query like this:
select name
from (
select top 100 percent *
from (
SELECT nNameIndex, name
FROM NAMEINFO
union all
select nNameIndex, ''
from NAMEINFO
) as t
order by 1
) as r
but result is:
name
Sally
Dan
Andy
Jackson
(blank)
(blank)
(blank)
(blank)
I want to insert blank row into every row in my SQL result.
Where should I change my code to get wanted result?

select name, ROW_NUMBER() over (order by name) pos from NAMEINFO where nNameIndex < 5
union all
select '', ROW_NUMBER() over (order by name) from NAMEINFO where nNameIndex < 5
order by pos, name desc
but really you should not do this in your database but in the consuming application.

In your initial query, you almost have it:
select name
from (
SELECT nNameIndex, name
FROM NAMEINFO
union all
select nNameIndex, ''
from NAMEINFO
) as t
order by nNameIndex

You can try ROW_NUMBER() to achieve the results.
DECLARE #table table(name varchar(50))
insert into #table
values
('Sally')
,('Dan')
,('Andy')
,('Jackson');
SELECT name
from
(
select name, ROW_NUMBER() OVER (ORDER BY name) as rnk from #table
union all
SELECT '',ROW_NUMBER() OVER (ORDER BY (SELECT null)) as rnk from #table
) as t
order by rnk,name desc
+---------+
| name |
+---------+
| Andy |
| |
| Dan |
| |
| Jackson |
| |
| Sally |
| |
+---------+

Alternative to CASE WHEN?

I have a table in SQL where the results look something like:
Number | Name | Name 2
1 | John | Derek
1 | John | NULL
2 | Jane | Louise
2 | Jane | NULL
3 | Michael | Mark
3 | Michael | NULL
4 | Sara | Paul
4 | Sara | NULL
I want a way to say that if Number=1, return Name 2 in new column Name 3, so that the results would look like:
Number | Name | Name 2 | Name 3
1 | John | Derek | Derek
1 | John | NULL | Derek
2 | Jane | Louise | Louise
2 | Jane | NULL | Louise
3 | Michael | Mark | Mark
3 | Michael | NULL | Mark
4 | Sara | Paul | Paul
4 | Sara | NULL | Paul
The problem is that I can't say if Number=1, return Name 2 in Name 3, because my table has >100,000 records. I need it to do it automatically. More like "if Number is the same, return Name 2 in Name 3." I've tried to use a CASE statement but haven't been able to figure it out. Is there any way to do this?

Empirically, this seems to work:
SELECT
Number, Name, [Name 2],
MAX([Name 2]) OVER (PARTITION BY Number) [Name 3]
FROM yourTable;
The idea here, if I interpreted your requirements correctly, is that you want to report the non NULL value of the second name for all records as the third name value.

Solution 3, with group by
with maxi as(
SELECT Number, max(Name2) name3
FROM #sample
group by number, name
)
SELECT f1.*, f2.name3
FROM #sample f1 inner join maxi f2 on f1.number=f2.number

Solution 4, with cross apply
SELECT *
FROM #sample f1 cross apply
(
select top 1 f2.Name2 as Name3 from #sample f2
where f2.number=f1.number and f2.Name2 is not null
) f3

you can try this:
Solution 1, with row_number
declare #sample table (Number integer, Name varchar(50), Name2 varchar(50))
insert into #sample
select 1 , 'John' , 'Derek' union all
select 1 , 'John' , NULL union all
select 2 , 'Jane' , 'Louise' union all
select 2 , 'Jane' , NULL union all
select 3 , 'Michael' , 'Mark' union all
select 3 , 'Michael' , NULL union all
select 4 , 'Sara' , 'Paul' union all
select 4 , 'Sara' , NULL ;
with tmp as (
select *, row_number() over(partition by number order by number) rang
from #sample
)
select f1.Number, f1.Name, f1.Name2, f2.Name2 as Name3
from tmp f1 inner join tmp f2 on f1.Number=f2.Number and f2.rang=1

Solution 2, with lag (if your sql server version has lag function)
SELECT
Number, Name, Name2,
isnull(Name2, lag(Name2) OVER (PARTITION BY Number order by number)) Name3
FROM #sample;

Oracle duplicate row N times where N is a column

I'm new to Oracle and I'm trying to do something a little unusual. Given this table and data I need to select each row, and duplicate ones where DupCount is greater than 1.
create table TestTable
(
Name VARCHAR(10),
DupCount NUMBER
)
INSERT INTO TestTable VALUES ('Jane', 1);
INSERT INTO TestTable VALUES ('Mark', 2);
INSERT INTO TestTable VALUES ('Steve', 1);
INSERT INTO TestTable VALUES ('Jeff', 3);
Desired Results:
Name DupCount
--------- -----------
Jane 1
Mark 2
Mark 2
Steve 1
Jeff 3
Jeff 3
Jeff 3
If this isn't possible via a single select statement any help with a stored procedure would be greatly appreciated.

You can do it with a hierarchical query:
SQL Fiddle
Query 1:
WITH levels AS (
SELECT LEVEL AS lvl
FROM DUAL
CONNECT BY LEVEL <= ( SELECT MAX( DupCount ) FROM TestTable )
)
SELECT Name,
DupCount
FROM TestTable
INNER JOIN
levels
ON ( lvl <= DupCount )
ORDER BY Name
Results:
| NAME | DUPCOUNT |
|-------|----------|
| Jane | 1 |
| Jeff | 3 |
| Jeff | 3 |
| Jeff | 3 |
| Mark | 2 |
| Mark | 2 |
| Steve | 1 |

You can do this with a recursive cte. It would look like this
with cte as (name, dupcount, temp)
(
select name,
dupcount,
dupcount as temp
from testtable
union all
select name,
dupcount,
temp-1 as temp
from cte
where temp > 1
)
select name,
dupcount
from cte
order by name

Selecting row with highest ID based on another column

In SQL Server 2008 R2, suppose I have a table layout like this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 1 | 1 | TEST 1 |
| 2 | 1 | TEST 2 |
| 3 | 3 | TEST 3 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 6 | 6 | TEST 6 |
| 7 | 6 | TEST 7 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Is it possible to select every row with the highest UniqueID number, for each GroupID. So according to the table above - if I ran the query, I would expect this...
+----------+---------+-------------+
| UniqueID | GroupID | Title |
+----------+---------+-------------+
| 2 | 1 | TEST 2 |
| 4 | 3 | TEST 4 |
| 5 | 5 | TEST 5 |
| 8 | 6 | TEST 8 |
+----------+---------+-------------+
Been chomping on this for a while, but can't seem to crack it.
Many thanks,

SELECT *
FROM (SELECT uniqueid, groupid, title,
Row_number()
OVER ( partition BY groupid ORDER BY uniqueid DESC) AS rn
FROM table) a
WHERE a.rn = 1

With SQL-Server as rdbms you can use a ranking function like ROW_NUMBER:
WITH CTE AS
(
SELECT UniqueID, GroupID, Title,
RN = ROW_NUMBER() OVER (PARTITON BY GroupID
ORDER BY UniqueID DESC)
FROM dbo.TableName
)
SELECT UniqueID, GroupID, Title
FROM CTE
WHERE RN = 1
This returns exactly one record for each GroupID even if there are multiple rows with the highest UniqueID (the name does not suggest so). If you want to return all rows in then use DENSE_RANK instead of ROW_NUMBER.
Here you can see all functions and how they work: http://technet.microsoft.com/en-us/library/ms189798.aspx

Since you have not mentioned any RDBMS, this statement below will work on almost all RDBMS. The purpose of the subquery is to get the greatest uniqueID for every GROUPID. To be able to get the other columns, the result of the subquery is joined on the original table.
SELECT a.*
FROM tableName a
INNER JOIN
(
SELECT GroupID, MAX(uniqueID) uniqueID
FROM tableName
GROUP By GroupID
) b ON a.GroupID = b.GroupID
AND a.uniqueID = b.uniqueID
In the case that your RDBMS supports Qnalytic functions, you can use ROW_NUMBER()
SELECT uniqueid, groupid, title
FROM
(
SELECT uniqueid, groupid, title,
ROW_NUMBER() OVER (PARTITION BY groupid
ORDER BY uniqueid DESC) rn
FROM tableName
) x
WHERE x.rn = 1
TSQL Ranking Functions
The ROW_NUMBER() generates sequential number which you can filter out. In this case the sequential number is generated on groupid and sorted by uniqueid in descending order. The greatest uniqueid will have a value of 1 in rn.

SELECT *
FROM the_table tt
WHERE NOT EXISTS (
SELECT *
FROM the_table nx
WHERE nx.GroupID = tt.GroupID
AND nx.UniqueID > tt.UniqueID
)
;
Should work in any DBMS (no window functions or CTEs are needed)
is probably faster than a sub query with an aggregate

Keeping it simple:
select * from test2
where UniqueID in (select max(UniqueID) from test2 group by GroupID)
Considering:
create table test2
(
UniqueID numeric,
GroupID numeric,
Title varchar(100)
)
insert into test2 values(1,1,'TEST 1')
insert into test2 values(2,1,'TEST 2')
insert into test2 values(3,3,'TEST 3')
insert into test2 values(4,3,'TEST 4')
insert into test2 values(5,5,'TEST 5')
insert into test2 values(6,6,'TEST 6')
insert into test2 values(7,6,'TEST 7')
insert into test2 values(8,6,'TEST 8')

Collapse SQL rows

Say I have this table:
id | name
-------------
1 | john
2 | steve
3 | steve
4 | john
5 | steve
I only want the rows that are unique compared to the previous row, these:
id | name
-------------
1 | john
2 | steve
4 | john
5 | steve
I can partly achieve this by using this query:
SELECT *, (
SELECT `name` FROM demotable WHERE id=t.id-1
) AS prevName FROM demotable AS t GROUP BY prevName ORDER BY id ASC
But when I am using a query with multiple UNIONs and stuff, this gets way to complicated. Is there an easy way to do this (like GROUP BY, but than more specific)?

This should work, but I don't know if it's simpler :
select demotable.*
from demotable
left join demotable as prev on prev.id = demotable.id - 1
where demotable.name != prev.name

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas

Data Matching with SQL and assigning Identity ID's - sql

How about this: with temp as ( select 1 as id,'John' as name union select 2,'John' union select 3,'Smith' union select 4,'Smith' union select 5,'Smith' union select 6,'Carl' ) SELECT *, DENSE_RANK() OVER (ORDER BY Name) as NewId FROM TEMP Order by id The first part is for testing purposes only.

Please try: SELECT *, Rank() over (order by Name ASC) FROM table

Related

Insert blank row in every row

Alternative to CASE WHEN?

Oracle duplicate row N times where N is a column

Selecting row with highest ID based on another column

Collapse SQL rows

Categories

Resources