Add column to ensure composite key is unique - sql

I have a table which needs to have a composite primary key based on 2 columns (Material number, Plant).
For example, this is how it is currently (note that these rows are not unique):
MATERIAL_NUMBER PLANT NUMBER
------------------ ----- ------
000000000000500672 G072 1
000000000000500672 G072 1
000000000000500672 G087 1
000000000000500672 G207 1
000000000000500672 G207 1
However, I'll need to add the additional column (NUMBER) to the composite key such that each row is unique, and it must work like this:
For each MATERIAL_NUMBER, for each PLANT, let NUMBER start at 1 and increment by 1 for each duplicate record.
This would be the desired output:
MATERIAL_NUMBER PLANT NUMBER
------------------ ----- ------
000000000000500672 G072 1
000000000000500672 G072 2
000000000000500672 G087 1
000000000000500672 G207 1
000000000000500672 G207 2
How would I go about achieving this, specifically in SQL Server?
Best Regards!

SOLVED.
See below:
SELECT MATERIAL_NUMBER, PLANT, (ROW_NUMBER() OVER (PARTITION BY MATERIAL_NUMBER, PLANT ORDER BY VALID_FROM)) as NUMBER
FROM Table_Name
This outputs the table in question with the NUMBER column populated as required.
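If the NUMBER column must actually be stored (so it can take part in the composite primary key), a minimal follow-up sketch, assuming NUMBER has already been added to Table_Name as a nullable int column, is to run the same numbering through an updatable CTE:
;WITH numbered AS (
    SELECT NUMBER,
           ROW_NUMBER() OVER (PARTITION BY MATERIAL_NUMBER, PLANT
                              ORDER BY VALID_FROM) AS rn
    FROM Table_Name
)
UPDATE numbered SET NUMBER = rn;  -- persists the per-(MATERIAL_NUMBER, PLANT) sequence
After that, the composite primary key can be created on (MATERIAL_NUMBER, PLANT, NUMBER).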

Suppose this is the actual table:
create table #temp1(MATERIAL_NUMBER varchar(30), PLANT varchar(30), NUMBER int)
Suppose you want to insert only a single record:
declare @Num int
select @Num = isnull(max(NUMBER), 0) from #temp1 where MATERIAL_NUMBER = '000000000000500672' and PLANT = 'G072'
insert into #temp1 (MATERIAL_NUMBER, PLANT, NUMBER)
values ('000000000000500672', 'G072', @Num + 1)
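Note that on a real (non-temporary) table this read-then-insert pattern can race under concurrency: two sessions may read the same max and insert the same NUMBER. A common mitigation, sketched here under the assumption of a permanent table (RealTable is a hypothetical name), is to hold range locks across both statements:
begin tran
declare @Num int
select @Num = isnull(max(NUMBER), 0)
from RealTable with (updlock, holdlock)  -- range-locks the (MATERIAL_NUMBER, PLANT) slice until commit
where MATERIAL_NUMBER = '000000000000500672' and PLANT = 'G072'
insert into RealTable (MATERIAL_NUMBER, PLANT, NUMBER)
values ('000000000000500672', 'G072', @Num + 1)
commit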
Suppose you want to insert records in bulk. Your bulk sample data looks like this:
create table #temp11(MATERIAL_NUMBER varchar(30),PLANT varchar(30))
insert into #temp11 (MATERIAL_NUMBER,PLANT)values
('000000000000500672','G072')
,('000000000000500672','G072')
,('000000000000500672','G087')
,('000000000000500672','G207')
,('000000000000500672','G207')
You want to insert `#temp11` into `#temp1` while maintaining the NUMBER sequence:
insert into #temp1 (MATERIAL_NUMBER, PLANT, NUMBER)
select t11.MATERIAL_NUMBER, t11.PLANT,
       row_number() over (partition by t11.MATERIAL_NUMBER, t11.PLANT order by (select null)) + isnull(t.maxnum, 0) as NUMBER
from #temp11 t11
outer apply (
    select max(NUMBER) as maxnum
    from #temp1 t1
    where t1.MATERIAL_NUMBER = t11.MATERIAL_NUMBER
      and t1.PLANT = t11.PLANT
) t
select * from #temp1
drop table #temp1
drop table #temp11
The main question is: why do you need the NUMBER column at all? In most cases you don't; you can compute ROW_NUMBER() over (partition by MATERIAL_NUMBER, PLANT order by (select null)) at display time wherever you need it, which is more efficient.
Otherwise, please describe the actual situation and the number of rows involved where you would need the NUMBER column.

Merge multiple rows with the same identity into one row holding the sum of a column that differs between them

I have a table like this:
id name qt
----------------
0 mm 4
1 mm 5
2 xx 8
I want to update it, or get a new table, that produces this kind of result:
id name qt
------------------
0 mm 9 (sum of the two or more rows with identical name)
1 xx 8
Including the id column would prevent the GROUP BY from merging the rows, since the records being summed have different ids.
SELECT name, SUM(qt) as qt_sum
FROM [table] GROUP BY name
SELECT ROW_NUMBER() OVER (ORDER BY name) AS id
, name
, SUM(qt) AS qt
FROM YourTableName
GROUP BY name
ORDER BY name
I'm making the assumption that the id field doesn't actually mean anything, because the id of the xx record changes between your two examples. That's why I'm generating it with ROW_NUMBER(), so it increments across distinct names. If this isn't the case, remove the ROW_NUMBER() expression and add id to the GROUP BY clause (a sketch of that variant follows below). This does mean the id assigned to a given name may change depending on the number of distinct names.
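That variant, keeping the original id, would look like this (a sketch against the same hypothetical YourTableName; note that with the sample data it returns three rows, not two, because the two mm rows have different ids):
SELECT id
, name
, SUM(qt) AS qt
FROM YourTableName
GROUP BY id, name
ORDER BY name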
If you really need an id column you could create one like this...
create table Test (id int, name varchar(10), qt int)
insert into Test values (0, 'mm', 4)
insert into Test values (1, 'mm', 5)
insert into Test values (2, 'xx', 8)
select
row_number() over (order by name) - 1 as id
, name
, sum(qt) as qt
from Test
group by name
There may be some cases where this does not work for you, but with such limited sample data it is hard to tell.
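If the goal is a new table rather than a query result, the same statement can materialize one with SELECT ... INTO (the target name Test_Merged is hypothetical):
select
row_number() over (order by name) - 1 as id
, name
, sum(qt) as qt
into Test_Merged  -- creates and fills the new table in one statement
from Test
group by name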

Effective way of locating top ranked rows on Oracle DB

I have a large table (millions of records) and I need to write an efficient select statement.
The table looks like this:
create table tab1 (
pt_key number
, cp_key number
, ext_info varchar2(10)
, resp_nm varchar2(20)
, resp_dttm date
, rank number
);
Sample records:
insert into tab1 values (1,1,'info1','OK', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,1,'info2','FAILED', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),2);
insert into tab1 values (1,1,'info3','SENT', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),3);
insert into tab1 values (1,1,'info4','SENT', to_date('02.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),3);
insert into tab1 values (1,2,'info5','OK', to_date('05.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,2,'info6','OK', to_date('06.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),1);
insert into tab1 values (1,2,'info7','FAILED', to_date('01.03.18 17:00:00','DD.MM.RR HH24:MI:SS'),2);
I would like the query to return, for each combination of pt_key and cp_key (these are part of the composite primary key; the other columns are not indexed), the record with the highest rank. If, for a given combination of pt_key and cp_key, several records share the highest rank, pick the one with the greatest resp_dttm.
The select statement should return only the first four columns.
For the above posted sample data the desired result would be:
1 1 info4 SENT
1 2 info7 FAILED
Thanks for the help.
Here's one approach using row_number():
select pt_key, cp_key, ext_info, resp_nm
from (
    select t1.*, row_number() over (partition by pt_key, cp_key
                                    order by t1.rank desc, t1.resp_dttm desc) rn
    from tab1 t1
) t
where rn = 1
Here's another approach using FIRST aggregate function:
select pt_key,
cp_key,
max(ext_info) keep (dense_rank first order by t.rank desc, t.resp_dttm desc) as ext_info,
max(resp_nm) keep (dense_rank first order by t.rank desc, t.resp_dttm desc) as resp_nm
from tab1 t
group by pt_key, cp_key
EDIT 2:
Result:
PT_KEY | CP_KEY | EXT_INFO | RESP_NM
--------+--------+----------+---------
1 | 1 | info4 | SENT
1 | 2 | info7 | FAILED
EDIT 1:
This solution has an important drawback if, for a certain combination of pt_key and cp_key, there are multiple rows with the same rank and resp_dttm values. In that case it will "combine" those rows, calculating the aggregates for ext_info and resp_nm independently (in my example it takes the max value of each).
You can refine that behavior by adding tie-breaking sort criteria to make the ranking unambiguous (e.g. add the remaining primary key columns).
The answer from @sgeddes is a bit better in that sense: it returns one (arbitrary) row from the equally ranked rows, without combining data and without extra sort criteria. It is also easier to maintain, as it has the ranking criteria in one place, while mine has it in two spots.
You should probably test performance of both in your specific scenario (e.g. specific indices, specific data profile/statistics).
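For example, a quick way to compare them (a sketch using standard Oracle tooling, shown here against the row_number version):
explain plan for
select pt_key, cp_key, ext_info, resp_nm
from (
    select t1.*, row_number() over (partition by pt_key, cp_key
                                    order by t1.rank desc, t1.resp_dttm desc) rn
    from tab1 t1
) t
where rn = 1;

-- show the plan that was just explained
select * from table(dbms_xplan.display);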

A thought experiment in SQL

I want to show, for each distinct element in a column of a table in a SQL database, the number of times it appears, alongside the element itself, in a new output table. Is this possible in a single statement, rather than ramming my head against it manually?
Without having actually tried, how about this:
SELECT tmp.DesiredField, (SELECT COUNT(*) FROM [Table] t WHERE t.DesiredField = tmp.DesiredField) AS Count
FROM
(
    SELECT DISTINCT DesiredField FROM [Table]
) tmp
This would first select all distinct values from [Table] and in the outer select, take the values and the number of times they appear in the column.
You could also try
SELECT Field, SUM(1) AS Count FROM [Table]
GROUP BY Field
This should "flatten" the table so that it only contains distinct values in Field and the number of rows where Field has the same value.
I just tried the second - it seems to work nicely.
Turns out I was wrong all the time. The second example and the following actually return the same results:
SELECT Field, COUNT(*) AS Count FROM [Table]
GROUP BY Field
Simplest just to use COUNT(). You'll see variations in what to pass as the COUNT parameter, so here are the options.
DECLARE @tbl TABLE(id INT, data INT)
INSERT INTO @tbl VALUES (1,1),(2,1),(3,2),(4,NULL)
SELECT data
,COUNT(*) Count_star
,COUNT(id) Count_id
,COUNT(data) Count_data
,COUNT(1) Count_literal
FROM @tbl
GROUP BY data
data Count_star Count_id Count_data Count_literal
----------- ----------- ----------- ----------- -------------
NULL 1 1 0 1
1 2 2 2 2
2 1 1 1 1
Warning: Null value is eliminated by an aggregate or other SET operation.
The difference shows up in the treatment of NULL when you COUNT a column that contains NULLs.

How can I insert records from one table into another table ordered by a specific column value?

Does anyone know how I can insert records from one table into another table ordered by a specific column value?
For Example:
I have the following table:
tableA:
record_id int,
name varchar(100),
nickname varchar(100),
chain_id int (PK),
chain_n int,
count int,
create_date datetime
tableB:
record_id int,
name varchar(100),
nickname varchar(100),
chain_id int (PK),
chain_n int,
create_date datetime
I have the following values in tableA:
record_id name nickname chain_id chain_n count create_date
1 Test One 1 1 2 2013-06-06
2 Test Two 2 1 5 2013-06-06
3 Test Three 3 1 3 2013-06-06
I'm using the following script to insert the data into tableB:
INSERT INTO tableB
(
record_id,
name,
nickname,
chain_id,
chain_n,
create_date
)
SELECT
record_id,
name,
nickname,
chain_id,
chain_n,
create_date
FROM tableA
ORDER BY count DESC
I expected the data to be inserted into tableB like the following:
record_id name nickname chain_id chain_n create_date
2 Test Two 2 1 2013-06-06
3 Test Three 3 1 2013-06-06
1 Test One 1 1 2013-06-06
However, the result was as follows, still ordered by chain_id:
record_id name nickname chain_id chain_n create_date
1 Test One 1 1 2013-06-06
2 Test Two 2 1 2013-06-06
3 Test Three 3 1 2013-06-06
Does anyone know how I can manage to insert the records ordered by count instead?
It seems that record_id is your primary key, and thus default ordering is done by that. That's why your output is the same as in tableA. Just use ORDER BY in the SELECT clause when querying tableB.
Inserting records into a table in a particular order doesn't make much sense to me, because ORDER BY doesn't actually influence the way the data is written to disk. All records are stored in the order of your clustered index. In any case, if you want to query data from a table ordered by some field, you should state it explicitly in the ORDER BY clause, even if you want the data ordered by the clustered index columns. Although data is nominally ordered by the clustered index, it is still up to the SQL Server engine to choose the best execution plan, and it may change the order if you don't specify it explicitly in the ORDER BY clause.
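If tableB really must expose the insertion order, one option (a sketch, not from the answers above; the insert_order column is an assumption) is to capture it in an IDENTITY column. SQL Server documents that INSERT ... SELECT ... ORDER BY assigns identity values in the requested order:
ALTER TABLE tableB ADD insert_order int IDENTITY(1,1)

INSERT INTO tableB (record_id, name, nickname, chain_id, chain_n, create_date)
SELECT record_id, name, nickname, chain_id, chain_n, create_date
FROM tableA
ORDER BY [count] DESC  -- [count] bracketed because COUNT is also a function name

SELECT record_id, name, nickname, chain_id, chain_n, create_date
FROM tableB
ORDER BY insert_order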

Help With SQL - Combining Two Rows Into One Row

I have an interesting SQL problem that I need help with.
Here is the sample dataset:
Warehouse DateStamp TimeStamp ItemNumber ID
A 8/1/2009 10001 abc 1
B 8/1/2009 10002 abc 1
A 8/3/2009 12144 qrs 5
C 8/3/2009 12143 qrs 5
D 8/5/2009 6754 xyz 6
B 8/5/2009 6755 xyz 6
This dataset represents inventory transfers between two warehouses. There are two records that represent each transfer, and these two transfer records always have the same ItemNumber, DateStamp, and ID. The TimeStamp values for the two transfer records always have a difference of 1, where the smaller TimeStamp represents the source warehouse record and the larger TimeStamp represents the destination warehouse record.
Using the sample dataset above, here is the query result set that I need:
Warehouse_Source Warehouse_Destination ItemNumber DateStamp
A B abc 8/1/2009
C A qrs 8/3/2009
D B xyz 8/5/2009
I can write code to produce the desired result set, but I was wondering if this record combination was possible through SQL. I am using SQL Server 2005 as my underlying database. I also need to add a WHERE clause to the SQL, so that for example, I could search on Warehouse_Source = A. And no, I can't change the data model ;).
Any advice is greatly appreciated!
Regards,
Mark
SELECT source.Warehouse as Warehouse_Source
, dest.Warehouse as Warehouse_Destination
, source.ItemNumber
, source.DateStamp
FROM [table] source
JOIN [table] dest ON source.ID = dest.ID
AND source.ItemNumber = dest.ItemNumber
AND source.DateStamp = dest.DateStamp
AND source.TimeStamp + 1 = dest.TimeStamp
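And for the filter the question asks about, a WHERE clause can simply be appended, e.g.:
WHERE source.Warehouse = 'A'  -- i.e. Warehouse_Source = A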
Mark,
Here is how you can do this with row_number and PIVOT. With a clustered index or primary key on the columns as I suggest, it will use a straight-line query plan with no Sort operation, and thus be particularly efficient.
create table T(
Warehouse char,
DateStamp datetime,
TimeStamp int,
ItemNumber varchar(10),
ID int,
primary key(ItemNumber,DateStamp,ID,TimeStamp)
);
insert into T values ('A','20090801',10001,'abc',1);
insert into T values ('B','20090801',10002,'abc',1);
insert into T values ('A','20090803',12144,'qrs',5);
insert into T values ('C','20090803',12143,'qrs',5);
insert into T values ('D','20090805',6754,'xyz',6);
insert into T values ('B','20090805',6755,'xyz',6);
with Tpaired(Warehouse,DateStamp,TimeStamp,ItemNumber,ID,rk) as (
select
Warehouse,DateStamp,TimeStamp,ItemNumber,ID,
row_number() over (
partition by ItemNumber,DateStamp,ID
order by TimeStamp
)
from T
)
select
max([1]) as Warehouse_Source,
max([2]) as Warehouse_Destination,
ItemNumber,
DateStamp
from Tpaired
pivot (
max(Warehouse) for rk in ([1],[2])
) as P
group by ItemNumber, DateStamp, ID;
go
drop table T;