max date for multiple id using different tables - sql

I have 3 tables (SQL Server 2008 R2):
Table 1
ID Date
123 20-08-2011
123 20-08-2011
234 30-09-2012
Table 2
ID Centre ChangeDate
123 987 16-08-11
123 568 28-05-10
234 456 14-09-12
Table 3
Centre Centre_Name
987 test1
568 test2
456 test3
I would like to make a query which joins all columns and only selects the Centre with the maximum ChangeDate. Thus, the following table should be returned:
ID Date Centre ChangeDate Centre_Name
123 20-08-11 987 16-08-11 test1
123 20-08-11 987 16-08-11 test1
234 30-09-12 456 14-09-12 test3
Thank you.

There are several ways to get this result. One is row_number() partitioned by your [Id] column (or by [Id] and [Centre] if you want the highest [ChangeDate] per Id/Centre combination). This yields one row per [Id], so it does not reproduce the duplicated rows in the desired result described in your question (my question would be: why would you want duplicated rows?). The other method uses a subquery to retrieve the maximum ChangeDate per Id from Table 2, and then joins that result to Table 2 a second time.
declare @table1 table ([Id] int, [Date] date)
declare @table2 table ([Id] int, [Centre] int, [ChangeDate] date)
declare @table3 table ([Centre] int, [Centre_Name] varchar(25))
insert into @table1 values (123,'2011-08-20'),(123,'2011-08-20'),(234,'2012-09-30')
insert into @table2 values (123,987,'2011-08-16'),(123,568,'2010-05-28'),(234,456,'2012-09-14')
insert into @table3 values (987,'test1'),(568,'test2'),(456,'test3')
Method 1: using row_number()
select *
from (
    select t1.[Id], t1.[Date], t2.[Centre], t2.[ChangeDate], t3.[Centre_Name]
         , rn = row_number() over (partition by t2.[Id] /*, t2.[Centre]*/ order by t2.[ChangeDate] desc)
    from @table1 t1
    join @table2 t2 on t2.[Id] = t1.[Id]
    join @table3 t3 on t3.[Centre] = t2.[Centre]
) x
where x.rn = 1
Method 2: using a subquery to retrieve the maximum [ChangeDate] per [Id]
select t1.[Id], t1.[Date], t2.[Centre], t2.[ChangeDate], t3.[Centre_Name]
from @table1 t1
join (select [Id], maxChangeDate = max([ChangeDate]) from @table2 group by [Id]) t2_maxChangeDate on t1.[Id] = t2_maxChangeDate.[Id]
join @table2 t2 on t2.[Id] = t2_maxChangeDate.[Id] and t2.[ChangeDate] = t2_maxChangeDate.[maxChangeDate]
join @table3 t3 on t3.[Centre] = t2.[Centre]
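If the duplicated Table 1 rows from the question do need to survive, one possible variant of the row_number() approach is to rank Table 2 on its own and only then join it to Table 1. This is a minimal sketch, not tested against the asker's real schema; the table variables simply repeat the sample data from the question.

```sql
-- Sample data copied from the question, held in table variables.
declare @table1 table ([Id] int, [Date] date);
declare @table2 table ([Id] int, [Centre] int, [ChangeDate] date);
declare @table3 table ([Centre] int, [Centre_Name] varchar(25));
insert into @table1 values (123,'2011-08-20'),(123,'2011-08-20'),(234,'2012-09-30');
insert into @table2 values (123,987,'2011-08-16'),(123,568,'2010-05-28'),(234,456,'2012-09-14');
insert into @table3 values (987,'test1'),(568,'test2'),(456,'test3');

-- Rank the Table 2 rows per Id before joining, so the ranking never
-- sees (and never collapses) the duplicate rows of Table 1.
select t1.[Id], t1.[Date], t2.[Centre], t2.[ChangeDate], t3.[Centre_Name]
from @table1 t1
join (
    select [Id], [Centre], [ChangeDate],
           rn = row_number() over (partition by [Id] order by [ChangeDate] desc)
    from @table2
) t2 on t2.[Id] = t1.[Id] and t2.rn = 1
join @table3 t3 on t3.[Centre] = t2.[Centre];
```

Unlike Method 2, this keeps exactly one Table 2 row per Id even when two rows share the same maximum ChangeDate (which of the tied rows wins is then arbitrary unless you add a tie-breaker to the order by).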

I would use apply :
select t1.*, tt.*
from table1 t1 cross apply
( select top (1) t2.Centre, t2.ChangeDate, t3.Centre_Name
from table2 t2 inner join
table3 t3
on t3.Centre = t2.Centre
where t2.id = t1.id
order by t2.ChangeDate desc
) tt;
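One thing to be aware of with cross apply: a table1 row whose id has no row in table2 disappears from the result. If those rows should be kept, outer apply is the usual counterpart. A small sketch, assuming the sample data from the question plus one hypothetical extra id (999) that has no centre:

```sql
declare @table1 table (Id int, [Date] date);
declare @table2 table (Id int, Centre int, ChangeDate date);
declare @table3 table (Centre int, Centre_Name varchar(25));
insert into @table1 values (123,'2011-08-20'),(123,'2011-08-20'),(234,'2012-09-30'),
                           (999,'2013-01-01');  -- hypothetical id without any centre
insert into @table2 values (123,987,'2011-08-16'),(123,568,'2010-05-28'),(234,456,'2012-09-14');
insert into @table3 values (987,'test1'),(568,'test2'),(456,'test3');

-- outer apply keeps every table1 row; Centre, ChangeDate and Centre_Name
-- are NULL for ids that have no row in table2.
select t1.*, tt.*
from @table1 t1
outer apply (
    select top (1) t2.Centre, t2.ChangeDate, t3.Centre_Name
    from @table2 t2
    inner join @table3 t3 on t3.Centre = t2.Centre
    where t2.Id = t1.Id
    order by t2.ChangeDate desc
) tt;
```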

Related

Why does selecting all columns change the order

I have 2 tables. The order of selecting with a select * is different than the order of selecting without the wildcard.
This issue is happening on a production environment.
I have tried to replicate this issue but have not succeeded.
What could be causing this issue in the production tables?
DROP TABLE IF EXISTS #table1
DROP TABLE IF EXISTS #table2
CREATE TABLE #table1 (id int, code varchar(10), parentcode varchar(10), carriercode varchar(10), maxvalue numeric(14,3), dt datetime DEFAULT GETDATE())
CREATE TABLE #table2 (id int, carriercode varchar(10))
-- notice the maximum value is always 2000.000
INSERT INTO #table1 (id,code,parentcode,carriercode,maxvalue) SELECT 1,'a1','a','carrier_a',2000.000
INSERT INTO #table1 (id,code,parentcode,carriercode,maxvalue) SELECT 2,'a2','a','carrier_b',2000.000
INSERT INTO #table1 (id,code,parentcode,carriercode,maxvalue) SELECT 3,'c1','c','carrier_c',2000.000
INSERT INTO #table2 (id,carriercode) SELECT 1,'carrier_a'
INSERT INTO #table2 (id,carriercode) SELECT 2,'carrier_b'
This is the select without the wildcard
SELECT t1.id,t1.code,t1.parentcode,t1.carriercode
FROM #table1 t1
LEFT JOIN #table2 t2 on t1.carriercode=t2.carriercode
WHERE (t1.parentcode = 'a')
AND (t1.maxvalue >= 830 OR t1.maxvalue is null)
ORDER BY t1.maxvalue DESC
And the result
id code parentcode carriercode
1 a1 a carrier_a
2 a2 a carrier_b
Here the select with the wildcard
SELECT t1.id,t1.code,t1.parentcode,t1.carriercode,*
FROM #table1 t1
LEFT JOIN #table2 t2 on t1.carriercode=t2.carriercode
WHERE (t1.parentcode = 'a')
AND (t1.maxvalue >= 830 OR t1.maxvalue is null)
ORDER BY t1.maxvalue DESC
And the second result
id code parentcode carriercode id code parentcode carriercode maxvalue dt id carriercode
1 a1 a carrier_a 1 a1 a carrier_a 2000.000 2022-09-30 22:49:52.787 1 carrier_a
2 a2 a carrier_b 2 a2 a carrier_b 2000.000 2022-09-30 22:49:52.787 2 carrier_b
Notice that the order of table 1 id column is the same for both select statements. On the production tables the 2 select statements are ordered differently.
What I have tried
Rounding issues: CAST numeric to int -> order is still the same for both selects
Changed the order of the initial inserts -> order is still equal for both selects
Because the order of rows that tie on your ORDER BY columns is undocumented and depends on the details of the execution plan. Adding * changes which columns the query has to read, which can change the plan and with it the order of the tied rows.
To fix the order, make sure your ORDER BY includes enough columns to order the rows uniquely, e.g.
ORDER BY t1.maxvalue DESC, t1.id
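A minimal sketch of the tie-breaker idea, using a hypothetical table variable @t rather than the production tables. The rows are inserted out of id order on purpose, and two of them tie on maxvalue:

```sql
declare @t table (id int, code varchar(10), maxvalue numeric(14,3));
insert into @t values (2,'a2',2000.000),(1,'a1',2000.000),(3,'a3',1500.000);

-- maxvalue alone cannot order ids 1 and 2 (both are 2000.000), so their
-- relative order would be up to the execution plan. Adding id as a second
-- sort key makes the order deterministic: 1, 2, 3.
select id, code, maxvalue
from @t
order by maxvalue desc, id;
```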

Output the results of several SELECT statements to an excel sheet in their own columns

I have a query that I want to turn into a stored proc which has, right now, about 6 select statements in it of similar data. Each one just brings back phone numbers in one column except each of the columns is named differently.
Basically it is:
SELECT PhoneNumber as PhoneGroup1 FROM PhoneNumberTable
SELECT PhoneNumber as PhoneGroup2 FROM PhoneNumberTable
SELECT PhoneNumber as PhoneGroup3 FROM PhoneNumberTable
It is actually more complex than that, but those are the results I get in a nutshell.
I then will go and copy/paste each column and its header name into a spreadsheet into Column A for PhoneGroup1, Column B for PhoneGroup2, etc.
PhoneGroup1 | PhoneGroup2 | PhoneGroup3
4856562281 | 9498675309 | 6238471273
7452837719 | 5739542855 | 4745856147
8472639273 | 6495232247 | 9516538847
Is there any way I can have this export to an excel sheet?
Thank you guys for any guidance!
I think I understand what you're trying to do. Do you have something like this:
declare @tbl1 table ( id int )
declare @tbl2 table ( id int )
insert into @tbl1 values(1),(2),(3)
insert into @tbl2 values(10),(20),(30)
select * from @tbl1
union
select * from @tbl2
which returns this result set:
id
----
1
2
3
10
20
30
but you really want this result set?
id1 id2
---- ----
1 10
2 20
3 30
I can see a way to do this using row numbers. Basically, you give each row returned from the individual tables a row number, and then you join the tables together matching on the row numbers. It looks like this in my example:
declare @tbl1 table ( id int )
declare @tbl2 table ( id int )
insert into @tbl1 values(1),(2),(3)
insert into @tbl2 values(10),(20),(30)
select t1.id as id1, t2.id as id2
from
(
    select 'table1' as header, id, row_number() over (order by id) rnum
    from @tbl1
) t1
inner join
(
    select 'table2' as header, id, row_number() over (order by id) rnum
    from @tbl2
) t2 on t1.rnum = t2.rnum
To add a column you have to add another join to the query. If your tables have different numbers of rows and you want to see all rows, use full outer joins instead of inner joins.
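As a rough sketch of the full outer join case with three columns of different lengths: the table variables @g1, @g2 and @g3 are hypothetical stand-ins for the three phone number queries, filled with numbers from the question. Each later join has to match on coalesce() of the earlier row numbers, because any single one of them may be NULL for a given output row.

```sql
declare @g1 table (PhoneNumber varchar(10));
declare @g2 table (PhoneNumber varchar(10));
declare @g3 table (PhoneNumber varchar(10));
insert into @g1 values ('4856562281'),('7452837719'),('8472639273');
insert into @g2 values ('9498675309'),('5739542855');
insert into @g3 values ('6238471273');

select g1.PhoneNumber as PhoneGroup1,
       g2.PhoneNumber as PhoneGroup2,
       g3.PhoneNumber as PhoneGroup3
from (select PhoneNumber, row_number() over (order by PhoneNumber) as rnum from @g1) g1
full outer join
     (select PhoneNumber, row_number() over (order by PhoneNumber) as rnum from @g2) g2
     on g2.rnum = g1.rnum
full outer join
     (select PhoneNumber, row_number() over (order by PhoneNumber) as rnum from @g3) g3
     on g3.rnum = coalesce(g1.rnum, g2.rnum)
order by coalesce(g1.rnum, g2.rnum, g3.rnum);
```

The result has as many rows as the longest group, with NULLs padding the shorter columns, which pastes into a spreadsheet as one block.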

Insert lost data to the table

I have two tables that share a common column, 'StationID'.
Create table t1(ID int, StationID bigint)
insert into t1 values
(0,1111),
(1,2222),
(2,34),
(3,456209),
(56,78979879),
(512,546)
go
Create table t2(StationID bigint, Descr varchar(50))
insert into t2 values
(-1,'test-1'),
(0,'test0'),
(1,'test1'),
(2,'test2'),
(5001,'dummy'),
(5002,'dummy'),
(6001,'dummy')
go
Notice that not every t1.StationID exists in t2.StationID. Running this script proves it:
select distinct StationID from t1 as A
where not exists
(select * from t2 as B where B.StationID =A.StationID)
The result is:
StationID
34
546
1111
2222
456209
78979879
Now I want to fill t2 with the missing StationID values above; the Descr column can hold any dummy data.
My real case has thousands of records. How can I do this with a script?
insert into t2 (StationID, Descr)
select distinct StationID, 'dummy'
from t1 as A
where not exists
(select * from t2 as B where B.StationID =A.StationID)
INSERT INTO
t2
SELECT DISTINCT
stationid, 'dummy'
FROM
t1
WHERE
stationid NOT IN (SELECT stationid FROM t2)
(As an alternative to the others).
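One caveat worth knowing when choosing between the two: NOT IN and NOT EXISTS behave differently if the subquery column can contain NULL. The sketch below uses hypothetical table variables in which t2.StationID is nullable (an assumption; the question's t2 contains no NULLs):

```sql
declare @t1 table (StationID bigint);
declare @t2 table (StationID bigint null, Descr varchar(50));
insert into @t1 values (34),(546);
insert into @t2 values (0,'test0'),(null,'no station');

-- NOT IN: returns no rows at all, because "34 NOT IN (0, NULL)"
-- evaluates to UNKNOWN rather than TRUE.
select distinct StationID
from @t1
where StationID not in (select StationID from @t2);

-- NOT EXISTS: returns 34 and 546, since the NULL row simply never matches.
select distinct StationID
from @t1 as A
where not exists (select * from @t2 as B where B.StationID = A.StationID);
```

So if StationID in t2 is declared NOT NULL the two inserts are equivalent; otherwise the NOT EXISTS form is the safer one.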

Recursive Update Statement

I need to create a recursive update statement that updates from another table, for example:
Table1
(
IdNumberGeneratedFromAService INT NOT NULL,
CodeName NVARCHAR(MAX)
)
Table2
(
Table2Id Auto_Increment,
Name NVARCHAR(MAX),
IdNumberThatComesFromTabl1,
CodeNameForTable1ToMatch
)
The issue is that CodeNameForTable1ToMatch is not unique. If Table1 has two id numbers for the same code and Table2 has two rows with the same CodeName, I want to update the rows in Table2 in sequence, so the first row gets the first id number and the second row gets the second id number.
I also want to do it without a cursor.
SAMPLE DATA
Table1
idNumber Code
C145-6678-90 Code1
C145-6678-91 Code1
C145-6678-92 Code1
C145-6678-93 Code1
C145-6678-94 Code1
Table 2
AutoIncrementIdNumber Code IdNumber
1 Code1 {NULL}
2 Code1 {NULL}
3 Code1 {NULL}
4 Code1 {NULL}
5 Code1 {NULL}
C145-6678-90 needs to go to row 1
C145-6678-91 needs to go to row 2
C145-6678-92 needs to go to row 3
C145-6678-93 needs to go to row 4
C145-6678-94 needs to go to row 5
all in one update statement.
Using the ROW_NUMBER windowing function on each of the tables, partitioned by the code, you can number the rows that have a code in common, then join the two numbered sets to match rows on the code and the numbered instance of that code. So the first Code A in Table 1 would match the first Code A in Table 2, and so on.
Sample code showing this (SQL 2005 or higher):
-- Sample code prep
CREATE TABLE #Table1
(
IdNumberGeneratedFromAService INT NOT NULL,
CodeName NVARCHAR(MAX)
);
CREATE TABLE #Table2
(
Table2Id INT NOT NULL IDENTITY(1,1),
Name NVARCHAR(MAX),
IdNumberThatComesFromTabl1 INT NULL,
CodeNameForTable1ToMatch NVARCHAR(MAX)
);
INSERT INTO #Table1(IdNumberGeneratedFromAService, CodeName)
VALUES(100,'Code A'),(150,'Code A'),(200,'Code B'),(250,'Code A'),(300,'Code C'),(400,'Nonexistent');
INSERT INTO #Table2(Name, IdNumberThatComesFromTabl1, CodeNameForTable1ToMatch)
VALUES('A1-100',0,'Code A'),('A2-150',0,'Code A'),('A3-250',0,'Code A'),('B1-200',0,'Code B'),('C1-300',0,'Code C'),('No Id For Me',0,'Code No Id :(');
-- Sample select statement that shows the row numbers
--SELECT *
--FROM
-- (SELECT *, ROW_NUMBER() OVER (Partition By IT2.CodeNameForTable1ToMatch Order By IT2.Table2Id) as RowNum
-- FROM #Table2 IT2) T2
-- INNER JOIN
-- (SELECT *, ROW_NUMBER() OVER (Partition By IT1.CodeName Order By IT1.IdNumberGeneratedFromAService) as RowNum
-- FROM #Table1 IT1) T1
-- ON T1.CodeName = T2.CodeNameForTable1ToMatch AND T1.RowNum = T2.RowNum;
-- Table 2 Before
SELECT * FROM #Table2;
-- Actual update statement
UPDATE AT2
SET IdNumberThatComesFromTabl1 = T1.IdNumberGeneratedFromAService
FROM #Table2 AT2
INNER JOIN
-- number the Table 2 rows per code in Table2Id order (the column being updated is all 0's, so it cannot define an order)
(SELECT *, ROW_NUMBER() OVER (Partition By IT2.CodeNameForTable1ToMatch Order By IT2.Table2Id) as RowNum
FROM #Table2 IT2) T2
ON T2.Table2Id = AT2.Table2Id
INNER JOIN
(SELECT *, ROW_NUMBER() OVER (Partition By IT1.CodeName Order By IT1.IdNumberGeneratedFromAService) as RowNum
FROM #Table1 IT1) T1
ON T1.CodeName = T2.CodeNameForTable1ToMatch AND T1.RowNum = T2.RowNum;
-- Table 2 after
SELECT * FROM #Table2;
-- Cleanup
DROP TABLE #Table1;
DROP TABLE #Table2;
I turned your two sample tables into temp tables and added 3 records for 'Code A', a record for 'Code B', and a record for 'Code C'. The codes in Table 1 are numbered in order of the Table 1 ID; the codes in Table 2 are numbered in order of the auto-incrementing Table 2 id. I also included a record in each table that has no match in the other. I tried to make the names descriptive so it is easier to see that a correct match has occurred (the order for Table 2 is important since it has an auto-incrementing id).
The commented out sample select is there to help understand how the select works before I join it into the UPDATE statement.
So before the update, Table 2 is all 0's. We then update the values in Table 2 where the unique Table 2 id matches the unique Table 2 id from our numbered and matched join, and select from Table 2 again to see the results.
A riff on Tarwn's solution:
with cte1 as (
    select code, idNumber, row_number() over (partition by code order by idNumber) as [rn]
    from table1
), cte2 as (
    select code, idNumber, row_number() over (partition by code order by AutoIncrementIdNumber) as [rn]
    from table2
)
update cte2
set idNumber = cte1.idNumber
from cte2
inner join cte1
    on cte2.code = cte1.code
    and cte2.rn = cte1.rn
I only present this because people are often amazed that you can update a common table expression.
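For anyone who wants to try the CTE update, here is a self-contained sketch against the sample data from the question. The table variables and the column types (varchar(20), an identity column for AutoIncrementIdNumber) are assumptions made for illustration, since the question only gives pseudo-DDL.

```sql
declare @table1 table (idNumber varchar(20), code varchar(20));
declare @table2 table (AutoIncrementIdNumber int identity(1,1),
                       code varchar(20),
                       idNumber varchar(20) null);
insert into @table1 values ('C145-6678-90','Code1'),('C145-6678-91','Code1'),
                           ('C145-6678-92','Code1'),('C145-6678-93','Code1'),
                           ('C145-6678-94','Code1');
insert into @table2 (code) values ('Code1'),('Code1'),('Code1'),('Code1'),('Code1');

-- Both CTEs must expose idNumber: cte1 supplies the value, cte2 receives it.
with cte1 as (
    select code, idNumber,
           row_number() over (partition by code order by idNumber) as rn
    from @table1
), cte2 as (
    select code, idNumber,
           row_number() over (partition by code order by AutoIncrementIdNumber) as rn
    from @table2
)
update cte2
set idNumber = cte1.idNumber
from cte2
inner join cte1
    on cte2.code = cte1.code
   and cte2.rn = cte1.rn;

-- Row 1 now holds C145-6678-90, row 2 holds C145-6678-91, and so on.
select AutoIncrementIdNumber, code, idNumber
from @table2
order by AutoIncrementIdNumber;
```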
This isn't possible without a cursor.

How to create a view which merges two tables?

I have two tables with exactly the same structure. Both can store the same data under different primary keys (auto-incremented integers), so a third table lists which pairs of primary keys refer to the same data. However, some rows exist in only one of the two tables. A simple join therefore won't work, since you would get two rows with the same primary key but different data. Is there a way of reassigning primary keys to unused values in the view?
Table1
ID name
1 Adam
2 Mark
3 David
4 Jeremy
Table2
ID name
1 Jessica
2 Jeremy
3 David
4 Mark
Table3
T1ID T2ID
2 4
3 3
4 2
I am looking for a result table like the following:
Result
ID name
1 Adam
2 Mark
3 David
4 Jeremy
5 Jessica
The real heart of the question is how I can assign the temporary fake id of 5 to Jessica and not just some random number. The rule I want for the ids is that if the row exists in the first table, then use its own id. Otherwise, use the next id that an insert statement would have generated (the column is on autoincrement).
Answer to edited question
select id, name from table1
union all
select X.offset + row_number() over (order by id), name
from (select MAX(id) offset from table1) X
cross join table2
where not exists (select * from table3 where t2id = table2.id)
The MAX(id) is used to "predict" the next identity that would occur if you merged the data from the 2nd table into the first. If Table3.T2ID exists at all, it means that it is already included in table1.
Using the test data below
create table table1 (id int identity, name varchar(10))
insert table1 select 'Adam' union all
select 'Mark' union all
select 'David' union all
select 'Jeremy'
create table table2 (id int identity, name varchar(10))
insert table2 select 'Jessica' union all
select 'Jeremy' union all
select 'David' union all
select 'Mark'
create table table3 (t1id int, t2id int)
insert table3 select 2,4 union all
select 3,3 union all
select 4,2
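Since the question asks for a view, the query above can be wrapped in one. This is only a sketch: the view name dbo.MergedNames is made up, the tables are assumed to live in the dbo schema, and the offset alias is renamed to id_offset to stay clear of the OFFSET keyword in newer SQL Server versions. CREATE VIEW has to be the first statement in its batch, hence the go separators.

```sql
go
create view dbo.MergedNames
as
-- Rows from table1 keep their own ids.
select id, name
from dbo.table1
union all
-- Rows that exist only in table2 get ids continuing after MAX(table1.id).
select X.id_offset + row_number() over (order by t2.id), t2.name
from (select isnull(max(id), 0) as id_offset from dbo.table1) X
cross join dbo.table2 t2
where not exists (select * from dbo.table3 t3 where t3.t2id = t2.id);
go

select id, name
from dbo.MergedNames
order by id;
```

With the test data above this lists Adam, Mark, David and Jeremy under ids 1 to 4 and Jessica under id 5. Note that the generated ids are recomputed on every query, so they shift whenever rows are added to table1.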
Answer to original question below
So the 3rd table is the one you want to build (a view instead of a table)?
select newid=row_number() over (order by pk_id), *
from
(
select a.*
from tblfirst a
UNION ALL
select b.*
from tblsecond b
) X
The data will contain a unique newid value for each record, whether from first or second table. Change pk_id to your primary key column name.
Assuming you have below data, as I understand reading your question:
Table: T1
ID name
--------
1 a
2 b
3 c
Table: T2
ID name
--------
2 b
3 c
4 d
Table: Rel
ID1 ID2
--------
2 2
3 3
T1 has some data which is not in T2 and vice versa.
Following query will give all data unioned
SELECT ROW_NUMBER() OVER (order by name) ID, name
from
(
SELECT ISNULL(T1.name,'') name
FROM T1 t1 LEFT JOIN Rel TR ON TR.ID1 = T1.ID
union
SELECT ISNULL(T2.name,'') name
FROM T2 t2 LEFT JOIN Rel TR ON TR.ID2 = T2.ID
) T
If I understand you correctly, the following might work:
Select everything from your first table
Select everything from your second table that is not linked to your third table
Combine the results
Test data
DECLARE @Table1 TABLE (ID INTEGER IDENTITY(1, 1), Value VARCHAR(32))
DECLARE @Table2 TABLE (ID INTEGER IDENTITY(1, 1), Value VARCHAR(32))
DECLARE @Table3 TABLE (T1ID INTEGER, T2ID INTEGER)
INSERT INTO @Table1 VALUES ('Adam')
INSERT INTO @Table1 VALUES ('Mark')
INSERT INTO @Table1 VALUES ('David')
INSERT INTO @Table1 VALUES ('Jeremy')
INSERT INTO @Table2 VALUES ('Jessica')
INSERT INTO @Table2 VALUES ('Jeremy')
INSERT INTO @Table2 VALUES ('David')
INSERT INTO @Table2 VALUES ('Mark')
INSERT INTO @Table3 VALUES (2, 4)
INSERT INTO @Table3 VALUES (3, 3)
INSERT INTO @Table3 VALUES (4, 2)
SQL Statement
SELECT ROW_NUMBER() OVER (ORDER BY ID), t1.Value
FROM @Table1 t1
UNION ALL
SELECT ROW_NUMBER() OVER (ORDER BY ID) + offset, t2.Value
FROM @Table2 t2
LEFT OUTER JOIN @Table3 t3 ON t3.T2ID = t2.ID
CROSS APPLY (
    SELECT Offset = COUNT(*)
    FROM @Table1
) offset
WHERE t3.T2ID IS NULL