getting top row of joined table - sql

I have 2 tables, tableA and tableB
tableA - id int
name varchar(50)
tableB - id int
fkid int
name varchar(50)
Both tables are joined between id and fkid.
Below are sample rows from tableA
Below is output from tableB
I want to join both tables and get only top row of joined table. So output will be like below
Id Name fkid
1 P1 1
2 P2 4
3 P3 null
Here is Sql fiddle
How can i achieve this with single query? I know that i can loop through in my .net code and retrieve top rows. But i want it in single query.

select a.id,a.name,b.fid from tableA a left join
(
select min(id) fid ,fkid from tableB group by fkid
)b
on a.id = b.fkid

select ta.id, ta.name, min(tb.id) from tableA ta
left join tableB tb on tb.fkid=ta.id
group by ta.id, ta.name

You could do this:
;WITH CTE
AS
(
SELECT
ROW_NUMBER() OVER(PARTITION BY fkID ORDER BY ID) AS RowNbr,
tableB.*
FROM
tableB
)
SELECT
*
FROM
tableA
LEFT JOIN CTE
ON CTE.fkID=tableA.id
AND CTE.RowNbr=1
Demo here
Or without window function. Like this:
SELECT
*
FROM
tableA
LEFT JOIN
(
SELECT
ROW_NUMBER() OVER(PARTITION BY fkID ORDER BY ID) AS RowNbr,
tableB.*
FROM
tableB
) as tbl
ON tbl.fkID=tableA.id
AND tbl.RowNbr=1
Demo here
Update:
The reason why I choose to do it with row_number is that if there is more columns in tableB then the example. Then there is no need for additional aggregate if you want to show more columns. For me personally it is more clear with an order by on the ID

Related

Using Min in Update statement to get oldest record

I want to update my tableA with tableB but get only those records from table B having the oldest entry
TableA:
name ID
nick 15
john 12
tableB:
ID sportsname createddate
12 tennis 15march2019
14 baseball 15march2019
15 basketball 16march2019
15 cricket 20march2020
15 football 17may2020
My query:
update a
set a.sportsname=b.sportsname
from tablea a join tableb b
on a.id=b.id where b.createdate=( select min(createdate) from tableb )
But this is not giving correct result
update a
set a.name=b.sportsname
from #T a join (select min(createddate) as min_createddate,ID,sportsname from #t2
group by ID,sportsname) b ON b.ID=a.ID
You can use a SUB QUERY to attain this.
I suspect that the problem with your query is that you are using the minimum create date over the entire tableb rather than per id. Although you could fix that using a correlated subquery, I would recommend apply:
update a
set a.sportsname = b.sportsname
from tablea a cross apply
(select top (1) b.*
from tableb b
where a.id = b.id
order by b.createdate asc
) b;
For performance, you want an index on tableb(id, createdate desc, sportname).
You can use FIRST_VALUE() window function:
UPDATE a
SET a.sportsname=b.sportsname
FROM TableA a INNER JOIN (
SELECT DISTINCT ID,
FIRST_VALUE(sportsname) OVER (PARTITION BY ID ORDER BY createddate) sportsname
FROM TableB
) b ON b.ID = a.ID
See the demo.

SQL Server 2012 writing duplicate entries into table from CTE

So I am writing to a table the output from a few sequential CTEs, and when I fixed a join in one of the CTEs from an inner to a left join, there are now duplicated entries in the Table that do not show up if I just run the query without the insert.
Is there something I need to understanding about creating and inserting into a table with regards to joins in a CTE?
EDIT
create table MYTABLE
(
ID int,
Date smalldatetime,
Val1 int,
Val2 int
)
; with cte1 as (
select
a.ID,
a.Date,
a.Val1,
b.Val2
from table1 a
left join table2 b
on a.ID = b.ID
and a.Date = b.Date
)
insert into MYTABLE
(ID, Date, Val1, Val2)
select * from cte1
When creating the table on the inner join there is no problem with duplicates; on the left join (as shown above), rows where there are NULLs appear to be duplicated many times.
Check your right table (table2) my guess is that there are more than one record that have the same ID and Date.
If that is the case, the records are not technically duplicated if you do a select all (*) in the CTE, you will see the other fields that have changed.
If you do not care about the rest of the fields being different though, just try adding a Row_Number to your CTE and select where the Row_Number = 1 outside of the CTE.
For Instance:
create table MYTABLE
(
ID int,
Date smalldatetime,
Val1 int,
Val2 int
)
; with cte1 as (
select
a.ID,
a.Date,
a.Val1,
b.Val2
Rnum = ROW_NUMBER() OVER(PARTITION BY a.ID, a.Date, a.Val1, a.Val2 ORDER BY ID)
from table1 a
left join table2 b
on a.ID = b.ID
and a.Date = b.Date
)
insert into MYTABLE
(ID, Date, Val1, Val2)
select ID, Date, Val1, Val2 from cte1
where Rnum = 1
The row_number acts as a "distinct" and depending on what combination of fields you want to not duplicate, you will get different results.
For instance, if you do not want the IDs to duplicate, then
Rnum = ROW_NUMBER() OVER(PARTITION BY a.ID ORDER BY ID)
if you do not care about the IDs duplicating, but you do not want the same ID on the same date, then
Rnum = ROW_NUMBER() OVER(PARTITION BY a.ID, a.Date ORDER BY ID)
etc.... just depends on your selection criteria of what you do not want to duplicate.
Hope this helps

How to query total when I have a join table

Hallo,
I have a join table, said tableA and tableB. tableA have a column called Amount. tableB have a column called refID. I would like to total up the Amount column when refID having the same value. I was using SUM in my query, but it throw me an error:
ORA-30483: window functions are not allowed here
30483. 00000 - "window functions are not allowed here"
*Cause: Window functions are allowed only in the SELECT list of a query.
And, window function cannot be an argument to another window or group
function.
Here is my query for your reference:
select *
from (
select SUM(A.Amount), B.refId, Rank() over (partition by B.refID order by B.id desc) as ranking
from table A
left outer join table B on A.refID = B.refID
)
where ranking=1;
May I know is there any alternate solution in order for me to SUM the Amount?
THanks #!
select
SUM(A.Amount),
B.refId
from table A
left outer join table B on A.refID = B.refID
GROUP BY
B.refId
SELECT *
FROM (
SELECT A.Amount, B.refId,
Rank() over (partition by A.refID order by B.id desc) as ranking,
SUM(amount) OVER (PARTITION BY a.refId) AS asum
FROM tableA A
LEFT JOIN
tableB B
ON B.refID = A.refID
)
WHERE ranking = 1
Declare #T table(id int)
insert into #T values (1),(2)
Declare #T1 table(Tid int,fkid int,Amount int)
insert into #T1 values (1,1,200),(2,1,250),(3,2,100),(4,2,25)
Select SUM(t1.Amount) as amount,t1.fkid as id from #T t
left outer join #T1 t1 on t1.fkid = t.id group by t1.fkid
SELECT refid, sum(a.amount)
FROM table AS a LEFT table AS b USING (refid)
GROUP BY refid;
I'm a little confused. The query you posted did not have a SUM function anywhere, and performed a self-join of a table named "TABLE" to itself. I'm going to guess that you actually have two tables (I'll call them TABLE_A and TABLE_B), in which case the following should do it:
SELECT a.REFID, SUM(a.AMOUNT)
FROM TABLE_A a
INNER JOIN TABLE_B b
ON (b.REFID = a.REFID)
GROUP BY a.REFID;
If I understood your question you only wanted results when you have a TABLE_B.REFID which matches a TABLE_A.REFID, so an INNER JOIN would be appropriate.
Share and enjoy.

select a value where it doesn't exist in another table

I have two tables
Table A:
ID
1
2
3
4
Table B:
ID
1
2
3
I have two requests:
I want to select all rows in table A that table B doesn't have, which in this case is row 4.
I want to delete all rows that table B doesn't have.
I am using SQL Server 2000.
You could use NOT IN:
SELECT A.* FROM A WHERE ID NOT IN(SELECT ID FROM B)
However, meanwhile i prefer NOT EXISTS:
SELECT A.* FROM A WHERE NOT EXISTS(SELECT 1 FROM B WHERE B.ID=A.ID)
There are other options as well, this article explains all advantages and disadvantages very well:
Should I use NOT IN, OUTER APPLY, LEFT OUTER JOIN, EXCEPT, or NOT EXISTS?
For your first question there are at least three common methods to choose from:
NOT EXISTS
NOT IN
LEFT JOIN
The SQL looks like this:
SELECT * FROM TableA WHERE NOT EXISTS (
SELECT NULL
FROM TableB
WHERE TableB.ID = TableA.ID
)
SELECT * FROM TableA WHERE ID NOT IN (
SELECT ID FROM TableB
)
SELECT TableA.* FROM TableA
LEFT JOIN TableB
ON TableA.ID = TableB.ID
WHERE TableB.ID IS NULL
Depending on which database you are using, the performance of each can vary. For SQL Server (not nullable columns):
NOT EXISTS and NOT IN predicates are the best way to search for missing values, as long as both columns in question are NOT NULL.
select ID from A where ID not in (select ID from B);
or
select ID from A except select ID from B;
Your second question:
delete from A where ID not in (select ID from B);
SELECT ID
FROM A
WHERE NOT EXISTS( SELECT 1
FROM B
WHERE B.ID = A.ID
)
This would select 4 in your case
SELECT ID FROM TableA WHERE ID NOT IN (SELECT ID FROM TableB)
This would delete them
DELETE FROM TableA WHERE ID NOT IN (SELECT ID FROM TableB)
SELECT ID
FROM A
WHERE ID NOT IN (
SELECT ID
FROM B);
SELECT ID
FROM A a
WHERE NOT EXISTS (
SELECT 1
FROM B b
WHERE b.ID = a.ID)
SELECT a.ID
FROM A a
LEFT OUTER JOIN B b
ON a.ID = b.ID
WHERE b.ID IS NULL
DELETE
FROM A
WHERE ID NOT IN (
SELECT ID
FROM B)

SQL Select Distinct with Conditional

Table1 has columns (id, a, b, c, group). There are several rows that have the same group, but id is always unique. I would like to SELECT group,a,b FROM Table1 WHERE the group is distinct. However, I would like the returned data to be from the row with the greatest id for that group.
Thus, if we have the rows
(id=10, a=6, b=40, c=3, group=14)
(id=5, a=21, b=45, c=31, group=230)
(id=4, a=42, b=65, c=2, group=230)
I would like to return these 2 rows:
[group=14, a=6,b=40] and
[group=230, a=21,b=45] (because id=5 > id=4)
Is there a simple SELECT statement to do this?
Try:
select grp, a, b
from table1 where id in
(select max(id) from table1 group by grp)
You can do it using a self join or an inner-select. Here's inner select:
select `group`, a, b from Table1 AS T1
where id=(select max(id) from Table1 AS T2 where T1.`group` = T2.`group`)
And self-join method:
select T1.`group`, T2.a, T2.b from
(select max(id) as id,`group` from Table1 group by `group`) T1
join Table1 as T2 on T1.id=T2.id
2 selects, your inner select gets:
SELECT MAX(id) FROM YourTable GROUP BY [GROUP]
Your outer select joins to this table.
Think about it logically, the inner select gets a sub set of the data you need.
The outer select inner joins to this subset and can get further data.
SELECT [group], a, b FROM YourTable INNER JOIN
(SELECT MAX(id) FROM YourTable GROUP BY [GROUP]) t
ON t.id = YourTable.id
SELECT mi.*
FROM (
SELECT DISTINCT grouper
FROM mytable
) md
JOIN mytable mi
ON mi.id =
(
SELECT id
FROM mytable mo
WHERE mo.grouper = md.grouper
ORDER BY
id DESC
LIMIT 1
)
If your table is MyISAM or id is not a PRIMARY KEY, then make sure you have a composite index on (grouper, id).
If your table is InnoDB and id is a PRIMARY KEY, then a simple index on grouper will suffice (id, being a PRIMARY KEY, will be implictly included).
This will use an INDEX FOR GROUP-BY to build the list of distinct groupers, and for each grouper it will use the index access to find the maximal id.
Don't know how to do it in mysql. But the following code will work for MsSQL...
SELECT Y.* FROM
(
SELECT DISTINCT [group], MAX(id) ID
FROM Table1
GROUP BY [group]
) X
INNER JOIN Table1 Y ON X.ID=Table1.ID