Limiting Results on Join Query - sql

Say I have the following tables:
|RefNumber| Charge| IssueDate|
------------------------------
|    00001|   40.0|2009-01-01|
|    00002|   40.0|2009-06-21|

|ID|RefNumber|Forename| Surname|
---------------------------------
| 1|    00001|     Joe|   Blogs|
| 2|    00001|   David|   Jones|
| 3|    00002|    John|   Smith|
| 4|    00002|    Paul|   Walsh|
I would like to select refnumber, charge and issuedate from the first table then join on refnumber to the second table to retrieve forename and surname but only get the row with the highest id.
So results would look like:
|RefNumber| Charge| IssueDate|ID|Forename| Surname|
-----------------------------------------------------
| 00001| 40.0|2009-01-01| 2| David| Jones|
| 00002| 40.0|2009-06-21| 4| Paul| Walsh|
I am unsure how to limit the results of the join so that it returns only the record with the highest ID from the second table.

The most flexible way to write this, which doesn't require a correlated subquery, is to use ROW_NUMBER (SQL Server 2005+ only):
;WITH Names_CTE AS
(
SELECT
ID, RefNumber, Forename, Surname,
ROW_NUMBER() OVER (PARTITION BY RefNumber ORDER BY ID) AS RowNum
FROM Names
)
SELECT o.RefNumber, o.Charge, o.IssueDate, n.Forename, n.Surname
FROM Orders o
[INNER|LEFT] JOIN Names_CTE n
ON n.RefNumber = o.RefNumber
WHERE n.RowNum = 1
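Note that ROW_NUMBER isn't always the most efficient option when MIN/MAX would do the job; it's just the easiest to write. If you use the LEFT JOIN form, move the RowNum = 1 test into the ON clause, otherwise the WHERE filter discards the unmatched rows and the join behaves like an INNER JOIN.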
If you're running SQL 2000, or this isn't efficient enough, you can try a MIN or MAX query:
SELECT o.RefNumber, o.Charge, o.IssueDate, n.Forename, n.Surname
FROM Orders o
[INNER|LEFT] JOIN
(
SELECT RefNumber, MIN(ID) AS MinID
FROM Names
GROUP BY RefNumber
) m
ON m.RefNumber = o.RefNumber
[INNER|LEFT] JOIN Names n
ON n.ID = m.MinID
Sometimes this is actually faster; it depends a lot on the indexing strategy used.
(Edit: as written, both queries return the row with the lowest ID, which in most cases will be faster than getting the highest ID. The question asks for the highest, so change the first query to ORDER BY ID DESC and the second query to use MAX instead of MIN.)
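The MAX variant can be checked against the question's sample data. The sketch below is only an illustration: it uses Python's built-in sqlite3 module rather than SQL Server, the table names Orders and Names are the ones this answer assumed, and the column types are guesses.

```python
import sqlite3

# In-memory database holding the sample rows from the question.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (RefNumber TEXT PRIMARY KEY, Charge REAL, IssueDate TEXT);
    CREATE TABLE Names  (ID INTEGER PRIMARY KEY, RefNumber TEXT, Forename TEXT, Surname TEXT);
    INSERT INTO Orders VALUES ('00001', 40.0, '2009-01-01'), ('00002', 40.0, '2009-06-21');
    INSERT INTO Names VALUES
        (1, '00001', 'Joe',   'Blogs'),
        (2, '00001', 'David', 'Jones'),
        (3, '00002', 'John',  'Smith'),
        (4, '00002', 'Paul',  'Walsh');
""")

# Find the highest ID per RefNumber, then join back to Names for the name columns.
rows = conn.execute("""
    SELECT o.RefNumber, o.Charge, o.IssueDate, n.ID, n.Forename, n.Surname
    FROM Orders o
    JOIN (SELECT RefNumber, MAX(ID) AS MaxID
          FROM Names
          GROUP BY RefNumber) m ON m.RefNumber = o.RefNumber
    JOIN Names n ON n.ID = m.MaxID
    ORDER BY o.RefNumber
""").fetchall()

for row in rows:
    print(row)
```

Each RefNumber comes back once, paired with the Names row that has the highest ID.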

I'd probably join against a subquery that returns only the record from the second table with the highest ID.
select a.RefNumber, a.Charge, a.IssueDate, b.ID, b.Forename, b.Surname
from [References] a inner join -- REFERENCES is a reserved word in T-SQL, hence the brackets
(select ID, RefNumber, ForeName, Surname from Names n1
where n1.ID = (select top 1 n2.ID from Names n2 where n1.RefNumber = n2.RefNumber order by n2.ID desc) ) b
on a.RefNumber = b.RefNumber

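Actually your subquery would need to select the highest ID for EACH refnumber, and then join back to Names to pick up the forename and surname, so that would look more like this:
select
a.RefNumber, a.Charge, a.IssueDate, n.ID, n.Forename, n.Surname
from [References] a -- REFERENCES is a reserved word in T-SQL, hence the brackets
inner join
(select
RefNumber,
max(ID) as BiggestID
from Names
group by
RefNumber) b
on a.RefNumber = b.RefNumber
inner join Names n
on n.ID = b.BiggestID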
Hope that helps.
-Tom

select nt.RefNumber, ct.Charge, ct.IssueDate, nt.ID, nt.Forename, nt.Surname
from NameTable nt
join ChargeTable ct on (ct.RefNumber = nt.RefNumber)
where nt.ID = (select MAX(nt2.id)
from NameTable nt2
where nt2.RefNumber = nt.RefNumber)
order by nt.ID

Related

How can I join multiple tables on the same auto generated integer from one "table"?

I want to create a randomly selected, cross-joined table that auto-increments its own ID and joins on it.
Let's say my tables look like this.
Person
Firstname, Lastname
Hans | Müller
Joachim | Bugert
Address
City, Street, StreetNumber
Hamburg | Wandsbeckerstr. | 2
Berlin | Konradstraße | 13
Now I want to join the tables on an auto-generated ID, with the rows randomly selected.
The final table should look like this:
ID | Firstname | Lastname | City | Street | StreetNumber
1 | Hans | Bugert | Berlin | Wandsbeckerstr | 2
2 | Joachim | Müller | Hamburg | Konradstraße | 13
What I already tried or used:
Here I auto-generate the ID that I want to join the tables on:
select GENERATED_PERIOD_START as ID FROM SERIES_GENERATE_INTEGER(1,1,10)
The problem is that cross join and inner join aren't working for me: they either join everything with everything or don't join on the same ID.
SELECT Person."Firstname", Person."Lastname", Address."City",Address."Street", Address."StreetNumber"
FROM
( select GENERATED_PERIOD_START as ID FROM SERIES_GENERATE_INTEGER(1,1,10)
) autoGenID
inner JOIN
(select "Firstname" ,"Lastname" FROM Person ORDER BY RAND()) Person
inner JOIN
(select "City", "Street", "StreetNumber" FROM Address ORDER BY RAND()) Address
JOIN ON autoGenID."ID"=?????
Here is my problem: I can't just select random data and join it on my auto-generated ID.
Thanks for your help or ideas how to solve this!
I think you want:
SELECT p."Firstname", p."Lastname", a."City", a."Street", a."StreetNumber"
FROM (SELECT p.*,
ROW_NUMBER() OVER (ORDER BY RAND()) as seqnum
FROM Person p
) p JOIN
(SELECT a.*,
ROW_NUMBER() OVER (ORDER BY RAND()) as seqnum
FROM Address a
) a
ON p.seqnum = a.seqnum;
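A small sketch of the same idea, run with Python's sqlite3 module instead of HANA, so RANDOM() stands in for RAND() and a SQLite build with window-function support (3.25 or later) is assumed. Exposing seqnum as the ID column is an addition here. Note that this pairs whole Person rows with whole Address rows; it does not shuffle the individual columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Person  (Firstname TEXT, Lastname TEXT);
    CREATE TABLE Address (City TEXT, Street TEXT, StreetNumber INTEGER);
    INSERT INTO Person  VALUES ('Hans', 'Müller'), ('Joachim', 'Bugert');
    INSERT INTO Address VALUES ('Hamburg', 'Wandsbeckerstr.', 2),
                               ('Berlin', 'Konradstraße', 13);
""")

# Number each table's rows in an independent random order, then join on that number.
pairs = conn.execute("""
    SELECT p.seqnum AS ID, p.Firstname, p.Lastname, a.City, a.Street, a.StreetNumber
    FROM (SELECT Firstname, Lastname,
                 ROW_NUMBER() OVER (ORDER BY RANDOM()) AS seqnum
          FROM Person) p
    JOIN (SELECT City, Street, StreetNumber,
                 ROW_NUMBER() OVER (ORDER BY RANDOM()) AS seqnum
          FROM Address) a ON p.seqnum = a.seqnum
    ORDER BY p.seqnum
""").fetchall()

for row in pairs:
    print(row)
```

Which person lands on which address changes from run to run, but every person and every address appears exactly once as long as both tables have the same number of rows.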

SQL - How to select rows which have the same multiple values

I'll start with a simplified example of my table:
+-----+----------+
|Name |Teaches |
+-----+----------+
|Dave |Science |
+-----+----------+
|Dave |History |
+-----+----------+
|Alice|History |
+-----+----------+
|Alice|Literature|
+-----+----------+
|Alice|Science |
+-----+----------+
|John |History |
+-----+----------+
I'm trying to select those people who also teach the same classes as Dave. (In this case, Alice). I'm thinking of using a cursor to go through Dave's courses and selecting those people who teach the same course and intersecting the results, but I'd like to know if there is a better (simpler) way.
Here is one method:
select t.name
from t join
t td
on td.teaches = t.teaches
where td.name = 'Dave'
group by t.name
having count(*) = (select count(*) from t where t.name = 'Dave');
You need to use a self join, something like this:
SELECT a.NAME
FROM Table1 a
INNER JOIN (SELECT Teaches,
Count(*)OVER() AS cnt
FROM Table1
WHERE NAME = 'Dave') b
ON a.Teaches = b.Teaches
WHERE a.NAME <> 'Dave'
GROUP BY a.NAME,
b.cnt
HAVING Count(*) = b.cnt
Another method, using a CTE:
;WITH CTE_Parent
AS (
SELECT Name,Teaches,COUNT(*) OVER() AS Parent_Count
FROM #Teachers
WHERE Name = 'Dave'
)
SELECT T.Name
FROM #Teachers AS T
INNER JOIN CTE_Parent AS C ON C.Name <> T.Name
AND C.Teaches= T.Teaches
GROUP BY T.Name,C.Parent_Count
HAVING COUNT(*) = C.Parent_Count
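All three answers rely on the same counting idea: match each teacher's classes against Dave's, then keep the teachers whose number of matches equals the number of classes Dave teaches. A minimal sketch with Python's sqlite3 module, assuming the table is called t and contains no duplicate (name, teaches) rows:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (name TEXT, teaches TEXT);
    INSERT INTO t VALUES
        ('Dave',  'Science'), ('Dave',  'History'),
        ('Alice', 'History'), ('Alice', 'Literature'), ('Alice', 'Science'),
        ('John',  'History');
""")

# Self join on the class, restricted to Dave's classes on one side.
# A teacher qualifies when the match count equals Dave's class count.
teachers = [r[0] for r in conn.execute("""
    SELECT t.name
    FROM t
    JOIN t td ON td.teaches = t.teaches
    WHERE td.name = 'Dave' AND t.name <> 'Dave'
    GROUP BY t.name
    HAVING COUNT(*) = (SELECT COUNT(*) FROM t WHERE name = 'Dave')
""")]

print(teachers)
```

Alice matches both of Dave's classes and is returned; John matches only History and is filtered out by the HAVING clause.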

How to select the highest value after a count() | Sql Oracle

This is my query:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Which gives me this table:
NAME NUM_BOOKS
-------------------------------------------------- ----------
Dyremann 2
Nam mann 1
Thomas 1
Asgeir 1
Tullemann 5
Plantemann 1
Beste forfatter 1
Fagmann 5
Lars 1
Hans 1
Svein Arne 1
How could I easily alter the query to display only the author with the highest number of released books? (Keep in mind that I'm rather new to SQL.)
Oracle, and as far as I know - only Oracle, allows you to nest two aggregate functions.
SELECT max (f.name) keep (dense_rank last order by count (*)) as name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
In order to get ALL top authors:
select name
from (SELECT f.name,rank () over (order by count(*) desc) as rnk
from author f
JOIN book b on b.tittle = f.book
Group by f.name
)
where rnk = 1
Since Oracle 12c:
SELECT f.name
from author f
JOIN book b on b.tittle = f.book
Group by f.name
order by count (*) desc
fetch first row only /* or "fetch first row with ties" to get all top authors */
The best way to do it is to use:
SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name
Order by num_books DESC
FETCH FIRST ROW ONLY
This will order the results from biggest to smallest and return the first result.
1) Oracle-specific (using ROWNUM; for Postgres/MySQL use LIMIT):
select * from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name order by num_books desc )
where ROWNUM = 1
2) General Query for all databases :
select f.name,count(*) as max_num_books from author f
JOIN book b on b.tittle = f.book
Group by f.name
having count(*) =
(select max(num_books)
from
(SELECT f.name, COUNT(*) as num_books
from author f
JOIN book b on b.tittle = f.book
Group by f.name) t -- most databases require an alias on a derived table
);
I am not sure why you need a join in the first place. It appears that the author table has a column book - why is it not enough to count(book) from that table, grouping by name? This arrangement is very strange - the author table should only have author properties, the author name should be in the title table, but you do join on author.book = book.title which seems to suggest that you do, in fact, have that strange arrangement (and therefore you don't need a join). Also, having a table and a column (in another table) share the same name, book, is a practice best to be avoided.
The most elementary solution (not the most efficient though), in this case, is
select name, count(book) as max_num_books
from author
group by name
having count(book) = (select max(count(book)) from author group by name);
The subquery groups by name, and then it selects the max over all group counts. The outer query selects the names that have a book count equal to this maximum. The subquery returns a single row in a single column - a single value. Such a query is called a "scalar" subquery and can be used wherever a single value is needed, such as the HAVING clause of the outer query. (It's in the HAVING clause and not a WHERE clause, since it refers to group properties - count(book) - and not to individual row properties).
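The scalar-subquery pattern can be tried outside Oracle as well. The sketch below uses Python's sqlite3 module; since nesting max(count(...)) is Oracle-specific, the maximum is taken over a derived table of per-author counts instead. The author rows and book titles are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE author (name TEXT, book TEXT);
    INSERT INTO author VALUES
        ('Dyremann',  'A'), ('Dyremann',  'B'),
        ('Tullemann', 'C'), ('Tullemann', 'D'), ('Tullemann', 'E'),
        ('Fagmann',   'F'), ('Fagmann',   'G'), ('Fagmann',   'H'),
        ('Lars',      'I');
""")

# The inner query yields one count per author; MAX over it is a single value,
# so the whole subquery is scalar and can be compared against in HAVING.
top = conn.execute("""
    SELECT name, COUNT(book) AS max_num_books
    FROM author
    GROUP BY name
    HAVING COUNT(book) = (SELECT MAX(ct)
                          FROM (SELECT COUNT(book) AS ct
                                FROM author
                                GROUP BY name) AS per_author)
    ORDER BY name
""").fetchall()

print(top)
```

Both authors tied at the maximum are returned, which is the behaviour the HAVING comparison gives by construction.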
The more efficient solution is as Dudu showed:
select name, ct as max_num_books
from ( select name, count(*) as ct, rank() over (order by count(*) desc) rnk
from author
group by name
)
where rnk = 1;

Think up algorithm: Calculate full earnings for every company

I have a TreeView of companies. Every company can have subcompanies, every subcompany can have its own subcompanies, and this can continue to any depth.
Every company has its own earnings (OwnEarnings) and a FullEarnings value, which equals its own earnings plus the earnings of all its subcompanies.
I need to calculate FullEarnings for every company, given each company's own earnings.
Example:
CompanyName | OwnEarnings | FullEarnings
-Company1 | 25K$ | 53K$
--Company2 | 13K$ | 18K$
---Company3 | 5K$
--Company4 | 10K$
I have a database column ParentID, which links to the ID of the parent company.
How can I do it? Maybe by recursion?
One way or another, you will need to recursively update your table.
In your case that will look something like:
with C as
(
select T.Id,
T.Earnings,
T.Id as RootID
from T
union all
select T.Id,
T.Earnings,
C.RootID
from T
inner join C
on T.ParentId = C.Id
)
select T.Id,
T.ParentId,
T.CompanyName,
T.Earnings,
S.FullEarnings
from T
inner join (
select RootID,
sum(Earnings) as FullEarnings
from C
group by RootID
) as S
on T.Id = S.RootID
order by T.Id
option (maxrecursion 0);
SQL Fiddle example
In order to update, you will want to change out the select with an update query as shown in this SQL Fiddle example
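The recursive query can be checked against the question's four companies. This is a sketch using Python's sqlite3 module, so the syntax is WITH RECURSIVE and there is no maxrecursion option; the IDs and the column types are assumptions, and earnings are stored in thousands.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE T (Id INTEGER PRIMARY KEY, ParentId INTEGER,
                    CompanyName TEXT, Earnings INTEGER);
    INSERT INTO T VALUES
        (1, NULL, 'Company1', 25),
        (2, 1,    'Company2', 13),
        (3, 2,    'Company3', 5),
        (4, 1,    'Company4', 10);
""")

# Every company starts as its own root; the recursive step attaches each
# descendant to that root, so summing per RootID gives the full earnings.
full = dict(conn.execute("""
    WITH RECURSIVE C(Id, Earnings, RootID) AS (
        SELECT Id, Earnings, Id FROM T
        UNION ALL
        SELECT T.Id, T.Earnings, C.RootID
        FROM T
        JOIN C ON T.ParentId = C.Id
    )
    SELECT T.CompanyName, SUM(C.Earnings) AS FullEarnings
    FROM T
    JOIN C ON C.RootID = T.Id
    GROUP BY T.Id, T.CompanyName
"""))

print(full)
```

The totals for Company1 and Company2 come out as 53 and 18, matching the FullEarnings column in the question.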

SQL display two results side-by-side

I have two tables, and am doing an ordered select on each of them. I would like to see the results of both orderings in one result.
Example (simplified):
"SELECT * FROM table1 ORDER BY visits;"
name|# of visits
----+-----------
AA | 5
BB | 9
CC | 12
.
.
.
"SELECT * FROM table2 ORDER BY spent;"
name|$ spent
----+-------
AA | 20
CC | 30
BB | 50
.
.
.
I want to display the results as two columns so I can visually get a feeling if the most frequent visitors are also the best buyers. (I know this example is bad DB design and not a real scenario. It is an example)
I want to get this:
name by visits|name by spent
--------------+-------------
AA | AA
BB | CC
CC | BB
I am using SQLite.
Select A.Name as NameByVisits, B.Name as NameBySpent
From (Select C.*, RowId as RowNumber From (Select Name From Table1 Order by visits) C) A
Inner Join
(Select D.*, RowId as RowNumber From (Select Name From Table2 Order by spent) D) B
On A.RowNumber = B.RowNumber
Try this
select
ISNULL(ts.rn,tv.rn),
ts.name,
tv.name
from
(select *, (select count(*) rn from spent s where s.value>=spent.value ) rn from spent) ts
full outer join
(select *, (select count(*) rn from visits v where v.visits>=visits.visits ) rn from visits) tv
on ts.rn = tv.rn
order by ISNULL(ts.rn,tv.rn)
It creates a rank for each entry in the source table, and joins the two on their rank. If there are duplicate ranks they will return duplicates in the results.
I know it is not a direct answer, but I was searching for it so in case someone needs it: this is a simpler solution for when the results are only one per column:
select
(select roleid from role where rolename='app.roles/anon') roleid, -- the name of the subselect will be the name of the column
(select userid from users where username='pepe') userid; -- same here
Result:
roleid | userid
--------------------------------------+--------------------------------------
31aa33c4-4e66-4da3-8525-42689e46e635 | 12ad8c95-fbef-4287-9834-7458a4b250ee
For RDBMSs that support common table expressions and window functions (e.g., SQL Server, Oracle, PostgreSQL), I would use:
WITH most_visited AS
(
SELECT ROW_NUMBER() OVER (ORDER BY num_visits) AS num, name, num_visits
FROM visits
),
most_spent AS
(
SELECT ROW_NUMBER() OVER (ORDER BY amt_spent) AS num, name, amt_spent
FROM spent
)
SELECT mv.name, ms.name
FROM most_visited mv INNER JOIN most_spent ms
ON mv.num = ms.num
ORDER BY mv.num
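Recent SQLite versions (3.25 and later) also support window functions, so the same query can be tried on the question's sample data. A sketch with Python's sqlite3 module, using the table and column names this answer assumed (visits.num_visits, spent.amt_spent):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE visits (name TEXT, num_visits INTEGER);
    CREATE TABLE spent  (name TEXT, amt_spent INTEGER);
    INSERT INTO visits VALUES ('AA', 5), ('BB', 9), ('CC', 12);
    INSERT INTO spent  VALUES ('AA', 20), ('CC', 30), ('BB', 50);
""")

# Number the rows of each table in its own sort order, then pair equal positions.
side_by_side = conn.execute("""
    WITH most_visited AS (
        SELECT ROW_NUMBER() OVER (ORDER BY num_visits) AS num, name FROM visits
    ),
    most_spent AS (
        SELECT ROW_NUMBER() OVER (ORDER BY amt_spent) AS num, name FROM spent
    )
    SELECT mv.name AS name_by_visits, ms.name AS name_by_spent
    FROM most_visited mv
    JOIN most_spent ms ON mv.num = ms.num
    ORDER BY mv.num
""").fetchall()

for row in side_by_side:
    print(row)
```

The inner join drops trailing rows if one table is longer than the other; a database with FULL OUTER JOIN support could keep them.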
Just join table1 and table2 with name as the key, like below:
select a.name,
b.name,
a.NumOfVisitField,
b.TotalSpentField
from table1 a
left join table2 b on a.name = b.name