Join table on null where condition - sql

I have tables Member and Transaction. Table Member has 2 columns MemberID and MemberName. Table Transaction has 3 columns, MemberID, TransactionDate, and MemberBalance.
The rows in the tables are as shown below:
Table Member:
MemberID MemberName
=============================
1 John
2 Betty
3 Lisa
Table Transaction:
MemberID TransactionDate MemberBalance
=====================================================
1 13-12-2012 200
2 12-12-2012 90
1 10-09-2012 300
I would like to query for MemberID, MemberName and MemberBalance where the TransactionDate is the latest (max) for each MemberID.
My query is like this:
SELECT
t.MemberID, m.MemberName , t.MemberBalance
FROM
Member AS m
INNER JOIN
Transaction AS t ON m.MemberID = t.MemberID
WHERE
t.TransactionDate IN (SELECT MAX(TransactionDate)
FROM Transaction
GROUP BY MemberID)
This query returns:
MemberID MemberName MemberBalance
===================================================
1 John 200
2 Betty 90
My problem is, I want the query to return:
MemberID MemberName MemberBalance
===================================================
1 John 200
2 Betty 90
3 Lisa NULL
I want the member to be displayed even if its MemberID does not exist in the Transaction table.
How do I do this?
Thank you.

You can also use something like this:
SELECT m.MemberID, m.MemberName, t1.MemberBalance
FROM Member AS m
LEFT JOIN
(
SELECT MAX(TransactionDate) AS TransactionDate,
MemberID
FROM [Transaction]
GROUP BY MemberID
) AS t
ON m.MemberID = t.MemberID
LEFT JOIN [Transaction] AS t1
ON t.TransactionDate = t1.TransactionDate
AND t.MemberID = t1.MemberID

"member to be displayed even if its MemberID does not exist in Transaction table"
You can preserve those rows by using a LEFT JOIN from the Member table to the Transaction table.
"where the TransactionDate is the latest (max) for each MemberID."
From SQL Server 2005 onwards, a common and usually well-performing method is to use ROW_NUMBER():
SELECT MemberID, MemberName, MemberBalance
FROM (
SELECT m.MemberID, m.MemberName , t.MemberBalance,
row_number() over (partition by m.MemberID order by t.TransactionDate desc) rn
FROM Member AS m
LEFT JOIN [Transaction] AS t ON m.MemberID = t.MemberID
) X
WHERE rn=1;
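To try the ROW_NUMBER() approach without a SQL Server instance, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite (3.25 or newer, for window functions), the ISO date strings, and the double-quoted table name are assumptions made for the demo; they are not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (MemberID INTEGER, MemberName TEXT);
-- TRANSACTION is a keyword, so the table name is quoted.
CREATE TABLE "Transaction" (MemberID INTEGER, TransactionDate TEXT, MemberBalance INTEGER);
INSERT INTO Member VALUES (1, 'John'), (2, 'Betty'), (3, 'Lisa');
-- Dates rewritten as ISO strings so that text ordering matches date ordering.
INSERT INTO "Transaction" VALUES
    (1, '2012-12-13', 200),
    (2, '2012-12-12', 90),
    (1, '2012-09-10', 300);
""")

# Number each member's transactions from newest to oldest and keep row 1.
# The LEFT JOIN keeps Lisa, who has no transactions, with a NULL balance.
latest = conn.execute("""
SELECT MemberID, MemberName, MemberBalance
FROM (
    SELECT m.MemberID AS MemberID,
           m.MemberName AS MemberName,
           t.MemberBalance AS MemberBalance,
           ROW_NUMBER() OVER (PARTITION BY m.MemberID
                              ORDER BY t.TransactionDate DESC) AS rn
    FROM Member AS m
    LEFT JOIN "Transaction" AS t ON m.MemberID = t.MemberID
) AS x
WHERE rn = 1
ORDER BY MemberID
""").fetchall()

for row in latest:
    print(row)
```

Each member appears exactly once, and a member without transactions still gets a row because the join is a LEFT JOIN.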

To keep every member in the result set, you need an outer join.
Also, don't forget to correlate the inner SELECT on MemberID. Without that condition, the maximum date of one member can match a non-maximum date of another member, and that second member then passes the WHERE condition twice: once for their actual maximum date and once for the date that happens to equal someone else's maximum.
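The pitfall is easy to reproduce. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server (an assumption made for the demo) and adds one made-up row, member 1 on 2012-12-12 with balance 250, so that an older date of member 1 coincides with member 2's latest date.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Member (MemberID INTEGER, MemberName TEXT);
CREATE TABLE "Transaction" (MemberID INTEGER, TransactionDate TEXT, MemberBalance INTEGER);
INSERT INTO Member VALUES (1, 'John'), (2, 'Betty'), (3, 'Lisa');
INSERT INTO "Transaction" VALUES
    (1, '2012-12-13', 200),
    (2, '2012-12-12', 90),
    (1, '2012-09-10', 300),
    (1, '2012-12-12', 250);  -- hypothetical row, not in the question
""")

# Uncorrelated subquery: any date that is *some* member's maximum passes.
uncorrelated = conn.execute("""
SELECT m.MemberID, t.MemberBalance
FROM Member AS m
INNER JOIN "Transaction" AS t ON m.MemberID = t.MemberID
WHERE t.TransactionDate IN (SELECT MAX(TransactionDate)
                            FROM "Transaction"
                            GROUP BY MemberID)
ORDER BY m.MemberID, t.MemberBalance
""").fetchall()

# Correlated subquery inside a LEFT JOIN: only the member's own maximum passes.
correlated = conn.execute("""
SELECT m.MemberID, t.MemberBalance
FROM Member AS m
LEFT JOIN "Transaction" AS t
       ON m.MemberID = t.MemberID
      AND t.TransactionDate = (SELECT MAX(t2.TransactionDate)
                               FROM "Transaction" AS t2
                               WHERE t2.MemberID = m.MemberID)
ORDER BY m.MemberID
""").fetchall()

print(uncorrelated)
print(correlated)
```

With the extra row, the uncorrelated version returns member 1 twice (balances 200 and 250), while the correlated version returns one row per member, including member 3 with a NULL balance.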

You need to use LEFT JOIN. Your query also has a bug: the subquery is not correlated, so if an older transaction of one member falls on the same date as another member's latest transaction, you can get two rows for that member.
Try this:
SELECT m.MemberID, m.MemberName, t.MemberBalance
FROM Member AS m
LEFT JOIN [Transaction] AS t ON m.MemberID = t.MemberID AND t.TransactionDate =
(
SELECT MAX(T2.TransactionDate)
FROM [Transaction] T2
WHERE T2.MemberID = t.MemberID
)

SELECT a.MemberId,a.MemberName,a.MemberBalance
FROM
(
SELECT m.MemberId,m.MemberName,t1.MemberBalance
,ROW_NUMBER() OVER(PARTITION BY m.MemberId ORDER BY t1.TransactionDate DESC) AS RN
FROM
#Member m OUTER APPLY (SELECT t.MemberId,t.MemberBalance,t.TransactionDate
FROM #Transaction t WHERE m.MemberId=t.MemberId) t1
)a
WHERE a.RN=1

Related

Calculate variable of max amount in a group

I am having difficulty with the following exercise. I need to find how often an id is not the max_id in the group with the largest amount. Only groups that contain at least two different people should be considered.
Data comes from two different tables: max_id, user and amount come from table1 (which I will call a); id and group come from table2 (b).
From the text above, the conditions should be
(1) a.id<>b.max_id /* is not */
(2) people in group >=2
(3) a.id<> id of max amount
The dataset looks like
(a)
max_id user amount
(b)
group email
From a previous exercise, I had to compute distinct people as follows:
select distinct a.user,
a.max_id,
b.id
from table1 as a
inner join table2 as b
on b.id=a.max_id
where
b.max_id is not null
and b.time is null
No information from amount was required in the exercise above. This is the main difference between the two exercises, but the structure and fields are quite similar.
Now, I need to edit the code above to find how often an id is not the max_id in the group with the largest amount. This makes sense only if groups have at least two different persons/users.
I think I will need to join tables to get the id of max amount in a group and count people in a group, but I do not know how to do it.
Any help would be greatly appreciated. Thank you.
Data sample
max_id user amount id group email
12 1 -2000 12 house email1
312 1 0 54 work email1
11 32 -213 11 house email32
41 13 -43 78 work email13
312 53 -650 34 work email53
1 67 -532 43 defense email67
64 76 -9650 98 work email76
As I understand the exercise, and based on the code above, I should find rows where id <> max_id in groups that have more than 2 users (e.g. house, work, defense).
Then, what I need to select is id <> id of max amount.
I hope this makes it a bit clearer.
Assuming you have a query such as
select t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
you can obtain the max id for each amount using
select max(id), Amount
from (
select m.id, t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
) k
group by Amount
and you should obtain the values of id that are not equal to the max id as
select mm.id, t.User, mm.Email, mm.Model, mm.Amount
from my_table mm
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = mm.user
and t.max_amount = mm.amount
inner join (
select max(k.id) max_id, k.Amount
from (
select m.id, t.User, m.Email, m.Model, m.Amount
from my_table m
inner join (
select user, max(amount) max_amount
from my_table
group by user
) t on t.user = m.user
and t.max_amount = m.amount
) k
group by k.Amount
) kk ON kk.max_id <> mm.id
and based on your last sample the query should be
select m.*
from my_table m
inner join (
select my_group, count(distinct user) as user_count
from my_table
group by my_group
having count(distinct user) > 2
) t on t.my_group = m.my_group
and m.max_id <> m.id
PS: group is a reserved word, so I use my_group for the column name.

PARTITION BY duplicated id and JOIN with the ID with the least value

I need to JOIN two tables, hstT and hstD, in a view in SQL Server 2008. The main table contains data about employees and their "logins" (so multiple records are associated with employee x in month x), and the second table has info about their area by month. I need to join both tables, but use the earliest record as the reference for the join and apply it to the rest of the records associated with that id.
So hstT is something like:
id id2 period name
----------------------
x 1 0718 john
x 1 0818 john
y 2 0718 jane
And hstD:
id2 period area
----------------------
1 0718 sales
1 0818 hr
2 0707 mng
With an OUTER JOIN I manage to merge all data based on ID2 (user id) and the period BUT as I mentioned I need to join the other table based on the earliest record by associating ID (which I could use as criteria) so it would look like this:
id id2 period name area
---------------------------
x 1 0718 john sales
x 1 0818 john sales
y 2 0718 jane mng
I know I could use ROW_number but I don't know how to use it in a view and JOIN it on those conditions:
SELECT T.*,D.*, ROW_NUMBER() OVER (PARTITION BY T.ID ORDER BY T.PERIOD ASC) AS ORID
FROM dbo.hstT AS T LEFT OUTER JOIN
dbo.hstD AS D ON T.period = D.period AND T.id2 = D.id2
WHERE ORID = 1
--prompts error as orid doesn't exist in any table
You can use apply for this:
select t.*, d.area
from hstT t outer apply
(select top (1) d.*
from hstD d
where d.id2 = t.id2 and d.period <= t.period
order by d.period asc
) d;
Actually, if you just want the earliest period, then you can filter and join:
select t.*, d.area
from hstT t left join
(select d.*, row_number() over (partition by id2 order by period asc) as seqnum
from hstD d
) d
on d.id2 = t.id2 and d.seqnum = 1;
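As a quick check of the "earliest period" join, the sketch below loads the sample rows from the question into an in-memory SQLite database via Python's sqlite3 module. SQLite 3.25+ stands in for SQL Server 2008 here, which is an assumption made for the demo. Note that period is stored as MMYY text, so ordering by it is only chronological for data like this sample, where all rows fall in the same year per employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hstT (id TEXT, id2 INTEGER, period TEXT, name TEXT);
CREATE TABLE hstD (id2 INTEGER, period TEXT, area TEXT);
INSERT INTO hstT VALUES
    ('x', 1, '0718', 'john'),
    ('x', 1, '0818', 'john'),
    ('y', 2, '0718', 'jane');
INSERT INTO hstD VALUES
    (1, '0718', 'sales'),
    (1, '0818', 'hr'),
    (2, '0707', 'mng');
""")

# Rank each employee's hstD rows by period and join only the first one,
# so every hstT row for that employee receives the earliest area.
joined = conn.execute("""
SELECT t.id, t.id2, t.period, t.name, d.area
FROM hstT AS t
LEFT JOIN (
    SELECT id2, area,
           ROW_NUMBER() OVER (PARTITION BY id2 ORDER BY period ASC) AS seqnum
    FROM hstD
) AS d
  ON d.id2 = t.id2 AND d.seqnum = 1
ORDER BY t.id, t.period
""").fetchall()

for row in joined:
    print(row)
```

Both of john's rows come back with area sales, and jane's row with mng, which matches the desired output in the question.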

How to get the lowest ID present for each category SQL

I have the following tables in SQL:
Diagnosis: DiagnosisID, DiagnosisDescription
Member: MemberID, FirstName, LastName
DiagnosisCategoryMap: DiagnosisCategoryID, DiagnosisID
MemberDiagnosis: MemberID, DiagnosisID
What I need to do is find the diagnosis with the lowest DiagnosisID present for each member's category.
This is the sql I have so far:
SELECT MD.MemberID AS MID,
MD.DiagnosisID AS DID,
DM.DiagnosisCategoryID AS CID
FROM
MemberDiagnosis MD
INNER JOIN DiagnosisCategoryMap DM ON MD.DiagnosisID = DM.DiagnosisID
Which gives me this result set:
MID DID CID
1 2 2
1 4 3
3 3 3
3 4 3
The result set I need should look like this:
MID DID CID
1 2 2
3 3 3
What am I missing in my query?
I have tried to do a group by but that (of course) did not work out well because I could not aggregate properly for the group by.
I am using SQL SERVER and that is all I can use
Use the MIN aggregate to get the minimum DiagnosisID for each MemberID and DiagnosisCategoryID using GROUP BY:
SELECT MD.MemberID AS MID,
MIN(MD.DiagnosisID) AS DID,
DM.DiagnosisCategoryID AS CID
FROM
MemberDiagnosis MD
INNER JOIN DiagnosisCategoryMap DM ON MD.DiagnosisID = DM.DiagnosisID
GROUP BY
MD.MemberID,
DM.DiagnosisCategoryID
Break the problem down into smaller steps.
First, verify that you can get the lowest Diagnosis Id for each Member with the following:
select MemberId as MID, min(DiagnosisId) as DID
from MemberDiagnosis
group by MemberId
When you have verified that that works, join the DiagnosisCategoryMap table...
select MID, DID, DiagnosisCategoryId as CID
from
(
select MemberId as MID, min(DiagnosisId) as DID
from MemberDiagnosis
group by MemberId
) src
inner join DiagnosisCategoryMap dcm
on dcm.DiagnosisId = src.DID
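The two approaches above group differently, so it is worth running both against the rows shown in the question. The sketch below is a rough check using Python's sqlite3 module in place of SQL Server (an assumption); the table contents are reconstructed from the question's four-row result set, since the underlying tables were not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE MemberDiagnosis (MemberID INTEGER, DiagnosisID INTEGER);
CREATE TABLE DiagnosisCategoryMap (DiagnosisCategoryID INTEGER, DiagnosisID INTEGER);
-- Rows inferred from the MID/DID/CID result set in the question.
INSERT INTO MemberDiagnosis VALUES (1, 2), (1, 4), (3, 3), (3, 4);
INSERT INTO DiagnosisCategoryMap VALUES (2, 2), (3, 3), (3, 4);
""")

# Approach 1: lowest DiagnosisID per member AND category.
per_member_and_category = conn.execute("""
SELECT md.MemberID, MIN(md.DiagnosisID), dm.DiagnosisCategoryID
FROM MemberDiagnosis AS md
INNER JOIN DiagnosisCategoryMap AS dm ON md.DiagnosisID = dm.DiagnosisID
GROUP BY md.MemberID, dm.DiagnosisCategoryID
ORDER BY 1, 3
""").fetchall()

# Approach 2: lowest DiagnosisID per member, then look up its category.
per_member = conn.execute("""
SELECT src.MID, src.DID, dcm.DiagnosisCategoryID
FROM (
    SELECT MemberID AS MID, MIN(DiagnosisID) AS DID
    FROM MemberDiagnosis
    GROUP BY MemberID
) AS src
INNER JOIN DiagnosisCategoryMap AS dcm ON dcm.DiagnosisID = src.DID
ORDER BY src.MID
""").fetchall()

print(per_member_and_category)
print(per_member)
```

On this data, grouping by member and category keeps the row 1 4 3, because member 1 has diagnoses in two categories. Grouping by member only returns exactly the two rows the question asks for.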
SELECT A.MemberID AS MID, MIN(C.DiagnosisID) AS DID, C.DiagnosisCategoryID AS CID
FROM
Member A INNER JOIN MemberDiagnosis B
ON A.MemberID=B.MemberID
INNER JOIN DiagnosisCategoryMap C
ON B.DiagnosisID=C.DiagnosisID
GROUP BY A.MemberID, C.DiagnosisCategoryID;

One to many join with group by

I have two tables. one table is named Shopper and it looks like
SHOPPER_ID | SHOPPER_NAME |
-------------------------
1 | Marianna |
2 | Jason |
and another table named Order has information like Date on the order
ORDER_ID | SHOPPER_ID | DATE
----------------------------------
1 | 1 | 08/09/2012
2 | 1 | 08/08/2012
Now I want to write a query that joins the two tables and groups by SHOPPER_ID. Because one shopper can have multiple orders, I want to pick the latest order based on the DATE value.
My query looks like:
Select * from Shopper as s join Order as o
on s.SHOPPER_ID = o.SHOPPER_ID
group by s.SHOPPER_ID
The query is wrong right now because I don't know how to apply the filter to only get the latest order. Thanks in advance!
I suggest using a sub-select:
Select s.SHOPPER_ID, s.SHOPPER_NAME, o.MAX_DATE
from Shopper s
INNER join (SELECT SHOPPER_ID, MAX([DATE]) AS MAX_DATE
FROM [Order]
GROUP BY SHOPPER_ID) o
on s.SHOPPER_ID = o.SHOPPER_ID
Best of luck.
An easy way is to use ROW_NUMBER to find the latest order:
SELECT *
FROM
(SELECT S.*,
O.[ORDER_ID], O.[DATE],
ROW_NUMBER() OVER ( PARTITION BY S.SHOPPER_ID
ORDER BY [DATE] DESC) as rn
FROM Shopper S
JOIN Orders O
ON S.SHOPPER_ID = O.SHOPPER_ID
) T
WHERE rn = 1
SELECT *
FROM
Shopper s
CROSS APPLY
(
SELECT TOP 1 *
FROM
[Order] o
WHERE
s.SHOPPER_ID = o.SHOPPER_ID
ORDER BY
o.[DATE] DESC
) o;
You need a subquery to get the last order per shopper, and then join that with the shopper and order tables to get the name of the shopper and the order id
SELECT ss.SHOPPER_ID, ss.SHOPPER_NAME, oo.ORDER_ID LAST_ORDER
FROM (SELECT o.SHOPPER_ID, MAX(o.[DATE]) [DATE]
FROM Shopper s
INNER JOIN [Order] o
ON s.SHOPPER_ID = o.SHOPPER_ID
GROUP BY o.SHOPPER_ID) mo
INNER JOIN Shopper ss
ON mo.SHOPPER_ID = ss.SHOPPER_ID
INNER JOIN [Order] oo
ON mo.SHOPPER_ID = oo.SHOPPER_ID AND mo.[DATE] = oo.[DATE]
Select s.*, o1.*
From [Order] as o1
left join [Order] as o2
on (o1.SHOPPER_ID = o2.SHOPPER_ID and o1.[DATE] < o2.[DATE])
join Shopper as s
on (s.SHOPPER_ID = o1.SHOPPER_ID)
where o2.[DATE] is NULL;
Join the Order table to itself, looking for newer orders to join each row to. The "left" join means that every row in the Order table is kept in the results even if it cannot be joined to a newer order for that shopper.
The "where" discards all of the rows where a newer order was found. This leaves you with only the most recent orders.
Join those results to the Shopper table to include the shopper data.
Edit: I suggested this answer because JOINs are often faster than sub-selects, although that depends on the database and its query optimizer.
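The self-join described above can be tried out with the small sketch below, which uses Python's sqlite3 module rather than SQL Server. Several things here are assumptions made for the demo: the table is named Orders and the column ORDER_DATE to avoid reserved words, and the two sample dates are read as 9 and 8 August 2012 and stored as ISO strings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Shopper (SHOPPER_ID INTEGER, SHOPPER_NAME TEXT);
CREATE TABLE Orders (ORDER_ID INTEGER, SHOPPER_ID INTEGER, ORDER_DATE TEXT);
INSERT INTO Shopper VALUES (1, 'Marianna'), (2, 'Jason');
INSERT INTO Orders VALUES (1, 1, '2012-08-09'), (2, 1, '2012-08-08');
""")

# o2 looks for a newer order by the same shopper. Where none exists,
# the LEFT JOIN leaves o2 as NULL, which marks o1 as the latest order.
latest_orders = conn.execute("""
SELECT s.SHOPPER_ID, s.SHOPPER_NAME, o1.ORDER_ID, o1.ORDER_DATE
FROM Orders AS o1
LEFT JOIN Orders AS o2
       ON o1.SHOPPER_ID = o2.SHOPPER_ID
      AND o1.ORDER_DATE < o2.ORDER_DATE
JOIN Shopper AS s ON s.SHOPPER_ID = o1.SHOPPER_ID
WHERE o2.ORDER_ID IS NULL
ORDER BY s.SHOPPER_ID
""").fetchall()

print(latest_orders)
```

Only Marianna's order 1 is returned. Jason has no orders, so the inner join to Shopper leaves him out; starting from Shopper with a LEFT JOIN would keep him. If a shopper had two orders on the same latest date, both would be returned.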

Select Most Recent Date with Inner Join

Running into a wall when trying to pull info from tables similar to those below. Not sure how to approach this.
The results should have the most recent TRANSAMT for each ACCNUM along with NAME and address.
Select A.ACCNUM, MAX(B.TRANSAMT) as BAMT, B.ADDRESS from
From TableA A inner join TableB on A.ACCNUM = B.ACCNUM
This is what i have so far. Any help would be appreciated.
TableA
ACCNUM NAME ADDRESS
00001 R. GRANT Miami, FL
00002 B. PAUL Dallas, TX
TableB
ACCNUM TRANSAMT TRANSDATE
00001 150 1/1/2015
00001 200 13/2/2015
00002 100 2/1/205
00003 50 18/2/2015
You can use the ANSI standard row_number() function in most databases. This lets you pick the most recent row per account:
select a.accnum, a.name, b.transamt, a.address
from tableA a left join
(select b.*, row_number() over (partition by accnum order by transdate desc) as seqnum
from tableB b
) b
on a.accnum = b.accnum and b.seqnum = 1;
Note: I changed the join to a left join. This will keep all records in tableA, even those with no matches. I am not sure if that is the intention of your query.
You can use row_number to order rows per each account number by the most recent first.
select accnum, bamt, name, address
from (
select A.ACCNUM, B.TRANSAMT as BAMT, A.ADDRESS, A.NAME,
row_number() over(partition by a.accnum order by b.transdate desc) as rn
From TableA A
inner join TableB B on A.ACCNUM = B.ACCNUM
) t
where rn = 1;
Please note this will not work on MySQL versions before 8.0, which lack window functions.
This one uses no ROW_NUMBER():
with find_max as (
select ACCNUM, max(TRANSDATE) as TRANSDATE from TableB group by ACCNUM)
select A.ACCNUM, B.TRANSAMT,
find_max.TRANSDATE, A.ADDRESS, A.NAME
from TableA as A
join find_max on find_max.ACCNUM = A.ACCNUM
join TableB B on B.ACCNUM = A.ACCNUM and B.TRANSDATE = find_max.TRANSDATE
First find the max date for each ACCNUM, then join both tables to it.
This will work on most databases that support common table expressions.
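For anyone who wants to run the max-date approach, here is a minimal sketch using Python's sqlite3 module (an assumption; the question does not name a database). The dates are rewritten as ISO strings so that MAX compares them correctly as text, and the value shown as 2/1/205 in the question is assumed to mean 2 January 2015.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE TableA (ACCNUM TEXT, NAME TEXT, ADDRESS TEXT);
CREATE TABLE TableB (ACCNUM TEXT, TRANSAMT INTEGER, TRANSDATE TEXT);
INSERT INTO TableA VALUES
    ('00001', 'R. GRANT', 'Miami, FL'),
    ('00002', 'B. PAUL', 'Dallas, TX');
INSERT INTO TableB VALUES
    ('00001', 150, '2015-01-01'),
    ('00001', 200, '2015-02-13'),
    ('00002', 100, '2015-01-02'),  -- assumed reading of 2/1/205
    ('00003', 50, '2015-02-18');
""")

# Step 1 (CTE): latest TRANSDATE per account.
# Step 2: join back to TableB on account AND date to fetch that row's amount.
most_recent = conn.execute("""
WITH find_max AS (
    SELECT ACCNUM, MAX(TRANSDATE) AS TRANSDATE
    FROM TableB
    GROUP BY ACCNUM
)
SELECT a.ACCNUM, a.NAME, a.ADDRESS, b.TRANSAMT, b.TRANSDATE
FROM TableA AS a
JOIN find_max AS f ON f.ACCNUM = a.ACCNUM
JOIN TableB AS b ON b.ACCNUM = f.ACCNUM AND b.TRANSDATE = f.TRANSDATE
ORDER BY a.ACCNUM
""").fetchall()

for row in most_recent:
    print(row)
```

Account 00003 has a transaction but no row in TableA, so the inner joins drop it. The join back to TableB must match on the date as well as the account; otherwise every transaction for the account would be returned.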