SQL Complex Filter/Join Issue - sql

I'm a novice at SQL and I think this is a relatively basic query but I can't seem to get it to work.
I have two tables. One has group membership and the other details about the group. The key field between the two is Group.
Membership looks like this.
Person EffectiveDate Group
Mary 8/10/2017 A
Joe 8/05/2017 A
Peter 9/01/2017 B
Mike 9/2/2017 B
Alice 9/2/2017 B
Joe 9/10/2017 B
Pam 9/3/2017 C
Note that there are two entries for Joe because he changed groups.
GroupInformation Looks like this:
Group FullName Location Color
A Panthers New York Blue
B Steelers London Orange
C Archers Moscow Yellow
I want to run a query that, on any given day, will give me the individual's group membership along with team details.
So, I want to find the line with the MAX(EffectiveDate) in Membership for each individual person on the date run and left join the GroupInformation table on key Group
If I ran the query on 9/4 I'd get this:
Person EffectiveDate Group FullName Location Color
Mary 8/10/2017 A Panthers New York Blue
Joe 8/05/2017 A Panthers New York Blue
Peter 9/01/2017 B Steelers London Orange
Mike 9/2/2017 B Steelers London Orange
Alice 9/2/2017 B Steelers London Orange
Pam 9/3/2017 C Archers Moscow Yellow
If I ran the query on 9/13 I'd get this:
Person EffectiveDate Group FullName Location Color
Mary 8/10/2017 A Panthers New York Blue
Peter 9/01/2017 B Steelers London Orange
Mike 9/2/2017 B Steelers London Orange
Alice 9/2/2017 B Steelers London Orange
Joe 9/10/2017 B Steelers London Orange
Pam 9/3/2017 C Archers Moscow Yellow
Note that the difference between the two query results is Joe. The 9/4 run has him in Group A joining on 8/5 where the 9/13 run has him in Group B which he joined on 9/10.
My query code is as follows:
Select s.Person,
s.Group,
s.EffectiveDate,
g.FullName,
g.Location,
g.Color
From Membership s
Join GroupInformation g
on s.Group = g.Group
and s.EffectiveDate = (
Select Max(s1.EffectiveDate)
From Membership s1
where s1.Group = g.Group
and s1.EffectiveDate <= '2017-09-14')
However when I run this code I find in my actual data that it omits records. So if I have 150 records in membership the resulting query join and subquery operations will result in an answer with maybe 80 records.
Can't figure out what I'm doing wrong. Guidance please.
Thanks.

You are on the right track, but using the wrong correlation clause:
Select s.Person, s.Group, s.EffectiveDate, g.FullName, g.Location, g.Color
From Membership s Join
GroupInformation g
on s.Group = g.Group
WHERE s.EffectiveDate = (Select Max(s1.EffectiveDate)
From Membership s1
where s1.Person = s.Person and
s1.EffectiveDate <= '2017-09-14'
);
Note that group is a very poor name for a column in SQL, because it is a SQL keyword.
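To see the per-person correlation at work, here is a minimal sketch that loads the sample rows from the question into an in-memory SQLite database and runs the query for two different dates. It assumes ISO-formatted dates stored as text, and it renames the Group column to GroupCode (a hypothetical name) to sidestep the keyword problem.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- GroupCode is a hypothetical replacement for the reserved word Group.
    CREATE TABLE Membership (Person TEXT, EffectiveDate TEXT, GroupCode TEXT);
    CREATE TABLE GroupInformation (GroupCode TEXT, FullName TEXT,
                                   Location TEXT, Color TEXT);
    INSERT INTO Membership VALUES
        ('Mary',  '2017-08-10', 'A'), ('Joe',  '2017-08-05', 'A'),
        ('Peter', '2017-09-01', 'B'), ('Mike', '2017-09-02', 'B'),
        ('Alice', '2017-09-02', 'B'), ('Joe',  '2017-09-10', 'B'),
        ('Pam',   '2017-09-03', 'C');
    INSERT INTO GroupInformation VALUES
        ('A', 'Panthers', 'New York', 'Blue'),
        ('B', 'Steelers', 'London',   'Orange'),
        ('C', 'Archers',  'Moscow',   'Yellow');
""")

# The subquery is correlated on Person, so each person keeps only
# the latest membership row on or before the run date.
QUERY = """
    SELECT s.Person, s.EffectiveDate, s.GroupCode,
           g.FullName, g.Location, g.Color
    FROM Membership s
    JOIN GroupInformation g ON s.GroupCode = g.GroupCode
    WHERE s.EffectiveDate = (SELECT MAX(s1.EffectiveDate)
                             FROM Membership s1
                             WHERE s1.Person = s.Person
                               AND s1.EffectiveDate <= ?)
    ORDER BY s.Person
"""

def membership_as_of(run_date):
    """Return one row per person as of run_date (ISO 'YYYY-MM-DD')."""
    return conn.execute(QUERY, (run_date,)).fetchall()

for row in membership_as_of("2017-09-04"):
    print(row)
```

Calling membership_as_of("2017-09-13") instead should move Joe from group A to group B while leaving everyone else unchanged.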

What you need is to aggregate the membership data by person to get each person's latest effective date, then use that as a subquery and join back to it. You're basically saying "give me the max membership date of each person on or before a given date of interest." Caveat: if the EffectiveDate field is strictly a Date (rather than a DateTime), it could still return more than one row for someone who changed memberships twice on the same day (no resolution beyond the day).
I'd suggest this as a possible alternative (warning: this is hastily thrown together and not tested). Note that the subquery groups by person only; grouping by group as well would return one row per person per group:
select m.person, m.group, m.EffectiveDate, g.FullName, g.location, g.color
from (select person, max(EffectiveDate) as EffectiveDate
from Membership
where EffectiveDate <= '2017-09-14'
group by person) s
join Membership m
on m.person = s.person and m.EffectiveDate = s.EffectiveDate
join GroupInformation g
on m.group = g.group
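The same-day caveat can be illustrated with a small sketch. It assumes an in-memory SQLite database, ISO date strings, a hypothetical GroupCode column in place of Group, and one invented row in which Joe changes groups twice on 2017-09-10. The derived table yields one latest date per person, but joining back on that date matches both of Joe's same-day rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Membership (Person TEXT, EffectiveDate TEXT, GroupCode TEXT);
    INSERT INTO Membership VALUES
        ('Joe', '2017-08-05', 'A'),
        ('Joe', '2017-09-10', 'B'),
        ('Joe', '2017-09-10', 'C'),  -- hypothetical second change on the same day
        ('Pam', '2017-09-03', 'C');
""")

# Latest date per person, joined back to Membership to recover the group.
latest = conn.execute("""
    SELECT m.Person, m.GroupCode, m.EffectiveDate
    FROM (SELECT Person, MAX(EffectiveDate) AS EffectiveDate
          FROM Membership
          WHERE EffectiveDate <= '2017-09-14'
          GROUP BY Person) s
    JOIN Membership m
      ON m.Person = s.Person AND m.EffectiveDate = s.EffectiveDate
    ORDER BY m.Person, m.GroupCode
""").fetchall()

print(latest)  # Joe appears twice because both rows share the latest date
```

Storing a full timestamp, or adding some other tiebreaker column, would be one way to make the latest row unambiguous.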

Related

SQL Query: Join (or select) 2 columns from 1 table with 1 column from another table for a view without extra join columns

This is my very first Stack Overflow post, so I apologize if I am not formatting my question correctly. I'm pounding my head against the wall with what I'm sure is a simple problem. I have a table with a bunch of event information, about 10 columns, like so:
Table: event_info
date location_id lead_user_id colead_user_id attendees start end <and a few more...>
------------------------------------------------------------------------------------------------
2020-10-10 1 3 1 26 2100 2200 .
2020-10-11 3 2 4 18 0600 0700
2020-10-12 2 5 6 6 0800 0900
And another table with user information:
Table: users
user_id user_name display_name email phone city
----------------------------------------------------------------------
1 Joe S goofball ...
2 John T schmoofball ...
3 Jack U aloofball ...
4 Jim V poofball ...
5 Joy W tootball ...
6 George A boring ...
I want to create a view that has only a subset of the information, not full table joins. The event table lead_user_id and colead_user_id columns both refer to the user_id column in the users table.
I want to create a view like this:
date Location Lead Name CoLead Name attendees
---------------------------------------------------------------------
2020-10-10 1 Jack U Joe S 26
2020-10-11 3 John T Jim V 18
2020-10-12 2 Joy W George A 6
I have tried the following and several iterations like it to no avail...
SELECT
E.date, E.location,
U1.display_name AS Lead Name,
U2.display_name AS CoLead Name.
E.attendees
FROM
users U1, event_info E
INNER JOIN
event_info E ON U1.user_id = E.lead_user_id
INNER JOIN
users U2 ON U2.user_id = E.colead_user_id
And I get the dreaded
You have an error in your SQL Syntax
message. I'm not surprised, as I've really only ever used joins on single columns or nested select statements... this two columns pointing to one is throwing me for a loop. Help!
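Here is a corrected query. Column aliases that contain spaces must be quoted (backticks in MySQL), and the location column in event_info is location_id:
SELECT
E.date, E.location_id AS Location,
U1.display_name AS `Lead Name`,
(SELECT display_name FROM users WHERE user_id = E.colead_user_id) AS `CoLead Name`,
E.attendees
FROM
event_info E
INNER JOIN
users U1 ON U1.user_id = E.lead_user_id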
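Another common way to resolve two columns that both point at the same table is to join that table twice under different aliases. The sketch below tries this in an in-memory SQLite database with a trimmed-down version of the sample tables. The view name event_summary and the underscore aliases are hypothetical, and it selects user_name rather than display_name on the assumption that the desired output ("Jack U", "Joe S") comes from that column.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY,
                        user_name TEXT, display_name TEXT);
    CREATE TABLE event_info (date TEXT, location_id INTEGER,
                             lead_user_id INTEGER, colead_user_id INTEGER,
                             attendees INTEGER);
    INSERT INTO users VALUES
        (1, 'Joe S', 'goofball'),  (2, 'John T', 'schmoofball'),
        (3, 'Jack U', 'aloofball'), (4, 'Jim V', 'poofball'),
        (5, 'Joy W', 'tootball'),  (6, 'George A', 'boring');
    INSERT INTO event_info VALUES
        ('2020-10-10', 1, 3, 1, 26),
        ('2020-10-11', 3, 2, 4, 18),
        ('2020-10-12', 2, 5, 6, 6);

    -- users is joined twice: u1 resolves the lead, u2 resolves the co-lead.
    CREATE VIEW event_summary AS
    SELECT e.date,
           e.location_id AS location,
           u1.user_name  AS lead_name,
           u2.user_name  AS colead_name,
           e.attendees
    FROM event_info e
    JOIN users u1 ON u1.user_id = e.lead_user_id
    JOIN users u2 ON u2.user_id = e.colead_user_id;
""")

rows = conn.execute("SELECT * FROM event_summary ORDER BY date").fetchall()
for row in rows:
    print(row)
```

With inner joins, an event whose lead or co-lead is missing from users would drop out of the view; a LEFT JOIN on u2 would keep such events with a NULL co-lead name.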

Is this multiple join on 2 tables possible?

I have 2 tables and I am having trouble joining it to give me the desired output.
First table is called Future. It is future meetings I have.
Date Name Subject Importance Location
7/08/2020 David Work 1 London
7/08/2020 George Updates 2 New York
7/08/2020 Frank New Appointments 5 London
7/08/2020 Steph Policy 1 Paris
The second table is called Previous. It is previous meetings I have had.
Date Name Subject Importance Location Time Rating
1/08/2020 David Work 3 London 23.50 4
2/10/2018 David Emails 3 New York 18.20 3
1/08/2019 George New Appointments 5 London 55.10 2
3/04/2020 Steph Dismissal 1 Paris 33.20 5
Now what I need to do is reference my Previous table by name to see the previous meetings I have had with each person, and I want all the data from the Previous table there. I also need to limit it to showing a maximum of 5 previous meetings with each person.
Date Name Subject Importance Location Time Rating
7/08/2020 David Work 1 London - -
1/08/2020 David Work 3 London 23.50 4
2/10/2018 David Emails 3 New York 18.20 3
7/08/2020 George Updates 2 New York - -
1/08/2019 George New Appointments 5 London 55.10 2
The Name column will need to be a left join, but then I need to just do a regular join on the other columns. I'm also unsure how to limit the name results to a maximum of 5 of the same value. Thanks in advance for your help.
Basically, you want union all:
select m.*
from ((select Date, Name, Subject, Importance, Location, NULL as time, NULL as rating
from future
) union all
(select Date, Name, Subject, Importance, Location, time, rating
from previous
)
) m
order by name, date desc;
You can apply other conditions to this result. It is not clear what other conditions you really want, but this is a start.
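One such condition is the cap of 5 previous meetings per person. The sketch below is one possible way to add it, run against an in-memory SQLite database (window functions need SQLite 3.25 or later). The column names meeting_date and time_spent are hypothetical stand-ins for Date and Time, and the dates are assumed to be day/month/year in the question and are stored here as ISO strings so that they sort correctly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE future   (meeting_date TEXT, name TEXT, subject TEXT,
                           importance INTEGER, location TEXT);
    CREATE TABLE previous (meeting_date TEXT, name TEXT, subject TEXT,
                           importance INTEGER, location TEXT,
                           time_spent REAL, rating INTEGER);
    INSERT INTO future VALUES
        ('2020-08-07', 'David',  'Work',             1, 'London'),
        ('2020-08-07', 'George', 'Updates',          2, 'New York'),
        ('2020-08-07', 'Frank',  'New Appointments', 5, 'London'),
        ('2020-08-07', 'Steph',  'Policy',           1, 'Paris');
    INSERT INTO previous VALUES
        ('2020-08-01', 'David',  'Work',             3, 'London',   23.50, 4),
        ('2018-10-02', 'David',  'Emails',           3, 'New York', 18.20, 3),
        ('2019-08-01', 'George', 'New Appointments', 5, 'London',   55.10, 2),
        ('2020-04-03', 'Steph',  'Dismissal',        1, 'Paris',    33.20, 5);
""")

# Future rows get rn = 0 so they always pass the filter; previous rows are
# numbered newest-first per person and only the first 5 are kept.
QUERY = """
    SELECT meeting_date, name, subject, importance, location, time_spent, rating
    FROM (
        SELECT meeting_date, name, subject, importance, location,
               NULL AS time_spent, NULL AS rating, 0 AS rn
        FROM future
        UNION ALL
        SELECT p.meeting_date, p.name, p.subject, p.importance, p.location,
               p.time_spent, p.rating,
               ROW_NUMBER() OVER (PARTITION BY p.name
                                  ORDER BY p.meeting_date DESC)
        FROM previous p
        WHERE p.name IN (SELECT name FROM future)
    ) m
    WHERE rn <= 5
    ORDER BY name, meeting_date DESC
"""

def meetings():
    """Future meetings followed by up to 5 previous meetings per person."""
    return conn.execute(QUERY).fetchall()

for row in meetings():
    print(row)
```

Because the future meeting has the latest date, sorting by name and then by date descending places it directly above that person's history.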

Is there a way to list the most recent dates for an event based on data in other columns?

I am working on a query that shows the most recent job start date for each person in extended families within the past year (I should not show future dates). It is possible that multiple families (in multiple states) started their jobs on the same date. In that case, I need to list the state(s), both people, and the respective dates. However, I should only list each state/person pair once.
Additionally, if the person didn't start their job within the past year, I should still list the person's name, but the query should return NULL in place of the state name and NULL for the date.
Below is the data in the raw table:
LOC FAM PPL MILESTONE_ID MILESTONE_NAME START_DATE
WI Smith Mike 1 End College 9/4/2017 0:00
WI Smith Mike 2 Start Job 9/4/2017 0:00
WI Smith Bob 1 End College 6/4/2019
WI Smith Bob 2 Start Job 6/4/2019
IL Thomas Mike 1 End College 1/4/2019
IL Thomas Mike 2 Start Job 6/4/2019
IL Thomas Bob 1 End College 12/4/2019
IL Thomas Bob 2 Start Job 6/4/2019
I know that I need to use a subquery to get the most recent job start dates but my subquery isn't behaving as expected. I have also tried using a CTE but that isn't working either.
This is what I have so far. I haven't gotten the subquery to work correctly. I still need to add the NULL portion of the situation above
Select family.*
From
FAMILY.KEYINFO as family
Inner Join
(Select family.milestone_id, MAX(family.start_date) as LatestDate
from FAMILY.keyinfo
group by milestone_id) groupeddate
on family.milestone_id=groupeddate.milestone
where family.start_date<= CURRENT_TIMESTAMP
and family.start_date > DATEADD(year,-1,GETDATE())
Below is what I would expect the answer to be if the query was correct:
LOC PPL START_DATE
N/A Mike N/A
N/A Mike N/A
WI Bob 6/4/2019
IL Mike 6/4/2019
IL Bob 6/4/2019
You seem to want window functions:
select f.*
from (select f.*,
rank() over (partition by fam order by start_date desc) as seqnum
from families f
where milestone_name = 'Start Job'
) f
where seqnum = 1;
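As a rough check of how rank() behaves here, the sketch below loads the sample rows into an in-memory SQLite database (window functions need SQLite 3.25 or later) and runs the same pattern. The table name keyinfo is a stand-in, and the dates are assumed to be month/day/year in the question and are stored as ISO strings. It does not cover the past-year filter or the NULL rows asked for in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE keyinfo (loc TEXT, fam TEXT, ppl TEXT, milestone_id INTEGER,
                          milestone_name TEXT, start_date TEXT);
    INSERT INTO keyinfo VALUES
        ('WI', 'Smith',  'Mike', 1, 'End College', '2017-09-04'),
        ('WI', 'Smith',  'Mike', 2, 'Start Job',   '2017-09-04'),
        ('WI', 'Smith',  'Bob',  1, 'End College', '2019-06-04'),
        ('WI', 'Smith',  'Bob',  2, 'Start Job',   '2019-06-04'),
        ('IL', 'Thomas', 'Mike', 1, 'End College', '2019-01-04'),
        ('IL', 'Thomas', 'Mike', 2, 'Start Job',   '2019-06-04'),
        ('IL', 'Thomas', 'Bob',  1, 'End College', '2019-12-04'),
        ('IL', 'Thomas', 'Bob',  2, 'Start Job',   '2019-06-04');
""")

# RANK() gives tied rows the same number, so every person who shares the
# family's latest start date is returned, not just one of them.
latest_starts = conn.execute("""
    SELECT loc, fam, ppl, start_date
    FROM (SELECT k.*,
                 RANK() OVER (PARTITION BY fam
                              ORDER BY start_date DESC) AS seqnum
          FROM keyinfo k
          WHERE milestone_name = 'Start Job') f
    WHERE seqnum = 1
    ORDER BY fam, ppl
""").fetchall()

for row in latest_starts:
    print(row)
```

Swapping RANK() for ROW_NUMBER() would return exactly one row per family, picking arbitrarily between tied dates unless the ORDER BY includes a tiebreaker.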

Many to one merging sql

I have three tables as below:
First Table Second Table Third Table
Name PIN Id City City_id
David 1948 1 Roma 3
Susan 1245 2 Berlin 2
Jack 1578 3 New York 3
Hans 1247 2
Rose 8745 1
I want to merge first and second table according to third table. Result will be: Person
Name PIN City
David 1948 New York
Susan 1245 Berlin
Jack 1578 New York
Hans 1247 Berlin
Rose 8745 Roma
First, I can merge the second and third tables and then merge the result with the first table, but I want to solve this problem without an intermediate table. How can I handle this? How can I combine the first table's rows, in sequence, with a specified row in the second table according to the third table?
You would need a fourth table, PersonCity, with PersonID and CityID to link together. Think of relational databases like a grid (spreadsheet, roads). If you're going North and the street you want to get on is parallel (think |^| |^|) you're gonna need to use a different road that links the two. Currently, you have no such path.
The short answer is that your tables are not adequate for the task, what you need is along the lines of:
Table_1 Table_2 Table_3
Id Name PIN Id City Name_id City_id
1 David 1948 1 Roma 1 3
2 Susan 1245 2 Berlin 2 2
3 Jack 1578 3 New York 3 3
4 Hans 1247 4 2
5 Rose 8745 5 1
Then you can write your query as follows:
SELECT T1.Name, T1.PIN, T2.City
FROM Table_1 T1 LEFT JOIN Table_3 T3 ON T1.Id = T3.Name_id
LEFT JOIN Table_2 T2 ON T3.City_id = T2.Id
ORDER BY T1.Name
Or you could ORDER BY City, name
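For anyone who wants to try the restructured tables, here is a minimal sketch using an in-memory SQLite database. It follows the Table_1, Table_2 and Table_3 layout above; the column types are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_1 (Id INTEGER PRIMARY KEY, Name TEXT, PIN INTEGER);
    CREATE TABLE Table_2 (Id INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE Table_3 (Name_id INTEGER, City_id INTEGER);  -- link table
    INSERT INTO Table_1 VALUES
        (1, 'David', 1948), (2, 'Susan', 1245), (3, 'Jack', 1578),
        (4, 'Hans', 1247),  (5, 'Rose', 8745);
    INSERT INTO Table_2 VALUES (1, 'Roma'), (2, 'Berlin'), (3, 'New York');
    INSERT INTO Table_3 VALUES (1, 3), (2, 2), (3, 3), (4, 2), (5, 1);
""")

# Person -> link row -> city; LEFT JOINs keep people with no city assigned.
people = conn.execute("""
    SELECT T1.Name, T1.PIN, T2.City
    FROM Table_1 T1
    LEFT JOIN Table_3 T3 ON T1.Id = T3.Name_id
    LEFT JOIN Table_2 T2 ON T3.City_id = T2.Id
    ORDER BY T1.Name
""").fetchall()

for row in people:
    print(row)
```

Because the link is now stored explicitly in Table_3, the result no longer depends on the order in which the database happens to return rows.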
I have good news and bad news.
The good news: given the tables the way they were originally specified, in Oracle, this will give you something that looks like what you are asking for:
---
--- Pay attention, This looks right but it is not!
---
select name,pin,city from
( select name,pin,rownum rn from first ) a,
( select city,id from second) b,
( select id,rownum rn from third ) c
where
a.rn=c.rn AND
b.id=c.id;
NAME PIN CITY
-------------------- ---- --------------------
Rose 8745 Roma
Susan 1245 Berlin
Hans 1247 Berlin
David 1948 New York
Jack 1578 New York
The bad news is this does not really work and is cheating. You will get results but they may not be what you would expect and they won't necessarily be consistent.
The database orders records in its own order. If you don't specify an order by clause, you get what they give you, which may not be what you want. This is cheating because Oracle does not REALLY support using rownum in this way because you can't bet on what you will get. This won't work in most other databases.
The only correct way is what @daShier gave, where you have to add something, say, an ID, to allow connecting the rows in the order you want.

Limiting records of combinations from 2 columns

Looking for some help limiting the results while querying combinations between 2 columns. Here's an example of the kind of table I am working with:
id name group state
1 Bob A NY
2 Jim A NY
3 Dan A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
8 Glen B FL
9 Ben C FL
I am trying to display all records of all combinations of "group" and "state", but limiting to displaying only up to 2 records for each combination. The result should look like the following..
id name group state
1 Bob A NY
2 Jim A NY
4 Mike A FL
5 Tim B NY
6 Sam B FL
7 Brad B FL
9 Ben C FL
Thanks for the help.
Assuming you always want the two rows for each group and state combination with the lowest id
SELECT *
FROM (SELECT a.*,
row_number() over (partition by group, state
order by id asc) rnk
FROM your_table a)
WHERE rnk <= 2
Of course, since group is a reserved word, I assume your column is actually named something else... You'd need to adjust my query to use the correct column name.
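Here is a small sketch of the same idea against the sample data, using an in-memory SQLite database (window functions need SQLite 3.25 or later). The table name people and the column name grp are hypothetical; grp stands in for the reserved word group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- grp is a hypothetical column name standing in for "group".
    CREATE TABLE people (id INTEGER PRIMARY KEY, name TEXT, grp TEXT, state TEXT);
    INSERT INTO people VALUES
        (1, 'Bob',  'A', 'NY'), (2, 'Jim',  'A', 'NY'), (3, 'Dan',  'A', 'NY'),
        (4, 'Mike', 'A', 'FL'), (5, 'Tim',  'B', 'NY'), (6, 'Sam',  'B', 'FL'),
        (7, 'Brad', 'B', 'FL'), (8, 'Glen', 'B', 'FL'), (9, 'Ben',  'C', 'FL');
""")

# Number rows within each (grp, state) pair by ascending id and keep
# the first two of each pair.
limited = conn.execute("""
    SELECT id, name, grp, state
    FROM (SELECT p.*,
                 ROW_NUMBER() OVER (PARTITION BY grp, state
                                    ORDER BY id ASC) AS rnk
          FROM people p) ranked
    WHERE rnk <= 2
    ORDER BY id
""").fetchall()

for row in limited:
    print(row)
```

Note the alias on the derived table (ranked): Oracle accepts an unaliased subquery in FROM, but several other databases require one.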