How can I query pairwise event attendance in MS Access given its subquery restrictions? - sql

My colleague and I have been racking our brains on this for days and have come to the conclusion that we just don't know SQL well enough to solve this problem. Please help us!
We have a table in MS Access called EventAttendance with 2 fields: MemberID (the person who attended the event) and EventID (the event they attended). Unique Member and Event IDs are stored in their own tables (Member and Event, respectively.) EventAttendance contains entries for hundreds of members and events, but a simple collection of records for 3 events might look like this:
| MemberID | EventID |
|----------|---------|
| 1        | 1       |
| 2        | 1       |
| 4        | 1       |
| 1        | 2       |
| 2        | 2       |
| 3        | 2       |
| 1        | 3       |
| 3        | 3       |
| 4        | 3       |
In this example, Member 1 attended Events 1, 2, & 3; Member 2 attended Events 1 & 2; Member 3 attended Events 2 & 3; and Member 4 attended Events 1 & 3.
We are now trying to create a table that documents how often each pair of members attended an event together. Ideally, we would want the table to look like the one below (for the example records above), where CoAttendance is the number of events each pair attended together:
| Member1 | Member2 | CoAttendance |
|---------|---------|--------------|
| 1       | 2       | 2            |
| 1       | 3       | 2            |
| 1       | 4       | 2            |
| 2       | 3       | 1            |
| 2       | 4       | 1            |
| 3       | 4       | 1            |
This has proven a lot more challenging than we assumed. First, we haven't been able to figure out how to get pairs in such a way that combinations don't repeat, but we have been able to list all permutations by querying a perfect copy of the Member table (MemberClone) with the following query:
SELECT Member.MemberID AS Member1, MemberClone.MemberID AS Member2
FROM Member, MemberClone;
This query resulted in a table like so:
| Member1 | Member2 |
|---------|---------|
| 1       | 1       |
| 1       | 2       |
| 1       | 3       |
| 1       | 4       |
| 2       | 1       |
| 2       | 2       |
| 2       | 3       |
| 2       | 4       |
| 3       | 1       |
| 3       | 2       |
| 3       | 3       |
| 3       | 4       |
| 4       | 1       |
| 4       | 2       |
| 4       | 3       |
| 4       | 4       |
Not perfect, but good enough (this is a problem we'd like solved eventually, but not the main one.)
The bigger problem is getting the third column (CoAttendance) to work. The closest we've gotten to a solution is through the use of subqueries:
SELECT Member.MemberID AS Member1, MemberClone.MemberID AS Member2,
(SELECT Count(*)
FROM (SELECT EventID FROM EventAttendance WHERE EventAttendance.MemberID = Member1) AS Member1Attendance
INNER JOIN (SELECT EventID FROM EventAttendance WHERE EventAttendance.MemberID = Member2) AS Member2Attendance
ON Member1Attendance.EventID = Member2Attendance.EventID) AS CoAttendance
FROM MemberClone, Member;
This should theoretically generate a list of only events that Member1 and Member2 attended together for each pair, and the count(*) operation would count those events.
The problem is that Access subqueries can only see one level above themselves, so Member1 and Member2 are undefined in the nested subqueries (they are defined 2 levels above.) I've tried finding solutions, but find that I just don't understand SQL enough to process similar solutions posted elsewhere (e.g., Nested subquery in Access alias causing "enter parameter value") while also making the inner join work.
Any help you can offer would be super appreciated!

I think you just want a self-join with aggregation:
select ea1.memberid, ea2.memberid, count(*) as num_events
from EventAttendance as ea1 inner join
EventAttendance as ea2
on ea1.eventid = ea2.eventid and ea1.memberid < ea2.memberid
group by ea1.memberid, ea2.memberid;
MS Access might be finicky about the < in the on clause. Because this is an inner join, you can move that condition into a where clause without changing the result:
select ea1.memberid, ea2.memberid, count(*) as num_events
from EventAttendance as ea1 inner join
EventAttendance as ea2
on ea1.eventid = ea2.eventid
where ea1.memberid < ea2.memberid
group by ea1.memberid, ea2.memberid;
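To check the self-join against the sample rows from the question, here is a minimal sketch that runs the second form of the query in SQLite through Python's sqlite3 module. SQLite only stands in for Access here (an assumption made so the example can be run anywhere); the column aliases Member1, Member2 and CoAttendance and the ORDER BY are additions for readability and are not part of the query above.

```python
import sqlite3

# In-memory SQLite database standing in for the Access table (assumption:
# the query only uses standard SQL, so the logic carries over).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE EventAttendance (MemberID INTEGER, EventID INTEGER)")
conn.executemany(
    "INSERT INTO EventAttendance VALUES (?, ?)",
    [(1, 1), (2, 1), (4, 1), (1, 2), (2, 2), (3, 2), (1, 3), (3, 3), (4, 3)],
)

pairs = conn.execute(
    """
    SELECT ea1.MemberID AS Member1, ea2.MemberID AS Member2, COUNT(*) AS CoAttendance
    FROM EventAttendance AS ea1
    INNER JOIN EventAttendance AS ea2 ON ea1.EventID = ea2.EventID
    WHERE ea1.MemberID < ea2.MemberID   -- keeps each pair once and drops self-pairs
    GROUP BY ea1.MemberID, ea2.MemberID
    ORDER BY Member1, Member2
    """
).fetchall()

for row in pairs:
    print(row)
```

Pairs that never attended an event together do not show up in this result, because the inner join produces no row for them.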

This requires matching every attendee of an event with every other attendee of the same event (so a self-join on EventAttendance).
The SQL for this is below:
SELECT Member.ID AS Member1, Member_1.ID AS Member2, Count(EventAttendance_1.Event_ID) AS CountOfEvent_ID
FROM (Member INNER JOIN (EventAttendance INNER JOIN EventAttendance AS EventAttendance_1 ON EventAttendance.Event_ID = EventAttendance_1.Event_ID) ON Member.ID = EventAttendance.Member_ID) INNER JOIN Member AS Member_1 ON EventAttendance_1.Member_ID = Member_1.ID
GROUP BY Member.ID, Member_1.ID
HAVING (((Member_1.ID)>[Member].[ID]));
That is the SQL provided by Access for the following setup:
Note that you do not necessarily need links to Member or Member_1 - just that you may want to get info about them other than their ID.
---- Update
Gah, this is basically the same answer as @Gordon's above (so I've upvoted his). I've not deleted this one in case you find the picture useful (when using Access, I much prefer the GUI query designer).

Related

Make a query making groups on the same result row

I have two tables. Like this.
select * from extrafieldvalues;
+----------------------------+
| id | value | type | idItem |
+----------------------------+
| 1 | 100 | 1 | 10 |
| 2 | 150 | 2 | 10 |
| 3 | 101 | 1 | 11 |
| 4 | 90 | 2 | 11 |
+----------------------------+
select * from items
+------------+
| id | name |
+------------+
| 10 | foo |
| 11 | bar |
+------------+
I need to make a query and get something like this:
+--------------------------------------+
| idItem | valtype1 | valtype2 | name |
+--------------------------------------+
| 10 | 100 | 150 | foo |
| 11 | 101 | 90 | bar |
+--------------------------------------+
The quantity of types of extra field values is variable, but every item ALWAYS uses every extra field.
If you have only two fields, then left join is an option for this:
select i.*, efv1.value as value_1, efv2.value as value_2
from items i left join
extrafieldvalues efv1
on efv1.iditem = i.id and
efv1.type = 1 left join
extrafieldvalues efv2
on efv2.iditem = i.id and
efv2.type = 2 ;
In terms of performance, two joins are probably faster than an aggregation, and they make it easier to bring in more columns from items. On the other hand, conditional aggregation generalizes more easily, and its performance changes little as more columns from extrafieldvalues are added to the select.
Use conditional aggregation
select iditem,
max(case when type=1 then value end) as valtype1,
max(case when type=2 then value end) as valtype2,name
from extrafieldvalues a inner join items b on a.iditem=b.id
group by iditem,name
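As a quick check of the conditional aggregation, the sketch below loads the two sample tables into an in-memory SQLite database via Python's sqlite3 module and runs the same query. SQLite is only an assumption for testing (the question does not name a DBMS), and the ORDER BY is added so the output order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE items (id INTEGER, name TEXT);
    CREATE TABLE extrafieldvalues (id INTEGER, value INTEGER, type INTEGER, idItem INTEGER);
    INSERT INTO items VALUES (10, 'foo'), (11, 'bar');
    INSERT INTO extrafieldvalues VALUES
        (1, 100, 1, 10), (2, 150, 2, 10), (3, 101, 1, 11), (4, 90, 2, 11);
    """
)

# One output row per item: each CASE picks the value for one type and yields
# NULL otherwise, and MAX collapses the group to the single non-NULL value.
rows = conn.execute(
    """
    SELECT a.idItem,
           MAX(CASE WHEN a.type = 1 THEN a.value END) AS valtype1,
           MAX(CASE WHEN a.type = 2 THEN a.value END) AS valtype2,
           b.name
    FROM extrafieldvalues AS a
    INNER JOIN items AS b ON a.idItem = b.id
    GROUP BY a.idItem, b.name
    ORDER BY a.idItem
    """
).fetchall()

for row in rows:
    print(row)
```

Supporting a third type means adding one more MAX(CASE ...) column, which is why this form generalizes more easily than adding a join per type.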

Trying to join a table of individuals to a table of couples, give a family ID and not time out the server

I have one table with fake individual tax records like so (one row per filer):
T1:
+-------+---------+---------+
| Person| Spouse | Income |
+-------+---------+---------+
| 1 | 2 | 34000 |
| 2 | 1 | 10000 |
| 3 | NULL | 97000 |
| 4 | 6 | 11000 |
| 5 | NULL | 25000 |
| 6 | 4 | 100000 |
+-------+---------+---------+
I have a second table which has tax 'families', a single individual or married couple (one line per tax 'family').
T1_Family:
+----------+-------+---------+
| Family_id| Person| Spouse |
+----------+-------+---------+
| 2 | 2 | 1 |
| 3 | 3 | NULL |
| 5 | 5 | NULL |
| 6 | 6 | 4 |
+----------+-------+---------+
Family = max(Person) within a couple
The idea of joining the two is for example, to sum the income of 2 people in one tax family (aggregate to the family level).
So, I've tried the following:
select *
into family_table
from
(
(select * from T1_family)a
join
(select * from T1)b
on a.family = b.person **or a.spouse = b.person**
)
where family_id is not null and person is not null
What I should get (and I do get when I select 1 random couple) is one line per individual where I can then group by family_id and sum income, pension contributions, etc. BUT SQL times out before the tables can be joined. The part in bold is what's slowing down the process but I'm not sure what else to do.
Is there an easier way to group by family?
It is simpler to put the data on one row:
select a.*, p.income as person_income, s.income as spouse_income
into family_table
from t1_family a left join
t1 p
on a.person = p.person left join
t1 s
on a.spouse = s.person;
Of course, you can add them together as well.
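Here is a rough sketch of the one-row-per-family approach with the incomes added together, run against the sample rows in SQLite through Python's sqlite3 module. Several things are assumptions for the sake of a runnable example: SQLite instead of the asker's server, a plain SELECT instead of SELECT ... INTO (which SQLite does not support), and COALESCE to treat a missing spouse as zero income.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE T1 (Person INTEGER, Spouse INTEGER, Income INTEGER);
    CREATE TABLE T1_Family (Family_id INTEGER, Person INTEGER, Spouse INTEGER);
    INSERT INTO T1 VALUES
        (1, 2, 34000), (2, 1, 10000), (3, NULL, 97000),
        (4, 6, 11000), (5, NULL, 25000), (6, 4, 100000);
    INSERT INTO T1_Family VALUES (2, 2, 1), (3, 3, NULL), (5, 5, NULL), (6, 6, 4);
    """
)

# Two equality joins replace the single OR join: one finds the person's
# record, the other the spouse's record (NULL when there is no spouse).
families = conn.execute(
    """
    SELECT a.Family_id,
           p.Income AS person_income,
           s.Income AS spouse_income,
           p.Income + COALESCE(s.Income, 0) AS family_income
    FROM T1_Family AS a
    LEFT JOIN T1 AS p ON a.Person = p.Person
    LEFT JOIN T1 AS s ON a.Spouse = s.Person
    ORDER BY a.Family_id
    """
).fetchall()

for row in families:
    print(row)
```

Each join condition is a simple equality on one column, so an index on T1.Person can serve both lookups, which is generally much easier for an optimizer than an OR across two columns.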

What Clause would most optimally create this query?

So I don't have much experience with SQL and am trying to learn. I came across the following interview question. Maybe I'm missing a piece of information needed to solve it, or maybe I'm approaching the problem wrong.
This is the question:
We have following two tables , below is their info:
POLICY (id as int, policy_content as varchar2)
POLICY_VOTES (vote as boolean, policy_id as int)
Write a single query that returns the policy_id, number of yes(true) votes and number of no(false) votes with a row for each policy up for a vote stored
My first thought when approaching this was to use a WITH clause to get the policy_ids and use an inner join to get the votes for yes and no but I can't find a way to make it work, which is what leads me to believe that there's another clause in SQL I'm not aware of or couldn't find that would make it easier. Either that or I'm thinking of the problem in the wrong way.
Good question.
I cannot answer too specifically, since you did not specify a DBMS, but what you will want to do is count, or conditionally sum, based on criteria. When you use an aggregate function like that alongside a non-aggregated column, you also need GROUP BY.
Here are two example tables I made with test data:
policy
| id | policy_content |
|----|----------------|
| 1 | foo |
| 2 | foo |
| 3 | foo |
| 4 | foo |
| 5 | foo |
policy_votes
| vote | policy_id |
|------|-----------|
| yes | 1 |
| no | 1 |
| yes | 2 |
| yes | 2 |
| no | 3 |
| no | 3 |
| no | 4 |
| yes | 4 |
| yes | 5 |
| yes | 5 |
Using the below query:
SELECT
policy_votes.policy_id,
SUM(CASE WHEN vote = 'yes' THEN 1 ELSE 0 END) AS yes_votes,
SUM(CASE WHEN vote = 'no' THEN 1 ELSE 0 END) AS no_votes
FROM
policy_votes
GROUP BY
policy_votes.policy_id
You get:
| POLICY_ID | YES_VOTES | NO_VOTES |
|-----------|-----------|----------|
| 1 | 1 | 1 |
| 2 | 2 | 0 |
| 4 | 1 | 1 |
| 5 | 2 | 0 |
| 3 | 0 | 2 |
Here is an SQL Fiddle for you to try it out.
Try this:
select p.id, p.policy_content,
Count(case when pv.vote='true' then 1 end) as number_of_yes,
Count(case when pv.vote='false' then 1 end) as number_of_no
From policy p join policy_votes pv
On(p.id = pv.policy_id)
Group by p.id, p.policy_content
Cheers!!
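If "a row for each policy" is meant to include policies that have no votes yet, a LEFT JOIN from policy is one way to keep them. The sketch below is only an illustration under assumptions: it uses SQLite through Python's sqlite3 module, stores the boolean vote as 1/0, reuses the test data from the first answer, and adds a hypothetical policy 6 with no votes to show the difference.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE policy (id INTEGER, policy_content TEXT);
    CREATE TABLE policy_votes (vote INTEGER, policy_id INTEGER);  -- 1 = yes, 0 = no
    INSERT INTO policy VALUES
        (1, 'foo'), (2, 'foo'), (3, 'foo'), (4, 'foo'), (5, 'foo'),
        (6, 'foo');  -- hypothetical policy with no votes
    INSERT INTO policy_votes VALUES
        (1, 1), (0, 1), (1, 2), (1, 2), (0, 3),
        (0, 3), (0, 4), (1, 4), (1, 5), (1, 5);
    """
)

# LEFT JOIN keeps policies without votes; for those, pv.vote is NULL, so both
# CASE expressions fall through to ELSE 0 and the sums come out as 0.
tally = conn.execute(
    """
    SELECT p.id AS policy_id,
           SUM(CASE WHEN pv.vote = 1 THEN 1 ELSE 0 END) AS yes_votes,
           SUM(CASE WHEN pv.vote = 0 THEN 1 ELSE 0 END) AS no_votes
    FROM policy AS p
    LEFT JOIN policy_votes AS pv ON pv.policy_id = p.id
    GROUP BY p.id
    ORDER BY p.id
    """
).fetchall()

for row in tally:
    print(row)
```

With an inner join, or when selecting from policy_votes alone, policy 6 would simply be missing from the output.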

Create a pivot table from two tables based on dates

I have two MS Access tables sharing a one to many relationship. Their structures are like the following:
tbl_Persons
+----------+------------+-----------+
| PersonID | PersonName | OtherData |
+----------+------------+-----------+
| 1 | PersonA | etc. |
| 2 | PersonB | |
| 3 | PersonC | |
tbl_Visits
+----------+------------+------------+-----------------------
| VisitID | PersonID | VisitDate | dozens of other fields
+----------+------------+------------+-----------
| 1 | 1 | 09/01/13 |
| 2 | 1 | 09/02/13 |
| 3 | 2 | 09/03/13 |
| 4 | 2 | 09/04/13 | etc...
I wish to create a new table based on the VisitDate field, the column headings of which are Visit-n where n is 1 to the number of visits, Visit-n-Data1, Visit-n-Data2, Visit-n-Data3 etc.
MergedTable
+----------+----------+---------------+-----------------+----------+----------------+
| PersonID | Visit1 | Visit1Data1 | Visit1Data2... | Visit2 | Visit2Data1... |
+----------+----------+---------------+-----------
| 1 | 09/01/13 | | | 09/02/13 |
| 2 | 09/03/13 | | | 09/04/13 |
| 3 | etc. | |
I am really not sure how to do this, whether with an SQL query or by using DAO and looping through records and columns. It is essential that there is only one PersonID per row and that all of that person's data appears chronologically in columns.
Start off by ranking the visits with something like
SELECT PersonID, VisitID,
(SELECT COUNT(VisitID) FROM tbl_Visits AS C
WHERE C.PersonID = tbl_Visits.PersonID
AND C.VisitDate < tbl_Visits.VisitDate) AS RankNumber
FROM tbl_Visits
Use this query as a base for the 'pivot'
If a person can have more than one visit on the same day, the WHERE clause needs to be a bit more sophisticated (for example, breaking ties on VisitID). But I hope you get the basic concept.
Pivoting can be done with multiple LEFT JOINs.
I have not tested this, so I can't vouch for its performance. This is easier to accomplish in SQL Server than in MS Access.
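The rank-then-join idea can be sketched as follows for the first two visits. This is an illustration under assumptions, not tested in Access: it runs in SQLite through Python's sqlite3 module, stores dates as ISO strings so that < compares them chronologically, starts the rank at 1, breaks same-day ties on VisitID, and uses a WITH clause for the ranking step. As far as I know Access has no WITH clause, so there the ranking SELECT would be saved as its own query and joined by name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tbl_Persons (PersonID INTEGER, PersonName TEXT);
    CREATE TABLE tbl_Visits (VisitID INTEGER, PersonID INTEGER, VisitDate TEXT);
    INSERT INTO tbl_Persons VALUES (1, 'PersonA'), (2, 'PersonB'), (3, 'PersonC');
    INSERT INTO tbl_Visits VALUES
        (1, 1, '2013-09-01'), (2, 1, '2013-09-02'),
        (3, 2, '2013-09-03'), (4, 2, '2013-09-04');
    """
)

merged = conn.execute(
    """
    WITH ranked AS (
        SELECT v.PersonID, v.VisitID, v.VisitDate,
               -- rank = number of earlier visits by the same person, plus 1
               (SELECT COUNT(*) FROM tbl_Visits AS c
                WHERE c.PersonID = v.PersonID
                  AND (c.VisitDate < v.VisitDate
                       OR (c.VisitDate = v.VisitDate AND c.VisitID < v.VisitID))
               ) + 1 AS RankNumber
        FROM tbl_Visits AS v
    )
    SELECT p.PersonID, v1.VisitDate AS Visit1, v2.VisitDate AS Visit2
    FROM tbl_Persons AS p
    LEFT JOIN ranked AS v1 ON v1.PersonID = p.PersonID AND v1.RankNumber = 1
    LEFT JOIN ranked AS v2 ON v2.PersonID = p.PersonID AND v2.RankNumber = 2
    ORDER BY p.PersonID
    """
).fetchall()

for row in merged:
    print(row)
```

Each additional visit column needs one more LEFT JOIN on the next RankNumber, so the number of visits per person has to be capped in advance (or the SQL generated in code).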

Getting Sum of MasterTable's amount which joins to DetailTable

I have two tables:
1. Master
| ID | Name | Amount |
|-----|--------|--------|
| 1 | a | 5000 |
| 2 | b | 10000 |
| 3 | c | 5000 |
| 4 | d | 8000 |
2. Detail
| ID |MasterID| PID | Qty |
|-----|--------|-------|------|
| 1 | 1 | 1 | 10 |
| 2 | 1 | 2 | 20 |
| 3 | 2 | 2 | 60 |
| 4 | 2 | 3 | 10 |
| 5 | 3 | 4 | 100 |
| 6 | 4 | 1 | 20 |
| 7 | 4 | 3 | 40 |
I want to select sum(Amount) from Master which joins to Detail where Detail.PID in (1,2,3)
So I execute the following query:
SELECT SUM(Amount) FROM Master M INNER JOIN Detail D ON M.ID = D.MasterID WHERE D.PID IN (1,2,3)
Result should be 20000. But I am getting 40000
See this fiddle. Any suggestion?
You are getting exactly double the amount because each matching Master row joins to two Detail rows whose PID is in the WHERE clause's list, so its Amount is summed twice.
See demo
Use
SELECT SUM(Amount)
FROM Master M
WHERE M.ID IN (
SELECT DISTINCT MasterID
FROM DETAIL
WHERE PID IN (1,2,3) )
Why do you need to join the Master table with Detail at all when all the columns you select are in the Master table?
Also, isn't there any FK relationship defined on these tables? Looking at your data, it seems to me that there should be an FK on the Detail table for MasterID. If that is the case, then you do not need to join the tables at all.
Also, if there is no FK relationship and you want to make sure that Detail records exist for the Master records you are summing, then you could try EXISTS instead of a join.
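For comparison, here is a small sketch of the EXISTS variant next to the original join, run on the rows printed in the question. It uses SQLite through Python's sqlite3 module, which is an assumption made only so the example is runnable; the question does not say which DBMS the fiddle used.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Master (ID INTEGER, Name TEXT, Amount INTEGER);
    CREATE TABLE Detail (ID INTEGER, MasterID INTEGER, PID INTEGER, Qty INTEGER);
    INSERT INTO Master VALUES (1, 'a', 5000), (2, 'b', 10000), (3, 'c', 5000), (4, 'd', 8000);
    INSERT INTO Detail VALUES
        (1, 1, 1, 10), (2, 1, 2, 20), (3, 2, 2, 60), (4, 2, 3, 10),
        (5, 3, 4, 100), (6, 4, 1, 20), (7, 4, 3, 40);
    """
)

# Original join: a Master row is repeated once per matching Detail row.
join_total = conn.execute(
    """
    SELECT SUM(M.Amount)
    FROM Master AS M
    INNER JOIN Detail AS D ON M.ID = D.MasterID
    WHERE D.PID IN (1, 2, 3)
    """
).fetchone()[0]

# EXISTS: each Master row is counted at most once, however many
# Detail rows match it.
exists_total = conn.execute(
    """
    SELECT SUM(M.Amount)
    FROM Master AS M
    WHERE EXISTS (SELECT 1 FROM Detail AS D
                  WHERE D.MasterID = M.ID AND D.PID IN (1, 2, 3))
    """
).fetchone()[0]

print(join_total, exists_total)
```

On the rows exactly as printed here the two totals come out as 46000 and 23000 rather than the 40000 and 20000 quoted in the question, so the fiddle data presumably differed slightly; the doubling effect is the same.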