How to count total crossover from two tables each with specific conditions - sql

I am working from two tables in a dataset. Let's call the first one 'Demographic_Info', the other 'Study_Info'. The two tables both have a Subject_ID column. How can I run a query that will return all of the Subject_IDs where Sex = Male (from Demographic_Info) but also where the Study Case = Case (from Study_Info)?
Is this an inner join? Do I need to make a combined table?
I just don't know what function to use. I know how to select for each of these conditions in each table individually, but not how to run them against each other.

Yes, you will want to inner join and then use the where clause to filter on both tables.
select
s.Subject_ID
from `Study_Info` s
inner join `Demographic_Info` d on s.Subject_ID = d.Subject_ID
where d.Sex = 'Male'
and s.Study_Case = 'Case' -- the actual column name is unclear from your question
The aliases s and d make it clear which table each field comes from, which is required when the same field name occurs in both tables (as Subject_ID does here).
Alternatively, you could filter each table first and then perform the join.
with study as (select * from `Study_Info` where Study_Case = 'Case'),
demographics as (select * from `Demographic_Info` where Sex = 'Male')
select s.Subject_ID
from study s
inner join demographics d on s.Subject_ID = d.Subject_ID
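To check that the two forms above select the same subjects, here is a minimal sketch that runs both against an in-memory SQLite database from Python. The sample rows and the column name Study_Case are assumptions made for illustration, and the final COUNT(DISTINCT ...) query is one possible way to get the total the title asks about.

```python
import sqlite3

# In-memory database with made-up sample rows (assumed schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Demographic_Info (Subject_ID INTEGER PRIMARY KEY, Sex TEXT);
    CREATE TABLE Study_Info (Subject_ID INTEGER, Study_Case TEXT);
    INSERT INTO Demographic_Info VALUES (1, 'Male'), (2, 'Female'), (3, 'Male'), (4, 'Male');
    INSERT INTO Study_Info VALUES (1, 'Case'), (2, 'Case'), (3, 'Control'), (4, 'Case');
""")

# Form 1: join first, then filter both tables in the WHERE clause.
join_then_filter = """
    SELECT s.Subject_ID
    FROM Study_Info s
    INNER JOIN Demographic_Info d ON s.Subject_ID = d.Subject_ID
    WHERE d.Sex = 'Male' AND s.Study_Case = 'Case'
    ORDER BY s.Subject_ID
"""

# Form 2: filter each table in a CTE, then join the filtered sets.
filter_then_join = """
    WITH study AS (SELECT * FROM Study_Info WHERE Study_Case = 'Case'),
         demographics AS (SELECT * FROM Demographic_Info WHERE Sex = 'Male')
    SELECT s.Subject_ID
    FROM study s
    INNER JOIN demographics d ON s.Subject_ID = d.Subject_ID
    ORDER BY s.Subject_ID
"""

# Counting the crossover instead of listing it.
count_sql = """
    SELECT COUNT(DISTINCT s.Subject_ID)
    FROM Study_Info s
    INNER JOIN Demographic_Info d ON s.Subject_ID = d.Subject_ID
    WHERE d.Sex = 'Male' AND s.Study_Case = 'Case'
"""

ids_a = [row[0] for row in conn.execute(join_then_filter)]
ids_b = [row[0] for row in conn.execute(filter_then_join)]
total = conn.execute(count_sql).fetchone()[0]
print(ids_a, ids_b, total)
```

With this sample data, only subjects 1 and 4 are both male and a case, so both forms return the same two IDs.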

Related

Same queries giving different results

So for an assignment at school, we had to extract a count from a database. The question was as follows,
--19) How many airports in a timezone name containing 'Europe' are used as a source airport (source_airport_id) for a route that has aircraft that have a wake of 'M' or 'L'
This was the code I came up with,
SELECT count(DISTINCT airports.id) FROM airports WHERE timezone_name LIKE '%Europe%' AND id IN
(SELECT source_airport_id FROM routes WHERE id IN
(SELECT id FROM route_aircrafts WHERE aircraft_id IN
(SELECT id FROM aircrafts WHERE wake_size IN ('M', 'L'))));
It returned 544, while the professor's answer returned 566:
SELECT count (DISTINCT airports.id)
FROM airports, routes, route_aircrafts, aircrafts
WHERE airports.id = routes.source_airport_id
AND routes.id = route_aircrafts.route_id
AND aircrafts.id = route_aircrafts.aircraft_id
AND airports.timezone_name LIKE '%Europe%'
AND aircrafts.wake_size IN ('M', 'L'); --566
To me, those two should be doing the same thing and I can't understand why the answers are different.
To get the same answer in your query you need:
SELECT count(DISTINCT airports.id) FROM airports WHERE timezone_name LIKE '%Europe%' AND id IN
(SELECT source_airport_id FROM routes WHERE id IN
(SELECT route_id FROM route_aircrafts WHERE aircraft_id IN
(SELECT id FROM aircrafts WHERE wake_size IN ('M', 'L'))));
In the route_aircrafts subquery you selected id (that table's own primary key) rather than the foreign key route_id. You were getting a similar result only because the values in those two columns must overlap significantly.
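The effect of picking id instead of route_id can be reproduced on a tiny dataset. This is only a sketch: the table and column names follow the question, but the rows are invented so that route_aircrafts.id happens to collide with some routes.id values, which is the kind of overlap that makes the wrong query look almost right.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE airports (id INTEGER PRIMARY KEY, timezone_name TEXT);
    CREATE TABLE routes (id INTEGER PRIMARY KEY, source_airport_id INTEGER);
    CREATE TABLE aircrafts (id INTEGER PRIMARY KEY, wake_size TEXT);
    CREATE TABLE route_aircrafts (id INTEGER PRIMARY KEY, route_id INTEGER, aircraft_id INTEGER);

    INSERT INTO airports VALUES (1, 'Europe/Paris'), (2, 'Europe/Berlin'), (3, 'America/Chicago');
    INSERT INTO routes VALUES (10, 1), (20, 2), (30, 3);
    INSERT INTO aircrafts VALUES (100, 'M'), (200, 'H');
    -- Invented rows: the primary key 10 collides with routes.id 10 by accident.
    INSERT INTO route_aircrafts VALUES (10, 20, 100), (11, 10, 100), (31, 30, 100), (20, 10, 200);
""")

# The same nested query, parameterised only by which route_aircrafts column is selected.
template = """
    SELECT COUNT(DISTINCT airports.id) FROM airports
    WHERE timezone_name LIKE '%Europe%' AND id IN
      (SELECT source_airport_id FROM routes WHERE id IN
        (SELECT {column} FROM route_aircrafts WHERE aircraft_id IN
          (SELECT id FROM aircrafts WHERE wake_size IN ('M', 'L'))))
"""

wrong_count = conn.execute(template.format(column="id")).fetchone()[0]
right_count = conn.execute(template.format(column="route_id")).fetchone()[0]
print(wrong_count, right_count)
```

Here the id version only finds the airport whose route id coincides with a route_aircrafts primary key, while the route_id version follows the actual relationship.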
I would go with something along the lines of:
SELECT COUNT(DISTINCT airports.id)
FROM airports
INNER JOIN routes ON airports.id = routes.source_airport_id
INNER JOIN route_aircrafts ON routes.id = route_aircrafts.route_id
INNER JOIN aircrafts ON route_aircrafts.aircraft_id = aircrafts.id
AND aircrafts.wake_size IN ('M', 'L')
WHERE airports.timezone_name LIKE '%Europe%'
Explanation:
SELECT COUNT(DISTINCT airports.id)
You don't want to count duplicate airports.ids more than once.
FROM airports
This is the main table you're counting from. All other tables build from this one.
INNER JOIN routes ON airports.id = routes.source_airport_id
INNER JOIN will only include rows that match in both tables. Matching on airports.id and routes.source_airport_id.
INNER JOIN route_aircrafts ON routes.id = route_aircrafts.route_id
INNER JOIN will only include rows that match in both tables. Matching on routes.id and route_aircrafts.route_id.
INNER JOIN aircrafts ON route_aircrafts.aircraft_id = aircrafts.id
AND aircrafts.wake_size IN ('M', 'L')
Same thing with the INNER JOINs above. We've added an additional filter for wakes. For an INNER JOIN, this filter can also be performed in the WHERE clause without changing the results. Putting filters in the JOIN keeps the intent together (and the optimizer will likely filter this way anyway). For an OUTER JOIN, filtering in the JOIN vs filtering in the WHERE can possibly return different results (depending on your data).
WHERE airports.timezone_name LIKE '%Europe%'
Now we are filtering the entire resultset by the timezone_name from the base table of airports.
When working with SQL, it's important to think of your data in SETS. This will help you write more performant, and less programmatic, queries.
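The remark about OUTER JOINs, where filtering in the JOIN and filtering in the WHERE can give different results, is easiest to see on two small tables. The following sketch uses SQLite from Python with invented rows; only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE aircrafts (id INTEGER PRIMARY KEY, wake_size TEXT);
    CREATE TABLE route_aircrafts (id INTEGER PRIMARY KEY, route_id INTEGER, aircraft_id INTEGER);
    INSERT INTO aircrafts VALUES (100, 'M'), (200, 'H');
    INSERT INTO route_aircrafts VALUES (1, 10, 100), (2, 20, 200);
""")

# Filter inside the ON clause: unmatched left rows survive with NULLs on the right.
filter_in_on = """
    SELECT ra.id, ac.wake_size
    FROM route_aircrafts ra
    LEFT JOIN aircrafts ac ON ra.aircraft_id = ac.id
                          AND ac.wake_size IN ('M', 'L')
    ORDER BY ra.id
"""

# Filter in the WHERE clause: rows whose right side is NULL are discarded afterwards.
filter_in_where = """
    SELECT ra.id, ac.wake_size
    FROM route_aircrafts ra
    LEFT JOIN aircrafts ac ON ra.aircraft_id = ac.id
    WHERE ac.wake_size IN ('M', 'L')
    ORDER BY ra.id
"""

on_rows = conn.execute(filter_in_on).fetchall()
where_rows = conn.execute(filter_in_where).fetchall()
print(on_rows)
print(where_rows)
```

With an INNER JOIN both placements would return the single 'M' row; with the LEFT JOIN the ON version keeps the second route_aircrafts row with a NULL wake_size.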

Selecting data using three tables only joining two tables

I am trying to select data using three tables. I need to get an equity number and contract date from an actor and contract table where the name of the film = x from a film table.
I have done:
SELECT equity_number, contract_date
FROM actor,
contract,
film
WHERE actor.equity_number = contract.equity_number
and title = 'x'
However, I am getting a "column ambiguously defined" error.
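You haven't specified any relationship to the film table. The error itself comes from equity_number, which exists in both actor and contract and therefore has to be qualified with a table name or alias.
Also, use the modern, explicit JOIN syntax :)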
SELECT a.equity_number, c.contract_date
FROM actor a
INNER JOIN contract c on a.equity_number = c.equity_number
INNER JOIN film f on f.SOMERELATIONSHIPID = c.SOMERELATIONSHIPID
WHERE f.title = 'x'
I suggest you use JOINs instead of the old comma-separated syntax. Then set an alias for each table and use these aliases in the select list. Something like:
SELECT a.equity_number,
c.contract_date
FROM actor a
INNER JOIN contract c ON a.equity_number = c.equity_number
INNER JOIN film f ON a.film_id = f.id -- there should be related columns
WHERE f.title = 'x'
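Both the error and the fix can be reproduced in SQLite from Python. This is a sketch under assumptions: the question does not say how film relates to the other tables, so the column contract.film_id is hypothetical, as are all the sample rows. SQLite words the error as "ambiguous column name", while the wording in the question comes from a different database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (equity_number INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE film (id INTEGER PRIMARY KEY, title TEXT);
    -- film_id is a hypothetical link between contract and film.
    CREATE TABLE contract (equity_number INTEGER, film_id INTEGER, contract_date TEXT);
    INSERT INTO actor VALUES (1, 'A'), (2, 'B');
    INSERT INTO film VALUES (1, 'x'), (2, 'y');
    INSERT INTO contract VALUES (1, 1, '2020-01-01'), (2, 2, '2021-06-01'), (2, 1, '2022-03-15');
""")

# The original query: equity_number is not qualified and exists in two tables.
error_message = None
try:
    conn.execute("""
        SELECT equity_number, contract_date
        FROM actor, contract, film
        WHERE actor.equity_number = contract.equity_number
          AND title = 'x'
    """)
except sqlite3.OperationalError as exc:
    error_message = str(exc)

# Qualified columns plus an explicit join condition for every table.
fixed_sql = """
    SELECT a.equity_number, c.contract_date
    FROM actor a
    INNER JOIN contract c ON a.equity_number = c.equity_number
    INNER JOIN film f ON f.id = c.film_id
    WHERE f.title = 'x'
    ORDER BY a.equity_number
"""
rows = conn.execute(fixed_sql).fetchall()
print(error_message)
print(rows)
```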

I want to retrieve data from 4 tables in SQL Server

I want to retrieve data from 4 tables. The Patient table has id as its PK, which is the foreign key in the other three tables: ett, phar and ssc. A patient belongs to only one category, i.e. patient id pt1 exists in exactly one of the 3 tables. Now I want to retrieve the patient info along with its associated category.
My query is:
SELECT *
FROM Patient p
INNER JOIN ETT t
ON p.Patient_ID = t.Patient_ID || INNER JOIN Pharmacological ph
ON p.Patient_ID = ph.Patient_ID
I used an OR clause because I want only one inner join to execute at a time, but it's not giving me results. Any suggestions?
....Patient table has ID as PK which is the foreign key in other three
tables name: ett, phar and ssc where a patient lie in only one
category. Example, patient id pt1 exists in either of the 3 tables.
Based on your statement, you can join all three tables to the Patient table using LEFT JOIN, since a record can only exist in one of them. The query below uses COALESCE, which returns the first non-null value within the list.
The only thing you need is to manually specify the column names that you want to be shown on the list as shown below.
SELECT a.*,
COALESCE(t.colA, p.ColA, s.ColA) ColA,
COALESCE(t.colB, p.ColB, s.ColB) ColB,
COALESCE(t.colN, p.ColN, s.ColN) ColN
FROM Patient a
LEFT JOIN ETT t
ON a.Patient_ID = t.Patient_ID
LEFT JOIN Phar p
ON a.Patient_ID = p.Patient_ID
LEFT JOIN SSC s
ON a.Patient_ID = s.Patient_ID
To learn more about joins, see:
Visual Representation of SQL Joins
For a logical OR, do not use ||; use the keyword OR.
In any case, you cannot combine two joins with OR; you need to restructure your query.
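The LEFT JOIN plus COALESCE approach can be tried on a small in-memory SQLite database from Python. The Result column and the sample rows are assumptions for illustration, and the CASE expression that labels the category is an addition not present in the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Patient (Patient_ID TEXT PRIMARY KEY, Name TEXT);
    -- Result is a hypothetical column shared by the three category tables.
    CREATE TABLE ETT  (Patient_ID TEXT, Result TEXT);
    CREATE TABLE Phar (Patient_ID TEXT, Result TEXT);
    CREATE TABLE SSC  (Patient_ID TEXT, Result TEXT);
    INSERT INTO Patient VALUES ('pt1', 'Ann'), ('pt2', 'Bob'), ('pt3', 'Cy');
    INSERT INTO ETT  VALUES ('pt1', 'ett-result');
    INSERT INTO Phar VALUES ('pt2', 'phar-result');
    INSERT INTO SSC  VALUES ('pt3', 'ssc-result');
""")

sql = """
    SELECT a.Patient_ID,
           -- Label the category by checking which joined table produced a match.
           CASE WHEN t.Patient_ID IS NOT NULL THEN 'ETT'
                WHEN p.Patient_ID IS NOT NULL THEN 'Phar'
                WHEN s.Patient_ID IS NOT NULL THEN 'SSC'
           END AS Category,
           -- Take the value from whichever table matched.
           COALESCE(t.Result, p.Result, s.Result) AS Result
    FROM Patient a
    LEFT JOIN ETT  t ON a.Patient_ID = t.Patient_ID
    LEFT JOIN Phar p ON a.Patient_ID = p.Patient_ID
    LEFT JOIN SSC  s ON a.Patient_ID = s.Patient_ID
    ORDER BY a.Patient_ID
"""
patients = conn.execute(sql).fetchall()
for row in patients:
    print(row)
```

Because each patient appears in only one category table, at most one of the three joins contributes non-NULL values per row.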

sql server - how to modify values in a query statement?

I have a statement like this:
select lastname,firstname,email,floorid
from employee
where locationid=1
and (statusid=1 or statusid=3)
order by floorid,lastname,firstname,email
The problem is the column floorid. The result of this query is showing the id of the floors.
There is this table called floor (has like 30 rows), which has columns id and floornumber. The floorid (in above statement) values match the id of the table floor.
I want the above query to switch the floorid values into the associated values of the floornumber column in the floor table.
Can anyone show me how to do this please?
I am using Microsoft sql server 2008 r2.
I am new to sql and I need a clear and understandable method if possible.
select lastname,
firstname,
email,
floor.floornumber
from employee
inner join floor on floor.id = employee.floorid
where locationid = 1
and (statusid = 1 or statusid = 3)
order by floorid, lastname, firstname, email
You need a simple join that matches the floorid in employee against the id in your floor table. Then you select floornumber from the floor table.
select a.lastname,a.firstname,a.email,b.floornumber
from employee a
join floor b on a.floorid = b.id
where a.locationid=1 and (a.statusid=1 or a.statusid=3)
order by a.floorid,a.lastname,a.firstname,a.email
You need to use a join.
This will join the two tables on a certain field.
This way you can SELECT columns from more than one table at a time.
When you join two tables you have to specify on which column you want to join them.
In your example, you'd have to do this:
from employee join floor on employee.floorid = floor.id
Since you are new to SQL, there are a few things you should know. In the other answers to this question, people use aliases instead of repeating the table name.
from employee a join floor b
means that from now on the table employee will be known as a and the table floor as b. This is really useful when you have a lot of joins to do.
Now let's say both tables have a column called name. In your select you have to say which table the name column should come from. If you only write this
SELECT name from Employee a join floor b on a.floorid = b.id
the database engine can't tell which table's name column you mean. You have to qualify it with the table name:
SELECT Employee.name from Employee join floor on Employee.floorid = floor.id
or, if you prefer, with aliases:
SELECT a.name from Employee a join floor b on a.floorid = b.id
Finally, there are many types of joins:
Inner join (what you are using, because simply typing JOIN means an inner join)
Left outer join
Right outer join
Self join
...
You should refer to this article about joins to learn how to use them correctly.
Hope this helps.

SQL Query for finding values that do not exist in one table, with WHERE clause

I'm struggling to compile a query for the following and wonder if anyone can please help (I'm a SQL newbie).
I have two tables:
(1) student_details, which contains the columns: student_id (PK), firstname, surname (and others, but not relevant to this query)
(2) membership_fee_payments, which contains details of monthly membership payments for each student and contains the columns: membership_fee_payments_id (PK), student_id (FK), payment_month, payment_year, amount_paid
I need to create the following query:
which students have not paid fees for March 2012?
The query could be for any month/year, March is just an example. I want to return in the query firstname, surname from student_details.
I can query successfully who has paid for a certain month and year, but I can't work out how to query who has not paid!
Here is my query for finding out who has paid:
SELECT student_details.firstname, student_details.surname
FROM student_details
INNER JOIN membership_fee_payments
ON student_details.student_id = membership_fee_payments.student_id
WHERE membership_fee_payments.payment_month = "March"
AND membership_fee_payments.payment_year = "2012"
ORDER BY student_details.firstname
I have tried a left join and left outer join but get the same result. I think perhaps I need to use NOT EXISTS or IS NULL but I haven't had much luck writing the right query yet.
Any help much appreciated.
I'm partial to using WHERE NOT EXISTS. Typically that would look something like this:
SELECT D.firstname, D.surname
FROM student_details D
WHERE NOT EXISTS (SELECT * FROM membership_fee_payments P
WHERE P.student_id = D.student_id
AND P.payment_year = '2012'
AND P.payment_month = 'March'
)
This is known as a correlated subquery, as it contains references to the outer query. This allows you to include your join criteria in the subquery without necessarily writing a JOIN. Also, most RDBMS query optimizers will implement this as an (anti) semi join, which does not typically do as much 'work' as a complete join.
You could use a left join. When the payment is missing, all the columns in the left join table will be null:
SELECT student_details.firstname, student_details.surname
FROM student_details
LEFT JOIN membership_fee_payments
ON student_details.student_id = membership_fee_payments.student_id
AND membership_fee_payments.payment_month = "March"
AND membership_fee_payments.payment_year = "2012"
WHERE membership_fee_payments.student_id is null
ORDER BY student_details.firstname
You can also write the following query. This will give your expected output.
SELECT student_details.firstname,
student_details.surname
FROM student_details
Where
student_details.student_id Not in
(SELECT membership_fee_payments.student_id
from membership_fee_payments
WHERE
membership_fee_payments.payment_year = '2012'
AND membership_fee_payments.payment_month = 'March'
)
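The three approaches above (NOT EXISTS, LEFT JOIN ... IS NULL, and NOT IN) can be compared side by side. The sketch below runs them in SQLite from Python with invented students and payments, and passes the month and year as parameters since the question says they can vary. Storing payment_year as text is an assumption taken from the quoted literals in the queries.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student_details (student_id INTEGER PRIMARY KEY, firstname TEXT, surname TEXT);
    CREATE TABLE membership_fee_payments (
        membership_fee_payments_id INTEGER PRIMARY KEY,
        student_id INTEGER, payment_month TEXT, payment_year TEXT, amount_paid REAL);
    INSERT INTO student_details VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cy', 'Dow');
    INSERT INTO membership_fee_payments VALUES
        (1, 1, 'March', '2012', 10),
        (2, 2, 'February', '2012', 10),
        (3, 3, 'March', '2011', 10);
""")

not_exists_sql = """
    SELECT d.firstname, d.surname
    FROM student_details d
    WHERE NOT EXISTS (SELECT 1 FROM membership_fee_payments p
                      WHERE p.student_id = d.student_id
                        AND p.payment_month = ? AND p.payment_year = ?)
    ORDER BY d.firstname
"""
left_join_sql = """
    SELECT d.firstname, d.surname
    FROM student_details d
    LEFT JOIN membership_fee_payments p
           ON d.student_id = p.student_id
          AND p.payment_month = ? AND p.payment_year = ?
    WHERE p.student_id IS NULL
    ORDER BY d.firstname
"""
not_in_sql = """
    SELECT d.firstname, d.surname
    FROM student_details d
    WHERE d.student_id NOT IN (SELECT p.student_id FROM membership_fee_payments p
                               WHERE p.payment_month = ? AND p.payment_year = ?)
    ORDER BY d.firstname
"""

params = ("March", "2012")
unpaid_not_exists = conn.execute(not_exists_sql, params).fetchall()
unpaid_left_join = conn.execute(left_join_sql, params).fetchall()
unpaid_not_in = conn.execute(not_in_sql, params).fetchall()
print(unpaid_not_exists)
```

All three agree on this data. One difference worth knowing: if the NOT IN subquery ever returns a NULL student_id, NOT IN yields no rows at all, whereas NOT EXISTS and the LEFT JOIN form are unaffected.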