Using join in R - sql

The given database is shown below:
> dbReadTable(jamesdb, "EMPLOYEE")
EMP_NO NI_NO NAME AGE DEPT_NO
1 E1 123 SMITH 21 D1
2 E2 159 SMITH 31 D1
3 E3 5432 BROWN 65 D2
4 E5 7654 GREEN 52 D3
> dbReadTable(jamesdb, "DEPARTMENT")
DEPT_NO NAME MANAGER
1 D1 Accounts E1
2 D2 Stores E3
3 D3 Sales E5
> dbReadTable(jamesdb, "PRODUCT")
PROD_NO NAME COLOR
1 p1 PANTS BLUE
2 p2 PANTS KHAKI
3 p3 SOCKS GREEN
4 p4 SOCKS WHITE
5 p5 SHIRTS WHITE
> dbReadTable(jamesdb, "STOCK_TOTAL")
PROD_NO QUANTITY
1 p1 2000
2 p2 1000
3 p3 1500
4 p4 200
5 p5 800
Below is what I have so far, but I think I am misunderstanding how to use join.
How should I fix these queries?
Retrieve the employment number of the sales department manager.
dbGetQuery(jamesdb, 'SELECT EMPLOYEE.EMP_NO FROM DEPARTMENT JOIN EMPLOYEE
WHERE DEPARTMENT.NAME = "Sales"')
Who works in Department D2?
dbGetQuery(jamesdb, 'SELECT MANAGER FROM DEPARTMENT WHERE DEPT_NO = "D2"')
How many white-colored products are in stock?
dbGetQuery(jamesdb, 'SELECT SUM(QUANTITY) FROM PRODUCT JOIN STOCK_TOTAL
WHERE PRODUCT.COLOR = "WHITE"')

Joins are typically done matching one (or more) fields from one table with a corresponding field(s) of another table. For instance, I'm inferring that DEPARTMENT.MANAGER is actually a foreign key to EMPLOYEE.EMP_NO, so when you join, you should be very specific about that relationship:
SELECT e.EMP_NO
FROM DEPARTMENT d
LEFT JOIN EMPLOYEE e on d.MANAGER = e.EMP_NO
WHERE d.NAME = 'Sales'
Notes:
Many databases allow you to be sloppy, where they will infer field associations (foreign keys) based on common field names. First, I don't like allowing that inference; second, it doesn't work here.
I personally prefer to be explicit about the type of join, whether left join, inner join, etc. It's a matter of style; you may write just join if you prefer.
I'm introducing table aliases here (d and e), a way to shorten long table names. However, they are stylistic, not required.
I personally dislike databases that lack a dictionary of foreign keys and have unintuitive names to associate them. For instance, I'm inferring from the contents of the tables that DEPARTMENT.MANAGER is linked to EMPLOYEE.EMP_NO. If I'm wrong on this inference, then the answers below are likely skewed.
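To make the difference concrete, here is a minimal sketch that loads the two tables from the question into an in-memory SQLite database and runs the join with and without an explicit ON clause. Using Python's sqlite3 module is an assumption made for illustration (the question runs its SQL from R), and the column types are guesses.

```python
import sqlite3

# In-memory database holding the EMPLOYEE and DEPARTMENT rows from the question.
# Column types are assumptions; the question only shows the values.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EMPLOYEE (EMP_NO TEXT, NI_NO TEXT, NAME TEXT, AGE INTEGER, DEPT_NO TEXT);
CREATE TABLE DEPARTMENT (DEPT_NO TEXT, NAME TEXT, MANAGER TEXT);
INSERT INTO EMPLOYEE VALUES
  ('E1', '123',  'SMITH', 21, 'D1'),
  ('E2', '159',  'SMITH', 31, 'D1'),
  ('E3', '5432', 'BROWN', 65, 'D2'),
  ('E5', '7654', 'GREEN', 52, 'D3');
INSERT INTO DEPARTMENT VALUES
  ('D1', 'Accounts', 'E1'),
  ('D2', 'Stores',   'E3'),
  ('D3', 'Sales',    'E5');
""")

# No ON clause: SQLite pairs the Sales row with every employee.
no_on = conn.execute("""
    SELECT e.EMP_NO
    FROM DEPARTMENT d JOIN EMPLOYEE e
    WHERE d.NAME = 'Sales'
""").fetchall()

# Explicit relationship: only the manager's row survives.
with_on = conn.execute("""
    SELECT e.EMP_NO
    FROM DEPARTMENT d
    JOIN EMPLOYEE e ON d.MANAGER = e.EMP_NO
    WHERE d.NAME = 'Sales'
""").fetchall()

print(len(no_on), with_on)
```

In this sketch the first query returns one row per employee, while the second returns only the Sales manager.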
For your first question, though, I don't know why you need a join: since MANAGER is already the employee number, this should be simply
select d.MANAGER
from DEPARTMENT d
where d.NAME='Sales'
Similarly, your second question needs no join.
select e.*
from EMPLOYEE e
where e.DEPT_NO='D2'
The last one needs a join, and can be done in a number of ways. One such is:
select sum(case when st.Quantity > 0 then 1 else 0 end) as Count
from STOCK_TOTAL st
left join PRODUCT pr on st.PROD_NO=pr.PROD_NO
where pr.COLOR='WHITE'
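The question can be read two ways: how many distinct white products have stock, or how many white units are in stock. The sketch below computes both from the question's data, again assuming an in-memory SQLite database driven from Python purely for illustration.

```python
import sqlite3

# PRODUCT and STOCK_TOTAL rows copied from the question; types are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE PRODUCT (PROD_NO TEXT, NAME TEXT, COLOR TEXT);
CREATE TABLE STOCK_TOTAL (PROD_NO TEXT, QUANTITY INTEGER);
INSERT INTO PRODUCT VALUES
  ('p1', 'PANTS',  'BLUE'),
  ('p2', 'PANTS',  'KHAKI'),
  ('p3', 'SOCKS',  'GREEN'),
  ('p4', 'SOCKS',  'WHITE'),
  ('p5', 'SHIRTS', 'WHITE');
INSERT INTO STOCK_TOTAL VALUES
  ('p1', 2000), ('p2', 1000), ('p3', 1500), ('p4', 200), ('p5', 800);
""")

# Join on the shared key, keep white products with stock, then aggregate.
n_products, total_units = conn.execute("""
    SELECT COUNT(*)         AS n_products,
           SUM(st.QUANTITY) AS total_units
    FROM STOCK_TOTAL st
    JOIN PRODUCT pr ON st.PROD_NO = pr.PROD_NO
    WHERE pr.COLOR = 'WHITE'
      AND st.QUANTITY > 0
""").fetchone()

print(n_products, total_units)
```

Which of the two numbers is wanted depends on how "how many products" is meant.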

Related

SQL inquiry, tried absolutely everything I know, sql, four tables

I can't get this query to work; I have tried everything I can think of.
TABLE: ARTIST
JMBG NAME AGE ADRESA
--------------------------------------
J1 Ladygaga 35 HOLIVUDHILZ
J2 DUSKO 13 BB
J3 EMINEM 40 REVOLUCIJA 5
J4 BAGI 22 KURAC
J5 MARKO 33 ULICA
TABLE:HALL
DID CAPACITY CITY
---------------------------------
D1 500 PODGORICA
D2 300 NIS
D3 1000 BAR
D4 2000 NEWYORK
D5 750 BEOGRAD
TABLE: CITY
-----------------------------------------
BAR montenegro 5000
BEOGRAD Serbia 2000000
BUDVA montenegro 50000
NEWYORK AMERICA 7000000
NIS Serbia 1000000
PODGORICA montenegro 250000
TABLE: CONCERT
ID JMBG HALL
------------------------
K1 J3 D4
K2 J4 D1
K3 J1 D1
K4 J1 D5
K5 J1 D1
K6 J3 D1
K7 J5 D1
The inquiry is: Find the countries in which the artist with the most held concerts has performed. I really did spend a lot of time and energy on this. I would greatly appreciate it if someone with experience, who doesn't find it too difficult, could do this.
I tried this:
SELECT DISTINCT COUNTRY FROM CITY G, HALL D, CONCERT K
WHERE K.DID = D.DID AND D.NAZIV = G.NAZIV AND EXISTS(
SELECT JMBG FROM CONCERT K1,HALL D1, CITY G1
WHERE K.KID=K1.KID
GROUP BY JMBG
HAVING COUNT (*) >= ALL(SELECT COUNT(*) FROM CONCERT
GROUP BY JMBG))
Break it down. The artist with the most held concerts... Which artist had the most held concerts? (We're going to assume that we're interested in the total number of concerts held overall, in all countries, not the number of concerts held in a particular country.)
How many concerts did each artist hold?
SELECT c.jmbg
, COUNT(1) AS cnt
FROM concert c
GROUP BY c.jmbg
Which artist held the most concerts? MySQL and MS SQL Server both have some convenient shortcuts we can use here. A question we should ask here: what if two or more artists held the same number of concerts? Do we want to return both (or all) of those artists, or just return one of them? Which one? (We'd prefer the query to be deterministic... to return the same result given the same rows in the tables.)
Assuming that we want to return just one artist that held the most concerts...
For MySQL:
SELECT c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
LIMIT 1
For SQL Server:
SELECT TOP 1 c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
So that gets us the artist.
The other part of the "inquiry"... which countries did the artist hold concerts in.
Given a particular artist, we could write a query that performs join operations on the concert, hall and city tables. We'll just take a guess at the name of that first column in the city table (since it isn't provided in the question).
SELECT i.country
FROM city i
JOIN hall h
ON h.city = i.cid
JOIN concert o
ON o.hall = h.did
WHERE o.jmbg = 'J1'
GROUP BY i.country
To combine the two queries, we could use the first as a subquery. My preference is to use an inline view.
SELECT g.country
FROM city g
JOIN hall h
ON h.city = g.cid
JOIN concert o
ON o.hall = h.did
JOIN (
SELECT c.jmbg
FROM concert c
GROUP BY c.jmbg
ORDER BY COUNT(1) DESC, c.jmbg DESC
LIMIT 1
) m
ON m.jmbg = o.jmbg
GROUP BY g.country
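As a check on the combined query, here is a minimal sketch that loads the question's rows into an in-memory SQLite database (SQLite accepts the MySQL-style LIMIT). The city table's column names (name, country, population) are assumptions, since the question does not give them, and the join uses that assumed name column in place of cid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hall (did TEXT, capacity INTEGER, city TEXT);
-- Column names for city are hypothetical; the question omits the header.
CREATE TABLE city (name TEXT, country TEXT, population INTEGER);
CREATE TABLE concert (id TEXT, jmbg TEXT, hall TEXT);
INSERT INTO hall VALUES
  ('D1', 500, 'PODGORICA'), ('D2', 300, 'NIS'), ('D3', 1000, 'BAR'),
  ('D4', 2000, 'NEWYORK'), ('D5', 750, 'BEOGRAD');
INSERT INTO city VALUES
  ('BAR', 'montenegro', 5000), ('BEOGRAD', 'Serbia', 2000000),
  ('BUDVA', 'montenegro', 50000), ('NEWYORK', 'AMERICA', 7000000),
  ('NIS', 'Serbia', 1000000), ('PODGORICA', 'montenegro', 250000);
INSERT INTO concert VALUES
  ('K1', 'J3', 'D4'), ('K2', 'J4', 'D1'), ('K3', 'J1', 'D1'), ('K4', 'J1', 'D5'),
  ('K5', 'J1', 'D1'), ('K6', 'J3', 'D1'), ('K7', 'J5', 'D1');
""")

# Inline view m picks the single artist with the most concerts;
# the outer joins walk concert -> hall -> city to reach the country.
countries = [row[0] for row in conn.execute("""
    SELECT g.country
    FROM city g
    JOIN hall h    ON h.city = g.name
    JOIN concert o ON o.hall = h.did
    JOIN (
        SELECT c.jmbg
        FROM concert c
        GROUP BY c.jmbg
        ORDER BY COUNT(1) DESC, c.jmbg DESC
        LIMIT 1
    ) m ON m.jmbg = o.jmbg
    GROUP BY g.country
""")]

print(sorted(countries))
```

With the question's rows, J1 holds the most concerts (three), in halls D1 and D5, so the result contains Serbia and montenegro.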
Obviously, there are other query patterns that will return an equivalent result.
As I noted in a comment on the question, the specification for this "inquiry" is a bit ambiguous, as to what is meant by "where the artist with the most held concerts has performed in".
There is another interpretation of that specification. If we're interested in getting and analyzing a count of "how many concerts were held in each country by each artist", that's a different query.
FOLLOWUP
"... not allowed to use TOP DESC"
Then just write the query differently. Here's a different way to get the "largest number of concerts held by any artist", and use that to get all the artists that held that number of concerts.
SELECT n.jmbg
FROM ( -- largest number of concerts by artist
SELECT MAX(p.cnt) AS maxcnt
FROM (
SELECT COUNT(1) AS cnt
FROM concert d
GROUP BY d.jmbg
) p
) o
JOIN ( -- count of concerts by artist
SELECT c.jmbg
, COUNT(1) AS cnt
FROM concert c
GROUP BY c.jmbg
) n
ON n.cnt = o.maxcnt
Since that has the potential to return more than one row (more than one artist), your outer query may want to return a list of countries for each of the returned artists. That is to say, rather than just GROUP BY g.country, you'll likely want to return the artist in the SELECT list, and
GROUP BY m.jmbg, g.country
ORDER BY m.jmbg, g.country
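A sketch of the tie-aware variant, written without TOP or ORDER BY ... DESC in the ranking step. It assumes SQLite via Python, hypothetical column names for the city table, and one extra concert row (K8) that is not in the question, added only so that two artists tie.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hall (did TEXT, capacity INTEGER, city TEXT);
-- Column names for city are hypothetical; the question omits the header.
CREATE TABLE city (name TEXT, country TEXT, population INTEGER);
CREATE TABLE concert (id TEXT, jmbg TEXT, hall TEXT);
INSERT INTO hall VALUES
  ('D1', 500, 'PODGORICA'), ('D2', 300, 'NIS'), ('D3', 1000, 'BAR'),
  ('D4', 2000, 'NEWYORK'), ('D5', 750, 'BEOGRAD');
INSERT INTO city VALUES
  ('BAR', 'montenegro', 5000), ('BEOGRAD', 'Serbia', 2000000),
  ('BUDVA', 'montenegro', 50000), ('NEWYORK', 'AMERICA', 7000000),
  ('NIS', 'Serbia', 1000000), ('PODGORICA', 'montenegro', 250000);
INSERT INTO concert VALUES
  ('K1', 'J3', 'D4'), ('K2', 'J4', 'D1'), ('K3', 'J1', 'D1'), ('K4', 'J1', 'D5'),
  ('K5', 'J1', 'D1'), ('K6', 'J3', 'D1'), ('K7', 'J5', 'D1'),
  ('K8', 'J3', 'D2');  -- hypothetical row: makes J3 tie with J1 at three concerts
""")

# o = the largest per-artist concert count; n = every artist with that count.
# The remaining joins list the countries for each of those artists.
rows = conn.execute("""
    SELECT n.jmbg, g.country
    FROM ( SELECT MAX(p.cnt) AS maxcnt
           FROM ( SELECT COUNT(1) AS cnt
                  FROM concert d
                  GROUP BY d.jmbg ) p
         ) o
    JOIN ( SELECT c.jmbg, COUNT(1) AS cnt
           FROM concert c
           GROUP BY c.jmbg
         ) n ON n.cnt = o.maxcnt
    JOIN concert k ON k.jmbg = n.jmbg
    JOIN hall h    ON h.did = k.hall
    JOIN city g    ON g.name = h.city
    GROUP BY n.jmbg, g.country
    ORDER BY n.jmbg, g.country
""").fetchall()

for artist, country in rows:
    print(artist, country)
```

Each tied artist gets its own list of countries, which is why the artist appears in both the SELECT list and the GROUP BY.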
This is a basic question that looks like a school assignment. This answer will give you some hints, but you need to work it out for yourself.
JOIN is your friend; see the references below:
JOIN - MySQL
JOIN - SQL Server
What you need to do:
join CONCERT table with HALL table by HALL ID
join HALL table to CITY table by CITY name
sum the count of country appearance or hall capacity (either one you need) grouped by artist
order descending by the sum of count if you need it
Good luck

SQL populating table with calculation from other tables

I have two tables:
grades_table = has 100 rows of students with grades i.e.
Student Mark
Joe 64
Mark 50
percentage_table = a table which has a percentage to a grade i.e.
Grade Mark
A 70+
B 60+
C 50+
Is it possible in SQL to populate a new table called Overall, with the grade calculation? For example,
Student Grade
Joe B
Mark C
I've looked at IN and BETWEEN. But I can't quite understand how to calculate the ranges. I'm new to SQL and any help or point to the right direction would be great.
Split percentage table's mark column into two columns, markmin and markmax. Then JOIN the tables as:
select g.student, p.grade
from grades g
join percentage p ON g.mark between p.markmin and p.markmax
Or, following the later suggestion ("If I changed the scenario slightly, and only had one mark column in the percentage_table, so only 60+, 70+"):
Here I assume (A, 70) means 70 and above is grade A. Join each student to every grade whose threshold the mark reaches, then do a GROUP BY with MIN to find the "lowest" grade alphabetically, which is the best grade earned:
select g.student, min(p.grade)
from grades g
join percentage p ON g.mark >= p.mark
group by g.student
Alternatively, use a correlated sub-select:
select g.student, (select min(p.grade) from percentage p
where g.mark >= p.mark)
from grades g
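A minimal sketch of the single-threshold lookup, assuming SQLite via Python, the short table names grades and percentage, and a numeric mark column in which (A, 70) means 70 and above. A student matches every grade whose threshold the mark reaches, and MIN picks the alphabetically first (best) of those.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades (student TEXT, mark INTEGER);
-- mark holds the lower threshold for the grade: 70 stands for "70+".
CREATE TABLE percentage (grade TEXT, mark INTEGER);
INSERT INTO grades VALUES ('Joe', 64), ('Mark', 50);
INSERT INTO percentage VALUES ('A', 70), ('B', 60), ('C', 50);
""")

# Joe (64) reaches the B and C thresholds; MIN('B', 'C') is 'B'.
# Mark (50) reaches only the C threshold.
overall = conn.execute("""
    SELECT g.student, MIN(p.grade) AS grade
    FROM grades g
    JOIN percentage p ON g.mark >= p.mark
    GROUP BY g.student
    ORDER BY g.student
""").fetchall()

print(overall)
```

This relies on better grades sorting earlier alphabetically; a grade scheme without that property would need an explicit rank column instead.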
I would suggest having two separate columns in your percentage_table:
Grade | MarkMin | MarkMax
A | 70 | Null
B | 60 | 69
C | 50 | 59
Then you can just use a JOIN:
SELECT g.student, p.Grade
FROM
grades_table AS g LEFT JOIN percentage_table AS p
ON (g.Mark >= p.MarkMin OR p.MarkMin IS NULL)
AND (g.Mark <= p.MarkMax OR p.MarkMax IS NULL)
70+ which means >= 70 can then be inserted as (70, Null).
If you want to have a <=49 then you can add a mark as (Null, 49).
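The range version with open-ended bounds can be sketched as follows, assuming SQLite via Python. Joe and Mark come from the question; the student Ann (85) is hypothetical, added only to exercise the open-ended (70, Null) row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE grades_table (Student TEXT, Mark INTEGER);
CREATE TABLE percentage_table (Grade TEXT, MarkMin INTEGER, MarkMax INTEGER);
INSERT INTO grades_table VALUES
  ('Joe', 64), ('Mark', 50),
  ('Ann', 85);  -- hypothetical student, not in the question
INSERT INTO percentage_table VALUES
  ('A', 70, NULL),  -- NULL upper bound means "70 and above"
  ('B', 60, 69),
  ('C', 50, 59);
""")

# A NULL bound is treated as "no limit" on that side of the range.
overall = conn.execute("""
    SELECT g.Student, p.Grade
    FROM grades_table AS g
    LEFT JOIN percentage_table AS p
      ON (g.Mark >= p.MarkMin OR p.MarkMin IS NULL)
     AND (g.Mark <= p.MarkMax OR p.MarkMax IS NULL)
    ORDER BY g.Student
""").fetchall()

print(overall)
```

Because it is a LEFT JOIN, a student whose mark falls in no range would still appear, with a NULL grade.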

Oracle, LEFT OUTER JOIN not returning all rows from left table, instead behaving like INNER JOIN

I'm doing a left outer join and only getting back matching rows like it was an inner join.
To simplify the data, my first table(ROW_SEG), or left table looks something like this:
ASN | DEPT NO
-----------------------
85 | 836
86 | null
87 | null
My second table(RF_MERCHANT_ORG) has DEPT_NAME, and some other things which i want to get when i have a dept number.
DEPT NO | DEPT_NAME
-----------------------
836 | some dept name 1
837 | some dept name 2
838 | some dept name 3
In this case after my join i'd only get 1 row, for ASN 85 that had a DEPT NO.
...omitting a bunch of SQL for simplicity
, ROW_SEG AS (
SELECT *
FROM VE_SI_EC_OI
WHERE ROW_NUM BETWEEN 1 AND 1000 -- screen pagination, hardcoding values
)
-- ROW_SEG has a count of 1000 now
, RFS_JOIN AS (
SELECT ROW_SEG.*
,MO.BYR_NO
,MO.BYR_NAME
,MO.DEPT_NAME
FROM ROW_SEG
LEFT OUTER JOIN RF_MERCHANT_ORG MO
ON ROW_SEG.DEPT_NO = MO.DEPT_NO
WHERE MO.ORG_NO = 100
)
SELECT * FROM RFS_JOIN; -- returns less than 1000
I only get back the number of rows equal to the number of rows that have dept nos. So in my little data example above I would only get 1 row, for ASN 85, but I want all rows, with BYR_NO, BYR_NAME, and DEPT_NAME populated on rows where I had a DEPT_NO, and empty/null columns otherwise.
If ORG_NO is in the RF_MERCHANT_ORG table (using aliases consistently would help there), then behaving like an inner join is the correct result for the SQL being used.
To make it act like a proper left join, move the condition into the join:
LEFT OUTER JOIN RF_MERCHANT_ORG MO ON ROW_SEG.DEPT_NO = MO.DEPT_NO AND MO.ORG_NO = 100
If ORG_NO is in RF_MERCHANT_ORG, then that is likely to be the cause: the WHERE condition is applied after the join, so it discards the rows where MO.ORG_NO is null, which are exactly the unmatched rows.
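The effect is easy to reproduce. The sketch below uses the simplified data from the question in an in-memory SQLite database rather than Oracle (an assumption made so it runs standalone), and the ORG_NO value of 100 on every department row is invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ROW_SEG (ASN INTEGER, DEPT_NO INTEGER);
CREATE TABLE RF_MERCHANT_ORG (DEPT_NO INTEGER, DEPT_NAME TEXT, ORG_NO INTEGER);
INSERT INTO ROW_SEG VALUES (85, 836), (86, NULL), (87, NULL);
-- ORG_NO values are hypothetical; the question does not show them.
INSERT INTO RF_MERCHANT_ORG VALUES
  (836, 'some dept name 1', 100),
  (837, 'some dept name 2', 100),
  (838, 'some dept name 3', 100);
""")

# Filter in WHERE: unmatched rows have MO.ORG_NO = NULL and are dropped.
filtered_in_where = conn.execute("""
    SELECT r.ASN, mo.DEPT_NAME
    FROM ROW_SEG r
    LEFT OUTER JOIN RF_MERCHANT_ORG mo ON r.DEPT_NO = mo.DEPT_NO
    WHERE mo.ORG_NO = 100
    ORDER BY r.ASN
""").fetchall()

# Filter in ON: it only restricts which right-hand rows may match.
filtered_in_on = conn.execute("""
    SELECT r.ASN, mo.DEPT_NAME
    FROM ROW_SEG r
    LEFT OUTER JOIN RF_MERCHANT_ORG mo
      ON r.DEPT_NO = mo.DEPT_NO AND mo.ORG_NO = 100
    ORDER BY r.ASN
""").fetchall()

print(filtered_in_where)
print(filtered_in_on)
```

The first query keeps only ASN 85; the second keeps all three rows, with a null DEPT_NAME for ASN 86 and 87.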

SQL query get course number for certain student grades

I'm working through some problems and I can't get the expected results for this one. The question is below, along with my current code and the expected results. I'm trying to understand what exactly this is asking; as you can see, my code isn't close to the expected result yet. I've also added the schema, which shows what is in each table, in case it helps.
Question:
List the course number of courses wherein students have received grades for every one of the possible defined grade types. Order by course number.
My code so far:
SELECT g.Student_id, g.Grade_type_code
FROM Grade g LEFT OUTER JOIN Section s
ON g.Section_id = s.Section_id
GROUP BY g.Student_id, g.Grade_type_code
ORDER BY g.Student_id;
Any help would be great, also here is the Schema.
DBMS: I'm using Oracle SQL Developer
Here is the Expected Result
COURSE_NO
----------
20
25
100
120
122
125
130
135
Note: The Chapter for this problem is based off using
LEFT OUTER JOIN
My Current results
STUDENT_ID GRADE_TYPE_CODE
---------- ---------------
102 FI
102 HM
102 MT
102 PA
102 QZ
103 FI
103 HM
103 MT
103 PA
103 QZ
104 FI
104 HM
Based on your ER diagram I believe this query should return a list of courses whose enrolled students have collectively received all of the grade types listed in the GRADE_TYPE table.
select s.course_no,
c.descr,
count(distinct g.grade_type_code) as num_grade_types
from grade g
join enrollment e
on g.student_id = e.student_id
and g.section_id = e.section_id
join section s
on e.section_id = s.section_id
join course c
on s.course_no = c.course_no
group by s.course_no, c.descr
having count(distinct g.grade_type_code) = (select count(grade_type_code)
from grade_type)
I didn't notice your expected result was only the course # (you can just get rid of the columns you don't want from the select list). Also the join to the COURSE table is only there to get the course description, so if you don't want the course description selected, you do not need that join.
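The core of the query is the HAVING comparison: a course qualifies when the number of distinct grade types recorded for it equals the number of rows in GRADE_TYPE. Here is a reduced sketch of that idea, assuming SQLite via Python; the three-table schema is a simplification of the ER diagram (ENROLLMENT and COURSE are left out) and every row is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Simplified, hypothetical schema and data for illustration only.
CREATE TABLE grade_type (grade_type_code TEXT);
CREATE TABLE section (section_id INTEGER, course_no INTEGER);
CREATE TABLE grade (student_id INTEGER, section_id INTEGER, grade_type_code TEXT);
INSERT INTO grade_type VALUES ('FI'), ('HM'), ('MT');
INSERT INTO section VALUES (1, 20), (2, 20), (3, 25);
INSERT INTO grade VALUES
  (102, 1, 'FI'), (102, 1, 'HM'), (103, 2, 'MT'),  -- course 20: all three types
  (104, 3, 'FI'), (104, 3, 'HM');                  -- course 25: 'MT' is missing
""")

# Keep a course only if it has as many distinct grade types as exist overall.
courses = conn.execute("""
    SELECT s.course_no
    FROM grade g
    JOIN section s ON g.section_id = s.section_id
    GROUP BY s.course_no
    HAVING COUNT(DISTINCT g.grade_type_code) =
           (SELECT COUNT(grade_type_code) FROM grade_type)
    ORDER BY s.course_no
""").fetchall()

print(courses)
```

Course 20 collects all three grade types across its two sections and is returned; course 25 is missing one and is not.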
You need to select COURSE_NO instead, and use JOIN rather than LEFT OUTER JOIN.
Something like this:
select COURSE_NO from
(
SELECT distinct s.COURSE_NO
FROM Grade g JOIN Section s
ON g.Section_id = s.Section_id
)
ORDER BY COURSE_NO;

How to find out employees having same department but different shifts

I have a table employee with attributes such as emp_code, name, ..., deptt.
There is another table called nightShift with the fields emp_code and shift_time.
Any employee who is not in the nightShift table is assumed to be on the day shift.
I need to find the deptt values that have some employees working the night shift and some working the normal shift.
What query would do this?
Example
**Employees**
----------------------------------------
emp_code| Name | deptt
----------------------------------------
e1 John Ops
e2 Martin Ops
e3 Gary Infra
e4 John Facilities
e5 Michael Ops
e6 Alan Ops
e7 Tony Facilites
e8 Alex Infra
e9 Peter Infra
e10 Ron Ops
**nightShift**
----------------------------------------
emp_code | shift_time
----------------------------------------
e1 shiftA
e2 shiftA
e5 shiftB
e4 shiftB
e7 shiftC
Now in the output, I want only Deptt Ops, as some of its employees are in night shift(e1,e2,e5) and some in normal shift(e6,e10)
The output should NOT contain Infra as all employees(e3,e8,e9) are in normal shift and none in night shift.
The output should NOT contain Facilities as all employees(e4,e7) are in night shift and none in normal shift.
Can somebody help me with this?
Here is a group by version - join both tables in a left join and count nightShifts per department. If count is greater than zero but not equal to count of all workers in department, we have a match.
select employees.deptt
from employees
left join nightShift
on employees.emp_code = nightShift.emp_code
group by employees.deptt
having count (nightShift.emp_code) > 0
and count (employees.emp_code) <> count (nightShift.emp_code)
Test it on Sql Fiddle.
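For a standalone check, the same query can be run against the question's sample rows in an in-memory SQLite database. Using Python's sqlite3 is an assumption for illustration; the misspelled 'Facilites' row is kept exactly as it appears in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employees (emp_code TEXT, name TEXT, deptt TEXT);
CREATE TABLE nightShift (emp_code TEXT, shift_time TEXT);
INSERT INTO employees VALUES
  ('e1', 'John', 'Ops'),    ('e2', 'Martin', 'Ops'),     ('e3', 'Gary', 'Infra'),
  ('e4', 'John', 'Facilities'), ('e5', 'Michael', 'Ops'), ('e6', 'Alan', 'Ops'),
  ('e7', 'Tony', 'Facilites'),  ('e8', 'Alex', 'Infra'),  ('e9', 'Peter', 'Infra'),
  ('e10', 'Ron', 'Ops');
INSERT INTO nightShift VALUES
  ('e1', 'shiftA'), ('e2', 'shiftA'), ('e5', 'shiftB'),
  ('e4', 'shiftB'), ('e7', 'shiftC');
""")

# COUNT(nightShift.emp_code) skips the NULLs produced by the left join,
# so it counts night workers; COUNT(employees.emp_code) counts everyone.
mixed = conn.execute("""
    SELECT employees.deptt
    FROM employees
    LEFT JOIN nightShift ON employees.emp_code = nightShift.emp_code
    GROUP BY employees.deptt
    HAVING COUNT(nightShift.emp_code) > 0
       AND COUNT(employees.emp_code) <> COUNT(nightShift.emp_code)
""").fetchall()

print(mixed)
```

Only Ops has both night-shift and day-shift employees, so it is the only department returned.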
You could try something along these lines
select
distinct e.deptt
from
Employees e
inner join
NightShift n
on
n.emp_code = e.emp_code
Where
e.deptt not in ('Facilities', 'Facilites')
The inner join will eliminate everyone not working night shifts, then in the where we find any results not working in deptt Facilities
I think you need to count the number of night shift workers for each department, and the number of day shift workers for each department, and you're concerned with those departments where both counts are bigger than zero.
Stage 1: Night shift workers per department:
SELECT e.deptt, COUNT(*) AS headcount
FROM Employees AS e
JOIN NightShift AS n
ON n.emp_code = e.emp_code
GROUP BY e.deptt
Stage 2: Day shift workers per department
There are various possible strategies for this. One is to count the total workers for each department and subtract the number of nightshift workers:
SELECT d.deptt, (d.headcount - n.headcount) AS headcount
FROM (SELECT e2.deptt, COUNT(*) AS headcount
FROM Employees AS e2
GROUP BY e2.deptt
) AS d
JOIN (SELECT e.deptt, COUNT(*) AS headcount
FROM Employees AS e
JOIN NightShift AS n
ON n.emp_code = e.emp_code
GROUP BY e.deptt
) AS n
ON d.deptt = n.deptt
Stage 3: Pick those departments where the headcounts on the dayshift and night shift are both non-zero:
SELECT d.deptt, (d.headcount - n.headcount) AS dayshift, n.headcount AS nightshift
FROM (SELECT e2.deptt, COUNT(*) AS headcount
FROM Employees AS e2
GROUP BY e2.deptt
) AS d
JOIN (SELECT e.deptt, COUNT(*) AS headcount
FROM Employees AS e
JOIN NightShift AS n
ON n.emp_code = e.emp_code
GROUP BY e.deptt
) AS n
ON d.deptt = n.deptt
WHERE (d.headcount - n.headcount) > 0
AND n.headcount > 0
There might be a more succinct formulation, but I'm fairly sure this would give the correct answer.
One caveat: this has not been anywhere near an actual SQL DBMS, so there could be some syntax errors in it.
I'm also assuming you're using a supported version of Informix (and not Informix OnLine or Informix SE). Some older versions of Informix would not support all this syntax but I believe all the 11.x versions (which are all currently supported) should all handle these queries.
You can probably simplify this along the lines of Nikola Markovinović's answer.