Oracle SQL Select: getting duplicate results joining tables

I'm starting with SQL and doing some exercises and I'm completely stuck on the last one.
It asks for pairs of streets located in the same country. Two tables from the HR schema are used, locations and countries.
The problem is that I don't know how to avoid duplicate results. For example, if I have the street "x" in Canada and the street "y" also in Canada, the pair is shown twice:
street x / Canada / street y
street y / Canada / street x
and I can't find a way to correct this.
My select is:
SELECT DISTINCT A.STREET_ADDRESS AS "CALLE A", C.COUNTRY_NAME, B.STREET_ADDRESS AS "CALLE B"
FROM HR.LOCATIONS A JOIN HR.LOCATIONS B ON (A.STREET_ADDRESS <> B.STREET_ADDRESS), HR.COUNTRIES C
WHERE A.COUNTRY_ID = B.COUNTRY_ID AND B.COUNTRY_ID = C.COUNTRY_ID
ORDER BY C.COUNTRY_NAME
Any ideas? Thank you.

Use < instead of <>:
SELECT DISTINCT A.STREET_ADDRESS AS "CALLE A", C.COUNTRY_NAME, B.STREET_ADDRESS AS "CALLE B"
FROM HR.LOCATIONS A JOIN HR.LOCATIONS B ON (A.STREET_ADDRESS < B.STREET_ADDRESS), HR.COUNTRIES C
WHERE A.COUNTRY_ID = B.COUNTRY_ID AND B.COUNTRY_ID = C.COUNTRY_ID
ORDER BY C.COUNTRY_NAME

All the correct answers should be logically equivalent. You may want to use a form closer to this, however.
Ignore the alignment, unless you feel it helps readability (I happen to think it does).
I'm referring more to the JOIN / ON form, as @jarlh mentioned: avoid comma-separated table expressions in the FROM clause (called a <table reference list>; a single <table reference> is usually sufficient). Keep the join criteria in the ON clauses, near the corresponding tables in the FROM clause, unless you have a particular reason to move part of the logic to the WHERE clause (try to avoid that unless necessary).
Logically, we can write these in lots of different ways. Some may be a little easier to write/read.
SELECT A.STREET_ADDRESS AS "CALLE A"
     , C.COUNTRY_NAME
     , B.STREET_ADDRESS AS "CALLE B"
  FROM HR.LOCATIONS A
  JOIN HR.LOCATIONS B
    ON A.COUNTRY_ID = B.COUNTRY_ID
   AND A.STREET_ADDRESS < B.STREET_ADDRESS
  JOIN HR.COUNTRIES C
    ON B.COUNTRY_ID = C.COUNTRY_ID
 ORDER BY C.COUNTRY_NAME
;
The inequality, as @nbk points out, avoids the mirrored street pairs: each pair is kept in only one direction.
SELECT DISTINCT isn't really needed for this, if your join logic and selected columns are sufficient to identify unique locations. If not, the result probably isn't practically useful.
If this is a small subset of sample data and/or these street addresses are unique within each country, then you're also ok, and DISTINCT isn't needed.
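To see the effect of the inequality on a tiny data set, here is a minimal sketch using Python's sqlite3 module instead of Oracle. The table contents are made-up sample rows standing in for HR.LOCATIONS and HR.COUNTRIES, so treat it as an illustration rather than the real schema.

```python
import sqlite3

# Hypothetical sample rows standing in for HR.LOCATIONS and HR.COUNTRIES.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE countries (country_id TEXT PRIMARY KEY, country_name TEXT);
    CREATE TABLE locations (street_address TEXT, country_id TEXT);
    INSERT INTO countries VALUES ('CA', 'Canada'), ('JP', 'Japan');
    INSERT INTO locations VALUES
        ('street x', 'CA'), ('street y', 'CA'), ('street z', 'JP');
""")

# Same join as in the answer; only the comparison operator varies.
QUERY = """
    SELECT a.street_address, c.country_name, b.street_address
    FROM locations a
    JOIN locations b
      ON a.country_id = b.country_id
     AND a.street_address {op} b.street_address
    JOIN countries c
      ON b.country_id = c.country_id
    ORDER BY c.country_name, a.street_address
"""

mirrored = conn.execute(QUERY.format(op="<>")).fetchall()
unique_pairs = conn.execute(QUERY.format(op="<")).fetchall()

print(mirrored)      # the Canada pair appears once per direction
print(unique_pairs)  # the Canada pair appears once
```

With `<>` both (x, y) and (y, x) qualify; with `<` only the pair whose first street sorts lower survives, so no DISTINCT is needed here.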

Related

How to combine taking several joins and add constraints on query?

How can I answer the following question by querying this database?
The police are looking for a woman with brown hair who checked in at the gym sometime between September 8th, 2016 and October 24th, 2016. This woman has a silver membership. Can we find the name of this woman?
I tried the following query:
dbGetQuery(db,"
SELECT *
FROM get_fit_now_member
JOIN get_fit_now_check_in ON id = membership_id
WHERE check_in_date BETWEEN '20160909' AND '20161023' AND membership_status = 'silver'
")
The problem is that I have to join several tables and apply different constraints at the same time. How can I solve this in a clever way?
Here's how I would write the query:
SELECT m.name
FROM get_fit_now_member AS m
JOIN get_fit_now_check_in AS c ON m.id = c.membership_id
JOIN person AS p ON m.person_id = p.id
JOIN drivers_license AS d ON p.license_id = d.id
WHERE c.check_in_date BETWEEN '20160908' AND '20161024'
AND m.membership_status = 'silver'
AND d.hair_color = 'brown';
JOIN is just an operator, like + is in arithmetic. In arithmetic, you can extend the expressions with more terms, like a + b + c + d. In SQL, you can use JOIN multiple times in a similar way.
I used correlation names (m, c, p, d) to make it more convenient to qualify the table names, so I can be clear for example which id I mean in each join condition, since a column named id exists in multiple tables.
I also changed the date expression, because I assume "between" is meant to include the two dates named in the problem statement.
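If you want to try the chained joins without the original database, here is a small sketch using Python's sqlite3 module. The table and column names follow the query above, but the rows, the member names, and the choice to store check_in_date as text in YYYYMMDD form are my assumptions.

```python
import sqlite3

# In-memory stand-in for the database; all rows are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE get_fit_now_member (
        id TEXT PRIMARY KEY, person_id INTEGER, name TEXT, membership_status TEXT);
    CREATE TABLE get_fit_now_check_in (membership_id TEXT, check_in_date TEXT);
    CREATE TABLE person (id INTEGER PRIMARY KEY, license_id INTEGER);
    CREATE TABLE drivers_license (id INTEGER PRIMARY KEY, hair_color TEXT);

    INSERT INTO get_fit_now_member VALUES
        ('M1', 1, 'Alice', 'silver'), ('M2', 2, 'Beth', 'gold'),
        ('M3', 3, 'Cara', 'silver'),  ('M4', 4, 'Dana', 'silver');
    INSERT INTO get_fit_now_check_in VALUES
        ('M1', '20160908'), ('M2', '20160915'),
        ('M3', '20161101'), ('M4', '20160920');
    INSERT INTO person VALUES (1, 101), (2, 102), (3, 103), (4, 104);
    INSERT INTO drivers_license VALUES
        (101, 'brown'), (102, 'brown'), (103, 'brown'), (104, 'blonde');
""")

rows = conn.execute("""
    SELECT m.name
    FROM get_fit_now_member AS m
    JOIN get_fit_now_check_in AS c ON m.id = c.membership_id
    JOIN person AS p ON m.person_id = p.id
    JOIN drivers_license AS d ON p.license_id = d.id
    WHERE c.check_in_date BETWEEN '20160908' AND '20161024'
      AND m.membership_status = 'silver'
      AND d.hair_color = 'brown'
""").fetchall()

names = [name for (name,) in rows]
print(names)  # only Alice passes all three filters
```

Alice's check-in falls exactly on the lower boundary, which shows that BETWEEN is inclusive; the other three members each fail one filter (membership status, date range, hair color).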

How do I make a query shorter and neater?

I'm trying to make this query neater and easier to understand, but I'm not sure how.
SELECT a.Patient_id, COUNT (p.Person_id) AS "Number of Operations", SUM (w.Daily_charge * (a.Discharge_date - a.Admission_date) + ot.Theatre_fee + b.Charges + c.Charges ) AS "Total Payment"
FROM person p, admission a, ward w, operation o, operation_type ot, staff b, staff c
WHERE w.Ward_code = a.Ward_code AND p.Person_id = a.Patient_id
AND a.Admission_id = o.Admission_id AND ot.Op_code = o.Actual_op
AND o.Surgeon = b.Person_id AND o.Anaesthetist = c.Person_id
GROUP BY a.Patient_id, p.Person_id
ORDER BY COUNT (p.Person_id) DESC FETCH FIRST 1 ROWS ONLY;
Any decent formatter would do it for you.
Other than that:
- use JOIN instead of comma-separated tables in the FROM clause
- remove p.person_id from the GROUP BY clause; it serves no purpose there, because it is equal to a.patient_id (which is correctly in the clause) and it is not part of the SELECT statement's column list
So:
select a.patient_id,
       count (p.person_id) as "number of operations",
       sum (w.daily_charge * (a.discharge_date - a.admission_date) +
            ot.theatre_fee + b.charges + c.charges
           ) as "total payment"
from person p join admission a on a.patient_id = p.person_id
              join ward w on w.ward_code = a.ward_code
              join operation o on o.admission_id = a.admission_id
              join operation_type ot on ot.op_code = o.actual_op
              join staff b on o.surgeon = b.person_id
              join staff c on o.anaesthetist = c.person_id
group by a.patient_id
order by count (p.person_id) desc
fetch first 1 rows only;
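As a quick check that dropping p.person_id from GROUP BY changes nothing, here is a sketch in Python's sqlite3 module with a cut-down, hypothetical version of the schema (only person, admission and operation, with invented rows).

```python
import sqlite3

# Cut-down, hypothetical version of the schema with invented rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (person_id INTEGER PRIMARY KEY);
    CREATE TABLE admission (admission_id INTEGER PRIMARY KEY, patient_id INTEGER);
    CREATE TABLE operation (op_id INTEGER PRIMARY KEY, admission_id INTEGER);
    INSERT INTO person VALUES (1), (2);
    INSERT INTO admission VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO operation VALUES (100, 10), (101, 11), (102, 12);
""")

# The join condition forces a.patient_id = p.person_id on every row,
# so adding p.person_id to GROUP BY cannot split any group further.
TEMPLATE = """
    SELECT a.patient_id, COUNT(p.person_id)
    FROM person p
    JOIN admission a ON a.patient_id = p.person_id
    JOIN operation o ON o.admission_id = a.admission_id
    GROUP BY {cols}
    ORDER BY a.patient_id
"""

both_columns = conn.execute(TEMPLATE.format(cols="a.patient_id, p.person_id")).fetchall()
one_column = conn.execute(TEMPLATE.format(cols="a.patient_id")).fetchall()

print(both_columns)  # [(1, 2), (2, 1)]
print(one_column)    # [(1, 2), (2, 1)]
```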
Sadly there is no perfect formatter, although as Ed mentioned in the comments, they can be a start if you review the settings carefully. (It's a tradition in the industry that the default formatter settings are always horrible.)
It's also been said (I think by Steven Feuerstein) that you should only set formatting rules that are supported by your formatter, and of course he makes a good point. But taken with the limitations of all formatters, an industry tradition for horrible formatting and the impossibility of consistent rules for formatting SQL anyway, that puts us PL/SQL developers in a difficult position.
I'd say the first principle of computer code layout is to use vertically aligned blocks to indicate dependency levels (similar to the grids used in graphic design). A lot of the choices then become about how to apply that principle.
We then need to separate the code into logical sections, but at the same time not let it sprawl down the page unnecessarily. I think this is difficult for automated formatters as the rules become a bit fuzzy, e.g. for a join with only one condition I keep it on one line, but if there is more than one I start splitting it out onto multiple lines, one per ON or AND keyword. The same goes for your complex sum() expression - normally I would place it all on one line, but if it aids readability then I split it up.
Finally, opinions vary on where to place commas in stacked lists, of which SQL has a lot. I say they go on the left, to act like bullet points and also make it easier to add items to the ends of lists. Others will disagree.
select ad.patient_id
     , count(*) as "Number of Operations"
     , sum(
           wa.daily_charge * (ad.discharge_date - ad.admission_date)
           + ot.theatre_fee + ss.charges + sa.charges
       ) as "Total Payment"
from   person pr
       join admission ad on ad.patient_id = pr.person_id
       join ward wa on wa.ward_code = ad.ward_code
       join operation op on op.admission_id = ad.admission_id
       join operation_type ot on ot.op_code = op.actual_op
       join staff ss on ss.person_id = op.surgeon
       join staff sa on sa.person_id = op.anaesthetist
group  by ad.patient_id
order  by count(*) desc
fetch  first row only;

Oracle SQL SELECT equivalent to multiple AND

Maybe you can help me. I'm currently studying SQL on the Oracle platform and I have several exercises.
One of them is too hard for me; I can't get it right.
This is the exercise:
Which countries are members of ALL organisations that country XY is also a member of?
I have multiple tables, but I think only one is necessary for this task.
Table name: isMember ( abbreviation(fk), country(fk) )
The table looks like this:
Country / Abbreviation
USA / G-5
USA / G-7
USA / G-9
Canada / G-7
Canada / G-9
Norway / G-20
and so on....
How can I find every country in the list that is also a member of ALL organizations that, for example, the USA is a member of?
Thank you very much!
This is a tricky question. One method is to pair every country with every organization the USA is in. Then count, for each country, how many of those pairs actually exist in isMember and check whether the two counts match:
select c.country
from (select distinct country from isMember) c cross join
(select abbreviation from isMember where country = 'USA') a left join
isMember im
on im.country = c.country and im.abbreviation = a.abbreviation
group by c.country
having count(*) = count(im.country); -- do all organizations match?
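Here is a sketch of that query running on the sample rows from the question, using Python's sqlite3 module in place of Oracle. The 'Japan' rows are an invented addition so that a country other than the USA itself qualifies, and the ORDER BY is only there to make the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE isMember (abbreviation TEXT, country TEXT);
    INSERT INTO isMember (country, abbreviation) VALUES
        ('USA', 'G-5'), ('USA', 'G-7'), ('USA', 'G-9'),
        ('Canada', 'G-7'), ('Canada', 'G-9'),
        ('Norway', 'G-20'),
        -- hypothetical extra country that is in every USA organization
        ('Japan', 'G-5'), ('Japan', 'G-7'), ('Japan', 'G-9'), ('Japan', 'G-20');
""")

rows = conn.execute("""
    SELECT c.country
    FROM (SELECT DISTINCT country FROM isMember) c
    CROSS JOIN (SELECT abbreviation FROM isMember WHERE country = 'USA') a
    LEFT JOIN isMember im
      ON im.country = c.country AND im.abbreviation = a.abbreviation
    GROUP BY c.country
    HAVING COUNT(*) = COUNT(im.country)  -- every required pair was found
    ORDER BY c.country
""").fetchall()

matching = [country for (country,) in rows]
print(matching)  # ['Japan', 'USA']
```

Canada is dropped because its (Canada, G-5) pair finds no match, so COUNT(im.country) is 2 while COUNT(*) is 3. The USA trivially matches itself; add a filter on c.country if you want to exclude it.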

SQL (oracle) imposing hierarchy on where clause

Table Name: REG_NBRS
STATE      CITY    COUNTRY   REG_NBR
------------------------------------------
ILLINOIS           USA       444333222
NEBRASKA           USA       111222333
NEW YORK           USA       333444555
FLORIDA            USA       666222666
           TAMPA   USA       888333888
My data looks something like this, and I need to get the REG_NBR for a customer's state or city. If a row matches on both state and city, the state takes precedence, and if neither the state nor the city matches, I still have to list the row with null for reg_nbr.
I tried to come up with a query but didn't have much success, as I don't know how to impose a precedence while doing an outer join.
SELECT C.NAME, C.AGE, RN.REG_NBR
FROM CUSTOMER C, REG_NBRS RN
WHERE C.COUNTRY = RN.COUNTRY
AND (C.STATE = RN.STATE OR C.CITY = RN.CITY)
AND C.ID BETWEEN 1000 AND 2000
As a beginner, I do not know how to join these two tables in such a way that:
- it joins first on STATE and City
- it still lists all 1000 rows
- it puts null for those registered numbers which are not from those states or cities
- if both match, it uses the state's registered number (i.e. use 666222666 for Tampa even though we can find an entry for TAMPA)
I am sorry if this is not making sense, but I have tried to explain as much as possible. I have also tried different combinations of left outer join and right outer join but couldn't work out how to impose a hierarchy on the WHERE conditions. I thought of UNIONs, but I think even unions would list 2 rows for a customer in Tampa, one per REG_NBR.
Any suggestions?
Apologies for the jumbled code; I am using a (not so) smart phone to post this question.
Assuming that you have unique keys in (Country, State) and (Country, City), you can do it in a simple way by joining twice:
SELECT
    C.NAME, C.AGE,
    COALESCE(RN1.REG_NBR, RN2.REG_NBR) AS REG_NBR
FROM CUSTOMER C
LEFT OUTER JOIN REG_NBRS RN1
    ON RN1.COUNTRY = C.COUNTRY
    AND RN1.STATE = C.STATE
LEFT OUTER JOIN REG_NBRS RN2
    ON RN2.COUNTRY = C.COUNTRY
    AND RN2.CITY = C.CITY
WHERE C.ID BETWEEN 1000 AND 2000
Moreover, this should be faster than the OR version (at least with proper indexing in place), since databases don't handle OR in join conditions very well.
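To check the precedence behaviour, here is a sketch of the double LEFT JOIN in Python's sqlite3 module. REG_NBRS holds the rows from the question; the customers (names, ages, and the GEORGIA/TEXAS/AUSTIN values) are invented, and both join conditions compare against C.COUNTRY. The state match is tried first, because RN1 comes first in COALESCE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reg_nbrs (state TEXT, city TEXT, country TEXT, reg_nbr TEXT);
    CREATE TABLE customer (
        id INTEGER PRIMARY KEY, name TEXT, age INTEGER,
        state TEXT, city TEXT, country TEXT);

    INSERT INTO reg_nbrs VALUES
        ('ILLINOIS', NULL, 'USA', '444333222'),
        ('NEBRASKA', NULL, 'USA', '111222333'),
        ('NEW YORK', NULL, 'USA', '333444555'),
        ('FLORIDA',  NULL, 'USA', '666222666'),
        (NULL, 'TAMPA',    'USA', '888333888');

    -- Hypothetical customers: state+city match, city-only match, no match.
    INSERT INTO customer VALUES
        (1000, 'Ann', 30, 'FLORIDA', 'TAMPA',  'USA'),
        (1001, 'Bob', 40, 'GEORGIA', 'TAMPA',  'USA'),
        (1002, 'Cy',  50, 'TEXAS',   'AUSTIN', 'USA');
""")

result = conn.execute("""
    SELECT c.name, c.age,
           COALESCE(rn1.reg_nbr, rn2.reg_nbr) AS reg_nbr
    FROM customer c
    LEFT OUTER JOIN reg_nbrs rn1
        ON rn1.country = c.country AND rn1.state = c.state
    LEFT OUTER JOIN reg_nbrs rn2
        ON rn2.country = c.country AND rn2.city = c.city
    WHERE c.id BETWEEN 1000 AND 2000
    ORDER BY c.id
""").fetchall()

for row in result:
    print(row)
```

Ann gets the FLORIDA number even though TAMPA also matches, Bob falls back to the TAMPA number, and Cy is still listed with a null reg_nbr. Swapping the arguments of COALESCE would give the city precedence instead.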

Do I misunderstand joins?

I'm trying to learn the ANSI-92 SQL standard, but I don't seem to understand it completely (I'm new to the ANSI-89 standard as well, and to databases in general).
In my example, I have three tables kingdom -< family -< species (biology classifications).
There may be kingdoms without species or families.
There may be families without species or kingdoms.
There may be species without kingdoms or families.
Why might this happen?
Say a biologist finds a new species but has not yet classified it into a kingdom or family, or creates a new family that has no species and is not sure which kingdom it should belong to, etc.
Here is a fiddle (see the last query): http://sqlfiddle.com/#!4/015d1/3
I want a query that retrieves every kingdom and every species, but not those families that have no species, so I wrote this:
select *
from reino r
    left join (
        familia f
        right join especie e
            on f.fnombre = e.efamilia
            and f.freino = e.ereino
    ) on r.rnombre = f.freino
    and r.rnombre = e.ereino;
What I think this would do is:
1. Join family and species as a right join, so it brings every species, but not those families that have no species. So, if a species has not been classified into a family, it will appear with null for family.
2. Then, join the kingdom with the result as a left join, so it brings every kingdom, even if there are no families or species classified in that kingdom.
Am I wrong? Shouldn't this show me the species that have not been classified? If I run the inner query on its own, it brings what I want. Is there a problem with how I'm grouping things?
You're right on your description of #1... the issue with your query is on step #2.
When you do a left join from kingdom to (family & species), you're requesting every kingdom, even if there's no matching (family & species)... however, this won't return you any (family & species) combination that doesn't have a matching kingdom.
A closer query would be:
select *
from reino r
    full join (
        familia f
        right join especie e
            on f.fnombre = e.efamilia
            and f.freino = e.ereino
    ) on r.rnombre = f.freino
    and r.rnombre = e.ereino;
Notice that the left join was replaced with a full join...
however, this only returns families that are associated with a species... it doesn't return any families that are associated with kingdoms but not species.
After re-reading your question, this is actually what you wanted...
EDIT: On further thought, you could re-write your query like so:
select *
from
    especie e
    left join familia f
        on f.fnombre = e.efamilia
        and f.freino = e.ereino
    full join reino r
        on r.rnombre = f.freino
        and r.rnombre = e.ereino;
I think this would be preferable, because it eliminates the RIGHT JOIN, which is usually frowned upon as poor style, and the parentheses, which can be tricky for people to parse correctly to determine what the result will be.
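To illustrate why a LEFT JOIN from the kingdom side loses unclassified species, here is a reduced sketch in Python's sqlite3 module with only two tables and invented rows. Because older SQLite versions lack FULL JOIN, the sketch emulates it as the UNION of two LEFT JOINs; that emulation is my substitution, not something from the queries above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE reino (rnombre TEXT PRIMARY KEY);
    CREATE TABLE especie (enombre TEXT PRIMARY KEY, ereino TEXT);
    -- Invented rows: 'Fungi' has no species, 'desconocida' has no kingdom.
    INSERT INTO reino VALUES ('Animalia'), ('Fungi');
    INSERT INTO especie VALUES ('lobo', 'Animalia'), ('desconocida', NULL);
""")

# Keeps every kingdom, but a species without a kingdom has nothing to attach to.
left_rows = set(conn.execute("""
    SELECT r.rnombre, e.enombre
    FROM reino r
    LEFT JOIN especie e ON r.rnombre = e.ereino
""").fetchall())

# FULL JOIN emulated as the union of both LEFT JOIN directions.
full_rows = set(conn.execute("""
    SELECT r.rnombre, e.enombre
    FROM reino r
    LEFT JOIN especie e ON r.rnombre = e.ereino
    UNION
    SELECT r.rnombre, e.enombre
    FROM especie e
    LEFT JOIN reino r ON r.rnombre = e.ereino
""").fetchall())

print(full_rows - left_rows)  # {(None, 'desconocida')}
```

The only row the LEFT JOIN misses is the species with no kingdom, which is exactly the case the question asks about.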
In case this helps:
Relationally speaking, [OUTER JOIN is] a kind of shotgun marriage: It
forces tables into a kind of union—yes, I do mean union, not join—even
when the tables in question fail to conform to the usual requirements
for union. It does this, in effect, by padding one or
both of the tables with nulls before doing the union, thereby making
them conform to those usual requirements after all. But there's no
reason why that padding shouldn't be done with proper values instead
of nulls, as in this example:
SELECT SNO , PNO
FROM SP
UNION
SELECT SNO , 'nil' AS PNO
FROM S
WHERE SNO NOT IN ( SELECT SNO FROM SP )
The above is equivalent to:
SELECT SNO , COALESCE ( PNO , 'nil' ) AS PNO
FROM S NATURAL LEFT OUTER JOIN SP
Source:
SQL and Relational Theory: How to Write Accurate SQL Code By C. J. Date
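The equivalence in the quote can be checked on a toy data set. This is a sketch in Python's sqlite3 module; the S and SP rows are invented, and the comparison assumes SP has no duplicate rows and no null SNO values (otherwise UNION's duplicate removal and NOT IN's null handling could make the two forms differ).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE S (SNO TEXT PRIMARY KEY);
    CREATE TABLE SP (SNO TEXT, PNO TEXT);
    -- Invented rows: supplier S3 ships no parts.
    INSERT INTO S VALUES ('S1'), ('S2'), ('S3');
    INSERT INTO SP VALUES ('S1', 'P1'), ('S1', 'P2'), ('S2', 'P1');
""")

# Padding done explicitly with a proper value, via UNION.
union_rows = sorted(conn.execute("""
    SELECT SNO, PNO
    FROM SP
    UNION
    SELECT SNO, 'nil' AS PNO
    FROM S
    WHERE SNO NOT IN (SELECT SNO FROM SP)
""").fetchall())

# Padding done with nulls by the outer join, then replaced by COALESCE.
join_rows = sorted(conn.execute("""
    SELECT SNO, COALESCE(PNO, 'nil') AS PNO
    FROM S NATURAL LEFT OUTER JOIN SP
""").fetchall())

print(union_rows == join_rows)  # True
print(join_rows)
```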
If you want the query rewritten with only the slightest change from what you have, you can change the LEFT join to a FULL join. You can further remove the redundant parentheses and the r.rnombre = f.freino condition from the ON clause:
select *
from reino r
    full join --- instead of LEFT JOIN
        familia f
        right join especie e
            on f.fnombre = e.efamilia
            and f.freino = e.ereino
    on r.rnombre = e.ereino;
---removed the: r.rnombre = f.freino
Try to use this:
select *
from reino r
join especie e on (r.rnombre = e.ereino)
join familia f on (f.freino = e.ereino and f.fnombre = e.efamilia)
Could it be that you interchanged efamilia and enombre in table especie?