SQL (oracle) imposing hierarchy on where clause


Table Name: REG_NBRS
STATE     CITY   COUNTRY  REG_NBR
--------  -----  -------  ---------
ILLINOIS         USA      444333222
NEBRASKA         USA      111222333
NEW YORK         USA      333444555
FLORIDA          USA      666222666
          TAMPA  USA      888333888
I have data something like this, and I need to get the REG_NBR for that state or city. If a row matches on both state and city, the city takes precedence, and if it matches on neither state nor city, I still have to list the row with null for reg_nbr.
I tried to come up with a query but didn't have much success, as I don't know how to impose a precedence while doing an outer join.
SELECT C.NAME, C.AGE, RN.REG_NBR
FROM CUSTOMER C, REG_NBRS RN
WHERE C.COUNTRY = RN.COUNTRY
AND (C.STATE = RN.STATE OR C.CITY = RN.CITY)
AND C.ID BETWEEN 1000 AND 2000
As a beginner, I do not know how to join these two tables in such a way that
it joins first on STATE and City
But still list all 1000 rows
Put null to those registered numbers which are not from those states or cities
If both match, then use the State's Registered number (i.e. use 666222666 for Tampa even though we can find an entry for TAMPA)
I am sorry if this does not make sense, but I have tried to explain as much as possible. I have also tried different combinations of left outer join and right outer join, but couldn't work out how to impose a hierarchy on the WHERE conditions. I thought of UNIONs, but I think even a union would list 2 rows for a customer in Tampa, one for each REG_NBR.
Any suggestions?
APOLOGIZE for jumbled code as I am using a (not so) smart phone to post this question.

Assuming that you have unique keys in (Country, State) and (Country, City), you can do it in a simple way by joining twice:
SELECT
C.NAME, C.AGE,
COALESCE(RN1.REG_NBR, RN2.REG_NBR) AS REG_NBR
FROM CUSTOMER C
LEFT OUTER JOIN REG_NBRS RN1
ON RN1.COUNTRY = C.COUNTRY
AND RN1.STATE = C.STATE
LEFT OUTER JOIN REG_NBRS RN2
ON RN2.COUNTRY = C.COUNTRY
AND RN2.CITY = C.CITY
WHERE C.ID BETWEEN 1000 AND 2000
Moreover, this should be faster than the OR version: databases tend to handle OR conditions in a join poorly, whereas each of these equality joins can use an index (at least with proper indexing in place).
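The double-join approach can be tried end to end on a small in-memory database. This is only a sketch: it assumes SQLite through Python's sqlite3 module rather than Oracle, and the three customer rows are made up for illustration. Only the table and column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE REG_NBRS (STATE TEXT, CITY TEXT, COUNTRY TEXT, REG_NBR TEXT);
INSERT INTO REG_NBRS VALUES
  ('ILLINOIS', NULL,    'USA', '444333222'),
  ('FLORIDA',  NULL,    'USA', '666222666'),
  (NULL,       'TAMPA', 'USA', '888333888');

-- Hypothetical customers: state+city match, city-only match, no match.
CREATE TABLE CUSTOMER (ID INTEGER, NAME TEXT, AGE INTEGER,
                       STATE TEXT, CITY TEXT, COUNTRY TEXT);
INSERT INTO CUSTOMER VALUES
  (1000, 'Ann', 30, 'FLORIDA', 'TAMPA',  'USA'),
  (1001, 'Bob', 41, 'GEORGIA', 'TAMPA',  'USA'),
  (1002, 'Cy',  25, 'TEXAS',   'AUSTIN', 'USA');
""")

rows = conn.execute("""
    SELECT C.NAME,
           COALESCE(RN1.REG_NBR, RN2.REG_NBR) AS REG_NBR  -- state first, then city
    FROM CUSTOMER C
    LEFT OUTER JOIN REG_NBRS RN1
      ON RN1.COUNTRY = C.COUNTRY
     AND RN1.STATE = C.STATE
    LEFT OUTER JOIN REG_NBRS RN2
      ON RN2.COUNTRY = C.COUNTRY
     AND RN2.CITY = C.CITY
    WHERE C.ID BETWEEN 1000 AND 2000
    ORDER BY C.ID
""").fetchall()

for name, reg_nbr in rows:
    print(name, reg_nbr)
```

The argument order of COALESCE is what sets the precedence. As written, the state number wins when both match; swapping the arguments to COALESCE(RN2.REG_NBR, RN1.REG_NBR) would make the city number win instead.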

Related

Oracle SQL Select: getting duplicate results joining tables

I'm starting with SQL and doing some exercises and I'm completely stuck on the last one.
It is about looking for streets in the same country name. Two tables are used, locations and countries from the HR schema.
The problem is that I don't know how to avoid duplicate results. For example if I have the street "x" in Canada and the street "y" also in Canada, it shows me twice:
street x / Canada / street y
street y / Canada / street x
and I can't find a way to correct this.
My select is:
SELECT DISTINCT A.STREET_ADDRESS AS "CALLE A", C.COUNTRY_NAME, B.STREET_ADDRESS AS "CALLE B"
FROM HR.LOCATIONS A JOIN HR.LOCATIONS B ON (A.STREET_ADDRESS <> B.STREET_ADDRESS), HR.COUNTRIES C
WHERE A.COUNTRY_ID = B.COUNTRY_ID AND B.COUNTRY_ID = C.COUNTRY_ID
ORDER BY C.COUNTRY_NAME
I get this Result_
Any ideas? Thank you.

Use < instead of <>:
SELECT DISTINCT A.STREET_ADDRESS AS "CALLE A", C.COUNTRY_NAME, B.STREET_ADDRESS AS "CALLE B"
FROM HR.LOCATIONS A JOIN HR.LOCATIONS B ON (A.STREET_ADDRESS < B.STREET_ADDRESS), HR.COUNTRIES C
WHERE A.COUNTRY_ID = B.COUNTRY_ID AND B.COUNTRY_ID = C.COUNTRY_ID
ORDER BY C.COUNTRY_NAME

All the correct answers should be logically equivalent. You may want to use a form closer to this, however.
Ignore the alignment, unless you feel it helps readability (I happen to).
I'm referring more to the JOIN / ON form, as @jarlh mentioned: avoid comma-separated table expressions in the FROM clause (called a <table reference list>; a single <table reference> is usually sufficient). Also keep the join criteria (in the ON clauses) near the corresponding tables in the FROM clause, unless you have a particular reason to move some of the logic to the WHERE clause (try to avoid that unless necessary).
Logically, we can write these in lots of different ways. Some may be a little easier to write/read.
SELECT A.STREET_ADDRESS AS "CALLE A"
, C.COUNTRY_NAME
, B.STREET_ADDRESS AS "CALLE B"
FROM HR.LOCATIONS A
JOIN HR.LOCATIONS B
ON A.COUNTRY_ID = B.COUNTRY_ID
AND A.STREET_ADDRESS < B.STREET_ADDRESS
JOIN HR.COUNTRIES C
ON B.COUNTRY_ID = C.COUNTRY_ID
ORDER BY C.COUNTRY_NAME
;
The inequality, as @nbk points out, avoids the mirrored street pairs.
SELECT DISTINCT isn't really needed for this, if your join logic and selected detail is sufficient to identify unique locations. If not, the result probably isn't practically useful.
If this is a small subset of sample data and/or these street addresses are unique within each country, then you're also ok, and DISTINCT isn't needed.
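The effect of the inequality is easy to check on toy data. The sketch below assumes SQLite via Python's sqlite3 module and uses cut-down, hypothetical versions of the LOCATIONS and COUNTRIES tables (no HR schema prefix, only the columns the query needs).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE LOCATIONS (STREET_ADDRESS TEXT, COUNTRY_ID TEXT);
CREATE TABLE COUNTRIES (COUNTRY_ID TEXT, COUNTRY_NAME TEXT);
INSERT INTO COUNTRIES VALUES ('CA', 'Canada'), ('IT', 'Italy');
INSERT INTO LOCATIONS VALUES ('street x', 'CA'), ('street y', 'CA'), ('via z', 'IT');
""")

def pairs(op):
    """Self-join streets within a country, comparing addresses with `op`."""
    sql = f"""
        SELECT A.STREET_ADDRESS, C.COUNTRY_NAME, B.STREET_ADDRESS
        FROM LOCATIONS A
        JOIN LOCATIONS B
          ON A.COUNTRY_ID = B.COUNTRY_ID
         AND A.STREET_ADDRESS {op} B.STREET_ADDRESS
        JOIN COUNTRIES C
          ON B.COUNTRY_ID = C.COUNTRY_ID
        ORDER BY C.COUNTRY_NAME, A.STREET_ADDRESS
    """
    return conn.execute(sql).fetchall()

mirrored = pairs("<>")  # each pair appears twice: (x, y) and (y, x)
unique = pairs("<")     # each pair appears once, smaller address first

print(mirrored)
print(unique)
```

With <> every pair is matched from both sides, so it shows up twice; with < only the ordering where A sorts before B survives. The lone Italian street has no partner and appears in neither result.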

Sql Left or Right Join One To Many Pagination

I have one main table and join other tables to it via left outer or right outer joins. One row of the main table can produce over 30 rows in the join query's result. I am trying to paginate, but the problem is that I cannot know how many rows the query will return for one main-table row.
Example :
The first row of the main table produces 40 rows in my query.
The second row of the main table produces 120 rows.
Problem(Question) UPDATE:
For pagination I need to give a page size, which counts rows of the select result, but I cannot know the right row count for my select. For example, if I give page number 1 and page size 50, I do not get the right result. I need the page size that corresponds to the top 10 rows of my main table. The top 10 rows may produce 200 result rows while my page size is 50; this is the problem.
I am using Sql 2014. I need this for my ASP.NET project, but that is not important.
Sample UPDATE :
It is like searching for a hotel to book. The main table is the hotel table. The other things are images (mediatable), videos (mediatable), locations (placetable) and maybe comments (commenttable); each has more than one row per hotel, in a one-to-many relationship with the hotel. For one hotel, all this info may come to 100, 50 or 10 result rows. I am trying to paginate the hotel results: for performance, I always need to get 20, 30 or 50 hotels per page in my project.
Sample Query UPDATE :
SELECT
*
FROM
KisiselCoach KC
JOIN WorkPlace WP
ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
JOIN Album A
ON KC.KisiselCoachId = A.AlbumId
JOIN Media M
ON A.AlbumId = M.AlbumId
LEFT JOIN Rating R
ON KC.KisiselCoachId = R.OylananId
JOIN FrUser Fr
ON KC.CoachId = Fr.UserId
JOIN UserJob UJ
ON KC.KisiselCoachId = UJ.UserJobOwnerId
JOIN Job J
ON UJ.JobId = J.JobId
JOIN UserExpertise UserEx
ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
JOIN Expertise Ex
ON UserEx.ExpertiseId = Ex.ExpertiseId
Hotel Table :
HotelId  HotelName
1        Barcelona
2        Berlin
Media Table :
MediaID  MediaUrl     HotelId
1        www.xxx.com  1
2        www.xxx.com  1
3        www.xxx.com  1
4        www.xxx.com  1
Location Table :
LocationId  Adress          HotelId
1           xyz, Berlin     1
2           xyz, Nice       1
3           xyz, Sevilla    1
4           xyz, Barcelona  1
Comment Table :
CommentId  Comment           HotelId
1          you are cool      1
2          you are great     1
3          you are bad       1
4          hmm you are okey  1
This is only a sample! I have 9999999 hotels in my database. A given hotel may have 100 images or none; I cannot know this in advance. I need to get 20 hotels in my result (pagination), but 20 hotels may mean 1000 rows or 100 rows.

First, your query is hard to read: its layout does not show how the tables relate to each other. I have reformatted and indented it to try to show where each table sits in the hierarchy.
You also want to paginate; let's get back to that. Do you intend to show every record as a possible item, or to show a "parent"-level set of data (for example, only one instance per media item, per user, or whatever), and then show the details for one entity once it is selected? If so, I would query DISTINCT at the top level, or at least grab the few columns needed along with a count(*) of the child records to show at the next level.
Also, mixing inner, left and right joins can be confusing. Typically a right-join means you want the records from the right-table of the join. Could this be rewritten to have all required tables to the left, and non-required being left-join TO the secondary table?
Clarification of all these relationships would definitely help along with the context you are trying to get out of the pagination. I'll check for comments, but if lengthy, I would edit your original post question with additional details vs a long comment.
Here is my SOMEWHAT clarified query rewritten to what I THINK the relationships are within your database. Notice my indentations showing where table A -> B -> C -> D for readability. All of these are (INNER) JOINs indicating they all must have a match between all respective tables. If some things are NOT always there, they would be changed to LEFT JOINs
SELECT
      *
   FROM
      KisiselCoach KC
         JOIN WorkPlace WP
            ON KC.KisiselCoachId = WP.WorkPlaceOwnerId
         JOIN Album A
            ON KC.KisiselCoachId = A.AlbumId
            JOIN Media M
               ON A.AlbumId = M.AlbumId
         LEFT JOIN Rating R
            ON KC.KisiselCoachId = R.OylananId
         JOIN FrUser Fr
            ON KC.CoachId = Fr.UserId
         JOIN UserJob UJ
            ON KC.KisiselCoachId = UJ.UserJobOwnerId
            JOIN Job J
               ON UJ.JobId = J.JobId
         JOIN UserExpertise UserEx
            ON KC.KisiselCoachId = UserEx.UserExpertiseOwnerId
            JOIN Expertise Ex
               ON UserEx.ExpertiseId = Ex.ExpertiseId
Readability of a query is a BIG help for yourself and for anyone assisting or following you. Not having the ON clauses near their corresponding joins can make a query very confusing to follow.
Also, which is your PRIMARY table, with the rest being lookup/reference tables?
ADDITION PER COMMENT
Ok, so I updated a query which appears to have no context to the sample data and what you want in your post. That said, I would start with a list of hotels only and a count(*) of things per hotel so you can give SOME indication of how much stuff you have in detail. Something like
select
H.HotelID,
H.HotelName,
coalesce( MedSum.recs, 0 ) as MediaItems,
coalesce( LocSum.recs, 0 ) as NumberOfLocations,
coalesce( ComSum.recs, 0 ) as NumberOfComments
from
Hotel H
LEFT JOIN
( select M.HotelID,
count(*) recs
from Media M
group by M.HotelID ) MedSum
on H.HotelID = MedSum.HotelID
LEFT JOIN
( select L.HotelID,
count(*) recs
from Location L
group by L.HotelID ) LocSum
on H.HotelID = LocSum.HotelID
LEFT JOIN
( select C.HotelID,
count(*) recs
from Comment C
group by C.HotelID ) ComSum
on H.HotelID = ComSum.HotelID
order by
H.HotelName
--- apply any limit per pagination
This returns every hotel at the top level, along with a count of each kind of item per hotel. The counted items may or may not exist, hence each sub-query is LEFT JOINed. Show a page of 20 different hotels. Then, as soon as a person picks a single hotel, you can drill into the locations, media and comments for that one hotel.
Although this COULD work, computing these counts on every query might get very time consuming. You might want to add counter columns to your main hotel table holding the counts computed here. Prime them ONCE across all history, then, via some nightly process, update the counts only for those hotels that have had new activity since the prior day. You are not going to have 1,000,000 new images, locations and comments in a day; if it is 22,000, then those are the only hotel records whose counts you would update. Each incremental cycle would be short, since it touches only the newest entries. For the web, having pre-aggregated counts, sums, etc. is a big time saver where practical.
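If the joined detail rows are still wanted on the same page, one option is to apply the page limit to the hotel table alone, in a derived table, and only then join the one-to-many tables. The following is a sketch of that idea rather than the query from the answer above: it assumes SQLite through Python's sqlite3 module, the hotel 'Cadiz' and the helper hotel_page are made up, and only the Media table is joined to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hotel (HotelId INTEGER PRIMARY KEY, HotelName TEXT);
CREATE TABLE Media (MediaID INTEGER PRIMARY KEY, MediaUrl TEXT, HotelId INTEGER);
INSERT INTO Hotel VALUES (1, 'Barcelona'), (2, 'Berlin'), (3, 'Cadiz');
INSERT INTO Media (MediaUrl, HotelId) VALUES
  ('www.xxx.com', 1), ('www.xxx.com', 1), ('www.xxx.com', 1), ('www.xxx.com', 3);
""")

def hotel_page(page_no, page_size):
    """Return joined rows for one page of hotels (page_no starts at 1)."""
    sql = """
        SELECT H.HotelId, H.HotelName, M.MediaID
        FROM (SELECT HotelId, HotelName
              FROM Hotel
              ORDER BY HotelName, HotelId
              LIMIT ? OFFSET ?) H            -- the page is cut here, on hotels only
        LEFT JOIN Media M
          ON M.HotelId = H.HotelId           -- detail rows may multiply freely
        ORDER BY H.HotelName, H.HotelId, M.MediaID
    """
    return conn.execute(sql, (page_size, (page_no - 1) * page_size)).fetchall()

page1 = hotel_page(1, 2)
page2 = hotel_page(2, 2)
print(page1)
print(page2)
```

Each page then always covers exactly page_size hotels, however many detail rows they carry. LIMIT ... OFFSET is SQLite syntax; on SQL Server 2012 and later the same cut can be written with ORDER BY ... OFFSET n ROWS FETCH NEXT m ROWS ONLY.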

Oracle SQL SELECT equivalent to multiple AND

Maybe you can help me. I am currently studying SQL on the Oracle platform and I have several exercises.
One of them is too hard for me; I can't get it right...
This is the exercise:
Which countries are members of ALL organisations that country XY is also a member of?
I have multiple tables, but I think only one is necessary for this task.
Tablename: isMember ( abbreviation(fk), country(fk) )
So the table looks like:
Country / Abbreviation
USA / G-5
USA / G-7
USA / G-9
Canada / G-7
Canada / G-9
Norway / G-20
and so on....
How can I find every country in the list that is also a member of ALL the organizations that, for example, the USA is a member of?
Thank you very much!

This is a tricky question. One method is to construct every combination of a country with an organization the US belongs to. Then, for each country, count how many of those combinations actually exist in isMember and see if the two counts match:
select c.country
from (select distinct country from isMember) c cross join
(select abbreviation from isMember where country = 'USA') a left join
isMember im
on im.country = c.country and im.abbreviation = a.abbreviation
group by c.country
having count(*) = count(im.country); -- do all organizations match?
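To see the counting trick at work, the query can be run against the sample rows from the question. This sketch assumes SQLite via Python's sqlite3 module instead of Oracle, and adds a hypothetical country, Japan, that belongs to every organization so that the result contains more than the USA itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE isMember (abbreviation TEXT, country TEXT);
INSERT INTO isMember VALUES
  ('G-5', 'USA'), ('G-7', 'USA'), ('G-9', 'USA'),
  ('G-7', 'Canada'), ('G-9', 'Canada'),
  ('G-20', 'Norway'),
  -- hypothetical extra country, member of everything
  ('G-5', 'Japan'), ('G-7', 'Japan'), ('G-9', 'Japan'), ('G-20', 'Japan');
""")

result = conn.execute("""
    SELECT c.country
    FROM (SELECT DISTINCT country FROM isMember) c
    CROSS JOIN (SELECT abbreviation FROM isMember WHERE country = 'USA') a
    LEFT JOIN isMember im
      ON im.country = c.country
     AND im.abbreviation = a.abbreviation
    GROUP BY c.country
    HAVING COUNT(*) = COUNT(im.country)   -- required rows = rows actually found
    ORDER BY c.country
""").fetchall()

print(result)
```

COUNT(*) counts every required (country, organization) combination, while COUNT(im.country) skips the NULLs left by the outer join, so the two agree only for countries that miss none of the USA's organizations. Canada lacks G-5 and drops out. The USA matches itself; adding WHERE c.country <> 'USA' to the first subquery would exclude it if that is not wanted.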

Joining a selected table to a cross joined table

I have a table, flight_schedule, that consists of a bunch of flight segments. Below I have a Teradata SQL query that creates a list of two-segment itineraries between Chicago and Denver, i.e. each row has two flights that eventually get the passenger from Chicago to Denver. For example, the first row contains flight information on a leg from Chicago to Omaha and then a later leg from Omaha to Denver. This query works just fine.
SELECT A.flt_num, A.dprt_sta_cd, A.arrv_sta_cd, A.sch_dprt_dtml, A.sch_arrv_dtml,
B.flt_num, B.dprt_sta_cd, B.arrv_sta_cd, B.sch_dprt_dtml, B.sch_arrv_dtml
FROM
flight_schedule A
CROSS JOIN
flight_schedule B
WHERE
A.dprt_sta_cd = 'Chicago' AND
B.arrv_sta_cd = 'Denver' AND
A.arrv_sta_cd = B.dprt_sta_cd AND
A.sch_arrv_dtml < B.sch_dprt_dtml
ORDER BY B.sch_arrv_dtml;
I have another table, flight_seat_inventory, that consists of seats available in different cabins for each flight number. The query below aggregates total available seats for each flight number. This query is also A-OK.
SELECT flt_num, SUM(seat_cnt) as avail_seats
FROM flight_seat_inventory
GROUP BY flt_num;
I want to combine these two queries with a LEFT JOIN, twice, so that each flight has a corresponding avail_seats value. How can I do this?
For added clarity, I think my desired Select statement looks like this:
SELECT A.flt_num, A.dprt_sta_cd, A.arrv_sta_cd, A.sch_dprt_dtml, A.sch_arrv_dtml, C.avail_seats,
B.flt_num, B.dprt_sta_cd, B.arrv_sta_cd, B.sch_dprt_dtml, B.sch_arrv_dtml, D.avail_seats
flight_schedule is HUGE, so I suspect it's more efficient to do the LEFT JOIN after the CROSS JOIN. Again, using Teradata SQL.
Thanks!

I needed to declare the second (seats) query as a common table expression, using a WITH clause, before doing the LEFT JOINs:
WITH tempSeatsTable AS (
SELECT flt_num, SUM(seat_cnt) as avail_seats
FROM flight_seat_inventory
GROUP BY flt_num
)
SELECT
A.flt_num, A.dprt_sta_cd, A.arrv_sta_cd, A.sch_dprt_dtml, A.sch_arrv_dtml, C.avail_seats,
B.flt_num, B.dprt_sta_cd, B.arrv_sta_cd, B.sch_dprt_dtml, B.sch_arrv_dtml, D.avail_seats
FROM
flight_schedule A
CROSS JOIN
flight_schedule B
LEFT JOIN
tempSeatsTable C
ON A.flt_num = C.flt_num
LEFT JOIN
tempSeatsTable D
ON B.flt_num = D.flt_num
WHERE
A.dprt_sta_cd = 'Chicago' AND
B.arrv_sta_cd = 'Denver' AND
A.arrv_sta_cd = B.dprt_sta_cd AND
A.sch_arrv_dtml < B.sch_dprt_dtml
ORDER BY B.sch_arrv_dtml;
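The same shape can be exercised on a few rows. The sketch below assumes SQLite via Python's sqlite3 module rather than Teradata; the flights, the timestamps, the cabin column and the output aliases first_leg_seats and second_leg_seats are all made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE flight_schedule (flt_num INTEGER, dprt_sta_cd TEXT, arrv_sta_cd TEXT,
                              sch_dprt_dtml TEXT, sch_arrv_dtml TEXT);
CREATE TABLE flight_seat_inventory (flt_num INTEGER, cabin TEXT, seat_cnt INTEGER);
INSERT INTO flight_schedule VALUES
  (10, 'Chicago', 'Omaha',  '2024-01-01 08:00', '2024-01-01 09:30'),
  (20, 'Omaha',   'Denver', '2024-01-01 11:00', '2024-01-01 12:30'),
  (30, 'Omaha',   'Denver', '2024-01-01 09:00', '2024-01-01 10:30'),  -- leaves too early
  (40, 'Omaha',   'Denver', '2024-01-01 13:00', '2024-01-01 14:30');  -- no inventory rows
INSERT INTO flight_seat_inventory VALUES (10, 'F', 4), (10, 'Y', 50), (20, 'Y', 12);
""")

itineraries = conn.execute("""
    WITH tempSeatsTable AS (
        SELECT flt_num, SUM(seat_cnt) AS avail_seats
        FROM flight_seat_inventory
        GROUP BY flt_num
    )
    SELECT A.flt_num, C.avail_seats AS first_leg_seats,
           B.flt_num, D.avail_seats AS second_leg_seats
    FROM flight_schedule A
    CROSS JOIN flight_schedule B
    LEFT JOIN tempSeatsTable C ON A.flt_num = C.flt_num   -- seats on the first leg
    LEFT JOIN tempSeatsTable D ON B.flt_num = D.flt_num   -- seats on the second leg
    WHERE A.dprt_sta_cd = 'Chicago'
      AND B.arrv_sta_cd = 'Denver'
      AND A.arrv_sta_cd = B.dprt_sta_cd
      AND A.sch_arrv_dtml < B.sch_dprt_dtml
    ORDER BY B.sch_arrv_dtml
""").fetchall()

print(itineraries)
```

The CTE is referenced twice under two aliases, once per leg. Because both joins are LEFT JOINs, an itinerary whose second leg has no inventory rows is still listed, with NULL for that leg's seat count.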

finding people with possible incorrectly spelled cities where zip codes match

I am trying to create a report that will return a list of people whose cities most likely need to be corrected.
I was thinking of comparing the data against other data within the table to leverage the assumption that most of the cities are spelled correctly. Take Albuquerque, for example. We have records for many of the zip codes, but the city isn't always spelled correctly.
I can't figure out my next step.
Here's what I have started with:
SELECT city, zip_5_digits, COUNT(*) AS "COUNT"
FROM people
INNER JOIN addresses
ON addresses.people_id = people.id
AND city LIKE 'Albu%que'
GROUP BY city, zip_5_digits
Doing this results in
Albuqureque 87108 1
Albuquerque 87108 238
Albuqerque 87109 1
Albuquerque 87109 34
What I'd like to do is, for each row, find the maximum records where the zip code matches but the city does not match. If there is no match, I want to return that record, and I'll use this to return people's id and names, since I most likely need to correct the name of the city for those people who have it mis-spelled.

This is hard, because some "cities" have very few residents. And, some zip codes might just have a small part of a city.
I would recommend two rules:
Look at zip codes that have at least a certain number of people -- say 100.
Look at cities in the zip code that have less than some number -- say 5.
These are candidates for misspellings:
SELECT pa.*
FROM (SELECT city, zip_5_digits, COUNT(*) AS cnt,
MAX(COUNT(*)) OVER (PARTITION BY zip_5_digits) as max_cnt,
SUM(COUNT(*)) OVER (PARTITION BY zip_5_digits) as sum_cnt
FROM people p INNER JOIN
addresses a
ON a.people_id = p.id
GROUP BY city, zip_5_digits
) pa
WHERE sum_cnt >= 100 AND cnt <= 5;
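The two rules can be checked against the counts shown in the question. This sketch assumes SQLite via Python's sqlite3 module and, to avoid depending on window-function support, computes the per-zip total with a second grouped subquery instead of SUM(COUNT(*)) OVER; that is a substitution for illustration, not the query above. The people table is left out and the helper name candidates is made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE addresses (city TEXT, zip_5_digits TEXT)")

# Rebuild the counts from the question: 238 + 1 rows in 87108, 34 + 1 in 87109.
sample = (
    [("Albuquerque", "87108")] * 238 + [("Albuqureque", "87108")]
    + [("Albuquerque", "87109")] * 34 + [("Albuqerque", "87109")]
)
conn.executemany("INSERT INTO addresses (city, zip_5_digits) VALUES (?, ?)", sample)

def candidates(min_zip_total, max_city_count):
    """Rare city spellings inside zip codes that have enough rows overall."""
    sql = """
        SELECT g.city, g.zip_5_digits, g.cnt
        FROM (SELECT city, zip_5_digits, COUNT(*) AS cnt
              FROM addresses
              GROUP BY city, zip_5_digits) g
        JOIN (SELECT zip_5_digits, COUNT(*) AS sum_cnt
              FROM addresses
              GROUP BY zip_5_digits) z
          ON z.zip_5_digits = g.zip_5_digits
        WHERE z.sum_cnt >= ? AND g.cnt <= ?
        ORDER BY g.zip_5_digits, g.city
    """
    return conn.execute(sql, (min_zip_total, max_city_count)).fetchall()

strict = candidates(100, 5)
loose = candidates(30, 5)
print(strict)
print(loose)
```

With the suggested thresholds only the 87108 misspelling is flagged, because 87109 has just 35 rows in this sample; lowering the zip-code threshold picks up the second one as well. The thresholds trade missed misspellings against false alarms from genuinely small places.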
