Oracle SQL - how to NOT SHOW athlete names that appear only once

I created a view called winners. It contains the columns athlete_name, year, medal_won.
It is basically the athletes who won an Olympic medal, and the year.
It looks like this.
The database is on Live SQL: https://livesql.oracle.com/apex/f?p=590:1000:0
select distinct year,athlete_name,medal
from olym.olym_medals
join olym.olym_athlete_games on olym_athlete_games.id = olym_medals.athlete_game_id
join olym.olym_nations on olym_nations.id = olym_athlete_games.nation_id
join olym.olym_games on olym_games.id = Olym_athlete_games.game_id
join olym.olym_athletes on olym_athletes.id = olym_athlete_games.athlete_id
order by athlete_name
As you can see, some names show up only once and some show up more than once. I want to get rid of all rows for those who show up ONLY ONCE. Please help me.
Thank you!

If I have understood your problem, you must group your data:
select year,athlete_name,medal, count(*) "number of Medals"
from olym.olym_medals
join olym.olym_athlete_games on olym_athlete_games.id = olym_medals.athlete_game_id
join olym.olym_nations on olym_nations.id = olym_athlete_games.nation_id
join olym.olym_games on olym_games.id = Olym_athlete_games.game_id
join olym.olym_athletes on olym_athletes.id = olym_athlete_games.athlete_id
group by year,athlete_name,medal;

If I followed you correctly, you can use window functions:
select *
from (
select og.year, oa.athlete_name, om.medal, count(*) over(partition by oa.id) cnt
from olym.olym_medals om
join olym.olym_athlete_games oag on oag.id = om.athlete_game_id
join olym.olym_nations ona on ona.id = oag.nation_id
join olym.olym_games og on og.id = oag.game_id
join olym.olym_athletes oa on oa.id = oag.athlete_id
) t
where cnt > 1
order by athlete_name
Notes:
I am unsure why you were using distinct in the first place, so I removed it (I suspect it is actually not needed)
I added table aliases to shorten the query, and prefixed the columns in the select clause with the table they belong to (you might want to review that) - these are best practices when dealing with multi-table queries
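To see the window-function approach on a small data set, here is a minimal sketch. It uses Python's built-in sqlite3 module instead of Oracle so that it can be run anywhere (it assumes an SQLite build with window functions, 3.25 or later). The winners table and its rows are made up for illustration and stand in for the joined result of the real query.

```python
import sqlite3

# Hypothetical stand-in for the joined result: one row per medal won.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (athlete_name TEXT, year INTEGER, medal TEXT)")
conn.executemany(
    "INSERT INTO winners VALUES (?, ?, ?)",
    [
        ("Ann", 2000, "Gold"),
        ("Ann", 2004, "Silver"),
        ("Bob", 2000, "Bronze"),  # appears only once, so it should be dropped
        ("Cid", 1996, "Gold"),
        ("Cid", 2000, "Gold"),
    ],
)

# Count the rows per athlete without collapsing them, then filter on the count.
repeat_winners = conn.execute(
    """
    SELECT year, athlete_name, medal
    FROM (
        SELECT w.*, COUNT(*) OVER (PARTITION BY athlete_name) AS cnt
        FROM winners w
    ) t
    WHERE cnt > 1
    ORDER BY athlete_name, year
    """
).fetchall()

for row in repeat_winners:
    print(row)
```

The sketch partitions by athlete_name only because the made-up table has no id column; partitioning by the athlete's id, as in the query above, is safer when two athletes can share a name.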

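Use GROUP BY and HAVING COUNT(*) > 1 (note that grouping by year, athlete_name and medal keeps only combinations that occur more than once, which is not the same as counting rows per athlete):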
SELECT year,
athlete_name,
medal
FROM olym.olym_medals
INNER JOIN olym.olym_athlete_games
ON olym_athlete_games.id = olym_medals.athlete_game_id
INNER JOIN olym.olym_nations
ON olym_nations.id = olym_athlete_games.nation_id
INNER JOIN olym.olym_games
ON olym_games.id = Olym_athlete_games.game_id
INNER JOIN olym.olym_athletes
ON olym_athletes.id = olym_athlete_games.athlete_id
GROUP BY
year,
athlete_name,
medal
HAVING COUNT(*) > 1
ORDER BY athlete_name
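A caveat worth checking on sample data: which columns go into GROUP BY decides what COUNT(*) counts. The sketch below uses Python's built-in sqlite3 module and a made-up winners table (both are assumptions, not part of the original schema). It compares grouping by all three columns with grouping by athlete only and then selecting that athlete's rows.

```python
import sqlite3

# Hypothetical data: Ann and Cid appear twice, Bob appears once.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE winners (athlete_name TEXT, year INTEGER, medal TEXT)")
conn.executemany(
    "INSERT INTO winners VALUES (?, ?, ?)",
    [
        ("Ann", 2000, "Gold"),
        ("Ann", 2004, "Silver"),
        ("Bob", 2000, "Bronze"),
        ("Cid", 1996, "Gold"),
        ("Cid", 2000, "Gold"),
    ],
)

# Grouping by every selected column: each group holds a single row here,
# so HAVING COUNT(*) > 1 filters everything out.
by_all_columns = conn.execute(
    """
    SELECT year, athlete_name, medal
    FROM winners
    GROUP BY year, athlete_name, medal
    HAVING COUNT(*) > 1
    """
).fetchall()

# Grouping by athlete only finds the repeat names; the outer query
# then returns all of their rows.
by_athlete = conn.execute(
    """
    SELECT year, athlete_name, medal
    FROM winners
    WHERE athlete_name IN (
        SELECT athlete_name
        FROM winners
        GROUP BY athlete_name
        HAVING COUNT(*) > 1
    )
    ORDER BY athlete_name, year
    """
).fetchall()

print(by_all_columns)
print(by_athlete)
```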

Related

Why is SQL not letting me access information in inner queries?

I'm writing a query to solve the following question:
"For those customer – product combinations where the product belongs to one of the product lines that have ‘Ethernet’ in their name, list the name of the customer, name of the product, the sales last year and total sales year to date."
Now, I have four tables that I need to use to solve this: xproduct, xprodline, xsales, and xcustomer. These are related in the following ways:
And they have the following columns:
Since xcustomer and xproduct are not directly related, I'm using xsales to join them, but I'm having issues accessing the information I get from inner queries. This is the code I have so far, but it throws "ORA-00904: "S"."SALES_YEAR_TO_DATE": invalid identifier":
SELECT xcustomer.cust_name, PL.prod_name, S.sales_last_year, S.sales_year_to_date FROM xcustomer
JOIN(
SELECT xsales.sales_cust_nbr FROM xsales
JOIN (
SELECT xproduct.prod_name, xprodline.prodline_pyear_sales, xprodline.prodline_ytd_sales, xproduct.prod_nbr FROM xproduct
INNER JOIN xprodline
ON xproduct.prod_prodline = xprodline.prodline_nbr
WHERE prod_prodline= 1
) PL
ON xsales.sales_prod_nbr = PL.prod_nbr
)S
ON S.sales_cust_nbr = xcustomer.cust_nbr;
Try this:
SELECT xcustomer.cust_name, PL.prod_name, S.sales_last_year, S.sales_year_to_date FROM xcustomer
JOIN(
SELECT xsales.sales_cust_nbr, xsales.sales_last_year, xsales.sales_year_to_date FROM xsales
JOIN (
SELECT xproduct.prod_name, xprodline.prodline_pyear_sales, xprodline.prodline_ytd_sales, xproduct.prod_nbr FROM xproduct
INNER JOIN xprodline
ON xproduct.prod_prodline = xprodline.prodline_nbr
WHERE prod_prodline= 1
) PL
ON xsales.sales_prod_nbr = PL.prod_nbr
)S
ON S.sales_cust_nbr = xcustomer.cust_nbr;
The subquery S was not selecting the columns that the outer query references.
You are missing one more column (PROD_NAME) in your subquery. Here is the updated query:
SELECT xcustomer.cust_name,
S.prod_name,
S.sales_last_year,
S.sales_year_to_date
FROM xcustomer
JOIN(
SELECT xsales.sales_cust_nbr, xsales.sales_last_year, xsales.sales_year_to_date, pl.prod_name
FROM xsales
JOIN (
SELECT xproduct.prod_name, xprodline.prodline_pyear_sales, xprodline.prodline_ytd_sales, xproduct.prod_nbr
FROM xproduct
INNER JOIN xprodline
ON xproduct.prod_prodline = xprodline.prodline_nbr
WHERE prod_prodline= 1
) PL
ON xsales.sales_prod_nbr = PL.prod_nbr
)S
ON S.sales_cust_nbr = xcustomer.cust_nbr;
Also, I think you can directly join the tables and simplify your query as follows:
SELECT C.cust_name,
P.prod_name,
S.sales_last_year,
S.sales_year_to_date
FROM xcustomer C
JOIN xsales S
ON C.cust_nbr = S.sales_cust_nbr
JOIN xproduct P
ON S.sales_prod_nbr = P.prod_nbr
JOIN xprodline PL
ON P.prod_prodline = PL.prodline_nbr
WHERE P.prod_prodline = 1;
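The rule behind the ORA-00904 error is that an outer query can only see the columns a derived table lists in its own SELECT. The following is a minimal sketch of that rule, using Python's built-in sqlite3 module rather than Oracle; the column types and the sample rows are assumptions, and SQLite reports the problem as "no such column" instead of ORA-00904.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE xcustomer (cust_nbr INTEGER, cust_name TEXT);
    CREATE TABLE xsales (sales_cust_nbr INTEGER, sales_prod_nbr INTEGER,
                         sales_last_year REAL, sales_year_to_date REAL);
    INSERT INTO xcustomer VALUES (1, 'Acme');
    INSERT INTO xsales VALUES (1, 10, 500.0, 120.0);
    """
)

# S exposes only sales_cust_nbr, so S.sales_year_to_date cannot be resolved.
try:
    conn.execute(
        """
        SELECT c.cust_name, S.sales_year_to_date
        FROM xcustomer c
        JOIN (SELECT sales_cust_nbr FROM xsales) S
          ON S.sales_cust_nbr = c.cust_nbr
        """
    )
    failed = False
except sqlite3.OperationalError as exc:
    failed = True
    print("rejected:", exc)

# Listing the column inside the derived table makes it visible outside.
fixed = conn.execute(
    """
    SELECT c.cust_name, S.sales_year_to_date
    FROM xcustomer c
    JOIN (SELECT sales_cust_nbr, sales_year_to_date FROM xsales) S
      ON S.sales_cust_nbr = c.cust_nbr
    """
).fetchall()
print(fixed)
```

The same applies to aliases of nested derived tables: an alias such as PL defined inside S is not visible outside S, so its columns have to be passed up through S's select list.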

LEFT JOIN not keeping only records that occur in a SELECT query

I have the following SQL select statement that I use to get a subset of products, or wines:
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
This query returns 3,905 rows. I want to get all the transactional data for these products.
At the moment I'm using this select statement
SELECT c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
LEFT JOIN Dim.Calendar AS c
ON f.SkDateId = c.SkDateId
LEFT JOIN (
SELECT pv.SkProdVariantId AS id,
pa.Colour AS colour
FROM Dim.ProductVariant AS pv
JOIN ProductAttributes_new AS pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) AS s
ON s.id = f.SkProductVariantId
WHERE c.CalDate LIKE '%2019%'
The calendar dates are correct, but the number of unique products returned is 5,648, rather than the expected 3,905 from the select query.
Why does my LEFT JOIN on the first select query not work as I expect it to, please?
Thanks for any help!
If you want all the rows from your query, it needs to be the first reference in the LEFT JOIN. Then, I am guessing that you want transactions in 2019:
select . . .
from (SELECT pv.SkProdVariantId AS id, pa.Colour AS colour
FROM Dim.ProductVariant pv JOIN
ProductAttributes_new pa
ON pv.SkProdVariantId = pa.SkProdVariantId
WHERE pv.ProdTypeName = 'Wines'
) s LEFT JOIN
(fact.FTransactions f JOIN
Dim.Calendar c
ON f.SkDateId = c.SkDateId AND
c.CalDate >= '2019-01-01' AND
c.CalDate < '2020-01-01'
)
ON s.id = f.SkProductVariantId;
Note that this assumes that CalDate is really a date and not a string. LIKE should only be used on strings.
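You seem to misunderstand how outer joins work: a LEFT JOIN keeps every row of the table on its left (here, all transactions), whether or not a match exists on the right. See Gordon's answer and my comment on it.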
As to the task: It seems you want to select transactions of 2019, but you want to restrict your results to wine products. We typically restrict query results in the WHERE clause. You can use IN or EXISTS for that.
SELECT
c.CalDate AS timestamp,
f.SkProductVariantId AS sku_id,
f.Quantity AS quantity
FROM fact.FTransactions AS f
INNER JOIN Dim.Calendar AS c ON f.SkDateId = c.SkDateId
WHERE DATEPART(YEAR, c.CalDate) = 2019
AND f.SkProductVariantId IN
(
SELECT pv.SkProdVariantId
FROM Dim.ProductVariant AS pv
WHERE pv.ProdTypeName = 'Wines'
);
(I've removed the join to ProductAttributes_new, because it doesn't seem to play any part in this query.)
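The difference between the two approaches can be checked on a few rows. This sketch uses Python's built-in sqlite3 module; the transactions and wines tables are simplified, hypothetical stand-ins for fact.FTransactions and the wine subquery.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE transactions (sku_id INTEGER, quantity INTEGER);
    CREATE TABLE wines (id INTEGER);
    INSERT INTO transactions VALUES (1, 5), (2, 3), (3, 7);
    INSERT INTO wines VALUES (1), (3);
    """
)

# LEFT JOIN keeps every transaction; non-wine rows just get NULL on the right.
left_joined = conn.execute(
    """
    SELECT t.sku_id, w.id
    FROM transactions t
    LEFT JOIN wines w ON w.id = t.sku_id
    ORDER BY t.sku_id
    """
).fetchall()

# Filtering with IN restricts the result to transactions of wine products.
filtered = conn.execute(
    """
    SELECT t.sku_id
    FROM transactions t
    WHERE t.sku_id IN (SELECT id FROM wines)
    ORDER BY t.sku_id
    """
).fetchall()

print(left_joined)
print(filtered)
```

In the left-joined result, product 2 is still present even though it is not a wine, which is why the original query returned more distinct products than the wine subquery contains.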

SELECT statement where rows are omitted based on another table

The orders table has a related table with positions. I want to show the orders, but with only the most up-to-date position for each. Below is a picture of the 3 rows I want to show; the rest should be omitted.
SELECT DispatchTable.ordernumber, DispatchTable.truck,
DispatchTable.driver, DispatchTable.actualpickup,
DispatchTable.actualdropoff, orders.pickupdateandtime,
orders.dropoffdateandtime, Truck002.lastposition,
Truck002.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN Truck002 ON DispatchTable.truck = Truck002.name
WHERE (orders.status = 'onRoute')
Assuming that you want the row having the latest lastdateandtime for the truck name, this should work:
SELECT DispatchTable.ordernumber,
DispatchTable.truck,
DispatchTable.driver,
DispatchTable.actualpickup,
DispatchTable.actualdropoff,
orders.pickupdateandtime,
orders.dropoffdateandtime,
TruckLatest.lastposition,
TruckLatest.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN (SELECT name,
lastposition,
lastdateandtime
FROM Truck002 Truck1
WHERE lastdateandtime =
(SELECT MAX(lastdateandtime)
FROM Truck002 Truck2
WHERE Truck2.name = Truck1.name)) TruckLatest
ON DispatchTable.truck = TruckLatest.name
WHERE (orders.status = 'onRoute')
If I understand correctly, you can get the most recent record for a truck using ROW_NUMBER():
SELECT dt.ordernumber, dt.truck,
dt.driver, dt.actualpickup,
dt.actualdropoff, o.pickupdateandtime,
o.dropoffdateandtime, t.lastposition,
t.lastdateandtime
FROM DispatchTable dt INNER JOIN
orders o
ON dt.ordernumber = o.id INNER JOIN
(SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY t.name ORDER BY t.lastdateandtime DESC) as seqnum
FROM Truck002 t
) t
ON dt.truck = t.name
WHERE o.status = 'onRoute' AND seqnum = 1;
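The ROW_NUMBER() step can be tried in isolation. The sketch below uses Python's built-in sqlite3 module (assuming an SQLite build with window functions, 3.25 or later). The Truck002 rows are invented, and timestamps are stored as ISO-formatted text so that they sort chronologically.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Truck002 (name TEXT, lastposition TEXT, lastdateandtime TEXT);
    INSERT INTO Truck002 VALUES
        ('T1', 'Depot',   '2017-05-01 08:00'),
        ('T1', 'Highway', '2017-05-01 09:30'),
        ('T2', 'Harbour', '2017-05-01 07:15'),
        ('T2', 'Depot',   '2017-05-01 06:00');
    """
)

# Number each truck's rows from newest to oldest and keep only row 1.
latest = conn.execute(
    """
    SELECT name, lastposition, lastdateandtime
    FROM (
        SELECT t.*,
               ROW_NUMBER() OVER (PARTITION BY t.name
                                  ORDER BY t.lastdateandtime DESC) AS seqnum
        FROM Truck002 t
    ) ranked
    WHERE seqnum = 1
    ORDER BY name
    """
).fetchall()

for row in latest:
    print(row)
```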
Firstly, why are you using Truck002's name field rather than its id field as the link to DispatchTable? This is considered less efficient than using id (which is either a numerical field or a shorter string than name).
Secondly, you should mention in your question that each order can have many DispatchTable rows and that each DispatchTable row can have many Truck002 rows; otherwise many people will start by assuming that it is the other way round between DispatchTable and Truck002.
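Thirdly, please try the following. Note that it selects columns that are neither grouped nor aggregated, which engines such as SQL Server and Oracle reject: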
SELECT DispatchTable.ordernumber,
DispatchTable.truck,
DispatchTable.driver,
DispatchTable.actualpickup,
DispatchTable.actualdropoff,
orders.pickupdateandtime,
orders.dropoffdateandtime,
Truck002.lastposition,
Truck002.lastdateandtime
FROM DispatchTable
INNER JOIN orders ON DispatchTable.ordernumber = orders.id
INNER JOIN Truck002 ON DispatchTable.truck = Truck002.name
WHERE (orders.status = 'onRoute')
GROUP BY ordernumber
HAVING lastdateandtime = MAX( lastdateandtime )
If you have any questions or comments, then please feel free to post a Comment accordingly.
Further Reading
https://msdn.microsoft.com/en-us/library/bb177906(v=office.12).aspx (on HAVING)
https://www.w3schools.com/sql/sql_having.asp (on HAVING)
https://msdn.microsoft.com/en-us/library/bb177905(v=office.12).aspx (on GROUP BY)
https://www.w3schools.com/sql/sql_groupby.asp (on GROUP BY)

Subquery with multiple joins involved

Still trying to get used to writing queries and I've run into a problem.
Select count(region)
where (regionTable.A=1) in
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
)
The inner query returns an ID number in one column, the number of times it appears in the table, and a bit attribute indicating whether it is in region A. The inner query works, but the error I get is "Incorrect syntax near the keyword 'IN'". From the inner query's result, I would like the number of IDs that are in region A.
You must specify a table in a FROM clause before WHERE:
Select count(region)
from table
where (regionTable.A=1) in
And you must choose one of these two forms of condition:
where regionTable.A = 1
or
where regionTable.A in (..)
Your query has several syntax errors. Based on your comments, I think there is no need for a subquery and you want this:
select jxn.id, count(jxn.id) as counts, regionTable.A
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id, regionTable.A
which can be further simplified to:
select jxn.id, count(jxn.id) as counts
, 1 as A --- you can even omit this line
from jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
where regionTable.A = 1
group by jxn.id
You are getting the error because of this line:
where (regionTable.A=1)
You cannot put a condition on the left side of IN; it should be an expression such as a column name.
Something like this may be what you want:
SELECT COUNT(*)
FROM
(
select jxn.id, count(jxn.id) as counts, regionTable.A
from
jxn inner join
V on jxn.id = V.id inner join
regionTable on v.regionID = regionTable.regionID
group by jxn.id, regionTable.A
) sq
WHERE sq.a = 1
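To check the derived-table count on concrete rows, here is a small sketch using Python's built-in sqlite3 module. The table and column names follow the question, while the column types and the data are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE jxn (id INTEGER);
    CREATE TABLE V (id INTEGER, regionID INTEGER);
    CREATE TABLE regionTable (regionID INTEGER, A INTEGER);
    INSERT INTO jxn VALUES (1), (1), (2), (3);
    INSERT INTO V VALUES (1, 10), (2, 20), (3, 10);
    INSERT INTO regionTable VALUES (10, 1), (20, 0);
    """
)

# The inner query yields one row per (id, A); the outer query counts
# how many of those rows are flagged as being in region A.
ids_in_region_a = conn.execute(
    """
    SELECT COUNT(*)
    FROM (
        SELECT jxn.id, COUNT(jxn.id) AS counts, regionTable.A
        FROM jxn
        INNER JOIN V ON jxn.id = V.id
        INNER JOIN regionTable ON V.regionID = regionTable.regionID
        GROUP BY jxn.id, regionTable.A
    ) sq
    WHERE sq.A = 1
    """
).fetchone()[0]

print(ids_in_region_a)
```

With this data, id 1 occurs twice in jxn but is counted once, because the inner GROUP BY collapses it to a single row before the outer COUNT(*) runs.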

Joining top records in T-SQL

SELECT MD.*, Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
JOIN (SELECT TOP 1 * FROM Location WHERE Location.BusID = MD.BusID) L ON L.BusID=MD.BusID
AND L.Deleted = 0
JOIN Contact ON Contact.ContactID = L.PrincipalID
I am using SQL Server 2008 and trying to write this SQL statement. There are sometimes multiple locations for a BusID and I want to join in only the first one found. I am getting an error on the part "Location.BusID = MD.BusID" because MD.BusID cannot be bound. Is it possible to use the MD table in the nested select statement in this join, or is there another way of accomplishing this?
I am contemplating using nested queries in the column list to grab the contact data directly there.
It would be simpler I think to have a subquery of the full result set:
SELECT MD.*, Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
JOIN (SELECT BusID, MAX(PrincipalID) AS PrincipalID
FROM Location
WHERE Deleted = 0
GROUP BY BusID) L ON L.BusID=MD.BusID
JOIN Contact ON Contact.ContactID = L.PrincipalID
You still get one record per BusID in the JOIN but it's not correlated.
SELECT MD.*, Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
CROSS APPLY (SELECT TOP 1 * FROM Location WHERE BusID = MD.BusID AND DELETED = 0) L
JOIN Contact ON Contact.ContactID = L.PrincipalID
This is a case of the "top n per group" problem. This question will guide you:
SQL Server query select 1 from each sub-group
You'll want to be doing something like this:
SELECT MD.* ,
Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
JOIN ( select * ,
seq = rank() over( partition by BusID order by BusID , ... )
from Location
where Deleted = 0
) L on L.BusID = MD.BusID
and seq = 1
JOIN Contact ON Contact.ContactID = L.PrincipalID
The virtual table expression should return at most 1 Location per BusID (0 if the BusID has no non-deleted Locations).
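One thing to watch with the ranking approach: RANK() gives tied rows the same rank, so the "at most 1" guarantee only holds if the ORDER BY columns break every tie, whereas ROW_NUMBER() always numbers rows uniquely. The sketch below illustrates this with Python's built-in sqlite3 module (assuming SQLite 3.25 or later). The LocationID column and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE Location (LocationID INTEGER, BusID INTEGER,
                           PrincipalID INTEGER, Deleted INTEGER);
    INSERT INTO Location VALUES
        (1, 100, 7, 0),
        (2, 100, 8, 0),
        (3, 200, 9, 0),
        (4, 200, 5, 1);
    """
)

# Ordering only by the partition column leaves every row in a group tied,
# so RANK() = 1 matches all non-deleted locations of a business.
with_rank = conn.execute(
    """
    SELECT BusID, PrincipalID
    FROM (
        SELECT l.*, RANK() OVER (PARTITION BY BusID ORDER BY BusID) AS seq
        FROM Location l
        WHERE Deleted = 0
    ) ranked
    WHERE seq = 1
    ORDER BY BusID, PrincipalID
    """
).fetchall()

# ROW_NUMBER() with a unique ordering column keeps exactly one row per business.
with_row_number = conn.execute(
    """
    SELECT BusID, PrincipalID
    FROM (
        SELECT l.*, ROW_NUMBER() OVER (PARTITION BY BusID
                                       ORDER BY LocationID) AS seq
        FROM Location l
        WHERE Deleted = 0
    ) ranked
    WHERE seq = 1
    ORDER BY BusID
    """
).fetchall()

print(with_rank)
print(with_row_number)
```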
To isolate the error, I would first try a plain join and see whether it can match Location.BusID = MD.BusID:
SELECT MD.*, Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
JOIN Location On Location.BusID = MD.BusID
You do not need every column, so instead of * use:
SELECT TOP 1 Location.BusID FROM Location WHERE Location.BusID = MD.BusID
Once you get the syntax working, be aware that the query will only match the "first" row and then check whether it is deleted. The problem is that without an ORDER BY the "first" row is arbitrary. Even with a clustered index on a table there is no guaranteed order without an ORDER BY clause. To get a repeatable answer you need a sort. But if you are sorting and only want the top row, then MAX or MIN with a GROUP BY is more straightforward.
If you want just the businesses that have one or more non-deleted locations, then the following should work, but you need to break out the columns for the GROUP BY. If two such locations have different contact names, it will report each. So this may not be what you are looking for.
SELECT MD.col1, MD.col2, Contact.FirstName
FROM MerchantData MD
JOIN Merchant M ON M.MerchID = MD.MerchID
JOIN Location L
ON L.BusID = MD.BusID
AND L.Deleted = 0
JOIN Contact ON Contact.ContactID = L.PrincipalID
GROUP BY MD.col1, MD.col2, Contact.FirstName