SQL INNER JOIN DISTINCT [closed]

Closed 10 years ago.
I have the tables PRODUCTS and LISTINGS. When doing the following query:
SELECT DISTINCT *
FROM products
INNER JOIN listings
ON products.product_number=listings.product_number
This is the "search" functionality:
WHERE products.product_number !=''
AND listings.monthly_price BETWEEN '0' AND '10'
This returns a double entry of one of the product listings. Why isn't DISTINCT working?
EDIT
Products:
product_number, make, model, model_number, colour, processor, battery_standby, battery_talk, camera, flash, screen_size, screen_res, memory
Listings:
listing_number, featured, date, member_id, network, length, product_number, monthly_price, minutes, texts, data, image1
Essentially I'd like to create result rows matching the listings tables via their PRODUCT_NUMBER to the product table. It's for a search function of a phone listings website to be more precise.
To be much more specific, the search function uses the products table to search, then the listings table to show the useful information about the phone listing.
ANSWER
SELECT DISTINCT *
FROM listings
INNER JOIN products
ON products.product_number=listings.product_number
The above did the trick; simply swapping the tables round. I also inserted a few more rows into listings, and the "problem" vanished. Even if it's not solved, it isn't happening anymore... Not sure what the problem was.

I believe you're just expecting something from DISTINCT that doesn't work that way....
Assume you have a table Products with ID and Name, and a table Listings with ID, ProductID (FK to Products), and ListingDate (just to make things a bit simpler here...)
Assume furthermore that your table Products has entries:
ID Name
1 Foobar
2 Bazfoo
and table Listings has entries
ID ProductID ListingDate
1 1 2012-01-01
2 1 2012-03-01
3 2 2012-04-01
If you join these two tables and apply a DISTINCT
SELECT DISTINCT ProdID = p.ID, p.Name, ListingID = l.ID, l.ListingDate
FROM dbo.Products p
INNER JOIN dbo.Listings l ON l.ProductID = p.ID
what results do you expect??
The result will be:
ProdID Name ListingID ListingDate
1 Foobar 1 2012-01-01
1 Foobar 2 2012-03-01
2 Bazfoo 3 2012-04-01
The DISTINCT keyword is applied to all columns in the select list: a row is filtered out only if every one of its column values is identical to another row's.
From your comments, I'm led to believe that you're expecting that the "duplicate" product with ID = 1 and Name = Foobar should be excluded. This is NOT the case - see the result set - if you look at all four columns, those two rows with ProdID = 1 are NOT identical - therefore, they will both show up.
That's just the way the DISTINCT keyword is defined to work.
If you want to "filter out" the duplicate product with ID=1 - which of the two entries in the Listings table are you expecting to be shown in the result set?
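To see this for yourself, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database (chosen only so the example runs anywhere; the original answer targets SQL Server). The table and column names follow the example above. The second query, which selects only the product columns, is my own illustration and not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Products (ID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Listings (ID INTEGER PRIMARY KEY, ProductID INTEGER, ListingDate TEXT);
    INSERT INTO Products VALUES (1, 'Foobar'), (2, 'Bazfoo');
    INSERT INTO Listings VALUES
        (1, 1, '2012-01-01'),
        (2, 1, '2012-03-01'),
        (3, 2, '2012-04-01');
""")

# DISTINCT over product AND listing columns: every row differs in at least
# one column, so no row is removed.
all_columns = conn.execute("""
    SELECT DISTINCT p.ID AS ProdID, p.Name, l.ID AS ListingID, l.ListingDate
    FROM Products p
    INNER JOIN Listings l ON l.ProductID = p.ID
    ORDER BY l.ID
""").fetchall()

# DISTINCT over product columns only: the two Foobar rows are now identical
# and collapse into one.
product_columns = conn.execute("""
    SELECT DISTINCT p.ID AS ProdID, p.Name
    FROM Products p
    INNER JOIN Listings l ON l.ProductID = p.ID
    ORDER BY p.ID
""").fetchall()

print(all_columns)
print(product_columns)
```

So DISTINCT only collapses the product once the listing columns are left out of the select list.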

If two listings join to a single product, that would produce what you are seeing:
The SELECT DISTINCT is applied to the result of the inner join.
I'd select * from each table for the common join value and compare the results.
HTH
Ian

Related

Join on non unique column [closed]

Closed 2 years ago.
I have two tables in a DB and I want to select and join some data from these tables.
The first table has some customers:
customer id
Dave 1
Tom 2
The second table has a list of products and a column that indicates which customer bought that Product:
id product isin customer_id
1 PC XV452889 1
2 phone VN865232 2
3 laptop PL201325 1
I tried an INNER JOIN in order to get, as output, a table that lists who the customer was for each product that has been bought.
Desired output:
id product isin customer_id customer
1 PC XV452889 1 Dave
2 phone VN865232 2 Tom
3 laptop PL201325 1 Dave
I tried an inner join but the result is empty; it's as if you can't join two tables on a non-unique column. I have been trying to solve this for two days.
select table2.product, table2.customer_id
from table2
inner join table1 on table1.id = table2.customer_id;
Here is the query I am running on similar tables (the allocations table is equivalent to the second table of products above, and orders is equivalent to the customer table):

You can join the customer.id on product.customer_id:
SELECT p.*, c.customer
FROM products p
JOIN customer c on p.customer_id = c.id
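Joining on a non-unique column is fine: each product row is simply paired with its matching customer. Below is a small sketch with Python's built-in sqlite3 module and the sample data from the question. The table names customer and products and the column name customer_id are assumptions taken from the query above, since the question itself calls the tables table1 and table2.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (id INTEGER PRIMARY KEY, customer TEXT);
    CREATE TABLE products (
        id INTEGER PRIMARY KEY,
        product TEXT,
        isin TEXT,
        customer_id INTEGER  -- not unique: Dave bought two products
    );
    INSERT INTO customer VALUES (1, 'Dave'), (2, 'Tom');
    INSERT INTO products VALUES
        (1, 'PC', 'XV452889', 1),
        (2, 'phone', 'VN865232', 2),
        (3, 'laptop', 'PL201325', 1);
""")

# One output row per product, each carrying the name of its customer.
rows = conn.execute("""
    SELECT p.id, p.product, p.isin, p.customer_id, c.customer
    FROM products p
    JOIN customer c ON p.customer_id = c.id
    ORDER BY p.id
""").fetchall()

for row in rows:
    print(row)
```

If a join like this comes back empty on real data, the cause is more likely in the data or elsewhere in the query than in the non-unique join column.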

THANK YOU ALL FOR YOUR ANSWERS about the join!!!
I found the error in the query: it was missing a ',' after orders.instructions...
too long queries ):

How to select last entry per ID in SQL [duplicate]

Closed 6 years ago.
I have a big log table with 2 million rows, give or take.
I am looking for the last entry for each id.
The columns of importance are
Userid
Actiontype
Actiontime
Text2
Some userids show up thousands of times, some just show up once. I need the most recent entry for each userid. I tried to use GROUP BY, but it won't work because text2 is different for each entry, and that is really the data I need. So the rows need to be ordered by actiontime, and actiontype needs to be 103. I am really at a loss how to do this.
Any help would be appreciated.

Select B.*
From (
Select UserID,ActionTime=max(ActionTime)
From SomeTable
Group By UserID
) A
Join SomeTable B on A.UserID=B.UserID and A.ActionTime=B.ActionTime
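The query above does not apply the actiontype = 103 condition from the question. Here is a hedged sketch of the same max-per-group join with that filter added, run through Python's built-in sqlite3 module. The table name log and the sample rows are made up for illustration; only the column names come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE log (userid INTEGER, actiontype INTEGER, actiontime TEXT, text2 TEXT);
    INSERT INTO log VALUES
        (1, 103, '2020-01-01 10:00', 'a'),
        (1, 103, '2020-01-02 10:00', 'b'),
        (1, 200, '2020-01-03 10:00', 'c'),  -- newer, but wrong actiontype
        (2, 103, '2020-01-01 09:00', 'd'),
        (3, 200, '2020-01-05 08:00', 'e');  -- user with no type-103 rows
""")

# Step 1 (derived table a): latest actiontime per userid among type-103 rows.
# Step 2: join back to the log to pick up text2 for exactly that row.
latest = conn.execute("""
    SELECT b.userid, b.actiontime, b.text2
    FROM (
        SELECT userid, MAX(actiontime) AS actiontime
        FROM log
        WHERE actiontype = 103
        GROUP BY userid
    ) a
    JOIN log b
      ON b.userid = a.userid
     AND b.actiontime = a.actiontime
     AND b.actiontype = 103
    ORDER BY b.userid
""").fetchall()

print(latest)
```

Note that if a user has two type-103 rows with exactly the same actiontime, both come back; the join cannot break the tie on its own.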

SQL JOIN returning multiple rows when I only want one row

I am having a slow brain day...
The tables I am joining:
Policy_Office:
PolicyNumber OfficeCode
1 A
2 B
3 C
4 D
5 A
Office_Info:
OfficeCode AgentCode OfficeName
A 123 Acme
A 456 Acme
A 789 Acme
B 111 Ace
B 222 Ace
B 333 Ace
... ... ....
I want to perform a search to return all policies that are affiliated with an office name. For example, if I search for "Acme", I should get two policies: 1 & 5.
My current query looks like this:
SELECT
*
FROM
Policy_Office P
INNER JOIN Office_Info O ON P.OfficeCode = O.OfficeCode
WHERE
O.OfficeName = 'Acme'
But this query returns multiple rows, which I know is because there are multiple matches from the second table.
How do I write the query to only return two rows?

SELECT DISTINCT a.PolicyNumber
FROM Policy_Office a
INNER JOIN Office_Info b
ON a.OfficeCode = b.OfficeCode
WHERE b.officeName = 'Acme'
To further gain more knowledge about joins, kindly visit the link below:
Visual Representation of SQL Joins

A simple join returns every pairing of rows that satisfies the join condition. Office code A appears twice in the first table and three times in the second, so you get 2 × 3 = 6 rows for Acme. If you only want the policy number, apply DISTINCT to it.
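The row multiplication is easy to reproduce. This is a sketch using Python's built-in sqlite3 module and the sample rows from the question (the elided "..." rows are left out). The EXISTS variant at the end is an alternative I am adding for comparison; it is not from the answers above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Policy_Office (PolicyNumber INTEGER, OfficeCode TEXT);
    CREATE TABLE Office_Info (OfficeCode TEXT, AgentCode INTEGER, OfficeName TEXT);
    INSERT INTO Policy_Office VALUES (1, 'A'), (2, 'B'), (3, 'C'), (4, 'D'), (5, 'A');
    INSERT INTO Office_Info VALUES
        ('A', 123, 'Acme'), ('A', 456, 'Acme'), ('A', 789, 'Acme'),
        ('B', 111, 'Ace'),  ('B', 222, 'Ace'),  ('B', 333, 'Ace');
""")

# Plain join: each of the 2 Acme policies pairs with each of the 3 Acme agents.
joined = conn.execute("""
    SELECT P.PolicyNumber, O.AgentCode
    FROM Policy_Office P
    INNER JOIN Office_Info O ON P.OfficeCode = O.OfficeCode
    WHERE O.OfficeName = 'Acme'
""").fetchall()

# DISTINCT on the policy number collapses the repeats.
distinct_policies = [r[0] for r in conn.execute("""
    SELECT DISTINCT P.PolicyNumber
    FROM Policy_Office P
    INNER JOIN Office_Info O ON P.OfficeCode = O.OfficeCode
    WHERE O.OfficeName = 'Acme'
    ORDER BY P.PolicyNumber
""")]

# EXISTS never multiplies rows in the first place.
exists_policies = [r[0] for r in conn.execute("""
    SELECT P.PolicyNumber
    FROM Policy_Office P
    WHERE EXISTS (
        SELECT 1 FROM Office_Info O
        WHERE O.OfficeCode = P.OfficeCode AND O.OfficeName = 'Acme'
    )
    ORDER BY P.PolicyNumber
""")]

print(len(joined), distinct_policies, exists_policies)
```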

(using MS-Sqlserver)
I know this thread is 10 years old, but I don't like distinct (in my head it means that the engine gathers all possible data, computes every selected row in each record into a hash and adds it to a tree ordered by that hash; I may be wrong, but it seems inefficient).
Instead, I use CTE and the function row_number(). The solution may very well be a much slower approach, but it's pretty, easy to maintain and I like it:
Given are a person table and a telephone table tied together with a foreign key (in the telephone table). This construct means that a person can have several numbers, but I only want the first, so that each person appears only once in the result set (I ought to be able to concatenate multiple telephone numbers into one string (pivot, I think), but that's another issue).
; -- don't forget this one!
with telephonenumbers
as
(
select [id]
, [person_id]
, [number]
, row_number() over (partition by [person_id] order by [activestart] desc) as rowno
from [dbo].[telephone]
where ([activeuntil] is null or [activeuntil] > getdate())
)
select p.[id]
,p.[name]
,t.[number]
from [dbo].[person] p
left join telephonenumbers t on t.person_id = p.id
and t.rowno = 1
This does the trick (in fact the last line does), and the syntax is readable and easy to expand. The example is simple, but when creating large scripts that join tables left and right (literally), it is difficult to keep unwanted duplicates out of the result, and difficult to identify which tables create them. CTEs work great for me.
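For anyone who wants to try the pattern without a SQL Server instance, here is a rough port to SQLite via Python's built-in sqlite3 module. It assumes SQLite 3.25 or newer for window functions, replaces getdate() with a bound parameter, and uses sample rows that I invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE telephone (
        id INTEGER PRIMARY KEY,
        person_id INTEGER REFERENCES person(id),
        number TEXT,
        activestart TEXT,
        activeuntil TEXT
    );
    INSERT INTO person VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cid');
    INSERT INTO telephone VALUES
        (1, 1, '111', '2020-01-01', NULL),
        (2, 1, '222', '2021-01-01', NULL),          -- Ann's newest active number
        (3, 2, '333', '2019-01-01', '2019-12-31'),  -- expired
        (4, 2, '444', '2018-01-01', NULL);
""")

as_of = "2022-01-01"  # stands in for getdate() in the original

# Number each person's active telephones, newest first, and keep only rowno = 1.
rows = conn.execute("""
    WITH telephonenumbers AS (
        SELECT id, person_id, number,
               ROW_NUMBER() OVER (
                   PARTITION BY person_id ORDER BY activestart DESC
               ) AS rowno
        FROM telephone
        WHERE activeuntil IS NULL OR activeuntil > ?
    )
    SELECT p.id, p.name, t.number
    FROM person p
    LEFT JOIN telephonenumbers t
           ON t.person_id = p.id
          AND t.rowno = 1
    ORDER BY p.id
""", (as_of,)).fetchall()

print(rows)
```

Because the rowno = 1 condition sits in the ON clause of the left join, a person without any active number (Cid here) still appears once, with a NULL number.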

Join a table to bring in another field [closed]

Closed 10 years ago.
I currently have this query:
select f.chainid,count(f.player_uuid) as Favorites
from deals_player_favorite f
group by f.chainid
order by 2 desc
Which results in:
CHAINID FAVORITES
25 2771
2207 2282
3940 1954
etc...
I have another table called deals_deals, which also includes the CHAINID field. From this table, I want to join a field called VENUE in, so that each CHAIN ID has a Venue description, and the output would look like this
CHAINID VENUE FAVORITES
25 Amazon.com 2771
2207 Walmart 2282
3940 CVS 1954
etc...
How would I properly join the venue field into the query, using CHAINID as the key that is in both the deals_deals table and the deals_player_favorite table?
I tried an inner join which resulted in way too many results.
The deals_deals table has the fields CHAINID and VENUE..
The deals_player_favorite table has the fields CHAINID and PLAYER_UUID, but it does not include all of the CHAINIDs that the deals_deals table has, only the ones that have been accessed by a player_uuid.
SAMPLE DATA:
deals_deals table
VENUE CHAINID
Walmart 235
Aeropostale 1467
Checker's 881
deals_player_favorite table
PLAYER_UUID CHAINID
23rjior23-32fjdf 235
keep in mind that deals_player_favorite only includes specific CHAINIDs that have been clicked on, not ALL chainids....

SELECT F.chainid, V.Venue, COUNT(f.player_uuid) as Favorites
FROM deals_player_favorite F
INNER JOIN deals_deals V
ON F.chainid = V.chainid
GROUP BY F.chainid, V.Venue
ORDER BY COUNT(f.player_uuid) DESC

If your problem is that you are getting too many records in your count, then you might want to consider using a subquery and then joining the subquery to get the venue:
select f.chainid,
v.venue,
f.Favorites
from
(
select chainid, count(player_uuid) Favorites
from deals_player_favorite
group by chainid
) f
inner join deals_deals v
on f.chainid = v.chainid
The subquery will get your total favorites first, then using the chainid you will get the venue
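One guess at why the plain inner join produced "way too many results" is that deals_deals holds several rows per CHAINID (for example, one row per deal). That is an assumption, not something the question states. Under that assumption, the sketch below (Python's built-in sqlite3 module, invented sample rows) shows the count being inflated by a direct join, and a variant that aggregates first and joins to a de-duplicated venue list. The SELECT DISTINCT derived table is my addition on top of the subquery shown above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE deals_deals (venue TEXT, chainid INTEGER);
    CREATE TABLE deals_player_favorite (player_uuid TEXT, chainid INTEGER);
    -- Assumed: chain 235 has two deal rows, so it appears twice here.
    INSERT INTO deals_deals VALUES
        ('Walmart', 235), ('Walmart', 235), ('Aeropostale', 1467);
    INSERT INTO deals_player_favorite VALUES
        ('u1', 235), ('u2', 235), ('u3', 235), ('u4', 1467);
""")

# Joining before counting: each favorite is counted once per deal row.
inflated = conn.execute("""
    SELECT f.chainid, d.venue, COUNT(f.player_uuid) AS favorites
    FROM deals_player_favorite f
    INNER JOIN deals_deals d ON f.chainid = d.chainid
    GROUP BY f.chainid, d.venue
    ORDER BY favorites DESC
""").fetchall()

# Counting first, then joining to one row per (chainid, venue).
correct = conn.execute("""
    SELECT f.chainid, v.venue, f.favorites
    FROM (
        SELECT chainid, COUNT(player_uuid) AS favorites
        FROM deals_player_favorite
        GROUP BY chainid
    ) f
    INNER JOIN (
        SELECT DISTINCT chainid, venue FROM deals_deals
    ) v ON v.chainid = f.chainid
    ORDER BY f.favorites DESC
""").fetchall()

print(inflated)
print(correct)
```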

SQL count distinct values for records but filter some dups

I have an MS SQL 2008 table of survey responses and I need to produce some reports. The table is fairly basic: it has an autonumber key, a user ID for the person responding, a date, and then a bunch of fields for each individual question. Most of the questions are multiple choice, and the data value in the response field is a short varchar text representation of that choice.
What I need to do is count the number of distinct responses for each choice option (ie. for question 1, 10 people answered A, 20 answered B, and so forth). That is not overly complex. However, the twist is that some people have taken the survey multiple times (so they would have the same User ID field). For these responses, I am only supposed to include the latest data in my report (based on the survey date field). What would be the best way to exclude the older survey records for those users that have multiple records?

Since you didn't give us your DB schema I've had to make some assumptions but you should be able to use row_number to identify the latest survey taken by a user.
with cte as
(
SELECT
Row_number() over (partition by userID, surveyID order by id desc) rn,
surveyID
FROM
User_survey
)
SELECT
a.answer_type,
Count(a.answer) answercount
FROM
cte
INNER JOIN Answers a
ON cte.surveyID = a.surveyID
WHERE
cte.rn = 1
GROUP BY
a.answer_type

Maybe not the most efficient query, but what about:
select userid, max(survey_date) from my_table group by userid
then you can inner join on the same table to get additional data.
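A sketch of how that could look end to end, using Python's built-in sqlite3 module. The names my_table, userid and survey_date come from the query above; the id and q1 columns and the sample rows are hypothetical, since the real schema was not posted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE my_table (
        id INTEGER PRIMARY KEY,
        userid TEXT,
        survey_date TEXT,
        q1 TEXT  -- hypothetical multiple-choice answer column
    );
    INSERT INTO my_table (userid, survey_date, q1) VALUES
        ('u1', '2012-01-01', 'A'),  -- superseded by u1's later response
        ('u1', '2012-02-01', 'B'),
        ('u2', '2012-01-15', 'A'),
        ('u3', '2012-01-20', 'B');
""")

# Find each user's latest survey date, join back to get that response,
# then count how often each choice was given for question 1.
counts = conn.execute("""
    SELECT t.q1, COUNT(*) AS responses
    FROM my_table t
    INNER JOIN (
        SELECT userid, MAX(survey_date) AS survey_date
        FROM my_table
        GROUP BY userid
    ) latest
      ON latest.userid = t.userid
     AND latest.survey_date = t.survey_date
    GROUP BY t.q1
    ORDER BY t.q1
""").fetchall()

print(counts)
```

If a user can submit two surveys with the same survey_date, both would be counted; ordering by the autonumber key instead would avoid that.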
