SQL Server query returns too many records


I have 3 tables in my SQL Server database.
They are linked together as shown in this picture (lines are connected to right rows in picture)
I have a query that should return all the repairs from tblreparations, along with some information about what was repaired. Instead it returns each repair 3 times, once for each laptop assigned to the client (klant in Dutch), even though each row of the repairs table (reparaties in Dutch) contains only one laptopID.
This is the query:
SELECT AankopenReparaties.Id,
AankopenReparaties.KlantenId,
AankopenReparaties.actietype,
AankopenReparaties.voorwerptype,
laptopscomputers.merk,
laptopscomputers.model,
laptopscomputers.info,
AankopenReparaties.info,
AankopenReparaties.Prijs,
AankopenReparaties.lopend
FROM AankopenReparaties, laptopscomputers
WHERE (aankopenreparaties.lopend = 'lopend');
It returns this
and it should be only one row since the reparations table (aankopenreparaties) only contains one row with one laptopID
Does anyone know how to fix this?
Please help because it should be fixed soon (it's an assignment for school)

Your query returns too many records because it produces the Cartesian product of both tables. You need to tell the server how the two tables are related to each other.
SELECT AankopenReparaties.Id,
AankopenReparaties.KlantenId,
AankopenReparaties.actietype,
AankopenReparaties.voorwerptype,
laptopscomputers.merk,
laptopscomputers.model,
laptopscomputers.info,
AankopenReparaties.info,
AankopenReparaties.Prijs,
AankopenReparaties.lopend
FROM AankopenReparaties
INNER JOIN laptopscomputers
ON AankopenReparaties.LaptopID = laptopscomputers.ID -- specify relationship
WHERE aankopenreparaties.lopend = 'lopend'
To learn more about joins, see the links below:
Visual Representation of SQL Joins
INNER JOIN ON vs WHERE clause
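To see the difference in row counts, here is a minimal sketch using Python's built-in sqlite3 module and an in-memory database. The sample rows and the reduced column set are assumptions made for illustration; only the table and column names come from the question.

```python
import sqlite3

# Hypothetical miniature versions of the two tables (column subset only).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE laptopscomputers (ID INTEGER PRIMARY KEY, merk TEXT, model TEXT);
    CREATE TABLE AankopenReparaties (Id INTEGER PRIMARY KEY, LaptopID INTEGER, lopend TEXT);
    INSERT INTO laptopscomputers VALUES (1, 'Dell', 'XPS'), (2, 'HP', 'Envy'), (3, 'Asus', 'Zenbook');
    INSERT INTO AankopenReparaties VALUES (10, 2, 'lopend');
""")

# Comma-separated FROM with no join condition: the repair pairs with every laptop.
cross_rows = conn.execute("""
    SELECT r.Id, l.merk
    FROM AankopenReparaties r, laptopscomputers l
    WHERE r.lopend = 'lopend'
""").fetchall()

# Explicit join condition: the repair pairs only with its own laptop.
joined_rows = conn.execute("""
    SELECT r.Id, l.merk
    FROM AankopenReparaties r
    INNER JOIN laptopscomputers l ON r.LaptopID = l.ID
    WHERE r.lopend = 'lopend'
""").fetchall()

print(len(cross_rows))  # 3
print(joined_rows)      # [(10, 'HP')]
```

With one repair and three laptops, the comma-style FROM yields three rows, while the INNER JOIN yields the single expected row.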

Related

How to check if table relationship is one to many in SSMS

I'm writing some SQL code based on my tables but don't want to miss any edge cases. I'm wondering how do you check if there's a one to many relationship between two tables in SSMS
SELECT *
FROM Houses
JOIN Addresses
on Houses.Id = Addresses.HouseId
Unfortunately the data when queried doesn't give me any insight.
What I tried to do:
Checked table dependencies but that didn't give me any insight. It shows addresses are dependencies but no relationship details.
May I ask, is it possible to determine if one to one via SSMS?

You can run a query to determine how many addresses each house has. In this case, we'll assume that Id is the primary key of Houses and that the values of HouseId are valid ids:
SELECT num_addresses, COUNT(*)
FROM (SELECT h.Id, COUNT(a.HouseId) as num_addresses
FROM Houses h LEFT JOIN
Addresses a
ON h.Id = a.HouseId
GROUP BY h.Id
) ha
GROUP BY num_addresses;
Then interpret the results:
If the only row returned has num_addresses of 1, then you have a 1-1 relationship.
If two rows are returned, with values of 0 and 1, then you have a 1/0-1 relationship.
If multiple rows are returned and the minimum value is 1, then you have a 1-n.
If multiple rows are returned and the minimum value is 0, then you have a 0-n.
You could extend this for more general relationships, but this answers the question you asked here.
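As a rough illustration of how that reads in practice, here is a sketch that runs the same query through Python's sqlite3 module against made-up data (three houses with two, one, and zero addresses). The sample rows and the num_houses alias are assumptions, not part of the original answer.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Houses (Id INTEGER PRIMARY KEY);
    CREATE TABLE Addresses (Id INTEGER PRIMARY KEY, HouseId INTEGER);
    INSERT INTO Houses VALUES (1), (2), (3);
    -- House 1 has two addresses, house 2 has one, house 3 has none.
    INSERT INTO Addresses VALUES (1, 1), (2, 1), (3, 2);
""")

# Count addresses per house, then count how many houses share each count.
distribution = conn.execute("""
    SELECT num_addresses, COUNT(*) AS num_houses
    FROM (SELECT h.Id, COUNT(a.HouseId) AS num_addresses
          FROM Houses h
          LEFT JOIN Addresses a ON h.Id = a.HouseId
          GROUP BY h.Id) ha
    GROUP BY num_addresses
    ORDER BY num_addresses
""").fetchall()

print(distribution)  # [(0, 1), (1, 1), (2, 1)]
```

Multiple rows come back and the minimum is 0, so by the rules above this sample data is a 0-n relationship.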

Relationships between tables can be viewed in a database diagram.
Here is a document on how to create a database diagram in SSMS:
How to create database diagram SSMS

Gordon Linoff's response is perfect for the helicopter view.
To see all records side by side, simply select all columns from both tables with a left join in each direction:
SELECT Houses.*, Addresses.*
FROM Houses
LEFT JOIN Addresses ON Addresses.HouseId = Houses.Id
UNION ALL
SELECT Houses.*, Addresses.*
FROM Addresses
LEFT JOIN Houses ON Houses.Id = Addresses.HouseId
Microsoft has implemented a more compact way to do the same, but I have not explored it, so I cannot comment:
https://learn.microsoft.com/en-us/sql/odbc/reference/develop-app/outer-joins?view=sql-server-ver15
You will then see Houses.Id and Addresses.HouseId side by side. An empty value in either one means that the relevant table has no record for that Id or HouseId.
You can then cherry-pick those edge-case records as you see fit.
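If only the unmatched records are of interest, one possible sketch is a LEFT JOIN filtered on NULL in each direction. The example below uses Python's sqlite3 module with invented rows, including an address whose HouseId (99) points at no house; that data is an assumption for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Houses (Id INTEGER PRIMARY KEY);
    CREATE TABLE Addresses (Id INTEGER PRIMARY KEY, HouseId INTEGER);
    INSERT INTO Houses VALUES (1), (2);
    INSERT INTO Addresses VALUES (10, 1), (11, 99);  -- 99 points at no house
""")

# Houses that have no address at all.
houses_without_address = conn.execute("""
    SELECT h.Id
    FROM Houses h
    LEFT JOIN Addresses a ON a.HouseId = h.Id
    WHERE a.Id IS NULL
""").fetchall()

# Addresses whose HouseId matches no house.
addresses_without_house = conn.execute("""
    SELECT a.Id
    FROM Addresses a
    LEFT JOIN Houses h ON h.Id = a.HouseId
    WHERE h.Id IS NULL
""").fetchall()

print(houses_without_address)   # [(2,)]
print(addresses_without_house)  # [(11,)]
```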

SQL Query across two tables only show most recently updated result per tag address

I have two tables: violator_state and violator_tags
violator_state:
m_state_id
is_violating
m_translatedid
m_tag
m_violator_tag
This table holds the "tags" and has a fixed row count (10 in this case). Its purpose is to list each tag present, connect the full tag address (m_violator_tag) with its shorthand name (m_tag), and state whether the tag is in "violation". I need to use this table as a reference because of the link between m_violator_tag and m_tag.
violator_tags
m_violator_id
m_eval_time_from
m_eval_time_to
m_tag
m_tag_peers
m_tag_position
This table constantly has new rows added to it, holding the information about which tags are in violation with a specific tag. So it would show T6 in violation with T1, T2, T9, etc.
I am looking to create a query which joins the two tables to show only the most recently updated (largest m_eval_time_from) for each tag.
I am using the following query to join the two tables but I expect m_translatedid and m_tag to match but they do not. Unsure why.
SELECT violator_state.m_violator_tag, violator_state.is_violating, violator_state.m_translatedid, violator_tags.m_tag, violator_tags.m_eval_time_to, violator_tags.m_tag_peers,
violator_tags.m_tag_position, violator_tags.m_eval_time_from
FROM violator_tags CROSS JOIN
violator_state
Violation_state table
violation_tags table
results of my (incorrect) query
Any suggestions on what I should try?

Your CROSS JOIN will give you a cartesian product where EVERY row in the first table is paired with ALL the rows in the second table e.g. if you have 10 rows in each, you will get 10 x 10 = 100 rows in the result! I believe you need to join the tables on the m_tag column and select the violator_tags row with the latest date. The query below should do this for you (though you haven't provided your question in a manner that makes it easy for me to double-check my code - see the link provided by a_horse_with_no_name for more on this or use a website like db-fiddle to set up your example).
SELECT vs.m_violator_tag,
vs.is_violating,
vs.m_translatedid,
vt.m_tag,
vt.m_eval_time_to,
vt.m_tag_peers,
vt.m_tag_position,
vt.m_eval_time_from
FROM violator_tags vt
JOIN violator_state vs
ON vt.m_tag = vs.m_tag
AND vt.m_eval_time_from = (SELECT MAX(vt2.m_eval_time_from)
FROM violator_tags vt2
WHERE vt2.m_tag = vt.m_tag)
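Here is a small sketch of the "latest row per tag" pattern, run through Python's sqlite3 module. The sample rows, the reduced column set, the ISO-formatted text timestamps, and the inner alias vt2 are all assumptions made for illustration. The inner copy of violator_tags gets its own alias so that MAX is computed over the inner rows for the current tag.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE violator_state (m_tag TEXT, m_violator_tag TEXT, is_violating INTEGER);
    CREATE TABLE violator_tags (m_tag TEXT, m_eval_time_from TEXT, m_tag_peers TEXT);
    INSERT INTO violator_state VALUES ('T1', 'site/area/T1', 1), ('T6', 'site/area/T6', 1);
    INSERT INTO violator_tags VALUES
        ('T1', '2020-01-01 10:00', 'T2'),
        ('T1', '2020-01-01 11:00', 'T2,T9'),
        ('T6', '2020-01-01 09:00', 'T1');
""")

# Keep only the violator_tags row with the largest m_eval_time_from per tag.
latest = conn.execute("""
    SELECT vs.m_violator_tag, vt.m_tag, vt.m_eval_time_from, vt.m_tag_peers
    FROM violator_tags vt
    JOIN violator_state vs ON vt.m_tag = vs.m_tag
    WHERE vt.m_eval_time_from = (SELECT MAX(vt2.m_eval_time_from)
                                 FROM violator_tags vt2
                                 WHERE vt2.m_tag = vt.m_tag)
    ORDER BY vt.m_tag
""").fetchall()

for row in latest:
    print(row)
```

Each tag appears once, with the 11:00 row winning for T1 in this sample.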

Return query results where two fields are different (Access 2010)

I'm working in a large access database (Access 2010) and am trying to return records where two locations are different.
In my case, I have a large number of birds that have been observed on multiple dates and potentially on different islands. Each bird has a unique BirdID and also an actual physical identifier (unfortunately that may have changed over time). [I'm going to try addressing the changing physical identifier issue later]. I currently want to query individual birds where one or more of their observations is different than the "IslandAlpha" (the first island where they were observed). Something along the lines of a criteria for BirdID: WHERE IslandID [doesn't equal] IslandAlpha.
I then need a separate query to find where all observations DO equal where they were first observed. So where IslandID = IslandAlpha
I'm new to Access, so let me know if you need more information on how my tables/relationships are set up! Thanks in advance.

Assuming the following tables:
Birds table in which all individual birds have records with a unique BirdID and IslandAlpha.
Sightings table in which individual sightings are recorded, including IslandID.
Your first query would look something like this:
SELECT *
FROM Birds
INNER JOIN Sightings ON Birds.BirdID=Sightings.BirdID
WHERE Sightings.IslandID <> Birds.IslandAlpha
Your second query would be the same but with = instead of <> in the WHERE clause.
Please provide us information about the tables and columns you are using.
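A quick sketch of both questions, using Python's sqlite3 module and the assumed Birds and Sightings tables from above with invented rows. Note that flipping <> to = returns matching sightings, not birds whose sightings all match; if the goal is "birds that never left their first island", a NOT EXISTS check is one possible reading, shown here as an assumption about the intent.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Birds (BirdID INTEGER PRIMARY KEY, IslandAlpha TEXT);
    CREATE TABLE Sightings (SightingID INTEGER PRIMARY KEY, BirdID INTEGER, IslandID TEXT);
    INSERT INTO Birds VALUES (1, 'A'), (2, 'B');
    -- Bird 1 was seen on islands A and C, bird 2 only on island B.
    INSERT INTO Sightings VALUES (1, 1, 'A'), (2, 1, 'C'), (3, 2, 'B'), (4, 2, 'B');
""")

# Birds with at least one sighting away from their first island.
moved = conn.execute("""
    SELECT DISTINCT b.BirdID
    FROM Birds b
    INNER JOIN Sightings s ON b.BirdID = s.BirdID
    WHERE s.IslandID <> b.IslandAlpha
    ORDER BY b.BirdID
""").fetchall()

# Birds for which no sighting differs from their first island.
stayed = conn.execute("""
    SELECT b.BirdID
    FROM Birds b
    WHERE NOT EXISTS (SELECT 1
                      FROM Sightings s
                      WHERE s.BirdID = b.BirdID
                        AND s.IslandID <> b.IslandAlpha)
    ORDER BY b.BirdID
""").fetchall()

print(moved)   # [(1,)]
print(stayed)  # [(2,)]
```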

I presume you are asking because a simple join filtered on IslandAlpha <> ObsLoc is not possible: IslandAlpha is derived from the first observation record for each bird. Pulling the first observation record for each bird requires a nested query, which in turn needs a unique record identifier in Observations (an autonumber field should serve). Assuming there is an observation date/time field, consider:
SELECT * FROM Observations WHERE ObsID IN
(SELECT TOP 1 ObsID FROM Observations AS Dupe
WHERE Dupe.ObsBirdID = Observations.ObsBirdID ORDER BY Dupe.ObsDateTime);
Now save that query (as Query1 here) and use it in subsequent queries.
SELECT * FROM Observations
INNER JOIN Query1 ON Observations.ObsBirdID = Query1.ObsBirdID
WHERE Observations.ObsLocID <> Query1.ObsLocID;
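The same two-step idea can be sketched outside Access. The example below uses Python's sqlite3 module, so it swaps Access's TOP 1 for SQLite's LIMIT 1 and replaces the saved Query1 with a common table expression named FirstObs; those substitutions and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Observations (ObsID INTEGER PRIMARY KEY, ObsBirdID INTEGER,
                               ObsLocID TEXT, ObsDateTime TEXT);
    INSERT INTO Observations VALUES
        (1, 1, 'A', '2014-05-01'), (2, 1, 'C', '2014-06-01'),
        (3, 2, 'B', '2014-05-02'), (4, 2, 'B', '2014-07-01');
""")

# Step 1: the earliest observation per bird (ObsID breaks ties on equal dates).
first_obs_sql = """
    SELECT o.ObsID, o.ObsBirdID, o.ObsLocID
    FROM Observations o
    WHERE o.ObsID = (SELECT d.ObsID
                     FROM Observations d
                     WHERE d.ObsBirdID = o.ObsBirdID
                     ORDER BY d.ObsDateTime, d.ObsID
                     LIMIT 1)
"""
first_obs = conn.execute(first_obs_sql + " ORDER BY o.ObsID").fetchall()

# Step 2: observations made somewhere other than the bird's first location.
moved = conn.execute(f"""
    WITH FirstObs AS ({first_obs_sql})
    SELECT o.ObsID, o.ObsBirdID, o.ObsLocID
    FROM Observations o
    JOIN FirstObs f ON o.ObsBirdID = f.ObsBirdID
    WHERE o.ObsLocID <> f.ObsLocID
    ORDER BY o.ObsID
""").fetchall()

print(first_obs)  # [(1, 1, 'A'), (3, 2, 'B')]
print(moved)      # [(2, 1, 'C')]
```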

Difference in where clause and join

I'm very new to SQL and am working with joins for the first time in my life. Right now I am trying to understand the difference between two queries.
Query 1:
SELECT name
FROM actor
JOIN casting ON id = actorid
where (SELECT COUNT(ord) FROM casting join actor on actorid = actor.id AND ord=1) >= 30
GROUP BY name
Query 2:
SELECT name
FROM actor
JOIN casting ON id = actorid
AND (SELECT COUNT(ord) FROM casting WHERE actorid = actor.id AND ord=1)>=30
GROUP BY name
So I would think that doing
FROM casting join actor on actorid = actor.id
in the subquery is the same as
FROM casting WHERE actorid = actor.id.
But apparently it is not. Could anyone help me out and explain why?
Edit: If anyone is wondering: The queries are based on question 13 from http://sqlzoo.net/wiki/More_JOIN_operations

Actually, the part that really looks like a "where" statement is only what comes after the keyword ON. We sometimes come across queries that filter data directly at this stage, but its actual purpose is to specify the criteria used to match rows between the two tables.
A "join" is a very common operation that consists of associating the rows of two distinct tables according to a common criterion. For example, if you have, on one side, a table containing a client list in which each client has a unique client number, and on the other side an order list table in which each order contains the client's number, then you may want to "resolve" the number in the latter table into the client's name, address, and so on.
Before SQL92 (26 years ago), the only way to achieve this was to write something like this :
SELECT client.name,
client.adress,
orders.product,
orders.totalprice
FROM client,orders
WHERE orders.clientNumber = client.clientNumber
AND orders.totalprice > 100.00
Selecting from two (or more) tables produces a "cartesian product", which associates every row from the first set with every row of the second one. This means that if your first table contains 3 rows and the second one 8 rows, the resulting set has 24 rows. Out of these, you use the WHERE clause to exclude nearly everything and retain only the rows in which the client number is the same on both sides.
The size of the resulting set before filtering is the product of the table sizes, so it grows very quickly once the tables exceed a few rows (which is almost always the case), and it gets even worse if you involve more than two tables. On the programmer's side, the query also quickly becomes hard to read.
Therefore, if a join is what you actually want, you can now tell the server so explicitly and specify the join criteria up front, which avoids needlessly large temporary subsets, while still letting you filter the results with WHERE if needed.
SELECT client.name,
client.adress,
orders.product,
orders.totalprice
FROM client
JOIN orders
ON orders.clientNumber = client.clientNumber
WHERE orders.totalprice > 100.00
This becomes critical when performing multiple JOINs in a single query, especially when mixing INNER and OUTER joins.
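To illustrate why the placement of a condition matters once OUTER joins are involved, here is a small sketch using Python's sqlite3 module. The client and orders tables follow the example above, but the sample rows and the orderId column are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE client (clientNumber INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (orderId INTEGER PRIMARY KEY, clientNumber INTEGER, totalprice REAL);
    INSERT INTO client VALUES (1, 'Ann'), (2, 'Bob');
    INSERT INTO orders VALUES (1, 1, 150.0), (2, 2, 50.0);
""")

# Price condition inside ON: it only limits which orders match,
# so every client is still returned (with NULL when nothing matches).
filter_in_on = conn.execute("""
    SELECT c.name, o.totalprice
    FROM client c
    LEFT JOIN orders o
           ON o.clientNumber = c.clientNumber AND o.totalprice > 100.00
    ORDER BY c.name
""").fetchall()

# Price condition inside WHERE: it filters the joined result,
# so clients without a qualifying order disappear.
filter_in_where = conn.execute("""
    SELECT c.name, o.totalprice
    FROM client c
    LEFT JOIN orders o ON o.clientNumber = c.clientNumber
    WHERE o.totalprice > 100.00
    ORDER BY c.name
""").fetchall()

print(filter_in_on)     # [('Ann', 150.0), ('Bob', None)]
print(filter_in_where)  # [('Ann', 150.0)]
```

For an INNER JOIN the two placements return the same rows; the difference only shows up with OUTER joins.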

In the 2nd query your nested query takes the actor.id from its root query and only counts the results from that. In the 1st query your nested query counts results from all actors instead of only the specified one.
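A small sketch of that difference, using Python's sqlite3 module with made-up data and a threshold of 2 instead of 30 so the effect shows with a handful of rows. The aliases (a, c, a2, c2), the movieid column, and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE actor (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE casting (movieid INTEGER, actorid INTEGER, ord INTEGER);
    INSERT INTO actor VALUES (1, 'Star'), (2, 'Extra');
    -- 'Star' has two leading roles (ord = 1), 'Extra' has none.
    INSERT INTO casting VALUES (100, 1, 1), (101, 1, 1), (100, 2, 5);
""")

# Query 1 style: the subquery joins its own copy of actor, so it never looks
# at the outer row and counts the leading roles of all actors together.
uncorrelated = conn.execute("""
    SELECT a.name
    FROM actor a
    JOIN casting c ON a.id = c.actorid
    WHERE (SELECT COUNT(c2.ord)
           FROM casting c2
           JOIN actor a2 ON c2.actorid = a2.id AND c2.ord = 1) >= 2
    GROUP BY a.name
    ORDER BY a.name
""").fetchall()

# Query 2 style: the subquery refers to the outer actor row (a.id),
# so the count is computed separately for each actor.
correlated = conn.execute("""
    SELECT a.name
    FROM actor a
    JOIN casting c ON a.id = c.actorid
    WHERE (SELECT COUNT(c2.ord)
           FROM casting c2
           WHERE c2.actorid = a.id AND c2.ord = 1) >= 2
    GROUP BY a.name
    ORDER BY a.name
""").fetchall()

print(uncorrelated)  # [('Extra',), ('Star',)]
print(correlated)    # [('Star',)]
```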

How do you JOIN tables to a view using a Vertica DB?

Good morning/afternoon! I was hoping someone could help me out with something that probably should be very simple.
Admittedly, I’m not the strongest SQL query designer. That said, I’ve spent a couple hours beating my head against my keyboard trying to get a seemingly simple three way join working.
NOTE: I'm querying a Vertica DB.
Here is my query:
SELECT A.CaseOriginalProductNumber, A.CaseCreatedDate, A.CaseNumber, B.BU2_Key as BusinessUnit, C.product_number_desc as ModelNumber
FROM pps_sfdc.v_Case A
INNER JOIN reference_data.DIM_PRODUCT_LINE_HIERARCHY B
ON B.PL_Key = A.CaseOriginalProductLine
INNER JOIN reference_data.DIM_PRODUCT C
ON C.product_line_code = A.CaseOriginalProductLine
WHERE B.BU2_Key = 'XWT'
LIMIT 20
I have a view (v_Case) that I’m trying to join to two other tables so I can lookup a value from each of them. The above query returns identical data on everything EXCEPT the last column (see below). It's like it's iterating through the last column to pull out the unique entries, sort of like a "GROUP BY" clause. What SHOULD be happening is that I get unique rows with specific "BusinessUnit" and "ModelNumber" for that record.
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 1
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 2
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 3
DUMEPRINT 5/2/2014 8:56:27 AM 3002845327 JJT Product 4
I modeled my solution after this post:
How to deal with multiple lookup tables for beginners of SQL?
What am I doing wrong?
Thank you for any help you can provide.

Data issue. The general rule in troubleshooting these is that the column that differs between the duplicated records (in this case C.product_number_desc as ModelNumber) is generally where the issue lies, which is why I pointed you towards dim_product.
If you receive duplicates, the query below will help identify whether this table is causing them. Remember that key in this statement can be multiple fields: whatever you are joining the table on.
Select key,count(1) from table group by key having count(1)>1
Another note for the future: don't assume it's your code. Duplicates like this almost always point towards dirty data (the other possibility is that you are causing cross joins because the keys are not correct). If you comment out the 'c' table and the column that refers to it in the select clause, you will get one row, which shows that your dupes were coming from the 'c' table.
Good luck with it
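As a rough illustration of that check, here is a sketch using Python's sqlite3 module. The cases and dim_product tables are simplified, hypothetical stand-ins for the view and dimension table in the question, with invented rows in which one product line code maps to two products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cases (CaseNumber INTEGER, ProductLine TEXT);
    CREATE TABLE dim_product (product_line_code TEXT, product_number_desc TEXT);
    INSERT INTO cases VALUES (3002845327, 'PL1');
    INSERT INTO dim_product VALUES
        ('PL1', 'Product 1'), ('PL1', 'Product 2'), ('PL2', 'Product 9');
""")

# Join keys that occur more than once in the lookup table.
dupes = conn.execute("""
    SELECT product_line_code, COUNT(1)
    FROM dim_product
    GROUP BY product_line_code
    HAVING COUNT(1) > 1
""").fetchall()

# Joining on such a key repeats the case once per matching product.
fanned_out = conn.execute("""
    SELECT c.CaseNumber, p.product_number_desc
    FROM cases c
    JOIN dim_product p ON p.product_line_code = c.ProductLine
    ORDER BY p.product_number_desc
""").fetchall()

print(dupes)       # [('PL1', 2)]
print(fanned_out)  # [(3002845327, 'Product 1'), (3002845327, 'Product 2')]
```

A key that shows up in the first result is not unique in the lookup table, so joining on it alone will multiply rows.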
