SQL One-to-one relationship join

I have 2 tables; one is an extension of the other, so it is currently a simple one-to-one relationship (this is likely to become one-to-many in the future). I need to join from one table to the other to pull a value out of a column in the extension.
So table A contains basic details including an id, and table B uses a FK reference to the Id column in table A. I need to pull out column X from table B.
To add complexity, sometimes there won't be a matching entry in table B, but in that case it needs to return null. Also the value of X itself could be null.
I know I can use a left outer join but is there a more efficient way to perform the join?

Left outer join is the way. In order to make it most efficient, make sure you index the FK column in table B. It will be super-fast with the index.
You don't need to index the primary key in table A for this query (and most databases already index primary keys anyway).
The MySQL syntax for creating the index:
CREATE INDEX `fast_lookups` ON `table_b` (`col_name`);
You can name it whatever, I picked "fast_lookups."
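For reference, the lookup itself would look something like this (a sketch with placeholder table and column names; rows with no match in table B simply come back with a NULL for X, as required):
SELECT a.id,
       b.x
FROM   table_a a
       LEFT OUTER JOIN table_b b
              ON b.a_id = a.id   -- b.a_id is the FK column you index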


How to get the differences between two - kind of - duplicated tables

Prologue:
I have two tables in two different databases, one is an updated version of the other. For example, we could imagine that one year ago I duplicated table 1 into the new db (say, as table 2), and from then on I worked on table 2, never updating table 1.
I would like to compare the two tables to get the differences that have grown in this period of time (the tables have preserved the same structure, so the comparison is meaningful).
My way of proceeding was to create a third table, into which I would copy both table 1 and table 2, and then count the number of repetitions of every entry.
In my opinion this, together with a new attribute that specifies for every entry the table it came from, would do the job.
Problem:
Copying the two tables into the third table, I get the (obvious) error that there are duplicate key values in a unique or primary key constraint.
How could I bypass the error, or how could I do the same job better? Any idea is appreciated.
Something like this should do what you want if A and B have the same structure, otherwise just select and rename the columns you want to compare....
SELECT
*
FROM
B
WHERE NOT EXISTS (SELECT *
                  FROM A
                  WHERE A.col = B.col AND ...)  -- correlate on every column you want to compare
If NOT EXISTS doesn't work in your DBMS, you could also use a left outer join that compares the rows' column values and keeps the rows with no match.
SELECT
A.*
FROM
A LEFT OUTER JOIN B
    ON A.col = B.col AND ...
WHERE
B.col IS NULL   -- only the rows of A with no matching row in B
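If you prefer the approach described in the question (gather everything in one place and count repetitions), a sketch of the same idea without a third table and without the duplicate-key problem is to count over a UNION ALL. Here col1 and col2 stand in for the real shared column list, and you would qualify A and B with their database names since the tables live in different databases:
SELECT col1, col2, COUNT(*) AS occurrences
FROM (
      SELECT col1, col2 FROM A
      UNION ALL
      SELECT col1, col2 FROM B
     ) AS both_tables
GROUP BY col1, col2
HAVING COUNT(*) = 1   -- rows counted only once exist in just one of the two tables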

Must a natural join be on a shared primary key?

Suppose I perform A natural join B, where:
A's schema is: (aID), where (aID) is a primary key.
B's schema is: (aID,bID), where (aID, bID) is a composite primary key.
Would performing the natural join work in this case? Or is it necessary for A to have both aID and bID for this to work?
NATURAL JOIN returns rows with one copy each of the common input table column names and one copy each of the column names that are unique to an input table. It returns a table with all such rows that can be made by combining a row from each input table. That is regardless of how many common column names there are, including zero. When there are no common column names, that is a kind of CROSS JOIN aka CARTESIAN PRODUCT. When all the column names are common, that is a kind of INTERSECTION. All this is regardless of PKs, UNIQUE, FKs & other constraints.
NATURAL JOIN is important as a relational algebra operator. In SQL it can be used in a certain style of relational programming that is in a certain sense simpler than usual.
For a true relational result you would SELECT DISTINCT. Also relations have no special NULL value whereas SQL JOINs treat a NULL as not equal to a NULL; so if we treat NULL as just another value relationally then SQL will sometimes not return the true relational result. (When both arguments have a NULL for each of some shared columns and both have the same non-NULL value for each other shared column.)
A "natural" join uses the names of columns to match between tables. It uses any matching names, regardless of key definitions.
Hence,
select . . .
from a natural join b
will use aID, because that is the only column with the same name.
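To make that concrete, here is a sketch using the question's schemas A(aID) and B(aID, bID), in a DBMS that supports NATURAL JOIN (SQL Server, for one, does not):
SELECT aID, bID
FROM   A
       NATURAL JOIN B

-- equivalent explicit join: the match is on the shared column name aID,
-- not on any primary or foreign key declaration
SELECT A.aID, B.bID
FROM   A
       INNER JOIN B ON B.aID = A.aID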
In my opinion, natural join is an abomination. For one thing, it ignores explicitly declared foreign key relationships. These are the "natural join" keys, regardless of their names.
Second, the join keys are not clear in the SELECT statement. This makes debugging the query much more difficult.
Third, I cannot think of another SQL construct where adding a column to or removing a column from a table takes a working query and changes the number of rows in the result set.
Further, I often have common columns on my tables -- CreatedAt, CreatedOn, CreatedBy. Just the existence of these columns precludes using natural joins.

Extending table with another table ... sort of

I have a DB about renting cars.
I created a CarModels table (ModelID as PK).
I want to create a second table with the same primary key as CarModels has.
This table only contains the number of times this Model was searched on my website.
So let's say you visit my website; you can check a list that contains commonly rented cars.
"Most popular Cars" table.
It's not about One-to-One relationship, that's for sure.
Is there any SQL code to connect two Primary keys together?
select m.ModelID, m.Field1, m.Field2,
t.TimesSearched
from CarModels m
left outer join Table2 t on m.ModelID = t.ModelID
But why not simply add the field TimesSearched to table CarModels? Then you don't need another table.
The easiest approach is to just use a new primary key on the new table with a foreign key to the CarModels table, like [CarModelID] INT NOT NULL. You can put an index and a unique constraint on the FK.
If you reeeealy want them to be the same, you can jump through a bunch of hoops that will make your life Hell, like creating the table from the CarModels table, then setting that field as the primary key, then whenever you add a new CarModel you'll have to create a trigger that will SET IDENTITY_INSERT ON so you can add the new one, and remember to SET IDENTITY_INSERT OFF when you're done.
Personally, I'd create a CarsSearched table that holds ThisUser selected ThisCarModel on ThisDate: then you can start doing some fun data analysis like [are some cars more popular in certain zip codes or certain times of year?], or [this company rents three cars every year in March, so I'll send them a coupon in January].
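A minimal sketch of such a log table, assuming the SQL Server flavour already used in this answer (table and column names are illustrative):
CREATE TABLE CarsSearched (
    CarsSearchedID INT IDENTITY(1,1) PRIMARY KEY,
    UserID         INT NOT NULL,                        -- who searched
    ModelID        INT NOT NULL
                       REFERENCES CarModels (ModelID),  -- which model was selected
    SearchedOn     DATETIME NOT NULL DEFAULT GETDATE()  -- when
);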
You are not extending anything (modifying the actual model of the table). You simply need an INNER JOIN of the two tables on their primary keys being equal.
It could be an outer join, as has been suggested, but if it's 1:1 like you said (the second table will have exactly the same keys - I assume all of them), an inner join will be enough, as both tables would share the same set of primary keys.
As a bonus, it will also produce fewer rows, which is a nice reminder if you fail to match all the PKs.
That being said, do you have a strong reason not to keep that number in the same table? You are basically modeling a 1:1 relationship for one extra column (and a small one too, by data type).
You could instead extend the table's model with an additional integer attribute that keeps that number for you.
The latter is preferred for simplicity and lower query times.
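If you go with that simpler option, the extension is just one new column plus an update whenever a model is searched. A minimal sketch (TimesSearched is the column name suggested in the earlier answer; the literal 42 stands in for the searched model's id):
ALTER TABLE CarModels
    ADD TimesSearched INT NOT NULL DEFAULT 0;

-- bump the counter each time a model is searched
UPDATE CarModels
SET    TimesSearched = TimesSearched + 1
WHERE  ModelID = 42;   -- id of the searched model (example value)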

Update multiple tables without knowing the table names (due to a chain of foreign key relationships)

I need to update one field for a few rows in one table (say, Table_A). However, I'm getting an error message saying the update conflicts with a foreign key constraint involving Table_B.
So, I tried to update Table_B as well, turns out Table_B has Foreign Key Constraint with Table_C and Table_D; again, I tried to update Table_C and D, turns out they are conflicting with table_E, F, G, H, I, J, K etc. etc. and on and on.
I was told that such "chain" can go up to 20+ tables.
Additionally, I do not have access to the database schema, thus it is extremely difficult for me to determine which field in which table is the foreign key for the other table.
Currently, all I can do is manually check each table, all the way from A to Z, by running a select * statement against the table shown in the error message. I'm wondering if there is any alternative that lets me update these specific fields across tables A through (whichever is the last table) directly?
I'm using SQL Server 2005.
This will give you the names of the tables and columns in your foreign keys:
SELECT
OBJECT_NAME(fk.[constraint_object_id]) AS [foreign_key_name]
,OBJECT_SCHEMA_NAME(fk.[parent_object_id]) AS [parent_schema_name]
,OBJECT_NAME(fk.[parent_object_id]) AS [parent_table_name]
,pc.[name] AS [parent_column_name]
,OBJECT_SCHEMA_NAME(fk.[referenced_object_id]) AS [referenced_schema_name]
,OBJECT_NAME(fk.[referenced_object_id]) AS [referenced_table_name]
,rc.[name] AS [referenced_column_name]
FROM [sys].[foreign_key_columns] fk
INNER JOIN [sys].[columns] pc ON
pc.[object_id] = fk.[parent_object_id] AND
pc.[column_id] = fk.[parent_column_id]
INNER JOIN [sys].[columns] rc ON
rc.[object_id] = fk.[referenced_object_id] AND
rc.[column_id] = fk.[referenced_column_id]
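For example, to see only the tables that directly reference the Table_A you are trying to update, you could filter the same catalog view (a hypothetical usage, not part of the original answer's query):
SELECT DISTINCT
    OBJECT_NAME(fk.[parent_object_id]) AS [referencing_table]
FROM [sys].[foreign_key_columns] fk
WHERE OBJECT_NAME(fk.[referenced_object_id]) = 'Table_A'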
How to best display and analyze the connection graph is a more subjective matter and will depend on the complexity of your schema.

How do you keep a JOIN table performant?

I'm drawing up plans for a few new features on my site, and one could be "solved" using a join table.
Example schema:
Person table
    PK PersonId
    Name
    Age ...
PersonCheckin table
    PK FK PersonId
    PK FK CheckinId
    Date ...
Checkin table
    PK CheckinId
    CheckedInto ...
A join would be run to get the check-in data for a person (connected through the PersonCheckin table). Since every person could check in an unlimited number of times, the PersonCheckin table could become very large.
I'd imagine this would cause some performance issues. What are typical ways this is handled to keep performance high?
A join is considered the best-performing means of connecting related tables.
But it really depends on the query, because it might not need to be a JOIN -- JOINing can inflate the record set on the parent table's side if there is more than one related child record, which means you may need either GROUP BY or DISTINCT. EXISTS or IN is a better choice in such situations...
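For instance, if the question is just "which people have checked in at least once?", EXISTS keeps the result at one row per person no matter how many check-ins they have. A sketch against the question's schema:
SELECT p.PersonId, p.Name
FROM   Person p
WHERE  EXISTS (SELECT 1
               FROM   PersonCheckin pc
               WHERE  pc.PersonId = p.PersonId)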
Indexes can help on the column(s) used in the JOIN criteria, on both sides of the relationship. In this example both sides are primary keys, which typically have the best index automatically created when the primary key is defined...
If you are going to execute this query very often and you want better performance, you can also create a view in the database that encapsulates the join query.