Database design: table with NULL keys - sql

I'm designing a table that will hold numeric values for 2-3 situations of data:
Situation 1: has Age and Sex, along with the numeric value
Situation 2: has only Age, along with the numeric value
Situation 3: has only Sex, along with the numeric value
I don't want to create 3 different tables. Instead, only one table, with the following columns:
AgeID (references a table that contains information about the Age)
SexID (references a table that contains information about the Sex)
Value (the numeric value itself)
AgeID and SexID as Foreign Keys and linked to the appropriate tables.
The problem is: my query always does an INNER JOIN with the Age and Sex tables. For Situation 1 it works well because both values are present. For Situations 2 and 3 I don't get any data, because either AgeID or SexID is null.
What solution is the correct one?
Change something in the table design?
Use Entity-Attribute-Value table to be more generic?
Use LEFT JOIN instead of INNER JOIN for all queries involving the nullable columns?
Any other idea?
Could someone clarify?
Thanks!

Yes, use an outer join (LEFT or RIGHT). An INNER JOIN is meant to filter out every row that doesn't have a match in both tables.
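A minimal sketch of the difference, run through Python's built-in sqlite3 module on an in-memory database. The table names Age, Sex and Fact, the Label columns and the sample rows are assumptions made for illustration; only AgeID, SexID and Value come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Age (AgeID INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE Sex (SexID INTEGER PRIMARY KEY, Label TEXT);
    CREATE TABLE Fact (
        AgeID INTEGER REFERENCES Age(AgeID),
        SexID INTEGER REFERENCES Sex(SexID),
        Value REAL
    );
    INSERT INTO Age VALUES (1, '20-29');
    INSERT INTO Sex VALUES (1, 'F');
    INSERT INTO Fact VALUES (1, 1, 10.0);      -- situation 1: Age and Sex
    INSERT INTO Fact VALUES (1, NULL, 20.0);   -- situation 2: only Age
    INSERT INTO Fact VALUES (NULL, 1, 30.0);   -- situation 3: only Sex
""")

# INNER JOIN drops every Fact row whose AgeID or SexID is NULL.
inner_rows = conn.execute("""
    SELECT a.Label, s.Label, f.Value
    FROM Fact f
    INNER JOIN Age a ON a.AgeID = f.AgeID
    INNER JOIN Sex s ON s.SexID = f.SexID
    ORDER BY f.Value
""").fetchall()

# LEFT JOIN keeps every Fact row and fills the missing side with NULL.
left_rows = conn.execute("""
    SELECT a.Label, s.Label, f.Value
    FROM Fact f
    LEFT JOIN Age a ON a.AgeID = f.AgeID
    LEFT JOIN Sex s ON s.SexID = f.SexID
    ORDER BY f.Value
""").fetchall()

print(inner_rows)
print(left_rows)
```

With this sample data the INNER JOIN version returns only the situation 1 row, while the LEFT JOIN version returns all three.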

Use a conditional INNER JOIN, like
INNER JOIN Table x ON
(AgeID IS NULL OR AgeID = x.AgeID)
AND (SexID IS NULL OR SexID = x.SexID)

Related

Can we join two parts of two composite primary keys together?

I have two tables, both of which have a composite primary key:
OrderNr + CustNr
OrderNr + ItemNr
Can I join both tables on OrderNr, which in each table is only one part of a composite primary key?
Yes, but you may find that rows from each table repeat, because each row is combined with every matching row from the other table. Within a single OrderNr this is a Cartesian product.
Table A
OrderNr, CustNr
1,C1
1,C2
2,C1
2,C2
Table B
OrderNr,ItemNr
1,i1
1,i2
SELECT * FROM a JOIN b ON a.OrderNr = b.OrderNr
1,C1,1,i1
1,C1,1,i2
1,C2,1,i1
1,C2,1,i2
This happens because composite primary keys can contain repeated elements so long as the combination of elements is unique. Joining on only one part of the PK, when that part is an element that repeats (OrderNr 1 appears twice in each table, even though CustNr and ItemNr keep the rows unique), results in a multiplied result set: 2 rows from A with OrderNr 1, multiplied by 2 rows from B with OrderNr 1, gives 4 rows in total.
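The multiplication can be reproduced with the sample rows above. This is a small sketch using Python's sqlite3 module as a stand-in database; the column types are assumptions, the table and column names follow the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (OrderNr INTEGER, CustNr TEXT, PRIMARY KEY (OrderNr, CustNr));
    CREATE TABLE b (OrderNr INTEGER, ItemNr TEXT, PRIMARY KEY (OrderNr, ItemNr));
    INSERT INTO a VALUES (1, 'C1'), (1, 'C2'), (2, 'C1'), (2, 'C2');
    INSERT INTO b VALUES (1, 'i1'), (1, 'i2');
""")

# Join on OrderNr only, i.e. on one part of each composite key.
rows = conn.execute("""
    SELECT a.OrderNr, a.CustNr, b.OrderNr, b.ItemNr
    FROM a JOIN b ON a.OrderNr = b.OrderNr
    ORDER BY a.CustNr, b.ItemNr
""").fetchall()

for row in rows:
    print(row)
```

Two rows of a with OrderNr 1 meet two rows of b with OrderNr 1, so four rows come back; the rows of a with OrderNr 2 have no partner in b and are dropped by the inner join.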
Does it work with the normal/natural join too?
Normal joins (INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER) join the rows from two tables or subqueries when the ON condition is true. The clause in the ON is like a WHERE clause, yes, in that it is a statement that is true or false (a predicate). If the statement is true, the rows are joined. It doesn't even have to be about data from the tables: you can say a JOIN b ON 1=1 and every row from A will be joined to every row from B. As commented, primary keys aren't involved in joins at all; primary keys are often backed by indexes, and those indexes may be used to speed up a join, but they aren't vital to it.
Other joins (CROSS, NATURAL..) exist; a CROSS join is like the 1=1 example above: you don't specify an ON, and every row from A is joined to every row from B, by design. NATURAL JOIN is one to avoid using, IMHO: the database looks for column names that are the same in both tables and joins on them. The problem is that things can stop working in future if someone adds a column with the same name but different content/meaning to the two tables. No serious production system I've ever come across has used NATURAL join. You can save some typing if the columns to join on are named the same, with USING: SELECT * FROM a JOIN b USING (col), where both A and B have a column called col. USING has some advantages, especially over NATURAL join, in that it doesn't fall apart if another column with the same name as an existing one is added, but it has some drawbacks too: you can't say USING(col) AND .... Most people just stick to writing ON, and forget USING.
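The fragility of NATURAL JOIN described above can be reproduced in a few lines. The sketch below uses Python's sqlite3 module; the tables a and b, their columns and the sample values are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (col INTEGER, name TEXT);
    CREATE TABLE b (col INTEGER, qty INTEGER);
    INSERT INTO a VALUES (1, 'alpha'), (2, 'beta');
    INSERT INTO b VALUES (1, 10), (2, 20);
""")

JOINS = {
    "on": "JOIN b ON a.col = b.col",
    "using": "JOIN b USING (col)",
    "natural": "NATURAL JOIN b",
}

def count_rows():
    # Number of joined rows produced by each join style.
    return {
        style: conn.execute("SELECT COUNT(*) FROM a " + clause).fetchone()[0]
        for style, clause in JOINS.items()
    }

before = count_rows()

# Someone later adds a column to b that shares a name with a column in a,
# but means something different.
conn.executescript("""
    ALTER TABLE b ADD COLUMN name TEXT;
    UPDATE b SET name = 'supplier';
""")
after = count_rows()

print(before)
print(after)
```

Before the schema change all three styles return two rows. Afterwards ON and USING still do, while NATURAL JOIN silently starts matching on name as well and returns nothing.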
NATURAL join also does NOT use primary keys. There is no join style (that I know of) that will look at a foreign key relationship between two tables and use that as the join condition
And then is it true that if I try to join Primary key and foreign key of two tables, that it works like a "where" command?
It's hard to understand what you mean by this, but if you mean A JOIN B ON A.primarykey = B.primary_key_in_a then it'll work out, sure. If you mean A CROSS JOIN B WHERE A.primarykey = B.primary_key_in_a then that will also work, but it's something I'd definitely avoid. Hardly anyone writes SQL this way, and the general preference is to stop using WHERE for join conditions (you do still see people writing the old-school FROM a,b WHERE a.col=b.col, but it's heavily discouraged) and to put them in the ON instead.
So, in summary:
SELECT * FROM a JOIN b ON a.col1 = b.col2
Joins all rows from a with all rows from b where the value in col1 equals the value in col2. No primary keys are needed for any of this to work.
You can join any tables if there is a logical relationship between them:
select *
from t1
JOIN t2
on t1.OrderNr = t2.OrderNr
However, if OrderNr cannot guarantee uniqueness between the tables by itself, your data will be multiplied.
Let's say that you have 2 rows with OrderNr 1 in t1 and 5 rows with OrderNr 1 in t2; when you join them, you will get 2 x 5 = 10 records.
Your data model is similar to a problem commonly referred to as a "fan trap". (If you had an "order" table keyed solely by OrderNr, it would be exactly a fan trap.)
Either way, it's the same problem -- the relationship between Order/Customers and Order/Items is ambiguous. You cannot tell which customers ordered which items.
It is technically possible to join these tables -- you can join on any columns regardless of whether they are key columns or not. The problem is that your results will probably not make sense, unless you have more conditions and other tables that you are not telling us about.
For example, a simple join just on t1.OrderNr = t2.OrderNr will return rows indicating every customer related to the order has ordered every item related to the order. If that is what you want, you have no problem here.

PostgreSQL - copy column from related table

So I have three tables: companies, addresses and company_address.
For optimization reasons I need to copy city column from addresses table to companies table. Relation between companies and addresses is many to one (as many companies can occupy same address). They are connected through company_address table, consisting of address_id and company_id columns.
I found this solution for case without intermediate table: How to copy one column of a table into another table's column in PostgreSQL comparing same ID
Trying to modify query I came up with:
UPDATE company SET company.city=foo.city
FROM (
SELECT company_address.company_id, company_address.address_id, address.city
FROM address LEFT JOIN company_address
ON address.id=company_address.address_id
) foo
WHERE company.id=foo.company_id;
but it gives error:
ERROR: column "company" of relation "company" does not exist
I can't figure out what is going on. I'll be grateful for any ideas.
You don't need a subquery for that. Also, refer in the SET clause to your table columns without preceding with table name.
I believe that since your WHERE condition includes joined table, it should be INNER JOIN instead of a LEFT JOIN.
UPDATE company c
SET city = a.city
FROM address a
INNER JOIN company_address ca ON a.id = ca.address_id
WHERE c.id = ca.company_id
Note how using aliases for table names shortens the code and makes it readable at the very first glance.
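The copy can also be tried without a PostgreSQL server. The sketch below is an assumption-laden stand-in: it uses Python's sqlite3 module and swaps the UPDATE ... FROM form for a correlated subquery, so it does not depend on UPDATE ... FROM support. The name column and the sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, city TEXT);
    CREATE TABLE address (id INTEGER PRIMARY KEY, city TEXT);
    CREATE TABLE company_address (company_id INTEGER, address_id INTEGER);
    INSERT INTO company (id, name) VALUES (1, 'Acme'), (2, 'Globex'), (3, 'Initech');
    INSERT INTO address VALUES (10, 'Berlin'), (11, 'Paris');
    INSERT INTO company_address VALUES (1, 10), (2, 10);
""")

# Copy address.city into company.city through the link table.
# Note that the SET target is plain "city", not "company.city".
conn.execute("""
    UPDATE company
    SET city = (
        SELECT a.city
        FROM address a
        INNER JOIN company_address ca ON a.id = ca.address_id
        WHERE ca.company_id = company.id
    )
    WHERE id IN (SELECT company_id FROM company_address)
""")

cities = conn.execute("SELECT id, city FROM company ORDER BY id").fetchall()
print(cities)
```

Companies 1 and 2 share address 10 and both receive its city; company 3 has no linked address and keeps a NULL city.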
Your query is almost right; you just must not prefix the column in the SET clause with the table name:
UPDATE company SET city=foo.city
FROM (
SELECT company_address.company_id, company_address.address_id, address.city
FROM address LEFT JOIN company_address
ON address.id=company_address.address_id
) foo
WHERE company.id=foo.company_id;

Not allowing multiple Null values in an Access 2010 multi-column Index

I'm trying to create a table in Access 2010 which will not allow duplicates in two fields, but will allow nulls in one of those fields providing there is only a single null value (so no duplication of value/null).
My table fields are as below with the ID field set as a Primary Key and the plan is to not allow duplicates in CostCode/TeamID but TeamID can be Null once for each instance of a CostCode.
The picture below shows that I can't add a CostCode and TeamID twice if they both have values, but I can add a CostCode twice with Null values in TeamID.
Is there any way to achieve this?
I've read I could give TeamID a default value of an empty string (or 0 as that will never be a TeamID) but I'd like to use Null if possible as that is what the empty string or 0 would represent.
EDIT:
After the comment from JJ32 and a weekend to think it through I've gone with putting the TeamID value into a separate table.
I would then have a Many-2-Many join between tbl_BranchDetail and tbl_CostCodes and a Many-2-Many join between tbl_CostCodeM2MJoin and tbl_Teams.
This will remove Null values from occurring in either Many-2-Many table and my query will now read as:
SELECT M2M.BranchID
,M2M.CostCodeID
,TM2M.TeamID
,CC.CostCode
,TM.TeamName
FROM ((tbl_CostCodes CC INNER JOIN tbl_CostCodeM2MJoin M2M ON CC.ID = M2M.CostCodeID)
LEFT JOIN tbl_CostCodeToTeamM2MJoin TM2M ON (M2M.BranchID = TM2M.BranchID AND
M2M.CostCodeID = TM2M.CostCodeID))
LEFT JOIN tbl_Teams TM ON TM2M.TeamID = TM.ID
I don't believe it is possible to disallow duplicate nulls in a unique composite index since no two Nulls are ever considered equal.
So in your example above you'd have three unique rows, one with a combination of TBC/1 and two with a combination of TBC/null.
The only answer I know of, unfortunately, is to choose some non-null value to represent null in TeamID, and then display the result as empty within the application.
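The behaviour and the workaround can be sketched outside Access. The example below uses SQLite through Python's sqlite3 module as a stand-in, since SQLite's unique indexes likewise treat NULLs as distinct; the table names with_null and with_sentinel and the sentinel value 0 are assumptions (the question notes that 0 will never be a real TeamID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE with_null (CostCode TEXT NOT NULL, TeamID INTEGER);
    CREATE UNIQUE INDEX ux_with_null ON with_null (CostCode, TeamID);

    CREATE TABLE with_sentinel (
        CostCode TEXT NOT NULL,
        TeamID INTEGER NOT NULL DEFAULT 0   -- 0 stands for "no team"
    );
    CREATE UNIQUE INDEX ux_with_sentinel ON with_sentinel (CostCode, TeamID);
""")

# With NULLs, the unique index accepts the same CostCode twice.
conn.execute("INSERT INTO with_null VALUES ('TBC', 1)")
conn.execute("INSERT INTO with_null VALUES ('TBC', NULL)")
conn.execute("INSERT INTO with_null VALUES ('TBC', NULL)")
null_count = conn.execute("SELECT COUNT(*) FROM with_null").fetchone()[0]

# With a sentinel, the second "no team" row violates the index.
conn.execute("INSERT INTO with_sentinel (CostCode) VALUES ('TBC')")
try:
    conn.execute("INSERT INTO with_sentinel (CostCode) VALUES ('TBC')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Turn the sentinel back into NULL when reading, so it displays as empty.
display_rows = conn.execute(
    "SELECT CostCode, NULLIF(TeamID, 0) FROM with_sentinel"
).fetchall()

print(null_count, duplicate_rejected, display_rows)
```

The first table ends up with three rows, two of them TBC/null, while the sentinel table rejects the duplicate and still reads back as an empty TeamID.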

MSSQL statement referencing across tables

So I have two tables, disciplinary and employees. Disciplinary has a column that holds an employee ID (an investigator), and I'm attempting to add new columns drawn from the employee table that give the employee's first and last name based on that existing employee ID. Below is the SQL I have so far:
SELECT d.*
, inv.firstName as investigatorFirstName
, inv.lastName as investigatorLastName
FROM det_siu_disciplinary d
LEFT OUTER JOIN cpso_employees inv ON
inv.commissionNumber = d.investigatorEmployeeID
WHERE d.isDelete = 0
This statement successfully adds the joined columns with their new names, but all columns are null. My primary concern is my SQL being flat out wrong, as it's the part of this process that I have least experience with. These statements are part of a much larger query, so if at all possible I'd prefer to not write a new query...adding contingencies would be perfect!
Anyone that assists, thank you in advance :)
The column "commissionNumber" seems unlikely to me to be the key of a table that should be matched on "EmployeeID" values. If it does not hold the same values as the foreign key column investigatorEmployeeID in your d table, the LEFT JOIN finds no match and every inv column comes back NULL.
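One way to check this is to run the join against each candidate key and compare. The sketch below uses Python's sqlite3 module as a stand-in for SQL Server. The employeeID column on cpso_employees is hypothetical, since the question does not show the table's schema, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE cpso_employees (
        employeeID INTEGER,        -- hypothetical column, not shown in the question
        commissionNumber INTEGER,
        firstName TEXT,
        lastName TEXT
    );
    CREATE TABLE det_siu_disciplinary (
        id INTEGER,
        investigatorEmployeeID INTEGER,
        isDelete INTEGER
    );
    INSERT INTO cpso_employees VALUES (7, 5001, 'Ann', 'Lee');
    INSERT INTO det_siu_disciplinary VALUES (1, 7, 0);
""")

QUERY = """
    SELECT d.id, inv.firstName, inv.lastName
    FROM det_siu_disciplinary d
    LEFT OUTER JOIN cpso_employees inv ON inv.{key} = d.investigatorEmployeeID
    WHERE d.isDelete = 0
"""

# Joining on a column that holds different values matches nothing,
# so the LEFT JOIN pads the name columns with NULL.
by_commission = conn.execute(QUERY.format(key="commissionNumber")).fetchall()

# Joining on the column that actually holds the employee ID finds the names.
by_employee_id = conn.execute(QUERY.format(key="employeeID")).fetchall()

print(by_commission)
print(by_employee_id)
```

If the real tables behave like this sample, changing the ON condition to the matching key column is the only fix needed; the rest of the query can stay as it is.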

SQL query to get data from one table based upon a column from another table?

In my tables I have for example
CountyID, County and CityID in the county table, and in the city table I have for example
CityID and City
How do I create a report that pulls the County from the county table and pulls city based upon the cityid in the county table.
Thanks
Since this is quite a basic question, I'll give you a basic answer instead of the code to do it for you.
Where tables have columns that "match" each other, you can join them together on what they have in common, and query the result almost as if it was one table.
There are also different types of join based on what you want - for example it might be that some rows in one of the tables you're joining together don't have a corresponding match.
If you're sure that a city will definitely have a corresponding county, try inner joining the two tables on their matching column CityID and querying the result.
The obvious common link between both tables is CityID, so you'd be joining on that. I think you have the data organized the wrong way round, though: I'd put CountyID in the city table rather than CityID in the county table. Then, based on the CountyID selected, you can limit your query of the city table.
To follow on from Bridge's answer: you are obviously new to SQL, and there are many places to learn how to write queries. However, the most fundamental habits to train yourself in are to always qualify columns with the table name or alias to prevent ambiguity, and to avoid column names that might be reserved words in the language; those always seem to bite people.
That said, the most basic of queries is
select
T1.field1,
T1.field2,
etc with more fields you want
from
FirstTable as T1
where
(some conditional criteria)
order by
(some column or columns)
Now, when dealing with multiple tables, you need JOINs; INNER and LEFT are the most common. INNER means a row MUST match in both tables. LEFT means every row from the table on the left side is kept, regardless of whether there is a match on the right. For example:
select
T1.Field1,
T2.SomeField,
T3.MaybeExistsField
from
SomeTable T1
Join SecondTable T2
on T1.SomeKey = T2.MatchingColumnInSecondTable
LEFT JOIN ThirdTable T3
on T1.AnotherKey = T3.ColumnThatMayHaveTheMatchingKey
order by
T2.SomeField DESC,
T1.Field1
From these examples, you should easily be able to incorporate your tables and their relationships to each other into your results...
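Applied to the county and city tables from the question, the pattern might look like the sketch below. It runs through Python's sqlite3 module; the table names county and city follow the question's wording, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE city (CityID INTEGER PRIMARY KEY, City TEXT);
    CREATE TABLE county (CountyID INTEGER PRIMARY KEY, County TEXT, CityID INTEGER);
    INSERT INTO city VALUES (1, 'Springfield'), (2, 'Shelbyville');
    INSERT INTO county VALUES (10, 'Greene', 1), (11, 'Bedford', 2);
""")

# Pull the county name, and the city looked up through county.CityID.
report = conn.execute("""
    SELECT co.County, ci.City
    FROM county AS co
    INNER JOIN city AS ci ON co.CityID = ci.CityID
    ORDER BY co.County
""").fetchall()

for county_name, city_name in report:
    print(county_name, city_name)
```

If some counties might have no matching city, switching INNER JOIN to LEFT JOIN would keep those counties in the report with an empty city.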