SQL tables relationship and indexes - sql

I want two tables, SalesOrderHeader and SalesOrderLine, with a one-to-many relationship between them, and I also want to build indexes on them.
SalesOrderHeader table:
| SalesOrderNumber |
| ---------------- |
| 1 |
| 2 |
SalesOrderLine table:
| SalesOrderNumber | Line |
| ---------------- | ---- |
| 1 | 1 |
| 1 | 2 |
| 2 | 1 |
| 2 | 2 |
| 2 | 3 |
Please advise how to build the relationship and the indexes with this table structure.

Create a foreign key:
ALTER TABLE SalesOrderLine
ADD FOREIGN KEY (SalesOrderNumber) REFERENCES SalesOrderHeader (SalesOrderNumber);
Afterwards, create a nonclustered index on that column:
CREATE NONCLUSTERED INDEX IX_SalesOrderLine_SalesOrderNumber
ON dbo.SalesOrderLine (SalesOrderNumber);
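A minimal sketch of the same header/line relationship, written with Python's built-in sqlite3 module so it runs without a database server. The column types and the composite primary key on (SalesOrderNumber, Line) are assumptions for illustration, not part of the question. With such a key, the primary key index already leads with SalesOrderNumber, so a separate index on that column is mainly needed when the line table is keyed differently (for example by a surrogate ID).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE SalesOrderHeader (
    SalesOrderNumber INTEGER PRIMARY KEY
);
CREATE TABLE SalesOrderLine (
    SalesOrderNumber INTEGER NOT NULL
        REFERENCES SalesOrderHeader (SalesOrderNumber),
    Line INTEGER NOT NULL,
    -- Assumed key: a line is identified by its order number plus line number.
    PRIMARY KEY (SalesOrderNumber, Line)
);
""")

# Sample rows matching the tables in the question.
conn.executemany("INSERT INTO SalesOrderHeader VALUES (?)", [(1,), (2,)])
conn.executemany(
    "INSERT INTO SalesOrderLine VALUES (?, ?)",
    [(1, 1), (1, 2), (2, 1), (2, 2), (2, 3)],
)


def insert_orphan_line():
    """Try to add a line for order 3, which has no header row."""
    try:
        conn.execute("INSERT INTO SalesOrderLine VALUES (?, ?)", (3, 1))
    except sqlite3.IntegrityError:
        return False  # rejected by the foreign key
    return True


lines_per_order = conn.execute(
    "SELECT SalesOrderNumber, COUNT(*) FROM SalesOrderLine "
    "GROUP BY SalesOrderNumber ORDER BY SalesOrderNumber"
).fetchall()
```

Here insert_orphan_line() shows what the foreign key buys you: a line cannot exist without its header.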

Related

Foreign Keys: Multiple Numeric or Character

I have to store records in an "Analyze" table; each record references an object with a defined Id (nvarchar(20)).
I came up with two possible designs (see Option A and Option B) for the Analyze table.
What I'm not sure about is whether it's better to store the primary keys of the different objects (ObjectA, ObjectB, ...) in separate columns or simply to store the plain ObjectId.
The Analyze table is growing very fast, and most operations are reads for a given ObjectId. So in most cases you have an ObjectId and have to search the Analyze table for it.
The pattern of the ObjectId is always the same: for example, you can identify the type of a given ObjectId by looking at its eighth- to fifth-last characters (0001 in CA16834K23850001ABCD):
0001 is always ObjectA
0002 is always ObjectB
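As a small illustration of that pattern, here is a sketch in Python that reads the type code from the four characters just before the last four. The function name and the mapping dictionary are hypothetical; only the two codes listed above come from the question.

```python
# Hypothetical lookup table built from the two codes given in the question.
TYPE_CODES = {"0001": "ObjectA", "0002": "ObjectB"}


def object_type(object_id: str) -> str:
    """Return the object type encoded in a 20-character ObjectId."""
    if len(object_id) != 20:
        raise ValueError("expected a 20-character ObjectId")
    # The type code sits in the four characters before the last four.
    code = object_id[-8:-4]
    return TYPE_CODES[code]
```

For example, object_type("CA16834K23850001ABCD") looks up the code "0001".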
ObjectA
| PK (bigInt) | ObjectId (nvarchar20) | otherfields |
| ------------| --------------------- | ------------|
| 1 | CA16834K23850001ABCD | .. |
| 2 | CA16834K23850001ABCE | .. |
ObjectB
| PK (bigInt) | ObjectId (nvarchar20) | otherfields |
| ----------- | --------------------- | ----------- |
| 1 | CA16834K23850002ABCD | .. |
| 2 | CA16834K23850002ABCE | .. |
Option A:
AnalyzeTable
| id (bigInt) | ObjA_PK (bigInt)| ObjB_PK (bigInt)| otherfields... |
| ----------- | --------------- | --------------- | -------------- |
| 1 | 1 | NULL | ... |
| 2 | NULL | 1 | ... |
Option B:
AnalyzeTable
| id (bigInt) | ObjectId (nvarchar20) | otherfields... |
| ----------- | --------------------------- | --------------- |
| 1 | CA16834K23850001ABCD | ... |
| 2 | CA16834K23850002ABCD | ... |
Which design is better for reading the AnalyzeTable, given that an index on a numeric column might be faster than an index on an nvarchar column?

Making sense of database table references with foreign and primary keys

I am relatively new to database logic, and am trying to figure out how multiple tables should/shouldn't reference each other.
I have a table, 'Book', that should have columns:
| 'title' | 'genre' | 'author' | 'buyOption' | 'pubDate'
I want each book to have the possibility to have 1 or more genres.
I want each book to have the possibility to have 1 or more authors.
I want each book to have the possibility to have 1 or more purchase options ('buyOption').
and each purchase option (Amazon, Walmart, etc.) for each book has a unique url.
What I think makes sense (please correct me where I'm wrong):
__________________________
| |
| Book |
|________________________|
|Primary Key | book_id | //seems redundant (same as title_id)...would like to just use title_id, but idk if that's possible
|------------|-----------|
|Foreign Key | title_id | <--------------------------------------------|
|Foreign Key | bo_id | <----------------------------------| |
|Foreign Key | genre_id | <--------------------------| | |
|Foreign Key | author_id | <-------------------| | | |
| - - - - - | - - - - - | | | | |
| | pubDate | //publish date | | | |
|________________________| | | | |
| | | |
| | | |
| | | |
__________________________ | | | |
| | | | | |
| Authors | | | | |
|________________________| | | | |
|Primary Key | author_id |------------------| | | |
|------------|-----------| | | |
|--->|Foreign Key | title_id | | | |
| | - - - - - | - - - - - | | | |
| | | author | | | |
| |____________|___________| | | |
| | | |
| | | |
| __________________________ | | |
| | | | | |
| | Genres | | | |
| |________________________| | | |
| |Primary Key | genre_id |-------------------------| | |
| |------------|-----------| | |
|--->|Foreign Key | title_id | | |
| | - - - - - | - - - - - | | |
| | | genre | | |
| |____________|___________| | |
| | |
| __________________________ | |
| | | | |
| | Buy Options | | |
| |________________________| | |
| |Primary Key | bo_id |---------------------------------| |
| |------------|-----------| |
|--->|Foreign Key | title_id | |
| | - - - - - | - - - - - | |
| | | buyBrand | //(Walmart, Amazon, etc.) |
| | | buyUrl | //(ex: https://www.amzn.com/buyBook1) |
| |____________|___________| |
| |
| |
| |
| __________________________ |
| | | |
| | Title | |
| |________________________| |
|---------|Primary Key | title_id |--------------------------------------|
|------------|-----------|
| | title |
|____________|___________|
Does it make sense to have the title table? If so, can I use its primary key to fill various other tables, as depicted?
If the 'Buy Options' table is going to have a bunch of different options and associated urls for each book, will it be possible to get the buyBrand and buyUrl directly from the main 'Book' table? In the end, I just want a giant table that I can grab cell data from. Right now I'm trying to figure out how to populate tables with my data, and what tables to fill for each piece of data.
(again, I'm new to database logic, so I apologize if my wording is hard to understand)
Your design does not look good. You are describing many-to-many relationships between books and genres, books and authors, books and options.
Storing references to the related genre, author, and option in the books table is not the right way to go: you can only store one related value per book (one genre, one author, one option), while you need many. Instead, for each of these relationships, you should have a separate table, called a bridge table, that references the associations.
On the other hand, information that is dependent on the book (say, the title) should be stored in the book table.
Here is one example for books and genres:
create table books(
book_id int primary key,
title varchar(100), --dependent column
pub_date date --dependent column
);
create table genres(
genre_id int primary key,
name varchar(100)
);
create table book_genres(
book_id int references books(book_id),
genre_id int references genres(genre_id),
primary key (book_id, genre_id)
);
Now, say you want to list all books that belong to genre 'Sci-Fi'; you would go:
select b.*
from books b
inner join book_genres bg on bg.book_id = b.book_id
inner join genres g on g.genre_id = bg.genre_id
where g.name = 'Sci-Fi'
The same logic should be implemented for each and every many-to-many relationship in your schema.
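As a sketch of how that could look for the other relationships, the following uses Python's built-in sqlite3 module. The table and column names (authors, book_authors, buy_options, buy_brand, buy_url) and the sample rows are assumptions for illustration. Buy options are modelled here as rows keyed by (book_id, buy_brand), since each option for each book carries its own URL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enable FK enforcement in SQLite
conn.executescript("""
CREATE TABLE books (
    book_id INTEGER PRIMARY KEY,
    title TEXT,
    pub_date TEXT
);
CREATE TABLE authors (
    author_id INTEGER PRIMARY KEY,
    name TEXT
);
-- Bridge table: one row per (book, author) pair.
CREATE TABLE book_authors (
    book_id INTEGER REFERENCES books (book_id),
    author_id INTEGER REFERENCES authors (author_id),
    PRIMARY KEY (book_id, author_id)
);
-- One row per purchase option of a book, with its own URL.
CREATE TABLE buy_options (
    book_id INTEGER REFERENCES books (book_id),
    buy_brand TEXT,
    buy_url TEXT,
    PRIMARY KEY (book_id, buy_brand)
);
INSERT INTO books VALUES (1, 'Book One', '2020-01-01');
INSERT INTO authors VALUES (1, 'Author A'), (2, 'Author B');
INSERT INTO book_authors VALUES (1, 1), (1, 2);
INSERT INTO buy_options VALUES
    (1, 'Amazon', 'https://example.com/amazon/book1'),
    (1, 'Walmart', 'https://example.com/walmart/book1');
""")

# All authors of book 1, fetched through the bridge table.
authors_of_book = [
    row[0]
    for row in conn.execute("""
        SELECT a.name
        FROM authors a
        JOIN book_authors ba ON ba.author_id = a.author_id
        WHERE ba.book_id = 1
        ORDER BY a.name
    """)
]

# All purchase options of book 1.
options_of_book = conn.execute("""
    SELECT buy_brand, buy_url
    FROM buy_options
    WHERE book_id = 1
    ORDER BY buy_brand
""").fetchall()
```

The two lookups are kept as separate queries on purpose: joining authors and buy options in a single query would return every author paired with every option.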

Find recipe by ingredient name, and sort if possible

I have trouble writing queries for my database. I've found a number of similar examples here and have been trying to get my head around them for hours, but none of the examples helped me figure out how to get the result I want. I'm building a recipe app, but I'm really new to everything that concerns databases.
I would like to pick 1 or more ingredients and get matching recipe titles, and if possible, sorted by best result (recipe which contains most ingredients).
I've tried to make an inner join where I try to get recipe names together with matching ingredients.
And I've tried to do as this person does in this question because we have the same design:
Recipe Database, search by ingredient
But even when I try the join the way he/she does it, just as a test (and not yet the way I would like it), I get:
ERROR: missing FROM-clause entry for table
Here are my tables:
Recipe_Ingredient
+-----------+---------------+
| recipe_id | ingredient_id |
+-----------+---------------+
| 1 | 1 |
| 1 | 2 |
| 1 | 3 |
| 1 | 4 |
| 1 | 5 |
| 1 | 6 |
| 1 | 7 |
| 2 | 1 |
| 2 | 8 |
| 2 | 9 |
| 2 | 10 |
| 2 | 11 |
+-----------+---------------+
Recipe
+-----------+-----------------------+--------------+
| id | name | instructions |
+-----------+-----------------------+--------------+
| 1 | Guacamole | sample text |
| 2 | Grilled avocado toast | sample text |
+-----------+-----------------------+--------------+
Ingredient
+------------------+
| id | name |
+----+-------------+
| 1 | avocado |
| 2 | tomato |
| 3 | chili |
| 4 | garlic |
| 5 | lemon |
| 6 | salt |
| 7 | black pepper|
| 8 | pesto |
| 9 | spinach |
| 10 | hard cheese |
| 11 | bread |
+------------------+
create table Recipe (
id SERIAL PRIMARY KEY,
name CHAR(50),
prep_time CHAR(10),
instructions VARCHAR(2000));
create table Ingredient (
id SERIAL PRIMARY KEY,
name CHAR(50));
create table Recipe_Ingredient (
recipe_id INT NOT NULL,
ingredient_id INT NOT NULL,
FOREIGN KEY(recipe_id) REFERENCES Recipe(id),
FOREIGN KEY(ingredient_id) REFERENCES Ingredient(id));
Are you looking for a query like this?
with cte as (
    select recipe_id, count(*) as cnt
    from Recipe_Ingredient
    group by recipe_id
)
select r.id as Recipeid, r.name, c.cnt
from Recipe r
join cte c on r.id = c.recipe_id
order by c.cnt desc
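Note that the query above counts every ingredient of each recipe, not only the ones that were picked. A possible variant that filters on the chosen ingredient names and sorts by the number of matches is sketched below with Python's built-in sqlite3 module. The helper name find_recipes and the added primary key on Recipe_Ingredient are assumptions; the SELECT itself is plain SQL and should carry over to PostgreSQL once the placeholder style is adapted to the driver.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Recipe (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Ingredient (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE Recipe_Ingredient (
    recipe_id INTEGER NOT NULL REFERENCES Recipe (id),
    ingredient_id INTEGER NOT NULL REFERENCES Ingredient (id),
    PRIMARY KEY (recipe_id, ingredient_id)
);
INSERT INTO Recipe VALUES (1, 'Guacamole'), (2, 'Grilled avocado toast');
INSERT INTO Ingredient VALUES
    (1, 'avocado'), (2, 'tomato'), (3, 'chili'), (4, 'garlic'),
    (5, 'lemon'), (6, 'salt'), (7, 'black pepper'), (8, 'pesto'),
    (9, 'spinach'), (10, 'hard cheese'), (11, 'bread');
INSERT INTO Recipe_Ingredient VALUES
    (1, 1), (1, 2), (1, 3), (1, 4), (1, 5), (1, 6), (1, 7),
    (2, 1), (2, 8), (2, 9), (2, 10), (2, 11);
""")


def find_recipes(ingredient_names):
    """Return (recipe name, matched ingredient count), best match first."""
    if not ingredient_names:
        return []
    # One "?" placeholder per name keeps the values out of the SQL text.
    placeholders = ", ".join("?" for _ in ingredient_names)
    sql = f"""
        SELECT r.name, COUNT(*) AS matches
        FROM Recipe r
        JOIN Recipe_Ingredient ri ON ri.recipe_id = r.id
        JOIN Ingredient i ON i.id = ri.ingredient_id
        WHERE i.name IN ({placeholders})
        GROUP BY r.id, r.name
        ORDER BY matches DESC, r.name
    """
    return conn.execute(sql, list(ingredient_names)).fetchall()
```

Calling find_recipes(["avocado", "tomato", "chili"]) ranks Guacamole first because it matches all three ingredients, while the toast matches only avocado.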

MySql - Is an index on a column redundant if it is defined as unique?

If a table is defined to have a unique constraint on a column, is it also necessary to define a separate index on that column if we want it to be indexed for fast lookup?
For example, given the following table indexes:
mysql> show index from FRONTIER;
+----------+------------+-------------------------+--------------+--------------+-----------+-------------+----------+--------+------+-------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+-------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
| FRONTIER | 0 | PRIMARY | 1 | ID | A | 0 | NULL | NULL | | BTREE | |
| FRONTIER | 0 | uniq_cnstr | 1 | HOST_ID | A | 0 | NULL | NULL | | BTREE | |
| FRONTIER | 0 | uniq_cnstr | 2 | HASH_PATHQRY | A | 0 | NULL | NULL | | BTREE | |
| FRONTIER | 1 | I_FRONTIER_HASH_PATHQRY | 1 | HASH_PATHQRY | A | 0 | NULL | NULL | | BTREE | |
+----------+------------+-------------------------+--------------+--------------+-----------+-------------+----------+--------+------+------------+---------+
Is the index 'I_FRONTIER_HASH_PATHQRY' redundant given the unique constraint 'uniq_cnstr'?
(note that the unique constraint spans 2 columns)
A unique index is a plain index as well, so there is no need to create a separate non-unique one on the same columns.
However, your unique constraint covers two columns (host_id, hash_pathqry), with hash_pathqry being the trailing one.
An index maintains the lexicographical order of its columns, so the hash_pathqry values are only ordered within each single value of host_id.
So the unique index you already have will not improve lookups on HASH_PATHQRY alone. The I_FRONTIER_HASH_PATHQRY index is still required.
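The leading-column behaviour can be observed with a small experiment. The sketch below uses SQLite through Python's built-in sqlite3 module rather than MySQL, so the plan text differs from MySQL's EXPLAIN output, and the column types are assumptions. It only illustrates the general point: a lookup on the trailing column alone is not an index search until a dedicated index exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE frontier (
    id INTEGER PRIMARY KEY,
    host_id INTEGER NOT NULL,
    hash_pathqry TEXT NOT NULL,
    UNIQUE (host_id, hash_pathqry)  -- composite unique constraint
);
""")


def plan(sql):
    """Return SQLite's query plan for a statement as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The human-readable description is the fourth column of each row.
    return " | ".join(row[3] for row in rows)


by_host = "SELECT id FROM frontier WHERE host_id = 1"
by_hash = "SELECT id FROM frontier WHERE hash_pathqry = 'abc'"

# Leading column: the composite unique index can be searched.
plan_host = plan(by_host)

# Trailing column only: no index search is possible yet, only a scan.
plan_before = plan(by_hash)

# With a dedicated index on the trailing column, the lookup becomes a search.
conn.execute("CREATE INDEX i_frontier_hash_pathqry ON frontier (hash_pathqry)")
plan_after = plan(by_hash)
```

Printing plan_host, plan_before and plan_after shows a search for the first, a scan for the second, and a search on i_frontier_hash_pathqry for the third.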
No. If you declare a column as unique, e.g. username varbinary(32) unique not null, then MySQL adds a unique index automatically.

optimizing an sql query using inner join and order by

I'm trying to optimize the following query without success. Any idea where it could be indexed to prevent the temporary table and the filesort?
EXPLAIN SELECT SQL_NO_CACHE `groups`.*
FROM `groups`
INNER JOIN `memberships` ON `groups`.id = `memberships`.group_id
WHERE ((`memberships`.user_id = 1)
AND (`memberships`.`status_code` = 1 AND `memberships`.`manager` = 0))
ORDER BY groups.created_at DESC LIMIT 5;
+----+-------------+-------------+--------+--------------------------+---------+---------+---------------------------------------------+------+----------------------------------------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+-------------+--------+--------------------------+---------+---------+---------------------------------------------+------+----------------------------------------------+
| 1 | SIMPLE | memberships | ref | grp_usr,grp,usr,grp_mngr | usr | 5 | const | 5 | Using where; Using temporary; Using filesort |
| 1 | SIMPLE | groups | eq_ref | PRIMARY | PRIMARY | 4 | sportspool_development.memberships.group_id | 1 | |
+----+-------------+-------------+--------+--------------------------+---------+---------+---------------------------------------------+------+----------------------------------------------+
2 rows in set (0.00 sec)
+--------+------------+-----------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+-----------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
| groups | 0 | PRIMARY | 1 | id | A | 6 | NULL | NULL | | BTREE | |
| groups | 1 | index_groups_on_name | 1 | name | A | 6 | NULL | NULL | YES | BTREE | |
| groups | 1 | index_groups_on_privacy_setting | 1 | privacy_setting | A | 6 | NULL | NULL | YES | BTREE | |
| groups | 1 | index_groups_on_created_at | 1 | created_at | A | 6 | NULL | NULL | YES | BTREE | |
| groups | 1 | index_groups_on_id_and_created_at | 1 | id | A | 6 | NULL | NULL | | BTREE | |
| groups | 1 | index_groups_on_id_and_created_at | 2 | created_at | A | 6 | NULL | NULL | YES | BTREE | |
+--------+------------+-----------------------------------+--------------+-----------------+-----------+-------------+----------+--------+------+------------+---------+
+-------------+------------+----------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------------+------------+----------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| memberships | 0 | PRIMARY | 1 | id | A | 2 | NULL | NULL | | BTREE | |
| memberships | 0 | grp_usr | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 0 | grp_usr | 2 | user_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | grp | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | usr | 1 | user_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | grp_mngr | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | grp_mngr | 2 | manager | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | complex_index | 1 | group_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | complex_index | 2 | user_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | complex_index | 3 | status_code | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | complex_index | 4 | manager | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | index_memberships_on_user_id_and_status_code_and_manager | 1 | user_id | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | index_memberships_on_user_id_and_status_code_and_manager | 2 | status_code | A | 2 | NULL | NULL | YES | BTREE | |
| memberships | 1 | index_memberships_on_user_id_and_status_code_and_manager | 3 | manager | A | 2 | NULL | NULL | YES | BTREE | |
+-------------+------------+----------------------------------------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Index on memberships column user_id (it should already have one if it's the PK)
Index on memberships columns status_code and manager (both of them in the same index)
Index on groups column created_at (with default DESC if possible; I don't know whether you can do that in MySQL)
This is what I would do in MS SQL Server, but I guess the same optimization can be used in MySQL too.
Do you have all the "obvious" single-column indexes on the fields you join on, the fields in your where clause and the created_at field you order by?
The trouble is you need an index on groups in order to eliminate the filesort, but all of your where conditions are on memberships.
Try adding an index on groups on (id, created_at).
If that doesn't work, try tricking the optimizer like so using a subquery (keeping the aforementioned index on groups):
SELECT SQL_NO_CACHE `groups`.*
FROM `groups`
INNER JOIN (select group_id from `memberships`
WHERE
`memberships`.user_id = 1
AND `memberships`.`status_code` = 1
AND `memberships`.`manager` = 0
) m on m.group_id=`groups`.id
ORDER BY groups.created_at DESC LIMIT 5;
There should be an index on at least memberships.user_id, but you could also gain some benefit from an index like (user_id, status_code, manager). I assume status_code and manager are flags that don't have a large range of possible values, so it isn't that important as long as there's an index on user_id.
A (user_id, status_code, manager) (in any order) index on memberships would help.
Avoiding the sort would be difficult, because then you have to start the join in the groups table, which means you can't use all the (presumably pretty selective) where clauses that reference the memberships table until it is too late.
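A sketch of the composite index and the query shape discussed above, using SQLite through Python's built-in sqlite3 module. SQLite's planner is not MySQL's, so this says nothing about the temporary table or filesort. It only shows the index definition and checks that the query returns the expected rows. The index name and the sample data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "groups" (
    id INTEGER PRIMARY KEY,
    name TEXT,
    created_at TEXT
);
CREATE TABLE memberships (
    id INTEGER PRIMARY KEY,
    group_id INTEGER REFERENCES "groups" (id),
    user_id INTEGER,
    status_code INTEGER,
    manager INTEGER
);
-- Composite index covering all three filter columns of the query.
CREATE INDEX idx_memberships_user_status_manager
    ON memberships (user_id, status_code, manager);

INSERT INTO "groups" VALUES
    (1, 'a', '2020-01-01'),
    (2, 'b', '2020-03-01'),
    (3, 'c', '2020-02-01');
INSERT INTO memberships (group_id, user_id, status_code, manager) VALUES
    (1, 1, 1, 0),
    (2, 1, 1, 0),
    (3, 1, 1, 1),  -- user 1 manages group 3, so it is filtered out
    (3, 2, 1, 0);  -- belongs to another user
""")

# Same shape as the query in the question: filter on memberships,
# order by the creation date of the joined group.
recent_group_ids = [
    row[0]
    for row in conn.execute("""
        SELECT g.id
        FROM "groups" g
        INNER JOIN memberships m ON g.id = m.group_id
        WHERE m.user_id = 1
          AND m.status_code = 1
          AND m.manager = 0
        ORDER BY g.created_at DESC
        LIMIT 5
    """)
]
```

The table name is quoted because groups is a keyword in some SQL dialects, which is also why the original query wraps it in backticks.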
Thanks for posting details about the indexes you're using.
I've tested this, and tried omitting some of the indexes. The most important index is memberships.complex_index, which serves as a covering index. This allows the query to achieve its results by reading only the index; it doesn't have to read data rows for memberships at all.
None of the indexes on groups make any difference. It appears to use a filesort no matter what. Filesort in MySQL simply means the rows are sorted in a separate pass rather than being read in index order, which in some cases can be less costly than using an index. For instance, if every row of groups needs to be read to produce the result of the query anyway, why bother doing an unnecessary double-lookup by using an index? The optimizer can sense these cases and can appropriately refuse to use an index.
So aside from primary key indexes and the complex_index, I'd drop all the others, since they're not helping and can only add to the cost of maintaining these tables.