How to design a table linking three tables (Products, Designs, ColorImages) in SQL Server

I designed three tables
Products
Designs
Colorimages
What is the best way to design an extra table that joins these three tables together?
Each product has more than one design, and each design has more than one color image.
Example:
productid  designid  colorimageid  picturename
1          1         1             img1
1          1         2             img2
1          2         1             img3
1          2         3             img4
1          3         1             img5
2          1         1             img6
2          1         2             img7
2          2         1
2          3         3
2          3         4
How can I design this for high performance?

Create a table (perhaps called ProductDesignColor) with the fields
ProductID
DesignID
ColorImageID
plus its own IDENTITY column called ID.
For speed, keep the integer data types at the smallest sensible size, perhaps 32-bit integers.
Build a clustered index on the table's ID column.
Build a composite index on the three main ID fields (P, D, C); most of your queries will likely use this index.
Also create a single-column index on each of the three ID fields (P, D, C), so three more indexes in total, but only if you will be querying by a single ID value.
Performance also depends on your complete design, so your Products table should have ProductID as its primary key (which is indexed), and the same goes for the other tables. Indexes are the key to performance, but they have to be considered and used carefully. If you have the RAM, more indexes generally help reads, unless you are doing lots of inserts and updates, which every extra index slows down.
So a good, neat table design also leads to well-planned, minimal, powerful indexes that fit neatly in the available RAM.
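The table described above can be sketched as follows. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module in place of SQL Server (where ID would be an INT IDENTITY column with a clustered primary key), the index name IX_PDC and the helper images_for are made up, PictureName is borrowed from the question's example, and making the composite index UNIQUE is an extra assumption that each combination appears only once.

```python
import sqlite3

# SQLite stands in for SQL Server here (assumption for the sake of a runnable sketch).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE ProductDesignColor (
    ID           INTEGER PRIMARY KEY,  -- surrogate key (IDENTITY in SQL Server)
    ProductID    INTEGER NOT NULL,
    DesignID     INTEGER NOT NULL,
    ColorImageID INTEGER NOT NULL,
    PictureName  TEXT                  -- column taken from the question's example
);
-- Composite index on (P, D, C). UNIQUE is an added assumption.
CREATE UNIQUE INDEX IX_PDC
    ON ProductDesignColor (ProductID, DesignID, ColorImageID);
""")

# A few rows from the question's example data.
rows = [(1, 1, 1, "img1"), (1, 1, 2, "img2"), (1, 2, 1, "img3"), (1, 2, 3, "img4")]
conn.executemany(
    "INSERT INTO ProductDesignColor (ProductID, DesignID, ColorImageID, PictureName) "
    "VALUES (?, ?, ?, ?)",
    rows,
)

def images_for(product_id, design_id):
    """Return (ColorImageID, PictureName) pairs for one product/design combination."""
    cur = conn.execute(
        "SELECT ColorImageID, PictureName FROM ProductDesignColor "
        "WHERE ProductID = ? AND DesignID = ? ORDER BY ColorImageID",
        (product_id, design_id),
    )
    return cur.fetchall()
```

A lookup by product and design filters on the two leading columns of the composite index, which is the access pattern that index is meant to serve.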

Related

MS SQL Server Optimize repeated column values

I was requested to create a table that will contain many repeated values and I'm not sure if this is the best way to do it.
I must use SQL Server. I would love to use Azure Table Storage and partition keys, but I'm not allowed to.
Imagine that the table Shoes has the columns
id int, customer_name varchar(50), shoe_type varchar(50)
The problem is that the column shoe_type will have millions of repeated values, and I want to have them in their own partition, but SQL Server only allows ranged partitions afaik.
I don't want the repeated values to take more space than needed, meaning that if the column value is repeated 50 times, I don't want it to take 50 times more space, only 1 time.
I thought about using a relationship between the column shoe_type (as an int) and another table which will have its string value, but is that the most I can optimize?
EDIT
Shoes table data
id  customer_name  shoe_type
-----------------------------
1   a              nike
2   b              adidas
3   c              adidas
4   d              nike
5   e              adidas
6   f              nike
7   g              puma
8   h              nike
As you can see, the rows contain repeated shoe_type values (nike, adidas, puma).
What I thought about is using the shoe_type column as an int foreign key to another table, but I'm not sure if this is the most efficient way to do it, because in Azure Table Storage you have partitions and partition keys, and in MS SQL Server you have partitions, but they are ranged only.
The sample data you provide suggests that there is a "shoe type" entity in the business domain, and that all shoes have a mandatory relationship to a single shoe type. It would be different if the values were descriptive text - e.g. "Attractive running shoe, suitable for track and leisure wear". Repeated values are often (but of course not always) an indicator that there is another entity you can extract.
You suggest that the table will have millions of records. In very general terms, I recommend designing your schema to reflect the business domain, and only go for exotic optimization options once you know, and can measure, that you have a performance problem.
In your case, I'd suggest factoring out a separate table called "shoe_types", and including a foreign key relationship from "shoes" to "shoe_types". The primary key for "shoe_types" should be a clustered index, and the "shoe_type_id" foreign key column in "shoes" should have a regular (nonclustered) index. All things being equal, even with (tens of) millions of rows, queries that hit the foreign key index should be very fast.
In addition, supporting queries like "find all shoes where shoe type name starts with 'nik%'" should be much faster, because the shoe_types table should have far fewer rows than "shoes".
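A minimal sketch of that layout, using Python's sqlite3 module as a stand-in for SQL Server. The index name ix_shoes_shoe_type_id and the helper customers_with_type_prefix are invented for illustration; the sample rows come from the question.

```python
import sqlite3

# SQLite stands in for SQL Server (assumption); clustered vs. nonclustered
# indexes are a SQL Server concept and are not modelled here.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE shoe_types (
    shoe_type_id INTEGER PRIMARY KEY,
    name         TEXT NOT NULL UNIQUE   -- each type name is stored once
);
CREATE TABLE shoes (
    id            INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL,
    shoe_type_id  INTEGER NOT NULL REFERENCES shoe_types (shoe_type_id)
);
-- Regular index on the foreign key column in shoes.
CREATE INDEX ix_shoes_shoe_type_id ON shoes (shoe_type_id);
""")

conn.executemany("INSERT INTO shoe_types (name) VALUES (?)",
                 [("nike",), ("adidas",), ("puma",)])
type_ids = dict(conn.execute("SELECT name, shoe_type_id FROM shoe_types"))

sample = [("a", "nike"), ("b", "adidas"), ("c", "adidas"), ("d", "nike"),
          ("e", "adidas"), ("f", "nike"), ("g", "puma"), ("h", "nike")]
conn.executemany(
    "INSERT INTO shoes (customer_name, shoe_type_id) VALUES (?, ?)",
    [(customer, type_ids[shoe_type]) for customer, shoe_type in sample],
)

def customers_with_type_prefix(prefix):
    """Find customers whose shoe type name starts with `prefix`."""
    cur = conn.execute(
        "SELECT s.customer_name FROM shoes AS s "
        "JOIN shoe_types AS t ON t.shoe_type_id = s.shoe_type_id "
        "WHERE t.name LIKE ? ORDER BY s.id",
        (prefix + "%",),
    )
    return [row[0] for row in cur]
```

The string comparison only touches the three rows of shoe_types; the shoes table is then reached through its integer foreign key.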

Oracle Enforce Uniqueness

I need to enforce uniqueness on specific data in a table (~10 million rows). This example data illustrates the rule -
For code=X the part# must not be duplicated. For any other code, duplicate part# values are allowed. For example, the row with ID 8 must be rejected, but the row with ID 6 is fine. The table contains several different codes and part# values, but uniqueness is required only for code=X.
ID  CODE  PART#
1   A     R0P98
2   X     R9P01
3   A     R0P98
4   A     R0P44
5   X     R0P44
6   A     R0P98
7   X     T0P66
8   X     T0P66
The only way I see is to create a trigger on the table and check for PART# for code=X before insert or update. However, I fear this solution may slow down inserts and updates on this table.
Appreciate your help!
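In Oracle, you can create a unique index on an expression for this. The CASE expression yields NULL for every code other than X, and Oracle does not store entirely NULL keys in a B-tree index, so only the code=X rows are checked for uniqueness: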
create unique index myidx
on mytable (case when code = 'X' then part# end);
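The behaviour of such an index can be tried out with a small sketch. It runs against SQLite through Python's sqlite3 module rather than Oracle, which is an assumption made for runnability: SQLite also supports unique indexes on expressions and treats NULLs as distinct, so the observable result is the same for this data. The column is renamed part_no here because # is not valid in an unquoted SQLite identifier, and try_insert is a hypothetical helper.

```python
import sqlite3

# SQLite stands in for Oracle here (assumption); expression indexes need SQLite 3.9+.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mytable (
    id      INTEGER PRIMARY KEY,
    code    TEXT NOT NULL,
    part_no TEXT NOT NULL      -- "part#" in the Oracle original
);
-- NULL for every code except 'X', so only code = 'X' rows can collide.
CREATE UNIQUE INDEX myidx
    ON mytable (CASE WHEN code = 'X' THEN part_no END);
""")

def try_insert(row_id, code, part_no):
    """Insert one row; return False if the unique index rejects it."""
    try:
        with conn:
            conn.execute(
                "INSERT INTO mytable (id, code, part_no) VALUES (?, ?, ?)",
                (row_id, code, part_no),
            )
        return True
    except sqlite3.IntegrityError:
        return False

# The example rows from the question, in order.
sample = [(1, "A", "R0P98"), (2, "X", "R9P01"), (3, "A", "R0P98"), (4, "A", "R0P44"),
          (5, "X", "R0P44"), (6, "A", "R0P98"), (7, "X", "T0P66"), (8, "X", "T0P66")]
results = {row_id: try_insert(row_id, code, part_no) for row_id, code, part_no in sample}
```

With the question's data, rows 1 to 7 are accepted (including the repeated R0P98 under code A) and row 8 is rejected as a duplicate T0P66 under code X.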

SQL Server "pseudo/synthetic" composite Id(key)

Sorry, but I don't know what to call this in the title.
I want to create a unique key in which each group of digits identifies the PK of another table. Let's say I have the PKs below in these 3 tables:
Id  Company    Id  Area       Id  Role
1   Abc        1   HR         1   Assistant
2   Xyz        2   Financial  2   Manager
3   Qwe        3   Sales      3   VP
Now I need to insert values into another table. I know that I can use 3 columns and create a composite key to get integrity and uniqueness, as below:
Id_Company  Id_Area  Id_Role  ...Other_Columns.....
1           2        1
1           1        2
2           2        2
3           3        3
But I was thinking of creating a single column in which every X digits identify one FK. The first 3 columns of the table above would then become the following (supposing each digit is an FK):
Id ...Other_Columns.....
121
112
222
333
I don't know what this is called, or even whether it's a stupid idea, but it makes sense to me: I can select on a single column, and when I need a join I just split the number every X digits according to my definition.
It's called a "smart", "intelligent" or "concatenated" key. It's a bad idea. It is fragile, leads to update problems and impedes the DBMS. The DBMS and query language are designed for you to describe your application via base tables in a straightforward way. Use them as they were intended.
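For comparison, here is a rough sketch of the plain three-column version with a composite primary key and foreign keys, which the question itself mentions as the alternative. It uses Python's sqlite3 module instead of SQL Server so it can run as-is; the table name Assignment and the helpers try_assign and areas_and_roles are hypothetical.

```python
import sqlite3

# SQLite stands in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE Company (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Area    (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Role    (Id INTEGER PRIMARY KEY, Name TEXT NOT NULL);
CREATE TABLE Assignment (               -- hypothetical table name
    Id_Company INTEGER NOT NULL REFERENCES Company (Id),
    Id_Area    INTEGER NOT NULL REFERENCES Area (Id),
    Id_Role    INTEGER NOT NULL REFERENCES Role (Id),
    PRIMARY KEY (Id_Company, Id_Area, Id_Role)   -- uniqueness of the combination
);
INSERT INTO Company VALUES (1, 'Abc'), (2, 'Xyz'), (3, 'Qwe');
INSERT INTO Area    VALUES (1, 'HR'), (2, 'Financial'), (3, 'Sales');
INSERT INTO Role    VALUES (1, 'Assistant'), (2, 'Manager'), (3, 'VP');
INSERT INTO Assignment VALUES (1, 2, 1), (1, 1, 2), (2, 2, 2), (3, 3, 3);
""")

def try_assign(company_id, area_id, role_id):
    """Insert one combination; return False if a key constraint rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO Assignment VALUES (?, ?, ?)",
                         (company_id, area_id, role_id))
        return True
    except sqlite3.IntegrityError:
        return False

def areas_and_roles(company_name):
    """Join back to the lookup tables without any digit splitting."""
    cur = conn.execute(
        "SELECT a.Name, r.Name FROM Assignment AS x "
        "JOIN Company AS c ON c.Id = x.Id_Company "
        "JOIN Area AS a ON a.Id = x.Id_Area "
        "JOIN Role AS r ON r.Id = x.Id_Role "
        "WHERE c.Name = ? ORDER BY a.Name, r.Name",
        (company_name,),
    )
    return cur.fetchall()
```

The DBMS itself rejects duplicate combinations and references to non-existent companies, areas, or roles, and nothing breaks if an Id later grows beyond the digits reserved for it.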

How to design a database schema with type and subtype

I've read plenty of supertype/subtype threads and I'm pretty sure I am not asking the same one.
I have the following tables in my database. Note that:
1. Some security types only need Type but require no SubType, such as stocks and bonds.
2. Securties.TypeId is a foreign key pointing to Type.ID.
3. Securties.SubTypeId has no foreign key relationship to BondType or DerivativeType tables. And currently the data integrity is maintained by C# code.
Since the lack of a foreign key relationship is bad, I want to refactor this DB to add one. Given that this DB is already in production, what's the best way to improve it while limiting the software risk? For example, one way is to combine all the XXXType tables into a single table and renumber all the SubTypeIds, but that clearly involves updating tons of records in the Securites table, so it's considered riskier than an approach that doesn't require changing values.
[Securites]
ID  Name    TypeId  SubTypeId
1   Stock1  2       NULL
2   Fund1   3       NULL
3   Bond1   1       3
4   Deriv1  4       3
[Type]
ID Name
1 Bond
2 Stock
3 ETF
4 Derivative
[BondType]
ID Name
...
2 GovermentBond
3 CorporateBond
4 MunicipalBond
...
[DerivativeType]
ID Name
...
2 Future
3 Option
4 Swap
...

SQL database Structure

I've got a list of synonyms and need to create a database in SQL for it.
I was thinking about using a Relational Database Design, but don't know if it would be the best. There will be a decent amount of traffic using this database.
I was thinking of a layout like this, where Table1 would be
Id
Table2
Id
InterlinkID (Table1 Id)
Word
Would this be the best way? There could be anywhere from 1 to 20+ linked words. One other problem I see with this setup is when 1 word works as a synonym for more than one word.
Not so great Example of how it will be used, but you get the idea:
Table 1
Id
1
2
Table 2
Id  InterlinkID  Word
1   1            One
2   1            1
3   1            First
4   2            Two
5   2            2
6   2            Second
The most minimal way of modeling the relationship would be as a single table with three columns:
id - primary key, integer
word - unique word, should have a unique constraint to stop duplicates
parent_id - nullable
Use the parent_id to store the id of the word you want to relate the current word to. For example:
id | word | parent_id
---------------------------
1 | abc | NULL
2 | def | 1
...shows that abc was added first, and def is a synonym for it.
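A small sketch of the single-table layout, run against SQLite through Python's sqlite3 module (the source does not name a specific DBMS). The table name words and the helper synonyms_of are hypothetical, and the query assumes a one-level hierarchy in which every synonym points directly at the first word of its group.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE words (                       -- hypothetical table name
    id        INTEGER PRIMARY KEY,
    word      TEXT NOT NULL UNIQUE,        -- unique constraint stops duplicates
    parent_id INTEGER REFERENCES words (id)  -- NULL for the first word of a group
);
INSERT INTO words (id, word, parent_id) VALUES (1, 'abc', NULL), (2, 'def', 1);
""")

def synonyms_of(word):
    """Return the other words in the same group as `word`.

    Assumes every synonym points directly at the group's first word.
    """
    cur = conn.execute(
        "SELECT w.word FROM words AS w "
        "JOIN words AS target ON target.word = ? "
        "WHERE COALESCE(w.parent_id, w.id) = COALESCE(target.parent_id, target.id) "
        "AND w.id <> target.id "
        "ORDER BY w.word",
        (word,),
    )
    return [row[0] for row in cur]
```

COALESCE(parent_id, id) gives each row the id of its group's first word, so two words are synonyms when those values match.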
A more obvious and flexible means of modelling the relationship would be with two tables:
WORDS
id, primary key
wordvalue
SYNONYMS
word_id
synonym_id
Both columns in the SYNONYMS table would together form the primary key, to ensure that there can't be duplicates. However, that alone won't stop the same pair from being entered in reverse order. It will let you map numerous combinations, giving a "spider web" relationship between words, whereas the single-table format only supports a hierarchical relationship.
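The two-table layout could look roughly like the sketch below, again using SQLite through Python's sqlite3 module as an assumed stand-in. The CHECK (word_id < synonym_id) constraint is an addition that is not part of the answer above: it is one possible way to block reverse-order duplicates by always storing the smaller id first. The helpers add_word, link, and synonyms_of are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE words (
    id        INTEGER PRIMARY KEY,
    wordvalue TEXT NOT NULL UNIQUE
);
CREATE TABLE synonyms (
    word_id    INTEGER NOT NULL REFERENCES words (id),
    synonym_id INTEGER NOT NULL REFERENCES words (id),
    PRIMARY KEY (word_id, synonym_id),
    CHECK (word_id < synonym_id)   -- added assumption: one canonical order per pair
);
""")

def add_word(value):
    """Insert the word if it is new and return its id."""
    with conn:
        conn.execute("INSERT OR IGNORE INTO words (wordvalue) VALUES (?)", (value,))
    return conn.execute(
        "SELECT id FROM words WHERE wordvalue = ?", (value,)
    ).fetchone()[0]

def link(a, b):
    """Record that a and b are synonyms; return False if the pair is rejected."""
    first, second = sorted((add_word(a), add_word(b)))
    try:
        with conn:
            conn.execute("INSERT INTO synonyms VALUES (?, ?)", (first, second))
        return True
    except sqlite3.IntegrityError:
        return False

def synonyms_of(value):
    """Return words directly linked to `value`, whichever side of the pair it is on."""
    cur = conn.execute(
        "SELECT w.wordvalue FROM synonyms AS s "
        "JOIN words AS me ON me.wordvalue = ? "
        "JOIN words AS w ON w.id = "
        "  CASE WHEN s.word_id = me.id THEN s.synonym_id ELSE s.word_id END "
        "WHERE me.id IN (s.word_id, s.synonym_id) "
        "ORDER BY w.wordvalue",
        (value,),
    )
    return [row[0] for row in cur]
```

Because each pair is stored once in a fixed order, lookups have to check both columns, which is the price of ruling out mirrored rows.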