SQL database structure

I have a list of synonyms and need to create a SQL database for it.
I was thinking of using a relational design, but I don't know whether it would be the best choice. There will be a decent amount of traffic on this database.
I was thinking of two tables:
Table1
Id
Table2
Id
InterlinkID (references Table1 Id)
Word
Would this be the best way? Each group could have anywhere from 1 to 20+ linked words. One problem I see with this setup is a word that works as a synonym for more than one other word.
Not so great example of how it will be used, but you get the idea:
Table 1
Id
--
1
2
Table 2
Id | InterlinkID | Word
-----------------------
1 | 1 | One
2 | 1 | 1
3 | 1 | First
4 | 2 | Two
5 | 2 | 2
6 | 2 | Second

The most minimal way of modeling the relationship would be as a single table with three columns:
id - primary key, integer
word - unique word, should have a unique constraint to stop duplicates
parent_id - nullable, refers to the id of another row in the same table
Use the parent_id to store the id number of the word you want to relate the current word to, e.g.:
id | word | parent_id
---------------------------
1 | abc | NULL
2 | def | 1
...shows that abc was added first, and def is a synonym for it.
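To make the single-table idea concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name words, the helper synonyms_of and the sample data (taken from the question) are my own assumptions, and the sketch assumes every synonym points directly at the first word of its group (one level of hierarchy only).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("""
    CREATE TABLE words (
        id        INTEGER PRIMARY KEY,
        word      TEXT NOT NULL UNIQUE,          -- stops duplicate words
        parent_id INTEGER REFERENCES words(id)   -- NULL for the first word of a group
    )
""")
conn.executemany(
    "INSERT INTO words (id, word, parent_id) VALUES (?, ?, ?)",
    [(1, "One", None), (2, "1", 1), (3, "First", 1),
     (4, "Two", None), (5, "2", 4), (6, "Second", 4)],
)


def synonyms_of(word):
    """Return the other words in the same group as `word`, sorted."""
    row = conn.execute(
        "SELECT COALESCE(parent_id, id) FROM words WHERE word = ?", (word,)
    ).fetchone()
    if row is None:
        return []
    root = row[0]  # id of the group's first word
    rows = conn.execute(
        "SELECT word FROM words "
        "WHERE (id = ? OR parent_id = ?) AND word <> ? ORDER BY word",
        (root, root, word),
    ).fetchall()
    return [r[0] for r in rows]


print(synonyms_of("First"))
```

Because word is unique and each row has a single parent_id, a word can only belong to one group here, which is exactly the limitation the question worries about.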
A more obvious and flexible means of modelling the relationship would be with two tables:
WORDS
id, primary key
wordvalue
SYNONYMS
word_id
synonym_id
Both columns in the SYNONYMS table would together form the primary key (a composite key), which ensures that the same pair can't be stored twice. However, it won't stop the same pair from being stored in reverse order. It does allow you to map numerous combinations and build a "spider web" of relationships between words, whereas the single-table format only supports a hierarchical relationship.
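A rough sketch of the two-table version, again with Python's sqlite3 module and an in-memory database. The CHECK (word_id < synonym_id) constraint and the helpers add_word, link and synonyms_of are my own additions, not part of the answer above; the constraint is one possible way to rule out the reversed duplicates mentioned, on the assumption that every pair is stored with the smaller id first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE words (
        id        INTEGER PRIMARY KEY,
        wordvalue TEXT NOT NULL UNIQUE
    );
    CREATE TABLE synonyms (
        word_id    INTEGER NOT NULL REFERENCES words(id),
        synonym_id INTEGER NOT NULL REFERENCES words(id),
        PRIMARY KEY (word_id, synonym_id),
        CHECK (word_id < synonym_id)   -- assumption: smaller id always goes first
    );
""")


def add_word(value):
    """Insert a word and return its generated id."""
    cur = conn.execute("INSERT INTO words (wordvalue) VALUES (?)", (value,))
    return cur.lastrowid


def link(a, b):
    """Store the pair once, whatever order the ids are passed in.
    OR IGNORE also silently skips a word linked to itself."""
    lo, hi = sorted((a, b))
    conn.execute("INSERT OR IGNORE INTO synonyms VALUES (?, ?)", (lo, hi))


def synonyms_of(word_id):
    """Look on both sides of the pair, since a word can sit in either column."""
    rows = conn.execute("""
        SELECT w.wordvalue FROM synonyms s JOIN words w ON w.id = s.synonym_id
        WHERE s.word_id = ?
        UNION
        SELECT w.wordvalue FROM synonyms s JOIN words w ON w.id = s.word_id
        WHERE s.synonym_id = ?
        ORDER BY 1
    """, (word_id, word_id)).fetchall()
    return [r[0] for r in rows]


big, large, great, excellent = (
    add_word(w) for w in ("big", "large", "great", "excellent"))
link(big, large)
link(large, big)        # same pair reversed: stored only once
link(big, great)
link(great, excellent)  # "great" is a synonym in two different senses
print(synonyms_of(great))
```

Here "great" is linked to both "big" and "excellent" without those two being linked to each other, which the single-table layout cannot express.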

Related

Prohibit breaking the sequence for rows field (no gaps)

Is there a way to create a check, an index, or anything else that prevents breaking the sequence in a column?
Let's assume I have a chapters table with an order column.
Chapter table:
uuid | order
dad | 1
1dd | 2
xxss | 3
sdsd | 4
5aa | 5
Chapter order starts from 1 and should not contain gaps such as 1,2,4,5 (3 is missing). Any chapter can be deleted, or inserted at any position (with reordering).
If there is no way to forbid gaps, how can I reorder the chapters after an insert or delete to close the gaps (renumber from 1 to max)?
I am not sure there is an easy way to prevent gaps. I would start with a unique constraint, which prevents duplicates.
Then, you can use a view that assigns a gapless sequential number based on the existing column:
create view myview as
select uuid, row_number() over(order by ord) as new_ord
from mytable
Whenever you want to display the sequential chapter numbers, you can query the view instead of the table.
Note: order is a reserved word in SQL; I used ord instead in the query.
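Here is a small runnable sketch of that view, using Python's sqlite3 module with an in-memory database. The names chapters and chapters_numbered are assumptions of mine, and the window function needs SQLite 3.25 or newer; on another DBMS the syntax may need adjusting.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE chapters (
        uuid TEXT PRIMARY KEY,
        ord  INTEGER NOT NULL UNIQUE   -- no duplicates, but gaps are allowed
    );
    CREATE VIEW chapters_numbered AS
        SELECT uuid, ROW_NUMBER() OVER (ORDER BY ord) AS new_ord
        FROM chapters;
""")
conn.executemany(
    "INSERT INTO chapters (uuid, ord) VALUES (?, ?)",
    [("dad", 1), ("1dd", 2), ("xxss", 3), ("sdsd", 4), ("5aa", 5)],
)

# Deleting chapter 3 leaves a gap in the stored column...
conn.execute("DELETE FROM chapters WHERE uuid = 'xxss'")
stored = conn.execute(
    "SELECT uuid, ord FROM chapters ORDER BY ord").fetchall()

# ...but the view still numbers the remaining chapters 1..N.
numbered = conn.execute(
    "SELECT uuid, new_ord FROM chapters_numbered ORDER BY new_ord").fetchall()

print(stored)
print(numbered)
```

The stored ord values only need to keep the relative order; the displayed numbers always come from the view.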

SQL Server "pseudo/synthetic" composite Id(key)

Sorry, but I don't know what to call this in the title.
I want to create a unique key in which each group of two digits identifies another table's PK. Let's say I have the following PKs in these 3 tables:
Id | Company | Id | Area | Id | Role
-----------------------------------
1 | Abc | 1 | HR | 1 | Assistant
2 | Xyz | 2 | Financial | 2 | Manager
3 | Qwe | 3 | Sales | 3 | VP
Now I need to insert values into another table. I know that I could use 3 columns and create a composite key to get integrity and uniqueness, as below:
Id_Company | Id_Area | Id_Role | ...Other_Columns.....
------------------------------------------------------
1 | 2 | 1
1 | 1 | 2
2 | 2 | 2
3 | 3 | 3
But I was thinking of creating a single column in which each group of X digits identifies one FK. The first 3 columns of the table above would then become the following (supposing each digit is an FK):
Id ...Other_Columns.....
121
112
222
333
I don't know what to call it, or even whether it's a stupid idea, but it makes sense to me: I can select on a single column, and if I need a join I just have to split the number every X digits according to my definition.
It's called a "smart", "intelligent" or "concatenated" key. It's a bad idea. It is fragile, leads to update problems and impedes the DBMS. The DBMS and query language are designed for you to describe your application via base tables in a straightforward way. Use them as they were intended.
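For comparison, here is a sketch of the straightforward three-column version using Python's sqlite3 module and an in-memory database. The table name assignment and the helper try_insert are hypothetical; the data is taken from the question. The point is that the DBMS can then enforce both uniqueness and the foreign keys by itself, which it cannot do for digits packed into one number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE area    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE role    (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE assignment (
        id_company INTEGER NOT NULL REFERENCES company(id),
        id_area    INTEGER NOT NULL REFERENCES area(id),
        id_role    INTEGER NOT NULL REFERENCES role(id),
        PRIMARY KEY (id_company, id_area, id_role)   -- composite key
    );
    INSERT INTO company VALUES (1, 'Abc'), (2, 'Xyz'), (3, 'Qwe');
    INSERT INTO area    VALUES (1, 'HR'), (2, 'Financial'), (3, 'Sales');
    INSERT INTO role    VALUES (1, 'Assistant'), (2, 'Manager'), (3, 'VP');
    INSERT INTO assignment VALUES (1, 2, 1), (1, 1, 2), (2, 2, 2), (3, 3, 3);
""")


def try_insert(company, area, role):
    """Return True if the row was accepted, False if a constraint rejected it."""
    try:
        conn.execute(
            "INSERT INTO assignment VALUES (?, ?, ?)", (company, area, role))
        return True
    except sqlite3.IntegrityError:
        return False


duplicate_ok = try_insert(1, 2, 1)  # same triple again: primary key violation
dangling_ok = try_insert(1, 2, 9)   # role 9 does not exist: foreign key violation

# Joins and filters work on plain columns, no digit splitting needed.
managers = conn.execute("""
    SELECT c.name, a.name
    FROM assignment x
    JOIN company c ON c.id = x.id_company
    JOIN area    a ON a.id = x.id_area
    JOIN role    r ON r.id = x.id_role
    WHERE r.name = 'Manager'
    ORDER BY c.name
""").fetchall()
print(duplicate_ok, dangling_ok, managers)
```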

How to design a database schema with type and subtype

I've read plenty of supertype/subtype threads and I'm pretty sure I am not asking the same one.
I have the following tables in my database. Note that:
1. Some security types only need Type but require no SubType, such as stocks and bonds.
2. Securties.TypeId is a foreign key pointing to Type.ID.
3. Securties.SubTypeId has no foreign key relationship to BondType or DerivativeType tables. And currently the data integrity is maintained by C# code.
Since the lack of a foreign key relationship is bad, I want to refactor this DB to add one. Given that this DB is already in production, what's the best way to improve it while limiting the software risk? For example, one way is to combine all XXXType tables into a single table and renumber all SubTypeIds, but that clearly involves updating tons of records in the Securites table, so it is considered riskier than an approach that doesn't require changing values.
[Securites]
ID Name TypeId SubTypeId
1 Stock1 2 NULL
2 Fund1 3 NULL
3 Bond1 1 3
4 Deriv1 4 3
[Type]
ID Name
1 Bond
2 Stock
3 ETF
4 Derivative
[BondType]
ID Name
...
2 GovermentBond
3 CorporateBond
4 MunicipalBond
...
[DerivativeType]
ID Name
...
2 Future
3 Option
4 Swap
...

How to merge two identical database data to one?

Two customers are going to merge. Both use my application, each with their own database. In a few weeks they will merge (become one organisation), so they want to have all the data in one database.
The two database structures are identical; the problem is with the data. For example, I have the tables Locations and Persons (these are just two tables of 50):
Database 1:
Locations:
Id Name Adress etc....
1 Location 1
2 Location 2
Persons:
Id LocationId Name etc...
1 1 Alex
2 1 Peter
3 2 Lisa
Database 2:
Locations:
Id Name Adress etc....
1 Location A
2 Location B
Persons:
Id LocationId Name etc...
1 1 Mark
2 2 Ashley
3 1 Ben
We see that a person is related to a location (column LocationId). Note that I have more tables that refer to the Locations and Persons tables.
Each database contains its own locations and persons, but the Ids can be the same. For example, when I import everything into DB2, the locations of DB1 should be inserted into DB2 with the Ids 3 and 4. The persons from DB1 should get the new Ids 4, 5 and 6, and their LocationId values also have to be changed to the new location Ids (3 and 4).
My idea is to write a query that handles everything, but I don't know where to begin.
What is the best way (in a query) to renumber the Id fields and cascade the change to the child rows? The databases have no referential integrity (foreign keys are NOT defined in the database). Creating foreign keys and cascading is not an option.
I'm using SQL Server 2005.
You say that both customers are using your application, so I assume that it's some kind of "shrink-wrap" software that is used by more customers than just these two, correct?
If yes, adding special columns to the tables or anything like that will probably cause pain in the future: you would either have to maintain a special version for these two customers that can deal with the additional columns, or introduce these columns into your main codebase, which means that all your other customers would get them as well.
I can think of an easier way to do this without changing any of your tables or adding any columns.
In order for this to work, you need to find the largest ID that exists in either database (no matter which table or which database it is in).
This may require some copy & paste to get a lot of queries that look like this:
select max(id) as maxlocationid from locations
select max(id) as maxpersonid from persons
-- and so on... (one query for each table)
Once you have found the largest ID by running the queries in both databases, pick a number that's larger than that ID and add it to all IDs in all tables in the second database.
It's very important that the number is larger than the largest ID that already exists in either database!
It's a bit difficult to explain, so here's an example:
Let's say that the largest ID in any table in either database is 8000.
Then you run some SQL that adds 10000 to every ID in every table in the second database:
update Locations set Id = Id + 10000
update Persons set Id = Id + 10000, LocationId = LocationId + 10000
-- and so on, for each table
The queries are relatively simple, but this is the most work because you have to build a query like this manually for each table in the database, with the correct names of all the ID columns.
After running the query on the second database, the example data from your question will look like this:
Database 1: (exactly like before)
Locations:
Id Name Adress etc....
1 Location 1
2 Location 2
Persons:
Id LocationId Name etc...
1 1 Alex
2 1 Peter
3 2 Lisa
Database 2:
Locations:
Id Name Adress etc....
10001 Location A
10002 Location B
Persons:
Id LocationId Name etc...
10001 10001 Mark
10002 10002 Ashley
10003 10001 Ben
And that's it! Now you can import the data from one database into the other, without getting any primary key violations at all.
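To show the whole procedure end to end, here is a small sketch using Python's sqlite3 module rather than SQL Server 2005, so the syntax is only illustrative. Both databases live in one connection (main plays database 1, and an attached in-memory database named db2 plays database 2); that setup, the helper max_id and the fixed offset of 10000 are assumptions for the demo.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                 # "main" plays database 1
conn.execute("ATTACH DATABASE ':memory:' AS db2")  # database 2

for schema in ("main", "db2"):
    conn.execute(
        f"CREATE TABLE {schema}.Locations (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        f"CREATE TABLE {schema}.Persons "
        "(Id INTEGER PRIMARY KEY, LocationId INTEGER, Name TEXT)")

conn.executemany("INSERT INTO main.Locations VALUES (?, ?)",
                 [(1, "Location 1"), (2, "Location 2")])
conn.executemany("INSERT INTO main.Persons VALUES (?, ?, ?)",
                 [(1, 1, "Alex"), (2, 1, "Peter"), (3, 2, "Lisa")])
conn.executemany("INSERT INTO db2.Locations VALUES (?, ?)",
                 [(1, "Location A"), (2, "Location B")])
conn.executemany("INSERT INTO db2.Persons VALUES (?, ?, ?)",
                 [(1, 1, "Mark"), (2, 2, "Ashley"), (3, 1, "Ben")])


def max_id(schema):
    """Largest Id over all tables of one database (one query per table)."""
    return max(
        conn.execute(
            f"SELECT COALESCE(MAX(Id), 0) FROM {schema}.{table}").fetchone()[0]
        for table in ("Locations", "Persons")
    )


largest = max(max_id("main"), max_id("db2"))
offset = 10000
if offset <= largest:
    raise ValueError("offset must be larger than every existing Id")

# Shift every Id and every column that refers to an Id in the second database.
conn.execute("UPDATE db2.Locations SET Id = Id + ?", (offset,))
conn.execute(
    "UPDATE db2.Persons SET Id = Id + ?, LocationId = LocationId + ?",
    (offset, offset))

# The import itself can no longer collide with existing keys.
conn.execute("INSERT INTO main.Locations SELECT * FROM db2.Locations")
conn.execute("INSERT INTO main.Persons SELECT * FROM db2.Persons")

merged = conn.execute("""
    SELECT p.Id, p.Name, l.Name
    FROM main.Persons p JOIN main.Locations l ON l.Id = p.LocationId
    ORDER BY p.Id
""").fetchall()
print(merged)
```

Every person still points at the right location after the import, because the same offset was applied to the key and to the column that refers to it.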
If this were my problem, I would probably add some columns to the tables in the database I was going to keep. These would be used to store the pk values from the other db. Then I would insert records from the other tables. For the ones with foreign keys, I would use a known value. Then I would update as required and drop the columns I added.
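A possible reading of this second approach, sketched with Python's sqlite3 module (not SQL Server 2005, so treat the syntax as illustrative). The helper column OldId and the attached database name other are hypothetical, and the sketch resolves the foreign key with a join during the insert instead of inserting a placeholder and updating it afterwards.

```python
import sqlite3

conn = sqlite3.connect(":memory:")                   # database that is kept (DB2)
conn.execute("ATTACH DATABASE ':memory:' AS other")  # database being imported (DB1)

for schema in ("main", "other"):
    conn.execute(
        f"CREATE TABLE {schema}.Locations (Id INTEGER PRIMARY KEY, Name TEXT)")
    conn.execute(
        f"CREATE TABLE {schema}.Persons "
        "(Id INTEGER PRIMARY KEY, LocationId INTEGER, Name TEXT)")

conn.executemany("INSERT INTO main.Locations VALUES (?, ?)",
                 [(1, "Location A"), (2, "Location B")])
conn.executemany("INSERT INTO main.Persons VALUES (?, ?, ?)",
                 [(1, 1, "Mark"), (2, 2, "Ashley"), (3, 1, "Ben")])
conn.executemany("INSERT INTO other.Locations VALUES (?, ?)",
                 [(1, "Location 1"), (2, "Location 2")])
conn.executemany("INSERT INTO other.Persons VALUES (?, ?, ?)",
                 [(1, 1, "Alex"), (2, 1, "Peter"), (3, 2, "Lisa")])

# Temporary column that remembers the key each imported row had in the other db.
conn.execute("ALTER TABLE main.Locations ADD COLUMN OldId INTEGER")

# Imported locations get fresh Ids; their original Ids are kept in OldId.
conn.execute("""
    INSERT INTO main.Locations (Name, OldId)
    SELECT Name, Id FROM other.Locations ORDER BY Id
""")

# Imported persons look up their new LocationId through OldId.
conn.execute("""
    INSERT INTO main.Persons (LocationId, Name)
    SELECT l.Id, p.Name
    FROM other.Persons p
    JOIN main.Locations l ON l.OldId = p.LocationId
    ORDER BY p.Id
""")

locations = conn.execute(
    "SELECT Id, Name FROM main.Locations ORDER BY Id").fetchall()
persons = conn.execute(
    "SELECT Id, LocationId, Name FROM main.Persons ORDER BY Id").fetchall()
print(locations)
print(persons)
```

Persons would need an OldId column of their own if other tables refer to them. The helper columns are dropped once every table has been imported; that step is left out here.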

How do I relate multiple rows in two tables?

I have two tables:
table1:
id(int) | stuff(text)
-------------------------
1 | foobarfoobarfoo
2 | blahfooblah
3 | foo
table2:
id(int) | otherstuff(text)
--------------------------
1 | foo
2 | bar
3 | blah
A row in table1 can have more than one of foo, bar etc. And, each row in table2 can appear in more than one row of table1.
Which is the better way of keeping this straight? Should I create a third table like this:
table3:
id_from2(int) | id_from1(int)
-----------------------------
1 | 1
1 | 2
1 | 3
2 | 1
3 | 2
Or should I add a column of type array to table1 and table2 to keep track of the same information?
Yes, using junction tables is the correct way of implementing many-to-many relations in RDBMS.
You can add more attributes to your junction table (i.e. table3) if necessary. For example, if the relations are ordered, you can add a third field that specifies an ordering of the (table1, table2) combinations. Here is a link to an answer on Stack Overflow that gives a nice detailed example of a many-to-many table.
This is a standard many-to-many design. The most flexible solution is a third table with id associations, as you have shown.
Can't agree more. Your design of adding a third table is correct.
A relation table is the best way to model many-to-many relationships. You did it well.
So you want a many-to-many relationship? One row from table1 can relate to several rows in table2, and the other way around? Yes, use a third table like you said; it's a best practice. You can also add an autoincrement primary key column, just to be safe.
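As a quick illustration of the junction table from the question, here is a sketch with Python's sqlite3 module and an in-memory database. The helpers parts_of and rows_containing are made up for the demo, and the sketch uses a composite primary key on table3 rather than a separate autoincrement column, which is only one of the options mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE table1 (id INTEGER PRIMARY KEY, stuff TEXT);
    CREATE TABLE table2 (id INTEGER PRIMARY KEY, otherstuff TEXT);
    CREATE TABLE table3 (
        id_from2 INTEGER NOT NULL REFERENCES table2(id),
        id_from1 INTEGER NOT NULL REFERENCES table1(id),
        PRIMARY KEY (id_from2, id_from1)   -- each pair can appear only once
    );
    INSERT INTO table1 VALUES (1, 'foobarfoobarfoo'), (2, 'blahfooblah'), (3, 'foo');
    INSERT INTO table2 VALUES (1, 'foo'), (2, 'bar'), (3, 'blah');
    INSERT INTO table3 VALUES (1, 1), (1, 2), (1, 3), (2, 1), (3, 2);
""")


def parts_of(table1_id):
    """All table2 values related to one table1 row."""
    rows = conn.execute("""
        SELECT t2.otherstuff
        FROM table3 t3 JOIN table2 t2 ON t2.id = t3.id_from2
        WHERE t3.id_from1 = ?
        ORDER BY t2.id
    """, (table1_id,)).fetchall()
    return [r[0] for r in rows]


def rows_containing(otherstuff):
    """All table1 ids related to one table2 value."""
    rows = conn.execute("""
        SELECT t3.id_from1
        FROM table3 t3 JOIN table2 t2 ON t2.id = t3.id_from2
        WHERE t2.otherstuff = ?
        ORDER BY t3.id_from1
    """, (otherstuff,)).fetchall()
    return [r[0] for r in rows]


print(parts_of(1))
print(rows_containing("foo"))
```

Both directions of the relationship are answered by a plain join, which is harder to do with array columns kept in sync on both tables.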