Best SQL table design [closed]

Which design is better for a table of products with probably more than one name per product?
Using one table or two tables:
One Table:
| id  | Price | EN name | CH name | JP name | ... |
|-----|-------|---------|---------|---------|-----|
| 1   | 100   | ABC     | 中文一  | null    | ... |
| 2   | 200   | CDE     | null    | null    | ... |
| ... | ...   | ...     | ...     | ...     | ... |
Two Tables:
| id  | Price | EN name |
|-----|-------|---------|
| 1   | 100   | ABC     |
| 2   | 200   | CDE     |
| ... | ...   | ...     |

| id  | language | name   |
|-----|----------|--------|
| 1   | CH       | 中文一 |
| 3   | JP       | 東京   |
| ... | ...      | ...    |

In general, the two-table design is considered "better", because it makes it easier to add new languages. I will say that this might apply more to languages that share a common character set.
In many databases, expanding the width of a row (by adding more language columns) also has an effect on performance: wider rows are more expensive, and adding columns to a table that already has many rows is expensive.
One advantage of the single table is that you can define the character set for each column to suit that language, instead of having one generic column that has to accommodate all character sets.
In other words, there is not necessarily a simple answer to your question.

The second way allows you to add extra languages.
Actually, you could even put English inside the second table as well, to make the design more compliant with database normalization:
https://en.wikipedia.org/wiki/Database_normalization
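A minimal sketch of that fully normalized variant (the table and column names below are illustrative assumptions, not taken from the question): the product row keeps only language-independent data, and every name, English included, goes into the translations table.
CREATE TABLE product (
    id    INT PRIMARY KEY,
    price DECIMAL(10, 2)                   -- language-independent data only
);

CREATE TABLE product_name (
    product_id INT REFERENCES product (id),
    language   CHAR(2),                    -- e.g. 'EN', 'CH', 'JP'
    name       NVARCHAR(255),              -- one Unicode column instead of one column per language
    PRIMARY KEY (product_id, language)
);
Adding a new language then means inserting rows rather than altering the table. NVARCHAR assumes a dialect with a Unicode string type; elsewhere a VARCHAR with a UTF-8 collation serves the same purpose.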


Find exact match or first bigger number in Access database [closed]

I have the following problem.
There is a table with four columns:
|ID | X | Y | VAL |
|:--|:--:|:--:|----:|
|1 | 1 | 1 | 1110|
|2 | 1 | 2 | 1720|
|3 | 1 | 3 | 2330|
|4 | 1 | 4 | 2940|
|5 | 1 | 5 | 3550|
...
When the user enters some value in a text field, e.g. 2370, I need a function that checks whether there is an exact match in the VAL field and, if there is not, finds the very first value bigger than 2370 (2940 here) and returns its ID.
In some other languages I could do this with dictionaries and the like, but in VBA I simply have no idea.
Any idea or help will be appreciated.
You can use a query to get this answer, using TOP 1 to return just one record:
SELECT TOP 1 tblData.ID, tblData.VAL
FROM tblData
WHERE tblData.VAL >= 2370
ORDER BY tblData.VAL ASC;
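Since the value comes from a text field, the hard-coded 2370 would normally be a parameter. A hedged sketch as a saved Access parameter query follows; TargetVal is an assumed parameter name and tblData is the table from the answer (Jet SQL does not allow inline comments, so the assumptions are stated here instead):
PARAMETERS TargetVal Long;
SELECT TOP 1 tblData.ID, tblData.VAL
FROM tblData
WHERE tblData.VAL >= TargetVal
ORDER BY tblData.VAL ASC;
If no row has VAL >= TargetVal, the query returns no records, so the VBA caller should check for an empty recordset.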

update sql table with increment number [closed]

I have a table of information in SQL to which I have added a counter field. We want to fill this field based on the item field, in order. Of course, the item values vary, and each distinct item value must have its own counter.
|item|no|
|:---|-:|
|110 | |
|120 | |
|110 | |
|150 | |
After the update
|item|no |
|:---|--:|
|110 | 1 |
|120 | 1 |
|110 | 2 |
|150 | 1 |
You can use row_number(). Assuming that you have a column that specifies ordering:
select t.*,
row_number() over (partition by item order by <ordering col>) as number
from t;
Note: If you don't care about the ordering, you can use order by item. Some databases allow row_number() without the order by as well.
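The question asks to fill the no column, not just to select the numbering. A hedged sketch of that update for SQL Server, assuming the table is named t and has an id column that defines the order (both names are assumptions standing in for the answer's placeholder):
with numbered as (
    select no,
           row_number() over (partition by item order by id) as rn   -- id: assumed ordering column
    from t
)
update numbered        -- SQL Server allows updating through a CTE over a single table
set no = rn;
Databases that do not allow updating through a CTE need a join back to the base table on its key instead.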

Pentaho Data Integration De-normalize many row values as field names

I am reading data from a survey in a table that has 3 fields:
- record
- question
- answer
For every record there are many rows, one per question, with the corresponding answer:
| record | question | answer |
|--------|----------|--------|
| 1      | q1       | a1     |
| 1      | q2       | a2     |
| 2      | q1       | a1     |
| 2      | q2       | a2     |
What I want to do in Pentaho is transform this table into one where I have the record field and each question becomes a field, so that each row contains the record id and the answer values:
| record | q1 | q2 |
|--------|----|----|
| 1      | a1 | a2 |
| 2      | a1 | a2 |
I would do it with the de-normalization step, but in my case I have a lot of questions and they may change, so I was wondering whether there is an automatic way to map the values in the input question field to the output field names.
You can try Metadata Injection to inject these values at runtime.
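For comparison, this is what the denormalised result looks like when the questions are known in advance and can be hard-coded, expressed as a plain SQL pivot (the table name survey_answers is an assumption); Metadata Injection is what lets Pentaho build this question-to-column mapping from the data instead of hard-coding it:
SELECT record,
       MAX(CASE WHEN question = 'q1' THEN answer END) AS q1,   -- one such expression per question
       MAX(CASE WHEN question = 'q2' THEN answer END) AS q2
FROM survey_answers
GROUP BY record;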

Does splitting up data between multiple (N) tables result in linear or exponential slowdown during querying? [closed]

Question
I am currently wondering about the performance implications of splitting up data between multiple tables.
Specifically, I am wondering how the number of tables accessed during a query (via multiple joins) impacts query time - and whether the slowdown usually grows in a linear fashion O(c*n) with the number of tables n, or whether the slowdown usually grows in an exponential fashion O(c^n).
TLDR: By having more joins, can I typically expect a linear growth in query time, or an exponential one?
*Footnote: I understand that this question depends on a number of different factors (e.g. table schema, number of rows, query type); however, I am asking for a general rule of thumb.
Example
Let's say we are tasked with retrieving information about people and their respective occupations from the following database. Presumably, we will need to perform a join in our select statement.
SCENARIO A)
Person_Table
| (PK) ID | Name  | Age | Race  |
|---------|-------|-----|-------|
| 0       | Jack  | 24  | Asian |
| 1000    | Tom   | 35  | White |
| 2000    | Robin | 11  | White |
| ...     | ...   | ... | ...   |
Occupation_Table
| (PK) ID | (FK) PID | Job     |
|---------|----------|---------|
| 0       | 0        | Cook    |
| 1       | 1000     | Cook    |
| 2       | 2000     | Teacher |
| ...     | ...      | ...     |
Now consider this slightly different database schema representing the same data. With this one, we will not have to perform a join.
SCENARIO B)
Person_Table
| (PK) ID | Name  | Age | Race  | Job     |
|---------|-------|-----|-------|---------|
| 0       | Jack  | 24  | Asian | Cook    |
| 1000    | Tom   | 35  | White | Cook    |
| 2000    | Robin | 11  | White | Teacher |
| ...     | ...   | ... | ...   | ...     |
How will performance compare between these two tables?
Will SCENARIO B be faster by a factor of c*2?
Will SCENARIO B be faster by a factor of c^2?
Will SCENARIO B be more or less the same as SCENARIO A?
How will these differences generalize to more extreme examples involving 3/4/5/etc distinct tables & joins?
*Footnote: In my examples - (PK) stands for primary key, (FK) stands for Foreign key
Query performance is related much more to the volume of data being processed than to the number of tables.
The volume is basically in three categories:
- rows that need to be read from permanent storage
- rows that need to be written to permanent storage
- intermediate movement of data, to support aggregations and joins
In your example, the persons and occupations tables appear to be "vertical partitions" of the data. That is, a single record has columns split across different tables.
In such a scenario, a query on all columns will be slower in the multiple-table version. However, a query that needs only a subset of the columns might only have to read one of the tables and would be quicker.
In any reasonable schema, an index would link the two tables. So the two-table approach has to read slightly more data and do an index lookup; it will be slower than the one-table version by some constant factor for the query you specify.
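To make the comparison concrete, these are the two queries being weighed against each other (table and column names come from the example; the foreign-key index on Occupation_Table.PID is an assumption):
-- Scenario A: two tables, joined on the foreign key
SELECT p.Name, p.Age, p.Race, o.Job
FROM Person_Table AS p
JOIN Occupation_Table AS o ON o.PID = p.ID;

-- Scenario B: one wide table, no join needed
SELECT Name, Age, Race, Job
FROM Person_Table;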
I don't think you would find a significant difference in your scenario, if only because the number of jobs is limited and you can use an inner join (most people have only one job).
Problems start with big tables and left/right (outer) joins, which need lots of memory and swap space and where you don't reduce the number of rows.
And when you have to do a left join and a right join and union them to get a full outer join, you understand why MySQL hasn't implemented it.
In short: as long as your database is small, you don't have lots of connections, and the server hardware is decent enough, you won't feel a thing.

Database design: I want a column value to determine which table to query

I don't have much experience in designing databases. I want a column value to determine which table to query, and I don't know if there is a better method for this. Here is the concrete problem for better understanding:
I am designing a database for a survey creator application. I want to store different kinds of questions (for example, multiple-choice questions and basic text questions). I have the following tables:
QUESTION
| ID | Title                        | TypeID |
|----|------------------------------|--------|
| 1  | "Pick a num from 1-10"       | 1      |
| 2  | "Choose some from the list:" | 2      |
TYPE
| ID | Name            | ExtraValues |
|----|-----------------|-------------|
| 1  | Scale Question  | ScaleValues |
| 2  | Multiple Choice | MultiValues |
SCALE VALUES
| Question_ID | Min | Max |
|-------------|-----|-----|
| 1           | 1   | 10  |
MULTI VALUES
| Question_ID | Name  | Value |
|-------------|-------|-------|
| 2           | Sugar | 10    |
| 2           | Milk  | 20    |
| 2           | Egg   | 14    |
So, if a question is a "Multiple Choice" type, I want to check the MULTI VALUES table, and otherwise the SCALE VALUES table. I can do it with a stored procedure, or I can simply query all of the value tables (SCALE VALUES, MULTI VALUES, ...) for the Question_ID. But is there a better way to do it?
You can certainly design your database that way. However, you can't grab the "ExtraValues" column in a query and have it automagically pull that table into the query, not without dynamically executed SQL. Your best bet is to use branching logic on the question type and let that determine where to get the other related data.
You could also move the Min and Max fields into the QUESTION table and do away with the ScaleValues table completely; you could just set them to NULL if it's a multiple-choice question.
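As a hedged sketch, that branching can also be expressed in a single query by left-joining each values table and letting TypeID, rather than the ExtraValues text, decide which side is populated (the spaces in the table names are replaced with underscores here as an assumption):
SELECT q.ID, q.Title, q.TypeID,
       sv.Min, sv.Max,            -- filled only for scale questions (TypeID = 1)
       mv.Name, mv.Value          -- filled only for multiple choice questions (TypeID = 2)
FROM QUESTION AS q
LEFT JOIN SCALE_VALUES AS sv ON sv.Question_ID = q.ID AND q.TypeID = 1
LEFT JOIN MULTI_VALUES AS mv ON mv.Question_ID = q.ID AND q.TypeID = 2
WHERE q.ID = 2;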
I think there is definitely a better way to do it. Set up a many-to-many relationship between questions and available answers, with a third column named Points. So your three tables would be:
Question - QuestionId and Text
Answer - AnswerId and Text
QuestionAnswer - QuestionId, AnswerId, and Points.
Award 0 points for wrong answers.
This design might be too simple. You might need a Test table as well; then you would need a TestId field in that many-to-many table, which would now be called TestQuestionAnswer.
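A minimal sketch of that many-to-many design (table and column names follow the answer; the types and lengths are assumptions):
CREATE TABLE Question (
    QuestionId INT PRIMARY KEY,
    Text       VARCHAR(255)
);

CREATE TABLE Answer (
    AnswerId INT PRIMARY KEY,
    Text     VARCHAR(255)
);

CREATE TABLE QuestionAnswer (
    QuestionId INT REFERENCES Question (QuestionId),
    AnswerId   INT REFERENCES Answer (AnswerId),
    Points     INT,                        -- award 0 points for wrong answers
    PRIMARY KEY (QuestionId, AnswerId)
);
The composite primary key on QuestionAnswer enforces the many-to-many link; adding the optional Test table would simply mean widening that key with a TestId column.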