Optimal way to store statuses in DB [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 12 months ago.
I would like to have your opinion on the "best" way to manage the storage of different statuses in my DB.
Currently, when I have a new status type, e.g. "status registration file", "status refund request", "status transfers to another system", I create a new table for each of these types of status, usually with an ID and a label field, then I join the created table.
I was told that this was a no-no and an amateur way of working: that only one table should be used, that my approach multiplied tables unnecessarily and that, moreover, it was bad for performance. Fewer tables = more performance.
From my point of view, the advantages I find in creating one table per status type:
allows me to add information/columns as needed (active/inactive status, additional IDs with letters or strings, descriptions, translations...), in short, information that is not necessary for most statuses.
facilitates queries with IDEs (no need to specify the ID of the type of status to be taken into account in a query)
ease of data retrieval with Doctrine, for the same reason.
The negative point:
a table and a join to be created for each new status type.
Depending on my projects, I have 2/3 to a dozen tables to manage.
What do you think about it?
Is it bad for SQL performance/cache to have many tables (more than 100)?
Thanks in advance for your answers.

When we think of statuses we tend to either think of a series of events like 'prepared' -> 'running' -> 'finished' or of mere booleans (married = yes/no, active = yes/no). If we need this in combination with dates, we can use status history tables that show when a status changed.
But this is not what you have in mind. Your statuses come with data. When you talk about "status registration file", some registration file got involved and you want to store this with the product, order or whatever. And once you store this file (or the file's path) this implies a certain status.
Depending on what you have to store, you'll add a column or a table and maybe even a status (the registration file being unchecked, approved, dismissed).
If I have a table of employees, I may store a column driving_licence_photo. All employees that have a driving licence photo in the table are then allowed to drive the company's cars. The status ("they have a driving licence") is implicit.
If I have a table of employees and they can have various certificates, I may create a table employee_certificate and this table may have a certificate type, a certificate number and maybe even a status "pending" / "achieved".
If I have a table of employees and want to know their working status ('active', 'pausing', 'retired', 'on sick leave', ...), I will probably create a table work_status and give the employee table a work_status_id.
So, the answer is: It depends.
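The work_status case can be sketched as follows. This is only a minimal illustration using Python's built-in sqlite3 module with an in-memory database; the column names label, employee_id and name, and the sample employees, are assumptions made for the example.

```python
import sqlite3

# In-memory SQLite database, used purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked to

conn.executescript("""
CREATE TABLE work_status (
    work_status_id INTEGER PRIMARY KEY,
    label          TEXT NOT NULL UNIQUE          -- assumed column name
);
CREATE TABLE employee (
    employee_id    INTEGER PRIMARY KEY,          -- assumed column name
    name           TEXT NOT NULL,                -- assumed column name
    work_status_id INTEGER NOT NULL REFERENCES work_status(work_status_id)
);
INSERT INTO work_status (work_status_id, label) VALUES
    (1, 'active'), (2, 'pausing'), (3, 'retired'), (4, 'on sick leave');
-- Hypothetical sample employees.
INSERT INTO employee (employee_id, name, work_status_id) VALUES
    (1, 'Alice', 1), (2, 'Bob', 4);
""")

# Resolve each employee's status label through the lookup table.
rows = conn.execute("""
    SELECT e.name, s.label
    FROM employee AS e
    JOIN work_status AS s ON s.work_status_id = e.work_status_id
    ORDER BY e.employee_id
""").fetchall()
print(rows)
```

With the foreign key in place, an employee row cannot point at a status that does not exist in work_status.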

Related

Database design - single subtype in a separate table vs common table [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 1 year ago.
In a food delivery app I have a USERS table that holds info about user data like firstname, lastname, email, password etc.
A small subset of users (~1% of all users) will have a delivery person role assigned. This means that there will be some delivery person-specific data like driver license id, average_rating and some more.
What I'm unsure of is what's better: have one USERS table that holds all the data (which means that for the majority of users the delivery-person specific columns will be null) or have a subtype table (DELIVERY_PERSON) that will hold those columns and a foreign key to USERS table?
Option #1
USERS:
id(PK)
email
password
name
...
driver_license_id (null for all regular users)
avg_rating (null for all regular users)
more delivery person specific columns
Option #2
USERS:
id(PK)
email
password
name
DELIVERY_PERSON:
id (PK, FK to USERS.id)
driver_license_id
avg_rating
more delivery person specific columns
I've seen several similar questions on SO, but in all of them there are multiple subtypes like Vehicle -> Car/Airplane/Boat etc.
In my scenario there is one base type (user) and only one possible extending subtype (delivery person). I'm wondering if having only one possible subtype somehow affects what option to choose.

The "clean" implementation of subtypes is, in my perception, to create a separate table for each subtype and a common one for the supertype.
This avoids complex integrity conditions and reduces the number of null values.
To illustrate how complex integrity conditions arise, just imagine you have a supertype and a subtype with 10 additional mandatory properties ("columns") and several optional properties.
Now if a single optional property is non-null, the 10 additional mandatory properties must be non-null as well.
This gets worse if you imagine you have 12 subtypes instead of just one.
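As a rough illustration of such an integrity condition in the single-table design, here is a minimal sketch using Python's built-in sqlite3 module. It borrows driver_license_id and avg_rating from the question; treating driver_license_id as the mandatory subtype column and avg_rating as the optional one is an assumption made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only

# Single-table design: the subtype columns are nullable, so a CHECK constraint
# has to express "if the optional column is set, the mandatory one must be too".
conn.execute("""
CREATE TABLE users (
    id                INTEGER PRIMARY KEY,
    email             TEXT NOT NULL,
    driver_license_id TEXT,   -- assumed mandatory for delivery persons
    avg_rating        REAL,   -- assumed optional for delivery persons
    CHECK (avg_rating IS NULL OR driver_license_id IS NOT NULL)
)
""")

# A regular user and a complete delivery person are both accepted.
conn.execute("INSERT INTO users (email) VALUES ('regular@example.com')")
conn.execute(
    "INSERT INTO users (email, driver_license_id, avg_rating) "
    "VALUES ('driver@example.com', 'DL-1', 4.5)"
)

# A rating without a licence violates the condition and is rejected.
try:
    conn.execute("INSERT INTO users (email, avg_rating) VALUES ('broken@example.com', 4.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

user_count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
print(rejected, user_count)
```

With 10 mandatory columns and several optional ones, the CHECK expression grows accordingly, and each further subtype adds its own set of conditions.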
On the other hand, if you store everything in a single table, you don't have to perform joins. This is a performance advantage that can add up if you often need the additional columns.
Naturally this is only partially true. If you have many subtypes, the rows will be long. This reduces the effectiveness of your data cache.
If your application doesn't need the additional information very often, it is probably better to keep a separate table for the additional columns. If it needs all the information all the time, you'll probably be better off with a single table containing everything.
In short: there is no general answer to your question.
The best approach will be to make a guess based on your application and my considerations. You then implement this and test if the performance of your implementation meets your requirements. If so, you have a valid implementation. If not, try the other strategy.
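For comparison, the separate-table strategy (Option #2 in the question) might look like the sketch below, again using Python's built-in sqlite3 module. The sample rows are hypothetical, and the NOT NULL on driver_license_id is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE users (
    id    INTEGER PRIMARY KEY,
    email TEXT NOT NULL,
    name  TEXT NOT NULL
);
-- Subtype table: its primary key is also a foreign key to users.id,
-- so each user has at most one delivery_person row.
CREATE TABLE delivery_person (
    id                INTEGER PRIMARY KEY REFERENCES users(id),
    driver_license_id TEXT NOT NULL,   -- mandatory columns can simply be NOT NULL here
    avg_rating        REAL
);
-- Hypothetical sample data.
INSERT INTO users (id, email, name) VALUES
    (1, 'ann@example.com', 'Ann'),
    (2, 'dan@example.com', 'Dan');
INSERT INTO delivery_person (id, driver_license_id, avg_rating) VALUES
    (2, 'DL-42', 4.8);
""")

# The join is only needed when the delivery-specific columns are wanted.
all_users = conn.execute("""
    SELECT u.name, d.driver_license_id
    FROM users AS u
    LEFT JOIN delivery_person AS d ON d.id = u.id
    ORDER BY u.id
""").fetchall()
print(all_users)
```

Queries that only touch users never pay for the join, which is the trade-off described above.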

Should I Create a Table or a Query? [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
OK, I have an Access DB with an Items table and a Students table that contains the monthly subscription fee. These two are linked in a third table, "Payments", which gathers the data for a student (fee + items) and sums it. But that table only keeps the values, not the descriptions. Payments are irregular (the student doesn't need to pay everything on the same day), so the student's item debt has to be reduced as he pays, and I need to keep track of that. So, should I create a new table that copies the data from the two other tables and make the changes in this new one, or just use a query to show the data and make the changes in the "main" table? I'm a bit lost and confused here, so sorry for the mess.

You need to read a beginner's text about database design before you go any further with this project, imo. The first item found by googling "relational database tutorial" is
http://www3.ntu.edu.sg/home/ehchua/programming/sql/relational_database_design.html
see the section "Create Relationships among Tables". There are countless other tutorials online.
As a rule, you don't copy data from one table to another. A piece of information like an item's description or a user's name should only be stored in one place in a database. When you need it in the context of relating it to data in another table (e.g. to display the description of an entry in the Items table with the cost amount in the Fees or Payments table), you look it up rather than copy it.
The way to deal with a student having arbitrarily many items is to have a "link" table that mainly stores only a unique identifier of the student and a unique identifier of the item. Usually, these would be numeric identifiers that are assigned as new students, items or other entities are added to the DB.
The point of having a link table is that there is no practical limit to the number of items that can be associated with a particular student.
You can add a column to the link table to relate the student and one or more instances of the same item to particular bills (or orders, or whatever it is that your DB is modelling).
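The link-table idea can be sketched as follows. The question uses Access, so this is only an approximation of the same structure using Python's built-in sqlite3 module; all table names, column names and sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- All names below are hypothetical.
CREATE TABLE students (
    student_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL
);
CREATE TABLE items (
    item_id     INTEGER PRIMARY KEY,
    description TEXT NOT NULL,      -- stored once, here only
    price_cents INTEGER NOT NULL
);
-- Link table: just the two identifiers.
CREATE TABLE student_items (
    student_id INTEGER NOT NULL REFERENCES students(student_id),
    item_id    INTEGER NOT NULL REFERENCES items(item_id),
    PRIMARY KEY (student_id, item_id)
);
INSERT INTO students (student_id, name) VALUES (1, 'Maria'), (2, 'Joao');
INSERT INTO items (item_id, description, price_cents) VALUES
    (1, 'Uniform', 5000), (2, 'Textbook', 3000);
INSERT INTO student_items (student_id, item_id) VALUES (1, 1), (1, 2), (2, 2);
""")

# Descriptions are looked up through the link table, never copied.
rows = conn.execute("""
    SELECT s.name, i.description, i.price_cents
    FROM student_items AS si
    JOIN students AS s ON s.student_id = si.student_id
    JOIN items    AS i ON i.item_id    = si.item_id
    ORDER BY s.student_id, i.item_id
""").fetchall()
print(rows)
```

A saved query of this shape shows descriptions next to amounts without duplicating either into another table.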

Designing a database, need to know if I'm doing it correctly [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 7 years ago.
I have a website I'm building for a friend's business and I am wondering how to go about storing the data. So far I have designed this diagram. Let me give an example:
Say there are 2 events in a karate tournament: Sparring and Forms. Each event can have its own divisions: Sparring 4-6 year olds, Sparring 8-10 year olds, etc. And each student can sign up for either 1 or all events.
My question is: does the image below suffice for what I just explained in the example, minus the cardinality?
My second question is: what is the actual database going to look like? Right now, I can think of the following tables to add:
students
events
divisions
student_divisions (student_id, division_id). Is this correct? I need to be able to store multiple divisions for one student.
Thanks, any pointers to help me be a better designer would be helpful.

Keep it simple, make it fun.
Lose the bare "id" term (make it student_id, division_id or event_id). Use terms that make it really clear what is being identified. Same for "name": is that student_name or event_name?
Put as much detail as possible on each student (student_id, current_belt_ranking, date_of_birth (--> age), student_name).
Events (event_name, division_id, date, location). I'd make the key the event_name/division_id pair or, as an alternative, a combined name such as "Sparring_8-10" or "Forms_4-6".
Divisions (division_id, other stuff as indicated)
student/events table (student_id matched with the event_name/division_id pair)
Analyse for first, second and third normal form.
First Normal Form (1NF): In plain English, no row of data can have repeating elements. All occurrences of a record type must contain the same number of fields. (Example: you would NOT put the events a given student was signed up for within the student table. Put that in a separate table.)
Second Normal Form (2NF): In plain words, are there any data elements in a single row that are only dependent on a portion of the concatenated primary key? If so, move those elements into an additional table. (Example: you wouldn't have student name, age and date_of_birth within the student/events table.)
Third Normal Form (3NF): Every non-prime attribute of a table is non-transitively dependent on every candidate key of that table. 3NF is violated when a non-key field is a fact about another non-key field. (Yeah, like does that even make sense? Frankly, I can't provide an example of this from your design. With your system, I don't think any table has enough fields to even get close to this violation. Remember, 3NF deals with non-key fields relating to other non-key fields.)
Give it a shot, then build your queries and see if they make sense?
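One way to give it a shot is sketched below with Python's built-in sqlite3 module. It follows the four tables listed in the question, with the explicit id and name columns suggested above. Putting event_id on divisions, and the sample rows, are assumptions made for the example, not the only possible layout.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE students (
    student_id   INTEGER PRIMARY KEY,
    student_name TEXT NOT NULL
);
CREATE TABLE events (
    event_id   INTEGER PRIMARY KEY,
    event_name TEXT NOT NULL
);
-- Assumption: each division belongs to exactly one event.
CREATE TABLE divisions (
    division_id   INTEGER PRIMARY KEY,
    event_id      INTEGER NOT NULL REFERENCES events(event_id),
    division_name TEXT NOT NULL
);
-- One row per sign-up, so a student can be in many divisions (1NF).
CREATE TABLE student_divisions (
    student_id  INTEGER NOT NULL REFERENCES students(student_id),
    division_id INTEGER NOT NULL REFERENCES divisions(division_id),
    PRIMARY KEY (student_id, division_id)
);
-- Hypothetical sample data.
INSERT INTO students (student_id, student_name) VALUES (1, 'Sam');
INSERT INTO events (event_id, event_name) VALUES (1, 'Sparring'), (2, 'Forms');
INSERT INTO divisions (division_id, event_id, division_name) VALUES
    (1, 1, '4-6 year olds'), (2, 1, '8-10 year olds'), (3, 2, '4-6 year olds');
INSERT INTO student_divisions (student_id, division_id) VALUES (1, 1), (1, 3);
""")

# Which events and divisions is student 1 signed up for?
signups = conn.execute("""
    SELECT e.event_name, d.division_name
    FROM student_divisions AS sd
    JOIN divisions AS d ON d.division_id = sd.division_id
    JOIN events    AS e ON e.event_id    = d.event_id
    WHERE sd.student_id = 1
    ORDER BY d.division_id
""").fetchall()
print(signups)
```

If the sign-up query reads naturally, as it does here, that is a reasonable sign the tables are split sensibly.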

When to use separate SQL database tables for two slightly different types of information? [closed]

Closed. This question is opinion-based. It is not currently accepting answers.
Closed 4 years ago.
I need help with an SQL decision that has confused me for a while.
I'm trying to make a short story website where users can write their own stories and can browse each other's, etc. I've also got a collection of classic short stories written by great writers from the past. I'm confused as to whether I should store both types of story in the same database table.
I want to keep the two types of stories (classic authors/users) distinct to some degree, since you should be able to search the website and filter out user stories from the results. But I can't just have a single column in the table to represent this, i.e. a boolean CLASSIC, since with classic short stories several of the other columns would be different too: there is no user, and the date would be YYYY (e.g. 1869) instead of the full datetime when the user submitted it.
Yet I can't quite justify putting them in separate tables either. When most of the attributes are the same, should I really have two different database tables for short stories? At the moment I am filling in NULL in the user column for classic short stories, and my filtered search has an option to search only through classics, which selects from the database where user is NULL. This seems to hit performance, though, when you're searching through a huge database of potentially millions of user stories just to find a few thousand classic stories.
Note that there are other tables too, like tags for the stories, linked to the short stories table.
So I'm basically asking you SQL experts - is there enough justification for separating the two types of information into different tables? I'm currently using SQLite in development but will switch to MySQL or PostgreSQL later.

I'd probably go with a "parent-child" table structure, where you have matching primary keys across tables, something like:
Stories: StoryId (PK), StoryType (U or C), StoryText, etc. (all of the shared stuff)
UserStories: StoryId (PK and FK), UserId, etc.
ClassicStories: StoryId (PK and FK), AuthorName, etc.
Then if you want, you can build two views around them:
V_UserStories: StoryId, StoryText, UserId, etc.
V_ClassicStories: StoryId, StoryText, AuthorName, etc.
With this setup, you're not wasting any columns, you're keeping shared stuff together, while still keeping the two types of stories easily logically separate if you need them.
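A minimal sketch of this parent-child layout, using Python's built-in sqlite3 module, could look like this. The table, view and column names come from the outline above; the column types, the CHECK on StoryType and the sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- Parent table with everything the two story types share.
CREATE TABLE Stories (
    StoryId   INTEGER PRIMARY KEY,
    StoryType TEXT NOT NULL CHECK (StoryType IN ('U', 'C')),
    StoryText TEXT NOT NULL
);
-- Child tables share the parent's primary key.
CREATE TABLE UserStories (
    StoryId INTEGER PRIMARY KEY REFERENCES Stories(StoryId),
    UserId  INTEGER NOT NULL
);
CREATE TABLE ClassicStories (
    StoryId    INTEGER PRIMARY KEY REFERENCES Stories(StoryId),
    AuthorName TEXT NOT NULL
);
CREATE VIEW V_UserStories AS
    SELECT s.StoryId, s.StoryText, u.UserId
    FROM Stories AS s JOIN UserStories AS u ON u.StoryId = s.StoryId;
CREATE VIEW V_ClassicStories AS
    SELECT s.StoryId, s.StoryText, c.AuthorName
    FROM Stories AS s JOIN ClassicStories AS c ON c.StoryId = s.StoryId;

-- Hypothetical sample data.
INSERT INTO Stories (StoryId, StoryType, StoryText) VALUES
    (1, 'U', 'A user story...'),
    (2, 'C', 'A classic story...');
INSERT INTO UserStories (StoryId, UserId) VALUES (1, 7);
INSERT INTO ClassicStories (StoryId, AuthorName) VALUES (2, 'A. Author');
""")

# Searching only the classics never touches the user-story rows' extra data.
classics = conn.execute(
    "SELECT StoryId, StoryText, AuthorName FROM V_ClassicStories"
).fetchall()
user_stories = conn.execute(
    "SELECT StoryId, StoryText, UserId FROM V_UserStories"
).fetchall()
print(classics, user_stories)
```

Tables such as tags can keep pointing at Stories.StoryId regardless of the story type.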

To make such a decision, you have to consider whether the field you want to add belongs to that table only and to nothing else.
For example:
Take a story and its type. If a story can have several types and/or a type can apply to several stories, then yes, you must create a separate table for story types. But if a type concerns only one story, then you put the type information (name, description, etc.) directly into the stories table.

Good Software Engineering Concepts: Messed Up Database vs Everything in one [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 6 years ago.
First of all, I will explain the situation. Please read it carefully.
I am creating a Java phonebook application. I created a database with the fields Name, address, mobile1, mobile2, landPh1, landPh2, etc. When it was 90% complete, I decided to expand its functionality. As a result, I started working with VCards; the program can now read VCards and add them to the DB. Then I decided to write VCards using the data stored in the database. This is where the problem occurs!
VCards don't have a field called "Name", as I have in my software. Instead, they have "First Name", "Last Name" and "Middle Name". VCards don't have "Address" as I have in my software, either. They have "country", "city" and "street address". Now, how can I get these SEPARATE details? I can get only the name, not the first, last, etc. I can get only the address, not the country, city, etc. What can I do? Below are my suggestions:
Get the complete name and set it as the VCard's "first name" field, so the complete name ends up there. For the address, add the complete address to the VCard's "Street Address" field.
Edit the database: alter it and add the missing fields. But then the DB will look like this:
firstName, Address, mobile1, mobile2, landPh1, landPh2, middleName, lastName, country, city
Pretty messy, isn't it?
I am unable to drop the table because a lot of stuff has been built on the current format! Changing it will take a lot of time!
I don't know whether the above suggestions are in line with good software engineering concepts. If you have a better way, I am glad to hear that too. Please help!

You made a number of mistakes in your original database design. You should correct those mistakes at the earliest possible time as the longer you maintain the system with the design flaws the more difficult it will be to correct them later.
In short you need to:
Ensure that each column contains one and only one piece of information. That means separate columns for the parts of the name, separate columns for the parts of the address, etc.
You need to ensure that you are not storing multiple instances of the same item in a single record. That means creating a separate table for the phone numbers. Most likely this table will have three columns, an ID to point back to the contact person, a column for the phone number, and a description.
You will never be able to accurately "decode" 100% of the possible addresses and names.
You can read more about the rules for good database design by googling database normalization.
Don't worry about the order of the columns in a table, or the records in the table. SQL does not contain a concept of default ordering; instead, you order the columns and records as you want on retrieval.
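The points above can be sketched roughly as follows, using Python's built-in sqlite3 module. The contacts and phone_numbers table names, the column names and the sample contact are all hypothetical. The phone table follows the three-column description given earlier.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
-- One piece of information per column (all names are hypothetical).
CREATE TABLE contacts (
    contact_id     INTEGER PRIMARY KEY,
    first_name     TEXT NOT NULL,
    middle_name    TEXT,
    last_name      TEXT,
    street_address TEXT,
    city           TEXT,
    country        TEXT
);
-- One row per phone number instead of mobile1, mobile2, landPh1, ...
CREATE TABLE phone_numbers (
    contact_id   INTEGER NOT NULL REFERENCES contacts(contact_id),
    phone_number TEXT NOT NULL,
    description  TEXT NOT NULL
);
-- Hypothetical sample contact.
INSERT INTO contacts (contact_id, first_name, last_name, city, country)
VALUES (1, 'Jane', 'Doe', 'Springfield', 'Exampleland');
INSERT INTO phone_numbers (contact_id, phone_number, description) VALUES
    (1, '555-0100', 'mobile'),
    (1, '555-0101', 'land');
""")

# Column and row order are chosen at retrieval time, not by the table layout.
phones = conn.execute("""
    SELECT c.last_name, c.first_name, p.description, p.phone_number
    FROM contacts AS c
    JOIN phone_numbers AS p ON p.contact_id = c.contact_id
    ORDER BY c.last_name, p.description
""").fetchall()
print(phones)
```

With the name and address split like this, each vCard field maps to its own column when exporting.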

FWIW, I would:
not worry too much about the physical ordering of columns in the DB. New columns are usually appended at the end in production databases, without recreating the tables.
keep the data normalized, i.e. not combine all the name and address fields into one column, as it is cumbersome to split these later.
consider normalizing the address and vCard data into separate tables which are 1:N with the person/contacts table. This allows multiple vCards and addresses per person.
