storing multiple formats in a table - sql

So here's the basic problem: I'd like to be able to store various fields in a database. These can be short textfields (maybe 150 characters max, probably more like 50 in general) and long textfields (something that can store a whole page full of text). Ideally more types can be added later on.
These fields are grouped by a common field_group id, and their type shouldn't really have anything to do with categorization.
So what's the best way to represent this in MySQL? One table with a short_text and long_text columns of differing types, one of which is to be NULL? Or is there a more elegant solution?
(I'd like this to be driven primarily by how easy it is to select all fields with a given field_group_id.)
Clarification
I'm essentially attempting to allow users to create their own tables, but without actually creating tables.
So you'd have a 'Book' field group, which would have the fields 'Name' (short text), 'Summary' (long text). Then you would be able to create entries into that book. I realize that this is essentially the whole point of MySQL, but I need to have a LOT of these and don't want users creating whole tables in my database.

What you are looking for is called EAV (entity-attribute-value). With an EAV model you can build any freaking database in the world with only inserts. It's really horrible for a lot of reasons, but your requirement sounds so looney-tunes that it could work.
Build an Entity table
In here you'd list
Car
Person
Plant
Build an Attribute Table.
Here you'd list the PK from Entity and the list of attributes.
I'll use the word instead of a number PK.
Car | Engine Cylinders
Car | Doors
Car | Make
Person | First Name
Person | Last Name
Then in a third table you'd list the actual values for each one. Again I'm using the words here, but in practice you'd store the numeric keys.
Car | Engine Cylinders | 4
Car | Doors | 4
Car | Make | Honda
Person | First Name | Stephanie
Person | Last Name | Page
If you want to get tricky, instead of one column for the value you could have 4 columns:
a number
a varchar
a date
a clob
Then in the Attribute table you could add a column that says which of those columns holds the data.
If you plan on this database being "multitenant" you'll need to add an OWNER table as the parent of the Entity table, so you and I could both have a Car entity.
But this SUCKS to query, SUCKS to index, SUCKS to use for anything else but a toy app.
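A minimal sketch of the three tables described above, in case it helps to see them as actual DDL. It uses SQLite through Python's sqlite3 module only so that it runs as-is; the table and column names (entity, attribute, attribute_value) are made up for illustration, and like the outline above it holds a single set of values per entity, so a real design would need some record id on top to store more than one Car.

```python
import sqlite3

# Hypothetical names throughout; SQLite is used only to keep the sketch runnable.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE entity (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE attribute (
    id        INTEGER PRIMARY KEY,
    entity_id INTEGER NOT NULL REFERENCES entity(id),
    name      TEXT NOT NULL,
    UNIQUE (entity_id, name)
);
CREATE TABLE attribute_value (
    id           INTEGER PRIMARY KEY,
    attribute_id INTEGER NOT NULL REFERENCES attribute(id),
    value        TEXT
);
INSERT INTO entity (id, name) VALUES (1, 'Car'), (2, 'Person');
INSERT INTO attribute (id, entity_id, name) VALUES
    (1, 1, 'Engine Cylinders'), (2, 1, 'Doors'), (3, 1, 'Make'),
    (4, 2, 'First Name'), (5, 2, 'Last Name');
INSERT INTO attribute_value (attribute_id, value) VALUES
    (1, '4'), (2, '4'), (3, 'Honda'), (4, 'Stephanie'), (5, 'Page');
""")


def values_for(entity_name):
    """Return (attribute name, value) pairs for one entity, in attribute order."""
    rows = conn.execute(
        """
        SELECT a.name, v.value
        FROM entity e
        JOIN attribute a ON a.entity_id = e.id
        JOIN attribute_value v ON v.attribute_id = a.id
        WHERE e.name = ?
        ORDER BY a.id
        """,
        (entity_name,),
    )
    return rows.fetchall()


car_fields = values_for("Car")
person_fields = values_for("Person")
```

Note that even this simple lookup already needs two joins, which is part of why the model is painful to query.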

I don't know exactly what you mean by "field group", but if the information (short text, long text) all belongs to a certain entry, you can create a single table and include all those columns.
Say you have a bunch of books with a title and a summary:
table: `books`
- id, int(11) // unique for each book
- title, varchar(255)
- writer, varchar(50)
- summary, text
- etc
Fields that don't necessarily need to be set can be set to NULL by default.
To retrieve the information, simply select all the fields:
SELECT * FROM books WHERE id = 1
Or some of the fields:
SELECT title, writer FROM books ORDER BY title ASC

Related

How to handle this type of Oracle SQL issue

Just have a question for writing SQL.
In ORACLE DB, I have rows of different apples in one "APPLE" TABLE, where the "TAGS" holds all the features of this type of apple. For example:
NAME, TAGS
-----------
APPLE1, FUJI BOXED MEDIUM CALIFORNIA ...
APPLE2, ORGANIC GALA PER_POUND LARGE FLORIDA ...
APPLE3, RED_DELICIOUS MEDIA PACKED ORGANIC ...
APPLE4, LARGE RED_DELICIOUS Mexico ....
APPLE5, PACKED FUJI MEXICO LARGE
Now I want a SQL query that finds all rows with any of the given tag values, for example "FUJI MEDIUM MEXICO". What would this SQL look like?
This is related to one project I am working on. IN DB, the reason why I have one "TAG" COLUMN to keep all the features, instead of having separate columns, is because we know more and more new tag values will be introduced, so instead of adding more and more columns, we would like to keep them in one column, so that the code does not need to change every time.
Thanks,
Jack
You could redesign the table so it looks like this:
name | tag
----------
Apple1| FUJI
Apple1| BOXED
...
Apple5| PACKED
Apple5| FUJI
Then to find all items with tags fuji, medium OR mexico you could do this:
SELECT name from tags where tag in ('FUJI','MEDIUM','MEXICO')
GROUP BY name
You could find all items with tags fuji, medium AND mexico with:
SELECT name from tags where tag in ('FUJI','MEDIUM','MEXICO')
GROUP BY name
HAVING count(tag) = 3
(assuming (name,tag) is unique)
This works for any number of tags. Also makes removing tags from items much easier, and allows you to join and sort on the tags too.
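Here is a small runnable sketch of the one-row-per-tag layout and the two queries above. It runs against SQLite via Python's sqlite3 rather than Oracle, purely for convenience; the APPLE6 row is made up so that the AND query has something to return, and the primary key on (name, tag) is what backs the uniqueness assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tags (
    name TEXT NOT NULL,
    tag  TEXT NOT NULL,
    PRIMARY KEY (name, tag)   -- enforces the (name, tag) uniqueness the count relies on
);
""")

apples = {
    "APPLE1": ["FUJI", "BOXED", "MEDIUM", "CALIFORNIA"],
    "APPLE4": ["LARGE", "RED_DELICIOUS", "MEXICO"],
    "APPLE5": ["PACKED", "FUJI", "MEXICO", "LARGE"],
    "APPLE6": ["FUJI", "MEDIUM", "MEXICO"],  # made-up row, not from the question
}
conn.executemany(
    "INSERT INTO tags (name, tag) VALUES (?, ?)",
    [(name, tag) for name, tag_list in apples.items() for tag in tag_list],
)

wanted = ["FUJI", "MEDIUM", "MEXICO"]
placeholders = ", ".join("?" for _ in wanted)  # one bind parameter per tag

# Apples carrying at least one of the wanted tags (the OR case).
any_match = [row[0] for row in conn.execute(
    f"SELECT name FROM tags WHERE tag IN ({placeholders}) "
    "GROUP BY name ORDER BY name",
    wanted,
)]

# Apples carrying every wanted tag (the AND case).
all_match = [row[0] for row in conn.execute(
    f"SELECT name FROM tags WHERE tag IN ({placeholders}) "
    "GROUP BY name HAVING COUNT(tag) = ? ORDER BY name",
    wanted + [len(wanted)],
)]
```

Passing len(wanted) as the HAVING threshold keeps the AND query correct when the number of tags changes.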
I assume that by "FUJI MEDIUM MEXICO" you mean that you want to select apples that are tagged with "FUJI" and "MEDIUM" and "MEXICO", in any order. In that case, the following query would work:
Select name From apple
Where tags like '%FUJI%'
And tags like '%MEDIUM%'
And tags like '%MEXICO%';
As others have mentioned, if you want a case-insensitive search, then you would want to add appropriate Upper or Lower functions, like so:
Select name From apple
Where Upper(tags) like '%FUJI%'
And Upper(tags) like '%MEDIUM%'
And Upper(tags) like '%MEXICO%';
For the sake of efficiency, tags should be stored as completely upper case or completely lower case. This would eliminate the need to call the Upper() or Lower() function on the tag value of each row, which could save a lot of time if the data set were very large.
Better design will be your friend here.
Three tables:
CREATE TABLE APPLE_TYPE
(APPLE_TYPE VARCHAR2(100) PRIMARY KEY);
CREATE TABLE APPLE_ATTRIBUTES
(ATTRIBUTE_TYPE VARCHAR2(100) PRIMARY KEY);
CREATE TABLE APPLES
(APPLE_ID NUMBER,
APPLE_TYPE VARCHAR2(100)
CONSTRAINT APPLES_FK1
REFERENCES APPLE_TYPE(APPLE_TYPE)
ON DELETE CASCADE,
ATTRIBUTE_TYPE VARCHAR2(100)
CONSTRAINT APPLES_FK2
REFERENCES APPLE_ATTRIBUTES(ATTRIBUTE_TYPE)
ON DELETE NO ACTION);
Best of luck.
Bad table design aside, this can be accomplished using a like evaluation.
select
name,
tags
from
apple
where
lower(tags) like '%tag_here%'
I've used the lower() function here to make dealing with string casing easier. When you replace tag_here do so with all lowercase characters.
That being said, you really should improve your database design. This is very inefficient from both a storage and a performance standpoint. A better design would have two different tables. One would store the apples and a second table would store the tags with a foreign key back to the apples table.
I would create some of these tags as columns and create a second table for the "miscellaneous" tags.
Table: Apples
Apple_ID PK
Name
Where_Grown
Size
Table: Apple_Tags
Tag_ID PK
Apple_ID FK
Tag
Index: Apple_Tags.Tag, Apple_ID
Data for Apple 1 is:
Apples Table
ID: 1
Name: Fuji
Where_Grown: California
Size: Medium
Tags Table
Tag_ID: 1
Apple_ID: 1
Tag: Boxed
To find tags:
select * from apples a inner join apple_tags t on a.apple_id = t.apple_id
Notice I'm not storing multiple tags in one column. That breaks first normal form, which requires column values to be atomic. I'm storing them as rows in a separate table. I'm also recognizing that apple name, size, and the place where it's grown are attributes common to all apples.

Database structure, one big entity for multiple entities

Suppose that I have a store website where users can leave comments about any product.
Suppose that I have tables(entities) in my website database: let it be 'Shoes', 'Hats' and 'Skates'.
I don't want to create separate "comments" table for every entity (like 'shoes_comments', 'hats_comments', 'skates_comments').
My idea is to somehow store all the comments in one big table.
One way to do this, that I thought of, is to create a table:
table (comments):
ID (int, Primary Key),
comment (text),
Product_id (int),
isSkates (boolean),
isShoes (boolean),
isHats (boolean)
with a flag like these for every entity that could have comments.
Then when I want to get comments for some product the SELECT query would look like:
SELECT comment
FROM comments, ___SOMETABLE___
WHERE ____SOMEFLAG____ = TRUE
AND ___SOMETABLE___.ID = comments.Product_id
Is this an efficient way to implement database for needed functionality?
What other ways can I do this?
Sorry, this feels odd.
Do you indeed have one separate table for each product type? Don't they have common fields (e.g. name, description, price, product image, etc.)?
My recommendation for the tables: product for the common fields, comments with a foreign key to product but no hasX columns, and hat with only the fields that are specific to the hat product line. The primary key in hat is either the product PK or an individual unique value (then you'd need an extra field for the foreign key to product).
I would recommend making one table for the comments and giving it a foreign key to the other tables.
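A rough sketch of the product / comments / hat layout suggested above, using SQLite through Python's sqlite3 so it can be run directly. The column names (price, brim_width_cm) and the sample rows are invented for illustration; only the shape of the keys matters here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Fields shared by every product type.
CREATE TABLE product (
    id    INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    price REAL
);
-- Hat-specific fields only; the PK doubles as the FK to product.
CREATE TABLE hat (
    product_id    INTEGER PRIMARY KEY REFERENCES product(id),
    brim_width_cm REAL
);
-- One comments table for all product types, no isHats/isShoes flags.
CREATE TABLE comments (
    id         INTEGER PRIMARY KEY,
    product_id INTEGER NOT NULL REFERENCES product(id),
    comment    TEXT NOT NULL
);
INSERT INTO product (id, name, price) VALUES
    (1, 'Straw hat', 19.5), (2, 'Running shoe', 80.0);
INSERT INTO hat (product_id, brim_width_cm) VALUES (1, 7.5);
INSERT INTO comments (product_id, comment) VALUES
    (1, 'Fits well'), (1, 'Too wide'), (2, 'Very light');
""")


def comments_for(product_id):
    """Return all comment texts for one product, oldest first."""
    rows = conn.execute(
        "SELECT comment FROM comments WHERE product_id = ? ORDER BY id",
        (product_id,),
    )
    return [row[0] for row in rows]


hat_comments = comments_for(1)
shoe_comments = comments_for(2)
```

The comment lookup never needs to know which kind of product it is dealing with, which is the point of moving the shared fields into product.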
The "normalized" way to do this is to add one more entity (say, "Product") that groups all characteristics common to shoes, hats and skates (including comments)
+-- 0..1 [Shoe]
|
[Product] 1 --+-- 0..1 [Hat]
1 |
| +-- 0..1 [Skate]
*
[Comment]
Besides performance considerations, the drawback here is that nothing in the data model prevents a row in Product from being referenced both by a row in Shoe and by one in Hat.
There are other alternatives too (each with perks & flaws) - you might want to read something about "jpa inheritance strategies" - you'll find java-specific articles that discuss your same issue (just ignore the java babbling and read the rest)
Personally, I often end up using a single table for all entities in a hierarchy (shoes, hats and skates in our case) and sacrificing constraints on the altar of performance and simplicity (eg: not null in a field that is mandatory for shoes but not for hats and skates).

Basic question: how to properly redesign this schema

I am hopping on a project that sits on top of a SQL Server 2008 DB with what seems like an inefficient schema to me. However, I'm not an expert at anything SQL, so I am seeking guidance.
In general, the schema has tables like this:
ID | A | B
ID is a unique identifier
A contains text, such as animal names. There's very little variety; maybe 3-4 different values in thousands of rows. This could vary with time, but still a small set.
B is one of two options, but stored as text. The set is finite.
My questions are as follows:
Should I create another table for names contained in A, with an ID and a value, and set the ID as the primary key? Or should I just put an index on that column in my table? Right now, to get a list of A's, it does "select distinct(a) from table" which seems inefficient to me.
The table has a multitude of columns for properties of A. It could be like: Color, Age, Weight, etc. I would think that this is better suited in a separate table with: ID, AnimalID, Property, Value. Each property is unique to the animal, so I'm not sure how this schema could enforce this (the current schema implies this as it's a column, so you can only have one value for each property).
Right now the DB is easily readable by a human, but its size is growing fast and I feel like the design is inefficient. There is currently no index anywhere. As I said I'm not a pro, but will read more on the subject. The goal is to have a fast system. Thanks for your advice!
This sounds like a database that might represent a veterinary clinic.
If the table you describe represents the various patients (animals) that come to the clinic, then the properties specific to them probably belong on the primary table. But since, as you say, column "A" contains a species name, it might be worthwhile to link that to a secondary table to avoid the redundancy of storing those names:
For example:
Patients
--------
ID Name SpeciesID Color DOB Weight
1 Spot 1 Black/White 2008-01-01 20
Species
-------
ID Species
1 Cocker Spaniel
If your main table should be instead grouped by customer or owner, then you may want to add an Animals table and link it:
Customers
---------
ID Name
1 John Q. Sample
Animals
-------
ID CustomerID SpeciesID Name Color DOB Weight
1 1 1 Spot Black/White 2008-01-01 20
...
As for your original column B, consider converting it to a boolean (BIT) if you only need to store two states. Barring that, consider CHAR to store a fixed number of characters.
Like most things, it depends.
By having the animal names directly in the table, it makes your reporting queries more efficient by removing the need for many joins.
Going with something like 3rd normal form (having an ID/Name table for the animals) makes your database smaller, but requires more joins for reporting.
Either way, make sure to add some indexes.
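To make the lookup-table option concrete, here is a small sketch assuming the Patients / Species layout from the first answer. It uses SQLite via Python's sqlite3 instead of SQL Server just to be runnable; the index name and the extra sample rows are invented. The list of species then comes straight from the small lookup table instead of a `select distinct` over the big one.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE species (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE patients (
    id         INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    species_id INTEGER NOT NULL REFERENCES species(id),
    color      TEXT,
    dob        TEXT,
    weight     REAL
);
-- Index the foreign key so filtering patients by species can avoid a full scan.
CREATE INDEX idx_patients_species ON patients (species_id);

INSERT INTO species (id, name) VALUES (1, 'Cocker Spaniel'), (2, 'Siamese');
INSERT INTO patients (name, species_id, color, dob, weight) VALUES
    ('Spot', 1, 'Black/White', '2008-01-01', 20),
    ('Rex',  1, 'Brown',       '2010-05-12', 25),
    ('Mia',  2, 'Cream',       '2012-03-30', 4);
""")

# Replaces "select distinct(a) from table": read the lookup table directly.
species_names = [row[0] for row in conn.execute(
    "SELECT name FROM species ORDER BY name"
)]

# Reporting still works, at the cost of one join.
spaniels = [row[0] for row in conn.execute(
    """
    SELECT p.name
    FROM patients p
    JOIN species s ON s.id = p.species_id
    WHERE s.name = ?
    ORDER BY p.name
    """,
    ("Cocker Spaniel",),
)]
```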

Can this MySQL db be improved or is it good as it is?

In a classifieds website, you have several categories (cars, mc, houses etc).
For every category chosen, a hidden div becomes visible and shows additional options the user may specify if he/she wishes.
I am creating a db now, and I have read some articles about normalization and making it optimized etc...
Here is my layout today:
CATEGORY TABLE:
- cars
- mc
- houses
CLASSIFIED TABLE:
- headline
- description
- hide_telephone_nr
- changeable
- action
- price
- modify_date
POSTER TABLE:
- name
- passw
- tel
- email
AREA TABLE:
- area
- community
CARS TABLE:
- year
- fuel
- gearbox
- colour
MC TABLE:
- year
- type
HOUSE TABLE:
- Villa
- Apartment
- Size
- rooms
etc
I have so far one table for each category, so that is around 30 tables.
Isn't that too many?
I haven't created PK or FK for any of these so far, haven't got that far yet...
Could you tell me if this setup is good, or should I have it made differently?
ALSO, how would you setup the FK and the PK here?
Thanks
From my understanding, I would make a table for all the categories and store the categories' name and ID there. Next, I would create a separate table to store the additional options for each category.
MySQL Table 1
----------------
Category_ID int PRIMARY KEY
Category_name varchar
MySQL Table 2
----------------
Category_ID int
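Entry_Number int (this will keep track of which entry everything belongs to; because one entry spans several rows, the PRIMARY KEY has to be the pair Entry_Number + Additional_Option rather than Entry_Number alone)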
Additional_Option varchar
Additional_Option_Answer varchar (this is the one that stores what your user clicks/inputs)
For example, using:
POSTER TABLE:
- name
- passw
- tel
- email
You would store the category this data belongs to in Category_ID, store each of name, passw, tel and email in Additional_Option in its own row, and store the user's input for each of them in Additional_Option_Answer.
Category_ID for Posters will be 1 and for Area will be 2.
It would look like this if the first user added something:
---------------------------------------------------------------------------------------------
Category_ID | Entry_Number | Additional_Options | Additional_Options_Answers
---------------------------------------------------------------------------------------------
1 | 1 | name | doug
1 | 1 | passw | 1234
It would look like this if the second user added something:
---------------------------------------------------------------------------------------------
Category_ID | Entry_Number | Additional_Options | Additional_Options_Answers
---------------------------------------------------------------------------------------------
1 | 2 | name | Hamlet
1 | 2 | passw | iliketurtles
Furthermore, let's apply another category:
AREA TABLE:
- area
- community
---------------------------------------------------------------------------------------------
Category_ID | Entry_Number | Additional_Options | Additional_Options_Answers
---------------------------------------------------------------------------------------------
2 | 3 | area | San Francisco
2 | 3 | community | community_name
You can recognise a problem with the category tables by the use of data in the table names. The problem with having tables for each category isn't mainly that you get many tables, but that you have to change the database design if you add another category. Also, querying the database is difficult when you need to select a table based on data.
You should have a single table for the posting properties instead of one for each category. As the properties for each category differ, you would also need a table that describes which properties are used for each category.
Tables that describe the main objects (category, classified, poster, area, property) would get a primary key. The other tables only need foreign keys, as they are relations between objects.
Category (CategoryId, CategoryName)
Classified (ClassifiedId, PosterId, AreaId, ...)
Poster (PosterId, ...)
Area (AreaId, AreaName, ...)
Property (PropertyId, PropertyName)
CategoryProperty (CategoryId, PropertyId)
ClassifiedProperty (ClassifiedId, PropertyId, Value)
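As a rough illustration of how the tables listed above fit together, here is a cut-down version in SQLite via Python's sqlite3. Several things are my assumptions rather than part of the answer: the CategoryId and Headline columns on Classified, the composite primary keys on the two relation tables, and the sample data. PosterId and AreaId are left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Category (CategoryId INTEGER PRIMARY KEY, CategoryName TEXT NOT NULL);
CREATE TABLE Property (PropertyId INTEGER PRIMARY KEY, PropertyName TEXT NOT NULL);
CREATE TABLE Classified (
    ClassifiedId INTEGER PRIMARY KEY,
    CategoryId   INTEGER NOT NULL REFERENCES Category(CategoryId),  -- assumed column
    Headline     TEXT                                               -- assumed column
);
-- Which properties a category offers (drives the input form).
CREATE TABLE CategoryProperty (
    CategoryId INTEGER NOT NULL REFERENCES Category(CategoryId),
    PropertyId INTEGER NOT NULL REFERENCES Property(PropertyId),
    PRIMARY KEY (CategoryId, PropertyId)       -- assumption: composite key
);
-- The value a given classified has for a given property.
CREATE TABLE ClassifiedProperty (
    ClassifiedId INTEGER NOT NULL REFERENCES Classified(ClassifiedId),
    PropertyId   INTEGER NOT NULL REFERENCES Property(PropertyId),
    Value        TEXT,
    PRIMARY KEY (ClassifiedId, PropertyId)     -- assumption: composite key
);
INSERT INTO Category VALUES (1, 'cars'), (2, 'mc');
INSERT INTO Property VALUES (1, 'year'), (2, 'fuel'), (3, 'type');
INSERT INTO CategoryProperty VALUES (1, 1), (1, 2), (2, 1), (2, 3);
INSERT INTO Classified VALUES (10, 1, 'Used hatchback');
INSERT INTO ClassifiedProperty VALUES (10, 1, '2005'), (10, 2, 'diesel');
""")


def form_fields(category_name):
    """Property names to show in the form for one category."""
    rows = conn.execute(
        """
        SELECT p.PropertyName
        FROM Category c
        JOIN CategoryProperty cp ON cp.CategoryId = c.CategoryId
        JOIN Property p ON p.PropertyId = cp.PropertyId
        WHERE c.CategoryName = ?
        ORDER BY p.PropertyId
        """,
        (category_name,),
    )
    return [row[0] for row in rows]


def properties_of(classified_id):
    """(property name, value) pairs stored for one classified."""
    rows = conn.execute(
        """
        SELECT p.PropertyName, cp.Value
        FROM ClassifiedProperty cp
        JOIN Property p ON p.PropertyId = cp.PropertyId
        WHERE cp.ClassifiedId = ?
        ORDER BY p.PropertyId
        """,
        (classified_id,),
    )
    return rows.fetchall()


car_form = form_fields("cars")
ad_properties = properties_of(10)
```

Adding a new category is then a matter of inserting rows into Category and CategoryProperty, with no change to the schema.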
Your design is very much tied to the underlying products. Also, you are putting what appears to be mutually exclusive data in different columns (e.g. surely a house can't be both a villa and an apartment?). I'd go with a much more generalized form, something like:
Category
Classified
Poster
As in the OP, but with primary keys added/declared.
Then group all the category specific attributes into a single table - like
Std_Tags {id, category, tag}
{0,Cars,year}
{1,Cars,fuel}
{2,house,type}
{3,house,rooms}
With values in another table:
classified_tags {std_tags_id, classified_id, value}
{0,13356,2005}
{2,109,villa}
{1,153356,diesel}
This also simplifies building the input forms, because the template is stated explicitly. You can go further by adding a table like:
Allowed_values {std_tags_id, value}
{1,diesel}
{1,petrol}
{1,LPG}
{2,Villa}
{2,Apartment}
Then much of the data entry could be done using drop-down lists, conforming to standard searches.
C.
First of all you need to create a primary key for each table. Normally the best way to do this is to use a sequential id field that is named either id or tablenameId. This is really important. Primary keys tied to the actual data will cause problems when the data changes.
category (id PK, name)
category_options (id PK, category_id FK->category.id, option_name)
So that category table would have values like
(1, car)
(2, MC)
and options would have values like
(1, 1, year)
(2, 1, fuel)
(3, 2, type)
Then you need a table where the values are actually stored and linked to the item. This just requires that you join all 3 category tables when you do a query for one item.
category_values (id PK, category_options_id FK-> category_options.id, value, classified_id FK->classified.id)
Classified table needs fk to poster and id field.
classified (id PK, poster_id FK->poster.id, headline, description, hide_telephone_nr, changeable, action, price, modify_date)
The poster table is quite good as it is; just add an id field for the primary key. I just think that it is normally called users.
By category_options_id FK-> category_options.id I mean that category_options_id should have foreign key reference to category_options.id.
You could do even more normalizations like for classified.action and classified.changeable but it also adds complexity.
I hope this helps.
I must also stress that this is not the only possible solution, and depending on how you actually want to use the data it might not be the best option, but it works and is at least decent :)

How to display multiple values in a MySQL database?

I was wondering how you can store and display multiple values in a database. For example, let's say a user fills out a form that asks them to type in the kinds of food they like: cookies, candy, apples, bread and so on.
How can I store these in the MySQL database under a single field called food?
What would the structure of the food field look like?
You may want to read the excellent Wikipedia article on database normalization.
You don't want to store multiple values in a single field. You want to do something like this:
form_responses
id
[whatever other fields your form has]
foods_liked
form_response_id
food_name
Where form_responses is the table containing things that are singular (like a person's name or address, or something where there aren't multiple values). foods_liked.form_response_id is a reference to the form_responses table, so the foods liked by the person who has response number six will have a value of six for the form_response_id field in foods_liked. You'll have one row in that table for each food liked by the person.
Edit: Others have suggested a three-table structure, which is certainly better if you are limiting your users to selecting foods from a predefined list. The three-table structure may also be better if you are allowing them to enter their own foods, though if you go that route you'll want to be careful to normalize your input (trim whitespace, fix capitalization, etc.) so you don't end up with duplicate entries in that table.
Normally we do NOT do it like this. Try to use a relation table instead.
Table 1: tbl_food
ID primary key, auto increment
FNAME varchar
Table 2: tbl_user
ID primary key, auto increment
USER varchar
Table 3: tbl_userfood
RID auto increment
USERID int
FOODID int
Use a similar format to store your data, instead of fitting a chunk of data into one field.
Querying these tables is also easier than parsing the chunk of data.
Use normalization.
More specifically, create a table called users. Create another called foods. Then link the two tables together with a many-to-many table called users_to_foods that holds foreign keys referencing both of them.
One way to do it would be to serialize the food data in your programming language, and then store it in the food field. This would then allow you to query the database, get the serialized food data, and convert it back into a native data structure (probably an array in this case) in your programming language.
The problem with this approach is that you will be storing a lot of the same data over and over, e.g. if a lot of people like cookies, the string "cookies" will be stored over and over. Another problem is searching for everyone who likes one particular food. To do that, you would have to select the food data for each record, unserialize it, and see if the selected food is contained within. This is very inefficient.
Instead you'll want to create 3 tables: a users table, a foods table, and a join table. The users and foods tables will contain one record for each user and food respectively. The join table will have two fields: user_id and food_id. For every food a user chooses as a favorite, it adds a record to the join table of the user's ID and the food ID.
As an example, to pull all the users who like a particular food with id FOOD_ID, your query would be:
SELECT users.id, users.name
FROM users
JOIN join_table ON join_table.user_id = users.id
WHERE join_table.food_id = FOOD_ID;
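For completeness, a runnable sketch of the three-table structure the answers describe. It uses SQLite via Python's sqlite3 in place of MySQL for convenience; the join table is named users_to_foods after the earlier answer, and the sample users and foods are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE foods (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
-- One row per (user, food) pair; the composite key prevents duplicates.
CREATE TABLE users_to_foods (
    user_id INTEGER NOT NULL REFERENCES users(id),
    food_id INTEGER NOT NULL REFERENCES foods(id),
    PRIMARY KEY (user_id, food_id)
);
INSERT INTO users (id, name) VALUES (1, 'alice'), (2, 'bob');
INSERT INTO foods (id, name) VALUES (1, 'cookies'), (2, 'candy'), (3, 'apples');
INSERT INTO users_to_foods (user_id, food_id) VALUES (1, 1), (1, 3), (2, 1), (2, 2);
""")


def users_who_like(food_name):
    """Names of all users who listed the given food."""
    rows = conn.execute(
        """
        SELECT u.name
        FROM users u
        JOIN users_to_foods uf ON uf.user_id = u.id
        JOIN foods f ON f.id = uf.food_id
        WHERE f.name = ?
        ORDER BY u.name
        """,
        (food_name,),
    )
    return [row[0] for row in rows]


def foods_liked_by(user_name):
    """Names of all foods the given user listed."""
    rows = conn.execute(
        """
        SELECT f.name
        FROM foods f
        JOIN users_to_foods uf ON uf.food_id = f.id
        JOIN users u ON u.id = uf.user_id
        WHERE u.name = ?
        ORDER BY f.name
        """,
        (user_name,),
    )
    return [row[0] for row in rows]


cookie_fans = users_who_like("cookies")
alice_foods = foods_liked_by("alice")
```

Both directions of the lookup are plain joins, with no string parsing or unserializing involved.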