SQL - Storing undefined fields in a table

So, I have a few clients that want to store user data; the thing is, they want to store data without telling me what it is. What I mean is that they will likely store things like username, first name, last name and email, but then there are a myriad of other things that are not defined and that can change.
So with that in mind, I want to create a table that can handle that data. I was thinking of setting up a table like this:
CREATE TABLE [dbo].[Details](
[Id] [int] NOT NULL,
[ClientId] [nvarchar](128) NOT NULL,
[ColumnName] [nvarchar](255) NOT NULL,
[ColumnValue] [nvarchar](255) NOT NULL,
CONSTRAINT [PK_Details] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
)
Is this the best way to store this data, or am I making a mistake?
Please help me before I go ahead and create this table! :D

Just make clear to your clients that fields the database knows about, like a user number, name, and address, can be searched quickly, checked for consistency, and even used for program control (such as some users seeing or doing what others cannot), whereas "undefined data" cannot.
Then find out whether they really need individual fields. If so, the name/value approach is exactly the way to go. Maybe, however, it suffices to store one note per user, where they enter all their comments, memos, etc., i.e. just one text field to hold all "undefined data". If that is possible, it would be the better approach.
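For what it's worth, here is a sketch of what querying such a name/value (EAV) table tends to look like, which also illustrates the search cost mentioned above. The field names 'email'/'firstname' and the @ClientId parameter are illustrative assumptions, not part of the question:

```sql
-- Pivot two well-known name/value pairs back into columns for one client.
SELECT
    d.ClientId,
    MAX(CASE WHEN d.ColumnName = 'email'     THEN d.ColumnValue END) AS Email,
    MAX(CASE WHEN d.ColumnName = 'firstname' THEN d.ColumnValue END) AS FirstName
FROM dbo.Details AS d
WHERE d.ClientId = @ClientId
GROUP BY d.ClientId;
```

Every extra "column" the client invents becomes another CASE branch (or another join), which is why searching and validating EAV data is so much more work than with real columns.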

Related

SQL - Table structure for read/unread Notifications per user

I have a requirement to create a notification module on a web application, and I would like to know which users have read a notification so I won't have to display the red badge that indicates it is not yet read.
I have thought of two solutions.
The first one is a bit naive but I find it easier to implement and maintain:
Solution #1:
CREATE TABLE [Notifications1]
(
Id integer NOT NULL,
Title varchar(255) NOT NULL,
Description varchar(255) NOT NULL,
DateInserted datetime NOT NULL,
ReadByUsers varchar(MAX) NOT NULL
)
So in this case, when a user opens an unread notification, I will append the user's ID to the ReadByUsers column, so it will look like this: 10,12,15,20 ...
Then when I need to check whether the user has read the notification, I will simply query the column to see whether the user ID is included.
The second solution looks more like a 'best practice' approach, but it is a bit more complicated and a bit heavier, I guess.
Solution #2:
CREATE TABLE [Notifications]
(
Id integer NOT NULL,
Title varchar(255) NOT NULL,
Description varchar(255) NOT NULL,
DateInserted datetime NOT NULL,
CONSTRAINT [PK_NOTIFICATIONS]
PRIMARY KEY CLUSTERED ([Id])
)
CREATE TABLE [Notification_Users]
(
NotificationId integer NOT NULL,
UserId integer NOT NULL
)
In this case I store the users that have seen the notification in a separate table, with a one-to-many relationship on the Notification Id of the main table. Then I can check the Notification_Users table to see whether the user has seen the notification.
Also note that:
I don't need to know the time the user saw the notification.
The application is not expected to reach more than 10,000 users any time soon.
Notifications will be deleted from the database after a few months, so there will be at most about 10 notifications at a time.
Do you have any suggestions for better implementation?
Thanks
Solution #2 would be the preferred solution.
Storing a list of IDs in one column is not a normalized solution. In my opinion, the return on investment of breaking this out into two tables outweighs the perceived complexity.
Some good info below, and honestly all over the net.
A Database "Best" Practice
https://softwareengineering.stackexchange.com/questions/358913/is-storing-a-list-of-strings-in-single-database-field-a-bad-idea-why
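To make the comparison concrete, here is a sketch of the read-state check under Solution #2. Table and column names follow the question; @UserId is a hypothetical parameter:

```sql
-- List all notifications, flagging which ones a given user has already read.
SELECT n.Id,
       n.Title,
       CASE WHEN nu.UserId IS NULL THEN 0 ELSE 1 END AS IsRead
FROM Notifications AS n
LEFT JOIN Notification_Users AS nu
       ON nu.NotificationId = n.Id
      AND nu.UserId = @UserId;
```

With Solution #1 the equivalent check would need string parsing on ReadByUsers (a LIKE against '%,12,%' with delimiter tricks), which is both slower and easy to get wrong.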

Automatic creation of identity(1,1) but with GUIDs instead of integers

A neat thingy is to put identity(2,3) on a column to get a counter and a unique value to index. I wonder if, and in that case how, it's possible to do the same with a GUID.
So, instead of inserting a row with newid(), I'd have one created for me.
If there's no way to do that, I'd like to know the technical rationale behind it. If there is any, of course.
create table Records (
Id int identity(3,14) primary key,
Occasion datetime not null,
Amount float not null,
Mileage int not null,
Information varchar(999) default ' '
)
As the lazily efficient @MartinSmith commented (but didn't bother to formulate a full-size reply, forcing me to do that for him, hehe), we can use the DEFAULT keyword to achieve the requested behavior, as shown below.
create table Records (
Id uniqueidentifier primary key default newid(),
Occasion datetime not null,
Amount float not null,
Mileage int not null,
Information varchar(999) default ' '
)
And if the said lazy Martin wishes, he might just copy over the contents of this answer and post it as his own reply, since it was really his suggestions and I'm merely a typing machine for him. Given that he's got rep like an immortal God, I doubt he will, but still - I want to make this perfectly clear.
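One design note worth adding: random NEWID() values tend to fragment a clustered primary key through page splits. SQL Server also provides NEWSEQUENTIALID(), which generates ever-increasing GUIDs but may only be used in a DEFAULT constraint. A variant sketch of the same table:

```sql
-- Same table, but with sequential GUIDs to keep the clustered index
-- insert-friendly. NEWSEQUENTIALID() is only legal as a column default.
create table Records (
    Id uniqueidentifier primary key default newsequentialid(),
    Occasion datetime not null,
    Amount float not null,
    Mileage int not null,
    Information varchar(999) default ' '
)
```

The trade-off is that sequential GUIDs are guessable, so they are a poor choice if the IDs are ever exposed as security tokens.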

SQL Indexing Strategy on Link Tables

I often find myself creating 'link tables'. For example, the following table maps a user record to an event record.
CREATE TABLE [dbo].[EventLog](
[EventId] [int] NOT NULL,
[UserId] [int] NOT NULL,
[Time] [datetime] NOT NULL,
[Timestamp] [timestamp] NOT NULL
)
For the purposes of this question, please assume the combination of EventId plus UserId is unique and that the database in question is a MS SQL Server 2008 installation.
The problem I have is that I am never sure as to how these tables should be indexed. For example, I might want to list all users for a particular event, or I might want to list all events for a particular user or, perhaps, retrieve a particular EventId/UserId record. Indexing options I have considered include:
Creating a compound primary key on EventId and UserId (but I understand the index won't be useful when accessing by UserId on its own).
Creating a compound primary key on EventId and UserId and adding a supplemental index on UserId.
Creating a primary key on EventId and a supplemental index on UserId.
Any advice would be appreciated.
Indexes are designed to solve performance problems. If you don't yet have such a problem and can't tell exactly where you'll face trouble, then you shouldn't create indexes. Indexes are quite expensive: they not only take up disk space but also add overhead to every write or modification. So you have to clearly understand which specific performance problem you are addressing by creating an index; then you can judge whether it is needed.
The answer to your question depends on several aspects.
It depends on the DBMS you are going to use. Some prefer single-column indexes (like PostgreSQL), some can take more advantage of multi-column indexes (like Oracle). Some can answer a query completely from a covering index (like SQLite), others cannot and eventually have to read the pages of the actual table (again, like PostgreSQL).
It depends on the queries you want to answer. For example, do you navigate in both directions, i.e., do you join on both of your Id columns?
It depends on your space and processing-time requirements for data modification, too. Keep in mind that indexes are often bigger than the actual table they index, and that updating indexes is often more expensive than just updating the underlying table.
EDIT:
When your conceptual model has a many-to-many relationship R between two entities E1 and E2, i.e., the logical semantics of R is either "related" or "not related", then I would always declare that combined primary key for R. That will create a unique index. The primary motivation, however, is data consistency, not query optimization:
CREATE TABLE [dbo].[EventLog](
[EventId] [int] NOT NULL,
[UserId] [int] NOT NULL,
[Time] [datetime] NOT NULL,
[Timestamp] [timestamp] NOT NULL,
PRIMARY KEY([EventId],[UserId])
)
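For completeness, option 2 from the question (compound primary key plus a supplemental index on UserId) could be sketched like this; the index name and the INCLUDE column are illustrative choices, not prescriptions:

```sql
-- Supplemental nonclustered index so lookups by UserId alone are fast.
-- INCLUDE ([Time]) lets "events with times for user X" be answered from
-- the index without touching the base table (available since SQL Server 2005).
CREATE NONCLUSTERED INDEX IX_EventLog_UserId
    ON [dbo].[EventLog] ([UserId])
    INCLUDE ([Time]);
```

Combined with the PRIMARY KEY([EventId],[UserId]) above, this covers all three access patterns the question lists: by event, by user, and by the exact pair.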

Need to Develop SQL Server 2005 Database to Store Insurance Rates

Good Evening All,
A client has asked that I develop a web application as part of his existing site built in ASP.net 3.5 that will enable his brokers to generate quotes for potential client groups. Such quotes will need to be derived using rate tables stored in a SQL Server 2005 database.
The table structure is as follows:
CREATE TABLE [dbo].[PlanRates](
[AgeCategory] [int] NULL,
[IndustryType] [int] NULL,
[CoverageType] [int] NULL,
[PlanDeductible] [float] NULL,
[InpatientBenefit] [float] NULL,
[Rate] [float] NULL,
[OPMD15Copay] [float] NULL,
[OPMD25Copay] [float] NULL
)
Question: Assuming I use page validation in the web application to verify input against business logic, do you anticipate issues arising relative to the web application returning a quotation using the above database table layout? If so, can you suggest a better way to structure my table?
Bonus goes to anyone who has programmed web-based insurance quoting systems.
Thanks much for your help and guidance.
I would definitely add a surrogate primary key, e.g. PlanRatesID INT IDENTITY(1,1) to make each entry uniquely identifiable.
Secondly, I would think the fields "PlanDeductible", "InpatientBenefit", and "Rate" are money values, so I would definitely make them of type DECIMAL, not FLOAT. Float is not exact and can lead to rounding errors. For DECIMAL, you specify the total number of significant digits and the number of digits after the decimal point, e.g. DECIMAL(12,3) or something like that.
That's about it! :)
I would suggest using parameterized queries to save and retrieve your data, to protect against SQL injection.
EDIT
It looks like
[AgeCategory] [int] NULL,
[IndustryType] [int] NULL,
[CoverageType] [int] NULL,
are probably foreign keys; if so, you may not want to make them nullable.
NULLable category, types and rates?
Category and type columns are lookup fields to other tables storing additional information for each type.
You should check which of the columns is really nullable, and define how to deal with NULL values.
As rates can change, you should also consider a date range in the table (ValidFrom, ValidTo DATETIME), or have a history table associated with the table above to make sure past rate calculations can be repeated. (Might be a legal/financial requirement)
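A sketch of that date-range idea; the column names, the constraint name, and the parameters in the query are illustrative assumptions:

```sql
-- Add a validity range so past quotes can be reproduced.
-- ValidTo = NULL means the rate is still in effect.
ALTER TABLE [dbo].[PlanRates]
    ADD ValidFrom datetime NOT NULL
            CONSTRAINT DF_PlanRates_ValidFrom DEFAULT GETDATE(),
        ValidTo datetime NULL;

-- Repeating a quote then means filtering on the quote date:
SELECT Rate
FROM [dbo].[PlanRates]
WHERE AgeCategory = @AgeCategory
  AND CoverageType = @CoverageType
  AND @QuoteDate >= ValidFrom
  AND (ValidTo IS NULL OR @QuoteDate < ValidTo);
```

When a rate changes, you close the old row (set its ValidTo) and insert a new row, rather than updating Rate in place.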

Address book DB schema

I need to store contact information for users. I want to present this data on the page as an hCard and downloadable as a vCard. I'd also like to be able to search the database by phone number, email, etc.
What do you think is the best way to store this data? Since users could have multiple addresses, etc., complete normalization would be a mess. I'm thinking about using XML, but I'm not familiar with querying XML DB fields. Would I still be able to search for users by contact info?
I'm using SQL Server 2005, if that matters.
Consider two tables for People and their addresses:
People (pid, prefix, firstName, lastName, suffix, DOB, ... primaryAddressTag )
AddressBook (pid, tag, address1, address2, city, stateProv, postalCode, ... )
The Primary Key (that uniquely identifies each and every row) of People is pid. The PK of AddressBook is the composition of pid and tag (pid, tag).
Some example data:
People
1, Kirk
2, Spock
AddressBook
1, home, '123 Main Street', Iowa
1, work, 'USS Enterprise NCC-1701'
2, other, 'Mt. Selaya, Vulcan'
In this example, Kirk has two addresses: one 'home' and one 'work'. One of those two can (and should) be noted as a foreign key (like a cross-reference) in People in the primaryAddressTag column.
Spock has a single address with the tag 'other'. Since that is Spock's only address, the value 'other' ought to go in the primaryAddressTag column for pid=2.
This schema has the nice effect of preventing the same person from duplicating any of their own addresses by accidentally reusing tags, while at the same time allowing all other people to use any address tags they like.
Further, with FK references in primaryAddressTag, the database system itself will enforce the validity of the primary address tag (via something we database geeks call referential integrity) so that your -- or any -- application need not worry about it.
Why would complete normalization "be a mess"? This is exactly the kind of thing that normalization makes less messy.
Don't be afraid of normalizing your data. Normalization, like John mentions, is the solution not the problem. If you try to denormalize your data just to avoid a couple joins, then you're going to cause yourself serious trouble in the future. Trying to refactor this sort of data down the line after you have a reasonable size dataset WILL NOT BE FUN.
I strongly suggest you check out Highrise from 37signals. It was recently recommended to me when I was looking for an online contact manager. It does so much right. Actually, my only objection so far with the service is that I think the paid versions are too expensive -- that's all.
As things stand today, I do not fit into a flat address profile. I have 4-5 e-mail addresses that I use regularly, 5 phone numbers, 3 addresses, several websites and IM profiles, all of which I would include in my contact profile. If you're starting to build a contact management system now and you're unencumbered by architectural limitations (think Gmail contacts being keyed to a single email address), then do your users a favor and make your contact structure as flexible (normalized) as possible.
Cheers, -D.
I'm aware of SQLite, but that doesn't really help - I'm talking about figuring out the best schema (regardless of the database) for storing this data.
Per John, I don't see what the problem with a classic normalised schema would be. You haven't given much information to go on, but you say that there's a one-to-many relationship between users and addresses, so I'd plump for a bog standard solution with a foreign key to the user in the address relation.
If you assume each user has one or more addresses, a telephone number, etc., you could have a 'Users' table, an 'Addresses Table' (containing a primary key and then non-unique reference to Users), the same for phone numbers - allowing multiple rows with the same UserID foreign key, which would make querying 'all addresses for user X' quite simple.
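A minimal sketch of that normalized layout; all table and column names here are illustrative assumptions, not a prescription:

```sql
-- One row per person.
CREATE TABLE Users (
    UserId    int IDENTITY(1,1) PRIMARY KEY,
    FirstName nvarchar(100) NOT NULL,
    LastName  nvarchar(100) NOT NULL
);

-- Many addresses per user, each row pointing back at its owner.
CREATE TABLE Addresses (
    AddressId int IDENTITY(1,1) PRIMARY KEY,
    UserId    int NOT NULL REFERENCES Users(UserId),
    Line1     nvarchar(255) NOT NULL,
    City      nvarchar(100) NULL
);

-- Same pattern for phone numbers.
CREATE TABLE PhoneNumbers (
    PhoneId int IDENTITY(1,1) PRIMARY KEY,
    UserId  int NOT NULL REFERENCES Users(UserId),
    Number  varchar(32) NOT NULL
);

-- "All addresses for user X" is then a single filter:
-- SELECT * FROM Addresses WHERE UserId = @UserId;
```

Searching by phone number or email is equally direct: index the Number (or email) column and join back to Users.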
I don't have a script, but I do have MySQL that you can use. Before that, I should mention that there seem to be two logical approaches to storing vCards in SQL:
Store the whole card and let the database search (possibly huge) text strings, and process them in another part of your code or even client side, e.g.
CREATE TABLE IF NOT EXISTS vcards (
name_or_letter varchar(250) NOT NULL,
vcard text NOT NULL,
timestamp timestamp default CURRENT_TIMESTAMP on update CURRENT_TIMESTAMP,
PRIMARY KEY (name_or_letter)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
Probably easy to implement (depending on what you are doing with the data), though your searches are going to be slow if you have many entries.
If this is just for you then this might work (if it is any good, then it is never just for you). You can then process the vCard client side or server side using some beautiful module that you share (or someone else shared with you).
I've watched vCard evolve and know that there is going to be some change at /some/ time in the future, so I use three tables.
The first is the card (this mostly links back to my existing tables - if you don't need this then yours can be a cut-down version).
The second is the card definitions (which seem to be called a profile in vCard speak).
The last is all the actual data for the cards.
Because I let DBIx::Class (yes, I'm one of those) do all of the database work, this (three tables) seems to work rather well for me (though obviously you can tighten up the types to match rfc2426 more closely, but for the most part each piece of data is just a text string).
The reason that I don't normalize the address out of the person table is that I already have an address table in my database, and these three are just for non-user contact details.
CREATE TABLE `vCards` (
`card_id` int(255) unsigned NOT NULL AUTO_INCREMENT,
`card_peid` int(255) DEFAULT NULL COMMENT 'link back to user table',
`card_acid` int(255) DEFAULT NULL COMMENT 'link back to account table',
`card_language` varchar(5) DEFAULT NULL COMMENT 'en en_GB',
`card_encoding` varchar(32) DEFAULT 'UTF-8' COMMENT 'why use anything else?',
`card_created` datetime NOT NULL,
`card_updated` datetime NOT NULL,
PRIMARY KEY (`card_id`) )
ENGINE=InnoDB DEFAULT CHARSET=latin1 COMMENT='These are the contact cards';
create table vCard_profile (
vcprofile_id int(255) unsigned auto_increment NOT NULL,
vcprofile_version enum('rfc2426') DEFAULT "rfc2426" COMMENT "defaults to vCard 3.0",
vcprofile_feature char(16) COMMENT "FN to CATEGORIES",
vcprofile_type enum('text','bin') DEFAULT "text" COMMENT "if it is too large for vcd_value then user vcd_bin",
PRIMARY KEY (`vcprofile_id`)
) COMMENT "These are the valid types of card entry";
INSERT INTO vCard_profile VALUES('','rfc2426','FN','text'),('','rfc2426','N','text'),('','rfc2426','NICKNAME','text'),('','rfc2426','PHOTO','bin'),('','rfc2426','BDAY','text'),('','rfc2426','ADR','text'),('','rfc2426','LABEL','text'),('','rfc2426','TEL','text'),('','rfc2426','EMAIL','text'),('','rfc2426','MAILER','text'),('','rfc2426','TZ','text'),('','rfc2426','GEO','text'),('','rfc2426','TITLE','text'),('','rfc2426','ROLE','text'),('','rfc2426','LOGO','bin'),('','rfc2426','AGENT','text'),('','rfc2426','ORG','text'),('','rfc2426','CATEGORIES','text'),('','rfc2426','NOTE','text'),('','rfc2426','PRODID','text'),('','rfc2426','REV','text'),('','rfc2426','SORT-STRING','text'),('','rfc2426','SOUND','bin'),('','rfc2426','UID','text'),('','rfc2426','URL','text'),('','rfc2426','VERSION','text'),('','rfc2426','CLASS','text'),('','rfc2426','KEY','bin');
create table vCard_data (
vcd_id int(255) unsigned auto_increment NOT NULL,
vcd_card_id int(255) NOT NULL,
vcd_profile_id int(255) NOT NULL,
vcd_prof_detail varchar(255) COMMENT "work,home,preferred,order for e.g. multiple email addresses",
vcd_value varchar(255),
vcd_bin blob COMMENT "for when varchar(255) is too small",
PRIMARY KEY (`vcd_id`)
) COMMENT "The actual vCard data";
This isn't the best SQL but I hope that helps.