When is a good time use EAV? - sql

I am trying to create a networking site to allow manufacturers to network with Producers. My boss wants other organizations to be able to upload excel files to the website of their products and have it stored in the database. Each product that people enter could have their own host of properties. Maybe the company could somehow control the attributes they want to display on said networking site. That said this seems like a time where I might end up with near inifinite amount of attributes and EAV seems like it makes alot of sense for this type of task.
I went here to see my different options:
How to design a product table for many kinds of product where each product has many parameters
Is this a good case to use EAV or no
and saw that there were these questions
Do you have a lot of product types ?
Do you need to handle "variants" of the products ?
Do you intend to add entirely new types of products ?
and the answer for all of them in my mind is yes SO I think this is a good time to use eav but I'm not sure because I know some people think EAV IS the the enemy. I also have no experience using a nosql database and maybe that is the direction I should think about going. if it matters at all im doing the back end with asp.net visual studio as of right now.
But on a sidenote I feel like this is a bad task to give a singular intern (still in college) to make a networking site this big when you are a small manufacturing company with no cs division... debating quitting.
regardless does anyone have any advice as to if I should stick with MySQL and try eav vs one of the other options from the first link.

Related

CRM 2013 requirement gathering and expansion, Best Practices

I'm fairly new to CRM 2013, and I've been reading and watching a lot of videos on the subject. I'm hoping someone on the interweb, can offer some tips or suggestions that will resonate with the way I interpret and comprehend this material.
I have a regular MS Access DB, with a few different tables that is used to store personnel records. From what I've learned so far CRM looks like a good candidate for moving away from Access and towards CRM.
My question is, what are some best practices when it comes to gathering requirements and expanding them for the CRM 2013 environment?
i.e. (for simplicity sake)
MS Access db has two tables.
Table A stores names of employees.
Table B stores the employee's favorite food.
Should each table have their own entity (Table_A 1:N, Table_B N:1), or are there times when you should combine multiple tables under a single entity?
Is it considered bad form to just put everything under one entity?
If it is bad form, how do you determine when to split the information
into multiple entities?
Business processes seem to remind me of SharePoint workflows. When should you rely on a BP?
I hope this makes sense, I'm still trying to make sense of it all. Any help is appreciated, thanks!
The Dynamics CRM is backed by a SQL Server, so think in terms of SQL tables and what's better. Splitting into smaller objects which have single responsibility is is better in most cases, but this might affect the performance on joining records. Honestly, we are moving away from CRM in cloud because it's not scaleable, not reliable (if it goes down you need to wait when it goes up - no second replica), no control over underlying SQL data or SQL Server instance size (DTUs), hard to test, and is just painful for a big project. And we are going to replace all Business Processes with code, because they are not testable as well - you cannot write unit tests.
Although #sergeSemenov has some good points, you have to start what CRM was is. CRM is a Customer Relationship Management suite, built on top of Xrm, a RADP, Rapid Application Development Platform.
Xrm is not designed to be the most flexible, most performant platform in the world. If speed of use and the ability to use any specific technology/sever topology to meet your demand is a requirement for your app, don't start with XRM.
If speed of development, speed to market, reduced costs by having non developers doing work that in the custom app dev world are required by developers, or even a desire to be in the cloud on day 1, are more highly valued, then Xrm, and consequently CRM is a great place to start.
As far as your questions, it all depends. The more normalized your data is, multiple entities for each relationship, the less work involved in ensuring they are all in sync, but usually the more difficult it is for your end users to work with the information. They'll have to navigate to multiple forms for data entry, and have lots of joins to configure in reports.
I generally try to keep the data as normalized as possible, but if anything is a 1:1 relationship, combine that into single entity. Or you know it's going to be a 1:2 or 1:3 relationship (See addresses on contacts and accounts for example)
Basically, it is an answer that requires a unique look at the application requirements, and personal experience. I would highly recommend seeking a consultant with CRM experience. Another advantage of a platform, is that there is a whole resource pool of developers, and BAs that already know the framework, and can provide value to the business on day one, rather than taking 2 or 3 weeks getting up to speed on your specific architecture.
Good luck!

Should I use EAV database design model or a lot of tables

I started a new application and now I am looking at two paths and don't know which is good way to continue.
I am building something like eCommerce site. I have a categories and subcategories.
The problem is that there are different type of products on site and each has different properties. And site must be filterable by those product properties.
This is my initial database design:
Products{ProductId, Name, ProductCategoryId}
ProductCategories{ProductCategoryId, Name, ParentId}
CategoryProperties{CategoryPropertyId, ProductCategoryId, Name}
ProductPropertyValues{ProductId, CategoryPropertyId, Value}
Now after some analysis I see that this design is actually EAV model and I read that people usually don't recommend this design.
It seems that dynamic sql queries are required for everything.
That's one way and I am looking at it right now.
Another way that I see is probably named a LOT WORK WAY but if it's better I want to go there.
To make table
Product{ProductId, CategoryId, Name, ManufacturerId}
and to make table inheritance in database wich means to make tables like
Cpus{ProductId ....}
HardDisks{ProductId ....}
MotherBoards{ProductId ....}
erc. for each product (1 to 1 relation).
I understand that this will be a very large database and very large application domain but is it better, easier and performance better than the option one with EAV design.
EAV is rarely a win. In your case I can see the appeal of EAV given that different categories will have different attributes and this will be hard to manage otherwise. However, suppose someone wants to search for "all hard drives with more than 3 platters, using a SATA interface, spinning at 10k rpm?" Your query in EAV will be painful. If you ever want to support a query like that, EAV is out.
There are other approaches however. You could consider an XML field with extended data or, if you are on PostgreSQL 9.2, a JSON field (XML is easier to search though). This would give you a significantly larger range of possible searches without the headaches of EAV. The tradeoff would be that schema enforcement would be harder.
This questions seems to discuss the issue in greater detail.
Apart from performance, extensibility and complexity discussed there, also take into account:
SQL databases such as SQL Server have full-text search features; so if you have a single field describing the product - full text search will index it and will be able to provide advanced semantic searches
take a look at no-sql systems that are all the rage right now; scalability should be quite good with them and they provide support for non-structured data such as the one you have. Hadoop and Casandra are good starting points.
You could very well work with the EAV model.
We do something similar with a Logistics application. It is built on .net though.
Apart from the tables, your application code has to handle the objects correctly.
See if you can add generic table for each object. It works for us.

Draw EER Model to create Store Database system

I asked the following question on a database site:
I am trying to build an EER Model for a Autostore that has 5 locations
and offers a range of auto products. They offer car repairs and
roadworthy tests as a service also. I need to be able to make
fortnightly reports on unfinished service jobs, and fortnightly
reports on the sales. They have a wide customer database filled with
full addresses. There is a constant inflow of new stock items and
restocking of old ones. There should also be a way to know the cost of
each item in stock and where its being held.
I swear I've researched it enough to be able to understand it by now
but Im really struggling to map this out as I'm constantly running
into a wall when dealing with the products that are being restocked,
sold and stocked by particular stores in different locations.
-I'm a total rookie with this kind of thing but if anyone can help me it would be amazing.
but I am struggling to find an answer and was thinking that maybe if I asked someone here to build an SQL setup it would lead me in the direction of being able to make the model or if there was a way of building the relational model then it would be a simple step from there, unless someone has the original answer - or all of them haha, hope you can help!
Thanks,
Jacob
If you are struggling with creating an EER diagram, my guess is that you may not have captured detailed enough requirements for the application. A clear understanding of the functionality the application should provide should lay the groundwork for what you need to model in the database.
Ask yourself these questions.
Have I created user profiles for each type of user the application will be used by?
Have I outlined every action these users will be performing on the application and the details of the actions?
These are just two of many questions that you have hopefully fully addressed. If you have addressed these topics and everything else fully, perhaps you just need a different approach in organizing your requirements.
Break it up into segments of data. For example, you'll need to create a system of tables that manages inventory. Which will need to then be linked up to a system of tables that manages sales and service records. Which will need to be linked to a system of tables that manages customers data. The sales/service and inventory control will need to be linked up to a system of tables that governs employees and their roles and ability to do things (security, privileges, etc). I can go on and on speaking theoretically about this, but this should hopefully be enough to get you started.
Good luck.

Database Modification or start over?

So I'm currently working on rebuilding an existing website that is used internally at my company for project management, at heart it is a bug tracking utility that has some customer support and accounting operations linked into it.
Currently the database model is very repetitive, a good example of this is, currently a UserId is linked into a record (FK relationship into a user table that contains all the information about the user) and then all the information about the user also exists in the table.
I've been tasked with improving the website and the functionality of the model; however, I want to reduce the repetition of data in the website (is this normalization or is that the breaking apart of unlinked items into separate tables?). I'm not sure what the best method of doing this would be. I'm thinking of generating the creation scripts for the database and creating a new database project in VS to then modify the database, then generating some scripts to populate the new database model from the old database.
I plan on using the Entity Framework and ASP. NET MVC 2 to build the website as I think it provides the most flexible model moving forward for the modification and maintenance of the website.
The reason I ask all of this is because I'm very familiar with using databases and modifying existing ones to be used in applications and websites but I'm trying to discover the best way to build one.
I'm curious if there is any material on the best way to do this or if I should be using a different tool to do this with?
Edit: Providing more information on the model
There are 4 major areas that we have that are used:
Cases (Bugs, Features, Working Tasks, Etc)
2 .Tickets (Tech Support Events)
Errors (Errors Generated from our logging Library, Basically a stack trace with customer information)
License (Keeps track of each customers License allows modification to those licenses)
These are the Objects that are intermixed and used throughout the above 4 major areas.
Users (People who use the system)
Customers (People who use our software)
Stores (Places where our customers use our software)
Products (Our Software)
Relationships
Cases:
A Cases has to have a User, can have a Customer, Store, Error, Ticket and/or Product
Tickets
A Ticket has to have a User and a Customer, can have a Store, Error and/or Product
Errors:
A Error has to have a Product, Can Have a Case, Ticket, Store, and/or Product
Licenses:
A Licenses has to have a Product and Customer, can have a Store
Like I said very basic website, with a not super complex database, if done correctly.
Currently the database has no FK constraints, replication of lots of information across each table and lots of extra tables that are duplicates with different names.
E.g.
Each Case type has a separate table so there is a FeatureRequest, Bug, Tasks, Completed, etc table that all contain the same information.
Normalization is about storing data without redundancy or anomalies.
One example of an anomaly could be when attributes about a user in your main table are not in sync with the users table. Someone changes information about that user in one table without reflecting the changes in the redundant copy. The problem is that it's hard to know which change is the correct one.
Some people think that normalization is just about breaking apart tables into littler tables, because that's what they see as the most common type of change. But that's not the goal of normalization. It's just by coincidence that most mistakes of non-normalization involve stuffing too much data into one table where multiple tables would be correct.
It's hard to answer your question about whether to modify your database in-place or whether to create a whole new database and migrate to it.
What I would do in your case is to design a properly normalized database, and then examine the differences between that and your existing database. Imagine what you would have to do for each difference, to change your old database to the new one, versus a data migration. It could be that only a few changes are needed, only dropping the redundant columns. Or it could be that some major rework is needed. It's impossible to tell until you do the work of creating a normalized data model so you can compare.
The bigger task might be to adapt your application code that uses the database. One way to ease this transition is to create database views on top of the normalized database, which mimic your old non-normalized database. That way hopefully you don't have to rewrite every bit of code in your app all at once, you can keep some of it the same at least until you can refactor the code.
Also having a good set of regression tests in place is ideal, so you can be sure your app still does all the tasks it is supposed to do, as you refactor the database and the code that uses the database.
Re your comment: You mention that you're adding new functionality to the user model at the same time. I would find it too confusing to try to do this simultaneously with refactoring. Refactoring typically does not change functionality, it only changes implementation. But refactoring adds value because it makes the code easier to maintain or debug, improves efficiency, or prepares you to make future functionality changes more easily.
I would recommend that you bit the bullet and add your new user model features to the old non-normalized database. It's good to get the benefit of new features in the short term, and also you need to develop those features first to understand them well enough to account for them in your big refactoring project.
Here are some suggestions for resources to help you truly understand what normalization means:
SQL and Relational Theory by C. J. Date
A Simple Guide to Five Normal Forms in Relational Database Theory by William Kent
Database Normalization at Wikipedia and its sub-pages for each respective normal form
SQL Antipatterns Volume 1: Avoiding the Pitfalls of Database Programming by me, Bill Karwin. I wrote a chapter about database normalization that I hope explains it in plain English and with good examples.
Here are a couple of resources for managing changes to a database:
Refactoring Databases by Scott W. Ambler and Pramodkumar J. Sadalage
Agile Database Techniques: Effective Strategies for the Agile Software Developer by Scott W. Ambler
How long do you have, and how big is the database?
It's very difficult to answer this question black and white without being immersed in your environment and business case. It really doesn't seem like your limitation is technology wise, just to choose between solutions.
Re-creating is what programmers instinctively go for. However, in the "real world", sometimes we spend a lot of effort into something that isn't that used or wont last that long.
So food for thought. How long will it take you to re-do the database, how much will it cost. Will working with what's existent be sufficient for the functionality asked?

Does an ORM integrate with existing applications or do I not understand?

Assume Hibernate for the ORM.
I'm not sure how to ask this. I want to build an application that can replace part of another. For example, say I have an application with various modules, called the "big" app. This application may handle HR, financial, purchases, skill sets, etc. But maybe, for whatever reason, I don't like the skill set module, but I like the rest of the application. I want to build an app that uses the same database that the rest of the "big" app uses but use my software as the front end for that piece.
I could build my app and have it hit the database directly with no ORM. My question is is there an advantage to using an ORM here. I'm thinking there is because if the "big" app goes away and another app is purchased, we could continue to use my version of skill set because I am using hibernate instead of hitting things directly. I'm still learning but I thought that my application used objects that I named and that in the case I just described I'd have to change my mapping files only or/and my code very little.
Here is another example. I have a legacy application and legacy database. It uses database X. I decide that I no longer like the old terminal emulator application that is used to get the data and that I want a graphical version. I can use hibernate with my application and when I finally decide to get rid of the legacy database and change to the latest Oracle or SQL Server, I can do so with minimal headache? Or is my database going to change so much that it wouldn't have matter anyway (I'm suggesting that upon changing to a new database more information will want to be captured)?
I was hoping for comments, if I am misunderstanding why hibernate/ORM might or might not be a benefit.
Thank you.
I do not think you will have a huge benefit frmo hibernate if the database schema changes to something completely different, you might have to change more than just your mapping - especially if more "structure" is added to the database (tables, column and such schema things). That said, if the database was structured mostly the same way, but lets say just the column names and tables names changes and a couple of tables are merged or something like that - you can get by with just changing your mapping.
But I would really recommend using hbernate for database agnosticity, that's is a pretty easy path.
AND then just because it doesn't exactly helps you if your entire database is changed, it such an incredible amount of other forces, that I would choose that over direct DB access most of the time.
Lastly you could think about using a service-layer such as the repository pattern that abstracs away the data access, so the business of your appilcation wouldn't need to change if the database changes.
Switching from one DBMS to another (ala Oracle to SQL Server) is one thing that using an ORM would certainly make much easier.
As for switching from one "big app" to another "big app", I doubt if using an ORM would help that much. It's likely that the database structure and business logic would be different enough that you would find yourself rewriting lots of code anyways.
You can generate domain objects with Hibernate Tools, if you do that than it will be painless and fast. however if you write all the objects by hand you will die. i think its good idea to rewrite part of the app and get to know hibernate better.
I think it's generally a bad idea to make any decision based on the
unknowns versus the knowns. Whether you're deciding on a data
access/persistence strategy, what car to buy, or what college to go
to, you should put the most weight on the things you know you want
today, rather than worrying about what may or may not happen tomorrow.
So when considering ORMs, I wouldn't worry too much about things such as apps
"going away" or DBMSs changing (unless that's either already been talked about, or
there's a history of this in your company). I'm not saying that these aren't things that will never happen, but rather that they should take a back seat to the generally much more important considerations of maintainability, performance, and developer productivity.
So in short, choose an ORM based on its ability to solve the problems and satisfy the requirements that you have today.