SQL vs. NoSQL database for 'tags-heavy' CRM application

I'm building a talent management CRM application and I'm having trouble choosing between a SQL and a NoSQL database for my data.
The application will only have a few 'core' entities (Person, Job, Company, Interview), and will rely heavily on 'tagging' of those entities. You can add Tags and Notes to a Person, a Job, a Company, and then sort/search data by those tags.
What I have learned about NoSQL is that I can just have a Person object (document) with an array of Tags and Notes, whereas in SQL I would need separate Tags and Notes tables and would have to construct joins to gather all the data for a Person.
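For concreteness, here is a rough sketch of the relational schema I'd expect to need (table and column names are just placeholders):

    -- Placeholder schema for the SQL option: tags live in their own tables
    CREATE TABLE Person (
        PersonId INT PRIMARY KEY,
        Name     NVARCHAR(200) NOT NULL
    );

    CREATE TABLE Tag (
        TagId INT PRIMARY KEY,
        Label NVARCHAR(100) NOT NULL
    );

    CREATE TABLE PersonTag (
        PersonId INT NOT NULL REFERENCES Person (PersonId),
        TagId    INT NOT NULL REFERENCES Tag (TagId),
        PRIMARY KEY (PersonId, TagId)
    );

    -- Gathering one person's tags already takes two joins
    SELECT p.Name, t.Label
    FROM Person p
    JOIN PersonTag pt ON pt.PersonId = p.PersonId
    JOIN Tag t ON t.TagId = pt.TagId
    WHERE p.PersonId = 42;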
Could anyone give me some pointers on what would be the way to go for my particular scenario?
Thanks!

Our ERP system is based on UniData (NoSQL). It is okay for performing the standard tasks needed to do business, like entering customers, creating sales orders, and invoicing, but when it comes to creating reports that were not originally foreseen it is quite cumbersome. The system only lets you create reports off of one table; if you need data from another table you have two options: 1. create what is called a virtual attribute for every field you need to look up from the other table, or 2. write a UniBasic program to retrieve the data needed.
To meet most of our business needs on the reporting front, it is more beneficial for us to export the data to SQL and then build the reports there. The reports run quicker from SQL, and most of the time a reporting tool can be used to create them, so the work can usually be done by a power user rather than by someone with a high level of programming ability.
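For contrast, the kind of cross-table lookup that needs a virtual attribute (or a UniBasic program) in UniData is a single join in SQL. A sketch, with hypothetical table names:

    -- One join replaces a virtual attribute per looked-up field (names are hypothetical)
    SELECT so.OrderNumber, so.OrderDate, c.CustomerName, c.Region
    FROM SalesOrder so
    JOIN Customer c ON c.CustomerId = so.CustomerId;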
It would have been nice if it had already been in SQL in the first place.
But maybe some other NoSQL database has better functionality than UniData. That said, third-party support for NoSQL database engines usually comes at a higher premium than support for SQL engines, because fewer specialists are available.

Related

Creating a Data Warehouse

Currently our team has a major database/data management issue: hundreds of databases are being built and used for minor or one-off applications, where the app should really be pulling from an already existing database.
Since our security is so tight, the owners of these systems of authority will not allow others to pull data from them at a consistent (application-necessary) rate; instead, they allow a single app to do a weekly pull, and that data is then given to the org.
I am being asked to compile all of those publicly available weekly snapshots into a single data warehouse for end users to go to. Realistically we are talking about 30-40 databases, each with hundreds of thousands of records.
What is the best way to turn this into a data warehouse? Create a SQL Server instance and treat each source as its own DB on the server? I am less worried about the individual app connections; I really want to know the best practice for housing all of the data for consumption.
What you're describing is more of a simple data lake. If all you're being asked for is a single place for the existing data to live as-is, then directly pulling all 30-40 databases onto a new server will get that done. One thing to note: if the sources are creating Database Snapshots, those wouldn't be helpful here, but with actual database backups it would be easy to build a process that copies and restores them to your new server. This assumes all of the sources are on SQL Server.
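As a rough sketch, that copy-and-restore step could look like this in T-SQL; the database name, logical file names, and paths are all hypothetical:

    -- Restore one source backup onto the consolidation server (all names hypothetical)
    RESTORE DATABASE SourceDb01
    FROM DISK = N'\\backupshare\SourceDb01.bak'
    WITH MOVE N'SourceDb01_Data' TO N'D:\Data\SourceDb01.mdf',
         MOVE N'SourceDb01_Log'  TO N'E:\Logs\SourceDb01.ldf',
         REPLACE;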
"Data warehouse" implies a certain level of organization beyond that, to facilitate reporting on an aggregate of the data across the multiple sources. Generally you'd identify any concepts that are shared between the databases and create a unified table for each concept, then create an ETL (extract, transform, load) process to standardize the data from each source and move it into those unified tables. This would be a large lift for one person to build. There's plenty of resources that you could read to get you started--Ralph Kimball's The Data Warehouse Toolkit is a comprehensive guide.
In either case, a tool you might want to look into is SSIS. It's good for copying data across servers and has drivers for multiple different RDBMS platforms. You can schedule SSIS packages from SQL Agent. It has other features that could help for data warehousing as well.

How to create multi-database support software?

I am creating software for retail shops, and I want my software to support both SQL Server and SQLite. If the user is standalone (one PC), it should select the SQLite database; if it runs over the network, it should choose the SQL Server option.
I am developing this software in Visual Studio 2010, in VB.NET.
From my research, there are three types of connections in Visual Studio: ODBC, OLE DB, and MSSQL. OLE DB can support MS Access databases and SQL Server.
Any comment and idea is highly appreciated.
The best way to code your applications is to abstract functionality into different tiers or layers. This can mean lots of things and can get quite complex, but the general idea is to keep your application's parts separated. Let's assume you have an inventory form in your program where you can look up current inventory. The form that displays the inventory doesn't need to know what database your customer is running. Generally you're better served by it not knowing. Likewise, your code that accesses the respective database, whether it be SQL Server, SQLite, or Access, doesn't really need to know what your Inventory form is going to do with the data it is retrieving. All your Inventory code should be doing is displaying your inventory in a way that's most useful to your customer, and all your data coding should be doing is getting the data that is requested of it.
The route I would probably take in your situation is to create a data provider class. Inside that class is where you would encapsulate logic for the different database functions you may have, as well as the different database systems your customers may have. Say for instance a store owner just received a shipment of products and needs to add one to his store's inventory. Ideally, your program should simply be able to perform a call like DataProvider.AddInventory(). Inside the DataProvider class, you would write code to keep track of which database solution the customer is using as well as an implementation of logic for each of the database solutions you'd like to support. Ideally, you should implement every data function you may need your application to perform so that it can be called very simply like the AddInventory() example.
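Whichever backend the customer runs, each implementation inside DataProvider ultimately issues the same kind of parameterized statement. A hypothetical sketch of what AddInventory() might execute (table and parameter names are made up):

    -- Hypothetical statement behind DataProvider.AddInventory().
    -- SQL Server and SQLite both accept @-prefixed parameters, so often
    -- only the connection and command objects differ between implementations.
    INSERT INTO Inventory (ProductId, Quantity, ReceivedDate)
    VALUES (@ProductId, @Quantity, @ReceivedDate);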
Implementations of data providers can be as simple or complex as you like. In some cases where you're going to have multiple applications written in multiple different languages on multiple different platforms accessing your data source from multiple locations, it may make sense to write some sort of middleware. In your case, it sounds like this is the type of application that will reside "in house" and should be served fine by abstracting the data access to a separate class.

Database Client-Specific Tables vs. Relational Tables

I have a scenario: my application is a SaaS-based app catering to multiple clients, and data integrity for each client is essential.
Is it better to keep my tables client-specific, or relational?
For example: I have a mapping table with fields MapField1, MapField2, and I need this kind of data for each client. Should I have one table per client, named like MappingData_<ClientName>, or a single table mapped to the ClientId: MappingData with fields MapField1, MapField2, ClientId?
I would have a separate database for each customer. (Multiple databases in a single SQL Server instance.)
This would allow you to design it once, with a single schema.
No dynamically named tables compromising test & development
Upgrades and maintenance can be designed and tested in one DB, then rolled out to all
A single customer's data can be backed up, restored, or dropped exceedingly simply
Bugs discovered/exploited in one DB won't compromise the integrity of other DBs
Data access (read and write) can be managed using SQL Logins (No re-inventing the wheel)
If there is a need for globally shared data, that would go in another database, with its own set of permissions for the different SQL Logins.
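A minimal T-SQL sketch of that per-customer setup (database name, login, and password are hypothetical; the role syntax is SQL Server 2012+):

    -- One database and one login per customer (all names hypothetical)
    CREATE DATABASE Crm_Acme;
    CREATE LOGIN AcmeLogin WITH PASSWORD = 'ChangeMe!123';
    GO
    USE Crm_Acme;
    GO
    -- The customer's login can only read and write its own database
    CREATE USER AcmeUser FOR LOGIN AcmeLogin;
    ALTER ROLE db_datareader ADD MEMBER AcmeUser;
    ALTER ROLE db_datawriter ADD MEMBER AcmeUser;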
Using a single database with all customers in it is my next best choice. You still have a single schema, but you don't get to partition the customers' data, you have to manage access rights and permissions yourself, and you take on a whole host of additional design and testing work.
I would never go near dynamically creating new tables for additional customers. A new table name means all your queries need to be updated with the new table name, plus a whole host of other maintenance headaches.
I'm pretty much of the opinion that if you want to create tables dynamically during the Business As Usual use of an application/service, you've designed it badly.
SO has a tag for the thing you're describing: "multi-tenant".
Visualize the architecture for supporting a multi-tenant database application as a spectrum. At one extreme of the spectrum is "shared nothing", which means each tenant has its own database. At the other extreme of the spectrum is "shared everything", which means tenants share tables, and each row in each table belongs to one tenant. (Each row contains a tenant identifier.)
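A minimal sketch of the "shared everything" end, reusing the field names from the question: every row carries the tenant identifier and every query filters on it.

    -- Single shared table; ClientId is the tenant identifier on every row
    CREATE TABLE MappingData (
        MappingId INT IDENTITY PRIMARY KEY,
        ClientId  INT NOT NULL,
        MapField1 NVARCHAR(100) NULL,
        MapField2 NVARCHAR(100) NULL
    );

    -- Every query must filter by tenant
    DECLARE @ClientId INT = 42;
    SELECT MapField1, MapField2
    FROM MappingData
    WHERE ClientId = @ClientId;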
Terminology seems to overlap, so read carefully. What one writer means by shared schema might be identical to what another writer means by shared everything.
This SO answer, also written by me, describes the differences and the tradeoffs in terms of cost, data isolation and protection, maintenance, and disaster recovery. It also links to a fairly good introductory article.

Local SQL database interface to cloud database

Excuse me if the question is simple. We have multiple medical clinics, each running its own SQL database EHR.
Is there any way I can interface each local SQL database with a cloud system?
I essentially want to use the data for the patient currently being consulted to generate a pathology request that links to a cloud (Google App Engine?) database.
As a medical student / software developer this project of yours interests me greatly!
If you don't mind me asking, where are you based? I'm from the UK and unfortunately there's just no way a system like this would get off the ground as most data is locked in proprietary databases.
What you're talking about is fairly complex anyway; whatever country you're in, I assume there would have to be a lot of checks and security around any cloud system that dealt with patient data. Theoretically, though, what you would ideally want to do is create an online database (cloud, hosted, intranet, etc.) and scrap the local databases entirely.
You then have one 'pool' of data each clinic can pull information from (i.e. ALL records for patient #3563). They could then edit that data and/or insert new records and save them, exporting them back to the main database.
If there is a need to keep certain information private to one clinic only, this could still be achieved in one database in a number of ways, or you could retain parts of the local database and have them merge with the cloud data as they're requested by the clinic.
This might be a bit outdated, but you should check out https://www.firebase.com/. It would let you do what you want fairly easily. We just did this for a client in the exact same business you are in.
Basically, Firebase lets you work with a central database in the cloud that is automatically synchronised with all its front-ends. It even handles losing the connection to the server automagically. It's the best solution I've found so far for keeping several systems running against a single cloud database.
We used to have our own backend that would try its best to sync changes, but you need to be really careful with inter-system unique IDs for your tables (i.e. creating a new user at one branch must not yield an id that already exists in any other branch or in the central database). It becomes cumbersome very quickly.
CakePHP can generate this kind of unique ID pretty easily and automatically, but you still have to work on syncing all the local databases with the central repository.
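If the backend stays on SQL, one hedged sketch of the same idea at the database level is to use GUID keys, so rows created at different branches cannot collide (table and column names are hypothetical):

    -- GUID primary keys avoid cross-branch id collisions (T-SQL; names hypothetical)
    CREATE TABLE PatientRecord (
        RecordId  UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY,
        ClinicId  INT NOT NULL,   -- which clinic created the row
        CreatedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME()
    );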

SQL, Analysis Services, or Reporting Services

I have a requirement and I am not sure if I should use Analysis services or Reporting services or some other technique.
My client wants to show special deals from a database on their online website, targeted at users: if a user is from the UK, show deals for the UK priced in pounds; if a user is from Canada, show deals for Canada in Canadian dollars; and so on.
Their database has multiple tables, each loaded with 1 to 2 million records. Each table is for a different category of products and has a Currency and a Country column to filter on. I cannot restructure their schema, as they have a huge amount of development done to integrate with various business applications.
I need a solution involving a data warehouse that can fetch data quickly and cache it for the next 12 or 24 hours (I do not want to cache on the web server). I do not have much experience with Analysis Services or Reporting Services, so I need your suggestions and anything you can share from your good or bad experiences.
Analysis Services is not what you want here: you do not need cubes that summarize info.
Nor is Reporting Services: you will want to display your data in plain HTML.
I would just query the existing data and display it. If performance becomes an issue, you could run an SSIS job every 12 hours to extract the data to a dedicated database created for this application. But consider tweaking your indexes first.
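A sketch of that index tweak, using the Country and Currency columns mentioned in the question (the table and the other column names are hypothetical):

    -- Covering index for the country/currency filter
    CREATE NONCLUSTERED INDEX IX_Deals_Country_Currency
    ON dbo.Deals (Country, Currency)
    INCLUDE (DealName, Price, ExpiryDate);

    -- The website query then becomes a cheap index seek
    DECLARE @Country NVARCHAR(50) = N'UK', @Currency CHAR(3) = 'GBP';
    SELECT DealName, Price, ExpiryDate
    FROM dbo.Deals
    WHERE Country = @Country AND Currency = @Currency;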