Setting up a company DB from scratch - primary-key

I'm using this forced "down time" to finally take my business from Excel to Access. I am fairly accomplished at Excel VBA etc, and pretty much run the business on a handful of highly developed Excel sheets I’ve created over the years. They work well, but they are not very scalable, and I want to get over to a proper relational DB.
I've taken an Udemy course on Access which was fine, but I’ve already hit some issues which may be fundamental misunderstandings, or just inexperience.
My first issue is that my company has projects (commercial contracts) which often, but not always, involve two ‘customers’ - an End User and an Agent. Agents and End Users can be interchangeable though, i.e. an Agent on one project might be the End User on another, so my “Customer Table” is simply a list of ALL my end users and agents with a CustomerID.
In my “Project Table” I have a CustomerID field and an AgentID field, both of which I wanted to use to pull out a customer and then agent from the single “Customer Table”. I can’t find the way to set up the relationships to enable me to do that – I can get either one, but not both for each Project Table query.
For a while I thought I needed a many-to-many relationship I needed, but I still don’t find how I can reference two entries from a single table in one record.
Thanks for any help!

You're almost there. What you need to do is to create a one-to-many join between tblCustomer and tblProject (based on tblCustomer!CustomerID=tblProject!CustomerID) and then another one-to-many join between tblProject and another instance of tblCustomer (based on tblCustomer!CustomerID=tblProject!AgentID). The relationship window should look like:
Regards,

Related

Need advice on repeating a table primary key on multiple linked tables as foreign key

I'm looking for a second opinion from maybe someone with more experience with SQL.
So I have a database that looks like this :
Company has multiple Clients which has multiple Projects which has multiple Tasks, etc.
In my application a user is assigned a company and cannot query information that isn't tied to it. So whenever a user tries to retrieve Client/Project/Task/Punch I need to make sure that my query contains a Where clause that looks like WHERE companyID=[user's company id]. This add a lot of joins when I need to fetch Punch since I need to go up the chain to see if the company is the same as the user.
Since a client/project/task/punch will never switch from a company to another one, I was wondering if there's any red flag to add a companyID field in project/task/punch in order to simplify the querying ?
I'm using PostgreSQL
If I understand correctly, what you are buildng is a multitenant system, where your companies are the tenants. If that is the case then there are no red flags - on the contrary, your main concern is to isolate data belonging to different companies in the most efficient and most secure way.
I find this old blog post to be a basic but clear introduction to multitenancy.
The recommended way to go was then, and is today, the third option: one DB, many schemas. I'm no Postgres expert, but I believe it supports that option quite well.

Access 2007 - two read-only tables, several end users - design query

I've inherited an Access 2007 database with a remit to make it available to our sales reps (10 of them) who work on laptops, which can connect to a central server but they also have a requirement to use the db offline when they are on their travels. I'm not sure how to manage this.
The db is fairly straightforward: 3 tables for customers, prices and products.
Each rep needs to have his own individual customer table, but the prices and products tables are read-only so that they are updated centrally by one person. They are updated maybe half-a dozen times a year.
I though that the simplest way would be to create a copy of the database with all three tables on a central server for each of them, then tell them to download the new copy to their laptops whenever there are any updates.
Is there any way I could automate the process (such as giving them a button on a form to press with some vba coding behind the scenes to do the copy?)
Or is there better way of managing this?
Thanks

Database optimisation

I'm starting a web application that will be used by a lot of companies (over 20K), and most importantly a lot of information will be recorded daily. I would like your advice on the following idea: create a database for each company to do sql queries like this:
select * from enterprisedb1.tablename;
select * from enterprisedb2.tablename2 where enterprisedb2.tablename2.col='foo'
Pleace i need your advice, i don't find anything on google
If you are selling this to multiple clients then it might come down to separation of their data.
On the one hand everything for the app is in the one database for each client, and provided you get the connection string right you probably don't need to ever specify the company name again for the rest of the app. No more "where customer=123" on every single query.
Also means a client could be deleted, backed up, moved, audited, whatever in a completely independent manner.
And also means there is no risk of a developer or a query accidentally doing cross-client things. So you can even open up to generic query access that still cant accidentally cross a client-to-client border. And security set-up will be simpler.
But if you have a million clients you do end up with a lot of databases. How well this works will depend on all sorts of things, including your database of choice.
You also end up having multiple copies of reference data unless you create an additional database "common" or something like that.
Its going to be very much a "depends" answer, but that's a few things to consider.
I suggest to use common tables for each company. It will better to manage and easy to understand.
Create one table for company data and use Integer reference of that key in another mete data tables. For better performance, Index and Query must be well formed.

SQL Database Schema - Many to Many or Pivot Table

I'm trying to create a proper database layout for a project I'm working on, however I can't seem to work out which is best.
Basically the "app" is something where a User can be assigned to many Products, and a Product can be have many Customers.
From here, each Customer has a Service, which is specific to that Customer and Product.
A Service can have many Incidents, but an Incident can only be assigned to one Service.
A User can also have Incidents, but an Incident can only have one User.
Here is the two designs I have made for this:
http://i.imgur.com/ZcCFcdg.png
As you can see, the left design has a table specific tables for the Many-Many relationships, where as the right one has a overall pivot table for them all.
I see both of these methods working (in my head) - however since I'm not the best with this, are there any downsides to either of these methods? And do you see any problems I'll run into, in the future?
Also, is the right one even a proper way of doing it?
I'm also going to be using the Eloquent ORM.
Look up 'Third Normal Form'. This gives rules on how to design tables. You could take one or other of your current designs and apply the three rules to see where you get to.
I would say the one on the right is wrong: too much duplicate information, and unclear relationships. The one on the left is OK, but the two Many-Many tables are redundant. You know a Product and Customer are linked because it's in the Service table. You know a User and Product are linked because its in the Incident + Service tables.
Cheers -
After clarification, a User belongs to a Customer. In that case, add CustomerID to the User table - this is important data about the user and should be included. It will allow you to stop the User raising Incidents on Products not associated with the Customer.
This also will enable you to list the Products associated with the User via the Customer the User is associated with.
Further, the Customer should have it's ID on Product, and Service should have the ProductId, not the CustomerID, as the Service is associated with a Product and only with a Customer via the Product it is associated with.

Ideas for Combining Thousand Databases into One Database

We have a SQL server that has a database for each client, and we have hundreds of clients. So imagine the following: database001, database002, database003, ..., database999. We want to combine all of these databases into one database.
Our thoughts are to add a siteId column, 001, 002, 003, ..., 999.
We are exploring options to make this transition as smoothly as possible. And we would LOVE to hear any ideas you have. It's proving to be a VERY challenging problem.
I've heard of a technique that would create a view that would match and then filter.
Any ideas guys?
Create a client database id for each of the client databases. You will use this id to keep the data logically separated. This is the "site id" concept, but you can use a derived key (identity field) instead of manually creating these numbers. Create a table that has database name and id, with any other metadata you need.
The next step would be to create an SSIS package that gets the ID for the database in question and adds it to the tables that have to have their data separated out logically. You then can run that same package over each database with the lookup for ID for the database in question.
After you have a unique id for the data that is unique, and have imported the data, you will have to alter your apps to fit the new schema (actually before, or you are pretty much screwed).
If you want to do this in steps, you can create views or functions in the different "databases" so the old client can still hit the client's data, even though it has been moved. This step may not be necessary if you deploy with some downtime.
The method I propose is fairly flexible and can be applied to one client at a time, depending on your client application deployment methodology.
Why do you want to do that?
You can read about Multi-Tenant Data Architecture and also listen to SO #19 (around 40-50 min) about this design.
The "site-id" solution is what's done.
Another possibility that may not work out as well (but is still appealing) is multiple schemas within a single database. You can pull common tables into a "common" schema, and leave the customer-specific stuff in customer-specific schema. In some database products, however, the each schema is -- effectively -- a separate database. In other products (Oracle, DB2, for example) you can easily write queries that work in multiple schemas.
Also note that -- as an optimization -- you may not need to add siteId column to EVERY table.
Sometimes you have a "contains" relationship. It's a master-detail FK, often defined with a cascade delete so that detail cannot exist without the parent. In this case, the children don't need siteId because they don't have an independent existence.
Your first step will be to determine if these databases even have the same structure. Even if you think they do, you need to compare them to make sure they do. Chances are there will be some that are customized or missed an upgrade cycle or two.
Now depending on the number of clients and the number of records per client, your tables may get huge. Are you sure this will not create a performance problem? At any rate you may need to take a fresh look at indexing. You may need a much more powerful set of servers and may also need to partion by client anyway for performance.
Next, yes each table will need a site id of some sort. Further, depending on your design, you may have primary keys that are now no longer unique. You may need to redefine all primary keys to include the siteid. Always index this field when you add it.
Now all your queries, stored procs, views, udfs will need to be rewritten to ensure that the siteid is part of them. PAy particular attention to any dynamic SQL. Otherwise you could be showing client A's information to client B. Clients don't tend to like that. We brought a client from a separate database into the main application one time (when they decided they didn't still want to pay for a separate server). The developer missed just one place where client_id had to be added. Unfortunately, that sent emails to every client concerning this client's proprietary information and to make matters worse, it was a nightly process that ran in the middle of the night, so it wasn't known about until the next day. (the developer was very lucky not to get fired.) The point is be very very careful when you do this and test, test, test, and test some more. Make sure to test all automated behind the scenes stuff as well as the UI stuff.
what I was explaining in Florence towards the end of last year is if you had to keep the database names and the logical layer of the database the same for the application. In that case you'd do the following:
Collapse all the data into consolidated tables into one master, consolidated database (hereafter referred to as the consolidated DB).
Those tables would have to have an identifier like SiteID.
Create the new databases with the existing names.
Create views with the old table names which use row-level security to query the tables in the consolidated DB, but using the SiteID to filter.
Set up the databases for cross-database ownership chaining so that the service accounts can't "accidentally" query the base tables in the consolidated DB. Access must happen through the views or through stored procedures and other constructs that will enforce row-level security. Now, if it's the same service account for all sites, you can avoid the cross DB ownership chaining and assign the rights on the objects in the consolidated DB.
Rewrite the stored procedures to either handle the change (since they are now referring to views and they don't know to hit the base tables and include SiteID) or use InsteadOf Triggers on the views to intercept update requests and put the appropriate site specific information into the base tables.
If the data is large you could look at using a partioned view. This would simplify your access code as all you'd have to maintain is the view; however, if the data is not large, just add a column to identify the customer.
Depending on what the data is and your security requirements the threat of cross contamination may be a show stopper.
Assuming you have considered this and deem it "safe enough". You may need/want to create VIEWS or impose some other access control to prevent customers from seeing each-other's data.
IIRC a product called "Trusted Oracle" had the ability to partition data based on such a key (about the time Oracle 7 or 8 was out). The idea was that any given query would automagically have "and sourceKey = #userSecurityKey" (or some such) appended. The feature may have been rolled into later versions of the popular commercial product.
To expand on Gregory's answer, you can also make a parent ssis that calls the package doing the actual moving within a foreach loop container.
The parent package queries a config table and puts this in an object variable. The foreach loop then uses this recordset to pass variables to the package, such as your database name and any other details the package might need.
You table could list all of your client databases and have a flag to mark when you are ready to move them. This way you are not sitting around running the ssis package on 32,767 databases. I'm hooked on the foreach loop in ssis.