Is it safe to insert into a CRM database using SQL? - sql

We need to insert data (8k records) into a CRM entity; the data will come from other CRM entities. Currently we are doing it through code, but it takes too much time (hours). I was wondering whether inserting directly into the CRM database with SQL would be a lot easier and take only minutes. But before moving forward I have a few questions:
1: Is it safe to insert directly into the CRM database using SQL?
2: What is the best practice for inserting data into CRM using SQL?
3: What things should I consider before trying it?
EDIT:
4: How do I increase the insert performance?

No, it is not. It is considered unsupported.
Don't do it.
Rollup 12 was just released and contains a new API feature. There is now an ExecuteMultipleRequest, which can be used for batched bulk imports. See http://msdn.microsoft.com/en-us/library/jj863631.aspx
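The batching idea itself is independent of the SDK and can be sketched roughly as follows. This is only an illustration in Python, not the CRM API: `send_batch` is a hypothetical stand-in for whatever issues one batched request, and the default batch size of 1000 is an assumption, so check the limit your deployment allows.

```python
from typing import Callable, Iterable, Iterator, List


def chunked(records: Iterable[dict], batch_size: int) -> Iterator[List[dict]]:
    """Yield successive lists of at most batch_size records."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[dict] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:  # final, possibly smaller, batch
        yield batch


def import_in_batches(
    records: Iterable[dict],
    send_batch: Callable[[List[dict]], None],  # hypothetical: one batched request per call
    batch_size: int = 1000,  # assumed value, not taken from the SDK
) -> int:
    """Send the records batch by batch and return the number of round trips."""
    round_trips = 0
    for batch in chunked(records, batch_size):
        send_batch(batch)
        round_trips += 1
    return round_trips
```

With 8000 records and a batch size of 1000, this makes 8 round trips instead of 8000.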

It shouldn't take hours to insert 8000 records. It would help to see your code, but here are some things to consider to improve performance:
Reuse your IOrganizationService. I've found a 10x increase in performance by reusing an IOrganizationService rather than creating a new one for each record that is being updated.
Use multi-threading. You have to be careful with this one, because it could lead to worse performance if the function that checks whether the entity exists is your bottleneck.
Tweak your exists function. If the check for the entity existing is taking a long time, consider pulling back the entire table and storing it in memory (assuming it's not ridiculously huge). This would remove 8000 separate select statements.
Turn off plugins that may be degrading performance. If you have any plugins registered on the entity, see if performance increases if you disable them during the import.
Create a new, "How do I increase the insert performance" question with your code posted for additional help.
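To illustrate the "tweak your exists function" point, here is a minimal sketch in Python (not CRM SDK code). `fetch_all_keys` and `create_record` are hypothetical callbacks standing in for one bulk query and one create call through the supported API, and `external_id` is an assumed key field.

```python
from typing import Callable, Iterable, List, Set


def import_missing(
    records: List[dict],
    fetch_all_keys: Callable[[], Iterable[str]],  # hypothetical: one bulk read
    create_record: Callable[[dict], None],        # hypothetical: one create call
    key_field: str = "external_id",               # assumed name of the matching key
) -> int:
    """Create only the records whose key is not already known; return how many."""
    # One bulk read up front replaces a separate existence check per record.
    existing: Set[str] = set(fetch_all_keys())
    created = 0
    for record in records:
        key = record[key_field]
        if key in existing:  # in-memory lookup, no round trip
            continue
        create_record(record)
        existing.add(key)  # keeps duplicates inside the input from being created twice
        created += 1
    return created
```

This only works if nothing else is creating the same records while the import runs, and if the key set fits comfortably in memory.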

I have not used the CRM application you are referring to, but if you bypass the code you might bypass certain restrictions or even triggers that the code has in place based on certain values sent in.
For example, if you sent a number in through the code, it might perform some mathematical function on that number and add it to some other value and end up storing two values in the database (one value for the number you entered, and another value representing the total including the newly added one).
So if you had just inserted the one value straight into the database, then the total wouldn't get updated with it.
That is just a hypothetical scenario. You may not run into any problems like that or any others, but there could be the chance.
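The hypothetical scenario above can be reproduced in a few lines. This sketch uses Python's built-in sqlite3 with made-up `payment` and `account_total` tables; it says nothing about how the CRM product actually stores data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE payment (id INTEGER PRIMARY KEY, amount REAL NOT NULL);
    CREATE TABLE account_total (
        id INTEGER PRIMARY KEY CHECK (id = 1),
        total REAL NOT NULL
    );
    INSERT INTO account_total (id, total) VALUES (1, 0);
    """
)


def add_payment(amount: float) -> None:
    """Application path: stores the value AND maintains the running total."""
    with conn:  # both writes commit together
        conn.execute("INSERT INTO payment (amount) VALUES (?)", (amount,))
        conn.execute(
            "UPDATE account_total SET total = total + ? WHERE id = 1", (amount,)
        )


add_payment(100.0)  # goes through the application code

# Direct SQL insert: the row is stored, but the total is never touched.
conn.execute("INSERT INTO payment (amount) VALUES (50.0)")
conn.commit()

stored_sum = conn.execute("SELECT SUM(amount) FROM payment").fetchone()[0]
running_total = conn.execute("SELECT total FROM account_total").fetchone()[0]
print(stored_sum, running_total)  # the two values no longer agree
```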

Well, I found this article very helpful. It says:
The direct SQL writes to CRM database are not supported. The reason for this is that creating a record in CRM database is so much more than just an INSERT INTO… statement. The first step of optimizing is to understand what happens behind the scenes and can affect the speed:
1. CRM entities usually consist of 2 physical tables.
2. Cascade rules/Sharing: If created record has any relationships with
cascade rules, web service will handle the cascades automatically.
For example cascaded sharing will lead to additional records being
created in PrincipalObjectAccess table. In case of one-time
migrations, disabling the cascade rules while migration runs can
save lot of time
3. Record Ownership: If you are inserting records, make sure you are
setting the owner as an attribute for create and not as an
additional owner assign request. Assigning owner actually takes
4. Money/Time: Web Service handles currencies and time zones.
5. Workflows/Plugins: If the system has any custom workflows and/or
plugins, I strongly recommend pausing them for the duration of
migration.

Related

Trigger for a lot of data

I have a table that records a lot of information at any moment, for example, 100 rows per second.
After completing each row, certain operations must be performed. That is, some of these rows should be copied to another table.
Now a few questions:
Can I use triggers to do this, given the high rate of incoming rows?
If multiple conditions are checked before copying to the other table, can the triggers still be responsive?
Additional explanation: the records added to this table are added by a fingerprint recorder.
First of all, consider these points:
1. Depending on how you define the trigger, it can fire on insert, update, etc., including for operations that do not need it (it is not required for every insert).
2. Over time you can forget that this business logic lives in the trigger when you change some of your application's rules.
3. You need to pay attention to it on every change so that you do not introduce bugs.
4....
I strongly suggest that you do not define a trigger unless you have no other choice.
If you have an application, you can put the business logic there instead
(for instance, create a thread in your application that checks for new rows and does the work).
You can have a Windows service do it for you.
If you only have database access, you can define a job in the database to do it for you (not recommended).
Finally, if you decide to use multiple threads (per your question, the second thread only reads data from the original table and inserts it into the other one), you can avoid blocking by turning on READ_COMMITTED_SNAPSHOT for your database (shown as is_read_committed_snapshot_on in sys.databases).
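The "do it from the application" alternative could look roughly like the sketch below. It uses Python's sqlite3 purely as a stand-in for the real database; the `scan`, `accepted_scan` and `copy_state` tables and the `accepted = 1` condition are assumptions made up for the example.

```python
import sqlite3

# Hypothetical schema: raw rows, the filtered copy, and a high-water mark.
SCHEMA = """
CREATE TABLE scan (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    person_id INTEGER NOT NULL,
    accepted INTEGER NOT NULL
);
CREATE TABLE accepted_scan (scan_id INTEGER PRIMARY KEY, person_id INTEGER NOT NULL);
CREATE TABLE copy_state (
    id INTEGER PRIMARY KEY CHECK (id = 1),
    last_scan_id INTEGER NOT NULL
);
INSERT INTO copy_state (id, last_scan_id) VALUES (1, 0);
"""


def copy_new_rows(conn: sqlite3.Connection) -> int:
    """Copy qualifying rows added since the last run; return how many were copied."""
    with conn:  # the copy and the high-water mark update commit together
        last_id = conn.execute(
            "SELECT last_scan_id FROM copy_state WHERE id = 1"
        ).fetchone()[0]
        max_id = conn.execute(
            "SELECT COALESCE(MAX(id), ?) FROM scan", (last_id,)
        ).fetchone()[0]
        before = conn.total_changes
        conn.execute(
            "INSERT INTO accepted_scan (scan_id, person_id) "
            "SELECT id, person_id FROM scan "
            "WHERE id > ? AND id <= ? AND accepted = 1",
            (last_id, max_id),
        )
        copied = conn.total_changes - before
        conn.execute(
            "UPDATE copy_state SET last_scan_id = ? WHERE id = 1", (max_id,)
        )
    return copied
```

A worker thread or service would call `copy_new_rows` on a timer, so the inserts coming from the recorder never wait on the copy logic.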

Should I create separate SQL Server database for each user?

I am working on an ASP.NET MVC web application; the back end is SQL Server 2012.
This application will provide billing, accounting, and inventory management. Users will create an account by signing up, just like http://www.quickbooks.in. Each user will create some master records and various transactions. There is no limit; a user can create unlimited records in the database.
I want to keep database performance stable under heavy data load. I am maintaining proper indexes and primary keys, but there will be a heavy load on the database per user.
So, should I create a separate database for each user, or should I maintain one database, add a UserID column to each table, and partition on UserID?
I am not an expert in SQL Server, so please provide suggestions with clear specifications.
Please inform me if there is any lack of information.
A DB per user is what happens when customers need to be able to pack up and leave, taking the actual database with them. Think of a self-hosted WordPress website. Or when there are incredible risks to one user accidentally seeing another user's data, so it's safer to rely on the server's security model than to rely on remembering to add the UserId filter to all your queries. I can't imagine a scenario like that, but who knows -- maybe if the privacy laws allowed for jail time, I would rather have data partitioned by security rules than by carefully written WHERE clauses.
If you did go database-per-user, creating a new user would be 10x more effort. While INSERT, UPDATE and so on stay the same from version to version, the syntax for database creation, user creation, permission granting and so on will evolve enough to break those scripts with each SQL Server version upgrade.
Also, this will multiply your migration headaches by the number of users. Let's say you have 5000 users and you need to add some new columns, change a column's data type, update a trigger, and so on. Instead of needing to run that change script once, you need to run it 5000 times.
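To make the migration point concrete, here is a rough sketch of what "run it 5000 times" means in practice, including keeping track of the databases where the script failed. It uses Python's sqlite3 as a stand-in for SQL Server, and `migrate_all` is a hypothetical helper, not an existing tool.

```python
import sqlite3
from typing import Dict, List


def migrate_all(databases: Dict[str, sqlite3.Connection], script: str) -> List[str]:
    """Apply the same change script to every database; return the names that failed."""
    failed: List[str] = []
    for name, conn in databases.items():
        try:
            conn.executescript(script)
        except sqlite3.Error:
            conn.rollback()  # discard a half-applied script for this database
            failed.append(name)
    return failed


# Wrapping the change in an explicit transaction lets a failure be rolled back.
ADD_CURRENCY = """
BEGIN;
ALTER TABLE invoice ADD COLUMN currency TEXT;
COMMIT;
"""
```

With a single shared database the same change is one script run and one failure mode; here every entry in the returned list is a customer left on the old schema until someone follows up.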
Per-user databases also probably waste disk space. Each of those databases is going to have a transaction log, sitting idle and taking up the minimum log space.
As for load, if collectively your 5000 users are doing 1 billion inserts, updates and so on per day, my intuition tells me that it's going to be faster on one database, unless there is some sort of contention issue (everyone reading and writing the same pages of the same table at the same time). Each database consumes machine resources (probably threads and memory) for housekeeping, so these extra databases can't be free.
Anyhow, the best thing to do is to simulate the two architectures and use a random data generator to simulate load and see how they perform.
It's not an easy answer to give.
First, there is logical design to be considered. Then you have integrity, security, management and performance (in this very order).
A database is a logical unit of data, self contained. Ideally, you should be able to take a database, move it to another instance, probably change the connection strings and be running again.
All the constraints are database-level. No foreign keys can exist referencing some object outside the database.
So, try thinking in these terms first.
How would you reliably prevent one user from messing up another user's data? Keep in mind that it's just a matter of time before someone opens an Excel sheet and fires up queries against the database, bypassing your application. Row-level security in SQL Server is something you don't want to deal with.
Multiple databases mean that all management tasks should be scripted out and executed on all databases. Yes, there is some overhead to it, but once you set it up it's just a matter of monitoring. If a database goes suspect, it's a single customer down, not all of them. You can even have different versions for different customers if each customer has its own database. Additionally, if you roll out an upgrade, you can do it per customer, so the impact will be much less.
Performance is the least relevant factor here. Of course, it really depends on how many customers and how much data, but proper indexing will solve these issues. Scale-out is much easier with multiple databases.
BTW, partitioning, as you mentioned it, is never a performance booster, it's simply a management feature, allowing for faster loading and evicting of data from a table.
I'd probably put each customer in separate database, but it's up to you eventually to make a decision for yourself. Hope I've helped some with this.

Should I create multiple tables, or even databases for multiple users of a CRM

I'm working on creating an application best described as a CRM. There is a relatively complex table structure, and I'm thinking about allowing users to do a fair bit of customization (adding fields and the like). One concern is that I will be reaching a certain level of scale almost immediately. We have about 50,000 individual users who will be coming online within about nine months of launch. So I want to build to last.
I'm thinking about two and maybe even three options.
One table set with a userID column on everything, plus custom attributes handled by two extra tables: one that indexes the custom attributes and another that holds their values, which can then be joined to the user's existing contact records. -- From what I've read, this seems like the right option, but I keep feeling like it's not. It seems like once these tables start reaching millions of records, searching for just one user's records in every query is going to become a database hog.
For each user account, recreate the table set, prefixed with a unique identifier (the userID, for example). Then, rather than using WHERE userID=? everywhere, I can use FROM ?_contacts. For attributes I could then have a custom attributes table where users could add additional columns for custom attributes. -- This feels like the simplest way to go, though of course when I decide to change the database structure there would be a migration from hell.
The third option, which I'm pretty confident is wrong, but for that reason alone I can not rule out, is that a new database should be created for each user with all the requisite tables.
Am I crazy? Is option one really the best?
The first method is the best. Create individual userIds, and then you can assign specific roles to them. Database retrieval time does depend on the number of records, but you can offset that by writing efficient SQL queries to fetch the data. According to this site, you probably won't run out of memory or run into concurrency issues, because with a good server the performance ought to be good, provided that you write efficient queries.
If you recreate table sets, you will just end up creating lots of tables, which can make indexing slow and is bad practice. Instead, opt for a relational database schema rather than an ordinary one, and normalize the database and its tables to improve efficiency.
Creating a new database for each and every user just combines the complexity of both of the above, resulting in shabby and disorganized database access. If you decide to run an individual database instance for every single user, you will just end up consuming your server's physical resources, such as RAM and CPU, which will affect the service quality for all the other users.
Take up option 1. Assign separate userIds and assign them roles and privileges where needed. That is more efficient than the other two methods.
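For what it's worth, option 1 from the question (shared tables with a userID column plus attribute and value tables) could be sketched like this. The example uses Python's sqlite3 so it runs anywhere; the table and column names are made up, and the index on `user_id` is what keeps the per-user filter from scanning everyone's rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE contact (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        name TEXT NOT NULL
    );
    CREATE INDEX ix_contact_user ON contact (user_id);

    -- Which custom attributes each user has defined.
    CREATE TABLE custom_attribute (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        name TEXT NOT NULL,
        UNIQUE (user_id, name)
    );

    -- One row per (contact, attribute) pair that has a value.
    CREATE TABLE custom_attribute_value (
        contact_id INTEGER NOT NULL REFERENCES contact (id),
        attribute_id INTEGER NOT NULL REFERENCES custom_attribute (id),
        value TEXT,
        PRIMARY KEY (contact_id, attribute_id)
    );
    """
)


def contacts_with_attributes(user_id: int) -> dict:
    """Return {contact_id: {"name": ..., "custom": {attribute: value}}} for one user."""
    rows = conn.execute(
        """
        SELECT c.id, c.name, a.name, v.value
        FROM contact AS c
        LEFT JOIN custom_attribute_value AS v ON v.contact_id = c.id
        LEFT JOIN custom_attribute AS a ON a.id = v.attribute_id
        WHERE c.user_id = ?
        ORDER BY c.id, a.name
        """,
        (user_id,),
    )
    result: dict = {}
    for contact_id, contact_name, attr_name, value in rows:
        entry = result.setdefault(contact_id, {"name": contact_name, "custom": {}})
        if attr_name is not None:  # contacts without custom values still appear
            entry["custom"][attr_name] = value
    return result
```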

Is having a copy of SQL data in an application a good idea to save SQL SELECTS?

I am working on a multithreaded .NET 4 application which acquires data continuously and writes it into a SQL database (MySQL or SQL Server - not yet sure).
Every time an INSERT is executed, at least one prior SELECT is necessary in order to synchronize with the database. This means the application gets a block that contains new and old data and then has to check which data sets are new and which are already in the database.
This means a lot of SELECTs, which return more or less the same data every time.
Would it be a good idea to have a copy of the last x entries per table within the application?
This way the synchronization could be done on the copy instead of the database.
Pro:
Faster
Contra:
Uses a lot of memory
Risk of becoming unsynchronized with the database
What do you think? What is the best practice for such a use case?
Any other pros and cons?
Unless you have an external program writing to your database at the same time, you can use buffering.
But instead of buffering SELECT results, just add to the insert method a buffer of the last X (a reasonable number) insertion requests, and only insert the new one if it isn't on that list.
You might also want to lock the insertion method, to make sure the inclusion check is always correct.
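A minimal sketch of that buffered, locked insert method, written in Python rather than .NET to keep it short. `DedupingInserter` and its `insert` callback are hypothetical names, and the capacity of 1000 is an arbitrary assumption for "a reasonable number".

```python
import threading
from collections import OrderedDict
from typing import Callable, Hashable


class DedupingInserter:
    """Remembers the last `capacity` keys and skips inserts it has already seen."""

    def __init__(self, insert: Callable[[Hashable], None], capacity: int = 1000):
        self._insert = insert  # hypothetical: performs the real INSERT
        self._capacity = capacity
        self._recent: "OrderedDict[Hashable, None]" = OrderedDict()
        self._lock = threading.Lock()

    def insert_if_new(self, key: Hashable) -> bool:
        """Insert unless the key is in the recent buffer; return True if inserted."""
        with self._lock:  # check and insert happen as one step across threads
            if key in self._recent:
                self._recent.move_to_end(key)  # treat a repeat as a recent request
                return False
            self._insert(key)
            self._recent[key] = None
            if len(self._recent) > self._capacity:
                self._recent.popitem(last=False)  # forget the oldest key
            return True
```

Holding the lock while inserting serializes the writes, which is the price of a correct check. A key that has fallen out of the buffer will be inserted again, so the buffer has to be larger than the overlap between incoming blocks, or the table needs a unique constraint as a backstop.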
If you have multiple processes writing to the database, it is non-trivial to maintain perfect synchronization between in-memory data and the database. In fact the only way to confirm you are synchronized is by making a SELECT query on database. So you have a trade-off between perfect synchronization and synchronization with some tolerance which is very efficient.
My suggestions, which may help in both cases, would be:
Tune your SELECT queries. Add indexes if necessary.
Create meta-data, like version numbers. So that you have to only check something very trivial to determine if you need synchronization.
Write a stored proc which implements your SELECT and INSERT logic. Then you do not have to worry about making multiple calls to the database.

Versioning data in SQL Server so user can take a certain cut of the data

I have a requirement that in a SQL Server backed website which is essentially a large CRUD application, the user should be able to 'go back in time' and be able to export the data as it was at a given point in time.
My question is what is the best strategy for this problem? Is there a systematic approach I can take and apply it across all tables?
Depending on what exactly you need, this can be relatively easy or hell.
Easy: Make a history table for every table, copy data there pre update or post insert/update (i.e. new stuff is there too). Never delete from the original table, make logical deletes.
Hard: There is an fdb version counting up on every change, every data item is correlated to start and end. This requires very fancy primary key mangling.
Just to add a small comment to the previous answers: if you need to go back for all users, you can use snapshots.
The simplest solution is to save a copy of each row whenever it changes. This can be done most easily with a trigger. Then your UI must provide search abilities to go back and find the data.
This does produce an explosion of data, which gets worse when tables are updated frequently, so the next step is usually some kind of data-based purge of older data.
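A small sketch of the trigger-based history approach, using Python's sqlite3 so it is runnable as-is; SQL Server trigger syntax differs, and the `product` table, its `updated_at` column (set by the application) and the `as_of` helper are assumptions made up for the example. Deletes are not handled here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (
        id INTEGER PRIMARY KEY,
        price REAL NOT NULL,
        updated_at TEXT NOT NULL          -- ISO timestamp set by the application
    );
    CREATE TABLE product_history (
        version INTEGER PRIMARY KEY AUTOINCREMENT,
        id INTEGER NOT NULL,
        price REAL NOT NULL,
        changed_at TEXT NOT NULL
    );
    -- Save a copy of the row whenever it is created or changed.
    CREATE TRIGGER product_after_insert AFTER INSERT ON product BEGIN
        INSERT INTO product_history (id, price, changed_at)
        VALUES (NEW.id, NEW.price, NEW.updated_at);
    END;
    CREATE TRIGGER product_after_update AFTER UPDATE ON product BEGIN
        INSERT INTO product_history (id, price, changed_at)
        VALUES (NEW.id, NEW.price, NEW.updated_at);
    END;
    """
)


def as_of(moment: str) -> list:
    """Return (id, price) for every product as it stood at the given ISO timestamp."""
    return conn.execute(
        """
        SELECT h.id, h.price
        FROM product_history AS h
        WHERE h.version = (
            SELECT MAX(h2.version)
            FROM product_history AS h2
            WHERE h2.id = h.id AND h2.changed_at <= ?
        )
        ORDER BY h.id
        """,
        (moment,),
    ).fetchall()
```

The export for a given point in time then reads only from the history table, and a purge job can delete history rows older than the earliest date users are allowed to go back to.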
An implementation you could look at is Team Foundation Server. It has the ability to perform historical queries (using the WIQL keyword ASOF). The backend is SQL Server, so there might be some clues there.