Is database replication the way to go to keep production and development databases in sync?

Is database replication the way to go to keep production and development databases in sync? - sql

I am not a DBA; however, my small company is using SQL Server for a project that we are working on. On the same SQL Server instance there is a MS Great Plains (Dynamics GP) database - as we pass data back and forth between the two databases (mainly a scribe process getting our data and transferring it into GP).
We are using database replication (snapshot) as a means of syncing our production and development (and soon DR) environments. Right now its set to replicate every three hours during core business hours - mainly to keep production and development up to date for us while we are working.
1) Is this the correct way of doing such a thing? Is there a better way?
2) Does this stress the server or the SQL Server? Is this a possible cause of GP database issues because they are on the same server and instance?
3) Replication only occurs on the non GP database - this shouldn't affect the GP database at all right?
Our database should stay rather small. In doing the snapshot, it is my understanding that tables get locked while the replication is going on. Do the tables stay locked until the entire replication is done or are they off loading after they are completed as the process continues?

There are many ways to sync a SQL Server with another. There is replication which you are currently using, log shipping, backup/restore, mirroring, and Always On to name a few methods.
The "best" method depends on your requirements. If you're concerned about disaster recovery, snapshot replication is not a great option and I would look into AlwaysOn Availability Groups.
If load on your production system is a concern I would look into nightly restoring a backup of the production system.
To answer your specific questions:
1) Is this the correct way of doing such a thing? Is there a better way?
This answer depends on your exact requirements
2) Does this stress the server or the SQL Server?
Doing something is always more work than doing nothing. Depending on many factors this could affect your production server.
3) Replication only occurs on the non GP database - this shouldn't affect the GP database at all right?
Your server only has a finite amount of hardware resources. It could affect the performance of queries against the GP database

We have found that having replication in place also adds complexity when it comes to upgrades and schema changes. If you must have dev and prod in sync (and I would argue about that) Always On or log shipping would be my preferred techniques.
DR is a separate issue. You have to determine your Recovery Point Objective (RPO) and Recovery Time Objective (RTO) and adopt the appropriate technology to satisfy your requirements.

Related

Trigger Based Replication (Live Sync) OR Transactional Replication in MSSQL

can someone give me a clear idea about which technique/ method is more reliable, less memory consuming and faster in replicating data from one Database to another in MSSQL database(SQl Server 2012) and why. We are in the process of developing a Live GPS based tracking application and I am confused with which method to proceed with
Trigger Based Replication (Live Sync)
(OR)
Transactional Replication
Thanks in Advance ☺

I would recommend using standardised solutions whenever possible. Within the choice given to you, transaction replication should be an obvious favourite, because:
It doesn't require any coding and can be deployed using standard tools. This makes it much faster to deploy and maintain - any proper DBA can do it, some of them even being blindfolded.
Actual data transfer is done by replication agents which are separate applications external to the SQL Server process and client connections. Any network issues within the publisher-distributor-subscriber(s) chain will lead to delays in copying the data, but they will not affect the performance of the publisher database itself.
With triggers, you have neither of these advantages: you will have to add a lot of code, and sluggish network will make data-changing queries slower, potentially leading to timeouts.
Of course, there are many more ways to move the data between the databases in SQL Server, such as (in no particular order):
AlwaysOn Availability Groups (Database mirroring);
Log shipping;
CDC (Change Data Capture);
Service Broker.
However, given your needs, transaction replication still looks like your best bet, overall.

Azure SQL Database vs. MS SQL Server on Dedicated Machine

I'm currently running an instance of MS SQL Server 2014 (12.1.4100.1) on a dedicated machine I rent for $270/month with the following specs:
Intel Xeon E5-1660 processor (six physical 3.3ghz cores +
hyperthreading + turbo->3.9ghz)
64 GB registered DDR3 ECC memory
240GB Intel SSD
45000 GB of bandwidth transfer
I've been toying around with Azure SQL Database for a bit now, and have been entertaining the idea of switching over to their platform. I fired up an Azure SQL Database using their P2 Premium pricing tier on a V12 server (just to test things out), and loaded a copy of my existing database (from the dedicated machine).
I ran several sets of queries side-by-side, one against the database on the dedicated machine, and one against the P2 Azure SQL Database. The results were sort of shocking: my dedicated machine outperformed (in terms of execution time) the Azure db by a huge margin each time. Typically, the dedicated db instance would finish in under 1/2 to 1/3 of the time that it took the Azure db to execute.
Now, I understand the many benefits of the Azure platform. It's managed vs. my non-managed setup on the dedicated machine, they have point-in-time restore better than what I have, the firewall is easily configured, there's geo-replication, etc., etc. But I have a database with hundreds of tables with tens to hundreds of millions of records in each table, and sometimes need to query across multiple joins, etc., so performance in terms of execution time really matters. I just find it shocking that a ~$930/month service performs that poorly next to a $270/month dedicated machine rental. I'm still pretty new to SQL as a whole, and very new to servers/etc., but does this not add up to anyone else? Does anyone perhaps have some insight into something I'm missing here, or are those other, "managed" features of Azure SQL Database supposed to make up the difference in price?
Bottom line is I'm beginning to outgrow even my dedicated machine's capabilities, and I had really been hoping that Azure's SQL Database would be a nice, next stepping stone, but unless I'm missing something, it's not. I'm too small of a business still to go out and spend hundreds of thousands on some other platform.
Anyone have any advice on if I'm missing something, or is the performance I'm seeing in line with what you would expect? Do I have any other options that can produce better performance than the dedicated machine I'm running currently, but don't cost in the tens of thousand/month? Is there something I can do (configuration/setting) for my Azure SQL Database that would boost execution time? Again, any help is appreciated.
EDIT: Let me revise my question to maybe make it a little more clear: is what I'm seeing in terms of sheer execution time performance to be expected, where a dedicated server # $270/month is well outperforming Microsoft's Azure SQL DB P2 tier # $930/month? Ignore the other "perks" like managed vs. unmanaged, ignore intended use like Azure being meant for production, etc. I just need to know if I'm missing something with Azure SQL DB, or if I really am supposed to get MUCH better performance out of a single dedicated machine.

(Disclaimer: I work for Microsoft, though not on Azure or SQL Server).
"Azure SQL" isn't equivalent to "SQL Server" - and I personally wish that we did offer a kind of "hosted SQL Server" instead of Azure SQL.
On the surface the two are the same: they're both relational database systems with the power of T-SQL to query them (well, they both, under-the-hood use the same DBMS).
Azure SQL is different in that the idea is that you have two databases: a development database using a local SQL Server (ideally 2012 or later) and a production database on Azure SQL. You (should) never modify the Azure SQL database directly, and indeed you'll find that SSMS does not offer design tools (Table Designer, View Designer, etc) for Azure SQL. Instead, you design and work with your local SQL Server database and create "DACPAC" files (or special "change" XML files, which can be generated by SSDT) which then modify your Azure DB such that it copies your dev DB, a kind of "design replication" system.
Otherwise, as you noticed, Azure SQL offers built-in resiliency, backups, simplified administration, etc.
As for performance, is it possible you were missing indexes or other optimizations? You also might notice slightly higher latency with Azure SQL compared to a local SQL Server, I've seen ping times (from an Azure VM to an Azure SQL host) around 5-10ms, which means you should design your application to be less-chatty or to parallelise data retrieval operations in order to reduce page load times (assuming this is a web-application you're building).

Perf and availability aside, there are several other important factors to consider:
Total cost: your $270 rental cost is only one of many cost factors. Space, power and hvac are other physical costs. Then there's the cost of administration. Think work you have to do each patch Tuesday and when either Windows or SQL Server ships a service pack or cumulative update. Even if you don't test them before rolling out, it still takes time and effort. If you do test, then there's a second machine and duplicating the product instance and workload for test.
Security: there is a LOT written about how bad and dangerous and risky it is to store any data you care about in the cloud. Personally, I've seen way worse implementations and processes on security with local servers (even in banks and federal agencies) than I've seen with any of the major cloud providers (Microsoft, Amazon, Google). It's a lot of work getting things right then even more work keeping them right. Also, you can see and audit their security SLAs (See Azure's at http://azure.microsoft.com/en-us/support/trust-center/).
Scalability: not just raw scalability but the cost and effort to scale. Azure SQL DB recently released the huge P11 edition which has 7x the compute capacity of the P2 you tested with. Scaling up and down is not instantaneous but really easy and reasonably quick. Best part is (for me anyway), it can be bumped to some higher edition when I run large queries or reindex operations then back down again for "normal" loads. This is hard to do with a regular SQL Server on bare metal - either rent/buy a really big box that sits idle 90% of the time or take downtime to move. Slightly easier if in a VM; you can increase memory online but still need to bounce the instance to increase CPU; your Azure SQL DB stays online during scale up/down operations.

There is an alternative from Microsoft to Azure SQL DB:
“Provision a SQL Server virtual machine in Azure”
https://azure.microsoft.com/en-us/documentation/articles/virtual-machines-provision-sql-server/
A detailed explanation of the differences between the two offerings: “Understanding Azure SQL Database and SQL Server in Azure VMs”
https://azure.microsoft.com/en-us/documentation/articles/data-management-azure-sql-database-and-sql-server-iaas/
One significant difference between your stand alone SQL Server and Azure SQL DB is that with SQL DB you are paying for high levels of availability, which is achieved by running multiple instances on different machines. This would be like renting 4 of your dedicated machines and running them in an AlwaysOn Availability Group, which would change both your cost and performance. However, as you never mentioned availability, I'm guessing this isn't a concern in your scenario. SQL Server in a VM may better match your needs.

SQL DB has built in availability (which can impact performance), point in time restore capability and DR features. You have the option to scale up / down your DB based on your usage to reduce the cost. You can improve your query performance using Global query (shard data). SQl DB manages auto upgrades and patching and greatly improves the manageability story. You may need to pay a little premium for that. Application level caching / evenly distributing the load, downgrading when cold etc. may help improve your database performance and optimize the cost.

SQL Server full copy of database for read operations

Please advise what suits my problem better. I have a highload web app hosted on the same server where SQL server is hosted. I also have SQL Service reporting running on the same server, generating user reports.
So my server basically works on top of disk read/write speed. I'm going to get another server and install there another SQL server in order to host SSRS there. So my criteria is to get as fresh data as it possible.
I've looked a couple of solution, currently I do make backup via jobs, copy it to second server and restore it there, also via jobs. But that's not the best solution.
All replication mechanism(transaction, merge, snapshot) affect publisher database by locking it's table, what is unacceptable for me.
So I wonder is there any possibility to create a replica with read only access, that would be synced periodically not affecting main db? I would put all report load to that replica and make my primary db be used only by web app.
What solution might suit my problem? As I'm not a DBA, I'd start investigating that direction. Thanks.

Transactional Replication is typically used to off-load reporting to another server/instance and can be near real-time in a best case scenario. The benefit of Transactional Replication is you can place different indexes on the subscriber(s) to optimize reporting. You can also choose to replicate only a portion of the data if only a subset is needed for reporting.
The only time locking occurs with Transactional Replication is when you generate a snapshot. With concurrent snapshot processing, which is the default for Transactional Replication, the shared locks are only held for a short period of time, so users are able to continue working uninterrupted. Either way, this shouldn't be an issue since you'll likely be generating the snapshot during a period of low user activity anyway.

should i advocate migrating from access to (my)sql

We have a windows MFC app that is written against an access database on a company server. The db is not that big: 19 MB. There are at most 2-3 users accessing it at any one time. It is used in a factory environment where access speed (or lack thereof) over the intranet becomes noticeable as it is part of the manufacturing time for our widgets.
The scenario is this: as each widget is completed, it gets a record in the db.. by the end of the year, the db is larger and searching for a record takes longer and longer. The solution so far has been to manually move older records to an archival table about once a year.
We are reworking other portions of this app right now, and it would be a good time to move to another db if we are going to do it.
It is my understanding that if we were using sql, the search time would not go up as the table gets bigger because the entire .mdb does not have to be sent over the network each time. Is this correct? Does anyone have any insight about whether it could be worth it to go to the trouble (time and money) of migrating to a new db, or should I just add more functionality to the application we have now, and maybe automatically purge the older records from time to time, and add additional facilities to the app to get at the older records when needed?
Thanks for any wisdom you can share..

Since your database is small and very few users, I could not make a solid case for migration. I would definetly set up an script to archive old records on a more frequent basis (don't archive into same db, this would somewhat defeat the purpose).
But also make sure two things are correct as well.
INDEXES. If your queries start slowing down, make sure you have proper indexes
http://support.microsoft.com/kb/304272
Your network connection between computers is fast. Maybe upgrade to gigabit cards and router? Possibly put the db on a scsi drive (raid 10 for speed and redundancy)
Throwing advanced technology at simple problems is an expensive way to go and not always the answer!

First of all, the information that the whole table and the whole database is transferred across the network is simply incorrect. If the queries are indexed, then the search times should not go up that much over time.
As others have mentioned spending the time + money to setup and maintain and then have someone maintain and manage and support that database server is certainly a possibility here. However, keep in mind that simply migrating a JET based application to sql server in many cases will run slower, and in fact sql server is slower then JET when no network is involved.
So, I would take some time to ascertain why things slow down so much, and also check into how indexing is setup.
So, just keep in mind that it is pure folklore and myth that the whole tables and whole database is transferred over the network. This concept is ONLY DUE to most people really not having any computer training and not knowing and understanding how the JET data engine works.

I would probably move to either Microsoft SQL Server 2008 R2 Express Edition (free) or MySQL (free) if there is both funding and time to put in a data access layer. Because you will be making requests of a remote server and not operating on data at the local workstation this move is very involved from the development standpoint.
However you should analyze whether or not its more cost effective to perform your archival process quarterly or monthly, and just move the archive database to SQL Server 2008 R2 Express Edition. (You can install the Microsoft SQL Server Management Studio client tools on workstations and query the archival database for faster reports on historical data without rewriting your entire production application; similar solutions exist for using MySQL or other OSS/free RDBMS).

I have cilents with 300 mb databases although they should be upsizing to SQL Server for other reasons. 19 Mb is relatively small. If performance is bad enough that archiving speeds things up then check the indexes to the tables for all your sorting and selection fields. Albert gave you a good URL there to check.
Entire MDB files do not go down the wire. Unless you are missing indexes.

Instead of shipping the DB over the network to the client and then performing queries, you could instead write a small wrapper on the server that handles requests, looks up the result in the Access DB (using SQL + the Access ODBC driver), and returns the result. This avoids the overhead of a large migration you might not need and still gets rid of the basic problem the users are experiencing.
Moving to a "proper" database solution is the best long term solution, but if your needs scale linearly and slowly over the next 30 years, it's hard to justify an expensive migration. That said, if you expect to really ramp up, or want to be more "future-proof", migrating now will likely save money/time.

It is my understanding that if we were
using sql, the search time would not
go up as the table gets bigger because
the entire .mdb does not have to be
sent over the network each time. Is
this correct?
This general idea is true for almost all databases. The idea of a database is to separate your application from the actual data. The data resides in a database server. Your application doesn't.
Does anyone have any insight about
whether it could be worth it to go to
the trouble (time and money) of
migrating to a new db
Yes. Having proposed this many times. It's expensive. It's complicated. Your MS-Access database will never get better or faster.
Other database servers will (and can) get faster and more sophisticated. After all, you're not sending .MDB files through a network anymore. The limitations are reduced. You're working with standard SQL through ODBC. Any database will work at the end of ODBC. You can fire vendors to find better, faster, cheaper products. Once you stop using Access you have choices.
Either stop using Access now or plan to suffer with it forever. And remake this decision every year until the end of time.

Mirroring vs. Log Shipping in Sql Server 2005

I'm interested in hearing people's thoughts about the pros and cons of database mirroring vs. log shipping in this scenario: we need to setup a database backup situation wherein there is exactly one secondary server that need not automatically pick up when the primary fails. Recovering and starting with the secondary should not have to take too long though.

Mirroring
Database mirroring is limited to only two servers.
Mirroring with a Witness Server allows for High Availability and automatic fail over.
You can configure your DSN string to have both mirrored servers in it so that when they switch you notice nothing.
While mirrored, your Mirrored Database cannot be accessed. It is in Synchronizing/Restoring mode.
Mirroring with SQL Server 2005 standard edition is not good for load balancing (see sentence above)
Log Shipping
You can log ship to multiple servers.
Log shipping is only as current as how often the job runs. If you ship logs every 15 minutes, the secondary server could be as far as 15 minutes. Making it more of a Warm Standby.
You can leave the database in read only mode while it is being updated. Good for reporting servers.
Good for disaster recovery

For backup purposes I would recommend Mirroring: it keeps an always up-to-date copy of your database with no hassle.. If you don't need automatic fail-over you need just two servers/instances. Note that High Performance mode is only available in the Enterprice (sp) edition!

Switching to the secondary database does take longer with log shipping, but it's not too bad. You'll have to manually copy any uncopied backup files, apply the transaction log backups to the secondary database, recover the secondary database, and change its role to primary. If the old primary databases accessible, you should back up its transaction log before beginning. Failing over with mirroring is somewhat simpler, and can be done automatically if you are using High Availability mode. Even when using High Performance mode, it's still a one statement operation.

We Keep Coding

sql objective-c vba vb.net react-native apache vue.js tensorflow api pandas