I am writing some new SQL queries and want to check the query plans that the Oracle query optimiser would come up with in production.
My development database doesn't have anything like the data volumes of the production database.
How can I export database statistics from a production database and re-import them into a development database? I don't have access to the production database, so I can't simply generate explain plans on production without going through a third party hosting organisation. This is painful. So I want a local database which is in some way representative of production on which I can try out different things.
Also, this is for a legacy application. I'd like to "improve" the schema by adding appropriate indexes, constraints, etc.
I need to do this in my development database first, before rolling out to test and production.
If I add an index and re-generate statistics in development, then the statistics will be generated around the development data volumes, which makes it difficult to assess the impact of my changes on production.
Does anyone have any tips on how to deal with this? Or is it just a case of fixing unexpected behaviour once we've discovered it on production? I do have a staging database with production volumes, but again I have to go through a third party to run queries against this, which is painful. So I'm looking for ways to cut out the middle man as much as possible.
All this is using Oracle 9i.
Thanks.
See the documentation for the DBMS_STATS.EXPORT_SCHEMA_STATS and DBMS_STATS.IMPORT_SCHEMA_STATS procedures. You'll have to have someone with the necessary privileges do the export in the production database for you if you don't have access. If your development hardware is significantly different from your production hardware, you should also export/import the system statistics with the EXPORT_SYSTEM_STATS and IMPORT_SYSTEM_STATS procedures.
Remember to turn off any jobs in the development database that recalculate statistics after you do this.
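As a rough sketch of the round trip (the APP_OWNER schema name and PROD_STATS staging table name are placeholders, and someone with access would have to run the production half for you):

    -- On production
    BEGIN
      DBMS_STATS.CREATE_STAT_TABLE(ownname => 'APP_OWNER', stattab => 'PROD_STATS');
      DBMS_STATS.EXPORT_SCHEMA_STATS(ownname => 'APP_OWNER', stattab => 'PROD_STATS');
      -- optionally also: DBMS_STATS.EXPORT_SYSTEM_STATS(stattab => 'PROD_STATS');
    END;
    /

    -- Copy the APP_OWNER.PROD_STATS table to development (e.g. with exp/imp),
    -- then load the statistics into the development data dictionary
    BEGIN
      DBMS_STATS.IMPORT_SCHEMA_STATS(ownname => 'APP_OWNER', stattab => 'PROD_STATS');
    END;
    /

After the import, explain plans generated in development should be based on the production statistics rather than your local data volumes.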
I am not a DBA; however, my small company is using SQL Server for a project that we are working on. On the same SQL Server instance there is an MS Great Plains (Dynamics GP) database, and we pass data back and forth between the two databases (mainly a Scribe process pulling our data and transferring it into GP).
We are using database replication (snapshot) as a means of syncing our production and development (and soon DR) environments. Right now it's set to replicate every three hours during core business hours, mainly to keep production and development up to date for us while we are working.
1) Is this the correct way of doing such a thing? Is there a better way?
2) Does this stress the server or the SQL Server? Is this a possible cause of GP database issues because they are on the same server and instance?
3) Replication only occurs on the non GP database - this shouldn't affect the GP database at all right?
Our database should stay rather small. In doing the snapshot, it is my understanding that tables get locked while the replication is going on. Do the tables stay locked until the entire replication is done, or are they released as each one completes while the process continues?
There are many ways to sync one SQL Server database with another. There is replication (which you are currently using), log shipping, backup/restore, mirroring, and Always On, to name a few methods.
The "best" method depends on your requirements. If you're concerned about disaster recovery, snapshot replication is not a great option and I would look into AlwaysOn Availability Groups.
If load on your production system is a concern, I would look into restoring a nightly backup of the production system.
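For example, a minimal sketch of that approach (the database name, backup path, and logical file names are placeholders you would adjust for your environment):

    -- On the production server, typically from a scheduled job
    BACKUP DATABASE MyAppDb
        TO DISK = N'\\backupshare\MyAppDb_full.bak'
        WITH INIT;

    -- On the development (or DR) server
    RESTORE DATABASE MyAppDb
        FROM DISK = N'\\backupshare\MyAppDb_full.bak'
        WITH REPLACE,
             MOVE N'MyAppDb_Data' TO N'D:\Data\MyAppDb.mdf',
             MOVE N'MyAppDb_Log'  TO N'E:\Logs\MyAppDb_log.ldf';

This keeps the work of producing the copy off the production instance, apart from the backup itself, which you are presumably already taking.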
To answer your specific questions:
1) Is this the correct way of doing such a thing? Is there a better way?
This answer depends on your exact requirements.
2) Does this stress the server or the SQL Server?
Doing something is always more work than doing nothing. Depending on many factors this could affect your production server.
3) Replication only occurs on the non GP database - this shouldn't affect the GP database at all right?
Your server only has a finite amount of hardware resources. It could affect the performance of queries against the GP database.
We have found that having replication in place also adds complexity when it comes to upgrades and schema changes. If you must have dev and prod in sync (and I would argue about that) Always On or log shipping would be my preferred techniques.
DR is a separate issue. You have to determine your Recovery Point Objective (RPO) and Recovery Time Objective (RTO) and adopt the appropriate technology to satisfy your requirements.
I started at a company as a junior SQL developer on a data warehouse. Ever since, I have been going through the code and learning the dimensional models etc. I struggle to see security measures beyond the rights that the developer has on the environment.
But if someone were to write code that influences the data in the warehouse in a significant way - updates to the wrong values, inserts of false data, deletes of records that should be there - and hits that code with a commit statement, wouldn't there be a massive impact on the business intelligence aspect of the warehouse? Like if they were to pull data to create statistics and there is bad data, then they will have bad statistics.
We have about 7 billion records, and changes made in this way would be really hard to pick up, if they could be seen at all.
Maybe this is a simple question, but I can't really find an answer, since in the data warehouse you don't have the rigorous relational constraints to check data validity, especially when you move around big data and the database administrators drop the triggers and indexes as well. The transactional side we get the source data from also doesn't keep history (that's our job).
Any views and suggestions on this subject will be highly appreciated, thank you.
When working with databases or writing code in general, mistakes happen. That is why you ALWAYS separate your development environment from your production environment. Most of us also have an intermediate test environment, where new code is tested and data is validated, before the code is deployed to production.
Furthermore, before any deployment, a full backup is taken. That way, if an error is discovered after deployment, a restore of the backup can be made.
Preferably, your development and production environments run on separate but identical servers. If that is not possible, at least keep the data in separate databases, and use the security of your database server to ensure that no one can make changes to the production database unless a deployment is happening.
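A minimal sketch of what that could look like (SQL Server syntax; DevTeamMember is a placeholder user, and your warehouse platform may differ):

    -- Developers get read-only access to the production warehouse
    CREATE USER DevTeamMember FOR LOGIN DevTeamMember;
    EXEC sp_addrolemember N'db_datareader', N'DevTeamMember';

    -- Explicitly block data and schema changes outside of deployments
    DENY INSERT, UPDATE, DELETE ON SCHEMA::dbo TO DevTeamMember;
    DENY ALTER ON SCHEMA::dbo TO DevTeamMember;

During a deployment, only the account actually performing the deployment (not the developers' everyday accounts) has rights to change anything.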
Now for the deployment itself: make sure you have a checklist of some sort to go over every time you make a deployment. The first step on the checklist should be to back up the existing production environment. Write scripts to automate parts of the deployment whenever possible. Use tools such as SQL Schema Compare to identify differences between the development and production databases, etc. Ideally, deployment should be a matter of pressing one button, and then everything deploys automagically, and you can go back to developing without worrying.
Is it a good idea to use different schemas inside one large database, instead of different DB instances, to reduce cost?
The schemas would be absolutely the same, just with different names.
For example, I have one DB for the test environment, one for beta, one for production, etc. Can I collapse all these DBs into one large database with different schema names without any issues in the future?
Does this approach have any pitfalls?
Best practice is definitely to separate dev/test from production. You don't want your developers or testers running some test case where a rogue query brings the entire server to its knees.
But for the dev and test/QA environments you could use the same server but separate instances (SQL Server installations on the same physical hardware).
You might even be able to get by with SQL Server Express for the dev and test/QA environments, which is a free edition. SQL Server Express 2012 allows a 10 GB database size now, I think.
We maintain a set of change scripts that must be run on the DB when our web application is released. However, we waste a lot of time and have some difficulty keeping these updated, because our DBA likes to (rightly) tweak stored procedures and schemas on the live system to maintain system performance.
Every so often we have to rebase our patches against the current schema and stored procedures; however, it is extremely difficult to detect which changes might conflict and to work out which of our DBA's changes we might be clobbering.
How do others manage the need for changes on live DBs against pending changes?
What processes can we put in place to make this run more smoothly?
What is the best way to store and manage our schema and to apply our/his changesets?
Thanks in advance.
DBAs should never tweak procs on prod only. They should also use source control and put the changes in the other environments, so that others making changes are aware of them.
Make any and all DDL changes to the DB schema script-based, and store the scripts in your source code control. Especially changes your DBA makes - I would suggest getting your base schema and stored procs reviewed by a DB developer and the DBA prior to checking them into your source code control (props to HLGEM for saying it). Moves into prod should be scripted and approved prior to application (i.e., if the DBA finds stuff that needs to be changed, have the DBA open a defect and handle it via that process).
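For example, a re-runnable change script that can live in source control and be applied to any environment might look like this (SQL Server syntax; the Orders table and ShippedDate column are made-up examples):

    IF NOT EXISTS (SELECT 1 FROM sys.columns
                   WHERE object_id = OBJECT_ID(N'dbo.Orders')
                     AND name = N'ShippedDate')
    BEGIN
        ALTER TABLE dbo.Orders ADD ShippedDate DATETIME NULL;
    END
    GO

Because it checks before it acts, the same script can be run against dev, test, and prod without worrying about whether it has already been applied.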
Lock all such DDL changes away from your developers. The smart guys writing Java and C# should be communicating with your team's DB "specialist" on how best to accomplish the design goals and needs on the DB side.
Limit production tweaks to highly situation-dependent cases; for example, many IT shops have a DBA who will define the physical storage setup based on the DB deployment scripts your app supplies, and this is usually OK. A wizard with your app to help less experienced people, along with a top-10 list of recommendations for setup and basic tuning, will go a long way.
We have 18 databases that should have identical schemas, but don't. In certain scenarios, a table was added to one, but not the rest. Or, certain stored procedures were required in a handful of databases, but not the others. Or, our DBA forgot to run a script to add views on all of the databases.
What is the best way to keep database schemas in sync?
For legacy fixes/cleanup, there are tools, like SQLCompare, that can generate scripts to sync databases.
For .NET shops running SQL Server, there is also the Visual Studio Database Edition, which can create change scripts for schema changes that can be checked into source control, and automatically built using your CI/build process.
SQL Compare by Red Gate is a great tool for this.
SQLCompare is the best tool that I have used for finding differences between databases and getting them synced.
To keep the databases synced up, you need to have several things in place:
1) You need policies about who can make changes to production. Generally this should only be the DBA (a DBA team for larger orgs) and 1 or 2 backups. The backups should only make changes when the DBA is out, or in an emergency. The backups should NOT be deploying on a regular basis. Set database rights according to this policy.
2) A process and tools to manage deployment requests. Ideally you will have a development environment, a test environment, and a production environment. Developers should do initial development in the dev environment, and have changes pushed to test and production as appropriate. You will need some way of letting the DBA know when to push changes. I would NOT recommend a process where you holler to the next cube. Large orgs may have a change control committee and changes only get made once a month. Smaller companies may just have the developer request testing, and after testing is passed a request for deployment to production. One smaller company I worked for used Problem Tracker for these requests.
Use whatever works in your situation and budget, just have a process, and have tools that work for that process.
3) You said that sometimes objects only need to go to a handful of databases. With only 18 databases, probably on one server, I would recommend making every database match exactly in terms of objects. Only 5 DBs need usp_DoSomething? So what? Put it in every database. This will be much easier to manage. We did it this way on a 6-server system with around 250-300 DBs. There were exceptions, but they were grouped: databases on server C got this extra set of objects, databases on server L got this other set.
4) You said that sometimes the DBA forgets to deploy change scripts to all the DBs. This tells me that s/he needs tools for deploying changes. S/he is probably taking a SQL script, opening it in Query Analyzer or Management Studio (or whatever you use), and manually going to each database and executing the SQL. This is not a good long term (or short term) solution. Red Gate (makers of SQLCompare above) have many great tools; MultiScript looks like it may work for deployment purposes. I worked with a DBA who wrote his own tool in SQL Server 2000 using osql. It would take a SQL file and execute it on each database on the server. He had to execute it on each server, but it beat executing on each DB. I also helped write a VB.NET tool that would do the same thing, except it would also go through a list of servers, so it only had to be executed once (a minimal sketch of that kind of loop follows this list).
5) Source Control. My current team doesn't use source control, and I don't have enough time to tell you how many problems this causes. If you don't have some kind of source control system, get one.
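As a rough sketch of the kind of deployment helper mentioned in point 4 (SQL Server 2005+ syntax; the PRINT stands in for your real change script, and you would probably want to filter out databases that shouldn't receive the change):

    DECLARE @db sysname, @sql nvarchar(max);

    DECLARE db_cursor CURSOR FOR
        SELECT name FROM sys.databases WHERE database_id > 4; -- skip system databases

    OPEN db_cursor;
    FETCH NEXT FROM db_cursor INTO @db;

    WHILE @@FETCH_STATUS = 0
    BEGIN
        SET @sql = N'USE ' + QUOTENAME(@db) + N';
                     PRINT ''Deploying to '' + DB_NAME();
                     -- your real change script goes here';
        EXEC (@sql);
        FETCH NEXT FROM db_cursor INTO @db;
    END

    CLOSE db_cursor;
    DEALLOCATE db_cursor;

You would still need to run it once per server, or wrap it in something that walks a server list, like the VB.NET tool mentioned above.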
I haven't got enough reputation to comment on the above answer, but the Pro version of SQL Compare has a scriptable API. Given that you have to replicate stuff to all of these databases, you could use it to build an automated job that either generates the change scripts or validates that the databases are all in sync. It's also not much more expensive than the standard version.
Aside from using database comparison tools: with 18 databases you should have a DBA, so enforce a policy that only the DBA can change tables at the database level, by restricting CREATE and ALTER access to the DBA only - on both your test and live databases. The dev database shouldn't have this restriction, of course! Make the developers who have been creating or altering the schemas willy-nilly go via the DBA.
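A hedged sketch of that restriction (SQL Server syntax; Developers is a placeholder database role):

    -- Developers can still read and write data, but cannot change the schema
    DENY CREATE TABLE, CREATE PROCEDURE, CREATE VIEW TO Developers;
    DENY ALTER ON SCHEMA::dbo TO Developers;

Apply it on test and live only; the dev database stays open so people can experiment, as noted above.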
Create a single source-controlled DDL/SQL script for each release and only use it to update the databases. The diff tools can be useful but mainly for checking that you haven't made a mistake and getting out of trouble when the policies fail. Combine the DDL, SQL, and stored procedure scripts into a single script so that it's not easy to "forget" to run one of the scripts.
We have a tool called DB Schema Difftective that can compare and sync database schemas. With our other tool, DB MultiRun, you can easily deploy the generated (sync) scripts to multiple DB servers (project based).
I realize this post is old, but TurnKey is correct. If you are a developer working in a team environment, the best way to maintain a database schema for a large application is to make updates to a master schema in whatever source control system you use. Simply write your own scripting class and your database will be perfect every time.