Can I exclude a custom schema from a Schema comparison in SSDT? - schema

We have a SQL server database that is very dynamic and is always creating new and dropping existing tables from a custom schema called 'temp' (we have a dbo schema and a temp schema). We also use SSDT to maintain and monitor changes in our schema but we are unable to use the update feature on a schema comparison because if a new table is created (say temp.MyTable) after the schema comparison is made and before the updated is attempted, SSDT invalidates the schema comparison because something has changed. At the moment, our only solution to this is to run the schema comparisons around midnight when system activity is practically non-existent but is not ideal for the person who has to do the schema comparison.
My question is, is there a way we can exlude tables from the schema comparison that are apart of the 'temp.' schema?

How are you doing the deployment? as I test I used sqlpackage.exe to publish a dacpac and sat there constantly creating new tables and it deployed without complaining.
However, there are a couple of things you can do, the first is to stop getting the deployment to stop when drift is detected:
/p:BlockWhenDriftDetected=False
This is set to true by default.
The second thing is to ignore the temp schema, but I don't think this will help unless you also stop the drift but you might want to use this filter to stop all changes to the temp schema:
http://agilesqlclub.codeplex.com/
Ed

Related

How to sync tables schema without dropping the table?

Not:
DROP -> CREATE
I need:
COMPARE -> ALTER
I have a test and a production database, the data withing these two are different but the schemas should be the same.
I need something like a production script or a tool or a method which compare these two dbs schema and sync them. I'm coding in nodejs and the thing is I haven't used tools like an ORM or db-migrate, I've created the database using MYSQL-workbench and it costs a lot to write every alter query. there must be an easier way.

Update Hive metadata location for many tables

I would like to change the bucket name in location of many Hive tables. Is it possible for us to connect to mySQL database and update it? I think it is possible.But I would like to know if it is safe to do it in production database.
Yes, it is possible, and I have seen it done; but
(a) the Metastore schema is not documented, and each Hive version brings some minor changes, so you have to do your own exploration to find where/how the StorageDescriptor objects are persisted -- then some unit tests / non-regression tests on a Dev system -- plus, don't forget to run a full DB backup before tinkering with your Prod system (and to rehearse an emergency restoration on your Dev system, too!)
(b) you have to update the StorageDescriptor for tables, but also for partitions -- remember that for partitioned tables, the table-level LOCATION is just used as default root dir for future partitions; once created, a partition retains its location until it is ALTERed explicitly.
For the record, the preferred method for bulk updates is (in theory) the Hive MetaTool but unfortunately, it does not support the kind of updates that you need.Right now it's only good for changing the NameNode alias in all HDFS paths, because that was a real pain point...
A valid alternative to brutal SQL Updates would be to develop a custom Java program, using the Hive MetaStore API, to scan all tables & partitions then read their StorageDescriptor then run RegEx changes on their Location then write back the changes (which is exactly what the MetaTool does, only at a lower level). But that would be overkill.
Finally, a possible compromise would be a SQL Select on the appropriate MySQL table, to generate (with regexp_replace()) a chain of ALTER Table/Partition LOCATION commands to run later in the Hive CLI.Plus a chain of ALTER to revert to the original locations, in case you have to do an emergency rollback :-/

Backing up portion of data in SQL

I have a huge schema containing billions of records, I want to purge data older than 13 months from it and maintain it as a backup in such a way that it can be recovered again whenever required.
Which is the best way to do it in SQL - can we create a separate copy of this schema and add a delete trigger on all tables so that when trigger fires, purged data gets inserted to this new schema?
Will there be only one record per delete statement if we use triggers? Or all records will be inserted?
Can we somehow use bulk copy?
I would suggest this is a perfect use case for the Stretch Database feature in SQL Server 2016.
More info: https://msdn.microsoft.com/en-gb/library/dn935011.aspx
The cold data can be moved to the cloud with your given date criteria without any applications or users being aware of it when querying the database. No backups required and very easy to setup.
There is no need for triggers, you can use job running every day, that will put outdated data into archive tables.
The best way I guess is to create a copy of current schema. In main part - delete all that is older then 13 months, in archive part - delete all for last 13 month.
Than create SP (or any SPs) that will collect data - put it into archive and delete it from main table. Put this is into daily running job.
The cleanest and fastest way to do this (with billions of rows) is to create a partitioned table probably based on a date column by month. Moving data in a given partition is a meta operation and is extremely fast (if the partition setup and its function is set up properly.) I have managed 300GB tables using partitioning and it has been very effective. Be careful with the partition function so dates at each edge are handled correctly.
Some of the other proposed solutions involve deleting millions of rows which could take a long, long time to execute. Model the different solutions using profiler and/or extended events to see which is the most efficient.
I agree with the above to not create a trigger. Triggers fire with every insert/update/delete making them very slow.
You may be best served with a data archive stored procedure.
Consider using multiple databases. The current database that has your current data. Then an archive or multiple archive databases where you move your records out from your current database to with some sort of say nightly or monthly stored procedure process that moves the data over.
You can use the exact same schema as your production system.
If the data is already in the database no need for a Bulk Copy. From there you can backup your archive database so it is off the sql server. Restore the database if needed to make the data available again. This is much faster and more manageable than bulk copy.
According to Microsoft's documentation on Stretch DB (found here - https://learn.microsoft.com/en-us/azure/sql-server-stretch-database/), you can't update or delete rows that have been migrated to cold storage or rows that are eligible for migration.
So while Stretch DB does look like a capable technology for archive, the implementation in SQL 2016 does not appear to support archive and purge.

Liferay ServiceBuilder doesn't alter tables

Short story
When I modify the column withs in tables.sql (VARCHAR(4000)) generated by the service builder, redeploying the portlet does not cause Liferay to alter the db tables. How can I make sure that the column withs get expanded?
Long story
I have to make some changes to a Liferay 6.1.20 EE GA2 project developed by another contractor. The project uses maven as a build tool.
After adding some columns to the service.xml and running mvn liferay:build-service, I noticed, that the portlet-model-hints.xmlgot overriden (see https://issues.liferay.com/browse/MAVEN-37) and resettet to the default column width.
There's alot of data in the tables (it is running in production mode), so I cannot simply drop and recreate the tables.
So I manually modified the column width in the generated tables.sql and redeployed the portlet. The new columns are now present in the db tables, but the column widths were not altered.
Does Liferay alter column width or do I have to fire some sql statements against the database manually?
(We are working with an oracle 10g database)
If you want to change the column withs, you need to write in the portlet-model-hints.xml.
For instance, to increase a field until 255 you will do:(Its important running the build service after that change.)
ServiceBuilder doesn't do ALTER TABLE by itself - you'll have to write an UpgradeProcess for this yourself. Check this blog post or the underlying documentation.
In short: The update that can always be done automatically is of the type "DROP TABLE - CREATE TABLE", but, as you say, this is typically not desirable. Any more fancy way needs to be done manually, and that's exactly what this mechanism is for.

Doctrine schema changes while keeping data?

We're developing a Doctrine backed website using YAML to define our schema. Our schema changes regularly (including fk relations) so we need to do a lot of:
Doctrine::generateModelsFromYaml(APPPATH . 'models/yaml', APPPATH . 'models', array('generateTableClasses' => true));
Doctrine::dropDatabases();
Doctrine::createDatabases();
Doctrine::createTablesFromModels();
We would like to keep existing data and store it back in the re-created database. So I copy the data into a temporary database before the main db is dropped.
How do I get the data from the "old-scheme DB copy" to the "new-scheme DB"? (the new scheme only contains NEW columns, NO COLUMNS ARE REMOVED)
NOTE:
This obviously doesn't work because the column count doesn't match.
SELECT * FROM copy.Table INTO newscheme.Table
This obviously does work, however this is consuming too much time to write for every table:
SELECT old.col, old.col2, old.col3,'somenewdefaultvalue' FROM copy.Table as old INTO newscheme.Table
Have you looked into Migrations? They allow you to alter your database schema in programmatical way. WIthout losing data (unless you remove colums, of course)
How about writing a script (using the Doctrine classes for example) which parses the yaml schema files (both the previous version and the "next" version) and generates the sql scripts to run? It would be a one-time job and not require that much work. The benefit of generating manual migration scripts is that you can easily store them in the version control system and replay version steps later on. If that's not something you need, you can just gather up changes in the code and do it directly through the database driver.
Of course, the more fancy your schema changes becomes, the harder the maintenance will get i.e. column name changes, null to not null etc.