Populating data from legacy DB using Entity Framework Code First

Basically, we are revamping an old website, so we've set up a new application and DB with a proper relational schema, but we need to bring data over from the old system (broken referential integrity, no FKs, sometimes no PK) as each of us works on a module.
We're looking at using EF Code First, as there will be a few of us working on the system.
Would it be best to write a SQL script that brings all these tables over to fit the new schema, or is there a good way to do this in Code First, such as seeding?
Or is this case better suited to a Database First approach, where we just bring the data over using the Generate Scripts functionality in SSMS?
Would like to hear if anyone's done similar work. Thanks in advance.

A one-time migration from the old schema to the new schema would normally be done with SQL scripts, not with EF. You have to write the transformation logic either way, and with EF you additionally have to create a DbContext model for the old schema, and EF is slower at performing bulk operations.
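
To make the "transformation logic" a bit more concrete, here is a minimal sketch of a set-based, one-time copy. It uses Python's sqlite3 module purely as a runnable stand-in for SQL Server, and every table and column name (legacy_customer, legacy_order, customer, customer_order) is hypothetical. The idea is to insert parent rows first, de-duplicating on the key the legacy table never enforced, then insert child rows only where the parent exists, and report how many orphans were left behind.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory DB holding hypothetical legacy and new tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        -- Legacy tables: no PK, no FK, duplicates and orphans allowed.
        CREATE TABLE legacy_customer (cust_id INTEGER, cust_name TEXT);
        CREATE TABLE legacy_order (order_no INTEGER, cust_id INTEGER, total REAL);
        -- New schema with real keys.
        CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE customer_order (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customer(id),
            total REAL NOT NULL
        );
        INSERT INTO legacy_customer VALUES
            (1, ' Ann '), (1, ' Ann '), (2, 'Bob'), (NULL, 'Ghost');
        INSERT INTO legacy_order VALUES (10, 1, 5.0), (11, 2, 7.5), (12, 99, 1.0);
    """)
    return conn


def migrate_legacy(conn: sqlite3.Connection) -> int:
    """Copy legacy rows into the new schema; return the number of orphaned orders skipped."""
    with conn:  # one transaction: either everything is migrated or nothing is
        # Parents first; GROUP BY collapses duplicate keys from the legacy table.
        conn.execute("""
            INSERT INTO customer (id, name)
            SELECT cust_id, MIN(TRIM(cust_name))
            FROM legacy_customer
            WHERE cust_id IS NOT NULL
            GROUP BY cust_id
        """)
        # Children next; rows whose parent does not exist are left behind.
        conn.execute("""
            INSERT INTO customer_order (id, customer_id, total)
            SELECT o.order_no, o.cust_id, o.total
            FROM legacy_order AS o
            WHERE EXISTS (SELECT 1 FROM customer AS c WHERE c.id = o.cust_id)
        """)
    legacy_count = conn.execute("SELECT COUNT(*) FROM legacy_order").fetchone()[0]
    new_count = conn.execute("SELECT COUNT(*) FROM customer_order").fetchone()[0]
    return legacy_count - new_count


if __name__ == "__main__":
    db = build_demo_db()
    print("orphaned orders skipped:", migrate_legacy(db))
```

On SQL Server the same two INSERT ... SELECT statements would live in a plain .sql script; what to do with the skipped orphans (log them, park them in a quarantine table, or drop them) is a decision the sketch leaves open.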

I had a similar task to do: I had to create a completely new web application (App B) with a new database (Db B) based on old ones (App A and Db A). The relations and general structure of Db A were a real mess, so I decided it was better to generate Db B with a code-first approach, and then I created another simple application to do all the migration work. It had two contexts, one for Db A and one for Db B, and all the migration logic lived there. It worked like a charm.
As for seeding: it would have been nicer to do it with a DB seeder in App B, but there was such a huge amount of data that it was practically impossible.

Related

What is the best way to design, generate, and version a database schema script for MS SQL Server?

I have never really seen any questions (with answers) as general as this, so I'm hoping to get some useful feedback. The reason I'm asking is because I've done all of this before and I have my own ways, but sometimes I feel it's not the best practice.
Let's say, for example, that I can't afford better DB modeling tools and I only have SQL Server and SQL Server Management Studio (SSMS). What I do is:
I design all of the entities in my DB (tables, primary keys, foreign keys, indexes, etc.) with SSMS,
then I just generate the schema script using the 'Generate Scripts...' command in SSMS. The script that's generated is rather large (using SQL Server Express 2012) and doesn't seem to be organized very well for maintenance.
Example: after all the table creation scripts are set up, there's a bunch of ALTER TABLE commands to add all the constraints. This kind of thing seems like it would be better in the table creation section, or maybe not. Also, for upgradability, I normally add 'IF NOT EXISTS' to each table creation section, so that it doesn't throw an error when I re-run the SQL script after the DB is updated with new tables, columns, etc.
Then for versioning, I generally have a separate script that I run to record the schema version in a VERSION table in the DB itself (with just one row).
This allows me to do incremental upgrades when I run the script, by adding an 'if new-version > current-version' sort of check.
It seems to have worked out for me in the past, but it just seems kind of, I don't know, not very sophisticated. Can a sql expert shed some light on this subject? It's something we all do for every data driven web app we create, over and over again. I'd like to see how other developers do it.
To recap,
how do you go about designing your DB model and generating scripts (do you do it with a design tool, write from scratch, etc.)?
How do you manage incremental DB changes over time?
How do you version control your database?

SQL Server Data Tools is ideal for this. It has all the design features you require and configurable scripting. It will also diff two databases and generate the change script for you. Oh - and it's free!
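
For comparison, the hand-rolled approach from the question (a one-row version table plus an "if new-version > current-version" check) can be sketched roughly like this. It is only an illustration: SQLite via Python's sqlite3 stands in for SQL Server, and the schema_version table and the three migrations are made-up examples.

```python
import sqlite3

# Hypothetical ordered list of (target_version, statement) pairs.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
    (3, "CREATE INDEX ix_customer_email ON customer (email)"),
]


def current_version(conn: sqlite3.Connection) -> int:
    """Read the single-row version table, creating it at version 0 if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)"
    )
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO schema_version (version) VALUES (0)")
        conn.commit()
        return 0
    return row[0]


def upgrade(conn: sqlite3.Connection) -> list:
    """Apply every migration newer than the stored version; return the versions applied."""
    applied = []
    version = current_version(conn)
    for target, statement in MIGRATIONS:
        if target > version:  # the 'new-version > current-version' check
            conn.execute(statement)
            conn.execute("UPDATE schema_version SET version = ?", (target,))
            conn.commit()  # record each step so a failure leaves a known version
            applied.append(target)
            version = target
    return applied


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    print("first run applied:", upgrade(db))
    print("second run applied:", upgrade(db))
```

Because every step is gated on the stored version, re-running the script is harmless, which is the same property the 'IF NOT EXISTS' guards in the question are trying to achieve.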

How to sync database schema with Entity Framework in model-first without migrations

I have made changes to my model-first EDMX file, and now I want to apply the changes to my database. I've just added a new table and some new fields, nothing destructive. I want to apply the "diff" to my database, but without all the hassle of database migrations. What I actually need is a non-destructive SQL file containing only the differences.
Currently, I am doing this manually by creating new database SQL from model, deleting all the code non-relevant to the table I am creating, and running the SQL. However, my table is currently empty so I can do this destructively. Moreover, if there are any changes referencing other entities too (e.g. adding a new foreign key to one of the existing tables), the SQL is, obviously, destructive. So I need to add them manually by writing my own SQL.
Is there any tool or a shorter workaround that would automate this whole process? I am looking for something that will compare the current database with the newly created EDMX and apply only the diff to the database, as a one-time process. The whole database migration system of Entity Framework is extreme overhead and unnecessary work here: the whole process, which will run only once, can be boiled down to a single SQL file. Is there such a tool or method? What is the best practice for this (other than EF migrations)?

How to avoid manually writing/managing SQL

My team and I are rapidly developing a web app backed by an Oracle DB. We use the Flyway Maven plugin to manage our DB creation and population from INSERT SQL scripts. Typically we add 3-4 tables per sprint and/or modify the existing tables' structure.
We model the schema in an external tool that generates the schema including the constraints, and we run this first, followed by the SQL INSERTs, to ensure the integrity of all the data.
We spend too much time managing the changes to the SQL to cover the new tables. By this I mean adding the extra column data to the existing SQL INSERT statements, not to mention manually creating the new SQL INSERT data, particularly when it references a foreign key.
Surely there is another way, maybe maintaining raw data in Excel and passing this through a parser to the DB. Has anyone any ideas?
10 tables so far and up to 1000 SQL statements, DB is not live so we tear it down on every build.
Thanks
Edit: The inserted data is static reference data the platform depends on to function - menus etc.
The architecture is Tomcat, JSF, Spring, JPA, Oracle

Please store your raw data in tables in the database - hey! why on earth do you want to use Excel for this? You have Oracle Database - the best tool for the job!
Load your unpolished data using SQL*Loader or external tables into regular tables in the database.
From there you have SQL - the most powerful rdbms tool to manipulate your data.
NEVER do "slow by slow" (row-by-row) inserts, such as 1000 separate SQL statements. Please do CTAS (CREATE TABLE ... AS SELECT).
Add/enable the constraints AFTER you have loaded all the data.
create table t as select * from raw_data;
or
insert into t (x,y,z) select x,y,z from raw_data;
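Using this method, you can do direct inserts (a direct-path load), which write blocks directly instead of going through the buffer cache. CTAS does this on its own; for INSERT ... SELECT you need the APPEND hint. This can even be done in parallel to make your data go into the database superfast!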
Do all of your data manipulation in SQL or PLSQL. (Not in the application)
Please invest time learning the Oracle Database. It is full of features for you to use!
Don't just use it like a datadump (a place where you store your data). Create packages - interfaces to your application - your API to the database.
Don't just throw around thousands of statements compiled into your application. It will get messy.
Build your business logic inside the database PLSQL - use your application for presentation.
Best of luck!

Alternatively, you also have the option to implement a Java migration. It could read whatever input data you have (Excel, csv, ...) and do the proper inserts.
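
The thread's stack is Java, so a real Flyway migration would be written against Flyway's Java API. The sketch below only illustrates the core idea (parse tabular input, then batch the inserts) in Python with sqlite3 so that it can be run as-is. The menu table, its columns, and the load_reference_data helper are all hypothetical, and the identifier check is a simple guard that assumes the CSV header comes from a trusted file in the repository.

```python
import csv
import io
import sqlite3


def load_reference_data(conn: sqlite3.Connection, table: str, csv_file) -> int:
    """Insert every row of a CSV file (header row = column names) into `table`."""
    reader = csv.DictReader(csv_file)
    columns = list(reader.fieldnames or [])
    # Table and column names cannot be bound as parameters, so validate them.
    for name in [table, *columns]:
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    # Empty CSV cells become NULL; everything else is passed through as text.
    rows = [tuple(row[c] or None for c in columns) for row in reader]
    with conn:  # single transaction for the whole file
        conn.executemany(sql, rows)
    return len(rows)


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.execute("PRAGMA foreign_keys = ON")
    db.execute(
        "CREATE TABLE menu ("
        " id INTEGER PRIMARY KEY,"
        " parent_id INTEGER REFERENCES menu(id),"
        " label TEXT NOT NULL)"
    )
    sample = io.StringIO("id,parent_id,label\n1,,Home\n2,1,Settings\n")
    print("rows loaded:", load_reference_data(db, "menu", sample))
```

With this shape, adding a column to a table means adding a column to the CSV rather than editing hundreds of INSERT statements. Rows must still be ordered so that parents appear before the children that reference them.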

How to create a SQL database from a strongly typed dataset

I'm looking for an easy way to transfer a database schema I have developed inside Visual Studio as a strongly typed dataset (xsd file) into a corresponding SQL Server database. Silly me, I assumed the process would be straightforward, but I can't find out how to do it. I assume I could duplicate the tables column by column, but that seems so error-prone. Does anyone know of a way to perform a schema transfer like this? Maybe a tool to translate the xsd file into a corresponding SQL Server DDL file?
Final thought: once I have the schema transferred, moving data around between the two data stores will be straightforward; it's just getting the schemas synced that has me stumped...
Thanks,
Keith

Why didn't you implement your data model directly in SQL Server? That is the more common and better-engineered route, and I think this is why Microsoft has not provided a wizard or tool for this case. You can also keep your data model as scripts (.sql files), manage them in SVN, and use them whenever you need to implement the model.

Methods of maintaining sample data in a database

Firstly, let me apologize for the title, as it probably isn't as clear as I think it is.
What I'm looking for is a way to keep sample data in a database (SQL Server 2005, 2008 and Express) that gets modified every so often. At present I have a handful of scripts to populate the database with a specific set of data, but every time the database is changed, all the scripts have to be more or less rewritten, so I am looking for alternatives.
I've seen a number of tools and other software for creating sample data in a database, some free and some not. Are there any other methods I haven’t considered?
Thanks in advance for any input.
Edit: Also, if anyone has any advice at all in dealing with keeping data in sync with a changing application or database, that would be of some help as well.

If you are looking for tools for SQL Server, go visit Red Gate Software; they have the best tools. They have a data compare tool that you can use to keep lookup-type tables up to date, and a SQL compare tool that you can use to keep the tables synced between two databases. So, using SQL Data Compare, create a database with all the sample data you want. Then periodically refresh your testing DB (or your prod DB, if these are strictly lookup-type tables) using the compare tool.
I also like the alternative of having a script (you can use Red Gate's tool to create scripts) because that means you can store this info in your source control and use it as part of a deployment package to other servers.

You could save them in another database or the same db in different tables distinguished by the name, like employee_test
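
One reading of this suggestion is to keep a pristine copy of the sample rows in a sibling table (employee_test) and rebuild the working table from it whenever needed. A rough sketch, again using Python's sqlite3 as a stand-in for SQL Server, with a hypothetical reset_from_baseline helper and a made-up employee table:

```python
import sqlite3


def reset_from_baseline(conn: sqlite3.Connection, table: str, suffix: str = "_test") -> int:
    """Replace the contents of `table` with the rows kept in `table` + `suffix`."""
    baseline = f"{table}{suffix}"
    # Identifiers cannot be bound as parameters, so validate them first.
    if not table.isidentifier() or not baseline.isidentifier():
        raise ValueError("unsafe table name")
    with conn:  # delete and reload atomically
        conn.execute(f"DELETE FROM {table}")
        # Assumes both tables have the same columns in the same order.
        conn.execute(f"INSERT INTO {table} SELECT * FROM {baseline}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


if __name__ == "__main__":
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE employee_test (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        INSERT INTO employee_test VALUES (1, 'Ann'), (2, 'Bob');
        INSERT INTO employee VALUES (7, 'Scratch data');
    """)
    print("rows after reset:", reset_from_baseline(db, "employee"))
```

The drawback raised in the question still applies: when the schema of employee changes, employee_test has to change with it.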

Joseph,
Do you need to keep just the data in sync, or the schema as well?
One solution to the data question would be SQL Server snapshots. You create a snapshot of your initial configuration, so any changes to the "real" database don't show up in the snapshot. Then, when you need to reset the table, select from the snapshot into a new table. I'm not sure how it will work if the schema changes, but it might be worth a try.
For generation of sample data, the Database project in Visual Studio has functionality that will create fake/random data.
Let me know if this makes sense.
Erick
