What would be the best ways to migrate data from a DB2 database to BigQuery?
Related
I want to use Delta Lake tables in my Hive Metastore on Azure Data Lake Gen2 as the basis for my company's lakehouse.
Previously, I used "regular" Hive catalog tables. I would load data from parquet into a Spark DataFrame and create a temp table using df.createOrReplaceTempView("TableName"), so I could use Spark SQL or %%sql magic to do ETL. After doing this, I could use spark.sql or %%sql against TableName. When I was done, I would write my tables to the Hive Metastore.
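For context, that old workflow looked roughly like this (the parquet path and table names are made up for illustration):

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Load raw parquet data and expose it to Spark SQL as a temp view.
    df = spark.read.parquet("abfss://raw@mylake.dfs.core.windows.net/sales")  # hypothetical path
    df.createOrReplaceTempView("SalesRaw")

    # ETL in SQL against the temp view.
    cleaned = spark.sql("""
        SELECT CustomerId, CAST(Amount AS DECIMAL(18, 2)) AS Amount, OrderDate
        FROM SalesRaw
        WHERE Amount IS NOT NULL
    """)

    # Finally persist the result as a Hive catalog table.
    cleaned.write.mode("overwrite").saveAsTable("LakeHouseDB.SalesClean")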
However, what if I don't want to perform this saveAsTable operation and write to my Data Lake? What would be the best way to perform ETL with SQL?
I know I can persist Delta tables in the Hive Metastore in a number of ways, for instance by creating a managed catalog table through df.write.format("delta").saveAsTable("LakeHouseDB.TableName").
I also know that I can create a DeltaTable object through DeltaTable.forPath(spark, table_path_data_lake), but then I can only use the Python API and not SQL.
Does there exist some equivalent of createOrReplaceTempView(), or is there a better way to achieve ETL with SQL without 'writing' to the data lake first?
However, what if I don't want to perform this saveAsTable operation and write to my Data Lake? What would be the best way to perform ETL with SQL?
This is not possible with Delta Lake, because every Delta table relies on a transaction log (_delta_log) stored under the table's data directory, so the table has to be written to storage.
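That said, if the Delta data already exists in the lake (with its _delta_log), you can still run SQL ETL over it without registering anything in the Hive Metastore by putting a temp view on top of it. A rough sketch, with a hypothetical path:

    # SQL over Delta data that already sits in the lake, without any
    # metastore registration; the path is a placeholder.
    df = spark.read.format("delta").load("abfss://lake@mylake.dfs.core.windows.net/silver/customers")
    df.createOrReplaceTempView("Customers")

    result = spark.sql("SELECT Country, COUNT(*) AS Cnt FROM Customers GROUP BY Country")

Spark SQL can also query a Delta path directly with the delta.`<path>` syntax, which avoids even the temp view.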
I have a Visio .vdx file for the design of my data warehouse, created with Lucidchart. Is there a way to generate Redshift SQL from that?
What would be the best tool for Redshift data modeling?
Can those SQL generators also generate tables from a special Visio stencil, like http://www.visualdatavault.com?
Amazon Redshift is (mostly) compatible with PostgreSQL, so any tool that can introspect PostgreSQL tables should work with Redshift.
One thing to note: constraints and foreign keys are not enforced in Redshift.
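For example, a minimal introspection check with a plain PostgreSQL driver works against Redshift as well (the connection details below are placeholders):

    import psycopg2

    # Redshift speaks the PostgreSQL wire protocol, default port 5439.
    conn = psycopg2.connect(
        host="my-cluster.abc123.eu-west-1.redshift.amazonaws.com",  # placeholder
        port=5439,
        dbname="dev",
        user="awsuser",
        password="...",
    )
    with conn.cursor() as cur:
        # List tables and columns from the standard information_schema views.
        cur.execute("""
            SELECT table_name, column_name, data_type
            FROM information_schema.columns
            WHERE table_schema = 'public'
            ORDER BY table_name, ordinal_position
        """)
        for table, column, dtype in cur.fetchall():
            print(table, column, dtype)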
I have two IBM DB2 databases, A and B, with different database schemas. Database A's schema is older, and database B's schema is newer. I would need to create SQL ALTER scripts that can update A's schema to match that of B. This can of course be done manually, but is there a tool that could analyse the two databases and do this for me?
I am using the free IBM Data Studio client for querying the database. Can the above operation be done using this tool?
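For what it's worth, the manual route mentioned above can at least be bootstrapped with a small script that compares the catalog views of the two databases. This is only a rough sketch, nowhere near a real schema-diff tool: it looks at SYSCAT.COLUMNS only (no types, constraints or indexes), and the connection strings and schema name are placeholders.

    import ibm_db

    def columns(conn_str):
        # Collect (schema, table, column) triples from the DB2 catalog.
        conn = ibm_db.connect(conn_str, "", "")
        stmt = ibm_db.exec_immediate(
            conn,
            "SELECT TABSCHEMA, TABNAME, COLNAME FROM SYSCAT.COLUMNS WHERE TABSCHEMA = 'MYSCHEMA'"
        )
        cols = set()
        row = ibm_db.fetch_assoc(stmt)
        while row:
            cols.add((row["TABSCHEMA"].strip(), row["TABNAME"], row["COLNAME"]))
            row = ibm_db.fetch_assoc(stmt)
        ibm_db.close(conn)
        return cols

    a = columns("DATABASE=A;HOSTNAME=hostA;PORT=50000;UID=user;PWD=...;")
    b = columns("DATABASE=B;HOSTNAME=hostB;PORT=50000;UID=user;PWD=...;")

    # Columns present in B but missing in A are candidates for ALTER TABLE ... ADD COLUMN.
    for schema, table, col in sorted(b - a):
        print(f"Missing in A: {schema}.{table}.{col}")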
Redgate SQL Compare is one of the best.
How can I migrate data from some tables in one Oracle database to another in real time? (If a row is inserted into oracle1.table1, I'd want it replicated to oracle2.table2 within 1 minute.)
How would that be possible?
In SQL Server, I've seen how SSIS works but is there anything similar for Oracle to Oracle data migration?
What you want is data replication: after an initial copy of the data, you copy only the transactions on that data. For this, dbvisit replicate is a very smart tool. You could code this yourself too, using Oracle Streams, but you can expect to build a less stable and, in the end, more expensive system than just buying a dbvisit replicate license. Another option is Oracle GoldenGate, but that is powerful and a bit pricey.
See the dbvisit website.
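If a 1-minute window is enough and you can live with something much cruder than dbvisit or GoldenGate (which mine the redo logs), a naive polling script is also possible. A sketch, assuming table1 has a LAST_MODIFIED timestamp column and only inserts need to be copied; all names and credentials are placeholders:

    import time
    import oracledb

    src = oracledb.connect(user="app", password="...", dsn="oracle1/ORCL1")
    dst = oracledb.connect(user="app", password="...", dsn="oracle2/ORCL2")

    last_sync = None
    while True:
        with src.cursor() as cur:
            if last_sync is None:
                # Initial full copy.
                cur.execute("SELECT id, payload, last_modified FROM table1")
            else:
                # Only rows changed since the last poll.
                cur.execute(
                    "SELECT id, payload, last_modified FROM table1 WHERE last_modified > :ts",
                    ts=last_sync,
                )
            rows = cur.fetchall()

        if rows:
            with dst.cursor() as cur:
                cur.executemany(
                    "INSERT INTO table2 (id, payload, last_modified) VALUES (:1, :2, :3)",
                    rows,
                )
            dst.commit()
            last_sync = max(r[2] for r in rows)

        time.sleep(30)  # poll twice a minute to stay inside the 1-minute window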
You could also try looking at Data Pump:
Oracle Data Pump technology enables very high-speed movement of data and metadata from one database to another. Oracle Data Pump is available only on Oracle Database 10g release 1 (10.1) and later.
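For illustration, a Data Pump export/import of one schema can be driven from a script like the sketch below. Credentials, TNS aliases and the directory object are placeholders, and the dump file still has to be made available on the target server in between.

    import subprocess

    # Export the APP schema from the source database.
    subprocess.run(
        [
            "expdp", "system/...@oracle1",
            "schemas=APP",
            "directory=DATA_PUMP_DIR",
            "dumpfile=app.dmp",
            "logfile=app_exp.log",
        ],
        check=True,
    )

    # Import the dump into the target database.
    subprocess.run(
        [
            "impdp", "system/...@oracle2",
            "directory=DATA_PUMP_DIR",
            "dumpfile=app.dmp",
            "logfile=app_imp.log",
        ],
        check=True,
    )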
I am upgrading / modifying my existing SQL Server 2005 database.
I would like to rename tables, change relationships in the schema and eventually migrate the data from the old tables to the new ones.
I was thinking of writing a script to:
1. change the constraints and table names
2. create the new tables
3. migrate the data from the old tables to the new ones
4. delete the old tables.
Are there any third-party tools I can use to migrate, or can I do this with SQL Server DTS? What is the best approach?
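For illustration, the scripted route from the four steps above could look roughly like the sketch below; all table, column and constraint names are made up.

    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=myserver;DATABASE=MyDb;Trusted_Connection=yes;"
    )
    cur = conn.cursor()

    # 1. Change the constraints and table names.
    cur.execute("ALTER TABLE dbo.[Order] DROP CONSTRAINT FK_Order_Customer")
    cur.execute("EXEC sp_rename 'dbo.Customer', 'Customer_old'")

    # 2. Create the new tables.
    cur.execute("""
        CREATE TABLE dbo.Customer (
            CustomerId INT NOT NULL PRIMARY KEY,
            Name NVARCHAR(200) NOT NULL
        )
    """)

    # 3. Migrate the data from the old tables to the new ones.
    cur.execute("""
        INSERT INTO dbo.Customer (CustomerId, Name)
        SELECT CustomerId, Name FROM dbo.Customer_old
    """)

    # 4. Delete the old tables.
    cur.execute("DROP TABLE dbo.Customer_old")

    conn.commit()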