Using Multiple Sources in SSIS Data Flow Task - sql

For my data flow task I have a OLEDB Source. In the SQL command section of this I have compiled a select query based on tables from two different databases, held on the same instance. Every time I run this it errors, but when I moved the tables to the same database (for testing purposes) it worked.
I'm guessing from this that the source data needs to be from the same database but is there anyway around this? I tried using a look-up but I couldn't get it to work. I could create a view in the source database but I'm guessing there must be a way to keep it all within the package.
Thank you in advance! This is the query I was using in the OLE DB Source:
select *
from commoncomponents.meta.ItemTypeLabelDefinition
where internalid not in
(
select internalid
from iscanimport.dbo.ItemTypeLabelDefinition
)

Not sure why the cross-DB query wouldn't work in the one source, but one method would be to create two OleDb Sources, one pointing to CommonComponents DB doing the select from ItemTypeLabelDefinition, and the other pointing to IScanImport and the select statement from your sub-query. Preferably sort these the same way at source in your queries, then use a Merge Join task to combine them.

Related

SSAS Multidimensional - Table Value Function as a Query for Partition

#GregGalloway was able to answer the question I should have asked. I am adding a more concise question here, while maintaining the original lengthy text
How do I use a table valued function as the query for a partition, when the function is in separate database from my fact and referenced dimensions?
Overview: I am building a SSAS multidimensional cube that is built off of a single fact table in our application's data warehouse, and want to use the result set from a table valued function as my fact table's partition query. We are using SQL Server (and SSAS) 2014
Condition: For each environment (Dev,Tst,Prd) there are 2 separate databases on the same server, one for the application data warehouse [DW_App], the other for custom objects [DW_Custom]. I cannot create any objects in [DW_App], but have a lot of freedom in [DW_Custom]
Background info: I have not been able to find much information on using a TVF and partitions in this way. My thinking is that it will help streamline future development by giving me a single place to update the SQL if/when I modify the fact table.
So in testing out my crazy idea of using a TVF as the query for my partitions I have run into a bit of a conundrum. I am able to use my TVF when I explicitly state the Database in my FROM clause.
SELECT * FROM [DW_Custom].[dbo].[CubePartition](#StartDate, #EndDate)
However, that will not work, because the cube will be deployed in multiple environments before production, and it needs to point to different DBs for each. So I tried adding a new data source, setting my partition query to point to the new data source, and then remove the database name. IE:
SELECT * FROM [dbo].[CubePartition](#StartDate, #EndDate)
I get an error that
The SQL syntax is not valid. The relational database returned the following error message: Deferred prepare could not be completed. Invalid object name 'dbo.CubePartition'
If I click through this error and the subsequent warnings about the cube not being able to process if I continue I am able to build and deploy the cube. However I cannot process it, because I get an error that one of my dimensions does not exist.
Looking into the query that was generated and it is clear that it is querying my dimensions as well as fact, which do not exist inside of '[DW_Custom]' which explains that error perfectly fine.
So I guess 2 questions:
Is it possible to query another DB (on the same server) from inside of an SSAS partition query?
If not, is there any way I can use a variable as the database name in the query, and update that variable based on the project configuration (Dev,Tst,Prd)
Bonus question: Is the reason that I can not find much about doing it this way because it is an obviously bad idea that I am overlooking, and if so why?
How about creating a second SSAS Data Source pointing to the DW_Custom database (or whatever it's called in the particular environment you're deploying to)? Then when you deploy from Dev to Prod, you need only change that connection string. When you create your partitions, then specify the DW_Custom data source and then specify the query without database name:
SELECT * FROM [dbo].[CubePartition](#StartDate, #EndDate)
As long as the query plan for that table-valued function is efficient compared to a plain SELECT statement, then I don't see a problem with that.

Why does my SSIS package takes so long to execute?

I am fairly new creating SSIS packages. I have the following SQL Server 2008 table called BanqueDetailHistoryRef containing 10,922,583 rows.
I want to extract the rows that were inserted on a specific date (or dates) and insert them on a table on another server. I am trying to achieve this through a SSIS which diagram looks like this:
OLEDB Source (the table with the 10Million+ records) --> Lookup --> OLEDB Destination
On the look up I have set:
Now, the query (specified on the Lookup transformation):
SELECT * FROM BanqueDetailHistoryRef WHERE ValueDate = '2014-01-06';
Takes around 1 second to run through SQL Server Management Studio, but the described SSIS package is taking really long time to run (like an hour).
Why is causing this? Is this the right way to achieve my desired results?
You didn't show how your OLEDB Source component was set up but looking at the table names I'd guess you are loading the whole 10 million rows in the OLEDB source and then using the Lookup to filter out only the ones you need. This is needlessly slow.
You can remove the Lookup completely and filter the rows in OLEDB source using the same query you had in the Lookup.

Move data from one database to another over ODBC

Let say, I am connected to two different databases (one SQLite and one Oracle) over ODBC in one program. Is it possible to execute a query on one database and insert the resultset as a new table in the second database directly by just passing something like a data cursor, i.e. without the hurdle to create insert statements with explicit values from the result set and executing those on the destination database?
As far as I am aware you cannot do that. You can use something like a join engine to do it (e.g., http://www.easysoft.com/products/data_access/odbc_odbc_join_engine/index.html) but it might be overkill if you are doing it just the once.
If you use Oracle then you can use Oracle Heterogeneous Services which can work with ODBC sources. Have a look at: http://www.dba-oracle.com/t_heterogeneous_database_connections_sql_server.htm

SQL Server 2008 Query Between Databases

We are migrating database structures, so I have one database with the old structure and one database with the new structure (both on the same server). I want to write queries to copy data from one to the other. I am expecting to go table-by-table as the schema is different. How do I do this?
You need to provide more details to get a more specific answer, but in general you just use the three-part name:
INSERT INTO NewDB.dbo.TableName
SELECT <columns>
FROM OldDB.dbo.Tablename
Are you looking for a way to do this automatically for all the tables?
You can write cross database queries like so
INSERT INTO NewDatabase.Schema.Table
SELECT Column1, Column2
FROM OldDatabase.Schema.Table
you can probably use Import data under tasks.Right Click the Target DB -> Tasks ->Import Data .You can also specify the source-> target mapping here.. and also write queries

How do you transfer all tables between databases using SQL Management Studio?

When I right click on the database I want to export data from, I only get to select a single table or view, rather than being able to export all of the data. Is there a way to export all of the data?
If this is not possible, could you advise on how I could do the following:
I have two databases, with the same table names, but one has more data than the other
They both have different database names (Table names are identical)
They are both on different servers
I need to get all of the additional data from the larger database, into the smaller database.
Both are MS SQL databases
Being that both are MS SQL Servers, on different hosts... why bother with CSV when you can setup a Linked Server instance so you can access one instance from the other via a SQL statement?
Make sure you have a valid user on the instance you want to retrieve data from - it must have access to the table(s)
Create the Linked Server instance
Reference the name in queries using four name syntax:
INSERT INTO db1.dbo.SmallerTable
SELECT *
FROM linked_server.db.dbo.LargerTable lt
WHERE NOT EXISTS(SELECT NULL
FROM db1.dbo.SmallerTable st
WHERE st.col = lt.col)
Replace WHERE st.col = lt.col with whatever criteria you consider to be duplicate values between the two tables.
There is also a very good tool by Redgate software that syncs data between two databases.
I've also used SQL scripter before to generate a SQL file with insert statements that you can run on the other database to insert the data.
If you right-click on the database, under the Tasks menu, you can use the Generate Scripts option to produce SQL scripts for all the tables and data. See this blog post for details. If you want to sync the second database with the first, then you're better off using something like Redgate as suggested in mpenrow's answer.