Access slow when joining on Teradata SQL connections? - sql

I have a simple Access database I use to create a number of reports. I've linked to a Teradata database server in our organization to pull some additional employee-level details. There is a simple left join on employee number, and all I pull is the name and the role.
The query without the connect takes maybe a minute or so to run and is very quick once loaded. Left joining on the Teradata connection slows everything down to a crawl. It can take 10 minutes or so to run the query through Excel. When the query is loaded in Access, scrolling through it is very slow.
I should note there's no performance issues with the Teradata server. I pull unrelated reports from the same and different tables, with complex joins and the speed is very quick.
I tried creating an even simpler query that does almost noting, and the performance issues are still there. Here is the code:
SELECT EMPL_DETAILS_CURR.NM_PREFX, EMPL_DETAILS_CURR.NM_GIVEN,
MC.DT_APP_ENTRY, MC.CHANNEL_IND
FROM MC LEFT JOIN EMPL_DETAILS_CURR ON MC.EMP_ID = EMPL_DETAILS_CURR.EMP_ID;
There are only 7000 records in MC.

If you are joining data between MS Access tables and Teradata tables the join has to be completed using the Microsoft JET Engine on your local machine. That means the data that exists in your Teradata tables is being brought down to your local machine to so that it can be joined.
If the tables are all on Teradata and accessed via linked tables in MS Access the join may still be occurring locally. I would suggest running the query as an ODBC Direct (I forget the exact term) query so that the SQL is passed on to Teradata to be executed and the results returned to MS Access when the query completes.

Related

Slow Access query when joining SQL table with Access table

I am using a SQL database and MS Access 2019 as the front end. The SQL database tables are linked to the Access db using an ODBC connection.
All my queries (they have multiple joined linked tables) run just fine, but as soon as I add a join to a table stored in the Access app (for example, a small table just for mapping values) the query will slow to a crawl. Doesn't matter if the joined fields are indexed or what type of join I'm using.
If anyone has seen this behaviour and found a solution I would much appreciate hearing it.
Joining tables from two separate databases requires the client app to retrieve both tables in their entirety in order to determine the rows needed. That's why it's slow.
If your Access table is small, try using a stored procedure on the SQL side with the data from Access moved to a temporary table. (Or better yet, move the Access table to SQL).

sql temp table join between servers

So I have a summary i need to return to the end user application.
It should accept 3 parameters DateType, StartDate, EndDate.
Date Type will determine the date field I use to filter the data.
The way i accomplished this was putting all the IDs of the records for a datetype into a TEMP table and then joining my summary to the list of IDs.
This worked fine when running on the query on the SQL server that houses the data.
However, that is a replicated server, so when I compiled to a stored proc that would be on the server with the rest of the application data, it slowed the query down. IE 2 seconds vs 50 seconds.
I think the cross join from the temp table that is created on the SQL server then joining to the tables on the replciation server, is causing the slow down.
Are there any methods or techniques that I can use to get around this and build this all in one stored procedure?
If I create 3 stored procedures with their own date range, then they are fast again. However, this means maintaining multiple stored procs for the same thing.
First off, if you are running a version of SQL Server older than 2012 SP1, one problem is that users who aren't allowed to run DBCC SHOW_STATISTICS (which is most users who aren't sysadmins, see the "Permissions" section in the documentation) don't get access to statistics on remote tables. This can severely cripple the optimizer's ability to generate a good execution plan. Upgrading SQL Server or granting more permissions can help there.
If your query involves filtering or joining on a character column, make sure the remote server is flagged in the linked server options as "collation compatible". If this option is off, SQL Server can't assume strings can be compared across the servers and it will start pumping entire tables up and down just to make sure the data ends up where the comparison has to be made.
If the execution plan is as good as it gets and it's still not good enough, one general (lame) technique is to transfer all data locally first (SELECT * INTO #localtable FROM remote.db.schema.table), then run the query as a non-distributed query. Obviously, in order for this to work, the remote table cannot be "too big" and in some cases this actually has worse performance, depending on how many rows are involved. But it's always worth considering, because the optimizer does a better job with local tables.
Another approach that avoids pulling tables together across servers is packing up data in parameters to remote stored procedure calls. Entire tables can be passed as XML through an NVARCHAR(MAX), since neither XML columns nor table-valued parameters are supported in distributed queries. The basic idea is the same: avoid the need for the the optimizer to figure out an efficient distributed query. The best approach greatly depends on your data and your query, obviously.

join the local DB inside a pass-through query(openquery) on a linked server

I am querying a linked Oracle server from a local SQL Server instance using openquery. The issue is that I want to limit the results returned based on a table in the local DB (about 10k rows) while executing in the linked instance. Normally I would do it the other way around and do something like:
SELECT *
FROM OPENQUERY(DBlinked, 'SELECT…') link_serv
INNER JOIN localdb.table1 local_1
ON local_1.column_1 = link_serv.column_1
but because of the size of the linked tables being hit during the openquery (multiple tables with 100’s millions of rows) executing against the entire dataset then sub-setting locally with a join isn’t a good idea.
I’ve tried to reference back to the local server inside openquery but since the table is located on the local file it won't find the table, and I do not have write permission on the linked server so moving the local table over to the linked server would be problematic.
Any guidance on how to limit the results prior to all the joins would be appreciated. Using openquery is not a requirement -- its just the only elegant solution I've found for dealing with resource intensive queries against large linked tables.
Essentially what I’m trying to do is
SELECT * FROM OPENQUERY(oracleDB, '
SELECT *
FROM dbo.table1 Ot
INNER JOIN Local_SQLServ.DB1.DBO.Table1 St
ON St.Column1 = Ot.column1’)

SQL Server 2008, Sybase - large select queries over low bandwidth

I need to pull a large amount of data from various tables across a line that has very low bandwidth. I need to minimize the amount of data that gets sent too and fro.
On that side is a Sybase database, on this side SQL Server 2008.
What I need is to pull all the tables from the Sybase database that have to do with this office. Lets say I have the following tables as an example:
Farm
Tree
Branch
etc.
(one farm has many trees, one tree has many branches etc.)
Lets say the "Farm" table has a field called "CountryID", and I only want the data for where CountryID=12. The actual table structures I am looking at are very complex (and I am also not very familiar with them) so I want to try to keep the queries simple.
So I am thinking of setting up a series of views:
CREATE VIEW vw_Farm AS
SELECT * from Farm where CountryID=12
CREATE VIEW vw_Tree AS
SELECT * from Tree where FarmID in (SELECT FarmID FROM vw_Farm)
CREATE VIEW vw_Branch AS
SELECT * from Tree where BranchID in (SELECT BranchID FROM vw_Branch)
etc.
To then pull the actual data across I would then do:
SELECT * from vw_Farm into localDb.Farm
SELECT * from vw_Tree into localDb.Tree
SELECT * from vw_Branch into localDb.Branch
etc.
Simple enough to set up. I am wondering how this will perform though? Will it perform all the SELECT statements on the Sybase side and then just send back the result? Also, since this will be an iterative process, is it possible to index the views for subsequent calls?
Any other optimisation suggestions would also be welcome!
Thanks
Karl
EDIT: Just to clarify, the views will be set up in SQL Server. I am using a linked server using Sybase ASE to set up those views. What is worrying me in particular is whether the fact that the view is in SQL Server on this side and not on Sybase on that side will mean that for each iteration the data from the preceeing view will get pulled across to SQL Server first before the calculations get executed. I want Sybase to do all the calcs and just pass the results across.
It's difficult to be certain without testing, but my somewhat-relevant experience (using linked servers to platforms other than Sybase, and on SQL Server 2005) has been that using subqueries (such as your code for vw_Tree and vw_Branch) more or less guarantees that SQL Server will pull all the data for the outer table into a local temp table, then match it to the results of the inner query.
The problem is that SQL Server has no access to the linked server's table statistics, so can make no meaningful decisions about how to optimise the query.
If you want to be sure to have the work done on the Sybase server, your best bet will be to write code (could be views or stored procedures) on the Sybase side and reference them from SQL Server.
Linked server connections are, in my experience, not particularly resilient over flaky networks. If it's available, you could consider using Integration Services rather than linked-server queries - but even that may not be much better. You may need to consider falling back on moving text files with robocopy and bcp.

Join in linked server or join in host server?

Here's the situation: we have an Oracle database we need to connect to to pull some data. Since getting access to said Oracle database is a real pain (mainly a bureaucratic obstacle more than anything else), we're just planning on linking it to our SQL Server and using the link to access data as we need it.
For one of our applications, we're planning on making a view to get the data we need. Now the data we need is joined from two tables. If we do this, which would be preferable?
This (in pseudo-SQL if such a thing exists):
OPENQUERY(Oracle, "SELECT [cols] FROM table1 INNER JOIN table2")
or this:
SELECT [cols] FROM OPENQUERY(Oracle, "SELECT [cols1] FROM table1")
INNER JOIN OPENQUERY(Oracle, "SELECT [cols2] from table2")
Is there any reason to prefer one over the other? One thing to keep in mind: we are on a limitation on how long the query can run to access the Oracle server.
I'd go with your first option especially if your query contains a where clause to select a sub-set of the data in the tables.
It will require less work on both servers, assuming there are indices on the tables in the Oracle server that support the join operation.
If the inner join significantly reduces the total number of rows, then option 1 will result in much less network traffic (since you won't have all the rows from table1 having to go across the db link
What hamishmcn said applies.
Also, SQL Server doesn't really know anything about the indexes or statistics or cache kept by the oracle server. Therefore, the oracle server can probably do a much more efficient job with the join than the sql server can.