What is the overhead for a cross server query vs. a native query? - sql

If I am connected to a particular server in SQL Server and I run a query referencing a table on another server, does this slow down the query, and if so, by how much (e.g. if I am joining heavily to tables on the server I am currently on)? I ask because I want to know whether it is ever more efficient to import that table to the server I am on than to run a cross-server query.

I would say it depends. Yeah, probably not the answer you wanted, but it depends on the situation. Using linked servers does come with a cost, especially depending on the number of rows you're trying to return and the types of queries you're trying to run. You should be able to view the execution plan for your queries and see how much of the cost goes to hitting the linked server tables. That, along with the time it takes to return the results, should help you determine whether it's needed.
In regards to bringing the tables locally, then that depends as well. Do you need up-to-date data or is the data static? If static, then importing the table would not be a bad idea. If that data constantly changes and you need the changes, then this might not be the best solution. You could always look into creating SSIS packages to do this on a nightly basis, but again, it just depends.
Good luck.

As sgeddes mentions, it depends.
My own experience with linked server queries is that they are pretty slow for large tables. Depending on how you write the query, the predicates may have to be evaluated on the server that is running the statement (very likely if you're joining to the remote tables), meaning it could be transferring the entire table to that server anyway and then filtering the result. And that is definitely bad for performance.
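To illustrate, here is a rough sketch (the linked server LINKEDSRV and the table and column names are made up for the example): the four-part-name join can end up streaming the whole remote table to the local server before filtering, while the OPENQUERY form sends the filter to the remote server so only matching rows cross the wire.

-- four-part name: the optimizer may pull the whole remote table locally before filtering
SELECT l.OrderID
FROM dbo.LocalOrders AS l
JOIN LINKEDSRV.RemoteDb.dbo.Customers AS c ON c.CustomerID = l.CustomerID
WHERE c.Region = 'EU';

-- OPENQUERY: the inner query runs on the linked server, so only matching rows come back
SELECT l.OrderID
FROM dbo.LocalOrders AS l
JOIN OPENQUERY(LINKEDSRV,
    'SELECT CustomerID FROM RemoteDb.dbo.Customers WHERE Region = ''EU''') AS c
    ON c.CustomerID = l.CustomerID;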

Related

Performance of calling stored procedure from different databases [duplicate]

Is there any performance hit when doing a select across another DB on the same physical machine? So I have 2 databases on the same physical machine running within the same SQL 2008 instance.
For instance in SomStoreProc on_this_db I run
SELECT someFields FROM the_other_db.dbo.someTable
So far from what I have read on the internet, most people seem to indicate NO.
Even if it is not a performance hit, it could be a problem in data integrity as FKs can't be enforced across databases.
However, it is more likely your procs need to be tuned, especially if they are thousands of lines long. To begin with, look for cursors, correlated subqueries and bad indexing. Also look for WHERE clauses that are non-sargable and scalar functions that are running row-by-agonizing-row.
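As a hypothetical example of the non-sargable point (the Orders table and its columns are invented names), wrapping the column in a function prevents an index seek, while rewriting the predicate as a range leaves the column bare:

-- non-sargable: the function on the column forces a scan
SELECT OrderID FROM dbo.Orders WHERE YEAR(OrderDate) = 2012;

-- sargable rewrite: an index on OrderDate can now be used for a seek
SELECT OrderID FROM dbo.Orders
WHERE OrderDate >= '20120101' AND OrderDate < '20130101';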
Of course the best way to prove that the separate database is not the issue is to take one slow proc, convert those tables to one database and test performance both ways. Please at least convince them to do this smaller test before they go ahead and make the horribly complicated and time-consuming change to one database and then find out they still have performance problems.
And remember, the execution plan is your friend when looking at these things.

How do I organize my queries in Oracle?

I have a base query made (Thanks Justin Cave!)
Now I have to use that query and join it to different tables and subqueries many times over to do checks against our data. Additional queries are likely to be added in the future, so in the end there will be maybe two dozen checks on the data, and the findings will be summarized in an SSRS report. If this were MSSQL, I would put the results of the queries into a temp table and finally run a select on the temp table. Having done as much research as possible, I've decided the best way would be to use the WITH clause, join with the other temp tables and queries to get results, then UNION all of the queries together to get my result. However, this seems like it is going to be extremely messy and large. I'd use global temporary tables, but they seem to be frowned upon in Oracle. Perhaps you have a better method for modularizing and organizing this?
Per our licensing agreement we are not able to add new tables in oracle (so I am told), but we are able to add view, stored procedures and functions.
Thanks in advance!
If materialized views are not forbidden, you can use them to get all the advantages of a temp table.
But unless you need the results of a subquery in several different queries, you can just use as many independent subqueries per query as you like and operate on them as if they were tables. Most of the time you'll get pretty decent query plans.
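A rough sketch of that shape, using made-up table and column names: each check is its own named subquery and the results are combined with UNION ALL at the end.

WITH base_data AS (
    SELECT account_id, status, balance
    FROM   accounts
    WHERE  status = 'OPEN'
),
negative_balances AS (
    SELECT account_id, 'NEGATIVE_BALANCE' AS finding
    FROM   base_data
    WHERE  balance < 0
),
missing_owners AS (
    SELECT b.account_id, 'MISSING_OWNER' AS finding
    FROM   base_data b
    WHERE  NOT EXISTS (SELECT 1 FROM account_owners o WHERE o.account_id = b.account_id)
)
SELECT * FROM negative_balances
UNION ALL
SELECT * FROM missing_owners;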
Also, in my eyes, using a global temp table to speed up analysis 10x is worth it — as long as you don't expose sensitive data to someone not trusted.
Roll them all up into various stored procedures and enclose them in Oracle packages.
Then you can have a package for each logic area of your application. E.g. PKG_USERS, PKG_ACCOUNTS, etc.
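For instance, a minimal package along those lines (the USERS table and the routines here are only illustrative) might look like:

CREATE OR REPLACE PACKAGE pkg_users AS
    FUNCTION  get_active_count RETURN NUMBER;
    PROCEDURE deactivate_user(p_user_id IN NUMBER);
END pkg_users;
/

CREATE OR REPLACE PACKAGE BODY pkg_users AS
    FUNCTION get_active_count RETURN NUMBER IS
        v_count NUMBER;
    BEGIN
        SELECT COUNT(*) INTO v_count FROM users WHERE status = 'ACTIVE';
        RETURN v_count;
    END get_active_count;

    PROCEDURE deactivate_user(p_user_id IN NUMBER) IS
    BEGIN
        UPDATE users SET status = 'INACTIVE' WHERE user_id = p_user_id;
    END deactivate_user;
END pkg_users;
/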
It is also easier to track changes because you can put these under version control and see all changes at a glance.
It works for me, hopefully it helps you...

views based on a view is slow

I have created views on a SQL Server 2005 database, and these views are based on views provided by a third party. I'm displaying them in our application via a JDBC connection and they seem to be very slow. I tried another method and created them as a table using a SQL SELECT INTO command, and in that case viewing the data in the application is fast. Can you please advise me about the best approach?
How can I improve the application performance?
Indexed view.
Use SSIS to get them into our database which is also an SQL Server 2008 R2.
What else?
The best way I have found to improve the performance of queries (including views) is to look at the query plans shown in SSMS. The first thing I look for is index or table scans. When you see any of those, there is a good chance an index is needed, and oftentimes you'll need to INCLUDE columns in the index for the index to actually get used.
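For example (table and column names are hypothetical), an index keyed on the column you filter by, with the selected columns added as INCLUDE columns, lets the plan use the index without a key lookup:

CREATE NONCLUSTERED INDEX IX_Orders_CustomerID
ON dbo.Orders (CustomerID)
INCLUDE (OrderDate, TotalAmount);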
Indexed views can give a tremendous performance boost. However, Microsoft has laid so many restrictions on them it's often very difficult to actually use them. They will also affect insert / update / delete performance on the base tables. So there is a trade-off.
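A bare-bones sketch of an indexed view (again with made-up names): the view must be created WITH SCHEMABINDING, and it only materializes once a unique clustered index is added.

CREATE VIEW dbo.vw_ActiveCustomers
WITH SCHEMABINDING
AS
SELECT CustomerID, CustomerName, Region
FROM dbo.Customers
WHERE IsActive = 1;
GO
CREATE UNIQUE CLUSTERED INDEX IX_vwActiveCustomers
ON dbo.vw_ActiveCustomers (CustomerID);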
I doubt that creating a separate table is a viable, scalable long-term approach unless the plan is to execute these queries a very small number of times. The process of copying this data can be extremely resource intensive.
You should understand where the slowness is first.
Materializing the data into a table obviously means later selects can be faster, but the copying may be slow. If the data is slowly changing, that is certainly a design approach which can work.
Indexed views have restrictions and all indexes will affect write performance, since they need to be updated when data changes.
It sounds like two servers could be in play here. It's not clear if the views you created are on your server or the 2005 server. If you create a view in one server on views in another linked server, it is possible that more data is being pulled between the servers than is strictly necessary (compared to all the views being on the same server and being able to be optimized together).
How can I improve the application performance?
Indexed view.
Use SSIS to get them into our database which is also an SQL Server 2008 R2.
What else?
Another option not mentioned is
Don't use views
My experience is that non-indexed views typically make things slower and indexed views are difficult to create due to restrictions.
If you encounter some problem where you think you need to use a view, try using a CTE or inline view instead.
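As an illustrative sketch (the object names are invented), the same logic a view would hold can usually be written inline as a CTE in the query that needs it:

WITH recent_orders AS (
    SELECT CustomerID, OrderID, OrderDate
    FROM dbo.Orders
    WHERE OrderDate >= DATEADD(MONTH, -3, GETDATE())
)
SELECT c.CustomerName, r.OrderID, r.OrderDate
FROM dbo.Customers AS c
JOIN recent_orders AS r ON r.CustomerID = c.CustomerID;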

How can my application benefit from temporary tables?

I've been reading a little about temporary tables in MySQL but I'm an admitted newbie when it comes to databases in general and MySQL in particular. I've looked at some examples and the MySQL documentation on how to create a temporary table, but I'm trying to determine just how temporary tables might benefit my applications and I guess secondly what sorts of issues I can run into. Granted, each situation is different, but I guess what I'm looking for is some general advice on the topic.
I did a little googling but didn't find exactly what I was looking for on the topic. If you have any experience with this, I'd love to hear about it.
Thanks,
Matt
Temporary tables are often valuable when you have a fairly complicated SELECT you want to perform and then perform a bunch of queries on that...
You can do something like:
CREATE TEMPORARY TABLE myTopCustomers
SELECT customers.*, COUNT(*) AS num FROM customers JOIN purchases USING (customerID)
JOIN items USING (itemID) GROUP BY customers.customerID HAVING num > 10;
And then do a bunch of queries against myTopCustomers without having to repeat the joins to purchases and items in each query. Then, when your application closes its database connection, no cleanup needs to be done, because the temporary table is dropped automatically.
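For instance, a follow-up query might be as simple as:

-- reuse the temporary table without repeating the joins
SELECT * FROM myTopCustomers ORDER BY num DESC LIMIT 20;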
Almost always you'll see temporary tables used for derived tables that were expensive to create.
First a disclaimer - my job is reporting so I wind up with far more complex queries than any normal developer would. If you're writing a simple CRUD (Create Read Update Delete) application (this would be most web applications) then you really don't want to write complex queries, and you are probably doing something wrong if you need to create temporary tables.
That said, I use temporary tables in Postgres for a number of purposes, and most will translate to MySQL. I use them to break up complex queries into a series of individually understandable pieces. I use them for consistency: by generating a complex report through a series of queries and offloading some of those queries into modules I use in multiple places, I can make sure that different reports are consistent with each other. (And make sure that if I need to fix something, I only need to fix it once.) And, rarely, I deliberately use them to force a specific query plan. (Don't try this unless you really understand what you are doing!)
So I think temp tables are great. But that said, it is very important for you to understand that databases generally come in two flavors. The first is optimized for pumping out lots of small transactions, and the other is optimized for pumping out a smaller number of complex reports. The two types need to be tuned differently, and a complex report run on a transactional database runs the risk of blocking transactions (and therefore making web pages not return quickly). Therefore you generally want to avoid using one database for both purposes.
My guess is that you're writing a web application that needs a transactional database. In that case, you shouldn't use temp tables. And if you do need complex reports generated from your transactional data, a recommended best practice is to take regular (eg daily) backups, restore them on another machine, then run reports against that machine.
The best place to use temporary tables is when you need to pull a bunch of data from multiple tables, do some work on that data, and then combine everything to one result set.
In MS SQL, temporary tables should also be used in place of cursors whenever possible because of the speed and resource impact associated with cursors.
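A hypothetical sketch of that pattern (the Orders table and its columns are made up): stage the rows of interest once, then work on them with set-based statements instead of fetching row by row in a cursor.

-- stage the rows once
SELECT OrderID, CustomerID, TotalAmount
INTO   #PendingOrders
FROM   dbo.Orders
WHERE  Status = 'PENDING';

-- one set-based update instead of a cursor loop
UPDATE o
SET    o.Priority = 'HIGH'
FROM   dbo.Orders AS o
JOIN   #PendingOrders AS p ON p.OrderID = o.OrderID
WHERE  p.TotalAmount > 1000;

DROP TABLE #PendingOrders;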
If you are new to databases, there are some good books by Joe Celko that review best practices for ANSI SQL. SQL for Smarties describes in great detail the use of temp tables, the impact of indexes, WHERE clauses, etc. It's a great in-depth reference book.
I've used them in the past when I needed to create evaluated data. That was before the time of views and subselects in MySQL though, and I generally use those now where I would have needed a temporary table. The only time I might still use them is if the evaluated data took a long time to create.
I haven't done them in MySQL, but I've done them on other databases (Oracle, SQL Server, etc).
Among other tasks, temporary tables provide a way for you to create a queryable (and returnable, say from a sproc) dataset that's purpose-built. Let's say you have several tables of figures -- you can use a temporary table to roll those figures up to nice, clean totals (or other math), then join that temp table to others in your schema for final output. (An example of this, in one of my projects, is calculating how many scheduled calls a given sales-related employee must make per week, bi-weekly, monthly, etc.)
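Sketched in T-SQL with invented names, that roll-up-then-join pattern looks something like:

-- roll the figures up once
SELECT EmployeeID, COUNT(*) AS CallsThisWeek
INTO   #CallTotals
FROM   dbo.ScheduledCalls
WHERE  CallDate >= DATEADD(DAY, -7, GETDATE())
GROUP BY EmployeeID;

-- then join the clean totals back for the final output
SELECT e.EmployeeName, t.CallsThisWeek
FROM   dbo.Employees AS e
JOIN   #CallTotals   AS t ON t.EmployeeID = e.EmployeeID;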
I also often use them as a means of "tilting" the data -- turning columns to rows, etc. They're good for advanced data processing -- but only use them when you need to. (My golden rule, as always, applies: If you don't know why you're using x, and you don't know how x works, then you probably shouldn't use it.)
Generally, I wind up using them most in sprocs, where complex data processing is needed. I'd love to give a concrete example, but mine would be in T-SQL (as opposed to MySQL's more standard SQL), and also they're all client/production code which I can't share. I'm sure someone else here on SO will pick up and provide some genuine sample code; this was just to help you get the gist of what problem domain temp tables address.

Effectively transforming data from one SQL database to another in live environment

We have a bit of a messy database situation.
Our main back-office system is written in Visual FoxPro with local data (yes, I know!).
In order to effectively work with the data in our websites, we have chosen to regularly export data to a SQL database. However the process that does this basically clears out the tables each time and does a re-insert.
This means we have two SQL databases - one that our FoxPro export process writes to, and another that our websites read from.
This question is concerned with the transform from one SQL database to the other (SqlFoxProData -> SqlWebData).
For a particular table (one of our main application tables), because various data transformations take place during this process, it's not a matter of straightforward UPDATE, INSERT and DELETE statements using self-joins; we're having to use cursors instead (I know!).
This has been working fine for many months but now we are starting to hit upon performance problems when an update is taking place (this can happen regularly during the day)
Basically when we are updating SqlWebData.ImportantTable from SqlFoxProData.ImportantTable, it's causing occasional connection timeouts/deadlocks/other problems on the live websites.
I've worked hard at optimising queries, caching etc etc, but it's come to a point where I'm looking for another strategy to update the data.
One idea that has come to mind is to have two copies of ImportantTable (A and B), with some concept of which table is currently 'active': update the non-active table, then switch the currently active table,
i.e. websites read from ImportantTableA whilst we're updating ImportantTableB, then we switch websites to read from ImportantTableB.
Question is, is this feasible and a good idea? I have done something like it before but I'm not convinced it's necessarily good for optimisation/indexing etc.
Any suggestions welcome, I know this is a messy situation... and the long term goal would be to get our FoxPro application pointing to SQL.
(We're using SQL 2005 if it helps)
I should add that data consistency isn't particularly important in this instance, seeing as the data is always slightly out of date.
There are a lot of ways to skin this cat.
I would attack the locking issues first. It is extremely rare that I would use CURSORS, and I think improving the performance and locking behavior there might resolve a lot of your issues.
I expect that I would solve it by using two separate staging tables: one for the FoxPro export into SQL, and alongside it one transformed into the final format. Then either swap the final table into production using sp_rename, or simply use three INSERT/UPDATE/DELETE statements in a transaction to apply all changes from the final table to production. Either way, there is going to be some locking there, but how big are we talking about?
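If the swap route is taken, a minimal sketch (assuming a staging copy named ImportantTable_staging, an invented name) is to rename inside one short transaction so readers only ever see a complete table:

BEGIN TRAN;
EXEC sp_rename 'dbo.ImportantTable', 'ImportantTable_old';
EXEC sp_rename 'dbo.ImportantTable_staging', 'ImportantTable';
COMMIT;
-- ImportantTable_old can then be truncated and reused as the next staging copy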
You should be able to maintain one db for the website and just replicate to that table from the other sql db table.
This is assuming that you do not update any data from the website itself.
"For a particular table (one of our main application tables), because various data transformations take places during this process, it's not a straightforward UPDATE, INSERT and DELETE statements using self-joins, but we're having to use cursors instead (I know!)"
I cannot think of a case where I would ever need to perform an insert, update or delete using a cursor. If you can write the select for the cursor, you can convert it into an insert, update or delete. You can join to other tables in these statements and use the CASE statement for conditional processing. Taking the time to do this in a set-based fashion may solve your problem.
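As a hedged sketch of what that rewrite can look like (the join key and columns here are invented; only the database and table names come from the question):

UPDATE w
SET    w.Price       = CASE WHEN f.Discontinued = 1 THEN 0 ELSE f.Price END,
       w.Description = f.Description
FROM   SqlWebData.dbo.ImportantTable    AS w
JOIN   SqlFoxProData.dbo.ImportantTable AS f ON f.ID = w.ID;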
One thing you may consider if you have lots of data to move: we occasionally create a view over the data we want and then have two tables, one active and one that the data will be loaded into. When the data is finished loading, as part of your process run a simple command to switch the table the view uses to the one you just finished loading. That way the users are only down for a couple of seconds at most, and you won't create locking issues where they are trying to access data while you are loading.
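A sketch of the switch itself (the view and table names are invented): once the load into the inactive copy finishes, repoint the view, which only takes a moment.

ALTER VIEW dbo.ImportantTableView
AS
SELECT * FROM dbo.ImportantTableB;  -- was ImportantTableA before this load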
You might also look at using SSIS to move the data.
Do you have the option of making the updates more atomic, rather than the stated 'clear out and re-insert'? I think Visual FoxPro supports triggers, right? For your key tables, can you add a trigger on update/insert/delete to capture the IDs of records that change, then move (or delete) just those records?
Or how about writing all changes to an offline database, and letting SQL Server replication take care of the sync?
[sorry, this would have been a comment, if I had enough reputation!]
Based on your response to Ernie above, you asked how to replicate databases. Here is Microsoft's how-to about replication in SQL 2005.
However, if you're asking about replication and how to do it, it suggests to me that you are a little light on SQL Server experience. That being said, it's fairly easy to muck things up, and while I'm all for learning by experience, if this is mission-critical data you might be better off hiring a DBA or, at the very least, testing the #$##$% out of this before you actually implement it.