SQL alternative to a View

I don't really know how to even ask for what I need.
So I'll try to explain my situation.
I have a rather simple SQL query that joins various tables, but I need to execute this query with slightly different conditions over and over again.
The execution time of the query is somewhere around 0.25 seconds.
But all the queries I need to execute together easily take 15 seconds.
This is way too long.
What I need is a table or view that holds the query results for me, so that I only need to select from this one table instead of joining large tables over and over again.
A view wouldn't really help because, as far as I know, it would just execute the same query over and over again.
Is there a way to have something like a view that holds its data as long as its source tables don't change, and only re-executes the query when it is really necessary?

I think what you described fits a materialized view with fast refresh on commit very well. However, your query needs to be eligible for fast refresh.
Another option is the result cache (result_cache); it is automatically invalidated when one of the source tables changes. I would try both to decide which one suits this particular task better.
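A rough sketch of both ideas, assuming hypothetical tables orders and customers and that the join satisfies Oracle's fast-refresh requirements (materialized view logs with rowids, and the rowid of every joined table in the select list):

    -- Materialized view logs are required on the base tables for fast refresh
    CREATE MATERIALIZED VIEW LOG ON orders    WITH ROWID;
    CREATE MATERIALIZED VIEW LOG ON customers WITH ROWID;

    -- Join-only materialized view, kept current on every commit
    CREATE MATERIALIZED VIEW order_summary_mv
      REFRESH FAST ON COMMIT
    AS
    SELECT o.ROWID AS o_rid, c.ROWID AS c_rid,
           o.order_id, o.amount, c.customer_name
    FROM   orders o, customers c
    WHERE  o.customer_id = c.customer_id;

    -- The repeated queries then read the precomputed join
    SELECT customer_name, SUM(amount)
    FROM   order_summary_mv
    WHERE  customer_name = :name
    GROUP  BY customer_name;

    -- Alternative: the result cache, invalidated when a source table changes
    SELECT /*+ RESULT_CACHE */ o.order_id, c.customer_name, o.amount
    FROM   orders o
    JOIN   customers c ON c.customer_id = o.customer_id
    WHERE  c.customer_name = :name;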

I would suggest table-valued functions for this purpose. Defining such a function requires coding in PL/SQL, but it is not that hard if the function is based on a single query.
You can think of such functions as a way of parameterizing views.
Here is a good place to start learning about them.
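A minimal sketch of such a function in PL/SQL, assuming a hypothetical orders table and a status parameter:

    -- Object and collection types describing one result row
    CREATE TYPE order_row AS OBJECT (order_id NUMBER, amount NUMBER);
    /
    CREATE TYPE order_tab AS TABLE OF order_row;
    /
    -- Pipelined table function: effectively a view with a parameter
    CREATE OR REPLACE FUNCTION orders_by_status (p_status IN VARCHAR2)
      RETURN order_tab PIPELINED
    IS
    BEGIN
      FOR r IN (SELECT order_id, amount FROM orders WHERE status = p_status) LOOP
        PIPE ROW (order_row(r.order_id, r.amount));
      END LOOP;
      RETURN;
    END;
    /
    -- Callers query it like a table
    SELECT * FROM TABLE(orders_by_status('OPEN'));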

Related

SQL Server Performance Tuning of large table

I have a table with 755 columns, currently holding around 2 million records, and it will grow. There are many procedures accessing it, joining it with other tables, and they are running slow. It's hard to split/normalize it now, as everything is already built and the customer is not ready to spend much on it. Is there any way to make query access to that table faster? Please advise.
Will a columnstore index help?
How little are they prepared to spend?
It may be possible to split this table into multiple 1 to 1 joined tables (vertical partitioning), then use a view to present it as one single blob to existing code.
With some luck you may get join elimination happening frequently enough to make it worthwhile.
The view will probably require INSTEAD OF triggers to fully replicate the existing logic. INSTEAD OF triggers have a number of restrictions, e.g. no support for the OUTPUT clause, which can prove too hard to overcome depending on your specific setup.
You can name the view the same as the existing table, which eliminates the need to fix code everywhere.
IMO this is the simplest thing you can do short of a full DB refactoring exercise.
See: http://aboutsqlserver.com/2010/09/15/vertical-partitioning-as-the-way-to-reduce-io/ and https://logicalread.com/sql-server-optimizer-may-eliminate-foreign-key-joins-mc11/#.WXgEzlERW6I
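A rough sketch of that layout; the table and column names are invented, and in practice the split would follow the actual access patterns:

    -- Split the wide table into 1-to-1 pieces that share the primary key
    CREATE TABLE dbo.BigTable_Hot  (Id INT PRIMARY KEY, Col1 INT, Col2 VARCHAR(50));
    CREATE TABLE dbo.BigTable_Cold (Id INT PRIMARY KEY, Col700 VARCHAR(200),
                                    CONSTRAINT FK_Cold_Hot FOREIGN KEY (Id)
                                        REFERENCES dbo.BigTable_Hot (Id));

    -- Present the pieces to existing code as if they were still one table
    CREATE VIEW dbo.BigTable
    AS
    SELECT h.Id, h.Col1, h.Col2, c.Col700
    FROM   dbo.BigTable_Hot  AS h
    JOIN   dbo.BigTable_Cold AS c ON c.Id = h.Id;

With a trusted foreign key between the pieces, queries that only touch columns from one side can sometimes benefit from join elimination.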
755 columns, that's a lot. You should try to index the columns that are most often used in WHERE clauses; this might speed up the process.
It is fine, don't worry about it; how many columns you have is not actually important in SQL Server (but be careful, I said 'have'). The main problems are the row count and how many columns you select in your queries. Here are a few points you can check first:
Do not use the * selector; change it wherever it is used.
In joins, do not join the table directly; you can filter it first in an inner select, as shown in the sketch after this list. (Just try it, I have no idea about your table, so I'm stating general rules.)
Try to reduce the amount of data, e.g. use a history table for old records. This technique depends on the needs of your organization.
Try to use column indexes and similar features.
And of course remove dynamic selects from your queries.
I hope one of them works.
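A small sketch of the "filter first in an inner select" point above, with made-up names:

    -- Joining the wide table directly
    SELECT o.OrderId, b.SomeColumn
    FROM   dbo.Orders   AS o
    JOIN   dbo.BigTable AS b ON b.Id = o.BigId
    WHERE  b.Status = 'ACTIVE';

    -- Filtering it first in a derived table, then joining the narrow result
    SELECT o.OrderId, b.SomeColumn
    FROM   dbo.Orders AS o
    JOIN  (SELECT Id, SomeColumn
           FROM   dbo.BigTable
           WHERE  Status = 'ACTIVE') AS b ON b.Id = o.BigId;

Depending on the optimizer, both forms may end up with the same plan; the underlying point is to select and filter only the columns and rows you actually need.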

Change SQL Query update time

I'm reasonably new to using SQL, so I apologise if this is a newbish question!
I want to be able to view the results of a query in a graph on my website. The problem, however, is that the query takes 2-3 seconds to process (due to a DISTINCT count over 200,000+ rows) and could likely be called multiple times a second.
If I put the query in a view instead, and have my graph access the view, is there any way I can set the view to update periodically instead of every time someone visits the site? Or is there another way around this?
EDIT: DBMS is MySQL
You could populate the data using a sproc or even just a normal TSQL if you want, using a Scheduled Job. Or, as Gijo suggested, you could look at database replication, but I am not sure that you need that kind of overhead if you can get away with the first choice of periodically caching the data.
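Since the DBMS is MySQL, one hedged sketch of "periodically caching the data" is a summary table refreshed by the event scheduler (table and column names are invented):

    -- Summary table the graph reads from
    CREATE TABLE graph_summary (
      label        VARCHAR(100) PRIMARY KEY,
      distinct_ct  INT NOT NULL,
      refreshed_at DATETIME NOT NULL
    );

    -- Requires the scheduler to be on: SET GLOBAL event_scheduler = ON;
    CREATE EVENT refresh_graph_summary
    ON SCHEDULE EVERY 5 MINUTE
    DO
      REPLACE INTO graph_summary (label, distinct_ct, refreshed_at)
      SELECT category, COUNT(DISTINCT user_id), NOW()
      FROM   big_table
      GROUP  BY category;

The website then runs a trivial SELECT against graph_summary instead of the expensive DISTINCT count.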
You may try replication. With replication, you can set up a certain table to be populated with data from a stored procedure and then use this table to show the graph.

Avoiding duplicating SQL code?

I know many ways to avoid duplicating code (in my case, PHP code). However, I am developing a rather big application that does some calculations in the database on the data it finds, and I have noticed the need to use the same SQL fragments in several places.
I don't like the idea of copying and pasting the same thing over and over again. What is a good way to handle this? Should I use stored procedures? I could almost calculate some of the stuff in PHP, except that most of the time the queries calculate values based on data not returned by the query, and it seems stupid to return extra data to PHP just so it can do those calculations. Sometimes that may be okay, but right now it does not feel so.
What should I do?
For example, in many SQL queries all over the application I am calculating something similar to this:
...
(SELECT SUM(amount) FROM InvoiceTransaction IT INNER JOIN Invoice I ON IT.invoiceId = I.id) AS total
...
FROM InvoiceTransaction IT
...
Note that I'm at home now so I'm writing this off the top of my head.
I think you have 2 solutions:
if the SQL returns a small amount of data, I would simply wrap the SQL invocation in a method call and call it (parameterising as necessary)
if the SQL handles a lot of data, I would keep that data in the database and use a stored procedure. You can then call that stored procedure without duplicating the code (but wrap the stored proc call in a function and call it - i.e. as in option 1)
I wouldn't necessarily shy away from stored procedures. But I would advise keeping business logic out of them (keep it in the application itself) and make sure you have sufficient unit testing around it.
I would not prefer stored procedures, especially not just for the sake of refactoring. You should consider writing a function that returns the records you need, and put your SQL queries in that function so you can call it instead of putting your SQL everywhere.
I think we would need an example of a query. Stored procs might be a good option. Or an alternative might be to use views. One advantage of having your queries in views or stored procs is that you can often use the database to see where your tables are used. The disadvantage is that you are locking yourself into one database, but you are probably doing that anyway.
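For instance, a hedged sketch of the view approach based on the invoice example above (assuming amount lives on InvoiceTransaction and invoiceId links it to Invoice):

    -- Encapsulate the repeated total calculation once
    CREATE VIEW InvoiceTotals AS
    SELECT I.id           AS invoiceId,
           SUM(IT.amount) AS total
    FROM   Invoice I
    INNER JOIN InvoiceTransaction IT ON IT.invoiceId = I.id
    GROUP  BY I.id;

    -- Every query that needs the total just joins the view
    SELECT I.*, t.total
    FROM   Invoice I
    JOIN   InvoiceTotals t ON t.invoiceId = I.id;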

Best to use SQL + JOINS or Stored Proc to combine results from multiple queries?

I am creating a table to summarize data that is gathered from about 8 or so queries that have very light logic/WHERE clauses and all select against different tables.
I was wondering what the best option would be to fetch the summarized data:
One query with multiple JOINS to gather all relevant information
A stored proc that encapsulates the logic and maybe executes the 8 queries and does the "joining" in some other way? This seems more modular and maintainable to me...but I'm not sure.
I am using SQL Server 2008 for this. Any suggestions?
If you can, use the usual SQL methods; databases are optimized to run them. This "joining in some other way" would probably require the use of a cursor, which slows everything down. Just let the DB do its job. If you need more performance, you should examine the execution plan and do what needs to be done there (e.g. adding indexes).
Databases are pretty good at figuring out the optimal way of executing SQL. It is what they are designed to do. Using stored procedures to load the data in chunks and combining it yourself will be more complex to write, and likely to be less efficient than letting the database just do it for you.
If you are concerned about reusing a complex query in multiple places, consider creating a view of it instead.
Depending on the size of the tables, joining 8 of them could be pretty hairy. I would try it that way first - as others have said, the db is pretty good at figuring this stuff out. If the performance is not as good as you would like, I would try a stored proc which creates a table variable (or a temp table) and inserts the data from each of the 8 tables separately. Then you can return the contents of the table variable to your app.
This method also makes it a little easier to add the 9th, 10th, etc tables in the future. And it gives you an easy way to do any processing you may need on the summarized data before returning it to your app.
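A hedged sketch of that stored-procedure approach in SQL Server; the source tables and columns are invented, and only two of the eight queries are shown:

    CREATE PROCEDURE dbo.GetSummary
    AS
    BEGIN
        SET NOCOUNT ON;

        DECLARE @summary TABLE (
            Source VARCHAR(50),
            ItemId INT,
            Amount DECIMAL(18, 2)
        );

        -- One lightweight insert per source query
        INSERT INTO @summary (Source, ItemId, Amount)
        SELECT 'Sales', SaleId, Amount
        FROM   dbo.Sales
        WHERE  SaleDate >= '20240101';

        INSERT INTO @summary (Source, ItemId, Amount)
        SELECT 'Refunds', RefundId, -Amount
        FROM   dbo.Refunds
        WHERE  Processed = 1;

        -- ... repeat for the remaining tables, then return the combined result
        SELECT Source, ItemId, Amount FROM @summary;
    END;

Adding a ninth or tenth source later is then just one more INSERT ... SELECT, and any extra processing of the summarized data can happen before the final SELECT.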

SQL Efficiency with Function

So I've got this database that helps organize information for academic conferences, but we sometimes need to know whether an item is "incomplete" - the rules behind what could make something incomplete are a bit complex, so I built them into a scalar function that just returns 1 if the item is complete and 0 otherwise.
The problem I'm running into is that when I call the function on a big table of data, it'll take about 1 minute to return the results. This is causing time-outs on the web site.
I don't think there's much I can do about the function itself. But I was wondering if anybody knows any techniques generally for these kinds of situations? What do you do when you have a big function like that that just has to be run sometimes on everything? Could I actually store the results of the function and then have it refreshed every now and then? Is there a good and efficient way to have it stored, but refresh it if the record is updated? I thought I could do that as a trigger or something, but if somebody ever runs a big update, it'll take forever.
Thanks,
Mike
If the function is deterministic you could add it as a computed column, and then index on it, which might improve your performance.
MSDN documentation.
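A minimal sketch of that idea; the table, columns, and rules are invented, and the function must be deterministic and schema-bound so the computed column can be persisted and indexed:

    -- Deterministic, schema-bound scalar function holding the completeness rules
    CREATE FUNCTION dbo.IsComplete (@Title VARCHAR(200), @Score INT)
    RETURNS BIT
    WITH SCHEMABINDING
    AS
    BEGIN
        RETURN CASE WHEN @Title IS NOT NULL AND @Score > 0 THEN 1 ELSE 0 END;
    END;
    GO

    -- Persisted computed column: evaluated on write instead of on every read
    ALTER TABLE dbo.ConferenceItem
        ADD IsComplete AS dbo.IsComplete(Title, Score) PERSISTED;

    -- Index it so incomplete items can be found without scanning the whole table
    CREATE INDEX IX_ConferenceItem_IsComplete ON dbo.ConferenceItem (IsComplete);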
The problem is that the function looks at an individual record and has logic such as "if this column is null" or "if that column is greater than 0". This logic is basically a black box to the query optimizer. There might be indexes on those fields it could use, but it has no way to know about them. It has to run this logic on every available record, rather than using the criteria to pare down the result set. In database parlance, we would say that the UDF is not sargable.
So what you want is some way to build your logic for incomplete conferences into a structure that the query optimizer can take better advantage of: match conditions to indexes and so forth. Off the top of my head, your options to do this include a view or a computed column.
Scalar UDFs in SQL Server perform very poorly at the moment. I only use them as a carefully planned last resort. There are probably ways to solve your problem using other techniques (even deeply nested views or inline TVF which build up all the rules and are re-joined) but it's hard to tell without seeing the requirements.
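For illustration, a hedged sketch of the inline TVF idea; the completeness rules and names are invented:

    -- Inline table-valued function: the optimizer can expand and optimize its body
    CREATE FUNCTION dbo.IncompleteItems ()
    RETURNS TABLE
    AS
    RETURN
        SELECT ci.ItemId, ci.Title
        FROM   dbo.ConferenceItem AS ci
        WHERE  ci.Title IS NULL
           OR  ci.Score <= 0
           OR  NOT EXISTS (SELECT 1 FROM dbo.Author AS a WHERE a.ItemId = ci.ItemId);
    GO

    -- Callers select from it like a view
    SELECT * FROM dbo.IncompleteItems();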
If your function is that inefficient, you'll have to deal with either out-of-date data or slow results.
It sounds like you care more about performance, so, as cmsjr said, add the data to the table.
Also, create a cron job to refresh the results periodically. Perhaps add an "updated" column to your database table, and then the cron job only has to re-process those rows.
One more thing: how complex is the function? Could you reduce its run time by pulling it out of SQL, perhaps writing it in a layer above the database layer?
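A rough sketch of that caching approach; the column names are invented, and the completeness function here is assumed to take the row's key:

    -- Cache the result plus a dirty flag maintained by the application or a trigger
    ALTER TABLE dbo.ConferenceItem
        ADD IsComplete   BIT NULL,
            NeedsRecheck BIT NOT NULL DEFAULT 1;

    -- Periodic job: re-run the expensive function only for rows flagged as changed
    UPDATE dbo.ConferenceItem
    SET    IsComplete   = dbo.CheckItemComplete(ItemId),
           NeedsRecheck = 0
    WHERE  NeedsRecheck = 1;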
I've encountered cases, in SQL Server 2000 at least, where a function performs terribly and just breaking the logic out and putting it inline in the query speeds things up tremendously. This is an edge case, but if you think the function itself is fine, you could try that. Otherwise I'd look at computing the column and storing it, as others are suggesting.
Don't be so sure that you can't tune your function.
Typically, with a 'completeness' check, your worst time is when the record's actually complete. For everything else, you can abort early, so either test the cases that are fastest to compute first, or those that are most likely to cause the record to be flagged incomplete.
For bulk updates, you either have to just sit and wait, or come up with a system where you can run a less complete but faster check first, and then a more thorough check in the background.
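A sketch of that early-abort ordering inside a scalar function; the individual checks are invented, and the point is simply to run the cheapest or most-likely-to-fail checks first:

    CREATE FUNCTION dbo.IsCompleteOrdered (@ItemId INT)
    RETURNS BIT
    AS
    BEGIN
        -- Cheapest check first: plain column tests on the row itself
        IF EXISTS (SELECT 1 FROM dbo.ConferenceItem
                   WHERE  ItemId = @ItemId AND (Title IS NULL OR Score <= 0))
            RETURN 0;

        -- More expensive check only runs if the cheap one passed
        IF NOT EXISTS (SELECT 1 FROM dbo.Author WHERE ItemId = @ItemId)
            RETURN 0;

        RETURN 1;
    END;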
As Cade Roux says, scalar functions are evil: they are interpreted for each row and as a result are a big problem where performance is concerned. If possible, use a table-valued function or a computed column.