When I create a table from a query, it seems like the table is only populated with the data available from the query at the time of creation. Is this correct?
If new data gets added to the dependent tables, the created table won't stay in sync with the updated data?
Is there any way to make it always update from the query?
Yes, that is correct.
If you want the "table" to be automatically updated, then use a view. This stores the query and runs it each time you reference it.
You can also create a materialized view. This is like a view, but it stores the query results, so it is usually faster to query; depending on the database, it may need to be refreshed to pick up changes to the underlying tables.
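As a minimal illustration of the difference (SQL Server syntax, with made-up table and column names): SELECT ... INTO copies the data once at creation time, while a view re-runs its query every time it is referenced.

-- Static copy: rows added to Orders later never show up in ReportSnapshot
SELECT CustomerId, Amount
INTO ReportSnapshot
FROM Orders;

-- Live view: always reflects the current contents of Orders
CREATE VIEW ReportLive AS
SELECT CustomerId, Amount
FROM Orders;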
Related
I create a table from a join across 3 tables.
The issue that I'm facing is that this table becomes outdated once a modification affects one of the original tables used to supply its data.
How should I solve this problem? Is a trigger the only solution?
You have several options:
Live with the outdated data and periodically (say once a day or once an hour) update the table.
Use a trigger to update the associated values.
Use a view.
Use an indexed view if possible (see here).
The last option is a way to define a view that SQL Server updates automatically -- what is called a materialized view in other databases. That said, not all queries are supported in materialized views.
"There are two problems in computer science. Cache coherency, and naming things." Tables derived from other tables exemplify both.
SQL Server supports materialized views, if you actually need them. Before you decide you do, do what every good carpenter does: measure twice and cut once. That is, write an ordinary view, and see how fast it is. If it's too slow -- however defined -- find out why. Most of the time, slow views can be sped up using indexes. If you have 100s of millions of rows, you might convert the view to a function that takes an argument applied to the WHERE clause, making it a "parameterized" view.
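As a sketch of that last idea, a "parameterized" view can be written as an inline table-valued function in SQL Server. The table, column, and function names here (dbo.Orders, dbo.OrdersSince, @cutoff) are made up for illustration.

CREATE FUNCTION dbo.OrdersSince (@cutoff datetime)
RETURNS TABLE
AS
RETURN
(
    SELECT o.OrderId, o.CustomerId, o.Amount
    FROM dbo.Orders AS o
    WHERE o.OrderDate >= @cutoff   -- the argument is applied in the WHERE clause
);

It is then used like a view, but with the filter baked into the call: SELECT * FROM dbo.OrdersSince('20120101');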
Is it possible to delete rows from a View?
Usually, a view isn't something you can delete from; it's a kind of virtual table that shows you rows from one or more real tables in the database. If you want a row to disappear from a view, you need to either delete the data from the real tables behind the view, or alter the view-creating SQL so that that particular row won't be shown in the view. With some simpler views you can DELETE FROM (and UPDATE) the view; even then, though, the data is actually deleted from the underlying real table.
You also cannot generally add anything to a view; if you need completely new data, it has to be added to the real table(s) from which the view is created.
For view basics, see for example http://www.w3schools.com/sql/sql_view.asp
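To illustrate the simpler-view case mentioned above, here is a rough sketch (the table, view, and column names are made up) of deleting through a single-table view:

-- A simple, updatable view over one table
CREATE VIEW ActiveCustomers AS
SELECT CustomerId, Name, IsActive
FROM Customers
WHERE IsActive = 1;

-- Allowed because the view maps straight onto one table;
-- the row is actually removed from Customers, not merely hidden from the view
DELETE FROM ActiveCustomers
WHERE CustomerId = 42;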
Whether your view is updatable really depends on the database you are using and the way the view was created. As a general rule (which, again, varies from one DB to another), the SELECT statement that creates the view should reference a single table and contain no aggregates.
Here are the details for MySQL: http://dev.mysql.com/doc/refman/5.7/en/view-updatability.html
And for SQL Server: https://msdn.microsoft.com/en-CA/library/ms187956.aspx
InterSystems Caché: http://docs.intersystems.com/cache20152/csp/docbook/DocBook.UI.Page.cls?KEY=GSQL_views#GSQL_views_update
Please correct me if my understanding is incorrect, but the data within a view is always "up to date" when you query the view, because querying the view re-executes the query used to create it. The particular view that I'm creating contains millions of records, so I am wondering whether you can keep the "historical" data in the view and only add the new records to it?
I do not have table-writing privileges on the database.
EDIT: Real-time tracking data is constantly added to the database, and the view is meant to stitch together a lot of disparate information for easier BI analysis. So, the "new data" I am referring to is the constant addition of real-time tracking data.
EDIT: I'm writing this from an efficiency perspective (i.e. because there are millions of records, it would take a long time to recreate the entire result set on every query). Maybe what I'm really asking is whether SQL Server 2008 would "optimize" this query by only adding the new data, whether it would reload all the data, or whether there is even a way to "optimize" it so that the first case applies.
A view is basically a select statement with a name. When you query the view, the query "under the hood" is executed, which is why the data fetched is always "fresh".
A solution for your scenario would be to create a different view (or recreate the existing one) where you filter out the historical data (exclude rows that are historical, from your point of view, by adding an extra condition on a date column). Of course, the rest of the logic should be kept (tables joined, columns, calculations, etc.).
You can also keep the existing view and make a new one on top of it. The "body" of the new view should be something like:
select *
from (
    select *
    from existentView -- this should contain an AddedDate (or some sort of date)
) e
where e.AddedDate >= getdate() - 30
I have a problem deciding whether to use a view or a temp table.
I have a stored procedure that I call from a program. In that SP I store the result of a long query in a temp table, name the columns, run further queries against that table, store the results in labels, a GridView, or whatever, and then drop the temp table. I could also store the query result in a view and run queries against that view. So which is better, or in what case do I HAVE to use a view / temp table?
According to my research, a view has the benefits of security, simplicity, and column name specification. My temporary table fulfills all of that too (in my opinion).
If the query is "long" and you are accessing the results from multiple queries, then a temporary table is the better choice.
A view, in general, is just a short-cut for a select statement. It does not imply that the results are ever run and processed. If you use a view, the results will need to be regenerated each time it is used. Although subsequent runs of the view may be more efficient (say, because the pages used by the view's query are in cache), a temporary table actually stores the results.
In SQL Server, you can also use table variables (declare @t table . . .).
Using a temporary table (or table variable) within a single stored procedure would seem to have few implications in terms of security, simplicity, and column names. Security would be handled by access to the stored procedure. Column names are needed for either solution. Simplicity is hard to judge without more information, but nothing sticks out as being particularly complicated.
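As a rough sketch of the temp-table approach inside a stored procedure (the procedure, table, and column names here are invented, not taken from the question):

CREATE PROCEDURE dbo.GetReportData
AS
BEGIN
    SET NOCOUNT ON;

    -- Run the "long query" once and store the results
    SELECT o.CustomerId, o.OrderDate, o.Amount
    INTO #Results                      -- temp table: results are materialized, not re-run
    FROM dbo.Orders AS o
    WHERE o.OrderDate >= DATEADD(DAY, -30, GETDATE());

    -- Reuse the stored results as many times as needed
    SELECT CustomerId, SUM(Amount) AS Total
    FROM #Results
    GROUP BY CustomerId;

    SELECT COUNT(*) AS RowsReturned
    FROM #Results;

    DROP TABLE #Results;
END;

A table variable (DECLARE @t TABLE (CustomerId int, Amount money)) could stand in for #Results when the result set is small.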
It depends.
A view must replicate the processing of your "long query" each time it is run, while a temp table stores the results.
So do you want to use more processing or more storage?
You can persist some view values (an indexed view), which could help with processing, but you don't provide enough info to really explore this.
If you are talking about just storing the data for the use within a single procedure call, then a temp table is the way to go.
I'd like to also mention that, for temporary tables in MySQL,
you cannot refer to a TEMPORARY table more than once in the same query.
This makes temp tables inconvenient for cases where you want to self join on them. (SQL Server does not have this restriction.)
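For reference, this is the MySQL restriction being described, with made-up table names; the self join fails, and the usual workaround is to create a second copy of the temporary table:

-- MySQL
CREATE TEMPORARY TABLE tmp_orders AS
SELECT customer_id, order_date, amount FROM orders;

-- Fails with "ERROR 1137: Can't reopen table":
-- SELECT a.customer_id
-- FROM tmp_orders a
-- JOIN tmp_orders b ON a.customer_id = b.customer_id AND a.order_date < b.order_date;

-- Workaround: duplicate the temporary table and join the two copies
CREATE TEMPORARY TABLE tmp_orders_copy AS
SELECT * FROM tmp_orders;

SELECT a.customer_id
FROM tmp_orders a
JOIN tmp_orders_copy b
  ON a.customer_id = b.customer_id AND a.order_date < b.order_date;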
It is really a situational, operation-specific question, and the answer may vary depending on the requirements of the scenario.
However, a small point I would like to add: if you are using a view to store the results of a complex query that are in turn used in GridView operations, it can be troublesome to perform update operations through a complex view. Temp tables, on the other hand, handle this perfectly well.
Again, there are scenarios where views may be the better choice [as with multiple database servers, if not handled properly], but it depends on what you want to do.
In general I would use a temporary table when I want to refer multiple times to the same table within a stored procedure, and a view when I want to use the table across different stored procedures.
A view does not persist the data (in principle): each time you reference the view SQL uses the logic from the view to access the original table. So you would not want to build a view on a view on a view, or use multiple references to a view that has complex logic.
I have a 30 GB table which has 30-40 columns. I create reports using this table and it causes performance problems. I only use 4-5 columns of this table for the reports, so I want to create a second table for the reports. But the second table must be updated when the original table changes, without using triggers.
No matter what my query is, when it is executed SQL Server tries to cache all 30 GB. When the cache is fully loaded, SQL Server starts to use disk. This is what I want to avoid.
How can I do this?
Is there a way of doing this using SSIS?
Thanks in advance.
CREATE VIEW myView
AS
SELECT
    column1,
    column3,
    column4 * column7 AS computedColumn -- an expression in a view needs an explicit column name
FROM
    yourTable
A view is effectively just a stored query, like a macro. You can then select from that view as if it were a normal table.
Unless you go for materialised views, it's not really a table, it's just a query. So it won't speed anything up, but it does encapsulate code and assist in controlling what data different users/logins can read.
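For example, using the placeholder names from the view above, you query it just like a table:

SELECT column1, column3
FROM myView
WHERE column1 > 100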
If you are using SQL Server, what you want is an indexed view. Create a view using the columns you want and then place an index on them.
An indexed view stores the data in the view. It should keep the view up-to-date with the underlying table, and it should reduce the I/O for reading the table. Note: this assumes that your 4-5 columns are much narrower than the overall table.
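A minimal sketch of what that looks like, assuming the 30 GB table is dbo.BigTable, the report columns are column1 through column5, and column1 is unique (e.g. the primary key); none of these names come from the question:

CREATE VIEW dbo.ReportView
WITH SCHEMABINDING            -- required before the view can be indexed
AS
SELECT column1, column2, column3, column4, column5
FROM dbo.BigTable;            -- table must be schema-qualified; no SELECT *
GO

-- This index materializes the view: SQL Server stores and maintains just these columns
CREATE UNIQUE CLUSTERED INDEX IX_ReportView
    ON dbo.ReportView (column1);
GO

Reports can then read the much narrower dbo.ReportView (on non-Enterprise editions, add WITH (NOEXPAND) to force use of the index).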
Dems' answer with the view seems ideal, but if you are truly looking for a new table, create it and have it automatically updated with triggers.
Triggers placed on the primary table can be added for all INSERT, UPDATE and DELETE actions upon it. When the action happens, the trigger fires and can be used to do additional work... such as updating your new secondary table. You will pull from the inserted and deleted tables (MSDN); a rough sketch follows the links below.
There are many great existing articles here on triggers:
Article 1, Article 2, Google Search
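Here is that sketch; dbo.BigTable, dbo.ReportTable, and the Id/column names are assumptions for illustration, not the actual schema:

CREATE TRIGGER trg_BigTable_Sync
ON dbo.BigTable
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;

    -- Remove the old versions of any affected rows (updates show up in both deleted and inserted)
    DELETE rt
    FROM dbo.ReportTable AS rt
    JOIN deleted AS d ON rt.Id = d.Id;

    -- Copy in the new versions, keeping only the report columns
    INSERT INTO dbo.ReportTable (Id, column1, column2, column3, column4)
    SELECT Id, column1, column2, column3, column4
    FROM inserted;
END;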
You can create that second table just like you're thinking, and use triggers to update table 2 whenever table 1 is updated.
However, triggers present performance problems of their own; the speed of your inserts and updates will suffer. I would recommend looking for more conventional alternatives to improve query performance, which sounds like SQL Server since you mentioned SSIS.
Since it's only 4-5 out of 30 columns, have you tried adding an index which covers the query? I'm not sure if there are even more columns in your WHERE clause, but you should try that first. A covering index would actually do exactly what you're describing, since the table would never need to be touched by the query. Of course, this does cost a little in terms of space and insert/update performance. There's always a tradeoff.
On top of that, I can't believe that you would need to pull a large percentage of rows for any given report out of a 30 gb table. It's simply too much data for a report to have. A filtered index can improve query performance even more by only indexing the rows that are most likely to be asked for. If you have a report which lists the results for the past calendar month, you could add a condition to only index the rows WHERE report_date > '5/1/2012' for example.
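As a sketch of those two suggestions (all names here, including dbo.BigTable and report_date, are assumed for illustration):

-- Covering index: the report's columns are included, so the query never touches the base table
CREATE NONCLUSTERED INDEX IX_BigTable_Report
    ON dbo.BigTable (report_date)                  -- column(s) used in the WHERE clause
    INCLUDE (column1, column2, column3, column4);  -- columns the report selects

-- Filtered index: only index the rows the reports are likely to ask for
CREATE NONCLUSTERED INDEX IX_BigTable_Recent
    ON dbo.BigTable (report_date)
    INCLUDE (column1, column2, column3, column4)
    WHERE report_date > '20120501';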