I have a weird scenario. I tried to see if I could find any help on the topic, but I either don't know how to search for it properly, or there is nothing to find.
So here is the scenario.
I have a table T_A. From T_A, I created a view V_B. I can make UPDATEs to V_B, and it works just fine. But when I create a view V_C which is a UNION of T_A and another table T_D, the view V_C is not updatable. I understand the logic behind why that is the case.
But my question is: is there something I can do to combine the 2 tables and still be able to update?
Maybe in a way have table T_D extend T_A?
Some extra information: T_A has items 1-10 and T_D has items 100-200. I want to combine them so there is an updatable table/view that has items 1-10 and 100-200.
If you have a non-updatable view, you can always make it updatable by defining INSTEAD OF triggers on the view. That means you would need to implement the logic that translates DML against the view into DML against one or both of the base tables. In your case, it sounds like that would be the logic to figure out which of the two tables to update.
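For example, a minimal sketch in Oracle, assuming both tables share an ITEM_ID key and an ITEM_NAME column (those column names are my assumption, not from the question):

CREATE OR REPLACE VIEW v_c AS
SELECT item_id, item_name FROM t_a
UNION ALL
SELECT item_id, item_name FROM t_d;

CREATE OR REPLACE TRIGGER trg_v_c_update
INSTEAD OF UPDATE ON v_c
FOR EACH ROW
BEGIN
  -- items 1-10 live in T_A, items 100-200 in T_D (per the question)
  IF :old.item_id <= 10 THEN
    UPDATE t_a SET item_name = :new.item_name WHERE item_id = :old.item_id;
  ELSE
    UPDATE t_d SET item_name = :new.item_name WHERE item_id = :old.item_id;
  END IF;
END;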
A couple of points, though.
If T_A and T_D have non-overlapping data, it doesn't make sense to use a UNION, which does an implicit DISTINCT. You almost certainly want to use the less expensive UNION ALL.
If you find yourself storing data about items in two separate tables, only to UNION ALL those two tables together in a view, it is highly likely that you have an underlying data-model problem. It would seem to make much more sense to have a single table of items, possibly with an ITEM_TYPE column that is either A or D.
It may be possible to make your view updatable if you use a UNION ALL and have (or add) non-overlapping constraints that would allow you to turn your view into a partition view. That's something that has existed in Oracle for a long time but you won't find a whole lot of documentation about it in recent versions because Oracle partitioning is a much better solution for the vast majority of use cases today. But the old 7.3.4 documentation should still work.
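As a sketch, the non-overlapping ranges from the question could be declared with check constraints (the ITEM_ID column name is my assumption):

ALTER TABLE t_a ADD CONSTRAINT t_a_item_range CHECK (item_id BETWEEN 1 AND 10);
ALTER TABLE t_d ADD CONSTRAINT t_d_item_range CHECK (item_id BETWEEN 100 AND 200);

With those in place, a UNION ALL view over the two tables is a candidate partition view, and the optimizer can skip whichever table cannot contain the requested key.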
Related
I created a table from a join of 3 tables.
The issue I'm facing is that this table becomes outdated once a modification affects one of the original tables used to supply its data.
How should I solve this problem? Is a trigger the only solution?
You have several options:
Live with the outdated data and periodically (say once a day or once an hour) update the table.
Use a trigger to update the associated values.
Use a view.
Use an indexed view if possible.
The last option is a way to define a view that SQL Server updates automatically -- what is called a materialized view in other databases. That said, not all queries can be used in indexed views.
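For illustration, a minimal indexed view over a hypothetical three-table join (every table and column name here is invented for the sketch):

CREATE VIEW dbo.order_details
WITH SCHEMABINDING
AS
SELECT o.order_id, o.product_id, c.customer_name, p.product_name
FROM dbo.orders o
JOIN dbo.customers c ON c.customer_id = o.customer_id
JOIN dbo.products p ON p.product_id = o.product_id;
GO
-- the unique clustered index is what makes SQL Server persist and maintain the view
CREATE UNIQUE CLUSTERED INDEX ux_order_details
ON dbo.order_details (order_id, product_id);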
"There are two problems in computer science. Cache coherency, and naming things." Tables derived from other tables exemplify both.
SQL Server supports materialized views, if you actually need them. Before you decide you do, do what every good carpenter does: measure twice and cut once. That is, write an ordinary view, and see how fast it is. If it's too slow -- however defined -- find out why. Most of the time, slow views can be sped up using indexes. If you have 100s of millions of rows, you might convert the view to a function that takes an argument applied to the WHERE clause, making it a "parameterized" view.
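A sketch of that last idea, an inline table-valued function acting as a "parameterized" view (all names hypothetical):

CREATE FUNCTION dbo.fn_items_by_type (@item_type char(1))
RETURNS TABLE
AS
RETURN
    SELECT i.item_id, i.item_name, i.item_type
    FROM dbo.items i
    WHERE i.item_type = @item_type;  -- the argument is applied in the WHERE clause

You would query it as SELECT * FROM dbo.fn_items_by_type('A'); because the function is inline, the optimizer expands it into the calling query just as it would a view.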
I have a table with 755 columns, currently holding around 2 million records, and it will grow. Many procedures that access it with joins to other tables are running slow. It's hard to split/normalize the table now, as everything is already built and the customer is not ready to spend much on it. Is there any way to make query access to that table faster? Please advise.
Will a columnstore index help?
How little are they prepared to spend?
It may be possible to split this table into multiple 1-to-1 joined tables (vertical partitioning), then use a view to present it as one single blob to existing code; see the sketch after the links below.
With some luck you may get join elimination happening frequently enough to make it worthwhile.
The view will probably require INSTEAD OF triggers to fully replicate existing logic. INSTEAD OF triggers have a number of restrictions, e.g. no support for the OUTPUT clause, which can prove too hard to overcome depending on your specific setup.
You can name your view the same as the existing table, which will eliminate the need to fix code everywhere.
IMO this is the simplest you can do short of a full DB re-factoring exercise.
See: http://aboutsqlserver.com/2010/09/15/vertical-partitioning-as-the-way-to-reduce-io/ and https://logicalread.com/sql-server-optimizer-may-eliminate-foreign-key-joins-mc11/#.WXgEzlERW6I
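A hedged sketch of the vertical split (all names invented; in practice you would group the 755 columns by how they are used together):

-- hot columns: the ones most queries touch
CREATE TABLE dbo.big_table_hot (
    id   int NOT NULL PRIMARY KEY,
    col1 int,
    col2 varchar(50)
);
-- cold columns: rarely used, 1:1 with the hot table
CREATE TABLE dbo.big_table_cold (
    id   int NOT NULL PRIMARY KEY REFERENCES dbo.big_table_hot (id),
    col3 varchar(max)
);
GO
-- present the split tables to existing code as one single blob
CREATE VIEW dbo.big_table
AS
SELECT h.id, h.col1, h.col2, c.col3
FROM dbo.big_table_hot h
LEFT JOIN dbo.big_table_cold c ON c.id = h.id;

Because big_table_cold.id is unique, queries through the view that reference only hot columns allow the optimizer to eliminate the join to the cold table entirely.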
755 columns, that's a lot. You should try to index the columns that are used most often in WHERE clauses; this might speed things up.
It is fine, don't worry about it: how many columns you have is not in itself important in SQL Server (but be careful, I said 'have'). The main problems are the row count and how many columns you select in your queries. Here are a few points you can check first:
Do not use the * selector; replace it wherever it is used.
In joins, do not join the table directly; you can filter it first in an inner select, as in the sketch after this list. (Just try it; I have no idea about your table, so I'm telling you the general rules.)
Try to diminish the data count, for example by moving old records to a history table. This technique depends on the needs of your organization.
Try features such as columnstore indexes.
And of course, remove dynamic selects from your queries.
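As a sketch of the pre-filtering point above (table and column names are hypothetical):

SELECT w.id, w.col_a, f.col_b
FROM dbo.wide_table w
INNER JOIN (
    -- filter the other table first, then join the smaller result
    SELECT id, col_b
    FROM dbo.other_table
    WHERE status = 'active'
) f ON f.id = w.id;

Note that modern optimizers often push such filters down on their own, so measure before and after.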
I hope one of these works.
What is the best way to roll up values from a series of child tables into a parent table in SQL Server?
Let's say we have a contracts table. This table has a series of child tables, such as contract_timesheets, contract_materials, contract_other_expenses - etc. What is the best way to pull costs / hours / etc out of those child tables and make them easily accessible in the parent table?
Option 1: My first thought would be to simply use a view. An example might be something like this:
SELECT
    c.contract_code,
    c.caption,
    c.description,
    (
        SELECT SUM(t.hours * l.rate_hourly)
        FROM timesheets t
        JOIN labor l ON l.hr_code = t.hr_code
        WHERE t.contract_code = c.contract_code
    ) AS labor_cost,
    (
        SELECT ...
    ) AS material_cost,
    ...
FROM contracts c
So we'll have a view that might have a dozen or more subqueries like that, many of which will themselves need joins to pull in all of the info we need.
This works completely fine, until we have hundreds of thousands of rows. Then things start to get noticeably slow. It's still workable, but if the row count gets much higher, or the server gets too much other workload, I'm concerned this won't be workable.
Is there a more efficient way to structure such a view?
Option 2: The other obvious solution is to roll those numbers up into physical fields in the parent table. The big issue with that is just maintaining the numbers when the data might be accessed from a variety of clients. Maybe it's a report, maybe it's a form, maybe it's some integration service. So trying to use some premade roll-up SQL file that gets run as an event in the front-end prior to displaying the report / chart / whatever isn't an ideal solution.
To ensure that the roll-up numbers stay in sync, we could attach a series of triggers to all of the child tables (and possibly to relatives of those, if the numbers in the child tables rely on something else). Every time the source numbers get updated, we roll them up into the parent. This seems like a lot of trouble, but if the triggers are written correctly, I suppose this would work fine.
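For instance, a sketch of one such trigger on the timesheets table, assuming a labor_cost column has been added to contracts (names follow the Option 1 example):

CREATE TRIGGER trg_timesheets_rollup
ON timesheets
AFTER INSERT, UPDATE, DELETE
AS
BEGIN
    SET NOCOUNT ON;
    -- re-aggregate only the contracts touched by this statement
    UPDATE c
    SET labor_cost = (
        SELECT SUM(t.hours * l.rate_hourly)
        FROM timesheets t
        JOIN labor l ON l.hr_code = t.hr_code
        WHERE t.contract_code = c.contract_code
    )
    FROM contracts c
    WHERE c.contract_code IN (SELECT contract_code FROM inserted
                              UNION
                              SELECT contract_code FROM deleted);
END;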
Option 3: Do everything in the UI. This is also an option, but with a variety of clients accessing the data, it makes things unpleasant.
Option 4(?): Since most of these records are actually completed, with no need to add more data, I can also imagine some kind of hybrid system. The base table for the parent contract would have physical columns for the labor costs, material costs, or whatever. When a contract is marked as Closed (or some other status indicating no more data needs to be entered), those physical columns would be filled in (otherwise they're NULL). The view accessible to the clients could then decide, based upon the status (or just an ISNULL check), whether to return the data directly from the physical columns or to calculate it on the fly. I'm not sure how the performance would be, but it might be worth a look. This would mean that the roll-up numbers only need to be calculated for a few thousand rows at most; everything else would come from the physical fields.
So, what is the right way to do this? Am I missing other possibilities?
Try using an Indexed View. This "materializes" the view. Creating a clustered index on the view will allow your queries to go directly to the index rather than all of the underlying tables/queries that make up the view.
I think the view is probably the right answer, but the way you have the query written, with correlated subqueries in the SELECT list, may be what causes the performance degradation as the rows increase. If you write everything out as joins with GROUP BY, it might allow the query optimizer to simplify the plan for the view at execution time and give you better performance.
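For example, the labor cost could come from a pre-aggregated derived table instead of a correlated subquery; pre-aggregating each child table separately also avoids fan-out when several of them are joined in:

SELECT
    c.contract_code,
    c.caption,
    c.description,
    lc.labor_cost
FROM contracts c
LEFT JOIN (
    SELECT t.contract_code, SUM(t.hours * l.rate_hourly) AS labor_cost
    FROM timesheets t
    JOIN labor l ON l.hr_code = t.hr_code
    GROUP BY t.contract_code
) lc ON lc.contract_code = c.contract_code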
Have you also looked into Indexed Views? There are a lot of restrictions to creating indexed views so they may not be a viable option for you, but it's something to consider. Essentially an indexed view is a sort of denormalization. It would allow SQL Server to keep the aggregations updated for you automatically as the underlying data in the tables changes. It may of course degrade performance for inserts, updates and deletes, but it's something to consider if the performance of the aggregations is critical.
To get the best read performance in this case, indexed views are the way to go.
CREATE VIEW dbo.labor_costs
WITH SCHEMABINDING
AS
SELECT t.contract_code, t.hr_code,
       SUM(t.hours * l.rate_hourly) AS labor_cost,
       COUNT_BIG(*) AS row_count  -- required in an indexed view with GROUP BY
FROM dbo.timesheets t
JOIN dbo.labor l ON l.hr_code = t.hr_code
GROUP BY t.contract_code, t.hr_code
GO
CREATE UNIQUE CLUSTERED INDEX UX_LaborCosts
ON dbo.labor_costs (contract_code, hr_code)
Once you have the indexed view you can left join to it. For example:
SELECT
    c.contract_code,
    c.caption,
    c.description,
    SUM(lb.labor_cost) AS labor_cost
FROM
    dbo.contracts c
    -- the view is grouped by contract and hr_code, so re-aggregate
    -- to one row per contract
    LEFT JOIN dbo.labor_costs lb WITH (NOEXPAND)
        ON lb.contract_code = c.contract_code
GROUP BY c.contract_code, c.caption, c.description
I designed 5 stored procedures which use almost the same join conditions, but the parameters or values in the WHERE clause change for each on different runs.
Is the best solution to create a view with all the join conditions but without the WHERE clause, and then query that view? Do views update themselves automatically once created?
Can I do sub-queries, or a query similar to the one below? (I think I read somewhere that views do not support sub-queries, but I'm not 100% sure.)
select count(x1) as x1cnt, count(x2) as x2cnt
from (
    select x1, x2,
        (
            case when x1 = 'y' then 1 else 0 end +
            case when x2 = 'y' then 1 else 0 end
        ) as per
    from vw_viewname
) v1
where v1.per = 1
Updated below:
In my queries I also use joins similar to this:
SELECT c1, c2, c3
FROM [[join conditions - 5 tables]]
INNER JOIN
(
    SELECT x1, x2, x3, some case statements
    FROM [[join conditions - 5 tables]]
    WHERE t1.s1 = val1 AND t2.s2 = v2 etc
) s
    ON s.id = id
So I'm using the joins twice, and I thought I could reduce that using views.
Leaving out the WHERE clause could make the query run more slowly, or just give more results than a specific query would. You will have to determine whether that is advantageous for your system.
You will get the common view results table to work with. A view basically runs its query when you use it, so you will get results as if you had run the query yourself by some other mechanism. You can do sub-queries on a view just as if it were another table; that should not be a problem. But if you have 5 different queries doing 5 specific things, it is probably beneficial to leave it that way. One or two of those may be called more often, and you would be trading off their performance against a general view table, gaining nothing really other than view reuse.
I would only construct the view if you have some specific benefit from doing so.
Also, I found a post that may be similar; I don't know if you will find it helpful or not.
EDIT: Well, I think it would just make things worse, really. You would just be calling the view twice, and if it's a generic view, each of those calls is going to return a lot of generic results to deal with.
I would say just focus on optimizing those queries to give you exactly what you need. That's really what you have 5 different procedures for anyway, right? :)
It's 5 different queries so leave it like that.
It's seductive to encapsulate similar JOINs in a view, but before you know it you have views on top of views and awful performance. I've seen it many times.
The "subquery in a view" thing probably refers to indexed views which have limitations.
Unless you're talking about an indexed view, the view will actually run the script that generates it on demand. In that regard, it would be the same as using a subquery.
If I were you, I would leave it as it is. It may seem like you should compact your code (each of the 5 scripts has almost the same code), but it's what is different that is important here.
You can have subqueries in a view, and that approach is perfectly acceptable.
SQL Server views do support sub-queries. And, in a sense, views do auto-update themselves, because a view is not a persisted object (unless you use an indexed view). With a non-indexed view, each time you query the view, it uses the underlying tables. So your view will be as up to date as the tables it is based upon.
It sounds to me like a view would be a good choice here.
It's fine to create a view, even if it contains a subselect. You can remove the WHERE clause for the view.
Are you sure you want to use COUNT like that without a GROUP BY? It counts the number of rows with non-null values of its argument.
I've done a lot of presentations recently on the simplification offered by the Query Optimiser. Essentially if you have planned your joins well enough, the system can see that they're redundant and ignore them completely.
http://msmvps.com/blogs/robfarley/archive/2008/11/09/join-simplification-in-sql-server.aspx
Stored procedures will do the same work each time (parameters having some effect), but a view (or inline TVF) will be expanded into the outer query and simplified out.
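A hypothetical illustration (all names invented): if the shared joins live in a view, a query that only references columns from one table lets the optimizer simplify the rest away, provided the keys are declared (unique keys, trusted foreign keys):

CREATE VIEW dbo.v_common
AS
SELECT t1.id, t1.c1, t2.c2, t3.c3
FROM dbo.t1
JOIN dbo.t2 ON t2.id = t1.id
JOIN dbo.t3 ON t3.id = t1.id;
GO
-- only c1 is referenced, so the joins to t2 and t3 can be
-- simplified out of the plan
SELECT id, c1
FROM dbo.v_common
WHERE id = 42;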
Say that I have two tables like those:
Employers (id, name, ..., deptId).
Depts(id, deptName, ...).
But that data is not going to be modified very often, and I want a query like this:
SELECT name, deptName
FROM Employers
JOIN Depts ON Employers.deptId = Depts.id
WHERE Employers.id = 'ID'
to run as fast as it can.
Two possible solutions come to my head:
Denormalize the table:
With this solution I will lose some of the great advantages of having a normalized database, but here performance is a MUST.
Create a view for that denormalized data.
I would keep the data normalized, and (here is my question) would the performance of a query over that view be faster than without the view?
Or, to ask the same question another way: is the view "interpreted" every time you make a query over it, or how do views actually work in a DBMS?
Generally, unless you "materialize" a view, which is an option in some software like MS SQL Server, the view is just translated into queries against the base tables, and is therefore no faster or slower than the original (minus the minuscule amount of time it takes to translate the query, which is nothing compared to actually executing the query).
How do you know you've got performance problems? Are you profiling it under load? Have you verified that the performance bottleneck is these two tables? Generally, until you've got hard data, don't assume you know where performance problems come from, and don't spend any time optimizing until you know you're optimizing the right thing - 80% of the performance issues come from 20% of the code.
If Depts.ID is the primary key of that table, and you index the Employers.DeptID field, then this query should remain very fast even over millions of records.
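A minimal sketch of that index, using the column names from the question:

CREATE INDEX idx_employers_deptId ON Employers (deptId);
-- Depts.id is assumed to already be the primary key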
Denormalizing doesn't make sense to me in that scenario.
Generally speaking, performance of a view will be almost exactly the same as performance when running the query itself. The advantage of a view is simply to abstract that query away, so you don't have to think about it.
You could use a Materialized View (or "snapshot" as some say), but then your data is only going to be as recent as your last refresh.
In a comment to one of the replies, the author of the question explains that he is looking for a way to create a materialized view in MySQL.
MySQL does not wrap the concept of the materialized view in a nice package for you like other DBMSes, but it does have all the tools you need to create one.
What you need to do is this:
Create the initial materialization of the result of your query.
Create a trigger on insert into the employers table that inserts into the materialized table all rows that match the newly inserted employer.
Create a trigger on delete in the employers table that deletes the corresponding rows from the materialized table.
Create a trigger on update in the employers table that updates the corresponding rows in the materialized table.
Same for the departments table.
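A sketch of steps 1 and 2 in MySQL, using the column names from the question (the delete and update triggers follow the same pattern):

-- step 1: the initial materialization
CREATE TABLE emp_dept_mv AS
SELECT e.id AS emp_id, e.name, d.deptName
FROM Employers e
JOIN Depts d ON e.deptId = d.id;

-- step 2: keep it in sync on insert
DELIMITER //
CREATE TRIGGER employers_after_insert
AFTER INSERT ON Employers
FOR EACH ROW
BEGIN
    INSERT INTO emp_dept_mv (emp_id, name, deptName)
    SELECT NEW.id, NEW.name, d.deptName
    FROM Depts d
    WHERE d.id = NEW.deptId;
END//
DELIMITER ;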
This may work ok if your underlying tables are not frequently updated; but you need to be aware of the added cost of create/update/delete operations once you do this.
Also, you'll want to make sure some DBA who doesn't know about your trickery doesn't go migrating the database without migrating the triggers, when the time comes. So document it well.
Sounds like premature optimisation unless you know it is a clear and present problem.
MySQL does not materialise views, they are no faster than queries against the base tables. Moreover, in some cases they are slower as they get optimised less well.
But views also "hide" things from developers who maintain the code in the future, making them imagine that the query is less complex than it actually is.