SQL Server 2005 Computed Column Result From Aggregate Of Another Table Field's Value - sql-server-2005

Sorry for the long question title.
I guess I'm on to a loser on this one but on the off chance.
Is it possible to make a computed column in one table the result of an aggregate function applied to a field in another table?
i.e.
You have a table called 'mug', this has a child called 'color' (which makes my UK head hurt but the vendor is from the US, what you going to do?) and this, in turn, has a child called 'size'. Each table has a field called sold.
The size.sold increments by 1 for every mug of a particular colour and size sold.
You want color.sold to be an aggregate of SUM size.sold WHERE size.colorid = color.colorid
You want mug.sold to be an aggregate of SUM color.sold WHERE color.mugid = mug.mugid
Is there any way to make mug.sold and color.sold just work themselves out, or am I going to have to go mucking about with triggers?

You can't have a computed column directly reference a different table, but you can have it reference a user-defined function, and that function can query the other table. Here's a link to an example of implementing a solution like this.
http://www.sqlservercentral.com/articles/User-Defined+functions/complexcomputedcolumns/2397/

No, not directly. A computed column's expression can only reference other columns on the same row (anything else has to go through a user-defined function). To calculate an aggregate off another table, the usual approach is to create a view.
If your application needs to show the statistics ask the following questions:
Is it really necessary to show this in real time? If so, why? If it really is necessary, then you would have to use triggers to update a table, which is a form of denormalisation (see the Wikipedia article on denormalisation). Triggers will affect write performance on table updates, and the approach relies on the triggers being active.
If it is only necessary for reporting purposes, you could do the calculation in a view or a report.
If it is necessary to support frequent ad-hoc reports you may be into the realms of a data mart and overnight ETL process.
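The view approach can be sketched as follows. This is a minimal illustration, not SQL Server code: it uses SQLite through Python's sqlite3 module as a stand-in, and the view names color_totals and mug_totals, the name columns, and the sample rows are all assumptions made for the example. The same SELECT ... GROUP BY shape should carry over to a T-SQL view.

```python
import sqlite3

# SQLite stands in for SQL Server; view names and sample data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE mug   (mugid   INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE color (colorid INTEGER PRIMARY KEY,
                        mugid   INTEGER REFERENCES mug(mugid), name TEXT);
    CREATE TABLE size  (sizeid  INTEGER PRIMARY KEY,
                        colorid INTEGER REFERENCES color(colorid), name TEXT,
                        sold    INTEGER NOT NULL DEFAULT 0);

    -- color.sold expressed as SUM(size.sold) per colour
    CREATE VIEW color_totals AS
        SELECT c.colorid, c.mugid, COALESCE(SUM(s.sold), 0) AS sold
        FROM color AS c
        LEFT JOIN size AS s ON s.colorid = c.colorid
        GROUP BY c.colorid, c.mugid;

    -- mug.sold expressed as SUM(color.sold) per mug
    CREATE VIEW mug_totals AS
        SELECT m.mugid, COALESCE(SUM(ct.sold), 0) AS sold
        FROM mug AS m
        LEFT JOIN color_totals AS ct ON ct.mugid = m.mugid
        GROUP BY m.mugid;

    INSERT INTO mug VALUES (1, 'classic');
    INSERT INTO color VALUES (10, 1, 'red'), (11, 1, 'blue');
    INSERT INTO size VALUES (100, 10, 'small', 3),
                            (101, 10, 'large', 2),
                            (102, 11, 'small', 4);
""")

# Only size.sold is ever written; the totals are derived on read.
color_sold = dict(conn.execute("SELECT colorid, sold FROM color_totals"))
mug_sold = dict(conn.execute("SELECT mugid, sold FROM mug_totals"))
print(color_sold, mug_sold)
```

Because nothing is stored twice, the totals cannot drift out of sync with size.sold; the cost is that the aggregation runs each time the view is queried.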

Related

Does using calculated fields in Access increase efficiency

I have a database that keeps track of attendance for students in a school. There's one table (SpecificClasses) with dates of all of the classes, and another table (Attendance) with a list of all the students in each class and their attendance on that day.
The school wants to be able to view that data in many different ways and to filter it according to many different parameters. (I won't paste the entire query here because it is quite complicated and the details are not important for my question.) One of their options they want is to view the attendance of a specific student on a certain day of the week. Meaning, they want to be able to notice if a student is missing every Tuesday or something like that.
To make the query be able to do that, I have used DatePart("w",[SpecificClasses]![Day]). However, running this on every class (when we may be talking about hundreds of classes taken by one student in one semester) is quite time-consuming. So I was thinking of storing the day of the week manually in the SpecificClasses table, or perhaps even in the Attendance table to be able to avoid making a join, and just being very careful in my events to keep this data up-to-date (meaning to fill in the info when the secretaries insert a new SpecificClass or fix the Day field).
Then I was wondering whether I could just make a calculated field that would store this value. (The school has Access 2010 so I don't have to worry about compatibility). If I create a calculated field, does Access actually store that field and remember it for the future and not have to recalculate it each time?
As HansUp mentions in his answer, a Calculated field cannot be indexed so it might not give you much of a performance boost. However, since you are using Access 2010 you could create a "real" Integer field named [WeekdayNumber], put an index on it, and then use a Before Change data macro to insert the Weekday() value for you.
(The Weekday() function gives the same result as DatePart("w", ...).)
"I was wondering whether I could just make a calculated field that would store this value."
No, not for a calculated field expression which uses DatePart(). Access supports a limited set of functions for calculated fields, and DatePart() is not one of those.
"If I create a calculated field, does Access actually store that field and remember it for the future and not have to recalculate it each time?"
Doesn't apply to your current case. But for a calculated field which Access would accept, yes, that is the way it works.
However, a calculated field cannot be indexed, which limits how much improvement it can offer in terms of data retrieval speed. If you encounter another situation where you can create a valid calculated field, test the performance to see whether you notice any improvement (vs. calculating the value in a query).
For your DatePart() query problem, consider creating a calendar table with a row for each date and include the weekday number as a separate indexed field. Then you could join the calendar table into your query, avoid the need to compute DatePart() again, and allow Access to use the indexed weekday number to quickly identify which rows match the weekday of interest.
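The calendar-table idea can be sketched like this. It is an illustration under assumptions rather than Access code: SQLite via Python's sqlite3 module stands in for the Access database, and the Calendar table, the ClassID, StudentID and Present columns, and the sample dates are hypothetical. The weekday numbering follows the Access default of 1 = Sunday through 7 = Saturday.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per date, with the weekday number precomputed and indexed.
    CREATE TABLE Calendar (Day TEXT PRIMARY KEY, WeekdayNumber INTEGER NOT NULL);
    CREATE INDEX idx_calendar_weekday ON Calendar (WeekdayNumber);
    CREATE TABLE SpecificClasses (ClassID INTEGER PRIMARY KEY, Day TEXT NOT NULL);
    CREATE TABLE Attendance (ClassID INTEGER, StudentID INTEGER, Present INTEGER);
""")

def access_weekday(d):
    # Same numbering as Weekday() with default settings: 1 = Sunday ... 7 = Saturday.
    return d.isoweekday() % 7 + 1

start = date(2024, 1, 1)  # a Monday
days = [start + timedelta(days=i) for i in range(14)]

conn.executemany("INSERT INTO Calendar VALUES (?, ?)",
                 [(d.isoformat(), access_weekday(d)) for d in days])
conn.executemany("INSERT INTO SpecificClasses VALUES (?, ?)",
                 [(i + 1, d.isoformat()) for i, d in enumerate(days)])
# Hypothetical student 7 is absent on both Tuesdays and present otherwise.
conn.executemany("INSERT INTO Attendance VALUES (?, ?, ?)",
                 [(i + 1, 7, 0 if i in (1, 8) else 1) for i in range(14)])

# The weekday filter is a plain comparison on an indexed column;
# no date function is evaluated per row at query time.
tuesday_absences = conn.execute("""
    SELECT sc.Day
    FROM Attendance AS a
    JOIN SpecificClasses AS sc ON sc.ClassID = a.ClassID
    JOIN Calendar AS cal ON cal.Day = sc.Day
    WHERE a.StudentID = ? AND a.Present = 0 AND cal.WeekdayNumber = ?
    ORDER BY sc.Day
""", (7, 3)).fetchall()
print(tuesday_absences)
```

The calendar table is filled once (for example, for the whole school year), so nobody has to keep a weekday field up to date by hand when a class date is corrected.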

Single Row Table in SQL : Is this a good implementation?

I am new to SQL. I read a bit about how creating a single row table is not really a good practice, but I can't help but find it useful in my case. I am making a web app which balances the workload of employees in the organization. So apart from keeping track of how much work is assigned to every employee and how much work each task requires (there are 2 main task types), I also need to track the overall workload.
So I plan to make a single row table for total workload, with three columns. One for each of the two task types, summed together. And the third for the sum of those 2 totals. I plan to use triggers to update the table in case of addition of a new task or change in its requirements so that it reflects on the total.
Please let me know if I am heading in the right direction. Thanks!
It will work, but it is not extensible: if tomorrow you need to add a 3rd main task type, you will have to alter the table and add another column, which is not ideal. So maybe you can just have a table with two columns for now, task type and load, and you can always calculate the sums with a SQL query.
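A rough sketch of that suggestion, using SQLite through Python's sqlite3 module. The table name task, the column names, and the task type labels are made up for the example; the point is only that per-type and overall totals come from SUM queries, so a new task type needs no schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- One row per task; the task type is data, not a column.
    CREATE TABLE task (
        task_id   INTEGER PRIMARY KEY,
        task_type TEXT    NOT NULL,
        workload  INTEGER NOT NULL
    );
    INSERT INTO task (task_type, workload)
    VALUES ('type_a', 5), ('type_a', 3), ('type_b', 4);
""")

def totals(conn):
    """Return (workload per task type, overall workload), computed on demand."""
    per_type = dict(conn.execute(
        "SELECT task_type, SUM(workload) FROM task GROUP BY task_type"))
    overall = conn.execute("SELECT SUM(workload) FROM task").fetchone()[0]
    return per_type, overall

per_type, overall = totals(conn)

# A third task type is just another row; no ALTER TABLE and no trigger.
conn.execute("INSERT INTO task (task_type, workload) VALUES ('type_c', 2)")
per_type_after, overall_after = totals(conn)
print(per_type, overall, per_type_after, overall_after)
```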

sql same column value for all rows and update it frequently

What I want to achieve:
I have a table denoting ID and Credits of each individual ID.
I want to rate each ID as rate(ID) = Credit(ID) / sum(Credit(ID)), where the sum is over all IDs.
I will be updating the table quite frequently and want to keep the sum(Credit(ID)) handy by creating another column and storing this sum in the table (say sigmaID), which should always have exact same value for all rows.
Whenever I change Credit for an ID (say add 100), I can simply do the same operation on this column value (add 100)
Do I have to update sigmaID for all rows? Will it be efficient?
I would like to periodically check that sigmaID is indeed sum(Credit(ID)) for consistency. Am I overdoing it? Will it be inefficient?
Is there any other approach to this (I am worried about efficiency)?
Kindly provide pure SQL queries as I need to put all of this in an UPDATE trigger which will calculate this rating (and loads of other formulas with other parameters of ID). I may have access to scripting language (PHP/python) but I don't know for sure. Hence the pure SQL request.
Unfortunately my English is very poor.
As I understand it, you want to keep all the information about your column values, past and present?
Do you want to log it? If so, you can create a journal table and use a trigger to log everything you want into it.
best regards,
tato mumladze
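On the "other approach" part of the question: one option is not to store the sum at all and to compute the rating when it is read. The sketch below is an assumption-laden illustration, run against SQLite through Python's sqlite3 module; the table name credits, the view name rates, and the sample values are hypothetical, though the SQL itself is plain enough to adapt to other engines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE credits (id INTEGER PRIMARY KEY, credit REAL NOT NULL);
    INSERT INTO credits VALUES (1, 100), (2, 300), (3, 600);

    -- rate(ID) = Credit(ID) / sum of all credits, derived on read.
    CREATE VIEW rates AS
        SELECT id,
               credit,
               credit / (SELECT SUM(credit) FROM credits) AS rate
        FROM credits;
""")

rates_before = dict(conn.execute("SELECT id, rate FROM rates"))

# Changing one credit touches exactly one row; there is no sigmaID
# column to rewrite on every row and nothing to check for consistency.
conn.execute("UPDATE credits SET credit = credit + 100 WHERE id = 1")
rates_after = dict(conn.execute("SELECT id, rate FROM rates"))
print(rates_before, rates_after)
```

Whether this is fast enough depends on the table size and read frequency, so it is worth measuring before adding a stored total and a trigger.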

Help updating a column using other columns of the same table

Table: Customer with columns Start_Time and End_Time.
I need to add a new column "Duration" that is End_Time - Start_Time.
However, I need to do this using a trigger or procedure so that immediately after a new record is added to Customer table, the column Duration is updated.
If you are using MS SQL, the ideal answer is probably a computed column.
The less data you actually duplicate, the less opportunity for data inconsistency you will have, therefore the less consistency-ensuring/verification code and fewer maintenance processes will result from your schema.
To set this up (again, if using MS SQL), just add another column using the designer, and expand the "Computed Column Specification" area. (You can refer to other columns from this same table for this calculation.) Then enter "End_Time - Start_Time". Depending on what you are going to do with this data, you may want to use something like DATEDIFF(minute, Start_Time, End_Time) for your formula instead. It's exactly what this feature is for.
If it is a very expensive calculation (which yours is probably not, from the information you've given) you could configure the results to be "persisted" - that's very much like a trigger but clearer to implement and maintain.
Alternately, you could create a new View that does the same calculation, and "project" this first table through it whenever getting information. But you probably already knew that, thus this answer was born! :)
p.s. I personally recommend avoiding triggers like the plague. They cause extra operations that are often not expected by a developer, maintainer, or admin. This can cause operations to fail, return unexpected extra result sets, or modify rows that perhaps an admin was specifically trying to avoid modifying during an administrative (read: unsupported grin) fix.
p.p.s. In this case I'd also recommend against a stored procedure, for the same maintenance reason as triggers. Although you could restrict security such that the only way to update the table was through a stored procedure, this can fail for many of the same reasons triggers can fail. Best to avoid duplicating the data if you can.
p.p.p.s :) This is not to say stored procedures are bad as a whole. On complex transactional operations or tightly integrated procedural filtering of large related tables in order to return a comparatively small result set they are still often the best choice.
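The view alternative mentioned above can be sketched as follows. This is not MS SQL: SQLite through Python's sqlite3 module is used as a stand-in so the example can be run anywhere, and the view name CustomerWithDuration, the CustomerID column, and the sample timestamps are assumptions. The Duration expression plays the role that DATEDIFF(minute, Start_Time, End_Time) would play in a T-SQL computed column or view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Customer (
        CustomerID INTEGER PRIMARY KEY,
        Start_Time TEXT NOT NULL,   -- 'YYYY-MM-DD HH:MM:SS'
        End_Time   TEXT NOT NULL
    );

    -- Duration in whole minutes, derived from the two stored columns.
    CREATE VIEW CustomerWithDuration AS
        SELECT CustomerID, Start_Time, End_Time,
               (CAST(strftime('%s', End_Time)   AS INTEGER)
              - CAST(strftime('%s', Start_Time) AS INTEGER)) / 60 AS Duration
        FROM Customer;
""")

conn.execute("INSERT INTO Customer (Start_Time, End_Time) VALUES (?, ?)",
             ("2024-03-01 09:00:00", "2024-03-01 10:30:00"))
duration = conn.execute(
    "SELECT Duration FROM CustomerWithDuration").fetchone()[0]

# Correcting End_Time needs no trigger: the view reflects it immediately.
conn.execute("UPDATE Customer SET End_Time = ? WHERE CustomerID = 1",
             ("2024-03-01 11:00:00",))
duration_after = conn.execute(
    "SELECT Duration FROM CustomerWithDuration").fetchone()[0]
print(duration, duration_after)
```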
As per shannon, though the term in Oracle is a "Virtual Column".
These were an 11g enhancement. Prior to that, use a view (and that is still a potential answer for 11g).
Do not use a trigger or stored procedure.

When are computed columns appropriate?

I'm considering designing a table with a computed column in Microsoft SQL Server 2008. It would be a simple calculation like (ISNULL(colA,(0)) + ISNULL(colB,(0))) - like a total. Our application uses Entity Framework 4.
I'm not completely familiar with computed columns so I'm curious what others have to say about when they are appropriate to be used as opposed to other mechanisms which achieve the same result, such as views, or a computed Entity column.
Are there any reasons why I wouldn't want to use a computed column in a table?
If I do use a computed column, should it be persisted or not? I've read about different performance results using persisted, not persisted, with indexed and non indexed computed columns here. Given that my computation seems simple, I'm inclined to say that it shouldn't be persisted.
In my experience, they're most useful/appropriate when they can be used in other places like an index or a check constraint, which sometimes requires that the column be persisted (physically stored in the table). For further details, see Computed Columns and Creating Indexes on Computed Columns.
If your computed column is not persisted, it will be calculated every time you access it in e.g. a SELECT. If the data it's based on changes frequently, that might be okay.
If the data doesn't change frequently, e.g. if you have a computed column to turn your numeric OrderID INT into a human-readable ORD-0001234 or something like that, then definitely make your computed column persisted - in that case, the value will be computed and physically stored on disk, and any subsequent access to it is like reading any other column on your table - no re-computation over and over again.
We've also come to use (and highly appreciate!) computed columns to extract certain pieces of information from XML columns and surface them on the table as separate (persisted) columns. That makes querying against those items much more efficient than constantly having to poke into the XML with XQuery to retrieve the information. For this use case, I think persisted computed columns are a great way to speed up your queries!
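The persisted versus non-persisted distinction can be tried out with SQLite's generated columns, which are a rough analogue of SQL Server computed columns (VIRTUAL is evaluated on read, STORED is written to disk, similar in spirit to PERSISTED). This is only a sketch under assumptions: it needs SQLite 3.31 or later, and the Orders table, the Total and OrderLabel column names, and the sample row are hypothetical. COALESCE stands in for ISNULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (
        OrderID INTEGER PRIMARY KEY,
        colA    INTEGER,
        colB    INTEGER,
        -- Cheap expression over changing data: evaluated on each read.
        Total INTEGER
            GENERATED ALWAYS AS (COALESCE(colA, 0) + COALESCE(colB, 0)) VIRTUAL,
        -- Rarely changing label: computed on write and stored with the row.
        OrderLabel TEXT
            GENERATED ALWAYS AS ('ORD-' || printf('%07d', OrderID)) STORED
    );
    INSERT INTO Orders (OrderID, colA, colB) VALUES (1234, 10, NULL);
""")

row = conn.execute(
    "SELECT Total, OrderLabel FROM Orders WHERE OrderID = 1234").fetchone()

# Updating a source column is enough; the generated column follows.
conn.execute("UPDATE Orders SET colB = 5 WHERE OrderID = 1234")
total_after = conn.execute(
    "SELECT Total FROM Orders WHERE OrderID = 1234").fetchone()[0]
print(row, total_after)
```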
Let's say you have a computed column called ProspectRanking that is the result of the evaluation of the values in several columns: ReadingLevel, AnnualIncome, Gender, OwnsBoat, HasPurchasedPremiumGasolineRecently.
Let's also say that many decentralized departments in your large mega-corporation use this data, and they all have their own programmers on staff, but you want the ProspectRanking algorithms to be managed centrally by IT at corporate headquarters, who maintain close communication with the VP of Marketing. Let's also say that the algorithm is frequently tweaked to reflect some changing conditions, like the interest rate or the rate of inflation.
You'd want the computation to be part of the back-end database engine and not in the client consumers of the data, if managing the front-end clients would be like herding cats.
If you can avoid herding cats, do so.
Make Sure You Are Querying Only Columns You Need
I have found computed columns to be very useful, even if not persisted, especially in an MVVM model where you are only getting the columns you need for that specific view. So long as you are not putting poorly performing logic in the computed column's code, you should be fine. The bottom line is that computed (non-persisted) columns have to be evaluated anyway if you are using that data.
When it Comes to Performance
For performance, narrow your query to only the rows and computed columns you need. If you put an index on a computed column (SQL Server allows this only when the expression meets certain requirements, such as being deterministic), I would be cautious, because the execution engine might decide to use that index and hurt performance by computing those columns. Most of the time you are just getting a name or description from a join table, so I think this is fine.
Don't Brute Force It
The only time it wouldn't make sense to use a lot of computed columns is if you are using a single view-model class that captures all the data in all columns including those computed. In this case, your performance is going to degrade based on the number of computed columns and number of rows in your database that you are selecting from.
Computed Columns for ORM Works Great.
An object relational mapper such as Entity Framework allows you to query a subset of the columns in your query. This works especially well using LINQ to Entity Framework. By using computed columns you don't have to clutter your ORM class with mapped views for each of the model types.
var data = from e in db.Employees
           select new NarrowEmployeeView { Id = e.Id, Name = e.Name };
Only the Id and Name are queried.
var data = from e in db.Employees
           select new WiderEmployeeView { Id = e.Id, Name = e.Name, DepartmentName = e.DepartmentName };
Assuming DepartmentName is a computed column, the computation is executed only for the latter query.
Performance Profiler
If you use a performance profiler and filter on SQL queries, you can see that the computed columns are in fact ignored when they are not in the select statement.
Computed columns can be appropriate if you plan to query by that information.
For instance, if you have a dataset that you are going to present in the UI, having a computed column will allow you to page the view while still allowing sorting and filtering on the computed column. If that computed column exists in code only, it will be much more difficult to reasonably sort or filter the dataset for display based on that value.
A computed column is a business rule, and it's more appropriate to implement it on the client and not in the storage. A database is for storing and retrieving data, not for business rule processing. The fact that it can do something doesn't mean you should do it that way. You are also free to jump from the Eiffel Tower, but it would be a bad decision :)