We have a client table with a DateOfBirth field.
I'm new to MS Analysis Services, OLAP, and data cubes. I'm trying to report on client metrics by age category (18-25, 26-35, 35-50, 50-65, 66+), but I don't see a way to accomplish this. (Note: I'm not concerned with age at the time of a sale; I'm interested in the age distribution of my current active customers.)
You can create either a T-SQL named query or a Named Calculation in the Data Source View that calculates CurrentAge from the DOB field.
You will likely also want another similarly derived field that assigns the CurrentAge value to one of your age buckets; this is a simple T-SQL CASE expression.
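For example, as a named query (or a view the DSV points at), the two derived fields could look roughly like this; dbo.Client, ClientID and the exact bucket boundaries are assumptions, so adjust them to your categories:

    SELECT ClientID,
           CurrentAge,
           CASE WHEN CurrentAge BETWEEN 18 AND 25 THEN '18-25'
                WHEN CurrentAge BETWEEN 26 AND 35 THEN '26-35'
                WHEN CurrentAge BETWEEN 36 AND 50 THEN '36-50'
                WHEN CurrentAge BETWEEN 51 AND 65 THEN '51-65'
                WHEN CurrentAge >= 66             THEN '66+'
           END AS AgeBucket
    FROM (SELECT ClientID,
                 -- age in whole years as of today
                 DATEDIFF(YEAR, DateOfBirth, GETDATE())
                   - CASE WHEN DATEADD(YEAR, DATEDIFF(YEAR, DateOfBirth, GETDATE()), DateOfBirth) > GETDATE()
                          THEN 1
                          ELSE 0
                     END AS CurrentAge
          FROM dbo.Client) AS c;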
Depending on how large the client table is (and the analytical purpose), you may want to make this into a fact table or at least use snowflaking to separate this from the other relatively static attribute fields in the client table.
I have a pretty simple table with an ID, a Date, and 20 value columns. Each column and each row can hold a different type of data, with a different unit of measure, and the ID column defines each field's unit of measure. So basically the ID field identifies the meaning of each field. Naturally I have an explanatory table that holds these definitions by ID.
The table holds sensor data, and these sensors are inserting thousands of rows of data each second (each TYPE of sensor has its own ID).
My problem is: how do I aggregate this kind of table? Each type of measurement requires a different aggregation (some measurements I need to average, others to sum, or take the min or max of, etc.).
I think the perfect solution would be something like an explanatory table keyed by ID that defines, for each field of that ID, how it should be aggregated, with the aggregation command (somehow... magically...) driven dynamically by this table...
Do you have any suggestions for how I can accomplish that? Or is it even possible to make the aggregation function dynamic based on a condition (in this case the explanatory table's value)?
Are you sure SQL is the right tool for the job? It sounds to me like a columnar database, or another type of NoSQL store, would fit better.
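If the data does stay in SQL, though, one way to approximate the "aggregation driven by the explanatory table" idea is conditional aggregation; a rough sketch, with table and column names invented for illustration:

    SELECT d.ID,
           CASE def.AggFunc
               WHEN 'AVG' THEN AVG(d.Value1)
               WHEN 'SUM' THEN SUM(d.Value1)
               WHEN 'MIN' THEN MIN(d.Value1)
               WHEN 'MAX' THEN MAX(d.Value1)
           END AS Value1Agg
    FROM SensorData AS d
    INNER JOIN SensorDef AS def ON def.ID = d.ID   -- SensorDef stores the aggregation rule per ID
    GROUP BY d.ID, def.AggFunc;

The same CASE block would be repeated (or generated dynamically) for each of the 20 value columns.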
I have a database that keeps track of attendance for students in a school. There's one table (SpecificClasses) with dates of all of the classes, and another table (Attendance) with a list of all the students in each class and their attendance on that day.
The school wants to be able to view that data in many different ways and to filter it according to many different parameters. (I won't paste the entire query here because it is quite complicated and the details are not important for my question.) One of the options they want is to view the attendance of a specific student on a certain day of the week. Meaning, they want to be able to notice if a student is missing every Tuesday, or something like that.
To make the query be able to do that, I have used DatePart("w",[SpecificClasses]![Day]). However, running this on every class (when we may be talking about hundreds of classes taken by one student in one semester) is quite time-consuming. So I was thinking of storing the day of the week manually in the SpecificClasses table, or perhaps even in the Attendance table to be able to avoid making a join, and just being very careful in my events to keep this data up-to-date (meaning to fill in the info when the secretaries insert a new SpecificClass or fix the Day field).
Then I was wondering whether I could just make a calculated field that would store this value. (The school has Access 2010 so I don't have to worry about compatibility). If I create a calculated field, does Access actually store that field and remember it for the future and not have to recalculate it each time?
As HansUp mentions in his answer, a calculated field cannot be indexed, so it might not give you much of a performance boost. However, since you are using Access 2010 you could create a "real" Integer field named [WeekdayNumber], put an index on it, and then use a Before Change data macro with a single SetField action that sets [WeekdayNumber] to Weekday([Day]) for you.
(The Weekday() function gives the same result as DatePart("w", ...).)
I was wondering whether I could just make a calculated field that would store this value.
No, not for a calculated field expression which uses DatePart(). Access supports a limited set of functions for calculated fields, and DatePart() is not one of those.
If I create a calculated field, does Access actually store that field and remember it for the future and not have to recalculate it each time?
Doesn't apply to your current case. But for a calculated field which Access would accept, yes, that is the way it works.
However, a calculated field cannot be indexed, so that limits how much improvement it can offer in terms of data retrieval speed. If you encounter another situation where you can create a valid calculated field, test the performance to see whether you notice any improvement (vs. calculating the value in a query).
For your DatePart() query problem, consider creating a calendar table with a row for each date and include the weekday number as a separate indexed field. Then you could join the calendar table into your query, avoid the need to compute DatePart() again, and allow Access to use the indexed weekday number to quickly identify which rows match the weekday of interest.
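A sketch of how that join might look, assuming a Calendar table with one row per date and an indexed WeekdayNumber field, and with StudentID/ClassID as made-up join fields:

    SELECT a.StudentID, sc.[Day]
    FROM (Attendance AS a
    INNER JOIN SpecificClasses AS sc ON sc.ClassID = a.ClassID)
    INNER JOIN Calendar AS cal ON cal.CalendarDate = sc.[Day]
    WHERE cal.WeekdayNumber = 3;   -- 3 = Tuesday when Sunday is counted as day 1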
In my database, I have a table that has to get info from two adjacent rows from another table.
Allow me to demonstrate. There's a bill that calculates the difference between two adjacent meter values and calculates the cost accordingly (i.e., I have a water meter and if I want to calculate the amount I should pay in December, I take the value I measured in November and subtract it from the December one).
My question is, how to implement the references the best way? I was thinking about:
Making each meter value an entity on its own. The bill will then have two foreign keys, one for each meter value. That way I can include other useful data, like measurement date and so on. However, implementing and validating adjacency becomes icky.
Making a pair of meter values an entity (or a meter value and a diff). The bill will reference that pair. However, that leads to data duplication.
Is there a better way? Thank you very much.
First, there is no such thing as "adjacent" rows in a relational database. Tables represent unordered sets. If you have a concept of ordering, it needs to be implemented using data in the rows. Let me assume that you have some sort of "id" or "creation date" that specifies the ordering.
Because you don't specify the database, I'll assume you are using one that supports the ANSI-standard window functions. In that case, you can get what you want using the LAG() function. The syntax to get the previous meter reading is something like:
select lag(value) over (partition by meterid order by readdatetime)
There is no need for data duplication or some arcane data structure. LAG() should also be able to take advantage of appropriate indexes.
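For example, a rough sketch of the bill calculation, assuming a readings table with meterid, readdatetime and value columns (the 2.50 unit price is made up):

    SELECT meterid,
           readdatetime,
           value - LAG(value) OVER (PARTITION BY meterid ORDER BY readdatetime) AS units_used,
           (value - LAG(value) OVER (PARTITION BY meterid ORDER BY readdatetime)) * 2.50 AS amount_due
    FROM readings;
    -- the first reading per meter has no predecessor, so LAG() returns NULL for that row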
Is there any database server that offers the possibility to do a global projection of the entire database? For example, suppose that we have 30 tables that each have a 'Year' column and the database has data for the last 5 years, and we are interested in one year of data at a time. Is there any way to do a global projection so we can have a view of the database that includes only one year of data at a time?
If you really must not alter existing code to make it show only the current year, then try making a view for every table that shows only the 'current year' rows; if you want anything other than the current year, you can still query the source table. You rename the table and give the view the name the table had (though this is generally a sloppy practice).
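In SQL Server syntax that swap looks roughly like this, with dbo.Orders standing in for one of your real tables:

    -- keep the full history under a new name
    EXEC sp_rename 'dbo.Orders', 'Orders_AllYears';
    GO

    -- expose the old name as a one-year window
    CREATE VIEW dbo.Orders
    AS
    SELECT *
    FROM dbo.Orders_AllYears
    WHERE [Year] = YEAR(GETDATE());
    GO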
Otherwise you're going to have to use a WHERE clause in all your queries.
Realistically, this is something your ORM should be dealing with, not your RDBMS, unless you're running raw SQL queries in your code (in which case see the start of my answer for the VIEW option).
A UNION query with a WHERE clause to filter by a year date range should solve what you are describing.
All the major RDBMS support this functionality.
If the tables all have the same schema then it's easy; if not, you will probably have to introduce 'dummy' columns for some portions of the UNION.
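For instance, with made-up table and column names, and a dummy column where one schema lacks a field:

    SELECT 'Sales' AS source_table, SaleID AS id, [Year], Amount
    FROM Sales
    WHERE [Year] = 2014
    UNION ALL
    SELECT 'Invoices' AS source_table, InvoiceID AS id, [Year], NULL AS Amount   -- dummy column to line the schemas up
    FROM Invoices
    WHERE [Year] = 2014;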
Sorry for the long question title.
I guess I'm on to a loser on this one but on the off chance.
Is it possible to make the calculation of a calculated field in a table the result of an aggregate function applied to a field in another table?
i.e.
You have a table called 'mug'; this has a child called 'color' (which makes my UK head hurt, but the vendor is from the US, so what are you going to do?) and this, in turn, has a child called 'size'. Each table has a field called sold.
The size.sold increments by 1 for every mug of a particular colour and size sold.
You want color.sold to be an aggregate of SUM size.sold WHERE size.colorid = color.colorid
You want mug.sold to be an aggregate of SUM color.sold WHERE color.mugid = mug.mugid
Is there any way to make mug.sold and color.sold just work themselves out, or am I going to have to go mucking about with triggers?
You can't have a computed column directly reference a different table, but you can have it reference a user-defined function. Here's a link to an example of implementing a solution like this:
http://www.sqlservercentral.com/articles/User-Defined+functions/complexcomputedcolumns/2397/
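In outline (SQL Server syntax, with sold_total used so it doesn't clash with your existing sold column), that approach looks something like this; note the computed column can't be persisted or indexed because the function reads another table:

    -- scalar function that aggregates the child table
    CREATE FUNCTION dbo.ColorSoldTotal (@colorid INT)
    RETURNS INT
    AS
    BEGIN
        RETURN (SELECT SUM(s.sold) FROM dbo.[size] AS s WHERE s.colorid = @colorid);
    END;
    GO

    -- computed column on the parent that calls the function
    ALTER TABLE dbo.color ADD sold_total AS dbo.ColorSoldTotal(colorid);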
No, it is not possible to do this. A computed column can only be derived from the values of other fields in the same row. To calculate an aggregate over another table you need to create a view.
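For example, a view along these lines (colorid, mugid and sold taken from the question, the view name invented) gives the per-colour totals on demand; grouping its output by mugid gives mug-level totals the same way:

    CREATE VIEW dbo.ColorSoldTotals
    AS
    SELECT c.mugid, c.colorid, SUM(s.sold) AS sold_total
    FROM dbo.color AS c
    INNER JOIN dbo.[size] AS s ON s.colorid = c.colorid
    GROUP BY c.mugid, c.colorid;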
If your application needs to show the statistics, ask the following questions:
Is it really necessary to show this in real time? If so, why? If it really is necessary, then you would have to use triggers to maintain a denormalised total in a table (see the Wikipedia article on denormalisation). Triggers will affect write performance on table updates, and this relies on the triggers staying active.
If it is only necessary for reporting purposes, you could do the calculation in a view or a report.
If it is necessary to support frequent ad-hoc reports, you may be into the realms of a data mart and an overnight ETL process.