Dynamic subtraction of crosstab columns in MS Access - sql

I'm working in Access with crosstab columns that are generated dynamically from the reports received (e.g. for the parameter year: 2009, 2010, 2011). I'd like to calculate the difference between each pair of consecutive years for all generated columns (e.g. 2010-2009, 2011-2010, …). Is there any way to do it?
E.g.:
Default crosstab:
   2009 2010 2011
A  30   20   5
B  30   20   5
Expected output:
   2009 2010 2011 2010-2009 2011-2010
A  30   20   5    -10       -15
B  30   20   5    -10       -15
I can easily create calculated fields and do this manually, but the data changes and additional years arrive in it, so I want to make this dynamic.
Is there a way to do this in MS Access?
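The calculation being asked for can be sketched as follows. This is only an illustration of the logic in Python, not an Access solution; in Access the same loop would presumably live in VBA or in dynamically generated SQL. The function name `add_year_differences` and the dict layout are hypothetical, and treating a missing year as 0 is an assumption.

```python
def add_year_differences(rows):
    """Append later-minus-earlier differences for each consecutive pair of years.

    `rows` maps a row label to a {year: value} dict, mirroring the crosstab.
    """
    # Collect every year that appears, so newly received years are picked up.
    years = sorted({year for values in rows.values() for year in values})
    result = {}
    for label, values in rows.items():
        extended = dict(values)
        for earlier, later in zip(years, years[1:]):
            # Assumption: a year missing for this row counts as 0.
            extended[f"{later}-{earlier}"] = values.get(later, 0) - values.get(earlier, 0)
        result[label] = extended
    return result


crosstab = {
    "A": {2009: 30, 2010: 20, 2011: 5},
    "B": {2009: 30, 2010: 20, 2011: 5},
}
print(add_year_differences(crosstab)["A"])
```

Because the list of years is derived from the data, a 2012 column would add a 2012-2011 difference without any change to the code.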

Related

How to redistribute outliers over the previous time period?

Imagine a dataframe that looks like this:
1
2
3
4
5
6
7
50
16
17
Normally we would apply an algorithm from "Detect and exclude outliers in a pandas DataFrame" to remove the 50 entirely. However, my particular dataset instead requires me to distribute the value of the 50 over the previous 7 days:
8
9
10
11
12
13
14
15
16
17
How can I make this work in Pandas? I can detect the outliers pretty easily, but I'm not sure how to spread the values out into the previous days. Note that a simple moving average doesn't work well for this type of data, as there would still be a jump in the average value when the 50 shows up. What I need to do is smooth the 50 out into the previous days so that no jump is visible.
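One possible reading of "spread the value over the previous days" is sketched below. It uses plain Python lists rather than pandas to stay self-contained; the function name `spread_outlier` and the `baseline` parameter (the value the outlier should be reduced to) are assumptions, not something the question specifies. The excess is split evenly, which keeps the total unchanged but does not reproduce the exact ramp shown in the question.

```python
def spread_outlier(values, index, window, baseline):
    """Return a copy of `values` with the excess at `index` moved backwards.

    The outlier is reduced to `baseline`, and the removed amount is split
    evenly over up to `window` preceding entries, so the sum is preserved.
    """
    result = [float(v) for v in values]
    start = max(0, index - window)
    count = index - start
    if count == 0:
        return result  # nothing before the outlier to spread onto
    excess = result[index] - baseline
    result[index] = float(baseline)
    for i in range(start, index):
        result[i] += excess / count
    return result


data = [1, 2, 3, 4, 5, 6, 7, 50, 16, 17]
smoothed = spread_outlier(data, index=7, window=7, baseline=15)
print(smoothed)
```

With a pandas Series, the same function could be applied to `series.tolist()` and the result wrapped back into a Series; how to choose `baseline` (for example by interpolating from neighbouring values) is left open here.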

Pentaho PRD - Summing Group'ed Values w/o Using SQL (Object-Based Datasource)

In Pentaho's PRD, I am working with an object datasource (i.e. I do not have a SQL query I may edit to group the data). To realize the required report, I must group the data within the PRD (OK) and only show these grouped values (OK). How can I sum the group values in the group headers to generate totals (MY PROBLEM) when there are multiple records per group? Here is a simplified example:
Assume the dataset I provide to the PRD is:
X 42 1
X 42 2
X 42 3
Y 10 12
Y 10 7
Z 8 22
Z 8 92
So, I need to display groups based upon columns 1 and 2 only.
Column 3 is excluded, but I can't remove it from the dataset.
Then, I must provide a total for the 2nd column, as follows:
X 42
Y 10
Z 8
---------
Total 60
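The arithmetic being asked for, counting each group's value once regardless of how many detail records it has, can be sketched as below. This is Python used purely to pin down the expected result; it says nothing about how to configure PRD, and the function name `total_of_group_values` is hypothetical.

```python
def total_of_group_values(records):
    """Sum the second column once per distinct (column 1, column 2) group."""
    seen = set()
    total = 0
    for name, value, _detail in records:
        # Only the first record of each group contributes to the total.
        if (name, value) not in seen:
            seen.add((name, value))
            total += value
    return total


records = [
    ("X", 42, 1), ("X", 42, 2), ("X", 42, 3),
    ("Y", 10, 12), ("Y", 10, 7),
    ("Z", 8, 22), ("Z", 8, 92),
]
print(total_of_group_values(records))  # 60
```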

Access 2010: data rows as column headings?

I'm using MS Access 2013.
I need to display AND EDIT a grid of data based on three tables:
UnitID UnitName
1 Unit1
2 Unit2
3 Unit3
ProdID ProdName
1 Furniture
2 Food
3 Other
UnitID ProdID Forecast
1 1 10
1 2 20
1 3 30
2 1 40
2 2 50
2 3 60
3 1 70
3 2 80
3 3 90
so it looks like:
          Unit1  Unit2  Unit3
Furniture 10     40     70
Food      20     50     80
Other     30     60     90
Furthermore, the query must be editable (user should be able to enter his forecast data).
Any idea how to do this in Access 2010? I've looked into pivots and crosstab queries, but they use aggregate functions and thus aren't editable. In my case, though, the source of each value is unambiguous, so an editable option should exist. Does anyone have an idea how to get the data into an editable format?
Thanks!
Jur.
Create a temp table and fill it with the data from your crosstab query. Use that table as the record source for a form, which will be editable. In the form's BeforeUpdate event, add code to update the original source table.
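The round trip this answer describes (long table to editable grid, then edited grid back to long rows) can be sketched as follows. This is a Python illustration of the data flow only; in Access the write-back would be VBA in the form's event handler. The function names `pivot` and `unpivot` and the dict-based grid are hypothetical.

```python
def pivot(forecasts, units, products):
    """Build the editable grid: one row per product, one column per unit."""
    grid = {
        prod_name: {unit_name: None for unit_name in units.values()}
        for prod_name in products.values()
    }
    for unit_id, prod_id, forecast in forecasts:
        grid[products[prod_id]][units[unit_id]] = forecast
    return grid


def unpivot(grid, units, products):
    """Turn the (possibly edited) grid back into (UnitID, ProdID, Forecast) rows."""
    unit_ids = {name: uid for uid, name in units.items()}
    prod_ids = {name: pid for pid, name in products.items()}
    rows = []
    for prod_name, cells in grid.items():
        for unit_name, forecast in cells.items():
            if forecast is not None:
                rows.append((unit_ids[unit_name], prod_ids[prod_name], forecast))
    return sorted(rows)


units = {1: "Unit1", 2: "Unit2", 3: "Unit3"}
products = {1: "Furniture", 2: "Food", 3: "Other"}
forecasts = [(1, 1, 10), (1, 2, 20), (1, 3, 30),
             (2, 1, 40), (2, 2, 50), (2, 3, 60),
             (3, 1, 70), (3, 2, 80), (3, 3, 90)]

grid = pivot(forecasts, units, products)
grid["Food"]["Unit2"] = 55  # the user edits one cell
updated = unpivot(grid, units, products)
```

Because each grid cell maps to exactly one (UnitID, ProdID) pair, the write-back is unambiguous, which is the property the question relies on.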
Thanks all,
Distributing any kind of exe is not an option due to security measures in the client's environment (they can run Office and little else). So I'm going for the temp table option anyway. Any pointers to a template solution that I could modify to my needs?
Thanks again!
Jur.

PowerPivot formula for row wise weighted average

I have a table in PowerPivot that contains the logged data of a traffic control camera mounted on a road. The table holds the velocity and the number of vehicles that pass the camera during a specific time span (e.g. 14:10 - 15:25). How can I get the average velocity of cars for a specific hour and list the results in a separate table with 24 rows (hours 0 - 23), where the second column of each row is the weighted average velocity for that hour? A sample of my stat_table data is given below:
count vel hour
----- --- ----
133 96.00237 15
117 91.45705 21
81 81.90521 6
2 84.29946 21
4 77.7841 18
1 140.8766 17
2 56.14951 14
6 71.72839 13
4 64.14309 9
1 60.949 17
1 77.00728 21
133 100.3956 6
109 100.8567 15
54 86.6369 9
1 83.96901 17
10 114.6556 21
6 85.39127 18
1 76.77993 15
3 113.3561 2
3 94.48055 2
In a separate PowerPivot table I have 24 rows and 2 columns, but when I enter my formula, every row is updated with the same number. My formula is:
=sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count] * stat_table[vel])/sumX(FILTER(stat_table, stat_table[hour]=[hour]), stat_table[count])
Create a new calculated column named "WeightedVelocity" as follows:
WeightedVelocity = [count]*[vel]
Create a measure "WeightedAverage" as follows:
WeightedAverage = sum(stat_table[WeightedVelocity]) / sum(stat_table[count])
Use the measure "WeightedAverage" in the VALUES area of a pivot table and the "hour" column in ROWS to get the desired result.
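To check the numbers the measure should produce, the same per-hour calculation, sum(count * vel) / sum(count), can be sketched in Python. The function name `weighted_average_by_hour` is hypothetical, and the sketch assumes every hour that appears has a positive total count.

```python
from collections import defaultdict


def weighted_average_by_hour(rows):
    """Return {hour: sum(count * vel) / sum(count)} for (count, vel, hour) rows."""
    weighted_sum = defaultdict(float)
    total_count = defaultdict(int)
    for count, vel, hour in rows:
        weighted_sum[hour] += count * vel
        total_count[hour] += count
    return {hour: weighted_sum[hour] / total_count[hour] for hour in sorted(total_count)}


# A few rows taken from the sample data in the question.
sample = [
    (3, 113.3561, 2),
    (3, 94.48055, 2),
    (4, 77.7841, 18),
    (6, 85.39127, 18),
]
averages = weighted_average_by_hour(sample)
for hour, value in averages.items():
    print(hour, round(value, 4))
```

Hours with no logged rows simply do not appear in the result, which matches a pivot table that only lists hours present in the data.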

Extend observations for all years in sequence

I have 2 sets.
The first one is big (~1000k rows). It contains patient observation data grouped by observation year, from, let's say, 2000 to 2005. In this set some patients have observations for all years (that is, for each year in sequence), and some have observations for, say, 2002-2003 only.
The second set contains only sequence of years from 2000 till 2005, 6 rows.
What I want is a table with the data from set 1 for each patient, extended so that each patient has a row for every year in set 2. If there was no observation for a particular year in set 1, a row should be added that is empty (or better, contains "-") in the data column only.
For example set 1 could be:
patient_id | obs_year | data
a 2000 10
a 2001 12
a 2002 13
a 2003 9
a 2004 1
a 2005 6
bb 2002 100
bb 2003 110
Set 2 is like:
year |
2000
2001
2002
2003
2004
2005
So what I want in result ideally would be like this:
patient_id | obs_year | data
a 2000 10
a 2001 12
a 2002 13
a 2003 9
a 2004 1
a 2005 6
bb 2000 -
bb 2001 -
bb 2002 100
bb 2003 110
bb 2004 -
bb 2005 -
I should also mention that I do this job in SAS, so SQL query or SAS script solutions (or both) are welcome.
Deduplicate your patient_id values from set 1 in a sort. Merge this onto set 2 to pair every patient_id with every year, then merge the result back onto set 1 by patient_id and year to give your output. Wherever a patient_id and year combination has no match, data will be blank, as in your desired output.
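The steps in this answer (distinct patients, every patient paired with every year, then a lookup back into set 1) can be sketched as below. It is written in Python only to show the logic, not as SAS code; the function name `extend_observations` and the `missing` placeholder argument are hypothetical.

```python
def extend_observations(observations, years, missing="-"):
    """Return one (patient_id, year, data) row for every patient and every year."""
    # Index set 1 by (patient_id, obs_year) for the final lookup.
    lookup = {(pid, year): data for pid, year, data in observations}
    # Deduplicate patient ids, keeping first-seen order.
    patients = list(dict.fromkeys(pid for pid, _year, _data in observations))
    # Pair every patient with every year; fill gaps with the placeholder.
    return [
        (pid, year, lookup.get((pid, year), missing))
        for pid in patients
        for year in years
    ]


set1 = [
    ("a", 2000, 10), ("a", 2001, 12), ("a", 2002, 13),
    ("a", 2003, 9), ("a", 2004, 1), ("a", 2005, 6),
    ("bb", 2002, 100), ("bb", 2003, 110),
]
set2 = [2000, 2001, 2002, 2003, 2004, 2005]

for row in extend_observations(set1, set2):
    print(*row)
```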
Another option is PROC FREQ with sparse, which produces a line for every possible combination whether they appear or not. This works if you don't have any legitimate zeroes in the data; if you do and care that they're different from missing, this won't work.
proc freq data=have noprint;
weight data;
tables patient_id*obs_year/missing sparse out=want(rename=count=data keep=count patient_id obs_year);
run;
Then you need to convert 0 back to missing, if you care about the difference (presumably in the next step, if there is one).
A similar approach that is closer to the desired results is proc tabulate with printmiss, which works similarly to sparse:
proc tabulate data=have out=want(keep=patient_id obs_year data_sum rename=data_sum=data);
class patient_id obs_year;
var data;
tables patient_id,obs_year*data*sum='data'/printmiss misstext='.';
run;
That actually does get you missing values properly.