Can Dimple.js aggregate and group data - dimple.js

I have data that looks like this (this is fake data presented here):
physician|patient|location|datetime|condition|treatment|billed|collected
Deanna Smith|Marko Cruise|Tampa|20140104|diabetic retinopathy|eylea|1800|1706
Jenna Lewis|Jenna Han|Jacksonville|20150320|macular deg.|eylea|1800|1726
George Cruise|Lisa Washington|Orlando|20140509|diabetic retinopathy|ajurdec|800|740
Lisa Kozaczinski|Lana Brown|Tampa|20151012|macular deg.|avastin|400|275
Lisa Smith|Joanna Cruise|Tampa|20140921|macular deg.|lucentis|1200|1061
Mike Taylor|Lana Smith|Orlando|20150322|diabetic retinopathy|ajurdec|800|676
Joanna Taylor|Lisa Washington|Jacksonville|20140121|macular deg.|lucentis|1200|1145
Jenna Taylor|George Taylor|Tampa|20150119|macular deg.|eylea|1800|1741
Jenna Washington|George Smith|Orlando|20150402|macular deg.|eylea|1800|1659
Mark Cheng|Lisa Taylor|Pensacola|20150418|macular deg.|eylea|1800|1679
Lisa Fox|Mike Hajdukovic|Orlando|20140423|macular deg.|ajurdec|800|693
Silvia Fox|Jenna Smith|Jacksonville|20151104|macular deg.|avastin|400|289
Mike Washington|John Hajdukovic|Tampa|20151005|macular deg.|avastin|400|302
I would like to present a line chart histogram of collections grouped by month or week.
Can Dimple.js do this, or do I need to pre-aggregate/group the data before I chart it? The examples on the web seem to show data that is already aggregated.
Do I need to maybe re-format the dates?
Please advise.

Take a look at dimple.axis.timeField, dimple.axis.dateParseFormat, timePeriod, and timeFormat:
https://github.com/PMSI-AlignAlytics/dimple/wiki/dimple.axis#timeField
Dimple aggregates by default based on your series and axis setup.
The default aggregate method is sum, but other methods (average, max, min, custom) can also be used.
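If you do end up pre-aggregating before charting, the grouping itself is small. The following is a minimal sketch in plain JavaScript, not dimple's API: it takes a few rows from the question, derives a YYYY-MM key from the datetime field, and sums collected per month. The helper names parseRows and sumCollectedByMonth are made up for illustration.

```javascript
// A few sample rows copied from the question (pipe-delimited, header first).
const raw = `physician|patient|location|datetime|condition|treatment|billed|collected
Deanna Smith|Marko Cruise|Tampa|20140104|diabetic retinopathy|eylea|1800|1706
Joanna Taylor|Lisa Washington|Jacksonville|20140121|macular deg.|lucentis|1200|1145
Lisa Fox|Mike Hajdukovic|Orlando|20140423|macular deg.|ajurdec|800|693
Jenna Taylor|George Taylor|Tampa|20150119|macular deg.|eylea|1800|1741`;

// Hypothetical helper: turn the delimited text into an array of objects.
function parseRows(text) {
  const [header, ...lines] = text.trim().split("\n");
  const keys = header.split("|");
  return lines.map((line) => {
    const cells = line.split("|");
    return Object.fromEntries(keys.map((k, i) => [k, cells[i]]));
  });
}

// Hypothetical helper: sum `collected` per calendar month (key "YYYY-MM").
function sumCollectedByMonth(rows) {
  const totals = new Map();
  for (const row of rows) {
    // datetime is a YYYYMMDD string, so the month key is a simple slice.
    const month = `${row.datetime.slice(0, 4)}-${row.datetime.slice(4, 6)}`;
    totals.set(month, (totals.get(month) || 0) + Number(row.collected));
  }
  return [...totals.entries()]
    .sort(([a], [b]) => a.localeCompare(b))
    .map(([month, collected]) => ({ month, collected }));
}

const monthly = sumCollectedByMonth(parseRows(raw));
console.log(monthly);
```

The resulting array has one object per month, which is the shape the pre-aggregated examples on the web tend to use.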

Related

How to merge crosstab info down in Access?

Not sure if this is possible but I'm hoping it is. I am using MS Access for Estate Planning for work. I've gotten to the point where I've got the data to look like this:
File_Name|Executor_1|Executor_2|Beneficiary_1|Beneficiary_2
Hill, Hank|Peggy Hill||Peggy Hill|
Hill, Hank||Bobby Hill||Bobby Hill
Gribble, Dale|Nancy Gribble|||
Gribble, Dale||Joseph Gribble|Joseph Gribble|
Gribble, Dale||||John Redcorn
But I need it to look like this:
File_Name|Executor_1|Executor_2|Beneficiary_1|Beneficiary_2
Hill, Hank|Peggy Hill|Bobby Hill|Peggy Hill|Bobby Hill
Gribble, Dale|Nancy Gribble|Joseph Gribble|Joseph Gribble|John Redcorn
I need it in the latter format so I can use Mail Merge in Word to create the Will. Can anyone provide any guidance? We don't currently use any software for Est. Planning, so anything beats having to go into Word and retype everything manually. Please let me know if more information is needed.
Edit:
This is what the SQL looks like:
TRANSFORM Last(File_Roles.File_Name) AS LastOfFile_Name
SELECT File_Roles.Executor_1,
File_Roles.Executor_2,
File_Roles.Beneficiary_1,
File_Roles.Beneficiary_2,
File_Roles.Trustee_1,
File_Roles.Trustee_2,
File_Roles.Guardian_1,
File_Roles.Guardian_2,
File_Roles.ATTY_IF_1,
File_Roles.ATTY_IF_2,
File_Roles.HCATTY_IF_1,
File_Roles.HCATTY_IF_2
FROM File_Roles
GROUP BY File_Roles.Executor_1,
File_Roles.Executor_2,
File_Roles.Beneficiary_1,
File_Roles.Beneficiary_2,
File_Roles.Trustee_1,
File_Roles.Trustee_2,
File_Roles.Guardian_1,
File_Roles.Guardian_2,
File_Roles.ATTY_IF_1,
File_Roles.ATTY_IF_2,
File_Roles.HCATTY_IF_1,
File_Roles.HCATTY_IF_2
PIVOT File_Roles.File_Name;
You can use GROUP BY and MAX()
SELECT
t.File_Name,
MAX(t.Executor_1) As Executor_1,
MAX(t.Executor_2) As Executor_2,
MAX(t.Beneficiary_1) As Beneficiary_1,
MAX(t.Beneficiary_2) As Beneficiary_2
FROM table_or_query t
GROUP BY File_Name
But maybe you can fix your original crosstab query to do this right away. You are probably doing the grouping wrong. You must group by File_Name in the crosstab query and apply Max in the Total row of the value (though it is difficult to say without seeing the query).
GROUP BY File_Name means that one row is created for each distinct value of File_Name.
Since this will merge several rows into one, you must specify an aggregate function for every column in the SELECT list that is not listed in the GROUP BY clause, e.g. SUM(), AVG(), MIN() or MAX(). See SQL Aggregate Functions for a complete list. Since aggregate functions ignore Null values, MAX() returns the one non-Null value from the merged rows.
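To see why this merges the rows, here is a rough sketch of the same GROUP BY / MAX logic in plain JavaScript (not Access SQL). The function name groupByMax is hypothetical, and which cells are empty in the sample rows is an assumption based on the desired output in the question.

```javascript
// Rows shaped like the crosstab output in the question.
// null marks an empty cell; which cells are empty is an assumption.
const crosstabRows = [
  { File_Name: "Hill, Hank", Executor_1: "Peggy Hill", Executor_2: null, Beneficiary_1: "Peggy Hill", Beneficiary_2: null },
  { File_Name: "Hill, Hank", Executor_1: null, Executor_2: "Bobby Hill", Beneficiary_1: null, Beneficiary_2: "Bobby Hill" },
  { File_Name: "Gribble, Dale", Executor_1: "Nancy Gribble", Executor_2: null, Beneficiary_1: null, Beneficiary_2: null },
  { File_Name: "Gribble, Dale", Executor_1: null, Executor_2: "Joseph Gribble", Beneficiary_1: "Joseph Gribble", Beneficiary_2: null },
  { File_Name: "Gribble, Dale", Executor_1: null, Executor_2: null, Beneficiary_1: null, Beneficiary_2: "John Redcorn" },
];

// Hypothetical helper: one output row per distinct key, MAX over every other column.
function groupByMax(rows, keyField) {
  const groups = new Map();
  for (const row of rows) {
    const key = row[keyField];
    if (!groups.has(key)) {
      // Start each group with every column empty.
      const empty = Object.fromEntries(Object.keys(row).map((col) => [col, null]));
      groups.set(key, { ...empty, [keyField]: key });
    }
    const merged = groups.get(key);
    for (const [col, value] of Object.entries(row)) {
      if (col === keyField || value === null) continue; // empty cells are skipped
      if (merged[col] === null || value > merged[col]) merged[col] = value;
    }
  }
  return [...groups.values()];
}

const mergedRows = groupByMax(crosstabRows, "File_Name");
console.log(mergedRows);
```

Because each column holds at most one non-empty value per File_Name here, MAX simply picks that value; with several different values per column it would keep only the largest.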

Tricks to exceed column limitations in SQL Database

Hello swarm intelligence!
I have the following use case: for every movie that is requested by a user, I create a number of tags for that specific movie, derived from several sources (actors, plot, etc.).
I will use this data for association mining.
The problem: if I use the movies for rows and the tags for columns, the tags will easily exceed the technical limit of 3000 columns (there are even more actors, and then plot keywords, etc.).
Is there any way I can organize this data and then use it for (quick) association mining?
Thanks a lot
Don't put tags in columns. Instead create a separate table, named something like movie_tags with two columns, movie_id and tag. Put each tag in a separate row of that table.
This is known as "normalizing" your data. Here's a nice walkthrough with an example very similar to yours.
Edit: Let's say you have a catalog of movies about the Italian Mafia in New York City in the 20th century. Let's say the movies are
1 Godfather
2 Goodfellas
3 Godfather II
Then your movie_tags table might contain these rows.
1 Gangsters
2 Gangsters
3 Gangsters
1 Francis Ford Coppola
3 Francis Ford Coppola
2 Martin Scorsese
Pro tip: If you find yourself thinking about putting lots of data items with the same meaning in their own columns, you probably need to normalize the data and add appropriate tables.
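As a rough illustration of why the one-row-per-tag layout still works for association mining, here is a small plain-JavaScript sketch (not SQL) that counts how many movies share two tags. The rows are the ones from the example above; the function names indexMoviesByTag and coOccurrence are made up for this sketch.

```javascript
// The movie_tags rows from the example: one (movie_id, tag) pair per row.
const movieTags = [
  { movie_id: 1, tag: "Gangsters" },
  { movie_id: 2, tag: "Gangsters" },
  { movie_id: 3, tag: "Gangsters" },
  { movie_id: 1, tag: "Francis Ford Coppola" },
  { movie_id: 3, tag: "Francis Ford Coppola" },
  { movie_id: 2, tag: "Martin Scorsese" },
];

// Hypothetical helper: map each tag to the set of movies carrying it.
function indexMoviesByTag(rows) {
  const index = new Map();
  for (const { movie_id, tag } of rows) {
    if (!index.has(tag)) index.set(tag, new Set());
    index.get(tag).add(movie_id);
  }
  return index;
}

// Hypothetical helper: number of movies that carry both tags.
function coOccurrence(index, tagA, tagB) {
  const a = index.get(tagA) || new Set();
  const b = index.get(tagB) || new Set();
  let count = 0;
  for (const id of a) if (b.has(id)) count += 1;
  return count;
}

const tagIndex = indexMoviesByTag(movieTags);
console.log(coOccurrence(tagIndex, "Gangsters", "Francis Ford Coppola")); // 2
console.log(coOccurrence(tagIndex, "Francis Ford Coppola", "Martin Scorsese")); // 0
```

The number of distinct tags never affects the table's column count, only its row count.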

With MDX is there a generic way to calculate the ratio of cells with regards to the selected members of a specific hierarchy?

I want to define a measure in an SSAS cube (multidimensional model) that calculates ratios for the selection a user makes within a predefined hierarchy. The following example illustrates the desired behavior:
|-City----|---|
| Hamburg | 2 |
| Berlin | 1 |
| Munich | 3 |
This is my base table. What I want to achieve is a cube measure that calculates ratios based on a user's selection. E.g. when the user queries Hamburg (2) and Berlin (1), the measure should return the values 67% (for Hamburg) and 33% (for Berlin). However, if Munich (3) is added to the same query, the return values should be 33% (Hamburg), 17% (Berlin) and 50% (Munich). The values should always sum to 100%, no matter how many hierarchy members are included in the MDX query.
So far I have come up with different measures, but they all seem to suffer from the same problem: it seems impossible to access the context of the whole MDX query from within a cell.
My first approach to this was the following measure:
[Measures].[Ratio] AS [Measures].[Amount]/SUM([City].MEMBERS,[Measures].[Amount])
This, however, sums up the amount of all cities regardless of the user's selection, and thus always returns the ratio of a city with regard to the whole city hierarchy.
I also tried to restrict the members to the query context by adding the EXISTING keyword.
[Measures].[Ratio] AS [Measures].[Amount]/SUM(EXISTING [City].MEMBERS,[Measures].[Amount])
But this seems to restrict the context to the cell which means that I get 100% as a result for each cell (because EXISTING [City].MEMBERS is now restricted to a cell it only returns the city of the current cell).
I also googled to find out whether it is possible to add a column or row with totals but that also seems not possible within MDX.
The closest I got was with the following measure:
[Measures].[Ratio] AS [Measures].[Amount]/SUM(Axis(1),[Measures].[Amount])
Along with this MDX query
SELECT {[Measures].[Ratio]} ON 0, {[City].[Hamburg],[City].[Berlin]} ON 1 FROM [Cube]
it would yield the correct result. However, this requires the user to put the correct hierarchy for this specific measure onto a specific axis - very error prone, very unintuitive, I don't want to go this way.
Are there any other ideas or approaches that could help me to define this measure?
I would first define a set with the selected cities
[GeoSet] AS {[City].[Hamburg],[City].[Berlin]}
Then the Ratio
[Measures].[Ratio] AS [Measures].[Amount]/SUM([GeoSet],[Measures].[Amount])
This gives the ratio of each city to the set of cities. Lastly:
SELECT [Measures].[Ratio] ON COLUMNS,
[GeoSet] ON ROWS
FROM [Cube]
Whenever you select a list of cities, change the [GeoSet] to the list of cities, or other levels in the hierarchy, as long as you don't select 2 overlapping values ([City].[Hamburg] and [Region].[DE6], for example).
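The arithmetic behind the set-based measure can be checked outside the cube. This is only a sketch in plain JavaScript, not MDX: amountByCity holds the base table from the question, and ratiosForSelection is a hypothetical helper that divides each selected city's amount by the sum over the selected set, which is the role [GeoSet] plays above.

```javascript
// Base table from the question.
const amountByCity = { Hamburg: 2, Berlin: 1, Munich: 3 };

// Hypothetical helper: ratio of each selected city to the total of the selection.
function ratiosForSelection(amounts, selectedCities) {
  const total = selectedCities.reduce((sum, city) => sum + amounts[city], 0);
  return Object.fromEntries(
    selectedCities.map((city) => [city, amounts[city] / total])
  );
}

// Format a fraction as a whole-number percentage.
const asPercent = (x) => `${Math.round(x * 100)}%`;

const two = ratiosForSelection(amountByCity, ["Hamburg", "Berlin"]);
console.log(asPercent(two.Hamburg), asPercent(two.Berlin)); // 67% 33%

const three = ratiosForSelection(amountByCity, ["Hamburg", "Berlin", "Munich"]);
console.log(asPercent(three.Hamburg), asPercent(three.Berlin), asPercent(three.Munich)); // 33% 17% 50%
```

The denominator depends only on the set, so the ratios add up to 100% for whatever selection is passed in.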

Nonexact duplication, sum rows in pivot table

I have a spreadsheet for payroll that is populated from a separate spreadsheet. Occasionally, one of our workers will get a promotion. That promotion shows on the timesheets: e.g. Smith, Adam Position becomes Smith, Adam Promotion.
This data is then populated into a pivot table where Smith, Adam Position and Smith, Adam Promotion show in separate cells. Currently, we are manually adding the two data sets so that payroll gets a single number instead of multiple. I would like to simplify this task. I am using Excel 2003, so some more advanced functions don't work.
Any suggestions and help would be greatly appreciated. Thanks in advance.
Ideally, you'd use a different field (a unique identifier) to identify Smith, Adam (e.g., an employee ID number), but if that's not available, then you could take the following approach:
(Suppose that "Smith, Adam Position" is in A1.)
You could add an additional column that extracts the last name, the comma, and then whatever the next word is. For example, from
Smith, Adam Analyst
you would get Smith, Adam. Unfortunately, this means that if you have something like
Jones, Mary Ellen Consultant
you would end up with Jones, Mary. If you think you can live with that, this solution could work. The way you would extract that would be with the following formula:
=SUBSTITUTE(LEFT(SUBSTITUTE(A1,", ",",",1),FIND(" ",SUBSTITUTE(A1,", ",",",1))-1),",",", ",1)
And then build your pivot table on that field.
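To make the extraction step easier to follow, here is the same idea sketched in plain JavaScript rather than as a worksheet formula: keep the last name, the comma, and the next word, then total by that key. The function name nameKey is hypothetical and the hours are invented for illustration.

```javascript
// Hypothetical helper: "Smith, Adam Position" -> "Smith, Adam".
function nameKey(label) {
  // A string pattern in replace() only touches the first occurrence,
  // so this closes the gap after the comma: "Smith,Adam Position".
  const compact = label.replace(", ", ",");
  // Cut at the first remaining space, i.e. just after the first name.
  const firstSpace = compact.indexOf(" ");
  const head = firstSpace === -1 ? compact : compact.slice(0, firstSpace);
  // Restore the space after the comma.
  return head.replace(",", ", ");
}

// Invented sample data, for illustration only.
const timesheet = [
  { name: "Smith, Adam Position", hours: 30 },
  { name: "Smith, Adam Promotion", hours: 10 },
  { name: "Jones, Mary Ellen Consultant", hours: 40 },
];

// Sum hours per extracted key, as the pivot table would on the helper column.
const hoursByPerson = {};
for (const { name, hours } of timesheet) {
  const key = nameKey(name);
  hoursByPerson[key] = (hoursByPerson[key] || 0) + hours;
}
console.log(hoursByPerson);
```

Note that the two-word first name is cut to Jones, Mary, which is the limitation described above.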

Rows to Dynamic columns in Access

I need a setup in Access where some rows in a table are converted to columns. For example, let's say I have this table:
Team Employee DaysWorked
Sales John 23
Sales Mark 3
Sales James 5
And then through the use of a query/form/something else, I would like the following display:
Team John Mark James
Sales 23 3 5
This conversion of rows to columns would have to be dynamic as a team could have any number of Employees and the Employees could change etc. Could anyone please guide me on the best way to achieve this?
You want to create a CrossTab query. Here's the SQL that you can use.
TRANSFORM SUM(YourTable.DaysWorked) AS DaysWorked
SELECT YourTable.Team
FROM YourTable
GROUP BY YourTable.Team
PIVOT YourTable.Employee
Of course the output is slightly different in that the columns are in alphabetical order.
Team James John Mark
Sales 5 23 3
For more detail see Make summary data easier to read by using a crosstab query at office.microsoft.com
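For anyone who wants to see what the crosstab does step by step, here is a minimal sketch of the same pivot in plain JavaScript (not Access SQL). It uses the table from the question; the function name crosstab is hypothetical. Like the query, it sums DaysWorked and orders the employee columns alphabetically.

```javascript
// Source table from the question.
const daysWorked = [
  { Team: "Sales", Employee: "John", DaysWorked: 23 },
  { Team: "Sales", Employee: "Mark", DaysWorked: 3 },
  { Team: "Sales", Employee: "James", DaysWorked: 5 },
];

// Hypothetical helper: one row per Team, one column per distinct Employee.
function crosstab(rows) {
  // Column headings come from the data, so new employees appear automatically.
  const employees = [...new Set(rows.map((r) => r.Employee))].sort();
  const byTeam = new Map();
  for (const { Team, Employee, DaysWorked } of rows) {
    if (!byTeam.has(Team)) {
      const blanks = Object.fromEntries(employees.map((e) => [e, null]));
      byTeam.set(Team, { Team, ...blanks });
    }
    const out = byTeam.get(Team);
    out[Employee] = (out[Employee] || 0) + DaysWorked; // SUM per cell
  }
  return { columns: ["Team", ...employees], rows: [...byTeam.values()] };
}

const pivoted = crosstab(daysWorked);
console.log(pivoted.columns.join(" "));
for (const row of pivoted.rows) {
  console.log(pivoted.columns.map((c) => row[c]).join(" "));
}
```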