stacked bar chart from grouped object - pandas

The following group-by query returns the expected counts. But when I add the .plot.bar() method, I get a separate bar for each record.
How do I get a stacked bar chart?
df.groupby(['department', 'status'])['c_name'].count()
department                            status
Agriculture                           Accepted      3
                                      Pending       2
                                      Rejected     13
Department of Education and Training  Accepted    290
                                      Rejected     65
Higher Education                      Accepted    424
                                      Pending      24
                                      Rejected     92
Medical Education and Research        Accepted     34
                                      Pending       3
                                      Rejected      1
Appending this creates a bar chart, but not a stacked one:
.plot(kind='bar', stacked=True)
For each department there should be 3 colors (for Accepted, Pending and Rejected)
Update:
I managed using pivot.
gdf=df.groupby(['department', 'status'])['c_name'].count().reset_index()
gdf.pivot(index='department', columns='status').plot(kind='bar', stacked=True)
But is it possible to improve the chart quality?

You are close; you need unstack:
df.groupby(['department','status'])['c_name'].count().unstack().plot(kind='bar', stacked=True)
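A minimal sketch of why unstack helps, assuming a hypothetical frame rebuilt from the counts shown in the question (the c_name values are placeholders, and plot_stacked is an illustrative helper, not part of pandas). unstack moves the inner status level into columns, so each department becomes one row and each status becomes one stacked segment. The fill_value=0 argument is an addition here, so that a missing department/status pair becomes 0 instead of NaN.

```python
import pandas as pd

# Counts copied from the question's groupby output.
question_counts = {
    ("Agriculture", "Accepted"): 3,
    ("Agriculture", "Pending"): 2,
    ("Agriculture", "Rejected"): 13,
    ("Department of Education and Training", "Accepted"): 290,
    ("Department of Education and Training", "Rejected"): 65,
    ("Higher Education", "Accepted"): 424,
    ("Higher Education", "Pending"): 24,
    ("Higher Education", "Rejected"): 92,
    ("Medical Education and Research", "Accepted"): 34,
    ("Medical Education and Research", "Pending"): 3,
    ("Medical Education and Research", "Rejected"): 1,
}

# Hypothetical raw data: one placeholder row per counted record.
rows = [
    {"department": dept, "status": status, "c_name": f"candidate_{i}"}
    for (dept, status), n in question_counts.items()
    for i in range(n)
]
df = pd.DataFrame(rows)

# One row per department, one column per status.
table = (
    df.groupby(["department", "status"])["c_name"]
      .count()
      .unstack(fill_value=0)
)


def plot_stacked(table):
    """Draw one bar per department with one colored segment per status."""
    # Needs matplotlib installed; pandas imports it when .plot is called.
    ax = table.plot(kind="bar", stacked=True, figsize=(10, 6), rot=45)
    ax.set_ylabel("count")
    ax.legend(title="status")
    return ax
```

Calling plot_stacked(table) draws the chart. The figsize, rot, axis label and legend title are only suggestions for the "chart quality" part of the question and can be adjusted freely.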

Related

How to label a whole dataset?

I have a pandas dataframe with 5000 columns and 12 rows. Each row holds the signal received from one electrocardiogram lead, so the 12 rows correspond to the 12 leads (I, II, III, aVR, ..., V6) and each row contains 5000 samples from that lead. I have attached a picture of my dataframe.
Experts have assigned 3 labels to this dataframe, which we use to train an ML model to detect different heart diseases. My labels are:
Atrial Fibrillation: 0,
Right Bundle Branch Block: 1,
T Wave Change: 2
These labels belong to the dataframe as a whole and are not tied to a specific row or column. I have 10000 dataframes like this one, and each has 3 or 4 labels of its own.
I tried to assign the 3 labels to the whole dataframe but didn't find a solution. How can I assign labels to an entire dataframe rather than to individual rows?

Multiple layer heatmap in seaborn?

I have a dataframe with 3 columns. Each of the columns contains some "labels". I want to study a correspondence between the labels of the three columns. As such, I created 3 heatmaps, for each pair of columns, that shows the number of times a pair of labels has appeared.
For example:
colA  | colB  | colC
dog   | car   | USA
cat   | plane | Germany
fish  | truck | Spain
eagle | bike  | France
dog   | car   | USA
eagle | train | UK
A heat map of the first two columns above is:
dog    2    0      0      0     0
cat    0    1      0      0     0
fish   0    0      1      0     0
eagle  0    0      0      1     1
       car  plane  truck  bike  train
Now, in the same manner, I can create the other two heatmaps. My question is: can I combine two of them (for example, keeping the horizontal axis the same and adding two vertical axes for the other two columns) to create a heatmap that includes the entire triplet correspondence?
Sorry if my question seems a bit vague, but I am trying to see if there are ways of visualizing a three-way correspondence in the style of a heatmap.
This link may help you:
Combine multiple heatmaps in matplotlib
One of the answers from the above link:
There are a few options to present 2 datasets together:
Option 1 - draw a heatmap of the difference of 2 datasets (or ratio, whatever is more appropriate in your case)
pcolor(D2-D1)
and then present several of these comparison figures.
Option 2 - present 1 dataset as pcolor, and another as contour:
pcolor(D1)
contour(D2)
If you really need to show N>2 datasets together, I would go with contour or contourf:
contourf(D1,cmap='Blues')
contourf(D2,cmap='Reds', alpha=0.66)
contourf(D3,cmap='Reds', alpha=0.33)
[Figure: example output of 3 contourf commands]
or
contour(D1,cmap='Blues')
contour(D2,cmap='Reds')
contour(D3,cmap='Reds')
[Figure: example output of 3 contour commands]
Unfortunately, similar alpha tricks do not work well with pcolor.
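As a sketch of the pairwise count tables the question describes, and of one possible way to keep colA as a shared axis, pandas.crosstab can build each table and pandas.concat can place two of them side by side. This is only an illustration using the question's sample rows; the names ab, ac and side_by_side are made up here, and it is an alternative layout, not the overlay approach from the linked answer.

```python
import pandas as pd

# Sample rows taken from the question.
df = pd.DataFrame({
    "colA": ["dog", "cat", "fish", "eagle", "dog", "eagle"],
    "colB": ["car", "plane", "truck", "bike", "car", "train"],
    "colC": ["USA", "Germany", "Spain", "France", "USA", "UK"],
})

# Co-occurrence counts for each pair of columns, both indexed by colA.
ab = pd.crosstab(df["colA"], df["colB"])
ac = pd.crosstab(df["colA"], df["colC"])

# Keep colA as the shared axis and put the colB and colC blocks next to
# each other; the outer column level records which block a column is from.
side_by_side = pd.concat({"colB": ab, "colC": ac}, axis=1)
print(side_by_side)
```

The combined table can then be drawn as a single heatmap, for instance with seaborn.heatmap(side_by_side, annot=True). Note that this shows the A-B and A-C pairings only; it does not show which B label occurred together with which C label.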

QlikView - Struggling to produce Bubble / scatter from record-data

I would like to make a Bubble-scatter where the data looks like:
Each row is an 'event', with a Day of the event and the event's grade
Day | Grade
------------
1 | A
1 | A
1 | B
1 | (empty)
1 | B
2 | A
I want this to turn into a bubble graph that looks like this:
Day along the X-axis (1, 2)
Grade along the Y-axis (A, B, vertically)
Given the data above, I would expect:
one big bubble for day-1 A
one big bubble for day-1 B
one little bubble for day-2 A
Instead, the chart refuses to display anything at all, saying 'undefined values'.
I'm really struggling to understand how this bubble/scatter chart works, and the documentation isn't helping.
It asks for Dimension, Measure, Measure, and I have tried many variations of Day, Count(Grade) and Avg(Grade).
Bubble charts are a bit tricky.
In your case you need 2 dimensions: dimension 1 (Grade) and dimension 2 (Day).
Expression is as simple as sum(1) (take into account that you are already double-filtering the data because you have 2 dimensions). To avoid extra bubbles due to the Null entries, just select "ignore null values" in the dimension panel options.
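To picture what sum(1) over the two dimensions computes, here is a small pandas illustration using the question's sample data. This is not QlikView syntax; it is only a sketch of the aggregation, where each (Day, Grade) pair gets a row count that plays the role of the bubble size. pandas drops the empty Grade by default in groupby, which mirrors the "ignore null values" option.

```python
import pandas as pd

# Sample events from the question; None stands for the empty grade.
events = pd.DataFrame({
    "Day":   [1, 1, 1, 1, 1, 2],
    "Grade": ["A", "A", "B", None, "B", "A"],
})

# One count per (Day, Grade) pair; rows with a missing Grade are skipped.
bubble_sizes = events.groupby(["Day", "Grade"]).size()
print(bubble_sizes)
```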

SSRS 2008 display multiple columns of data without a new line

I am creating a report in SSRS 2008 with MS SQL Server 2008 R2. I have data based on the Aggregate value of Medical condition and the level of severity.
Outcome  Response  Adult  Youth  Total
BMI      GOOD         70      0     70
BMI      MONITOR     230      0    230
BMI      PROBLEM!     10      0     10
LDL      GOOD          5      0      5
LDL      MONITOR       4      0      4
LDL      PROBLEM!      2      0      2
I need to display the data based on the Response like:
       BMI   BMI      BMI
       GOOD  MONITOR  PROBLEM!
Total  70    230      10
Youth  0     0        0
Adult  70    230      10
       LDL   LDL      LDL
       GOOD  MONITOR  PROBLEM!
Total  5     4        2
Youth  0     0        0
Adult  5     4        2
I first tried grouping in SSRS by Outcome and then by Response, but each Response ended up on a separate row, whereas I need each Outcome's responses side by side on a single line. I now believe that a pivot would work, but all the examples I have seen pivot one column of data using another. Is it possible to pivot multiple columns of data based on a single column?
With your existing Dataset you could do something similar to the following:
Create a List item, and change the Details grouping to be based on Outcome:
In the List cell, add a new Matrix with one Column Group based on Response:
You'll note that since you have individual columns for Total, Youth, Adult, you need to add grand total rows to display each group.
The end result is pretty close to your requirements:
For your underlying data, to help with report development it might be useful to have the Total, Youth, Adult as unpivoted columns, but it's not a big deal if the groups are fairly static.

vba loop through all the pivot fields of a pivot table and return specified values

I have a dataset whose entries have 5 different attributes and one value. For example, I have the heights of 5000 people. For each person I have their hair color, eye color, nationality, the city they were born in, and their mother's name (the 5 dimensions).
No  Eye Color  Hair Color  Nationality  Hometown  Mother's Name  Height
    Blue       Blond       Swiss        Zürich    Nicole         184
    Blue       Brown       English      York      Ruby           164
    Brown      Brown       French       Paris     Sophie         154
etc.
So there are 5 dimensions. The data is set dynamically, so the number of categories in each dimension can vary. I want to compute the average height of people depending on which dimensions I include (from 1 to 5). For example, I might want to retrieve:
The average height of French, blue-eyed people. The next day, only the people born in London. And the week after, the Swiss, blue-eyed, red-haired people born in Geneva whose mother is called Nicole.
So I created a pivot table with Eye Color as Row Labels, Hair Color as Column Labels, the average height as the Data, and the last 3 dimensions as Report Filters. This lets me see all the possible and desired combinations of average height that my data implies.
Now my goal is:
I want to create a macro that goes through all the possible combinations that my dimensions entail (i.e. 2^5-1=31) and stores in a vector all the average-height combinations that are above a certain value, e.g. 190, and then prints them on a worksheet.
I was thinking of using Boolean arrays and a For Each...Next structure, but I must say that I fail to picture how to implement it.
Any ideas?
Thanks for the time and help!
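One way to picture the enumeration described above, sketched in Python with pandas rather than VBA: itertools.combinations yields the 2^5-1=31 non-empty subsets of the dimensions, and a groupby per subset gives the average height for every category combination. The first three rows come from the question; the fourth row, the column names and the results list are assumptions made up for this illustration.

```python
from itertools import combinations

import pandas as pd

people = pd.DataFrame({
    "eye":         ["Blue", "Blue", "Brown", "Green"],
    "hair":        ["Blond", "Brown", "Brown", "Red"],
    "nationality": ["Swiss", "English", "French", "Swiss"],
    "hometown":    ["Zürich", "York", "Paris", "Geneva"],
    "mother":      ["Nicole", "Ruby", "Sophie", "Nicole"],
    "height":      [184, 164, 154, 196],  # last value is hypothetical
})

dimensions = ["eye", "hair", "nationality", "hometown", "mother"]
threshold = 190

# Every non-empty subset of the 5 dimensions: 2**5 - 1 = 31 subsets.
subsets = [
    list(combo)
    for size in range(1, len(dimensions) + 1)
    for combo in combinations(dimensions, size)
]

# Collect each category combination whose average height exceeds the threshold.
results = []
for dims in subsets:
    means = people.groupby(dims)["height"].mean()
    for key, avg in means[means > threshold].items():
        # A single-dimension group has a scalar key; normalise to a tuple.
        key = key if isinstance(key, tuple) else (key,)
        results.append((dict(zip(dims, key)), avg))

for categories, avg in results:
    print(categories, avg)
```

In a VBA version, the outer loop over subsets would correspond to choosing which pivot fields are active, and the inner loop to reading the resulting averages; the results list stands in for the vector that is later written to a worksheet.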