I have a dataframe with 3 columns, each of which contains some "labels". I want to study the correspondence between the labels of the three columns. To that end, I created 3 heatmaps, one for each pair of columns, that show the number of times each pair of labels appears together.
For example:
colA  | colB  | colC
dog   | car   | USA
cat   | plane | Germany
fish  | truck | Spain
eagle | bike  | France
dog   | car   | USA
eagle | train | UK
A heat map of the first two columns above is:
dog     2     0       0       0      0
cat     0     1       0       0      0
fish    0     0       1       0      0
eagle   0     0       0       1      1
        car   plane   truck   bike   train
Now, in the same manner, I can create the other two heatmaps. My question is: can I combine two of them (for example, keeping the horizontal axis the same and adding two vertical axes for the other two columns) to create a heatmap that shows the entire triplet correspondence?
Sorry if my question seems a bit vague, but I am trying to see if there are ways of visualizing a three-way correspondence in the style of a heatmap.
This link may help you:
Combine multiple heatmaps in matplotlib
One of the answers from the above link:
There are a few options to present 2 datasets together:
Option 1 - draw a heatmap of the difference of the 2 datasets (or their ratio, whichever is more appropriate in your case)
pcolor(D2-D1)
and then present several of these comparison figures.
Option 2 - present 1 dataset as pcolor, and the other as contour:
pcolor(D1)
contour(D2)
If you really need to show N>2 datasets together, I would go with contour or contourf:
contourf(D1,cmap='Blues')
contourf(D2,cmap='Reds', alpha=0.66)
contourf(D3,cmap='Greens', alpha=0.33)
example output of 3 contourf commands
or
contour(D1,cmap='Blues')
contour(D2,cmap='Reds')
contour(D3,cmap='Greens')
example output of 3 contour commands
Unfortunately, similar alpha tricks do not work well with pcolor.
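As an alternative to overlaying plots, the triplet counts themselves can be tabulated first. The following is a minimal sketch, assuming the example data from the question and using `pd.crosstab` with two column keys; the variable names are illustrative, not from the original post. The resulting table keeps colA on one axis and puts every observed (colB, colC) pair on the other, so it can be passed to the same plotting call used for the pairwise heatmaps.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame({
    "colA": ["dog", "cat", "fish", "eagle", "dog", "eagle"],
    "colB": ["car", "plane", "truck", "bike", "car", "train"],
    "colC": ["USA", "Germany", "Spain", "France", "USA", "UK"],
})

# Pairwise counts, matching the colA vs colB heat map shown in the question
pair_counts = pd.crosstab(df["colA"], df["colB"])

# Three-way counts: colA on the rows, each observed (colB, colC) pair as a column
triple_counts = pd.crosstab(df["colA"], [df["colB"], df["colC"]])

print(triple_counts)
```

With many distinct labels the combined axis grows quickly, so this layout is most readable when only a small fraction of the possible (colB, colC) pairs actually occur.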
Related
I have a pandas dataframe that contains 5000 columns and 12 rows. Each row represents the signal received from one electrocardiogram lead. I want to assign 3 labels to this dataset. These 3 labels belong to the entire dataset and are not related to a specific row. How can I do this?
I have attached a picture of my pandas dataframe.
My labels are: Atrial Fibrillation:0,
right bundle branch block:1,
T Wave Change:2
I tried to assign 3 labels to a large dataset (not to a specific row or column), but I didn't find a solution.
As you can see, it has 12 rows and 5000 columns. Each row holds 5000 samples from one specific lead, and the 12 rows correspond to the 12 leads (I, II, III, aVR, ..., V6). Professional experts have assigned 3 labels to this dataframe, which helps us train an ML model to detect different heart diseases. I have 10000 dataframes just like this one, and each has 3 or 4 specific labels. Here is my question: how can I assign these 3 labels to the dataframe I described? As I said before, the labels do not refer to specific rows; each dataframe has 3 or 4 labels for the dataframe as a whole.
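One common way to handle labels that belong to a whole record is to keep them outside the dataframe, in a separate array aligned by record index. The sketch below is only an illustration under stated assumptions: the signals are random stand-ins rather than real ECG data, the helper `to_multi_hot` is a hypothetical name, and the multi-hot encoding is one possible choice for a multi-label model, not something the question specifies.

```python
import numpy as np
import pandas as pd

# Label codes as given in the question
LABELS = {"Atrial Fibrillation": 0, "right bundle branch block": 1, "T Wave Change": 2}

def to_multi_hot(label_ids, n_classes=len(LABELS)):
    """Hypothetical helper: turn a list of label codes into a 0/1 vector."""
    vec = np.zeros(n_classes, dtype=np.int8)
    vec[list(label_ids)] = 1
    return vec

# Two stand-in records (random values), each 12 leads x 5000 samples
rng = np.random.default_rng(0)
records = [pd.DataFrame(rng.normal(size=(12, 5000))) for _ in range(2)]

# One list of label codes per record; the labels describe the whole record
record_labels = [[0, 1, 2], [1]]

# X holds the signals, y holds one label vector per record
X = np.stack([rec.to_numpy() for rec in records])        # shape (2, 12, 5000)
y = np.stack([to_multi_hot(ids) for ids in record_labels])  # shape (2, 3)
```

Because row `i` of `y` describes record `i` of `X`, a record can carry 3 or 4 labels without touching any individual lead row.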
I have a verbal algorithm question, so I have no code yet. The question is this: how can I create an algorithm for 2 dynamic stacks of strings, either of which may or may not contain duplicate items? For example, I have 3 breads, 4 lemons and 2 pens in the first stack, say s1, and 5 breads, 3 lemons and 5 pens in the second stack, say s2. I want to find the number of occurrences of each item in each stack and print, for each item, the smaller of the two counts, for example:
bread --> 3
lemon --> 3
pen --> 2
How can I traverse 2 stacks and print the number of duplicated occurrences until the end of stacks? If you are confused about anything, I can edit my question depending on your confusion. Thanks.
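A minimal sketch of one way to do this, assuming each stack can be traversed as a Python list: count the items in each stack with `collections.Counter`, then intersect the two counters with `&`, which keeps the minimum count for every item present in both.

```python
from collections import Counter

# Stacks from the example, modelled as lists
s1 = ["bread"] * 3 + ["lemon"] * 4 + ["pen"] * 2
s2 = ["bread"] * 5 + ["lemon"] * 3 + ["pen"] * 5

# Counter intersection keeps min(count in s1, count in s2) per item
common = Counter(s1) & Counter(s2)

for item, count in sorted(common.items()):
    print(f"{item} --> {count}")
```

Items that appear in only one of the stacks have a minimum of zero and are dropped from the result.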
I have two data frames, and I want to create new columns in frame 1 using properties from frame 2
frame 1
Name
alice
bob
carol
frame 2
Name Type Value
alice lower 1
alice upper 2
bob equal 42
carol lower 0
desired result
frame 1
Name Lower Upper
alice 1 2
bob 42 42
carol 0 NA
Hence, the common column of both frames is Name. You can use Name to look up bounds (of a variable), which are specified in the second frame. Frame 1 lists each name exactly once. Frame 2 might have one or two entries per name, each of which specifies either a lower or an upper bound (or both at once if the type is equal). We do not need both bounds for each variable; one of the bounds can stay empty. I would like to have a frame that lists the range of each variable. I see how I can do that with for-loops over the rows, but that does not seem to be in the pandas spirit. Do you have any suggestions for a compact solution? :-)
Thanks in advance
You're not looking for a merge, but rather a pivot.
(df2[df2['Name'].isin(df1['Name'])]
.pivot(index='Name', columns='Type', values='Value')
.reset_index()
)
But this doesn't handle the special 'equal' case.
For this, you can use a little trick: replace 'equal' with the list ['lower', 'upper'] and explode, so that each 'equal' row becomes two rows carrying the same value.
(df2[df2['Name'].isin(df1['Name'])]
.assign(Type=lambda d: d['Type'].map(lambda x: {'equal': ['lower', 'upper']}.get(x,x)))
.explode('Type')
.pivot(index='Name', columns='Type', values='Value')
.reset_index()
.convert_dtypes()
)
Output:
Name lower upper
0 alice 1 2
1 bob 42 42
2 carol 0 <NA>
I have a list and a data frame. I want to count the occurrences of each entry in the list (some entries are pairs of words) for each of the "emotions" in the data frame.
Here is my list:
[(frozenset({'know'}), 16528),
(frozenset({'im'}), 39047),
(frozenset({'feeling'}), 99455),
(frozenset({'like'}), 49332),
(frozenset({'feel', 'im'}), 16602),
(frozenset({'feeling', 'im'}), 23488),
(frozenset({'feel'}), 202985),
(frozenset({'feel', 'like'}), 42162),
(frozenset({'time'}), 17203),
(frozenset({'really'}), 17247)]
and this is my data frame:
Unnamed: 0 id text emotions
0 0 27383 [feel, awful, job, get, position, succeed, hap... sadness
1 1 110083 [im, alone, feel, awful] sadness
2 2 140764 [ive, probably, mentioned, really, feel, proud... joy
3 3 100071 [feeling, little, low, day, back] sadness
4 4 2837 [beleive, much, sensitive, people, feeling, te... love
Here is the expected output:
One column for each of the six existing emotions, plus a last column for the total count.
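A minimal sketch of one possible approach, under two assumptions that the question does not confirm: an entry "occurs" in a row when all of its words are present in that row's `text` list, and each row counts at most once per entry. The data below are short made-up stand-ins, not the rows from the question, and only three of the itemsets are used.

```python
import pandas as pd

# A few itemsets in the same shape as the list in the question (counts omitted)
itemsets = [frozenset({"feel"}), frozenset({"im"}), frozenset({"feel", "im"})]

# Made-up stand-in rows with tokenised text and an emotion label
df = pd.DataFrame({
    "text": [["feel", "awful", "job"],
             ["im", "alone", "feel", "awful"],
             ["really", "feel", "proud"]],
    "emotions": ["sadness", "sadness", "joy"],
})

token_sets = df["text"].map(set)

rows = {}
for itemset in itemsets:
    # True where every word of the itemset is present in the row's tokens
    contains = token_sets.map(itemset.issubset)
    # Number of matching rows per emotion
    rows[", ".join(sorted(itemset))] = contains.groupby(df["emotions"]).sum()

# One row per itemset, one column per emotion, plus a total column
counts = pd.DataFrame(rows).T.fillna(0).astype(int)
counts["total"] = counts.sum(axis=1)
print(counts)
```

With the full data this yields one column per emotion found in `emotions`, so six emotions give six columns followed by `total`.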
I'm new to MDX and need your help:
[Item].[Segment] [Country].[World] [Measures].[Periodic]
1 Region A 150
2 Region B 60
3 Region C 1400
4 Region D 20
I have two dimensions, Segment and World. If I select only World, I get no values. What I want is to combine the two dimensions into a single dimension at the Segment level, as follows:
[Item].[Segment] [Measures].[Periodic]
1 150
2 60
3 1400
4 20
Would an aggregation be useful in this case?
Thanks in advance!
The structure is as follows:
Cube_Structure
--> I need to combine both dimensions, Segment and World, so that a single dimension on the rows shows the values for the segments only!