I have a data frame like the following:
days movements count
0 0 0 2777
1 0 1 51
2 0 2 2
3 1 0 6279
4 1 1 200
5 1 2 7
6 1 3 3
7 2 0 5609
8 2 1 110
9 2 2 32
10 2 3 4
11 3 0 4109
12 3 1 118
13 3 2 101
14 3 3 8
15 3 4 3
16 3 6 1
17 4 0 3034
18 4 1 129
19 4 2 109
20 4 3 6
21 4 4 2
22 4 5 2
23 5 0 2288
24 5 1 131
25 5 2 131
26 5 3 9
27 5 4 2
28 5 5 1
29 6 0 1918
30 6 1 139
31 6 2 109
32 6 3 13
33 6 4 1
34 6 5 1
35 7 0 1442
36 7 1 109
37 7 2 153
38 7 3 13
39 7 4 10
40 7 5 1
41 8 0 1085
42 8 1 76
43 8 2 111
44 8 3 13
45 8 4 7
46 8 7 1
47 9 0 845
48 9 1 81
49 9 2 86
50 9 3 8
51 9 4 8
52 10 0 646
53 10 1 70
54 10 2 83
55 10 3 1
56 10 4 2
57 10 5 1
58 10 6 1
This shows that, for example, on day 0 I have 2777 entries with 0 movements, 51 entries with 1 movement, and 2 entries with 2 movements. I want to plot a bar graph for every day that shows the entry count for each number of movements. To do that, I thought I would transform the data into something like the table below and then plot a bar graph.
days 0 1 2 3 4 5 6 7
0 2777 51 2
1 6279 200 7 3
2 5609 110 32 4
3 4109 118 101 8 3
4 3034 129 109 6 2 2
5 2288 131 131 9 2 1
6 1918 139 109 13 1 1
7 1442 109 153 13 10 1
8 1085 76 111 13 7 1
9 845 81 86 8 8
10 646 70 83 1 2 1 1
I can't figure out how to achieve this. I have thousands of lines of data, so doing it by hand does not make sense. Can someone guide me on how to rearrange the data? If there is a quick way to plot the bar graph with matplotlib straight from the original data frame, that would be even better. Thanks for the help.
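One possible way to get the wide layout described above is a pivot. This is only a sketch on a few rows copied from days 0 and 1 of the question; filling the missing day/movement combinations with 0 is my assumption (the question shows them as blanks).

```python
import pandas as pd

# A few rows in the same long format as the question (days 0 and 1 only).
df = pd.DataFrame({
    "days":      [0, 0, 0, 1, 1, 1, 1],
    "movements": [0, 1, 2, 0, 1, 2, 3],
    "count":     [2777, 51, 2, 6279, 200, 7, 3],
})

# One row per day, one column per movement value.
# Combinations that never occur become NaN; here they are filled with 0 (an assumption).
wide = df.pivot(index="days", columns="movements", values="count").fillna(0).astype(int)
print(wide)
```

With matplotlib installed, wide.plot(kind='bar') should then draw one cluster of bars per day. Since the 0-movement counts dominate, passing logy=True may make the smaller bars visible.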
I have a dataframe with multiple columns, all are ordered in ascending order:
40 41 42 43 44 45 46 47 48 49
0 1 1 1 1 1 1 1 1 1 1
1 1 1 1 1 1 1 1 1 1 1
2 1 1 1 1 1 2 1 1 1 1
3 1 1 1 1 1 2 1 1 1 1
4 1 1 1 1 1 2 2 1 1 1
.. .. .. .. .. .. .. .. .. .. ..
367 18 26 25 25 30 25 27 27 30 29
368 18 26 26 25 30 25 27 27 31 29
369 18 27 27 25 30 25 27 27 31 29
370 19 27 27 25 30 25 27 27 31 29
371 19 27 27 25 30 25 27 27 31 29
I want to group by each column's values and run cumcount. I know I could iterate through all the columns, but as people say, you should avoid iteration as much as you can. So I would like to know if there is a more elegant solution.
If you have a reasonable number of columns, using apply on the columns is actually not that bad:
df.apply(lambda c: c.groupby(c).cumcount())
output:
40 41 42 43 44 45 46 47 48 49
0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1
2 2 2 2 2 2 0 2 2 2 2
3 3 3 3 3 3 1 3 3 3 3
4 4 4 4 4 4 2 0 4 4 4
367 0 0 0 0 0 0 0 0 0 0
368 1 1 0 1 1 1 1 1 0 1
369 2 0 0 2 2 2 2 2 1 2
370 0 1 1 3 3 3 3 3 2 3
371 1 2 2 4 4 4 4 4 3 4
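To see what the per-column cumcount does, here is a tiny made-up frame (the column names a and b and their values are hypothetical, not the asker's data). Each column is grouped by its own values, and every row gets its running position within its group.

```python
import pandas as pd

# Small illustrative frame; each column is already sorted ascending.
df = pd.DataFrame({
    "a": [1, 1, 2, 2, 2],
    "b": [1, 2, 2, 3, 3],
})

# For every column, count how many times the value has appeared so far.
counts = df.apply(lambda c: c.groupby(c).cumcount())
print(counts)
```

Column a comes out as 0, 1, 0, 1, 2 and column b as 0, 0, 1, 0, 1: the counter restarts whenever a new value begins.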
DataFrame
pd.DataFrame({'a': range(20)})
>>
a
0 0
1 1
2 2
3 3
4 4
5 5
6 6
7 7
8 8
9 9
10 10
11 11
12 12
13 13
14 14
15 15
16 16
17 17
18 18
19 19
Expected result:
a group_num
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
What I want to do is assign a group number, from 1 to 10, to each row according to its value.
The idea is to sort the values, split them into 10 groups, and assign a number from 1 to 10 to each group.
But I have no idea how to implement this in pandas.
Any help is appreciated.
I believe you need qcut for evenly sized bins:
df['b'] = pd.qcut(df['a'], 10, labels=range(1, 11))
print (df)
a b
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
And if you wanted to create groups of 2 you can use this:
df['b'] = df['a'].floordiv(2)+1
You can use //:
df['G']=df.a//2+1
df
Out[609]:
a G
0 0 1
1 1 1
2 2 2
3 3 2
4 4 3
5 5 3
6 6 4
7 7 4
8 8 5
9 9 5
10 10 6
11 11 6
12 12 7
13 13 7
14 14 8
15 15 8
16 16 9
17 17 9
18 18 10
19 19 10
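Note that qcut and integer division only agree when the values are consecutive integers starting at 0, as in the question. A small sketch with made-up values (the column names by_rank and by_value are mine) shows where they diverge:

```python
import pandas as pd

# Hypothetical data with gaps between the values.
df = pd.DataFrame({"a": [0, 1, 2, 3, 100, 200, 300, 400]})

# qcut splits by rank: 4 groups of (roughly) equal size.
df["by_rank"] = pd.qcut(df["a"], 4, labels=range(1, 5)).astype(int)

# Integer division groups by the value itself, so gaps in the data leave gaps in the group numbers.
df["by_value"] = df["a"] // 2 + 1

print(df)
```

So if the goal is "sort and split into equally sized groups", qcut is probably the safer choice; the // version is fine when the values are known to be a plain 0..n-1 range.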
Trying to run an update on the following result set:
Row# ProductRankID ProductID ProductCategoryID ProductTypeID Rank Score
1 3 11266 9 80 0 765
2 14 25880 9 80 0 656
3 12 25864 9 80 0 547
4 7 11252 9 80 0 457
5 8 25719 9 80 0 456
6 4 13425 9 80 0 456
7 11 25677 9 80 0 456
8 9 25716 9 80 0 432
9 15 25714 9 80 0 324
10 13 13589 9 80 0 234
11 20 25803 9 80 0 234
12 17 25715 9 80 0 213
13 5 21269 9 80 0 154
14 10 25867 9 80 0 123
15 16 25676 9 80 0 123
16 22 17861 9 80 0 67
17 19 13534 9 80 0 55
18 23 13659 9 80 0 54
19 29 13658 9 80 0 34
20 21 13591 9 80 0 32
21 6 11249 9 80 0 23
22 18 11253 9 80 0 12
23 28 11253 9 87 0 65
24 27 13664 9 87 0 45
25 25 13658 9 87 0 14
26 26 13657 9 87 0 13
27 24 13659 9 87 0 13
28 30 11252 9 87 0 12
29 2 12345 11 80 0 324
I want the "Rank" column to be set to 1, 2, 3, 4, etc. row by row. Then, whenever the ProductCategoryID + ProductTypeID combination changes, I want it to reset and count 1, 2, 3, 4, etc. again.
So the results should look something like:
Row# ProductRankID ProductID ProductCategoryID ProductTypeID Rank Score
1 3 11266 9 80 1 765
2 14 25880 9 80 2 656
3 12 25864 9 80 3 547
4 7 11252 9 80 4 457
5 8 25719 9 80 5 456
6 4 13425 9 80 6 456
7 11 25677 9 80 7 456
8 9 25716 9 80 8 432
9 15 25714 9 80 9 324
10 13 13589 9 80 10 234
11 20 25803 9 80 11 234
12 17 25715 9 80 12 213
13 5 21269 9 80 13 154
23 28 11253 9 87 1 65
24 27 13664 9 87 2 45
25 25 13658 9 87 3 14
26 26 13657 9 87 4 13
27 24 13659 9 87 5 13
28 30 11252 9 87 6 12
29 2 12345 11 80 1 324
Hope that makes some sense?
Thanks,
Richie
If you want a select:
select t.*,
       row_number() over (partition by ProductCategoryID, ProductTypeID
                          order by score desc, productid
                         ) as new_rank
from t;
If you want an update, use a CTE:
with toupdate as (
      select t.*,
             row_number() over (partition by ProductCategoryID, ProductTypeID
                                order by score desc, productid
                               ) as new_rank
      from t
     )
update toupdate
    set rank = new_rank;
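If you want to try the row_number() part without a database server, here is a rough sketch using Python's built-in sqlite3 module. It assumes the bundled SQLite is version 3.25 or newer (window functions), uses only a handful of rows from the question, and shows only the select; the updatable CTE form above is not SQLite syntax and is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "create table t (ProductID int, ProductCategoryID int, ProductTypeID int, Score int)"
)

# A few rows taken from the question's result set.
rows = [
    (11266, 9, 80, 765),
    (25880, 9, 80, 656),
    (11253, 9, 87, 65),
    (13664, 9, 87, 45),
    (12345, 11, 80, 324),
]
conn.executemany("insert into t values (?, ?, ?, ?)", rows)

# Number the rows within each (ProductCategoryID, ProductTypeID) group, highest score first.
query = """
    select ProductID, ProductCategoryID, ProductTypeID, Score,
           row_number() over (partition by ProductCategoryID, ProductTypeID
                              order by Score desc, ProductID) as new_rank
    from t
    order by ProductCategoryID, ProductTypeID, new_rank
"""
ranked = conn.execute(query).fetchall()
for row in ranked:
    print(row)
```

The numbering restarts at 1 each time the category/type pair changes, which is the behaviour asked for.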
I want to find abnormal values and replace them with the value from the corresponding day of the next week.
year week day v1 v2
2001 1 1 46 9999
2001 1 2 60 9335
2001 1 3 9999 9318
2001 1 4 47 9999
2001 1 5 57 9373
2001 1 6 9999 9384
2001 1 7 72 9444
2001 2 1 75 73
2001 2 2 74 63
2001 2 3 79 377
2001 2 4 70 361
2001 2 5 75 73
2001 2 6 77 64
2001 2 7 76 57
I can do this column by column, with code like the following:
index_row = df[df['v1'] == 9999].index
for i in index_row:
    df.loc[i, 'v1'] = df.loc[i + 7, 'v1']  # i + 7 is the index of the same day next week
How can I do this element-wise over the whole DataFrame, for example with something like applymap?
And how can I get the column names and row numbers of the values that match a condition?
The target df I want as follows:
(* indicates the modified values and the next-week values they were taken from)
year week day v1 v2
2001 1 1 46 *73
2001 1 2 60 9335
2001 1 3 *79 9318
2001 1 4 47 *361
2001 1 5 57 9373
2001 1 6 *77 9384
2001 1 7 72 9444
2001 2 1 75 *73
2001 2 2 74 63
2001 2 3 *79 377
2001 2 4 70 *361
2001 2 5 75 73
2001 2 6 *77 64
2001 2 7 76 57
Create d1 by calling set_index on the columns ['year', 'week', 'day'].
Create d2 with the same index as d1, except subtract 1 from week.
Use mask with d2 as other.
cols = ['year', 'week', 'day']
d1 = df.set_index(cols)
d2 = df.assign(week=df.week - 1).set_index(cols)
d1.mask(d1.eq(9999), d2).reset_index()
year week day v1 v2
0 2001 1 1 46 73
1 2001 1 2 60 9335
2 2001 1 3 79 9318
3 2001 1 4 47 361
4 2001 1 5 57 9373
5 2001 1 6 77 9384
6 2001 1 7 72 9444
7 2001 2 1 75 73
8 2001 2 2 74 63
9 2001 2 3 79 377
10 2001 2 4 70 361
11 2001 2 5 75 73
12 2001 2 6 77 64
13 2001 2 7 76 57
old answer
One approach is to set up d1 with an index of ['year', 'week', 'day'] and manipulate that to shift by a week. Then mask the values equal to 9999 and fillna:
d1 = df.set_index(['year', 'week', 'day'])
s1 = d1.unstack(['year', 'day']).shift(-1).stack(['year', 'day']).swaplevel(0, 1)
d1.mask(d1==9999).fillna(s1).reset_index()
year week day v1 v2
0 2001 1 1 46.0 73.0
1 2001 1 2 60.0 9335.0
2 2001 1 3 79.0 9318.0
3 2001 1 4 47.0 361.0
4 2001 1 5 57.0 9373.0
5 2001 1 6 77.0 9384.0
6 2001 1 7 72.0 9444.0
7 2001 2 1 75.0 73.0
8 2001 2 2 74.0 63.0
9 2001 2 3 79.0 377.0
10 2001 2 4 70.0 361.0
11 2001 2 5 75.0 73.0
12 2001 2 6 77.0 64.0
13 2001 2 7 76.0 57.0
You can work with a DatetimeIndex and set the values with mask, using rows shifted by one week:
a = (df['year'].astype(str).add('-').add(df['week'].astype(str))
       .add('-').add(df['day'].sub(1).astype(str)))
# format codes: http://strftime.org/
df.index = pd.to_datetime(a, format='%Y-%U-%w')
df2 = df.shift(-1, freq='7D')
df = df.mask(df.eq(9999), df2).reset_index(drop=True)
print (df)
year week day v1 v2
0 2001 1 1 46 73
1 2001 1 2 60 9335
2 2001 1 3 79 9318
3 2001 1 4 47 361
4 2001 1 5 57 9373
5 2001 1 6 77 9384
6 2001 1 7 72 9444
7 2001 2 1 75 73
8 2001 2 2 74 63
9 2001 2 3 79 377
10 2001 2 4 70 361
11 2001 2 5 75 73
12 2001 2 6 77 64
13 2001 2 7 76 57
I detect mouse wheel scrolling using the PointerWheelChanged event in WinRT. I use PointerPoint.Properties.MouseWheelDelta to detect the amount and direction of the scroll:
PointerPoint mousePosition = e.GetCurrentPoint(_control);
var delta = mousePosition.Properties.MouseWheelDelta;
Nowadays there are devices that emulate mouse scrolling (touchpads, touch mice, etc.).
They tend to issue tens or hundreds (sic!) of PointerWheelChanged events per "scroll". A legacy mouse wheel issues one event per wheel click, with a delta of +/-120 units.
I need to do some heavy processing as soon as the user scrolls to some position.
Is there a way to tell that one of these "new" scrolls is complete?
FYI, here are the mouse wheel deltas for a single finger flick with a Microsoft TouchMouse (sorry for the amount, I just want to illustrate the problem).
15
15
164
164
304
304
658
658
773
773
887
887
1000
1000
1111
1111
1221
1221
1330
1330
108
108
107
107
106
106
105
105
104
104
103
103
102
102
203
203
100
100
99
99
98
98
97
97
96
96
95
95
94
94
93
93
92
92
91
91
90
90
89
89
88
88
88
88
87
87
86
86
85
85
84
84
83
83
82
82
82
82
81
81
80
80
79
79
78
78
78
78
77
77
76
76
75
75
75
75
74
74
73
73
72
72
72
72
71
71
70
70
70
70
69
69
68
68
67
67
67
67
66
66
65
65
65
65
64
64
63
63
63
63
62
62
62
62
61
61
60
60
60
60
59
59
59
59
58
58
57
57
57
57
56
56
56
56
55
55
55
55
54
54
54
54
53
53
52
52
52
52
51
51
51
51
50
50
50
50
49
49
49
49
48
48
48
48
47
47
47
47
46
46
46
46
46
46
45
45
45
45
44
44
44
44
43
43
43
43
42
42
42
42
42
42
41
41
41
41
40
40
40
40
40
40
39
39
39
39
38
38
38
38
38
38
37
37
37
37
37
37
36
36
36
36
35
35
35
35
35
35
34
34
34
34
34
34
33
33
33
33
33
33
32
32
32
32
32
32
31
31
31
31
31
31
30
30
30
30
30
30
30
30
29
29
29
29
29
29
28
28
28
28
28
28
28
28
27
27
27
27
27
27
26
26
26
26
26
26
26
26
25
25
25
25
25
25
25
25
24
24
24
24
24
24
24
24
23
23
23
23
23
23
23
23
23
23
22
22
22
22
22
22
22
22
21
21
21
21
21
21
21
21
21
21
20
20
20
20
20
20
20
20
20
20
19
19
19
19
19
19
19
19
19
19
18
18
18
18
18
18
18
18
18
18
18
18
17
17
17
17
17
17
17
17
17
17
17
17
16
16
16
16
16
16
16
16
16
16
16
16
15
15
15
15
15
15
15
15
15
15
15
15
14
14
14
14
14
14
14
14
14
14
14
14
14
14
14
14
13
13
13
13
13
13
13
13
13
13
13
13
13
13
12
12
12
12
12
12
12
12
12
12
12
12
12
12
12
12
12
12
11
11
11
11
11
11
11
11
11
11
11
11
11
11
11
11
11
11
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
10
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
9
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
8
15
15
22
22
7
7
7
7
14
14
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
7
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
6
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
4
8
8
12
12
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
3
9
9
3
3
3
3
3
3
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
2
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
1
EDIT:
Now I use this hack, but it is far from perfect:
// interval between mouse deltas
private readonly TimeSpan _wheelDeltaThrottleInterval = TimeSpan.FromMilliseconds(8);
// interval to wait until scroll is complete
private readonly TimeSpan _wheelDeltaCompleteInterval = TimeSpan.FromMilliseconds(600);
// create smart wheel handler
IObservable<PointerPoint> pointerWheelObservable =
System.Reactive.Linq.Observable
.FromEventPattern<PointerEventHandler, PointerRoutedEventArgs>(
handler => _control.PointerWheelChanged += handler,
handler => _control.PointerWheelChanged -= handler)
.Select(eventPattern =>
{
PointerRoutedEventArgs e = eventPattern.EventArgs;
PointerPoint mousePosition = e.GetCurrentPoint(_control);
return mousePosition;
})
.Where(mousePosition => Math.Abs(mousePosition.Properties.MouseWheelDelta) > MouseWheelDeltaThreshold);
// subscribe to wheel changes
pointerWheelObservable
.Throttle(_wheelDeltaThrottleInterval)
.ObserveOnDispatcher()
.Subscribe(
OnPointerWheelChanged,
Logger.TrackException);
pointerWheelObservable
.Throttle(_wheelDeltaCompleteInterval)
.Subscribe(
OnPointerWheelCompleted,
Logger.TrackException);
EDIT2: The GestureRecognizer class does not help
See this great blog post about handling manipulations in Windows 8:
http://blogs.msdn.com/b/windowsappdev/archive/2012/07/02/modernizing-input-in-windows-8.aspx
Unfortunately, after my experiments I see that GestureRecognizer cannot detect that the flood of mouse wheel events is over. It fires the ManipulationCompleted event after each call of .ProcessMouseWheelEvent().
You can use the Reactive Extensions library and throttle the wheel changed event; that way you always get the last notification for the specified throttle period.
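The idea behind throttling here is "treat the scroll as finished once no event has arrived for some quiet interval". The sketch below illustrates just that idea offline, in Python rather than Rx: the function name split_into_scrolls and the timestamps are hypothetical, and the 600 ms quiet interval is borrowed from the hack in the question.

```python
from typing import List


def split_into_scrolls(timestamps_ms: List[float], quiet_ms: float) -> List[List[float]]:
    """Group event timestamps into bursts separated by gaps longer than quiet_ms."""
    scrolls: List[List[float]] = []
    for t in timestamps_ms:
        if scrolls and t - scrolls[-1][-1] <= quiet_ms:
            # Still inside the current burst: the quiet interval has not elapsed.
            scrolls[-1].append(t)
        else:
            # First event, or the gap was long enough: a new scroll starts.
            scrolls.append([t])
    return scrolls


# Hypothetical event times in milliseconds: one flick, a pause, then a second flick.
events = [0, 8, 16, 24, 40, 1000, 1008, 1016]
scrolls = split_into_scrolls(events, quiet_ms=600)

# A debounced "scroll completed" notification would fire quiet_ms after the last event of each burst.
completed_at = [burst[-1] + 600 for burst in scrolls]
print(scrolls, completed_at)
```

The trade-off is the same as in the question's hack: a shorter quiet interval reacts sooner but risks splitting one long flick into several "scrolls".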
Use GestureRecognizer for better low-level detection of manipulations, including the mouse wheel.
All inputs (mouse, touch, pen, etc.) are covered here and are supported better than by the traditional manipulation events (which don't support single-touch rotation, the mouse wheel, etc.).
http://code.msdn.microsoft.com/windowsapps/Input-Windows-8-gestures-62c6689b#content
This is much more efficient, flexible and safer than implementing everything from scratch.