Plotting Charts for monthly counts per company - pandas

I want to create a program that produces bar charts or CSV files of monthly counts per company. For example, there should be a graph for January with all the companies on the x axis and the counts on the y axis.
I am able to split my date into month and year, and I want that to be the heading. My df table currently looks like this:
Date Modified Company
2019-01 Apple 113 0.0
Blackberry 66 0.0
LG 73 0.0
Linux 115 0.0
Microsoft 187 0.0
Panasonic 336 0.0
Samsung 151 0.0
2019-02 Apple 151 0.0
Blackberry 163 0.0
LG 301 0.0
Linux 108 0.0
Microsoft 199 0.0
Panasonic 142 0.0
Samsung 304 0.0
2019-03 Apple 358 0.0
Blackberry 230 0.0
LG 288 0.0
Linux 464 0.0
Microsoft 53 0.0
Panasonic 113 0.0
Samsung 177 0.0
import pandas as pd

df = pd.read_csv("Sample_Data.csv")
df['Date Modified'] = pd.to_datetime(df['Date']).dt.to_period('M')
df = df.groupby(["Date Modified", "Company"]).sum()
print(df)
There is currently nothing faulty with this program. I want to create monthly graphs with every company listed on the x axis and the count on the y axis, with a title containing the month and year, e.g. 2019-03 or 2019-02.

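import matplotlib.pyplot as plt

# One bar chart per month, titled with the month (e.g. 2019-03)
months = df.index.levels[0]
for month in months:
    data = df.loc[month]
    data.plot(kind='bar', align='center', title=str(month), legend=True)
plt.show()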
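Since the question also asks for CSV output, here is a minimal sketch of splitting the grouped frame into one table per month. The helper name split_by_month, the "Count" column, and the sample rows are assumptions made for illustration; the real Sample_Data.csv is not shown in the question.

```python
import pandas as pd

def split_by_month(df, date_col="Date", company_col="Company"):
    """Return {"YYYY-MM": per-company totals} for every month in df."""
    out = df.copy()
    out["Date Modified"] = pd.to_datetime(out[date_col]).dt.to_period("M")
    # numeric_only=True keeps the raw date strings out of the sum
    counts = out.groupby(["Date Modified", company_col]).sum(numeric_only=True)
    # One sub-table per month, indexed by company only
    return {
        str(month): table.droplevel(0)
        for month, table in counts.groupby(level=0)
    }

# Hypothetical sample data standing in for Sample_Data.csv
sample = pd.DataFrame({
    "Date": ["2019-01-03", "2019-01-20", "2019-02-11", "2019-02-15"],
    "Company": ["Apple", "LG", "Apple", "Apple"],
    "Count": [3, 5, 2, 4],
})
tables = split_by_month(sample)
```

Each entry can then be written out with table.to_csv(month + ".csv") or drawn with table.plot(kind="bar", title=month), depending on which output is wanted.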


Creating a Lookup Matrix in Microsoft Access

I have the matrix below in Excel and want to import it into Access (2016) to then use in queries. The aim is to be able to look up values based on the row and column. E.g., lookup criteria of 10 and 117 should return 98.1.
Is this possible? I'm an Access novice and don't know where to start.
.      10     9     8     7     6     5     4     3     2    1    0
120  100.0  96.8  92.6  86.7  78.8  68.2  54.4  37.5  21.3  8.3  0.0
119   99.4  96.2  92.0  86.2  78.5  67.9  54.3  37.5  21.3  8.3  0.0
118   98.7  95.6  91.5  85.8  78.1  67.7  54.1  37.4  21.2  8.3  0.0
117   98.1  95.1  90.9  85.3  77.8  67.4  54.0  37.4  21.2  8.3  0.0
116   97.4  94.5  90.3  84.8  77.4  67.1  53.8  37.4  21.1  8.3  0.0
115   96.8  93.9  89.8  84.4  77.1  66.9  53.7  37.3  21.1  8.3  0.0
Consider creating a table (named LookupData in the examples below) with 3 columns to store this data, one record per cell of the matrix:
Value1 - numeric
Value2 - numeric
LookupValue - currency
You can then use DLookup to get the value required:
?DLookup("LookupValue","LookupData","Value1=117 AND Value2=10")
If you have the values stored in variables, then you need to concatenate them in:
lngValue1=117
lngValue2=10
Debug.Print DLookup("LookupValue","LookupData","Value1=" & lngValue1 & " AND Value2=" & lngValue2)
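Getting from the Excel matrix to that three-column layout means unpivoting it so that each cell becomes one record. One possible way to prepare the data before importing is sketched below in Python with pandas; this is outside Access and is only an illustration. The frame holds just two rows and three columns of the matrix, and the lookup helper is hypothetical.

```python
import pandas as pd

# A small excerpt of the matrix: rows 120 and 117, columns 10, 9 and 8
matrix = pd.DataFrame(
    {10: [100.0, 98.1], 9: [96.8, 95.1], 8: [92.6, 90.9]},
    index=pd.Index([120, 117], name="Value1"),
)

# Unpivot: one record per (row label, column label, cell value)
long_form = matrix.reset_index().melt(
    id_vars="Value1", var_name="Value2", value_name="LookupValue"
)
long_form["Value2"] = long_form["Value2"].astype(int)

def lookup(value1, value2):
    """Hypothetical helper mirroring the DLookup criteria above."""
    match = long_form[
        (long_form["Value1"] == value1) & (long_form["Value2"] == value2)
    ]
    return float(match["LookupValue"].iloc[0])
```

Saving long_form with to_csv would give a file whose columns line up with the Value1, Value2 and LookupValue fields described above.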

Python - Unable to get all text from webpage with embedded scripts - Selenium, ChromeDriveManager, BS, requests_html

I am stuck on the following: I want to check the site below in an automated way to see whether new packages/shares are available (essentially, new boxes with a buy option):
https://staking.pocketfives.com/staking/market-place
Given that you don't need to log in, it should be okay to scrape, as far as I know.
So far I have read a dozen questions and tried all the methods in the title, however I was unable to make it work. When I use Beautifulsoup or requests_html it just gives me back only the title and one other line of text, not the text in the boxes that I need. The closest I got is this:
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import pyperclip
from webdriver_manager.chrome import ChromeDriverManager
import time
import random
link='https://staking.pocketfives.com/staking/market-place'
CD=ChromeDriverManager().install()
driver = webdriver.Chrome(CD)
driver.get(link)
element=driver.find_element_by_tag_name('body')
time.sleep(random.uniform(6, 12))
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(random.uniform(3, 6))
driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
time.sleep(random.uniform(3, 6))
element.send_keys(Keys.CONTROL,'a')
element.send_keys(Keys.CONTROL,'c')
driver.quit()
alltext=pyperclip.paste()
print(alltext)
I then check every so many minutes in a loop and compare the text from iteration x+1 to iteration x to see if anything has changed.
However this has one big problem, even when scrolling through the website, this does not copy all text! Just the last part of the site.
This is obviously not the most elegant way either, so I am very open to other solutions. I tried making it work with other packages, I just can't seem to run the dynamic scripts on the site correctly, even using render etc. I went into the HTML code but I have trouble understanding it and it consists of a lot of scripts.
Help is much appreciated!
I'm not really sure what you are after, but the site does have an API. Does this get you what you want?
import requests
import pandas as pd
url = 'https://api2.pocketfives.com/FrontOfficeStake/GetStakeListRequest'
payload= {
'currentPage': '1',
'pageSize': '9999'}
jsonData = requests.post(url, json=payload).json()
df = pd.DataFrame(jsonData['stakeList'])
Output:
print(df.to_string())
stakeId sellerId sellerName sellerImageUrl venueId venueName venueImageUrl tournamentId startTime title buyIn guarantee markup offeredPercent offeredPrice thresholdPercent thresholdPrice purchaseCapPercent purchaseCapPrice soldPercent soldPrice availablePercent availablePrice note hasPassword myTransactionSummary replyCount
0 108 481 RobinPoker //cdn.pocketfives.com/v72/monthly_2021_10/6A79114C-C8C4-4EB1-8476-D1E9E01B1D83.thumb.jpeg.f629f2b49f04b8c9d2c7c272e04cd973.jpeg 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1234 2021-11-01T18:00:00 Event #61: Deepstack Championship No-Limit Hold'em 600.0 0.0 1.3 50.0 390.00 0.0 0.0 5.00 39.00 50.000 390.00 0.000 0.0 <p><span style="background-color: rgb(246, 248, 249);">*You are purchasing action for FIRST BULLET ONLY (if player get knocked out and re-enters, you DO NOT have action.)</span></p> False None 0
1 39 480 Pamsi //cdn.pocketfives.com/v72/monthly_2021_09/pam_gg.thumb.jpg.ce2c1f50b5bc3c805fa4545ab9a3227d.jpg 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1238 2021-11-03T18:00:00 Event #65: MINI Main Event No-Limit Hold’em (freezeout) 1000.0 0.0 1.1 15.0 165.00 0.0 0.0 1.00 11.00 15.000 165.00 0.000 0.0 <p>Pamela Balzano</p> False None 0
2 42 23 JonathanLittle //cdn.pocketfives.com/v72/monthly_2021_09/1004.thumb.JPG.10a21649c3a3cfef8d0ebc5d6f133ec9.JPG 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1240 2021-11-04T18:00:00 Event #67A: MAIN EVENT No-Limit Hold'em World Championship - Day 1A 10000.0 0.0 1.0 10.0 1000.00 0.0 0.0 0.10 10.00 10.000 1000.00 0.000 0.0 None False None 0
3 109 481 RobinPoker //cdn.pocketfives.com/v72/monthly_2021_10/6A79114C-C8C4-4EB1-8476-D1E9E01B1D83.thumb.jpeg.f629f2b49f04b8c9d2c7c272e04cd973.jpeg 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1244 2021-11-08T20:00:00 Event #68A: LITTLE ONE FOR ONE DROP No-Limit Hold’em 1111.0 0.0 1.2 50.0 666.60 0.0 0.0 5.00 66.66 50.000 666.60 0.000 0.0 <p><span style="background-color: rgb(246, 248, 249);">*You are purchasing action for FIRST BULLET ONLY (if player get knocked out and re-enters, you DO NOT have action.)</span></p> False None 0
4 40 480 Pamsi //cdn.pocketfives.com/v72/monthly_2021_09/pam_gg.thumb.jpg.ce2c1f50b5bc3c805fa4545ab9a3227d.jpg 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1244 2021-11-08T20:00:00 Event #68A: LITTLE ONE FOR ONE DROP No-Limit Hold’em 1111.0 0.0 1.1 15.0 183.32 0.0 0.0 1.00 12.22 15.000 183.32 0.000 0.0 <p>Pamela Balzano</p> False None 0
5 41 480 Pamsi //cdn.pocketfives.com/v72/monthly_2021_09/pam_gg.thumb.jpg.ce2c1f50b5bc3c805fa4545ab9a3227d.jpg 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1248 2021-11-11T20:00:00 Event #70A: CRAZY EIGHTS No-Limit Hold'em 8-Handed 888.0 0.0 1.1 15.0 146.52 0.0 0.0 1.00 9.77 15.000 146.52 0.000 0.0 <p>Pamela Balzano</p> False None 0
6 59 23 JonathanLittle //cdn.pocketfives.com/v72/monthly_2021_09/1004.thumb.JPG.10a21649c3a3cfef8d0ebc5d6f133ec9.JPG 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1257 2021-11-15T23:00:00 Event #76: Super Turbo Bounty No-Limit Hold'em (freezeout) 10000.0 0.0 1.0 10.0 1000.00 0.0 0.0 0.10 10.00 10.000 1000.00 0.000 0.0 <p>*You are purchasing action for FIRST BULLET ONLY (if player gets knocked out and re-enters, you DO NOT have action.)</p><p><br></p><blockquote><br></blockquote><p><br></p> False None 0
7 34 56 Daniel Negreanu //cdn.pocketfives.com/v72/monthly_2021_09/239657270_ScreenShot2021-09-10at10_04_11AM.thumb.png.fdefefe1ee4ccfa9148e72b34c59bdf0.png 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1263 2021-11-18T23:00:00 Event #82: Super High Roller No-Limit Hold'em 250000.0 0.0 1.0 25.0 62500.00 0.0 0.0 0.20 500.00 25.000 62500.00 0.000 0.0 <p>*You are purchasing action for FIRST BULLET ONLY (if player get knocked out and re-enters, you DO NOT have action.</p><p><br></p><p>Are you guys as ready for the WSOP as I am? This is my first public offering for a piece of me in some WSOP action! Ive opened up 25% of of my action at NO MARKUP! Lets make some money and have some fun!</p> False None 0
8 43 23 JonathanLittle //cdn.pocketfives.com/v72/monthly_2021_09/1004.thumb.JPG.10a21649c3a3cfef8d0ebc5d6f133ec9.JPG 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1267 2021-11-20T23:00:00 Event #85: HIGH ROLLER No-Limit Hold’em 50000.0 0.0 1.0 20.0 10000.00 0.0 0.0 0.20 100.00 15.700 7850.00 4.300 2150.0 None False None 0
9 52 23 JonathanLittle //cdn.pocketfives.com/v72/monthly_2021_09/1004.thumb.JPG.10a21649c3a3cfef8d0ebc5d6f133ec9.JPG 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1269 2021-11-21T23:00:00 Event #87: High Roller No-Limit Hold’em 100000.0 0.0 1.0 70.0 70000.00 69.0 69000.0 0.00 0.00 64.852 64852.00 5.148 5148.0 <p>Please be aware that if this package does not sell out in its entirety, all investors will get a full refund and the package is canceled.</p> False None 0
10 36 56 Daniel Negreanu //cdn.pocketfives.com/v72/monthly_2021_09/239657270_ScreenShot2021-09-10at10_04_11AM.thumb.png.fdefefe1ee4ccfa9148e72b34c59bdf0.png 1 WSOP https://cdn.pocketfives.com/staking/common/venue/wsop.png 1269 2021-11-21T23:00:00 Event #87: High Roller No-Limit Hold’em 100000.0 0.0 1.0 25.0 25000.00 0.0 0.0 0.25 250.00 25.000 25000.00 0.000 0.0 <p><span style="background-color: rgb(246, 248, 249);">*You are purchasing action for FIRST BULLET ONLY (if player get knocked out and re-enters, you DO NOT have action.</span></p> False None 0
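To detect new packages between polls, comparing stakeId values may be more reliable than comparing copied page text. The following is a minimal sketch under that assumption; find_new_stakes is a hypothetical helper, and the two polls are hand-written stand-ins for the stakeList returned by the request above.

```python
def find_new_stakes(seen_ids, stake_list):
    """Return stakes whose stakeId has not been seen; update seen_ids in place."""
    new = [stake for stake in stake_list if stake["stakeId"] not in seen_ids]
    seen_ids.update(stake["stakeId"] for stake in new)
    return new

# Hypothetical results of two consecutive polls
first_poll = [
    {"stakeId": 108, "title": "Event #61"},
    {"stakeId": 39, "title": "Event #65"},
]
second_poll = first_poll + [{"stakeId": 42, "title": "Event #67A"}]

seen = set()
initial = find_new_stakes(seen, first_poll)    # everything is new on the first poll
added = find_new_stakes(seen, second_poll)     # only the stake that appeared since
```

In a real loop, stake_list would be jsonData['stakeList'] fetched every few minutes, with time.sleep between requests.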

python pandas find percentile for a group in column

I would like to find the percentile rank of the values in each column, add it to the data frame df, and also add a label:
if the value of the column is
in the top 20 percent (value > 80th percentile) then 'strong'
in the bottom 20 percent (value < 20th percentile) then 'weak'
else 'average'
Below is my dataframe
df = pd.DataFrame({
    'month': ['1','1','1','1','1','2','2','2','2','2','2','2'],
    'X1': [30,42,25,32,12,10,4,6,5,10,24,21],
    'X2': [10,76,100,23,65,94,67,24,67,54,87,81],
    'X3': [23,78,95,52,60,76,68,92,34,76,34,12]})
df
Below is what I tried:
df['X1_percentile'] = df.X1.rank(pct=True)
df['X1_segment'] = np.where(df['X1_percentile'] > 0.8, 'Strong',
                            np.where(df['X1_percentile'] < 0.20, 'Weak', 'Average'))
But I would like to do this for each month and for each column. If possible, could this be automated with a function that works for any number of columns and names the new columns colname+"_per" and colname+"_segment"?
Thanks
We can use groupby + rank with the optional parameter pct=True to calculate the ranking expressed as a percentile rank within each month, then use np.select to bin the percentile values into discrete labels.
p = df.groupby('month').rank(pct=True)
df[p.columns + '_per'] = p
df[p.columns + '_seg'] = np.select([p.gt(.8), p.lt(.2)], ['strong', 'weak'], 'average')
month X1 X2 X3 X1_per X2_per X3_per X1_seg X2_seg X3_seg
0 1 30 10 23 0.600000 0.200000 0.200000 average average average
1 1 42 76 78 1.000000 0.800000 0.800000 strong average average
2 1 25 100 95 0.400000 1.000000 1.000000 average strong strong
3 1 32 23 52 0.800000 0.400000 0.400000 average average average
4 1 12 65 60 0.200000 0.600000 0.600000 average average average
5 2 10 94 76 0.642857 1.000000 0.785714 average strong average
6 2 4 67 68 0.142857 0.500000 0.571429 weak average average
7 2 6 24 92 0.428571 0.142857 1.000000 average weak strong
8 2 5 67 34 0.285714 0.500000 0.357143 average average average
9 2 10 54 76 0.642857 0.285714 0.785714 average average average
10 2 24 87 34 1.000000 0.857143 0.357143 strong strong average
11 2 21 81 12 0.857143 0.714286 0.142857 strong average weak
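The question also asks for a reusable function. Below is a minimal sketch that wraps the same groupby + rank + np.select idea; the function name, its parameters, and the default thresholds are assumptions for illustration, and the suffixes follow the "_per" and "_segment" naming requested in the question.

```python
import numpy as np
import pandas as pd

def add_percentile_segments(df, group_col, value_cols=None, upper=0.8, lower=0.2):
    """Add <col>_per and <col>_segment columns, ranked within each group."""
    out = df.copy()
    if value_cols is None:
        # Default: every column except the grouping column
        value_cols = [c for c in out.columns if c != group_col]
    pct = out.groupby(group_col)[value_cols].rank(pct=True)
    for col in value_cols:
        out[col + "_per"] = pct[col]
        out[col + "_segment"] = np.select(
            [pct[col] > upper, pct[col] < lower],
            ["strong", "weak"],
            default="average",
        )
    return out

df = pd.DataFrame({
    "month": ["1", "1", "1", "1", "1", "2", "2", "2", "2", "2", "2", "2"],
    "X1": [30, 42, 25, 32, 12, 10, 4, 6, 5, 10, 24, 21],
    "X2": [10, 76, 100, 23, 65, 94, 67, 24, 67, 54, 87, 81],
    "X3": [23, 78, 95, 52, 60, 76, 68, 92, 34, 76, 34, 12],
})
result = add_percentile_segments(df, "month")
```

Passing value_cols=["X1"] would restrict the calculation to a single column, and the original frame is left unchanged because the function works on a copy.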

Add column for percentages

I have a df who looks like this:
Total Initial Follow Sched Supp Any
0 5525 3663 968 296 65 533
I transposed the df because I have to add a column with percentages based on the 'Total' value.
Now my df looks like this:
0
Total 5525
Initial 3663
Follow 968
Sched 296
Supp 65
Any 533
So, How can I add this percentage column?
The expected output looks like this
0 Percentage
Total 5525 100
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
Do you know how can I add this new column?
I'm working in jupyterlab with pandas and numpy
Divide column 0 by the scalar Total value with Series.div, then multiply by 100 with Series.mul, and finally round with Series.round:
df['Percentage'] = df[0].div(df.loc['Total', 0]).mul(100).round(1)
print (df)
0 Percentage
Total 5525 100.0
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
Consider below df:
In [1328]: df
Out[1328]:
b
a
Total 5525
Initial 3663
Follow 968
Sched 296
Supp 65
Any 533
In [1327]: df['Perc'] = round(df.b.div(df.loc['Total', 'b']) * 100, 1)
In [1330]: df
Out[1330]:
b Perc
a
Total 5525 100.0
Initial 3663 66.3
Follow 968 17.5
Sched 296 5.4
Supp 65 1.2
Any 533 9.6
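For completeness, the whole flow from the original one-row frame can be sketched end to end. The variable names wide and tall are assumptions; the numbers are the ones from the question.

```python
import pandas as pd

# The original one-row frame from the question
wide = pd.DataFrame({
    "Total": [5525], "Initial": [3663], "Follow": [968],
    "Sched": [296], "Supp": [65], "Any": [533],
})

# Transpose so each category becomes a row; the single column is labelled 0
tall = wide.T

# Share of Total, as a percentage rounded to one decimal place
tall["Percentage"] = tall[0].div(tall.loc["Total", 0]).mul(100).round(1)
```

Here tall holds the same rows as the transposed frame in the question, with the Percentage column added alongside column 0.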

trying to add text at the end of a line at the first occurrence of a variable in a file

I've attached the text file I'm working with below.
Using sed I can add the text, but it is added to every line that contains my variable, which happens to be "33" in this case; other columns in the file can hold the same value as my variable. I would like the text "SLOTS" added only to the first matching line in the file, stopping once that first match is encountered. I've searched numerous websites but haven't had any luck with awk or other sed examples. Any help would be greatly appreciated.
Here's my current sed line:
sed '/\b'$slot_drill'\b/s/$/SLOTS/' $slot_card > $new_slot_card
what my current sed line outputs:
d25104-1.dr -- PANEL SIZE: 18x24
Drilled Slots: Yes
Tool Tool Spindle Feed Hits Max Bits Path Time
Size Speed Rate Hits (Min)
T01 126 1.0 1.0 58 1600 0.0 182.8 1.7
T02 250 1.0 1.0 9 1600 0.0 67.5 0.7
T03 12 1.0 1.0 3965 1600 2.5 514.4 4.6
T04 31 1.0 1.0 65 1600 0.0 62.0 0.6
T05 33 1.0 1.0 569 1600 0.4 46.6 0.4 SLOTS
T06 35 1.0 1.0 33 1600 0.0 45.3 0.4 SLOTS
T07 41 1.0 1.0 97 1600 0.1 79.6 0.7
T08 42 1.0 1.0 135 1600 0.1 104.6 0.9
T09 43 1.0 1.0 33 1600 0.0 53.0 0.5 SLOTS
T10 49.2 1.0 1.0 65 1600 0.0 44.2 0.4
T11 52 1.0 1.0 17 1600 0.0 40.7 0.4
T12 63.5 1.0 1.0 33 1600 0.0 55.6 0.5 SLOTS
T13 98 1.0 1.0 25 1600 0.0 52.0 0.5
Total 5104 3.2 1348.2 12.5
output that is desired:
d25104-1.dr -- PANEL SIZE: 18x24
Drilled Slots: Yes
Tool Tool Spindle Feed Hits Max Bits Path Time
Size Speed Rate Hits (Min)
T01 126 1.0 1.0 58 1600 0.0 182.8 1.7
T02 250 1.0 1.0 9 1600 0.0 67.5 0.7
T03 12 1.0 1.0 3965 1600 2.5 514.4 4.6
T04 31 1.0 1.0 65 1600 0.0 62.0 0.6
T05 33 1.0 1.0 569 1600 0.4 46.6 0.4 SLOTS
T06 35 1.0 1.0 33 1600 0.0 45.3 0.4
T07 41 1.0 1.0 97 1600 0.1 79.6 0.7
T08 42 1.0 1.0 135 1600 0.1 104.6 0.9
T09 43 1.0 1.0 33 1600 0.0 53.0 0.5
T10 49.2 1.0 1.0 65 1600 0.0 44.2 0.4
T11 52 1.0 1.0 17 1600 0.0 40.7 0.4
T12 63.5 1.0 1.0 33 1600 0.0 55.6 0.5
T13 98 1.0 1.0 25 1600 0.0 52.0 0.5
Total 5104 3.2 1348.2 12.5
It is difficult to tell what you are asking. A good minimal example would probably omit the shell variables and make the line you are working with completely independent, so that other people can run it on their own systems.
However, it sounds as if you are just asking how to limit sed's s command to the first line it finds that matches. In general, I think the way is to prefix an address range to the command and use 0 for the beginning of the range (the 0,/regex/ form is a GNU sed extension). For example: sed '0,/foo/ s/foo/bar/' will replace only the first instance of foo that it finds with bar.
This is an answer to this sort of problem given at https://unix.stackexchange.com/questions/188264/want-to-substitute-only-first-occurence-with-sed
awk to the rescue!
on a simplified file
$ cat file
1
2
3
1
2
3
append SLOT to the first occurrence of the variable slot, let's say 3.
$ awk -v value="$slot" '$1==value && !c++{$0=$0 "SLOT"} 1' file
1
2
3SLOT
1
2
3
Looking at your example, you'll need to use $2==value instead.
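If a scripting language is an option, the same "first match only" logic can be sketched in Python. This is an alternative to the sed and awk answers, not a translation of either; the function name, the default suffix, and the shortened sample lines are assumptions for illustration.

```python
def append_to_first_match(lines, value, suffix=" SLOTS", field=1):
    """Append suffix to the first line whose whitespace-separated
    field at index `field` equals value; other lines are unchanged."""
    done = False
    result = []
    for line in lines:
        parts = line.split()
        if not done and len(parts) > field and parts[field] == value:
            line = line + suffix
            done = True  # stop matching after the first hit
        result.append(line)
    return result

# Shortened stand-ins for the drill report rows (no trailing newlines)
report = [
    "T04 31 1.0 1.0 65",
    "T05 33 1.0 1.0 569",
    "T06 35 1.0 1.0 33",
    "T09 43 1.0 1.0 33",
]
updated = append_to_first_match(report, "33")
```

Matching on the second field (index 1) plays the same role as $2==value in the awk answer: a 33 in the Hits column is ignored.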