Python Dash plotly update table - pandas

I have a pandas DataFrame called interesttable, which is updated over time (every few seconds). I am using Dash (Plotly) to display the dataframe. Although I can successfully display the dataframe in Dash, I cannot get Dash to update with the new rows added to the dataframe. I tried the following, but it doesn't work. Thank you for your feedback!
def generate_table():
    return html.Table(
        # Header
        [html.Tr([html.Th(col) for col in interesttable.columns])] +
        # Body
        [html.Tr([html.Td(interesttable.iloc[i][col]) for col in interesttable.columns])
         for i in range(min(len(interesttable), 50))]
    )
app = dash.Dash()
app.layout = html.Div(children=[
    html.H1(children='Interest Table'),
    dcc.Interval(id='generate_table()', interval=1*1000),
    generate_table()
])
app.callback(Output('generate_table()', 'children'), [Input('interesttable', 'n_intervals')])

if __name__ == '__main__':
    app.run_server(debug=True)
Unfortunately, the error I receive is:
Here is a list of the available properties in "generate_table()":
['id','interval','disabled','n_intervals','max_intervals']

Isn't app.callback supposed to be used as a decorator?
Try:
@app.callback(Output('generate_table()', 'children'), [Input('interesttable', 'n_intervals')])
I also don't see n_intervals declared anywhere.
You should add it:
app.layout = html.Div(children=[html.H1(children='Interest Table'),
                                dcc.Interval(id='generate_table()', interval=1000, n_intervals=0),
                                generate_table()])

Use pandas df.concat to replace .append with custom index

I'm currently trying to replace .append in my code, since it won't be supported in the future, and I'm having some trouble with the custom index I'm using.
I read the names of all the .shp files in a directory and extract a date from each one.
To make the link with an Excel file I have, I use the name I extract from the title of the file.
df = pd.DataFrame(columns=['date', 'fichier'])
for i in glob.glob("*.shp"):
    nom_parcelle = i.split("_")[2]
    if not nom_parcelle in df.index:
        # print(df.last_valid_index())
        date_recolte = i.split("_")[-1]
        new_row = pd.Series(data={'date': date_recolte.split(".")[0], 'fichier': i}, name=nom_parcelle)
        df = df.append(new_row, ignore_index=False)
This works exactly as I want it to.
Sadly, I can't find a way to replace it with .concat.
I looked for ways to keep the index with concat, but didn't find anything that worked as I intended.
Did I miss anything?
Try the approach below with pandas.concat based on your code :
import glob
import pandas as pd

df = pd.DataFrame(columns=['date', 'fichier'])
dico_dfs = {}

for i in glob.glob("*.shp"):
    nom_parcelle = i.split("_")[2]
    if not nom_parcelle in df.index:
        # print(df.last_valid_index())
        date_recolte = i.split("_")[-1]
        new_row = pd.Series(data={'date': date_recolte.split(".")[0], 'fichier': i}, name=nom_parcelle)
        dico_dfs[i] = new_row.to_frame()

df = pd.concat(dico_dfs, ignore_index=False, axis=1).T.droplevel(0)
# Output:
print(df)

          date                 fichier
nom1  20220101  a_xx_nom1_20220101.shp
nom2  20220102  b_yy_nom2_20220102.shp
nom3  20220103  c_zz_nom3_20220103.shp
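If you only need the final frame, another common pattern is to collect one-row frames in a list and call pd.concat once at the end, which keeps the custom index without the transpose/droplevel gymnastics. A sketch with made-up filenames standing in for the glob result:

```python
import pandas as pd

# Stand-in for the glob.glob("*.shp") result
fichiers = ["a_xx_nom1_20220101.shp", "b_yy_nom2_20220102.shp"]

rows = []
seen = set()
for i in fichiers:
    nom_parcelle = i.split("_")[2]
    if nom_parcelle not in seen:
        seen.add(nom_parcelle)
        date_recolte = i.split("_")[-1]
        # One-row DataFrame whose index carries the custom name
        rows.append(pd.DataFrame({'date': [date_recolte.split(".")[0]], 'fichier': [i]},
                                 index=[nom_parcelle]))

df = pd.concat(rows)  # one concat instead of repeated .append
print(df)
```

Because each one-row frame already carries its index label, `pd.concat(rows)` preserves the names exactly the way `.append(new_row, ignore_index=False)` did.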

Webscraping several URLs into panda df

I need some help appending several web-scraping results to a pandas df.
Currently I'm only getting the output from one of the URLs into the df.
I left out the URLs; if you need them I will supply them to you.
## libs
import bs4
import requests
import re
from time import sleep
import pandas as pd
from bs4 import BeautifulSoup as bs

## webscraping targets
URLs = ["URL1", "URL2", "URL3"]

## Get columns
column_list = []
r1 = requests.get(URLs[0])
soup1 = bs(r1.content)
data1 = soup1.find_all('dl', attrs={"class": "border XSText rightAlignText noMarginTop highlightOnHover thickBorderBottom noTopBorder"})
columns = soup1.find_all('dt')
for col in columns:
    column_list.append(col.text.strip())  # strip() removes extra space from the text

## Get values
value_list = []
for url in URLs:
    r1 = requests.get(url)
    soup1 = bs(r1.content)
    data1 = soup1.find_all('dl', attrs={"class": "border XSText rightAlignText noMarginTop highlightOnHover thickBorderBottom noTopBorder"})
    values = soup1.find_all('dd')
    for val in values:
        value_list.append(val.text.strip())

df = pd.DataFrame(list(zip(column_list, value_list)))
df.transpose()
Current output, only showing the results of one URL:
Expected output:
The problem here is with your zip function. It will only zip values up to the length of the shorter list, in this case column_list, leaving all the other values unused.
If you want to append the other values to the dataframe as well, you will have to iterate over them. So change the last two lines of your code to this and it should work:
result = [[i] for i in column_list]
for i, a in enumerate(value_list):
    result[i % len(column_list)].extend([a])

df = pd.DataFrame(result)
df.transpose()
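The zip truncation and the modulo fix can be seen in isolation with plain lists (toy values, not the real scraped data):

```python
columns = ['name', 'price']
values = ['a', 1, 'b', 2, 'c', 3]  # three pages' worth of dd values

# zip stops at the shorter list, so only the first page survives:
truncated = list(zip(columns, values))
print(truncated)  # [('name', 'a'), ('price', 1)]

# Grouping by index modulo the column count keeps every page:
result = [[c] for c in columns]
for i, v in enumerate(values):
    result[i % len(columns)].append(v)
print(result)  # [['name', 'a', 'b', 'c'], ['price', 1, 2, 3]]
```

This works because the dd values arrive in the same column order on every page, so position i always belongs to column i % len(columns).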

Problems appending rows to DataFrame. ZMQ messages to Pandas Dataframe

I am taking market-data messages from a ZMQ subscription and turning them into a pandas dataframe.
I tried creating an empty dataframe and appending rows to it. It did not work out; I keep getting this error:
RuntimeWarning: '<' not supported between instances of 'str' and 'int', sort
order is undefined for incomparable objects
result = result.union(other)
I'm guessing this is because I'm appending a list of strings to a dataframe. I clear the list, then try to append the next row. The data is 9 fields: the first is a string and the other 8 are all floats.
list_heartbeat = []
list_fills = []
market_data_bb = []
market_data_fs = []
abacus_fs = []
abacus_bb = []

df_bar_data_bb = pd.DataFrame(columns=['Ticker', 'Start_Time_Intervl', 'Interval_Length', 'Current_Open_Price',
                                       'Previous_Open', 'Previous_Low', 'Previous_High', 'Previous_Close', 'Message_ID'])

def main():
    context = zmq.Context()
    socket_sub1 = context.socket(zmq.SUB)
    socket_sub2 = context.socket(zmq.SUB)
    socket_sub3 = context.socket(zmq.SUB)

    print('Opening Socket...')
    # We can connect to several endpoints if we desire, and receive from all.
    print('Connecting to Nicks BroadCast...')
    socket_sub1.connect("Server:port")
    socket_sub2.connect("Server:port")
    socket_sub3.connect("Server:port")
    print('Connected To Nicks BroadCast... Waiting For Messages.')
    print('Connected To Jasons Two BroadCasts... Waiting for Messages.')

    # socket_sub1.setsockopt_string(zmq.SUBSCRIBE, 'H')
    socket_sub1.setsockopt_string(zmq.SUBSCRIBE, 'R')
    # socket_sub1.setsockopt_string(zmq.SUBSCRIBE, 'HEARTBEAT')  # possible heartbeat from Jason
    socket_sub2.setsockopt_string(zmq.SUBSCRIBE, 'BAR_FS')
    socket_sub2.setsockopt_string(zmq.SUBSCRIBE, 'HEARTBEAT')
    socket_sub2.setsockopt_string(zmq.SUBSCRIBE, 'BAR_BB')
    socket_sub3.setsockopt_string(zmq.SUBSCRIBE, 'ABA_FS')
    socket_sub3.setsockopt_string(zmq.SUBSCRIBE, 'ABA_BB')

    poller = zmq.Poller()
    poller.register(socket_sub1, zmq.POLLIN)
    poller.register(socket_sub2, zmq.POLLIN)
    poller.register(socket_sub3, zmq.POLLIN)

    while running:
        try:
            socks = dict(poller.poll())
        except KeyboardInterrupt:
            break
        # Check if the message is in socks; if so, save to message1-3 for future use.
        # Msg1 = heartbeat for Nicks server
        # Msg2 = fills
        # Msg3 = Mrkt Data split between FS and BB
        # msg4
        if socket_sub1 in socks:
            message1 = socket_sub1.recv_string()
            list_heartbeat.append(message1.split())
        if socket_sub2 in socks:
            message2 = socket_sub2.recv_string()
            message3 = socket_sub2.recv_string()
            if message2 == 'HEARTBEAT':
                print(message2)
                print(message3)
            if message2 == 'BAR_BB':
                message3_split = message3.split(";")
                message3_split = [e[3:] for e in message3_split]
                # print(message3_split)
                market_data_bb.append(message3_split)
                if len(market_data_bb) > 20:
                    # df_bar_data_bb = pd.DataFrame(market_data_bb, columns=['Ticker', 'Start_Time_Intervl', 'Interval_Length',
                    #                                                        'Current_Open_Price', 'Previous_Open', 'Previous_Low',
                    #                                                        'Previous_High', 'Previous_Close', 'Message_ID'])
                    # df_bar_data_bb.set_index('Start_Time_Intervl', inplace=True)
                    # ESA = df_bar_data_bb[df_bar_data_bb['Ticker'] == 'ESA Index'].copy()
                    # print(ESA)
                    # df_bar_data_bb.set_index('Start_Time_Intervl', inplace=True)
                    df_bar_data_bb.append(market_data_bb)
                    market_data_bb.clear()
                    print(df_bar_data_bb)
The very bottom line is what throws the error. I found a simple way around this that may or may not work: the commented-out lines above it create a dataframe, set the index, and make a copy of the dataframe. The only problem is that I get anywhere from 40-90 messages a second, and every time I get a new one it creates a new dataframe. I eventually have to create a live graph out of this, and I'm not exactly sure how I would do that. But that's another problem.
EDIT: I figured it out. Instead of adding the messages to a list, I simply convert each message to a pandas Series, reference my dataframe globally, and do df = df.append(message4, ignore_index=True).
I completely removed the need for lists.
if message2 == 'BAR_BB':
    message3_split = message3.split(";")
    message3_split = [e[3:] for e in message3_split]
    message4 = pd.Series(message3_split)
    global df_bar_data_bb1
    df_bar_data_bb1 = df_bar_data_bb1.append(message4, ignore_index=True)
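Note that DataFrame.append was removed in pandas 2.0, so the same EDIT can be written with pd.concat. A sketch with a made-up BAR_BB payload standing in for the real ZMQ message:

```python
import pandas as pd

df_bar_data_bb1 = pd.DataFrame()

# Made-up BAR_BB payload: one string field followed by eight numeric fields.
message3_split = ['ESA Index', '09:30', '1', '4500.25', '4499.0', '4498.5', '4501.0', '4500.0', '17']
message4 = pd.Series(message3_split)

# df.append(series, ignore_index=True) appended the Series as a row;
# the concat equivalent turns it into a one-row frame first.
df_bar_data_bb1 = pd.concat([df_bar_data_bb1, message4.to_frame().T], ignore_index=True)
print(df_bar_data_bb1)
```

At 40-90 messages a second it is also cheaper to buffer several rows and concat them in one call, since each concat copies the whole frame.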

Autocorrect a column in a pandas dataframe using pyenchant

I tried to apply the code from the accepted answer of this question to one of my dataframe columns, where each row is a sentence, but it didn't work.
My code looks like this:
from enchant.checker import SpellChecker

checker = SpellChecker("id_ID")
h = df['Jawaban'].astype(str).str.lower()
hayo = []
for text in h:
    checker.set_text(text)
    for s in checker:
        sug = s.suggest()[0]
        s.replace(sug)
    hayo.append(checker.get_text())
I got the following error:
IndexError: list index out of range
Any help is greatly appreciated.
I don't get the error using your code. The only thing I'm doing differently is how I set up the spell checker.
from enchant.checker import SpellChecker

checker = SpellChecker('en_US', 'en_UK')  # not using id_ID

# sample data
ds = pd.DataFrame({'text': ['here is a spllng mstke', 'the wrld is grwng']})

p = ds['text'].str.lower()
hayo = []
for text in p:
    checker.set_text(text)
    for s in checker:
        sug = s.suggest()[0]
        s.replace(sug)
    print(checker.get_text())
    hayo.append(checker.get_text())
print(hayo)
here is a spelling mistake
the world is growing

Python 3.6 Pandas Difflib Get_Close_Matches to filter a dataframe with user input

Using a csv imported into a pandas dataframe, I am trying to search one column of the df for entries similar to user-generated input. I have never used difflib before, and my tries have ended in a TypeError: object of type 'float' has no len() or an empty [] list.
import difflib
import pandas as pd

df = pd.read_csv("Vendorlist.csv", encoding="ISO-8859-1")
word = input("Enter a vendor: ")

def find_it(w):
    w = w.lower()
    return difflib.get_close_matches(w, df.vendorname, n=50, cutoff=.6)

alternatives = find_it(word)
print(alternatives)
The error seems to occur at return difflib.get_close_matches(w, df.vendorname, n=50, cutoff=.6).
I am attempting to get results similar to word from a column called 'vendorname'.
Help is greatly appreciated.
Your column vendorname is of the incorrect type.
Try this in your return statement:
return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)
import difflib
import pandas as pd

df = pd.read_csv("Vendorlist.csv", encoding="ISO-8859-1")
word = input("Enter a vendor: ")

def find_it(w):
    w = w.lower()
    return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)

alternatives = find_it(word)
print(alternatives)
As stated in the comments by @johnchase:
The question also mentions the return of an empty list. The return of get_close_matches is a list of matches; if no item matched within the cutoff, an empty list will be returned. – johnchase
I've skipped the astype(str) in
return difflib.get_close_matches(w, df.vendorname.astype(str), n=50, cutoff=.6)
and instead used dtype='string' in
df = pd.read_csv("Vendorlist.csv", encoding="ISO-8859-1", dtype='string')
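Both variants avoid the float error, which comes from empty cells being read as float NaN. A small self-contained sketch with an in-memory stand-in for Vendorlist.csv (the vendor names here are made up):

```python
import difflib
import io
import pandas as pd

# Stand-in for Vendorlist.csv: the empty vendorname cell would become a float
# NaN with default dtypes, which is what triggers
# "object of type 'float' has no len()" inside get_close_matches.
csv = "vendorname,code\nacme supplies,1\n,2\nacme supply co,3\n"

df = pd.read_csv(io.StringIO(csv), dtype='string')

# With dtype='string' the missing value is pd.NA, so drop it before matching.
matches = difflib.get_close_matches('acme supplies', df.vendorname.dropna(), n=50, cutoff=.6)
print(matches)
```

The difference between the two fixes: astype(str) converts NaN to the literal string 'nan' (which can then show up as a bogus match), while dtype='string' keeps missing values as pd.NA that you can drop explicitly.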