When reading HTML with pandas.read_html, how to select a dataframe and set_index in one line - pandas

I'm reading an HTML page, which returns a list of dataframes. I want to choose one dataframe from the list and set its index (index_col) in as few lines as possible.
Here is what I have right now:
import pandas as pd
df = pd.read_html('http://finviz.com/insidertrading.ashx?or=-10&tv=100000&tc=1&o=-transactionvalue', header=0)
df2 = df[4]  # assign df2 to dataframe #4 from the list of dataframes I read
df2.set_index('Date', inplace=True)
Is it possible to do all this in one line? Do I need to create another dataframe (df2) to hold one dataframe from the list, or can I select the dataframe as soon as I read the list of dataframes (df)?
Thanks.

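Yes. Index the list that read_html returns and chain set_index (without inplace, so that it returns the new dataframe):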
import pandas as pd
df = pd.read_html('http://finviz.com/insidertrading.ashx?or=-10&tv=100000&tc=1&o=-transactionvalue', header = 0)[4].set_index('Date')
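The one-liner works because read_html returns an ordinary Python list, so indexing it yields a dataframe on which set_index can be called directly. Here is a minimal sketch of that pattern; the `tables` list and its contents are made up and only stand in for what read_html would return, so the example runs without network access.

```python
import pandas as pd

# Stand-in for the list returned by pd.read_html (hypothetical data).
tables = [
    pd.DataFrame({"Note": ["some layout table"]}),
    pd.DataFrame({"Date": ["Mar 02", "Mar 03"],
                  "Ticker": ["AAA", "BBB"],
                  "Value": [100, 200]}),
]

# Pick one table by position and set its index in a single expression.
# set_index is called without inplace=True so that it returns the result.
df = tables[1].set_index("Date")

print(df)
```

read_html also accepts an index_col argument (a column position), so something like `pd.read_html(url, header=0, index_col=0)[4]` might do the same job, assuming Date is the first column of that table.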

Related

How do I subset a dataframe based on index matches to the column name of another dataframe?

I want to keep the columns of df whose names match the index of df2.
My code below only returns df.index, but I want to return the entire subset of the dataframe.
import pandas as pd
df = df[df.columns.intersection(df2.index)]
From my understanding, you want the data from both dataframes that matches the index of df2. Is that correct?
You can use merge to join the dataframes.
df = pd.merge(df1, df2, how='inner', left_index=True, right_index=True)
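For the column selection described in the question, here is a small sketch with made-up data (the column names and values are assumptions, not taken from the original post). It shows that selecting with the intersected labels returns whole columns. If the real data gives back nothing, the labels may differ in dtype or whitespace, which is worth checking.

```python
import pandas as pd

# Hypothetical data: df has columns a, b, c; df2 is indexed by b, c, d.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "c": [5, 6]})
df2 = pd.DataFrame({"weight": [0.1, 0.2, 0.3]}, index=["b", "c", "d"])

# Column labels of df that also appear in the index of df2.
common = df.columns.intersection(df2.index)

# Selecting with that Index keeps the full columns, not just the labels.
subset = df[common]

print(subset)
```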

Concatenate row values in Pandas DataFrame

I have a problem with a pandas DataFrame.
I read the first Excel file and got a DataFrame like this:
First DataFrame
Then I read the second Excel file and got this:
Second DataFrame
I need to concatenate the rows so that the result looks like this:
Third DataFrame
I have code like this:
import pandas as pd
import numpy as np
x1 = pd.ExcelFile("x1.xlsx")
df1 = pd.read_excel(x1, "Sheet1")
x2 = pd.ExcelFile("x2.xlsx")
df2 = pd.read_excel(x2, "Sheet1")
result = pd.merge(df1, df2, how="outer")
The second df just follows the first df. How can I get a dataframe laid out like the third one?
merge does not concatenate the dataframes the way you want; use concat instead:
ndf = pd.concat([df1, df2]).sort_values('name')
On pandas versions before 2.0 you could also use append, which has since been removed:
ndf = df1.append(df2).sort_values('name')
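A small self-contained sketch of the concat approach follows. The two frames are made up, since the original Excel files are not available; the only thing assumed about them is a shared 'name' column to sort on.

```python
import pandas as pd

# Hypothetical stand-ins for the two Excel sheets.
df1 = pd.DataFrame({"name": ["a", "c"], "value": [1, 3]})
df2 = pd.DataFrame({"name": ["b", "d"], "value": [2, 4]})

# Stack the rows, interleave them by sorting on 'name',
# then renumber the index so it runs 0..n-1 again.
ndf = pd.concat([df1, df2]).sort_values("name").reset_index(drop=True)

print(ndf)
```

The reset_index(drop=True) step is optional; without it each row keeps the index it had in its source frame.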

How to include an if condition after merging two dataframes?

Currently my code merges two dataframes from my desktop, drops some duplicates and some columns, and converts the final output into a picture that is sent via Telegram.
import pandas as pd
import dataframe_image as di
import telepot
df = pd.read_csv('a.csv', delimiter=';')
df1 = pd.read_csv('b.csv', delimiter=';')
total = pd.merge(df, df1, on="Conc", how="inner")
total = total.drop_duplicates(subset=["A"], keep="first")
total = total.drop(columns=['A', 'B', 'C', 'D', 'E', 'Conc'])
di.export(total, 'total.png')
bot = telepot.Bot('token')
bot.sendPhoto(chatid, photo=open('total.png', 'rb'))
This is the happy path, where the merge gives me a new dataframe with text in it. How can I handle the case where the merge produces an empty df, so that I can send "NA" via Telegram instead?
Many thanks
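One possible way to handle this is to test DataFrame.empty after the merge and branch on it. The sketch below uses made-up frames and a hypothetical helper, merge_or_none, and prints instead of calling Telegram so that it runs on its own; the comments mark where the telepot calls would presumably go.

```python
import pandas as pd

def merge_or_none(df, df1):
    """Return the cleaned merge result, or None when it has no rows."""
    total = pd.merge(df, df1, on="Conc", how="inner")
    total = total.drop_duplicates(subset=["A"], keep="first")
    total = total.drop(columns=["A", "Conc"])
    return None if total.empty else total

# Hypothetical frames: one right-hand table matches on 'Conc', one does not.
left = pd.DataFrame({"Conc": ["x", "y"], "A": [1, 2], "Text": ["foo", "bar"]})
match = pd.DataFrame({"Conc": ["y"], "Info": ["hit"]})
no_match = pd.DataFrame({"Conc": ["z"], "Info": ["miss"]})

for right in (match, no_match):
    total = merge_or_none(left, right)
    if total is None:
        # Empty merge: with telepot this would be bot.sendMessage(chatid, "NA").
        print("NA")
    else:
        # Non-empty merge: export the image and call bot.sendPhoto as before.
        print(total)
```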

Search and delete item in list of dataframes

Let's say I create a list of dataframes with:
import pandas as pd
lDfs = []
for i in range(0, 3):
    lDfs.append(pd.read_csv('SomeTable.csv'))
Then I have a list of 3 dataframes:
lDfs[0]
lDfs[1]
lDfs[2]
Let's say each dataframe has the following structure:
Date,Open,High,Low,Close,Volume
0 2020-03-02,3355.330078,3406.399902,3257.989990,3338.830078,90017600
1 2020-03-03,3355.520020,3448.239990,3354.300049,3371.969971,79445600
Now I want to search each dataframe in that list for a string pattern:
search = 'null'
and drop every row that contains it. How can I do that?
Thank you!
It turned out that pandas interpreted 'null' as NaN, so DataFrame.dropna does the trick quite easily:
for i in range(0, len(lDfs)):
    lDfs[i].dropna(inplace=True)
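If the search string is not one that pandas converts to NaN, dropna will not help, and the rows have to be filtered by their text. The sketch below is one way to do that, assuming the pattern should be matched as a plain substring in any column; the helper name drop_rows_containing and the sample frames are made up for illustration.

```python
import pandas as pd

def drop_rows_containing(df, pattern):
    """Return df without the rows where any cell contains `pattern` as text."""
    as_text = df.astype(str)
    # Boolean frame: True where a cell's text contains the literal pattern.
    hits = as_text.apply(lambda col: col.str.contains(pattern, regex=False))
    return df[~hits.any(axis=1)]

# Hypothetical frames in which 'null' survived as a literal string.
lDfs = [
    pd.DataFrame({"Date": ["2020-03-02", "2020-03-03"],
                  "Open": ["3355.33", "null"]}),
    pd.DataFrame({"Date": ["2020-03-04", "null"],
                  "Open": ["3371.97", "3400.00"]}),
]

# Rebuild the list with the filtered frames.
lDfs = [drop_rows_containing(df, "null") for df in lDfs]

print([len(df) for df in lDfs])
```

Unlike the inplace dropna loop, this builds new dataframes and replaces the list, which avoids modifying the originals.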

How to create a DataFrame with index names different from `row` and write data into (`index`, `column`) pairs in Julia?

How can I create a DataFrame in Julia with index names that are different from Row and write values into an (index, column) pair?
I do the following in Python with pandas:
import pandas as pd
df = pd.DataFrame(index = ['Maria', 'John'], columns = ['consumption','age'])
df.loc['Maria', 'age'] = 52
I would like to do the same in Julia. How can I do this? The documentation shows a DataFrame similar to the one I would like to construct but I cannot figure out how.