Converting string element into float problem - selenium

First, I wrote code to collect data from www.coinmarketcap.com, and that part works: I receive data repeatedly. However, the values arrive as str. I tried to convert them to float, but it did not work. The data I receive has the form 2,179.87. How can I solve this problem? Thanks in advance!
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.Chrome()
values = []
counter = 0
website = driver.get("https://www.binance.com/en/trade/ETH_USDT?theme=dark&type=spot")
while True:
    currency = driver.find_element_by_xpath('//*[@id="__APP"]/div/div/div[4]/div/div[1]/div[1]/div/div[2]/div[1]')
    print(currency.text)
    values.append(float(currency.text))
    time.sleep(0.1)
    counter += 1
    if counter == 300:
        break
    time.sleep(1)
At the line values.append(float(currency.text)) I get this error:
ValueError: could not convert string to float: '2,184.65'
As I mentioned above, I cannot convert this string.

The string 2,179.87 contains a comma, which float() cannot parse. First remove it with replace(',' , ''), then convert the result with float():
a = "2,184.65"
print(type(a))
b = a.replace(',' , '')
c = float(b)
print(type(c))
print(c)
For your specific issue, I think this should work:
values.append(float(currency.text.replace(',' , '')))
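The same idea can be wrapped in a small helper so the scraping loop stays readable. This is a minimal sketch: the name to_float is made up for illustration, and it assumes that ',' is always a thousands separator and '.' the decimal point, which matches the values shown in the question but not every locale.

```python
def to_float(text: str) -> float:
    """Convert a string such as '2,184.65' to a float.

    Assumption: ',' is a thousands separator and '.' is the decimal point.
    """
    # Drop surrounding whitespace, then remove every thousands separator.
    cleaned = text.strip().replace(",", "")
    return float(cleaned)


price = to_float("2,184.65")
print(type(price), price)
```

Inside the loop, the call would then read values.append(to_float(currency.text)).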

Related

Using string output from pytesseract to do a vlookup in pandas dataframe

I'm very new to Python, and I'm trying to make a simple image-to-song-title-to-BPM program. My approach is to use pytesseract to generate a string and then use that string for a vlookup-style search in a pandas DataFrame. However, it always returns 0 even though the song exists in the data.
import PIL.ImageGrab
from PIL import ImageGrab
import numpy as np
import pytesseract
import pandas as pd
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
def getTitleImage(left, top, width, height):
    printscreen_pil = ImageGrab.grab((left, top, left + width, top + height))
    printscreen_numpy = np.array(printscreen_pil.getdata(), dtype='uint8') \
        .reshape((printscreen_pil.size[1], printscreen_pil.size[0], 3))
    return printscreen_numpy
# Printscreen:
titleImage = getTitleImage(x, y, w, h)
# pytesseract to string:
songTitle = pytesseract.image_to_string(titleImage)
print('Name of the song: ', songTitle)
# Importing the csv data via pandas.
songTable = pd.read_csv(r'C:\Users\leech\Desktop\songList.csv')
# A simple vlookup-style filter that returns the BPM of the song from the matching row.
bpmSong = songTable[songTable['Song Title'] == songTitle]['BPM'].sum()
print('The BPM of the song is: ', bpmSong)
Output:
Name of the song: Macarena
The BPM of the song is: 0
However, when I hard-code the string in the songTitle variable, it works:
songTitle = 'Macarena'
print('Name of the song: ', songTitle)
songTable = pd.read_csv(r'C:\Users\leech\Desktop\songList.csv')
bpmSong = songTable[songTable['Song Title'] == songTitle]['BPM'].sum()
print('The BPM of the song is: ', bpmSong)
Output:
Name of the song: Macarena
The BPM of the song is: 103
I have checked the string generated by pytesseract: it has no extra spaces at the front or the back and looks identical to the hard-coded string, but the two still produce different results. What could be the problem?

I found the answer.
It is because the songTitle coming from:
songTitle = pytesseract.image_to_string(titleImage)
...is actually 'Macarena\n' instead of 'Macarena'.
The two look the same when printed, except that the former adds a new line after it.
A great lesson learned for me.
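A hidden trailing newline like this is easy to spot with repr() and easy to remove with str.strip(). The sketch below is only an illustration: the literal 'Macarena\n' stands in for the pytesseract output, and a plain dict stands in for the CSV table, so it runs without an image or a file.

```python
# Stand-in for pytesseract.image_to_string(titleImage), which returned 'Macarena\n'.
ocr_text = "Macarena\n"

# print() hides the trailing newline; repr() makes it visible.
print(repr(ocr_text))

# strip() removes leading and trailing whitespace, including newlines.
song_title = ocr_text.strip()

# Stand-in for the 'Song Title' -> 'BPM' lookup done with pandas in the question.
bpm_by_title = {"Macarena": 103}
print("The BPM of the song is:", bpm_by_title.get(song_title, 0))
```

In the original program the equivalent change would be to call .strip() on the value returned by pytesseract before comparing it with the 'Song Title' column.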

df.ix not working, what's the right iloc method?

This is my program:
# n = number of days
def ATR(df , n):
    df['H-L'] = abs(df['High'] - df['Low'])
    df['H-PC'] = abs(df['High'] - df['Close'].shift(1))
    df['L-PC'] = abs(df['Low'] - df['Close'].shift(1))
    df['TR']=df[['H-L','H-PC','L-PC']].max(axis=1)
    df['ATR'] = np.nan
    df.ix[n-1,'ATR']=df['TR'][:n-1].mean()
    for i in range(n , len(df)):
        df['ATR'][i] = (df['ATR'][i-1]*(n-1) + df['TR'][i])/n
    return
An error shows up:
AttributeError: 'DataFrame' object has no attribute 'ix'
I tried to replace it with iloc:
df.iloc[df.index[n-1],'ATR'] = df['TR'][:n-1].mean()
But this time another error pops up:
only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
How to fix this?

Converting code is a pain, and we have all been there.
df.ix[n-1,'ATR'] = df['TR'][:n-1].mean()
should become
df['ATR'].iloc[n-1] = df['TR'][:n-1].mean()
Hope this fits the bill
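As an alternative to indexing the column first, the single cell can be set in one indexing step, which avoids chained assignment. This is a sketch on made-up data, not the asker's DataFrame: iloc needs an integer position for the column (obtained here with columns.get_loc), while loc takes the row label and the column name.

```python
import numpy as np
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({'TR': [1.0, 2.0, 3.0, 4.0]})
df['ATR'] = np.nan
n = 3

# Same seed value as in the question: mean of the first n-1 TR values.
seed = df['TR'].iloc[:n - 1].mean()

# Position-based: row position n-1, column position looked up by name.
df.iloc[n - 1, df.columns.get_loc('ATR')] = seed

# Label-based: row label at position n-1, column by name (sets the same cell).
df.loc[df.index[n - 1], 'ATR'] = seed

print(df)
```

The iloc attempt in the question failed because iloc accepts only integer positions, so the string 'ATR' is rejected as a column index.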

Autocorrect a column in a pandas dataframe using pyenchant

I tried to apply the code from the accepted answer of this question to one of my dataframe columns where each row is a sentence, but it didn't work.
My code looks like this:
from enchant.checker import SpellChecker
checker = SpellChecker("id_ID")
h = df['Jawaban'].astype(str).str.lower()
hayo = []
for text in h:
    checker.set_text(text)
    for s in checker:
        sug = s.suggest()[0]
        s.replace(sug)
    hayo.append(checker.get_text())
I got the following error:
IndexError: list index out of range
Any help is greatly appreciated.

I don't get the error using your code. The only thing I'm doing differently is the language passed to the spell checker.
import pandas as pd
from enchant.checker import SpellChecker
checker = SpellChecker('en_US','en_UK') # not using id_ID
# sample data
ds = pd.DataFrame({ 'text': ['here is a spllng mstke','the wrld is grwng']})
p = ds['text'].str.lower()
hayo = []
for text in p:
    checker.set_text(text)
    for s in checker:
        sug = s.suggest()[0]
        s.replace(sug)
    print(checker.get_text())
    hayo.append(checker.get_text())
print(hayo)
Output:
here is a spelling mistake
the world is growing
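One plausible cause of the IndexError, offered as an assumption because the failing sentence is not shown, is that suggest() returned an empty list for some word, so indexing it with [0] fails. A small guard like the one below keeps the original word in that case. The helper name pick_replacement is hypothetical and the function itself does not depend on pyenchant.

```python
def pick_replacement(word, suggestions):
    """Return the first suggestion, or the original word if there are none."""
    return suggestions[0] if suggestions else word


# With suggestions available, the first one is used.
print(pick_replacement("spllng", ["spelling", "spilling"]))

# With no suggestions, the word is left unchanged instead of raising IndexError.
print(pick_replacement("xyzzy", []))

# Hypothetical use inside the loop above (needs pyenchant and an installed dictionary):
#     for s in checker:
#         s.replace(pick_replacement(s.word, s.suggest()))
```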

genfromtxt in Python-3.5

I am trying to fix a data set using genfromtxt in Python 3.5, but I keep getting the following error:
ndtype = np.dtype(dict(formats=ndtype, names=names))
TypeError: data type not understood
This is the code I'm using. Any help will be appreciated!
names = ["country", "year"]
names.extend(["col%i" % (idx+1) for idx in range(682)])
dtype = "S64,i4" + ",".join(["f18" for idx in range(682)])
dataset = np.genfromtxt(data_file, dtype=dtype, names=names, delimiter=",", skip_header=1, autostrip=2)

dtype = "S64,i4" + ",".join(["f18" for idx in range(682)])
is going to produce something like:
S64,i4f18,f18,f18,f18...
Note the lack of a comma after the i4.
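One way to avoid the missing comma is to put every field code in a single list and join the whole list once. The sketch below only builds the string. It uses "f8" (64-bit float) for the numeric columns, which is an assumption about what was intended: as far as I know, "f18" is not a float size NumPy recognizes, so it would need replacing as well.

```python
n_cols = 682

# One entry per column: country, year, then the numeric columns.
# Assumption: "f8" (64-bit float) is what the numeric columns should be.
fields = ["S64", "i4"] + ["f8"] * n_cols

# Joining the full list puts a comma between every pair of fields.
dtype = ",".join(fields)

print(dtype[:20])
print(len(fields))
```

The resulting string can be passed as dtype=dtype to np.genfromtxt together with the names list from the question, which has the same number of entries.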

times table generator python - syntax error

The error is on line 10, and I have no clue why it crashes. The equals sign is highlighted in red when the code is run.
Code as follows:
import random
question = 1
correct = 0
while question < 10:
    a = random.randint(1, 12)
    b = random.randint(1, 12)
    answer = input(a, 'x', b, '=')
    if 'answer' = 'a*b':
        print ('correct!')
        correct = correct+1
    else:
        print ('Incorrect\nthe correct answer was', a*b)
print ('You got', correct, 'out of 10 correct')

Change your if statement to this:
if answer == a*b:
= assigns a value, whereas == tests equality.
The other issue is that you pass too many arguments to the input function. input takes one argument: a string that is shown to the user as a prompt. The input then comes back as a string, and you cannot directly compare strings to integers, so you need to convert the string to an integer.
answer = input("Enter in the answer for {} * {}".format(a,b))
answer = int(answer)
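Putting both fixes together, the whole quiz might look like the sketch below. A few things go beyond the answer and are assumptions: the function name run_quiz and its ask parameter are made up so the quiz can be tried without typing, a for loop replaces the while loop (the original never changes question, so it would not stop), and a non-numeric reply is counted as wrong instead of crashing.

```python
import random


def run_quiz(num_questions=10, ask=input):
    """Ask num_questions times-table questions and return the number correct."""
    correct = 0
    for _ in range(num_questions):
        a = random.randint(1, 12)
        b = random.randint(1, 12)
        # input() takes a single prompt string and returns a string.
        reply = ask("{} x {} = ".format(a, b))
        try:
            answer = int(reply)
        except ValueError:
            answer = None  # a non-numeric reply counts as incorrect
        # == compares values; = would be an assignment and a syntax error here.
        if answer == a * b:
            print('Correct!')
            correct += 1
        else:
            print('Incorrect\nThe correct answer was', a * b)
    print('You got', correct, 'out of', num_questions, 'correct')
    return correct
```

Calling run_quiz() with no arguments starts the interactive version that reads answers from the keyboard.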
