How do I get the full list sent via Telegram when web scraping - Selenium

I have managed to get the text I want but I can't seem to send the entire list to a telegram message. I only manage to send the first line.
import requests
from bs4 import BeautifulSoup
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager

service = Service(executable_path=ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
driver.get("")
Source = driver.page_source
soup = BeautifulSoup(Source, "html.parser")
for cars in soup.findAll(class_="car-title"):
    print(cars.text)
driver.close()

def telegram_bot_sendtext(bot_message):
    bot_token = ''
    bot_chatID = ''
    send_text = 'https://api.telegram.org/bot' + bot_token + '/sendMessage?chat_id=' + bot_chatID + '&parse_mode=Markdown&text=' + bot_message
    response = requests.get(send_text)
    return response.json()

test = telegram_bot_sendtext(cars.text)
The print function gives me this
AUDI E-TRON
MERCEDES-BENZ EQC
TESLA MODEL 3
NISSAN LEAF
MERCEDES-BENZ EQV
AUDI E-TRON
At some point I would like to add a function that checks for updates and, if there are any changes, sends a push message to Telegram. If someone could point me in the right direction I would be grateful.

What happens?
You're sending only one line because you never store the results anywhere: once the loop has finished, only the last result from the iteration is still in memory.
How to fix?
Assuming you want to send the text as shown in the question, store the results in a variable: iterate over the result set, extract each element's text, and join() the results with a newline character:
cars = '\n'.join([cars.text for cars in soup.find_all(class_="car-title")])
Example
...
cars = '\n'.join([cars.text for cars in soup.find_all(class_="car-title")])
def telegram_bot_sendtext(bot_message):
    bot_token = ''
    bot_chatID = ''
    send_text = 'https://api.telegram.org/bot' + bot_token + '/sendMessage?chat_id=' + bot_chatID + '&parse_mode=Markdown&text=' + bot_message
    response = requests.get(send_text)
    return response.json()
test = telegram_bot_sendtext(cars)
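Concatenating the message into the URL by hand relies on the HTTP client to escape newlines and spaces. Here is a minimal sketch, assuming the same sendMessage endpoint as above, that encodes the query string explicitly with the standard library; parse_mode is left out for simplicity. build_send_url and new_titles are hypothetical helper names, and the token and chat id are placeholders. The second helper is one possible starting point for the update check mentioned in the question: compare the titles from the previous run with the current ones and only send what is new.

```python
from urllib.parse import urlencode


def build_send_url(bot_token, chat_id, text):
    """Return a sendMessage URL with the query string percent-encoded."""
    query = urlencode({"chat_id": chat_id, "text": text})
    return "https://api.telegram.org/bot" + bot_token + "/sendMessage?" + query


def new_titles(previous, current):
    """Return titles present in `current` but not in `previous`, keeping order."""
    seen = set(previous)
    return [title for title in current if title not in seen]


# Placeholder data standing in for the scraped car titles.
titles = ["AUDI E-TRON", "MERCEDES-BENZ EQC", "TESLA MODEL 3"]
url = build_send_url("TOKEN", "123", "\n".join(titles))
added = new_titles(["AUDI E-TRON"], titles)
```

With requests, passing the same dictionary as requests.get(url, params=...) should have the same effect as building the query string yourself.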

Related

Webscraping customer review - Invalid selector error using XPath

I am trying to extract the user id, rating and review from the following site using Selenium, and it is showing an "Invalid selector" error. I think the XPath I defined to get the review text is the reason for the error, but I am unable to resolve the issue. The site link is below:
teslamotor review
The code that I have used is following:
# Class for review web scraping from the consumeraffairs.com site
class CarForumCrawler():
    def __init__(self, start_link):
        self.link_to_explore = start_link
        self.comments = pd.DataFrame(columns = ['rating','user_id','comments'])
        self.driver = webdriver.Chrome(executable_path=r'C:/Users/mumid/Downloads/chromedriver/chromedriver.exe')
        self.driver.get(self.link_to_explore)
        self.driver.implicitly_wait(5)
        self.extract_data()
        self.save_data_to_file()
    def extract_data(self):
        ids = self.driver.find_elements_by_xpath("//*[contains(@id,'review-')]")
        comment_ids = []
        for i in ids:
            comment_ids.append(i.get_attribute('id'))
        for x in comment_ids:
            # Extract the rating for each user on a page
            user_rating = self.driver.find_elements_by_xpath('//*[@id="' + x +'"]/div[1]/div/img')[0]
            rating = user_rating.get_attribute('data-rating')
            # Extract the user id for each user on a page
            userid_element = self.driver.find_elements_by_xpath('//*[@id="' + x +'"]/div[2]/div[2]/strong')[0]
            userid = userid_element.get_attribute('itemprop')
            # Extract the message for each user on a page
            user_message = self.driver.find_elements_by_xpath('//*[@id="' + x +'"]]/div[3]/p[2]/text()')[0]
            comment = user_message.text
            # Add the rating, user id and comment for each user to the dataframe
            self.comments.loc[len(self.comments)] = [rating,userid,comment]
    def save_data_to_file(self):
        # Save the dataframe content to a CSV file
        self.comments.to_csv ('Tesla_rating-6.csv', index = None, header=True)
    def close_spider(self):
        # End the session
        self.driver.quit()
try:
    url = 'https://www.consumeraffairs.com/automotive/tesla_motors.html'
    mycrawler = CarForumCrawler(url)
    mycrawler.close_spider()
except:
    raise
The error that I am getting is as follows:
Also, the XPath that I tried to trace comes from the following HTML:

You are seeing the classic error of...
The expression in find_elements_by_xpath('//*[@id="' + x +'"]]/div[3]/p[2]/text()')[0] ends in text(), so it selects text nodes, whereas you need to pass an XPath expression that selects elements. The doubled ]] after the id predicate is also a syntax error.
You need to change it to:
user_message = self.driver.find_elements_by_xpath('//*[@id="' + x +'"]/div[3]/p[2]')[0]
References
You can find a couple of relevant detailed discussions in:
invalid selector: The result of the xpath expression "//a[contains(@href, 'mailto')]/@href" is: [object Attr] getting the href attribute with Selenium
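As a minimal sketch of the fix, assuming the same div[3]/p[2] layout as in the question: select the element with XPath and read its text through the .text property afterwards. review_text_xpath and get_review_text are hypothetical helper names. The sketch uses find_element(by, value), since newer Selenium 4 releases no longer provide find_elements_by_xpath; the string "xpath" is the value of Selenium's By.XPATH constant.

```python
def review_text_xpath(review_id):
    """Build an XPath that selects the <p> element itself, not its text node."""
    return '//*[@id="' + review_id + '"]/div[3]/p[2]'


def get_review_text(driver, review_id):
    """Locate the review paragraph and return its visible text."""
    # "xpath" is the value of selenium.webdriver.common.by.By.XPATH
    element = driver.find_element("xpath", review_text_xpath(review_id))
    return element.text
```

Inside extract_data this would be called as comment = get_review_text(self.driver, x).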

Get sender message if keyword is in it then use part of it

I want to use a /price command: if only /price is sent, give the price of 1 unit.
If it is /price 5, give the price of 5 units.
With the regular telegram import (python-telegram-bot) I can use this:
def handleCommandPrice(self, update, context):
    message = update.message
    text = message.text
    try:
        priceMultiplier = Decimal(text.replace("/" + COMMAND_PRICE, "").replace(' ', ''))
    except InvalidOperation:
        priceMultiplier = Decimal(1)
    priceMultiplied = getCurrentCoinPrice() * priceMultiplier
    priceMultiplied = round(priceMultiplied, 5)
    update.message.reply_text(text = "$ " + str(priceMultiplied), parse_mode=telegram.ParseMode.HTML)
How can I do this in Telethon? Can you give me some advice or a hint, please?
Greetings
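One possible approach, offered as a sketch rather than a definitive answer: keep the parsing in a plain function and call it from a Telethon events.NewMessage handler. parse_price_multiplier and register_price_handler are hypothetical names, and get_current_coin_price stands in for the getCurrentCoinPrice function from the question, assumed here to return a Decimal. The telethon import sits inside the function so the parser can be used and tested without Telethon installed.

```python
import re
from decimal import Decimal, InvalidOperation

# Optional single argument after the command, e.g. "/price 5".
PRICE_PATTERN = re.compile(r"^/price(?:\s+(\S+))?\s*$")


def parse_price_multiplier(text):
    """Return the number after /price, or 1 when it is absent or invalid."""
    match = PRICE_PATTERN.match(text)
    if match is None or match.group(1) is None:
        return Decimal(1)
    try:
        return Decimal(match.group(1))
    except InvalidOperation:
        return Decimal(1)


def register_price_handler(client, get_current_coin_price):
    """Attach a /price handler to an existing TelegramClient."""
    from telethon import events  # lazy import: only needed when registering

    @client.on(events.NewMessage(pattern=r"^/price"))
    async def handle_price(event):
        multiplier = parse_price_multiplier(event.raw_text)
        total = round(get_current_coin_price() * multiplier, 5)
        await event.reply("$ " + str(total))
```

With this sketch, "/price" and "/price abc" both fall back to a multiplier of 1, matching the behaviour of the python-telegram-bot version above.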

When, why and how to avoid KeyError in Odoo Development

I've noticed that some custom modules that I develop can be installed on databases with records, while others throw a KeyError unless the database is empty (no records). Generally the errors appear when the module contains computed fields. Does anybody know why this happens, and what my code should look like to avoid this kind of error?
An example of a computed field that throws this error looks like this:
from odoo import models, fields, api
from num2words import num2words
class InheritingAccountMove(models.Model):
    _inherit = 'account.move'
    total_amount_text = fields.Char(string='Total', compute='_compute_total_amount_text', store=True)
    @api.depends('amount_total')
    def _compute_total_amount_text(self):
        lang_code = self.env.context.get('lang') or self.env.user.lang
        language = self.env['res.lang'].search([('iso_code', '=', lang_code)])
        separator = language.read()[0]['decimal_point']
        for record in self:
            decimal_separator = separator
            user_language = lang_code[:2]
            amount = record.amount_total
            amount_list = str(amount).split(decimal_separator)
            amount_first_part = num2words(int(amount_list[0]), lang=user_language).title() + ' '
            amount_second_part = amount_list[1]
            if len(amount_second_part) == 0:
                amount_text = amount_first_part + '00/100'
            elif len(amount_second_part) < 2:
                amount_text = amount_first_part + amount_second_part + '0/100'
            else:
                amount_text = amount_first_part + amount_second_part[:2] + '/100'
            record.total_amount_text = amount_text

UPDATED
Your code has a problem in this situation because when there are no records in the table (at installation time) the loop never runs, so no value is assigned to your computed field.
Add this as the first line of the compute function:
self.total_amount_text = False
Assigning a value to the computed field inside the compute function is required from Odoo 13, and maybe 12.
----------------------------------------------------------------
Other reasons could be:
This error occurs when you try to access a key that does not exist in a dictionary, for example:
language.read()[0]['decimal_point']
The dictionary may not have 'decimal_point' at the time the module is installed, which may have caused this error. A common way to handle this is to check whether the key exists before accessing it:
if 'decimal_point' in language.read()[0].keys()
The result of read() can also be empty, in which case language.read()[0] will throw an error.
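The guard described above can be sketched in plain Python, with a list of dictionaries standing in for what read() returns. decimal_separator and the '.' fallback are hypothetical choices for illustration, not part of Odoo.

```python
def decimal_separator(rows, default="."):
    """Return rows[0]['decimal_point'], or `default` if it is unavailable.

    `rows` mimics the list of dictionaries returned by recordset.read().
    """
    if not rows:
        # An empty search result gives an empty list, so rows[0] would fail.
        return default
    return rows[0].get("decimal_point", default)
```

In the compute method this would replace the direct lookup, for example separator = decimal_separator(language.read()).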

I've changed my code to make it specific to Spanish, and the error no longer appears. I appreciate Muhammad's answer; maybe he's right, but anyway here is the modified code:
@api.depends('invoice_line_ids')
def _compute_total_amount_text(self):
    for record in self:
        amount = record.amount_total
        amount_list = str(amount).split('.')
        amount_first_part = num2words(int(amount_list[0]), lang='es').title() + ' '
        amount_second_part = amount_list[1]
        if len(amount_second_part) == 0:
            amount_text = amount_first_part + '00/100'
        elif len(amount_second_part) < 2:
            amount_text = amount_first_part + amount_second_part + '0/100'
        else:
            amount_text = amount_first_part + amount_second_part[:2] + '/100'
        record.total_amount_text = amount_text
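Splitting str(amount) depends on how the float happens to print. As a sketch of an alternative, the integer part and the cents can be derived with the decimal module instead. amount_to_text is a hypothetical helper, to_words stands in for a call such as num2words(n, lang='es'), and amounts are assumed to be non-negative. Unlike the code above, which truncates to two digits, this version rounds to the nearest cent.

```python
from decimal import Decimal, ROUND_HALF_UP


def amount_to_text(amount, to_words):
    """Format an amount as '<Words> NN/100' without splitting strings."""
    quantized = Decimal(str(amount)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
    units = int(quantized)                    # integer part
    cents = int((quantized - units) * 100)    # two-digit fractional part
    return "%s %02d/100" % (to_words(units).title(), cents)
```

Inside the loop this could be used as record.total_amount_text = amount_to_text(record.amount_total, lambda n: num2words(n, lang='es')).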

Telegram bot: How to get chosen inline result

I'm sending InlineQueryResultArticle to clients and I'm wondering how to get the chosen result and its data (like result_id, ...).
Here is the code that sends the results:
token = 'Bot token'
bot = telegram.Bot(token)
updater = Updater(token)
dispatcher = updater.dispatcher
def get_inline_results(bot, update):
    query = update.inline_query.query
    results = list()
    results.append(InlineQueryResultArticle(id='1000',
                                            title="Book 1",
                                            description='Description of this book, author ...',
                                            thumb_url='https://fakeimg.pl/100/?text=book%201',
                                            input_message_content=InputTextMessageContent(
                                                'chosen book:')))
    results.append(InlineQueryResultArticle(id='1001',
                                            title="Book 2",
                                            description='Description of the book, author...',
                                            thumb_url='https://fakeimg.pl/300/?text=book%202',
                                            input_message_content=InputTextMessageContent(
                                                'chosen book:')
                                            ))
    update.inline_query.answer(results)
inline_query_handler = InlineQueryHandler(get_inline_results)
dispatcher.add_handler(inline_query_handler)
I'm looking for a method like on_inline_chosen(data) to get the id of the chosen item (1000 or 1001 in the snippet above) and then send the appropriate response to the user.

You should enable /setinlinefeedback in @BotFather; then you will receive this update.

OK, I got my answer from here
Handling user chosen result:
from telegram.ext import ChosenInlineResultHandler
def on_result_chosen(bot, update):
    print(update.to_dict())
    result = update.chosen_inline_result
    result_id = result.result_id
    query = result.query
    user = result.from_user.id
    print(result_id)
    print(user)
    print(query)
    print(result.inline_message_id)
    bot.send_message(user, text='fetching book data with id:' + result_id)
result_chosen_handler = ChosenInlineResultHandler(on_result_chosen)
dispatcher.add_handler(result_chosen_handler)
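To send "the appropriate response" for each id, one simple option is a lookup keyed by the same ids that were passed to InlineQueryResultArticle. This is only a sketch: BOOKS and response_for are hypothetical names, and the titles are taken from the question's snippet.

```python
# Hypothetical catalogue keyed by the ids used in InlineQueryResultArticle.
BOOKS = {"1000": "Book 1", "1001": "Book 2"}


def response_for(result_id):
    """Return the reply text for a chosen inline result id."""
    title = BOOKS.get(result_id)
    if title is None:
        return "Unknown book id: " + result_id
    return "Fetching book data for " + title
```

Inside on_result_chosen, the reply would then be bot.send_message(user, text=response_for(result_id)).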

Retrieve all videos added to all playlists by YouTube user

I've hit a wall with the way I would like to use the YouTube data API. I have a user account that is trying to act as an 'aggregator', by adding videos from various other channels into one of about 15 playlists, based on categories. My problem is, I can't get all these videos into a single feed, because they belong to various YouTube users. I'd like to get them all into a single list, so I could sort that master list by most recent and most popular, to populate different views in my web app.
How can I get a list of all the videos that a user has added to any of their playlists?
YouTube must track this kind of stuff, because if you go into the "Feed" section of any user's page at http://www.youtube.com/ it gives you a stream of activity that includes videos added to playlists.
To be clear, I don't want to fetch a list of videos uploaded by just this user, so http://gdata.../<user>/uploads won't work. Since there are a number of different playlists, http://gdata.../<user>/playlists won't work either, because I would need to make about 15 requests each time I wanted to check for new videos.
There seems to be no way to retrieve a list of all videos that a user has added to all of their playlists. Can somebody think of a way to do this that I might have overlooked?

Something like this retrieves the YouTube links from a playlist. It still needs improvements.
import urllib2
import xml.etree.ElementTree as et
import re
import os
more = 1
id_playlist = raw_input("Enter youtube playlist id: ")
number_of_iteration = input("How much video links: ")
number = number_of_iteration / 50
number2 = number_of_iteration % 50
if (number2 != 0):
    number3 = number + 1
else:
    number3 = number
start_index = 1
while more <= number3:
    #reading youtube playlist page
    if (more != 1):
        start_index+=50
    str_start_index = str(start_index)
    req = urllib2.Request('http://gdata.youtube.com/feeds/api/playlists/'+ id_playlist + '?v=2&&start-index=' + str_start_index + '&max-results=50')
    response = urllib2.urlopen(req)
    the_page = response.read()
    #writing page in .xml
    dat = open("web_content.xml","w")
    dat.write(the_page)
    dat.close()
    #searching page for links
    tree = et.parse('web_content.xml')
    all_links = tree.findall('*/{http://www.w3.org/2005/Atom}link[@rel="alternate"]')
    #writing links + attributes to .txt
    if (more == 1):
        till_links = 50
    else:
        till_links = start_index + 50
    str_till_links = str(till_links)
    dat2 = open ("links-"+ str_start_index +"to"+ str_till_links +".txt","w")
    for links in all_links:
        str1 = (str(links.attrib) + "\n")
        dat2.write(str1)
    dat2.close()
    #getting only links
    f = open ("links-"+ str_start_index +"to"+ str_till_links +".txt","r")
    link_all = f.read()
    new_string = link_all.replace("{'href': '","")
    new_string2 = new_string.replace("', 'type': 'text/html', 'rel': 'alternate'}","")
    f.close()
    #writing links to .txt
    f = open ("links-"+ str_start_index +"to"+ str_till_links +".txt","w")
    f.write(new_string2)
    f.close()
    more+=1
os.remove('web_content.xml')
print "Finished!"
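The paging arithmetic in the script above, and the single sorted list the question asks for, can be sketched in Python 3 without any network calls. page_start_indices and merge_playlists are hypothetical helpers, and the "url" and "published" keys are assumed field names for whatever the playlist requests return; the sketch relies on ISO-formatted date strings, which sort chronologically.

```python
def page_start_indices(total, page_size=50):
    """Return the 1-based start index of each page needed to cover `total` items."""
    return list(range(1, total + 1, page_size))


def merge_playlists(playlists):
    """Flatten several playlists into one list, newest first, without duplicate URLs."""
    seen = set()
    merged = []
    for videos in playlists:
        for video in videos:
            if video["url"] not in seen:
                seen.add(video["url"])
                merged.append(video)
    return sorted(merged, key=lambda video: video["published"], reverse=True)
```

This still needs one request per playlist page, but the results end up in one master list that can be re-sorted by any other field, such as a view count.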
