Adding custom headers to all boto3 requests - http-headers

I need to add some custom headers to every boto3 request that is sent out. Is there a way to manage the connection itself to add these headers?
For boto2, connection.AWSAuthConnection has a method build_base_http_request which has been helpful. I've yet to find an analogous function within the boto3 documentation though.

This is pretty dated but we encountered the same issue, so I'm posting our solution.
I wanted to add custom headers to boto3 for specific requests.
I found this: https://github.com/boto/boto3/issues/2251, and used the event system to add the header:
import boto3

def _add_header(request, **kwargs):
    request.headers.add_header('x-trace-id', 'trace-trace')
    print(request.headers)  # for debug

some_client = boto3.client(service_name=SERVICE_NAME)
event_system = some_client.meta.events
event_system.register_first('before-sign.EVENT_NAME.*', _add_header)
You can try using a wildcard for all requests:
event_system.register_first('before-sign.*.*', _add_header)
* SERVICE_NAME: you can find all available services here: https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/index.html
For more information about registering a function for a specific event: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/events.html

The answer from @May Yaari is pretty awesome. To the concern raised by @arainchi:
This works, there is no way to pass custom data to event handlers, currently we have to do it in a non-pythonic way using global variables/queues :( I have opened issue ticket with Boto3 developers for this exact case
Actually, we can get around this with a closure, that is, a function that returns another function.
To add a custom value custom_variable to the header, we could do:
# Define the factory first: it is called at registration time below.
def _register_callback(custom_variable):
    def _add_header(request, **kwargs):
        request.headers.add_header('header_name_you_want', custom_variable)
    return _add_header

some_client = boto3.client(service_name=SERVICE_NAME)
event_system = some_client.meta.events
event_system.register_first('before-sign.EVENT_NAME.*', _register_callback(custom_variable))
Or a more Pythonic way, using a lambda:
def _add_header(request, custom_variable):
    request.headers.add_header('header_name_you_want', custom_variable)

some_client = boto3.client(service_name=SERVICE_NAME)
event_system = some_client.meta.events
event_system.register_first('before-sign.EVENT_NAME.*', lambda request, **kwargs: _add_header(request, custom_variable))
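A third option, offered here as a sketch and not taken from the answers above, is functools.partial from the standard library, which binds the extra value up front. So that it runs without AWS credentials, the sketch calls the handler directly on a stand-in request. FakeRequest, the header name, and the header value are all hypothetical. email.message.Message is used only because it also offers an add_header method.

```python
from email.message import Message
from functools import partial


def _add_header(request, header_value, **kwargs):
    # The handler receives the outgoing request as the keyword argument `request`.
    request.headers.add_header('x-trace-id', header_value)


class FakeRequest:
    """Hypothetical stand-in for the request object handed to the handler."""

    def __init__(self):
        self.headers = Message()


# Bind the custom value once; the result can be called with (request, **kwargs).
handler = partial(_add_header, header_value='trace-1234')

# With a real client, the handler would be registered instead of called directly:
# event_system.register_first('before-sign.EVENT_NAME.*', handler)
fake_request = FakeRequest()
handler(request=fake_request, operation_name='ExampleOperation')
print(fake_request.headers['x-trace-id'])
```

Compared with the closure, partial keeps the handler a plain top-level function, which can make it easier to test on its own.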

Related

JobQueue.run_repeating to run a function without command handler in Telegram

I need to start sending notifications to a TG group. Before that, I want to run a function continuously that queries an API and stores data in a DB. While this function is running, I want to be able to send notifications if they are available in the DB:
That's my code:
import telegram
from telegram.ext import Updater, CommandHandler, JobQueue

token = "tokenno:token"
bot = telegram.Bot(token=token)

def start(update, context):
    context.bot.send_message(chat_id=update.message.chat_id,
                             text="You will now receive msgs!")

def callback_minute(context):
    chat_id = context.job.context
    # Check in DB and send if new msgs exist
    send_msgs_tg(context, chat_id)

def callback_msgs():
    fetch_msgs()

def main():
    JobQueue.run_repeating(callback_msgs, interval=5, first=1, context=None)
    updater = Updater(token, use_context=True)
    dp = updater.dispatcher
    dp.add_handler(CommandHandler("start", start, pass_job_queue=True))
    updater.start_polling()
    updater.idle()

if __name__ == '__main__':
    main()
This code gives me this error:
TypeError: run_repeating() missing 1 required positional argument: 'callback'
Any help would be greatly appreciated.
There are a few issues with your code; let me try to point them out:
1.
def callback_msgs():
    fetch_msgs()
You use callback_msgs as the callback for your job. But job callbacks take exactly one argument of type telegram.ext.CallbackContext.
2.
JobQueue.run_repeating(callback_msgs, interval=5, first=1, context=None)
JobQueue is a class. To use run_repeating, which is an instance method, you'll need an instance of that class. In fact, the Updater already builds an instance for you; it's available as updater.job_queue in your case. So the call should look like this:
updater.job_queue.run_repeating(callback_msgs, interval=5, first=1, context=None)
3.
CommandHandler("start", start, pass_job_queue=True)
This is not, strictly speaking, an issue, but pass_job_queue=True has no effect at all, because you use use_context=True.
Please note that there is a nice tutorial on JobQueue over at the ptb-wiki. There is also an example on how to use it.
Disclaimer: I'm currently the maintainer of python-telegram-bot
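To see why the TypeError complains about 'callback' specifically, here is a small sketch with a stand-in class. FakeJobQueue is hypothetical and only mimics the calling shape of run_repeating; it is not the real python-telegram-bot class and schedules nothing.

```python
class FakeJobQueue:
    """Hypothetical stand-in with the same calling shape as run_repeating."""

    def run_repeating(self, callback, interval, first=None, context=None):
        # Only echoes its arguments back; no job is actually scheduled.
        return (callback.__name__, interval, first)


def callback_msgs(context):
    # Job callbacks take exactly one argument (the callback context).
    pass


# Called on the class: callback_msgs lands in `self`, so `callback` is missing.
try:
    FakeJobQueue.run_repeating(callback_msgs, interval=5, first=1, context=None)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Called on an instance: `self` is supplied automatically and the call succeeds.
scheduled = FakeJobQueue().run_repeating(callback_msgs, interval=5, first=1, context=None)
print(scheduled)
```

The same reasoning applies to the real class: going through updater.job_queue supplies the instance, so the first positional argument is taken as the callback.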

How to call Python code when a new message is delivered from the Telethon API

How can I call some Python code when a new message is delivered from the Telethon API? I need to run the code all day so that I can do my processing from Python code.
How to use this? @client.on(events.NewMessage(chats=channel, incoming=True))
Do I need to run a scheduler to check this?
I am using history = client(GetHistoryRequest) method.
First Steps - Updates in the documentation greets you with the following code:
import logging
logging.basicConfig(format='[%(levelname) 5s/%(asctime)s] %(name)s: %(message)s',
                    level=logging.WARNING)

from telethon import TelegramClient, events

client = TelegramClient('anon', api_id, api_hash)

@client.on(events.NewMessage)
async def my_event_handler(event):
    if 'hello' in event.raw_text:
        await event.reply('hi!')

client.start()
client.run_until_disconnected()
Note that you can "call" any Python code inside my_event_handler. It also shows how @client.on() is meant to be used. Note that there is no need for a scheduler.
I am using history = client(GetHistoryRequest) method.
As a side note, this is the raw API, which is discouraged when a friendly alternative, like client.get_messages, exists.

S3 Boto3 Stubber doesn't have mapping for download file?

Currently writing tests and trying to make use of the Stubber provided by botocore.
I'm trying:
client = boto3.client("s3")
response = {'Body': 'content'}
expected_params = {'Bucket': 'a_bucket_name', 'Key': 'a_path', 'Filename': 'a_target'}
with Stubber(client) as stubber:
    stubber.add_response('download_file', response, expected_params)
    download_file(client, "a_bucket_name", "a_path", "a_target")
Here download_file is my own function that just wraps the client's download_file call. It works in practice.
However, the test fails on stubber.add_response with an 'OperationNotFound' error. I stepped through using the debugger, and the issue appears here in the stub API:
if not hasattr(self.client, method):
    raise ValueError(
        "Client %s does not have method: %s"
        % (self.client.meta.service_model.service_name, method))
# Create a successful http response
http_response = AWSResponse(None, 200, {}, None)
operation_name = self.client.meta.method_to_api_mapping.get(method)  # <------- Error here
self._validate_response(operation_name, service_response)
There doesn't seem to be a mapping between the two in the dictionary. Is this a failure of the stub API, or am I missing something?
I've just found this issue, so it looks like for once it really is the library and not me:
https://github.com/boto/botocore/issues/974
That's because download_file and upload_file are customizations which live in boto3. They call out to one or many requests under the hood. Right now there's not a great story for supporting customizations other than recording underlying commands they use and adding them to the stubber. There's an external library that can handle that for you, though we don't support it ourselves.
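If the goal is only to check that the wrapper passes its arguments through, one workaround is to skip the Stubber for this call and use unittest.mock from the standard library instead. This is a swapped-in technique, not the Stubber approach from the question, and it does not exercise botocore's parameter validation. The wrapper below is a hypothetical reconstruction of the one described in the question.

```python
from unittest import mock


def download_file(client, bucket, key, filename):
    # Hypothetical wrapper, mirroring the one described in the question.
    client.download_file(bucket, key, filename)


def test_download_file_delegates_to_client():
    # A plain Mock stands in for the S3 client; no request is ever built.
    client = mock.Mock()
    download_file(client, "a_bucket_name", "a_path", "a_target")
    client.download_file.assert_called_once_with(
        "a_bucket_name", "a_path", "a_target")
    return client


checked_client = test_download_file_delegates_to_client()
```

The trade-off is that the mock accepts any arguments, so a typo in a parameter name would go unnoticed where the Stubber would have flagged it for a regular API operation.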

How to access `request_seen()` inside Spider?

I have a Spider, and I want to check whether the request I am about to schedule has already been seen according to request_seen().
I don't want to do this check inside a downloader/spider middleware; I just want to check inside my Spider.
Is there any way to call that method?
You should be able to access the dupe filter itself from the spider like this:
self.dupefilter = self.crawler.engine.slot.scheduler.df
then you could use that in other places to check:
req = scrapy.Request('whatever')
if self.dupefilter.request_seen(req):
    # it's already been seen
    pass
else:
    # never saw this one coming
    pass
I did something similar with an item pipeline. The following is the code that I use.
You should pick an identifier for each item and use it to check whether the item has been seen or not.
from scrapy.exceptions import DropItem

class SeenPipeline(object):
    def __init__(self):
        # identifiers of the items processed so far
        self.isbns_seen = set()

    def process_item(self, item, spider):
        if item['isbn'] in self.isbns_seen:
            raise DropItem("Duplicate item found : %s" % item)
        else:
            self.isbns_seen.add(item['isbn'])
            return item
Note: You can use this code within your spider, too.
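As a rough sketch of how the same seen-set idea might look outside a pipeline, here is a small helper that a spider could keep as an attribute. SeenTracker and the sample identifier are hypothetical, and nothing here depends on Scrapy.

```python
class SeenTracker:
    """Hypothetical helper: remembers identifiers and reports repeats."""

    def __init__(self):
        self._seen = set()

    def seen(self, identifier):
        # Returns True if the identifier was recorded before; records it otherwise.
        if identifier in self._seen:
            return True
        self._seen.add(identifier)
        return False


tracker = SeenTracker()
first = tracker.seen('sample-isbn-1')
second = tracker.seen('sample-isbn-1')
print(first, second)
```

Unlike the dupe filter approach, this only knows about identifiers the spider has passed to it explicitly.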

Renaming an Amazon CloudWatch Alarm

I'm trying to organize a large number of CloudWatch alarms for maintainability, and the web console grays out the name field on an edit. Is there another method (preferably something scriptable) for updating the name of CloudWatch alarms? I would prefer a solution that does not require any programming beyond simple executable scripts.
Here's a script we use to do this for the time being:
import sys
import boto

def rename_alarm(alarm_name, new_alarm_name):
    conn = boto.connect_cloudwatch()

    def get_alarm():
        alarms = conn.describe_alarms(alarm_names=[alarm_name])
        if not alarms:
            raise Exception("Alarm '%s' not found" % alarm_name)
        return alarms[0]

    alarm = get_alarm()
    # work around boto comparison serialization issue
    # https://github.com/boto/boto/issues/1311
    alarm.comparison = alarm._cmp_map.get(alarm.comparison)
    alarm.name = new_alarm_name
    conn.update_alarm(alarm)
    # update actually creates a new alarm because the name has changed, so
    # we have to manually delete the old one
    get_alarm().delete()

if __name__ == '__main__':
    alarm_name, new_alarm_name = sys.argv[1:3]
    rename_alarm(alarm_name, new_alarm_name)
It assumes you're either on an EC2 instance with a role that allows this, or you've got a ~/.boto file with your credentials. It's easy enough to add your credentials manually.
Unfortunately, it looks like this is not currently possible.
I looked around for the same solution, but it seems neither the console nor the CloudWatch API provides that feature.
Note:
We can, however, copy the existing alarm with the same parameters and save it under a new name.
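The copy-then-delete idea could be sketched with a boto3 CloudWatch client roughly as follows. This is a minimal sketch under assumptions, not a tested tool: it handles metric alarms only, and COPYABLE_KEYS is an assumed list of describe_alarms fields that put_metric_alarm accepts, which should be checked against the boto3 reference. The demo uses a mock client with made-up alarm data so that it runs without AWS access.

```python
from unittest import mock

# Assumed list of fields that describe_alarms returns and put_metric_alarm accepts.
COPYABLE_KEYS = (
    'AlarmDescription', 'ActionsEnabled', 'OKActions', 'AlarmActions',
    'InsufficientDataActions', 'MetricName', 'Namespace', 'Statistic',
    'ExtendedStatistic', 'Dimensions', 'Period', 'Unit', 'EvaluationPeriods',
    'DatapointsToAlarm', 'Threshold', 'ComparisonOperator', 'TreatMissingData',
    'EvaluateLowSampleCountPercentile', 'Metrics', 'ThresholdMetricId',
)


def rename_alarm(cloudwatch, alarm_name, new_alarm_name):
    """Copy a metric alarm under a new name, then delete the old one."""
    found = cloudwatch.describe_alarms(AlarmNames=[alarm_name])['MetricAlarms']
    if not found:
        raise LookupError("Alarm '%s' not found" % alarm_name)
    # Drop read-only fields (ARN, state, timestamps) that cannot be sent back.
    params = {k: v for k, v in found[0].items() if k in COPYABLE_KEYS}
    cloudwatch.put_metric_alarm(AlarmName=new_alarm_name, **params)
    cloudwatch.delete_alarms(AlarmNames=[alarm_name])
    return params


# Dry run against a mock; with real credentials, pass boto3.client('cloudwatch').
fake_cloudwatch = mock.Mock()
fake_cloudwatch.describe_alarms.return_value = {'MetricAlarms': [{
    'AlarmName': 'old-name', 'AlarmArn': 'arn:example', 'MetricName': 'CPUUtilization',
    'Namespace': 'AWS/EC2', 'Statistic': 'Average', 'Period': 300,
    'EvaluationPeriods': 1, 'Threshold': 80.0,
    'ComparisonOperator': 'GreaterThanThreshold', 'StateValue': 'OK',
}]}
copied = rename_alarm(fake_cloudwatch, 'old-name', 'new-name')
print(sorted(copied))
```

Only the alarm's configuration is copied here; tags are not part of the describe_alarms output and would need to be handled separately.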