Airflow 1.10.6 Kubernetes executor on EKS using an S3 connection, task passes test but DAG run fails - amazon-s3

I tried:
- Configuring the S3 connections from the chart's values.yaml:
connections:
  - id: aws_default
    type: aws
    login: xxxaws_access_key_idxxx
    password: xxxxxxxxxxxxxaws_secret_access_keyxxxxxxxxxxxxxxxxxx
  - id: my_s3
    type: s3
    login: xxxaws_access_key_idxxx
    password: xxxxxxxxxxxxxaws_secret_access_keyxxxxxxxxxxxxxxxxxx
- Writing a DAG that uses the S3Hook to write a string to S3 (a minimal sketch of this kind of DAG is shown after the DAG args below).
- Running a test from the scheduler pod:
/entrypoint airflow test dag_id task_id date_before_the_start_date_of_DAG
The file is created and its content is OK.
- From the Airflow UI, activating the DAG and triggering a run: it gets queued and then fails.
Any suggestions?
BTW, the DAG args:
args = {
    'owner': 'airflow',
    'start_date': airflow.utils.dates.days_ago(0),
    'trigger_rule': 'dummy',
    # 'pool': 'my_workers_pool',
    'catchup': False,  # don't backfill
}
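For context, here is a minimal sketch of that kind of DAG. The hook call follows the Airflow 1.10 S3Hook API, but the DAG id, bucket, key and task body are illustrative placeholders, not the exact DAG from this setup:

import airflow
from airflow import DAG
from airflow.hooks.S3_hook import S3Hook
from airflow.operators.python_operator import PythonOperator

def write_to_s3(**kwargs):
    # Upload a small string through the my_s3 connection defined in values.yaml.
    hook = S3Hook(aws_conn_id='my_s3')
    hook.load_string('hello from airflow', key='test/hello.txt',
                     bucket_name='my-bucket', replace=True)

dag = DAG('s3_write_example', default_args=args, schedule_interval=None)

write_task = PythonOperator(
    task_id='write_to_s3',
    python_callable=write_to_s3,
    provide_context=True,
    dag=dag,
)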
In addition, I added a 5-minute sleep to the task that causes the DAG to fail and watched pod creation with kubectl, but the task pod runs for a few seconds and then disappears. Any ideas how to debug this issue?
Logs from the task pod:
kubectl logs postgressexamplesababaegozimpostgress-7322d44cb2684a09bef95ad3080b9505 -n airflow-research-p-8482f --tail=200

[2020-08-24 12:04:44,262] {{settings.py:252}} INFO - settings.configure_orm(): Using pool settings. pool_size=5, max_overflow=10, pool_recycle=1800, pid=1
/usr/local/lib/python3.7/site-packages/psycopg2/__init__.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi.
""")
[2020-08-24 12:04:44,833] {{__init__.py:51}} INFO - Using executor LocalExecutor
[2020-08-24 12:04:44,834] {{dagbag.py:92}} INFO - Filling up the DagBag from /usr/local/airflow/dags/postgress_example.py
[2020-08-24 12:04:44,841] {{dagbag.py:207}} ERROR - Failed to import: /usr/local/airflow/dags/postgress_example.py
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/dagbag.py", line 204, in process_file
m = imp.load_source(mod_name, filepath)
File "/usr/local/lib/python3.7/imp.py", line 171, in load_source
module = _load(spec)
File "<frozen importlib._bootstrap>", line 696, in _load
File "<frozen importlib._bootstrap>", line 677, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 728, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/usr/local/airflow/dags/postgress_example.py", line 33, in <module>
import boto3
ModuleNotFoundError: No module named 'boto3'
Traceback (most recent call last):
File "/usr/local/bin/airflow", line 37, in <module>
args.func(args)
File "/usr/local/lib/python3.7/site-packages/airflow/utils/cli.py", line 74, in wrapper
return f(*args, **kwargs)
File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 529, in run
dag = get_dag(args)
File "/usr/local/lib/python3.7/site-packages/airflow/bin/cli.py", line 148, in get_dag
'parse.'.format(args.dag_id))
airflow.exceptions.AirflowException: dag_id could not be found: postgress_example. Either the dag did not exist or it failed to parse.

Related

Django tests pass locally but not on Github Actions push

My tests pass locally, and on GitHub Actions it even says "Ran 8 tests" followed by "OK" (and I have 8). However, the test step still fails due to a strange error in the traceback.
Traceback (most recent call last):
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/utils.py", line 82, in _execute
return self.cursor.execute(sql)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 421, in execute
return Database.Cursor.execute(self, query)
sqlite3.OperationalError: near "SCHEMA": syntax error
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/runner/work/store/store/manage.py", line 22, in <module>
main()
File "/home/runner/work/store/store/manage.py", line 18, in main
execute_from_command_line(sys.argv)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/__init__.py", line 419, in execute_from_command_line
utility.execute()
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/__init__.py", line 413, in execute
self.fetch_command(subcommand).run_from_argv(self.argv)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/commands/test.py", line 23, in run_from_argv
super().run_from_argv(argv)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/base.py", line 354, in run_from_argv
self.execute(*args, **cmd_options)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/base.py", line 398, in execute
output = self.handle(*args, **options)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/core/management/commands/test.py", line 55, in handle
failures = test_runner.run_tests(test_labels)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/test/runner.py", line 736, in run_tests
self.teardown_databases(old_config)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django_heroku/core.py", line 41, in teardown_databases
self._wipe_tables(connection)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django_heroku/core.py", line 26, in _wipe_tables
cursor.execute(
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/utils.py", line 66, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/utils.py", line 75, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/utils.py", line 90, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/utils.py", line 82, in _execute
return self.cursor.execute(sql)
File "/opt/hostedtoolcache/Python/3.9.9/x64/lib/python3.9/site-packages/django/db/backends/sqlite3/base.py", line 421, in execute
return Database.Cursor.execute(self, query)
django.db.utils.OperationalError: near "SCHEMA": syntax error
Error: Process completed with exit code 1.
These are all just default Django files and I haven't messed with any of them. I don't really know what to do about it and internet searches yield nothing helpful.
I just had the same issue. For me, it was django_heroku in my project's settings.py; based on the comments, your app is using it too. As the traceback above shows, its test-database teardown (_wipe_tables in django_heroku/core.py) executes SQL containing SCHEMA, which SQLite cannot parse.
I had something like this in settings.py:
import django_heroku
django_heroku.settings(locals())
Change it to the following so django_heroku is not used on GitHub Actions:
if os.environ.get('ENVIRONMENT') != 'github':
    import django_heroku
    django_heroku.settings(locals())
Then declare an environment variable in the workflow file; I have this:
jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      max-parallel: 4
      matrix:
        python-version: [3.7, 3.8, 3.9, '3.10']
    env:
      DJANGO_SECRET_KEY: "someKey"
      ENVIRONMENT: github
Note: this assumes you already have environment variables set up. There are different ways of doing it, so depending on how yours are set up you might have to change how they are accessed.
more info here
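For illustration, reading those variables in settings.py might look like this; the variable names are the ones used above, and the fallback values are only assumptions for local development:

import os

# On GitHub Actions these come from the workflow's env block;
# the defaults are illustrative fallbacks for local runs.
SECRET_KEY = os.environ.get('DJANGO_SECRET_KEY', 'local-dev-only-key')
ENVIRONMENT = os.environ.get('ENVIRONMENT', 'local')

if ENVIRONMENT != 'github':
    import django_heroku
    django_heroku.settings(locals())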
Edit:
As per the comments, you don't have to declare an additional environment variable, because GitHub Actions provides a default variable called GITHUB_ACTIONS, so you can do:
if os.environ.get('GITHUB_ACTIONS') != 'true':
    import django_heroku
    django_heroku.settings(locals())

s3fs timeout issue on an AWS Lambda function within a VPN

s3fs seems to fail from time to time when reading from an S3 bucket using an AWS Lambda function within a VPN. I am using s3fs==0.4.0 and pandas==1.0.1.
import s3fs
import pandas as pd

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    s3_file = event['Records'][0]['s3']['object']['key']
    s3fs.S3FileSystem.connect_timeout = 1800
    s3fs.S3FileSystem.read_timeout = 1800
    with s3fs.S3FileSystem(anon=False).open(f"s3://{bucket}/{s3_file}", 'rb') as f:
        data = pd.read_json(f)  # originally self.data = pd.read_json(f, **kwargs), pasted from a class method
The stacktrace is the following:
Traceback (most recent call last):
File "/var/task/urllib3/connection.py", line 157, in _new_conn
(self._dns_host, self.port), self.timeout, **extra_kw
File "/var/task/urllib3/util/connection.py", line 84, in create_connection
raise err
File "/var/task/urllib3/util/connection.py", line 74, in create_connection
sock.connect(sa)
TimeoutError: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/task/botocore/httpsession.py", line 263, in send
chunked=self._chunked(request.headers),
File "/var/task/urllib3/connectionpool.py", line 720, in urlopen
method, url, error=e, _pool=self, _stacktrace=sys.exc_info()[2]
File "/var/task/urllib3/util/retry.py", line 376, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/var/task/urllib3/packages/six.py", line 735, in reraise
raise value
File "/var/task/urllib3/connectionpool.py", line 672, in urlopen
chunked=chunked,
File "/var/task/urllib3/connectionpool.py", line 376, in _make_request
self._validate_conn(conn)
File "/var/task/urllib3/connectionpool.py", line 994, in _validate_conn
conn.connect()
File "/var/task/urllib3/connection.py", line 300, in connect
conn = self._new_conn()
File "/var/task/urllib3/connection.py", line 169, in _new_conn
self, "Failed to establish a new connection: %s" % e
urllib3.exceptions.NewConnectionError: <botocore.awsrequest.AWSHTTPSConnection object at 0x7f4d578e3ed0>: Failed to establish a new connection: [Errno 110] Connection timed out
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/var/task/botocore/endpoint.py", line 200, in _do_get_response
http_response = self._send(request)
File "/var/task/botocore/endpoint.py", line 244, in _send
return self.http_session.send(request)
File "/var/task/botocore/httpsession.py", line 283, in send
raise EndpointConnectionError(endpoint_url=request.url, error=e)
botocore.exceptions.EndpointConnectionError: Could not connect to the endpoint URL: "https://my_bucket.s3.eu-west-1.amazonaws.com/?list-type=2&prefix=my_folder%2Fsomething%2F&delimiter=%2F&encoding-type=url"
Has anyone faced this same issue? Why would it fail only sometimes? Is there an s3fs configuration that could help with this specific issue?
Actually, there was no problem with s3fs at all. We were using a Lambda function with two subnets in the VPC; one worked normally, but the other was not allowed to access S3 resources, so whenever a Lambda instance was spawned on the second subnet it could not connect at all.
Fixing the issue was as easy as removing the second subnet.
You could also use boto3, which is supported by AWS, to get the JSON from S3.
import json
import boto3

def lambda_handler(event, context):
    bucket = event['Records'][0]['s3']['bucket']['name']
    key = event['Records'][0]['s3']['object']['key']
    s3_resource = boto3.resource('s3')  # originally assigned to s3, but used as s3_resource below
    file_object = s3_resource.Object(bucket, key)
    json_content = json.loads(file_object.get()['Body'].read())
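If you need a DataFrame as in the original snippet, the same body can be handed to pandas instead of json.loads (a sketch, assuming the object holds JSON that pandas can parse):

import io
import pandas as pd

# Reuse file_object from the boto3 example above.
body = file_object.get()['Body'].read()
df = pd.read_json(io.BytesIO(body))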

Using python-rq with Zope

I'm trying to use Python-RQ with Zope by calling an external method (for background tasks) from the ZMI after a certain operation. The file called by the external method resides in Extensions. It initialises the connection to Redis and imports a module that runs the background tasks. The question is: where should this to-be-imported file be placed? Python-RQ does not seem to recognise it if I put it inside the Products directory; it throws "No module named Products.xyz". Below is the code snippet:
from redis import Redis
from rq import Queue
from Products.def_update_company_status import ae_update_company_status

q = Queue(connection=Redis())

def rq_worker(context):
    q.enqueue(ae_update_company_status)
    return 'DONE'
The rq_worker function is invoked by the external method.
Below is the error
18:12:40 default: Products.def_update_company_status.ae_update_company_status() (4b2b5c81-e329-4031-a3e7-b9b1bb198278)
18:12:40 ImportError: No module named Products.def_update_company_status
Traceback (most recent call last):
File "/home/zope/ams/lib/python2.6/site-packages/rq-0.6.0-py2.6.egg/rq/worker.py", line 588, in perform_job
rv = job.perform()
File "/home/zope/ams/lib/python2.6/site-packages/rq-0.6.0-py2.6.egg/rq/job.py", line 498, in perform
try:
File "/home/zope/ams/lib/python2.6/site-packages/rq-0.6.0-py2.6.egg/rq/job.py", line 206, in func
File "/home/zope/ams/lib/python2.6/site-packages/rq-0.6.0-py2.6.egg/rq/utils.py", line 150, in import_attribute
module = importlib.import_module(module_name)
File "build/bdist.linux-x86_64/egg/importlib/__init__.py", line 37, in import_module
__import__(name)
ImportError: No module named Products.def_update_company_status
18:12:40 Moving job to u'failed' queue
18:12:40
18:12:40 *** Listening on default...
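For context on why placement matters: RQ stores the job as the function's dotted path and the worker re-imports it by that name outside of Zope, which is what the import_attribute and importlib frames in the traceback show. A sketch of what the worker effectively does (illustrative, not RQ's exact code):

import importlib

# The dotted path stored with the job must be importable on the
# worker's own sys.path, independent of the Zope instance.
module = importlib.import_module('Products.def_update_company_status')
func = getattr(module, 'ae_update_company_status')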

gcloud auth login : Properties Parse Error

I installed google-cloud-sdk on Ubuntu 14.04, and when I try to log in it shows this error.
krish@jarvis:~$ gcloud auth login
Traceback (most recent call last):
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/gcloud/gcloud.py", line 87, in <module>
from googlecloudsdk.calliope import base
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/calliope/base.py", line 8, in <module>
from googlecloudsdk.core import log
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/log.py", line 413, in <module>
_log_manager = _LogManager()
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/log.py", line 195, in __init__
self.console_formatter = _ConsoleFormatter(sys.stderr)
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/log.py", line 172, in __init__
use_color = not properties.VALUES.core.disable_color.GetBool()
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/properties.py", line 782, in GetBool
value = _GetBoolProperty(self, PropertiesFile.Load(), required)
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/properties.py", line 1141, in Load
PropertiesFile._PROPERTIES = PropertiesFile(paths)
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/properties.py", line 1160, in __init__
self.__Load(properties_path)
File "/usr/bin/../lib/google-cloud-sdk/./lib/googlecloudsdk/core/properties.py", line 1174, in __Load
raise PropertiesParseError(e.message)
googlecloudsdk.core.properties.PropertiesParseError: File contains no section headers.
file: /home/krish/.config/gcloud/properties, line: 1
'h4\xaf\xe3\xda^\xa6\xe8\xb2\xdb`$?\x11\x7f\xce\xc1\x1f\x88\xcd"\x82c\x13Bj\x07\xc3\xe3\x9ds\xdd d\xe1\n'
It does look like you have a corrupted gcloud config file, /home/krish/.config/gcloud/properties. Check its contents; maybe it's just one bad line and you can fix it, or just move/remove the file (you will need to set all your configurations again if you had made any).

SchedulingError: Couldn't apply scheduled task downloader: argument must be an int, or have a fileno() method

I have added djcelery and kombu.transport.django to my Django INSTALLED_APPS.
settings.py:
import djcelery
djcelery.setup_loader()
from .celery import app as celery_app
BROKER_URL = "redis://abc:abc#localhost:6379/0"
CELERYBEAT_SCHEDULER = "djcelery.schedulers.DatabaseScheduler"
CELERY_TIMEZONE = "Europe/London"
CELERY_ENABLE_UTC = True
# store AsyncResult in redis
CELERY_RESULT_BACKEND = "redis"
REDIS_HOST = "localhost"
REDIS_PORT = 6379
REDIS_DB = 0
REDIS_VHOST = 0
REDIS_USER="xyz"
REDIS_PASSWORD="xyz"
REDIS_CONNECT_RETRY = True
celery.py
from __future__ import absolute_import
import os
from celery import Celery
from django.conf import settings
from . import celeryconfig

# set the default Django settings module for the 'celery' program.
os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'config.settings')

app = Celery('apps')

# Using a string here means the worker will not have to
# pickle the object when using Windows.
app.config_from_object('django.conf:settings')
app.autodiscover_tasks(lambda: settings.INSTALLED_APPS)

@app.task(bind=True)
def debug_task(self):
    print('Request: {0!r}'.format(self.request))
celery supervisor configuration:
[program:celeryd]
command=/home/vagrant/.virtualenvs/project/bin/python /vagrant/project/manage.py celeryd -B -E -c 1 --settings=config.settings
directory=/vagrant/project
user=vagrant
autostart=true
autorestart=true
stdout_logfile=/var/log/supervisor/celeryd.log
stderr_logfile=/var/log/supervisor/celeryd_err.log
When a task that is supposed to download some files is scheduled using the django-celery interface, the following errors are thrown:
Traceback (most recent call last):
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/beat.py", line 203, in maybe_due
result = self.apply_async(entry, publisher=publisher)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/beat.py", line 259, in apply_async
entry, exc=exc)), sys.exc_info()[2])
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/beat.py", line 251, in apply_async
**entry.options)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/app/task.py", line 555, in apply_async
**dict(self._get_exec_options(), **options)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/app/base.py", line 323, in send_task
reply_to=reply_to or self.oid, **options
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/celery/app/amqp.py", line 302, in publish_task
**kwargs
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/messaging.py", line 168, in publish
routing_key, mandatory, immediate, exchange, declare)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/connection.py", line 440, in _ensured
return fun(*args, **kwargs)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/messaging.py", line 180, in _publish
[maybe_declare(entity) for entity in declare]
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/messaging.py", line 111, in maybe_declare
return maybe_declare(entity, self.channel, retry, **retry_policy)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/common.py", line 99, in maybe_declare
return _maybe_declare(entity)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/common.py", line 110, in _maybe_declare
entity.declare()
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/entity.py", line 505, in declare
self.queue_declare(nowait, passive=False)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/entity.py", line 531, in queue_declare
nowait=nowait)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/transport/virtual/__init__.py", line 447, in queue_declare
return queue_declare_ok_t(queue, self._size(queue), 0)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/transport/redis.py", line 651, in _size
sizes = cmds.execute()
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/redis/client.py", line 2157, in execute
conn.disconnect()
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/transport/redis.py", line 800, in disconnect
channel._on_connection_disconnect(self)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/transport/redis.py", line 461, in _on_connection_disconnect
self.connection.cycle._on_connection_disconnect(connection)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/transport/redis.py", line 259, in _on_connection_disconnect
self.poller.unregister(connection._sock)
File "/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/utils/eventio.py", line 85, in unregister
self._epoll.unregister(fd)
SchedulingError: Couldn't apply scheduled task downloader: argument must be an int, or have a fileno() method.
I have no clue where these errors are coming from; any suggestions will be greatly appreciated.
Update 1:
I use the django-celery admin interface to schedule the task, and the task is defined inside an app registered in INSTALLED_APPS as follows:
import os
import subprocess

from celery import shared_task

@shared_task
def downloader():
    # execute static/shell_scripts/feed_downloader.sh
    PROJECT_PATH = os.path.dirname(os.path.dirname(os.path.dirname(os.path.abspath(__file__))))
    subprocess.call(PROJECT_PATH + "/static/shell_scripts/feed_downloader.sh", shell=True)
The feed_downloader.sh file calls a Django management command which, if I execute it independently, works fine:
python manage.py feed_downloader
If I add a print statement just before line 85 of
/home/vagrant/.virtualenvs/project/local/lib/python2.7/site-packages/kombu/utils/eventio.py
to check the fd parameter, the following messages get printed:
see this <socket._socketobject object at 0x131b750>
see this None
see this <socket._socketobject object at 0x131bc90>
see this None
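For clarity, the temporary debug print described above would look roughly like this (a sketch based on the traceback of kombu/utils/eventio.py around line 85, not the exact edit):

def unregister(self, fd):
    print('see this', fd)       # fd is sometimes a socket object, sometimes None
    self._epoll.unregister(fd)  # raises "argument must be an int, or have a fileno() method" when fd is None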