Celery tasks from different applications in different log files - redis

I'm trying to configure Celery on my FreeBSD server and I'm running into some issues with the log files.
My configuration:
FreeBSD server
2 Django applications: app1 and app2
Celery is daemonized, with Redis as the broker
Each application has its own Celery task
My Celery config file:
In /etc/default/celeryd_app1 I have:
# Names of nodes to start
CELERYD_NODES="worker"
# Absolute or relative path to the 'celery' command:
CELERY_BIN="/usr/local/www/app1/venv/bin/celery"
# App instance to use
CELERY_APP="main"
# Where to chdir at start.
CELERYD_CHDIR="/usr/local/www/app1/src/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# Set logging level to DEBUG
#CELERYD_LOG_LEVEL="DEBUG"
# %n will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/app1/%n%I.log"
CELERYD_PID_FILE="/var/run/celery/app1/%n.pid"
# Workers should run as an unprivileged user.
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
I have exactly the same file for celeryd_app2
Django settings file with Celery settings:
CELERY_BROKER_URL = 'redis://localhost:6379'
CELERY_ACCEPT_CONTENT = ['application/json']
CELERY_RESULT_BACKEND = 'redis://localhost:6379'
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
CELERY_IGNORE_RESULT = False
CELERY_TASK_TRACK_STARTED = True
# Add a one-minute timeout to all Celery tasks.
CELERYD_TASK_SOFT_TIME_LIMIT = 60
Both settings files use the same Redis port.
My issue:
When I execute a Celery task for app1, I find logs from this task in app2's log file, with an error like this:
Received unregistered task of type 'app1.task.my_task_for_app1'
...
KeyError: 'app1.task.my_task_for_app1'
Is there an issue in my Celery config file? Do I have to set a different Redis port for each app? If so, how can I do that?
Thank you very much

I guess the problem lies in the fact that you are using the same Redis database for both applications:
CELERY_BROKER_URL = 'redis://localhost:6379'
Take a look at the guide for using Redis as a broker. Just use a different database for each application, e.g.
CELERY_BROKER_URL = 'redis://localhost:6379/0'
and
CELERY_BROKER_URL = 'redis://localhost:6379/1'
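As a minimal sketch (the database numbers 0 and 1 are arbitrary, any two distinct numbers work), each application's Django settings would then be along these lines:
# app1 settings - app1 gets Redis database 0
CELERY_BROKER_URL = 'redis://localhost:6379/0'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/0'

# app2 settings - app2 gets Redis database 1
CELERY_BROKER_URL = 'redis://localhost:6379/1'
CELERY_RESULT_BACKEND = 'redis://localhost:6379/1'
With separate databases the app1 worker no longer sees messages intended for app2, so the "Received unregistered task" errors (and the log entries ending up in the wrong file) should go away.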

Related

Web HUE installation and setup is done, but the dashboard is not working

I have recently set up Hue on my Hadoop cluster and everything seems fine. I was able to open the Hue web UI (i.e., localhost:8888) and I can see HDFS, HBase and MySQL. But I am still facing some issues. Could anyone please help me out?
The problems I am facing are:
Hive connection: I am using Beeline and I was able to connect to the Hive databases using Beeline on the shell, but in the Hue web UI it shows "error loading databases". The configuration I have used in the hue.ini file is:
hive_server_host=localhost
# Port where HiveServer2 Thrift server runs on.
hive_server_port=10000
The second issue: even though I was able to connect to the MySQL database, the problem I am facing is in the dashboard tab. I can see all the widgets and charting options like pie, bar, etc., but when I drag and drop them onto the page, it loads forever. I am not able to see any chart of the table data.
Please help me out, as I have been trying for 10 days and have not been able to find any pointers yet.
@Ruthikajawar I have a working hue.ini here:
https://github.com/steven-dfheinz/HDP3-Hue-Service/blob/Hue.4.6.0/configuration/live.hue.ini
The specifics for a working Hive connection are:
[beeswax]
# Host where HiveServer2 is running.
# If Kerberos security is enabled, use fully-qualified domain name (FQDN).
hive_server_host=hdp.cloudera.com
# Binary thrift port for HiveServer2.
#hive_server_port=10000
# Http thrift port for HiveServer2.
#hive_server_http_port=10001
# Host where LLAP is running
## llap_server_host = localhost
# LLAP binary thrift port
## llap_server_port = 10500
# LLAP HTTP Thrift port
## llap_server_thrift_port = 10501
# Alternatively, use Service Discovery for LLAP (Hive Server Interactive) and/or Hiveserver2, this will override server and thrift port
# Whether to use Service Discovery for LLAP
## hive_discovery_llap = true
# is llap (hive server interactive) running in an HA configuration (more than 1)
# important as the zookeeper structure is different
## hive_discovery_llap_ha = false
# Shortcuts to finding LLAP znode Key
# Non-HA - hiveserver-interactive-site - hive.server2.zookeeper.namespace ex hive2 = /hive2
# HA-NonKerberized - (llap_app_name)_llap ex app name llap0 = /llap0_llap
# HA-Kerberized - (llap_app_name)_llap-sasl ex app name llap0 = /llap0_llap-sasl
## hive_discovery_llap_znode = /hiveserver2-hive2
# Whether to use Service Discovery for HiveServer2
hive_discovery_hs2 = true
# Hiveserver2 is hive-site hive.server2.zookeeper.namespace ex hiveserver2 = /hiverserver2
hive_discovery_hiveserver2_znode = /hiveserver2
# Applicable only for LLAP HA
# To keep the load on zookeeper to a minimum
# ---- we cache the LLAP activeEndpoint for the cache_timeout period
# ---- we cache the hiveserver2 endpoint for the length of session
# configurations to set the time between zookeeper checks
## cache_timeout = 60
# Host where Hive Metastore Server (HMS) is running.
# If Kerberos security is enabled, the fully-qualified domain name (FQDN) is required.
#hive_metastore_host=hdp.cloudera.com
# Configure the port the Hive Metastore Server runs on.
#hive_metastore_port=9083
# Hive configuration directory, where hive-site.xml is located
hive_conf_dir=/etc/hive/conf
# Timeout in seconds for thrift calls to Hive service
## server_conn_timeout=120
# Choose whether to use the old GetLog() thrift call from before Hive 0.14 to retrieve the logs.
# If false, use the FetchResults() thrift call from Hive 1.0 or more instead.
## use_get_log_api=false
# Limit the number of partitions that can be listed.
## list_partitions_limit=10000
# The maximum number of partitions that will be included in the SELECT * LIMIT sample query for partitioned tables.
## query_partitions_limit=10
# A limit to the number of rows that can be downloaded from a query before it is truncated.
# A value of -1 means there will be no limit.
## download_row_limit=100000
# A limit to the number of bytes that can be downloaded from a query before it is truncated.
# A value of -1 means there will be no limit.
## download_bytes_limit=-1
# Hue will try to close the Hive query when the user leaves the editor page.
# This will free all the query resources in HiveServer2, but also make its results inaccessible.
## close_queries=false
# Hue will use at most this many HiveServer2 sessions per user at a time.
# For Tez, increase the number to more if you need more than one query at the time, e.g. 2 or 3 (Tez has a maximum of 1 query by session).
## max_number_of_sessions=1
# Thrift version to use when communicating with HiveServer2.
# Version 11 comes with Hive 3.0. If issues, try 7.
thrift_version=11
# A comma-separated list of white-listed Hive configuration properties that users are authorized to set.
## config_whitelist=hive.map.aggr,hive.exec.compress.output,hive.exec.parallel,hive.execution.engine,mapreduce.job.queuename
# Override the default desktop username and password of the hue user used for authentications with other services.
# e.g. Used for LDAP/PAM pass-through authentication.
## auth_username=hive
## auth_password=hive
# Use SASL framework to establish connection to host.
use_sasl=true
For the second part of your question: monitor /var/log/hue/error.log while using the UI to capture and resolve any errors.

Establish Redis Connection in Phoenix

I need to establish a Redis connection when my Phoenix app initially loads. When reading the docs, I thought that code would go in /config/dev.exs or /config/config.exs, but the Redix dependency I am using as a Redis interface is not loaded in /config.
The call below results in a reference error in /config:
Redix.start_link("redis://localhost:6379/3", name: :redix)
I only want to call this once on app load. Where should I put this call in my Phoenix app?
Adding {Redix, name: :redix} to the children list in application.ex adds the Redix process to the supervision tree, which means it will start along with your application:
children = [
  # Start the Ecto repository
  MyApp.Repo,
  # Start the Telemetry supervisor
  AppWeb.Telemetry,
  # Start the PubSub system
  ....
  # Single Redis connection
  {Redix, name: :redix}
]
See https://hexdocs.pm/redix/real-world-usage.html
You can check in iex -S mix:
iex(1)> Redix.command(:redix, ["PING"])
{:ok, "PONG"}
Now you can use all the regular Redix commands: https://hexdocs.pm/redix/readme.html#usage

Not able to start/access Klov Server at http://localhost for extent reports - Error starting ApplicationContext

I'm new to Klov reports. I downloaded the Klov jar from http://extentreports.com/community/0 and tried running the Klov server (klov-0.2.0.jar) following the instructions at https://github.com/anshooarora/klov (java -jar klov-0.2.0.jar); however, I'm getting the error below and am not able to start the Klov server (http://localhost:portNo).
Error starting ApplicationContext. To display the conditions report re-run your application with 'debug' enabled.
2018-11-13 09:09:08.298 ERROR 40212 --- [ main] o.s.boot.SpringApplication : Application run failed
MongoDB 3.2 is running and listening on port 27017.
The Klov application.properties file resides in the same folder as klov-0.2.0.jar.
I have tried different ports for Klov (80, 90, 2571, 1337), but they all give the same error described above.
I'm running it on Windows 10, with the application.properties settings below:
# klov
application.name=Klov
server.host=localhost
server.port=90
# data.mongodb
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=klov
# data.rest
spring.data.rest.basePath=/rest
spring.data.rest.default-page-size=6
# redis, session
#use.redis.session.store=false
#spring.redis.host=localhost
#spring.redis.port=6379
#spring.redis.ssl=false
#spring.redis.database=0
#spring.session.store-type=redis
#server.session.timeout=-1
# users
server.admin.name=klovadmin
server.admin.key=$2a$10$I/5TFi6BrHChUghTZEZfCO82txzu8L5brcK0CxhS3m.V6glfj2vZe
# storage
file.storage.location=./upload/reports/
# schedulers
scheduler.jobs.enabled=false
# automatically delete older builds
# default is -1 (keep all)
# this count must be greater than 0 for this scheduler to work
# scheduled to run daily at 12:00AM
scheduler.job.builds.retain.count=-1
# mail
spring.mail.host=
spring.mail.port=
spring.mail.username=
spring.mail.password=
spring.mail.properties.mail.smtp.ssl.enable=true
#spring.mail.properties.mail.smtp.starttls.enable=true
#spring.mail.properties.mail.smtp.starttls.required=true
spring.mail.properties.mail.smtp.auth=true
#spring.mail.properties.mail.smtp.connectiontimeout=5000
#spring.mail.properties.mail.smtp.timeout=5000
#spring.mail.properties.mail.smtp.writetimeout=5000
spring.mail.test-connection=true
Since you are not using a mail server, start with the following properties:
# klov
application.name=Klov
server.host=localhost
server.port=80
# data.mongodb
spring.data.mongodb.host=localhost
spring.data.mongodb.port=27017
spring.data.mongodb.database=klov
# data.rest
spring.data.rest.basePath=/rest
spring.data.rest.default-page-size=6
# redis, session
use.redis.session.store=false
spring.redis.host=localhost
spring.redis.port=6379
spring.redis.ssl=false
spring.redis.database=0
spring.session.store-type=redis
server.session.timeout=-1
# users
server.admin.name=klovadmin
server.admin.key=$2a$10$I/5TFi6BrHChUghTZEZfCO82txzu8L5brcK0CxhS3m.V6glfj2vZe
# storage
file.storage.location=./upload/reports/
# schedulers
scheduler.jobs.enabled=false
# automatically delete older builds
# default is -1 (keep all)
# this count must be greater than 0 for this scheduler to work
# scheduled to run daily at 12:00AM
scheduler.job.builds.retain.count=-1
# mail
#spring.mail.host=
#spring.mail.port=
#spring.mail.username=
#spring.mail.password=
#spring.mail.properties.mail.smtp.ssl.enable=true
#spring.mail.properties.mail.smtp.starttls.enable=true
#spring.mail.properties.mail.smtp.starttls.required=true
#spring.mail.properties.mail.smtp.auth=true
#spring.mail.properties.mail.smtp.connectiontimeout=5000
#spring.mail.properties.mail.smtp.timeout=5000
#spring.mail.properties.mail.smtp.writetimeout=5000
spring.mail.test-connection=false

send_task works only with a specific user

Setup: Celery 4.1, RabbitMQ 3.6.1 (as broker), Redis (as backend, not relevant here).
I have two RabbitMQ users:
admin_user with permissions of .* .* .*
remote_user with permissions of ack ack ack
admin_user can trigger tasks and is used by the Celery workers to handle tasks.
remote_user can only trigger one type of task - ack - which is enqueued in a dedicated ack queue that is later consumed by the ack worker (running as admin_user).
The remote_user sends the task with the following code:
from celery import Celery
app = Celery('remote', broker='amqp://remote_user:remote_pass@<machine_ip>:5672/vhost')
app.send_task('ack', args=('a1', 'a2'), queue='ack', route_name='ack')
This worked perfectly in Celery 3.1, but after upgrading to Celery 4.1 it doesn't send the task anymore. The call returns an AsyncResult, but I don't see the message in Celery Flower (or in the RabbitMQ management UI), or in the logs.
Setting remote_user's permissions to .* .* .*, the same as admin_user, doesn't help.
Adding the administrator tag doesn't help.
Changing the broker user to 'amqp://admin_user:admin_pass@<machine_ip>:5672/vhost' DOES work:
from celery import Celery
app = Celery('remote', broker='amqp://admin_user:admin_pass@<machine_ip>:5672/vhost')
app.send_task('ack', args=('a1', 'a2'), queue='ack', route_name='ack')
But I don't want to give a remote machine the admin_user permissions.
Any idea what I can do?
Solved.
The API changed, I guess, but to keep the current RabbitMQ permissions I had to use the following routing:
old_celery_config.py: (celery 3.1)
CELERY_ROUTES = {
    'ack_task': {
        'queue': 'geo_ack'
    }
}
celery_config.py: (celery 4.1)
CELERY_ROUTES = {
    'ack_task': {
        'exchange': 'ack',
        'exchange_type': 'direct',
        'routing_key': 'ack'
    }
}
run_task.py:
from celery import Celery
app = Celery('remote', broker='amqp://remote_user:remote_pass@<machine_ip>:5672/vhost')
app.config_from_object('celery_config')
app.send_task('ack_task', args=('a1', 'a2'))
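For completeness, a minimal sketch of the worker-side counterpart (the file name and the CELERY_QUEUES setting are my assumption; they simply mirror the exchange/queue/routing-key names used above):
# worker_config.py (hypothetical) - declare the 'ack' queue bound to the
# 'ack' direct exchange with routing key 'ack', matching CELERY_ROUTES above
from kombu import Exchange, Queue

CELERY_QUEUES = (
    Queue('ack', Exchange('ack', type='direct'), routing_key='ack'),
)
The ack worker (running as admin_user) can then be started with -Q ack so it only consumes that queue.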

Running buildbot behind cherokee reverse proxy

I am attempting to run my buildbot master server behind a cherokee reverse proxy with the buildbot instance as cherokee's information source in a round robin reverse proxy layout.
This is the buildbot master.cfg configuration file:
# -*- python -*-
# ex: set syntax=python:
# This is a sample buildmaster config file. It must be installed as
# 'master.cfg' in your buildmaster's base directory.
# This is the dictionary that the buildmaster pays attention to. We also use
# a shorter alias to save typing.
c = BuildmasterConfig = {}
####### BUILDSLAVES
# The 'slaves' list defines the set of recognized buildslaves. Each element is
# a BuildSlave object, specifying a unique slave name and password. The same
# slave name and password must be configured on the slave.
from buildbot.buildslave import BuildSlave
c['slaves'] = [BuildSlave("example-slave", "pass")]
# 'slavePortnum' defines the TCP port to listen on for connections from slaves.
# This must match the value configured into the buildslaves (with their
# --master option)
c['slavePortnum'] = 9989
####### CHANGESOURCES
# the 'change_source' setting tells the buildmaster how it should find out
# about source code changes. Here we point to the buildbot clone of pyflakes.
from buildbot.changes.gitpoller import GitPoller
c['change_source'] = []
c['change_source'].append(GitPoller(
    'git://github.com/buildbot/pyflakes.git',
    workdir='gitpoller-workdir', branch='master',
    pollinterval=300))
####### SCHEDULERS
# Configure the Schedulers, which decide how to react to incoming changes. In this
# case, just kick off a 'runtests' build
from buildbot.schedulers.basic import SingleBranchScheduler
from buildbot.schedulers.forcesched import ForceScheduler
from buildbot.changes import filter
c['schedulers'] = []
c['schedulers'].append(SingleBranchScheduler(
    name="all",
    change_filter=filter.ChangeFilter(branch='master'),
    treeStableTimer=None,
    builderNames=["runtests"]))
c['schedulers'].append(ForceScheduler(
    name="force",
    builderNames=["runtests"]))
####### BUILDERS
# The 'builders' list defines the Builders, which tell Buildbot how to perform a build:
# what steps, and which slaves can execute them. Note that any particular build will
# only take place on one slave.
from buildbot.process.factory import BuildFactory
from buildbot.steps.source import Git
from buildbot.steps.shell import ShellCommand
factory = BuildFactory()
# check out the source
factory.addStep(Git(repourl='git://github.com/buildbot/pyflakes.git', mode='copy'))
# run the tests (note that this will require that 'trial' is installed)
factory.addStep(ShellCommand(command=["trial", "pyflakes"]))
from buildbot.config import BuilderConfig
c['builders'] = []
c['builders'].append(
    BuilderConfig(name="runtests",
                  slavenames=["example-slave"],
                  factory=factory))
####### STATUS TARGETS
# 'status' is a list of Status Targets. The results of each build will be
# pushed to these targets. buildbot/status/*.py has a variety to choose from,
# including web pages, email senders, and IRC bots.
c['status'] = []
from buildbot.status import html
from buildbot.status.web import authz, auth
authz_cfg=authz.Authz(
    # change any of these to True to enable; see the manual for more
    # options
    auth=auth.BasicAuth([("pyflakes","pyflakes")]),
    gracefulShutdown = False,
    forceBuild = 'auth', # use this to test your slave once it is set up
    forceAllBuilds = False,
    pingBuilder = False,
    stopBuild = False,
    stopAllBuilds = False,
    cancelPendingBuild = False,
)
c['status'].append(html.WebStatus(http_port=8010, authz=authz_cfg))
####### PROJECT IDENTITY
# the 'title' string will appear at the top of this buildbot
# installation's html.WebStatus home page (linked to the
# 'titleURL') and is embedded in the title of the waterfall HTML page.
c['title'] = "Pyflakes"
c['titleURL'] = "http://divmod.org/trac/wiki/DivmodPyflakes"
# the 'buildbotURL' string should point to the location where the buildbot's
# internal web server (usually the html.WebStatus page) is visible. This
# typically uses the port number set in the Waterfall 'status' entry, but
# with an externally-visible host name which the buildbot cannot figure out
# without some help.
c['buildbotURL'] = "http://localhost:8010/"
####### DB URL
c['db'] = {
    # This specifies what database buildbot uses to store its state. You can leave
    # this at its default for all but the largest installations.
    'db_url' : "sqlite:///state.sqlite",
}
And this is the cherokee configuration:
Unfortunately, I get a 502 Bad Gateway when I go to my web URL. On the other hand, I know that my buildbot master instance is working correctly, because going to the same URL with :8010 appended gives me the "Welcome to the Buildbot ..." page.
Is your proxy on the same machine as the buildbot? If not, you will need to adjust the URL in cherokee, to point to the machine running buildbot (localhost points to the machine cherokee is running on).
In any case, c['buildbotURL'] should be changed to point to the public URL that the buildbot is available under (i.e. what cherokee exposes, rather than the URL being proxied).
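For example, if cherokee serves the site at http://build.example.com/ (a placeholder for whatever public URL cherokee actually exposes), the master.cfg entry would become:
# point buildbotURL at the public, proxied address rather than the
# internal :8010 address that cherokee forwards to
c['buildbotURL'] = "http://build.example.com/"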