Could not open client transport: Airflow HiveOperator fails to connect to Hive

Error: Could not open client transport with JDBC Uri: jdbc:hive2://XXXX:10000/default;auth=none: Failed to open new session: java.lang.IllegalArgumentException: Cannot modify airflow.ctx.task_id at runtime. It is not in list of params that are allowed to be modified at runtime (state=08S01,code=0)
Beeline version 2.3.6 by Apache Hive
I have tried setting the hiveconfs parameter to None, but I still get the same error.
Configuration I have used:
The Airflow connection is configured with a ZooKeeper-based Hive JDBC connection string and uses 'Hive Client Wrapper' as the connection type.
I am trying to pass some extra parameters to connect:
extras: {"hive_cli_params": "", "use_beeline": "true", "auth": "none"}
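For reference, a rough programmatic equivalent of that connection setup (the ZooKeeper quorum below is a placeholder, not my real host):

from airflow import settings
from airflow.models import Connection

# Rough equivalent of the UI configuration; host and port are placeholders.
conn = Connection(
    conn_id='hive_cli_default',
    conn_type='hive_cli',                 # 'Hive Client Wrapper' in the UI
    host='zk1:2181,zk2:2181,zk3:2181',    # hypothetical ZooKeeper quorum
    port=10000,
    schema='default',
    extra='{"hive_cli_params": "", "use_beeline": "true", "auth": "none"}',
)
session = settings.Session()
session.add(conn)
session.commit()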
Sample DAG code:
from airflow import DAG
from airflow.operators.hive_operator import HiveOperator
from airflow.utils.dates import days_ago

dag_conf = DAG(
    dag_id="airflow_hiveoperator",
    schedule_interval=None,
    start_date=days_ago(1),
)

hive_task = HiveOperator(
    hql='hql/query1.hql',
    task_id='airflow_hive',
    schema='default',
    hiveconf_jinja_translate=False,
    dag=dag_conf,
    conn_id='hive_cli_default',
    hiveconfs=None,
)

if __name__ == "__main__":
    dag_conf.cli()

Related

Connecting to Redis through dash-extensions WebSocket

Is there a way to listen to Redis publisher messages through the WebSocket component of the dash-extensions package?
from dash_extensions import WebSocket
...
html.Div(id="message", className='p-4 border'),
WebSocket(url='ws://127.0.0.1:6379/1', id='ws')
This gives an error in redis-server:
# Possible SECURITY ATTACK detected. It looks like somebody is sending POST or Host: commands to Redis. This is likely due to an attacker attempting to use Cross Protocol Scripting to compromise your Redis instance. Connection aborted.
Is such a connection limited to dedicated Redis clients only?
I believe that you are missing a component that reads the appropriate data from the Redis server and passes it through the websocket. This could be done in many different ways. Here is a small, self-contained example written purely in Python:
import redis
import dash_html_components as html
from gevent import sleep
from dash import Dash
from dash.dependencies import Input, Output
from dash_extensions import WebSocket
from dash_extensions.websockets import SocketPool, run_server

# Read data from Redis and pass it through the websocket.
def ws_handler(ws):
    while True:
        sleep(2)  # delay between updates, here 2 s
        value = redis.Redis().get('foo').decode("utf-8")  # get Redis value
        ws.send(f"Redis value is [{value}]")  # send data

# Inject some dummy data into Redis.
redis.Redis().set('foo', 'bar')

# Create example app.
app = Dash(prevent_initial_callbacks=True)
socket_pool = SocketPool(app, handler=ws_handler)
app.layout = html.Div([html.Div("Listening for message", id="log"), WebSocket(id="ws")])

@app.callback(Output("log", "children"), [Input("ws", "message")])
def update_graph(msg):
    return msg['data']

if __name__ == '__main__':
    run_server(app, port=5000)  # 5000 is the default port
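To see the page update while the app is running, the polled key can be changed from any other Redis client; a minimal example, assuming a local Redis on the default port:

import redis

# Update the key that ws_handler polls; the page should reflect it within ~2 s.
redis.Redis().set('foo', 'a new value')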

Connection problem with ClickHouse and RabbitMQ

I am a newbie to ClickHouse and RabbitMQ. I am trying to load data from RabbitMQ into ClickHouse with the script below, but it doesn't work.
CREATE TABLE Station (
Station varchar(2000)
) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = '<IP>:5672',
rabbitmq_exchange_name = 'Clickhouse',
rabbitmq_exchange_type = 'direct',
rabbitmq_routing_key_list = 'Station',
rabbitmq_format = 'CSV',
rabbitmq_num_consumers = 1;
And the following error message is given.
SQL Error [115]: ClickHouse exception, code: 115, host: <IP>, port: 8123; Code: 115, e.displayText() = DB::Exception: Unknown setting rabbitmq_username: for storage RabbitMQ (version 21.4.3.21 (official build))
Any suggestion for setting the rabbitmq_username?
The RabbitMQ credentials should be defined in a config file:
Open an existing or create a new custom config file rabbitmq.xml:
sudo nano /etc/clickhouse-server/config.d/rabbitmq.xml
Add this configuration and save it:
<yandex>
<rabbitmq>
<username>your_rabbitmq_username</username>
<password>your_rabbitmq_password</password>
</rabbitmq>
</yandex>
Restart the service:
sudo service clickhouse-server restart

Unable to write from spark pool to sql pool in Azure Synapse

I have a table in the default Spark pool that I need to load into the dedicated SQL pool in Azure Synapse. Below is the code I implemented; however, it is not loading.
%%pyspark
spark.conf.set("spark.sql.execution.arrow.pyspark.enabled", "true")
new_df = spark.createDataFrame(segmentation_output)
new_df.write.mode("overwrite").saveAsTable("default.segmentation_output")

%%pyspark
new_df.createOrReplaceTempView("pysparkdftemptable")

%%spark
val scala_df = spark.sqlContext.sql("select * from pysparkdftemptable")
scala_df.write.synapsesql("eana.bi.xim_CustomerSegment", Constants.INTERNAL)
Error : StructuredStream-spark package version: 2.4.5-1.3.1
StructuredStream-spark package version: 2.4.5-1.3.1
StructuredStream-spark package version: 2.4.5-1.3.1
com.microsoft.sqlserver.jdbc.SQLServerException: The TCP/IP connection to the host eec8e890e9d5--0.tr624.northeurope1-a.worker.database.windows.net (redirected from emna-dv-ibsanalytics-wco-id-euno-sqs.database.windows.net), port 11030 has failed. Error: "connect timed out. Verify the connection properties. Make sure that an instance of SQL Server is running on the host and accepting TCP/IP connections at the port. Make sure that TCP connections to the port are not blocked by a firewall.".

ftplib.error_perm: 550 Create directory operation failed while using FTP_TLS

I am trying to connect to a secure FTP server set up based on this link. I have not followed step 8 in the link. I am able to establish the connection and also change and print directories, but I am unable to create new directories. I am also unable to fetch file lists.
Here is my code snippet:
import ssl
from ftplib import FTP_TLS
import sys
import os
import os.path

def connect():
    ftp = FTP_TLS()
    ftp.debugging = 2
    ftp.connect('ipaddress', 21)
    ftp.set_pasv(False)
    ftp.login('user', 'passwd')
    return ftp

ftps = connect()
destdir = "/"
try:
    resp = ftps.pwd()
    ftps.cwd(destdir)
except Exception:
    ftps.mkd(destdir)
print(resp)

root = 'C:\\Users\\****\\****\\Logs'  # local dir
for (dir, _, files) in os.walk(root):
    newdir = destdir + dir[len(root):len(dir)].replace("\\", "/")
    print(newdir)
    try:
        ftps.cwd(newdir)
    except Exception:
        ftps.mkd(newdir)
I am using Python 3.7.3 and the corresponding ftplib. I would be happy to provide any other details required.
PS: I am able to connect with FileZilla and create directories.
The error I get after running is the "ftplib.error_perm: 550 Create directory operation failed" from the title.
I am able to create the directories successfully once I change the destination dir to /logs. However, I get "ftplib.error_perm: 500 Illegal PORT command." whenever I send a command like retrlines or storbinary.
I have searched about this, and people have suggested switching to passive mode; when I do that, I get a different error. FYI, I have enabled passive mode in the server config file.
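For reference, a minimal sketch of the passive-mode variant I tried, assuming prot_p() is the right way to secure the data channel before listing:

def connect_passive():
    ftp = FTP_TLS()
    ftp.debugging = 2
    ftp.connect('ipaddress', 21)
    ftp.login('user', 'passwd')
    ftp.prot_p()         # switch the data connection to TLS
    ftp.set_pasv(True)   # passive mode (also the ftplib default)
    return ftp

ftps = connect_passive()
print(ftps.nlst())       # list files over the passive data channel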
I also tried changing the port number to one of the passive ports enabled in the config file (between 30000 and 31000). It does not connect in this case either; the error returned is "ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it".
I am working with an FTP server, or any server communication for that matter, for the first time, so my problems might seem trivial. Sorry for the trouble, and thanks.

Airflow Adaptive Server connection failed

I want to connect Airflow to Microsoft SQL Server. I configured my connection under the 'Connections' tab in the 'Admin' menu, as described in the following link:
http://airflow.apache.org/howto/manage-connections.html
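For reference, this is roughly the programmatic equivalent of the connection I configured (host, credentials, and database are placeholders):

from airflow import settings
from airflow.models import Connection

conn = Connection(
    conn_id='FIConnection',
    conn_type='mssql',
    host='***.***.***.28',   # placeholder SQL Server host
    port=1433,
    login='XXXX',
    password='XXXXXX',
    schema='master',         # placeholder database name
)
session = settings.Session()
session.add(conn)
session.commit()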
But when I run my DAG task that is related to SQL Server, it immediately fails with the following error:
[2019-03-28 16:16:07,439] {models.py:1788} ERROR - (18456, "Login failed for user 'XXXX'.DB-Lib error message 20018, severity 14:\nGeneral SQL Server error: Check messages from the SQL Server\nDB-Lib error message 20002, severity 9:\nAdaptive Server connection failed (***.***.***.28:1433)\n")
My DAG code for the Microsoft SQL connection is the following:
sql_command = """
select * from [sys].[tables]
"""
t3 = MsSqlOperator(
    task_id='run_test_proc',
    mssql_conn_id='FIConnection',
    sql=sql_command,
    dag=dag)
I verified the IP address, port number, and similar configuration details by establishing a connection through the pymssql library from my local computer. The test code is the following:
import pymssql
import pandas as pd

with pymssql.connect(server="***.***.***.28:1433",
                     user="XXXX",
                     password="XXXXXX"
                     ) as conn:
    df = pd.read_sql("SELECT * FROM [sys].[tables]", conn)
    print(df)
Could you please share whether you have experienced this issue?
By the way, I am running this in VirtualBox on Ubuntu 16.04 LTS.
I had the same problem because freetds-dev was missing on Linux:
apt-get install freetds-dev