Connect timeout from Presto / Trino to Amazon S3

I currently have a Kubernetes setup outside of AWS in which a data lake residing in Amazon S3 is queried using Presto v348. Data is stored in Parquet file format. A Hive metastore is an additional component.
I encounter the following error and am at a loss as to how to troubleshoot the underlying issue:
io.prestosql.spi.PrestoException: Unable to execute HTTP request: Connect to s3-eu-central-1.amazonaws.com:80 [s3-eu-central-1.amazonaws.com] failed: connect timed out
This issue sometimes arises with bigger queries and, interestingly, leaves the system in a state where all subsequent queries time out. In some cases the query succeeds in about one in five attempts. Smaller queries generally work perfectly fine. The situation improves after about 10-20 minutes. Restarting Presto does not shorten this 10-20 minute window, so I suspect the problem lies elsewhere.
I am aware that I might be hitting a performance ceiling, but it is not acceptable that, instead of an error, there are only timeouts and the whole system is unusable for 10-20 minutes.
I have already increased settings such as hive.s3.max-connections in Presto and fs.s3a.connection.maximum in the metastore config, but this does not seem to solve the problem. Beyond these, I have found no suggestions on how to tune the setup to prevent the error.
Presto connector config:
connector.name=hive-hadoop2
hive.metastore.uri=thrift://hive-metastore:9083
hive.metastore.username=prestodb
hive.s3.aws-access-key="S3_ACCESS_KEY"
hive.s3.aws-secret-key="S3_SECRET_KEY"
hive.s3.endpoint=s3-eu-central-1.amazonaws.com
hive.s3.ssl.enabled=false
hive.s3.path-style-access=true
hive.parquet.use-column-names=true
hive.allow-drop-table=true
hive.s3-file-system-type=PRESTO
hive.s3.max-connections=50000
hive.s3select-pushdown.max-connections=50000
hive.s3.connect-timeout=60s
hive.allow-rename-column=true
Metastore config:
core-site.xml: |
<configuration>
<property>
<name>fs.s3a.connection.ssl.enabled</name>
<value>false</value>
</property>
<property>
<name>fs.s3a.access.key</name>
<value>xxx</value>
</property>
<property>
<name>fs.s3a.secret.key</name>
<value>xxx</value>
</property>
<property>
<name>fs.s3a.fast.upload</name>
<value>true</value>
</property>
<property>
<name>fs.s3a.connection.maximum</name>
<value>50000</value>
</property>
<property>
<name>fs.s3a.connection.establish.timeout</name>
<value>60000</value>
</property>
<property>
<name>fs.s3a.threads.max</name>
<value>64</value>
</property>
<property>
<name>fs.s3a.max.total.tasks</name>
<value>128</value>
</property>
</configuration>

Related

'hiveserver2 not listening on port 10000 and 10001'

When I run:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
It shows
Starting HiveServer2
and nothing listens on port 10000 and 10001
The HiveServer2 service does not output any error information, which makes the problem hard to diagnose. You can try starting the metastore service provided by Hive, which listens on port 9083 and may give some information when your configuration is not set properly:
hive --service metastore # run in the foreground (do not detach from the terminal) to see the logs
In my case, this service cannot be started, with error message:
MetaException(message:Hive Schema version 3.1.0 does not match metastore's schema
version 1.2.0 Metastore is not upgraded or corrupt)
One direct way to resolve this error, if there is only one Hive version on your machine, is to ignore the version difference by setting the following in hive-site.xml (another solution is to change the version recorded in metastore_db):
<property>
<name>hive.metastore.schema.verification</name>
<value>false</value>
</property>
Once this problem is resolved, the HiveServer2 service can start and listen on port 10000.
hive --service hiveserver2 > /dev/null 2>&1 &
If your HiveServer2 accesses the metastore via the Derby or MySQL JDBC driver, the aforementioned metastore service is not needed for HiveServer2. However, if HiveServer2 accesses the metastore via the Thrift protocol, as configured in conf/hive-site.xml like this:
<property>
<name>hive.metastore.uris</name>
<value>thrift://hadoop-master:9083</value>
<description>
Thrift URI for the remote metastore.
Used by metastore client to connect to remote metastore.
</description>
</property>
then the metastore service must be started first.
I had a hard time setting up hive-3.1.2, so I am writing this in case it helps someone. To diagnose the problem, first try launching the metastore and hiveserver2 like this:
metastore:
hive --service metastore --hiveconf hive.root.logger=INFO,console
hiveserver2:
hive --service hiveserver2 --hiveconf hive.server2.thrift.port=10000 --hiveconf hive.root.logger=INFO,console
Then carefully read the exceptions that are thrown.
My problem was: user hive is not allowed to perform this api call
To solve it, I added the following property to hive-site.xml:
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
<description>
Should metastore do authorization against database notification related APIs such as get_next_notification.
If set to true, then only the superusers in proxy settings have the permission
</description>
</property>
I am also adding my full hive-site.xml as a sample:
<configuration>
<property>
<name>datanucleus.schema.autoCreateTables</name>
<value>true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:mysql://server-2:3306/metastore?createDatabaseIfNotExist=true</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>mysql_username</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>mysql_password</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>com.mysql.jdbc.Driver</value>
</property>
<property>
<name>hive.metastore.uris</name>
<value>thrift://server-2:9083</value>
</property>
<property>
<name>datanucleus.fixedDatastore</name>
<value>true</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>server-2</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
</configuration>
Thanks. There is a typo: the property name should start with hive.metastore, not as shown below.
**metastore**.metastore.event.db.notification.api.auth
false

When YARN is running, the submitted Hadoop job gets stuck in ACCEPTED state

I am using VirtualBox to run an Ubuntu 14 VM on a Windows laptop. I have configured the Apache distribution of HDFS and YARN for a single node. When I start DFS and YARN, all required daemons are running. When I don't configure YARN and run DFS only, I can execute a MapReduce job successfully. But when I run YARN as well, the job gets stuck in the ACCEPTED state. I tried many changes to the node's memory settings, but no luck.
I followed this guide to set up the single node:
https://hadoop.apache.org/docs/r2.8.0/hadoop-project-dist/hadoop-common/SingleCluster.html
Settings of core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
Settings of hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.name.dir</name>
<value>/home/shaileshraj/hadoop/name/data</value>
</property>
</configuration>
Settings of mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
Settings of yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2200</value>
<description>Amount of physical memory, in MB, that can be allocated for containers.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>500</value>
</property>
RM Web UI
Here is the Application Master screen of the RM Web UI. From what I can see, the AM container is not allocated; maybe that is the problem.
If the job does not get enough resources, it stays in the ACCEPTED state. As soon as resources become available, it changes to the RUNNING state.
In your case, open the Resource Manager web UI and check how many resources are available to run jobs.

My Yarn Map-Reduce Job is taking a lot of time

Input File size : 75GB
Number of Mappers : 2273
Number of reducers : 1 (As shown in the web UI)
Number of splits : 2273
Number of Input files : 867
Cluster : Apache Hadoop 2.4.0
5 nodes cluster, 1TB each.
1 master and 4 Datanodes.
It has been 4 hours and only 12% of the map phase has completed. Given my cluster configuration, does this make sense, or is something wrong with the configuration?
Yarn-site.xml
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.resource-tracker.address</name>
<value>master:8025</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.address</name>
<value>master:8030</value>
</property>
<property>
<name>yarn.resourcemanager.address</name>
<value>master:8040</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>master</value>
<description>The hostname of the RM.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
<description>Minimum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
<description>Maximum limit of memory to allocate to each container request at the Resource Manager.</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-vcores</name>
<value>1</value>
<description>The minimum allocation for every container request at the RM, in terms of virtual CPU cores. Requests lower than this won't take effect, and the specified value will get allocated the minimum.</description>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>32</value>
<description>The maximum allocation for every container request at the RM, in terms of virtual CPU cores. Requests higher than this won't take effect, and will get capped to this value.</description>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>8192</value>
<description>Physical memory, in MB, to be made available to running containers</description>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>4</value>
<description>Number of CPU cores that can be allocated for containers.</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
This is a MapReduce job that uses multiple outputs, so the reducer emits multiple files. Each machine has 15 GB of RAM. 8 containers are running. The total memory available shown in the RM Web UI is 32 GB.
Any guidance is appreciated. Thanks in advance.
A few points to check:
The block and split sizes seem very small considering the data you shared (75 GB over 2273 splits is roughly 34 MB per split). Try increasing both.
If you are not already using one, use a custom partitioner that spreads your data uniformly across reducers.
Consider using a combiner.
Consider using appropriate compression (when storing mapper results).
Use an appropriate block replication factor.
Increase the number of reducers as appropriate.
These should help improve performance. Give them a try and share your findings!
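Some of the points above can be expressed as job configuration. The fragment below is a minimal sketch of a mapred-site.xml, not a tuned recommendation: the property names are standard Hadoop 2.x ones, but every value (256 MB minimum split size, Snappy codec, 8 reducers) is an illustrative assumption that has to be adapted to the cluster. Snappy also assumes the native libraries are installed on all nodes. Note that with the standard FileInputFormat a split does not span files, so 867 input files still produce at least 867 splits.

```xml
<configuration>
  <!-- Ask FileInputFormat for larger splits; value in bytes (268435456 = 256 MB, assumed value) -->
  <property>
    <name>mapreduce.input.fileinputformat.split.minsize</name>
    <value>268435456</value>
  </property>
  <!-- Compress intermediate map output before it is shuffled to the reducers -->
  <property>
    <name>mapreduce.map.output.compress</name>
    <value>true</value>
  </property>
  <!-- Codec for map output; Snappy is an assumption and requires native library support -->
  <property>
    <name>mapreduce.map.output.compress.codec</name>
    <value>org.apache.hadoop.io.compress.SnappyCodec</value>
  </property>
  <!-- Default number of reduce tasks per job (8 is an assumed value, not a measured optimum) -->
  <property>
    <name>mapreduce.job.reduces</name>
    <value>8</value>
  </property>
</configuration>
```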
Edit 1: Compare the log of a successful map task with that of the long-running map task attempt (12% means about 272 of the 2273 map tasks have completed). This should show where it got stuck.
Edit 2: Tweak these parameters: yarn.scheduler.minimum-allocation-mb, yarn.scheduler.maximum-allocation-mb, yarn.nodemanager.resource.memory-mb, mapreduce.map.memory.mb, mapreduce.map.java.opts, mapreduce.reduce.memory.mb, mapreduce.reduce.java.opts, mapreduce.task.io.sort.mb, mapreduce.task.io.sort.factor
Tuning these should improve the situation; take a trial-and-error approach.
Also refer: Container is running beyond memory limits
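As a starting point for the trial-and-error tuning of the MapReduce parameters listed in Edit 2, a mapred-site.xml fragment could look like the sketch below. All values are assumptions chosen only to fit within the 8192 MB per NodeManager shown in the question, with each JVM heap set to roughly 80% of its container size; they are not measured optima.

```xml
<configuration>
  <!-- Container size requested for each map task, in MB (assumed value) -->
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>2048</value>
  </property>
  <!-- Map JVM heap, kept below the container size to leave room for non-heap memory -->
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx1638m</value>
  </property>
  <!-- Container size requested for each reduce task, in MB (assumed value) -->
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>4096</value>
  </property>
  <!-- Reduce JVM heap, again about 80% of the container -->
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx3276m</value>
  </property>
  <!-- In-memory sort buffer for map output, in MB; must fit inside the map heap -->
  <property>
    <name>mapreduce.task.io.sort.mb</name>
    <value>256</value>
  </property>
  <!-- Number of spill streams merged at once while sorting (assumed value) -->
  <property>
    <name>mapreduce.task.io.sort.factor</name>
    <value>50</value>
  </property>
</configuration>
```

With these assumed sizes, a node offering 8192 MB could hold at most four 2048 MB map containers at a time, so the container size directly limits map parallelism.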
Edit 3: Try to understand a part of the logic, convert it to a Pig script, execute it, and see how it behaves.

Hive tez execution error

I am running a Hive query and get the following error when setting hive.execution.engine=tez, while the same query works with the MR engine.
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.tez.TezTask
My query is an inner join and the data is quite big.
I have also run into this problem before, but Tez worked later, so I assumed something in Hive was unstable.
When running your HQL via hive, include the following parameter. It gives you detailed logs from which you can determine the root cause more easily.
-hiveconf hive.root.logger=DEBUG,console
I faced a similar problem and the above property helped me a lot.
For example, I got the following message:
16/04/14 10:29:26 ERROR exec.Task: Failed to execute tez graph.
org.apache.tez.dag.api.TezException: org.apache.hadoop.yarn.exceptions.InvalidResourceRequestException: Invalid resource request, requested memory < 0, or requested memory > max configured, requestedMemory=20480, maxMemory=11288
When I changed my setting to 11288, my query went through fine.
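The answer does not say which setting was changed. One setting that can produce such a memory request is hive.tez.container.size in hive-site.xml; the fragment below is a sketch under that assumption, using the 11288 MB limit from the error message. The requested size has to stay at or below the maxMemory reported by YARN.

```xml
<!-- Assumption: the oversized request came from the Tez container size.
     The value is in MB and must not exceed the maxMemory reported in the error (11288). -->
<property>
  <name>hive.tez.container.size</name>
  <value>11288</value>
</property>
```

If the request originates from a different setting (for example the Tez application master memory), the same rule applies: keep the requested value within the maximum that YARN reports.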
Also check the following properties in your yarn-site.xml.
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
<description>Whether virtual memory limits will be enforced for containers</description>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>2048</value>
</property>
</configuration>
Found this post, which made it work for me. Needed to add the username
hadoop

NoClassDefFoundError HBase with YARN

I know this is a frequently asked topic. Still, after digging into every related question I could find (most of them about CLASSPATH), I can't solve mine.
Examples of the topics I found and tried:
Exception in thread "main" java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
java.lang.NoClassDefFoundError with HBase Scan
I'm using Hadoop 2.5.1 with HBase 0.98.11 on Ubuntu 14.04
I set up pseudo-distributed mode and ran Hadoop with HBase successfully. After setting up fully distributed mode, jobs fail with a NoClassDefFoundError. I tried adding "export HADOOP_CLASSPATH=/usr/local/hbase-0.98.11-hadoop2/bin/hbase classpath" to hadoop-env (and also yarn-env), but it still doesn't work.
One thing I noticed is that if I comment out the
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
I can run the jobs successfully, but it seems they then run on a single node rather than on multiple nodes.
Here are some of the configs:
mapred-site
<property>
<name>mapred.job.tracker</name>
<value>hadoop1:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
hdfs-site
<property>
<name>dfs.replication</name>
<value>2</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:/usr/local/hadoop_store/hdfs/datanode</value>
</property>
<property>
<name>dfs.datanode.use.datanode.hostname</name>
<value>false</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
yarn-site
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>shuffle service that needs to be set for Map Reduce to run
</description>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
yarn-env and hadoop-env are left at their defaults except for HADOOP_CLASSPATH (which makes no difference whether I add it or not).
Here is the error trace:
2015-04-25 23:29:25,143 FATAL [main] org.apache.hadoop.mapred.YarnChild: Error running child : java.lang.NoClassDefFoundError: org/apache/hadoop/hbase/HBaseConfiguration
at apriori2$FrequentItemsReduce.reduce(apriori2.java:550)
at apriori2$FrequentItemsReduce.reduce(apriori2.java:532)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:171)
at org.apache.hadoop.mapred.Task$NewCombinerRunner.combine(Task.java:1651)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1611)
at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:700)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1990)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:774)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:340)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Thanks a lot for any help.
With Yarn, you need to set "yarn.application.classpath" property with the classpath for your MapReduce job. "export HADOOP_CLASSPATH" would not work with Yarn.
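A minimal sketch of what that could look like in yarn-site.xml is shown below. The first entries are meant to mirror the Hadoop 2.x default value of yarn.application.classpath; the final entry is an assumption: it presumes HBase is installed under /usr/local/hbase-0.98.11-hadoop2 (the prefix used in the question) with its jars in a lib subdirectory at the same path on every node.

```xml
<property>
  <name>yarn.application.classpath</name>
  <!-- Hadoop defaults first; the last entry (HBase jars) is an assumed path -->
  <value>
    $HADOOP_CONF_DIR,
    $HADOOP_COMMON_HOME/share/hadoop/common/*,
    $HADOOP_COMMON_HOME/share/hadoop/common/lib/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/*,
    $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*,
    $HADOOP_YARN_HOME/share/hadoop/yarn/*,
    $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*,
    /usr/local/hbase-0.98.11-hadoop2/lib/*
  </value>
</property>
```

The NodeManagers typically need to be restarted after changing this property so that new containers are launched with the updated classpath.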