Unable to upload file to S3 bucket using aws-java-sdk

I have a Spring Boot application which uploads files to an S3 bucket.
I am receiving the following error whenever the application tries to upload a file. The stack trace is huge, so I am providing only part of it:
java.lang.IllegalStateException: Socket not created by this factory
at org.apache.http.util.Asserts.check(Asserts.java:34) ~[httpcore-4.4.6.jar:4.4.6]
at org.apache.http.conn.ssl.SSLSocketFactory.isSecure(SSLSocketFactory.java:435) ~[httpclient-4.5.3.jar:4.5.3]
at org.apache.http.impl.conn.DefaultClientConnectionOperator.openConnection(DefaultClientConnectionOperator.java:186) ~[httpclient-4.5.3.jar:4.5.3]
I am using the following dependency
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.11.123</version>
</dependency>
I have even tried with
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>1.5.1.RELEASE</version>
</parent>
<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-starter-aws</artifactId>
</dependency>
<!--<dependency>
    <groupId>org.springframework.cloud</groupId>
    <artifactId>spring-cloud-aws-context</artifactId>
</dependency>-->
But I am still getting the same type of error.
I have tried using both TransferManager and the putObject() method from AmazonS3, but with the same error.
The application was running fine a few days ago and the error started appearing only very recently.

I had the same issue on v1.10.12 of the SDK; switching to v1.11.136 resolved it. Add the code below to your pom file:
<!-- AWS S3 Dependencies -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.11.136</version>
</dependency>
<!-- End of AWS S3 Dependencies -->

It would be useful to post more of the stack trace so we can see at what point in the SDK lifecycle the exception is being generated (the stack trace above only shows the Apache classes). Can you also show how you're configuring the S3 client?
Are you configuring a custom SocketFactory? The check in question verifies that the Socket created by the SocketFactory is in fact an SSLSocket; if not, that's where it bombs. You can see that from the Apache code here.
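For comparison, a default client with no custom socket factory or ClientConfiguration would look roughly like the sketch below (region, bucket name, key and file path are placeholders):
import com.amazonaws.regions.Regions;
import com.amazonaws.services.s3.AmazonS3;
import com.amazonaws.services.s3.AmazonS3ClientBuilder;
import java.io.File;

public class S3UploadSketch {
    public static void main(String[] args) {
        // Default builder: the SDK wires up its own Apache HttpClient and SSL
        // connection socket factory, so no custom SocketFactory is involved.
        AmazonS3 s3 = AmazonS3ClientBuilder.standard()
                .withRegion(Regions.EU_WEST_1) // placeholder region
                .build();
        s3.putObject("my-bucket", "uploads/example.txt", new File("/tmp/example.txt"));
    }
}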

Related

NoHostAvailableException .TransportException: Error writing while connecting to cassandra (Automation)

For my automation project, I am trying to integrate Cassandra with Spring Boot JPA using the DataStax driver, and I am getting the following error:
Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s) tried for query failed (tried: api-beta.caas.dbattery.akamai.com/x.x.x.x:9042 (com.datastax.driver.core.exceptions.TransportException: [api-beta.caas.dbattery.akamai.com/x.x.x.x:9042] Error writing), api-beta.caas.dbattery.akamai.com/x.x.x.x:9042 (com.datastax.driver.core.exceptions.TransportException: [api-beta.caas.dbattery.akamai.com/x.x.x.x:9042] Error writing), api-beta.caas.dbattery.akamai.com/x.x.x.x:9042 (com.datastax.driver.core.exceptions.TransportException: [api-beta.caas.dbattery.akamai.com/x.x.x.x:9042] Error writing))
In pom.xml:
<parent>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-parent</artifactId>
    <version>2.1.1.RELEASE</version>
    <relativePath/>
</parent>
<properties>
    <spring.boot.version>2.1.1.RELEASE</spring.boot.version>
    <datastax.driver.version>3.10.2</datastax.driver.version>
</properties>
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-core</artifactId>
    <version>${datastax.driver.version}</version>
</dependency>
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-mapping</artifactId>
    <version>${datastax.driver.version}</version>
</dependency>
<dependency>
    <groupId>com.datastax.cassandra</groupId>
    <artifactId>cassandra-driver-extras</artifactId>
    <version>${datastax.driver.version}</version>
</dependency>
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-tcnative</artifactId>
    <version>2.0.34.Final</version>
</dependency>
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-tcnative-boringssl-static</artifactId>
    <version>2.0.34.Final</version>
</dependency>
<dependency>
    <groupId>io.netty</groupId>
    <artifactId>netty-transport-native-epoll</artifactId>
    <version>4.1.54.Final</version>
</dependency>
I suspect you were getting other exceptions like OperationTimedOutException when the driver attempted to connect to the nodes in the list but continually failed.
The driver generates a query plan, which is the list of nodes to contact for an application query. The driver cycles through this list one node at a time until it is able to connect to one of them. After it has tried all nodes in the list (and all of them failed), the driver returns NoHostAvailableException: All host(s) tried for query failed.
The most common causes for this are (1) network connectivity issues, or (2) nodes being down or unresponsive. There is a good chance that the cluster nodes are listening on a different IP address than the one you are connecting to.
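A quick way to confirm basic connectivity with the 3.x driver is a standalone check like the sketch below (the contact point is the hostname from your error message; the port and query are illustrative):
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;

public class CassandraConnectCheck {
    public static void main(String[] args) {
        // The port must match the native transport port the nodes actually listen on.
        Cluster cluster = Cluster.builder()
                .addContactPoint("api-beta.caas.dbattery.akamai.com")
                .withPort(9042)
                .build();
        try {
            Session session = cluster.connect();
            ResultSet rs = session.execute("SELECT release_version FROM system.local");
            System.out.println("Connected, Cassandra version: " + rs.one().getString("release_version"));
        } finally {
            cluster.close();
        }
    }
}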
It sounds like you're trying this out for the first time and you might find it easier to just focus on coding your app by connecting to Astra so you don't have to worry about the DB side of things. There's a Spring sample app that you could look at that will get you up and running in literally a few minutes. Cheers!

Apache Hudi throwing Dataset not found exception when storing to S3

I am trying to load a simple DataFrame into S3 as a Hudi dataset and I am having trouble doing that. I am new to Apache Hudi and I am running the code locally on my Windows machine. All the Maven dependencies I am using, the code, and the exceptions are listed below.
inputDF.write.format("com.uber.hoodie")
  .option(HoodieWriteConfig.TABLE_NAME, tablename)
  .option(DataSourceWriteOptions.RECORDKEY_FIELD_OPT_KEY, "GameId")
  .option(DataSourceWriteOptions.PARTITIONPATH_FIELD_OPT_KEY, "operatorShortName")
  .option(DataSourceWriteOptions.PRECOMBINE_FIELD_OPT_KEY, "HandledTimestamp")
  .option(DataSourceWriteOptions.OPERATION_OPT_KEY, DataSourceWriteOptions.UPSERT_OPERATION_OPT_VAL)
  .mode(SaveMode.Append)
  .save("s3a://s3_buket/Games2")
<!-- https://mvnrepository.com/artifact/com.amazonaws/aws-java-sdk -->
<dependency>
    <groupId>com.amazonaws</groupId>
    <artifactId>aws-java-sdk</artifactId>
    <version>1.11.623</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-aws -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-aws</artifactId>
    <version>3.2.0</version>
</dependency>
<!-- https://mvnrepository.com/artifact/org.apache.hadoop/hadoop-common -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-common</artifactId>
    <version>3.2.0</version>
</dependency>
<dependency>
    <groupId>com.uber.hoodie</groupId>
    <artifactId>hoodie</artifactId>
    <version>0.4.7</version>
    <type>pom</type>
</dependency>
<!-- https://mvnrepository.com/artifact/com.uber.hoodie/hoodie-spark -->
<dependency>
    <groupId>com.uber.hoodie</groupId>
    <artifactId>hoodie-spark</artifactId>
    <version>0.4.7</version>
</dependency>
Exception in thread "main" com.uber.hoodie.exception.DatasetNotFoundException: Hoodie dataset not found in path s3a://gat-datalake-raw-dev/Games2\.hoodie
at com.uber.hoodie.exception.DatasetNotFoundException.checkValidDataset(DatasetNotFoundException.java:45)
at com.uber.hoodie.common.table.HoodieTableMetaClient.<init>(HoodieTableMetaClient.java:91)
at com.uber.hoodie.HoodieWriteClient.rollbackInflightCommits(HoodieWriteClient.java:1172)
at com.uber.hoodie.HoodieWriteClient.startCommitWithTime(HoodieWriteClient.java:1044)
at com.uber.hoodie.HoodieWriteClient.startCommit(HoodieWriteClient.java:1037)
at com.uber.hoodie.HoodieSparkSqlWriter$.write(HoodieSparkSqlWriter.scala:144)
at com.uber.hoodie.DefaultSource.createRelation(DefaultSource.scala:91)
at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:45)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:86)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:131)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:127)
at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:155)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:152)
at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:127)
at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:80)
at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:80)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
at org.apache.spark.sql.DataFrameWriter$$anonfun$runCommand$1.apply(DataFrameWriter.scala:668)
at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:78)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:668)
at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:276)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:270)
at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:228)
at com.playngoplatform.scala.dao.DataAccessS3.writeDataToRefinedS3(DataAccessS3.scala:26)
at com.playngoplatform.scala.controller.GameAndProviderDataTransform.processData(GameAndProviderDataTransform.scala:29)
at com.playngoplatform.scala.action.GameAndProviderData$.main(GameAndProviderData.scala:10)
at com.playngoplatform.scala.action.GameAndProviderData.main(GameAndProviderData.scala)
I am not doing anything else apart from this; I am just creating a Hudi dataset directly from my Spark data source code. I can see the folder getting created in the S3 path, but nothing beyond that.
The .hoodie.properties file is shown below:
hoodie.compaction.payload.class=com.uber.hoodie.common.model.HoodieAvroPayload
hoodie.table.name=hoodie.games
hoodie.archivelog.folder=archived
hoodie.table.type=MERGE_ON_READ
Hudi is not yet mature enough to fully support Windows.
The issue is fixed by changing the file separator character when running this on a Windows machine.
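For illustration only (this is not Hudi's actual code, just a sketch of the symptom): joining a path with the platform separator on Windows produces the backslash seen in the stack trace, and S3A then treats it as part of the object key instead of a folder boundary:
import java.io.File;

public class SeparatorDemo {
    public static void main(String[] args) {
        // On Windows File.separator is "\", so the metadata folder is looked up
        // at "Games2\.hoodie" rather than "Games2/.hoodie" and is never found.
        String basePath = "s3a://gat-datalake-raw-dev/Games2";
        System.out.println(basePath + File.separator + ".hoodie");
    }
}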

Spring config server with JDBC is throwing Invalid config server configuration error

I want to use JDBC (MySQL) with Spring Cloud Config Server, but it always fails. This is what I am doing:
Spring Cloud version: Finchley.SR2
In pom.xml:
<dependencies>
    <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-config-server</artifactId>
    </dependency>
    <dependency>
        <groupId>org.springframework</groupId>
        <artifactId>spring-jdbc</artifactId>
        <exclusions>
            <exclusion>
                <groupId>org.apache.tomcat</groupId>
                <artifactId>tomcat-jdbc</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
Inside the application.config:
spring.profiles.active= jdbc
spring.datasource.url=jdbc:mysql://localhost:3306/config_db
spring.datasource.username=root
spring.datasource.password=12345
spring.datasource.driver-class-name=com.mysql.jdbc.Driver
spring.datasource.platform= mysql
spring.cloud.config.server.jdbc.sql= SELECT `key`, `value` FROM `properties` WHERE `application`=? AND `profile`=? AND `label`=?;
spring.cloud.config.server.jdbc.order=0
spring.cloud.config.server.default-profile=production
spring.cloud.config.server.default-label=latest
Finally, when I start the server, I get the errors below:
APPLICATION FAILED TO START
Description:
Invalid config server configuration.
Action:
If you are using the git profile, you need to set a Git URI in your configuration. If you are using a native profile and have spring.cloud.config.server.bootstrap=true, you need to use a composite configuration.
I am not using git here, so why is the error about a Git URI?
I had the same issue when I used MySQL.
It seems to be an issue with the MySQL JdbcTemplate (look here).
I switched to H2 to store the configuration and it works.
I wonder if there is any workaround to use MySQL?
I got the same error when I attempted to exclude DataSourceAutoConfiguration.class on startup, i.e.
@SpringBootApplication(exclude = {DataSourceAutoConfiguration.class})
When I just used
@SpringBootApplication
everything worked as expected.
My reason for excluding the class was to stop the auto-generation of a password on startup.
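For reference, a minimal config-server entry point without the exclusion might look like the sketch below (the class name is illustrative); the JDBC backend relies on a DataSource, which Spring Boot auto-configures from the spring.datasource.* properties:
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.config.server.EnableConfigServer;

// No exclude of DataSourceAutoConfiguration: the jdbc profile needs the
// auto-configured DataSource in order to register its environment repository.
@SpringBootApplication
@EnableConfigServer
public class ConfigServerApplication {
    public static void main(String[] args) {
        SpringApplication.run(ConfigServerApplication.class, args);
    }
}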

Exception in deserializing avro object in map reduce

I am trying to run a MapReduce job which takes an Avro file as input and does some processing. I followed the sample program Apache has given us here:
http://avro.apache.org/docs/1.7.6/mr.html
But I keep on running into this exception
java.lang.Exception: java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:522)
Caused by: java.lang.NoSuchMethodError: org.apache.avro.generic.GenericData.createDatumWriter(Lorg/apache/avro/Schema;)Lorg/apache/avro/io/DatumWriter;
Any idea what I may be doing wrong? I have specified my pom configuration at the bottom. Also, I am using MapR version 4.
<repositories>
    <repository>
        <id>MapR</id>
        <url>http://repository.mapr.com/maven/.</url>
    </repository>
</repositories>
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-core</artifactId>
        <version>1.2.0</version>
        <scope>provided</scope>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro</artifactId>
        <version>1.7.6</version>
    </dependency>
    <dependency>
        <groupId>org.apache.avro</groupId>
        <artifactId>avro-mapred</artifactId>
        <version>1.7.6</version>
        <classifier>hadoop2</classifier>
    </dependency>
</dependencies>
A common cause of such errors is this: your software was compiled against Avro 1.7.6, but at runtime classes from an older version were probably loaded.
Make sure that 1.7.6 is the actual version of the Avro artifacts on your runtime classpath. Print out the classpath at the start of your mapper. If you're using Oozie, the classpath jars are listed in the launcher job output.
The first Avro jar you see on the classpath is the one that will be used to load the classes, so if it isn't 1.7.6, that's the problem.
You can force your own artifacts to come first in the task's classpath by setting the mapreduce.job.user.classpath.first configuration property to true, as in the sketch below.
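A minimal example of setting that property when submitting the job programmatically (assuming the hadoop2 / new MapReduce API; the job name is illustrative and the usual mapper, reducer and path setup is omitted):
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class AvroJobSetup {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Prefer the jars shipped with the job (e.g. Avro 1.7.6) over the
        // versions already present on the cluster's classpath.
        conf.setBoolean("mapreduce.job.user.classpath.first", true);
        Job job = Job.getInstance(conf, "avro-mr-example");
        // ... set mapper/reducer, input/output formats and paths as usual ...
    }
}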
Also, you have another error in your pom that may very well cause you problems, maybe the very ones you're seeing. You are using the avro-mapred artifact compiled for hadoop2, while the Hadoop artifact you're depending on is for hadoop1. These are not compatible. If you're using hadoop1, drop the hadoop2 classifier on avro-mapred; if you're using hadoop2, remove hadoop-core and use hadoop-mapreduce-client-core instead.
I solved this by injecting the right Avro jar in a bootstrap action, as described here:
https://stackoverflow.com/a/40235289/3487888

Apache Common IO FileUtils Issue

I am trying to use the FileUtils.writeStringToFile() method from Apache Commons IO. Every bit of documentation says that I can do this:
FileUtils.writeStringToFile(File, String with data, boolean append);
I want this method, because I want the data to be written to the end of the file each time.
However, in Eclipse, it keeps telling me that this method does not exist. The only two I have are:
FileUtils.writeStringToFile(File, String with data);
FileUtils.writeStringToFile(File, String with data, String encoding);
I corrected my POM file to now have this dependency:
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.4</version>
</dependency>
Can someone please tell me what I am doing wrong?
Version 1.3.2 doesn't have this method; use a newer version of commons-io:
<dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>2.4</version>
</dependency>
Check the FileUtils 2.4 javadoc
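With 2.4 on the classpath, the append overloads are available (a small sketch; the file path and text are placeholders, and the three-argument (File, String, boolean) overload has also existed since 2.1):
import java.io.File;
import java.io.IOException;
import org.apache.commons.io.FileUtils;

public class AppendExample {
    public static void main(String[] args) throws IOException {
        File file = new File("/tmp/output.log"); // placeholder path
        // The final boolean appends to the file instead of overwriting it.
        FileUtils.writeStringToFile(file, "another line\n", "UTF-8", true);
    }
}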
It turns out I was adding the Tomcat library files as well as the JRE library files to my project; that's why, when I deleted commons-io from my POM, I still had FileUtils available.
I had to remove the Tomcat library files from my build path, and once I put commons-io back in, it worked.