How to change Hudi table version via Hudi CLI - apache-hudi

How do I change the table version via the Hudi CLI?
Steps:
ssh into EMR
kick off the hudi cli /usr/lib/hudi/cli/bin/hudi-cli.sh. Version of the Hudi CLI is 1.
connect to my table connect --path s3://bucket/db/table
In the desc of the table I see that it is version=3, but I want to use Hudi 0.9.0 to write to the table so I would like to set the table to version=2.
org.apache.hudi.exception.HoodieException: Unknown versionCode:3
at org.apache.hudi.common.table.HoodieTableVersion.lambda$versionFromCode$1(HoodieTableVersion.java:54)
at java.util.Optional.orElseThrow(Optional.java:290)
at org.apache.hudi.common.table.HoodieTableVersion.versionFromCode(HoodieTableVersion.java:54)
at org.apache.hudi.common.table.HoodieTableConfig.getTableVersion(HoodieTableConfig.java:246)

Sadly, I'm not aware of any way to use version 0.9.0 to downgrade 3 to 2, due to the error you are getting. There is no way for version 0.9.0 to know how 0.10.0 was writing things differently.
Recently, AWS has 6.6 available for use, but it isn't well documented. I'd recommend switching over to that, because it has hudi version 0.10.0 and can then do that downgrade.
This link should get updated whenever 6.6 gets updated in the docs.
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-release-app-versions-6.x.html
Side note, if you are using the bootstrap action script provided by AWS to repair the log4j vulnerability, I'd recommend taking the version 6.5 version provided and editing it to be 6.6. There is not a 6.6 script available at this time, but I did that and was not able to detect any vulnerabilities.
This link provides an explanation on the bootstrap action:
https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-log4j-vulnerability.html

Related

How to downgrade version from 4.5.X to 4.4.X

I upgraded the mettermost version from 4.4.1 to 4.5.
However, I got the error...
So, I tried to downgrade the version to 4.4.1, but the mattermost said "The database schema of the older version isn't supported".
Does anyone know how we should modify the database schema.
※browser: FireFox ESR 52
Unfortunately you can't downgrade a Mattermost install to a previous version through any officially supported path. You will need to revert to the database backup from before the upgrade in order to go back to version 4.4.
However, for the specific versions in question (reverting from 4.5 to 4.4) it is possible to do a downgrade as there are no database schema changes between these two versions. You will need to run the following SQL command in your Mattermost database:
UPDATE Systems SET Value='4.4.0' WHERE Name='Version';
You should then be able to run Mattermost 4.4.1 successfully on that database.

Apache NiFi Hive Processors with Hive 1.1 (CDH 5.7.1)

I work with Cloudera Manager CDH 5.7.1, which supports only Hive 1.1.0.
NiFi 1.0.0-BETA uses Hive 1.2.1.
When I try to use SelectHiveQL processor, I get the following error: Required field 'client_protocol' is unset!, which means that there's a version mismatch between Hive client and server.
Any suggestions to solve this problem?
I thought about building NiFi with hive-jdbc dependency version 1.1.0 instead of the default 1.2.1, but I hope there's a better solution.
Since NiFi is an Apache project, it builds with Apache JARs (such as Hive and Hadoop). However there are vendor-specific profiles and build properties you can use to build NiFi for a particular Hadoop distribution.
For example you could try the following to build a NiFi distro for CDH 5.7.1:
mvn clean install -DskipTests -Pcloudera -Dhadoop.version=2.6.0-cdh5.7.1 -Dhive.version=1.1.0-cdh5.7.1 -Dhbase.version=1.2.0-cdh5.7.1
The Hive processors use Hadoop libraries provided by the NiFi Hadoop Libraries NAR, and other NARs (like the Hadoop/HDFS processors) use this same libraries NAR, so the best approach is to build the whole thing. Otherwise you can try to replace just the Hadoop/Hive/HBase-related NARs and see if that works.
Because NiFi expects the newer version of Hive, it is necessary to remove the unsupported newer features (such as HiveStreaming and ORC support), support the older version of Thrift, and build against the Cloudera-specific libraries.
I have created a branch of the current NiFi-1.1.x release with the necessary changes to get the PutHiveQL and SelectHiveQL processors to work, which you could build as below:
git clone https://github.com/Chaffelson/nifi.git
git checkout nifi-1.1.x-cdhHiveBundle
mvn -T C2.0 clean install -Pcloudera -Dhive.version=1.1.0-cdh5.10.0 -Dhive.hadoop.version=2.6.0-cdh5.10.0 -Dhadoop.version=2.6.0-cdh5.10.0 -DskipTests
I have posted a more complete coverage of this on the Hortonworks Community forum: https://community.hortonworks.com/articles/93771/connecting-nifi-to-cdh-hive.html

Error while executing select query in Hive - how to update Hadoop version

I am currently using hadoop 1.0.3 version. I recently installed Apache Hive to run with it. I was running the select * query which gave me an NoSuchMethodError: org.apache.hadoop.mapred.JobConf.unset
I further found out its a compatibility issue with my current version of hadoop and requires me to upgrade to 1.2 or later.
I am fairly new to hadoop and would like to upgrade my current version to 1.2 or later. How do I go about doing the same.
I could not find any resources online to do so.
Thanks.
Just download hadoop 1.2.x from here and do necessary configuration changes in your new hadoop. Change HADOOP_HOME to point to your new hadoop folder.
NOTE: Change all the environmental variables (including .bashrc) to point to your new hadoop.

Spring data redis mock

I need to do integration testing for a spring cloud application running with spring data on redis.
Tests work locally with the regular redis server instance and I need to run this on a Jenkins CI server that is controlled by the corporate CI engineering group.
Obviously I can attach to a redo server there so I used an embedded redis server (from here: https://github.com/kstyrc/embedded-redis).
Running tests locally with this redis server works well since there is a test profile to inject the embedded server in place of the production one.
Now the problem is that when we run this in the Jenkins environment this is the error we see.
/tmp/1430170830037-0/redis-server-2.8.19: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by /tmp/1430170830037-0/redis-server-2.8.19)
So this version of redis has specific dependency on a specific version of glibc. I tried a couple of other libraries but they all depend on the same underlying version of the embedded redis server.
Is there a spring data mock framework that can be used to get around this sort of issue?
This might come a little late for you, but there is indeed a Spring Data Mock framework that you can use, which let's you mock repositories (regardless of the specific backend solution) without a real database connection.
Here is a link: https://github.com/mmnaseri/spring-data-mock
You don't have a high enough version of libc6, that is causing the error.
From How to fix “/lib/x86_64-linux-gnu/libc.so.6: version `GLIBC_2.14' not found”? – Super User:
That means the program was compiled against glibc version 2.14, and it requires that version to run, but your system has an older version installed. You'll need to either recompile the program against the version of glibc that's on your system, or install a newer version of glibc (the "libc6" package in Debian).
So, you just need to upgrade your libc6 package. All versions of Ubuntu have at least version 2.15 because it's a faily important package (reference).
To upgrade it, use these commands in a terminal:
sudo apt-get update
sudo apt-get install libc6
p.s. This is answer from askubuntu.com by minerz029

Where to download Hive 0.12 source?

I have raised a beeline bug and would like to test the patch, so I'm trying to recompile Hive 0.12 with the patch, but the problem that it seems Apache only host versions 0.13.1+:
http://www.apache.org/dyn/closer.cgi/hive/
Anybody knows a place to find older versions (0.12)?
I think you can find the source code you're looking for here Apache Hive releases page
Now it seems to be hosted on GitHub.