Hive UDF exception - hive

I have a simple Hive UDF:
package com.matthewrathbone.example;
import org.apache.hadoop.hive.ql.exec.Description;
import org.apache.hadoop.hive.ql.exec.UDF;
import org.apache.hadoop.io.Text;
@Description(
  name = "SimpleUDFExample",
  value = "returns 'hello x', where x is whatever you give it (STRING)",
  extended = "SELECT simpleudfexample('world') from foo limit 1;"
)
class SimpleUDFExample extends UDF {
  public Text evaluate(Text input) {
    if (input == null) return null;
    return new Text("Hello " + input.toString());
  }
}
When I execute it using a select query:
select helloUdf(method) from tests3atable limit 10;
method is the name of a column in the tests3atable table.
I am getting the exception below:
FAILED: SemanticException [Error 10014]: Line 1:7 Wrong arguments 'method': Unable to instantiate UDF implementation class com.matthewrathbone.example.SimpleUDFExample: java.lang.IllegalAccessException: Class org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge can not access a member of class com.matthewrathbone.example.SimpleUDFExample with modifiers ""

Declare the class as public; it should work. The corrected declaration is shown below.
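For reference, this is the class from the question with the missing modifier added; only the public keyword changes, and the imports and @Description annotation stay as they are:

public class SimpleUDFExample extends UDF {
  // Being public lets Hive's GenericUDFBridge instantiate the class.
  public Text evaluate(Text input) {
    if (input == null) return null;
    return new Text("Hello " + input.toString());
  }
}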

I also had the same issue. It turned out that Eclipse was not refreshing the program I had modified, so please make sure that the modifications you make in your code get reflected in the jar.

Related

JUnit 5: call method with parameter but receive error: No ParameterResolver registered for parameter

I need to write a JUnit 5 test where the method 'test1' can receive a String parameter.
I also need to call this method again from a new @Test case called 'call_test1WithParameter', passing the parameter.
Is it possible to do that in JUnit 5?
Moreover, when I run the method "test1", I receive this error:
org.junit.jupiter.api.extension.ParameterResolutionException:
No ParameterResolver registered for parameter [java.lang.String arg1] in method [void com.netsgroup.igfs.cg.test.IntegrationTest.MyTest.test1(org.junit.jupiter.api.TestInfo,java.lang.String)].
Is there a way to perform this test case as I described?
Thanks in advance for your support on this.
Below is my test class:
import org.junit.jupiter.api.Order;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.TestInfo;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
class MyTest {
    final static Logger log = LoggerFactory.getLogger(MyTest.class);

    @Test
    @Order(1)
    void test1(TestInfo testInfo, String myOrderId) {
        log.info("Start test case: " + testInfo.getDisplayName());
        if (myOrderId != null)
            log.info("MyOrderId is: " + myOrderId);
    }

    @Test
    @Order(2)
    void call_test1WithParameter(TestInfo testInfo) {
        log.info("Start test case: " + testInfo.getDisplayName());
        test1(testInfo, "OrderId_123456");
    }
}
It is because you specify additional arguments in your test method, and you need to configure how JUnit should resolve the values for those parameters.
Arguments that are provided by JUnit, such as TestInfo, TestReporter or RepetitionInfo, will automatically be resolved without any extra configuration.
But for other arguments, such as the orderId in your case, you have to change your test case to a @ParameterizedTest so that you can use @ValueSource / @EnumSource / @MethodSource / @CsvSource or @ArgumentsSource etc. to define the actual values for the orderId (see this for details).
So changing your test case to the following should solve your problem:
@ParameterizedTest
@ValueSource(strings = {"order1", "order2", "order3"})
@Order(1)
void test1(String myOrderId, TestInfo testInfo) {
    log.info("Start test case: " + testInfo.getDisplayName());
    if (myOrderId != null)
        log.info("MyOrderId is: " + myOrderId);
}
One thing to pay attention to is that the arguments resolved by argument sources (e.g. @ValueSource) should come first in the argument list (see this). A @MethodSource variant is sketched below.
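As a further sketch (not from the original post), the specific order ID from the question could be supplied through @MethodSource instead of @ValueSource; the class and factory method names below are illustrative assumptions:

import java.util.stream.Stream;
import org.junit.jupiter.api.TestInfo;
import org.junit.jupiter.params.ParameterizedTest;
import org.junit.jupiter.params.provider.MethodSource;

class MyOrderTest {

    // Factory method (hypothetical name) that supplies the argument values.
    static Stream<String> orderIds() {
        return Stream.of("OrderId_123456");
    }

    @ParameterizedTest
    @MethodSource("orderIds")
    void test1(String myOrderId, TestInfo testInfo) {
        // Source-resolved argument first, then the JUnit-provided TestInfo.
        System.out.println("Start test case: " + testInfo.getDisplayName()
                + ", MyOrderId is: " + myOrderId);
    }
}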

Kotlin: How to invoke super after some computation?

At present, I have a custom error defined like so:
class IsEvenError(message: String) : Exception(message)

val n = 10
if (n % 2 == 0) {
    throw IsEvenError("$n is even")
}
The problem with this is that I have to manually write out the error message every time I want to throw it.
I want to be able to embed the error message into the class itself, so I can do something like:
throw IsEvenError(n); // this should throw an error saying "10 is even".
How do I accomplish this?
You can change your IsEvenError to accept the number instead of a string, and pass the formatted string to Exception:
class IsEvenError(number: Int) : Exception("$number is even")
fun main() : Unit = throw IsEvenError(10)
Produces:
Exception in thread "main" IsEvenError: 10 is even
at TestKt.main(Test.kt:2)

Bitwise operations in Apache Pig?

I'm looking at the reference manual and can't find any documentation of bitwise operations/functions.
Is there any way to use, for example, a bitwise AND operation (equivalent to "A & B" in Hive) in a Pig script?
You can provide a custom UDF for this, e.g. see https://pig.apache.org/docs/r0.7.0/udf.html
In the Pig script you would do:
REGISTER myudfs.jar;
And here is an example BitwiseAND UDF:
package myudfs;
import java.io.IOException;
import org.apache.pig.EvalFunc;
import org.apache.pig.data.Tuple;
import org.apache.pig.impl.util.WrappedIOException;
public class BitwiseAND extends EvalFunc<Integer>
{
    public Integer exec(Tuple input) throws IOException {
        // check input tuple:
        if (input == null || input.size() < 2)
            return null;
        try {
            return (Integer) input.get(0) & (Integer) input.get(1);
        } catch (Exception e) {
            throw WrappedIOException.wrap("Caught exception processing input row ", e);
        }
    }
}
NOTE: This is not tested; it's just adapted from the Pig UDF page.

Dataflow returns correct type locally, but not when executed in the cloud

Given the following table in BigQuery, with the following 5 values, and a simple ParDo which reads it and prints the type:
import com.google.api.services.bigquery.model.TableRow;
import com.google.cloud.dataflow.sdk.Pipeline;
import com.google.cloud.dataflow.sdk.io.BigQueryIO;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineOptions;
import com.google.cloud.dataflow.sdk.options.DataflowPipelineWorkerPoolOptions;
import com.google.cloud.dataflow.sdk.options.PipelineOptionsFactory;
import com.google.cloud.dataflow.sdk.runners.BlockingDataflowPipelineRunner;
import com.google.cloud.dataflow.sdk.transforms.DoFn;
import com.google.cloud.dataflow.sdk.transforms.ParDo;
public class FloatBug {
    public static void main(String[] args) {
        DataflowPipelineOptions options = PipelineOptionsFactory.create().as(DataflowPipelineOptions.class);
        options.setRunner(BlockingDataflowPipelineRunner.class);
        options.setProject("<project_id>");
        options.setWorkerMachineType("n1-standard-1");
        options.setZone("us-central1-a");
        options.setStagingLocation("<gcs_bucket>");
        options.setNumWorkers(1);
        options.setMaxNumWorkers(1);
        options.setAutoscalingAlgorithm(DataflowPipelineWorkerPoolOptions.AutoscalingAlgorithmType.NONE);

        Pipeline pipeline = Pipeline.create(options);
        pipeline.apply(BigQueryIO.Read.from("FLOAT_BUG.float_bug")).apply(ParDo.of(new DoFn<TableRow, TableRow>() {
            @Override
            public void processElement(ProcessContext c) throws Exception {
                Object o = c.element().get("VHH");
                if (o instanceof Double) {
                    System.out.println("Awesome. Got expected Double: " + o);
                } else if (o instanceof Integer) {
                    System.out.println("Bummer. Got an Integer: " + o);
                } else {
                    assert false;
                }
            }
        }));
        pipeline.run();
    }
}
Running locally gives back a Double for every value. And that is what I would expect:
Awesome. Got expected Double: 2.0
Awesome. Got expected Double: 2.245
Awesome. Got expected Double: 1.773
Awesome. Got expected Double: 4.567
Awesome. Got expected Double: 1.342
However, running in the cloud using the Dataflow service gives back an Integer for the value 2.0:
Awesome. Got expected Double: 2.245
Awesome. Got expected Double: 1.342
Awesome. Got expected Double: 1.773
Awesome. Got expected Double: 4.567
Bummer. Got an Integer: 2
It should return a Double, not an Integer, for 2.0.
The observation is true. A pipeline, which reads input from BigQuery, may output data with a different type than the underlying data type in the BigQuery schema. As observed, the type may also vary from element to element.
This is an unfortunate consequence of the fact that Dataflow Service first exports the data from BigQuery to JSON-encoded files in Google Cloud Storage, and then it reads data from those files. JSON, obviously, doesn't preserve types. For example, a floating point number 2.0 would be encoded as a string "2", which would be read as an Integer in Java. This doesn't occur when executing pipelines with DirectPipelineRunner, because that runner reads from BigQuery directly.
Now, the easiest way to avoid these kinds of problems is via the Number abstract class in Java. This is the superclass of classes like Double and Integer. It should be safe to interpret the result as a Number and then call the doubleValue() method on it, as sketched below.
That said, going forward, I expect this behavior to change. The exact timeline is not known yet, but the behavior of the Dataflow Service should shortly match the local execution. A workaround via the Number class should be correct either way.
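A minimal sketch of that workaround, adapting the DoFn from the question (same "VHH" field; the printed message is illustrative):

pipeline.apply(BigQueryIO.Read.from("FLOAT_BUG.float_bug")).apply(ParDo.of(new DoFn<TableRow, TableRow>() {
    @Override
    public void processElement(ProcessContext c) throws Exception {
        Object o = c.element().get("VHH");
        if (o instanceof Number) {
            // Both an Integer-typed 2 and a Double-typed 2.245 end up as a double here.
            double value = ((Number) o).doubleValue();
            System.out.println("Got value: " + value);
        }
    }
}));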

Cannot find symbol: createStatement()

Why am I getting the error "cannot find symbol: createStatement()"? My goal is to create a t1 Tab object in a TestClass and call t1.Cr() with a String parameter.
I'm not forgetting the import java.sql.*;
public void Cr() throws Exception {
    try {
        ConexaoPlus con = new ConexaoPlus();
        con.conect();
        Statement st = con.createStatement();
        st.execute("CREATE TABLE " + getName() + "();");
        st.execute("DROP TABLE " + getName() + "();");
        st.close();
    } catch (Exception e) {
        e.printStackTrace();
    }
}
Thanks in advance!
A cannot find symbol error may occur for many reasons; you can read this post.
From what I can see in the code you have provided, your class ConexaoPlus does not contain the method createStatement(), or you might not have defined it.
The java.sql package already defines createStatement() on Connection, which you can use (see the sketch below). For more help you will need to share the ConexaoPlus class.
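One possible sketch, assuming ConexaoPlus wraps a JDBC connection: expose the underlying java.sql.Connection and call createStatement() on that Connection rather than on ConexaoPlus itself. The getConnection() accessor, driver URL and credentials below are illustrative assumptions, not from the original post.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class ConexaoPlus {
    private Connection connection;

    public void conect() throws Exception {
        // Hypothetical JDBC URL and credentials; replace with your own.
        connection = DriverManager.getConnection("jdbc:mysql://localhost:3306/mydb", "user", "password");
    }

    // Accessor (not in the original class) that exposes the JDBC Connection.
    public Connection getConnection() {
        return connection;
    }
}

In Cr(), the Statement would then come from the Connection:

Statement st = con.getConnection().createStatement();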
This worked for me.
import java.sql.Statement;