How to read Hive conf variables in the UDF initialize method - Hive

I am trying to read a Hive conf variable in the initialize() method, but it does not work. Any suggestions?
My UDF Class:
public class MyUDF extends GenericUDTF {

    MapredContext _mapredContext;

    @Override
    public void configure(MapredContext mapredContext) {
        _mapredContext = mapredContext;
        super.configure(mapredContext);
    }

    @Override
    public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
        Configuration conf = _mapredContext.getJobConf();
        // I am getting conf as null here
    }
}

Probably it's too late to answer this question, but for others, below is the answer using SessionState inside a GenericUDF evaluate() method:
@Override
public Object evaluate(DeferredObject[] args) throws HiveException {
    String myconf = null;
    SessionState ss = SessionState.get();
    if (ss != null) {
        HiveConf conf = ss.getConf();
        myconf = conf.get("my.hive.conf");
        System.out.println("sysout.myconf: " + myconf);
    }
    return myconf; // return the value read from the session conf
}
This code was tested on Hive 1.2.
You should also override the configure() method to support MapReduce:
@Override
public void configure(MapredContext context) {
    // ...
    JobConf conf = context.getJobConf();
    if (conf != null) {
        String myhiveConf = conf.get("temp_var");
    }
}
To test the code:
Build the UDF jar.
On the Hive CLI, execute the commands below:
SET hive.root.logger=INFO,console;
SET my.hive.conf=test;
ADD JAR /path/to/the/udf/jar;
CREATE TEMPORARY FUNCTION test_udf AS 'com.example.my.udf.class.qualified.classname';
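The function can then be invoked from the same session; the table and column names below are only placeholders:
SELECT test_udf(some_column) FROM some_table;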

I was also running into this issue with a custom UDTF. It seems that the configure() method is not called on the user-defined function until MapredContext.get() returns a non-null result (see UDTFOperator line 82, for example). MapredContext.get() likely returns null because the Hive job has yet to spin up the mappers/reducers: you can see that MapredContext.get() returns null until the MapredContext.init() method has been called, and since init() takes a boolean isMap as a parameter, it does not get called until MR/Tez runtime (the comment on the GenericUDTF.configure() method confirms this).
TL;DR: the UDF/UDTF initialize() method is called during job setup, and configure() is called at MR runtime, hence the null result in your example code.
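Putting the two callbacks together, here is a minimal sketch against the Hive 1.x APIs; the class name, the my.hive.conf property, and the single-column output are illustrative, not prescribed by Hive:
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hive.ql.exec.MapredContext;
import org.apache.hadoop.hive.ql.exec.UDFArgumentException;
import org.apache.hadoop.hive.ql.metadata.HiveException;
import org.apache.hadoop.hive.ql.session.SessionState;
import org.apache.hadoop.hive.ql.udf.generic.GenericUDTF;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory;
import org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector;
import org.apache.hadoop.hive.serde2.objectinspector.primitive.PrimitiveObjectInspectorFactory;
import org.apache.hadoop.mapred.JobConf;

public class MyConfAwareUDTF extends GenericUDTF {

    private String myConf;

    @Override
    public StructObjectInspector initialize(ObjectInspector[] args) throws UDFArgumentException {
        // Runs at job-setup time: no MapredContext yet, but the CLI
        // session's SessionState is available here.
        SessionState ss = SessionState.get();
        if (ss != null) {
            myConf = ss.getConf().get("my.hive.conf");
        }
        List<ObjectInspector> fieldOIs = new ArrayList<>();
        fieldOIs.add(PrimitiveObjectInspectorFactory.javaStringObjectInspector);
        return ObjectInspectorFactory.getStandardStructObjectInspector(
                Collections.singletonList("conf_value"), fieldOIs);
    }

    @Override
    public void configure(MapredContext mapredContext) {
        // Runs inside the task JVM at MR/Tez runtime; here the JobConf
        // carries the variables that were SET in the session.
        JobConf conf = mapredContext.getJobConf();
        if (conf != null) {
            myConf = conf.get("my.hive.conf");
        }
    }

    @Override
    public void process(Object[] args) throws HiveException {
        forward(new Object[]{myConf});
    }

    @Override
    public void close() throws HiveException {
        // nothing to clean up
    }
}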

Hangfire - DisableConcurrentExecution - Prevent concurrent execution if same value passed in method parameter

Hangfire's DisableConcurrentExecution attribute is not working as expected.
I have one method that can be called with different Ids, and I want to prevent concurrent execution of the method if the same Id is passed.
string jobName = $"{Id} - Entry Job";
_recurringJobManager.AddOrUpdate<EntryJob>(jobName, j => j.RunAsync(Id, null), "0 2 * * *");
My EntryJob class has the RunAsync method:
public class EntryJob : IJob
{
    [DisableConcurrentExecution(3600)] // <-- tried here
    public async Task RunAsync(int Id, SomeObj obj)
    {
        // Some code
    }
}
And the interface looks like this:
[DisableConcurrentExecution(3600)] // <-- tried here
public interface IJob
{
    [DisableConcurrentExecution(3600)] // <-- tried here
    Task RunAsync(int Id, SomeObj obj);
}
Now I want to prevent RunAsync from being called multiple times concurrently when the Id is the same. I have tried putting DisableConcurrentExecution on top of the RunAsync method in both locations, inside the interface declaration and where the interface is implemented.
But it does not seem to work for me. Is there any way to prevent concurrency based on Id?
The existing implementation of DisableConcurrentExecution does not support this. It will prevent concurrent executions of the method with any args. It would be fairly simple to add support in. Note that the below is untested pseudo-code:
public class DisableConcurrentExecutionWithArgAttribute : JobFilterAttribute, IServerFilter
{
    private readonly int _timeoutInSeconds;
    private readonly int _argPos;

    // additional parameter to select which method argument to use for
    // deduplicating jobs
    public DisableConcurrentExecutionWithArgAttribute(int timeoutInSeconds, int argPos)
    {
        if (timeoutInSeconds < 0) throw new ArgumentException("Timeout argument value should be greater than zero.");
        _timeoutInSeconds = timeoutInSeconds;
        _argPos = argPos;
    }

    public void OnPerforming(PerformingContext filterContext)
    {
        var resource = GetResource(filterContext.BackgroundJob.Job);
        var timeout = TimeSpan.FromSeconds(_timeoutInSeconds);
        var distributedLock = filterContext.Connection.AcquireDistributedLock(resource, timeout);
        filterContext.Items["DistributedLock"] = distributedLock;
    }

    public void OnPerformed(PerformedContext filterContext)
    {
        if (!filterContext.Items.ContainsKey("DistributedLock"))
        {
            throw new InvalidOperationException("Can not release a distributed lock: it was not acquired.");
        }
        var distributedLock = (IDisposable)filterContext.Items["DistributedLock"];
        distributedLock.Dispose();
    }

    private string GetResource(Job job)
    {
        // include the selected argument in the locked resource so the lock
        // is unique for a given Id
        return $"{job.Type.ToGenericTypeString()}.{job.Method.Name}.{job.Args[_argPos]}";
    }
}
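Usage would then look like this, a sketch assuming the attribute above is in your project; argPos 0 points at the Id parameter, so only jobs sharing the same Id block each other:
public class EntryJob : IJob
{
    // 3600 s lock timeout; argument position 0 = the Id parameter
    [DisableConcurrentExecutionWithArg(3600, 0)]
    public async Task RunAsync(int Id, SomeObj obj)
    {
        // job body
    }
}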

Is it possible to add completion items to a Microsoft Language Server at runtime?

I am trying to develop an IntelliJ plugin which provides a Language Server with the help of lsp4intellij by Ballerina.
The thing is, I have a special condition: the list of completion items should be editable at runtime.
But I have not found any way to communicate new CompletionItems to the LanguageServer process once it is running.
My current idea is to add an action to the plugin which builds a new jar and then restarts the server with the new jar, using the Java Compiler API.
The problem with that is that I need the source code from the plugin project, including the Gradle dependencies, accessible from the running plugin... any ideas?
If your requirement is to modify the completion items (coming from the language server) before displaying them in the IntelliJ UI, you can do so by implementing LSP4IntelliJ's LSPExtensionManager in your plugin.
Currently we do not have proper documentation for LSP4IntelliJ's extension points, but you can refer to our Ballerina IntelliJ plugin as a reference implementation; it implements the Ballerina LSP extension manager to override/modify completion items at client runtime.
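For illustration, registering such a manager is a one-liner; MyLSPExtensionManager below is a hypothetical class implementing LSP4IntelliJ's LSPExtensionManager, and the file extension is just an example:
// Hypothetical: MyLSPExtensionManager implements LSP4IntelliJ's
// LSPExtensionManager interface and customizes completion handling.
IntellijLanguageClient.addExtensionManager("md", new MyLSPExtensionManager());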
For those who might stumble upon this: it is indeed possible to change the set of CompletionItems the LanguageServer provides at runtime.
I simply edited the TextDocumentService.java (the library I used is LSP4J).
It works like this:
The main function of the LanguageServer needs to be started with an additional argument, which is the path to the config file in which you define the CompletionItems.
Called from LSP4IntelliJ, it would look like this:
String[] command = new String[]{"java", "-jar", "path\\to\\LangServer.jar", "path\\to\\config.json"};
IntellijLanguageClient.addServerDefinition(new RawCommandServerDefinition("md,java", command));
The path String will then be passed to the constructor of your CustomTextDocumentService.java, which parses the config.json in a new Timer thread.
An Example:
public class CustomTextDocumentService implements TextDocumentService {

    private List<CompletionItem> providedItems;
    private String pathToConfig;

    public CustomTextDocumentService(String pathToConfig) {
        this.pathToConfig = pathToConfig;
        Timer timer = new Timer();
        timer.schedule(new ReloadCompletionItemsTask(), 0, 10000);
        loadCompletionItems();
    }

    @Override
    public CompletableFuture<Either<List<CompletionItem>, CompletionList>> completion(CompletionParams completionParams) {
        return CompletableFuture.supplyAsync(() -> {
            List<CompletionItem> completionItems = this.providedItems;
            // Return the list of completion items.
            return Either.forLeft(completionItems);
        });
    }

    @Override
    public void didOpen(DidOpenTextDocumentParams didOpenTextDocumentParams) {
    }

    @Override
    public void didChange(DidChangeTextDocumentParams didChangeTextDocumentParams) {
    }

    @Override
    public void didClose(DidCloseTextDocumentParams didCloseTextDocumentParams) {
    }

    @Override
    public void didSave(DidSaveTextDocumentParams didSaveTextDocumentParams) {
    }

    private void loadCompletionItems() {
        providedItems = new ArrayList<>();
        CustomParser customParser = new CustomParser(pathToConfig);
        ArrayList<String> variables = customParser.getTheParsedItems();
        for (String variable : variables) {
            String itemTxt = "$" + variable + "$";
            CompletionItem completionItem = new CompletionItem();
            completionItem.setInsertText(itemTxt);
            completionItem.setLabel(itemTxt);
            completionItem.setKind(CompletionItemKind.Snippet);
            completionItem.setDetail("CompletionItem");
            providedItems.add(completionItem);
        }
    }

    class ReloadCompletionItemsTask extends TimerTask {
        @Override
        public void run() {
            loadCompletionItems();
        }
    }
}
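For completeness, a sketch of a server entry point that forwards the config path, assuming LSP4J's LSPLauncher and a hypothetical CustomLanguageServer that implements LanguageServer and LanguageClientAware and constructs the CustomTextDocumentService above:
import org.eclipse.lsp4j.jsonrpc.Launcher;
import org.eclipse.lsp4j.launch.LSPLauncher;
import org.eclipse.lsp4j.services.LanguageClient;

public class ServerLauncher {
    public static void main(String[] args) {
        // args[0] is the path to config.json, passed on the command line
        // by the client (see the RawCommandServerDefinition above).
        CustomLanguageServer server = new CustomLanguageServer(args[0]);
        Launcher<LanguageClient> launcher =
                LSPLauncher.createServerLauncher(server, System.in, System.out);
        server.connect(launcher.getRemoteProxy());
        launcher.startListening();
    }
}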

Spring Batch JdbcBatchItemWriter setSql never throws exception

My simple batch flow reads from a CSV file and writes into a MySQL database (the batch configuration is fine and works).
I'm using a custom implementation of JdbcBatchItemWriter to do the job, and I set the update statement in my writer constructor.
CsvReader.java
@Component
@StepScope
public class EducationCsvReader extends FlatFileItemReader {

    public final static String CSV_FILE_NAME = "education.csv.file";

    @Value("#{jobParameters['" + CSV_FILE_NAME + "']}")
    public void setResource(final String csvFileName) throws Exception {
        setResource(new FileSystemResource(csvFileName));
    }

    public EducationCsvReader() {
        setLinesToSkip(1);
        setEncoding("UTF-8");
        setStrict(true);
        setLineMapper((line, num) -> {
            String[] values = line.split(";");
            return new Education()
                    .setName(values[2].trim())
                    .setId(Integer.parseInt(values[0].trim()));
        });
    }
}
My custom JdbcBatchItemWriter: AbstractJdbcBatchItemWriter.java
public abstract class AbstractJdbcBatchItemWriter<T> extends JdbcBatchItemWriter<T> {

    @Autowired
    public AbstractJdbcBatchItemWriter(String SQL_QUERY) {
        setSql(SQL_QUERY);
    }

    @Autowired
    @Override
    public void setItemSqlParameterSourceProvider(
            @Qualifier("beanPropertyItemSqlParameterSourceProvider") ItemSqlParameterSourceProvider provider) {
        super.setItemSqlParameterSourceProvider(provider);
    }

    @Autowired
    @Override
    public void setDataSource(@Qualifier("mysqlDataSource") DataSource dataSource) {
        super.setDataSource(dataSource);
    }
}
And here is my writer implementation: MySQLWriter.java
@Component
public class EducationMysqlWriter extends AbstractJdbcBatchItemWriter<Education> {

    public EducationMysqlWriter() {
        super("");
        try {
            setSql("UPDATE ecole SET nom=:name WHERE id=:id");
        } catch (EmptyResultDataAccessException exception) {
            setSql("INSERT INTO ecole (nom, id) VALUES (:name, :id)");
        }
    }
}
I need to update rows, but if the update fails (EmptyResultDataAccessException) I need to do an insert instead.
But the EmptyResultDataAccessException is shown on the log console and kills the job, and the catch block in MySQLWriter.java is never reached...
JdbcBatchItemWriter#setSql doesn't throw an exception because it doesn't do anything but assign a string to an instance variable. The try block in the constructor has nothing to do with the write method: the constructor is executed once, when the ItemWriter is first instantiated, while the write method is executed once for each chunk of items being processed. If you read the stack trace, I expect you'll see that the JdbcBatchItemWriter is executing its write method and throwing the exception there.
The ItemWriter is not going to be instantiated for each row, so, assuming you will have some rows that need to be inserted and some that need to be updated, setting the SQL string in the constructor does not seem like a good idea.
You could override the ItemWriter#write method, using a separate SQL string for the insert, but it would be easier to use one statement with the MySQL upsert syntax:
INSERT INTO ecole (nom, id) VALUES (:name, :id)
ON DUPLICATE KEY UPDATE nom = :name;
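With the upsert, the writer no longer needs the try/catch at all; a minimal sketch reusing the AbstractJdbcBatchItemWriter from the question, assuming id is the primary key of ecole:
@Component
public class EducationMysqlWriter extends AbstractJdbcBatchItemWriter<Education> {

    public EducationMysqlWriter() {
        // One statement covers both cases: insert new ids, update
        // the name when the id already exists.
        super("INSERT INTO ecole (nom, id) VALUES (:name, :id) "
                + "ON DUPLICATE KEY UPDATE nom = :name");
    }
}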

Global object onStart fails all my tests

I had a number of tests, and they ran perfectly until I wrote a Global object:
@Override
public void onStart(Application app) {
    Mails.plugin = app.plugin(MailerPlugin.class).email();
    Mails.from = app.configuration().getString("smtp.from");
    if (Mails.plugin != null) Logger.info("Mailer plugin successfully loaded");
    if (Mails.from != null) Logger.info("Mail account is " + Mails.from);
}
Here I am loading a plugin for email messages. Now when I try to run my fakeApplication with inMemoryDatabase, I get a NullPointerException. Probably it is because the fakeApplication doesn't use the configuration file and can't load the configuration from it. Please help me sort out this problem.
Try adding custom config parameters in your FakeApplication:
Map<String, Object> additionalConfiguration = new HashMap<String, Object>();
additionalConfiguration.put("smtp.from", "foo@bar.com");
running(fakeApplication(additionalConfiguration), new Runnable() {
    ...
I found the solution: for this reason I created a fake Global class (you can also mock it):
class Global extends GlobalSettings {
}
and then pass it in:
@BeforeClass
public static void startApp() {
    app = Helpers.fakeApplication(new Global());
    Helpers.start(app);
}

What is the reason that Policy.getPolicy() is considered to retain a static reference to the context class loader and can cause a memory leak?

I just read some source code from org.apache.cxf.common.logging.JDKBugHacks and also from
http://svn.apache.org/viewvc/tomcat/trunk/java/org/apache/catalina/core/JreMemoryLeakPreventionListener.java. To keep my question clear and not too broad :)
I will just ask about one piece of code in them.
// Calling getPolicy retains a static reference to the context
// class loader.
try {
    // Policy.getPolicy();
    Class<?> policyClass = Class.forName("javax.security.auth.Policy");
    Method method = policyClass.getMethod("getPolicy");
    method.invoke(null);
} catch (Throwable e) {
    // ignore
}
But I didn't understand this comment: "Calling getPolicy retains a static reference to the context class loader". And they are trying to use JDKBugHacks to work around it.
UPDATE
I overlooked the static block part. Here it is; this is the key. It already caches the policy, so why cache the contextClassLoader as well? A comment claims it is @Deprecated as of JDK version 1.4, replaced by java.security.Policy.
I have double-checked the code of java/security/Policy.java. It really removed the cached class loader. So my doubt is valid! :)
@Deprecated
public abstract class Policy {
    private static Policy policy;
    private static ClassLoader contextClassLoader;
    static {
        contextClassLoader = java.security.AccessController.doPrivileged
            (new java.security.PrivilegedAction<ClassLoader>() {
                public ClassLoader run() {
                    return Thread.currentThread().getContextClassLoader();
                }
            });
    };
I have also added the getPolicy() source code:
public static Policy getPolicy() {
    java.lang.SecurityManager sm = System.getSecurityManager();
    if (sm != null) sm.checkPermission(new AuthPermission("getPolicy"));
    return getPolicyNoCheck();
}

static Policy getPolicyNoCheck() {
    if (policy == null) {
        synchronized (Policy.class) {
            if (policy == null) {
                String policy_class = null;
                policy_class = java.security.AccessController.doPrivileged
                    (new java.security.PrivilegedAction<String>() {
                        public String run() {
                            return java.security.Security.getProperty
                                ("auth.policy.provider");
                        }
                    });
                if (policy_class == null) {
                    policy_class = "com.sun.security.auth.PolicyFile";
                }
                try {
                    final String finalClass = policy_class;
                    policy = java.security.AccessController.doPrivileged
                        (new java.security.PrivilegedExceptionAction<Policy>() {
                            public Policy run() throws ClassNotFoundException,
                                    InstantiationException,
                                    IllegalAccessException {
                                return (Policy) Class.forName
                                    (finalClass, true, contextClassLoader).newInstance();
                            }
                        });
                } catch (Exception e) {
                    throw new SecurityException
                        (sun.security.util.ResourcesMgr.getString
                            ("unable to instantiate Subject-based policy"));
                }
            }
        }
    }
    return policy;
}
Actually, digging deeper, I found something interesting. Someone recently reported a bug to Apache CXF about this piece of code in org.apache.cxf.common.logging.JDKBugHacks.
In order to disable URL caching, JDKBugHacks runs:
URL url = new URL("jar:file://dummy.jar!/");
URLConnection uConn = url.openConnection();
uConn.setDefaultUseCaches(false);
When the java.protocol.handler.pkgs system property is set, that can lead to deadlocks between the system class loader and the file protocol Handler in particular situations (for instance, if the file protocol URLStreamHandler is a singleton).
Besides that, the code above is really there only for the sake of setting defaultUseCaches to false, so actually opening a connection can be avoided, to speed up the execution.
So the fix is:
URL url = new URL("jar:file://dummy.jar!/");
URLConnection uConn = new URLConnection(url) {
    @Override
    public void connect() throws IOException {
        // NOOP
    }
};
uConn.setDefaultUseCaches(false);
It's normal for the JDK or Apache CXF to have some minor bugs, and normally they fix them.
javax.security.auth.login.Configuration has the same issue as Policy, but it is not deprecated.
The Policy class in Java 6 contains a static reference to a class loader that is initialized to the current thread's context class loader on the first access to the class:
private static ClassLoader contextClassLoader;
static {
    contextClassLoader =
        (ClassLoader) java.security.AccessController.doPrivileged
            (new java.security.PrivilegedAction() {
                public Object run() {
                    return Thread.currentThread().getContextClassLoader();
                }
            });
};
Tomcat's lifecycle listener makes sure to initialize this class from within a known environment where the context class loader is set to the system class loader. If this class were first accessed from within a webapp, it would retain a reference to the webapp's class loader. This would prevent the webapp's classes from getting garbage collected, creating a leak of PermGen space.
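The prevention trick itself is small; here is a sketch of what such a listener does, modeled on the JreMemoryLeakPreventionListener approach (pin the context class loader to the system class loader, force the class to initialize, then restore):
// Force javax.security.auth.Policy's static initializer to run while the
// context class loader is the system class loader, so the class caches
// that loader instead of a webapp's class loader.
ClassLoader original = Thread.currentThread().getContextClassLoader();
try {
    Thread.currentThread().setContextClassLoader(ClassLoader.getSystemClassLoader());
    Class.forName("javax.security.auth.Policy", true,
            ClassLoader.getSystemClassLoader());
} catch (ClassNotFoundException e) {
    // ignore: the class may be absent on newer JDKs
} finally {
    Thread.currentThread().setContextClassLoader(original);
}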