Spark CodeGenerator failed to compile, got NPE, infrequently - apache-spark-sql

I'm running a simple Spark aggregation: I read data from an Avro file as a DataFrame, map the rows to case classes with rdd.map, and then run an aggregation such as count.
Most of the time it works fine, but sometimes it throws a strange CodeGen exception:
[ERROR] 2017-03-24 08:43:20,595 org.apache.spark.sql.catalyst.expressions.codegen.CodeGenerator logError - failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */ return new SpecificUnsafeProjection(references);
/* 003 */ }
/* 004 */
/* 005 */ class SpecificUnsafeProjection extends org.apache.spark.sql.catalyst.expressions.UnsafeProjection {
I am using this code:
val deliveries = sqlContext.read.format("com.databricks.spark.avro").load(deliveryDir)
  .selectExpr("FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd') as day",
    "FROM_UNIXTIME(timestamp/1000, 'yyyyMMdd_HH') as hour",
    "deliveryId"
  )
  .filter("valid = true").rdd
  .map(row => {
    val deliveryId = row.getAs[Long]("deliveryId")
    val uid = row.getAs[Long]("uid")
    val deviceModelId: Integer = if (row.getAs[Integer]("deviceModelId") == null) {
      0
    } else {
      row.getAs[Integer]("deviceModelId")
    }
    val delivery = new DeliveryEvent(deliveryId, row.getAs[Integer]("adId"), row.getAs[Integer]("adSpaceId"), uid, deviceModelId)
    eventCache.getDeliverCache().put(new Element(deliveryId, delivery))
    new InteractedAdInfo(row.getAs[String]("day"), delivery.deliveryId, delivery.adId, delivery.adSpaceId, uid, deviceModelId, deliveryEvent=1)
  })
deliveries.count()
I can't reproduce the problem, but it occurs irregularly in production. I'm running Spark from a Java application, using the Maven coordinates spark-core_2.11:2.1.0 and spark-avro_2.11:3.1.0.
Where might the problem be? I'm setting java -Xms8G -Xmx12G -XX:PermSize=1G -XX:MaxPermSize=1G when running the app.

I'm seeing a similar error with the very simple action spark.read.format("com.databricks.spark.avro").load(fn).cache.count. It is intermittent and shows up with large Avro files (4GB-10GB range in my tests). However, I can eliminate the error by removing the setting --conf spark.executor.cores=4 and letting it default to 1.
WARN TaskSetManager: Lost task 58.0 in stage 2.0 (TID 82, foo.com executor 10): java.lang.RuntimeException:
Error while encoding:
java.util.concurrent.ExecutionException:
java.lang.Exception: failed to compile: java.lang.NullPointerException
/* 001 */ public java.lang.Object generate(Object[] references) {
/* 002 */ return new SpecificUnsafeProjection(references);
/* 003 */ }


Room database query returns null where it should be null-safe in Kotlin

I have a query that does not have a result, when the DB is empty. Therefore NULL is the correct return value.
However, the compiler in Android Studio gives me the warning: Condition 'maxDateTime != null' is always 'true'.
If I debug the code, the null check performs correctly as the value is actually null.
When I rewrite the interface to 'fun queryMaxServerDate(): String?' (notice the question mark), the compiler warning goes away.
But should not 'fun queryMaxServerDate(): String' result in a compilation error since it can be null?
@Dao
interface CourseDao {
    // Get latest downloaded entry
    @Query("SELECT MAX(${Constants.COL_SERVER_LAST_MODIFIED}) from course")
    fun queryMaxServerDate(): String
}

// calling function
/**
 * @return Highest server date in table in milliseconds or 1 on empty/error.
 */
fun queryMaxServerDateMS(): Long {
    val maxDateTime = courseDao.queryMaxServerDate()
    var timeMS: Long = 0
    if (maxDateTime != null) { // Warning: Condition 'maxDateTime != null' is always 'true'
        timeMS = TimeTools.parseDateToMillisOrZero_UTC(maxDateTime)
    }
    return if (timeMS <= 0) 1 else timeMS
}
The code generated from the annotation is Java, so this falls under the Java-interoperation exception to null safety, as described here:
Kotlin's type system is aimed to eliminate NullPointerException's from our code. The only possible causes of NPE's may be:
An explicit call to throw NullPointerException();
Usage of the !! operator that is described below;
Some data inconsistency with regard to initialization, such as when:
    An uninitialized this available in a constructor is passed and used somewhere ("leaking this");
    A superclass constructor calls an open member whose implementation in the derived class uses uninitialized state;
Java interoperation:
    Attempts to access a member on a null reference of a platform type;
    Generic types used for Java interoperation with incorrect nullability, e.g. a piece of Java code might add null into a Kotlin MutableList<String>, meaning that MutableList<String?> should be used for working with it;
    Other issues caused by external Java code.
(from the Kotlin documentation, "Null Safety")
For example, the generated code for queryMaxServerDate() in CourseDao would be along these lines:
@Override
public String queryMaxServerDate() {
    final String _sql = "SELECT max(last_modified) from course";
    final RoomSQLiteQuery _statement = RoomSQLiteQuery.acquire(_sql, 0);
    __db.assertNotSuspendingTransaction();
    final Cursor _cursor = DBUtil.query(__db, _statement, false, null);
    try {
        final String _result;
        if (_cursor.moveToFirst()) {
            final String _tmp;
            _tmp = _cursor.getString(0);
            _result = _tmp;
        } else {
            _result = null;
        }
        return _result;
    } finally {
        _cursor.close();
        _statement.release();
    }
}
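As you can see, the Java method can return null: either there is no first row, or (as with MAX() over an empty table, which yields a single row holding NULL) the column value read from the cursor is NULL.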
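One detail worth checking is what SQLite itself returns for MAX() on an empty table. The sketch below uses Python's built-in sqlite3 module rather than Room, and the column name server_last_modified is only an assumption standing in for Constants.COL_SERVER_LAST_MODIFIED. It suggests that the query yields one row whose single column is NULL, which is why declaring the DAO method as String? is the accurate signature.

```python
import sqlite3

# In-memory database with an empty table; the column name is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (server_last_modified TEXT)")

# An aggregate query without GROUP BY still returns exactly one row.
row = conn.execute("SELECT MAX(server_last_modified) FROM course").fetchone()
print(row)  # (None,)

conn.close()
```

So the null does not come from a missing row here but from the NULL value inside the row.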

RisingEdge example doesn't work for module input signal in Chisel3

In the Chisel documentation there is an example of a rising-edge detection method, defined as follows:
def risingedge(x: Bool) = x && !RegNext(x)
All example code is available in my GitHub project blp.
If I use it on an Input signal declared as follows:
class RisingEdge extends Module {
  val io = IO(new Bundle {
    val sclk = Input(Bool())
    val redge = Output(Bool())
    val fedge = Output(Bool())
  })
  // seems to not work with icarus + cocotb
  def risingedge(x: Bool) = x && !RegNext(x)
  def fallingedge(x: Bool) = !x && RegNext(x)
  // works with icarus + cocotb
  //def risingedge(x: Bool) = x && !RegNext(RegNext(x))
  //def fallingedge(x: Bool) = !x && RegNext(RegNext(x))
  io.redge := risingedge(io.sclk)
  io.fedge := fallingedge(io.sclk)
}
With this Icarus/cocotb testbench:
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import Timer

class RisingEdge(object):
    def __init__(self, dut, clock):
        self._dut = dut
        self._clock_thread = cocotb.fork(clock.start())

    @cocotb.coroutine
    def reset(self):
        short_per = Timer(100, units="ns")
        self._dut.reset <= 1
        self._dut.io_sclk <= 0
        yield short_per
        self._dut.reset <= 0
        yield short_per

@cocotb.test()
def test_rising_edge(dut):
    dut._log.info("Launching RisingEdge test")
    redge = RisingEdge(dut, Clock(dut.clock, 1, "ns"))
    yield redge.reset()
    cwait = Timer(10, "ns")
    for i in range(100):
        dut.io_sclk <= 1
        yield cwait
        dut.io_sclk <= 0
        yield cwait
I never get pulses on io.redge and io.fedge. To get the pulses I have to change the definition of risingedge as follows:
def risingedge(x: Bool) = x && !RegNext(RegNext(x))
[Waveform screenshot: with dual RegNext()]
[Waveform screenshot: with single RegNext()]
Is this normal behavior?
[Edit: I replaced the source example with the GitHub example given above]
I'm not sure about Icarus, but I tried it with the default Treadle simulator, using a test like this:
class RisingEdgeTest extends FreeSpec {
  "debug should toggle" in {
    iotesters.Driver.execute(Array("-tiwv"), () => new SlaveSpi) { c =>
      new PeekPokeTester(c) {
        for (i <- 0 until 10) {
          poke(c.io.csn, i % 2)
          println(s"debug is ${peek(c.io.debug)}")
          step(1)
        }
      }
    }
  }
}
I see the output
[info] [0.002] debug is 0
[info] [0.002] debug is 1
[info] [0.002] debug is 0
[info] [0.003] debug is 1
[info] [0.003] debug is 0
[info] [0.003] debug is 1
[info] [0.004] debug is 0
[info] [0.004] debug is 1
[info] [0.005] debug is 0
[info] [0.005] debug is 1
And the waveform looks like this: [waveform screenshot]
Can you explain what you think this should look like?
Do not change module input value on rising edge of clock.
OK, I found my bug. In the cocotb testbench I toggled the input values on the same edge as the synchronous clock. When we do that, the input changes inside the setup time of the D-latch, so the behavior is undefined!
So the problem was a bug in the cocotb testbench, not in Chisel. To solve it we just have to toggle the values on the other clock edge, like this:
from cocotb.triggers import FallingEdge, Timer

@cocotb.test()
def test_rising_edge(dut):
    dut._log.info("Launching RisingEdge test")
    redge = RisingEdge(dut, Clock(dut.clock, 1, "ns"))
    yield redge.reset()
    cwait = Timer(4, "ns")
    yield FallingEdge(dut.clock)  # <--- 'synchronize' on falling edge
    for i in range(5):
        dut.io_sclk <= 1
        yield cwait
        dut.io_sclk <= 0
        yield cwait
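To make the race concrete, here is a small plain-Python model (not cocotb or Chisel code) of x && !RegNext(x). The function name edge_pulses and the register_sees_new_value flag are made up for illustration, and the register is assumed to start at 0. With the flag set, the register captures the input at the very instant it changes, which is only one possible outcome of toggling on the active clock edge; the real simulator behavior in that case is undefined.

```python
def edge_pulses(samples, register_sees_new_value=False):
    """Cycle-by-cycle model of `x && !RegNext(x)`.

    samples: value of the input x in each clock cycle.
    register_sees_new_value: if True, model the race where the register
    has already captured the value that just changed.
    """
    reg = False  # models RegNext(x); assumed to start at 0
    pulses = []
    for x in samples:
        x = bool(x)
        if register_sees_new_value:
            reg = x  # race: register and input change together
        pulses.append(x and not reg)
        reg = x  # register captures x for the next cycle
    return pulses


sclk = [0, 0, 1, 1, 0, 0, 1, 1]
clean = edge_pulses(sclk)
racy = edge_pulses(sclk, register_sees_new_value=True)
print(clean)  # [False, False, True, False, False, False, True, False]
print(racy)   # [False, False, False, False, False, False, False, False]
```

In the clean case there is a one-cycle pulse on each 0 to 1 transition. In the racy case the register never differs from the input, so no pulse appears, which would be consistent with the double RegNext() variant seeming to work because the second register supplies the missing cycle of delay.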

Did Vulkan-HPP developers change anything in vk::DebugUtilsMessengerEXT creation?

I recently updated my system and tried to recompile my Vulkan app (which uses the Vulkan C++ bindings), and now I get almost no output from vk::DebugUtilsMessengerEXT (except the string "Added messenger"). I set it up to print every kind of callback to std::cout, and before the update it printed lots of information strings. Does anyone know what to do to bring back the debug output?
Here is my debug messenger code:
// ...
    vk::DebugUtilsMessengerCreateInfoEXT messengerInfo;
    messengerInfo.setMessageSeverity(
        vk::DebugUtilsMessageSeverityFlagBitsEXT::eError |
        vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning |
        vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo |
        vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose);
    messengerInfo.setMessageType(
        vk::DebugUtilsMessageTypeFlagBitsEXT::eGeneral |
        vk::DebugUtilsMessageTypeFlagBitsEXT::eValidation |
        vk::DebugUtilsMessageTypeFlagBitsEXT::ePerformance);
    messengerInfo.setPfnUserCallback(callback);
    messengerInfo.setPUserData(nullptr);
    if (instance.createDebugUtilsMessengerEXT(&messengerInfo, nullptr, &debugMessenger, loader) != vk::Result::eSuccess)
        throw std::runtime_error("Failed to create debug messenger!\n");
}

VKAPI_ATTR VkBool32 VKAPI_CALL System::callback(
    VkDebugUtilsMessageSeverityFlagBitsEXT messageSeverity,
    VkDebugUtilsMessageTypeFlagsEXT messageType,
    const VkDebugUtilsMessengerCallbackDataEXT* pCallbackData,
    void* pUserData)
{
    std::cout << pCallbackData->pMessage << '\n';
    return false;
}
"loader" is vk::DispatchLoaderDynamic
Seems like the trouble is not only with Vulkan-Hpp, but also with C Vulkan.
Did some testing and the callbacks seem to be working correctly. I'm thinking the issue here might be the removal of some old INFO messages for performance reasons -- see Vulkan-ValidationLayers commits 18BF5C637 and 523D9C775. If INFORMATION_BIT messages were enabled in SDK releases previous to 1.1.108, you'd have seen a ton of spew. If expected validation errors are not making it to your callback, please create a github issue in the VVL repo and we'll address it immediately.
Here is how I do it, and it works, albeit not with the latest SDK:
void VulkanContext::createInstance()
{
    // create the list of required extensions
    uint32_t glfwExtensionCount = 0;
    const char **glfwExtensions;
    glfwExtensions = glfwGetRequiredInstanceExtensions(&glfwExtensionCount);
    std::vector<const char *> extensions(glfwExtensions, glfwExtensions + glfwExtensionCount);
    extensions.push_back(VK_KHR_GET_PHYSICAL_DEVICE_PROPERTIES_2_EXTENSION_NAME);
    auto layers = std::vector<const char *>();
    if ( enableValidationLayers )
    {
        extensions.push_back(VK_EXT_DEBUG_REPORT_EXTENSION_NAME);
        extensions.push_back(VK_EXT_DEBUG_UTILS_EXTENSION_NAME);
        layers.push_back("VK_LAYER_LUNARG_standard_validation");
    }
    vk::ApplicationInfo appInfo("Vulkan test", 1, "test", 1, VK_API_VERSION_1_1);
    auto createInfo = vk::InstanceCreateInfo(
        vk::InstanceCreateFlags(),
        &appInfo,
        static_cast<uint32_t>(layers.size()),
        layers.empty() ? nullptr : layers.data(),
        static_cast<uint32_t>(extensions.size()),
        extensions.empty() ? nullptr : extensions.data()
    );
    instance = vk::createInstanceUnique(createInfo);
    dispatcher = vk::DispatchLoaderDynamic(instance.get(), vkGetInstanceProcAddr);
    if ( enableValidationLayers )
    {
        auto severityFlags = vk::DebugUtilsMessageSeverityFlagBitsEXT::eError
            | vk::DebugUtilsMessageSeverityFlagBitsEXT::eWarning
            | vk::DebugUtilsMessageSeverityFlagBitsEXT::eVerbose
            | vk::DebugUtilsMessageSeverityFlagBitsEXT::eInfo;
        auto typeFlags = vk::DebugUtilsMessageTypeFlagBitsEXT::eGeneral
            | vk::DebugUtilsMessageTypeFlagBitsEXT::eValidation
            | vk::DebugUtilsMessageTypeFlagBitsEXT::ePerformance;
        messenger = instance->createDebugUtilsMessengerEXTUnique(
            {{}, severityFlags, typeFlags, debugCallback},
            nullptr,
            dispatcher
        );
    }
}
Make sure you're enabling the debug extensions and validation layers.
Check that your loader/dispatcher is initialized correctly.
Try some of the other commands for creating the messenger, not sure but maybe the API changed and the severity flags are passed in the wrong place.
Make sure the validation layers are installed correctly, don't recall dealing with that myself but saw mentions that it can be an issue.

Antlr 4: Is getting this form of output possible?

Within the context of scanning, what do I need to override, extend, listen to, or visit to be able to print this kind of informative output while my text is being scanned?
-- Example output only ---------
DEBUG ... current mode: DEFAULT_MODE
DEBUG ... matching text '#' on rule SHARP ; pushing and switching to DIRECTIVE_MODE
DEBUG ... matching text 'IF' on rule IF ; pushing and switching to IF_MODE
DEBUG ... matching text ' ' on rule WS; skipping
DEBUG ... no match for text %
DEBUG ... no match for text &
DEBUG ... matching text '\r\n' on rule EOL; popping mode; current mode: DIRECTIVE_MODE
...
thanks
The solution was a lot simpler than I thought.
You just need to subclass the generated Lexer and override methods such as popMode(), pushMode() to get the printout you want. If you do this you should also override emit() methods as well to get properly sequential and contextual information.
Here's an example in C#:
class ExtendedLexer : MyGeneratedLexer
{
    public ExtendedLexer(ICharStream input)
        : base(input) { }

    public override int PopMode()
    {
        Console.WriteLine($"Mode is being popped: Line: {Line} Column:{Column} ModeName: {ModeNames[ModeStack.Peek()]}");
        return base.PopMode();
    }

    public override void PushMode(int m)
    {
        Console.WriteLine($"Mode is being pushed: Line: {Line} Column:{Column} ModeName: {ModeNames[m]}");
        base.PushMode(m);
    }

    public override void Emit(IToken t)
    {
        Console.WriteLine($"[#{t.TokenIndex},{t.StartIndex}:{t.StopIndex}, <{Vocabulary.GetSymbolicName(t.Type)}> = '{t.Text}']");
        base.Emit(t);
    }
}
And the output would be something like:
Mode is being pushed: Line: 4 Column:3 ModeName: IF_MODE
[#-1,163:165, <IF> = '#IF']
Mode is being pushed: Line: 4 Column:4 ModeName: CONDITION_MODE
[#-1,166:166, <LPAREN> = '(']
[#-1,167:189, <EXP> = '#setStartDateAndEndDate']
Mode is being popped: Line: 4 Column:28 ModeName: IF_MODE
[#-1,190:190, <RPAREN> = ')']
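The same idea (override the mode hooks, log, then delegate to the base class) can be sketched independently of any runtime. In the Python sketch below, BaseLexer is a hypothetical stand-in for a generated lexer and is not the ANTLR runtime API; the method names push_mode and pop_mode and the mode names are assumptions chosen to mirror the example output in the question.

```python
class BaseLexer:
    """Hypothetical stand-in for a generated lexer (NOT the ANTLR runtime)."""

    mode_names = ["DEFAULT_MODE", "DIRECTIVE_MODE", "IF_MODE"]

    def __init__(self):
        self.mode = 0          # index into mode_names
        self.mode_stack = []   # previously active modes

    def push_mode(self, m):
        self.mode_stack.append(self.mode)
        self.mode = m

    def pop_mode(self):
        self.mode = self.mode_stack.pop()
        return self.mode


class TracingLexer(BaseLexer):
    """Logs every mode change, then delegates to the base implementation."""

    def __init__(self):
        super().__init__()
        self.log = []

    def push_mode(self, m):
        self.log.append(f"pushing and switching to {self.mode_names[m]}")
        super().push_mode(m)

    def pop_mode(self):
        mode = super().pop_mode()
        self.log.append(f"popping mode; current mode: {self.mode_names[mode]}")
        return mode


lexer = TracingLexer()
lexer.push_mode(1)
lexer.push_mode(2)
lexer.pop_mode()
print("\n".join(lexer.log))
```

Keeping the logging in a subclass means the generated lexer can be regenerated from the grammar without losing the tracing code.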

PHP Extension return structure

I am working on a PHP extension and want a PHP function to return a structure, but it always causes a core dump. My steps are:
./ext_skel --extname=test
./configure --enable-test
in php_test.h, add:
typedef struct mydata {
    int m_id;
    int m_age;
} MYDATA;
PHP_FUNCTION(test_getMydata);
In test.c, add:
#define MY_RES_NAME "my_resource"
static int resid;
PHP_FE(test_getMydata, NULL)
...
ZEND_MINIT_FUNCTION(test)
{
    /* If you have INI entries, uncomment these lines
    REGISTER_INI_ENTRIES();
    */
    resid = zend_register_list_destructors_ex(NULL, NULL, MY_RES_NAME, module_number);
    return SUCCESS;
}

PHP_FUNCTION(test_getMydata)
{
    zval* res;
    long int a, b;
    long int result;
    if (zend_parse_parameters(ZEND_NUM_ARGS() TSRMLS_CC, "ll", &a, &b) == FAILURE) {
        return;
    }
    MYDATA objData;
    objData.m_id = a;
    objData.m_age = b;
    ZEND_REGISTER_RESOURCE(res, &objData, resid);
    RETURN_RESOURCE(res);
}
add: var_dump(test_getMydata(3,4)) in test.php
then make; make install; ./php test.php, it prints:
Functions available in the test extension:
confirm_wrap_compiled
test_getMydata
Congratulations! You have successfully modified ext/wrap/config.m4. Module wrap is now compiled into PHP.
Segmentation fault (core dumped)
$ gdb ../../bin/php core.23310
Loaded symbols for /home/user1/php/php-5.2.17/lib/php/extensions/no-debug-non-zts-20060613/test.so
#0 0x00000000006388ad in execute (op_array=0x2a9569bd68) at /home/user1/php/php-5.2.17/Zend/zend_vm_execute.h:92
92 if (EX(opline)->handler(&execute_data TSRMLS_CC) > 0) {
Can someone give some help?
Sorry for the bad formatting in the comment; here is my final answer.
I had to rename the extension from test to hjtest; everything else should be pretty much in line with your posted sample.
tl;dr: the problem, and the cause of the SIGSEGV in your sample, is that you are registering a resource that points to the local variable objData, which is no longer reachable once the function returns. You need to use emalloc to get a piece of dynamic memory that holds your MYDATA.
Since the resource is then bound to a piece of dynamic memory, you need to register a dtor function so you can release (efree) the registered memory.
Hope that helps.
To solve the above issue, modify your resource registration like this:
MYDATA * objData=emalloc(sizeof(MYDATA));
objData->m_id = a;
objData->m_age = b;
ZEND_REGISTER_RESOURCE(return_value, objData, resid);
and add a dtor:
... MINIT
resid = zend_register_list_destructors_ex(resdtor, NULL, MY_RES_NAME, module_number);
and
static void resdtor(zend_rsrc_list_entry *rsrc TSRMLS_DC)
{
    MYDATA *res = (MYDATA*)rsrc->ptr;
    if (res) {
        efree(res);
    }
}
for full sample see this GIST: https://gist.github.com/hjanuschka/3ed54e66f017a379cf25