Netty 4.0: Difference between ChannelInboundByteHandler and ChannelInboundMessageHandler

In Netty 4.0, what is the difference between ChannelInboundByteHandler and ChannelInboundMessageHandler, or between ChannelOutboundByteHandler and ChannelOutboundMessageHandler?

Well, a Channel*ByteHandler operates on a raw "stream" of bytes, while a Channel*MessageHandler operates on decoded messages.
Also see the Netty 4 user guide [1].
[1] http://netty.io/4.0/guide/
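For illustration, here is a rough sketch of the two inbound flavors, based on the adapter classes from the pre-final Netty 4.0 builds the question refers to; treat the exact class and method names as an assumption, since they changed between milestones (and were dropped before 4.0 final).
// Sketch only, against the pre-final Netty 4.0 adapter classes; signatures may
// differ in your exact milestone build.
import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundByteHandlerAdapter;
import io.netty.channel.ChannelInboundMessageHandlerAdapter;
// Byte-oriented handler: sees whatever bytes have arrived so far, with no message framing.
class RawBytesHandler extends ChannelInboundByteHandlerAdapter {
    @Override
    public void inboundBufferUpdated(ChannelHandlerContext ctx, ByteBuf in) throws Exception {
        System.out.println("readable bytes so far: " + in.readableBytes());
    }
}
// Message-oriented handler: sees whole, already-decoded objects (e.g. Strings produced
// by a decoder earlier in the pipeline), one at a time.
class StringMessageHandler extends ChannelInboundMessageHandlerAdapter<String> {
    @Override
    public void messageReceived(ChannelHandlerContext ctx, String msg) throws Exception {
        System.out.println("got message: " + msg);
    }
}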

Related

Fetching results from Celery backend is abnormally slow

I'm using Celery with a Redis broker to do some "heavy" processing for my Django app. Everything is running locally in Docker containers on WSL2.
The tasks output a JSON payload that is roughly 2.5 MB large, and it takes up to 9 seconds to retrieve the result via get() in the Django app. For smaller payloads, the time goes down.
I tried increasing the RAM and CPU for WSL2, up to 6 CPUs and 8 GB of RAM. Celery was configured with --max-memory-per-child=1024000 --concurrency=4.
I've tried different result_backend configurations, with similar results:
Redis
RPC
SQLite with SQLAlchemy
I tried setting a polling interval when using SQLite (it doesn't matter for RPC and Redis), which gave about a 0.5 s improvement: get(interval=0.01)
I also tried changing the result_serializer from JSON to pickle, which gave poorer performance. But I don't think the serializer is the culprit here, as serializing/deserializing the same JSON is pretty fast in the console:
>>> timeit.timeit(lambda: pickle.dumps(big_dict,0), number=10)
0.567067899999528
>>> timeit.timeit(lambda: pickle.loads(str), number=10)
0.3542163999991317
I tried using compression; only zlib seemed to provide a small gain.
I'm not too familiar with this setup, but IMHO I should be able to retrieve results faster. The best I could achieve was 6 seconds. Any idea how to improve this, or how to explain it?
settings.py
CELERY_BROKER_URL = "redis://{host}:{port}/{db}".format(
    host=os.environ.get('REDIS_HOST'),
    port=os.environ.get('REDIS_PORT'),
    db=os.environ.get('CELERY_REDIS_DB')
)
CELERY_RESULT_BACKEND = "redis://{host}:{port}/{db}".format(
    host=os.environ.get('REDIS_HOST'),
    port=os.environ.get('REDIS_PORT'),
    db=os.environ.get('CELERY_REDIS_DB')
)
# CELERY_RESULT_BACKEND = 'db+sqlite:///celery.sqlite'  # SQL example (needs SQLAlchemy==1.4.29 in requirements.txt)
# CELERY_RESULT_BACKEND = 'rpc://localhost'  # RPC example
CELERY_ACCEPT_CONTENT = ['json']
CELERY_TASK_SERIALIZER = 'json'
CELERY_RESULT_SERIALIZER = 'json'
Thanks
In general, Redis has a reputation for being bad at dealing with large objects and is not intended to be a large-object store. You're better off using a general-purpose RDBMS or a file store and returning a key that points to where the JSON can be retrieved.
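A minimal sketch of that approach, assuming a Docker volume shared between the worker and web containers; the task name, directory, and result layout here are made up for illustration:
# Hypothetical sketch: the task writes its large JSON result to shared storage
# and returns only a small key, so get() stays cheap.
import json
import os
import uuid
from celery import shared_task

RESULT_DIR = "/shared/results"  # assumed volume mounted in both containers

@shared_task
def heavy_processing(payload):
    result = {"input": payload, "status": "done"}  # real processing would go here
    key = "{}.json".format(uuid.uuid4())
    with open(os.path.join(RESULT_DIR, key), "w") as f:
        json.dump(result, f)
    return key  # only this small key travels through the result backend

# Django side: get() now returns the key almost instantly, and the large JSON
# is read straight from shared storage instead of from Redis:
# key = heavy_processing.delay(payload).get()
# with open(os.path.join(RESULT_DIR, key)) as f:
#     result = json.load(f)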

Kafka 1.0.0 - Serialized.with() uses default serde instead of the ones provided

We recently updated our Kafka version from 0.10 to 1.0, and I am updating the deprecated code
KTable<Long, myClass> myKTable = this.streamBuilder
    .stream(Serdes.Long(), mySerde, sub_topic)
    .groupByKey(Serdes.Long(), mySerde)
    .reduce(myReducer, my_store);
to this
KTable<Long, myClass> myKTable = this.streamBuilder
    .stream(sub_topic, Consumed.with(Serdes.Long(), mySerde))
    .groupByKey(Serialized.with(Serdes.Long(), mySerde))
    .reduce(myReducer, Materialized.as(my_store));
My stream throws an error while serializing in groupByKey. Serialized.with() does not use the key serde provided and falls back to the default byte-array serde, which then encounters my key (a Long) and throws a cast error.
Has anyone else encountered this error in the 1.0.0 version of Kafka? The first snippet, with the outdated API, works fine, but updating the code to use Serialized.with() does not seem to work. Any help is greatly appreciated.
Can you share the stack trace? I actually think the issue is with reduce() -- you need to specify the Serdes via Materialized again.
This is a bit of a regression in the new API; it got fixed recently in trunk (https://github.com/apache/kafka/pull/4919), so the upcoming 2.0 release will contain the fix.
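A hedged sketch of what that could look like on the 1.0 API, reusing the identifiers from the question (mySerde, myReducer, my_store) and passing the serdes to the state store again through Materialized:
// Sketch only: supplies the key/value serdes to reduce() explicitly via Materialized.
// Needs org.apache.kafka.common.utils.Bytes and org.apache.kafka.streams.state.KeyValueStore.
KTable<Long, myClass> myKTable = this.streamBuilder
    .stream(sub_topic, Consumed.with(Serdes.Long(), mySerde))
    .groupByKey(Serialized.with(Serdes.Long(), mySerde))
    .reduce(myReducer,
        Materialized.<Long, myClass, KeyValueStore<Bytes, byte[]>>as(my_store)
            .withKeySerde(Serdes.Long())
            .withValueSerde(mySerde));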

OrientDB serverside NullPointerException while serializing record using binary protocol

I've just started implementing the binary protocol API for OrientDB in C++. The OrientDB version in use is "orientdb-community-2.2.29", on Windows 10 x64 with Java 1.8. When I query "select * from XXXX" on the example DB, server-side exceptions are thrown and no record is serialized to the client. Here are the logs after a successful connection and query:
2017-12-03 14:14:12:561 INFO {db=Site} /0:0:0:0:0:0:0:1:2520 - Writing bytes (4+0=4 bytes): null [OChannelBinaryServer]
{db=Site} Error on unmarshalling record #73:0 (java.lang.NullPointerException)
java.lang.NullPointerException
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.getRecordBytes(ONetworkProtocolBinary.java:2894)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.writeRecord(ONetworkProtocolBinary.java:2907)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.writeIdentifiable(ONetworkProtocolBinary.java:2697)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.serializeValue(ONetworkProtocolBinary.java:1639)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.command(ONetworkProtocolBinary.java:1584)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.executeRequest(ONetworkProtocolBinary.java:660)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.sessionRequest(ONetworkProtocolBinary.java:394)
at com.orientechnologies.orient.server.network.protocol.binary.ONetworkProtocolBinary.execute(ONetworkProtocolBinary.java:217)
at com.orientechnologies.common.thread.OSoftThread.run(OSoftThread.java:81)
2017-12-03 14:14:12:561 WARNI {db=Site} Cannot serialize record: XXXX#73:0{Name:[2],IDs:[1]} v3 [ONetworkProtocolBinary]
Before writing the "null" bytes, the record ID, position, and record version are serialized and received correctly on the client side; querying from Studio or the console also works like a charm. I've tried changing the class property to STRING or EMBEDDEDMAP, with the same problem.
Thanks in advance for your help :-)
Fortunately I found the mistake on my own: the wrong serialization implementation was configured. The correct value is ORecordSerializerBinary, not ONetworkProtocolBinary.

The receiveBufferSize not being honored. UDP packet truncated

Netty 4.0.24
I am passing XML over UDP. When receiving the UDP packet, the packet is always of length 2048, truncating the message, even though I have attempted to set the receive buffer size to something larger (4096, 8192, 65536); it is not being honored.
I have verified the UDP sender using another UDP ingest mechanism: a standalone Java app using java.net.DatagramSocket. The XML is around 45 KB.
I was able to trace the stack to DatagramSocketImpl.createChannel (line 281). Stepping into DatagramChannelConfig, it has a receiveBufferSize of whatever I set (great), but a rcvBufAllocator of 2048.
Does the rcvBufAllocator override the receiveBufferSize (SO_RCVBUF)? Is the message coming in multiple buffers?
Any feedback or alternative solutions would be greatly appreciated.
I should also mention that I am using an ESB called Vert.x, which uses Netty heavily. Since I was able to trace the issue down into Netty, I was hopeful that I could find help here.
The maximum size of incoming datagrams copied out of the socket is actually not a socket option, but rather a parameter of the socket read() function that your client passes in each time it wants to read a datagram. One advantage of this interface is that programs accepting datagrams of unknown/varying lengths can adaptively change the size of the memory allocated for incoming datagram copies such that they do not over-allocate memory while still getting the whole datagram. (In netty this allocation/prediction is done by implementors of io.netty.channel.RecvByteBufAllocator.)
In contrast, SO_RCVBUF is the size of a buffer that holds all of the datagrams your client hasn't read yet.
Here's an example of how to configure a UDP service with a fixed max incoming datagram size with netty 4.x using a Bootstrap:
import java.net.InetSocketAddress;
import io.netty.bootstrap.Bootstrap;
import io.netty.channel.ChannelOption;
import io.netty.channel.FixedRecvByteBufAllocator;
import io.netty.channel.SimpleChannelInboundHandler;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.DatagramPacket;
import io.netty.channel.socket.nio.NioDatagramChannel;
int maxDatagramSize = 4092;
String bindAddr = "0.0.0.0";
int port = 1234;
SimpleChannelInboundHandler<DatagramPacket> handler = ...;
InetSocketAddress address = new InetSocketAddress(bindAddr, port);
NioEventLoopGroup group = new NioEventLoopGroup();
Bootstrap b = new Bootstrap()
    .group(group)
    .channel(NioDatagramChannel.class)
    .handler(handler);
b.option(ChannelOption.RCVBUF_ALLOCATOR, new FixedRecvByteBufAllocator(maxDatagramSize));
b.bind(address).sync().channel().closeFuture().await();
You could also configure the allocator on an existing channel with ChannelConfig.setRecvByteBufAllocator.
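For instance, assuming you already hold a reference to the bound Channel (called ch here), the same fixed allocator can be set at runtime:
// 'ch' is assumed to be the io.netty.channel.Channel returned by the bind above.
ch.config().setRecvByteBufAllocator(new FixedRecvByteBufAllocator(maxDatagramSize));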

Accumulo-Pig error - Connector info for AccumuloInputFormat can only be set once per job

Versions:
Accumulo 1.5
Pig 0.10
Attempted:
Read/write data in/into Accumulo from Pig, using accumulo-pig.
Encountered an error - any insight into getting past this error is greatly appreciated.
Switching to Accumulo 1.4 is not an option as we are using the Accumulo Thrift Proxy in our C# codebase.
Impact:
This is currently a roadblock in our project.
Source reference:
Source code - https://git-wip-us.apache.org/repos/asf/accumulo-pig.git
Error:
In attempting to read a dataset in Accumulo from Pig, I am getting the following error:
org.apache.pig.backend.executionengine.ExecException: ERROR 2118:
Connector info for AccumuloInputFormat can only be set once per job
Code snippet:
DATA = LOAD 'accumulo://departments?instance=indra&user=root&password=xxxxxxx&zookeepers=cdh-dn01:2181' using org.apache.accumulo.pig.AccumuloStorage() AS (row, cf, cq, cv, ts, val);
dump DATA;
Try using the ACCUMULO-1783-1.5 branch from the same repository. The way that Pig sets up the InputFormat doesn't play nicely with how Accumulo sets up InputFormats (notably, Accumulo makes a funny assertion that you never call the same static method more than once for a given Configuration).
I have been using Pig 0.12 -- I doubt there's a difference in how 0.10 sets up the InputFormats as opposed to 0.12, but I'm not positive; YMMV.
I just pushed a fix to the above branch that gets rid of the previously mentioned limitation on the Hadoop version.
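For illustration, here is a rough sketch of what that assertion looks like at the MapReduce level, assuming Accumulo 1.5's AccumuloInputFormat and made-up credentials; the second call against the same job fails with the message quoted above.
// Illustrative sketch only: shows why configuring the same job twice fails.
// The principal/token values are made up.
import org.apache.accumulo.core.client.mapreduce.AccumuloInputFormat;
import org.apache.accumulo.core.client.security.tokens.PasswordToken;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.mapreduce.Job;

public class ConnectorInfoTwice {
    public static void main(String[] args) throws Exception {
        Job job = new Job(new Configuration());  // older constructor, available on Hadoop 1.x
        AccumuloInputFormat.setConnectorInfo(job, "root", new PasswordToken("xxxxxxx"));
        // The Pig front-end/back-end split ends up doing the equivalent of this
        // second call on the same job, which is what triggers:
        // "Connector info for AccumuloInputFormat can only be set once per job"
        AccumuloInputFormat.setConnectorInfo(job, "root", new PasswordToken("xxxxxxx"));
    }
}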