Use Java8 Stream on JDBCTemplate Results from HIVE

I am using JdbcTemplate to query Hive and then writing the results to a .csv file. I basically just generate a list of objects and then stream the list to write each record to the file.
I would like to stream the results as they come back from Hive and write them to the file, instead of waiting for the whole result set before processing it. Can anyone point me in the right direction? Thanks!
private List<Avs> queryAvsData(String asSql) {
List<Avs> llistAvs = new ArrayList<Avs>();
List<Map<String, Object>> rows = hiveJdbcTemplate.queryForList(asSql);
Iterator<Map<String, Object>> it = rows.iterator();
while (it.hasNext()) {
Map<String, Object> row = it.next();
Avs laAvs = Avs.builder()
.make((String) row.get("make"))
.model((String) row.get("model"))
.build();
llistAvs.add(laAvs);
}
return llistAvs;
}

It doesn't look like there's a built-in solution, but you can do it. Basically, you wrap the existing functionality in an iterator, and use a spliterator to turn it into a stream. Here's a blog post on the subject:
The code implements Spring’s ResultSetExtractor interface, which is a Single Abstract Method (SAM) interface, allowing the use of a lambda expression to implement it.
The implementation wraps the SQL ResultSet in an iterator, constructs a stream using the Spliterators and StreamSupport utility classes, and applies that to a Function taking a stream of row sets and returning a generic result.
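As a rough sketch of the approach described above (this is not the blog post's code; the class name ResultSetStreams and the RowFunction interface are made up for illustration), the iterator/spliterator wrapping could look something like this using only JDK classes:

```java
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.Spliterator;
import java.util.Spliterators;
import java.util.stream.Stream;
import java.util.stream.StreamSupport;

public final class ResultSetStreams {

    /** Hypothetical helper interface: maps the current row of the ResultSet to a value. */
    @FunctionalInterface
    public interface RowFunction<T> {
        T apply(ResultSet rs) throws SQLException;
    }

    /** Wraps the ResultSet in an Iterator and exposes it as a lazy, sequential Stream. */
    public static <T> Stream<T> stream(ResultSet rs, RowFunction<T> mapper) {
        Iterator<T> rows = new Iterator<T>() {
            private boolean advanced; // true once rs.next() was called for the pending row
            private boolean hasRow;   // result of that rs.next() call

            @Override
            public boolean hasNext() {
                if (!advanced) {
                    try {
                        hasRow = rs.next();
                    } catch (SQLException e) {
                        throw new IllegalStateException(e);
                    }
                    advanced = true;
                }
                return hasRow;
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                advanced = false; // the pending row is consumed by this call
                try {
                    return mapper.apply(rs);
                } catch (SQLException e) {
                    throw new IllegalStateException(e);
                }
            }
        };
        return StreamSupport.stream(
                Spliterators.spliteratorUnknownSize(rows, Spliterator.ORDERED), false);
    }
}
```

With Spring, the stream has to be consumed inside the ResultSetExtractor callback passed to hiveJdbcTemplate.query(...), because the ResultSet is closed once the callback returns. Writing each row to the .csv file from within that callback avoids building the whole list first.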

It's possible to stream values from JdbcTemplate. The following example is a service based on Spring Boot 2.4.8.
I ran into problems (a connection leak) when using queryForStream, so I am posting demo code here to show that the stream must be closed after use.
import lombok.RequiredArgsConstructor;
import org.springframework.jdbc.core.SingleColumnRowMapper;
import org.springframework.jdbc.core.namedparam.NamedParameterJdbcTemplate;
import org.springframework.stereotype.Service;
import java.util.Map;
import java.util.stream.Stream;
@Service
@RequiredArgsConstructor
public class DataCleaningService {
private final NamedParameterJdbcTemplate jdbcTemplate;
public void doSomeStreaming() {
String nativeQuery = "SELECT string_value FROM my_table WHERE column = :valueToFiler";
Map<String, Object> queryParameters = Map.of("valueToFiler", "my value");
SingleColumnRowMapper<String> stringRowMapper = SingleColumnRowMapper.newInstance(String.class);
try (Stream<String> stringValueStream = jdbcTemplate.queryForStream(nativeQuery, queryParameters, stringRowMapper)) {
stringValueStream.forEach(stringValue -> {
// do the needed action with the value
//..
System.out.printf("My cool value: %s", stringValue);
});
}
}
}
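To see why the try-with-resources matters, here is a small JDK-only sketch. The openRows method is a made-up stand-in for a stream that holds a connection; it is not Spring code. It shows that a terminal operation such as forEach does not run the stream's close handler, while try-with-resources does:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.stream.Stream;

public final class StreamCloseDemo {

    /** Hypothetical stand-in for a stream backed by a resource; the flag is set when it is released. */
    public static Stream<String> openRows(AtomicBoolean released) {
        return Stream.of("a", "b").onClose(() -> released.set(true));
    }

    /** Consumes the stream without closing it and reports whether the resource was released. */
    public static boolean consumeWithoutClose() {
        AtomicBoolean released = new AtomicBoolean();
        openRows(released).forEach(value -> { });
        return released.get(); // forEach alone never triggers the onClose handler
    }

    /** Consumes the stream inside try-with-resources and reports whether the resource was released. */
    public static boolean consumeWithClose() {
        AtomicBoolean released = new AtomicBoolean();
        try (Stream<String> rows = openRows(released)) {
            rows.forEach(value -> { });
        }
        return released.get(); // close() ran when the try block exited
    }
}
```

The first method leaves the flag unset, which is the same situation as the leaked connection described above.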

Related

how to reverse the measurement data using MeasurementFilter in Java SDK for Cumulocity Api?

I am using the code below to get the latest measurement for a specific device through the Measurement API, but it's not returning the data in descending order:
import com.cumulocity.sdk.client.measurement.MeasurementFilter;
import com.cumulocity.sdk.client.Platform;
import com.cumulocity.rest.representation.measurement.MeasurementRepresentation;
@Autowired
private Platform platform;
MeasurementFilter filter = new MeasurementFilter().byType("type").bySource("deviceId").byDate(fromDate,dateTo);
Iterable<MeasurementRepresentation> mRep = platform.getMeasurementApi().getMeasurementsByFilter(filter).get().elements(1);
List<MeasurementRepresentation> mRepList = StreamSupport.stream(mRep.spliterator(), false).collect(Collectors.toList());
...
MeasurementFilter api
We can get the latest data first by using 'revert=true' in the HTTP REST URL call:
../measurement/measurements?source={deviceId}&type={type}&dateTo=xxx&dateFrom=xxx&revert=true
How can we use 'revert=true', or some other way, to get the measurements in descending order using the Cumulocity Java SDK? I'd appreciate your help here.

The SDK currently has no out-of-the-box QueryParam for the revert parameter, so you have to create it yourself:
import com.cumulocity.sdk.client.Param;
public class RevertParam implements Param {
@Override
public String getName() {
return "revert";
}
}
Then you can combine it with your query by passing your QueryParam when you call get() on the MeasurementCollection. You are currently not passing anything, but you can pass a pageSize and an arbitrary number of QueryParam arguments.
private Iterable<MeasurementRepresentation> getMeasurementByFilterAndQuery(int pageSize, MeasurementFilter filter, QueryParam... queryParam) {
MeasurementCollection collection = measurementApi.getMeasurementsByFilter(filter);
Iterable<MeasurementRepresentation> iterable = collection.get(pageSize, queryParam).allPages();
return iterable;
}
private Optional<MeasurementRepresentation> getLastMeasurement(GId source) {
QueryParam revertQueryParam = new QueryParam(new RevertParam(), "true");
MeasurementFilter filter = new MeasurementFilter()
.bySource(source)
.byFromDate(new DateTime(0).toDate());
Iterable<MeasurementRepresentation> iterable = measurementRepository.getMeasurementByFilterAndQuery(1, filter, revertQueryParam);
if (iterable.iterator().hasNext()) {
return Optional.of(iterable.iterator().next());
} else {
return Optional.absent();
}
}
Extending your code, it could look like this:
QueryParam revertQueryParam = new QueryParam(new RevertParam(), "true");
MeasurementFilter filter = new MeasurementFilter().byType("type").bySource("deviceId").byDate(fromDate,dateTo);
Iterable<MeasurementRepresentation> mRep = platform.getMeasurementApi().getMeasurementsByFilter(filter).get(1, revertQueryParam);
List<MeasurementRepresentation> mRepList = StreamSupport.stream(mRep.spliterator(), false).collect(Collectors.toList());
What you did with elements() is not incorrect, but it does not limit the API call to returning just one value. It queries with the defaultPageSize (=5) and then limits the result to one element at the Iterable level. The elements() function is meant more for cases where you need more elements than the maxPageSize (=2000): it then requests additional pages automatically, and you can just loop through the Iterable.

Spring Webflux send event when any new data

I'm trying to learn Spring WebFlux and R2DBC. I'm trying a simple use case:
have a book table
create an API (/books) that produces a text event stream and returns Flux<Book>
I'm hoping that when I hit /books once and keep my browser open, any new data inserted into the book table will be sent to the browser.
Scenario 2, still using the book table:
have a book table
create an API (/books/count) that returns the number of rows in book as Mono<Long>
I'm hoping that when I hit /books/count once and keep my browser open, any insert into or delete from the book table will send the new count to the browser.
But it does not work. After I insert new data, nothing is sent to either of my endpoints.
I need to hit /books or /books/count again to get the updated data.
I think I need to use Server-Sent Events to do this? But how do I do that while also querying data? Most samples I found are simple SSE examples that send a string at a fixed interval.
Is there any sample that does this?
Here is my BookApi.java
@RestController
@RequestMapping(value = "/books")
public class BookApi {
private final BookRepository bookRepository;
public BookApi(BookRepository bookRepository) {
this.bookRepository = bookRepository;
}
@GetMapping(produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Flux<Book> getAllBooks() {
return bookRepository.findAll();
}
@GetMapping(value = "/count", produces = MediaType.TEXT_EVENT_STREAM_VALUE)
public Mono<Long> count() {
return bookRepository.count();
}
}
BookRepository.java (R2DBC)
import org.springframework.data.r2dbc.repository.R2dbcRepository;
public interface BookRepository extends R2dbcRepository<Book, Long> {
}
Book.java
@Table("book")
@Data
@AllArgsConstructor
@NoArgsConstructor
public class Book {
@Id
private Long id;
@Column(value = "name")
private String name;
@Column(value = "author")
private String author;
}

Use a Processor or Sink to handle the Book created event.
Check my example using Reactor Sinks, and read this article for the details.
Or use a tailable cursor on a capped MongoDB collection.
A tailable cursor can do the work automatically; check the main branch of the same repository.
My example above uses the WebSocket protocol, but it is easy to switch to SSE or RSocket.
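To illustrate the idea without pulling in Reactor, here is a minimal sketch that uses the JDK's SubmissionPublisher as a stand-in for a Reactor sink. It is not the linked example, and the BookEvents class with its method names is made up. The point is that findAll() completes after emitting the rows that exist at query time, so something has to push a new event to subscribers each time a book is saved:

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.SubmissionPublisher;
import java.util.function.Consumer;

public final class BookEvents implements AutoCloseable {

    // Hot source: each subscriber receives the titles submitted after it subscribed.
    private final SubmissionPublisher<String> publisher = new SubmissionPublisher<>();

    // In-memory stand-in for the book table (an assumption made for this sketch).
    private final List<String> table = new CopyOnWriteArrayList<>();

    /** Stores the title, then notifies every current subscriber. */
    public void save(String title) {
        table.add(title);
        publisher.submit(title);
    }

    /** Current number of stored books. */
    public int count() {
        return table.size();
    }

    /** Registers a listener; the returned future completes once the publisher is closed. */
    public CompletableFuture<Void> subscribe(Consumer<String> listener) {
        return publisher.consume(listener);
    }

    @Override
    public void close() {
        publisher.close();
    }
}
```

In Reactor the corresponding pieces would be a Sinks.Many that the save path emits into and its asFlux() view returned from the SSE endpoint. Either way, only inserts that go through this code path are seen; rows written to the table by another application are not.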

The post below should help with your first requirement:
Spring WebFlux (Flux): how to publish dynamically
Let me know if that helps.

Netty client server login, how to have channelRead return a boolean

I'm writing client server applications on top of netty.
I'm starting with a simple client login server that validates info sent from the client with the database. This all works fine.
On the client side, I want to use if statements once the response is received from the server to check whether the login credentials were validated. That also works fine. My problem is that the channelRead() method does not return anything, and I cannot change that. I need it to return a boolean that makes the login attempt succeed or fail.
Once the channelRead() returns, I lose the content of the data.
I tried adding the msg to a List but, for some reason, the message data is not stored in the List.
Any suggestions are welcome. I'm new... This is the only way I've figured out to do this. I have also tried using boolean statements inside channelRead() but these methods are void so once it closes the boolean variables are cleared.
The following is my last attempt at inserting the message data into the list I created:
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
public class LoginClientHandler extends ChannelInboundHandlerAdapter {
Player player = new Player();
String response;
public volatile boolean loginSuccess;
// Thread-safe list, because channelRead() runs on the Netty event loop thread, not on the caller's thread.
public static final List<Object> incomingMessage = new CopyOnWriteArrayList<>();
@Override
public void channelRead(ChannelHandlerContext ctx, Object msg) throws Exception {
// The StringDecoder in the pipeline has already turned the frame into a String.
response = (String) msg;
System.out.println("channel read response = " + response);
incomingMessage.add(0, msg);
System.out.println("incoming message = " + incomingMessage.get(0));
}
}
How can I get the message data "out" of the channelRead() method, or use this method to trigger a change in my business logic? I want to either display a message telling the client that the login failed and to try again, or succeed and load the next scene. The business logic works fine on its own, but I can't get it to work with Netty because none of the methods return anything I can use to affect it.
ChannelInitializer
import io.netty.channel.ChannelInitializer;
import io.netty.channel.ChannelPipeline;
import io.netty.channel.socket.SocketChannel;
import io.netty.handler.codec.DelimiterBasedFrameDecoder;
import io.netty.handler.codec.Delimiters;
import io.netty.handler.codec.string.StringDecoder;
import io.netty.handler.codec.string.StringEncoder;
public class LoginClientInitializer extends ChannelInitializer<SocketChannel> {
@Override
protected void initChannel(SocketChannel ch) throws Exception {
ChannelPipeline pipeline = ch.pipeline();
pipeline.addLast("framer", new DelimiterBasedFrameDecoder(8192, Delimiters.lineDelimiter()));
pipeline.addLast("decoder", new StringDecoder());
pipeline.addLast("encoder", new StringEncoder());
pipeline.addLast("handler", new LoginClientHandler());
}
}

To get the server to write data to the client, call ctx.write. Here is a basic echo server and client example from the Netty in Action book: https://github.com/normanmaurer/netty-in-action/blob/2.0-SNAPSHOT/chapter2/Server/src/main/java/nia/chapter2/echoserver/EchoServerHandler.java
There are several other good examples in that repo.
I highly recommend reading the "Netty in Action" book if you're starting out with Netty. It will give you a solid foundational understanding of the framework and how it's intended to be used.
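On the original question of getting a result "out" of a void callback, one common pattern is to complete a CompletableFuture from the callback and let the business logic react to it. This is only a sketch under assumptions: the LoginResultBridge class is made up, and the "OK" reply token is invented since the real protocol isn't shown. It uses only JDK classes:

```java
import java.util.concurrent.CompletableFuture;

public final class LoginResultBridge {

    // Completed exactly once, from whichever thread delivers the server reply.
    private final CompletableFuture<Boolean> loginResult = new CompletableFuture<>();

    /** Intended to be called from the handler's channelRead(...) with the decoded reply. */
    public void onServerReply(String reply) {
        // "OK" is an assumed success token; replace it with whatever the server really sends.
        loginResult.complete("OK".equals(reply.trim()));
    }

    /** The business logic attaches callbacks here instead of expecting a return value. */
    public CompletableFuture<Boolean> loginResult() {
        return loginResult;
    }
}
```

The client code would then do something like bridge.loginResult().thenAccept(ok -> ...) to either show the "login failed" message or load the next scene. If that touches a UI toolkit, the callback usually has to hop back onto the UI thread first.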

Graph traversal name to graph name mapping

Is there any API using which I can get graphTraversalName to graphName mapping defined in the script?
I am using the below messy code but it's error-prone if both graphs are using the same underlying storage.
Map<String, String> graphTraversalToNameMap = new ConcurrentHashMap<String, String>();
// Intermediate map: Graph.toString() -> traversal source name
Map<String, String> graphNameTraversalMap = new HashMap<String, String>();
Iterator<String> traversalSourceIterator = graphManager.getTraversalSourceNames().iterator();
while(traversalSourceIterator.hasNext()){
String traversalSource = traversalSourceIterator.next();
String currentGraphString = ( (GraphTraversalSource) graphManager.getAsBindings().get(traversalSource)).getGraph().toString();
graphNameTraversalMap.put(currentGraphString, traversalSource);
}
Iterator<String> graphNamesIterator = graphManager.getGraphNames().iterator();
while(graphNamesIterator.hasNext()){
String graphName = graphNamesIterator.next();
String currentGraphString = graphManager.getGraph(graphName).toString();
String traversalSource = graphNameTraversalMap.get(currentGraphString);
graphTraversalToNameMap.put(traversalSource, graphName);
}
Does gremlinExecutor.getScriptEngineManager().getBindings().entrySet() provide an ordering guarantee? If so, I could iterate over it and populate my map.

Is there any API using which I can get graphTraversalName to graphName mapping defined in the script?
No. They share the same namespace in Gremlin Server so the relationship gets lost programmatically. You would need to do something like what you are doing but I wouldn't rely on toString() of a Graph for equality. Perhaps use the Graph instance itself? Although that might not work either depending on your situation and what you want for equality as you could have two different Graph configurations pointed at the same data and want to resolve those as the same graph. I'm also not sure that any approach will work generally for all graph systems. Anyway, I think I'd experiment with using Map<Graph, String> graphTraversalToNameMap for your case and see how that goes.
Does gremlinExecutor.getScriptEngineManager().getBindings().entrySet() provide an ordering guarantee?
No, as it is backed by a ConcurrentHashMap. You would have to provide your own order.
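Here is a minimal sketch of the "use the Graph instance itself" idea, written generically so it doesn't depend on TinkerPop. The GraphNameResolver class and its parameters are made up for illustration. It joins the two lookups on object identity instead of toString():

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

public final class GraphNameResolver {

    /**
     * Joins "traversal source name -> graph" with "graph name -> graph" on the
     * graph instance itself (reference identity), giving "traversal source name -> graph name".
     */
    public static <G> Map<String, String> traversalToGraphName(
            Map<String, G> graphByTraversal, Map<String, G> graphByName) {
        // IdentityHashMap compares keys with ==, so it does not rely on equals() or toString().
        Map<G, String> nameByGraph = new IdentityHashMap<>();
        graphByName.forEach((name, graph) -> nameByGraph.put(graph, name));

        Map<String, String> result = new HashMap<>();
        graphByTraversal.forEach((traversal, graph) -> {
            String name = nameByGraph.get(graph);
            if (name != null) { // skip traversal sources whose graph is not registered by name
                result.put(traversal, name);
            }
        });
        return result;
    }
}
```

As noted above, this only resolves traversal sources that were built from the very same Graph object. Two separate Graph instances pointing at the same storage would still be treated as different graphs.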

Underlying storage details can be obtained from the configuration object and used for the mapping. Sample code:
public class GraphTraversalMappingUtil {
public static void populateGraphTraversalToNameMapping(GraphManager graphManager){
if(graphTraversalToNameMap.size() != 0){
return;
}
Iterator<String> traversalSourceIterator = graphManager.getTraversalSourceNames().iterator();
Map<StorageBackendKey, String> storageKeyToTraversalMap = new HashMap<StorageBackendKey, String>();
while(traversalSourceIterator.hasNext()){
String traversalSource = traversalSourceIterator.next();
StorageBackendKey key = new StorageBackendKey(
graphManager.getTraversalSource(traversalSource).getGraph().configuration());
storageKeyToTraversalMap.put(key, traversalSource);
}
Iterator<String> graphNamesIterator = graphManager.getGraphNames().iterator();
while(graphNamesIterator.hasNext()) {
String graphName = graphNamesIterator.next();
StorageBackendKey key = new StorageBackendKey(
graphManager.getGraph(graphName).configuration());
graphTraversalToNameMap.put(storageKeyToTraversalMap.get(key), graphName);
}
}
}
For full code, refer: https://pastebin.com/7m8hi53p

Only show effective SQL string P6Spy

I'm using p6spy to log the sql statements generated by my program. The format for the outputted spy.log file looks like this:
current time|execution time|category|statement SQL String|effective SQL string
I'm just wondering if anyone knows if there's a way to alter the spy.properties file and have only the last column, the effective SQL string, output to the spy.log file? I've looked through the properties file but haven't found anything that seems to support this.
Thanks!

In spy.properties there is a property called logMessageFormat that you can set to a custom implementation of MessageFormattingStrategy. This works for any type of logger (i.e. file, slf4j etc.).
E.g.
logMessageFormat=my.custom.PrettySqlFormat
An example using Hibernate's pretty-printing SQL formatter:
package my.custom;
import org.hibernate.jdbc.util.BasicFormatterImpl;
import org.hibernate.jdbc.util.Formatter;
import com.p6spy.engine.spy.appender.MessageFormattingStrategy;
public class PrettySqlFormat implements MessageFormattingStrategy {
private final Formatter formatter = new BasicFormatterImpl();
@Override
public String formatMessage(int connectionId, String now, long elapsed, String category, String prepared, String sql) {
return formatter.format(sql);
}
}

No option is provided yet to achieve this via configuration alone. I think you have 2 options here:
file a new bug/feature request (which could benefit others using p6spy as well) at: https://github.com/p6spy/p6spy/issues?state=open or
provide a custom implementation.
For the latter option, I believe you could achieve it with your own class (depending on the logger you use; let's assume you use Log4jLogger).
If you check the relevant part of Log4jLogger in the github as well as the sourceforge version, your implementation should be rather straightforward:
spy.properties:
appender=com.EffectiveSQLLog4jLogger
Implementation itself could look like this:
package com;
import com.p6spy.engine.logging.appender.Log4jLogger;
public class EffectiveSQLLog4jLogger extends Log4jLogger {
public void logText(String text) {
super.logText(getEffectiveSQL(text));
}
private String getEffectiveSQL(String text) {
if (null == text) {
return null;
}
final int idx = text.lastIndexOf("|");
// non-perfect detection of the exception logged case
if (-1 == idx) {
return text;
}
return text.substring(idx + 1); // + 1 skips the "|" delimiter itself
}
}
Please note the implementation should cover github (new project home, no version released yet) as well as sourceforge (original project home, released 1.3 version).
Please note: I didn't test the proposal myself, but it could be a good starting point and from the code review itself I'd say it could work.
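The string handling can at least be checked on its own without p6spy. Here is a small standalone sketch; the EffectiveSqlExtractor class name is made up, and the sample lines in the usage note are invented to follow the column layout from the question:

```java
public final class EffectiveSqlExtractor {

    /** Returns the text after the last '|' of a log line, or the line itself if it has none. */
    public static String effectiveSql(String line) {
        if (line == null) {
            return null;
        }
        int idx = line.lastIndexOf('|');
        // idx + 1 starts right after the delimiter, so the '|' itself is not included.
        return idx < 0 ? line : line.substring(idx + 1);
    }
}
```

For example, effectiveSql("12:00|3|statement|select ?|select 1") yields "select 1". One caveat: if the SQL itself contains a "|" character (for instance the || concatenation operator), lastIndexOf will cut the statement in the wrong place, so a MessageFormattingStrategy that receives the sql argument directly is the more robust route.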

I agree with @boberj. We are used to having logs with the Hibernate formatter, but don't forget about batching; that's why I suggest using:
import com.p6spy.engine.spy.appender.MessageFormattingStrategy;
import org.hibernate.engine.jdbc.internal.BasicFormatterImpl;
import org.hibernate.engine.jdbc.internal.Formatter;
/**
* Created by Igor Dmitriev on 1/3/16
*/
public class HibernateSqlFormatter implements MessageFormattingStrategy {
private final Formatter formatter = new BasicFormatterImpl();
@Override
public String formatMessage(int connectionId, String now, long elapsed, String category, String prepared, String sql) {
if (sql.isEmpty()) {
return "";
}
String template = "Hibernate: %s %s {elapsed: %sms}";
String batch = "batch".equals(category) ? ((elapsed == 0) ? "add batch" : "execute batch") : "";
return String.format(template, batch, formatter.format(sql), elapsed);
}
}

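In P6Spy 3.9 this can be achieved quite simply. In spy.properties set
logMessageFormat=com.p6spy.engine.spy.appender.CustomLineFormat
customLogMessageFormat=%(effectiveSql)
The customLogMessageFormat setting only takes effect when logMessageFormat is set to CustomLineFormat.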

You can patch com.p6spy.engine.spy.appender.SingleLineFormat.java
removing the prepared element and any reference to P6Util like so:
package com.p6spy.engine.spy.appender;
public class SingleLineFormat implements MessageFormattingStrategy {
@Override
public String formatMessage(final int connectionId, final String now, final long elapsed, final String category, final String prepared, final String sql) {
return now + "|" + elapsed + "|" + category + "|connection " + connectionId + "|" + sql;
}
}
Then compile just that file, with the existing jar on the classpath so that MessageFormattingStrategy resolves:
javac -cp p6spy.jar com/p6spy/engine/spy/appender/SingleLineFormat.java
And replace the existing class file in p6spy.jar with the new one.
