Google Cloud Dataflow, BigQueryIO and NullPointerException on TableRow.get

I'm new to GC Dataflow and didn't find a relevant answer here. Apologies if I should have found this already answered.
I'm trying to create a simple pipeline using the v2.0 SDK and am having trouble reading data into my PCollection using BigQueryIO. I am using the .fromQuery method, and I have tested the query in the BigQuery interface, where it seems to work fine. The initial PCollection seems to get created without any issues, but when I then set up a simple ParDo function to convert the values from the TableRow into a PCollection, I get a NullPointerException on the line of code that calls .get on the TableRow object.
Here is my code. (I'm probably missing something simple. I'm a total newbie at Pipeline programming. Any input would be most appreciated.)
public class ClientAutocompletePipeline {
    private static final Logger LOG = LoggerFactory.getLogger(ClientAutocompletePipeline.class);

    public static void main(String[] args) {
        // create the pipeline
        Pipeline p = Pipeline.create(
            PipelineOptionsFactory.fromArgs(args).withValidation().create());
        // A step to read in the product names from a BigQuery table
        p.apply(BigQueryIO.read().fromQuery("SELECT name FROM [beaming-team-169321:Products.raw_product_data]"))
            .apply("ExtractProductNames", ParDo.of(new DoFn<TableRow, String>() {
                @ProcessElement
                public void processElement(ProcessContext c) {
                    // Grab a row from the BigQuery Results
                    TableRow row = c.element();
                    // Get the value of the "name" column from the table row.
                    // NOTE: This is the line that is giving me the NullPointerException
                    String productName = row.get("name").toString();
                    // Make sure it isn't empty
                    if (!productName.isEmpty()) {
                        c.output(productName);
                    }
                }
            }))
The query definitely works in the BigQuery UI and the column called "name" is returned when I test the query. Why am I getting a NullPointerException on this line:
String productName = row.get("name").toString();
Any ideas?

This is a common problem when working with BigQuery and Dataflow (most likely the field is indeed null). If you are ok with using Scala, you could take a look at Scio (which is a Scala DSL for Dataflow) and its BigQuery IO.

Just make your code null safe. Replace this:
String productName = row.get("name").toString();
With something like this:
String productName = String.valueOf(row.get("name"));
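One caveat with this replacement: String.valueOf(Object) returns the four-character text "null" for a null argument, so the isEmpty() check in the question would let that text through to the output. The sketch below illustrates this with a plain Map<String, Object> standing in for TableRow (an assumption made so the example runs without the Dataflow SDK); ValueOfPitfall and viaValueOf are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

final class ValueOfPitfall {
    /** Mirrors the suggested replacement: never throws, but turns null into the text "null". */
    static String viaValueOf(Map<String, Object> row) {
        return String.valueOf(row.get("name"));
    }

    public static void main(String[] args) {
        // Stand-in for a TableRow that has no "name" entry (assumption: a plain Map).
        Map<String, Object> row = new HashMap<>();
        String productName = viaValueOf(row);
        System.out.println(productName);           // prints: null
        System.out.println(productName.isEmpty()); // prints: false
    }
}
```

If the literal text "null" should not reach downstream steps, an explicit null check before calling toString() is probably the safer choice.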

I think I'm late for this, but you can check with something like if (row.containsKey("column-name")).
This tells you whether the field is present in the row.
When BigQuery data is read, a column whose value is null is not included in that particular TableRow, which is why you are getting that error. You can also use something like if (null == row.get("column-name")) to check whether the field is null.
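A minimal sketch of that check, assuming a plain Map<String, Object> as a stand-in for TableRow so that it runs without the Dataflow SDK. NullSafeExtract, columnOrNull and productNames are hypothetical names; in the real pipeline the same test would sit inside processElement before c.output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class NullSafeExtract {
    /** Returns the column as text, or null when the key is absent or maps to null. */
    static String columnOrNull(Map<String, Object> row, String column) {
        Object value = row.get(column); // null for a missing key and for a null value
        return value == null ? null : value.toString();
    }

    /** Collects the non-null, non-empty names, like the filter in the question's ParDo. */
    static List<String> productNames(List<Map<String, Object>> rows) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            String name = columnOrNull(row, "name");
            if (name != null && !name.isEmpty()) {
                names.add(name);
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Object> withName = new HashMap<>();
        withName.put("name", "Widget");
        Map<String, Object> withoutName = new HashMap<>(); // null column: key absent

        List<Map<String, Object>> rows = new ArrayList<>();
        rows.add(withName);
        rows.add(withoutName);
        System.out.println(productNames(rows)); // prints: [Widget]
    }
}
```

A single get followed by a null comparison covers both the missing-key case and the explicit-null case, so a separate containsKey call is not needed here.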

Related

How to make new mono with DTO from mono and flux in spring reactive webflux

Here I try to make a call to the database and combine a Mono and a Flux into a new Mono.
public Mono<ListMovieWithKomenDTO> fetchMovieAndKomen(Integer movieId) {
    Mono<Movie> movie = findById(movieId).subscribeOn(Schedulers.elastic());
    Flux<MovieKomen> movieKomen = getKomenByMovieId(movieId).subscribeOn(Schedulers.elastic());
    return Mono.zip(movie, movieKomen.collectList(), movieMovieKomenDTOBiFunction);
}

private BiFunction<Movie, List<MovieKomen>, ListMovieWithKomenDTO> movieMovieKomenDTOBiFunction = (x1, x2) -> ListMovieWithKomenDTO.builder()
    // .age(x1.getAge())
    .id(x1.getId())
    .name(x1.getName())
    .status(x1.getStatus())
    .detail(x1.getDetail())
    .url(x1.getUrl())
    .movieKomen(x2).build();
Here I make two separate database calls, one for the header (the movie) and one for the details (the movie comments). After retrieving both, I want to combine the Mono and the Flux into a new Mono. To merge them, I created a DTO that holds the data from the movie table and the comment table, but it fails. I assume the error comes from Mono.zip, which I use to combine the data into a single new Mono.
Here is the error from the debug console:
java.lang.IllegalArgumentException: Cannot encode parameter of type org.springframework.r2dbc.core.Parameter
at io.r2dbc.postgresql.ExtendedQueryPostgresqlStatement.bind(ExtendedQueryPostgresqlStatement.java:89) ~[r2dbc-postgresql-0.8.10.RELEASE.jar:0.8.10.RELEASE]
Thank you

The problem was in my repository, where I used:
public interface MovieKomenRepository extends ReactiveCrudRepository<MovieKomen, Integer> {
    @Query("select * from m_movie_komen where m_movie_id = $1")
    Flux<MovieKomen> findByMovieId(int movie_id);
}
In the example above, I used $1 for the parameter in the query. When I changed my code as shown below, it worked like a charm.
public interface MovieKomenRepository extends ReactiveCrudRepository<MovieKomen, Integer> {
    @Query("select * from m_movie_komen where m_movie_id = :movie")
    Flux<MovieKomen> findByMovieId(@Param("movie") int movie_id);
}
So anyone who wants to use my service code can do so, but be careful in the repository: use ':movie' instead of '$1'. The problem was not in the service or in the Mono/Flux handling, but in my repository.
Thank you.

RepoDb cannot find mapping configuration

I'm trying to use RepoDb to query the contents of a table (in an existing SQL Server database), but all my attempts result in an InvalidOperationException (There are no 'contructor parameter' and/or 'property member' bindings found between the resultset of the data reader and the type 'MyType').
The query I'm using looks like the following:
public async Task<ICollection<MyType>> GetAllAsync()
{
    var result = new List<MyType>();
    using (var db = new SqlConnection(connectionString).EnsureOpen())
    {
        result = (await db.ExecuteQueryAsync<MyType>("select * from mytype")).ToList();
    }
    return result;
}
I'm trying to run this via a unit test, similar to the following:
[Test]
public async Task MyTypeFetcher_returns_all()
{
    SqlServerBootstrap.Initialize();
    var sut = new MyTypeFetcher("connection string");
    var actual = await sut.GetAllAsync();
    Assert.IsNotNull(actual);
}
The Entity I'm trying to map to matches the database table (i.e. class name and table name are the same, property names and table column names also match).
I've also tried:
putting annotations on the class I am trying to map to (both at the class level and the property level)
using the ClassMapper to map the class to the db table
using the FluentMapper to map the entire class (i.e. entity-table, all columns, identity, primary)
putting all mappings into a static class which holds all mapping and configuration and calling that in the test
providing mapping information directly in the test via both ClassMapper and FluentMapper
From the error message it seems like RepoDb cannot find the mappings I'm providing. Unfortunately I have no idea how to go about fixing this. I've been through the documentation and the sample tutorials, but I haven't been able to find anything of use. Most of them don't seem to need any mapping configuration (similar to what you would expect when using Dapper). What am I missing, and how can I fix this?

Gson api parsing issue Kotlin

I'm trying to parse the JSON returned by the following API call (recipe and ingredientLines only):
https://api.edamam.com/search?q=khachapuri&app_id=xxx&app_key=yyy
My model for GSON looks like this:
class FoodModel {
    var label: String = "Yummy"
    var image: String = "https://agenda.ge/files/khachapuri.jpg"
    var ingredientLines = ""
}
After launching the app, I'm facing the following error:
com.google.gson.JsonSyntaxException: java.lang.IllegalStateException: Expected BEGIN_ARRAY but was BEGIN_OBJECT at line 1 column 2 path $
I think I'm writing the model class incorrectly, because the structure of the JSON is not clear to me. This is how I'm trying to use Gson:
val foodItems = Gson().fromJson(response, Array<FoodModel>::class.java)
Can anyone help?

The JSON object returned by the API has a slightly different structure compared to your model.
In particular, the API returns a complex object that you need to traverse in order to extract the information you are interested in. A high-level example (I'm not able to test it, but hopefully you'll get the gist of it):
data class Response(
    val hits: List<Hit>
)

data class Hit(
    val recipe: Recipe
)

data class Recipe(
    val label: String,
    val image: String
)

val foodItems = Gson().fromJson(response, Response::class.java)
Just be aware that Gson may create instances in an unsafe manner, which means you may see NullPointerExceptions thrown for no apparent reason. If you want to prove it, just rename image to anything else (you can also try with other fields, it doesn't matter), and you'll see its value is null even though the type is non-nullable.

Can you avoid for loops with solrj?

I was curious whether there is a way to avoid using loops when using SolrJ.
For example, if I were using straight SQL with an appropriate Java library, I could return a query result, cast it to a List, and pass it on up to my view (in a webapp).
SolrJ (SolrQuery and QueryResponse) seems to have no way of returning succinct lists. This would imply I have to create an iterator to go through each returned doc and get the value I want, which isn't ideal.
Is there something I am missing here? Is there a way to avoid these seemingly useless loops?

The SolrJ wiki gives an example that does what you want:
https://wiki.apache.org/solr/Solrj#Reading_Data_from_Solr
Basically:
QueryResponse rsp = server.query( query );
SolrDocumentList docs = rsp.getResults();
List<Item> beans = rsp.getBeans(Item.class);
EDIT:
Based on your comments below, it appears what you want is a non-looping transform of the SOLR response (e.g. a map function in a functional language). Google's guava library provides something like this. My example below assumes that your SOLR response has a "name" field that you want to return a list of:
QueryResponse rsp = server.query(query);
SolrDocumentList docs = rsp.getResults();
List<String> names = Lists.transform(docs, new Function<SolrDocument, String>() {
    @Override
    public String apply(SolrDocument d) {
        return (String) d.get("name");
    }
});
Unfortunately, Java does not support this style of programming very well, so the functional approach ends up being more verbose (and probably less clear) than a simple loop:
QueryResponse rsp = server.query(query);
SolrDocumentList docs = rsp.getResults();
List<String> names = new ArrayList<String>();
for (SolrDocument d : docs) names.add((String) d.get("name"));
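For what it's worth, on Java 8 or later the same projection can be written with the Streams API. The sketch below is an illustration under assumptions rather than anything SolrJ provides: plain Map<String, Object> objects stand in for SolrDocument (which, as far as I know, implements Map<String, Object>), and FieldProjection and fieldValues are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.stream.Collectors;

final class FieldProjection {
    /** Pulls one field out of every document, skipping documents that lack it. */
    static List<String> fieldValues(List<? extends Map<String, Object>> docs, String field) {
        return docs.stream()
                .map(d -> d.get(field))   // Object, or null when the field is absent
                .filter(Objects::nonNull)
                .map(Object::toString)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Stand-ins for the documents returned by rsp.getResults() (assumption).
        Map<String, Object> first = new HashMap<>();
        first.put("name", "alpha");
        Map<String, Object> second = new HashMap<>();
        second.put("id", 2); // no "name" field
        Map<String, Object> third = new HashMap<>();
        third.put("name", "beta");

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(first);
        docs.add(second);
        docs.add(third);
        System.out.println(fieldValues(docs, "name")); // prints: [alpha, beta]
    }
}
```

The iteration still happens inside the stream, so this changes how the code reads rather than the amount of work done.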

Auto generated linq class is empty?

This is a continuation of my previous question: Could not find an implementation of the query pattern
I'm trying to insert a new 'Inschrijving' into my database. I try this with the following code:
[OperationContract]
public void insertInschrijving(Inschrijvingen inschrijving)
{
    var context = new DataClassesDataContext();
    context.Inschrijvingens.InsertOnSubmit(inschrijving);
    context.SubmitChanges();
}
But the context.Inschrijvingens.InsertOnSubmit(inschrijving); gives me the following error:
cannot convert from 'OndernemersAward.Web.Service.Inschrijvingen' to 'OndernemersAward.Web.Inschrijvingen'
I call the method in my main page:
Inschrijvingen1Client client = new Inschrijvingen1Client();
Inschrijvingen i = new Inschrijvingen();
client.insertInschrijvingCompleted += new EventHandler<System.ComponentModel.AsyncCompletedEventArgs>(client_insertInschrijvingCompleted);
client.insertInschrijvingAsync(i);
But as you can see there appears to be something wrong with my Inschrijvingen class, which is auto generated by LINQ. (Auto generated class can be found here: http://pastebin.com/QKuAAKgV)
I'm not entirely sure what is causing this, but I assume it has something to do with the auto generated class not being correct?
Thank you for your time,
Thomas

The problem is that you've got two Inschrijvingen classes - one in the OndernemersAward.Web.Service namespace, and one in the OndernemersAward.Web namespace.
You either need to change the codebase so that you've only got one class, or you need to convert from one type to the other.
