Trouble iterating over resultset and generating json - sql

I am struggling a bit trying to generate some JSON from a query I execute. All of the groovy JsonBuilder examples I've looked at only seem to deal with statically defining a dataset.
code:
def db = new Sql(datasource)
def builder = new JsonBuilder()

db.eachRow('SELECT t.day, t.start FROM mytable') { row ->
    builder.days {
        day(
            date row.day
        )
    }
}

println builder.toString()
At one point I had it printing only the last value in the resultset.
Currently I am receiving the following error:
unexpected token: $ @ line 46, column 18.
    date row.day
I'm still a bit of a novice at Groovy; any help is greatly appreciated.

I generally prefer to present JsonBuilder with a complete object rather than use the DSL, so my solution would look something like this:
def map = [days: []]
def db = new Sql(dataSource)

db.eachRow('SELECT t.day, t.start FROM mytable') { row ->
    map.days << [day: [date: row.day]]
}

println new JsonBuilder(map).toString()
If you have a large number of results, this approach has the advantage of not forcing you to compile a huge list of GroovyRowResult objects, only a huge list of much smaller LinkedHashMap objects.

The builder there does not open a list, add items, and then close it; you have to provide everything in a single go, e.g. by collecting all rows as maps:
def builder = new groovy.json.JsonBuilder()
def dbresult = [1, 2, 3]

builder {
    days dbresult.collect {
        [day: [date: it]]
    }
}

println builder
Use db.rows to get the list. You may have to experiment with what happens if you just pass the result in directly; it might need a cast to a Map, or you may have to do the mapping yourself.
If your row count is very high, you might be better off with another library that doesn't require you to materialize the whole list up front.
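For reference, a sketch of the db.rows variant described above (it assumes the same dataSource, table and columns as in the question):

import groovy.json.JsonBuilder
import groovy.sql.Sql

def db = new Sql(dataSource)

// db.rows materializes the whole resultset as a List of GroovyRowResult
def rows = db.rows('SELECT t.day, t.start FROM mytable t')

def builder = new JsonBuilder()
builder {
    days rows.collect { row ->
        [day: [date: row.day]]
    }
}

println builder.toPrettyString()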

Related

Lucene calculate term vectors for existing index

With Lucene.NET I would like to get the term vectors as described in this Stack Overflow question.
The problem is, the index is already generated with the field indexed and stored, but without term vectors.
FieldType type = new FieldType();
type.setIndexed(true);
type.setStored(true);
type.setStoreTermVectors(false);
Theoretically, it should be possible to re-calculate the term vectors for each document and then store it in the index.
Do you know how this could be possible, without deleting the complete Lucene index?
As mentioned in my comments in the question, you can generate term vector data on-the-fly, which may help you to avoid a complete rebuild of your indexed data.
In my scenario, I want to find the offset positions of my search term in the matched document.
I don't want to oversell this approach - it's absolutely not a substitute for re-indexing - but if your queries are basic, it may help.
Step 1: Perform whatever query you are currently performing.
For each document in the list of hits, you will then need to re-process the relevant field from that document - so, either you already have the field data stored in your existing index, or you will need to retrieve it from its original source.
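For step 1, fetching the stored field for each hit might look roughly like this (a sketch only; the index path, field name and query are placeholders, and it assumes the field was stored in the index):

using Lucene.Net.Documents;
using Lucene.Net.Index;
using Lucene.Net.Search;
using Lucene.Net.Store;

// Open the existing index and pull back the stored field for each hit,
// so it can be re-analyzed in step 2.
using var dir = FSDirectory.Open("/path/to/index");
using var reader = DirectoryReader.Open(dir);
var searcher = new IndexSearcher(reader);

var hits = searcher.Search(new TermQuery(new Term("myField", "bar")), 10);
foreach (var scoreDoc in hits.ScoreDocs)
{
    Document doc = searcher.Doc(scoreDoc.Doc);
    string fieldContent = doc.Get("myField"); // null if the field was not stored
    // fieldContent is what gets re-processed with the analyzer in step 2
}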
Step 2: For each such field, you can re-use the same analyzer to build a token stream on-the-fly. The token stream can be configured with different attributes, such as:
character term attributes (ICharTermAttribute)
offset attributes (IOffsetAttribute)
and others (see the Lucene.Net.Analysis.TokenAttributes namespace)
Example:
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Analysis.TokenAttributes;
using Lucene.Net.Util;

const LuceneVersion AppLuceneVersion = LuceneVersion.LUCENE_48;

String? fieldName = null;
String fieldContent = "Foo Bar Baz Bar Bat";
String searchTerm = "bar";

var analyzer = new StandardAnalyzer(AppLuceneVersion);
var ts = analyzer.GetTokenStream(fieldName, fieldContent);
var charTermAttr = ts.AddAttribute<ICharTermAttribute>();
var offsetAttr = ts.AddAttribute<IOffsetAttribute>();

try
{
    ts.Reset();
    Console.WriteLine("");
    Console.WriteLine("Token: " + searchTerm);
    while (ts.IncrementToken())
    {
        if (searchTerm.Equals(charTermAttr.ToString()))
        {
            var start = offsetAttr.StartOffset;
            var end = offsetAttr.EndOffset;
            Console.WriteLine(String.Format(" > offset: {0}-{1}", start, end));
        }
    }
    ts.End();
}
catch (Exception)
{
    throw;
}
The above example assumes one of the hits from step 1 was a field containing "Foo Bar Baz Bar Bat" - with a search term of bar.
The output generated is:
Token: bar
> offset: 4-7
> offset: 12-15
So, as you can see, you are not re-executing a query - you are just re-processing a token stream. The more complex the original search term is, the harder it will be to make this approach work the way you probably need it to.

How to properly create and use dynamic Xpath in JSON (Page Object Model) - Karate DSL

For example, I have this sample JSON object in the pages folder, which contains all the XPaths for a specific page.
{
  "pageTitle1": "//*[@class='page-title' and text()='text1']",
  "pageTitle2": "//*[@class='page-title' and text()='text2']",
  "pageTitle_x": "//*[@class='page-title' and text()='%s']"
}
* def pageHome = read('classpath:/pages/pageHome.json')
* click(pageHome.pageTitle_x) <-- how to properly replace %s in the string?
Update: I tried the replace function, not sure if this is the proper way.
* click(pageHome.pageTitle_x.replace("%s","new value"))
First a bit of advice. Trying to be "too clever" like this causes maintainability problems in the long run. I have said a lot about this here, please read it: https://stackoverflow.com/a/54126724/143475
That said, you can write re-usable JS functions that will do all these things:
* def pageTitle = function(x){ return "//*[@class='page-title' and text()='" + x + "']" }
Now using that you can do this:
* click(pageTitle('foo'))
If you redesign the function even this may be possible:
* click(pageTitle(pageHome.pageTitle_x, 'foo'))
But see how things become more complicated and less readable. The choice is yours. Note that anything you can do in JS (e.g. String.replace()) will be possible; it is up to you and your creativity.
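For instance, the redesigned two-argument variant hinted at above could be a simple template substitution (a sketch; it assumes the %s placeholder convention from the page JSON):

* def pageTitle = function(xpath, value){ return xpath.replace('%s', value) }
* def pageHome = read('classpath:/pages/pageHome.json')
* click(pageTitle(pageHome.pageTitle_x, 'foo'))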

Slick plain sql query with pagination

I have something like this, using Akka, Alpakka + Slick
Slick
  .source(
    sql"""select #${onlyTheseColumns.mkString(",")} from #${dbSource.table}"""
      .as[Map[String, String]]
      .withStatementParameters(rsType = ResultSetType.ForwardOnly, rsConcurrency = ResultSetConcurrency.ReadOnly, fetchSize = batchSize)
      .transactionally
  ).map( doSomething )...
I want to update this plain SQL query to skip the first N elements.
But that is very DB specific.
Is it possible to have the pagination bit generated by Slick? (As with type-safe queries, where one can just do a drop, filter, take?)
PS: I don't have the schema, so I cannot go the type-safe way; I just want all tables as Maps and to filter, drop, etc. on them.
PS2: at the Akka level, flow.drop works, but it's not optimal (slow), because it still consumes the rows.
Cheers
Since you are using plain SQL, you have to provide workable SQL in the code snippet. Plain SQL may not be type-safe, but it is flexible.
BTW, the most efficient way is to skip the first N elements in the database itself, for example with LIMIT/OFFSET in MySQL.
Depending on your database engine, you could use something like:
val page = 1
val pageSize = 10

val query = sql"""
  select #${onlyTheseColumns.mkString(",")}
  from #${dbSource.table}
  limit #${pageSize + 1}
  offset #${pageSize * (page - 1)}
"""
The pageSize + 1 part tells you whether the next page exists.
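A tiny sketch of that check (the rows value is hypothetical and stands for one fetched page of results):

val rows: Seq[Map[String, String]] = ??? // result of running the query above
val hasNextPage = rows.size > pageSize
val pageRows    = rows.take(pageSize)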
I want to update this plain sql query with skipping the first N-th element. But that is very DB specific.
As you're concerned about changing the SQL for different databases, I suggest you abstract away that part of the SQL and decide what to do based on the Slick profile being used.
If you are working with multiple database products, you've probably already abstracted away from any specific profile, perhaps using JdbcProfile. In that case you could place your "skip N elements" helper in a class and use the active slickProfile to decide on the SQL to use. (As an alternative you could of course check via some other means, such as an environment value you set.)
In practice that could be something like this:
case class Paginate(profile: slick.jdbc.JdbcProfile) {
  // Return the correct LIMIT/OFFSET SQL for the current Slick profile
  def page(size: Int, firstRow: Int): String =
    if (profile.isInstanceOf[slick.jdbc.H2Profile]) {
      s"LIMIT $size OFFSET $firstRow"
    } else if (profile.isInstanceOf[slick.jdbc.MySQLProfile]) {
      s"LIMIT $firstRow, $size"
    } else {
      // And so on... or a default
      // Danger: I've no idea if the above SQL is correct - it's just placeholder
      ???
    }
}
Which you could use as:
// Import your profile
import slick.jdbc.H2Profile.api._

val paginate = Paginate(slickProfile)

val action: DBIO[Seq[Int]] =
  sql""" SELECT cols FROM table #${paginate.page(100, 10)}""".as[Int]
In this way, you get to isolate (and control) RDBMS-specific SQL in one place.
To make the helper more usable, and as slickProfile is implicit, you could instead write:
def page(size: Int, firstRow: Int)(implicit profile: slick.jdbc.JdbcProfile): String =
  ??? // Logic for deciding on SQL goes here
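A sketch of what the call site could then look like (assuming an implicit profile in scope and the same api._ import as above):

implicit val profile: slick.jdbc.JdbcProfile = slick.jdbc.H2Profile

val action: DBIO[Seq[Int]] =
  sql"""SELECT cols FROM table #${page(100, 10)}""".as[Int]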
I feel obliged to comment that using a splice (#$) in plain SQL opens you to SQL injection attacks if any of the values are provided by a user.

Slick Plain Sql Generic Return Type

I am trying to write a configurable SQL query executor using Slick. A user provides a prepared statement with ? placeholders, and at run time the exact query is formed by replacing the ? with values.
Generally, this is how one would run a plain SQL query using Slick:
val query = sql"#$queryString".as[(String,Int)]
In my case I would not know the result type, so I want to get back a generic result type - maybe a list of tuples, with each tuple representing a row of the result set.
Any ideas on how this would be done?
I found a solution in one of the Scala GitHub issues. Here it is:
import slick.jdbc.{GetResult, PositionedResult}

object ResultMap extends GetResult[Map[String, Any]] {
  def apply(pr: PositionedResult) = {
    val resultSet = pr.rs
    val metaData = resultSet.getMetaData()
    (1 to pr.numColumns).map { i =>
      metaData.getColumnName(i) -> resultSet.getObject(i)
    }.toMap
  }
}
and then we can simply do:
val query = sql"#$queryString".as(ResultMap)
Hope it helps!!
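A minimal usage sketch (it assumes a Slick Database instance named db and the usual profile api._ import are already in scope):

import scala.concurrent.Await
import scala.concurrent.duration._

val query = sql"#$queryString".as(ResultMap)
val rows: Seq[Map[String, Any]] = Await.result(db.run(query), 10.seconds)
rows.foreach(println)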

Can I pretty-print the DBIC_TRACE output in DBIx::Class?

Setting the DBIC_TRACE environment variable to true:
BEGIN { $ENV{DBIC_TRACE} = 1 }
generates very helpful output, especially showing the SQL query that is being executed, but the SQL query is all on one line.
Is there a way to push it through some kinda "sql tidy" routine to format it better, perhaps breaking it up over multiple lines? Failing that, could anyone give me a nudge into where in the code I'd need to hack to add such a hook? And what the best tool is to accept a badly formatted SQL query and push out a nicely formatted one?
"nice formatting" in this context simply means better than "all on one line". I'm not particularly fussed about specific styles of formatting queries
Thanks!
As of DBIx::Class 0.08124 it's built in.
Just set $ENV{DBIC_TRACE_PROFILE} to console or console_monochrome.
From the documentation of DBIx::Class::Storage
If DBIC_TRACE is set then trace information is produced (as when the
debug method is set). ...
debug
  Causes trace information to be emitted on the debugobj object (or STDERR if debugobj has not specifically been set).
debugobj
  Sets or retrieves the object used for metric collection. Defaults to an instance of DBIx::Class::Storage::Statistics that is compatible with the original method of using a coderef as a callback. See the aforementioned Statistics class for more information.
In other words, you should set debugobj in that class to an object that subclasses DBIx::Class::Storage::Statistics. In your subclass, you can reformat the query the way you want it to be.
First, thanks for the pointers! Partial answer follows ....
What I've got so far ... first some scaffolding:
# Connect to our db through DBIx::Class
my $schema = My::Schema->connect('dbi:SQLite:/home/me/accounts.db');
# See also BEGIN { $ENV{DBIC_TRACE} = 1 }
$schema->storage->debug(1);
# Create an instance of our subclassed (see below)
# DBIx::Class::Storage::Statistics class
my $stats = My::DBIx::Class::Storage::Statistics->new();
# Set the debugobj object on our schema's storage
$schema->storage->debugobj($stats);
And the definition of My::DBIx::Class::Storage::Statistics being:
package My::DBIx::Class::Storage::Statistics;
use base qw<DBIx::Class::Storage::Statistics>;

use Data::Dumper qw<Dumper>;
use SQL::Statement;
use SQL::Parser;

sub query_start {
    my ($self, $sql_query, @params) = @_;

    print "The original sql query is\n$sql_query\n\n";

    my $parser = SQL::Parser->new();
    my $stmt = SQL::Statement->new($sql_query, $parser);
    #printf "%s\n", $stmt->command;

    print "The parameters for this query are:";
    print Dumper \@params;
}

1;
Which solves the problem of how to hook in and get the SQL query for me to "pretty-ify".
Then I run a query:
my $rs = $schema->resultset('SomeTable')->search(
    {
        'email'           => $email,
        'others.some_col' => 1,
    },
    { join => 'others' }
);
$rs->count;
However, SQL::Parser barfs on the SQL generated by DBIx::Class:
The original sql query is
SELECT COUNT( * ) FROM some_table me LEFT JOIN others other_table ON ( others.some_col_id = me.id ) WHERE ( others.some_col_id = ? AND email = ? )
SQL ERROR: Bad table or column name '(others' has chars not alphanumeric or underscore!
SQL ERROR: No equijoin condition in WHERE or ON clause
So ... is there a better parser than SQL::Parser for the job?
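In the meantime, a crude fallback that avoids a full SQL parser is to break the one-line query before the major keywords inside query_start (a sketch only, within the same subclass as above; the keyword list is illustrative):

sub query_start {
    my ($self, $sql_query, @params) = @_;

    # Naive "pretty-ify": insert line breaks before common SQL keywords.
    (my $pretty = $sql_query) =~ s/\s+(FROM|LEFT JOIN|JOIN|WHERE|GROUP BY|ORDER BY|HAVING|LIMIT)\b/\n  $1/gi;

    print "$pretty\n";
    print Dumper \@params;
}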