Using named converter in AdditionalCriteria statement - eclipselink

Suppose there is an entity:
class Aaa {
#Convert
Bbb bbb
}
with properly defined named converter (via #Converter).
Now, create additional criteria like this:
#AdditionalCriteria("this.bbb IN :bbbList")
and use like this:
entityManager.setProperty("bbbList", Arrays.asList(Bbb.X, Bbb.Y))
And now the problem: synthetized IN clause looks like:
IN (Bbb.X.toString(), Bbb.Y.toString())
In other words, named converter is not used, instead bbb is matched against the default string one. How can be this fixed?

Related

Open Refine: Exporting nested XML with templating

I have a question regarding the templating option for XML in Open Refine. Is it possible to export data from two columns in a nested XML-structure, if both columns contain multiple values, that need to be split first?
Here's an example to illustrate better what I mean. My columns look like this:
Column1
Column2
https://d-nb.info/gnd/119119110;https://d-nb.info/gnd/118529889
Grützner, Eduard von;Elisabeth II., Großbritannien, Königin
https://d-nb.info/gnd/1037554086;https://d-nb.info/gnd/1245873660
Müller, Jakob;Meier, Anina
Each value separated by semicolon in Column1 has a corresponding value in Column2 in the right order and my desired output would look like this:
<rootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
<recordRootElement>
...
<edm:Agent rdf:about="https://d-nb.info/gnd/1037554086">
<skos:prefLabel xml:lang="zxx">Müller, Jakob</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/1245873660">
<skos:prefLabel xml:lang="zxx">Meier, Anina</skos:prefLabel>
</edm:Agent>
...
</recordRootElement>
<rootElement>
(note: in my initial posting, the position of the root element was not indicated and it looked like this:
<edm:Agent rdf:about="https://d-nb.info/gnd/119119110">
<skos:prefLabel xml:lang="zxx">Grützner, Eduard von</skos:prefLabel>
</edm:Agent>
<edm:Agent rdf:about="https://d-nb.info/gnd/118529889">
<skos:prefLabel xml:lang="zxx">Elisabeth II., Großbritannien, Königin</skos:prefLabel>
</edm:Agent>
)
I managed to split the values separated by ";" for both columns like this
{{forEach(cells["Column1"].value.split(";"),v,"<edm:Agent rdf:about=\""+v+"\">"+"\n"+"</edm:Agent>")}}
{{forEach(cells["Column2"].value.split(";"),v,"<skos:prefLabel xml:lang=\"zxx\">"+v+"</skos:prefLabel>")}}
but I can't find out how to nest the splitted skos:prefLabel into the edm:Agent element. Is that even possible? If not, I would work with seperate columns or another workaround, but I wanted to make sure, if there's a more direct way before.
Thank you!
Kristina
I am going to expand the answer from RolfBly using the Templating Exporter from OpenRefine.
I do have the following assumptions:
There is some other column left of Column1 acting as record identifying column (see first screenshot).
The columns actually have some proper names
The columns URI and Name are the only columns with multiple values. Otherwise we might produce empty XML elements with the following recipe.
We will use the information about records available via GREL to determine whether to write a <recordRootElement> or not.
Recipe:
Split first Name and then URI on the separator ";" via "Edit cells" => "Split multi-valued cells".
Go to "Export" => "Templating..."
In the prefix field use the value
<?xml version="1.0" encoding="utf-8"?>
<rootElement>
Please note that I skipped the namespace imports for edm, skos, rdf and xml.
In the row template field use the value:
{{if(row.index - row.record.fromRowIndex == 0, '<recordRootElement>', '')}}
<edm:Agent rdf:about="{{escape(cells['URI'].value, 'xml')}}">
<skos:prefLabel xml:lang="zxx">{{escape(cells['Name'].value, 'xml')}}</skos:prefLabel>
</edm:Agent>
{{if(row.index - row.record.fromRowIndex == row.record.rowCount - 1, '</recordRootElement>', '')}}
The row separator field should just contain a linebreak.
In the suffix field use the value:
</rootElement>
Disclaimer: If you're keen on using only OpenRefine, this won't be the answer you were hoping for. There may be ways in OR that I don't know of. That said, here's how I would do it.
Edit The trick is to keep URL and literal side by side on one line. b2m's answer below does just that: go from right to left splitting, not from left to right. You can then skip steps 2 and 3, to get the result in the image.
split each column into 2 columns by separator ;. You'll get 4 columns, 1 and 3 belong together, and 2 and 4 belong together. I'm assuming this will be the case consistently in your data.
export 1 and 3 to a file, and export 2 and 4 to another file, of any convenient format, using the custom tabular exporter.
concatenate those two files into one single file using an editor (I use Notepad++), or any other method you may prefer. Several ways to Rome here. Result in OR would be something like this.
You then have all sorts of options to put text strings in front, between and after your two columns.
In OR, you could use transform on column URL to build your XML using the below code
(note the \n for newline, that's probably just a line feed, you may want to use \r\n for carriage return + line feed if you're using Windows).
'<edm:Agent rdf:about="' + value + '">\n<skos:prefLabel xml:lang="zxx">' + cells.Name.value + '</skos:prefLabel>\n</edm:Agent>'
to get your XML in one column, like so
which you can then export using the custom tabular exporter again. Or instead you could use Add column based on this column in a similar manner, if you want to retain your URL column.
You could even do this in the editor without re-importing the file back into OR, but that's beyond the scope of this answer.

how to generate a subfactory based on condition of a parent attribute

I have a factory like so:
class PayInFactory(factory.DjangoModelFactory):
class Meta:
model = PayIn
#factory.lazy_attribute
def card(self):
if self.booking_payment and self.booking_payment.payment_type in [bkg_cts.PAYMENT_CARD, bkg_cts.PAYMENT_CARD_2X]:
factory.SubFactory(
CardFactory,
user=self.user,
)
I'm trying to generate a field card only if the booking_payment field has a payment_type value in [bkg_cts.PAYMENT_CARD, bkg_cts.PAYMENT_CARD_2X]
The code goes into that statement but card field is empty after generation.
How can I do that properly ?
Is SubFactory allowed in lazy_attribute ?
I'd like to be able to modify Card field from PayInFactory if possible like so:
>>> PayInFactory(card__user=some_user)
PostGeneration won't do as I need this Card to be available before the call to create. I overrided _create and it may use the card if available.
Thanks !
The solution lies in factory.Maybe:
class PayInFactory(factory.django.DjangoModelFactory):
class Meta:
model = models.PayIn
card = factory.Maybe(
factory.LazyAttribute(
lambda o: o.booking_payment and o.booking_payment.payment_type in ...
),
factory.SubFactory(
CardFactory,
# Fetch 'user' one level up from CardFactory: PayInFactory
user=factory.SelfAttribute('..user'),
),
)
However, I haven't tested whether the extra params get actually passed to the CardFactory, nor what happens when CardFactory is not called — you'll have to check (and maybe open an issue on the project if you get an unexpected behaviour!).

Add field/string length to logstash event

I'm trying to add a string length field to an index. Ideally, I'd like to use the kibana script feature as I can 'add' this field later but I keep getting a null_pointer_exception with the following code... I'm trying to sort in a visualization based on the fields length.
doc['field'].value ? doc['field'].length() : 0
Is this correct?
I thought it was because my field isn't always set (sparse data), but I added the ?:0 to combat that (which didn't work)
Any ideas?
You can define an scripted field in Kibana, of type int, language painless, and try this:
return (doc['field'].value != null? doc['field'].value.length(): 0);

How to pass variable to groovy code in Intellij IDEA live templates groovy script?

I have a groovyScript in my Intellij IDEA live template, like this :
groovyScript("D:/test.groovy","", v1)
on my D:/test.groovy i have a code like this:
if ( v1 == 'abc') {
'abc'
}
Now I want to pass v1 variable into test.groovy ,can any one help me how can I do this?
For exemplification purposes I made a live template which is printing a comment with the current class and current method.
This is how my live template is defined:
And here is how I edited the variableResolvedWithGroovyScript variable:
The Expression for the given variable has the follwing value:
groovyScript("return \"// Current Class:\" + _1 + \". Current Method:\"+ _2 ", className(),methodName())
As you can see, in this case the _1(which acts like a variable in the groovy script) takes the value of the first parameter which is the class name, and the _2 takes the value of the
second parameter which is the method name. If another parameter is needed the _3 will be used in the groovy script to reference the given parameter.
The arguments to groovyScript macro are bound to script variables named _1, _2 etc. This is also described at groovyScript help at Edit Template Variables Dialog / Live Template Variables.
I found a solution.
I was needing to calculate a CRC32 of the qualified class name using live models
I used it like this:
groovyScript("
def crc = new java.util.zip.CRC32().with { update _1.bytes; value };
return Long.toHexString(crc);
", qualifiedClassName())
then the result is
Based on the documentation, your variables are available as _1, _2, etc. Please note that variables are passed without dollar signs (so only v1 instead of $v1$)
So your test script should look like
if ( _2 == 'abc') {
'abc'
}

REGEX_EXTRACT error in PIG

I have a CSV file with 3 columns: tweetid , tweet, and Userid. However within the tweet column there are comma separated values.
i.e. of 1 row of data:
`396124437168537600`,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.",savava143
I want to extract all 3 fields individually, but REGEX_EXTRACT is giving me an error with this code:
a = LOAD tweets USING PigStorage(',') AS (f1,f2,f3);
b = FILTER a BY REGEX_EXTRACT(f1,'(.*)\\"(.*)',1);
The error is:
error: Filter's condition must evaluate to boolean.
In the use case shared, reading the data using PigStrorage(',') will result in missing savava143 (last field value)
A = LOAD '/Users/muralirao/learning/pig/a.csv' USING PigStorage(',') AS (f1,f2,f3);
DUMP A;
Output : A : Observe that the last field value is missing.
(396124437168537600,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.")
For the use case shared, to extract all the values from CSV file with field values having ',' we can use either CSVExcelStorage or CSVLoader.
Approach 1 : Using CSVExcelStorage
Ref : http://pig.apache.org/docs/r0.12.0/api/org/apache/pig/piggybank/storage/CSVExcelStorage.html
Input : a.csv
396124437168537600,"I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.",savava143
Pig Script :
REGISTER piggybank.jar;
A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVExcelStorage() AS (f1,f2,f3);
DUMP A;
Output : A
(396124437168537600,I really wish I didn't give up everything I did for you, I'm so mad at my self for even letting it get as far as it did.,savava143)
Approach 2 : Using CSVLoader
Ref : http://pig.apache.org/docs/r0.9.1/api/org/apache/pig/piggybank/storage/CSVLoader.html
Below script makes use of CSVLoader(), DUMP A will result in the same output seen earlier.
A = LOAD 'a.csv' USING org.apache.pig.piggybank.storage.CSVLoader() AS (f1,f2,f3);
The error is that you do not want to FILTER based on a regex but GENERATE new fields based on a regex. To filter, you need to know if the line have to be filtered, hence the boolean requirement.
Therefore, you have to use :
b = FOREACH a GENERATE REGEX_EXTRACT(FIELD, REGEX, HOW_MANY_GROUPS_TO_RETURN);
However, as #Murali Rao said, your values are not just coma separated but CSV (think how you will handle a coma in tweet : it is not a field separator, just some content).