How to custom tag word(s) in GATE JAPE grammar?

I have a set of documents, and each document has a different heading. For example, if the document heading says "Psychological Evaluation", I want to tag the document as "Medicalrule".
I loaded the document and loaded ANNIE with defaults.
Then I went to Processing Resources > New > Jape Transducer.
I wrote the following code in a text document and saved it with the .jape extension:
Phase: ConjunctionIdentifier
Input: Token Split
Rule: Medicalrule
(
({Token.string=="Psychological"})+({Token.string == " "})+ ({Token.string == "Evaluation"}):Meddoc({Token.kind=="word"})
)
-->
:Meddoc
{
  gate.AnnotationSet matchedAnns = (gate.AnnotationSet) bindings.get("Meddoc");
  gate.FeatureMap newFeatures = Factory.newFeatureMap();
  newFeatures.put("rule", "Medicalrule");
  annotations.add(matchedAnns.firstNode(), matchedAnns.lastNode(), "CC", newFeatures);
}
I loaded the .jape file created above and reinitialized.
After the application runs, the annotation set does not show the tag.
Am I doing something wrong? It would be great if someone could help me with this.
I appreciate your time.
Thank you.

I'm sure there is no annotation that matches Token.string == " ". Try using a SpaceToken annotation instead.
Also, why not try gazetteers instead of hardcoding text values in the JAPE code?

There are three issues I can see here.
First, as ashingel says, spaces are not represented as Token annotations - this is deliberate as in most cases you don't care about the spacing between words, only the words themselves.
Second, the trailing ({Token.kind=="word"}) means that the rule will only match when "Psychological Evaluation" is followed by another word before the end of the current sentence (because you've got Split in the Input line).
Third, you're only binding the Meddoc label to the "Evaluation" token, not to the whole match.
I would try and simplify the LHS of the rule:
Phase: ConjunctionIdentifier
Input: Token Split
Rule: Medicalrule
(
{Token.string=="Psychological"}
{Token.string == "Evaluation"}
):meddoc
For the RHS: (a) you don't need the explicit bindings.get, because you've used a labelled block and so already have the bound annotations available (as meddocAnnots with the label above); (b) you should use outputAS instead of annotations; and (c) you should generally avoid the add method that takes nodes, as it isn't safe if the input and output annotation sets are different. If you're using a recent snapshot of GATE, then the gate.Utils static methods can help you a lot here:
:meddoc {
  Utils.addAnn(outputAS, meddocAnnots, "CC",
      Utils.featureMap("rule", "Medicalrule"));
}
If you're using 7.1 or earlier then the addAnn method isn't available so it's slightly more convoluted:
:meddoc {
  try {
    outputAS.add(Utils.start(meddocAnnots), Utils.end(meddocAnnots), "CC",
        Utils.featureMap("rule", "Medicalrule"));
  } catch(InvalidOffsetException e) { // can't happen, but won't compile without
    throw new JapeException(e);
  }
}
Finally, just to check, you did definitely add your new JAPE Transducer PR to the end of the pipeline?


Check if an input field is empty or not is not working properly in Cypress tests

I have two step definitions in Cypress that check whether an input field is empty or not (which branch applies depends on how I phrase the sentence, which I match with a RegEx).
My problem was that Cypress said the test failed because the input field was empty when it was not.
My defined steps:
/** We check if the input field with the given name is empty */
Given(/^The input field "(.*)" is (not )?empty$/, (inputFieldName, negation) => {
  if (negation === 'not ') {
    CypressTools.getByName(inputFieldName).should('not.be.empty');
  } else {
    CypressTools.getByName(inputFieldName).should('be.empty');
  }
});
/** We check if the input field with the given name is visible and empty */
Given(/^The input field "(.*)" is visible and empty$/, (inputFieldName) => {
  CypressTools.getByName(inputFieldName).should('be.visible').should('be.empty');
});
In my specific test, Cypress should check an input field that is filled with a value, and the step is written like this:
The input field "XYZ" is not empty
I can see that the if-condition works fine, so there is no problem on the definition or RegEx side.
But the test fails because Cypress says the input field is empty when it is not.
I suspect that Cypress tests input fields for content between the input tags, but doesn't check for a value attribute in the input tag.
So I tried adding an invoke('val') in the step definition:
CypressTools.getByName(inputFieldName).invoke('val').should('not.be.empty');
This works for the first step definition, but when I do the same for the second one as well, the Cypress test fails and tells me this:
Timed out retrying: You attempted to make a chai-jQuery assertion on an object that is neither a DOM object or a jQuery object.
The chai-jQuery assertion you used was:
> visible
The invalid subject you asserted on was:
>
To use chai-jQuery assertions your subject must be valid.
This can sometimes happen if a previous assertion changed the subject.
I don't understand the problem here. Is this method valid with invoke() or is there a better solution to cover all cases?
Thanks a lot for your help.
The problem your error message is pointing to is that the subject being passed along the command chain is not appropriate for the next step:
CypressTools.getByName(inputFieldName)
  .invoke('val')            // changes subject to the text of the input
                            // (not a DOM element)
  .should('be.visible')     // needs a DOM element
  .should('not.be.empty');
The surest way around it is to break the testing into two steps:
CypressTools.getByName(inputFieldName).should('be.visible');
CypressTools.getByName(inputFieldName)
  .invoke('val')
  .should('not.be.empty');
but I think a simple reordering will also work:
CypressTools.getByName(inputFieldName)
  .should('be.visible')     // check the DOM element, passes it on as subject
  .invoke('val')            // changes subject to the text of the input
  .should('not.be.empty');  // check the text is not empty

Last iteration of "forEach" loop adds extra period

I'm new to Kotlin and, for practice, I had to use a "forEach" loop to print this from a text file:
*** Welcome to Taernyl's Folly ***
Dragon's Breath................5.91
Shirley temple.................4.12
Goblet of la croix.............1.22
Pickled camel hump.............7.33
Iced boilermaker..............11.22
The data in the file looks like this:
shandy,Dragon's Breath,5.91
elixir,shirley temple,4.12
meal,goblet of la croix,1.22
desert,pickled camel hump,7.33
elixir,iced boilermaker,11.22
So I saved each line in a list called 'menuFile' and then iterated through it to print it out like the above menu using this code:
println("*** Welcome to Taernyl's Folly ***")
menuFile.forEach {
    val (type, name, price) = it.split(",")
    val x = 34 - (price.length + name.length)
    var dots = ""
    val dot = "."
    var padding = 0
    while (padding <= x) {
        dots += dot
        padding++
    }
    println("${name.capitalize()}$dots$price")
}
The issue is that for some reason on the last iteration of the loop it always adds an extra period so that the last line of the "menu" is always not even with the rest of the items on the menu. It doesn't matter which of the items I put last it always adds an extra one.
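As Tenfour04 wrote above, you have a problem with the carriage return (CR) character.
If you print price.length, you will see that 7.33 has the same length as 11.22 :) Every line except the last one ends with an invisible CR, so its price is one character longer than it looks and the line gets one dot fewer than the last line.
If you don't know how to remove it, you can just download the file bignerdranch.com/solutions/tavern-menu-data.txt instead of copying and pasting :)
You can also use "repeat" to generate the dots.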
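The two suggestions above (remove the stray CR, build the dots with repeat) can be sketched as follows. This is a Java illustration rather than the asker's Kotlin; Kotlin strings offer trim() and repeat() for the same purpose. The class name MenuFormatter and the width constant are assumptions made for this sketch (the width of 35 is inferred from the sample menu), and String.strip() and String.repeat() need Java 11 or later.

```java
import java.util.List;

class MenuFormatter {
    // Total line width, inferred from the sample menu in the question (an assumption).
    static final int LINE_WIDTH = 35;

    /** Formats one "type,name,price" line as "Name.....price". */
    static String formatLine(String rawLine) {
        // strip() removes a trailing "\r" left over from Windows line endings.
        String[] parts = rawLine.strip().split(",");
        String name = parts[1];
        String price = parts[2];
        String capitalized = name.substring(0, 1).toUpperCase() + name.substring(1);
        // Fill the remaining width with dots; never ask repeat() for a negative count.
        int dotCount = Math.max(0, LINE_WIDTH - capitalized.length() - price.length());
        return capitalized + ".".repeat(dotCount) + price;
    }

    public static void main(String[] args) {
        // The first entry simulates a line that still carries a CR.
        List<String> menuFile = List.of(
                "meal,goblet of la croix,1.22\r",
                "elixir,iced boilermaker,11.22");
        System.out.println("*** Welcome to Taernyl's Folly ***");
        menuFile.forEach(line -> System.out.println(formatLine(line)));
    }
}
```

Because the CR is stripped before the lengths are measured, every line ends up with the same total width regardless of its position in the file.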

Word to HTML fields in header and footer

I'm using docx4j to convert a Word template to several HTML files, one per chapter.
The Word template has several custom properties mapped by several fields (DOCPROPERTY ...) represented as both simple and complex fields. I populate those properties to obtain Freemarker code when the word document is converted to HTML (like ${...} or [#... /] directives).
In a later step I look for "heading 1" paragraphs to identify chapters and split the document into several Word documents before conversion; these documents are then converted to HTML and written to temporary files.
Each document is successfully converted to HTML and fields are correctly replaced with my markers, but the header and footer parts come out wrong: field codes are written before field values (e.g. DOCPROPERTY "PROPERTY_NAME" \* MERGEFORMAT ${constants['PROPERTY_NAME']} ) instead of field values only (e.g. ${constants['PROPERTY_NAME']} ).
If I write the updated document to a docx file instead, nothing seems wrong in the generated document.
If it's useful to solve the problem, this is what I do to split the document (per chapter):
clone the updated WordprocessingMLPackage (clone method)
delete every root element before the chapter's "heading 1" element
delete every root element from the "heading 1" element of the next chapter
convert the cloned and cleaned document
(actually I don't use the clone method every time, but I write the updated document to a ByteArrayOutputStream and then read it for every chapter, inspired by the source of the clone method).
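The splitting logic described in the steps above can be sketched independently of docx4j. This is only an illustration under stated assumptions: ChapterSplitter and splitByHeading are hypothetical names, the list stands in for the top-level body elements, and the predicate stands in for whatever test identifies a "heading 1" paragraph. No docx4j API is used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

class ChapterSplitter {
    /**
     * Groups top-level elements into chapters. A new chapter starts at every
     * element the predicate marks as a "heading 1"; elements before the first
     * heading are dropped, and each chapter ends where the next heading begins.
     */
    static <T> List<List<T>> splitByHeading(List<T> elements, Predicate<T> isHeading1) {
        List<List<T>> chapters = new ArrayList<>();
        List<T> current = null;
        for (T element : elements) {
            if (isHeading1.test(element)) {
                current = new ArrayList<>();
                chapters.add(current);
            }
            if (current != null) {
                current.add(element);
            }
        }
        return chapters;
    }
}
```

Each resulting sublist corresponds to the content one cloned package would keep after the elements before and after the chapter have been deleted.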
I suspect it's a docx4j bug. Has anybody else tried something similar?
Finally these are my platform details:
JDK 1.6
Docx4J v3.2.2
Thanks in advance for any help
EDIT
To produce freemarker markers in place of Word fields, I set document property values as follows:
1. Traverse the document looking for simple or complex fields with new TraversalUtil(wordMLPackage.getMainDocumentPart().getContent(), visitor);, where visitor is my custom callback that looks for fields and sets properties.
2. While traversing the document, I look for:
   - FldChar elements with type BEGIN, which I parse using FieldsPreprocessor.canonicalise((P) ((R) fc.getParent()).getParent(), fields); (I don't use the return value of canonicalise), where fc is the found FldChar and fields is an empty ArrayList<FieldRef>; then I extract and parse the field's instrText attribute.
   - CTSimpleField elements, which I parse using FldSimpleModel fldSimpleModel = new FldSimpleModel(); fldSimpleModel.build((CTSimpleField) o, null);; then I use fldSimpleModel.getFldArgument() to get the property name.
3. I look up the Freemarker code to show in place of the current field and set it as the property value using wordMLPackage.getDocPropsCustomPart().setProperty(propertyName, finalValue);
4. Finally, I repeat the process from step 1 for headers and footers, as follows:
List<Relationship> rels = wordMLPackage.getMainDocumentPart().getRelationshipsPart().getRelationships().getRelationship();
for (Relationship rel : rels) {
    Part p = wordMLPackage.getMainDocumentPart().getRelationshipsPart().getPart(rel);
    if (p == null) {
        continue;
    }
    if (p instanceof ContentAccessor) {
        new TraversalUtil(((ContentAccessor) p).getContent(), visitor);
    }
}
5. Finally, I update the fields as follows:
FieldUpdater updater = new FieldUpdater(wordMLPackage);
try {
    updater.update(true);
} catch (Docx4JException ex) {
    Logger.getLogger(WorkerDocx4J.class.getName()).log(Level.SEVERE, null, ex);
}
After filling all field properties, I clone the document as previously described and convert filtered cloned instances using
HTMLSettings settings = Docx4J.createHTMLSettings();
settings.setWmlPackage(wordDoc);
settings.setImageHandler(new InlineImageHandler(myDataModel));
Docx4jProperties.setProperty("docx4j.Convert.Out.HTML.OutputMethodXML", true);
ByteArrayOutputStream os = new ByteArrayOutputStream();
os.write("[#ftl]\r\n".getBytes("UTF-8"));
Docx4J.toHTML(settings, os, Docx4J.FLAG_EXPORT_PREFER_XSL);
String template = new String(os.toByteArray(), "UTF-8");
The template variable then holds the resulting Freemarker template.
The following XML is the content of the footer1.xml part of the document generated after updating the document properties as described: footer1.xml after field updates
The very strange thing (in my opinion) is that if some properties are not found, step 5 throws an Exception (expected), field updating stops at the offending field (expected), and all fields in the header and footer are rendered correctly. In this case, this is the content of footer1.xml.
In the latter case, the fields are defined in a different way. I think the HTML converter handles the latter case well and does something wrong in the former.
Am I doing something wrong, or is there something I could do better?

How to make jedit file-dropdown to display absolute path (not filename followed by directory)?

It's all in the title.
If I have the following three files open:
/some/relatively/long/path/dir1/file_a
/some/relatively/long/path/dir1/file_b
/some/relatively/long/path/dir2/file_a
The file dropdown contains:
file_a (/some/relatively/long/path/dir1)
file_a (/some/relatively/long/path/dir2)
file_b (/some/relatively/long/path/dir1)
This bothers me because I have to look on the right to tell the two file_a entries apart, and on the left for the others. It happens to me a lot, mostly because I code in Python and thus often have several __init__.py files open.
How do I get jEdit to display
/some/relatively/long/path/dir1/file_a
/some/relatively/long/path/dir1/file_b
/some/relatively/long/path/dir2/file_a
config:
jedit 5.1.0
java 1.6.0_26
mac osx 10.6
Unfortunately, this is not easily possible at the moment. I just had a look at the source, and it is not configurable.
You can:
1. Submit a feature request to make this configurable (a good idea in any case).
2. Write, or have someone write, a startup macro that
   - registers an EBComponent with the EditBus that listens for new EditPanes being created,
   - retrieves the BufferSwitcher from the EditPane,
   - retrieves the ListCellRenderer from the BufferSwitcher,
   - sets a new ListCellRenderer on the BufferSwitcher that first calls the retrieved ListCellRenderer and then additionally sets the text to value.getPath().
3. Try the Buffer List plugin to see whether it suits your needs.
The code below implements the working part of option two. It is runnable as BeanShell code and applies the change to the current edit pane. The third statement (editPane.repaint()) is not necessary when this is done in an EBComponent; it is only there so that the on-the-fly change shows up immediately.
r = editPane.getBufferSwitcher().getRenderer();
editPane.getBufferSwitcher().setRenderer(
    new ListCellRenderer() {
        public Component getListCellRendererComponent(list, value, index, isSelected, cellHasFocus) {
            rc = r.getListCellRendererComponent(list, value, index, isSelected, cellHasFocus);
            rc.setText(value.getPath());
            return rc;
        }
    });
editPane.repaint();
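For anyone who prefers typed Java over BeanShell (for example inside a plugin), the same delegating-renderer idea can be sketched with plain Swing. This is a sketch under assumptions, not jEdit code: FullPathRenderer is a hypothetical name, and java.io.File stands in for jEdit's buffer type so that the example compiles without jEdit on the classpath.

```java
import java.awt.Component;
import java.io.File;
import javax.swing.JLabel;
import javax.swing.JList;
import javax.swing.ListCellRenderer;

/** Wraps an existing renderer and replaces the label text with the full path. */
class FullPathRenderer implements ListCellRenderer<File> {
    private final ListCellRenderer<Object> delegate;

    FullPathRenderer(ListCellRenderer<Object> delegate) {
        this.delegate = delegate;
    }

    @Override
    public Component getListCellRendererComponent(JList<? extends File> list, File value,
            int index, boolean isSelected, boolean cellHasFocus) {
        // Let the original renderer set up colours, fonts and selection state first.
        Component c = delegate.getListCellRendererComponent(
                list, value, index, isSelected, cellHasFocus);
        // Then overwrite only the text, if the component is a label.
        if (c instanceof JLabel && value != null) {
            ((JLabel) c).setText(value.getPath());
        }
        return c;
    }
}
```

Delegating first and changing only the text keeps whatever styling the original renderer applies, which is the same design choice the BeanShell snippet makes.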

Selenium Webdriver - using isDisplayed() in If statement is not working

I am creating a script that involves searching for a record and then updating the record. On the search screen, the user has the option of viewing advanced search options. Showing or hiding the advanced search is toggled by a single button.
<a title="Searches" href="javascript:expandFilters()"><img border="0" align="absmiddle" alt="Advanced" src="****MASKED URL****"></a>
The only difference between the properties of the search button when it is showing or hiding the advanced search is the img src:
When advanced search is hidden the IMG src ends with "/Styles/_Images/advanced_button.jpg", when advanced search is visible, the IMG src ends with "/Styles/_Images/basic_button.png"
When I open the page, sometimes the Advanced search options are showing, sometimes they aren't. The value that I want to search on appears in the Advanced section, so for my script to work I have added an IF statement.
<input type="text" value="" maxlength="30" size="30" name="guiSystemID">
The IF statement looks for the field that I need to enter data into. If the field does not exist, that indicates that the Advanced options are not visible and I need to click on the button to expand the search options.
I created the following IF statement.
if (!driver.findElement(By.name("guiSystemID")).isDisplayed()) {
    driver.findElement(By.cssSelector("img[alt='Advanced']")).click();
}
When I run the script and the Advanced search is expanded, the script runs successfully. However, when I run the script and the Advanced search is not expanded, the script fails, advising me that it could not find the object "guiSystemID". This is frustrating, because if it can't find the field then I want the script to continue into the true path of the IF statement.
Has anyone got any suggestions about how else I could assess whether the field is showing, without the script failing because it can't find the field?
Thanks in advance
Simon
I might be late in answering this, but it might help someone else looking for the same.
I recently faced a similar problem while working with isDisplayed(). My code was something like this
if(driver.findElement(By.xpath(noRecordId)).isDisplayed() )
{
    /**Do this*/
}
else
{
    /**Do this*/
}
This code works well when the element is present. But when the element is absent, findElement cannot locate it and throws a NoSuchElementException, so isDisplayed() is never reached and there was no way I could test the else part.
I figured out a way to work around this: surround the if/else with a try/catch block, something like this.
public void deleteSubVar() throws Exception
{
    try
    {
        if(driver.findElement(By.xpath(noRecordId)).isDisplayed() )
        {
            /**when the element is found do this*/
        }
    }
    catch(Exception e)
    {
        /**include the else part here*/
    }
}
Hope this helps :)
I've had mixed results with .isDisplayed() in the past. Since there are various methods to hide an element on the DOM, I think it boils down to a flexibility issue with isDisplayed(). I tend to come up with my own solutions to this. I'll share a couple things I do, then make a recommendation for your scenario.
Unless I have something very specific, I tend to use a wrapper method that performs a number of checks for visibility. Here's the concept, I'll leave the actual implementation approach to you. For general examples here, just assume "locator" is your chosen method of location (CSS, XPath, Name, ID, etc).
The first, and easiest check to make is to see if the element is even present on the DOM. If it's not present, it certainly isn't visible.
boolean isPresent = driver.findElements(locator).size() > 0;
Then, if that returns true, I'll check the dimensions of the element:
Dimension d = driver.findElement(locator).getSize();
boolean isVisible = (d.getHeight() > 0 && d.getWidth() > 0);
Now, dimensions, at times, can return a false positive if the element does in fact have height and width greater than zero, but, for example, another element covers the target element, making it appear hidden on the page (at least, I've encountered this a few times in the past). So, as a final check (if the dimension check returns true), I look at the style attribute of the element (if one has been defined) and set the value of a boolean accordingly:
String elementStyle = driver.findElement(locator).getAttribute("style");
boolean isVisible = !(elementStyle.equals("display: none;") || elementStyle.equals("visibility: hidden;"));
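The exact string comparison above only matches when the style attribute is literally "display: none;" or "visibility: hidden;", so it misses variations in spacing, letter case, or additional declarations. A slightly more tolerant check could look like the sketch below. It is plain Java with no Selenium calls; InlineStyleCheck and hidesElement are hypothetical names, and the input is assumed to be whatever string getAttribute("style") returned (possibly null).

```java
import java.util.Locale;

class InlineStyleCheck {
    /** Returns true if an inline style string declares display:none or visibility:hidden. */
    static boolean hidesElement(String style) {
        if (style == null) {
            return false; // no inline style at all
        }
        for (String declaration : style.split(";")) {
            String[] parts = declaration.split(":", 2);
            if (parts.length != 2) {
                continue; // empty or malformed declaration
            }
            String property = parts[0].trim().toLowerCase(Locale.ROOT);
            // Drop an optional "!important" flag before comparing the value.
            String value = parts[1].toLowerCase(Locale.ROOT).replace("!important", "").trim();
            if (property.equals("display") && value.equals("none")) {
                return true;
            }
            if (property.equals("visibility") && value.equals("hidden")) {
                return true;
            }
        }
        return false;
    }
}
```

This only inspects the inline style attribute, so it shares the limitation of the original check: visibility controlled through a stylesheet or a CSS class is not detected.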
These work for the majority of element visibility scenarios I encounter, but there are times when your front end dev does something different that needs to be handled on its own.
An easy scenario is when there's a CSS class that defines element visibility. It could be named anything, so let's assume "hidden" to be what we need to look for. In this case, a simple check of the 'class' attribute should yield suitable results (if any of the above approaches fail to do so):
boolean isHidden = driver.findElement(locator).getAttribute("class").contains("hidden");
Now, for your particular situation, based on the information you've given above, I'd recommend setting a boolean value based on evaluation of the "src" attribute. This would be a similar approach to the CSS class check just above, but used in a slightly different context, since we know exactly what attribute changes between the two states. Note that this would only work in this fashion if there are two states of the element (Advanced and Basic, as you've noted). If there are more states, I'd look into setting an enum value or something of the like. So, assuming the element represents either Advanced or Basic:
boolean isAdvanced = driver.findElement(locator).getAttribute("src").contains("advanced_button.jpg");
From any of these approaches, once you have your boolean value, you can begin your if/then logic accordingly.
My apologies for being long winded with this, but hopefully it helps get you on the right path.
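Using try/catch defeats the purpose of using isDisplayed() as an if condition. One can write the code below without using "if":
try {
    driver.findElement(By.xpath(noRecordId)).isDisplayed();
    // Put the "then" statements here. Note that the return value of isDisplayed()
    // is ignored, so this branch also runs for an element that is present but hidden.
} catch (Exception e) {
    // Put the "else" statements here.
}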