Testing my anaphora resolution tool - testing

I am building an anaphora resolution tool. I have reviewed a lot of the literature and have a pretty good idea of what I need to do to build a basic tool. The problem is how to test it: I can't find any annotated corpus to test it on. Could someone suggest how I could measure the precision and recall of my tool?

From here:
http://www.mitpressjournals.org/doi/pdf/10.1162/COLI_a_00152
Section 4.1
OntoNotes-Dev: development partition of OntoNotes v4.0 provided in the CoNLL-2011 shared task (Pradhan et al. 2011).
OntoNotes-Test: test partition of OntoNotes v4.0 provided in the CoNLL-2011 shared task.
ACE2004-Culotta-Test: partition of the ACE 2004 corpus reserved for testing by several previous studies (Culotta et al. 2007; Bengtson and Roth 2008; Haghighi and Klein 2009).
ACE2004-nwire: newswire subset of the ACE 2004 corpus, utilized by Poon and Domingos (2008) and Haghighi and Klein (2009) for testing.
MUC6-Test: test corpus from the sixth Message Understanding Conference (MUC-6) evaluation.
You can find MUC details here:
http://www-nlpir.nist.gov/related_projects/muc/muc_data/muc_data_index.html
Just look around at the start of the experimental section in your references. You are bound to find links. If you look at the most commonly used ones, you will find your data sets.
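Once you have gold annotations, a rough first measurement could look like the sketch below. It assumes your tool and the corpus can both be reduced to (anaphor, antecedent) pairs; the mention ids and the toy data are made up for illustration. Shared tasks such as CoNLL-2011 score coreference with chain-based metrics (MUC, B-cubed, CEAF) rather than this simple link matching, so treat it as a sanity check only.

```python
def precision_recall(gold_links, predicted_links):
    """Score predicted (anaphor, antecedent) links against gold links."""
    gold = set(gold_links)
    predicted = set(predicted_links)
    correct = len(gold & predicted)

    # Precision: share of the tool's links that are right.
    precision = correct / len(predicted) if predicted else 0.0
    # Recall: share of the gold links that the tool found.
    recall = correct / len(gold) if gold else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical toy data: (anaphor mention id, antecedent mention id)
gold = [("he_1", "John_0"), ("it_3", "car_2"), ("she_5", "Mary_4")]
predicted = [("he_1", "John_0"), ("it_3", "John_0")]

if __name__ == "__main__":
    print(precision_recall(gold, predicted))
```

Note that exact link matching counts a link to a different mention of the same entity as an error, which is one reason the chain-based metrics exist.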

Automated Design in CAD, Analysis in FEA, and Optimization

I would like to optimize a design by having an optimizer make changes to a CAD file, which is then analyzed with FEM, with the results fed back into the optimizer so it can change the design accordingly, until the solution converges to an optimum (mass, stiffness, or another objective).
This is what I envision:
Create a blueprint of the part in CAD software (e.g. CATIA).
Run an optimizer (e.g. fmincon) from within a programming language (e.g. Python). The design variables of the optimizer are parameters of the CAD model (angles, lengths, thicknesses, etc.).
The optimizer evaluates a certain design (parameter set). The programming language calls the CAD software and modifies the design accordingly.
The programming language extracts some information (e.g. mass).
The programming language then exports a STEP file and passes it to an FEA solver (e.g. Abaqus), where a predefined analysis is performed.
The programming language reads the results (e.g. max von Mises stress).
The results from CAD and FEM (e.g. mass and stress) are fed to the optimizer, which changes the design accordingly.
This repeats until it converges.
I know this exists within a closed architecture (e.g. Isight), but I want to use an open architecture where the optimizer is called from within an open programming language (ideally Python).
So finally, here are my questions:
Can it be done, either as I described it or in some other way?
Are there any references or tutorials?
Which software do you recommend for programming, CAD and FEM?
Yes, it can be done. What you're describing is a small parametric structural sizing multidisciplinary optimization (MDO) environment. Before you even begin coding up the tools or environment, I suggest doing some preliminary work in a few areas:
Carefully formulate the minimization problem (minimize f(x), where x is a vector containing ... variables, subject to ... constraints, etc.)
Survey and identify individual tools of interest
How would each tool work? Input variables? Output variables?
Outline in a Design Structure Matrix (a.k.a. N^2 diagram) how the tools will feed information (variables) to each other
What optimizer is best suited to your problem (MDF?)
Identify suitable convergence tolerance(s)
Once the above steps are taken, I would then start to think about MDO implementation details. Python, while not the fastest language, would be an ideal environment because development time is low and many tools for solving MDO problems like yours were built in Python. I suggest going with the following packages:
OpenMDAO (http://openmdao.org/): a modern MDO platform developed at NASA Glenn Research Center. The tutorials do a good job of getting you started. Note that each "discipline" in the Sellar problem, the 2nd problem in the tutorial, would include a call to your tool(s) instead of a closed-form equation. As long as you follow OpenMDAO's class framework, it does not care what each discipline is and treats it as a black box; it doesn't care what goes on between an input and an output.
SciPy and NumPy: scientific computing packages; SciPy includes numerical optimization routines (scipy.optimize)
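To make the loop concrete, here is a minimal sketch using scipy.optimize.minimize. The two "tools" are closed-form stand-ins (a rectangular cantilever beam with assumed load, length, material and stress limit) so the example runs without any CAD or FEA software; in a real setup cad_mass and fea_max_stress would be hypothetical wrappers around your CAD and solver calls.

```python
import numpy as np
from scipy.optimize import minimize

# Assumed problem data, in mm / N / MPa / kg to keep the numbers well scaled.
LENGTH_MM = 1000.0        # beam length
LOAD_N = 1000.0           # tip load
DENSITY_KG_MM3 = 7.85e-6  # steel
ALLOWABLE_MPA = 100.0     # stress limit


def cad_mass(x):
    """Stand-in for the CAD call: mass in kg of a beam with section b x h."""
    b, h = x
    return DENSITY_KG_MM3 * LENGTH_MM * b * h


def fea_max_stress(x):
    """Stand-in for the FEA call: root bending stress in MPa of a cantilever."""
    b, h = x
    return 6.0 * LOAD_N * LENGTH_MM / (b * h ** 2)


def stress_margin(x):
    """Inequality constraint in SciPy's convention: feasible when >= 0."""
    return 1.0 - fea_max_stress(x) / ALLOWABLE_MPA


def run_optimization(x0=(50.0, 50.0)):
    """Minimize mass over (b, h) subject to the stress limit and bounds."""
    return minimize(
        cad_mass,
        x0=np.array(x0),
        method="SLSQP",
        bounds=[(10.0, 100.0), (10.0, 100.0)],
        constraints=[{"type": "ineq", "fun": stress_margin}],
    )


if __name__ == "__main__":
    result = run_optimization()
    print(result.x, cad_mass(result.x), fea_max_stress(result.x))
```

With real tools each function evaluation is expensive and gradients come from finite differences, so caching results per design point and choosing sensible tolerances matters much more than it does in this toy problem.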
I don't know what software you have access to, but here are a few tool-related tips to help you in your tool survey and identification:
Abaqus has a Python API (http://www.maths.cam.ac.uk/computing/software/abaqus_docs/docs/v6.12/pdf_books/SCRIPT_USER.pdf)
If you need to use a program that does not have an API, you can automate its GUI with Python's pywinauto package; programs that expose a COM interface can be driven with win32com
For FEM/FEA, I used both MSC PATRAN and MSC NASTRAN on previous projects since they have command-line interfaces (read: easy to interface with via Python)
HyperSizer also has a Python API
Install Pythonxy (https://code.google.com/p/pythonxy/) and use the Spyder Python IDE (included)
CATIA can be automated using win32com (quick Google search on how to do it: http://code.activestate.com/recipes/347243-automate-catia-v5-with-python-and-pywin32/)
Note: to give you some sort of development time-frame, what you're asking will probably take at least two weeks to develop.
I hope this helps.

Semantic techniques in IOT

I am trying to use semantic technologies in IoT. For the last two months I have been doing a literature survey, and during this time I came to know some of the required tools (e.g. Protégé, Apache Jena). Now, along with reading papers, I want to experiment with semantic techniques such as annotation and linking data so that I get a better understanding of the concepts involved. To that end I have put together this roadmap:
Collect data manually (using sensors) or use some data set already on the web.
Annotate the dataset and possibly use an ontology (not sure)
Apply linked open data principles
I am not sure whether this roadmap is correct. I would like suggestions on the following points:
Is this roadmap correct?
How should I approach steps 2 and 3? In other words, which tools should I use for these steps?
I hope you can help me find a proper way to handle this. Thanks.
Semantics and IoT (or semantic sensor web [1]) is a hot topic. Congratulations on choosing an interesting research topic that is worth pursuing.
In my opinion, your three-step approach looks good. I would recommend building a quick prototype so you can learn about the possible challenges early.
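A quick prototype of the annotation step can be very small. The sketch below wraps one raw reading as a JSON-LD document; it assumes the SOSA vocabulary (the lightweight core published alongside the W3C SSN ontology) and a made-up http://example.org/ namespace for your own sensors and properties, so swap in whichever ontology and URIs you settle on.

```python
import json


def annotate_reading(sensor_id, property_name, value, timestamp):
    """Wrap one raw sensor reading as a JSON-LD observation using SOSA terms."""
    base = "http://example.org/"  # hypothetical namespace for our own resources
    return {
        "@context": {
            "sosa": "http://www.w3.org/ns/sosa/",
            "xsd": "http://www.w3.org/2001/XMLSchema#",
        },
        "@id": f"{base}observation/{sensor_id}/{timestamp}",
        "@type": "sosa:Observation",
        # URIs (rather than plain strings) are what make the data linkable.
        "sosa:madeBySensor": {"@id": f"{base}sensor/{sensor_id}"},
        "sosa:observedProperty": {"@id": f"{base}property/{property_name}"},
        "sosa:hasSimpleResult": value,
        "sosa:resultTime": {"@value": timestamp, "@type": "xsd:dateTime"},
    }


if __name__ == "__main__":
    doc = annotate_reading("s1", "temperature", 21.5, "2015-06-01T12:00:00Z")
    print(json.dumps(doc, indent=2))
```

The output can be loaded into an RDF store such as Apache Jena; giving sensors and properties stable URIs is the part that corresponds to the linked data step of the roadmap.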
In addition to the implementation technologies (Protégé, etc.), there are some important works that might be useful for you:
Open Geospatial Consortium (OGC) Sensor Web Enablement (SWE). [2] It is an important work for sharing and exchanging sensor observation data. Many large organizations (NOAA, NASA, NRCan, AAFC, ESA, etc.) have adopted this standard. This standard has defined a conceptual data model/ontology (O&M, ISO 19156). Note: this is a very comprehensive standard, hence it's very BIG and can be time-consuming to read. I recommend reading reference [2] below.
OGC SensorThings API (http://ogc-iot.github.io/ogc-iot-api/), an IoT cloud API standard based on the OGC SWE. This might be most relevant to you. It is a lightweight protocol of the SWE family, designed specifically for IoT. Some early research work has been done on using JSON-LD to annotate SensorThings.
W3C Spatial Data on the Web (http://www.w3.org/2015/spatial/wiki/Main_Page). It is ongoing joint work between W3C and OGC. Part of the goal is to mature the SSN (Semantic Sensor Network) ontology. Once it's ready, the new SSN can be used to annotate the SensorThings API, for example. It is worth monitoring.
[1] Sheth, Amit, Cory Henson, and Satya S. Sahoo. "Semantic sensor web." Internet Computing, IEEE 12.4 (2008): 78-83.
[2] Bröring, Arne, et al. "New generation sensor web enablement." Sensors 11.3 (2011): 2652-2699.

Swarm Intelligence - what kinds of problems are effectively solved?

I am looking for examples of practical problems (or implementations, applications) that are effectively solved using swarm intelligence. I found that multicriteria optimization is one example. Are there any others?
IMHO swarm-intelligence should be added to the tags
Are you looking for toy problems or more for real-world applications?
In the latter category I know variants on swarm intelligence algorithms are used in Hollywood for CGI animations such as large (animated) armies riding the fields of battle.
Related but more towards the toy-problem end of the spectrum you can model large crowds with similar algorithms, and use it for example to simulate disaster-scenarios. AFAIK the Dutch institute TNO has research groups on this topic, though I couldn't find an English link just by googling.
One suggestion for a place to start further investigation would be this PDF book:
http://www.cs.vu.nl/~schut/dbldot/collectivae/sci/sci.pdf
That book also has an appendix (B) with some sample projects you could try and work on.
If you want to get a head start there are several frameworks (scientific use) for multi-agent systems such as swarming intelligence (most of 'em are written with Java I think). Some of them include sample apps too. For example have a look at these:
Repast:
http://repast.sourceforge.net/repast_3/
Swarm.org:
http://swarm.org/
Netlogo:
http://ccl.northwestern.edu/netlogo
I will read your question as: what kinds of real-world problems can SI solve?
There are a lot. Swarm intelligence is based on the complex behaviour of swarms, where agents coordinate and cooperate by executing very simple rules to generate an emergent, complex, self-organized behaviour. The agents often go through a deliberation process to make efficient decisions, and the emergent behaviour of the swarm allows it to find patterns, learn and adapt to its environment. Therefore, real-world applications based on SI are those that require coordination and cooperation techniques, optimization processes, exploratory analysis, dynamic problems, etc. Some of these are:
Optimization techniques (mathematical functions for example)
Coordination of a swarm of robots (to organize inventory for example)
Routing in communication networks. (This is also dynamical combinatorial optimization)
Data analysis (usually exploratory, like clustering). SI has a lot of applications in data mining and machine learning, where SI algorithms are used to find interesting patterns in big data sets.
NP-hard problems in general (SI gives heuristic, approximate solutions)
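As an illustration of the first item, here is a minimal particle swarm optimization (PSO) sketch minimizing a simple mathematical function. The parameter values (inertia 0.7, acceleration coefficients 1.5, 30 particles) are common textbook choices picked here as assumptions, not tuned settings.

```python
import random


def sphere(position):
    """Test objective: sum of squares, with its minimum of 0 at the origin."""
    return sum(x * x for x in position)


def pso(objective, dim=2, n_particles=30, n_iters=200, lo=-5.0, hi=5.0,
        inertia=0.7, c_personal=1.5, c_global=1.5, seed=0):
    """Global-best PSO; returns (best_position, best_value)."""
    rng = random.Random(seed)
    positions = [[rng.uniform(lo, hi) for _ in range(dim)]
                 for _ in range(n_particles)]
    velocities = [[0.0] * dim for _ in range(n_particles)]
    personal_best = [p[:] for p in positions]
    personal_best_val = [objective(p) for p in positions]
    best_idx = min(range(n_particles), key=personal_best_val.__getitem__)
    global_best = personal_best[best_idx][:]
    global_best_val = personal_best_val[best_idx]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Simple rule per agent: keep some momentum, pull towards its
                # own best position and towards the swarm's best position.
                velocities[i][d] = (
                    inertia * velocities[i][d]
                    + c_personal * r1 * (personal_best[i][d] - positions[i][d])
                    + c_global * r2 * (global_best[d] - positions[i][d])
                )
                new_x = positions[i][d] + velocities[i][d]
                positions[i][d] = min(hi, max(lo, new_x))  # stay in bounds
            value = objective(positions[i])
            if value < personal_best_val[i]:
                personal_best[i] = positions[i][:]
                personal_best_val[i] = value
                if value < global_best_val:
                    global_best = positions[i][:]
                    global_best_val = value
    return global_best, global_best_val


if __name__ == "__main__":
    print(pso(sphere))
```

No particle knows the shape of the function; the swarm's shared best position is the only coordination mechanism, which is the emergent behaviour described above.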
I'm sure there are a lot more. You should check the book:
"Swarm Intelligence: From Natural to Artificial Systems". This is the basic book.
Take care.

Recommended Model Based Testing Tools [closed]

Does anyone have any suggestions on which Model Based Testing tools to use? Is Spec Explorer/Spec# worth its weight in tester training?
What I have traditionally done is create a Visio model where I call out the states and the associated variables, outputs and expected results for each state. Then, in a completely disconnected way, I data-drive my test scripts with those variables based on that model. But the two are not connected. I want a way to create a model and associate the variables in a business-friendly way, so that the model then builds the data parameters for the scripts.
I can't be the first person to need this. Is there a tool out there that does basically that, short of developing it myself?
You might find the following answer to a similar question helpful:
http://testing.stackexchange.com/questions/92/how-to-get-started-with-model-based-testing
In it, I mention:
UML Pad http://web.tiscali.it/ggbhome/umlpad/umlpad.htm
A list of free UML Tools: http://en.wikipedia.org/wiki/Category:Free_UML_tools
Our Pairwise and combinatorial test case generator (which generates tests for you automatically based on a model you create - even if you don't create a UML model): http://hexawise.com
Incidentally, as explained in the answer I link to above, I focus my energies (research, tool development focus, passion, etc.) on the second part of your question - generating efficient and effective sets of tests that maximize coverage in a minimum number of test cases.
Justin (Founder of Hexawise)
I think an updated version of the "Spec Explorer for Visual Studio" power tool is supposed to be released soon - it's much easier to ramp up on than the current version, but still takes some time to learn.
If you want to start smaller, NModel (also from Microsoft) is a good place to start.
Check out TestOptimal. It offers full-cycle Model-Based Testing with built-in data-driven testing and combinatorial testing right within the model. It has graphical modeling and debugging: you can play the model and it graphically animates the model execution. You can link states/transitions to the requirements. Models can be re-purposed for load testing with no changes. It can even create fully automated MBT for web applications without any coding/scripting. Check out this short slide presentation: http://TestOptimal.com/tutorials/Overview.htm
You should try the "MaTeLo" tool of All4Tec. www.all4tec.net
"MaTeLo is a test cases generator for black box functional and system testing. Conformed to the Model Based Testing approach, MaTeLo uses Markov chains for modeling the test. This statistic addin allows products validation in a Systematic way. The efficiency is achieved by a reduction of the human resources needed, an increase of the model reuse and by the enhancement of the test strategy relevance (due to the reliability target). MaTeLo is independent and user-friendly, offers to the validation activities to pass from test scripting to real test engineering and to focus on the real added value of testing: the test plans"
You can request an evaluation licence and try it yourself.
You can find some examples here: http://www.all4tec.net/wiki/index.php?title=Tutorials
A colleague of mine made this tool, http://mbt.tigris.org/, and it has been used in large-scale testing environments for years. It's open source.
Updating:
Here is a short whitepaper: http://www.prolore.se/filer/whitepaper/MBT-Agile.pdf
yEd, a free modelling application, works great with MBT.
I can tell you that the 2010 version of Spec Explorer, which requires the Professional edition of Visual Studio, is a great tool, assuming you already have Visual Studio. The older version of Spec Explorer was good, but the limitation was that if you ended up modeling a system that was non-finite, you were out of luck.
The new version has improved techniques for looking at 'slices' of the model to the point where you have finite states. Once you have the finite states, you can generate the test cases.
The great thing is that as you change the model and re-slice your model, it's straightforward to re-generate tests and re-run them. This certainly beats the manual process any day.
I can't compare this tool to other toolsets, but the integration with Visual Studio is invaluable. If you don't use Visual Studio, you may have limited success.
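To show what "a model that builds the tests" means at its smallest, here is a sketch that derives action sequences from a finite state model so that every transition is exercised once. The login-dialog model and the action names are hypothetical, and this is not how Spec Explorer or any other tool named in this thread works internally; it only illustrates the idea.

```python
from collections import deque

# Hypothetical model of a login dialog: state -> {action: next_state}
MODEL = {
    "LoggedOut": {
        "enter_valid_credentials": "LoggedIn",
        "enter_invalid_credentials": "Error",
    },
    "Error": {"dismiss": "LoggedOut"},
    "LoggedIn": {"log_out": "LoggedOut"},
}


def generate_tests(model, start):
    """Return one action sequence per transition (transition coverage)."""
    # Breadth-first search gives the shortest action path to each state.
    paths = {start: []}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        for action, target in model.get(state, {}).items():
            if target not in paths:
                paths[target] = paths[state] + [action]
                queue.append(target)

    # Each test = shortest path to a state, then one outgoing action.
    tests = []
    for state, prefix in paths.items():
        for action in model.get(state, {}):
            tests.append(prefix + [action])
    return tests


if __name__ == "__main__":
    for test in generate_tests(MODEL, "LoggedOut"):
        print(" -> ".join(test))
```

Because the sequences are derived from the model, editing the model and re-running the generator keeps the scripts and the model connected, which is the gap described in the question.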

Floorplan and packaging architecture resources for the interested software professional?

One of the more interesting things I've run into lately is the art and science of laying out chip floorplan and determining packaging for the silicon. I would like to read some materials on the subject for the "Interested Software Guy".
Does anyone have any recommendations (website or book, so long as it is of good quality)?
This is a result of my search on the subject as I was curious about your question and this is where I would start myself. Sorry I am not a specialist on the subject but hope it can kick-start you!
Seems floorplan optimization is a matter of combinatorial optimization.
As a developer, you'll want to tackle the theory behind it and most likely some proven algorithms. You might then be interested by books such as:
Computational Geometry: Algorithms and Applications by Mark de Berg et al.
Integer and Combinatorial Optimization by Laurence A. Wolsey et al.
Algorithms for VLSI Design Automation by Sabih H. Gerez
Handbook of Algorithms for Physical Design Automation by Charles J. Alpert et al.
Evolutionary Algorithms for VLSI CAD by Rolf Drechsler
It's a bit more difficult to get links on this subject, but if you're a member of IEEE Xplore, you might want to look at this paper and other similar ones.
Finally, on the floorplan wikipedia entry, you'll notice this on sliceable floorplans that might give you your best starting point:
Sliceable floorplans have been used in a number of early EDA tools for a number of reasons. Sliceable floorplans may be conveniently represented by binary trees which correspond to the order of slicing. What is more important, a number of NP-hard problems with floorplans have polynomial time algorithms when restricted to sliceable floorplans.
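For a software person, the binary-tree representation mentioned in that quote is easy to play with. The sketch below is my own illustration (not taken from the Wikipedia entry): it computes the bounding box of a slicing tree, with block sizes and the "H"/"V" cut labels chosen as assumptions.

```python
def bounding_box(node):
    """Return (width, height) of a sliceable floorplan given as a binary tree.

    A leaf is a (width, height) tuple for one block. An internal node is
    (cut, left, right): cut "V" places the two children side by side,
    cut "H" stacks them on top of each other.
    """
    if isinstance(node[0], str):
        cut, left, right = node
        left_w, left_h = bounding_box(left)
        right_w, right_h = bounding_box(right)
        if cut == "V":
            return left_w + right_w, max(left_h, right_h)
        if cut == "H":
            return max(left_w, right_w), left_h + right_h
        raise ValueError(f"unknown cut type: {cut!r}")
    return node


# Hypothetical floorplan: a 2x3 block beside a stack of a 4x1 and a 3x2 block.
floorplan = ("V", (2, 3), ("H", (4, 1), (3, 2)))

if __name__ == "__main__":
    print(bounding_box(floorplan))
```

Each node is visited once, so evaluating a candidate floorplan is linear in the number of blocks, which is what makes searching over slicing trees practical.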
Good luck!
There seems to be a class on this at Carnegie Mellon
VLSI CAD
some of the lecture notes that looked more interesting than others:
All Notes
layout
"Floorplanning"
maze search
That site makes reference to one or more books that this fellow published:
Majid Sarrafzadeh
Also I found a short tutorial here
AutoCAD is a classic... Though on my limited budget I prefer Rhino 2.0.