I want to add custom operators for temporal representation kind of like how this paper describes on page 8: https://pdfs.semanticscholar.org/c097/3553764e2959af3ad4513515588791a13867.pdf.
Although it can be done in SPARQL, (I'm assuming through use of registries like this -> https://github.com/dotnetrdf/dotnetrdf/wiki/DeveloperGuide-SPARQL-Operators) I would like to know if it also possible to add these same operators into manchester syntax or find an alternative way of doing things so that I can do a temporal query within a DL Query.
Related
I use synonyms in lucene to increase the recall in the search. For that I construct a SynonymMap und use a SynonymGraphFilter in my custom Analyzer.
The synonym map looks like:
vw -> volkswagen
bmw -> bayerische motoren werke
I use QueryParser to parse the query.
Now I would like to lower the boost for synonym terms (e.g if I search for 'bmw', then the terms 'bayerische motoren werke' should have a lower boost)
How can I achieve it? It seems that Lucene supports this (see https://issues.apache.org/jira/browse/LUCENE-9171) however I do not know how to use it.
There are two different approaches for handling synonyms, here:
(1) Your usage of SynonymMap, which, as you note, is a way to pre-build synonym lists, which can then be used in analyzers and general queries.
(2) The enhancement you mention.
As the enhancement ticket notes,"this has been done targeting the Synonyms Query.".
The SynonymQuery class has a builder which allows you to add terms (as synonyms) with a boost value.
I do not believe there is any direct way to combine the two approaches. Synonym maps are not boost-aware. I think the best you can do is to iterate over your pre-defined list of synonyms, and feed the values into the synonym query builder.
I'm trying to get a list of Change Requests that match certain conditions, some of these conditions are met by using functions like has_attr().
I would like to ask is it at all possible, I need for instance to use such function has_associated_task(cvtype="task") is it possible to do that?
For queries I'm using the following pattern:
http://ip[:port]/change/oslc/db/dbURI/role/User/cr?oslc_cm.query=change:cvtype="problem" and request_type="Change_Request" and has_associated_task(cvtype="task")&oslc_cm.properties=problem_synopsis
this does work without the function term but I would like to extend the search criteria further, is there any other way besides doing a predefined query in change? Is there somewhere a list of terms? like change:cvtype (I've tried to see this [http://www.ibm.com/xmlns/prod/rational/change/1.0/][1] but I got a "whoops" from the web server)
There are some ways you could solve this:
OSLC Resource Shapes - some OSLC providers associate shapes (like schemas) that describe what you can expect from an OSLC Query Capability.
There isn't a way in the simple query syntax to test for null (or not null), assuming you want to have some condition such as (cvtype="task" and linkedTask != NULL). To get around this you can simply query based on cvtype="task" and locally filter the results using tools such as XPath or Jena. Alternatively you can do is look for extensions to the tool you are working with to see if they provide any extensions to the query syntax to support your use case, I don't have this information off hand.
Are there any open source tools that can generate a natural language description of a given SQL query? If not, some general pointers would be appreciated.
I don't know much about NLP, so I am not sure how difficult this is, although I saw from some previous discussion that the vice versa conversion is still an active area of research. It might help to say that the SQL tables I will be handling are not arbitrary in any sense, yet mine, which means that I know exact semantics of each table and its columns.
I can devise two approaches:
SQL was intended to be "legible" to non-technical people. A naïve and simpler way would be to perform a series of replacements right on the SQL query: "SELECT" -> "display"; "X=Y" -> "when the field X equals to value Y"... in this approach, using functions may be problematic.
Use a SQL parser and use a series of templates to realize the parsed structure in a textual form: "(SELECT (SUM(X)) (FROM (Y)))" -> "(display (the summation of (X)) (in the table (Y))"...
ANTLR has a grammar of SQL you can use: https://github.com/antlr/grammars-v4/blob/master/sqlite/SQLite.g4 and there are a couple SQL parsers:
http://www.sqlparser.com/sql-parser-java.php
https://github.com/facebook/presto/tree/master/presto-parser/src/main
http://db.apache.org/derby/
Parsing is a core process for executing a SQL query, check this for more information: https://decipherinfosys.wordpress.com/2007/04/19/parsing-of-sql-statements/
There is a new project (I am part of) called JustQuery.Me which intends to do just that with NLP and google's SyntaxNet. You can go to the https://github.com/justquery-me/justqueryme page for more info. Also, sign up for the mailing list at justqueryme-development#googlegroups.com and we will notify you when we have a proof of concept ready.
Given your data stored somewhere in a database:
Hello my name is Tom I like dinosaurs to talk about SQL.
SQL is amazing. I really like SQL.
We want to implement a site search, allowing visitors to enter terms and return relating records. A user might search for:
Dinosaurs
And the SQL:
WHERE articleBody LIKE '%Dinosaurs%'
Copes fine with returning the correct set of records.
How would we cope however, if a user mispells dinosaurs? IE:
Dinosores
(Poor sore dino). How can we search allowing for error in spelling? We can associate common misspellings we see in search with the correct spelling, and then search on the original terms + corrected term, but this is time consuming to maintain.
Any way programatically?
Edit
Appears SOUNDEX could help, but can anyone give me an example using soundex where entering the search term:
Dinosores wrocks
returns records instead of doing:
WHERE articleBody LIKE '%Dinosaurs%' OR articleBody LIKE '%Wrocks%'
which would return squadoosh?
If you're using SQL Server, have a look at SOUNDEX.
For your example:
select SOUNDEX('Dinosaurs'), SOUNDEX('Dinosores')
Returns identical values (D526) .
You can also use DIFFERENCE function (on same link as soundex) that will compare levels of similarity (4 being the most similar, 0 being the least).
SELECT DIFFERENCE('Dinosaurs', 'Dinosores'); --returns 4
Edit:
After hunting around a bit for a multi-text option, it seems that this isn't all that easy. I would refer you to the link on the Fuzzt Logic answer provided by #Neil Knight (+1 to that, for me!).
This stackoverflow article also details possible sources for implentations for Fuzzy Logic in TSQL. Once respondant also outlined Full text Indexing as a potential that you might want to investigate.
Perhaps your RDBMS has a SOUNDEX function? You didn't mention which one was involved here.
SQL Server's SOUNDEX
Just to throw an alternative out there. If SSIS is an option, then you can use Fuzzy Lookup.
SSIS Fuzzy Lookup
I'm not sure if introducing a separate "search engine" is possible, but if you look at products like the Google search appliance or Autonomy, these products can index a SQL database and provide more searching options - for example, handling misspellings as well as synonyms, search results weighting, alternative search recommendations, etc.
Also, SQL Server's full-text search feature can be configured to use a thesaurus, which might help:
http://msdn.microsoft.com/en-us/library/ms142491.aspx
Here is another SO question from someone setting up a thesaurus to handle common misspellings:
FORMSOF Thesaurus in SQL Server
Short answer, there is nothing built in to most SQL engines that can do dictionary-based correction of "fat fingers". SoundEx does work as a tool to find words that would sound alike and thus correct for phonetic misspellings, but if the user typed in "Dinosars" missing the final U, or truly "fat-fingered" it and entered "Dinosayrs", SoundEx would not return an exact match.
Sounds like you want something on the level of Google Search's "Did you mean __?" feature. I can tell you that is not as simple as it looks. At a 10,000-foot level, the search engine would look at each of those keywords and see if it's in a "dictionary" of known "good" search terms. If it isn't, it uses an algorithm much like a spell-checker suggestion to find the dictionary word that is the closest match (requires the fewest letter substitutions, additions, deletions and transpositions to turn the given word into the dictionary word). This will require some heavy procedural code, either in a stored proc or CLR Db function in your database, or in your business logic layer.
You can also try the SubString(), to eliminate the first 3 or so characters . Below is an example of how that can be achieved
SELECT Fname, Lname
FROM Table1 ,Table2
WHERE substr(Table1.Fname, 1,3) || substr(Table1.Lname,1 ,3) = substr(Table2.Fname, 1,3) || substr(Table2.Lname, 1 , 3))
ORDER BY Table1.Fname;
I'm using Hibernate for ORM of my Java app to an Oracle database (not that the database vendor matters, we may switch to another database one day), and I want to retrieve objects from the database according to user-provided strings. For example, when searching for people, if the user is looking for people who live in 'fran', I want to be able to give her people in San Francisco.
SQL is not my strong suit, and I prefer Hibernate's Criteria building code to hard-coded strings as it is. Can anyone point me in the right direction about how to do this in code, and if impossible, how the hard-coded SQL should look like?
Thanks,
Yuval =8-)
For the simple case you describe, look at Restrictions.ilike(), which does a case-insensitive search.
Criteria crit = session.createCriteria(Person.class);
crit.add(Restrictions.ilike('town', '%fran%');
List results = crit.list();
Criteria crit = session.createCriteria(Person.class);
crit.add(Restrictions.ilike('town', 'fran', MatchMode.ANYWHERE);
List results = crit.list();
If you use Spring's HibernateTemplate to interact with Hibernate, here is how you would do a case insensitive search on a user's email address:
getHibernateTemplate().find("from User where upper(email)=?", emailAddr.toUpperCase());
You also do not have to put in the '%' wildcards. You can pass MatchMode (docs for previous releases here) in to tell the search how to behave. START, ANYWHERE, EXACT, and END matches are the options.
The usual approach to ignoring case is to convert both the database values and the input value to upper or lower case - the resultant sql would have something like
select f.name from f where TO_UPPER(f.name) like '%FRAN%'
In hibernate criteria restrictions.like(...).ignoreCase()
I'm more familiar with Nhibernate so the syntax might not be 100% accurate
for some more info see pro hibernate 3 extract and hibernate docs 15.2. Narrowing the result set
This can also be done using the criterion Example, in the org.hibernate.criterion package.
public List findLike(Object entity, MatchMode matchMode) {
Example example = Example.create(entity);
example.enableLike(matchMode);
example.ignoreCase();
return getSession().createCriteria(entity.getClass()).add(
example).list();
}
Just another way that I find useful to accomplish the above.
Since Hibernate 5.2 session.createCriteria is deprecated. Below is solution using JPA 2 CriteriaBuilder. It uses like and upper:
CriteriaBuilder builder = session.getCriteriaBuilder();
CriteriaQuery<Person> criteria = builder.createQuery(Person.class);
Root<Person> root = criteria.from(Person.class);
Expression<String> upper = builder.upper(root.get("town"));
criteria.where(builder.like(upper, "%FRAN%"));
session.createQuery(criteria.select(root)).getResultList();
Most default database collations are not case-sensitive, but in the SQL Server world it can be set at the instance, the database, and the column level.
You could look at using Compass a wrapper above lucene.
http://www.compass-project.org/
By adding a few annotations to your domain objects you get achieve this kind of thing.
Compass provides a simple API for working with Lucene. If you know how to use an ORM, then you will feel right at home with Compass with simple operations for save, and delete & query.
From the site itself.
"Building on top of Lucene, Compass simplifies common usage patterns of Lucene such as google-style search, index updates as well as more advanced concepts such as caching and index sharding (sub indexes). Compass also uses built in optimizations for concurrent commits and merges."
I have used this in the past and I find it great.