Why not use IndexWriter to build an index in Lucene? - indexing

I didn't find any lucene forums, so this was the only relevant place to ask this question. Hope for the best.
These are the steps of indexing in Lucene given in our syllabus:
Figure1
Figure2
I understand the second step clearly, but I don't understand the first and third steps. They aren't explained clearly in this figure, imo.
Can you clear my confusion?
Plus, the sources that I refer don't even mention it like this; they explain it differently. I'm not sure where this was copied from.
What are we doing in first vs third step as written in that figure text?
Why was indexwriter created first and not used later? According to what I've read, you can also use IndexWriter to add, remove, and update documents in an index, so we could just use it for that purpose. What are they doing in that figure?
The original author of this material is unknown, so I can't ask anyone.

"the sources that I refer don't even mention it like this"
There can be more than one way to do things in Lucene. For example, the official documentation includes a basic demo which uses an IndexWriterConfig instead. See line 129 of the indexer demo source code.
"Why was indexwriter created first and not used later?"
It looks as if there is something left unexplained, or explained elsewhere: the final step, which can be one of the following:
Something like indexWriter.addDocument(doc); to add a document to a newly created index. See line 271 of the above-mentioned demo.
Something like indexWriter.updateDocument([term goes here], doc); to update an existing document, based on an identifier for the specific doc (the "term"). See line 277 of the above-mentioned demo.
Either way, now you see the document you just created being added to the index using the index writer you previously created.
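As a rough illustration of that flow (create the writer, build a document, then hand the document to the writer), here is a minimal sketch. It is not the demo's code: it assumes lucene-core 8 or later on the classpath, and the class name, the field names "id" and "contents", and the value "doc-1" are made up for the example.

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.StringField;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.ByteBuffersDirectory;
import org.apache.lucene.store.Directory;

import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical class name; all field names and values are illustrative only.
public class IndexingSketch {

    // Builds a tiny in-memory index and returns the number of live documents.
    static int buildIndex() {
        try (Directory dir = new ByteBuffersDirectory()) {
            // Create the IndexWriter (configured through an IndexWriterConfig).
            IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
            try (IndexWriter writer = new IndexWriter(dir, config)) {
                // Create a document and give it some fields.
                Document doc = new Document();
                doc.add(new StringField("id", "doc-1", Field.Store.YES));
                doc.add(new TextField("contents", "hello lucene", Field.Store.NO));

                // Add the document to the index through the writer.
                writer.addDocument(doc);

                // Updating replaces every document matching the term with the new one.
                Document revised = new Document();
                revised.add(new StringField("id", "doc-1", Field.Store.YES));
                revised.add(new TextField("contents", "hello again", Field.Store.NO));
                writer.updateDocument(new Term("id", "doc-1"), revised);
            } // closing the writer commits the pending changes

            try (DirectoryReader reader = DirectoryReader.open(dir)) {
                return reader.numDocs();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("live documents: " + buildIndex());
    }
}
```

The point is that the writer created at the start is the same object that receives the document at the end; nothing else writes to the index.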
Give it a try. If you get stuck, you can ask a specific question - but the chances are it may have already been asked and answered here.

Related

Umbraco Lucene search breaking on 'special characters'

When a user inputs certain special characters, e.g. an HTML tag, the user receives this:
Error loading Partial View script (file: ~/Views/MacroPartials/ezSearch.cshtml)
I have been investigating and this seems to be a common issue and I attempted to apply a fix so it would strip out 'bad' characters:
public string CleanseSearchTerm(string input)
{
    // Keep only letters, digits, spaces, and hyphens; drop everything else.
    return System.Text.RegularExpressions.Regex.Replace(input, "[^a-zA-Z0-9 -]", "");
}
However, the issue is that this error is getting generated before it has a chance to hit my method to strip out 'bad' characters. Any ideas of how this can be resolved?
I haven't tried these changes myself, Kyle, but I have done some investigation to help you fix this problem. Could one of the following solutions help you?
See this Umbraco forum question and its answers; the one from Ismail Mayat in particular could help you.
Linked to Ismail's answer, there is also this "How to make the Lucene QueryParser more forgiving?" question, and I'd recommend checking the answers to that question, too.
Another answer that might help you is this Stack Overflow answer.
Lucene.NET does contain a few classes to help with HTML stripping. These classes are located here in the Lucene.NET repo. Specifically, the HTMLStripCharFilter may be of interest.
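If stripping characters turns out to be too aggressive, escaping them is another route, in the spirit of the "more forgiving QueryParser" question mentioned above. The Java edition of Lucene has a static QueryParser.escape method for this purpose. The sketch below is only a hand-rolled, standard-library approximation of that idea, written in Java for illustration: the class name and the character list are my own assumptions, it has not been tried against ezSearch, and it does nothing about errors raised before your own code runs.

```java
// Hypothetical helper; not part of Lucene, Lucene.NET, or Umbraco.
public class QueryEscapeSketch {

    // Characters assumed here to be special in Lucene's classic query syntax.
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&/";

    // Prefixes every special character with a backslash and leaves the rest alone.
    static String escape(String input) {
        StringBuilder sb = new StringBuilder(input.length() + 8);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (SPECIAL.indexOf(c) >= 0) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("title:(foo/bar)"));
    }
}
```

Unlike the whitelist approach, this keeps the user's characters in the search term, so a search for text containing a slash or a colon still has a chance of matching.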

Can IntelliJ IDEA (14 Ultimate) generate regex-based TODO comments?

A few years back I worked at a company where I could press CTRL+T and a TODO comment was generated. Say my ID, used to identify me to other developers, was xy45; then the generated comment was:
//TODO (xy45):
Is something like this available in IntelliJ 14 Ultimate, or did they write their own plugin for it?
What I tried: web research and the JetBrains documentation. It looks like it's not possible out of the box (I am asking before I write a plugin for it), or the answer is masked by the many search results about the TODO view (due to my poor research skills).
There is no built-in feature in IntelliJ IDEA to generate such comments, so it looks like they did write their own plugin.
I found something that works quite similarly but cannot be bound to a shortcut:
File -> Settings -> Live Templates
I guess the picture says enough to allow customization (consult the JetBrains documentation for more possibilities). That is, browse to the Live Templates section in the settings, add a new Live Template (small green cross, upper right corner in the above picture), and set the context in which this Live Template is applicable.
Note: Once you have defined the Live Template to be applicable in the Java context (...Change in the above image, where the red exclamation marks are shown), you can just type "t", "todo" and hit CTRL+Space (or the shortcut you defined for code completion).
I suggest reconsidering that practice altogether. Generally you should not include redundant information that is more easily and reliably accessible through your version control system (available in IDEA directly in the editor via the Annotate feature). It is similar to not using the javadoc tag @author, as the information provided with it is often outdated, inaccurate, and redundant. Additionally, I don't think the author of a TODO is particularly valuable information. The person who resolves the issue will often be someone completely different, and the TODO should be well documented and descriptive anyway. When you find your own old TODO that is poorly documented, you often don't remember all the required information even though you were the author.
However, instead of adding the author's name, a good practice is to create a task in your issue management system and add the identifier of this task to the description of the TODO. This way you have all your TODOs tracked in one place, and you can add additional information to the task, track progress, assign it, etc. My experience is that if you don't do this, TODOs tend to stay in the code forever, and after some time no one clearly remembers the details of the problem. Additionally, the author mentioned in the TODO has often already left to work for a different company.
Annotated TODO with issue ID

How to create documentation for instance variable and methods in Xcode?

I'd like to be able to Alt-click an instance variable (or a method) in a program I created and read what its purpose is.
The fact that Xcode tells me where the class variable is declared is nice, but not enough. In this case I'd like to see custom text I typed to describe what an asset really is. Additionally, the type of the ivar would also be useful to know.
How can this be done? In this case, I wonder what exactly I meant by assets.
I specifically wonder if this information can be viewed from inside Xcode, similar to how Eclipse shows JavaDoc content.
You would need to create a documentation set for your project and install it in Xcode. appledoc can help you with this. This is a command-line tool that can generate documentation in Apple's style from specially formatted comments in your headers. You can also integrate this into your build process with a run script build phase, so that documentation is always up-to-date.
For small projects, it's usually not worth the effort though and you're probably better off just adding comments to your header files and jumping there with Cmd-click (Ctrl+Cmd+left-arrow to go back to where you came from).
You'll probably want to take a look at Apple's documentation on Documentation Sets as well as their article on generating doc sets using Doxygen. The latter is based on Xcode 3.x, so how relevant it is is somewhat questionable, but it'd be a good idea to take a look nonetheless.
That said, if you decide to use Doxygen (alternatives like HeaderDoc can be used for documentation, but I'm not sure what's available to you as far as creating doc sets goes), it looks like the main point is you'll want to throw GENERATE_DOCSET=YES into your Doxyfile (or whatever you decide to call it). After that, you'd just throw the results into ~/Library/Developer/Shared/Documentation/DocSets (according to Doxygen's documentation). I don't know whether this works in Xcode 4.x - it's worth a shot though, and it'd be nice to hear back on it.
Note: most of this was based on this answer by Barry Wark. Figure credit is due there, since I wouldn't have bothered looking into this were it not for his answer.

Difference between (plain) Classworlds and Plexus Classworlds?

Can anyone please explain the difference between plexus-classworlds and (plain) classworlds?
I find these two confusing and can't see the difference. Plexus Classworlds has almost no description. Apparently a Maven-based Java project uses both, and I don't understand why.
Is it possible to replace classworlds with plexus-classworlds without much hassle?
I'm gonna answer that, even though the question is so old...
classworlds was migrated to plexus-classworlds, but the documentation on the site doesn't seem to have kept up with that. The best docs I've seen were for classworlds 1.1-SNAPSHOT, although the current version is plexus-classworlds 2.4.1-SNAPSHOT, and there is hardly any documentation there.
If you look at plexus-classworlds, you can also see the original org.codehaus.classworlds package, with class comments like this:
A compatibility wrapper for org.codehaus.plexus.classworlds.launcher.Launcher provided for legacy code
which means that they thought about migration, but of course nothing replaces a thorough test.

How do I just SAVE a jsFiddle and not get a new version

In the documentation:
Buttons Save or Fork are always present in the UI. First one appears if no fiddle was loaded, the latter is used to create a new fiddle from the existing one.
I ONLY see SAVE when the fiddle is brand new, then RUN/update/fork. In Fx4 and Safari 5 on MAC (and Fx 4 on pc)
UPDATE: New BASE functionality does exactly what I wanted.
From the SO FAQ
Stack Overflow is for professional and enthusiast programmers, people who write code because they love it. We feel the best Stack Overflow questions have a bit of source code in them, but if your question generally covers …
a specific programming problem
a software algorithm
software tools commonly used by programmers
matters that are unique to the programming profession
… then you’re in the right place to ask your question!
When you log into JsFiddle, you'll get a Set as Base button, which will make the revision you're working on the base version - think of it as an alias for john/7hd62/12/ -> john/7hd62/.
I ran into an issue where Set as Base would not save my work. The solution was to:
make a change.
Update to get a new version.
Set as Base.
Hope that helps
I always use Update to save and create a new revision.
I haven't seen the button Save... maybe it is so that we can't Save to a version, but always need to Update to a new version, so everybody can look at the same code at a certain version.
Have a look at issue #225 in the jsFiddle GitHub repository, "URL for the latest version of a fiddle such as /xxxxx/latest/":
@zalun: Please read http://doc.jsfiddle.net/basic/introduction.html#setting-base-version
Sharing a latest fiddle is not always what you wanted. Because anyone is able to save "latest" fiddle, someone would be able to change it to the content you wouldn't like to share. With setting a base version you are the person who chooses which version is shared under default "no version" URL.