Lucene and Solr - lucene

I am creating the index with Lucene, but searching it through the Solr search engine.
My problem is that while indexing I add each field in my code with
doc.add(fieldname, val, tokenized)
**but my code does not see the schema file. Even copy fields have to be added manually.**
Now I want to use Solr's autosuggest feature, and I do not know how to enable it while creating the index.
When I use the SimplePostTool to post the data through Solr, everything works, but I cannot do that because I have to get some text from different URLs.
Can someone please advise how I can achieve this? Sample code would be very helpful. In any case, code that can see the schema file and use its fieldTypes would be great.
Thanks everyone.
--pramila

See EmbeddedSolrServer in SolrJ:
http://wiki.apache.org/solr/Solrj
SolrJ is a pure Java API to Solr, and it lets you index your documents using the schema.xml you have defined.
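The underlying idea (hand documents to Solr so that schema.xml, copy fields included, is applied, instead of writing the Lucene index directly) is not tied to Java. Below is a minimal Python sketch of the same approach over HTTP, not the SolrJ API itself. It assumes Solr 4 or later with the JSON update handler at /update, a core URL such as the hypothetical http://localhost:8983/solr/collection1, and a schema with `id` and `text` fields. All function names are illustrative.

```python
import json
import urllib.request
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping the bodies of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Reduce an HTML page to a single string of visible text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def build_update_request(solr_core_url, docs):
    """Build (but do not send) a JSON update request for a Solr core."""
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        solr_core_url.rstrip("/") + "/update?commit=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_urls(solr_core_url, urls):
    """Fetch each URL, extract its text, and post all documents to Solr."""
    docs = []
    for url in urls:
        with urllib.request.urlopen(url) as resp:
            html = resp.read().decode("utf-8", errors="replace")
        # Field names `id` and `text` are assumptions about the schema.
        docs.append({"id": url, "text": html_to_text(html)})
    with urllib.request.urlopen(build_update_request(solr_core_url, docs)) as resp:
        return resp.status
```

Because the documents pass through Solr, any copyField rules and analyzers declared in schema.xml are applied on the server, which is what a suggester built on those fields would need.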

What is the significance of data-config.xml file in Solr?

When should I use it, and how is it configured? Can anyone please explain in detail?
The data-config.xml file is the configuration file for the DataImportHandler (DIH) in Solr; the one shipped with Solr is an example of how to use it. DIH is one way of getting data into Solr: one of the Solr servers connects through JDBC (or through a few other plugins) to a database server or a set of files and imports the data into Solr.
DIH has a few issues (for example, it does not work in a distributed way), so the usual suggestion is to write the indexing code yourself and POST the documents to Solr from a suitable client, such as SolrJ, Solarium, SolrClient, MySolr, etc.
It has been mentioned that the DIH functionality really should be moved into a separate application, but that hasn't happened yet as far as I know.
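As a rough illustration of "writing the indexing code yourself", here is a minimal Python sketch that reads rows from a database and groups them into batches of Solr documents. It uses the standard-library sqlite3 module as a stand-in for whatever database DIH would have reached through JDBC. The function name and batch size are assumptions, and sending each batch is left to the client of your choice.

```python
import sqlite3


def rows_as_solr_docs(conn, query, batch_size=500):
    """Yield lists of dicts (one dict per row) ready to send to Solr.

    Column names become field names; NULL columns are left out so that
    Solr does not receive empty values.
    """
    cur = conn.execute(query)
    columns = [col[0] for col in cur.description]
    while True:
        rows = cur.fetchmany(batch_size)
        if not rows:
            break
        yield [
            {name: value for name, value in zip(columns, row) if value is not None}
            for row in rows
        ]
```

Each yielded batch can then be posted to Solr's update handler, and several such workers can run in parallel over different queries, which is the kind of distribution DIH does not offer.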

Indexes Gallery failed to render index list

I am new to Sitecore and currently in Sitecore developer training. Today I ran into a weird issue, and since the trainer was not able to resolve it either, I thought I should post in this forum.
I have added some custom search fields to the solution. These fields were also added to the Lucene default search config. After deploying the solution, I tried the rebuild index option from the Developer menu, but I am unable to see any index list there. I get the message "Indexes List Failed to Render".
I have also tried:
Sitecore Desktop -> Control Panel -> Indexing -> Indexing Manager, but the Sitecore dialog box does not pop up.
Desktop -> Control Panel -> Database -> Rebuild Index, which didn't work.
An IIS reset.
Any help in this regard is highly appreciated. Thanks in advance!!
I would recommend patching in a separate config with your custom index configuration rather than changing the default Lucene index config. You may need to post your custom field configuration so we can figure out what's causing the error.
Thanks for the help. I was able to figure out what was going wrong. I had not made my changes through a patch; instead I had edited Sitecore.Content.Search.config directly, rather than the Lucene config. Those changes caused a Sitecore configuration exception, which made the index list disappear.
I had a similar issue, and this worked for me.
While integrating Solr, I disabled all Lucene configs. After that, internal search stopped working and no search indexes were visible in the Developer tab. I enabled this config and it's all good now:
Sitecore.ContentSearch.Lucene.Indexes.Sharded.Master.config

saving and returning cached results via solr rails

I have an Ember.js app that can generate HTML templates, which I then want to save as HTML.
I will be searching through these templates with Solr via the sunspot gem.
When a user searches, I want Solr to return the results and also the template HTML for each result, so it can be displayed.
I am using MongoDB to store data, but I am not sure whether to store the HTML documents in MongoDB, on the file system, or in Solr itself.
If I save them in Mongo, then when I get the results back from Solr I have an extra step to query MongoDB.
Does anyone have experience with this sort of thing?
Any help or suggestion would be great!
Thanks,
Rick
We have a similar setup: an ASP.NET MVC4 application, backed by MongoDB and Solr.
We use Mongo to store the data plus the template (we use Mustache for templating). Outside Solr, we simply query Mongo for data and template, compile the Mustache template with the data JSON, and render it.
For Solr, we store the compiled HTML (stored=true, indexed=false) along with the other structured data (stored=true, indexed=true) for a document. This lets us avoid an additional query to Mongo, and also allows us to use facets on the document.
If you store the HTML in Mongo and you intend to have facets, you will be looking at at least one extra Mongo query per facet choice.
Solr is good enough at storing HTML as text. I would not index the HTML when I already have structured data for the document.
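The approach described above can be sketched in a few lines of Python. This is only an illustration under assumptions: `template_html` is a hypothetical field name, presumed to be declared with stored=true and indexed=false in the schema, and the response is presumed to be Solr's standard JSON (wt=json) format already parsed into a dict.

```python
def to_solr_doc(record, compiled_html):
    """Combine structured data with the pre-rendered HTML for one document."""
    doc = dict(record)  # copy, so the caller's record is left untouched
    # `template_html` is a hypothetical stored-only field (indexed=false).
    doc["template_html"] = compiled_html
    return doc


def html_from_response(solr_response):
    """Pull the stored HTML out of a parsed Solr JSON search response."""
    docs = solr_response.get("response", {}).get("docs", [])
    return [doc["template_html"] for doc in docs if "template_html" in doc]
```

Since the HTML comes back with the search hits, rendering the result page needs no second lookup in MongoDB.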

Apache Solr & schema.xml

I started with Apache Solr 4.1 yesterday. I have managed to import our MySQL data into Solr successfully, but I am unable to view any data using queries. I suspect the problem is in my schema.xml changes (data-config.xml is correct). Here are my questions:
1) Do I need to add all DB fields to schema.xml? My table has 275+ fields, and configuring all of them would be quite a task. I am hoping there is a way to auto-configure these fields.
2) Is there a way to use a separate schema.xml for my requirement? Where and how do I configure this? I don't want to modify the example-DIH sample schema.xml.
Any pointers would be highly appreciated! I have already gone through this document:
http://wiki.apache.org/solr/DataImportHandler
and I have also read a few questions posted here, but couldn't find an answer to my queries.
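1) Every column you map as a field in data-config.xml must have a matching definition in schema.xml. There is no auto-configuration, although a single dynamicField pattern in schema.xml (for example *_s) can cover many fields at once.
2) A core reads exactly one schema.xml, from its own conf directory, so you cannot attach a second schema to the same core. To leave the example untouched, copy it into a core of your own and edit that copy's schema.xml.
Could you please show some of the indexed fields you have defined in schema.xml?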
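One way to avoid declaring 275 fields one by one is Solr's dynamicField mechanism, where a pattern such as *_s in schema.xml matches any field name with that suffix. The Python sketch below renames row columns to fit such patterns. It assumes the schema declares the suffixes used in Solr's example schema (`_s` string, `_l` long, `_d` double, `_b` boolean) and a unique key called `id`; the function name and the type mapping are illustrative, not something Solr provides.

```python
# Order matters: bool is a subclass of int in Python, so test it first.
SUFFIX_BY_TYPE = [
    (bool, "_b"),
    (int, "_l"),
    (float, "_d"),
    (str, "_s"),
]


def to_dynamic_fields(row, id_column="id"):
    """Rename columns so each matches an assumed dynamicField pattern."""
    doc = {}
    for column, value in row.items():
        if value is None:
            continue  # skip NULLs rather than send empty fields
        if column == id_column:
            doc["id"] = value
            continue
        for py_type, suffix in SUFFIX_BY_TYPE:
            if isinstance(value, py_type):
                doc[column + suffix] = value
                break
        else:
            # Unknown type: fall back to its string form.
            doc[column + "_s"] = str(value)
    return doc
```

In a DIH setup the same renaming could be done with column aliases in the SQL query instead, so that the result columns already carry the suffixes.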

Program to scrape a webpage into an index

I've been looking for a program to create an index from static webpages. I'm not looking for a program like Solr or Elasticsearch, because both assume I will be creating the index interactively. I need something that can basically go to a URL and create a search index from the pages it pulls. It can create the index in whatever way is necessary (DB, XML, etc.). I just don't need programs that are so involved with backend database access and code, as this search will be very light and mostly for internal purposes, on a site that does not use any of those.
Thanks for any tips that may get me started or answers that will solve my problem!
Investigate Nutch. Nutch can index a URL and what you can index is very configurable.
Once you finish crawling/indexing, that index is searchable. There is no programming involved.