Indexing a document with content using SolrJ in EmbeddedSolrServer

I want to query an EmbeddedSolrServer instance with a filter query, as we normally do in the admin panel (see the picture), but programmatically from Java. I know that we can call query.setQuery("*:*");, but that is not what I want when someone needs to search for a specific word in a document's content. I also found solrParams.add(CommonParams.QT, "*:*");, but it does not work. I think the problem may come from parsing the PDF document when I try to index it. So, does anyone know how to index a document using EmbeddedSolrServer in exactly the same way as with post.jar on the command line?

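Indexing a file is as easy as:
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.client.solrj.request.ContentStreamUpdateRequest;
import org.apache.solr.common.util.NamedList;
EmbeddedSolrServer server = new EmbeddedSolrServer(solrHome, defaultCoreName);
// Send the file to the extracting handler, which parses it (e.g. a PDF) before indexing
ContentStreamUpdateRequest req = new ContentStreamUpdateRequest("/update/extract");
req.addFile(fileToIndex, "application/octet-stream");
req.setParam("commit", "true");   // commit right away so the document becomes searchable
req.setParam("literal.id", id);   // pass the document id as a literal field value
NamedList<Object> namedList = server.request(req);
server.close();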

SearcherManager and MultiReader in Lucene

I am using Lucene and I have a MultiReader from a few directory readers like:
MultiReader myMultiReader = new MultiReader(directoryReader1, directoryReader2, ...);
I want to use a SearcherManager on top of it, since the index will change from time to time. How can I do this? SearcherManager only accepts a single DirectoryReader or an IndexWriter as a constructor parameter:
https://lucene.apache.org/core/6_0_1/core/org/apache/lucene/search/SearcherManager.html
I don't understand how I could combine both MultiReader and SearcherManager.
By the way, I have already checked these links, which don't really answer this particular issue:
http://blog.mikemccandless.com/2011/09/lucenes-searchermanager-simplifies.html
http://lucene.472066.n3.nabble.com/SearcherManager-vs-MultiReader-td4068411.html

Reading Images, Pdf from SQL Database via LINQPad

I have a table in a database that contains all kinds of attachments: images, PDF, Excel, and other formats. Creating an application is not an option, so I googled other options and found this related question that mentioned LINQPad. I downloaded it, but I still do not know exactly how it works. Can anyone please explain it to me? I can query the attachments using a SQL query, but I am not sure how to dump and preview them with the mentioned tool.
Following on from Dan's answer, once you have the data context set up you can dump the images from the database. I use this snippet for checking an image I've written to the database; you should be able to edit it as required to match your scenario:
var ii = ItemImages.Where(v => v.Id == 10).FirstOrDefault();
using (var ms = new MemoryStream(ii.Image.ToArray()))
{
    System.Drawing.Image.FromStream(ms).Dump();
}
Use the built-in Util.Image utility for images.
Example:
var personPictures = PictureTable.Take(1);
Util.Image(personPictures.First().Picture).Dump();
Util.Image takes a byte array.
Depending on your database of choice, you will most likely need a data context driver:
http://www.linqpad.net/richclient/datacontextdrivers.aspx
Once you establish a connection, you can start writing queries against the data.

Lucene query parser does not parse field as expected

I want to parse a simple query using Lucene (3.0.3):
title:(+return +"pink panther")
Just like in the documentation example.
The expected result is:
+title:return +title:"pink panther"
But instead I get:
+title:return +title:"itle return pink panther"
The code is very simple (C#):
Query query =
    new QueryParser(
        Lucene.Net.Util.Version.LUCENE_30,
        "content",
        new Lucene.Net.Analysis.Standard.StandardAnalyzer(Lucene.Net.Util.Version.LUCENE_30))
    .Parse("title:(+return +\"pink panther\")");
I'm unable to reproduce this. Does this still occur for you?
I'm thinking that it may be some display artifacts from the output window. Is this from the Immediate Window, the Watch Window or a call to Console.WriteLine?
Sorry for the trouble, the issue was a custom-modified Lucene.Net assembly...

Ektron Workarea

I need to develop an application that extracts all the content in the Content tab of the Ektron Workarea while keeping the tree structure of folders (taxonomies, collections, forms, etc.). When I click a content item, I also need to get its Content ID in the code-behind. I need to do all of this in a single function.
I tried to meet this requirement using the content block widget in the Workarea. When we drag that widget and edit it, a pop-up appears and displays the Workarea folders in a tree structure. But when I created an aspx page with the same code and browsed to it, I did not get the tree structure of all the content; only the main tabs (Folders, Taxonomies, and Search) were visible. I then dragged the user control onto the aspx page, but that does not work either.
So how can I solve this problem?
Can I pull all the content in a tree structure from the Workarea, starting at the root, using API code? If so, can anyone please share the API code?
Please, anyone, reply!
Assuming you are using 8.6 look here to start with:
http://reference.ektron.com/developer/framework/content/contentmanager/getlist.aspx
Update:
I think I misread your question the first time around. Allow me to expand on my answer a bit. My original answer with the web services assumes that you are rendering the content tree from some sort of "presentation tier" -- a different web site, a console app, or a WPF/WinForms app, etc.
You can get the recursive folder structure with something like this:
private FolderData GetFolderWithChildren(long folderId)
{
    var folderApi = new Ektron.Cms.API.Folder();
    var folderData = folderApi.GetFolder(folderId);
    // This next method is marked as obsolete in v9.0;
    // a newer overload is available in v9.0, but I
    // don't know if it's available in v8.0
    folderData.ChildFolders = folderApi.GetChildFolders(folderId, true);
    // Return the folder with its child folders attached
    return folderData;
}
I'm a little confused as to what exactly you're trying to accomplish. If you want to show the entire tree structure graphically, have you tried taking the code and markup from the edit view of the content widget and using it on your non-edit view?
I must say, your requirement that "I need to do all these in a single function" worries me a bit. Workarea content trees can get really large very quickly. If you're trying to load all of the folders and all the taxonomies and all the collections, etc., then the user will likely be waiting a long time for the page to load, and you risk running into timeout issues.
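One way to keep the load time bounded is to fetch the tree only down to a limited depth and load deeper levels on demand. The following is a minimal, library-free sketch of that idea in plain Java. Folder, ChildLookup, load, and sampleLookup are hypothetical names made up for illustration; none of them are Ektron types, and the in-memory sample data merely stands in for whatever call returns a folder's direct children.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class FolderTreeSketch {
    /** Hypothetical folder node; not an Ektron type. */
    static final class Folder {
        final long id;
        final String name;
        final List<Folder> children = new ArrayList<>();
        Folder(long id, String name) { this.id = id; this.name = name; }
    }

    /** Hypothetical stand-in for the call that lists a folder's direct children. */
    interface ChildLookup {
        List<Folder> childrenOf(long folderId);
    }

    /** Attaches descendants of the given folder down to maxDepth levels and returns it. */
    static Folder load(Folder folder, ChildLookup lookup, int maxDepth) {
        if (maxDepth <= 0) {
            return folder; // depth budget used up: leave deeper levels unloaded
        }
        for (Folder child : lookup.childrenOf(folder.id)) {
            folder.children.add(load(child, lookup, maxDepth - 1));
        }
        return folder;
    }

    /** Counts the folder itself plus every loaded descendant. */
    static int count(Folder folder) {
        int total = 1;
        for (Folder child : folder.children) {
            total += count(child);
        }
        return total;
    }

    /** In-memory sample data: Root(0) has A(1) and B(2); A has C(3); C has D(4). */
    static ChildLookup sampleLookup() {
        Map<Long, String> names = new HashMap<>();
        names.put(1L, "A");
        names.put(2L, "B");
        names.put(3L, "C");
        names.put(4L, "D");
        Map<Long, List<Long>> childIds = new HashMap<>();
        childIds.put(0L, Arrays.asList(1L, 2L));
        childIds.put(1L, Arrays.asList(3L));
        childIds.put(3L, Arrays.asList(4L));
        return folderId -> {
            List<Folder> result = new ArrayList<>();
            for (Long id : childIds.getOrDefault(folderId, Collections.<Long>emptyList())) {
                result.add(new Folder(id, names.get(id)));
            }
            return result;
        };
    }

    public static void main(String[] args) {
        Folder shallow = load(new Folder(0, "Root"), sampleLookup(), 1);
        Folder deep = load(new Folder(0, "Root"), sampleLookup(), 3);
        System.out.println("Loaded with depth 1: " + count(shallow) + " folders");
        System.out.println("Loaded with depth 3: " + count(deep) + " folders");
    }
}
```

With a small depth limit, the page can render the top of the tree immediately and request a subtree only when the user expands a node.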
-- Original Answer --
Ektron v8.0 doesn't have the 3-tier option, which is too bad because that would really make your job a lot easier. In v8.0, there are ASMX web services that you can reference, including:
/workarea/webservices/content.asmx
/workarea/webservices/webserviceapi/user/user.asmx
There are lots more than this; browse through the folders within /workarea/ to see what's available.
It's been a while since I've worked with these services, so I'm a little rusty...
Suppose you add references to the two services listed above and name them ContentService and UserService. The first thing you'll want to do is set the authentication headers. Then you can call the service methods in much the same way as the legacy APIs.
var contentApi = new ContentService.Content();
contentApi.AuthenticationHeaderValue = new ContentService.AuthenticationHeader();
contentApi.AuthenticationHeaderValue.Username = username;
contentApi.AuthenticationHeaderValue.Password = password;
contentApi.AuthenticationHeaderValue.Domain = domain;
var userApi = new UserService.User();
userApi.AuthenticationHeaderValue = new UserService.AuthenticationHeader();
userApi.AuthenticationHeaderValue.Username = username;
userApi.AuthenticationHeaderValue.Password = password;
userApi.AuthenticationHeaderValue.Domain = domain;
var ud = userApi.GetUserbyUsername("jimmy456");
long folderID = 85;
bool recursive = true;
ContentData[] folderContent = contentApi.GetChildContent(folderID, recursive, "content_id");

iTextSharp 5.3.3: replace the pages from the 1st document and insert the pages from the 2nd document instead of them

Forgive me for my bad English.
I want to remove pages from the first document and insert pages from the second document in their place. I use iTextSharp 5.3.3. The pages of the second document contain pictures.
My code:
reader1 := New iTextSharp.text.pdf.PdfReader(file_name_1);
reader2 := New iTextSharp.text.pdf.PdfReader(file_name_2);
Document := New iTextSharp.text.Document();
Document.Compress := False;
For i := 4 To reader1.NumberOfPages Do
    reader1.SetPageContent(i, reader2.GetPageContent(i));
End For;
Stamper := New iTextSharp.text.pdf.PdfStamper(reader1, New System.IO.FileStream(new_file_name, System.IO.FileMode.CreateNew));
Stamper.Close();
As a result, the images in the new document are mixed up.
What am I doing wrong?
Thanks for any help!
Your code is wrong on many levels. You are copying content streams without copying any of the resources. I never want to see such code again, ever!
Please read http://www.manning.com/lowagie2/samplechapter6.pdf
The best way to achieve your assignment is to use PdfCopy. Create two PdfReader objects, and add 4 PdfImportedPage objects from the second reader, followed by PdfImportedPage objects from the first reader starting at page 5.
Use the following code samples for inspiration:
http://itextpdf.com/examples/iia.php?id=123
http://kuujinbo.info/iTextInAction2Ed/index.aspx?ch=Chapter06&ex=Concatenate
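The page order described above (four pages from the second document, then the first document from page 5 onward) can be worked out before any PDF library is involved. The following is a minimal sketch in plain Java that only computes which output page comes from which source; it does not call iText at all. PagePlanSketch, PageRef, and plan are hypothetical names chosen for illustration, and the page counts in main are made-up sample values.

```java
import java.util.ArrayList;
import java.util.List;

class PagePlanSketch {
    /** One output page: the source document and the 1-based page number within it. */
    static final class PageRef {
        final String source;
        final int page;
        PageRef(String source, int page) { this.source = source; this.page = page; }
        @Override public String toString() { return source + " p." + page; }
    }

    /**
     * Builds the output order: the first replacedCount pages come from the second
     * document, and the remaining pages come from the first document.
     */
    static List<PageRef> plan(int firstDocPages, int secondDocPages, int replacedCount) {
        if (replacedCount < 0 || replacedCount > secondDocPages) {
            throw new IllegalArgumentException("replacedCount must be between 0 and secondDocPages");
        }
        List<PageRef> out = new ArrayList<>();
        for (int p = 1; p <= replacedCount; p++) {
            out.add(new PageRef("second", p));
        }
        for (int p = replacedCount + 1; p <= firstDocPages; p++) {
            out.add(new PageRef("first", p));
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample values: a 7-page first document and a 4-page second document.
        System.out.println(plan(7, 4, 4));
        // prints [second p.1, second p.2, second p.3, second p.4, first p.5, first p.6, first p.7]
    }
}
```

Each entry in the plan then corresponds to one imported page added to the copy, in that order. If a different range of pages should be swapped, only the plan changes; the copying loop stays the same.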
If you've found a page advising to use your original approach, please let me know so that I can take action to have that page removed. If you've found this page on itextpdf.com, please DO NOT USE those examples WITHOUT READING THE DOCUMENTATION!