Sensenet: sort by file type on List View - sensenet

It is possible to sort the content on a list view by file type?
For instance I want to see all the doc files first, then the folders, then the pdf files, etc.

In Sense/Net to sort by a value, you have to have a dedicated field for that value, because our search engine can sort only by fields.
Unfortunately the value of file extension is not stored in a dedicated field like that, however it is possible to display it. You need to add a new "extension" field to the CTD of the files:
<Field name="Extension" type="ShortText">
<DisplayName>Extension of the file.</DisplayName>
</Field>
and define a ContentHandeler, inherited from File, that has access to the file name, and can retrive the extension. For further details visit SenseNet-wiki: http://wiki.sensenet.com/How_to_create_a_ContentHandler

Related

nested field in Solr 5.2

I'm new to Solr and I have a very specific problem that I need to solve:
I have a csv file that contains my Solr document. Now, I do have a column (field) that's not only multiValued, but also contains 'subfields'
for example
"id":"0101",
"addMaterials":[{"name":"Mat1", "property":"prop1"},
{"name":"Mat2","property":"prop2"},
{"name":"Mat3","property":"prop3"}],
"mainProperty":"mainproperty1",
"URL":"http://www.mySite..."
where id, addMaterials, mainProperty, and URL are my main fields while 'name' and 'property' are my subfields. I know that Solr is designed to handle denormalized documents but denormalizing is not a possible solution for my application.
What I'm thinking is to just separate my data set and move the fields (that have subfields) to another document and somehow make a new field to link it to the orginial document (e.g. fromIdField).
Is there any other solution to do this? My minimum goal is to index the values of addMaterials field (even without indexing the subfields)
from:
"addMaterials":[{"name":"Mat1", "property":"prop1"},
{"name":"Mat2","property":"prop2"},
{"name":"Mat3","property":"prop3"}],
to
"addMaterials":{"name":"Mat1", "property":"prop1"}
"addMaterials":{"name":"Mat2", "property":"prop2"}
"addMaterials":{"name":"Mat3", "property":"prop3"}
Thanks in advance.
I have found a solution to my problem. Instead of separating my data set, I kept the addMaterials field as a multiValued field and ignored the subfields. So I only have one multiValued field to be indexed. What I did was to use the update/ request of Solr to index my csv file and put },{ as my separator in my addMaterials multiValued field. The indexed document looks like this:
"addMaterials": ["[{\"name\":\"Mat1\", \"property\":\"prop1\"",
"\"name\":\"Mat2\", \"property\":\"prop2\"",
"\"name\":\"Mat3\", \"property\":\"prop3\"}]"]
I indexed my document using this:
curl "http://localhost:8983/solr/<coreName>/update/csv?
stream.file=C:/userName/Solr/solr-5.2.0/documentFolder/myFile.csv&
f.addMaterials.split=true&
f.addMaterials.separator=\},\{&
stream.contentType=text/plain;charset=utf-8"
Also, this assumes that the addMaterials field is a multiValued field. So make sure you modify your schema first before indexing your document using the procedure above. Otherwise, it will give an error saying that the f. is not a multiValued field.
Of course, if you need to query against the sub-fields then I guess you can use the !join command/function of Solr.

Synchronize modification between SWT table and TextEditor

I'm facing a problem and want to ask for a solution.
I'm working on an eclipse plugin project, in which an editor for a type of resource file is required. The resource file has similar structure like CSV file. My idea is to provide user the option to edit this type of file both in plain text format and also in an SWT table. Plain text is required for examining data and table provides more flexibility to editing such as sorting by column.
I have been able to create a MultiPageEditorPart, with one page of org.eclipse.ui.editors.text.TextEditor, and another page with a org.eclipse.swt.widgets.Table and several other widgets like search bar. The content of the resource file can be shown in the TextEditor, can also be edited and saved. On the other hand, the content can be loaded in the table too, sorting and searching all work good.
The problem is: when I edit a cell in the table, I want the change also reflected in the TextEditor, and vice versa. Since the resource file can be very large, I want the saving action happen only on the TextEditor, i.e. I don't want any modification in the table directly stored to resource file, but to mark the file dirty, but I can't figure out how. How can I for example get the content of EditorInput, check it line by line, and modify it outside TextEditor?
Or, are there more efficient ways to do this? Can anyone give any hints?
The IDocument used by the TextEditor gives you access to the document contents. Get this with something like:
IDocumentProvider provider = editor.getDocumentProvider();
IEditorInput input = editor.getEditorInput();
IDocument document = provider.getDocument(input);
IDocument has many methods for accessing lines such as:
int getLineOffset(int line);
int getLineLength(int line);
and methods for modify the text:
void replace(int offset, int length, String text);

Limit selection choices

How do you limit selection options in a view? For instance account.voucher has 4 type options, but I want to display only two of them. How do you achieve that in a view definition?
If the selection is applied on a relation field (o2m, m2m) you can play with domains on the xml view itself. If the selection is actually a selection field I'm afraid you can't do this from XML.
You should be able to do that by overriding the fields_view_get (or fields_get can't remember right now). From there you can manipulate all the stuff you want but you'll have to handle python code and XML building.
grep "def $your_method_here" * into addons folder is your friend ;)
Use the domain attribute.
<field name="voucher_id" domain="[('type','in',['payment','receipt'])]"/>
This could also be done directly in the business object model:
_columns = {
'voucher_id': fields.many2one('account.voucher', 'Voucher',
domain="[('type','in',['payment','receipt'])]",
}

Indexing Multiple documents and mapping to unique solr id

My use case is to index 2 files: metadata file and a binary PDF file to a unique solr id. Metadata file has content in form of XML file and some schema fields are mapped to elements in that XML file.
What I do: Extract content from PDF files(using pdftotext), process that content and retrieve specific information(example: PDF's first page/line has information about the medicine, research stage). Information retrieved(medicine/research stage) needs to be indexed and one should be able to search/sort/facet.
I can create a XML file with information retrieved(lets call this as metadata file). Now assuming my schema would be
<field name="medicine" type="text" stored="true" indexed="true"/>
<field name="researchStage". ../>
Is there a way to put this metadata file and the PDF file in Solr?
What I have tried:
Based on a suggestion in archives, I zipped these files and gave to ExtractRequestHandler. I was able to put all the content in SOLR and make it searchable. But it appear as content of zip file.(I had to apply some patches to Solr Code base to make this work). But this is not sufficient as the content in metadata file is not mapped to field names.
curl "http://localhost:8983/solr/update/extract?literal.id=doc1&commit=true" -F "myfile=#file.zip"
I tried to work with DataImportHandler(binURLdatasource). But I don't think I understand how it works. So could not go far.
I thought of adding metadata tags to PDF itself. For this to work, ExtractrequestHandler should process this metadata. I am not sure of that either.
So I tried "pdftk" to add metadata. Was not able to add custom tags to it. It only updates/adds title/author/keywords etc. Does anyone know similar unix tool.
If someone has tips, please share.
I want to avoid creating 1 file(by merging PDF text + metadata file).
Given a file record1234.pdf and metadata like:
<metadata>
<field1>value1</field1>
<field2>value2</field2>
<field3>value3</field3>
</metadata>
Do the programmatic equivalent of
curl "http://localhost:8983/solr/update/extract?
literal.id=record1234.pdf
&literal.field1=value1
&literal.field2=value2
&literal.field3=value3
&captureAttr=true&defaultField=text&capture=div&fmap.div=foo_txt&boost.foo_txt=3&" -F "tutorial=#tutorial.pdf"
Adapted from http://wiki.apache.org/solr/ExtractingRequestHandler#Literals .
This will create a new entry in the index containing the text output from Tika/Solr CEL as well as the fields you specify.
You should be able to perform these operations in your favorite language.
the content in metadata file is not mapped to field names
If they dont map to a predefined field, then use dynamic fields. For example you can set a *_i to be an integer field.
I want to avoid creating 1 file(by merging PDF text + metadata file).
That looks like programmer fatigue :-) But, do you have a good reason?

Plone 4 - Get url of a file in a plone.app.blob.field.FileField

I have a custom content type with 3 FileFields (plone.app.blob.field.FileField) and I want to get their url's, so i can put them on my custom view and people will be able to download these files.
However, when using Clouseau to test and debug, I call :
context.getFirst_file().absolute_url()
Where getFirst_file() is the accessor to the first file (field called 'first_file').
The url returned is 'http://foo/.../eat.00001', where 'eat.00001' is the object of my custom type that contains the file fields...
The interesting thing is, if I call:
context.getFirst_file().getContentType()
It returns 'application/pdf', which is correct since it's a pdf file.
I'm pretty lost here, any help is appreciated. Thanks in advance!
File fields do not support a absolute_url method; instead, through acquisition you inherit the method from the object itself, hence the results you see. Moreover, calling getFirst_field() will return the actual downloadable contents of the field, not the field itself which could provide such information.
Instead, you should use the at_download script appended to the object URL, followed by the field id:
First File
You can also re-use the Archetypes widget for the field, by passing the field name to the widget method:
<metal:field use-macro="python:context.widget('first_field', mode='view')">
First File
</metal:field>
This will display the file size, icon (if available), the filename and the file mime type.
In both these examples, I assumed the name of the field is 'first_field'.