GSA - Last modified dates for documents (PDF/DOC etc)

According to GSA's documentation:
PDF or XPS documents typically have metadata such as:
<MT N="CreationDate" V="D:20040107111105Z"/>
<MT N="ModDate" V="D:20040209162220+01'00'"/>
The search appliance can automatically pick up these formats without any special formatting configuration.
Unfortunately, this does not seem to work. We have PDFs, DOCs and other files on our site, and their last modified dates appear in the corresponding <MT> entries in the GSA search results. However, <FS NAME="date"> has a blank value, which indicates that GSA could not extract the date. Even specifying the date format on the "Document Dates" page in the GSA console does not help.
So how can we make GSA "see" the documents' last modified dates? Please note: we cannot use the web server's Last-Modified HTTP header values, since they are not correct in our case (AEM dispatcher/caching interference).

GSA can extract metadata from Document Properties but I am not sure if GSA can use that ModDate/CreationDate to populate <FS NAME="date"> without "Document Dates" configuration.
You mentioned that you "cannot use web server's last-modified HTTP header values since they are not correct in our case." Does that mean your web server is returning a Last-Modified header with incorrect values?
The Last-Modified response header takes precedence over all other metadata in GSA. So if your server cannot return correct values, you have to remove the Last-Modified header from the response.
I have come across many people who use a Java SimpleDateFormat pattern (yy-MM-dd) when specifying the format under Document Dates, but GSA only understands strptime format. This is one of the main reasons why GSA fails to populate <FS NAME="date">. So make sure to give the date format in strptime syntax, or leave it blank, as it is not a mandatory field.
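As a rough illustration of the difference between the two pattern styles, the sketch below uses Python's `datetime.strptime`, which follows the same strptime conventions. The helper name `parse_with` and the sample date `04-02-09` are assumptions made for this example, and it says nothing about how GSA parses dates internally.

```python
from datetime import datetime


def parse_with(value: str, fmt: str):
    """Return the parsed datetime, or None if fmt does not match value."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


# strptime treats a Java SimpleDateFormat pattern as literal text,
# so it cannot match an actual date string.
print(parse_with("04-02-09", "yy-MM-dd"))  # None

# The strptime counterpart of yy-MM-dd uses % directives.
print(parse_with("04-02-09", "%y-%m-%d"))

# A strptime pattern for the UTC form of the CreationDate value from the question.
print(parse_with("D:20040107111105Z", "D:%Y%m%d%H%M%SZ"))
```

The first call returns None because none of the characters in yy-MM-dd are strptime directives; the other two calls return datetime objects.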

Related

@DbLookup and formatting on web

I have been developing a web application using Domino, in which I use @DbLookup to fetch a field that is maintained in the Notes client. This works fine, but the formatting of the value is lost when it is shown on the web.
For example, in the Lotus Notes client the field value is displayed as below:
I am one, I am two, I am one , I am two, labbblallalalalalalalalalalalalalalalalalalaallllal
Labbbaalalalallalalalalalaalallaal
Hello there, labblalalallalalalllaalalalalalalalalalalalalalalalalalalalalalalala
Now, when I retrieve the value of the field on the web, the second value follows immediately after the first, and so on. I was expecting a line feed between them, which is not happening.
The field above is a multi-valued field. On the web I have used computed text which does the @DbLookup.
Please help me find out what else I could try, or an alternate solution for this case.
Thanks
HD

Your multi-valued field has display options associated with it and the Notes client honors those. Obviously, your options are set up to display entries separated by newlines.
The computed text that you are using for the web does not have options like that, and the field options are irrelevant because you aren't displaying the field. Your code has to insert the @NewLines. That's pretty easy, because @DbLookup returns a list, and if you concatenate a list and a scalar, the scalar will be appended to each element of the list. (Look at the third example under "concatenation, pairwise" in the formula language documentation to see what I mean.)
The way you've worded your question is a little unclear to me, but what you need in your computed text formula is either something like this:
list := @DbLookup(etc., etc.);
list + @NewLine;
Or something like this:
multiValueFieldContainingListWithDbLookupResult + @NewLine;

I used @Implode(Dblookupreturnedvalue;"");
thanks All :)

VSTO accessing date of message as entered in the header

In VSTO I want to access the date of a sent message as it appears in the header on the recipient's client. Sent Items return an empty transport header (for obvious reasons), but I can't find a date that matches the one a non-Exchange recipient system would get from the message header.
I've tried:
.CreationTime
.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x30070040").ToString(); // MAPI creation time
.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x30080040").ToString(); // MAPI last modification time
.PropertyAccessor.GetProperty("http://schemas.microsoft.com/mapi/proptag/0x0E060040").ToString(); // MAPI date message delivered
But none of them match the actual Date: that appears in the header on the recipient end. Taking into account timezones, etc. the Date field is a couple of seconds off.
Any ideas on how to access the date of a sent item as it appears to the clients? I would have expected date of delivery or date of creation to match.

Try PR_CLIENT_SUBMIT_TIME (DASL name http://schemas.microsoft.com/mapi/proptag/0x00390040). Also keep in mind that OOM always rounds the date/time properties to the nearest second.

How do I access the "See Also" Field in the Wiktionary API?

Many of the Wiktionary pages for Chinese Characters (Hanzi) include links at the top of the page to other similar-looking characters. I'd like to use the Wiktionary API to send a single character in the query and receive a list of similar characters as the response. Unfortunately, I can't seem to find any query that includes the "See Also" field. Is this kind of query possible?

The “see also” field is just a line of wiki code in the page source, and there is no way for the API to know that it's different from any other piece of text on the page.
If you are happy with using only the English version of Wiktionary, you can fetch the wikicode with index.php?title=太&action=raw, and then parse the result for the template named also. In this case, the line you are looking for is {{also|大|犬}}.
To check if the template is used on the page at all, query the API for titles=太&prop=templates&tltemplates=Template:also
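A minimal sketch of the parsing step in Python, assuming the raw wikicode has already been downloaded. The function names `raw_url` and `extract_also`, the `https://en.wiktionary.org/w/index.php` base path, and the choice to skip named parameters (anything containing `=`) are assumptions made for this example.

```python
import re
from typing import List
from urllib.parse import urlencode

# Assumed entry point of the English Wiktionary.
INDEX_URL = "https://en.wiktionary.org/w/index.php"

# Matches a single {{also|...}} call without nested templates.
ALSO_RE = re.compile(r"\{\{also\|([^{}]*)\}\}")


def raw_url(title: str) -> str:
    """Build the action=raw URL for a page title."""
    return INDEX_URL + "?" + urlencode({"title": title, "action": "raw"})


def extract_also(wikitext: str) -> List[str]:
    """Return the positional arguments of the first {{also}} template."""
    match = ALSO_RE.search(wikitext)
    if match is None:
        return []
    parts = (part.strip() for part in match.group(1).split("|"))
    # Skip empty pieces and named parameters such as key=value.
    return [part for part in parts if part and "=" not in part]


sample = "{{also|大|犬}}\n==Translingual=="
print(raw_url("太"))
print(extract_also(sample))
```

Fetching the page itself (for example with `urllib.request.urlopen(raw_url("太"))`) is left out so that the sketch runs without network access.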
Similar templates are available in other language editions of Wiktionary, in case you want to use sources other than the English one. The current list is:
br:Patrom:gwelet
ca:Plantilla:vegeu
cs:Šablona:Viz
de:Vorlage:Siehe auch
el:Πρότυπο:δείτε
es:Plantilla:desambiguación
eu:Txantiloi:Esanahi desberdina
fi:Malline:katso
fr:Modèle:voir
gl:Modelo:homo
id:Templat:lihat
is:Snið:sjá einnig
it:Template:Vedi
ja:テンプレート:see
no:Mal:se også
oc:Modèl:veire
pl:Szablon:podobne
pt:Predefinição:ver também
ru:Шаблон:Cf
sk:Šablóna:See
sv:Mall:se även
It has been suggested that the Wikidata project be expanded to cover Wiktionary. If and when that happens, you might be able to query the Wikidata API for that kind of stuff!

Apply regional settings to application based on user profile

I have an MVC 4 application with international users (all over the world). I want to add a new page called profile settings where users can select their regional settings and by that I mean that they should be able to select:
- time zone (UTC +- .....)
- date format (dd.MM.yyyy or dd/MM/yyyy or MM/dd/yyyy ....)
- time format (12/24 - AM PM)
- number format (1234.56 or 1234,56)
After the user selects his regional settings, all specific data (date, time, number ...) should be shown in that specific format.
Any advice how to make this work?

Most of the time, you shouldn't expose each and every detail of the culture format to your users. Instead, provide a drop-down list of the cultures you want to support. Cultures are specified using codes. Some common codes are en-US (English/USA), es-MX (Spanish/Mexico) and de-DE (German/Germany). The first part refers to the language, and the second part refers to the specific country or region.
Once you have the selected culture code, you can apply it to each user, such as:
CultureInfo culture = new CultureInfo("en-US");
Thread.CurrentThread.CurrentCulture = culture;
If you are using culture-specific resource files, then you will also need:
Thread.CurrentThread.CurrentUICulture = culture;
There are several places you could do this, but a common location is in the Application_BeginRequest event in your global.asax file.
There is a good tutorial on MSDN here.
While it is common to think about time zones when you are considering regional settings, they are actually quite different things and should be considered separately. Time zones can't really be set globally, you need to consider how they affect your application logic in each and every place you work with date and time. You should look into the TimeZoneInfo class. If you have questions, please ask separately. Although if you search, you may find that many have already been answered.

Lucene query that eliminates xml tags in full text search

In Alfresco I need to write a Lucene query that excludes the XML tags from the content while searching.
For example, if a file try.xml is searched by content, the search should not match on the XML tags.
try.xml
<sample>This is an example</sample>
If I give the search text as "sample" it should not return the file name "try.xml".
So how could I achieve this?
Edit
I have tried the query below, with no change:
@cm\:name:"try*" -TEXT:"<*>" +TEXT:"sample"
What's wrong with the above query? I tried to get the files whose names start with "try", exclude the text inside tags, and search for the text "sample".

By default Alfresco treats XML files as plain text and indexes the XML tags as words, which is why they can be found via full text search. XML content is handled by the StringExtractingContentTransformer in Alfresco, which converts text/xml to text/plain before indexing it.
To see which transformers are registered in your Alfresco installation, you can check
http://localhost:8080/alfresco/service/mimetypes?mimetype=text/xml#text/xml
To prevent the indexing of XML tags, you have to write a special transformer which strips them out. See http://wiki.alfresco.com/wiki/Content_Transformations for an introduction to content transformation with Alfresco. The easiest way would be to integrate a command line utility that converts the XML file into text, or you could implement a Java class which does the transformation.
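As a rough sketch of what such a tag-stripping step could look like, here is the core of a small conversion utility in Python. The function name `xml_to_text` is an assumption made for this example, and wiring it into Alfresco as a transformer is not shown.

```python
import xml.etree.ElementTree as ET


def xml_to_text(xml_source: str) -> str:
    """Return only the text nodes of an XML document, joined by spaces.

    Tag names and attributes are dropped, so they never reach the index.
    """
    root = ET.fromstring(xml_source)
    pieces = (chunk.strip() for chunk in root.itertext())
    return " ".join(piece for piece in pieces if piece)


# The try.xml content from the question.
print(xml_to_text("<sample>This is an example</sample>"))  # This is an example
```

With the tags removed before indexing, a full text search for "sample" would no longer match try.xml, while a search for "example" still would.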

There's no standard way to do what you need. Here's an excerpt from the official documentation:
Wildcard queries: Wildcard queries using * and ? are supported as terms and phrases. For tokenized fields the pattern match can not be exact as all the non token characters (whitespace, punctuation, etc) will have been lost and treated as equal.
Basically, angle brackets are stripped out by default. You need to hack the indexing and query parsing processes in order to get the behavior you want.

Could you not just exclude the xml mimetype? (See http://wiki.alfresco.com/wiki/Search#Finding_nodes_by_content_mimetype for the syntax)
I guess you might want to exclude html too (so you'd exclude text/html and text/xml), that'd prevent you getting any nodes in your results that contain xml tags.
