How to fetch type of the document using HstLink object - hippocms

In Hippo, I have created a custom link processor class, and in the preProcess method I want to fetch the type of the document through the HstLink object.

The simple answer is that you can't. This is just a link, and at this point it hasn't even been matched to content. In fact, the point of manipulating the link is so that it can be matched to the right content.


Retrieve the configuration parameter in adobe livecycle

I have set a configuration parameter containing a path in an Adobe LiveCycle process. I want that path to be reflected in the fragment source file property of an XDP. Is there a way to retrieve the value of a configuration parameter in Adobe LiveCycle Designer? Can it be done using JavaScript in Designer?
If I understand your question correctly, here is what you are doing:
In a LiveCycle process orchestration, you are reading in a configuration parameter.
You have a form with a fragment embedded in it and you would like to read in the configuration parameter.
Your form is probably saved on the LiveCycle server.
Since the process orchestrations are called at runtime, you can't call a variable/configuration parameter during design time. If you want to read it in at runtime, I would suggest the following approach:
Add a setValue component that injects the runtime parameter into a hidden field (if you just need to read the value and perform some calculation based on it) or a visible field.
Test your rendered form to ensure that the variable value is injected correctly into the template.
Do let me know if you have any other questions.

Combining a page map with a PDF so that annotations move with pages in updated PDFs

Has anyone managed to provide an end user with an updated PDF in a way that lets the user transfer local annotations to the new PDF and keeps them on the correct page, even when pages are inserted into the updated PDF at a point earlier than the annotations?
I thought there might be a page map or page guid approach that someone has used.
Sorry - I hope that is clear.
Instead of using the page index as ID of the page, you can use the page content stream instead (after decoding/decompression). Most PDF libraries will give you access to that, so you could compute an MD5 hash from the page content and search for that instead on your "updated" file in order to know where to transfer your annotations.
This assumes that the content of each unchanged page is indeed identical in the updated file, which is not a common scenario.
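A minimal sketch of the hashing idea above, in Python. It assumes you have already used a PDF library to extract each page's decoded content stream as bytes; the function names and the sample byte strings are hypothetical and only illustrate the matching step.

```python
import hashlib
from typing import Dict, List, Optional


def page_digest(content: bytes) -> str:
    """Return the MD5 hex digest of one page's decoded content stream."""
    return hashlib.md5(content).hexdigest()


def map_pages(old_pages: List[bytes], new_pages: List[bytes]) -> Dict[int, Optional[int]]:
    """Map each old page index to the index of the identical page in the new file.

    Old pages with no identical match map to None. If several new pages
    share the same content, the first one wins.
    """
    new_index_by_digest: Dict[str, int] = {}
    for index, content in enumerate(new_pages):
        new_index_by_digest.setdefault(page_digest(content), index)
    return {
        index: new_index_by_digest.get(page_digest(content))
        for index, content in enumerate(old_pages)
    }


# Hypothetical content streams: a page was inserted at the front of the
# updated file, and the old second page was edited.
old = [b"BT (Intro) Tj ET", b"BT (Chapter 1) Tj ET"]
new = [b"BT (Preface) Tj ET", b"BT (Intro) Tj ET", b"BT (Chapter 1, revised) Tj ET"]
page_map = map_pages(old, new)
```

Annotations on an old page whose entry is None have no exact match, so they would need a fallback such as manual placement.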

Dumping browser document content using Zombie.js

Using browser.visit, I am fetching a page as shown in the documentation. According to the browser API, browser.document returns the main window's document. However, I am not sure how to dump (display) the contents of the document. Is there a method like browser.document.toString() or browser.document.text() that prints the contents of the document to the console?
What you want is probably:
There is a browser.text(selector, context?).
Selector is a CSS selector evaluated against the document body.
Context is an optional second argument; when given, the CSS selector is evaluated against the element passed as the context.
You can say something like browser.text('body') to get the text in the body.
I got here while looking for an answer to the same question.
I may be late to the party, but try using:
var fs = require('fs');
Browser.visit(url, function(error, browser){
  fs.appendFileSync('index.html', browser.html());
});
Remember to add error checking and better handling here, but this should give you the basic HTML document.
If it's not necessarily HTML (like you find yourself pulling XML or JSON through Zombie due to complicated, valid reasons...), you can access it like this:

Where I can get hyperlinks in pdf document structure (except "Annots" entry in page dictionary)?

I have two pdf documents (doc1 and doc2) with hyperlinks e.g,
According to PDF Specification I can get those hyperlinks via Link Annotations. Link Annotations can be found in pdf page's dictionary under "Annots" key.
CGPDFDictionaryRef pageDictionary = CGPDFPageGetDictionary(someCGPDFPage);
CGPDFArrayRef annots;
CGPDFDictionaryGetArray(pageDictionary, "Annots", &annots);
So the problem is that in one pdf document (doc1) I get that "Annots" array but in another document (doc2) there is no such entry in page dictionary.
And the thing is that with PDFKit.framework you can get those annotations in PDFPage class using - (NSArray *)annotations method even if there is no "Annots" entry in page dictionary.
I can't use PDFKit.framework on iPad/iPhone so I am working with Quartz framework :)
So it seems that there is another place where you can specify hyperlinks (or Link Annotations, in PDF Reference terms), not only in the "Annots" array, and PDFKit.framework somehow knows how to find them.
Any ideas where I can get those hyperlinks?
Links on a page THAT YOU CAN CLICK ON have to be annotations. Period. No annotations, no links.
A string of text "" isn't necessarily a link, it's just a piece of text describing a URL. This may be what's causing your confusion.
It's also possible to embed link actions in bookmarks. I'm not at all familiar with PDFKit or Quartz, so you're on your own as far as API calls are concerned.
And finally, (having reread your question), I believe annotations can be inherited from their parent Pages object. Gonna have to look that one up... Nope. The annotations array MUST be in the leaf page object, or it's not valid.
Can you post links to your PDFs? Something Ain't Right here.
A PDF viewer such as Adobe Reader simply lets you click and navigate on plain text if it looks like a hyperlink, i.e. it starts with http://, https://, or ftp:// and ends at a URL delimiter such as a space. As simple as that ;)
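The heuristic described above can be sketched roughly as follows. This is an illustration in Python, not how any particular viewer is actually implemented; the pattern and the function name are assumptions, and the text is assumed to have already been extracted from the page.

```python
import re
from typing import List

# Assumed heuristic: a known scheme followed by everything up to the
# next whitespace character.
URL_PATTERN = re.compile(r"(?:https?|ftp)://\S+")


def find_url_like_text(page_text: str) -> List[str]:
    """Return substrings of extracted page text that look like URLs."""
    return URL_PATTERN.findall(page_text)


sample = "See http://example.com/docs or ftp://example.org/file.txt for details."
matches = find_url_like_text(sample)
```

Text found this way has no Link Annotation behind it, which would explain a page with clickable URLs but no "Annots" entry.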

how to read/parse dynamically generated web content?

I need to find a way to write a program (in any language) that will connect to a website and read dynamically generated data from the website.
Note that it's dynamically generated, so it's not enough to get the source HTML: the data I'm interested in is generated via JavaScript that references back-end code. When I view the webpage source, I can't see the data. (For example, go to Google and do a search. Check the source code of the search results page. Very little of the data your browser is displaying is reflected in the source; most of it is dynamically generated. I need some way to access this data.)
Pick a language and environment that includes an HTML renderer (e.g. .NET and the WebBrowser control). Use the HTML renderer to get the URL and produce an HTML DOM in memory (making sure that scripting is enabled). Read the contents of the HTML DOM after the renderer has done its work.
Example (you'll need to do this inside a System.Windows.Form derived class):
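WebBrowser browser = new WebBrowser();
browser.DocumentCompleted += (sender, e) => {
    HtmlDocument document = browser.Document;
    // extract what you want from the document
};
browser.Navigate(url); // url: address of the page to load (not given in the original)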
I used to have a Perl program that accessed a site to get driving directions from one location to another. I parsed the returned page and saved the results to a database. That works as long as the source never changes its format. The problem is that the source format changes often, and your parser has to change with it.
A simple thought: if we're talking about AJAX, you can instead look up the URLs that serve the dynamic data. Then you can use the JavaScript on the page you're talking about to reformat this.
If you have Firefox/Greasemonkey, making a DOM dumper should be a simple matter.
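As a rough sketch of the AJAX approach above, in Python: once you have found the URL the page's script calls (for example in the browser's network inspector), you can request it yourself and read the data without rendering the page. The JSON shape, the "results" and "title" keys, and the function names here are hypothetical; a real endpoint will differ and may require cookies or extra headers.

```python
import json
import urllib.request
from typing import Any, Dict, List


def fetch_json(url: str, timeout: float = 10.0) -> Any:
    """Request a data endpoint directly and decode its JSON body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_titles(payload: Dict[str, Any]) -> List[str]:
    """Pull the 'title' field from each item under the assumed 'results' key."""
    return [item["title"] for item in payload.get("results", [])]


# Offline stand-in for what fetch_json(...) might return from such an endpoint.
sample_payload = json.loads('{"results": [{"title": "First"}, {"title": "Second"}]}')
titles = extract_titles(sample_payload)
```

This breaks whenever the endpoint or its response format changes, so it carries the same maintenance cost as parsing the rendered page.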