I am reading some .docx files that contain hyperlinks with Docx4J.
When I click those hyperlinks manually I get their URLs, but when I read the files with Docx4J I get only the text, with nothing about the hyperlinks or their URLs.
Document Text -
Infosys Chairman, KV Kamath said that IT services were facing challenges of scalability. Speaking at the 31st Annual General Meeting of the company in Bangalore, Kamath said the management has met all the challenges successfully and demonstrated leadership. Infosys, India's second largest IT services company announced a final dividend of Rs 22/share. The company also announced a special dividend of Rs 10/share on account of the 10th year of operations of the Infosys BPO. Speaking at the AGM, S D Shibulal, CEO of Infosys said that transformation is complete and the company is now focussed on growth. "Infosys 3.0 will help company address challenges," said Shibulal. Shibulal said: "We had a choice between commoditization and re-defining the industry. We chose to redefine the industry."..more
The hyperlink is on the word "more".
Docx4J returns only the text 'more'. It gives no information about the hyperlink.
Is there any way to get that URL?
Please help.
Problem: Attempting to insert a JSON string into a Postgres table column of json datatype intermittently returns this error for some record insertion attempts but not others.
I confirmed using multiple third party 'JSON validator' apps that the JSON I am inserting is indeed valid, and I have confirmed that any single ' quote characters have been escaped with the double '' technique, and the issue persists.
What are some additional troubleshooting steps to consider?
Here is a scrubbed sample JSON I have attempted:
{"id": "jf4ba72kFNQ","publishedAt": "2012-09-02T06:07:28Z","channelId": "UCrbUQCaozffv1soNdfDROXQ","title": "Scout vs. Witch: a tale of boy meets ghoul (Official Version)","tags": ["L4D","TF2","SFM","animation","zombies","Valve","video game"],"description": "Howdy folks (he''s alive!). I made a new SFM video (October 2015), called \"Nick in a Hotel Room\". Please check it out: https://www.youtube.com/watch?v=FOCTgwBIun0\n\nAlso check out some early behind the scenes of Scout vs. Witch:\nhttps://www.youtube.com/watch?v=73tQEBgD09I\n\nYou can find links to my stuff on my website: http://nailbiter.net\n\n-----\n\nhey gang,\nI''m the animator who made this cartoon. Hope you like it.\n\nThis is my little mash-up of a bunch of stuff I like. What happens when the Scout from Valve''s Team Fortress 2 video-game walks into the wrong neighborhood (Left 4 Dead). Hilarity (and a bodycount) ensues. It was created using Source Film Maker (for all the dialog stuff and the montage at the beginning), and with TF2/Source SDK for the entire 300 alley-run sequence. I had already completed that part before SFM was released. The big zombie horde scenes and a couple others were shot in Left 4 Dead. I hope you get a kick out of it.\n\nStuff I did:\nI animated all of the characters (using Maya) except for the big crowd scenes and parts of the headcrab zombie (the crawling and the legs). The faces in the dialog scenes were animated in SFM.\n\nAlso did additional mapping, particles, motion graphics, zombie maya rigging, and created blendshapes for the Witch''s face to enable her to talk/emote. I didn''t do a full set, just the phonemes I needed for this performance. Inspiration for her performance was based on Meg Mucklebones (if you''ve ever seen Legend) mixed with the demon ladies in Army of Darkness. I have a feeling Valve had seen those movies too when they designed her..\n\nthanks for watching."}
I am answering this question by enumerating all the other troubleshooting steps I have found so far. Some are working knowledge that practitioners will already have; others are more obscure insights (or are buried in the Postgres docs, which are thorough but esoteric) that I found through my own trial and error.
Steps
Make sure you have escaped every single quote ' character by doubling it, like ''
Make sure your JSON string is actually a single-line string. JSON is very easy to copy as a multiline string, and Postgres JSON columns will not accept this (the fix is as easy as hitting backspace on each newline)
The most obscure one I've found: even inside a JSON string field, the ? question mark, oddly enough, breaks the JSON syntax for Postgres. Something like {"url": "myurl.com?queryParam=someId"} is rejected as invalid. Solve this by escaping the question mark, like: {"url": "myurl.com\?queryParam=someId"}
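The first two steps can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper name to_sql_json_literal is made up for this example, the record is assumed to be available as a Python object, and the question-mark step is not covered. When the client driver supports parameterized queries, passing the JSON as a bound parameter generally avoids manual escaping altogether.

```python
import json


def to_sql_json_literal(record):
    """Return `record` as a single-quoted SQL literal holding one-line JSON.

    Hypothetical helper for illustration; not part of any Postgres driver.
    """
    # json.dumps emits a single line: a newline inside a string value becomes
    # the two-character escape \n instead of a raw line break.
    one_line = json.dumps(record, ensure_ascii=False)
    # Double every single quote so it survives inside a SQL string literal.
    return "'" + one_line.replace("'", "''") + "'"


if __name__ == "__main__":
    sample = {"title": "Scout vs. Witch", "description": "he's alive!\nthanks"}
    print(to_sql_json_literal(sample))
```

The printed literal can then be pasted into an INSERT statement for manual testing.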
I am looking for a BAPI to search FI documents based on input criteria (document type, posting date, ...). It should work the same way as FB03, but like the Document List screen, not the screen with only three inputs (Document Number, Company Code, Fiscal Year).
As I don't have the document number, I need a search-enabled BAPI.
I am using BAPI_ACC_DOCUMENT_POST for posting.
Any ideas?
I need to answer my own question. I was hoping to skip these two days of investigation by getting an answer here :)
BAPI_ACC_CO_DOCUMENT_FIND is the correct BAPI to use for searching posted FI documents. What I found out is that to search by posting date, I have to provide the Controlling Area; without it, nothing is returned (instead of an error).
I've searched the net and cannot find a simple HTMLAgilityPack example that extracts one piece of information from a web page. Most of the examples are in C#, and code converters don't work properly. The developer's forum wasn't helpful either.
Anyway, I am trying to extract the string “Consumer Defensive” from this URL, “http://quotes.morningstar.com/stock/c-company-profile?t=dltr”, and this text from the same page: “Dollar Tree Stores, Inc., operates discount variety stores in United States and Canada. Its stores offer merchandise at fixed price of $1.00 and C$1.25. The company operates stores under the names of Dollar Tree, Deal$, Dollar Tree Canada, etc.”
I tried the code at this link: https://stackoverflow.com/questions/13147749/html-agility-pack-with-vb-net-parsing but GetPageHTML is not declared.
This one is in C#: "HTML Agility pack - parsing tables"
and so on.
Thanks.
The HTML returned from that URL is translated to XML with 2 root nodes, so it cannot be transformed directly into an XML document.
For the values you wish to retrieve, it may be easier to simply retrieve the HTML document and search for the start and end tags of the strings you wish to extract.
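The start-and-end-tag search can be sketched as below. The sketch is in Python rather than the VB.NET of the question, the function name extract_between is made up for this example, and the span markup is hypothetical: the actual tags surrounding each value have to be read from the page source by hand.

```python
def extract_between(html, start_marker, end_marker):
    """Return the text between the first start_marker and the next end_marker.

    Returns None when either marker is missing.
    """
    start = html.find(start_marker)
    if start == -1:
        return None
    start += len(start_marker)
    end = html.find(end_marker, start)
    if end == -1:
        return None
    return html[start:end].strip()


if __name__ == "__main__":
    # Hypothetical markup, used only to show the call.
    page = '<span id="sector"> Consumer Defensive </span>'
    print(extract_between(page, '<span id="sector">', "</span>"))
```

This kind of plain string search is brittle if the page layout changes, but it avoids having to parse the document as XML.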
How can I get Google's rich snippets to display a format like the following?
The rich snippet displays "Job Title", "Company", "Location", and "Posted":
glassdoor jobs - Computerworld
25+ items - 5158+ glassdoor jobs available on Computerworld.
Job Title Company Location Posted.
Senior Software Engineer ... Riverbed Technology Sunnyvale, CA Aug 09.
Senior Java Software Engineer Glassdoor.com Manhattan, NY Aug 17.
Does it use microdata, microformats, or RDFa?
Or do I need to write a specific HTML structure?
I know about the JobPosting microdata type, but I think this format suits me better.
Thanks for your help!
These are called "bulleted snippets".
※ Use a consistent structure, whatever it is.
※ Keep extraneous code to a minimum.
※ Test removing your META description or setting it to “”.
http://moz.com/blog/how-do-i-get-googles-bulleted-snippets
http://insidesearch.blogspot.tw/2011/08/new-snippets-for-list-pages.html
When doing research I find myself usually annotating a pdf document (highlighting, adding notes), then I will create a note in Evernote and index all my annotations.
For example,
p 3 - "is it possible for schools to change their practices and thereby have a strongly positive effect on student achievement?"
p 10 - "the district boldly moved forward with several new reforms"
My hope is to work with a PDF document, annotate it, then run an applet that copies all my annotations (highlights and notes) to the clipboard, so that I can paste them into a note and have an index of all the points I found useful.
I am using a Mac, and I am open to whichever language would make this simplest to create. My thought is that AppleScript would be best.
Skim can export notes as text, and it also has an AppleScript dictionary.
tell application "Skim" to tell document 1 to save as "notes as text" in "/Users/username/Desktop/notes.txt"
The output looks like this:
* Highlight, page 1
ocument (highlighting
* Text Note, page 1
aa
* Highlight, page 1
ent, annotate it,
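To turn that export into the "p N - ..." index from the question, something like the following Python sketch might work. It assumes the export always looks like the sample above (a header line of the form "* Kind, page N" followed by the note text); the function name index_notes is made up for this example.

```python
import re

# Matches header lines such as "* Highlight, page 1" or "* Text Note, page 12".
HEADER = re.compile(r"^\* (?P<kind>.+), page (?P<page>\d+)$")


def index_notes(exported_text):
    """Turn Skim's 'notes as text' export into lines like: p 1 - "quoted text"."""
    entries = []  # each entry is [page_number, list_of_body_lines]
    for line in exported_text.splitlines():
        stripped = line.strip()
        match = HEADER.match(stripped)
        if match:
            # A header starts a new annotation.
            entries.append([int(match.group("page")), []])
        elif entries and stripped:
            # Any other non-empty line belongs to the current annotation.
            entries[-1][1].append(stripped)
    return ['p {} - "{}"'.format(page, " ".join(body)) for page, body in entries]


if __name__ == "__main__":
    sample = "* Highlight, page 1\nocument (highlighting\n* Text Note, page 1\naa\n"
    print("\n".join(index_notes(sample)))
```

On a Mac, the printed result can be piped to pbcopy to place it on the clipboard.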