As the first image shows, the meta tag is displayed correctly in Inspect Element mode, but it is displayed incorrectly in View Page Source mode, as in the second image. Thank you in advance for suggesting a solution to this problem.
I understood the answer:
By default, the HTML encoding engine safelists only the Basic Latin block, because browsers have bugs and the encoder is trying to protect against unknown problems. The &XXX values you see still render correctly, as you can see in your screenshots, so there's no real harm aside from the increased page size.
If the increased page size bothers you, you can customise the encoder to safelist your own character ranges (not languages; Unicode doesn't think in terms of language).
To widen the set of characters the encoder treats as safe, insert the following line into the ConfigureServices() method in Startup.cs:
services.AddSingleton<HtmlEncoder>(
HtmlEncoder.Create(allowedRanges: new[] { UnicodeRanges.BasicLatin,
UnicodeRanges.Arabic }));
Arabic has quite a few blocks in Unicode, so you may need to add more blocks to get the full range you need.
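To make the size trade-off concrete, here is a minimal Python sketch. It is not the ASP.NET Core encoder; it only mimics the default behaviour described above by writing every character outside Basic Latin (U+0000 to U+007F) as a hexadecimal character reference. The function name encode_outside_basic_latin is made up for this illustration.

```python
import html


def encode_outside_basic_latin(text: str) -> str:
    """Escape HTML metacharacters, then replace every character above
    U+007F with a hexadecimal numeric character reference."""
    pieces = []
    for ch in html.escape(text, quote=True):
        pieces.append(ch if ord(ch) < 0x80 else f"&#x{ord(ch):X};")
    return "".join(pieces)


# Five Arabic letters (U+0645 U+0631 U+062D U+0628 U+0627).
original = "\u0645\u0631\u062d\u0628\u0627"
encoded = encode_outside_basic_latin(original)

# A browser decodes the references back to the same characters.
decoded = html.unescape(encoded)

print(len(original), len(encoded), decoded == original)
```

The decoded text is identical to the original, which is why Inspect Element shows the expected characters while View Page Source shows the longer, reference-encoded markup.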
How can I make the page larger in such a way that the text and images stretch according to the new size?
I only found ways to scale down but not scale up... any idea?
Thanks in advance!!!
There are two conceptual options:
For each page, enlarge the MediaBox and CropBox, prepend a scaling operation on the current transformation matrix to the page content stream (or array of streams), and update annotation positions and sizes.
For each page set the property UserUnit to a value > 1.
The second option can e.g. be implemented like this using PDFBox and a stretch factor of 1.7:
PDDocument document = PDDocument.load(SOURCE);
for (PDPage page : document.getPages()) {
page.getCOSObject().setFloat("UserUnit", 1.7f);
}
document.save(TARGET);
(The second option obviously is much easier to implement than the first one. Feature incomplete PDF viewers might ignore this value, though. If you need to support such incomplete viewers, you probably have to go for the first option.)
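For comparison, the geometry behind the first option can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not PDFBox code: the function names are hypothetical, boxes and annotation rectangles are assumed to be plain [llx, lly, urx, ury] lists, and the scaling is assumed to be uniform about the origin.

```python
def scale_box(box, factor):
    """Scale a PDF rectangle [llx, lly, urx, ury] about the origin.
    The same arithmetic applies to MediaBox, CropBox and annotation Rects."""
    return [coordinate * factor for coordinate in box]


def scaling_prefix(factor):
    """Content-stream fragment to prepend to the page content: a 'cm'
    operator that multiplies the current transformation matrix by a
    uniform scale."""
    return f"{factor:g} 0 0 {factor:g} 0 0 cm\n"


media_box = [0, 0, 612, 792]          # US Letter, in default user space units
new_media_box = scale_box(media_box, 2)
prefix = scaling_prefix(2)

print(new_media_box)
print(prefix, end="")
```

Because the page content and every box are multiplied by the same factor, everything on the page keeps its relative position while the page itself grows.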
I'm building a QuickLook plugin. I want to change the width of the window that pops up when the user hits the spacebar.
I've read that there are two keys in the project's Info.plist file that make the height and width customisable. Even when I change those values, I can't get the preview window to the size I want.
I don't know what else to try. Any idea?
Thanks!
Thought I'd dig a little on this. I have not tried any of the following suggestions, so nobody get their hopes up. I'll assume you're using the generator callback:
OSStatus (*GeneratePreviewForURL)(
void *thisInterface,
QLPreviewRequestRef preview,
CFURLRef url,
CFStringRef contentTypeUTI,
CFDictionaryRef options
);
Before anything else, you might manually check the options dictionary argument and verify that the kQLPreviewPropertyWidthKey and kQLPreviewPropertyHeightKey keys are indeed mapped to the desired CFNumber values.
Referring to each of these properties, the Apple QuickLook programming guide says:
Note that this property is a hint; Quick Look might set the width
automatically for some types of previews. The value must be
encapsulated in a CFNumber object.
(Edit: If your preview representation is flexible, you might try finding a preview type for which QuickLook honors your size hints, as per the statement above. Just a thought.)
Running nm on the QuickLook framework binary revealed some undocumented kQLPreviewProperty-- constants as well as the aforementioned width and height keys. One that caught my attention was kQLPreviewPropertyAutoSizeKey. Recalling Apple's statement about ignoring the hints to set the size automatically, this might be significant? Following the convention in QuickLook.framework/Headers/QLBase.h, you might try declaring
extern const CFStringRef kQLPreviewPropertyAutoSizeKey;
Then you could try associating a CFNumber 0 with that property key in the options dictionary. There are other undocumented keys of note, such as kQLPreviewPropertyAttributesKey.
Back to the Info.plist you mentioned, Apple says about those keys QLPreviewWidth and QLPreviewHeight:
This number gives Quick Look a hint for the width (in points) of
previews. It uses these values if the generator takes too long to
produce the preview. (emphasis added)
This is where someone makes the terrible suggestion of calling sleep() in your generator. But I'm perplexed as to why Apple would make following the size hints dependent on the generator latency. (?)
Edit: Also note the above statement says the Info.plist hints must be expressed in points (not pixels), a unit dependent on the user's screen resolution.
Recently I was developing a Quick Look Plugin myself which uses HTML+CSS and faced the same problem.
The solution for me was to test the plugin not from within Xcode, with qlmanage as the executable, but with the real .qlgenerator installed in my user library.
When invoking the generator from my user library, the Quick Look window was resized exactly the way I specified in the *-Info.plist.
I've run into the same problem, and may offer some clues: In my case I'm generating an image quick look preview for my custom file format. I initiate the preview context to draw my preview into using
CGContextRef QLPreviewRequestCreateContext(QLPreviewRequestRef preview, CGSize size, Boolean isBitmap, CFDictionaryRef properties);
The curious thing is that if I set isBitmap to true, quick look adjusts the preview panel size to the size specified for the context (up to a certain size at least). But if you set isBitmap to false, it seems to disregard the context size and instead always shows a full size preview panel with the vector graphics image scaled to cover the entire panel.
So, if you use a bitmap graphical preview context, it seems the preview panel will be set to the size of the context you specify. However, I haven't found any way to set the size of the panel when using a vector graphic preview context (which is what I want).
I have a set of PDFs, from which I want to process (in VB.NET) only those that are not text-searchable. Can you please tell me how to go about this?
Generally speaking, the way to do this is open up each page and rip the content stream and see if any text operators are executed that place text on the page.
Let me explain what that means - PDF content is a small RPN language that contains operations that mark the page in some way. For example, you might see something like this:
BT 72 400 Td /F0 12 Tf (Throatwarbler Mangrove) Tj ET
Which means:
Begin a text area
Set the position of the text baseline to (72, 400) in PDF units
Set the font to a resource named F0 from the current page's font resource dictionary, at size 12
Draw the text "Throatwarbler Mangrove"
End a text area
So you can try shortcuts:
Does my page's resource dictionary contain any fonts?
This will fail in some cases because some PDF generation tools put fonts into the resource dictionary and don't use them (false positive). It will also fail if the page content contains a Form XObject which contains text (false negative).
Does my page's content stream have BT/ET operators?
This will get you closer, but will fail if there is no content in them (false positive) or if they're not present, but there's a Form XObject which contains text (false negative).
So really, the thing to do is to execute the entire page's content stream, including recursing on all XObject to look for text operators.
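As a rough illustration of what "looking for text operators" involves, here is a minimal Python sketch that tokenizes a content stream and reports whether any text-showing operator (Tj, TJ, ' or ") is executed. It makes several simplifying assumptions: the stream has already been decompressed and decoded to a str, inline images (BI ... ID ... EI) are absent, and Form XObjects invoked with Do are not followed, so the recursion described above is left out. The function names are made up for the example.

```python
TEXT_SHOWING_OPERATORS = {"Tj", "TJ", "'", '"'}
DELIMITERS = "()<>[]{}/%"


def is_number(token):
    try:
        float(token)
        return True
    except ValueError:
        return False


def operators(content):
    """Yield the operators of a decoded content stream, skipping operands
    (numbers, names, strings, arrays, dictionaries) and comments."""
    i, n = 0, len(content)
    while i < n:
        ch = content[i]
        if ch.isspace():
            i += 1
        elif ch == "%":                      # comment runs to end of line
            while i < n and content[i] not in "\r\n":
                i += 1
        elif ch == "(":                      # literal string, parentheses may nest
            depth = 1
            i += 1
            while i < n and depth > 0:
                if content[i] == "\\":
                    i += 1                   # skip the escaped character
                elif content[i] == "(":
                    depth += 1
                elif content[i] == ")":
                    depth -= 1
                i += 1
        elif content.startswith("<<", i) or content.startswith(">>", i):
            i += 2                           # dictionary delimiters
        elif ch == "<":                      # hex string
            while i < n and content[i] != ">":
                i += 1
            i += 1
        elif ch in "[]{})>":
            i += 1
        elif ch == "/":                      # name operand such as /F0
            i += 1
            while i < n and not content[i].isspace() and content[i] not in DELIMITERS:
                i += 1
        else:                                # number or operator
            start = i
            while i < n and not content[i].isspace() and content[i] not in DELIMITERS:
                i += 1
            token = content[start:i]
            if not is_number(token):
                yield token


def page_shows_text(content):
    """True if the stream executes at least one text-showing operator."""
    return any(op in TEXT_SHOWING_OPERATORS for op in operators(content))


sample = "BT 72 400 Td /F0 12 Tf (Throatwarbler Mangrove) Tj ET"
print(page_shows_text(sample), page_shows_text("BT ET"))
```

A production check would also need to walk every XObject referenced by Do and apply the same test to its stream.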
Now, there's another approach that you can take using Atalasoft's software (disclaimer: I work for Atalasoft and have written most of the PDF handling code; I also worked on Acrobat versions 1-4). Instead of asking "does this page contain any text?", you can ask "does this page contain only a single image?"
bool allPagesImages = true;
using (Document doc = new Document(inputStream))
{
foreach (Page p in doc.Pages)
{
if (!p.SingleImageOnly)
{
allPagesImages = false;
break;
}
}
}
This leaves allPagesImages as a pretty decent indication that every page consists of a single image. If you're looking to OCR the non-searchable documents, that is probably what you really want.
The down side is that this will be a very high price for a single predicate, but it also gets you a PDF rasterizer and the ability to extract the images directly out of the file.
Now, I have no doubt that a solid engineer could work their way through the PDF spec and write some code to extend iTextPdfSharp to do this task. I think that if I sat down with it, I might be able to write that predicate in a few days, but I already know most of the PDF spec, so it might take you more like two weeks to a month. Your choice.
Here is an option you could consider. I haven't tested the code, but I think it can be done by reading the properties of each PDF file that you want to process.
You might check this link:
http://www.codeguru.com/columns/vb/manipulating-pdf-files-with-itextsharp-and-vb.net-2012.htm
You would read the Producer property right after opening the file. That is only an example. My advice is to include your code here so that we can try to help you. Bless you.
Consider the following, I have paragraph data being sent to a view which needs to be placed over a background image, which has at the top and the bottom, fixed elements (fig1)
Fig1.
My thought was to split this into 4 labels (Fig1, example 2). My question is how I can get the text to flow through labels 1 to 4, given that labels 1, 2 and 3 are of fixed height. I assumed here that label 3 should be populated before label 4, hence the layout in the attached diagram.
Can someone suggest the best way of doing this with maybe an example?
Thanks
Wish I could help more, but I think I can at least point you in the right direction.
First, your idea seems very possible, but would involve lots of calculations of text size that would be ugly and might not produce ideal results. The way I see it working is a binary search of testing portions of your string with sizeWithFont: until you can get the best guess for what the label will fit into that size and still look "right". Then you have to actually break up the string and track it in pieces... just seems wrong.
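To show the shape of that binary search, here is a minimal Python sketch. It is not iOS code: the fits callbacks stand in for a real measurement (for example, a check built on sizeWithFont: against each label's size), and a simple character budget is used here purely so the example runs. It also assumes that if a piece of text does not fit, no longer piece will. All function names are hypothetical.

```python
def longest_fitting_prefix(words, fits):
    """Binary search for the largest k such that the first k words fit.
    Assumes fits() is monotonic: once a prefix fails, longer ones fail too."""
    lo, hi = 0, len(words)
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if fits(" ".join(words[:mid])):
            lo = mid
        else:
            hi = mid - 1
    return lo


def flow_text(text, fixed_label_fits):
    """Split text across the fixed-height labels in order; whatever is
    left over goes into one final, flexible label."""
    words = text.split()
    pieces = []
    for fits in fixed_label_fits:
        k = longest_fitting_prefix(words, fits)
        pieces.append(" ".join(words[:k]))
        words = words[k:]
    pieces.append(" ".join(words))
    return pieces


def char_budget(limit):
    """Stand-in measurement: a label 'fits' up to `limit` characters."""
    return lambda text: len(text) <= limit


pieces = flow_text("one two three four five six", [char_budget(7), char_budget(9)])
print(pieces)
```

Even in this toy form the bookkeeping is visible: every label needs its own search and its own slice of the string, which is the part that "just seems wrong" compared with letting a text layout engine do the flow.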
In iOS 6 (unfortunately this doesn't apply to you right now, but I'll post it as a potential benefit to others), you could probably use one UILabel and an NSAttributedString. There are a couple of options here (I haven't done it, so I'm not sure which would be best), but it seems that if you can format the page with HTML, you can initialize the attributed string that way.
From the docs:
You can create an attributed string from HTML data using the initialization methods initWithHTML:documentAttributes: and initWithHTML:baseURL:documentAttributes:. The methods return text attributes defined by the HTML as the attributes of the string. They return document-level attributes defined by the HTML, such as paper and margin sizes, by reference to an NSDictionary object, as described in “RTF Files and Attributed Strings.” The methods translate HTML as well as possible into structures of the Cocoa text system, but the Application Kit does not provide complete, true rendering of arbitrary HTML.
An alternative here would be to just use the available attributes, setting line indents and such according to the image size. I haven't worked with attributed strings at this level, so the best reference would be the developer videos and the programming guide for NSAttributedString. https://developer.apple.com/library/mac/#documentation/Cocoa/Conceptual/AttributedStrings/AttributedStrings.html#//apple_ref/doc/uid/10000036-BBCCGDBG
For earlier versions of iOS, you'd probably be better off becoming familiar with CoreText. In the end you'll be rewarded with a better looking result, reusability/flexibility, the list goes on. For that, I would start with the CoreText programming guide: https://developer.apple.com/library/mac/#documentation/StringsTextFonts/Conceptual/CoreText_Programming/Introduction/Introduction.html
Maybe someone else can provide some sample code, but I think just looking through the docs will give you less of a headache than trying to calculate 4 labels like that.
EDIT:
I changed the link for CoreText
You have to go with CoreText: create your AttributedString and a CTFramesetter with it.
Then you can get a CTFrame for each of your textboxes and draw it in your graphics context.
https://developer.apple.com/library/mac/#documentation/Carbon/Reference/CTFramesetterRef/Reference/reference.html#//apple_ref/doc/uid/TP40005105
You can also use a UIWebView
I am creating a large LaTeX document, and my appendix has reproductions of several booklets that I have as PDFs. I am trying to create a section header and then include the pages at a slightly lower scale. For example:
\section{Booklet about Yada Yada Yada}
\includepdf[pages={-}, frame=true, scale=0.8]{booklet_yadayada.pdf}
However, pdfpages does two annoying things. First, it devotes one output document page to each included document page. I can live with that, as I am using 80% scale. The main problem, however, is that the first included page also starts on a new page, so I have a page with just a section title, and then a separate page with the booklet.
Is there some way to get pdfpages to be a little smarter here?
\includepdf uses \includegraphics internally, so something like
\section{Foo}
\fbox{\includegraphics[page=1,scale=0.8]{foo.pdf}}
would include the page without starting a new one, although it only does one page at a time.
For me the following worked just fine:
\includepdf[pages=1,pagecommand=\section{Section Heading}]{testpdf}
\includepdf[pages=2-,pagecommand={}]{testpdf}
I tried this solution too, but \includepdf keeps the advantage of outputting the file over the margin (the output is centered from the edges of the page).
So I opened pdfpages.sty and searched for the \newpage command. I deleted the first occurrence (line 326), just to try, and after saving and compiling again, there was no page break anymore.
Use the minipage environment:
\chapter*{Sujet du stage}
%\fbox{
\begin{minipage}{\textwidth}
\includepdf[scale=0.8]{../sujet-stage/main.pdf}
\end{minipage}
It doesn't add any extra page and it works with includepdf.
Thanks for all the answers - I couldn't for the life of me figure out what logic \includepdf uses to insert blank pages; the trick with including the first page via \includegraphics solved most (but not all) of those problems; so here are some notes:
First, out of curiosity, I have also tried to use only \includepdf, but split in two parts:
\includepdf[pages=1]{MYINCLDOC.pdf}
\includepdf[pages=2-last]{MYINCLDOC.pdf}
... unfortunately, this has the same problem as the question in OP.
Since #WASE's answer, there are now multiple \newpages in the source (pdfpages.sty). I tried reading the source, but I found it quite difficult; so I tried temporarily setting \newpage to \relax only for \includepdf - and that puts all pages in the document on top of each other; so probably not a good idea to get rid of \newpage blindly.
Just \includegraphics[page=1,scale=0.8]{foo.pdf} works, but (as #WASE also notes) it is aligned at the top-left corner of the page body, which is to say inside the margins; for a full page we'd want the PDF inclusion overlaid over the whole page, margins included.
The TeX - LaTeX page "How do I add an image in the upper, left-hand corner using TikZ and graphicx" points to several possibilities for positioning over the margins; but for me, the best solution for a full-page PDF inclusion is to use the tikz package to center it on the page:
\begin{tikzpicture}[remember picture,overlay]
\node at (current page.center) {\includegraphics[page=1]{MYINCLDOC.pdf}};
\end{tikzpicture}
\includepdf[pages=2-last]{MYINCLDOC.pdf}
After this is done, as a bonus, I have also experienced:
Proper targets of PDF bookmarks (going to the right page when clicked)
If you use package pax, the data seems to be included also for the \includegraphics standalone first page, so no difference there
If you have a twoside document - pdfpages, with the above split of the first page in \includegraphics, will now (seemingly) correctly insert the equivalent of \cleardoublepages between pdfs that are included back to back (so I don't have to insert such a command manually).
Hope this helps someone,
Cheers!
I'm a little late, but the following solution worked for me:
\includepdf[pages={-},angle=90, scale=0.7]{lorem-ipsum.pdf}
All pages are imported, scaled and rotated by 90 degrees.
Works with Texmaker 5.0.4