Providing an embedded webkit with resources from memory - webkit

I'm working on an application that embeds WebKit (via the Gtk bindings). I'm trying to add support for viewing CHM documents (Microsoft's bundled HTML format).
HTML files in such documents have links to images, CSS etc. of the form "/blah.gif" or "/layout.css" and I need to catch these to provide the actual data. I understand how to hook into the "resource-request-starting" signal, and one option would be to unpack parts of the document to temporary files and change the URI at this point to point at these files.
What I'd like to do, however, is provide WebKit with the relevant chunk of memory. As far as I can see, you can't do this by catching resource-request-starting, but maybe there's another way to hook in?

An alternative is to base64-encode the image into a data: URI. It's not exactly better than using a temporary file, but it may be simpler to code.
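The encoding step itself is small. Here is a minimal sketch in Python (not the Gtk/WebKit API; the function name `to_data_uri` and the sample bytes are made up for illustration) that turns a chunk of memory into a data: URI, which could then be substituted for the original link:

```python
import base64
import mimetypes


def to_data_uri(data: bytes, filename: str) -> str:
    """Encode raw bytes as a data: URI, guessing the MIME type from the name."""
    mime, _ = mimetypes.guess_type(filename)
    if mime is None:
        mime = "application/octet-stream"  # fallback for unknown extensions
    payload = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{payload}"


# Hypothetical bytes standing in for an image pulled out of the CHM archive.
uri = to_data_uri(b"GIF89a", "/blah.gif")
print(uri)
```

Keep in mind that base64 grows the payload by roughly a third, so this is most attractive for small resources.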

Related

Is there a way to hide assets from users within my Mac App?

I have developed a simple app for Mac which uses a browser window to display some content. Now the assets (images etc.) are visible to anyone who receives the app and reveals its contents in Finder using "Show Package Contents".
Is there a way to prevent this? Can I hide it or encapsulate it somehow using code or some Xcode function?
A trivial way would be to change the extension on your files so the system doesn't recognize them as images. You'd then have to load the images as data and convert them to images in code, which would be a bit of a pain.
A more rigorous solution would be to encrypt the images in your app bundle, then write a utility function that loads and decrypts images.
Here's another option.
You can zip all the assets. Use whatever is easiest e.g. pkzip or gzip or even just tar it all. Then you hide a lot of info and, if you want to go the extra step, it is easy to encrypt the zipped file and there are lots of libraries around to include in your project and use to unzip it with.
It should be easy to read assets directly from the zipped file, but if you need them individually you could e.g. put a single file / resource inside a zip or you could unzip it. You could even unzip to temporary space and remove it all when the app quits if you have really sensitive stuff that is too big to fit in memory.
** EDIT **
Java works this way, right? A jar file is just a renamed zip that often contains all of the resources the app needs, and it seems to work well there. So if that is a guide, performance should not be too bad.
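To illustrate reading an asset straight out of a zipped bundle without unpacking it to disk, here is a rough sketch in Python rather than in the app's own language. The helper `read_asset` and the member name `images/logo.png` are invented for the example; a Mac app would use an equivalent zip library.

```python
import io
import zipfile


def read_asset(archive_bytes: bytes, name: str) -> bytes:
    """Return one member of a zip archive without unpacking it to disk."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)


# Build a tiny archive in memory to stand in for the bundled assets file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("images/logo.png", b"not really a PNG")

logo = read_asset(buffer.getvalue(), "images/logo.png")
print(len(logo), "bytes read from the archive")
```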

Get selected "PostScript" from PDF

I wasn't able to find anything on the internet and I get the feeling that what I want is not such a trivial thing. To make a long story short: I'd like to get my hands on the underlying code that describes the PDF document of a selected area from a .pdf file. I've been looking for libraries or open source readers but couldn't find anything useful yet.
Does there exist something that might be able to accomplish my needs here or anything that might be reused (like an open source reader) to get there a little faster and not having to write everything from scratch?
You can convert a whole PDF document to PostScript using pdftops, one of the utilities from the poppler PDF rendering library.
This utility enables you to convert individual pages, which is at least a start.
If you just want to extract bitmapped images, try pdfimages from the same package. This extraction can also be restricted to individual pages.
The poppler library was originally written for UNIX-like systems, but there are a couple of Windows builds available.
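As a sketch of the per-page conversion, the snippet below builds a pdftops command line using its -f (first page) and -l (last page) options. The file names `input.pdf` and `page3.ps` and the helper name are placeholders; the command is only run if pdftops and the input file are actually present.

```python
import os
import shutil
import subprocess
from typing import List


def pdftops_page_command(pdf_path: str, page: int, ps_path: str) -> List[str]:
    """Build a pdftops command line that converts a single page."""
    # -f and -l select the first and last page to convert.
    return ["pdftops", "-f", str(page), "-l", str(page), pdf_path, ps_path]


command = pdftops_page_command("input.pdf", 3, "page3.ps")
if shutil.which("pdftops") and os.path.exists("input.pdf"):
    subprocess.run(command, check=True)
else:
    print("would run:", " ".join(command))
```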
The open source tool from iText called iText RUPS does what you want: it shows you all the PDF commands for a particular PDF and allows you to visualize the structure and relationships.
http://sourceforge.net/projects/itextrups/

progressive pdf download

I want to download a PDF file progressively in an iPad application. I'm not sure how to do that, and Google wasn't very helpful. Can anyone help me understand the concepts here, please? I am planning to render in Core Graphics.
Thanks.
Do you mean you want to render pdf pages before download is completed? If yes:
First of all, PDF format initially was not designed for that.
Let me explain. A PDF file consists of a number of objects and an xref. The xref is a table containing the location (in bytes from the beginning) of every object within the file, so objects may be located at arbitrary positions within the file. Even worse, the xref itself is located at the end of the file, so you can't locate any object until you have downloaded the whole file.
So, PDF is designed for random access. Actually, HTTP allows random access too (through range requests), so if you really need it, you can try to implement it :)
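To make the random-access idea concrete: a PDF ends with a `startxref` keyword followed by the byte offset of the cross-reference table, so a client can fetch just the tail of the file first. The sketch below only parses such a tail; the sample bytes and the function name are made up, and the actual range request is left out.

```python
import re


def find_startxref(tail: bytes) -> int:
    """Return the byte offset recorded after the last 'startxref' keyword."""
    matches = re.findall(rb"startxref\s+(\d+)", tail)
    if not matches:
        raise ValueError("no startxref keyword found in the supplied bytes")
    return int(matches[-1])


# A made-up file ending; a real client would fetch roughly the last 1 KB
# with an HTTP request header such as "Range: bytes=-1024".
sample_tail = b"trailer\n<< /Size 22 /Root 1 0 R >>\nstartxref\n18799\n%%EOF\n"
offset = find_startxref(sample_tail)
print("xref table starts at byte", offset)
```

With that offset, a second range request could fetch the xref table itself, and further requests could fetch only the objects needed for a given page.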
Good news for you: starting from PDF 1.2 there is a special feature called "Linearized PDF". It is designed exactly for your task, so you can render the first page before the next one is downloaded. You can google around or check the PDF reference for more details. The most important thing: a PDF file has to be linearized with special tools, so not every PDF file can be rendered progressively.
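A quick way to tell whether a given file was linearized is to look for the linearization parameter dictionary, which carries a /Linearized key and sits at the very start of the file. The following is a heuristic sketch only (it does a plain byte search rather than parsing the object), and the sample byte strings are invented:

```python
def looks_linearized(head: bytes) -> bool:
    """Heuristic: a linearized file has a /Linearized key in its first 1024 bytes."""
    return b"/Linearized" in head[:1024]


# Invented file beginnings for illustration.
linearized_head = b"%PDF-1.4\n1 0 obj\n<< /Linearized 1 /L 12345 >>\nendobj\n"
plain_head = b"%PDF-1.4\n1 0 obj\n<< /Type /Catalog >>\nendobj\n"

print(looks_linearized(linearized_head), looks_linearized(plain_head))
```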
Bad news for you: it looks like Core Graphics doesn't support it. I haven't actually tried it, but I found nothing about linearized PDF in the Core Graphics documentation. (Please let me know if you find anything.) So you may need to render the PDF manually.
I'm not entirely sure about the iPad, but when you do a Save As... in Acrobat, by default the file is optimized for Fast Web View, which allows downloading a page at a time instead of the whole document in one go.
http://www.adobe.com/designcenter-archive/acrobat/articles/acr6optimize/acr6optimize.pdf
Linearized PDF will meet your needs. You need a capable reader such as the one from Adobe to utilize this feature.

Naming convention for assets (images, css, js)?

I am still struggling to find a good naming convention for assets like images, js and css files used in my web projects.
So, my current would be:
CSS: style-{name}.css
examples: style-main.css, style-no_flash.css, style-print.css etc.
JS:
script-{name}.js
examples: script-main.js, script-nav.js etc.
Images: {imageType}-{name}.{imageExtension}
{imageType} is any of these
icon (e. g. question mark icon for help content)
img (e. g. a header image inserted via <img /> element)
button (e. g. a graphical submit button)
bg (image is used as a background image in css)
sprite (image is used as a background image in css and contains multiple "versions")
Example-names would be: icon-help.gif, img-logo.gif, sprite-main_headlines.jpg, bg-gradient.gif etc.
So, what do you think and what is your naming convention?
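The image scheme above is regular enough to check mechanically. Here is a small sketch that does so; the allowed extensions and the character set for {name} are assumptions inferred from the examples, not part of the stated convention.

```python
import re

IMAGE_TYPES = ("icon", "img", "button", "bg", "sprite")

# {imageType}-{name}.{imageExtension}; extensions and name characters are assumed.
IMAGE_NAME = re.compile(
    r"^(?P<type>" + "|".join(IMAGE_TYPES) + r")-(?P<name>[a-z0-9_]+)\.(?P<ext>gif|jpg|png)$"
)


def parse_image_name(filename: str):
    """Return the parts of a conforming file name, or None if it does not conform."""
    match = IMAGE_NAME.match(filename)
    return match.groupdict() if match else None


print(parse_image_name("sprite-main_headlines.jpg"))
print(parse_image_name("logo.gif"))
```

A check like this could run in a build step to flag files that drift from the convention.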
I've noticed a lot of frontend developers are moving away from css and js in favor of styles and scripts because there is generally other stuff in there, such as .less, .styl, and .sass as well as, for some, .coffee. Fact is, using specific technology selections in your choice of folder organization is a bad idea even if everyone does it. I'll continue to use the standard I see from these highly respected developers:
src/html
src/images
src/styles
src/styles/fonts
src/scripts
And their destination build equivalents, which are sometimes prefixed with dest depending on what they are building:
./
images
styles
styles/fonts
scripts
This allows those that want to put all files together (rather than breaking out a src directory) to keep that and keeps things clearly associated for those that do break out.
I actually go a bit further and add
scripts/before
scripts/after
These get smooshed into two scripts, main-before.min.js and main-after.min.js: one for the header (with essential elements of normalize and modernizr that have to run early, for example) and one for the last thing in the body, since that JavaScript can wait. These are not intended for reading, much like Google's main page.
If there are scripts and style sheets that make sense to minify and leave linked on their own because of a particular cache management approach, that is taken care of in the build rules.
These days, if you are not using a build process of some kind, like gulp or grunt, you likely are not reaching most of the mobile-centric performance goals you should probably be considering.
I place CSS files in a folder css, JavaScript in js, images in images, and so on. Add subfolders as you see fit. No need for any naming convention on the level of individual files.
/Assets/
    /Css
    /Images
    /Javascript (or Script)
        /Minified
        /Source
Is the best structure I've seen and the one I prefer. With folders you don't really need to prefix your CSS etc. with descriptive names.
For large sites where css might define a lot of background images, a file naming convention for those assets comes in really handy for making changes later on.
For example:
[component].[function-description].[filetype]
footer.bkg-image.png
footer.copyright-gradient.png
We have also discussed adding in the element type, but I'm not sure how helpful that is, and it could be misleading when changes are required later:
[component].[element]-[function-description].[filetype]
footer.div-bkg-image.png
footer.p-copyright-gradient.png
You can name it like this:
/assets/css/ - For CSS files
/assets/font/ - For Font files. (Mostly you can just go to google fonts to search for usable fonts.)
/assets/images/ - For Images files.
/assets/scripts/ or /assets/js/ - For JavaScript files.
/assets/media/ - For video and misc. files.
You can also replace "assets" with a "resource" or "files" folder name and keep the names of its subfolders. Having exactly this folder structure isn't very important; the only important thing is to arrange your files by format, like creating a folder "/css/" for CSS files or "/images/" for image files.
First, I divide into folders: css, js, img.
Within css and js, I prefix files with the project name, because your site may include js and css files which are components; this makes it clear which files are specific to your site and which relate to plugins.
css/mysite.main.css js/mysite.main.js
Other files might be like
js/jquery-1.6.1.js
js/jquery.validate.js
Finally images are divided by their use.
img/btn/submit.png a button
img/lgo/mysite-logo.png a logo
img/bkg/header.gif a background
img/dcl/top-left-widget.jpg a decal element
img/con/portait-of-something.jpg a content image
It's important to keep images organized, since there can be over 100 of them and they can easily get mixed together and confusingly named.
I tend to avoid anything generic, such as what smdrager suggested. "mysite.main.css" doesn't mean anything at all.
What is "mysite"? The one I'm working on? If so, it's obvious really, but it already has me wondering what it might be and whether it really is that obvious!
What is "Main"? The word "Main" has no definition outside the coder's knowledge of what is within that css file.
While ok in certain scenarios, avoid names like "top" or "left" too: "top-nav.css" or "top-main-logo.png".
You might end up wanting to use the same thing elsewhere, and putting an image in a footer or within the main page content called "top-banner.png" is very confusing!
I don't see any issue with having a good number of stylesheets to allow for a decent naming convention to portray what css is within the given file.
How many depends entirely on the size of the site, what its function(s) are, and how many different blocks are on the site.
I don't think you need to state "CSS" or "STYLE" in the CSS filenames at all. The files sit in a "css" or "styles" folder, have a .css extension, and are only ever referenced in the <head> area, so I know pretty clearly what they are.
That said, I do this with library, JS and config (etc.) files, e.g. libSomeLibrary.php or JSSomeScript.php. PHP and JS files are included or used in various areas within other files, so having the file's main purpose in its name is useful.
E.g., seeing require('libContactFormValidation.php'); is useful: I know it's a library file (lib) and, from the name, what it does.
For image folders, I usually have images/content-images/ and images/style-images/. I don't think there needs to be any further separation, but again it depends on the project.
Then each image will be named according to what it is, and again I don't think there's any need to state that the file is an image within the file name. Including the dimensions can be useful, especially when the same image exists in several sizes.
site-logo-150x150.png
site-logo-35x35.png
shop-checkout-button-40x40.png
shop-remove-item-20x20.png
etc
A good rule to follow is: if a new developer came to the files, would they sit scratching their head for hours, or would they likely understand what things do and only need a little time researching (which is unavoidable)?
As with anything like this, however, one of the most important rules to follow is simply consistency!
Make sure you follow the same logic and patterns throughout all your naming conventions!
From simple css file names, to PHP library files to database table and column names.
This is an old question, but still valid.
My current recommendation is to go with something along these lines:
assets (or assets-web or assets-www); this one is intended for static content used by the client (browser)
    data; some XML files and other stuff
    fonts
    images
    media
    styles
    scripts
    lib (or 3rd-party); this one is intended for code you don't make or modify, the libraries as you get them
    lib-modded (or 3rd-party-modified); this one is intended for code you weren't expected to modify, but had to, like applying a workaround/fix until the library provider releases one
inc (or assets-server or assets-local); this one is intended for content used server side, not to be used by the client, like libraries in languages like PHP or server scripts, like bash files
    fonts
    lib
    lib-modded
I marked in bold the usual ones, the others are not usual content.
The reason for the main division is that in the future you can decide to serve the web assets from a CDN or restrict client access to server assets, for security reasons.
Inside the lib directories I tend to be descriptive about the libraries, for example
lib
    jquery.com
        jQuery
            vX.Y.Z
    github
        [path]
            [library/project name]
                vX.Y.Z (version)
so you can replace the library with a new one, without breaking the code, also allowing future code maintainers, including yourself, to find the library and update it or get support.
Also, feel free to organize the content inside according to its usage, so images/logos and images/icons are expected directories in some projects.
As a side note, the assets name is meaningful, not only meaning we have resources in there, but meaning the resources in there must be of value for the project and not dead weight.
The BBC have tons of standards relating to web development.
Their standard is fairly simple for CSS files:
http://www.bbc.co.uk/guidelines/futuremedia/technical/css.shtml
You might be able to find something useful on their main site:
http://www.bbc.co.uk/guidelines/futuremedia/

Web served image in PDF?

Do PDF and/or Adobe Reader support including an image by URL, so that you can insert dynamic images from a web server into a document?
The answer to your question is both yes and no. If you look in the PDF spec (I'm going by version 1.7) in section 7.11.5, you'll see that a stream within a PDF document can be represented by a URL. So yes, you can go ahead and specify that a PDF has, say, its image content at the specified URL.
The problem will be that when you specify an image within PDF, you are specifying a PARTICULAR image that must have a particular data length and encoding. Simply specifying dimensions, DCT compression (aka JPEG), and URL is not enough. Images are contained in streams of a particular length. If the stream is too long or too short, it is considered an error.
So you can have images dynamically served up, provided that they are always exactly the same byte length. I think. And I say this because the specification is somewhat ambiguous as to what happens when you set the length to 0 in the stream dictionary.
Now, is doing this practical? Maybe - you'll need a fairly strong PDF toolkit in order to be able to author these documents. And if you have that, I think you'd be better off authoring the entire PDF document that your clients want on the fly rather than trying to substitute an image at read time.
I don't believe you can place a dynamic image in a PDF document in this manner. It's possible to dynamically create an entire PDF document using web-hosted content (using PHP, Coldfusion, etc.) but changing that content later on the web server will not dynamically update previously generated PDF documents, which is what it sounds like you want to do.
As PDFs are meant to be portable by nature (PORTABLE Document Format) and thus, not always viewed online, this goes against the very principle of the document format, and is not supported as far as I know.
You could include a reference to an image at the time of generation of the PDF, but said image will be embedded into the PDF, not linked.
You could use pdf.js and modify the rendering methods slightly so that you inject your image. You can find pdf.js here: https://github.com/mozilla/pdf.js
You can also use FlexPaper which has an API that allows you to overlay your document with images
http://flexpaper.devaldi.com/