python-pptx with matplotlib: fix image resolution

I am trying to generate a graph using matplotlib and insert it into a presentation with python-pptx. Everything works, but the image resolution is low once it is imported into the pptx. (I am just saving the figure to memory using StringIO, then using add_picture() in python-pptx to add the image.)
When I do:
some_image_in_memory = StringIO()
plt.savefig(some_image_in_memory)
it works fine but gives a low-resolution image. But when I do:
plt.savefig(some_image_in_memory, format='svg')
I get the error:
cannot identify image file <StringIO.StringIO instance at ..>
Is this even the correct approach? SVG should maintain resolution, but I can't read it into the pptx.

I got around this by passing a dpi value to savefig(), for example:
plt.savefig(some_image_stream_in_memory, dpi=1200)
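A minimal sketch of that workaround, assuming Python 3 (where the in-memory binary buffer is io.BytesIO rather than StringIO) and matplotlib running with the non-interactive Agg backend. The names render_png and png_size, the sample data, the 4x3 inch figure size, and dpi=300 are illustrative choices, not taken from the question. The returned buffer is the kind of file-like object that python-pptx's add_picture() accepts.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # headless backend, no display required
import matplotlib.pyplot as plt


def render_png(dpi=300, figsize=(4, 3)):
    """Render a sample plot to an in-memory PNG at the given dpi."""
    fig, ax = plt.subplots(figsize=figsize)
    ax.plot([0, 1, 2, 3], [0, 1, 4, 9])
    buf = io.BytesIO()
    # Ask for PNG explicitly and raise dpi above the default
    fig.savefig(buf, format="png", dpi=dpi)
    plt.close(fig)
    buf.seek(0)  # rewind so the consumer reads from the start
    return buf


def png_size(buf):
    """Read (width, height) in pixels from the PNG IHDR chunk."""
    data = buf.getvalue()
    return struct.unpack(">II", data[16:24])


image_stream = render_png(dpi=300)
print(png_size(image_stream))  # pixel size is figsize (inches) times dpi
```

A very high dpi such as 1200 also works, but the file grows quickly; a value matched to the size the picture will have on the slide is usually enough.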

Unfortunately, PowerPoint does not directly support the SVG format (I've heard it's a turf issue between MS and Adobe). I expect that explains the error you're getting when you save with format=svg.
Other folks seem to have good luck with the PNG format from matplotlib. I believe that's the default image format, but it might be worth a check.
The other thing that occurs to me is I don't see anywhere you have specified the size of the graph to be saved from matplotlib. If it is getting saved as a small image and then getting scaled significantly larger when displaying it in PowerPoint, this will produce a "grainy" appearance.
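The scaling effect described here comes down to simple arithmetic, sketched below. It assumes matplotlib's default figure size of 6.4 x 4.8 inches at 100 dpi; the 10 inch display width is a hypothetical value for a picture stretched across a slide, and the function names are illustrative.

```python
def pixel_size(figsize_in, dpi):
    """Pixel dimensions of a saved figure: inches times dots per inch."""
    return tuple(round(side * dpi) for side in figsize_in)


def effective_dpi(pixel_width, displayed_width_in):
    """Resolution actually seen once the image is stretched to a given width."""
    return pixel_width / displayed_width_in


# Assumed matplotlib defaults: 6.4 x 4.8 inch figure at 100 dpi
default_px = pixel_size((6.4, 4.8), 100)
# Same figure saved with dpi=300
hires_px = pixel_size((6.4, 4.8), 300)

# Hypothetical: the picture is shown 10 inches wide on the slide
print(default_px, effective_dpi(default_px[0], 10.0))
print(hires_px, effective_dpi(hires_px[0], 10.0))
```

The default image ends up well below 100 dpi when stretched to that width, which is where the grainy look comes from.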

Related

pdf2svg leads to blurry images

I'm trying to convert a PDF figure to SVG so I can edit some details with Inkscape. The problem is that the imported figure changes slightly, apparently through some sort of smoothing.
In particular, this is the original figure:
And this is the figure after converting to SVG:
This is the output of pdf2svg, which is exactly the same as what I get if I use Inkscape directly.
I attach a link where you can get both files.
https://www.dropbox.com/s/domxcc8pncyouy6/images.tar.gz?dl=0
Do you know a workaround to this issue?
Without seeing the SVG, it is hard to tell for sure. However, it looks like the "heat map" portion of your PDF/SVG may be a low-resolution bitmap that is being enlarged on the page.
By default, SVG renderers will use interpolation when enlarging an image. This gives the image a smoothed/blurry look at large scales.
You could try locating the <image> element in your SVG and adding the attribute image-rendering="pixelated" to the <image> tag. Some browsers support that option and will scale the image using the nearest-neighbour scaling method.
Otherwise you may need to extract the image from the PDF or SVG, resample it at a higher (e.g. 4x or 8x) resolution, then reinsert it into the file.
Find the image in the SVG file (<image id="image5" .../>).
Extract the Base64-encoded image from the data URI and decode it using a Base64 decoder.
Upscale the image by that factor using an editor such as Photoshop or Gimp.
Encode the file back to Base64.
Update that <image> element with the new Base64.
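The extract and reinsert steps above can be sketched with the Python standard library. This is only an illustration under assumptions: the helper names are made up, the demo SVG and its payload are fake stand-ins rather than the asker's file, and the upscaling itself is left to an image editor as described. The sketch also sets image-rendering="pixelated" when writing the image back.

```python
import base64
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
XLINK_NS = "http://www.w3.org/1999/xlink"
# The href may be namespaced (xlink:href) or plain (href)
HREF_KEYS = ("{%s}href" % XLINK_NS, "href")


def find_embedded_images(root):
    """Yield (element, href_key) for each <image> holding a Base64 data URI."""
    for elem in root.iter("{%s}image" % SVG_NS):
        for key in HREF_KEYS:
            value = elem.get(key, "")
            if value.startswith("data:") and ";base64," in value:
                yield elem, key
                break


def extract_image_bytes(elem, key):
    """Split the data URI into its header and the decoded image bytes."""
    header, payload = elem.get(key).split(",", 1)
    return header, base64.b64decode(payload)


def replace_image_bytes(elem, key, header, data):
    """Re-encode the image bytes and request nearest-neighbour scaling."""
    elem.set(key, header + "," + base64.b64encode(data).decode("ascii"))
    elem.set("image-rendering", "pixelated")


# Demo on a tiny hand-made SVG; the payload is a placeholder, not a real PNG
demo_svg = (
    '<svg xmlns="http://www.w3.org/2000/svg" '
    'xmlns:xlink="http://www.w3.org/1999/xlink">'
    '<image id="image5" width="10" height="10" '
    'xlink:href="data:image/png;base64,'
    + base64.b64encode(b"not-a-real-png").decode("ascii")
    + '"/></svg>'
)
root = ET.fromstring(demo_svg)
elem, key = next(find_embedded_images(root))
header, data = extract_image_bytes(elem, key)

# In practice: write `data` to a file, upscale it in an editor, read it back.
replace_image_bytes(elem, key, header, data)
print(header, len(data), elem.get("image-rendering"))
```

For a real file, ET.parse() and tree.write() would replace the in-memory string used here.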

Tesseract cannot recognize my image correctly

I am developing an Android app that needs to recognize captchas from a website.
I use tess-two to recognize the captchas and followed the TrainingTesseract3 instructions to train my own traineddata (using jTessBoxEditor to correct characters), but it recognizes the text incorrectly, or sometimes not at all.
The TIFF image below is the one I use to train Tesseract; I collected many captchas and merged them into one image.
TIFF image
The image that I want to recognize
For example, the expected result of the above image should be k8666, but the actual result is only 66.
Can anyone help me? Thanks.
I tried your images using a .NET wrapper for tesseract-ocr (Tesseract-ocr .Net Wrapper by Charliesw).
I got somewhat better results, such as K8EEE and K8656. I think you have to increase the font size of the text and make it bold. I saved the image in TIFF format at 96 DPI resolution; with those changes you may get better results than mine.

WebGL 2D Texture Display Error

I'm having an odd problem. On Chrome and Firefox, everything is fine, but in Safari when I load 2D images onto a particular panel (using WebGL) I get the following error:
WebGL: INVALID_VALUE: texImage2D: packImage error
The images are greyscale 128x128 jpegs. I can provide more code if necessary, but I'm having trouble even finding out what this packImage error means.
Thanks!
I found that after loading the texture, you just need to set the appropriate format. For instance:
var tex = THREE.ImageUtils.loadTexture('img/grayscale.png');
tex.format = THREE.LuminanceFormat;

White image while inserting a SVG image in TCPDF

I'm trying to insert some SVG images in a PDF using TCPDF with the method TCPDF::ImageSVG, but when I try this I get only a blank white area.
If I try to enable TCPDF::setRasterizeVectorImages the image shows in the PDF file, but it is rasterized of course and so its quality is not good.
Do you have any idea?
Thank you very much for your help!
Unfortunately, TCPDF's SVG handling is quite limited, and the cause of your issue depends on the SVG you are trying to use. Later versions of TCPDF support more SVG functionality, so if you haven't done so, try using a later version of TCPDF.
If an update doesn't resolve the issue, and you're forced to use raster images, you can improve quality at the cost of file size. You can do this by rasterizing them at a high DPI yourself outside of TCPDF. Once you've done this, take your new high-resolution raster image and add it to your PDF with the Image method like any other raster image. At work we usually rasterize to 300dpi, but your application may call for more or less.
If your image gets added to the PDF far larger on the page than you expected, specify at least one of the dimensions so TCPDF knows how much of the page you're intending the image to use.
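As a rough guide to how large the raster needs to be, the pixel count follows from the printed size and the target DPI. The sketch below assumes dimensions given in millimetres; the function name and the 100 x 50 mm example are illustrative.

```python
import math

MM_PER_INCH = 25.4


def pixels_for_print(width_mm, height_mm, dpi=300):
    """Pixel dimensions needed to cover a printed area at the given dpi."""
    return (
        math.ceil(width_mm / MM_PER_INCH * dpi),
        math.ceil(height_mm / MM_PER_INCH * dpi),
    )


# Hypothetical example: an image meant to occupy 100 x 50 mm on the page
print(pixels_for_print(100, 50))
```

Rasterizing at these dimensions and then passing the same physical width to the Image method keeps the image at the intended DPI on the page.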

PDFBox : Converting to image : Quality loss when converting PDF containing scanned documents

My use case is pretty simple: I need to convert PDFs to images. I tried using Apache PDFBox, and I am having trouble converting PDFs that contain scanned images. When I convert a scanned page, the image clarity is lost due to compression/scaling, so I tried extracting the image data from the PDF and storing it directly. The problem is that I may also get PDF files that contain both images and text, in which case I would need to fall back to image conversion mode. How can I differentiate between pages/documents that contain only an image and those with composite data? I thought I could use the ProcSet definition for this purpose, but it appears to be marked as obsolete and unreliable in the PDF specification. Another possibility is to check all the objects linked to the page and see whether it contains anything other than images. Please let me know if there is an easier way of doing this.
Thanks
If your intention is to convert a PDF to an image, it is better to use ImageMagick for that. ImageMagick has a lot of options for controlling the quality of the image, and converting a PDF to an image with it is pretty simple.
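A hedged sketch of such a conversion, driven from Python. It assumes ImageMagick is installed (as "magick" or "convert") and can read PDFs, which normally requires Ghostscript. The function names, file names, and the 300 dpi and quality 90 values are illustrative. The -density option sets the resolution at which the PDF is rasterized and has to come before the input file.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, out_pattern, density=300, executable="convert"):
    """Build an ImageMagick command that rasterizes a PDF at `density` dpi."""
    return [
        executable,
        "-density", str(density),  # rasterization resolution, before the input
        pdf_path,
        "-quality", "90",          # JPEG quality of the output
        out_pattern,               # e.g. page-%03d.jpg, one file per page
    ]


def convert_pdf(pdf_path, out_pattern, density=300):
    """Run the conversion if an ImageMagick executable is on PATH."""
    # Note: on Windows, "convert" may resolve to an unrelated system tool
    exe = shutil.which("magick") or shutil.which("convert")
    if exe is None:
        raise RuntimeError("ImageMagick not found on PATH")
    cmd = build_convert_command(pdf_path, out_pattern, density, exe)
    subprocess.run(cmd, check=True)


# Only prints the command; call convert_pdf(...) to actually run it
print(" ".join(build_convert_command("scan.pdf", "page-%03d.jpg")))
```

Raising the density is what counters the quality loss on scanned pages, at the cost of larger output files.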