How to add multiple images to a PDF using prawn-table according to image size in Ruby on Rails

I am using the code below to generate a PDF with multiple images using prawn-table in Ruby on Rails.
def image_generation_with_pdf(result_files)
  pdf = Rails.root.join("tmp/images/test.pdf").to_s
  Prawn::Document.generate(pdf) do
    data = []
    # Build one table row per pair of image files.
    result_files.each_slice(2) do |batch_file|
      image1 = { image: File.open(batch_file[0]), image_width: 250 }
      # An odd number of files leaves the last cell empty.
      image2 = batch_file[1].present? ? { image: File.open(batch_file[1]), image_width: 250 } : ""
      data << [image1, image2]
      File.delete(batch_file[0]) if batch_file[0].present?
      File.delete(batch_file[1]) if batch_file[1].present?
    end
    table(data)
  end
  [pdf]
end
In the example above, each row holds only two image columns.
I want each row to hold as many image columns as fit, based on the image size. All images are the same size.
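One way to think about the layout is as plain arithmetic: divide the usable page width by the width of one image cell, then split the file list into rows of that many cells. The Python sketch below only illustrates that calculation. The 540 pt usable width and the 5 pt cell padding are assumptions chosen for the example, and the function names are hypothetical.

```python
from math import floor


def columns_per_row(usable_width, image_width, cell_padding=5):
    """Return how many equally sized image cells fit side by side."""
    # Assumption: each cell is the image width plus padding on both sides.
    cell_width = image_width + 2 * cell_padding
    return max(1, floor(usable_width / cell_width))


def rows_of(items, columns, filler=""):
    """Split items into rows of `columns` cells, padding the last row."""
    rows = []
    for start in range(0, len(items), columns):
        row = list(items[start:start + columns])
        row += [filler] * (columns - len(row))
        rows.append(row)
    return rows


# Assumed values: 540 pt of usable width, images 250 pt wide.
columns = columns_per_row(usable_width=540, image_width=250)
layout = rows_of(["a.png", "b.png", "c.png"], columns)
print(columns, layout)
```

In the Prawn block, the usable width would presumably be read from bounds.width, and the row split corresponds to each_slice(columns) instead of the hard-coded each_slice(2).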

Related

Multi page PDF from AppScript - How to insert page breaks? [duplicate]

I would like to programmatically set page breaks in my Google Spreadsheet before exporting it to PDF, using Apps Script.
It should be possible, since you can manually set the page breaks when you print the Spreadsheet (https://support.google.com/docs/answer/7663148?hl=en).
I found that it's possible in Google Docs (https://developers.google.com/apps-script/reference/document/page-break), but nothing equivalent is mentioned for Sheets.
Is there a way to do it, even if it's a "hack"?
Talking about "hacks", you may try to capture the HTTP request that the Spreadsheet sends to Google when you save a sheet as PDF, using the Network tab of the browser's developer tools.
From this request you can get the formatting parameter pc, which in my case looks like this:
[null,null,null,null,null,null,null,null,null,0,
[["1990607563"]],
10000000,null,null,null,null,null,null,null,null,null,null,null,null,null,null,
43866.56179325232,
null,null,
[0,null,1,0,0,0,1,1,1,1,2,1,null,null,2,1],
["A4",0,6,1,[0.75,0.75,0.7,0.7]],
null,0,
[["1990607563",[[45,92],[139,139]],[[0,15]]]],0]
where:
[["1990607563",[[45,92],[139,139]],[[0,15]]]],0] // page breaks parameters
Note though that I used custom page breaks and landscape orientation, which are reflected in the response above.
Putting it all together, the following code does the trick:
function exportPDFtoGDrive(ssID, filename, source) {
  // Sheet ID to export; falls back to the value from the captured request above.
  source = source || "1990607563";
  var dt = new Date();
  // encodeDate is not defined in this answer; the captured value 43866.56179325232
  // suggests it returns a spreadsheet serial date.
  var d = encodeDate(dt.getFullYear(), dt.getMonth(), dt.getDate(), dt.getHours(), dt.getMinutes(), dt.getSeconds());
  var pc = [null,null,null,null,null,null,null,null,null,0,
    [[source]],
    10000000,null,null,null,null,null,null,null,null,null,null,null,null,null,null,
    d,
    null,null,
    [0,null,1,0,0,0,1,1,1,1,2,1,null,null,2,1],
    ["A4",0,6,1,[0.75,0.75,0.7,0.7]],
    null,0,
    [[source,[[45,92],[139,139]],[[0,15]]]],0]; // last entry: page break parameters
  var folder = DriveApp.getFoldersByName("FolderNameGoesHere").next();
  var options = {
    'method': 'post',
    'payload': "a=true&pc=" + JSON.stringify(pc) + "&gf=[]",
    'headers': {Authorization: "Bearer " + ScriptApp.getOAuthToken()},
    'muteHttpExceptions': true
  };
  const esid = Math.round(Math.random() * 10000000);
  const theBlob =
    UrlFetchApp.fetch("https://docs.google.com/spreadsheets/d/" + ssID + "/pdf?id=" + ssID + "&esid=" + esid, options).getBlob();
  folder.createFile(theBlob).setName(filename + ".pdf");
}

function myExportPDFtoGDrive() {
  var ss = SpreadsheetApp.openById('yourSpreadSheetID');
  var sheet = ss.getSheetByName("NameGoesHere");
  var filename = ss.getName() + " [" + sheet.getName() + "]";
  exportPDFtoGDrive(ss.getId(), filename);
}
A more detailed explanation of the hack is available in the article "Export Google Sheets to PDF", though in Russian only.
I use a workaround: I adjust the page size by altering the row heights to fit the paper size I want (A4).
When exporting to PDF, Google scales the sheet to fit the page width. I add up the widths of the columns and then set the row heights accordingly. The numbers were chosen by trial and error.
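var width = 0;
// Sum the widths of all columns (getColumnWidth is 1-based).
for (var z = 0; z < s4.getLastColumn(); z++) {
  width += s4.getColumnWidth(z + 1);
}
// Page height in pixels for this total width; the 1050/800 ratio was found by trial and error.
var a4PageHeightPixels = 1050 * width / 800;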
Because I wanted all rows to be the same height, I set the row height by dividing the page height by the number of rows. Having ensured the last row was blank, I adjusted the last row to absorb the rounding error.
var rowHeight = Math.floor(a4PageHeightPixels / numDataRows);
var lastRowHeight = a4PageHeightPixels - (numDataRows - 1) * rowHeight;
s4.setRowHeights(pageFirstRow, numDataRows - 1, rowHeight);
s4.setRowHeight(pageFirstRow + numDataRows - 1, lastRowHeight);
(s4 is the sheet I am using.) However, I would expect most people would simply want to insert a blank row at the bottom of each page and adjust its height to fit the PDF paper size.
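To make the rounding step concrete, here is the same arithmetic as a small Python sketch. The total column width of 800 pixels and the 40 rows per page are made-up example values, and the function name is hypothetical; only the 1050/800 ratio comes from the answer above.

```python
from math import floor


def row_heights(total_column_width, num_data_rows):
    """Return (page_height, row_height, last_row_height) in pixels."""
    # Same trial-and-error ratio as in the answer: 1050 px high per 800 px wide.
    page_height = 1050 * total_column_width / 800
    row_height = floor(page_height / num_data_rows)
    # The last (blank) row absorbs whatever the rounding left over.
    last_row_height = page_height - (num_data_rows - 1) * row_height
    return page_height, row_height, last_row_height


# Assumed example: columns add up to 800 px, 40 rows per page.
page_height, row_height, last_row_height = row_heights(800, 40)
print(page_height, row_height, last_row_height)
```

With these example values, 39 rows of 26 pixels plus a final row of 36 pixels add up to the 1050-pixel page height.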

How to get multiple images from a multipage pdf in typoscript?

Using an Images element I can add a PDF and have it render as a JPG. However it only does the first page. Is there any way to output each page as JPG files?
I am using the layout field to change how the images are rendered by typoscript. Can I split the pdf somehow?
eg.
# Image Layouts
temp.image < tt_content.image.20
tt_content.image.20 >
tt_content.image.20 = CASE
tt_content.image.20 {
  key.field = layout
  default < temp.image
  101 < temp.image
  101 {
    ??
  }
}
As described in the TypoScript Reference, you can use the frame option to select the page:
10 = IMAGE
10.file = fileadmin/some.pdf
10.frame = 0
20 < .10
20.frame = 1
etc.
But as far as I know, there is no automatic mode that loops over the pages or detects whether a given frame exists in the PDF.

django.db.utils.IntegrityError when trying to delete duplicate images

I've got the following code to delete duplicate images from a perceptual hash I calculated.
images = Image.objects.all()
images_deleted = 0
for image in images:
    duplicates = Image.objects.filter(hash=image.hash).exclude(pk=image.pk).exclude(hash="ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff")
    for duplicate in duplicates:
        duplicate_tags = duplicate.tags.all()
        image.tags.add(*duplicate_tags)
        duplicate.delete()
        images_deleted += 1
print(str(images_deleted))
Running it, I get the following exception:
django.db.utils.IntegrityError: insert or update on table
"crawlers_image_tags" violates foreign key constraint
"crawlers_image_t_image_id_72a28d1d54e11b5f_fk_crawlers_image_id"
DETAIL: Key (image_id)=(5675) is not present in table
"crawlers_image".
Can anyone shed some light on what exactly the problem is?
edit:
models:
class Tag(models.Model):
    name = models.CharField(max_length=100)

    def __str__(self):
        return self.name

class Image(models.Model):
    origins = (
        ('PX', 'Pexels'),
        ('MG', 'Magdeleine'),
        ('FC', 'FancyCrave'),
        ('SS', 'StockSnap'),
        ('PB', 'PixaBay'),
        ('TP', 'tookapic'),
        ('KP', 'kaboompics'),
        ('PJ', 'picjumbo'),
        ('LS', 'LibreShot')
    )
    source_url = models.URLField(max_length=400)
    page_url = models.URLField(unique=True, max_length=400)
    thumbnail = models.ImageField(upload_to='thumbs', null=True)
    origin = models.CharField(choices=origins, max_length=2)
    tags = models.ManyToManyField(Tag)
    hash = models.CharField(max_length=200)

    def __str__(self):
        return self.page_url

    def create_hash(self):
        thumbnail = Imagelib.open(self.thumbnail.path)
        thumbnail = thumbnail.convert('RGB')
        self.hash = blockhash(thumbnail, 24)
        self.save(update_fields=["hash"])

    def create_thumbnail(self, image_url):
        if not self.thumbnail:
            if not image_url:
                image_url = self.source_url
            headers = {
                'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/45.0.2454.101 Safari/537.36',
            }
            for i in range(5):
                r = requests.get(image_url, stream=True, headers=headers)
                if r.status_code != 200 and r.status_code != 304:
                    print("error loading image url status code: {}".format(r.status_code))
                    time.sleep(2)
                else:
                    break
            if r.status_code != 200 and r.status_code != 304:
                print("giving up on this image, final status code: {}".format(r.status_code))
                return False
            # Create the thumbnail of dimension size
            size = 500, 500
            img = Imagelib.open(r.raw)
            thumb = ImageOps.fit(img, size, Imagelib.ANTIALIAS)
            # Get the image name from the url
            img_name = os.path.basename(image_url.split('?', 1)[0])
            file_path = os.path.join(djangoSettings.MEDIA_ROOT, "thumb" + img_name)
            thumb.save(file_path, 'JPEG')
            # Save the thumbnail in the media directory, prepend thumb
            self.thumbnail.save(
                img_name,
                File(open(file_path, 'rb')))
            os.remove(file_path)
            return True
Let's examine your code step by step.
Say you have 3 images in your database (for simplicity I've skipped the irrelevant fields):
Image(pk=1, hash="d2ffacb...e3")
Image(pk=2, hash="afcbdee...77")
Image(pk=3, hash="d2ffacb...e3")
As we can see, the first and third images have exactly the same hash. Let's assume all your images have some tags. Now back to your code. Let's check what happens in the first iteration (image pk=1):
- All images with the same hash are fetched from the database; here that is only image pk=3.
- Iterating over those images copies all tags from the duplicates to the original. Nothing wrong so far.
- Iterating over those images also deletes them.
So after the first iteration, the image with pk=3 no longer exists in the database.
Next iteration, image pk=2. Nothing happens because there are no duplicates.
Next iteration, image pk=3 (it is still in the already fetched list):
- All images with the same hash are fetched from the database; here that is only image pk=1.
- Iterating over those images tries to copy all tags from the duplicate to the "original". But wait... there is no image pk=3 in the database any more, so we can't assign any tags to it. That is what throws your IntegrityError.
To avoid that, fetch only the originals in the outer for loop:
images = Image.objects.distinct('hash')
You can also add ordering so that, for example, the image with the lowest ID is always fetched as the original. Passing field names to distinct() works on PostgreSQL only, and there the order_by() must start with the same field as distinct():
images = Image.objects.order_by('hash', 'id').distinct('hash')
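To illustrate what "one original per hash, preferring the lowest ID" means, here is a plain-Python sketch that does the same selection on an in-memory list. It does not use Django; the (pk, hash) tuples and the shortened hash strings are made-up stand-ins, and pick_originals is a hypothetical helper.

```python
def pick_originals(images):
    """Map each hash to the lowest pk that carries it.

    images: iterable of (pk, image_hash) tuples.
    """
    originals = {}
    # Sort by hash first, then pk, so the first row seen per hash has the lowest pk.
    for pk, image_hash in sorted(images, key=lambda item: (item[1], item[0])):
        originals.setdefault(image_hash, pk)
    return originals


# Shortened, made-up hashes standing in for the real perceptual hashes.
images = [(1, "d2ff"), (2, "afcb"), (3, "d2ff")]
print(pick_originals(images))
```

Only pk=1 and pk=2 come out as originals, so pk=3 is never used as the target of a tag merge.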
This is to do with the evaluation strategy of the queryset.
Image.objects.all() returns a thunk - that is, a sort of promise of an iterable sequence of images. The SQL query is not executed at this stage.
When you start iterating over it - for image in images - the SQL query is evaluated. You now have a list of image objects in memory.
Now, say you have four images in the database - ids 0, 1, 2, and 3. 0 and 3 are duplicates. The first image is processed, turning up 3 as a duplicate. You delete 3. Image 3 is still in the images iterator, however. When you get there, you're going to try to add tags from image 0 to image 3's tags collection. This will trigger the integrity error, since image 3 has already been deleted.
The simple fix is to keep an accumulator of images to be deleted, and do them all at the end.
images = Image.objects.all()
images_to_delete = []
for image in images:
    # Skip images already marked as duplicates of an earlier image.
    if image.pk in images_to_delete:
        continue
    duplicates = Image.objects.filter(hash=image.hash).exclude(pk=image.pk).exclude(hash="ffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffffff")
    for duplicate in duplicates:
        duplicate_tags = duplicate.tags.all()
        image.tags.add(*duplicate_tags)
        images_to_delete.append(duplicate.pk)
print(len(images_to_delete))
# Delete only after the loop, so no image disappears while it is still being iterated.
for pk in images_to_delete:
    Image.objects.get(pk=pk).delete()
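The accumulator idea can be tried out without a database. The sketch below is a plain-Python stand-in, not Django code: a dict plays the role of the table, a list of keys plays the role of the already evaluated queryset, and merge_duplicates is a hypothetical name.

```python
def merge_duplicates(images):
    """Merge tags of same-hash images into the first one seen, then delete the rest.

    images: {pk: {"hash": str, "tags": set}}; modified in place.
    Returns the list of deleted pks.
    """
    snapshot = list(images)  # stale list of pks, like the evaluated queryset
    to_delete = []
    for pk in snapshot:
        if pk in to_delete:
            continue  # already marked as a duplicate of an earlier image
        for other_pk in snapshot:
            if other_pk != pk and images[other_pk]["hash"] == images[pk]["hash"]:
                images[pk]["tags"] |= images[other_pk]["tags"]
                to_delete.append(other_pk)
    # Deletion happens only after every image has been visited.
    for pk in to_delete:
        del images[pk]
    return to_delete


# Made-up data: images 0 and 3 share a hash.
images = {
    0: {"hash": "aa", "tags": {"sky"}},
    1: {"hash": "bb", "tags": {"sea"}},
    2: {"hash": "cc", "tags": {"tree"}},
    3: {"hash": "aa", "tags": {"sun"}},
}
removed = merge_duplicates(images)
print(removed, sorted(images))
```

Because nothing is removed until the outer loop has finished, no tags are ever added to an image that has already been deleted.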
EDIT: corrected proximate cause of the error, as pointed out by GwynBleidD.

CMS and store hi-resolution images in generated pdf

I'm looking for a good CMS for publishing software manuals.
Requirements:
- publish manual pages as web pages with thumbnails that show the full-resolution image when clicked,
- export manual pages to a PDF file with full-resolution images instead of thumbnails.
I found what is, IMHO, the best wiki system, Tiki Wiki (https://info.tiki.org/), but when I export to PDF I get low-resolution thumbnails.
I solved this problem with a very simple modification of the Tiki Wiki code:
Modify lib/wiki-plugins/wikiplugin_img.php to force the full image resolution instead of the thumbnail in print page mode (inserted code 1), and scale the images in the generated HTML by a factor of 0.5 (inserted code 2):
[...]
function wikiplugin_img( $data, $params )
{
    [...]
    $imgdata = array_merge($imgdata, $params);
    // inserted code 1 (~410 line)
    if ($GLOBALS['section_class']=="tiki_wiki_page print"){
        $imgdata['thumb'] = '';
    }
    // end of inserted code 1
    //function calls
    if ( !empty($imgdata['default']) || !empty($imgdata['mandatory'])) {
    [...]
    $fwidth = '';
    $fheight = '';
    if (isset(TikiLib::lib('parser')->option['indexing']) && TikiLib::lib('parser')->option['indexing']) {
        $fwidth = 1;
        $fheight = 1;
    } else {
        // inserted code 2 (~410 line)
        if ($GLOBALS['section_class']=="tiki_wiki_page print"){
            $fwidth = $imageObj->get_width() / 2;
            $fheight = $imageObj->get_height() / 2;
        } else {
            $fwidth = $imageObj->get_width();
            $fheight = $imageObj->get_height();
        }
        // end of inserted code 2 (~638 line)
    }
    [...]
Now, after printing to PDF with wkhtmltopdf, we get a PDF whose images are displayed small but keep their full resolution.
Additional modifications:
Add the following lines to cms/cssmenus.css (or another CSS file included in print mode) to increase the bottom margin of the image caption:
div.thumbcaption {
    margin-bottom: 5mm;
}
Remove lines 171 to ~175 in templates/tiki-show_content.tpl to remove the "The original document is available at" footer.

Content templates rendering in TYPO3

I've got a strange problem connected with content rendering.
I use following code to grab the content:
lib.otherContent = CONTENT
lib.otherContent {
  table = tt_content
  select {
    pidInList = this
    orderBy = sorting
    where = colPos=0
    languageField = sys_language_uid
  }
  renderObj = COA
  renderObj {
    10 = TEXT
    10.field = header
    10.wrap = <h2>|</h2>
    20 = TEXT
    20.field = bodytext
    20.wrap = <div class="article">|</div>
  }
}
Everything works fine, except that I'd also like to use the predefined content element templates other than simple text (Text with image, Images only, Bullet list, etc.).
The question is: what do I have to replace renderObj = COA and the block between its braces with so that TYPO3 renders these elements properly?
Thanks,
I.
The available cObjects are more or less listed in TSRef, chapter 8.
The TypoScript for rendering Text w/image can be found in typo3/sysext/css_styled_content/static/v4.3/setup.txt at line 724, and nearby you'll find e.g. bullets (below) and image (above), which is referenced in textpic at line 731. Variants of these are what you'll write in your renderObj.
You will find more details in the file typo3/sysext/cms/tslib/class.tslib_content.php, where e.g. text w/image is found at or around line 897 and is called IMGTEXT (do a case-sensitive search). See also around line 403 in typo3/sysext/css_styled_content/pi1/class.cssstyledcontent_pi1.php, where the newer css-based rendering takes place.