Output filenames when extracting a range of pages from pdf into jpeg using Imagemagick

I am trying to extract a range of pages from a multipage pdf file into individual jpegs using convert (Imagemagick). The extraction works fine. What I am stuck on is that if I want to extract page range 10-20, I still get out jpeg files with names page-0.jpeg to page-9.jpeg while I want them to be named page-10.jpeg to page-20.jpeg. Is there a way of specifying that on the command line?
I need this because I want to extract pages in chunks of 10, to avoid using too much memory on huge PDF files, and I don't want to keep renaming the files.
I remember having this working in an earlier project but can't figure out what I am missing now.

Finally managed to do this. Leaving an answer in case somebody else is looking for the same thing. The solution works with ImageMagick 6.5.1.
Say we want to extract the pages numbered i to j from a.pdf into individual JPEGs named a-i.jpeg to a-j.jpeg (for example, a-10.jpeg to a-20.jpeg for i = 10 and j = 20).
convert a.pdf[i-j] -set filename:page "%[fx:t+i]" a-%[filename:page].jpeg
Here i and j stand for the actual page numbers and must be replaced by them on the command line. This uses the fx operator: fx:t gives the index of the current image in the sequence, and we add our offset to it.

You can specify the first "page" number used by %d in the output filename by adding the -scene n parameter, e.g.:
convert a.pdf[0-9] -scene 10 a-%d.jpeg
will output a-10.jpeg, a-11.jpeg, etc.
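For the chunked extraction described in the question, the -scene offset can be computed per chunk. The following is a minimal Python sketch rather than a definitive implementation: it only builds the convert argument lists and does not run them. The function name chunk_commands, its parameters, and the default chunk size are assumptions made for illustration, and page indices are zero-based as in the bracket syntax.

```python
def chunk_commands(pdf, total_pages, chunk_size=10, prefix="a"):
    """Build one `convert` argument list per chunk of pages.

    Hypothetical helper: page indices are zero-based, and -scene is set to
    the first index of each chunk so that %d continues from there.
    """
    commands = []
    for first in range(0, total_pages, chunk_size):
        last = min(first + chunk_size, total_pages) - 1
        commands.append([
            "convert",
            f"{pdf}[{first}-{last}]",  # page range for this chunk
            "-scene", str(first),      # first number used by %d
            f"{prefix}-%d.jpeg",
        ])
    return commands


# Print the commands for an assumed 25-page document.
for command in chunk_commands("a.pdf", total_pages=25):
    print(" ".join(command))
```

Each list can be handed to subprocess.run as is, which avoids having to quote the square brackets for a shell.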

Related

How can I replace a single PDF page using Imagemagick?

I have a multi-page PDF document, and I want a version of the document with one of the pages in the middle replaced by a scanned-in copy, because I needed a physical signature in the document. How can I do this with ImageMagick? I'm aware that ImageMagick may not necessarily be the best tool for the job. However, the resulting PDF does not need to be high quality or a high-fidelity copy, so it should be sufficient for my needs.
As a specific example, I have a 9 page my-file.pdf, and I want to create a copy of the PDF with the 8th page replaced with page-8.png. It looks like I should be able to achieve this goal with the convert tool, though it's not immediately obvious what the syntax would be. How can I achieve this goal?
If I merely wanted to append the new page to the end of the file, I know I can do the following:
convert my-file.pdf page-8.png output-file.pdf
However, this ends up with the original pages 1-9, followed by the new page 8. What I actually want is to replace the original page 8 with the new page 8. My desired output is:
[original pages 1 - 7],[new page 8],[original page 9]

A specific page or range of pages can be specified using the bracket syntax with zero-based indexing. For instance, [8] will refer to the ninth page, and [0-6] to the first seven pages. Using this, a duplicate of the PDF with the 8th page replaced can be achieved as follows:
convert my-file.pdf[0-6] page-8.png my-file.pdf[8] output-file.pdf
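The index arithmetic generalizes to any page. Below is a minimal Python sketch, under the assumption that the total page count is known in advance; replace_page_command and its parameters are hypothetical names, and the function only builds the argument list without running convert.

```python
def replace_page_command(pdf, page_number, new_page, output, total_pages):
    """Build a `convert` argument list that swaps one page for an image.

    Hypothetical helper: page_number is 1-based (as in the question), while
    the bracket indices passed to convert are zero-based.
    """
    if not 1 <= page_number <= total_pages:
        raise ValueError("page_number out of range")

    def pages(first, last):
        # Zero-based inclusive range; a single index when first == last.
        spec = str(first) if first == last else f"{first}-{last}"
        return f"{pdf}[{spec}]"

    index = page_number - 1
    args = ["convert"]
    if index > 0:
        args.append(pages(0, index - 1))                 # pages before it
    args.append(new_page)                                # replacement image
    if index < total_pages - 1:
        args.append(pages(index + 1, total_pages - 1))   # pages after it
    args.append(output)
    return args


print(replace_page_command("my-file.pdf", 8, "page-8.png", "output-file.pdf", 9))
```

The two guards drop the leading or trailing range when the first or last page is the one being replaced.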

As indicated in the question, as well as in a comment, ImageMagick is not a great tool for this task, as it rasterizes every page of the PDF. An alternative that avoids this is to use pdftk to replace the page. While the inserted page will still be an image, the other pages will not be rasterized.
First, save the scanned-in page as a PDF using an application capable of this. Then, use the pdftk cat operation to combine the PDFs:
pdftk A=my-file.pdf B=page-8.pdf cat A1-7 B A9 output output-file.pdf

Knitr Spin and Rmarkdown Fig.cap (figure caption). Producing double numbering pdf document

I am referring to the question "Suppress automatic figure numbering in pdf output with r markdown/knitr", which I don't think was answered fully.
Essentially, I am using knitr::spin and rmarkdown to produce word, pdf and html documents.
For word, there appears to be no numbering when one puts in
+fig.1, fig.cap = "Figure name"
You only get an output Figure name in the caption.
To solve that, I used captioner class.
figs = captioner("Figure")
That works fine for Word.
But I am now faced with rewriting the script for the PDF document, as the caption turns up as "figure 1: figure 1: The name".
I am using knitr::spin to actually generate the RMD document for forward outputs in word and pdf.
I am not sure I can use hooks in knitr::spin, as I have tried it as advertised but can't get it to work.
I also tried
header-includes: \usepackage{caption} \usepackage{float}
\captionsetup[table]{labelformat=empty}
\captionsetup[figure]{labelformat=empty}
as suggested somewhere to suppress the prefix for PDF, but I get errors from pandoc. It uses pdflatex.
I am not sure how one would query the output format in knitr::spin to actually produce different actions for different formats which could be a solution although cumbersome.
Thank you so much for your help from a novice.

SPSS tables to latex (PDF) without creating an A4 page

This may be a stupid question, but I can't figure it out.
I have made some tables in SPSS. Now I want them over to my latex document.
What I do is right-click the table in SPSS and press export.
Here I can choose between PDF and .doc. But the PDF file that is created has the table at the top of a full page (A4 size, with "page 1" at the bottom). I do not want this; I only want the table.
Example of how it turns out:
Example of how I want it to turn out:
If I export to Word, I can then save as PDF, but the same problem occurs.
A screenshot works, but does not give me the picture quality that I would like.
Does anyone have any tips for me?
Thanks :)

Unfortunately SPSS does not provide native table export to LaTeX. It does provide table export to HTML and XLS, which can then be converted to TeX tables. PDF output always exports the full page (very annoying for graphics as well). You could crop the PDF if need be, but you probably don't want to insert an image of the table anyway; you want a TeX table in the same font as your document.
One thing I have done in the past to export text tables with specific markup is to use the PRINT or LIST commands to print the table to the output (or to a text file) in a form that is closer to the end goal. In this NABBLE post I have some syntax that makes pandoc-flavored pipe-style markdown tables. It should be pretty clear how the same approach could be used for TeX tables (actually TeX tables should be much simpler).
Here is an example of some code using LIST to make the markup closer to TeX tables.
DATA LIST FREE / Variable (A1) Mean Median (2F4.2).
BEGIN DATA
A 3.25 2.00
B 2.56 2.50
C 9.87 10.20
END DATA.
*Using LIST to make Latex style table.
STRING Mid (A1) End (A2).
COMPUTE Mid = "&".
COMPUTE End = "\\".
LIST /VARIABLES = Variable Mid Mean Mid Median End.
And here is a screen shot of the produced output on my machine.
So here I would still have to copy-paste the text output into my TeX document (and make the header row).

You can also use OMS to save designed items in a variety of formats, including XML and then use an xml-to-Latex tool such as xmltex. You could probably even generate such a conversion with XSLT from the XML.
From the Viewer, you could also retrieve the table with Python scripting and use a Python-based converter tool.
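As an illustration of what such a Python-based conversion step might look like, here is a minimal sketch that formats already-extracted values as a LaTeX tabular. It does not use any SPSS API: latex_table is a hypothetical helper, the rows reuse the example data from the LIST syntax above, and the two-decimal formatting and column alignment are assumptions. Labels are not escaped for LaTeX special characters.

```python
def latex_table(header, rows):
    """Format a header and data rows as a LaTeX tabular environment.

    Hypothetical helper: the first column is left-aligned text, the
    remaining columns are right-aligned numbers shown with two decimals.
    """
    column_spec = "l" + "r" * (len(header) - 1)
    lines = [f"\\begin{{tabular}}{{{column_spec}}}"]
    lines.append(" & ".join(header) + " \\\\")   # header row
    lines.append("\\hline")
    for label, *values in rows:
        cells = [label] + [f"{value:.2f}" for value in values]
        lines.append(" & ".join(cells) + " \\\\")  # one data row
    lines.append("\\end{tabular}")
    return "\n".join(lines)


rows = [("A", 3.25, 2.00), ("B", 2.56, 2.50), ("C", 9.87, 10.20)]
print(latex_table(["Variable", "Mean", "Median"], rows))
```

Unlike the LIST approach, this also produces the header row, so the result can be pasted or written to a .tex file as one piece.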

SAS : read in PDF file

I am looking for ways to read in a PDF file with SAS. Apparently this is not basic functionality, and there is very little to be found on the internet. (It does not help that searching Google with "PDF" in the query mostly returns links to PDF documents about other things.)
The only things that can be found are people looking for ways to import data into datasets from a PDF. For me, that is not even necessary. I would like to be able to read the contents of the PDF file into one big character variable. If possible, it would be even better to read in the file's binary data.
Is this possible with SAS and how? (I got it to work in Access VBA, but can't find any similar ways in SAS.)
(In the end, the purpose is to convert this to base64 and put that base64-string into an XML document.)

You probably will not be able to read the entire file into one character variable, since the maximum length of a character variable is 32,767 bytes (about 32 KB). A simple way to read in one line at a time, though, is something like the following:
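%let pdfFileName = Test.pdf;
%let lineSize = 2000;
data base;
format text_line $&lineSize..;
/* TRUNCOVER keeps records shorter than lineSize instead of flowing into the next line. */
infile "&pdfFileName" lrecl=&lineSize truncover;
/* Formatted input reads the whole record; list input would stop at the first blank. */
input text_line $char&lineSize..;
run;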
This requires that you have a general idea of the maximum record length ahead of time, but you could write additional code to determine the maximum record size before reading in the file. In this example each line of text is read into one character variable named "text_line". From there, you could use a RETAIN statement or a double trailing @ (@@) in the INPUT statement to process multiple lines at a time. The SAS website has plenty of documentation on how to read and process text from various types of input files.
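For comparison outside SAS, the end goal stated in the question (binary PDF data as a base64 string inside an XML document) can be sketched in a few lines of Python. This is only an illustration of the target result, not a SAS solution; pdf_to_xml_element and the element name "document" are assumptions, since the question does not describe the XML layout.

```python
import base64
import xml.etree.ElementTree as ET


def pdf_to_xml_element(pdf_bytes, tag="document"):
    """Wrap raw PDF bytes as base64 text inside an XML element.

    Hypothetical helper: the element name is an assumption; the question
    does not say what the target XML looks like.
    """
    element = ET.Element(tag)
    element.text = base64.b64encode(pdf_bytes).decode("ascii")
    return element


# With a real file, read it in binary mode so every byte is kept intact:
#     with open("Test.pdf", "rb") as handle:
#         element = pdf_to_xml_element(handle.read())
sample = b"%PDF-1.4\n"  # stand-in bytes so the sketch runs without a file
print(ET.tostring(pdf_to_xml_element(sample), encoding="unicode"))
```

Reading the file as bytes sidesteps the record-length question entirely, which is why the binary route mentioned in the question is attractive when the content is only going to be encoded.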

How to mix sound files of different format using FFMPEG?

I need to mix audio files of different types into a single output file through code in my iPad app.
For example, I need to merge a .m4a file with .mp3 or .wav or any other format file.
The resulting output file should be of .m4a type.
I have compiled FFMPEG for iOS with the link: http://lists.mplayerhq.hu/pipermail/ffmpeg-devel/2009-October/076618.html
Now I don't know in which direction to proceed.

This is more of a suggestion than an answer. I don't have any experience with obj-c, though I have worked with audio formats in other languages. Unless you find some library that does this specific task, you may need to decode the files and convert their data to some common numerical representation.
The sample data of a 16-bit PCM .wav file is stored as signed integers in the range -32768 to 32767, while decoded mp3 sample data is typically handled as floating-point values in the range -1 to +1. Either representation can be converted to the other through a simple calculation.
mp3ToWavSample = mp3Sample * 32767
Once the data is converted "merging" becomes very easy. You can simply add the sample values together.
mergedSample = convertedSample1 + convertedSample2
You would need to apply this to every sample in the files, and any sum that falls outside the valid range has to be clamped or scaled down to avoid distortion. Depending on the size of your files, this could be a significant processing task.
As for adding reverb to your track, I'd suggest that you ask for help on that in another question.
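The conversion and addition can be sketched as follows in Python. This is a minimal illustration rather than production audio code: the function names are hypothetical, both tracks are assumed to be already decoded to plain lists with the same sample rate and channel layout, and sums are clamped to the 16-bit range (clamping is an addition here, not part of the formulas above).

```python
def float_to_int16(sample):
    """Scale a float sample in [-1.0, 1.0] to the signed 16-bit range."""
    return int(round(sample * 32767))


def mix_samples(track_a, track_b):
    """Add two lists of 16-bit samples, clamping sums to the valid range.

    Hypothetical helper: both tracks are assumed to share the same sample
    rate and channel layout; the shorter one is padded with silence.
    """
    length = max(len(track_a), len(track_b))
    padded_a = track_a + [0] * (length - len(track_a))
    padded_b = track_b + [0] * (length - len(track_b))
    return [max(-32768, min(32767, a + b)) for a, b in zip(padded_a, padded_b)]


decoded_floats = [0.0, 0.25, -1.0]       # stand-in for decoded mp3 samples
pcm_track = [1000, 20000, -5000, 300]    # stand-in for 16-bit wav samples
converted = [float_to_int16(s) for s in decoded_floats]
print(mix_samples(converted, pcm_track))
```

Plain addition keeps both sources at their original level; averaging the two values instead would avoid clamping at the cost of halving the volume.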
