Remove or hide PDF layer using ABCPdf?

Is it possible to remove or hide a layer from a PDF using ABCPdf or another framework?

The following C# example shows how layer 2 of page 1 can be deleted:
Doc theDoc = new Doc();
theDoc.Read("source.pdf");
int thePages = theDoc.GetInfoInt(theDoc.Root, "Pages");
int thePage = theDoc.GetInfoInt(thePages, "Page 1");
int theLayer = theDoc.GetInfoInt(thePage, "Content 2");
theDoc.Delete(theLayer);

Or perhaps you were looking for the Flatten() function?
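If the goal is simply to make the currently visible content permanent rather than to manipulate the layers themselves, Flatten() may be enough. A minimal sketch, flattening every page of a document (the file names are illustrative, and whether this gives the result you want for layered content depends on your PDF):
Doc theDoc = new Doc();
theDoc.Read("source.pdf");
for (int i = 1; i <= theDoc.PageCount; i++) {
    theDoc.PageNumber = i; // Flatten() operates on the current page
    theDoc.Flatten();
}
theDoc.Save("flattened.pdf");
theDoc.Clear();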

ABCpdf contains an Example project called OCGLayers. This project shows you how to identify and redact all the items in a layer.
For example:
Properties props = Properties.FromDoc(_doc, false);
Page page = (Page)_doc.ObjectSoup[_doc.Page];
Reader reader = Reader.FromPage(props, page);
List<OptionalContent.Layer> layers = reader.GetLayers();
foreach (OptionalContent.Layer layer in layers) {
    if (layer.Visible == false) {
        if (reader == null) // Redact consumes the reader (note the ref), so recreate it if needed
            reader = Reader.FromPage(props, page);
        Reader.Redact(ref reader, layer);
    }
}
UpdateLayers();  // helper methods from the OCGLayers example project
UpdatePreview();

Related

Adobe Illustrator linked file name to a layer name script

I want the layer to have the name of the linked file, without .eps at the end.
Anyway, I found an answer. So for anyone looking, here it is. Although it does require clicking on every layer, at least you don't have to type the names.
function test() {
    var sel_itemPlaced = app.activeDocument.selection[0]; // be sure that a linked item (and not an embedded) is selected
    var fileName = sel_itemPlaced.file.name;
    var textContents = fileName.replace(/\%20/g, " "); // change %20 to spaces
    textContents = textContents.replace(/\.[^\.]*$/, ""); // remove extension
    var _item = sel_itemPlaced;
    while (_item.parent.typename != 'Layer') {
        _item = _item.parent;
    }
    _item.parent.name = textContents;
}
test();
And this variant works for all placed images and all layers at once:
var images = app.activeDocument.placedItems;
for (var i = 0; i < images.length; i++)
    images[i].layer.name = images[i].file.name.replace(/\.[^\.]+$/, "");

Having an issue removing a layer with the ArcGIS API for JavaScript

I am adding a marker layer called layer1 to the map like this:
function drawPoints(mapInfo) {
    layer1 = new esri.layers.GraphicsLayer();
    for (var i = 0; i < mapInfo.length; i++) {
        var projects = mapInfo[i];
        var project = new esri.geometry.Point(projects.Longitude, projects.Latitude);
        project = esri.geometry.geographicToWebMercator(project);
        var symbol = new esri.symbol.PictureMarkerSymbol("img/map/marker.png", 18, 18);
        projectInfoTemplate = new InfoTemplate();
        projectInfoTemplate.setTitle("Project Details");
        projectInfoTemplate.setContent('<div class="row"></div> ');
        var projectsG = new esri.Graphic(project, symbol).setInfoTemplate(projectInfoTemplate);
        layer1.add(projectsG);
    }
    map.addLayer(layer1);
}
Now, on the next request, I need to clear the map, so I used:
map.removeLayer(layer1);
but this causes an error because layer1 has not been created yet on the first request. I need to check whether the map has a layer called layer1 and, if so, remove it. Here is pseudocode of what I need to do:
if (map.has/contains/includes(layer1)) {
    map.removeLayer(layer1);
}
Can you please let me know how to do that?
Since it is a graphics layer, it will be listed in the map.graphicsLayerIds array. You can check for it and remove it like this:
if (map.graphicsLayerIds.indexOf(layer1.id) != -1) {
    map.removeLayer(layer1);
}

How do I make ABCPdf automatically write to a new page when the text requires more than one page?

I deal with dynamic input text, so pages should be created dynamically. If page 1 is already full, it should continue writing on a new page, which means I can end up with page 2, page 3 and so on, depending on the data processed.
Currently, my text is truncated: only page 1 is written; the rest of the data is not.
My current code below:
//add page 1
theDoc.Page = theDoc.AddPage();
theDoc.AddImageHtml(html, true, 826, true);
//continue adding page if needed
while (theDoc.GetInfo(theID, "Truncated") == "1")
{
    theDoc.Page = theDoc.AddPage();
    theDoc.AddImageHtml(html, true, 826, true);
}
//save file
String pdfFilePath = WebConfigurationManager.AppSettings["pdfFilePath"];
Guid fileName = Guid.NewGuid();
pdfLink = pdfFilePath + fileName.ToString() + ".pdf";
theDoc.Save(pdfLink);
theDoc.Clear();
The variable html contains all the data (the web page). I'm probably missing something in my while loop. Any help is appreciated, thanks!
Found it: use Chainable() and AddImageToChain(), and then Flatten():
theDoc.Page = theDoc.AddPage();
int theID;
theID = theDoc.AddImageUrl("http://www.yahoo.com/");
while (true) {
    theDoc.FrameRect(); // add a black border
    if (!theDoc.Chainable(theID))
        break;
    theDoc.Page = theDoc.AddPage();
    theID = theDoc.AddImageToChain(theID);
}
for (int i = 1; i <= theDoc.PageCount; i++) {
    theDoc.PageNumber = i;
    theDoc.Flatten();
}
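Applied to the original AddImageHtml case, a sketch along the same lines (assuming the same html variable and parameter values as in the question) would be:
theDoc.Page = theDoc.AddPage();
int theID = theDoc.AddImageHtml(html, true, 826, true); // ID of the content just added
while (theDoc.Chainable(theID)) {          // more content left to lay out?
    theDoc.Page = theDoc.AddPage();        // add a fresh page
    theID = theDoc.AddImageToChain(theID); // continue where the previous page stopped
}
for (int i = 1; i <= theDoc.PageCount; i++) {
    theDoc.PageNumber = i;
    theDoc.Flatten();                      // make the chained content permanent
}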

Split PDF into separate files based on text

I have a single large PDF document which consists of multiple records. Each record usually takes one page, but some use two pages. A record starts with a defined text, which is always the same.
My goal is to split this PDF into separate PDFs, and the split should always happen before the "header text" is found.
Note: I am looking for a tool or library using java or python. Must be free and available on Win 7.
Any ideas? AFAIK ImageMagick won't work for this. Can iText do this? I have never used it and it's pretty complex, so I would need some hints.
EDIT:
The marked answer led me to the solution. For completeness, here is my exact implementation:
public void splitByRegex(String filePath, String regex,
        String destinationDirectory, boolean removeBlankPages) throws IOException,
        DocumentException {
    logger.entry(filePath, regex, destinationDirectory);
    destinationDirectory = destinationDirectory == null ? "" : destinationDirectory;
    PdfReader reader = null;
    Document document = null;
    PdfCopy copy = null;
    Pattern pattern = Pattern.compile(regex);
    try {
        reader = new PdfReader(filePath);
        final String RESULT = destinationDirectory + "/record%d.pdf";
        // loop over all the pages in the original PDF
        int n = reader.getNumberOfPages();
        for (int i = 1; i <= n; i++) {
            final String text = PdfTextExtractor.getTextFromPage(reader, i);
            if (pattern.matcher(text).find()) {
                if (document != null && document.isOpen()) {
                    logger.debug("Match found. Closing previous Document..");
                    document.close();
                }
                String fileName = String.format(RESULT, i);
                logger.debug("Match found. Creating new Document " + fileName + "...");
                document = new Document();
                copy = new PdfCopy(document,
                        new FileOutputStream(fileName));
                document.open();
                logger.debug("Adding page to Document...");
                copy.addPage(copy.getImportedPage(reader, i));
            } else if (document != null && document.isOpen()) {
                logger.debug("Found open Document. Adding additional page to Document...");
                // copy the page unless blank pages are being removed and this one is blank
                if (!removeBlankPages || !isBlankPage(reader, i)) {
                    copy.addPage(copy.getImportedPage(reader, i));
                }
            }
        }
        logger.exit();
    } finally {
        if (document != null && document.isOpen()) {
            document.close();
        }
        if (reader != null) {
            reader.close();
        }
    }
}

private boolean isBlankPage(PdfReader reader, int pageNumber)
        throws IOException {
    // see http://itext-general.2136553.n4.nabble.com/Detecting-blank-pages-td2144877.html
    PdfDictionary pageDict = reader.getPageN(pageNumber);
    // We need to examine the resource dictionary for /Font or
    // /XObject keys. If either are present, they're almost
    // certainly actually used on the page -> not blank.
    PdfDictionary resDict = (PdfDictionary) pageDict.get(PdfName.RESOURCES);
    if (resDict != null) {
        return resDict.get(PdfName.FONT) == null
                && resDict.get(PdfName.XOBJECT) == null;
    } else {
        return true;
    }
}
You can create a tool for your requirements using iText.
Whenever you are looking for code samples for (current versions of) the iText library, you should consult iText in Action, 2nd Edition; its code samples are available online and searchable by keyword.
In your case the relevant samples are Burst.java and ExtractPageContentSorted2.java.
Burst.java shows how to split one PDF into multiple smaller PDFs. The central code:
PdfReader reader = new PdfReader("allrecords.pdf");
final String RESULT = "record%d.pdf";
// We'll create as many new PDFs as there are pages
Document document;
PdfCopy copy;
// loop over all the pages in the original PDF
int n = reader.getNumberOfPages();
for (int i = 0; i < n; ) {
    // step 1
    document = new Document();
    // step 2
    copy = new PdfCopy(document,
            new FileOutputStream(String.format(RESULT, ++i)));
    // step 3
    document.open();
    // step 4
    copy.addPage(copy.getImportedPage(reader, i));
    // step 5
    document.close();
}
reader.close();
This sample splits a PDF into single-page PDFs. In your case you need to split by different criteria, but that only means that in the loop you sometimes have to add more than one imported page (and thus decouple the loop index from the page numbers to import).
To recognize on which pages a new dataset starts, take inspiration from ExtractPageContentSorted2.java. This sample shows how to parse the text content of a page into a string. The central code:
PdfReader reader = new PdfReader("allrecords.pdf");
for (int i = 1; i <= reader.getNumberOfPages(); i++) {
    System.out.println("\nPage " + i);
    System.out.println(PdfTextExtractor.getTextFromPage(reader, i));
}
reader.close();
Simply search for the record start text: if the text extracted from a page contains it, a new record starts there.
Apache PDFBox has a PDFSplit utility that you can run from the command-line.
If you like Python, there's a nice library: PyPDF2. The library is pure Python 2, with a BSD-like license.
Sample code:
from PyPDF2 import PdfFileWriter, PdfFileReader
input1 = PdfFileReader(open("C:\\Users\\Jarek\\Documents\\x.pdf", "rb"))
# analyze pdf data
print input1.getDocumentInfo()
print input1.getNumPages()
text = input1.getPage(0).extractText()
print text.encode("windows-1250", errors='backslashreplace')
# create output document
output = PdfFileWriter()
output.addPage(input1.getPage(0))
fout = open("c:\\temp\\1\\y.pdf", "wb")
output.write(fout)
fout.close()
For non-coders, PDF Content Split is probably the easiest way without reinventing the wheel, and it has an easy-to-use interface: http://www.traction-software.co.uk/pdfcontentsplitsa/index.html
Hope that helps.

How to read all pages from PDF?

I am using an SDK from PDFTron, which reads a single page at a time. My code is:
PDFDoc doc = new PDFDoc(input_path);
doc.InitSecurityHandler();
PageIterator itr = doc.GetPage(1);
for (line = txt.GetFirstLine(); line.IsValid(); line = line.GetNextLine()) {
    for (word = line.GetFirstWord(); word.IsValid(); word = word.GetNextWord()) {
        Console.WriteLine(word.GetString());
    }
}
I want to read each and every page. I posted the same problem in the PDFTron forums, but couldn't get a solution.
Is it possible to read each and every page?
Yes, you can read each and every page of the PDF in one pass. You just need a slight change: initialize a page iterator.
I have modified the code, and it works fine.
PDFDoc doc = new PDFDoc(input_path);
doc.InitSecurityHandler();
TextExtractor txt = new TextExtractor(); // extracts the text of one page at a time
TextExtractor.Line line;
TextExtractor.Word word;
PageIterator itr = doc.GetPageIterator();
for (; itr.HasNext(); itr.Next()) // Read every page
{
    txt.Begin(itr.Current()); // load the text of the current page
    for (line = txt.GetFirstLine(); line.IsValid(); line = line.GetNextLine())
    {
        for (word = line.GetFirstWord(); word.IsValid(); word = word.GetNextWord())
        {
            Console.WriteLine(word.GetString());
        }
    }
}
Hope this will help you.