Pandoc Lua script to filter specific markdown sub sections during PDF generation

I have Markdown source and want to generate a PDF from it using Pandoc.
I want to remove ALL subsections below a specified heading level from the generated document, i.e. filter them out of the source Markdown.
Would this be possible with a Lua filter, or would it be better to prefilter the source with some other tool?

Got a suggestion from the Lua Google Groups forum which works for me:
    local keep_deleting = false
    function Block (b)
      if b.t == 'Header' and b.level >= 3 then
        keep_deleting = true
        return {}
      elseif b.t == 'Header' then
        keep_deleting = false
      elseif keep_deleting then
        return {}
      end
    end
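For the prefiltering alternative raised in the question, the same keep_deleting state machine can be sketched in Python over the raw Markdown text. This is only a minimal sketch and not part of the suggested answer: the function name drop_deep_sections and the min_level parameter are made up for illustration, and it assumes ATX-style headings ("# ...") and plain ``` or ~~~ code fences. Setext (underlined) headings are not handled.

```python
import re

# Assumption: headings are ATX style, i.e. 1-6 '#' characters followed by whitespace.
HEADING = re.compile(r"^(#{1,6})\s")
# Assumption: code fences start with ``` or ~~~ at the beginning of a line.
FENCE = re.compile(r"^(```|~~~)")


def drop_deep_sections(markdown, min_level=3):
    """Drop every section whose heading level is >= min_level.

    A dropped section ends at the next heading with a level below min_level,
    which mirrors the keep_deleting flag of the Lua filter.
    """
    kept = []
    deleting = False
    in_fence = False
    for line in markdown.splitlines():
        if FENCE.match(line):
            # Toggle fence state so '#' lines inside code are not read as headings.
            in_fence = not in_fence
        elif not in_fence:
            match = HEADING.match(line)
            if match:
                deleting = len(match.group(1)) >= min_level
        if not deleting:
            kept.append(line)
    return "\n".join(kept)


sample = "# A\nintro\n## B\ntext b\n### C\ntext c\n#### D\ntext d\n## E\ntext e"
print(drop_deep_sections(sample, min_level=3))
```

The filtered text could then be piped into pandoc. The Lua filter remains the more robust option because it works on the parsed document rather than on raw lines.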

Related

IBM DOORS: DXL script that follows link chain

I'm totally new to DXL scripting and am trying to get deeper into it. Unfortunately, DXL is not a well-documented language.
First, let me describe the situation. I have 3 different folders. The first folder contains just a start module. The other two folders each contain several more modules.
I want to start with the start module and go through all of its objects, looking for incoming links. If there is an incoming link, I want to follow it to its source object, which is an object in a module in the second folder,
and switch to it. From there I want to repeat the procedure until objects in folder 3 are reached, or rather until I have checked whether the folder 2 module objects have incoming links from folder 3 module objects.
The reason I want to do this is to check whether each link chain is complete, and to count how many are complete and how many are not. That is, whether folder 1 module objects have incoming links from folder 2 module objects
and folder 2 module objects have incoming links from folder 3 module objects.
That's what I've got so far:
    pragma runLim, 0
    int iAcc = 0
    int iRej = 0
    int linkChainComplete = 0
    int linkChainNotComplete = 0
    string startModulPath = "/X/X.1"
    Module m
    m = read(startModulPath)
    Object o
    Link l
    filtering off
    Filter f = hasLinks(linkFilterIncoming, "*")
    for o in m do {
        bool isComplete = true
        set(m, f, iAcc, iRej)
        if(iAcc == 0){
            isComplete = false
        }
        else{
            for l in o <- "*" do {
                Object src = source l;
                set(m, f, iAcc, iRej)
                if(iAcc == 0){
                    isComplete = false
                }
            }
        }
        if(isComplete){
            linkChainComplete++
        }
        else{
            linkChainNotComplete++
        }
    }
The first question is: am I on the right path?
The second question, or rather problem, is that I want to check whether there are incoming links by using the hasLinks function (see the "set(m, f, iAcc, iRej)" and "if(iAcc == 0)" parts).
But this function refers to the module (m) instead of the objects (o). Is there another way to check whether an object has incoming links or not?
I hope someone can help me. Thank you very much indeed.
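The chain check described in the question can be modelled outside DOORS to make the intended counting explicit. The sketch below is plain Python, not DXL, and rests on stated assumptions: the incoming dictionary (object name to list of source object names) is hypothetical stand-in data, and a chain counts as complete when at least one path of two incoming links exists (folder 1 from folder 2 from folder 3). If every source must itself be linked, replace any with all.

```python
def has_complete_chain(obj, incoming, depth=2):
    """Return True if `depth` incoming links can be followed starting at obj.

    `incoming` maps an object name to the names of objects linking to it.
    The recursion is bounded by `depth`, so cyclic links cannot loop forever.
    """
    if depth == 0:
        return True
    return any(
        has_complete_chain(source, incoming, depth - 1)
        for source in incoming.get(obj, [])
    )


def count_chains(start_objects, incoming, depth=2):
    """Return (complete, not_complete) counts for the start module's objects."""
    complete = sum(
        1 for obj in start_objects if has_complete_chain(obj, incoming, depth)
    )
    return complete, len(start_objects) - complete


# Hypothetical data: A* live in folder 1, B* in folder 2, C* in folder 3.
incoming = {"A1": ["B1"], "A2": ["B2"], "B1": ["C1"]}
print(count_chains(["A1", "A2", "A3"], incoming))
```

Here A1 is complete (A1 from B1 from C1), A2 stops at B2, and A3 has no incoming links at all.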

How to use mapFieldType with gdal.VectorTranslate

I'm trying to export a PostgreSQL database into a .gpkg file, but some of my fields are lists, and ogr2ogr gives me this message:
Warning 1: The output driver does not natively support StringList type for field my_field_name. Misconversion can happen. -mapFieldType can be used to control field type conversion.
But since, according to the documentation, -mapFieldType is not a layer creation option (-lco), I can't find how to use it with the Python version, gdal.VectorTranslate.
Here is my config:
    gdal_conn = gdal.OpenEx(f"PG:service={my_pgsql_service}", gdal.OF_VECTOR)
    gdal.VectorTranslate("path_to_my_file.gpkg", gdal_conn,
        SQLStatement=my_sql_query,
        layerName=my_mayer_name,
        format="GPKG",
        accessMode='append',
    )
So I tried to add it to the layer creation options:
    layerCreationOptions=["-mapFieldType StringList=String"]
but it didn't work.
So I dug into the GDAL code, added a parameter mapFieldType=None to the VectorTranslateOptions function, and added the following lines to its body:
    if mapFieldType is not None:
        mapField_str = ''
        i = 0
        for k, v in mapFieldType.items():
            i += 1
            mapField_str += f"{k}={v}" if i == len(mapFieldType) else f"{k}={v},"
        new_options += ['-mapFieldType', mapField_str]
And it worked, but is there another way?
And if not, where can I propose this feature?
Thank you for your help.
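The string-building loop in the patch can be written more compactly with str.join. The helper below is a sketch: map_field_type_args is a made-up name, and the IntegerList entry is only example data. The commented-out call at the end is an assumption to verify against your GDAL version: gdal.VectorTranslate has an options keyword that takes raw ogr2ogr-style arguments, which may make patching the library unnecessary.

```python
def map_field_type_args(mapping):
    """Build the ogr2ogr-style argument pair for -mapFieldType.

    {"StringList": "String"} becomes ["-mapFieldType", "StringList=String"].
    An empty mapping yields an empty list so nothing is appended.
    """
    if not mapping:
        return []
    joined = ",".join(f"{src}={dst}" for src, dst in mapping.items())
    return ["-mapFieldType", joined]


args = map_field_type_args({"StringList": "String", "IntegerList": "String"})
print(args)

# Assumed usage, not executed here (requires GDAL and a database connection):
# gdal.VectorTranslate("path_to_my_file.gpkg", gdal_conn,
#                      options=args, format="GPKG", accessMode="append")
```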

How can I scrape a Website with onClick event listener using Nokogiri

I am trying to scrape a website using Nokogiri and download the documents that are posted on it. I can scrape other websites, such as the Matatiela website, and get the documents from them. But when I try to scrape the Mbhashe website, I can't get the documents because I first have to trigger the onclick event in order to get to the document.
The problem is that I don't know how to trigger the onclick event in order to get to the document. I have tried this code, which I worked on with a friend, but it didn't work:
    if url.include?('http://www.alfredduma.gov.za/bids-tender-notices/')
      file = anchor['onclick'].to_s.gsub("location.href=","").gsub(";return false;","").gsub("'","")
      f = mech.get(file)
      fileNmae = f.header['content-disposition']
      fileNmae = fileNmae.match('"(.*?)"').andand[1].to_s
      fileNmae = municipalityName+ " -" +fileNmae.gsub("_"," ")
      downld(municipalityName,file,filepath,fileNmae,provinceName)
    end
This code didn't work. Below is code similar to what I used to scrape the Matatiela website, but it does not work on the Mbhashe website. Can you please help me? It does not return anything.
    ["https://www.mbhashemun.gov.za/procurement/tenders/","div.tb > div.tbrow > a","http://www.mbhashemun.gov.za","Mbhashe municipality","Eastern Cape"]
My function gets the CSS selector from this array.
    if baseurl.include?('ttps://www.mbhashemun.gov.za/procurement/tenders/')
      puts "downloading from mbhashemun"
      parenturl = anchor['href']
      puts parenturl
      puts baseurl
      tenderurl = parenturl
      begin
        if tenderurl.include?('http://www.mbhashemun.gov.za/web/2018/11/upgrade-and-maintenance-of-data-centre-for-a-period-of-three-03-years/')
          puts "the document is currently not available"
        else
          puts tenderurl
          passingparentUrl = HTTParty.get(tenderurl)
          parsedparentUrl = Nokogiri::HTML(passingparentUrl)
          downloadtenderurl = parsedparentUrl.at_css('div.media div.media-body > div.wpfilebase-attachment > div.wpfilebase-rightcol > div.wpfilebase-filetitle > a')[:href]
          puts downloadtenderurl
          bean = downloadtenderurl
          puts bean
          myfunction = bean.split('/').last
          puts Myfunction
          if File.exists?(File.join('public/uploads', Myfunction))
            puts "the file exist in upload folder and in the database already"
          else
            mech.pluggable_parser.default = Mechanize::Download
            mech.get(bean).save(File.join('public/uploads', monwai))
            Tender.create municipality_name: municipalityName ,tender_description:Myfunction ,tender_document: Myfunction ,provincename: provinceName
          end
        end
      rescue Exception => e
        puts e
      end
    end
The code is supposed to go through the website, download the documents, and save them in the app's public/uploads folder.
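An HTML parser does not execute JavaScript, so an onclick handler cannot be "triggered" from it. When the handler only assigns location.href, as in the first snippet, the target URL can be read straight out of the attribute text. The sketch below shows that extraction in Python as an illustration of the idea behind the chained gsub calls. The name url_from_onclick and the example URL are hypothetical, and it assumes the handler has the form location.href='...';return false;.

```python
import re

# Matches location.href='...' or location.href="..." and captures the URL.
ONCLICK_URL = re.compile(r"location\.href\s*=\s*(['\"])(.*?)\1")


def url_from_onclick(onclick):
    """Return the URL assigned to location.href in an onclick value, or None."""
    match = ONCLICK_URL.search(onclick or "")
    return match.group(2) if match else None


# Hypothetical attribute value for illustration only.
print(url_from_onclick("location.href='http://example.com/tender.pdf';return false;"))
```

Unlike stripping fixed substrings, the pattern returns None when the attribute has an unexpected shape, so the caller can skip that anchor instead of requesting a malformed URL.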

How can I programmatically export pdf annotations (such as a formula encircled in a rectangle) as images?

I'm wondering if it is possible to export some annotations as images. I already know how to export highlighted text as text, but this doesn't work well with equations. If equations were denoted by an annotation, such as a box encircling them, could I convert them all at once to images using a pdf snapshot tool?
It is easy to do each one individually by hand with the pdf snapshot tool. Do any pdf libraries or programs have any tools that let you make image snapshots programmatically, not of whole pages, but of individual equations that are marked somehow with an annotation?
For the purposes of the question, they don't necessarily have to be free programs.
Thanks.

I came up with a fully Ruby-based solution, using the Ruby gems pdf-reader and rmagick (along with an installation of ImageMagick).
    require 'pdf-reader'
    require 'RMagick'
    pdf_file_name = 'statmech' # without extension
    doc = PDF::Reader.new(File.expand_path(pdf_file_name + ".pdf"))
    $objects = doc.objects
    def convertpagetojpgandcrop(filename, pagenum, croprect, imgname)
      pagename = filename + ".pdf[#{pagenum - 1}]"
      # higher density used for quality purposes (otherwise fuzzy)
      pageim = Magick::Image.read(pagename) { |opts| opts.density = 216 }.first
      # factors of 3 needed because higher density TODO: generalize to pdf density!=72
      # SouthWestGravity puts coordinate origin in bottom left to match pdf coords
      eqim = pageim.crop(Magick::SouthWestGravity,
                         3 * croprect[0], 3 * croprect[1],
                         3 * croprect[2] - 3 * croprect[0], 3 * croprect[3] - 3 * croprect[1])
      eqim.write(imgname)
    end
    def is_square?(object)
      object[:Type] == :Annot && object[:Subtype] == :Square
    end
    def is_highlight?(object)
      object[:Type] == :Annot && object[:Subtype] == :Highlight
    end
    def annots_on_page(page)
      references = (page.attributes[:Annots] || [])
      lookup_all(references).flatten
    end
    def lookup_all(refs)
      refs = *refs
      refs.map { |ref| lookup(ref) }
    end
    def lookup(ref)
      object = $objects[ref]
      return object unless object.is_a?(Array)
      lookup_all(object)
    end
    def highlights_on_page(page)
      all_annots = annots_on_page(page)
      all_annots.select { |a| is_highlight?(a) }
    end
    def squares_on_page(page)
      all_annots = annots_on_page(page)
      all_annots.select { |a| is_square?(a) }
    end
    def restricted_annots_on_page(page)
      all_annots = annots_on_page(page)
      all_annots.select { |a| is_square?(a) || is_highlight?(a) }
    end
    # This block exports a jpg for each 'square' annotation in pdf
    doc.pages.each do |page|
      eqnum = 0
      all_squares = squares_on_page(page)
      all_squares.each do |annot|
        eqnum = eqnum + 1
        puts "#{annot[:Rect]}"
        convertpagetojpgandcrop(pdf_file_name, page.number, annot[:Rect],
                                pdf_file_name + "page#{page.number}eq#{eqnum}.jpg")
      end
    end
    # This block gives the text of the highlights and wikilinks to the images
    # TODO: (needs to go in text file)
    doc.pages.each do |page|
      eqnum = 0
      annots = restricted_annots_on_page(page)
      if annots.length > 0
        puts "# Page #{page.number}"
      end
      annots.each do |annot|
        if is_square?(annot)
          eqnum = eqnum + 1
          puts "{{wiki:#{pdf_file_name}page#{page.number}eq#{eqnum}.jpg}}"
        else
          puts "#{annot[:Contents]}"
        end
      end
    end
This code expands upon example code for the pdf-reader and rmagick gems found online. Few of the lines are original.
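The TODO about the hard-coded factor of 3 comes down to one ratio: PDF user space is nominally 72 units per inch, so a page rendered at density d is scaled by d / 72 (216 / 72 = 3 in the code above). The sketch below is written in Python rather than Ruby and only does the arithmetic. The function name rect_to_crop_box is made up for illustration, and it assumes the annotation rectangle is given as [x0, y0, x1, y1] with the origin at the bottom left, as in the answer.

```python
def rect_to_crop_box(rect, density=216, pdf_units_per_inch=72):
    """Convert a PDF annotation rectangle to a pixel crop box.

    rect is [x0, y0, x1, y1] in PDF units with a bottom-left origin.
    Returns (x, y, width, height) in pixels, also with a bottom-left origin.
    """
    scale = density / pdf_units_per_inch
    x0, y0, x1, y1 = rect
    # Corner order is normalised in case the rectangle is stored reversed.
    left, right = sorted((x0, x1))
    bottom, top = sorted((y0, y1))
    return (
        round(left * scale),
        round(bottom * scale),
        round((right - left) * scale),
        round((top - bottom) * scale),
    )


print(rect_to_crop_box([100, 200, 300, 250], density=216))
```

The four returned values correspond to the four numeric arguments passed to crop in convertpagetojpgandcrop.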

This code sample uses Amyuni PDF Creator .Net. It exports the page with only one annotation visible at a time:
    using System;
    using System.Drawing;
    using System.IO;
    using Amyuni.PDFCreator;
    using System.Collections;
    // open a pdf document
    FileStream testfile = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read);
    IacDocument document = new IacDocument(null);
    document.SetLicenseKey("your license", "your code");
    document.Open(testfile, "");
    document.CurrentPageNumber = 1;
    IacAttribute attribute = document.CurrentPage.AttributeByName("Objects");
    // listobj is an array list of objects
    ArrayList listobj = (System.Collections.ArrayList)attribute.Value;
    ArrayList annotations = new ArrayList();
    foreach (Amyuni.PDFCreator.IacObject iacObj in listobj)
    {
        if ((bool)iacObj.AttributeByName("Annotation").Value)
        {
            annotations.Add(iacObj);
            // Put the annotation out of sight
            iacObj.Coordinates = Rectangle.FromLTRB(
                -iacObj.Coordinates.Left,
                -iacObj.Coordinates.Top,
                -iacObj.Coordinates.Right,
                -iacObj.Coordinates.Bottom);
        }
        else
            iacObj.Delete(false);
    }
    ArrayList images = new ArrayList();
    int i = 0;
    foreach (Amyuni.PDFCreator.IacObject iacObj in annotations)
    {
        // Back in sight
        iacObj.Coordinates = Rectangle.FromLTRB(
            -iacObj.Coordinates.Left,
            -iacObj.Coordinates.Top,
            -iacObj.Coordinates.Right,
            -iacObj.Coordinates.Bottom);
        // Draw the page
        Bitmap bmp = new Bitmap(1000, 1000);
        Graphics gr = Graphics.FromImage(bmp);
        IntPtr hdc = gr.GetHdc();
        document.DrawCurrentPage(hdc.ToInt32(), true);
        gr.ReleaseHdc();
        images.Add(bmp);
        bmp.Save("c:\\temp\\image" + i + ".pdf");
        iacObj.Delete(false); // object not needed anymore
        i++;
    }
If needed, you can extract the part of the resulting image that corresponds to the annotation by using the Coordinates property of the annotation object.
If you want to extract all objects from a rectangular area (annotations or otherwise), you can replace the loop that collects annotations with a call to the method IacDocument.GetObjectsInRectangle.
Usual disclaimer applies

How to get an outline view in sublime texteditor?

How do I get an outline view in the Sublime Text editor for Windows?
The minimap is helpful, but I miss a traditional outline (a clickable list of all the functions in my code, in the order they appear, for quick navigation and orientation).
Maybe there is a plugin, add-on or similar? It would also be nice if you could briefly name the steps necessary to make it work.
There is a duplicate of this question on the Sublime Text forums.

Hit CTRL+R, or CMD+R for Mac, for the function list. This works in Sublime Text 1.3 or above.

A plugin named Outline is available in package control, try it!
https://packagecontrol.io/packages/Outline
Note: it does not work in multi-row/multi-column layouts.
For multi-row/multi-column layouts, use this fork:
https://github.com/vlad-wonderkidstudio/SublimeOutline

I use the fold all action. It will minimize everything to the declaration, I can see all the methods/functions, and then expand the one I'm interested in.

I briefly looked at the Sublime Text 3 API, and view.find_by_selector(selector) seems to return a list of regions.
So I guess that a plugin that would display the outline/structure of your file is possible.
Note: the function name display plugin could be used as inspiration for extracting the class/method names, or ClassHierarchy for extracting the outline structure.

If you want to be able to print out or save the outline, Ctrl/Cmd+R is not very useful.
One can do a simple Find All with the following regular expression, ^[^\n]*function[^{]+{, or some variant of it to suit the language and situation you are working in.
Once you have done the Find All, you can copy and paste the result into a new document; depending on the number of functions, it should not take long to tidy up.
The answer is far from perfect, particularly for cases where the comments contain the word function (or its equivalent), but I do think it's a helpful answer.
With a very quick edit this is the result I got on what I'm working on now.
PathMaker.prototype.start = PathMaker.prototype.initiate = function(point){};
PathMaker.prototype.path = function(thePath){};
PathMaker.prototype.add = function(point){};
PathMaker.prototype.addPath = function(path){};
PathMaker.prototype.go = function(distance, angle){};
PathMaker.prototype.goE = function(distance, angle){};
PathMaker.prototype.turn = function(angle, distance){};
PathMaker.prototype.continue = function(distance, a){};
PathMaker.prototype.curve = function(angle, radiusX, radiusY){};
PathMaker.prototype.up = PathMaker.prototype.north = function(distance){};
PathMaker.prototype.down = PathMaker.prototype.south = function(distance){};
PathMaker.prototype.east = function(distance){};
PathMaker.prototype.west = function(distance){};
PathMaker.prototype.getAngle = function(point){};
PathMaker.prototype.toBezierPoints = function(PathMakerPoints, toSource){};
PathMaker.prototype.extremities = function(points){};
PathMaker.prototype.bounds = function(path){};
PathMaker.prototype.tangent = function(t, points){};
PathMaker.prototype.roundErrors = function(n, acurracy){};
PathMaker.prototype.bezierTangent = function(path, t){};
PathMaker.prototype.splitBezier = function(points, t){};
PathMaker.prototype.arc = function(start, end){};
PathMaker.prototype.getKappa = function(angle, start){};
PathMaker.prototype.circle = function(radius, start, end, x, y, reverse){};
PathMaker.prototype.ellipse = function(radiusX, radiusY, start, end, x, y , reverse/*, anchorPoint, reverse*/ ){};
PathMaker.prototype.rotateArc = function(path /*array*/ , angle){};
PathMaker.prototype.rotatePoint = function(point, origin, r){};
PathMaker.prototype.roundErrors = function(n, acurracy){};
PathMaker.prototype.rotate = function(path /*object or array*/ , R){};
PathMaker.prototype.moveTo = function(path /*object or array*/ , x, y){};
PathMaker.prototype.scale = function(path, x, y /* number X scale i.e. 1.2 for 120% */ ){};
PathMaker.prototype.reverse = function(path){};
PathMaker.prototype.pathItemPath = function(pathItem, toSource){};
PathMaker.prototype.merge = function(path){};
PathMaker.prototype.draw = function(item, properties){};
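The Find All approach can also be run outside the editor when the outline needs to be saved or printed. The sketch below applies the same regular expression in Python, with the final brace escaped. The function name outline and the sample source are made up for illustration, and the caveat about the word function appearing in comments applies here too.

```python
import re

# Same pattern as in the answer: a line containing "function" up to its opening brace.
FUNCTION_LINE = re.compile(r"^[^\n]*function[^{]+\{", re.MULTILINE)


def outline(source):
    """Return every 'function ... {' match in source, in order of appearance."""
    return [match.group(0).strip() for match in FUNCTION_LINE.finditer(source)]


# Hypothetical JavaScript input for illustration only.
sample = """PathMaker.prototype.add = function(point){
  return point;
};
function go(distance, angle) {
}
"""

for entry in outline(sample):
    print(entry)
```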
