Gitbook-style glossary in blogdown

I am trying to add the glossary function of legacy gitbook to blogdown with Hugo. In gitbook, this feature automatically generated an <a> tag for terms listed in a separate glossary.md file. The glossary file is structured as:
## Term 1
Definition 1
## Term 2
Definition 2
A working example (in legacy gitbook) can be seen here.
I cannot find a way to do this in Hugo, but think I should be able to with blogdown. Could I use a build.R script to call a separate function to conduct a find & replace on the .rmd files, replacing each instance of a string Term X with a <span title="Definition X">Term X</span>?
The envisaged workflow would be something like:
Copy content directory, so original content dir remains unchanged
Find & replace terms (from terms in glossary document, to text in content directory)
Call blogdown to build HTML
Blogdown calls Hugo to render site
Is this a reasonable approach/is there a better way?

I have worked it out. My reading suggests that lapply and others in the apply family are more computationally efficient in R, but I cracked the nut with for loops first.
In the build.R script:
#Ensure working directory is the site root
library(R.utils)
library(xfun)
#Move everything to a safe space
copyDirectory(from="content", to="working", recursive=TRUE)
#Draws the glossary from a separate .md file, and separates it out into terms and definitions
glossary <- lapply(strsplit(readLines(con = "content/glossary.md", warn = FALSE), "## |##"), function(x){x[!x ==""]})
glossary <- glossary[lengths(glossary) > 0] #Tidy up: drop empty entries
terms <- glossary[seq_along(glossary) %% 2 > 0] #Separates the terms
defs <- glossary[seq_along(glossary) %% 2 == 0] #And the defs. terms[1] corresponds to defs[1].
#Get files for site
files <- list.files(path = "working", pattern = "\\.(md|rmd|rmarkdown)$", ignore.case = TRUE, recursive = TRUE)
#Replacing bit
setwd("working")
for (file in seq_along(files)) {
  for (term in seq_along(terms)) {
    gsub_file(files[file], terms[[term]],
              sprintf("<span title=\"%s\" class=\"glossary\">%s</span>", defs[[term]], terms[[term]]),
              fixed = TRUE)
  }
}
setwd("..")
#Build the site from the new glossaried markdown
blogdown::build_dir('working')
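As a quick sanity check of the parsing step, this is what it produces for a made-up two-term glossary (the terms are hypothetical; run in a fresh R session):
lines <- c("## API", "Application Programming Interface",
           "## CLI", "Command Line Interface")
glossary <- lapply(strsplit(lines, "## |##"), function(x) { x[!x == ""] })
glossary <- glossary[lengths(glossary) > 0]
terms <- glossary[seq_along(glossary) %% 2 > 0]  # "API", "CLI"
defs <- glossary[seq_along(glossary) %% 2 == 0]  # the two definitions
# A content line containing "API" would then be rewritten by the loop as:
# <span title="Application Programming Interface" class="glossary">API</span>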

Related

How to open and read a .gz file in Nim (preferably line by line)

I just sat down to write my first Nim script to parse a .vcf (Variant Call Format) file. This file format stores genetic mutations from sequencing data.
For scripting languages, I 'grew up' on Perl and later migrated to Python, but I would love to use a language with the speed that Nim offers. I realize Nim is still young, but I couldn't even find a clear example for how to open and read a .gz (gzip) file (preferably line by line).
Can anyone provide a simple example to open and read a gzip file using Nim, line by line?
In Python, I'm accustomed to the following (uber-simple) code:
import gzip
my_file = gzip.open('my_file.vcf.gz', 'rt')  # 'rt' = read as text ('w' would open for writing)
for line in my_file:
    pass  # do something with line
my_file.close()
I have seen related questions, but they're not clear. The posts are also relatively old and I hope/suspect something better has come about. Here's what I've found:
Read gzip-compressed file line by line
File, FileStream, and GZFileStream
Reading files from tar.gz archive in Nim
Really appreciate it.
P.S. I also think it would be useful if someone created a Nim tag in StackOverflow. I do not have the reputation to create tags.
Just in case you need to handle VCF rather than .gz, there's a nice wrapper for htslib written by Brent Pedersen:
https://github.com/brentp/hts-nim
You need to install htslib on your system, and then require the library in your .nimble file with requires "hts", or install it with nimble install hts. If you are going to do NGS analysis in Nim, you'll need it.
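A minimal .nimble file might look like this (all the metadata here is made up; only the requires line matters for this answer):
# example.nimble (hypothetical package metadata)
version     = "0.1.0"
author      = "Your Name"
description = "VCF parsing example"
license     = "MIT"
requires "hts"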
The code you need:
import hts
var v: VCF
doAssert open(v, "myfile.vcf.gz")
# Here you have the VCF file loaded in v, and can access the headers
# through the v.header property
for record in v:
  # Here you get a Record object per line, e.g. extract the REF and ALTs:
  echo record.REF, " ", record.ALT
v.close()
Be sure to follow the docs, because some things differ from Python, especially when getting the INFO and FORMAT fields.
Check out Brent's whole repo. It has plenty of wrappers, code samples, and utilities for handling NGS problems (e.g. an ultrafast coverage tool called Mosdepth).
Per suggestion from Maurice Meyer, I looked at the tests for the Nim zip package. It turned out to be quite simple. This is my first Nim script, so my apologies if I didn't follow convention, etc.
import zip/gzipfiles # Import the zip package
block:
  let vcf = newGzFileStream("my_file.vcf.gz") # Open gzip file
  defer: vcf.close() # Close the file when the block exits (like a 'finally' in a 'try' block)
  var line: string # Declare line variable
  # Loop over each line in the file
  while not vcf.atEnd():
    line = vcf.readLine()
    # Cure disease with my VCF file
Because the zip package is already in the Nim package library, installing it was as simple as:
> nimble refresh
> nimble install zip
I tried to use Nim some time ago to parse a fastq or fastq.gz file.
The code should be available here:
https://gitlab.pasteur.fr/bli/qaf_demux/blob/master/Nim/src/qaf_demux.nim
I don't remember exactly how this works, but apparently I did an import zip/gzipfiles and used newGzFileStream on the input file name to obtain a Stream from which lines can be read using .readLine(), in this piece of code:
proc fastqParser(stream: Stream): iterator(): Fastq =
  result = iterator(): Fastq =
    var
      nameLine: string
      nucLine: string
      quaLine: string
    while not stream.atEnd():
      nameLine = stream.readLine()
      nucLine = stream.readLine()
      discard stream.readLine() # skip the '+' separator line
      quaLine = stream.readLine()
      yield [nameLine, nucLine, quaLine]
It is used in something that amounts to this piece of code:
let inputFqs = fastqParser(newGzFileStream($inFastqFilename))
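A for loop over the resulting closure iterator then yields one record at a time, along these lines (a sketch; the indexing assumes the three-element array yielded above):
for fq in inputFqs():
  echo fq[0] # the name line of each record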
Hopefully you can adapt this to your case.
My .nimble file has a requires "zip#head". I suppose this triggers the installation of zip/gzipfiles.

stat function for perl6

Is there an alternative way in Perl 6 to get file attribute details like size, access time, modified time, etc., without having to invoke a native call?
As per the doc it is "unlikely to be implemented as a built in as its POSIX specific".
What workaround options are available excluding the system call to stat?
Any ideas or pointers are much appreciated.
Thanks.
See the IO::Path doc.
For example:
say 'foo'.IO.s; # 3 if 'foo' is an existing file of size 3 bytes
.IO on a string creates an IO::Path object for the filesystem entry at the path given by the string.
See the doc on ACCEPTS for examples of using junctions to test multiple attributes at the same time.
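For the specific attributes asked about (size, access time, modification time), a minimal sketch, assuming an existing file named 'foo':
my $f = 'foo'.IO;
say $f.s;        # size in bytes
say $f.accessed; # Instant of last access
say $f.modified; # Instant of last modification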
I'm not sure if the following is too much. Ignore it if it is. Hopefully it's helpful.
You can discover/explore some of what's available in Perl 6 via its HOW objects (aka Higher Order Workings objects, How Objects Work objects, metaobjects -- whatever you want to call them) which know HOW objects of a particular type work.
say IO::Path.^methods
displays:
(BUILD new is-absolute is-relative parts volume dirname basename extension
Numeric sibling succ pred open watch absolute relative cleanup resolve
parent child add chdir rename copy move chmod unlink symlink link mkdir
rmdir dir slurp spurt lines comb split words e d f s l r w rw x rwx z
modified accessed changed mode ACCEPTS Str gist perl IO SPEC CWD path BUILDALL)
Those are some of the methods available on an IO::Path object.
(You can get more or less with adverbs, e.g. say IO::Path.^methods(:all), but the default display aims at giving you the ones you're likely most interested in. The up arrow (^) means the method call (.methods) is not sent to the invocant but rather is sent "upwards", up to its HOW object as explained above.)
Here's an example of applying some of them one at a time:
spurt 'foo', 'bar'; # write a three letter string to a file called 'foo'.
for <e d f s l r w rw x rwx z modified accessed changed mode>
-> $method { say 'foo'.IO."$method"() }
The for loop iterates over the methods, listed by their string names in the <...> construct. To call a method on an invocant given its name in a variable $qux, write ."$qux"(...).
For anyone looking for an answer to this question in 2021: there is the File::Stat module, which provides some additional stat(2) information such as UID, GID, and mode.
#!/usr/bin/env raku
use File::Stat <stat>;
say File::Stat.new(path => $?FILE).mode.base(8);
say stat($?FILE).uid;
say stat($?FILE).gid;

Documenting CMake scripts

I find myself in a situation where I would like to accurately document a host of custom CMake macros and functions and was wondering how to do it.
The first thing that comes to mind is simply using the built-in comment syntax to document the scripts, like so:
# -----------------------------
# [FUNCTION_NAME | MACRO_NAME]
# -----------------------------
# ... description ...
# -----------------------------
This is fine. However, I'd like to employ common doc generators, for instance doxygen, to also generate external documentation that can be read by anyone without looking at the implementation (which is a common scenario).
One way would be to write a simple parser that generates a corresponding C/C++ header with the appropriate signatures and documentation directly from the CMake script, which could then be processed by doxygen or comparable tools. One could also maintain such a header by hand - which is obviously tedious and error-prone.
Is there any other way to employ a documentation generator with CMake scripts?
Here is the closest I could get. The following was tested with CMake 2.8.10. CMake 3.0, currently under development, will get a new documentation system based on Sphinx and reStructuredText; I guess that this will bring new ways to document your modules.
CMake 2.8 can extract documentation from your modules, but only documentation at the beginning of the file is considered. All documentation is written as CMake comments, beginning with a single #. Double ## will be ignored (so you can add comments to your documentation). The end of the documentation block is marked by the first non-comment line (e.g. an empty line).
The first line gives a brief description of the module. It must start with - and end with a period or a blank line.
# - My first documented CMake module.
# description
or
# - My first documented CMake module
#
# description
In HTML, lines starting with two or more spaces (after the #) are formatted in a monospace font.
Example:
# - My custom macros to do foo
#
# This module provides the macro foo().
# These macros serve to demonstrate the documentation capabilities of CMake.
#
#  FOO( [FILENAME <file>]
#       [APPEND]
#       [VAR <variable_name>]
#     )
#
# The FOO() macro can be used to do foo or bar. If FILENAME is given,
# it even writes baz.
MACRO( FOO )
...
ENDMACRO()
To generate documentation for your custom modules only, call
cmake -DCMAKE_MODULE_PATH:STRING=. --help-custom-modules test.html
Setting CMAKE_MODULE_PATH lets you define additional directories to search for modules; otherwise, your modules need to be in the default CMake location. --help-custom-modules limits the documentation generation to custom modules not shipped with CMake. If you give a filename, the documentation is written to that file; otherwise it goes to stdout. If the filename has a recognized extension, the documentation is formatted accordingly.
The following formats are possible:
.html for HTML documentation
.1 to .9 for man page
.docbook for Docbook
anything else: plain text
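For example, to render the same documentation as a man page instead (following the extension table above):
cmake -DCMAKE_MODULE_PATH:STRING=. --help-custom-modules test.1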

How to fetch data from a file and run it using Tcl code

In Tcl, how do I get data from a file at a given location and run that data with Tcl code?
For example:
Folder 1 contains a config file. I want to read the information in the config file, check whether particular information is present, and execute it.
If the configuration file contains Tcl code, it's just:
# Put the filename in quotes if you want, or in a variable, or ...
source /the/path/to/the/file.tcl
If the file contains Tcl code but you don't trust it, you can use a “safe interpreter” context. This disables many commands, giving a much more restricted set of capabilities that you can then add specific exceptions to (with interp alias):
# Make the context
set i [interp create -safe]
# Set up a way for the context to let the master find out about what to
# really set
interp alias $i configure {} recordConfiguration
proc recordConfiguration args {
    puts "configured with $args"
}
# Evaluate the script (note that [source] is hidden by default) in the context
$i invokehidden source /the/path/to/the/file.tcl
# Dispose the context
interp delete $i
If the file isn't Tcl code, you have to parse it. That's a substantially more complex matter, so much so that we'll need to know the format of the file before we can answer.
If you are trying to read data (like text strings) from a file then you'll have to open a channel for that particular file like this:
set fileid [open "path/to/your/file.txt" r]
Read the open manual page.
Then you can use the gets command to read data from the file through the channel fileid.
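For instance, a minimal line-by-line read loop (the file path is made up):
set fileid [open "path/to/your/file.txt" r]
while {[gets $fileid line] >= 0} {
    # $line now holds one line of the file
    puts $line
}
close $fileid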

Groc (Docco fork) using strip option

From what I understand, the --strip option of groc (a docco fork) lets me strip folders out of the documentation hierarchy, e.g. for a folder structure like:
src/
  module1/
    coffee/
      submod1/
        xxx.coffee
        yyy.coffee
      submod2/
        zzz.coffee
  module2/
    coffee/
      submod1/
        xxx.coffee
        yyy.coffee
      submod2/
        zzz.coffee
I want to exclude all coffee folders from the hierarchy of the docs. How do I use strip to do that? It's not really clear in the docs.
I don't think you can achieve what you're looking for with the --strip option. I believe strip only allows you to strip from the beginning of the path. I've tried "*" glob pattern matching and couldn't get your scenario to work (e.g. --strip coffee and --strip */coffee).
For general usage, I use strip like this:
groc **/*.js --strip modules/WebCommon/j2ee-apps/webcommon.war/javascript
# INPUT
# modules/WebCommon/j2ee-apps/webcommon.war/javascript/validator/validator.js
# OUTPUT
# validator/validator.js
However, the filename in the actual HTML file still holds the entire path. That's because the template uses projectPath and not targetPath for the title.
# "projectPath":"modules/WebCommon/j2ee-apps/webcommon.war/javascript/validator/validator.js","targetPath":"validator/validator"
Admittedly, Ian MacLeod (the author) agrees that it's confusing:
https://github.com/nevir/groc/issues/13