Can't figure out output character encoding for MeCab - mecab

I'm trying to parse some Japanese text, and I can't seem to figure out the output encoding.
This is the output I'm getting:
これは ̾��,����,*,*,*,*,*
本 ̾��,����,*,*,*,*,*
です ̾��,����,*,*,*,*,*
。 ̾��,������³,*,*,*,*,*
EOS
Steps I took:
git clone https://github.com/taku910/mecab
cd mecab/mecab
./configure --enable-utf8-only --with-charset=utf8
make
sudo make install
mecab -o ~/Desktop/output.txt ~/Desktop/input.txt, where input.txt contains "これは本です。"
Using OSX 10.15.3
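One possible reading of the output above, offered as an assumption rather than a confirmed diagnosis: the surface forms pass through unchanged from the UTF-8 input, while the feature columns come from a dictionary encoded as EUC-JP and are being displayed as UTF-8. The stray "̾" at the start of each garbled field is the UTF-8 rendering of the bytes 0xCC 0xBE, and those same two bytes form a single kanji in EUC-JP. A minimal sketch for checking that, where the helper name eucjp_to_utf8 is made up for illustration and iconv is assumed to support EUC-JP:

```shell
#!/bin/sh
# Hypothetical helper: reinterpret the bytes on stdin as EUC-JP and emit UTF-8.
eucjp_to_utf8() {
    iconv -f EUC-JP -t UTF-8
}

# 0xCC 0xBE written in octal. If these two bytes decode to a readable
# kanji, the feature strings are most likely EUC-JP, not UTF-8.
printf '\314\276' | eucjp_to_utf8
echo
```

If the bytes decode cleanly, running mecab -D (dictionary info) should show which charset the installed dictionary was built with; a dictionary built for EUC-JP would need to be rebuilt or replaced with a UTF-8 one.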


RISC-V: How to fix "file format not recognized" when disassembling a .img file?

I'm playing with RISC-V.
I have a .img file and I want to disassemble it into a .asm file, so I ran the following command:
> riscv64-unknown-elf-objdump -d xxx.img > xxx.asm
However, I got this issue:
riscv64-unknown-elf-objdump: xxx.img: file format not recognized
How can I fix it? I have no idea what to do with this issue.
If you run:
riscv64-unknown-elf-objdump --help
You'll see a line like:
riscv64-unknown-elf-objdump: supported architectures: riscv riscv:rv64 riscv:rv32
These are the supported architectures that you need to pass as the -m argument. Normally, an ELF file will encode this information so there's no guesswork, but in the case of using a flat file, there's no way for objdump to know how the instructions are supposed to be interpreted. The final command is:
riscv64-unknown-elf-objdump -b binary -m riscv:rv64 -D xxx.img
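The distinction between an ELF file and a flat image can be sketched as a small check on the file's first four bytes. The function names is_elf and objdump_opts are hypothetical, and riscv:rv64 is simply the architecture used in the answer above; pick whichever entry from the supported list matches your target.

```shell
#!/bin/sh
# Succeeds if the file starts with the ELF magic bytes 0x7f 'E' 'L' 'F'.
is_elf() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "7f454c46" ]
}

# Print objdump options that suit the file: an ELF file carries its own
# architecture information, a flat image needs -b binary, -m and -D.
objdump_opts() {
    if is_elf "$1"; then
        echo "-d"
    else
        echo "-b binary -m riscv:rv64 -D"
    fi
}

# Usage (not run here; needs the RISC-V toolchain on PATH):
#   riscv64-unknown-elf-objdump $(objdump_opts xxx.img) xxx.img > xxx.asm
```

Note that -D (disassemble all) is used for the flat image because there are no section headers to tell objdump which parts contain code.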

Problem running bash-script in wsl with convert function (ImageMagick) - WSL

A Windows user here, with little, almost zero experience with Linux.
While procrastinating as I wrote my thesis, I encountered a script that converts the pdf to a gif using the convert command from ImageMagick. Here is the resulting gif:
https://raw.githubusercontent.com/npcardoso/PhDThesis/master/thesis.gif
Since I use Windows, I activated WSL and installed Ubuntu.
This is the script I am using:
#!/bin/bash
tmp_dir=$(mktemp -d -t cho-XXXXXXXXXX)
echo $tmp_dir
cd $(dirname $0)
function remove_alpha() {
    convert -monitor -alpha remove -background white -antialias $*
}
function to_gif() {
    convert -monitor -loop 0 -strip -layers OptimizePlus -delay 50 -antialias $*
}
remove_alpha -density 50 thesis/main.pdf $tmp_dir/thesis_raster.pdf
pdfnup $tmp_dir/thesis_raster.pdf {},1- -o $tmp_dir/thesis_nup.pdf
to_gif $tmp_dir/thesis_nup.pdf thesis.gif
rm -rfv $tmp_dir
However, I am not able to run the script successfully. I get the following errors:
[image: these errors]
I do not know how to get rid of these errors. I even tried removing the functions, but I still get the error about $'.\r': No such file or directory.
[image: errors without the functions]
Any guidance?
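The $'.\r' in that message is how bash prints a dot followed by a carriage return, which usually means the script was saved with Windows (CRLF) line endings, so cd receives ".\r" instead of ".". That diagnosis is an assumption, since the full error text is not shown in the question. A minimal sketch for detecting and removing the carriage returns, with hypothetical function and file names:

```shell
#!/bin/sh
# Succeeds if the file contains at least one carriage return (CR) byte.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Write a copy of the file with every CR removed (CRLF becomes LF).
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Usage (file names are placeholders):
#   has_cr make_gif.sh && strip_cr make_gif.sh make_gif_unix.sh
```

Writing to a second file keeps the original untouched; setting the editor on the Windows side to save with LF line endings avoids the problem from the start.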

ImageMagick: How do I convert pdf to jpg? I get ERROR: no decode delegate for this image format

Problem
I need to convert a multipage pdf to jpg-files but ImageMagick keeps throwing errors that are hard to interpret.
Installing ImageMagick
At first I installed it using apt-get, but after reading that several people had problems with that, I ended up installing it from source.
My linux distribution (A Docker image):
>lsb_release -a
No LSB modules are available.
Distributor ID: Debian
Description: Debian GNU/Linux 10 (buster)
Release: 10
Codename: buster
Installing ImageMagick from source:
# Installing build tools and ghostscript
apt update
apt-get install -y build-essential make ghostscript
# Downloading imagemagick
wget https://www.imagemagick.org/download/ImageMagick.tar.gz
# Installing and cleaning up
tar xvzf ImageMagick.tar.gz && cd ImageMagick-7* && ./configure && make && make install && ldconfig /usr/local/lib && cd .. && rm -r ImageMagick-7*
# Checking ImageMagick version
>magick -version
Version: ImageMagick 7.0.10-60 Q16 x86_64 2021-01-25 https://imagemagick.org
Copyright: (C) 1999-2021 ImageMagick Studio LLC
License: https://imagemagick.org/script/license.php
Features: Cipher DPC HDRI OpenMP(4.5)
Delegates (built-in): jpeg x xml zlib
Converting files
# Image to image
>convert test.jpg test.png
# PDF to image
>convert test.pdf test.jpg
convert: no decode delegate for this image format `' @ error/constitute.c/ReadImage/572.
convert: no images defined `test.jpg' @ error/convert.c/ConvertImageCommand/3304.
Is Ghostscript the problem?
The Ghostscript installation is a common problem for many, but Ghostscript seems to work fine and produces an output file:
# PDF to image with Ghostscript
gs -sDEVICE=pngalpha -sOutputFile=test.jpg test.pdf
Do I have to install more Delegates?
The error suggests that there is something off with my limited delegates, so I thought to install all dependencies up front.
# Listing dependencies
>apt update && apt build-dep imagemagick
Reading package lists... Done
E: You must put some 'source' URIs in your sources.list
This is where I got stuck.
Solution
It turned out that it was indeed the delegates that were missing. I haven't seen this well described in the documentation or anywhere else.
NOTE: Delegates should be installed before you install ImageMagick
Here is how I fixed it:
# Add a source URI, or uncomment an existing one (use one of the two options)
## Option 1: adding the URI
echo "deb-src http://deb.debian.org/debian buster main" >> /etc/apt/sources.list
apt update
## Option 2: uncommenting the URI
sudo sed -Ei 's/^# deb-src /deb-src /' /etc/apt/sources.list
sudo apt update
# Installing dependencies
apt-get build-dep imagemagick
Now I can convert pdf to jpg!
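To see what a rebuild actually picked up, the "Delegates (built-in):" line from magick -version can be split into one name per line and compared before and after installing the build dependencies. This is only a sketch; list_delegates is a hypothetical helper, and which delegate names a given conversion needs is not something it tries to decide.

```shell
#!/bin/sh
# Read `magick -version` output on stdin and print one built-in
# delegate per line, taken from the "Delegates (built-in):" line.
list_delegates() {
    sed -n 's/^Delegates (built-in): *//p' | tr -s ' ' '\n'
}

# Usage (needs ImageMagick installed):
#   magick -version | list_delegates
```

Saving the list to a file before and after the rebuild and running diff on the two files shows which delegates were added.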
My solution was to reinstall ImageMagick, following the steps in andrew.46's answer at https://askubuntu.com/questions/745660/imagemagick-png-delegate-install-problems/746195#746195

How do I install libpng on MSYS2?

I want to build a program with mingw w64 and I have msys2 installed.
I tried to work with pacman from the msys2 prompt.
$ pacman -Q libpng
error: package 'libpng' was not found
$ pacman -S libpng
error: target not found: libpng
$ pacman -S *libpng
error: target not found: *libpng
I attempted to use google and came up with:
$ pacman -S mingw-w64-libpng
error: target not found: mingw-w64-libpng
$ pacman -F mingw-w64-libpng
warning: database file for 'mingw32' does not exist (use '-Fy' to download)
warning: database file for 'mingw64' does not exist (use '-Fy' to download)
warning: database file for 'msys' does not exist (use '-Fy' to download)
error: no options specified (use -h for help)
It is very peculiar that, after all the downloading I did (which I distinctly recall included a database for pacman), these database files don't seem to exist.
$ pacman -Fy mingw-w64-libpng
[... stuff downloads ... ]
error: no options specified (use -h for help)
$ pacman -U mingw-w64-libpng
loading packages...
error: 'mingw-w64-libpng': could not find or read package
So now the questions are,
1) How in the future will I find the magic prefix for a well-known library in order to be able to tell pacman what to install?
2) How at the moment do I instruct pacman to install the libpng package which seems to be in the mingw-w64-libpng package?
3) Is that the package with the development headers or is that yet another package, as I have adjusted to on Deb/Ubuntu by looking for something like libpng-dev?
Have you tried pacman -Ss libpng? This will list all packages mentioning libpng, prefix and all:
$ pacman -Ss libpng
mingw32/mingw-w64-i686-libpng 1.6.35-1
A collection of routines used to create PNG format graphics (mingw-w64)
mingw64/mingw-w64-x86_64-libpng 1.6.35-1 [installed]
A collection of routines used to create PNG format graphics (mingw-w64)
I notice that these names include an architecture (i686/x86_64), which is fairly common in MinGW package names.
EDIT: The headers end up here:
$ ls /mingw64/include/libpng16/
png.h pngconf.h pnglibconf.h
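Going by the search output above, the full package name is the mingw-w64 prefix, the architecture, and the library name joined with hyphens. A small sketch of that naming pattern, where mingw_pkg is a hypothetical helper and the pattern is inferred only from the two names listed:

```shell
#!/bin/sh
# Build the pacman package name for a MinGW-w64 library.
# $1 = architecture (i686 or x86_64), $2 = library name.
mingw_pkg() {
    printf 'mingw-w64-%s-%s\n' "$1" "$2"
}

# Usage inside an MSYS2 shell (not run here):
#   pacman -S "$(mingw_pkg x86_64 libpng)"
```

Running pacman -Ss with the library name first remains the safer route, since it shows the exact names the repositories actually carry.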

pandoc: xelatex not found. xelatex is needed for pdf output

I have just upgraded my Macbook Pro OS to El Capitan (v10.11.4).
My attempt to export a Markdown file (created using Sublime Text 2, v2.0.2, build 2221) to pdf using pandoc is now failing, and I receive the following error:
pandoc: xelatex not found. xelatex is needed for pdf output
My output command is as follows:
pandoc doc1.md -o doc1.pdf --toc -V geometry:margin=1in --variable fontsize=10pt --variable fontfamily=utopia --variable linkcolor=blue --latex-engine=xelatex -f markdown-implicit_figures -s
The above command worked like a charm prior to installing El Capitan.
FYI - in searching for questions here I have not found one that gives a suitable answer.
In my case, adding one line to ~/.bashrc solved the error:
export PATH=/Library/TeX/texbin:$PATH
Of course, the environment variable has to be loaded in the current terminal:
$ . ~/.bashrc
Then run: $ make
and the error disappears.
El Capitan's security features disable and remove the old symlink /usr/texbin. If you have MacTeX 2015, the binaries should also have been installed in /Library/TeX/texbin. You'll have to update the PATH you're using to launch pandoc so that it includes that folder. If you have a pre-2015 distribution of MacTeX, there are instructions here.
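A minimal sketch of the PATH update, written so that sourcing the startup file repeatedly does not keep adding the same entry. The function name prepend_path is hypothetical; /Library/TeX/texbin is the directory named above.

```shell
#!/bin/sh
# Prepend a directory to PATH unless it is already listed there.
prepend_path() {
    case ":$PATH:" in
        *":$1:"*) ;;               # already present, nothing to do
        *) PATH="$1:$PATH" ;;
    esac
    export PATH
}

# Usage, e.g. in ~/.bashrc (only if the directory exists):
#   [ -d /Library/TeX/texbin ] && prepend_path /Library/TeX/texbin
```

After that, command -v xelatex should print a path inside /Library/TeX/texbin if MacTeX put its binaries there.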
Linux Ubuntu instructions:
Tested on Ubuntu 18.04:
If you see this error on Linux Ubuntu:
pandoc: xelatex not found. xelatex is needed for pdf output
Then you need to install the texlive-xetex package like this:
sudo apt update
sudo apt install texlive-xetex
That solves it! Source where I learned this: TEX: XeLatex under Ubuntu.
In my particular case, I was trying to run this make_book.sh script to generate book.pdf, so I needed to do all of the following:
sudo apt update
sudo apt install pandoc
pip3 install MarkdownPP
sudo apt install texlive-xetex
cd path/to/repo
cd systemd-by-example
./make_book.sh
# You'll now have "book.pdf" inside directory "systemd-by-example"!
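Before running a build script like this, it can help to check which of the required commands are actually on PATH. The helper below is a hypothetical sketch, not part of the repository's script:

```shell
#!/bin/sh
# Print each named command that cannot be found on PATH.
missing_cmds() {
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Usage before running the build script:
#   missing_cmds pandoc xelatex
```

An empty result means every listed command was found; any name printed points at a package that still needs installing, for example texlive-xetex when xelatex is reported.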
References:
https://github.com/jreese/markdown-pp - instructions to install MarkdownPP
https://tex.stackexchange.com/a/179811/168682 - instructions to install texlive-xetex