Extracting a gzip file leads to another gzip file - gzip

I have an N-Quads file 'articles.nq.gz' (it can be found at http://opencitations.net/source-data/ ). After using the command
gunzip articles.nq.gz
I end up with 'articles.nq', but the document isn't recognized as an N-Quads file; it is still identified as a gzip file.
Thanks in advance!
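A quick way to confirm the suspicion is to check whether the extracted file still starts with the gzip magic bytes (0x1f 0x8b) and, if it does, unpack it one more level. A minimal Python sketch, assuming the file really is double-compressed (the output name articles.plain.nq is made up):
import gzip
import shutil

def is_gzip(path):
    # gzip streams always start with the two magic bytes 0x1f 0x8b
    with open(path, "rb") as f:
        return f.read(2) == b"\x1f\x8b"

if is_gzip("articles.nq"):
    # The "extracted" file is itself still a gzip stream: unpack one more level.
    with gzip.open("articles.nq", "rb") as src, open("articles.plain.nq", "wb") as dst:
        shutil.copyfileobj(src, dst)
Renaming articles.nq back to articles.nq.gz and running gunzip a second time would do the same job.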

Related

Hash 'hashcat': Token length exception

hashcat64.exe hashcat -m0 -a0 crackme.txt password.txt
Device #1: Intel's OpenCL runtime(GPU only) is currently broken. We
are waiting for updated OpenCL drivers from Intel
Hash 'hashcat': Token length exception No hashes loaded.
I'm getting this message. I've attached a snapshot of my CL.
I've looked for any spaces in the hash directory and its format.
I've also tried changing all the Unicode formats of the .txt file.
Nothing seems to work. I've also updated the Intel drivers.
Can anyone help, please? Thanks in advance.
I think you should check the end of each line in the files containing your hashes and passwords. If there are spaces at the end of the lines, you will get a "Token length exception" or "No hashes loaded" error. Just remove those spaces and try again.
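A minimal sketch of that cleanup in Python, stripping trailing whitespace and blank lines from the hash file before handing it to hashcat (the filename crackme.txt is taken from the question; utf-8-sig also drops a leading BOM, which is one of the "Unicode format" problems mentioned above):
# Rewrite crackme.txt with trailing whitespace and blank lines removed.
with open("crackme.txt", "r", encoding="utf-8-sig") as f:
    lines = [line.rstrip() for line in f]

with open("crackme.txt", "w", encoding="utf-8", newline="\n") as f:
    for line in lines:
        if line:
            f.write(line + "\n")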
For anyone looking into this: I used two rules; you can use many others to increase efficiency.
hashcat64.exe hashcat -m0 -a0 crackme.txt password.txt -r rules/best64.rule
or
hashcat64.exe hashcat -m0 -a0 crackme.txt password.txt -r rules/d3ad0ne.rule
This error can also occur if the hash file is not found. Note that the restore file effectively encodes the absolute path to the hash file, so this error can appear when the hash file has moved and you attempt to resume. (Technically, the restore file saves the potentially relative path as specified when the job was originally run, but it also saves the original working directory and cds there first.)

How to fix 'File name too long' errors when using Snakemake

When using Snakemake, I store the values of my variables as part of the filenames (e.g. "processed/count_{project}.tsv"). Recently, I've started using R formulas with many covariates as a variable. Now I get an error because the filename is too long for the operating system. Has anyone else run into this issue, and do you have any suggestions? Is there a canonical Snakemake approach to this problem?
Personally, I don't think it is a good idea to store information in the filename.
Rather, I would create a temp file in tabular or YAML format linking the file in question to the covariates or other data, then read this file in R (or whatever you use downstream) to extract the relevant information.
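A minimal sketch of that idea as a Snakefile fragment, assuming a hypothetical models.tsv with a short model_id column and a formula column (models.tsv, model_id and fit_model.R are all made-up names for illustration):
import pandas as pd

# models.tsv (tab-separated), e.g.:
# model_id  formula
# m01       count ~ age + sex + batch
models = pd.read_csv("models.tsv", sep="\t").set_index("model_id")

rule fit_model:
    output:
        "processed/count_{project}_{model_id}.tsv"
    params:
        # Look up the full formula from the table instead of encoding it in the path.
        formula=lambda wildcards: models.loc[wildcards.model_id, "formula"]
    shell:
        "Rscript fit_model.R --formula '{params.formula}' --out {output}"
The filename then only carries a short identifier such as m01, while the long formula lives in the table.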
One idea is to use paths instead, since paths are allowed to be longer (on Linux a single filename component is typically limited to 255 bytes, while the whole path can be much longer).

How to ignore failure when a file does not exist when downloading with a WinSCP script

I'm running a script to get a file from an SFTP server; however, this is a recurring job and should still succeed if no file exists. Is there an option I can specify?
option batch on
option confirm off
option transfer binary
open sftp://server -timeout=60
password
get /File/2_04-28-2015.txt D:\Files
close
exit
Getting this result:
Can't get attributes of file 'File/2_04-28-2015.txt'.
No such file or directory.
Error code: 2
Tried setting failonnomatch:
winscp> option failonnomatch on
Unknown option 'failonnomatch'.
You cannot tell WinSCP to ignore an absent file when using a specific file name.
But you can check the file existence prior to the actual download.
An easy alternative hack is to use a file mask (note the trailing *) and set failonnomatch off:
option failonnomatch off
get /File/2_04-28-2015.txt* D:\Files\
(If you are getting "Unknown option 'failonnomatch'", you have an old version of WinSCP.)
Have you tried using MGET instead of GET? It shouldn't fail; it will just not transfer anything if there's nothing there.

Using Text File Input in Kettle to read a CSV file from a tar.gz file didn't work. Where might it be wrong?

I have a CSV file that is tarred and gzipped, so I have test.tar.gz.
I would like to read the CSV file through Text File Input.
I tried the path tar:gz:file://C:/test/test.tar.gz!/test.tar! with a wildcard like ".*\.csv", but it can't be read successfully.
It throws this exception:
org.apache.commons.vfs.FileNotFolderException:
Could not list the contents of
"tar:gz:file:///C:/test/test.tar.gz!/test.tar!/"
because it is not a folder.
I am using Windows 8.1 and PDI 5.2.
Where might it be wrong?
When reading a CSV from a compressed file, the "Text File Input" step in Pentaho Kettle only supports the first file inside the compressed archive (whether a Zip or GZip file). Check the Pentaho Wiki, in the compression section.
Now for your issue, try removing the wildcard entry, since only the first file inside the zip/gzip archive will be read (as explained above).
I have placed sample code covering reading both zip and gzip files. Check it here.
Hope it helps :)
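If the nested tar:gz: VFS path keeps failing, another workaround (outside Kettle, not part of the answer above) is to pre-extract the CSVs with a small script and point Text File Input at the plain files. A minimal Python sketch; the archive path comes from the question, the extraction folder is made up:
import tarfile

# Pull the .csv members out of the nested archive so Text File Input
# can read plain files instead of the tar:gz: VFS path.
with tarfile.open("C:/test/test.tar.gz", "r:gz") as tar:
    csv_members = [m for m in tar.getmembers() if m.name.lower().endswith(".csv")]
    tar.extractall(path="C:/test/extracted", members=csv_members)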

How can I check whether an input file is compressed (ZIP) or not?

How can I check whether an input file is compressed (ZIP) or not?
Is the solution to read the file info using the "Get File Names" step and check the extension field?
Use the "file" command if you're on Unix.
If not install cygwin and goto 1.
If this is related to your other question about conditionally reading different files, then I would consider getting your files into a consistent format first, i.e. all compressed.
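If you would rather do the check from a script than with the file command, a minimal Python sketch reading the leading magic bytes (ZIP archives start with PK\x03\x04, gzip streams with 0x1f 0x8b; the path in the usage line is made up):
def compression_type(path):
    # Identify the archive type from the file's leading magic bytes.
    with open(path, "rb") as f:
        head = f.read(4)
    if head.startswith(b"PK\x03\x04"):
        return "zip"
    if head[:2] == b"\x1f\x8b":
        return "gzip"
    return None

print(compression_type("C:/data/input.csv"))  # prints None, "zip", or "gzip"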