Get directory count of IPFS via the API

I installed ipfs version 0.8.0 on WSL Ubuntu 18.04 and started it using sudo ipfs daemon. I added a directory using the command sudo ipfs add -r /home/user/ipfstest, which resulted in:
added QmfYH2KVxANPA3um1W5MYWA6zR4Awv8VscaWyhhQBVj65L ipfstest/abc.sh
added QmTXny9ZjuFPm4C4KbQSEYxvUp2MYbSCLppPQirW7ap4Go ipfstest
Likewise, I added one more directory containing 2 files. Now I need the total number of files and directories in my IPFS node, using go-ipfs-api. Following is my code:
package main

import (
    "fmt"
    "context"
    "os"
    "net/http"

    "github.com/prometheus/client_golang/prometheus"
    "github.com/prometheus/client_golang/prometheus/promhttp"
    "github.com/ipfs/go-ipfs-api"
)

var sh *shell.Shell

func main() {
    sh := shell.NewShell("localhost:5001")
    dir, err := sh.FilesLs(context.TODO(), "")
    if err != nil {
        fmt.Fprintf(os.Stderr, "error: %s", err)
        os.Exit(1)
    }
    fmt.Printf("Dir are: %d", dir)

    pins, err := sh.Pins()
    if err != nil {
        fmt.Fprintf(os.Stderr, "error: %s", err)
        os.Exit(1)
    }
    fmt.Printf("Pins are: %d", len(pins))

    dqfs_pincount.Add(float64(len(pins)))
    prometheus.MustRegister(dqfs_pincount)
    http.Handle("/metrics", promhttp.Handler())
    http.ListenAndServe(":8090", nil)
}
If I run this code, I get the output as:
Dir are: [824634392256] Pins are: 18
The pin count increases as I add files. But what is this output, [824634392256]? And why only one entry?
I tried giving a path to the function: dir, err := sh.FilesLs(context.TODO(), "/.ipfs"), since I guessed the files and directories must be stored in ~/.ipfs. But this gives an error:
error: files/ls: file does not exist
How can I get all the directories of ipfs? Where am I mistaken? What path should I provide as a parameter? Please help and guide.

There's a bit to unpack here.
Why are you using sudo?
IPFS is meant to be run as a regular user. Generally you don't want to run it as root; instead, run the same commands, just without sudo:
ipfs daemon
ipfs add -r /home/user/ipfstest
...
Code doesn't compile
Let's begin with the code and make sure it's working as intended before moving forward. First off, your import:
"github.com/ipfs/go-ipfs-api"
Should read:
shell "github.com/ipfs/go-ipfs-api"
Otherwise the code won't compile, because of your use of shell later in the code.
Why does dir produce the output it does?
Next, let's look at your usage of dir. You're storing the []*MfsLsEntry returned by FilesLs (MfsLsEntry), which is a slice of pointers. You're outputting it with the %d verb, which formats a base-10 integer (docs), so the "824634392256" is just the memory address of the MfsLsEntry at the first index of the slice.
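As a hedged sketch (the Name and Hash fields on MfsLsEntry are taken from go-ipfs-api's type and worth double-checking), you would range over the returned entries rather than printing the slice value itself:
entries, err := sh.FilesLs(context.TODO(), "/")
if err != nil {
    fmt.Fprintf(os.Stderr, "error: %s", err)
    os.Exit(1)
}
// Print each MFS entry's hash and name instead of its pointer value.
for _, e := range entries {
    fmt.Printf("%s %s\n", e.Hash, e.Name)
}
fmt.Printf("Entries in MFS root: %d\n", len(entries))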
Why does sh.FilesLs(context.TODO(),"/.ipfs") fail?
Well, FilesLs isn't querying the regular filesystem your OS runs on, but MFS (the Mutable File System). MFS is stored locally, but using the add API doesn't automatically add anything to your MFS. You can, however, use FilesCp to copy a CID into your MFS after you add it.
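For example, a rough sketch (assuming go-ipfs-api's FilesCp(ctx, src, dest) signature), copying the directory CID from your ipfs add output into MFS so that FilesLs can see it:
// Copy the already-added directory into MFS under /ipfstest.
err := sh.FilesCp(context.TODO(), "/ipfs/QmTXny9ZjuFPm4C4KbQSEYxvUp2MYbSCLppPQirW7ap4Go", "/ipfstest")
if err != nil {
    fmt.Fprintf(os.Stderr, "error: %s", err)
}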
How do I list my directories on IPFS?
This is a bit of a tricky question. The only data your node really retains is either pinned data or data referenced in MFS. As we learned above, FilesLs lists the files/directories in your MFS. To list your recursive pins (directories), it's quite simple using the command line:
ipfs pin ls -t recursive
For the API, though, you'll first want to call something like Shell.Pins(), filter for the pins you want (maybe a quick scan through, pulling out anything recursive), then query the CIDs using Shell.ObjectStat or whatever you prefer.
If working with the pins though, do remember that it won't feel quite like a regular mutable filesystem, because it isn't. It's much easier to navigate through CIDs added to MFS. So that's how I'd recommend you list your directories on IPFS.
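A rough sketch of that flow (the "recursive" pin type string and the ObjectStat fields used below are assumptions about go-ipfs-api, so verify them against its docs):
pins, err := sh.Pins()
if err != nil {
    fmt.Fprintf(os.Stderr, "error: %s", err)
    os.Exit(1)
}
for cid, info := range pins {
    if info.Type != "recursive" {
        continue // skip direct and indirect pins
    }
    stat, err := sh.ObjectStat(cid)
    if err != nil {
        continue
    }
    fmt.Printf("recursive pin %s: %d links, %d bytes\n", cid, stat.NumLinks, stat.CumulativeSize)
}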

Related

How does variable scope work in BitBake?

I use Yocto and I was wondering how the variable scope works in a BitBake recipe:
My recipe looks like:
SRC_URI += "file://something"
python do_fetch_prepend() {
    d.appendVar("SRC_URI", "https://www.bla.com/resource.tar")
    bb.error("SRC_URI_1: %s " % d.getVar("SRC_URI"))
    d.setVar("TEST_VAR", "test")
}

python do_unpack_append() {
    bb.error("SRC_URI_2: %s " % d.getVar("SRC_URI"))
    bb.error("TEST_VAR: %s " % d.getVar("TEST_VAR"))
}
I run bitbake -v -c unpack myrecipe
SRC_URI_1 is printed as expected: "file://something https://www.bla.com/resource.tar"
SRC_URI_2 is printed as: "file://something"
TEST_VAR is printed as: None
It looks like setting/changing a variable in the datastore (d) is only done in the scope of do_fetch. Is this expected behaviour? I read in the documentation that 'd' is a global variable.
If this is expected behaviour, is there a way to change global variables in a task of a recipe?
The reason behind the question is that I need another native recipe before I can add the extra URI to SRC_URI. I first tried inline Python variable expansion, but the BitBake parser already expands the variable before the native recipe is put in the 'native directory'. So I try to change SRC_URI during the fetch task, and I 'load' my native recipe as follows:
python () {
    d.appendVarFlag('do_parse', 'depends', 'my-recipe-native:do_populate_sysroot')
}
In do_fetch_prepend I use this native recipe, which gives me the correct URL that I want to append to SRC_URI, so the fetching, unpacking, cleaning, etc. tasks all run. But it looks like only the fetching works; the unpacking does not, because SRC_URI is not updated there.
Within a given task, variable changes are only local. This means do_unpack does not 'see' a change made by the do_fetch task.
This is necessary to allow some tasks to rerun when others are covered by sstate, to ensure things are deterministic.
If you really want to do what you describe, you'd need something like a prefunc for the tasks where you need to modify SRC_URI.
python myprefunc() {
    d.appendVar("SRC_URI", "https://www.bla.com/resource.tar")
}
do_fetch[prefuncs] += "myprefunc"
do_unpack[prefuncs] += "myprefunc"
However note that whilst this will do some of what you want, source archives, license manifests and sstate checksums may not work correctly since you're "hiding" source data from bitbake and this data is only present at task execution time, not parse time.

How do I tell Octave where to find functions without picking up other files?

I've written an Octave script, hello.m, which calls subfunc.m, and which takes a single input file, data.txt, as a command-line argument and loads it with load(argv(){1}).
If I put all three files in the same directory, and call it like
./hello.m data.txt
then all is well.
But if I've got another data.txt in another directory, and I want to run my script on it, and I call
../helloscript/hello.m data.txt
this fails because hello.m can't find subfunc.m.
If I call
octave --path "../helloscript" ../helloscript/hello.m data.txt
then that seems to work fine.
The problem is that if I don't have a data.txt in the directory, then the script will pick up any data.txt that is lying around in ../helloscript.
This seems a bit fragile. Is there any way to tell Octave, preferably in the script itself, to get subfunctions from the same directory as the script, but to get everything else relative to the current directory?
The most robust solution I can think of at the moment is to inline the subfunction in the script, which is a bit nasty.
Is there a good way to do this, or is it just a thorny problem that will cause occasional hard to find problems and can't be avoided?
Is this in fact just a general problem with scripting languages that I've just never noticed before? How does e.g. python deal with it?
It seems like there should be some sort of library-load-path that can be set without altering the data-load-path.
Adding all your subfunctions to your program file is not nasty at all. Why would you think so? It is perfectly normal to have function definitions in your script. The only language I know that does not do this is Matlab but that's just braindead.
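For instance, a minimal sketch of hello.m with the subfunction moved inline (the subfunc body below is just a placeholder for whatever subfunc.m actually does):
#!/usr/bin/octave -qf
1;  # mark this file as a script file, not a function file

function y = subfunc (x)  # formerly subfunc.m
  y = 2 * x;              # placeholder body
endfunction

data = load (argv (){1});
disp (subfunc (data));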
The other alternative you have is to check that the input file argument, data.txt exists. Like so:
fpath = argv (){1};
[info, err, msg] = stat (fpath);
if (err)
  error ("could not stat `%s' : %s", fpath, msg);
endif
## continue your script knowing the file exists
But really, I would recommend you do both. Add your subfunctions to your main program (the only reason to have them in a separate file is if you plan on sharing them with other programs), and always check your input arguments.

How to force STORE (overwrite) to HDFS in Pig?

When developing Pig scripts that use the STORE command, I have to delete the output directory for every run or the script stops and reports:
2012-06-19 19:22:49,680 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 6000: Output Location Validation Failed for: 'hdfs://[server]/user/[user]/foo/bar More info to follow:
Output directory hdfs://[server]/user/[user]/foo/bar already exists
So I'm searching for an in-Pig solution to automatically remove the directory, also one that doesn't choke if the directory is non-existent at call time.
In the Pig Latin Reference I found the shell command invoker fs. Unfortunately the Pig script breaks whenever anything produces an error. So I can't use
fs -rmr foo/bar
(i.e. remove recursively), since it breaks if the directory doesn't exist. For a moment I thought I might use
fs -test -e foo/bar
which is a test and shouldn't break, or so I thought. However, Pig again interprets test's return code on a non-existing directory as a failure code and breaks.
There is a JIRA ticket for the Pig project addressing my problem and suggesting an optional parameter OVERWRITE or FORCE_WRITE for the STORE command. Anyway, I'm using Pig 0.8.1 out of necessity and there is no such parameter.
At last I found a solution on grokbase. Since finding the solution took too long I will reproduce it here and add to it.
Suppose you want to store your output using the statement
STORE Relation INTO 'foo/bar';
Then, in order to delete the directory, you can call at the start of the script
rmf foo/bar
No ";" or quotations required since it is a shell command.
I cannot reproduce it now, but at some point I got an error message (something about missing files) from which I can only assume that rmf interfered with map/reduce. So I recommend putting the call before any relation declaration; after SETs, REGISTERs and defaults should be fine.
Example:
SET mapred.fairscheduler.pool 'inhouse';
REGISTER /usr/lib/pig/contrib/piggybank/java/piggybank.jar;
%default name 'foobar'
rmf foo/bar
Rel = LOAD 'something.tsv';
STORE Rel INTO 'foo/bar';
Once you use the fs command, there are a lot of ways to do this. For an individual file, I wound up adding this to the beginning of my scripts:
-- Delete file (won't work for output, which will be a directory,
-- but will work for a file that gets copied or moved during
-- the script.)
fs -touchz top_100
rm top_100
For a directory
-- Delete dir
fs -rm -r out

stat vs mkdir with EEXIST

I need to create a folder if it does not exist, so I use:
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

bool mkdir_if_not_exist(const char *dir)
{
    bool ret = false;
    if (dir) {
        // first check if folder exists
        struct stat folder_info;
        if (stat(dir, &folder_info) != 0) {
            if (errno == ENOENT) { // create folder
                if (mkdir(dir, S_IRWXU | S_IXGRP | S_IRGRP | S_IROTH | S_IXOTH) != 0) // 755
                    perror("mkdir");
                else
                    ret = true;
            } else
                perror("stat");
        } else
            ret = true; // dir exists
    }
    return ret;
}
The folder is created only during the first run of the program; after that it is just a check.
There is a suggestion to skip the stat call, just call mkdir, and check errno against EEXIST.
Does it give real benefits?
More important, with the stat + mkdir approach, there is a race condition: in between the stat and the mkdir another program could do the mkdir, so your mkdir could still fail with EEXIST.
There's a slight benefit. Look up 'LBYL vs EAFP' or 'Look Before You Leap' vs 'Easier to Ask Forgiveness than Permission'.
The slight benefit is that the stat() system call has to parse the directory name and get to the inode - or the missing inode in this case - and then mkdir() has to do the same. Granted, the data needed by mkdir() is already in the kernel buffer pool, but it still involves two traversals of the path specified instead of one. So, in this case, it is slightly more efficient to use EAFP than to use LBYL as you do.
However, whether that is really a measurable effect in the average program is highly debatable. If you are doing nothing but create directories all over the place, then you might detect a benefit. But it is definitely a small effect, essentially unmeasurable, if you create a single directory at the start of a program.
You might need to deal with the case where strcmp(dir, "/some/where/or/another") == 0 but, although "/some/where" exists, neither "/some/where/or" nor (of necessity) "/some/where/or/another" exists. Your current code does not handle missing directories in the middle of the path; it just reports the ENOENT that mkdir() would report. Your code also does not check that dir actually is a directory - it just assumes that if it exists, it is a directory. Handling these variations properly is trickier.
Similar to Race condition with stat and mkdir in sequence, your solution is incorrect not only due to the race condition (as already pointed out by the other answers over here), but also because you never check whether the existing file is a directory or not.
When re-implementing functionality that's already widely available in existing command-line tools in UNIX, it always helps to see how it was implemented in those tools in the first place.
For example, take a look at how mkdir(1)'s -p option is implemented across the BSDs (bin/mkdir/mkdir.c#mkpath in OpenBSD and NetBSD), all of which, on mkdir(2) error, immediately call stat(2) and then run the S_ISDIR macro to ensure that the existing file is a directory and not just any other type of file.
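Putting both answers together, a minimal sketch of the mkdir-first (EAFP) variant (named mkdir_if_not_exist_eafp here purely for illustration), which only falls back to stat()/S_ISDIR when mkdir() reports EEXIST:
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>

bool mkdir_if_not_exist_eafp(const char *dir)
{
    if (!dir)
        return false;

    // Ask forgiveness: try the mkdir straight away (mode 755).
    if (mkdir(dir, S_IRWXU | S_IXGRP | S_IRGRP | S_IROTH | S_IXOTH) == 0)
        return true;

    if (errno == EEXIST) {
        // Something already exists there; succeed only if it is a directory.
        struct stat sb;
        return stat(dir, &sb) == 0 && S_ISDIR(sb.st_mode);
    }

    perror("mkdir");
    return false;
}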

Automatically mounting NTFS partition on FreeBSD at boot time

I am looking for the way to mount NTFS hard disk on FreeBSD 6.2 in read/write mode.
Searching Google, I found that NTFS-3G can help.
Using NTFS-3G, there is no problem when I try to mount/unmount NTFS manually:
mount: ntfs-3g /dev/ad1s1 /home/admin/data -o uid=1002,
or
umount: umount /home/admin/data
But I have a problem when trying to mount the NTFS hard disk automatically at boot time.
I have tried:
adding fstab: /dev/ad1s1 /home/admin/data ntfs-3g uid=1002 0 0
making a script that automatically mounts the NTFS partition at startup, in the /usr/local/etc/rc.d/ directory.
But it still failed.
The script works well when it is executed manually.
Does anyone know an alternative method or solution to get read/write access to NTFS on FreeBSD 6.2?
Thanks.
What level was your script running at? Was it an S99, or lower?
It sounds like either there is a dependency that isn't loaded at the time you mount, or that the user who is trying to mount using the script isn't able to succeed.
In your script I suggest adding a sudo to make sure that the mount is being performed by root:
/sbin/sudo /sbin/mount ntfs-3g /dev/ad1s1 /home/admin/data -o uid=1002, etc
Swap the sbin for wherever the binaries are.
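As a rough illustration only (the PROVIDE/REQUIRE lines, paths and ntfs-3g options below are assumptions to adapt to your setup), an rc.d wrapper might look like this:
#!/bin/sh
# Hypothetical /usr/local/etc/rc.d/ntfsdata script (sketch only).
# PROVIDE: ntfsdata
# REQUIRE: LOGIN

. /etc/rc.subr

name="ntfsdata"
start_cmd="${name}_start"
stop_cmd=":"

ntfsdata_start()
{
    # rc.d scripts run as root at boot, so no sudo is needed here.
    /usr/local/bin/ntfs-3g /dev/ad1s1 /home/admin/data -o uid=1002
}

load_rc_config $name
run_rc_command "$1"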
After the other approaches I tried, in the end I added ntfs-3g support by changing the mount program's source, mount.c, like this:
static int
use_mountprog(const char *vfstype)
{
    /* XXX: We need to get away from implementing external mount
     * programs for every filesystem, and move towards having
     * each filesystem properly implement the nmount() system call.
     */
    unsigned int i;
    const char *fs[] = {
        "cd9660", "mfs", "msdosfs", "nfs", "nfs4", "ntfs",
        "nwfs", "nullfs", "portalfs", "smbfs", "udf", "unionfs",
        "ntfs-3g",   /* added: treat ntfs-3g as an external mount program */
        NULL
    };

    for (i = 0; fs[i] != NULL; ++i) {
        if (strcmp(vfstype, fs[i]) == 0)
            return (1);
    }

    return (0);
}
Recompile the mount program, and it works!
Thanks...