Import server file using relative path via REST API - GraphDB

I'm trying to kick off an import of RDF files into a GraphDB repository via the Workbench REST API.
It works fine when the file is in the {graphdb.workbench.importDirectory} directory and the request specifies "filenames": [ "file1.owl" ].
However, if the file is in a subdirectory (e.g. {graphdb.workbench.importDirectory}/top/) and the request uses "filenames": [ "top/file1.owl" ], it fails; "/top/file1.owl" does not work either. The Workbench Import UI shows the entire collection of eligible files under the {graphdb.workbench.importDirectory} directory, and the file in question imports fine when the import is initiated from the Workbench UI.
My question is: does the REST API support importing server files located in such child directories? If so, what syntax am I missing? Do I have to specify some other property (e.g. "baseURI": "file:/home/steve/graphdb-import/top/file1.owl")?
Many thanks for any feedback.

If you have started GraphDB with -Dgraphdb.workbench.importDirectory=<path_to_the_import_directory>, the "Server files" tab should list all files in that directory and in its child directories, with paths shown relative to <path_to_the_import_directory>.
For example, I've started GraphDB with -Dgraphdb.workbench.importDirectory=/home/sava/Videos/data_for_import; in this directory I have a subdirectory "movieDB" with two files, "movieDB.brf" and "movieDB.brf.gz", and both are shown in the tab as "movieDB/movieDB.brf" and "movieDB/movieDB.brf.gz".
If you want to import these files using cURL, use the server import URL with the POST method:
curl -X POST 'http://localhost:7200/rest/data/import/server/w1' \
  -H 'Accept: application/json, text/plain, */*' \
  -H 'Content-Type: application/json;charset=UTF-8' \
  --data-binary '{"importSettings":{"name":"movieDB/movieDB.brf","status":"NONE","message":"","context":"","replaceGraphs":[],"baseURI":null,"forceSerial":false,"type":"file","format":null,"data":null,"timestamp":1608016179633,"parserSettings":{"preserveBNodeIds":false,"failOnUnknownDataTypes":false,"verifyDataTypeValues":false,"normalizeDataTypeValues":false,"failOnUnknownLanguageTags":false,"verifyLanguageTags":true,"normalizeLanguageTags":false,"stopOnError":true},"requestIdHeadersToForward":null},"fileNames":["movieDB/movieDB.brf"]}'

Related

Sorting artifacts using AQL and cleaning up old artifacts

I'm trying to sort the list of artifacts from JFrog Artifactory but I'm getting "The requested URL returned error: 400 Bad Request". The JFrog documentation (https://www.jfrog.com/confluence/display/JFROG/Artifactory+Comparison+Matrix) says sorting won't work for the open source edition. After getting the list of artifacts, we need to delete the old ones from a subfolder in the Artifactory repo. I tried the CLI and AQL, but nothing worked.
Our repo URL looks like this:
http://domainname/artifactory/repo/folder/subfolder/test1.zip
Like test1.zip, we have many artifacts (let's say 50) in that subfolder. Any pointers on this issue would be appreciated. Thanks.
While sorting is not supported in OSS versions, if you would like to delete artifacts older than a certain time period, you can use Relative Time Operators, parse the output, and use a script to delete those artifacts.
You can also specify a specific date. There are several Comparison Operators that you can use.
You can use the below AQL for reference:
curl -uadmin:password -XPOST "http://localhost:8082/artifactory/api/search/aql" -d 'items.find({"repo": "repo"}, {"path": "folder/subfolder"}, {"created" : {"$before" : "2minutes"}})' -H "Content-Type: text/plain"
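The deletion step mentioned above (parse the output, then delete) can be scripted. Below is a minimal sketch, assuming jq is installed, that the AQL response has the usual {"results": [...]} shape with repo, path and name fields, and that the host and credentials match the example above:
#!/bin/bash
# Query AQL for artifacts older than the given period, then delete each match.
ART_URL="http://localhost:8082/artifactory"
AUTH="admin:password"
curl -s -u "$AUTH" -X POST "$ART_URL/api/search/aql" \
     -H "Content-Type: text/plain" \
     -d 'items.find({"repo": "repo"}, {"path": "folder/subfolder"}, {"created": {"$before": "2minutes"}})' \
  | jq -r '.results[] | "\(.repo)/\(.path)/\(.name)"' \
  | while read -r artifact; do
      # A DELETE on the artifact path removes it from the repository
      curl -s -u "$AUTH" -X DELETE "$ART_URL/$artifact"
    done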

GraphDB restore from backup with curl

I'm writing a script to automatically set up a repository, starting from a clean GraphDB running in a Docker container.
I have a config.ttl file containing the repository configuration, the namespace, and a dump in a file init.nq.
I have successfully created the repository using the config.ttl and updated the namespace, but I cannot figure out how to load the init.nq file.
This operation is extremely simple from the web interface: Import -> RDF -> Upload, but I'm not able to work out how to perform it using cURL. I suppose the correct API is
POST /repositories/{repositoryID}/statements
but the dump is too large to pass as simple text (~44MB).
This should work:
curl -X POST -H "Content-Type:application/n-quads" -T init.nq 'http://localhost:7200/repositories/test/statements'
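Note that -T streams init.nq from disk rather than building the request body in memory, so the ~44 MB dump should not be a problem. As a quick sanity check afterwards, you can ask the repository for its statement count via the standard RDF4J size endpoint (the repository id "test" and the port below are just the ones from this example):
curl 'http://localhost:7200/repositories/test/size'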

How does one remove (unregister) a runtime (not component) from the NoFlo Development Environment

I am running a local version of the NoFlo Development Environment and would like to know how to remove (unregister) a runtime. Actually, how can I remove a runtime from the FlowHub hosted environment, as well?
There is currently no UI to do this, but the API exists (there is an issue tracking it).
Here is my bash script for doing just that.
#!/bin/bash -x
# Your UUID can be found through developer JS console: Resources -> Local Storage -> Look for grid-token
uuid="<your uuid>"
# The list of runtime ids you want to delete (space-separated, passed as one argument).
list=$1
for i in ${list}
do
  curl -X DELETE http://api.flowhub.io/runtimes/${i} -H "Authorization: Bearer ${uuid}"
done
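For example, assuming the script above is saved as remove-runtimes.sh, the runtime ids (hypothetical ones below) are passed as a single space-separated argument:
./remove-runtimes.sh "f3a2b1c0-1111-2222-3333-444455556666 9d8c7b6a-4444-5555-6666-777788889999"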

wget downloading only PDFs from website

I am trying to download all PDFs from http://www.fayette-pva.com/.
I believe the problem is that, when hovering over a link to download a PDF, Chrome shows the URL in the bottom left-hand corner without a .pdf file extension. I saw and used another forum answer for a similar case, but there the URLs did end in .pdf when hovering over the PDF links. I have tried the same code from that answer (linked below), but it doesn't pick up the PDF files.
Here is the code I have been testing with:
wget --no-directories -e robots=off -A.pdf -r -l1 \
http://www.fayette-pva.com/sales-reports/salesreport03-feb-09feb2015/
I am testing this on a single page that I know contains a PDF.
The complete command would presumably be something like
wget --no-directories -e robots=off -A.pdf -r http://www.fayette-pva.com/
Related answer: WGET problem downloading pdfs from website
I am not sure whether downloading the entire website would work, or whether it would take forever. How do I get around this and download only the PDFs?
Yes, the problem is precisely what you stated: The URLs do not contain regular or absolute filenames, but are calls to a script/servlet/... which hands out the actual files.
The solution is to use the --content-disposition option, which tells wget to honor the Content-Disposition field in the HTTP response, which carries the actual filename:
HTTP/1.1 200 OK
(...)
Content-Disposition: attachment; filename="SalesIndexThru09Feb2015.pdf"
(...)
Connection: close
This option is supported in wget at least since version 1.11.4, which is already 7 years old.
So you would do the following:
wget --no-directories --content-disposition -e robots=off -A.pdf -r \
http://www.fayette-pva.com/
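If crawling the whole site feels risky, you can first verify the effect of --content-disposition on the single listing page from the question; the -l1 depth limit keeps wget on that one page:
wget --no-directories --content-disposition -e robots=off -A.pdf -r -l1 \
http://www.fayette-pva.com/sales-reports/salesreport03-feb-09feb2015/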

How to download CSV file with poltergeist using Capybara on phantomjs?

For an integration test, I need to download a CSV file using the Poltergeist driver with Capybara. In Selenium (for example with the Firefox/Chrome webdriver), I can specify the download directory and it works fine. But with Poltergeist, is there a way to specify the download directory or some special configuration? Basically, I need to know how downloads work with Poltergeist, Capybara, and PhantomJS.
I can read the server response headers as a Hash using Ruby, but I cannot read the response body to get the file content. Any clues or help, please?
Finally, I solved the download part by simply using cURL inside the Ruby code, without any webdriver. The idea is simple: first I submitted the login form via cURL and saved the cookie to a file, then submitted (again via cURL) the CSV export form using the saved cookie, like this:
# Form fields for the CSV export (placeholder values)
post_data = "p1=d1&p2=d2&p3=d3"
# Log in and store the session cookie in cookie.txt
`curl -c cookie.txt -d "userName=USERNAME&password=PASSWORD" LOGIN_SUBMIT_URL`
# Re-use the cookie to submit the export form and capture the CSV content
csv_data = `curl -X POST -b cookie.txt -d '#{post_data}' SUBMIT_URL_FOR_DOWNLOAD_CSV`