Virtuoso ISQL Result Dump Format - sparql

I am running the following query on Virtuoso isql:
SPARQL
CONSTRUCT
{
?infectee ?getInfectedBy ?infector
}
FROM <http://ndssl.bi.vt.edu/chicago/>
WHERE
{
?s <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram>.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infectee_pid> ?infectee.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_infector_pid> ?infector.
?s <http://ndssl.bi.vt.edu/chicago/vocab/dendrogram_iteration> '0'^^xsd:decimal.
BIND (iri('http://ndssl.bi.vt.edu/chicago/vocab/getInfectedBy') as ?getInfectedBy)
};
I want to dump the result in "N-Triples" format. How can I do that in isql?

Answered on the Virtuoso Users mailing list, where the question was also asked...
Dumping results in various formats can be done by using the
define output:format "{XX}"
pragma, so in your case it would be:
SQL> sparql define output:format "TURTLE" CONSTRUCT ...
Other possible formats are:
NICE_TTL
RDF_XML
etc.
When using the ISQL client to fetch long texts, use the set blobs on; directive to avoid receiving a data truncated warning.
i.e.:
SQL> set blobs on;
SQL> sparql define output:format ...
For CONSTRUCT, the supported formats are:
TRIG, TTL, JSON, JSON;TALIS, SOAP, RDF/XML, NT, RDFA;XHTML, JSON;RES, HTML;MICRODATA, HTML, JS, ATOM;XML, JSON;ODATA, XML, CXML;QRCODE, CXML, HTML;UL, HTML;TR, JSON;LD, CSV, TSV, NICE_TTL, HTML;NICE_MICRODATA, HTML;SCRIPT_LD_JSON, HTML;SCRIPT_TTL, HTML;NICE_TTL
Documentation links:
Pragmas to control the type of the result
List of supported formats
Examples:
Example: Dump arbitrary query result as N-Triples
Controlling SPARQL Output Data Types
To get the result into a local file, the following should work:
First, dump the data into the local file XX.ttl:
isql host:port dba pwd exec="set blobs on; sparql define output:format '"TURTLE"' construct {...} from <....> where {....}" > XX.ttl
then trim the leading header lines so that only the triples remain:
tail -n +9 XX.ttl > XX_new.ttl

Related

Saving SPARQL query results from AWS neptune to CSV files

Can anyone please let me know how to save the results of SPARQL queries in AWS Neptune to a CSV file? I am using a SageMaker notebook to connect to the database cluster. Please find the query below. Any help would be greatly appreciated.
%%sparql
PREFIX mo: <>
SELECT (strafter(str(?s), '#') as ?sample)
WHERE {
  ?s mo:belongs_to_project ?o .
  FILTER(regex(str(?o), "PROJECT_00001"))
}
The results from a query can be saved to a Python variable using:
%%sparql --store-to result
Then in another cell you could write a little Python code that takes the result (which will be in JSON) and creates a CSV containing whatever data you need from the result using the Python CSV helper classes and methods (https://docs.python.org/3/library/csv.html).
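For example, a minimal sketch along those lines (assuming result holds a standard SPARQL 1.1 JSON results document, as stored by the notebook) could be:
import csv

# 'result' is the variable populated by %%sparql --store-to result;
# it is assumed to follow the standard SPARQL 1.1 JSON results layout.
variables = result['head']['vars']
with open('results.csv', 'w', newline='') as f:
    writer = csv.DictWriter(f, fieldnames=variables)
    writer.writeheader()
    for binding in result['results']['bindings']:
        # each binding maps a variable name to {'type': ..., 'value': ...}
        writer.writerow({v: binding.get(v, {}).get('value', '') for v in variables})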
UPDATED: to add that if you just want the whole result as a CSV file you can achieve that by running a curl command from a cell using the %%bash magic. Here is an example:
%%bash
curl -X POST --data-binary 'query=SELECT ?s ?p ?o WHERE {?s ?p ?o} LIMIT 10' -H "Accept: text/csv" https://{cluster endpoint}:8182/sparql > results.csv

When I run the gh-rdf3x engine's command rdf3xquery it prompts: parse error: unknown prefix 'http'

I am trying to use the gh-rdf3x engine for some SPARQL searches, so I took the LUBM-100 dataset and used the RDF2RDF tool to convert all the .owl files into a single test.nt file.
Then I used the gh-rdf3x command
./rdf3xload dataDB test.nt
to build a dataDB file. Finally, I want to run a query, so I use LUBM SPARQL #1 as test.sparql.
Then I do the command
./rdf3xquery dataDB test.sparql
It prompts
parse error: unknown prefix 'http'
I did everything as described in the GH-RDF3X wiki, so I don't know why it reports that.
The message seems to come from the file gh-rdf3x/cts/parser/TurtleParser.cpp.
Thank you for your help.
I guess you're using the LUBM queries from this file, which unfortunately contains several syntax errors.
The first query is missing the angle brackets < and > that must be placed around full URIs; corrected, it reads:
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX ub: <http://www.lehigh.edu/~zhp2/2004/0401/univ-bench.owl#>
SELECT ?X WHERE {
?X rdf:type ub:GraduateStudent .
?X ub:takesCourse <http://www.Department0.University0.edu/GraduateCourse0>
}

query a fuseki server using python (or something)

I'm trying to issue a complicated query, through a browser, against a Fuseki server that I'm running locally, but it keeps crashing. Is it possible to do it through a Python script? If so, how?
You can use any suitable command line tool, for example curl:
curl http://localhost:3030/your_service/sparql --data 'query=ASK { ?s ?p ?o . }'
If you want to use Python specifically, you can use SPARQLWrapper, or just the Requests package.
Example using Requests:
import requests
response = requests.post('http://localhost:3030/your_service/sparql',
data={'query': 'ASK { ?s ?p ?o . }'})
print(response.json())
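If you prefer SPARQLWrapper, a roughly equivalent sketch against the same local endpoint would be:
from SPARQLWrapper import SPARQLWrapper, JSON

# Same local Fuseki service as in the Requests example above.
sparql = SPARQLWrapper('http://localhost:3030/your_service/sparql')
sparql.setQuery('ASK { ?s ?p ?o . }')
sparql.setReturnFormat(JSON)
print(sparql.query().convert())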
./s-query --service=http://localhost:3030/myDataset/query --query=/home/matthias/EIS/EDSA/27/18.05/queryFile.rq
The above command can also work.
Follow the ideas from the SOH - SPARQL over HTTP page, i.e.
SOH SPARQL Query
s-query --service=endpointURL 'query string'
s-query --service=endpointURL --query=queryFile.rq

Generate a Properties File using Shell Script and Results from a SQL Query

I am trying to create a properties file like this...
firstname=Jon
lastname=Snow
occupation=Nights_Watch
family=Stark
...from a query like this...
SELECT
a.fname as firstname,
a.lname as lastname,
b.occ as occupation...
FROM
names a,
occupation b,
family c...
WHERE...
How can I do this? I am only aware of using spool to a CSV file, which won't work here.
These property files will be picked up by shell scripts to run automated tasks. I am using Oracle DB.
Perhaps something like this?
psql -c 'select id, name from test where id = 1' -x -t -A -F = dbname -U dbuser
Output would be like:
id=1
name=test1
(For the full list of options: man psql.)
Since you mentioned spool, I will assume you are running on Oracle. This should produce a result in the desired format, which you can spool straight away.
SELECT
'firstname=' || firstname || CHR(10) ||
'lastname=' || lastname || CHR(10) -- and so on for all fields
FROM your_tables;
The same approach should be possible with all database engines, if you know the correct incantation for a literal new line and the syntax for string concatenation.
It is possible to do this from your command-line SQL client, but as STTLCU notes it might be better to have the query output something "standard" (like CSV) and then transform the results with a shell script. Otherwise, because a lot of the features you would use are not part of any SQL standard, they would depend on the database server and client application. Think of this step as sort of the obverse of ETL, where you clean up the data you "unload" so that it is useful for some other application.
There are certainly ways to build this into your query application: e.g. if you use something like Perl's DBI::Shell as your client (which allows you to connect to many different servers using the DBI module) you can jazz up your output in various ways. But here you'd probably be best off if you could send the query output to a text file and run it through awk (or a small script like the sketch below).
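For instance, a small Python sketch of that post-processing step (standing in for awk; the file names query_output.csv and app.properties, and the presence of a header row, are assumptions) could look like this:
import csv

# Rewrite each CSV row produced by the query as key=value lines.
# 'query_output.csv' and 'app.properties' are hypothetical file names.
with open('query_output.csv', newline='') as src, open('app.properties', 'w') as dst:
    reader = csv.DictReader(src)
    for row in reader:
        for key, value in row.items():
            dst.write(f'{key}={value}\n')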
Having said that ... here's how the Postgresql client could do what you want. Notice how the commands to set up the formatting are not SQL but specific to the client.
~/% psql -h 192.168.2.69 -d cropdusting -u stubblejumper
psql (9.2.4, server 8.4.14)
WARNING: psql version 9.2, server version 8.4.
Some psql features might not work.
You are now connected to database "cropdusting" as user "stubblejumper".
cropdusting=# \pset border 0 \pset format unaligned \pset t \pset fieldsep =
Border style is 0.
Output format is unaligned.
Showing only tuples.
Field separator is "=".
cropdusting=# select year,wmean_yld from bckwht where year=1997 AND freq > 13 ;
1997=19.9761904762
1997=14.5533333333
1997=17.9942857143
cropdusting=#
With the psql client the \pset command sets options affecting the output of query results tables. You can probably figure out which option is doing what. If you want to do this using your SQL client tell us which one it is or read through the manual page for tips on how to format the output of your queries.
My answer is very similar to the two already posted for this question, but I try to explain the options, and try to provide a precise answer.
When using Postgres, you can use the psql command-line utility to get the intended output:
psql -F = -A -x -X <other options> -c 'select a.fname as firstname, a.lname as lastname from names as a ... ;'
The options are:
-F : Use '=' sign as the field separator, instead of the default pipe '|'
-A : Do not align the output; so there is no space between the column header, separator and the column value.
-x : Use expanded output, so column headers are on left (instead of top) and row values are on right.
-X : Do not read $HOME/.psqlrc, as it may contain commands/options that can affect your output.
-c : The SQL command to execute
<other options> : Any other options, such as connection details, database name, etc.
You have to choose whether you want to maintain such a file from the shell or from PL/SQL. Both solutions are possible and both are correct.
Because Oracle has to both read from and write to the file, I would do it from the database side.
You can write data to the file using the UTL_FILE package:
DECLARE
fileHandler UTL_FILE.FILE_TYPE;
BEGIN
fileHandler := UTL_FILE.FOPEN('test_dir', 'test_file.txt', 'W');
UTL_FILE.PUTF(fileHandler, 'firstname=Jon\n');
UTL_FILE.PUTF(fileHandler, 'lastname=Snow\n');
UTL_FILE.PUTF(fileHandler, 'occupation=Nights_Watch\n');
UTL_FILE.PUTF(fileHandler, 'family=Stark\n');
UTL_FILE.FCLOSE(fileHandler);
EXCEPTION
WHEN utl_file.invalid_path THEN
raise_application_error(-20000, 'ERROR: Invalid PATH FOR file.');
END;
Example's source: http://psoug.org/snippet/Oracle-PL-SQL-UTL_FILE-file-write-to-file-example_538.htm
At the same time, you can read from the file using an Oracle external table:
CREATE TABLE parameters_table
(
parameters_coupled VARCHAR2(4000)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY test_dir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
FIELDS
(
parameters_coupled VARCHAR2(4000)
)
)
LOCATION ('test_file.txt')
);
At this point you can write data to your table, which has one column holding the coupled parameter and value, e.g. 'firstname=Jon'.
You can read it from Oracle.
You can read it from any shell script, because it is plain text (see the sketch after the query below).
Then it is just a matter of a query, e.g.:
SELECT MAX(CASE WHEN INSTR(parameters_coupled, 'firstname=') = 1 THEN REPLACE(parameters_coupled, 'firstname=') ELSE NULL END) AS firstname
, MAX(CASE WHEN INSTR(parameters_coupled, 'lastname=') = 1 THEN REPLACE(parameters_coupled, 'lastname=') ELSE NULL END) AS lastname
, MAX(CASE WHEN INSTR(parameters_coupled, 'occupation=') = 1 THEN REPLACE(parameters_coupled, 'occupation=') ELSE NULL END) AS occupation
FROM parameters_table;
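And from the script side no database is needed at all; a minimal Python sketch that parses the same file (assuming it is reachable as test_file.txt from wherever the script runs) might be:
# Parse the key=value pairs written by the UTL_FILE block above;
# 'test_file.txt' is assumed to be readable from the local filesystem.
params = {}
with open('test_file.txt') as f:
    for line in f:
        line = line.strip()
        if line and '=' in line:
            key, _, value = line.partition('=')
            params[key] = value
print(params.get('firstname'))  # e.g. 'Jon'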

How do I set one variable equal to another in Pig Latin

I would like to do
register s3n://uw-cse344-code/myudfs.jar
-- load the test file into Pig
--raw = LOAD 's3n://uw-cse344-test/cse344-test-file' USING TextLoader as (line:chararray);
-- later you will load to other files, example:
raw = LOAD 's3n://uw-cse344/btc-2010-chunk-000' USING TextLoader as (line:chararray);
-- parse each line into ntriples
ntriples = foreach raw generate FLATTEN(myudfs.RDFSplit3(line)) as (subject:chararray,predicate:chararray,object:chararray);
--filter 1
subjects1 = filter ntriples by subject matches '.*rdfabout\\.com.*' PARALLEL 50;
--filter 2
subjects2 = subjects1;
but I get the error:
2012-03-10 01:19:18,039 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: mismatched input ';' expecting LEFT_PAREN
Details at logfile: /home/hadoop/pig_1331342327467.log
So it seems Pig doesn't like that. How do I accomplish this?
I don't think that kind of 'typical' assignment works in Pig. It's not really a programming language in the strict sense - it's a high-level language on top of Hadoop with some specialized functions.
I think you'll need to simply re-project the data from subjects1 to subjects2, such as:
subjects2 = foreach subjects1 generate $0, $1, $2;
Another approach might be to use the LIMIT operator with some absurdly high parameter:
subjects2 = LIMIT subjects1 100000000;
There could be a lot of reasons why that doesn't make sense, but it's a thought.
I sense you are considering doing things as you would in a programming language. I have found that rarely works out the way you want, but you can always get the job done once you think like Pig.
As I understand it, your example is from the Data Science Coursera course.
It's strange, but I ran into the same problem: this code works on one amount of data and not on another.
Because we need to change the parameters, I used this code:
filtered2 = foreach filtered generate subject as subject2, predicate as predicate2, object as object2;