Confluent schema-registry: how to HTTP POST a json-schema

Confluent 5.5.0 understands not just Avro schemas, but also json-schema and protobuf. I have a valid json-schema that I'm trying to curl to the schema registry server, but I keep getting the response
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data @$tmpfile ${schemaregistry}/subjects/${topic}-value/versions?schemaType=JSONSCHEMA
{"error_code":42201,"message":"Either the input schema or one its references is invalid"}
The manual is unclear about how to use the schemaType parameter. I've tried as a query parameter, as a field in the json, ...
The $tmpfile I'm posting is a json with one top-level field named schema that contains a quote-escaped json-schema. The same mechanism works perfectly for Avro schemas.
Looking in the logging from the schema registry, I see that it tries to parse the provided data as an Avro schema, so no wonder it fails.
Any help? And Confluent: please clarify and fix your documentation!

Ah I got it. The documentation is unclear and wrong!
You have to add a field inside the posted json. The field name is schemaType, and its value must be JSON, and not JSONSCHEMA (what the documentation says).
For others here's an example that shows how to put local files with an avro and json schema into the schema-registry:
#!/bin/bash
# Base URL of the schema registry, passed as the first argument
schemaregistry="$1"
tmpfile=$(mktemp)
# Avro schema: no schemaType field is needed
topic=avro-topic
export SCHEMA=$(cat schema.avsc)
# jq stores the file content as a properly escaped JSON string
echo '{"schema":""}' | jq --arg schema "$SCHEMA" '.schema = $schema' \
> "$tmpfile"
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data @"$tmpfile" "${schemaregistry}/subjects/${topic}-value/versions"
# JSON schema: schemaType must be set to JSON inside the posted document
topic=json-topic
export SCHEMA=$(cat schema.json)
echo '{"schema":"","schemaType":"JSON"}' | jq --arg schema "$SCHEMA" '.schema = $schema' \
> "$tmpfile"
curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
--data @"$tmpfile" "${schemaregistry}/subjects/${topic}-value/versions"
rm "$tmpfile"

According to the Confluent schema registry API documentation one can check the supported schema types by calling
curl --silent -X GET http://localhost:8081/schemas/types
This should result in
["JSON","PROTOBUF","AVRO"]
for the current version (5.5) so the output might help to set a proper schemaType-attribute.
Nonetheless, as @bart van deenen pointed out, the API doc is (still) wrong.
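The check described above can be scripted before choosing a schemaType value. The following is only a sketch: the function names are made up for illustration, and it assumes the registry answers /schemas/types with a flat JSON array of strings, as in the output shown above.

```shell
# Sketch: decide whether a schemaType value is supported by the registry.
# supports_type and registry_supports are hypothetical helper names.

# Succeeds if the JSON array in $1 contains the quoted string $2.
supports_type() {
  printf '%s' "$1" | grep -q "\"$2\""
}

# Fetches the list from the registry base URL in $1 and tests type $2.
registry_supports() {
  supports_type "$(curl --silent -X GET "$1/schemas/types")" "$2"
}

# Usage (needs a running registry):
#   registry_supports http://localhost:8081 JSON && echo "use schemaType JSON"
```

Because the match includes the surrounding quotes, JSONSCHEMA is not mistaken for JSON.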


How to Update the Input File of an OntoRefine project in GraphDB

I'm trying to script RDF/OWL data (re)loading into a GraphDB store, and I wonder how to re-process a CSV file through the OntoRefine component, keeping the column modifications and the RDF mapping, using only the REST API.
One way to script this is by using the rdf-mapper REST API, which accepts a column mapping file and a tabular file and streams the result to an input location file.
This file can afterwards be imported into GraphDB by using the import server file REST API (for which more information can be found here https://graphdb.ontotext.com/free/devhub/workbench-rest-api/curl-commands.html#data-import ).
Please keep in mind that when starting GraphDB, you need to specify the directory from which you plan to import the RDF file by using this property:
-Dgraphdb.workbench.importDirectory=/import/location/
Here is a small example script showing how to import a CSV file as an RDF .ttl document using cURL:
curl -X POST -sL \
--url "http://address:port/rest/rdf-mapper/rdf/stream:csv:separator={CSV-SEPERATOR}" \
-F mapping=@mapping.json \
-F data=@import_file.csv \
-H 'accept: text/turtle' \
-o export_file.ttl
curl -X POST --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
"fileNames": [
"export_file.ttl"
],
"importSettings": {
"baseURI": "",
"context": "",
"parserSettings": {
"failOnUnknownDataTypes": true,
"failOnUnknownLanguageTags": true,
"normalizeDataTypeValues": true,
"normalizeLanguageTags": true,
"preserveBNodeIds": true,
"stopOnError": true,
"verifyDataTypeValues": true,
"verifyLanguageTags": true
}
}
}' 'http://address:port/rest/data/import/server/{REPOSITORY-NAME}'
P.S. Here is how to create the needed mapping.json by using GraphDB Workbench and following these steps:
Go to Ontorefine -> Select and Import a tabular file -> Select Create Project -> Select RDF Mapping / Edit RDF Mapping -> Then a new window opens where you can configure the said mapping -> After configuring the mapping select "Download JSON" . The downloaded JSON mapping can be used then with the example provided above.
For more information take a look at https://graphdb.ontotext.com/free/loading-data-using-ontorefine.html?highlight=mapping

GitLab CI variables in job api?

I am using the REST API to run manual jobs in GitLab CI. When I start a manual job from the UI, I can define custom variables that I can use during the job. How can I define them when running the job through the API?
I could not find any documentation on it, nor a single question in the forums.
This is how I currently run my job:
curl -k --request POST --header "PRIVATE-TOKEN: abc" https://mygit.com/api/v4/projects/17/jobs/1956/play
I tried adding:
--form variables[TEST]=hello
But this didn't work.
Edit:
A bit more information on what I'm doing. My pipeline has two stages, build and deploy. On each commit I want build to run once, and then I want to be able to deploy the result to multiple different servers. Because the server list is dynamic and long, I want to pass the server's IP address as a variable to my deploy job.
Instead of starting a job you can start a pipeline and set the variables from there. Here's an example of how to do this from the GitLab documentation:
curl --request POST --header "PRIVATE-TOKEN: <your_access_token>" \
--header "Content-Type: application/json" \
--data '{ "ref": "master", "variables": [ {"key": "VAR1", "value": "hello"}, {"key": "VAR2", "value": "world"} ] }' \
"https://gitlab.example.com/api/v4/projects/169/pipeline"
This is how I'm using it; I didn't find a way to use API tokens for it, though.
curl -X POST \
-F token=xxxxxxxxxxxxxxxx \
-F "ref=some_branch" \
-F "variables[VAR1]=abc" \
-F "variables[VAR2]=cde" \
"https://example.gitlab.com/api/v4/projects/312/trigger/pipeline"
Here VAR1, passed via -F "variables[VAR1]=abc", is the variable that .gitlab-ci.yml checks for:
only:
  variables:
    - $VAR1
The idea was to create some manual CI jobs and tell the devs they can run them via API call, but since I can only use the project token here, it's absolutely not secure.
It would be really handy to run it via
curl --request PUT --header "PRIVATE-TOKEN: <your_access_token>"
Passing variables is discussed in gitlab-org/gitlab issue 2772, but that is about triggering a pipeline (not a job).
See if this syntax works for trigger variables (variables[xxx]=yyy):
# gitlab-ci.yml
build:
  script:
    - curl --request POST --form "variables[PRE_CI_PIPELINE_SOURCE]=$CI_PIPELINE_SOURCE" --form "token=$CI_JOB_TOKEN" --form ref=master http://192.168.10.3:3001/api/v4/projects/13/trigger/pipeline
Or simply for regular variables --form key=value:
curl --request POST --form "token=$CI_JOB_TOKEN" --form ref=master https://gitlab.example.com/api/v4/projects/9/trigger/pipeline
It looks like, as of Jan 25, 2021, this feature is not yet supported. There is a feature request here: https://gitlab.com/gitlab-org/gitlab/-/issues/37267
Update 2022-03:
After you create a trigger token and add a trigger_pipeline step to the pipeline, like this:
trigger_pipeline:
  tags:
  image: alpine:latest
  stage: deploy
  script:
  only:
    variables:
      - $MANUAL
you can trigger pipelines with any tool that can access the API:
curl --request POST \
--form token=TOKEN \
--form ref=main \
--form "variables[MANUAL]=true" \
"https://gitlab.example.com/api/v4/projects/123456/trigger/pipeline"
or a webhook:
https://gitlab.example.com/api/v4/projects/123456/ref/<ref_name>/trigger/pipeline?token=<token>
for example for manual run.
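For the use case in the question (one deploy per server, with the IP address passed in), the trigger call can be wrapped in a small function. This is a sketch under assumptions: DEPLOY_IP and both function names are hypothetical, and the URL, token, and ref are placeholders in the style of the examples above.

```shell
# Sketch: trigger one pipeline with extra variables via the trigger endpoint.
# DEPLOY_IP, form_field and trigger_pipeline are hypothetical names.

# Turns KEY=VALUE into the form field variables[KEY]=VALUE.
form_field() {
  # Split on the first "=" so the value may itself contain "=".
  printf 'variables[%s]=%s' "${1%%=*}" "${1#*=}"
}

# trigger_pipeline URL TOKEN REF [KEY=VALUE ...]
trigger_pipeline() {
  url="$1" token="$2" ref="$3"
  shift 3
  # Replace each KEY=VALUE argument by a --form option for curl.
  for pair in "$@"; do
    set -- "$@" --form "$(form_field "$pair")"
    shift
  done
  curl --request POST --form "token=${token}" --form "ref=${ref}" "$@" "$url"
}

# Usage (one pipeline per server):
#   for ip in 10.0.0.5 10.0.0.6; do
#     trigger_pipeline "https://gitlab.example.com/api/v4/projects/123456/trigger/pipeline" \
#       TOKEN main "DEPLOY_IP=${ip}"
#   done
```

The deploy job would then read the value as $DEPLOY_IP, the same way $MANUAL is read in the example above.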

How to use markdown for description from a file in gitlab CI using release API

I'm using the Gitlab release API in the gitlab-ci.yml to be able to automatically create a new release when deploying.
Simply putting a curl request like here in the docs works just fine. For the description, the docs state that markdown is allowed, which is great. However, I can't seem to figure out or come up with an idea to load a description from a markdown file within the curl request. I've already tried storing the content of the markdown file in a variable in the gitlab-ci.yml prior to the curl and then pass it and expand it within the curl like so:
# gitlab-ci.yml
...
- DESCRIPTION=`cat ./description.md`
and also to just put the cat ./description.md in the curl request itself as the value of "description".
Here is the example from the docs:
curl --header 'Content-Type: application/json' --header "PRIVATE-TOKEN: gDybLx3yrUK_HLp3qPjS" \
--data '{ "name": "New release", "tag_name": "v0.3", "description": "Super nice release", "milestones": ["v1.0", "v1.0-rc"], "assets": { "links": [{ "name": "hoge", "url": "https://google.com" }] } }' \
--request POST https://gitlab.example.com/api/v4/projects/24/releases
And for the "description" key I would like to pass the contents of a markdown file as the value.
I was surprised to not have found a post or discussion about this already, so I suspect I'm either missing something (very basic/obvious) or folks don't really use this function (yet)?
Any help will be much appreciated.
Using a variable as you did, this .gitlab-ci.yml works:
create_release:
  script:
    - DESCRIPTION=$(cat description.md)
    - |
      curl --silent --request POST --header "Content-Type:application/json" \
      --header "PRIVATE-TOKEN: TOKEN" \
      --data '{"name":"New release","tag_name":"v0.3", "description":"'"$DESCRIPTION"'","assets":{"links":[{"name":"hoge","url":"https://google.com"}]}}' \
      https://gitlab.bankassembly.com/api/v4/projects/369/releases
The variable is expanded inside double quotes (see https://superuser.com/a/835589).
Example of the content of my description.md:
## CHANGELOG\r\n\r\n- Escape label and milestone titles to prevent XSS in GFM autocomplete. !2740\r\n- Prevent private snippets from being embeddable.\r\n- Add subresources removal to member destroy service.
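The description.md shown above already contains literal \r\n sequences, because its content is spliced directly into a JSON string. If the file is ordinary multi-line markdown instead, it has to be escaped first. The following is a minimal sketch, not a complete JSON encoder: the function name is hypothetical, it emits \n rather than \r\n, and it handles only backslashes, double quotes, and line breaks (tabs and other control characters are left as they are).

```shell
# Sketch: turn a multi-line markdown file into the body of a JSON string.
# json_escape_file is a hypothetical helper name.
json_escape_file() {
  # Escape backslashes first, then double quotes; drop carriage returns.
  sed -e 's/\\/\\\\/g' -e 's/"/\\"/g' "$1" | tr -d '\r' |
    while IFS= read -r line || [ -n "$line" ]; do
      # Print each line followed by the two characters backslash and n.
      printf '%s\\n' "$line"
    done
}

# Usage inside the job script:
#   DESCRIPTION=$(json_escape_file description.md)
#   ...then splice "$DESCRIPTION" into the --data payload as in the job above.
```

A dedicated JSON tool is likely to be more robust for unusual input; this version only avoids an extra dependency.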

Where do I find the project ID for the GitLab API?

I use GitLab on their servers. I would like to download my latest built artifacts (build via GitLab CI) via the API like this:
curl --header "PRIVATE-TOKEN: 9koXpg98eAheJpvBs5tK" "https://gitlab.com/api/v3/projects/1/builds/8/artifacts"
Where do I find this project ID? Or is this way of using the API not intended for hosted GitLab projects?
I just found out an even easier way to get the project id: just see the HTML content of the gitlab page hosting your project. There is an input with a field called project_id, e.g:
<input type="hidden" name="project_id" id="project_id" value="335" />
The latest version of GitLab 11.4 at the time of this writing now puts the Project ID at the top of the frontpage of your repository.
On the Edit Project page there is a Project ID field in the top right corner.
(You can also see the ID on the CI/CD pipelines page, in the example code of the Triggers section.)
In older versions, you can see it on the Triggers page, in the URLs of the example code.
You can query for your owned projects:
curl -XGET --header "PRIVATE-TOKEN: XXXX" "https://gitlab.com/api/v4/projects?owned=true"
You will receive JSON with each owned project:
[
{
"id":48,
"description":"",
"default_branch":"master",
"tag_list":[
...
You are also able to get the project ID from the triggers configuration in your project which already has some sample code with your ID.
From the Triggers page:
curl -X POST \
-F token=TOKEN \
-F ref=REF_NAME \
https://<GitLab Installation>/api/v3/projects/<ProjectID>/trigger/builds
As mentioned here, all the project scoped APIs expect either an ID or the project path (URL encoded).
So just use https://gitlab.com/api/v4/projects/gitlab-org%2Fgitlab-foss directly when you want to interact with a project.
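Building such a URL from a plain project path can be sketched as follows. The function names are hypothetical, and the sketch assumes the path needs no escaping other than the slashes between namespace and project name.

```shell
# Sketch: build a project API URL from a namespace/project path.
# encode_project_path and project_url are hypothetical helper names.

# Replaces every "/" in the path with its URL-encoded form %2F.
encode_project_path() {
  printf '%s' "$1" | sed 's|/|%2F|g'
}

# project_url BASE_URL PATH, e.g. project_url https://gitlab.com group/project
project_url() {
  printf '%s/api/v4/projects/%s' "$1" "$(encode_project_path "$2")"
}

# Usage:
#   curl --header "PRIVATE-TOKEN: <your_token>" \
#     "$(project_url https://gitlab.com gitlab-org/gitlab-foss)"
```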
Enter the project.
On the Left Hand menu click Settings -> General -> Expand General Settings
It has a label Project ID and is next to the project name.
This is on version GitLab 10.2
Here is a solution that actually uses the API to get the project ID of a specific GitLab project:
curl -XGET -H "Content-Type: application/json" --header "PRIVATE-TOKEN: $GITLAB_TOKEN" "http://<YOUR-GITLAB-SERVER>/api/v3/projects/<YOUR-NAMESPACE>%2F<YOUR-PROJECT-NAME>" | python -mjson.tool
Or maybe you just want the project ID:
curl -XGET -H "Content-Type: application/json" --header "PRIVATE-TOKEN: $GITLAB_TOKEN" "http://<YOUR-GITLAB-SERVER>/api/v3/projects/<YOUR-NAMESPACE>%2F<YOUR-PROJECT-NAME>" | python -c 'import sys, json; print(json.load(sys.stdin)["id"])'
Note that the project path (namespace/project name) is URL-encoded.
If you know your project name, you can get the project id by using the following API:
curl --header "Private-Token: <your_token>" -X GET "https://gitlab.com/api/v4/projects?search=<exact_project_name>"
This will return a JSON that includes the id:
[
{
"id":<project id>, ...
}
]
Just for the record, if someone else has the need to download artifacts from gitlab.com created via gitlab-ci
1. Create a private token within your browser
2. Get the project id via curl -XGET --header "PRIVATE-TOKEN: YOUR_AD_HERE?" "https://gitlab.com/api/v3/projects/owned"
3. Download the last artifact from your master branch created via a gitlab-ci step called release: curl -XGET --header "PRIVATE-TOKEN: YOUR_AD_HERE?" -o myapp.jar "https://gitlab.com/api/v3/projects/4711/builds/artifacts/master/download?job=release"
I am very impressed about the beauty of gitlab.
You can view it under the repository name
You can query projects with search attribute e.g:
http://gitlab.com/api/v3/projects?private_token=xxx&search=myprojectname
As of Gitlab API v4, the following API returns all projects that you own:
curl --header 'PRIVATE-TOKEN: <your_token>' 'https://gitlab.com/api/v4/projects?owned=true'
The response contains the project ID. GitLab access tokens can be created on this page: https://gitlab.com/profile/personal_access_tokens
None of the answers covers the general case; the closest one works only for gitlab.com, not for self-hosted instances. The following finds the ID of the project streamer on the GitLab server my-server.com, for example:
$ curl --silent --header 'Authorization: Bearer MY-TOKEN-XXXX' \
'https://my-server.com/api/v4/projects?per_page=100&simple=true'| \
jq -rc '.[]|select(.name|ascii_downcase|startswith("streamer"))'| \
jq .id
168
Note that
this gives only the first 100 projects; if you have more, request the following pages (&page=2, 3, ...) or use a different API (e.g. groups/:id/projects).
jq is quite flexible. Here we're just filtering for one project, but you can do much more with it.
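Requesting the following pages can be sketched as a loop. The function names are hypothetical, and the sketch assumes that a page past the end comes back as an empty JSON array, so the loop stops on an empty array or on anything that is not an array.

```shell
# Sketch: walk through all pages of the project list.
# page_url and list_all_projects are hypothetical helper names.

# page_url BASE_URL PAGE builds the URL for one page of 100 projects.
page_url() {
  printf '%s/api/v4/projects?per_page=100&simple=true&page=%s' "$1" "$2"
}

# list_all_projects BASE_URL TOKEN prints one JSON array per page.
list_all_projects() {
  page=1
  while :; do
    body=$(curl --silent --header "Authorization: Bearer $2" "$(page_url "$1" "$page")")
    case "$body" in
      ''|'[]') break ;;                 # empty response or empty page: done
      '['*) printf '%s\n' "$body" ;;    # a JSON array: one page of projects
      *) break ;;                       # not an array, e.g. an error object
    esac
    page=$((page + 1))
  done
}

# Usage:
#   list_all_projects https://my-server.com MY-TOKEN-XXXX | jq '.[] | .id'
```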
There appears to be no way to retrieve only the Project ID using the gitlab api. Instead, retrieve all the owner's projects and loop through them until you find the matching project, then return the ID. I wrote a script to get the project ID:
#!/bin/bash
# Prints the ID of the owned project namespace/projectName, or an error.
get_project_id() {
  local projectName="$1"
  local namespace="$2"
  local default
  default=$(sudo cat .namespace)
  namespace="${namespace:-$default}"
  local json
  json=$(curl --header "PRIVATE-TOKEN: $(sudo cat .token)" -X GET \
    'https://gitlab.com/api/v4/projects?owned=true' 2>/dev/null)
  local id=0 idMatch=0 pathWithNamespaceMatch=0
  local rowToMatch="\"$(echo "$namespace/$projectName" | tr '[:upper:]' '[:lower:]')\","
  # jq pretty-prints the JSON, so keys and values arrive as separate words
  for row in $(echo "${json}" | jq -r '.'); do
    # The word after "id": is the value; strip its trailing comma
    [[ $idMatch -eq 1 ]] && { idMatch=0; id=${row::-1}; }
    [[ $pathWithNamespaceMatch -eq 1 ]] && { pathWithNamespaceMatch=0; [[ "$row" == "$rowToMatch" ]] && { echo "$id"; return 0; } }
    [[ ${row} == "\"path_with_namespace\":" ]] && pathWithNamespaceMatch=1
    [[ ${row} == "\"id\":" ]] && idMatch=1
  done
  echo 'Error! Could not retrieve projectID.'
  return 1
}
get_project_id "$@"
It expects the default namespace to be stored in a file .namespace and the private token to be stored in a file .token. For increased security, it's best to run chmod 000 .token; chmod 000 .namespace; chown root .namespace; chown root .token
If your project name is unique, it is handy to follow the answer by shunya and search by name; refer to the API doc.
If your access token has broader permissions and the GitLab instance contains several projects with the same name in different groups, then searching within a group is more convenient (see the API doc), e.g.
curl --header "PRIVATE-TOKEN: <token>" -X GET "https://gitlab.com/api/v4/groups/<group_id>/search?scope=projects&search=<project_name>"
The group ID can be found on the Settings page of the group.
To fetch the project ID from the output, you can do:
curl --header "PRIVATE-TOKEN: <token>" -X GET "https://gitlab.com/api/v4/groups/<group_id>/search?scope=projects&search=<project_name>" | jq '.[0].id'
To get id from all projects, use:
curl --header 'PRIVATE-TOKEN: XXXXXXXXXXXXXXXXXXXXXXX' 'https://gitlab.com/api/v4/projects?owned=true' > curloutput
grep -oPz 'name\":\".*?\"|{\"id\":[0-9]+' curloutput | sed 's/{\"/\n/g' | sed 's/name//g' |sed 's/id\"://g' |sed 's/\"//g' | sort -u -n
Not specific to the question, but I somehow ended up here, and this might help others.
I used Chrome to get a project ID:
Go to the desired project, for example gitlab.com/username/project1
Inspect the Network tab
Look at the first graphql request in the Network tab
You can search for the project path
curl -s 'https://gitlab.com/api/v4/projects?search=my/path/to/my/project&search_namespaces=true' --header "PRIVATE-TOKEN: $GITLAB_TOKEN" |python -mjson.tool |grep \"id\"
https://docs.gitlab.com/ee/api/projects.html
This matches only your project and does not return other, unrelated projects.
My favorite method is to pull it from the CI/CD pipeline, so that the project ID is assigned dynamically at build time.
Simply set a variable in your code to the predefined CI variable CI_PROJECT_ID.
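Inside a job, this could look like the sketch below. CI_PROJECT_ID and CI_API_V4_URL are predefined GitLab CI variables; the function name is hypothetical, and GITLAB_TOKEN in the usage line is assumed to be a CI/CD variable you have defined yourself.

```shell
# Sketch: build the API URL of the current project inside a CI job.
# project_api_url is a hypothetical helper name.
project_api_url() {
  # CI_API_V4_URL is the API root; CI_PROJECT_ID is the numeric project ID.
  printf '%s/projects/%s' "$CI_API_V4_URL" "$CI_PROJECT_ID"
}

# Usage in a job script:
#   curl --header "PRIVATE-TOKEN: $GITLAB_TOKEN" "$(project_api_url)"
```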

How to remove meta-data from jql query?

I have the following curl command:
curl -k -u sandboxer:sandboxer -D- -X POST -H "Content-Type: application/json" --data \
'{"jql":"project = BNAP AND resolution = null AND status != Resolved AND status != Rejected",
"maxResults":2 ,
"fields":["KEY","versions","description","status","resolution"]}' \
https://127.0.0.1/rest/api/latest/search 1> newtest
I receive a JSON object with meta-data,
{"expand":"schema,names"
,"startAt":0
,"maxResults":2
,"total":74
...
All I am interested in is what follows: issues. I could take care of this in my application, but I am wondering if there is a way I could just tell JIRA 'Don't send me meta-data'. Is there?
Use the Speakeasy plugin to create a custom request/response as a Jira extension:
Speakeasy plugin
Installing Speakeasy
Developing Speakeasy Extensions