Scrapyd: How to retrieve spiders or version of a scrapyd project? - scrapy

It appears that either the scrapyd documentation is wrong or there is a bug. I want to retrieve the list of spiders from a deployed project. The docs tell me to do it this way:
curl http://localhost:6800/listspiders.json?project=myproject
So in my environment it looks like this:
merlin@192-143-0-9 spider2 % curl http://localhost:6800/listspiders.json?project=crawler
zsh: no matches found: http://localhost:6800/listspiders.json?project=crawler
So the command does not seem to be recognised. Let's check the project availability:
merlin@192-143-0-9 spider2 % curl http://localhost:6800/listprojects.json
{"node_name": "192-143-0-9.ip.airmobile.co.za", "status": "ok", "projects": ["crawler"]}
Looks OK to me.
Checking the docs again, the other API calls take a parameter not as a GET but in a different way:
curl http://localhost:6800/schedule.json -d project=myproject -d spider=somespider
Applying this to listspiders:
merlin@192-143-0-9 spider2 % curl http://localhost:6800/listspiders.json -d project=crawler
{"node_name": "192-143-0-9.ip.airmobile.co.za", "status": "error", "message": "Expected one of [b'HEAD', b'object', b'GET']"}
So this endpoint only accepts GET, and passing the parameter with -d does not work. It looks like we are running in circles.
How can one retrieve the list of spiders or versions (listversions.json) with scrapyd?

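The URL needs to be wrapped in quotes. zsh treats an unquoted ? as a glob character, finds no matching file, and aborts with "no matches found" before curl ever runs. Try: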
curl "http://localhost:6800/listspiders.json?project=crawler"
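For scripted use, the same GET request can be built without a shell in between. The following is a minimal Python sketch, assuming the scrapyd instance from the question on localhost:6800 and the usual "spiders" key in the JSON response; the helper names are made up for illustration.

```python
import json
import shlex
from urllib.parse import urlencode
from urllib.request import urlopen


def listspiders_url(base_url, project):
    """Build the listspiders.json URL with the project as a GET query parameter."""
    return f"{base_url.rstrip('/')}/listspiders.json?{urlencode({'project': project})}"


def list_spiders(base_url, project):
    """Fetch the spider list; needs a running scrapyd, so it is not called here."""
    with urlopen(listspiders_url(base_url, project)) as response:
        return json.load(response).get("spiders", [])


url = listspiders_url("http://localhost:6800", "crawler")
print(url)
# shlex.quote gives a form of the URL that is safe to paste into a shell.
print("curl " + shlex.quote(url))
```

The last line prints the curl command with the URL in single quotes, which is equivalent to the double-quoted form above for this URL.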

Related

Uploading local file to GraphDB

I tried the following curl command to upload a local Turtle file to GraphDB (free version), running at http://localhost:7200/.
curl -X POST 'http://localhost:7200/repositories/testrepository1/statements' -H "Content-Type:text/turtle" -T "/home/Desktop/onto.ttl"
Even though this curl command doesn't return any error when executed, the local file onto.ttl is not uploaded to testrepository1.
I am using GraphDB Free, version 9.7.0.
I would be grateful if someone could help me with this. Thanks in advance!
You are using the wrong MIME type: it should be application/x-turtle instead of text/turtle. The request will look something like this:
curl -X POST -H "Content-Type:application/x-turtle" -T "/home/Desktop/onto.ttl" http://localhost:7200/repositories/testrepository1/statements
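As a rough illustration of what that request carries, here is a Python sketch that builds the same POST with an explicit Content-Type header. The function names and the sample triple are assumptions for the example; the URL and MIME type are the ones from the answer.

```python
from urllib.request import Request, urlopen


def build_turtle_upload(statements_url, turtle_bytes, mime_type="application/x-turtle"):
    """Build a POST request carrying Turtle data with an explicit Content-Type."""
    return Request(
        statements_url,
        data=turtle_bytes,
        headers={"Content-Type": mime_type},
        method="POST",
    )


def upload_turtle_file(statements_url, path):
    """Read the file and send it; needs a running GraphDB, so it is not called here."""
    with open(path, "rb") as handle:
        request = build_turtle_upload(statements_url, handle.read())
    with urlopen(request) as response:
        return response.status


# Made-up one-triple document, only used to inspect the request offline.
request = build_turtle_upload(
    "http://localhost:7200/repositories/testrepository1/statements",
    b"<http://example.org/a> <http://example.org/b> <http://example.org/c> .\n",
)
print(request.get_method(), request.get_header("Content-type"))
```

Checking the returned status code in upload_turtle_file is one way to notice a rejected upload that curl would otherwise pass over silently.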

SauceLabs Pass/Fail using behat

I am trying to set the Pass/Fail status in Sauce Labs whenever I run an automated test, but I can't figure out how to do it. I use Behat with the Selenium driver. I read the documentation but it didn't help me.
I tried to follow the Sauce Labs REST API guide and ran the following in my console:
curl -X PUT \
-s -d '{"passed":true}' \
-u https://USERNAME:APIKEY@saucelabs.com/rest/v1/users/USERNAME
But it doesn't work.
I think you need the session Id
ownCloud uses:
curl -X PUT -s -d "{\"passed\": $PASSED}" -u $SAUCE_USERNAME:$SAUCE_ACCESS_KEY https://saucelabs.com/rest/v1/$SAUCE_USERNAME/jobs/$SAUCELABS_SESSIONID
See: https://github.com/owncloud/core/blob/master/tests/travis/start_ui_tests.sh#L235
The session ID is pulled from the URL: https://github.com/owncloud/core/blob/master/tests/ui/features/bootstrap/FeatureContext.php#L171
There might be better ways of getting it, though.
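The shape of that request can be sketched in Python as follows. The URL layout and the {"passed": ...} body come from the ownCloud command above; the function name, the placeholder credentials, and the JSON Content-Type header are assumptions made for this example.

```python
import base64
import json
from urllib.request import Request


def build_job_status_request(username, access_key, session_id, passed):
    """Build the PUT request that marks one Sauce Labs job as passed or failed."""
    url = f"https://saucelabs.com/rest/v1/{username}/jobs/{session_id}"
    # HTTP Basic auth, the same thing curl's -u user:key option produces.
    credentials = base64.b64encode(f"{username}:{access_key}".encode()).decode()
    return Request(
        url,
        data=json.dumps({"passed": passed}).encode(),
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",  # assumption, not in the curl call
        },
        method="PUT",
    )


# Placeholder values; nothing is sent over the network here.
request = build_job_status_request("USERNAME", "APIKEY", "SESSIONID", True)
print(request.get_method(), request.full_url, request.data)
```

The key difference from the command in the question is that the URL addresses a specific job by its session ID, and the credentials are passed separately from the URL.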

Where do I find the project ID for the GitLab API?

I use GitLab on their servers. I would like to download my latest build artifacts (built via GitLab CI) via the API like this:
curl --header "PRIVATE-TOKEN: 9koXpg98eAheJpvBs5tK" "https://gitlab.com/api/v3/projects/1/builds/8/artifacts"
Where do I find this project ID? Or is this way of using the API not intended for hosted GitLab projects?
I just found an even easier way to get the project id: look at the HTML source of the GitLab page hosting your project. There is a hidden input named project_id, e.g.:
<input type="hidden" name="project_id" id="project_id" value="335" />
As of GitLab 11.4, the latest version at the time of this writing, the Project ID is shown at the top of the front page of your repository.
On the Edit Project page there is a Project ID field in the top right corner.
(You can also see the ID on the CI/CD pipelines page, in the example code of the Triggers section.)
In older versions, you can see it on the Triggers page, in the URLs of the example code.
You can query for your owned projects:
curl -XGET --header "PRIVATE-TOKEN: XXXX" "https://gitlab.com/api/v4/projects?owned=true"
You will receive JSON with each owned project:
[
{
"id":48,
"description":"",
"default_branch":"master",
"tag_list":[
...
You are also able to get the project ID from the triggers configuration in your project which already has some sample code with your ID.
From the Triggers page:
curl -X POST \
-F token=TOKEN \
-F ref=REF_NAME \
https://<GitLab Installation>/api/v3/projects/<ProjectID>/trigger/builds
As mentioned here, all the project scoped APIs expect either an ID or the project path (URL encoded).
So just use https://gitlab.com/api/v4/projects/gitlab-org%2Fgitlab-foss directly when you want to interact with a project.
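To produce the URL-encoded project path programmatically, something like the following Python sketch would do. The function name is made up for illustration; the API base and project path are the ones from the answer.

```python
from urllib.parse import quote


def project_api_url(api_base, path_with_namespace):
    """Return the project endpoint with the namespace/project path URL-encoded."""
    # safe="" makes quote() encode "/" as %2F as well.
    return f"{api_base.rstrip('/')}/projects/{quote(path_with_namespace, safe='')}"


print(project_api_url("https://gitlab.com/api/v4", "gitlab-org/gitlab-foss"))
# prints https://gitlab.com/api/v4/projects/gitlab-org%2Fgitlab-foss
```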
Enter the project.
On the Left Hand menu click Settings -> General -> Expand General Settings
It has a label Project ID and is next to the project name.
This is on version GitLab 10.2
Here is a solution that actually solves the problem of getting the project id for a specific GitLab project through the API:
curl -XGET -H "Content-Type: application/json" --header "PRIVATE-TOKEN: $GITLAB_TOKEN" http://<YOUR-GITLAB-SERVER>/api/v3/projects/<YOUR-NAMESPACE>%2F<YOUR-PROJECT-NAME> | python -mjson.tool
Or maybe you just want the project id:
curl -XGET -H "Content-Type: application/json" --header "PRIVATE-TOKEN: $GITLAB_TOKEN" http://<YOUR-GITLAB-SERVER>/api/v3/projects/<YOUR-NAMESPACE>%2F<YOUR-PROJECT-NAME> | python -c 'import sys, json; print(json.load(sys.stdin)["id"])'
Note that the repo path (namespace/repo name) is URL-encoded.
If you know your project name, you can get the project id by using the following API:
curl --header "Private-Token: <your_token>" -X GET "https://gitlab.com/api/v4/projects?search=<exact_project_name>"
This will return JSON that includes the id:
[
{
"id":<project id>, ...
}
]
Just for the record, if someone else needs to download artifacts from gitlab.com created via gitlab-ci:
1. Create a private token within your browser.
2. Get the project id via:
curl -XGET --header "PRIVATE-TOKEN: YOUR_AD_HERE?" "https://gitlab.com/api/v3/projects/owned"
3. Download the last artifact from your master branch created via a gitlab-ci step called release:
curl -XGET --header "PRIVATE-TOKEN: YOUR_AD_HERE?" -o myapp.jar "https://gitlab.com/api/v3/projects/4711/builds/artifacts/master/download?job=release"
I am very impressed about the beauty of gitlab.
You can view it under the repository name
You can query projects with search attribute e.g:
http://gitlab.com/api/v3/projects?private_token=xxx&search=myprojectname
As of Gitlab API v4, the following API returns all projects that you own:
curl --header 'PRIVATE-TOKEN: <your_token>' 'https://gitlab.com/api/v4/projects?owned=true'
The response contains the project id. GitLab access tokens can be created from this page: https://gitlab.com/profile/personal_access_tokens
None of the answers covers the general case; the closest one works only for gitlab.com, not for self-hosted instances. The following can be used to find the ID of the project streamer on the GitLab server my-server.com, for example:
$ curl --silent --header 'Authorization: Bearer MY-TOKEN-XXXX' \
'https://my-server.com/api/v4/projects?per_page=100&simple=true'| \
jq -rc '.[]|select(.name|ascii_downcase|startswith("streamer"))'| \
jq .id
168
Note that this gives only the first 100 projects. If you have more, request the following pages (&page=2, 3, ...) or use a different API (e.g. groups/:id/projects).
jq is quite flexible. Here we're just filtering for one project, but you can do many other things with it.
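The paging remark can be sketched in Python as follows. This is only an illustration under stated assumptions: fetch_page stands in for the HTTP call (page and per_page mirror the query parameters above), and the project list is made-up data so the loop can be run offline.

```python
def iter_projects(fetch_page, per_page=100):
    """Yield projects from every page until a short (or empty) page is returned."""
    page = 1
    while True:
        batch = fetch_page(page, per_page)
        yield from batch
        if len(batch) < per_page:
            break
        page += 1


def find_project_ids(projects, name_prefix):
    """Mirror the jq filter: case-insensitive startswith on the project name."""
    prefix = name_prefix.lower()
    return [p["id"] for p in projects if p["name"].lower().startswith(prefix)]


# Stand-in for the HTTP call, using made-up data split into pages of two.
FAKE_PROJECTS = [
    {"id": 1, "name": "alpha"},
    {"id": 2, "name": "beta"},
    {"id": 168, "name": "Streamer"},
]


def fake_fetch(page, per_page):
    start = (page - 1) * per_page
    return FAKE_PROJECTS[start:start + per_page]


print(find_project_ids(iter_projects(fake_fetch, per_page=2), "streamer"))
# prints [168]
```

Stopping on the first short page avoids one extra request in most cases; a real client could instead follow the pagination headers that the server returns.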
There appears to be no way to retrieve only the Project ID using the gitlab api. Instead, retrieve all the owner's projects and loop through them until you find the matching project, then return the ID. I wrote a script to get the project ID:
#!/bin/bash
projectName="$1"
namespace="$2"
# Fall back to the default namespace stored in .namespace
default=$(sudo cat .namespace)
namespace="${namespace:-$default}"
# Fetch all owned projects as JSON (the backslash continues the command)
json=$(curl --header "PRIVATE-TOKEN: $(sudo cat .token)" -X GET \
'https://gitlab.com/api/v4/projects?owned=true' 2>/dev/null)
id=0
idMatch=0
pathWithNamespaceMatch=0
rowToMatch="\"$(echo "$namespace/$projectName" | tr '[:upper:]' '[:lower:]')\","
# Walk the pretty-printed JSON token by token, remembering the most recent id
for row in $(echo "${json}" | jq -r '.'); do
[[ $idMatch -eq 1 ]] && { idMatch=0; id=${row::-1}; }
[[ $pathWithNamespaceMatch -eq 1 ]] && { pathWithNamespaceMatch=0; [[ "$row" == "$rowToMatch" ]] && { echo "$id"; exit 0; } }
[[ ${row} == "\"path_with_namespace\":" ]] && pathWithNamespaceMatch=1
[[ ${row} == "\"id\":" ]] && idMatch=1
done
echo 'Error! Could not retrieve projectID.'
# exit rather than return, since this runs as a standalone script
exit 1
It expects the default namespace to be stored in a file .namespace and the private token to be stored in a file .token. For increased security, it's best to run chmod 000 .token; chmod 000 .namespace; chown root .namespace; chown root .token
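The matching step of that script can also be done on the parsed JSON rather than on its text. The following is a small Python sketch of that idea, not the author's method; the function name and the sample data are made up, and only the path_with_namespace and id fields are assumed from the API response.

```python
import json


def project_id_for(projects_json, namespace, project_name):
    """Return the id whose path_with_namespace matches, or None if absent."""
    wanted = f"{namespace}/{project_name}".lower()
    for project in json.loads(projects_json):
        if project.get("path_with_namespace", "").lower() == wanted:
            return project["id"]
    return None


# Made-up sample shaped like the /projects?owned=true response.
sample = json.dumps([
    {"id": 48, "path_with_namespace": "mygroup/website"},
    {"id": 51, "path_with_namespace": "mygroup/crawler"},
])
print(project_id_for(sample, "MyGroup", "Crawler"))
# prints 51
```

Working on parsed objects avoids depending on how the JSON happens to be pretty-printed or on the order of its keys.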
If your project name is unique, it is handy to follow the answer by shunya and search by name; refer to the API doc.
If you have a stronger access token and the GitLab instance contains several projects with the same name in different groups, searching within a group is more convenient (see the API doc), e.g.:
curl --header "PRIVATE-TOKEN: <token>" -X GET "https://gitlab.com/api/v4/groups/<group_id>/search?scope=projects&search=<project_name>"
The group ID can be found from the Settings page under the group domain.
And to fetch the project id from the output, you can do:
curl --header "PRIVATE-TOKEN: <token>" -X GET "https://gitlab.com/api/v4/groups/<group_id>/search?scope=projects&search=<project_name>" | jq '.[0].id'
To get id from all projects, use:
curl --header 'PRIVATE-TOKEN: XXXXXXXXXXXXXXXXXXXXXXX' 'https://gitlab.com/api/v4/projects?owned=true' > curloutput
grep -oPz 'name\":\".*?\"|{\"id\":[0-9]+' curloutput | sed 's/{\"/\n/g' | sed 's/name//g' |sed 's/id\"://g' |sed 's/\"//g' | sort -u -n
This is not specific to the question, but I somehow ended up here and it might help others.
I used Chrome to get a project ID:
Go to the desired project, for example gitlab.com/username/project1
Open the Network tab of the developer tools
Look at the first GraphQL request in the Network tab
You can search for the project path
curl -s 'https://gitlab.com/api/v4/projects?search=my/path/to/my/project&search_namespaces=true' --header "PRIVATE-TOKEN: $GITLAB_TOKEN" |python -mjson.tool |grep \"id\"
https://docs.gitlab.com/ee/api/projects.html
This will match only your project and will not return other, unrelated projects.
My favorite method is to read it from the CI/CD pipeline, so the project id is assigned dynamically at build time.
Simply assign a variable in your code from the predefined CI/CD variable CI_PROJECT_ID.
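Inside a job, that variable is available as an ordinary environment variable. A minimal Python sketch, assuming the code runs inside a GitLab CI job; the function name and the sample value 335 are only for illustration.

```python
import os


def current_project_id(environ=os.environ):
    """Return CI_PROJECT_ID as an int, or None outside a GitLab CI job."""
    value = environ.get("CI_PROJECT_ID")
    return int(value) if value else None


# Simulated environments, so the sketch also runs outside a pipeline.
print(current_project_id({"CI_PROJECT_ID": "335"}))  # prints 335
print(current_project_id({}))  # prints None
```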

Curl command to query Rally not abiding by page size parameter

I am trying to use a curl command to query for all the projects within my workspace. When I specify a page size, e.g. 100, the result contains only 19 projects instead of my total number of projects, which is over 50. Why is my curl command not abiding by the page size parameter?
Browser REST Client - This returns all projects correctly.
https://rally1.rallydev.com/slm/webservice/v2.0/project?workspace=https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1234567890&query=&start=1&pagesize=100
Curl command - I only get 19 projects, and the result JSON says "pagesize=20" even though my curl command specified pagesize=100
% curl -u 'user@company.com:' https://rally1.rallydev.com/slm/webservice/v2.0/project?workspace=https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1234567890&query=&start=1&pagesize=100
Try a different URL. Instead of
https://rally1.rallydev.com/slm/webservice/v2.0/project?workspace=https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1111&query=&start=1&pagesize=100
try:
https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1111/Projects?&pagesize=100
Here is a curl command:
curl -v -u user@co.com:secret "https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1111/Projects?&pagesize=200" | python -m json.tool > /home/user/curloutput
the purpose of this part:
| python -m json.tool > /home/user/curloutput
was to prettify the JSON response and save it to a file to make it easier to examine.
pagesize=200 worked. There are 24 projects in this workspace. The same command without pagesize=200 returned 19 projects, but the URL that includes pagesize=200 returned all 24.
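One likely reason the original command fell back to the default page size is that its URL was not quoted: the shell treats an unquoted & as a command separator, so curl receives only the part of the URL before the first &. The Python sketch below models that truncation on the URL from the question; url_seen_by_curl is a made-up helper that imitates the shell, not something curl provides.

```python
from urllib.parse import parse_qs, urlsplit

typed = (
    "https://rally1.rallydev.com/slm/webservice/v2.0/project"
    "?workspace=https://rally1.rallydev.com/slm/webservice/v2.0/workspace/1234567890"
    "&query=&start=1&pagesize=100"
)


def url_seen_by_curl(unquoted_url):
    """Model an unquoted URL: the shell ends the command at the first '&'."""
    return unquoted_url.split("&", 1)[0]


def query_params(url):
    """Return the query parameters of a URL, keeping empty values."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


print(sorted(query_params(typed)))
# prints ['pagesize', 'query', 'start', 'workspace']
print(sorted(query_params(url_seen_by_curl(typed))))
# prints ['workspace']
```

With the URL in double quotes, as in the working command above, the whole query string reaches the server.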

Private app API using 'curl'

I am trying to use curl with the private app API. I have succeeded in accessing my shop, but I get an error message when I try to modify an existing product using curl in PHP. Below is the code I used.
(I put the modification details in an XML file, input.xml, and tried to address the SKU using the product ID.)
Is there any way to modify more than 250 products in one click?
//php
curl -v -X PUT -d @input.xml -H 'Content-Type: application/xml' (<-- this is line 2)
'https://myapikey:mypassword@myshopdomain/admin/products/product_id.xml'
// and i got this error message
"Parse error: syntax error, unexpected T_STRING in /home/content/s/o/n/song2660/html/curltest.php on line 2"