Retrieve Wikimedia Commons Category of a Wikipedia page using the Wikimedia API - wikipedia-api

I am trying to use the Wikimedia Api to get the corresponding Wikimedia Commons category to a specific Wikipedia page. I assume that it is possible as most Wikipedia pages include an "in other projects" - section in the sidebar which has a link that redirects to the Commons Category (for example: https://de.wikipedia.org/wiki/Albert_Einstein)
Thanks in advance.

You can do it in two API calls, the first call to German Wikipedia gets you the Wikidata Qid:
https://de.wikipedia.org/w/api.php?action=query&format=json&prop=wbentityusage&titles=Albert%20Einstein&wbeuprop=&wbeuaspect=
Which returns:
{
"batchcomplete": "",
"query": {
"pages": {
"1278360": {
"pageid": 1278360,
"ns": 0,
"title": "Albert Einstein",
"wbentityusage": {
"Q937": {
"aspects": [
"S",
"T",
"C.P227",
"C.P214",
"C.P244"
]
}
}
}
}
}
}
Then you can use the Wikidata API to get the name of the Commons category: https://www.wikidata.org/w/api.php?action=wbgetclaims&format=json&entity=Q937&property=P373
Which returns:
{
"claims": {
"P373": [
{
"mainsnak": {
"snaktype": "value",
"property": "P373",
"hash": "be154a8a3dfc826844ceb5a62389857a65ff1e4e",
"datavalue": {
"value": "Albert Einstein",
"type": "string"
},
"datatype": "string"
},
"type": "statement",
"id": "q937$2F332903-133D-4CA0-AD24-8B4292C2BF89",
"rank": "normal"
}
]
}
}
The value in datavalue is the name of the category. You get the full URL by just prepending https://commons.wikimedia.org/wiki/Category:
https://commons.wikimedia.org/wiki/Category:Albert Einstein

Related

2022: Get LinkedIn Posts via Access Token

I have been looking for an updated version of how to retrieve my own posts from LinkedIn via an access token which I have received from completing a three-legged OAuth process.
I have reviewed: https://learn.microsoft.com/en-us/linkedin/marketing/integrations/community-management/shares/share-api?view=li-lms-unversioned&tabs=http#permissions- but its now showing as legacy.
As it seems, LinkedIn unfortunately no longer wants us to retrieve this kind of content:
"Find Posts by authors is only supported for organization authors.
Finding by member authors is not supported."
Source
Neither a good, nor a solution within the LinkedIn-TOS (I assume) would be to use a webscraper. I have seen some solutions online but they are all very pricey in my opinion (and you never know if they get shut down by LinkedIn)
{
"activity": "urn:li:activity:12345657",
"content": {
"contentEntities": [
{
"entity": "urn:li:article:0",
"entityLocation": "https://www.example.com/content.html",
"thumbnails": [
{
"imageSpecificContent": {},
"resolvedUrl": "https://www.example.com/image.jpg"
}
]
}
],
"description": "content description",
"title": "Test Share with Content"
},
"created": {
"actor": "urn:li:person:A8xe03Qt10",
"time": 1471967236000
},
"distribution": {
"linkedInDistributionTarget": {}
},
"id": "6173878065928642560",
"lastModified": {
"actor": "urn:li:person:A8xe03Qt10",
"time": 1471967237000
},
"owner": "urn:li:organization:123456789",
"text": {
"text": "Test Share!"
}
}

GraphQL query - Query by ID

I have installed the strapi-starter-blog locally and I'm trying to understand how I can query article by ID (or slug). When I open the GraphQL Playground, I can get all the article using:
query Articles {
articles {
id
title
content
image {
url
}
category {
name
}
}
}
The response is:
{
"data": {
"articles": [
{
"id": "1",
"title": "Thanks for giving this Starter a try!",
"content": "\n# Thanks\n\nWe hope that this starter will make you want to discover Strapi in more details.\n\n## Features\n\n- 2 Content types: Article, Category\n- Permissions set to 'true' for article and category\n- 2 Created Articles\n- 3 Created categories\n- Responsive design using UIkit\n\n## Pages\n\n- \"/\" display every articles\n- \"/article/:id\" display one article\n- \"/category/:id\" display articles depending on the category",
"image": {
"url": "/uploads/blog_header_network_7858ad4701.jpg"
},
"category": {
"name": "news"
}
},
{
"id": "2",
"title": "Enjoy!",
"content": "Have fun!",
"image": {
"url": "/uploads/blog_header_balloon_32675098cf.jpg"
},
"category": {
"name": "trends"
}
}
]
}
}
But when I try to get the article using the ID with variable, like here github code in the GraphQL Playground with the following
Query:
query Articles($id: ID!) {
articles(id: $id) {
id
title
content
image {
url
}
category {
name
}
}
}
Variables:
{
"id": 1
}
I get an error:
...
"message": "Unknown argument \"id\" on field \"articles\" of type \"Query\"."
...
What is the difference and why can't I get the data like in the example of the Github repo.
Thanks for your help.
It's the difference between articles and article as the query. If you use the singular one you can use the ID as argument

Importing Data to Contentful programatically from a json file

I am trying to import some data programatically into contentful:
I am following the docs here
And running the command inside my integrated terminal
contentful space import --config config.json
Where the config file is
{
"spaceId": "abc123",
"managementToken": "112323132321adfWWExample",
"contentFile": "./dataToImport.json"
}
And the dataToImport.json file is
{
"data": [
{
"address": "11234 New York City"
},
{
"address": "1212 New York City"
}
]
}
The thing is I don't understand what format my dataToImport.json should be and what is missing inside this file or in my config file so that the array of addresses from the .json file get added as new entries to an already created content model inside the Contentful UI show in the screenshot below
I am not specifying the content model for the data to go into so I believe that is one issue, and I don't know how I do that. An example or repo would help me out greatly
The types of data you can import are listed : in their documentation
your json top level should say "entries" and not data, if new content of a content type is what you would like to import.
This is an example of a blog post as per content model of the tutorial they provide.
The only thing i didn't work out yet is where the user id is :D so i substituted for one of the content type 'person' also provided in their tutorial (I think it's called Gatsby Starter)
{"entries": [
{
"sys": {
"space": {
"sys": {
"type": "Link",
"linkType": "Space",
"id": "theSpaceIdToReceiveYourImport"
}
},
"type": "Entry",
"createdAt": "2019-04-17T00:56:24.722Z",
"updatedAt": "2019-04-27T09:11:56.769Z",
"environment": {
"sys": {
"id": "master",
"type": "Link",
"linkType": "Environment"
}
},
"publishedVersion": 149, -- these are not compulsory, you can skip
"publishedAt": "2019-04-27T09:11:56.769Z", -- you can skip
"firstPublishedAt": "2019-04-17T00:56:28.525Z", -- you can skip
"publishedCounter": 3, -- you can skip
"version": 150,
"publishedBy": { -- this is an example of a linked content
"sys": {
"type": "Link",
"linkType": "person",
"id": "personId"
}
},
"contentType": {
"sys": {
"type": "Link",
"linkType": "ContentType",
"id": "blogPost" -- here should be your content type 'RealtorProperties'
}
}
},
"fields": { -- here should go your content type fields, i can't see it in your post
"title": {
"en-US": "Test 1"
},
"slug": {
"en-US": "Test-1"
},
"description": {
"en-US": "some description"
},
"body": {
"en-US": "some body..."
},
"publishDate": {
"en-US": "2016-12-19"
},
"heroImage": { -- another example of a linked content
"en-US": {
"sys": {
"type": "Link",
"linkType": "Asset",
"id": "idOfTHisImage"
}
}
}
}
},
--another entry, ...]}
Have a look at this repo. I am also trying to figure this out. Looks like there's quite a lot of fields that need to be included in the json file. I was hoping there'd be a simple solution but it seems you (me too actually) will need to create scripts to "convert" your json file to data contentful can read and import.
I'll let you know if I find anything better.

How to define a related resource URI in JSON:API

In the json:api format relationships are defined with a type and a id.
Like in the example bellow. The article has a relationship with the type people and the id 9.
Now if i want to fetch the related resource i use the URI from "links.related"
// ...
{
"type": "articles",
"id": "1",
"attributes": {
"title": "Rails is Omakase"
},
"relationships": {
"author": {
"links": {
"self": "http://example.com/articles/1/relationships/author",
"related": "http://example.com/articles/1/author"
},
"data": { "type": "people", "id": "9" }
}
},
"links": {
"self": "http://example.com/articles/1"
}
}
// ...
But in my case the related resource (people) are in a separate API. There is no way to get the full people data from the articles API nor is it possible to include it. The only way to get the related data would be a call to:
http://example.com/v1-2/people/9/
Where can i define the relation between the URI and people:9
Or in other words: How would a client know where to fetch the related resource?

Specific analyzers for sub-documents in lucene / elasticsearch

After reading the documentation, testing and reading a lot of other questions here on stackoverflow:
We have documents that have titles and description in multiple languages. There are also tags that are translated to the same languages. There might be up to 30-40 different languages in the system, but probably only 3 or 4 translations for a single document.
This is the planned document structure:
{
"luck": {
"id": 10018,
"pub": 0,
"pr": 100002,
"loc": {
"lat": 42.7,
"lon": 84.2
},
"t": [
{
"lang": "en-analyzer",
"title": "Forest",
"desc": "A lot of trees.",
"tags": [
"Wood",
"Nature",
"Green Mouvement"
]
},
{
"lang": "fr-analyzer",
"title": "ForĂȘt",
"desc": "A grand nombre d'arbre.",
"tags": [
"Bois",
"Nature",
"Mouvement Vert"
]
}
],
"dates": [
"2014-01-01T20:00",
"2014-06-06T20:00",
"2014-08-08T20:00"
]
}
}
Possible queries are "arbre" or "wood" or "forest" or "nature" combined with a date and a geo_distance filter, furthermore there will be some facets over the tags array (that obviously include counting).
We can produce any document structure that fits best for elasticsearch (or for lucene). It's crucial that each language is analyzed specifically, so we use "_analyzer" in order to distinguish the languages.
{
"luck": {
"properties": {
"id": {
"type": "long"
},
"pub": {
"type": "long"
},
"pr": {
"type": "long"
},
"loc": {
"type": "geo_point"
},
"t": {
"_analyzer": {
"path": "t.lang"
},
"properties": {
"lang": {
"type": "string"
},
"properties": {
"title": {
"type": "string"
},
"desc": {
"type": "string"
},
"tags": {
"type": "string"
}
}
}
}
}
}
A) Apparently, this idea does not work: after PUTting the mapping, we retrieve the same mapping ("GET") and it seems to ignore the specific analyzers (A test with a top-level "_analyzer" worked fine). Does "_analyzer" work for sub-documents and if yes how to should we refer to it? We also tested declaring the sub-document as "object" or "nested". How is multi-language document indexing supposed to work.
B) One possibility would be to put each language in its own document: In that case how do we manage the id? Finally both documents should refer to the same id. For example if the user searches for "nature" (and we don't know if the user intends to find "nature" in English or French), this document would appear twice in the result set, and the counting and paging would be very wrong (also facet counting).
Any ideas?