Querying 'DeletedEntry' using an id

The ContentManagement.Entry.delete webhook payload does not contain the entry fields.
For example:
{
  "sys": {
    "type": "DeletedEntry",
    "id": "{ID HERE}",
    "space": {
      "sys": {
        "type": "Link",
        "linkType": "Space",
        "id": "{SPACE ID HERE}"
      }
    },
    "revision": 1,
    "createdAt": "2017-08-18T09:57:26.226Z",
    "updatedAt": "2017-08-18T09:57:26.226Z",
    "deletedAt": "2017-08-18T09:57:26.226Z",
    "contentType": {
      "sys": {
        "type": "Link",
        "linkType": "ContentType",
        "id": "page"
      }
    }
  }
}
Is there a way to retrieve the fields associated with the entry after it has been deleted? Or is there a method to query deleted entries so that we can get the associated field data?
Thanks,

Unfortunately there is no way to query deleted entries or retrieve their fields.
You can query unpublished entries through the Content Management API, but once an entry is deleted it is gone: the webhook payload only carries the ID of the deleted entry, and the corresponding content fields are lost.
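For what it's worth, here is a minimal Python sketch of reading an entry's fields via the Content Management API while the entry still exists (for example from a handler for the unpublish webhook), so you can snapshot them yourself; the space ID, entry ID, and token below are placeholders, and a CMA token with access to the space is assumed.

import requests

SPACE_ID = "your_space_id"      # placeholder
ENTRY_ID = "the_entry_id"       # e.g. the sys.id from an earlier webhook payload
CMA_TOKEN = "your_cma_token"    # Content Management API token (placeholder)

# Fetch the entry while it still exists; this works for drafts and published
# entries, but no longer finds the entry once it has been deleted.
resp = requests.get(
    f"https://api.contentful.com/spaces/{SPACE_ID}/entries/{ENTRY_ID}",
    headers={"Authorization": f"Bearer {CMA_TOKEN}"},
)
resp.raise_for_status()

entry = resp.json()
print(entry["fields"])  # snapshot this somewhere before the entry is deleted

Any such snapshot has to happen before the delete webhook fires; after that point the entry can no longer be fetched.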

Related

Get all API field definitions from a Podio application

Responses from the Podio API return a JSON array of items with a fields property. Each field carries its values and its config.
For example, a category field for Gender:
{
  "type": "category",
  "field_id": 219922852,
  "label": "Gender",
  "values": [
    {
      "value": {
        "status": "active",
        "text": "Prefer not to say",
        "id": 3,
        "color": "F7F0C5"
      }
    }
  ],
  "config": {
    "settings": {
      "multiple": true,
      "options": [
        {
          "status": "active",
          "text": "Male",
          "id": 1,
          "color": "DCEBD8"
        },
        {
          "status": "active",
          "text": "Female",
          "id": 2,
          "color": "F7F0C5"
        },
        {
          "status": "active",
          "text": "Prefer not to say",
          "id": 3,
          "color": "F7F0C5"
        }
      ],
      "display": "inline"
    },
    "mapping": null,
    "label": "Gender"
  },
  "external_id": "gender"
}
How can I fetch the config without having to query a specific item?
Is there a way to get every field in the response? If the queried item does not have a value set for a field, Podio doesn't return that field in the response.
I would like to get the field config for ALL the fields, if possible with a single API request. In particular I am interested in all the possible values (in the case of Category or Relationship fields) so that I can match them with local values I have.
This way I can use the field structure to programmatically map some local values to the format required by the Podio API, and then generate a fields payload to update/create Podio items via API calls.
You can call the Podio Get App method to get the app configuration, which includes the definition and config of every field.
Podio doc ref: https://developers.podio.com/doc/applications/get-app-22349
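For illustration, a minimal Python sketch of calling Get App and pulling out each field's config; the app ID and OAuth2 access token are placeholders, and the field structure assumed here mirrors the fragment in the question (config, settings, options, external_id).

import requests

APP_ID = 123456                     # placeholder Podio app ID
ACCESS_TOKEN = "your_access_token"  # OAuth2 token, assumed to be already obtained

# Get App returns the app definition, including every field and its config,
# whether or not a given item has a value for that field.
resp = requests.get(
    f"https://api.podio.com/app/{APP_ID}",
    headers={"Authorization": f"OAuth2 {ACCESS_TOKEN}"},
)
resp.raise_for_status()
app = resp.json()

for field in app["fields"]:
    settings = field["config"].get("settings") or {}
    # For category fields, settings["options"] lists every possible value,
    # which is what you need to map local values onto Podio option ids.
    print(field["external_id"], field["type"], settings.get("options"))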

Error when extracting data from Azure Table Storage using Azure Data Factory

I want to copy data from Azure Table Storage to Azure SQL Server using Azure Data Factory, but I get a strange error.
In my Azure Table Storage I have a column which contains multiple data types (this is how Table Storage works), e.g. DateTime and String.
In my Data Factory project I declared the entire column as String, but for some reason Data Factory infers the data type from the first cell it encounters during the extraction process.
In my Azure SQL Server database all columns are strings.
Example
I have this table in Azure Table Storage: Flights
RowKey PartitionKey ArrivalTime
--------------------------------------------------
1332-2 2213dcsa-213 04/11/2017 04:53:21.707 PM - this cell is DateTime
1332-2 2213dcsa-214 DateTime.Null - this cell is String
If my table were like the one below, the copy process would work, because the first row is a string and the entire column would be converted to string.
RowKey PartitionKey ArrivalTime
--------------------------------------------------
1332-2 2213dcsa-214 DateTime.Null - this cell is String
1332-2 2213dcsa-213 04/11/2017 04:53:21.707 PM - this cell is DateTime
Note: I am not allowed to change the data type in Azure Table Storage, move the rows, or add new ones.
Below are the input and output data sets from Azure Data Factory:
"datasets": [
{
"name": "InputDataset",
"properties": {
"structure": [
{
"name": "PartitionKey",
"type": "String"
},
{
"name": "RowKey",
"type": "String"
},
{
"name": "ArrivalTime",
"type": "String"
}
],
"published": false,
"type": "AzureTable",
"linkedServiceName": "Source-AzureTable",
"typeProperties": {
"tableName": "flights"
},
"availability": {
"frequency": "Day",
"interval": 1
},
"external": true,
"policy": {}
}
},
{
"name": "OutputDataset",
"properties": {
"structure": [
{
"name": "PartitionKey",
"type": "String"
},
{
"name": "RowKey",
"type": "String"
},
{
"name": "ArrivalTime",
"type": "String"
}
],
"published": false,
"type": "AzureSqlTable",
"linkedServiceName": "Destination-SQLAzure",
"typeProperties": {
"tableName": "[dbo].[flights]"
},
"availability": {
"frequency": "Day",
"interval": 1
},
"external": false,
"policy": {}
}
}
]
Does anyone know a solution to this issue?
I've just been playing around with this, and I think you have two options to deal with it.
Option 1
Simply remove the data type attribute from your input dataset. In the 'structure' block of the input JSON table dataset you don't have to specify the type attribute. Remove or comment it out.
For example:
{
  "name": "InputDataset-ghm",
  "properties": {
    "structure": [
      {
        "name": "PartitionKey",
        "type": "String"
      },
      {
        "name": "RowKey",
        "type": "String"
      },
      {
        "name": "ArrivalTime"
        /* "type": "String"   <<<<<< Optional! */
      },
This should mean the data type is not validated on read.
Option 2
Use a custom activity upstream of the SQL DB table load to cleanse and transform the table data. This means breaking out the C# and requires a lot more dev time, but you may be able to reuse the cleansing code for other datasets.
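An ADF v1 custom activity is authored in .NET, so the following is not drop-in code; it is just a Python sketch of the kind of cleansing such an activity would perform: coerce whatever arrives in ArrivalTime into a plain string before the SQL load. The column names come from the question; the sample rows and formatting choices are illustrative only.

from datetime import datetime

def normalize_arrival_time(value):
    """Coerce a mixed-type ArrivalTime cell into a plain string."""
    if value is None:
        return None
    if isinstance(value, datetime):
        # Render the timestamp as text; pick whatever format your SQL column expects.
        return value.isoformat()
    return str(value)

# Illustrative rows shaped like the question's Flights table.
rows = [
    {"RowKey": "1332-2", "PartitionKey": "2213dcsa-213",
     "ArrivalTime": datetime(2017, 4, 11, 16, 53, 21, 707000)},
    {"RowKey": "1332-2", "PartitionKey": "2213dcsa-214",
     "ArrivalTime": "DateTime.Null"},
]

for row in rows:
    row["ArrivalTime"] = normalize_arrival_time(row["ArrivalTime"])

print(rows)  # every ArrivalTime is now a string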
Hope this helps.

How to define a related resource URI in JSON:API

In the JSON:API format, relationships are defined with a type and an id.
Like in the example below: the article has a relationship with the type people and the id 9.
Now, if I want to fetch the related resource, I use the URI from "links.related".
// ...
{
"type": "articles",
"id": "1",
"attributes": {
"title": "Rails is Omakase"
},
"relationships": {
"author": {
"links": {
"self": "http://example.com/articles/1/relationships/author",
"related": "http://example.com/articles/1/author"
},
"data": { "type": "people", "id": "9" }
}
},
"links": {
"self": "http://example.com/articles/1"
}
}
// ...
But in my case the related resource (people) lives in a separate API. There is no way to get the full people data from the articles API, nor is it possible to include it. The only way to get the related data would be a call to:
http://example.com/v1-2/people/9/
Where can I define the relation between this URI and people:9?
Or in other words: how would a client know where to fetch the related resource?

Schema to load JSON to Google BigQuery

Suppose I have the following JSON, which is the result of parsing URL parameters from a log file.
{
  "title": "History of Alphabet",
  "author": [
    {
      "name": "Larry"
    }
  ]
}
{
  "title": "History of ABC"
}
{
  "number_pages": "321",
  "year": "1999"
}
{
  "title": "History of XYZ",
  "author": [
    {
      "name": "Steve",
      "age": "63"
    },
    {
      "nickname": "Bill",
      "dob": "1955-03-29"
    }
  ]
}
All the top-level fields ("title", "author", "number_pages", "year") are optional, and so are the second-level fields inside "author", for example.
How should I make a schema for this JSON when loading it to BQ?
A related question:
For example, suppose there is another similar table, but the data is from a different date, so it may have a different schema. Is it possible to query across these two tables?
How should I make a schema for this JSON when loading it to BQ?
The following schema should work. You may want to change some of the types (e.g. maybe you want the dob field to be a TIMESTAMP instead of a STRING), but the general structure should be similar. Since fields are NULLABLE by default, all of them can handle not being present for a given row; author is declared as a REPEATED RECORD because it arrives as an array of objects.
[
  {
    "name": "title",
    "type": "STRING"
  },
  {
    "name": "author",
    "type": "RECORD",
    "mode": "REPEATED",
    "fields": [
      {
        "name": "name",
        "type": "STRING"
      },
      {
        "name": "age",
        "type": "STRING"
      },
      {
        "name": "nickname",
        "type": "STRING"
      },
      {
        "name": "dob",
        "type": "STRING"
      }
    ]
  },
  {
    "name": "number_pages",
    "type": "INTEGER"
  },
  {
    "name": "year",
    "type": "INTEGER"
  }
]
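If it's useful, here is a rough sketch of the same schema expressed with the google-cloud-bigquery Python client and used in a load job; the dataset, table, and file names are made up, and the source file is assumed to be newline-delimited JSON.

from google.cloud import bigquery

# Assumed names: a "my_dataset.books" destination table and a local NDJSON file "books.json".
client = bigquery.Client()

schema = [
    bigquery.SchemaField("title", "STRING"),
    bigquery.SchemaField(
        "author", "RECORD", mode="REPEATED",
        fields=[
            bigquery.SchemaField("name", "STRING"),
            bigquery.SchemaField("age", "STRING"),
            bigquery.SchemaField("nickname", "STRING"),
            bigquery.SchemaField("dob", "STRING"),
        ],
    ),
    bigquery.SchemaField("number_pages", "INTEGER"),
    bigquery.SchemaField("year", "INTEGER"),
]

job_config = bigquery.LoadJobConfig(
    schema=schema,
    source_format=bigquery.SourceFormat.NEWLINE_DELIMITED_JSON,
)

with open("books.json", "rb") as f:
    load_job = client.load_table_from_file(f, "my_dataset.books", job_config=job_config)
load_job.result()  # wait for the load to complete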
A related question: For example, suppose there is another similar table, but the data is from a different date, so it may have a different schema. Is it possible to query across these two tables?
It should be possible to union two tables with differing schemas without too much difficulty.
Here's a quick example of how it works over public data (a slightly silly example, since the two tables have no fields in common, but it shows the concept):
SELECT * FROM
(SELECT * FROM publicdata:samples.natality),
(SELECT * FROM publicdata:samples.shakespeare)
LIMIT 100;
Note that you need the SELECT * around each table or the query will complain about the differing schemas.

How can I handle duplicate data in Elasticsearch?

I have used parent & child mapping to normalize data, but as far as I understand there is no way to get any fields from the _parent document.
Here is the mapping of my index:
{
  "mappings": {
    "building": {
      "properties": {
        "name": {
          "type": "string"
        }
      }
    },
    "flat": {
      "_parent": {
        "type": "building"
      },
      "properties": {
        "name": {
          "type": "string"
        }
      }
    },
    "room": {
      "_parent": {
        "type": "flat"
      },
      "properties": {
        "name": {
          "type": "string"
        },
        "floor": {
          "type": "long"
        }
      }
    }
  }
}
Now I'm trying to find the best way of storing flat_name and building_name in the room type. I won't query these fields, but I should be able to get them when I query other fields like floor.
There will be millions of rooms and I don't have much memory, so I suspect these duplicate values may cause an out-of-memory problem. For now, the flat_name and building_name fields have the "index": "no" property and I have turned on compression for the _source field.
Do you have any efficient suggestion for avoiding the duplicate values, such as running multiple queries or some hacky way to get fields from the _parent document, or is denormalized data the only way to handle this kind of problem?