How to design a REST API

I am building a metadata platform for a broadcaster. It exposes a REST API to write data to the DB.
I struggle with the design of the endpoint boundaries. There is a wide range of options to split the API controller model so that the API producers can write data in either normalized or denormalized form. I am not sure what a widely accepted design is so that the platform provides the best possible developer experience.
I have modeled both extremes with a PUT request for updates. Which one is more usable?
Normalised
PUT /episodes
{
"season" : 1,
"episodeInSeason" : 2,
"texts" : [
"#ref TextId"
],
"variants" : [
"#ref VariantId"
]
}
PUT /variants
{
"texts" : [
"#ref TextId"
],
"assets" : [
"#ref AssetId"
]
}
PUT /assets
{
"texts" : [
"#ref TextId"
],
"renditions" : [{
"fps": 60.0
}],
"markIn" : "#ref MarkerId",
"markOut" : "#ref MarkerId"
}
PUT /texts
{
"title": "foo",
"lang" : "en"
}
PUT /renditions
{
"fps": 60.0
}
PUT /markers
{
"position" : 2.2
}
Denormalised
PUT /episodes
{
"season" : 1,
"episodeInSeason" : 2,
"texts" : [{
"title": "foo",
"lang" : "en"
}],
"variants" : [{
"texts" : [{
"title": "foo",
"lang" : "en"
}],
"assets" : [{
"texts" : [{
"title": "foo",
"lang" : "en"
}],
"renditions" : [{
"fps": 60.0
}],
"markIn" : {
"position" : 1.0
},
"markOut" : {
"position" : 2.2
}
}]
}]
}

Related

Google Geocoding API - return any valid address based on zip code

I need to return any (at least one) valid address that exists around a zip code.
The country will always be the same.
How can I achieve this?
Thank you,
You can accomplish this through 2 requests, using the output of the first as input for the second. There are a few caveats outlined at the end.
Request 1
Get data based on Zip Code:
curl --location --request GET 'https://maps.googleapis.com/maps/api/geocode/json?address=90058&key={{YOUR_API_KEY}}'
From the response, take the results[0].geometry.location object. You will use its lat and lng values for your next query. The example above returns the following:
// … more above
"formatted_address" : "Los Angeles, CA 90058, USA",
"geometry" : {
"bounds" : {
"northeast" : {
"lat" : 34.045639,
"lng" : -118.1685449
},
"southwest" : {
"lat" : 33.979672,
"lng" : -118.2435201
}
},
"location" : {
"lat" : 34.00637469999999,
"lng" : -118.2234229
},
"location_type" : "APPROXIMATE",
"viewport" : {
"northeast" : {
"lat" : 34.045639,
"lng" : -118.1685449
},
"southwest" : {
"lat" : 33.979672,
"lng" : -118.2435201
}
}
},
"place_id" : "ChIJDeH8s8TIwoARYQFWkBcCzFk",
// … more below
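As a minimal sketch of the hand-off between the two requests, the helper below pulls results[0].geometry.location out of an already parsed response and formats it for the latlng query param. The function name latlng_param is hypothetical, and the code assumes the response has the shape shown above.

```python
from typing import Any


def latlng_param(geocode_response: dict[str, Any]) -> str:
    """Return "lat,lng" taken from the first result of a geocoding response.

    Hypothetical helper: it assumes the parsed JSON has the shape shown
    in the sample response (results[0].geometry.location).
    """
    results = geocode_response.get("results", [])
    if not results:
        raise ValueError("geocoding response contains no results")
    location = results[0]["geometry"]["location"]
    return f"{location['lat']},{location['lng']}"
```

The returned string can be used directly as the value of latlng in request 2.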
Request 2
Get data based on Latitude and Longitude
curl --location --request GET 'https://maps.googleapis.com/maps/api/geocode/json?latlng=34.00637469999999,-118.2234229&key={{YOUR_API_KEY}}'
NOTE! The query param has changed from address to latlng
This will likely return many results, and you will have to decide which type meets your requirements. I would recommend looping through the results array, looking for an entry whose types array includes street_address or premise, and using that entry as your target result.
For the example above, results[1] will have "types" : [ "premise" ]
Its associated formatted_address and the complete parent object are as follows (see the note at the bottom on targeting specific types):
"formatted_address" : "2727 E Vernon Ave, Vernon, CA 90058, USA",
// … more above
{
"address_components": [
{
"long_name": "2727",
"short_name": "2727",
"types": [
"street_number"
]
},
{
"long_name": "East Vernon Avenue",
"short_name": "E Vernon Ave",
"types": [
"route"
]
},
{
"long_name": "Vernon",
"short_name": "Vernon",
"types": [
"locality",
"political"
]
},
{
"long_name": "Los Angeles County",
"short_name": "Los Angeles County",
"types": [
"administrative_area_level_2",
"political"
]
},
{
"long_name": "California",
"short_name": "CA",
"types": [
"administrative_area_level_1",
"political"
]
},
{
"long_name": "United States",
"short_name": "US",
"types": [
"country",
"political"
]
},
{
"long_name": "90058",
"short_name": "90058",
"types": [
"postal_code"
]
},
{
"long_name": "1822",
"short_name": "1822",
"types": [
"postal_code_suffix"
]
}
],
"formatted_address": "2727 E Vernon Ave, Vernon, CA 90058, USA",
"geometry": {
"bounds": {
"northeast": {
"lat": 34.0065676,
"lng": -118.2228652
},
"southwest": {
"lat": 34.0057081,
"lng": -118.2245065
}
},
"location": {
"lat": 34.0061691,
"lng": -118.2236056
},
"location_type": "ROOFTOP",
"viewport": {
"northeast": {
"lat": 34.0074868302915,
"lng": -118.2223368697085
},
"southwest": {
"lat": 34.0047888697085,
"lng": -118.2250348302915
}
}
},
"place_id": "ChIJCQv2_sPIwoARZn8W2bF9a9g",
"types": [
"premise"
]
},
// … more below
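The loop over the results array described above could look roughly like this. It is only a sketch: first_result_of_type and WANTED_TYPES are hypothetical names, and the input is assumed to be the results array from the parsed response of request 2.

```python
from typing import Any, Iterable, Optional

# Types suggested in the answer; adjust to your own requirements.
WANTED_TYPES = frozenset({"street_address", "premise"})


def first_result_of_type(
    results: Iterable[dict[str, Any]],
    wanted: frozenset = WANTED_TYPES,
) -> Optional[dict[str, Any]]:
    """Return the first result whose "types" array contains a wanted type."""
    for result in results:
        if wanted.intersection(result.get("types", [])):
            return result
    # No match, e.g. a sparsely populated zip code with no premise entry.
    return None
```

A return value of None corresponds to the case described in the caveats, where no entry of the desired type exists.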
Caveats
I assume that since you said "Zip Code" you meant the United States. I only tested for the US; if you need to support other countries, consider adding the country code to your address query params in request 1.
If you use this approach starting with Zip Code, you might encounter results with a route or street_address but no premise if the Zip Code represents a large geographic area with a very low population density. I tested this on "Vernon, California (90058) - population 112" and "Freeport City, Kansas (67049) - population 5". Freeport City returned no result having type of premise.
If you want to query a specific type so you don't have to search through a large result set, you can add the query param &result_type=premise to request 2. However, if no result of the specified type(s) exists for the given lat/lng, the response will have a status of ZERO_RESULTS.
Multiple result_type values may be supplied; separate them with a pipe (|).
Full docs here on Reverse Geocoding: https://developers.google.com/maps/documentation/geocoding/intro#ReverseGeocoding
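To illustrate the result_type caveat, here is a small sketch that builds the URL for request 2 with pipe-separated types and treats ZERO_RESULTS as an empty list. The function names are hypothetical, and the code assumes the status field sits at the top level of the parsed response.

```python
from typing import Any, Sequence
from urllib.parse import urlencode

GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def reverse_geocode_url(
    latlng: str, api_key: str, result_types: Sequence[str] = ()
) -> str:
    """Build the request 2 URL, optionally restricted to some result types."""
    params = {"latlng": latlng, "key": api_key}
    if result_types:
        # Multiple types are joined with a pipe; urlencode escapes it as %7C.
        params["result_type"] = "|".join(result_types)
    return f"{GEOCODE_URL}?{urlencode(params)}"


def results_or_empty(response: dict[str, Any]) -> list:
    """Return the results array, or [] when the status is ZERO_RESULTS."""
    if response.get("status") == "ZERO_RESULTS":
        return []
    return response.get("results", [])
```

If the filtered query comes back empty, one option is to repeat the request without result_type and fall back to a broader type such as route.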

How to load Nested json data into a single column in druid

I am trying to load nested json data in Apache druid:
Data-->
{
"a": "a_data",
"b": "b_data",
"c_blob_Column": {"aaaa"{"k":"sample"{"c":"sample2"}}}}
Spec -->
{ "type" : "kafka", "dataSchema" : { "dataSource" : "blob", "parser" : { "type" : "string", "parseSpec" : { "format" : "json", "dimensionsSpec" : { "dimensions" : [ "a", "b", "c_blob_Column"
]
},
"timestampSpec": {
"column": "timestamp",
"format": "iso"
}
}
},
"metricsSpec" : [],
"granularitySpec" : {
"type" : "uniform",
"segmentGranularity" : "DAY",
"queryGranularity" : "none",
"rollup" : false
}
},
"ioConfig" : {
"topic":"blob_topic",
"consumerProperties":{
"bootstrap.servers":"<local server>"
},
"appendToExisting" : false,
"useEarliestOffset": true,
"taskDuration": "PT15M"
},
"tuningConfig" : {
"type" : "kafka",
"maxRowsPerSegment" : 5000000,
"maxRowsInMemory" : 25000
}
}
Output columns-->
a,b,c_blob_Column,__time
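I am able to load the data, but the issue is that the data in the column c_blob_Column does not come through in JSON form. Could someone please help me find out how to load the JSON blob data?
You can use a jq expression in a flattenSpec (placed inside the parseSpec) to serialize the nested object into a single string column: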
"flattenSpec": {
"fields": [
{
"type": "jq",
"name": "c_blob_Column",
"expr": ".c_blob_Column | tojson"
}
]
}
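To make the effect of the expression concrete: ".c_blob_Column | tojson" replaces the nested object with its JSON text, so the column ends up holding one string. The Python sketch below mimics that transformation outside Druid. It is an illustration only, the helper name blob_to_string is hypothetical, and the nested sample value in the usage note is made up because the original sample is not valid JSON.

```python
import json
from typing import Any


def blob_to_string(event: dict[str, Any], column: str = "c_blob_Column") -> dict[str, Any]:
    """Return a copy of the event with the nested column serialized to a JSON string.

    Roughly what the jq expression `.c_blob_Column | tojson` does to each event.
    """
    flattened = dict(event)
    # Compact separators approximate jq's compact output.
    flattened[column] = json.dumps(event[column], separators=(",", ":"))
    return flattened
```

For example, an event whose c_blob_Column is {"k": {"c": "sample2"}} would be stored with the string {"k":{"c":"sample2"}} in that column.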

Elasticsearch Post bulk on elastic xpack role

I have an Elastic cluster with xpack enabled.
I'd like to make a backup of all the xpack roles that have been created:
GET _xpack/security/role
=> I get a big JSON, e.g.:
{
"kibana_dashboard_only_user": {
"cluster": [],
"indices": [
{
"names": [
".kibana*"
],
"privileges": [
"read",
"view_index_metadata"
]
}
],
"run_as": [],
"metadata": {
"_reserved": true
},
"transient_metadata": {
"enabled": true
}
},
"watcher_admin": {
"cluster": [
"manage_watcher"
],
"indices": [
{
"names": [
".watches",
".triggered_watches",
".watcher-history-*"
],
"privileges": [
"read"
]
}
],
"run_as": [],
"metadata": {
"_reserved": true
},
"transient_metadata": {
"enabled": true
}
},
....
}
And now I'd like to put it back in the cluster (or another). I cannot just PUT it to _xpack/security/role. If I understand correctly, I have to use bulk:
$ curl --user elastic:password https://elastic:9200/_xpack/security/_bulk?pretty -XPOST -H 'Content-Type: application/json' -d '
{"index":{"_index": "_xpack/security/role"}}
{"ROOOOLE" : {"cluster" : [ ],"indices" : [{"names" : [".kibana*"],"privileges" : ["read","view_index_metadata"]}],"run_as" : [ ],"metadata" : {"_reserved" : true},"transient_metadata" : {"enabled" : true}}}
'
But I get an error:
{
"took" : 3,
"errors" : true,
"items" : [
{
"index" : {
"_index" : "_xpack/security/role",
"_type" : "security",
"_id" : null,
"status" : 400,
"error" : {
"type" : "invalid_index_name_exception",
"reason" : "Invalid index name [_xpack/security/role], must not contain the following characters [ , \", *, \\, <, |, ,, >, /, ?]",
"index_uuid" : "_na_",
"index" : "_xpack/security/role"
}
}
}
]
}
Is there a way to do this easily? Or do I have to parse the JSON, and put each role one by one to:
_xpack/security/role/rolexxx
_xpack/security/role/roleyyy
...
More generally, is there a way to get all the data of an index (a config index), then upload it back or put it into another cluster?
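For the "one role at a time" option mentioned in the question, the parsing step is short. The sketch below turns the big JSON into one (path, body) pair per role and leaves sending the requests to the caller. The function name role_requests is hypothetical, and skipping roles marked "_reserved" is an assumption (built-in roles are presumed to exist already on the target cluster).

```python
import json
from typing import Any, Iterator


def role_requests(
    roles: dict[str, Any], skip_reserved: bool = True
) -> Iterator[tuple[str, str]]:
    """Yield (path, JSON body) pairs, one per role in the exported JSON.

    Assumption: roles flagged "_reserved" in their metadata are built in
    and do not need to be restored, so they are skipped by default.
    """
    for name, definition in roles.items():
        if skip_reserved and definition.get("metadata", {}).get("_reserved"):
            continue
        yield f"_xpack/security/role/{name}", json.dumps(definition)
```

Each pair can then be sent with whatever HTTP client is already used against the cluster, for example the same curl invocation with the path and body substituted.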

Avro allow blank JSON/hash for nested attributes

I need to create an Avro schema for this:
{"city":"XXXXXX", "brand":"YYYY", "discount": {} }
{"city":"XXXXXX", "brand":"YYYY", "discount": {"name": "Freedom", "value": 100} }
{"city":"XXXXXX", "brand":"YYYY", "discount": {"name": "Festive Sale", "value": 100} }
I tried the schema below, which does not work:
{ "type":"record", "name":"simple_avro",
"fields":[ { "name":"city", "type":"string" },
{ "name":"brand", "type":"string" },
{ "name":"discount",
"type":{ "type":"record", "name":"discount", "default":"",
"fields":[ { "name":"discount_name", "type":"string", "default":"null" },
{ "name":"discount_value", "type":"float", "default":0 }
] }}
] }
For the discount field, I have tried setting the default to "[]", "{}", and "", but none of these work.
I don't think an empty {} object is allowed in any case. If you want to allow no object at all, the field needs to be a union type, designated by an array for the type, and the default value goes on the outer field rather than inside the record body. With "null" listed first in the union, the default must be the JSON value null, not the string "null":
{ "name":"discount",
"type" : [
"null",
{ "type":"record", "name":"discount", "fields": [...] }
],
"default" : null
}
In general, I find that easier to express in IDL format
Then, a valid message could be {"city":"XXXXXX", "brand":"YYYY"}
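Putting the pieces together, a complete schema with the nullable discount might look like the sketch below, written as a Python dict so it can be dumped to a schema file. The inner field names name and value are an assumption taken from the sample data (the question's schema uses discount_name and discount_value), and normalise is a hypothetical helper that maps an empty {} discount to null before serialization, since the empty object itself would not match the record.

```python
import json
from typing import Any

# Sketch of the full schema; inner field names follow the sample data.
SCHEMA = {
    "type": "record",
    "name": "simple_avro",
    "fields": [
        {"name": "city", "type": "string"},
        {"name": "brand", "type": "string"},
        {
            "name": "discount",
            # Union type: "null" comes first so the default can be null.
            "type": [
                "null",
                {
                    "type": "record",
                    "name": "discount",
                    "fields": [
                        {"name": "name", "type": "string"},
                        {"name": "value", "type": "float"},
                    ],
                },
            ],
            "default": None,  # serialized as JSON null
        },
    ],
}


def normalise(record: dict[str, Any]) -> dict[str, Any]:
    """Replace a missing or empty discount with None (Avro null)."""
    cleaned = dict(record)
    if not cleaned.get("discount"):
        cleaned["discount"] = None
    return cleaned


SCHEMA_JSON = json.dumps(SCHEMA, indent=2)
```

SCHEMA_JSON can be written to an .avsc file, and normalise applied to each input record before it is handed to the Avro writer.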

ElasticSearch - how to give priority to the matching from the same row

I have the following documents in ElasticSearch 0.19.11:
{ "title": "dogs species",
"col_names": [ "name", "description", "country_of_origin" ],
"rows": [
{ "row": [ "Boxer", "good dog", "Germany" ] },
{ "row": [ "Irish Setter", "great dog", "Ireland" ] }
]
}
{ "title": "Misc stuff",
"col_names": [ "foo" ],
"rows": [
{ "row": [ "Setter is impotant" ] },
{ "row": [ "Ireland is green" ] }
]
}
The mapping is as follows:
{
"table" : {
"properties" : {
"title" : {"type" : "string"},
"col_names" : {"type" : "string"},
"rows" : {
"properties" : {
"row" : {"type" : "string"}
}
}
}
}
}
Question: I'm now searching for "Ireland Setter" and I need to have a higher score for documents that have search terms in the same row.
Currently the second document gets score of 0.22, while the first one - 0.14.
I want the first document to get a higher score in this case, since it has both "Ireland" and "Setter" in the same row. How can it be done?
With great cooperation from the ElasticSearch google-group members, a solution was found.
Here is the link to the discussion: https://groups.google.com/forum/?fromgroups#!topic/elasticsearch/4O9dff2SNhg