Transloadit: PDF to video, variable length

I need a template that converts a PDF to a video. Unfortunately, the /video/merge robot (docs) seems to require both a framerate (slide length) and a video duration. Since I don't know how many pages the PDF will have, I'm unable to supply the duration.
Is there a way around this?
This is a section of my current template:
"pdf_to_images": {
"use": ":original",
"robot": "/document/thumbs",
"format": "png",
"width": 1920,
"height": 1080
},
"encode": {
"use": {
"steps": [
{
"name": "pdf_to_images",
"as": "image"
}
]
},
"robot": "/video/merge",
"preset": "iphone",
"width": 1920,
"height": 1080,
"ffmpeg": {
"b": "8000K"
},
"framerate": "${fields.framerate}",
"duration": "100"
},
//...store...
I need to fill in "framerate" from a field in the upload, but can I fill in "duration" by dynamically counting the number of results from "pdf_to_images"?
Otherwise I'm stuck creating individual videos from each image result of "pdf_to_images" and then concatenating them, which seems rather excessive in terms of resource usage.
Thoughts?

What you could be looking for is ${file.meta.page_count}, which returns the number of pages in the PDF processed by /document/thumbs. For example, if you wanted to make a video where each page of the PDF lasts one second, you'd use a template like this:
{
  "steps": {
    ":original": {
      "robot": "/upload/handle"
    },
    "pdf_to_images": {
      "use": ":original",
      "robot": "/document/thumbs",
      "format": "png",
      "result": true,
      "width": 1920,
      "height": 1080,
      "imagemagick_stack": "v2.0.7"
    },
    "merged": {
      "robot": "/video/merge",
      "use": {
        "steps": [
          {
            "name": "pdf_to_images",
            "as": "image"
          }
        ]
      },
      "result": true,
      "framerate": "1",
      "duration": "${file.meta.page_count}",
      "ffmpeg_stack": "v4.3.1",
      "preset": "iphone",
      "resize_strategy": "fillcrop"
    }
  }
}
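One thing to keep in mind: framerate and duration are coupled. At a framerate of f frames per second, N pages fill N/f seconds of video, and as far as I know assembly variables cannot do arithmetic. Fixing the framerate at 1 as above keeps things simple, because ${file.meta.page_count} then doubles as the duration in seconds. If you drive the framerate from ${fields.framerate} as in your original template, you would have to compute the matching duration on the client and pass it in as a field as well.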

Related

How to validate properties in loopback remote method?

Recently I started learning LoopBack. I have tried to add "required", "min" and "max".
Here is my remote method in the JSON file, with parameters:
{
  "name": "registration",
  "methods": {
    "registrationIn": {
      "accepts": [
        {
          "arg": "firstname",
          "type": "string",
          "min": 1,
          "max": 25, // same here
          "required": true,
          "description": "Firstname of the person.",
          "http": {
            "source": "form"
          }
        },
        {
          "arg": "mobile",
          "type": "number",
          "min": 1, // since "required" works but this isn't checked, I removed "required" to test
          "max": 10, // not working
          "required": true, // working
          "description": "",
          "http": {
            "source": "form"
          }
        }
      ],
      "returns": [],
      "description": "This method is used for registration.",
      "http": [{
        "path": "/registrationIn",
        "verb": "post"
      }]
    }
  }
}
The "required" property is working fine, but "min" and "max" are not working.
Can anyone guide me on where I am going wrong?
Thanks
Please refer to this link: validation.
You can add validation on the model.
module.exports = function(user) {
  user.validatesLengthOf('password', {min: 5, message: {min: 'Password is too short'}});
};
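For the fields from the question, a sketch along those lines might look like this (assuming the remote method is backed by a model called registration; validatesLengthOf covers string lengths, while a numeric range needs a custom validator, since "min"/"max" inside "accepts" are not validation rules):
module.exports = function(registration) {
  // Built-in length validation works for string properties:
  registration.validatesLengthOf('firstname', {
    min: 1,
    max: 25,
    message: {min: 'Firstname is too short', max: 'Firstname is too long'}
  });

  // There is no built-in numeric range validator, so register a custom one.
  // This interprets "min": 1, "max": 10 as a digit-count check on the mobile number:
  registration.validate('mobile', function(err) {
    var digits = String(this.mobile == null ? '' : this.mobile);
    if (digits.length < 1 || digits.length > 10) err();
  }, {message: 'Mobile must be between 1 and 10 digits'});
};
Note that these validations run when a registration instance is validated or saved, not automatically on raw remote-method arguments, so your registrationIn implementation has to create or validate a model instance for them to fire.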

Cumulocity measurement representation

I create measurements at the reception of an event; I can get them using the API, but they are not represented graphically in the Device Management interface. Is there a specific format they would have to respect to be representable automatically? If so, is there a place where I can find all the formats supported by Cumulocity? I inferred c8y_TemperatureMeasurement from the examples in the docs, but I didn't find an exhaustive list of the native formats.
Here are examples of the measurements I have at the moment:
{
  "time": "2016-06-29T12:10:02.000+02:00",
  "id": "27006",
  "self": "https://<tenant-id>/measurement/measurements/27006",
  "source": {
    "id": "26932",
    "self": "https://<tenant-id>/inventory/managedObjects/26932"
  },
  "type": "c8y_BatteryMeasurement",
  "c8y_BatteryMeasurement": {
    "unit": "V",
    "value": 80
  }
},
{
  "time": "2016-06-29T10:15:22.000+02:00",
  "id": "27010",
  "self": "https://<tenant-id>/measurement/measurements/27010",
  "source": {
    "id": "26932",
    "self": "https://<tenant-id>/inventory/managedObjects/26932"
  },
  "type": "c8y_TemperatureMeasurement",
  "c8y_TemperatureMeasurement": {
    "T": {
      "unit": "C",
      "value": 24
    }
  }
}
The measurements have to be sent to Cumulocity in the following format:
{
  "fragment": {
    "series": {
      "unit": "x",
      "value": y
    }
  }
}
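Comparing your two examples against this format: the c8y_TemperatureMeasurement already has the required nesting (fragment c8y_TemperatureMeasurement, series T), but in the c8y_BatteryMeasurement the unit and value sit directly under the fragment, with no series level. A corrected version would look like this (the series name "battery" is only an example; pick whatever series name suits you):
{
  "source": {"id": "26932"},
  "time": "2016-06-29T12:10:02.000+02:00",
  "type": "c8y_BatteryMeasurement",
  "c8y_BatteryMeasurement": {
    "battery": {
      "unit": "V",
      "value": 80
    }
  }
}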

Instagram API and Pagination

I can't get all images using the Instagram API. Pagination seems to work somewhat differently than I expected, and I can't understand it yet.
I use this request:
https://api.instagram.com/v1/users/self/media/recent?access_token=TOKEN
and can get the first 20 photos:
...
{
  "attribution": null,
  "tags": [
    "beautiful",
    "instalife",
    "picoftheday",
    "beauty",
    "instalike",
    "gf",
    "traveling",
    "instatravel",
    "vsco",
    "tourism",
    "\u0438\u0441\u043f\u0430\u043d\u0438\u044f",
    "travelphoto",
    "vscogood",
    "instafollow",
    "travel",
    "\u0433\u0440\u0430\u043d\u0430\u0434\u0430",
    "amazing",
    "vscocam",
    "followme",
    "photooftheday"
  ],
  "type": "image",
  "location": null,
  "comments": {
    "count": 1
  },
  "filter": "Normal",
  "created_time": "1442825564",
  "link": "https:\/\/instagram.com\/p\/74vm3GOCEn\/",
  "likes": {
    "count": 18
  },
  "images": {
    "low_resolution": {
      "url": "https:\/\/scontent.cdninstagram.com\/hphotos-xap1\/t51.2885-15\/s320x320\/e15\/11934647_531283580370186_1131008999_n.jpg",
      "width": 320,
      "height": 320
    },
    "thumbnail": {
      "url": "https:\/\/scontent.cdninstagram.com\/hphotos-xap1\/t51.2885-15\/s150x150\/e15\/11934647_531283580370186_1131008999_n.jpg",
      "width": 150,
      "height": 150
    },
    "standard_resolution": {
      "url": "https:\/\/scontent.cdninstagram.com\/hphotos-xap1\/t51.2885-15\/e15\/11934647_531283580370186_1131008999_n.jpg",
      "width": 612,
      "height": 612
    }
  },
  "users_in_photo": [],
  "caption": {
    "created_time": "1442825564",
    "text": "#\u0413\u0440\u0430\u043d\u0430\u0434\u0430 #\u0418\u0441\u043f\u0430\u043d\u0438\u044f #photooftheday #picoftheday #instalike #followme #vscogood #vscocam #vsco #instafollow #travel #traveling #instatravel #instalife #tourism #gf #beauty #beautiful #amazing #travelphoto",
    "from": {
      "username": "solotravel_me",
      "profile_picture": "https:\/\/igcdn-photos-h-a.akamaihd.net\/hphotos-ak-xaf1\/t51.2885-19\/11282631_115839268762391_863189534_a.jpg",
      "id": "736938591",
      "full_name": "and"
    },
    "id": "1078821495951073761"
  },
  "user_has_liked": false,
  "id": "1078821489441513767_736938591",
  "user": {
    "username": "solotravel_me",
    "profile_picture": "https:\/\/igcdn-photos-h-a.akamaihd.net\/hphotos-ak-xaf1\/t51.2885-19\/11282631_115839268762391_863189534_a.jpg",
    "id": "736938591",
    "full_name": "and"
  }
}
...
After that I'm trying to use the max_id parameter, but I'm not sure which ID I need to use.
I tried the ID of the photo and the photo's user ID; I even tried the timestamp (found this idea on some forum), but every time I receive only the first 20 photos.
example:
https://api.instagram.com/v1/users/self/media/recent?access_token=TOKEN&max_id=1078821495951073761
I was having the same problem with the empty pagination object while the account actually had more photos.
And just to make things clear here: this question is answered in the comments of the original question.
There is nothing in the pagination object because the app is in sandbox mode, and apps in sandbox mode never return more than 20 posts (photos).
Use this API for getting all posts of the user:
https://i.instagram.com/api/v1/feed/user/{user_id}
For pagination, the parameter name is max_id.
Pass the max_id which is returned in this API's response.
Hope this helps!
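So a request sequence would look roughly like this, with each max_id taken from the previous response (the exact name of the response field carrying the next max ID may vary, so treat this as a sketch):
https://i.instagram.com/api/v1/feed/user/{user_id}/
https://i.instagram.com/api/v1/feed/user/{user_id}/?max_id={max_id_from_previous_response}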

Finding similar documents with Elasticsearch

I'm using ElasticSearch to develop a service that stores uploaded files or web pages as attachments (the file is one field in the document). This part works fine, as I can search these files using like_text as input. However, the second part of this service should compare a freshly uploaded file with the existing files in order to find duplicates or very similar files, so that the service doesn't recommend the same files or the same web pages to users. The problem is that I can't get the expected results for documents that are the same. The similarity between identical files varies, but is never more than 0.4. Even worse, I sometimes get better scores for files which are not the same than for two exactly identical files. The Java code given below always returns the same set of documents, in the same order, regardless of the input. It looks as if the like_text extracted from the uploaded file were always the same.
// Load the attachment mapping and index the uploaded file (as bytes) together with its metadata
String mapping = copyToStringFromClasspath("/org/prosolo/services/indexing/documents-mapping.json");
byte[] txt = org.elasticsearch.common.io.Streams.copyToByteArray(file);
Client client = ElasticSearchFactory.getClient();
client.admin().indices().putMapping(putMappingRequest(indexName).type(indexType).source(mapping)).actionGet();
IndexResponse iResponse = client.index(indexRequest(indexName).type(indexType)
    .source(jsonBuilder()
        .startObject()
            .field("file", txt)
            .field("title", title)
            .field("visibility", visibilityType.name().toLowerCase())
            .field("ownerId", ownerId)
            .field("description", description)
            .field("contentType", DocumentType.DOCUMENT.name().toLowerCase())
            .field("dateCreated", dateCreated)
            .field("url", link)
            .field("relatedToType", relatedToType)
            .field("relatedToId", relatedToId)
        .endObject()))
    .actionGet();
client.admin().indices().refresh(refreshRequest()).actionGet();

// Ask for documents similar to the one just indexed, based on the "file" field
MoreLikeThisRequestBuilder mltRequestBuilder = new MoreLikeThisRequestBuilder(client, ESIndexNames.INDEX_DOCUMENTS, ESIndexTypes.DOCUMENT, iResponse.getId());
mltRequestBuilder.setField("file");
SearchResponse response = client.moreLikeThis(mltRequestBuilder.request()).actionGet();
SearchHits searchHits = response.getHits();
System.out.println("getTotalHits:" + searchHits.getTotalHits());
Iterator<SearchHit> hitsIter = searchHits.iterator();
while (hitsIter.hasNext()) {
    SearchHit searchHit = hitsIter.next();
    System.out.println("FOUND DOCUMENT:" + searchHit.getId() + " title:" + searchHit.getSource().get("title") + " score:" + searchHit.score());
}
And the query from the browser, which looks like:
http://localhost:9200/documents/document/m2HZM3hXS1KFHOwvGY1pVQ/_mlt?mlt_fields=file&min_doc_freq=1
gives me these results:
{"took":120,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},
"hits":{"total":4,
"max_score":0.41059873,
"hits":
[{"_index":"documents","_type":"document",
"_id":"gIe6NDEWRXWTMi4kMPRbiQ",
"_score":0.41059873,
"_source" :
{"file":"PCFET0NUWVBFIGh..._skiping_the_file_content_here...",
"title":"Univariate Analysis",
"visibility":"public",
"description":"Univariate Analysis Simple Tools for Description ",
"contentType":"webpage",
"dateCreated":"null",
"url":"http://www.slideshare.net/christineshearer/univariate-analysis"}}
This is exactly the same web page, so I'm expecting the score to be 1.0, not 0.41, as there is no difference between the two documents except the _id. The results are even worse with files.
The mapping I was using is:
{
  "document": {
    "properties": {
      "title": {
        "type": "string",
        "store": true
      },
      "description": {
        "type": "string",
        "store": "yes"
      },
      "contentType": {
        "type": "string",
        "store": "yes"
      },
      "dateCreated": {
        "store": "yes",
        "type": "date"
      },
      "url": {
        "store": "yes",
        "type": "string"
      },
      "visibility": {
        "store": "yes",
        "type": "string"
      },
      "ownerId": {
        "type": "long",
        "store": "yes"
      },
      "relatedToType": {
        "type": "string",
        "store": "yes"
      },
      "relatedToId": {
        "type": "long",
        "store": "yes"
      },
      "file": {
        "path": "full",
        "type": "attachment",
        "fields": {
          "author": {
            "type": "string"
          },
          "title": {
            "store": true,
            "type": "string"
          },
          "keywords": {
            "type": "string"
          },
          "file": {
            "store": true,
            "term_vector": "with_positions_offsets",
            "type": "string"
          },
          "name": {
            "type": "string"
          },
          "content_length": {
            "type": "integer"
          },
          "date": {
            "format": "dateOptionalTime",
            "type": "date"
          },
          "content_type": {
            "type": "string"
          }
        }
      }
    }
  }
}
Does anyone have an idea what could be wrong here?
Thanks

Specific analyzers for sub-documents in lucene / elasticsearch

After reading the documentation, testing, and reading a lot of other questions here on Stack Overflow:
We have documents that have titles and descriptions in multiple languages. There are also tags that are translated to the same languages. There might be up to 30-40 different languages in the system, but probably only 3 or 4 translations for a single document.
This is the planned document structure:
{
  "luck": {
    "id": 10018,
    "pub": 0,
    "pr": 100002,
    "loc": {
      "lat": 42.7,
      "lon": 84.2
    },
    "t": [
      {
        "lang": "en-analyzer",
        "title": "Forest",
        "desc": "A lot of trees.",
        "tags": [
          "Wood",
          "Nature",
          "Green Mouvement"
        ]
      },
      {
        "lang": "fr-analyzer",
        "title": "Forêt",
        "desc": "A grand nombre d'arbre.",
        "tags": [
          "Bois",
          "Nature",
          "Mouvement Vert"
        ]
      }
    ],
    "dates": [
      "2014-01-01T20:00",
      "2014-06-06T20:00",
      "2014-08-08T20:00"
    ]
  }
}
Possible queries are "arbre", "wood", "forest", or "nature", combined with a date and a geo_distance filter; furthermore, there will be some facets over the tags array (which obviously include counting).
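For illustration, one such query could look roughly like the following, in the pre-2.x query DSL that matches the _analyzer-era mapping below (field names are taken from the planned structure; this is a sketch, not a tested query):
{
  "query": {
    "filtered": {
      "query": {
        "multi_match": {
          "query": "arbre",
          "fields": ["t.title", "t.desc", "t.tags"]
        }
      },
      "filter": {
        "and": [
          {
            "geo_distance": {
              "distance": "100km",
              "loc": {"lat": 42.7, "lon": 84.2}
            }
          },
          {
            "range": {"dates": {"gte": "2014-01-01", "lte": "2014-12-31"}}
          }
        ]
      }
    }
  },
  "facets": {
    "tags": {"terms": {"field": "t.tags"}}
  }
}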
We can produce any document structure that fits best for elasticsearch (or for lucene). It's crucial that each language is analyzed specifically, so we use "_analyzer" in order to distinguish the languages.
{
  "luck": {
    "properties": {
      "id": {
        "type": "long"
      },
      "pub": {
        "type": "long"
      },
      "pr": {
        "type": "long"
      },
      "loc": {
        "type": "geo_point"
      },
      "t": {
        "_analyzer": {
          "path": "t.lang"
        },
        "properties": {
          "lang": {
            "type": "string"
          },
          "properties": {
            "title": {
              "type": "string"
            },
            "desc": {
              "type": "string"
            },
            "tags": {
              "type": "string"
            }
          }
        }
      }
    }
  }
}
A) Apparently, this idea does not work: after PUTting the mapping, we retrieve the same mapping (GET) and it seems to ignore the specific analyzers (a test with a top-level "_analyzer" worked fine). Does "_analyzer" work for sub-documents, and if yes, how should we refer to it? We also tested declaring the sub-document as "object" and as "nested". How is multi-language document indexing supposed to work?
B) One possibility would be to put each language in its own document. In that case, how do we manage the ID? In the end, both documents should refer to the same ID. For example, if the user searches for "nature" (and we don't know whether the user means "nature" in English or in French), this document would appear twice in the result set, and the counting and paging would be very wrong (facet counting, too).
Any ideas?