ILM policy and template - indexing

Can I use an ILM template for my custom index? If so, will it roll over, and to what name? My index is created by Logstash and is named indexname-team. Will it roll over using the template, and what will the rolled-over index be called?

The rollover happens based on the conditions you have configured in the policy. The rolled-over index keeps the original index name and appends an incrementing integer suffix.
For example, if your index name is sample-team, the rollover indices look like this:
sample-team-000001 // the initial index, which you create manually
sample-team-000002 // all subsequent indices are created automatically by the policy
sample-team-000003
sample-team-000004
Reference: https://www.elastic.co/guide/en/elasticsearch/reference/current/set-up-lifecycle-policy.html
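For reference, a rough sketch of the setup (the policy name and the rollover conditions below are illustrative, not taken from the question):

PUT _ilm/policy/sample-team-policy
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "rollover": {
            "max_age": "30d",
            "max_size": "50gb"
          }
        }
      }
    }
  }
}

PUT sample-team-000001
{
  "aliases": {
    "sample-team": { "is_write_index": true }
  },
  "settings": {
    "index.lifecycle.name": "sample-team-policy",
    "index.lifecycle.rollover_alias": "sample-team"
  }
}

Writes then go through the sample-team alias; once a rollover condition is met, ILM creates sample-team-000002 and moves the write alias to it. In Logstash you would typically point the output at the alias (or use the output's ILM options) rather than at a concrete index name.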

Related

Nifi add flow file attributes to S3 Object (PutS3Object) Metadata

I have a simple flow consisting of
GenerateFlowFile ----> PutS3Object ----> Wait
And the generated flow files are getting stored in the bucket correctly.
Now I want to add Metadata to my flow file.
If I add a property "Test1" to PutS3Object, it shows up as "X-Amz-Meta-Test1" in the metadata of the object.
But if I add a property "Test2" in GenerateFlowFile it doesn't show up in metadata.
I tried adding "Test2" as s3.usermetadata.Test2 but it still didn't work.
Is there a way to pass all the flow file attributes as metadata without explicitly adding properties in PutS3Object?
PutS3Object only inserts metadata values that you have set as Dynamic Properties on the PutS3Object processor itself. Please see the docs link and look at the Dynamic Properties section.
PutS3Object does not simply copy every Attribute into metadata; otherwise you would end up with potentially hundreds of metadata entries that you aren't interested in. The only Attribute it reads by default is filename - please see the Reads Attributes section of the docs.
If you have an existing Attribute and you want to push its value into the metadata, you must add a Dynamic Property to PutS3Object and reference the value of the Attribute.
E.g. you have an Attribute called file_author with a value Steve and you want the S3 object to have the metadata field author with the value Steve:
You would add a Dynamic Property to PutS3Object with a name of author and a value of ${file_author}.
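For illustration, with the names from the example above, the configuration and the resulting object metadata would look roughly like this:

Dynamic Property on PutS3Object:
    author = ${file_author}

Resulting S3 object metadata (when file_author is "Steve"):
    x-amz-meta-author: Steve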
Edit:
You could fork PutS3Object into a custom processor to add the dynamic functionality you want, but I would recommend just using the standard PutS3Object config and manually configuring the Attributes you want.

Adding Prefix when creating an Index using Jredisearch

I use JRediSearch (com.redislabs:jredisearch:2.0.0) to store data in an index. I want to add a prefix while creating the index. I am able to add a prefix using the RediSearch command below:
FT.CREATE MyIndex ON HASH PREFIX 1 doc: SCHEMA name TEXT
But I am not able to find an option for this when writing it in Java. I use the following code:
client.createIndex(schema, Client.IndexOptions.defaultOptions());
Could anyone suggest how to add a prefix when using JRediSearch?
The IndexDefinition class has a setPrefixes(...) method which serves your purpose.
Note: you may have to create the IndexDefinition using new IndexDefinition().
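A minimal sketch based on the JRediSearch 2.0 API (method names should be verified against your exact version), mirroring the FT.CREATE command above:

import io.redisearch.Schema;
import io.redisearch.client.Client;
import io.redisearch.client.IndexDefinition;

// ...

Schema schema = new Schema().addTextField("name", 1.0);

// Equivalent of PREFIX 1 doc:
IndexDefinition definition = new IndexDefinition().setPrefixes("doc:");

// Pass the definition through the index options instead of defaultOptions() alone
client.createIndex(schema, Client.IndexOptions.defaultOptions().setDefinition(definition));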

How to query Ravendb document size

I have a scenario in which I need to find the largest documents in my RavenDB database.
When I select any given document in RavenDB Studio, the size is displayed in the Properties section, as circled in red in this screenshot:
Is there a query I can run that will order documents by this Size property so that I can identify the largest documents?
Maybe write a method that calculates your object size, probably using reflection.
Then create a static Map index with a 'size' field,
and set it with the method you provide in the index's 'additional sources'.
See https://ravendb.net/docs/article-page/4.2/Csharp/studio/database/indexes/create-map-index#additional-sources
You could then query this index and order by the 'size' field.
FYI - you can get a specific document's size using the following endpoint:
{yourServerUrl}/databases/{yourDatabaseName}/docs/size?id={yourDocumentId}
Learn about the RavenDB REST API at:
https://ravendb.net/docs/article-page/4.2/csharp/client-api/rest-api/rest-api-intro
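For example (the server URL, database name, and document id here are purely illustrative):

curl "http://localhost:8080/databases/MyDatabase/docs/size?id=orders/1-A"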
Index (Map) definition:
from doc in docs
select new {
    doc.BlittableJson.Size
}
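You could then order by that field when querying the index, roughly like this (the index name 'Documents/BySize' is illustrative):

from index 'Documents/BySize'
order by Size as long desc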

Nested field in Solr 5.2

I'm new to Solr and I have a very specific problem that I need to solve:
I have a csv file that contains my Solr documents. Now, I have a column (field) that's not only multiValued, but also contains 'subfields',
for example:
"id":"0101",
"addMaterials":[{"name":"Mat1", "property":"prop1"},
{"name":"Mat2","property":"prop2"},
{"name":"Mat3","property":"prop3"}],
"mainProperty":"mainproperty1",
"URL":"http://www.mySite..."
where id, addMaterials, mainProperty, and URL are my main fields while 'name' and 'property' are my subfields. I know that Solr is designed to handle denormalized documents but denormalizing is not a possible solution for my application.
What I'm thinking is to just separate my data set and move the fields (that have subfields) to another document and somehow make a new field to link it to the original document (e.g. fromIdField).
Is there any other solution for this? My minimum goal is to index the values of the addMaterials field (even without indexing the subfields),
from:
"addMaterials":[{"name":"Mat1", "property":"prop1"},
{"name":"Mat2","property":"prop2"},
{"name":"Mat3","property":"prop3"}],
to
"addMaterials":{"name":"Mat1", "property":"prop1"}
"addMaterials":{"name":"Mat2", "property":"prop2"}
"addMaterials":{"name":"Mat3", "property":"prop3"}
Thanks in advance.
I have found a solution to my problem. Instead of separating my data set, I kept addMaterials as a multiValued field and ignored the subfields, so I only have one multiValued field to be indexed. I used Solr's /update/csv request handler to index my csv file and set },{ as the separator for the addMaterials multiValued field. The indexed document looks like this:
"addMaterials": ["[{\"name\":\"Mat1\", \"property\":\"prop1\"",
"\"name\":\"Mat2\", \"property\":\"prop2\"",
"\"name\":\"Mat3\", \"property\":\"prop3\"}]"]
I indexed my document using this:
curl "http://localhost:8983/solr/<coreName>/update/csv?
stream.file=C:/userName/Solr/solr-5.2.0/documentFolder/myFile.csv&
f.addMaterials.split=true&
f.addMaterials.separator=\},\{&
stream.contentType=text/plain;charset=utf-8"
Also, this assumes that addMaterials is a multiValued field, so make sure you modify your schema before indexing your document using the procedure above; otherwise it will give an error saying that the field is not multiValued.
Of course, if you need to query against the subfields then I guess you can use Solr's {!join} query parser.
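For illustration, if you did split the subfields into separate documents linked back to the parent by a fromIdField (as considered in the question), a join query could look roughly like this:

q={!join from=fromIdField to=id}name:Mat1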

Neo4j super crazy auto index issue

I use the REST API to create nodes.
I have auto-indexing enabled for the nodes' "Name" and "ObjectType" properties.
It worked great before,
but recently I found that it doesn't index some nodes.
Some nodes are only indexed on "ObjectType", which means the Cypher below works:
start n=node:node_auto_index(ObjectType="User")
where id(n) = 123
return n
but this will not work:
start n=node:node_auto_index(Name="demouser")
where id(n) = 123
return n
If I change the Name from "demouser" to something like "anotheruser", it is still not in node_auto_index.
The only workaround is to change the Name from "demouser" to a number like 123 and then change it back to "demouser";
after that I can query it with node_auto_index(Name="demouser"), as shown in the sketch below.
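For illustration only, that workaround can be scripted in Cypher against the node's internal id (123 is just the id from the queries above):

start n=node(123)
set n.Name = 123
return n

start n=node(123)
set n.Name = "demouser"
return n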
I really don't know what is going on, and I cannot find all the nodes that have not been auto-indexed.
I'm using Neo4j 1.9.4.
Updated with my neo4j.properties:
# Autoindexing
# Enable auto-indexing for nodes, default is false
node_auto_indexing=true
# The node property keys to be auto-indexed, if enabled
node_keys_indexable=Name,ObjectType
#relationship_auto_indexing=true
keep_logical_logs=true
online_backup_enabled=true
online_backup_server=127.0.0.1:6362
execution_guard_enabled=true
Thanks.