Count of documents is 0 after inserting data with NEST

I am using Nest with the following connection settings:
var connectionPool = new SingleNodeConnectionPool(new Uri("http://localhost:9200"));
var settings = new ConnectionSettings(connectionPool, new InMemoryConnection());
settings.DisableDirectStreaming(true); // needed to see good looking debug log on insert
settings.DefaultIndex(Index);
Client = new ElasticClient(settings);
With new InMemoryConnection(), I hope to use NEST to query changing data inside an Azure Function.
Strangely, the debug logs look promising when indexing:
/*
var res = await Client.IndexManyAsync(response.Elements, Index); //
Console.WriteLine(res.DebugInformation);
*/
/*
var res = await Client.IndexAsync(response, i => i.Index(Index)); // Index = "data"
Console.WriteLine(res.DebugInformation); // <--
*/
But when I log the count directly after the insertions, it is 0:
// var anyDocs = await Client.CountAsync<OverpassElement>(c => c.Index(Index));
var anyDocs = await Client.CountAsync<OverpassElement>(c => c);
Console.WriteLine("count: " + anyDocs.Count);
...even though the entire JSON data is logged with the insertion.
Why can't I count the documents after insertion (so that I can search them in a next step)?
Actually, I get:
Invalid NEST response built from a successful (200) low level call on POST: /data/_doc
And there are 0 Items on the IndexResponse returned by the insert.
The data consists of elements like the following, which is part of an array containing 4221 such items:
{
"type": "relation",
"id": 8353694,
"timestamp": "2018-06-04T22:54:27Z",
"version": 1,
"changeset": 59551528,
"user": "asdf2",
"uid": 1416503,
"members": [
{
"type": "way",
"ref": 89956942,
"role": "from"
},
{
"type": "node",
"ref": 1042756547,
"role": "via"
},
{
"type": "way",
"ref": 89956938,
"role": "to"
}
],
"tags": {
"restriction": "no_left_turn",
"type": "restriction"
}
},

Elasticsearch has many similarities to a NoSQL data store. In this case, "read after write" is not guaranteed by default. When the index API call returns success, it doesn't mean "this document is now available for searching"; it means "Elasticsearch has accepted your document and it will be available for searching shortly". Elasticsearch uses eventual consistency by default.
However, this can be annoying during testing. So Elasticsearch has a Refresh API that makes all documents indexed so far available for searching before it returns. I strongly recommend that you do not call this in production; only in test code.
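To make the "accepted" versus "searchable" distinction concrete, here is a toy model in Python. It is only an illustration: the ToyIndex class and its methods are invented for this sketch and say nothing about how Elasticsearch is implemented internally.

```python
class ToyIndex:
    """Toy model of near-real-time search; not the Elasticsearch implementation."""

    def __init__(self):
        self._buffer = []      # accepted, but not yet visible to searches
        self._searchable = []  # visible to searches and counts

    def index(self, doc):
        # The write is acknowledged as soon as the document is accepted.
        self._buffer.append(doc)
        return {"result": "created"}

    def refresh(self):
        # Make everything accepted so far visible to searches.
        self._searchable.extend(self._buffer)
        self._buffer.clear()

    def count(self):
        return len(self._searchable)


idx = ToyIndex()
idx.index({"type": "relation", "id": 8353694})
count_before_refresh = idx.count()
idx.refresh()
count_after_refresh = idx.count()
print(count_before_refresh, count_after_refresh)
```

In this model the count only changes once refresh() has run, which is the behaviour a test would have to wait for or trigger explicitly.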

At the risk of reviving an old question, this answer from Russ Cam explains that InMemoryConnection does not actually run the operation against Elasticsearch.
InMemoryConnection doesn't actually send any requests or receive any responses from Elasticsearch; used in conjunction with .SetConnectionStatusHandler() on Connection settings (or .OnRequestCompleted() in NEST 2.x+), it's a convenient way to see the serialized form of requests.
So you can inspect the query that NEST generates from your code but you won't be able to observe the results.
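The effect can be imitated with a small stand-in, sketched here in Python. RecordingConnection is a hypothetical class written for this illustration, not part of NEST, and the canned 200 reply with an empty body is an assumption modelled on the status code shown in the question.

```python
import json


class RecordingConnection:
    """Hypothetical stand-in for an in-memory connection: nothing leaves the process."""

    def __init__(self):
        self.requests = []

    def send(self, method, path, body=None):
        serialized = json.dumps(body) if body is not None else None
        self.requests.append((method, path, serialized))
        # Canned reply: HTTP 200 with an empty JSON object, whatever was sent.
        return 200, {}


conn = RecordingConnection()
index_status, index_reply = conn.send("POST", "/data/_doc", {"type": "relation", "id": 8353694})
count_status, count_reply = conn.send("GET", "/data/_count")

# The request was serialized and can be inspected...
print(conn.requests[0])
# ...but the reply carries no real result, so a count read from it falls back to 0.
print(count_status, count_reply.get("count", 0))
```

Every call "succeeds", yet no document is stored anywhere, which matches a count of 0 right after an apparently successful insert.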

I don't know what NEST is, but I'd bet $100 that, if it uses transactional concepts, you need to commit before the count shows up correctly.


What is the correct way to implement update of the entity? (Symfony 6)

I have an event entity.
What is the correct way to implement an update of this entity? Our frontend developer wants everything to be done with a single PUT request to the backend: changing the values of the title and description fields, as well as adding, deleting, and editing prices and event_dates.
I made separate endpoints: PUT /event/{id}, PUT /price/{id}, and PUT /event_date/{id}.
What can you recommend?
{
"id": 504,
"title": "First Event",
"description": "Description of First Event",
"created_at": "2022-08-16T08:42:11+00:00",
"prices": [
{
"id": 4,
"value": "12.99",
"type": "regular",
"is_entrance_free": false,
"info": "some extra infos",
"sorting": 7
}
],
"event_dates": [
{
"id": 2,
"start_date": "2022-12-10",
"end_date": "2022-12-31",
"start_time": "13:00",
"end_time": "16:00",
"entrance_time": "12:30",
"is_open_end": false,
"info": "7"
}
]
}
A standard approach is to PUT (or POST) JSON containing either the complete record, which overwrites the old one while keeping the same ID, or only a subset of its fields.
The request goes to an endpoint such as PUT /event/{id}, where the action loads the current record and applies the JSON from the request body to it.
<?php
// various use statements as required, including:
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\Routing\Annotation\Route;
use Symfony\Component\Serializer\Normalizer\AbstractNormalizer;
use Symfony\Component\Serializer\SerializerInterface;
class ApiEventController extends AbstractController
{
#[Route('/api/event/{id}', methods: ['PUT'])]
public function eventPut(
Request $request,
\App\Entity\Event $event, // fetched by ID from the URL
SerializerInterface $serializer,
EntityManagerInterface $entityManager
): JsonResponse {
// Security here - ensure the current user has permission to access & edit the event
// a custom Deserializer can restrict what is used from the content
// for example, ensuring the ID, or other fields are not changed.
$serializer->deserialize(
$request->getContent(),
\App\Entity\Event::class,
'json',
[
// takes the new values, from the request content,
// and update the old value, fetched by ID from the URL
AbstractNormalizer::OBJECT_TO_POPULATE => $event,
]
);
// $event is now the mix of the old, and new
$entityManager->persist($event);
$entityManager->flush();
// return the updated event details
return $this->json($event);
}
}
Updating more complex content (such as replacing an array of prices or event_dates within the main entity) will need other deserializers, plus configuration in the Event entity and others, so that the Symfony Serializer component understands what is required. https://symfony.com/doc/current/components/serializer.html and https://symfonycasts.com/tracks/symfony have more information and tutorials that will assist in learning more.
API Platform can make much of this easier for the simpler cases, but an understanding of the basics is still a useful foundation.
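The "populate an existing object from the request body" idea is independent of Symfony. The following Python sketch shows the same pattern under simplifying assumptions: the Event dataclass, the PROTECTED_FIELDS set and the populate helper are all invented for this illustration and only cover flat fields, not nested prices or event_dates.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Event:
    id: int
    title: str
    description: str


# Fields a client is not allowed to overwrite (an assumption for this sketch).
PROTECTED_FIELDS = {"id"}


def populate(event: Event, payload: str) -> Event:
    """Copy known, non-protected fields from a JSON payload onto an existing object."""
    data = json.loads(payload)
    allowed = {f.name for f in fields(event)} - PROTECTED_FIELDS
    for name, value in data.items():
        if name in allowed:
            setattr(event, name, value)
    return event


event = Event(id=504, title="First Event", description="Description of First Event")
populate(event, '{"id": 999, "title": "Renamed Event"}')
print(event)
```

Fields missing from the payload keep their old values, and the attempt to change the ID is silently dropped, which mirrors the role of a restrictive deserializer.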

Inability to call SPROC from Azure Logic Apps - can't find syntax for the parameters

Statement of intent:
I'm trying to automate a workflow, moving data periodically from a CSV in Sharepoint into a table in Azure SQL Database. I've gotten so far as 1) Formatting a JSON array, and 2) Creating a SPROC that successfully takes the text of the JSON Array, and imports it into the appropriate table.
Array appears like:
JSON = [{"col1":"col1Data","col2":"col2Data", ...}, <600-some more iterations>]
Invocation of stored procedure in SQL Management Studio looks like:
EXECUTE SprocName @json=N'<text of JSON above>'
===========================================
Problem:
Lack of documentation allowing me to properly format one of the following two SQL Connectors' parameters to link these two statements together:
Both Execute a Query (v2) and Execute a Stored Procedure (v2) require that parameters or query text be provided, but no indication of how said parameters should be formatted.
For example, in terms of executing a stored procedure that takes a single parameter @json, the following text "looks" correct, but results in an error:
"body": "@json=N'+@string(outputs('Convert_Rows_To_Json').body)+'"
Error:
Failed to save logic app UpdateDomainCoverage. The template validation failed: 'The template action 'Execute_stored_procedure_(V2)' at line '1' and column '3148' is not valid: "The template language expression 'json=N'+@string(outputs('Convert_Rows_To_Json').body)+'' is not valid: the string character '=' at position '4' is not expected.".'.
I've tried a number of variations, for both the @json parameter on Execute Stored Procedure, or simply building the query from whole cloth in Execute SQL, to no avail. Suggestions?
Here is a sample from Code View of calling a stored procedure with a parameter 'from' that takes a datetime value. When you pick the sproc in the Designer, it should show all the parameters for you to populate.
"Get_jobs": {
"inputs": {
"body": {
"from": "@{convertFromUtc( variables('SelectTime'), variables('timeZone'), 'yyyy-MM-dd HH:mm:ss')}"
},
"host": {
"connection": {
"name": "@parameters('$connections')['sql_2']['connectionId']"
}
},
"method": "post",
"path": "/datasets/default/procedures/@{encodeURIComponent(encodeURIComponent('[dbo].[GetJobs]'))}"
},
"runAfter": {
"Refresh_data_for_BI": [
"Succeeded"
]
},
"type": "ApiConnection"
},
OK, I've been messing with this on-and-off in between other tasks today, and finally got tired of trying to get it done in the input of the "Execute query".
Brute Force Solution: I added another JavaScript step, with the following code:
var input = workflowContext.actions.Convert_Rows_To_Json.outputs.body;
var sqlQuery = 'EXECUTE [ImportDomainCoverage] N\'' + input + '\'';
return sqlQuery;
It's not pretty (one more step), but it works.
Now to see if I can modify things sufficiently to parameterize the table name, rather than needing six steps for each table.
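One caveat with building the statement by concatenation: any single quote inside the JSON text ends the T-SQL string literal early. T-SQL escapes a quote inside a literal by doubling it. The Python sketch below shows that escaping step; build_exec_statement is a hypothetical helper and the sample row is made up, and a parameterised call remains the safer option where the connector allows it.

```python
import json


def build_exec_statement(procedure: str, rows: list) -> str:
    """Build an EXECUTE statement with the JSON payload as an N'...' literal."""
    payload = json.dumps(rows)
    # T-SQL escapes a single quote inside a string literal by doubling it.
    escaped = payload.replace("'", "''")
    return f"EXECUTE [{procedure}] N'{escaped}'"


statement = build_exec_statement(
    "ImportDomainCoverage",
    [{"col1": "O'Brien", "col2": "col2Data"}],
)
print(statement)
```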
Finally figured out the syntax. Didn't find any documentation, just tried working from one error message to another.
"Pump_data_into_target_table": {
"inputs": {
"body": {
"json": "@{body('Pull_FeedbackItems_from_source').ResultSets['Table1']}"
},
"headers": {
"Content-Type": "application/json"
},
"host": {
"connection": {
"name": "@parameters('$connections')['sql_2']['connectionId']"
}
},
"method": "post",
"path": "/v2/datasets/@{encodeURIComponent(encodeURIComponent('servername.database.windows.net'))},@{encodeURIComponent(encodeURIComponent('dbname'))}/procedures/@{encodeURIComponent(encodeURIComponent('sprocname'))}"
},
"runAfter": {
"Pull_FeedbackItems_from_Source": [
"Succeeded"
]
},
"type": "ApiConnection"
}
The fundamental answer to my question was: provide the parameter/value pairs as a JSON object. See the value of the "body" element in the listing above. For this to work, though, one also has to enter the "headers" element, which I didn't even see documented on the API call. Was led to that by an error message stating that the content type was plain text, when it was clearly json.
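As a rough illustration of that shape, the Python sketch below assembles the relevant part of the action inputs as plain data. The sample rows are made up, and the assumption that the parameter value travels as a JSON string (rather than a nested array) follows from the string interpolation used in the listing above.

```python
import json

# Sample rows standing in for the converted CSV content (made-up values).
rows = [{"col1": "col1Data", "col2": "col2Data"}]

action_inputs = {
    "method": "post",
    # Without this header the body was reported as plain text.
    "headers": {"Content-Type": "application/json"},
    # One key per stored-procedure parameter, named without its prefix character.
    "body": {"json": json.dumps(rows)},
}

print(json.dumps(action_inputs, indent=2))
```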

How To Get Particular Security Advisory Repository in Graphql

I have tried this query:
# Type queries into this side of the screen, and you will
# see intelligent typeaheads aware of the current GraphQL type schema,
# live syntax, and validation errors highlighted within the text.
# We'll get you started with a simple query showing your username!
query {
securityAdvisories(orderBy: {field: PUBLISHED_AT, direction: DESC}, first: 2) {
nodes {
description
ghsaId
summary
publishedAt
}
}
}
and got the response below:
{
"data": {
"securityAdvisories": {
"nodes": [
{
"description": "In Symfony before 2.7.51, 2.8.x before 2.8.50, 3.x before 3.4.26, 4.x before 4.1.12, and 4.2.x before 4.2.7, when service ids allow user input, this could allow for SQL Injection and remote code execution. This is related to symfony/dependency-injection.",
"ghsaId": "GHSA-pgwj-prpq-jpc2",
"summary": "Critical severity vulnerability that affects symfony/dependency-injection",
"publishedAt": "2019-11-18T17:27:31Z"
},
{
"description": "Tapestry processes assets `/assets/ctx` using classes chain `StaticFilesFilter -> AssetDispatcher -> ContextResource`, which doesn't filter the character `\\`, so attacker can perform a path traversal attack to read any files on Windows platform.",
"ghsaId": "GHSA-89r3-rcpj-h7w6",
"summary": "Moderate severity vulnerability that affects org.apache.tapestry:tapestry-core",
"publishedAt": "2019-11-18T17:19:03Z"
}
]
}
}
}
But I want to get the response for one specific security advisory, i.e. the GraphQL response for a given ID.
For example, for the advisory with the ID GHSA-wmx6-vxcf-c3gr.
Thanks!
The simplest way would be to use the securityAdvisory() query.
query {
securityAdvisory(ghsaId: "GHSA-wmx6-vxcf-c3gr") {
ghsaId
summary
}
}
If you need to use the securityAdvisories() query for some reason, you simply have to add an identifier: argument. The following query should get the distinct entry for GHSA-wmx6-vxcf-c3gr.
query {
securityAdvisories(identifier: {type: GHSA, value: "GHSA-wmx6-vxcf-c3gr"}, first: 1) {
nodes {
ghsaId
summary
}
}
}
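When the lookup is done from code rather than the explorer, the ID is usually passed as a GraphQL variable instead of being pasted into the query text. The Python sketch below only builds the JSON request body; build_request_body is a hypothetical helper, and actually sending it (a POST to https://api.github.com/graphql with an authorization token) is left out.

```python
import json

# The advisory ID is supplied through the $ghsaId variable.
QUERY = """
query($ghsaId: String!) {
  securityAdvisory(ghsaId: $ghsaId) {
    ghsaId
    summary
  }
}
"""


def build_request_body(ghsa_id: str) -> str:
    """Return the JSON body for a GraphQL POST request."""
    return json.dumps({"query": QUERY, "variables": {"ghsaId": ghsa_id}})


body = build_request_body("GHSA-wmx6-vxcf-c3gr")
print(body)
```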

Extracting additional data with query with keen.io

I have a (simplified) query that looks as follows.
var pageViews = new Keen.Query('count', {
eventCollection: 'Loaded a Page',
groupBy: 'company.id'
});
And use it as follows.
client.run(pageViews, function(result, error) {
// Do something here
});
This will give me the following JSON to work with:
{
"result": [
{
"company.id": 1,
"result": 3
},
{
"company.id": 2,
"result": 11
},
{
"company.id": 3,
"result": 7
}
]
}
However, I would also like to get back the name of each company, i.e. the company.name property. I looked through keen.io's documentation, and I could find no way of doing this. Is there a way to do this? Logically speaking, I don't see any reason why it would not be possible, but the question is if it has been implemented.
Grouping by multiple properties will get you what you're looking for:
var pageViews = new Keen.Query('count', {
eventCollection: 'Loaded a Page',
groupBy: ['company.id','company.name']
});
That being said, it's important to note that Keen is not an entity database. Keen is optimized to store and analyze event data, which is different than entity data. More complex uses of entity data may not perform well using this solution.
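With both properties in the grouping, each result row should carry the name next to the ID. The Python sketch below reshapes such a response into a lookup keyed by company ID. The response shape is assumed to follow the single-property result shown in the question, and the company names are made up.

```python
# Assumed response shape for groupBy: ['company.id', 'company.name'].
response = {
    "result": [
        {"company.id": 1, "company.name": "Acme", "result": 3},
        {"company.id": 2, "company.name": "Globex", "result": 11},
        {"company.id": 3, "company.name": "Initech", "result": 7},
    ]
}

# Key each row by company ID and keep the name next to its count.
companies = {
    row["company.id"]: {"name": row["company.name"], "page_views": row["result"]}
    for row in response["result"]
}

print(companies[2])
```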

Design pattern - update join table through REST API

I'm struggling with a REST API design concept. I have these classes:
user:
- first_name
- last_name
metadata_fields:
- field_name
user_metadata:
- user_id
- field_id
- value
- unique index on [user_id, field_id]
Ok, so users have many metadata and the type of metadata is defined in metadata_fields. Typical HABTM with extra data in the join table.
If I were to update user_metadata through a Rails form, the data would look like this:
user_metadata: {
id: 1,
user_id: 2,
field_id: 3,
value: 'foo'
}
If I posted to the user#update controller, the data would look like this:
user: {
user_metadata: {
id: 1,
field_id: 3,
value: 'foo'
}
}
The trouble with this approach is that we're ignoring the uniqueness of the user_id/field_id relationship. If I change the field_id in either update, I'm not just changing data, I'm changing the meaning of that data. This tends to work fine in Rails because it's somewhat of a walled garden, but it breaks down when you open up an API endpoint.
If I allow this:
PATCH /api/user_metadata
Then I'm opening myself up to someone modifying the user_id or field_id or both. Similarly with this:
PATCH /api/user/:user_id/metadata
Now user_id is set but field_id can still change. So really the only way to solve this is to limit the update to a single field:
PATCH /api/user/:user_id/metadata/:field_id
Or a bulk update:
PATCH /api/user/:user_id/metadata
But with that call, we have to modify the data structure so that the uniqueness of the user_id/field_id relationship is intact:
user_metadata: {
field_id1: 'value1',
field_id2: 'value2',
...
}
I'd love to hear thoughts here. I've scoured Google and found absolutely nothing. Any recommendations?
As metadata belongs to a certain user, /api/user/{userId}/metadata/{metadataId} is probably the cleanest URI for a single metadata resource of a user. The URI of your resource is already the unique key you are looking for. There can't be two resources with the same URI! Furthermore, the URI already contains the user and field IDs.
A request like GET /api/user/1 HTTP/1.1 could return a HAL-like representation like the one below:
{
"user" : {
"id": "1",
"firstName": "Max",
"lastName": "Sample",
...
"_links": {
"self" : {
"href": "/api/user/1"
}
},
"_embedded": {
"metadata" : {
"fields" : [{
"id": "1",
"type": "string",
"value": "foo",
"_links": {
"self": {
"href": "/api/user/1/metadata/1"
}
}
}, {
"id": "2",
"type": "string",
"value": "bar",
"_links": {
"self": {
"href": "/api/user/1/metadata/2"
}
}
}],
"_links": {
"self": {
"href": "/api/user/1/metadata"
}
}
}
}
}
}
Of course, you could send a PUT or a PATCH request to modify an existing metadata field. The URI of the resource will still be the same, though (unless you move or delete a resource within a PATCH request).
You can also ignore certain fields on incoming PUT requests, which prevents modification of fields like id or _links. I assume this is also valid for PATCH requests, though I would have to re-read the spec to be sure.
Therefore, I'd suggest ignoring any id or _links fields contained in requests and updating the remaining fields. But you also have the option to return a 403 Forbidden or 409 Conflict response if someone tries to update an ID field.
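Both behaviours, ignoring identifier fields or rejecting the request, fit in a few lines. The Python sketch below is only an illustration: READ_ONLY, apply_update and the strict flag are names invented here, and the choice of 409 over 403 is arbitrary.

```python
READ_ONLY = {"id", "_links"}


def apply_update(resource: dict, payload: dict, strict: bool = False):
    """Return (status, resource) after applying payload to a copy of resource."""
    changed_ids = {
        key for key in READ_ONLY & payload.keys()
        if payload[key] != resource.get(key)
    }
    if strict and changed_ids:
        return 409, resource  # reject: the client tried to change an identifier
    # Lenient mode: drop read-only keys and merge the rest.
    writable = {k: v for k, v in payload.items() if k not in READ_ONLY}
    return 200, {**resource, **writable}


field = {"id": "1", "type": "string", "value": "foo"}
lenient = apply_update(field, {"id": "7", "value": "bar"})
strict = apply_update(field, {"id": "7", "value": "bar"}, strict=True)
print(lenient)
print(strict)
```

In lenient mode the value changes while the ID stays "1"; in strict mode the same payload is refused and the stored resource is returned untouched.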
UPDATE
If you want to update multiple fields within a single request, you have two options:
Use PUT and replace the current set of fields with the new version
Use PATCH and send the server the steps needed to transform the current field set into the new field set
Example PUT:
PUT /api/user/1/metadata HTTP/1.1
{
"metadata": {
"fields": [{
"type": "string",
"value": "newFoo"
}, {
"type": "string",
"value": "newBar"
}]
}
}
This request would first delete every stored metadata field of the user the metadata belongs to and afterwards create a new resource for each field contained in the request. While this still guarantees unique URIs, there are a couple of drawbacks to this approach:
all the data that should be available after the update, even fields that do not change, needs to be transmitted
clients that hold a URI pointing to a certain resource may end up pointing to the wrong representation. For example, suppose a client has retrieved /user/1/metadata/2 right before another client updated all the metadata, and the IDs are assigned via auto-increment. If the update introduced a new second item and thereby moved the former item 2 to position 3, client 1 now holds a reference to /user/1/metadata/2 while the actual data is at /user/1/metadata/3. To prevent this, UUIDs could be used instead of auto-increment IDs. If client 1 later tries to retrieve or update the former resource 2, it can be notified that the resource is no longer available; a redirect to the new location could even be created.
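The replace-everything behaviour combined with random identifiers can be sketched as follows in Python. The in-memory store and the replace_fields helper are assumptions made for this illustration; a real implementation would sit on top of a database.

```python
import uuid


def replace_fields(store: dict, new_fields: list) -> dict:
    """PUT semantics: drop every stored field, then create one entry per submitted field."""
    store.clear()
    for field in new_fields:
        store[str(uuid.uuid4())] = dict(field)  # fresh random identifier
    return store


old_id = str(uuid.uuid4())
store = {old_id: {"type": "string", "value": "foo"}}

replace_fields(store, [
    {"type": "string", "value": "newFoo"},
    {"type": "string", "value": "newBar"},
])

# The old identifier no longer resolves, so a stale client gets a clean "not found"
# instead of silently reading a different field.
print(old_id in store, len(store))
```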
Example PATCH:
A PATCH request contains the necessary steps to transform the state of a resource to the new state. The request itself can affect multiple resources at the same time and even create or delete other resources as needed.
The following example is in json-patch+json format:
PATCH /api/user/1/metadata HTTP/1.1
[
{
"op": "add",
"path": "/0/value",
"value": "newFoo"
},
{
"op": "add",
"path": "/2",
"value": { "type": "string", "value": "totally new entry" }
},
{
"op": "remove",
"path": "/1"
}
]
The path is defined as a JSON Pointer for the invoked resource.
The add operation of the JSON-Patch type is defined as:
If the target location specifies an array index, a new value is inserted into the array at the specified index.
If the target location specifies an object member that does not already exist, a new member is added to the object.
If the target location specifies an object member that does exist, that member's value is replaced.
For the removal case however, the spec states:
If removing an element from an array, any elements above the specified index are shifted one position to the left.
Therefore, the newly added entry would end up in the second position of the array (index 1). If the ID is not an auto-increment value, this should not be a big problem though.
Besides add and remove, the spec also contains definitions for replace, move, copy and test.
The PATCH should be transactional - either all operations succeed or none. The spec states:
If a normative requirement is violated by a JSON Patch document, or if an operation is not successful, evaluation of the JSON Patch document SHOULD terminate and application of the entire patch document SHALL NOT be deemed successful.
I interpret these lines as follows: if a patch tries to update a field that it is not supposed to update, you should return an error for the whole PATCH request and therefore not alter any resources.
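The all-or-nothing behaviour can be obtained by applying the operations to a copy and only handing the copy back if every step succeeds. The Python sketch below supports just add and remove; apply_patch, _resolve and PatchError are names invented for this illustration, and it is not a complete JSON Patch implementation.

```python
import copy


class PatchError(Exception):
    """Raised when any operation in a patch document cannot be applied."""


def _resolve(doc, path):
    """Return (parent container, final key) for a JSON Pointer such as '/0/value'."""
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in path.split("/")[1:]]
    if not tokens:
        raise PatchError("this sketch does not support the root pointer")
    parent = doc
    for token in tokens[:-1]:
        parent = parent[int(token)] if isinstance(parent, list) else parent[token]
    return parent, tokens[-1]


def apply_patch(doc, operations):
    """Apply add/remove operations to a deep copy; the input is never modified."""
    work = copy.deepcopy(doc)
    try:
        for op in operations:
            parent, key = _resolve(work, op["path"])
            if op["op"] == "add":
                if isinstance(parent, list):
                    index = len(parent) if key == "-" else int(key)
                    if not 0 <= index <= len(parent):
                        raise PatchError(f"index out of range: {key}")
                    parent.insert(index, op["value"])
                else:
                    parent[key] = op["value"]  # creates or replaces the member
            elif op["op"] == "remove":
                if isinstance(parent, list):
                    index = int(key)
                    if not 0 <= index < len(parent):
                        raise PatchError(f"index out of range: {key}")
                    del parent[index]  # later elements shift one position left
                else:
                    del parent[key]
            else:
                raise PatchError(f"unsupported op: {op['op']}")
    except (KeyError, IndexError, ValueError, TypeError) as exc:
        raise PatchError(str(exc)) from exc
    return work


fields = [{"type": "string", "value": "foo"}, {"type": "string", "value": "bar"}]
patch = [
    {"op": "add", "path": "/0/value", "value": "newFoo"},
    {"op": "add", "path": "/2", "value": {"type": "string", "value": "totally new entry"}},
    {"op": "remove", "path": "/1"},
]
patched = apply_patch(fields, patch)
print(patched)

# A patch with one bad operation fails as a whole and leaves the original untouched.
try:
    apply_patch(fields, [
        {"op": "add", "path": "/0/value", "value": "x"},
        {"op": "remove", "path": "/5"},
    ])
    failed = False
except PatchError:
    failed = True
print(failed, fields[0]["value"])
```

Applied to a two-element list, the three operations from the example leave the changed first field followed by the new entry.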
The drawbacks of the PATCH approach are clearly the transactional requirement and the JSON Pointer notation, which might not be that popular (at least I haven't used it often and had to look it up again). As with PUT, PATCH allows adding new resources in between existing ones and shifting later ones to the right, which may lead to issues if you rely on auto-increment values.
Therefore, I strongly recommend using randomly generated UUIDs as identifiers rather than auto-increment values.