Where to specify "MaxRetry" and "Timeout" variables in the Elasticsearch / Nest Bulk API? - nest

Question: Where do I specify the "MaxRetry" or "Timeout" variables used by the NEST bulk API?
When I run the following bulk operation, the program stops after successfully inserting 60K records, with a MaxRetryException thrown from Elasticsearch.Net.Connection.RequestHandlers.RequestHandlerBase.cs.
So I am thinking of increasing the MaxRetry count or the Timeout to get past this problem. Am I on the right path?
var counter = 0;
var indexName = "SomeIndexName";
var indexType = "SomeType";
var routingString = "SomeRouting";
var bulkDescriptor = new BulkDescriptor();
while (await result.ReadAsync())
{
    counter++;
    var document = GetDocumentObject<T>(result);
    var idString = GetID(result);
    bulkDescriptor.Index<T>(op => op
        .Routing(routingString)
        .Index(indexName)
        .Type(indexType)
        .Id(idString)
        .Document(document));
    // Flush a batch of 1000 index operations, then start a fresh descriptor.
    if (counter % 1000 == 0)
    {
        var bulkResponse = await client.BulkAsync(bulkDescriptor);
        bulkDescriptor = new BulkDescriptor();
    }
}
// Note: any documents left over after the loop (counter % 1000 != 0)
// still need one final BulkAsync call.

I tested it, and this is how I specify the timeout and maximum retries:
var connectionSettings = new ConnectionSettings(_connectionPool)
    .SetTimeout(1000 * 60 * 30) // timeout in milliseconds: 30 minutes
    .MaximumRetries(5);         // retry up to 5 times
var client = new ElasticClient(connectionSettings);
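If you are on NEST/Elasticsearch.Net 2.x or later, the same settings go by different names; a minimal sketch, assuming the newer client API, where RequestTimeout takes a TimeSpan:
// NEST 2.x+ sketch (assumption: RequestTimeout replaces SetTimeout on ConnectionSettings).
var connectionSettings = new ConnectionSettings(_connectionPool)
    .RequestTimeout(TimeSpan.FromMinutes(30))
    .MaximumRetries(5);
var client = new ElasticClient(connectionSettings);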

Related

The script worked fine in the beginning, but it keeps getting slower as the number of rows increases

In the beginning the script worked fine, but it keeps getting slower as the number of rows increases.
Is there a way to optimize it, or should I use some other tool?
function onEdit() {
  var ss = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
  var datass = SpreadsheetApp.getActiveSpreadsheet().getSheetByName("Confidential- 1");
  var activeCell = ss.getActiveCell();
  if (activeCell.getColumn() == 4 && activeCell.getRow() > 1) {
    activeCell.offset(0, 1).clearContent().clearDataValidations();
    // Find the column in the data sheet whose header matches the edited value.
    var makes = datass.getRange(1, 1, 1, datass.getLastColumn()).getValues();
    var makeIndex = makes[0].indexOf(activeCell.getValue()) + 1;
    if (makeIndex != 0) {
      // Build a dropdown from that column and attach it to the next cell over.
      var validationRange = datass.getRange(3, makeIndex, datass.getLastRow());
      var validationRule = SpreadsheetApp.newDataValidation().requireValueInRange(validationRange).build();
      activeCell.offset(0, 1).setDataValidation(validationRule);
    }
  }
}
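No answer was posted in the thread, but one common optimization (my suggestion, not from the original post) is to use the event object that Apps Script passes to simple triggers: e.range and e.value avoid the getActiveCell()/getValue() service calls, so edits outside column 4 bail out almost for free. A sketch:
function onEdit(e) {
  // e.range is the edited cell; no getActiveCell() lookup needed.
  var activeCell = e.range;
  if (activeCell.getColumn() != 4 || activeCell.getRow() <= 1) return;
  var datass = e.source.getSheetByName("Confidential- 1");
  activeCell.offset(0, 1).clearContent().clearDataValidations();
  var makes = datass.getRange(1, 1, 1, datass.getLastColumn()).getValues();
  // e.value is the newly entered value for single-cell edits.
  var makeIndex = makes[0].indexOf(e.value) + 1;
  if (makeIndex != 0) {
    var validationRange = datass.getRange(3, makeIndex, datass.getLastRow());
    var validationRule = SpreadsheetApp.newDataValidation().requireValueInRange(validationRange).build();
    activeCell.offset(0, 1).setDataValidation(validationRule);
  }
}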

How to escape the _ wildcard within Google Apps Script SQL?

The function that runs a standard SQL query within the Apps Script throws an error when _ is used in the SQL. It appears in the filter condition to look for all names containing _x_. Backslashes break the Apps Script when used.
Within Google Apps Script: var sql1 = 'sql string';
Within SQL: WHERE lower(name) LIKE "%\_x\_%"
Update: I managed to find a workaround using REGEXP_CONTAINS(LOWER(name), r"(_x_)"), but am still interested to know whether it works with the regular LIKE clause.
I reproduced your case using modified sample code from the documentation.
I queried a sample dataset using a where ... like "%_" filter, then wrote the results to a Google spreadsheet. The table I am querying in BigQuery is:
Row | id
----|---------
1   | _id_1212
2   | id1212
The code I am using is below:
/**
 * Runs a BigQuery query and logs the results in a spreadsheet.
 */
function runQuery() {
  // Replace this value with the project ID listed in the Google
  // Cloud Platform project.
  var projectId = 'project_id';
  // Modified query; it also works with where id LIKE "%\_id\_%".
  var request = {
    query: 'SELECT * from `project_id.dataset.table` where id LIKE "%_id_%";',
    // Configure the query to use StandardSQL.
    useLegacySql: false
  };
  var queryResults = BigQuery.Jobs.query(request, projectId);
  var jobId = queryResults.jobReference.jobId;
  // Check on the status of the query job.
  var sleepTimeMs = 500;
  while (!queryResults.jobComplete) {
    Utilities.sleep(sleepTimeMs);
    sleepTimeMs *= 2;
    queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId);
  }
  // Get all the rows of results.
  var rows = queryResults.rows;
  while (queryResults.pageToken) {
    queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId, {
      pageToken: queryResults.pageToken
    });
    rows = rows.concat(queryResults.rows);
  }
  if (rows) {
    var spreadsheet = SpreadsheetApp.create('BigQuery Results');
    var sheet = spreadsheet.getActiveSheet();
    // Append the headers.
    var headers = queryResults.schema.fields.map(function(field) {
      return field.name;
    });
    sheet.appendRow(headers);
    // Append the results.
    var data = new Array(rows.length);
    for (var i = 0; i < rows.length; i++) {
      var cols = rows[i].f;
      data[i] = new Array(cols.length);
      for (var j = 0; j < cols.length; j++) {
        data[i][j] = cols[j].v;
      }
    }
    sheet.getRange(2, 1, rows.length, headers.length).setValues(data);
    Logger.log('Results spreadsheet created: %s', spreadsheet.getUrl());
  } else {
    Logger.log('No rows returned.');
  }
}
The output:
id
---------
_id_1212
Both where id LIKE "%_id_%" and where id LIKE "%\_id\_%" work when I set the query to use StandardSQL (useLegacySql: false).
In addition, trying to escape the underscore with a double backslash, such as where id LIKE "%\\_id\\_%", throws the error GoogleJsonResponseException: API call to bigquery.jobs.query failed with error: Syntax error: Illegal escape sequence: \_.
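One caveat worth adding (standard JavaScript string semantics, not something from the thread): the backslash counts above apply to the string BigQuery receives, and Apps Script consumes one level of escaping first. In a JS string literal, '\_' silently collapses to '_', so the source code needs a doubled backslash to send a single one:
// Sketch: the JS literal below contains \\_, so BigQuery receives \_ — a valid
// StandardSQL LIKE escape. Writing '\_' in the source would send a bare _ instead.
var sql = 'SELECT * from `project_id.dataset.table` where id LIKE "%\\_id\\_%"';
Logger.log(sql); // logs ... LIKE "%\_id\_%"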

RavenDB select - update multiple rows

I am trying to select and update multiple rows from RavenDB, but it keeps updating the same rows, namely the first 100. Nothing changes.
Here is my code. How can I select some rows, update some fields on each row, and repeat until the job is finished?
var currentEmailId = 100;
using (var session = store.OpenSession())
{
    var goon = true;
    while (goon)
    {
        var contacts = session.Query<Contacts>().Where(f => f.LastEmailId < currentEmailId).Take(100);
        if (contacts.Any())
        {
            foreach (var contact in contacts)
            {
                EmailOperation.Send(contact, currentEmailId);
                contact.LastEmailId = currentEmailId;
            }
            session.SaveChanges();
        }
        else
        {
            goon = false;
        }
    }
}
It's probably because you're doing a query immediately after saving changes, without letting the indexes update first, so you keep getting back the same items. To fix that, you can tell SaveChanges to wait until the indexes are updated. Your code would look something like this:
var goon = true;
var currentEmailId = 100;
while (goon)
{
    using (var session = store.OpenSession())
    {
        var contacts = session.Query<Contacts>()
            .Where(f => f.LastEmailId < currentEmailId)
            .Take(100);
        if (contacts.Any())
        {
            foreach (var contact in contacts)
            {
                EmailOperation.Send(contact, currentEmailId);
                contact.LastEmailId = currentEmailId;
            }
            // Wait for the indexes to update when calling SaveChanges.
            session.Advanced.WaitForIndexesAfterSaveChanges(TimeSpan.FromSeconds(30), false);
            session.SaveChanges();
        }
        else
        {
            goon = false;
        }
    }
}
If you're updating many contacts at once, you may wish to consider using Streaming query results combined with BulkInsert to update many Contacts en masse.
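A rough sketch of that combination, assuming the streaming and bulk-insert APIs of a recent RavenDB client (untested; adapt to your client version):
// Sketch: stream all matching contacts, then write the updated documents
// back through a bulk insert (storing under the same id overwrites).
using (var bulk = store.BulkInsert())
using (var session = store.OpenSession())
{
    var query = session.Query<Contacts>().Where(f => f.LastEmailId < currentEmailId);
    using (var stream = session.Advanced.Stream(query))
    {
        while (stream.MoveNext())
        {
            var contact = stream.Current.Document;
            EmailOperation.Send(contact, currentEmailId);
            contact.LastEmailId = currentEmailId;
            bulk.Store(contact, stream.Current.Id);
        }
    }
}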

Rally C# task time spent

I have a C# .NET application using the Rally 3.0.1 API. When I query tasks in my system I get 0.0 for time spent when I know they have time logged against them. Does anyone know how to get this? Below is my code:
if (uTasks.Count > 0)
{
    Request taskRequest = new Request(resultChild["Tasks"]);
    QueryResult TaskQueryResult = restApi.Query(taskRequest);
    foreach (var items in TaskQueryResult.Results)
    //foreach (var items in uTasks)
    {
        DataRow dtrow2;
        dtrow2 = dt.NewRow();
        dtrow2["TaskID"] = items["FormattedID"];
        dtrow2["Task Name"] = items["Name"];
        if (items["Owner"] != null)
        {
            var owner = items["Owner"];
            String ownerref = owner["_ref"];
            var ownerFetch = restApi.GetByReference(ownerref, "Name");
            string strTemp = ownerFetch["_refObjectName"];
            dtrow2["Owner"] = strTemp.Replace(",", " ");
        }
        //else { dtrow2["Owner"] = ""; }
        dtrow2["Task-Est"] = items["Estimate"];
        dtrow2["Task-ToDo"] = items["ToDo"];
        dtrow2["Task-Spent"] = items["TimeSpent"];
        dtrow2["ObjectType"] = "T";
        dt.Rows.Add(dtrow2);
    }
}
It seems like that should work. You may want to make sure you're including the TimeSpent field in your fetch before issuing the request.
taskRequest.Fetch = new List<string>() { "TimeSpent" };
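Since the loop also reads FormattedID, Name, Owner, Estimate and ToDo, it likely makes sense to fetch those as well (field names taken from the question's own code):
// Fetch every field the loop reads, not just TimeSpent.
taskRequest.Fetch = new List<string>()
    { "FormattedID", "Name", "Owner", "Estimate", "ToDo", "TimeSpent" };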

Updating a Google spreadsheet from MS SQL using Google Apps Script

I'm trying to connect MS SQL to a Google spreadsheet using Google Apps Script. Here is my script code:
function SQLdb() {
  // Replace the variables in this block with real values.
  // Read up to 1000 rows of data from the table and log them.
  function readFromTable() {
    var user = '{username}';
    var userPwd = '{password}';
    var database = '{databasename}';
    var connectionString = 'jdbc:sqlserver://server.database.windows.net:1433;databaseName=' + database;
    var conn = Jdbc.getConnection(connectionString, user, userPwd);
    var start = new Date();
    var stmt = conn.createStatement();
    stmt.setMaxRows(1000);
    var results = stmt.executeQuery('SELECT TOP 1000 * FROM dbo.dbo');
    var numCols = results.getMetaData().getColumnCount();
    while (results.next()) {
      var rowString = '';
      for (var col = 0; col < numCols; col++) {
        rowString += results.getString(col + 1) + '\t';
      }
      Logger.log(rowString);
    }
    results.close();
    stmt.close();
    var end = new Date();
    Logger.log('Time elapsed: %sms', end - start);
  }
  readFromTable();
}
Now when I look at the log in the Script Editor, I can see that the connection to the SQL database is working and the script is able to read all the table cells, but I can't get that data into the spreadsheet. I'm new to Apps Script, so is there something I'm missing here?
Any help would be much appreciated!
Here's how I do it. Change all the capitalised bits to the appropriate URL, database name, Google Spreadsheet ID, etc.
Pass this function the SQL query you want to execute on the MSSQL database and the name of the sheet within the spreadsheet that you want to put the data into. It fills that named sheet with the query results, including column names.
function getData(query, sheetName) {
  // e.g. jdbc:sqlserver://localhost;user=MyUserName;password=*****;
  var conn = Jdbc.getConnection("jdbc:sqlserver://URL;user=USERNAME;password=PASSWORD;databaseName=DBNAME");
  var stmt = conn.createStatement();
  stmt.setMaxRows(MAXROWS);
  var rs = stmt.executeQuery(query);
  Logger.log(rs);
  var doc = SpreadsheetApp.openById("SPREADSHEETID");
  var sheet = doc.getSheetByName(sheetName);
  var results = [];
  // First row of output: the column names.
  var cols = rs.getMetaData();
  var colNames = [];
  for (var i = 1; i <= cols.getColumnCount(); i++) {
    Logger.log(cols.getColumnName(i));
    colNames.push(cols.getColumnName(i));
  }
  results.push(colNames);
  // Remaining rows: the query results.
  var rowCount = 1;
  while (rs.next()) {
    var rowData = [];
    for (var i = 1; i <= cols.getColumnCount(); i++) {
      rowData.push(rs.getString(i));
    }
    results.push(rowData);
    rowCount++;
  }
  // Clear the old contents, then write headers plus data in one call.
  sheet.getRange(1, 1, MAXROWS, cols.getColumnCount()).clearContent();
  sheet.getRange(1, 1, rowCount, cols.getColumnCount()).setValues(results);
  Logger.log(results);
  rs.close();
  stmt.close();
  conn.close();
}
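A call might then look like this (the query and sheet name are placeholders, not from the original answer):
// Fill the tab named "Sheet1" with up to MAXROWS rows of the query result.
getData('SELECT TOP 100 * FROM dbo.MyTable', 'Sheet1');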
Here is how I did it. I also made a JDBC connector tool for MySQL & MSSQL, which you can adapt from my GitHub repo here: Google Spreadsheet JDBC Connector
function readFromTable(queryType, queryDb, query, tab, startCell) {
  // Replace the variables in this block with real values.
  var address;
  var user;
  var userPwd;
  var dbUrl;
  switch (queryType) {
    case 'sqlserver':
      address = '%YOUR SQL HOSTNAME%';
      user = '%YOUR USER%';
      userPwd = '%YOUR PW%';
      dbUrl = 'jdbc:sqlserver://' + address + ':1433;databaseName=' + queryDb;
      break;
    case 'mysql':
      address = '%YOUR MYSQL HOSTNAME%';
      user = '%YOUR USER%';
      userPwd = '%YOUR PW%';
      dbUrl = 'jdbc:mysql://' + address + '/' + queryDb;
      break;
  }
  var conn = Jdbc.getConnection(dbUrl, user, userPwd);
  var start = new Date();
  var stmt = conn.createStatement();
  var results = stmt.executeQuery(query);
  var sheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheetTab = sheet.getSheetByName(tab);
  var cell = sheetTab.getRange(startCell);
  var numCols = results.getMetaData().getColumnCount();
  var numRows = sheetTab.getLastRow();
  var headers;
  var row = 0;
  // clearRange is a helper from the connector tool linked above.
  clearRange(tab, startCell, numRows, numCols);
  // Write the column headers on the first row.
  for (var i = 1; i <= numCols; i++) {
    headers = results.getMetaData().getColumnName(i);
    cell.offset(row, i - 1).setValue(headers);
  }
  // Write each result row, cell by cell, below the headers.
  while (results.next()) {
    var rowString = '';
    for (var col = 0; col < numCols; col++) {
      rowString += results.getString(col + 1) + '\t';
      cell.offset(row + 1, col).setValue(results.getString(col + 1));
    }
    row++;
    Logger.log(rowString);
  }
  results.close();
  stmt.close();
  var end = new Date();
  Logger.log('Time elapsed: %sms', end - start);
}
If you don't want to create your own solution, check out SeekWell. It allows you to connect to databases and write SQL queries directly in Sheets from a sidebar add-on. You can also schedule queries to automatically run daily, hourly or every five minutes.
Disclaimer: I made this.
OK, I finally managed to make it work.
A few tips: the script executes on Google's servers, so the connection must be made over the internet, i.e. the connection string should be something like "jdbc:sqlserver://172.217.29.206:7000;databaseName=XXXX", and the IP/port in the connection string must be able to reach your database from the outside world. Open the port on your firewall, set up port forwarding on your router, and use dyndns or a similar service if you do not have a valid domain, etc. The sheet ID is the long ID string that appears in your Google document's URL.