How to escape the _ wildcard in SQL within Google Apps Script? - google-bigquery

The function that runs a Standard SQL query from Apps Script throws an error when _ is used in the SQL. The underscore is used in the condition filter to look for all names containing _x_. Backslashes break the Apps Script string when used.
Within Google Apps Script: var sql1 = 'sql string';
Within SQL: WHERE lower(name) like "%\_x_\%"
Update: I managed to find a workaround using REGEXP_CONTAINS(LOWER(name), r"(_x_)") but am still interested to know whether it works with the regular LIKE clause.
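For reference, a sketch of that workaround as it would sit in the Apps Script string (the project, dataset, table, and column names are placeholders); note that REGEXP_CONTAINS requires Standard SQL (useLegacySql: false):
// Sketch: the regex pattern passes through the JS string untouched because
// it contains no backslashes, sidestepping the escaping problem entirely.
var sql1 = 'SELECT name FROM `project_id.dataset.table` ' +
    'WHERE REGEXP_CONTAINS(LOWER(name), r"(_x_)");';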

I reproduced your case using modified sample code from the documentation.
I queried against a sample dataset using a WHERE ... LIKE filter containing underscores, then wrote the results to a Google spreadsheet. The table I am querying in BigQuery is:
Row  id
1    _id_1212
2    id1212
The code I am using is below:
/**
 * Runs a BigQuery query and logs the results in a spreadsheet.
 */
function runQuery() {
  // Replace this value with the project ID listed in the Google
  // Cloud Platform project.
  var projectId = 'project_id';
  // Modified query; it also works with where id LIKE "%\_id\_%".
  var request = {
    query: 'SELECT * from `project_id.dataset.table` where id LIKE "%_id_%";',
    // Configure the query to use Standard SQL.
    useLegacySql: false
  };
  var queryResults = BigQuery.Jobs.query(request, projectId);
  var jobId = queryResults.jobReference.jobId;

  // Check on the status of the query job.
  var sleepTimeMs = 500;
  while (!queryResults.jobComplete) {
    Utilities.sleep(sleepTimeMs);
    sleepTimeMs *= 2;
    queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId);
  }

  // Get all the rows of results.
  var rows = queryResults.rows;
  while (queryResults.pageToken) {
    queryResults = BigQuery.Jobs.getQueryResults(projectId, jobId, {
      pageToken: queryResults.pageToken
    });
    rows = rows.concat(queryResults.rows);
  }

  if (rows) {
    var spreadsheet = SpreadsheetApp.create('BigQuery Results');
    var sheet = spreadsheet.getActiveSheet();

    // Append the headers.
    var headers = queryResults.schema.fields.map(function(field) {
      return field.name;
    });
    sheet.appendRow(headers);

    // Append the results.
    var data = new Array(rows.length);
    for (var i = 0; i < rows.length; i++) {
      var cols = rows[i].f;
      data[i] = new Array(cols.length);
      for (var j = 0; j < cols.length; j++) {
        data[i][j] = cols[j].v;
      }
    }
    sheet.getRange(2, 1, rows.length, headers.length).setValues(data);

    Logger.log('Results spreadsheet created: %s', spreadsheet.getUrl());
  } else {
    Logger.log('No rows returned.');
  }
}
The output:
id
_id_1212
Both where id LIKE "%_id_%" and where id LIKE "%\_id\_%" work when I set the query to use Standard SQL (useLegacySql: false).
In addition, the error GoogleJsonResponseException: API call to bigquery.jobs.query failed with error: Syntax error: Illegal escape sequence: \_ is thrown when trying to escape the underscore with a double backslash, such as where id LIKE "%\\_id\\_%".
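If the goal is to have LIKE itself treat the underscores literally, a sketch that should work given the layers involved (table name as in the sample above): each backslash that is meant to reach the LIKE pattern must be doubled once for the JavaScript string literal and once more for the SQL string literal.
// Sketch, not verified against every BigQuery version:
// JS '\\\\' -> SQL text '\\' -> SQL string value '\', so the LIKE
// pattern receives '\_', which matches a literal underscore.
var request = {
  query: 'SELECT * from `project_id.dataset.table` where id LIKE "%\\\\_id\\\\_%";',
  useLegacySql: false
};
var queryResults = BigQuery.Jobs.query(request, projectId);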

Related

Automatically update my Gsheet with an SQL database

I need to automatically update a file that already has data in it.
The document is filled from an SQL database by the code below.
However, I want it to update itself every day without deleting any data already in the document, only adding new rows (I don't want any duplicates).
function readData(db, queryString) {
  // Connect to the database.
  var server = 'your-servername-OR-serverPublicIpAddress';
  var username = 'your-sql-username';
  var password = 'your-password';
  var dbUrl = 'jdbc:sqlserver://' + server + ':1433;databaseName=' + db;
  var conn = Jdbc.getConnection(dbUrl, username, password);
  // Query the data.
  var stmt = conn.createStatement();
  var exec_query = stmt.executeQuery(queryString);
  var metaData = exec_query.getMetaData();
  var numCols = metaData.getColumnCount();
  // Save the query data to an array.
  var result = [];
  // Save the column header.
  var header = [];
  for (var col = 0; col < numCols; col++) {
    header.push(metaData.getColumnName(col + 1)); // Add the name of each column to the header row.
  }
  result.push(header); // After the header row is formed, put it in the result array.
  // Save the data of each row.
  while (exec_query.next()) {
    var row_data = [];
    for (var col = 0; col < numCols; col++) {
      row_data.push(exec_query.getString(col + 1)); // Add the data of each column to the row data.
    }
    result.push(row_data); // Add the row data to the result.
  }
  exec_query.close();
  return result;
}
function pushDataToGoogleSheet(data, SheetName) {
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = SpreadsheetApp.setActiveSheet(spreadsheet.getSheetByName(SheetName));
  var lastRow = sheet.getLastRow();
  sheet.getRange(lastRow + 1, 1, data.length, data[0].length).setValues(data);
  sheet.getDataRange().removeDuplicates();
}
function main() {
  var db = 'YOUR_DATABASE_NAME';
  var SQLquery = 'YOUR_SQL_QUERY';
  var raw_statistics = readData(db, SQLquery); // Get raw statistics.
  pushDataToGoogleSheet(raw_statistics, 'YOUR_SHEET_NAME'); // Push to the Google sheet.
}
However, in pushDataToGoogleSheet it says that it can't read the length. So I don't know whether I am passing the right thing for data or whether there is an issue in my code.
Do you have an idea?
Thank you for your help!
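One thing worth checking (an assumption, since the exact failing call isn't shown): setValues reads data.length and data[0].length, so it fails exactly this way when readData returns undefined or an empty array, e.g. when the query throws or matches nothing. A minimal guard sketch:
function pushDataToGoogleSheet(data, SheetName) {
  // Guard: setValues needs a non-empty 2-D array; otherwise data.length
  // or data[0].length is undefined and the call fails.
  if (!data || data.length === 0 || !data[0]) {
    Logger.log('Nothing to push to %s', SheetName);
    return;
  }
  var spreadsheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheet = spreadsheet.getSheetByName(SheetName);
  sheet.getRange(sheet.getLastRow() + 1, 1, data.length, data[0].length).setValues(data);
  sheet.getDataRange().removeDuplicates();
}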

RavenDB select and update multiple rows

I am trying to select and update multiple rows from RavenDB, but it repeatedly updates the same rows, namely the first 100. Nothing changes.
Here is my code. How can I select some rows, update some fields on each row, and repeat until my job is finished?
var currentEmailId = 100;
using (var session = store.OpenSession())
{
    var goon = true;
    while (goon)
    {
        var contacts = session.Query<Contacts>().Where(f => f.LastEmailId < currentEmailId).Take(100);
        if (contacts.Any())
        {
            foreach (var contact in contacts)
            {
                EmailOperation.Send(contact, currentEmailId);
                contact.LastEmailId = currentEmailId;
            }
            session.SaveChanges();
        }
        else
        {
            goon = false;
        }
    }
}
It's probably because you're querying immediately after saving changes, without letting the indexes update. Thus you're getting back the same items. To fix that, you can tell SaveChanges to wait until the indexes are updated. Your code would look something like this:
var goon = true;
var currentEmailId = 100;
while (goon)
{
    using (var session = store.OpenSession())
    {
        var contacts = session.Query<Contacts>()
            .Where(f => f.LastEmailId < currentEmailId)
            .Take(100);
        if (contacts.Any())
        {
            foreach (var contact in contacts)
            {
                EmailOperation.Send(contact, currentEmailId);
                contact.LastEmailId = currentEmailId;
            }
            // Wait for the indexes to update when calling SaveChanges.
            session.Advanced.WaitForIndexesAfterSaveChanges(TimeSpan.FromSeconds(30), false);
            session.SaveChanges();
        }
        else
        {
            goon = false;
        }
    }
}
If you're updating many contacts at once, you may wish to consider streaming query results combined with BulkInsert to update many Contacts en masse.

Updating a Google spreadsheet from MS SQL using Google Apps Script

I'm trying to connect MS SQL to a Google spreadsheet using Google Apps Script. Here is my script code:
function SQLdb() {
  // Replace the variables in this block with real values.
  // Read up to 1000 rows of data from the table and log them.
  function readFromTable() {
    var user = '{username}';
    var userPwd = '{password}';
    var database = '{databasename}';
    var connectionString = 'jdbc:sqlserver://server.database.windows.net:1433;databaseName=' + database;
    var conn = Jdbc.getConnection(connectionString, user, userPwd);
    var start = new Date();
    var stmt = conn.createStatement();
    stmt.setMaxRows(1000);
    var results = stmt.executeQuery('SELECT TOP 1000 * FROM dbo.dbo');
    var numCols = results.getMetaData().getColumnCount();
    while (results.next()) {
      var rowString = '';
      for (var col = 0; col < numCols; col++) {
        rowString += results.getString(col + 1) + '\t';
      }
      Logger.log(rowString);
    }
    results.close();
    stmt.close();
    var end = new Date();
    Logger.log('Time elapsed: %sms', end - start);
  }
  readFromTable();
}
Now when I look at the log in the Script Editor, I can see that the connection to the SQL database is working and the script is able to read all the table cells. But I can't get that data into the spreadsheet. I'm new to Apps Script, so is there something I'm missing here?
Any help would be much appreciated!
Here's how I do it. Change all the capitalised bits to the appropriate URL, database name, Google Spreadsheet ID, etc.
Pass this function the SQL query you want to execute on the MSSQL database and the name of the sheet within the spreadsheet that you want to put the data into. It basically fills that named sheet with the query results, including column names.
function getData(query, sheetName) {
  // jdbc:sqlserver://localhost;user=MyUserName;password=*****;
  var conn = Jdbc.getConnection("jdbc:sqlserver://URL;user=USERNAME;password=PASSWORD;databaseName=DBNAME");
  var stmt = conn.createStatement();
  stmt.setMaxRows(MAXROWS);
  var rs = stmt.executeQuery(query);
  Logger.log(rs);
  var doc = SpreadsheetApp.openById("SPREADSHEETID");
  var sheet = doc.getSheetByName(sheetName);
  var results = [];
  // Collect the column names as the header row.
  var cols = rs.getMetaData();
  var colNames = [];
  for (var i = 1; i <= cols.getColumnCount(); i++) {
    Logger.log(cols.getColumnName(i));
    colNames.push(cols.getColumnName(i));
  }
  results.push(colNames);
  // Collect the data rows.
  var rowCount = 1;
  while (rs.next()) {
    var rowData = [];
    for (var i = 1; i <= cols.getColumnCount(); i++) {
      rowData.push(rs.getString(i));
    }
    results.push(rowData);
    rowCount++;
  }
  // Clear old contents, then write the header and rows in one call.
  sheet.getRange(1, 1, MAXROWS, cols.getColumnCount()).clearContent();
  sheet.getRange(1, 1, rowCount, cols.getColumnCount()).setValues(results);
  Logger.log(results);
  rs.close();
  stmt.close();
  conn.close();
}
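A usage example (the query and sheet name here are hypothetical; it assumes the getData function above with your real connection details filled in):
// Fill the 'Orders' tab with the first 100 rows of a hypothetical table.
function refreshOrders() {
  getData('SELECT TOP 100 * FROM dbo.Orders', 'Orders');
}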
Here is how I did it. I also made a JDBC connector tool for MySQL & MSSQL. You can adapt this tool from my GitHub repo here: Google Spreadsheet JDBC Connector
function readFromTable(queryType, queryDb, query, tab, startCell) {
  // Replace the variables in this block with real values.
  var address;
  var user;
  var userPwd;
  var dbUrl;
  switch (queryType) {
    case 'sqlserver':
      address = '%YOUR SQL HOSTNAME%';
      user = '%YOUR USER%';
      userPwd = '%YOUR PW%';
      dbUrl = 'jdbc:sqlserver://' + address + ':1433;databaseName=' + queryDb;
      break;
    case 'mysql':
      address = '%YOUR MYSQL HOSTNAME%';
      user = '%YOUR USER%';
      userPwd = '%YOUR PW%';
      dbUrl = 'jdbc:mysql://' + address + '/' + queryDb;
      break;
  }
  var conn = Jdbc.getConnection(dbUrl, user, userPwd);
  var start = new Date();
  var stmt = conn.createStatement();
  var results = stmt.executeQuery(query);
  var sheet = SpreadsheetApp.getActiveSpreadsheet();
  var sheetTab = sheet.getSheetByName(tab);
  var cell = sheetTab.getRange(startCell);
  var numCols = results.getMetaData().getColumnCount();
  var numRows = sheetTab.getLastRow();
  var headers;
  var row = 0;
  clearRange(tab, startCell, numRows, numCols); // Helper defined in the linked repo.
  for (var i = 1; i <= numCols; i++) {
    headers = results.getMetaData().getColumnName(i);
    cell.offset(row, i - 1).setValue(headers);
  }
  while (results.next()) {
    var rowString = '';
    for (var col = 0; col < numCols; col++) {
      rowString += results.getString(col + 1) + '\t';
      cell.offset(row + 1, col).setValue(results.getString(col + 1));
    }
    row++;
    Logger.log(rowString);
  }
  results.close();
  stmt.close();
  var end = new Date();
  Logger.log('Time elapsed: %sms', end - start);
}
If you don't want to create your own solution, check out SeekWell. It allows you to connect to databases and write SQL queries directly in Sheets from a sidebar add-on. You can also schedule queries to automatically run daily, hourly or every five minutes.
Disclaimer: I made this.
Ok, I finally managed to make it work.
A few tips: the script executes on Google's servers, so the connection must be made over the internet, i.e. the connect string should be something like "jdbc:sqlserver://172.217.29.206:7000;databaseName=XXXX", and the IP/port mentioned in the connect string must be able to reach your database from the outside world. Open the port in the firewall, set up IP forwarding on the router, and use dyndns or a similar service if you do not have a valid domain, etc. The Sheet ID is the long ID string that appears in your Google document's URL.
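For illustration, a minimal sketch combining those tips (the IP, port, database name, credentials, and spreadsheet ID are all placeholders):
// The host must be a publicly reachable IP:port, since the script runs on
// Google's servers, not on your machine.
var conn = Jdbc.getConnection(
    'jdbc:sqlserver://172.217.29.206:7000;databaseName=XXXX',
    'your-username', 'your-password');
// The spreadsheet ID is the long string in the document's URL.
var doc = SpreadsheetApp.openById('YOUR_SPREADSHEET_ID');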

Google BigQuery returns only partial table data with C# application using .NET client library

I am trying to execute a query (a basic SELECT statement with 10 fields). My table contains more than 500k rows. The C# application returns a response with only 4260 rows, whereas the Web UI returns all the records.
Why does my code return only partial data, and what is the best way to select all the records and load them into a C# DataTable? A code snippet would be very helpful.
using Google.Apis.Auth.OAuth2;
using System.IO;
using System.Threading;
using Google.Apis.Bigquery.v2;
using Google.Apis.Bigquery.v2.Data;
using System.Data;
using Google.Apis.Services;
using System;
using System.Security.Cryptography.X509Certificates;

namespace GoogleBigQuery
{
    public class Class1
    {
        private static void Main()
        {
            try
            {
                Console.WriteLine("Start Time: {0}", DateTime.Now.ToString());
                String serviceAccountEmail = "SERVICE ACCOUNT EMAIL";
                var certificate = new X509Certificate2(@"KeyFile.p12", "notasecret", X509KeyStorageFlags.Exportable);
                ServiceAccountCredential credential = new ServiceAccountCredential(
                    new ServiceAccountCredential.Initializer(serviceAccountEmail)
                    {
                        Scopes = new[] { BigqueryService.Scope.Bigquery, BigqueryService.Scope.BigqueryInsertdata, BigqueryService.Scope.CloudPlatform, BigqueryService.Scope.DevstorageFullControl }
                    }.FromCertificate(certificate));
                BigqueryService Service = new BigqueryService(new BaseClientService.Initializer()
                {
                    HttpClientInitializer = credential,
                    ApplicationName = "PROJECT NAME"
                });
                string query = "SELECT * FROM [publicdata:samples.shakespeare]";
                JobsResource j = Service.Jobs;
                QueryRequest qr = new QueryRequest();
                string ProjectID = "PROJECT ID";
                qr.Query = query;
                qr.MaxResults = Int32.MaxValue;
                qr.TimeoutMs = Int32.MaxValue;
                DataTable DT = new DataTable();
                int i = 0;
                QueryResponse response = j.Query(qr, ProjectID).Execute();
                string pageToken = null;
                if (response.JobComplete == true)
                {
                    if (response != null)
                    {
                        int colCount = response.Schema.Fields.Count;
                        if (DT == null)
                            DT = new DataTable();
                        if (DT.Columns.Count == 0)
                        {
                            foreach (var Column in response.Schema.Fields)
                            {
                                DT.Columns.Add(Column.Name);
                            }
                        }
                        pageToken = response.PageToken;
                        if (response.Rows != null)
                        {
                            foreach (TableRow row in response.Rows)
                            {
                                DataRow dr = DT.NewRow();
                                for (i = 0; i < colCount; i++)
                                {
                                    dr[i] = row.F[i].V;
                                }
                                DT.Rows.Add(dr);
                            }
                        }
                        Console.WriteLine("No. of records read: {0} @ {1}", DT.Rows.Count.ToString(), DateTime.Now.ToString());
                        while (true)
                        {
                            int StartIndexForQuery = DT.Rows.Count;
                            Google.Apis.Bigquery.v2.JobsResource.GetQueryResultsRequest SubQR = Service.Jobs.GetQueryResults(response.JobReference.ProjectId, response.JobReference.JobId);
                            SubQR.StartIndex = (ulong)StartIndexForQuery;
                            //SubQR.MaxResults = Int32.MaxValue;
                            GetQueryResultsResponse QueryResultResponse = SubQR.Execute();
                            if (QueryResultResponse != null)
                            {
                                if (QueryResultResponse.Rows != null)
                                {
                                    foreach (TableRow row in QueryResultResponse.Rows)
                                    {
                                        DataRow dr = DT.NewRow();
                                        for (i = 0; i < colCount; i++)
                                        {
                                            dr[i] = row.F[i].V;
                                        }
                                        DT.Rows.Add(dr);
                                    }
                                }
                                Console.WriteLine("No. of records read: {0} @ {1}", DT.Rows.Count.ToString(), DateTime.Now.ToString());
                                if (null == QueryResultResponse.PageToken)
                                {
                                    break;
                                }
                            }
                            else
                            {
                                break;
                            }
                        }
                    }
                    else
                    {
                        Console.WriteLine("Response is null");
                    }
                }
                int TotalCount = 0;
                if (DT != null && DT.Rows.Count > 0)
                {
                    TotalCount = DT.Rows.Count;
                }
                else
                {
                    TotalCount = 0;
                }
                Console.WriteLine("End Time: {0}", DateTime.Now.ToString());
                Console.WriteLine("No. of records read from the Google BigQuery service: " + TotalCount.ToString());
            }
            catch (Exception e)
            {
                Console.WriteLine("Error Occurred: " + e.Message);
            }
            Console.ReadLine();
        }
    }
}
In this sample, the query reads from a public data set. The table contains 164,656 rows, but the response returns only 85,000 rows the first time, so I have to query again to get the next set of results. (I do not know whether this is the only way to get all the results.)
This sample contains only 4 fields, and even then it does not return all the rows. In my case the table contains more than 15 fields, and I get a response of ~4,000 rows out of ~10k, so I need to query again and again for the rest. Selecting 1,000 rows takes up to 2 minutes with my current approach, so I am looking for the best way to select all the records within a single response.
Answer from user Pentium10:
There is no way to run a query and select a large response in a single shot. You can either paginate the results or create a job to export the table to files on GCS and then use the generated files in your app. Exporting is free.
Steps to run a large query and export the results to files stored on GCS:
1) Set allowLargeResults to true in your job configuration. You must also specify a destination table when the allowLargeResults flag is set.
Example:
"configuration":
{
"query":
{
"allowLargeResults": true,
"query": "select uid from [project:dataset.table]"
"destinationTable": [project:dataset.table]
}
}
2) Now your data is in the destination table you set. You need to create a new job and set the extract configuration to export the table to file(s). Exporting is free, but you need to have Google Cloud Storage activated to put the resulting files there.
3) In the end you download your large files from GCS.
It is then my turn to design the solution for better results.
Hoping this might help someone. You can retrieve the next set of paginated results using PageToken; here is sample code showing how. Although I liked the idea of exporting for free, here I write the rows to a flat file, but you could add them to your DataTable instead. Obviously, it is a bad idea to keep a large DataTable in memory, though.
public void ExecuteSQL(BigqueryService bqservice, String ProjectID)
{
    string sSql = "SELECT r.Dealname, r.poolnumber, r.loanid FROM [MBS_Dataset.tblRemitData] R left join each [MBS_Dataset.tblOrigData] o on R.Dealname = o.Dealname and R.Poolnumber = o.Poolnumber and R.LoanID = o.LoanID Order by o.Dealname, o.poolnumber, o.loanid limit 100000";
    QueryRequest _r = new QueryRequest();
    _r.Query = sSql;
    QueryResponse _qr = bqservice.Jobs.Query(_r, ProjectID).Execute();
    string pageToken = null;
    if (_qr.JobComplete != true)
    {
        // Job not finished yet -- expecting more data.
        while (true)
        {
            var resultReq = bqservice.Jobs.GetQueryResults(_qr.JobReference.ProjectId, _qr.JobReference.JobId);
            resultReq.PageToken = pageToken;
            var result = resultReq.Execute();
            if (result.JobComplete == true)
            {
                // WriteRows is a helper (not shown) that writes rows to a flat file.
                WriteRows(result.Rows, result.Schema.Fields);
                pageToken = result.PageToken;
                if (pageToken == null)
                    break;
            }
        }
    }
    else
    {
        List<string> _fieldNames = _qr.Schema.Fields.ToList().Select(x => x.Name).ToList();
        WriteRows(_qr.Rows, _qr.Schema.Fields);
    }
}
The Web UI automatically flattens the data, meaning that you see multiple rows for each nested field.
When you run the same query via the API, the data is not flattened, so you get fewer rows because the nested fields are returned as objects. You should check whether this is the case for you.
The other possibility is that you indeed need to paginate through the results; the documentation on paging through list results explains this.
If you want to run only one job, write your query output to a table, export the table as JSON, and download the export from GCS.

NHibernate named query and multiple result sets

We have a stored procedure that returns several tables. When calling it with NHibernate, we use the bean transformer but get only the first table transformed; all other results are ignored.
I know that NH can process several queries in one DB trip using futures, but we have only one query, and it produces a result similar to what we would get with futures, except coming from a stored procedure.
I believe this scenario is quite common, but I could not find any clues. Is it possible to use NH to retrieve such results?
Yes, you can use a MultiQuery "hack" like this.
The procedure:
CREATE PROCEDURE [dbo].[proc_Name]
AS
BEGIN
    SELECT * FROM Question
    SELECT * FROM Question
END
The NHibernate Query Code:
public void ProcdureMultiTableQuery()
{
var session = Session;
var procSQLQuery = session.CreateSQLQuery("exec [proc_Name] ?,?");// prcodure returns two table
procSQLQuery.SetParameter(0, userId);
procSQLQuery.SetParameter(1, page);
procSQLQuery.AddEntity(typeof(Question));
var multiResults = session.CreateMultiQuery()
.Add(procSQLQuery)
// More table your procedure returns,more empty SQL query you should add
.Add(session.CreateSQLQuery(" ").AddEntity(typeof(Question))) // the second table returns Question Model
.List();
if (multiResults == null || multiResults.Count == 0)
{
return;
}
if (multiResults.Count != 2)
{
return;
}
var questions1 = ConvertObjectsToArray<Question>((System.Collections.IList)multiResults[0]);
var questions2 = ConvertObjectsToArray<Question>((System.Collections.IList)multiResults[1]);
}
static T[] ConvertObjectsToArray<T>(System.Collections.IList objects)
{
if (objects == null || objects.Count == 0)
{
return null;
}
var array = new T[objects.Count];
for (int i = 0; i < array.Length; i++)
{
array[i] = (T)objects[i];
}
return array;
}