Google BigQuery returns only partial table data with C# application using .net Client Library - google-bigquery

I am trying to execute the query (Basic select statement with 10 fields). My table contains more than 500k rows. C# application returns the response with only 4260 rows. However Web UI returns all the records.
Why my code returns only partial data, What is the best way to select all the records and load into C# Data Table? If there is any code snippet it would be more helpful to me.
using Google.Apis.Auth.OAuth2;
using System.IO;
using System.Threading;
using Google.Apis.Bigquery.v2;
using Google.Apis.Bigquery.v2.Data;
using System.Data;
using Google.Apis.Services;
using System;
using System.Security.Cryptography.X509Certificates;
namespace GoogleBigQuery
{
public class Class1
{
private static void Main()
{
try
{
Console.WriteLine("Start Time: {0}", DateTime.Now.ToString());
String serviceAccountEmail = "SERVICE ACCOUNT EMAIL";
var certificate = new X509Certificate2(#"KeyFile.p12", "notasecret", X509KeyStorageFlags.Exportable);
ServiceAccountCredential credential = new ServiceAccountCredential(
new ServiceAccountCredential.Initializer(serviceAccountEmail)
{
Scopes = new[] { BigqueryService.Scope.Bigquery, BigqueryService.Scope.BigqueryInsertdata, BigqueryService.Scope.CloudPlatform, BigqueryService.Scope.DevstorageFullControl }
}.FromCertificate(certificate));
BigqueryService Service = new BigqueryService(new BaseClientService.Initializer()
{
HttpClientInitializer = credential,
ApplicationName = "PROJECT NAME"
});
string query = "SELECT * FROM [publicdata:samples.shakespeare]";
JobsResource j = Service.Jobs;
QueryRequest qr = new QueryRequest();
string ProjectID = "PROJECT ID";
qr.Query = query;
qr.MaxResults = Int32.MaxValue;
qr.TimeoutMs = Int32.MaxValue;
DataTable DT = new DataTable();
int i = 0;
QueryResponse response = j.Query(qr, ProjectID).Execute();
string pageToken = null;
if (response.JobComplete == true)
{
if (response != null)
{
int colCount = response.Schema.Fields.Count;
if (DT == null)
DT = new DataTable();
if (DT.Columns.Count == 0)
{
foreach (var Column in response.Schema.Fields)
{
DT.Columns.Add(Column.Name);
}
}
pageToken = response.PageToken;
if (response.Rows != null)
{
foreach (TableRow row in response.Rows)
{
DataRow dr = DT.NewRow();
for (i = 0; i < colCount; i++)
{
dr[i] = row.F[i].V;
}
DT.Rows.Add(dr);
}
}
Console.WriteLine("No of Records are Readed: {0} # {1}", DT.Rows.Count.ToString(), DateTime.Now.ToString());
while (true)
{
int StartIndexForQuery = DT.Rows.Count;
Google.Apis.Bigquery.v2.JobsResource.GetQueryResultsRequest SubQR = Service.Jobs.GetQueryResults(response.JobReference.ProjectId, response.JobReference.JobId);
SubQR.StartIndex = (ulong)StartIndexForQuery;
//SubQR.MaxResults = Int32.MaxValue;
GetQueryResultsResponse QueryResultResponse = SubQR.Execute();
if (QueryResultResponse != null)
{
if (QueryResultResponse.Rows != null)
{
foreach (TableRow row in QueryResultResponse.Rows)
{
DataRow dr = DT.NewRow();
for (i = 0; i < colCount; i++)
{
dr[i] = row.F[i].V;
}
DT.Rows.Add(dr);
}
}
Console.WriteLine("No of Records are Readed: {0} # {1}", DT.Rows.Count.ToString(), DateTime.Now.ToString());
if (null == QueryResultResponse.PageToken)
{
break;
}
}
else
{
break;
}
}
}
else
{
Console.WriteLine("Response is null");
}
}
int TotalCount = 0;
if (DT != null && DT.Rows.Count > 0)
{
TotalCount = DT.Rows.Count;
}
else
{
TotalCount = 0;
}
Console.WriteLine("End Time: {0}", DateTime.Now.ToString());
Console.WriteLine("No. of records readed from google bigquery service: " + TotalCount.ToString());
}
catch (Exception e)
{
Console.WriteLine("Error Occurred: " + e.Message);
}
Console.ReadLine();
}
}
}
In this Sample Query get the results from public data set, In table contains 164656 rows but response returns 85000 rows only for the first time, then query again to get the second set of results. (But not known this is the only solution to get all the results).
In this sample contains only 4 fields, even-though it does not return all rows, in my case table contains more than 15 fields, I get response of ~4000 rows out of ~10k rows, I need to query again and again to get the remaining results for selecting 1000 rows takes time up to 2 minutes in my methodology so I am expecting best way to select all the records within single response.

Answer from User #:Pentium10
There is no way to run a query and select a large response in a single shot. You can either paginate the results, or if you can create a job to export to files, then use the files generated in your app. Exporting is free.
Step to run a large query and export results to files stored on GCS:
1) Set allowLargeResults to true in your job configuration. You must also specify a destination table with the allowLargeResults flag.
Example:
"configuration":
{
"query":
{
"allowLargeResults": true,
"query": "select uid from [project:dataset.table]"
"destinationTable": [project:dataset.table]
}
}
2) Now your data is in a destination table you set. You need to create a new job, and set the export property to be able to export the table to file(s). Exporting is free, but you need to have Google Cloud Storage activated to put the resulting files there.
3) In the end you download your large files from GCS.
It my turn to design the solution for better results.

Hoping this might help someone. One could retrieve next set of paginated result using PageToken. Here is the sample code for how to use PageToken. Although, I liked the idea of exporting for free. Here, I write rows to flat file but you could add them to your DataTable. Obviously, it is a bad idea to keep large DataTable in memory though.
public void ExecuteSQL(BigqueryService bqservice, String ProjectID)
{
string sSql = "SELECT r.Dealname, r.poolnumber, r.loanid FROM [MBS_Dataset.tblRemitData] R left join each [MBS_Dataset.tblOrigData] o on R.Dealname = o.Dealname and R.Poolnumber = o.Poolnumber and R.LoanID = o.LoanID Order by o.Dealname, o.poolnumber, o.loanid limit 100000";
QueryRequest _r = new QueryRequest();
_r.Query = sSql;
QueryResponse _qr = bqservice.Jobs.Query(_r, ProjectID).Execute();
string pageToken = null;
if (_qr.JobComplete != true)
{
//job not finished yet! expecting more data
while (true)
{
var resultReq = bqservice.Jobs.GetQueryResults(_qr.JobReference.ProjectId, _qr.JobReference.JobId);
resultReq.PageToken = pageToken;
var result = resultReq.Execute();
if (result.JobComplete == true)
{
WriteRows(result.Rows, result.Schema.Fields);
pageToken = result.PageToken;
if (pageToken == null)
break;
}
}
}
else
{
List<string> _fieldNames = _qr.Schema.Fields.ToList().Select(x => x.Name).ToList();
WriteRows(_qr.Rows, _qr.Schema.Fields);
}
}

The Web UI automatically flattens the data. This means that you see multiple rows for each nested field.
When you run the same query via the API, it won't be flattened, and you get fewer rows, as the nested fields are returned as objects. You should check if this is the case at you.
The other is that indeed you need to paginate through the results. Paging through list results has this explained.
If you want to do only one job, than you should write your query ouput to a table, than export the table as JSON, and download the export from GCS.

Related

Rally c# task time spent

I have a C# .net application using the rally 3.0.1 API. When I query task in my system I get 0.0 for time spent when I know they have time against them. Anyone know how to get this? Below is my code:
if (uTasks.Count > 0)
{
Request taskRequest = new Request(resultChild["Tasks"]);
QueryResult TaskQueryResult = restApi.Query(taskRequest);
foreach (var items in TaskQueryResult.Results)
//foreach (var items in uTasks)
{
DataRow dtrow2;
dtrow2 = dt.NewRow();
dtrow2["TaskID"]=items["FormattedID"];
dtrow2["Task Name"] = items["Name"];
if (items["Owner"] != null)
{
var owner = items["Owner"];
String ownerref = owner["_ref"];
var ownerFetch = restApi.GetByReference(ownerref, "Name");
string strTemp = ownerFetch["_refObjectName"];
dtrow2["Owner"] = strTemp.Replace(",", " ");
}
\\else { dtrow2["Owner"] = ""; }
dtrow2["Task-Est"] = items["Estimate"];
dtrow2["Task-ToDo"] = items["ToDo"];
dtrow2["Task-Spent"] = items["TimeSpent"];
dtrow2["ObjectType"] = "T";
dt.Rows.Add(dtrow2);
}
}
It seems like that should work. You may want to make sure you're including the TimeSpent field in your fetch before issuing the request.
taskRequest.Fetch = new List<string>() { "TimeSpent" };

Bigquery tables.list() returns only 1000 tables?

Am I doing something wrong in the following code or is this a bug?
com.google.api.services.bigquery.Bigquery.Tables.List list =
bigquery.tables().list(PROJECT_ID, datasetid);
list.setMaxResults((long) 5000);
return list.execute().getTables();
I have more than a 1000 tables in this dataset.
The maximum number of tables that will be returned in one request is 1000. However, you should also receive a pageToken in the response that can be used to page through further results.
as in:
List<Table> tables = new ArrayList<>();
com.google.api.services.bigquery.Bigquery.Tables.List list =
bigquery.tables().list(PROJECT_ID, datasetid);
list.setMaxResults(5000L);
String nextPageToken = null;
while (true) {
if (nextPageToken != null) {
list.setPageToken(nextPageToken);
}
TableList result = list.execute();
tables.addAll(result.getTables());
if (result.getNextPageToken() == null) {
break;
} else {
nextPageToken = result.getNextPageToken();
}
}
return tables;

Existing posts keep on re-add upon deletion of selected row in jTable

I try to refresh the data of jTable upon deletion of selected row. Here are my codes to set up table :
private JTable getJTableManageReplies() {
jTableManageReplies.setSelectionMode(ListSelectionModel.SINGLE_SELECTION);
jTableManageReplies.getSelectionModel().addListSelectionListener(
new ListSelectionListener() {
#Override
public void valueChanged(ListSelectionEvent e) {
if (!e.getValueIsAdjusting()) {
int viewRow = jTableManageReplies.getSelectedRow();
// Get the first column data of the selectedrow
int replyID = Integer.parseInt(jTableManageReplies.getValueAt(
viewRow, 0).toString());
eForumRepliesAdmin reply = new eForumRepliesAdmin(replyID);
replyID = JOptionPane.showConfirmDialog(null, "Are you sure that you want to delete the selected reply? " , "Delete replies", JOptionPane.YES_NO_OPTION);
if(replyID == JOptionPane.YES_OPTION){
reply.deleteReply();
JOptionPane.showMessageDialog(null, "Reply has been deleted successfully.");
SetUpJTableManageReplies();
}
}
}
});
return jTableManageReplies;
}
public void SetUpJTableManageReplies() {
DefaultTableModel tableModel = (DefaultTableModel) jTableManageReplies
.getModel();
String[] data = new String[5];
db.setUp("IT Innovation Project");
String sql = "Select forumReplies.reply_ID,forumReplies.reply_topic,forumTopics.topic_title,forumReplies.reply_content,forumReplies.reply_by from forumReplies,forumTopics WHERE forumReplies.reply_topic = forumTopics.topic_id ";
ResultSet resultSet = null;
resultSet = db.readRequest(sql);
jTableManageReplies.repaint();
tableModel.getDataVector().removeAllElements();
try {
while (resultSet.next()) {
data[0] = resultSet.getString("reply_ID");
data[1] = resultSet.getString("reply_topic");
data[2] = resultSet.getString("topic_title");
data[3] = resultSet.getString("reply_content");
data[4] = resultSet.getString("reply_by");
tableModel.addRow(data);
}
resultSet.close();
} catch (Exception e) {
System.out.println(e);
}
}
And this is my sql statement :
public boolean deleteReply() {
boolean success = false;
DBController db = new DBController();
db.setUp("IT Innovation Project");
String sql = "DELETE FROM forumReplies where reply_ID = " + replyID
+ "";
if (db.updateRequest(sql) == 1)
success = true;
db.terminate();
return success;
}
I called the repaint() to update the table data with the newest data in database and it works. I mean the data after deletion of certain row. However, the existing posts will keep on re-add. Then I add the removeAllElement method to remove all the existing posts because my sql statement is select * from table. Then, there is an error message which is ArrayIndexOutOfBoundsException. Any guides to fix this? Thanks in advance.
I called the repaint() to update the table data with the newest data
in database and it works.
There is no need to call repaint method when data is changed. Data change is handled by the Table Model (DefaultTableModel in this case.) And fireXXXMethods are required to be called whenever data is changed but you are using DefaultTableModel even those are not required. (Since by default it call these methods when ever there is a change.)
I think the problem is in the valuesChanged(..) method. You are getting the value at row 0 but not checking whether table has rows or not. So keep a constraint.
int viewRow = jTableManageReplies.getSelectedRow();
// Get the first column data of the selectedrow
if(jTableManageReplies.getRowCount() > 0)
int replyID = Integer.parseInt(jTableManageReplies.getValueAt(viewRow, 0).toString());

Topic views do not show up in jTable

I try to code a forum using java swing. Firstly, on click for the jTable, it will lead to eForumContent.class which pass in the id together.
jTable.addMouseListener(new java.awt.event.MouseAdapter() {
public void mouseClicked(java.awt.event.MouseEvent e) {
int id = 0;
eForumTopics topics = new eForumTopics(id);
topics.retrieveDiscussion();
eForumThreadContent myWindow = new eForumThreadContent(id);
myWindow.getJFrame().setVisible(true);
}
});
I initialize the id first. But I set my id in database table to auto number. Then I call the retrieveDiscussion method to get the id from database and pass it to eForumThreadContent. Here are my codes for retrieveDiscussion method.
public boolean retrieveDiscussion(){
boolean success = false;
ResultSet rs = null;
DBController db = new DBController();
db.setUp("IT Innovation Project");
String dbQuery = "SELECT * FROM forumTopics WHERE topic_id = '" + id
+ "'";
rs = db.readRequest(dbQuery);
db.terminate();
return success;
}
}
Then, inside the eForumThreadContent, I want to display the content of chosen thread using a table. So I use the id which I pass in just now and insert into my sql statement. Here are my codes.
public eForumThreadContent(int id) {
this.id = id;
}
if (jTable == null) {
Vector columnNames = new Vector(); // Vector class allows dynamic
// array of objects
Vector data = new Vector();
try {
DBController db = new DBController();
db.setUp("IT Innovation Project");
Class.forName("sun.jdbc.odbc.JdbcOdbcDriver").newInstance();
String dsn = "IT Innovation Project";
String s = "jdbc:odbc:" + dsn;
Connection con = DriverManager.getConnection(s, "", "");
String sql = "Select topic_title,topic_description,topic_by from forumTopics WHERE topic_id = '"+id+"'";
java.sql.Statement statement = con.createStatement();
ResultSet resultSet = statement.executeQuery(sql);
ResultSetMetaData metaData = resultSet.getMetaData();
int columns = metaData.getColumnCount();
for (int i = 1; i <= columns; i++) {
columnNames.addElement(metaData.getColumnName(i));
}
while (resultSet.next()) {
Vector row = new Vector(columns);
for (int i = 1; i <= columns; i++) {
row.addElement(resultSet.getObject(i));
}
data.addElement(row);
}
resultSet.close();
((Connection) statement).close();
} catch (Exception e) {
System.out.println(e);
}
jTable = new JTable(data, columnNames);
TableColumn column;
for (int i = 0; i < jTable.getColumnCount(); i++) {
column = jTable.getColumnModel().getColumn(i);
if (i == 1) {
column.setPreferredWidth(400); // second column is bigger
}else {
column.setPreferredWidth(200);
}
}
String header[] = { "Title", "Description", "Posted by" };
for (int i = 0; i < jTable.getColumnCount(); i++) {
TableColumn column1 = jTable.getTableHeader().getColumnModel()
.getColumn(i);
column1.setHeaderValue(header[i]);
}
jTable.getTableHeader().setFont( new Font( "Dialog" , Font.PLAIN, 20 ));
jTable.getTableHeader().setForeground(Color.white);
jTable.getTableHeader().setBackground(new Color(102, 102, 102));
jTable.setEnabled(false);
jTable.setRowHeight(100);
jTable.getRowHeight();
jTable.setFont( new Font( "Dialog" , Font.PLAIN, 18 ));
}
return jTable;
}
However, the table inside eForumThreadContent is empty even when I clicked on certain thread. It also gives me an error message.
[Microsoft][ODBC Microsoft Access Driver] Data type mismatch in criteria expression.
at sun.jdbc.odbc.JdbcOdbc.createSQLException(Unknown Source)
at sun.jdbc.odbc.JdbcOdbc.standardError(Unknown Source)
at sun.jdbc.odbc.JdbcOdbc.SQLExecDirect(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcStatement.execute(Unknown Source)
at sun.jdbc.odbc.JdbcOdbcStatement.executeQuery(Unknown Source)
at DBController.database.DBController.readRequest(DBController.java:27)
at kioskeForum.entity.eForumTopics.retrieveDiscussion(eForumTopics.java:67)
at kioskeForum.ui.eForumDiscussion$3.mouseClicked(eForumDiscussion.java:296)
I research online to get an idea for topic views using id. But my table does not show up anything. Can somebody enlighten me how to fix it? Or any other ways to display topic views in java swing? Thanks in advance.

How can I do this all in one query? Nhibernate

I have a list of Ids and I want to get all the rows back in one query. As a list of objects(So a List of Products or whatever).
I tried
public List<TableA> MyMethod(List<string> keys)
{
var query = "SELECT * FROM TableA WHERE Keys IN (:keys)";
var a = session.CreateQuery(query).SetParameter("keys", keys).List();
return a; // a is a IList but not of TableA. So what do I do now?
}
but I can't figure out how to return it as a list of objects. Is this the right way?
List<TableA> result = session.CreateQuery(query)
.SetParameterList("keys", keys)
.List<TableA>();
Howeever there could be a limitation in this query if number of ":keys" exceed more than 1000 (incase of oracle not sure with other dbs) so i would recommend to use ICriteria instead of CreateQuery- native sqls.
Do something like this,
[TestFixture]
public class ThousandIdsNHibernateQuery
{
[Test]
public void TestThousandIdsNHibernateQuery()
{
//Keys contains 1000 ids not included here.
var keys = new List<decimal>();
using (ISession session = new Session())
{
var tableCirt = session.CreateCriteria(typeof(TableA));
if (keys.Count > 1000)
{
var listsList = new List<List<decimal>>();
//Get first 1000.
var first1000List = keys.GetRange(0, 1000);
//Split next keys into 1000 chuncks.
for (int i = 1000; i < keys.Count; i++)
{
if ((i + 1)%1000 == 0)
{
var newList = new List<decimal>();
newList.AddRange(keys.GetRange(i - 999, 1000));
listsList.Add(newList);
}
}
ICriterion firstExp = Expression.In("Key", first1000List);
ICriterion postExp = null;
foreach (var list in listsList)
{
postExp = Expression.In("Key", list);
tableCirt.Add(Expression.Or(firstExp, postExp));
firstExp = postExp;
}
tableCirt.Add(postExp);
}
else
{
tableCirt.Add(Expression.In("key", keys));
}
var results = tableCirt.List<TableA>();
}
}
}