NHibernate does not seem to be doing bulk inserts into PostgreSQL


I am interfacing with a PostgreSQL database through NHibernate.
Background
I ran some simple tests, and it takes about 2 seconds to persist 300 records.
A Perl program with identical functionality, which issues direct SQL instead, takes only 70% of that time.
I am not sure whether this is expected. I thought C#/NHibernate would be faster, or at least on par.
Questions
One of my observations (with show_sql turned on) is that NHibernate issues a few hundred separate INSERT statements instead of a single bulk INSERT that covers multiple rows. Note that I am assigning the primary key myself, not using the "native" generator.
Is that expected? Is there any way to make it issue a bulk INSERT statement instead? It seems to me that this is one area where I could improve performance.

As stachu correctly found out, NHibernate does not have a *BatchingBatcher(Factory) for PostgreSQL (Npgsql).
As stachu asks: did anybody manage to force NHibernate to do batch inserts to PostgreSQL?
I wrote a batcher that does not use any Npgsql batching features but instead manipulates the SQL string "old-school style" (INSERT INTO [..] VALUES (...),(...), ...):
using System;
using System.Collections;
using System.Data;
using System.Diagnostics;
using System.Text;
using Npgsql;
namespace NHibernate.AdoNet
{
public class PostgresClientBatchingBatcherFactory : IBatcherFactory
{
public virtual IBatcher CreateBatcher(ConnectionManager connectionManager, IInterceptor interceptor)
{
return new PostgresClientBatchingBatcher(connectionManager, interceptor);
}
}
/// <summary>
/// Batches consecutive single-row INSERT ... VALUES commands into one multi-row INSERT statement.
/// </summary>
public class PostgresClientBatchingBatcher : AbstractBatcher
{
private int batchSize;
private int countOfCommands = 0;
private int totalExpectedRowsAffected;
private StringBuilder sbBatchCommand;
private int m_ParameterCounter;
private IDbCommand currentBatch;
public PostgresClientBatchingBatcher(ConnectionManager connectionManager, IInterceptor interceptor)
: base(connectionManager, interceptor)
{
batchSize = Factory.Settings.AdoBatchSize;
}
private string NextParam()
{
return ":p" + m_ParameterCounter++;
}
public override void AddToBatch(IExpectation expectation)
{
if(expectation.CanBeBatched && !(CurrentCommand.CommandText.StartsWith("INSERT INTO") && CurrentCommand.CommandText.Contains("VALUES")))
{
//NonBatching behavior
IDbCommand cmd = CurrentCommand;
LogCommand(CurrentCommand);
int rowCount = ExecuteNonQuery(cmd);
expectation.VerifyOutcomeNonBatched(rowCount, cmd);
currentBatch = null;
return;
}
totalExpectedRowsAffected += expectation.ExpectedRowCount;
log.Info("Adding to batch");
int len = CurrentCommand.CommandText.Length;
int idx = CurrentCommand.CommandText.IndexOf("VALUES");
int endidx = idx + "VALUES".Length + 2;
if (currentBatch == null)
{
// begin new batch.
currentBatch = new NpgsqlCommand();
sbBatchCommand = new StringBuilder();
m_ParameterCounter = 0;
string preCommand = CurrentCommand.CommandText.Substring(0, endidx);
sbBatchCommand.Append(preCommand);
}
else
{
//only append Values
sbBatchCommand.Append(", (");
}
//append values from CurrentCommand to sbBatchCommand
string values = CurrentCommand.CommandText.Substring(endidx, len - endidx - 1);
//get all values
string[] split = values.Split(',');
ArrayList paramName = new ArrayList(split.Length);
for (int i = 0; i < split.Length; i++ )
{
if (i != 0)
sbBatchCommand.Append(", ");
string param = null;
if (split[i].StartsWith(":")) //first named parameter
{
param = NextParam();
paramName.Add(param);
}
else if(split[i].StartsWith(" :")) //other named parameter
{
param = NextParam();
paramName.Add(param);
}
else if (split[i].StartsWith(" ")) //other fix parameter
{
param = split[i].Substring(1, split[i].Length-1);
}
else
{
param = split[i]; //first fix parameter
}
sbBatchCommand.Append(param);
}
sbBatchCommand.Append(")");
//rename & copy parameters from CurrentCommand to currentBatch
int iParam = 0;
foreach (NpgsqlParameter param in CurrentCommand.Parameters)
{
param.ParameterName = (string)paramName[iParam++];
NpgsqlParameter newParam = /*Clone()*/new NpgsqlParameter(param.ParameterName, param.NpgsqlDbType, param.Size, param.SourceColumn, param.Direction, param.IsNullable, param.Precision, param.Scale, param.SourceVersion, param.Value);
currentBatch.Parameters.Add(newParam);
}
countOfCommands++;
//check for flush
if (countOfCommands >= batchSize)
{
DoExecuteBatch(currentBatch);
}
}
protected override void DoExecuteBatch(IDbCommand ps)
{
if (currentBatch != null)
{
//Batch command now needs its terminator
sbBatchCommand.Append(";");
countOfCommands = 0;
log.Info("Executing batch");
CheckReaders();
//set prepared batchCommandText
string commandText = sbBatchCommand.ToString();
currentBatch.CommandText = commandText;
LogCommand(currentBatch);
Prepare(currentBatch);
int rowsAffected = 0;
try
{
rowsAffected = currentBatch.ExecuteNonQuery();
}
catch (Exception e)
{
if(Debugger.IsAttached)
Debugger.Break();
throw;
}
Expectations.VerifyOutcomeBatched(totalExpectedRowsAffected, rowsAffected);
totalExpectedRowsAffected = 0;
currentBatch = null;
sbBatchCommand = null;
m_ParameterCounter = 0;
}
}
protected override int CountOfStatementsInCurrentBatch
{
get { return countOfCommands; }
}
public override int BatchSize
{
get { return batchSize; }
set { batchSize = value; }
}
}
}
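The string rewriting that this batcher performs can be looked at in isolation. The following Python sketch is only an illustration of the idea, not NHibernate or Npgsql code: merge_inserts is a hypothetical helper, and it assumes every command is a single-row INSERT ... VALUES statement with :name placeholders and an identical column list. Like the batcher above, it works on the SQL text, so a literal containing a colon (or a cast such as ::int) would confuse it.

```python
import re

# Matches named placeholders such as :p0 (assumed placeholder style).
PLACEHOLDER = re.compile(r":\w+")


def merge_inserts(commands):
    """Merge single-row INSERTs into one multi-row INSERT.

    `commands` is a list of (sql, params) pairs; `params` holds the values
    in placeholder order. Returns (merged_sql, merged_params).
    Hypothetical helper for illustration only.
    """
    if not commands:
        raise ValueError("nothing to merge")
    prefix = None
    rows = []
    merged = {}
    for sql, params in commands:
        head, keyword, tail = sql.partition("VALUES")
        if not keyword:
            raise ValueError("not an INSERT ... VALUES statement: " + sql)
        if prefix is None:
            prefix = head            # "INSERT INTO t (a, b) "
        elif head != prefix:
            raise ValueError("statements target different tables or columns")
        values = iter(params)

        def rename(_match):
            # Give every placeholder a fresh, batch-wide unique name.
            name = "p%d" % len(merged)
            merged[name] = next(values)
            return ":" + name

        rows.append(PLACEHOLDER.sub(rename, tail.strip().rstrip(";")))
    return prefix + "VALUES " + ", ".join(rows) + ";", merged


if __name__ == "__main__":
    sql, params = merge_inserts([
        ("INSERT INTO t (id, name) VALUES (:p0, :p1)", [1, "a"]),
        ("INSERT INTO t (id, name) VALUES (:p0, :p1)", [2, "b"]),
    ])
    print(sql)
    print(params)
```

Renumbering the placeholders is the important step: each original command reuses :p0, :p1, and so on, so the merged statement needs names that are unique across the whole batch.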

I also found that NHibernate is not doing batch inserts into PostgreSQL.
I identified two possible reasons:
1) The Npgsql driver does not support batch inserts/updates (see forum).
2) NHibernate does not have a *BatchingBatcher(Factory) for PostgreSQL (Npgsql). I tried using the Devart dotConnect driver with NHibernate (I wrote a custom driver for NHibernate), but it still did not work.
I suppose this driver should also implement the IEmbeddedBatcherFactoryProvider interface, but that does not seem trivial to me (using the one for Oracle did not work ;) ).
Has anybody managed to force NHibernate to do batch inserts to PostgreSQL, or can anybody confirm my conclusion?

Related

SSAS Cube Metadata using SSIS script component with C# program

I am using a Script Component in SSIS with C# code that uses the Microsoft.AnalysisServices namespace to fetch the cube metadata. The code looks roughly like this:
using System;
using System.Data;
using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
using Microsoft.SqlServer.Dts.Runtime.Wrapper;
using Microsoft.AnalysisServices;
using System.Windows.Forms;
[Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
public class ScriptMain : UserComponent
{
//IDTSConnectionManager100 connMgr;
Server OLAPServer = new Server();
public override void AcquireConnections(object Transaction)
{
OLAPServer.Connect(this.Connections.OLAPConnection.ConnectionString);
}
public override void PreExecute()
{
base.PreExecute();
/*
Add your code here for preprocessing or remove if not needed
*/
}
public override void PostExecute()
{
base.PostExecute();
/*
Add your code here for postprocessing or remove if not needed
You can set read/write variables here, for example:
Variables.MyIntVar = 100
*/
}
public override void CreateNewOutputRows()
{
IDTSVariables100 vars = null;
string OLAPDBName;
VariableDispenser.LockOneForRead("OLAPDBName", ref vars);
Database OLAPDB;
OLAPDBName = vars[0].Value.ToString();
try
{
OLAPDB = OLAPServer.Databases.GetByName(OLAPDBName);
}
catch
{
return;
}
// loop through cubes
CubeCollection Cubes = OLAPDB.Cubes;
MeasureGroupCollection Mgroups;
CubeDimensionCollection Dimensions;
MeasureGroupDimensionCollection MgroupDims;
DimensionAttributeCollection Attributes;
foreach (Cube cb in Cubes)
{
//Test for one Measure Group
//MeasureGroup mgroup = Mgroups.GetByName("Inward Exposure");
Mgroups = cb.MeasureGroups;
// all dimensions associated with that Measure Group
// loop through Measure Groups
foreach (MeasureGroup mg in Mgroups)
{
// loop though all cube dimensions
Dimensions = cb.Dimensions;
foreach (CubeDimension dim in Dimensions)
{
bool CanBeAnalysed = false;
// loop through dimensions and see if dimension exists in mgroupDims (ie check if it can be analysed)
MgroupDims = mg.Dimensions;
foreach (MeasureGroupDimension mgd in MgroupDims)
{
if (mgd.CubeDimension == dim)
{
CanBeAnalysed = true;
break;
}
}
// loop through each Measure and Attribute a
String DimName = dim.Name;
bool DimVisible = dim.Visible;
String MgroupName = mg.Name;
String CubeName = cb.Name;
String MeasureExpression;
String Description;
// for every attribute in dimension
Attributes = dim.Dimension.Attributes;
foreach (DimensionAttribute Attr in Attributes)
{
String AttrName = Attr.Name;
bool AttrVisible = Attr.AttributeHierarchyVisible;
String AttrNameColumn = Attr.NameColumn.ToString();
String AttributeRelationship = Attr.AttributeRelationships.ToString();
// get every measure in measuregroup
foreach (Measure m in mg.Measures)
{
String MeasureName = m.Name.ToString();
bool MeasureVisible = m.Visible;
String MeasureNameColumn = m.Source.ToString();
if (m.MeasureExpression != null)
{
// MessageBox.Show(m.MeasureExpression.ToString());
MeasureExpression = m.MeasureExpression.ToString();
}
else
{
// MessageBox.Show(m.MeasureExpression.ToString());
MeasureExpression = " " ;
}
if (m.Description != null)
{
// MessageBox.Show(m.MeasureExpression.ToString());
Description = m.Description.ToString();
}
else
{
// MessageBox.Show(m.MeasureExpression.ToString());
Description = " ";
}
Output0Buffer.AddRow();
Output0Buffer.OLAPDBName = OLAPDBName;
Output0Buffer.CubeName = CubeName;
Output0Buffer.DimensionName = DimName;
Output0Buffer.DimensionVisible = DimVisible;
Output0Buffer.AttrDDSColumn = AttrNameColumn;
Output0Buffer.AttrName = AttrName;
Output0Buffer.AttrVisible = AttrVisible;
Output0Buffer.MeasureGroupName = MgroupName;
Output0Buffer.MeasureName = MeasureName;
Output0Buffer.MeasureVisible = MeasureVisible;
Output0Buffer.MeasureDDSColumn = MeasureNameColumn;
Output0Buffer.IsAnalysable = CanBeAnalysed;
Output0Buffer.MeasureExpression = MeasureExpression;
Output0Buffer.Description = Description;
Output0Buffer.AttributeRelationship = AttributeRelationship;
}
}
} // end of Cube Dim Loop
} // end of Measure Group loop
} // end of cube loop
}
}
I was able to get the cube metadata with the above code. However, I am stuck on getting the metadata of the cube perspectives and the relationships of the measure groups, i.e. whether the measure groups are many-to-many. Any help is much appreciated.

Here is some code for detecting dimension relationships including many-to-many. See the GetDimensionUsage function:
https://raw.githubusercontent.com/BIDeveloperExtensions/bideveloperextensions/master/SSAS/PrinterFriendlyDimensionUsage.cs
Here is some code around navigating perspectives:
https://raw.githubusercontent.com/BIDeveloperExtensions/bideveloperextensions/master/SSAS/TriStatePerspectivesPlugin.cs
Start reading around the following line:
if (perspective.MeasureGroups.Contains(mg.Name))

Adding a message to FacesContext is not working in Java EE 7 running on GlassFish?

I am working through the Java EE 7 tutorial that comes with the GlassFish installation. It is also available here. The code is in the GlassFish server installation directory
/glassFish_installation/glassfish4/docs/javaee-tutorial/examples/cdi/guessnumber-cdi.
The code works fine as it is. It currently displays "Correct!" when the user guesses the number, but it does not display a "failed" message at the end of the game, so I introduced one minor change to display it. I have added comments right above the relevant change in the code.
This change did not help: at the end of the game, the failed message is not displayed, although the game otherwise works as usual.
I would like to know why this did not work and how to correct it.
Thanks
public class UserNumberBean implements Serializable {
private static final long serialVersionUID = -7698506329160109476L;
private int number;
private Integer userNumber;
private int minimum;
private int remainingGuesses;
@Inject
@MaxNumber
private int maxNumber;
private int maximum;
@Inject
@Random
Instance<Integer> randomInt;
public UserNumberBean() {
}
public int getNumber() {
return number;
}
public void setUserNumber(Integer user_number) {
userNumber = user_number;
}
public Integer getUserNumber() {
return userNumber;
}
public int getMaximum() {
return (this.maximum);
}
public void setMaximum(int maximum) {
this.maximum = maximum;
}
public int getMinimum() {
return (this.minimum);
}
public void setMinimum(int minimum) {
this.minimum = minimum;
}
public int getRemainingGuesses() {
return remainingGuesses;
}
public String check() throws InterruptedException {
if (userNumber > number) {
maximum = userNumber - 1;
}
if (userNumber < number) {
minimum = userNumber + 1;
}
if (userNumber == number) {
FacesContext.getCurrentInstance().addMessage(null,
new FacesMessage("Correct!"));
}
//if remainingGuesses is less than or equal to zero, display failed message
//-----------------------------------------------
if (remainingGuesses-- <= 0) {
FacesContext.getCurrentInstance().addMessage(null,
new FacesMessage("failed "));
}
return null;
}
@PostConstruct
public void reset() {
this.minimum = 0;
this.userNumber = 0;
this.remainingGuesses = 10;
this.maximum = maxNumber;
this.number = randomInt.get();
}
public void validateNumberRange(FacesContext context,
UIComponent toValidate,
Object value) {
int input = (Integer) value;
if (input < minimum || input > maximum) {
((UIInput) toValidate).setValid(false);
FacesMessage message = new FacesMessage("Invalid guess");
context.addMessage(toValidate.getClientId(context), message);
}
}
}

Adding the FacesMessage actually works; the problem is that you are using a postdecrement in your condition.
A postdecrement expression yields the variable's old value, and the variable is decremented only after that value has been read.
That means, if you write:
if (remainingGuesses-- <= 0) {
the comparison uses the value remainingGuesses had before the decrement.
In your case, when the last guess is checked, remainingGuesses is still 1, so the if-condition is false and the message is not added.
Some obvious solutions:
if (remainingGuesses-- <= 1) {
    FacesContext.getCurrentInstance().addMessage(null,
            new FacesMessage("failed "));
}
or
if (--remainingGuesses <= 0) {
    FacesContext.getCurrentInstance().addMessage(null,
            new FacesMessage("failed "));
}
or
remainingGuesses--;
if (remainingGuesses <= 0) {
    FacesContext.getCurrentInstance().addMessage(null,
            new FacesMessage("failed "));
}
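To see on which call each variant first fires, the counter can be traced by hand. The sketch below does that in Python, which has no decrement operators, so the read and the decrement are written as two explicit steps. The function name and the variant labels are made up for this illustration, and the starting value of 10 is taken from reset() in the question.

```python
def first_failed_call(start, variant):
    """Return the 1-based call to check() on which the 'failed' branch first runs.

    variant "post"      models: if (remainingGuesses-- <= 0)
    variant "post_le_1" models: if (remainingGuesses-- <= 1)
    variant "pre"       models: if (--remainingGuesses <= 0)
    """
    remaining = start
    call = 0
    while True:
        call += 1
        if variant == "post":
            old = remaining          # value used by the comparison
            remaining -= 1           # decrement happens after the read
            fired = old <= 0
        elif variant == "post_le_1":
            old = remaining
            remaining -= 1
            fired = old <= 1
        elif variant == "pre":
            remaining -= 1           # decrement happens before the read
            fired = remaining <= 0
        else:
            raise ValueError("unknown variant: " + variant)
        if fired:
            return call


if __name__ == "__main__":
    for variant in ("post", "post_le_1", "pre"):
        print(variant, first_failed_call(10, variant))
```

With a start value of 10, the original condition would first be true on an eleventh call, while both corrected conditions are true on the tenth.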
See also:
Is there a difference between x++ and ++x in java?
Difference between i++ and ++i in a loop?

Looped road network extraction in ArcGIS

I have a road network with some dangling nodes and some looped roads. I have removed the dangling roads, and now I also want to remove the looped roads. Can anyone tell me how I can do that? Thanks in advance (:

1) Finding Loops in Road Network:
For roads that form 3/4 loops, use the sinuosity formula (length / straight-line distance between the line ends). The Python formula is:
!Shp_lngth! / math.pow(math.pow( !X_Start! - !X_End! , 2 ) + math.pow( !Y_Start! - !Y_End!, 2 ), 0.5) > 2
For loops that sit just at the end of a road (like cul-de-sac bulbs), use the Create Route tool and find roads with MMonotonicity values of 4 or above.
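Outside the field calculator, the same sinuosity test could be written as a small standalone function. This is a sketch under assumptions: the function names and the default threshold of 2 simply mirror the formula above, and a feature whose start and end points coincide is treated as infinitely sinuous, because the plain formula would divide by zero for it.

```python
import math


def sinuosity(length, x_start, y_start, x_end, y_end):
    """Line length divided by the straight-line distance between its ends."""
    straight = math.hypot(x_end - x_start, y_end - y_start)
    if straight == 0:
        # Closed ring: start and end coincide, so the ratio is undefined.
        return math.inf
    return length / straight


def is_loop_like(length, x_start, y_start, x_end, y_end, threshold=2.0):
    """True when the line is much longer than the gap between its ends."""
    return sinuosity(length, x_start, y_start, x_end, y_end) > threshold


if __name__ == "__main__":
    # A 12-unit road whose ends are 5 units apart has sinuosity 2.4.
    print(sinuosity(12, 0, 0, 3, 4), is_loop_like(12, 0, 0, 3, 4))
```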
2) Finding Dangling Nodes in Road Network:
Use this code
using System;
using System.Collections.Generic;
using ESRI.ArcGIS.Geodatabase;
using ESRI.ArcGIS.Carto;
using System.Windows.Forms;
using ESRI.ArcGIS.esriSystem;
using ESRI.ArcGIS.EditorExt;
namespace KalkulatorAddin
{
public class TestButton : ESRI.ArcGIS.Desktop.AddIns.Button
{
public TestButton()
{
}
protected override void OnClick()
{
try
{
Test();
}
catch (Exception ex)
{
MessageBox.Show(ex.Message);
}
}
protected override void OnUpdate()
{
}
public void Test()
{
var fSel = ArcMap.Document.FocusMap.get_Layer(0) as IFeatureSelection;
fSel.Clear();
var dangleOids = GetDangleOids();
if (dangleOids.Count > 0)
{
var oidarray = dangleOids.ToArray();
fSel.SelectionSet.AddList(dangleOids.Count, ref oidarray[0]);
}
((IActiveView)ArcMap.Document.FocusMap).Refresh();
}
private List<int> GetDangleOids()
{
UID topoUiD = new UID();
topoUiD.Value = "esriEditorExt.TopologyExtension";
var topoExt = ArcMap.Application.FindExtensionByCLSID(topoUiD) as ITopologyExtension;
var mapTopology = topoExt.CurrentTopology as IMapTopology;
if (mapTopology == null)
throw new Exception("map topology not found");
//assume just one class in the map topology
var extent = ((IGeoDataset)mapTopology.get_Class(0)).Extent;
mapTopology.Cache.Build(extent, false);
var dangleOids = new List<int>();
var nodes = mapTopology.Cache.Nodes;
nodes.Reset();
ITopologyNode node;
while ((node = nodes.Next()) != null)
{
// sometimes degree is referred to as valence
if (node.Degree == 1)
{
var parents = node.Parents;
parents.Reset();
int oid = parents.Next().m_FID;
if (!dangleOids.Contains(oid))
dangleOids.Add(oid);
}
}
return dangleOids;
}
}
}

What's the point of HibernateTemplate's bulkUpdate?

Is HibernateTemplate's bulkUpdate actually doing a bulk update? I looked at the code, and it doesn't seem to be. Or am I missing something?
public int bulkUpdate(final String queryString, final Object... values) throws DataAccessException {
return executeWithNativeSession(new HibernateCallback<Integer>() {
public Integer doInHibernate(Session session) throws HibernateException {
Query queryObject = session.createQuery(queryString);
prepareQuery(queryObject);
if (values != null) {
for (int i = 0; i < values.length; i++) {
queryObject.setParameter(i, values[i]);
}
}
return queryObject.executeUpdate();
}
});
}
whereas JdbcTemplate's batchUpdate does look like it is doing a batch update:
public int[] batchUpdate(final String[] sql) throws DataAccessException {
Assert.notEmpty(sql, "SQL array must not be empty");
if (logger.isDebugEnabled()) {
logger.debug("Executing SQL batch update of " + sql.length + " statements");
}
class BatchUpdateStatementCallback implements StatementCallback<int[]>, SqlProvider {
private String currSql;
public int[] doInStatement(Statement stmt) throws SQLException, DataAccessException {
int[] rowsAffected = new int[sql.length];
if (JdbcUtils.supportsBatchUpdates(stmt.getConnection())) {
for (String sqlStmt : sql) {
this.currSql = sqlStmt;
stmt.addBatch(sqlStmt);
}
rowsAffected = stmt.executeBatch();
}
else {
for (int i = 0; i < sql.length; i++) {
this.currSql = sql[i];
if (!stmt.execute(sql[i])) {
rowsAffected[i] = stmt.getUpdateCount();
}
else {
throw new InvalidDataAccessApiUsageException("Invalid batch SQL statement: " + sql[i]);
}
}
}
return rowsAffected;
}
public String getSql() {
return this.currSql;
}
}
return execute(new BatchUpdateStatementCallback());
}

Yes, it is doing a bulk update. As you can see, the UPDATE, DELETE, and INSERT queries executed through the bulkUpdate method can each affect multiple rows with a single statement. That's why they are called bulk operations.
The point is to have a handy method that executes an update query and returns the number of rows affected by the bulk operation. Additionally, it wraps exceptions in DataAccessException.
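The difference between a bulk operation (one statement, many rows) and a batch (many statements sent together) can be shown with any SQL database. The following sketch uses Python's built-in sqlite3 module purely as a stand-in. It does not involve Hibernate or Spring, and the item table and its columns are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO item (id, price) VALUES (?, ?)",
    [(i, 10) for i in range(1, 6)],
)

# Bulk operation: a single statement that touches several rows at once.
bulk = conn.execute("UPDATE item SET price = price + 1 WHERE id <= 3")
bulk_rows = bulk.rowcount

# Batch: the same single-row statement executed once per parameter set.
batch = conn.executemany(
    "UPDATE item SET price = ? WHERE id = ?",
    [(20, 4), (30, 5)],
)
batch_rows = batch.rowcount

prices = [row[0] for row in conn.execute("SELECT price FROM item ORDER BY id")]
print(bulk_rows, batch_rows, prices)
```

In this analogy, bulkUpdate corresponds to the first UPDATE and batchUpdate to the executemany call.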

NHibernate: Saving different types of objects in the same session breaks batching

Newbie here, sorry if this is an obvious question.
It seems that saving different types of objects in the same session breaks batching, causing a significant performance drop.
The ID generator is set to Increment (as Diego Mijelshon advised, I tried hilo("100"), but unfortunately got the same issue: Test1() is still about 5 times slower than Test2()):
public class CustomIdConvention : IIdConvention
{
    public void Apply(IIdentityInstance instance)
    {
        instance.GeneratedBy.Increment();
    }
}
AdoNetBatchSize is set to 1000:
MsSqlConfiguration.MsSql2008
    .ConnectionString(connectionString)
    .AdoNetBatchSize(1000)
    .Cache(x => x
        .UseQueryCache()
        .ProviderClass<HashtableCacheProvider>())
    .ShowSql();
These are the models:
public class TestClass1
{
    public virtual int Id { get; private set; }
}
public class TestClass2
{
    public virtual int Id { get; private set; }
}
These are the test methods. Test1() takes 62 seconds; Test2() takes only 11 seconds. (As Phill advised, I tried stateless sessions, but unfortunately got the same issue.)
[TestMethod]
public void Test1()
{
int count = 50 * 1000;
using (var session = SessionFactory.OpenSession())
{
using (var transaction = session.BeginTransaction())
{
for (int i = 0; i < count; i++)
{
var x = new TestClass1();
var y = new TestClass2();
session.Save(x);
session.Save(y);
}
transaction.Commit();
}
}
}
[TestMethod]
public void Test2()
{
int count = 50 * 1000;
using (var session = SessionFactory.OpenSession())
{
using (var transaction = session.BeginTransaction())
{
for (int i = 0; i < count; i++)
{
var x = new TestClass1();
session.Save(x);
}
transaction.Commit();
}
}
using (var session = SessionFactory.OpenSession())
{
using (var transaction = session.BeginTransaction())
{
for (int i = 0; i < count; i++)
{
var y = new TestClass2();
session.Save(y);
}
transaction.Commit();
}
}
}
Any ideas?
Thanks!
Update:
The test project can be downloaded from here. You need to change the connectionString in the Main method. I changed all sessions to stateless sessions.
My results: Test1 = 59.11, Test2 = 7.60, Test3 = 7.72. Test1 is 7.7 times slower than Test2 and Test3!

Do not use increment. It's the worst possible generator.
Try changing it to HiLo.
Update:
It looks like the problem occurs when alternating saves of different entities, regardless of whether the session/transaction are separated or not.
This produces similar results to the second test method:
[TestMethod]
public void Test3()
{
int count = 50 * 1000;
using (var session = SessionFactory.OpenSession())
{
using (var transaction = session.BeginTransaction())
{
for (int i = 0; i < count; i++)
{
var x = new TestClass1();
session.Save(x);
}
for (int i = 0; i < count; i++)
{
var y = new TestClass2();
session.Save(y);
}
transaction.Commit();
}
}
}
My guess, without looking at NH's sources, is that it preserves the order because of possible relationships between the entities, even when there are none.

When you run Test2 and Test3, the inserts are batched together.
When you run Test1, where you alternate the inserts, they are issued as separate statements and are not batched together.
I found this out by profiling all three tests.
So, as per Diego's answer, it must preserve the order in which you insert and batch only consecutive inserts together.
I wrote a fourth test: I set the batch size to 10 and alternated between TestClass1 and TestClass2 so that I was saving 5 of TestClass1 and then 5 of TestClass2, to hit the batch size.
This pushed out batches of 5 in the order they were processed.
public void Test4()
{
int count = 10;
using (var session = SessionFactory.OpenSession())
using (var transaction = session.BeginTransaction())
{
for (int i = 0; i < count; i++)
{
if (i%2 == 0)
{
for (int j = 0; j < 5; j++)
{
var x = new TestClass1();
session.Save(x);
}
}
else
{
for (int j = 0; j < 5; j++)
{
var y = new TestClass2();
session.Save(y);
}
}
}
transaction.Commit();
}
}
Then I changed it to insert 3 at a time instead of 5. The batches were in multiples of 3, so what must be happening is that the batch size lets a batch of one type grow up to the specified amount, but only inserts of the same type are grouped together. Alternating types therefore causes separate insert statements.
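The behaviour observed in these tests can be modelled in a few lines. The Python sketch below is only a model of what the profiling showed, not NHibernate's actual implementation. simulate_batches is a hypothetical function that groups consecutive saves of the same type and caps each group at the batch size.

```python
from itertools import groupby


def simulate_batches(saves, batch_size):
    """Return a list of (entity_type, statement_count) batches.

    Consecutive saves of the same type share a batch, up to batch_size;
    a change of type always starts a new batch (assumed from the profiling).
    """
    batches = []
    for entity_type, run in groupby(saves):
        run_length = len(list(run))
        for start in range(0, run_length, batch_size):
            batches.append((entity_type, min(batch_size, run_length - start)))
    return batches


if __name__ == "__main__":
    # Test1-style alternation: every save ends up alone in its batch.
    print(simulate_batches(["TestClass1", "TestClass2"] * 3, 1000))
    # Test4-style runs of five with a batch size of 10: batches of five.
    runs = (["TestClass1"] * 5 + ["TestClass2"] * 5) * 5
    print(simulate_batches(runs, 10))
```

Under this model, saving all instances of one type before the next, as Test3 does, is what allows batches to fill up to the configured size.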
