What's the best way to get an HTable handle in HBase?

One way is to call the HTable constructor directly; another is to call the getTable method of an HConnection. The second option requires the HConnection to be "unmanaged", which is not ideal for me because my process will have many threads accessing HBase. I don't want to reinvent the wheel by managing the HConnections on my own.
Thanks for your help.
[Updates]:
We are stuck with 0.98.6, so ConnectionFactory is not available.
I found the JIRA issue below, which suggests creating an "unmanaged" connection and using a single ExecutorService to create HTable instances. Why can't we simply use the getTable method of the unmanaged connection to get an HTable? Is it because HTable is not thread safe?
https://issues.apache.org/jira/browse/HBASE-7463

I'm stuck with old versions (<0.94.11), in which you can still use HTablePool. Since it has been deprecated by HBASE-6580, I think requests from HTables to the region servers are now automatically pooled by providing an ExecutorService:
ExecutorService executor = Executors.newFixedThreadPool(10);
Connection connection = ConnectionFactory.createConnection(conf, executor);
Table table = connection.getTable(TableName.valueOf("mytable"));
try {
    table.get(...);
    ...
} finally {
    table.close();
    connection.close();
}
I've been unable to find any good examples or docs about it, so please note that this is untested code which may not work as expected.
For more information you can take a look at the ConnectionFactory documentation and the JIRA issue:
https://hbase.apache.org/apidocs/org/apache/hadoop/hbase/client/ConnectionFactory.html
https://issues.apache.org/jira/browse/HBASE-6580
Update: since you're using 0.98.6 and ConnectionFactory is not available, you can use HConnectionManager instead:
// HConnection is thread safe. You can also provide an ExecutorService
// if you want to override the default one.
HConnection connection = HConnectionManager.createConnection(config);
HTableInterface table = connection.getTable("table1");
try {
    // Use the table as needed, for a single operation and a single thread
} finally {
    table.close();
    connection.close();
}
HTable is not thread safe, so you must make sure each thread always gets a new instance (it's a lightweight operation) with HTableInterface table = connection.getTable("table1") and closes it afterwards with table.close().
The flow would be:
1. Start your process
2. Initialize your HConnection
3. Each thread:
3.1 Gets a table from your HConnection
3.2 Writes/reads from the table
3.3 Closes the table
4. Close your HConnection when your process ends
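The flow above can be sketched roughly as follows. This is only an illustration of the pattern (one shared connection, one short-lived table handle per task), not HBase code: SharedConnection and TableHandle are hypothetical stand-ins for HConnection and HTableInterface so that the sketch runs without HBase on the classpath, and the counters exist only to show that every handle that is opened is also closed.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class SharedConnectionSketch {

    /** Hypothetical stand-in for HTableInterface: cheap to create, used by one thread, then closed. */
    static final class TableHandle implements AutoCloseable {
        private final SharedConnection owner;
        TableHandle(SharedConnection owner) {
            this.owner = owner;
            owner.opened.incrementAndGet();
        }
        void put(String row) { owner.writes.incrementAndGet(); }
        @Override public void close() { owner.closed.incrementAndGet(); }
    }

    /** Hypothetical stand-in for HConnection: thread safe, created once per process. */
    static final class SharedConnection implements AutoCloseable {
        final AtomicInteger opened = new AtomicInteger();
        final AtomicInteger closed = new AtomicInteger();
        final AtomicInteger writes = new AtomicInteger();
        TableHandle getTable(String name) { return new TableHandle(this); }
        @Override public void close() { /* a real connection would release its resources here */ }
    }

    /** Runs `tasks` writes on `threads` workers; returns {opened, closed, writes}. */
    static int[] run(int threads, int tasks) {
        ExecutorService workers = Executors.newFixedThreadPool(threads);
        try (SharedConnection connection = new SharedConnection()) {          // step 2
            for (int i = 0; i < tasks; i++) {
                final String row = "row-" + i;
                workers.submit(() -> {
                    try (TableHandle table = connection.getTable("table1")) { // step 3.1
                        table.put(row);                                       // step 3.2
                    }                                                         // step 3.3
                });
            }
            workers.shutdown();
            if (!workers.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
            return new int[] {
                connection.opened.get(), connection.closed.get(), connection.writes.get()
            };
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }                                                                     // step 4
    }

    public static void main(String[] args) {
        int[] r = run(4, 20);
        System.out.println("opened=" + r[0] + " closed=" + r[1] + " writes=" + r[2]);
    }
}
```

With the real client, the same shape would apply: the connection lives for the whole process, while each table handle is scoped to a single operation on a single thread.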
HConnectionManager: http://archive.cloudera.com/cdh5/cdh/5/hbase/apidocs/org/apache/hadoop/hbase/client/HConnectionManager.html#createConnection(org.apache.hadoop.conf.Configuration)
HTable: http://archive.cloudera.com/cdh5/cdh/5/hbase/apidocs/org/apache/hadoop/hbase/client/HTable.html

Related

WebRtc Native-Crashed when I call peerconnection->Close()

How do I close or destroy a PeerConnectionInterface object? It crashes when I try to do so.
I have an object declared like this:
rtc::scoped_refptr<webrtc::PeerConnectionInterface> _peerConnection;
It works fine after I create the PeerConnectionInterface through the factory.
However, when the session is over and I call _peerConnection->Close(); the program crashes.
I also tried calling _peerConnection.release()->Release(); which crashed as well.
I added logging to PeerConnection.cc in the WebRTC source code and found that it crashes here, in both the Close() function and the ~PeerConnection() destructor:
webrtc_session_desc_factory_.reset(); //PeerConnection.cc
The declaration is:
std::unique_ptr<WebRtcSessionDescriptionFactory> webrtc_session_desc_factory_;
So I added more logging in WebRtcSessionDescriptionFactory.cc, in the ~WebRtcSessionDescriptionFactory() destructor. It crashes in the function FailPendingRequests().
Inside the FailPendingRequests() function:
RTC_DCHECK(signaling_thread_->IsCurrent());
while (!create_session_description_requests_.empty()) {
  const CreateSessionDescriptionRequest& request =
      create_session_description_requests_.front();
  // Crashed here in third or fourth loop
  PostCreateSessionDescriptionFailed(request.observer,
      ((request.type == CreateSessionDescriptionRequest::kOffer) ?
          "CreateOffer" : "CreateAnswer") + reason);
  create_session_description_requests_.pop();
}
I will be really grateful for any suggestion!
I faced the same issue on iOS when implementing the Kurento library. The key to fixing it is to dispose of the resources in the right order.
Steps I followed:
The order of creation:
1. Created the WebRTCPeer object.
2. Created the RoomClient object.
3. Once the RoomClient connected, generated the SDP offer.
and so on.
The order of disposal:
1. Disconnected the RoomClient first.
2. Kept an eye on "RTCIceConnectionState" and "RTCIceGatheringState" in the WebRTC events.
3. Once "RTCIceConnectionState" was closed and iceGatheringState was "RTCIceGatheringStateComplete", disposed of the WebRTCPeer object.
This resolved the problem. Otherwise, resources were still being initialised while the main object was being disposed of, which results in crashes.
Hope that helps!
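The disposal order described above amounts to deferring the final dispose until two state callbacks have both reported a terminal state. A minimal, language-neutral sketch of that gating logic, written here in Java, could look like this. All names (PeerTeardownSketch, the two enums, the callback methods) are hypothetical and are not WebRTC or Kurento APIs; the dispose action is passed in as a Runnable.

```java
import java.util.concurrent.atomic.AtomicBoolean;

class PeerTeardownSketch {
    // Hypothetical, simplified versions of the ICE states mentioned in the answer.
    enum IceConnectionState { NEW, CONNECTED, DISCONNECTED, CLOSED }
    enum IceGatheringState { NEW, GATHERING, COMPLETE }

    private volatile IceConnectionState connectionState = IceConnectionState.NEW;
    private volatile IceGatheringState gatheringState = IceGatheringState.NEW;
    private final AtomicBoolean disposed = new AtomicBoolean(false);
    private final Runnable disposePeer;

    PeerTeardownSketch(Runnable disposePeer) {
        this.disposePeer = disposePeer;
    }

    // Called from the (assumed) event callbacks of the peer connection.
    void onIceConnectionChange(IceConnectionState state) {
        connectionState = state;
        disposeIfReady();
    }

    void onIceGatheringChange(IceGatheringState state) {
        gatheringState = state;
        disposeIfReady();
    }

    boolean isDisposed() {
        return disposed.get();
    }

    // Dispose only when both conditions hold, and at most once.
    private void disposeIfReady() {
        boolean ready = connectionState == IceConnectionState.CLOSED
                && gatheringState == IceGatheringState.COMPLETE;
        if (ready && disposed.compareAndSet(false, true)) {
            disposePeer.run();
        }
    }

    public static void main(String[] args) {
        PeerTeardownSketch teardown =
                new PeerTeardownSketch(() -> System.out.println("peer disposed"));
        teardown.onIceGatheringChange(IceGatheringState.COMPLETE);
        System.out.println("disposed after gathering complete: " + teardown.isDisposed());
        teardown.onIceConnectionChange(IceConnectionState.CLOSED);
        System.out.println("disposed after connection closed: " + teardown.isDisposed());
    }
}
```

The compareAndSet guard is a design choice to keep the dispose action from running twice if both callbacks fire close together on different threads.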

Ignite Transactional From DataStream Receiver not supported?

I am using IgniteDataStreamer and would like to know if it is possible to use transactions from closures.
Unfortunately, when different IgniteDataStreamer threads update the same record in the cache (in the receive() method of StreamReceiver), Ignite does not throw any TransactionOptimisticException, even though the CacheConfiguration atomicityMode is TRANSACTIONAL.
try (Transaction t = ignite.transactions().txStart(TransactionConcurrency.OPTIMISTIC, TransactionIsolation.SERIALIZABLE)) {
    try {
        cache.putAll(update);
        t.commit();
    } catch (TransactionOptimisticException toe) {
        LOG.error("TransactionOptimisticException Could not put all the profiles", toe);
    }
}
The data streamer is not transactional. To execute updates in a single transaction, they must be initiated on the same node and by the same thread. For more details and examples, see: https://apacheignite.readme.io/docs/transactions

EFUtilities with EFProfiler running

I was wondering if there is a way to run EFUtilities while EFProfiler is running.
I appreciate that the profiler would not show the bulk insert, since it is done outside the confines of DbContext. At the moment, I cannot run batch jobs because the profiler has the connection wrapped. It runs fine when the profiler is not enabled.
The exception I am getting is:
A first chance exception of type 'System.InvalidOperationException'
occurred in EntityFramework.Utilities.dll
Additional information: No provider supporting the InsertAll operation
for this datasource was found
The inner exception is null.
This is because EFUtilities automatically finds the correct provider, but when the connection is wrapped this is no longer possible.
InsertAll looks like this:
public void InsertAll<TEntity>(IEnumerable<TEntity> items, DbConnection connection = null, int? batchSize = null)
To use the SqlProvider (which is actually the only provider out of the box) you can create a new SqlConnection() and pass that to InsertAll.
So basically you would need to do this:
using (var db = new YourContext())
using (var con = new SqlConnection(YourConnectionString))
{
    EFBatchOperation.For(db, db.PartialTestClass1).InsertAll(partials, con);
}
Now, maybe you are doing more and want both parts to run under the same transaction. In that case you can wrap that code block in a TransactionScope.

Suppressing NServicebus Transaction to write errors to database

I'm using NServiceBus to handle some calculation messages. I have a new requirement to handle calculation errors by writing them to the same database. I'm using NHibernate as my DAL, which auto-enlists in the NServiceBus transaction and provides rollback in case of exceptions, which is working really well. However, if I write this particular error to the database, it is also rolled back, which is a problem.
I knew this would be a problem, but I thought I could just wrap the call in a new transaction with TransactionScopeOption = Suppress. However, the error data is still rolled back. I believe that's because it was using the existing session, which has already enlisted in the NServiceBus transaction.
Next I tried opening a new session from the existing SessionFactory within the suppression transaction scope. However, the first call to the database to retrieve or save data using this new session blocks and then times out.
InnerException: System.Data.SqlClient.SqlException
Message=Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.
Finally I tried creating a new SessionFactory and using it to open a new session within the suppression transaction scope. However, it again blocks and times out.
I feel like I'm missing something obvious here, and would greatly appreciate any suggestions on this probably common task.
As Adam suggests in the comments, in most cases it is preferred to let the entire message fail processing, giving the built-in Retry mechanism a chance to get it right, and eventually going to the error queue. Then another process can monitor the error queue and do any required notification, including logging to a database.
However, there are some use cases where the entire message is not a failure, i.e. on the whole, it "succeeds" (whatever the business-dependent definition of that is) but there is some small part that is in error. For example, a financial calculation in which the processing "succeeds" but some human element of the data is "in error". In this case I would suggest catching that exception and sending a new message which, when processed by another endpoint, will log the information to your database.
I could see another case where you want the entire message to fail, but you want the fact that it was attempted noted somehow. This may be closest to what you are describing. In this case, create a new TransactionScope with TransactionScopeOption = Suppress, and then (again) send a new message inside that scope. That message will be sent whether or not your full message transaction rolls back.
You are correct that your transaction is rolling back because the NHibernate session is opened while the transaction is in force. Trying to open a new session inside the suppressed transaction can cause a problem with locking. That's why, most of the time, sending a new message asynchronously is part of the solution in these cases, but how you do it is dependent upon your specific business requirements.
I know I'm late to the party, but as an alternative suggestion, you could simply raise a separate log message, which NSB handles independently, for example:
public void Handle(DebitAccountMessage message)
{
    var account = this.dbcontext.GetById(message.Id);
    if (account.Balance <= 0)
    {
        // log request - new handler
        this.Bus.Send(new DebitAccountLogMessage
        {
            originalMessage = message,
            account = account,
            timeStamp = DateTime.UtcNow
        });
        // throw error - NSB will handle
        throw new DebitException("Not enough funds");
    }
}
public void Handle(DebitAccountLogMessage message)
{
    var messageString = message.originalMessage.Dump();
    var accountString = message.account.Dump(DumpOptions.SuppressSecurityTokens);
    this.Logger.Log(message.UniqueId, string.Format("{0}, {1}", messageString, accountString));
}

Transaction timeout expired while using Linq2Sql DataContext.SubmitChanges()

Please help me resolve this problem:
There is an ambient MSMQ transaction. I'm trying to use a new transaction for logging, but I get the following error when attempting to submit changes: "Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding." Here is the code:
public static void SaveTransaction(InfoToLog info)
{
    using (TransactionScope scope =
        new TransactionScope(TransactionScopeOption.RequiresNew))
    {
        using (TransactionLogDataContext transactionDC =
            new TransactionLogDataContext())
        {
            transactionDC.MyInfo.InsertOnSubmit(info);
            transactionDC.SubmitChanges();
        }
        scope.Complete();
    }
}
Please help me.
Thx.
You could consider increasing the timeout or eliminating it altogether.
Something like:
using (TransactionLogDataContext transactionDC = new TransactionLogDataContext())
{
    transactionDC.CommandTimeout = 0; // No timeout.
}
Be careful
You said:
thank you. but this solution makes new question - if transaction scope was changed why submit operation becomes so time consuming? Database and application are on the same machine
That is because you are creating a new DataContext right there:
using (TransactionLogDataContext transactionDC = new TransactionLogDataContext())
With a new data context, ADO.NET opens a new connection (even if the connection strings are the same, unless you do some clever connection pooling).
Within a transaction context, when you try to work with more than one connection instance (which you just did), ADO.NET automatically promotes the transaction to a distributed transaction and tries to enlist it in MSDTC. Enlisting the very first transaction per connection in MSDTC takes time (for me it takes 30+ seconds); subsequent transactions will be fast, however (in my case 60 ms). Take a look at this: http://support.microsoft.com/Default.aspx?id=922430
What you can do is reuse the existing transaction and its connection (if possible) when you create the new DataContext.
TransactionLogDataContext tempDataContext =
    new TransactionLogDataContext(ExistingDataContext.Transaction.Connection);
tempDataContext.Transaction = ExistingDataContext.Transaction;
Where ExistingDataContext is the one which started the ambient transaction.
Or attempt to speed up your MS DTC.
Also, do use SQL Profiler as suggested by billb and compare the SessionId between different commands (save and savelog in your case). If the SessionId changes, you are in fact using two different connections, and in that case you will have to reuse the transaction (if you don't want it to be promoted to MS DTC).