mongoskin and bulk operations? (mongodb 3.2, mongoskin 2.1.0 & 2.2.0)

I've read the various bits of literature, and I'm seeing the same problem that the questioner in
https://stackoverflow.com/a/25636911
was seeing.
My code looks like this:
coll = db.collection('foobar');
bulk = coll.initializeUnorderedBulkOp();
messages.forEach(function (entry) {
  bulk.insert(entry);
});
bulk.execute(function (err, result) {
  if (err) throw err;
  inserted += result.nInserted;
});
bulk is an object
bulk.insert works just fine
bulk.execute is undefined
The answer in the stackoverflow question said that only the callback flavor of db.collection() works, so I tried:
db.collection('foobar', function (err, coll) {
  logger.debug('got here');
  if (err) throw err;
  bulk = coll.initializeUnorderedBulkOp();
  // ... same code as before
});
We never get to "got here", which implies that the "callback flavor" of db.collection() was dropped in 3.0?
Unfortunately, my Python is way better than my JS prototyping skills, so the mongoskin source code doesn't make much sense to me.
What is the right way, with mongoskin 2.1.0 and the 2.2.0 mongodb JS driver, to do a bulk operation, or is this not implemented at all anymore?

There are at least three answers:
(1) Use insert, but in the array form, so you insert multiple documents with one call (see the sketch just below). Works like a charm.
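A minimal sketch of that, assuming messages is an array of plain documents and db is an open mongoskin handle:
db.collection('foobar').insert(messages, function (err, result) {
  if (err) throw err;
  // with the 2.x driver, the result object reports result.insertedCount
  console.log('inserted', result.insertedCount, 'documents');
});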
(2) If you really need bulk operations, you'll need to switch from mongoskin to the native mongo interface, but just for that one call.
This kinda sucks because it's using a private interface in mongoskin, but it's also the most efficient way to stick with mongoskin:
(example in coffeescript)
# bulk write all the messages in "messages" to a collection
# and insert the server's current time in the recorded field of
# each message
# use the _native interface and wait for the callback to get the collection
db._native.collection collectionName, (err, collection) ->
  bulk = collection.initializeUnorderedBulkOp()
  for message in messages
    bulk.find(_id: message._id).upsert().updateOne
      $set: message
      $currentDate:
        recorded: true
  bulk.execute (err, result) ->
    # ... error and result checking code
Or (3): if all you need is that $currentDate behavior rather than a generic bulk operation, refer back to solution (1) but use the not-very-well-documented BSON Timestamp() object with no arguments:
Timestamp = require('mongodb').Timestamp;  // BSON type, re-exported by the driver
messages.forEach(function (msg) {
  msg.recorded = new Timestamp();  // empty Timestamp: filled in by the server
});
db.collection('mycollection').insert(messages, function (err) {
  if (err) throw err;
});
which will do a bulk insert and set recorded to the DB server's time at the moment each record is written to the db.

Related

stream data into my vega view progressively

I am using Papaparse to parse the CSV, and on each chunk I insert the rows into the view, like so:
const { createReadStream } = require('fs')
const Papa = require('papaparse')

Papa.parse(createReadStream('geo.csv'), {
  header: true,
  chunk(data) {
    console.log('chunk: ', data.data.length)
    // data.data.length > 0 && tally.push(...data.data)
    view.insert('test1', data.data)
  },
  complete() {
    view.data('test1').length // this will return 0
    console.log('memory:', process.memoryUsage().heapUsed / 1024 / 1024, ` == time: ${Date.now() - start}`)
  },
})
The only ways to keep inserting new data seem to be:
(1) call run() after each insert, i.e. insert('test1', data.data).run(), to "commit" it, but I do not need it to run yet, not until I have all of the data (which is why I call run() in the complete() callback); or
(2) parse everything at once in memory and then pass it in using data('test1', allRows), which I think will use a lot more memory.
How do I progressively stream data into my vega view? Note that I am running this inside a web worker; as far as I know, the vega loader does not support the browser's File instance (only URLs in browser environments), thus I'm using Papaparse.
You need to run runAsync and await it before inserting more data into the view, otherwise updates may get lost. See https://github.com/vega/vega/issues/2513 for more information on this.
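A minimal sketch of that pattern, assuming the same view and CSV as above; Papaparse passes the parser handle to chunk(), so you can pause it until runAsync resolves:
Papa.parse(createReadStream('geo.csv'), {
  header: true,
  chunk(results, parser) {
    parser.pause()                      // stop feeding chunks for now
    view.insert('test1', results.data)
      .runAsync()                       // let Vega propagate the inserts
      .then(() => parser.resume())      // then ask for the next chunk
  },
  complete() {
    view.runAsync().then(() => console.log('rows:', view.data('test1').length))
  },
})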
If you don't care about intermediate updates while more data comes in, I would recommend collecting all the data you want to insert and then adding it at once (sketched below). Memory won't be an issue: Vega keeps the full dataset in memory regardless, so collecting the rows first costs little extra.
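For example, a sketch of the collect-then-insert approach under the same assumptions:
const rows = []
Papa.parse(createReadStream('geo.csv'), {
  header: true,
  chunk(results) { rows.push(...results.data) },  // accumulate only
  complete() {
    view.insert('test1', rows).runAsync()         // single update at the end
  },
})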

How can I detect a connection failure in gorm?

I'm writing a small, simple web app in go using the gorm ORM.
Since the database can fail independently of the web application, I'd like to be able to identify errors that correspond to this case so that I can reconnect to my database without restarting the web application.
Motivating example:
Consider the following code:
var mrs MyRowStruct
db := myDB.Model(MyRowStruct{}).Where("column_name = ?", value).First(&mrs)
return &mrs, db.Error
In the event that db.Error != nil, how can I programmatically determine if the error stems from a database connection problem?
From my reading, I understand that gorm.DB does not represent a connection, so do I even have to worry about reconnecting or re-issuing a call to gorm.Open if a database connection fails?
Are there any common patterns for handling database failures in Go?
Gorm appears to swallow database driver errors and emit only its own classification of error types (see gorm/errors.go). Connection errors do not currently appear to be reported.
Consider submitting an issue or pull request to expose the database driver error directly.
[Original]
Try inspecting the runtime type of db.Error per the advice in the gorm readme "Error Handling" section.
Assuming it's an error type returned by your database driver you can likely get a specific code that indicates connection errors. For example, if you're using PostgreSQL via the pq library then you might try something like this:
import "github.com/lib/pq"
// ...
if db.Error != nil {
pqerr, ok := err.(*pq.Error)
if ok && pqerr.Code[0:2] == "08" {
// PostgreSQL "Connection Exceptions" are class "08"
// http://www.postgresql.org/docs/9.4/static/errcodes-appendix.html#ERRCODES-TABLE
// Do something for connection errors...
} else {
// Do something else with non-pg error or non-connection error...
}
}
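As for reconnecting: gorm wraps a database/sql connection pool, which replaces broken pooled connections on its own, so re-issuing gorm.Open is normally unnecessary. A sketch (my own, not something gorm documents for this case) of a health check through the underlying *sql.DB:
package dbcheck

import (
    "log"

    "github.com/jinzhu/gorm"
)

// pingDB reports whether the database behind a gorm handle is currently
// reachable. A failed ping usually means the server itself is down rather
// than a stale connection, since database/sql re-dials automatically.
func pingDB(db *gorm.DB) bool {
    if err := db.DB().Ping(); err != nil {
        log.Printf("database unreachable: %v", err)
        return false
    }
    return true
}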

Read SQL Server Broker messages and publish them using NServiceBus

I am very new to NServiceBus, and in one of our projects we want to accomplish the following:
(1) Whenever table data is modified in SQL Server, construct a message and insert it into a SQL Server broker queue.
(2) Read the broker queue message using NServiceBus.
(3) Publish the message again as another event so that other subscribers can handle it.
It is point 2 that I do not have much of a clue how to get done.
I have referred to the following posts, after which I was able to put a message into the broker queue, but I was unable to integrate this with NServiceBus in our project: the NServiceBus libraries used there are an older version and many of the methods are deprecated, so using them with the current versions is getting very troublesome (or perhaps I was going about it the wrong way).
http://www.nullreference.se/2010/12/06/using-nservicebus-and-servicebroker-net-part-2
https://github.com/jdaigle/servicebroker.net
Any help on the correct way of doing this would be invaluable.
Thanks.
I'm using the current version of NServiceBus (5), VS2013 and SQL Server 2008. I created a Database Change Listener using this tutorial, which uses SQL Server Service Broker and SqlDependency to monitor the changes to a specific table. (NB This may be deprecated in later versions of SQL Server.)
SqlDependency allows you to use a broad selection of the basic SQL functionality, although there are some restrictions that you need to be aware of. I modified the code from the tutorial slightly to provide better error information:
void NotifyOnChange(object sender, SqlNotificationEventArgs e)
{
    // Check for any errors
    if (@"Subscribe|Unknown".Contains(e.Type.ToString())) { throw _DisplayErrorDetails(e); }

    var dependency = sender as SqlDependency;
    if (dependency != null) dependency.OnChange -= NotifyOnChange;
    if (OnChange != null) { OnChange(); }
}

private Exception _DisplayErrorDetails(SqlNotificationEventArgs e)
{
    var messageMain = "useful error info";
    var messageInner = string.Format("Type:{0}, Source:{1}, Info:{2}", e.Type.ToString(), e.Source.ToString(), e.Info.ToString());
    if (@"Subscribe".Contains(e.Type.ToString()) && @"Invalid".Contains(e.Info.ToString()))
        messageInner += "\r\n\nThe subscriber says that the statement is invalid - check that your SQL statement conforms to the specified requirements (http://stackoverflow.com/questions/7588572/what-are-the-limitations-of-sqldependency/7588660#7588660).\n\n";
    return new Exception(messageMain, new Exception(messageInner));
}
I also created a project with a "database first" Entity Framework data model to allow me do something with the changed data.
[The relevant part of] My NServiceBus project comprises two "Run as Host" endpoints, one of which publishes event messages; the second endpoint handles the messages. The publisher implements IWantToRunWhenBusStartsAndStops, which instantiates the DBListener and passes it the SQL statement I want to run as my change monitor. The OnChange handler is given an anonymous function that reads the changed data and publishes a message:
// using statements
namespace Sample4.TestItemRequest
{
    public partial class MyExampleSender : IWantToRunWhenBusStartsAndStops
    {
        private string NOTIFY_SQL = @"SELECT [id] FROM [dbo].[Test] WITH(NOLOCK) WHERE ISNULL([Status], 'N') = 'N'";

        public void Start() { _StartListening(); }
        public void Stop() { throw new NotImplementedException(); }

        private void _StartListening()
        {
            var db = new Models.TestEntities();

            // Instantiate a new DBListener with the specified connection string
            var changeListener = new DatabaseChangeListener(ConfigurationManager.ConnectionStrings["TestConnection"].ConnectionString);

            // Assign the code within the braces to the DBListener's OnChange event
            changeListener.OnChange += () =>
            {
                /* START OF EVENT HANDLING CODE */

                // This uses LINQ against the EF data model to get the changed records
                IEnumerable<Models.TestItems> _NewTestItems = DataAccessLibrary.GetInitialDataSet(db);
                while (_NewTestItems.Count() > 0)
                {
                    foreach (var qq in _NewTestItems)
                    {
                        // Do some processing, if required
                        var newTestItem = new NewTestStarted() { /* ... set properties from qq object ... */ };
                        Bus.Publish(newTestItem);
                    }
                    // Because there might be a number of new rows added, I grab them in small batches until finished.
                    // Probably better to use RX to do this, but this will do for proof of concept
                    _NewTestItems = DataAccessLibrary.GetNextDataChunk(db);
                }
                changeListener.Start(string.Format(NOTIFY_SQL));

                /* END OF EVENT HANDLING CODE */
            };

            // Now everything has been set up.... start it running.
            changeListener.Start(string.Format(NOTIFY_SQL));
        }
    }
}
Important: firing the OnChange event causes the listener to stop monitoring; it is basically a single-shot notifier. After you have handled the event, the last thing to do is restart the DBListener. (You can see this in the line preceding the END OF EVENT HANDLING comment.)
You need to add a reference to System.Data and possibly System.Data.DataSetExtensions.
The project at the moment is still proof of concept, so I'm well aware that the above can be somewhat improved. Also bear in mind I had to strip out company specific code, so there may be bugs. Treat it as a template, rather than a working example.
I also don't know if this is the right place to put the code - that's partly why I'm on StackOverflow today; to look for better examples of ServiceBus host code. Whatever the failings of my code, the solution works pretty effectively - so far - and meets your goals, too.
Don't worry too much about the ServiceBroker side of things. Once you have set it up, per the tutorial, SQLDependency takes care of the details for you.
The ServiceBroker Transport is very old and not supported anymore, as far as I can remember.
A possible solution would be to "monitor" the interesting tables from the endpoint code using something like a SqlDependency (http://msdn.microsoft.com/en-us/library/62xk7953(v=vs.110).aspx) and then push messages into the relevant queues.

How to filter errors in a Bacon.js stream

Sometimes I want to filter out certain errors in a stream. I'd like to write something like this:
stream
.filterError (error) ->
error.type is 'foo'
But there is no filterError method.
As an alternative I thought I could use errors().mapError to map the errors into values, filter them, and then map them back into errors. However, I don't see a way to convert a value in a stream into an error.
# Filter only the errors we are interested in
errors = stream.errors()
.mapError (error) ->
error
.filter (error) ->
...
.mapValuesBackIntoErrors() # ?
The idea is that the stream in question either carries a value or an error. Both represent domain knowledge; the value means the system is in normal operation and the error means we have a domain error. Some of the domain errors are not such that we want to carry them, though, so I wish to filter them out.
The alternative works, of course, and you may use a combination of map and mapError to wrap normal values and errors in the style of the Either type in Haskell. For instance
stream.map((value) -> { value }).mapError((error) -> { error })
Now your stream would output something like this:
{ value: 1 }
{ value: 2 }
{ error: "cannot connect to host" }
On the other hand, actually implementing filterError wouldn't be too hard. Consider implementing this yourself and making a PR.
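A sketch of what such an implementation could look like, using Bacon's withHandler (the filterError name and this exact behavior are mine, not part of Bacon's API):
Bacon.EventStream::filterError = (predicate) ->
  @withHandler (event) ->
    if event.isError() and not predicate(event.error)
      Bacon.more  # swallow errors the predicate rejects
    else
      @push event
With that in place, stream.filterError((error) -> error.type is 'foo') would keep all values plus only the matching errors.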

Grails transactions (not GORM based but using Groovy Sql)

My Grails application is not using GORM but instead uses my own SQL and DML code to read and write the database (The database is a huge normalized legacy one and this was the only viable option).
So, I use the Groovy Sql Class to do the job. The database calls are done in Services that are called in my Controllers.
Furthermore, my datasource is declared via DBCP in Tomcat - so it is not declared in Datasource.groovy.
My problem is that I need to write some transaction code, that means to open a transaction and commit after a series of successful DML calls or rollback the whole thing back in case of an error.
I thought that it would be enough to use groovy.sql.Sql#commit() and groovy.sql.Sql#rollback() respectively.
But in these methods Javadocs, the Groovy Sql documentation clearly states
If this SQL object was created from a DataSource then this method does nothing.
So, I wonder: What is the suggested way to perform transactions in my context?
Even disabling autocommit in the datasource declaration seems irrelevant, since those two methods "do nothing".
The Groovy Sql class has withTransaction
http://docs.groovy-lang.org/latest/html/api/groovy/sql/Sql.html#withTransaction(groovy.lang.Closure)
public void withTransaction(Closure closure)
throws java.sql.SQLException
Performs the closure within a transaction using a cached connection. If the closure takes a single argument, it will be called with the connection, otherwise it will be called with no arguments.
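A sketch of how that could look in your setup (the JNDI name is hypothetical, and val1/val2 are assumed to hold the values to insert; any DataSource works the same way):
import groovy.sql.Sql
import javax.naming.InitialContext
import javax.sql.DataSource

// look up the Tomcat-managed DBCP DataSource (name is hypothetical)
def ds = (DataSource) new InitialContext().lookup('java:comp/env/jdbc/myLegacyDb')
def sql = new Sql(ds)

sql.withTransaction {
    sql.executeInsert("insert into mytable1 (col1, col2) values (?, ?)", [val1, val2])
    sql.executeInsert("insert into mytable2 (col1, col2) values (?, ?)", [val1, val2])
    // an exception thrown anywhere in the closure rolls back both inserts;
    // reaching the end of the closure commits them
}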
Give it a try.
Thanks James. I also found the following solution, reading http://grails.org/doc/latest/guide/services.html:
I declared my service as transactional
static transactional = true
This way, if an Error occurs, the previously performed DMLs will be rolled back.
For each DML statement, I catch failures and throw an Error with a descriptive message. For example:
try {
    sql.executeInsert("""
        insert into mytable1 (col1, col2) values (${val1}, ${val2})
    """)
} catch (e) {
    throw new Error("you can't enter empty val1 or val2")
}
try {
    sql.executeInsert("""
        insert into mytable2 (col1, col2) values (${val1}, ${val2})
    """)
} catch (e) {
    throw new Error("you can't enter empty val1 or val2. The previous insert is rolled back!")
}
Final gotcha! The service, when called from the controller, must be wrapped in a try/catch, as follows:
try{
myService.myMethod(params)
}catch(e){
//http://jts-blog.com/?p=9491
Throwable t = e instanceof UndeclaredThrowableException ? e.undeclaredThrowable : e
// use t.toString() to send info to user (use in view)
// redirect / forward / render etc
}