Running ObjectScript programs against InterSystems Caché to provide data to SQL Server

Background
The main application where I work is based heavily on the MUMPS-esque Caché database engine from InterSystems. Everything is stored in global arrays. Getting data out of the system for external reporting ranges from simply being a pain to being egregiously slow and painful.
Caché provides an ODBC driver for the database, but unless the global arrays involved happen to be keyed by the selection criteria, it resorts to scans and a simple query will take hours to run. For scale, the entire Caché production namespace is about 100GB. I can write ObjectScript (InterSystems' dialect of MUMPS) programs that pull data much faster than the ODBC driver in these cases.
Part of the problem I think is that the application vendor doesn't use Caché's object persistence support but instead has the SQL tables defined as a façade over the global arrays, and it often doesn't work well for batch requests.
I built a reporting database in MS SQL Server that pulls the most common data (2.5GB worth) and even if it has to scan every table, all results are returned within 3 seconds. Unfortunately, it takes a long time to refresh the data so I can only do a full refresh once a week and an active refresh once a day. This is good enough for most needs, but I want to do better.
I'm on Caché 2007, SQL Server 2008 R2, VS2010 on Windows 7 and Windows Server 2008 R2.
Scope of question
I need a way to integrate live data from the source Caché database with other data in SQL Server. I want to be able to integrate views or table-valued functions into a SQL query and have them pull live data from the source db.
The live data must be available within SQL Server for processing. Doing it with a secondary application would be a huge pain and wouldn't work with reporting tools that just expect to push a query over ODBC and get a final dataset in the right format.
I understand that there are ways to get data into SQL Server or accomplish the same general things I want to do. That's not what this question is about.
The data needs to come from ObjectScript programs run on Caché since not all the data I need is exposed through the SQL defined tables and I get the control I need to make the performance usable with ObjectScript.
I am looking for advice on any new options, on how I can improve one of the options I've tried or considered, or on other pros and cons of those approaches.
What I've tried so far
This project has been an exercise in frustration where each promising avenue I've looked into is either terrible or doesn't work for some reason. Often the reason is some unnecessary restriction on SQLCLR assemblies.
Pulling everything through InterSystems' Caché ODBC driver via a linked server. SQL Server often resorts to scans if it can't push conditions to the remote server or has to perform a join locally. A scan of any nontrivial table takes many hours and is unacceptable. Also, the length of many columns is incorrectly defined by the SQL table definitions in Caché; SQL Server doesn't like that and aborts the query. See this SO question. I can't change the table defs, and the vendor doesn't think it's a problem because it works with MS Access.
Using OPENQUERY on demand. This works to some extent, but I can still hit the column length problem from the previous item, and there's no way to parameterize OPENQUERY queries, which makes it pretty useless for pulling contextual data.
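For illustration, a minimal sketch of both the limitation and the usual workaround, assuming a linked server named CACHE defined over the InterSystems ODBC driver (table and column names are invented):
-- OPENQUERY only accepts a literal string, so the remote query cannot take a variable:
SELECT * FROM OPENQUERY(CACHE, 'SELECT AcctNum, Balance FROM FacsUser.Account WHERE Region = ''01''')
-- EXEC ... AT does accept ? placeholders (the linked server needs the 'rpc out' option),
-- but the rows must be captured into a temp table before they can be joined locally:
CREATE TABLE #acct (AcctNum INT, Balance NUMERIC(12,2))
INSERT INTO #acct
EXEC ('SELECT AcctNum, Balance FROM FacsUser.Account WHERE Region = ?', '01') AT CACHE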
Using SQLCLR to call the ODBC data provider through CLR table valued functions. This takes care of the parameterization and data length issues, although it does require me to define or modify a function each time I need a new piece of data. Unfortunately, not all the data elements I'm interested in are available through SQL. For some things, I need to access the global arrays directly.
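For reference, the T-SQL side of wiring up such a CLR table-valued function looks roughly like this (assembly, class, and function names here are hypothetical; calling out over ODBC typically forces the assembly to EXTERNAL_ACCESS or UNSAFE, which is exactly the kind of SQLCLR restriction that keeps biting):
CREATE ASSEMBLY FacsAccess
FROM 'C:\Deploy\FacsAccess.dll'
WITH PERMISSION_SET = UNSAFE
GO
CREATE FUNCTION dbo.GetAccountBalances (@region CHAR(2))
RETURNS TABLE (AcctNum INT, Balance NUMERIC(12,2))
AS EXTERNAL NAME FacsAccess.UserDefinedFunctions.GetAccountBalances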
InterSystems provides an ActiveX control that lets you run ObjectScript programs over TCP on the server and get the results. This works great in a stand-alone C# app, but as soon as I try to make a connection from a SQLCLR assembly I get a ridiculous URI error:
A .NET Framework error occurred during execution of user-defined routine or aggregate "GetActiveAccounts":
System.UriFormatException: Invalid URI: The URI is empty.
System.UriFormatException:
at System.Uri.CreateThis(String uri, Boolean dontEscape, UriKind uriKind)
at System.Uri..ctor(String uriString)
at System.ComponentModel.Design.RuntimeLicenseContext.GetLocalPath(String fileName)
at System.ComponentModel.Design.RuntimeLicenseContext.GetSavedLicenseKey(Type type, Assembly resourceAssembly)
at System.ComponentModel.LicenseManager.LicenseInteropHelper.GetCurrentContextInfo(Int32& fDesignTime, IntPtr& bstrKey, RuntimeTypeHandle rth)
at FacsAccess.GetActiveAccounts.Client.connect()
at FacsAccess.GetActiveAccounts.Client..ctor()
at FacsAccess.GetActiveAccounts.E1.GetEnumerator()
See this unanswered SO question. There are other postings about it on the net but no one seems to have a clue. This is an extremely simple COM wrapper over a C++ DLL; it's not doing anything with licensing and has no reason to be in the managed licensing libraries. I wonder if this is some kind of boilerplate that's trying to get the name for an assembly that doesn't have a name because it's been loaded into the SQL database.
InterSystems also provides a more direct unmanaged interface, but those interfaces are all C++, which I can't use through P/Invoke, and I can't load a C++/CLI mixed-mode impure assembly in SQLCLR.
Options I've considered but seem kind of terrible
I've considered trying the ActiveX control through SQL Server's COM support (the sp_OA* procedures), but that's terribly slow and really cumbersome.
I could create an out-of-process service to proxy the traffic, but I can't use .NET Remoting from SQLCLR, you're not supposed to use WCF, and it would be really heavyweight for such a simple interface anyway. I'd sooner roll my own IPC interface.
I could write some kind of extra unmanaged wrapper with a C-style interface for the VisM or CacheDirect interfaces and access THAT through P/Invoke.
It doesn't seem like this should be so hard but it's really driving me up the wall and I need some perspective.

I think you can use ODBC via a linked server to access stored procedures on the Caché database that are visible to the ODBC driver and that return result sets, but are not implemented using SQL.
I am 100% certain you can create such stored procedures and access them via ODBC, but I have never tried accessing them from SQL Server as a linked server. Even if the linked server doesn't work, it seems like it would be preferable to access them via the InterSystems ODBC driver rather than the ActiveX control or CacheDirect.
I have an example of such a procedure, written for another question. In case that link dies, here is the code:
Query OneGlobal(GlobalName As %String) As %Query(ROWSPEC = "NodeValue:%String,Sub1:%String,Sub2:%String,Sub3:%String,Sub4:%String,Sub5:%String,Sub6:%String,Sub7:%String,Sub8:%String,Sub9:%String") [SqlProc]
{
}
ClassMethod OneGlobalExecute(ByRef qHandle As %Binary, GlobalName As %String) As %Status
{
S qHandle="^"_GlobalName
Quit $$$OK
}
ClassMethod OneGlobalClose(ByRef qHandle As %Binary) As %Status [ PlaceAfter = OneGlobalExecute ]
{
Quit $$$OK
}
ClassMethod OneGlobalFetch(ByRef qHandle As %Binary, ByRef Row As %List, ByRef AtEnd As %Integer = 0) As %Status [ PlaceAfter = OneGlobalExecute ]
{
S Q=qHandle
S Q=$Q(@Q)  // advance to the next node of the global (indirection on the saved reference)
I Q="" S Row="",AtEnd=1 Q $$$OK
S Depth=$QL(Q)  // number of subscripts in the node reference
S $LI(Row,1)=$G(@Q)  // node value goes in the first column
F I=1:1:Depth S $LI(Row,I+1)=$QS(Q,I)  // then one column per subscript
F I=Depth+1:1:9 S $LI(Row,I+1)=""  // pad the remaining columns
S AtEnd=0
S qHandle=Q  // remember our position for the next fetch
Quit $$$OK
}
FYI, that is not an example to use in production, since it exposes all the data from all the globals.
However, it sounded like you were prepared to write Caché ObjectScript to get at the data directly anyway - this is a template for doing so. The main thing to understand is that qHandle will be passed back by the ODBC driver on each call, so you can use it to store state. If there is a lot of state, make qHandle an integer index into a temporary global holding the "real" state, and clean that up in the close method.
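Assuming the linked server route works at all, calling a query published this way from SQL Server might look something like this (CACHE is a hypothetical linked server name; a query marked [SqlProc] in class User.MyClass is typically exposed to ODBC as SQLUser.MyClass_QueryName):
SELECT * FROM OPENQUERY(CACHE, 'CALL SQLUser.MyClass_OneGlobal(''MYGLOBAL'')')
-- or, parameterized, if 'rpc out' is enabled on the linked server:
EXEC ('CALL SQLUser.MyClass_OneGlobal(?)', 'MYGLOBAL') AT CACHE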
Since you are concerned about performance, you may want to also implement a
MyQueryFetchRows (ByRef qHandle As %Binary, FetchCount As %Integer = 0, ByRef RowSet As %List, ByRef ReturnCount As %Integer, ByRef AtEnd As %Integer) As %Status
method - see the documentation for %Library.Query for more details.
If you really need this to appear to ODBC as a (read-only) table rather than a stored procedure, I think it might be possible - but I've never tried to expose an arbitrary stored procedure as a read-only table, and I'm not sure how easy it is, or whether it's always possible.

Related

SQL Server 2012 keyword override

A question for which I already know there is no pretty answer.
I have a third party application that I cannot change. The application's database has been converted from MS Access to SQL Server 2012. The program connects with ODBC and does not care about the backend. It sends pretty straightforward SQL that seems to work nicely on SQL Server as well.
There is however a problem with one table that has the name "PLAN" which I already know is a SQL Server keyword.
I know that you would normally access such a table with square brackets, but since I'm not able to change the SQL I was wondering if there is any "ugly" hack that can either override a keyword or transform SQL on the fly.
You could try to edit the third party application with a hex editor. If you find the string PLAN, edit it to something like PPAN and then rename the table, views, etc. to match. If you catch every occurrence, it could work. But of course it is an ugly hack.
I think you are screwed, I am afraid. The only other approaches I could suggest are:
Intercepting the network packets before they hit the SQL Server, which is clearly quite complicated. See https://reverseengineering.stackexchange.com/questions/1617/server-side-query-interception-with-ms-sql-server and in particular answer https://reverseengineering.stackexchange.com/a/1816
Decompiling the program in order to change it, if it's a Java or .NET app for instance.
I suspect you're hosed. You could wire up the 3rd party app to a shim MS Access database that uses linked tables, where the Access table is nothing but a pass-through to the underlying SQL Server table. What you want to do is:
Change the offending table/column names in the SQL Server schema (see the sketch after this list).
Create the linked tables in Access
Create a set of views/queries in Access that present the same schema the 3rd party app expects.
Having done that, the 3rd party app should be able to speak "Access SQL" like it always has. Access takes care of the "translation" to T-SQL. Life is good. I suspect you'll take something of a performance hit, since you're proxying everything through Access, but I don't think it'll be huge.
That would be my preferred solution.
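For step 1, the rename on the SQL Server side is a one-liner; a sketch, assuming the table lives in dbo:
EXEC sp_rename 'dbo.[PLAN]', 'PLAN_TBL'
-- PLAN is not a reserved word in Access SQL (the app worked against Access, after all),
-- so the Access-side query/view can keep the original name and select from the renamed table.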
The other option would be to write a "shim" DLL that implements the ODBC API and simply wraps the calls to the true ODBC driver. Then capture the requests and rewrite them as necessary prior to invoking the wrapped DLL method. The tricky part is that your 3rd party app might be going after columns by ordinal position, or by column name, or a mix. That means you might need to transform the column names on the way back, which might be more difficult than it seems.

How to migrate shared database from Access to SQL Express

I have been using MS Access databases via DAO for many years, but feel that I ought to embrace newer techniques.
My main application runs on end user PCs (no server) and uses a shared database that is created and updated on-the-fly. When the application is first run it detects the absence of a database and creates a new empty one.
Any local user running the application is allowed to add or update records in this shared database. We have a couple of other shared databases, that contain templates, regional information, etc., but these are not updated directly by the application.
Updates of the application are released from time to time and each new update checks the main database version and if necessary executes code to bring the database up to the latest specification. This may involve the creation or deletion of tables and/or columns. New copies of the template databases are also included as part of the update.
Our users are not required to be computer-literate and should not need to run any sort of database management software beyond those facilities provided by the application.
It all works very nicely with DAO/Access, but I'm struggling to find how to do it with SQL Express. The databases seem to be squirrelled away in locations that are user-specific and database creation and update seems at best awkward to do by program code alone.
I came across some references to "Xcopy deployment" that look promising, but there are also references to "user instances" that sound suspiciously like something that's not shared. I'd appreciate advice from anyone who has done it.
It sounds to me like you haven't fully absorbed the fundamental difference between the Access Database Engine (ACE/Jet) and SQL Server:
When your users launch your Access application it connects to the Access Database Engine that has been installed on their machine. Their copy of ACE/Jet opens the shared database file (.accdb or .mdb) in the network folder. The various instances of ACE/Jet work together to manage concurrent updates, record locking, and so on. This is sometimes called a "peer-to-peer" or "shared-file" database architecture.
With an application that uses a SQL Server back-end, the copies of your application on each user's machine connect over the network to the same instance of SQL Server (that's why it's called "SQL Server"), and that instance of SQL Server manipulates the database (which is stored on its local hard drive) on behalf of all of the clients. This is called "client-server" or "server-based" database architecture.
Note that for a multi-user database you do not install SQL Server on the client machines, you only install the SQL Server Client components (OleDb and ODBC drivers). SQL Server itself is only installed in one place: the machine that will act as the SQL... Server.
re: "database creation and update seems at best awkward to do by program code alone" -- Not at all, it's just "different". Once again, you pass all of your commands to the SQL Server and it takes care of creating the actual database files. For example, once you've connected to the SQL Server if you tell it to
CREATE DATABASE NewDatabase
it will create the database files (NewDatabase.mdf and NewDatabase_log.LDF) in whatever local folder it uses to store such things, which is usually something like
C:\Program Files\Microsoft SQL Server\MSSQL10_50.SQLEXPRESS\MSSQL\DATA
on the server machine.
Note that your application never accesses those files directly. In fact it almost certainly cannot do so, and indeed your application does not even care where those files reside or what they are called. Your app simply talks to the SQL Server (e.g. ServerName\SQLEXPRESS) and the server takes care of the details.
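So the create-on-first-run behavior you have now translates to something like this, sent over the application's normal connection (the database name is just an example):
IF DB_ID('MySharedData') IS NULL
    CREATE DATABASE MySharedData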
Just to update on my progress. Inspired by suggestions here and this article on code project:
http://www.codeproject.com/Articles/63147/Handling-database-connections-more-easily,
I've created a wrapper for the ADO.NET methods that looks quite similar to the DAO stuff that I am familiar with.
I have a class that I can use just like a DAO Database. It wraps ADO methods like ExecuteReader, ExecuteNonQuery, etc. with overloads that can accept a SQL parameter. This allows me to directly replace DAO Recordsets with readers, OpenRecordset with ExecuteReader and Execute with ExecuteNonQuery.
Each method obtains and releases the connection from its parent class instance. These in turn open or close the underlying connection as required depending on the transaction state, if any. So a connection is held open for method calls that are part of a transaction, but closed immediately for a single call.
This has greatly simplified the migration of my program since much of the donkey work can be done by a simple "find and replace". The remaining issues are then relatively easy to find and sort out.
Thanks, once again to Gord and Maxwell for your advice.
The full answer is too long to write down here, but go to the Microsoft page where they explain how to do it: http://office.microsoft.com/en-us/access-help/move-access-data-to-a-sql-server-database-by-using-the-upsizing-wizard-HA010275537.aspx
I hope this helps you!

What are my options - sql express or?

I have a client who has a VB 6.0 application with MS Access as the backend. But now Access is unable to take the load, so we are considering shifting to SQL Server Express Edition. We will also convert the VB 6.0 application to C#-based WinForms. My questions -
1) Can SQL Server Express support 10 users concurrently? If not SQL Express, then what other options are available?
2) Should I first convert the VB 6.0 application to C#? If I transfer the data to SQL Server, will the VB 6.0 application continue to work?
thanks
Yes, it can.
You don't need to convert your app, but Access and SQL Express are different database engines, so you will need to adapt your app to SQL Express.
Note that SQL Express prior to 2008 R2 can handle databases up to 4 GB, while 2008 R2 can handle up to 10 GB per database.
1) SQL Express allows over 32,000 simultaneous connections. The only real limit is database size, which is 10 gigabytes.
2) You'll need to at least modify the VB 6 application to have the correct connection string before it will work with SQL server.
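For example, if the app connects through ADO/OLE DB, the change is essentially swapping the provider and data source in the connection string (all names below are placeholders):
Jet/Access:  Provider=Microsoft.Jet.OLEDB.4.0;Data Source=\\server\share\AppData.mdb
SQL Server:  Provider=SQLOLEDB;Data Source=MYSERVER\SQLEXPRESS;Initial Catalog=AppData;Integrated Security=SSPI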
I am curious though why you say that Access (the JET database engine) is unable to take the load. Usually 20 or even more simultaneous users are no problem.
If the product is for in-house use and doesn't generate cash, you can use Oracle. It's free to use unless your app is for commercial use.
One question to ask is: after how many users does the system slow down? If it is slow with one user, then this might be a software design issue and not necessarily the "load" on the server. There is also the issue of the type of connection, WAN or LAN. In fact you can read about this issue in the following article:
http://www.kallal.ca//Wan/Wans.html
The above in a nutshell explains why the Access data engine does not work well on a WAN.
Also, migrating the data to SQL Server without determining what particular issue is causing the slowdown could very well result in a further slowdown. In other words, often just up-sizing data to a server database engine will not solve performance issues, and in some cases it can make them worse.
In fact, in many online Access forums we often see users complain of a slowdown when moving the back-end data file from Access to SQL Server. So just moving to SQL Server without taking advantage of SQL Server features does not always guarantee a performance boost.
The other issue you want to determine here is whether the VB6 program uses ADO or DAO. Either data object model is fine, but ADO would suggest LESS code will have to be modified than if the application is based on DAO.
Another issue is that you haven't mentioned how large the tables are, or how many there are. Say 30 to 50 highly related tables with a small number of rows (say 200,000) in some of them should run just fine with 5 to 15 users. If your user count is only about 10 and your table row counts are small as noted, then performance should be OK; if it is not, then as noted it might be a design issue, and moving the data to SQL Server may not yield performance gains without further code modifications. And of course SOME code will have to be modified to work with SQL Server - how much will depend on the data object model used and how much code there is overall (more recordset code = more chance of needing more code changes).
If you do decide to convert from Access to SQL Server Express, there is a migration wizard which can give you a quick start with that process. Here's the link

Change of code if database is changed (from Access to SQL Server)

I am trying to migrate from Access Database to SQL server database. Do I need to make changes (for saving, reading, deleting etc. data) to my code as well?
I understand I need to change connection but what about code?
Thanks
Furqan
For the most part, Access SQL queries are very similar to SQL Server queries. Therefore, unless you're using Access-specific constructs such as Val, CInt, etc. in your queries, you can safely migrate to SQL Server.
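A small illustration of the kind of rewrite involved when those constructs do appear (table and column names are invented):
-- Access SQL, using IIf, Val and # date literals:
SELECT IIf(Qty > 0, 'Y', 'N'), Val(ItemCode) FROM Items WHERE SoldOn >= #01/01/2012#
-- Rough T-SQL equivalent (note that CAST is stricter than Val about non-numeric text):
SELECT CASE WHEN Qty > 0 THEN 'Y' ELSE 'N' END, CAST(ItemCode AS INT) FROM Items WHERE SoldOn >= '20120101'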
It is recommended though, that you thoroughly test the program after migration lest you run into 'strange' errors.
I'm sure the conversion will require some time and troubleshooting. For one, Access through version 2007 uses VBA for the back-end code, and SQL Server's programming is quite different from VBA. Developers transitioning from Visual Basic 6 (very similar to VBA) to VB.NET ran into quite a few challenges.
Also, keep in mind that SQL Server is best used as the back-end storage engine. The user interface is typically written in a .NET language using Visual Studio. SQL Server (at least through 2005 -- I haven't used 2008) doesn't have user interface components like a File Open dialog box.

Is it possible to monitor and log actual queries made against an Access MDB?

Is it possible to monitor what is happening to an Access MDB (ie. what SQL queries are being executed against it), in the same way as you would use SQL Profiler for the SQL Server?
I need logs of actual queries being called.
The answer depends on the technology the client uses to access the MDB. There are different tracing settings which you can configure in HKEY_LOCAL_MACHINE\Software\Microsoft\Jet\4.0\Engines\ODBC - see http://office.microsoft.com/en-us/access/HP010321641033.aspx. If you use OLE DB to access the MDB from SQL Server, you can use DBCC TRACEON (see http://msdn.microsoft.com/en-us/library/ms187329.aspx). I could continue, but first of all you should define exactly which interface you use to access the MDB.
An MDB is a file without any active components, so the tracing can be done not by the MDB itself, but only by the DB interface.
UPDATED: Because you use DAO (the Jet engine) and OLE DB from VB, I recommend you create a JETSHOWPLAN registry key with the value "ON" under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Debug (you have to create the Debug subkey). This key is described for example in https://web.archive.org/web/1/http://articles.techrepublic%2ecom%2ecom/5100-10878_11-5064388.html and http://msdn.microsoft.com/en-us/library/aa188211%28office.10%29.aspx, and corresponds to http://support.microsoft.com/kb/252883/en, which allows tracing OLE DB queries. If this output is not enough for you, you can additionally use TraceSQLMode and TraceODBCAPI from HKEY_LOCAL_MACHINE\Software\Microsoft\Jet\4.0\Engines\ODBC. In my practice JETSHOWPLAN gives perfect information. See also the SHOWPLAN command.
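As a .reg file, that key would look like this (remember to set the value back to OFF afterwards, because the SHOWPLAN.OUT log it produces grows quickly):
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Debug]
"JETSHOWPLAN"="ON"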
UPDATED 2: For more recent versions of Access (like Access 2007), use a key like HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Office\12.0\Access Connectivity Engine\Engines. The tool ShowplanCapturer (see http://www.mosstools.de/index.php?option=com_content&view=article&id=54&Item%20%20id=57, download at http://www.mosstools.de/download/showplan_v9.zip, also in English) can also be helpful for you.
If you're accessing it via ODBC, you can turn on ODBC logging. It will slow things down a lot, though. And it won't work for any other data interface.
Another thought is using Jet/ACE as a linked server in SQL Server, and then using SQL Profiler. But that's going to tell you the SQL that SQL Server processed, not what Jet/ACE processed. It may be sufficient for your purposes, but I don't think it would be a good diagnostic for Jet/ACE.
EDIT:
In a comment, the original poster has provided this rather crucial information:
The application I am trying to monitor is compiled and at a customer's premises. I am trying to monitor what queries it is attempting against an MDB. I cannot modify the application. I am trying to do what SQL Profiler would do for a SQL Server.
In that case, I think that you could do this:
rename the original MDB to something else.
use a SQL Server linked server to connect to the renamed MDB file (see the sketch after this list).
create a new MDB with the name of the original MDB and link to the SQL Server with ODBC.
The result will be an MDB file that has the same tables in it as the original, but they are not local, but links to the SQL Server. In that case, all access will be going through the SQL Server and can be viewed with SQL Profiler.
I don't have a clue what this would do to performance, or if it would break any of the data retrieval in the original app. If that app uses table-type recordsets or SEEK, then, yes, it will break. But this is the only way I can see to get logging.
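A sketch of step 2 using the Jet OLE DB provider (the path and linked server name are placeholders; note the 32-bit Jet provider cannot be loaded by a 64-bit SQL Server instance):
EXEC sp_addlinkedserver
    @server = 'ORIGMDB',
    @srvproduct = 'Access',
    @provider = 'Microsoft.Jet.OLEDB.4.0',
    @datasrc = 'C:\Data\OriginalRenamed.mdb'
-- the MDB's tables are then visible under a four-part name with catalog and schema omitted:
SELECT * FROM ORIGMDB...SomeTable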
It shouldn't be surprising that there is no logging for Jet/ACE, given that there is no single server process managing access to the data store.
Keep in mind that the file sitting on your hard drive is simply a Windows file. So there is a big difference between a server-based system and a simple text file, or PowerPoint file, or in this case an MDB file just sitting on the drive.
However, you can get the Jet engine to display its query optimization via SHOWPLAN.
How to do this is explained here:
http://www.databasejournal.com/features/msaccess/article.php/3658041/Queries-On-Steroids--Part-IV.htm
The above article also shows how to access the Jet disk read statistics, which I also find extremely useful for optimizing things.
Just remember to turn off that data engine logging system when you're not using it, as it creates huge log files…
You could write your own profiler, based on a "transaction" object that centralizes all instructions sent to the database. You'll end up somewhere with a "transaction.execute" method and a transaction table in your Access db. This table can then be used to collect each transaction's instructions, start time, end time, the user sending the instruction, etc.
I'd suggest upsizing the tables to SQL Server. There is a tool from the SQL Server group that is better than the Upsizing Wizard that is included with Access.
SQL Server Migration Assistant for Access (SSMA Access)
Also see my Random Thoughts on SQL Server Upsizing from Microsoft Access Tips page