Can Microsoft Azure be used to run an Access VBA simulation model? - vba

I have some simulation models that I routinely use that were built in Microsoft Access VBA. I have just become aware of Microsoft Azure (I know I am late to the show), and was curious to know if there is any way to run my models via Azure's distributed computing services to make them faster?
I saw something called SQL Azure on the website but I didn't entirely understand the product. 95% of the computation in the VBA model is SQL commands.
If you have any knowledge or experience I would love to hear from you.

SQL Azure is most like a remote SQL Server which - as you know - knows nothing about VBA.
But you can create one or more virtual machines hosted at Azure and install your application on this/these. Then, as needed, you can assign expanded CPU resources to these machines for as little as one hour.
Azure has a free tier. Create an account and you get access to a fair amount of resources to evaluate with.

While Azure is distributed, it is also utility computing (that means slow in terms of processing ability, with a considerable amount of “governing” of the CPU available).
The other issue of course is that running SQL in the cloud on Azure means that any data you pull into Access/VBA has to travel over a VERY slow network connection. This connection is thousands of times slower than reading a local Access table.
So the real issue then becomes the transfer of data. You could I suppose rewrite the VBA code as T-SQL code and pretty much dump the use of Access. However, T-SQL is even less suited to simulation-type software than VBA (and procedural T-SQL code is not that fast in terms of execution speed either).
So between the bandwidth issues and T-SQL on SQL Server being a rather limited language when it comes to writing and running lots of procedural code (which most simulation software entails), this is likely the wrong approach and the wrong technology here.
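To put rough numbers on the round-trip cost described above, here is a back-of-the-envelope sketch in Python. The latency figures and the query count are made-up assumptions for illustration, not measurements of Access or Azure; the only point is that per-query latency multiplies across a model that issues many small SQL commands.

```python
def total_wait_seconds(num_queries: int, latency_microseconds: int) -> float:
    """Time spent purely waiting on round trips, one round trip per query."""
    return num_queries * latency_microseconds / 1_000_000


# Hypothetical figures, chosen only to illustrate the scaling.
NUM_QUERIES = 100_000          # SQL commands issued by one simulation run
LOCAL_LATENCY_US = 50          # assumed cost of hitting a local table
REMOTE_LATENCY_US = 50_000     # assumed round trip to a remote database

local_total = total_wait_seconds(NUM_QUERIES, LOCAL_LATENCY_US)
remote_total = total_wait_seconds(NUM_QUERIES, REMOTE_LATENCY_US)

print(f"local:  {local_total:.1f} s")
print(f"remote: {remote_total:.1f} s")
print(f"slowdown: {remote_total / local_total:.0f}x")
```

With these assumed inputs the waiting time alone goes from seconds to over an hour, before any actual computation happens, which is why moving only the data to the cloud tends to hurt rather than help.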

Related

moving from sql server to cassandra

I have a data-intensive project for which I wrote the code recently; the data and stored procedures live in an MS SQL database. My initial estimate is that the db will grow to 50TB, after which its growth will become fairly static. The final application will perform lots of row-level lookups and reads, with a very small percentage of writes back to the db.
With the above scenario in mind, it's being suggested that I should look at a NoSQL option in order to scale to the large load of data and transactions, and after a bit of research the road leads to Cassandra (with MongoDB as a second alternative).
I would appreciate your guidance with the following set of initial questions:
-Does Cassandra support the concept of stored procs?
-Would I be able to install and run the 50TB db on a single node (single Windows Server)?
-Does Cassandra support/leverage multiple CPUs in single server (ex: 4 CPUs)?
-Would open source version be able to support the 50TB db? or would I need to purchase the ENT version?
Regards,
-r
Does Cassandra support the concept of stored procs?
Cassandra does not support stored procedures. However, there is a feature called "prepared statements" which allows you to submit a CQL query once and then have it executed multiple times with different parameters. The set of things you can do with prepared statements is limited to regular CQL. In particular, you cannot use loops, conditional statements or other procedural constructs. But you do get some measure of protection against injection attacks, and you avoid compiling the same query repeatedly.
Would I be able to install and run the 50TB db on a single node (single Windows Server)?
I am not aware of anything that would prevent you from running a 50TB database on one node, but you may require lots of memory to keep things relatively smooth, as your RAM/storage ratio is likely to be very low and will thus limit your ability to cache disk data meaningfully. What is not recommended, however, is running a production setup on Windows. Cassandra uses some Linux-specific I/O optimizations, and is tested much more thoroughly on Linux. Far-out setups like the one you're suggesting are especially likely to be untested on Windows.
Does Cassandra support/leverage multiple CPUs in single server (ex: 4 CPUs)?
Yes
Would open source version be able to support the 50TB db? or would I need to purchase the ENT version?
The Apache distro does not have any usage limits baked into it (it makes little sense in an open source project, if you think about it). Neither does the free version from DataStax, the Community Edition.

What database strategy to choose for a large web application

I have to rewrite a large database application running on 32 servers. The hardware is up to date; each machine has two quad-core Xeons and 32 GByte RAM.
The database is multi-tenant; each customer has his own file, around 5 to 10 GByte each. I run around 50 databases on this hardware. The app is open to the web, so I have no control over the load. There are no really complex queries, so SQL is not required if there is a better solution.
The databases get updated via FTP every day at midnight. The database is read-only.
C# is my favourite language and I want to use ASP.NET MVC.
I thought about the following options:
Use two big servers running SQL Server 2012 to serve data to the 32 servers. The 32 servers would run IIS and provide REST services.
Denormalize the database and use Redis on each webserver. Use booksleeve as a Redis client.
Use a combination of SQL Server and Redis
Use SQL Server 2012 together with Hadoop
Use Hadoop without SQL Server
What is the best way, for a read-only database, to get the best performance without losing maintainability? Does Map-Reduce make sense at all in such a scenario?
The reason for the rewrite is that the old app, written in C++ with ISAM technology, is too slow, and its interfaces are old-fashioned and not nice to use from a website, especially when using AJAX.
The app uses a relational data model with many tables, but it is possible to build one accelerator table on which all queries can be performed, with all other information from the other tables retrievable by a simple key lookup.
Few questions. What problems have come up that you're rewriting this? What do the query patterns look like? It sounds like you would be most comfortable with SQL Server plus caching (memcached) to address whatever issues are causing you to rewrite this. Redis is good, but you won't need the data structure features with the db handling queries, and you don't need persistence if it's only being used as a cache. Without knowing more about the problem, I guess I'd look at MongoDB to handle data sharding, redundant storage, and caching all in one solution. There are no special machines in this setup, redundancy can be configured, and the load should balance well.
This question is almost an opinion piece. I'd personally prefer an Oracle RAC with TimesTen for caching if performance is of the utmost importance, and if volume of concurrent reads is high during the day.
There's a white paper here...
http://www.oracle.com/us/products/middleware/timesten-in-memory-db-504865.pdf
The specs of the disk subsystem and the organization of indexes and data files across physical disks are probably the most important factors though.

Is there an "in memory" setting for SQL Server?

I'm curious if SQL Server can allow for the creation of in memory databases.
Currently I'm looking at unit testing/integration testing some data layer code that is connected to SQL Server.
In the past I have availed myself of SQLite's support for this concept and found it invaluable.
You could mount either an SSD drive or a ramdisk on your server and then put a regular database on that volume?
Otherwise, no.
I've not mentioned Table Variables as they're only partly held in memory and are likely to be too transient for your requirements.
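For comparison, the SQLite behaviour the question refers to can be sketched in Python with the standard `sqlite3` module, where the special name `:memory:` creates a database that lives only in RAM for the lifetime of the connection. This is only an illustration of the SQLite side, not something SQL Server offers; the `customers` table and the two helper functions are hypothetical stand-ins for real data layer code.

```python
import sqlite3


def make_test_db() -> sqlite3.Connection:
    """Create a throwaway in-memory database seeded with fixture rows."""
    conn = sqlite3.connect(":memory:")  # nothing is written to disk
    conn.execute(
        "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO customers (id, name) VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.commit()
    return conn


def find_customer_name(conn: sqlite3.Connection, customer_id: int):
    """Hypothetical data layer function under test; returns None if absent."""
    row = conn.execute(
        "SELECT name FROM customers WHERE id = ?", (customer_id,)
    ).fetchone()
    return row[0] if row else None


conn = make_test_db()
print(find_customer_name(conn, 2))
conn.close()  # the database disappears when the connection closes
```

Each test can call `make_test_db()` to get a fresh, isolated database, which is what makes the approach convenient for unit tests. The catch is that the SQL dialect is SQLite's, so it does not exercise T-SQL-specific code.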

microsoft sql server management studio express store db in memory?

I have a database intensive test I'm running that uses a small database ~100MB.
Is there a way to have Microsoft SQL Server Management Studio Express store the database in memory instead of on the hard drive? Is there some option I can select for it to do this?
I'm also thinking about a ram drive, but if there is an option in mssmse I'd rather do that.
Management Studio has nothing to do with how the database is stored. The SQL database engine will, given sufficient memory, cache appropriately to speed up queries. You really shouldn't need to do anything special. You'll see that the initial query is a bit slower than the ones that run after the cache is populated; that's normal.
Don't mess with a RAM drive, you'll be taking memory away from SQL to do it and will probably end up less efficient. If you have a critical need for fast disk, you'll either need to look at a properly configured array or solid state drives.
There are ways to performance tune SQL to specific applications, but it's very involved and requires a deep knowledge of the specific SQL server product. You're better off looking at database design and query optimization first.
Realistically databases around 100MB are tiny and shouldn't require special handling if properly designed.

Single logical SQL Server possible from multiple physical servers?

With Microsoft SQL Server 2005, is it possible to combine the processing power of multiple physical servers into a single logical sql server? Is it possible on SQL Server 2008?
I'm thinking, if the database files were located on a SAN and somehow one of the sql servers acted as a kind of master, then processing could be spread out over multiple physical servers, for instance even allowing simultaneous updates where there was no overlap, and in the case of read-only queries on unlocked tables no limit.
We have an application that is limited by the speed of our sql server, and probably stuck with server 2005 for now. Is the only option to get a single more powerful physical server?
Sorry I'm not an expert, I'm not sure if the question is a stupid one.
TIA
Before rushing out and buying new hardware, find out where your bottlenecks really are. Many locking problems can be solved with the appropriate indexes for your workload.
For example, I've seen instances where placing tempDB on SSD solved performance issues and saved the client buying an expensive new server.
Analyse your workload: How Can I Log and Find the Most Expensive Queries?
With SQL Server 2008 you can utilise the Management Data Warehouse (MDW) to capture your workload.
White Paper: SQL Server 2008 Performance and Scale
Also: please be aware that a SAN solution is not necessarily a faster I/O solution than directly attached storage. It depends on the SAN, the number of physical disks in a LUN, LUN subscription and usage, the speed of the HBAs and several other hardware factors...
Optimizing the app may be a big job of going through all the business logic and lines of code. But looking for the most expensive queries can easily locate the bottleneck area. Maybe it only involves a couple of the biggest tables, views or stored procedures. Adding or fine-tuning an index may help right away. If bumping up the RAM is possible, try that option as well. That is cheap and easy to configure.
Good luck.
You might want to google for "sql server scalable shared database". Yes you can store your db files on a SAN and use multiple servers, but you're going to have to meet some pretty rigid criteria for it to be a performance boost or even useful (high ratio of reads to writes, small enough dataset to fit in memory or a fast enough SAN, multiple concurrent accessors, etc, etc).
Clustering is complicated and probably much more expensive in the long run than a bigger server, and far less effective than properly optimized application code. You should definitely make sure your app is well optimized.