Databases supported by Apache Mahout - sql

I want to use Apache Mahout in my application to implement my recommendation engine,
but I do not know which database types Apache Mahout supports.
Are NoSQL databases such as Cassandra, MongoDB, and CouchDB, and SQL databases such as MySQL, Oracle, and Access, supported by Apache Mahout?

Mahout is a library and lately has added an interactive computation environment. That means you write code to put its output in a DB yourself. Therefore you can use any one you want. There are examples of how that might work inside the Mahout project but none of them are good for production.
I have used Cassandra, MongoDB, and MySQL.
The new Multimodal Recommender (also DB agnostic) uses a search engine to serve recs. It would be practical to put user input into a DB with this architecture also. See references here: http://mahout.apache.org/users/recommender/intro-cooccurrence-spark.html
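To make "write code to put its output in a DB yourself" concrete, here is a minimal sketch in Python. It uses SQLite from the standard library purely as a stand-in for whichever database you pick. The input line format (`userID<TAB>[itemID:score,...]`), the `recommendations` table, and the helper names are assumptions for illustration, not something Mahout prescribes.

```python
import sqlite3


def parse_line(line):
    """Parse one 'userID<TAB>[itemID:score,...]' line (assumed format)."""
    user_id, raw = line.strip().split("\t", 1)
    rows = []
    for pair in raw.strip("[]").split(","):
        if not pair:
            continue  # user with an empty recommendation list
        item_id, score = pair.split(":")
        rows.append((int(user_id), int(item_id), float(score)))
    return rows


def store_recommendations(conn, lines):
    """Load precomputed recommendations into a (hypothetical) table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS recommendations ("
        "user_id INTEGER, item_id INTEGER, score REAL, "
        "PRIMARY KEY (user_id, item_id))"
    )
    rows = [row for line in lines for row in parse_line(line)]
    # INSERT OR REPLACE keeps a re-run of the batch job idempotent.
    conn.executemany(
        "INSERT OR REPLACE INTO recommendations VALUES (?, ?, ?)", rows
    )
    conn.commit()


def top_n(conn, user_id, n=10):
    """Return the n highest-scored (item_id, score) pairs for a user."""
    cur = conn.execute(
        "SELECT item_id, score FROM recommendations "
        "WHERE user_id = ? ORDER BY score DESC LIMIT ?",
        (user_id, n),
    )
    return cur.fetchall()


# Made-up sample lines standing in for a batch job's output file.
sample = ["1\t[101:4.5,102:3.9]", "2\t[103:5.0]"]
conn = sqlite3.connect(":memory:")
store_recommendations(conn, sample)
```

The application then serves recommendations with a plain lookup such as `top_n(conn, 1)`. Swapping SQLite for MySQL, Cassandra, or MongoDB only changes the storage calls, which is the sense in which the library is database agnostic.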

Related

ZF2 DB based session management vs Redis

I am not sure if this is the right place to post this.
I am writing a PHP and ZF2 based website that needs to be scalable. So, I am looking into Database based sessions. I understand ZF2 supports DB Session management, so I can create a MySQL DB, and use it. But DB session management could be slow. So, I have looked into redis as a cache management solution.
My question is: will using Redis as a standalone server work for both server-side session management and caching (as it seems to have its own in-memory DB), or do I need to combine it with ZF2 DB session management?

spring boot switching from in-memory database to persistent database

I have developed my web application using spring-boot, spring-data-jpa, and an in-memory database, and I have a couple of questions:
How can I now switch to a persistent database, let's say MySQL? What do I have to change in my configuration?
Can spring-boot set a database up for me with a specific port and where does it get stored in my file system?
Does IntelliJ provide a datasource browser for the created database?
I am sure this must be covered somewhere in the endless jungle of spring-boot documentation.
You can change the application properties for the datasource according to the link Gabor Bakos already provided.
That depends on the type of database you want to use. HSQLDB and H2 allow you to specify a file path for the database file; however, the database instance itself still runs within your application process. With a full RDBMS like MySQL, you have to install and configure the MySQL server yourself and provide the connection data to your Spring Boot application.
Yes, IntelliJ has a datasource browser for all major databases (maybe you have to download the database driver).
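As a rough sketch of what "change the application properties" can look like, the fragment below points the datasource at a local MySQL server. The database name, user, and password are placeholders, and it assumes the MySQL JDBC driver has been added to the project's dependencies.

```properties
# application.properties (mydb, myuser, and secret are placeholders)
spring.datasource.url=jdbc:mysql://localhost:3306/mydb
spring.datasource.username=myuser
spring.datasource.password=secret
# Let Hibernate create or update tables from the entities; convenient in development only.
spring.jpa.hibernate.ddl-auto=update

# Alternative: keep an embedded H2 database but persist it to a file.
# spring.datasource.url=jdbc:h2:file:./data/mydb
```

With the MySQL variant the port and storage location are decided by the MySQL server installation, not by Spring Boot. With the file-based H2 variant the database file lives at the path given in the URL.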

SQL Server Connected to Hadoop - Thoughts and Challenges of Implementation

I wanted to broach the issue of SQL Server's Hadoop distribution called HDInsight.
Given that there is a connection provided to Hadoop, does anyone have experience with HDInsight, and in particular a comparison between the Hadoop / SQL Server connector and HDInsight / SQL Server, from a real-life DTP scenario or a personal one-node installation?
http://sqlmag.com/blog/use-ssis-etl-hadoop
http://www.microsoft.com/en-us/download/details.aspx?id=27584
http://www.microsoft.com/en-us/sqlserver/solutions-technologies/business-intelligence/big-data.aspx
HDInsight is the distribution of Hadoop that Microsoft maintains for use in Azure. You could roughly compare this to Amazon Elastic MapReduce. They both serve the purpose of being a hosted Hadoop service that has almost no management overhead.
The Hortonworks Data Platform for Windows contains the open source changes that Hortonworks and Microsoft have collaborated on to make Hadoop run well on Windows. HDP isn't HDInsight.
In short - you don't need to use HDInsight if you want to run Hadoop in a Windows environment.
While I can't speak directly to using HDInsight and moving data back and forth with SQL Server, I've implemented a data processing solution using SQL Server, Hadoop, and Elastic MapReduce. Barring some data quality issues and BULK INSERT weirdness, the process was painless.
Finally, you ask "do we really want to run Hadoop size datasets on Windows servers?" - Windows performs well and has solid tooling around it. I've been somewhat skeptical about running Hadoop and other Java platform software on Windows because of legacy Java I/O issues and a lack of community support, not because of any performance issues.
The largest issue that Windows companies will find when moving to Hadoop is that there will be limited support in community forums and channels when the problem becomes a Hadoop + Windows issue. It's very easy for people to throw their hands up and say "Nope, not helping out, don't have Windows." With time and adoption, this problem goes away. Besides, nothing says you have to finish on the same platform you start with. You could easily deploy with HDP on Windows and move to HDP on Linux at a later date.
I have put together some SQL Server and Hadoop basics for DBAs that should be helpful.

What SQL-based server software should I install locally to develop db skills using?

I'm currently working a contract for a company which will be moving an Access database to MS SQL Server in the future, and I'd like to hone my skills before the company makes the switch.
I'm also looking forward to possibly developing a rudimentary website which would have a simple HTML/CSS/JS front end, and those skills could also use a sharpening. I'd like to develop my skills through some SQL work locally on my home computer. Researching how to do that has only yielded the suggestion of installing an Apache web server with PHP and MySQL on my local machine. While I'm not opposed to doing that, the last time I worked with an Apache install, it was over-complicated and bloated.
Is there a more streamlined option? It doesn't seem necessary to load an entire web server for the specific use I'm going for.
I'd prefer to simply install a program that allows me to host the [My]SQL database locally, and perhaps later some way to test HTML and Javascript interacting with the database. I'm already somewhat familiar with Sequel Pro. As an added bonus, my Python skills are rusty and I'd like to get used to scripting Python. At this point Xcode (4.2) seems the likely solution here, but I'm open to other options.
I would be installing on an 11" MacBook Air running Lion and Xcode 4.2.
Apache and PHP ship with Mac OS X 10.7. MySQL installation is pretty straightforward with the package available from MySQL.
Take a look at these instructions--for your purposes, you can stop after the php.ini section.
Obviously a very subjective question.
I find CherryPy (with Python) for the web server and server programming, combined with jQuery to augment the HTML in the browser to be a powerful and lightweight combination. When possible, I use SQLite for the database — another extremely lightweight option. I share your distaste for Apache, at least when it's not needed.
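For a feel of how little setup the SQLite route needs, here is a minimal sketch using only Python's standard library. The tables, names, and values are made up for illustration. It exercises the basics (DDL, inserts, a join with aggregation), not anything specific to SQL Server.

```python
import sqlite3

# An in-memory database; pass a file path instead to keep the data between runs.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    );
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    );
    """
)

# Parameterized inserts: the ? placeholders keep values out of the SQL text.
conn.executemany(
    "INSERT INTO customers (id, name) VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(1, 20.0), (1, 5.5), (2, 12.0)],
)
conn.commit()

# Order count and total spend per customer.
totals = conn.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.total)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.id, c.name
    ORDER BY c.name
    """
).fetchall()

for name, order_count, spend in totals:
    print(name, order_count, spend)
```

Nothing has to be installed or started as a server, so it works well for practicing standard SQL and for getting back into Python scripting at the same time.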
If you want to develop solid skills that are peculiar to MS SQL Server, you really don't have much alternative to installing SQL Server someplace you can access it. Although you can certainly limit your use of SQL Server to standard SQL set-based commands, SQL Server is more often used with a heavy emphasis on procedural code built into stored procedures. Those skills (writing SQL Server-specific stored procedures) are only minimally transferable to or from other database products.
If you were running Windows, you could get MS SQL Server Express for free:
http://www.microsoft.com/sqlserver/en/us/editions/express.aspx
In the end I just installed MySQL. I ended up following the instructions available from this blog post about setting up django, but not installing phpMyAdmin...
This post also has some good things about setting up passwords and such, since I really didn't want to be stuck relying on something else to admin the MySQL setup.

NHibernate monitor connections

Is there a way to monitor the round-trips to the database in an NHibernate application?
I just need a log to see when NHibernate connects to the database.
Have a look at NHibernate Profiler (NHProf) if you don't mind a commercial product. Some of its features are:
Visual insight into the interaction between your database and application code.
Analysis and detection of common pitfalls when using NHibernate.
Analysis is delivered via perfectly styled SQL and linkable code execution.
Supports NHibernate (.NET) and Hibernate (Java).
You can 'monitor' via logging using NH's log4net. Some useful info here.
That monitors from the application side.
Have you tried monitoring from the DB server side? E.g. enable query logging on, say, MySQL.
In addition to NHProf and log4net, there is also a "show_sql" config entry that will dump the SQL to the console in a console app.
Your database vendor should also have tools for monitoring the SQL that is being run against it.