What's the best way of converting a mysql database to a sqlite one? - sql

I currently have a relatively small (4 or 5 tables, 5000 rows) MySQL database that I would like to convert to an sqlite database. As I'd potentially have to do this more than once, I'd be grateful if anyone could recommend any useful tools, or at least any easily-replicated method.
(I have complete admin access to the database/machines involved.)

I've had to do similar things a few times. The easiest approach for me has been to write a script that pulls from one data source and produces an output for the new data source. Just do a SELECT * query for each table in your current database, and then dump all the rows into an INSERT INTO query for your new database. You can either dump this into a file or pipe it straight into the database frontend.
It's not pretty, but honestly, pretty hardly seems to be a major concern for things like this. This technique is quick to write, and it works. Those are my primary criteria for things like this.
You might want to check out this thread, too. It looks like a couple of people have already put together basically what you need. I didn't look that far into it, though, so no guarantees.

As long as a MySQL dump file doesn't exceed the SQLite query language, you should be able to migrate fairly easily:
tgl#moto~$ mysqldump old-database > old-database-dump.sql
tgl#moto~$ sqlite3 -init old-database-dump.sql new-database
I haven't tried this myself.
UPDATE:
Looks like you'll need to do a couple edits of the MySQL dump. I'd use sed, or Google for it.
Just the comment syntax, auto_increment & TYPE= declaration, and escape characters differ.

Here is a list of converters:
http://www.sqlite.org/cvstrac/wiki?p=ConverterTools
An alternative method that would work nicely but is rarely mentioned is: use a ORM class that abstracts the specific database differences away for you. e.g. you get these in PHP (RedBean), Python (Django's ORM layer, Storm, SqlAlchemy), Ruby on Rails ( ActiveRecord), Cocoa (CoreData)
i.e. you could do this:
Load data from source database using the ORM class.
Store data in memory or serialize to disk.
Store data into source database using the ORM class.

If it's just a few tables you could probably script this in your preferred scripting langauge and have it all done by the time it'd take to read all the replies or track down a suitable tool. I would any way. :)

Related

MongoDB or SQL for text file?

I have a 25GB's text file with that structure(headers):
Sample Name Allele1 Allele2 Code metaInfo...
So it just one table with a few millions of records. I need to put it to database coz sometimes I need to search that file looking, for example, specific sample. Then I need to get all row and equals to file. This would be a basic application. What is important? File is constant. It no needed put function coz all samples are finished.
My question is:
Which DB will be better in this case and why? Should I put a file in SQL base or maybe MongoDB would be a better idea. I need to learn one of them and I want to pick the best way. Could someone give advice, coz I didn't find in the internet anything particular.
Your question is a bit broad, but assuming your 25GB text file in fact has a regular structure, with each line having the same number (and data type) of columns, then you might want to host this data in a SQL relational database. The reason for choosing SQL over a NoSQL solution is that the former tool is well suited for working with data having a well defined structure. In addition, if you ever need to relate your 25GB table to other tables, SQL has a bunch of tools at its disposal to make that fast, such as indices.
Both MySQL and MongoDB are equally good for your use-case, as you only want read-only operations on a single collection/table.
For comparison refer to MySQL vs MongoDB 1000 reads
But I will suggest going for MongoDB because of its aggeration pipeline. Though your current use case is very much straight forward, in future you may need to go for complex operations. In that case, MongoDB's aggregation pipeline will come very handy.

Database Type Agnostic Select Query Encapsulation class

I am upgrading a webapp that will be using two different database types. The existing database is a MySQL database, and is tightly integrated with the current systems, and a MongoDB database for the extended functionality. The new functionality will also be relying pretty heavily on the MySQL database for environmental variables such as information on the current user, content, etc.
Although I know I can just assemble the queries independently, it got me thinking of a way that might make the construction of queries much simpler (only for easier legibility while building, once it's finished, converting back to hard coded queries) that would entail an encapsulation object that would contain:
what data is being selected (including functionally derived data)
source (including joined data, I know that join's are not a good idea for non-relational db's, but it would be nice to have the facility just in case, which can be re-written into two queries later for performance times)
where and having conditions (stored as their own object types so they can be processed later, potentially including other select queries that can be interpreted by whatever db is using it)
orders
groupings
limits
This data can then be passed to an interface adapter that can build and execute the query, returning it in an array, or object or whatever is desired.
Although this sounds good, I have no idea if any code like this exists. If so, can anybody point it out to me, if not, are there any resources on similar projects undertaken that might allow me to continue the work and build a basic version?
I know this is a complicated library, but I have been working on this update for the last few days, and constantly switching back and forth has been getting me muddled at times and allowing for mistakes to occur
I would study things like the SQL grammar: http://www.h2database.com/html/grammar.html
Gives you an idea of how queries should be constructed.
You can study existing libraries around LINQ (C#): https://code.google.com/p/linqbridge/
Maybe even check out this link about FQL (Facebook's query language): https://code.google.com/p/mockfacebook/issues/list?q=label:fql
Like you already know, this is a hard problem. It will be a big challenge to make it run efficiently. Maybe consider moving all data from MySQL and Mongo to a third data store that has a copy of all the data and then running queries against that? Replicating all writes to something like Redis or Elastic Search and then write your queries there?
Either way, good luck!

querying larg text file containing JSON objects

I have few Gigabytes text file in format:
{"user_ip":"x.x.x.x", "action_type":"xxx", "action_data":{"some_key":"some_value"...},...}
each entry is one line.
First I would like to easily find entries for given ip. This part is easy because I can use grep for example. However even for this I would like to find better solution because I would like to get response as fast as possible.
Next part is more complicated because I would like to find entries from selected ip and of selected type and with particular value of some_key in action_data.
Probably I would have to convert this file to SQL db (probably SQLite, because it will be desktop APP), but I would ask if there are exists better solutions?
You could take a look at MongoDB, a document based database. With it you essentially store JSON objects that you can then index and easily query in an efficient way. You can find about how to query in the docs: Querying.
Yes, put it into a database, any database. Then querying it will be straightforward.
Just wanted to mention that Oracle Berkeley DB 11gR2 (released on April 1st, 2010) introduces support for a SQL API. In fact, the SQL API is the sqlite3() API. So, as Jason mentioned, if you'd like the ease-of-use of SQLite, combined with the scalability and concurrency of Berkeley DB, you can now get both things in a single library.
Regards,
Dave
If you need the relational guarantees of an SQL-based DB, definitely go ahead with SQLite. It will allow for fast queries, joins, aggregations, sorts, and overall any sort of search you could possibly dream up. It sounds like this is just a big list of Actions performed by users at some IP, so you'll probably want to use some sort of sequence as your primary key since none of the other attributes look like good candidates.
On the other hand, if you just need to do very simple queries, e.g. look up entries by IP, look up entries by action type, etc., you might want to look into Oracle Berkeley DB. As long as you don't need any searches that are too fancy, Berkeley DB will let you store Terabytes of data and access them at record speed.
So look over both and see what's best for your use case. They're good for different things, which might be why both are available as storage systems on Android, for instance. I think SQLite will probably win out, but when thinking about embedded local DB systems you should always at least consider both of these technologies.

Is SQL the ''assembler'' of the NoSQL database world?

I recently came across http://www.fossil-scm.org/index.html/doc/tip/www/theory1.wiki by D. Richard Hipp, the developer responsible for SQLite.
it go me thinking, is Fossil the only NoSQL database that uses SQL?
Do others uses SQL as a 'High Level Scripting Language'?
From the article, it sounds like Fossil isn't a database any more than git is a database. Yes, it's a thing that contains data, and yes, it's backed by a database, but it seems pretty far from a database itself. So the first part of of your question basically relies on a faulty assumption. There is a database called Friendly which uses MySQL to store schema-less models, but it seems like an awkward bandaid sort of solution at best.
I'm certainly not familiar with all of the NoSQL options out there, but, to my knowledge, none of the well-though-of ones use SQL for anything. MongoDB and CouchDB, the two I'm most familiar with, both use Javascript as part of their query interface, though in very different ways. MongoDB has queries more like what you'd expect from a relational database: you can write an arbitrary query for all documents that match a certain set of attributes. However, unlike a relational database, there's no such thing as a join (you'll only ever get a list of distinct documents back, not compound documents) and you can write arbitrary Javascript code to select documents. CouchDB, on the other hand, does not allow arbitrary queries. Instead, you create views (which are essentially simpler key-value stores) using map/reduce functions written in Javascript and then query those views from a start key to and end key.
In both cases, the type of information being transmitted to the server to perform the query isn't well-suited for the type of problem that SQL is good at solving. The trade-off to SQL being so high-level (to use the logic of the author of the paper) is that it's only suitable for a very narrow set of problems.
The creator of Fossil / SQLite is working and pushing UnQL as the NoSQL standard:
UnQL means Unstructured Query Language.
It's an open query language for JSON, semi-structured and document
databases.
It looks like a stripped down version of SQL.

Migrating from MySQL to arbitrary standards-compliant SQL2003 server

Is there an incantation of mysqldump or a similar tool that will produce a piece of SQL2003 code to create and fill the same databases in an arbitrary SQL2003 compliant RDBMS?
(The one I'm trying right now is MonetDB)
DDL statements are inherently database-vendor specific. Although they have the same basic structure, each vendor has their own take on how to define types, indexes, constraints, etc.
DML statements on the other hand are fairly portable. Therefore I suggest:
Dump the database without any data (mysqldump --no-data) to get the schema
Make necessary changes to get the schema loaded on the other DB - these need to be done by hand (but some search/replace may be possible)
Dump the data with extended inserts off and no create table (--extended-insert=0 --no-create-info)
Run the resulting script against the other DB.
This should do what you want.
However, when porting an application to a different database vendor, many other things will be required; moving the schema and data is the easy bit. Checking for bugs introduced, different behaviour and performance testing is the hard bit.
At the very least test every single query in your application for validity on the new database. Ideally do a lot more.
This one is kind of tough. Unless you've got a very simple DB structure with vanilla types (varchar, integer, etc), you're probably going to get the best results writing a migration tool. In a language like Perl (via the DBI), this is pretty straight-forward. The program is basically an echo loop that reads from one database and inserts into the other. There are examples of this sort of code that Google knows about.
Aside from the obvious problem of moving the data is the more subtle problem of how some datatypes are represented. For instance, MS SQL's datetime field is not in the same format as MySQL's. Other datatypes like BLOBs may have a different capacity in one RDBMs than in another. You should make sure that you understand the datatype definitions of the target DB system very well before porting.
The last problem, of course, is getting application-level SQL statements to work against the new system. In my work, that's by far the hardest part. Date math seems especially DB-specific, while annoying things like quoting rules are a constant source of irritation.
Good luck with your project.
From SQL Server 2000 or 2005 you can have it generate scripts for your objects, but I am not sure how well they will transfer to other RDBMS.
The generate script option is probably the easiest way to go. You'll undoubtedly have to do some search/replace on a few data types though.