I would like to play with Stack Overflow's data dump in Oracle. The dump is provided as XML, and the files are huge (one XML file is about 3GB). I would like to import this data into my Oracle DB. I know someone else on this topic managed to work with the XML directly. Any ideas or suggestions to make this happen easily?
Check out the Groovy SQL and XML libraries. You should be able to get up and running pretty quickly even with minimal Java/Groovy experience.
http://docs.codehaus.org/display/GROOVY/Tutorial+6+-+Groovy+SQL
Groovy XML
You'll need to install groovy and get the ojdbc14.jar drivers from Oracle. Put your code in a file and run:
groovy -cp ojdbc14.jar myscript.groovy
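Whichever language you use, a 3GB file has to be read as a stream rather than loaded whole. Here is a minimal sketch of that idea in Python instead of Groovy, using only the standard library. It assumes the dump consists of a root element containing many `<row .../>` elements whose attributes hold the column values; the function name, the `row_tag` default and the batch size are my own choices for illustration.

```python
import io
import xml.etree.ElementTree as ET


def iter_row_batches(source, batch_size=1000, row_tag="row"):
    """Yield lists of attribute dicts, one dict per <row> element.

    `source` is a file name or a binary file object. The tag name "row"
    is an assumption about the dump layout; adjust it to your file.
    """
    batch = []
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == row_tag:
            batch.append(dict(elem.attrib))
            root.clear()  # detach processed rows so memory use stays flat
            if len(batch) >= batch_size:
                yield batch
                batch = []
    if batch:
        yield batch


# Small in-memory demonstration; a real run would pass the dump's file name.
sample = io.BytesIO(b'<posts><row Id="1" Score="5"/><row Id="2" Score="7"/>'
                    b'<row Id="3" Score="0"/></posts>')
batches = list(iter_row_batches(sample, batch_size=2))
```

Each batch can then be handed to the bulk-insert call of whatever Oracle driver you use (in Python, the DB-API `executemany` method), committing once per batch rather than once per row.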
I just read
How to parse a OFX (Version 1.0.2) file in PHP?
I am not a developer. What easy tool can I use to run this code without any coding skills or inclination? A web browser is pretty hard to use for non-developers.
I need this in order to use the file in Power BI, which accepts M code, JSON or XML sources, but not SGML-based OFX or PHP.
Thanks in advance
Welcome Didier to StackOverflow!
I'm going to try and give you a clue how I'd approach the problem here. But keep in mind that your question really lacks the details we need to help you, so I'm asking you to update it with example data that you want to integrate into PowerBI. Also, I'm not too familiar with PowerBI or PHP, and won't go into making the PHP code you linked run for you.
Rather, I'd suggest converting your OFX file into XML, and then using PowerBI's XML import on that converted file.
From your linked question, I get that your OFX file is in SGML format. There's a program specifically designed to convert SGML into XML (which is just a restricted form of SGML) called osx. I've detailed how to install it on Linux and Mac OS in another question related to SGML-to-XML down-converting; if you're on Windows, you may have luck by just downloading a really ancient (32bit) version of it from ftp://ftp.jclark.com/pub/sp/win32/sp1_3_4.zip. Alternatively, you can use my sgmljs.net software as explained in Converting HTML to XML though that tutorial is really about the much more complex task of converting HTML to XML/XHTML and will probably confuse you.
Anyway, if you manage to install osx, running it on your OFX file (which I assume to have the name yourfile.ofx just for illustration) is just a matter of invoking (on the Windows or Linux/Mac OS command line):
osx yourfile.ofx > yourfile.xml
to result in yourfile.xml which you can attempt to load with PowerBI.
Chances are your OFX file has additional text at the beginning (lines like XYZ:0001 that come before <ofx>). In that case, you can just remove those lines using a text editor before invoking osx on it. Maybe you also need a .dtd file or additional instructions at the top of the OFX file informing SGML about the grammar of your file; it's really difficult to say without seeing actual test data.
Before bothering with SGML and all that, however, I suggest removing those first few lines in your OFX file (everything up to the first < character) and checking whether PowerBI can already recognize your changed input file as XML (which, judging from other OFX example files, has a good chance of succeeding). Be sure to work on a copy of your original file rather than overwriting it. Then come back and update your question with your results and example data.
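If a text editor is awkward for this, the header removal can also be scripted. The following is only a sketch in Python, written without having seen your actual file: the function names are made up for illustration, and the UTF-8 default encoding is an assumption you may need to change. It writes to a new file, so the original stays untouched.

```python
def strip_ofx_header(text):
    """Return the text starting at the first '<'; everything before is dropped."""
    start = text.find("<")
    if start == -1:
        raise ValueError("no '<' found; this does not look like an OFX body")
    return text[start:]


def convert_copy(src_path, dst_path, encoding="utf-8"):
    """Write a header-free copy of src_path to dst_path (encoding is assumed)."""
    with open(src_path, encoding=encoding, errors="replace") as src:
        body = strip_ofx_header(src.read())
    with open(dst_path, "w", encoding=encoding) as dst:
        dst.write(body)


# Example with made-up content: one header line, then the tagged body.
cleaned = strip_ofx_header("XYZ:0001\n\n<ofx>\n<a>1</a>\n</ofx>\n")
```

For a real file you would call something like `convert_copy("yourfile.ofx", "yourfile-noheader.ofx")` and then try loading the result in PowerBI or passing it to osx.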
I have a .sql file that I want to convert to NoSQL, as I have coursework on MongoDB.
What application can I use or how can I do it?
In a quick Google search, I found this website that converts CREATE and INSERT INTO statements to a JSON or JavaScript format. However, if you want to create a different database structure (which I would probably recommend), you might want to write a Python script that creates a JSON file to import into MongoDB. I guess it all depends on what you want to create.
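As a rough idea of what such a Python script could look like, here is a sketch that loads the .sql file into a throwaway in-memory SQLite database and dumps one table as JSON documents, one per line. It assumes your CREATE and INSERT statements are plain enough for SQLite to accept (vendor-specific syntax would have to be edited first); the function name and the sample table are invented for illustration.

```python
import json
import sqlite3


def sql_to_json_lines(sql_script, table):
    """Run the SQL script in an in-memory SQLite DB and return one JSON
    string per row of `table` (column names become document keys)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row  # rows behave like mappings
    try:
        conn.executescript(sql_script)
        # Table names cannot be bound as parameters, so quote the identifier.
        quoted = '"' + table.replace('"', '""') + '"'
        rows = conn.execute(f"SELECT * FROM {quoted}").fetchall()
        return [json.dumps(dict(row)) for row in rows]
    finally:
        conn.close()


# Made-up sample standing in for the contents of the .sql file.
script = """
CREATE TABLE users (id INTEGER, name TEXT);
INSERT INTO users VALUES (1, 'Ann');
INSERT INTO users VALUES (2, 'Bob');
"""
lines = sql_to_json_lines(script, "users")
```

Writing those lines to a file gives you something `mongoimport` can read. This keeps the relational one-table-per-collection shape, though; if you restructure the data (for example by embedding related rows in one document), that step is where you would do it.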
I'm a beginner with Talend, and I'm trying to load a database into an XML file, and that must be done automatically. I don't want to specify any schema for the XML file; everything must be generated, because I'll have to use that XML file in other jobs. Is that possible using Talend, and how can I do it?
Thank you for your answers.
This is not possible because of the way Talend is designed: every schema (db, xml, delimited files...) must be defined at compile time. It's not possible to detect it at runtime. You could try a user routine with some custom code, but that would amount to a complete Java-based solution outside Talend's scope (and a very inelegant and time-consuming one, in my opinion). If that's your case, you should probably redesign your process.
Hello, I was trying to learn DB2 SQL and I ran into some problems.
I want to bind a package, but I don't have any packages to bind.
So when I try to create a package, it obviously gives me an error. I know that a package is created when we create a database. But then why doesn't it list any packages when I run
db2 list packages
I have looked at a lot of links, but none helped. I would really appreciate it if someone actually explained it to me.
Thank you very much
In order to understand a package, you first need to understand dynamic and static queries.
Dynamic queries are created at execution time. Everything issued from PHP, Perl, Python, Ruby or Java (JDBC) is a dynamic query. For example, when using Java, you get a prepared statement, and you assign values (setXXX) to the parameter markers (?).
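To make the dynamic case concrete, here is a small sketch in Python. It uses the standard library's SQLite module purely as a stand-in, since no DB2 driver is assumed here, and the table and function names are made up. The point is only that the SQL text with its parameter markers reaches the database at run time, which is when the access plan has to be worked out.

```python
import sqlite3

# SQLite stands in for the database; the table is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT)")
conn.execute("INSERT INTO employee VALUES (?, ?)", (1, "Ann"))


def find_name(conn, emp_id):
    """Dynamic query: the statement text is sent at execution time and the
    value is bound to the parameter marker (?)."""
    row = conn.execute(
        "SELECT name FROM employee WHERE id = ?", (emp_id,)
    ).fetchone()
    return row[0] if row else None


name = find_name(conn, 1)
```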
However, in other programming languages, such as C, Java (SQLJ) or COBOL, you write the program with embedded SQL. For example, when using SQLJ, you write a class in a .sqlj file, and the queries are written in special clauses (not plain Java, but starting with #sql { }). You then run a precompilation, a process in which the SQL statements are taken out of the code and the source is translated into the host programming language (i.e. from SQLJ to Java). The SQL is then written to a file called a bind file. Once you have that, you compile the code (javac to create the .class) and bind the file to the database. This last step is where the packages are created.
A package is a set of data access plans. However, these plans are calculated at bind time, not at execution time as with dynamic queries. That is the difference between the two.
Finally, in order to create a package, you need to change the bind properties, and possibly the bind file itself.
I need to bulk load huge XML files into SQL Server 2005. I decided to use SQLXMLBULKLOAD in my C# app, but I need valid XSD schemas for those XML files in order to load them. What is the best way to generate the XSD file?
I tried MS VS xsd.exe, but it tries to load the whole file into memory, which causes an OutOfMemory exception.
Thanks!
Strip the file down to create a smaller one that is representative of the whole, then generate an XSD from that. You can then tailor the result if necessary.
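Stripping the file down can itself be done in streaming fashion, so the big file never has to fit in memory. A minimal sketch in Python (standard library only; the function name and the default of 100 elements are arbitrary choices of mine) that keeps the root element and its first few child elements:

```python
import io
import xml.etree.ElementTree as ET


def sample_xml(source, max_children=100):
    """Return a small XML document (as bytes) holding the root element and
    its first `max_children` child elements, read in streaming mode."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    sample_root = ET.Element(root.tag, dict(root.attrib))
    depth = 0  # nesting depth below the root
    for event, elem in context:
        if event == "start":
            depth += 1
            continue
        depth -= 1
        if depth == 0:  # a direct child of the root has just been completed
            sample_root.append(elem)
            if len(sample_root) >= max_children:
                break  # stop reading; the rest of the file is never parsed
    return ET.tostring(sample_root)


# In-memory demonstration; a real run would pass the large file's name.
big = io.BytesIO(b'<posts><row Id="1"/><row Id="2"/><row Id="3"/></posts>')
small = sample_xml(big, max_children=2)
```

Keep in mind that the first elements of a file are not necessarily representative: optional attributes or elements that only show up later will be missing from the generated XSD, which is where tailoring the result comes in.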
There are quite a few tools to generate schemas from instances, but I don't know how many of them are able to operate in pure streaming mode. One tool which will work regardless of the file size is the DTDGenerator that was originally part of Saxon; you can find it here:
http://saxon.sourceforge.net/dtdgen.html
It produces a DTD rather than a schema, but there are plenty of tools available to convert a DTD to a schema.