How to import data from eXist database to PostgreSQL database? - sql

Is there any extension/tool/script available to import data from eXist database to PostgreSQL database automatically?

From the tag description it's pretty clear that you're going to need to use an ETL tool or some custom code. Which is easier depends on the nature of the data and how you want to migrate it.
I'd start by looking at Talend Studio and Pentaho Kettle. See if either of them can meet your needs.
If you can turn the eXist data into structured CSV exports then you can probably just hand-define tables for it in PostgreSQL then COPY the data into it or use pgloader.
If not, then I'd suggest picking the language you're most familiar with (Python, Java, whatever) and using that language's eXist connector along with its PostgreSQL connector. Write a script that fetches data from eXist and feeds it to PostgreSQL. In Python I'd use the psycopg2 database adapter, as it's fast and supports COPY for bulk data loading.
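A rough Python sketch of that fetch-and-feed approach, assuming the XML comes from eXist's REST interface (e.g. fetched with Requests). The `record`/`id`/`name` element names and the `records` table with its columns are hypothetical placeholders — adjust them to your own documents and schema:

```python
import csv
import io
import xml.etree.ElementTree as ET

def records_to_rows(xml_text):
    """Flatten <record> elements from an eXist query result into tuples.
    The element names here are hypothetical -- match your own documents."""
    root = ET.fromstring(xml_text)
    return [(rec.findtext("id"), rec.findtext("name"))
            for rec in root.iter("record")]

def rows_to_csv(rows):
    """Serialize rows as CSV in memory so they can be streamed to COPY."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    buf.seek(0)
    return buf

def load(xml_text, dsn):
    """Bulk-load step; needs the non-stdlib psycopg2 package installed."""
    import psycopg2  # pip install psycopg2-binary
    conn = psycopg2.connect(dsn)
    with conn, conn.cursor() as cur:
        # COPY ... FROM STDIN is far faster than row-by-row INSERTs
        cur.copy_expert("COPY records (id, name) FROM STDIN WITH CSV",
                        rows_to_csv(records_to_rows(xml_text)))
```

The transform is kept separate from the load so you can unit-test the XML flattening without a live database.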

Related

Convert an online JSON set of files to a relational DB (SQL Server, MySQL, SQLITE)

I'm using a tool called Teamwork to manage my team's projects.
They have an online API that serves JSON files, accessible with authorisation:
https://developer.teamwork.com/projects/introduction/welcome-to-the-teamwork-projects-api
I would like to convert this online data to a SQL DB so I can create custom reports for my management.
I can't seem to find anything ready-made to do that.
I need a strategy to do this.
If you know how to program, this should be pretty straightforward.
In Python, for example, you could:
Come up with a SQL schema that maps to the JSON data objects you want to store. Create it in a database of your choice.
Use the Requests library to download the JSON resources, if you don't already have them on your system.
Convert each JSON resource to a Python data structure using json.loads.
Connect to your database server using the appropriate Python library for your database, e.g. PyMySQL.
Iterate over the Python data, inserting rows into the database as appropriate. This is essentially the JSON-to-tables mapping from step 1 made procedural.
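The steps above can be sketched as follows. Here sqlite3 stands in for a real MySQL server (swap the connection for PyMySQL in practice), and both the `projects` schema and the sample JSON are hypothetical — not the actual Teamwork API shape:

```python
import json
import sqlite3

# Step 1: a schema that maps to the JSON objects you want to store.
# (Field names are made up -- adjust to the resources you download.)
conn = sqlite3.connect(":memory:")  # swap for PyMySQL against a real server
conn.execute("CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT, status TEXT)")

# Steps 2-3: a literal stands in here for requests.get(...).text
raw = '{"projects": [{"id": 1, "name": "Website", "status": "active"}]}'
data = json.loads(raw)

# Steps 4-5: walk the parsed structure and insert rows
for p in data["projects"]:
    conn.execute("INSERT INTO projects (id, name, status) VALUES (?, ?, ?)",
                 (p["id"], p["name"], p["status"]))
conn.commit()
```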
If you are not looking to do this in code, you should be able to use an open-source ETL tool to do this transformation. At LinkedIn a coworker of mine used to use Talend Data Integration for solid ETL work of a very similar nature (JSON to SQL). He was very fond of it and I respected his opinion, so I figured I should mention it, although I have zero experience of it myself.

Transfer Data from Oracle database 11G to MongoDB

I want to have an automatic, timed transfer from an Oracle database to MongoDB. In a typical RDBMS scenario, I would establish a connection between the two databases by creating a dblink and transfer the data using PL/SQL procedures.
But I don't know what to do in the MongoDB case: what should I implement so that I can have an automatic transfer from the Oracle database to MongoDB?
I would look at using Oracle Goldengate. It has a MONGODB Handler.
https://docs.oracle.com/goldengate/bd123110/gg-bd/GADBD/using-mongodb-handler.htm#GADBD-GUID-084CCCD6-8D13-43C0-A6C4-4D2AC8B8FA86
https://oracledb101.wordpress.com/2016/07/29/using-goldengate-to-replicate-to-mongodb/
What type of data do you want to transfer from the Oracle database to MongoDB? If you just want to export/import a small number of tables on a set schedule, you could use something like UTL_FILE on the Oracle side to create a .csv export of the table(s) and use DBMS_SCHEDULER to schedule the export to happen automatically based on your desired time frame.
You could also use an application like SQL Developer to export tables as .csv files by browsing to the table in the schema list, then Right Click -> Export and choosing the .csv format. You may also find it a little easier to use UTL_FILE and DBMS_SCHEDULER through SQL Developer instead of relying on SQL*Plus.
Once you have your .csv file(s), you can use mongoimport to import the data, though I'm not sure if MongoDB supports scheduled jobs like Oracle (I work primarily with the latter.) If you are using Linux, you could use cron to schedule a script that will import the .csv file on a scheduled interval.

How to validate data in Hive HQL while import from Source

Please explain how to add validation while importing data from a source into a Hive table. For example, during a bulk load, if some of the data is corrupt and should not be imported, how can that data be discarded?
You need to develop an ETL process and have a strategy for discarding the corrupt data. You can either use 3rd-party tools like Informatica Big Data Edition, Talend, etc., or develop your own custom code. It is a major effort.
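As a minimal illustration of the custom-code route, a pre-load filter can separate good rows from corrupt ones before the Hive load step. The rules here (expected column count, numeric first field) are placeholder checks, not anything Hive-specific — substitute your own validation logic:

```python
def split_valid(rows, n_cols):
    """Basic validation pass: rows with the wrong column count or a
    non-numeric first field go to a reject pile instead of the load file.
    These two rules are placeholders for your real data-quality checks."""
    good, bad = [], []
    for row in rows:
        if len(row) == n_cols and row[0].isdigit():
            good.append(row)
        else:
            bad.append(row)
    return good, bad
```

In practice you would read the source file with csv.reader, write `good` to the staging file that Hive loads, and keep `bad` in a reject file for inspection.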

Which dashboard analytics will support Parse.com data source?

I've developed an app that uses Parse.com as the back end. I now need a dashboard analytics software package (such as iDashboards) that will enable me to pull data from my Parse.com database classes and present some of that data in a pretty dashboard fashion.
iDashboards looks to be the kind of tool I'm after, but it only supports certain data source inputs such as JDBC, ODBC, SQL, MySQL etc. Not being a database guru by any means, I'm not sure if Parse.com can be classed as any of the above, but from what I've read it doesn't come under any of these categories.
Can anybody recommend a way of either connecting Parse.com to iDashboard, or suggest another dashboard tool that will support Parse.com as a data source?
The main issue you are facing is that data coming out of Parse.com is going to be in json format. Most dashboards are going to prefer csv files.
The best dashboard I am aware of is Tableau and there is a discussion about getting json into Tableau here: http://community.tableau.com/ideas/1276
If your preference is using iDashboards then you need to convert the json coming out of Parse into a csv format that iDashboards can consume. You can do that using RJSON as mentioned in the post above but you'll probably have an easier time of it with a simple php or python script that periodically connects to Parse and pulls out data updates for you and then pushes it to your dashboard of choice.
Converting json to csv in php is addressed here: Converting JSON to CSV format using PHP
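For the Python route, a basic conversion needs only the standard library. The `fields` argument picks which scalar keys become CSV columns; the sample field names in the usage below are made up, not real Parse classes:

```python
import csv
import io
import json

def json_to_csv(json_text, fields):
    """Flatten a JSON array of objects into CSV with the given columns.
    Keys not listed in `fields` (including nested values) are dropped,
    so pick the scalar fields your dashboard needs."""
    records = json.loads(json_text)
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()
```

This only works cleanly when each object maps to one row; as the next answer explains, heavily nested JSON has no single obvious translation to tables.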
The difference is much more fundamental than "unsupported file format". In fact, JSON data coming out of Parse is stored in a so-called denormalized form, which means that a single JSON data file may contain the equivalent of arbitrarily many tables in a relational database. Stated differently, one JSON file may translate into potentially many CSV files, and there's no unique choice of how to perform that translation.
This is a so-called ETL problem, where ETL stands for Extract-Transform-Load. As such, you may be interested in open source ETL tools such as Kettle. Kettle is supported by Pentaho and includes functionality that can help you develop a workflow to turn JSON data into multiple CSV files that can then be imported into iDashboards (or similar). Aside from Kettle, Talend is also widely used for this purpose and has the same ability.
Finally, note that Parse is powered by MongoDB, and exports JSON data that is easily stored and manipulated in MongoDB. As such, a natural fit for reporting on Parse data is any reporting tool built for MongoDB.
As of the time of this writing, there are two such options:
JSON Studio, which is a commercial solution that is built explicitly for MongoDB and has your stated capability to produce dashboards.
SlamData, which is an open source solution, also built for MongoDB, which allows native SQL on the database. The current version does not have reporting capabilities (just CSV export), but the 2.09 version due out in June has reporting dashboards baked in.
An advantage of using a MongoDB reporting tool is that you will not have to wrangle your data into relational form. If it's heavily nested, using arrays, and so forth, it can be quite painful to develop an ETL workflow and keep it in sync with how the data is changing. Instead, all you have to do is build a script to pipe the raw data from Parse into a MongoDB instance (perhaps hosted by MongoLab or equivalent, if you don't want to host it yourself), and connect the MongoDB reporting tool on top.
You might also contact Parse and see if they have a recommended solution for this. It occurs to me they should probably bake some sort of analytical / reporting functionality into their APIs as this is such a common use case.
You can use Axibase Time-Series Database to ingest your data from parse.com; it has built-in dashboards and widgets for visualization, or you can just export data from ATSD to CSV and use iDashboards.

How to access a database when access is restricted to a particular place

There is a student database at Some College. Some Organization wants to access it from their headquarters.
But access is restricted to within the college only.
Is it possible to extract the data?
What SQL queries and functions would be needed, and how?
In network programming I could do this by connecting via TCP or UDP and extracting the information, but is that possible if the database is large?
How can we do it using SQL functions?
One thing you can do is dump the data and reimport it into your own database. It depends on how much data you require. At work I have similar problems and sometimes have to do the same.
If your admin dumps the data for you, it is easier. You can also export it with SQL commands, but how depends on which database you are using. Once you dump it to CSV format, you can easily import it into a SQLite database (or others like MySQL etc.), if you don't have a local version of your own database.
An alternative is to export the data yourself into a CSV. How to do this depends on the DB you use, which you didn't mention. Under Oracle you can use the SET and SPOOL commands to achieve this.