I have a .txt file with rows of the following format
SI1334596|MRKU3|High Cube|1|EGST|First Line|Vehicle one|25|13|
How do I form a database from these .txt entries so that I can run SQL queries on it? I also want to assign a name to each of the columns. I have little to no knowledge of importing text-file entries into a database. I am looking for software that can be installed on my Windows computer, import the .txt file, convert it into a database, and let me run queries afterwards.
If you are asking for recommendations on specific tools, your question is off-topic for StackOverflow.com; see the Software Recommendations Stack Exchange.
Here are some possible approaches, with and without programming.
Database Import
Databases often have a built-in command or facility for importing data straight from a text file. When importing text directly with little or no processing, the import is often very fast.
For example, Postgres has the COPY command for imports. This command includes a DELIMITER parameter where you can tell it to expect the vertical bar | as the separator between fields.
You would define your table structure ahead of time, before the import, defining a name and data type for each expected column/field.
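For example, here is a minimal sketch for Postgres; the table name, column names, and data types are all guesses based on the sample row, so adjust them to your real data:

    -- Hypothetical table; every name and type here is a guess.
    CREATE TABLE shipment (
        booking_ref    text,
        container_code text,
        container_type text,
        quantity       integer,
        port_code      text,
        line_name      text,
        vehicle_name   text,
        length_ft      integer,
        unit_count     integer,
        trailing_blank text  -- absorbs the empty field created by the trailing |
    );

    -- Server-side import of the pipe-delimited file
    -- (use \copy in psql instead if the file lives on the client machine).
    COPY shipment FROM '/path/to/data.txt' WITH (FORMAT text, DELIMITER '|');

After loading, you can query the table like any other, e.g. SELECT * FROM shipment WHERE port_code = 'EGST';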
Custom App
You can write an app to read the text file, process the incoming data, and feed the prepared data to the database. For example, write a Java app that reads the text file, uses JDBC to connect to the server, and sends SQL written as text strings to tell the database server what to do.
You can do this row by row. Or, for increased speed, you can write a batch statement telling the database server to create multiple rows at the same time.
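For example, a single multi-row INSERT creates several rows in one round trip. This sketch reuses the hypothetical shipment table from the import example above; the second row of values is a placeholder:

    -- One statement, several rows; values beyond the sample row are placeholders.
    INSERT INTO shipment (booking_ref, container_code, container_type, quantity,
                          port_code, line_name, vehicle_name, length_ft, unit_count)
    VALUES
        ('SI1334596', 'MRKU3', 'High Cube', 1, 'EGST', 'First Line', 'Vehicle one', 25, 13),
        ('SI1334597', 'MRKU4', 'High Cube', 2, 'EGST', 'First Line', 'Vehicle two', 20, 8);

If you are going through JDBC, the same effect can be achieved with PreparedStatement.addBatch() and executeBatch().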
This is the way to go if the data requires complicated processing or there are other related chores such as keeping a history of many such imports, logging other information, reporting duplicate data, and so on.
For Java, the Apache Commons CSV library helps with reading/writing plain text files.
Spreadsheet
Many spreadsheets, such as LibreOffice Calc, can parse the data, take the column headers as titles, and populate a spreadsheet. You can do queries within the spreadsheet. This works well for smaller amounts of data that can comfortably reside in memory; you may not need a database at all.
Database Tool
SQL database engines such as Postgres, H2, SQLite, and MySQL/MariaDB are just black-box engines, not full-blown interactive data tools. You can obtain such tools that connect to these engines. These tools can import/export text files, display lists of data, let you enter/modify data, create forms for better access to the data, and generate reports.
But some such data tools have a database engine built in. Examples include:
FileMaker
4D
LibreOffice Base
Related
I have been charged with determining the requirements to migrate data from applications running on OpenVMS on DEC Alpha. I have no knowledge of OpenVMS or PowerHouse; however, I have plenty of experience with Linux. I am able to connect to the server via SSH.
My question is: are there any standard tools included with OpenVMS that I can use to help me identify the database back end and get an idea of how many tables, rows of data, etc. there are?
What is the goal? Move the (structured) data over once and for all?
Move application functionality over?
Move ongoing changes over?
You'll have to dig into the system to figure out what it does and what it is built upon. Is there no design guide, operations playbook, or backup procedure?
Most likely it is based on RMS (indexed) files. The data files would be named .IDX, .INX, or .DAT or some such, and there would be many files, one per 'table/object'. The procedures would talk about BACKUP and CONVERT.
There would be a PowerHouse Dictionary from which metadata can be extracted with "qshow generate file" into .ph files.
You may want to look at Attunity (I work there), Connx or Easysoft to use those definitions to provide ODBC or JDBC access to the data from the outside.
Attunity has tools to bulk unload into any target DB with 'one click' once the data definitions are in place, but it is likely too costly for one-time use.
Still, if the alternative is two months of consulting/coding then a tool may be attractive.
If it is based on RDB, then you would see a few .RDB files, .RBR and .AIJ files.
There would be .SQL script morsels, and operations would go through "RMU".
Like any other database, it would include metadata, and it has native options for remote ODBC or (Oracle) OCI access.
Hope this helps some,
Hein.
I've been tasked with creating a search function for a website's knowledge base (which is stored in a GitHub repo). I'm only really familiar with building databases with Django, so I'm having trouble understanding how I'm supposed to upload a bunch of HTML files to the database and query them with Postgres. Any pointers on how the database can be structured? I've heard that HTML files can be stored in a text field, but how are the columns structured, does each page get its own row, etc.? And how can I do this with a fairly large knowledge base without having to manually upload each file?
The DB hosting platform I am using has a migration utility that says:
Uploading will accept data in any of three forms, plain text (SQL), tar archives (uncompressed), or PostgreSQL's own compressed 'custom' format.
That's assuming the database is already structured.
I've heard that HTML files can be stored in a text field, but how are the columns structured, does each page get its own row, etc.?
Storing HTML in a text column is perfectly acceptable. If you store the HTML in a column, then each new page gets its own row.
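A minimal sketch of such a table in Postgres, with assumed table and column names, might look like this:

    -- Hypothetical schema: one row per knowledge-base page, HTML kept in a text column.
    CREATE TABLE kb_page (
        id        serial PRIMARY KEY,
        path      text UNIQUE NOT NULL,  -- file path within the repo
        title     text,
        body_html text NOT NULL
    );

    -- Optional: a full-text index over the body to support the search function.
    CREATE INDEX kb_page_fts ON kb_page USING gin (to_tsvector('english', body_html));

In practice you may want to strip the HTML tags (or store a plain-text copy in another column) before feeding the content to to_tsvector.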
and how can I do this with a fairly large knowledge base without having to manually upload each file?
You just said the hosting provider accepts "PostgreSQL's own compressed 'custom' format". So install PostgreSQL locally and get it all up and working. Write a small script that loops over the files in the repo and inserts every page locally, so nothing has to be uploaded by hand. Then you can upload to the hosting provider using pg_dump --format=c, which gives you a single, compressed dump file.
I won't have access to SSIS until tomorrow so I thought I'd ask for advice before I start work on this project.
We currently use Access to store our data. It's not stored in a relational format, so it's an awful mess. We want to move to a centralized database (SQL Server 2008 R2), which would require rewriting much of our codebase (which, incidentally, is also an awful mess). Due to a time constraint, well before that can be done we will need to set up a centralized database solely for on-demand report generation for a client. So our applications will still be running on Access. Instead of:
Receive data -> Import to Access initial file with one table -> Data processing -> Access result file with one table -> Report generation
The goal is:
Receive data -> Import to Access initial file with one table -> Import initial data to multiple tables in SQL Server -> Export Access working file with one table -> Data processing -> Access result file -> Import result to multiple tables in SQL Server -> Report generation whenever
We're going to use SSRS for the reporting component, which seems straightforward enough. I'm not sure whether SSIS alone would work well for splitting the Access data into numerous tables, whether everything should be imported into a staging table with SSIS and then split up with stored procedures, or whether I'll need to write a standalone application for this.
Haven't done much of any work with SQL Server before, so any advice is appreciated.
In an SSIS package, you can write code (e.g. C#) to do your own custom data transformations. However, SSIS comes with built-in transformations that may be good enough for your needs. SSIS is very powerful and flexible; you can do pretty much anything you want with the data in it.
The high-level workflow for your task could look like the following:
1. Connect to the data source and pull the data
2. Transform the data
3. Output data to the destination data source
You certainly can split a data flow into two separate branches and send it to two destinations. All you need to do is put a Multicast in the data flow, and the bulk of the transformations will happen after that.
From what you've said, however, a better solution might be to use the Access tables as a staging database and then grab the data from there and send it to SQL Server. That would mean two data flows, but it would be a cleaner implementation.
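If you do go the staging-table route, the stored procedures that split the data out are mostly plain INSERT ... SELECT statements. A rough sketch, where every table and column name is made up:

    -- Hypothetical split of one wide staging table into two normalized tables.
    INSERT INTO dbo.Customer (CustomerName, Region)
    SELECT DISTINCT CustomerName, Region
    FROM dbo.StagingImport;

    INSERT INTO dbo.CustomerOrder (CustomerID, OrderDate, Amount)
    SELECT c.CustomerID, s.OrderDate, s.Amount
    FROM dbo.StagingImport AS s
    JOIN dbo.Customer AS c ON c.CustomerName = s.CustomerName;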
I'm looking for an IDE or "visual editor" for some basic table manipulation.
I have a few tables, ~100K entries each. Most of them share two columns that together compose a UNIQUE PRIMARY KEY. These tables are static (they are just old record data), so no "online" or code interface is needed.
To be honest, I only wish Excel or something like it could handle so many rows, since I want to perform simple tasks (e.g. erase a column, sort by column). What tool, in your experience, is the most "Excel-like" for static tables?
Try Microsoft Access.
You can import or link to external data sources, and Access has lots of tools available for working with the data.
Using this method you will be able to do the following:
View the data, with filtering and sorting.
Write custom queries against the data (using a visual designer or SQL).
Add, edit, and delete data (provided you have edit/delete privileges on the data source).
Write reports using the linked data.
Also, tables in Access 2007 and later look very much like Excel spreadsheets, and since Access is part of the Microsoft Office suite, there are plenty of tools for moving data between Access and Excel.
phpMyAdmin, in my opinion, can serve as a good visual editor for what you need (though it is 'online' and needs a running web server).
DBDesignerFork is open source, free, and can reverse engineer your database to build the model.
You can then switch into Query Mode and it will help you build queries from the table diagram.
You mention that you want to be able to "erase a column, sort by column"; these are two very different things. The sort is easily handled with an ORDER BY clause in your SELECT statements. Dropping columns can also be done in SQL using the ALTER TABLE command, but remember there is no easy "undo" unless you start wrapping your changes in transactions.
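For example (the table and column names below are made up):

    -- Sort by a column when reading the data.
    SELECT * FROM old_records ORDER BY record_date;

    -- Permanently remove a column; remember there is no easy undo.
    ALTER TABLE old_records DROP COLUMN obsolete_notes;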
To sum up, you should forget the Excel comparison and learn SQL :)
I have an application which stores short descriptive data in a DB and lots of related textual data in text files.
I would like to add an "advanced search" for the DB. I was thinking about adding my own query language, the way JIRA does (JIRA Query Language). Then I thought about having full-text search across those text files (lower priority).
I am wondering which tool would suit me better to implement this faster and more simply.
Most of all, I want to provide users with the ability to write their own queries instead of using UI elements to specify search filters.
Thanks
UPD: I save dates in the DB, and most of the varchar fields contain one-word strings.
UPD2: Apache Derby is used right now.
Take a look at the Searchable plugin for Grails.
http://www.grails.org/plugin/searchable