How to load multi-dimensional data from Excel into a relational database - sql

I have an Excel sheet that shows the height-weight combinations acceptable for each underwriting class (insurance). I would like to load it into a relational database table with the following columns: underwriting_class, height and weight. Is there a way to do the transformation through SQL?

You can use SSIS (Integration Services) for this, either as a one-time load or by saving the package to run as often as the data in your spreadsheet updates.
You can manage the headers and many other column properties within the package, so you would be able to create your new table(s) based on the pages in the workbook.
Here is a good primer on SSIS 2005
How to import an Excel file into SQL Server 2005 using Integration Services

Assuming underwriting_class is in column A, height is in B and weight is in C, create column D with a formula like this:
="INSERT INTO some_table(underwriting_class, height, weight) VALUES ('"&A:A&"', "&B:B&", "&C:C&",);"
Fill that down for all the data you have
Copy column D and paste into Notepad
Save as a .sql file
Create the table in your database
Use whatever database client you use to load your .sql file into the database
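If the source sheet is a matrix rather than a flat list (say, heights down the side and one column per underwriting class), another option is to stage it as-is and do the transformation in SQL. A minimal T-SQL sketch using UNPIVOT, where every table and column name here is hypothetical:

-- Stage the matrix as-is: one row per height, one column per class
CREATE TABLE staging_matrix (
    height INT,
    preferred_weight INT,
    standard_weight INT
);

-- UNPIVOT turns the class columns into (underwriting_class, weight) rows
INSERT INTO some_table (underwriting_class, height, weight)
SELECT underwriting_class, height, weight
FROM staging_matrix
UNPIVOT (
    weight FOR underwriting_class IN (preferred_weight, standard_weight)
) AS u;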

Related

How to insert values without scientific notation in SSIS package creation?

I am developing a package to import data from Excel into a SQL DB using SSIS.
One of the Excel sheets contains the following values; no format is applied, the cells are General:
0.0000316
0.0000088
0.0000022
0.000001
When I insert the above values into the DB (the column is a float data type), this is what is stored:
3.16E-05
8.8E-06
2.2E-06
1E-06
How can I insert them without the E notation? I need the values stored exactly as they appear in the Excel sheet. Is that possible?
The data in the database is exactly the same as the data in Excel. The only difference is the way in which it's shown. I don't know what you are using that data for, but I can assure you that you'll get the same results from calculations as you would in Excel (Excel also uses floating-point numbers).
If you need, for some reason, to see the query results in a particular format, use the SQL Server FORMAT function.
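For example, a fixed-point format string renders the float without the exponent (the table and column names here are hypothetical):

-- FORMAT returns a string in fixed-point notation with 7 decimal places
SELECT FORMAT(my_float_col, 'F7') AS fixed_point_value
FROM my_table;
-- e.g. FORMAT(CAST(0.0000316 AS float), 'F7') yields '0.0000316'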

Import Excel Data into PostgreSQL 9.3

I've developed a huge table in Excel and am now facing problems transferring it into a PostgreSQL database. I've downloaded the ODBC software and I'm able to open a table created in PostgreSQL from Excel. However, I'm not able to do it the other way around, i.e. create a table in Excel and open it in PostgreSQL. So I would like to know whether it can be done this way, or whether there are alternative ways to create a large table with pgAdmin III, because inserting the data row by row is quite tedious.
The typical answer is this:
In Excel, File/Save As, select CSV, save your current sheet.
Transfer the file to a holding directory on the Pg server that the postgres user can access.
in PostgreSQL:
COPY mytable FROM '/path/to/csv/file' WITH CSV HEADER; -- must be superuser
But there are other ways to do this too. PostgreSQL is an amazingly programmable database. These include:
Write a module in pl/javaU, pl/perlU, or another untrusted language to access the file, parse it, and manage the structure.
Use CSV and file_fdw to access it as a pseudo-table (see the sketch after this list).
Use DBILink and DBD::Excel
Write your own foreign data wrapper for reading Excel files.
The possibilities are literally endless....
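For instance, the file_fdw route mentioned above takes only a few statements once the sheet is saved as CSV. A minimal sketch, with hypothetical table, column, and path names:

CREATE EXTENSION file_fdw;
CREATE SERVER csv_files FOREIGN DATA WRAPPER file_fdw;

-- Expose the exported CSV as a read-only pseudo-table
CREATE FOREIGN TABLE excel_data (
    col_a text,
    col_b integer,
    col_c numeric
) SERVER csv_files
OPTIONS (filename '/path/to/csv/file', format 'csv', header 'true');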
For Python you could use openpyxl for Excel 2007 and newer file formats (xlsx).
Al Sweigart has a full tutorial on working with Excel spreadsheets in Automate the Boring Stuff with Python; it's very in-depth, and the whole book and accompanying Udemy course are great resources.
From his example
>>> import openpyxl
>>> wb = openpyxl.load_workbook('example.xlsx')
>>> wb.get_sheet_names()
['Sheet1', 'Sheet2', 'Sheet3']
>>> sheet = wb.get_sheet_by_name('Sheet3')
>>> sheet
<Worksheet "Sheet3">
Understandably, once you have this access you can use psycopg2 to push the data to Postgres as you normally would.
This is a link to a list of Python resources at python-excel; xlwings also provides a large array of features for using Python in place of VBA in Excel.
You can also use the psql console to execute \copy without needing to send the file to the PostgreSQL server machine. The command is the same:
\copy mytable [ ( column_list ) ] FROM '/path/to/csv/file' WITH CSV HEADER
A method that I use is to load the table into R as a data.frame, then use dbWriteTable to push it to PostgreSQL. These two steps are shown below.
Load Excel data into R
R's data.frame objects are database-like, where named columns have explicit types, such as text or numbers. There are several ways to get a spreadsheet into R, such as XLConnect. However, a really simple method is to select the range of the Excel table (including the header), copy it (i.e. CTRL+C), then in R use this command to get it from the clipboard:
d <- read.table("clipboard", header=TRUE, sep="\t", quote="\"", na.strings="", as.is=TRUE)
If you have RStudio, you can easily view the d object to make sure it is as expected.
Push it to PostgreSQL
Ensure you have RPostgreSQL installed from CRAN, then make a connection and send the data.frame to the database:
library(RPostgreSQL)
conn <- dbConnect(PostgreSQL(), dbname="mydb")
dbWriteTable(conn, "some_table_name", d)
Now some_table_name should appear in the database.
Some common clean-up steps can be done from pgAdmin or psql:
ALTER TABLE some_table_name RENAME "row.names" TO id;
ALTER TABLE some_table_name ALTER COLUMN id TYPE integer USING id::integer;
ALTER TABLE some_table_name ADD PRIMARY KEY (id);
As explained here http://www.postgresonline.com/journal/categories/journal/archives/339-OGR-foreign-data-wrapper-on-Windows-first-taste.html
With the ogr_fdw module, it's possible to open the Excel sheet as a foreign table in pgsql and query it directly like any other regular table in pgsql.
This is useful for reading data from the same, regularly updated table.
To do this, the table header in your spreadsheet must be clean; the current ogr_fdw driver can't deal with wide-width characters, new lines, etc. With these characters, you will probably not be able to reference the column in pgsql due to encoding issues. (This is the major reason I can't use this wonderful extension.)
The ogr_fdw pre-built binaries for Windows are located here: http://winnie.postgis.net/download/windows/pg96/buildbot/extras/
Change the version number in the link to download the corresponding build.
Extract the file into the pgsql folder, overwriting the sub-folders of the same name.
Restart pgsql. Before the test drive, the module needs to be installed by executing:
CREATE EXTENSION ogr_fdw;
Usage in brief:
Use ogr_fdw_info.exe to probe the Excel file for the sheet name list:
ogr_fdw_info -s "C:/excel.xlsx"
Use ogr_fdw_info.exe with -l to probe an individual sheet and generate a table definition:
ogr_fdw_info -s "C:/excel.xlsx" -l "sheetname"
Execute the generated definition code in pgsql; a foreign table is created and mapped to your Excel file, and it can be queried like a regular table.
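The generated definition looks roughly like this sketch; the server name, columns, and path here are hypothetical, and ogr_fdw_info emits the real definition for you:

CREATE SERVER xlsx_server
    FOREIGN DATA WRAPPER ogr_fdw
    OPTIONS (datasource 'C:/excel.xlsx', format 'XLSX');

CREATE FOREIGN TABLE sheetname (
    fid bigint,
    col_a text,
    col_b numeric
) SERVER xlsx_server
OPTIONS (layer 'sheetname');

-- Query it like any regular table
SELECT * FROM sheetname LIMIT 10;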
This is especially useful if you have many small files with the same table structure: just changing the path and name in the definition and re-running it is enough.
This plugin supports both XLSX and XLS files.
According to the documentation, it is also possible to write data back into the spreadsheet file, but all the fancy formatting in your Excel file will be lost; the file is recreated on write.
If the Excel file is huge, this will not work, which is another reason I didn't use this extension: it loads the data all at once.
But this extension also supports an ODBC interface, so it should be possible to use Windows' ODBC Excel file driver to create an ODBC source for the Excel file, and use ogr_fdw or any other pgsql ODBC foreign data wrapper to query this intermediate ODBC source. This should be fairly stable.
The downside is that you can't change the file location or name easily within pgsql, as you can in the previous approach.
A friendly reminder: the usual permission issues apply to this fdw extension. Since it's loaded into the pgsql service, pgsql must have access privileges to the Excel files.
It is possible using ogr2ogr:
C:\Program Files\PostgreSQL\12\bin\ogr2ogr.exe -f "PostgreSQL" PG:"host=someip user=someuser dbname=somedb password=somepw" C:/folder/excelfile.xlsx -nln newtablenameinpostgres -oo AUTODETECT_TYPE=YES
(Not sure if ogr2ogr is included in postgres installation or if I got it with postgis extension.)
I have used Excel/PowerPivot to create the PostgreSQL INSERT statements. Seems like overkill, except when you need to do it over and over again. Once the data is in the PowerPivot window, I add successive columns with CONCATENATE statements to 'build' the INSERT statement. I create a flattened pivot table with that last and final column, then copy and paste the resulting INSERT statements into my EXISTING PostgreSQL table with pgAdmin.
Example: a two-column table (my real table has 30 columns, from which I import successive contents over and over with the same Excel/PowerPivot setup).
Column1 {a,b,...} Column2 {1,2,...}
In PowerPivot I add calculated columns with the following commands:
Calculated Column 1 has "insert into table_name values ('"
Calculated Column 2 has CONCATENATE([Calculated Column 1],CONCATENATE([Column1],"','"))
...until you get to the last column and you need to terminate the insert statement:
Calculated Column 3 has CONCATENATE([Calculated Column 2],CONCATENATE([Column2],"');"))
Then in PowerPivot I add a flattened pivot table and have all of the insert statements, which I just copy and paste into pgAdmin.
Resulting insert statements:
insert into table_name values ('a','1');
insert into table_name values ('b','2');
insert into table_name values ('c','3');
NOTE: If you are familiar with the PowerPivot CONCATENATE function, you know that it can only handle 2 arguments (nuts), hence all the nesting above; the DAX & operator can be chained instead to avoid it.
You can load the Excel file content by writing Java code using the Apache POI library (https://poi.apache.org/). The library is developed for working with MS Office application data, including Excel.
I have recently created an application based on this technology that will help you load Excel files into a Postgres database. The application is available at http://www.abespalov.com/. The application is tested only on Windows, but should work on Linux as well.
The application automatically creates the necessary tables with the same columns as in the Excel files and populates them with the content. You can export several files in parallel, and you can skip the step of converting the files into CSV format. The application handles both the xls and xlsx formats.
The overall application stages are:
Load the Excel file content. Here is the code, depending on the file extension:
fileExtension = FilenameUtils.getExtension(inputSheetFile.getName());
if (fileExtension.equalsIgnoreCase("xlsx")) {
    // .xlsx files are OOXML (OPC) packages
    workbook = createWorkbook(openOPCPackage(inputSheetFile));
} else {
    // .xls files use the legacy NPOIFS/OLE2 container
    workbook = createWorkbook(openNPOIFSFileSystemPackage(inputSheetFile));
}
sheet = workbook.getSheetAt(0);
Establish Postgres JDBC connection
Create a Postgres table
Iterate over the sheet and insert the rows into the table. Here is a piece of Java code:
Iterator<Row> rowIterator = InitInputFilesImpl.sheet.rowIterator();
// skip the header row
if (rowIterator.hasNext()) {
    rowIterator.next();
}
while (rowIterator.hasNext()) {
    Row row = rowIterator.next();
    // insert the row into the table here
}
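The SQL that such code executes for the last two stages is ordinary DDL plus a parameterized insert; a minimal sketch with hypothetical table and column names:

-- Create a Postgres table matching the sheet's columns
CREATE TABLE excel_import (
    col_a text,
    col_b numeric,
    col_c timestamp
);

-- Executed once per sheet row via a JDBC PreparedStatement,
-- with the ? placeholders bound from the cell values
INSERT INTO excel_import (col_a, col_b, col_c) VALUES (?, ?, ?);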
Here you can find all the Java code for the application for exporting Excel to Postgres (https://github.com/palych-piter/Excel2DB).
The simplest answer is to use the psql command; it's free and is included with PostgreSQL:
psql -U postgres -p 5432 -f sql-command-file.sql
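The command file itself is just plain SQL plus psql meta-commands. A minimal sketch of what sql-command-file.sql might contain (the table, columns, and path are hypothetical):

-- sql-command-file.sql
CREATE TABLE excel_data (
    col_a text,
    col_b integer
);

-- \copy runs client-side, so the CSV only needs to be readable by psql
\copy excel_data FROM '/path/to/excel_export.csv' WITH CSV HEADER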
I recently discovered https://sqlizer.io; it creates INSERT statements from an Excel file and supports MySQL and PostgreSQL. Not sure whether it supports large files, though.
You can do this easily with DataGrip.
First save your Excel file in CSV format (open the Excel file, then Save As in CSV format).
Go to DataGrip, then create the table structure according to the CSV file. It's suggested to give the columns the same names as the Excel columns.
Right-click on the table name in the list of tables of your database, then click "Import Data from File" and select the converted CSV file.

Excel database values to SQL Server and different tables

I got an Excel file with data which I want to import into a database in Microsoft SQL Server 2008 Express edition.
I do this with right click on a database -> task -> import data.
After this, my data from Excel is loaded in the database in one table.
But I want to separate the columns from the Excel file into different tables.
So instead of loading all Excel data into one database and table, I want to load the Excel data into one database, but in different tables.
For example: Save column 1,2,3 from Excel in table A, and save column 4,5,6 from Excel in table B.
Anyone that knows how to do this?
I followed the suggestion of @bendataclear:
"Personally I would just import into a temporary table then write INSERT INTO X SELECT queries to move in the correct columns."

BCP utility to create a format file, to import Excel data to SQL Server 2008 for BULK insertion

I am trying to import Excel 2003 data into a SQL table in SQL Server 2008.
I tried to add a linked server, but have met with little success.
Now I am trying to check whether there's a way to use the BCP utility to do a BULK INSERT or a BULK operation with OPENROWSET, using a format file to get the Excel mapping.
First of all, how can I create a format file for a table that has differently named columns than the Excel spreadsheet columns?
Next, how do I use this format file to import data from, say, a file at C:\Folder1\Excel1.xls into the table Table1?
Thank you.
There are some examples here that demonstrate what the data file should look like (CSV) and what the format file should look like. Unless you need to do this a lot, I'd just hand-craft the format file, save the Excel data to CSV, then try using bcp or OPENROWSET.
The format file specifies the column names for the destination. The data file doesn't have column headings, so you don't need to worry about the Excel (source) columns being different (a sketch follows below).
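Once you have the CSV and the format file, the OPENROWSET route is a single statement. A hedged sketch, where the paths, file names, and columns are hypothetical:

-- Bulk-read the CSV through the format file and insert into Table1;
-- the format file maps the file's fields onto Table1's column names
INSERT INTO Table1 (col_a, col_b)
SELECT t.col_a, t.col_b
FROM OPENROWSET(
    BULK 'C:\Folder1\Excel1.csv',
    FORMATFILE = 'C:\Folder1\Excel1.fmt'
) AS t;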
If you need to do more mapping etc, then create an SSIS package. You can use the data import wizard to get you started, then save as SSIS package, then edit to your heart's content.
If it's a one-off, I'd use the SQL data import wizard, from right-click on the database in Management Studio. If you just have a few rows to import from Excel, I typically open a query to Edit Top 200 Rows, edit the query to match the columns I have in Excel, then copy and paste the rows from Excel into SQL Management Studio. It doesn't handle errors very well, but it's quick.

SQL SSIS Help. Import an excel sheet into a temp table

I have a fairly simple task of taking an Excel sheet and importing it into a SQL 2005 database table. I need to create an SSIS task for this. The Excel sheet does not have all the columns I need to make the insert directly into the permanent SQL table, but I know how I could link out to other tables and get the missing columns. So I was wondering how I could import the Excel sheet into a #tempTable (or @variableTable), and then once in a temp table I could just write my SQL insert code (using the temp table as well as the other tables that I will join on) in a basic Execute SQL Task. But I am having trouble figuring out how to do this with SSIS. When I drag my Excel source and try to link it to a SQL Server destination, the drop-down doesn't have an option for temp tables.
The SSIS way of doing this would be to use a Merge or Lookup transform. I don't think you can put things into a temp table like that, but you could have an Execute SQL Task that creates an actual table, which you can then drop at the end of the package. Your package can then use that table (a sketch follows below).
At design time you might need to have the table in place to link things up, but it shouldn't need to be there when you actually run the package.
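A minimal sketch of the two Execute SQL Tasks bracketing the data flow, with hypothetical table and column names:

-- Execute SQL Task #1 (before the data flow): create a real staging table
CREATE TABLE dbo.ExcelStaging (
    key_col int,
    value_col nvarchar(100)
);

-- (the data flow loads the Excel sheet into dbo.ExcelStaging here)

-- Execute SQL Task #2 (after the data flow): join in the missing
-- columns, insert into the permanent table, then clean up
INSERT INTO dbo.PermanentTable (key_col, value_col, extra_col)
SELECT s.key_col, s.value_col, o.extra_col
FROM dbo.ExcelStaging AS s
JOIN dbo.OtherTable AS o ON o.key_col = s.key_col;

DROP TABLE dbo.ExcelStaging;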
First, you'll need to create the staging table for the Excel worksheet. Open SSMS, right-click the database, choose Tasks, then Import Data. Set the import source as Excel and browse to the file. Set the destination as SQL Server. You can accept the table name or name it as you wish; I suggest naming it something useful. Depending on your understanding of data types and what is in the Excel sheet, it may take you a while to get this right. Eventually, you will have a table that will accept the contents of the Excel sheet.
Second, create your SSIS package using an Excel source and a SQL Server or OLE DB destination.
1. Take an Execute SQL Task in the control flow to create the target staging table for the Excel sheet source.
2. In the data flow, use an Excel source and an OLE DB destination to load the staging table created in the first step.
3. In the control flow, use a merge or join statement to combine your Excel staging table with the other source tables into the final target table (a sketch follows below).
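For step 3, a hedged MERGE sketch with hypothetical table and column names:

-- Upsert from the staging table into the final target table
MERGE dbo.FinalTarget AS t
USING dbo.ExcelStaging AS s
    ON t.key_col = s.key_col
WHEN MATCHED THEN
    UPDATE SET t.value_col = s.value_col
WHEN NOT MATCHED THEN
    INSERT (key_col, value_col)
    VALUES (s.key_col, s.value_col);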