Access 2012 importing a 500MB text file results in an out of disk space error - vba

I have a 500 MB .txt file. It is not a CSV file (it has no delimiters other than spaces), so I can't import it directly with T-SQL. Currently I am trying to use Access's import specifications. I figured out how to call a specification from code and polished the routine until I could import a small test file (about 200 KB). But now I have the actual file I need to import, and it is 500 MB. When I run my code it gets to around 50% and then throws the error "Your computer is out of disk space. You won't be able to undo this paste append. Do you want to continue anyway?"
I am inserting into a linked SQL table.
What can I do to get rid of this error and what exactly is causing it (I have plenty of disk space and memory capacity)?

You can bulk insert space-delimited files in T-SQL like so:
BULK INSERT yourTable
FROM 'C:/<filepath>/yourTextFile.txt'
WITH
(
    -- Space delimited
    FIELDTERMINATOR = ' ',
    -- New rows start at a new line
    ROWTERMINATOR = '\n'
    -- Use this if you have a header row with your column names.
    --,FIRSTROW = 2
)

Are you sure you have disk space? It does not just make up those messages.
Is the disk space actually allocated to the data and log files?
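If the target is the linked SQL Server table, it is worth checking how much space the data and log files actually have in the target database. A minimal sketch (run in that database; sizes are stored in 8 KB pages, hence the arithmetic):
-- Report allocated size and free space inside each database file.
SELECT name,
       type_desc,
       size * 8 / 1024 AS allocated_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) * 8 / 1024 AS free_in_file_mb,
       max_size,
       growth
FROM sys.database_files;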

Well, I managed to solve it using T-SQL. The problem was that the columns in the file were not separated by any delimiter, so I created an import specification in Access (the one where you pick where each column starts and ends). Then I used that specification as the reference for writing a T-SQL procedure, which bulk inserts the raw lines and then pulls each "column" out of the text with SUBSTRING (the Access specification gives you the character ranges to use in SUBSTRING).
It now works without any problems. It takes about 20 minutes to import the 500 MB file, but a job runs it at night, so that's not a problem.
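For anyone interested, here is a rough sketch of that approach. The table names, file path, and column positions are placeholders; the real start positions and lengths come from the Access import specification:
-- Staging table that holds one raw line of the file per row.
CREATE TABLE dbo.RawImport (RawLine varchar(8000));

-- Load the file one line per row. FIELDTERMINATOR = '\0' (a character that
-- does not occur in the file) keeps each whole line in the single column.
BULK INSERT dbo.RawImport
FROM 'C:\Import\bigfile.txt'
WITH (FIELDTERMINATOR = '\0', ROWTERMINATOR = '\n', TABLOCK);

-- Cut each fixed-width line into columns using the ranges from the
-- import specification (start position, length).
INSERT INTO dbo.TargetTable (Col1, Col2, Col3)
SELECT LTRIM(RTRIM(SUBSTRING(RawLine, 1, 10))),
       LTRIM(RTRIM(SUBSTRING(RawLine, 11, 25))),
       LTRIM(RTRIM(SUBSTRING(RawLine, 36, 8)))
FROM dbo.RawImport;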
Thank you all for your help. This question is now closed. If you have any questions about my solution, please ask.

Related

Import large delimited .txt file in SQL Server 2008

Every morning one of my clients sends me a .txt file with ';' as the separator. This is the file that is currently being imported into a temp table using SSIS:
mario.mascarenhas;MARIO LUIZ MASCARENHAS;2017-03-21 13:18:22;PDV;94d33a66dbaaff15a01d8139c7acd7c6;;;1;0;0;0;0;0;0;0;0;0;0;\N
evilanio.asevedo;EVILANIO ASEVEDO;2017-03-21 13:26:10;PDV;30a1bd072ac5f158f99445bb0975e423;;;1;1;0;0;0;0;0;0;0;0;0;\N
marcelo.tarso;MARCELO TARSO;2017-03-21 13:47:09;PDV;ef6b5e971242ec345552cdb724968f8a;;;1;0;0;0;0;0;0;0;0;0;0;\N
tiago.rodrigues;TIAGO ALVES RODRIGUES;2017-03-21 13:49:04;PDV;d782d4b30c0d302fe815b2cb48de4d03;;;1;1;0;0;0;0;0;0;0;0;0;\N
roberto.freire;ROBERTO CUSTODIO;2017-03-21 13:54:53;PDV;78794a18187f068d612e6b6370a60781;;;1;0;0;0;1;0;0;0;0;0;0;\N
eduardo.lima;EDUARDO MORO LIMA;2017-03-21 13:55:24;PDV;83e1c2696faa83d54881b13c70a07924;;;1;0;0;0;0;0;0;0;0;0;0;\N
Each file contains at least 23,000 rows just like those.
I already made a table with the correct number of columns to receive this data. So what I want is to "explode" each row (just like in PHP), using ';' as the column separator, and loop the insert into my table named dbo.showHistoricalLogging.
I've been searching for a solution here on Stack Overflow, but haven't found anything specific that takes this volume of data into consideration while looping an insert.
Any idea? I'm running SQL Server 2008.
My suggestion:
Convert the text file into a CSV file, then refer to this post from Stack Overflow on using the bulk insert feature. I used this approach at the University of Arizona for one of my programming assignments in my Database Design class. If anything needs clarifying, leave a comment and I will do my best.
Something like this should work:
BULK INSERT [TableName]
FROM 'C:\MyFile.txt'
WITH (FIELDTERMINATOR = ';', ROWTERMINATOR = '\\N');
Consult the Microsoft BULK INSERT documentation if you need other parameters. Alternatively, SSIS makes this super easy as well; honestly, there are many ways you could do this.

My file gets truncated in Hive after uploading it completely to Cloudera Hue

I am using Cloudera's Hue. In the file browser, I upload a .csv file with about 3,000 rows (my file is small, under 400 KB).
After uploading the file I go to the Data Browser, create a table and import the data into it.
When I go to Hive and run a simple query (say, SELECT * FROM table) I only see results for 99 rows, even though the original .csv has more rows than that.
When I do other queries I notice that several rows of data are missing although they show in the preview in the Hue File Browser.
I have tried with other files and they also get truncated sometimes at 65 rows or 165 rows.
I have also removed all the "," from the .csv data before uploading the file.
I finally solved this. There were several issues that appeared to cause the truncation.
The main one was that the column types assigned automatically during the import were inferred from the first few lines only. So when the data later outgrew the inferred type (for example TINYINT instead of INT), the values were truncated or turned into NULL. To solve this, inspect the data first and set the data types explicitly before creating the table (see the sketch below).
The other issues were that the small amount of memory assigned to the virtual machine slowed down the preview process, and that the csv data itself contained commas. You can give the VM more memory or switch the file to tab-separated values.
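For example, instead of letting the import wizard infer the types, the table can be declared with explicit types before loading. A sketch in HiveQL with made-up column names and a tab-separated file:
-- Declare the column types up front so larger values are not truncated
-- or replaced with NULL by a type inferred from the first rows only.
CREATE TABLE my_logs (
    user_name   STRING,
    full_name   STRING,
    event_time  TIMESTAMP,
    channel     STRING,
    event_count INT        -- declared INT, not the TINYINT a wizard might infer
)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY '\t'
STORED AS TEXTFILE;

-- Load the uploaded file into the table.
LOAD DATA INPATH '/user/myuser/my_file.tsv' INTO TABLE my_logs;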

Bulk Insert with Limited Disk Space

I have a bit of a strange situation, and I'm wondering if anyone would have any ideas how to proceed.
I'm trying to bulk load a 48 gig pipe-delimited file into a table in SQL Server 2008, using a pretty simple bulk insert statement.
BULK INSERT ItemMovement
FROM 'E:\SQLexp\itemmove.csv'
WITH (DATAFILETYPE = 'char', FIELDTERMINATOR = '|', ROWTERMINATOR = '\n' )
Originally, I was trying to load directly into the ItemMovement table. But unfortunately, there's a primary key violation somewhere in this giant file. I created a temporary table to load this file to instead, and I'm planning on selecting distinct rows from the temporary table and merging them into the permanent table.
However, I keep running into space issues. The drive I'm working with is a total of 200 gigs, and 89 gigs are already devoted to both my CSV file and other database information. Every time I try to do my insertion, even with my recovery model set to "Simple", I get the following error (after 9.5 hours of course):
Msg 9002, Level 17, State 4, Line 1
The transaction log for database 'MyData' is full due to 'ACTIVE_TRANSACTION'.
Basically, my question boils down to two things.
Is there any way to load this file into a table that won't fill up the drive with logging? Simple Recovery doesn't seem to be enough by itself.
If we do manage to load up the table, is there a way to do a distinct merge that removes the items from the source table while it's doing the query (for space reasons)?
Appreciate your help.
Even with simple recovery, the insert is still a single operation, so the whole load is logged as one transaction.
You are getting the error on the PK column, and I assume the PK is only a fraction of the total row size.
I would break it up and insert only the PK first; I'm pretty sure you can limit the columns with a format file (FORMATFILE).
If you then have to fix a bunch of duplicate PKs, you may need to use a program to parse the file and load it row by row.
That sounds like a lot of work that is solved with a $100 drive. For real, I would install another drive and use it for the transaction log.
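If you go that route, a sketch of the key-only load (the staging table is hypothetical, and the non-XML format file - not shown - would describe every pipe-delimited field in the file but map the non-key fields to server column 0 so they are discarded):
-- Narrow staging table holding just the key values from the file.
CREATE TABLE dbo.ItemMovementKeys (ItemKey varchar(50) NOT NULL);

BULK INSERT dbo.ItemMovementKeys
FROM 'E:\SQLexp\itemmove.csv'
WITH (FORMATFILE = 'E:\SQLexp\itemmove.fmt', TABLOCK);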
#tommy_o was right about using TABLOCK in order to get my information loaded. Not only did it run in about an hour and a half instead of nine hours, but it barely increased my log size.
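For anyone who hits the same wall, a sketch of what that kind of minimally logged load can look like (the staging table name and batch size are illustrative):
-- With simple recovery, a heap target, and TABLOCK, the load qualifies for
-- minimal logging; BATCHSIZE commits in chunks so the log space can be reused.
BULK INSERT ItemMovementStaging
FROM 'E:\SQLexp\itemmove.csv'
WITH (DATAFILETYPE = 'char',
      FIELDTERMINATOR = '|',
      ROWTERMINATOR = '\n',
      TABLOCK,
      BATCHSIZE = 500000);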
For the second part, I realized I could free up quite a bit of space by deleting my CSV after the load, which gave me enough space to get the tables merged.
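A rough sketch of the kind of distinct merge involved (key and column names are placeholders for the real ItemMovement columns):
-- Keep one row per key and skip keys that already exist in the target.
;WITH ranked AS (
    SELECT ItemKey, Col1, Col2,
           ROW_NUMBER() OVER (PARTITION BY ItemKey ORDER BY ItemKey) AS rn
    FROM dbo.ItemMovementStaging
)
INSERT INTO dbo.ItemMovement (ItemKey, Col1, Col2)
SELECT r.ItemKey, r.Col1, r.Col2
FROM ranked AS r
WHERE r.rn = 1
  AND NOT EXISTS (SELECT 1 FROM dbo.ItemMovement AS t WHERE t.ItemKey = r.ItemKey);

-- Reclaim the staging space in one go once the insert has been verified.
TRUNCATE TABLE dbo.ItemMovementStaging;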
Thanks everyone!

importing a text file using pgAdmin

I have just downloaded pgAdmin 1.14.3 in an effort to import, query, and manage large text files. These text files are either quote-comma-quote delimited or tab delimited (they come as quote-comma-quote and I edited many for use with other software). While version 1.16 will include an import function, it has not been released yet, and I am wondering how to import data into a newly created table using pgAdmin.
The text files range from 12 MB to 2 GB, so I'm looking for a comprehensive solution that does not involve importing row by row. I tried this with phpPgAdmin, but ran into the file size limitations set in the php.ini file (separate post) and am trying this as a possible workaround. I'm a little new to SQL, so I'm not really sure of all the commands at my fingertips. Any help is appreciated - thanks!
You can issue a COPY statement, like this:
COPY table_name (column_name)
FROM 'd:\test.sql';
Query returned successfully: 6 rows affected, 31 ms execution time.
See the documentation here:
http://www.postgresql.org/docs/9.1/static/sql-copy.html
Note that I did not test this in PgAdmin for large files, but using psql I have never seen a case where the file had been too big for COPY.
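Since your files are quote-comma-quote or tab delimited, you will probably also want COPY's format options. A sketch, assuming a table named my_table and server-side file paths (COPY FROM reads files from the server's file system; in psql you can use \copy for client-side files):
-- Quoted, comma-separated file with a header row.
COPY my_table (col1, col2, col3)
FROM 'd:\data\myfile.csv'
WITH (FORMAT csv, HEADER true, QUOTE '"');

-- Tab-delimited file: the default text format already uses tabs.
COPY my_table (col1, col2, col3)
FROM 'd:\data\myfile.txt';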

Importing an .RPT (6 gigs) file into SQL Server 2005

I'm trying to import two separate .RPT files into SQL, one small and one large. Both have issues with determining where the columns are separated.
My solution for this was to import the file into access, define the columns and then save it as a txt file.
This worked perfectly.
The problem, however, is that the larger file is 6 gigs and MS Access won't let me open it. When I try simply changing the extension to .txt and importing it into SQL, everything ends up in one column (despite there being 10) and there is no way to accurately separate the data.
Please help!
As Tony stated, Access has a hard 2 GB limit on database size.
You don't say what kind of file the .RPT file is. If it is a text file, then you could break it into smaller chunks by reading it line by line and appending it into temporary files. Then import/export these smaller files one at a time.
Keep in mind the 2GB limit is on the Access database, so your temporary text files will need to be somewhat smaller because the import will likely introduce some additional overhead. Also, you may need to compact/repair the database in between import/export cycles to reclaim space in the database; simply deleting the records is not enough.
If the file has column delimiters or fixed column widths you can try the following in SQL Management Studio:
Right click on a database, select "Tasks" and then "Import data...". This will take you through a wizard where you can define the source columns and map them to an existing or new table.