I have yet to find a reasonable solution to this problem. I need to export a table that has two columns of nvarchar(max) data. The data contains new lines, line breaks and sometimes Unicode characters, so a flat file format doesn't make much sense.
What is the optimal way to achieve this? I have previously attempted it in C#, but the time it took to do the job was too slow. I am open to solutions in any reasonable domain, e.g. SQL Server Management Studio, C#, PowerShell, etc.
Thanks
Related
I have a 25 GB text file with the following structure (headers):
Sample Name Allele1 Allele2 Code metaInfo...
So it is just one table with a few million records. I need to put it into a database because sometimes I need to search the file, for example for a specific sample, and then retrieve the whole matching row exactly as it appears in the file. This would be a basic application. What is important: the file is constant. No insert functionality is needed because all the samples are already final.
My question is:
Which database would be better in this case, and why? Should I load the file into a SQL database, or would MongoDB be a better idea? I need to learn one of them and I want to pick the best option. Could someone give advice? I couldn't find anything specific on the internet.
Your question is a bit broad, but assuming your 25GB text file in fact has a regular structure, with each line having the same number (and data type) of columns, then you might want to host this data in a SQL relational database. The reason for choosing SQL over a NoSQL solution is that the former tool is well suited for working with data having a well defined structure. In addition, if you ever need to relate your 25GB table to other tables, SQL has a bunch of tools at its disposal to make that fast, such as indices.
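As a rough sketch of what that could look like in SQL Server (column names, data types, file path and delimiters below are guesses based on your header line, not something from your post):

    -- Target table for the flat file (names and types are assumptions)
    CREATE TABLE dbo.Samples (
        SampleName varchar(100)  NOT NULL,
        Allele1    varchar(20)   NULL,
        Allele2    varchar(20)   NULL,
        Code       varchar(50)   NULL,
        MetaInfo   varchar(4000) NULL
    );

    -- One-off bulk load of the 25 GB file (path, delimiters and header row are placeholders)
    BULK INSERT dbo.Samples
    FROM 'C:\data\samples.txt'
    WITH (FIELDTERMINATOR = '\t', ROWTERMINATOR = '\n', FIRSTROW = 2);

    -- Index the column you search by, so a lookup doesn't scan the whole table
    CREATE INDEX IX_Samples_SampleName ON dbo.Samples (SampleName);

With an index on the search column, looking up a single sample stays fast even at that file size.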
Both MySQL and MongoDB are equally good for your use case, since you only need read-only operations on a single collection/table.
For a comparison, refer to MySQL vs MongoDB 1000 reads.
That said, I would suggest going with MongoDB because of its aggregation pipeline. Though your current use case is very straightforward, in the future you may need more complex operations, and MongoDB's aggregation pipeline will come in very handy there.
I would like to know if anyone has a good solution for importing BAIv2 banking files into SQL Server. First of all, the files have "continuation records" which have to be considered along with their parent records. Also, T-SQL doesn't have a pleasant way of parsing comma-separated strings. Finally, one hierarchy in the file has a varying number of elements, which makes loading it directly into a table difficult because the columns would not line up.
This is the file from hell. If anyone has any insights into how to import and parse BAIv2 banking files I would be most appreciative.
Thank you,
You're best off handling this with a dedicated application server and a real (general-purpose) programming language. T-SQL is ill-suited for this task.
When that's not an option, you can use C# for a SQL CLR stored procedure to parse the files. I did something similar for banking flat-files when I didn't have the option of an application server.
OK, so here's the background to the story. I am largely self-taught in the bits of SQL I do know, and it tends to be just enough to make the things work that need to work - albeit with a fair bit of research for even the most basic jobs!
I am using a piece of software which grabs a string of data and then passes it straight to a SQL stored procedure, which moves the data around, performs a few tasks on the string to get it into the format I need, and then takes chunks of this data and places it in various SQL tables as set out by the SP. I get maybe half a million lines of data each day, and this process works perfectly well and quickly. However, should data be lost, or not make it through to the SQL database correctly, I do still have a log of the 500,000 lines of raw data in CSV format.
I can't seem to find a way to simply bulk import this data into the various tables in the various formats it needs to be in. Assuming there is no way to re-pass this data through the third-party software (I have tried and failed), what is the best (read: easiest for a relative layman) way to send each line of this CSV file through my existing stored procedure, which can then process and import the data as normal? I have looked at the bcp utility, but that didn't seem to be viable (or I am not well enough informed to make it do what I need). I have also done the usual trawling of the web (and these forums) to see if anything jumped out at me as the obvious way forward, but came up a bit dry.
Apologies if I am asking something a bit 101, but I would certainly be grateful if anyone could help me out with this - if I missed any salient bits of information, let me know! :)
Cheers.
The SQL Server Import/Export Wizard is a point and click solution that can be used to import CSV files into SQL Server.
The wizard builds an SSIS package behind the scenes, which can be saved and scheduled to run as needed. The wizard doesn't give you much in the way of data transformation, but the data could be loaded into a staging table and then processed by your existing stored procedure.
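If you would rather script it than use the wizard, a minimal sketch of that staging approach (using BULK INSERT to load the staging table instead of the wizard) looks something like this; the table, file path and procedure names are placeholders for whatever your existing objects are called:

    -- Staging table holding each raw CSV line as-is
    CREATE TABLE dbo.RawImport (
        Id      int IDENTITY(1,1) PRIMARY KEY,
        RawLine nvarchar(max) NOT NULL
    );

    -- Load the raw log file (path is a placeholder; assumes the lines contain
    -- no tab characters, which is the default field terminator)
    BULK INSERT dbo.RawImport
    FROM 'C:\logs\raw_data.csv'
    WITH (ROWTERMINATOR = '\n');

    -- Feed each line through the existing stored procedure, one row at a time
    DECLARE @line nvarchar(max);
    DECLARE lines CURSOR LOCAL FAST_FORWARD FOR
        SELECT RawLine FROM dbo.RawImport ORDER BY Id;

    OPEN lines;
    FETCH NEXT FROM lines INTO @line;
    WHILE @@FETCH_STATUS = 0
    BEGIN
        -- usp_ProcessIncomingString is a stand-in for whatever your existing SP is called
        EXEC dbo.usp_ProcessIncomingString @InputString = @line;
        FETCH NEXT FROM lines INTO @line;
    END
    CLOSE lines;
    DEALLOCATE lines;

It isn't fast, but it reuses the stored procedure exactly as the normal feed does; a set-based rewrite would be the next step if the volume becomes a problem.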
I have to insert and update some values that come in daily from an Excel file, but the Excel file format is different every day. Could you tell me possible ways to automate the insert/update?
Are the Excel files really in different formats, or does Excel just think they are? If the columns are still in the same ordinal positions but are being interpreted as having different data types, then yes, you can provide hints to the driver to overcome that.
Otherwise, you could use C#/VB.NET to query the worksheet, dump the results into a DataSet, write that to a variable and then shred that object, but it's ugly. In fact, dealing with Excel programmatically is always ugly and best avoided.
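For the "hints to the driver" route, the usual trick is the IMEX=1 extended property, which tells the ACE provider to treat mixed-type columns as text. A hedged example from T-SQL, where the provider version, file path and sheet name are assumptions about your environment:

    -- Requires the ACE OLE DB provider and 'Ad Hoc Distributed Queries' to be enabled on the server
    SELECT *
    FROM OPENROWSET(
            'Microsoft.ACE.OLEDB.12.0',
            'Excel 12.0;Database=C:\Imports\daily.xlsx;HDR=YES;IMEX=1',  -- IMEX=1: treat mixed-type columns as text
            'SELECT * FROM [Sheet1$]');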
If your file is a different format each day then you are out of luck. That is a problem, and there is really no easy or efficient way to parse and insert/update based on that. Whatever the source of the data is, you need to ensure that it becomes consistent.
If it is a handful of formats that you can test and handle accordingly then you could always have some data flow logic inside the SSIS package, but if this isn't predetermined then you would have no way of handling these cases.
You deal with this by returning the file to the provider and requiring them to supply it in the same format every day. Then your SSIS package should reject the file if it is not in the correct format. While you are at it, you will have far fewer problems if they send a .txt or .csv file. Excel support is exceedingly poor.
It's my first post ever... and I really need help on this one, so anyone who has some knowledge of the subject, please help!
What I need to do is read an XML file into SQL Server tables. I have looked over and over for solutions and have actually found a few. The problem is the size of the XML being loaded: it weighs in at 2 GB (and there will be 10 GB ones). I have managed to do this, but I saw one particular solution that seems great to me, and I cannot figure it out.
OK, let's get to the point. Currently I do it this way:
First, I read the entire XML into a variable using OPENROWSET (this eats all of the RAM...).
Next, I use the .nodes() method to pull out the data and fill the tables with it.
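In code, the current approach looks roughly like this (the file path, element names and target table below are simplified stand-ins, not my real structure):

    -- Step 1: read the whole XML file into a variable (this is where all the RAM goes)
    DECLARE @xml xml;
    SELECT @xml = BulkColumn
    FROM OPENROWSET(BULK 'C:\data\huge_file.xml', SINGLE_BLOB) AS src;

    -- Step 2: shred it with .nodes() and fill the target table
    INSERT INTO dbo.TargetTable (Name, Value)
    SELECT r.value('(Name/text())[1]',  'nvarchar(100)'),
           r.value('(Value/text())[1]', 'nvarchar(max)')
    FROM @xml.nodes('/Root/Row') AS t(r);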
That's a two-step process, and I was wondering if I could do it in only one step. I saw that there are things like format files, and there are numerous examples of how to use them to pull data from flat files or even Excel documents in a record-based manner (instead of sucking the whole thing into a variable), but I CANNOT find any example showing how to read a huge XML file into a table while parsing the data on the fly (based on a format file). Is it even possible? I would really appreciate some help, or guidance on where to find a good example.
Pardon my English - it's been a while since I had to write so much in that language :-)
Thanks in advance!
For very large files, you could use SSIS: Loading XML data into SQL Server 2008
It gives you the flexibility of transforming the XML data, as well as reducing your memory footprint for very large files. Of course, it might be slower compared to using OPENROWSET in BULK mode.