Need to export fields containing linebreaks as a CSV from SQL Server - sql-server-2005

I'm running a query from SQL Server Management Studio 2005 that has an HTML file stored as a string, e.g.:
SELECT html
FROM table
This query displays in the "Results" window nicely, i.e., each row contains a string of the whole HTML file, with one record per row.
However, when I use the "Results to file" option, it exports an unusable CSV: line breaks appear in the CSV wherever line breaks occurred in the field (i.e., in the HTML), rather than one line per record as needed. I tried experimenting with Query > Query Options for both the "Grid" and "Text" results, to no avail. The fields exported to the CSV do not appear to be enclosed in quotes.
Some ideas:
Might it be possible to append quotation marks with the SQL?
Is there some T-SQL equivalent to the WITH CSV HEADER commands that are possible in other dialects?

I don't see where you will have much success exporting HTML to CSV - it's really not what CSV is meant for. You would be better off using an XML format, where the HTML code can be enclosed in a CDATA element.
That said, you could try using the Replace function to remove the line breaks, and you could manually add the quotes - something like this:
select '"' + replace (replace (html, char(10), ''), char(13), '') + '"'
If your html value could have double quotes in it, you would need to escape those.
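Putting the two together, a rough sketch (table and column names are from the question; doubling embedded double quotes is the conventional CSV escape):
SELECT '"' + REPLACE(
                 REPLACE(
                     REPLACE(html, CHAR(13), ''),   -- drop carriage returns
                     CHAR(10), ''),                 -- drop line feeds
                 '"', '""')                         -- escape embedded double quotes by doubling them
           + '"' AS html_csv
FROM [table]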

If you are using Visual Studio, Server Explorer is an alternative solution. You can correctly copy & paste the results from its grid.

According to the closest thing to a standard we have (RFC 4180), a correctly formatted CSV file should quote fields containing either the separator (most often , or ;) or a line break.
SQL Server Management Studio can do that, but amazingly it doesn't have this option enabled by default. To enable, go to Tools → Options → Query Results → Results to Grid and check "Quote strings containing list separators when saving .csv results"

I know how old this is, but I found it and others might as well. You might want to take Martijn van Hoof's answer one step further and remove possible tabs (CHAR(9)) in addition to carriage returns (CHAR(13)) and line feeds (CHAR(10)).
SELECT REPLACE(REPLACE(REPLACE(mycolumn, CHAR(9), ''), CHAR(10), ''), CHAR(13), '') as 'mycolumn' FROM mytable

Column with line breaks: "mycolumn"
SELECT REPLACE(REPLACE(mycolumn, CHAR(13), ''), CHAR(10), '') AS [mycolumn] FROM mytable
This replaces the linebreaks.

I know this is a very old question and it has already been answered, but this should help as well:
SELECT html
From table
For Xml Auto, Elements, Root('doc')
It should spit out XML, and then you can import that XML into Excel.

I found this useful, but I was still hitting problems because my field was of type text, so I cast the text as varchar(8000), and the above REPLACE works like a charm:
REPLACE(REPLACE(CAST([TEXT FIELD] AS VARCHAR(8000)), CHAR(10), ' '), CHAR(13), ' ') AS 'Field Name',

SQL Server Import and Export Data tool, FTW!
Start -> Programs -> Microsoft SQL Server -> Import and Export Data
Sample query:
select *
from (
select 'Row 1' as [row], 'Commas, commas everywhere,' as [commas], 'line 1
line 2
line 3' as [linebreaks]
union all
select 'Row 2' as [row], 'Nor any drop to drink,' as [commas], 'line 4
line 5
line 6' as [data]
) a
CSV output:
"row","commas","linebreaks"
"Row 1","Commas, commas everywhere,","line 1
line 2
line 3"
"Row 2","Nor any drop to drink,","line 4
line 5
line 6"
Disclaimer: You may have to get creative with double-quotes in the data. Good luck!

I got around this limitation by creating an Access database, using the "Link to the data source by creating a linked table" feature, opening the "linked" table, then copy/paste to Excel from there. Excel can save the CSV data as needed. You should be able to connect directly from Excel too, but the "linked table" feature in Access let me set up several tables at once pretty quickly.


Hex code of space after using XML in SQL Server

I have 4 rows. Each one contains one symbol. Using XML PATH('') I combine them into one string. What I should obtain is '  ab' (space, space, a, b). Instead I get '&#x20&#x20ab' (I cut the semicolons, because with them the entities would just render as spaces and not be visible).
How do I convert the first symbols back to regular spaces?
Solved like in this article: http://codecorner.galanter.net/2009/06/25/t-sql-string-aggregate-in-sql-server/
Just added .value('root[1]','varchar(max)')
Thanks to Yuriy Galanter
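For reference, a minimal sketch of that pattern (table and column names here are hypothetical): adding the TYPE directive and then calling .value() converts the XML result back to plain varchar, so entities such as &#x20; come back as literal spaces.
SELECT (
    SELECT symbol AS [text()]
    FROM dbo.Symbols
    FOR XML PATH(''), ROOT('root'), TYPE
).value('root[1]', 'varchar(max)') AS combined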

Incorrect syntax near ''

I'm trying to run the following fairly simple query in SQL Server Management Studio:
SELECT TOP 1000 *
FROM
master.sys.procedures as procs
left join
master.sys.parameters as params on procs.object_id = params.object_id
This seems totally correct, but I keep getting the following error:
Msg 102, Level 15, State 1, Line 6
Incorrect syntax near ''.
It works if I take out the join and only do a simple select:
SELECT TOP 1000 *
FROM
master.sys.procedures as procs
But I need the join to work. I don't even have the string '' in this query, so I can't figure out what it doesn't like.
Such unexpected problems can appear when you copy the code from a web page or email and the text contains unprintable characters like individual CR or LF and non-breaking spaces.
Panagiotis Kanavos is right: sometimes copying and pasting T-SQL can introduce unwanted characters...
I finally found a simple and fast way (only Notepad++ needed) to detect which character is wrong, without having to manually rewrite the whole statement: there is no need to save any file to disk.
It's pretty quick, in Notepad++:
Click "New file"
Check under the menu "Encoding": the value should be "Encode in UTF-8"; set it if it's not
Paste your text
From the Encoding menu, now click "Encode in ANSI" and check your text again
You should easily find the wrong character(s)
The error for me was that I read the SQL statement from a text file, and the text file was saved in the UTF-8 with BOM (byte order mark) format.
To solve this, I opened the file in Notepad++ and under Encoding, chose UTF-8. Alternatively you can remove the first three bytes of the file with a hex editor.
You can identify the encoding used for the file (in this case a .sql file) using an editor (I used Visual Studio Code). Once you open the file, the encoding is shown in the lower right corner of the editor.
I had this issue when I was trying to check in a file that was encoded as UTF-8 with BOM (originating from a non-Windows machine) and had special characters appended to individual string characters.
You can change the encoding of your file as follows:
In the bottom bar of VSCode, you'll see the label UTF-8 With BOM. Click it. A popup opens. Click Save with encoding. You can now pick a new encoding for that file (UTF-8)
I was using ADO.NET and was using SQL Command as:
string query =
"SELECT * " +
"FROM table_name" +
"Where id=#id";
The thing was, I missed a whitespace at the end of "FROM table_name" +
So basically it said
string query = "SELECT * FROM table_nameWHERE id=@id";
and this was causing the error.
Hope it helps
I got this error because I pasted alias columns into a DECLARE statement.
DECLARE @userdata TABLE(
f.TABLE_CATALOG nvarchar(100),
f.TABLE_NAME nvarchar(100),
f.COLUMN_NAME nvarchar(100),
p.COLUMN_NAME nvarchar(100)
)
SELECT * FROM @userdata
ERROR:
Msg 102, Level 15, State 1, Line 2
Incorrect syntax near '.'.
DECLARE @userdata TABLE(
f_TABLE_CATALOG nvarchar(100),
f_TABLE_NAME nvarchar(100),
f_COLUMN_NAME nvarchar(100),
p_COLUMN_NAME nvarchar(100)
)
SELECT * FROM @userdata
NO ERROR
For me, I was missing single quotes in the statement.
Incorrect One : "INSERT INTO Customers (CustomerNo, FirstName, MobileNo1, RelatedPersonMobileNo) VALUES ('John123', John', '1111111111', '1111111111)"
I missed the quotes in John' and '1111111111.
Correct One: "INSERT INTO Customers (CustomerNo, FirstName, MobileNo1, RelatedPersonMobileNo) VALUES ('John123', 'John', '1111111111', '1111111111')"
I was able to run this by replacing the 'Dot' with an 'Underscore' in the [dbo].[tablename] reference.
EXAMPLE:
EXEC sp_columns INFORMATION_SCHEMA.COLUMNS
GO -- this will NOT work, but IntelliSense/autocomplete behaves as if it's correct.
EXEC sp_columns INFORMATION_SCHEMA_COLUMNS
GO -- this will run in Synapse, but funnily enough it will not autocomplete.

How to check my data in SQL Server have carriage return and line feed? [duplicate]

This question already has answers here:
SQL query for a carriage return in a string and ultimately removing carriage return
(10 answers)
Closed 8 years ago.
I'm facing a problem; it seems my data is not stored correctly in SQL Server. Simply put: how do I verify that a varchar value has a carriage return or line feed in it? I tried printing the values out, but that does not show the special characters.
Thanks
You can use SQL Server's CHAR(n) function together with CHARINDEX() (or a LIKE pattern) to match on field contents in your WHERE clause.
carriage return: char(13)
line feed: char(10)
The following SQL will find all rows in some_table where the values of some_field contain newline and/or carriage return characters:
SELECT * FROM some_table
WHERE CHARINDEX(CHAR(13), some_field) > 0 OR CHARINDEX(CHAR(10), some_field) > 0
To remove carriage returns in your some_field values you could use the REPLACE() function along with CHAR() and CHARINDEX(). The SQL would look something like:
UPDATE some_table
SET some_field = REPLACE(some_field, char(13), '')
WHERE CHARINDEX(CHAR(13), some_field) > 0
To remove new lines you can follow up the last statement with the CHAR(10) version of that update. How you do it all depends on what types of newlines your text contains. But, depending on where the text was inserted/pasted from, the new lines may be \r\n or \n, so running updates against both \r and \n characters is safer than assuming that you're getting one version of newline or the other.
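For example, a combined sketch (same hypothetical names) that strips CR and LF independently, so it covers \r\n, lone \n, and lone \r in one pass:
UPDATE some_table
SET some_field = REPLACE(REPLACE(some_field, CHAR(13), ''), CHAR(10), '')
WHERE some_field LIKE '%' + CHAR(13) + '%'
   OR some_field LIKE '%' + CHAR(10) + '%'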
Note that if the newlines were removed and you want to retain them then you have to fix the issue at the point of entry. You can't replace or fix what has been removed so you should save the original text data in a new column that holds the original, unmodified text.
To add to what others have said; when I need to embed newlines in T-SQL, I tend to do;
DECLARE @nl CHAR(2) = CHAR(13) + CHAR(10);
...then use @nl as required. That's for Windows line endings, naturally.
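A quick usage sketch (the string literals are only illustrative):
DECLARE @nl CHAR(2) = CHAR(13) + CHAR(10);
PRINT 'line 1' + @nl + 'line 2';   -- prints two lines in the Messages tab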
Take a look at the CHAR function (see MSDN). It will help you look for the special characters.

Replace() on a field with line breaks in it?

So I have a field that's basically storing an entire XML file per row, complete with line breaks, and I need to remove some text from close to three hundred rows. The replace() function doesn't find the offending text no matter what I do, and all I can find by searching is a bunch of people trying to remove the line breaks themselves. I don't see any reason that replace() just wouldn't work, so I must just be formatting it wrong somehow. Help?
Edit: Here's an example of what I mean in broad terms:
<script>...</script><dependencies>...</dependencies><bunch of other stuff></bunch of other stuff><labels><label description="Field2" languagecode="1033" /></labels><events><event name="onchange" application="false" active="true"><script><![field2.DataValue = (some equation);
</script><dependencies /></event></events><a bunch more stuff></a bunch more stuff>
I need to just remove everything between the events tags. So my sql code is this:
replace(fieldname, '<events><event name="onchange" application="false" active="true"><script><![field2.DataValue = (some equation);
</script><dependencies /></event></events>', '')
I've tried it like that, and I've tried it all on one line, and I've tried using char(10) where the line breaks are supposed to be, and nothing.
Nathan's answer was close. Since this question is the first thing that came up from a search I wanted to add a solution for my problem.
select replace(field,CHAR(13)+CHAR(10),' ')
I replaced the line break with a space in case there was no break. It may be that you want to always replace it with nothing, in which case '' should be used instead of ' '.
Hope this helps someone else and they don't have to click the second link in the results from the search engine.
Worked for me on SQL Server 2012:
UPDATE YourTable
SET YourCol = REPLACE(YourCol, CHAR(13) + CHAR(10), '')
If your column is an xml typed column, you can use the delete method on the column to remove the events nodes. See http://msdn.microsoft.com/en-us/library/ms190254(v=SQL.90).aspx for more info.
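A sketch of that approach, assuming a hypothetical xml-typed column and element layout (the XPath has to match your actual document structure):
UPDATE mytable
SET xmlfield.modify('delete //events')    -- XML DML: remove every <events> element
WHERE xmlfield.exist('//events') = 1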
Try two simple tests.
Try the replace on an XML string that has no double quotes (or single quotes) but does have CRLFs. Does it work? If yes, you need to escape the quote marks.
Try the replace on an XML string that has no CRLFs. Does it work? Great. If yes, use two nested replace() calls: an inner one for the CRLFs only, then a second, outer replace for the string in question.
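Combining the two, something like the following sketch (the search string is only a placeholder for the offending text, written on one line because the inner replace has already removed the CRLFs):
SELECT REPLACE(
           REPLACE(fieldname, CHAR(13) + CHAR(10), ''),   -- inner: collapse the CRLFs
           '<events>...</events>',                        -- placeholder for the now single-line text to remove
           '') AS cleaned
FROM mytable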
A lot of people do not remember that a Windows line break is two characters
(CHAR(13) \r followed by CHAR(10) \n).
Replace both, and you should be good.
SELECT
REPLACE(field, CHAR(13) + CHAR(10), '')
FROM Blah

Bulk insert, SQL Server 2000, unix linebreaks

I am trying to insert a .csv file into a database with unix linebreaks. The command I am running is:
BULK INSERT table_name
FROM 'C:\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
If I convert the file into Windows format the load works, but I don't want to do this extra step if it can be avoided. Any ideas?
I felt compelled to contribute as I was having the same issue, and I need to read 2 UNIX files from SAP at least a couple of times a day. Therefore, instead of using unix2dos, I needed something with less manual intervention that could be automated.
As noted, CHAR(10) works within the dynamic SQL string. I didn't want to use a dynamic SQL string, so I tried '''' + CHAR(10) + '''', but for some reason, this didn't compile.
What did work very slick was: with (ROWTERMINATOR = '0x0a')
Problem solved with Hex!
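Applied to the original question, that would look roughly like this (note that hexadecimal row terminators are only recognised by newer SQL Server versions, so this may not help on SQL Server 2000):
BULK INSERT table_name
FROM 'C:\file.csv'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '0x0a'
)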
Thanks to all who have answered but I found my preferred solution.
When you tell SQL Server ROWTERMINATOR='\n' it interprets this as meaning the default row terminator under Windows which is actually "\r\n" (using C/C++ notation). If your row terminator is really just "\n" you will have to use the dynamic SQL shown below.
DECLARE @bulk_cmd varchar(1000)
SET @bulk_cmd = 'BULK INSERT table_name
FROM ''C:\file.csv''
WITH (FIELDTERMINATOR = '','', ROWTERMINATOR = '''+CHAR(10)+''')'
EXEC (@bulk_cmd)
Why you can't say BULK INSERT ...(ROWTERMINATOR = CHAR(10)) is beyond me. It doesn't look like you can evaluate any expressions in the WITH section of the command.
What the above does is create a string of the command and execute that. Neatly sidestepping the need to create an additional file or go through extra steps.
I confirm that the syntax
ROWTERMINATOR = '''+CHAR(10)+'''
works when used with an EXEC command.
If you have multiple ROWTERMINATOR characters (e.g. a pipe and a unix linefeed) then the syntax for this is:
ROWTERMINATOR = '''+CHAR(124)+''+CHAR(10)+'''
It's a bit more complicated than that! As explained above, SQL Server interprets ROWTERMINATOR='\n' as the default row terminator under Windows, which is actually "\r\n" (using C/C++ notation); if your row terminator is really just "\n" you will have to use the dynamic SQL shown above. I have just spent the best part of an hour figuring out why \n doesn't really mean \n when used with BULK INSERT!
One option would be to use bcp, and set up a control file with '\n' as the line break character.
Although you've indicated that you would prefer not to, another option would be to use unix2dos to pre-process the file into one with '\r\n' line breaks.
Finally, you can use the FORMATFILE option on BULK INSERT. This will use a bcp control file to specify the import format.
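With that last option the statement itself stays simple; the bcp format file (the .fmt path here is hypothetical) is where the field and row terminators are described:
BULK INSERT table_name
FROM 'C:\file.csv'
WITH (FORMATFILE = 'C:\file.fmt')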
Looks to me like there are two general avenues that can be taken: some alternate way to read the CSV in the SQL script, or convert the CSV beforehand in any of the numerous ways you can do that (bcp, unix2dos; if it is a one-time kind of thing, you can probably even use your code editor to fix the file for you).
But you will have to have an extra step!
If this SQL is launched from a program, you might want to convert the line endings in that program. In that case and you decide to code the conversion yourself, here is what you need to watch out for:
1. The line ending might be \n
2. or \r\n
3. or even \r (Mac!)
4. good grief, it could be that some lines have \r\n and others \n, any combination is possible unless you control where the CSV came from
OK, OK. Possibility 4 is farfetched. It happens in email, but that is another story.
I would think "ROWTERMINATOR = '\n'" would work. I would suggest opening the file in a tool that shows "hidden characters" to make sure the line is being terminated like you think. I use notepad++ for things like this.
It comes down to this. Unix uses LF (ctrl-J), MS-DOS/Windows uses CR/LF (ctrl-M/Ctrl-J).
When you use '\n' on Unix, it gets translated to a LF character. On MS-DOS/Windows it gets translated to CR/LF. When your import runs on the Unix formatted file, it sees only a LF. Hence, it's often easier to run the file through unix2dos first. But as you said in your original question, you don't want to do this (I'll assume there is a good reason why you can't).
Why can't you do:
(ROWTERMINATOR = CHAR(10))
Probably because when the SQL code is being parsed, it is not replacing the CHAR(10) with the LF character (because it's already encased in single quotes). Or perhaps it's being interpreted as:
(ROWTERMINATOR =
)
What happens when you echo out the contents of @bulk_cmd?
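A quick way to check is to print the command instead of (or just before) executing it:
PRINT @bulk_cmd   -- inspect the generated BULK INSERT statement
-- EXEC (@bulk_cmd)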