BigQuery - quoting strings produced unexpected results

I am running the following query:
select concat('{','"name"',':', chr(34), str,
chr(34), ', ','"type"',':','"string"','},') jsonl
from (select 'part_number' as str)
which results in:
and this is the expected result.
But when I save the results to a csv file,
the results look different.
The issue is the extra double quotation marks surrounding each element.
Any idea what is causing this discrepancy?
btw, my local machine is running Windows 11.

This is the usual way of escaping quotes inside strings in CSV files: the whole field is wrapped in double quotes and each double quote inside it is doubled. Try opening this file in Excel or LibreOffice Calc. It should look as expected.
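To see why the extra quotes are harmless, here is a minimal sketch that uses Python's csv module as a stand-in for the BigQuery export (an assumption; the export itself is not reproduced here). It writes the string produced by the query as one CSV field and then reads it back:

```python
import csv
import io

# The value the query in the question produces for str = 'part_number'.
value = '{"name":"part_number", "type":"string"},'

# Write the value as a single CSV field; the writer wraps it in quotes
# and doubles every embedded double quote.
buf = io.StringIO()
csv.writer(buf).writerow([value])
raw_line = buf.getvalue()
print(raw_line)

# A CSV-aware reader undoes the escaping and returns the original value.
parsed = next(csv.reader(io.StringIO(raw_line)))[0]
print(parsed)
```

The raw line is what a plain text editor shows; the parsed value is what a spreadsheet or any CSV-aware reader shows.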

Related

Remove space between empty quotes in csv file using powershell

I have a CSV file with many empty quoted fields and I want to remove them using PowerShell. I tried various solutions, but none of them worked.
Sample data: " ","abc",""," ","123"
Expected output: ,"abc",,"123"

" replaced by ""

The Redshift UNLOAD command is replacing " with "".
Example:
UNLOAD($$ select '"Jane"' as name $$)
TO 's3://s3-bucket/test_'
iam_role 'arn:aws:iam::xxxxxx:role/xxxxxx'
HEADER
CSV
DELIMITER ','
ALLOWOVERWRITE
The output looks like: ""Jane""
If I run the same command with select 'Jane' as name, the output has no quotes at all, like Jane. But I need the output to be "Jane".
You are asking for the unloaded file to be in CSV format and CSV format says that if you want a double quote in your data you need to escape it with another double quote. See https://datatracker.ietf.org/doc/html/rfc4180
So Redshift is doing exactly as you requested. Now if you just want a comma delimited file then you don't want to use "CSV" as this will add all the necessary characters to make the file fully compliant with the CSV specification.
This choice will come down to what tool or tools are reading the file and if they expect an rfc compliant CSV or just a simple file where fields are separated by commas.
This is a gripe of mine - tools that say they read CSV but don't follow the spec. If you say CSV then follow the format. Or call what you read something different, like CDV - comma delimited values.
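The difference between a CSV file and a plain comma-delimited file can be sketched as follows. This uses Python's csv module as a stand-in for a spec-following writer (an assumption; Redshift's exact output is not reproduced here), and a simple join for the comma-delimited case:

```python
import csv
import io

row = ['"Jane"']  # one field whose data includes the double quotes

# CSV-style output: the writer quotes the field and doubles each embedded quote.
buf = io.StringIO()
csv.writer(buf, lineterminator="\n").writerow(row)
csv_line = buf.getvalue()

# Plain comma-delimited output: fields are joined as-is, with no quoting rules.
plain_line = ",".join(row) + "\n"

print(csv_line, end="")
print(plain_line, end="")

# A CSV-aware reader recovers the original field from the CSV-style line.
recovered = next(csv.reader(io.StringIO(csv_line)))
print(recovered == row)
```

Which of the two lines is the right one depends on whether the consumer parses CSV or just splits on commas.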

Getting Unescaped JSON from SQL

I've created a stored procedure to pull data as a JSON object from my SQL Server database. All my data is relational and I'm trying to get it out as a JSON string.
Currently, I am able to get a JSON string out of SQL Server just fine; however, this object ALWAYS includes escape characters (e.g. "{\"field\":\"value\"}"). I'd like to pull the same JSON but without escaped characters. To test this I'm using some simple queries and getting them into .NET with a SqlDataAdapter using my stored procedure.
The thing that puzzles me is that when I run my query within SSMS, I never see any escape characters, but as soon as it's pulled into a .NET application, the escape characters appear. I'd like to prevent this from happening and have my applications get only the unescaped JSON string.
I've tried several suggestions I've found during my research but nothing has produced my desired results. The changes I've seen (documented in MSDN and in other SO posts) have dealt with getting unescaped results, but only within SSMS and not within other applications.
What I've tried:
Simple Json query set to param and then using JSON_QUERY to select the param:
DECLARE @JSON varchar(max)
SET @JSON = (SELECT '{"Field":"Value"}' AS myJson FOR JSON PATH)
SELECT JSON_QUERY(@JSON) AS 'JsonResponse' FOR JSON PATH
This produces the following in a .NET application:
"[{\"JsonResponse\":{\"Field\":\"Value\"}}]"
This produces the following in SSMS:
[{"JsonResponse":[{"myJson":"{\"Field\":\"Value\"}"}]}]
Simple Json query without param using JSON_QUERY:
SELECT JSON_QUERY('{"Field":"Value"}') AS 'JsonResponse' FOR JSON PATH
This produces the following in a .NET application
"[{\"JsonResponse\":{\"Field\":\"Value\"}}]"
This produces the following in SSMS
[{"JsonResponse":{"Field":"Value"}}]
Simple Json query with temp tables using JSON_QUERY:
CREATE TABLE #temp(
jsoncol varchar(255)
)
INSERT INTO #temp VALUES ('{"Field":"Value"}')
SELECT JSON_QUERY(jsoncol) AS 'JsonResponse' FROM #temp FOR JSON PATH
DROP TABLE #temp
This produces the following in a .NET application:
"[{\"JsonResponse\":{\"Field\":\"Value\"}}]"
This produces the following in SSMS:
[{"JsonResponse":{"Field":"Value"}}]
I'm led to believe that there is no way to get a JSON string out of SQL Server without the escaped characters. In case the examples above weren't enough, I've included my stored procedure here. Hopefully someone can point me in the right direction.
This depends where you look at the string...
In SSMS a string is marked with single quotes. The double quote can exist within a string without problems:
DECLARE @SomeString VARCHAR(100) = 'This can include "double quotes" but you have to double ''single quote''';
In a C# application the double quote is the string marker. So the above example would look like this:
string SomeString = "This must escape \"double quotes\" but you can use 'single quote' without problems";
Within your IDE (is it VS?) you can look at the string as it is, or as you would need to write it in code. Your example shows " at the beginning and at the end of your string. That is a clear hint that you are looking at the as-in-code representation. You could copy this string and paste it into your code. The real string, which is used and processed, will not contain escape characters.
Hint: Escape characters are only needed in human-readable formats, where there are characters with special meaning (a ; in a CSV, a < in HTML and so on)...
UPDATE Some more explanation
Escape characters are needed to place a string within a string. Somehow you have to mark the beginning and the end of the string, and there is nothing else you can use for that other than some magic characters.
In order to use these characters within the embedded string, you have to go one of the following ways:
escaping (e.g. XML will replace & with &amp; and JSON will replace a " with \" as JSON uses the " to mark its labels) or
Magic borders (e.g. a CDATA-section in XML, which allows to place unescaped characters as is: <![CDATA[forbidden characters &<> allowed here]]>)
Whatever you do, you must distinguish between the visible string in an editor or in a text-based container like XML or JSON and the value the application will pick out of this.
An example:
<root><a>this &amp; that</a></root>
visible string: "this &amp; that"
real value: "this & that"
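The same distinction between the visible string and the real value can be sketched in Python (used here only as an illustration; the question itself is about .NET). json.dumps plays the role of the "as written in code" view, and escape/unescape from xml.sax.saxutils show the XML case:

```python
import json
from xml.sax.saxutils import escape, unescape

# The real value: JSON text containing double quotes and no backslashes.
real_json_text = '{"Field":"Value"}'

# Embedding it in a quoted string literal forces the inner quotes to be escaped.
as_literal = json.dumps(real_json_text)
print(real_json_text)
print(as_literal)

# Reading the literal back yields the original, unescaped value.
print(json.loads(as_literal) == real_json_text)

# XML works the same way: & is written as an entity and read back as &.
real_text = "this & that"
in_xml = escape(real_text)
print(in_xml)
print(unescape(in_xml) == real_text)
```

In both cases the escape characters exist only in the container or display format, not in the value the application processes.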

Teradata SQL - Replacing special characters

I'm using Report Builder 3.0 for my reports. My report runs; however, if a user exports the results to Excel (xlsx) instead of Excel 2003 (xls), they get an "illegal xml character" message when the file is opened.
4 of the columns contain "&" and / or " ' "; so I'm trying to replace these special characters; which I believe are causing the issue.
I've tried to update this line:
j.journal_desc AS "Jrnl Description",
with this line:
oreplace(oreplace(j.journal_desc, '&', 'and'), '''', '') AS "Jrnl Description",
and it works fine. However when I do this on a second line I get the message: "SELECT Failed. [9804] Response Row size or Constant Row size overflow".
I've tried "otranslate" and it works on 2 columns. However, when I try it on the 3rd column, I get the same overflow message.
Is it possible to use oreplace or otranslate on multiple columns? Am I doing something wrong? Is there a better way to replace these special characters?
Thanks for the help.
When oreplace or otranslate is used, the result string has a length of 8000 characters in the Unicode character set, so each additional call makes the row longer by another 8000. Casting the result to a smaller length should fix the problem:
CAST(oreplace(journal_desc,'&','and') AS VARCHAR(100))

Dealing with commas in csv files csv-river plugin

I am trying to index data from a CSV file into an Elasticsearch server. The problem is that the strings themselves contain multiple "," characters, so indexing fails with an index-out-of-bounds exception.
How can I handle commas using the csv-river plugin?
Edit:
The example file would be:
MESSAGE_ID,PARENT_MESSAGE_ID,THREAD_ID,FORUM_ID,FORUMINDEX,USER_ID,SUBJECT,BODY,MODVALUE,FORUM_NAME,CATEGORY_NAME,LIKES,DISLIKES,IS_ROOT_MESSAGE,IS_QUESTION
244,195,103,4,3,341,Re: The most stupidest program I've ever seen--Amazon,"I know nothing of your case, but I do know that throwing around terms like ""stupid idiot"" doesn't exactly help your side any.",1,"Order Management, Shipping, Feedback & Returns",Sell on Amazon,,,no,no
You need to enclose your fields in quotes. If a field contains a quote, you need to escape it with a preceding quote.
For example:
"field1","field2","field3 with, commas","field4","field ""5"" with quotes","field6"