How to load special characters (non-English letters) in SQL*Loader

Some of my developer_id values are in a foreign language (special characters). I googled how to handle those characters, and people said to use
NVARCHAR2()
or use:
INSERT INTO table_name VALUES (N'你好');
However, I used NVARCHAR2() in the stage table and in all the other tables, and it still doesn't work for me (the original datatype for developer_id was VARCHAR2()). Also, I don't think the INSERT statement with the N prefix applies to SQL*Loader.
What should I do?
[Screenshots here showed where the problem appears, my .ctl file, and the datatype of each field in the flat file.]
The character set of the flat file is UTF-8. I thought I had solved this problem by changing the encoding when pre-loading the data into the stage table, but the same problem still shows up after the import into stage finishes.
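(For reference, SQL*Loader can be told the character set of the input file in the control file itself, which is usually the cleanest way to handle this. A minimal sketch with placeholder file, table, and column names, assuming a UTF-8 flat file:)

-- hypothetical control file; all names below are placeholders
LOAD DATA
CHARACTERSET UTF8
INFILE 'developers.dat'
APPEND
INTO TABLE stage_developers
FIELDS TERMINATED BY ','
(developer_id CHAR(4000))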

Related

SQL encoding after restore from backup (mariadb, unix)

I need to restore a SQL table from a daily backup, but there are problems with encoding. The backup is made by Virtualmin, with the encoding set to "default". The texts are in French, so they have accents...
[The dump of the Webmin backup file and the WordPress table definition, with the fields of interest, were shown here.]
I need to insert part of this table into the live table (after deleting some lines). The table is already created with
Default collation utf8mb4_unicode_ci
When I import the rows into the table, the text is not "converted" into the right charset. For example, the French "é" shows up as "Ã©", and so on.
I tried a few things, such as adding SET commands for utf8mb4 before the INSERT, but the encoding is never handled correctly. Text in the database itself shows "Ã©" instead of "é", and of course the same when displaying in a browser.
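(For reference, the kind of SET command meant here is the one below. This is only a sketch: it assumes the bytes in the dump really are UTF-8, and the table/column names are placeholders. The command-line equivalent is running the mysql client with --default-character-set=utf8mb4.)

-- declare how the client's bytes are encoded before replaying the dump
SET NAMES utf8mb4;
-- ...then run the INSERT statements from the backup in the same session, e.g.:
INSERT INTO wp_posts (post_content) VALUES ('déjà vu');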
Any suggestion? Thank you!

Exporting SQL Server table containing a large text column

I have to export a table from a SQL Server. The table contains a column with large text content, with lengths of up to 100,000 characters.
When I use Excel as an export destination, I find that the length of this text is capped and truncated to 32,765 characters.
Is there an export format that preserves the length?
Note:
I will eventually be importing this data into another SQL Server
The destination SQL Server is in another network, so linked servers and other local options are not feasible
I don't have access to the actual server, so generating a backup is difficult
As documented in the Excel specifications and limits, the maximum number of characters that can be stored in a single Excel cell is 32,767; hence your data is being truncated.
You might be better off exporting to a CSV; however, note that quote-identified CSV files aren't supported by bcp/BULK INSERT until SQL Server 2019 (currently in preview). You can use a character sequence like || to denote a field delimiter, but if you have any line breaks you'll need to choose a different row delimiter too. SSIS and other ETL tools, however, do support quote-identified CSV files, so you can use something like that.
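A rough sketch of what the BULK INSERT side of that could look like with a || field delimiter (the table name and file path below are placeholders):

-- hypothetical target table and file path
BULK INSERT dbo.ImportedArticles
FROM 'C:\export\articles.txt'
WITH (
    FIELDTERMINATOR = '||',  -- matches the delimiter used for the export
    ROWTERMINATOR   = '\n'   -- change this too if the text itself contains line breaks
);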
Otherwise, if you need to export such long values and want to use Excel as much as you can (which I actually personally don't recommend due to those awful ACE drivers), I would suggest exporting the (n)varchar(MAX) values to something else, like a text file, and naming each file with the value of your Primary Key included. Then, when you import the data back you can retrieve the (n)varchar(MAX) value again from each individual file.
A .sql script is another option: it is the native format for a SQL table, so you don't have to convert anything on export.

SSMS - Importing into nvarchar(max) still giving truncate errors

I'm using SSIS to import an Excel table into SQL Server.
The field in the SQL Server table is set to nvarchar(max), but it still gives me a truncation error.
The column that I want to import can have any number of characters; it could be 1 or it could be 10,000. It's a free-text field without any limitations.
Go into the Advanced settings of your Excel Source Component, and manually set the length of the Output columns.
SSIS samples your data to get an idea of each column's size. It will use the maximum length in the sample to determine the "proper" field size, which of course causes constant issues.
Can you add something to order your data so the longest values come first?
ORDER BY LEN(LongFIELD) DESC
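If the Excel source component reads the sheet through a SQL command rather than a plain table select, the ordering could look roughly like this (the sheet and column names are placeholders, and it assumes the ACE provider accepts Len()):

-- put the longest value first so SSIS's sampling sees the true maximum length
SELECT *
FROM [Sheet1$]
ORDER BY LEN([LongField]) DESC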
Check out StackExchange for more info:
Text was truncated or one or more characters had no match in the target code page When importing from Excel file

How do I import data from a csv when the records are not separated by line breaks but with brackets

I'm looking at the AM data for a data analysis project, and I'm having trouble importing the data into my DBMS (PostgreSQL).
My SQL code is this:
DROP TABLE IF EXISTS member_details;
CREATE TABLE member_details(
pnum varchar(255),
.....
updatedon timestamp);
COPY member_details
FROM '/Users/etc/data/sample_dump.csv'
WITH DELIMITER ','
CSV;
The problem is that the CSV file has no line breaks to separate the records; instead, each record is wrapped in brackets, which my code above does not recognise, so it just imports all the data as a single line and no records are created.
This is how the data is structured:
(dataA1, ....,dataAx),(dataB1,...,dataBx)
How can I alter my code so that PostgreSQL imports the data record by record, recognising the brackets?
Based on the PostgreSQL COPY documentation, I don't believe it allows row delimiters other than carriage returns and/or line feeds. I believe you'll need to process your file before importing. You can simply replace every ",(" with "\n(", then remove the remaining parentheses to make it a standard CSV format that COPY will happily consume.
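One way to wire that preprocessing into the COPY itself is COPY ... FROM PROGRAM, which requires superuser rights (or the pg_execute_server_program role) on the server. A sketch, assuming no field value itself contains "),(" and that perl is available where the server runs; the one-liner turns each record separator into a line break and strips the outer parentheses:

-- preprocess on the fly, then load as a normal CSV
COPY member_details
FROM PROGRAM 'perl -pe ''s/\),\(/\n/g; s/^\(//; s/\)$//'' /Users/etc/data/sample_dump.csv'
WITH DELIMITER ','
CSV;

Alternatively, run the same one-liner over the file once and point the original COPY at the cleaned-up output.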
Perhaps there's another method for PostgreSQL that would work too, but I haven't come across anything yet.

Registered Symbol not getting inserted as-is in table

I am working on Oracle 10gR2.
The character set for DB is as below:
NLS_NCHAR_CHARACTERSET AL16UTF16
NLS_CHARACTERSET AL32UTF8.
I am getting the data to be processed in TXT files. The first step in processing this data is creating external tables based on these flat files. One of the fields in the flat file (and the corresponding column in the DB) holds string data that contains ® (the registered symbol). This character is visible in the txt file, but when I check the external table, the character is stored as �.
I have changed the encoding of the IDE in which I view the query output to UTF-8.
The data type for the column is: COL NVARCHAR2(1000)
Can you suggest what could be causing this?
Generally this is caused by an incorrect setting of the NLS_LANG environment variable. NLS_LANG must tell Oracle the encoding you are using for your data. If NLS_LANG is unset, Oracle assumes ASCII text (and your symbol is non-ASCII).
If your data is UTF-8, try:
NLS_LANG=.AL32UTF8
For Windows/ISO, try:
NLS_LANG=.WE8ISO8859P15
You NEED to determine the encoding of your text file first. Use a hex editor to check whether the ® symbol is encoded as UTF-8 or not.
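Beyond NLS_LANG, an ORACLE_LOADER external table also lets you declare the encoding of the flat file directly in its access parameters, which avoids relying on the client environment. A sketch, with the directory, table, and column names as placeholders, assuming the txt file really is UTF-8:

-- hypothetical external table; the file's character set is declared in the access parameters
CREATE TABLE ext_products (
  col NVARCHAR2(1000)
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE CHARACTERSET AL32UTF8
    FIELDS TERMINATED BY '|'
  )
  LOCATION ('products.txt')
);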