Moving Data From SQL Server To Oracle -- Character vs Bytes

I am in the process of moving data from SQL Server to Oracle. I have to do this using C# code we have written. (Long story, but it has to do with corporate standards, so no SSIS or other utilities are allowed.)
My question concerns a field that is NVARCHAR(200) on SQL Server and NVARCHAR(200) in Oracle. I understand that on Oracle the 200 represents 200 bytes. How can I move data from SQL Server when the field has all 200 characters populated? The problem is that 200 characters in SQL Server take up more than 200 bytes.
In the process I am reading the data from SQL Server, storing it in a string array, and then using Array Binding (Oracle Data Access) to push the data to Oracle. It all works fine; however, when a field is fully populated in SQL Server, it holds more than the maximum number of bytes allowed for the same field definition in Oracle.
Is there an easy way to check the byte size of the string from SQL Server, see whether it is more than 200 bytes, and, if so, truncate it so that only 200 bytes are moved across? (For what I am doing, truncating the data is OK.)

"I understand that on Oracle the 200 represents 200 bytes."
That is true for VARCHAR2 (under the default byte length semantics) but not for NVARCHAR2.
Quote from the manual:
The NVARCHAR2 data type is a Unicode-only data type. When you create a table with an NVARCHAR2 column, you supply the maximum number of characters it can hold. [...] Width specifications of character data type NVARCHAR2 refer to the number of characters.
(Emphasis mine)
So for NVARCHAR2 you should be fine.
For VARCHAR2 you can indeed specify the length in bytes or characters, e.g. VARCHAR2(200 BYTE) or VARCHAR2(200 CHAR). A bare number such as VARCHAR2(200) uses the default length semantics (the NLS_LENGTH_SEMANTICS setting, BYTE unless changed), and that default can be changed at any time.
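If the target column does turn out to use byte semantics (for example a VARCHAR2(200 BYTE) column), the truncation asked about in the question could look roughly like the sketch below. It is written in Python for brevity and assumes the Oracle database character set is UTF-8 (AL32UTF8); the function name `truncate_to_bytes` is hypothetical. The same idea should carry over to C#, where `Encoding.UTF8.GetByteCount` reports the encoded size of a string.

```python
def truncate_to_bytes(text: str, max_bytes: int, encoding: str = "utf-8") -> str:
    """Return the longest prefix of `text` whose encoded form fits in `max_bytes`.

    Assumption: `encoding` matches the target database character set.
    """
    encoded = text.encode(encoding)
    if len(encoded) <= max_bytes:
        return text  # already fits, nothing to cut
    # Cut at the byte limit. The cut may land in the middle of a multibyte
    # character; errors="ignore" drops that incomplete trailing sequence
    # instead of raising, so the result is always valid text.
    return encoded[:max_bytes].decode(encoding, errors="ignore")
```

This cuts on character boundaries, not on user-perceived characters, so a base letter could still be separated from a following combining accent.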

Related

Query remote oracle CLOB data from MSSQL

I read different posts about this problem but it didn't help me with my problem.
I am on a local db (Microsoft SQL Server) and query data on remote db (ORACLE).
In this data, there is a CLOB type.
The CLOB column shows only 7 correct values; the others show <null>.
I tried CAST(DEQ_COMMENTAIRE_REFUS_IMPORT AS VARCHAR(4000)).
I tried SUBSTRING(DEQ_COMMENTAIRE_REFUS_IMPORT, 4000, 1).
Can you help me, please?
Thank you

Not MSSQL, but in my case we were pulling data from Oracle into MariaDB using the ODBC Connect engine.
For CLOBs, we did the following (in outline):
1. Create a PL/SQL function get_clob_chunk(clobin CLOB, chunkno NUMBER) RETURN VARCHAR2.
   This returns the specified nth chunk of 1,000 characters of the CLOB.
   We found 1,000 worked best with multibyte data. If the data is all plain single-byte text, then chunks of 4,000 are safe.
   Apologies for the absence of actual code, as I'm a bit rushed for time.
2. Create an Oracle VIEW which calls the get_clob_chunk function to split the CLOB into 1,000-character chunk columns chunk1, chunk2, ... chunkn, each CAST as VARCHAR2(1000).
   We found that Oracle did not like having more than 16 such columns, so we had to split the views into sets of 16 such columns.
   This means you must check the maximum size of the data in the CLOB so you know how many chunks/views you need. Doing this dynamically adds complexity, needless to say.
3. Create a view in MariaDB querying the Oracle view.
4. Create a table/view in MariaDB that joins the chunks up into a single Text column.
Note: in our case, we found that copying Text type columns between MariaDB databases using the ODBC Connect engine was also problematic, and required a similar splitting method.
Frankly, I'd rather use Java/C# for this.
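The chunk and view bookkeeping described above can be sketched in a few lines. This is an illustration in Python, not the poster's PL/SQL: the helper names are hypothetical, chunk numbers are assumed to start at 1, and the constants 1,000 and 16 are simply the values reported in the answer.

```python
import math

CHUNK_CHARS = 1000      # chunk size the answer found safe for multibyte data
COLUMNS_PER_VIEW = 16   # per-view column limit reported in the answer


def plan_chunks(max_clob_chars: int) -> tuple[int, int]:
    """Return (number of chunk columns, number of views) for the longest CLOB."""
    chunks = math.ceil(max_clob_chars / CHUNK_CHARS)
    views = math.ceil(chunks / COLUMNS_PER_VIEW)
    return chunks, views


def get_chunk(text: str, chunk_no: int) -> str:
    """Return the nth chunk (1-based, an assumption); empty past the end."""
    start = (chunk_no - 1) * CHUNK_CHARS
    return text[start:start + CHUNK_CHARS]


def reassemble(chunks: list[str]) -> str:
    """Join the chunk columns back into the original text."""
    return "".join(chunks)
```

For instance, a longest CLOB of 35,000 characters would need 35 chunk columns spread over 3 views under these assumptions.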

Data type equivalents: MS Access Tables ↔ 'CREATE TABLE' Queries ↔ ODBC SQL

What is the correct syntax when creating a table in Access with SQL? I have tried DECIMAL, DOUBLE, NUMBER, INT... nothing lets me create an integer category with limiters.
Example:
CREATE TABLE NONGAME (
ITEM_NUM CHAR(4) NOT NULL PRIMARY KEY,
DESCRIPTION CHAR(30),
ON_HAND NUMBER(4), <------- DOES NOT WORK!
CATEGORY CHAR(3),
PRICE DECIMAL(6,2), <------- DOES NOT WORK!
ANYTHING DOUBLE(4,2) <------- DOES NOT WORK!
);

MICROSOFT ACCESS DATA TYPES
The following table shows the Microsoft Access data types, data types used to create tables, and ODBC SQL data types. Some types have limitations, outlined following the table.
Microsoft Access data type        Data type (CREATE TABLE)  ODBC SQL data type
~~~~~~~~~~~~~~~~~~~~~~~~~~        ~~~~~~~~~~~~~~~~~~~~~~~~  ~~~~~~~~~~~~~~~~~~
BIGBINARY[1]                      LONGBINARY                SQL_LONGVARBINARY
BINARY                            BINARY                    SQL_BINARY
BIT                               BIT                       SQL_BIT
COUNTER                           COUNTER                   SQL_INTEGER
CURRENCY                          CURRENCY                  SQL_NUMERIC
DATE/TIME                         DATETIME                  SQL_TIMESTAMP
GUID                              GUID                      SQL_GUID
LONG BINARY                       LONGBINARY                SQL_LONGVARBINARY
LONG TEXT                         LONGTEXT                  SQL_LONGVARCHAR[2]
MEMO                              LONGTEXT                  SQL_LONGVARCHAR[2]
NUMBER (FieldSize= SINGLE)        SINGLE                    SQL_REAL
NUMBER (FieldSize= DOUBLE)        DOUBLE                    SQL_DOUBLE
NUMBER (FieldSize= BYTE)          UNSIGNED BYTE             SQL_TINYINT
NUMBER (FieldSize= INTEGER)       SHORT                     SQL_SMALLINT
NUMBER (FieldSize= LONG INTEGER)  LONG                      SQL_INTEGER
NUMERIC                           NUMERIC                   SQL_NUMERIC
OLE                               LONGBINARY                SQL_LONGVARBINARY
TEXT                              VARCHAR                   SQL_VARCHAR[1]
VARBINARY                         VARBINARY                 SQL_VARBINARY
[1] Access 4.0 applications only. Max 4000 B. Behaviour similar to LONGBINARY.
[2] ANSI applications only.
[3] Unicode and Access 4.0 applications only.
Note: SQLGetTypeInfo returns ODBC data types. It will not return all Microsoft Access data types if more than one Microsoft Access type is mapped to the same ODBC SQL data type. All conversions in Appendix D of the ODBC Programmer's Reference are supported for the SQL data types listed in the previous table.
Limitations on Microsoft Access data types
BINARY, VARBINARY, and VARCHAR: Creating a BINARY, VARBINARY, or VARCHAR column of zero or unspecified length actually returns a 510-byte column.
BYTE: Even though a Microsoft Access NUMBER field with a FieldSize equal to BYTE is unsigned, a negative number can be inserted into the field when using the Microsoft Access driver.
CHAR, LONGVARCHAR, and VARCHAR: A character string literal can contain any ANSI character (1-255 decimal). Use two consecutive single quotation marks ('') to represent one single quotation mark ('). Procedures should be used to pass character data when using any special character in a character data type column.
DATE: Date values must be either delimited according to the ODBC canonical date format or delimited by the datetime delimiter (#). Otherwise, Microsoft Access will treat the value as an arithmetic expression and will not raise a warning or error.
For example, the date "March 5, 1996" must be represented as {d '1996-03-05'} or #03/05/1996#; otherwise, if only 03/05/1996 is submitted, Microsoft Access will evaluate this as 3 ÷ 5 ÷ 1996. This value rounds to the integer 0, and since the zero day maps to 1899-12-31, this is the date used.
A pipe character (|) cannot be used in a date value, even if enclosed in back quotes.
GUID: Data type limited to Microsoft Access 4.0.
NUMERIC: Data type limited to Microsoft Access 4.0.
(More information at the Source)
Limitations on ODBC Desktop Driver data types
The Microsoft ODBC Desktop Database Drivers impose the following limitations on data types:
All data types: Type conversion failures might result in the affected column being set to NULL.
BINARY: Creating a zero-length BINARY column actually returns a 255-byte BINARY column.
DATE: The DATE data type cannot be converted to another data type (or itself) by the CONVERT function.
DECIMAL (Exact Numeric): Not supported.
Floating-Point Data Types: The number of decimal places in a floating-point number may be limited by the number format set in the International section of the Windows Control Panel.
NUMERIC: Supports maximum precision and a scale of 28.
TIMESTAMP: The TIMESTAMP data type cannot be converted to itself by the CONVERT function.
TINYINT: TINYINT values are always unsigned.
Zero-Length Strings: When a dBASE, Microsoft Excel, Paradox, or Text driver is used, inserting a zero-length string into a column actually inserts a NULL instead.
(Source)
More Information:
MSDN : Create and Delete Tables and Indexes Using Access SQL
MSDN : CREATE TABLE Statement (Microsoft Access SQL)
Microsoft Docs : Microsoft Access Data Types
Microsoft Docs : Data Type Limitations
Microsoft Docs : Converting between ODBC and SQL Server data types
Microsoft Docs : Limitations of SQL ODBC Desktop Drivers
Wikipedia : Open Database Connectivity (ODBC)
Small addendum by Erik:
You can actually use the Decimal data type in CREATE TABLE queries. However, this requires your statement to be executed either using ADO, or on a database that's been set to use ANSI-92 compatible syntax.
To set your database to ANSI-92 compatible syntax:
Go to File -> Options. Open the tab Object Designers. Go to Query Designer, and under SQL Server Compatible Syntax (ANSI 92), check This Database. Now you can just execute the query. Note that this affects all queries in the database, and affects queries in various ways.
To execute a query using ADO:
In the VBA Immediate Window, execute the following line:
CurrentProject.Connection.Execute "CREATE TABLE NONGAME (ITEM_NUM CHAR(4) NOT NULL PRIMARY KEY, PRICE DECIMAL(6,2));"
Of course, you can execute more complex queries using ADO.

DECIMAL and DOUBLE with size limiters cannot be used in Access's default query mode. For a "price", CURRENCY is the best bet. For my other integer column, I just used NUMBER and gave it no limiters.

Oracle : constrained by the 32K limit on a VARCHAR2 variable when send mail

I am trying to send HTML mail from an Oracle stored procedure using
SYS.UTL_MAIL.send
Unfortunately, the body is limited by the 32K VARCHAR2 length, which will be exceeded in a lot of scenarios.
What can I use instead of the above method to send such long mails?

Use a CLOB or NCLOB datatype:
The CLOB and NCLOB datatypes store up to 128 terabytes of character data in the database. CLOBs store database character set data, and NCLOBs store Unicode national character set data. Storing varying-width LOB data in a fixed-width Unicode character set internally enables Oracle Database to provide efficient character-based random access on CLOBs and NCLOBs.
There are multiple examples available on the internet of how to then send CLOB values in an e-mail (unfortunately none of them detail the license of the code so I won't cross post):
Ask Tom: How to send more than 32K message
Ask Tom: Sending e-mail! -- Oracle 8i specific response
Searching for "oracle utl_mail clob" will get lots more.
Or you can use JavaMail to create a function and upload it into the database with the loadjava utility (or CREATE JAVA SOURCE).
These solutions do not use UTL_MAIL - AskTom's response to this question is:
The interface to utl_mail for attachments accepts either a 32k RAW or 32k VARCHAR2.
(I.e. you can't but there are alternatives - see the links above)

difference between varchar(500) vs varchar(max) in sql server

I want to know the pros and cons of using varchar(500) vs varchar(max) in terms of performance, memory, and anything else to consider.
Will both use the same amount of storage space?
Does the answer differ between SQL Server 2000/2005/2008?

In SQL Server 2000 and SQL Server 7, a row cannot exceed roughly 8,000 bytes in size (8,060 bytes, to be precise). This means that a VARBINARY column can only store 8,000 bytes (assuming it is the only column in a table), a VARCHAR column can store up to 8,000 characters, and an NVARCHAR column can store up to 4,000 characters (2 bytes per Unicode character). This limitation stems from the 8 KB internal page size SQL Server uses to save data to disk.
To store more data in a single column, you needed to use the TEXT, NTEXT, or IMAGE data types (BLOBs) which are stored in a collection of 8 KB data pages that are separate from the data pages that store the other data in the same table. These data pages are arranged in a B-tree structure. BLOBs are hard to work with and manipulate. They cannot be used as variables in a procedure or a function and they cannot be used inside string functions such as REPLACE, CHARINDEX or SUBSTRING. In most cases, you have to use READTEXT, WRITETEXT, and UPDATETEXT commands to manipulate BLOBs.
To solve this problem, Microsoft introduced the VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types in SQL Server 2005. These data types can hold the same amount of data BLOBs can hold (2 GB) and they are stored in the same type of data pages used for other data types. When data in a MAX data type exceeds 8 KB, an over-flow page is used. SQL Server 2005 automatically assigns an over-flow indicator to the page and knows how to manipulate data rows the same way it manipulates other data types. You can declare variables of MAX data types inside a stored procedure or function and even pass them as variables. You can also use them inside string functions.
Microsoft recommends using MAX data types instead of BLOBs in SQL Server 2005. In fact, BLOBs are slated for deprecation in future releases of SQL Server.
Credit: http://www.teratrax.com/articles/varchar_max.html
In SQL Server 2005 and SQL Server 2008, the maximum storage size for VARCHAR(MAX) is 2^31-1 bytes (2,147,483,647 bytes, or 2 GB minus 1 byte). The storage size is the actual length of the data entered + 2 bytes. The data entered can be 0 characters in length. Since each character in a VARCHAR data type uses one byte, the maximum length for a VARCHAR(MAX) data type is 2,147,483,645.
A full, interesting read: http://www.sql-server-helper.com/faq/sql-server-2005-varchar-max-p01.aspx
Reference: http://msdn.microsoft.com/en-us/library/ms143432.aspx
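The arithmetic behind the figures quoted in this answer can be checked with a short sketch. The constant and function names are hypothetical, and the 2-byte overhead and the 8,000-byte in-row limit are taken from the text above rather than verified independently.

```python
MAX_STORAGE_BYTES = 2**31 - 1   # quoted maximum storage size for VARCHAR(MAX)
LENGTH_OVERHEAD_BYTES = 2       # "actual length of data entered + 2 bytes"
IN_ROW_LIMIT_BYTES = 8000       # quoted per-column limit before SQL Server 2005


def max_varchar_max_length() -> int:
    """Longest VARCHAR(MAX) value at one byte per character."""
    return MAX_STORAGE_BYTES - LENGTH_OVERHEAD_BYTES


def max_in_row_chars(bytes_per_char: int) -> int:
    """Characters that fit in the in-row limit: 1 byte for VARCHAR, 2 for NVARCHAR."""
    return IN_ROW_LIMIT_BYTES // bytes_per_char
```

With these inputs, `max_varchar_max_length()` gives 2,147,483,645, and `max_in_row_chars(2)` gives the 4,000-character NVARCHAR limit mentioned earlier.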

A VARCHAR(MAX) column will accept a value of 501 characters or more whereas a VARCHAR(500) column will not. So if you have a business rule that restricts a value to 500 characters, VARCHAR(500) will be more appropriate.

Why is maximum length of varchar less than 8,000 bytes?

So I have a stored procedure in a SQL Server 2005 database, which retrieves data from a table, formats the data as a string, and puts it into a varchar(max) output variable.
However, I notice that although len(s) reports the string to be > 8,000, the actual string I receive (via the SQL Server output window) is always truncated to < 8,000 bytes.
Does anybody know what the cause of this might be? Many thanks.

The output window itself is truncating your data, most likely. The variable itself holds the data but the window is showing only the first X characters.
If you were to read that output variable from, for instance, a .NET application, you'd see the full value.

Are you talking about SQL Server Management Studio? If so, there are some options to control how many characters are returned. I only have 2008 in front of me, but the settings are in Tools|Options|Query Results|SQL Server|Results to Grid|Maximum Characters Retrieved and Results to Text|Maximum number of characters displayed in each column.

The data is all there, but management studio isn't displaying all of the data.
In cases like this, I've used MS Access to link to the table and read the data. It's sad that you have to use Access to view the data instead of Management Studio or Query Analyzer, but that seems to be the case.

However, I notice that although len(s) reports the string to be > 8,000
I have fallen for the SQL Studio issue too :) but isn't the maximum length of varchar 8,000 bytes, or 4,000 characters for nvarchar (Unicode)?
Any chance the column data type is actually text or ntext and you're converting to varchar?
