How to store a pandas DataFrame in a database using to_sql

I am trying to store a DataFrame in an Oracle table using the code below.
The data is inserted successfully if I omit dtype={'PN': types.VARCHAR}:
merged.to_sql('table1', conn, if_exists='append', index=False, dtype={'PN': types.VARCHAR})
With the dtype argument included, it throws:
sqlalchemy.exc.OperationalError: (cx_Oracle.OperationalError) ORA-00604: error occurred at recursive SQL level 1
ORA-06502: PL/SQL: numeric or value error
ORA-06512: at line 13
ORA-00906: missing left parenthesis
[SQL:
CREATE TABLE tabl1(
"PN" VARCHAR,
"DT" DATE,
"COL1" FLOAT,
"COL2" NUMBER(19),
"COL3" NUMBER(19),
"COL4" FLOAT,
"COL5" FLOAT,
"COL6" FLOAT
)
]

Oracle expects a length for a VARCHAR column in a CREATE TABLE DDL statement. As suggested by Gord, providing a length in parentheses (for example, a value between 1 and 255) will solve the issue. You can try dtype={'PN': types.VARCHAR(255)}.
If you want to know what happens in the database, I have reproduced the issue on dbfiddle (Oracle 18c Express Edition), which you can check out:
https://dbfiddle.uk/?rdbms=oracle_18&fiddle=9362757cfcb1cfc3052425190367d3d8
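As a rough way to see what changes when a length is supplied, the sketch below compiles the CREATE TABLE statement for the Oracle dialect without connecting to any database. It assumes SQLAlchemy is installed; the helper name oracle_ddl and the single-column table are made up for illustration, and the exact rendering of the type may differ between SQLAlchemy versions.

```python
from sqlalchemy import Column, MetaData, Table, types
from sqlalchemy.dialects import oracle
from sqlalchemy.schema import CreateTable


def oracle_ddl(pn_type):
    """Return the CREATE TABLE text SQLAlchemy would emit for Oracle.

    Hypothetical helper: builds a one-column table named like the one in
    the question and compiles it offline (no Oracle driver is needed).
    """
    table = Table("table1", MetaData(), Column("PN", pn_type))
    return str(CreateTable(table).compile(dialect=oracle.dialect()))


# VARCHAR with no length, as in the failing call
ddl_without_length = oracle_ddl(types.VARCHAR)

# VARCHAR with an explicit length, as suggested in the answer
ddl_with_length = oracle_ddl(types.VARCHAR(255))

print(ddl_without_length)
print(ddl_with_length)
```

The first statement should show a bare VARCHAR, matching the DDL in the error message; the second carries the length that Oracle requires.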
I wouldn't recommend using VARCHAR for storing alphanumeric data (use VARCHAR2 instead), for the following reasons:
Comparison Semantics: Use the CHAR datatype when you require ANSI compatibility in comparison semantics, that is, when trailing blanks are not important in string comparisons. Use the VARCHAR2 when trailing blanks are important in string comparisons.
Space Usage: To store data more efficiently, use the VARCHAR2 datatype. The CHAR datatype blank-pads and stores trailing blanks up to a fixed column length for all column values, while the VARCHAR2 datatype does not blank-pad or store trailing blanks for column values.
Future Compatibility: The CHAR and VARCHAR2 datatypes are and will always be fully supported. At this time, the VARCHAR datatype automatically corresponds to the VARCHAR2 datatype and is reserved for future use.

Related

Does altering column type corrupt the column's existing data?

I am trying to change a column's datatype. The column, of type VARCHAR, has thousands of GUID values like those shown below:
b1f4ff32-48d4-494e-a32c-044014cea9
bc5a1158-b310-49ff-a1f3-09d4f8707f69
4b7ebc9d-9fa1-42d9-811e-0b7b4b7297a
fc7ba848-98ea-4bc6-add7-11f0ee9c6917a21
485741ff-2ab2-4705-91b3-136389948b7c
I need to convert the column type to uniqueidentifier using the script below. Can I do that safely without corrupting the column data?
alter table MyTable
alter column guidColumn uniqueidentifier not null
If you change the data type, SQL Server will first check whether all the values in the column can be implicitly converted to the new data type; if they cannot, the ALTER will fail. If they can, they will be implicitly converted and the ALTER will succeed (assuming no dependencies, of course).
For a uniqueidentifier, a value is either valid or it is not, so either all the data will convert or the ALTER won't take place. For something like a date and time data type, however, you could very easily end up with incorrect data if the data is stored in an ambiguous format like dd/MM/yyyy. This could mean a value like '12/05/2022' ends up being stored as the date value 2022-12-05 rather than 2022-05-12. For such scenarios you would therefore want to UPDATE the data to an unambiguous format first, and then ALTER the data type of the column.
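The date ambiguity described here is easy to reproduce outside the database. The following is only an illustration in Python, not what SQL Server runs internally: the same string is parsed under both day-first and month-first conventions, and then rewritten in the compact yyyyMMdd form, which is assumed here to be the unambiguous target format for the UPDATE step.

```python
from datetime import datetime

raw = "12/05/2022"

# The same text read under two different conventions
as_day_first = datetime.strptime(raw, "%d/%m/%Y").date()
as_month_first = datetime.strptime(raw, "%m/%d/%Y").date()

print(as_day_first.isoformat())    # the 12th of May
print(as_month_first.isoformat())  # the 5th of December

# Rewriting the value in an unambiguous form before changing the type
# (assumes the stored data is known to be day-first)
unambiguous = as_day_first.strftime("%Y%m%d")
print(unambiguous)
```

Both parses succeed, which is exactly the problem: nothing fails, the value is just silently read as a different date.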
The uniqueidentifier type is considered a character type for the purposes of conversion from a character expression, and therefore is subject to the truncation rules for converting to a character type.
There is also a limitation: the uniqueidentifier type is limited to 36 characters.
So if you convert a longer string, as in this example, the value is truncated:
DECLARE @ID NVARCHAR(max) = N'0E984725-C51C-4BF4-9960-E1C80E27ABA0wrong';
SELECT @ID, CONVERT(uniqueidentifier, @ID) AS TruncatedValue;
This will be the result:
| String | TruncatedValue |
|--------|----------------|
| 0E984725-C51C-4BF4-9960-E1C80E27ABA0wrong | 0E984725-C51C-4BF4-9960-E1C80E27ABA0 |
So a string longer than 36 characters is silently cut down to its first 36 characters, and a string shorter than 36 characters cannot be converted at all.
For more information check Microsoft documentation:
https://learn.microsoft.com/en-us/sql/t-sql/data-types/uniqueidentifier-transact-sql?view=sql-server-ver15
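Before running the ALTER, it can help to screen the existing values. The sketch below is a client-side approximation in Python, not SQL Server's own conversion logic: it assumes the canonical 8-4-4-4-12 hexadecimal layout, ignores brace-wrapped forms, and treats a longer string as "truncated" when its first 36 characters form a valid GUID, mirroring the CONVERT example above. The function name classify and the three labels are invented for this illustration.

```python
import re

# Canonical 8-4-4-4-12 hexadecimal layout (36 characters in total)
GUID_PATTERN = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def classify(value: str) -> str:
    """Label a string as 'valid', 'truncated' or 'invalid' (hypothetical labels)."""
    if GUID_PATTERN.fullmatch(value):
        return "valid"
    # Longer strings: only the first 36 characters would be kept
    if len(value) > 36 and GUID_PATTERN.fullmatch(value[:36]):
        return "truncated"
    return "invalid"


# The sample values from the question
samples = [
    "b1f4ff32-48d4-494e-a32c-044014cea9",
    "bc5a1158-b310-49ff-a1f3-09d4f8707f69",
    "4b7ebc9d-9fa1-42d9-811e-0b7b4b7297a",
    "fc7ba848-98ea-4bc6-add7-11f0ee9c6917a21",
    "485741ff-2ab2-4705-91b3-136389948b7c",
]

results = {value: classify(value) for value in samples}
for value, verdict in results.items():
    print(f"{value:<42} {verdict}")
```

Values labelled "invalid" are the ones that would make the ALTER fail; values labelled "truncated" deserve a manual look, because the stored GUID would differ from the original text.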

Convert ntext to numeric

One of my columns is defined with the ntext datatype, which is no longer supported by many streams. I'm trying to convert that column to numeric or int, but every attempt gets nowhere.
reg_no #my ntext field | name | country | etc |
I tried to alter the column using the usual command, but it failed:
alter table tabl1
alter column [reg_no] numeric(38,0)
Error:
Error converting data type nvarchar to numeric
Any suggestions on fixing this? Has anyone come across this in the past, and if so, how did you get past it?
You should be able to do this in two steps:
alter table tabl1 alter column [reg_no] nvarchar(max);
alter table tabl1 alter column [reg_no] numeric(38,0);
ntext is deprecated and conversion to numeric is not supported, but converting to nvarchar() is supported.
This assumes that the values are compatible with numeric. Otherwise you will get a type conversion error. If this happens, you can get the offending values by using:
select *
from t
where try_convert(numeric(38, 0), try_convert(nvarchar(max), x)) is null
Try,
select convert(int, convert(varchar(40), reg_no)) as newfieldname from tabl1

Nvarchar working with logical operator working?

Just need your help here.
I have a table T
A (nvarchar) B()
--------------------------
'abcd'
'xyzxcz'
B should output length of entries in A for which I did
UPDATE T
SET B = LEN(A) -- I know LEN function returns int
But when I checked out the datatype of B using sp_help T, it showed column B as nvarchar.
What's going on ?
select A
from T
where B > 100
also returned correct output?
Why is nvarchar working with logical operators ?
Please help.
Check https://learn.microsoft.com/en-us/sql/t-sql/data-types/data-type-conversion-database-engine?view=sql-server-2017 where it says that data types are converted explicitly or implicitly when you move, compare or store a variable. In your case, you are comparing column B with 100, forcing SQL Server to implicitly convert it to an integer type (check the picture about conversions on the same page). As proof, try updating a row to put some text in column B; when you then repeat your select query with B > 100, SQL Server will throw a conversion error when it tries to obtain an integer from your text.
It works because of implicit conversion between types.
Data type precedence
When an operator combines expressions of different data types, the data type with the lower precedence is first converted to the data type with the higher precedence. If the conversion isn't a supported implicit conversion, an error is returned.
Types precedence:
16. int
...
25. nvarchar (including nvarchar(max) )
In your example:
select A
from T
where B > 100
--nvarchar and int (B is implicitly cast to INT)
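The precedence rule can be mimicked with a toy model. Python does not convert types implicitly, so the function below (a made-up name, implicit_greater_than) performs by hand the step described above: the lower-precedence string operand is converted to an integer first, and only then compared. It is a sketch of the rule, not of SQL Server's actual implementation.

```python
def implicit_greater_than(nvarchar_value: str, int_value: int) -> bool:
    """Toy model: convert the string operand to int, then compare."""
    return int(nvarchar_value) > int_value


# Compared as numbers, which is what the precedence rule leads to
numeric_result = implicit_greater_than("99", 100)

# Compared as text, which is what would happen if 100 became a string instead
text_result = "99" > "100"

print(numeric_result, text_result)

# Text that is not a number cannot be converted, so the comparison fails
try:
    implicit_greater_than("abcd", 100)
    conversion_failed = False
except ValueError:
    conversion_failed = True

print(conversion_failed)
```

The two comparisons disagree for "99", which shows why the direction of the conversion matters, and the failure on "abcd" corresponds to the conversion error mentioned in the first answer.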
When you add a column to a table in SSMS without specifying a datatype, a "default" datatype is chosen; for me, on SQL Server 2017 Developer edition, it's nchar(10). If you want the column to be int, define it with a datatype of int. In T-SQL it would be:
create table T (
A nvarchar --nvarchar without a size defaults to nvarchar(1); for me sp_help reports its length as 2 (bytes)
,B int
);
sp_help T
--to set a specific size: the largest for nvarchar is 4000, or max; max is the replacement for the old ntext
create table Tmax (
A nvarchar(max)
,B int
);
--understanding nvarchar and varchar for len() and datalength()
select
datalength(N'wibble') datalength_nvarchar -- nvarchar is unicode and uses 2 bytes per char, so 12
,datalength('wibble') datalength_varchar -- varchar uses 1 byte per so 6
,len(N'wibble') len_nvarchar -- count of chars, so 6
,len('wibble') len_varchar -- count of char so still 6
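The same distinction between character count and byte count can be checked in Python. This is only an analogy: it assumes nvarchar data is stored as UTF-16 (little-endian) and that the varchar column uses a single-byte code page, represented here by Latin-1. The helper names are invented for the example.

```python
def char_length(text: str) -> int:
    """Number of characters, comparable to LEN()."""
    return len(text)


def byte_length(text: str, encoding: str) -> int:
    """Number of bytes once encoded, comparable to DATALENGTH()."""
    return len(text.encode(encoding))


word = "wibble"

nvarchar_bytes = byte_length(word, "utf-16-le")  # 2 bytes per character
varchar_bytes = byte_length(word, "latin-1")     # 1 byte per character
characters = char_length(word)                   # same for both types

print(nvarchar_bytes, varchar_bytes, characters)
```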
nvarchar(max) and varchar(max)
hope this helps, the question is a bit discombobulated

How to save Russian character in Oracle Database [duplicate]

I have a database with one column of the type nvarchar. If I write
INSERT INTO table VALUES ("玄真")
It shows ¿¿ in the table. What should I do?
I'm using SQL Developer.
Use single quotes, rather than double quotes, to create a text literal, and for an NVARCHAR2/NCHAR text literal you need to prefix it with N.
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE table_name ( value NVARCHAR2(20) );
INSERT INTO table_name VALUES (N'玄真');
Query 1:
SELECT * FROM table_name
Results:
| VALUE |
|-------|
| 玄真 |
First, using NVARCHAR might not even be necessary.
The 'N' character data types are for storing data that doesn't 'fit' in the database's defined character set. There's an auxiliary character set defined as the NCHAR Character set. It's kind of a band aid - once you create a database it can be difficult to change its character set. Moral of this story - take great care in defining the Character Set when creating your database and do not just accept the defaults.
Here's a scenario (LiveSQL) where we're storing a Chinese string in both NVARCHAR and VARCHAR2.
CREATE TABLE SO_CHINESE ( value1 NVARCHAR2(20), value2 varchar2(20 char));
INSERT INTO SO_CHINESE VALUES (N'玄真', '我很高興谷歌翻譯。' );
select * from SO_CHINESE;
Note that both character sets are in the Unicode family. Note also that I told my VARCHAR2 column to hold 20 characters. That's because some characters may require up to 4 bytes to be stored. A definition of (20), which uses byte semantics by default, would give you room to store only 5 of those characters.
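The difference between character and byte limits can be estimated on the client side. The sketch below assumes the database character set behaves like UTF-8 (as AL32UTF8 does) and uses a hypothetical helper, fits, to compare a string against a limit under either semantics; it approximates the rule and does not query Oracle.

```python
def fits(value: str, limit: int, semantics: str) -> bool:
    """Check a string against a column limit (hypothetical helper).

    semantics is "CHAR" (count characters) or "BYTE" (count UTF-8 bytes).
    """
    if semantics == "CHAR":
        return len(value) <= limit
    if semantics == "BYTE":
        return len(value.encode("utf-8")) <= limit
    raise ValueError("semantics must be 'CHAR' or 'BYTE'")


sentence = "我很高興谷歌翻譯。"

sentence_chars = len(sentence)
sentence_bytes = len(sentence.encode("utf-8"))
print(sentence_chars, sentence_bytes)

# The same string against a limit of 20 under each semantics
print(fits(sentence, 20, "CHAR"), fits(sentence, 20, "BYTE"))

# A character outside the Basic Multilingual Plane needs 4 bytes in UTF-8
four_byte_char = "\U00020BB7"
print(len(four_byte_char.encode("utf-8")))
print(fits(four_byte_char * 5, 20, "BYTE"), fits(four_byte_char * 6, 20, "BYTE"))
```

Under these assumptions the sample sentence fits a 20-character limit but not a 20-byte one, which is why the column was declared with the char qualifier.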
Let's look at the same scenario using SQL Developer and my local database.
And to confirm the character sets:
SQL> clear screen
SQL> set echo on
SQL> set sqlformat ansiconsole
SQL> select *
  from database_properties
  where PROPERTY_NAME in
  ('NLS_CHARACTERSET',
  'NLS_NCHAR_CHARACTERSET');
| PROPERTY_NAME | PROPERTY_VALUE | DESCRIPTION |
|---------------|----------------|-------------|
| NLS_NCHAR_CHARACTERSET | AL16UTF16 | NCHAR Character set |
| NLS_CHARACTERSET | AL32UTF8 | Character set |
First of all, you should set up a Chinese character encoding on your database, for example:
UTF-8, Chinese_Hong_Kong_Stroke_90_BIN, Chinese_PRC_90_BIN, Chinese_Simplified_Pinyin_100_BIN ...
I'll show an example with SQL Server 2008 (Management Studio), which includes all of these collations; however, you can find the same character encodings in other databases (MySQL, SQLite, MongoDB, MariaDB...).
Create a database with Chinese_PRC_90_BIN (you can choose another collation):
Select a Page (Left Header) Options > Collation > Choose the Collation
Create a Table with the same Collation:
Execute the Insert Statement
INSERT INTO ChineseTable VALUES ('玄真');

Create table in SAS using DB2 timestamp

We've recently gotten the accelerator (IDAA) working on our DB2, which I mainly access using SAS.
This requires us, due to network issues, to create tables first, before inserting rows.
My problem is creating a table with the correct timestamp format. I can create the table using a select statement, but this is very slow; when I do, I can see that the format in SAS is DATETIME30.6.
But if I try something like:
RSUBMIT prod_acc;
Proc delete data=user.table1; run; %PUT &sqlxrc &sqlxmsg;
proc sql inobs=MAX stimer feedback noerrorstop;
connect to db2(ssid=server);
create table user.table1
(
date datetime30.6
,reference char(16)
,transact char(20)
,alias char(60)
,amount decimal(15,2)
,currency char(3)
);
%PUT &sqlxrc &sqlxmsg;
quit;
run;
This gives the following in the log:
(
15 date datetime30.6
-----------
1 22
200
WARNING 1-322: Assuming the symbol DATE was misspelled as datetime30.
ERROR 22-322: Syntax error, expecting one of the following: a quoted string,
an integer constant, ), ',', CHECK, DISTINCT, FORMAT, INFORMAT, LABEL, LEN,
LENGTH, NOT, PRIMARY, REFERENCES, TRANSCODE, UNIQUE, ^, ~.
ERROR 200-322: The symbol is not recognized and will be ignored.
And if I look in DB2, the column has the type timestmp, which SAS doesn't recognize as a type:
(
31 date timestmp
--------
22
76
ERROR 22-322: Syntax error, expecting one of the following: CHAR, CHARACTER, DATE, DEC,
DECIMAL, DOUBLE, FLOAT, INT, INTEGER, NUM, NUMERIC, REAL, SMALLINT, VARCHAR.
ERROR 76-322: Syntax error, statement will be ignored.
I tried googling and found a lot of different answers, but nothing that I can see is relevant to this. The closest was something about manually creating the format, but I can't figure out how to do that.
Any ideas?
It is probably more natural in SAS to define a table's structure using a DATA step rather than PROC SQL.
data userdb.table1;
stop;
length date 8 reference $16 transact $20 alias $60 amount 8 currency $3 ;
format date datetime30.6 amount 15.2 ;
run;
If your libref is pointing to a database, then you should be able to use the DBTYPE= dataset option to tell SAS what data types to use for your fields in the external database. At least it works for Teradata. These dataset options should also work inside PROC SQL.
proc delete data=userdb.table1; run;
data userdb.table1
(dbtype=
( date='timestamp'
reference='varchar(16)'
transact='varchar(20)'
alias='varchar(60)'
amount='decimal(15,2)'
currency='char(3)'
)
);
stop;
length date 8 reference $16 transact $20 alias $60 amount 8 currency $3 ;
format date datetime30.6 amount 15.2 ;
run;
Can't you just:
create table user.table1
(
"date" TIMESTAMP(6)
,reference char(16)
,transact char(20)
,alias char(60)
,amount decimal(15,2)
,currency char(3)
);
Remember that in DB2, date is a reserved word, so it's always safe to put it in double quotes. Alternatively, use a non-reserved word for the column name, such as dt.
These two lines are incongruous:
connect to db2(ssid=server);
create table user.table1
The first creates a connection for a pass-through query, while the latter creates the table using the libname engine. In this case your first statement is irrelevant as it's not used; you should remove it (unless you use it later and just left it in by mistake in your example).
Since you used the libname syntax, you must follow SAS syntax rather than DB2. There is no specification for datetime type in the create table statement, specifically under the column-definition documentation page. Instead you have this list to choose from:
CHARACTER | VARCHAR <(width)> indicates a character column with a column width of width. The default column width is eight characters.
INTEGER | SMALLINT indicates an integer column.
DECIMAL | NUMERIC | FLOAT <(width<, ndec>)> indicates a floating-point column with a column width of width and ndec decimal places.
REAL | DOUBLE PRECISION indicates a floating-point column.
DATE indicates a date column.
The way I find best to specify datetime (meaning, most likely to work as you expect) is not to use date but numeric, and then use the format argument to define it as datetime.
proc sql;
create table table1
( date num format=datetime30.6
,reference char(16)
,transact char(20)
,alias char(60)
,amount decimal(15,2)
,currency char(3)
);
quit;
However, I would suggest your best choice is to use passthrough to create the table, so you can use DB2 syntax - since you're creating a table there, not in SAS itself.
