pandas read_sql Imports NULLs as 0s

Edit:
Apparently my connection (pyodbc) is reading the NULLs in as 0s. I'd like to correct this at a "higher" level, since I'll be reading in lots of SQL files, but I can't find anything in the documentation to prevent it. Any suggestions?
I am importing a SQL file/query as a string and then using pd.read_sql to query my database. In SQL Server I can do calculations between NULL values, which result in a NULL value; in Python, however, the same query errors out my script.
Query run in SQL Server:
SELECT PurchaseOrderID, POTotalQtyVouched, NewColumn = PurchaseOrderID/POTotalQtyVouched FROM #POQtys;
Here is my desired output after running the query (which works in SQL Server):
PurchaseOrderID  POTotalQtyVouched  NewColumn
NULL             NULL               NULL
007004           8                  875.5
008017           21                 381.761904761905
008478           NULL               NULL
Running the query in Python:
import pandas as pd
import pyodbc

query = '''
...
[Other code defining #POQtys]
...
SELECT PurchaseOrderID, POTotalQtyVouched, NewColumn = PurchaseOrderID/POTotalQtyVouched FROM #POQtys;
'''

conn = pyodbc.connect('Driver={ODBC Driver 17 for SQL Server};'
                      'Server=MSS-SL-SQL;'
                      'Database=TRACE DB;'
                      'Trusted_Connection=yes;')

df = pd.read_sql(query, conn)
Error in Python:
DataError: ('22012', '[22012] [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]Divide by zero error encountered. (8134) (SQLFetch)')
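If the zeros really are coming from the data rather than from pyodbc, one way to sidestep the fetch-time failure is to guard the division in the query itself. A minimal sketch, assuming the same #POQtys temp table as above: NULLIF turns a zero denominator into NULL, and SQL Server propagates NULL through the arithmetic, matching the desired output shown earlier.
SELECT PurchaseOrderID,
       POTotalQtyVouched,
       -- NULLIF(x, 0) returns NULL when x = 0, so the division yields NULL
       -- instead of raising error 8134 while the driver fetches rows
       NewColumn = PurchaseOrderID / NULLIF(POTotalQtyVouched, 0)
FROM #POQtys;
pd.read_sql should then surface those NULLs as NaN in the DataFrame instead of aborting with the DataError.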

Numeric Value out of Range when Inserting to SQL from R

Edit: Here is the column's data for you to try inserting into your own database: https://easyupload.io/jls3mk
So I narrowed my problem down to one column in my data frame. It's a numeric column ranging from 0 to 260000, with NaNs in it.
When I try to insert pred_new_export[46] (only column 46) using this statement:
dbWriteTable(conn = con,
             name = SQL("ML.CreditLineApplicationOutputTemp"),
             value = pred_new_export[46],  # insert only column 46
             overwrite = TRUE)
I get this error:
Error in result_insert_dataframe(rs@ptr, values, batch_rows) :
  nanodbc/nanodbc.cpp:1655: 22003: [Microsoft][SQL Server Native Client 11.0]Numeric value out of range
I've been looking at this for two hours and it's driving me insane. I can't figure out why it won't insert into a fresh SQL table. The column only contains numbers, and the numbers are within the range of the column.
This is the SQL schema create statement.
USE [EDWAnalytics]
GO
/****** Object: Table [ML].[CreditLineApplicationOutputTemp] Script Date: 4/20/2022 9:26:22 AM ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [ML].[CreditLineApplicationOutputTemp](
[MedianIncomeInAgeBracket] [float] NULL
) ON [PRIMARY]
GO
You said it has NaNs, which many DBMSes do not understand. I suggest you replace all NaN with NA.
Reprex:
# con <- DBI::dbConnect(..)
DBI::dbExecute(con, "create table quux (num float)")
# [1] 0
df <- data.frame(num=c(1,NA,NaN))
DBI::dbAppendTable(con, "quux", df)
# Error in result_insert_dataframe(rs@ptr, values, batch_rows) :
#   nanodbc/nanodbc.cpp:1655: 42000: [Microsoft][ODBC Driver 17 for SQL Server][SQL Server]The incoming tabular data stream (TDS) remote procedure call (RPC) protocol stream is incorrect. Parameter 1 (""): The supplied value is not a valid instance of data type float. Check the source data for invalid values. An example of an invalid value is data of numeric type with scale greater than precision.
df$num[is.nan(df$num)] <- NA
DBI::dbAppendTable(con, "quux", df)
DBI::dbGetQuery(con, "select * from quux")
# num
# 1 1
# 2 NA
# 3 NA
FYI, the ODBC driver you are using (SQL Server Native Client 11.0) is rather antiquated: even the most recent release of version 11 was in 2017. For many reasons, I suggest you upgrade to ODBC Driver 17 for SQL Server (the 17 has nothing to do with the version of SQL Server to which you are connecting).
FYI, my DBMS/version:
cat(DBI::dbGetQuery(con, "select @@version")[[1]], "\n")
# Microsoft SQL Server 2019 (RTM-CU14) (KB5007182) - 15.0.4188.2 (X64)
# Nov 3 2021 19:19:51
# Copyright (C) 2019 Microsoft Corporation
# Developer Edition (64-bit) on Linux (Ubuntu 20.04.3 LTS) <X64>
though this is also the case with SQL Server 2016 (and likely other versions).

Importing OLAP metadata in SQL Server via linked server results in out-of-range date

Currently, I am trying to extract metadata from an OLAP cube in SQL Server (via a linked server) using this simple query:
select *
into [dbo].[columns_metadata]
from openquery([LINKED_SERVER], '
select *
from $System.TMSCHEMA_COLUMNS
')
But in the result set, there is a column named RefreshedTime with values 31.12.1699 00:00:00.
Because of this value, the query gives this error message:
Msg 8114, Level 16, State 9, Line 1
Error converting data type (null) to datetime.
The problem is that I need to run the query without specifying the columns in the SELECT statement.
Do you know a trick to avoid this error?
I know you didn't want to have to list the columns explicitly, but in case nobody can suggest a way to handle the 1699-12-31 dates, you can fall back to this:
select *
into [dbo].[columns_metadata]
from openquery([LINKED_SERVER], '
SELECT [ID]
,[TableID]
,[ExplicitName]
,[InferredName]
,[ExplicitDataType]
,[InferredDataType]
,[DataCategory]
,[Description]
,[IsHidden]
,[State]
,[IsUnique]
,[IsKey]
,[IsNullable]
,[Alignment]
,[TableDetailPosition]
,[IsDefaultLabel]
,[IsDefaultImage]
,[SummarizeBy]
,[ColumnStorageID]
,[Type]
,[SourceColumn]
,[ColumnOriginID]
,[Expression]
,[FormatString]
,[IsAvailableInMDX]
,[SortByColumnID]
,[AttributeHierarchyID]
,[ModifiedTime]
,[StructureModifiedTime]
,CStr([RefreshedTime]) as [RefreshedTime]
,[SystemFlags]
,[KeepUniqueRows]
,[DisplayOrdinal]
,[ErrorMessage]
,[SourceProviderType]
,[DisplayFolder]
from $System.TMSCHEMA_COLUMNS
')
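The reason the 31.12.1699 timestamps break the import is that SQL Server's datetime type only covers 1753-01-01 through 9999-12-31, while datetime2 goes back to 0001-01-01; converting RefreshedTime to a string inside the OPENQUERY, as above, keeps the out-of-range value away from datetime altogether. A small sketch, run on the SQL Server side rather than against the cube, illustrating the range difference:
-- TRY_CONVERT returns NULL when the conversion fails instead of raising an error
SELECT TRY_CONVERT(datetime,  '1699-12-31') AS as_datetime,   -- NULL: below datetime's 1753-01-01 minimum
       TRY_CONVERT(datetime2, '1699-12-31') AS as_datetime2;  -- 1699-12-31 00:00:00.0000000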

Is it normal for Laravel "wherenotin" statement to include null values?

I'm using Laravel 5.6.7 with the SQL server driver connecting to SQL Server 2014.
When attempting the following query:
Warrant::where('vsStatus',632)
->whereNotIn('vsType',[-26927,-26929])
->whereNotIn('vsCategory',[-15723,-21708,-21708])
->where('Warrant.ORI','XXXXXXXXX')
->join('GlobalJacket','GlobalJacket.JacketID','=','Warrant.JacketID')
->where('GlobalJacket.JacketType','=',2)->get();
I am receiving back two results: one where vsType = -19325 and one where vsType = NULL.
This is the query seen by the debugbar:
select * from [Warrant] inner join [GlobalJacket] on [GlobalJacket].[JacketID] = [Warrant].[JacketID] where [vsStatus] = '632' and [vsType] not in ('-26927', '-26929') and [vsCategory] not in ('-15723', '-21708', '-21708') and [Warrant].[ORI] = 'XXXXXXXXX' and [GlobalJacket].[JacketType] = '2'
And when pasting that directly into Management Studio 17, I get only the one result where vsType = -19325.
Now I could easily include the null value in the whereNotIn value list, but is this by design? All things considered, NULL is not in the list, so why would SSMS ignore the NULL row while Eloquent doesn't?
Thanks for the help.
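For reference, here is a minimal T-SQL sketch (the temp table and its values are made up for the demonstration) of how NOT IN interacts with NULL under three-valued logic; note that the debugbar output above is the interpolated form of the query, which is not necessarily byte-for-byte what Eloquent actually sends to the server.
-- Hypothetical data: one matching value, one NULL, one excluded value
CREATE TABLE #Demo (vsType INT NULL);
INSERT INTO #Demo (vsType) VALUES (-19325), (NULL), (-26927);
-- NULL NOT IN (...) evaluates to UNKNOWN, so the NULL row is filtered out;
-- this returns only -19325, matching what Management Studio shows
SELECT * FROM #Demo WHERE vsType NOT IN (-26927, -26929);
-- to deliberately keep NULL rows, test for them explicitly
SELECT * FROM #Demo WHERE vsType NOT IN (-26927, -26929) OR vsType IS NULL;
DROP TABLE #Demo;
On the Eloquent side, the query builder's whereNull / orWhereNull methods are the usual way to make that intent explicit.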

Saved data frame is not shown correctly in SQL Server

I have a data frame named distTest which has columns containing UTF-8 text. I want to save distTest as a table in my SQL database. My code is as follows:
library(RODBC)
load("distTest.RData")
Sys.setlocale("LC_CTYPE", "persian")
dbhandle <- odbcDriverConnect('driver={SQL Server};server=****;database=TestDB;trusted_connection=true',
                              DBMSencoding = "UTF-8")
Encoding(distTest$regsub) <- "UTF-8"
Encoding(distTest$subgroup) <- "UTF-8"
sqlSave(dbhandle, distTest,
        tablename = "DistBars", verbose = T, rownames = FALSE, append = TRUE)
I set DBMSencoding = "UTF-8" on the connection and marked the encoding of both columns (regsub and subgroup) as UTF-8. However, when I save the data frame to SQL Server, the columns are not stored in the correct format; the text comes out garbled.
When I set fast = FALSE in the sqlSave call, I got this error:
Error in sqlSave(dbhandle, Distbars, tablename = "DistBars", verbose = T, :
  22001 8152 [Microsoft][ODBC SQL Server Driver][SQL Server]String or binary data would be truncated.
  01000 3621 [Microsoft][ODBC SQL Server Driver][SQL Server]The statement has been terminated.
  [RODBC] ERROR: Could not SQLExecDirect 'INSERT INTO "DistBars" ( "regsub", "week", "S", "A", "F", "labeled_cluster", "subgroup", "windows" ) VALUES ( 'ظâ€', 5, 4, 2, 3, 'cl1', 'ط­ظ…ظ„ ط²ط¨ط§ظ„ظ‡', 1 )'
I also tried NVARCHAR(MAX) for the UTF-8 columns in the table design; with fast = FALSE the truncation error goes away, but the text is still stored in the wrong format.
By the way, part of the data is exported as RData here.
Why is the data not shown correctly in SQL Server 2016?
UPDATE
I am fully convinced that there is something wrong with the RODBC package.
As a test, I tried inserting into the table with
sqlQuery(channel = dbhandle,
         "insert into DistBars values (N'7من', NULL, NULL, NULL, NULL, NULL, NULL, NULL)")
and the format is still wrong. Unfortunately, adding CharSet=utf8; to the connection string does not work either.
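Independent of any client-side setting, Persian text can only round-trip if the target columns use a Unicode type and string literals carry the N prefix; varchar columns are stored in the database collation's single-byte code page, which is the usual source of garbled values like the ones above. A minimal server-side sketch (the demo table name is made up), which should behave correctly when run directly in Management Studio:
-- nvarchar stores Unicode (UTF-16), so Persian characters survive; varchar would not
CREATE TABLE DistBarsDemo (regsub NVARCHAR(100));
-- the N prefix marks the literal itself as Unicode
INSERT INTO DistBarsDemo (regsub) VALUES (N'7من');
SELECT regsub FROM DistBarsDemo;   -- comes back as 7من, not mojibake
Whether the R side then passes the data through intact is a separate question, which is what the DBMSencoding argument and the answer below are about.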
I had the same issue in my code, and I managed to fix it by removing rows_at_time = 1 from my connection configuration.

Advantage Database 8.1 SQL IN clause

Using Advantage Database Server 8.1, I am having trouble getting a query to execute successfully. I am trying to do the following:
SELECT * FROM Persons
WHERE LastName IN ('Hansen','Pettersen')
to check for multiple values in a column, but I get an error when I try to execute the query in Advantage.
Edit - Error
poQuery: Error 7200: AQE Error: State = 42000; NativeError = 2115; [iAnywhere Solutions][Advantage SQL Engine]Expected lexical element not found: ( There was a problem parsing the
WHERE clause in your SELECT statement. -- Location of error in the SQL statement is: 46
And here is the SQL I'm executing:
select * from "Pat Visit" where
DIAG1 IN = ('43644', '43645', '43770', '43771', '43772', '43773', '43774',
'43842', '43843', '43845', '43846', '43847', '43848', '97804', '98961',
'98962', '99078')
Does anyone have any idea how I could do something similar in Advantage that would be efficient as well?
Thanks
You have an extraneous = in the statement after the IN. It should be:
select * from "Pat Visit" where
DIAG1 IN ('43644', '43645', <snip> )