SQL Insert Out of Sync

I have a bit of SQL here which is throwing an error:
DROP TABLE HACP_TEMP_PIC_HCV_Imported;
CREATE TABLE HACP_TEMP_PIC_HCV_Imported
(
HeadSSN varchar(255) NOT NULL,
HeadFName varchar(255) NOT NULL,
HeadMName varchar(255),
HeadLName varchar(255) NOT NULL,
ModifiedDate varchar(255) NOT NULL,
ActionType varchar(255) NOT NULL,
EffectiveDate varchar(255) NOT NULL
);
BULK INSERT HACP_TEMP_PIC_HCV_Imported
FROM 'C:\Work\MTWAdhocReport.csv'
WITH
(
FIRSTROW = 11,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
ERRORFILE = 'C:\Work\Import_ErrorRows_HCV.csv',
TABLOCK
);
UPDATE HACP_TEMP_PIC_HCV_Imported
SET HeadSSN = REPLACE(HeadSSN, '"', ''),
HeadFName = REPLACE(HeadFName, '"', ''),
HeadMName = REPLACE(HeadMName, '"', ''),
HeadLName = REPLACE(HeadLName, '"', ''),
ModifiedDate = REPLACE(ModifiedDate, '"', ''),
ActionType = REPLACE(ActionType, '"', ''),
EffectiveDate = REPLACE(REPLACE(EffectiveDate, '"', ''),',','');
DROP TABLE HACP_PIC_HCV_Imported;
CREATE TABLE HACP_PIC_HCV_Imported
(
HeadSSN varchar(255) NOT NULL,
HeadFName varchar(255) NOT NULL,
HeadMName varchar(255),
HeadLName varchar(255) NOT NULL,
ModifiedDate varchar(255) NOT NULL,
ActionType int NOT NULL,
EffectiveDate varchar(255) NOT NULL
);
INSERT INTO HACP_PIC_HCV_Imported(HeadSSN, HeadFName, HeadMName, HeadLName, ModifiedDate, ActionType, EffectiveDate)
SELECT
LTRIM(HeadSSN),
LTRIM(HeadFName),
LTRIM(HeadMName),
LTRIM(HeadLName),
LTRIM(ModifiedDate),
CONVERT(int, LTRIM(ActionType)),
LTRIM(EffectiveDate)
FROM
HACP_TEMP_PIC_HCV_Imported;
Stepping through this, creating the temp table and importing the CSV into it works fine. Updating the table to remove quotes and a trailing comma from the EffectiveDate column works. Creating the new table-proper works.
When trying to copy the data into the second table (and converting ActionType into an INT), I get this error message:
Conversion failed when converting the varchar value '4/07/2016' to data type int.
That data is the second row value in ModifiedDate, so the columns are apparently getting out of sync after importing the first row. I have double-checked that all of the data is in the proper columns after being imported into the temp table initially.
Any thoughts? I feel like I'm missing something obvious.

Your code suggests that you are using "proper" CSV format, which allows fields to be enclosed in double quotes. These delimited fields can contain commas. This is the format produced and read by Excel.
My guess is that you have a comma in such a delimited field and this is throwing off the import.
But this format is not read properly by bulk insert. Ironically, at least one other database does import CSV-formatted files with commas inside quoted fields.
In the past when I've had this problem, it has only been on smallish files. I simply loaded the data into Excel and then saved it back out using tabs or vertical bars as delimiters. This solved the problem in my case.
I'm not sure if there is a more advanced solution now. But I'm pretty sure your problem is that some fields have embedded commas in the text fields.
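For what it's worth, later SQL Server versions (2017 and up) added native CSV handling to BULK INSERT, which parses quoted fields with embedded commas and would also remove the need for the REPLACE clean-up step. A sketch against the asker's table, assuming SQL Server 2017+:
BULK INSERT HACP_TEMP_PIC_HCV_Imported
FROM 'C:\Work\MTWAdhocReport.csv'
WITH
(
FORMAT = 'CSV', -- RFC 4180 parsing: quoted fields may contain commas
FIELDQUOTE = '"', -- quote character (double quote is also the default)
FIRSTROW = 11,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
TABLOCK
);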

Related

Oracle SQL: cannot specify rows separator for sqlldr when loading from a file

I have a table DimTime. Its schema is as follows:
CREATE TABLE DimTime ( SK_TimeID NUMBER(11,0) PRIMARY KEY,
TimeValue DATE Not NULL,
HourID NUMBER(2) Not NULL,
HourDesc CHAR(20) Not NULL,
MinuteID NUMBER(2) Not NULL,
MinuteDesc CHAR(20) Not NULL,
SecondID NUMBER(2) Not NULL,
SecondDesc CHAR(20) Not NULL,
MarketHoursFlag CHAR(5) NOT NULL check (MarketHoursFlag = 'false' or MarketHoursFlag = 'true'),
OfficeHoursFlag CHAR(5) NOT NULL check (OfficeHoursFlag = 'false' or OfficeHoursFlag = 'true')
);
I am trying to load data from the source file Time.txt into it. This file was generated for TPC-DI benchmarking on Oracle Linux 8.6. Here are the first 3 rows:
000000|00:00:00|00|00|00|00:00|00|00:00:00|false|false
000001|00:00:01|00|00|00|00:00|01|00:00:01|false|false
000002|00:00:02|00|00|00|00:00|02|00:00:02|false|false
I tried to load the data into the table using the following ctl file:
LOAD DATA
INFILE
TRUNCATE
INTO TABLE DimTime
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
SK_TimeID,
TimeValue DATE "HH24:MI:SS",
HourID,
HourDesc,
MinuteID,
MinuteDesc,
SecondID,
SecondDesc,
MarketHoursFlag,
OfficeHoursFlag
)
and running sqlldr userid=user/password@localhost/db_name control=DimTime.ctl data=path/to/Time.txt
But the log file states that the last column had an error:
Record 1: Rejected - Error on table DIMTIME, column OFFICEHOURSFLAG.
ORA-12899: value too large for column "SAYYUS"."DIMTIME"."OFFICEHOURSFLAG" (actual: 6, maximum: 5)
even though it contains only values "true" or "false" (max 5 characters) as can be seen in the sample file.
I am sure the problem is that, as explained in the TPC-DI documentation, "records have a terminator character appropriate for the System Under Test". So each line ends with a character that gets appended to the last column, and I cannot discard it. I tried putting INFILE "STR X'0A'" as explained by this webpage, and I tried trimming the column in the ctl file (OfficeHoursFlag "TRIM (:OfficeHoursFlag)"), but neither worked.
Does anybody know what character is at the end of each line, and how to discard it?
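No answer was recorded here, but the error narrows it down: ORA-12899 reports 6 bytes for a value that should be at most 5 ("false"), so exactly one stray byte is riding along with the last field, classically a carriage return (X'0D') from a Windows-style \r\n line ending. A hedged sketch, assuming that is the case: declare the full CR+LF terminator in the INFILE clause, and/or strip the CR in the field's SQL expression (plain TRIM removes only spaces, which would explain why it didn't help):
LOAD DATA
INFILE 'path/to/Time.txt' "STR X'0D0A'" -- treat CR+LF as the record terminator
TRUNCATE
INTO TABLE DimTime
FIELDS TERMINATED BY '|' OPTIONALLY ENCLOSED BY '"' TRAILING NULLCOLS
(
SK_TimeID,
TimeValue DATE "HH24:MI:SS",
HourID,
HourDesc,
MinuteID,
MinuteDesc,
SecondID,
SecondDesc,
MarketHoursFlag,
OfficeHoursFlag "RTRIM(:OfficeHoursFlag, CHR(13))" -- also strip a trailing CR, just in case
)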

How to retrieve German characters from a large CSV File into SQL Server 2017 script

I have a CSV file containing a list of employees, some of whose names include German characters like 'ö'. I need to create a temp table in my SQL Server 2017 script and fill it with the contents of the CSV file. My script is:
CREATE TABLE #AllAdUsers(
[PhysicalDeliveryOfficeName] [NVARCHAR](255) NULL,
[Name] [NVARCHAR](255) COLLATE SQL_Latin1_General_CP1_CI_AS NULL ,
[DisplayName] [NVARCHAR](255) NULL,
[Company] [NVARCHAR](255) NULL,
[SAMAccountName] [NVARCHAR](255) NULL
)
--import AD users
BULK INSERT #AllAdUsers
FROM 'C:\Employees.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',', --CSV field delimiter
ROWTERMINATOR = '\n', --Use to shift the control to next row
TABLOCK
)
However, even though I use the NVARCHAR type with the SQL_Latin1_General_CP1_CI_AS collation, the German characters do not come out right; for instance "Kösker" appears as:
"K├╢sker"
I've tried many other collations but couldn't find a fix for it. Any help would be very much appreciated.
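No answer is attached here, but the mangling itself is the clue: "ö" encoded as UTF-8 is the two bytes 0xC3 0xB6, which render as "├╢" under the OEM code page (437/850) that BULK INSERT assumes by default. Collation is not the problem; the file needs to be decoded as UTF-8. A sketch, assuming the CSV really is UTF-8 encoded (SQL Server 2017 supports code page 65001 for this):
BULK INSERT #AllAdUsers
FROM 'C:\Employees.csv'
WITH
(
CODEPAGE = '65001', -- decode the file as UTF-8
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n',
TABLOCK
)
If the file is UTF-16 instead, DATAFILETYPE = 'widechar' (without CODEPAGE) is the option to reach for.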

SSDT Drop and Recreate Tables when nothing has changed

I'm using SSDT database project to create deployment scripts for my database.
One of the tables, [AdrInfo].[IL] is dropped and then recreated every time when the deployment runs.
Nothing has changed in the definition of the tables in the project files.
Definition of the table:
CREATE TABLE [AdrInfo].[IL] (
[IL_ID] NVARCHAR (50) NULL,
[IL_ADI] NVARCHAR (50) NULL,
[XCOOR] VARCHAR (50) NULL,
[YCOOR] VARCHAR (50) NULL,
[IL_ADI_KEY] AS (CONVERT (NVARCHAR (255), replace(replace([IL_ADI], ' ', ''), '.', ''), 0) COLLATE SQL_Latin1_General_Cp850_CI_AI) PERSISTED );
CREATE CLUSTERED INDEX [index_IX_IL_CI1] ON [AdrInfo].[IL]([IL_ADI_KEY] ASC);
Snippet from deployment script:
GO
PRINT N'Starting rebuilding table [AdrInfo].[IL]...';
GO
BEGIN TRANSACTION;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SET XACT_ABORT ON;
CREATE TABLE [AdrInfo].[tmp_ms_xx_IL] (
[IL_ID] NVARCHAR (50) NULL,
[IL_ADI] NVARCHAR (50) NULL,
[XCOOR] VARCHAR (50) NULL,
[YCOOR] VARCHAR (50) NULL,
[IL_ADI_KEY] AS (CONVERT (NVARCHAR (255), replace(replace([IL_ADI], ' ', ''), '.', ''), 0) COLLATE SQL_Latin1_General_Cp850_CI_AI) PERSISTED );
CREATE CLUSTERED INDEX [tmp_ms_xx_index_IX_IL_CI1]
ON [AdrInfo].[tmp_ms_xx_IL]([IL_ADI_KEY] ASC);
I would expect SSDT to not touch this table during deployment. What can cause such a behavior?
SSDT is very picky when deploying computed column expressions.
Please compare the two expressions below:
(CONVERT (NVARCHAR (255), replace(replace([IL_ADI], ' ', ''), '.', ''), 0) COLLATE SQL_Latin1_General_Cp850_CI_AI) PERSISTED
((CONVERT (NVARCHAR (255), replace(replace([IL_ADI], ' ', ''), '.', ''), 0)) COLLATE SQL_Latin1_General_Cp850_CI_AI) PERSISTED
Using the first one will cause the table to be redeployed every time; using the second one stops this behavior. SQL Server does not store these expressions as the text you typed; it normalizes them. SSDT applies its own normalization and then compares the result with the server's normalized expression.
If the two sets of normalization rules do not produce the same expression, SSDT redeploys the column expression every time, which in your case means rebuilding the whole table.
To avoid this, script the table from SSMS to get the server-normalized expression and save that in the project file.
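For reference, one way to see the server's normalized expression directly (a catalog query, not part of the original answer) is:
SELECT name, definition, is_persisted
FROM sys.computed_columns
WHERE object_id = OBJECT_ID(N'AdrInfo.IL');
The definition column holds exactly the text SQL Server stored, which is what you want pasted into the project file.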

Bulk load: An unexpected end of file was encountered in the data file

I am using SQL Server Express 2008
When I try to load data from a txt file into this table
create table Clients
(
ClientID int not null IDENTITY (9000,1),
LastName varchar (30)not null,
FirsName varchar (30)not null,
MidInitial varchar (3),
DOB date not null,
Adress varchar (40) not null,
Adress2 varchar (10),
City varchar (40) not null,
Zip int not null,
Phone varchar (30) ,
CategCode varchar (2) not null,
StatusID int not null,
Hispanic BINARY default 0,
EthnCode varchar(3) ,
LangID int,
ClientProxy varchar (200),
Parent varchar (40),
HshldSize int default 1,
AnnualHshldIncome INT,
MonthlyYearly VARCHAR(7) ,
PFDs INT,
WIC BINARY default 0,
Medicaid BINARY default 0,
ATAP BINARY default 0,
FoodStamps BINARY default 0,
AgencyID int not null,
RoutID int ,
DeliveryNotes varchar (200),
RecertificationDate date not null,
Notes text,
Primary Key (ClientID)
);
I use
SET IDENTITY_INSERT Clients2 ON;
BULK INSERT Clients2
FROM 'c:\Sample_Clients.txt'
WITH
(
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n'
)
SQL Server Express throws this error:
Msg 545, Level 16, State 1, Line 2
Explicit value must be specified for identity column in table 'Clients' either when IDENTITY_INSERT is set to ON or when a replication user is inserting into a NOT FOR REPLICATION identity column.
The file has only one line (for now, just sample data); I have checked it many times and it is one line.
Data looks like this
13144,Vasya,Pupkin,,1944-10-20,P.O. Box 52,,Wrna,99909,(907) 111-1111,SR,4,0,W,1,,,3,1198,month,0,0,1,0,1,45,,,2011-04-27
Any ideas how to fix this problem?
You need the KEEPIDENTITY parameter in your bulk insert statement; it is required to retain identity values during the load.
BULK INSERT Clients2 FROM 'c:\Sample_Clients.txt'
WITH (
KEEPIDENTITY,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\r\n'
);
I also think you will have a problem because there is no data or placeholder for the Notes column: the table has 30 columns, but the sample row has only 29 fields. A comma added to the end of each line should address this, as shown below.
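For the sample row above, that means one extra trailing comma so the 30th field (Notes) parses as empty:
13144,Vasya,Pupkin,,1944-10-20,P.O. Box 52,,Wrna,99909,(907) 111-1111,SR,4,0,W,1,,,3,1198,month,0,0,1,0,1,45,,,2011-04-27,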

Sql Server - Insufficient result space to convert uniqueidentifier value to char

I am getting the error below when I run a SQL query copying data from one table to another:
Msg 8170, Level 16, State 2, Line 2
Insufficient result space to convert
uniqueidentifier value to char.
My SQL query is:
INSERT INTO dbo.cust_info (
uid,
first_name,
last_name
)
SELECT
NEWID(),
first_name,
last_name
FROM dbo.tmp_cust_info
My CREATE TABLE scripts are:
CREATE TABLE [dbo].[cust_info](
[uid] [varchar](32) NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
CREATE TABLE [dbo].[tmp_cust_info](
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
I am sure there is some problem with NEWID(); if I take it out and replace it with some string, it works.
I appreciate any help. Thanks in advance.
A guid needs 36 characters (because of the dashes). You only provide a 32 character column. Not enough, hence the error.
You need to use one of three alternatives:
1. A uniqueidentifier column, which stores the value internally as 16 bytes. When you select from this column, it is automatically rendered for display in the 8-4-4-4-12 format.
CREATE TABLE [dbo].[cust_info](
[uid] uniqueidentifier NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
2. (not recommended) Change the field to char(36) so that it fits the format, including dashes.
CREATE TABLE [dbo].[cust_info](
[uid] char(36) NOT NULL,
[first_name] [varchar](100) NULL,
[last_name] [varchar](100) NULL)
3. (not recommended) Store it without the dashes, as just the 32 remaining characters:
INSERT INTO dbo.cust_info (
uid,
first_name,
last_name
)
SELECT
replace(NEWID(),'-',''),
first_name,
last_name
FROM dbo.tmp_cust_info
I received this error when I was trying to perform simple string concatenation on the GUID. Apparently an unsized VARCHAR is not big enough.
I had to change:
SET @foo = 'Old GUID: {' + CONVERT(VARCHAR, @guid) + '}';
to:
SET @foo = 'Old GUID: {' + CONVERT(NVARCHAR(36), @guid) + '}';
...and all was good. Huge thanks to the prior answers on this one!
Increase the length of your uid column from varchar(32) to varchar(36), because a GUID takes 36 characters:
Guid.NewGuid().ToString() -> 36 characters
outputs: 12345678-1234-1234-1234-123456789abc
You can try this; it worked for me.
Specify a length for VARCHAR when you cast/convert a value. For uniqueidentifier, use VARCHAR(36) as below:
SELECT CONVERT(varchar(36), NEWID()) AS NEWID
The default length for VARCHAR when you don't specify one during CAST/CONVERT is 30.
Credit: Krishnakumar S
Reference: https://social.msdn.microsoft.com/Forums/en-US/fb24a153-f468-4e18-afb8-60ce90b55234/insufficient-result-space-to-convert-uniqueidentifier-value-to-char?forum=transactsql
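As a quick illustration of the default-length pitfall (a sketch, runnable on any SQL Server):
SELECT CONVERT(varchar(36), NEWID()); -- works: 36 characters fit the 8-4-4-4-12 format
SELECT CONVERT(varchar, NEWID()); -- fails with Msg 8170: unsized varchar defaults to 30 in CONVERT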