Loading data from a CSV file using <loadUpdateData> Postgres - liquibase

I would appreciate some suggestions. I am attempting a simple load of one table with five records from a CSV file. When I load the table, I get the error below:
liquibase.exception.DatabaseException: org.postgresql.util.PSQLException: ERROR: zero-length delimited identifier at or near """"
This is my CSV file:
1,Nate Happy,natehappy1761@me.com,1761 Brookview Trail,(205) 555-1212
2,Brigette Happy,brigettehappy7507@me.com,7507 Meadowgate Lane,(704) 555-1212
3,Katie Happy,katiehappy7507@me.com,7507 Meadowgate Lane,(704) 555-1212
4,Lauren Happy,laurenhappy@me.com,7507 Meadowgate Lane,(704) 555-1212
5,Jackson Hope,jacksonhope@me.com,7507 Meadowgate Lane,(704) 555-1212
This is my changeset for loading the data
<changeSet id="6-loadData" author="liquibase" dbms="postgresql" >
<preConditions onErrorMessage="Failed Pre Conditions for table" onFail="HALT">
<and>
<tableExists schemaName="public" tableName="contact" />
<sqlCheck expectedResult="1">SELECT COUNT(*) FROM contact</sqlCheck>
</and>
</preConditions>
<comment>Adding Data...</comment>
<loadUpdateData catalogName="pg_catalog"
encoding="UTF-8"
file="src/main/resources/data/contacts.csv"
primaryKey="contact_id"
quotchar="A String"
schemaName="public"
separator=","
tableName="contact">
<column name="contact_id" type="int" />
<column name="contact_name" type="varchar(45)"/>
<column name="email" type="varchar(45)" />
<column name="address" type="varchar(45)" />
<column name="telephone" type="varchar(45)" />
</loadUpdateData>
</changeSet>
This is my changeset for creating the table:
<changeSet id="4 Create Table" author="liquibase" runAlways="true">
<preConditions onErrorMessage="Failed Pre Conditions for table" onFail="MARK_RAN">
<not><tableExists schemaName="public" tableName="contact"/> </not>
</preConditions>
<comment>Creating Table named: Contact...</comment>
<createTable tableName="contact" schemaName="public">
<column name="contact_id" type="int" >
<constraints primaryKey="true" nullable="false"/>
</column>
<column name="contact_name" type="varchar(45)">
<constraints nullable="false"/>
</column>
<column name="email" type="varchar(45)">
<constraints nullable="false"/>
</column>
<column name="address" type="varchar(45)">
<constraints nullable="false"/>
</column>
<column name="telephone" type="varchar(45)">
<constraints nullable="false"/>
</column>
</createTable>
</changeSet>
Here is the sequence I am using for the primary key (contact_id):
<changeSet id="2-Create Sequence" author="liquibase" runAlways="true">
<preConditions onErrorMessage="Failed Pre Conditions for sequence" onFail="MARK_RAN">
<not><sequenceExists schemaName="public" sequenceName="contactid_seq" /></not>
</preConditions>
<comment>Creating Sequence...</comment>
<createSequence sequenceName="contactid_seq"
incrementBy="1"
minValue="1"
maxValue="9223372036854775807"
startValue="1"
ordered="1"
schemaName="public"/>
</changeSet>
This is how I am using the constraint:
<changeSet id="5-Add Constraint" author="liquibase">
<comment>Adding contactid_seq sequence to Contact table...</comment>
<addDefaultValue catalogName="pg_catalog"
columnDataType="int"
columnName="contact_id"
tableName="contact"
schemaName="public"
defaultValueSequenceNext="contactid_seq" />
</changeSet>
Thanks for taking the time to read my post.
Russ

I discovered the CSV file was missing the header row of column names. Also, loadUpdateData's column elements use specific load data types, such as STRING instead of VARCHAR(45) and NUMERIC instead of INT. Once I corrected these two errors, the load succeeded.

Don't forget to change the quotchar value. If you set it exactly as shown in the manual,
quotchar="A String"
then Liquibase will treat the letter "A" as the quote character.
You can omit the attribute from the changeset and use the default double-quote character.
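Putting both fixes together, the CSV and the load element might look like this (a sketch; the header names are assumed to match the table's column names, and quotchar is omitted so the default double quote applies):

```xml
<!-- contacts.csv now starts with a header row:
contact_id,contact_name,email,address,telephone
1,Nate Happy,natehappy1761@me.com,1761 Brookview Trail,(205) 555-1212
-->
<loadUpdateData encoding="UTF-8"
                file="src/main/resources/data/contacts.csv"
                primaryKey="contact_id"
                schemaName="public"
                separator=","
                tableName="contact">
    <column name="contact_id" type="NUMERIC"/>
    <column name="contact_name" type="STRING"/>
    <column name="email" type="STRING"/>
    <column name="address" type="STRING"/>
    <column name="telephone" type="STRING"/>
</loadUpdateData>
```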

Related

How to insert bulk csv file into SQL Server

I have a CSV file that has columns like this:
"Hold","EmpID","Source","Shard-Exists","Type"
but my DB table looks like this:
//Note: My Id is auto increment
"Id","Hold","EmpID","Source","Type","CurrentDate"
I'm just wondering how I can bulk insert my CSV file into the database table without the Shard-Exists column, while also setting CurrentDate automatically.
Any help or suggestion will be really appreciated
TRUNCATE TABLE dbo.Actors;
GO
-- import the file
BULK INSERT dbo.Actors
FROM 'C:\Documents\Skyvia\csv-to-mssql\actor.csv'
WITH
(
FORMAT='CSV',
FIRSTROW=2
)
GO
You should be able to use a format file to accomplish the 'skip columns in table' task. I'll modify the example from the MS docs.
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="7"/>
<FIELD ID="2" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="25" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="25" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharTerm" TERMINATOR="," MAX_LENGTH="25" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="30" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="Hold" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="EmpID" xsi:type="SQLINT"/>
<COLUMN SOURCE="3" NAME="Source" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="5" NAME="Type" xsi:type="SQLVARYCHAR"/>
</ROW>
</BCPFORMAT>
Note, in the <ROW> portion, that I'm not specifying anything for the Id or CurrentDate columns. It's also noteworthy that there's no SOURCE="4"; that's how the Shard-Exists field in the source data is being skipped.
As to auto-generating a value for CurrentDate, my recommendation would be to add a default constraint to your table. That can be done like so:
ALTER TABLE dbo.Actors
ADD CONSTRAINT DF_Actors__CurrentDate
DEFAULT (getdate()) FOR CurrentDate;
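Tying the pieces together: with the format file saved to disk (the path below is hypothetical), the import might then be run as follows, and each inserted row picks up CurrentDate from the default constraint:

```sql
-- FORMATFILE replaces FORMAT='CSV'; FIRSTROW=2 still skips the header row
BULK INSERT dbo.Actors
FROM 'C:\Documents\Skyvia\csv-to-mssql\actor.csv'
WITH
(
    FORMATFILE = 'C:\Documents\actors.fmt',  -- hypothetical path
    FIRSTROW = 2
);
```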

Bulk insert csv file with semicolon as delimiter

I'm trying to import data from a semicolon-separated CSV file into a SQL Server database. Here is the table structure:
CREATE TABLE [dbo].[waste_facility]
(
[Id] INT IDENTITY (1, 1) NOT NULL,
[postcode] VARCHAR (50) NULL,
[name] VARCHAR (50) NULL,
[type] VARCHAR (255) NULL,
[street] VARCHAR (255) NULL,
[suburb] VARCHAR (255) NULL,
[municipality] VARCHAR (255) NULL,
[telephone] VARCHAR (255) NULL,
[website] VARCHAR (255) NULL,
[longtitude] DECIMAL (18, 8) NULL,
[latitude] DECIMAL (18, 8) NULL,
PRIMARY KEY CLUSTERED ([Id] ASC)
);
The csv file is shown below:
Location Coordinate;Feature Extent;Projection;Postcode;Name Of Facility;Type Of Facility;Street;Suburb;Municipality;Telephone Number;Website;Easting Coordinate;Northing Coordinate;Longitude Coordinate;Latitude Coordinate;Google Maps Direction
-37.9421182892,145.3193857967;"{""coordinates"": [145.3193857967, -37.9421182892], ""type"": ""Point""}";MGA zone 55;3156;Cleanaway Lysterfield Resource Recovery Centre;Recovery Centre;840 Wellington Road;LYSTERFIELD;Yarra Ranges;9753 5411;https://www.cleanaway.com.au/location/lysterfield/;352325;5799275;145.31938579674124;-37.94211828921733;https://www.google.com.au/maps/dir//-37.94211828921733,145.31938579674124/#your+location,17z/data=!4m2!4m1!3e0
-38.0529529215,145.2433557709;"{""coordinates"": [145.2433557709, -38.0529529215], ""type"": ""Point""}";MGA zone 55;3175;Smart Recycling (South Eastern Depot);Recycling Centre;185 Dandenong-Hastings Rd;LYNDHURST;Greater Dandenong;8787 3300;https://smartrecycling.com.au/;345876;5786853;145.24335577090602;-38.05295292152536;https://www.google.com.au/maps/dir//-38.05295292152536,145.24335577090602/#your+location,17z/data=!4m2!4m1!3e0
-38.0533129717,145.267610135;"{""coordinates"": [145.267610135, -38.0533129717], ""type"": ""Point""}";MGA zone 55;3976;Hampton Park Transfer Station (Outlook Environmental);Transfer Station;274 Hallam Road;HAMPTON PARK;Casey;9554 4502;https://www.suez.com.au/en-au/who-we-are/suez-in-australia-and-new-zealand/our-locations/waste-management-hampton-park-transfer-station;348005;5786853;145.2676101350274;-38.053312971691255;https://www.google.com.au/maps/dir//-38.053312971691255,145.2676101350274/#your+location,17z/data=!4m2!4m1!3e0
-38.1243050577,145.2183465487;"{""coordinates"": [145.2183465487, -38.1243050577], ""type"": ""Point""}";MGA zone 55;3977;Frankston Regional Recycling and Recovery Centre;Recycling Centre;20 Harold Road;SKYE;Frankston;1300 322 322;https://www.frankston.vic.gov.au/Environment-and-Waste/Waste-and-Recycling/Frankston-Regional-Recycling-and-Recovery-Centre-FRRRC/Accepted-Items-at-FRRRC;343833;5778893;145.21834654873447;-38.12430505770815;https://www.google.com.au/maps/dir//-38.12430505770815,145.21834654873447/#your+location,17z/data=!4m2!4m1!3e0
-38.0973208774,145.4920399066;"{""coordinates"": [145.4920399066, -38.0973208774], ""type"": ""Point""}";MGA zone 55;3810;Pakenham Waste Transfer Station (Future Recycling);Transfer Station;30-32 Exchange Drive;PAKENHAM;Cardinia;13Recycling;https://www.futurerecycling.com.au/;367776;5782313;145.4920399066473;-38.09732087738631;https://www.google.com.au/maps/dir//-38.09732087738631,145.4920399066473/#your+location,17z/data=!4m2!4m1!3e0
There are some columns that I don't need, so I created a format file to import the data. The format file is shown below:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="CharFixed" LENGTH="50"/>
<FIELD ID="12" xsi:type="CharFixed" LENGTH="50"/>
<FIELD ID="13" xsi:type="CharFixed" LENGTH="50"/>
<FIELD ID="2" xsi:type="CharFixed" LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="3" xsi:type="CharFixed" LENGTH="50" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="4" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="5" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="6" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="7" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="8" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="9" xsi:type="CharFixed" LENGTH="255" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
<FIELD ID="14" xsi:type="CharFixed" LENGTH="50"/>
<FIELD ID="15" xsi:type="CharFixed" LENGTH="50"/>
<FIELD ID="10" xsi:type="CharFixed" LENGTH="41"/>
<FIELD ID="11" xsi:type="CharTerm" TERMINATOR="\r\n" MAX_LENGTH="41"/>
<FIELD ID="16" xsi:type="CharFixed" LENGTH="50"/>
</RECORD>
<ROW>
<COLUMN SOURCE="2" NAME="postcode" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="3" NAME="name" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="4" NAME="type" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="5" NAME="street" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="6" NAME="suburb" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="7" NAME="municipality" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="8" NAME="telephone" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="9" NAME="website" xsi:type="SQLVARYCHAR"/>
<COLUMN SOURCE="10" NAME="longtitude" xsi:type="SQLDECIMAL" PRECISION="18" SCALE="8"/>
<COLUMN SOURCE="11" NAME="latitude" xsi:type="SQLDECIMAL" PRECISION="18" SCALE="8"/>
</ROW>
</BCPFORMAT>
Then I tried both BULK INSERT and bcp in; neither of them worked.
Here is the bulk insert command
USE [waste-facility-locations];
BULK INSERT [dbo].[waste_facility]
FROM 'E:\onboardingIteration\waste-facility-locations.csv'
WITH (FORMATFILE = 'E:\onboardingIteration\waste_facility_formatter.xml',
FIRSTROW = 2,
LASTROW = 6,
FIELDTERMINATOR = ';',
ROWTERMINATOR = '\n',
ERRORFILE = 'E:\onboardingIteration\myRubbishData.log');
But unluckily, error files were generated. Here is what the myRubbishData.log error says:
Row 2 File Offset 1993 ErrorFile Offset 0 - HRESULT 0x80004005
And the actual row stored in myRubbishData.txt:
;Pakenham Waste Transfer Station (Future Recycling);Transfer Station;30-32 Exchange Drive;PAKENHAM;Cardinia;13Recycling;https://www.futurerecycling.com.au/;367776;5782313;145.4920399066473;-38.09732087738631;https://www.google.com.au/maps/dir//-38.09732087738631,145.4920399066473/#your+location,17z/data=!4m2!4m1!3e0;Pakenham Waste Transfer Station (Future Recycling);Transfer Station;30-32 Exchange Drive;PAKENHAM;Cardinia;13Recycling;https://www.futurerecycling.com.au/;367776;5782313;145.4920399066473;-38.09732087738631;https://www.google.com.au/maps/dir//-38.09
As you can see, it seems the rows are not correctly separated. So I tried changing the row delimiter to "\n", "\r", "\n\r", and "\r\n"; none of them worked.
And I tried bcp. It did not work either.
Here is the bcp command I used:
bcp [waste-facility-locations].[dbo].[waste_facility] in "E:\onboardingIteration\waste-facility-locations.csv" -f "E:\onboardingIteration\waste_facility_formatter.xml" -T -S "(LocalDB)\MSSQLLocalDB" -F 2 -t ";" -r "\n"
Then I got an error saying essentially the same thing:
SQLState = S1000, NativeError = 0
Error = [Microsoft][ODBC Driver 17 for SQL Server]Unexpected EOF encountered in BCP data-file
0 rows copied.
Network packet size (bytes): 4096
Clock Time (ms.) Total : 1
One interesting thing is that if I create a new Excel workbook and choose the "Get data" option to import the CSV file, the file is parsed correctly.
Basically I can't find what I did wrong here. Can someone help me on this one?
The SQL Server import facilities are very intolerant of bad data and even just formatting variations or options. In my career, I have literally spent thousands of work-hours trying to develop and debug import procedures for customers. I can tell you right now, that trying to fix this with SQL alone is both difficult and time-consuming.
When you have this problem (bad data and/or inconsistent formatting) it is almost always easier to find or develop a more flexible tool to pre-process the data into the rigid standard that SQL expects. So I would say that if Excel can parse it then just use Excel automation to pre-process them and then use SQL to import the Excel output. If that's not practical for you, then I'd advise writing your own tool in some client language (C#, Vb, Java, Python, etc.) to pre-process the files.
You can do it in SQL (and I have done it many times), but I promise you that it is a long complicated trek.
SSIS has more flexible error-handling for problems like this, but if you are not already familiar and using it, it has a very steep learning curve and your first SSIS project is likely to be very time-consuming also.
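As a sketch of that pre-processing approach in Python (the column indices below are assumptions read off the sample header above; the csv module correctly handles the quoted JSON fields that contain semicolons):

```python
import csv
import io

# Zero-based positions of the columns the waste_facility table needs:
# Postcode, Name, Type, Street, Suburb, Municipality, Telephone,
# Website, Longitude, Latitude (indices assumed from the sample header).
KEEP = (3, 4, 5, 6, 7, 8, 9, 10, 13, 14)

def preprocess(src_text):
    """Parse the semicolon-delimited export and return only the needed
    columns, with the header row dropped."""
    reader = csv.reader(io.StringIO(src_text), delimiter=';', quotechar='"')
    next(reader)  # skip the header row
    return [[row[i] for i in KEEP] for row in reader]

def write_clean(rows, path):
    """Write a plain comma-separated file that a simple
    BULK INSERT ... WITH (FORMAT='CSV') can consume."""
    with open(path, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(rows)
```

Once the cleaned file exists, the format file (and its field-order gymnastics) is no longer needed for the import.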

Update Single XML Node Value of XML Column using SQL Server

I want to update a single value of an XML node in SQL Server.
Below is the XML stored in the table's XML column:
<PayDetails>
<Column Name="FG" DataType="float" Value="7241" />
<Column Name="SKILL" DataType="float" Value="3" />
<Column Name="PI" DataType="float" Value="87" />
<Column Name="MD" DataType="float" Value="30" />
<Column Name="LD" DataType="float" Value="4" />
<Column Name="WEEKOFF_DAYS" DataType="float" Value="4" />
<Column Name="NETPAY" DataType="float" Value="5389" />
</PayDetails>
I want to update the value of FG from 7241 to 8000.
You want to use the replace value of ... with keywords.
Try something like the following:
update tablename
set TransactionFieldDetails.modify(
'replace value of
(/PayDetails/Column[@Name="FG"]/@Value)[1]
with "8000"');

SQL Server xquery sum cast error when schema data type is string

Trying to run this in SQL Server 2014 in order to sum all Values in "UserData" xml:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC
go
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
go
Declare @xml xml(SC)
set @xml = '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>'
Select @xml.value('sum(/UserData/Item[@Type="CONVERTED_PAGES"]/Value)','int') as Sum
and getting the following error:
Msg 9308, Level 16, State 1, Line 16
XQuery [value()]: The argument of 'sum()' must be of a single numeric primitive type or 'http://www.w3.org/2004/07/xpath-datatypes#untypedAtomic'. Found argument of type 'xs:string *'.
I tried changing the select to the following:
Select @xml.value('sum(/UserData/Item[@Type="CONVERTED_PAGES"]/Value cast as xs:int?)','int') as Sum
But then I get this:
Msg 2365, Level 16, State 1, Line 16 XQuery [value()]: Cannot
explicitly convert from 'xs:string *' to 'xs:int ?'
I am not able to change the xml schema in this case, but figured I could cast in order to perform this operation (since I know that in my case all of the Values will be int). Any suggestions would be appreciated!
The XQuery sum aggregate requires the input to be numeric. Currently it is defined as a string in your XSD. To get this to work, you have three options:
Option 1:
You change the schema to force "Value" to be an integer. Instead of the first statement below, use the second. (The difference is marked between the two statements with "|||||||".)
Query 1:
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
|||||||
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:integer" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
Option 2:
If changing the XSD is not an option, you can also use the T-SQL SUM aggregate instead of the xquery one, like this:
Query 2:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC
go
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>'
go
Declare @xml xml(SC)
set @xml = '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>'
SELECT SUM(N.value('.','INT')) AS [Sum]
FROM @xml.nodes('/UserData/Item[@Type="CONVERTED_PAGES"]/Value') AS X(N);
Option 3:
As you noticed, SQL Server does not allow us to convert an XSD-typed value to another data type. To get around that, you could instruct SQL Server to forget about the schema:
Query 3:
IF EXISTS (SELECT * FROM sys.xml_schema_collections WHERE name = 'SC')
DROP XML SCHEMA COLLECTION SC;
GO
CREATE XML SCHEMA COLLECTION SC AS N'<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema"><xsd:element name="UserData"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Item" minOccurs="0" maxOccurs="unbounded"><xsd:complexType><xsd:complexContent><xsd:restriction base="xsd:anyType"><xsd:sequence><xsd:element name="Value" type="xsd:string" /><xsd:any minOccurs="0" /></xsd:sequence><xsd:attribute name="Key" type="xsd:string" /><xsd:attribute name="Type" type="xsd:string" /></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:sequence></xsd:restriction></xsd:complexContent></xsd:complexType></xsd:element></xsd:schema>';
GO
DECLARE @xml XML(SC);
SET @xml = '<UserData>
<Item Key="CONVERTED_PAGES_1" Type="CONVERTED_PAGES">
<Value>2</Value>
</Item>
<Item Key="CONVERTED_PAGES_2" Type="CONVERTED_PAGES">
<Value>4</Value>
</Item>
</UserData>';
SELECT CAST(@xml AS XML).value('sum((/UserData/Item[@Type="CONVERTED_PAGES"]/Value))','int') AS Sum;
Note: Without the schema, you still cannot cast (not sure why), but the sum now works without casting.
Update:
I did a little more digging. The original error message you got after attempting to cast is this one:
Msg 2365, Level 16, State 1, Line 16 XQuery [value()]: Cannot
explicitly convert from 'xs:string *' to 'xs:int ?'
It tells us that you can't convert a sequence of strings into a single integer.
The * as well as the ? are Occurrence Indicators. So the error message reads: zero-to-many strings can't be converted to zero-to-one integer.
Your XQuery /UserData/Item[@Type="CONVERTED_PAGES"]/Value returns more than one value, and to sum them up we need to convert each one individually.
xquery offers multiple ways to accomplish that, but not all of them work in SQL Server. The one that works uses a for-each construct:
.value('sum(for $val in /UserData/Item[@Type="CONVERTED_PAGES"]/Value return $val cast as xs:int?)','INT');
Thanks to @MikaelEriksson for helping me out with this.

Bulk Insert using format file for varying number of columns between datafile and actual database table

This is the schema for a table:
CREATE TABLE dbo.Project
(
ProjectID int NOT NULL,
ManagerID int NOT NULL,
CompanyID int NOT NULL,
Title nvarchar(50) NOT NULL,
StartDate datetime NOT NULL,
EndDate datetime NULL,
ProjDescription nvarchar(max) NULL
)
I created a datafile called bob.dat from this table, which has around 15 rows, with the following bcp command:
bcp "Select ProjectID,ManagerID,CompanyID,Title,StartDate from CATS.dbo.Project" queryout "C:\Documents\bob.dat" -Sbob-pc -T -n
Also a format/mapping file called bob.fmt was created using the following bcp command
bcp CATS.dbo.Project format nul -f C:\Documents\bob.fmt -x -Sbob-pc -T -n
Then I created a copy of the table Project:
CREATE TABLE dbo.ProjectCopy
(
ProjectID int NOT NULL,
ManagerID int NOT NULL,
CompanyID int NOT NULL,
Title nvarchar(50) NOT NULL,
StartDate datetime NOT NULL,
EndDate datetime NULL,
ProjDescription nvarchar(max) NULL
)
What I want to do now is use the bob.dat and bob.fmt files to populate the table ProjectCopy using the following BULK INSERT statement:
BULK INSERT CATS.dbo.ProjectCopy
FROM 'C:\Documents\bob.dat'
WITH (FORMATFILE = 'C:\Documents\bob.fmt',
LASTROW=5,
KEEPNULLS,
DATAFILETYPE='native');
GO
SELECT * FROM CATS.dbo.ProjectCopy
GO
So basically the data file does not contain any data for the columns EndDate and ProjDescription. I want these two columns to remain null. Unfortunately, I get the following error when I run the BULK INSERT statement:
Msg 4863, Level 16, State 4, Line 2
Bulk load data conversion error (truncation) for row 1, column 6 (EndDate).
Msg 7399, Level 16, State 1, Line 2
The OLE DB provider "BULK" for linked server "(null)" reported an error.
The provider did not give any information about the error.
Msg 7330, Level 16, State 2, Line 2
Cannot fetch a row from OLE DB provider "BULK" for linked server "(null)".
(0 row(s) affected)
Anyone got any clue how this could be fixed?
Just to inform you all, I have already looked at these questions, and the solutions provided there didn't work out for me:
BULK INSERT with inconsistent number of columns,
Can't identify reason for BULK INSERT errors,
BULK INSERT with inconsistent number of columns
First of all, when creating the datafile bob.dat, you need to add the two columns EndDate and ProjDescription. In addition, for a bulk copy operation using Unicode characters, the -w argument must be added.
Example:
bcp "Select ProjectID,ManagerID,CompanyID,Title,StartDate, NULL AS EndDate, NULL AS ProjDescription from CATS.dbo.Project" queryout "C:\Users\Pawan\Documents\bob.dat" -Sbob-pc -T -n -w
The original format file:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="2" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="3" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="4" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="100" COLLATION="Cyrillic_General_CI_AS"/>
<FIELD ID="5" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="48"/>
<FIELD ID="6" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="48"/>
<FIELD ID="7" xsi:type="NCharTerm" TERMINATOR="\r\0\n\0" COLLATION="Cyrillic_General_CI_AS"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ProjectID" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="ManagerID" xsi:type="SQLINT"/>
<COLUMN SOURCE="3" NAME="CompanyID" xsi:type="SQLINT"/>
<COLUMN SOURCE="4" NAME="Title" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="StartDate" xsi:type="SQLDATETIME"/>
<COLUMN SOURCE="6" NAME="EndDate" xsi:type="SQLDATETIME"/>
<COLUMN SOURCE="7" NAME="ProjDescription" xsi:type="SQLNVARCHAR"/>
</ROW>
</BCPFORMAT>
But you want a way to fill the data only up to StartDate. Therefore, this file needs to be changed:
<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<RECORD>
<FIELD ID="1" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="2" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="3" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="24"/>
<FIELD ID="4" xsi:type="NCharTerm" TERMINATOR="\t\0" MAX_LENGTH="100" COLLATION="Cyrillic_General_CI_AS"/>
<FIELD ID="5" xsi:type="NCharTerm" TERMINATOR="\r\0\n\0" MAX_LENGTH="48"/>
</RECORD>
<ROW>
<COLUMN SOURCE="1" NAME="ProjectID" xsi:type="SQLINT"/>
<COLUMN SOURCE="2" NAME="ManagerID" xsi:type="SQLINT"/>
<COLUMN SOURCE="3" NAME="CompanyID" xsi:type="SQLINT"/>
<COLUMN SOURCE="4" NAME="Title" xsi:type="SQLNVARCHAR"/>
<COLUMN SOURCE="5" NAME="StartDate" xsi:type="SQLDATETIME"/>
</ROW>
</BCPFORMAT>
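With the trimmed format file in place, the original BULK INSERT can stay essentially as it was; EndDate and ProjDescription are simply never populated and so remain NULL. A sketch, assuming the new format file overwrites bob.fmt:

```sql
BULK INSERT CATS.dbo.ProjectCopy
FROM 'C:\Documents\bob.dat'
WITH (FORMATFILE = 'C:\Documents\bob.fmt',
      KEEPNULLS);  -- no DATAFILETYPE needed when a format file is supplied
```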