Not able to upload json data to Bigquery tables using c# - google-bigquery

I am trying to upload json data to one of the table created under the dataset in Bigquery but fails with "Google.GoogleApiException: 'Google.Apis.Requests.RequestError
Not found: Table currency-342912:sampleDataset.currencyTable [404]"
Service account is created with roles BigQuery.Admin/DataEditor/DataOwner/DataViewer.
The roles are also applied to the table also.
Below is the snippet
public static void LoadTableGcsJson(string projectId = "currency-342912", string datasetId = "sampleDataset", string tableId= "currencyTable ")
{
//Read the Serviceaccount key json file
string dir = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.Parent.FullName + "\\" + "currency-342912-ae9b22f23a36.json";
GoogleCredential credential = GoogleCredential.FromFile(dir);
string toFileName = Directory.GetParent(Directory.GetCurrentDirectory()).Parent.Parent.FullName + "\\" + "sample.json";
BigQueryClient client = BigQueryClient.Create(projectId,credential);
var dataset = client.GetDataset(datasetId);
using (FileStream stream = File.Open(toFileName, FileMode.Open))
{
// Create and run job
BigQueryJob loadJob = client.UploadJson(datasetId, tableId, null, stream); //This throws error
loadJob.PollUntilCompleted();
}
}
Permissions for the table, using the service account "sampleservicenew" from the screenshot
Any leads on this , much appreciated

Your issue might reside in your user credentials. Please follow this steps to check your code:
Please check if the user you are using to execute your application have access to the table you want to insert data.
If your json tags matchs your table columns.
If you json inputs are correct ( table name, dataset name ).
Use a dummy table to perform a quick test of your credentials and data integrity.
These steps will help you identifying what could be missing on your side. I perform the following operations to reproduce your case:
I created a table on BigQuery based on the values of your json data:
create or replace table `projectid.datasetid.tableid` (
IsFee BOOL,
BlockDateTime timestamp,
Address STRING,
BlockHeight INT64,
Type STRING,
Value INT64
);
Created a .json file with your test data
{"IsFee":false,"BlockDateTime":"2018-09-11T00:12:14Z","Address":"tz3UoffC7FG7zfpmvmjUmUeAaHvzdcUvAj6r","BlockHeight":98304,"Type":"OUT","Value":1}
{"IsFee":false,"BlockDateTime":"2018-09-11T00:12:14Z","Address":"tz2KuCcKSyMzs8wRJXzjqoHgojPkSUem8ZBS","BlockHeight":98304,"Type":"IN","Value":18}
Build & Run below code.
using System;
using Google.Cloud.BigQuery.V2;
using Google.Apis.Auth.OAuth2;
using System.IO;
namespace stackoverflow
{
class Program
{
static void Main(string[] args)
{
String projectid = "projectid";
String datasetid = "datasetid";
String tableid = "tableid";
String safilepath ="credentials.json";
var credentials = GoogleCredential.FromFile(safilepath);
BigQueryClient client = BigQueryClient.Create(projectid,credentials);
using (FileStream stream = File.Open("data.json", FileMode.Open))
{
BigQueryJob loadJob = client.UploadJson(datasetid, tableid, null, stream);
loadJob.PollUntilCompleted();
}
}
}
}
output
Row
IsFee
BlockDateTime
Address
BlockHeight
Type
value
1
false
2018-09-11 00:12:14 UTC
tz3UoffC7FG7zfpmvmjUmUeAaHvzdcUvAj6r
98304
OUT
1
2
false
2018-09-11 00:12:14 UTC
tz2KuCcKSyMzs8wRJXzjqoHgojPkSUem8ZBS
98304
IN
18
Note: You can use above code to perform your quick tests of your credentials and the integrity of the data to insert.
I also make use of the following documentation:
Load Credentials from a file
Google.Cloud.BigQuery.V2
Load Json data into a new table

Related

Stream analytics - How to handle json in reference input

I have an Azure Stream Analytics (ASA) job which processes device telemetry data from event hub. The stream should be joined with reference data from a sql table, to enhance each message with additional device meta data. The merged entry should be stored in CosmosDb.
The sql database to serve the device metadata:
CREATE TABLE [dbo].[MyTable]
(
[DeviceId] NVARCHAR(20) NOT NULL PRIMARY KEY,
[MetaData] NVARCHAR(MAX) NULL /* this stores json, which can vary per record */
)
In ASA I have configured the reference data input with a simple query:
SELECT DeviceId, JSON_QUERY(MetaData) FROM [dbo].[MyTable]
And I have the main ASA query that performs the join:
WITH temptable AS (
SELECT * FROM [telemetry-input] TD PARTITION BY PartitionId
LEFT OUTER JOIN [metadata-input] MD
ON TD.DeviceId = MD.DeviceId
)
SELECT TD.*, MD.MetaData
INTO [cosmos-db-output]
FROM temptable PARTITION BY PartitionId
It all works and merged data gets stored in CosmosDb. However, the value of the Metadata column from sql is treated as string, and stored in comos with quotes and escape chars. Example:
{ "DeviceId" : "abc1234", … , "MetaData" : "{ \"TestKey\": \"test value\" }" };
Is there a way to handle & store the json from Metadata as a proper Json object i.e.
{ "DeviceId" : "abc1234", … , "MetaData" : { "TestKey": "test value" } };
I found the way to achieve it in ASA - you need to create javascript user function:
function parseJson(strjson){
return JSON.parse(strjson);
}
And call it in your query:
...
SELECT TD.*, udf.parseJson(MD.MetaData)
...
As you mentioned in your question,the reference json data is treated as json string, not json object. Based on my researching on the Query Syntax in ASA, there is no built-in function to convert that.
However, I'd suggest you using Azure Function Cosmos DB Trigger to process every document which is created. Please refer to my function code:
using System;
using System.Collections.Generic;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.Client;
using Microsoft.Azure.WebJobs;
using Microsoft.Azure.WebJobs.Host;
using Newtonsoft.Json.Linq;
namespace ProcessJson
{
public class Class1
{
[FunctionName("DocumentUpdates")]
public static void Run(
[CosmosDBTrigger(databaseName:"db",collectionName: "item", ConnectionStringSetting = "CosmosDBConnection",LeaseCollectionName = "leases",
CreateLeaseCollectionIfNotExists = true)]
IReadOnlyList<Document> documents,
TraceWriter log)
{
log.Verbose("Start.........");
String endpointUrl = "https://***.documents.azure.com:443/";
String authorizationKey = "***";
String databaseId = "db";
String collectionId = "import";
DocumentClient client = new DocumentClient(new Uri(endpointUrl), authorizationKey);
for (int i = 0; i < documents.Count; i++)
{
Document doc = documents[i];
if((doc.alreadyFormat == Undefined.Value) ||(!doc.alreadyFormat)){
String MetaData = doc.GetPropertyValue<String>("MetaData");
JObject o = JObject.Parse(MetaData);
doc.SetPropertyValue("MetaData", o);
doc.SetPropertyValue("alreadyFormat", true);
client.ReplaceDocumentAsync(UriFactory.CreateDocumentUri(databaseId, collectionId, doc.Id), doc);
log.Verbose("Update document Id " + doc.Id);
}
}
}
}
}
In addition, please refer to the case: Azure Cosmos DB SQL - how to unescape inner json property

Converting StructType to Avro Schema, returns type as Union when using databricks spark-avro

I am using databricks spark-avro to convert a dataframe schema into avro schema.The returned avro schema fails to have a default value. This is causing issues when i am trying to create a Generic record out of the schema. Can, any one help with the right way of using this function ?
Dataset<Row> sellableDs = sparkSession.sql("sql query");
SchemaBuilder.RecordBuilder<Schema> rb = SchemaBuilder.record("testrecord").namespace("test_namespace");
Schema sc = SchemaConverters.convertStructToAvro(sellableDs.schema(), rb, "test_namespace");
System.out.println(sc.toString());
System.out.println(sc.getFields().get(0).toString());
String schemaString = sc.toString();
sellableDs.foreach(
(ForeachFunction<Row>) row -> {
Schema scEx = new Schema.Parser().parse(schemaString);
GenericRecord gr;
gr = new GenericData.Record(scEx);
System.out.println("Generic record Created");
int fieldSize = scEx.getFields().size();
for (int i = 0; i < fieldSize; i++ ) {
// System.out.println( row.get(i).toString());
System.out.println("field: " + scEx.getFields().get(i).toString() + "::" + "value:" + row.get(i));
gr.put(scEx.getFields().get(i).toString(), row.get(i));
//i++;
}
}
);
This is the df schema:
StructType(StructField(key,IntegerType,true), StructField(value,DoubleType,true))
This is the avro converted schema:
{"type":"record","name":"testrecord","namespace":"test_namespace","fields":[{"name":"key","type":["int","null"]},{"name":"value","type":["double","null"]}]}
The problems is that the class SchemaConverters does not include default values as part of the schema creation. You have 2 options, modify the schema adding default values before Record creation or filling the record before building with some value( it could be actually values from your row). For example null. This is an example how create a Record using your schema
import org.apache.avro.generic.GenericRecordBuilder
import org.apache.avro.Schema
var schema = new Schema.Parser().parse("{\"type\":\"record\",\"name\":\"testrecord\",\"namespace\":\"test_namespace\",\"fields\":[{\"name\":\"key\",\"type\":[\"int\",\"null\"]},{\"name\":\"value\",\"type\":[\"double\",\"null\"]}]}")
var builder = new GenericRecordBuilder(schema);
for (i <- 0 to schema.getFields().size() - 1 ) {
builder.set(schema.getFields().get(i).name(), null)
}
var record = builder.build();
print(record.toString())

How can I log something in USQL UDO?

I have custom extractor, and I'm trying to log some messages from it.
I've tried obvious things like Console.WriteLine, but cannot find where output is. However, I found some system logs in adl://<my_DLS>.azuredatalakestore.net/system/jobservice/jobs/Usql/.../<my_job_id>/.
How can I log something? Is it possible to specify log file somewhere on Data Lake Store or Blob Storage Account?
A recent release of U-SQL has added diagnostic logging for UDOs. See the release notes here.
// Enable the diagnostics preview feature
SET ##FeaturePreviews = "DIAGNOSTICS:ON";
// Extract as one column
#input =
EXTRACT col string
FROM "/input/input42.txt"
USING new Utilities.MyExtractor();
#output =
SELECT *
FROM #input;
// Output the file
OUTPUT #output
TO "/output/output.txt"
USING Outputters.Tsv(quoting : false);
This was my diagnostic line from the UDO:
Microsoft.Analytics.Diagnostics.DiagnosticStream.WriteLine(System.String.Format("Concatenations done: {0}", i));
This is the whole UDO:
using System.Collections.Generic;
using System.IO;
using System.Text;
using Microsoft.Analytics.Interfaces;
namespace Utilities
{
[SqlUserDefinedExtractor(AtomicFileProcessing = true)]
public class MyExtractor : IExtractor
{
//Contains the row
private readonly Encoding _encoding;
private readonly byte[] _row_delim;
private readonly char _col_delim;
public MyExtractor()
{
_encoding = Encoding.UTF8;
_row_delim = _encoding.GetBytes("\n\n");
_col_delim = '|';
}
public override IEnumerable<IRow> Extract(IUnstructuredReader input, IUpdatableRow output)
{
string s = string.Empty;
string x = string.Empty;
int i = 0;
foreach (var current in input.Split(_row_delim))
{
using (System.IO.StreamReader streamReader = new StreamReader(current, this._encoding))
{
while ((s = streamReader.ReadLine()) != null)
{
//Strip any line feeds
//s = s.Replace("/n", "");
// Concatenate the lines
x += s;
i += 1;
}
Microsoft.Analytics.Diagnostics.DiagnosticStream.WriteLine(System.String.Format("Concatenations done: {0}", i));
//Create the output
output.Set<string>(0, x);
yield return output.AsReadOnly();
// Reset
x = string.Empty;
}
}
}
}
}
And these were my results found in the following directory:
/system/jobservice/jobs/Usql/2017/10/20.../diagnosticstreams
good question. I have been asking myself the same thing. This is theoretical, but I think it would work (I'll updated if I find differently).
One very hacky way is that you could insert rows into a table with your log messages as a string column. Then you can select those out and filter based on some log_producer_id column. You also get the benefit of logging if part of the script works, but later parts do not assuming the failure does not roll back. Table can be dumped at end as well to file.
For the error cases, you can use the Job Manager in ADLA to open the job graph and then view the job output. The errors often have detailed information for data-related errors (e.g. row number in file with error and a octal/hex/ascii dump of the row with issue marked with ###).
Hope this helps,
J
ps. This isn't a comment or an answer really, since I don't have working code. Please provide feedback if the above ideas are wrong.

Amazon RDS w/ SQL Server wont allow bulk insert from CSV source

I've tried two methods and both fall flat...
BULK INSERT TEMPUSERIMPORT1357081926
FROM 'C:\uploads\19E0E1.csv'
WITH (FIELDTERMINATOR = ',',ROWTERMINATOR = '\n')
You do not have permission to use the bulk load statement.
but you cannot enable that SQL Role with Amazon RDS?
So I tried... using openrowset but it requires AdHoc Queries to be enabled which I don't have permission to do!
I know this question is really old, but it was the first question that came up when I searched bulk inserting into an aws sql server rds instance. Things have changed and you can now do it after integrating the RDS instance with S3. I answered this question in more detail on this question. But overall gist is that you setup the instance with the proper role, put your file on S3, then you can copy the file over to RDS with the following commands:
exec msdb.dbo.rds_download_from_s3
#s3_arn_of_file='arn:aws:s3:::bucket_name/bulk_data.csv',
#rds_file_path='D:\S3\seed_data\data.csv',
#overwrite_file=1;
Then BULK INSERT will work:
FROM 'D:\S3\seed_data\data.csv'
WITH
(
FIRSTROW = 2,
FIELDTERMINATOR = ',',
ROWTERMINATOR = '\n'
)
AWS doc
You can enable ad hoc distributed queries via heading to your Amazon Management Console, navigating to your RDS menu and then creating a DB Parameter group with ad hoc distributed queries set to 1, and then attaching this parameter group to your DB instance.
Don't forget to reboot your DB once you have made these changes.
Here is the source of my information:
http://blogs.lessthandot.com/index.php/datamgmt/dbadmin/turning-on-optimize-for-ad/
Hope this helps you.
2022
I'm adding for anyone like me who wants to quickly insert data into RDS from C#
While RDS allows csv bulk uploads directly from S3 instances, there are times when you just want to directly upload data straight from your program.
I've written a C# utility method which does inserts using a StringBuilder to concatenate statements to do 2000 inserts per call, which is way faster than an ORM like dapper which does one insert per call.
This method should handle date, int, double, and varchar fields, but I haven't had to use it for character escaping or anything like that.
//call as
FastInsert.Insert(MyDbConnection, new object[]{{someField = "someValue"}}, "my_table");
class FastInsert
{
static int rowSize = 2000;
internal static void Insert(IDbConnection connection, object[] data, string targetTable)
{
var props = data[0].GetType().GetProperties();
var names = props.Select(x => x.Name).ToList();
foreach(var batch in data.Batch(rowSize))
{
var sb = new StringBuilder($"insert into {targetTable} ({string.Join(",", names)})");
string lastLine = "";
foreach(var row in batch)
{
sb.Append(lastLine);
var values = props.Select(prop => CreateSQLString(row, prop));
lastLine = $"select '{string.Join("','", values)}' union all ";
}
lastLine = lastLine.Substring(0, lastLine.Length - " union all".Length) + " from dual";
sb.Append(lastLine);
var fullQuery = sb.ToString();
connection.Execute(fullQuery);
}
}
private static string CreateSQLString(object row, PropertyInfo prop)
{
var value = prop.GetValue(row);
if (value == null) return "null";
if (prop.PropertyType == typeof(DateTime))
{
return $"'{((DateTime)value).ToString("yyyy-MM-dd HH:mm:ss")}'";
}
//if (prop.PropertyType == typeof(string))
//{
return $"'{value.ToString().Replace("'", "''")}'";
//}
}
}
static class Extensions
{
public static IEnumerable<T[]> Batch<T>(this IEnumerable<T> source, int size) //split an IEnumerable into batches
{
T[] bucket = null;
var count = 0;
foreach (var item in source)
{
if (bucket == null)
bucket = new T[size];
bucket[count++] = item;
if (count != size)
continue;
yield return bucket;
bucket = null;
count = 0;
}
// Return the last bucket with all remaining elements
if (bucket != null && count > 0)
{
Array.Resize(ref bucket, count);
yield return bucket;
}
}
}

error, string or binary data would be truncated when trying to insert

I am running data.bat file with the following lines:
Rem Tis batch file will populate tables
cd\program files\Microsoft SQL Server\MSSQL
osql -U sa -P Password -d MyBusiness -i c:\data.sql
The contents of the data.sql file is:
insert Customers
(CustomerID, CompanyName, Phone)
Values('101','Southwinds','19126602729')
There are 8 more similar lines for adding records.
When I run this with start > run > cmd > c:\data.bat, I get this error message:
1>2>3>4>5>....<1 row affected>
Msg 8152, Level 16, State 4, Server SP1001, Line 1
string or binary data would be truncated.
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
Also, I am a newbie obviously, but what do Level #, and state # mean, and how do I look up error messages such as the one above: 8152?
From #gmmastros's answer
Whenever you see the message....
string or binary data would be truncated
Think to yourself... The field is NOT big enough to hold my data.
Check the table structure for the customers table. I think you'll find that the length of one or more fields is NOT big enough to hold the data you are trying to insert. For example, if the Phone field is a varchar(8) field, and you try to put 11 characters in to it, you will get this error.
I had this issue although data length was shorter than the field length.
It turned out that the problem was having another log table (for audit trail), filled by a trigger on the main table, where the column size also had to be changed.
In one of the INSERT statements you are attempting to insert a too long string into a string (varchar or nvarchar) column.
If it's not obvious which INSERT is the offender by a mere look at the script, you could count the <1 row affected> lines that occur before the error message. The obtained number plus one gives you the statement number. In your case it seems to be the second INSERT that produces the error.
Just want to contribute with additional information: I had the same issue and it was because of the field wasn't big enough for the incoming data and this thread helped me to solve it (the top answer clarifies it all).
BUT it is very important to know what are the possible reasons that may cause it.
In my case i was creating the table with a field like this:
Select '' as Period, * From Transactions Into #NewTable
Therefore the field "Period" had a length of Zero and causing the Insert operations to fail. I changed it to "XXXXXX" that is the length of the incoming data and it now worked properly (because field now had a lentgh of 6).
I hope this help anyone with same issue :)
Some of your data cannot fit into your database column (small). It is not easy to find what is wrong. If you use C# and Linq2Sql, you can list the field which would be truncated:
First create helper class:
public class SqlTruncationExceptionWithDetails : ArgumentOutOfRangeException
{
public SqlTruncationExceptionWithDetails(System.Data.SqlClient.SqlException inner, DataContext context)
: base(inner.Message + " " + GetSqlTruncationExceptionWithDetailsString(context))
{
}
/// <summary>
/// PArt of code from following link
/// http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
/// </summary>
/// <param name="context"></param>
/// <returns></returns>
static string GetSqlTruncationExceptionWithDetailsString(DataContext context)
{
StringBuilder sb = new StringBuilder();
foreach (object update in context.GetChangeSet().Updates)
{
FindLongStrings(update, sb);
}
foreach (object insert in context.GetChangeSet().Inserts)
{
FindLongStrings(insert, sb);
}
return sb.ToString();
}
public static void FindLongStrings(object testObject, StringBuilder sb)
{
foreach (var propInfo in testObject.GetType().GetProperties())
{
foreach (System.Data.Linq.Mapping.ColumnAttribute attribute in propInfo.GetCustomAttributes(typeof(System.Data.Linq.Mapping.ColumnAttribute), true))
{
if (attribute.DbType.ToLower().Contains("varchar"))
{
string dbType = attribute.DbType.ToLower();
int numberStartIndex = dbType.IndexOf("varchar(") + 8;
int numberEndIndex = dbType.IndexOf(")", numberStartIndex);
string lengthString = dbType.Substring(numberStartIndex, (numberEndIndex - numberStartIndex));
int maxLength = 0;
int.TryParse(lengthString, out maxLength);
string currentValue = (string)propInfo.GetValue(testObject, null);
if (!string.IsNullOrEmpty(currentValue) && maxLength != 0 && currentValue.Length > maxLength)
{
//string is too long
sb.AppendLine(testObject.GetType().Name + "." + propInfo.Name + " " + currentValue + " Max: " + maxLength);
}
}
}
}
}
}
Then prepare the wrapper for SubmitChanges:
public static class DataContextExtensions
{
public static void SubmitChangesWithDetailException(this DataContext dataContext)
{
//http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
try
{
//this can failed on data truncation
dataContext.SubmitChanges();
}
catch (SqlException sqlException) //when (sqlException.Message == "String or binary data would be truncated.")
{
if (sqlException.Message == "String or binary data would be truncated.") //only for EN windows - if you are running different window language, invoke the sqlException.getMessage on thread with EN culture
throw new SqlTruncationExceptionWithDetails(sqlException, dataContext);
else
throw;
}
}
}
Prepare global exception handler and log truncation details:
protected void Application_Error(object sender, EventArgs e)
{
Exception ex = Server.GetLastError();
string message = ex.Message;
//TODO - log to file
}
Finally use the code:
Datamodel.SubmitChangesWithDetailException();
Another situation in which you can get this error is the following:
I had the same error and the reason was that in an INSERT statement that received data from an UNION, the order of the columns was different from the original table. If you change the order in #table3 to a, b, c, you will fix the error.
select a, b, c into #table1
from #table0
insert into #table1
select a, b, c from #table2
union
select a, c, b from #table3
on sql server you can use SET ANSI_WARNINGS OFF like this:
using (SqlConnection conn = new SqlConnection("Data Source=XRAYGOAT\\SQLEXPRESS;Initial Catalog='Healthy Care';Integrated Security=True"))
{
conn.Open();
using (var trans = conn.BeginTransaction())
{
try
{
using cmd = new SqlCommand("", conn, trans))
{
cmd.CommandText = "SET ANSI_WARNINGS OFF";
cmd.ExecuteNonQuery();
cmd.CommandText = "YOUR INSERT HERE";
cmd.ExecuteNonQuery();
cmd.Parameters.Clear();
cmd.CommandText = "SET ANSI_WARNINGS ON";
cmd.ExecuteNonQuery();
trans.Commit();
}
}
catch (Exception)
{
trans.Rollback();
}
}
conn.Close();
}
I had the same issue. The length of my column was too short.
What you can do is either increase the length or shorten the text you want to put in the database.
Also had this problem occurring on the web application surface.
Eventually found out that the same error message comes from the SQL update statement in the specific table.
Finally then figured out that the column definition in the relating history table(s) did not map the original table column length of nvarchar types in some specific cases.
I had the same problem, even after increasing the size of the problematic columns in the table.
tl;dr: The length of the matching columns in corresponding Table Types may also need to be increased.
In my case, the error was coming from the Data Export service in Microsoft Dynamics CRM, which allows CRM data to be synced to an SQL Server DB or Azure SQL DB.
After a lengthy investigation, I concluded that the Data Export service must be using Table-Valued Parameters:
You can use table-valued parameters to send multiple rows of data to a Transact-SQL statement or a routine, such as a stored procedure or function, without creating a temporary table or many parameters.
As you can see in the documentation above, Table Types are used to create the data ingestion procedure:
CREATE TYPE LocationTableType AS TABLE (...);
CREATE PROCEDURE dbo.usp_InsertProductionLocation
#TVP LocationTableType READONLY
Unfortunately, there is no way to alter a Table Type, so it has to be dropped & recreated entirely. Since my table has over 300 fields (😱), I created a query to facilitate the creation of the corresponding Table Type based on the table's columns definition (just replace [table_name] with your table's name):
SELECT 'CREATE TYPE [table_name]Type AS TABLE (' + STRING_AGG(CAST(field AS VARCHAR(max)), ',' + CHAR(10)) + ');' AS create_type
FROM (
SELECT TOP 5000 COLUMN_NAME + ' ' + DATA_TYPE
+ IIF(CHARACTER_MAXIMUM_LENGTH IS NULL, '', CONCAT('(', IIF(CHARACTER_MAXIMUM_LENGTH = -1, 'max', CONCAT(CHARACTER_MAXIMUM_LENGTH,'')), ')'))
+ IIF(DATA_TYPE = 'decimal', CONCAT('(', NUMERIC_PRECISION, ',', NUMERIC_SCALE, ')'), '')
AS field
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '[table_name]'
ORDER BY ORDINAL_POSITION) AS T;
After updating the Table Type, the Data Export service started functioning properly once again! :)
When I tried to execute my stored procedure I had the same problem because the size of the column that I need to add some data is shorter than the data I want to add.
You can increase the size of the column data type or reduce the length of your data.
A 2016/2017 update will show you the bad value and column.
A new trace flag will swap the old error for a new 2628 error and will print out the column and offending value. Traceflag 460 is available in the latest cumulative update for 2016 and 2017:
https://support.microsoft.com/en-sg/help/4468101/optional-replacement-for-string-or-binary-data-would-be-truncated
Just make sure that after you've installed the CU that you enable the trace flag, either globally/permanently on the server:
...or with DBCC TRACEON:
https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-traceon-trace-flags-transact-sql?view=sql-server-ver15
Another situation, in which this error may occur is in
SQL Server Management Studio. If you have "text" or "ntext" fields in your table,
no matter what kind of field you are updating (for example bit or integer).
Seems that the Studio does not load entire "ntext" fields and also updates ALL fields instead of the modified one.
To solve the problem, exclude "text" or "ntext" fields from the query in Management Studio
This Error Comes only When any of your field length is greater than the field length specified in sql server database table structure.
To overcome this issue you have to reduce the length of the field Value .
Or to increase the length of database table field .
If someone is encountering this error in a C# application, I have created a simple way of finding offending fields by:
Getting the column width of all the columns of a table where we're trying to make this insert/ update. (I'm getting this info directly from the database.)
Comparing the column widths to the width of the values we're trying to insert/ update.
Assumptions/ Limitations:
The column names of the table in the database match with the C# entity fields. For eg: If you have a column like this in database:
You need to have your Entity with the same column name:
public class SomeTable
{
// Other fields
public string SourceData { get; set; }
}
You're inserting/ updating 1 entity at a time. It'll be clearer in the demo code below. (If you're doing bulk inserts/ updates, you might want to either modify it or use some other solution.)
Step 1:
Get the column width of all the columns directly from the database:
// For this, I took help from Microsoft docs website:
// https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.getschema?view=netframework-4.7.2#System_Data_SqlClient_SqlConnection_GetSchema_System_String_System_String___
private static Dictionary<string, int> GetColumnSizesOfTableFromDatabase(string tableName, string connectionString)
{
var columnSizes = new Dictionary<string, int>();
using (var connection = new SqlConnection(connectionString))
{
// Connect to the database then retrieve the schema information.
connection.Open();
// You can specify the Catalog, Schema, Table Name, Column Name to get the specified column(s).
// You can use four restrictions for Column, so you should create a 4 members array.
String[] columnRestrictions = new String[4];
// For the array, 0-member represents Catalog; 1-member represents Schema;
// 2-member represents Table Name; 3-member represents Column Name.
// Now we specify the Table_Name and Column_Name of the columns what we want to get schema information.
columnRestrictions[2] = tableName;
DataTable allColumnsSchemaTable = connection.GetSchema("Columns", columnRestrictions);
foreach (DataRow row in allColumnsSchemaTable.Rows)
{
var columnName = row.Field<string>("COLUMN_NAME");
//var dataType = row.Field<string>("DATA_TYPE");
var characterMaxLength = row.Field<int?>("CHARACTER_MAXIMUM_LENGTH");
// I'm only capturing columns whose Datatype is "varchar" or "char", i.e. their CHARACTER_MAXIMUM_LENGTH won't be null.
if(characterMaxLength != null)
{
columnSizes.Add(columnName, characterMaxLength.Value);
}
}
connection.Close();
}
return columnSizes;
}
Step 2:
Compare the column widths with the width of the values we're trying to insert/ update:
public static Dictionary<string, string> FindLongBinaryOrStringFields<T>(T entity, string connectionString)
{
var tableName = typeof(T).Name;
Dictionary<string, string> longFields = new Dictionary<string, string>();
var objectProperties = GetProperties(entity);
//var fieldNames = objectProperties.Select(p => p.Name).ToList();
var actualDatabaseColumnSizes = GetColumnSizesOfTableFromDatabase(tableName, connectionString);
foreach (var dbColumn in actualDatabaseColumnSizes)
{
var maxLengthOfThisColumn = dbColumn.Value;
var currentValueOfThisField = objectProperties.Where(f => f.Name == dbColumn.Key).First()?.GetValue(entity, null)?.ToString();
if (!string.IsNullOrEmpty(currentValueOfThisField) && currentValueOfThisField.Length > maxLengthOfThisColumn)
{
longFields.Add(dbColumn.Key, $"'{dbColumn.Key}' column cannot take the value of '{currentValueOfThisField}' because the max length it can take is {maxLengthOfThisColumn}.");
}
}
return longFields;
}
public static List<PropertyInfo> GetProperties<T>(T entity)
{
//The DeclaredOnly flag makes sure you only get properties of the object, not from the classes it derives from.
var properties = entity.GetType()
.GetProperties(System.Reflection.BindingFlags.Public
| System.Reflection.BindingFlags.Instance
| System.Reflection.BindingFlags.DeclaredOnly)
.ToList();
return properties;
}
Demo:
Let's say we're trying to insert someTableEntity of SomeTable class that is modeled in our app like so:
public class SomeTable
{
[Key]
public long TicketID { get; set; }
public string SourceData { get; set; }
}
And it's inside our SomeDbContext like so:
public class SomeDbContext : DbContext
{
public DbSet<SomeTable> SomeTables { get; set; }
}
This table in Db has SourceData field as varchar(16) like so:
Now we'll try to insert value that is longer than 16 characters into this field and capture this information:
public void SaveSomeTableEntity()
{
var connectionString = "server=SERVER_NAME;database=DB_NAME;User ID=SOME_ID;Password=SOME_PASSWORD;Connection Timeout=200";
using (var context = new SomeDbContext(connectionString))
{
var someTableEntity = new SomeTable()
{
SourceData = "Blah-Blah-Blah-Blah-Blah-Blah"
};
context.SomeTables.Add(someTableEntity);
try
{
context.SaveChanges();
}
catch (Exception ex)
{
if (ex.GetBaseException().Message == "String or binary data would be truncated.\r\nThe statement has been terminated.")
{
var badFieldsReport = "";
List<string> badFields = new List<string>();
// YOU GOT YOUR FIELDS RIGHT HERE:
var longFields = FindLongBinaryOrStringFields(someTableEntity, connectionString);
foreach (var longField in longFields)
{
badFields.Add(longField.Key);
badFieldsReport += longField.Value + "\n";
}
}
else
throw;
}
}
}
The badFieldsReport will have this value:
'SourceData' column cannot take the value of
'Blah-Blah-Blah-Blah-Blah-Blah' because the max length it can take is
16.
Kevin Pope's comment under the accepted answer was what I needed.
The problem, in my case, was that I had triggers defined on my table that would insert update/insert transactions into an audit table, but the audit table had a data type mismatch where a column with VARCHAR(MAX) in the original table was stored as VARCHAR(1) in the audit table, so my triggers were failing when I would insert anything greater than VARCHAR(1) in the original table column and I would get this error message.
I used a different tactic, fields that are allocated 8K in some places. Here only about 50/100 are used.
declare #NVPN_list as table
nvpn varchar(50)
,nvpn_revision varchar(5)
,nvpn_iteration INT
,mpn_lifecycle varchar(30)
,mfr varchar(100)
,mpn varchar(50)
,mpn_revision varchar(5)
,mpn_iteration INT
-- ...
) INSERT INTO #NVPN_LIST
SELECT left(nvpn ,50) as nvpn
,left(nvpn_revision ,10) as nvpn_revision
,nvpn_iteration
,left(mpn_lifecycle ,30)
,left(mfr ,100)
,left(mpn ,50)
,left(mpn_revision ,5)
,mpn_iteration
,left(mfr_order_num ,50)
FROM [DASHBOARD].[dbo].[mpnAttributes] (NOLOCK) mpna
I wanted speed, since I have 1M total records, and load 28K of them.
This error may be due to less field size than your entered data.
For e.g. if you have data type nvarchar(7) and if your value is 'aaaaddddf' then error is shown as:
string or binary data would be truncated
You simply can't beat SQL Server on this.
You can insert into a new table like this:
select foo, bar
into tmp_new_table_to_dispose_later
from my_table
and compare the table definition with the real table you want to insert the data into.
Sometime it's helpful sometimes it's not.
If you try inserting in the final/real table from that temporary table it may just work (due to data conversion working differently than SSMS for example).
Another alternative is to insert the data in chunks, instead of inserting everything immediately you insert with top 1000 and you repeat the process, till you find a chunk with an error. At least you have better visibility on what's not fitting into the table.