I am fairly new to SSIS and need a little help getting started. I have several reports that come out of our mainframe. The reports are not in a columnar format. The date record is at the top then there might be some initial data then there might be a little more. So I need to read in each line look to see what the text reads and figure out if I need the data or move to the next row.
This is a VERY rough example of what the report I want to import into a SQL table.
DATE: 01/08/2020 FACILITY NAME PAGE1
REVENUE USAGE FOR ACCOUNTING PERIOD 02
----TOTAL---- ----TOTAL---- ----OTHER---- ----INSURANCE---- ----INSURANCE2----
SERVICE CODE - 123456789 DESCRIPTION: WIDGETS
CURR 2,077
IP 0.0000 3 2,345 0.00
143
OP 0.0000 2 1,231 0.00
YTD 5
IP 0.0000
76
OP 0.0000
etc . . . .. .
SERVICE CODE
After the SERVICE CODE the data will start to repeat like it is above. This is the basic idea of a report.
I want to get the Date then the Service Code, Description, Current IP Volume, Current IP Dollar, Current OP Volume, Current OP Dollar, YTD IP Volume, YTD IP Dollar, YTD OP Volume, YTD OP Dollar . . then repeat.
Just to clarify, I am not asking anyone to do this for me. I want to learn how to do this. I have looked on how to do this but every example I have looked at talks about doing this with a CSV, tab, or Excel file. i do not have that type of file so I was asking what I need to look at. I currently use Monarch to format the file, but again I want to learn more about SSIS and this is a perfect way to learn. Asking the vendor to redo the report is not an option plus I want to learn how to do this. Thank you I just wanted to get that out there.
Any help would be greatly appreciated.
Rodger
As stated in comments, you could do this using a script task. The basics steps are:
Define a DataTable to store your data.
Use a StreamReader to read your report.
Process this using a combination of conditionals, String Methods, and parsing to extract the relevant fields from the relevant line:
Write the DataTable to the database using SqlBulkCopy
The following would go inside your Main method in your script task:
//Define a table to store your data
var table = new DataTable
{
Columns =
{
{ "ServiceCode", typeof(string) },
{ "Description", typeof(string) },
{ "CurrentIPVolume", typeof(int) },
{ "CurrentIPDollar,", typeof(decimal) },
{ "CurrentOPVolume", typeof(int) },
{ "CurrentOPDollar", typeof(decimal) },
{ "YTDIPVolume", typeof(int) },
{ "YTDIPDollar,", typeof(decimal) },
{ "YTDOPVolume", typeof(int) },
{ "YTDOPDollar", typeof(decimal) }
}
};
var filePath = #"Your File Path";
using (var reader = new StreamReader(filePath))
{
string line = null;
DataRow row = null;
// As YTD and Curr are identical, we will need a flag later to mark our position within the record
bool ytdFlag= false;
//Loop through every line in the file
while ((line = reader.ReadLine()) != null)
{
//if the line is blank, move on to the next
if (string.IsNullOrWhiteSpace(line)
continue;
// If the line starts with service code, then it marks the start of a new record
if (line.StartsWith("SERVICE CODE"))
{
//If the current value for row is not null then this is
//not the first record, so we need to add the previous
//record to the tale before continuing
if (row != null)
{
table.Rows.Add(row);
ytdFlag= false; // New record, reset YTD flag
}
row = table.NewRow();
//Split the line now based on known values:
var tokens = line.Split(new string[] { "SERVICE CODE - ", "DESCRIPTION: "}, StringSplitOptions.None);
row[0] = tokens[0];
row[1] = tokens[1];
}
if (line.StartsWith("CURR"))
{
//Process the row --> "CURR 2,077"
//Not sure what 2,077 is, but this will parse it
int i = 0;
if (int.TryParse(line.Substring(4).Trim().Replace(",", ""), out i))
{
//Do something with your int
Console.WriteLine(i);
}
}
if (line.StartsWith(" IP"))
{
//Start at after IP then split the line into the 4 numbers
var tokens = line.Substring(3).Split(new [] { " "}, StringSplitOptions.RemoveEmptyEntries);
//If we have gone past the CURR record, then at to YTD Columns
if (ytdFlag)
{
row[6] = int.Parse(tokens[1]);
row[7] = decimal.Parse(tokens[1]);
}
//Otherwise we are still in the CURR section:
else
{
row[2] = int.Parse(tokens[1]);
row[3] = decimal.Parse(tokens[1]);
}
}
if (line.StartsWith(" OP"))
{
//Start at after OP then split the line into the 4 numbers
var tokens = line.Substring(3).Split(new [] { " "}, StringSplitOptions.RemoveEmptyEntries);
//If we have gone past the CURR record, then at to YTD Columns
if (ytdFlag)
{
row[8] = int.Parse(tokens[1]);
row[9] = decimal.Parse(tokens[1]);
}
//Otherwise we are still in the CURR section:
else
{
row[4] = int.Parse(tokens[1]);
row[5] = decimal.Parse(tokens[1]);
}
//After we have processed an OP record, we must set the YTD Flag to true.
//Doesn't matter if it is the YTD OP record, since the flag will be reset
//By the next line that starts with SERVICE CODE anyway
ytdFlag= true;
}
}
}
//Now that we have processed the file, we can write the data to a database
using (var sqlBulkCopy = new SqlBulkCopy("Your Connection String"))
{
sqlBulkCopy.DestinationTableName = "dbo.YourTable";
//If necessary add column mappings, but if your DataTable matches your database table
//then this is not required
sqlBulkCopy.WriteToServer(table);
}
This is a very quick example, far from the finished article, and I have done little or no testing, but it should give you the gist of how it could be done, and get you started on one possible solution.
It can definitely be cleaned up and refactored, but I have tried to make it as clear as possible what is going on, rather than trying to write the most efficient code ever. It should also (hopefully) demonstrate what a monumental pain this is to do, and very minor report changes things like an extra space be "OP" will break the whole thing.
So again, I would re-iterate, if you can get the data in a standard flat file format, with one line per record, you should. I do however appreciate that sometimes these things are out of your control, and I have had to write incredibly ugly import routines like this in the past, so I feel your pain if you can't get the data in a consumable format.
I have built an SSIS package that loads in several delimited text files into a SQL database. One of the files often contains line spaces in it, which breaks the standard data flow task of setting a flat file source and mapping to an ado.net destination since it thinks it is on a new line when it reaches a line break. The vendor sending over the files does not want to sent the file without any edits and can't do XML at this time. Is there any way to fix this? I was thinking of writing a small vb.net program that would correct the files so they would work in the SSIS package, but not sure how to write that logic. The file has 5 columns, the first 2 are big integer and always contain some long integer ID, then there is a small text column that just contains one short word, then a date, and then a long comments field that is causing the problem. The comments field is sometimes blank (which is ok), the problem are the rows that have line breaks. I never know how many line breaks are in the comments, some have none, some can have several, even multiple line breaks in a row, so was wondering if this is even possible.
5787626|6547599|Approved|1/10/2017|Applicant request for fee waiver approved
5443221|7742812|Active|11/5/2013|
3430962|7643957|Re-Scheduled|5/25/2016|REVISED TERMS AND CONDITIONS REJECTED
Applicant has 30 DAYS To submit paperwork for extension.
34433624|7673715|Denied|1/24/2017|
34113575|7653748|Active|1/8/2014|New terms have been granted.
Sample File Format.
As long as there is logic that you can program/predict, it will be possible.
I would do it using a Script Component as a source, which means you don't need to rewrite the file before processing it. It also provides a lot of flexibility, e.g., you can store values in variables while iterating over multiple lines in the file, etc.
I posted another answer recently that gives a lot of detail on how to go about this: SSIS import a Flat File to SQL with the first row as header and last row as a total.
An example of holding the values in variables until the row is ready to be written:-
For this example I am writing three columns, ID1, ID2 and Comments. The file looks like this:
1|2|Comment1
Comment2
4|5|Comment3
Comment4
Comment5
6|7|Comment6
The Script Component contains the following method.
public override void CreateNewOutputRows()
{
System.IO.StreamReader reader = null;
try
{
bool readFirstLine = false;
int id1 = 0;
int id2 = 0;
string comments = null;
reader = new System.IO.StreamReader(Variables.FilePath); // this refers to a package variable that contains the file path
while (!reader.EndOfStream)
{
string line = reader.ReadLine();
if (line.Contains("|"))
{
if (readFirstLine)
{
Output0Buffer.AddRow();
Output0Buffer.ID1 = id1;
Output0Buffer.ID2 = id2;
Output0Buffer.Comments = comments;
}
else
{
readFirstLine = true;
}
string[] fields = line.Split('|');
id1 = Convert.ToInt32(fields[0]);
id2 = Convert.ToInt32(fields[1]);
comments = fields[2];
}
else
{
comments += " " + line;
}
if (reader.EndOfStream)
{
Output0Buffer.AddRow();
Output0Buffer.ID1 = id1;
Output0Buffer.ID2 = id2;
Output0Buffer.Comments = comments;
}
}
}
catch
{
if (reader != null)
{
reader.Close();
reader.Dispose();
}
throw;
}
}
The result set is:
ID1 ID2 Comments
=== === ========
1 2 Comment1 Comment2
4 5 Comment3 Comment4 Comment5
6 7 Comment6
I am running data.bat file with the following lines:
Rem Tis batch file will populate tables
cd\program files\Microsoft SQL Server\MSSQL
osql -U sa -P Password -d MyBusiness -i c:\data.sql
The contents of the data.sql file is:
insert Customers
(CustomerID, CompanyName, Phone)
Values('101','Southwinds','19126602729')
There are 8 more similar lines for adding records.
When I run this with start > run > cmd > c:\data.bat, I get this error message:
1>2>3>4>5>....<1 row affected>
Msg 8152, Level 16, State 4, Server SP1001, Line 1
string or binary data would be truncated.
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
<1 row affected>
Also, I am a newbie obviously, but what do Level #, and state # mean, and how do I look up error messages such as the one above: 8152?
From #gmmastros's answer
Whenever you see the message....
string or binary data would be truncated
Think to yourself... The field is NOT big enough to hold my data.
Check the table structure for the customers table. I think you'll find that the length of one or more fields is NOT big enough to hold the data you are trying to insert. For example, if the Phone field is a varchar(8) field, and you try to put 11 characters in to it, you will get this error.
I had this issue although data length was shorter than the field length.
It turned out that the problem was having another log table (for audit trail), filled by a trigger on the main table, where the column size also had to be changed.
In one of the INSERT statements you are attempting to insert a too long string into a string (varchar or nvarchar) column.
If it's not obvious which INSERT is the offender by a mere look at the script, you could count the <1 row affected> lines that occur before the error message. The obtained number plus one gives you the statement number. In your case it seems to be the second INSERT that produces the error.
Just want to contribute with additional information: I had the same issue and it was because of the field wasn't big enough for the incoming data and this thread helped me to solve it (the top answer clarifies it all).
BUT it is very important to know what are the possible reasons that may cause it.
In my case i was creating the table with a field like this:
Select '' as Period, * From Transactions Into #NewTable
Therefore the field "Period" had a length of Zero and causing the Insert operations to fail. I changed it to "XXXXXX" that is the length of the incoming data and it now worked properly (because field now had a lentgh of 6).
I hope this help anyone with same issue :)
Some of your data cannot fit into your database column (small). It is not easy to find what is wrong. If you use C# and Linq2Sql, you can list the field which would be truncated:
First create helper class:
public class SqlTruncationExceptionWithDetails : ArgumentOutOfRangeException
{
public SqlTruncationExceptionWithDetails(System.Data.SqlClient.SqlException inner, DataContext context)
: base(inner.Message + " " + GetSqlTruncationExceptionWithDetailsString(context))
{
}
/// <summary>
/// PArt of code from following link
/// http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
/// </summary>
/// <param name="context"></param>
/// <returns></returns>
static string GetSqlTruncationExceptionWithDetailsString(DataContext context)
{
StringBuilder sb = new StringBuilder();
foreach (object update in context.GetChangeSet().Updates)
{
FindLongStrings(update, sb);
}
foreach (object insert in context.GetChangeSet().Inserts)
{
FindLongStrings(insert, sb);
}
return sb.ToString();
}
public static void FindLongStrings(object testObject, StringBuilder sb)
{
foreach (var propInfo in testObject.GetType().GetProperties())
{
foreach (System.Data.Linq.Mapping.ColumnAttribute attribute in propInfo.GetCustomAttributes(typeof(System.Data.Linq.Mapping.ColumnAttribute), true))
{
if (attribute.DbType.ToLower().Contains("varchar"))
{
string dbType = attribute.DbType.ToLower();
int numberStartIndex = dbType.IndexOf("varchar(") + 8;
int numberEndIndex = dbType.IndexOf(")", numberStartIndex);
string lengthString = dbType.Substring(numberStartIndex, (numberEndIndex - numberStartIndex));
int maxLength = 0;
int.TryParse(lengthString, out maxLength);
string currentValue = (string)propInfo.GetValue(testObject, null);
if (!string.IsNullOrEmpty(currentValue) && maxLength != 0 && currentValue.Length > maxLength)
{
//string is too long
sb.AppendLine(testObject.GetType().Name + "." + propInfo.Name + " " + currentValue + " Max: " + maxLength);
}
}
}
}
}
}
Then prepare the wrapper for SubmitChanges:
public static class DataContextExtensions
{
public static void SubmitChangesWithDetailException(this DataContext dataContext)
{
//http://stackoverflow.com/questions/3666954/string-or-binary-data-would-be-truncated-linq-exception-cant-find-which-fiel
try
{
//this can failed on data truncation
dataContext.SubmitChanges();
}
catch (SqlException sqlException) //when (sqlException.Message == "String or binary data would be truncated.")
{
if (sqlException.Message == "String or binary data would be truncated.") //only for EN windows - if you are running different window language, invoke the sqlException.getMessage on thread with EN culture
throw new SqlTruncationExceptionWithDetails(sqlException, dataContext);
else
throw;
}
}
}
Prepare global exception handler and log truncation details:
protected void Application_Error(object sender, EventArgs e)
{
Exception ex = Server.GetLastError();
string message = ex.Message;
//TODO - log to file
}
Finally use the code:
Datamodel.SubmitChangesWithDetailException();
Another situation in which you can get this error is the following:
I had the same error and the reason was that in an INSERT statement that received data from an UNION, the order of the columns was different from the original table. If you change the order in #table3 to a, b, c, you will fix the error.
select a, b, c into #table1
from #table0
insert into #table1
select a, b, c from #table2
union
select a, c, b from #table3
on sql server you can use SET ANSI_WARNINGS OFF like this:
using (SqlConnection conn = new SqlConnection("Data Source=XRAYGOAT\\SQLEXPRESS;Initial Catalog='Healthy Care';Integrated Security=True"))
{
conn.Open();
using (var trans = conn.BeginTransaction())
{
try
{
using cmd = new SqlCommand("", conn, trans))
{
cmd.CommandText = "SET ANSI_WARNINGS OFF";
cmd.ExecuteNonQuery();
cmd.CommandText = "YOUR INSERT HERE";
cmd.ExecuteNonQuery();
cmd.Parameters.Clear();
cmd.CommandText = "SET ANSI_WARNINGS ON";
cmd.ExecuteNonQuery();
trans.Commit();
}
}
catch (Exception)
{
trans.Rollback();
}
}
conn.Close();
}
I had the same issue. The length of my column was too short.
What you can do is either increase the length or shorten the text you want to put in the database.
Also had this problem occurring on the web application surface.
Eventually found out that the same error message comes from the SQL update statement in the specific table.
Finally then figured out that the column definition in the relating history table(s) did not map the original table column length of nvarchar types in some specific cases.
I had the same problem, even after increasing the size of the problematic columns in the table.
tl;dr: The length of the matching columns in corresponding Table Types may also need to be increased.
In my case, the error was coming from the Data Export service in Microsoft Dynamics CRM, which allows CRM data to be synced to an SQL Server DB or Azure SQL DB.
After a lengthy investigation, I concluded that the Data Export service must be using Table-Valued Parameters:
You can use table-valued parameters to send multiple rows of data to a Transact-SQL statement or a routine, such as a stored procedure or function, without creating a temporary table or many parameters.
As you can see in the documentation above, Table Types are used to create the data ingestion procedure:
CREATE TYPE LocationTableType AS TABLE (...);
CREATE PROCEDURE dbo.usp_InsertProductionLocation
#TVP LocationTableType READONLY
Unfortunately, there is no way to alter a Table Type, so it has to be dropped & recreated entirely. Since my table has over 300 fields (😱), I created a query to facilitate the creation of the corresponding Table Type based on the table's columns definition (just replace [table_name] with your table's name):
SELECT 'CREATE TYPE [table_name]Type AS TABLE (' + STRING_AGG(CAST(field AS VARCHAR(max)), ',' + CHAR(10)) + ');' AS create_type
FROM (
SELECT TOP 5000 COLUMN_NAME + ' ' + DATA_TYPE
+ IIF(CHARACTER_MAXIMUM_LENGTH IS NULL, '', CONCAT('(', IIF(CHARACTER_MAXIMUM_LENGTH = -1, 'max', CONCAT(CHARACTER_MAXIMUM_LENGTH,'')), ')'))
+ IIF(DATA_TYPE = 'decimal', CONCAT('(', NUMERIC_PRECISION, ',', NUMERIC_SCALE, ')'), '')
AS field
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = '[table_name]'
ORDER BY ORDINAL_POSITION) AS T;
After updating the Table Type, the Data Export service started functioning properly once again! :)
When I tried to execute my stored procedure I had the same problem because the size of the column that I need to add some data is shorter than the data I want to add.
You can increase the size of the column data type or reduce the length of your data.
A 2016/2017 update will show you the bad value and column.
A new trace flag will swap the old error for a new 2628 error and will print out the column and offending value. Traceflag 460 is available in the latest cumulative update for 2016 and 2017:
https://support.microsoft.com/en-sg/help/4468101/optional-replacement-for-string-or-binary-data-would-be-truncated
Just make sure that after you've installed the CU that you enable the trace flag, either globally/permanently on the server:
...or with DBCC TRACEON:
https://learn.microsoft.com/en-us/sql/t-sql/database-console-commands/dbcc-traceon-trace-flags-transact-sql?view=sql-server-ver15
Another situation, in which this error may occur is in
SQL Server Management Studio. If you have "text" or "ntext" fields in your table,
no matter what kind of field you are updating (for example bit or integer).
Seems that the Studio does not load entire "ntext" fields and also updates ALL fields instead of the modified one.
To solve the problem, exclude "text" or "ntext" fields from the query in Management Studio
This Error Comes only When any of your field length is greater than the field length specified in sql server database table structure.
To overcome this issue you have to reduce the length of the field Value .
Or to increase the length of database table field .
If someone is encountering this error in a C# application, I have created a simple way of finding offending fields by:
Getting the column width of all the columns of a table where we're trying to make this insert/ update. (I'm getting this info directly from the database.)
Comparing the column widths to the width of the values we're trying to insert/ update.
Assumptions/ Limitations:
The column names of the table in the database match with the C# entity fields. For eg: If you have a column like this in database:
You need to have your Entity with the same column name:
public class SomeTable
{
// Other fields
public string SourceData { get; set; }
}
You're inserting/ updating 1 entity at a time. It'll be clearer in the demo code below. (If you're doing bulk inserts/ updates, you might want to either modify it or use some other solution.)
Step 1:
Get the column width of all the columns directly from the database:
// For this, I took help from Microsoft docs website:
// https://learn.microsoft.com/en-us/dotnet/api/system.data.sqlclient.sqlconnection.getschema?view=netframework-4.7.2#System_Data_SqlClient_SqlConnection_GetSchema_System_String_System_String___
private static Dictionary<string, int> GetColumnSizesOfTableFromDatabase(string tableName, string connectionString)
{
var columnSizes = new Dictionary<string, int>();
using (var connection = new SqlConnection(connectionString))
{
// Connect to the database then retrieve the schema information.
connection.Open();
// You can specify the Catalog, Schema, Table Name, Column Name to get the specified column(s).
// You can use four restrictions for Column, so you should create a 4 members array.
String[] columnRestrictions = new String[4];
// For the array, 0-member represents Catalog; 1-member represents Schema;
// 2-member represents Table Name; 3-member represents Column Name.
// Now we specify the Table_Name and Column_Name of the columns what we want to get schema information.
columnRestrictions[2] = tableName;
DataTable allColumnsSchemaTable = connection.GetSchema("Columns", columnRestrictions);
foreach (DataRow row in allColumnsSchemaTable.Rows)
{
var columnName = row.Field<string>("COLUMN_NAME");
//var dataType = row.Field<string>("DATA_TYPE");
var characterMaxLength = row.Field<int?>("CHARACTER_MAXIMUM_LENGTH");
// I'm only capturing columns whose Datatype is "varchar" or "char", i.e. their CHARACTER_MAXIMUM_LENGTH won't be null.
if(characterMaxLength != null)
{
columnSizes.Add(columnName, characterMaxLength.Value);
}
}
connection.Close();
}
return columnSizes;
}
Step 2:
Compare the column widths with the width of the values we're trying to insert/ update:
public static Dictionary<string, string> FindLongBinaryOrStringFields<T>(T entity, string connectionString)
{
var tableName = typeof(T).Name;
Dictionary<string, string> longFields = new Dictionary<string, string>();
var objectProperties = GetProperties(entity);
//var fieldNames = objectProperties.Select(p => p.Name).ToList();
var actualDatabaseColumnSizes = GetColumnSizesOfTableFromDatabase(tableName, connectionString);
foreach (var dbColumn in actualDatabaseColumnSizes)
{
var maxLengthOfThisColumn = dbColumn.Value;
var currentValueOfThisField = objectProperties.Where(f => f.Name == dbColumn.Key).First()?.GetValue(entity, null)?.ToString();
if (!string.IsNullOrEmpty(currentValueOfThisField) && currentValueOfThisField.Length > maxLengthOfThisColumn)
{
longFields.Add(dbColumn.Key, $"'{dbColumn.Key}' column cannot take the value of '{currentValueOfThisField}' because the max length it can take is {maxLengthOfThisColumn}.");
}
}
return longFields;
}
public static List<PropertyInfo> GetProperties<T>(T entity)
{
//The DeclaredOnly flag makes sure you only get properties of the object, not from the classes it derives from.
var properties = entity.GetType()
.GetProperties(System.Reflection.BindingFlags.Public
| System.Reflection.BindingFlags.Instance
| System.Reflection.BindingFlags.DeclaredOnly)
.ToList();
return properties;
}
Demo:
Let's say we're trying to insert someTableEntity of SomeTable class that is modeled in our app like so:
public class SomeTable
{
[Key]
public long TicketID { get; set; }
public string SourceData { get; set; }
}
And it's inside our SomeDbContext like so:
public class SomeDbContext : DbContext
{
public DbSet<SomeTable> SomeTables { get; set; }
}
This table in Db has SourceData field as varchar(16) like so:
Now we'll try to insert value that is longer than 16 characters into this field and capture this information:
public void SaveSomeTableEntity()
{
var connectionString = "server=SERVER_NAME;database=DB_NAME;User ID=SOME_ID;Password=SOME_PASSWORD;Connection Timeout=200";
using (var context = new SomeDbContext(connectionString))
{
var someTableEntity = new SomeTable()
{
SourceData = "Blah-Blah-Blah-Blah-Blah-Blah"
};
context.SomeTables.Add(someTableEntity);
try
{
context.SaveChanges();
}
catch (Exception ex)
{
if (ex.GetBaseException().Message == "String or binary data would be truncated.\r\nThe statement has been terminated.")
{
var badFieldsReport = "";
List<string> badFields = new List<string>();
// YOU GOT YOUR FIELDS RIGHT HERE:
var longFields = FindLongBinaryOrStringFields(someTableEntity, connectionString);
foreach (var longField in longFields)
{
badFields.Add(longField.Key);
badFieldsReport += longField.Value + "\n";
}
}
else
throw;
}
}
}
The badFieldsReport will have this value:
'SourceData' column cannot take the value of
'Blah-Blah-Blah-Blah-Blah-Blah' because the max length it can take is
16.
Kevin Pope's comment under the accepted answer was what I needed.
The problem, in my case, was that I had triggers defined on my table that would insert update/insert transactions into an audit table, but the audit table had a data type mismatch where a column with VARCHAR(MAX) in the original table was stored as VARCHAR(1) in the audit table, so my triggers were failing when I would insert anything greater than VARCHAR(1) in the original table column and I would get this error message.
I used a different tactic, fields that are allocated 8K in some places. Here only about 50/100 are used.
declare #NVPN_list as table
nvpn varchar(50)
,nvpn_revision varchar(5)
,nvpn_iteration INT
,mpn_lifecycle varchar(30)
,mfr varchar(100)
,mpn varchar(50)
,mpn_revision varchar(5)
,mpn_iteration INT
-- ...
) INSERT INTO #NVPN_LIST
SELECT left(nvpn ,50) as nvpn
,left(nvpn_revision ,10) as nvpn_revision
,nvpn_iteration
,left(mpn_lifecycle ,30)
,left(mfr ,100)
,left(mpn ,50)
,left(mpn_revision ,5)
,mpn_iteration
,left(mfr_order_num ,50)
FROM [DASHBOARD].[dbo].[mpnAttributes] (NOLOCK) mpna
I wanted speed, since I have 1M total records, and load 28K of them.
This error may be due to less field size than your entered data.
For e.g. if you have data type nvarchar(7) and if your value is 'aaaaddddf' then error is shown as:
string or binary data would be truncated
You simply can't beat SQL Server on this.
You can insert into a new table like this:
select foo, bar
into tmp_new_table_to_dispose_later
from my_table
and compare the table definition with the real table you want to insert the data into.
Sometime it's helpful sometimes it's not.
If you try inserting in the final/real table from that temporary table it may just work (due to data conversion working differently than SSMS for example).
Another alternative is to insert the data in chunks, instead of inserting everything immediately you insert with top 1000 and you repeat the process, till you find a chunk with an error. At least you have better visibility on what's not fitting into the table.
I've two sets of search indexes.
TestIndex (used in our test environment) and ProdIndex(used in PRODUCTION environment).
Lucene search query: +date:[20090410184806 TO 20091007184806] works fine for test index but gives this error message for Prod index.
"maxClauseCount is set to 1024"
If I execute following line just before executing search query, then I do not get this error.
BooleanQuery.SetMaxClauseCount(Int16.MaxValue);
searcher.Search(myQuery, collector);
Am I missing something here? Why am not getting this error in test index?The schema for two indexes are same.They only differ wrt to number of records/data.PROD index has got higher number of records(around 1300) than those in test one (around 950).
The range query essentially gets transformed to a boolean query with one clause for every possible value, ORed together.
For example, the query +price:[10 to 13] is tranformed to a boolean query
+(price:10 price:11 price:12 price:13)
assuming all the values 10-13 exist in the index.
I suppose, all of your 1300 values fall in the range you have given. So, boolean query has 1300 clauses, which is higher than the default value of 1024. In the test index, the limit of 1024 is not reached as there are only 950 values.
I had the same problem. My solution was to catch BooleanQuery.TooManyClauses and dynamically increase maxClauseCount.
Here is some code that is similar to what I have in production.
private static Hits searchIndex(Searcher searcher, Query query) throws IOException
{
boolean retry = true;
while (retry)
{
try
{
retry = false;
Hits myHits = searcher.search(query);
return myHits;
}
catch (BooleanQuery.TooManyClauses e)
{
// Double the number of boolean queries allowed.
// The default is in org.apache.lucene.search.BooleanQuery and is 1024.
String defaultQueries = Integer.toString(BooleanQuery.getMaxClauseCount());
int oldQueries = Integer.parseInt(System.getProperty("org.apache.lucene.maxClauseCount", defaultQueries));
int newQueries = oldQueries * 2;
log.error("Too many hits for query: " + oldQueries + ". Increasing to " + newQueries, e);
System.setProperty("org.apache.lucene.maxClauseCount", Integer.toString(newQueries));
BooleanQuery.setMaxClauseCount(newQueries);
retry = true;
}
}
}
I had this same issue in C# code running with the Sitecore web content management system. I used Randy's answer above, but was not able to use the System get and set property functionality. Instead I retrieved the current count, incremented it, and set it back. Worked great!
catch (BooleanQuery.TooManyClauses e)
{
// Increment the number of boolean queries allowed.
// The default is 1024.
var currMaxClause = BooleanQuery.GetMaxClauseCount();
var newMaxClause = currMaxClause + 1024;
BooleanQuery.SetMaxClauseCount(newMaxClause);
retry = true;
}
Add This Code
#using Lucene.Net.Search;
#BooleanQuery.SetMaxClauseCount(2048);
Just put, BooleanQuery.setMaxClauseCount( Integer.MAX_VALUE );and that's it.