SSIS external metadata column needs to be removed - sql-server-2005

I am creating a SELECT statement on the fly because the column names and table name can change, but the results all need to go into the same data destination. There are other commonalities that make this viable; if I need to later, I will go into them. So, what it comes down to is this: I am creating the SELECT statement with 16 columns. There will always be sixteen columns, no more, no less, but the column names can change and the table name can change. When I execute the package, the SELECT statement gets built just fine, but when the Data Flow tries to execute, I get the following error:
The "external metadata column "ColumnName" (79)" needs to be removed from the external metadata column collection.
The actual SQL Statement being generated is:
select 0 as ColumnName, Column88 as CN1, 0 as CN2, 0 as CN3, 0 as CN4,
0 as CN5, 0 as CN6, 0 as CN7, 0 as CN8, 0 as CN9, 0 as CN10,
0 as CN11, 0 as CN12, 0 as CN13, 0 as CN14, 0 as CN15 from Table3
The column 'Column88' is generated dynamically, and so is the table name. If source columns exist for the other 'as CNx' columns, they will appear the same way (Column88 as CN1, Column89 as CN2, Column90 as CN3, etc.), and the table name will always be in the form Tablex, where x is an integer.
Could anyone please help me out with what is wrong and how to fix it?

You're in kind of deep here. You should just take it as read that you can't change the apparent column names or types. The names and types of the input columns become the names and types of the metadata flowing down from the source. If you change those, then everything that depends on them must fail.
The solution is to arrange for these to be stable, perhaps by using column aliases and casts. For one table:
SELECT COLNV, COLINT FROM TABLE1
for another
SELECT CAST(COLV AS NVARCHAR(50)) AS COLNV, CAST(COLSMALL AS INTEGER) AS COLINT FROM TABLE2
Give that a try and see if it works out for you. You just really can't change the metadata without fixing up the entire remainder of the package.
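Applied to the dynamic statement in the question, that could look something like the sketch below. The column and table names come from the question; the choice of NVARCHAR(50) and INT as the stable types is an assumption you would adjust to your data:

-- Sketch: alias AND cast every generated column so that the names and
-- types the data flow sees never change, whichever table is the source.
SELECT CAST(0 AS INT) AS ColumnName,
       CAST(Column88 AS NVARCHAR(50)) AS CN1,
       CAST(0 AS INT) AS CN2,
       -- ... CN3 through CN14 built the same way ...
       CAST(0 AS INT) AS CN15
FROM Table3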

I had the same issue when I had to remove a column from my stored procedure (which outputs to a temp table) in SQL and add two columns. To resolve it, I had to go through each part of my SSIS package from the beginning (the source - in my case, a pull from a temporary table) all the way through to the destination (in my case a Flat File Connection to a CSV file). I had to re-do all the mappings along the way, and I watched for errors that came up in the GUI data flow tasks in SSIS.
The error came up for me as a red X with a circle around it; hovering over it mentioned the metadata issue. I double-clicked it, and it warned me that one of my columns no longer existed and asked whether I wanted to delete it. I did delete it, but I can tell you that this error has more to do with SSIS telling you that your mappings are off, and you need to go through each part of your SSIS package to make sure everything is mapped correctly.

How about using a view in front of the table, and using the view as the SSIS source? That way you can map the columns as necessary, and use the ISNULL or COALESCE functions to keep a consistent column pattern.
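For instance, a sketch of such a view - the column names are borrowed from the question, and the 0 defaults are an assumption:

-- Hypothetical view that presents a stable column list to SSIS,
-- whatever physical columns exist underneath.
CREATE VIEW dbo.SsisSource AS
SELECT ISNULL(Column88, 0) AS CN1,
       ISNULL(Column89, 0) AS CN2,
       ISNULL(Column90, 0) AS CN3
FROM dbo.Table3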

Related

How to create a table from another table in Matlab (version r2021b)?

I have a .mat file called 'settings' which contains a variable S and a variable T, which is a table. From this T, which is a table nested inside the .mat file, I would like to create another table, e.g. T1, that includes only certain variables from the original table T, by adding or removing variables.
What I did is the following:
Settings = load('settings_20211221.mat'); % load the data file and its contents, because the table T where the data are is nested inside the settings .mat file
S = Settings.S;
T = Settings.T;
I can see that Matlab has accepted that T is a table, because size(T) and head(T) work. However, it is proving very hard to go on and create my own table from it.
1)
T1 = readtable('T')
Error using readtable (line 498)
Unable to find or open 'T'. Check the path and filename or file permissions.
Question 1: I do not understand why I cannot read table T, unless it has to do with the fact that it is nested and I am missing something? My impression was that I had specified the table and could therefore apply the readtable function to it.
After this error, I decided to simply create a duplicate of the T1 table, called 'Table', in case for some reason I had no permission to manipulate the original T. I want to remove lots of variables from the table, and I figured the easiest thing to do would be to specify the ranges of variables corresponding to the columns I want to remove.
Removing variables from the newly created table 'Table'.
T1 = removevars(T1, (2:8)) % specifying one range between 2 and 8 works
Table = removevars(Table, [24 25 26]) % using a numeric array to indicate the individual positions of the variables I want to remove works
Then I wanted to specify all the ranges in one go, either using () or [], to be more efficient, and did the following:
Table = removevars(Table, (25:28), (30-62))
Table = removevars(Table,[25:28], [30-62])
I always got the following error 'Error using tabular/removevars. Too many input arguments.'
Question 2: How can I specify multiple ranges of numbers corresponding to the table columns/variables I want to remove?
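For what it's worth, removevars accepts a single index argument, so several ranges can be concatenated into one numeric vector with square brackets. A sketch using the positions from the question, assuming the second range was meant to be 30:62 rather than 30-62:

% One call, one index vector: concatenate the ranges inside [ ].
Table = removevars(Table, [25:28, 30:62]);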
Alternatively, I thought I could specify the variables I want to remove using strings, but I got the error message below even though both the 'flip_angle' and 'SubID' columns did exist in my table.
Table = removevars(Table,{'flip_angle', 'SubID'})
Error using tabular/subsasgnParens (line 230)
Unrecognized table variable name 'flip_angle
Sometimes I tried to specify multiple strings corresponding to the names of the variables I wanted to remove (e.g., 20 strings), and then Matlab would return an error for 'too many input arguments'.
Question 3: How are variables removed using strings?
Question 4: Is there a more efficient way to create a new table from the original T file by indexing the variables I want to include in some other way?
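As one sketch of such an alternative: a table can be subscripted with parentheses and a list of variable names, which copies just those variables into a new table (the variable names here are only examples):

% Keep only the named variables; everything else is left out.
T1 = T(:, {'SubID', 'flip_angle'});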
I want to understand why I get these error codes, so any help would be much appreciated!
Thank you!

BigQuery - remove unused column from schema

I accidentally added a wrong column to my BigQuery table schema.
Instead of reloading the complete table (million of rows), I would like to know if the following is possible:
remove the bad rows (rows with values in the wrong column) by running a "select *" query on the table with some kind of filter, and saving the result to the same table.
removing the (now) unused column.
Is this functionality (or similar) supported?
Possibly the "save result to table" functionality can have a "compact schema" option.
The simplest and most time-saving way to remove a column from BigQuery, according to the documentation:
ALTER TABLE [table_name] DROP COLUMN IF EXISTS [column_name]
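For example, with the dataset, table, and column names used further down in this thread (hypothetical names):

ALTER TABLE transactions.test_table DROP COLUMN IF EXISTS c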
If your table does not consist of record/repeated type fields, your simple option is:
1. Select the valid columns, filtering out the bad records, into a new temp table:
SELECT < list of original columns >
FROM YourTable
WHERE < filter to remove bad entries here >
Write the result to a temp table - YourTable_Temp.
2. Make a backup copy of the "broken" table - YourTable_Backup.
3. Delete YourTable.
4. Copy YourTable_Temp to YourTable.
5. Check that everything looks as expected and, if so, get rid of the temp and backup tables.
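A minimal concrete sketch of step 1 in today's standard SQL - the dataset, column, and filter names are hypothetical:

-- Hypothetical: keep the valid columns, drop the bad records.
CREATE TABLE mydataset.YourTable_Temp AS
SELECT good_col1, good_col2, good_col3
FROM mydataset.YourTable
WHERE bad_col IS NULL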
Please note: the cost of step 1 above is exactly the same as the action in the first bullet of your question; the rest of the actions (copies) are free.
If you have repeated/record fields, you can still execute the above plan, but in step 1 you will need some BigQuery user-defined functions to produce the proper schema in the output.
See the examples below - of course this will require some extra dev work, but if you are in a critical situation, this should work for you:
Create a table with Record type column
create a table with a column type RECORD
I hope at some point the Google BigQuery team will add better support for cases like yours, where you need to manipulate and output repeated/record data, but for now this is the best workaround I have found - at least for myself.
Below is the code to do it. Let's say c is the column that you want to delete.
CREATE OR REPLACE TABLE transactions.test_table AS
SELECT * EXCEPT (c) FROM transactions.test_table;
A second method, and my favorite, is to follow the steps below.
1. Write a SELECT query that excludes the columns you do not want.
2. Go to Query Settings.
3. Under the destination settings, set a destination table for the query results: enter the project name, dataset name, and table name exactly as you entered them in step 1.
4. For the destination table write preference, select "Overwrite table".
5. Save the query settings and run the query.
"Save results to table" is your way to go. Try it on the big table with just the columns you are interested in, and you can apply a LIMIT to keep it small.

SSIS - Only Load Certain Records - Skip the remaining

I have a flat file that I'm loading into SQL, and that flat file has 2 different RecordTypes and 2 different file layouts based on the RecordType.
So I may have
000010203201501011 (RecordType 1)
00002XXYYABCDEFGH2 (RecordType 2)
So I want to immediately check for records of RecordType1 and then send those records through [Derived Column] & [Data Conversion] & [Loading to SQL],
and I want to ignore all records of RecordType2.
I tried a Conditional Split, but it seems like the records of RecordType2 still try to go through the [Derived Column] & [Data Conversion] steps,
and it gives me a data conversion error on the RecordType2 records.
I have the Conditional Split set up as RecordType == 1 to go through the process I have set up.
I guess the Conditional Split isn't meant to be used this way?
Where in my process can I tell it to check for RecordType1 and only send records past that point that are RecordType = 1?
It makes perfect sense that you are getting data type errors for Record Type 2 rows, since you have probably defined the columns and their data types based on Record Type 1 records. I see three options to achieve what you want to do:
1. Have a Script Task in the control flow copy only Record Type 1 records to a fresh file that is then used by the data flow you already have (Pro: you do not need to touch the data flow; Con: the file is read twice), OR
2. In the existing data flow: instead of getting all the columns from the data source, read every line coming from the file as one big fat column, then use a Derived Column to get RecordType, then a Conditional Split, then another Derived Column to re-create all the columns you had defined in the data source, OR
3. Ideal if you have another package processing Record Type 2 rows: dump the file into a database table in a staging area, then replace the data source in your data flow with an OLE DB Source (or whatever you use) and obtain and filter the records with something like: SELECT substring(rowdata,1,5) AS RecordType, substring(rowdata,6,...) AS Column2, .... FROM STG.FileData WHERE substring(rowdata,1,5) = '00001' (a concrete sketch follows this list). If using this approach, it would be better to have a dedicated column for RecordType.
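A hypothetical concrete version of the option 3 query - the field positions and lengths here are assumptions, not taken from your real layout:

SELECT SUBSTRING(rowdata, 1, 5) AS RecordType,   -- '00001' or '00002'
       SUBSTRING(rowdata, 6, 4) AS Column2,      -- assumed 4-character field
       SUBSTRING(rowdata, 10, 9) AS Column3      -- assumed 9-character field
FROM STG.FileData
WHERE SUBSTRING(rowdata, 1, 5) = '00001'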

Find or Strip Invalid characters from Database

We are using a database where the front end software has allowed the input of invalid characters. (I have no control or re-writing of the software.)
The types of characters are carriage returns, line breaks, �, ¶ - basically anything that is not 0-9, a-z, or standard punctuation - and they cause us issues with the database and with how we use the data.
I'm looking for a way to scan the entire database to identify these invalid characters and either display them as results or strip them out.
I had been looking at this site, wondering if there was a way of searching for a certain character range, but I might be barking up the wrong tree.
I'm fairly new to SQL so be gentle with me, thanks.
The only way I can think to do this would be to write a stored procedure that uses the system tables to get a list of all fields in the database/schema in question. Have it exclude system tables (or only include those that are user-defined), then dynamically write out SQL UPDATE statements based on the columns/tables found in the system-table queries, using regular expressions or character removal like in this article.
The system tables in question are:
SELECT
table_name,column_name
FROM
information_schema.columns
Pseudocode would be:
Get list of tables we want to do this for
For each table in list
    get list of columns for the table that have string data
    For each column in table
        generate update statement to strip unwanted characters
        -- Consider writing out table, column key, and before/after values to a history table, in case this has to be undone
        -- Consider a counter so I have an idea of what was updated
        execute update statement
    next column
next table
write out counter
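A minimal T-SQL sketch of that pseudocode, assuming a hypothetical scalar function dbo.StripInvalidChars that you would write to keep only the allowed characters:

-- Build one UPDATE per string column across the whole schema.
-- dbo.StripInvalidChars is a hypothetical user-defined function, not built-in.
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'';
SELECT @sql = @sql
    + N'UPDATE ' + QUOTENAME(table_schema) + N'.' + QUOTENAME(table_name)
    + N' SET ' + QUOTENAME(column_name)
    + N' = dbo.StripInvalidChars(' + QUOTENAME(column_name) + N'); '
FROM information_schema.columns
WHERE data_type IN ('char', 'varchar', 'nchar', 'nvarchar');
EXEC sp_executesql @sql;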
Since you say
"the data then moves to a second program that cannot handle these characters and this causes the process to fail."
I'm wondering if you can leave the unreadable data where it is and create a new column for changed data that's only populated if/when the 2nd process fails. You'll still have to test every character of the data in the failed cell, but you wouldn't have to test every character of every row. After you determine the updated text to process, you can call the 2nd process again with the updated value.
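A sketch of that idea, with hypothetical table, column, and logging names - the cleaned column is only populated for rows the second program rejected:

-- Hypothetical: add a column for cleaned data, fill it only on failure.
ALTER TABLE dbo.SourceData ADD CleanValue NVARCHAR(MAX) NULL;

UPDATE s
SET s.CleanValue = dbo.StripInvalidChars(s.RawValue)  -- same hypothetical function as above
FROM dbo.SourceData AS s
JOIN dbo.FailedRows AS f ON f.Id = s.Id;              -- hypothetical log of failed rows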

SSIS Union All - is there a bug in SSIS for this item?

I am using SQL 2005 and SSIS
I have 2 data sources.
One comes from table A and one from table B. I want to move data from table A to table B, but first I get the MAX date from both and compare them. If they are the same, then I must either stop the SSIS package or use a Conditional Split.
But when the MAX date from table B goes through the Union All, it becomes blank!
Any idea why?
Union All transforms do not change the data that comes into them. Check carefully and make sure that the output column for "maxdate" has both "maxdate" columns coming into it. Also check the data types for both.
In fact, I suggest you delete the row with "maxdate" and then add it again, making sure it is correctly set from both inputs.