SSIS Excel Source won't let me change DT_DATE to DT_WSTR - sql

I have an SSIS package that imports data from an Excel source into a SQL Server database. The file has a column named [Client Date Of Birth] that is USUALLY a valid date. I say usually because the data is entered by our client's agents and can be anything from 02/24/2017 to Feb 17 or even just 2017. I actually need this data to come in as a string because we do partial comparisons against these dates, so data coming in as Feb 2017 is still technically okay.
The problem is that SSIS automatically determines that this column is of type [DT_DATE]. I try to change the DataType to [DT_WSTR] in the External Columns section of the Advanced Editor for the source, but when I click OK, SSIS automatically switches it back to [DT_DATE]. How do I get SSIS to import this column as a string?
It should also be noted that this is an automated process, so I can't make any changes to the Excel file itself because it will be replaced by a new file each week.

There are several ways to achieve this:
First way
Add a Script Component
Mark [Client Date Of Birth] as an Input Column
Create an Output Column [StrDOB] with data type [DT_STR]
In the script, write the following code (I assumed that you want the date format yyyy-MM-dd HH:mm:ss):
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    If Not Row.ClientDateOfBirth_IsNull Then
        Row.strDOB = Row.ClientDateOfBirth.ToString("yyyy-MM-dd HH:mm:ss")
    Else
        Row.strDOB_IsNull = True
    End If
End Sub
Second way
Use a Data Conversion Transformation component to achieve this. See these MSDN links for help:
https://msdn.microsoft.com/en-us/library/ms141706.aspx
https://msdn.microsoft.com/en-us/library/ms140321.aspx
https://msdn.microsoft.com/en-us/library/ms186792.aspx

Related

ssis filter out rows with values starting in a letter

In an SSIS project I am trying to filter out rows from an Excel file source where a column has values that start with a letter followed by numbers. Some cells contain more than one value, and not all cells follow a single data type format. The data flow is currently as follows:
Excel Source > Data Conversion > OLE DB Destination
I am adding a Conditional Split after the Excel Source, but I am troubled with how to filter out unneeded records. Below are examples of values that should not be included before the end of the data flow:
Row  Value
1    1234
2    P123
3    P1234, P456
4    rec P678
Row 1 is the only one that should flow to the destination. Is there a way to filter out records that contain a 'P' followed by numbers, regardless of how many values are in each cell?
Update: I'm currently working around FINDSTRING(Value,"P",1) > 0 || FINDSTRING(Value,"p",1) > 0. The output blocks rows 2-4 but the Value for row 1 was changed to 0. Does anybody know why this happens?
1st Solution: Script Component with a .Net method to check whether a value is a number.
Script Component
Add a Script Component (type: transformation) where you need to do the check.
Select Input Columns
Add the column that needs to be checked as input column.
Add Output Column
Add a new column to the Output Columns on the Inputs and Outputs tab. The type should be Boolean; give it a suitable name.
The Script
' VB.Net code
' Check whether the string value can be evaluated as a number
Imports System
Imports System.Data
Imports System.Math
Imports Microsoft.SqlServer.Dts.Pipeline.Wrapper
Imports Microsoft.SqlServer.Dts.Runtime.Wrapper

Public Class ScriptMain
    Inherits UserComponent

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
        ' Standard VB.net method, indicating whether
        ' an expression can be evaluated as a number
        If (IsNumeric(Row.InputCol)) Then
            Row.IsNumeric = True
        Else
            Row.IsNumeric = False
        End If
    End Sub
End Class
Create a conditional split to filter Rows
2nd Solution: Derived Column
Add Derived Column
Add a Derived Column where you need to do the check.
Add Expression
Add a new column with the following expression and give it a suitable name: !ISNULL((DT_I8)TextNumbers). All numbers will result in True and all non-numbers will raise an error.
Ignore error
Go to the Configure Error Output window in the Derived column and ignore errors for the new field.
The Result
Add a conditional split to filter rows using ISNULL expression
3rd Solution: Data Conversion
Data Conversion
An alternative to the second solution could be to try to convert the value to an int via a Data Conversion Transformation and also ignore any errors. Then add a conditional split to filter rows using an ISNULL expression.
The condition you want to check is whether a field is numeric or not. Unfortunately, there is no straightforward ISNUMERIC function within SSIS.
Please go through the following link. We can achieve this with the use of derived columns.
Check IsNumeric() with Derived Column Transform in SSIS Package
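The rule that all three solutions share (keep a row only when the whole value converts to a number) can be sketched outside SSIS. This is plain JavaScript, not an SSIS expression; isWholeNumber and sampleRows are hypothetical names, and the pattern assumes integers only, roughly like the (DT_I8) cast, whose exact parsing rules may differ.

```javascript
// Hypothetical helper: true only when the entire cell is an integer literal.
// Assumption: optional sign, digits only, surrounding whitespace tolerated.
function isWholeNumber(value) {
  if (value === null || value === undefined) {
    return false;
  }
  return /^\s*[+-]?\d+\s*$/.test(String(value));
}

// The sample values from the question.
var sampleRows = ["1234", "P123", "P1234, P456", "rec P678"];

// Keep only the rows whose value is entirely numeric.
var kept = sampleRows.filter(isWholeNumber);
console.log(kept);
```

With the four sample values from the question, only 1234 passes the check, which matches the requirement that row 1 alone reaches the destination.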

SSRS export timespans more 24 hours to excel

friends.
I have a field in my report that contains the time string 25:00:00. How can I export this field to Excel and get the column format [h]:mm:ss automatically?
Steps that I tried:
Used the function System.TimeSpan.FromSeconds(90000), but it gave me the result 1.01:00:00, which exports to Excel in General format.
Used an expression for TextBox Properties -> Number in SSRS like this:
=IIF(Globals!RenderFormat.Name="EXCELOPENXML","[h]:mm:ss","HH:mm"). It gave the same result as the previous attempt.
If you have any ideas on how to solve this problem, I'd welcome your suggestions. Thanks.
The FORMAT function removes the actual data type of the column when exporting. Instead, I suggest using the Format property (as in the screenshot below).
Now if you export the report to Excel again, it will use the Format property of that cell.
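Some background on why the cell has to stay numeric: Excel stores a time or duration as a number of days, and a format such as [h]:mm:ss is only applied to such a number, never to text. The sketch below is plain JavaScript, not SSRS expression code, and secondsToExcelDays and formatElapsed are hypothetical names used only to illustrate the arithmetic.

```javascript
var SECONDS_PER_DAY = 24 * 60 * 60;

// Excel represents a duration as a fraction of a day, so 25 hours is 25/24.
function secondsToExcelDays(totalSeconds) {
  return totalSeconds / SECONDS_PER_DAY;
}

// Text form of what the [h]:mm:ss format displays: hours are not wrapped at 24.
function formatElapsed(totalSeconds) {
  var hours = Math.floor(totalSeconds / 3600);
  var minutes = Math.floor((totalSeconds % 3600) / 60);
  var seconds = Math.floor(totalSeconds % 60);
  var pad = function (n) {
    return String(n).padStart(2, "0");
  };
  return hours + ":" + pad(minutes) + ":" + pad(seconds);
}

console.log(secondsToExcelDays(90000)); // the numeric value Excel would hold
console.log(formatElapsed(90000));      // "25:00:00"
```

For the 90000 seconds from the question, the stored value is 25/24 of a day and the displayed text is 25:00:00.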

Google Apps Script on Form Submit Time Formatting Glitch/Fix

Background:
How: I suspect that this is a glitch within Google Form (submission process)/Spreadsheet, but may be part of the Date conversion utility of the Spreadsheet interface (and is an intended feature).
When entering a format in a text box in Google Forms, there is some sort of communication error between the Form submit and Response Spreadsheet, or pre-processing of the Form's data before it is sent to the spreadsheet. The glitch only seems to happen for data in a text field of the format ##:## TEXT where TEXT contains no '.' characters. For example: 4:15 pm will reproduce the glitch, but 4:15 p.m and 4:15 p.m. will not.
Result: An apostrophe character is added to the beginning of the string when it is put into the Spreadsheet (i.e. '4:15 pm) which throws off several sub-systems I have in place that use that time data. Here are two screenshots (sorry for the bad sizing on the second):
I'm 99% certain that the glitch is caused by the ##: combination.
Temporary Fix?: The real question is... how might I go about removing that pesky apostrophe before I start manipulating the time data? I know how to getValue() of a cell/Range. Assume I have the value of a cell in the following manner:
var value = myRange.getValue();
// value = '4:15 pm
How can I go about processing that value into 4:15 pm? A simple java function could be
value = value.substring(1); // Assuming "value" is a String
But in Google App Scripts for Spreadsheets, I don't know how I would do that.
Post-Script: It is necessary to post-process this data so that I don't have to lecture university faculty in the language department about inputting time format correctly in their forms.
Thanks in advance to those who can help!
How can I go about processing that value into 4:15 pm? A simple java function could be
value = value.substring(1); // Assuming "value" is a String
But in Google App Scripts for Spreadsheets, I don't know how I would do that.
Google Apps Script uses JavaScript, which has the exact same method.
value = value.substring(1);
should return all except the first character.
More about Javascript substring at: http://www.w3schools.com/jsref/jsref_substring.asp
If you remove the ' in the spreadsheet cell the spreadsheet interface will convert this entry to a date object.
This might (or not) be an issue for you so maybe you should handle this when you read back your data for another use...
It doesn't happen when text is different (for example with P.M) simply because in this case the ' is not necessary for the spreadsheet to keep it as a string since the spreadsheet can't convert it to a date object (time value).
Artificial intelligence has its bad sides ;-)
edit :
You can do this in an onFormSubmit triggered function using the JavaScript substring() you mentioned. If you're not familiar with that, here is the way to go:
To run a script when a particular action is performed:
Open or create a new Spreadsheet.
Click the Unsaved Spreadsheet dialog box and change the name.
Choose Tools > Script Editor and write the function you want to run.
Choose Resources > Current project's triggers. You see a panel with the message No triggers set up. Click here to add one now.
Click the link.
Under Run, select the function you want executed by the trigger.
Under Events, select From Spreadsheet.
From the next drop-down list, select On open, On edit, or On form submit.
Click Save.
see doc here and here
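A minimal sketch of such a handler, assuming it is installed as an On form submit trigger as described in the steps above. stripLeadingApostrophe is a hypothetical helper name; the handler only reads the submitted row through the event's range and returns cleaned copies, because writing the value back without the apostrophe may let the spreadsheet convert it to a time value, as mentioned earlier.

```javascript
// Hypothetical helper: drop a single leading apostrophe from string values.
// Non-string values (dates, numbers) are returned unchanged.
function stripLeadingApostrophe(value) {
  if (typeof value === "string" && value.charAt(0) === "'") {
    return value.substring(1);
  }
  return value;
}

// Assumed to be bound to the spreadsheet's "On form submit" event.
// e.range is the range holding the newly submitted response row.
function onFormSubmit(e) {
  var row = e.range.getValues()[0];
  // Cleaned copies of the submitted values, for use by later processing.
  return row.map(stripLeadingApostrophe);
}
```

The same helper can be applied to a single cell read with getValue(), for example stripLeadingApostrophe(myRange.getValue()).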

SSIS custom script: loop over columns to concatenate values

I'm trying to create a custom script in SSIS 2008 that will loop over the selected input columns and concatenate them so they can be used to create a SHA1 hash. I'm aware of the available custom components but I'm not able to install them on our system at work.
Whilst the example posted here appears to work fine http://www.sqlservercentral.com/articles/Integration+Services+(SSIS)/69766/ when I tested it with only a few columns selected rather than all of them, I got odd results. The script only seems to work if the selected columns are in sequential order. Even when they are in order, after a certain number of records (or perhaps at the next buffer) different MD5 hashes are generated, even though the rows are exactly the same throughout my test data.
I've tried to adapt the code from the previous link along with these articles but have had no joy thus far.
http://msdn.microsoft.com/en-us/library/ms136020.aspx
http://agilebi.com/jwelch/2007/06/03/xml-transformations-part-2/
As a starting point this works fine to display the column names that I have selected to be used as inputs
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    For Each inputColumn As IDTSInputColumn100 In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
        MsgBox(inputColumn.Name)
    Next
End Sub
Building on this I try to get the values using the code below:
' PropertyInfo requires: Imports System.Reflection
Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)
    Dim column As IDTSInputColumn100
    Dim rowType As Type = Row.GetType()
    Dim columnValue As PropertyInfo
    Dim testString As String = ""
    For Each column In Me.ComponentMetaData.InputCollection(0).InputColumnCollection
        ' Look up the buffer property that has the same name as the column
        columnValue = rowType.GetProperty(column.Name)
        testString += columnValue.GetValue(Row, Nothing).ToString()
    Next
    MsgBox(testString)
End Sub
Unfortunately this does not work and I receive the following error:
I'm sure what I am trying to do is easily achievable, but with my limited knowledge of VB.net, and in particular VB.net in SSIS, I'm struggling. I could define the column names individually as shown here http://timlaqua.com/2012/02/slowly-changing-dimensions-with-md5-hashes-in-ssis/ though I'd like to try out a dynamic method.
Your problem is trying to run ToString() on a NULL value from your database.
Try passing the retrieved value to Convert.ToString() instead of calling .ToString() on it; Convert.ToString() just returns an empty string for a null value.
The input columns are not guaranteed to be in the same order each time. So you'll end up getting a different hash any time the metadata in the dataflow changes. I went through the same pain when writing exactly the same script.
Every answer on the net I've found states to build a custom component to be able to do this. No need. I relied on SSIS to generate the indexes to column names when it builds the base classes each time the script component is opened. The caveat is that any time the metadata of the data flow changes, the indexes may change and need to be updated by re-opening and closing the SSIS script component.
You will need to override ProcessInput() to store a reference to the PipelineBuffer, which isn't exposed in ProcessInputRow, where you actually need it to access the columns by their index rather than by name.
The list of names and associated indexes is stored in ComponentMetaData.InputCollection[0].InputColumnCollection, which needs to be iterated over and sorted to guarantee the same hash every time.
PS. I posted the answer last year but it vanished, probably because it was in C# rather than VB (kind of irrelevant in SSIS). You can find the code with all ugly details here https://gist.github.com/danieljarolim/e89ff5b41b12383c60c7#file-ssis_sha1-cs
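The ordering idea can be illustrated independently of SSIS. The sketch below is plain Node.js JavaScript, not the linked C# code: hashRow is a hypothetical function, the row is assumed to be a plain object keyed by column name, and the "|" separator is an added assumption so that adjacent values cannot run together.

```javascript
var crypto = require("crypto");

// Hypothetical helper: SHA1 over the row's values, taken in sorted column-name
// order so the result does not depend on the order the columns arrive in.
function hashRow(row, separator) {
  var sep = separator === undefined ? "|" : separator;
  var names = Object.keys(row).sort();
  var text = names
    .map(function (name) {
      var value = row[name];
      // Treat null/undefined as an empty string instead of failing.
      return value === null || value === undefined ? "" : String(value);
    })
    .join(sep);
  return crypto.createHash("sha1").update(text, "utf8").digest("hex");
}

// Same values, different column order: both calls yield the same hash.
console.log(hashRow({ b: "2", a: "1" }) === hashRow({ a: "1", b: "2" }));
```

Sorting by name trades a little work per row for stability; caching the sorted name list once per buffer would avoid repeating the sort.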

sql reporting services formats ( datetime, numbers

I set the format of a textbox using an expression like Format(Fields!name.Value, "dd/MM/yyyy").
Can I somehow set a global format for datetime and currency/float values without having to set it for each textbox individually?
Perhaps.
You can specify a report language/culture to use which affects all controls.
However, I don't think you can specify a certain format "dd/MM/yyyy" which is not linked to language.
I didn't find out how to do that, but I do use a macro recorder, so I created some macros for formatting things in my report, such as datetime and cash values. I now format a cell with a hotkey; that's my workaround.
Well, you could also format it in the DB and send it as a string, but it's kind of the same.