Reading a text file with SSIS with CRLF or LF


I'm running into an issue where I receive a text file that uses LF as the EOL, but sometimes the sender uses CRLF instead. Does anyone have any good ideas on how I can make SSIS accept either one as the EOL?
Converting the file to whatever I need is very easy in Notepad++, but that's manual and I want it to be automatic.
Thanks,
EDIT: I fixed it (though not perfectly) by running Swiss File Knife before the data flow.

If the line terminators are always one or the other, I'd suggest setting up 2 File Connection Managers, one with the "CRLF" row delimiter, and the other with the "LF" row delimiter.
Then, create a boolean package variable (something like @IsCrLf) and scope this to your package. Make the first step in your SSIS package a Script Task, in which you read in a file stream, and attempt to discover what the line terminator is (based on what you find in the stream). Set the value of your variable accordingly.
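The detection step could look roughly like the sketch below. It is written in Python only to show the logic; an actual SSIS Script Task would be written in C# or VB.NET, and the function name and sample size here are assumptions, not part of the original answer. It inspects the byte before the first LF in a sample of the file.

```python
def detect_line_terminator(path, sample_size=64 * 1024):
    """Return "CRLF", "LF", or None if the sample contains no line break.

    Hypothetical helper: only the first line break in the sample is checked,
    so it assumes the file uses one terminator consistently.
    """
    with open(path, "rb") as f:  # binary mode, so Python does not translate newlines
        sample = f.read(sample_size)
    pos = sample.find(b"\n")
    if pos == -1:
        return None
    if pos > 0 and sample[pos - 1:pos] == b"\r":
        return "CRLF"
    return "LF"
```

The result would then be mapped onto the boolean variable, for example true when the function reports "CRLF".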
Then, after the Script Task in your Control Flow, create 2 separate Data Flows (one for each File Connection Manager) and use a Precedence Constraint set to "Expression and Constraint" on the connectors to specify which Data Flow to use, depending on the value of the @IsCrLf variable.
Example of the suggested Control Flow below.

How about a derived column with the REPLACE operation after your file source to change the CRLFs to LFs?

I second the OP's vote for Swiss File Knife.
To integrate that, I had to add an Execute Process Task:
However, I have a bunch of packages that run For-Each-File loops, so I needed some BIML - maybe this'll help the next soul.
<ExecuteProcess Name="(EXE) Convert crlf for <#= tableName #>"
    Executable="<#= myExeFolder #>sfk.exe">
    <Expressions>
        <Expression PropertyName="Arguments">
            "crlf-to-lf " + @[User::sFullFilePath]
        </Expression>
    </Expressions>
</ExecuteProcess>
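If installing an external tool is not an option, the same normalization can be sketched in a few lines of script. The Python below is a minimal illustration, not a drop-in replacement for sfk; the function name and chunk size are assumptions. It copies a file and rewrites every CRLF as LF, holding back a trailing CR so that a pair split across two chunks is still converted.

```python
def crlf_to_lf(src_path, dst_path, chunk_size=1024 * 1024):
    """Copy src_path to dst_path, replacing every CRLF pair with a single LF."""
    carry = b""  # a CR held back from the end of the previous chunk
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                dst.write(carry)  # a lone CR at end of file is kept as-is
                break
            data = carry + chunk
            if data.endswith(b"\r"):
                # The matching LF may be the first byte of the next chunk.
                data, carry = data[:-1], b"\r"
            else:
                carry = b""
            dst.write(data.replace(b"\r\n", b"\n"))
```

A file that already uses LF passes through unchanged, so the step can run on every incoming file regardless of how it was sent.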

Related

SSIS is doubling up backslashes

I am loading some file names and locations into SSIS variables, then using a Foreach Loop to run an Execute Process Task.
After a few unsuccessful attempts I realized SSIS is doubling all the backslashes in the values I load into my variables, so the network paths don't work.
Can we stop this behavior?
What I load:
"\\BBBB001\shared\GGGG\PiMSSSRSReportsPath\THM022\HHHH-NextWorkingDay-at1530.pdf"
What I get:
"\\\\BBBB001\\shared\\GGGG\\PiMSSSRSReportsPath\\THM022\\HHHH-NextWorkingDay-at1530.pdf"
SSIS Execute Process task:
As you can see, Foxit Reader doesn't recognize the latter filename with doubled backslashes. If I manually enter the first value, it works.

For future reference, I found a workaround:
Instead of adding variables in the Arguments section, I created a single variable containing all the parameters for the file to be printed, something like this:
/t "FileLocation\FileName.pdf" PrinterName
Then, in the Expressions section of the Execute Process Task, I added the Arguments property and set it to that final variable, like this:

LDIF idempotent apply tool

Before I write one, are there any tools for applying LDIF files idempotently?
If the change type is not specified, add or replace the entry (aka UPSERT), removing any attributes not mentioned in the LDIF record.
If the change type is specified, process the record like normal ldapmodify.
I saw someone suggesting ldapmodify -c, but this is meh :) I want to catch all errors.

I'm writing a new tool, something like this: https://github.com/ip1981/ldapply

.bat file to merge .txt into one file, including filename but separated by a ';'

I'm having trouble with a fairly specific problem.
The monitoring software (used for robots in the manufacturing halls) at the company I work for generates a log file (.sdat) every 15 minutes. The content of a log file looks like this:
The syntax: Time;machine;status
13:53:23;KP85;ms:9999
13:53:49;KP85;ms:3
13:54:54;KP85;ms:4
14:06:04;KP85;ms:9999
13:51:38;Robot1;ms:9999
etc...
I've managed to concatenate all the log files into one big file, including the filename at the start of each new row, like this:
The syntax: Filename:Time;Machine;Status
01-03-2016-00-20.sdat:0:07:40;KP65;ms:3
01-03-2016-00-20.sdat:0:09:09;KP65;ms:4
01-03-2016-00-20.sdat:0:09:11;KP65;ms:3
01-03-2016-00-20.sdat:0:09:13;KP65;ms:4
etc..
The reason I did this is that I need both the time and the date (which is included in the filename) at which a certain status for a machine was logged. However, if I import this into SQL Server Management Studio, it recognizes 3 columns instead of 4, because the filename and the first column (time) are separated by a ':' instead of a ';'. I tried solving it with an SQL query, separating the date and time with LEFT() and RIGHT(), but the time field changes length when the time switches from 9:59:59 to 10:00:00, adding an extra character (so the data in the column would look like ':9:59:59', which isn't a valid time). Perhaps it could be done in SQL, but that seems like too much complexity for such a small problem.
So at this point, I thought it would be better to tackle this problem early on, within the batch file that generates the large file, so that Management Studio recognizes 4 columns instead of 3. This is what my .bat file looks like at the moment:
@echo off
findstr "^" *.sdat >output.txt
What do I have to do to get this right?
Thanks in advance,
Mike Sohns

@echo off
(for %%f in (*.sdat) do (
    for /f "delims=" %%a in (%%f) do @echo %%f;%%a
))>output.txt
NOTE: this will swallow empty lines, but I think in your case that's not a problem. If yes, the code can be adapted.

Given that all the log files you are concatenating have the sdat extension, you have to replace sdat: with sdat; in order to get a CSV separated by ';'.
To achieve this, you can use the batch script from this other answer (How to replace substrings in windows batch file), which replaces substrings in a text file.
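For comparison, the same substitution can be sketched outside of batch. The Python below is only an illustration of the idea, with hypothetical function names; it replaces just the first ".sdat:" on each line, so the colons inside the time and status fields are left alone.

```python
def fix_separator(line, marker=".sdat:"):
    """Turn the first ".sdat:" in a line into ".sdat;" and leave other colons untouched."""
    return line.replace(marker, marker[:-1] + ";", 1)


def fix_file(src_path, dst_path):
    """Apply fix_separator to every line of src_path, writing the result to dst_path."""
    # newline="" keeps the original line endings exactly as they are in the source.
    with open(src_path, "r", newline="") as src, open(dst_path, "w", newline="") as dst:
        for line in src:
            dst.write(fix_separator(line))
```

Applied to the sample data, 01-03-2016-00-20.sdat:0:07:40;KP65;ms:3 becomes 01-03-2016-00-20.sdat;0:07:40;KP65;ms:3, which gives four ';'-separated columns.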

How to configure liquibase not to include file path or name for calculating checksum?

I found that Liquibase uses the full path of the changelog file to calculate the checksum.
This prevents renaming changelog files: once a file is renamed, Liquibase tries to reapply its changesets.
Is there a way to configure Liquibase to use only the changelog id to calculate the checksum?
Please provide your valuable thoughts.

Use the attribute logicalFilePath of the databaseChangeLog tag.

Upstream developers recommend using logicalFilePath and suggest updating the DATABASECHANGELOG.FILENAME column directly to fix broken entries that contain full paths:
https://forum.liquibase.org/t/why-does-the-change-log-contain-the-file-name/481
If you set DATABASECHANGELOG.MD5SUM to null, Liquibase recalculates the hashes on its next run. This is necessary because the hash algorithm also includes those moving parts in its result.
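A direct update of that kind might look like the sketch below. It uses Python's built-in sqlite3 module purely as a stand-in so the statement can be tried safely; the function name and the prefix argument are assumptions, and a real database would need its own driver, possibly a slightly different SQL dialect, and a backup of the table first. It strips a leading path prefix from FILENAME and clears MD5SUM on the rows it changes.

```python
import sqlite3


def relativize_changelog(conn, prefix):
    """Strip `prefix` from matching FILENAME values and reset their MD5SUM to NULL.

    Hypothetical helper; returns the number of rows updated.
    """
    cur = conn.execute(
        "UPDATE DATABASECHANGELOG "
        "SET FILENAME = substr(FILENAME, length(?) + 1), MD5SUM = NULL "
        "WHERE substr(FILENAME, 1, length(?)) = ?",
        (prefix, prefix, prefix),
    )
    conn.commit()
    return cur.rowcount
```

Rows whose FILENAME does not start with the prefix are left untouched, so their checksums are not recalculated.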

One really similar issue: you may just want to ignore the portion of the path before the changelog-master.xml file. In my scenario, I've checked out a project in C:\DEV\workspace and my colleague has the project checked out in C:\another_folder\TheWorkspace.
I'd recommend reading through http://forum.liquibase.org/topic/changeset-uniqueness-causing-issues-with-branched-releases-overlapped-changes-not-allowed-in-different-files first.
Like others have suggested, you'll want the logicalFilePath property set on the <databaseChangeLog> element.
You'll also need to specify the changeLogFile property in a certain way when calling liquibase. I'm calling it from the command line. If you specify an absolute or relative path to the changeLogFile without the classpath, like this, it will include the whole path in the DATABASECHANGELOG table:
liquibase.bat ^
--changeLogFile=C:\DEV\more\folders\schema\changelog-master.xml ^
...
then liquibase will break if you move your migrations to any folder other than that one listed above. To fix it (and ensure that other developers can use whatever workspace location they want), you need to reference the changelogFile from the classpath:
liquibase.bat ^
--classpath=C:\DEV\more\folders ^
--changeLogFile=schema/changelog-master.xml ^
...
The first way, my DATABASECHANGELOG table had FILENAME values (I might have the slash backwards) like
C:\DEV\more\folders\schema\subfolder\script.sql
The second way, my DATABASECHANGELOG table has FILENAME values like
subfolder/script.sql
I'm content to go with filenames like that. Each developer can run liquibase from whatever folder they want. If we decide we want to rename or move an individual SQL file later on, then we can specify the old value in the logicalFilePath property of the <changeSet> element.
For reference, my changelog-master.xml just consists of elements like
<include file="subfolder/script.sql" relativeToChangelogFile="true"/>

I faced the same problem and found the solution below.
If you are using the Liquibase SQL format, simply put the following in your SQL file:
--liquibase formatted sql logicalFilePath:<relative SQL file path like(liquibase/changes.sql)>
If you are using the Liquibase XML format, simply put the following in your XML file:
<databaseChangeLog logicalFilePath="relative XML file path like(liquibase/changes.xml)" ...>
...
</databaseChangeLog>
After adding the logicalFilePath attribute, run the liquibase update command.
Liquibase will then store whatever relative path you put in logicalFilePath in the FILENAME column of the DATABASECHANGELOG table.

stata odbc sqlfile

I am trying to load data from a database (either MS Access or SQL Server) using odbc sqlfile. The code seems to run without any error, but I am not getting any data. I am using the following code: odbc sqlfile("sqlcode.sql"), dsn("mysqlodbcdata"). Note that sqlcode.sql contains just a SQL SELECT statement. The same SQL code does return data with odbc load, exec(sqlstmt) dsn("mysqlodbcdata"). Can anyone suggest how I can use odbc sqlfile to import data? This would be a great help to me.
Thanks
Joy

sqlfile doesn't load any data. It just executes (and displays the results when the loud option is specified), without loading any data into Stata. That's somewhat counter-intuitive, but true. The reasons are somewhat opaquely explained in the pdf/dead tree manual entry for the odbc command.
Here's a more helpful answer. Suppose you have your SQL file named sqlcode.sql. You can open it in Stata (as long as it's not too long, where too long depends on your flavor of Stata). Basically, -file read- reads the SQL code line by line, storing the results in a local macro named exec. Then you pass that macro as an argument to the -odbc load- command:
Updated Code To Deal With Some Double Quotes Issues
Cut & paste the following code into a file called loadsql.ado, which you should put in a directory where Stata can see it (like ~/ado/personal). You can find such directories with the -adopath- command.
program define loadsql
*! Load the output of an SQL file into Stata, version 1.3 (dvmaster@gmail.com)
version 14.1
syntax using/, DSN(string) [User(string) Password(string) CLEAR NOQuote LOWercase SQLshow ALLSTRing DATESTRing]
#delimit;
tempname mysqlfile exec line;
file open `mysqlfile' using `"`using'"', read text;
file read `mysqlfile' `line';
while r(eof)==0 {;
local `exec' `"``exec'' ``line''"';
file read `mysqlfile' `line';
};
file close `mysqlfile';
odbc load, exec(`"``exec''"') dsn(`"`dsn'"') user(`"`user'"') password(`"`password'"') `clear' `noquote' `lowercase' `sqlshow' `allstring' `datestring';
end;
/* All done! */
The syntax in Stata is
loadsql using "./sqlfile.sql", dsn("mysqlodbcdata")
You can also add all the other odbc load options, such as clear, as well. Obviously, you will need to change the file path and the odbc parameters to reflect your setup. This code should do the same thing as -odbc sqlfile("sqlfile.sql"), dsn("mysqlodbcdata")- plus actually load the data.
I also added the functionality to specify your DB credentials like this:
loadsql using "./sqlfile.sql", dsn("mysqlodbcdata") user("user_name") password("not12345")

For "--XYZ" style comments, do something like this (assuming you don't have "--" in your SQL code):
if strpos(`"``line''"', "--") > 0 {;
local `line' = substr(`"``line''"', 1, strpos(`"``line''"', "--")-1);
};
I had to post this as an answer otherwise the formatting would've been all messed up, but it's obviously referring to Dimitriy's code.
(You could also define a local macro holding the position of the "--" string to make your code a little cleaner.)
