Strange character when importing '.csv' file in SSIS

I'm trying to use SSIS to import a '.csv' file into SQL Server. The import itself works, but each imported field ends up with the character � appended.
I've been trying all morning to fix this in SSIS without any luck. What I have just noticed is that when I open the '.csv' file and go to Save As, the file type shows up as Unicode Text rather than an actual CSV. If I save it as a CSV and run that through, all the fields come through fine without the � character.
So I have a fix of sorts, but it requires manually opening and re-saving the files, which I can't have because the process needs to run automatically. I thought of converting the file automatically with a C# script task, but I don't know how to do that. Is anybody able to assist? Or is there a better way that I don't know of?
Thank you.

You can use a simple PowerShell script to change the encoding:
# Write a UTF-8 copy of every .csv file in the current folder, prefixed with UTF8_
foreach ($file in Get-ChildItem *.csv) {
    Get-Content $file.Name | Set-Content -Encoding utf8 "UTF8_$($file.Name)"
}
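The same conversion can be sketched in Python. This is a minimal sketch, assuming the source files are UTF-16 with a byte-order mark (which is what Excel's "Unicode Text" format normally produces). The function names `convert_to_utf8` and `convert_folder` are illustrative, and the `UTF8_` prefix simply mirrors the script above.

```python
from pathlib import Path


def convert_to_utf8(src: Path, dst: Path, src_encoding: str = "utf-16") -> None:
    """Re-encode a text file as UTF-8.

    Assumption: the source is UTF-16 with a byte-order mark, so Python's
    "utf-16" codec can detect the byte order by itself.
    """
    # newline="" keeps the original line endings (e.g. CR LF) untouched.
    with open(src, "r", encoding=src_encoding, newline="") as fin:
        text = fin.read()
    with open(dst, "w", encoding="utf-8", newline="") as fout:
        fout.write(text)


def convert_folder(folder: Path) -> list[Path]:
    """Write a UTF8_-prefixed copy of every .csv file in the folder."""
    written = []
    for src in sorted(folder.glob("*.csv")):
        if src.name.startswith("UTF8_"):
            continue  # skip files produced by an earlier run
        dst = src.with_name(f"UTF8_{src.name}")
        convert_to_utf8(src, dst)
        written.append(dst)
    return written
```

Calling `convert_folder(Path("."))` before the SSIS package runs would leave the originals in place and produce the converted copies next to them.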

Related

How to extract .sql file that seems to be a .zip

I have received a file from a customer. The file is said to be
SQL code (application/sql)
However, this has turned out to be wrong: nothing could open it. It turns out it was secretly a .zip file. By renaming it to '.zip' and manually extracting it I was able to get the files contained in it. I would like to do a similar process in Python.
So far I've renamed the file:
file_name_zip = file_name.replace('.sql', '.zip')
os.rename(file_name, file_name_zip)
And I've tried extracting it:
zip_ref = zipfile.ZipFile(file_name_zip, 'r')
zip_ref.extractall(extracted_file)
However, this failed because
zipfile.BadZipFile: File is not a zip file
I've googled, and apparently this can sometimes be fixed using:
file_name_zip_2 = file_name_zip.replace('.zip', '2.zip')
os.system(f'zip -FF {file_name_zip} --out {file_name_zip_2}')
This required me to put in a bunch of settings, which I wasn't able to figure out. There must be a better way to go about this.
Does anybody know how to parse such an .sql file?
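Python's `zipfile` module does not look at the file extension, so the rename is not required, and a `BadZipFile` error suggests the file may not be a ZIP archive that Python recognizes. One way to investigate is to look at the first bytes of the file. The sketch below is only a starting point: the helper names `sniff_archive_type` and `extract_if_zip` are hypothetical, and the signature table covers just a few common formats.

```python
import zipfile

# Leading "magic" bytes of a few common archive formats.
SIGNATURES = {
    b"PK\x03\x04": "zip",
    b"\x1f\x8b": "gzip",
    b"BZh": "bzip2",
    b"7z\xbc\xaf\x27\x1c": "7z",
    b"\xfd7zXZ\x00": "xz",
}


def sniff_archive_type(path):
    """Return a format name guessed from the file's first bytes, or None."""
    with open(path, "rb") as f:
        head = f.read(8)
    for magic, name in SIGNATURES.items():
        if head.startswith(magic):
            return name
    return None


def extract_if_zip(path, target_dir):
    """Extract the archive if it is a ZIP; the file extension is irrelevant."""
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as archive:
        archive.extractall(target_dir)
    return True
```

If `sniff_archive_type` reports something other than `"zip"`, the matching standard-library module (for example `gzip` or `lzma`) is the more likely tool than `zip -FF`.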

Import Data From CSV Using Control File Is failing

I am importing a CSV file into a HANA server with hdbsql, using a control file that contains IMPORT FROM CSV statements. Importing the file through HANA Studio works fine, but when I run the import through hdbsql with the control file as input, it fails without any apparent reason and without an error message.
My CSV file uses {CR}{LF} as the record delimiter, so I specify '\r\n' as the record separator. The file is UTF-16LE encoded.

Just to add to @LarsBr.'s comment: you also need to be careful about where you load the file from.
It needs to be in a specific directory, or you will need to adjust the configuration to allow a different one.
Here is a tutorial I wrote that explains this: https://developers.sap.com/tutorials/mlb-hxe-import-data-sql-import.html
There is also an "ERROR LOG" option, documented here: https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/latest/en-US/20f712e175191014907393741fadcb97.html

Changing a string in an .exe file

I would like to know how to change a string in an .exe file. There are 8 files that all have the same content but are used in different paths, and these paths (folders) are named 1-8. I have to change the string "word class 1" to "word class 2" through "word class 8" to match each folder. I have done this manually with Notepad++ for a week now, but it's time-consuming and I don't want to keep doing it that way. :)
I don't mind what kind of solution solves this problem.
So far I have tried PowerShell with Get-Content and Select-String, but it didn't work out as intended.
Thank you for reading and answering my question. (Sorry for any typos.)

You just want to replace some values within a .exe?
This is how I'd do it.
You need to provide a CSV file, first column titled OLD, second column titled NEW.
Here is my fake .exe file I made:
7deeadc7-a2b3-4c47-8cf6-61f09d986977ham
d1ea8982-4a04-4f2b-8e5a-244965921fccsam
b4a8f37a-c607-405b-8493-9b9b0e79673btam
0922496b-3064-4958-a6b0-46f61a711860turkey
e5f30554-e50e-4b61-aaa3-3797d9e0ed5ccheese
82e3d77f-53d5-49ef-bf84-b872dbbe556ffork
60a01cad-f6c4-44cc-af1a-fafb20377a12rice
e2cd71a1-7c34-456f-9af4-924f79874c38yummy
c85da055-c47e-41be-a0f8-5c320fa05317linux
7dbee5fc-87d5-4900-80c5-00818514d5b4morp
d9941dfe-dd97-422d-9088-2cecf4904fdepoo
05eaf9b3-09a2-45ea-b9a0-4c78ff9156f2pot
8c75d00d-4157-45b9-86df-74226790674fpoe
f0e77eb5-35fa-47f5-b89e-d1b5ef3c726fpoh
1d1ffc02-fee0-446d-aeac-940ab2864a76pof
Just a bunch of GUIDs with a word at the end. Now, here's my sample .csv file, with all of the replacements we want to make.
OLD,NEW
ham,pork
sam,Frodo
linux,Window
morp,porp
poo,restroom
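Finally, here is the code to do this as a PowerShell function.
Function Refresh-File {
    param($inputCSV, $inputfile)
    # Get-Content reads the file as lines of text, so this suits text-like files
    $file = Get-Content $inputfile
    foreach ($replacement in (Import-Csv $inputCSV)) {
        # -replace takes a regex, so escape the OLD value to match it literally
        $file = $file -replace [regex]::Escape($replacement.OLD), $replacement.NEW
    }
    $file | Set-Content $inputfile
}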
Call it like this: Refresh-File -inputCSV T:\replace.csv -inputfile T:\blah.exe
Here's my .exe file after running this, just the value portions, to show you that it worked:
pork
Frodo
tam
turkey
cheese
fork
rice
yummy
Window
porp
restroom
pot
poe
poh
pof
Since you'll want to automate this, simply make a new replacement.csv file every day, then run this code. If you've never written a full PS1 script file before, here is a quick summary: copy the function, paste it into Notepad or the PowerShell ISE, and on the last line of the script put the command that calls the function. Save and enjoy.
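For a real binary, a byte-level replacement avoids the round trip through text encoding. The following is a minimal Python sketch of the same CSV-driven idea. The names `load_replacements` and `replace_in_file` are illustrative, and it assumes the strings are stored in the file as plain ASCII/UTF-8 bytes. A replacement of a different length shifts everything after it, which can break a compiled executable, so the sketch refuses such pairs unless told otherwise.

```python
import csv
from pathlib import Path


def load_replacements(csv_path, encoding="utf-8"):
    """Read OLD,NEW pairs from a CSV file and return them as byte strings."""
    with open(csv_path, newline="", encoding=encoding) as f:
        return [
            (row["OLD"].encode(encoding), row["NEW"].encode(encoding))
            for row in csv.DictReader(f)
        ]


def replace_in_file(path, replacements, allow_resize=False):
    """Apply each (old, new) byte replacement in place; return the hit count."""
    data = Path(path).read_bytes()
    hits = 0
    for old, new in replacements:
        # Assumption: same-length replacements are the safe case for binaries.
        if len(old) != len(new) and not allow_resize:
            raise ValueError(f"length mismatch: {old!r} -> {new!r}")
        hits += data.count(old)
        data = data.replace(old, new)
    # The file is only rewritten once every pair has been accepted.
    Path(path).write_bytes(data)
    return hits
```

A pair such as "word class 1" and "word class 2" has equal length and passes the check. The sample CSV above (ham to pork, and so on) changes lengths, so it would need `allow_resize=True`, which is only reasonable for text-like files.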

Hive output to xlsx

I am not able to open an .xlsx file. Is this the correct way to output the result to an .xlsx file?
hive -f hiveScript.hql > output.xlsx

hive -S -f hiveScript.hql > output.xls
This will work

There is no easy way to create an Excel (.xlsx) file directly from Hive. You could write your query's output to a file with the older Excel extension (.xls), as in the other answers, and it would open in Excel (with an initial warning in recent versions of Office), but in essence it is just a text file with an .xls extension. If you open this file with any text editor, you will see the contents of the query output.
Take any .xlsx file on your system, open it with a text editor, and see what you get. It will be all junk characters, since that is not a simple text file.
Having said that, many programming languages let you read a text file and create an .xlsx file from it. Since no information is provided or requested on this, I will not go into details. However, you may use pandas in Python to create Excel files.

Output a CSV or TSV file instead; I used Python (the pandas library) to do the conversion.
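A minimal sketch of that conversion, assuming the Hive CLI wrote tab-separated rows (its usual output format, as far as I know) and that pandas plus an Excel writer such as openpyxl are installed. The function names are illustrative, and the column names are supplied by the caller.

```python
import pandas as pd


def read_hive_output(source, columns=None):
    """Load tab-separated query output into a DataFrame.

    `source` is a path or a file-like object. Pass `columns` when the output
    has no header row; leave it as None if the first row holds the names.
    """
    return pd.read_csv(
        source,
        sep="\t",
        header=None if columns else 0,
        names=columns,
        dtype=str,  # keep values as text so IDs keep their leading zeros
    )


def write_xlsx(frame, xlsx_path):
    """Write the frame as a real .xlsx workbook (needs an engine like openpyxl)."""
    frame.to_excel(xlsx_path, index=False)
```

For example, after `hive -S -f hiveScript.hql > output.tsv`, calling `write_xlsx(read_hive_output("output.tsv", columns=["id", "name"]), "output.xlsx")` would produce a workbook that Excel opens without the format warning; the two column names here are placeholders.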

I am away from my setup right now so really cannot test this. But you can give this a try in your hive shell:
hive -f hiveScript.hql >> output.xls

Powershell to mass rename-move PDFs?

I'm looking to create an automated PowerShell script, run from Task Scheduler, that does a mass rename of auto-generated PDFs and then saves them to a second folder. The original name is irrelevant but is generally in the form 0013238974.pdf. Each file needs to be renamed based on text contained within it. Example:
TEXT TEXT TEXT
$ACCT_ID
TEXT TEXT TEXT
Thus the new name of the file would need to be $ACCT_ID.pdf, and then saved in the new destination. I've got no problem with the move, that's just a simple
Get-ChildItem -Path C:\Original\PDF\Generation\Folder -Include *.pdf -Recurse |
    Copy-Item -Destination C:\The\Folder\I\Need\Them\In
But I'm stumped after that when it comes to extracting the information from the already generated PDF and saving the renamed version as $ACCT_ID.pdf.
I considered running it through a separate PDF print command instead of open/resave, but that doesn't solve my $ACCT_ID extraction problem.
Thanks for any insight on this.

There isn't any built-in functionality for reading PDF files in PowerShell, so your best bet is to use a third-party .NET component. There are several commercial options and at least a few free open-source alternatives.
Here's a few lines of example code using iTextSharp to read the PDF:
Add-Type -Path .\itextsharp.dll
$pdfReader = New-Object iTextSharp.text.pdf.PdfReader("C:\file.pdf")
$textFromFirstPage = [iTextSharp.text.pdf.parser.PdfTextExtractor]::GetTextFromPage($pdfReader, 1)
$pdfReader.Dispose()
How you go about finding your account id after that of course depends on the text of your files.
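Once the page text is available, pulling out the ID and building the new file name is ordinary string handling. The Python sketch below is illustrative only: the question does not show what an account ID looks like, so the pattern (a line consisting solely of 6 to 12 digits) is a placeholder assumption, and `extract_account_id` and `copy_renamed` are hypothetical names. Extracting the text from the PDF is left to whichever PDF library is used.

```python
import re
import shutil
from pathlib import Path

# Placeholder assumption: the ID sits alone on a line and is 6-12 digits long.
ACCOUNT_ID_PATTERN = re.compile(r"^[ \t]*(\d{6,12})\s*$", re.MULTILINE)


def extract_account_id(page_text, pattern=ACCOUNT_ID_PATTERN):
    """Return the first match of the ID pattern in the text, or None."""
    match = pattern.search(page_text)
    return match.group(1) if match else None


def copy_renamed(pdf_path, page_text, dest_dir):
    """Copy the PDF into dest_dir as <account id>.pdf; return the new path."""
    account_id = extract_account_id(page_text)
    if account_id is None:
        return None  # leave the file alone when no ID is found
    dest = Path(dest_dir) / f"{account_id}.pdf"
    shutil.copy2(pdf_path, dest)
    return dest
```

Returning `None` when nothing matches makes it easy to log the files that need a manual look instead of copying them under a wrong name.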
