Import Data From CSV Using Control File Is Failing - HANA

I am importing a CSV file into a HANA server using hdbsql with a control file, and for that purpose I am using an IMPORT FROM CSV FILE statement in the control file. The file import from HANA Studio works fine, but when I try to import through hdbsql using the control file as input, the import fails for no apparent reason and with no error.
My CSV file uses {CR}{LF} as the record delimiter, I am using '\r\n' as the record delimiter in the statement, and the file is UTF-16LE encoded.

Just to add to LarsBr.'s comment, you also need to be careful about where you will load the file from.
It needs to be in a specific directory, or you will need to adjust the configuration to use a different one.
Here is a tutorial I wrote to explain that: https://developers.sap.com/tutorials/mlb-hxe-import-data-sql-import.html
There is a "ERROR LOG" option available as well documented here: https://help.sap.com/viewer/4fe29514fd584807ac9f2a04f6754767/latest/en-US/20f712e175191014907393741fadcb97.html

Related

trouble with utf-8 with julia and jupyterlab

I'm reading the csv file at https://github.com/VinitaSilaparasetty/julia-beginners/blob/master/data/nba/nba19-20.csv
I get a DataFrame and I save it as XLSX. When I try to read it in JupyterLab, I get an error that the file is not UTF-8 encoded, and therefore the file is not read.
This is my code:
using HTTP, XLSX, CSV, DataFrames
df = CSV.read(HTTP.get("https://raw.githubusercontent.com/VinitaSilaparasetty/julia-beginners/master/data/nba/nba19-20.csv").body)
# first(df,5) # first shows the top five rows ok
XLSX.writetable("data/nba/nba19-20.XLSX", collect(eachcol(df)), names(df), overwrite = true)
The file is saved in my data folder. When I try to open it with JupyterLab, I get a pop-up saying the file is not UTF-8 encoded, and the file is not opened.
When I try to open the file in Ubuntu (with LibreOffice) I do not see anything suspicious.
As I'm new to Julia I'm struggling to understand where the problem lies or how to fix it.
I tried to see if I could encode the dataframe in UTF-8 (after saving the file to disk) with
data = DataFrame(CSV.File(open(read,"data/nba/nba19-20.csv", enc"utf-8")))
But I did not see any change. Any suggestion is welcome.
Do you have the jupyterlab-spreadsheet plugin installed? JupyterLab by default doesn't support opening xlsx files (it isn't mentioned in the file formats list here for example).
See also this similar question involving Python pandas (which says pretty much the same thing).

Strange character when importing '.csv' file in SSIS

So I'm trying to use SSIS to import a '.csv' file into SQL Server. The import works fine but the issue I'm having is that when I import the file, each field has the character � appended.
I've been trying all morning to fix this through SSIS, but I'm not having any luck. What I have just noticed is that when I open the '.csv' file and go to Save As, it shows up as Unicode Text rather than an actual csv. If I save it as a csv and then run that through, all the fields come through fine without the � character.
So I have a fix of sorts, but it requires me manually opening and re-saving the files, which I can't have, as I need the process to run automatically. I had the thought of converting the file automatically using a C# script task, but I don't know how to do that. Is anybody able to assist? Or is there a better way to do it that I don't know of?
Thank you.
You can use a simple Powershell script to change the encoding:
foreach ($file in Get-ChildItem *.csv) {
    Get-Content $file.name | Set-Content -Encoding utf8 "UTF8_$($file.name)"
}
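If you would rather do the conversion from a script step so the process stays automatic, here is an equivalent sketch in Python; the folder is a placeholder, and it assumes the incoming files really are in the UTF-16 "Unicode Text" format described above.

# Re-encode UTF-16 ("Unicode Text") .csv files as UTF-8 before the import runs.
# The source folder and output naming are assumptions; adjust them to your setup.
from pathlib import Path

source_dir = Path(r"C:\import")

for src in source_dir.glob("*.csv"):
    text = src.read_text(encoding="utf-16")        # reads the BOM and decodes UTF-16
    dst = src.with_name(f"UTF8_{src.name}")
    dst.write_text(text, encoding="utf-8")         # plain UTF-8, no stray characters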

Proper CSV export from SQL Server

I have a table in SQL Server Management Studio that I want to export as a CSV and afterwards import into WEKA.
I queried all data from the table, selected it, then right-clicked and chose "Save results as"->CSV.
When I try to import this CSV into WEKA, I get the following error message:
File <path> not recognized as an 'CSV data files' file.
Reason:
wrong number of values. READ 27, expected 26, read Token[EOL], line 1023
I assume I need to escape a string at line 1023, but what if another 100 or more such errors follow?
Is there any way to automatically escape all characters to get a proper CSV file, without post-processing?
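One way to avoid escaping rows by hand is to export from a small script with every text field quoted, so embedded commas and quotes can no longer change the column count. The sketch below assumes you can reach the database from Python with pyodbc and pandas (neither is mentioned in the question), and the connection string, table name and output file are placeholders.

# Sketch only: the connection string, table name and output file are assumptions.
import csv

import pandas as pd
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};"
    "SERVER=localhost;DATABASE=MyDb;Trusted_Connection=yes;"
)
df = pd.read_sql("SELECT * FROM dbo.MyTable", conn)

# QUOTE_NONNUMERIC wraps every text value in double quotes and doubles embedded
# quotes, so a comma inside a string can no longer shift the number of columns.
df.to_csv("export_for_weka.csv", index=False, quoting=csv.QUOTE_NONNUMERIC)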

Hive output to xlsx

I am not able to open an .xlsx file. Is this the correct way to output the result to an .xlsx file?
hive -f hiveScript.hql > output.xlsx
This will work:
hive -S -f hiveScript.hql > output.xls
There is no easy way to create an Excel (.xlsx) file directly from Hive. You could output your query's content to an older Excel format (.xls), as the answers above suggest, and it would open in Excel properly (with an initial warning in recent versions of Office), but in essence it is just a text file with an .xls extension. If you open this file with any text editor you will see the contents of the query output.
Take any .xlsx file on your system and open it with a text editor and see what you get. It will be all junk characters since that is not a simple text file.
Having said that, there are many programming languages that let you read a text file and create an .xlsx from it. Since no information is provided or requested on this, I will not go into details; however, you may use pandas in Python to create Excel files.
I output a csv or tsv file and used Python (the pandas library) to do the conversion.
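As a concrete illustration of that pandas route (the file names and tab separator are assumptions, and an xlsx writer engine has to be installed):

import pandas as pd

# Assumes the query result was first written as tab-separated text, e.g.:
#   hive -f hiveScript.hql > output.tsv
df = pd.read_csv("output.tsv", sep="\t")

# Requires an xlsx engine such as openpyxl or xlsxwriter to be installed.
df.to_excel("output.xlsx", index=False)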
I am away from my setup right now, so I really cannot test this. But you can give this a try in your hive shell:
hive -f hiveScript.hql >> output.xls

In Kettle, using Text File Input to read a csv file from a tar.gz file didn't work. Where might it be wrong?

I have a csv file that is tarred and gzipped, so I have test.tar.gz.
I would like to read the csv file through the Text File Input step.
I tried tar:gz:file://C:/test/test.tar.gz!/test.tar! with a wildcard like ".*\.csv".
But it sometimes fails to read successfully.
It throws the exception:
org.apache.commons.vfs.FileNotFolderException:
Could not list the contents of
"tar:gz:file:///C:/test/test.tar.gz!/test.tar!/"
because it is not a folder.
I am using Windows 8.1 and PDI 5.2.
What might be wrong?
For reading a csv from a compressed file, the "Text File Input" step in Pentaho Kettle only supports the first file inside the compressed archive (either a Zip or GZip file). Check the Pentaho Wiki, in the compression section.
Now for your issue, try removing the wildcard entry, since only the first file inside the zip/gzip file will be read (as explained above).
I have placed sample code covering reading both zip and gzip files. Check it here.
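If you do need a specific csv from deeper inside the archive rather than just the first file, one possible workaround (a sketch only, with paths taken loosely from the question) is to unpack the tar.gz in a preceding step and point Text File Input at the extracted files:

# Unpack the .csv members of the archive first, then read the extracted files.
import tarfile

with tarfile.open("C:/test/test.tar.gz", "r:gz") as archive:
    for member in archive.getmembers():
        if member.name.endswith(".csv"):
            archive.extract(member, path="C:/test/extracted")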
Hope it helps :)