How to force scheme.ini to be used for MS Text Driver? - sql-server-2005

I am building a large CSV import that uses the MS Text Driver to read the CSV file.
I am using ColdFusion to create scheme.ini in each folder where a file has been uploaded.
Here is a sample one I am using:
[some_filename.csv]
Format=CSVDelimited
ColNameHeader=True
MaxScanRows=0
Col1=user_id Text width 80
Col2=first_name Text width 20
Col3=last_name Text width 30
Col4=rights Text width 10
Col5=assign_training Text width 1
CharacterSet=ANSI
Then in my ColdFusion code, I am doing 2 cfdump's:
<cfdump var="#GetMetaData( csvfile )#" />
<cfdump var="#csvfile#">
The metadata shows that the query has not picked up the correct data types when reading the CSV file.
The dump of the query that reads the file shows missing values. The files come from Excel, so we cannot force users to wrap values in double quotes, and when a field contains mixed data types our process breaks.
How can I either change the data types inside the query (that is, make it use scheme.ini) or update the metadata to the correct data types?
I am using a view on information_schema in SQL Server 2005 to get the correct data types, column names, and max lengths.
Unless I have some kind of syntax error, I can't see why it's not reading the data with the correct data types.
Any suggestions?

Funnily enough, I had the filename spelled wrong: I had named it scheme.ini instead of schema.ini.
I hate it when you make little mistakes like this...
Thank You
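For reference, here is a minimal Python sketch of generating the file under the correct name. The original code is ColdFusion; the function names `render_schema_ini` and `write_schema_ini` and the (name, width) layout for columns are assumptions made for illustration only.

```python
from pathlib import Path

SCHEMA_FILENAME = "schema.ini"  # the misspelled scheme.ini was ignored by the driver


def render_schema_ini(csv_name, columns):
    """Build schema.ini text for one CSV file.

    columns is a list of (column_name, width) pairs; every column is
    declared as Text, as in the sample above.
    """
    lines = [
        f"[{csv_name}]",
        "Format=CSVDelimited",
        "ColNameHeader=True",
        "MaxScanRows=0",
    ]
    for position, (name, width) in enumerate(columns, start=1):
        lines.append(f"Col{position}={name} Text width {width}")
    lines.append("CharacterSet=ANSI")
    return "\r\n".join(lines) + "\r\n"


def write_schema_ini(folder, csv_name, columns):
    """Write schema.ini into the same folder as the uploaded CSV file."""
    path = Path(folder) / SCHEMA_FILENAME
    # write_bytes avoids any newline translation of the explicit CRLFs
    path.write_bytes(render_schema_ini(csv_name, columns).encode("ascii"))
    return path
```

Keeping the filename in a single constant means a typo can only happen in one place.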

Related

Full Text Search for extracting a snippet of the text (returning the intended text and its surroundings)

I'm using a SQL Server FileTable and, for instance, I have a saved text file named "SOS.txt" which contains the following text:
For god's sake, save us right now please. We can't survive.
Now or never!
Now I want to find all files that contain the word save, so I execute the following query:
SELECT * FROM FileTableExample
WHERE CONTAINS(file_stream, 'save')
and here's the result:
stream file => 0x616C692053617665207573207269676874206E6F772E0D0A4E6F77206F72206E6576657221
As you can see, I got the correct result: the third column of the result identifies the file named SOS.txt. I have the stream_id and file_stream, but what I'm trying to find is a way to show the matched text together with its surroundings in a human-readable format.
Something like this:
Name | Excerpt
-------------+----------------------
SOS.txt |..sake, save us..
Is there any way?
Update:
After searching the net I found this article, which is useful, but it doesn't mention full-text search on a FileTable.
Based on this article, I converted file stream to string:
SELECT CONVERT(varchar(MAX), file_stream) AS Excerpt, *
from FileTableExample
where contains(file_stream, 'save')
It works if the file is plain text like SOS.txt, but if it's a .docx or .pptx file, you won't get a useful conversion.
Use this, CAST(file_Stream as varchar(max))
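Once the stream is available as a string, the excerpt itself can be built on the client side. The following is a minimal Python sketch, assuming plain-text content only; `excerpt` and `stream_to_text` are hypothetical helper names, not part of SQL Server.

```python
def stream_to_text(hex_stream, encoding="ascii"):
    """Decode a 0x... hex dump of file_stream into a string."""
    if hex_stream.lower().startswith("0x"):
        hex_stream = hex_stream[2:]
    return bytes.fromhex(hex_stream).decode(encoding)


def excerpt(text, term, before=6, after=3):
    """Return the first match of term with a little context on each side.

    The search is case-insensitive. Lowercasing is assumed not to change
    the string length, which holds for ASCII text.
    """
    index = text.lower().find(term.lower())
    if index == -1:
        return None
    start = max(0, index - before)
    end = min(len(text), index + len(term) + after)
    prefix = ".." if start > 0 else ""
    suffix = ".." if end < len(text) else ""
    return prefix + text[start:end] + suffix
```

For the SOS.txt text above, `excerpt(text, "save")` gives `..sake, save us..`.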

DB2 SQL Interpret a field as other CCSID

So I have a file on my AS400 as a result of DSPJRN and I want to look at some data in the JOESD field which is the after image from the journal of a file. This is defined as char with CCSID = 65535. I guess this is because it is the whole record with a mixture of ccsid and numeric fields.
I can use substr() to get the actual field from the original file. In the original file the column is defined as graphic(10) ccsid 13488. That's UCS-2. If I do hex(substr(joesd,522,20)) I get a result of 004100530044... and so on, so I know it's the correct data, but I can't get it to display as 'ASD...'.
I tried graphic(substr(joesd,522,20),10,13488) but it gives an error that the conversion from ccsid 65535 to 13488 isn't valid. I don't want to convert it, but to interpret it as the other ccsid.
GRAPHIC() doesn't take CCSID as a parm. The third parm is length according to my 7.1 reference.
What version are you using?
I thought CAST() might be a solution, but it doesn't appear to work.
As I see it, one option would be to build a user defined function (UDF) that does the conversion you need; possibly with the iconv() API.
The other option would be to dump the data into a properly formatted file. I use the DBUJRN utility from DBU. There are other similar options, including an open source one (sorry that the description is in German, but Google Translate does a good enough job to figure out the source to download).
The utilities basically work the same way; you can in fact run through the same process manually. Try the following:
Step 1 (the DSPJRN you've been doing)
DSPJRN <...> OUTFILE(MYLIB/MYJRNOUT)
Step 2 - Create a new file with the journal header fields followed by all the fields from your journaled file (MYFILE)
CREATE TABLE mylib/mytbl as
( select JOENTL, JOSEQN, JOCODE, JOENTT, JODATE,
JOTIME, JOJOB, JOUSER, JONBR, JOPGM, JOOBJ,
JOLIB, JOMBR, JOCTRR, JOFLAG, JOCCID,
JOINCDAT, JOMINESD, JORES,
m.*
from MYLIB/MYJRNOUT , MYLIB/MYFILE m
) with no data
Step 3 - Copy the data without regard to the format differences..
CPYF FROMFILE(MYLIB/MYJRNOUT) TOFILE(MYLIB/MYTBL) MBROPT(*ADD) FMTOPT(*NOCHK)
You should end up with the data originally in JOESD split into its appropriate fields.
Note of course that this technique only works for one file at a time. Also, make sure you're only dumping *RCD entries and you'll probably want to skip the DELETE entries.
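To illustrate the difference between converting the bytes and interpreting them, here is a small Python sketch that runs outside DB2. It takes the hex digits shown in the question and reads them as big-endian UCS-2; the function name `interpret_as_ucs2` is made up for this example.

```python
def interpret_as_ucs2(hex_digits):
    """Reinterpret raw bytes (given as hex digits) as big-endian UCS-2 text."""
    raw = bytes.fromhex(hex_digits)
    # UTF-16-BE agrees with UCS-2 for characters in the Basic Multilingual Plane.
    return raw.decode("utf-16-be")


print(interpret_as_ucs2("004100530044"))
```

No byte is changed here; the same six bytes are simply read two at a time as code units, which is the behaviour the question asks for.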

Cannot upload CSV that starts with an integer

I'm stuck with what seems like a weird BigQuery bug: I cannot upload a CSV file that starts (first line, first column) with an integer.
Here's my schema : COL1:INTEGER,COL2:INTEGER,COL3:STRING
Here's my csv file content :
100,4,XXX
100,4,XXX
If I put the STRING column as first column, the upload is OK.
If I add a header and tell BigQuery to skip it during the import, the upload is ok too.
But with the CSV and schema above, BigQuery always complains : Line:1 / Field:1, Value cannot be converted to expected type.
Does anyone know what the problem is?
Thank you in advance,
David
I could not reproduce this problem--I copied and pasted the content into a file and uploaded it with no problems.
Perhaps the uploaded file is corrupted somehow? If there are extra bytes at the beginning of the file, those would be ignored in a header row but might result in this error if the first value of the first field is expected to be an integer. I'd recommend examining the actual binary data in the file to make sure there's nothing funny going on.
Also, are you doing this import via web UI, command-line tool, or API? Have you tried one of the other methods?
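One quick way to examine the first bytes is sketched below in Python. It only checks for a byte order mark, which is one common kind of extra leading bytes and an assumption about the cause here, not a confirmed diagnosis; `detect_bom` and `leading_bytes` are hypothetical helper names.

```python
import codecs

# Longer marks first: the UTF-32 LE mark begins with the UTF-16 LE mark.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data):
    """Return the encoding name of a leading byte order mark, or None."""
    for mark, name in _BOMS:
        if data.startswith(mark):
            return name
    return None


def leading_bytes(path, count=16):
    """Read the first few raw bytes of a file for inspection."""
    with open(path, "rb") as handle:
        return handle.read(count)
```

Printing `leading_bytes(path)` shows whether the file really begins with the digits `100`, and `detect_bom` names the mark if one is present.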

SAS : read in PDF file

I am looking for ways to read in a PDF file with SAS. Apparently this is not basic functionality and there is very little to be found on the internet. (Let alone that Google is not easy to use with PDF in your search, since it also gives you links to PDF documents about other things.)
The only things that can be found are people looking for ways to import data into datasets from a PDF. For me, that is not even necessary. I would like to be able to read the contents of the PDF file into one big character variable. If possible, it would be even better to be able to read in the file's binary data.
Is this possible with SAS and how? (I got it to work in Access VBA, but can't find any similar ways in SAS.)
(In the end, the purpose is to convert this to base64 and put that base64-string into an XML document.)
You probably will not be able to read the entire file into one character variable, since the maximum length of a character variable is 32,767 bytes. A simple way to read in one line at a time, though, is something like the following:
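%let pdfFileName = Test.pdf;
%let lineSize = 2000;
data base;
  /* declare the variable wide enough to hold one full record */
  length text_line $ &lineSize;
  /* truncover keeps short records from spilling into the next line */
  infile "&pdfFileName" lrecl=&lineSize truncover;
  /* $char. reads the whole record, keeping leading and embedded blanks */
  input text_line $char&lineSize..;
run;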
This requires that you have a general idea of the maximum record length ahead of time, but you could write additional code to determine the maximum record size prior to reading in the file. In this example each line of text is read into one character variable named "text_line." From there, you could use a RETAIN statement or a double trailing at sign (@@) in the INPUT statement to process multiple lines at a time. The SAS website has plenty of documentation on how to read and process text from various types of input files.
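Since the stated goal is a base64 string, the length limit also affects how the encoding can be split up. The sketch below is in Python rather than SAS and only illustrates the chunking idea: if each chunk of input is a multiple of 3 bytes, the encoded pieces contain no padding except possibly the last one, so they can be concatenated directly. The chunk size and the name `base64_pieces` are assumptions chosen for this example.

```python
import base64

# 24,000 input bytes encode to 32,000 base64 characters, under 32,767.
CHUNK_BYTES = 3 * 8000


def base64_pieces(data, chunk_bytes=CHUNK_BYTES):
    """Yield base64 text for data, one piece per chunk of input bytes."""
    if chunk_bytes <= 0 or chunk_bytes % 3 != 0:
        raise ValueError("chunk_bytes must be a positive multiple of 3")
    for offset in range(0, len(data), chunk_bytes):
        chunk = data[offset:offset + chunk_bytes]
        yield base64.b64encode(chunk).decode("ascii")
```

Joining the pieces in order gives the same string as encoding the whole file at once.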

how to import flat file source to database using sql

I want to import my data from a flat file into the database.
The flat file is a txt file in which I save a list of URLs. Example:
http://www.mimi.com/Hotels-g303188-Rurrenabaque-Hotels.html
I'm using the SQL Server Import and Export Wizard to do it, but at execution time it fails with this error:
Error 0xc02020a1:
Data Flow Task 1: Data conversion failed. The data conversion for column
"Column 0" returned status value 4 and status text "Text was truncated or one
or more characters had no match in the target code page.".
Can anyone help?
You get this error because the text is too long for the column you've chosen to put it in.
"Text was truncated or"
You might want to check the size of the database column against your input data. Is the longest URL shorter than the column width?
"one or more characters had no match in the target code page."
Check if your input file has any special characters. An easy way to check this would be to save your file in ANSI (Notepad > Save As > Encoding = ANSI). Note: you'd still have to select the right code page so that the import interprets your input text correctly.
Here's a very nice link that has some background on what code pages are - http://www.joelonsoftware.com/articles/Unicode.html
Note that you can also change the target column data type (to text stream, for example) in the Datasource->Advanced section.
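Both halves of the error message can be checked before running the import. Here is a rough Python sketch, assuming one URL per line; the function name `find_import_problems` and the default code page `cp1252` are assumptions, so substitute the column width and code page of your own target.

```python
def find_import_problems(lines, column_width, code_page="cp1252"):
    """List lines that would be truncated or cannot be encoded.

    Returns (line_number, reason, detail) tuples, where detail is the
    line length for "too long" and the offending character position
    for "no match in code page".
    """
    problems = []
    for number, line in enumerate(lines, start=1):
        value = line.rstrip("\r\n")
        if len(value) > column_width:
            problems.append((number, "too long", len(value)))
        try:
            value.encode(code_page)
        except UnicodeEncodeError as error:
            problems.append((number, "no match in code page", error.start))
    return problems
```

Passing an open text file as `lines` works, since iterating over a file yields its lines. An empty result means every line fits the column and the code page.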