DB2 SQL query returns some type of converted results when exporting to file - sql

I have a shell script which queries a DB2 database and exports the output to a file. When I run the SQL statement without exporting, I get the following:
su - myid -c 'db2 connect to mydb;db2 -v "select COL1 from MYTABLE"; db2 connect reset;'
Sample Output
COL 1
x'20A0E2450080000'
x'50D24520E100GDS00'
x'10H0EFJ10080000'
x'50A0GH0080000'
x'80RHE1008B0000'
x'70A50E1F4008000'
x'10F329EF09BB0'
But when I export my results using the exact same query, I get the following:
su - myid -c 'db2 connect to mydb;db2 -v "EXPORT TO '/tmp/query_results.out' OF DEL MODIFIED BY COLDEL: select COL1 from MYTABLE"; db2 connect reset;'
Sample Output
hôª"
"xàÓ °á
"èÅ °á
hôª"
"é# °á
hôª"
"é« °á
hôª"
"éÅ °á
hôª"
"""ÒYá  á
hôª"
"#sYá  á
hôª"
I'm assuming this is due to the single quote characters. Because they are both preceded by another character, I have not been able to add '\' in front of them. I've also attempted to run the substr function within the query, but I still get the same result, only shorter. I'm sure there must be something I am overlooking, so after several days of trying on my own (and failing), I'm turning to you guys. Any help would be greatly appreciated.
*Edit: Just wanted to add that my actual select statement includes more than one column, and the other columns are displayed correctly. So out of several columns, only this one is displaying bad data.

"I'm assuming this is due to the single quote characters" -- No. This particular column contains binary data, either BLOB or VARCHAR FOR BIT DATA. If it is BLOB, specify LOBS TO in the EXPORT command, this way BLOBs will be written to binary files. If it is VARCHAR FOR BIT DATA, you can either convert it to BLOB on export (export to ... lobs to ... select blob(your_column)...) or export it as hex(your_column), depending on what you're planning to do with the export later.
Another alternative for VARCHAR FOR BIT DATA would be to export your table using the IXF format instead of DEL, which will preserve binary strings.

Related

How to extract data with Actual Column Size rather than Fixed Column Size in SYBASE by ISQL query

The ISQL command executes the SQL file and generates a text file. The width of each column in the results is based on the declared (fixed) size of the column, not on the actual size of the data.
e.g.
The Table "STUDENT" has columns
"FirstName" varchar(10)
"LastName" varchar(10)
ISQL Command :-
isql -UUserID -PPassword -SDatabase1 -DUserID -iName.sql -b -s -w2000 -oName.txt
When I execute the SELECT query (Name.sql) via the ISQL command, the result is:
Actual :-
FirstName |LastName
JOHN______|DOE_______
Note : "_" is blank spaces
Expected :-
FirstName|LastName
JOHN|DOE
I googled and found a few links, but they were not helpful to me.
https://docs.faircom.com/doc/isql/32422.htm
http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc30191.1550/html/utility/utility14.htm
Installed SYBASE version : 15.7.0
After researching, I came to know that Sybase ISQL has this limitation: the result column width is based on the fixed size of the column rather than the actual size of the data.
There are other options available, like using temporary tables/views, to get the desired data.
I ended up writing a utility that does the job for me.
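For what it's worth, a minimal post-processing sketch (not the actual utility; it assumes the '|' column separator and the Name.txt output from the command above) that strips the padding:
sed -e 's/[[:space:]]*|[[:space:]]*/|/g' -e 's/[[:space:]]*$//' Name.txt > Name_trimmed.txt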

pgAdmin4: Importing a CSV

I am trying to import a CSV using pgAdmin4. I created the table using the query,
CREATE TABLE i210_2017_02_18
(
PROBE_ID character varying(255),
SAMPLE_DATE timestamp without time zone,
LAT numeric,
LON numeric,
HEADING integer,
SPEED integer,
PROBE_DATA_PROVIDER character varying(255),
SYSTEM_DATE timestamp without time zone
)
The header and first line of my CSV read:
PROBE_ID,SAMPLE_DATE,LAT,LON,HEADING,SPEED,PROBE_DATA_PROVIDER,SYSTEM_DATE
841625st,2017-02-18 00:58:19,34.11968,-117.80855,91.0,9.0,FLEET53,2017-02-18 00:58:58
When I try to use the import dialogue, the process fails with Error Code 1:
ERROR: invalid input syntax for type timestamp: "SAMPLE_DATE"
CONTEXT: COPY i210_2017_02_18, line 1, column sample_date: "SAMPLE_DATE"
Nothing seems wrong to me - any ideas?
According to your table structure, this import will also fail on the columns HEADING and SPEED, since their values have decimals and you declared them as INTEGER. Either remove the decimals or change the column type to, e.g., NUMERIC.
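If the table already exists, a sketch of that type change (standard PostgreSQL syntax, using the column names from the question):
ALTER TABLE i210_2017_02_18
  ALTER COLUMN heading TYPE numeric,
  ALTER COLUMN speed TYPE numeric;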
Having said that, just try this from pgAdmin (considering that file and database are in the same server):
COPY i210_2017_02_18 FROM '/home/jones/file.csv' CSV HEADER;
In case you're dealing with a remote server, try this using psql from your console:
$ cat file.csv | psql yourdb -c "COPY i210_2017_02_18 FROM STDIN CSV HEADER;"
You can also check this answer.
In case you really want to stick to the pgAdmin import tool, which I discourage, just select the Header option and the proper delimiter in the import dialog.
Have you set the Header option to TRUE in the import settings? That should work.
Step 1: Create a table. You can use a query or the dashboard to create it.
Step 2: Create the exact number of columns present in the CSV file. I would recommend creating the columns using the dashboard.
Step 3: Click on your table name in pgAdmin; you will see an option for Import/Export.
Step 4: Provide the path of your CSV file, and remember to choose comma as the delimiter.

push query results into array

I have a bash shell script. I have a psql COPY command that has captured some rows of data from a database. The data from the database are actual SQL statements that I will use in the bash script. I put the statements in a database because they are of varying length and I want to be able to dynamically call certain statements.
1) I'm unsure what delimiter to use in the COPY statement. I can't use a comma or pipe because those characters appear in the data coming from the database. I have tried a couple of random characters that are not in my data, but COPY complains and only accepts a single ASCII character.
Also to complicate things I need to get query_name and query_string for each row.
This is what I currently have. I get all the data fine with the copy but now I just want to push the data into an array so that I will be able to loop over it later:
q="copy (select query_name,query_string from query where active=1)
to stdout delimiter ','"
statements=$(psql -d ${db_name} -c "${q}")
statements_split=(`echo ${statements//,/ }`)
echo ${statements_split[0]};
Looks to me like you actually want to build something like a dictionary (associative array) mapping query_name to query_string. bash isn't really the best choice for handling complex data structures. I'd suggest using Perl for this kind of task if that's an option.
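If staying in bash is a hard requirement, a minimal sketch (assuming bash 4+ for associative arrays, and that neither tabs nor newlines occur inside the stored statements) might look like this:
declare -A queries                       # maps query_name -> query_string
while IFS=$'\t' read -r name sql; do
    queries["$name"]="$sql"
done < <(psql -d "${db_name}" -At -F $'\t' \
         -c "select query_name, query_string from query where active = 1")
echo "${queries[some_query_name]}"       # look up one statement by name (name is hypothetical)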

hive RegexSerDe null

How should I work with NULL values in RegexSerDe?
I have file with data:
cat MOS/ex1.txt
123,dwdjwhdjwh,456
543,\N,956
I have the table:
CREATE TABLE mos.stations (usaf string, wban STRING, name string)
ROW FORMAT SERDE 'org.apache.hadoop.hive.contrib.serde2.RegexSerDe'
WITH SERDEPROPERTIES (
"input.regex" = "(.*),(.*),(.*)"
);
I successfully loaded the data from file to table:
LOAD DATA LOCAL INPATH '/home/hduser/MOS/ex1.txt' OVERWRITE INTO TABLE mos.stations;
Simple select works fine:
hive> select * from mos.stations;
123dwdjwhdjwh456
543\N956
And next ends with error:
select * from mos.stations where wban is null;
[Hive Error]: Query returned non-zero code: 9, cause: FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
What is wrong?
I see a couple of possible issues:
1) It may not have anything to do with null handling at all. The first query doesn't actually spawn an M/R job while the second one does, so it might be a simple classpath issue where RegexSerde is not being seen by the M/R tasks because its jar is not in the classpath of the tasktracker. You'll need to find where the hive-contrib jar on your system lives and then make hive aware of it via something like:
add jar /usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u2.jar
Note, your path and jar name may be different. You can run the above through hive right before your query.
2) Another issue might be that the RegexSerde doesn't really deal with "\N" the same way as the default LazySimpleSerde. Judging by the output you are getting in the first query (where it returns a literal "\N"), that could be the case. What happens if you query where wban='\\N' or where wban='\N'? (I forget if you need to double escape.)
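For example (hypothetical checks; the exact escaping needed depends on your Hive version):
select * from mos.stations where wban = '\\N';
select * from mos.stations where wban = '\N';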
Finally, one word of caution about RegexSerde. While it's really handy, it's slow as molasses going uphill in January compared to the default serde. If the dataset is large and you plan to run a lot of queries against it, it's best to pre-process the data so that you don't need the RegexSerde. Otherwise, you're going to pay a penalty for every query. The same dataset above looks like it would be fine with the default serde.
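For instance, a sketch of the same file loaded through the default serde (the table name stations_plain is made up for illustration); with the default LazySimpleSerde the \N marker is interpreted as NULL:
CREATE TABLE mos.stations_plain (usaf STRING, wban STRING, name STRING)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
LOAD DATA LOCAL INPATH '/home/hduser/MOS/ex1.txt' OVERWRITE INTO TABLE mos.stations_plain;
select * from mos.stations_plain where wban is null;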

MySQL Convert latin1 data to UTF8

I imported some data using LOAD DATA INFILE into a MySQL database. The table itself and the columns are using the UTF8 character set, but the default character set of the database is latin1. Because the default character set of the database is latin1, and I used LOAD DATA INFILE without specifying a character set, it interpreted the file as latin1, even though the data in the file was UTF8. Now I have a bunch of badly encoded data in my UTF8 column. I found this article which seems to address a similar problem, which is "UTF8 inserted in cp1251", but my problem is "latin1 inserted in UTF8". I've tried editing the queries there to convert the latin1 data to UTF8, but can't get it to work. Either the data comes out the same, or even more mangled than before. Just as an example, the word Québec is showing as QuÃ©bec.
[ADDITIONAL INFO]
When selecting the data wrapped in HEX(), Québec has the value 5175C383C2A9626563.
The CREATE TABLE (shortened) for this table is:
CREATE TABLE MyDBName.`MyTableName`
(
`ID` INT NOT NULL AUTO_INCREMENT,
.......
`City` CHAR(32) NULL,
.......
) ENGINE=InnoDB CHARACTER SET utf8;
I've had cases like this in old WordPress installations, with the problem being that the data itself was already in UTF-8 within a latin1 database (due to the WP default charset). This means there was no real need to convert the data, only the database and table character set declarations.
In my experience, things get messed up when doing the dump, as I understand MySQL will use the client's default character set, which in many cases is now UTF-8.
Therefore, making sure the export uses the same encoding as the data is very important. In the case of a latin1 database holding UTF-8 data:
$ mysqldump --default-character-set=latin1 --databases wordpress > m.sql
Then replace the Latin1 references within the exported dump before reimporting to a new database in UTF-8. Sort of:
$ replace "CHARSET=latin1" "CHARSET=utf8" \
"SET NAMES latin1" "SET NAMES utf8" < m.sql > m2.sql
In my case this link was of great help.
Commented here in spanish.
Though it is probably no longer relevant for the OP, I happen to have found a solution in the MySQL documentation for ALTER TABLE. I post it here just for future reference:
Warning
The CONVERT TO operation converts column values between the character sets. This is not what you want if you have a column in one character set (like latin1) but the stored values actually use some other, incompatible character set (like utf8). In this case, you have to do the following for each such column:
ALTER TABLE t1 CHANGE c1 c1 BLOB;
ALTER TABLE t1 CHANGE c1 c1 TEXT CHARACTER SET utf8;
The reason this works is that there is no conversion when you convert to or from BLOB columns.
LOAD DATA INFILE allows you to specify the character set the file is supposed to be in:
http://dev.mysql.com/doc/refman/5.1/en/load-data.html
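For example, a sketch using a placeholder file path (the CHARACTER SET clause tells MySQL how to interpret the file, so the UTF-8 data is not misread as latin1):
LOAD DATA INFILE '/path/to/data.csv'
INTO TABLE MyDBName.MyTableName
CHARACTER SET utf8
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';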
I wrote http://code.google.com/p/mysqlutf8convertor/ to convert a latin1 database to a UTF-8 database. It changes all tables and fields to UTF-8.
Converting latin1 to UTF8 is not what you want to do; you kind of need the opposite.
If what really happened was this:
UTF-8 strings were interpreted as Latin-1 and transcoded to UTF-8, mangling them.
You are now, or could be, reading UTF-8 strings with no further interpretation.
What you must do now is:
Read the "UTF-8" with no transcode.
Convert it to Latin-1. Now you should actually have the original UTF-8.
Now put it in your "UTF-8" column with no further conversion.
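One way to express those three steps in SQL for the City column from the CREATE TABLE above (a sketch only; try it on a copy of the data first):
UPDATE MyDBName.MyTableName
SET City = CONVERT(CAST(CONVERT(City USING latin1) AS BINARY) USING utf8);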
I recently completed a shell script that automates the conversion process. It is also configurable to write custom filters for any text you wish to replace or remove, for example stripping HTML characters, etc. Table whitelists and blacklists are also possible. You can download it at SourceForge: https://sourceforge.net/projects/mysqltr/
Try this:
1) Dump your DB
mysqldump --default-character-set=latin1 -u username -p databasename > dump.sql
2) Open dump.sql in a text editor and replace all occurrences of "SET NAMES latin1" with "SET NAMES utf8"
3) Create a new database and restore your dumpfile
cat dump.sql | mysql -u root -p newdbname