I am stuck in a situation where I need to insert data into a BLOB column by reading a file from the filesystem in DB2 (DB2 Express-C on Windows 7).
Somewhere on the internet I found INSERT INTO ... VALUES ( ..., readfile('filename'), ...);, but readfile is not a built-in function; I would have to create it as a UDF (using C libraries), which might not be a practical solution.
Can somebody explain how to insert BLOB values using the INSERT command?
You can also insert BLOB values by casting character data to BLOB, which is then stored as the corresponding hex values:
CREATE TABLE BLOB_TEST (COL1 BLOB(50));
INSERT INTO BLOB_TEST VALUES (CAST('test' AS BLOB));
SELECT COL1 FROM BLOB_TEST;
DROP TABLE BLOB_TEST;
This gives the following result:
COL1
-------------------------------------------------------------------------------------------------------
x'74657374'
1 record(s) selected.
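If you already have the binary content as hex, a hexadecimal string constant can be inserted directly. A minimal sketch against the same BLOB_TEST table (the hex value is just the bytes of 'test'):
-- X'...' is a hexadecimal constant; the CAST makes the target BLOB type explicit
INSERT INTO BLOB_TEST VALUES (CAST(X'74657374' AS BLOB));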
1) You could use LOAD or IMPORT via ADMIN_CMD. This way, you use SQL to call the administrative stored procedure that runs the tool; IMPORT and LOAD can read files and put their contents into a row.
You can also wrap this process with a temporary table: read the binary data from the file into the temporary table, and then select it back from there.
2) You can create an external stored procedure or UDF implemented in Java or C that reads the file and then inserts the data into the row.
I have not tried it, but you can also use the built-in modules that handle LOBs http://www-01.ibm.com/support/knowledgecenter/SSEPGG_10.5.0/com.ibm.db2.luw.apdv.sqlpl.doc/doc/r0055115.html
These are only available in DB2 LUW since version 9.7.
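For option 1, a minimal sketch of calling IMPORT through ADMIN_CMD. The table name, column layout and file paths here are assumptions, not taken from the question; the delimited file holds the LOB file names and must be readable by the server:
-- Assumed: MY_TABLE(ID INTEGER, DOC BLOB); LOBSINFILE makes IMPORT read each
-- LOB from the files named in the .del file, resolved against the LOBS path.
CALL SYSPROC.ADMIN_CMD(
  'IMPORT FROM /tmp/data.del OF DEL
   LOBS FROM /tmp/lobs/
   MODIFIED BY LOBSINFILE
   INSERT INTO MY_TABLE (ID, DOC)');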
I've succeeded in doing this by using IBM Data Studio with the following query:
INSERT INTO MY_TABLE (BLOB_COLUMN) values (?);
and selecting a file from the pop-up dialog.
Somehow, the same method in RAD 8 doesn't offer an option to load a BLOB column the same way.
First and foremost, per the IBM docs, all LOB data in DB2 must have the following corresponding items in addition to the LOB column defined in the base table. See the docs for example CREATE statements.
LOB tablespace (one for every LOB column in each partition)
An auxiliary table on the above tablespace that points to the BLOB column in the base table (also one for every LOB column in each partition)
A unique index on the auxiliary table
Once this schema is prepared, you can run a LOAD command that imports the other data fields while the BLOB content is referenced by file paths. Below is a demo with an EMPLOYEES table:
DB Table (example table)
CREATE TABLE EMPLOYEES (
ID INTEGER NOT NULL GENERATED ALWAYS AS IDENTITY,
EMPLOYEE_NUMBER INTEGER,
EMPLOYEE_NAME VARCHAR(255),
EMPLOYEE_PIC BLOB(500K)
);
CSV FILE (comma is the default delimiter in LOAD; no headers)
1234, "John Doe", johndoe.jpg
5678, "Jane Doe", janedoe.jpg
...
DB2 LOAD (simple version using defaults for many other LOAD parameters)
LOAD FROM "/path/to/file.csv"
OF DEL
LOBS FROM /path/to/picture/folder/ --PATH OF BLOB FILES WITH BASENAME IN CSV
--MUST END IN FORWARD SLASH
MODIFIED BY LOBSINFILE CHARDEL""
DUMPFILE="/path/to/dump.txt" --FOR FAILED IMPORTS
METHOD P (1,2,3) --NUMBER REFERENCE OF COLS, OR USE N FOR FIELD NAMES
MESSAGES "/path/to/messages.txt" --FOR LOAD COMMAND MESSAGES
REPLACE INTO "EMPLOYEES" --REMOVES EXISTING FOR IMPORT, OR USE INSERT TO ADD
(EMPLOYEE_NUMBER,
EMPLOYEE_NAME,
EMPLOYEE_PIC);
Command lines
> db2 -tvf "/path/to/load_command.sql"
> db2 "SELECT LENGTH(EMPLOYEE_PIC) FROM EMPLOYEES"
DB2 SQL query to insert a JPG file into a table:
create table table_name (column_name BLOB);  -- BLOB is the data type
insert into table_name (column_name) values (blob('c:\data\winter.jpg'));
Here c:\data\ is the path and winter.jpg is the image name.
I'm new to PostgreSQL and am looking for some guidance and best practices.
I have created a table by importing data from a csv file. I then altered the table by creating multiple generated columns like this:
ALTER TABLE master
ADD office VARCHAR(50)
GENERATED ALWAYS AS (CASE WHEN LEFT(location,4)='Chic' THEN 'CHI'
ELSE LEFT(location,strpos(location,'_')-1) END) STORED;
But when I try to import new data into the table I get the following error:
ERROR: column "office" is a generated column
DETAIL: Generated columns cannot be used in COPY.
My goal is to be able to import new data each day to the table and have the generated columns automatically populate in order to transform the data as I would like. How can I do so?
CREATE TEMP TABLE master (location VARCHAR);
ALTER TABLE master
ADD office VARCHAR
GENERATED ALWAYS AS (
CASE
WHEN LEFT(location, 4) = 'Chic' THEN 'CHI'
ELSE LEFT(location, strpos(location, '_') - 1)
END
) STORED;
--INSERT INTO master (location) VALUES ('Chicago');
--INSERT INTO master (location) VALUES ('New_York');
COPY master (location) FROM $$d:\cities.csv$$ CSV;
SELECT * FROM master;
Is this the structure and the behaviour you are expecting? If not, please provide more details regarding your table structure, your importable data and your importing commands.
Also, maybe when you try to import the CSV file, the columns are not mapped properly, or maybe the delimiter is not set correctly. Try to specify each column in the exact order that they appear in your CSV file.
https://www.postgresql.org/docs/12/sql-copy.html
Note: d:\cities.csv contains:
Chicago
New_York
EDIT:
If column positions are mixed up between the table and the CSV, the following approach may come in handy (a concrete sketch follows below):
1. create temporary table tmp (csv_column_1 <data_type>, csv_column_2 <data_type>, ...); (including ALL csv columns)
2. copy tmp from '/path/to/file.csv';
3. insert into master (location, other_info, ...) select csv_column_3 as location, csv_column_7 as other_info, ... from tmp;
Importing data using an intermediate table may slow things down a little, but gives you a lot of flexibility.
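For example, a minimal sketch of this intermediate-table approach against the master table above (the CSV column names and the path are hypothetical):
-- stage the raw CSV columns first; none of them are generated
CREATE TEMP TABLE tmp (csv_column_1 TEXT, csv_column_2 TEXT, csv_column_3 TEXT);
COPY tmp FROM '/path/to/file.csv' CSV;
-- copy only the non-generated columns into master; office is computed automatically
INSERT INTO master (location)
SELECT csv_column_3 FROM tmp;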
I was getting the same error when importing into PG from a CSV. I found that even though my column was generated, I still had to have it in the imported data; I just left it empty. It worked fine once the column name was in there and mapped to my DB column name.
My database is hosted on a server to which I can only issue DML statements.
Is there an SQL command (for Oracle) that I could use to fill a table with the entries from a CSV file? The columns of the CSV file and the table are the same, but if there is a version of the command where I could decide which field from the file goes to which column it would be even better.
Also, I cannot install anything besides the Oracle SQL Developer so what I need is an SQL code that I can run from there. I believe that SQL*Loader and external tables don't help in this situation.
Use an Oracle external table.
create directory ext_data_files as 'C:\'; -- create an Oracle directory object pointing to the directory where your file resides; we will fetch the CSV data through it
create table teachers_ext (
first_name varchar2(15),
last_name varchar2(15),
phone_number varchar2(12)
)
organization external (
type oracle_loader
default directory ext_data_files
access parameters (fields terminated by ',' )
location ('teacher.csv')
)
reject limit unlimited
/
Your CSV will look like:
John,Smith,8737493
Foo, Bar, 829823832
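Once the external table exists, you can fill a regular table from it with plain DML. A minimal sketch, assuming a target table named teachers with the same columns (the target table is an assumption, not from the question):
INSERT INTO teachers (first_name, last_name, phone_number)
SELECT first_name, last_name, phone_number
FROM teachers_ext;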
Directly copied from Oracle 9i documentation:
CREATE TYPE student_type AS object (
student_no CHAR(5),
name CHAR(20))
/
CREATE TABLE roster (
student student_type,
grade CHAR(2));
Also assume there is an external table defined as follows:
CREATE TABLE roster_data (
student_no CHAR(5),
name CHAR(20),
grade CHAR(2))
ORGANIZATION EXTERNAL (TYPE ORACLE_LOADER DEFAULT DIRECTORY ext_tab_dir
ACCESS PARAMETERS (FIELDS TERMINATED BY ',')
LOCATION ('foo.dat'));
To load table roster from roster_data, you would specify something
similar to the following:
INSERT INTO roster (student, grade)
(SELECT student_type(student_no, name), grade FROM roster_data);
The external table access driver (aka ORACLE_LOADER) accepts a bunch of options to handle many different cases: fixed width, CSV, endianness (binary data), separators... Once again, see the docs for the details.
... if there is a version of the command where I could decide which field from the file goes to which column it would be even better.
As you have understood by now, external tables are queried like any other table, so you can re-order columns and/or perform calculations on the fly in your INSERT ... SELECT ... statement.
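For instance, a small sketch reusing the roster and roster_data tables above, re-ordering the columns and applying a calculation on the way in (the UPPER call and the filter are just illustrative):
INSERT INTO roster (grade, student)
SELECT grade, student_type(student_no, UPPER(name))
FROM roster_data
WHERE grade IS NOT NULL;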
EMPDET is an external table containing the columns EMPNO and ENAME. What is an external table in an Oracle database?
Why can/can't we update or delete from an external table?
A. UPDATE empdet
SET ename = 'Amit'
WHERE empno = 1234;
B. DELETE FROM empdet
WHERE ename LIKE 'J%';
An external table in an Oracle database is a way of accessing data residing in some .txt or .csv file via SQL commands. The table data is not kept in a database tablespace; it is rather a kind of view on the sequential dataset. So there is no way the database can index or update the data, since it is outside its scope; it can only run selects on it.
"External table" means you have a (typically) CSV file stored on your filesystem and Oracle reads this file as defined by the settings in the CREATE TABLE statement. The data is not saved in an Oracle tablespace, but you can select it like a normal table. However, you can only select from it (or logically create a view on it); you cannot modify anything.
Here is a simple example of an external table:
CREATE TABLE ADHOC_CSV_EXT (
C1 VARCHAR2(4000),
C2 VARCHAR2(4000),
C3 VARCHAR2(4000)
)
ORGANIZATION EXTERNAL (
TYPE ORACLE_LOADER
DEFAULT DIRECTORY SOME_FOLDER
ACCESS PARAMETERS (
records delimited BY newline
fields terminated BY ',' optionally enclosed BY '"'
missing field VALUES are NULL)
LOCATION ('foo.csv')
);
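Querying it (or wrapping it in a view) then works like any read-only table; a minimal sketch:
SELECT c1, c2, c3 FROM adhoc_csv_ext;
CREATE VIEW adhoc_csv_v AS SELECT c1, c2, c3 FROM adhoc_csv_ext;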
When extracting data from a table (schema and data), I can do this by right-clicking on the database and going to Tasks -> Generate Scripts, and it gives me all the data from the table including the CREATE script, which is good.
This, though, gives me all the data from the table. Can this be changed to give me only some of the data, e.g. only rows after a certain dtmTimeStamp?
Thanks,
I would recommend extracting your data into a separate table using a query and then using Generate Scripts on this table. Alternatively, you can extract the data separately into a flat file using the Export Data wizard (include your column headers and use comma separators with double-quote field delimiters).
To make a copy of your table:
SELECT Col1, Col2
INTO CloneTable
FROM MyTable
WHERE Col3 = @Condition
(Thanks to @MarkD for adding that)
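Applied to the question's timestamp filter, a sketch (the extract table name and the cutoff value are just examples):
SELECT *
INTO MyTable_Extract
FROM MyTable
WHERE dtmTimeStamp > '2015-01-01';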
I am new to Hive. I have successfully set up a single-node Hadoop cluster for development purposes and, on top of it, installed Hive and Pig.
I created a dummy table in hive:
create table foo (id int, name string);
Now, I want to insert data into this table. Can I add data just like in SQL, one record at a time? Kindly help me with an analogous command to:
insert into foo (id, name) VALUES (12, "xyz");
Also, I have a csv file which contains data in the format:
1,name1
2,name2
..
..
..
1000,name1000
How can I load this data into the dummy table?
I think the best way is:
a) Copy data into HDFS (if it is not already there)
b) Create an external table over your CSV like this:
CREATE EXTERNAL TABLE TableName (id int, name string)
ROW FORMAT DELIMITED
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'place in HDFS';
c) You can start using TableName right away by issuing queries against it.
d) If you want to insert the data into another Hive table:
insert overwrite table finalTable select * from TableName;
There's no direct way to insert one record at a time from the terminal; however, here's an easy, straightforward workaround which I usually use when I want to test something:
Assume that t is a table with at least one record. It doesn't matter what the type or number of columns is.
INSERT INTO TABLE foo
SELECT '12', 'xyz'
FROM t
LIMIT 1;
Hive apparently supports INSERT...VALUES starting in Hive 0.14.
Please see the section 'Inserting into tables from SQL' at: https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML
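Assuming Hive 0.14 or later, a minimal sketch for the question's example row:
INSERT INTO TABLE foo VALUES (12, 'xyz');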
Whatever data you have in a text file or log file, you can put it on a path in HDFS and then write a query as follows in Hive:
hive> load data inpath '<<specify input path>>' into table <<tablename>>;
EXAMPLE:
hive> create table foo (id int, name string)
row format delimited
fields terminated by '\t'   -- or '|' or ','
stored as textfile;
Table created.
DATA INSERTION:
hive>load data inpath '/home/hive/foodata.log' into table foo;
To insert an ad-hoc value like (12, "xyz"), do this:
insert into table foo select * from (select 12, "xyz") a;
This is supported from Hive version 0.14.
INSERT INTO TABLE pd_temp(dept,make,cost,id,asmb_city,asmb_ct,retail) VALUES('production','thailand',10,99202,'northcarolina','usa',20)
These are limitations of Hive:
1. You cannot update data after it is inserted.
2. There is no "insert into table values ..." statement.
3. You can only load data using bulk loads.
4. There is no "delete from" command.
5. You can only do bulk deletes.
But if you still want to insert a record from the Hive console, you can do a SELECT from an existing table, as in the workaround shown above.
You may try this: I have developed a tool to generate Hive scripts from a CSV file. Following are a few examples of how the files are generated.
Tool -- https://sourceforge.net/projects/csvtohive/?source=directory
Select a CSV file using Browse and set the Hadoop root directory, e.g. /user/bigdataproject/
The tool generates a Hadoop script covering all CSV files; the following is a sample of the generated Hadoop script to load the CSVs into Hadoop:
#!/bin/bash -v
hadoop fs -put ./AllstarFull.csv /user/bigdataproject/AllstarFull.csv
hive -f ./AllstarFull.hive
hadoop fs -put ./Appearances.csv /user/bigdataproject/Appearances.csv
hive -f ./Appearances.hive
hadoop fs -put ./AwardsManagers.csv /user/bigdataproject/AwardsManagers.csv
hive -f ./AwardsManagers.hive
Sample of generated Hive scripts
CREATE DATABASE IF NOT EXISTS lahman;
USE lahman;
CREATE TABLE AllstarFull (playerID string,yearID string,gameNum string,gameID string,teamID string,lgID string,GP string,startingPos string) row format delimited fields terminated by ',' stored as textfile;
LOAD DATA INPATH '/user/bigdataproject/AllstarFull.csv' OVERWRITE INTO TABLE AllstarFull;
SELECT * FROM AllstarFull;
Thanks
Vijay
You can use the following lines of code to insert values into an already existing table. Here the table is db_name.table_name, which has two columns, and I am inserting 'ALL','Done' as a row into the table.
insert into table db_name.table_name
select 'ALL','Done';
Hope this was helpful.
The Hadoop file system does not support appending data to existing files. However, you can load your CSV file into HDFS and tell Hive to treat it as an external table.
Use this:
create table dummy_table_name as select * from source_table_name;
This will create a new table containing the data currently available in source_table_name.
LOAD DATA [LOCAL] INPATH '' [OVERWRITE] INTO TABLE <table_name>;
Use this command; it will load the data all at once. Just specify the file path: if the file is in the local filesystem, use LOCAL; if the file is already in HDFS, there is no need to use LOCAL.
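For example, with a hypothetical /home/user/foo.csv on the local disk and /data/foo.csv already in HDFS, this would look like:
LOAD DATA LOCAL INPATH '/home/user/foo.csv' INTO TABLE foo;   -- file on the local filesystem
LOAD DATA INPATH '/data/foo.csv' INTO TABLE foo;              -- file already in HDFS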