Sqoop import from Oracle to HDFS: No more data to read from socket

I'm trying to import data from Oracle to HDFS using Sqoop. Oracle version: 10.2.0.2
The table has no constraints. When I specify the number of mappers (-m) and the --split-by parameter, the job fails with the error: No more data to read from socket. If I set -m 1 (a single mapper), it runs, but it takes too long.
Sqoop command:
sqoop import --connect jdbc:oracle:thin:@host:port:SID --username uname --password pwd --table abc.market_price --target-dir /ert/etldev/etl/market_price -m 4 --split-by MNTH_YR
Please help me.

Instead of specifying the number of mappers, why don't you try using --direct?
What does it show then?
sqoop import --connect jdbc:oracle:thin:@host:port:SID --username uname --password pwd --table abc.market_price --target-dir /ert/etldev/etl/market_price --direct
or
sqoop import --connect jdbc:oracle:thin:@host:port:SID --username uname --password pwd --table abc.market_price --target-dir /ert/etldev/etl/market_price --split-by MNTH_YR --direct
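If --direct is not available with your driver, another thing worth trying (a sketch only, reusing the placeholders from the question and assuming MNTH_YR is a column Sqoop can split on) is to keep the parallel mappers but hand Sqoop the split boundaries explicitly with --boundary-query, so it does not have to compute them itself:
sqoop import --connect jdbc:oracle:thin:@host:port:SID --username uname --password pwd --table abc.market_price --target-dir /ert/etldev/etl/market_price --num-mappers 4 --split-by MNTH_YR --boundary-query "SELECT MIN(MNTH_YR), MAX(MNTH_YR) FROM abc.market_price"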

Related

How to import-all-tables from MySQL to Hive using Sqoop for a particular database in Hive?

Sqoop import-all-tables into Hive with the default database works fine, but import-all-tables into a specified Hive database is not working.
As --hive-database is deprecated, how do I specify the database name?
sqoop import-all-tables \
--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
--username root \
--password XXX \
--hive-import \
--create-hive-table
The above command creates the tables in /user/hive/warehouse/, i.e. the default directory.
How do I import all tables into /user/hive/warehouse/retail.db/?
You can set the HDFS path of your database using the --warehouse-dir option.
The following example worked for me:
sqoop import-all-tables \
--connect jdbc:mysql://localhost:3306/retail_db \
--username user \
--password password \
--warehouse-dir /apps/hive/warehouse/lina_test.db \
--autoreset-to-one-mapper
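To check where the tables ended up, you can simply list the warehouse path afterwards, e.g.:
hdfs dfs -ls /apps/hive/warehouse/lina_test.db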

Import Data from Postgresql to Hive

I am facing issues while importing a table from PostgreSQL to Hive. The command I am using is:
sqoop import \
--connect jdbc:postgresql://IP:5432/PROD_DB \
--username ABC_Read \
--password ABC#123 \
--table vw_abc_cust_aua \
-- --schema ABC_VIEW \
--target-dir /tmp/hive/raw/test_trade \
--fields-terminated-by "\001" \
--hive-import \
--hive-table vw_abc_cust_aua \
--m 1
The error I am getting:
ERROR tool.ImportTool: Error during import: No primary key could be found for table vw_abc_cust_aua. Please specify one with --split-by or perform a sequential import with '-m 1'.
Please let me know what is wrong with my command.
I am assuming -- --schema ABC_VIEW is a typo; it should be --schema ABC_VIEW.
The other issue is that the option for the number of mappers is either -m or --num-mappers, not --m.
Solution
In your script, change --m to -m or --num-mappers.
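For reference, the corrected command might look like this (a sketch only: it applies the -m fix above and, as an assumption, keeps the connector-specific --schema argument after the -- separator at the end, which is how Sqoop passes extra arguments to the connection manager):
sqoop import \
--connect jdbc:postgresql://IP:5432/PROD_DB \
--username ABC_Read \
--password ABC#123 \
--table vw_abc_cust_aua \
--target-dir /tmp/hive/raw/test_trade \
--fields-terminated-by "\001" \
--hive-import \
--hive-table vw_abc_cust_aua \
-m 1 \
-- --schema ABC_VIEW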

Sqoop import of all tables from MySQL to Hive

I am trying to import all tables from a MySQL schema to Hive using the below Sqoop command:
sqoop import-all-tables --connect jdbc:mysql://ip-172-31-20-247:3306/retail_db --username sqoopuser -P --hive-import --create-hive-table -m 3
It fails with:
18/09/01 09:24:52 ERROR tool.ImportAllTablesTool: Encountered IOException running import job: org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
hdfs://ip-172-31-35-141.ec2.internal:8020/user/kumarrupesh2389619/categories already exists
Your command is failing because the output directory already exists. Remove it with the command below:
hdfs dfs -rmr /user/kumarrupesh2389619/categories
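On newer Hadoop releases -rmr is deprecated, so the equivalent form is:
hdfs dfs -rm -r /user/kumarrupesh2389619/categories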

Sqoop command: how to give a schema name for import-all-tables

I am importing all tables from an RDBMS to Hive using Sqoop (v1.4.6). Below is the command:
sqoop-import-all-tables --verbose --connect jdbcconnection --username user --password pass --hive-import -m 1
This command works fine and loads all the tables into the default schema. Is there a way to load the tables into a particular schema?
Use --hive-database <db name> in your import command.
Modified command:
sqoop-import-all-tables --verbose --connect jdbcconnection --username user --password pass --hive-import --hive-database new_db -m 1
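Once the import finishes, you can verify that the tables were created in the intended database, for example:
hive -e "USE new_db; SHOW TABLES;"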

Sqoop import from MySQL to Hive is failing

I am trying to load a table from MySQL to Hive in Parquet format using --hive-import; we want to do an incremental update of the Hive table. When we try the command below, it fails with the error shown. Can anybody please help here?
sqoop job --create users_test_hive -- import --connect 'jdbc:mysql://dbhost/dbname?characterEncoding=utf8&dontTrackOpenResources=true&defaultFetchSize=1000&useCursorFetch=true&useUnicode=yes&characterEncoding=utf8' --table users --incremental lastmodified --check-column n_last_updated --username username --password password --merge-key user_id --mysql-delimiters --as-parquetfile --hive-import --warehouse-dir /usr/hive/warehouse/ --hive-table users_test_hive
Error while running it.
16/02/27 21:33:17 INFO mapreduce.Job: Task Id : attempt_1454936520418_0239_m_000000_1, Status : FAILED
Error: parquet.column.ParquetProperties.newColumnWriteStore(Lparquet/schema/MessageType;Lparquet/column/page/PageWriteStore;I)Lparquet/column/ColumnWriteStore;