Sqoop: double quotes query - sql

I have a problem with the double quotes on this sqoop query:
select i.Number, i.Date,i.Station, i.lStation,
count(*) ax, “1- Pd” St , b.Type
from Leg jl, yLeg i, senger b,
where jl.LegID = i.LegID and jl.rID = b.erID and b.gID = b.ID
and b.tus not in (1,4) group by Number, Date, tion, b.Type
How can I fix this? Is there some escape parameter?

First, debug the query with the command below:
sqoop eval -libjars /var/lib/sqoop/ojdbc6.jar --connect jdbc:oracle:thin:@hostname:portnumber/servicename --username user --password password --query "select * from schemaname.tablename where rownum <= 10"
Put your own query in --query and check whether it produces the output you expect; the result is printed in the terminal itself.
If the query gives the results you expected, use the sqoop command below to import the table:
sqoop import -libjars /var/lib/sqoop/ojdbc6.jar --connect 'jdbc:oracle:thin:@hostname/service_name' --username user --password password -m 1 --hive-overwrite --hive-import --hive-database database_name --hive-table table_name --target-dir '/user/hive/warehouse/databasename.db/tablename' --query "select * from source_database.source_tablename WHERE 1=1 AND \$CONDITIONS"
The double-quote problem you are facing can be resolved with an escape character. Use WHERE 1=1 AND \$CONDITIONS as is, and paste your query before the WHERE in the sqoop command.
If you still get an error, please post it; you may need to add another escape character (a backslash) to escape the double quotes.

There are two parts to this question.
The first is: what is a valid query for your source database? Most databases have some kind of client or shell that lets you enter and execute queries. Your query should be valid as far as that shell or client is concerned.
The second part of your question is how do you take that query (as a String) and pass it to the database via sqoop. The answer to that lies in the way you're running sqoop.
If you're running sqoop via command line then you need to identify those characters (usually double quotes) that give your OS fits when embedded in a command line argument. Use a backslash before those characters to help the OS parse the command correctly. You usually have to put the entire query string inside unescaped double quotes so that the OS treats your query as a single string argument.
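To see what the escaping does, it can help to print the argument exactly as a program would receive it. The sketch below uses a hypothetical show_arg function as a stand-in for sqoop and a shortened version of the query from the question; it assumes a POSIX-style shell such as bash.

```shell
# Stand-in for sqoop: print the single argument exactly as the program receives it.
show_arg() { printf '%s\n' "$1"; }

# Inside double quotes, \" yields a literal double quote and \$ a literal dollar sign,
# so the embedded quotes and $CONDITIONS survive shell parsing.
show_arg "select count(*) ax, \"1- Pd\" St from Leg where \$CONDITIONS"
```

The outer, unescaped double quotes keep the whole query together as one argument, while the backslashes keep the shell from consuming the inner quotes or expanding $CONDITIONS.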
If you're running sqoop via Oozie then I strongly recommend you break the Sqoop command into arguments in the Sqoop action:
<arg>--query</arg>
<arg>select ... count(*) ax, “1- Pd” St , b.Type ... WHERE $CONDITIONS</arg>
That way you can generally paste your query as is into the action.
Of course, nothing is that simple. You still have to remember that the query is sitting inside an XML document, so any character that will mess up an XML parse becomes problematic. The only characters like that that I've encountered so far are the angle brackets, and I use property substitution (a bit of a kludge, I admit) to solve that problem:
In the Oozie workflow properties file I put:
lessThan=<
and I change my arg from
<arg>SELECT * from MyTable where $CONDITIONS AND (SOME_COL < 1000)</arg>
to
<arg>SELECT * from MyTable where $CONDITIONS AND (SOME_COL ${lessThan} 1000)</arg>
EDIT:
For those of you who don't like my kludge, you could try using a CDATA element to "escape" anything in the query (except, of course, ']]>'):
<arg><![CDATA[SELECT * from MyTable where $CONDITIONS AND (SOME_COL < 1000)]]></arg>
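A further option, not covered above, is ordinary XML entity escaping, where < is written as &lt; and & as &amp;. The sketch below is only an illustration: xml_escape is a hypothetical helper name, and it assumes the query is prepared in a shell before being pasted into the workflow file.

```shell
# Hypothetical helper: escape the characters that are special in XML text content.
# The ampersand must be handled first so the entities added later are not re-escaped.
xml_escape() {
  sed -e 's/&/\&amp;/g' -e 's/</\&lt;/g' -e 's/>/\&gt;/g'
}

printf '%s\n' 'SELECT * from MyTable where $CONDITIONS AND (SOME_COL < 1000)' | xml_escape
```

The escaped text can then go between the arg tags without a CDATA section; the XML parser turns the entities back into the original characters.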


How to use "%" character in sql query on linux shell?

I am trying to list all the JDK packages installed on a set of hosts by sending a SQL SELECT statement to osquery from a Linux shell via pssh.
Here is the query:
pssh -h myhosts -i 'echo "SELECT name FROM rpm_packages where name like '%jdk%';"| osqueryi --json'
but using "%" gives me the error below.
Error: near line 1: near "%": syntax error
I tried to escape the %, but the error remains the same. Any ideas how to overcome this error?
You aren't getting this error from your shell but from the query parser, and it's not actually caused by the % character, but by the ' that immediately precedes it. Look at where you have quotes:
'echo "SELECT name FROM rpm_packages where name like '%jdk%';"| osqueryi --json'
^----------------------------------------------------^     ^-------------------^
These quotes are consumed by the shell when it parses the argument. Single quotes tell the shell to ignore any otherwise-special characters inside and treat what is within the quotes as part of the argument -- but not the quotes themselves.
After shell parsing finishes, the actual, verbatim argument that gets sent to pssh looks like this:
echo "SELECT name FROM rpm_packages where name like %jdk%;"| osqueryi --json
Note that all of the single quotes have been erased. The result is that your query tool sees the % (presumably modulus) operator in a place that it doesn't expect -- right after another operator (like) which makes about as much sense to the parser as name like * jdk. The parser doesn't understand what it means to have two consecutive binary operators, so it complains about the second one: %.
In order to get a literal ' there, you need to jump through this hoop:
'\''
^^^^- start quoting again
|||
|\+-- literal '
|
\---- stop quoting
So, to fix this, replace all ' instances inside the string with '\'':
pssh -h myhosts -i 'echo "SELECT name FROM rpm_packages where name like '\''%jdk%'\'';"| osqueryi --json'
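The difference is easy to check locally before involving pssh. In this sketch, show_arg is a hypothetical stand-in for any command that receives the argument; it only prints what the shell hands over.

```shell
# Stand-in for the receiving command: print the argument as it arrives after shell parsing.
show_arg() { printf '%s\n' "$1"; }

# Naive nesting: the inner single quotes end and restart quoting, so they disappear.
show_arg 'name like '%jdk%';'

# With the '\'' idiom the single quotes arrive as literal characters.
show_arg 'name like '\''%jdk%'\'';'
```

The first call prints the fragment without any quotes around %jdk%, which is what the query parser was complaining about; the second prints it with the quotes intact.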
osqueryi accepts a single statement on the command line. Eliminating the echo can make quoting a bit simpler:
osqueryi --json "SELECT * FROM users where username like '%jdk%'"
You will, however, need the quotes to pass through your pssh command line.
While osqueryi is great for short simple things, if you're building a frequent polling service, osqueryd with scheduled queries is generally simpler.

SQL Server: query for exporting to file

I'm trying to learn the basics of SQL programming, and I am working with SQL Server 2014. I have managed to import a file into a table with the command:
BULK INSERT Db.dbo.Co2_table
FROM 'd:\dataset_co2.txt'
with
(
FIRSTROW =2,
ROWTERMINATOR ='\n'
)
GO
I would like to do the reverse operation, that is, export the content of a table to a file. I have tried:
SELECT *
INTO OUTFILE 'C:\datadump\sqldbdump.txt"
FROM dbo.alarms_2_2014
FIELDS TERMINATED BY ','
LINES TERMINATED BY '\n';
bcp Db.dbo.Co2_table out "C:\users\ws5.en-cre\desktop\prova.txt" -T –c
sqlcmd -S . -d Db -E -s, -W -Q "SELECT * FROM dbo.Co2_table" > ExcelTest.csv
But none of these seem to work (I get error messages). Any idea?
I suspect you are running those commands from Management Studio. bcp is a command-line utility, so you should run it from a console instead. The following works for me. Also check that you have permissions on that folder.
bcp "select * from Db.dbo.Co2_table" queryout C:\users\ws5.en-cre\desktop\prova.txt -c -T
or
bcp Db.dbo.Co2_table out C:\users\ws5.en-cre\desktop\prova.txt -c -T
Also, the c parameter in -T –c contains a suspicious symbol: it is an en dash (–), not a regular hyphen (-).
Thank you for your answers and suggestions, and apologies for my lack of precision and my late reply (in this case I missed the notifications from stackoverflow).
Regarding the question of whether I use Management Studio or the console: I click on “new query” in Management Studio, write the code and press execute. So I guess the answer is that I use Management Studio.
If I try:
bcp "select * from Db.dbo.Co2_table" queryout C:\users\ws5.en-cre\desktop\prova.txt -c –T
it says
Msg 102, Level 15, State 1, Line 1 Incorrect syntax near 'queryout'.
I guess in this case one of the problems is that the quotes are missing, but even adding them doesn’t solve the problem.
I am looking for a solution that can be implemented as a script. I am familiar with excel vba macros, I would like to implement something like that.
Thanks,
Alex

Why does database query using sqlcmd produce just empty file while same query works in Management Studio?

I recently wrote a huge MSSQL script that worked until the server environment changed and some queries were no longer allowed. So I had to create a .bat file that executes the query from the command line.
I get no error or any other message. I just get an empty file. But I get a lot of rows if I run the same query in Management Studio.
Does anybody see the mistake in my command line?
I inserted some line breaks to make the code easier to read. Everything except the PAUSE command is on one line in the batch file.
EDIT: I figured out that the problem is the LIKE operator in the last WHERE clause. If I take out the LIKE operator, it works. It has nothing to do with the % character; it really is the LIKE operator. Does anybody know how to fix that?
sqlcmd -S connection\string -U user -P password -d dbName -s";" -Q "SET NOCOUNT ON;
SELECT [per_nummer] as EmployeeID,
[per_id] as System_nr,
[per_pid] as PID,
[per_anrede] as Gender,
[per_vname] as Vorname,
[per_name] as Name,
[per_telEx] as Telefon,
[per_email] as Email,
[per_instradierungHauptort] as Instradierung,
[per_gebnr] as Gebaeudenummer,
(SELECT mobileTelephoneNumber FROM [dbName].[dbo].[import_zuko_GLDAP_DUMP] WHERE UserID = per_pid ) as Mobile,
[per_business_area] as Business_Area,
(SELECT csgdivision FROM [dbName].[dbo].[import_zuko_GLDAP_DUMP] WHERE UserID = per_pid ) as Division,
NULL as Bereich,
(SELECT csgCompany FROM [dbName].[dbo].[import_zuko_GLDAP_DUMP] WHERE UserID = per_pid ) as Firma,
[per_sprache] as Korespondenzsprache,
(SELECT roomnumber FROM [dbName].[dbo].[import_zuko_GLDAP_DUMP] WHERE UserID = per_pid ) as Bueronummer,
[per_floor] as Etage,
(SELECT convert(varchar, convert(date,[per_eintrittsdatum]), 104)) as Eintritt_per,
(SELECT convert(varchar, convert(date,[per_austrittsdatum]), 104)) as Austritt_per,
CONVERT(VARCHAR(10), GETDATE(), 104) AS letzte_mutation,
per_lm as Linemanager,
(SELECT TOP 1 per_pid FROM [dbName].[dbo].[person] WHERE per_nummer = master_table.per_lm) AS lm_pid,
(SELECT convert(varchar, convert(date,[per_lm_von]), 104)) as lm_von,
(SELECT convert(varchar, convert(date,[per_lm_bis]), 104)) as lm_bis,
NULL FROM [dbName].[dbo].[person] as master_table
WHERE
[per_nummer] IS NOT NULL AND per_pidStatus = 'A' AND
([per_pid] LIKE('A%')OR [per_pid] LIKE('F%')OR [per_pid] LIKE('W%'))"
-w 1000 -W -o "\\servername\G$\path\to\file.csv"
PAUSE
Different default options
Have you carefully read Microsoft's sqlcmd Utility documentation?
Near the top it contains this important note:
Because different default options may apply, you might see different behavior when you execute the same query in SQL Server Management Studio in SQLCMD Mode and in the sqlcmd utility.
Therefore the empty file could be caused by different default options.
Space between option and value of option
The documentation also states near the top:
Currently, sqlcmd does not require a space between the command line option and the value. However, in a future release, a space may be required between the command line option and the value.
Double quotes inside an argument string are often problematic, see answer on Why double quotes should be always only at beginning and end of an argument string?
Therefore I suggest following Microsoft's advice and changing -s";" to -s ";" with a space character after -s, as is already done for all other options.
Escaping percentage sign
Have you tried escaping each % in the query string with an additional %, that is, using %% in all three places in the query string?
The command interpreter cmd.exe can interpret the string between two % signs as a reference to an environment variable and, as no such environment variable exists, remove everything between the two % signs from the command line.
Update: It turned out that the unescaped percent signs did indeed result in a wrong query string and therefore in an empty results file.
Command with path and extension
General hint:
In batch files it is always advisable to specify commands like sqlcmd with their full name, that is, with path and file extension, as this makes the execution of the application independent of the values of the environment variables PATH and PATHEXT.

Generate a Properties File using Shell Script and Results from a SQL Query

I am trying to create a properties file like this...
firstname=Jon
lastname=Snow
occupation=Nights_Watch
family=Stark
...from a query like this...
SELECT
a.fname as firstname,
a.lname as lastname,
b.occ as occupation...
FROM
names a,
occupation b,
family c...
WHERE...
How can I do this? I am only aware of spooling to a CSV file, which won't work here.
These property files will be picked up by shell scripts to run automated tasks. I am using Oracle DB.
Perhaps something like this?
psql -c 'select id, name from test where id = 1' -x -t -A -F = dbname -U dbuser
Output would be like:
id=1
name=test1
(For the full list of options: man psql.)
Since you mentioned spool, I will assume you are running on Oracle. This should produce a result in the desired format that you can spool straight away.
SELECT
'firstname=' || firstname || CHR(10) ||
'lastname=' || lastname || CHR(10) -- and so on for all fields
FROM your_tables;
The same approach should be possible with all database engines, if you know the correct incantation for a literal newline and the syntax for string concatenation.
It is possible to do this from your command line SQL client, but as STTLCU notes it might be better to get the query to output in something "standard" (like CSV) and then transform the results with a shell script. Otherwise, because a lot of the features you would use are not part of any SQL standard, they would depend on the database server and client application. Think of this step as sort of the obverse of ETL, where you clean up the data you "unload" so that it is useful for some other application.
There are certainly ways to build this into your query application: e.g. if you use something like perl DBI::Shell as your client (which allows you to connect to many different servers using the DBI module) you can jazz up your output in various ways. But here you'd probably be best off if you could send the query output to a text file and run it through awk.
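As a rough sketch of that awk step: the function below turns a CSV header line plus data rows into key=value lines. The name csv_to_properties is made up for illustration, and the sketch assumes the values contain no commas, quotes, or embedded newlines.

```shell
# Hypothetical helper: convert "header line + data rows" CSV on stdin into key=value lines.
# Assumes plain comma-separated values without quoting or embedded commas.
csv_to_properties() {
  awk -F',' '
    NR == 1 { for (i = 1; i <= NF; i++) key[i] = $i; next }   # remember the column names
            { for (i = 1; i <= NF; i++) print key[i] "=" $i } # emit one pair per field
  '
}

printf 'firstname,lastname,occupation,family\nJon,Snow,Nights_Watch,Stark\n' | csv_to_properties
```

In practice the printf would be replaced by the spooled CSV file, for example csv_to_properties < spooled.csv > person.properties, where both file names are placeholders.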
Having said that ... here's how the Postgresql client could do what you want. Notice how the commands to set up the formatting are not SQL but specific to the client.
~/% psql -h 192.168.2.69 -d cropdusting -U stubblejumper
psql (9.2.4, server 8.4.14)
WARNING: psql version 9.2, server version 8.4.
Some psql features might not work.
You are now connected to database "cropdusting" as user "stubblejumper".
cropdusting=# \pset border 0 \pset format unaligned \pset t \pset fieldsep =
Border style is 0.
Output format is unaligned.
Showing only tuples.
Field separator is "=".
cropdusting=# select year,wmean_yld from bckwht where year=1997 AND freq > 13 ;
1997=19.9761904762
1997=14.5533333333
1997=17.9942857143
cropdusting=#
With the psql client the \pset command sets options affecting the output of query results tables. You can probably figure out which option is doing what. If you want to do this using your SQL client tell us which one it is or read through the manual page for tips on how to format the output of your queries.
My answer is very similar to the two already posted for this question, but I try to explain the options, and try to provide a precise answer.
When using Postgres, you can use the psql command-line utility to get the intended output:
psql -F = -A -x -X <other options> -c 'select a.fname as firstname, a.lname as lastname from names as a ... ;'
The options are:
-F : Use '=' sign as the field separator, instead of the default pipe '|'
-A : Do not align the output; so there is no space between the column header, separator and the column value.
-x : Use expanded output, so column headers are on left (instead of top) and row values are on right.
-X : Do not read $HOME/.psqlrc, as it may contain commands/options that can affect your output.
-c : The SQL command to execute
<other options> : Any other options, such as connection details, database name, etc.
You have to choose whether you want to maintain such a file from the shell or from PL/SQL. Both solutions are possible and both are correct.
Because Oracle has to read from and write to the file, I would do it from the database side.
You can write data to a file using the UTL_FILE package.
DECLARE
fileHandler UTL_FILE.FILE_TYPE;
BEGIN
fileHandler := UTL_FILE.FOPEN('test_dir', 'test_file.txt', 'W');
UTL_FILE.PUTF(fileHandler, 'firstname=Jon\n');
UTL_FILE.PUTF(fileHandler, 'lastname=Snow\n');
UTL_FILE.PUTF(fileHandler, 'occupation=Nights_Watch\n');
UTL_FILE.PUTF(fileHandler, 'family=Stark\n');
UTL_FILE.FCLOSE(fileHandler);
EXCEPTION
WHEN utl_file.invalid_path THEN
raise_application_error(-20000, 'ERROR: Invalid PATH FOR file.');
END;
Example's source: http://psoug.org/snippet/Oracle-PL-SQL-UTL_FILE-file-write-to-file-example_538.htm
At the same time, you can read from the file using an Oracle external table.
CREATE TABLE parameters_table
(
parameters_coupled VARCHAR2(4000)
)
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY test_dir
ACCESS PARAMETERS
(
RECORDS DELIMITED BY NEWLINE
FIELDS
(
parameters_coupled VARCHAR2(4000)
)
)
LOCATION ('test_file.txt')
);
At this point you have a file that is exposed as a table with a single column holding each coupled parameter and value, e.g. 'firstname=Jon'.
Oracle can read it, and so can any shell script, because it is plain text.
Then it is just a matter of a query, e.g.:
SELECT MAX(CASE WHEN INSTR(parameters_coupled, 'firstname=') = 1 THEN REPLACE(parameters_coupled, 'firstname=') ELSE NULL END) AS firstname
, MAX(CASE WHEN INSTR(parameters_coupled, 'lastname=') = 1 THEN REPLACE(parameters_coupled, 'lastname=') ELSE NULL END) AS lastname
, MAX(CASE WHEN INSTR(parameters_coupled, 'occupation=') = 1 THEN REPLACE(parameters_coupled, 'occupation=') ELSE NULL END) AS occupation
FROM parameters_table;

Execute SQL from file in bash

I'm trying to load a sql from a file in bash and execute the loaded sql. The sql file needs to be versatile, meaning it cannot be altered in order to make things easy while being run in bash (escaping special characters like * )
So I have run into some problems:
If I read my sample.sql
SELECT * FROM SAMPLETABLE
to a variable with
ab=`cat sample.sql`
and execute it
db2 `echo $ab`
I receive an sql error because by doing a cat the * has been replaced by all the files in the directory of sample.sql.
An easy solution would be to replace "*" with "\*". But I cannot do this, because the file needs to stay executable in programs like DB Visualizer etc.
Could someone give me a hint in the right direction?
The DB2 command line processor has options that accept a filename as input, so you shouldn't need to load statements from a text file into a shell variable.
This command will execute all SQL statements in the file, with newline treated as the statement terminator:
db2 -f sample.sql
This command will execute all SQL statements in the file, with semicolon treated as the statement terminator:
db2 -t -f sample.sql
Other useful CLP flags are:
-x : Suppress the column headings
-v : Echo the statement text immediately before execution
-z : Tee a copy of all CLP output to the filename immediately following this flag
Redirect stdin from the file.
db2 < sample.sql
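If the statement does have to travel through a shell variable, quoting the expansion is what keeps the * literal. The sketch below reproduces the effect from the question in a scratch directory; the file names a.txt and b.txt are made up, and no DB2 installation is needed to run it.

```shell
# Work in a scratch directory so the glob has something to match.
workdir=$(mktemp -d) || exit 1
cd "$workdir" || exit 1
touch a.txt b.txt
printf 'SELECT * FROM SAMPLETABLE\n' > sample.sql

ab=$(cat sample.sql)

# Unquoted expansion: the shell replaces * with the file names in the current directory.
unquoted=$(echo $ab)
# Quoted expansion: the statement stays exactly as it is in the file.
quoted=$(printf '%s\n' "$ab")

printf 'unquoted: %s\nquoted:   %s\n' "$unquoted" "$quoted"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

By the same logic, passing the variable as db2 "$ab" rather than through an unquoted echo should hand the statement to DB2 unchanged, though the file-based options above avoid the problem altogether.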
If your script uses variables that the shell should substitute before the statements reach DB2, use this approach:
Contents of File.sql:
cat <<xEOF
insert into ${MY_SCHEMA}.${MY_TABLE} values(1,2);
select * from ${MY_SCHEMA}.${MY_TABLE};
xEOF
At the command prompt, run:
export MY_SCHEMA='STAR'
export MY_TABLE='DIMENSION'
Then you can execute it in DB2:
sh File.sql | db2 +p -t
The shell replaces the exported variables and then DB2 executes the result.
Hope it helps.
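For anyone unsure how the here-document behaves, this small sketch contrasts an unquoted delimiter, which lets the shell substitute variables, with a quoted one, which passes the text through untouched. The function names render_expanded and render_literal are made up for illustration.

```shell
MY_SCHEMA='STAR'
MY_TABLE='DIMENSION'

# Unquoted delimiter: the shell substitutes ${...} inside the here-document.
render_expanded() {
cat <<xEOF
select * from ${MY_SCHEMA}.${MY_TABLE};
xEOF
}

# Quoted delimiter: the text is passed through literally, with no substitution.
render_literal() {
cat <<'xEOF'
select * from ${MY_SCHEMA}.${MY_TABLE};
xEOF
}

render_expanded
render_literal
```

Only the first form gives DB2 real schema and table names, so the delimiter in File.sql has to stay unquoted.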