Question:
My code works until it reaches the last line, then it throws a syntax error.
Error:
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL0104N An unexpected token "temp_dept" was found following "(case when ".
Expected tokens may include: "JOIN". SQLSTATE=42601
I am trying to do this:
After each insert on RD_EMP
for each row
insert into RD_Supervisor
check the cases: if temp_dept.RD_E_ID <= 0 then set RD_Supervisor.RD_E_SUP to 0, and so on for the other ranges
Code:
create trigger RD_total_dep_emp \
after insert on RD_Emp \
referencing new table as temp_dept
for each statement \
insert into RD_Supervisor(RD_E_SUP, RD_E_EMP, RD_QUOT) \
select temp_dept.RD_E_ID,
(case \
when temp_dept.RD_E_ID <= 0 then 0 \
when temp_dept.RD_E_ID > 0 AND temp_dept.RD_E_ID <= 15 then 15 \
when temp_dept.RD_E_ID > 15 AND temp_dept.RD_E_ID <= 25 then 25 \
when temp_dept.RD_E_ID > 25 AND temp_dept.RD_E_ID <= 35 then 35 \
when temp_dept.RD_E_ID > 35 then 100 \
end) as RD_E_SUP \
from temp_dept
You have three columns in the insert, but only two in the select -- and they appear to be in the wrong order. The following is probably more your intention:
create trigger RD_total_dep_emp \
after insert on RD_Emp \
referencing new table as temp_dept \
for each statement \
insert into RD_Supervisor(RD_E_EMP, RD_E_SUP) \
select temp_dept.RD_E_ID, \
(case \
when temp_dept.RD_E_ID <= 0 then 0 \
when temp_dept.RD_E_ID > 0 AND temp_dept.RD_E_ID <= 15 then 15 \
when temp_dept.RD_E_ID > 15 AND temp_dept.RD_E_ID <= 25 then 25 \
when temp_dept.RD_E_ID > 25 AND temp_dept.RD_E_ID <= 35 then 35 \
when temp_dept.RD_E_ID > 35 then 100 \
end) as RD_E_SUP \
from temp_dept
If there is a value you want to set for RD_QUOT, then you can specify that as well -- both in the insert and the select.
You have an opening parenthesis before CASE, but you don't have a closing parenthesis after END.
https://xkcd.com/859/
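As a side note on the CLP itself: if the backslash line continuation keeps getting in the way, one common alternative (just a sketch; create_trigger.sql is a hypothetical file name, and @ is whatever statement terminator you choose) is to run the statement from a file:
# Run the CREATE TRIGGER statement from a file, using @ as the statement
# terminator, so no backslash continuation characters are needed.
# create_trigger.sql is a hypothetical file name.
db2 -td@ -vf create_trigger.sql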
I have a table containing over 50 columns (both numeric and char). Is there a way to get the overall statistics without specifying each column?
As an example:
a b c d
1 2 3 4
5 6 7 8
9 10 11 12
Ideally I would have something like:
column_name min avg max sum
a 1 5 9 15
b 2 6 10 18
c 3 7 11 21
d 4 8 12 24
Nevertheless, even getting one aggregate at a time would be more than helpful.
Any help/idea would be highly appreciated.
Thank you,
O
You can parse the DESCRIBE TABLE output using AWK and generate a comma-separated string of SUM(col) as sum_col for the numeric columns, plus a column list for all other columns.
In this example it generates a select statement with a group by. Run it in the shell:
TABLE_NAME=your_schema.your_table
NUMERIC_COLUMNS=$(hive -S -e "set hive.cli.print.header=false; describe ${TABLE_NAME};" | awk -F " " 'f&&!NF{exit}{f=1}f{ if($2=="int"||$2=="double") printf c "sum("toupper($1)") as sum_"$1}{c=","}')
GROUP_BY_COLUMNS=$(hive -S -e "set hive.cli.print.header=false; describe ${TABLE_NAME};" | awk -F " " 'f&&!NF{exit}{f=1}f{if($2!="int"&&$2!="double") printf c toupper($1)}{c=","}')
SELECT_STATEMENT="select $NUMERIC_COLUMNS $GROUP_BY_COLUMNS from $TABLE_NAME group by $GROUP_BY_COLUMNS"
I'm checking only int and double columns; you can add more types. You can also optimize it and execute DESCRIBE only once, then parse the result with the same AWK scripts. Hope you get the idea.
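For example, a minimal sketch of that single-DESCRIBE variant (it assumes the hive CLI is on the PATH; your_schema.your_table is a placeholder, and the extra numeric types are just illustrative) could look like this:
# Hypothetical table name -- replace with your own.
TABLE_NAME=your_schema.your_table

# Run DESCRIBE only once and reuse its output.
DESC=$(hive -S -e "set hive.cli.print.header=false; describe ${TABLE_NAME};")

# Build the SUM(...) list for numeric columns and the plain column list for the
# rest; the comma is emitted only after the first match, so neither list starts
# with a stray ",". The first pattern stops at the blank line before the
# partition section, as in the original scripts.
NUMERIC_COLUMNS=$(echo "$DESC" | awk 'f&&!NF{exit}{f=1} $2=="int"||$2=="bigint"||$2=="float"||$2=="double"{printf "%s sum(%s) as sum_%s", c, toupper($1), $1; c=","}')
GROUP_BY_COLUMNS=$(echo "$DESC" | awk 'f&&!NF{exit}{f=1} $2!="int"&&$2!="bigint"&&$2!="float"&&$2!="double"{printf "%s %s", c, toupper($1); c=","}')

# Assemble and run the query (assumes the table has at least one numeric and
# one non-numeric column).
SELECT_STATEMENT="select $NUMERIC_COLUMNS, $GROUP_BY_COLUMNS from $TABLE_NAME group by $GROUP_BY_COLUMNS"
hive -S -e "$SELECT_STATEMENT"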
I have a file with the following format:
TRINITY_DN119001_c0_g1_i1 4 * 0 0 * * 0 0 GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT * XO:Z:NM
TRINITY_DN119037_c0_g1_i1 4 * 0 0 * * 0 0 TAAGATTAGGTTGTATTCCAG * XO:Z:NM
TRINITY_DN119099_c0_g1_i1 4 * 0 0 * * 0 0 AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG * XO:Z:NM
I am trying to extract the 1st and 10th columns and store them in the following format (output file):
>TRINITY_DN119099_c0_g1_i1
GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT
>TRINITY_DN119037_c0_g1_i1
TAAGATTAGGTTGTATTCCAG
>TRINITY_DN119001_c0_g1_i1
AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG
I am using the following code for now:
cut -d " " -f1,10 in.txt > out.txt
sed 's/^/>/' out.txt
but I am unable to work out how to get the above output.
You may use awk:
awk '{printf ">%s\n%s\n", $1, $10}' file
>TRINITY_DN119001_c0_g1_i1
GAGCCTCCCTCATGAATGTACCAGCATTTACCTCATAAAGAGCT
>TRINITY_DN119037_c0_g1_i1
TAAGATTAGGTTGTATTCCAG
>TRINITY_DN119099_c0_g1_i1
AGGCAGGCGCTAAACGATTTGCATTTCTCTAATGATTACGCCAG
However, note that it is the 1st and 10th columns in your shown output, not the 9th.
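If the columns are actually tab-separated (an assumption about your file; SAM-style output usually is), the same one-liner just needs an explicit field separator:
# Same extraction, but with tab as the field separator.
awk -F'\t' '{printf ">%s\n%s\n", $1, $10}' file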
If your data is in file 'd', try GNU sed:
sed -E 's/^(TRINITY_DN\S+).*\s([ACGT]+).*/\1\n\2/' d
I am trying to import data into a Hive table using a Sqoop command. The Hive table is partitioned by date2, and the date is in the format "9/6/2017 00:00:00". It throws an error when I use the Sqoop command to import data using the date column.
Teradata table :
column1, date2, column3
1,9/6/2017 00:00:00, qwe
2,9/20/2017 00:00:00, wer
Sqoop command:
sqoop import \
--connect jdbc:teradata://<server>/database=<db_name> \
--connection-manager org.apache.sqoop.teradata.TeradataConnManager \
--username un \
--password 'pwd' \
--table <tbl_name> \
--where "cast(date2 as Date) > date '2017-09-07' and cast(date2 as Date) < date '2017-09-20'" \
--hive-import --hive-table <db_name>.<tbl_name> \
--hive-partition-key date2 \
-m1
Error
ERROR teradata.TeradataSqoopImportHelper: Exception running Teradata import job
java.lang.IllegalArgumentException:Wrong FS: /usr/tarun/date2=1900-01-01 00%3A00%3A00
When I tried translating your command to multiline, it looks like you have missed one \ character, and that is why it is complaining: --hive-import is not ending with "\". The hive table name is also missing from the command.
sqoop import \
--connect jdbc:teradata://<server>/database=<db_name> \
--connection-manager org.apache.sqoop.teradata.TeradataConnManager \
--username un \
--password 'pwd' \
--table <tbl_name> \
--where "cast(date2 as Date) > date '2017-09-07' and cast(date2 as Date) < date '2017-09-20'" \
--hive-import \
--hive-table tarun121 \
--hive-partition-key date2 \
-m1
An alternative to this is to try the create-hive-table command:
sqoop create-hive-table \
--connect jdbc:teradata://localhost:port/schema \
--table hive_tble_name \
--fields-terminated-by ',';
Let me know if this solves the issue.
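As a further check (a sketch only; it assumes sqoop eval accepts the same Teradata connection settings, and <server>, <db_name>, <tbl_name> are the same placeholders as above), you can confirm that the --where predicate is valid on the Teradata side before running the full import:
# Sanity-check the date predicate against Teradata before importing.
sqoop eval \
--connect jdbc:teradata://<server>/database=<db_name> \
--username un \
--password 'pwd' \
--query "select count(*) from <tbl_name> where cast(date2 as Date) > date '2017-09-07' and cast(date2 as Date) < date '2017-09-20'"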
mysqldumpslow -s c -t 15 -v /tmp/my-slow.log >> /tmp/file_$(date +'%d_%m_%Y_%H_%M_%S').log
Reading mysql slow query log from /tmp/my-slow.log
Died at /usr/bin/mysqldumpslow line 162, <> chunk 18.
Try to reduce your "top entries": try 10 or 5 instead of 15; maybe there are not enough entries for a top-15 list.
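For example, the same report limited to a top-5 list:
# Same report, limited to the top 5 entries sorted by count.
mysqldumpslow -s c -t 5 -v /tmp/my-slow.log >> /tmp/file_$(date +'%d_%m_%Y_%H_%M_%S').log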
What's the easiest way to split a file and add a header to each section?
The unix split command does everything that I need minus being able to add a header.
Any easy way to do it with existing tools before I script it up?
It is probably easiest to do this in either awk or perl. If you aren't processing much data, then using a simple shell script to post-process the output of split is probably fine. However, this will traverse the input more than once which can be a bottleneck if you are doing this for any sort of online data processing task. Something like the following should work:
bash$ cat input-file | awk '
BEGIN {
    # start with output file "1" and write its header
    fnum = 1
    print "HEADER" > fnum
}
{
    if ((NR % 10) == 0) {
        # every 10 input lines, close the current chunk and start
        # the next numbered file with its own header
        close(fnum)
        fnum++
        print "HEADER" > fnum
    }
    # append the current input line to the current chunk
    print >> fnum
}
'
bash$ wc -l input-file
239 input-file
bash$ ls
1 19 6
10 2 7
11 20 8
12 21 9
13 22 input-file
14 23
15 24
16 3
17 4
18 5
bash$
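If GNU coreutils is available (an assumption; the --filter option is GNU-specific), split itself can prepend the header, which avoids the post-processing pass entirely:
# Split into 10-line chunks, prepending a header line to each output file.
# "part_" is just a hypothetical output prefix.
split -l 10 --filter='{ echo "HEADER"; cat; } > "$FILE"' input-file part_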