Error in BQ shell Loading Datastore with write_disposition as Write append - google-bigquery

1: I tried to load on an existing table [using Datastore file]
2. Bq Shell asked me to add write_disposition to write append to load to existing table
3. If I do the above, throws an error as follows:
load --source_format=DATASTORE_BACKUP --write_disposition=WRITE_append --allow_jagged_rows=None sample_red.t1estchallenge_1 gs://test.appspot.com/bucket/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiBwLgCDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.entity.backup_info
Error parsing command: flag --allow_jagged_rows=None: ('Non-boolean argument to boolean flag',None)
I tried allow jagged rows = 0 and allow jagged rows = None, nothing works just the same error.
Please advise on this.
UPDATE: As Mosha suggested --allow_jagged_rows=false has worked. It should be before --write_disposition=Write_truncate. But this has led to another issue on encoding. Can anyone say what should be the encoding type for DATASTORE_BACKUP?. I tried both --encoding=UTF-8 and --encoding=ISO-8859.
load --source_format=DATASTORE_BACKUP --allow_jagged_rows=false --write_disposition=WRITE_TRUNCATE sample_red.t1estchallenge_1 gs://test.appspot.com/STAGING/ahFzfnZpcmdpbi1yZWQtdGVzdHJBCxIcX0FFX0RhdGFzdG9yZUFkbWluX09wZXJhdGlvbhiBwLgCDAsSFl9BRV9CYWNrdXBfSW5mb3JtYXRpb24YAQw.entityname.backup_info
Please advise.

You should use "false" (or "true") with boolean arguments, i.e.
--allow_jagged_rows=false

Related

Error 1070 Pig Could not resolve PigSorage using imports:

I'm trying to read a file with pig and I have the error indicated in the title.
data = LOAD '/user/cloudera/pigexample/commands' USING PigSorage('\n') as
(command:chararray);
DUMPdata;
The file contains the following:
**SOF**
whoami
pwd
ls
say
saw
source
<1>
source
<1>
exit
**EOF**
**SOF**
where's
<1>
I don't understand why I get the error supposedly its delimiter is the line break, I'm also trying with another file whose delimiter is the tab ('\t') and it doesn't work either. Does anyone know what the delimiter is? PS: I don't know what tag to put on the question.
As explained above, I expected it to open the file with the indicated delimiter but it does not work.
I have already tried with \t \n
I have already seen my mistake
data = LOAD '/user/cloudera/pigexample/commands' USING PigSorage('\n') as (command:chararray);
data = LOAD '/user/cloudera/pigexample/commands' USING PigStorage('\n') as (command:chararray);

Kmodes function Error in x[[jj]][iseq] <- vjj : replacement has length zero

I got this error when using a big data set, I cleaned the data and used Data<-na.omit(Data) to delete all rows with nulls, it worked and in RStudio I don't get any errors.
When I run the script in SQL as an external script I get the same error as before
Error in x[[jj]][iseq] <- vjj : replacement has length zero
even though I'm using the same Rscript and dataset is the same.
Has anyone had the same issue and how did you solve it.
thanks

When trying to get the source of a function in Postgres using psql, what does the error "column p.proisagg does not exist" mean?

Background:
using postgres 11 on RDS, interface is psql on a Centos 7 box; objective is to show the source of certain stored procs / functions so that I can work with them
Problem description : When I attempt to list / show the source of a given stored function using the \df+ command which I understand to be correct for this use based on [official docs here](https://www.postgresql.org/docs/current/app-psql.html], an error is given as shown:
psql=> \df+ schema_foo.proc_bar;
ERROR: column p.proisagg does not exist
LINE 6: WHEN p.proisagg THEN 'agg'
I have no clue how to interpret this; the function in question does not contain the snippet of logic shown in the error, nor the column referenced there p.proisagg. I have verified this by opening the function in vim with \ef.
My guess based on several unrelated github issues that mention this same error for example is that it is in reference to some schema code internal to postgres.
Summary: in short, I can view the source of the functions using \ef, so my work is not impaired from a practical standpoint, however I wish to understand this error and why I'm encountering it with \df+.
I had the same issue and ran these 2 commands to fix it
sed -i "s/NOT pp.proisagg/pp.prokind='f'/g" /usr/share/phpPgAdmin/classes/database/Postgres.php
sed -i "s/NOT p.proisagg/p.prokind='f'/g" /usr/share/phpPgAdmin/classes/database/Postgres.php

Unable to extract data with double pipe delimiter in Pig Script

I am trying to extract data which is pipe delimited in Pig. Following is my command
L = LOAD 'entirepath_in_HDFS/b.txt/part-m*' USING PigStorage('||');
Iam getting following error
2016-08-04 23:58:21,122 [main] ERROR org.apache.pig.tools.grunt.Grunt - ERROR 1200: Pig script failed to parse:
<line 1, column 4> pig script failed to validate: java.lang.RuntimeException: could not instantiate 'PigStorage' with arguments '[||]'
My input sample file has exactly 5 lines as following
POS_TIBCO||HDFS||POS_LOG||1||7806||2016-07-18||1||993||0
POS_TIBCO||HDFS||POS_LOG||2||7806||2016-07-18||1||0||0
POS_TIBCO||HDFS||POS_LOG||3||7806||2016-07-18||1||0||5
POS_TIBCO||HDFS||POS_LOG||4||7806||2016-07-18||1||0||0
POS_TIBCO||HDFS||POS_LOG||5||7806||2016-07-18||1||0||19.99
I tried several options like using the backslash before delimiter(\||,\|\|) but everything failed. Also, I tried with schema but got the same error.I am using Horton works(HDP2.2.4) and pig (0.14.0).
Any help is appreciated. Please let me know if you need any further details.
I have faced this case, and by checking PigStorage code source, i think PigStorage argument should be parsed into only one character.
So we can use this code instead:
L0 = LOAD 'entirepath_in_HDFS/b.txt/part-m*' USING PigStorage('|');
L = FOREACH L0 GENERATE $0,$2,$4,$6,$8,$10,$12,$14,$16;
Its helpful if you know how many column you have, and it will not affect performance because it's map side.
When you load data using PigStorage, It only expects single character as delimiter.
However if still you want to achieve this you can use MyRegExLoader-
REGISTER '/path/to/piggybank.jar'
A = LOAD '/path/to/dataset' USING org.apache.pig.piggybank.storage.MyRegExLoader('||')
as (movieid:int, title:chararray, genre:chararray);

Hive error with the command show tables

I am using Apache-Hadoop and Hive as a setup. The hive do get connected with the Hadoop,tables are also created. But with the command show tables this exception occurs:Failed with the exception java.io.IOException:org.apache.hadoop.mapred.InvalidInputException:Input Pattern file:/tmp/${hduser}/034cbea3-2b60-49f5-8284-d6fba957dda3/hive_2015-06-18_05-10-04_183_5811447541305606525-1/-local-10000 matches 0 files
What is the exception and how should i solve it. Please help me.
So please check out the file: vim $HIVE_HOME/conf/hive-site.xml, and you should check the <name>system:user.name, it's value should be hduser not ${hduser}.
please take the right/correct name for the user.