Error When Creating Search Index in DataStax Enterprise (DSE)

I'm trying to create a search index on my table in DSE 6.8. This is my table in the test keyspace:
CREATE TABLE users (
username text,
first_name text,
last_name text,
password text,
email text,
last_access timeuuid,
PRIMARY KEY(username));
I tried this query:
CREATE SEARCH INDEX ON test.users;
and this is the response:
InvalidRequest: Error from server: code=2200 [Invalid query] message="Search statements are not supported on this node"
I think there must be something I should change in some file for DSE to support search statements. I've already set SOLR_ENABLED to 1 in /etc/default/dse. I'm totally new to this and I don't know if there's something wrong with my table or something else.
Can anyone suggest what might be causing this error? Thanks in advance.

As the error message suggests, you can only create a Search index on DSE nodes running in Search mode.
Check the node's workload by running the command below. It will tell you if the node is running in pure Cassandra mode or Search mode.
$ dsetool status
If you installed DSE from the binary tarball, it doesn't use /etc/default/dse. Instead, start DSE as a standalone process with the -s flag to bring the node up in Search mode:
$ dse cassandra -s
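If you installed DSE from a package (so /etc/default/dse is actually read), a minimal sketch of that route is to enable Search in the defaults file and then restart the service (assuming DSE is managed as an OS service):
# in /etc/default/dse
SOLR_ENABLED=1
# restart so the node comes back up with the Search workload
$ sudo service dse restart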
Cheers!

Related

Redisgraph create index command timed out

Command is timing out when creating an index.
When I try to create an index on facilityNumber
GRAPH.QUERY GRAPH_NAME "CREATE INDEX ON :node(facilityNumber)"
I'm getting a timed out exception
CLI ERROR: Command timed out. Blocking commands are not supported
More Context:
My graph is constructed using Redis Labs' bulk insert Python script.
The graph consists of 1214 nodes and 152846 relations.
The nodes do contain facilityNumber when queried.
RedisGraph is running in Docker, using the redislabs/redismod image.
On which type of machine are you running Docker?
Also, what happens when you switch to the redisgraph Docker image instead of redislabs/redismod?
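If you want to try that, a sketch of switching to the standalone image and re-running the query (the container name and port mapping are just illustrative):
$ docker run -d --name redisgraph-test -p 6379:6379 redislabs/redisgraph
$ redis-cli GRAPH.QUERY GRAPH_NAME "CREATE INDEX ON :node(facilityNumber)"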

Can't access external Hive metastore with Pyspark

I am trying to run a simple piece of code that just shows the databases I previously created on my hive2-server. (Note: I have tried this in both Python and Scala, with the same results.)
If I log into a hive shell and list my databases, I see a total of 3 databases.
When I start the Spark shell (2.3) with pyspark, I do the usual and add the following property to my SparkSession:
sqlContext.setConf("hive.metastore.uris","thrift://*****:9083")
and restart the SparkContext within my session.
If I run either of the following lines to see all the configs:
pyspark.conf.SparkConf().getAll()
spark.sparkContext._conf.getAll()
I can indeed see that the parameter has been added. I then start a new HiveContext:
hiveContext = pyspark.sql.HiveContext(sc)
But If I list my databases:
hiveContext.sql("SHOW DATABASES").show()
It does not show the same results as the hive shell.
I'm a bit lost. For some reason it looks like Spark is ignoring the config parameter, even though I'm sure the address I'm using is my metastore, since the address I get from running:
hive -e "SET" | grep metastore.uris
is the same address I also get if I run:
ses2 = spark.builder.master("local").appName("Hive_Test").config('hive.metastore.uris','thrift://******:9083').getOrCreate()
ses2.sql("SET").show()
Could it be a permissions issue? For example, some tables not being set to be visible outside the hive shell/user.
Thanks
I managed to solve the issue. Because of a communication issue, Hive was not hosted on that machine; I corrected the address in the code and everything works fine.
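For reference, the usual way to point a Spark session at an external metastore is to set hive.metastore.uris before the session is created and to enable Hive support explicitly. A minimal sketch (the metastore host is a placeholder):
from pyspark.sql import SparkSession

# Set the metastore URI before the session exists; changing it on an
# already-running SparkContext is often silently ignored.
spark = (SparkSession.builder
    .master("local")
    .appName("Hive_Test")
    .config("hive.metastore.uris", "thrift://<metastore-host>:9083")
    .enableHiveSupport()
    .getOrCreate())
spark.sql("SHOW DATABASES").show()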

Gcloud SQL Postgres import error: CREATE TABLE ERROR: syntax error at or near "AS" LINE 2: AS integer ^ Import error: exit status 3

Problem:
Getting the error below while importing a schema from AWS Postgres to Gcloud Postgres.
Error:
Import failed:
SET
SET
SET
SET
SET set_config
------------
(1 row)
SET
SET
SET
CREATE SCHEMA
SET
SET
CREATE TABLE
ERROR: syntax error at or near "AS" LINE 2: AS integer ^
Import error: exit status 3
I used --no-acl --no-owner --format=plain while exporting data from AWS postgres
pg_dump -Fc -n <schema_name> -h hostname -U user -d database --no-acl --no-owner --format=plain -f data.dump
I am able to import certain schemas into Gcloud SQL that were exported using the same method, but I get an error for some other similar schemas. The table has geospatial info, and postgis is already installed in the destination database.
Looking for some quick help here.
My solution:
Basically, I had a data dump file from Postgres 10.0 with tables having a sequence for the PK. Apparently, the way the sequences were dumped along with the other table data could not be read properly by Gcloud Postgres 9.6; that's where it gave the "AS integer" error. I finally found the expression below in the dump file, which I couldn't find earlier, and I needed to filter out this bit.
CREATE SEQUENCE sample.geofences_id_seq
AS integer <=====had to filter out this bit to get it working
START WITH 1
INCREMENT BY 1
NO MINVALUE
NO MAXVALUE
CACHE 1;
Not sure if anyone else has faced this issue, but I had, and this solution worked for me without losing any functionality.
Happy to get other better solutions here.
The original answer is correct, and similar answers are given for the general case. Options include:
Upgrading the target database to 10: this depends on what you are using in GCP. For a managed service like Cloud SQL, upgrading is not an option (though support for 10 is in the works, so waiting may be an option in some cases). It is an option if you are running the database inside a Compute Engine instance, or as a container in, e.g., App Engine (a ready instance is available from the Marketplace).
Downgrading the source before exporting. Only possible if you control the source installation.
Removing all instances of this one line from the file before uploading it. Adapting other responses to modify an already-created dump file, the following worked for me:
cat dump10.sql | sed -e '/AS integer/d' > dump96.sql

HIVE create table is hanging - CDH 5.7

The create table script in Hive hangs and does not complete for a long time. I am using CDH 5.7; 'show databases' takes a while to retrieve the data, but it finally showed the list of all databases. Below is the create script I am using:
create table dept
( dep_id int,
dep_name string
);
Am I missing some configuration settings related to Hive? Also, I can see a green status for Hive in Cloudera Manager (CM).
Looks like the Hive metastore was hanging; after restarting the Hive service it started working. Thanks for your help, Cloudera community.

Database trouble in Django: can't reset because of dependencies

I'm trying to reset a database in Django, using:
python manage.py reset app
but get the following error:
Error: Error: app couldn't be reset. Possible reasons:
* The database isn't running or isn't configured correctly.
* At least one of the database tables doesn't exist.
* The SQL was invalid.
Hint: Look at the output of 'django-admin.py sqlreset app'. That's the SQL this command wasn't able to run.
The full error: cannot drop table app_record because other objects depend on it
HINT: Use DROP ... CASCADE to drop the dependent objects too.
This is what my models.py file looks like:
class Record(models.Model):
    name = models.CharField(max_length=50, db_index=True)
    year = models.IntegerField(blank=True, null=True)

    def __unicode__(self):
        return self.name

class Class(models.Model):
    record = models.ForeignKey(Record)

    def __unicode__(self):
        return self.id
I get that I need to use the DROP ... CASCADE command in the SQL that deletes and recreates the database (the output of django-admin.py sqlreset app).
But how can I edit that SQL directly from models.py?
UPDATE
OK, I figured out how to delete the tables manually (the database is Postgres); I've noted it here for anyone with the same problem:
python manage.py dbshell
# drop table app_record cascade;
# \q
python manage.py reset app
Would still like to know if anyone has a better way to do it, though :)
The easy way to fully reset a Django database is using django-extensions.
It has a reset_db command that supports all Django's default database backends.
python manage.py reset_db
If you're using Django 1.2+ you should explicitly define the database you want to reset. If your project only uses one database, you should probably set --router=default
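For example, on a single-database project that might look like this (assuming django-extensions is installed and added to INSTALLED_APPS):
python manage.py reset_db --router=default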
I use a little unix pipeline that adds CASCADE to all the DROP statements.
python manage.py sqlreset myapp | sed 's/DROP TABLE \(.*\);/DROP TABLE \1 CASCADE;/g' | \
psql --username myusername mydbname
The problem with DROP TABLE ... CASCADE is that it just removes the foreign keys on the related tables, and after syncdb those relations are not recreated.
I found no way to recreate only the particular model's tables, so I'm resetting the whole application by recreating the schema:
DROP SCHEMA public CASCADE;
CREATE SCHEMA "public" AUTHORIZATION "owner of database";
This should only work with databases that support schemas, e.g. PostgreSQL.
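A sketch of the full cycle on PostgreSQL, using dbshell as above (the owner name is a placeholder):
python manage.py dbshell
# DROP SCHEMA public CASCADE;
# CREATE SCHEMA "public" AUTHORIZATION "owner of database";
# \q
python manage.py syncdb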
Using the details in other answers, I made a bash function that I dropped into ~/.bash_profile (on Mac OS X).
django_reset () { python mainsite/manage.py sqlreset "$*" | sed 's/DROP TABLE \(.*\);/DROP TABLE \1 CASCADE;/g' | python mainsite/manage.py dbshell ; }
Then just run this command in the terminal from your root code directory (so the path to mainsite/manage.py makes sense).
django_reset myappA myappB
And it'll execute!
I found another way. I'm using sqlite3, which comes by default with Django.
To reset the tables to their defaults:
python manage.py flush --database=default
After this you will need to run the syncdb command again.
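A sketch of the full sequence (assuming a Django version where syncdb is still available):
python manage.py flush --database=default
python manage.py syncdb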