How to index correctly using Python 3.7 and address associated errors - pandas

I have been having issues indexing in Python 3.7. I would greatly appreciate your insights and clarification on this.
I have tried to research and fix this issue but I am not able to understand what I am doing. I would greatly appreciate your help
enroll = pd.read_csv('enrollment_forecast.csv')
enroll.columns = ['year','roll','unem','hgrad','inc']
# the correlation between variables
enroll.corr()
enroll_data = enroll.ix[:(2,3)].values
print(enroll_data)
enroll_target = enroll.ix[:,1].values
print(enroll_target)
enroll_data_names = ['unem','hgrad']
Exception has occurred: AssertionError
End slice bound is non-scalar

Just a heads up, Pandas .ix index accessor is deprecated.
It's throwing the error error because you are passing it a tuple:
enroll_data = enroll.ix[:(2,3)].values
Try passing it a list instead of a tuple:
enroll_data = enroll.ix[:[2,3]].values

Related

Error: ('HY000', 'The driver did not supply an error!') - with string

I am trying to push a pandas DataFrame from python to Impala but am getting a very uninformative error. The code I am using looks as such:
cursor = connection.cursor()
cursor.fast_executemany = True
cursor.executemany(
f"INSERT INTO table({', '.join(df.columns.tolist())}) VALUES ({('?,' * len(df.columns))[:-1]})",
list(df.itertuples(index=False, name=None))
)
cursor.commit()
connection.close()
This works for the first 23 rows and then suddenly throws this error:
Error: ('HY000', 'The driver did not supply an error!')
This doesn't help me locate the issue at all. I've turned all Na values to None so there is compatibility, Unfortunatly I can't share the data.
Does anyone have any ideas/ leads as to how I would solve this. Thanks

Object of type 'closure' is not subsettable topic modeling

I am attempting some data analysis via topic modeling. I have followed the guide posted here: https://bookdown.org/joone/ComputationalMethods/topicmodeling.html
The code works fine until the last step. Here is my code snippet for your reference:
LePen_top <- tibble(topic = terms$topicnums, prob = apply(terms$prob, 1, paste, collapse = ", "),
frex = apply(terms$frex, 1, paste, collapse = ", "))
When I run it, the following message pops up: Error in terms$topicnums : object of type 'closure' is not subsettable
I have looked around the forums, but I have not been able to solve it due to my inexperience. I would greatly appreciate your help, and if you could explain what have I done wrong.
Best regards,
MRizak

Pandas diff-function: NotImplementedError

When I use the diff function in my snippet:
for customer_id, cus in tqdm(df.groupby(['customer_ID'])):
# Get differences
diff_df1 = cus[num_features].diff(1, axis = 0).iloc[[-1]].values.astype(np.float32)
I get:
NotImplementedError
The exact same code did run without any error before (on Colab), whereas now I'm using an Azure DSVM via JupyterHub and I get this error.
I already found this
pandas pd.DataFrame.diff(axis=1) NotImplementationError
but the solution doesnt work for me as I dont have any Date types. Also I did upgrade pandas but it didnt change anything.
EDIT:
I have found that the error occurs when the datatype is 'int16' or 'int8'. Converting the dtypes to 'int64' solves it.
However I leave the question open in case someone can explain it or show a solution that works with int8/int16.

Sparklyr : sql temporary error : argument is not interpretable as logical

Hi I'm new to sparklyr and I'm essentially running a query to create a temporary object in spark.
The code is something like
ts_data<-tbl(sc,"db.table") %>% filter(condition) %>% compute("ts_data")
sc is my spark connection.
I have run the same code before and it works but now I get the following error.
Error in if (temporary) sql("TEMPORARY ") : argument is not
interpretable as logical
I have tried changing filters, tried it with new tables, R versions and snapshots. Yet it still gives the same exact error. I am positive there are no syntax errors
Can someone help me understand how to fix this?
I ran into the same problem. Changing compute("x") to compute(name = "x") fixed it for me.
This was a bug of sparklyr, and is fixed with version 1.7.0. So either use matching by argument (name = x) or update your sparklyr version.

JanusGraph:Some key(s) on index verticesIndex do not currently have status REGISTERED

I have some questions when I build a JanusGraph Mixed index.
This is my code:
mgmt = graph.openManagement();
idx = mgmt.getGraphIndex('zhh1_index');
prop = mgmt.getPropertyKey('zhang');
mgmt.addIndexKey(idx, prop);
prop = mgmt.getPropertyKey('uri');
mgmt.addIndexKey(idx, prop);
prop = mgmt.getPropertyKey('age');
mgmt.addIndexKey(idx, prop);
mgmt.commit();
mgmt.awaitGraphIndexStatus(graph, 'zhh1_index').status(SchemaStatus.REGISTERED).call();
mgmt = graph.openManagement();
mgmt.updateIndex(mgmt.getGraphIndex('zhh1_index'),SchemaAction.ENABLE_INDEX).get();
mgmt.commit();
vertex2=graph.addVertex(label,'zhh1');
vertex2.property('zhang','male');
vertex2.property('uri','/zhh1/zhanghh');
vertex2.property('age','18');
vertex3=graph.addVertex(label,'zhh1');
vertex3.property('zhang','male');
vertex3.property('uri','/zhh1/zhangheng');
When the program executes this line:
mgmt.awaitGraphIndexStatus(graph, 'zhh1_index').status(SchemaStatus.REGISTERED).call();
the the log prints these information (and about 30s later, an exception like this: the sleep was interrupt):
GraphIndexStatusReport[success=false, indexName='zhh1_index', targetStatus=ENABLED, notConverged={jiyq=INSTALLED, zhang=INSTALLED, uri=INSTALLED, age=INSTALLED}, converged={}, elapsed=PT1M0.096S]
I was so confused about this!
It keeps printing a lot for all indexes I have. Am I doing anything wrong? How to avoid such message?
When I execute the following statement separately, the following exception is reported:
exception:java.util.concurrent.ExecutionException:
mgmt.updateIndex(mgmt.getGraphIndex('zhh1_index'),SchemaAction.ENABLE_INDEX).get();
org.apache.tinkerpop.gremlin.driver.exception.ResponseException: Cannot invoke method get() on null object
Your index seems to be stuck in the INSTALLED state, which may happening due to a few reasons: please see this post and look at my answer-- specifically bullet numbers 2,3, and 5.
When did you buildMixedIndex() ?
REINDEX procedure may be required.