Flask & SQLAlchemy & PostgreSQL - In a query can an 'int' be cast to a 'string' to permit use of 'like' - flask-sqlalchemy

Using Flask and SQLAlchemy, is it possible to create a query in which a column is cast from a number to a string so that .like() can be used as a filter?
The sample code below illustrates what I'm after; however, Test 3 is a broken statement (i.e., no cast is attempted, so the query fails; the error is below).
Test 1 - demonstrates a standard select
Test 2 - demonstrates a select using like on a string
Can 'test 3' be modified to permit a like on a number?
In PostgreSQL the SQL query would be:
SELECT * FROM mytable WHERE number::varchar like '%2%'
Any assistance gratefully appreciated.
from flask import Flask
from flask_sqlalchemy import SQLAlchemy
from sqlalchemy import Table, Column, Integer, String
app = Flask(__name__)
app.debug = True
app.config.from_pyfile('config.py')
db = SQLAlchemy( app )
class MyTable(db.Model):
    '''My Sample Table'''
    __tablename__ = 'mytable'
    number = db.Column( db.Integer, primary_key = True )
    text = db.Column( db.String )
    def __repr__(self):
        return( 'MyTable( ' + str( self.number ) + ', ' + self.text + ')' )
test_1 = (db.session.query(MyTable)
    .all())
print("Test 1 = " + str( test_1 ))
test_2 = (db.session.query(MyTable)
    .filter( MyTable.text.like( '%orl%' ) )
    .all())
print("Test 2 = " + str( test_2 ))
test_3 = (db.session.query(MyTable)
    .filter( MyTable.number.like( '%2%' ) )
    .all())
And the sample data:
=> select * from mytable;
 number | text
--------+-------
    100 | Hello
     20 | World
And the error:
Traceback (most recent call last):
File "sample.py", line 33, in <module>
.filter( MyTable.number.like( '%2%' ) )
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2320, in all
return list(self)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2438, in __iter__
return self._execute_and_instances(context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2453, in _execute_and_instances
result = conn.execute(querycontext.statement, self._params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 729, in execute
return meth(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/sql/elements.py", line 322, in _execute_on_connection
return connection._execute_clauseelement(self, multiparams, params)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 826, in _execute_clauseelement
compiled_sql, distilled_params
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 958, in _execute_context
context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1159, in _handle_dbapi_exception
exc_info
File "/usr/lib64/python2.7/site-packages/sqlalchemy/util/compat.py", line 199, in raise_from_cause
reraise(type(exception), exception, tb=exc_tb)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 951, in _execute_context
context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/default.py", line 436, in do_execute
cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (ProgrammingError) operator does not exist: integer ~~ unknown
LINE 3: WHERE mytable.number LIKE '%2%'
^
HINT: No operator matches the given name and argument type(s). You might need to add explicit type casts.
'SELECT mytable.number AS mytable_number, mytable.text AS mytable_text \nFROM mytable \nWHERE mytable.number LIKE %(number_1)s' {'number_1': '%2%'}

Solved. The Query method filter can take an expression, so the solution is:
from sqlalchemy import cast, String
result = (db.session.query(MyTable)
    .filter( cast( MyTable.number, String ).like( '%2%' ) )
    .all())
With the result:
Test 3 = [MyTable( 20, World)]
Found the information in the SQLAlchemy Query API documentation.
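For anyone who wants to try the cast-then-LIKE idea without a Flask app, here is a minimal sketch of the equivalent SQL run against an in-memory SQLite database with only the standard library. SQLite is an assumption made for portability (the question uses PostgreSQL), CAST(number AS TEXT) plays the role of number::varchar, and the helper name rows_matching is hypothetical. SQLite is more lenient about types than PostgreSQL, so this only shows the shape of the query, not the original error.

```python
import sqlite3


def rows_matching(conn, pattern):
    """Return rows whose integer `number` column, cast to text, matches a LIKE pattern."""
    cur = conn.execute(
        "SELECT number, text FROM mytable "
        "WHERE CAST(number AS TEXT) LIKE ? ORDER BY number",
        (pattern,),
    )
    return cur.fetchall()


# Same sample data as in the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (number INTEGER PRIMARY KEY, text TEXT)")
conn.executemany("INSERT INTO mytable VALUES (?, ?)", [(100, "Hello"), (20, "World")])

matches = rows_matching(conn, "%2%")
print(matches)
```

Only the row with number 20 contains the digit 2, which matches the Test 3 result above.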

Related

cx_oracle ORA-01036: illegal variable name/number

I have this group of Python scripts where I do an API call, merge the resulting DataFrames together, then output to a table in an Oracle db. The same code works perfectly in three other scripts that are configured the same way apart from using a different API, but this particular script throws the error below. I read up on binds, but I can't see what I'm doing incorrectly for a tuple. Thanks in advance for your time.
sf_joined = pd.merge(sf_opp, sf_account,on=["CustID","CustID"])
# sf_joined.to_csv('sf_joined.csv', index=False)
# sf_types = sf_joined.dtypes
# print(sf_types)
char_columns = sf_joined.select_dtypes(include=['object']).columns
for col in char_columns:
    if col not in ['rundate','Amount','Estimated_GC','Probability','MDC','DaysOpen']:
        sf_joined[col] = sf_joined[col].fillna('')
        # sf_joined[col] = sf_joined[col].map(lambda x: x.encode('utf-8'))
        sf_joined[col] = sf_joined[col].map(lambda x: x[:1000])
pw = '****'
db_con = cx_Oracle.connect('mktg', pw, "prd-bia-db-***.o******.com:1521/BIPRD", encoding = "UTF-8", nencoding = "UTF-8")
cur = db_con.cursor()
print(db_con.version)
cur.execute('drop table cs_salesforce')
create_opps = """create table cs_salesforce(
rundate date,
sfoppid varchar(500)
)
"""
cur.execute(create_opps)
all_opps = []
for x in sf_joined.itertuples():
    all_opps.append(x[1:])
insert_statement = """insert into cs_salesforce(rundate,sfoppid)values(:1, :2)"""
cur.executemany(insert_statement, all_opps)
db_con.commit()
Error:
runfile('C:/python_scripts_prod/cs_salesforce.py', wdir='C:/python_scripts_prod')
18.3.0.0.0
Traceback (most recent call last):
File "C:\python_scripts_prod\cs_salesforce.py", line 163, in <module>
cur.executemany(insert_statement, all_opps)
DatabaseError: ORA-01036: illegal variable name/number
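The question does not show how many columns sf_joined has, so the cause is not confirmed. One thing worth ruling out, assuming the error comes from a mismatch between the bind placeholders and the values supplied, is that each tuple built from itertuples() carries more values than the two placeholders :1 and :2. The sketch below is a hypothetical pre-check using only the standard library; the helper names and the sample rows are made up for illustration, and the regex is a simplification that does not skip colons inside string literals.

```python
import re


def count_positional_binds(sql):
    """Count distinct numeric placeholders such as :1, :2 in a SQL string."""
    return len(set(re.findall(r":(\d+)", sql)))


def find_mismatched_rows(sql, rows):
    """Return (index, length) for each row whose length differs from the bind count."""
    expected = count_positional_binds(sql)
    return [(i, len(row)) for i, row in enumerate(rows) if len(row) != expected]


insert_statement = "insert into cs_salesforce(rundate,sfoppid)values(:1, :2)"

# Hypothetical rows: the second one has an extra value, as would happen if the
# DataFrame had more columns than the INSERT statement names.
rows = [("2019-01-01", "006A"), ("2019-01-02", "006B", "extra column")]

bad = find_mismatched_rows(insert_statement, rows)
print(bad)
```

If a check like this reports mismatches, selecting only the needed columns before building all_opps would make each tuple line up with the placeholders.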

Not able to extract any column from a pandas dataframe [duplicate]

I have successfully read a CSV file using pandas. When I try to print a particular column from the data frame, I get a KeyError. I am sharing the code and the error below.
import pandas as pd
reviews_new = pd.read_csv("D:\\aviva.csv")
reviews_new['review']
reviews_new['review']
Traceback (most recent call last):
File "<ipython-input-43-ed485b439a1c>", line 1, in <module>
reviews_new['review']
File "C:\Users\30216\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.py", line 1997, in __getitem__
return self._getitem_column(key)
File "C:\Users\30216\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2004, in _getitem_column
return self._get_item_cache(key)
File "C:\Users\30216\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\generic.py", line 1350, in _get_item_cache
values = self._data.get(item)
File "C:\Users\30216\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\core\internals.py", line 3290, in get
loc = self.items.get_loc(item)
File "C:\Users\30216\AppData\Local\Continuum\Anaconda2\lib\site-packages\pandas\indexes\base.py", line 1947, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)
File "pandas\index.pyx", line 159, in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)
File "pandas\hashtable.pyx", line 675, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)
File "pandas\hashtable.pyx", line 683, in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)
KeyError: 'review'
Can someone help me with this?
I think it is best to first check what the real column names are. Converting them to a list makes stray whitespace and similar problems easier to see:
print (reviews_new.columns.tolist())
I think there can be 2 problems:
1. Whitespace in the column names (and maybe in the data too)
The solution is to strip the whitespace from the column names:
reviews_new.columns = reviews_new.columns.str.strip()
Or add the parameter skipinitialspace to read_csv:
reviews_new = pd.read_csv("D:\\aviva.csv", skipinitialspace=True)
2. A separator different from the default ,
The solution is to add the parameter sep:
#sep is ;
reviews_new = pd.read_csv("D:\\aviva.csv", sep=';')
#sep is whitespace
reviews_new = pd.read_csv("D:\\aviva.csv", sep=r'\s+')
reviews_new = pd.read_csv("D:\\aviva.csv", delim_whitespace=True)
EDIT:
You have whitespace in the column names, so you need the solutions from point 1:
print (reviews_new.columns.tolist())
['Name', ' Date', ' review']
          ^        ^
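To reproduce the whitespace problem without the original file, here is a small sketch that reads an in-memory CSV. The CSV text and the variable names are made up for illustration; only the header layout follows the column list printed above.

```python
import io

import pandas as pd

# Hypothetical file contents with a space after each comma, as in the header above.
csv_text = "Name, Date, review\nAlice, 2017-01-01, good\n"

# Default parsing keeps the leading spaces, so 'review' is not a column name.
raw = pd.read_csv(io.StringIO(csv_text))
print(raw.columns.tolist())

# Fix 1: strip the column names after reading (the data values keep their spaces).
stripped = raw.copy()
stripped.columns = stripped.columns.str.strip()

# Fix 2: skip the spaces while parsing, which cleans both names and values.
clean = pd.read_csv(io.StringIO(csv_text), skipinitialspace=True)
print(clean["review"].tolist())
```

Note the difference between the two fixes: stripping the column names leaves the leading space in the values, while skipinitialspace removes it from both.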
import pandas as pd
df=pd.read_csv("file.txt", skipinitialspace=True)
df.head()
df['review']
dfObj['Hash Key'] = (dfObj['DEAL_ID'].map(str) +dfObj['COST_CODE'].map(str) +dfObj['TRADE_ID'].map(str)).apply(hash)
#for index, row in dfObj.iterrows():
# dfObj.loc[index,'hash'] = hashlib.md5(str(row[['COST_CODE','TRADE_ID']].values)).hexdigest()
print(dfObj['hash'])

Python Dbf append to memory indexed table fails

I'm using the Python dbf-0.99.1 library from Ethan Furman. This approach to adding a record to a table fails:
tab = dbf.Table( "MYTABLE" )
tab.open(mode=dbf.READ_WRITE)
idx = tab.create_index(lambda rec: (rec.id if not is_deleted(rec) else DoNotIndex ) ) # without this, append works
rec = { "id":id, "col2": val2 } # some values, id is numeric and is not None
tab.append( rec ) # fails here
My table contains various character and numeric columns. This is just an example. The exception is:
line 5959, in append
newrecord = Record(recnum=header.record_count, layout=meta, kamikaze=kamikaze)
line 3102, in __new__
record._update_disk()
line 3438, in _update_disk
index(self)
line 7550, in __call__
vindex = bisect_right(self._values, key)
TypeError: '<' not supported between instances of 'NoneType' and 'int'
Any help appreciated. Thanks.
EDIT: Here is a test script:
import dbf
from dbf import is_deleted, DoNotIndex
tab = dbf.Table('temptable', "ID N(12,0)" )
tab.open(mode=dbf.READ_WRITE)
rc = { "id":1 }
tab.append( rc ) # need some data without index first
idx = tab.create_index(lambda rec: (rec.id if not is_deleted(rec) else DoNotIndex ) )
rc = { "id":2 }
tab.append( rc ) # fails here

TypeError: '<' not supported between instances of 'str' and 'int' Doc2Vec

Any ideas why this error is being thrown
"TypeError: '<' not supported between … 'str' and 'int'" when doc-tag not present for most_similar()
I have a list of .txt documents stored in my data folder and want to compare one doc to another through my flask app on localhost.
Traceback (most recent call last):
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 2463, in __call__
return self.wsgi_app(environ, start_response)
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 2449, in wsgi_app
response = self.handle_exception(e)
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 1866, in handle_exception
reraise(exc_type, exc_value, tb)
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\_compat.py", line 39, in reraise
raise value
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 2446, in wsgi_app
response = self.full_dispatch_request()
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 1951, in full_dispatch_request
rv = self.handle_user_exception(e)
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 1820, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\_compat.py", line 39, in reraise
raise value
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\flask\app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "C:\Users\ibrahimm\Desktop\doc2vec-compare-doc-demo\app.py", line 56, in api_compare_2
vec1 = d2v_model.docvecs.most_similar(data['doc1'])
File "C:\Users\ibrahimm\AppData\Local\Continuum\anaconda3\lib\site-packages\gensim\models\keyedvectors.py", line 1715, in most_similar
elif doc in self.doctags or doc < self.count:
TypeError: '<' not supported between instances of 'str' and 'int'
app.py
@app.route('/api/compare_2', methods=['POST'])
def api_compare_2():
    data = request.get_json()
    if 'doc1' not in data or 'doc2' not in data:
        return 'ERROR'
    vec1 = d2v_model.docvecs.most_similar(data['doc1'])
    vec2 = d2v_model.docvecs.most_similar(data['doc2'])
    vec1 = gensim.matutils.full2sparse(vec1)
    vec2 = gensim.matutils.full2sparse(vec2)
    print (data)
    print (vec2)
    print (vec1)
    return jsonify(sim=gensim.matutils.cossim(vec1, vec2))
@app.route('/api/compare_all', methods=['POST'])
def api_compare_all():
    data = request.get_json()
    if 'doc' not in data:
        return 'ERROR'
    vec = d2v_model.docvecs.most_similar(data['doc'])
    res = d2v_model.docvecs.most_similar([vec], topn=5)
    return jsonify(list=res)
model.py
def load_model():
    try:
        return gensim.models.doc2vec.Doc2Vec.load("doc2vec.model2")
    except:
        print ('Model not found!')
        return None
def train_model():
    #path to the input corpus files
    data="data"
    #tagging the text files
    class DocIterator(object):
        def __init__(self, doc_list, labels_list):
            self.labels_list = labels_list
            self.doc_list = doc_list
        def __iter__(self):
            for idx, doc in enumerate(self.doc_list):
                yield TaggedDocument(words=doc.split(), tags=[self.labels_list[idx]])
    docLabels = [f for f in listdir(data) if f.endswith('.txt')]
    print(docLabels)
    data = []
    for doc in docLabels:
        data.append(open(r'C:\Users\ibrahimm\Desktop\doc2vec-compare-doc-demo\data\\' + doc,
                         encoding='cp437').read())
    tokenizer = RegexpTokenizer(r'\w+')
    stopword_set = set(stopwords.words('english'))
    #This function does all cleaning of data using two objects above
    def nlp_clean(data):
        new_data = []
        for d in data:
            new_str = d.lower()
            dlist = tokenizer.tokenize(new_str)
            dlist = list(set(dlist).difference(stopword_set))
            new_data.append(dlist)
        return new_data
    data = nlp_clean(data)
    it = DocIterator(data, docLabels)
    #train doc2vec model
    model = gensim.models.Doc2Vec(size=300, window=15, min_count=4, workers=10,alpha=0.025, min_alpha=0.025, iter=20) # use fixed learning rate
    model.build_vocab(it)
    model.train(it, epochs=model.iter, total_examples=model.corpus_count)
    model.save("doc2vec.model2")
If you try to look-up a string doc-tag that's not in the model, you unfortunately get this confusing error, instead of a clearer error. (See gensim's open-issue: https://github.com/RaRe-Technologies/gensim/issues/1737#issuecomment-346995119 )
Whatever is in data['doc1'] isn't a tag in the model.
You may be able to pre-check, before attempting a most_similar() operation, by looking at whether data['doc1'] in model.docvecs is True.
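To make the pre-check concrete without needing a trained model, here is a minimal sketch built around a stand-in class. FakeDocvecs and safe_most_similar are hypothetical names; the stand-in copies only the condition shown in the traceback (doc in self.doctags or doc < self.count) and is not gensim's real implementation. With a real model, the same membership test would be data['doc1'] in model.docvecs, as suggested above.

```python
class FakeDocvecs:
    """Hypothetical stand-in that mimics only the lookup line shown in the traceback."""

    def __init__(self, doctags):
        self.doctags = dict(doctags)
        self.count = len(self.doctags)

    def __contains__(self, doc):
        return doc in self.doctags

    def most_similar(self, doc):
        # Same condition as in the traceback: an unknown string tag falls through
        # to `doc < self.count`, which compares a str with an int and raises.
        if doc in self.doctags or doc < self.count:
            return [("neighbour.txt", 0.9)]  # placeholder result, not real output
        raise KeyError(doc)


def safe_most_similar(docvecs, tag):
    """Return None for unknown tags instead of triggering the confusing TypeError."""
    if tag not in docvecs:
        return None
    return docvecs.most_similar(tag)


docvecs = FakeDocvecs({"a.txt": 0, "b.txt": 1})
known = safe_most_similar(docvecs, "a.txt")
unknown = safe_most_similar(docvecs, "missing.txt")
print(known, unknown)
```

In the Flask view, a None result could be turned into a clear error response naming the unknown tag, rather than letting the TypeError reach the client.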
TypeError: '<' not supported between instances of 'str' and 'int'
[35182] Failed to execute script docker-compose
This error was the result of copying and pasting code that contained the wrong kind of quotation mark. Change it to a plain straight quote ('').

Spacy phrasematcher does not get matcher name

I am new to PhraseMatcher and want to extract some keywords from my emails.
Everything is working well except that I can't get the name of the added matcher.
This is my code below:
def main():
    patterns_months = 'phraseMatcher/months.txt'
    text_loc = 'phraseMatcher/text.txt'
    nlp = spacy.blank('en')
    nlp.vocab.lex_attr_getters ={}
    phrases_months = read_gazetter(patterns_months)
    txts = read_text(text_loc, n=n)
    months = [nlp(text) for text in phrases_months]
    matcher = PhraseMatcher(nlp.vocab)
    matcher.add('MONTHS', None, *months)
    print(nlp.vocab.strings['MONTHS'])
    for txt in txts:
        doc = nlp(txt)
        matches = matcher(doc)
        for match_id ,start, end in matches:
            span = doc[start: end]
            label = nlp.vocab.strings[match_id]
            print(label, span.text, start, end)
The result:
12298211501233906429 <--- this is from print(nlp.vocab.strings['MONTHS'])
Traceback (most recent call last):
File "D:/workspace/phraseMatcher/venv/phraseMatcher.py", line 71, in <module>
plac.call(main)
File "D:\workspace\phraseMatcher\venv\lib\site-packages\plac_core.py", line 328, in call
cmd, result = parser.consume(arglist)
File "D:\workspace\phraseMatcher\venv\lib\site-packages\plac_core.py", line 207, in consume
return cmd, self.func(*(args + varargs + extraopts), **kwargs)
File "D:/workspace/phraseMatcher/venv/phraseMatcher.py", line 47, in main
label = nlp.vocab.strings[match_id]
File "strings.pyx", line 117, in spacy.strings.StringStore.__getitem__
KeyError: "[E018] Can't retrieve string for hash '18446744072093410045'."
spaCy version: 2.0.12
Platform: Windows-7-6.1.7601-SP1
Python version: 3.7.0
I can't find what I did wrong. It is simple and I read these already:
Using PhraseMatcher in SpaCy to find multiple match types
Help me, thanks in advance.