Django - Function with template returns `TypeError: not enough arguments for format string` - sql

I am trying to use PostgreSQL's FORMAT function in Django to format phone number strings.
I can accomplish this with the following SQL query:
SELECT
phone_number, FORMAT('(%s) %s-%s', SUBSTRING(phone_number,3,3), SUBSTRING(phone_number,6,3), SUBSTRING(phone_number,9,4))
FROM core_user
WHERE phone_number iS NOT NULL
which returns a result like:
Trying to implement this into Django to be used for an ORM query, I did the following:
class FormatPhoneNumber(Func):
function = "FORMAT"
template = "%(function)s('(%s) %s-%s', SUBSTRING(%(expressions)s,3,3), SUBSTRING(%(expressions)s,6,3), SUBSTRING(%(expressions)s,9,4))"
ORM query:
User.objects.annotate(phone_number2=FormatPhoneNumber(f"phone_number"))
Returns the following error:
File /venv/lib/python3.10/site-packages/django/db/models/expressions.py:802, in Func.as_sql(self, compiler, connection, function, template, arg_joiner, **extra_context)
800 arg_joiner = arg_joiner or data.get("arg_joiner", self.arg_joiner)
801 data["expressions"] = data["field"] = arg_joiner.join(sql_parts)
--> 802 return template % data, params
TypeError: not enough arguments for format string
I believe it is due to this line '(%s) %s-%s' that is supplied to the FORMAT function.
Does anyone have any ideas on how I can make this work?

Yes, you use two consecutive percentages to produce a percentage after formatting, so:
class FormatPhoneNumber(Func):
function = "FORMAT"
template = "%(function)s('(%%s) %%s-%%s', SUBSTRING(%(expressions)s,3,3), SUBSTRING(%(expressions)s,6,3), SUBSTRING(%(expressions)s,9,4))"
But normally formatting is not done by the database, normally you do this in the model, in a model field, or in the view or template.

I was not able to get the solution using the FORMAT function to work, but I did get this to work with REGEXP_REPLACE which gives the same output:
class FormatPhoneNumber2(Func):
function = "REGEXP_REPLACE"
template = "%(function)s(SUBSTRING(%(expressions)s,3,10), '(\d{3})(\d{3})(\d{4})', '(\\1) \\2-\\3')"
In [27]: result = (
...: User.objects.filter(phone_number__isnull=False)
...: .annotate(phone_number2=FormatPhoneNumber2("phone_number"))
...: .values_list("phone_number", "phone_number2")
...: )
In [28]: for user in result:
...: print(user)
('+16502553199', '(650) 255-3199')
('+12147047099', '(214) 704-7099')
('+12147547099', '(214) 754-7099')

Related

how to dynamically build select list from a API payload using PyPika

I have a JSON API payload containing tablename, columnlist - how to build a SELECT query from it using pypika?
So far I have been able to use a string columnlist, but not able to do advanced querying using functions, analytics etc.
from pypika import Table, Query, functions as fn
def generate_sql (tablename, collist):
table = Table(tablename)
columns = [str(table)+'.'+each for each in collist]
q = Query.from_(table).select(*columns)
return q.get_sql(quote_char=None)
tablename = 'customers'
collist = ['id', 'fname', 'fn.Sum(revenue)']
print (generate_sql(tablename, collist)) #1
table = Table(tablename)
q = Query.from_(table).select(table.id, table.fname, fn.Sum(table.revenue))
print (q.get_sql(quote_char=None)) #2
#1 outputs
SELECT "customers".id,"customers".fname,"customers".fn.Sum(revenue) FROM customers
#2 outputs correctly
SELECT id,fname,SUM(revenue) FROM customers
You should not be trying to assemble the query in a string by yourself, that defeats the whole purpose of pypika.
What you can do in your case, that you have the name of the table and the columns coming as texts in a json object, you can use * to unpack those values from the collist and use the syntax obj[key] to get the table attribute with by name with a string.
q = Query.from_(table).select(*(table[col] for col in collist))
# SELECT id,fname,fn.Sum(revenue) FROM customers
Hmm... that doesn't quite work for the fn.Sum(revenue). The goal is to get SUM(revenue).
This can get much more complicated from this point. If you are only sending column names that you know to belong to that table, the above solution is enough.
But if you have complex sql expressions, making reference to sql functions or even different tables, I suggest you to rethink your decision of sending that as json. You might end up with something as complex as pypika itself, like a custom parser or wathever. than your better option here would be to change the format of your json response object.
If you know you only need to support a very limited set of capabilities, it could be feasible. For example, you can assume the following constraints:
all column names refer to only one table, no joins or alias
all functions will be prefixed by fn.
no fancy stuff like window functions, distinct, count(*)...
Then you can do something like:
from pypika import Table, Query, functions as fn
import re
tablename = 'customers'
collist = ['id', 'fname', 'fn.Sum(revenue / 2)', 'revenue % fn.Count(id)']
def parsed(cols):
pattern = r'(?:\bfn\.[a-zA-Z]\w*)|([a-zA-Z]\w*)'
subst = lambda m: f"{'' if m.group().startswith('fn.') else 'table.'}{m.group()}"
yield from (re.sub(pattern, subst, col) for col in cols)
table = Table(tablename)
env = dict(table=table, fn=fn)
q = Query.from_(table).select(*(eval(col, env) for col in parsed(collist)))
print (q.get_sql(quote_char=None)) #2
Output:
SELECT id,fname,SUM(revenue/2),MOD(revenue,COUNT(id)) FROM customers

Query list index with django

i have a problem with django when i use these two arrays:
institution_ids [(2,), (16,)]
project_ids [(3,), (1,)]
in this query:
queryset = Patient.active.filter(tss_id__in=institution_ids, project_id__in = project_ids)
it gives me back all the combinations, but I need this kind of result:
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id = project_ids[0])
queryset += Patient.active.filter(tss_id=institution_ids[1], project_id = project_ids[1])
how can i do?
Thanks
Giuseppe
What you search for is an OR statement.
You can do it using Q object.
In Django, we cannot directly use the OR operator to filter the
QuerySet. For this implementation, we have to use the Q() object. By
using the Q() object in the filter method, we will be able to use the
OR operator between the Q() objects.
For example-
If you want to get all objects with tss_id in institution array and all objects with project__id__in projects_id array you will use it like this.
queryset = Patient.active.filter(Q(tss_id__in=institution_ids) | project_id__in = project_ids))
Pay attention- you need to import Q.
Another option -
Using union
New in Django 1.11. Uses SQL’s UNION operator to combine the results of two or more QuerySets.
The UNION operator selects only distinct values by default. To allow duplicate values, use the all=True argument.
For example :
queryset = Patient.active.filter(tss_id=institution_ids[0], project_id = project_ids[0])
queryset2 = Patient.active.filter(tss_id=institution_ids[1], project_id = project_ids[1])
Combined_queryset = queryset.union(queryset2)

Pig Latin: how to try-catch for errors regarding casting?

Let's say I have a structured data in Pig named "my_great_data" and it has the following fields: "a_field", "b_field" and "c_field".
I want to write a filter statement that filters in all "rows" in "my_great_data" that their "a_field" can be casted to long data type.
Something like:
my_great_data_output = filter my_great_data BY (a_field can be casted to long);
You could write a simple Python UDF to do the casting and return a null if it's not possible. You can then use IS NOT NULL in a filter to remove the records that failed to cast.
Something like:
from pig_util import outputSchema
#outputSchema("cast_value:long")
def castToLong(num):
if num is None:
return None
try:
# Should handle strings with a decimal point
# Use long(num) if these should not be cast.
r = long(float(num))
return r
except:
return None

Python cx_oracle bind variable with a list of items

I have a query like this:
SELECT prodId, prod_name , prod_type FROM mytable WHERE prod_type in (:list_prod_names)
I want to get the information of a product, depending on the possible types are: "day", "week", "weekend", "month". Depending on the date it might be at least one of those option, or a combination of all of them.
This info (List type) is returned by the function prod_names(date_search)
I am using cx_oracle bindings with code like:
def get_prod_by_type(search_date :datetime):
query_path = r'./queries/prod_by_name.sql'
raw_query = open(query_path).read().strip().replace('\n', ' ').replace('\t', ' ').replace(' ', ' ')
print(sql_read_op)
# Depending on the date the product types may be different
prod_names(search_date) #This returns a list with possible names
qry_params = {"list_prod_names": prod_names} # See attempts bellow
try:
db = DB(username='username', password='pss', hostname="localhost")
df = db.get(raw_query,qry_params)
except Exception:
exception_error = traceback.format_exc()
exception_error = 'Exception on DB.get_short_cov_op2() : %s\n%s' % exception_error
print(exception_error)
return df
For this: qry_params = {"list_prod_names": prod_names} I have tried multiple different things such as:
prod_names = ''.join(prod_names)
prod_names = str(prod_names)
prod_names =." \'"+''.join(prod_names)+"\'"
The only thing I have managed to get it work is by doing:
new_query = raw_query.format(list_prod_names=prodnames_for_date(search_date)).replace('[', '').replace(']','')
df = db.query(new_query)
I am trying not to use .format() because is bad practie to do a .format to an sql to prevent attacks.
db.py contains among other functions:
def get(self, sql, params={}):
cur = self.con.cursor()
cur.prepare(sql)
try:
cur.execute(sql, **params)
df = pd.DataFrame(cur.fetchall(), columns=[c[0] for c in cur.description])
except Exception:
exception_error = traceback.format_exc()
exception_error = 'Exception on DB.get() : %s\n%s' % exception_error
print(exception_error)
self.con.rollback()
cur.close()
df.columns = df.columns.map(lambda x: x.upper())
return df
I would like to be able to do a type binding.
I am using:
python = 3.6
cx_oracle = 6.3.1
I have read the followig articles but I a still unable to find a solution:
Python cx_Oracle bind variables
Python cx_Oracle SQL with bind string variable
Search for name in cx_Oracle
Unfortunately you cannot bind an array directly unless you convert it to a SQL type and use a subquery -- which is fairly complex. So instead you need to do something like this:
inClauseParts = []
for i, inValue in enumerate(ARRAY_VALUE):
argName = "arg_" + str(i + 1)
inClauseParts.append(":" + argName)
clause = "%s in (%s)" % (columnName, ",".join(inClauseParts))
This works fine but be aware that if the number of elements in the array changes regularly that using this technique will create a separate statement that must be parsed for each number of elements. If you know that (in general) you won't have more than (for example) 10 elements in the array it would be better to append None to the incoming array so that the number of elements is always 10.
Hopefully that is clear enough!
I have finally manage to do it. It might not be pretty but it works.
I have modified my sql query to include an extra select which returns the value of my list of descriptors:
inner join (
SELECT regexp_substr(:my_list_of_items, '[^,]+', 1, LEVEL) as mylist
FROM dual
CONNECT BY LEVEL <= length(:my_list_of_items) - length(REPLACE(:my_list_of_items, ',', '')) + 1
) d
on d.mylist= a.corresponding_columns

Sql query not working on matlab properly

So i'm working on a laravel project where i pass some data to matlab and then matlab will edit them..everything works fine except the function of matlab that i wrote..
function show(a)
econ=database('datamining','root','');
curs=exec(con,'SELECT name FROM dataset_choices WHERE id = a');
curs = fetch(curs);
curs.Data
end
i want this function to display the name of the dataset the user choose..the problem is that it doesnt work writing just where id = a... but if i write for example where id=1 it works..
i tried to display just the a with disp(a) to see what is the value of the a and it is store the right id that user had choose..so how can i use it in my query??
Try:
a = num2str(a); % or make sure the user inputs a string instead
curs=exec(con,['SELECT name FROM dataset_choices WHERE id = ',a]);
If a = '1', then the brackets would print:
'SELECT name FROM dataset_choices WHERE id = 1'