Doing BNF for a simplified SQL “where” clause - sql

I have the following SQL statement with a rudimentary BNF applied:
SELECT
from_expression [, from_expression] ...
FROM
from_source
WHERE
condition
from_source:
table_name
from_expression:
literal # 'abc', 1,
column # Table.Field1, Table.Profit
function # ABS(...)
operator invocation # NOT field1, 2+3, Genres[0], Genres[1:2], Address.name
condition:
???
For now the WHERE condition is the same as the from_expression but evaluated as a boolean. What would be the proper way to show that?

The grammar doesn't care about semantics.
Syntactically, an expression is an expression, nothing more. If you later do some kind of semantic analysis, that is when you'll need to deal with the difference. You might do that in the reduction actions for condition and from_expression but it would be cleaner to just build an AST while parsing and do the semantic analysis later on the tree.

One option would be as follow, with examples in-line:
expression:
literal # 'abc', 1,
column # Table.Field1, Table.Profit
function call # ABS(...)
operator invocation # NOT field1, 2+3, Genres[0], Genres[1:2], Address.name
condition:
expression { = | != | < | <= | > | >= | IN } expression # Revenue > 0
expression IS [NOT] NULL # Revenue IS NOT NULL
condition { AND | OR } condition # Revenue > 0 AND Profit < 100

This is a comment that doesn't fit in the comments section.
I was hesitant to provide the link to the LiveSQL implementation, since it's specifically for Java language, it's open source, but we have never added any documentation whatsoever. Read at your own risk.
The main class is LiveSQL.java: in line 113 you can see the main variant of the select clause. It has many variants, but this is the one that allows the developer to include as many result set columns (expressions) as needed:
public SelectColumnsPhase<Map<String, Object>> select(
final ResultSetColumn... resultSetColumns) {
return new SelectColumnsPhase<Map<String, Object>>(this.sqlDialect,
this.sqlSession, this.liveSQLMapper, false,
resultSetColumns);
}
Of course, the SELECT clause has many other variants, that you can find in the same class if you explore it a bit. I think I was pretty exhaustive when I researched all variants. It should be [mostly] complete, except for non-standard SQL dialect variations that I didn't consider.
If you follow the QueryBuilder up the WHERE phase you can see how the predicate is assembled in the method where(final Predicate predicate) (line 101) of the SelectFrom.java class, as in:
public SelectWherePhase<R> where(final Predicate predicate) {
return new SelectWherePhase<R>(this.select, predicate);
}
As you can see the WHERE clause does not accept any type of expression. First of all, it only accepts a single expression, not a list. Second this expression must be a Predicate (boolean expression). Of course, this predicate can be as complex as you want, mixing all kinds of expressions and boolean logic. You can peek at the Predicate.java class to explore all the expressions that can be built.
Condition
Let's take as an example the Predicate class described above. If p and q are boolean expressions and a, b, c are of [mostly] any type you can express the condition as:
condition:
<predicate>
predicate:
p and q,
p or q,
not p,
( p ),
a == b,
a <> b,
a > b,
a < b,
a >= b,
a <= b,
a between b and c,
a not between b and c,
a in (b, c, ... )
a not in (b, c, ... )
Of course there are more operators but this gives you the gist of it.

Related

Add array of other records from the same table to each record

My project is a Latin language learning app. My DB has all the words I'm teaching, in the table 'words'. It has the lemma (the main form of the word), along with the definition and other information the user needs to learn.
I show one word at a time for them to guess/remember what it means. The correct word is shown along with some wrong words, like:
What does Romanus mean? Greek - /Roman/ - Phoenician - barbarian
What does domus mean? /house/ - horse - wall - senator
The wrong options are randomly drawn from the same table, and must be from the same part of speech (adjective, noun...) as the correct word; but I am only interested in their lemma. My return value looks like this (some properties omitted):
[
{ lemma: 'Romanus', definition: 'Roman', options: ['Greek', 'Phoenician', 'barbarian'] },
{ lemma: 'domus', definition: 'house', options: ['horse', 'wall', 'senator'] }
]
What I am looking for is a more efficient way of doing it than my current approach, which runs a new query for each word:
// All the necessary requires are here
class Word extends Model {
static async fetch() {
const words = await this.findAll({
limit: 10,
order: [Sequelize.literal('RANDOM()')],
attributes: ['lemma', 'definition'], // also a few other columns I need
});
const wordsWithOptions = await Promise.all(words.map(this.addOptions.bind(this)));
return wordsWithOptions;
}
static async addOptions(word) {
const options = await this.findAll({
order: [Sequelize.literal('RANDOM()')],
limit: 3,
attributes: ['lemma'],
where: {
partOfSpeech: word.dataValues.partOfSpeech,
lemma: { [Op.not]: word.dataValues.lemma },
},
});
return { ...word.dataValues, options: options.map((row) => row.dataValues.lemma) };
}
}
So, is there a way I can do this with raw SQL? How about Sequelize? One thing that still helps me is to give a name to what I'm trying to do, so that I can Google it.
EDIT: I have tried the following and at least got somewhere:
const words = await this.findAll({
limit: 10,
order: [Sequelize.literal('RANDOM()')],
attributes: {
include: [[sequelize.literal(`(
SELECT lemma FROM words AS options
WHERE "partOfSpeech" = "options"."partOfSpeech"
ORDER BY RANDOM() LIMIT 1
)`), 'options']],
},
});
Now, there are two problems with this. First, I only get one option, when I need three; but if the query has LIMIT 3, I get: SequelizeDatabaseError: more than one row returned by a subquery used as an expression.
The second error is that while the code above does return something, it always gives the same word as an option! I thought to remedy that with WHERE "partOfSpeech" = "options"."partOfSpeech", but then I get SequelizeDatabaseError: invalid reference to FROM-clause entry for table "words".
So, how do I tell PostgreSQL "for each row in the result, add a column with an array of three lemmas, WHERE existingRow.partOfSpeech = wordToGoInTheArray.partOfSpeech?"
Revised
Well that seems like a different question and perhaps should be posted that way, but...
The main technique remains the same. JOIN instead of sub-select. The difference being generating the list of lemmas for then piping then into the initial query. In a single this can get nasty.
As single statement (actually this turned out not to be too bad):
select w.lemma, w.defination, string_to_array(string_agg(o.defination,','), ',') as options
from words w
join lateral
(select defination
from words o
where o.part_of_speech = w.part_of_speech
and o.lemma != w.lemma
order by random()
limit 3
) o on 1=1
where w.lemma in( select lemma
from words
order by random()
limit 4 --<<< replace with parameter
)
group by w.lemma, w.defination;
The other approach build a small SQL function to randomly select a specified number of lemmas. This selection is the piped into the (renamed) function previous fiddle.
create or replace
function exam_lemma_definition_options(lemma_array_in text[])
returns table (lemma text
,definition text
,option text[]
)
language sql strict
as $$
select w.lemma, w.definition, string_to_array(string_agg(o.definition,','), ',') as options
from words w
join lateral
(select definition
from words o
where o.part_of_speech = w.part_of_speech
and o.lemma != w.lemma
order by random()
limit 3
) o on 1=1
where w.lemma = any(lemma_array_in)
group by w.lemma, w.definition;
$$;
create or replace
function exam_lemmas(num_of_lemmas integer)
returns text[]
language sql
strict
as $$
select string_to_array(string_agg(lemma,','),',')
from (select lemma
from words
order by random()
limit num_of_lemmas
) ll
$$;
Using this approach your calling code reduces to a needs a single SQL statement:
select *
from exam_lemma_definition_options(exam_lemmas(4))
order by lemma;
This permits you to specify the numbers of lemmas to select (in this case 4) limited only by the number of rows in Words table. See revised fiddle.
Original
Instead of using a sub-select to get the option words just JOIN.
select w.lemma, w.definition, string_to_array(string_agg(o.definition,','), ',') as options
from words w
join lateral
(select definition
from words o
where o.part_of_speech = w.part_of_speech
and o.lemma != w.lemma
order by random()
limit 3
) o on 1=1
where w.lemma = any(array['Romanus', 'domus'])
group by w.lemma, w.definition;
See fiddle. Obviously this will not necessary produce the same options as your questions provides due to random() selection. But it will get matching parts of speech. I will leave translation to your source language to you; or you can use the function option and reduce your SQL to a simple "select *".

Django model query with not equal exclusion

I would like to create query that has a not equal parameter, but in exclusion part.
All posts suggest to exclude the parameters that should be not equal.
So what I need is :
Object.objects.filter(param1=p).exclude(param2=False, param3=False, param4 != q)
I try
Object.objects.filter(param1=p).exclude(param2=False, param3=False).exclude(param4=q)
but, as expected, doesn't work it is the same as exclude(param2=False, param3=False, param4=q)
So how to do it, or how to rightly chain exclude statements, so that second exclude is pointed to first exclude and not first filter.
It's a kind of logical question.
Exclude (field1 != value1) is same as Include(field1 = value1)
So, you can use the filter() metbod as,
Object.objects.filter(param1=p, param4=q).exclude(param2=False, param3=False)
Or
Use Q() object with ~ symbol
from django.db.models import Q
Object.objects.filter(param1=p).exclude(~Q(param4=q), param2=False, param3=False)

Handle null values within SQL IN clause

I have following sql query in my hbm file. The SCHEMA, A and B are schema and two tables.
select
*
from SCHEMA.A os
inner join SCHEMA.B o
on o.ORGANIZATION_ID = os.ORGANIZATION_ID
where
case
when (:pass = 'N' and os.ORG_ID in (:orgIdList)) then 1
when (:pass = 'Y') then 1
end = 1
and (os.ORG_SYNONYM like :orgSynonym or :orgSynonym is null)
This is a pretty simple query. I had to use the case - when to handle the null value of "orgIdList" parameter(when null is passed to sql IN it gives error). Below is the relevant java code which sets the parameter.
if (_orgSynonym.getOrgIdList().isEmpty()) {
query.setString("orgIdList", "pass");
query.setString("pass", "Y");
} else {
query.setString("pass", "N");
query.setParameterList("orgIdList", _orgSynonym.getOrgIdList());
}
This works and give me the expected output. But I would like to know if there is a better way to handle this situation(orgIdList sometimes become null).
There must be at least one element in the comma separated list that defines the set of values for the IN expression.
In other words, regardless of Hibernate's ability to parse the query and to pass an IN(), regardless of the support of this syntax by particular databases (PosgreSQL doesn't according to the Jira issue), Best practice is use a dynamic query here if you want your code to be portable (and I usually prefer to use the Criteria API for dynamic queries).
If not need some other work around like what you have done.
or wrap the list from custom list et.

Using SQLDF to select specific values from a column

SQLDF newbie here.
I have a data frame which has about 15,000 rows and 1 column.
The data looks like:
cars
autocar
carsinfo
whatisthat
donnadrive
car
telephone
...
I wanted to use the package sqldf to loop through the column and
pick all values which contain "car" anywhere in their value.
However, the following code generates an error.
> sqldf("SELECT Keyword FROM dat WHERE Keyword="car")
Error: unexpected symbol in "sqldf("SELECT Keyword FROM dat WHERE Keyword="car"
There is no unexpected symbol, so I'm not sure whats wrong.
so first, I want to know all the values which contain 'car'.
then I want to know only those values which contain just 'car' by itself.
Can anyone help.
EDIT:
allright, there was an unexpected symbol, but it only gives me just car and not every
row which contains 'car'.
> sqldf("SELECT Keyword FROM dat WHERE Keyword='car'")
Keyword
1 car
Using = will only return exact matches.
You should probably use the like operator combined with the wildcards % or _. The % wildcard will match multiple characters, while _ matches a single character.
Something like the following will find all instances of car, e.g. "cars", "motorcar", etc:
sqldf("SELECT Keyword FROM dat WHERE Keyword like '%car%'")
And the following will match "car" or "cars":
sqldf("SELECT Keyword FROM dat WHERE Keyword like 'car_'")
This has nothing to do with sqldf; your SQL statement is the problem. You need:
dat <- data.frame(Keyword=c("cars","autocar","carsinfo",
"whatisthat","donnadrive","car","telephone"))
sqldf("SELECT Keyword FROM dat WHERE Keyword like '%car%'")
# Keyword
# 1 cars
# 2 autocar
# 3 carsinfo
# 4 car
You can also use regular expressions to do this sort of filtering. grepl returns a logical vector (TRUE / FALSE) stating whether or not there was a match or not. You can get very sophisticated to match specific items, but a basic query will work in this case:
#Using #Joshua's dat data.frame
subset(dat, grepl("car", Keyword, ignore.case = TRUE))
Keyword
1 cars
2 autocar
3 carsinfo
6 car
Very similar to the solution provided by #Chase. Because we do not use subset we do not need a logical vector and can use both grep or grepl:
df <- data.frame(keyword = c("cars", "autocar", "carsinfo", "whatisthat", "donnadrive", "car", "telephone"))
df[grep("car", df$keyword), , drop = FALSE] # or
df[grepl("car", df$keyword), , drop = FALSE]
keyword
1 cars
2 autocar
3 carsinfo
6 car
I took the idea from Selecting rows where a column has a string like 'hsa..' (partial string match)

Nested lambda expression translated from SQL statement

Using lambda expressions, how do you translate this query?
select * from employee where emp_id=1 and dep_id in (1,2,3,4).
I am trying this expression but this results in exceptions:
public IEnumrable<employees> getemployee(int emp,list<int> dep )
{
_employeeService.GetAll(e=>(e.emp_id=emp||emp==null) && (e.dep_id.where(dep.contains(e.dep_id))|| dep.count==0 )
}
Any suggestion for translating these queries to these lambda expressions?
wahts wrong in these function?
I don't mean to sound patronizing, but I'd suggest you pick up a beginner's C# book before going further. You've made errors in basic syntax such as operator precedence and equality comparison.
Also, remember that primitive data types cannot be checked for null, unless they're explicitly specified as nullable. In your case emp would be of type int?,not int.
Meanwhile, try
_employeeService.Where(e=>(e.emp_id==emp) && (dep == null || dep.contains(e.dep_id)) );