I need to run the following query on a Django model:
SELECT *
FROM app_model
WHERE GREATEST(field1, fixed_value) < LEAST(field2, another_fixed_value)
Is there any way for me to run this query without resorting to the raw() method?
You can at least avoid raw() by using extra(). I don't think the ORM otherwise exposes GREATEST or LEAST.
In theory you could break down your constraint into its different possibilities and or them together:
from django.db.models import Q, F

mymodel.objects.filter(
    Q(field1__gt=fixed_value, field2__lt=another_fixed_value, field1__lt=F('field2')) |
    Q(field1__lte=fixed_value, field2__lt=another_fixed_value, field2__gt=fixed_value) |
    Q(field1__gt=fixed_value, field2__gte=another_fixed_value, field1__lt=another_fixed_value) |
    Q(field1__lte=fixed_value, field2__gte=another_fixed_value))  # plus the fixed_value < another_fixed_value check
Except, obviously, you wouldn't actually include that fixed_value < another_fixed_value check in the query itself. If the values are literally fixed and known when you write the code, you'd just keep the first two comparisons in that branch; if they aren't known, only build the last Q object and OR it into the query when the check holds.
That said, that's horrible and I think extra is a much better choice.
mymodel.objects.extra(where=['GREATEST(field1, fixed_value) < LEAST(field2, another_fixed_value)'])
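For what it's worth, the four-branch decomposition above is just GREATEST/LEAST written out case by case. A quick plain-Python sketch (with made-up integer values) confirms the two forms agree:

```python
from itertools import product

def direct(f1, f2, fixed, other):
    # GREATEST(field1, fixed) < LEAST(field2, other)
    return max(f1, fixed) < min(f2, other)

def decomposed(f1, f2, fixed, other):
    # The four OR'd branches from the Q-object version above
    return ((f1 > fixed and f2 < other and f1 < f2) or
            (f1 <= fixed and f2 < other and f2 > fixed) or
            (f1 > fixed and f2 >= other and f1 < other) or
            (f1 <= fixed and f2 >= other and fixed < other))

# Exhaustively check all small integer inputs
assert all(direct(*args) == decomposed(*args)
           for args in product(range(5), repeat=4))
```

If I remember correctly, newer Django versions (1.9+) also ship Greatest and Least in django.db.models.functions, which would let you annotate() both sides and filter on them without resorting to extra at all.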
Take a look at field lookups:
Model.objects.filter(id__gt=4)
is equivalent to:
SELECT ... WHERE id > 4;
And less than:
Model.objects.filter(id__lt=4)
is equivalent to:
SELECT ... WHERE id < 4;
Related
We're accessing a view from another team's database and to make it a lot simpler, the view looks a bit like this:
create view x_view as
select
x.exec_time,
...
from
stuff x
where
x.exec_time > SYSDATE -2
and
...
;
and when accessing the view, we further filter on the same column:
select
*
from
x_view x
where
trunc(x.exec_time) = %1
and
...
;
Since I'd rather not change the view, but still get our query done fast and with a stable execution plan, I want to tell them which indexes would be beneficial. But how do I deal with those two predicates on the date field? I see these 3 options:
add exec_time AND trunc(exec_time) to the index
only exec_time in the index
only trunc(exec_time) in the index
or is this construct so problematic that we should rather make a different view?
EDIT: I believe it's oracle 11.2
Change your query to be:
where x.exec_time >= %1
and x.exec_time < %1 + 1
and then a single index on EXEC_TIME should do the trick (obviously we are not considering other predicates you might have here)
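The reason the rewrite helps: trunc(exec_time) = day is equivalent to the half-open range day <= exec_time < day + 1, and the range form can use a plain index on the column instead of requiring a function-based index. A small Python sketch of that equivalence (hypothetical timestamps):

```python
from datetime import datetime, timedelta

def trunc(ts):
    # Oracle-style TRUNC on a date: drop the time-of-day portion
    return ts.replace(hour=0, minute=0, second=0, microsecond=0)

day = datetime(2012, 5, 8)

def trunc_form(ts):
    # trunc(exec_time) = day
    return trunc(ts) == day

def range_form(ts):
    # Sargable half-open range: day <= exec_time < day + 1
    return day <= ts < day + timedelta(days=1)

samples = [day, day + timedelta(hours=13), day + timedelta(days=1),
           day - timedelta(seconds=1)]
assert all(trunc_form(ts) == range_form(ts) for ts in samples)
```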
I'm analyzing the code of someone else and stumbled on a query like this:
SELECT ...
FROM ...
GROUP BY date_field + 1
What's the difference between this query and one such as the following?
SELECT ...
FROM ...
GROUP BY date_field
If the result to both queries is the same, are they different in terms of performance? Thanks in advance.
There is no difference logically. The original code probably has something like:
select date_field + 1, . . .
from . . .
group by date_field + 1;
The group by key then directly matches the expression in the select. However, this is not necessary: SQL supports expressions in group by keys, and + 1 does not affect the outcome.
It might affect the execution plan, however. I am not overly familiar with the optimization engine in Teradata, but two things come to mind:
The engine may not recognize that it could use an index on date_field.
The engine may get confused on statistics for the aggregation column, choosing a suboptimal aggregation algorithm.
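The logical equivalence is easy to see: adding a constant is a one-to-one mapping, so grouping by date_field + 1 partitions the rows exactly the same way as grouping by date_field. A quick sketch with dates stood in by integers:

```python
from collections import defaultdict

# Hypothetical rows: (date_field as an integer, amount)
rows = [(20240101, 10), (20240102, 5), (20240101, 7)]

def partition(rows, key):
    # Group the amounts by the given key function, ignoring key labels
    groups = defaultdict(list)
    for day, amount in rows:
        groups[key(day)].append(amount)
    return sorted(groups.values())

# key(x) = x and key(x) = x + 1 are both injective, so the partitions match
assert partition(rows, lambda d: d) == partition(rows, lambda d: d + 1)
```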
Which operator in oracle gives better performance IN or OR
ex:
select * from table where y in (1,2,3)
or
select * from table where y = 1 or y = 2 or y = 3
You'd want to do an explain plan to be sure, but I would expect the performance to be identical.
The two statements are equivalent, the optimizer will generate the same access path. Therefore you should choose the one that is the most readable (I'd say the first one).
I would hesitate to use OR like that. You need to be careful if you add additional criteria; for instance, adding an AND will require you to remember to add parentheses.
eg:
select * from table where y = 1 or y = 2 or y = 3
gets changed to:
select * from table where ( y = 1 or y = 2 or y = 3 ) AND x = 'value'
It is quite easy to forget to include the parentheses and inject a difficult-to-diagnose bug. For maintainability alone I would strongly suggest using IN instead of OR.
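Python happens to share SQL's precedence here (AND binds tighter than OR), so the same bug is easy to demonstrate with hypothetical values:

```python
y, x = 2, "other"

# Forgetting the parentheses: AND binds tighter than OR, so this reads as
#   y == 1 OR y == 2 OR (y == 3 AND x == 'value')
buggy = y == 1 or y == 2 or y == 3 and x == "value"

# What was actually intended
fixed = (y == 1 or y == 2 or y == 3) and x == "value"

assert buggy is True    # the y == 2 branch slips past the x filter entirely
assert fixed is False
```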
In a simple query like yours, the optimizer is smart enough to treat them both 100% the same so they are identical.
HOWEVER, that is potentially not 100% the case.
E.g. when optimizing large complex joins, it is plausible that the optimizer will not equate the two approaches as intelligently, thus choosing the wrong plan. I have observed a somewhat similar problem on Sybase, although I don't know if it exists in Oracle (hence my "potentially" qualifier).
Ok so I'm having a bit of a learning moment here, and after figuring out a way to get this to work, I'm curious if anyone with a bit more Postgres experience could help me figure out a way to do this without a whole lot of behind-the-scenes Rails work (or doing a single query for each item I'm trying to get)... now for an explanation:
Say I have 1000 records, we'll call them "Instances", in the database that have these fields:
id
user_id
other_id
I want to create a method that I can call that pulls in 10 instances that all have a unique other_id field, in plain english (I realize this won't work :) ):
Select * from instances where user_id = 3 and other_id is unique limit 10
So instead of pulling in an array of 10 instances where user_id is 3, where several of them might share the same other_id (say, 5), I want to be able to map over those 10 instances and get back 10 distinct other_ids, something like [1,2,3,4,5,6,7,8,9,10].
In theory, I can probably do one of two things currently, though I'm trying to avoid them:
Store an array of id's and do individual calls making sure the next call says "not in this array". The problem here is I'm doing 10 individual db queries.
Pull in a large chunk of say, 50 instances and sorting through them in ruby-land to find 10 unique ones. This wouldn't allow me to take advantage of any optimizations already done in the database and I'd also run the risk of doing a query for 50 items that don't have 10 unique other_id's and I'd be stuck with those unless I did another query.
Anyways, hoping someone may be able to tell me I'm overlooking an easy option :) I know this is kind of optimizing before it's really needed, but this function is going to be run over and over again, so I figure it's not a waste of time right now.
For the record, I'm using Ruby 1.9.3, Rails 3.2.13, and Postgresql (Heroku)
Thanks!
EDIT: Just wanted to give an example of a function that technically DOES work (and is number 1 above)
def getInstances(limit, user)
  available = []
  other_ids = [-1] # seeded with -1 so the ARRAY[...] in the query is never empty
  until available.length == limit
    instance = Instance.where("user_id = ? AND other_id <> ALL (ARRAY[?])", user.id, other_ids).limit(1)
    break if instance.empty? # ran out of instances with unseen other_ids
    available << instance.first
    other_ids << instance.first.other_id
  end
  available
end
And you would run:
getInstances(10, current_user)
While this works, it's not ideal because it's leading to 10 separate queries every time it's called :(
In a single SQL query, it can be achieved easily with SELECT DISTINCT ON... which is a PostgreSQL-specific feature.
See http://www.postgresql.org/docs/current/static/sql-select.html
SELECT DISTINCT ON ( expression [, ...] ) keeps only the first row of each set of rows where the given expressions evaluate to equal. The DISTINCT ON expressions are interpreted using the same rules as for ORDER BY (see above). Note that the "first row" of each set is unpredictable unless ORDER BY is used to ensure that the desired row appears first.
With your example:
SELECT DISTINCT ON (other_id) *
FROM instances
WHERE user_id = 3
ORDER BY other_id LIMIT 10
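If it helps to see what DISTINCT ON does procedurally: it keeps the first row per key, in ORDER BY order. A plain-Python sketch of the same "first row per other_id" pattern, using made-up rows:

```python
def distinct_on(rows, key, limit=None):
    # Keep only the first row seen for each key, mimicking SELECT DISTINCT ON
    seen, out = set(), []
    for row in rows:
        k = key(row)
        if k not in seen:
            seen.add(k)
            out.append(row)
            if limit is not None and len(out) == limit:
                break
    return out

instances = [
    {"id": 1, "user_id": 3, "other_id": 5},
    {"id": 2, "user_id": 3, "other_id": 5},   # duplicate other_id, skipped
    {"id": 3, "user_id": 3, "other_id": 7},
]
rows = [r for r in instances if r["user_id"] == 3]   # WHERE user_id = 3
rows = sorted(rows, key=lambda r: r["other_id"])     # ORDER BY other_id
picked = distinct_on(rows, lambda r: r["other_id"], limit=10)
assert [r["other_id"] for r in picked] == [5, 7]
```

The database does all of this in one pass, which is exactly why the single DISTINCT ON query beats the loop of 10 separate queries.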
All,
First time poster on here. I am trying to create a query that excludes data that falls within a date range and also has certain status codes.
Date:
(substring(CAST(UTCBigintToUTCTime(starttime) as varchar(19)),0,11) not between '2012-05-08%' and '2012-05-10%')
Status Code:
statuscode NOT IN ('58','59')
What would my statement look like to exclude data that meets BOTH of those conditions? Everything I do excludes all in that date range and all in the status code range.
Thanks in advance. SQL newbie but learning :).
It seems to me that you're over-thinking it a bit, and making it more complex than it needs to be.
And making it this complex, especially with negative logic, will also make it perform poorly.
How about something like:
select * from myTable where (starttime < '2012-05-08' or starttime > '2012-05-10') and (statuscode < 58 or statuscode > 59)
Not sure what database you are using, or exactly what your data types are - adjust slightly as necessary, but try to stay away from nasty date/string conversions and 'NOT' conditions wherever possible.
try this
select * from myTable where convert(date,starttime) not between '2012-05-08' and '2012-05-10' and statuscode not in (58,59)
and let me know.
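One thing worth double-checking against the original question: "exclude rows that meet BOTH conditions" is NOT (A AND B), which by De Morgan's law is NOT A OR NOT B. ANDing the two negated conditions, as both answers do, instead excludes rows that meet either condition, which is a strictly stricter filter. A small truth-table check:

```python
from itertools import product

for in_date_range, bad_status in product([True, False], repeat=2):
    # Keep a row unless it meets BOTH conditions
    keep_if_not_both = not (in_date_range and bad_status)
    # De Morgan's law: NOT (A AND B) == (NOT A) OR (NOT B)
    de_morgan = (not in_date_range) or (not bad_status)
    # What the answers express: (NOT A) AND (NOT B)
    and_of_negations = (not in_date_range) and (not bad_status)

    assert keep_if_not_both == de_morgan
    # The AND form never keeps a row the OR form would drop
    assert and_of_negations <= keep_if_not_both
```

So if rows matching only one of the two conditions should survive, the WHERE clause needs OR between the negated predicates, not AND.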