Distinct elements in django model - sql

I'm a Django newbie and I wonder if there is a more efficient way (at a database level) of doing the following.
I have the model:
class Foo(models.Model):
    item = models.IntegerField()
    another_item = models.IntegerField()
And want to get an iterable of all the distinct values of "item".
This is what I have so far:
distinct=set([row.item for row in Foo.objects.all()])
This is easy to understand. But if I'm understanding how Django works then the SQL query is not very efficient because it is something like:
SELECT * FROM DB
when I only need:
SELECT DISTINCT item FROM DB
Any way of doing this more efficiently?

You want to use the distinct clause in combination with the values or values_list clauses.
The QuerySet API reference in the Django docs covers distinct, values and values_list.
So you could do:
Foo.objects.values_list('item', flat=True).distinct()
And that would return a flat QuerySet of the distinct item values, matching your SELECT DISTINCT item FROM DB query.
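To make the difference concrete, here is a minimal sketch using Python's built-in sqlite3 module rather than Django itself. The table name foo and the sample rows are assumptions for illustration; the two queries only approximate what the ORM would send.

```python
import sqlite3

# Hypothetical stand-in for the Foo model's table; the name "foo" is assumed.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE foo (id INTEGER PRIMARY KEY, item INTEGER, another_item INTEGER)"
)
conn.executemany(
    "INSERT INTO foo (item, another_item) VALUES (?, ?)",
    [(1, 10), (2, 20), (1, 30), (3, 40), (2, 50)],
)

# Roughly what the set(...) approach does: fetch every column of every row,
# then de-duplicate in Python. Column index 1 is "item".
in_python = {row[1] for row in conn.execute("SELECT * FROM foo")}

# Roughly what values_list('item', flat=True).distinct() asks the database
# to do: only the distinct values cross the wire.
in_database = [row[0] for row in conn.execute("SELECT DISTINCT item FROM foo")]
```

Both approaches yield the same values; the second one just moves the de-duplication into the database.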

Related

Counting unique text in query

I need to count how many distinct shipping methods there are in a query (the answer is 2). I am trying to use DISTINCT but it doesn't seem to be working the way I thought it would.
SELECT DISTINCT Count(Order.ship_method) AS CountOfship_method
FROM [Order];
Try this instead -
SELECT COUNT(*) as CountOfship_method
FROM
(SELECT DISTINCT Order.ship_method FROM [Order]);
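A small sketch of why the first query misbehaves, run against SQLite from Python. The sample rows are made up. The bracketed [Order] suggests Access, which as far as I know lacks COUNT(DISTINCT ...), hence the subquery; engines that do support it (SQLite among them) give the same answer either way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# "Order" is a reserved word, so it is quoted here; the rows are invented.
conn.execute('CREATE TABLE "Order" (id INTEGER PRIMARY KEY, ship_method TEXT)')
conn.executemany(
    'INSERT INTO "Order" (ship_method) VALUES (?)',
    [("ground",), ("air",), ("ground",), ("air",), ("ground",)],
)

# DISTINCT is applied to the single aggregate row, so this counts all rows.
wrong = conn.execute(
    'SELECT DISTINCT COUNT(ship_method) FROM "Order"'
).fetchone()[0]

# De-duplicate first in a subquery, then count the result.
right = conn.execute(
    'SELECT COUNT(*) FROM (SELECT DISTINCT ship_method FROM "Order")'
).fetchone()[0]

# Same result where COUNT(DISTINCT ...) is available.
also_right = conn.execute(
    'SELECT COUNT(DISTINCT ship_method) FROM "Order"'
).fetchone()[0]
```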

Using multiple nested fields in BigQuery

I have some records that have information about stores. These records have several different nested fields. One of the nested fields is tags and one is employees. I am trying to get a count of the number of stores that have a tag and an employee with a certain name. So I did this:
SELECT count(*)
FROM [stores.stores_844_1]
where tags.tag_name='foo'
and employees.first_name='bar'
Then I get the error:
Error: Cannot query the cross product of repeated fields tags.tag_name and employees.first_name.
I can make it work by changing the query to:
SELECT count(*)
FROM (flatten([stores.stores_844_1],tags))
where tags.tag_name='foo'
and employees.first_name='bar'
The problem with this is that I am dynamically creating the where clause and so my from clause will have to change depending on what I have in the where. While I could generate some logic in code to figure out what the from clause should be, I was wondering if there is a way to do something like:
SELECT count(*)
FROM [stores.stores_844_1]
where tags.tag_name='foo' WITHIN RECORD
and employees.first_name='bar' WITHIN RECORD
That would not have to flatten the main table?
I have tried using an ugly workaround like this:
SELECT count(*)
FROM
(SELECT GROUP_CONCAT(CONCAT('>', tags.tag_name,'<')) WITHIN RECORD as f1, GROUP_CONCAT(CONCAT('>',employees.first_name,'<')) WITHIN RECORD as f2
FROM [stores.stores_844_1]
)
where f1 CONTAINS '>foo<'
and f2 CONTAINS '>bar<'
This ugly workaround works how I want it to, but it just seems really hacky and ugly and there must be a better way, right?
You can use WITHIN RECORD to come up with another field that indicates whether the values are present. I'm not sure if this meets your requirements, since you still have to change the FROM clause, but it seems cleaner than what you are currently doing. In other words, try this:
SELECT count(*) FROM (
SELECT SUM(IF(tags.tag_name='foo', 1, 0)) WITHIN RECORD as has_foo,
SUM(IF(employees.first_name='bar', 1, 0)) WITHIN RECORD as has_bar
FROM [stores.stores_844_1])
WHERE has_foo > 0 AND has_bar > 0
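The logic of that query can be sketched in plain Python. This is only an emulation of the per-record semantics, not BigQuery code; the stores list and the count_matching helper are hypothetical.

```python
# Hypothetical store records, each with two repeated (nested) fields.
stores = [
    {"tags": [{"tag_name": "foo"}, {"tag_name": "x"}],
     "employees": [{"first_name": "bar"}]},
    {"tags": [{"tag_name": "foo"}],
     "employees": [{"first_name": "amy"}]},
    {"tags": [{"tag_name": "y"}],
     "employees": [{"first_name": "bar"}, {"first_name": "bob"}]},
    {"tags": [{"tag_name": "foo"}, {"tag_name": "foo"}],
     "employees": [{"first_name": "bar"}, {"first_name": "bar"}]},
]


def count_matching(records, tag_name, first_name):
    """Count records that have at least one matching tag and one matching employee."""
    matching = 0
    for record in records:
        # Per-record sums, mirroring SUM(IF(...)) WITHIN RECORD.
        has_tag = sum(1 for t in record["tags"] if t["tag_name"] == tag_name)
        has_emp = sum(1 for e in record["employees"] if e["first_name"] == first_name)
        # Mirrors the outer WHERE has_foo > 0 AND has_bar > 0.
        if has_tag > 0 and has_emp > 0:
            matching += 1
    return matching


result = count_matching(stores, "foo", "bar")
```

Because each repeated field is aggregated inside its own record, a store with duplicate matches is still counted once, and no cross product of the two repeated fields is ever formed.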

SQL pattern to get "and" list of multiple-row matches?

I'm not a database programmer, but I have a simple database-backed app where I have items with tags. Each item may have multiple tags, so I'm using a typical junction table (like this), where each row represents the fact that the item with the appropriate ID has the tag with the appropriate ID.
This works very logically when I want to do something like select all items with a given tag.
But, what is the typical pattern for doing AND searches? That is, what if I want to find all items which have all of a certain set of tags? This is such a common operation that I'd think some of the intro tutorials would cover it, but I guess I'm not looking in the right places.
The approach I tried was to use INTERSECT, first directly and then with subqueries and IN. This works, but builds up long-seeming queries quickly as I add search terms. And, crucially, this approach appears to be about an order of magnitude slower than the approach of shoving all the tags as text into one "tags" column and using SQLite's full-text search. (And, as I would expect/hope, the FTS search gets faster as I add more terms, which doesn't seem to be the case with the INTERSECTS approach.)
What's the proper design pattern here, and what's the right way to make it snappy? I'm using SQLite in this case, but I'm most interested in a general answer, since this must be a common thing to do.
The following is the standard ANSI SQL solution. The tag ids are listed only once, so you never have to keep the list of ids and a hard-coded count of them in sync.
with tag_ids (tid) as (
values (1), (2)
)
select id
from tags
where id in (select tid from tag_ids)
group by id
having count(*) = (select count(*) from tag_ids);
The values clause ("row constructor") is supported by PostgreSQL and DB2. For databases that don't support it, you can replace it with a simple "select", e.g. in Oracle this would be:
with tag_ids (tid) as (
select 1 as tid from dual
union all
select 2 from dual
)
select id
from tags
where id in (select tid from tag_ids)
group by id
having count(*) = (select count(*) from tag_ids);
For SQL Server you would simply leave out the "from dual", as it does not require a FROM clause for a SELECT.
This assumes that one tag can only be assigned exactly once. If that isn't the case, you would need to use a count(distinct id) in the having clause.
I would be inclined to use a group by:
select id
from tags
where id in (<tag1>, <tag2>)
group by id
having count(*) = 2
This would guarantee that both appear.
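Here is a runnable sketch of the group by / having pattern against SQLite. The item_tags table, its item_id and tag_id columns, and the items_with_all_tags helper are assumptions for illustration, since the question doesn't show its schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical junction table: one row per (item, tag) pair, no duplicates.
conn.execute(
    "CREATE TABLE item_tags ("
    "item_id INTEGER, tag_id INTEGER, PRIMARY KEY (item_id, tag_id))"
)
conn.executemany(
    "INSERT INTO item_tags VALUES (?, ?)",
    [(1, 1), (1, 2), (1, 3), (2, 1), (3, 2), (4, 1), (4, 2)],
)


def items_with_all_tags(conn, tag_ids):
    """Return ids of items that carry every tag in tag_ids."""
    tag_ids = sorted(set(tag_ids))
    if not tag_ids:
        return []
    placeholders = ", ".join("?" for _ in tag_ids)
    sql = (
        "SELECT item_id FROM item_tags "
        f"WHERE tag_id IN ({placeholders}) "
        "GROUP BY item_id "
        "HAVING COUNT(*) = ? "  # every requested tag must have matched
        "ORDER BY item_id"
    )
    return [row[0] for row in conn.execute(sql, [*tag_ids, len(tag_ids)])]


both = items_with_all_tags(conn, [1, 2])
```

The query stays the same length no matter how many tags are requested; only the placeholder list and the expected count grow. The primary key on (item_id, tag_id) is what makes COUNT(*) safe here; without it, COUNT(DISTINCT tag_id) would be the cautious choice.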
For an unlimited size list, you could store the ids in a string, such as '|tag1|tag2|tag3|' (note delimiters on ends). Then you can do:
select id
from tags
where @taglist like '%|'+tag+'|%'
group by id
having count(*) = len(@taglist) - len(replace(@taglist, '|', '')) - 1
This uses SQL Server syntax, but it says two things. The WHERE clause says that the tag is in the list. The HAVING clause says that the number of matches equals the number of tags in the list. It gets that number with a trick: counting the separators and subtracting 1.
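The separator-counting trick is easy to check outside the database. A tiny Python sketch, using the example string from above:

```python
taglist = "|tag1|tag2|tag3|"

# Removing the separators shortens the string by one character per separator.
separators = len(taglist) - len(taglist.replace("|", ""))

# With a delimiter on both ends there is one more separator than there are tags.
tag_count = separators - 1

# The LIKE '%|tag|%' test is a substring check on the delimited tag.
contains_tag2 = "|tag2|" in taglist
contains_tag = "|tag|" in taglist  # "tag" alone is not in the list
```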

The used SELECT statements have a different number of columns

For example, I don't know how many rows are in each table, and I try to do something like this:
SELECT * FROM members
UNION
SELECT * FROM inventory
What can I put in the second SELECT instead of * to remove this error without adding NULLs?
Put the column names explicitly rather than *, and make sure the number of columns and the data types match for the same column in each select.
Update:
I really don't think you want to be UNIONing those tables, based on the table names. They don't seem to contain related data. If you post your schema and describe what you are trying to achieve, it is likely we can provide better help.
You could do
SELECT *
from members
UNION
SELECT inventory.*, 'dummy1' AS membersCol1, 'dummy2' AS membersCol2
from inventory;
Where membersCol1, membersCol2, etc. are the names of columns from members that are not in inventory. That way both queries in the union will have the same columns (assuming that all the columns in inventory are the same as in members, which seems very strange to me... but hey, it's your schema).
UPDATE:
As HLGEM pointed out, this will only work if inventory has columns with the same names as members, and in the same order. Naming all the columns explicitly is the best idea, but since I don't know the names I can't exactly do that. If I did, it might look something like this:
SELECT id, name, member_role, member_type
from members
UNION
SELECT id, name, '(dummy for union)' AS member_role, '(dummy for union)' AS member_type
from inventory;
I don't like using NULL for dummy values because then it's not always clear which part of the union a record came from - using 'dummy' makes it clear that the record is from the part of the union that didn't have that record (though sometimes this might not matter). The very idea of unioning these two tables seems very strange to me because I very much doubt they'd have more than 1 or 2 columns with the same name, but you asked the question in such a way that I imagine in your scenario this somehow makes sense.
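A quick way to see both the error and the fix is to run them against SQLite from Python. The two-column inventory table and the sample rows are assumptions, as is the exact column list; other engines word the error differently (the question's title is MySQL's wording).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schemas: members has two columns that inventory lacks.
conn.execute(
    "CREATE TABLE members (id INTEGER, name TEXT, member_role TEXT, member_type TEXT)"
)
conn.execute("CREATE TABLE inventory (id INTEGER, name TEXT)")
conn.execute("INSERT INTO members VALUES (1, 'alice', 'admin', 'full')")
conn.execute("INSERT INTO inventory VALUES (7, 'widget')")

# SELECT * on both sides: 4 columns vs 2 columns, so the UNION is rejected.
try:
    conn.execute("SELECT * FROM members UNION SELECT * FROM inventory")
    mismatch_error = None
except sqlite3.OperationalError as exc:
    mismatch_error = str(exc)

# Explicit column lists with dummy values pad the shorter side.
rows = conn.execute(
    "SELECT id, name, member_role, member_type FROM members "
    "UNION "
    "SELECT id, name, '(dummy for union)', '(dummy for union)' FROM inventory "
    "ORDER BY id"
).fetchall()
```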
Are you sure you don't want a join instead? It is unlikely that UNION will give you what you want given the table names.
Try this
(SELECT * FROM members) ;
(SELECT * FROM inventory);
Just add semicolons after both the select statements and don't use union or anything else. This solved my error.
"I don't know how many rows in each table"
Are you sure this isn't what you want?
SELECT 'members' AS TableName, Count(*) AS Cnt FROM members
UNION ALL
SELECT 'inventory', Count(*) FROM inventory
Each SELECT statement within the MySQL UNION ALL operator must have the same number of fields in the result sets, with similar data types.
See https://www.techonthenet.com/mysql/union_all.php

NHibernate use Criteria for Count(),First()

I have a question about the Criteria API. I have a one-to-many relation in the database: the 'one' side is "account" and the 'many' side is "sites". When I use CreateCriteria(), something is not right.
Like this: SessionFactory.OpenSession().CreateCriteria(typeof(Account)).List().Count();
Before running it, I expected the SQL to be SELECT COUNT(*) FROM table, but the SQL is actually SELECT id, siteurl...FROM table. What's wrong here, and how can I fix it?
Likewise, I expected the First() method to produce SELECT TOP 1 ...FROM table, but it produces SELECT ...FROM table.
I'm an NHibernate rookie, please help me.
This happens because the Count method you are calling at the end executes after the query has run and outside of the database. You are only counting the elements in the list in memory. To achieve what you are looking for you could use a projection:
var count = session
    .CreateCriteria<Account>()
    .SetProjection(
        Projections.Count(Projections.Id())
    )
    .UniqueResult<long>();
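The difference between counting in memory and counting in the database is not specific to NHibernate. Here is a rough sketch in Python with SQLite; the account table and its rows are made up, and LIMIT 1 is SQLite's spelling of what SQL Server writes as TOP 1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical account table with a few rows.
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, siteurl TEXT)")
conn.executemany(
    "INSERT INTO account (siteurl) VALUES (?)",
    [("a.example",), ("b.example",), ("c.example",), ("d.example",)],
)

# List().Count() style: every row is loaded, then counted in memory.
count_in_memory = len(conn.execute("SELECT id, siteurl FROM account").fetchall())

# Projection style: the database does the counting and returns one number.
count_in_database = conn.execute("SELECT COUNT(id) FROM account").fetchone()[0]

# First() style done in the database: ask for one row only.
first_row = conn.execute(
    "SELECT id, siteurl FROM account ORDER BY id LIMIT 1"
).fetchone()
```

For the First() half of the question, the same reasoning applies: the limit has to be part of the query. As far as I know, calling SetMaxResults(1) on the criteria before fetching is the usual way to do that.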