I am modifying a query for a sub-report in Access 2016 and need to select a set of rows, not all rows. By default the query generated looks like:
SELECT table_name.a, table_name.b, table_name.c
FROM table_name
WHERE (((table_name.dist_ft)<3001));
How can I select only rows m through n instead of all rows?
Thanks for your insights! ... [edit]
An additional clarification - when I run the query like
SELECT TOP 16 *
FROM table_name
WHERE (((table_name.dist_ft)<3001));
... or any other variation I've tried with TOP my sub-report does not get populated. It only contains data when all fields are selected and TOP is not used. I must be missing something.
Is ID your "row number" here?
SELECT table_name.a, table_name.b, table_name.c
FROM table_name
WHERE table_name.dist_ft<3001
AND table_name.ID>=m
AND table_name.ID<=n
;
Updated with more general case based on comments-
Select table_name.a, table_name.b, table_name.c from tablename
where tablename.id in
(select top n tablename.id from tablename)
and tablename.id not in
(select top m tablename.id from tablenane)
Records m to n are the records 1 to n minus the records 1 to m-1. Be aware though that you need an ORDER BY clause for a TOP clause to make sense.
Here is an example with m = 31 to n = 40 and an order by all three selected columns. MS Access does not support EXCEPT so we cannot subtract the two data sets, which would be the straight-forward way to go. We could also express the desired result as the top n where (a,b,c) not in top m-1, but MS Access does not support an IN clause on multiple columns either. So I am using an anti join here (for which I select dist_ft, but it can be any non-nullable column of the table).
In case your table has a unique ID column, you can use a more readable where (id) not in (select top 30 id ...) instead of an anti join. In any way make sure to apply the same WHERE clause (dist_ft < 3001 in your case) and ORDER BY clause (e.g. ORDER BY a, b, c) to the main query and subquery.
SELECT TOP 40 a, b, c
FROM table_name t
LEFT JOIN
(
SELECT TOP 30 a, b, c, dist_ft
FROM table_name
WHERE dist_ft < 3001
ORDER BY a, b, c
) no ON no.a = t.a AND no.b = t.b AND no.c = t.c
WHERE t.dist_ft < 3001
AND no.dist_ft is null
ORDER BY t.a, t.b, t.c;
MS Access is known for requiring additional parentheses on multiple joins. I cannot say whether above query works straight away or if parantheses must be added somewhere.
You sort thrice to get the result you are after. Let's say we want rows 31 to 40:
Sort and get top 40
sort in reverse order and get top 10
sort again to get the order you actually want
The query:
SELECT a, b, c
FROM
SELECT TOP 10 a, b, c
FROM
(
SELECT TOP 40 a, b, c
FROM table_name
WHERE t.dist_ft < 3001
ORDER BY a, b, c
) top_n
ORDER BY a desc, b desc, c desc
) top_m_to_n
ORDER BY a, b, c;
Related
Supposed you have a table T(A) with only positive integers allowed, like:
1,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18
In the above example, the result is 10. We always can use ORDER BY and DISTINCT to sort and remove duplicates. However, to find the lowest integer not in the list, I came up with the following SQL query:
select list.x + 1
from (select x from (select distinct a as x from T order by a)) as list, T
where list.x + 1 not in T limit 1;
My idea is start a counter and 1, check if that counter is in list: if it is, return it, otherwise increment and look again. However, I have to start that counter as 1, and then increment. That query works most of the cases, by there are some corner cases like in 1. How can I accomplish that in SQL or should I go about a completely different direction to solve this problem?
Because SQL works on sets, the intermediate SELECT DISTINCT a AS x FROM t ORDER BY a is redundant.
The basic technique of looking for a gap in a column of integers is to find where the current entry plus 1 does not exist. This requires a self-join of some sort.
Your query is not far off, but I think it can be simplified to:
SELECT MIN(a) + 1
FROM t
WHERE a + 1 NOT IN (SELECT a FROM t)
The NOT IN acts as a sort of self-join. This won't produce anything from an empty table, but should be OK otherwise.
SQL Fiddle
select min(y.a) as a
from
t x
right join
(
select a + 1 as a from t
union
select 1
) y on y.a = x.a
where x.a is null
It will work even in an empty table
SELECT min(t.a) - 1
FROM t
LEFT JOIN t t1 ON t1.a = t.a - 1
WHERE t1.a IS NULL
AND t.a > 1; -- exclude 0
This finds the smallest number greater than 1, where the next-smaller number is not in the same table. That missing number is returned.
This works even for a missing 1. There are multiple answers checking in the opposite direction. All of them would fail with a missing 1.
SQL Fiddle.
You can do the following, although you may also want to define a range - in which case you might need a couple of UNIONs
SELECT x.id+1
FROM my_table x
LEFT
JOIN my_table y
ON x.id+1 = y.id
WHERE y.id IS NULL
ORDER
BY x.id LIMIT 1;
You can always create a table with all of the numbers from 1 to X and then join that table with the table you are comparing. Then just find the TOP value in your SELECT statement that isn't present in the table you are comparing
SELECT TOP 1 table_with_all_numbers.number, table_with_missing_numbers.number
FROM table_with_all_numbers
LEFT JOIN table_with_missing_numbers
ON table_with_missing_numbers.number = table_with_all_numbers.number
WHERE table_with_missing_numbers.number IS NULL
ORDER BY table_with_all_numbers.number ASC;
In SQLite 3.8.3 or later, you can use a recursive common table expression to create a counter.
Here, we stop counting when we find a value not in the table:
WITH RECURSIVE counter(c) AS (
SELECT 1
UNION ALL
SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
(This works for an empty table or a missing 1.)
This query ranks (starting from rank 1) each distinct number in ascending order and selects the lowest rank that's less than its number. If no rank is lower than its number (i.e. there are no gaps in the table) the query returns the max number + 1.
select coalesce(min(number),1) from (
select min(cnt) number
from (
select
number,
(select count(*) from (select distinct number from numbers) b where b.number <= a.number) as cnt
from (select distinct number from numbers) a
) t1 where number > cnt
union
select max(number) + 1 number from numbers
) t1
http://sqlfiddle.com/#!7/720cc/3
Just another method, using EXCEPT this time:
SELECT a + 1 AS missing FROM T
EXCEPT
SELECT a FROM T
ORDER BY missing
LIMIT 1;
In table A I have the dates, and in B I have the order numbers.
In both tables I have a common field called order Id.
I just have a simple goal to fetch the number of orders on each date[ as in 1st, 2nd ..]
Here is what I have tried as I dont want to use joins or views.
select
A.date_of_order,
count(B.order_number)
from A, B
where A.order_id=B.order_id;
group by A.date_of_order
I am getting the following error. Probably making some trivial error. Thanks in advance
Update:
After taking into consideration Dmitri and rafa s suggestions, I get the table as:
23-FEB-14 1
23-FEB-14 1
23-FEB-14 2
23-FEB-14 2
23-FEB-14 2
07-MAR-14 2
07-MAR-14 4
07-MAR-14 1
07-MAR-14 5
02-MAR-14 1
02-MAR-14 1
As I said my requirement is very simple, just get it as
23-Feb-14 10[i.e. all the orders placed on this date]
07-Mar-14 13
02-mar-14 2
WHERE should be put before GROUP BY:
select A.date_of_order,
count(B.order_number)
from A, B
where A.order_id = B.order_id -- <- possible, but join will be better here
group by A.date_of_order
If you want a condition after GROUP BY you should use HAVING
select A.date_of_order,
count(B.order_number)
from A, B
where A.order_id = B.order_id
group by A.date_of_order
having count(B.order_number) < 3 -- having demo
The WHERE clause must to be before of the GROUP BY clause.
Use TRUNC(date) to get rid of the time so the GROUP BY will work as expected.
SELECT TRUNC(A.date_of_order), COUNT(B.order_number)
FROM A, B
WHERE A.order_id=B.order_id
GROUP BY TRUNC(A.date_of_order)
Anyway it is recommended to use the ANSI-standard SQL JOIN clause instead.
SELECT TRUNC(A.date_of_order), COUNT(B.order_number)
FROM A INNER JOIN B ON A.order_id = B.order_id
-- (WHERE conditions here)
GROUP BY TRUNC(A.date_of_order)
-- (HAVING conditions here)
The possible reasons could be that
1) Dates have timestamp, in that case following query would be helpful:
select
trunc(A.date_of_order),
count(B.order_number)
from A, B
where A.order_id=B.order_id
group by trunc(A.date_of_order);
2) Since in your sample data, you already have count against the dates, you need to take sum instead of count for your query
select
A.date_of_order,
sum(B.order_number)
from A, B
where A.order_id=B.order_id
group by A.date_of_order;
3) Or could be both, in that case you can try
select
trunc(A.date_of_order),
sum(B.order_number)
from A, B
where A.order_id=B.order_id
group by trunc(A.date_of_order);
I don't see why you need to join to that other table at all. Try just running:
select date_of_order, count(order_id) from tbl_a group by date_of_order
Your question states that table A contains rows for each order ID and date, and all you want to do is count the number of orders by date.
Given a table in SQL-Server like:
Id INTEGER
A VARCHAR(50)
B VARCHAR(50)
-- Some other columns
with no index on A or B, I wish to find rows where a unique combination of A and B occurs more than once.
I'm using the query
SELECT A+B, Count(A+B) FROM MyTable
GROUP BY A+B
HAVING COUNT(A+B) > 1
First Question
Is there a more time-efficient way to do this? (I cannot add indices to the database)
Second Question
When I attempt to gain some formatting of the output by including a , in the concatenation:
SELECT A+','+B, Count(A+','+B) FROM MyTable
GROUP BY A+','+B
HAVING COUNT(A+','+B) > 1
The query fails with the error
Column 'MyDB.dbo.MyTable.A' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause
with a similar error for Column B.
How can I format the output to separate the two columns?
It would seem more natural to me to write:
SELECT A, B, Count(*) FROM MyTable
GROUP BY A, B
HAVING COUNT(*) > 1
And it's the most efficient way of doing it (and so is the query in the question).
Similarly to the above query, you can rewrite your second query:
SELECT A + ',' + B, Count(*) FROM MyTable
GROUP BY A, B
HAVING COUNT(*) > 1
I want to select some rows based on certain criteria, and then take one entry from that set and the 5 rows before it and after it.
Now, I can do this numerically if there is a primary key on the table, (e.g. primary keys that are numerically 5 less than the target row's key and 5 more than the target row's key).
So select the row with the primary key of 7 and the nearby rows:
select primary_key from table where primary_key > (7-5) order by primary_key limit 11;
2
3
4
5
6
-=7=-
8
9
10
11
12
But if I select only certain rows to begin with, I lose that numeric method of using primary keys (and that was assuming the keys didn't have any gaps in their order anyway), and need another way to get the closest rows before and after a certain targeted row.
The primary key output of such a select might look more random and thus less succeptable to mathematical locating (since some results would be filtered, out, e.g. with a where active=1):
select primary_key from table where primary_key > (34-5)
order by primary_key where active=1 limit 11;
30
-=34=-
80
83
100
113
125
126
127
128
129
Note how due to the gaps in the primary keys caused by the example where condition (for example becaseu there are many inactive items), I'm no longer getting the closest 5 above and 5 below, instead I'm getting the closest 1 below and the closest 9 above, instead.
There's a lot of ways to do it if you run two queries with a programming language, but here's one way to do it in one SQL query:
(SELECT * FROM table WHERE id >= 34 AND active = 1 ORDER BY id ASC LIMIT 6)
UNION
(SELECT * FROM table WHERE id < 34 AND active = 1 ORDER BY id DESC LIMIT 5)
ORDER BY id ASC
This would return the 5 rows above, the target row, and 5 rows below.
Here's another way to do it with analytic functions lead and lag. It would be nice if we could use analytic functions in the WHERE clause. So instead you need to use subqueries or CTE's. Here's an example that will work with the pagila sample database.
WITH base AS (
SELECT lag(customer_id, 5) OVER (ORDER BY customer_id) lag,
lead(customer_id, 5) OVER (ORDER BY customer_id) lead,
c.*
FROM customer c
WHERE c.active = 1
AND c.last_name LIKE 'B%'
)
SELECT base.* FROM base
JOIN (
-- Select the center row, coalesce so it still works if there aren't
-- 5 rows in front or behind
SELECT COALESCE(lag, 0) AS lag, COALESCE(lead, 99999) AS lead
FROM base WHERE customer_id = 280
) sub ON base.customer_id BETWEEN sub.lag AND sub.lead
The problem with sgriffinusa's solution is that you don't know which row_number your center row will end up being. He assumed it will be row 30.
For similar query I use analytic functions without CTE. Something like:
select ...,
LEAD(gm.id) OVER (ORDER BY Cit DESC) as leadId,
LEAD(gm.id, 2) OVER (ORDER BY Cit DESC) as leadId2,
LAG(gm.id) OVER (ORDER BY Cit DESC) as lagId,
LAG(gm.id, 2) OVER (ORDER BY Cit DESC) as lagId2
...
where id = 25912
or leadId = 25912 or leadId2 = 25912
or lagId = 25912 or lagId2 = 25912
such query works more faster for me than CTE with join (answer from Scott Bailey). But of course less elegant
You could do this utilizing row_number() (available as of 8.4). This may not be the correct syntax (not familiar with postgresql), but hopefully the idea will be illustrated:
SELECT *
FROM (SELECT ROW_NUMBER() OVER (ORDER BY primary_key) AS r, *
FROM table
WHERE active=1) t
WHERE 25 < r and r < 35
This will generate a first column having sequential numbers. You can use this to identify the single row and the rows above and below it.
If you wanted to do it in a 'relationally pure' way, you could write a query that sorted and numbered the rows. Like:
select (
select count(*) from employees b
where b.name < a.name
) as idx, name
from employees a
order by name
Then use that as a common table expression. Write a select which filters it down to the rows you're interested in, then join it back onto itself using a criterion that the index of the right-hand copy of the table is no more than k larger or smaller than the index of the row on the left. Project over just the rows on the right. Like:
with numbered_emps as (
select (
select count(*)
from employees b
where b.name < a.name
) as idx, name
from employees a
order by name
)
select b.*
from numbered_emps a, numbered_emps b
where a.name like '% Smith' -- this is your main selection criterion
and ((b.idx - a.idx) between -5 and 5) -- this is your adjacency fuzzy-join criterion
What could be simpler!
I'd imagine the row-number based solutions will be faster, though.
I've got an sql query that selects data from several tables, but I only want to match a single(randomly selected) row from another table.
Easier to show some code, I guess ;)
Table K is (k_id, selected)
Table C is (c_id, image)
Table S is (c_id, date)
Table M is (c_id, k_id, score)
All ID-columns are primary keys, with appropriate FK constraints.
What I want, in english, is for eack row in K that has selected = 1 to get a random row from C where there exists a row in M with (K_id, C_id), where the score is higher than a given value, and where c.image is not null and there is a row in s with c_id
Something like:
select k.k_id, c.c_id, m.score
from k,c,m,s
where k.selected = 1
and m.score > some_value
and m.k_id = k.k_id
and m.c_id = c.c_id
and c.image is not null
and s.c_id = c.c_id;
The only problem is this returns all the rows in C that match the criteria - I only want one...
I can see how to do it using PL/SQL to select all relevent rows into a collection and then select a random one, but I'm stuck as to how to select a random one.
you can use the 'order by dbms_random.random' instruction with your query.
i.e.:
SELECT column FROM
(
SELECT column FROM table
ORDER BY dbms_random.value
)
WHERE rownum = 1
References:
http://awads.net/wp/2005/08/09/order-by-no-order/
http://www.petefreitag.com/item/466.cfm
with analytics:
SELECT k_id, c_id, score
FROM (SELECT k.k_id, c.c_id, m.score,
row_number() over(PARTITION BY k.k_id ORDER BY NULL) rk
FROM k, c, m, s
WHERE k.selected = 1
AND m.score > some_value
AND m.k_id = k.k_id
AND m.c_id = c.c_id
AND c.image IS NOT NULL
AND s.c_id = c.c_id)
WHERE rk = 1
This will select one row that satisfies your criteria per k_id. This will likely select the same set of rows if you run the query several times. If you want more randomness (each run produces a different set of rows), you would replace ORDER BY NULL by ORDER BY dbms_random.value
I'm not too familiar with oracle SQL, but try using LIMIT random(), if there is such a function available.