Why is this BigQuery WHERE NOT IN statement giving no results? - google-bigquery

We would expect that this Google BigQuery query would remove at most 10 rows of results - but this query gives us zero results - despite that table A has thousands of rows all with unique ENCNTR_IDs.
SELECT ENCNTR_ID
FROM `project.dataset.table_A`
WHERE ENCNTR_ID NOT IN
(
SELECT ENCNTR_ID
FROM `project.dataset.table_B`
LIMIT 10
)
If we make the query self-referential, it behaves as expected: we get thousands of results with just 10 rows removed.
SELECT ENCNTR_ID
FROM `project.dataset.table_A`
WHERE ENCNTR_ID NOT IN
(
SELECT ENCNTR_ID
FROM `project.dataset.table_A` # <--- same table name
LIMIT 10
)
What are we doing wrong? Why does the first query give us zero results rather than just remove 10 rows of results?

Solution: Use NOT EXISTS instead of NOT IN when dealing with possible nulls:
SELECT *
FROM UNNEST([1,2,3]) i
WHERE NOT EXISTS (SELECT * FROM UNNEST([2,3,null]) i2 WHERE i=i2)
# 1
Previous guess - which turned out to be the cause:
SELECT *
FROM UNNEST([1,2,3]) i
WHERE i NOT IN UNNEST([2,3])
# 1
vs
SELECT *
FROM UNNEST([1,2,3]) i
WHERE i NOT IN UNNEST([2,3,null])
# This query returned no results.
Are there any nulls in that project.dataset.table_B?

Related

BigQuery Querying Events Table

Objective : To take the users count which has event_name= 'Wallet'
Problem : I have limited the query's result to 100 to check so the expected result must be 100 but when I use count(params.value.string_value) it shows 124 .
Code : SELECT count(params.value.string_value) FROM "myproj.analytics_197163127.events_20190528",UNNEST(event_params) as params where event_name ='Wallet' and params.key = 'UserId' limit 100
Expected Result : if the query is returning 100 records the count should be 100 but how is it showing 124?
Hope the question is clear
limit is applied to the result set produced by the query.
Your query is an aggregation query with no group by. Such an aggregation always returns one row. So, the limit does not affect the results.
If you want to see 100 for the result set, use a CTE or subquery:
SELECT count(params.value.string_value)
FROM (SELECT params
FROM "myproj.analytics_197163127.events_20190528" e CROSS JOIN
UNNEST(e.event_params) params
WHERE e.event_name ='Wallet' AND params.key = 'UserId'
LIMIT 100
) ep
The query shows 100 records because of the limit 100 at the end:
SELECT event_date,event_timestamp,event_name, params.value.string_value
FROM myproj.analytics_197163127.events_20190528, UNNEST(event_params) as params
where event_name ='Wallet' and params.key = 'UserId'
limit 100
Remove that and check again.
The LIMIT 100 specifies the number of rows to be returned from the SQL statement. It does not influence the COUNT() in your query. So there is a difference between :
select count(*) from table limit 100
this will return a single value with the number of rows in the table. On the other hand :
select count(*) from (select * from table limit 100)
This will return 100 (if the table has more than 100 rows - otherwise it will return the number of rows in table)

subquerying in WHERE/Joining 3 tables, 2 for records and one for number, returns no result/fails - MSAccess

My query does return any records. Depending on how I write it, it returns no records or all records, although I don't have the code that just returned everything.
I need to pull data from two sources with actual records, and a third table which has project-wide information not specific to any records. I need to filter out records which are greater than the Miles_Budgeted variable.
This returns no records, although if I replaces param.Miles_Budgeted with a numeric value e.g. 1000, it filters to the desired records.
SELECT
a.sort_id,
a.l1l2,
a.rtot_pct_oftot_miles,
b.sumofeq,
b.c_per_mile,
b.sumofo_total,
a.cpminrmd,
a.RunTotMiles,
param.Miles_Budgeted
FROM
(SELECT (p.Budget_Cost_Targ / p.Project_Cost_Per_Mi) AS Miles_Budgeted FROM Tbl_Project_Parameters as p) AS param,
qry_par_l2_by_cpermi AS a
INNER JOIN
qry_l2 AS b
ON a.l1l2 = b.l1l2
WHERE
((a.RunTotMiles) <=
(Param.Miles_Budgeted
)
)
ORDER BY
a.sort_id;
This variant of the query does not run (Syntax Error in FROM Clause)
SELECT
a.sort_id,
a.l1l2,
a.rtot_pct_oftot_miles,
b.sumofeq,
b.c_per_mile,
b.sumofo_total,
a.cpminrmd,
a.runtotmiles,
param.miles_budgeted
FROM (
(
SELECT (p.budget_cost_targ / p.project_cost_per_mi) AS miles_budgeted
FROM tbl_project_parameters AS p ) AS param
INNER JOIN qry_par_l2_by_cpermi AS a )
INNER JOIN qry_l2 AS b
ON a.l1l2 = b.l1l2
AND (
a.runtotmiles) <= ( param.miles_budgeted )
ORDER BY a.sort_id;
This also returns no records:
SELECT
a.sort_id,
a.l1l2,
a.rtot_pct_oftot_miles,
b.sumofeq,
b.c_per_mile,
b.sumofo_total,
a.RunTotMiles,
a.cpminrmd
FROM
qry_par_l2_by_cmipermi AS a
INNER JOIN
qry_l2 AS b
ON a.l1l2 = b.l1l2
WHERE
(
((a.RunTotMiles) <=
(
SELECT
(p.Budget_Cost_Targ / p.Project_Cost_Per_Mi) AS Budgeted_Miles
FROM
Tbl_Project_Parameters AS p
)
)
)
ORDER BY
a.sort_id;
Again, if
SELECT
(p.Budget_Cost_Targ / p.Project_Cost_Per_Mi) AS Budgeted_Miles
FROM
Tbl_Project_Parameters AS p
is replaces with a numeric value, the query returns the correct records. I have tried surrounding the subq or field with val() or Format(,"Standard") but this does not see to fix the issue; a separate query with just the relevant code returns the correct Budgeted_Miles as 1000 as it should.
Any thoughts appreciated.
Have you tried limiting that subquery to return only one record? I know some versions of SQL don't like when you try comparing the results of a SELECT to a single value.
I believe the syntax for MS Access would use "TOP":
SELECT TOP 1
(p.Budget_Cost_Targ / p.Project_Cost_Per_Mi) AS Budgeted_Miles
FROM
Tbl_Project_Parameters AS p

Returning the lowest integer not in a list in SQL

Supposed you have a table T(A) with only positive integers allowed, like:
1,1,2,3,4,5,6,7,8,9,11,12,13,14,15,16,17,18
In the above example, the result is 10. We always can use ORDER BY and DISTINCT to sort and remove duplicates. However, to find the lowest integer not in the list, I came up with the following SQL query:
select list.x + 1
from (select x from (select distinct a as x from T order by a)) as list, T
where list.x + 1 not in T limit 1;
My idea is start a counter and 1, check if that counter is in list: if it is, return it, otherwise increment and look again. However, I have to start that counter as 1, and then increment. That query works most of the cases, by there are some corner cases like in 1. How can I accomplish that in SQL or should I go about a completely different direction to solve this problem?
Because SQL works on sets, the intermediate SELECT DISTINCT a AS x FROM t ORDER BY a is redundant.
The basic technique of looking for a gap in a column of integers is to find where the current entry plus 1 does not exist. This requires a self-join of some sort.
Your query is not far off, but I think it can be simplified to:
SELECT MIN(a) + 1
FROM t
WHERE a + 1 NOT IN (SELECT a FROM t)
The NOT IN acts as a sort of self-join. This won't produce anything from an empty table, but should be OK otherwise.
SQL Fiddle
select min(y.a) as a
from
t x
right join
(
select a + 1 as a from t
union
select 1
) y on y.a = x.a
where x.a is null
It will work even in an empty table
SELECT min(t.a) - 1
FROM t
LEFT JOIN t t1 ON t1.a = t.a - 1
WHERE t1.a IS NULL
AND t.a > 1; -- exclude 0
This finds the smallest number greater than 1, where the next-smaller number is not in the same table. That missing number is returned.
This works even for a missing 1. There are multiple answers checking in the opposite direction. All of them would fail with a missing 1.
SQL Fiddle.
You can do the following, although you may also want to define a range - in which case you might need a couple of UNIONs
SELECT x.id+1
FROM my_table x
LEFT
JOIN my_table y
ON x.id+1 = y.id
WHERE y.id IS NULL
ORDER
BY x.id LIMIT 1;
You can always create a table with all of the numbers from 1 to X and then join that table with the table you are comparing. Then just find the TOP value in your SELECT statement that isn't present in the table you are comparing
SELECT TOP 1 table_with_all_numbers.number, table_with_missing_numbers.number
FROM table_with_all_numbers
LEFT JOIN table_with_missing_numbers
ON table_with_missing_numbers.number = table_with_all_numbers.number
WHERE table_with_missing_numbers.number IS NULL
ORDER BY table_with_all_numbers.number ASC;
In SQLite 3.8.3 or later, you can use a recursive common table expression to create a counter.
Here, we stop counting when we find a value not in the table:
WITH RECURSIVE counter(c) AS (
SELECT 1
UNION ALL
SELECT c + 1 FROM counter WHERE c IN t)
SELECT max(c) FROM counter;
(This works for an empty table or a missing 1.)
This query ranks (starting from rank 1) each distinct number in ascending order and selects the lowest rank that's less than its number. If no rank is lower than its number (i.e. there are no gaps in the table) the query returns the max number + 1.
select coalesce(min(number),1) from (
select min(cnt) number
from (
select
number,
(select count(*) from (select distinct number from numbers) b where b.number <= a.number) as cnt
from (select distinct number from numbers) a
) t1 where number > cnt
union
select max(number) + 1 number from numbers
) t1
http://sqlfiddle.com/#!7/720cc/3
Just another method, using EXCEPT this time:
SELECT a + 1 AS missing FROM T
EXCEPT
SELECT a FROM T
ORDER BY missing
LIMIT 1;

How can I select rows by range? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Select statement in SQLite recognizing row number
For example, SELECT * FROM table WHERE [row] BETWEEN x AND y
How can this be done? I've done some reading but haven't found anything specifically correct.
Imagine a list where you want results paged by an X amount of results, so for page 10 you would need results from rows 10 * X to 10 * X + X. Rather than display ALL results in one go
For mysql you have limit, you can fire query as :
SELECT * FROM table limit 100` -- get 1st 100 records
SELECT * FROM table limit 100, 200` -- get 200 records beginning with row 101
For Oracle you can use rownum
See mysql select syntax and usage for limit here.
For SQLite, you have limit, offset. I haven't used SQLite but I checked it on SQLite Documentation. Check example for SQLite here.
Following your clarification you're looking for limit:
SELECT * FROM `table` LIMIT 0, 10
This will display the first 10 results from the database.
SELECT * FROM `table` LIMIT 5, 5 .
Will display 5-9 (5,6,7,8,9)
The syntax follows the pattern:
SELECT * FROM `table` LIMIT [row to start at], [how many to include] .
The SQL for selecting rows where a column is between two values is:
SELECT column_name(s)
FROM table_name
WHERE column_name
BETWEEN value1 AND value2
See: http://www.w3schools.com/sql/sql_between.asp
If you want to go on the row number you can use rownum:
SELECT column_name(s)
FROM table_name
WHERE rownum
BETWEEN x AND y
However we need to know which database engine you are using as rownum is different for most.
You can use rownum :
SELECT * FROM table WHERE rownum > 10 and rownum <= 20
Using Between condition
SELECT *
FROM TEST
WHERE COLUMN_NAME BETWEEN x AND y ;
Or using Just operators,
SELECT *
FROM TEST
WHERE COLUMN_NAME >= x AND COLUMN_NAME <= y;
Assuming id is the primary key of table :
SELECT * FROM table WHERE id BETWEEN 10 AND 50
For first 20 results
SELECT * FROM table order by id limit 20;
Have you tried your own code?
This should work:
SELECT * FROM people WHERE age BETWEEN x AND y
Use the LIMIT clause:
/* rows x- y numbers */
SELECT * FROM tbl LIMIT x,y;
refer : http://dev.mysql.com/doc/refman/5.0/en/select.html

select top 1 * returns diffrent recordset each time

In my application I use SELECT TOP 12 * clause to select top 12 records from database and show it to user. In another case I have to show the same result one by one. So I use SELECT TOP 1 * clause,rest of the query is same. I used Sql row_number() function to select items one by on serially.
The problem is SELECT TOP 1 * doesn't return me same row as I get in SELECT TOP 12 *. Also the result set of SELECT TOP 12 * get changed each time I execute the query.
Can anybody explain me why the result is not get same in SELECT TOP 12 * and SELECT TOP 1 *.
FYI: here is my sql
select distinct top 1 * from(
select row_number() over ( ORDER BY Ratings desc ) as Row, * from(
SELECT vw.IsHide, vw.UpdateDate, vw.UserID, vw.UploadPath, vw.MediaUploadID, vw.Ratings, vw.Caption, vw.UserName, vw.BirthYear, vw.BirthDay, vw.BirthMonth, vw.Gender, vw.CityProvince, vw.Approved
FROM VW_Media as vw ,Users as u WITH(NOLOCk)
WHERE vw.IsHide='false' and
GenderNVID=5 and
vw.UserID=u.UserID and
vw.UserID not in(205092) and
vw.UploadTypeNVID=1106 and
vw.IsDeleted='false' and
vw.Approved = 1 and
u.HideProfile=0 and
u.StatusNVID=126 and
vw.UserID not in(Select BlockedToUserID from BlockList WITH(NOLOCk) where UserID=205092) a) totalres where row >0
Thanks in Advance
Sachin
When you use SELECT TOP, you must use also the ORDER BY clause to avoid different results every time.
For performance resons, the database is free to return the records in any order it likes if you don't specify any ordering.
So, you always have to specify in which order you want the records, if you want them in any specific order.
Up to some version of SQL Server (7 IIRC) the natural order of the table was preserved in the result if you didn't specify any ordering, but this feature was removed in later versions.